All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/3] filter-rewriter: fix two bugs and one optimization
@ 2017-02-22  3:46 zhanghailiang
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error zhanghailiang
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: zhanghailiang @ 2017-02-22  3:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst, lizhijian
  Cc: qemu-devel, xuquan8, pss.wulizhen, zhanghailiang

Hi,

Patch 1 fixes a double free bug, and patch 2 fixes a memory leak bug.
Patch 3 is an optimization for filter-rewriter.

Please review, thanks.

zhanghailiang (3):
  net/colo: fix memory double free error
  filter-rewriter: fix memory leak for connection in
    connection_track_table
  filter-rewriter: skip net_checksum_calculate() while offset = 0

 net/colo.c            |  4 +--
 net/colo.h            |  4 +++
 net/filter-rewriter.c | 86 +++++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 79 insertions(+), 15 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error
  2017-02-22  3:46 [Qemu-devel] [PATCH v2 0/3] filter-rewriter: fix two bugs and one optimization zhanghailiang
@ 2017-02-22  3:46 ` zhanghailiang
  2017-02-22  8:39   ` Zhang Chen
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table zhanghailiang
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0 zhanghailiang
  2 siblings, 1 reply; 21+ messages in thread
From: zhanghailiang @ 2017-02-22  3:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst, lizhijian
  Cc: qemu-devel, xuquan8, pss.wulizhen, zhanghailiang

The 'primary_list' and 'secondary_list' members of struct Connection
is not allocated through dynamically g_queue_new(), but we free it by using
g_queue_free(), which will lead to a double-free bug.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/colo.c b/net/colo.c
index 6a6eacd..8cc166b 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -147,9 +147,9 @@ void connection_destroy(void *opaque)
     Connection *conn = opaque;
 
     g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
-    g_queue_free(&conn->primary_list);
+    g_queue_clear(&conn->primary_list);
     g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
-    g_queue_free(&conn->secondary_list);
+    g_queue_clear(&conn->secondary_list);
     g_slice_free(Connection, conn);
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-22  3:46 [Qemu-devel] [PATCH v2 0/3] filter-rewriter: fix two bugs and one optimization zhanghailiang
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error zhanghailiang
@ 2017-02-22  3:46 ` zhanghailiang
  2017-02-22  8:07   ` Jason Wang
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0 zhanghailiang
  2 siblings, 1 reply; 21+ messages in thread
From: zhanghailiang @ 2017-02-22  3:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst, lizhijian
  Cc: qemu-devel, xuquan8, pss.wulizhen, zhanghailiang

After a net connection is closed, we didn't clear its releated resources
in connection_track_table, which will lead to memory leak.

Let't track the state of net connection, if it is closed, its related
resources will be cleared up.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo.h            |  4 +++
 net/filter-rewriter.c | 70 +++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/net/colo.h b/net/colo.h
index 7c524f3..cd9027f 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -18,6 +18,7 @@
 #include "slirp/slirp.h"
 #include "qemu/jhash.h"
 #include "qemu/timer.h"
+#include "slirp/tcp.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -69,6 +70,9 @@ typedef struct Connection {
      * run once in independent tcp connection
      */
     int syn_flag;
+
+    int tcp_state; /* TCP FSM state */
+    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
 } Connection;
 
 uint32_t connection_key_hash(const void *opaque);
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index c4ab91c..7e7ec35 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
 }
 
 /* handle tcp packet from primary guest */
-static int handle_primary_tcp_pkt(NetFilterState *nf,
+static int handle_primary_tcp_pkt(RewriterState *rf,
                                   Connection *conn,
-                                  Packet *pkt)
+                                  Packet *pkt, ConnectionKey *key)
 {
     struct tcphdr *tcp_pkt;
 
@@ -97,15 +97,45 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
         tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
 
         net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+        /*
+         * Case 1:
+         * The *server* side of this connect is VM, *client* tries to close
+         * the connection.
+         *
+         * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
+         * packet from server side. From this point, we can ensure that there
+         * will be no packets in the connection, except that, some errors
+         * happen between the path of 'filter object' and vNIC, if this rare
+         * case really happen, we can still create a new connection,
+         * So it is safe to remove the connection from connection_track_table.
+         *
+         */
+        if ((conn->tcp_state == TCPS_LAST_ACK) &&
+            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
+            fprintf(stderr, "Remove conn "
+            g_hash_table_remove(rf->connection_track_table, key);
+        }
+    }
+    /*
+     * Case 2:
+     * The *server* side of this connect is VM, *server* tries to close
+     * the connection.
+     *
+     * We got 'fin=1, ack=1' packet from client side, we need to
+     * record the seq of 'fin=1, ack=1' packet.
+     */
+    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
+        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
+        conn->tcp_state = TCPS_LAST_ACK;
     }
 
     return 0;
 }
 
 /* handle tcp packet from secondary guest */
-static int handle_secondary_tcp_pkt(NetFilterState *nf,
+static int handle_secondary_tcp_pkt(RewriterState *rf,
                                     Connection *conn,
-                                    Packet *pkt)
+                                    Packet *pkt, ConnectionKey *key)
 {
     struct tcphdr *tcp_pkt;
 
@@ -133,8 +163,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
         tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
 
         net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+        /*
+         * Case 2:
+         * The *server* side of this connect is VM, *server* tries to close
+         * the connection.
+         *
+         * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
+         * packet from client side. Like Case 1, there should be no packets
+         * in the connection from now know, But the difference here is
+         * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
+         * TODO: Fix above case.
+         */
+        if ((conn->tcp_state == TCPS_LAST_ACK) &&
+            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
+            g_hash_table_remove(rf->connection_track_table, key);
+        }
+    }
+    /*
+     * Case 1:
+     * The *server* side of this connect is VM, *client* tries to close
+     * the connection.
+     *
+     * We got 'fin=1, ack=1' packet from server side, we need to
+     * record the seq of 'fin=1, ack=1' packet.
+     */
+    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
+        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
+        conn->tcp_state = TCPS_LAST_ACK;
     }
-
     return 0;
 }
 
@@ -178,7 +234,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 
         if (sender == nf->netdev) {
             /* NET_FILTER_DIRECTION_TX */
-            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
+            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
                 qemu_net_queue_send(s->incoming_queue, sender, 0,
                 (const uint8_t *)pkt->data, pkt->size, NULL);
                 packet_destroy(pkt, NULL);
@@ -191,7 +247,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
             }
         } else {
             /* NET_FILTER_DIRECTION_RX */
-            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
+            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
                 qemu_net_queue_send(s->incoming_queue, sender, 0,
                 (const uint8_t *)pkt->data, pkt->size, NULL);
                 packet_destroy(pkt, NULL);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0
  2017-02-22  3:46 [Qemu-devel] [PATCH v2 0/3] filter-rewriter: fix two bugs and one optimization zhanghailiang
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error zhanghailiang
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table zhanghailiang
@ 2017-02-22  3:46 ` zhanghailiang
  2017-02-24  8:08   ` Zhang Chen
  2 siblings, 1 reply; 21+ messages in thread
From: zhanghailiang @ 2017-02-22  3:46 UTC (permalink / raw)
  To: jasowang, zhangchen.fnst, lizhijian
  Cc: qemu-devel, xuquan8, pss.wulizhen, zhanghailiang

While the offset of packets's sequence for primary side and
secondary side is zero, it is unnecessary to call net_checksum_calculate()
to recalculate the checksume value of packets.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/filter-rewriter.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 7e7ec35..c9a6d43 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -93,10 +93,12 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
             conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
             conn->syn_flag = 0;
         }
-        /* handle packets to the secondary from the primary */
-        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
+        if (conn->offset) {
+            /* handle packets to the secondary from the primary */
+            tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
 
-        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+        }
         /*
          * Case 1:
          * The *server* side of this connect is VM, *client* tries to close
@@ -112,7 +114,6 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
          */
         if ((conn->tcp_state == TCPS_LAST_ACK) &&
             (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
-            fprintf(stderr, "Remove conn "
             g_hash_table_remove(rf->connection_track_table, key);
         }
     }
@@ -159,10 +160,13 @@ static int handle_secondary_tcp_pkt(RewriterState *rf,
     }
 
     if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
-        /* handle packets to the primary from the secondary*/
-        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
+        /* Only need to adjust seq while offset is Non-zero */
+        if (conn->offset) {
+            /* handle packets to the primary from the secondary*/
+            tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
 
-        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+        }
         /*
          * Case 2:
          * The *server* side of this connect is VM, *server* tries to close
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table zhanghailiang
@ 2017-02-22  8:07   ` Jason Wang
  2017-02-22  8:45     ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2017-02-22  8:07 UTC (permalink / raw)
  To: zhanghailiang, zhangchen.fnst, lizhijian
  Cc: qemu-devel, xuquan8, pss.wulizhen



On 2017年02月22日 11:46, zhanghailiang wrote:
> After a net connection is closed, we didn't clear its releated resources
> in connection_track_table, which will lead to memory leak.

Not a real leak but would lead reset of hash table if too many closed 
connections.

>
> Let't track the state of net connection, if it is closed, its related
> resources will be cleared up.

The issue is the state were tracked partially, do we need a full state 
machine here?

>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   net/colo.h            |  4 +++
>   net/filter-rewriter.c | 70 +++++++++++++++++++++++++++++++++++++++++++++------
>   2 files changed, 67 insertions(+), 7 deletions(-)
>
> diff --git a/net/colo.h b/net/colo.h
> index 7c524f3..cd9027f 100644
> --- a/net/colo.h
> +++ b/net/colo.h
> @@ -18,6 +18,7 @@
>   #include "slirp/slirp.h"
>   #include "qemu/jhash.h"
>   #include "qemu/timer.h"
> +#include "slirp/tcp.h"
>   
>   #define HASHTABLE_MAX_SIZE 16384
>   
> @@ -69,6 +70,9 @@ typedef struct Connection {
>        * run once in independent tcp connection
>        */
>       int syn_flag;
> +
> +    int tcp_state; /* TCP FSM state */
> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>   } Connection;
>   
>   uint32_t connection_key_hash(const void *opaque);
> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
> index c4ab91c..7e7ec35 100644
> --- a/net/filter-rewriter.c
> +++ b/net/filter-rewriter.c
> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>   }
>   
>   /* handle tcp packet from primary guest */
> -static int handle_primary_tcp_pkt(NetFilterState *nf,
> +static int handle_primary_tcp_pkt(RewriterState *rf,
>                                     Connection *conn,
> -                                  Packet *pkt)
> +                                  Packet *pkt, ConnectionKey *key)
>   {
>       struct tcphdr *tcp_pkt;
>   
> @@ -97,15 +97,45 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
>           tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>   
>           net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
> +        /*
> +         * Case 1:
> +         * The *server* side of this connect is VM, *client* tries to close
> +         * the connection.
> +         *
> +         * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
> +         * packet from server side. From this point, we can ensure that there
> +         * will be no packets in the connection, except that, some errors
> +         * happen between the path of 'filter object' and vNIC, if this rare
> +         * case really happen, we can still create a new connection,
> +         * So it is safe to remove the connection from connection_track_table.
> +         *
> +         */
> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
> +            fprintf(stderr, "Remove conn "

Can this even compile?

> +            g_hash_table_remove(rf->connection_track_table, key);
> +        }
> +    }
> +    /*
> +     * Case 2:
> +     * The *server* side of this connect is VM, *server* tries to close
> +     * the connection.
> +     *
> +     * We got 'fin=1, ack=1' packet from client side, we need to
> +     * record the seq of 'fin=1, ack=1' packet.
> +     */
> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
> +        conn->tcp_state = TCPS_LAST_ACK;
>       }
>   
>       return 0;
>   }
>   
>   /* handle tcp packet from secondary guest */
> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>                                       Connection *conn,
> -                                    Packet *pkt)
> +                                    Packet *pkt, ConnectionKey *key)
>   {
>       struct tcphdr *tcp_pkt;
>   
> @@ -133,8 +163,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
>           tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>   
>           net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
> +        /*
> +         * Case 2:
> +         * The *server* side of this connect is VM, *server* tries to close
> +         * the connection.
> +         *
> +         * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
> +         * packet from client side. Like Case 1, there should be no packets
> +         * in the connection from now know, But the difference here is
> +         * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
> +         * TODO: Fix above case.
> +         */
> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
> +            g_hash_table_remove(rf->connection_track_table, key);
> +        }
> +    }
> +    /*
> +     * Case 1:
> +     * The *server* side of this connect is VM, *client* tries to close
> +     * the connection.
> +     *
> +     * We got 'fin=1, ack=1' packet from server side, we need to
> +     * record the seq of 'fin=1, ack=1' packet.
> +     */
> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
> +        conn->tcp_state = TCPS_LAST_ACK;

I thought the tcp_state should store the state of TCP from the view of 
secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion. And 
the handle of active close needs more states here. E.g if connection is 
in FIN_WAIT_2, the connection is only half closed, remote peer can still 
send packet to us unless we receive a FIN.

Thanks

>       }
> -
>       return 0;
>   }
>   
> @@ -178,7 +234,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>   
>           if (sender == nf->netdev) {
>               /* NET_FILTER_DIRECTION_TX */
> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>                   qemu_net_queue_send(s->incoming_queue, sender, 0,
>                   (const uint8_t *)pkt->data, pkt->size, NULL);
>                   packet_destroy(pkt, NULL);
> @@ -191,7 +247,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>               }
>           } else {
>               /* NET_FILTER_DIRECTION_RX */
> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>                   qemu_net_queue_send(s->incoming_queue, sender, 0,
>                   (const uint8_t *)pkt->data, pkt->size, NULL);
>                   packet_destroy(pkt, NULL);

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error zhanghailiang
@ 2017-02-22  8:39   ` Zhang Chen
  0 siblings, 0 replies; 21+ messages in thread
From: Zhang Chen @ 2017-02-22  8:39 UTC (permalink / raw)
  To: zhanghailiang, jasowang, lizhijian
  Cc: zhangchen.fnst, qemu-devel, xuquan8, pss.wulizhen



On 02/22/2017 11:46 AM, zhanghailiang wrote:
> The 'primary_list' and 'secondary_list' members of struct Connection
> is not allocated through dynamically g_queue_new(), but we free it by using
> g_queue_free(), which will lead to a double-free bug.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>

> ---
>   net/colo.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/colo.c b/net/colo.c
> index 6a6eacd..8cc166b 100644
> --- a/net/colo.c
> +++ b/net/colo.c
> @@ -147,9 +147,9 @@ void connection_destroy(void *opaque)
>       Connection *conn = opaque;
>   
>       g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
> -    g_queue_free(&conn->primary_list);
> +    g_queue_clear(&conn->primary_list);
>       g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
> -    g_queue_free(&conn->secondary_list);
> +    g_queue_clear(&conn->secondary_list);
>       g_slice_free(Connection, conn);
>   }
>   

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-22  8:07   ` Jason Wang
@ 2017-02-22  8:45     ` Hailiang Zhang
  2017-02-22  8:51       ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-22  8:45 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/22 16:07, Jason Wang wrote:
>
>
> On 2017年02月22日 11:46, zhanghailiang wrote:
>> After a net connection is closed, we didn't clear its releated resources
>> in connection_track_table, which will lead to memory leak.
>
> Not a real leak but would lead reset of hash table if too many closed
> connections.
>

Yes, you are right, there will be lots of stale connection data in hash table
if we don't remove it while it is been closed. Which

>>
>> Let't track the state of net connection, if it is closed, its related
>> resources will be cleared up.
>
> The issue is the state were tracked partially, do we need a full state
> machine here?
>

Not, IMHO, we only care about the last state of it, because, we will do nothing
even if we track the intermedial states.

>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>    net/colo.h            |  4 +++
>>    net/filter-rewriter.c | 70 +++++++++++++++++++++++++++++++++++++++++++++------
>>    2 files changed, 67 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/colo.h b/net/colo.h
>> index 7c524f3..cd9027f 100644
>> --- a/net/colo.h
>> +++ b/net/colo.h
>> @@ -18,6 +18,7 @@
>>    #include "slirp/slirp.h"
>>    #include "qemu/jhash.h"
>>    #include "qemu/timer.h"
>> +#include "slirp/tcp.h"
>>
>>    #define HASHTABLE_MAX_SIZE 16384
>>
>> @@ -69,6 +70,9 @@ typedef struct Connection {
>>         * run once in independent tcp connection
>>         */
>>        int syn_flag;
>> +
>> +    int tcp_state; /* TCP FSM state */
>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>    } Connection;
>>
>>    uint32_t connection_key_hash(const void *opaque);
>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>> index c4ab91c..7e7ec35 100644
>> --- a/net/filter-rewriter.c
>> +++ b/net/filter-rewriter.c
>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>>    }
>>
>>    /* handle tcp packet from primary guest */
>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>                                      Connection *conn,
>> -                                  Packet *pkt)
>> +                                  Packet *pkt, ConnectionKey *key)
>>    {
>>        struct tcphdr *tcp_pkt;
>>
>> @@ -97,15 +97,45 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
>>            tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>>
>>            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +        /*
>> +         * Case 1:
>> +         * The *server* side of this connect is VM, *client* tries to close
>> +         * the connection.
>> +         *
>> +         * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
>> +         * packet from server side. From this point, we can ensure that there
>> +         * will be no packets in the connection, except that, some errors
>> +         * happen between the path of 'filter object' and vNIC, if this rare
>> +         * case really happen, we can still create a new connection,
>> +         * So it is safe to remove the connection from connection_track_table.
>> +         *
>> +         */
>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> +            fprintf(stderr, "Remove conn "
>
> Can this even compile?
>

Oops, i forgot to remove it, will remove it in next version.

>> +            g_hash_table_remove(rf->connection_track_table, key);
>> +        }
>> +    }
>> +    /*
>> +     * Case 2:
>> +     * The *server* side of this connect is VM, *server* tries to close
>> +     * the connection.
>> +     *
>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>> +     * record the seq of 'fin=1, ack=1' packet.
>> +     */
>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>> +        conn->tcp_state = TCPS_LAST_ACK;
>>        }
>>
>>        return 0;
>>    }
>>
>>    /* handle tcp packet from secondary guest */
>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>                                        Connection *conn,
>> -                                    Packet *pkt)
>> +                                    Packet *pkt, ConnectionKey *key)
>>    {
>>        struct tcphdr *tcp_pkt;
>>
>> @@ -133,8 +163,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>            tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>>
>>            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +        /*
>> +         * Case 2:
>> +         * The *server* side of this connect is VM, *server* tries to close
>> +         * the connection.
>> +         *
>> +         * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
>> +         * packet from client side. Like Case 1, there should be no packets
>> +         * in the connection from now know, But the difference here is
>> +         * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
>> +         * TODO: Fix above case.
>> +         */
>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> +            g_hash_table_remove(rf->connection_track_table, key);
>> +        }
>> +    }
>> +    /*
>> +     * Case 1:
>> +     * The *server* side of this connect is VM, *client* tries to close
>> +     * the connection.
>> +     *
>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>> +     * record the seq of 'fin=1, ack=1' packet.
>> +     */
>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>> +        conn->tcp_state = TCPS_LAST_ACK;
>
> I thought the tcp_state should store the state of TCP from the view of
> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion. And
> the handle of active close needs more states here. E.g if connection is
> in FIN_WAIT_2, the connection is only half closed, remote peer can still
> send packet to us unless we receive a FIN.
>

Yes, i know what you mean, actually, here, we try to only track the last
two steps for closing a connection, that is 'fin=1,ack=1,seq=2,ack=u+1'
and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
ensure that the 'fin=1,seq=u' packet has been posted.

Another reason is we may can't track the 'fin=1,seq=u' packet while
we start COLO while one connection is closing, which the 'fin=1,seq=u' packet
has been posted.

Actually, here, if we start COLO while one connection is closing, which the
'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In this
case, the connection will be left in hash table for ever though it is harmless.
Any ideas for this case ?

For the above codes question, i'd like to change tcp_state to tap_closing_wait,
is it OK ?

Thanks.
Hailiang

> Thanks
>
>>        }
>> -
>>        return 0;
>>    }
>>
>> @@ -178,7 +234,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>>
>>            if (sender == nf->netdev) {
>>                /* NET_FILTER_DIRECTION_TX */
>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>                    qemu_net_queue_send(s->incoming_queue, sender, 0,
>>                    (const uint8_t *)pkt->data, pkt->size, NULL);
>>                    packet_destroy(pkt, NULL);
>> @@ -191,7 +247,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>>                }
>>            } else {
>>                /* NET_FILTER_DIRECTION_RX */
>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>                    qemu_net_queue_send(s->incoming_queue, sender, 0,
>>                    (const uint8_t *)pkt->data, pkt->size, NULL);
>>                    packet_destroy(pkt, NULL);
>
>
> .
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-22  8:45     ` Hailiang Zhang
@ 2017-02-22  8:51       ` Hailiang Zhang
  2017-02-23  4:16         ` Jason Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-22  8:51 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/22 16:45, Hailiang Zhang wrote:
> On 2017/2/22 16:07, Jason Wang wrote:
>>
>>
>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>> After a net connection is closed, we didn't clear its releated resources
>>> in connection_track_table, which will lead to memory leak.
>>
>> Not a real leak but would lead reset of hash table if too many closed
>> connections.
>>
>
> Yes, you are right, there will be lots of stale connection data in hash table
> if we don't remove it while it is been closed. Which
>
>>>
>>> Let't track the state of net connection, if it is closed, its related
>>> resources will be cleared up.
>>
>> The issue is the state were tracked partially, do we need a full state
>> machine here?
>>
>
> Not, IMHO, we only care about the last state of it, because, we will do nothing
> even if we track the intermedial states.
>
>>>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> ---
>>>     net/colo.h            |  4 +++
>>>     net/filter-rewriter.c | 70 +++++++++++++++++++++++++++++++++++++++++++++------
>>>     2 files changed, 67 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/net/colo.h b/net/colo.h
>>> index 7c524f3..cd9027f 100644
>>> --- a/net/colo.h
>>> +++ b/net/colo.h
>>> @@ -18,6 +18,7 @@
>>>     #include "slirp/slirp.h"
>>>     #include "qemu/jhash.h"
>>>     #include "qemu/timer.h"
>>> +#include "slirp/tcp.h"
>>>
>>>     #define HASHTABLE_MAX_SIZE 16384
>>>
>>> @@ -69,6 +70,9 @@ typedef struct Connection {
>>>          * run once in independent tcp connection
>>>          */
>>>         int syn_flag;
>>> +
>>> +    int tcp_state; /* TCP FSM state */
>>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>>     } Connection;
>>>
>>>     uint32_t connection_key_hash(const void *opaque);
>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>>> index c4ab91c..7e7ec35 100644
>>> --- a/net/filter-rewriter.c
>>> +++ b/net/filter-rewriter.c
>>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>>>     }
>>>
>>>     /* handle tcp packet from primary guest */
>>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>>                                       Connection *conn,
>>> -                                  Packet *pkt)
>>> +                                  Packet *pkt, ConnectionKey *key)
>>>     {
>>>         struct tcphdr *tcp_pkt;
>>>
>>> @@ -97,15 +97,45 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
>>>             tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>>>
>>>             net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>> +        /*
>>> +         * Case 1:
>>> +         * The *server* side of this connect is VM, *client* tries to close
>>> +         * the connection.
>>> +         *
>>> +         * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
>>> +         * packet from server side. From this point, we can ensure that there
>>> +         * will be no packets in the connection, except that, some errors
>>> +         * happen between the path of 'filter object' and vNIC, if this rare
>>> +         * case really happen, we can still create a new connection,
>>> +         * So it is safe to remove the connection from connection_track_table.
>>> +         *
>>> +         */
>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>> +            fprintf(stderr, "Remove conn "
>>
>> Can this even compile?
>>
>
> Oops, i forgot to remove it, will remove it in next version.
>
>>> +            g_hash_table_remove(rf->connection_track_table, key);
>>> +        }
>>> +    }
>>> +    /*
>>> +     * Case 2:
>>> +     * The *server* side of this connect is VM, *server* tries to close
>>> +     * the connection.
>>> +     *
>>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>>> +     * record the seq of 'fin=1, ack=1' packet.
>>> +     */
>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>         }
>>>
>>>         return 0;
>>>     }
>>>
>>>     /* handle tcp packet from secondary guest */
>>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>>                                         Connection *conn,
>>> -                                    Packet *pkt)
>>> +                                    Packet *pkt, ConnectionKey *key)
>>>     {
>>>         struct tcphdr *tcp_pkt;
>>>
>>> @@ -133,8 +163,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>>             tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>>>
>>>             net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>> +        /*
>>> +         * Case 2:
>>> +         * The *server* side of this connect is VM, *server* tries to close
>>> +         * the connection.
>>> +         *
>>> +         * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
>>> +         * packet from client side. Like Case 1, there should be no packets
>>> +         * in the connection from now know, But the difference here is
>>> +         * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
>>> +         * TODO: Fix above case.
>>> +         */
>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>> +            g_hash_table_remove(rf->connection_track_table, key);
>>> +        }
>>> +    }
>>> +    /*
>>> +     * Case 1:
>>> +     * The *server* side of this connect is VM, *client* tries to close
>>> +     * the connection.
>>> +     *
>>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>>> +     * record the seq of 'fin=1, ack=1' packet.
>>> +     */
>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>
>> I thought the tcp_state should store the state of TCP from the view of
>> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion. And
>> the handle of active close needs more states here. E.g if connection is
>> in FIN_WAIT_2, the connection is only half closed, remote peer can still
>> send packet to us unless we receive a FIN.
>>
>
> Yes, i know what you mean, actually, here, we try to only track the last
> two steps for closing a connection, that is 'fin=1,ack=1,seq=2,ack=u+1'
                                              ^ 'FIN=1,ACK=1,seq=w,ack=u+1'

> and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
      ^ 'ACK=1,seq=u+1,ack=w+1'                     ^ 'FIN=1,ACK=1'

> ensure that the 'fin=1,seq=u' packet has been posted.
>
                   ^ 'FIN=1,seq=u'

> Another reason is we may can't track the 'fin=1,seq=u' packet while
> we start COLO while one connection is closing, which the 'fin=1,seq=u' packet
> has been posted.
>
> Actually, here, if we start COLO while one connection is closing, which the
> 'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In this

  ^ 'FIN=1,ACK=1'

Sorry for the typo. :)

> case, the connection will be left in hash table for ever though it is harmless.
> Any ideas for this case ?
>
> For the above codes question, i'd like to change tcp_state to tap_closing_wait,
> is it OK ?
>
> Thanks.
> Hailiang
>
>> Thanks
>>
>>>         }
>>> -
>>>         return 0;
>>>     }
>>>
>>> @@ -178,7 +234,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>>>
>>>             if (sender == nf->netdev) {
>>>                 /* NET_FILTER_DIRECTION_TX */
>>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>>                     qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>                     (const uint8_t *)pkt->data, pkt->size, NULL);
>>>                     packet_destroy(pkt, NULL);
>>> @@ -191,7 +247,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>>>                 }
>>>             } else {
>>>                 /* NET_FILTER_DIRECTION_RX */
>>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>>                     qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>                     (const uint8_t *)pkt->data, pkt->size, NULL);
>>>                     packet_destroy(pkt, NULL);
>>
>>
>> .
>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-22  8:51       ` Hailiang Zhang
@ 2017-02-23  4:16         ` Jason Wang
  2017-02-27  3:11           ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2017-02-23  4:16 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst, lizhijian
  Cc: xuquan8, qemu-devel, pss.wulizhen



On 2017年02月22日 16:51, Hailiang Zhang wrote:
> On 2017/2/22 16:45, Hailiang Zhang wrote:
>> On 2017/2/22 16:07, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>>> After a net connection is closed, we didn't clear its releated 
>>>> resources
>>>> in connection_track_table, which will lead to memory leak.
>>>
>>> Not a real leak but would lead reset of hash table if too many closed
>>> connections.
>>>
>>
>> Yes, you are right, there will be lots of stale connection data in 
>> hash table
>> if we don't remove it while it is been closed. Which


Ok, so let's come up with a better title of the patch.

>>
>>>>
>>>> Let't track the state of net connection, if it is closed, its related
>>>> resources will be cleared up.
>>>
>>> The issue is the state were tracked partially, do we need a full state
>>> machine here?
>>>
>>
>> Not, IMHO, we only care about the last state of it, because, we will 
>> do nothing
>> even if we track the intermedial states.

Well, you care at least syn state too. Without a complete state machine, 
it's very hard to track even partial state I believe. And you will fail 
to track some state transition for sure which makes the code fragile.

>>
>>>>
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> ---
>>>>     net/colo.h            |  4 +++
>>>>     net/filter-rewriter.c | 70 
>>>> +++++++++++++++++++++++++++++++++++++++++++++------
>>>>     2 files changed, 67 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/net/colo.h b/net/colo.h
>>>> index 7c524f3..cd9027f 100644
>>>> --- a/net/colo.h
>>>> +++ b/net/colo.h
>>>> @@ -18,6 +18,7 @@
>>>>     #include "slirp/slirp.h"
>>>>     #include "qemu/jhash.h"
>>>>     #include "qemu/timer.h"
>>>> +#include "slirp/tcp.h"
>>>>
>>>>     #define HASHTABLE_MAX_SIZE 16384
>>>>
>>>> @@ -69,6 +70,9 @@ typedef struct Connection {
>>>>          * run once in independent tcp connection
>>>>          */
>>>>         int syn_flag;
>>>> +
>>>> +    int tcp_state; /* TCP FSM state */
>>>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>>>     } Connection;
>>>>
>>>>     uint32_t connection_key_hash(const void *opaque);
>>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>>>> index c4ab91c..7e7ec35 100644
>>>> --- a/net/filter-rewriter.c
>>>> +++ b/net/filter-rewriter.c
>>>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>>>>     }
>>>>
>>>>     /* handle tcp packet from primary guest */
>>>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>>>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>>>                                       Connection *conn,
>>>> -                                  Packet *pkt)
>>>> +                                  Packet *pkt, ConnectionKey *key)
>>>>     {
>>>>         struct tcphdr *tcp_pkt;
>>>>
>>>> @@ -97,15 +97,45 @@ static int 
>>>> handle_primary_tcp_pkt(NetFilterState *nf,
>>>>             tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + 
>>>> conn->offset);
>>>>
>>>>             net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>>> +        /*
>>>> +         * Case 1:
>>>> +         * The *server* side of this connect is VM, *client* tries 
>>>> to close
>>>> +         * the connection.
>>>> +         *
>>>> +         * We got 'ack=1' packets from client side, it acks 
>>>> 'fin=1, ack=1'
>>>> +         * packet from server side. From this point, we can ensure 
>>>> that there
>>>> +         * will be no packets in the connection, except that, some 
>>>> errors
>>>> +         * happen between the path of 'filter object' and vNIC, if 
>>>> this rare
>>>> +         * case really happen, we can still create a new connection,
>>>> +         * So it is safe to remove the connection from 
>>>> connection_track_table.
>>>> +         *
>>>> +         */
>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>> +            fprintf(stderr, "Remove conn "
>>>
>>> Can this even compile?
>>>
>>
>> Oops, i forgot to remove it, will remove it in next version.
>>
>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>> +        }
>>>> +    }
>>>> +    /*
>>>> +     * Case 2:
>>>> +     * The *server* side of this connect is VM, *server* tries to 
>>>> close
>>>> +     * the connection.
>>>> +     *
>>>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>> +     */
>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | 
>>>> TH_FIN)) {
>>>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>         }
>>>>
>>>>         return 0;
>>>>     }
>>>>
>>>>     /* handle tcp packet from secondary guest */
>>>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>>>                                         Connection *conn,
>>>> -                                    Packet *pkt)
>>>> +                                    Packet *pkt, ConnectionKey *key)
>>>>     {
>>>>         struct tcphdr *tcp_pkt;
>>>>
>>>> @@ -133,8 +163,34 @@ static int 
>>>> handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>             tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - 
>>>> conn->offset);
>>>>
>>>>             net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>>> +        /*
>>>> +         * Case 2:
>>>> +         * The *server* side of this connect is VM, *server* tries 
>>>> to close
>>>> +         * the connection.
>>>> +         *
>>>> +         * We got 'ack=1' packets from server side, it acks 
>>>> 'fin=1, ack=1'
>>>> +         * packet from client side. Like Case 1, there should be 
>>>> no packets
>>>> +         * in the connection from now know, But the difference 
>>>> here is
>>>> +         * if the packet is lost, We will get the resent 
>>>> 'fin=1,ack=1' packet.
>>>> +         * TODO: Fix above case.
>>>> +         */
>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>> +        }
>>>> +    }
>>>> +    /*
>>>> +     * Case 1:
>>>> +     * The *server* side of this connect is VM, *client* tries to 
>>>> close
>>>> +     * the connection.
>>>> +     *
>>>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>> +     */
>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | 
>>>> TH_FIN)) {
>>>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>
>>> I thought the tcp_state should store the state of TCP from the view of
>>> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion. 
>>> And
>>> the handle of active close needs more states here. E.g if connection is
>>> in FIN_WAIT_2, the connection is only half closed, remote peer can 
>>> still
>>> send packet to us unless we receive a FIN.
>>>
>>
>> Yes, i know what you mean, actually, here, we try to only track the last
>> two steps for closing a connection, that is 'fin=1,ack=1,seq=2,ack=u+1'
>                                              ^ 
> 'FIN=1,ACK=1,seq=w,ack=u+1'
>
>> and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
>      ^ 'ACK=1,seq=u+1,ack=w+1'                     ^ 'FIN=1,ACK=1'
>
>> ensure that the 'fin=1,seq=u' packet has been posted.
>>
>                   ^ 'FIN=1,seq=u'

That's just the case I'm saying, the transition above is in fact:

secondary(ESTABLISHED)
secondary(FIN_WAIT_1): ->  FIN,seq=w,ack=u+1   -> :remote
secondary(FIN_WAIT_2): <-  seq=u+1,ack=w+1     <- :remote

So we are in fact in FIN_WAIT_2, which means the connection is only half 
closed, but your patch will treat this as fully closed connection and 
will remove the connection from the hashtable.

What's more I don't think we can decide passive or active close by:


+    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {

Since both cases will send FIN,ACK for sure.

>
>> Another reason is we may can't track the 'fin=1,seq=u' packet while
>> we start COLO while one connection is closing, which the 
>> 'fin=1,seq=u' packet
>> has been posted.
>>
>> Actually, here, if we start COLO while one connection is closing, 
>> which the
>> 'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In this
>
>  ^ 'FIN=1,ACK=1'
>
> Sorry for the typo. :)
>
>> case, the connection will be left in hash table for ever though it is 
>> harmless.
>> Any ideas for this case ?

Sorry I don't follow the question.

>>
>> For the above codes question, i'd like to change tcp_state to 
>> tap_closing_wait,
>> is it OK ?

You mean "tcp_closing_wait". I think we need first figure out if we can 
track the state correctly first.

Thanks

>>
>> Thanks.
>> Hailiang
>>
>>> Thanks
>>>
>>>>         }
>>>> -
>>>>         return 0;
>>>>     }
>>>>
>>>> @@ -178,7 +234,7 @@ static ssize_t 
>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>
>>>>             if (sender == nf->netdev) {
>>>>                 /* NET_FILTER_DIRECTION_TX */
>>>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>>>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>                     (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>                     packet_destroy(pkt, NULL);
>>>> @@ -191,7 +247,7 @@ static ssize_t 
>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>                 }
>>>>             } else {
>>>>                 /* NET_FILTER_DIRECTION_RX */
>>>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>>>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>                     (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>                     packet_destroy(pkt, NULL);
>>>
>>>
>>> .
>>>
>
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0
  2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0 zhanghailiang
@ 2017-02-24  8:08   ` Zhang Chen
  2017-02-24  8:23     ` Zhang Chen
  2017-02-27  1:36     ` Hailiang Zhang
  0 siblings, 2 replies; 21+ messages in thread
From: Zhang Chen @ 2017-02-24  8:08 UTC (permalink / raw)
  To: zhanghailiang, jasowang, lizhijian
  Cc: zhangchen.fnst, qemu-devel, xuquan8, pss.wulizhen



On 02/22/2017 11:46 AM, zhanghailiang wrote:
> While the offset of packets's sequence for primary side and
> secondary side is zero, it is unnecessary to call net_checksum_calculate()
> to recalculate the checksume value of packets.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   net/filter-rewriter.c | 18 +++++++++++-------
>   1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
> index 7e7ec35..c9a6d43 100644
> --- a/net/filter-rewriter.c
> +++ b/net/filter-rewriter.c
> @@ -93,10 +93,12 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
>               conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
>               conn->syn_flag = 0;
>           }
> -        /* handle packets to the secondary from the primary */
> -        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
> +        if (conn->offset) {

This is wrong, conn->offset maybe is a negative value(like -1000),
So you can change here to  "if (conn->offset == 0) {"


> +            /* handle packets to the secondary from the primary */
> +            tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>   
> -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
> +        }
>           /*
>            * Case 1:
>            * The *server* side of this connect is VM, *client* tries to close
> @@ -112,7 +114,6 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
>            */
>           if ((conn->tcp_state == TCPS_LAST_ACK) &&
>               (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
> -            fprintf(stderr, "Remove conn "

Here need fix.

>               g_hash_table_remove(rf->connection_track_table, key);
>           }
>       }
> @@ -159,10 +160,13 @@ static int handle_secondary_tcp_pkt(RewriterState *rf,
>       }
>   
>       if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
> -        /* handle packets to the primary from the secondary*/
> -        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
> +        /* Only need to adjust seq while offset is Non-zero */
> +        if (conn->offset) {

Refer to the above comments.

Thanks
Zhang Chen

> +            /* handle packets to the primary from the secondary*/
> +            tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>   
> -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
> +        }
>           /*
>            * Case 2:
>            * The *server* side of this connect is VM, *server* tries to close

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0
  2017-02-24  8:08   ` Zhang Chen
@ 2017-02-24  8:23     ` Zhang Chen
  2017-02-27  1:36     ` Hailiang Zhang
  1 sibling, 0 replies; 21+ messages in thread
From: Zhang Chen @ 2017-02-24  8:23 UTC (permalink / raw)
  To: zhanghailiang, jasowang, lizhijian
  Cc: zhangchen.fnst, qemu-devel, xuquan8, pss.wulizhen



On 02/24/2017 04:08 PM, Zhang Chen wrote:
>
>
> On 02/22/2017 11:46 AM, zhanghailiang wrote:
>> While the offset of packets's sequence for primary side and
>> secondary side is zero, it is unnecessary to call 
>> net_checksum_calculate()
>> to recalculate the checksume value of packets.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   net/filter-rewriter.c | 18 +++++++++++-------
>>   1 file changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>> index 7e7ec35..c9a6d43 100644
>> --- a/net/filter-rewriter.c
>> +++ b/net/filter-rewriter.c
>> @@ -93,10 +93,12 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
>>               conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
>>               conn->syn_flag = 0;
>>           }
>> -        /* handle packets to the secondary from the primary */
>> -        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>> +        if (conn->offset) {
>
> This is wrong, conn->offset maybe is a negative value(like -1000),
> So you can change here to  "if (conn->offset == 0) {"

s/conn->offset == 0 / conn->offset != 0

>
>
>> +            /* handle packets to the secondary from the primary */
>> +            tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + 
>> conn->offset);
>>   -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +        }
>>           /*
>>            * Case 1:
>>            * The *server* side of this connect is VM, *client* tries 
>> to close
>> @@ -112,7 +114,6 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
>>            */
>>           if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>               (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> -            fprintf(stderr, "Remove conn "
>
> Here need fix.
>
>> g_hash_table_remove(rf->connection_track_table, key);
>>           }
>>       }
>> @@ -159,10 +160,13 @@ static int 
>> handle_secondary_tcp_pkt(RewriterState *rf,
>>       }
>>         if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
>> -        /* handle packets to the primary from the secondary*/
>> -        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>> +        /* Only need to adjust seq while offset is Non-zero */
>> +        if (conn->offset) {
>
> Refer to the above comments.
>
> Thanks
> Zhang Chen
>
>> +            /* handle packets to the primary from the secondary*/
>> +            tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - 
>> conn->offset);
>>   -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +        }
>>           /*
>>            * Case 2:
>>            * The *server* side of this connect is VM, *server* tries 
>> to close
>

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0
  2017-02-24  8:08   ` Zhang Chen
  2017-02-24  8:23     ` Zhang Chen
@ 2017-02-27  1:36     ` Hailiang Zhang
  2017-02-27  3:44       ` Zhang Chen
  1 sibling, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-27  1:36 UTC (permalink / raw)
  To: Zhang Chen, jasowang, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/24 16:08, Zhang Chen wrote:
>
>
> On 02/22/2017 11:46 AM, zhanghailiang wrote:
>> While the offset of packets's sequence for primary side and
>> secondary side is zero, it is unnecessary to call net_checksum_calculate()
>> to recalculate the checksume value of packets.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>    net/filter-rewriter.c | 18 +++++++++++-------
>>    1 file changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>> index 7e7ec35..c9a6d43 100644
>> --- a/net/filter-rewriter.c
>> +++ b/net/filter-rewriter.c
>> @@ -93,10 +93,12 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
>>                conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
>>                conn->syn_flag = 0;
>>            }
>> -        /* handle packets to the secondary from the primary */
>> -        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>> +        if (conn->offset) {
>
> This is wrong, conn->offset maybe is a negative value(like -1000),
> So you can change here to  "if (conn->offset == 0) {"
>

Er, if it is a negative value, it can still go into this if (conn->offset)
branch, and we need to adjust the checksum value in this case.

>
>> +            /* handle packets to the secondary from the primary */
>> +            tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>>
>> -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +        }
>>            /*
>>             * Case 1:
>>             * The *server* side of this connect is VM, *client* tries to close
>> @@ -112,7 +114,6 @@ static int handle_primary_tcp_pkt(RewriterState *rf,
>>             */
>>            if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>                (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> -            fprintf(stderr, "Remove conn "
>
> Here need fix.
>

OK.

>>                g_hash_table_remove(rf->connection_track_table, key);
>>            }
>>        }
>> @@ -159,10 +160,13 @@ static int handle_secondary_tcp_pkt(RewriterState *rf,
>>        }
>>
>>        if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
>> -        /* handle packets to the primary from the secondary*/
>> -        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>> +        /* Only need to adjust seq while offset is Non-zero */
>> +        if (conn->offset) {
>
> Refer to the above comments.
>
> Thanks
> Zhang Chen
>
>> +            /* handle packets to the primary from the secondary*/
>> +            tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>>
>> -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> +        }
>>            /*
>>             * Case 2:
>>             * The *server* side of this connect is VM, *server* tries to close
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-23  4:16         ` Jason Wang
@ 2017-02-27  3:11           ` Hailiang Zhang
  2017-02-27  3:40             ` Jason Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-27  3:11 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/23 12:16, Jason Wang wrote:
>
>
> On 2017年02月22日 16:51, Hailiang Zhang wrote:
>> On 2017/2/22 16:45, Hailiang Zhang wrote:
>>> On 2017/2/22 16:07, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>>>> After a net connection is closed, we didn't clear its releated
>>>>> resources
>>>>> in connection_track_table, which will lead to memory leak.
>>>>
>>>> Not a real leak but would lead reset of hash table if too many closed
>>>> connections.
>>>>
>>>
>>> Yes, you are right, there will be lots of stale connection data in
>>> hash table
>>> if we don't remove it while it is been closed. Which
>
>
> Ok, so let's come up with a better title of the patch.
>

OK.

>>>
>>>>>
>>>>> Let't track the state of net connection, if it is closed, its related
>>>>> resources will be cleared up.
>>>>
>>>> The issue is the state were tracked partially, do we need a full state
>>>> machine here?
>>>>
>>>
>>> Not, IMHO, we only care about the last state of it, because, we will
>>> do nothing
>>> even if we track the intermedial states.
>
> Well, you care at least syn state too. Without a complete state machine,
> it's very hard to track even partial state I believe. And you will fail
> to track some state transition for sure which makes the code fragile.
>

Agree, but here things are a little different. There are some extreme cases
that we may can't track the complete process of closing connection.
For example (I have explained that in the bellow, it seems that you didn't
got it ;) ).
If VM is running before we want to make it goes into COLO FT state,
there maybe some connections exist already, in extreme case, VM is going into
COLO state while some connections are in half closing state, we can only track
the bellow half closing state in filter-rewriter and colo compare object.

>>>
>>>>>
>>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>>> ---
>>>>>      net/colo.h            |  4 +++
>>>>>      net/filter-rewriter.c | 70
>>>>> +++++++++++++++++++++++++++++++++++++++++++++------
>>>>>      2 files changed, 67 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/net/colo.h b/net/colo.h
>>>>> index 7c524f3..cd9027f 100644
>>>>> --- a/net/colo.h
>>>>> +++ b/net/colo.h
>>>>> @@ -18,6 +18,7 @@
>>>>>      #include "slirp/slirp.h"
>>>>>      #include "qemu/jhash.h"
>>>>>      #include "qemu/timer.h"
>>>>> +#include "slirp/tcp.h"
>>>>>
>>>>>      #define HASHTABLE_MAX_SIZE 16384
>>>>>
>>>>> @@ -69,6 +70,9 @@ typedef struct Connection {
>>>>>           * run once in independent tcp connection
>>>>>           */
>>>>>          int syn_flag;
>>>>> +
>>>>> +    int tcp_state; /* TCP FSM state */
>>>>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>>>>      } Connection;
>>>>>
>>>>>      uint32_t connection_key_hash(const void *opaque);
>>>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>>>>> index c4ab91c..7e7ec35 100644
>>>>> --- a/net/filter-rewriter.c
>>>>> +++ b/net/filter-rewriter.c
>>>>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>>>>>      }
>>>>>
>>>>>      /* handle tcp packet from primary guest */
>>>>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>>>>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>>>>                                        Connection *conn,
>>>>> -                                  Packet *pkt)
>>>>> +                                  Packet *pkt, ConnectionKey *key)
>>>>>      {
>>>>>          struct tcphdr *tcp_pkt;
>>>>>
>>>>> @@ -97,15 +97,45 @@ static int
>>>>> handle_primary_tcp_pkt(NetFilterState *nf,
>>>>>              tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) +
>>>>> conn->offset);
>>>>>
>>>>>              net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>>>> +        /*
>>>>> +         * Case 1:
>>>>> +         * The *server* side of this connect is VM, *client* tries
>>>>> to close
>>>>> +         * the connection.
>>>>> +         *
>>>>> +         * We got 'ack=1' packets from client side, it acks
>>>>> 'fin=1, ack=1'
>>>>> +         * packet from server side. From this point, we can ensure
>>>>> that there
>>>>> +         * will be no packets in the connection, except that, some
>>>>> errors
>>>>> +         * happen between the path of 'filter object' and vNIC, if
>>>>> this rare
>>>>> +         * case really happen, we can still create a new connection,
>>>>> +         * So it is safe to remove the connection from
>>>>> connection_track_table.
>>>>> +         *
>>>>> +         */
>>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>>> +            fprintf(stderr, "Remove conn "
>>>>
>>>> Can this even compile?
>>>>
>>>
>>> Oops, i forgot to remove it, will remove it in next version.
>>>
>>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>>> +        }
>>>>> +    }
>>>>> +    /*
>>>>> +     * Case 2:
>>>>> +     * The *server* side of this connect is VM, *server* tries to
>>>>> close
>>>>> +     * the connection.
>>>>> +     *
>>>>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>>> +     */
>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>> TH_FIN)) {
>>>>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>>          }
>>>>>
>>>>>          return 0;
>>>>>      }
>>>>>
>>>>>      /* handle tcp packet from secondary guest */
>>>>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>>>>                                          Connection *conn,
>>>>> -                                    Packet *pkt)
>>>>> +                                    Packet *pkt, ConnectionKey *key)
>>>>>      {
>>>>>          struct tcphdr *tcp_pkt;
>>>>>
>>>>> @@ -133,8 +163,34 @@ static int
>>>>> handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>>              tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) -
>>>>> conn->offset);
>>>>>
>>>>>              net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>>>> +        /*
>>>>> +         * Case 2:
>>>>> +         * The *server* side of this connect is VM, *server* tries
>>>>> to close
>>>>> +         * the connection.
>>>>> +         *
>>>>> +         * We got 'ack=1' packets from server side, it acks
>>>>> 'fin=1, ack=1'
>>>>> +         * packet from client side. Like Case 1, there should be
>>>>> no packets
>>>>> +         * in the connection from now know, But the difference
>>>>> here is
>>>>> +         * if the packet is lost, We will get the resent
>>>>> 'fin=1,ack=1' packet.
>>>>> +         * TODO: Fix above case.
>>>>> +         */
>>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>>> +        }
>>>>> +    }
>>>>> +    /*
>>>>> +     * Case 1:
>>>>> +     * The *server* side of this connect is VM, *client* tries to
>>>>> close
>>>>> +     * the connection.
>>>>> +     *
>>>>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>>> +     */
>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>> TH_FIN)) {
>>>>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>
>>>> I thought the tcp_state should store the state of TCP from the view of
>>>> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion.
>>>> And
>>>> the handle of active close needs more states here. E.g if connection is
>>>> in FIN_WAIT_2, the connection is only half closed, remote peer can
>>>> still
>>>> send packet to us unless we receive a FIN.
>>>>
>>>
>>> Yes, i know what you mean, actually, here, we try to only track the last
>>> two steps for closing a connection, that is 'fin=1,ack=1,seq=2,ack=u+1'
>>                                               ^
>> 'FIN=1,ACK=1,seq=w,ack=u+1'
>>
>>> and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
>>       ^ 'ACK=1,seq=u+1,ack=w+1'                     ^ 'FIN=1,ACK=1'
>>
>>> ensure that the 'fin=1,seq=u' packet has been posted.
>>>
>>                    ^ 'FIN=1,seq=u'
>
> That's just the case I'm saying, the transition above is in fact:
>
> secondary(ESTABLISHED)
> secondary(FIN_WAIT_1): ->  FIN,seq=w,ack=u+1   -> :remote
> secondary(FIN_WAIT_2): <-  seq=u+1,ack=w+1     <- :remote
>
> So we are in fact in FIN_WAIT_2, which means the connection is only half
> closed, but your patch will treat this as fully closed connection and
> will remove the connection from the hashtable.
>

Er, here we track the last two states 'FIN=1, ACK=1' and  'ACK=1' ( which asks
the 'FIN=1,ACK=1' packet, We will remove the connection while got the 'ACK=1'
packet, so  is it enough ?

> What's more I don't think we can decide passive or active close by:
>
>
> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>
> Since both cases will send FIN,ACK for sure.
>

I didn't quite understand, here we have tracked the closing request no matter
it is from the server side (passive close ?) or client side ( active close ?).
You can refer to the comment in codes, 'Case 1' and 'Case 2' comments.

Here, it seems that we can't track one case which both sides send the closing
requests at the same time, in that case, there are only 'FIN=1' and 'ACK=1'
packets.


Thanks.
Hailiang

>>
>>> Another reason is we may can't track the 'fin=1,seq=u' packet while
>>> we start COLO while one connection is closing, which the
>>> 'fin=1,seq=u' packet
>>> has been posted.
>>>
>>> Actually, here, if we start COLO while one connection is closing,
>>> which the
>>> 'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In this
>>
>>   ^ 'FIN=1,ACK=1'
>>
>> Sorry for the typo. :)
>>
>>> case, the connection will be left in hash table for ever though it is
>>> harmless.
>>> Any ideas for this case ?
>
> Sorry I don't follow the question.
>
>>>
>>> For the above codes question, i'd like to change tcp_state to
>>> tap_closing_wait,
>>> is it OK ?
>
> You mean "tcp_closing_wait". I think we need first figure out if we can
> track the state correctly first.
>
> Thanks
>
>>>
>>> Thanks.
>>> Hailiang
>>>
>>>> Thanks
>>>>
>>>>>          }
>>>>> -
>>>>>          return 0;
>>>>>      }
>>>>>
>>>>> @@ -178,7 +234,7 @@ static ssize_t
>>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>>
>>>>>              if (sender == nf->netdev) {
>>>>>                  /* NET_FILTER_DIRECTION_TX */
>>>>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>>>>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>>                      (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>>                      packet_destroy(pkt, NULL);
>>>>> @@ -191,7 +247,7 @@ static ssize_t
>>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>>                  }
>>>>>              } else {
>>>>>                  /* NET_FILTER_DIRECTION_RX */
>>>>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>>>>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>>                      (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>>                      packet_destroy(pkt, NULL);
>>>>
>>>>
>>>> .
>>>>
>>
>>
>
>
> .
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27  3:11           ` Hailiang Zhang
@ 2017-02-27  3:40             ` Jason Wang
  2017-02-27  4:09               ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2017-02-27  3:40 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst, lizhijian
  Cc: xuquan8, qemu-devel, pss.wulizhen



On 2017年02月27日 11:11, Hailiang Zhang wrote:
> On 2017/2/23 12:16, Jason Wang wrote:
>>
>>
>> On 2017年02月22日 16:51, Hailiang Zhang wrote:
>>> On 2017/2/22 16:45, Hailiang Zhang wrote:
>>>> On 2017/2/22 16:07, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>>>>> After a net connection is closed, we didn't clear its releated
>>>>>> resources
>>>>>> in connection_track_table, which will lead to memory leak.
>>>>>
>>>>> Not a real leak but would lead reset of hash table if too many closed
>>>>> connections.
>>>>>
>>>>
>>>> Yes, you are right, there will be lots of stale connection data in
>>>> hash table
>>>> if we don't remove it while it is been closed. Which
>>
>>
>> Ok, so let's come up with a better title of the patch.
>>
>
> OK.
>
>>>>
>>>>>>
>>>>>> Let't track the state of net connection, if it is closed, its 
>>>>>> related
>>>>>> resources will be cleared up.
>>>>>
>>>>> The issue is the state were tracked partially, do we need a full 
>>>>> state
>>>>> machine here?
>>>>>
>>>>
>>>> Not, IMHO, we only care about the last state of it, because, we will
>>>> do nothing
>>>> even if we track the intermedial states.
>>
>> Well, you care at least syn state too. Without a complete state machine,
>> it's very hard to track even partial state I believe. And you will fail
>> to track some state transition for sure which makes the code fragile.
>>
>
> Agree, but here things are a little different. There are some extreme 
> cases
> that we may can't track the complete process of closing connection.
> For example (I have explained that in the bellow, it seems that you 
> didn't
> got it ;) ).
> If VM is running before we want to make it goes into COLO FT state,
> there maybe some connections exist already, in extreme case, VM is 
> going into
> COLO state while some connections are in half closing state, we can 
> only track
> the bellow half closing state in filter-rewriter and colo compare object.
>
>>>>
>>>>>>
>>>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>>>> ---
>>>>>>      net/colo.h            |  4 +++
>>>>>>      net/filter-rewriter.c | 70
>>>>>> +++++++++++++++++++++++++++++++++++++++++++++------
>>>>>>      2 files changed, 67 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> diff --git a/net/colo.h b/net/colo.h
>>>>>> index 7c524f3..cd9027f 100644
>>>>>> --- a/net/colo.h
>>>>>> +++ b/net/colo.h
>>>>>> @@ -18,6 +18,7 @@
>>>>>>      #include "slirp/slirp.h"
>>>>>>      #include "qemu/jhash.h"
>>>>>>      #include "qemu/timer.h"
>>>>>> +#include "slirp/tcp.h"
>>>>>>
>>>>>>      #define HASHTABLE_MAX_SIZE 16384
>>>>>>
>>>>>> @@ -69,6 +70,9 @@ typedef struct Connection {
>>>>>>           * run once in independent tcp connection
>>>>>>           */
>>>>>>          int syn_flag;
>>>>>> +
>>>>>> +    int tcp_state; /* TCP FSM state */
>>>>>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>>>>>      } Connection;
>>>>>>
>>>>>>      uint32_t connection_key_hash(const void *opaque);
>>>>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>>>>>> index c4ab91c..7e7ec35 100644
>>>>>> --- a/net/filter-rewriter.c
>>>>>> +++ b/net/filter-rewriter.c
>>>>>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>>>>>>      }
>>>>>>
>>>>>>      /* handle tcp packet from primary guest */
>>>>>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>>>>>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>>>>>                                        Connection *conn,
>>>>>> -                                  Packet *pkt)
>>>>>> +                                  Packet *pkt, ConnectionKey *key)
>>>>>>      {
>>>>>>          struct tcphdr *tcp_pkt;
>>>>>>
>>>>>> @@ -97,15 +97,45 @@ static int
>>>>>> handle_primary_tcp_pkt(NetFilterState *nf,
>>>>>>              tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) +
>>>>>> conn->offset);
>>>>>>
>>>>>>              net_checksum_calculate((uint8_t *)pkt->data, 
>>>>>> pkt->size);
>>>>>> +        /*
>>>>>> +         * Case 1:
>>>>>> +         * The *server* side of this connect is VM, *client* tries
>>>>>> to close
>>>>>> +         * the connection.
>>>>>> +         *
>>>>>> +         * We got 'ack=1' packets from client side, it acks
>>>>>> 'fin=1, ack=1'
>>>>>> +         * packet from server side. From this point, we can ensure
>>>>>> that there
>>>>>> +         * will be no packets in the connection, except that, some
>>>>>> errors
>>>>>> +         * happen between the path of 'filter object' and vNIC, if
>>>>>> this rare
>>>>>> +         * case really happen, we can still create a new 
>>>>>> connection,
>>>>>> +         * So it is safe to remove the connection from
>>>>>> connection_track_table.
>>>>>> +         *
>>>>>> +         */
>>>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>>>> +            fprintf(stderr, "Remove conn "
>>>>>
>>>>> Can this even compile?
>>>>>
>>>>
>>>> Oops, i forgot to remove it, will remove it in next version.
>>>>
>>>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>>>> +        }
>>>>>> +    }
>>>>>> +    /*
>>>>>> +     * Case 2:
>>>>>> +     * The *server* side of this connect is VM, *server* tries to
>>>>>> close
>>>>>> +     * the connection.
>>>>>> +     *
>>>>>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>>>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>>>> +     */
>>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>>> TH_FIN)) {
>>>>>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>>>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>>>          }
>>>>>>
>>>>>>          return 0;
>>>>>>      }
>>>>>>
>>>>>>      /* handle tcp packet from secondary guest */
>>>>>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>>>>>                                          Connection *conn,
>>>>>> -                                    Packet *pkt)
>>>>>> +                                    Packet *pkt, ConnectionKey 
>>>>>> *key)
>>>>>>      {
>>>>>>          struct tcphdr *tcp_pkt;
>>>>>>
>>>>>> @@ -133,8 +163,34 @@ static int
>>>>>> handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>>>              tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) -
>>>>>> conn->offset);
>>>>>>
>>>>>>              net_checksum_calculate((uint8_t *)pkt->data, 
>>>>>> pkt->size);
>>>>>> +        /*
>>>>>> +         * Case 2:
>>>>>> +         * The *server* side of this connect is VM, *server* tries
>>>>>> to close
>>>>>> +         * the connection.
>>>>>> +         *
>>>>>> +         * We got 'ack=1' packets from server side, it acks
>>>>>> 'fin=1, ack=1'
>>>>>> +         * packet from client side. Like Case 1, there should be
>>>>>> no packets
>>>>>> +         * in the connection from now know, But the difference
>>>>>> here is
>>>>>> +         * if the packet is lost, We will get the resent
>>>>>> 'fin=1,ack=1' packet.
>>>>>> +         * TODO: Fix above case.
>>>>>> +         */
>>>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>>>> +        }
>>>>>> +    }
>>>>>> +    /*
>>>>>> +     * Case 1:
>>>>>> +     * The *server* side of this connect is VM, *client* tries to
>>>>>> close
>>>>>> +     * the connection.
>>>>>> +     *
>>>>>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>>>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>>>> +     */
>>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>>> TH_FIN)) {
>>>>>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>>>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>>
>>>>> I thought the tcp_state should store the state of TCP from the 
>>>>> view of
>>>>> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion.
>>>>> And
>>>>> the handle of active close needs more states here. E.g if 
>>>>> connection is
>>>>> in FIN_WAIT_2, the connection is only half closed, remote peer can
>>>>> still
>>>>> send packet to us unless we receive a FIN.
>>>>>
>>>>
>>>> Yes, i know what you mean, actually, here, we try to only track the 
>>>> last
>>>> two steps for closing a connection, that is 
>>>> 'fin=1,ack=1,seq=2,ack=u+1'
>>>                                               ^
>>> 'FIN=1,ACK=1,seq=w,ack=u+1'
>>>
>>>> and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
>>>       ^ 'ACK=1,seq=u+1,ack=w+1'                     ^ 'FIN=1,ACK=1'
>>>
>>>> ensure that the 'fin=1,seq=u' packet has been posted.
>>>>
>>>                    ^ 'FIN=1,seq=u'
>>
>> That's just the case I'm saying, the transition above is in fact:
>>
>> secondary(ESTABLISHED)
>> secondary(FIN_WAIT_1): ->  FIN,seq=w,ack=u+1   -> :remote
>> secondary(FIN_WAIT_2): <-  seq=u+1,ack=w+1     <- :remote
>>
>> So we are in fact in FIN_WAIT_2, which means the connection is only half
>> closed, but your patch will treat this as fully closed connection and
>> will remove the connection from the hashtable.
>>
>
> Er, here we track the last two states 'FIN=1, ACK=1' and  'ACK=1' ( 
> which asks
> the 'FIN=1,ACK=1' packet, We will remove the connection while got the 
> 'ACK=1'
> packet, so  is it enough ?

But the connection is not closed in fact, no? It's legal for remote to 
continue sending tons of packet to us even after this.

>
>> What's more I don't think we can decide passive or active close by:
>>
>>
>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>>
>> Since both cases will send FIN,ACK for sure.
>>
>
> I didn't quite understand, here we have tracked the closing request no 
> matter
> it is from the server side (passive close ?) or client side ( active 
> close ?).
> You can refer to the comment in codes, 'Case 1' and 'Case 2' comments.

I think you need differ them since passive close is much simpler, and it 
seems that your code may work only in this case.

>
> Here, it seems that we can't track one case which both sides send the 
> closing
> requests at the same time, in that case, there are only 'FIN=1' and 
> 'ACK=1'
> packets.
>

Yes, RFC allows this.

Thanks

>
> Thanks.
> Hailiang
>
>>>
>>>> Another reason is we may can't track the 'fin=1,seq=u' packet while
>>>> we start COLO while one connection is closing, which the
>>>> 'fin=1,seq=u' packet
>>>> has been posted.
>>>>
>>>> Actually, here, if we start COLO while one connection is closing,
>>>> which the
>>>> 'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In 
>>>> this
>>>
>>>   ^ 'FIN=1,ACK=1'
>>>
>>> Sorry for the typo. :)
>>>
>>>> case, the connection will be left in hash table for ever though it is
>>>> harmless.
>>>> Any ideas for this case ?
>>
>> Sorry I don't follow the question.
>>
>>>>
>>>> For the above codes question, i'd like to change tcp_state to
>>>> tap_closing_wait,
>>>> is it OK ?
>>
>> You mean "tcp_closing_wait". I think we need first figure out if we can
>> track the state correctly first.
>>
>> Thanks
>>
>>>>
>>>> Thanks.
>>>> Hailiang
>>>>
>>>>> Thanks
>>>>>
>>>>>>          }
>>>>>> -
>>>>>>          return 0;
>>>>>>      }
>>>>>>
>>>>>> @@ -178,7 +234,7 @@ static ssize_t
>>>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>>>
>>>>>>              if (sender == nf->netdev) {
>>>>>>                  /* NET_FILTER_DIRECTION_TX */
>>>>>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>>>>>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>>>                      (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>>>                      packet_destroy(pkt, NULL);
>>>>>> @@ -191,7 +247,7 @@ static ssize_t
>>>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>>>                  }
>>>>>>              } else {
>>>>>>                  /* NET_FILTER_DIRECTION_RX */
>>>>>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>>>>>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>>>                      (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>>>                      packet_destroy(pkt, NULL);
>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>>
>>
>>
>> .
>>
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0
  2017-02-27  1:36     ` Hailiang Zhang
@ 2017-02-27  3:44       ` Zhang Chen
  0 siblings, 0 replies; 21+ messages in thread
From: Zhang Chen @ 2017-02-27  3:44 UTC (permalink / raw)
  To: Hailiang Zhang, jasowang, lizhijian
  Cc: zhangchen.fnst, xuquan8, qemu-devel, pss.wulizhen



On 02/27/2017 09:36 AM, Hailiang Zhang wrote:
> On 2017/2/24 16:08, Zhang Chen wrote:
>>
>>
>> On 02/22/2017 11:46 AM, zhanghailiang wrote:
>>> While the offset of packets's sequence for primary side and
>>> secondary side is zero, it is unnecessary to call 
>>> net_checksum_calculate()
>>> to recalculate the checksume value of packets.
>>>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> ---
>>>    net/filter-rewriter.c | 18 +++++++++++-------
>>>    1 file changed, 11 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>>> index 7e7ec35..c9a6d43 100644
>>> --- a/net/filter-rewriter.c
>>> +++ b/net/filter-rewriter.c
>>> @@ -93,10 +93,12 @@ static int handle_primary_tcp_pkt(RewriterState 
>>> *rf,
>>>                conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
>>>                conn->syn_flag = 0;
>>>            }
>>> -        /* handle packets to the secondary from the primary */
>>> -        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + 
>>> conn->offset);
>>> +        if (conn->offset) {
>>
>> This is wrong, conn->offset maybe is a negative value(like -1000),
>> So you can change here to  "if (conn->offset == 0) {"
>>
>
> Er, if it is a negative value, it can still go into this if 
> (conn->offset)
> branch, and we need to adjust the checksum value in this case.
>

Sorry, I misunderstand it, ignore.

Thanks
Zhang Chen

>>
>>> +            /* handle packets to the secondary from the primary */
>>> +            tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + 
>>> conn->offset);
>>>
>>> -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>> +        }
>>>            /*
>>>             * Case 1:
>>>             * The *server* side of this connect is VM, *client* 
>>> tries to close
>>> @@ -112,7 +114,6 @@ static int handle_primary_tcp_pkt(RewriterState 
>>> *rf,
>>>             */
>>>            if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>                (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>> -            fprintf(stderr, "Remove conn "
>>
>> Here need fix.
>>
>
> OK.
>
>>> g_hash_table_remove(rf->connection_track_table, key);
>>>            }
>>>        }
>>> @@ -159,10 +160,13 @@ static int 
>>> handle_secondary_tcp_pkt(RewriterState *rf,
>>>        }
>>>
>>>        if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
>>> -        /* handle packets to the primary from the secondary*/
>>> -        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - 
>>> conn->offset);
>>> +        /* Only need to adjust seq while offset is Non-zero */
>>> +        if (conn->offset) {
>>
>> Refer to the above comments.
>>
>> Thanks
>> Zhang Chen
>>
>>> +            /* handle packets to the primary from the secondary*/
>>> +            tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - 
>>> conn->offset);
>>>
>>> -        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>> +            net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>>> +        }
>>>            /*
>>>             * Case 2:
>>>             * The *server* side of this connect is VM, *server* 
>>> tries to close
>>
>
>
>
> .
>

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27  3:40             ` Jason Wang
@ 2017-02-27  4:09               ` Hailiang Zhang
  2017-02-27  5:35                 ` Jason Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-27  4:09 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/27 11:40, Jason Wang wrote:
>
>
> On 2017年02月27日 11:11, Hailiang Zhang wrote:
>> On 2017/2/23 12:16, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月22日 16:51, Hailiang Zhang wrote:
>>>> On 2017/2/22 16:45, Hailiang Zhang wrote:
>>>>> On 2017/2/22 16:07, Jason Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>>>>>> After a net connection is closed, we didn't clear its releated
>>>>>>> resources
>>>>>>> in connection_track_table, which will lead to memory leak.
>>>>>>
>>>>>> Not a real leak but would lead reset of hash table if too many closed
>>>>>> connections.
>>>>>>
>>>>>
>>>>> Yes, you are right, there will be lots of stale connection data in
>>>>> hash table
>>>>> if we don't remove it while it is been closed. Which
>>>
>>>
>>> Ok, so let's come up with a better title of the patch.
>>>
>>
>> OK.
>>
>>>>>
>>>>>>>
>>>>>>> Let't track the state of net connection, if it is closed, its
>>>>>>> related
>>>>>>> resources will be cleared up.
>>>>>>
>>>>>> The issue is the state were tracked partially, do we need a full
>>>>>> state
>>>>>> machine here?
>>>>>>
>>>>>
>>>>> Not, IMHO, we only care about the last state of it, because, we will
>>>>> do nothing
>>>>> even if we track the intermedial states.
>>>
>>> Well, you care at least syn state too. Without a complete state machine,
>>> it's very hard to track even partial state I believe. And you will fail
>>> to track some state transition for sure which makes the code fragile.
>>>
>>
>> Agree, but here things are a little different. There are some extreme
>> cases
>> that we may can't track the complete process of closing connection.
>> For example (I have explained that in the bellow, it seems that you
>> didn't
>> got it ;) ).
>> If VM is running before we want to make it goes into COLO FT state,
>> there maybe some connections exist already, in extreme case, VM is
>> going into
>> COLO state while some connections are in half closing state, we can
>> only track
>> the bellow half closing state in filter-rewriter and colo compare object.
>>
>>>>>
>>>>>>>
>>>>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>>>>> ---
>>>>>>>       net/colo.h            |  4 +++
>>>>>>>       net/filter-rewriter.c | 70
>>>>>>> +++++++++++++++++++++++++++++++++++++++++++++------
>>>>>>>       2 files changed, 67 insertions(+), 7 deletions(-)
>>>>>>>
>>>>>>> diff --git a/net/colo.h b/net/colo.h
>>>>>>> index 7c524f3..cd9027f 100644
>>>>>>> --- a/net/colo.h
>>>>>>> +++ b/net/colo.h
>>>>>>> @@ -18,6 +18,7 @@
>>>>>>>       #include "slirp/slirp.h"
>>>>>>>       #include "qemu/jhash.h"
>>>>>>>       #include "qemu/timer.h"
>>>>>>> +#include "slirp/tcp.h"
>>>>>>>
>>>>>>>       #define HASHTABLE_MAX_SIZE 16384
>>>>>>>
>>>>>>> @@ -69,6 +70,9 @@ typedef struct Connection {
>>>>>>>            * run once in independent tcp connection
>>>>>>>            */
>>>>>>>           int syn_flag;
>>>>>>> +
>>>>>>> +    int tcp_state; /* TCP FSM state */
>>>>>>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>>>>>>       } Connection;
>>>>>>>
>>>>>>>       uint32_t connection_key_hash(const void *opaque);
>>>>>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>>>>>>> index c4ab91c..7e7ec35 100644
>>>>>>> --- a/net/filter-rewriter.c
>>>>>>> +++ b/net/filter-rewriter.c
>>>>>>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>>>>>>>       }
>>>>>>>
>>>>>>>       /* handle tcp packet from primary guest */
>>>>>>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>>>>>>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>>>>>>                                         Connection *conn,
>>>>>>> -                                  Packet *pkt)
>>>>>>> +                                  Packet *pkt, ConnectionKey *key)
>>>>>>>       {
>>>>>>>           struct tcphdr *tcp_pkt;
>>>>>>>
>>>>>>> @@ -97,15 +97,45 @@ static int
>>>>>>> handle_primary_tcp_pkt(NetFilterState *nf,
>>>>>>>               tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) +
>>>>>>> conn->offset);
>>>>>>>
>>>>>>>               net_checksum_calculate((uint8_t *)pkt->data,
>>>>>>> pkt->size);
>>>>>>> +        /*
>>>>>>> +         * Case 1:
>>>>>>> +         * The *server* side of this connect is VM, *client* tries
>>>>>>> to close
>>>>>>> +         * the connection.
>>>>>>> +         *
>>>>>>> +         * We got 'ack=1' packets from client side, it acks
>>>>>>> 'fin=1, ack=1'
>>>>>>> +         * packet from server side. From this point, we can ensure
>>>>>>> that there
>>>>>>> +         * will be no packets in the connection, except that, some
>>>>>>> errors
>>>>>>> +         * happen between the path of 'filter object' and vNIC, if
>>>>>>> this rare
>>>>>>> +         * case really happen, we can still create a new
>>>>>>> connection,
>>>>>>> +         * So it is safe to remove the connection from
>>>>>>> connection_track_table.
>>>>>>> +         *
>>>>>>> +         */
>>>>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>>>>> +            fprintf(stderr, "Remove conn "
>>>>>>
>>>>>> Can this even compile?
>>>>>>
>>>>>
>>>>> Oops, i forgot to remove it, will remove it in next version.
>>>>>
>>>>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +    /*
>>>>>>> +     * Case 2:
>>>>>>> +     * The *server* side of this connect is VM, *server* tries to
>>>>>>> close
>>>>>>> +     * the connection.
>>>>>>> +     *
>>>>>>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>>>>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>>>>> +     */
>>>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>>>> TH_FIN)) {
>>>>>>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>>>>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>>>>           }
>>>>>>>
>>>>>>>           return 0;
>>>>>>>       }
>>>>>>>
>>>>>>>       /* handle tcp packet from secondary guest */
>>>>>>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>>>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>>>>>>                                           Connection *conn,
>>>>>>> -                                    Packet *pkt)
>>>>>>> +                                    Packet *pkt, ConnectionKey
>>>>>>> *key)
>>>>>>>       {
>>>>>>>           struct tcphdr *tcp_pkt;
>>>>>>>
>>>>>>> @@ -133,8 +163,34 @@ static int
>>>>>>> handle_secondary_tcp_pkt(NetFilterState *nf,
>>>>>>>               tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) -
>>>>>>> conn->offset);
>>>>>>>
>>>>>>>               net_checksum_calculate((uint8_t *)pkt->data,
>>>>>>> pkt->size);
>>>>>>> +        /*
>>>>>>> +         * Case 2:
>>>>>>> +         * The *server* side of this connect is VM, *server* tries
>>>>>>> to close
>>>>>>> +         * the connection.
>>>>>>> +         *
>>>>>>> +         * We got 'ack=1' packets from server side, it acks
>>>>>>> 'fin=1, ack=1'
>>>>>>> +         * packet from client side. Like Case 1, there should be
>>>>>>> no packets
>>>>>>> +         * in the connection from now know, But the difference
>>>>>>> here is
>>>>>>> +         * if the packet is lost, We will get the resent
>>>>>>> 'fin=1,ack=1' packet.
>>>>>>> +         * TODO: Fix above case.
>>>>>>> +         */
>>>>>>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>>>>>>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>>>>>>> + g_hash_table_remove(rf->connection_track_table, key);
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +    /*
>>>>>>> +     * Case 1:
>>>>>>> +     * The *server* side of this connect is VM, *client* tries to
>>>>>>> close
>>>>>>> +     * the connection.
>>>>>>> +     *
>>>>>>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>>>>>>> +     * record the seq of 'fin=1, ack=1' packet.
>>>>>>> +     */
>>>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>>>> TH_FIN)) {
>>>>>>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>>>>>>> +        conn->tcp_state = TCPS_LAST_ACK;
>>>>>>
>>>>>> I thought the tcp_state should store the state of TCP from the
>>>>>> view of
>>>>>> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion.
>>>>>> And
>>>>>> the handle of active close needs more states here. E.g if
>>>>>> connection is
>>>>>> in FIN_WAIT_2, the connection is only half closed, remote peer can
>>>>>> still
>>>>>> send packet to us unless we receive a FIN.
>>>>>>
>>>>>
>>>>> Yes, i know what you mean, actually, here, we try to only track the
>>>>> last
>>>>> two steps for closing a connection, that is
>>>>> 'fin=1,ack=1,seq=2,ack=u+1'
>>>>                                                ^
>>>> 'FIN=1,ACK=1,seq=w,ack=u+1'
>>>>
>>>>> and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
>>>>        ^ 'ACK=1,seq=u+1,ack=w+1'                     ^ 'FIN=1,ACK=1'
>>>>
>>>>> ensure that the 'fin=1,seq=u' packet has been posted.
>>>>>
>>>>                     ^ 'FIN=1,seq=u'
>>>
>>> That's just the case I'm saying, the transition above is in fact:
>>>
>>> secondary(ESTABLISHED)
>>> secondary(FIN_WAIT_1): ->  FIN,seq=w,ack=u+1   -> :remote
>>> secondary(FIN_WAIT_2): <-  seq=u+1,ack=w+1     <- :remote
>>>
>>> So we are in fact in FIN_WAIT_2, which means the connection is only half
>>> closed, but your patch will treat this as fully closed connection and
>>> will remove the connection from the hashtable.
>>>
>>
>> Er, here we track the last two states 'FIN=1, ACK=1' and  'ACK=1' (
>> which asks
>> the 'FIN=1,ACK=1' packet, We will remove the connection while got the
>> 'ACK=1'
>> packet, so  is it enough ?
>
> But the connection is not closed in fact, no? It's legal for remote to
> continue sending tons of packet to us even after this.
>

Er, I'm a little confused, Here, for server side,
i think after got the 'ACK=1,seq=u+1,ack=w+1', it is closed,
so i remove it from hash table, wrong ?

Client:                                    Server:

ESTABLISHED|                               |
            | -> FIN=1,seq=u   ->           |
FIN_WAIT_1 |                               |
            | <- ACK=1,seq=v,ack=u+1 <-     |
FINA_WAIT_2|                               |CLOSE_WAIT
            | <- FIN=1,ACK=1,seq=w,ack=u+1<-|
            |                               |LAST+ACK
            |   -> ACK=1,seq=u+1,ack=w+1    |
TIME_WAIT  |                               |CLOSED
CLOSED     |                               |

>>
>>> What's more I don't think we can decide passive or active close by:
>>>
>>>
>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>>>
>>> Since both cases will send FIN,ACK for sure.
>>>
>>
>> I didn't quite understand, here we have tracked the closing request no
>> matter
>> it is from the server side (passive close ?) or client side ( active
>> close ?).
>> You can refer to the comment in codes, 'Case 1' and 'Case 2' comments.
>
> I think you need differ them since passive close is much simpler, and it
> seems that your code may work only in this case.
>
>>
>> Here, it seems that we can't track one case which both sides send the
>> closing
>> requests at the same time, in that case, there are only 'FIN=1' and
>> 'ACK=1'
>> packets.
>>
>
> Yes, RFC allows this.
>

Hmm, I'd like to remove this patch from this series,
And send it as a single patch after all the above questions
been solved. How about the other patches ?

Thanks,
Hailiang


> Thanks
>
>>
>> Thanks.
>> Hailiang
>>
>>>>
>>>>> Another reason is we may can't track the 'fin=1,seq=u' packet while
>>>>> we start COLO while one connection is closing, which the
>>>>> 'fin=1,seq=u' packet
>>>>> has been posted.
>>>>>
>>>>> Actually, here, if we start COLO while one connection is closing,
>>>>> which the
>>>>> 'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In
>>>>> this
>>>>
>>>>    ^ 'FIN=1,ACK=1'
>>>>
>>>> Sorry for the typo. :)
>>>>
>>>>> case, the connection will be left in hash table for ever though it is
>>>>> harmless.
>>>>> Any ideas for this case ?
>>>
>>> Sorry I don't follow the question.
>>>
>>>>>
>>>>> For the above codes question, i'd like to change tcp_state to
>>>>> tap_closing_wait,
>>>>> is it OK ?
>>>
>>> You mean "tcp_closing_wait". I think we need first figure out if we can
>>> track the state correctly first.
>>>
>>> Thanks
>>>
>>>>>
>>>>> Thanks.
>>>>> Hailiang
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>>           }
>>>>>>> -
>>>>>>>           return 0;
>>>>>>>       }
>>>>>>>
>>>>>>> @@ -178,7 +234,7 @@ static ssize_t
>>>>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>>>>
>>>>>>>               if (sender == nf->netdev) {
>>>>>>>                   /* NET_FILTER_DIRECTION_TX */
>>>>>>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>>>>>>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>>>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>>>>                       (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>>>>                       packet_destroy(pkt, NULL);
>>>>>>> @@ -191,7 +247,7 @@ static ssize_t
>>>>>>> colo_rewriter_receive_iov(NetFilterState *nf,
>>>>>>>                   }
>>>>>>>               } else {
>>>>>>>                   /* NET_FILTER_DIRECTION_RX */
>>>>>>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>>>>>>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>>>>>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>>>>>>>                       (const uint8_t *)pkt->data, pkt->size, NULL);
>>>>>>>                       packet_destroy(pkt, NULL);
>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>
>>>>
>>>
>>>
>>> .
>>>
>>
>
>
> .
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27  4:09               ` Hailiang Zhang
@ 2017-02-27  5:35                 ` Jason Wang
  2017-02-27  6:53                   ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2017-02-27  5:35 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst, lizhijian
  Cc: xuquan8, qemu-devel, pss.wulizhen



On 2017年02月27日 12:09, Hailiang Zhang wrote:
> On 2017/2/27 11:40, Jason Wang wrote:
>>
>>
>> On 2017年02月27日 11:11, Hailiang Zhang wrote:
>>> On 2017/2/23 12:16, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年02月22日 16:51, Hailiang Zhang wrote:
>>>>> On 2017/2/22 16:45, Hailiang Zhang wrote:
>>>>>> On 2017/2/22 16:07, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>>>>>>> After a net connection is closed, we didn't clear its releated
>>>>>>>> resources
>>>>>>>> in connection_track_table, which will lead to memory leak.
>>>>>>>
>>>>>>> Not a real leak but would lead reset of hash table if too many 
>>>>>>> closed
>>>>>>> connections.
>>>>>>>
>>>>>>
>>>>>> Yes, you are right, there will be lots of stale connection data in
>>>>>> hash table
>>>>>> if we don't remove it while it is been closed. Which
>>>>
>>>>
>>>> Ok, so let's come up with a better title of the patch.
>>>>
>>>
>>> OK.
>>>
>>>>>>
>>>>>>>>
>>>>>>>> Let't track the state of net connection, if it is closed, its
>>>>>>>> related
>>>>>>>> resources will be cleared up.
>>>>>>>
>>>>>>> The issue is the state were tracked partially, do we need a full
>>>>>>> state
>>>>>>> machine here?
>>>>>>>
>>>>>>
>>>>>> Not, IMHO, we only care about the last state of it, because, we will
>>>>>> do nothing
>>>>>> even if we track the intermedial states.
>>>>
>>>> Well, you care at least syn state too. Without a complete state 
>>>> machine,
>>>> it's very hard to track even partial state I believe. And you will 
>>>> fail
>>>> to track some state transition for sure which makes the code fragile.
>>>>
>>>
>>> Agree, but here things are a little different. There are some extreme
>>> cases
>>> that we may can't track the complete process of closing connection.
>>> For example (I have explained that in the bellow, it seems that you
>>> didn't
>>> got it ;) ).
>>> If VM is running before we want to make it goes into COLO FT state,
>>> there maybe some connections exist already, in extreme case, VM is
>>> going into
>>> COLO state while some connections are in half closing state, we can
>>> only track
>>> the bellow half closing state in filter-rewriter and colo compare 
>>> object.
>>>
>>>

[...]

>>> Er, here we track the last two states 'FIN=1, ACK=1' and  'ACK=1' (
>>> which asks
>>> the 'FIN=1,ACK=1' packet, We will remove the connection while got the
>>> 'ACK=1'
>>> packet, so  is it enough ?
>>
>> But the connection is not closed in fact, no? It's legal for remote to
>> continue sending tons of packet to us even after this.
>>
>
> Er, I'm a little confused, Here, for server side,
> i think after got the 'ACK=1,seq=u+1,ack=w+1', it is closed,
> so i remove it from hash table, wrong ?
>
> Client:                                    Server:
>
> ESTABLISHED|                               |
>            | -> FIN=1,seq=u   ->           |

This is case A and ACK should be set in this segment too.

> FIN_WAIT_1 |                               |
>            | <- ACK=1,seq=v,ack=u+1 <-     |
> FINA_WAIT_2|                               |CLOSE_WAIT
>            | <- FIN=1,ACK=1,seq=w,ack=u+1<-|
>            |                               |LAST+ACK

This is case B.

> |   -> ACK=1,seq=u+1,ack=w+1    |
> TIME_WAIT  |                               |CLOSED
> CLOSED     |                               |
>

I think the issue is that your code can not differ A from B.

Thanks

>>>
>>>> What's more I don't think we can decide passive or active close by:
>>>>
>>>>
>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | 
>>>> TH_FIN)) {
>>>>
>>>> Since both cases will send FIN,ACK for sure.
>>>>
>>>
>>> I didn't quite understand, here we have tracked the closing request no
>>> matter
>>> it is from the server side (passive close ?) or client side ( active
>>> close ?).
>>> You can refer to the comment in codes, 'Case 1' and 'Case 2' comments.
>>
>> I think you need differ them since passive close is much simpler, and it
>> seems that your code may work only in this case.
>>
>>>
>>> Here, it seems that we can't track one case which both sides send the
>>> closing
>>> requests at the same time, in that case, there are only 'FIN=1' and
>>> 'ACK=1'
>>> packets.
>>>
>>
>> Yes, RFC allows this.
>>
>
> Hmm, I'd like to remove this patch from this series,
> And send it as a single patch after all the above questions
> been solved. How about the other patches ?
>

Looks good except for the compiling issue of patch 3.

Thanks

> Thanks,
> Hailiang
>
>
>> Thanks
>>
>>>
>>> Thanks.
>>> Hailiang
>>>

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27  5:35                 ` Jason Wang
@ 2017-02-27  6:53                   ` Hailiang Zhang
  2017-02-27  9:05                     ` Jason Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-27  6:53 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/27 13:35, Jason Wang wrote:
>
>
> On 2017年02月27日 12:09, Hailiang Zhang wrote:
>> On 2017/2/27 11:40, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月27日 11:11, Hailiang Zhang wrote:
>>>> On 2017/2/23 12:16, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017年02月22日 16:51, Hailiang Zhang wrote:
>>>>>> On 2017/2/22 16:45, Hailiang Zhang wrote:
>>>>>>> On 2017/2/22 16:07, Jason Wang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2017年02月22日 11:46, zhanghailiang wrote:
>>>>>>>>> After a net connection is closed, we didn't clear its releated
>>>>>>>>> resources
>>>>>>>>> in connection_track_table, which will lead to memory leak.
>>>>>>>>
>>>>>>>> Not a real leak but would lead reset of hash table if too many
>>>>>>>> closed
>>>>>>>> connections.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, you are right, there will be lots of stale connection data in
>>>>>>> hash table
>>>>>>> if we don't remove it while it is been closed. Which
>>>>>
>>>>>
>>>>> Ok, so let's come up with a better title of the patch.
>>>>>
>>>>
>>>> OK.
>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Let't track the state of net connection, if it is closed, its
>>>>>>>>> related
>>>>>>>>> resources will be cleared up.
>>>>>>>>
>>>>>>>> The issue is the state were tracked partially, do we need a full
>>>>>>>> state
>>>>>>>> machine here?
>>>>>>>>
>>>>>>>
>>>>>>> Not, IMHO, we only care about the last state of it, because, we will
>>>>>>> do nothing
>>>>>>> even if we track the intermedial states.
>>>>>
>>>>> Well, you care at least syn state too. Without a complete state
>>>>> machine,
>>>>> it's very hard to track even partial state I believe. And you will
>>>>> fail
>>>>> to track some state transition for sure which makes the code fragile.
>>>>>
>>>>
>>>> Agree, but here things are a little different. There are some extreme
>>>> cases
>>>> that we may can't track the complete process of closing connection.
>>>> For example (I have explained that in the bellow, it seems that you
>>>> didn't
>>>> got it ;) ).
>>>> If VM is running before we want to make it goes into COLO FT state,
>>>> there maybe some connections exist already, in extreme case, VM is
>>>> going into
>>>> COLO state while some connections are in half closing state, we can
>>>> only track
>>>> the bellow half closing state in filter-rewriter and colo compare
>>>> object.
>>>>
>>>>
>
> [...]
>
>>>> Er, here we track the last two states 'FIN=1, ACK=1' and  'ACK=1' (
>>>> which asks
>>>> the 'FIN=1,ACK=1' packet, We will remove the connection while got the
>>>> 'ACK=1'
>>>> packet, so  is it enough ?
>>>
>>> But the connection is not closed in fact, no? It's legal for remote to
>>> continue sending tons of packet to us even after this.
>>>
>>
>> Er, I'm a little confused, Here, for server side,
>> i think after got the 'ACK=1,seq=u+1,ack=w+1', it is closed,
>> so i remove it from hash table, wrong ?
>>
>> Client:                                    Server:
>>
>> ESTABLISHED|                               |
>>             | -> FIN=1,seq=u   ->           |
>
> This is case A and ACK should be set in this segment too.
>
>> FIN_WAIT_1 |                               |
>>             | <- ACK=1,seq=v,ack=u+1 <-     |
>> FINA_WAIT_2|                               |CLOSE_WAIT
>>             | <- FIN=1,ACK=1,seq=w,ack=u+1<-|
>>             |                               |LAST+ACK
>
> This is case B.
>
>> |   -> ACK=1,seq=u+1,ack=w+1    |
>> TIME_WAIT  |                               |CLOSED
>> CLOSED     |                               |
>>
>
> I think the issue is that your code can not differ A from B.
>

We have a parameter 'fin_ack_seq' recording the sequence of
'FIN=1,ACK=1,seq=w,ack=u+1' and if the ack value from the opposite
side is is 'w+1', we can consider this connection is closed, no ?

> Thanks
>
>>>>
>>>>> What's more I don't think we can decide passive or active close by:
>>>>>
>>>>>
>>>>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK |
>>>>> TH_FIN)) {
>>>>>
>>>>> Since both cases will send FIN,ACK for sure.
>>>>>
>>>>
>>>> I didn't quite understand, here we have tracked the closing request no
>>>> matter
>>>> it is from the server side (passive close ?) or client side ( active
>>>> close ?).
>>>> You can refer to the comment in codes, 'Case 1' and 'Case 2' comments.
>>>
>>> I think you need differ them since passive close is much simpler, and it
>>> seems that your code may work only in this case.
>>>
>>>>
>>>> Here, it seems that we can't track one case which both sides send the
>>>> closing
>>>> requests at the same time, in that case, there are only 'FIN=1' and
>>>> 'ACK=1'
>>>> packets.
>>>>
>>>
>>> Yes, RFC allows this.
>>>
>>
>> Hmm, I'd like to remove this patch from this series,
>> And send it as a single patch after all the above questions
>> been solved. How about the other patches ?
>>
>
> Looks good except for the compiling issue of patch 3.
>

OK, i will fix it in version 3.

Thanks

> Thanks
>
>> Thanks,
>> Hailiang
>>
>>
>>> Thanks
>>>
>>>>
>>>> Thanks.
>>>> Hailiang
>>>>
>
> [...]
>
> .
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27  6:53                   ` Hailiang Zhang
@ 2017-02-27  9:05                     ` Jason Wang
  2017-02-27 10:29                       ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Wang @ 2017-02-27  9:05 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst, lizhijian
  Cc: xuquan8, qemu-devel, pss.wulizhen



On 2017年02月27日 14:53, Hailiang Zhang wrote:
>> I think the issue is that your code can not differ A from B.
>>
>
> We have a parameter 'fin_ack_seq' recording the sequence of
> 'FIN=1,ACK=1,seq=w,ack=u+1' and if the ack value from the opposite
> side is is 'w+1', we can consider this connection is closed, no ? 

Let's see what happens, consider VM is doing active close (reuse the 
figure above):

(VM)
Client:                                    Server:

ESTABLISHED|                               |
            | -> FIN=1,seq=u   ->           |

handle_secondary():
fin_ack_seq = u
tcp_state = TCPS_LAST_ACK

FIN_WAIT_1 |                               |
            | <- ACK=1,seq=v,ack=u+1 <-     |

handle_primary():
fin_ack_seq = ack + 1
g_hash_table_remove()

But we probably want it to be removed in TIME_WAIT_CLOSED.

Thanks

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27  9:05                     ` Jason Wang
@ 2017-02-27 10:29                       ` Hailiang Zhang
  2017-02-28  3:14                         ` Jason Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Hailiang Zhang @ 2017-02-27 10:29 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst, lizhijian; +Cc: xuquan8, qemu-devel, pss.wulizhen

On 2017/2/27 17:05, Jason Wang wrote:
>
>
> On 2017年02月27日 14:53, Hailiang Zhang wrote:
>>> I think the issue is that your code can not differ A from B.
>>>
>>
>> We have a parameter 'fin_ack_seq' recording the sequence of
>> 'FIN=1,ACK=1,seq=w,ack=u+1' and if the ack value from the opposite
>> side is is 'w+1', we can consider this connection is closed, no ?
>

Hi Jason,

Thanks very much for your patience.

> Let's see what happens, consider VM is doing active close (reuse the
> figure above):
>

(We didn't support tracking the connection start
by the VM in current rewriter codes.
I mean the Client side is VM).

Your figure is not quite correct, the process should be:
                                           (VM)
Client:                                    Server:

ESTABLISHED|                               |
            | -> FIN=1,seq=u   ->           |
FIN_WAIT_1 |                               |
            | <- ACK=1,seq=v,ack=u+1 <-     |
FINA_WAIT_2|                               |CLOSE_WAIT
            | <- FIN=1,ACK=1,seq=w,ack=u+1<-|
handle_secondary():
fin_ack_seq = w
tcp_state = TCPS_LAST_ACK

            |                               |LAST+ACK
            |   -> ACK=1,seq=u+1,ack=w+1    |
TIME_WAIT  |                               |CLOSED
CLOSED     |                               |

handle_primary():
if (ack = fin_ack_seq + 1)
    g_hash_table_remove()

> (VM)
> Client:                                    Server:
>
> ESTABLISHED|                               |
>              | -> FIN=1,seq=u   ->           |
>
> handle_secondary():
> fin_ack_seq = u
> tcp_state = TCPS_LAST_ACK
>
> FIN_WAIT_1 |                               |
>              | <- ACK=1,seq=v,ack=u+1 <-     |
>
> handle_primary():
> fin_ack_seq = ack + 1
> g_hash_table_remove()
>
> But we probably want it to be removed in TIME_WAIT_CLOSED.
>

Yes, we should removed it after 2MSL, because the last
the sever side may not get the 'ACK=1,seq=v,ack=u+1' packet,
and it will resend the 'FIN=1,ACK=1,seq=w,ack=u+1'.

Thanks.


> Thanks
>
> .
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
  2017-02-27 10:29                       ` Hailiang Zhang
@ 2017-02-28  3:14                         ` Jason Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2017-02-28  3:14 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst, lizhijian
  Cc: xuquan8, qemu-devel, pss.wulizhen



On 2017年02月27日 18:29, Hailiang Zhang wrote:
> On 2017/2/27 17:05, Jason Wang wrote:
>>
>>
>> On 2017年02月27日 14:53, Hailiang Zhang wrote:
>>>> I think the issue is that your code can not differ A from B.
>>>>
>>>
>>> We have a parameter 'fin_ack_seq' recording the sequence of
>>> 'FIN=1,ACK=1,seq=w,ack=u+1' and if the ack value from the opposite
>>> side is is 'w+1', we can consider this connection is closed, no ?
>>
>
> Hi Jason,
>
> Thanks very much for your patience.
>
>> Let's see what happens, consider VM is doing active close (reuse the
>> figure above):
>>
>
> (We didn't support tracking the connection start
> by the VM in current rewriter codes.
> I mean the Client side is VM).

So the questions are still there:

1) Can this patch knows whether or not the connection is started by VM? 
If yes, any specific reason to do this?
2) Even if it only support VM as server, server can still do active 
close for many reasons. What if VM send FIN first?

>
> Your figure is not quite correct, the process should be:
>                                           (VM)
> Client:                                    Server:
>
> ESTABLISHED|                               |
>            | -> FIN=1,seq=u   ->           |
> FIN_WAIT_1 |                               |
>            | <- ACK=1,seq=v,ack=u+1 <-     |
> FINA_WAIT_2|                               |CLOSE_WAIT
>            | <- FIN=1,ACK=1,seq=w,ack=u+1<-|
> handle_secondary():
> fin_ack_seq = w
> tcp_state = TCPS_LAST_ACK
>
>            |                               |LAST+ACK
>            |   -> ACK=1,seq=u+1,ack=w+1    |
> TIME_WAIT  |                               |CLOSED
> CLOSED     |                               |
>
> handle_primary():
> if (ack = fin_ack_seq + 1)
>    g_hash_table_remove()

Yes, that's what I said. The code looks correct for passive close.

Thanks

>
>> (VM)
>> Client:                                    Server:
>>
>> ESTABLISHED|                               |
>>              | -> FIN=1,seq=u   ->           |
>>
>> handle_secondary():
>> fin_ack_seq = u
>> tcp_state = TCPS_LAST_ACK
>>
>> FIN_WAIT_1 |                               |
>>              | <- ACK=1,seq=v,ack=u+1 <-     |
>>
>> handle_primary():
>> fin_ack_seq = ack + 1
>> g_hash_table_remove()
>>
>> But we probably want it to be removed in TIME_WAIT_CLOSED.
>>
>
> Yes, we should removed it after 2MSL, because the last
> the sever side may not get the 'ACK=1,seq=v,ack=u+1' packet,
> and it will resend the 'FIN=1,ACK=1,seq=w,ack=u+1'.
>
> Thanks.
>
>
>> Thanks
>>
>> .
>>
>
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-02-28  3:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-22  3:46 [Qemu-devel] [PATCH v2 0/3] filter-rewriter: fix two bugs and one optimization zhanghailiang
2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error zhanghailiang
2017-02-22  8:39   ` Zhang Chen
2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table zhanghailiang
2017-02-22  8:07   ` Jason Wang
2017-02-22  8:45     ` Hailiang Zhang
2017-02-22  8:51       ` Hailiang Zhang
2017-02-23  4:16         ` Jason Wang
2017-02-27  3:11           ` Hailiang Zhang
2017-02-27  3:40             ` Jason Wang
2017-02-27  4:09               ` Hailiang Zhang
2017-02-27  5:35                 ` Jason Wang
2017-02-27  6:53                   ` Hailiang Zhang
2017-02-27  9:05                     ` Jason Wang
2017-02-27 10:29                       ` Hailiang Zhang
2017-02-28  3:14                         ` Jason Wang
2017-02-22  3:46 ` [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0 zhanghailiang
2017-02-24  8:08   ` Zhang Chen
2017-02-24  8:23     ` Zhang Chen
2017-02-27  1:36     ` Hailiang Zhang
2017-02-27  3:44       ` Zhang Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.