Re: [PATCH mptcp-next v8 8/8] selftests: mptcp: add mp_fail testcases

From: Geliang Tang <geliang.tang@suse.com>
To: Matthieu Baerts <matthieu.baerts@tessares.net>
Cc: Davide Caratti <dcaratti@redhat.com>, mptcp@lists.linux.dev
Subject: Re: [PATCH mptcp-next v8 8/8] selftests: mptcp: add mp_fail testcases
Date: Thu, 4 Nov 2021 18:30:03 +0800	[thread overview]
Message-ID: <20211104103003.GA5555@dhcp-10-157-36-190> (raw)
In-Reply-To: <f4467f1a-ddae-92b6-48f5-30ab03dfc77f@tessares.net>

Hi Matt, Davide,

On Thu, Nov 04, 2021 at 10:14:35AM +0100, Matthieu Baerts wrote:
> Hi Geliang,
> 
> On 04/11/2021 01:43, Mat Martineau wrote:
> > On Fri, 29 Oct 2021, Geliang Tang wrote:
> > 
> >> Added the test cases for MP_FAIL, use 'tc' command to trigger the
> >> checksum failure.
> >>
> >> Suggested-by: Davide Caratti <dcaratti@redhat.com>
> >> Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> >> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> >> ---
> >> tools/testing/selftests/net/mptcp/config      |  5 ++
> >> .../testing/selftests/net/mptcp/mptcp_join.sh | 75 +++++++++++++++++--
> >> 2 files changed, 72 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/tools/testing/selftests/net/mptcp/config
> >> b/tools/testing/selftests/net/mptcp/config
> >> index 0faaccd21447..f522288b2204 100644
> >> --- a/tools/testing/selftests/net/mptcp/config
> >> +++ b/tools/testing/selftests/net/mptcp/config
> >> @@ -15,3 +15,8 @@ CONFIG_NETFILTER_XTABLES=m
> >> CONFIG_NETFILTER_XT_MATCH_BPF=m
> >> CONFIG_NF_TABLES_IPV4=y
> >> CONFIG_NF_TABLES_IPV6=y
> >> +CONFIG_NET_ACT_CSUM=m
> >> +CONFIG_NET_ACT_PEDIT=m
> >> +CONFIG_NET_CLS_ACT=m
> >> +CONFIG_NET_CLS_FLOWER=m
> >> +CONFIG_NET_SCH_INGRESS=m
> >> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> >> b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> >> index 2684ef9c0d42..d33cb5ce0ff3 100755
> >> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> >> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> >> @@ -178,6 +178,12 @@ if [ $? -ne 0 ];then
> >>     exit $ksft_skip
> >> fi
> >>
> >> +jq -V > /dev/null 2>&1
> >> +if [ $? -ne 0 ];then
> >> +    echo "SKIP: Could not run all tests without jq tool"
> >> +    exit $ksft_skip
> >> +fi
> >> +
> >> print_file_err()
> >> {
> >>     ls -l "$1" 1>&2
> >> @@ -232,6 +238,28 @@ link_failure()
> >>     done
> >> }
> >>
> >> +checksum_failure()
> >> +{
> >> +    i="$1"
> >> +
> >> +    tc -n $ns2 qdisc add dev ns2eth$i clsact
> >> +    tc -n $ns2 filter add dev ns2eth$i egress \
> >> +        protocol ip prio 1000 \
> >> +        flower ip_proto tcp \
> >> +        action pedit munge offset 148 u32 invert \
> >> +        pipe csum tcp \
> >> +        index 100
> > 
> > Could you add a comment explaining what the filter is doing? While it
> > looks like the idea is to invert the 32-bit word at byte offset 148, I'm
> > not tc-literate enough to know if there are other effects (and I didn't
> > watch the email threads closely on this).

Matt, could you please add this comment for me? I didn't understand
these tc commands very well, and I can't explain them very clearly in
English.

> 
> Yes that's correct. It takes only TCP packets on the egress side that
> are big enough but print a message in dmesg if not. After the
> modification, the TCP checksum is recomputed so the packet is valid.
> 
> > 
> >> +
> >> +    while true; do
> >> +        local pkt=$(tc -n $ns2 -j -s action show action csum index 100 |
> >> +                jq '.[1].actions[0].stats.packets')

Should we show the action 'pedit' here, instead of 'csum'?

The action 'csum' output is like this:

# tc -n $ns2 -j -s action show action csum index 100

[{"total acts":1},{"actions":[{"order":0,"kind":"csum","csum":"tcp","control_action":{"type":"pass"},"index":100,"ref":1,"bind":1,"installed":29,"last_used":2,"stats":{"bytes":11576,"packets":42,"drops":0,"overlimits":0,"requeues":0,"backlog":0,"qlen":0}}]}]

"overlimits" is always 0 in 'csum'.

But the action 'pedit' output is like this:

# tc -n $ns2 -j -s action show action pedit index 100

[{"total acts":1},{"actions":[{"order":0 pedit ,"control_action":{"type":"pipe"}keys 1
 	 index 1 ref 1 bind 1,"installed":29,"last_used":2
	 key #0  at 148: val ffffffff mask ffffffff
 ,"stats":{"bytes":11576,"packets":42,"drops":0,"overlimits":36,"requeues":0,"backlog":0,"qlen":0}}]}]

"overlimits" values are all right here, but this output can't be parsed by
'jq'. It looks like something wrong to show 'pedit'.

Is this a bug in act_pedit, or we use pedit incorrectly? How can I fix this?

> >> +        if [ $pkt -gt 0 ]; then
> >> +            tc -n $ns2 qdisc del dev ns2eth$i clsact
> >> +            break
> >> +        fi
> 
> On my side, I had to add a "sleep .1" to slow things down and have the
> tests a bit more stable.
> 
> I also used "flower ip_proto tcp tcp_flags 0x10/0xff" and it seems
> working fine.

Could you please share your checksum_failure function here, I'll test it
on my side?

> 
> But still and probably caused by the "sleep .1" that seems needed, I
> have many messages in dmesg:
> 
>     tc action pedit offset 162 out of bounds
> 
> Because everything is shown on the serial, it is a bit annoying.
> 
> Do you think it would be possible to limit to TCP ACK and also make sure
> data are only sent in one direction or limit incoming traffic to 1 byte
> if it is easier? With this, we should then have a very few pure ACKs
> packets on the egress side and avoid most of the TC warning messages
> printed by the kernel, no?
> 
> >> +    done
> >> +}
> >> +
> >> # $1: IP address
> >> is_v6()
> >> {
> >> @@ -371,6 +399,9 @@ do_transfer()
> >>     if [[ "${addr_nr_ns2}" = "fullmesh_"* ]]; then
> >>         flags="${flags},fullmesh"
> >>         addr_nr_ns2=${addr_nr_ns2:9}
> >> +    elif [[ "${addr_nr_ns2}" = "fail_"* ]]; then
> >> +        checksum_failure ${addr_nr_ns2:5}
> >> +        addr_nr_ns2=0
> >>     fi
> >>
> >>     if [ $addr_nr_ns2 -gt 0 ]; then
> >> @@ -542,6 +573,8 @@ run_tests()
> >> chk_csum_nr()
> >> {
> >>     local msg=${1:-""}
> >> +    local csum_ns1=${2:-0}
> >> +    local csum_ns2=${3:-0}
> >>     local count
> >>     local dump_stats
> >>
> >> @@ -553,8 +586,8 @@ chk_csum_nr()
> >>     printf " %-36s %s" "$msg" "sum"
> >>     count=`ip netns exec $ns1 nstat -as | grep MPTcpExtDataCsumErr |
> >> awk '{print $2}'`
> >>     [ -z "$count" ] && count=0
> >> -    if [ "$count" != 0 ]; then
> >> -        echo "[fail] got $count data checksum error[s] expected 0"
> >> +    if [ "$count" != $csum_ns1 ]; then
> >> +        echo "[fail] got $count data checksum error[s] expected
> >> $csum_ns1"
> >>         ret=1
> >>         dump_stats=1
> >>     else
> >> @@ -563,8 +596,8 @@ chk_csum_nr()
> >>     echo -n " - csum  "
> >>     count=`ip netns exec $ns2 nstat -as | grep MPTcpExtDataCsumErr |
> >> awk '{print $2}'`
> >>     [ -z "$count" ] && count=0
> >> -    if [ "$count" != 0 ]; then
> >> -        echo "[fail] got $count data checksum error[s] expected 0"
> >> +    if [ "$count" != $csum_ns2 ]; then
> >> +        echo "[fail] got $count data checksum error[s] expected
> >> $csum_ns2"
> >>         ret=1
> >>         dump_stats=1
> >>     else
> >> @@ -658,6 +691,8 @@ chk_join_nr()
> >>     local syn_nr=$2
> >>     local syn_ack_nr=$3
> >>     local ack_nr=$4
> >> +    local fail_nr=${5:-0}
> >> +    local infi_nr=${6:-0}
> >>     local count
> >>     local dump_stats
> >>
> >> @@ -700,9 +735,9 @@ chk_join_nr()
> >>         ip netns exec $ns2 nstat -as | grep MPTcp
> >>     fi
> >>     if [ $checksum -eq 1 ]; then
> >> -        chk_csum_nr
> >> -        chk_fail_nr 0 0
> >> -        chk_infi_nr 0 0
> >> +        chk_csum_nr "" $fail_nr
> >> +        chk_fail_nr $fail_nr $fail_nr
> >> +        chk_infi_nr $infi_nr $infi_nr
> >>     fi
> >> }
> >>
> >> @@ -1837,6 +1872,25 @@ fullmesh_tests()
> >>     chk_add_nr 1 1
> >> }
> >>
> >> +fail_tests()
> >> +{
> > 
> > Since MP_FAIL is only relevent with checksums enabled, I think it would
> > be good to set $checksum=1 here and restore its previous value before
> > returning (in case other tests run after this).
> > 
> >> +    # 1 subflow
> >> +    reset
> >> +    ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> >> +    ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> >> +    run_tests $ns1 $ns2 10.0.1.1 2 0 "fail_1" fast

Here I passed 2 as the 4th argument of run_tests like the link_failure
tests to create the bigger data files. I don't know why it dosen't work
when passing 0 as the argument?

> >> +    chk_join_nr "MP_FAIL test, 1 subflow" 0 0 0 1 1
> >> +
> >> +    # multiple subflows
> >> +    reset
> >> +    ip netns exec $ns1 ./pm_nl_ctl limits 0 2
> >> +    ip netns exec $ns2 ./pm_nl_ctl limits 0 2
> >> +    ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 dev ns2eth3 flags
> >> subflow
> >> +    ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 dev ns2eth2 flags
> >> subflow
> >> +    run_tests $ns1 $ns2 10.0.1.1 2 0 "fail_2" fast
> >> +    chk_join_nr "MP_FAIL test, multiple subflows" 2 2 2 1
> >> +}
> >> +
> >> all_tests()
> >> {
> >>     subflows_tests
> >> @@ -1853,6 +1907,7 @@ all_tests()
> >>     checksum_tests
> >>     deny_join_id0_tests
> >>     fullmesh_tests
> >> +    fail_tests
> >> }
> >>
> >> usage()
> >> @@ -1872,6 +1927,7 @@ usage()
> >>     echo "  -S checksum_tests"
> >>     echo "  -d deny_join_id0_tests"
> >>     echo "  -m fullmesh_tests"
> >> +    echo "  -F fail_tests"
> >>     echo "  -c capture pcap files"
> >>     echo "  -C enable data checksum"
> >>     echo "  -h help"
> >> @@ -1907,7 +1963,7 @@ if [ $do_all_tests -eq 1 ]; then
> >>     exit $ret
> >> fi
> >>
> >> -while getopts 'fsltra64bpkdmchCS' opt; do
> >> +while getopts 'fsltra64bpkdmchCSF' opt; do
> >>     case $opt in
> >>         f)
> >>             subflows_tests
> >> @@ -1951,6 +2007,9 @@ while getopts 'fsltra64bpkdmchCS' opt; do
> >>         m)
> >>             fullmesh_tests
> >>             ;;
> >> +        F)
> >> +            fail_tests
> >> +            ;;
> >>         c)
> >>             ;;
> >>         C)
> >> -- 
> >> 2.26.2
> > 
> > When I run:
> > 
> > $ sudo ./mptcp_join.sh -FC
> > 
> > I get:
> > 
> > """
> > Created /tmp/tmp.UTIlkuhfRW (size 1 KB) containing data sent by client
> > Created /tmp/tmp.pMPhJddNfQ (size 1 KB) containing data sent by server
> > Created /tmp/tmp.OEx5cmVqbp (size 10075 KB) containing data sent by client
> > Created /tmp/tmp.RAtBQXOXtR (size 6144 KB) containing data sent by server
> > [ FAIL ] file received by server does not match (in, out):
> > -rw-------. 1 root root 20633656 Nov  3 17:32 /tmp/tmp.HtyoJjpCOe
> > Trailing bytes are:
> > MPTCP_TEST_FILE_END_MARKER
> > -rw-------. 1 root root 20633656 Nov  3 17:32 /tmp/tmp.j8HYN1PUVh
> > Trailing bytes are:
> > MPTCP_TEST_FILE_END_MARKER
> > 01 MP_FAIL test, 1 subflow              syn[ ok ] - synack[ ok ] - ack[
> > ok ]
> >                                         sum[ ok ] - csum  [ ok ]
> >                                         ftx[ ok ] - frx   [ ok ]
> >                                         itx[ ok ] - irx   [ ok ]
> > [ FAIL ] file received by server does not match (in, out):
> > -rw-------. 1 root root 20633656 Nov  3 17:32 /tmp/tmp.HtyoJjpCOe
> > Trailing bytes are:
> > MPTCP_TEST_FILE_END_MARKER
> > -rw-------. 1 root root 20633656 Nov  3 17:32 /tmp/tmp.j8HYN1PUVh
> > Trailing bytes are:
> > MPTCP_TEST_FILE_END_MARKER
> > 02 MP_FAIL test, multiple subflows      syn[ ok ] - synack[ ok ] - ack[
> > ok ]
> >                                         sum[ ok ] - csum  [ ok ]
> >                                         ftx[ ok ] - frx   [ ok ]
> >                                         itx[ ok ] - irx   [ ok ]
> > """
> > 
> > The chk_join_nr output shows all [ ok ], which is good. But the file
> > mismatches (I assume due to the inverted data) are failures because the
> > checksums are intended to reject the bad data. Am I understanding the
> > file mismatches correctly?
> > 
> > If MP_FAIL is working, the files should match.
> 
> I also have the same error on my side but not all the time.

I got these file mismatches errors all the time. Could you please share
your test script and the logs when you get the right output?

The first test "01 MP_FAIL test, 1 subflow" will get the file mismatches
all the time, since the checksum error data can't be retransmitted in
the current codes.

But the second test "02 MP_FAIL test, multiple subflows" shouldn't get
the file mismatches. RST is sent on the checksum error subflow, so this
subflow is closed. The data will be retransmitted on the other subflow.
It works by using the old test case "[PATCH mptcp-next v7 8/8]
DO-NOT-MERGE: mptcp: mp_fail test". Data will be retransmitted, and no
the file mismatches. I don't know why it dosen't work by this new test
case. In this test case, I saw the error data has been dropped, and the
new data has been retransmitted in the log. I'll try to fix this.

Thanks,
-Geliang

> 
> Cheers,
> Matt
> -- 
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net
>