* load balancing between two chains
From: sbezverk @ 2020-01-20  2:46 UTC
  To: Phil Sutter; +Cc: netfilter-devel

Hello Phil,

While doing some performance tests (btw, the results are awesome so far) I came across an issue. It is a kubernetes environment with a cluster-scope service backed by 2 pods. The rule for this service programs load balancing between 2 chains, one per backend pod. When I curl the service, only 1 backend pod replies; the second times out. If I delete the pod which was working, then the second pod starts replying to curl requests. Here are some logs and packet captures. I'd appreciate it if you could take a look and share your thoughts.

Thank you
Serguei

!
! Service chain; TCP port 808 is the exposed port
!
        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
                numgen random mod 2 vmap { 0 : jump k8s-nfproxy-sep-RGU2UYFOJNW24NA5, 1 : jump k8s-nfproxy-sep-I7XZOUOVPIQW4IXA } comment ""
        }
!
!  backend pod 1 listens on port 8080
!
        chain k8s-nfproxy-sep-RGU2UYFOJNW24NA5 {
                ip saddr 57.112.0.35 meta mark set 0x00004000
                dnat to 57.112.0.35:8080 fully-random comment "I"
        }
!
! backend pod 2 listens on port 8989
!
        chain k8s-nfproxy-sep-I7XZOUOVPIQW4IXA {
                ip saddr 57.112.0.36 meta mark set 0x00004000
                dnat to 57.112.0.36:8989 fully-random comment "I"
        }

sbezverk@kube-4:~/pods/nftables$ kubectl get svc
NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP      PORT(S)           AGE
app2                        ClusterIP   57.141.53.140   192.168.80.104   808/TCP,809/UDP   12h


sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
Still alive pod1 :)

sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
curl: (7) Failed to connect to 192.168.80.104 port 808: Connection refused

sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
Still alive pod1 :)

sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
curl: (7) Failed to connect to 192.168.80.104 port 808: Connection refused

sbezverk@kube-4:~/pods/nftables$ kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
app2-backend-1-57df95db4d-5n9sz         2/2     Running   0          7h2m
app2-backend-2-5b9c9b7b6f-8zppz         2/2     Running   0          6h46m

!
! As you can see, each pod is listening on its corresponding container port
!
sbezverk@kube-4:~/pods/nftables$ kubectl exec app2-backend-1-57df95db4d-5n9sz -- netstat -tunlp
Defaulting container name to nft.
Use 'kubectl describe pod/app2-backend-1-57df95db4d-5n9sz -n default' to see all of the containers in this pod.
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      40/nc               
tcp        0      0 0.0.0.0:5555            0.0.0.0:*               LISTEN      -                   
tcp6       0      0 :::8080                 :::*                    LISTEN      40/nc               
tcp6       0      0 :::5555                 :::*                    LISTEN      -                   

sbezverk@kube-4:~/pods/nftables$ kubectl exec app2-backend-2-5b9c9b7b6f-8zppz -- netstat -tunlp
Defaulting container name to nft.
Use 'kubectl describe pod/app2-backend-2-5b9c9b7b6f-8zppz -n default' to see all of the containers in this pod.
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:8989            0.0.0.0:*               LISTEN      9/nc                
tcp        0      0 0.0.0.0:6666            0.0.0.0:*               LISTEN      -                   
tcp6       0      0 :::8989                 :::*                    LISTEN      9/nc                
tcp6       0      0 :::6666                 :::*                    LISTEN      -                   



sbezverk@kube-4:~/pods/nftables$ curl http://57.141.53.140:808
Still alive pod1 :)
sbezverk@kube-4:~/pods/nftables$ curl http://57.141.53.140:808
^C
sbezverk@kube-4:~/pods/nftables$



[root@app2-backend-1-57df95db4d-5n9sz /]# tcpdump -i eth0 -v -x -nnnn
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

01:46:08.015618 IP (tos 0x0, ttl 64, id 15032, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.80.104.24259 > 57.112.0.35.8080: Flags [S], cksum 0x4ad2 (incorrect -> 0x6541), seq 995259474, win 65495, options [mss 65495,sackOK,TS val 2275469446 ecr 0,nop,wscale 7], length 0
        0x0000:  4500 003c 3ab8 4000 4006 b560 c0a8 5068
        0x0010:  3970 0023 5ec3 1f90 3b52 7452 0000 0000
        0x0020:  a002 ffd7 4ad2 0000 0204 ffd7 0402 080a
        0x0030:  87a0 e886 0000 0000 0103 0307
01:46:08.015635 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    57.112.0.35.8080 > 192.168.80.104.24259: Flags [S.], cksum 0x4ad2 (incorrect -> 0xc783), seq 2583386211, ack 995259475, win 65160, options [mss 1460,sackOK,TS val 1202282263 ecr 2275469446,nop,wscale 7], length 0
        0x0000:  4500 003c 0000 4000 4006 f018 3970 0023
        0x0010:  c0a8 5068 1f90 5ec3 99fb 5863 3b52 7453
        0x0020:  a012 fe88 4ad2 0000 0204 05b4 0402 080a
        0x0030:  47a9 5f17 87a0 e886 0103 0307
01:46:08.015652 IP (tos 0x0, ttl 64, id 15033, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.24259 > 57.112.0.35.8080: Flags [.], cksum 0x4aca (incorrect -> 0xf2d8), ack 1, win 512, options [nop,nop,TS val 2275469446 ecr 1202282263], length 0
        0x0000:  4500 0034 3ab9 4000 4006 b567 c0a8 5068
        0x0010:  3970 0023 5ec3 1f90 3b52 7453 99fb 5864
        0x0020:  8010 0200 4aca 0000 0101 080a 87a0 e886
        0x0030:  47a9 5f17
01:46:08.015700 IP (tos 0x0, ttl 64, id 15034, offset 0, flags [DF], proto TCP (6), length 134)
    192.168.80.104.24259 > 57.112.0.35.8080: Flags [P.], cksum 0x4b1c (incorrect -> 0x0ba6), seq 1:83, ack 1, win 512, options [nop,nop,TS val 2275469446 ecr 1202282263], length 82: HTTP, length: 82
        GET / HTTP/1.1
        Host: 192.168.80.104:808
        User-Agent: curl/7.58.0
        Accept: */*

        0x0000:  4500 0086 3aba 4000 4006 b514 c0a8 5068
        0x0010:  3970 0023 5ec3 1f90 3b52 7453 99fb 5864
        0x0020:  8018 0200 4b1c 0000 0101 080a 87a0 e886
        0x0030:  47a9 5f17 4745 5420 2f20 4854 5450 2f31
        0x0040:  2e31 0d0a 486f 7374 3a20 3139 322e 3136
        0x0050:  382e 3830 2e31 3034 3a38 3038 0d0a 5573
        0x0060:  6572 2d41 6765 6e74 3a20 6375 726c 2f37
        0x0070:  2e35 382e 300d 0a41 6363 6570 743a 202a
        0x0080:  2f2a 0d0a 0d0a
01:46:08.015704 IP (tos 0x0, ttl 64, id 5774, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.35.8080 > 192.168.80.104.24259: Flags [.], cksum 0x4aca (incorrect -> 0xf289), ack 83, win 509, options [nop,nop,TS val 1202282263 ecr 2275469446], length 0
        0x0000:  4500 0034 168e 4000 4006 d992 3970 0023
        0x0010:  c0a8 5068 1f90 5ec3 99fb 5864 3b52 74a5
        0x0020:  8010 01fd 4aca 0000 0101 080a 47a9 5f17
        0x0030:  87a0 e886
01:46:08.015784 IP (tos 0x0, ttl 64, id 5775, offset 0, flags [DF], proto TCP (6), length 89)
    57.112.0.35.8080 > 192.168.80.104.24259: Flags [P.], cksum 0x4aef (incorrect -> 0x2924), seq 1:38, ack 83, win 509, options [nop,nop,TS val 1202282263 ecr 2275469446], length 37: HTTP, length: 37
        HTTP/1.0 200 Ok

        Still alive pod1 :)
        0x0000:  4500 0059 168f 4000 4006 d96c 3970 0023
        0x0010:  c0a8 5068 1f90 5ec3 99fb 5864 3b52 74a5
        0x0020:  8018 01fd 4aef 0000 0101 080a 47a9 5f17
        0x0030:  87a0 e886 4854 5450 2f31 2e30 2032 3030
        0x0040:  204f 6b0a 0a53 7469 6c6c 2061 6c69 7665
        0x0050:  2070 6f64 3120 3a29 0a
01:46:08.015808 IP (tos 0x0, ttl 64, id 15035, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.24259 > 57.112.0.35.8080: Flags [.], cksum 0x4aca (incorrect -> 0xf261), ack 38, win 512, options [nop,nop,TS val 2275469446 ecr 1202282263], length 0
        0x0000:  4500 0034 3abb 4000 4006 b565 c0a8 5068
        0x0010:  3970 0023 5ec3 1f90 3b52 74a5 99fb 5889
        0x0020:  8010 0200 4aca 0000 0101 080a 87a0 e886
        0x0030:  47a9 5f17
01:46:08.015877 IP (tos 0x0, ttl 64, id 5776, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.35.8080 > 192.168.80.104.24259: Flags [F.], cksum 0x4aca (incorrect -> 0xf263), seq 38, ack 83, win 509, options [nop,nop,TS val 1202282263 ecr 2275469446], length 0
        0x0000:  4500 0034 1690 4000 4006 d990 3970 0023
        0x0010:  c0a8 5068 1f90 5ec3 99fb 5889 3b52 74a5
        0x0020:  8011 01fd 4aca 0000 0101 080a 47a9 5f17
        0x0030:  87a0 e886
01:46:08.016107 IP (tos 0x0, ttl 64, id 15036, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.24259 > 57.112.0.35.8080: Flags [F.], cksum 0x4aca (incorrect -> 0xf25f), seq 83, ack 39, win 512, options [nop,nop,TS val 2275469446 ecr 1202282263], length 0
        0x0000:  4500 0034 3abc 4000 4006 b564 c0a8 5068
        0x0010:  3970 0023 5ec3 1f90 3b52 74a5 99fb 588a
        0x0020:  8011 0200 4aca 0000 0101 080a 87a0 e886
        0x0030:  47a9 5f17
01:46:08.016119 IP (tos 0x0, ttl 64, id 5777, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.35.8080 > 192.168.80.104.24259: Flags [.], cksum 0x4aca (incorrect -> 0xf262), ack 84, win 509, options [nop,nop,TS val 1202282263 ecr 2275469446], length 0
        0x0000:  4500 0034 1691 4000 4006 d98f 3970 0023
        0x0010:  c0a8 5068 1f90 5ec3 99fb 588a 3b52 74a6
        0x0020:  8010 01fd 4aca 0000 0101 080a 47a9 5f17
        0x0030:  87a0 e886
01:46:13.222944 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 169.254.1.1 tell 57.112.0.35, length 28
        0x0000:  0001 0800 0604 0001 8ad7 3650 134c 3970
        0x0010:  0023 0000 0000 0000 a9fe 0101
01:46:13.222989 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 57.112.0.35 tell 192.168.80.104, length 28
        0x0000:  0001 0800 0604 0001 eeee eeee eeee c0a8
        0x0010:  5068 0000 0000 0000 3970 0023
01:46:13.222992 ARP, Ethernet (len 6), IPv4 (len 4), Reply 57.112.0.35 is-at 8a:d7:36:50:13:4c, length 28
        0x0000:  0001 0800 0604 0002 8ad7 3650 134c 3970
        0x0010:  0023 eeee eeee eeee c0a8 5068
01:46:13.223010 ARP, Ethernet (len 6), IPv4 (len 4), Reply 169.254.1.1 is-at ee:ee:ee:ee:ee:ee, length 28
        0x0000:  0001 0800 0604 0002 eeee eeee eeee a9fe
        0x0010:  0101 8ad7 3650 134c 3970 0023


sbezverk@kube-4:~/pods/nftables$ kubectl delete -f pod-app2.yaml 
deployment.apps "app2-backend-1" deleted
sbezverk@kube-4:~/pods/nftables$ curl http://57.141.53.140:808
Still alive from pod2 :)
sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
Still alive from pod2 :)

[root@app2-backend-2-5b9c9b7b6f-8zppz /]# tcpdump -i eth0 -v -x -nnn
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

01:47:24.603281 IP (tos 0x0, ttl 64, id 3490, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.80.104.29974 > 57.112.0.36.8989: Flags [S], cksum 0x4ad3 (incorrect -> 0x5f6b), seq 3208549399, win 64240, options [mss 1460,sackOK,TS val 1264020510 ecr 0,nop,wscale 7], length 0
        0x0000:  4500 003c 0da2 4000 4006 e275 c0a8 5068
        0x0010:  3970 0024 7516 231d bf3e 9417 0000 0000
        0x0020:  a002 faf0 4ad3 0000 0204 05b4 0402 080a
        0x0030:  4b57 6c1e 0000 0000 0103 0307
01:47:24.603310 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    57.112.0.36.8989 > 192.168.80.104.29974: Flags [S.], cksum 0x4ad3 (incorrect -> 0x0fd1), seq 2648221851, ack 3208549400, win 65160, options [mss 1460,sackOK,TS val 911265580 ecr 1264020510,nop,wscale 7], length 0
        0x0000:  4500 003c 0000 4000 4006 f017 3970 0024
        0x0010:  c0a8 5068 231d 7516 9dd8 a89b bf3e 9418
        0x0020:  a012 fe88 4ad3 0000 0204 05b4 0402 080a
        0x0030:  3650 cf2c 4b57 6c1e 0103 0307
01:47:24.603338 IP (tos 0x0, ttl 64, id 3491, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.29974 > 57.112.0.36.8989: Flags [.], cksum 0x4acb (incorrect -> 0x3b30), ack 1, win 502, options [nop,nop,TS val 1264020510 ecr 911265580], length 0
        0x0000:  4500 0034 0da3 4000 4006 e27c c0a8 5068
        0x0010:  3970 0024 7516 231d bf3e 9418 9dd8 a89c
        0x0020:  8010 01f6 4acb 0000 0101 080a 4b57 6c1e
        0x0030:  3650 cf2c
01:47:24.603387 IP (tos 0x0, ttl 64, id 3492, offset 0, flags [DF], proto TCP (6), length 133)
    192.168.80.104.29974 > 57.112.0.36.8989: Flags [P.], cksum 0x4b1c (incorrect -> 0x3c4e), seq 1:82, ack 1, win 502, options [nop,nop,TS val 1264020511 ecr 911265580], length 81
        0x0000:  4500 0085 0da4 4000 4006 e22a c0a8 5068
        0x0010:  3970 0024 7516 231d bf3e 9418 9dd8 a89c
        0x0020:  8018 01f6 4b1c 0000 0101 080a 4b57 6c1f
        0x0030:  3650 cf2c 4745 5420 2f20 4854 5450 2f31
        0x0040:  2e31 0d0a 486f 7374 3a20 3537 2e31 3431
        0x0050:  2e35 332e 3134 303a 3830 380d 0a55 7365
        0x0060:  722d 4167 656e 743a 2063 7572 6c2f 372e
        0x0070:  3538 2e30 0d0a 4163 6365 7074 3a20 2a2f
        0x0080:  2a0d 0a0d 0a
01:47:24.603391 IP (tos 0x0, ttl 64, id 12585, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.36.8989 > 192.168.80.104.29974: Flags [.], cksum 0x4acb (incorrect -> 0x3ad6), ack 82, win 509, options [nop,nop,TS val 911265581 ecr 1264020511], length 0
        0x0000:  4500 0034 3129 4000 4006 bef6 3970 0024
        0x0010:  c0a8 5068 231d 7516 9dd8 a89c bf3e 9469
        0x0020:  8010 01fd 4acb 0000 0101 080a 3650 cf2d
        0x0030:  4b57 6c1f
01:47:24.603419 IP (tos 0x0, ttl 64, id 12586, offset 0, flags [DF], proto TCP (6), length 94)
    57.112.0.36.8989 > 192.168.80.104.29974: Flags [P.], cksum 0x4af5 (incorrect -> 0x58ad), seq 1:43, ack 82, win 509, options [nop,nop,TS val 911265581 ecr 1264020511], length 42
        0x0000:  4500 005e 312a 4000 4006 becb 3970 0024
        0x0010:  c0a8 5068 231d 7516 9dd8 a89c bf3e 9469
        0x0020:  8018 01fd 4af5 0000 0101 080a 3650 cf2d
        0x0030:  4b57 6c1f 4854 5450 2f31 2e30 2032 3030
        0x0040:  204f 6b0a 0a53 7469 6c6c 2061 6c69 7665
        0x0050:  2066 726f 6d20 706f 6432 203a 290a
01:47:24.603498 IP (tos 0x0, ttl 64, id 3493, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.29974 > 57.112.0.36.8989: Flags [.], cksum 0x4acb (incorrect -> 0x3ab3), ack 43, win 502, options [nop,nop,TS val 1264020511 ecr 911265581], length 0
        0x0000:  4500 0034 0da5 4000 4006 e27a c0a8 5068
        0x0010:  3970 0024 7516 231d bf3e 9469 9dd8 a8c6
        0x0020:  8010 01f6 4acb 0000 0101 080a 4b57 6c1f
        0x0030:  3650 cf2d
01:47:24.603553 IP (tos 0x0, ttl 64, id 12587, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.36.8989 > 192.168.80.104.29974: Flags [F.], cksum 0x4acb (incorrect -> 0x3aab), seq 43, ack 82, win 509, options [nop,nop,TS val 911265581 ecr 1264020511], length 0
        0x0000:  4500 0034 312b 4000 4006 bef4 3970 0024
        0x0010:  c0a8 5068 231d 7516 9dd8 a8c6 bf3e 9469
        0x0020:  8011 01fd 4acb 0000 0101 080a 3650 cf2d
        0x0030:  4b57 6c1f
01:47:24.603607 IP (tos 0x0, ttl 64, id 3494, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.29974 > 57.112.0.36.8989: Flags [F.], cksum 0x4acb (incorrect -> 0x3ab1), seq 82, ack 44, win 502, options [nop,nop,TS val 1264020511 ecr 911265581], length 0
        0x0000:  4500 0034 0da6 4000 4006 e279 c0a8 5068
        0x0010:  3970 0024 7516 231d bf3e 9469 9dd8 a8c7
        0x0020:  8011 01f6 4acb 0000 0101 080a 4b57 6c1f
        0x0030:  3650 cf2d
01:47:24.603616 IP (tos 0x0, ttl 64, id 12588, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.36.8989 > 192.168.80.104.29974: Flags [.], cksum 0x4acb (incorrect -> 0x3aaa), ack 83, win 509, options [nop,nop,TS val 911265581 ecr 1264020511], length 0
        0x0000:  4500 0034 312c 4000 4006 bef3 3970 0024
        0x0010:  c0a8 5068 231d 7516 9dd8 a8c7 bf3e 946a
        0x0020:  8010 01fd 4acb 0000 0101 080a 3650 cf2d
        0x0030:  4b57 6c1f
01:47:29.766863 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 169.254.1.1 tell 57.112.0.36, length 28
        0x0000:  0001 0800 0604 0001 3e10 5d99 078e 3970
        0x0010:  0024 0000 0000 0000 a9fe 0101
01:47:29.766939 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 57.112.0.36 tell 192.168.80.104, length 28
        0x0000:  0001 0800 0604 0001 eeee eeee eeee c0a8
        0x0010:  5068 0000 0000 0000 3970 0024
01:47:29.766943 ARP, Ethernet (len 6), IPv4 (len 4), Reply 57.112.0.36 is-at 3e:10:5d:99:07:8e, length 28
        0x0000:  0001 0800 0604 0002 3e10 5d99 078e 3970
        0x0010:  0024 eeee eeee eeee c0a8 5068
01:47:29.766962 ARP, Ethernet (len 6), IPv4 (len 4), Reply 169.254.1.1 is-at ee:ee:ee:ee:ee:ee, length 28
        0x0000:  0001 0800 0604 0002 eeee eeee eeee a9fe
        0x0010:  0101 3e10 5d99 078e 3970 0024
01:47:46.295822 IP (tos 0x0, ttl 64, id 9839, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.80.104.44286 > 57.112.0.36.8989: Flags [S], cksum 0x4ad3 (incorrect -> 0x65ba), seq 1424853131, win 65495, options [mss 65495,sackOK,TS val 2275567726 ecr 0,nop,wscale 7], length 0
        0x0000:  4500 003c 266f 4000 4006 c9a8 c0a8 5068
        0x0010:  3970 0024 acfe 231d 54ed 888b 0000 0000
        0x0020:  a002 ffd7 4ad3 0000 0204 ffd7 0402 080a
        0x0030:  87a2 686e 0000 0000 0103 0307
01:47:46.295841 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    57.112.0.36.8989 > 192.168.80.104.44286: Flags [S.], cksum 0x4ad3 (incorrect -> 0x6349), seq 2661811440, ack 1424853132, win 65160, options [mss 1460,sackOK,TS val 911287273 ecr 2275567726,nop,wscale 7], length 0
        0x0000:  4500 003c 0000 4000 4006 f017 3970 0024
        0x0010:  c0a8 5068 231d acfe 9ea8 04f0 54ed 888c
        0x0020:  a012 fe88 4ad3 0000 0204 05b4 0402 080a
        0x0030:  3651 23e9 87a2 686e 0103 0307
01:47:46.295859 IP (tos 0x0, ttl 64, id 9840, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.44286 > 57.112.0.36.8989: Flags [.], cksum 0x4acb (incorrect -> 0x8e9e), ack 1, win 512, options [nop,nop,TS val 2275567726 ecr 911287273], length 0
        0x0000:  4500 0034 2670 4000 4006 c9af c0a8 5068
        0x0010:  3970 0024 acfe 231d 54ed 888c 9ea8 04f1
        0x0020:  8010 0200 4acb 0000 0101 080a 87a2 686e
        0x0030:  3651 23e9
01:47:46.295948 IP (tos 0x0, ttl 64, id 9841, offset 0, flags [DF], proto TCP (6), length 134)
    192.168.80.104.44286 > 57.112.0.36.8989: Flags [P.], cksum 0x4b1d (incorrect -> 0xa76b), seq 1:83, ack 1, win 512, options [nop,nop,TS val 2275567726 ecr 911287273], length 82
        0x0000:  4500 0086 2671 4000 4006 c95c c0a8 5068
        0x0010:  3970 0024 acfe 231d 54ed 888c 9ea8 04f1
        0x0020:  8018 0200 4b1d 0000 0101 080a 87a2 686e
        0x0030:  3651 23e9 4745 5420 2f20 4854 5450 2f31
        0x0040:  2e31 0d0a 486f 7374 3a20 3139 322e 3136
        0x0050:  382e 3830 2e31 3034 3a38 3038 0d0a 5573
        0x0060:  6572 2d41 6765 6e74 3a20 6375 726c 2f37
        0x0070:  2e35 382e 300d 0a41 6363 6570 743a 202a
        0x0080:  2f2a 0d0a 0d0a
01:47:46.295952 IP (tos 0x0, ttl 64, id 43541, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.36.8989 > 192.168.80.104.44286: Flags [.], cksum 0x4acb (incorrect -> 0x8e4f), ack 83, win 509, options [nop,nop,TS val 911287273 ecr 2275567726], length 0
        0x0000:  4500 0034 aa15 4000 4006 460a 3970 0024
        0x0010:  c0a8 5068 231d acfe 9ea8 04f1 54ed 88de
        0x0020:  8010 01fd 4acb 0000 0101 080a 3651 23e9
        0x0030:  87a2 686e
01:47:46.296030 IP (tos 0x0, ttl 64, id 43542, offset 0, flags [DF], proto TCP (6), length 94)
    57.112.0.36.8989 > 192.168.80.104.44286: Flags [P.], cksum 0x4af5 (incorrect -> 0xac26), seq 1:43, ack 83, win 509, options [nop,nop,TS val 911287273 ecr 2275567726], length 42
        0x0000:  4500 005e aa16 4000 4006 45df 3970 0024
        0x0010:  c0a8 5068 231d acfe 9ea8 04f1 54ed 88de
        0x0020:  8018 01fd 4af5 0000 0101 080a 3651 23e9
        0x0030:  87a2 686e 4854 5450 2f31 2e30 2032 3030
        0x0040:  204f 6b0a 0a53 7469 6c6c 2061 6c69 7665
        0x0050:  2066 726f 6d20 706f 6432 203a 290a
01:47:46.296070 IP (tos 0x0, ttl 64, id 9842, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.44286 > 57.112.0.36.8989: Flags [.], cksum 0x4acb (incorrect -> 0x8e22), ack 43, win 512, options [nop,nop,TS val 2275567726 ecr 911287273], length 0
        0x0000:  4500 0034 2672 4000 4006 c9ad c0a8 5068
        0x0010:  3970 0024 acfe 231d 54ed 88de 9ea8 051b
        0x0020:  8010 0200 4acb 0000 0101 080a 87a2 686e
        0x0030:  3651 23e9
01:47:46.296113 IP (tos 0x0, ttl 64, id 43543, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.36.8989 > 192.168.80.104.44286: Flags [F.], cksum 0x4acb (incorrect -> 0x8e24), seq 43, ack 83, win 509, options [nop,nop,TS val 911287273 ecr 2275567726], length 0
        0x0000:  4500 0034 aa17 4000 4006 4608 3970 0024
        0x0010:  c0a8 5068 231d acfe 9ea8 051b 54ed 88de
        0x0020:  8011 01fd 4acb 0000 0101 080a 3651 23e9
        0x0030:  87a2 686e
01:47:46.296186 IP (tos 0x0, ttl 64, id 9843, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.80.104.44286 > 57.112.0.36.8989: Flags [F.], cksum 0x4acb (incorrect -> 0x8e20), seq 83, ack 44, win 512, options [nop,nop,TS val 2275567726 ecr 911287273], length 0
        0x0000:  4500 0034 2673 4000 4006 c9ac c0a8 5068
        0x0010:  3970 0024 acfe 231d 54ed 88de 9ea8 051c
        0x0020:  8011 0200 4acb 0000 0101 080a 87a2 686e
        0x0030:  3651 23e9
01:47:46.296199 IP (tos 0x0, ttl 64, id 43544, offset 0, flags [DF], proto TCP (6), length 52)
    57.112.0.36.8989 > 192.168.80.104.44286: Flags [.], cksum 0x4acb (incorrect -> 0x8e23), ack 84, win 509, options [nop,nop,TS val 911287273 ecr 2275567726], length 0
        0x0000:  4500 0034 aa18 4000 4006 4607 3970 0024
        0x0010:  c0a8 5068 231d acfe 9ea8 051c 54ed 88df
        0x0020:  8010 01fd 4acb 0000 0101 080a 3651 23e9
        0x0030:  87a2 686e

sbezverk@kube-4:~/pods/nftables$ kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
app2-backend-1-57df95db4d-wvgrk         2/2     Running   0          39s
app2-backend-2-5b9c9b7b6f-8zppz         2/2     Running   0          6h59m
icmp-responder-nsc-69c7bc4f84-9kwsj     3/3     Running   0          3d22h
icmp-responder-nse-75868d8cdc-vgtl5     1/1     Running   0          3d22h
nsm-admission-webhook-b947766c8-rjjl4   1/1     Running   0          3d23h
nsm-vpp-forwarder-c5z44                 1/1     Running   0          3d23h
nsmgr-znh86                             3/3     Running   1          5h17m
prefix-service-58dcbd95d6-pd7g7         1/1     Running   0          3d23h
sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
Still alive from pod2 :)
sbezverk@kube-4:~/pods/nftables$ curl http://192.168.80.104:808
curl: (7) Failed to connect to 192.168.80.104 port 808: Connection refused
sbezverk@kube-4:~/pods/nftables$ curl http://57.141.53.140:808
Still alive from pod2 :)
sbezverk@kube-4:~/pods/nftables$ curl http://57.141.53.140:808

curl: (7) Failed to connect to 57.141.53.140 port 808: Connection timed out




* Re: load balancing between two chains
From: Phil Sutter @ 2020-01-20 11:23 UTC
  To: sbezverk; +Cc: netfilter-devel

Hi Serguei,

On Sun, Jan 19, 2020 at 09:46:11PM -0500, sbezverk wrote:
> While doing some performance tests (btw, the results are awesome so far) I came across an issue. It is a kubernetes environment with a cluster-scope service backed by 2 pods. The rule for this service programs load balancing between 2 chains, one per backend pod. When I curl the service, only 1 backend pod replies; the second times out. If I delete the pod which was working, then the second pod starts replying to curl requests. Here are some logs and packet captures. I'd appreciate it if you could take a look and share your thoughts.

Please add counters to your rules to check if both dnat statements are
hit. You may also switch 'jump' in vmap to 'goto' and add a final rule
in k8s-nfproxy-svc-M53CN2XYVUHRQ7UB (which should never see packets).

Did you provide a dump of traffic between load-balancer and pod2? (No
traffic is relevant info, too!) A dump of /proc/net/nf_conntrack in
error situation might reveal something, too.
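
For example (a minimal sketch; the chain names are taken from your
listing, substitute your own table name for <table>):

    nft insert rule ip <table> k8s-nfproxy-sep-RGU2UYFOJNW24NA5 counter
    nft insert rule ip <table> k8s-nfproxy-sep-I7XZOUOVPIQW4IXA counter
    # final catch-all; with 'goto' in the vmap it should never match
    nft add rule ip <table> k8s-nfproxy-svc-M53CN2XYVUHRQ7UB counter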

Cheers, Phil


* Re: load balancing between two chains
From: sbezverk @ 2020-01-20 16:31 UTC
  To: Phil Sutter; +Cc: netfilter-devel

Hi Phil,

There is no load balancer; curl is executed from the node hosting both pods, so all traffic is local to the node.

As per your suggestion, I modified the nfproxy rules:

        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
                numgen random mod 2 vmap { 0 : goto k8s-nfproxy-sep-I7XZOUOVPIQW4IXA, 1 : goto k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ }
                counter packets 3 bytes 180 comment ""
        }

        chain k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.38 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.38:8080 fully-random
        }

        chain k8s-nfproxy-sep-I7XZOUOVPIQW4IXA {
                counter packets 1 bytes 60 comment ""
                ip saddr 57.112.0.36 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.36:8989 fully-random
        }

I could not find the file /proc/net/nf_conntrack, but I do see the nf_conntrack module is loaded:
nf_conntrack          131072  12 xt_conntrack,nf_nat,nft_ct,nft_nat,nf_nat_ipv6,ipt_MASQUERADE,nf_nat_ipv4,xt_nat,nf_conntrack_netlink,nft_masq,nft_masq_ipv4,ip_vs
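
As an alternative (assuming the conntrack-tools package is available on
the node), the entries can also be listed with, e.g.:

    sudo conntrack -L -d 57.112.0.36    # filter by DNATed destination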

tcpdump in pod 1 does not see any of curl's generated packets, but in pod 2 it does.

I noticed one hopefully useful fact: it is always the endpoint associated with the 1st chain in the numgen rule that works, while the 2nd does not.

Is there anything else I could collect to understand why this rule does not work as intended?

Thank you very much for your help
Serguei


* Re: load balancing between two chains
From: Florian Westphal @ 2020-01-20 17:06 UTC
  To: sbezverk; +Cc: Phil Sutter, netfilter-devel

sbezverk <sbezverk@gmail.com> wrote:
> Hi Phil,
> 
> There is no load balancer; curl is executed from the node hosting both pods, so all traffic is local to the node.
> 
> As per your suggestion, I modified the nfproxy rules:
> 
>         chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
>                 numgen random mod 2 vmap { 0 : goto k8s-nfproxy-sep-I7XZOUOVPIQW4IXA, 1 : goto k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ }
>                 counter packets 3 bytes 180 comment ""
>         }
> 
>         chain k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ {
>                 counter packets 0 bytes 0 comment ""
>                 ip saddr 57.112.0.38 meta mark set 0x00004000 comment ""
>                 dnat to 57.112.0.38:8080 fully-random
>         }
> 
>         chain k8s-nfproxy-sep-I7XZOUOVPIQW4IXA {
>                 counter packets 1 bytes 60 comment ""
>                 ip saddr 57.112.0.36 meta mark set 0x00004000 comment ""
>                 dnat to 57.112.0.36:8989 fully-random
>         }

Weird, it looks like it generates 0 and something else, not 1.

Works for me on x86_64 with 5.4.10 kernel:

table ip test {
        chain output {
                type filter hook output priority filter; policy accept;
                jump k8s-nfproxy-svc-M53CN2XYVUHRQ7UB
        }

        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
                numgen random mod 2 vmap { 0 : goto k8s-nfproxy-sep-I7XZOUOVPIQW4IXA, 1 : goto k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ }
                counter packets 0 bytes 0
        }

        chain k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ {
                counter packets 68602 bytes 5763399
                ip saddr 57.112.0.38 meta mark set 0x00004000 comment ""
        }

        chain k8s-nfproxy-sep-I7XZOUOVPIQW4IXA {
                counter packets 69159 bytes 5809685
                ip saddr 57.112.0.36 meta mark set 0x00004000 comment ""
        }
}

(I removed nat rules and then ran ping -f 127.0.0.1).

Does it work when you use "numgen inc" instead of "numgen rand" ?
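
I.e. keeping everything else the same and only changing the generator
(sketch):

    numgen random mod 2 vmap { 0 : goto ..., 1 : goto ... }
    numgen inc    mod 2 vmap { 0 : goto ..., 1 : goto ... }

"inc" is a per-rule counter modulo 2, so it should alternate
0,1,0,1,... between the two endpoint chains instead of choosing
randomly.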


* Re: load balancing between two chains
From: sbezverk @ 2020-01-20 17:42 UTC
  To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel

Hello,

I changed the kernel to 5.4.10 and switched to using "inc" instead of "random". Now the first curl works and the second fails. Whenever the second chain is selected, the curl connection gets stuck.

        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
                numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 1 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 }
                counter packets 1 bytes 60 comment ""
        }

        chain k8s-nfproxy-sep-TMVEFT7EX55F4T62 {
                counter packets 1 bytes 60 comment ""
                ip saddr 57.112.0.41 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.41:8080 fully-random
        }

        chain k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.52 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.52:8989 fully-random
        }

Any debug I could enable to see where the packet goes?

Thank you
Serguei

* Re: load balancing between two chains
From: Florian Westphal @ 2020-01-20 21:39 UTC
  To: sbezverk; +Cc: Florian Westphal, Phil Sutter, netfilter-devel

sbezverk <sbezverk@gmail.com> wrote:
> I changed the kernel to 5.4.10 and switched to using "inc" instead of "random". Now the first curl works and the second fails. Whenever the second chain is selected, the curl connection gets stuck.
> 
>         chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
>                 numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 1 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 }
>                 counter packets 1 bytes 60 comment ""
>         }
> 
>         chain k8s-nfproxy-sep-TMVEFT7EX55F4T62 {
>                 counter packets 1 bytes 60 comment ""
>                 ip saddr 57.112.0.41 meta mark set 0x00004000 comment ""
>                 dnat to 57.112.0.41:8080 fully-random
>         }
> 
>         chain k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 {
>                 counter packets 0 bytes 0 comment ""
>                 ip saddr 57.112.0.52 meta mark set 0x00004000 comment ""
>                 dnat to 57.112.0.52:8989 fully-random
>         }
> 
> Any debug I could enable to see where the packet goes?

The counter after numgen should not increment, but it does.
Either numgen does something wrong, or the hash algorithm is broken and
doesn't find a result for "1".

There was such a bug, but it was fixed in 5.1...

please show:
uname -a
nft --version
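
To see where a packet actually goes, you can also use nft's trace
infrastructure, e.g. (a sketch; substitute your nat table and its
output-hook chain):

    nft insert rule ip <table> <output-chain> tcp dport 808 meta nftrace set 1
    nft monitor trace

'nft monitor trace' then prints every rule the marked packets traverse.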


* Re: load balancing between two chains
From: sbezverk @ 2020-01-20 21:54 UTC
  To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel

The numgen vmap uses the goto directive, not jump (Phil asked me to change it). I thought that meant after hitting one of the chains in the vmap, processing would go back to the service chain, no?

It is Ubuntu 18.04

sbezverk@kube-4:~$ uname -a
Linux kube-4 5.4.10-050410-generic #202001091038 SMP Thu Jan 9 10:41:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
sbezverk@kube-4:~$ sudo nft --version
nftables v0.9.1 (Headless Horseman)
sbezverk@kube-4:~$

I also want to remind you that I do NOT use the nft CLI to program the rules; I use it only to see the resulting rules.

If you need me to enable any kernel/module debugging or try some debugging code, please let me know; I am open to it, and a live debugging session on my server is also possible. This issue has a very high priority, as it would be a real show stopper in a kubernetes environment.

Thank you
Serguei 


* Re: load balancing between two chains
From: Florian Westphal @ 2020-01-20 22:00 UTC
  To: sbezverk; +Cc: Florian Westphal, Phil Sutter, netfilter-devel

sbezverk <sbezverk@gmail.com> wrote:
> The numgen vmap uses the goto directive, not jump (Phil asked me to change it). I thought that meant after hitting one of the chains in the vmap, processing would go back to the service chain, no?
> 
> It is Ubuntu 18.04
> 
> sbezverk@kube-4:~$ uname -a
> Linux kube-4 5.4.10-050410-generic #202001091038 SMP Thu Jan 9 10:41:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> sbezverk@kube-4:~$ sudo nft --version
> nftables v0.9.1 (Headless Horseman)
> sbezverk@kube-4:~$
> 
> I also want to remind you that I do NOT use the nft CLI to program the rules; I use it only to see the resulting rules.

In that case, please include "nft --debug=netlink list ruleset".

It would also be good to check whether things work when you add the
same rules via the nft tool.
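
E.g. a minimal standalone reproducer along these lines (a sketch; the
table and chain names here are hypothetical):

    table ip numgen-test {
            chain out {
                    type nat hook output priority -100; policy accept;
                    tcp dport 808 numgen inc mod 2 vmap { 0 : goto sep1, 1 : goto sep2 }
                    counter comment "fallthrough, should stay at 0"
            }
            chain sep1 {
                    counter
            }
            chain sep2 {
                    counter
            }
    }

Load it with 'nft -f numgen-test.nft', run a few curls, and check the
counters with 'nft list table ip numgen-test'.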

>     > 
>     >         chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
>     >                 numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 1 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 }
>     >                 counter packets 1 bytes 60 comment ""
>     >         }

Just to clarify: the "goto" means that the "counter" should NEVER
increment here, because the nft interpreter returns to the chain that
had "jump k8s-nfproxy-svc-M53CN2XYVUHRQ7UB".

jump and goto do the same thing, except that goto doesn't record the
location/chain to return to.
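
Schematically (hypothetical chain names):

    chain caller {
            # when 'svc' (or anything it goes to) ends, evaluation
            # resumes at the next rule in this chain
            jump svc
    }
    chain svc {
            # goto records no return location; the counter below is
            # only reached if the vmap lookup did NOT match
            goto sep
            counter
    }
    chain sep {
            # when this chain ends, control returns to 'caller',
            # not to 'svc'
            counter
    }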


* Re: load balancing between two chains
From: sbezverk @ 2020-01-20 22:07 UTC
  To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel

Here you go:

sbezverk@kube-4:~$ sudo nft --debug=netlink list ruleset
ip kube-nfproxy-v4 filter-input 23 
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00000008 ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
  [ immediate reg 0 jump -> k8s-filter-services ]
  userdata = { 
ip kube-nfproxy-v4 filter-input 24 23 
  [ immediate reg 0 jump -> k8s-filter-firewall ]
  userdata = { 
ip kube-nfproxy-v4 filter-output 27 
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00000008 ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
  [ immediate reg 0 jump -> k8s-filter-services ]
  userdata = { 
ip kube-nfproxy-v4 filter-output 28 27 
  [ immediate reg 0 jump -> k8s-filter-firewall ]
  userdata = { 
ip kube-nfproxy-v4 filter-forward 25 
  [ immediate reg 0 jump -> k8s-filter-forward ]
  userdata = { 
ip kube-nfproxy-v4 filter-forward 26 25 
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00000008 ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
  [ immediate reg 0 jump -> k8s-filter-services ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-firewall 29 
  [ meta load mark => reg 1 ]
  [ cmp eq reg 1 0x00008000 ]
  [ immediate reg 0 drop ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-services 35 
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ payload load 4b @ network header + 16 => reg 9 ]
  [ payload load 2b @ transport header + 2 => reg 10 ]
  [ lookup reg 1 set no-endpoints dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-forward 30 
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00000001 ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
  [ immediate reg 0 drop ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-forward 31 30 
  [ meta load mark => reg 1 ]
  [ cmp eq reg 1 0x00004000 ]
  [ immediate reg 0 accept ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-forward 32 31 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x0000f0ff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x00007039 ]
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00000006 ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
  [ immediate reg 0 accept ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-forward 33 32 
  [ payload load 4b @ network header + 16 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x0000f0ff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x00007039 ]
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x00000006 ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
  [ immediate reg 0 accept ]
  userdata = { 
ip kube-nfproxy-v4 k8s-filter-do-reject 34 
  [ reject type 0 code 1 ]
  userdata = { 
ip kube-nfproxy-v4 nat-preroutin 36 
  [ immediate reg 0 jump -> k8s-nat-services ]
  userdata = { 
ip kube-nfproxy-v4 nat-output 37 
  [ immediate reg 0 jump -> k8s-nat-services ]
  userdata = { 
ip kube-nfproxy-v4 nat-postrouting 38 
  [ immediate reg 0 jump -> k8s-nat-postrouting ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-mark-drop 39 
  [ immediate reg 1 0x00008000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-do-mark-masq 47 
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  [ immediate reg 0 return ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-mark-masq 48 
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ payload load 4b @ network header + 16 => reg 9 ]
  [ payload load 2b @ transport header + 2 => reg 10 ]
  [ lookup reg 1 set do-mark-masq dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-mark-masq 49 48 
  [ immediate reg 0 return ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-services 41 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x0000f0ff ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00007039 ]
  [ immediate reg 0 jump -> k8s-nat-mark-masq ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-services 42 41 
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ payload load 4b @ network header + 16 => reg 9 ]
  [ payload load 2b @ transport header + 2 => reg 10 ]
  [ lookup reg 1 set cluster-ip dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-services 43 42 
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ payload load 4b @ network header + 16 => reg 9 ]
  [ payload load 2b @ transport header + 2 => reg 10 ]
  [ lookup reg 1 set external-ip dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-services 44 43 
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ payload load 4b @ network header + 16 => reg 9 ]
  [ payload load 2b @ transport header + 2 => reg 10 ]
  [ lookup reg 1 set loadbalancer-ip dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-services 45 44 
  [ fib daddr type => reg 1 ]
  [ cmp eq reg 1 0x00000002 ]
  [ immediate reg 0 jump -> k8s-nat-nodeports ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-nodeports 46 
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ payload load 2b @ transport header + 2 => reg 9 ]
  [ lookup reg 1 set nodeports dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nat-postrouting 40 
  [ meta load mark => reg 1 ]
  [ cmp eq reg 1 0x00004000 ]
  [ masq flags 0xc ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-Z2V2H34MNX3I6O2G 112 
  [ numgen reg 1 = inc mod 2 ]
  [ lookup reg 1 set __map2 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-Z2V2H34MNX3I6O2G 59 112 
  [ counter pkts 1 bytes 60 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-WTQR35QT3M6PVG5X 54 
  [ counter pkts 3 bytes 180 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-WTQR35QT3M6PVG5X 55 54 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x6850a8c0 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-WTQR35QT3M6PVG5X 56 55 
  [ immediate reg 1 0x6850a8c0 ]
  [ immediate reg 2 0x00002b19 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-WTQR35QT3M6PVG5X 108 56 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-WTQR35QT3M6PVG5X 109 108 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x6850a8c0 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-WTQR35QT3M6PVG5X 110 109 
  [ immediate reg 1 0x6850a8c0 ]
  [ immediate reg 2 0x00002b19 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-M53CN2XYVUHRQ7UB 170 
  [ numgen reg 1 = inc mod 3 ]
  [ lookup reg 1 set __map5 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-M53CN2XYVUHRQ7UB 76 170 
  [ counter pkts 4 bytes 240 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-PL4AZP3AKMRCVEEV 101 
  [ numgen reg 1 = inc mod 2 ]
  [ lookup reg 1 set __map1 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-PL4AZP3AKMRCVEEV 83 101 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-F3FYSUNEU5GRF2PR 67 
  [ counter pkts 156 bytes 9360 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-F3FYSUNEU5GRF2PR 68 67 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x27007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-F3FYSUNEU5GRF2PR 69 68 
  [ immediate reg 1 0x27007039 ]
  [ immediate reg 2 0x0000911f ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-TMVEFT7EX55F4T62 71 
  [ counter pkts 3 bytes 180 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-TMVEFT7EX55F4T62 72 71 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x29007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-TMVEFT7EX55F4T62 73 72 
  [ immediate reg 1 0x29007039 ]
  [ immediate reg 2 0x0000901f ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-UOK7V3LF34NNNXJK 78 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-UOK7V3LF34NNNXJK 79 78 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x29007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-UOK7V3LF34NNNXJK 80 79 
  [ immediate reg 1 0x29007039 ]
  [ immediate reg 2 0x00009a1f ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-ZQKXCYOBISQCSH6Q 124 
  [ numgen reg 1 = inc mod 1 ]
  [ lookup reg 1 set __map4 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-ZQKXCYOBISQCSH6Q 125 124 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 88 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 89 88 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x34007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 90 89 
  [ immediate reg 1 0x34007039 ]
  [ immediate reg 2 0x00001d23 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-MLOFX2HRWDMEIJ2C 138 
  [ numgen reg 1 = inc mod 2 ]
  [ lookup reg 1 set __map6 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-MLOFX2HRWDMEIJ2C 132 138 
  [ counter pkts 1597 bytes 126466 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-AB4FZJCEEYJGUR7G 97 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-AB4FZJCEEYJGUR7G 98 97 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x34007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-AB4FZJCEEYJGUR7G 99 98 
  [ immediate reg 1 0x34007039 ]
  [ immediate reg 2 0x00002623 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-BKEZZE5BBBAFLJMD 151 
  [ numgen reg 1 = inc mod 2 ]
  [ lookup reg 1 set __map7 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-BKEZZE5BBBAFLJMD 145 151 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-XZFCNG333PM4X5VI 164 
  [ numgen reg 1 = inc mod 2 ]
  [ lookup reg 1 set __map8 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-XZFCNG333PM4X5VI 158 164 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-ALEQQYFAJOE576GL 117 
  [ numgen reg 1 = inc mod 1 ]
  [ lookup reg 1 set __map0 dreg 0 0x0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-svc-ALEQQYFAJOE576GL 118 117 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-5CXJFIVYWUOH4QP5 120 
  [ counter pkts 1 bytes 60 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-5CXJFIVYWUOH4QP5 121 120 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2f007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-5CXJFIVYWUOH4QP5 122 121 
  [ immediate reg 1 0x2f007039 ]
  [ immediate reg 2 0x0000bb01 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-ZLBUKWY4CZE4VBQ6 127 
  [ counter pkts 1597 bytes 127401 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-ZLBUKWY4CZE4VBQ6 128 127 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2a007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-ZLBUKWY4CZE4VBQ6 129 128 
  [ immediate reg 1 0x2a007039 ]
  [ immediate reg 2 0x00003500 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-L7QM2ZN4KU2U3Y7S 134 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-L7QM2ZN4KU2U3Y7S 135 134 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2b007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-L7QM2ZN4KU2U3Y7S 136 135 
  [ immediate reg 1 0x2b007039 ]
  [ immediate reg 2 0x00003500 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-47JQSZ5IZC6OSGGT 140 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-47JQSZ5IZC6OSGGT 141 140 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2a007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-47JQSZ5IZC6OSGGT 142 141 
  [ immediate reg 1 0x2a007039 ]
  [ immediate reg 2 0x00003500 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-SLRAZLUBLWQJXVD6 147 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-SLRAZLUBLWQJXVD6 148 147 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2b007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-SLRAZLUBLWQJXVD6 149 148 
  [ immediate reg 1 0x2b007039 ]
  [ immediate reg 2 0x00003500 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-MDXSOI4QEYHXQ5TE 153 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-MDXSOI4QEYHXQ5TE 154 153 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2a007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-MDXSOI4QEYHXQ5TE 155 154 
  [ immediate reg 1 0x2a007039 ]
  [ immediate reg 2 0x0000c123 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-MQDIJAQHMGQYQDQC 160 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-MQDIJAQHMGQYQDQC 161 160 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x2b007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-MQDIJAQHMGQYQDQC 162 161 
  [ immediate reg 1 0x2b007039 ]
  [ immediate reg 2 0x0000c123 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-23NTSA2UXPPQIPK4 166 
  [ counter pkts 0 bytes 0 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-23NTSA2UXPPQIPK4 167 166 
  [ payload load 4b @ network header + 12 => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0xffffffff ) ^ 0x00000000 ]
  [ cmp eq reg 1 0x35007039 ]
  [ immediate reg 1 0x00004000 ]
  [ meta set mark with reg 1 ]
  userdata = { 
ip kube-nfproxy-v4 k8s-nfproxy-sep-23NTSA2UXPPQIPK4 168 167 
  [ immediate reg 1 0x35007039 ]
  [ immediate reg 2 0x00005322 ]
  [ nat dnat ip addr_min reg 1 addr_max reg 1 proto_min reg 2 proto_max reg 2 flags 16]
  userdata = { 
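!
! Note on reading these dumps: the register constants are host-order
! prints of network-order bytes. For example, 0x2b007039 is the source
! address 57.112.0.43 and 0x00003500 is port 53. A small C sketch
! (assuming a little-endian host such as x86_64) decodes them:
!
	#include <arpa/inet.h>
	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		/* 0x2b007039 as printed by nft --debug=netlink: on a
		 * little-endian host the in-memory bytes are 39 70 00 2b,
		 * which read in network order is 57.112.0.43. */
		struct in_addr a = { .s_addr = 0x2b007039 };
		printf("%s\n", inet_ntoa(a));		/* 57.112.0.43 */

		/* Likewise reg 2's 0x00003500 holds network-order port
		 * bytes 00 35, i.e. port 53. */
		printf("%u\n", (unsigned int)ntohs(0x3500));	/* 53 */
		return 0;
	}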
table inet filter {
        chain input {
                type filter hook input priority filter; policy accept;
        }

        chain forward {
                type filter hook forward priority filter; policy accept;
        }

        chain output {
                type filter hook output priority filter; policy accept;
        }
}
table ip kube-nfproxy-v4 {
        map no-endpoints {
                type inet_proto . ipv4_addr . inet_service : verdict
        }

        map do-mark-masq {
                type inet_proto . ipv4_addr . inet_service : verdict
                elements = { tcp . 57.128.0.1 . 443 : jump k8s-nat-do-mark-masq,
                             tcp . 57.128.0.10 . 53 : jump k8s-nat-do-mark-masq,
                             tcp . 57.128.0.10 . 9153 : jump k8s-nat-do-mark-masq,
                             tcp . 57.139.80.125 . 8081 : jump k8s-nat-do-mark-masq,
                             tcp . 57.141.10.218 . 443 : jump k8s-nat-do-mark-masq,
                             tcp . 57.141.53.140 . 808 : jump k8s-nat-do-mark-masq,
                             tcp . 192.168.80.104 . 808 : jump k8s-nat-do-mark-masq,
                             udp . 57.128.0.10 . 53 : jump k8s-nat-do-mark-masq,
                             udp . 57.141.53.140 . 809 : jump k8s-nat-do-mark-masq,
                             udp . 192.168.80.104 . 809 : jump k8s-nat-do-mark-masq }
        }

        map cluster-ip {
                type inet_proto . ipv4_addr . inet_service : verdict
                elements = { tcp . 57.128.0.1 . 443 : jump k8s-nfproxy-svc-Z2V2H34MNX3I6O2G,
                             tcp . 57.128.0.10 . 53 : jump k8s-nfproxy-svc-BKEZZE5BBBAFLJMD,
                             tcp . 57.128.0.10 . 9153 : jump k8s-nfproxy-svc-XZFCNG333PM4X5VI,
                             tcp . 57.139.80.125 . 8081 : jump k8s-nfproxy-svc-ALEQQYFAJOE576GL,
                             tcp . 57.141.10.218 . 443 : jump k8s-nfproxy-svc-ZQKXCYOBISQCSH6Q,
                             tcp . 57.141.53.140 . 808 : jump k8s-nfproxy-svc-M53CN2XYVUHRQ7UB,
                             udp . 57.128.0.10 . 53 : jump k8s-nfproxy-svc-MLOFX2HRWDMEIJ2C,
                             udp . 57.141.53.140 . 809 : jump k8s-nfproxy-svc-PL4AZP3AKMRCVEEV }
        }

        map external-ip {
                type inet_proto . ipv4_addr . inet_service : verdict
                elements = { tcp . 192.168.80.104 . 808 : jump k8s-nfproxy-svc-M53CN2XYVUHRQ7UB,
                             udp . 192.168.80.104 . 809 : jump k8s-nfproxy-svc-PL4AZP3AKMRCVEEV }
        }

        map loadbalancer-ip {
                type inet_proto . ipv4_addr . inet_service : verdict
        }

        map nodeports {
                type inet_proto . inet_service : verdict
                elements = { tcp . 30283 : jump k8s-nfproxy-svc-ALEQQYFAJOE576GL }
        }

        chain filter-input {
                type filter hook input priority filter; policy accept;
                ct state new jump k8s-filter-services comment "	jump k8s-filter-firewall comment "}

        chain filter-output {
                type filter hook output priority filter; policy accept;
                ct state new jump k8s-filter-services
                jump k8s-filter-firewall comment "}

        chain filter-forward {
                type filter hook forward priority filter; policy accept;
                jump k8s-filter-forward
                ct state new jump k8s-filter-services comment "}

        chain k8s-filter-firewall {
                meta mark 0x00008000 drop
        }

        chain k8s-filter-services {
                ip protocol . ip daddr . @th,16,16 vmap @no-endpoints
        }

        chain k8s-filter-forward {
                ct state invalid drop
                meta mark 0x00004000 accept comment "	ip saddr 57.112.0.0/12 ct state established,related accept
                ip daddr 57.112.0.0/12 ct state established,related accept
        }

        chain k8s-filter-do-reject {
                reject with icmp type host-unreachable
        }

        chain nat-preroutin {
                type nat hook prerouting priority filter; policy accept;
                jump k8s-nat-services
        }

        chain nat-output {
                type nat hook output priority filter; policy accept;
                jump k8s-nat-services
        }

        chain nat-postrouting {
                type nat hook postrouting priority filter; policy accept;
                jump k8s-nat-postrouting comment "}

        chain k8s-nat-mark-drop {
                meta mark set 0x00008000
        }

        chain k8s-nat-do-mark-masq {
                meta mark set 0x00004000 return
        }

        chain k8s-nat-mark-masq {
                ip protocol . ip daddr . @th,16,16 vmap @do-mark-masq
                return comment ""
        }

        chain k8s-nat-services {
                ip saddr != 57.112.0.0/12 jump k8s-nat-mark-masq
                ip protocol . ip daddr . @th,16,16 vmap @cluster-ip comment "	ip protocol . ip daddr . @th,16,16 vmap @external-ip
                ip protocol . ip daddr . @th,16,16 vmap @loadbalancer-ip
                fib daddr type local jump k8s-nat-nodeports comment "2"
        }

        chain k8s-nat-nodeports {
                ip protocol . @th,16,16 vmap @nodeports comment ""
        }

        chain k8s-nat-postrouting {
                meta mark 0x00004000 masquerade random,persistent comment ""
        }

        chain k8s-nfproxy-svc-Z2V2H34MNX3I6O2G {
                numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-WTQR35QT3M6PVG5X, 1 : goto k8s-nfproxy-sep-WTQR35QT3M6PVG5X }
                counter packets 1 bytes 60 comment ""
        }

        chain k8s-nfproxy-fw-Z2V2H34MNX3I6O2G {
        }

        chain k8s-nfproxy-xlb-Z2V2H34MNX3I6O2G {
        }

        chain k8s-nfproxy-sep-WTQR35QT3M6PVG5X {
                counter packets 3 bytes 180 comment ""
                ip saddr 192.168.80.104 meta mark set 0x00004000 comment ""
                dnat to 192.168.80.104:6443 fully-random
                counter packets 0 bytes 0
                ip saddr 192.168.80.104 meta mark set 0x00004000 comment ""
                dnat to 192.168.80.104:6443 fully-random comment ""
        }

        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
                numgen inc mod 3 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 1 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5, 2 : goto k8s-nfproxy-sep-23NTSA2UXPPQIPK4 }
                counter packets 4 bytes 240 comment ""
        }

        chain k8s-nfproxy-fw-M53CN2XYVUHRQ7UB {
        }

        chain k8s-nfproxy-xlb-M53CN2XYVUHRQ7UB {
        }

        chain k8s-nfproxy-svc-PL4AZP3AKMRCVEEV {
                numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-UOK7V3LF34NNNXJK, 1 : goto k8s-nfproxy-sep-AB4FZJCEEYJGUR7G }
                counter packets 0 bytes 0 comment ""
        }

        chain k8s-nfproxy-fw-PL4AZP3AKMRCVEEV {
        }

        chain k8s-nfproxy-xlb-PL4AZP3AKMRCVEEV {
        }

        chain k8s-nfproxy-sep-F3FYSUNEU5GRF2PR {
                counter packets 156 bytes 9360 comment ""
                ip saddr 57.112.0.39 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.39:8081 fully-random
        }

        chain k8s-nfproxy-sep-TMVEFT7EX55F4T62 {
                counter packets 3 bytes 180 comment ""
                ip saddr 57.112.0.41 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.41:8080 fully-random
        }

        chain k8s-nfproxy-sep-UOK7V3LF34NNNXJK {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.41 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.41:8090 fully-random
        }

        chain k8s-nfproxy-svc-ZQKXCYOBISQCSH6Q {
                numgen inc mod 1 vmap { 0 : goto k8s-nfproxy-sep-5CXJFIVYWUOH4QP5 } comment ""
                counter packets 0 bytes 0 comment ""
        }

        chain k8s-nfproxy-fw-ZQKXCYOBISQCSH6Q {
        }

        chain k8s-nfproxy-xlb-ZQKXCYOBISQCSH6Q {
        }

        chain k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.52 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.52:8989 fully-random
        }

        chain k8s-nfproxy-svc-MLOFX2HRWDMEIJ2C {
                numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-ZLBUKWY4CZE4VBQ6, 1 : goto k8s-nfproxy-sep-L7QM2ZN4KU2U3Y7S }
                counter packets 1597 bytes 126466 comment ""
        }

        chain k8s-nfproxy-fw-MLOFX2HRWDMEIJ2C {
        }

        chain k8s-nfproxy-xlb-MLOFX2HRWDMEIJ2C {
        }

        chain k8s-nfproxy-sep-AB4FZJCEEYJGUR7G {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.52 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.52:8998 fully-random
        }

        chain k8s-nfproxy-svc-BKEZZE5BBBAFLJMD {
                numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-47JQSZ5IZC6OSGGT, 1 : goto k8s-nfproxy-sep-SLRAZLUBLWQJXVD6 }
                counter packets 0 bytes 0 comment ""
        }

        chain k8s-nfproxy-fw-BKEZZE5BBBAFLJMD {
        }

        chain k8s-nfproxy-xlb-BKEZZE5BBBAFLJMD {
        }

        chain k8s-nfproxy-svc-XZFCNG333PM4X5VI {
                numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-MDXSOI4QEYHXQ5TE, 1 : goto k8s-nfproxy-sep-MQDIJAQHMGQYQDQC }
                counter packets 0 bytes 0 comment ""
        }

        chain k8s-nfproxy-fw-XZFCNG333PM4X5VI {
        }

        chain k8s-nfproxy-xlb-XZFCNG333PM4X5VI {
        }

        chain k8s-nfproxy-svc-ALEQQYFAJOE576GL {
                numgen inc mod 1 vmap { 0 : goto k8s-nfproxy-sep-F3FYSUNEU5GRF2PR } comment ""
                counter packets 0 bytes 0 comment ""
        }

        chain k8s-nfproxy-fw-ALEQQYFAJOE576GL {
        }

        chain k8s-nfproxy-xlb-ALEQQYFAJOE576GL {
        }

        chain k8s-nfproxy-sep-5CXJFIVYWUOH4QP5 {
                counter packets 1 bytes 60 comment ""
                ip saddr 57.112.0.47 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.47:443 fully-random
        }

        chain k8s-nfproxy-sep-ZLBUKWY4CZE4VBQ6 {
                counter packets 1597 bytes 127401 comment ""
                ip saddr 57.112.0.42 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.42:53 fully-random
        }

        chain k8s-nfproxy-sep-L7QM2ZN4KU2U3Y7S {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.43 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.43:53 fully-random
        }

        chain k8s-nfproxy-sep-47JQSZ5IZC6OSGGT {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.42 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.42:53 fully-random
        }

        chain k8s-nfproxy-sep-SLRAZLUBLWQJXVD6 {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.43 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.43:53 fully-random
        }

        chain k8s-nfproxy-sep-MDXSOI4QEYHXQ5TE {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.42 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.42:9153 fully-random
        }

        chain k8s-nfproxy-sep-MQDIJAQHMGQYQDQC {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.43 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.43:9153 fully-random
        }

        chain k8s-nfproxy-sep-23NTSA2UXPPQIPK4 {
                counter packets 0 bytes 0 comment ""
                ip saddr 57.112.0.53 meta mark set 0x00004000 comment ""
                dnat to 57.112.0.53:8787 fully-random
        }
}
table ip6 kube-nfproxy-v6 {
}
sbezverk@kube-4:~$ 

On 2020-01-20, 5:00 PM, "Florian Westphal" <fw@strlen.de> wrote:

    sbezverk <sbezverk@gmail.com> wrote:
    > Numgen has GOTO directive and not Jump (Phil asked to change it), I thought it means after hitting any chains in numgen the processing will go back to service chain, no?
    > 
    > It is Ubuntu 18.04
    > 
    > sbezverk@kube-4:~$ uname -a
    > Linux kube-4 5.4.10-050410-generic #202001091038 SMP Thu Jan 9 10:41:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    > sbezverk@kube-4:~$ sudo nft --version
    > nftables v0.9.1 (Headless Horseman)
    > sbezverk@kube-4:~$
    > 
    > I also want to remind you that I do NOT use nft cli to program rules, I use nft cli just to see resulting rules.
    
    In that case, please include "nft --debug=netlink list ruleset".
    
    It would also be good to check if things work when you add it via nft
    tool.
    
    >     > 
    >     >         chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
    >     >                 numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 1 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 }
    >     >                 counter packets 1 bytes 60 comment ""
    >     >         }
    
    Just to clarify, the "goto" means that the "counter" should NEVER
    increment here, because the nft interpreter returns to the chain that
    had "jump k8s-nfproxy-svc-M53CN2XYVUHRQ7UB".
    
    jump and goto do the same thing except that goto doesn't record the
    location/chain to return to.
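
    A minimal illustrative sketch of the difference (hypothetical chain
    names, not taken from any ruleset in this thread):

    	chain caller {
    		jump balanced
    		counter	# runs: evaluation resumes here once be1/be2 end
    	}
    	chain balanced {
    		numgen inc mod 2 vmap { 0 : goto be1, 1 : goto be2 }
    		counter	# never runs: goto does not return to this chain
    	}
    	chain be1 {
    		counter
    	}
    	chain be2 {
    		counter
    	}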
    



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: load balancing between two chains
  2020-01-20 22:00             ` Florian Westphal
  2020-01-20 22:07               ` sbezverk
@ 2020-01-20 22:12               ` Florian Westphal
  2020-01-20 22:50                 ` sbezverk
  2020-01-21  4:18                 ` sbezverk
  1 sibling, 2 replies; 13+ messages in thread
From: Florian Westphal @ 2020-01-20 22:12 UTC (permalink / raw)
  To: Florian Westphal; +Cc: sbezverk, Phil Sutter, netfilter-devel

Florian Westphal <fw@strlen.de> wrote:
> sbezverk <sbezverk@gmail.com> wrote:
> > Numgen has GOTO directive and not Jump (Phil asked to change it), I thought it means after hitting any chains in numgen the processing will go back to service chain, no?
> > 
> > It is Ubuntu 18.04
> > 
> > sbezverk@kube-4:~$ uname -a
> > Linux kube-4 5.4.10-050410-generic #202001091038 SMP Thu Jan 9 10:41:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> > sbezverk@kube-4:~$ sudo nft --version
> > nftables v0.9.1 (Headless Horseman)
> > sbezverk@kube-4:~$
> > 
> > I also want to remind you that I do NOT use nft cli to program rules, I use nft cli just to see resulting rules.
> 
> In that case, please include "nft --debug=netlink list ruleset".
> 
> It would also be good to check if things work when you add it via nft
> tool.

Oh, and for the fun of it, you could also try this:

chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
	numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-I7XZOUOVPIQW4IXA, 1 : goto k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ, 16777216 : goto endianbug }
	counter packets 0 bytes 0
}

chain endianbug {
	counter packets 0 bytes 0
}
...

numgen generates a 32bit number in host byte order, so nft internally
converts the keys accordingly (16777216 is htonl(1)).
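
A one-line C check makes that number concrete (a minimal sketch; the
printed value assumes a little-endian host such as x86_64):

	#include <arpa/inet.h>
	#include <stdio.h>

	int main(void)
	{
		/* On little-endian, htonl() swaps the bytes:
		 * 0x00000001 -> 0x01000000 == 16777216. That is why
		 * the vmap key "1", stored in network byte order,
		 * lines up with numgen's host-order value this way. */
		printf("htonl(1) = %u\n", (unsigned int)htonl(1));
		return 0;
	}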

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: load balancing between two chains
  2020-01-20 22:12               ` Florian Westphal
@ 2020-01-20 22:50                 ` sbezverk
  2020-01-21  4:18                 ` sbezverk
  1 sibling, 0 replies; 13+ messages in thread
From: sbezverk @ 2020-01-20 22:50 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel

It started working?!?!?!?!

sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive pod1 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod2 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod3 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive pod1 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod2 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod3 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive pod1 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod2 :)

        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB { # handle 60
                numgen inc mod 3 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 1 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5, 2 : goto k8s-nfproxy-sep-23NTSA2UXPPQIPK4, 16777216 : goto endianbug } # handle 174
                counter packets 4 bytes 240 comment "" # handle 76
        }


        chain endianbug { # handle 171
                counter packets 0 bytes 0 # handle 172
        }

Why is that?

Thank you
Serguei

On 2020-01-20, 5:12 PM, "Florian Westphal" <fw@strlen.de> wrote:

    Florian Westphal <fw@strlen.de> wrote:
    > sbezverk <sbezverk@gmail.com> wrote:
    > > Numgen has GOTO directive and not Jump (Phil asked to change it), I thought it means after hitting any chains in numgen the processing will go back to service chain, no?
    > > 
    > > It is Ubuntu 18.04
    > > 
    > > sbezverk@kube-4:~$ uname -a
    > > Linux kube-4 5.4.10-050410-generic #202001091038 SMP Thu Jan 9 10:41:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    > > sbezverk@kube-4:~$ sudo nft --version
    > > nftables v0.9.1 (Headless Horseman)
    > > sbezverk@kube-4:~$
    > > 
    > > I also want to remind you that I do NOT use nft cli to program rules, I use nft cli just to see resulting rules.
    > 
    > In that case, please include "nft --debug=netlink list ruleset".
    > 
    > It would also be good to check if things work when you add it via nft
    > tool.
    
    Oh, and for the fun of it, you could also try this:
    
    chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
    	numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-I7XZOUOVPIQW4IXA, 1 : goto k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ, 16777216 : goto endianbug }
    	counter packets 0 bytes 0
    }

    chain endianbug {
    	counter packets 0 bytes 0
    }
    ...
    
    numgen generates a 32bit number in host byte order, so nft internally
    converts the keys accordingly (16777216 is htonl(1)).
    



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: load balancing between two chains
  2020-01-20 22:12               ` Florian Westphal
  2020-01-20 22:50                 ` sbezverk
@ 2020-01-21  4:18                 ` sbezverk
  2020-01-21  5:24                   ` Florian Westphal
  1 sibling, 1 reply; 13+ messages in thread
From: sbezverk @ 2020-01-21  4:18 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel

Hello,

After changing the code to write set element keys as non-big-endian (i.e. host byte order), load balancing started working. The side effect, though, is that the set now shows large numbers for the element keys.

sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive pod1 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod3 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive from pod2 :)
sbezverk@kube-4:~$ curl http://57.141.53.140:808
Still alive pod1 :)

        chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB { # handle 60
                numgen inc mod 3 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 16777216 : goto k8s-nfproxy-sep-23NTSA2UXPPQIPK4, 33554432 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 } # handle 155
                counter packets 0 bytes 0 comment "" # handle 136
        }

Let me know if you plan to fix it eventually.

Thank you very much for your help
Serguei

On 2020-01-20, 5:12 PM, "Florian Westphal" <fw@strlen.de> wrote:

    Florian Westphal <fw@strlen.de> wrote:
    > sbezverk <sbezverk@gmail.com> wrote:
    > > Numgen has GOTO directive and not Jump (Phil asked to change it), I thought it means after hitting any chains in numgen the processing will go back to service chain, no?
    > > 
    > > It is Ubuntu 18.04
    > > 
    > > sbezverk@kube-4:~$ uname -a
    > > Linux kube-4 5.4.10-050410-generic #202001091038 SMP Thu Jan 9 10:41:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    > > sbezverk@kube-4:~$ sudo nft --version
    > > nftables v0.9.1 (Headless Horseman)
    > > sbezverk@kube-4:~$
    > > 
    > > I also want to remind you that I do NOT use nft cli to program rules, I use nft cli just to see resulting rules.
    > 
    > In that case, please include "nft --debug=netlink list ruleset".
    > 
    > It would also be good to check if things work when you add it via nft
    > tool.
    
    Oh, and for the fun of it, you could also try this:
    
    chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB {
    	numgen inc mod 2 vmap { 0 : goto k8s-nfproxy-sep-I7XZOUOVPIQW4IXA, 1 : goto k8s-nfproxy-sep-ZNSGEJWUBCC5QYMQ, 16777216 : goto endianbug }
    	counter packets 0 bytes 0
    }

    chain endianbug {
    	counter packets 0 bytes 0
    }
    ...
    
    numgen generates a 32bit number in host byte order, so nft internally
    converts the keys accordingly (16777216 is htonl(1)).
    



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: load balancing between two chains
  2020-01-21  4:18                 ` sbezverk
@ 2020-01-21  5:24                   ` Florian Westphal
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2020-01-21  5:24 UTC (permalink / raw)
  To: sbezverk; +Cc: Florian Westphal, Phil Sutter, netfilter-devel

sbezverk <sbezverk@gmail.com> wrote:
> Hello,
> 
> After changing the code to write set element keys as non-big-endian (i.e. host byte order), load balancing started working. The side effect, though, is that the set now shows large numbers for the element keys.
> 
> sbezverk@kube-4:~$ curl http://57.141.53.140:808
> Still alive pod1 :)
> sbezverk@kube-4:~$ curl http://57.141.53.140:808
> Still alive from pod3 :)
> sbezverk@kube-4:~$ curl http://57.141.53.140:808
> Still alive from pod2 :)
> sbezverk@kube-4:~$ curl http://57.141.53.140:808
> Still alive pod1 :)
> 
>         chain k8s-nfproxy-svc-M53CN2XYVUHRQ7UB { # handle 60
>                 numgen inc mod 3 vmap { 0 : goto k8s-nfproxy-sep-TMVEFT7EX55F4T62, 16777216 : goto k8s-nfproxy-sep-23NTSA2UXPPQIPK4, 33554432 : goto k8s-nfproxy-sep-GTJ7BFLUOQRCGMD5 } # handle 155
>                 counter packets 0 bytes 0 comment "" # handle 136
>         }
> 
> Let me know if you plan to fix it eventually.

This is because the nft tool stores the key endianness in metadata, so
it knows whether it needs to byteswap or not.

See mnl_nft_set_add() in src/mnl.c in the nftables source code. Look for
NFTNL_UDATA_SET_KEYBYTEORDER.  If your library sets this to 1
(BYTEORDER_HOST_ENDIAN), nft will display the correct values.
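
For illustration, a rough libnftnl-based sketch of setting that TLV from
a proxy's own code. The udata helpers are public libnftnl API, but
NFTNL_UDATA_SET_KEYBYTEORDER and BYTEORDER_HOST_ENDIAN are nftables'
private definitions, so the numeric values below are assumptions to be
checked against src/mnl.c:

	#include <libnftnl/set.h>
	#include <libnftnl/udata.h>

	/* Assumed values, mirroring nftables' private headers. */
	#define NFTNL_UDATA_SET_KEYBYTEORDER	0
	#define BYTEORDER_HOST_ENDIAN		1

	static void set_key_byteorder(struct nftnl_set *s)
	{
		struct nftnl_udata_buf *udbuf;

		udbuf = nftnl_udata_buf_alloc(256);
		if (!udbuf)
			return;
		/* Record that this set's key is host-endian so that
		 * "nft list" byteswaps the keys for display. */
		nftnl_udata_put_u32(udbuf, NFTNL_UDATA_SET_KEYBYTEORDER,
				    BYTEORDER_HOST_ENDIAN);
		nftnl_set_set_data(s, NFTNL_SET_USERDATA,
				   nftnl_udata_buf_data(udbuf),
				   nftnl_udata_buf_len(udbuf));
		nftnl_udata_buf_free(udbuf);
	}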



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-01-21  5:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2020-01-20  2:46 load balancing between two chains sbezverk
2020-01-20 11:23 ` Phil Sutter
2020-01-20 16:31   ` sbezverk
2020-01-20 17:06     ` Florian Westphal
2020-01-20 17:42       ` sbezverk
2020-01-20 21:39         ` Florian Westphal
2020-01-20 21:54           ` sbezverk
2020-01-20 22:00             ` Florian Westphal
2020-01-20 22:07               ` sbezverk
2020-01-20 22:12               ` Florian Westphal
2020-01-20 22:50                 ` sbezverk
2020-01-21  4:18                 ` sbezverk
2020-01-21  5:24                   ` Florian Westphal
