* UDP IPVS: Incorrect conntrack entry in reply tuple
@ 2022-02-16  8:38 Vivek Thrivikraman
From: Vivek Thrivikraman @ 2022-02-16  8:38 UTC (permalink / raw)
  To: netfilter

Hi

We run Kubernetes with kube-proxy in IPVS mode on a scaled setup: 159 nodes, ~11k pods and ~1500 services. Many pods act as UDP clients that access multiple VIPs. After every node restart, many client UDP requests to multiple VIP:port pairs serving UDP traffic are black-holed because of an invalid conntrack entry. (The UDP servers, i.e. the real servers, are not restarted, since they do not run on the restarted node.)
The client UDP pods keep retrying with the same UDP source port (we reuse the source port). Because the retries are frequent, the incorrect conntrack entry never expires and traffic is never restored. The logs show that the first packet arrived after the IPVS rules were programmed, yet the reply tuple did not contain the real server IP; it was just the reverse of the original direction (i.e. the reply tuple had the VIP as source and the original source as destination).

Invalid conntrack:
[1644409107.142928] [NEW] udp 17 30 src=50.117.79.3 dst=64.64.64.59 sport=42467 dport=8890 [UNREPLIED] src=64.64.64.59 dst=50.117.79.3 sport=8890 dport=42467
The timestamp above translates to Wednesday, February 9, 2022 12:18:27.142 PM GMT.
VIP: 64.64.64.59, UDP client pod: 50.117.79.3
As you can see, the second tuple (reply direction) does not have NATed entries, even though the logs below suggest that the IPVS rules had already been written by that time.
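
To make the symptom easy to check across many entries, here is a minimal Python sketch that parses a conntrack event line and reports whether the reply tuple is only the mirror of the original tuple (no DNAT to a real server). The helper names `parse_tuples` and `reply_is_plain_mirror` are made up for illustration, and the regex assumes the `key=value` layout seen in the line above.

```python
import re

# Matches the address/port fields of a conntrack event line.
FIELD = re.compile(r"(src|dst|sport|dport)=(\S+)")

def parse_tuples(line):
    """Return (original, reply) tuples as dicts from one conntrack event line."""
    fields = FIELD.findall(line)
    if len(fields) != 8:
        raise ValueError("expected two 4-field tuples, got %d fields" % len(fields))
    return dict(fields[:4]), dict(fields[4:])

def reply_is_plain_mirror(orig, reply):
    """True when the reply tuple is just the original reversed, i.e. no NAT."""
    return (reply["src"] == orig["dst"] and reply["dst"] == orig["src"]
            and reply["sport"] == orig["dport"] and reply["dport"] == orig["sport"])

# The entry quoted above.
line = ("[1644409107.142928] [NEW] udp 17 30 src=50.117.79.3 dst=64.64.64.59 "
        "sport=42467 dport=8890 [UNREPLIED] src=64.64.64.59 dst=50.117.79.3 "
        "sport=8890 dport=42467")

orig, reply = parse_tuples(line)
print("reply tuple is un-NATed mirror:", reply_is_plain_mirror(orig, reply))
```

For a correctly NATed flow the reply tuple would carry a real server address as its source, and the check would return False.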

Logs that clearly show the IPVS rules were added before the first packet (the conntrack entry above):
I0209 12:17:52.516740 1 endpointslicecache.go:358] "Setting endpoints for service port name" portName="bat-t2/cnf-cp-sfs-t2-46-net-udp:tapp-udp" endpoints=[192.168.11.237:8890 192.168.110.139:8890 192.168.135.77:8890 192.168.183.209:8890 192.168.200.199:8890 192.168.214.234:8890 192.168.215.32:8890 192.168.219.117:8890 192.168.219.222:8890 192.168.220.125:8890 192.168.247.165:8890 192.168.248.97:8890 192.168.250.78:8890 192.168.255.76:8890 192.168.37.61:8890 192.168.55.131:8890 192.168.57.24:8890 192.168.59.101:8890 192.168.7.62:8890 192.168.88.125:8890]
I0209 12:17:56.325097 1 proxier.go:1972] "Adding new service" serviceName="bat-t2/cnf-cp-sfs-t2-46-net-udp:tapp-udp" virtualServer="64.64.64.59:8890/UDP"
I0209 12:17:56.325126 1 proxier.go:1996] "Bind address" address="64.64.64.59"
I0209 12:18:01.339193 1 ipset.go:176] "Successfully added ip set entry to ip set" ipSetEntry="64.64.64.59,udp:8890" ipSet="KUBE-LOAD-BALANCER"
I0209 12:18:11.338453 1 proxier.go:1008] "syncProxyRules complete" elapsed="18.876838575s"

So the first packet arrived at 12:18:27.142, but the rules for this service and its endpoints had been programmed well before that, at 12:18:11.338453. There were no other events related to this service/endpoint until the first packet arrived.
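
The gap between the two events can be computed directly from the timestamps quoted above. This is a small Python sketch; it assumes the kube-proxy log timestamps are in UTC and from 2022, which the log lines themselves do not state.

```python
from datetime import datetime, timezone

# Epoch timestamp taken from the conntrack [NEW] event.
first_packet = datetime.fromtimestamp(1644409107.142928, tz=timezone.utc)

# "syncProxyRules complete" log line: I0209 12:18:11.338453
# The log prefix carries no year or zone; 2022 and UTC are assumptions.
sync_done = datetime.strptime(
    "2022 0209 12:18:11.338453", "%Y %m%d %H:%M:%S.%f"
).replace(tzinfo=timezone.utc)

gap = (first_packet - sync_done).total_seconds()
print("first packet (UTC):", first_packet.isoformat())
print("seconds after syncProxyRules completed: %.3f" % gap)
```

Under those assumptions the first packet arrives roughly 15.8 seconds after the sync completed. If the node logs in a different time zone, the comparison has to be adjusted accordingly.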

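Adding the output of the ipvsadm commands. The connection entry below shows that IPVS has one hit (1 InActConn) for the packet in question (columns: protocol, expiry, state, source, virtual, destination); the table after it lists the real servers for the VIP (columns: address, forwarding method, weight, ActiveConn, InActConn):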
UDP 05:00  UDP         50.117.79.3:42467  64.64.64.59:8890   192.168.88.125:8890

UDP  64.64.64.59:8890 rr
  -> 192.168.7.62:8890            Masq    1      0          1         
  -> 192.168.11.237:8890          Masq    1      0          0         
  -> 192.168.37.61:8890           Masq    1      0          1         
  -> 192.168.55.131:8890          Masq    1      0          1         
  -> 192.168.57.24:8890           Masq    1      0          1         
  -> 192.168.59.101:8890          Masq    1      0          1         
  -> 192.168.88.125:8890          Masq    1      0          1         
  -> 192.168.110.139:8890         Masq    1      0          0         
  -> 192.168.135.77:8890          Masq    1      0          0         
  -> 192.168.183.209:8890         Masq    1      0          0         
  -> 192.168.200.199:8890         Masq    1      0          0         
  -> 192.168.214.234:8890         Masq    1      0          0         
  -> 192.168.215.32:8890          Masq    1      0          0         
  -> 192.168.219.117:8890         Masq    1      0          0         
  -> 192.168.219.222:8890         Masq    1      0          0         
  -> 192.168.220.125:8890         Masq    1      0          0         
  -> 192.168.247.165:8890         Masq    1      0          0         
  -> 192.168.248.97:8890          Masq    1      0          0         
  -> 192.168.250.78:8890          Masq    1      0          0         
  -> 192.168.255.76:8890          Masq    1      0          0        
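
The two outputs can be cross-checked with a short Python sketch: it confirms that the destination of the IPVS connection entry is one of the listed real servers and reads that server's InActConn count. The function name `parse_real_servers` is hypothetical, and the sample uses only three of the rows above to keep it short.

```python
def parse_real_servers(text):
    """Parse '-> addr:port Forward Weight ActiveConn InActConn' rows into dicts."""
    servers = []
    for row in text.splitlines():
        parts = row.split()
        # Skip anything that is not a real-server row (e.g. a header line).
        if len(parts) != 6 or parts[0] != "->" or not parts[3].isdigit():
            continue
        servers.append({
            "addr": parts[1],
            "forward": parts[2],
            "weight": int(parts[3]),
            "active": int(parts[4]),
            "inactive": int(parts[5]),
        })
    return servers

# Connection entry and an excerpt of the real-server table from this mail.
conn_entry = "UDP 05:00  UDP         50.117.79.3:42467  64.64.64.59:8890   192.168.88.125:8890"
table_excerpt = """\
UDP  64.64.64.59:8890 rr
  -> 192.168.7.62:8890            Masq    1      0          1
  -> 192.168.11.237:8890          Masq    1      0          0
  -> 192.168.88.125:8890          Masq    1      0          1
"""

destination = conn_entry.split()[5]
servers = parse_real_servers(table_excerpt)
match = [s for s in servers if s["addr"] == destination]

print("IPVS destination:", destination)
print("listed as real server:", bool(match))
if match:
    print("InActConn on that server:", match[0]["inactive"])
```

This highlights the inconsistency: IPVS did schedule the flow to 192.168.88.125:8890, while the conntrack reply tuple still shows the VIP.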

Can anyone please help identify a specific scenario in which the reply tuple is not populated with the real server IP? Alternatively, could someone suggest what other logs or data we should collect to move forward?

Thanks,
Vivek
