Netdev Archive on lore.kernel.org
 help / color / Atom feed
From: Michal Kubecek <mkubecek@suse.cz>
To: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org, Firo Yang <firo.yang@suse.com>
Subject: possible race in __inet_lookup_established()
Date: Wed, 20 Nov 2019 09:39:19 +0100
Message-ID: <20191120083919.GH27852@unicorn.suse.cz> (raw)

Hello Eric,

we are investigating a crash in socket lookup in a distribution kernel
based on v4.12 but the possible problem we found seems to also apply to
current mainline (or net) code.

The common pattern is:

- the crash always happens in __inet_lookup_established() in

	sk_nulls_for_each_rcu(sk, node, &head->chain) {
		if (sk->sk_hash != hash)     <-----------------
			continue;

  as sk is an invalid pointer; in particular, &sk->sk_nulls_node is null
  so dereferencing sk->sk_hash faults

- the reason is that previous sk value pointed to a listening socket
  rather than an established one; as listening socket uses sk_node, end
  of the chain is marked by a null pointer which is not detected as
  a chain end by sk_nulls_for_each_rcu()

- there is no socket matching skb which is a TCP pure ACK having
  127.0.0.1 as both source and destination

- the chain pointed to by head variable is empty

Firo Yang came with the theory that this could be a race between socket
lookup and freing the socket and replacing it with a listening one:

1. CPU A gets a pointer to an established socket as sk in the
sk_nulls_for_each_rcu() loop in __inet_lookup_established() but does not
thake a reference to it.

2. CPU B frees the socket

3. Slab object pointed to by sk is reused for a new listening socket.
This socket has null sk->sk_node->next which uses the same spot as
sk->sk_nulls_node->next

4. CPU A tests sk->sk_nulls_node->next with is_a_nulls() (false) and
follows the pointer, resulting in a fault dereferencing sk->sk_hash.

Unless we missed something, there is no protection against established
socket being freed and replaced by a new listening one while
__inet_lookup_established() has a pointer to it. The RCU loop only
prevents the slab object being reused for a different slab cache or
something completely different but as established and listening sockets
share the same slab cache, it does not protect us from switching from
established to listening.

As far as I can say, this kind of race could have happened for quite
long but before your commit 3b24d854cb35 ("tcp/dccp: do not touch
listener sk_refcnt under synflood"), the worst that could happen would
be switching to a chain in listener lookup table, following it to its
end and then (most likely) restarting the lookup or failing. Now that
established and listening sockets use different list types, replacing
one with the other can be deadly.

Do you agree that this race is possible or is there something we missed
that would prevent it?

Michal

             reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20  8:39 Michal Kubecek [this message]
     [not found] ` <CANn89iJYXh7AwK8_Aiz3wXqugG0icPNW6OPsPxwOvpH90kr+Ew@mail.gmail.com>
2019-11-20 18:10   ` Michal Kubecek
2019-11-20 19:13     ` Eric Dumazet
2019-11-20 19:52       ` Michal Kubecek
2019-11-20 20:04         ` Eric Dumazet
2019-11-20 20:49           ` Michal Kubecek
2019-11-20 20:57             ` Eric Dumazet
2019-11-20 21:13               ` Michal Kubecek
2019-11-20 21:54                 ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191120083919.GH27852@unicorn.suse.cz \
    --to=mkubecek@suse.cz \
    --cc=edumazet@google.com \
    --cc=firo.yang@suse.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git