linux-kernel.vger.kernel.org archive mirror
From: Martin KaFai Lau <kafai@fb.com>
To: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Cc: <ast@kernel.org>, <benh@amazon.com>, <bpf@vger.kernel.org>,
	<daniel@iogearbox.net>, <davem@davemloft.net>,
	<edumazet@google.com>, <eric.dumazet@gmail.com>,
	<kuba@kernel.org>, <kuni1840@gmail.com>,
	<linux-kernel@vger.kernel.org>, <netdev@vger.kernel.org>
Subject: Re: [PATCH v1 bpf-next 03/11] tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.
Date: Thu, 10 Dec 2020 11:33:40 -0800
Message-ID: <20201210193340.x6qdykdalhdebxv3@kafai-mbp.dhcp.thefacebook.com>
In-Reply-To: <20201210055810.60068-1-kuniyu@amazon.co.jp>

On Thu, Dec 10, 2020 at 02:58:10PM +0900, Kuniyuki Iwashima wrote:

[ ... ]

> > > I've implemented one-by-one migration only for the accept queue for now.
> > > In addition to the concern about TFO queue,
> > You meant this queue:  queue->fastopenq.rskq_rst_head?
> 
> Yes.
> 
> 
> > Can "req" be passed?
> > I did not look into the locking/races in detail for that, though.
> 
> I think if we rewrite the part that frees TFO requests to use
> reqsk_queue_remove(), as the accept queue path does, we can also migrate
> them.
> 
> In this patchset, when selecting a listener for the accept queue, the TFO
> queue of the same listener is also migrated to another listener in order
> to prevent TFO spoofing attacks.
> 
> If the requests in the accept queue are migrated one by one, I am
> wondering which listener the requests in the TFO queue should be migrated
> to in order to prevent attacks, or whether they should be freed.
> 
> I think users need not know that such requests are kept in the kernel to
> prevent attacks, so passing them to the eBPF prog is confusing. But
> redistributing them randomly, regardless of the user's intention, can make
> some irrelevant listeners unnecessarily drop new TFO requests, so this is
> also bad. Moreover, freeing such requests does not seem good from a
> security standpoint.
The current behavior (during process restart) does not carry over this
security queue either.  Would not carrying it in this patch make things
any less secure than the current behavior during process restart?
Do you need it now, or is it something that can be considered later
without changing the uapi bpf.h?

> > > ---8<---
> > > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > > index a82fd4c912be..d0ddd3cb988b 100644
> > > --- a/net/ipv4/inet_connection_sock.c
> > > +++ b/net/ipv4/inet_connection_sock.c
> > > @@ -1001,6 +1001,29 @@ struct sock *inet_csk_reqsk_queue_add(struct sock *sk,
> > >  }
> > >  EXPORT_SYMBOL(inet_csk_reqsk_queue_add);
> > >  
> > > +static bool inet_csk_reqsk_queue_migrate(struct sock *sk, struct sock *nsk, struct request_sock *req)
> > > +{
> > > +       struct request_sock_queue *queue = &inet_csk(nsk)->icsk_accept_queue;
> > > +       bool migrated = false;
> > > +
> > > +       spin_lock(&queue->rskq_lock);
> > > +       if (likely(nsk->sk_state == TCP_LISTEN)) {
> > > +               migrated = true;
> > > +
> > > +               req->dl_next = NULL;
> > > +               if (queue->rskq_accept_head == NULL)
> > > +                       WRITE_ONCE(queue->rskq_accept_head, req);
> > > +               else
> > > +                       queue->rskq_accept_tail->dl_next = req;
> > > +               queue->rskq_accept_tail = req;
> > > +               sk_acceptq_added(nsk);
> > > +               inet_csk_reqsk_queue_migrated(sk, nsk, req);
> > This needs to first resolve the question raised in patch 5 regarding
> > the update of req->rsk_listener, though.
> 
> In the unhash path, it is also safe to call sock_put() for the old listener.
> 
> In inet_csk_listen_stop(), the sk_refcnt of the listener is >= 1. If the
> listener does not have immature requests, sk_refcnt is 1 and the listener
> is freed in __tcp_close().
> 
>   sock_hold(sk) in __tcp_close()
>   sock_put(sk) in inet_csk_destroy_sock()
>   sock_put(sk) in __tcp_close()
I don't see how this is different from patch 5.
I could be missing something.

Let's continue the discussion on the other thread (patch 5) first.

> 
> 
> > > +       }
> > > +       spin_unlock(&queue->rskq_lock);
> > > +
> > > +       return migrated;
> > > +}
> > > +
> > >  struct sock *inet_csk_complete_hashdance(struct sock *sk, struct sock *child,
> > >                                          struct request_sock *req, bool own_req)
> > >  {
> > > @@ -1023,9 +1046,11 @@ EXPORT_SYMBOL(inet_csk_complete_hashdance);
> > >   */
> > >  void inet_csk_listen_stop(struct sock *sk)
> > >  {
> > > +       struct sock_reuseport *reuseport_cb = rcu_access_pointer(sk->sk_reuseport_cb);
> > >         struct inet_connection_sock *icsk = inet_csk(sk);
> > >         struct request_sock_queue *queue = &icsk->icsk_accept_queue;
> > >         struct request_sock *next, *req;
> > > +       struct sock *nsk;
> > >  
> > >         /* Following specs, it would be better either to send FIN
> > >          * (and enter FIN-WAIT-1, it is normal close)
> > > @@ -1043,8 +1068,19 @@ void inet_csk_listen_stop(struct sock *sk)
> > >                 WARN_ON(sock_owned_by_user(child));
> > >                 sock_hold(child);
> > >  
> > > +               if (reuseport_cb) {
> > > +                       nsk = reuseport_select_migrated_sock(sk, req_to_sk(req)->sk_hash, NULL);
> > > +                       if (nsk) {
> > > +                               if (inet_csk_reqsk_queue_migrate(sk, nsk, req))
> > > +                                       goto unlock_sock;
> > > +                               else
> > > +                                       sock_put(nsk);
> > > +                       }
> > > +               }
> > > +
> > >                 inet_child_forget(sk, req, child);
> > >                 reqsk_put(req);
> > > +unlock_sock:
> > >                 bh_unlock_sock(child);
> > >                 local_bh_enable();
> > >                 sock_put(child);
> > > ---8<---
> > > 
> > > 
> > > > > >   5. lock the accept queue of the new listener
> > > > > >   6. splice requests and increment refcount
> > > > > >   7. unlock
> > > > > > 
> > > > > > Also, I think splicing is better to keep the order of requests. Adding one
> > > > > > by one reverses it.
> > > > > It can keep the order but I think it is orthogonal here.
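
To illustrate the ordering point above, here is a minimal userspace sketch.
It is not kernel code and not part of the patch. It assumes the old
listener's accept queue is drained from the head (as reqsk_queue_remove()
does) and that each request is appended at the new listener's tail, which
mirrors the tail-insert logic in the quoted diff. struct req,
struct accept_queue, queue_add_tail(), queue_remove_head(),
queue_migrate_all() and demo_order_preserved() are hypothetical stand-ins;
locking, refcounting and the TCP_LISTEN check are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for request_sock and request_sock_queue. */
struct req {
	int id;
	struct req *dl_next;
};

struct accept_queue {
	struct req *head;
	struct req *tail;
};

/* Append one request at the tail, like the rskq_accept_* update in the diff. */
static void queue_add_tail(struct accept_queue *q, struct req *r)
{
	assert(q != NULL && r != NULL);

	r->dl_next = NULL;
	if (q->head == NULL)
		q->head = r;
	else
		q->tail->dl_next = r;
	q->tail = r;
}

/* Detach the oldest request, or return NULL when the queue is empty. */
static struct req *queue_remove_head(struct accept_queue *q)
{
	struct req *r = q->head;

	if (r != NULL) {
		q->head = r->dl_next;
		if (q->head == NULL)
			q->tail = NULL;
		r->dl_next = NULL;
	}
	return r;
}

/* Move every request from src to dst one at a time; returns the count moved. */
static int queue_migrate_all(struct accept_queue *src, struct accept_queue *dst)
{
	struct req *r;
	int moved = 0;

	while ((r = queue_remove_head(src)) != NULL) {
		queue_add_tail(dst, r);
		moved++;
	}
	return moved;
}

/* Returns 1 if three requests keep their FIFO order after migration. */
static int demo_order_preserved(void)
{
	struct req reqs[3] = { { 1, NULL }, { 2, NULL }, { 3, NULL } };
	struct accept_queue old_q = { NULL, NULL }, new_q = { NULL, NULL };
	struct req *r;
	int expect = 1;
	int i;

	for (i = 0; i < 3; i++)
		queue_add_tail(&old_q, &reqs[i]);

	if (queue_migrate_all(&old_q, &new_q) != 3)
		return 0;

	for (r = new_q.head; r != NULL; r = r->dl_next)
		if (r->id != expect++)
			return 0;

	return old_q.head == NULL && expect == 4;
}
```

With head removal plus tail insertion the relative order of the migrated
requests is unchanged; the order would only be reversed if requests were
inserted at the head of the new queue.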
