netdev.vger.kernel.org archive mirror
From: Martin Lau <kafai@fb.com>
To: Joe Stringer <joe@isovalent.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH bpf] bpf: Check sk_fullsock() before returning from bpf_sk_lookup()
Date: Sun, 19 May 2019 02:07:29 +0000
Message-ID: <20190519020703.nbioindo5krpgupi@kafai-mbp>
In-Reply-To: <CADa=RyxfhK+XhAwf_C_an=+RnsQCPCXV23Qrwk-3OC1oLdHM=A@mail.gmail.com>

On Sat, May 18, 2019 at 06:52:48PM -0700, Joe Stringer wrote:
> On Sat, May 18, 2019, 09:05 Martin Lau <kafai@fb.com> wrote:
> >
> > On Sat, May 18, 2019 at 08:38:46AM -1000, Joe Stringer wrote:
> > > On Fri, May 17, 2019, 12:02 Martin Lau <kafai@fb.com> wrote:
> > >
> > > > On Fri, May 17, 2019 at 02:51:48PM -0700, Eric Dumazet wrote:
> > > > >
> > > > >
> > > > > On 5/17/19 2:21 PM, Martin KaFai Lau wrote:
> > > > > > > The BPF_FUNC_sk_lookup_xxx helpers return RET_PTR_TO_SOCKET_OR_NULL,
> > > > > > > meaning a fullsock ptr whose fullsock fields in bpf_sock can be
> > > > > > > accessed, e.g. type, protocol, mark and priority.
> > > > > > > Some new helpers, like bpf_sk_storage_get(), also expect
> > > > > > > ARG_PTR_TO_SOCKET to be a fullsock.
> > > > > >
> > > > > > bpf_sk_lookup() currently calls sk_to_full_sk() before returning.
> > > > > > However, the ptr returned from sk_to_full_sk() is not guaranteed
> > > > > > to be a fullsock.  For example, it cannot get a fullsock if sk
> > > > > > is in TCP_TIME_WAIT.
> > > > > >
> > > > > > This patch checks for sk_fullsock() before returning. If it is not
> > > > > > a fullsock, sock_gen_put() is called if needed and then returns NULL.
> > > > > >
> > > > > > Fixes: 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF")
> > > > > > Cc: Joe Stringer <joe@isovalent.com>
> > > > > > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > > > > > ---
> > > > > >  net/core/filter.c | 16 ++++++++++++++--
> > > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > > > index 55bfc941d17a..85def5a20aaf 100644
> > > > > > --- a/net/core/filter.c
> > > > > > +++ b/net/core/filter.c
> > > > > > @@ -5337,8 +5337,14 @@ __bpf_sk_lookup(struct sk_buff *skb, struct
> > > > bpf_sock_tuple *tuple, u32 len,
> > > > > >     struct sock *sk = __bpf_skc_lookup(skb, tuple, len, caller_net,
> > > > > >                                        ifindex, proto, netns_id,
> > > > flags);
> > > > > >
> > > > > > -   if (sk)
> > > > > > +   if (sk) {
> > > > > >             sk = sk_to_full_sk(sk);
> > > > > > +           if (!sk_fullsock(sk)) {
> > > > > > +                   if (!sock_flag(sk, SOCK_RCU_FREE))
> > > > > > +                           sock_gen_put(sk);
> > > > >
> > > > > This looks a bit convoluted/weird.
> > > > >
> > > > > > What about telling/asking __bpf_skc_lookup() to not return a non
> > > > > > fullsock instead?
> > > > It is because some other helpers, like BPF_FUNC_skc_lookup_tcp,
> > > > can return a non fullsock
> > > >
> > >
> > > FYI this is necessary for finding a transparently proxied socket for a
> > > non-local connection (tproxy use case).
> > You meant it is necessary to return a non fullsock from the
> > BPF_FUNC_sk_lookup_xxx helpers?
> 
> Yes, that's what I want to associate with the skb so that the delivery
> to the SO_TRANSPARENT is received properly.
> 
> For the first packet of a connection, we look up the socket using the
> tproxy socket port as the destination, and deliver the packet there.
> The SO_TRANSPARENT logic then kicks in and sends back the ack and
> creates the non-full sock for the connection tuple, which can be
> entirely unrelated to local addresses or ports.
> 
> For the second forward-direction packet (i.e. the ACK in the 3-way
> handshake), we must deliver the packet to this non-full sock as that's what
> is negotiating the proxied connection. If you look up using the packet
> tuple then get the full sock from it, it will go back to the
> SO_TRANSPARENT parent socket. Delivering the ACK there will result in
> a RST being sent back, because the SO_TRANSPARENT socket is just there
> to accept new connections for connections to be proxied. So this is
> the case where I need the non-full sock.
> 
> (In practice, the lookup logic attempts the packet tuple first then if
> that fails, uses the tproxy port for lookup to achieve the above).
hmm...I am likely missing something.

1) The above can be done by the "BPF_FUNC_skC_lookup_tcp" which
   returns a non fullsock (RET_PTR_TO_SOCK_COMMON_OR_NULL), no?

2) The bpf_func_proto of "BPF_FUNC_sk_lookup_tcp" returns
   fullsock (RET_PTR_TO_SOCKET_OR_NULL) and the bpf_prog (and
   the verifier) is expecting that.  How to address the bug here?
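
To make point 2) concrete, here is a minimal userspace sketch in C of the
check the patch adds before bpf_sk_lookup() returns. It is not kernel code:
struct model_sock and every model_* function are hypothetical stand-ins,
the state numbers are assumed to match the kernel's TCP state enum, and the
refcount decrement only stands in for sock_gen_put().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match the kernel's TCP state numbering (subset only). */
enum {
	TCP_ESTABLISHED  = 1,
	TCP_TIME_WAIT    = 6,
	TCP_LISTEN       = 10,
	TCP_NEW_SYN_RECV = 12,
};

/* Hypothetical stand-in for struct sock; not the kernel layout. */
struct model_sock {
	int state;
	struct model_sock *listener;	/* only meaningful for request socks */
	int refcnt;
	bool rcu_free;			/* models the SOCK_RCU_FREE flag */
};

/* Models sk_fullsock(): timewait and request socks are not full socks. */
static bool model_sk_fullsock(const struct model_sock *sk)
{
	return sk->state != TCP_TIME_WAIT && sk->state != TCP_NEW_SYN_RECV;
}

/* Models sk_to_full_sk(): a request sock maps to its listener, while a
 * timewait sock has no full sock to map to and is returned unchanged. */
static struct model_sock *model_sk_to_full_sk(struct model_sock *sk)
{
	if (sk && sk->state == TCP_NEW_SYN_RECV) {
		assert(sk->listener != NULL);
		sk = sk->listener;
	}
	return sk;
}

/* Models the tail of the patched __bpf_sk_lookup(): never hand a
 * non-fullsock to a caller that expects RET_PTR_TO_SOCKET_OR_NULL. */
static struct model_sock *model_sk_lookup_result(struct model_sock *sk)
{
	if (sk) {
		sk = model_sk_to_full_sk(sk);
		if (!model_sk_fullsock(sk)) {
			if (!sk->rcu_free)
				sk->refcnt--;	/* stands in for sock_gen_put() */
			return NULL;
		}
	}
	return sk;
}
```

In this model a TCP_TIME_WAIT entry comes back as NULL with its reference
dropped, an established sock is returned as-is, and a request sock resolves
to its listener, so the caller only ever sees a fullsock or NULL.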

Thread overview: 10+ messages
2019-05-17 21:21 [PATCH bpf] bpf: Check sk_fullsock() before returning from bpf_sk_lookup() Martin KaFai Lau
2019-05-17 21:51 ` Eric Dumazet
2019-05-17 22:01   ` Martin Lau
     [not found]     ` <CADa=RyxisbcVeXL7yq6o02XOgWd87QCzq-6zDXRnm9RoD2WM=A@mail.gmail.com>
2019-05-18 19:05       ` Martin Lau
2019-05-19  1:52         ` Joe Stringer
2019-05-19  2:07           ` Martin Lau [this message]
2019-05-20 18:38             ` Martin Lau
2019-05-20 18:56             ` Joe Stringer
2019-05-20 18:57 ` Joe Stringer
2019-05-21 14:48 ` Daniel Borkmann
