From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Borkmann Subject: Re: [PATCH bpf-next v6 1/4] bpf: sockmap, refactor sockmap routines to work with hashmap Date: Tue, 15 May 2018 21:19:15 +0200 Message-ID: <954e3b7f-9001-e528-c6ab-f8f69c84cfd8@iogearbox.net> References: <1526317219-7752-1-git-send-email-john.fastabend@gmail.com> <1526317219-7752-2-git-send-email-john.fastabend@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: netdev@vger.kernel.org, davem@davemloft.net To: John Fastabend , ast@kernel.org Return-path: Received: from www62.your-server.de ([213.133.104.62]:49658 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751104AbeEOTTS (ORCPT ); Tue, 15 May 2018 15:19:18 -0400 In-Reply-To: <1526317219-7752-2-git-send-email-john.fastabend@gmail.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 05/14/2018 07:00 PM, John Fastabend wrote: [...] > +static int __sock_map_ctx_update_elem(struct bpf_map *map, > + struct bpf_sock_progs *progs, > + struct sock *sock, > + struct sock **map_link, > + void *key) > { [...] > * sock being added. If the sock is already attached to BPF programs > * this results in an error. > */ > - verdict = READ_ONCE(stab->bpf_verdict); > - parse = READ_ONCE(stab->bpf_parse); > - tx_msg = READ_ONCE(stab->bpf_tx_msg); > + verdict = READ_ONCE(progs->bpf_verdict); > + parse = READ_ONCE(progs->bpf_parse); > + tx_msg = READ_ONCE(progs->bpf_tx_msg); > > if (parse && verdict) { > /* bpf prog refcnt may be zero if a concurrent attach operation > @@ -1703,11 +1691,11 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops, > * we increment the refcnt. If this is the case abort with an > * error. > */ > - verdict = bpf_prog_inc_not_zero(stab->bpf_verdict); > + verdict = bpf_prog_inc_not_zero(progs->bpf_verdict); > if (IS_ERR(verdict)) > return PTR_ERR(verdict); > > - parse = bpf_prog_inc_not_zero(stab->bpf_parse); > + parse = bpf_prog_inc_not_zero(progs->bpf_parse); > if (IS_ERR(parse)) { > bpf_prog_put(verdict); > return PTR_ERR(parse); > @@ -1715,7 +1703,7 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops, > } > > if (tx_msg) { > - tx_msg = bpf_prog_inc_not_zero(stab->bpf_tx_msg); > + tx_msg = bpf_prog_inc_not_zero(progs->bpf_tx_msg); > if (IS_ERR(tx_msg)) { > if (verdict) > bpf_prog_put(verdict); Not directly related to the patch since it doesn't change the logic, but it feels something is not quite right here (unless I'm missing something). The code is as follows: [...] verdict = READ_ONCE(progs->bpf_verdict); parse = READ_ONCE(progs->bpf_parse); tx_msg = READ_ONCE(progs->bpf_tx_msg); if (parse && verdict) { /* bpf prog refcnt may be zero if a concurrent attach operation * removes the program after the above READ_ONCE() but before * we increment the refcnt. If this is the case abort with an * error. */ verdict = bpf_prog_inc_not_zero(progs->bpf_verdict); if (IS_ERR(verdict)) return PTR_ERR(verdict); parse = bpf_prog_inc_not_zero(progs->bpf_parse); if (IS_ERR(parse)) { bpf_prog_put(verdict); return PTR_ERR(parse); } } if (tx_msg) { tx_msg = bpf_prog_inc_not_zero(progs->bpf_tx_msg); if (IS_ERR(tx_msg)) { if (verdict) bpf_prog_put(verdict); if (parse) bpf_prog_put(parse); return PTR_ERR(tx_msg); } } [...] As you say in the comment above the function wrt locking notes that the __sock_map_ctx_update_elem() can be called concurrently. All operations operate on sock_map using cmpxchg and xchg operations to ensure we do not get stale references. Any reads into the map must be done with READ_ONCE() because of this. You initially use the READ_ONCE() on the verdict/parse/tx_msg, but later on when grabbing the reference you use again progs->bpf_verdict/bpf_parse/bpf_tx_msg which would potentially refetch it, but if updates would happen concurrently e.g. to the three progs, they could be NULL in the mean-time, no? bpf_prog_inc_not_zero() would then crash. Why are not the ones used that you fetched previously via READ_ONCE() for taking the ref? The second question I had is that verdict/parse/tx_msg are updated independently from each other and each could be NULL or non-NULL. What if, say, parse is NULL and verdict as well as tx_msg is non-NULL and the bpf_prog_inc_not_zero() on the tx_msg prog fails. Doesn't this cause a use-after-free since a ref on verdict wasn't taken earlier but the bpf_prog_put() will cause accidental misbalance/free of the progs? It would probably help to clarify the locking comment a bit more if indeed the above should be okay as is. Thanks, Daniel