Re: [PATCH bpf-next v3 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE

From: Daniel Borkmann <daniel@iogearbox.net>
To: "Björn Töpel" <bjorn.topel@gmail.com>,
	ast@kernel.org, netdev@vger.kernel.org
Cc: magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
	bpf@vger.kernel.org, bjorn.topel@intel.com,
	jonathan.lemon@gmail.com,
	syzbot+c82697e3043781e08802@syzkaller.appspotmail.com,
	hdanton@sina.com, i.maximets@samsung.com
Subject: Re: [PATCH bpf-next v3 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE
Date: Thu, 5 Sep 2019 15:14:15 +0200	[thread overview]
Message-ID: <e948901d-6d0a-3764-7d2e-723f891bd3d9@iogearbox.net> (raw)
In-Reply-To: <20190904114913.17217-1-bjorn.topel@gmail.com>

On 9/4/19 1:49 PM, Björn Töpel wrote:
> This is a four patch series of various barrier, {READ, WRITE}_ONCE
> cleanups in the AF_XDP socket code. More details can be found in the
> corresponding commit message. Previous revisions: v1 [4] and v2 [5].
> 
> For an AF_XDP socket, most control plane operations are done under the
> control mutex (struct xdp_sock, mutex), but there are some places
> where members of the struct is read outside the control mutex. The
> dev, queue_id members are set in bind() and cleared at cleanup. The
> umem, fq, cq, tx, rx, and state member are all assigned in various
> places, e.g. bind() and setsockopt(). When the members are assigned,
> they are protected by the control mutex, but since they are read
> outside the mutex, a WRITE_ONCE is required to avoid store-tearing on
> the read-side.
> 
> Prior the state variable was introduced by Ilya, the dev member was
> used to determine whether the socket was bound or not. However, when
> dev was read, proper SMP barriers and READ_ONCE were missing. In order
> to address the missing barriers and READ_ONCE, we start using the
> state variable as a point of synchronization. The state member
> read/write is paired with proper SMP barriers, and from this follows
> that the members described above does not need READ_ONCE statements if
> used in conjunction with state check.
> 
> To summarize: The members struct xdp_sock members dev, queue_id, umem,
> fq, cq, tx, rx, and state were read lock-less, with incorrect barriers
> and missing {READ, WRITE}_ONCE. After this series umem, fq, cq, tx,
> rx, and state are read lock-less. When these members are updated,
> WRITE_ONCE is used. When read, READ_ONCE are only used when read
> outside the control mutex (e.g. mmap) or, not synchronized with the
> state member (XSK_BOUND plus smp_rmb())
> 
> Thanks,
> Björn
> 
> [1] https://lore.kernel.org/bpf/beef16bb-a09b-40f1-7dd0-c323b4b89b17@iogearbox.net/
> [2] https://lwn.net/Articles/793253/
> [3] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
> [4] https://lore.kernel.org/bpf/20190822091306.20581-1-bjorn.topel@gmail.com/
> [5] https://lore.kernel.org/bpf/20190826061053.15996-1-bjorn.topel@gmail.com/
> 
> v2->v3:
>    Minor restructure of commits.
>    Improve cover and commit messages. (Daniel)
> v1->v2:
>    Removed redundant dev check. (Jonathan)
> 
> Björn Töpel (4):
>    xsk: avoid store-tearing when assigning queues
>    xsk: avoid store-tearing when assigning umem
>    xsk: use state member for socket synchronization
>    xsk: lock the control mutex in sock_diag interface
> 
>   net/xdp/xsk.c      | 60 ++++++++++++++++++++++++++++++++--------------
>   net/xdp/xsk_diag.c |  3 +++
>   2 files changed, 45 insertions(+), 18 deletions(-)
> 

Applied, thanks!