All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Björn Töpel" <bjorn.topel@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org
Cc: "Björn Töpel" <bjorn.topel@gmail.com>,
	magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
	bpf@vger.kernel.org, bjorn.topel@intel.com,
	jonathan.lemon@gmail.com,
	syzbot+c82697e3043781e08802@syzkaller.appspotmail.com,
	hdanton@sina.com, i.maximets@samsung.com
Subject: [PATCH bpf-next v3 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE
Date: Wed,  4 Sep 2019 13:49:09 +0200	[thread overview]
Message-ID: <20190904114913.17217-1-bjorn.topel@gmail.com> (raw)

This is a four patch series of various barrier, {READ, WRITE}_ONCE
cleanups in the AF_XDP socket code. More details can be found in the
corresponding commit message. Previous revisions: v1 [4] and v2 [5].

For an AF_XDP socket, most control plane operations are done under the
control mutex (struct xdp_sock, mutex), but there are some places
where members of the struct is read outside the control mutex. The
dev, queue_id members are set in bind() and cleared at cleanup. The
umem, fq, cq, tx, rx, and state member are all assigned in various
places, e.g. bind() and setsockopt(). When the members are assigned,
they are protected by the control mutex, but since they are read
outside the mutex, a WRITE_ONCE is required to avoid store-tearing on
the read-side.

Prior the state variable was introduced by Ilya, the dev member was
used to determine whether the socket was bound or not. However, when
dev was read, proper SMP barriers and READ_ONCE were missing. In order
to address the missing barriers and READ_ONCE, we start using the
state variable as a point of synchronization. The state member
read/write is paired with proper SMP barriers, and from this follows
that the members described above does not need READ_ONCE statements if
used in conjunction with state check.

To summarize: The members struct xdp_sock members dev, queue_id, umem,
fq, cq, tx, rx, and state were read lock-less, with incorrect barriers
and missing {READ, WRITE}_ONCE. After this series umem, fq, cq, tx,
rx, and state are read lock-less. When these members are updated,
WRITE_ONCE is used. When read, READ_ONCE are only used when read
outside the control mutex (e.g. mmap) or, not synchronized with the
state member (XSK_BOUND plus smp_rmb())

Thanks,
Björn

[1] https://lore.kernel.org/bpf/beef16bb-a09b-40f1-7dd0-c323b4b89b17@iogearbox.net/
[2] https://lwn.net/Articles/793253/
[3] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
[4] https://lore.kernel.org/bpf/20190822091306.20581-1-bjorn.topel@gmail.com/
[5] https://lore.kernel.org/bpf/20190826061053.15996-1-bjorn.topel@gmail.com/

v2->v3:
  Minor restructure of commits.
  Improve cover and commit messages. (Daniel)
v1->v2:
  Removed redundant dev check. (Jonathan)

Björn Töpel (4):
  xsk: avoid store-tearing when assigning queues
  xsk: avoid store-tearing when assigning umem
  xsk: use state member for socket synchronization
  xsk: lock the control mutex in sock_diag interface

 net/xdp/xsk.c      | 60 ++++++++++++++++++++++++++++++++--------------
 net/xdp/xsk_diag.c |  3 +++
 2 files changed, 45 insertions(+), 18 deletions(-)

-- 
2.20.1


             reply	other threads:[~2019-09-04 11:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-04 11:49 Björn Töpel [this message]
2019-09-04 11:49 ` [PATCH bpf-next v3 1/4] xsk: avoid store-tearing when assigning queues Björn Töpel
2019-09-04 11:49 ` [PATCH bpf-next v3 2/4] xsk: avoid store-tearing when assigning umem Björn Töpel
2019-09-04 11:49 ` [PATCH bpf-next v3 3/4] xsk: use state member for socket synchronization Björn Töpel
2019-09-04 16:40   ` Jonathan Lemon
2019-09-04 11:49 ` [PATCH bpf-next v3 4/4] xsk: lock the control mutex in sock_diag interface Björn Töpel
2019-09-05 13:14 ` [PATCH bpf-next v3 0/4] xsk: various CPU barrier and {READ, WRITE}_ONCE Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190904114913.17217-1-bjorn.topel@gmail.com \
    --to=bjorn.topel@gmail.com \
    --cc=ast@kernel.org \
    --cc=bjorn.topel@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=hdanton@sina.com \
    --cc=i.maximets@samsung.com \
    --cc=jonathan.lemon@gmail.com \
    --cc=magnus.karlsson@gmail.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=syzbot+c82697e3043781e08802@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.