From: Hao Luo <haoluo@google.com>
To: Hou Tao <houtao@huaweicloud.com>
Cc: bpf@vger.kernel.org, Song Liu <songliubraving@fb.com>,
	Hao Sun <sunhao.th@gmail.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Andrii Nakryiko <andrii@kernel.org>, Yonghong Song <yhs@fb.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <kafai@fb.com>, KP Singh <kpsingh@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Jiri Olsa <jolsa@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Lorenz Bauer <oss@lmb.io>,
	houtao1@huawei.com
Subject: Re: [PATCH 1/3] bpf: Disable preemption when increasing per-cpu map_locked
Date: Sun, 21 Aug 2022 09:42:00 -0700	[thread overview]
Message-ID: <CA+khW7jv6J9FSqNvaHNzYNpEBoQX6wPEEdoD4OwkPQt844Wwmw@mail.gmail.com> (raw)
In-Reply-To: <20220821033223.2598791-2-houtao@huaweicloud.com>

Hi Hou Tao,

On Sat, Aug 20, 2022 at 8:14 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> From: Hou Tao <houtao1@huawei.com>
>
> Per-cpu htab->map_locked is used to prevent concurrent accesses from
> both NMI and non-NMI contexts. But since 74d862b682f51 ("sched:
> Make migrate_disable/enable() independent of RT"), migrate_disable()
> is preemptible under the !PREEMPT_RT case, so map_locked now also
> unexpectedly rejects concurrent updates from normal contexts (e.g.
> userspace processes), as shown below:
>
> process A                      process B
>
> htab_map_update_elem()
>   htab_lock_bucket()
>     migrate_disable()
>     /* return 1 */
>     __this_cpu_inc_return()
>     /* preempted by B */
>
>                                htab_map_update_elem()
>                                  /* the same bucket as A */
>                                  htab_lock_bucket()
>                                    migrate_disable()
>                                    /* return 2, so lock fails */
>                                    __this_cpu_inc_return()
>                                    return -EBUSY
>
> A fix that seems feasible is to use in_nmi() in htab_lock_bucket() and
> check the value of map_locked only in NMI context. But that would
> re-introduce a deadlock on the bucket lock if htab_lock_bucket() is
> re-entered through a non-tracing program (e.g. a fentry program).
>
> So fix it by using preempt_disable() instead of migrate_disable() when
> increasing htab->map_locked. However, when htab_use_raw_lock() is false,
> the bucket lock is a sleepable spin-lock which cannot be acquired with
> preemption disabled, so keep migrate_disable() for the spin-lock case.
>
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---

IIUC, this patch enlarges the scope of preemption disable to cover the
increment of map_locked. But I don't think the change is meaningful.

This patch only affects the case where the raw lock is used. In that
case, irqs are disabled for the b->raw_lock protected critical section,
and a raw spin lock itself never sleeps, on either RT or non-RT. So my
understanding of this patch is that it just makes sure preemption
doesn't happen across the __this_cpu_inc_return() itself. But that
window is so small that preemption there should be really unlikely.

>  kernel/bpf/hashtab.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 6c530a5e560a..ad09da139589 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -162,17 +162,25 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
>                                    unsigned long *pflags)
>  {
>         unsigned long flags;
> +       bool use_raw_lock;
>
>         hash = hash & HASHTAB_MAP_LOCK_MASK;
>
> -       migrate_disable();
> +       use_raw_lock = htab_use_raw_lock(htab);
> +       if (use_raw_lock)
> +               preempt_disable();
> +       else
> +               migrate_disable();
>         if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>                 __this_cpu_dec(*(htab->map_locked[hash]));
> -               migrate_enable();
> +               if (use_raw_lock)
> +                       preempt_enable();
> +               else
> +                       migrate_enable();
>                 return -EBUSY;
>         }
>
> -       if (htab_use_raw_lock(htab))
> +       if (use_raw_lock)
>                 raw_spin_lock_irqsave(&b->raw_lock, flags);
>         else
>                 spin_lock_irqsave(&b->lock, flags);
> @@ -185,13 +193,18 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
>                                       struct bucket *b, u32 hash,
>                                       unsigned long flags)
>  {
> +       bool use_raw_lock = htab_use_raw_lock(htab);
> +
>         hash = hash & HASHTAB_MAP_LOCK_MASK;
> -       if (htab_use_raw_lock(htab))
> +       if (use_raw_lock)
>                 raw_spin_unlock_irqrestore(&b->raw_lock, flags);
>         else
>                 spin_unlock_irqrestore(&b->lock, flags);
>         __this_cpu_dec(*(htab->map_locked[hash]));
> -       migrate_enable();
> +       if (use_raw_lock)
> +               preempt_enable();
> +       else
> +               migrate_enable();
>  }
>
>  static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
> --
> 2.29.2
>

Thread overview: 19+ messages
2022-08-21  3:32 [PATCH 0/3] fixes for concurrent htab updates Hou Tao
2022-08-21  3:32 ` [PATCH 1/3] bpf: Disable preemption when increasing per-cpu map_locked Hou Tao
2022-08-21 16:42   ` Hao Luo [this message]
2022-08-22  1:27     ` Hou Tao
2022-08-22  3:21       ` Hao Luo
2022-08-22 12:07         ` Hou Tao
2022-08-22 18:01           ` Hao Luo
2022-08-23  0:56             ` Hao Luo
2022-08-23  1:29               ` Alexei Starovoitov
2022-08-23  2:57                 ` Hou Tao
2022-08-23  4:50                   ` Alexei Starovoitov
2022-08-23  6:41                     ` Hou Tao
2022-08-23  2:54               ` Hou Tao
2022-08-22  8:13   ` Sebastian Andrzej Siewior
2022-08-22 12:09     ` Hou Tao
2022-08-22 15:30       ` Sebastian Andrzej Siewior
2022-08-21  3:32 ` [PATCH 2/3] bpf: Allow normally concurrent map updates for !htab_use_raw_lock() case Hou Tao
2022-08-21  3:32 ` [PATCH 3/3] bpf: Propagate error from htab_lock_bucket() to userspace Hou Tao
2022-08-22  1:21 ` [PATCH 0/3] fixes for concurrent htab updates Hou Tao
