linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
To: Lai Jiangshan <laijs@linux.alibaba.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Michel Lespinasse <walken@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [PATCH V2 1/2] rbtree_latch: quit searching when reaching to maximum depth
Date: Sat, 23 May 2020 08:56:21 +0800	[thread overview]
Message-ID: <CAJhGHyDp4LYcd8xzL761UYAhj6AY=R4Uik6b4tVN1sqyw95Dsg@mail.gmail.com> (raw)
In-Reply-To: <20200515155912.1713-1-laijs@linux.alibaba.com>

On Sat, May 16, 2020 at 12:01 AM Lai Jiangshan <laijs@linux.alibaba.com> wrote:
>
> lib/rbtree.c has ensured that there is not possible to
> inadvertently cause (temporary) loops in the tree structure
> as seen in program order of the modifier. But loop is still
> possible to be seen in searcher due to CPU's reordering.
>
> for example:
> modifier                                searcher
>
> left rotate at parent
> parent->rb_right is node
>                                         search to parent
>                                         parent->rb_right is node
>                                      +->see node->rb_left changed
> WRITE_ONCE(parent->rb_right, tmp);-+ |  node->rb_left is parennt
> no smp_wmb(), some ARCHs can       | |
> reorder these two writes           | |  loop long between
> WRITE_ONCE(node->rb_left, parent);-+-+  parent and node
>                                    |
>                                    +--->finally see
>                                         parent->rb_right
>
> The long loop won't stop until the modifer's CPU flushes
> its writes. Too avoid it, we should limit the searching depth.
> There are no more than (1<<BITS_PER_LONG)-1 nodes in the tree.
> And the max_depth of a tree is no more than 2*lg(node_count+1),
> which is no mare than 2*BITS_PER_LONG.
>
> So the searching should stop when diving down up to
> 2*BITS_PER_LONG depth.
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Michel Lespinasse <walken@google.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---

Hello

Could anyone have a review or an ack on the updated patch?
Compared v1:
  Applied Mathieu's suggest to change the changelog
  Avoid the depth check on x86, so the patch makes no functionality
    change on x86

And which tree should the patches route to? It is memory ordering
related.

Thanks
Lai

>  include/linux/rbtree_latch.h | 39 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/include/linux/rbtree_latch.h b/include/linux/rbtree_latch.h
> index 7d012faa509a..638942f53c0a 100644
> --- a/include/linux/rbtree_latch.h
> +++ b/include/linux/rbtree_latch.h
> @@ -102,11 +102,47 @@ __lt_erase(struct latch_tree_node *ltn, struct latch_tree_root *ltr, int idx)
>         rb_erase(&ltn->node[idx], &ltr->tree[idx]);
>  }
>
> +/*
> + * Beware when rbtree is being searched in RCU read sites.
> + *
> + * lib/rbtree.c has ensured that there is not possible to
> + * inadvertently cause (temporary) loops in the tree structure
> + * as seen in program order of the modifier. But loop is still
> + * possible to be seen in searcher due to CPU's reordering.
> + *
> + * for example:
> + * modifier                               searcher
> + *
> + * left rotate at parent                  search to parent
> + * parent->rb_right is node               parent->rb_right is node
> + *                                     +->see node->rb_left changed
> + * WRITE_ONCE(parent->rb_right, tmp);-+ |  node->rb_left is parennt
> + * no smp_wmb(), some ARCHs can       | |
> + * reorder these two writes           | |  loop long between
> + * WRITE_ONCE(node->rb_left, parent);-+-+  parent and node
> + *                                   |
> + *                                   +--->finally see
> + *                                        parent->rb_right
> + *
> + * The long loop won't stop until the searcher finally see the changes
> + * from the modifier. Too avoid it, we should limit the searching depth.
> + *
> + * Limited by the address space of the kernel, there are no more than
> + * (1<<BITS_PER_LONG)-1 nodes in the tree. And the max_depth of a tree is
> + * no more than 2*lg(node_count+1), which is no mare than 2*BITS_PER_LONG.
> + *
> + * So the searching should stop when diving down up to
> + * 2*BITS_PER_LONG depth.
> + *
> + * Note: the above problem is not subject to the TSO memory model, such as
> + * x86, which can dispense with the depth check.
> + */
>  static __always_inline struct latch_tree_node *
>  __lt_find(void *key, struct latch_tree_root *ltr, int idx,
>           int (*comp)(void *key, struct latch_tree_node *node))
>  {
>         struct rb_node *node = rcu_dereference_raw(ltr->tree[idx].rb_node);
> +       int depth = 2 * BITS_PER_LONG;
>         struct latch_tree_node *ltn;
>         int c;
>
> @@ -120,6 +156,9 @@ __lt_find(void *key, struct latch_tree_root *ltr, int idx,
>                         node = rcu_dereference_raw(node->rb_right);
>                 else
>                         return ltn;
> +
> +               if (!IS_ENABLED(CONFIG_X86) && (--depth < 0))
> +                       break;
>         }
>
>         return NULL;
> --
> 2.20.1
>

  parent reply	other threads:[~2020-05-23  0:56 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-15 12:47 [PATCH 1/2] rbtree_latch: quit searching when reaching to maximum depth Lai Jiangshan
2020-05-15 12:47 ` [PATCH 2/2] rbtree_latch: don't need to check seq when it found a node Lai Jiangshan
2020-05-15 13:04   ` Peter Zijlstra
2020-05-15 13:00 ` [PATCH 1/2] rbtree_latch: quit searching when reaching to maximum depth Peter Zijlstra
2020-05-15 14:39   ` Lai Jiangshan
2020-05-15 15:01     ` Peter Zijlstra
2020-05-15 15:59       ` [PATCH V2 " Lai Jiangshan
2020-05-15 15:59         ` [PATCH V2 2/2] rbtree_latch: don't need to check seq when it found a node Lai Jiangshan
2020-05-16  4:27           ` Michel Lespinasse
2020-05-16  4:52             ` Lai Jiangshan
2020-05-16  5:03               ` Michel Lespinasse
2020-05-23  0:56         ` Lai Jiangshan [this message]
2020-05-15 13:14 ` [PATCH 1/2] rbtree_latch: quit searching when reaching to maximum depth Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJhGHyDp4LYcd8xzL761UYAhj6AY=R4Uik6b4tVN1sqyw95Dsg@mail.gmail.com' \
    --to=jiangshanlai+lkml@gmail.com \
    --cc=aarcange@redhat.com \
    --cc=laijs@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).