All of lore.kernel.org
 help / color / mirror / Atom feed
* Possible BUG in the entry removal process of dm-btree structure
@ 2015-05-18  7:21 Dennis Yang
  2015-05-18 12:51 ` Joe Thornber
  0 siblings, 1 reply; 2+ messages in thread
From: Dennis Yang @ 2015-05-18  7:21 UTC (permalink / raw)
  To: device-mapper development

Hi,

Recently, I had run into the same dm-btree-remove bug reported before
in this mailing list.
(https://www.redhat.com/archives/dm-devel/2013-February/msg00063.html)
I google this and it turns out there is someone else had faced the
same issue with CentOS couple of days ago.
(http://mogu.io/docker_crash).

Also, since we will run thin_check every couple of hours on our server
to check if the metadata is corrupted, we do occasionally find that
the key ordering of some btree nodes become unsorted by using the
thin_debug tool to examine them. Therefore, my colleague and I have
traced the dm-btree-remove.c and find something which might be the
root cause of these issues.

In dm-btree-remove.c, there is a redistribute3() function which is
called when we try to rebalance the entry of three nodes with their
total entry count is higher than a defined threshold. I paste the code
below for further discussion.

target = (nr_left + nr_center + nr_right) / 3;
if (nr_left < nr_right) {
    s = nr_left - target;

    if (s < 0 && nr_center < -s) {
        /* not enough in central node */
        shift(left, center, nr_center);
        s = nr_center - target;
        shift(left, right, s);
        nr_right += s;
    } else
        shift(left, center, s);

    shift(center, right, target - nr_right);
} else {

If the entry count of left node is smaller than target and the entry
count of center node is not enough to cover this gap, I think what we
try to do here is to move all the entries in the center node to the
left node first, then move some entries from the right node to the
left one to make its entry count equals to target. However, since
nr_center is always positive, the code above will first move up to
nr_center entries from the left node to the center node. This, in our
opinion, might trigger the BUG_ON in shift(), for we cannot sure left
node has nr_center nodes to be moved. At this point, the left node
will have (nr_left - nr_center) entries, and the center node will have
(2 * nr_center) entries. Then, it will try to move (target -
nr_center) entries from right to left. Since the center node still
have (2 * nr_center) entries, moving entries directly from right to
left will mess up the ordering of entries between left and center
node.

To fix this issue, I think we should stick with the plan that move all
the entries from center node to left/right node first, and then make
it up to target entries by moving them from right/left node to it. I
have attached a simple patch to demonstrate the idea here.

Any help would be grateful.
Thanks.

Dennis

diff --git a/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
b/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
index b88757c..9026fb8 100644
--- a/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
+++ b/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
@@ -309,8 +309,9 @@ static void redistribute3(struct dm_btree_info
*info, struct btree_node *parent,

          if (s < 0 && nr_center < -s) {
          /* not enough in central node */
-             shift(left, center, nr_center);
-             s = nr_center - target;
+            shift(center, left, nr_center);
+            s += nr_center;
              shift(left, right, s);
              nr_right += s;
         } else
@@ -323,7 +324,7 @@ static void redistribute3(struct dm_btree_info
*info, struct btree_node *parent,
        if (s > 0 && nr_center < s) {
        /* not enough in central node */
            shift(center, right, nr_center);
-           s = target - nr_center;
+          s -= nr_center;
            shift(left, right, s);
            nr_left -= s;
       } else

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: Possible BUG in the entry removal process of dm-btree structure
  2015-05-18  7:21 Possible BUG in the entry removal process of dm-btree structure Dennis Yang
@ 2015-05-18 12:51 ` Joe Thornber
  0 siblings, 0 replies; 2+ messages in thread
From: Joe Thornber @ 2015-05-18 12:51 UTC (permalink / raw)
  To: device-mapper development

On Mon, May 18, 2015 at 03:21:52PM +0800, Dennis Yang wrote:
> Hi,
> 
> Recently, I had run into the same dm-btree-remove bug reported before
> in this mailing list.

Hi Dennis,

You diagnosis looks correct, though I think your patch introduces
entry reordering.

> In dm-btree-remove.c, there is a redistribute3() function which is
> called when we try to rebalance the entry of three nodes with their
> total entry count is higher than a defined threshold. I paste the code
> below for further discussion.
> 
> target = (nr_left + nr_center + nr_right) / 3;
> if (nr_left < nr_right) {
>     s = nr_left - target;
> 
>     if (s < 0 && nr_center < -s) {
>         /* not enough in central node */
>         shift(left, center, nr_center);
>         s = nr_center - target;
>         shift(left, right, s);
>         nr_right += s;
>     } else
>         shift(left, center, s);
> 
>     shift(center, right, target - nr_right);
> } else {
> 
> If the entry count of left node is smaller than target and the entry
> count of center node is not enough to cover this gap, I think what we
> try to do here is to move all the entries in the center node to the
> left node first, then move some entries from the right node to the
> left one to make its entry count equals to target. However, since
> nr_center is always positive, the code above will first move up to
> nr_center entries from the left node to the center node.

Yep, this is a bug, it should be -nr_center.

> To fix this issue, I think we should stick with the plan that move all
> the entries from center node to left/right node first, and then make
> it up to target entries by moving them from right/left node to it. I
> have attached a simple patch to demonstrate the idea here.
> 
> diff --git a/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
> b/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
> index b88757c..9026fb8 100644
> --- a/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
> +++ b/linux-3.12.6/drivers/md/persistent-data/dm-btree-remove.c
> @@ -309,8 +309,9 @@ static void redistribute3(struct dm_btree_info
> *info, struct btree_node *parent,
> 
>           if (s < 0 && nr_center < -s) {
>           /* not enough in central node */
> -             shift(left, center, nr_center);
> -             s = nr_center - target;
> +            shift(center, left, nr_center);

You can't put the nodes out of order, what you actually want to do is:
shift(left, center, -nr_center).  Other than that your patch looks
good.  Updated patch below.

Thanks,

- Joe


diff --git a/drivers/md/persistent-data/dm-btree-remove.c b/drivers/md/persistent-data/dm-btree-remove.c
index e04cfd2..9836c0a 100644
--- a/drivers/md/persistent-data/dm-btree-remove.c
+++ b/drivers/md/persistent-data/dm-btree-remove.c
@@ -309,8 +309,8 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
 
                if (s < 0 && nr_center < -s) {
                        /* not enough in central node */
-                       shift(left, center, nr_center);
-                       s = nr_center - target;
+                       shift(left, center, -nr_center);
+                       s += nr_center;
                        shift(left, right, s);
                        nr_right += s;
                } else
@@ -323,7 +323,7 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
                if (s > 0 && nr_center < s) {
                        /* not enough in central node */
                        shift(center, right, nr_center);
-                       s = target - nr_center;
+                       s -= nr_center;
                        shift(left, right, s);
                        nr_left -= s;
                } else

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-05-18 12:51 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-18  7:21 Possible BUG in the entry removal process of dm-btree structure Dennis Yang
2015-05-18 12:51 ` Joe Thornber

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.