All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] rbd: parent_overlap == 0 fixes
@ 2015-01-20 12:41 Ilya Dryomov
  2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Ilya Dryomov @ 2015-01-20 12:41 UTC (permalink / raw)
  To: ceph-devel; +Cc: Alex Elder

Hello,

This series fixes what turned out to be an old (v3.10-v3.11) bug, which
can cause things like http://tracker.ceph.com/issues/10352.  Most of
the damage was brought by 3b5cf2a2f174 ("rbd: clean up a few things in
the refresh path") which went into v3.11, but v3.10 wasn't fully OK
either.

Thanks,

                Ilya


Ilya Dryomov (3):
  rbd: fix rbd_dev_parent_get() when parent_overlap == 0
  rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
  rbd: do not treat standalone as flatten

 drivers/block/rbd.c | 55 +++++++++++++++++------------------------------------
 1 file changed, 17 insertions(+), 38 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0
  2015-01-20 12:41 [PATCH 0/3] rbd: parent_overlap == 0 fixes Ilya Dryomov
@ 2015-01-20 12:41 ` Ilya Dryomov
  2015-01-21  1:22   ` Josh Durgin
  2015-01-27  3:14   ` Alex Elder
  2015-01-20 12:41 ` [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally Ilya Dryomov
  2015-01-20 12:41 ` [PATCH 3/3] rbd: do not treat standalone as flatten Ilya Dryomov
  2 siblings, 2 replies; 10+ messages in thread
From: Ilya Dryomov @ 2015-01-20 12:41 UTC (permalink / raw)
  To: ceph-devel; +Cc: Alex Elder

The comment for rbd_dev_parent_get() said

    * We must get the reference before checking for the overlap to
    * coordinate properly with zeroing the parent overlap in
    * rbd_dev_v2_parent_info() when an image gets flattened.  We
    * drop it again if there is no overlap.

but the "drop it again if there is no overlap" part was missing from
the implementation.  This lead to absurd parent_ref values for images
with parent_overlap == 0, as parent_ref was incremented for each
img_request and virtually never decremented.

Fix this by leveraging the fact that refresh path calls
rbd_dev_v2_parent_info() under header_rwsem and use it for read in
rbd_dev_parent_get(), instead of messing around with atomics.  Get rid
of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
they'd pair with now and I suspect we are in a pretty miserable
situation as far as proper locking goes regardless.

Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
---
 drivers/block/rbd.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 31fa00f0d707..2990a1c75159 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2098,32 +2098,26 @@ static void rbd_dev_parent_put(struct rbd_device *rbd_dev)
  * If an image has a non-zero parent overlap, get a reference to its
  * parent.
  *
- * We must get the reference before checking for the overlap to
- * coordinate properly with zeroing the parent overlap in
- * rbd_dev_v2_parent_info() when an image gets flattened.  We
- * drop it again if there is no overlap.
- *
  * Returns true if the rbd device has a parent with a non-zero
  * overlap and a reference for it was successfully taken, or
  * false otherwise.
  */
 static bool rbd_dev_parent_get(struct rbd_device *rbd_dev)
 {
-	int counter;
+	int counter = 0;
 
 	if (!rbd_dev->parent_spec)
 		return false;
 
-	counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
-	if (counter > 0 && rbd_dev->parent_overlap)
-		return true;
-
-	/* Image was flattened, but parent is not yet torn down */
+	down_read(&rbd_dev->header_rwsem);
+	if (rbd_dev->parent_overlap)
+		counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
+	up_read(&rbd_dev->header_rwsem);
 
 	if (counter < 0)
 		rbd_warn(rbd_dev, "parent reference overflow");
 
-	return false;
+	return counter > 0;
 }
 
 /*
@@ -4238,7 +4232,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
 		 */
 		if (rbd_dev->parent_overlap) {
 			rbd_dev->parent_overlap = 0;
-			smp_mb();
 			rbd_dev_parent_put(rbd_dev);
 			pr_info("%s: clone image has been flattened\n",
 				rbd_dev->disk->disk_name);
@@ -4284,7 +4277,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
 	 * treat it specially.
 	 */
 	rbd_dev->parent_overlap = overlap;
-	smp_mb();
 	if (!overlap) {
 
 		/* A null parent_spec indicates it's the initial probe */
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
  2015-01-20 12:41 [PATCH 0/3] rbd: parent_overlap == 0 fixes Ilya Dryomov
  2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
@ 2015-01-20 12:41 ` Ilya Dryomov
  2015-01-21  1:24   ` Josh Durgin
  2015-01-27  3:40   ` Alex Elder
  2015-01-20 12:41 ` [PATCH 3/3] rbd: do not treat standalone as flatten Ilya Dryomov
  2 siblings, 2 replies; 10+ messages in thread
From: Ilya Dryomov @ 2015-01-20 12:41 UTC (permalink / raw)
  To: ceph-devel; +Cc: Alex Elder

This effectively reverts the last hunk of 392a9dad7e77 ("rbd: detect
when clone image is flattened").

The problem with parent_overlap != 0 condition is that it's possible
and completely valid to have an image with parent_overlap == 0 whose
parent state needs to be cleaned up on unmap.  The next commit, which
drops the "clone image now standalone" logic, opens up another window
of opportunity to hit this, but even without it

    # cat parent-ref.sh
    #!/bin/bash
    rbd create --image-format 2 --size 1 foo
    rbd snap create foo@snap
    rbd snap protect foo@snap
    rbd clone foo@snap bar
    rbd resize --allow-shrink --size 0 bar
    rbd resize --size 1 bar
    DEV=$(rbd map bar)
    rbd unmap $DEV

leaves rbd_device/rbd_spec/etc and rbd_client along with ceph_client
hanging around.

My thinking behind calling rbd_dev_parent_put() unconditionally is that
there shouldn't be any requests in flight at that point in time as we
are deep into unmap sequence.  Hence, even if rbd_dev_unparent() caused
by flatten is delayed by in-flight requests, it will have finished by
the time we reach rbd_dev_unprobe() caused by unmap, thus turning
unconditional rbd_dev_parent_put() into a no-op.

Fixes: http://tracker.ceph.com/issues/10352

Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
---
 drivers/block/rbd.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 2990a1c75159..b85d52005a21 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -5075,10 +5075,7 @@ static void rbd_dev_unprobe(struct rbd_device *rbd_dev)
 {
 	struct rbd_image_header	*header;
 
-	/* Drop parent reference unless it's already been done (or none) */
-
-	if (rbd_dev->parent_overlap)
-		rbd_dev_parent_put(rbd_dev);
+	rbd_dev_parent_put(rbd_dev);
 
 	/* Free dynamic fields from the header, then zero it out */
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] rbd: do not treat standalone as flatten
  2015-01-20 12:41 [PATCH 0/3] rbd: parent_overlap == 0 fixes Ilya Dryomov
  2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
  2015-01-20 12:41 ` [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally Ilya Dryomov
@ 2015-01-20 12:41 ` Ilya Dryomov
  2015-01-21  1:25   ` Josh Durgin
  2015-01-27  3:55   ` Alex Elder
  2 siblings, 2 replies; 10+ messages in thread
From: Ilya Dryomov @ 2015-01-20 12:41 UTC (permalink / raw)
  To: ceph-devel; +Cc: Alex Elder

If the clone is resized down to 0, it becomes standalone.  If such
resize is carried over while an image is mapped we would detect this
and call rbd_dev_parent_put() which means "let go of all parent state,
including the spec(s) of parent images(s)".  This leads to a mismatch
between "rbd info" and sysfs parent fields, so a fix is in order.

    # rbd create --image-format 2 --size 1 foo
    # rbd snap create foo@snap
    # rbd snap protect foo@snap
    # rbd clone foo@snap bar
    # DEV=$(rbd map bar)
    # rbd resize --allow-shrink --size 0 bar
    # rbd resize --size 1 bar
    # rbd info bar | grep parent
            parent: rbd/foo@snap

Before:

    # cat /sys/bus/rbd/devices/0/parent
    (no parent image)

After:

    # cat /sys/bus/rbd/devices/0/parent
    pool_id 0
    pool_name rbd
    image_id 10056b8b4567
    image_name foo
    snap_id 2
    snap_name snap
    overlap 0

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
---
 drivers/block/rbd.c | 30 ++++++++++--------------------
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index b85d52005a21..e818c2a6ffb1 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4273,32 +4273,22 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
 	}
 
 	/*
-	 * We always update the parent overlap.  If it's zero we
-	 * treat it specially.
+	 * We always update the parent overlap.  If it's zero we issue
+	 * a warning, as we will proceed as if there was no parent.
 	 */
-	rbd_dev->parent_overlap = overlap;
 	if (!overlap) {
-
-		/* A null parent_spec indicates it's the initial probe */
-
 		if (parent_spec) {
-			/*
-			 * The overlap has become zero, so the clone
-			 * must have been resized down to 0 at some
-			 * point.  Treat this the same as a flatten.
-			 */
-			rbd_dev_parent_put(rbd_dev);
-			pr_info("%s: clone image now standalone\n",
-				rbd_dev->disk->disk_name);
+			/* refresh, careful to warn just once */
+			if (rbd_dev->parent_overlap)
+				rbd_warn(rbd_dev,
+				    "clone now standalone (overlap became 0)");
 		} else {
-			/*
-			 * For the initial probe, if we find the
-			 * overlap is zero we just pretend there was
-			 * no parent image.
-			 */
-			rbd_warn(rbd_dev, "ignoring parent with overlap 0");
+			/* initial probe */
+			rbd_warn(rbd_dev, "clone is standalone (overlap 0)");
 		}
 	}
+	rbd_dev->parent_overlap = overlap;
+
 out:
 	ret = 0;
 out_err:
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0
  2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
@ 2015-01-21  1:22   ` Josh Durgin
  2015-01-27  3:14   ` Alex Elder
  1 sibling, 0 replies; 10+ messages in thread
From: Josh Durgin @ 2015-01-21  1:22 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Alex Elder

On 01/20/2015 04:41 AM, Ilya Dryomov wrote:
> The comment for rbd_dev_parent_get() said
>
>      * We must get the reference before checking for the overlap to
>      * coordinate properly with zeroing the parent overlap in
>      * rbd_dev_v2_parent_info() when an image gets flattened.  We
>      * drop it again if there is no overlap.
>
> but the "drop it again if there is no overlap" part was missing from
> the implementation.  This lead to absurd parent_ref values for images
> with parent_overlap == 0, as parent_ref was incremented for each
> img_request and virtually never decremented.
>
> Fix this by leveraging the fact that refresh path calls
> rbd_dev_v2_parent_info() under header_rwsem and use it for read in
> rbd_dev_parent_get(), instead of messing around with atomics.  Get rid
> of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
> they'd pair with now and I suspect we are in a pretty miserable
> situation as far as proper locking goes regardless.

Yeah, looks like we need some refactoring to read parent_overlap safely
in the I/O path in a few places.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

> Cc: stable@vger.kernel.org # 3.11+
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
> ---
>   drivers/block/rbd.c | 20 ++++++--------------
>   1 file changed, 6 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 31fa00f0d707..2990a1c75159 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -2098,32 +2098,26 @@ static void rbd_dev_parent_put(struct rbd_device *rbd_dev)
>    * If an image has a non-zero parent overlap, get a reference to its
>    * parent.
>    *
> - * We must get the reference before checking for the overlap to
> - * coordinate properly with zeroing the parent overlap in
> - * rbd_dev_v2_parent_info() when an image gets flattened.  We
> - * drop it again if there is no overlap.
> - *
>    * Returns true if the rbd device has a parent with a non-zero
>    * overlap and a reference for it was successfully taken, or
>    * false otherwise.
>    */
>   static bool rbd_dev_parent_get(struct rbd_device *rbd_dev)
>   {
> -	int counter;
> +	int counter = 0;
>
>   	if (!rbd_dev->parent_spec)
>   		return false;
>
> -	counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
> -	if (counter > 0 && rbd_dev->parent_overlap)
> -		return true;
> -
> -	/* Image was flattened, but parent is not yet torn down */
> +	down_read(&rbd_dev->header_rwsem);
> +	if (rbd_dev->parent_overlap)
> +		counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
> +	up_read(&rbd_dev->header_rwsem);
>
>   	if (counter < 0)
>   		rbd_warn(rbd_dev, "parent reference overflow");
>
> -	return false;
> +	return counter > 0;
>   }
>
>   /*
> @@ -4238,7 +4232,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
>   		 */
>   		if (rbd_dev->parent_overlap) {
>   			rbd_dev->parent_overlap = 0;
> -			smp_mb();
>   			rbd_dev_parent_put(rbd_dev);
>   			pr_info("%s: clone image has been flattened\n",
>   				rbd_dev->disk->disk_name);
> @@ -4284,7 +4277,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
>   	 * treat it specially.
>   	 */
>   	rbd_dev->parent_overlap = overlap;
> -	smp_mb();
>   	if (!overlap) {
>
>   		/* A null parent_spec indicates it's the initial probe */
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
  2015-01-20 12:41 ` [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally Ilya Dryomov
@ 2015-01-21  1:24   ` Josh Durgin
  2015-01-27  3:40   ` Alex Elder
  1 sibling, 0 replies; 10+ messages in thread
From: Josh Durgin @ 2015-01-21  1:24 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Alex Elder

On 01/20/2015 04:41 AM, Ilya Dryomov wrote:
> This effectively reverts the last hunk of 392a9dad7e77 ("rbd: detect
> when clone image is flattened").
>
> The problem with parent_overlap != 0 condition is that it's possible
> and completely valid to have an image with parent_overlap == 0 whose
> parent state needs to be cleaned up on unmap.  The next commit, which
> drops the "clone image now standalone" logic, opens up another window
> of opportunity to hit this, but even without it
>
>      # cat parent-ref.sh
>      #!/bin/bash
>      rbd create --image-format 2 --size 1 foo
>      rbd snap create foo@snap
>      rbd snap protect foo@snap
>      rbd clone foo@snap bar
>      rbd resize --allow-shrink --size 0 bar
>      rbd resize --size 1 bar
>      DEV=$(rbd map bar)
>      rbd unmap $DEV
>
> leaves rbd_device/rbd_spec/etc and rbd_client along with ceph_client
> hanging around.
>
> My thinking behind calling rbd_dev_parent_put() unconditionally is that
> there shouldn't be any requests in flight at that point in time as we
> are deep into unmap sequence.  Hence, even if rbd_dev_unparent() caused
> by flatten is delayed by in-flight requests, it will have finished by
> the time we reach rbd_dev_unprobe() caused by unmap, thus turning
> unconditional rbd_dev_parent_put() into a no-op.
>
> Fixes: http://tracker.ceph.com/issues/10352
>
> Cc: stable@vger.kernel.org # 3.11+
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

Reviewed-by: Josh Durgin <jdurgin@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] rbd: do not treat standalone as flatten
  2015-01-20 12:41 ` [PATCH 3/3] rbd: do not treat standalone as flatten Ilya Dryomov
@ 2015-01-21  1:25   ` Josh Durgin
  2015-01-27  3:55   ` Alex Elder
  1 sibling, 0 replies; 10+ messages in thread
From: Josh Durgin @ 2015-01-21  1:25 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Alex Elder

On 01/20/2015 04:41 AM, Ilya Dryomov wrote:
> If the clone is resized down to 0, it becomes standalone.  If such
> resize is carried over while an image is mapped we would detect this
> and call rbd_dev_parent_put() which means "let go of all parent state,
> including the spec(s) of parent images(s)".  This leads to a mismatch
> between "rbd info" and sysfs parent fields, so a fix is in order.
>
>      # rbd create --image-format 2 --size 1 foo
>      # rbd snap create foo@snap
>      # rbd snap protect foo@snap
>      # rbd clone foo@snap bar
>      # DEV=$(rbd map bar)
>      # rbd resize --allow-shrink --size 0 bar
>      # rbd resize --size 1 bar
>      # rbd info bar | grep parent
>              parent: rbd/foo@snap
>
> Before:
>
>      # cat /sys/bus/rbd/devices/0/parent
>      (no parent image)
>
> After:
>
>      # cat /sys/bus/rbd/devices/0/parent
>      pool_id 0
>      pool_name rbd
>      image_id 10056b8b4567
>      image_name foo
>      snap_id 2
>      snap_name snap
>      overlap 0
>
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

Reviewed-by: Josh Durgin <jdurgin@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0
  2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
  2015-01-21  1:22   ` Josh Durgin
@ 2015-01-27  3:14   ` Alex Elder
  1 sibling, 0 replies; 10+ messages in thread
From: Alex Elder @ 2015-01-27  3:14 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Alex Elder

On 01/20/2015 06:41 AM, Ilya Dryomov wrote:
> The comment for rbd_dev_parent_get() said
> 
>     * We must get the reference before checking for the overlap to
>     * coordinate properly with zeroing the parent overlap in
>     * rbd_dev_v2_parent_info() when an image gets flattened.  We
>     * drop it again if there is no overlap.
> 
> but the "drop it again if there is no overlap" part was missing from
> the implementation.  This lead to absurd parent_ref values for images
> with parent_overlap == 0, as parent_ref was incremented for each
> img_request and virtually never decremented.

You're right about this.  If the image had a parent with no
overlap this would leak a reference to the parent image.  The
code should have said:

	counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
	if (counter > 0) {
		if (rbd_dev->parent_overlap)
			return true;
		atomic_dec(&rbd_dev->parent_ref);
	} else if (counter < 0) {
 		rbd_warn(rbd_dev, "parent reference overflow");
	}

> Fix this by leveraging the fact that refresh path calls
> rbd_dev_v2_parent_info() under header_rwsem and use it for read in
> rbd_dev_parent_get(), instead of messing around with atomics.  Get rid
> of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
> they'd pair with now and I suspect we are in a pretty miserable
> situation as far as proper locking goes regardless.

The point of the memory barrier was to ensure that when parent_overlap
gets zeroed, this code sees the zero rather than the old non-zero
value.  The atomic_inc_return_safe() call has an implicit memory
barrier to match the smp_mb() call.  It allowed the synchronization
to occur without the use of a lock.

We're trying to atomically determine whether an image request needs
to be marked as layered, to know how to handle ENOENT on parent reads.
If it is a write to an image with a parent having a non-zero overlap,
it's layered, otherwise we can treat it as a simple request.

I think in this particular case, this is just an optimization,
trying very hard to avoid having to do layered image handling
if the parent has become flattened.  I think that even if it
got old information (suggesting non-zero overlap) things would
behave correctly, just less efficiently.

Using the semaphore adds a lock to this path and therefore
implements whatever barriers are being removed.  I'm not
sure how often this is hit--maybe the optimization isn't
buying much after all.

I am getting a little rusty on some of details of what
precisely happens when a layered image gets flattened.
But I think this looks OK.  Maybe just watch for small
(perhaps insignificant) performance regressions with
this change in place...

Reviewed-by: Alex Elder <elder@linaro.org>

> Cc: stable@vger.kernel.org # 3.11+
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
> ---
>  drivers/block/rbd.c | 20 ++++++--------------
>  1 file changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 31fa00f0d707..2990a1c75159 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -2098,32 +2098,26 @@ static void rbd_dev_parent_put(struct rbd_device *rbd_dev)
>   * If an image has a non-zero parent overlap, get a reference to its
>   * parent.
>   *
> - * We must get the reference before checking for the overlap to
> - * coordinate properly with zeroing the parent overlap in
> - * rbd_dev_v2_parent_info() when an image gets flattened.  We
> - * drop it again if there is no overlap.
> - *
>   * Returns true if the rbd device has a parent with a non-zero
>   * overlap and a reference for it was successfully taken, or
>   * false otherwise.
>   */
>  static bool rbd_dev_parent_get(struct rbd_device *rbd_dev)
>  {
> -	int counter;
> +	int counter = 0;
>  
>  	if (!rbd_dev->parent_spec)
>  		return false;
>  
> -	counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
> -	if (counter > 0 && rbd_dev->parent_overlap)
> -		return true;
> -
> -	/* Image was flattened, but parent is not yet torn down */
> +	down_read(&rbd_dev->header_rwsem);
> +	if (rbd_dev->parent_overlap)
> +		counter = atomic_inc_return_safe(&rbd_dev->parent_ref);
> +	up_read(&rbd_dev->header_rwsem);
>  
>  	if (counter < 0)
>  		rbd_warn(rbd_dev, "parent reference overflow");
>  
> -	return false;
> +	return counter > 0;
>  }
>  
>  /*
> @@ -4238,7 +4232,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
>  		 */
>  		if (rbd_dev->parent_overlap) {
>  			rbd_dev->parent_overlap = 0;
> -			smp_mb();
>  			rbd_dev_parent_put(rbd_dev);
>  			pr_info("%s: clone image has been flattened\n",
>  				rbd_dev->disk->disk_name);
> @@ -4284,7 +4277,6 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
>  	 * treat it specially.
>  	 */
>  	rbd_dev->parent_overlap = overlap;
> -	smp_mb();
>  	if (!overlap) {
>  
>  		/* A null parent_spec indicates it's the initial probe */
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
  2015-01-20 12:41 ` [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally Ilya Dryomov
  2015-01-21  1:24   ` Josh Durgin
@ 2015-01-27  3:40   ` Alex Elder
  1 sibling, 0 replies; 10+ messages in thread
From: Alex Elder @ 2015-01-27  3:40 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Alex Elder

On 01/20/2015 06:41 AM, Ilya Dryomov wrote:
> This effectively reverts the last hunk of 392a9dad7e77 ("rbd: detect
> when clone image is flattened").
> 
> The problem with parent_overlap != 0 condition is that it's possible
> and completely valid to have an image with parent_overlap == 0 whose
> parent state needs to be cleaned up on unmap.  The next commit, which
> drops the "clone image now standalone" logic, opens up another window
> of opportunity to hit this, but even without it
> 
>     # cat parent-ref.sh
>     #!/bin/bash
>     rbd create --image-format 2 --size 1 foo
>     rbd snap create foo@snap
>     rbd snap protect foo@snap
>     rbd clone foo@snap bar
>     rbd resize --allow-shrink --size 0 bar
>     rbd resize --size 1 bar
>     DEV=$(rbd map bar)
>     rbd unmap $DEV
> 
> leaves rbd_device/rbd_spec/etc and rbd_client along with ceph_client
> hanging around.

I'm not sure why the last reference to the parent
doesn't get dropped (and state cleaned up) as soon
as the overlap becomes 0.  I suspect it's the original
reference taken when there's a parent, we don't get
rid of it until it's torn down.  (I think we should.)

It seems to me the test here should be for a non-null
parent_spec pointer rather than non-zero parent_overlap.
And that's done inside rbd_dev_parent_put(), so your
change looks reasonable to me.

Reviewed-by: Alex Elder <elder@linaro.org>

> 
> My thinking behind calling rbd_dev_parent_put() unconditionally is that
> there shouldn't be any requests in flight at that point in time as we
> are deep into unmap sequence.  Hence, even if rbd_dev_unparent() caused
> by flatten is delayed by in-flight requests, it will have finished by
> the time we reach rbd_dev_unprobe() caused by unmap, thus turning
> unconditional rbd_dev_parent_put() into a no-op.
> 
> Fixes: http://tracker.ceph.com/issues/10352
> 
> Cc: stable@vger.kernel.org # 3.11+
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
> ---
>  drivers/block/rbd.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 2990a1c75159..b85d52005a21 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -5075,10 +5075,7 @@ static void rbd_dev_unprobe(struct rbd_device *rbd_dev)
>  {
>  	struct rbd_image_header	*header;
>  
> -	/* Drop parent reference unless it's already been done (or none) */
> -
> -	if (rbd_dev->parent_overlap)
> -		rbd_dev_parent_put(rbd_dev);
> +	rbd_dev_parent_put(rbd_dev);
>  
>  	/* Free dynamic fields from the header, then zero it out */
>  
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] rbd: do not treat standalone as flatten
  2015-01-20 12:41 ` [PATCH 3/3] rbd: do not treat standalone as flatten Ilya Dryomov
  2015-01-21  1:25   ` Josh Durgin
@ 2015-01-27  3:55   ` Alex Elder
  1 sibling, 0 replies; 10+ messages in thread
From: Alex Elder @ 2015-01-27  3:55 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel; +Cc: Alex Elder

On 01/20/2015 06:41 AM, Ilya Dryomov wrote:
> If the clone is resized down to 0, it becomes standalone.  If such
> resize is carried over while an image is mapped we would detect this
> and call rbd_dev_parent_put() which means "let go of all parent state,
> including the spec(s) of parent images(s)".  This leads to a mismatch
> between "rbd info" and sysfs parent fields, so a fix is in order.
> 
>     # rbd create --image-format 2 --size 1 foo
>     # rbd snap create foo@snap
>     # rbd snap protect foo@snap
>     # rbd clone foo@snap bar
>     # DEV=$(rbd map bar)
>     # rbd resize --allow-shrink --size 0 bar
>     # rbd resize --size 1 bar
>     # rbd info bar | grep parent
>             parent: rbd/foo@snap
> 
> Before:
> 
>     # cat /sys/bus/rbd/devices/0/parent
>     (no parent image)
> 
> After:
> 
>     # cat /sys/bus/rbd/devices/0/parent
>     pool_id 0
>     pool_name rbd
>     image_id 10056b8b4567
>     image_name foo
>     snap_id 2
>     snap_name snap
>     overlap 0
> 
> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

Hmm.  Interesting.

I think that a parent with an overlap of 0 is of no
real use.  So in the last patch I was suggesting it
should just go away.

But now, looking at it from this perspective, the fact
that an image *came from* a particular parent, but which
has no more overlap, could be useful information.  The
parent shouldn't simply go away without the user requesting
that.

I haven't completely followed through the logic of keeping
the reference around but I understand what you're doing and
it looks OK to me.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>  drivers/block/rbd.c | 30 ++++++++++--------------------
>  1 file changed, 10 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index b85d52005a21..e818c2a6ffb1 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -4273,32 +4273,22 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev)
>  	}
>  
>  	/*
> -	 * We always update the parent overlap.  If it's zero we
> -	 * treat it specially.
> +	 * We always update the parent overlap.  If it's zero we issue
> +	 * a warning, as we will proceed as if there was no parent.
>  	 */
> -	rbd_dev->parent_overlap = overlap;
>  	if (!overlap) {
> -
> -		/* A null parent_spec indicates it's the initial probe */
> -
>  		if (parent_spec) {
> -			/*
> -			 * The overlap has become zero, so the clone
> -			 * must have been resized down to 0 at some
> -			 * point.  Treat this the same as a flatten.
> -			 */
> -			rbd_dev_parent_put(rbd_dev);
> -			pr_info("%s: clone image now standalone\n",
> -				rbd_dev->disk->disk_name);
> +			/* refresh, careful to warn just once */
> +			if (rbd_dev->parent_overlap)
> +				rbd_warn(rbd_dev,
> +				    "clone now standalone (overlap became 0)");
>  		} else {
> -			/*
> -			 * For the initial probe, if we find the
> -			 * overlap is zero we just pretend there was
> -			 * no parent image.
> -			 */
> -			rbd_warn(rbd_dev, "ignoring parent with overlap 0");
> +			/* initial probe */
> +			rbd_warn(rbd_dev, "clone is standalone (overlap 0)");
>  		}
>  	}
> +	rbd_dev->parent_overlap = overlap;
> +
>  out:
>  	ret = 0;
>  out_err:
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-01-27  3:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-20 12:41 [PATCH 0/3] rbd: parent_overlap == 0 fixes Ilya Dryomov
2015-01-20 12:41 ` [PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap == 0 Ilya Dryomov
2015-01-21  1:22   ` Josh Durgin
2015-01-27  3:14   ` Alex Elder
2015-01-20 12:41 ` [PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally Ilya Dryomov
2015-01-21  1:24   ` Josh Durgin
2015-01-27  3:40   ` Alex Elder
2015-01-20 12:41 ` [PATCH 3/3] rbd: do not treat standalone as flatten Ilya Dryomov
2015-01-21  1:25   ` Josh Durgin
2015-01-27  3:55   ` Alex Elder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.