All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 0/1] block: pass the right options for BlockDriver.bdrv_open()
@ 2017-03-29  1:16 Dong Jia Shi
  2017-03-29  1:16 ` [Qemu-devel] [PATCH v3 1/1] " Dong Jia Shi
  0 siblings, 1 reply; 4+ messages in thread
From: Dong Jia Shi @ 2017-03-29  1:16 UTC (permalink / raw)
  To: qemu-block, kwolf, mreitz
  Cc: qemu-devel, bjsdjshi, cornelia.huck, borntraeger

Trying to restore rbd image on ceph cluster from snapshot with
qemu-img could trigger a calling to raw_open with a NULL @options,
and that will lead to a failure of the snapshot applying.

[root@s8345007 ~]# gdb --args qemu-img snapshot -a snap1 rbd:test_pool/dj_image
... ...
Program received signal SIGSEGV, Segmentation fault.
0x00000000801395a8 in qdict_next_entry (qdict=0x0, first_bucket=0) at qobject/qdict.c:327
327	        if (!QLIST_EMPTY(&qdict->table[i])) {
(gdb) bt
#0  0x00000000801395a8 in qdict_next_entry (qdict=0x0, first_bucket=0) at qobject/qdict.c:327
#1  0x0000000080139626 in qdict_first (qdict=0x0) at qobject/qdict.c:340
#2  0x000000008013a00c in qdict_extract_subqdict (src=0x0, dst=0x3ffffffec50, start=0x80698260 "file.")
    at qobject/qdict.c:576
#3  0x0000000080019c26 in bdrv_open_child_bs (filename=0x0, options=0x0, bdref_key=0x8017ab38 "file", 
    parent=0x80630300, child_role=0x80176108 <child_file>, allow_none=false, errp=0x0) at block.c:2018
#4  0x0000000080019dfa in bdrv_open_child (filename=0x0, options=0x0, bdref_key=0x8017ab38 "file", 
    parent=0x80630300, child_role=0x80176108 <child_file>, allow_none=false, errp=0x0) at block.c:2065
#5  0x000000008002b9a0 in raw_open (bs=0x80630300, options=0x0, flags=8194, errp=0x0) at block/raw-format.c:387
#6  0x0000000080087516 in bdrv_snapshot_goto (bs=0x80630300, snapshot_id=0x3fffffff75c "snap1")
    at block/snapshot.c:194
#7  0x0000000080010b8c in img_snapshot (argc=4, argv=0x3fffffff4c0) at qemu-img.c:2937
#8  0x00000000800140e4 in main (argc=4, argv=0x3fffffff4c0) at qemu-img.c:4373

The problematic code is /block/snapshot.c:194:
178 int bdrv_snapshot_goto(BlockDriverState *bs,
179                        const char *snapshot_id)
180 {
181     BlockDriver *drv = bs->drv;
182     int ret, open_ret;
183 
184     if (!drv) {
185         return -ENOMEDIUM;
186     }
187     if (drv->bdrv_snapshot_goto) {
188         return drv->bdrv_snapshot_goto(bs, snapshot_id);
189     }
190 
191     if (bs->file) {
192         drv->bdrv_close(bs);
193         ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
194         open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
195         if (open_ret < 0) {
196             bdrv_unref(bs->file->bs);
197             bs->drv = NULL;
198             return open_ret;
199         }
200         return ret;
201     }
202 
203     return -ENOTSUP;
204 }

After a chat with Kevin, my understanding is that's because NULL
@options is not a valid parameter value for the bdrv_open callback of
the BlockDriver.
We shoule prepare a @options by adding the "file" key-value pair to
the actual options that were given for the node (i.e. bs->options),
and pass it to the callback.

Then Max pointed out that we're not guaranteed that "file" is a
nested QDict. The following commands will both result in segfaults:
$ ./qemu-img snapshot -a foo \
"json:{'driver':'raw','file':{'driver':'qcow2','file':{'driver':'file','filename':'foo.qcow2'}}}"
$ ./qemu-img snapshot -a foo --image-opts \
driver=raw,file.driver=qcow2,file.file.driver=file,file.file.filename=foo.qcow2

Max also proposed for the previous patch (v2) to:
(1) Remove every option in "options" that has a "file." prefix before
the qdict_put() call.
(2) Use bdrv_unref_child(bs, bs->file) instead of bdrv_unref(bs->file->bs).

So I adopted Max's suggestions, and here is the new patch.

Dong Jia Shi (1):
  block: pass the right options for BlockDriver.bdrv_open()

 block/snapshot.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

-- 
2.8.4

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PATCH v3 1/1] block: pass the right options for BlockDriver.bdrv_open()
  2017-03-29  1:16 [Qemu-devel] [PATCH v3 0/1] block: pass the right options for BlockDriver.bdrv_open() Dong Jia Shi
@ 2017-03-29  1:16 ` Dong Jia Shi
  2017-03-29 21:07   ` Max Reitz
  0 siblings, 1 reply; 4+ messages in thread
From: Dong Jia Shi @ 2017-03-29  1:16 UTC (permalink / raw)
  To: qemu-block, kwolf, mreitz
  Cc: qemu-devel, bjsdjshi, cornelia.huck, borntraeger

raw_open() expects the caller always passing in the right actual
@options parameter. But when trying to applying snapshot on a RBD
image, bdrv_snapshot_goto() calls raw_open() (by calling the
bdrv_open callback on the BlockDriver) with a NULL @options, and
that will result in a Segmentation fault.

For the other non-raw format drivers, it also makes sense to passing
in the actual options, althought they don't trigger the problem so
far.

Let's prepare a @options by adding the "file" key-value pair to a
copy of the actual options that were given for the node (i.e.
bs->options), and pass it to the callback.

BlockDriver.bdrv_open() expects bs->file to be NULL and just
overwrites it with the result from bdrv_open_child(). If that
bdrv_open_child() fails, the field becomes NULL. While we are at
it, we also correct the cleanning up action for a call failure of
BlockDriver.bdrv_open() by replacing bdrv_unref() with
bdrv_unref_child().

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
---
 block/snapshot.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index bf5c2ca..281626c 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -27,6 +27,7 @@
 #include "block/block_int.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qstring.h"
 
 QemuOptsList internal_snapshot_opts = {
     .name = "snapshot",
@@ -189,11 +190,20 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
     }
 
     if (bs->file) {
+        QDict *options = qdict_clone_shallow(bs->options);
+        QDict *file_options;
+
+        qdict_extract_subqdict(options, &file_options, "file.");
+        QDECREF(file_options);
+        qdict_put(options, "file",
+                  qstring_from_str(bdrv_get_node_name(bs->file->bs)));
+
         drv->bdrv_close(bs);
         ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
-        open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
+        open_ret = drv->bdrv_open(bs, options, bs->open_flags, NULL);
+        QDECREF(options);
         if (open_ret < 0) {
-            bdrv_unref(bs->file->bs);
+            bdrv_unref_child(bs, bs->file);
             bs->drv = NULL;
             return open_ret;
         }
-- 
2.8.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/1] block: pass the right options for BlockDriver.bdrv_open()
  2017-03-29  1:16 ` [Qemu-devel] [PATCH v3 1/1] " Dong Jia Shi
@ 2017-03-29 21:07   ` Max Reitz
  2017-04-05  2:28     ` Dong Jia Shi
  0 siblings, 1 reply; 4+ messages in thread
From: Max Reitz @ 2017-03-29 21:07 UTC (permalink / raw)
  To: Dong Jia Shi, qemu-block, kwolf; +Cc: qemu-devel, cornelia.huck, borntraeger

[-- Attachment #1: Type: text/plain, Size: 3877 bytes --]

On 29.03.2017 03:16, Dong Jia Shi wrote:
> raw_open() expects the caller always passing in the right actual
> @options parameter. But when trying to applying snapshot on a RBD
> image, bdrv_snapshot_goto() calls raw_open() (by calling the
> bdrv_open callback on the BlockDriver) with a NULL @options, and
> that will result in a Segmentation fault.
> 
> For the other non-raw format drivers, it also makes sense to passing
> in the actual options, althought they don't trigger the problem so
> far.
> 
> Let's prepare a @options by adding the "file" key-value pair to a
> copy of the actual options that were given for the node (i.e.
> bs->options), and pass it to the callback.
> 
> BlockDriver.bdrv_open() expects bs->file to be NULL and just
> overwrites it with the result from bdrv_open_child(). If that
> bdrv_open_child() fails, the field becomes NULL. While we are at
> it, we also correct the cleanning up action for a call failure of
> BlockDriver.bdrv_open() by replacing bdrv_unref() with
> bdrv_unref_child().
> 
> Suggested-by: Max Reitz <mreitz@redhat.com>
> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> ---
>  block/snapshot.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/block/snapshot.c b/block/snapshot.c
> index bf5c2ca..281626c 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -27,6 +27,7 @@
>  #include "block/block_int.h"
>  #include "qapi/error.h"
>  #include "qapi/qmp/qerror.h"
> +#include "qapi/qmp/qstring.h"
>  
>  QemuOptsList internal_snapshot_opts = {
>      .name = "snapshot",
> @@ -189,11 +190,20 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
>      }
>  
>      if (bs->file) {
> +        QDict *options = qdict_clone_shallow(bs->options);
> +        QDict *file_options;
> +
> +        qdict_extract_subqdict(options, &file_options, "file.");
> +        QDECREF(file_options);
> +        qdict_put(options, "file",
> +                  qstring_from_str(bdrv_get_node_name(bs->file->bs)));
> +
>          drv->bdrv_close(bs);
>          ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
> -        open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
> +        open_ret = drv->bdrv_open(bs, options, bs->open_flags, NULL);
> +        QDECREF(options);
>          if (open_ret < 0) {
> -            bdrv_unref(bs->file->bs);
> +            bdrv_unref_child(bs, bs->file);
>              bs->drv = NULL;
>              return open_ret;
>          }

I just noticed another issue (sorry I did not before...):

In drv->bdrv_open(), the block driver will generally overwrite bs->file
without looking at it because it assumes that it's NULL. That means we
should probably actually make sure it's NULL because otherwise we will
the child BDS will have a reference count that is 1 too high.

(bdrv_open_inherit() (ultimately called from bdrv_open_child()) invokes
bdrv_ref() for a child BDS specified by node-name reference.)

That means we should unconditionally invoke
bdrv_unref_child(bs, bs->file) before calling drv->bdrv_open(). But we
have to wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
deleted in the meantime.

So I think it should look something like this:

BlockDriverState *file;
QDict *options = ...;
QDict *file_options;

file = bs->file->bs;
/* Prevent it from getting deleted when detached from bs */
bdrv_ref(file);

qdict_extract_subqdict(...);
QDECREF(file_options);
qdict_put(..., qstring_from_str(bdrv_get_node_name(file)));

drv->bdrv_close(bs);
bdrv_unref_child(bs, bs->file);
bs->file = NULL;

ret = bdrv_snapshot_goto(file, snapshot_id);
open_ret = drv->bdrv_open(...);
if (open_ret < 0) {
    bdrv_unref(file);
    bs->drv = NULL;
    return open_ret;
}

assert(bs->file->bs == file);
bdrv_unref(file);


Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 512 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/1] block: pass the right options for BlockDriver.bdrv_open()
  2017-03-29 21:07   ` Max Reitz
@ 2017-04-05  2:28     ` Dong Jia Shi
  0 siblings, 0 replies; 4+ messages in thread
From: Dong Jia Shi @ 2017-04-05  2:28 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dong Jia Shi, qemu-block, kwolf, qemu-devel, cornelia.huck, borntraeger

* Max Reitz <mreitz@redhat.com> [2017-03-29 23:07:22 +0200]:

> On 29.03.2017 03:16, Dong Jia Shi wrote:
> > raw_open() expects the caller always passing in the right actual
> > @options parameter. But when trying to applying snapshot on a RBD
> > image, bdrv_snapshot_goto() calls raw_open() (by calling the
> > bdrv_open callback on the BlockDriver) with a NULL @options, and
> > that will result in a Segmentation fault.
> > 
> > For the other non-raw format drivers, it also makes sense to passing
> > in the actual options, althought they don't trigger the problem so
> > far.
> > 
> > Let's prepare a @options by adding the "file" key-value pair to a
> > copy of the actual options that were given for the node (i.e.
> > bs->options), and pass it to the callback.
> > 
> > BlockDriver.bdrv_open() expects bs->file to be NULL and just
> > overwrites it with the result from bdrv_open_child(). If that
> > bdrv_open_child() fails, the field becomes NULL. While we are at
> > it, we also correct the cleanning up action for a call failure of
> > BlockDriver.bdrv_open() by replacing bdrv_unref() with
> > bdrv_unref_child().
> > 
> > Suggested-by: Max Reitz <mreitz@redhat.com>
> > Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> > ---
> >  block/snapshot.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/snapshot.c b/block/snapshot.c
> > index bf5c2ca..281626c 100644
> > --- a/block/snapshot.c
> > +++ b/block/snapshot.c
> > @@ -27,6 +27,7 @@
> >  #include "block/block_int.h"
> >  #include "qapi/error.h"
> >  #include "qapi/qmp/qerror.h"
> > +#include "qapi/qmp/qstring.h"
> >  
> >  QemuOptsList internal_snapshot_opts = {
> >      .name = "snapshot",
> > @@ -189,11 +190,20 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
> >      }
> >  
> >      if (bs->file) {
> > +        QDict *options = qdict_clone_shallow(bs->options);
> > +        QDict *file_options;
> > +
> > +        qdict_extract_subqdict(options, &file_options, "file.");
> > +        QDECREF(file_options);
> > +        qdict_put(options, "file",
> > +                  qstring_from_str(bdrv_get_node_name(bs->file->bs)));
> > +
> >          drv->bdrv_close(bs);
> >          ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
> > -        open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
> > +        open_ret = drv->bdrv_open(bs, options, bs->open_flags, NULL);
> > +        QDECREF(options);
> >          if (open_ret < 0) {
> > -            bdrv_unref(bs->file->bs);
> > +            bdrv_unref_child(bs, bs->file);
> >              bs->drv = NULL;
> >              return open_ret;
> >          }
> 
> I just noticed another issue (sorry I did not before...):
Hi Max,
No need for sorry.

> 
> In drv->bdrv_open(), the block driver will generally overwrite bs->file
> without looking at it because it assumes that it's NULL. That means we
> should probably actually make sure it's NULL because otherwise we will
> the child BDS will have a reference count that is 1 too high.
> 
> (bdrv_open_inherit() (ultimately called from bdrv_open_child()) invokes
> bdrv_ref() for a child BDS specified by node-name reference.)
> 
> That means we should unconditionally invoke
> bdrv_unref_child(bs, bs->file) before calling drv->bdrv_open(). But we
> have to wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
> deleted in the meantime.
Understood.

> 
> So I think it should look something like this:
> 
> BlockDriverState *file;
> QDict *options = ...;
> QDict *file_options;
> 
> file = bs->file->bs;
> /* Prevent it from getting deleted when detached from bs */
> bdrv_ref(file);
> 
> qdict_extract_subqdict(...);
> QDECREF(file_options);
> qdict_put(..., qstring_from_str(bdrv_get_node_name(file)));
> 
> drv->bdrv_close(bs);
> bdrv_unref_child(bs, bs->file);
> bs->file = NULL;
> 
> ret = bdrv_snapshot_goto(file, snapshot_id);
> open_ret = drv->bdrv_open(...);
Here, I think we'd still need a:
QDECREF(options);

> if (open_ret < 0) {
>     bdrv_unref(file);
>     bs->drv = NULL;
>     return open_ret;
> }
> 
> assert(bs->file->bs == file);
> bdrv_unref(file);

It looks convoluted, but I don't see a better solution. So I will
prepare a new patch based on your code.

Thanks for the help!

> 
> 
> Max
> 

-- 
Dong Jia Shi

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-04-05  2:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-29  1:16 [Qemu-devel] [PATCH v3 0/1] block: pass the right options for BlockDriver.bdrv_open() Dong Jia Shi
2017-03-29  1:16 ` [Qemu-devel] [PATCH v3 1/1] " Dong Jia Shi
2017-03-29 21:07   ` Max Reitz
2017-04-05  2:28     ` Dong Jia Shi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.