* [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
@ 2015-01-29 16:25 Kashyap Chamarthy
  2015-01-29 16:47 ` Richard W.M. Jones
  2015-01-30 17:15 ` Kevin Wolf
  0 siblings, 2 replies; 9+ messages in thread
From: Kashyap Chamarthy @ 2015-01-29 16:25 UTC (permalink / raw)
  To: qemu-devel

A simple reproducer below.

Export a disk image over NBD (I realize port 10809 is default, thought
I'd explicitly mention anyhow):

  $ qemu-nbd -f qcow2 -p10809 \
        /var/lib/libvirt/images/cirros-0.3.3-x86_64-disk.img -t


Create an overlay with backing file exported via NBD:

  $ qemu-img create -f qcow2 -F \
        nbd -o backing_file=nbd://localhost overlay1.qcow2
    Formatting 'overlay1.qcow2', fmt=qcow2 size=41126400 backing_file='nbd://localhost' backing_fmt='nbd' encryption=off cluster_size=65536 lazy_refcounts=off


Let's attempt to boot the overlay with a minimal QEMU:

  $ qemu-system-x86_64               \
     -nographic                      \
     -nodefconfig                    \
     -nodefaults                     \
     -m 2048                         \
     -device virtio-scsi-pci,id=scsi \
     -device virtio-serial-pci       \
     -serial stdio                   \
     -drive file=./overlay1.qcow2,format=qcow2,if=virtio,cache=writeback
  Segmentation fault (core dumped)


On the shell where `qemu-nbd` is running, I notice this

  nbd.c:nbd_receive_request():L756: read failed


Haven't investigated further with GDB, thought I'd bring it up here
first.


Versions
--------

  $ rpm -q qemu; uname -r
  qemu-2.1.2-7.fc21.x86_64
  3.17.8-300.fc21.x86_64

-- 
/kashyap


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-29 16:25 [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed Kashyap Chamarthy
@ 2015-01-29 16:47 ` Richard W.M. Jones
  2015-01-29 17:22   ` Kashyap Chamarthy
  2015-01-30 17:15 ` Kevin Wolf
  1 sibling, 1 reply; 9+ messages in thread
From: Richard W.M. Jones @ 2015-01-29 16:47 UTC (permalink / raw)
  To: Kashyap Chamarthy; +Cc: qemu-devel

On Thu, Jan 29, 2015 at 05:25:09PM +0100, Kashyap Chamarthy wrote:
> A simple reproducer below.
> 
> Export a disk image over NBD (I realize port 10809 is default, thought
> I'd explicitly mention anyhow):
> 
>   $ qemu-nbd -f qcow2 -p10809 \
>         /var/lib/libvirt/images/cirros-0.3.3-x86_64-disk.img -t
> 
> 
> Create an overlay with backing file exported via NBD:
> 
>   $ qemu-img create -f qcow2 -F \
>         nbd -o backing_file=nbd://localhost overlay1.qcow2
>     Formatting 'overlay1.qcow2', fmt=qcow2 size=41126400 backing_file='nbd://localhost' backing_fmt='nbd' encryption=off cluster_size=65536 lazy_refcounts=off
> 
> 
> Let's attempt to boot the overlay with a minimal QEMU:
> 
>   $ qemu-system-x86_64               \
>      -nographic                      \
>      -nodefconfig                    \
>      -nodefaults                     \
>      -m 2048                         \
>      -device virtio-scsi-pci,id=scsi \
>      -device virtio-serial-pci       \
>      -serial stdio                   \
>      -drive file=./overlay1.qcow2,format=qcow2,if=virtio,cache=writeback
>   Segmentation fault (core dumped)
> 
> 
> On the shell where `qemu-nbd` is running, I notice this
> 
>   nbd.c:nbd_receive_request():L756: read failed

This is a "normal error" -- it just means the client dropped the
connection.

You really need to get the stack trace from that core dump to
debug this further.

Rich.

> Haven't investigated further with GDB, thought I'd bring it up here
> first.
> 
> 
> Versions
> --------
> 
>   $ rpm -q qemu; uname -r
>   qemu-2.1.2-7.fc21.x86_64
>   3.17.8-300.fc21.x86_64
> 
> -- 
> /kashyap

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-29 16:47 ` Richard W.M. Jones
@ 2015-01-29 17:22   ` Kashyap Chamarthy
  2015-01-29 23:33     ` Kashyap Chamarthy
  0 siblings, 1 reply; 9+ messages in thread
From: Kashyap Chamarthy @ 2015-01-29 17:22 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel

On Thu, Jan 29, 2015 at 04:47:23PM +0000, Richard W.M. Jones wrote:
> On Thu, Jan 29, 2015 at 05:25:09PM +0100, Kashyap Chamarthy wrote:
> > A simple reproducer below.
> > 
> > Export a disk image over NBD (I realize port 10809 is default, thought
> > I'd explicitly mention anyhow):
> > 
> >   $ qemu-nbd -f qcow2 -p10809 \
> >         /var/lib/libvirt/images/cirros-0.3.3-x86_64-disk.img -t
> > 
> > 
> > Create an overlay with backing file exported via NBD:
> > 
> >   $ qemu-img create -f qcow2 -F \
> >         nbd -o backing_file=nbd://localhost overlay1.qcow2
> >     Formatting 'overlay1.qcow2', fmt=qcow2 size=41126400 backing_file='nbd://localhost' backing_fmt='nbd' encryption=off cluster_size=65536 lazy_refcounts=off
> > 
> > 
> > Let's attempt to boot the overlay with a minimal QEMU:
> > 
> >   $ qemu-system-x86_64               \
> >      -nographic                      \
> >      -nodefconfig                    \
> >      -nodefaults                     \
> >      -m 2048                         \
> >      -device virtio-scsi-pci,id=scsi \
> >      -device virtio-serial-pci       \
> >      -serial stdio                   \
> >      -drive file=./overlay1.qcow2,format=qcow2,if=virtio,cache=writeback
> >   Segmentation fault (core dumped)
> > 
> > 
> > On the shell where `qemu-nbd` is running, I notice this
> > 
> >   nbd.c:nbd_receive_request():L756: read failed
> 
> This is a "normal error" -- it just means the client dropped the
> connection.

Yeah, deduced so.
 
> You really need to get the stack trace from that core dump to
> debug this further.

I don't see the core dump locally (ABRT or some such not configured),
will re-test this with `gdb` in a little while to get the traces.

-- 
/kashyap


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-29 17:22   ` Kashyap Chamarthy
@ 2015-01-29 23:33     ` Kashyap Chamarthy
  0 siblings, 0 replies; 9+ messages in thread
From: Kashyap Chamarthy @ 2015-01-29 23:33 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel

On Thu, Jan 29, 2015 at 06:22:22PM +0100, Kashyap Chamarthy wrote:
> On Thu, Jan 29, 2015 at 04:47:23PM +0000, Richard W.M. Jones wrote:
> > On Thu, Jan 29, 2015 at 05:25:09PM +0100, Kashyap Chamarthy wrote:

[. . .]

> > > On the shell where `qemu-nbd` is running, I notice this
> > > 
> > >   nbd.c:nbd_receive_request():L756: read failed
> > 
>  
> > You really need to get the stack trace from that core dump to
> > debug this further.
> 
> I don't see the core dump locally (ABRT or some such not configured),
> will re-test this with `gdb` in a little while to get the traces.

Okay, I now have the coredump, and ran GDB ('bt full') on it[1] with
qemu-debuginfo files installed. Not sure if it has enough info, as I
didn't install the ~ 3.6 GB of other missing debuginfo RPMs (this is my
primary laptop, I can replicate this test on a different machine with
the missing debuginfo files if needed).

Next, I tried the external wrapper technique you once wrote about[1], to
run QEMU under `gdbserver`, here's the result:

  $ gdb
  . . .

  (gdb) file /usr/bin/qemu-system-x86_64 
  Reading symbols from /usr/bin/qemu-system-x86_64...Reading symbols from /usr/lib/debug/usr/bin/qemu-system-x86_64.debug...done.
  done.
  (gdb) target remote tcp::1234
  Remote debugging using tcp::1234
  Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/usr/lib64/ld-2.20.so.debug...done.
  done.
  Loaded symbols for /lib64/ld-linux-x86-64.so.2
  0x00007ffff7ddbcf0 in _start () from /lib64/ld-linux-x86-64.so.2
  (gdb) cont
  Continuing.
  
  Program received signal SIGSEGV, Segmentation fault.
  0x0000555555872159 in aio_set_fd_handler (ctx=ctx@entry=0x0, fd=8, io_read=io_read@entry=0x5555558a5470 <nbd_reply_ready>, io_write=io_write@entry=0x5555558a4d80 <nbd_restart_write>, 
      opaque=opaque@entry=0x5555562605e0) at aio-posix.c:50

Does this help?


Precise steps of what I did:

(1) Use the below wrapper script with `gdbserver`
----------------------------------------------------------------
$ cat /export/qemu-wrapper.sh 
#!/bin/bash -

if ! echo "$@" | grep -sqE -- '-help|-version|-device \?' ; then
    gdbserver="gdbserver :1234"
fi

exec $gdbserver /usr/bin/qemu-system-x86_64 "$@"
----------------------------------------------------------------

(2) On shell #1, export the backing file over QEMU NBD:

    $ qemu-nbd -f qcow2 -p10809 \
        /var/lib/libvirt/images/f21vm.qcow2 -t


(3) On shell #2, invoke the QEMU wrapper script:

    $ ./qemu-wrapper.sh             \
    -nographic                      \
    -nodefconfig                    \
    -nodefaults                     \
    -m 2048                         \
    -device virtio-scsi-pci,id=scsi \
    -device virtio-serial-pci       \
    -serial stdio                   \
    -drive file=./overlay1-f21vm.qcow2,format=qcow2,if=virtio,cache=writeback
    Process /usr/bin/qemu-system-x86_64 created; pid = 1944
    Listening on port 1234


(4) On shell #3, run GDB (output is above).


[1] https://kashyapc.fedorapeople.org/virt/qemu-nbd-test/stack-traces-from-coredump.txt

-- 
/kashyap


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-29 16:25 [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed Kashyap Chamarthy
  2015-01-29 16:47 ` Richard W.M. Jones
@ 2015-01-30 17:15 ` Kevin Wolf
  2015-01-30 18:41   ` Kashyap Chamarthy
  1 sibling, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2015-01-30 17:15 UTC (permalink / raw)
  To: Kashyap Chamarthy; +Cc: qemu-devel, stefanha

On 29.01.2015 at 17:25, Kashyap Chamarthy wrote:
> A simple reproducer below.
> 
> Export a disk image over NBD (I realize port 10809 is default, thought
> I'd explicitly mention anyhow):
> 
>   $ qemu-nbd -f qcow2 -p10809 \
>         /var/lib/libvirt/images/cirros-0.3.3-x86_64-disk.img -t
> 
> 
> Create an overlay with backing file exported via NBD:
> 
>   $ qemu-img create -f qcow2 -F \
>         nbd -o backing_file=nbd://localhost overlay1.qcow2
>     Formatting 'overlay1.qcow2', fmt=qcow2 size=41126400 backing_file='nbd://localhost' backing_fmt='nbd' encryption=off cluster_size=65536 lazy_refcounts=off
> 
> 
> Let's attempt to boot the overlay with a minimal QEMU:
> 
>   $ qemu-system-x86_64               \
>      -nographic                      \
>      -nodefconfig                    \
>      -nodefaults                     \
>      -m 2048                         \
>      -device virtio-scsi-pci,id=scsi \
>      -device virtio-serial-pci       \
>      -serial stdio                   \
>      -drive file=./overlay1.qcow2,format=qcow2,if=virtio,cache=writeback
>   Segmentation fault (core dumped)
> 
> 
> On the shell where `qemu-nbd` is running, I notice this
> 
>   nbd.c:nbd_receive_request():L756: read failed
> 
> 
> Haven't investigated further with GDB, thought I'd bring it up here
> first.
> 
> 
> Versions
> --------
> 
>   $ rpm -q qemu; uname -r
>   qemu-2.1.2-7.fc21.x86_64
>   3.17.8-300.fc21.x86_64

Copying Stefan because he's the master of AIO contexts and it is
bs->aio_context that becomes NULL. I couldn't see anything obvious.

In the meantime, could you retest on git master?

Kevin


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-30 17:15 ` Kevin Wolf
@ 2015-01-30 18:41   ` Kashyap Chamarthy
  2015-01-30 19:32     ` Max Reitz
  0 siblings, 1 reply; 9+ messages in thread
From: Kashyap Chamarthy @ 2015-01-30 18:41 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On Fri, Jan 30, 2015 at 06:15:21PM +0100, Kevin Wolf wrote:
> On 29.01.2015 at 17:25, Kashyap Chamarthy wrote:

> >   $ qemu-system-x86_64               \
> >      -nographic                      \
> >      -nodefconfig                    \
> >      -nodefaults                     \
> >      -m 2048                         \
> >      -device virtio-scsi-pci,id=scsi \
> >      -device virtio-serial-pci       \
> >      -serial stdio                   \
> >      -drive file=./overlay1.qcow2,format=qcow2,if=virtio,cache=writeback
> >   Segmentation fault (core dumped)
> > 
> > 
> > On the shell where `qemu-nbd` is running, I notice this
> > 
> >   nbd.c:nbd_receive_request():L756: read failed
> > 
> > 
> > Haven't investigated further with GDB, thought I'd bring it up here
> > first.
> > 
> > 
> > Versions
> > --------
> > 
> >   $ rpm -q qemu; uname -r
> >   qemu-2.1.2-7.fc21.x86_64
> >   3.17.8-300.fc21.x86_64
> 
> Copying Stefan because he's the master of AIO contexts and it is
> bs->aio_context that becomes NULL. I couldn't see anything obvious.
>
> 
> In the meantime, could you retest on git master?

Just tested from git, and I can still reproduce it.

That's the commit I'm at:

  $ git describe 
  v2.2.0-682-g16017c4


Run the NBD server, from git:

  $ /home/kashyapc/build/qemu/qemu-nbd -f qcow2 \
      -p10809 ./f21vm.qcow2 -t


Create the overlay:

  $ /home/kashyapc/build/qemu/qemu-img create \
      -f qcow2 -F nbd -o backing_file=nbd://localhost overlay2-of-f21vm.qcow2
  Segmentation fault (core dumped)

Creating the overlay from the git-compiled `qemu-img` binary fails.

So, let's create the overlay using the `qemu-img` binary from the system
(RPM version noted above) and boot the overlay from the just compiled
QEMU x86_64 binary from git, still core dumps:

  $ /home/kashyapc/build/qemu/x86_64-softmmu/qemu-system-x86_64 \
      -nographic                      \
      -nodefconfig                    \
      -nodefaults                     \
      -m 2048                         \
      -device virtio-scsi-pci,id=scsi \
      -device virtio-serial-pci       \
      -serial stdio                   \
      -drive file=./overlay2-f21vm.qcow2,format=qcow2,if=virtio,cache=writeback
  Segmentation fault (core dumped)


PS: I'm traveling, so I'll be a little slow to respond here, but can
provide more debugging info from the coredump of `qemu-img` binary as I
have access to a real computer.


-- 
/kashyap


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-30 18:41   ` Kashyap Chamarthy
@ 2015-01-30 19:32     ` Max Reitz
  2015-01-30 22:13       ` Kashyap Chamarthy
  2015-02-02  8:14       ` Stefan Hajnoczi
  0 siblings, 2 replies; 9+ messages in thread
From: Max Reitz @ 2015-01-30 19:32 UTC (permalink / raw)
  To: Kashyap Chamarthy, Kevin Wolf; +Cc: qemu-devel, stefanha

On 2015-01-30 at 13:41, Kashyap Chamarthy wrote:
> On Fri, Jan 30, 2015 at 06:15:21PM +0100, Kevin Wolf wrote:
>> On 29.01.2015 at 17:25, Kashyap Chamarthy wrote:
>>>    $ qemu-system-x86_64               \
>>>       -nographic                      \
>>>       -nodefconfig                    \
>>>       -nodefaults                     \
>>>       -m 2048                         \
>>>       -device virtio-scsi-pci,id=scsi \
>>>       -device virtio-serial-pci       \
>>>       -serial stdio                   \
>>>       -drive file=./overlay1.qcow2,format=qcow2,if=virtio,cache=writeback
>>>    Segmentation fault (core dumped)
>>>
>>>
>>> On the shell where `qemu-nbd` is running, I notice this
>>>
>>>    nbd.c:nbd_receive_request():L756: read failed
>>>
>>>
>>> Haven't investigated further with GDB, thought I'd bring it up here
>>> first.
>>>
>>>
>>> Versions
>>> --------
>>>
>>>    $ rpm -q qemu; uname -r
>>>    qemu-2.1.2-7.fc21.x86_64
>>>    3.17.8-300.fc21.x86_64
>> Copying Stefan because he's the master of AIO contexts and it is
>> bs->aio_context that becomes NULL. I couldn't see anything obvious.
>>
>>
>> In the meantime, could you retest on git master?
> Just tested from git, and I can still reproduce it.
>
> That's the commit I'm at:
>
>    $ git describe
>    v2.2.0-682-g16017c4
>
>
> Run the NBD server, from git:
>
>    $ /home/kashyapc/build/qemu/qemu-nbd -f qcow2 \
>        -p10809 ./f21vm.qcow2 -t
>
>
> Create the overlay:
>
>    $ /home/kashyapc/build/qemu/qemu-img create \
>        -f qcow2 -F nbd -o backing_file=nbd://localhost overlay2-of-f21vm.qcow2
>    Segmentation fault (core dumped)

You want to use -F raw. The file format is raw, not nbd (nbd is the 
protocol over which the data is read, which is in format raw).

Anyway, -F nbd shouldn't result in a segfault. One way to prevent this 
is to check whether the backing file format specified (or any format 
given to qemu-img in general) is a real format or the name of a protocol 
driver and then error out if it's the latter; but that would be more of 
a hotfix.
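
A minimal sketch of the kind of check described above, assuming a
hypothetical, abbreviated list of protocol driver names. Neither
check_backing_format() nor is_protocol_name() is an existing QEMU
function; they only illustrate rejecting a protocol name where an image
format is expected:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, abbreviated list of protocol driver names (assumption). */
static const char *const protocol_names[] = { "nbd", "file", "http" };

/* Return true if name matches one of the protocol names listed above. */
static bool is_protocol_name(const char *name)
{
    size_t n = sizeof(protocol_names) / sizeof(protocol_names[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, protocol_names[i]) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Return 0 if fmt is acceptable as an image format name, -1 if it is
 * missing or names a protocol driver. A real check would also have to
 * verify that fmt names a known format; that part is omitted here.
 */
static int check_backing_format(const char *fmt)
{
    if (fmt == NULL || is_protocol_name(fmt)) {
        return -1;
    }
    return 0;
}
```

With such a check, "-F nbd" would be rejected with an error up front,
while "-F raw" would pass through.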

Kevin, Stefan: The real problem is that block/nbd.c stores a 
BDRVNBDState object in bs->opaque and passes &BDRVNBDState.client (an 
NbdClientSession object) to the block/nbd-client.c functions. Those 
functions then receive the BDS pointer from client->bs. If an NBD BDS is 
a root BDS (as in this case), at some point a bdrv_swap() may happen 
(and it does happen here) which leads to ((BDRVNBDState 
*)bs->opaque)->client.bs != bs, and that's where the segfault comes from 
(bdrv_get_aio_context() returns NULL).

One way to fix this real problem is to remove the BDS pointer from the 
NbdClientSession and to always pass the BDS explicitly to the 
block/nbd-client.c functions; the other is to always update the BDS 
pointer in NbdClientSession in block/nbd.c. I'll try the former, and if 
it doesn't work, will do the latter (if you don't object).
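
The stale back-pointer can be modelled in a few lines. This is only a
sketch with simplified, hypothetical types (Node stands in for a BDS,
Session for NbdClientSession, node_swap() for the content swap); it is
not QEMU's actual code and ignores everything else bdrv_swap() does:

```c
#include <stddef.h>

typedef struct Node Node;

/* Hypothetical stand-in for NbdClientSession: keeps a back-pointer. */
typedef struct Session {
    Node *owner;
} Session;

/* Hypothetical stand-in for a BDS. */
struct Node {
    void *aio_context;  /* NULL models a node without an AIO context */
    Session *opaque;    /* driver-private state */
};

/* Swap the contents of two nodes; back-pointers are NOT updated. */
static void node_swap(Node *a, Node *b)
{
    Node tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Fragile variant: trusts the back-pointer stored in the session. */
static void *ctx_via_backpointer(const Session *s)
{
    return s->owner->aio_context;
}

/* Sketch of the first proposed fix: the caller passes the node itself. */
static void *ctx_explicit(const Node *n)
{
    return n->aio_context;
}
```

In this model, once node_swap() has moved the session into the other
node, the session's owner still points at the old node, so
ctx_via_backpointer() yields NULL while ctx_explicit() on the node that
now holds the session still yields the right context.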

Max

> Creating the overlay from the  git-compiled `qemu-img` binary fails.
>
> So, let's create the overlay using the `qemu-img` binary from the system
> (RPM version noted above) and boot the overlay from the just compiled
> QEMU x86_64 binary from git, still core dumps:
>
>    $ /home/kashyapc/build/qemu/x86_64-softmmu/qemu-system-x86_64 \
>        -nographic                      \
>        -nodefconfig                    \
>        -nodefaults                     \
>        -m 2048                         \
>        -device virtio-scsi-pci,id=scsi \
>        -device virtio-serial-pci       \
>        -serial stdio                   \
>        -drive file=./overlay2-f21vm.qcow2,format=qcow2,if=virtio,cache=writeback
>    Segmentation fault (core dumped)
>
>
> PS: I'm traveling, so I'll be a little slow to respond here, but can
> provide more debugging info from the coredump of `qemu-img` binary as I
> have access to a real computer.
>
>


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-30 19:32     ` Max Reitz
@ 2015-01-30 22:13       ` Kashyap Chamarthy
  2015-02-02  8:14       ` Stefan Hajnoczi
  1 sibling, 0 replies; 9+ messages in thread
From: Kashyap Chamarthy @ 2015-01-30 22:13 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, stefanha

On Fri, Jan 30, 2015 at 02:32:25PM -0500, Max Reitz wrote:
> On 2015-01-30 at 13:41, Kashyap Chamarthy wrote:
> >On Fri, Jan 30, 2015 at 06:15:21PM +0100, Kevin Wolf wrote:
> >>On 29.01.2015 at 17:25, Kashyap Chamarthy wrote:

[. . .]

> >>Copying Stefan because he's the master of AIO contexts and it is
> >>bs->aio_context that becomes NULL. I couldn't see anything obvious.
> >>
> >>
> >>In the meantime, could you retest on git master?
> >Just tested from git, and I can still reproduce it.
> >
> >That's the commit I'm at:
> >
> >   $ git describe
> >   v2.2.0-682-g16017c4
> >
> >
> >Run the NBD server, from git:
> >
> >   $ /home/kashyapc/build/qemu/qemu-nbd -f qcow2 \
> >       -p10809 ./f21vm.qcow2 -t
> >
> >
> >Create the overlay:
> >
> >   $ /home/kashyapc/build/qemu/qemu-img create \
> >       -f qcow2 -F nbd -o backing_file=nbd://localhost overlay2-of-f21vm.qcow2
> >   Segmentation fault (core dumped)
> 
> You want to use -F raw. The file format is raw, not nbd (nbd is the protocol
> over which the data is read, which is in format raw).

Noted, thanks for this detail. However, the `qemu-img` binary from the
system (version noted earlier in the thread) honors the "-F nbd" option
just fine:

  $ qemu-img create -f qcow2 -F nbd \
     -o backing_file=nbd://localhost overlay3-of-f21vm.qcow2
  Formatting 'overlay3-of-f21vm.qcow2', fmt=qcow2 size=42949672960 backing_file='nbd://localhost' backing_fmt='nbd' encryption=off cluster_size=65536 lazy_refcounts=off

Then, the segfault with the git-compiled `qemu-img` binary seems like
some kind of an incorrect "regression" (if you could call it so). 

Anyhow, the `qemu-img` from git works correctly as per your reasoning:

  $ /home/kashyapc/build/qemu/qemu-img create \
      -f qcow2 -F raw -o backing_file=nbd://localhost

> Anyway, -F nbd shouldn't result in a segfault. One way to prevent this is to
> check whether the backing file format specified (or any format given to
> qemu-img in general) is a real format or the name of a protocol driver and
> then error out if it's the latter; but that would be more of a hotfix.
> 
> Kevin, Stefan: The real problem is that block/nbd.c stores a BDRVNBDState
> object in bs->opaque and passes &BDRVNBDState.client (an NbdClientSession
> object) to the block/nbd-client.c functions. Those functions then receive
> the BDS pointer from client->bs. If an NBD BDS is a root BDS (as in this
> case), at some point a bdrv_swap() may happen (and it does happen here)
> which leads to ((BDRVNBDState *)bs->opaque)->client.bs != bs, and that's
> where the segfault comes from (bdrv_get_aio_context() returns NULL).
> 
> One way to fix this real problem is to remove the BDS pointer from the
> NbdClientSession and to always pass the BDS explicitly to the
> block/nbd-client.c functions; the other is to always update the BDS pointer
> in NbdClientSession in block/nbd.c. I'll try the former, and if it doesn't
> work, will do the latter (if you don't object).
 
Thank you for investigating.


-- 
/kashyap


* Re: [Qemu-devel] QEMU segfault: Booting an overlay with backing_file over NBD: nbd.c:nbd_receive_request():L756: read failed
  2015-01-30 19:32     ` Max Reitz
  2015-01-30 22:13       ` Kashyap Chamarthy
@ 2015-02-02  8:14       ` Stefan Hajnoczi
  1 sibling, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2015-02-02  8:14 UTC (permalink / raw)
  To: Max Reitz; +Cc: Kevin Wolf, qemu-devel, Kashyap Chamarthy


On Fri, Jan 30, 2015 at 02:32:25PM -0500, Max Reitz wrote:
> Kevin, Stefan: The real problem is that block/nbd.c stores a BDRVNBDState
> object in bs->opaque and passes &BDRVNBDState.client (an NbdClientSession
> object) to the block/nbd-client.c functions. Those functions then receive
> the BDS pointer from client->bs. If an NBD BDS is a root BDS (as in this
> case), at some point a bdrv_swap() may happen (and it does happen here)
> which leads to ((BDRVNBDState *)bs->opaque)->client.bs != bs, and that's
> where the segfault comes from (bdrv_get_aio_context() returns NULL).
> 
> One way to fix this real problem is to remove the BDS pointer from the
> NbdClientSession and to always pass the BDS explicitly to the
> block/nbd-client.c functions; the other is to always update the BDS pointer
> in NbdClientSession in block/nbd.c. I'll try the former, and if it doesn't
> work, will do the latter (if you don't object).

Sounds good.

On a related note I asked John Snow to look at QED and vvfat's
.bdrv_rebind() usage.  I think we can get rid of that API after
propagating BlockDriverState *bs arguments to QED and vvfat functions.

Stefan


