* Null dereference in bdrv_unregister_buf() probably memory-backend-file related?
@ 2022-11-21 16:21 Jonathan Cameron via
2022-11-21 16:47 ` Stefan Hajnoczi
0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron via @ 2022-11-21 16:21 UTC (permalink / raw)
To: qemu-devel, Hanna Reitz, Stefan Hajnoczi, Kevin Wolf, linuxarm,
David Hildenbrand, Igor Mammedov
First CC list is a guess as I haven't managed to root cause where things are
going wrong yet.
Originally hit this whilst rebasing some CXL patches on v7.2.0-rc1.
CXL makes extensive use of memory-backends and most my tests happen
to use memory-backend-file
Issue seen on arm64 and x86 though helpfully on x86 the crash appears in an entirely
unrelated location (though the 'fix' works).
Fairly minimal test command line.
qemu-system-aarch64 \
-M virt \
-drive if=none,file=full.qcow2,format=qcow2,id=hd \
-device virtio-blk,drive=hd \
-object memory-backend-file,id=cxl-mem1,mem-path=/tmp/cxltest.raw,size=256M,align=256M \
Powerdown the machine or ctrl-c during boot gives a segfault.
On arm64 it was in a stable location that made at least some sense in that
bs in the below snippet is NULL.
I added the follow work around and the segfault goes away...
[PATCH] temp
---
block/io.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/io.c b/block/io.c
index b9424024f9..750e1366aa 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3324,6 +3324,9 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size)
{
BdrvChild *child;
+ if (!bs) {
+ return;
+ }
GLOBAL_STATE_CODE();
if (bs->drv && bs->drv->bdrv_unregister_buf) {
bs->drv->bdrv_unregister_buf(bs, host, size);
--
2.37.2
Not present on v7.1.0 . I'll start a bisection shortly but may take a while.
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Null dereference in bdrv_unregister_buf() probably memory-backend-file related?
2022-11-21 16:21 Null dereference in bdrv_unregister_buf() probably memory-backend-file related? Jonathan Cameron via
@ 2022-11-21 16:47 ` Stefan Hajnoczi
2022-11-21 17:29 ` Jonathan Cameron via
0 siblings, 1 reply; 4+ messages in thread
From: Stefan Hajnoczi @ 2022-11-21 16:47 UTC (permalink / raw)
To: Jonathan Cameron
Cc: qemu-devel, Hanna Reitz, Stefan Hajnoczi, Kevin Wolf, linuxarm,
David Hildenbrand, Igor Mammedov
On Mon, 21 Nov 2022 at 11:22, Jonathan Cameron via
<qemu-devel@nongnu.org> wrote:
>
> First CC list is a guess as I haven't managed to root cause where things are
> going wrong yet.
>
> Originally hit this whilst rebasing some CXL patches on v7.2.0-rc1.
> CXL makes extensive use of memory-backends and most my tests happen
> to use memory-backend-file
>
> Issue seen on arm64 and x86 though helpfully on x86 the crash appears in an entirely
> unrelated location (though the 'fix' works).
>
> Fairly minimal test command line.
>
> qemu-system-aarch64 \
> -M virt \
> -drive if=none,file=full.qcow2,format=qcow2,id=hd \
> -device virtio-blk,drive=hd \
> -object memory-backend-file,id=cxl-mem1,mem-path=/tmp/cxltest.raw,size=256M,align=256M \
>
> Powerdown the machine or ctrl-c during boot gives a segfault.
> On arm64 it was in a stable location that made at least some sense in that
> bs in the below snippet is NULL.
>
> I added the follow work around and the segfault goes away...
>
> [PATCH] temp
>
> ---
> block/io.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/block/io.c b/block/io.c
> index b9424024f9..750e1366aa 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -3324,6 +3324,9 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size)
> {
> BdrvChild *child;
>
> + if (!bs) {
> + return;
> + }
> GLOBAL_STATE_CODE();
> if (bs->drv && bs->drv->bdrv_unregister_buf) {
> bs->drv->bdrv_unregister_buf(bs, host, size);
bdrv_*() APIs generally don't accept NULL bs arguments.
I think blk_unregister_buf() needs to handle the blk_bs() NULL return
value. Can you confirm that the parent function is
blk_unregister_buf()?
This bug may have been introduced by commit baf422684d73 ("virtio-blk:
use BDRV_REQ_REGISTERED_BUF optimization hint").
Stefan
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Null dereference in bdrv_unregister_buf() probably memory-backend-file related?
2022-11-21 16:47 ` Stefan Hajnoczi
@ 2022-11-21 17:29 ` Jonathan Cameron via
2022-11-21 21:03 ` Stefan Hajnoczi
0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Cameron via @ 2022-11-21 17:29 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Hanna Reitz, Stefan Hajnoczi, Kevin Wolf, linuxarm,
David Hildenbrand, Igor Mammedov
On Mon, 21 Nov 2022 11:47:48 -0500
Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, 21 Nov 2022 at 11:22, Jonathan Cameron via
> <qemu-devel@nongnu.org> wrote:
> >
> > First CC list is a guess as I haven't managed to root cause where things are
> > going wrong yet.
> >
> > Originally hit this whilst rebasing some CXL patches on v7.2.0-rc1.
> > CXL makes extensive use of memory-backends and most my tests happen
> > to use memory-backend-file
> >
> > Issue seen on arm64 and x86 though helpfully on x86 the crash appears in an entirely
> > unrelated location (though the 'fix' works).
> >
> > Fairly minimal test command line.
> >
> > qemu-system-aarch64 \
> > -M virt \
> > -drive if=none,file=full.qcow2,format=qcow2,id=hd \
> > -device virtio-blk,drive=hd \
> > -object memory-backend-file,id=cxl-mem1,mem-path=/tmp/cxltest.raw,size=256M,align=256M \
> >
> > Powerdown the machine or ctrl-c during boot gives a segfault.
> > On arm64 it was in a stable location that made at least some sense in that
> > bs in the below snippet is NULL.
> >
> > I added the follow work around and the segfault goes away...
> >
> > [PATCH] temp
> >
> > ---
> > block/io.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/block/io.c b/block/io.c
> > index b9424024f9..750e1366aa 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -3324,6 +3324,9 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size)
> > {
> > BdrvChild *child;
> >
> > + if (!bs) {
> > + return;
> > + }
> > GLOBAL_STATE_CODE();
> > if (bs->drv && bs->drv->bdrv_unregister_buf) {
> > bs->drv->bdrv_unregister_buf(bs, host, size);
>
> bdrv_*() APIs generally don't accept NULL bs arguments.
>
> I think blk_unregister_buf() needs to handle the blk_bs() NULL return
> value. Can you confirm that the parent function is
> blk_unregister_buf()?
>
> This bug may have been introduced by commit baf422684d73 ("virtio-blk:
> use BDRV_REQ_REGISTERED_BUF optimization hint").
Got it in one. I just bisected to exactly that patch
+ using the below change indeed works just as well as the above.
Now I'd send this as a patch, but I don't yet sufficiently understand what that change you
referenced did to break things Seems it registered a notifier that is getting
called for all ram blocks, not just the one virtio-blk ones?
Perhaps better if you send a fix with an explanation :)
Thanks for the quick response and correct identification of the problem.
Jonathan
diff --git a/block/block-backend.c b/block/block-backend.c
index b48c91f4e1..e281569137 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2582,8 +2582,13 @@ bool blk_register_buf(BlockBackend *blk, void *host, size_t size, Error **errp)
void blk_unregister_buf(BlockBackend *blk, void *host, size_t size)
{
+ BlockDriverState *bs;
GLOBAL_STATE_CODE();
- bdrv_unregister_buf(blk_bs(blk), host, size);
+
+ bs = blk_bs(blk);
+ if (bs) {
+ bdrv_unregister_buf(blk_bs(blk), host, size);
+ }
}
int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, int64_t off_in,
>
> Stefan
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Null dereference in bdrv_unregister_buf() probably memory-backend-file related?
2022-11-21 17:29 ` Jonathan Cameron via
@ 2022-11-21 21:03 ` Stefan Hajnoczi
0 siblings, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2022-11-21 21:03 UTC (permalink / raw)
To: Jonathan Cameron
Cc: qemu-devel, Hanna Reitz, Stefan Hajnoczi, Kevin Wolf, linuxarm,
David Hildenbrand, Igor Mammedov
On Mon, 21 Nov 2022 at 12:29, Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 21 Nov 2022 11:47:48 -0500
> Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> > On Mon, 21 Nov 2022 at 11:22, Jonathan Cameron via
> > <qemu-devel@nongnu.org> wrote:
> > >
> > > First CC list is a guess as I haven't managed to root cause where things are
> > > going wrong yet.
> > >
> > > Originally hit this whilst rebasing some CXL patches on v7.2.0-rc1.
> > > CXL makes extensive use of memory-backends and most my tests happen
> > > to use memory-backend-file
> > >
> > > Issue seen on arm64 and x86 though helpfully on x86 the crash appears in an entirely
> > > unrelated location (though the 'fix' works).
> > >
> > > Fairly minimal test command line.
> > >
> > > qemu-system-aarch64 \
> > > -M virt \
> > > -drive if=none,file=full.qcow2,format=qcow2,id=hd \
> > > -device virtio-blk,drive=hd \
> > > -object memory-backend-file,id=cxl-mem1,mem-path=/tmp/cxltest.raw,size=256M,align=256M \
> > >
> > > Powerdown the machine or ctrl-c during boot gives a segfault.
> > > On arm64 it was in a stable location that made at least some sense in that
> > > bs in the below snippet is NULL.
> > >
> > > I added the follow work around and the segfault goes away...
> > >
> > > [PATCH] temp
> > >
> > > ---
> > > block/io.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/block/io.c b/block/io.c
> > > index b9424024f9..750e1366aa 100644
> > > --- a/block/io.c
> > > +++ b/block/io.c
> > > @@ -3324,6 +3324,9 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size)
> > > {
> > > BdrvChild *child;
> > >
> > > + if (!bs) {
> > > + return;
> > > + }
> > > GLOBAL_STATE_CODE();
> > > if (bs->drv && bs->drv->bdrv_unregister_buf) {
> > > bs->drv->bdrv_unregister_buf(bs, host, size);
> >
> > bdrv_*() APIs generally don't accept NULL bs arguments.
> >
> > I think blk_unregister_buf() needs to handle the blk_bs() NULL return
> > value. Can you confirm that the parent function is
> > blk_unregister_buf()?
> >
> > This bug may have been introduced by commit baf422684d73 ("virtio-blk:
> > use BDRV_REQ_REGISTERED_BUF optimization hint").
> Got it in one. I just bisected to exactly that patch
>
> + using the below change indeed works just as well as the above.
> Now I'd send this as a patch, but I don't yet sufficiently understand what that change you
> referenced did to break things Seems it registered a notifier that is getting
> called for all ram blocks, not just the one virtio-blk ones?
>
> Perhaps better if you send a fix with an explanation :)
Sure, no problem. I have reproduced the bug and will send a patch.
Thanks,
Stefan
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-11-21 21:03 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-21 16:21 Null dereference in bdrv_unregister_buf() probably memory-backend-file related? Jonathan Cameron via
2022-11-21 16:47 ` Stefan Hajnoczi
2022-11-21 17:29 ` Jonathan Cameron via
2022-11-21 21:03 ` Stefan Hajnoczi
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.