qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Thomas Huth <thuth@redhat.com>,
	jiangkunkun@huawei.com, Andrew Jones <drjones@redhat.com>,
	qemu-devel@nongnu.org, qemu-arm@nongnu.org,
	Gerd Hoffmann <kraxel@redhat.com>,
	wanghaibin.wang@huawei.com, Zenghui Yu <yuzenghui@huawei.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH v2 2/2] accel: kvm: Add aligment assert for kvm_log_clear_one_slot
Date: Tue, 9 Mar 2021 11:08:14 -0500
Message-ID: <20210309160814.GA763132@xz-x1>
In-Reply-To: <YEeM8eUUzm9AlaFI@work-vm>

On Tue, Mar 09, 2021 at 02:57:53PM +0000, Dr. David Alan Gilbert wrote:
> * Thomas Huth (thuth@redhat.com) wrote:
> > On 09/03/2021 15.05, Keqian Zhu wrote:
> > > 
> > > 
> > > On 2021/3/9 21:48, Thomas Huth wrote:
> > > > On 17/12/2020 02.49, Keqian Zhu wrote:
> > > > > The parameters start and size are transfered from QEMU memory
> > > > > emulation layer. It can promise that they are TARGET_PAGE_SIZE
> > > > > aligned. However, KVM needs they are qemu_real_page_size aligned.
> > > > > 
> > > > > Though no caller breaks this aligned requirement currently, we'd
> > > > > better add an explicit assert to avoid future breaking.
> > > > > 
> > > > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > > > ---
> > > > >    accel/kvm/kvm-all.c | 7 +++++++
> > > > >    1 file changed, 7 insertions(+)
> > > > > 
> > > > > ---
> > > > > v2
> > > > >    - Address Andrew's commment (Use assert instead of return err).
> > > > > 
> > > > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > > > > index f6b16a8df8..73b195cc41 100644
> > > > > --- a/accel/kvm/kvm-all.c
> > > > > +++ b/accel/kvm/kvm-all.c
> > > > > @@ -692,6 +692,10 @@ out:
> > > > >    #define KVM_CLEAR_LOG_ALIGN  (qemu_real_host_page_size << KVM_CLEAR_LOG_SHIFT)
> > > > >    #define KVM_CLEAR_LOG_MASK   (-KVM_CLEAR_LOG_ALIGN)
> > > > >    +/*
> > > > > + * As the granule of kvm dirty log is qemu_real_host_page_size,
> > > > > + * @start and @size are expected and restricted to align to it.
> > > > > + */
> > > > >    static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start,
> > > > >                                      uint64_t size)
> > > > >    {
> > > > > @@ -701,6 +705,9 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start,
> > > > >        unsigned long *bmap_clear = NULL, psize = qemu_real_host_page_size;
> > > > >        int ret;
> > > > >    +    /* Make sure start and size are qemu_real_host_page_size aligned */
> > > > > +    assert(QEMU_IS_ALIGNED(start | size, psize));
> > > > 
> > > > Sorry, but that was a bad idea: It triggers and kills my Centos 6 VM:
> > > > 
> > > > $ qemu-system-x86_64 -accel kvm -hda ~/virt/images/centos6.qcow2 -m 1G
> > > > qemu-system-x86_64: ../../devel/qemu/accel/kvm/kvm-all.c:690: kvm_log_clear_one_slot: Assertion `QEMU_IS_ALIGNED(start | size, psize)' failed.
> > > > Aborted (core dumped)
> > > Hi Thomas,
> > > 
> > > I think this patch is ok, maybe it trigger a potential bug?
> > 
> > Well, sure, there is either a bug somewhere else or in this new code. But it's certainly not normal that the assert() triggers, is it?
> > 
> > FWIW, here's a backtrace:
> > 
> > #0  0x00007ffff2c1584f in raise () at /lib64/libc.so.6
> > #1  0x00007ffff2bffc45 in abort () at /lib64/libc.so.6
> > #2  0x00007ffff2bffb19 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
> > #3  0x00007ffff2c0de36 in .annobin_assert.c_end () at /lib64/libc.so.6
> > #4  0x0000555555ba25f3 in kvm_log_clear_one_slot
> >     (size=6910080, start=0, as_id=0, mem=0x555556e1ee00)
> >     at ../../devel/qemu/accel/kvm/kvm-all.c:691
> > #5  0x0000555555ba25f3 in kvm_physical_log_clear
> >     (section=0x7fffffffd0b0, section=0x7fffffffd0b0, kml=0x555556dbaac0)
> >     at ../../devel/qemu/accel/kvm/kvm-all.c:843
> > #6  0x0000555555ba25f3 in kvm_log_clear (listener=0x555556dbaac0, section=0x7fffffffd0b0)
> >     at ../../devel/qemu/accel/kvm/kvm-all.c:1253
> > #7  0x0000555555b023d8 in memory_region_clear_dirty_bitmap
> >     (mr=mr@entry=0x5555573394c0, start=start@entry=0, len=len@entry=6910080)
> >     at ../../devel/qemu/softmmu/memory.c:2132
> > #8  0x0000555555b313d9 in cpu_physical_memory_snapshot_and_clear_dirty
> >     (mr=mr@entry=0x5555573394c0, offset=offset@entry=0, length=length@entry=6910080, client=client@entry=0) at ../../devel/qemu/softmmu/physmem.c:1109
> > #9  0x0000555555b02483 in memory_region_snapshot_and_clear_dirty
> >     (mr=mr@entry=0x5555573394c0, addr=addr@entry=0, size=size@entry=6910080, client=client@entry=0)
> >     at ../../devel/qemu/softmmu/memory.c:2146
> 
> Could you please figure out which memory region this is?
> WTH is that size? Is that really the problem that the size is just
> crazy?

It seems vga_draw_graphic() can call memory_region_snapshot_and_clear_dirty()
with a size that is not page aligned.  cpu_physical_memory_snapshot_and_clear_dirty()
takes care of most of the alignment, but the "length" parameter is still passed
down without any alignment check.

Cc Gerd too.

I'm not sure how many callers there are like this.  If there are a lot, maybe
we can indeed drop this assert patch, and instead have kvm_log_clear_one_slot()
ALIGN_DOWN the size to the smallest host page size.  Say we need to clear the
dirty bits for range (0, 0x1020): we should only clear (0, 0x1000), since there
can still be dirty data in range (0x1020, 0x2000).
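
To make the idea concrete, here is a rough standalone sketch of the rounding
only; it is not a patch against kvm-all.c.  The helper clearable_size() and the
SKETCH_* macros are made up for illustration (real code would use QEMU's own
alignment macros), and the page size is assumed to be a power of two:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for an align-down / is-aligned pair.
 * Both assume that m is a power of two.
 */
#define SKETCH_ALIGN_DOWN(n, m)  ((uint64_t)(n) & ~((uint64_t)(m) - 1))
#define SKETCH_IS_ALIGNED(n, m)  (((uint64_t)(n) & ((uint64_t)(m) - 1)) == 0)

/*
 * Hypothetical helper: given a page-aligned start and an arbitrary size,
 * return how many bytes can safely have their dirty bits cleared.  A
 * trailing partial page is left out, because the rest of that page may
 * still hold dirty data that the caller did not ask about.
 */
static uint64_t clearable_size(uint64_t start, uint64_t size, uint64_t psize)
{
    /* start is still expected to be aligned; only size is rounded */
    assert(SKETCH_IS_ALIGNED(start, psize));
    return SKETCH_ALIGN_DOWN(size, psize);
}
```

With a 4 KiB host page, clearable_size(0, 0x1020, 0x1000) gives 0x1000, and the
size=6910080 from the backtrace gives 6909952, so the last 128 bytes would stay
marked dirty until a later call covers the whole page.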

Thanks,

> 
> Dave
> 
> > #10 0x0000555555babe99 in vga_draw_graphic (full_update=0, s=0x5555573394b0)
> >     at ../../devel/qemu/hw/display/vga.c:1661
> > #11 0x0000555555babe99 in vga_update_display (opaque=0x5555573394b0)
> >     at ../../devel/qemu/hw/display/vga.c:1784
> > #12 0x0000555555babe99 in vga_update_display (opaque=0x5555573394b0)
> >     at ../../devel/qemu/hw/display/vga.c:1757
> > #13 0x00005555558ddd32 in graphic_hw_update (con=0x555556a11800)
> >     at ../../devel/qemu/ui/console.c:279
> > #14 0x00005555558dccd2 in dpy_refresh (s=0x555556c17da0) at ../../devel/qemu/ui/console.c:1742
> > #15 0x00005555558dccd2 in gui_update (opaque=opaque@entry=0x555556c17da0)
> >     at ../../devel/qemu/ui/console.c:209
> > #16 0x0000555555dbd520 in timerlist_run_timers (timer_list=0x555556937c50)
> >     at ../../devel/qemu/util/qemu-timer.c:574
> > #17 0x0000555555dbd520 in timerlist_run_timers (timer_list=0x555556937c50)
> >     at ../../devel/qemu/util/qemu-timer.c:499
> > #18 0x0000555555dbd74a in qemu_clock_run_timers (type=<optimized out>)
> >     at ../../devel/qemu/util/qemu-timer.c:670
> > #19 0x0000555555dbd74a in qemu_clock_run_all_timers () at ../../devel/qemu/util/qemu-timer.c:670
> > 
> > Looks like something in the vga code calls this with size=6910080
> > and thus triggers the alignment assertion?
> > 
> >  Thomas
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

-- 
Peter Xu
