* frequently ballooning results in qemu exit
@ 2013-03-13 13:50 Hanweidong
  2013-03-14 10:17 ` George Dunlap
  0 siblings, 1 reply; 25+ messages in thread
From: Hanweidong @ 2013-03-13 13:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Gonglei (Arei), Yanqiangjun

We created a 64-bit SLES11 SP1 guest, and then used a script to change its memory (using the mem-set command) every second: set 1G, set 2G, set 1G, set 2G, and so on.
After a few minutes, QEMU exited due to a SIGBUS error. Below is the call trace captured by gdb:

The call trace:
Program received signal SIGBUS, Bus error.
0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
#1  0x00007f94fa67016d in address_space_rw (as=<optimized out>, addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at /usr/include/bits/string3.h:52
#2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>, val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>)
    at /opt/new/tools/qemu-xen-dir/xen-all.c:709
#3  write_phys_req_item (val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:720
#4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:736
#5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-dir/xen-all.c:793
#6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0) at /opt/new/tools/qemu-xen-dir/xen-all.c:868
#7  0x00007f94fa5e3262 in qemu_iohandler_poll (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>, xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
#8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:418
#9  0x00007f94fa6616dc in main_loop () at vl.c:1770
#10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:3999

It looks like something is wrong with the mapcache, because memcpy failed on an address obtained from the mapcache. Any ideas about this issue? Thanks!

--weidong


* Re: frequently ballooning results in qemu exit
  2013-03-13 13:50 frequently ballooning results in qemu exit Hanweidong
@ 2013-03-14 10:17 ` George Dunlap
  2013-03-14 10:38   ` Anthony PERARD
  2013-03-14 10:48   ` Hanweidong
  0 siblings, 2 replies; 25+ messages in thread
From: George Dunlap @ 2013-03-14 10:17 UTC (permalink / raw)
  To: Hanweidong
  Cc: Anthony PERARD, Andrew Cooper, Gonglei (Arei), Yanqiangjun, xen-devel

On Wed, Mar 13, 2013 at 1:50 PM, Hanweidong <hanweidong@huawei.com> wrote:
> We created a 64bit SLES11 SP1 guest, and then used a script to change memory (using mem-set command) periodically (in 1 second): set 1G, set 2G, set 1G, set 2G, and so on.
> After a few minutes, we encountered QEMU exit due to SIGBUS error. Below is the call trace captured by gdb:
>
> The call trace:
> Program received signal SIGBUS, Bus error.
> 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>, addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at /usr/include/bits/string3.h:52
> #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>, val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>)
>     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
> #3  write_phys_req_item (val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:720
> #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:736
> #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-dir/xen-all.c:793
> #6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0) at /opt/new/tools/qemu-xen-dir/xen-all.c:868
> #7  0x00007f94fa5e3262 in qemu_iohandler_poll (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>, xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
> #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:418
> #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
> #10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:3999
>
> It looks mapcache has something wrong because memcpy failed with the address from mapcache. Any ideas about this issue? Thanks!

Which version of Xen and qemu are you using?  In particular,
qemu-upstream (aka qemu-xen) or qemu-traditional?  And what guest are
you using?  Is there anything on the xen console (either via the
serial port or 'xl dmesg')?

At first glance it looks like maybe qemu is trying to access, via the
mapcache, pages which have been ballooned out.  But it seems like it
should only be doing so in response to a guest request -- is this
correct, Anthony?

Andy, are there any patches in the XenServer qemu patchqueue that
might be related to this?

 -George


* Re: frequently ballooning results in qemu exit
  2013-03-14 10:17 ` George Dunlap
@ 2013-03-14 10:38   ` Anthony PERARD
  2013-03-14 14:10     ` Hanweidong
  2013-03-14 10:48   ` Hanweidong
  1 sibling, 1 reply; 25+ messages in thread
From: Anthony PERARD @ 2013-03-14 10:38 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, xen-devel, Gonglei (Arei), Yanqiangjun, Hanweidong

On 14/03/13 10:17, George Dunlap wrote:
> On Wed, Mar 13, 2013 at 1:50 PM, Hanweidong <hanweidong@huawei.com> wrote:
>> We created a 64bit SLES11 SP1 guest, and then used a script to change memory (using mem-set command) periodically (in 1 second): set 1G, set 2G, set 1G, set 2G, and so on.
>> After a few minutes, we encountered QEMU exit due to SIGBUS error. Below is the call trace captured by gdb:
>>
>> The call trace:
>> Program received signal SIGBUS, Bus error.
>> 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
>> #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>, addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at /usr/include/bits/string3.h:52
>> #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>, val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>)
>>     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
>> #3  write_phys_req_item (val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:720
>> #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:736
>> #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-dir/xen-all.c:793
>> #6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0) at /opt/new/tools/qemu-xen-dir/xen-all.c:868
>> #7  0x00007f94fa5e3262 in qemu_iohandler_poll (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>, xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
>> #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:418
>> #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
>> #10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:3999
>>
>> It looks mapcache has something wrong because memcpy failed with the address from mapcache. Any ideas about this issue? Thanks!
> 
> Which version of Xen and qemu are you using?  In particular,
> qemu-upstream (aka qemu-xen) or qemu-traditional?  And what guest are
> you using?  Is there anything on the xen console (either via the
> serial port or 'xl dmesg')?
> 
> At first glance it looks like maybe qemu is trying to access, via the
> mapcache, pages which have been ballooned out.  But it seems like it
> should only be doing so in response to a guest request -- is this
> correct, Anthony?

Yes, this looks like a guest IO request. One thing I don't know is what
happens if there are guest addresses present in the mapcache that have
been ballooned out and then put back into the guest -- are those addresses
in the mapcache still correct?

-- 
Anthony PERARD


* Re: frequently ballooning results in qemu exit
  2013-03-14 10:17 ` George Dunlap
  2013-03-14 10:38   ` Anthony PERARD
@ 2013-03-14 10:48   ` Hanweidong
  1 sibling, 0 replies; 25+ messages in thread
From: Hanweidong @ 2013-03-14 10:48 UTC (permalink / raw)
  To: George Dunlap
  Cc: Anthony PERARD, Andrew Cooper, Gonglei (Arei), Yanqiangjun, xen-devel



> -----Original Message-----
> From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
> Dunlap
> Sent: 14 March 2013 18:17
> To: Hanweidong
> Cc: xen-devel@lists.xen.org; Gonglei (Arei); Yanqiangjun; Anthony
> PERARD; Andrew Cooper
> Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> 
> On Wed, Mar 13, 2013 at 1:50 PM, Hanweidong <hanweidong@huawei.com>
> wrote:
> > We created a 64bit SLES11 SP1 guest, and then used a script to change
> memory (using mem-set command) periodically (in 1 second): set 1G, set
> 2G, set 1G, set 2G, and so on.
> > After a few minutes, we encountered QEMU exit due to SIGBUS error.
> Below is the call trace captured by gdb:
> >
> > The call trace:
> > Program received signal SIGBUS, Bus error.
> > 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > (gdb) bt
> > #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>,
> addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at
> /usr/include/bits/string3.h:52
> > #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>,
> val=<optimized out>, i=<optimized out>, req=<optimized out>,
> addr=<optimized out>)
> >     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
> > #3  write_phys_req_item (val=<optimized out>, i=<optimized out>,
> req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-
> dir/xen-all.c:720
> > #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-
> dir/xen-all.c:736
> > #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-
> dir/xen-all.c:793
> > #6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0) at
> /opt/new/tools/qemu-xen-dir/xen-all.c:868
> > #7  0x00007f94fa5e3262 in qemu_iohandler_poll (readfds=0x7f94faeea7a0
> <rfds>, writefds=0x7f94faeea820 <wfds>, xfds=<optimized out>,
> ret=<optimized out>) at iohandler.c:125
> > #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized out>)
> at main-loop.c:418
> > #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
> > #10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized
> out>) at vl.c:3999
> >
> > It looks mapcache has something wrong because memcpy failed with the
> address from mapcache. Any ideas about this issue? Thanks!
> 
> Which version of Xen and qemu are you using?  In particular,
> qemu-upstream (aka qemu-xen) or qemu-traditional?  And what guest are
> you using?  Is there anything on the xen console (either via the
> serial port or 'xl dmesg')?

Xen changeset is 26580, and the qemu commit is 656365a38e41a5b8a2c4d5ab2ada9fbf231f1ebc.
BTW, there is no problem when using qemu-traditional. The guest is 64-bit SLES11 SP1.
The 'xl dmesg' output is below (we did the test with 3 guests; the issue can be
reproduced very quickly, in about 20 seconds):

(XEN) HVM25: HVM Loader
(XEN) HVM25: Detected Xen v4.3-unstable
(XEN) HVM25: Xenbus rings @0xfeffc000, event channel 4
(XEN) HVM25: System requested SeaBIOS
(XEN) HVM25: CPU speed is 2500 MHz
(XEN) irq.c:270: Dom25 PCI link 0 changed 0 -> 5
(XEN) HVM25: PCI-ISA link 0 routed to IRQ5
(XEN) irq.c:270: Dom25 PCI link 1 changed 0 -> 10
(XEN) HVM25: PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:270: Dom25 PCI link 2 changed 0 -> 11
(XEN) HVM25: PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:270: Dom25 PCI link 3 changed 0 -> 5
(XEN) HVM25: PCI-ISA link 3 routed to IRQ5
(XEN) HVM25: pci dev 01:2 INTD->IRQ5
(XEN) HVM25: pci dev 01:3 INTA->IRQ10
(XEN) HVM25: Randy: pci_setup 0: mmio_total=0x2000000 bar_sz=0x2000000
(XEN) HVM25: Randy: pci_setup 0: mmio_total=0x2001000 bar_sz=0x1000
(XEN) HVM25: Randy: pci_setup 0: mmio_total=0x2011000 bar_sz=0x10000
(XEN) HVM25: Randy: pci_setup 0: mmio_total=0x3011000 bar_sz=0x1000000
(XEN) HVM25: pci dev 03:0 INTA->IRQ5
(XEN) HVM25: Randy: pci_setup 0: mmio_total=0x3011100 bar_sz=0x100
(XEN) HVM25: Randy: pci_setup 0: mmio_total=0x3021100 bar_sz=0x10000
(XEN) HVM25: pci dev 04:0 INTA->IRQ5
(XEN) HVM25: Randy: pci_setup 1: mmio_total=0x3021100 pci_mem_start=0xf0000000 pci_mem_end=
(XEN) HVM25: 0xfc000000 PCI_MEM_START=0xf0000000 PCI_MEM_END=0xfc000000
(XEN) HVM25: Randy: pci_setup 2: mmio_total=0x3021100 pci_mem_start=0xf0000000 pci_mem_end=
(XEN) HVM25: 0xfc000000
(XEN) HVM25: Randy: pci_setup 3: low_mem_pgend:0x7f800 high_mem_pgend:0x0 reserved_mem_pgst
(XEN) HVM25: art:0xfeff8
(XEN) HVM25: Randy: pci_setup: mmio_left: 0xc000000
(XEN) HVM25: pci dev 02:0 bar 10 size 2000000: f0000008
(XEN) HVM25: pci dev 03:0 bar 14 size 1000000: f2000008
(XEN) HVM25: pci dev 02:0 bar 30 size 10000: f3000000
(XEN) HVM25: pci dev 04:0 bar 30 size 10000: f3010000
(XEN) HVM25: pci dev 02:0 bar 14 size 1000: f3020000
(XEN) HVM25: pci dev 03:0 bar 10 size 100: 0000c001
(XEN) HVM25: pci dev 04:0 bar 10 size 100: 0000c101
(XEN) HVM25: pci dev 04:0 bar 14 size 100: f3021000
(XEN) HVM25: pci dev 01:2 bar 20 size 20: 0000c201
(XEN) HVM25: pci dev 01:1 bar 20 size 10: 0000c221
(XEN) HVM25: Multiprocessor initialisation:
(XEN) HVM25:  - CPU0 ... 46-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM25:  - CPU1 ... 46-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM25: Testing HVM environment:
(XEN) HVM25:  - REP INSB across page boundaries ... passed
(XEN) HVM25:  - GS base MSRs and SWAPGS ... passed
(XEN) HVM25: Passed 2 of 2 tests
(XEN) HVM25: Writing SMBIOS tables ...
(XEN) HVM25: Loading SeaBIOS ...
(XEN) HVM25: Creating MP tables ...
(XEN) HVM25: Loading ACPI ...
(XEN) HVM25: vm86 TSS at fc00a080
(XEN) HVM25: BIOS map:
(XEN) HVM25:  10000-100d3: Scratch space
(XEN) HVM25:  e0000-fffff: Main BIOS
(XEN) HVM25: E820 table:
(XEN) HVM25:  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(XEN) HVM25:  HOLE: 00000000:000a0000 - 00000000:000e0000
(XEN) HVM25:  [01]: 00000000:000e0000 - 00000000:00100000: RESERVED
(XEN) HVM25:  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(XEN) HVM25:  HOLE: 00000000:7f800000 - 00000000:fc000000
(XEN) HVM25:  [03]: 00000000:fc000000 - 00000001:00000000: RESERVED
(XEN) HVM25: Invoking SeaBIOS ...
(XEN) HVM25: SeaBIOS (version rel-1.7.1-0-g51755c3-20130306_205414-linux-DPMZPO)
(XEN) HVM25: 
(XEN) HVM25: Found Xen hypervisor signature at 40000000
(XEN) HVM25: xen: copy e820...
(XEN) HVM25: Ram Size=0x7f800000 (0x0000000000000000 high)
(XEN) HVM25: Relocating low data from 0x000e2490 to 0x000ef790 (size 2156)
(XEN) HVM25: Relocating init from 0x000e2cfc to 0x7f7e20f0 (size 56804)
(XEN) HVM25: CPU Mhz=2500
(XEN) HVM25: Found 8 PCI devices (max PCI bus is 00)
(XEN) HVM25: Allocated Xen hypercall page at 7f7ff000
(XEN) HVM25: Detected Xen v4.3-unstable
(XEN) HVM25: Found 2 cpu(s) max supported 2 cpu(s)
(XEN) HVM25: xen: copy BIOS tables...
(XEN) HVM25: Copying SMBIOS entry point from 0x00010010 to 0x000fdb10
(XEN) HVM25: Copying MPTABLE from 0xfc001170/fc001180 to 0x000fda10
(XEN) HVM25: Copying PIR from 0x00010030 to 0x000fd990
(XEN) HVM25: Copying ACPI RSDP from 0x000100b0 to 0x000fd960
(XEN) HVM25: Scan for VGA option rom
(XEN) HVM25: Running option rom at c000:0003
(XEN) stdvga.c:147:d25 entering stdvga and caching modes
(XEN) HVM25: Turning on vga text mode console
(XEN) HVM25: SeaBIOS (version rel-1.7.1-0-g51755c3-20130306_205414-linux-DPMZPO)
(XEN) HVM25: 
(XEN) HVM25: UHCI init on dev 00:01.2 (io=c200)
(XEN) HVM25: Found 1 lpt ports
(XEN) HVM25: Found 1 serial ports
(XEN) HVM25: ATA controller 1 at 1f0/3f4/c220 (irq 14 dev 9)
(XEN) HVM25: ATA controller 2 at 170/374/c228 (irq 15 dev 9)
(XEN) HVM25: ata0-0: QEMU HARDDISK ATA-7 Hard-Disk (10240 MiBytes)
(XEN) HVM25: Searching bootorder for: /pci@i0cf8/*@1,1/drive@0/disk@0
(XEN) HVM25: DVD/CD [ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD]
(XEN) HVM25: Searching bootorder for: /pci@i0cf8/*@1,1/drive@1/disk@0
(XEN) HVM25: PS2 keyboard initialized
(XEN) HVM25: All threads complete.
(XEN) HVM25: Scan for option roms
(XEN) HVM25: Running option rom at c900:0003
(XEN) HVM25: pmm call arg1=1
(XEN) HVM25: pmm call arg1=0
(XEN) HVM25: pmm call arg1=1
(XEN) HVM25: pmm call arg1=0
(XEN) HVM25: Searching bootorder for: /pci@i0cf8/*@4
(XEN) HVM25: Press F12 for boot menu.
(XEN) HVM25: 
(XEN) HVM26: HVM Loader
(XEN) HVM26: Detected Xen v4.3-unstable
(XEN) HVM26: Xenbus rings @0xfeffc000, event channel 4
(XEN) HVM26: System requested SeaBIOS
(XEN) HVM26: CPU speed is 2500 MHz
(XEN) irq.c:270: Dom26 PCI link 0 changed 0 -> 5
(XEN) HVM26: PCI-ISA link 0 routed to IRQ5
(XEN) irq.c:270: Dom26 PCI link 1 changed 0 -> 10
(XEN) HVM26: PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:270: Dom26 PCI link 2 changed 0 -> 11
(XEN) HVM26: PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:270: Dom26 PCI link 3 changed 0 -> 5
(XEN) HVM26: PCI-ISA link 3 routed to IRQ5
(XEN) HVM26: pci dev 01:2 INTD->IRQ5
(XEN) HVM26: pci dev 01:3 INTA->IRQ10
(XEN) HVM26: Randy: pci_setup 0: mmio_total=0x2000000 bar_sz=0x2000000
(XEN) HVM26: Randy: pci_setup 0: mmio_total=0x2001000 bar_sz=0x1000
(XEN) HVM26: Randy: pci_setup 0: mmio_total=0x2011000 bar_sz=0x10000
(XEN) HVM26: Randy: pci_setup 0: mmio_total=0x3011000 bar_sz=0x1000000
(XEN) HVM26: pci dev 03:0 INTA->IRQ5
(XEN) HVM26: Randy: pci_setup 0: mmio_total=0x3011100 bar_sz=0x100
(XEN) HVM26: Randy: pci_setup 0: mmio_total=0x3021100 bar_sz=0x10000
(XEN) HVM26: pci dev 04:0 INTA->IRQ5
(XEN) HVM26: Randy: pci_setup 1: mmio_total=0x3021100 pci_mem_start=0xf0000000 pci_mem_end=
(XEN) HVM26: 0xfc000000 PCI_MEM_START=0xf0000000 PCI_MEM_END=0xfc000000
(XEN) HVM26: Randy: pci_setup 2: mmio_total=0x3021100 pci_mem_start=0xf0000000 pci_mem_end=
(XEN) HVM26: 0xfc000000
(XEN) HVM26: Randy: pci_setup 3: low_mem_pgend:0x7f800 high_mem_pgend:0x0 reserved_mem_pgst
(XEN) HVM26: art:0xfeff8
(XEN) HVM26: Randy: pci_setup: mmio_left: 0xc000000
(XEN) HVM26: pci dev 02:0 bar 10 size 2000000: f0000008
(XEN) HVM26: pci dev 03:0 bar 14 size 1000000: f2000008
(XEN) HVM26: pci dev 02:0 bar 30 size 10000: f3000000
(XEN) HVM26: pci dev 04:0 bar 30 size 10000: f3010000
(XEN) HVM26: pci dev 02:0 bar 14 size 1000: f3020000
(XEN) HVM26: pci dev 03:0 bar 10 size 100: 0000c001
(XEN) HVM26: pci dev 04:0 bar 10 size 100: 0000c101
(XEN) HVM26: pci dev 04:0 bar 14 size 100: f3021000
(XEN) HVM26: pci dev 01:2 bar 20 size 20: 0000c201
(XEN) HVM26: pci dev 01:1 bar 20 size 10: 0000c221
(XEN) HVM26: Multiprocessor initialisation:
(XEN) HVM26:  - CPU0 ... 46-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM26:  - CPU1 ... 46-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM26: Testing HVM environment:
(XEN) HVM26:  - REP INSB across page boundaries ... passed
(XEN) HVM26:  - GS base MSRs and SWAPGS ... passed
(XEN) HVM26: Passed 2 of 2 tests
(XEN) HVM26: Writing SMBIOS tables ...
(XEN) HVM26: Loading SeaBIOS ...
(XEN) HVM26: Creating MP tables ...
(XEN) HVM26: Loading ACPI ...
(XEN) HVM26: vm86 TSS at fc00a080
(XEN) HVM26: BIOS map:
(XEN) HVM26:  10000-100d3: Scratch space
(XEN) HVM26:  e0000-fffff: Main BIOS
(XEN) HVM26: E820 table:
(XEN) HVM26:  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(XEN) HVM26:  HOLE: 00000000:000a0000 - 00000000:000e0000
(XEN) HVM26:  [01]: 00000000:000e0000 - 00000000:00100000: RESERVED
(XEN) HVM26:  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(XEN) HVM26:  HOLE: 00000000:7f800000 - 00000000:fc000000
(XEN) HVM26:  [03]: 00000000:fc000000 - 00000001:00000000: RESERVED
(XEN) HVM26: Invoking SeaBIOS ...
(XEN) HVM26: SeaBIOS (version rel-1.7.1-0-g51755c3-20130306_205414-linux-DPMZPO)
(XEN) HVM26: 
(XEN) HVM26: Found Xen hypervisor signature at 40000000
(XEN) HVM26: xen: copy e820...
(XEN) HVM26: Ram Size=0x7f800000 (0x0000000000000000 high)
(XEN) HVM26: Relocating low data from 0x000e2490 to 0x000ef790 (size 2156)
(XEN) HVM26: Relocating init from 0x000e2cfc to 0x7f7e20f0 (size 56804)
(XEN) HVM26: CPU Mhz=2500
(XEN) HVM26: Found 8 PCI devices (max PCI bus is 00)
(XEN) HVM26: Allocated Xen hypercall page at 7f7ff000
(XEN) HVM26: Detected Xen v4.3-unstable
(XEN) HVM26: Found 2 cpu(s) max supported 2 cpu(s)
(XEN) HVM26: xen: copy BIOS tables...
(XEN) HVM26: Copying SMBIOS entry point from 0x00010010 to 0x000fdb10
(XEN) HVM26: Copying MPTABLE from 0xfc001170/fc001180 to 0x000fda10
(XEN) HVM26: Copying PIR from 0x00010030 to 0x000fd990
(XEN) HVM26: Copying ACPI RSDP from 0x000100b0 to 0x000fd960
(XEN) HVM26: Scan for VGA option rom
(XEN) HVM26: Running option rom at c000:0003
(XEN) stdvga.c:147:d26 entering stdvga and caching modes
(XEN) HVM26: Turning on vga text mode console
(XEN) HVM26: SeaBIOS (version rel-1.7.1-0-g51755c3-20130306_205414-linux-DPMZPO)
(XEN) HVM26: 
(XEN) HVM26: UHCI init on dev 00:01.2 (io=c200)
(XEN) HVM26: Found 1 lpt ports
(XEN) HVM26: Found 1 serial ports
(XEN) HVM26: ATA controller 1 at 1f0/3f4/c220 (irq 14 dev 9)
(XEN) HVM26: ATA controller 2 at 170/374/c228 (irq 15 dev 9)
(XEN) HVM26: ata0-0: QEMU HARDDISK ATA-7 Hard-Disk (10240 MiBytes)
(XEN) HVM26: Searching bootorder for: /pci@i0cf8/*@1,1/drive@0/disk@0
(XEN) HVM26: DVD/CD [ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD]
(XEN) HVM26: Searching bootorder for: /pci@i0cf8/*@1,1/drive@1/disk@0
(XEN) HVM26: PS2 keyboard initialized
(XEN) HVM27: HVM Loader
(XEN) HVM27: Detected Xen v4.3-unstable
(XEN) HVM27: Xenbus rings @0xfeffc000, event channel 4
(XEN) HVM27: System requested SeaBIOS
(XEN) HVM27: CPU speed is 2500 MHz
(XEN) irq.c:270: Dom27 PCI link 0 changed 0 -> 5
(XEN) HVM27: PCI-ISA link 0 routed to IRQ5
(XEN) irq.c:270: Dom27 PCI link 1 changed 0 -> 10
(XEN) HVM27: PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:270: Dom27 PCI link 2 changed 0 -> 11
(XEN) HVM27: PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:270: Dom27 PCI link 3 changed 0 -> 5
(XEN) HVM27: PCI-ISA link 3 routed to IRQ5
(XEN) HVM27: pci dev 01:2 INTD->IRQ5
(XEN) HVM27: pci dev 01:3 INTA->IRQ10
(XEN) HVM27: Randy: pci_setup 0: mmio_total=0x2000000 bar_sz=0x2000000
(XEN) HVM27: Randy: pci_setup 0: mmio_total=0x2001000 bar_sz=0x1000
(XEN) HVM27: Randy: pci_setup 0: mmio_total=0x2011000 bar_sz=0x10000
(XEN) HVM27: Randy: pci_setup 0: mmio_total=0x3011000 bar_sz=0x1000000
(XEN) HVM27: pci dev 03:0 INTA->IRQ5
(XEN) HVM26: All threads complete.
(XEN) HVM26: Scan for option roms
(XEN) HVM27: Randy: pci_setup 0: mmio_total=0x3011100 bar_sz=0x100
(XEN) HVM27: Randy: pci_setup 0: mmio_total=0x3021100 bar_sz=0x10000
(XEN) HVM27: pci dev 04:0 INTA->IRQ5
(XEN) HVM26: Running option rom at c900:0003
(XEN) HVM26: pmm call arg1=1
(XEN) HVM26: pmm call arg1=0
(XEN) HVM26: pmm call arg1=1
(XEN) HVM26: pmm call arg1=0
(XEN) HVM27: Randy: pci_setup 1: mmio_total=0x3021100 pci_mem_start=0xf0000000 pci_mem_end=
(XEN) HVM27: 0xfc000000 PCI_MEM_START=0xf0000000 PCI_MEM_END=0xfc000000
(XEN) HVM27: Randy: pci_setup 2: mmio_total=0x3021100 pci_mem_start=0xf0000000 pci_mem_end=
(XEN) HVM27: 0xfc000000
(XEN) HVM27: Randy: pci_setup 3: low_mem_pgend:0x7f800 high_mem_pgend:0x0 reserved_mem_pgst
(XEN) HVM27: art:0xfeff8
(XEN) HVM27: Randy: pci_setup: mmio_left: 0xc000000
(XEN) HVM27: pci dev 02:0 bar 10 size 2000000: f0000008
(XEN) HVM27: pci dev 03:0 bar 14 size 1000000: f2000008
(XEN) HVM27: pci dev 02:0 bar 30 size 10000: f3000000
(XEN) HVM27: pci dev 04:0 bar 30 size 10000: f3010000
(XEN) HVM27: pci dev 02:0 bar 14 size 1000: f3020000
(XEN) HVM27: pci dev 03:0 bar 10 size 100: 0000c001
(XEN) HVM27: pci dev 04:0 bar 10 size 100: 0000c101
(XEN) HVM27: pci dev 04:0 bar 14 size 100: f3021000
(XEN) HVM27: pci dev 01:2 bar 20 size 20: 0000c201
(XEN) HVM27: pci dev 01:1 bar 20 size 10: 0000c221
(XEN) HVM27: Multiprocessor initialisation:
(XEN) HVM27:  - CPU0 ... 46-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM27:  - CPU1 ... 46-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM27: Testing HVM environment:
(XEN) HVM26: Searching bootorder for: /pci@i0cf8/*@4
(XEN) HVM27:  - REP INSB across page boundaries ... passed
(XEN) HVM27:  - GS base MSRs and SWAPGS ... passed
(XEN) HVM27: Passed 2 of 2 tests
(XEN) HVM27: Writing SMBIOS tables ...
(XEN) HVM27: Loading SeaBIOS ...
(XEN) HVM27: Creating MP tables ...
(XEN) HVM27: Loading ACPI ...
(XEN) HVM27: vm86 TSS at fc00a080
(XEN) HVM27: BIOS map:
(XEN) HVM27:  10000-100d3: Scratch space
(XEN) HVM27:  e0000-fffff: Main BIOS
(XEN) HVM27: E820 table:
(XEN) HVM27:  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(XEN) HVM27:  HOLE: 00000000:000a0000 - 00000000:000e0000
(XEN) HVM27:  [01]: 00000000:000e0000 - 00000000:00100000: RESERVED
(XEN) HVM27:  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(XEN) HVM27:  HOLE: 00000000:7f800000 - 00000000:fc000000
(XEN) HVM27:  [03]: 00000000:fc000000 - 00000001:00000000: RESERVED
(XEN) HVM27: Invoking SeaBIOS ...
(XEN) HVM27: SeaBIOS (version rel-1.7.1-0-g51755c3-20130306_205414-linux-DPMZPO)
(XEN) HVM27: 
(XEN) HVM27: Found Xen hypervisor signature at 40000000
(XEN) HVM27: xen: copy e820...
(XEN) HVM27: Ram Size=0x7f800000 (0x0000000000000000 high)
(XEN) HVM27: Relocating low data from 0x000e2490 to 0x000ef790 (size 2156)
(XEN) HVM27: Relocating init from 0x000e2cfc to 0x7f7e20f0 (size 56804)
(XEN) HVM27: CPU Mhz=2500
(XEN) HVM27: Found 8 PCI devices (max PCI bus is 00)
(XEN) HVM27: Allocated Xen hypercall page at 7f7ff000
(XEN) HVM27: Detected Xen v4.3-unstable
(XEN) HVM27: Found 2 cpu(s) max supported 2 cpu(s)
(XEN) HVM27: xen: copy BIOS tables...
(XEN) HVM27: Copying SMBIOS entry point from 0x00010010 to 0x000fdb10
(XEN) HVM27: Copying MPTABLE from 0xfc001170/fc001180 to 0x000fda10
(XEN) HVM26: Press F12 for boot menu.
(XEN) HVM27: Copying PIR from 0x00010030 to 0x000fd990
(XEN) HVM27: Copying ACPI RSDP from 0x000100b0 to 0x000fd960
(XEN) HVM27: Scan for VGA option rom
(XEN) HVM26: 
(XEN) HVM27: Running option rom at c000:0003
(XEN) stdvga.c:147:d27 entering stdvga and caching modes
(XEN) HVM27: Turning on vga text mode console
(XEN) HVM27: SeaBIOS (version rel-1.7.1-0-g51755c3-20130306_205414-linux-DPMZPO)
(XEN) HVM27: 
(XEN) HVM27: UHCI init on dev 00:01.2 (io=c200)
(XEN) HVM27: Found 1 lpt ports
(XEN) HVM27: Found 1 serial ports
(XEN) HVM27: ATA controller 1 at 1f0/3f4/c220 (irq 14 dev 9)
(XEN) HVM27: ATA controller 2 at 170/374/c228 (irq 15 dev 9)
(XEN) HVM27: ata0-0: QEMU HARDDISK ATA-7 Hard-Disk (10240 MiBytes)
(XEN) HVM27: Searching bootorder for: /pci@i0cf8/*@1,1/drive@0/disk@0
(XEN) HVM27: DVD/CD [ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD]
(XEN) HVM27: Searching bootorder for: /pci@i0cf8/*@1,1/drive@1/disk@0
(XEN) HVM27: PS2 keyboard initialized
(XEN) HVM27: All threads complete.
(XEN) HVM27: Scan for option roms
(XEN) HVM27: Running option rom at c900:0003
(XEN) HVM27: pmm call arg1=1
(XEN) HVM27: pmm call arg1=0
(XEN) HVM27: pmm call arg1=1
(XEN) HVM27: pmm call arg1=0
(XEN) HVM27: Searching bootorder for: /pci@i0cf8/*@4
(XEN) HVM27: Press F12 for boot menu.
(XEN) HVM27: 
(XEN) HVM25: drive 0x000fd910: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=20971521
(XEN) HVM25: 
(XEN) HVM25: Space available for UMB: 000ca000-000ee800
(XEN) HVM25: Returned 61440 bytes of ZoneHigh
(XEN) HVM25: e820 map has 6 items:
(XEN) HVM25:   0: 0000000000000000 - 000000000009fc00 = 1 RAM
(XEN) HVM25:   1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
(XEN) HVM25:   2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
(XEN) HVM25:   3: 0000000000100000 - 000000007f7ff000 = 1 RAM
(XEN) HVM25:   4: 000000007f7ff000 - 000000007f800000 = 2 RESERVED
(XEN) HVM25:   5: 00000000fc000000 - 0000000100000000 = 2 RESERVED
(XEN) HVM25: enter handle_19:
(XEN) HVM25:   NULL
(XEN) HVM25: Booting from Hard Disk...
(XEN) HVM25: Booting from 0000:7c00
(XEN) stdvga.c:151:d25 leaving stdvga
(XEN) HVM26: drive 0x000fd910: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=20971521
(XEN) HVM26: 
(XEN) HVM26: Space available for UMB: 000ca000-000ee800
(XEN) HVM26: Returned 61440 bytes of ZoneHigh
(XEN) HVM26: e820 map has 6 items:
(XEN) HVM26:   0: 0000000000000000 - 000000000009fc00 = 1 RAM
(XEN) HVM26:   1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
(XEN) HVM26:   2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
(XEN) HVM26:   3: 0000000000100000 - 000000007f7ff000 = 1 RAM
(XEN) HVM26:   4: 000000007f7ff000 - 000000007f800000 = 2 RESERVED
(XEN) HVM26:   5: 00000000fc000000 - 0000000100000000 = 2 RESERVED
(XEN) HVM26: enter handle_19:
(XEN) HVM26:   NULL
(XEN) HVM26: Booting from Hard Disk...
(XEN) HVM26: Booting from 0000:7c00
(XEN) stdvga.c:151:d26 leaving stdvga
(XEN) HVM27: drive 0x000fd910: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=20971521
(XEN) HVM27: 
(XEN) HVM27: Space available for UMB: 000ca000-000ee800
(XEN) HVM27: Returned 61440 bytes of ZoneHigh
(XEN) HVM27: e820 map has 6 items:
(XEN) HVM27:   0: 0000000000000000 - 000000000009fc00 = 1 RAM
(XEN) HVM27:   1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
(XEN) HVM27:   2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
(XEN) HVM27:   3: 0000000000100000 - 000000007f7ff000 = 1 RAM
(XEN) HVM27:   4: 000000007f7ff000 - 000000007f800000 = 2 RESERVED
(XEN) HVM27:   5: 00000000fc000000 - 0000000100000000 = 2 RESERVED
(XEN) HVM27: enter handle_19:
(XEN) HVM27:   NULL
(XEN) HVM27: Booting from Hard Disk...
(XEN) HVM27: Booting from 0000:7c00
(XEN) stdvga.c:151:d27 leaving stdvga
(XEN) stdvga.c:147:d25 entering stdvga and caching modes
(XEN) stdvga.c:151:d25 leaving stdvga
(XEN) stdvga.c:147:d27 entering stdvga and caching modes
(XEN) stdvga.c:151:d27 leaving stdvga
(XEN) stdvga.c:147:d26 entering stdvga and caching modes
(XEN) stdvga.c:151:d26 leaving stdvga
(XEN) irq.c:270: Dom25 PCI link 0 changed 5 -> 0
(XEN) irq.c:270: Dom25 PCI link 1 changed 10 -> 0
(XEN) irq.c:270: Dom25 PCI link 2 changed 11 -> 0
(XEN) irq.c:270: Dom25 PCI link 3 changed 5 -> 0
(XEN) irq.c:375: Dom25 callback via changed to PCI INTx Dev 0x03 IntA
(XEN) irq.c:270: Dom27 PCI link 0 changed 5 -> 0
(XEN) irq.c:270: Dom27 PCI link 1 changed 10 -> 0
(XEN) irq.c:270: Dom27 PCI link 2 changed 11 -> 0
(XEN) irq.c:270: Dom27 PCI link 3 changed 5 -> 0
(XEN) irq.c:375: Dom27 callback via changed to PCI INTx Dev 0x03 IntA
(XEN) irq.c:270: Dom26 PCI link 0 changed 5 -> 0
(XEN) irq.c:270: Dom26 PCI link 1 changed 10 -> 0
(XEN) irq.c:270: Dom26 PCI link 2 changed 11 -> 0
(XEN) irq.c:270: Dom26 PCI link 3 changed 5 -> 0
(XEN) irq.c:375: Dom26 callback via changed to PCI INTx Dev 0x03 IntA

> 
> At first glance it looks like maybe qemu is trying to access, via the
> mapcache, pages which have been ballooned out.  But it seems like it
> should only be doing so in response to a guest request -- is this
> correct, Anthony?
> 
> Andy, are there any patches in the XenServer qemu patchqueue that
> might be related to this?
> 
>  -George


* Re: frequently ballooning results in qemu exit
  2013-03-14 10:38   ` Anthony PERARD
@ 2013-03-14 14:10     ` Hanweidong
  2013-03-14 14:34       ` Tim Deegan
  0 siblings, 1 reply; 25+ messages in thread
From: Hanweidong @ 2013-03-14 14:10 UTC (permalink / raw)
  To: Anthony PERARD, George Dunlap
  Cc: Andrew Cooper, Gonglei (Arei), Yanqiangjun, xen-devel



> -----Original Message-----
> From: Anthony PERARD [mailto:anthony.perard@citrix.com]
> Sent: 14 March 2013 18:39
> To: George Dunlap
> Cc: Hanweidong; xen-devel@lists.xen.org; Gonglei (Arei); Yanqiangjun;
> Andrew Cooper
> Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> 
> On 14/03/13 10:17, George Dunlap wrote:
> > On Wed, Mar 13, 2013 at 1:50 PM, Hanweidong <hanweidong@huawei.com>
> wrote:
> >> We created a 64bit SLES11 SP1 guest, and then used a script to
> change memory (using mem-set command) periodically (in 1 second): set
> 1G, set 2G, set 1G, set 2G, and so on.
> >> After a few minutes, we encountered QEMU exit due to SIGBUS error.
> Below is the call trace captured by gdb:
> >>
> >> The call trace:
> >> Program received signal SIGBUS, Bus error.
> >> 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> >> (gdb) bt
> >> #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> >> #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>,
> addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at
> /usr/include/bits/string3.h:52
> >> #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>,
> val=<optimized out>, i=<optimized out>, req=<optimized out>,
> addr=<optimized out>)
> >>     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
> >> #3  write_phys_req_item (val=<optimized out>, i=<optimized out>,
> req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-
> dir/xen-all.c:720
> >> #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-
> dir/xen-all.c:736
> >> #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-
> dir/xen-all.c:793
> >> #6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0)
> at /opt/new/tools/qemu-xen-dir/xen-all.c:868
> >> #7  0x00007f94fa5e3262 in qemu_iohandler_poll
> (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>,
> xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
> >> #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized
> out>) at main-loop.c:418
> >> #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
> >> #10 main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at vl.c:3999
> >>
> >> It looks mapcache has something wrong because memcpy failed with the
> address from mapcache. Any ideas about this issue? Thanks!
> >
> > Which version of Xen and qemu are you using?  In particular,
> > qemu-upstream (aka qemu-xen) or qemu-traditional?  And what guest are
> > you using?  Is there anything on the xen console (either via the
> > serial port or 'xl dmesg')?
> >
> > At first glance it looks like maybe qemu is trying to access, via the
> > mapcache, pages which have been ballooned out.  But it seems like it
> > should only be doing so in response to a guest request -- is this
> > correct, Anthony?
> 
> Yes, this look like a guest IO request. One things I don't know is what
> happen if there is guest addresses present in the mapcache that have
> been balloon out, then but back in the guest, are those addresses in
> mapcache still correct?
> 

I'm also curious about this. There is a window between the memory being
ballooned out and QEMU invalidating the mapcache.

We found that the failing address was obtained from an existing mapcache entry,
not generated by xen_remap_bucket().

--weidong


* Re: frequently ballooning results in qemu exit
  2013-03-14 14:10     ` Hanweidong
@ 2013-03-14 14:34       ` Tim Deegan
  2013-03-15  5:54         ` Hanweidong
  0 siblings, 1 reply; 25+ messages in thread
From: Tim Deegan @ 2013-03-14 14:34 UTC (permalink / raw)
  To: Hanweidong
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, xen-devel,
	Gonglei (Arei),
	Anthony PERARD

At 14:10 +0000 on 14 Mar (1363270234), Hanweidong wrote:
> > >> The call trace:
> > >> Program received signal SIGBUS, Bus error.
> > >> 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > >> (gdb) bt
> > >> #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > >> #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>,
> > addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at
> > /usr/include/bits/string3.h:52
> > >> #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>,
> > val=<optimized out>, i=<optimized out>, req=<optimized out>,
> > addr=<optimized out>)
> > >>     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
> > >> #3  write_phys_req_item (val=<optimized out>, i=<optimized out>,
> > req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-
> > dir/xen-all.c:720
> > >> #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-
> > dir/xen-all.c:736
> > >> #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-
> > dir/xen-all.c:793
> > >> #6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0)
> > at /opt/new/tools/qemu-xen-dir/xen-all.c:868
> > >> #7  0x00007f94fa5e3262 in qemu_iohandler_poll
> > (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>,
> > xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
> > >> #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized
> > out>) at main-loop.c:418
> > >> #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
> > >> #10 main (argc=<optimized out>, argv=<optimized out>,
> > envp=<optimized out>) at vl.c:3999
> > >>
> > >> It looks mapcache has something wrong because memcpy failed with the
> > address from mapcache. Any ideas about this issue? Thanks!
> > >
> > > Which version of Xen and qemu are you using?  In particular,
> > > qemu-upstream (aka qemu-xen) or qemu-traditional?  And what guest are
> > > you using?  Is there anything on the xen console (either via the
> > > serial port or 'xl dmesg')?
> > >
> > > At first glance it looks like maybe qemu is trying to access, via the
> > > mapcache, pages which have been ballooned out.  But it seems like it
> > > should only be doing so in response to a guest request -- is this
> > > correct, Anthony?
> > 
> > Yes, this look like a guest IO request. One things I don't know is what
> > happen if there is guest addresses present in the mapcache that have
> > been balloon out, then but back in the guest, are those addresses in
> > mapcache still correct?
> > 
> 
> I'm also curious about this. There is a window between memory balloon out 
> and QEMU invalidate mapcache. 

That by itself is OK; I don't think we need to provide any meaningful
semantics if the guest is accessing memory that it's ballooned out. 

The question is where the SIGBUS comes from: either qemu has a mapping
of the old memory, in which case it can write to it safely, or it
doesn't, in which case it shouldn't try.

Is this running on a linux kernel?  ISTR some BSD kernels would
demand-populate foreign mappings, which might fail like this.

Tim.


* Re: frequently ballooning results in qemu exit
  2013-03-14 14:34       ` Tim Deegan
@ 2013-03-15  5:54         ` Hanweidong
  2013-03-21 12:15           ` Tim Deegan
  0 siblings, 1 reply; 25+ messages in thread
From: Hanweidong @ 2013-03-15  5:54 UTC (permalink / raw)
  To: Tim Deegan
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, xen-devel,
	Gonglei (Arei),
	Anthony PERARD



> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Tim Deegan
> Sent: 14 March 2013 22:34
> To: Hanweidong
> Cc: George Dunlap; Andrew Cooper; Yanqiangjun; xen-devel@lists.xen.org;
> Gonglei (Arei); Anthony PERARD
> Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> 
> At 14:10 +0000 on 14 Mar (1363270234), Hanweidong wrote:
> > > >> The call trace:
> > > >> Program received signal SIGBUS, Bus error.
> > > >> 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > > >> (gdb) bt
> > > >> #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > > >> #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>,
> > > addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at
> > > /usr/include/bits/string3.h:52
> > > >> #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>,
> > > val=<optimized out>, i=<optimized out>, req=<optimized out>,
> > > addr=<optimized out>)
> > > >>     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
> > > >> #3  write_phys_req_item (val=<optimized out>, i=<optimized out>,
> > > req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-
> xen-
> > > dir/xen-all.c:720
> > > >> #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-
> xen-
> > > dir/xen-all.c:736
> > > >> #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-
> xen-
> > > dir/xen-all.c:793
> > > >> #6  0x00007f94fa748abe in cpu_handle_ioreq
> (opaque=0x7f94fb39d3f0)
> > > at /opt/new/tools/qemu-xen-dir/xen-all.c:868
> > > >> #7  0x00007f94fa5e3262 in qemu_iohandler_poll
> > > (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>,
> > > xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
> > > >> #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized
> > > out>) at main-loop.c:418
> > > >> #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
> > > >> #10 main (argc=<optimized out>, argv=<optimized out>,
> > > envp=<optimized out>) at vl.c:3999
> > > >>
> > > >> It looks mapcache has something wrong because memcpy failed with
> the
> > > address from mapcache. Any ideas about this issue? Thanks!
> > > >
> > > > Which version of Xen and qemu are you using?  In particular,
> > > > qemu-upstream (aka qemu-xen) or qemu-traditional?  And what guest
> are
> > > > you using?  Is there anything on the xen console (either via the
> > > > serial port or 'xl dmesg')?
> > > >
> > > > At first glance it looks like maybe qemu is trying to access, via
> the
> > > > mapcache, pages which have been ballooned out.  But it seems like
> it
> > > > should only be doing so in response to a guest request -- is this
> > > > correct, Anthony?
> > >
> > > Yes, this look like a guest IO request. One things I don't know is
> what
> > > happen if there is guest addresses present in the mapcache that
> have
> > > been balloon out, then but back in the guest, are those addresses
> in
> > > mapcache still correct?
> > >
> >
> > I'm also curious about this. There is a window between memory balloon
> out
> > and QEMU invalidate mapcache.
> 
> That by itself is OK; I don't think we need to provide any meaningful
> semantics if the guest is accessing memory that it's ballooned out.
> 
> The question is where the SIGBUS comes from: either qemu has a mapping
> of the old memory, in which case it can write to it safely, or it
> doesn't, in which case it shouldn't try.
> 

The error always happens at the memcpy in the if (is_write) branch of address_space_rw.

We found that, after the last xen_invalidate_map_cache, the mapcache entry related to the failed address was mapped:
	==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
	==xen_remap_bucket== begin size=1048576 ,address_index=7a3
	==xen_remap_bucket== end entry->paddr_index=7a3,entry->vaddr_base=2a2d9000,size=1048576,address_index=7a3
and then QEMU used the mapped address successfully several times (more than 20) before the error occurred.
      ...
	==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
	==xen_map_cache== return 2a2d9000+c1ec0=2a39aec0
	==address_space_rw== ptr=2a39aec0
	==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
	==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
	==address_space_rw== ptr=2a39aec4
	==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
	==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
	==address_space_rw== ptr=2a39aec8
	==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
	==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
	==address_space_rw== ptr=2a39aecc
	==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
	==xen_map_cache== return 92a407000+6c108=2a473108
	==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
	==xen_map_cache==first return 2a407000+6c10c=2a47310c
	==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
	==xen_map_cache==first return 2a407000+6c110=2a473110
	==xen_map_cache== phys_addr=7a395000 size=0 lock=0
	==xen_map_cache== return 2a2d9000+95000=2a36e000
	==address_space_rw== ptr=2a36e000
      here, the SIGBUS error occurred.


> Is this running on a linux kernel?  ISTR some BSD kernels would
> demand-populate foreign mappings, which might fail like this.
> 

It's a Linux kernel.

--weidong


* Re: frequently ballooning results in qemu exit
  2013-03-15  5:54         ` Hanweidong
@ 2013-03-21 12:15           ` Tim Deegan
  2013-03-21 13:33             ` Hanweidong
  0 siblings, 1 reply; 25+ messages in thread
From: Tim Deegan @ 2013-03-21 12:15 UTC (permalink / raw)
  To: Hanweidong
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, xen-devel,
	Gonglei (Arei),
	Anthony PERARD

At 05:54 +0000 on 15 Mar (1363326854), Hanweidong wrote:
> > > I'm also curious about this. There is a window between memory balloon
> > out
> > > and QEMU invalidate mapcache.
> > 
> > That by itself is OK; I don't think we need to provide any meaningful
> > semantics if the guest is accessing memory that it's ballooned out.
> > 
> > The question is where the SIGBUS comes from: either qemu has a mapping
> > of the old memory, in which case it can write to it safely, or it
> > doesn't, in which case it shouldn't try.
> 
> The error always happened at memcpy in if (is_write) branch in
> address_space_rw.

Sure, but _why_?  Why does this access cause SIGBUS?  Presumably there's
some part of the mapcache code that thinks it has a mapping there when
it doesn't.

> We found that, after the last xen_invalidate_map_cache, the mapcache entry related to the failed address was mapped:
> 	==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
> 	==xen_remap_bucket== begin size=1048576 ,address_index=7a3
> 	==xen_remap_bucket== end entry->paddr_index=7a3,entry->vaddr_base=2a2d9000,size=1048576,address_index=7a3

OK, so that's 0x2a2d9000 -- 0x2a3d8fff.

> 	==address_space_rw== ptr=2a39aec0
> 	==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
> 	==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
> 	==address_space_rw== ptr=2a39aec4
> 	==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
> 	==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
> 	==address_space_rw== ptr=2a39aec8
> 	==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
> 	==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
> 	==address_space_rw== ptr=2a39aecc

These are all to page 0x2a39a___.

> 	==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
> 	==xen_map_cache== return 92a407000+6c108=2a473108
> 	==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
> 	==xen_map_cache==first return 2a407000+6c10c=2a47310c
> 	==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
> 	==xen_map_cache==first return 2a407000+6c110=2a473110
> 	==xen_map_cache== phys_addr=7a395000 size=0 lock=0
> 	==xen_map_cache== return 2a2d9000+95000=2a36e000
> 	==address_space_rw== ptr=2a36e000

And this is to page 0x2a36e___, a different page in the same bucket.

>       here, the SIGBUS error occurred.

So that page isn't mapped.  Which means:
- it was never mapped (and the mapcache code didn't handle the error
  correctly at map time); or
- it was never mapped (and the mapcache hasn't checked its own records
  before using the map); or
- it was mapped (and something unmapped it in the meantime).

Why not add some tests in xen_remap_bucket to check that all the pages
that qemu records as mapped are actually there?
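
A minimal sketch of such a check, for illustration only -- it assumes the
err[] array returned by xc_map_foreign_bulk() and the per-bucket
valid_mapping bitmap kept by qemu's xen-mapcache.c; the helper name and
exact types here are made up, it is not a drop-in patch:

#include <stdio.h>

/* Cross-check, right after xc_map_foreign_bulk(), that every pfn the
 * mapcache records as valid really did map, and that every pfn that
 * mapped is recorded.  A mismatch would then be caught at map time
 * instead of showing up later as SIGBUS in memcpy(). */
static void check_bucket_mapping(const int *err,
                                 const unsigned long *valid_mapping,
                                 unsigned long nb_pfn,
                                 unsigned long address_index)
{
    const unsigned long bits = 8 * sizeof(unsigned long);
    unsigned long i;

    for (i = 0; i < nb_pfn; i++) {
        int valid = !!(valid_mapping[i / bits] & (1UL << (i % bits)));

        if (err[i] && valid) {
            fprintf(stderr, "mapcache: bucket %lx slot %lu marked valid but "
                    "xc_map_foreign_bulk() failed (err=%d)\n",
                    address_index, i, err[i]);
        } else if (!err[i] && !valid) {
            fprintf(stderr, "mapcache: bucket %lx slot %lu mapped but not "
                    "marked valid\n", address_index, i);
        }
    }
}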

Tim.


* Re: frequently ballooning results in qemu exit
  2013-03-21 12:15           ` Tim Deegan
@ 2013-03-21 13:33             ` Hanweidong
  2013-03-25 12:40               ` [Qemu-devel] [Xen-devel] " Hanweidong
  2013-03-25 12:40               ` Hanweidong
  0 siblings, 2 replies; 25+ messages in thread
From: Hanweidong @ 2013-03-21 13:33 UTC (permalink / raw)
  To: Tim Deegan
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, xen-devel,
	Gonglei (Arei),
	Anthony PERARD

> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Tim Deegan
> Sent: 21 March 2013 20:16
> To: Hanweidong
> Cc: George Dunlap; Andrew Cooper; Yanqiangjun; xen-devel@lists.xen.org;
> Gonglei (Arei); Anthony PERARD
> Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> 
> At 05:54 +0000 on 15 Mar (1363326854), Hanweidong wrote:
> > > > I'm also curious about this. There is a window between memory
> balloon
> > > out
> > > > and QEMU invalidate mapcache.
> > >
> > > That by itself is OK; I don't think we need to provide any
> meaningful
> > > semantics if the guest is accessing memory that it's ballooned out.
> > >
> > > The question is where the SIGBUS comes from: either qemu has a
> mapping
> > > of the old memory, in which case it can write to it safely, or it
> > > doesn't, in which case it shouldn't try.
> >
> > The error always happened at memcpy in if (is_write) branch in
> > address_space_rw.
> 
> Sure, but _why_?  Why does this access cause SIGBUS?  Presumably
> there's
> some part of the mapcache code that thinks it has a mapping there when
> it doesn't.
> 
> > We found that, after the last xen_invalidate_map_cache, the mapcache
> entry related to the failed address was mapped:
> > 	==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
> > 	==xen_remap_bucket== begin size=1048576 ,address_index=7a3
> > 	==xen_remap_bucket== end entry->paddr_index=7a3,entry-
> >vaddr_base=2a2d9000,size=1048576,address_index=7a3
> 
> OK, so that's 0x2a2d9000 -- 0x2a3d8fff.
> 
> > 	==address_space_rw== ptr=2a39aec0
> > 	==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
> > 	==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
> > 	==address_space_rw== ptr=2a39aec4
> > 	==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
> > 	==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
> > 	==address_space_rw== ptr=2a39aec8
> > 	==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
> > 	==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
> > 	==address_space_rw== ptr=2a39aecc
> 
> These are all to page 0x2a3e9a___.
> 
> > 	==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
> > 	==xen_map_cache== return 92a407000+6c108=2a473108
> > 	==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
> > 	==xen_map_cache==first return 2a407000+6c10c=2a47310c
> > 	==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
> > 	==xen_map_cache==first return 2a407000+6c110=2a473110
> > 	==xen_map_cache== phys_addr=7a395000 size=0 lock=0
> > 	==xen_map_cache== return 2a2d9000+95000=2a36e000
> > 	==address_space_rw== ptr=2a36e000
> 
> And this is to page 0x2a36e___, a different page in the same bucket.
> 
> >       here, the SIGBUS error occurred.
> 
> So that page isn't mapped.  Which means:
> - it was never mapped (and the mapcache code didn't handle the error
>   correctly at map time); or
> - it was never mapped (and the mapcache hasn't checked its own records
>   before using the map); or
> - it was mapped (and something unmapped it in the meantime).
> 
> Why not add some tests in xen_remap_bucket to check that all the pages
> that qemu records as mapped are actually there?
> 

Hi Tim,

We did more debugging these days, and root-caused it just now. We found that the pfn which caused the SIGBUS was never mapped. We verified this by printing the *err* array returned by xc_map_foreign_bulk(), and the pfn's bit was also not set in entry->valid_mapping. But test_bits() didn't return the right value in this situation. qemu_get_ram_ptr() almost always passes size 0 to xen_map_cache(), and xen_map_cache() then calls test_bits() to test valid_mapping, but test_bits() always returns 1 when size is 0 because qemu's find_next_zero_bit() has the following code:

    if (offset >= size) {
        return size;
    }

One way to fix the test_bits() issue is to change size >> XC_PAGE_SHIFT to (size + 1 << XC_PAGE_SHIFT) >> XC_PAGE_SHIFT in xen_map_cache(). But we think that still cannot solve the problem perfectly, because test_bits() will only test one bit when size is 0. Why does qemu_get_ram_ptr() always call xen_map_cache() with size 0 when block->offset == 0? If some operation needs to access a big address range (more than 1 page), this way still has a problem. So I think it should pass the actual size to xen_map_cache().
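
A minimal standalone sketch of that failure mode, for illustration only
(simplified from qemu's bitops: the bitmap is reduced to a plain array, but
the argument order and the early return match test_bits() /
find_next_zero_bit()):

#include <stdbool.h>
#include <stddef.h>

/* Simplified find_next_zero_bit(): index of the first clear bit in
 * [offset, size), or size if there is none.  With size == 0 the early
 * return fires and no bit is ever examined. */
static size_t find_next_zero_bit(const bool *bits, size_t size, size_t offset)
{
    if (offset >= size) {
        return size;
    }
    while (offset < size && bits[offset]) {
        offset++;
    }
    return offset;
}

/* Model of test_bits(nr, size, map): are the size bits starting at nr all set? */
static bool test_bits(size_t nr, size_t size, const bool *bits)
{
    return find_next_zero_bit(bits, nr + size, nr) >= nr + size;
}

/* With size == 0, find_next_zero_bit(bits, nr, nr) returns nr immediately,
 * so nr >= nr + 0 holds and test_bits() reports the pfn as mapped even if
 * its bit was never set in valid_mapping. */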

--weidong



* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-03-21 13:33             ` Hanweidong
@ 2013-03-25 12:40               ` Hanweidong
  2013-03-29 12:37                 ` [Qemu-devel] " Stefano Stabellini
  2013-03-29 12:37                 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
  2013-03-25 12:40               ` Hanweidong
  1 sibling, 2 replies; 25+ messages in thread
From: Hanweidong @ 2013-03-25 12:40 UTC (permalink / raw)
  To: Tim Deegan
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, qemu-devel, xen-devel,
	Gonglei (Arei),
	Anthony PERARD, Wangzhenguo

We fixed this issue with the below patch, which computes the correct size for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() call xen_map_cache() with size 0 if the requested address is in the RAM. xen_map_cache() then passes the size 0 to test_bits() to check whether the corresponding pfn was mapped in the cache. But test_bits() always returns 1 when size is 0, without testing any bit. Actually, for this case, test_bits() should check one bit. So this patch introduces __test_bit_size, which is greater than 0 and a multiple of XC_PAGE_SIZE, so that test_bits() works correctly with __test_bit_size >> XC_PAGE_SHIFT as its size.
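
For illustration, how the __test_bit_size rounding in the patch below behaves
(assuming XC_PAGE_SIZE == 0x1000; the addresses are just example values):

/*   size == 0                          -> __test_bit_size = 0x1000 -> 1 bit tested
 *   phys_addr = 0x7a3c1ec0, size == 4  -> 4 + 0xec0 = 0xec4, rounded up
 *                                         to 0x1000               -> 1 bit tested
 *   phys_addr = 0x7a3c0ffc, size == 8  -> 8 + 0xffc = 0x1004, rounded up
 *                                         to 0x2000               -> 2 bits tested
 */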

Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
Signed-off-by: Weidong Han <hanweidong@huawei.com>

diff --git a/xen-mapcache.c b/xen-mapcache.c
index 31c06dc..bd4a97f 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
     hwaddr address_index;
     hwaddr address_offset;
     hwaddr __size = size;
+    hwaddr __test_bit_size = size;
     bool translated = false;

 tryagain:
@@ -210,7 +211,23 @@ tryagain:

     trace_xen_map_cache(phys_addr);

-    if (address_index == mapcache->last_address_index && !lock && !__size) {
+    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
+
+    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
+    if (size) {
+        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
+
+        if (__test_bit_size % XC_PAGE_SIZE) {
+            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size % XC_PAGE_SIZE);
+        }
+    } else {
+        __test_bit_size = XC_PAGE_SIZE;
+    }
+
+    if (address_index == mapcache->last_address_index && !lock && !__size &&
+        test_bits(address_offset >> XC_PAGE_SHIFT,
+                  __test_bit_size >> XC_PAGE_SHIFT,
+                  entry->valid_mapping)) {
         trace_xen_map_cache_return(mapcache->last_address_vaddr + address_offset);
         return mapcache->last_address_vaddr + address_offset;
     }
@@ -225,11 +242,10 @@ tryagain:
         __size = MCACHE_BUCKET_SIZE;
     }

-    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
-
     while (entry && entry->lock && entry->vaddr_base &&
             (entry->paddr_index != address_index || entry->size != __size ||
-             !test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
+             !test_bits(address_offset >> XC_PAGE_SHIFT,
+                 __test_bit_size >> XC_PAGE_SHIFT,
                  entry->valid_mapping))) {
         pentry = entry;
         entry = entry->next;
@@ -241,13 +257,15 @@ tryagain:
     } else if (!entry->lock) {
         if (!entry->vaddr_base || entry->paddr_index != address_index ||
                 entry->size != __size ||
-                !test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
+                !test_bits(address_offset >> XC_PAGE_SHIFT,
+                    __test_bit_size >> XC_PAGE_SHIFT,
                     entry->valid_mapping)) {
             xen_remap_bucket(entry, __size, address_index);
         }
     }

-    if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
+    if(!test_bits(address_offset >> XC_PAGE_SHIFT,
+                __test_bit_size >> XC_PAGE_SHIFT,
                 entry->valid_mapping)) {
         mapcache->last_address_index = -1;
         if (!translated && mapcache->phys_offset_to_gaddr) {


> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Hanweidong
> Sent: 21 March 2013 21:33
> To: Tim Deegan
> Cc: George Dunlap; Andrew Cooper; Yanqiangjun; xen-devel@lists.xen.org;
> Gonglei (Arei); Anthony PERARD
> Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> 
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > bounces@lists.xen.org] On Behalf Of Tim Deegan
> > Sent: 2013年3月21日 20:16
> > To: Hanweidong
> > Cc: George Dunlap; Andrew Cooper; Yanqiangjun; xen-
> devel@lists.xen.org;
> > Gonglei (Arei); Anthony PERARD
> > Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> >
> > At 05:54 +0000 on 15 Mar (1363326854), Hanweidong wrote:
> > > > > I'm also curious about this. There is a window between memory
> > balloon
> > > > out
> > > > > and QEMU invalidate mapcache.
> > > >
> > > > That by itself is OK; I don't think we need to provide any
> > meaningful
> > > > semantics if the guest is accessing memory that it's ballooned
> out.
> > > >
> > > > The question is where the SIGBUS comes from: either qemu has a
> > mapping
> > > > of the old memory, in which case it can write to it safely, or it
> > > > doesn't, in which case it shouldn't try.
> > >
> > > The error always happened at memcpy in if (is_write) branch in
> > > address_space_rw.
> >
> > Sure, but _why_?  Why does this access cause SIGBUS?  Presumably
> > there's
> > some part of the mapcache code that thinks it has a mapping there
> when
> > it doesn't.
> >
> > > We found that, after the last xen_invalidate_map_cache, the
> mapcache
> > entry related to the failed address was mapped:
> > > 	==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
> > > 	==xen_remap_bucket== begin size=1048576 ,address_index=7a3
> > > 	==xen_remap_bucket== end entry->paddr_index=7a3,entry-
> > >vaddr_base=2a2d9000,size=1048576,address_index=7a3
> >
> > OK, so that's 0x2a2d9000 -- 0x2a3d8fff.
> >
> > > 	==address_space_rw== ptr=2a39aec0
> > > 	==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
> > > 	==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
> > > 	==address_space_rw== ptr=2a39aec4
> > > 	==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
> > > 	==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
> > > 	==address_space_rw== ptr=2a39aec8
> > > 	==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
> > > 	==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
> > > 	==address_space_rw== ptr=2a39aecc
> >
> > These are all to page 0x2a3e9a___.
> >
> > > 	==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
> > > 	==xen_map_cache== return 92a407000+6c108=2a473108
> > > 	==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
> > > 	==xen_map_cache==first return 2a407000+6c10c=2a47310c
> > > 	==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
> > > 	==xen_map_cache==first return 2a407000+6c110=2a473110
> > > 	==xen_map_cache== phys_addr=7a395000 size=0 lock=0
> > > 	==xen_map_cache== return 2a2d9000+95000=2a36e000
> > > 	==address_space_rw== ptr=2a36e000
> >
> > And this is to page 0x2a36e___, a different page in the same bucket.
> >
> > >       here, the SIGBUS error occurred.
> >
> > So that page isn't mapped.  Which means:
> > - it was never mapped (and the mapcache code didn't handle the error
> >   correctly at map time); or
> > - it was never mapped (and the mapcache hasn't checked its own
> records
> >   before using the map); or
> > - it was mapped (and something unmapped it in the meantime).
> >
> > Why not add some tests in xen_remap_bucket to check that all the
> pages
> > that qemu records as mapped are actually there?
> >
> 
> Hi Tim,
> 
> We did more debugging these days, and root caused it just now. We found
> the pfn which caused the SIGBUS was never mapped. We verified it by
> printing *err* variable return from xc_map_foreign_bulk(), and its bit
> was also not set in entry->valid_mapping. But test_bits didn't return
> right value for this situation. qemu_get_ram_ptr() almost passed size 0
> to xen_map_cache(), and then xen_map_cache() calls test_bits() to test
> valid_mapping, and test_bits always return 1 when size is 0 because
> qemu's find_next_zero_bit() has following code:
> 	if (offset >= size) {
>         return size;
>     	}
> 
> One way to fix test_bits() issue is to change size >> XC_PAGE_SHIFT to
> (size + 1 << XC_PAGE_SHIFT) >> XC_PAGE_SHIFT in xen_map_cache(). But we
> think it still cannot solve the problem perfectly, because test_bits
> will only test one bit when size is 0. Why does qemu_get_ram_ptr()
> always call xen_map_cache() with size is 0 when block->offset == 0? If
> some operations need to access big range address (more than 1 page),
> this way still has problem. So I think it should pass actual size to
> xen_map_cache().
> 
> --weidong
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-03-25 12:40               ` [Qemu-devel] [Xen-devel] " Hanweidong
  2013-03-29 12:37                 ` [Qemu-devel] " Stefano Stabellini
@ 2013-03-29 12:37                 ` Stefano Stabellini
  2013-03-30 15:04                   ` [Qemu-devel] " Hanweidong
  2013-03-30 15:04                   ` [Qemu-devel] [Xen-devel] " Hanweidong
  1 sibling, 2 replies; 25+ messages in thread
From: Stefano Stabellini @ 2013-03-29 12:37 UTC (permalink / raw)
  To: Hanweidong
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, Tim (Xen.org),
	qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo

On Mon, 25 Mar 2013, Hanweidong wrote:
> We fixed this issue by below patch which computed the correct size for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call xen_map_cache() with size is 0 if the requested address is in the RAM. Then xen_map_cache() will pass the size 0 to test_bits() for checking if the corresponding pfn was mapped in cache. But test_bits() will always return 1 when size is 0 without any bit testing. Actually, for this case, test_bits should check one bit. So this patch introduced a __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE, then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT as its size.
> 
> Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> Signed-off-by: Weidong Han <hanweidong@huawei.com>

Thanks for the patch and for debugging this difficult problem.
The reality is that size is never actually 0: when qemu_get_ram_ptr
calls xen_map_cache with size 0, it actually means "map until the end of
the page". As a consequence test_bits should always test at least 1 bit,
like you wrote.

I tried to simplify your patch a bit; does this one work for you?


diff --git a/xen-mapcache.c b/xen-mapcache.c
index dc6d1fa..b03b373 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
     hwaddr address_index;
     hwaddr address_offset;
     hwaddr __size = size;
+    hwaddr __test_bit_size = size >> XC_PAGE_SHIFT;
     bool translated = false;
 
 tryagain:
@@ -224,12 +225,16 @@ tryagain:
     } else {
         __size = MCACHE_BUCKET_SIZE;
     }
+    /* always test at least one page */
+    if (!__test_bit_size) {
+        __test_bit_size = 1UL;
+    }
 
     entry = &mapcache->entry[address_index % mapcache->nr_buckets];
 
     while (entry && entry->lock && entry->vaddr_base &&
             (entry->paddr_index != address_index || entry->size != __size ||
-             !test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
+             !test_bits(address_offset >> XC_PAGE_SHIFT, __test_bit_size,
                  entry->valid_mapping))) {
         pentry = entry;
         entry = entry->next;
@@ -241,13 +246,13 @@ tryagain:
     } else if (!entry->lock) {
         if (!entry->vaddr_base || entry->paddr_index != address_index ||
                 entry->size != __size ||
-                !test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
+                !test_bits(address_offset >> XC_PAGE_SHIFT, __test_bit_size,
                     entry->valid_mapping)) {
             xen_remap_bucket(entry, __size, address_index);
         }
     }
 
-    if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
+    if(!test_bits(address_offset >> XC_PAGE_SHIFT, __test_bit_size,
                 entry->valid_mapping)) {
         mapcache->last_address_index = -1;
         if (!translated && mapcache->phys_offset_to_gaddr) {
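
For reference, the page counts the hunk above produces (a worked sketch, assuming XC_PAGE_SHIFT is 12 and MCACHE_BUCKET_SIZE is 1MB):

    size = 0        -> __test_bit_size = 1    (one page tested)
    size = 0x1000   -> __test_bit_size = 1
    size = 0x100000 -> __test_bit_size = 256  (a full bucket)
    size = 0x1001   -> __test_bit_size = 1    (only one page tested)

The last line is the page-alignment assumption questioned in the reply below.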

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-03-29 12:37                 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
  2013-03-30 15:04                   ` [Qemu-devel] " Hanweidong
@ 2013-03-30 15:04                   ` Hanweidong
  2013-04-01 14:39                     ` Stefano Stabellini
  2013-04-01 14:39                     ` Stefano Stabellini
  1 sibling, 2 replies; 25+ messages in thread
From: Hanweidong @ 2013-03-30 15:04 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, Tim (Xen.org),
	qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo


> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> Sent: 2013年3月29日 20:37
> To: Hanweidong
> Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun; qemu-
> devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei); Anthony
> Perard; Wangzhenguo
> Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results in
> qemu exit
> 
> On Mon, 25 Mar 2013, Hanweidong wrote:
> > We fixed this issue by below patch which computed the correct size
> for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call
> xen_map_cache() with size is 0 if the requested address is in the RAM.
> Then xen_map_cache() will pass the size 0 to test_bits() for checking
> if the corresponding pfn was mapped in cache. But test_bits() will
> always return 1 when size is 0 without any bit testing. Actually, for
> this case, test_bits should check one bit. So this patch introduced a
> __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE,
> then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT
> as its size.
> >
> > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> 
> Thanks for the patch and for debugging this difficult problem.
> The reality is that size is never actually 0: when qemu_get_ram_ptr
> calls xen_map_cache with size 0, it actually means "map until the end
> of
> the page". As a consequence test_bits should always test at least 1 bit,
> like you wrote.

Yes, when size is 0 we can simply set __test_bit_size to 1. But for size > 0, I think setting __test_bit_size to size >> XC_PAGE_SHIFT is incorrect. If size is not a multiple of XC_PAGE_SIZE, then the trailing (size % XC_PAGE_SIZE) part also needs one bit tested. For example, if size is XC_PAGE_SIZE + 1, two bits need to be tested, but size >> XC_PAGE_SHIFT is only 1.
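
Spelled out with the round-up from the original patch (assuming XC_PAGE_SIZE is 4096 and a page-aligned phys_addr):

    size >> XC_PAGE_SHIFT                          = 1     (one bit tested: too few)

    n  = size + (phys_addr & (XC_PAGE_SIZE - 1))   = 4097
    n += XC_PAGE_SIZE - (n % XC_PAGE_SIZE)         = 8192
    n >> XC_PAGE_SHIFT                             = 2     (both pages tested)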

-weidong

> 
> I tried to simplify your patch a bit, does this one work for you?
> 
> 
> diff --git a/xen-mapcache.c b/xen-mapcache.c
> index dc6d1fa..b03b373 100644
> --- a/xen-mapcache.c
> +++ b/xen-mapcache.c
> @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr
> size,
>      hwaddr address_index;
>      hwaddr address_offset;
>      hwaddr __size = size;
> +    hwaddr __test_bit_size = size >> XC_PAGE_SHIFT;
>      bool translated = false;
> 
>  tryagain:
> @@ -224,12 +225,16 @@ tryagain:
>      } else {
>          __size = MCACHE_BUCKET_SIZE;
>      }
> +    /* always test at least one page */
> +    if (!__test_bit_size) {
> +        __test_bit_size = 1UL;
> +    }
> 
>      entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> 
>      while (entry && entry->lock && entry->vaddr_base &&
>              (entry->paddr_index != address_index || entry->size !=
> __size ||
> -             !test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> +             !test_bits(address_offset >> XC_PAGE_SHIFT,
> __test_bit_size,
>                   entry->valid_mapping))) {
>          pentry = entry;
>          entry = entry->next;
> @@ -241,13 +246,13 @@ tryagain:
>      } else if (!entry->lock) {
>          if (!entry->vaddr_base || entry->paddr_index != address_index
> ||
>                  entry->size != __size ||
> -                !test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> +                !test_bits(address_offset >> XC_PAGE_SHIFT,
> __test_bit_size,
>                      entry->valid_mapping)) {
>              xen_remap_bucket(entry, __size, address_index);
>          }
>      }
> 
> -    if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> +    if(!test_bits(address_offset >> XC_PAGE_SHIFT, __test_bit_size,
>                  entry->valid_mapping)) {
>          mapcache->last_address_index = -1;
>          if (!translated && mapcache->phys_offset_to_gaddr) {

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-03-30 15:04                   ` [Qemu-devel] [Xen-devel] " Hanweidong
@ 2013-04-01 14:39                     ` Stefano Stabellini
  2013-04-02  1:06                       ` Hanweidong
  2013-04-02  1:06                       ` Hanweidong
  2013-04-01 14:39                     ` Stefano Stabellini
  1 sibling, 2 replies; 25+ messages in thread
From: Stefano Stabellini @ 2013-04-01 14:39 UTC (permalink / raw)
  To: Hanweidong
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Yanqiangjun,
	Tim (Xen.org), qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo

[-- Attachment #1: Type: text/plain, Size: 6176 bytes --]

On Sat, 30 Mar 2013, Hanweidong wrote:
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > Sent: 2013年3月29日 20:37
> > To: Hanweidong
> > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun; qemu-
> > devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei); Anthony
> > Perard; Wangzhenguo
> > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results in
> > qemu exit
> > 
> > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > We fixed this issue by below patch which computed the correct size
> > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call
> > xen_map_cache() with size is 0 if the requested address is in the RAM.
> > Then xen_map_cache() will pass the size 0 to test_bits() for checking
> > if the corresponding pfn was mapped in cache. But test_bits() will
> > always return 1 when size is 0 without any bit testing. Actually, for
> > this case, test_bits should check one bit. So this patch introduced a
> > __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE,
> > then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT
> > as its size.
> > >
> > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > 
> > Thanks for the patch and for debugging this difficult problem.
> > The reality is that size is never actually 0: when qemu_get_ram_ptr
> > calls xen_map_cache with size 0, it actually means "map until the end
> > of
> > the page". As a consequence test_bits should always test at least 1 bit,
> > like you wrote.
> 
> Yes, for the case of size is 0, we can just simply set __test_bit_size 1. But for size > 0, I think set __test_bit_size to size >> XC_PAGE_SHIFT is incorrect. If size is not multiple of XC_PAGE_SIZE, then the part of (size % XC_PAGE_SIZE) also needs to test 1 bit. For example size is XC_PAGE_SIZE + 1, then it needs to test 2 bits, but size >> XC_PAGE_SHIFT is only 1. 
> 

I was assuming that the size is always page aligned.
Looking through the code, though, I think it's better not to rely on this assumption.

Looking back at your original patch:



> We fixed this issue by below patch which computed the correct size for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call xen_map_cache() with size is 0 if the requested address is in the RAM. Then xen_map_cache() will pass the size 0 to test_bits() for checking if the corresponding pfn was mapped in cache. But test_bits() will always return 1 when size is 0 without any bit testing. Actually, for this case, test_bits should check one bit. So this patch introduced a __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE, then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT as its size.
> 
> Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> Signed-off-by: Weidong Han <hanweidong@huawei.com>
> 
> diff --git a/xen-mapcache.c b/xen-mapcache.c
> index 31c06dc..bd4a97f 100644
> --- a/xen-mapcache.c
> +++ b/xen-mapcache.c
> @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
>      hwaddr address_index;
>      hwaddr address_offset;
>      hwaddr __size = size;
> +    hwaddr __test_bit_size = size;
>      bool translated = false;
> 
>  tryagain:
> @@ -210,7 +211,23 @@ tryagain:
> 
>      trace_xen_map_cache(phys_addr);
> 
> -    if (address_index == mapcache->last_address_index && !lock && !__size) {
> +    entry = &mapcache->entry[address_index % mapcache->nr_buckets];

there is no need to move this line up here; see below


> +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> +    if (size) {
> +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
> +
> +        if (__test_bit_size % XC_PAGE_SIZE) {
> +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size % XC_PAGE_SIZE);
> +        }
> +    } else {
> +        __test_bit_size = XC_PAGE_SIZE;
> +    }

this is OK


> +    if (address_index == mapcache->last_address_index && !lock && !__size &&
> +        test_bits(address_offset >> XC_PAGE_SHIFT,
> +                  __test_bit_size >> XC_PAGE_SHIFT,
> +                  entry->valid_mapping)) {
>          trace_xen_map_cache_return(mapcache->last_address_vaddr + address_offset);
>          return mapcache->last_address_vaddr + address_offset;
>      }

Unless I am missing something this change is unnecessary: if the mapping
is not valid then mapcache->last_address_index is set to -1.


> @@ -225,11 +242,10 @@ tryagain:
>          __size = MCACHE_BUCKET_SIZE;
>      }
> 
> -    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> -

just leave entry where it is


>      while (entry && entry->lock && entry->vaddr_base &&
>              (entry->paddr_index != address_index || entry->size != __size ||
> -             !test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
> +             !test_bits(address_offset >> XC_PAGE_SHIFT,
> +                 __test_bit_size >> XC_PAGE_SHIFT,
>                   entry->valid_mapping))) {
>          pentry = entry;
>          entry = entry->next;
> @@ -241,13 +257,15 @@ tryagain:
>      } else if (!entry->lock) {
>          if (!entry->vaddr_base || entry->paddr_index != address_index ||
>                  entry->size != __size ||
> -                !test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
> +                !test_bits(address_offset >> XC_PAGE_SHIFT,
> +                    __test_bit_size >> XC_PAGE_SHIFT,
>                      entry->valid_mapping)) {
>              xen_remap_bucket(entry, __size, address_index);
>          }
>      }
> 
> -    if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
> +    if(!test_bits(address_offset >> XC_PAGE_SHIFT,
> +                __test_bit_size >> XC_PAGE_SHIFT,
>                  entry->valid_mapping)) {
>          mapcache->last_address_index = -1;
>          if (!translated && mapcache->phys_offset_to_gaddr) {

This is fine

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-04-01 14:39                     ` Stefano Stabellini
@ 2013-04-02  1:06                       ` Hanweidong
  2013-04-02 13:27                         ` [Qemu-devel] " Stefano Stabellini
  2013-04-02 13:27                         ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
  2013-04-02  1:06                       ` Hanweidong
  1 sibling, 2 replies; 25+ messages in thread
From: Hanweidong @ 2013-04-02  1:06 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, Tim (Xen.org),
	qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo



> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> Sent: 2013年4月1日 22:39
> To: Hanweidong
> Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Gonglei
> (Arei); Anthony Perard; Wangzhenguo
> Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> qemu exit
> 
> On Sat, 30 Mar 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > Sent: 2013年3月29日 20:37
> > > To: Hanweidong
> > > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun; qemu-
> > > devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei); Anthony
> > > Perard; Wangzhenguo
> > > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results
> in
> > > qemu exit
> > >
> > > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > > We fixed this issue by below patch which computed the correct
> size
> > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will
> call
> > > xen_map_cache() with size is 0 if the requested address is in the
> RAM.
> > > Then xen_map_cache() will pass the size 0 to test_bits() for
> checking
> > > if the corresponding pfn was mapped in cache. But test_bits() will
> > > always return 1 when size is 0 without any bit testing. Actually,
> for
> > > this case, test_bits should check one bit. So this patch introduced
> a
> > > __test_bit_size which is greater than 0 and a multiple of
> XC_PAGE_SIZE,
> > > then test_bits can work correctly with __test_bit_size >>
> XC_PAGE_SHIFT
> > > as its size.
> > > >
> > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > >
> > > Thanks for the patch and for debugging this difficult problem.
> > > The reality is that size is never actually 0: when qemu_get_ram_ptr
> > > calls xen_map_cache with size 0, it actually means "map until the
> end
> > > of
> > > the page". As a consequence test_bits should always test at least 1
> bit,
> > > like you wrote.
> >
> > Yes, for the case of size is 0, we can just simply set
> __test_bit_size 1. But for size > 0, I think set __test_bit_size to
> size >> XC_PAGE_SHIFT is incorrect. If size is not multiple of
> XC_PAGE_SIZE, then the part of (size % XC_PAGE_SIZE) also needs to test
> 1 bit. For example size is XC_PAGE_SIZE + 1, then it needs to test 2
> bits, but size >> XC_PAGE_SHIFT is only 1.
> >
> 
> I was assuming that the size is always page aligned.
> Looking through the code actually I think that it's better not to rely
> on this assumption.
> 
> Looking back at your original patch:
> 
> 
> 
> > We fixed this issue by below patch which computed the correct size
> for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call
> xen_map_cache() with size is 0 if the requested address is in the RAM.
> Then xen_map_cache() will pass the size 0 to test_bits() for checking
> if the corresponding pfn was mapped in cache. But test_bits() will
> always return 1 when size is 0 without any bit testing. Actually, for
> this case, test_bits should check one bit. So this patch introduced a
> __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE,
> then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT
> as its size.
> >
> > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> >
> > diff --git a/xen-mapcache.c b/xen-mapcache.c
> > index 31c06dc..bd4a97f 100644
> > --- a/xen-mapcache.c
> > +++ b/xen-mapcache.c
> > @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr
> size,
> >      hwaddr address_index;
> >      hwaddr address_offset;
> >      hwaddr __size = size;
> > +    hwaddr __test_bit_size = size;
> >      bool translated = false;
> >
> >  tryagain:
> > @@ -210,7 +211,23 @@ tryagain:
> >
> >      trace_xen_map_cache(phys_addr);
> >
> > -    if (address_index == mapcache->last_address_index && !lock
> && !__size) {
> > +    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> 
> there is no need to move this line up here, see below
> 
> 
> > +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> > +    if (size) {
> > +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
> > +
> > +        if (__test_bit_size % XC_PAGE_SIZE) {
> > +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size %
> XC_PAGE_SIZE);
> > +        }
> > +    } else {
> > +        __test_bit_size = XC_PAGE_SIZE;
> > +    }
> 
> this is OK
> 
> 
> > +    if (address_index == mapcache->last_address_index && !lock
> && !__size &&
> > +        test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                  __test_bit_size >> XC_PAGE_SHIFT,
> > +                  entry->valid_mapping)) {
> >          trace_xen_map_cache_return(mapcache->last_address_vaddr +
> address_offset);
> >          return mapcache->last_address_vaddr + address_offset;
> >      }
> 
> Unless I am missing something this change is unnecessary: if the
> mapping
> is not valid than mapcache->last_address_index is set to -1.

mapcache->last_address_index means the corresponding bucket (1MB) was mapped, but we noticed that some pages of the bucket may not be mapped. So we need to check whether the page is actually mapped even when address_index equals last_address_index.
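
To make that concrete with the trace from earlier in this thread: bucket 0x7a3 was mapped (vaddr_base 0x2a2d9000) and the accesses around guest address 0x7a3c1ec0 went through, yet page 0x7a395 inside the same 1MB bucket had never been mapped and its bit in entry->valid_mapping was clear. Because the size-0 test_bits() call tested nothing, xen_map_cache() still handed back 0x2a2d9000 + 0x95000, and the memcpy() on that pointer took the SIGBUS. With per-page bits that can be clear inside an otherwise cached bucket, matching last_address_index alone does not prove the page is actually there.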

-weidong
 
> 
> 
> > @@ -225,11 +242,10 @@ tryagain:
> >          __size = MCACHE_BUCKET_SIZE;
> >      }
> >
> > -    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> > -
> 
> just leave entry where it is
> 
> 
> >      while (entry && entry->lock && entry->vaddr_base &&
> >              (entry->paddr_index != address_index || entry->size !=
> __size ||
> > -             !test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> > +             !test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                 __test_bit_size >> XC_PAGE_SHIFT,
> >                   entry->valid_mapping))) {
> >          pentry = entry;
> >          entry = entry->next;
> > @@ -241,13 +257,15 @@ tryagain:
> >      } else if (!entry->lock) {
> >          if (!entry->vaddr_base || entry->paddr_index !=
> address_index ||
> >                  entry->size != __size ||
> > -                !test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> > +                !test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                    __test_bit_size >> XC_PAGE_SHIFT,
> >                      entry->valid_mapping)) {
> >              xen_remap_bucket(entry, __size, address_index);
> >          }
> >      }
> >
> > -    if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> > +    if(!test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                __test_bit_size >> XC_PAGE_SHIFT,
> >                  entry->valid_mapping)) {
> >          mapcache->last_address_index = -1;
> >          if (!translated && mapcache->phys_offset_to_gaddr) {
> 
> This is fine

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] frequently ballooning results in qemu exit
  2013-04-01 14:39                     ` Stefano Stabellini
  2013-04-02  1:06                       ` Hanweidong
@ 2013-04-02  1:06                       ` Hanweidong
  1 sibling, 0 replies; 25+ messages in thread
From: Hanweidong @ 2013-04-02  1:06 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, Tim (Xen.org),
	qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo



> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> Sent: 2013年4月1日 22:39
> To: Hanweidong
> Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Gonglei
> (Arei); Anthony Perard; Wangzhenguo
> Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> qemu exit
> 
> On Sat, 30 Mar 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > Sent: 2013年3月29日 20:37
> > > To: Hanweidong
> > > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun; qemu-
> > > devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei); Anthony
> > > Perard; Wangzhenguo
> > > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results
> in
> > > qemu exit
> > >
> > > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > > We fixed this issue by below patch which computed the correct
> size
> > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will
> call
> > > xen_map_cache() with size is 0 if the requested address is in the
> RAM.
> > > Then xen_map_cache() will pass the size 0 to test_bits() for
> checking
> > > if the corresponding pfn was mapped in cache. But test_bits() will
> > > always return 1 when size is 0 without any bit testing. Actually,
> for
> > > this case, test_bits should check one bit. So this patch introduced
> a
> > > __test_bit_size which is greater than 0 and a multiple of
> XC_PAGE_SIZE,
> > > then test_bits can work correctly with __test_bit_size >>
> XC_PAGE_SHIFT
> > > as its size.
> > > >
> > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > >
> > > Thanks for the patch and for debugging this difficult problem.
> > > The reality is that size is never actually 0: when qemu_get_ram_ptr
> > > calls xen_map_cache with size 0, it actually means "map until the
> end
> > > of
> > > the page". As a consequence test_bits should always test at least 1
> bit,
> > > like you wrote.
> >
> > Yes, for the case of size is 0, we can just simply set
> __test_bit_size 1. But for size > 0, I think set __test_bit_size to
> size >> XC_PAGE_SHIFT is incorrect. If size is not multiple of
> XC_PAGE_SIZE, then the part of (size % XC_PAGE_SIZE) also needs to test
> 1 bit. For example size is XC_PAGE_SIZE + 1, then it needs to test 2
> bits, but size >> XC_PAGE_SHIFT is only 1.
> >
> 
> I was assuming that the size is always page aligned.
> Looking through the code actually I think that it's better not to rely
> on this assumption.
> 
> Looking back at your original patch:
> 
> 
> 
> > We fixed this issue by below patch which computed the correct size
> for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call
> xen_map_cache() with size is 0 if the requested address is in the RAM.
> Then xen_map_cache() will pass the size 0 to test_bits() for checking
> if the corresponding pfn was mapped in cache. But test_bits() will
> always return 1 when size is 0 without any bit testing. Actually, for
> this case, test_bits should check one bit. So this patch introduced a
> __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE,
> then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT
> as its size.
> >
> > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> >
> > diff --git a/xen-mapcache.c b/xen-mapcache.c
> > index 31c06dc..bd4a97f 100644
> > --- a/xen-mapcache.c
> > +++ b/xen-mapcache.c
> > @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr
> size,
> >      hwaddr address_index;
> >      hwaddr address_offset;
> >      hwaddr __size = size;
> > +    hwaddr __test_bit_size = size;
> >      bool translated = false;
> >
> >  tryagain:
> > @@ -210,7 +211,23 @@ tryagain:
> >
> >      trace_xen_map_cache(phys_addr);
> >
> > -    if (address_index == mapcache->last_address_index && !lock
> && !__size) {
> > +    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> 
> there is no need to move this line up here, see below
> 
> 
> > +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> > +    if (size) {
> > +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
> > +
> > +        if (__test_bit_size % XC_PAGE_SIZE) {
> > +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size %
> XC_PAGE_SIZE);
> > +        }
> > +    } else {
> > +        __test_bit_size = XC_PAGE_SIZE;
> > +    }
> 
> this is OK
> 
> 
> > +    if (address_index == mapcache->last_address_index && !lock
> && !__size &&
> > +        test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                  __test_bit_size >> XC_PAGE_SHIFT,
> > +                  entry->valid_mapping)) {
> >          trace_xen_map_cache_return(mapcache->last_address_vaddr +
> address_offset);
> >          return mapcache->last_address_vaddr + address_offset;
> >      }
> 
> Unless I am missing something this change is unnecessary: if the
> mapping
> is not valid then mapcache->last_address_index is set to -1.

mapcache->last_address_index means the corresponding bucket (1MB) was mapped, but we noticed that some pages of the bucket may not be mapped. So we need to check whether the page is mapped even when address_index is equal to last_address_index.
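
To illustrate with a stand-alone sketch (the test_bits() below is a simplified stand-in for the one in xen-mapcache.c, and the hole at page 5 is only an example of a page the balloon driver has returned to Xen):

#include <stdio.h>
#include <string.h>

#define PAGES_PER_BUCKET 256      /* 1MB bucket / 4KB pages */

static unsigned char valid_mapping[PAGES_PER_BUCKET];   /* 1 = page mapped */

/* simplified stand-in: are all pages in [start, start + nr) mapped? */
static int test_bits(int start, int nr)
{
    for (int i = start; i < start + nr; i++) {
        if (!valid_mapping[i]) {
            return 0;
        }
    }
    return 1;
}

int main(void)
{
    memset(valid_mapping, 1, sizeof(valid_mapping));
    valid_mapping[5] = 0;                      /* ballooned-out page */

    printf("pages 0-4: %d\n", test_bits(0, 5));    /* 1: safe to reuse */
    printf("pages 4-6: %d\n", test_bits(4, 3));    /* 0: must not reuse */
    return 0;
}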

-weidong
 
> 
> 
> > @@ -225,11 +242,10 @@ tryagain:
> >          __size = MCACHE_BUCKET_SIZE;
> >      }
> >
> > -    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> > -
> 
> just leave entry where it is
> 
> 
> >      while (entry && entry->lock && entry->vaddr_base &&
> >              (entry->paddr_index != address_index || entry->size !=
> __size ||
> > -             !test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> > +             !test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                 __test_bit_size >> XC_PAGE_SHIFT,
> >                   entry->valid_mapping))) {
> >          pentry = entry;
> >          entry = entry->next;
> > @@ -241,13 +257,15 @@ tryagain:
> >      } else if (!entry->lock) {
> >          if (!entry->vaddr_base || entry->paddr_index !=
> address_index ||
> >                  entry->size != __size ||
> > -                !test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> > +                !test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                    __test_bit_size >> XC_PAGE_SHIFT,
> >                      entry->valid_mapping)) {
> >              xen_remap_bucket(entry, __size, address_index);
> >          }
> >      }
> >
> > -    if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >>
> XC_PAGE_SHIFT,
> > +    if(!test_bits(address_offset >> XC_PAGE_SHIFT,
> > +                __test_bit_size >> XC_PAGE_SHIFT,
> >                  entry->valid_mapping)) {
> >          mapcache->last_address_index = -1;
> >          if (!translated && mapcache->phys_offset_to_gaddr) {
> 
> This is fine
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-04-02  1:06                       ` Hanweidong
  2013-04-02 13:27                         ` [Qemu-devel] " Stefano Stabellini
@ 2013-04-02 13:27                         ` Stefano Stabellini
  2013-04-03  8:15                           ` Hanweidong
  2013-04-03  8:15                           ` Hanweidong
  1 sibling, 2 replies; 25+ messages in thread
From: Stefano Stabellini @ 2013-04-02 13:27 UTC (permalink / raw)
  To: Hanweidong
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Yanqiangjun,
	Tim (Xen.org), qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo

[-- Attachment #1: Type: text/plain, Size: 6452 bytes --]

On Tue, 2 Apr 2013, Hanweidong wrote:
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > Sent: 2013-04-01 22:39
> > To: Hanweidong
> > Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> > Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Gonglei
> > (Arei); Anthony Perard; Wangzhenguo
> > Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> > qemu exit
> > 
> > On Sat, 30 Mar 2013, Hanweidong wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > > Sent: 2013-03-29 20:37
> > > > To: Hanweidong
> > > > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun; qemu-
> > > > devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei); Anthony
> > > > Perard; Wangzhenguo
> > > > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning results
> > in
> > > > qemu exit
> > > >
> > > > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > > > We fixed this issue by below patch which computed the correct
> > size
> > > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will
> > call
> > > > xen_map_cache() with size is 0 if the requested address is in the
> > RAM.
> > > > Then xen_map_cache() will pass the size 0 to test_bits() for
> > checking
> > > > if the corresponding pfn was mapped in cache. But test_bits() will
> > > > always return 1 when size is 0 without any bit testing. Actually,
> > for
> > > > this case, test_bits should check one bit. So this patch introduced
> > a
> > > > __test_bit_size which is greater than 0 and a multiple of
> > XC_PAGE_SIZE,
> > > > then test_bits can work correctly with __test_bit_size >>
> > XC_PAGE_SHIFT
> > > > as its size.
> > > > >
> > > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > > >
> > > > Thanks for the patch and for debugging this difficult problem.
> > > > The reality is that size is never actually 0: when qemu_get_ram_ptr
> > > > calls xen_map_cache with size 0, it actually means "map until the
> > end
> > > > of
> > > > the page". As a consequence test_bits should always test at least 1
> > bit,
> > > > like you wrote.
> > >
> > > Yes, for the case of size is 0, we can just simply set
> > __test_bit_size 1. But for size > 0, I think set __test_bit_size to
> > size >> XC_PAGE_SHIFT is incorrect. If size is not multiple of
> > XC_PAGE_SIZE, then the part of (size % XC_PAGE_SIZE) also needs to test
> > 1 bit. For example size is XC_PAGE_SIZE + 1, then it needs to test 2
> > bits, but size >> XC_PAGE_SHIFT is only 1.
> > >
> > 
> > I was assuming that the size is always page aligned.
> > Looking through the code actually I think that it's better not to rely
> > on this assumption.
> > 
> > Looking back at your original patch:
> > 
> > 
> > 
> > > We fixed this issue by below patch which computed the correct size
> > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will call
> > xen_map_cache() with size is 0 if the requested address is in the RAM.
> > Then xen_map_cache() will pass the size 0 to test_bits() for checking
> > if the corresponding pfn was mapped in cache. But test_bits() will
> > always return 1 when size is 0 without any bit testing. Actually, for
> > this case, test_bits should check one bit. So this patch introduced a
> > __test_bit_size which is greater than 0 and a multiple of XC_PAGE_SIZE,
> > then test_bits can work correctly with __test_bit_size >> XC_PAGE_SHIFT
> > as its size.
> > >
> > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > >
> > > diff --git a/xen-mapcache.c b/xen-mapcache.c
> > > index 31c06dc..bd4a97f 100644
> > > --- a/xen-mapcache.c
> > > +++ b/xen-mapcache.c
> > > @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr
> > size,
> > >      hwaddr address_index;
> > >      hwaddr address_offset;
> > >      hwaddr __size = size;
> > > +    hwaddr __test_bit_size = size;
> > >      bool translated = false;
> > >
> > >  tryagain:
> > > @@ -210,7 +211,23 @@ tryagain:
> > >
> > >      trace_xen_map_cache(phys_addr);
> > >
> > > -    if (address_index == mapcache->last_address_index && !lock
> > && !__size) {
> > > +    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
> > 
> > there is no need to move this line up here, see below
> > 
> > 
> > > +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> > > +    if (size) {
> > > +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE - 1));
> > > +
> > > +        if (__test_bit_size % XC_PAGE_SIZE) {
> > > +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size %
> > XC_PAGE_SIZE);
> > > +        }
> > > +    } else {
> > > +        __test_bit_size = XC_PAGE_SIZE;
> > > +    }
> > 
> > this is OK
> > 
> > 
> > > +    if (address_index == mapcache->last_address_index && !lock
> > && !__size &&
> > > +        test_bits(address_offset >> XC_PAGE_SHIFT,
> > > +                  __test_bit_size >> XC_PAGE_SHIFT,
> > > +                  entry->valid_mapping)) {
> > >          trace_xen_map_cache_return(mapcache->last_address_vaddr +
> > address_offset);
> > >          return mapcache->last_address_vaddr + address_offset;
> > >      }
> > 
> > Unless I am missing something this change is unnecessary: if the
> > mapping
> > is not valid then mapcache->last_address_index is set to -1.
> 
> mapcache->last_address_index means the corresponding bucket (1MB) was mapped, but we noticed that some pages of the bucket may not be mapped. So we need to check whether the page is mapped even when address_index is equal to last_address_index.

That is a good point, but the current fix doesn't fully address that
problem: the first entry found in the cache might not be the one
corresponding to last_address_index.

I think that the right fix here would be to replace last_address_index
and last_address_vaddr with a last_entry pointer.
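
Roughly like this (only a sketch of the shape of the change, not the code in the series below; last_entry is the proposed new field, the other fields are the ones already visible in the patch above):

    /* sketch only: keep a pointer to the whole entry instead */
    if (mapcache->last_entry != NULL &&
        mapcache->last_entry->paddr_index == address_index &&
        !lock && !__size &&
        test_bits(address_offset >> XC_PAGE_SHIFT,
                  __test_bit_size >> XC_PAGE_SHIFT,
                  mapcache->last_entry->valid_mapping)) {
        return mapcache->last_entry->vaddr_base + address_offset;
    }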

I have sent a small patch series that includes your patch, can you
please let me know if it does solve your problem and if you think that
is correct?

The patch series is here: 

http://marc.info/?l=qemu-devel&m=136490915902679
http://marc.info/?l=qemu-devel&m=136490915602678

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-04-02 13:27                         ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
@ 2013-04-03  8:15                           ` Hanweidong
  2013-04-03 10:36                               ` [Qemu-devel] " Stefano Stabellini
  2013-04-03  8:15                           ` Hanweidong
  1 sibling, 1 reply; 25+ messages in thread
From: Hanweidong @ 2013-04-03  8:15 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Andrew Cooper, Yanqiangjun, Tim (Xen.org),
	qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo

> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> Sent: 2013-04-02 21:28
> To: Hanweidong
> Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Gonglei
> (Arei); Anthony Perard; Wangzhenguo
> Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> qemu exit
> 
> On Tue, 2 Apr 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > Sent: 2013-04-01 22:39
> > > To: Hanweidong
> > > Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> > > Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org;
> Gonglei
> > > (Arei); Anthony Perard; Wangzhenguo
> > > Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results
> in
> > > qemu exit
> > >
> > > On Sat, 30 Mar 2013, Hanweidong wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini
> [mailto:stefano.stabellini@eu.citrix.com]
> > > > > Sent: 2013-03-29 20:37
> > > > > To: Hanweidong
> > > > > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun;
> qemu-
> > > > > devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei);
> Anthony
> > > > > Perard; Wangzhenguo
> > > > > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning
> results
> > > in
> > > > > qemu exit
> > > > >
> > > > > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > > > > We fixed this issue by below patch which computed the correct
> > > size
> > > > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr()
> will
> > > call
> > > > > xen_map_cache() with size is 0 if the requested address is in
> the
> > > RAM.
> > > > > Then xen_map_cache() will pass the size 0 to test_bits() for
> > > checking
> > > > > if the corresponding pfn was mapped in cache. But test_bits()
> will
> > > > > always return 1 when size is 0 without any bit testing.
> Actually,
> > > for
> > > > > this case, test_bits should check one bit. So this patch
> introduced
> > > a
> > > > > __test_bit_size which is greater than 0 and a multiple of
> > > XC_PAGE_SIZE,
> > > > > then test_bits can work correctly with __test_bit_size >>
> > > XC_PAGE_SHIFT
> > > > > as its size.
> > > > > >
> > > > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > > > >
> > > > > Thanks for the patch and for debugging this difficult problem.
> > > > > The reality is that size is never actually 0: when
> qemu_get_ram_ptr
> > > > > calls xen_map_cache with size 0, it actually means "map until
> the
> > > end
> > > > > of
> > > > > the page". As a consequence test_bits should always test at
> least 1
> > > bit,
> > > > > like you wrote.
> > > >
> > > > Yes, for the case of size is 0, we can just simply set
> > > __test_bit_size 1. But for size > 0, I think set __test_bit_size to
> > > size >> XC_PAGE_SHIFT is incorrect. If size is not multiple of
> > > XC_PAGE_SIZE, then the part of (size % XC_PAGE_SIZE) also needs to
> test
> > > 1 bit. For example size is XC_PAGE_SIZE + 1, then it needs to test
> 2
> > > bits, but size >> XC_PAGE_SHIFT is only 1.
> > > >
> > >
> > > I was assuming that the size is always page aligned.
> > > Looking through the code actually I think that it's better not to
> rely
> > > on this assumption.
> > >
> > > Looking back at your original patch:
> > >
> > >
> > >
> > > > We fixed this issue by below patch which computed the correct
> size
> > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will
> call
> > > xen_map_cache() with size is 0 if the requested address is in the
> RAM.
> > > Then xen_map_cache() will pass the size 0 to test_bits() for
> checking
> > > if the corresponding pfn was mapped in cache. But test_bits() will
> > > always return 1 when size is 0 without any bit testing. Actually,
> for
> > > this case, test_bits should check one bit. So this patch introduced
> a
> > > __test_bit_size which is greater than 0 and a multiple of
> XC_PAGE_SIZE,
> > > then test_bits can work correctly with __test_bit_size >>
> XC_PAGE_SHIFT
> > > as its size.
> > > >
> > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > > >
> > > > diff --git a/xen-mapcache.c b/xen-mapcache.c
> > > > index 31c06dc..bd4a97f 100644
> > > > --- a/xen-mapcache.c
> > > > +++ b/xen-mapcache.c
> > > > @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr,
> hwaddr
> > > size,
> > > >      hwaddr address_index;
> > > >      hwaddr address_offset;
> > > >      hwaddr __size = size;
> > > > +    hwaddr __test_bit_size = size;
> > > >      bool translated = false;
> > > >
> > > >  tryagain:
> > > > @@ -210,7 +211,23 @@ tryagain:
> > > >
> > > >      trace_xen_map_cache(phys_addr);
> > > >
> > > > -    if (address_index == mapcache->last_address_index && !lock
> > > && !__size) {
> > > > +    entry = &mapcache->entry[address_index % mapcache-
> >nr_buckets];
> > >
> > > there is no need to move this line up here, see below
> > >
> > >
> > > > +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> > > > +    if (size) {
> > > > +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE -
> 1));
> > > > +
> > > > +        if (__test_bit_size % XC_PAGE_SIZE) {
> > > > +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size %
> > > XC_PAGE_SIZE);
> > > > +        }
> > > > +    } else {
> > > > +        __test_bit_size = XC_PAGE_SIZE;
> > > > +    }
> > >
> > > this is OK
> > >
> > >
> > > > +    if (address_index == mapcache->last_address_index && !lock
> > > && !__size &&
> > > > +        test_bits(address_offset >> XC_PAGE_SHIFT,
> > > > +                  __test_bit_size >> XC_PAGE_SHIFT,
> > > > +                  entry->valid_mapping)) {
> > > >          trace_xen_map_cache_return(mapcache->last_address_vaddr
> +
> > > address_offset);
> > > >          return mapcache->last_address_vaddr + address_offset;
> > > >      }
> > >
> > > Unless I am missing something this change is unnecessary: if the
> > > mapping
> > > is not valid then mapcache->last_address_index is set to -1.
> >
> > mapcache->last_address_index means the corresponding bucket (1MB) was
> mapped, but we noticed that some pages of the bucket may not be mapped.
> So we need to check whether the page is mapped even when address_index is
> equal to last_address_index.
> 
> That is a good point, but the current fix doesn't fully address that
> problem: the first entry found in the cache might not be the one
> corresponding to last_address_index.
> 
> I think that the right fix here would be to replace last_address_index
> and last_address_vaddr with a last_entry pointer.
> 
> I have sent a small patch series that includes your patch, can you
> please let me know if it does solve your problem and if you think that
> is correct?
> 

The patches look good to me. We verified that they solve our problem.

--weidong

> The patch series is here:
> 
> http://marc.info/?l=qemu-devel&m=136490915902679
> http://marc.info/?l=qemu-devel&m=136490915602678



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Xen-devel] frequently ballooning results in qemu exit
  2013-04-03  8:15                           ` Hanweidong
@ 2013-04-03 10:36                               ` Stefano Stabellini
  0 siblings, 0 replies; 25+ messages in thread
From: Stefano Stabellini @ 2013-04-03 10:36 UTC (permalink / raw)
  To: Hanweidong
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Yanqiangjun,
	Tim (Xen.org), qemu-devel, xen-devel, Gonglei (Arei),
	Anthony Perard, Wangzhenguo

[-- Attachment #1: Type: text/plain, Size: 7689 bytes --]

On Wed, 3 Apr 2013, Hanweidong wrote:
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > Sent: 2013-04-02 21:28
> > To: Hanweidong
> > Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> > Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Gonglei
> > (Arei); Anthony Perard; Wangzhenguo
> > Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results in
> > qemu exit
> > 
> > On Tue, 2 Apr 2013, Hanweidong wrote:
> > > > -----Original Message-----
> > > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > > Sent: 2013-04-01 22:39
> > > > To: Hanweidong
> > > > Cc: Stefano Stabellini; Tim (Xen.org); George Dunlap; Andrew Cooper;
> > > > Yanqiangjun; qemu-devel@nongnu.org; xen-devel@lists.xen.org;
> > Gonglei
> > > > (Arei); Anthony Perard; Wangzhenguo
> > > > Subject: RE: [Qemu-devel] [Xen-devel] frequently ballooning results
> > in
> > > > qemu exit
> > > >
> > > > On Sat, 30 Mar 2013, Hanweidong wrote:
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini
> > [mailto:stefano.stabellini@eu.citrix.com]
> > > > > > Sent: 2013-03-29 20:37
> > > > > > To: Hanweidong
> > > > > > Cc: Tim (Xen.org); George Dunlap; Andrew Cooper; Yanqiangjun;
> > qemu-
> > > > > > devel@nongnu.org; xen-devel@lists.xen.org; Gonglei (Arei);
> > Anthony
> > > > > > Perard; Wangzhenguo
> > > > > > Subject: Re: [Qemu-devel] [Xen-devel] frequently ballooning
> > results
> > > > in
> > > > > > qemu exit
> > > > > >
> > > > > > On Mon, 25 Mar 2013, Hanweidong wrote:
> > > > > > > We fixed this issue by below patch which computed the correct
> > > > size
> > > > > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr()
> > will
> > > > call
> > > > > > xen_map_cache() with size is 0 if the requested address is in
> > the
> > > > RAM.
> > > > > > Then xen_map_cache() will pass the size 0 to test_bits() for
> > > > checking
> > > > > > if the corresponding pfn was mapped in cache. But test_bits()
> > will
> > > > > > always return 1 when size is 0 without any bit testing.
> > Actually,
> > > > for
> > > > > > this case, test_bits should check one bit. So this patch
> > introduced
> > > > a
> > > > > > __test_bit_size which is greater than 0 and a multiple of
> > > > XC_PAGE_SIZE,
> > > > > > then test_bits can work correctly with __test_bit_size >>
> > > > XC_PAGE_SHIFT
> > > > > > as its size.
> > > > > > >
> > > > > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > > > > >
> > > > > > Thanks for the patch and for debugging this difficult problem.
> > > > > > The reality is that size is never actually 0: when
> > qemu_get_ram_ptr
> > > > > > calls xen_map_cache with size 0, it actually means "map until
> > the
> > > > end
> > > > > > of
> > > > > > the page". As a consequence test_bits should always test at
> > least 1
> > > > bit,
> > > > > > like you wrote.
> > > > >
> > > > > Yes, for the case of size is 0, we can just simply set
> > > > __test_bit_size 1. But for size > 0, I think set __test_bit_size to
> > > > size >> XC_PAGE_SHIFT is incorrect. If size is not multiple of
> > > > XC_PAGE_SIZE, then the part of (size % XC_PAGE_SIZE) also needs to
> > test
> > > > 1 bit. For example size is XC_PAGE_SIZE + 1, then it needs to test
> > 2
> > > > bits, but size >> XC_PAGE_SHIFT is only 1.
> > > > >
> > > >
> > > > I was assuming that the size is always page aligned.
> > > > Looking through the code actually I think that it's better not to
> > rely
> > > > on this assumption.
> > > >
> > > > Looking back at your original patch:
> > > >
> > > >
> > > >
> > > > > We fixed this issue by below patch which computed the correct
> > size
> > > > for test_bits(). qemu_get_ram_ptr() and qemu_safe_ram_ptr() will
> > call
> > > > xen_map_cache() with size is 0 if the requested address is in the
> > RAM.
> > > > Then xen_map_cache() will pass the size 0 to test_bits() for
> > checking
> > > > if the corresponding pfn was mapped in cache. But test_bits() will
> > > > always return 1 when size is 0 without any bit testing. Actually,
> > for
> > > > this case, test_bits should check one bit. So this patch introduced
> > a
> > > > __test_bit_size which is greater than 0 and a multiple of
> > XC_PAGE_SIZE,
> > > > then test_bits can work correctly with __test_bit_size >>
> > XC_PAGE_SHIFT
> > > > as its size.
> > > > >
> > > > > Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
> > > > > Signed-off-by: Weidong Han <hanweidong@huawei.com>
> > > > >
> > > > > diff --git a/xen-mapcache.c b/xen-mapcache.c
> > > > > index 31c06dc..bd4a97f 100644
> > > > > --- a/xen-mapcache.c
> > > > > +++ b/xen-mapcache.c
> > > > > @@ -202,6 +202,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr,
> > hwaddr
> > > > size,
> > > > >      hwaddr address_index;
> > > > >      hwaddr address_offset;
> > > > >      hwaddr __size = size;
> > > > > +    hwaddr __test_bit_size = size;
> > > > >      bool translated = false;
> > > > >
> > > > >  tryagain:
> > > > > @@ -210,7 +211,23 @@ tryagain:
> > > > >
> > > > >      trace_xen_map_cache(phys_addr);
> > > > >
> > > > > -    if (address_index == mapcache->last_address_index && !lock
> > > > && !__size) {
> > > > > +    entry = &mapcache->entry[address_index % mapcache-
> > >nr_buckets];
> > > >
> > > > there is no need to move this line up here, see below
> > > >
> > > >
> > > > > +    /* __test_bit_size is always a multiple of XC_PAGE_SIZE */
> > > > > +    if (size) {
> > > > > +        __test_bit_size = size + (phys_addr & (XC_PAGE_SIZE -
> > 1));
> > > > > +
> > > > > +        if (__test_bit_size % XC_PAGE_SIZE) {
> > > > > +            __test_bit_size += XC_PAGE_SIZE - (__test_bit_size %
> > > > XC_PAGE_SIZE);
> > > > > +        }
> > > > > +    } else {
> > > > > +        __test_bit_size = XC_PAGE_SIZE;
> > > > > +    }
> > > >
> > > > this is OK
> > > >
> > > >
> > > > > +    if (address_index == mapcache->last_address_index && !lock
> > > > && !__size &&
> > > > > +        test_bits(address_offset >> XC_PAGE_SHIFT,
> > > > > +                  __test_bit_size >> XC_PAGE_SHIFT,
> > > > > +                  entry->valid_mapping)) {
> > > > >          trace_xen_map_cache_return(mapcache->last_address_vaddr
> > +
> > > > address_offset);
> > > > >          return mapcache->last_address_vaddr + address_offset;
> > > > >      }
> > > >
> > > > Unless I am missing something this change is unnecessary: if the
> > > > mapping
> > > > is not valid then mapcache->last_address_index is set to -1.
> > >
> > > mapcache->last_address_index means the corresponding bucket (1MB) was
> > mapped, but we noticed that some pages of the bucket may not be mapped.
> > So we need to check whether the page is mapped even when address_index is
> > equal to last_address_index.
> > 
> > That is a good point, but the current fix doesn't fully address that
> > problem: the first entry found in the cache might not be the one
> > corresponding to last_address_index.
> > 
> > I think that the right fix here would be to replace last_address_index
> > and last_address_vaddr with a last_entry pointer.
> > 
> > I have sent a small patch series that includes your patch, can you
> > please let me know if it does solve your problem and if you think that
> > is correct?
> > 
> 
> The patches look good to me. We verified that they solve our problem. 

Great, thanks!
I'll submit a pull request.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-04-03 10:36 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-13 13:50 frequently ballooning results in qemu exit Hanweidong
2013-03-14 10:17 ` George Dunlap
2013-03-14 10:38   ` Anthony PERARD
2013-03-14 14:10     ` Hanweidong
2013-03-14 14:34       ` Tim Deegan
2013-03-15  5:54         ` Hanweidong
2013-03-21 12:15           ` Tim Deegan
2013-03-21 13:33             ` Hanweidong
2013-03-25 12:40               ` [Qemu-devel] [Xen-devel] " Hanweidong
2013-03-29 12:37                 ` [Qemu-devel] " Stefano Stabellini
2013-03-29 12:37                 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2013-03-30 15:04                   ` [Qemu-devel] " Hanweidong
2013-03-30 15:04                   ` [Qemu-devel] [Xen-devel] " Hanweidong
2013-04-01 14:39                     ` Stefano Stabellini
2013-04-02  1:06                       ` Hanweidong
2013-04-02 13:27                         ` [Qemu-devel] " Stefano Stabellini
2013-04-02 13:27                         ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2013-04-03  8:15                           ` Hanweidong
2013-04-03 10:36                             ` Stefano Stabellini
2013-04-03 10:36                               ` [Qemu-devel] " Stefano Stabellini
2013-04-03  8:15                           ` Hanweidong
2013-04-02  1:06                       ` Hanweidong
2013-04-01 14:39                     ` Stefano Stabellini
2013-03-25 12:40               ` Hanweidong
2013-03-14 10:48   ` Hanweidong
