* How to diagnose memory leak in kvm-qemu-0.14.0?
@ 2011-05-18 16:44 Steve Kemp
  2011-05-19  8:40 ` Stefan Hajnoczi
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Kemp @ 2011-05-18 16:44 UTC (permalink / raw)
  To: kvm


  I'm running the most recent release of KVM, version 0.14.0
 on a host kernel 2.6.32.15, and seem to be able to trigger
 a leak of memory pretty easily.

  Inside a guest the following one-liner will cause the KVM
 process on the host to gradually increase its memory
 consumption:

    while true; do
      wget http://mirror.bytemark.co.uk/misc/test-files/500M; cp 500M new; rm 500M new; sleep 10 ;
    done

  The guest is launched using loopback files:

  /opt/kvm/bin/qemu-system-x86_64 -m 500 \
    -drive file=/machines/kvm6/jail/root_fs,if=virtio,cache=off \
    -drive file=/machines/kvm6/jail/swap_file,if=virtio,boot=off \
    -net nic,macaddr=fe:ff:00:00:5a:cb,model=virtio \
    -net tap,ifname=tap_kvm6,script=no \
    -pidfile /var/kvm/kvm6.pid -name kvm6 \
    -serial /dev/tty -no-reboot \
    -monitor unix:/var/kvm/kvm6.mon,server,nowait \
    -kernel /machines/kvm6/jail/linux \
    -append "root=/dev/vda clocksource=acpi_pm notsc console=tty0 console=ttyS0,115200n8" \
    -nographic -boot c

  So we're using an external kernel, and virtio for both block
 and NIC.  Using e1000, rtl8139, or virtio for the NIC results in the
 same leak.

  top shows this:


    23074 kvm6      20   0  656m 515m 1060 S  0.0  0.8  13:10.79 qemu-system-x86

  So 656 MB virt, 515 MB resident, and over time the virtual memory
 increases significantly.
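
  To put numbers on the growth rather than reading top by eye, one
 option is to sample VmSize and VmRSS for the process from /proc on
 the host.  A minimal Python sketch, assuming a Linux host; the
 function names and the 10-second interval are illustrative choices
 and not part of QEMU or KVM:

```python
import re
import sys
import time


def parse_status(text):
    """Return (VmSize, VmRSS) in kB from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields.get("VmSize"), fields.get("VmRSS")


def sample(pid):
    """Read the current (VmSize, VmRSS) of a running process."""
    with open("/proc/%d/status" % pid) as f:
        return parse_status(f.read())


def main(pid, interval=10.0):
    """Print one sample per interval, with VmSize growth since the start."""
    first = None
    while True:
        vsz, rss = sample(pid)
        if first is None:
            first = vsz
        print("%s VmSize=%d kB VmRSS=%d kB growth=%+d kB"
              % (time.strftime("%H:%M:%S"), vsz, rss, vsz - first))
        sys.stdout.flush()
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(int(sys.argv[1]))
```

  Saved under a name of your choosing and run on the host with the
 PID from /var/kvm/kvm6.pid as its only argument, this gives a log
 that can be compared between the leaking and non-leaking setups.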

  Should I be blaming KVM for this leak?  Or is it possible it is either
 the host or the guest kernels?  Any assistance in tracking down the
 leak is most welcome - even vague hints.

  If helpful I'd be happy to share logins to either a host or a leaky
 guest, or both.


Steve Kemp
--
                                          Bytemark Hosting
                                http://www.bytemark.co.uk/
                                  phone UK: 0845 004 3 004
          Dedicated Linux hosts from 15ukp ($30) per month


* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-18 16:44 How to diagnose memory leak in kvm-qemu-0.14.0? Steve Kemp
@ 2011-05-19  8:40 ` Stefan Hajnoczi
  2011-05-19  8:50   ` Stefan Hajnoczi
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2011-05-19  8:40 UTC (permalink / raw)
  To: Steve Kemp; +Cc: kvm

On Wed, May 18, 2011 at 5:44 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
>
>  I'm running the most recent release of KVM, version 0.14.0
>  on a host kernel 2.6.32.15, and seem to be able to trigger
>  a leak of memory pretty easily.
>
>  Inside a guest the following one-liner will cause the KVM
>  process on the host to gradually increase its memory
>  consumption:
>
>    while true; do
>      wget http://mirror.bytemark.co.uk/misc/test-files/500M; cp 500M new; rm 500M new; sleep 10 ;
>    done

You are exercising both networking and storage.  Have you cut the test
down to just wget vs cp/rm?  Also why the sleep 10?

If you are building qemu-kvm from source you might like to enable
tracing to track memory allocations in qemu-kvm.  For full information
see qemu-kvm/docs/tracing.txt.  There are several trace events of
interest:
$ cd qemu-kvm
$ $EDITOR trace-events
# qemu-malloc.c
disable qemu_malloc(size_t size, void *ptr) "size %zu ptr %p"
disable qemu_realloc(void *ptr, size_t size, void *newptr) "ptr %p size %zu newptr %p"
disable qemu_free(void *ptr) "ptr %p"

# osdep.c
disable qemu_memalign(size_t alignment, size_t size, void *ptr) "alignment %zu size %zu ptr %p"
disable qemu_vmalloc(size_t size, void *ptr) "size %zu ptr %p"
disable qemu_vfree(void *ptr) "ptr %p"
^--- remove the "disable" property from these memory allocation events
$ ./configure --enable-trace-backend=simple [...]
$ make
$ # run the VM, reproduce the leak, shut the VM down
$ scripts/simpletrace.py trace-events trace-<pid>  # where <pid> was the process ID

It is fairly easy to write a script that correlates mallocs and frees,
printing out memory allocations that were never freed at the end.
There is a Python API for processing trace files; here is an
explanation of how to use it:
http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html
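
A correlation script along those lines might look like the sketch
below.  It does not use the simpletrace Python API; instead it assumes
the pretty-printed text output of scripts/simpletrace.py, one event
per line in the form "name delta key=0x...", with the field names
taken from the trace-events definitions above (size, ptr, newptr).
The function names are illustrative only.

```python
import sys

ALLOC_EVENTS = {"qemu_malloc", "qemu_memalign", "qemu_vmalloc"}
FREE_EVENTS = {"qemu_free", "qemu_vfree"}


def parse_line(line):
    """Split 'event delta key=value ...' into (event, {key: int})."""
    parts = line.split()
    if len(parts) < 2:
        return None, {}
    fields = {}
    for token in parts[2:]:
        key, sep, value = token.partition("=")
        if sep:
            try:
                fields[key] = int(value, 0)  # accepts 0x... and decimal
            except ValueError:
                pass  # ignore anything that is not a number
    return parts[0], fields


def find_leaks(lines):
    """Return {ptr: size} for allocations with no matching free."""
    live = {}
    for line in lines:
        event, f = parse_line(line)
        if event in ALLOC_EVENTS and "ptr" in f:
            live[f["ptr"]] = f.get("size", 0)
        elif event in FREE_EVENTS and "ptr" in f:
            live.pop(f["ptr"], None)
        elif event == "qemu_realloc" and "newptr" in f:
            # realloc retires the old pointer and creates a new one
            live.pop(f.get("ptr"), None)
            if f["newptr"] != 0:
                live[f["newptr"]] = f.get("size", 0)
    return live


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as trace:
        leaks = find_leaks(trace)
    for ptr, size in sorted(leaks.items()):
        print("unfreed ptr=%#x size=%d" % (ptr, size))
    print("%d allocations, %d bytes never freed"
          % (len(leaks), sum(leaks.values())))
```

Redirect the simpletrace.py output to a file and pass that file as the
only argument.  Allocations that are still live when the VM shuts down
will also show up, so the interesting entries are the ones whose count
grows with the length of the test run.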

If you have SystemTap installed you may wish to use the "dtrace"
backend instead of "simple".  You can then use SystemTap scripts on
the probes.  SystemTap is more powerful: it should allow you to
extract call stacks when probes fire, but I'm not experienced with
it.

Feel free to contact me on #qemu (oftc) or #kvm (freenode) IRC if you
want some pointers, my nick is stefanha.

Stefan


* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-19  8:40 ` Stefan Hajnoczi
@ 2011-05-19  8:50   ` Stefan Hajnoczi
  2011-05-19 11:00   ` Steve Kemp
  2011-05-19 11:57   ` Steve Kemp
  2 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2011-05-19  8:50 UTC (permalink / raw)
  To: Steve Kemp; +Cc: kvm

On Thu, May 19, 2011 at 9:40 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Wed, May 18, 2011 at 5:44 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> If you have SystemTap installed you may wish to use the "dtrace"
> backend instead of "simple".  You can then use SystemTap scripts on
> the probes.  SystemTap is more powerful, it should allow you to
> extract call stacks when probes are fired but I'm not experienced with
> it.

Forgot to add that the __builtin_return_address() gcc extension can be
used to collect return addresses even with the simple trace backend:
http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Return-Address.html#index-g_t_005f_005fbuiltin_005freturn_005faddress-2431

I've used it in the past as a poor man's stack trace when tracking
down memory leaks.

Stefan


* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-19  8:40 ` Stefan Hajnoczi
  2011-05-19  8:50   ` Stefan Hajnoczi
@ 2011-05-19 11:00   ` Steve Kemp
  2011-05-19 11:57   ` Steve Kemp
  2 siblings, 0 replies; 11+ messages in thread
From: Steve Kemp @ 2011-05-19 11:00 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm

On Thu May 19, 2011 at 09:40:41 +0100, Stefan Hajnoczi wrote:

> You are exercising both networking and storage.  Have you cut the test
> down to just wget vs cp/rm?

  Both seem to leak; but the cp/rm leaks more.  Which suggests to me
 that we're seeing a leak in the virtio block handling.

> Also why the sleep 10?

  Just to keep the machine responsive!

> If you are building qemu-kvm from source you might like to enable
> tracing to track memory allocations in qemu-kvm.  For full information
> see qemu-kvm/docs/tracing.txt.

  Thanks that was a good read, and your wee recipe was very useful.

  I'm now rebuilding with tracing of virtio stuff to see if anything
 leaps out at me.

> Feel free to contact me on #qemu (oftc) or #kvm (freenode) IRC if you
> want some pointers, my nick is stefanha.

  Thanks a lot.

Steve Kemp

* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-19  8:40 ` Stefan Hajnoczi
  2011-05-19  8:50   ` Stefan Hajnoczi
  2011-05-19 11:00   ` Steve Kemp
@ 2011-05-19 11:57   ` Steve Kemp
  2011-05-20 11:01     ` Stefan Hajnoczi
  2 siblings, 1 reply; 11+ messages in thread
From: Steve Kemp @ 2011-05-19 11:57 UTC (permalink / raw)
  To: kvm

On Thu May 19, 2011 at 09:40:41 +0100, Stefan Hajnoczi wrote:

> You are exercising both networking and storage.  Have you cut the test
> down to just wget vs cp/rm?  Also why the sleep 10?

  I'm 99% certain the leak is coming from the virtio block device
 now.  A simple test is:

 wget http://mirror.bytemark.co.uk/misc/test-files/500M
 while true; do cp 500M foo.img; rm foo.img; sleep 2; done

  "top" shows the virt memory growing to >1 GB in under two minutes.

Steve Kemp

* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-19 11:57   ` Steve Kemp
@ 2011-05-20 11:01     ` Stefan Hajnoczi
  2011-05-20 11:47       ` Steve Kemp
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2011-05-20 11:01 UTC (permalink / raw)
  To: Steve Kemp; +Cc: kvm

On Thu, May 19, 2011 at 12:57 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> On Thu May 19, 2011 at 09:40:41 +0100, Stefan Hajnoczi wrote:
>
>> You are exercising both networking and storage.  Have you cut the test
>> down to just wget vs cp/rm?  Also why the sleep 10?
>
>  I'm 99% certain the leak is coming from the virtio block device
>  now.  A simple test is:
>
>  wget http://mirror.bytemark.co.uk/misc/test-files/500M
>  while true; do cp 500M foo.img; rm foo.img; sleep 2; done
>
>  "top" shows the virt memory growing to >1gb in under two minutes.

Were you able to track down the culprit?

Stefan


* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-20 11:01     ` Stefan Hajnoczi
@ 2011-05-20 11:47       ` Steve Kemp
  2011-05-20 13:16         ` Stefan Hajnoczi
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Kemp @ 2011-05-20 11:47 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm

On Fri May 20, 2011 at 12:01:58 +0100, Stefan Hajnoczi wrote:

> >  wget http://mirror.bytemark.co.uk/misc/test-files/500M
> >  while true; do cp 500M foo.img; rm foo.img; sleep 2; done
> >
> >  "top" shows the virt memory growing to >1gb in under two minutes.
>
> Were you able to track down the culprit?

  Yes, or at least confirm my suspicion.  The virtio block device
 is the source of the leak.

  Host kernel: 2.6.32.15
  Guest Kernel: linux-2.6.32.23

  Leaking case:

  /opt/kvm2/bin/qemu-system-x86_64 -m 500 \
    -drive file=/machines/kvm2/jail/root_fs,if=virtio,cache=off

  Non leaking case:

   /opt/kvm/current/bin/qemu-system-x86_64 -m 500 \
     -drive file=/machines/kvm1/jail/root_fs,cache=off ..

  The leak occurs with both KVM 0.12.5 and 0.14.0.

  I've had a quick read of hw/virtio-blk.c but didn't see anything
 glaringly obvious.  I'll need to trace through the code, drink more
 coffee, or get lucky to narrow it down further.

Steve Kemp

* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-20 11:47       ` Steve Kemp
@ 2011-05-20 13:16         ` Stefan Hajnoczi
  2011-05-20 13:47           ` Steve Kemp
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2011-05-20 13:16 UTC (permalink / raw)
  To: Steve Kemp; +Cc: kvm

On Fri, May 20, 2011 at 12:47 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> On Fri May 20, 2011 at 12:01:58 +0100, Stefan Hajnoczi wrote:
>
>> >  wget http://mirror.bytemark.co.uk/misc/test-files/500M
>> >  while true; do cp 500M foo.img; rm foo.img; sleep 2; done
>> >
>> >  "top" shows the virt memory growing to >1gb in under two minutes.
>>
>> Were you able to track down the culprit?
>
>  Yes, or at least confirm my suspicion.  The virtio block device
>  is the source of the leak.
>
>  Host kernel: 2.6.32.15
>  Guest Kernel: linux-2.6.32.23
>
>  Leaking case:
>
>  /opt/kvm2/bin/qemu-system-x86_64 -m 500 \
>    -drive file=/machines/kvm2/jail/root_fs,if=virtio,cache=off
>
>  Non leaking case:
>
>   /opt/kvm/current/bin/qemu-system-x86_64 -m 500 \
>     -drive file=/machines/kvm1/jail/root_fs,cache=off ..
>
>  The leak occurs with both KVM 0.12.5 and 0.14.0.
>
>  I've had a quick read of hw/virtio-blk.c but didn't see anything
>  glaringly obvious.  I'll need to trace through the code, drink more
>  coffee, or get lucky to narrow it down further.

Enabling the memory allocation trace events and adding the
__builtin_return_address() to them should provide enough information
to catch the caller who is leaking memory.
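
As a rough illustration of what that buys you: suppose the qemu_malloc
event were extended with one extra argument, here called "caller" (a
hypothetical name, not an existing field), filled in from
__builtin_return_address(0).  A short Python sketch could then total
the unfreed bytes per call site from the text output of
scripts/simpletrace.py.  The line format "name delta key=0x..." is an
assumption, and only qemu_malloc and qemu_free are handled to keep the
example small.

```python
import re
from collections import Counter

# key=0x... pairs as they appear in the pretty-printed trace
FIELD = re.compile(r"(\w+)=(0x[0-9a-fA-F]+)")


def leaks_by_caller(lines):
    """Return Counter {caller address: bytes allocated and never freed}."""
    live = {}  # ptr -> (size, caller)
    for line in lines:
        event = line.split(None, 1)[0] if line.strip() else ""
        f = {key: int(value, 16) for key, value in FIELD.findall(line)}
        if event == "qemu_malloc" and "ptr" in f:
            # "caller" is the hypothetical extra trace argument
            live[f["ptr"]] = (f.get("size", 0), f.get("caller", 0))
        elif event == "qemu_free":
            live.pop(f.get("ptr"), None)
    totals = Counter()
    for size, caller in live.values():
        totals[caller] += size
    return totals


def report(totals, limit=10):
    """Print the call sites holding the most unfreed memory."""
    for caller, nbytes in totals.most_common(limit):
        print("caller=%#x unfreed=%d bytes" % (caller, nbytes))
```

The addresses printed by report() can usually be mapped back to a
function and line with addr2line -e against the qemu binary, provided
it was built with debug information.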

Stefan


* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-20 13:16         ` Stefan Hajnoczi
@ 2011-05-20 13:47           ` Steve Kemp
  2011-05-20 14:32             ` Stefan Hajnoczi
  0 siblings, 1 reply; 11+ messages in thread
From: Steve Kemp @ 2011-05-20 13:47 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm

On Fri May 20, 2011 at 14:16:05 +0100, Stefan Hajnoczi wrote:

> >  I've had a quick read of hw/virtio-blk.c but didn't see anything
> >  glaringly obvious.  I'll need to trace through the code, drink more
> >  coffee, or get lucky to narrow it down further.
>
> Enabling the memory allocation trace events and adding the
> __builtin_return_address() to them should provide enough information
> to catch the caller who is leaking memory.

  I'm trying to do that at the moment.  So far the only thing I've
 done is add a trace on virtio_blk_alloc_request - I'm noticing
 a leak there pretty easily.

  I see *two* request structures being allocated all the time: one
 is used and freed, the other is ignored.  That seems pretty
 conclusively wrong to me, but I'm trying to understand how that
 happens:

 virtio_blk_alloc_request 0.000 req=0x91e08f0  -> Allocation 1
 virtio_blk_alloc_request 77.659 req=0x9215650  -> Allocation 2
 virtio_blk_rw_complete 449.469 req=0x91e08f0 ret=0x0 -> First is used.
 virtio_blk_req_complete 1.955 req=0x91e08f0 status=0x0 -> First is freed.

 The second is never seen again.

Steve Kemp
--
                                          Bytemark Hosting
                                http://www.bytemark.co.uk/
                                  phone UK: 0845 004 3 004
          Dedicated Linux hosts from 15ukp ($30) per month

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-20 13:47           ` Steve Kemp
@ 2011-05-20 14:32             ` Stefan Hajnoczi
  2011-05-20 14:52               ` Steve Kemp
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Hajnoczi @ 2011-05-20 14:32 UTC (permalink / raw)
  To: Steve Kemp; +Cc: kvm

On Fri, May 20, 2011 at 2:47 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> On Fri May 20, 2011 at 14:16:05 +0100, Stefan Hajnoczi wrote:
>
>> >  I've had a quick read of hw/virtio-blk.c but didn't see anything
>> >  glaringly obvious.  I'll need to trace through the code, drink more
>> >  coffee, or get lucky to narrow it down further.
>>
>> Enabling the memory allocation trace events and adding the
>> __builtin_return_address() to them should provide enough information
>> to catch the caller who is leaking memory.
>
>  I'm trying to do that at the moment.  So far the only thing I've
>  done is add a trace on virtio_blk_alloc_request - I'm noticing
>  a leak there pretty easily.
>
>  I see *two* request structures be allocated all the time, one
>  is used and freed, the other is ignored.  That seems pretty
>  conclusively wrong to me, but I'm trying to understand how that
>  happens:
>
>  virtio_blk_alloc_request 0.000 req=0x91e08f0  -> Allocation 1
>  virtio_blk_alloc_request 77.659 req=0x9215650  -> Allocation 2

Are you sure this isn't the temporary one that is allocated but freed
immediately once the virtqueue is empty:

static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
{
    VirtIOBlockReq *req = virtio_blk_alloc_request(s);

    if (req != NULL) {
        if (!virtqueue_pop(s->vq, &req->elem)) {
            qemu_free(req);  /* virtqueue empty, we're done */
            return NULL;
        }
    }

    return req;
}

>  virtio_blk_rw_complete 449.469 req=0x91e08f0 ret=0x0 -> First is used.
>  virtio_blk_req_complete 1.955 req=0x91e08f0 status=0x0 -> First is freed.
>
>  second is never seen again.

Sounds scary 8).

Stefan


* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
  2011-05-20 14:32             ` Stefan Hajnoczi
@ 2011-05-20 14:52               ` Steve Kemp
  0 siblings, 0 replies; 11+ messages in thread
From: Steve Kemp @ 2011-05-20 14:52 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm

On Fri May 20, 2011 at 15:32:34 +0100, Stefan Hajnoczi wrote:

> >  virtio_blk_alloc_request 0.000 req=0x91e08f0  -> Allocation 1
> >  virtio_blk_alloc_request 77.659 req=0x9215650  -> Allocation 2
>
> Are you sure this isn't the temporary one that is allocated but freed
> immediately once the virtqueue is empty:

  Good catch.  Adding traces above both qemu_free() calls, I can
 see that the allocation and freeing of VirtIOBlockReq structures
 are paired.

  Looks like I'm going to have to bite the bullet and do real allocation
 tracking.

Steve Kemp