* How to diagnose memory leak in kvm-qemu-0.14.0?
From: Steve Kemp @ 2011-05-18 16:44 UTC (permalink / raw)
To: kvm
I'm running the most recent release of KVM, version 0.14.0
on a host kernel 2.6.32.15, and seem to be able to trigger
a leak of memory pretty easily.
Inside a guest the following one-liner will cause the KVM
process on the host to gradually increase its memory
consumption:
while true; do
wget http://mirror.bytemark.co.uk/misc/test-files/500M; cp 500M new; rm 500M new; sleep 10 ;
done
The guest is launched using loopback files:
"/opt/kvm/bin/qemu-system-x86_64 -m 500 -drive
file=/machines/kvm6/jail/root_fs,if=virtio,cache=off -drive
file=/machines/kvm6/jail/swap_file,if=virtio,boot=off -net
nic,macaddr=fe:ff:00:00:5a:cb,model=virtio -net
tap,ifname=tap_kvm6,script=no -pidfile /var/kvm/kvm6.pid -name kvm6
-serial /dev/tty -no-reboot -monitor
unix:/var/kvm/kvm6.mon,server,nowait -kernel
/machines/kvm6/jail/linux -append root=/dev/vda clocksource=acpi_pm
notsc console=tty0 console=ttyS0,115200n8 -nographic -boot c
"
So we're using an external kernel, and virtio for both block
and NIC. Using e1000, rtl8139, or virtio for the NIC results in the
same leak.
top shows this:
23074 kvm6 20 0 656m 515m 1060 S 0.0 0.8 13:10.79 qemu-system-x86
So 656MB virt, 515MB resident, and over time the virtual memory
increases significantly.
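To put numbers on the growth rather than eyeballing top, the host can
sample VmSize and VmRSS from /proc/<pid>/status. This is only a minimal
sketch: the function names are made up for illustration, and it assumes
a Linux host where the pid comes from the -pidfile given on the command
line above.

```python
import re
import time


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def sample(pid):
    """Read the current VmSize/VmRSS of a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vm_fields(f.read())


def growth(samples):
    """VmSize growth in kB between the first and the last sample."""
    return samples[-1]["VmSize"] - samples[0]["VmSize"]


def monitor(pid, interval=10, count=6):
    """Take `count` samples, `interval` seconds apart, and return them."""
    samples = []
    for _ in range(count):
        samples.append(sample(pid))
        time.sleep(interval)
    return samples


# Hypothetical usage on the host, with the pidfile from the command line:
#   pid = int(open("/var/kvm/kvm6.pid").read())
#   print(growth(monitor(pid)), "kB of VmSize growth")
```

A steadily positive growth figure while the guest loop runs, and a flat
one when it is idle, would support the leak being tied to guest I/O.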
Should I be blaming KVM for this leak? Or is it possible it is either
the host or the guest kernels? Any assistance in tracking down the
leak is most welcome - even vague hints.
If helpful I'd be happy to share logins to either a host or a leaky
guest, or both.
Steve Kemp
--
Bytemark Hosting
http://www.bytemark.co.uk/
phone UK: 0845 004 3 004
Dedicated Linux hosts from 15ukp ($30) per month
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Stefan Hajnoczi @ 2011-05-19 8:40 UTC (permalink / raw)
To: Steve Kemp; +Cc: kvm
On Wed, May 18, 2011 at 5:44 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
>
> I'm running the most recent release of KVM, version 0.14.0
> on a host kernel 2.6.32.15, and seem to be able to trigger
> a leak of memory pretty easily.
>
> Inside a guest the following one-liner will cause the KVM
> process on the host to gradually increase its memory
> consumption:
>
> while true; do
> wget http://mirror.bytemark.co.uk/misc/test-files/500M; cp 500M new; rm 500M new; sleep 10 ;
> done
You are exercising both networking and storage. Have you cut the test
down to just wget vs cp/rm? Also why the sleep 10?
If you are building qemu-kvm from source you might like to enable
tracing to track memory allocations in qemu-kvm. For full information
see qemu-kvm/docs/tracing.txt. There are several trace events of
interest:
$ cd qemu-kvm
$ $EDITOR trace-events
# qemu-malloc.c
disable qemu_malloc(size_t size, void *ptr) "size %zu ptr %p"
disable qemu_realloc(void *ptr, size_t size, void *newptr) "ptr %p size %zu newptr %p"
disable qemu_free(void *ptr) "ptr %p"
# osdep.c
disable qemu_memalign(size_t alignment, size_t size, void *ptr) "alignment %zu size %zu ptr %p"
disable qemu_vmalloc(size_t size, void *ptr) "size %zu ptr %p"
disable qemu_vfree(void *ptr) "ptr %p"
^--- remove the "disable" property from these memory allocation events
$ ./configure --enable-trace-backend=simple [...]
$ make
$ # run the VM, reproduce the leak, shut the VM down
$ scripts/simpletrace.py trace-events trace-<pid> # where <pid> was the process ID
It is fairly easy to write a script that correlates mallocs and frees,
printing out memory allocations that were never freed at the end.
There is a Python API for processing trace files; here is an
explanation of how to use it:
http://blog.vmsplice.net/2011/03/how-to-write-trace-analysis-scripts-for.html
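As a rough sketch of the correlation described above, the following
works on the text that scripts/simpletrace.py prints rather than on the
Python API itself. It assumes each line looks like
"event_name timestamp key=value ..." with numeric values, and that the
field names match the trace-events definitions listed earlier (size,
ptr, newptr). The function names are invented for illustration.

```python
import re

# "event_name  delta_timestamp  key=value key=value ..."
LINE_RE = re.compile(r"^(\w+)\s+[\d.]+\s+(.*)$")

ALLOC_EVENTS = ("qemu_malloc", "qemu_memalign", "qemu_vmalloc")
FREE_EVENTS = ("qemu_free", "qemu_vfree")


def parse_line(line):
    """Split one formatted trace line into (event name, {field: int})."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    name, rest = m.groups()
    args = {}
    for token in rest.split():
        if "=" not in token:
            continue
        key, value = token.split("=", 1)
        try:
            args[key] = int(value, 0)  # accepts 0x-prefixed hex or decimal
        except ValueError:
            pass
    return name, args


def find_unfreed(lines):
    """Return {ptr: size} for allocations with no matching free."""
    live = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        name, args = parsed
        if name in ALLOC_EVENTS and "ptr" in args:
            live[args["ptr"]] = args.get("size", 0)
        elif name == "qemu_realloc":
            # The old block is gone; the new one (if any) is now live.
            live.pop(args.get("ptr"), None)
            if args.get("newptr"):
                live[args["newptr"]] = args.get("size", 0)
        elif name in FREE_EVENTS:
            live.pop(args.get("ptr"), None)
    return live


# Hypothetical usage:
#   scripts/simpletrace.py trace-events trace-<pid> > trace.txt
#   for ptr, size in find_unfreed(open("trace.txt")).items():
#       print("0x%x  %d bytes" % (ptr, size))
```

Anything still listed at the end was allocated during the trace and not
freed before shutdown, so long-lived startup allocations will show up
alongside any real leak and need to be filtered out by eye.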
If you have SystemTap installed you may wish to use the "dtrace"
backend instead of "simple". You can then use SystemTap scripts on
the probes. SystemTap is more powerful; it should allow you to
extract call stacks when probes fire, but I'm not experienced with
it.
Feel free to contact me on #qemu (oftc) or #kvm (freenode) IRC if you
want some pointers, my nick is stefanha.
Stefan
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Stefan Hajnoczi @ 2011-05-19 8:50 UTC (permalink / raw)
To: Steve Kemp; +Cc: kvm
On Thu, May 19, 2011 at 9:40 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Wed, May 18, 2011 at 5:44 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> If you have SystemTap installed you may wish to use the "dtrace"
> backend instead of "simple". You can then use SystemTap scripts on
> the probes. SystemTap is more powerful, it should allow you to
> extract call stacks when probes are fired but I'm not experienced with
> it.
Forgot to add that the __builtin_return_address() gcc extension can be
used to collect return addresses even with the simple trace backend:
http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Return-Address.html#index-g_t_005f_005fbuiltin_005freturn_005faddress-2431
I've used it in the past as a poor man's stack trace when tracking
down memory leaks.
Stefan
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Steve Kemp @ 2011-05-19 11:00 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kvm
On Thu May 19, 2011 at 09:40:41 +0100, Stefan Hajnoczi wrote:
> You are exercising both networking and storage. Have you cut the test
> down to just wget vs cp/rm?
Both seem to leak, but the cp/rm leaks more, which suggests to me
that we're seeing a leak in the virtio block handling.
> Also why the sleep 10?
Just to keep the machine responsive!
> If you are building qemu-kvm from source you might like to enable
> tracing to track memory allocations in qemu-kvm. For full information
> see qemu-kvm/docs/tracing.txt.
Thanks, that was a good read, and your wee recipe was very useful.
I'm now rebuilding with tracing of virtio stuff to see if anything
leaps out at me.
> Feel free to contact me on #qemu (oftc) or #kvm (freenode) IRC if you
> want some pointers, my nick is stefanha.
Thanks a lot.
Steve Kemp
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Steve Kemp @ 2011-05-19 11:57 UTC (permalink / raw)
To: kvm
On Thu May 19, 2011 at 09:40:41 +0100, Stefan Hajnoczi wrote:
> You are exercising both networking and storage. Have you cut the test
> down to just wget vs cp/rm? Also why the sleep 10?
I'm 99% certain the leak is coming from the virtio block device
now. A simple test is:
wget http://mirror.bytemark.co.uk/misc/test-files/500M
while true; do cp 500M foo.img; rm foo.img; sleep 2; done
"top" shows the virt memory growing to >1gb in under two minutes.
Steve Kemp
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Stefan Hajnoczi @ 2011-05-20 11:01 UTC (permalink / raw)
To: Steve Kemp; +Cc: kvm
On Thu, May 19, 2011 at 12:57 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> On Thu May 19, 2011 at 09:40:41 +0100, Stefan Hajnoczi wrote:
>
>> You are exercising both networking and storage. Have you cut the test
>> down to just wget vs cp/rm? Also why the sleep 10?
>
> I'm 99% certain the leak is coming from the virtio block device
> now. A simple test is:
>
> wget http://mirror.bytemark.co.uk/misc/test-files/500M
> while true; do cp 500M foo.img; rm foo.img; sleep 2; done
>
> "top" shows the virt memory growing to >1gb in under two minutes.
Were you able to track down the culprit?
Stefan
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Steve Kemp @ 2011-05-20 11:47 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kvm
On Fri May 20, 2011 at 12:01:58 +0100, Stefan Hajnoczi wrote:
> > wget http://mirror.bytemark.co.uk/misc/test-files/500M
> > while true; do cp 500M foo.img; rm foo.img; sleep 2; done
> >
> > "top" shows the virt memory growing to >1gb in under two minutes.
>
> Were you able to track down the culprit?
Yes, or at least confirm my suspicion. The virtio block device
is the source of the leak.
Host kernel: 2.6.32.15
Guest Kernel: linux-2.6.32.23
Leaking case:
/opt/kvm2/bin/qemu-system-x86_64 -m 500 \
-drive file=/machines/kvm2/jail/root_fs,if=virtio,cache=off
Non leaking case:
/opt/kvm/current/bin/qemu-system-x86_64 -m 500 \
-drive file=/machines/kvm1/jail/root_fs,cache=off ..
The leak occurs with both KVM 0.12.5 and 0.14.0.
I've had a quick read of hw/virtio-blk.c but didn't see anything
glaringly obvious. I'll need to trace through the code, drink more
coffee, or get lucky to narrow it down further.
Steve Kemp
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Stefan Hajnoczi @ 2011-05-20 13:16 UTC (permalink / raw)
To: Steve Kemp; +Cc: kvm
On Fri, May 20, 2011 at 12:47 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> On Fri May 20, 2011 at 12:01:58 +0100, Stefan Hajnoczi wrote:
>
>> > wget http://mirror.bytemark.co.uk/misc/test-files/500M
>> > while true; do cp 500M foo.img; rm foo.img; sleep 2; done
>> >
>> > "top" shows the virt memory growing to >1gb in under two minutes.
>>
>> Were you able to track down the culprit?
>
> Yes, or at least confirm my suspicion. The virtio block device
> is the source of the leak.
>
> Host kernel: 2.6.32.15
> Guest Kernel: linux-2.6.32.23
>
> Leaking case:
>
> opt/kvm2/bin/qemu-system-x86_64 -m 500 \
> -drive file=/machines/kvm2/jail/root_fs,if=virtio,cache=off
>
> Non leaking case:
>
> /opt/kvm/current/bin/qemu-system-x86_64 -m 500 \
> -drive file=/machines/kvm1/jail/root_fs,cache=off ..
>
> The leak occurs with both KVM 0.12.5 and 0.14.0.
>
> I've had a quick read of hw/virtio-blk.c but didn't see anything
> glaringly obvious. I'll need to trace through the code, drink more
> coffee, or get lucky to narrow it down further.
Enabling the memory allocation trace events and adding the
__builtin_return_address() to them should provide enough information
to catch the caller who is leaking memory.
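A sketch of what the analysis side could look like once the caller is
recorded. It assumes the qemu_malloc trace event has been given an extra
"caller" field filled from __builtin_return_address(0); that field does
not exist in the stock trace-events file and is purely hypothetical
here, as is the function name.

```python
from collections import defaultdict


def leaks_by_caller(events):
    """Summarise unfreed allocations per calling address.

    `events` is an iterable of (name, args) pairs in trace order, where
    args is a dict of integer fields. qemu_malloc is assumed to carry a
    hypothetical extra "caller" field next to "size" and "ptr".
    Returns [(caller, count, total_bytes)], largest total first.
    """
    live = {}  # ptr -> (size, caller)
    for name, args in events:
        if name == "qemu_malloc":
            live[args["ptr"]] = (args["size"], args["caller"])
        elif name == "qemu_free":
            live.pop(args["ptr"], None)

    totals = defaultdict(lambda: [0, 0])  # caller -> [count, bytes]
    for size, caller in live.values():
        totals[caller][0] += 1
        totals[caller][1] += size

    return sorted(
        ((caller, count, nbytes) for caller, (count, nbytes) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )


# Hypothetical usage:
#   for caller, count, nbytes in leaks_by_caller(events):
#       print("0x%x: %d allocations, %d bytes" % (caller, count, nbytes))
```

The caller addresses at the top of the list can then be resolved to
function names, for example with addr2line against the qemu binary if
it was built with debug information.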
Stefan
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Steve Kemp @ 2011-05-20 13:47 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kvm
On Fri May 20, 2011 at 14:16:05 +0100, Stefan Hajnoczi wrote:
> > I've had a quick read of hw/virtio-blk.c but didn't see anything
> > glaringly obvious. I'll need to trace through the code, drink more
> > coffee, or get lucky to narrow it down further.
>
> Enabling the memory allocation trace events and adding the
> __builtin_return_address() to them should provide enough information
> to catch the caller who is leaking memory.
I'm trying to do that at the moment. So far the only thing I've
done is add a trace on virtio_blk_alloc_request - I'm noticing
a leak there pretty easily.
I see *two* request structures being allocated all the time: one
is used and freed, the other is ignored. That seems pretty
conclusively wrong to me, but I'm trying to understand how that
happens:
virtio_blk_alloc_request 0.000 req=0x91e08f0 -> Allocation 1
virtio_blk_alloc_request 77.659 req=0x9215650 -> Allocation 2
virtio_blk_rw_complete 449.469 req=0x91e08f0 ret=0x0 -> First is used.
virtio_blk_req_complete 1.955 req=0x91e08f0 status=0x0 -> First is freed.
The second is never seen again.
Steve Kemp
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Stefan Hajnoczi @ 2011-05-20 14:32 UTC (permalink / raw)
To: Steve Kemp; +Cc: kvm
On Fri, May 20, 2011 at 2:47 PM, Steve Kemp <steve@bytemark.co.uk> wrote:
> On Fri May 20, 2011 at 14:16:05 +0100, Stefan Hajnoczi wrote:
>
>> > I've had a quick read of hw/virtio-blk.c but didn't see anything
>> > glaringly obvious. I'll need to trace through the code, drink more
>> > coffee, or get lucky to narrow it down further.
>>
>> Enabling the memory allocation trace events and adding the
>> __builtin_return_address() to them should provide enough information
>> to catch the caller who is leaking memory.
>
> I'm trying to do that at the moment. So far the only thing I've
> done is add a trace on virtio_blk_alloc_request - I'm noticing
> a leak there pretty easily.
>
> I see *two* request structures be allocated all the time, one
> is used and freed, the other is ignored. That seems pretty
> conclusively wrong to me, but I'm trying to understand how that
> happens:
>
> virtio_blk_alloc_request 0.000 req=0x91e08f0 -> Allocation 1
> virtio_blk_alloc_request 77.659 req=0x9215650 -> Allocation 2
Are you sure this isn't the temporary one that is allocated but freed
immediately once the virtqueue is empty:
static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
{
VirtIOBlockReq *req = virtio_blk_alloc_request(s);
if (req != NULL) {
if (!virtqueue_pop(s->vq, &req->elem)) {
qemu_free(req); <--- virtqueue empty, we're done
return NULL;
}
}
return req;
}
> virtio_blk_rw_complete 449.469 req=0x91e08f0 ret=0x0 -> First is used.
> virtio_blk_req_complete 1.955 req=0x91e08f0 status=0x0 -> First is freed.
>
> second is never seen again.
Sounds scary 8).
Stefan
* Re: How to diagnose memory leak in kvm-qemu-0.14.0?
From: Steve Kemp @ 2011-05-20 14:52 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kvm
On Fri May 20, 2011 at 15:32:34 +0100, Stefan Hajnoczi wrote:
> > virtio_blk_alloc_request 0.000 req=0x91e08f0 -> Allocation 1
> > virtio_blk_alloc_request 77.659 req=0x9215650 -> Allocation 2
>
> Are you sure this isn't the temporary one that is allocated but freed
> immediately once the virtqueue is empty:
Good catch. After adding traces above both qemu_free() calls, I can
see that the allocation and freeing of VirtIOBlockReq structures
are paired.
Looks like I'm going to have to bite the bullet and do real allocation
tracking.
Steve Kemp