* 2.6.38.1 general protection fault
@ 2011-03-25  9:32 Tomasz Chmielewski
  2011-03-26  9:15 ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Tomasz Chmielewski @ 2011-03-25  9:32 UTC (permalink / raw)
  To: kvm

I got this on a 2.6.38.1 system which (I think) had some problem accessing a guest image on a btrfs filesystem.


general protection fault: 0000 [#1] SMP 
last sysfs file: /sys/kernel/uevent_seqnum
CPU 0 
Modules linked in: ipt_MASQUERADE vhost_net kvm_intel kvm iptable_filter xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables bridge stp btrfs zlib_deflate crc32c libcrc32c coretemp f71882fg snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc tpm_tis tpm tpm_bios pcspkr i7core_edac edac_core r8169 mii raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 ahci libahci sata_nv sata_sil sata_via 3w_9xxx 3w_xxxx [last unloaded: scsi_wait_scan]

Pid: 10199, comm: kvm Not tainted 2.6.38.1 #1 MSI MS-7522/MSI X58 Pro-E (MS-7522)
RIP: 0010:[<ffffffffa02cae20>]  [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
RSP: 0018:ffff880508ee9bf0  EFLAGS: 00010202
RAX: 00008805d6b087f8 RBX: ffff8805b7b10000 RCX: 0000000000000050
RDX: 0000000000000000 RSI: 00008805d6b087f8 RDI: ffff8805b7b10000
RBP: ffff880508ee9c10 R08: ffff8801061d4000 R09: ffffc9001f19aff0
R10: 0000000000000030 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc9001f19aff8 R14: 0000000000000060 R15: ffff8801061d4000
FS:  00007f7ca25d6730(0000) GS:ffff8800bf400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000462b10 CR3: 00000003ac47f000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kvm (pid: 10199, threadinfo ffff880508ee8000, task ffff88001b5a5b00)
Stack:
 ffffffffffffffcf 00000000000220ff 0000000000000001 ffff8801061d4050
 ffff880508ee9c80 ffffffffa02c8a54 0000000000000030 ffffffffa02cae00
 0000000000000000 00007f7c80a2b000 ffff8805b7b10000 0000000000000001
Call Trace:
 [<ffffffffa02c8a54>] kvm_handle_hva+0xb4/0x170 [kvm]
 [<ffffffffa02cae00>] ? kvm_unmap_rmapp+0x0/0x70 [kvm]
 [<ffffffffa02c8b27>] kvm_unmap_hva+0x17/0x20 [kvm]
 [<ffffffffa02b1e72>] kvm_mmu_notifier_invalidate_range_start+0x62/0xb0 [kvm]
 [<ffffffff8113ea11>] __mmu_notifier_invalidate_range_start+0x51/0x70
 [<ffffffff8111e2c1>] copy_page_range+0x3b1/0x460
 [<ffffffff812c5628>] ? rb_insert_color+0x98/0x140
 [<ffffffff81060cdc>] dup_mm+0x2fc/0x500
 [<ffffffff810617fe>] copy_process+0x8be/0x11b0
 [<ffffffff81062165>] do_fork+0x75/0x350
 [<ffffffff81177bcd>] ? mntput+0x1d/0x40
 [<ffffffff8115b095>] ? fput+0x1e5/0x270
 [<ffffffff815aa7f5>] ? _raw_spin_lock_irq+0x15/0x20
 [<ffffffff81075141>] ? sigprocmask+0x91/0x110
 [<ffffffff81014ab8>] sys_clone+0x28/0x30
 [<ffffffff8100c3e3>] stub_clone+0x13/0x20
 [<ffffffff8100c0c2>] ? system_call_fastpath+0x16/0x1b
Code: 49 89 01 eb 91 66 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 45 31 e4 48 89 fb 49 89 f5 eb 1d 0f 1f 00 <f6> 06 01 74 38 48 8b 15 a4 66 02 00 48 89 df 41 bc 01 00 00 00 
RIP  [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
 RSP <ffff880508ee9bf0>
---[ end trace 85201a339b7635fc ]---



-- 
Tomasz Chmielewski
http://wpkg.org


* Re: 2.6.38.1 general protection fault
  2011-03-25  9:32 2.6.38.1 general protection fault Tomasz Chmielewski
@ 2011-03-26  9:15 ` Avi Kivity
  2011-03-26 10:42   ` Tomasz Chmielewski
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2011-03-26  9:15 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm, Andrea Arcangeli

On 03/25/2011 11:32 AM, Tomasz Chmielewski wrote:
> I got this on a 2.6.38.1 system which (I think) had some problem accessing guest image on a btrfs filesystem.
>
>
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/kernel/uevent_seqnum
> CPU 0
> Modules linked in: ipt_MASQUERADE vhost_net kvm_intel kvm iptable_filter xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables bridge stp btrfs zlib_deflate crc32c libcrc32c coretemp f71882fg snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc tpm_tis tpm tpm_bios pcspkr i7core_edac edac_core r8169 mii raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 ahci libahci sata_nv sata_sil sata_via 3w_9xxx 3w_xxxx [last unloaded: scsi_wait_scan]
>
> Pid: 10199, comm: kvm Not tainted 2.6.38.1 #1 MSI MS-7522/MSI X58 Pro-E (MS-7522)
> RIP: 0010:[<ffffffffa02cae20>]  [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
> RSP: 0018:ffff880508ee9bf0  EFLAGS: 00010202
> RAX: 00008805d6b087f8 RBX: ffff8805b7b10000 RCX: 0000000000000050
> RDX: 0000000000000000 RSI: 00008805d6b087f8 RDI: ffff8805b7b10000
> RBP: ffff880508ee9c10 R08: ffff8801061d4000 R09: ffffc9001f19aff0
> R10: 0000000000000030 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffc9001f19aff8 R14: 0000000000000060 R15: ffff8801061d4000
> FS:  00007f7ca25d6730(0000) GS:ffff8800bf400000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000462b10 CR3: 00000003ac47f000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kvm (pid: 10199, threadinfo ffff880508ee8000, task ffff88001b5a5b00)
> Stack:
>   ffffffffffffffcf 00000000000220ff 0000000000000001 ffff8801061d4050
>   ffff880508ee9c80 ffffffffa02c8a54 0000000000000030 ffffffffa02cae00
>   0000000000000000 00007f7c80a2b000 ffff8805b7b10000 0000000000000001
> Call Trace:
>   [<ffffffffa02c8a54>] kvm_handle_hva+0xb4/0x170 [kvm]
>   [<ffffffffa02cae00>] ? kvm_unmap_rmapp+0x0/0x70 [kvm]
>   [<ffffffffa02c8b27>] kvm_unmap_hva+0x17/0x20 [kvm]
>   [<ffffffffa02b1e72>] kvm_mmu_notifier_invalidate_range_start+0x62/0xb0 [kvm]
>   [<ffffffff8113ea11>] __mmu_notifier_invalidate_range_start+0x51/0x70
>   [<ffffffff8111e2c1>] copy_page_range+0x3b1/0x460
>   [<ffffffff812c5628>] ? rb_insert_color+0x98/0x140
>   [<ffffffff81060cdc>] dup_mm+0x2fc/0x500
>   [<ffffffff810617fe>] copy_process+0x8be/0x11b0
>   [<ffffffff81062165>] do_fork+0x75/0x350
>   [<ffffffff81177bcd>] ? mntput+0x1d/0x40
>   [<ffffffff8115b095>] ? fput+0x1e5/0x270
>   [<ffffffff815aa7f5>] ? _raw_spin_lock_irq+0x15/0x20
>   [<ffffffff81075141>] ? sigprocmask+0x91/0x110
>   [<ffffffff81014ab8>] sys_clone+0x28/0x30
>   [<ffffffff8100c3e3>] stub_clone+0x13/0x20
>   [<ffffffff8100c0c2>] ? system_call_fastpath+0x16/0x1b
> Code: 49 89 01 eb 91 66 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 45 31 e4 48 89 fb 49 89 f5 eb 1d 0f 1f 00<f6>  06 01 74 38 48 8b 15 a4 66 02 00 48 89 df 41 bc 01 00 00 00
> RIP  [<ffffffffa02cae20>] kvm_unmap_rmapp+0x20/0x70 [kvm]
>   RSP<ffff880508ee9bf0>
> ---[ end trace 85201a339b7635fc ]---
>
>
>
    0:    55                       push   %rbp
    1:    48 89 e5                 mov    %rsp,%rbp
    4:    41 55                    push   %r13
    6:    41 54                    push   %r12
    8:    53                       push   %rbx
    9:    48 83 ec 08              sub    $0x8,%rsp
    d:    0f 1f 44 00 00           nopl   0x0(%rax,%rax,1)
   12:    45 31 e4                 xor    %r12d,%r12d
   15:    48 89 fb                 mov    %rdi,%rbx
   18:    49 89 f5                 mov    %rsi,%r13
   1b:    eb 1d                    jmp    0x3a
   1d:    0f 1f 00                 nopl   (%rax)
   20:    f6 06 01                 testb  $0x1,(%rsi)


Looks like the top 16 bits of %rsi are flipped.

Also weird to see a fork().  What's your qemu command line?

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



* Re: 2.6.38.1 general protection fault
  2011-03-26  9:15 ` Avi Kivity
@ 2011-03-26 10:42   ` Tomasz Chmielewski
  2011-03-27  9:42     ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Tomasz Chmielewski @ 2011-03-26 10:42 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Andrea Arcangeli

On 26.03.2011 10:15, Avi Kivity wrote:
> On 03/25/2011 11:32 AM, Tomasz Chmielewski wrote:
>> I got this on a 2.6.38.1 system which (I think) had some problem 
>> accessing guest image on a btrfs filesystem.
>>
>>
>> general protection fault: 0000 [#1] SMP

(...)

> 0: 55 push %rbp
> 1: 48 89 e5 mov %rsp,%rbp
> 4: 41 55 push %r13
> 6: 41 54 push %r12
> 8: 53 push %rbx
> 9: 48 83 ec 08 sub $0x8,%rsp
> d: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 12: 45 31 e4 xor %r12d,%r12d
> 15: 48 89 fb mov %rdi,%rbx
> 18: 49 89 f5 mov %rsi,%r13
> 1b: eb 1d jmp 0x3a
> 1d: 0f 1f 00 nopl (%rax)
> 20: f6 06 01 testb $0x1,(%rsi)
> 
> 
> Looks like the top 16 bits of %rsi are flipped.
> 
> Also weird to see a fork(). What's your qemu command line?

/usr/bin/kvm -monitor unix:/var/run/qemu-server/113.mon,server,nowait -vnc unix:/var/run/qemu-server/113.vnc,password -pidfile /var/run/qemu-server/113.pid -daemonize -usbdevice tablet -name swcache -smp sockets=1,cores=1 -nodefaults -boot menu=on -vga cirrus -tdf -k de -drive file=/var/lib/vz/template/iso/systemrescuecd-x86-2.0.0.iso,if=ide,index=2,media=cdrom -drive file=/var/lib/vz/images/113/vm-113-disk-1.raw,if=scsi,index=0,cache=none,boot=on -m 1024 -netdev type=tap,id=vlan0d0,ifname=tap113i0d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=DE:42:48:50:D8:69,netdev=vlan0d0 -netdev type=tap,id=vlan100d0,ifname=tap113i100d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=72:D2:6E:8E:07:4D,netdev=vlan100d0



-- 
Tomasz Chmielewski
http://wpkg.org


* Re: 2.6.38.1 general protection fault
  2011-03-26 10:42   ` Tomasz Chmielewski
@ 2011-03-27  9:42     ` Avi Kivity
  2011-03-28  6:24       ` Tomasz Chmielewski
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2011-03-27  9:42 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm, Andrea Arcangeli

On 03/26/2011 12:42 PM, Tomasz Chmielewski wrote:
> On 26.03.2011 10:15, Avi Kivity wrote:
> >  On 03/25/2011 11:32 AM, Tomasz Chmielewski wrote:
> >>  I got this on a 2.6.38.1 system which (I think) had some problem
> >>  accessing guest image on a btrfs filesystem.
> >>
> >>
> >>  general protection fault: 0000 [#1] SMP
>
> (...)
>
> >  0: 55 push %rbp
> >  1: 48 89 e5 mov %rsp,%rbp
> >  4: 41 55 push %r13
> >  6: 41 54 push %r12
> >  8: 53 push %rbx
> >  9: 48 83 ec 08 sub $0x8,%rsp
> >  d: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> >  12: 45 31 e4 xor %r12d,%r12d
> >  15: 48 89 fb mov %rdi,%rbx
> >  18: 49 89 f5 mov %rsi,%r13
> >  1b: eb 1d jmp 0x3a
> >  1d: 0f 1f 00 nopl (%rax)
> >  20: f6 06 01 testb $0x1,(%rsi)
> >
> >
> >  Looks like the top 16 bits of %rsi are flipped.
> >
> >  Also weird to see a fork(). What's your qemu command line?
>
> /usr/bin/kvm -monitor unix:/var/run/qemu-server/113.mon,server,nowait -vnc unix:/var/run/qemu-server/113.vnc,password -pidfile /var/run/qemu-server/113.pid -daemonize -usbdevice tablet -name swcache -smp sockets=1,cores=1 -nodefaults -boot menu=on -vga cirrus -tdf -k de -drive file=/var/lib/vz/template/iso/systemrescuecd-x86-2.0.0.iso,if=ide,index=2,media=cdrom -drive file=/var/lib/vz/images/113/vm-113-disk-1.raw,if=scsi,index=0,cache=none,boot=on -m 1024 -netdev type=tap,id=vlan0d0,ifname=tap113i0d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=DE:42:48:50:D8:69,netdev=vlan0d0 -netdev type=tap,id=vlan100d0,ifname=tap113i100d0,script=/var/lib/qemu-server/bridge-vlan,vhost=on -device virtio-net-pci,mac=72:D2:6E:8E:07:4D,netdev=vlan100d0
>
>

Okay, the fork came from the ,script=.

The issue with %rsi looks like a use-after-free; however,
kvm_mmu_notifier_invalidate_range_start appears to be properly
SRCU-protected.


-- 
error compiling committee.c: too many arguments to function



* Re: 2.6.38.1 general protection fault
  2011-03-27  9:42     ` Avi Kivity
@ 2011-03-28  6:24       ` Tomasz Chmielewski
  2011-03-28  9:19         ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Tomasz Chmielewski @ 2011-03-28  6:24 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Andrea Arcangeli

On 27.03.2011 11:42, Avi Kivity wrote:

(...)

> Okay, the fork came from the ,script=.
>
> The issue with %rsi looks like a use-after-free, however
> kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
> protected.

FYI, I saw this one as well:

http://www.virtall.com/files/temp/kvm.txt


If you need to look at the config, it's available here:

http://www.virtall.com/files/temp/config-2.6.38.1

-- 
Tomasz Chmielewski
http://wpkg.org


* Re: 2.6.38.1 general protection fault
  2011-03-28  6:24       ` Tomasz Chmielewski
@ 2011-03-28  9:19         ` Avi Kivity
  2011-03-28 17:54           ` Andrea Arcangeli
  2011-03-29 13:34           ` Marcelo Tosatti
  0 siblings, 2 replies; 13+ messages in thread
From: Avi Kivity @ 2011-03-28  9:19 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm, Andrea Arcangeli, Marcelo Tosatti

On 03/28/2011 08:24 AM, Tomasz Chmielewski wrote:
> On 27.03.2011 11:42, Avi Kivity wrote:
>
> (...)
>
>> Okay, the fork came from the ,script=.
>>
>> The issue with %rsi looks like a use-after-free, however
>> kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
>> protected.
>
> FYI, I saw this one as well:
>
> http://www.virtall.com/files/temp/kvm.txt

Similar pattern - top 16 bits of %rsi are flipped.

Marcelo, what was the option to enable padding for allocations and 
overrun detection?  Also use-after-free?

-- 
error compiling committee.c: too many arguments to function



* Re: 2.6.38.1 general protection fault
  2011-03-28  9:19         ` Avi Kivity
@ 2011-03-28 17:54           ` Andrea Arcangeli
  2011-03-28 18:02             ` Avi Kivity
  2011-03-29 13:34           ` Marcelo Tosatti
  1 sibling, 1 reply; 13+ messages in thread
From: Andrea Arcangeli @ 2011-03-28 17:54 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm, Marcelo Tosatti

Hello everyone,

On Mon, Mar 28, 2011 at 11:19:51AM +0200, Avi Kivity wrote:
> On 03/28/2011 08:24 AM, Tomasz Chmielewski wrote:
> > On 27.03.2011 11:42, Avi Kivity wrote:
> >
> > (...)
> >
> >> Okay, the fork came from the ,script=.
> >>
> >> The issue with %rsi looks like a use-after-free, however
> >> kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
> >> protected.
> >
> > FYI, I saw this one as well:
> >
> > http://www.virtall.com/files/temp/kvm.txt
> 
> Similar pattern - top 16 bits of %rsi are flipped.
> 
> Marcelo, what was the option to enable padding for allocations and 
> overrun detection?  Also use-after-free?

BTW, is it legitimate that a protection fault is generated instead of a page
fault while dereferencing address 0x00008805d6b087f8? I would normally
expect a page fault from a memory dereference that doesn't alter
processor state/segments.

The other GPF happened in pmdp_clear_flush_notify inside
collapse_huge_page.


* Re: 2.6.38.1 general protection fault
  2011-03-28 17:54           ` Andrea Arcangeli
@ 2011-03-28 18:02             ` Avi Kivity
  2011-03-28 20:04               ` Andrea Arcangeli
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2011-03-28 18:02 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Tomasz Chmielewski, kvm, Marcelo Tosatti

On 03/28/2011 07:54 PM, Andrea Arcangeli wrote:
> BTW, is it genuine that a protection fault is generated instead of a page
> fault while dereferencing address 0x00008805d6b087f8? I would normally
> expect a page fault from a memory dereference that doesn't alter
> processor state/segments.

Yes.  Bits 48-63 of the address must be equal to bit 47, or a #GP is 
generated (non-canonical address).

-- 
error compiling committee.c: too many arguments to function



* Re: 2.6.38.1 general protection fault
  2011-03-28 18:02             ` Avi Kivity
@ 2011-03-28 20:04               ` Andrea Arcangeli
  2011-03-28 20:14                 ` Tomasz Chmielewski
  0 siblings, 1 reply; 13+ messages in thread
From: Andrea Arcangeli @ 2011-03-28 20:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm, Marcelo Tosatti

On Mon, Mar 28, 2011 at 08:02:47PM +0200, Avi Kivity wrote:
> On 03/28/2011 07:54 PM, Andrea Arcangeli wrote:
> > BTW, is it genuine that a protection fault is generated instead of a page
> > fault while dereferencing address 0x00008805d6b087f8? I would normally
> > expect a page fault from a memory dereference that doesn't alter
> > processor state/segments.
> 
> Yes.  Bits 48-63 of the address must be equal to bit 47, or a #GP is 
> generated (non-canonical address).

OK, when you said the top 16 bits were flipped, I didn't connect it to
bit 48 and the 128TB limit of user address space. I thought it was a
good idea to check, because in the past I've seen GPFs that were
hardware issues triggering on a normal memory dereference, but this is
probably not the case.

Tomasz, how easily can you reproduce? Could you upload to the site the
output of objdump -dr arch/x86/kvm/mmu.o too? (My assembly is vastly
different from the one shown so far; I may find more info in the oops
if I also get the assembly of the caller and of the iteration of the
loop that runs in that function before the GPF.)

khugepaged is present in your second trace. It must be working on some
memslot range with guest gfns mapped, or kvm_unmap_rmapp wouldn't be
called in the first place (I hope the memslots are all OK). You probably
didn't get the right alignment, though, so the THPs are likely mapped as
4k pages in the guest, which must work fine too. I wonder if the bug is
related to that. I keep my qemu-kvm patched with the patch below, which
isn't yet polished enough to be digestible for qemu: it has wrong
alignments, the x86 4M alignment isn't handled yet, and I'm not sure if
the DONTFORK fix to prevent OOM with hotplug/migrate is acceptable in
that position.

Can you try to "echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs"
and then run "cat /proc/`pgrep qemu`/smaps >/dev/null" once per minute (or find
the right pid by hand if you've more than one qemu process running).
This debug trick will only work for 2.6.38.1, as 2.6.39 has a native
THP handling in the smaps file, but in 2.6.38.1 it should flush all
sptes mapped on THP just like fork (this might help to reproduce).

I'm also surprised this happened during the fork that initializes the
tap interface; shouldn't that fork run before any sptes are established?
(We're running the spte invalidate with mmu notifier in the parent
before wrprotecting the ptes during fork.)

I also wonder if it's a memslot race of some kind; I don't see
anything wrong in the rmapp handling at the moment.

This isn't a patch to try; I'm only showing it here for reference, as I
suspect it might hide the bug. I'm now going to revert it and see if I
can reproduce, in case having large sptes (instead of 4k sptes) always
mapped on host THP changes something.

Thanks!

diff --git a/exec.c b/exec.c
index bb0c1be..f60e5fe 100644
--- a/exec.c
+++ b/exec.c
@@ -2856,6 +2856,18 @@ static ram_addr_t last_ram_offset(void)
     return last;
 }
 
+#if defined(__linux__) && defined(__x86_64__)
+/*
+ * Align on the max transparent hugepage size so that
+ * "(gfn ^ pfn) & (HPAGE_SIZE-1) == 0" to allow KVM to
+ * take advantage of hugepages with NPT/EPT or to
+ * ensure the first 2M of the guest physical ram will
+ * be mapped by the same hugetlb for QEMU (it is worth
+ * it even without NPT/EPT).
+ */
+#define PREFERRED_RAM_ALIGN (2*1024*1024)
+#endif
+
 ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    ram_addr_t size, void *host)
 {
@@ -2902,9 +2914,15 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    PROT_EXEC|PROT_READ|PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
 #else
-            new_block->host = qemu_vmalloc(size);
+#ifdef PREFERRED_RAM_ALIGN
+	    if (size >= PREFERRED_RAM_ALIGN)
+		    new_block->host = qemu_memalign(PREFERRED_RAM_ALIGN, size);
+	    else
+#endif
+		    new_block->host = qemu_vmalloc(size);
 #endif
             qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE);
+            qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
         }
     }
 


* Re: 2.6.38.1 general protection fault
  2011-03-28 20:04               ` Andrea Arcangeli
@ 2011-03-28 20:14                 ` Tomasz Chmielewski
  2011-04-20  9:28                   ` Thomas Treutner
  0 siblings, 1 reply; 13+ messages in thread
From: Tomasz Chmielewski @ 2011-03-28 20:14 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Avi Kivity, kvm, Marcelo Tosatti

On 28.03.2011 22:04, Andrea Arcangeli wrote:

> Tomasz, how easily can you reproduce?

Well, this server runs 10 VMs or so, and it happens after 1-2 days of 
uptime.

I have now reverted to 2.6.35.x, as the server had enough downtime with
2.6.38 already ;) so I'd rather not experiment any more for some time
with a kernel known to cause problems.


> Could you upload to the site the
> output of objdump -dr arch/x86/kvm/mmu.o too?

http://virtall.com/files/temp/mmu-objdump.txt


-- 
Tomasz Chmielewski
http://wpkg.org




* Re: 2.6.38.1 general protection fault
  2011-03-28  9:19         ` Avi Kivity
  2011-03-28 17:54           ` Andrea Arcangeli
@ 2011-03-29 13:34           ` Marcelo Tosatti
  1 sibling, 0 replies; 13+ messages in thread
From: Marcelo Tosatti @ 2011-03-29 13:34 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Tomasz Chmielewski, kvm, Andrea Arcangeli

On Mon, Mar 28, 2011 at 11:19:51AM +0200, Avi Kivity wrote:
> On 03/28/2011 08:24 AM, Tomasz Chmielewski wrote:
> >On 27.03.2011 11:42, Avi Kivity wrote:
> >
> >(...)
> >
> >>Okay, the fork came from the ,script=.
> >>
> >>The issue with %rsi looks like a use-after-free, however
> >>kvm_mmu_notifier_invalidate_range_start appears to be properly srcu
> >>protected.
> >
> >FYI, I saw this one as well:
> >
> >http://www.virtall.com/files/temp/kvm.txt
> 
> Similar pattern - top 16 bits of %rsi are flipped.
> 
> Marcelo, what was the option to enable padding for allocations and
> overrun detection?  Also use-after-free?

slub_debug=ZFPU boot kernel parameter.

Documentation/vm/slub.txt:

Possible debug options are
        F               Sanity checks on (enables SLAB_DEBUG_FREE. Sorry
                        SLAB legacy issues)
        Z               Red zoning
        P               Poisoning (object and padding)
        U               User tracking (free and alloc)




* Re: 2.6.38.1 general protection fault
  2011-03-28 20:14                 ` Tomasz Chmielewski
@ 2011-04-20  9:28                   ` Thomas Treutner
  2011-04-20 10:54                     ` Tomasz Chmielewski
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Treutner @ 2011-04-20  9:28 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: kvm

On 03/28/2011 10:14 PM, Tomasz Chmielewski wrote:
> On 28.03.2011 22:04, Andrea Arcangeli wrote:
>
>> Tomasz, how easily can you reproduce?
>
> Well, this server runs 10 VMs or so, and it happens after 1-2 days of
> uptime.
>
> I reverted now to a 2.6.35.x, as it had enough downtime with 2.6.38
> already ;) so I'd rather not experiment anymore for some time with a
> kernel known to cause problems.

Tomasz, to which exact kernel version (host+guests) did you switch and 
is it now stable?

thanks, -t


* Re: 2.6.38.1 general protection fault
  2011-04-20  9:28                   ` Thomas Treutner
@ 2011-04-20 10:54                     ` Tomasz Chmielewski
  0 siblings, 0 replies; 13+ messages in thread
From: Tomasz Chmielewski @ 2011-04-20 10:54 UTC (permalink / raw)
  To: Thomas Treutner; +Cc: kvm

On 20.04.2011 11:28, Thomas Treutner wrote:
> On 03/28/2011 10:14 PM, Tomasz Chmielewski wrote:
>> On 28.03.2011 22:04, Andrea Arcangeli wrote:
>>
>>> Tomasz, how easily can you reproduce?
>>
>> Well, this server runs 10 VMs or so, and it happens after 1-2 days of
>> uptime.
>>
>> I reverted now to a 2.6.35.x, as it had enough downtime with 2.6.38
>> already ;) so I'd rather not experiment anymore for some time with a
>> kernel known to cause problems.
>
> Tomasz, to which exact kernel version (host+guests) did you switch and
> is it now stable?

I've switched the host to the latest 2.6.35.x and it's stable.

The guest kernel doesn't seem to make a difference here, but the majority
of them are running a 2.6.38.x kernel (I had some weird issues with
"events/0" taking 100% CPU on guests when I used 2.6.35, which made the
guests crawl).


-- 
Tomasz Chmielewski
http://wpkg.org

