* Physical memory read: word crosses page boundary + host kernel oops
@ 2007-03-27 14:28 Kiselev, Sergey
From: Kiselev, Sergey @ 2007-03-27 14:28 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


Hi,
 
1. When booting an old Linux guest (RH7.1 based, 2.4.9, 32-bit) on kvm-18,
the kvm userspace process crashes with 'Bus error' (the last output on the
guest's screen is "Uncompressing Linux...").
I did some debugging and found that kvm_readl() calls ldl_phys() with
address 0x9FFFD, so the resulting double-word read crosses a page
boundary.
After looking at qemu/exec.c, it seems that the ld*_phys and st*_phys
functions do not really care about crossing a page boundary (even though
there is a comment saying "warning: addr must be aligned"). So either
qemu/exec.c should be updated to check for this condition or (the more
logical place) qemu/qemu-kvm.c should take care of it.
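
To make the condition concrete, here is a minimal sketch of the kind of check meant above. It is not the qemu code: crosses_page(), read32_bytewise(), read_byte_fn, toy_byte() and the DEMO_* constants are hypothetical names, and 4 KiB pages and little-endian byte order are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u                 /* assumption: 4 KiB pages */
#define DEMO_PAGE_MASK (DEMO_PAGE_SIZE - 1)

/* Nonzero if the byte range [addr, addr + len) spans more than one page. */
static int crosses_page(uint64_t addr, size_t len)
{
    return len > 0 && (addr & DEMO_PAGE_MASK) + len > DEMO_PAGE_SIZE;
}

/* Hypothetical per-byte accessor standing in for a physical memory read. */
typedef uint8_t (*read_byte_fn)(uint64_t addr);

/* Assemble a little-endian 32-bit value one byte at a time, so that no
 * single access straddles a page. */
static uint32_t read32_bytewise(uint64_t addr, read_byte_fn rd)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)rd(addr + (uint64_t)i) << (8 * i);
    return v;
}

/* Toy backing store for demonstration: each byte equals the low 8 bits
 * of its own address. */
static uint8_t toy_byte(uint64_t addr)
{
    return (uint8_t)(addr & 0xff);
}
```

With these definitions, crosses_page(0x9FFFD, 4) is nonzero while crosses_page(0x9FFFC, 4) is zero, so a caller could fall back to read32_bytewise() only for the rare straddling access.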
 
gdb backtrace:
(gdb) bt
#0  ldl_phys (addr=4093) at ../cpu-all.h:322
#1  0x000000000047e08d in kvm_readl (opaque=0x9f, addr=159, 
    data=0x2b63605a5030) at /srv/src/kvm-18/qemu/qemu-kvm.c:543
#2  0x00000000004dd133 in handle_mmio (kvm=0x2821010, kvm_run=0x2b63605a5000)
    at kvmctl.c:627
#3  0x00000000004dd484 in kvm_run (kvm=0x2821010, vcpu=0) at kvmctl.c:718
#4  0x000000000047deb3 in kvm_cpu_exec (env=0x28bc640)
    at /srv/src/kvm-18/qemu/qemu-kvm.c:444
#5  0x000000000047f1ea in cpu_x86_exec (env1=0x9f)
    at /srv/src/kvm-18/qemu/cpu-exec.c:411
#6  0x000000000040c1ca in main_loop () at /srv/src/kvm-18/qemu/vl.c:6255
#7  0x000000000040dc2c in main (argc=9792320, argv=0x28210f0)
    at /srv/src/kvm-18/qemu/vl.c:7708

2. After working around the first issue, I hit the following problem: at
some point in the guest's Linux boot sequence (after running microcode_ctl,
before running kudzu) the following oops happens:
 
Mar 27 12:10:39 itstl140 kernel: Unable to handle kernel paging request at 000000030593a563 RIP:
Mar 27 12:10:39 itstl140 kernel: <ffffffff88366aa6>{:kvm:mmu_page_remove_parent_pte+225}
Mar 27 12:10:39 itstl140 kernel: PGD 15178d067 PUD 0
Mar 27 12:10:39 itstl140 kernel: Oops: 0000 [1] SMP
Mar 27 12:10:39 itstl140 kernel: last sysfs file: /class/net/tap1/ifindex
Mar 27 12:10:39 itstl140 kernel: CPU 3
Mar 27 12:10:39 itstl140 kernel: Modules linked in: kvm_intel kvm tun nfs lockd nfs_acl sunrpc kqemu autofs4 ipv6 cpufreq_ondemand cpufreq_userspace cpufreq_powersave speedstep_centrino freq_table bridge button battery ac apparmor aamatch_pcre loop dm_mod usbhid ehci_hcd uhci_hcd shpchp pci_hotplug i2c_i801 ide_cd usbcore bnx2 i2c_core hw_random cdrom ext3 jbd edd fan thermal processor sg aacraid piix sd_mod scsi_mod ide_disk ide_core
Mar 27 12:10:39 itstl140 kernel: Pid: 13017, comm: qemu-system-x86 Tainted: G     U 2.6.16.27-0.9-smp #1
Mar 27 12:10:39 itstl140 kernel: RIP: 0010:[<ffffffff88366aa6>] <ffffffff88366aa6>{:kvm:mmu_page_remove_parent_pte+225}
Mar 27 12:10:39 itstl140 kernel: RSP: 0018:ffff81014e34f938  EFLAGS: 00010206
Mar 27 12:10:39 itstl140 kernel: RAX: 000000030593a563 RBX: ffff810085006000 RCX: 000000000000003f
Mar 27 12:10:39 itstl140 kernel: RDX: ffff810085006000 RSI: ffff810085006000 RDI: ffff81014cc608c0
Mar 27 12:10:39 itstl140 kernel: RBP: ffff81014cc64248 R08: 0000000000000004 R09: 0000000000000000
Mar 27 12:10:39 itstl140 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff81014cc608c0
Mar 27 12:10:39 itstl140 kernel: R13: 0000000000000fff R14: 0000000000037250 R15: 0000000000000000
Mar 27 12:10:39 itstl140 kernel: FS:  00002b305e5e2340(0000) GS:ffff810430a92a40(0000) knlGS:0000000000000000
Mar 27 12:10:39 itstl140 kernel: CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
Mar 27 12:10:39 itstl140 kernel: CR2: 000000030593a563 CR3: 000000014de5e000 CR4: 00000000000026e0
Mar 27 12:10:39 itstl140 kernel: Process qemu-system-x86 (pid: 13017, threadinfo ffff81014e34e000, task ffff81042ab51790)
Mar 27 12:10:39 itstl140 kernel: Stack: ffff810085006000 ffffffff88366f24 0000000100000002 0000000000000000
Mar 27 12:10:39 itstl140 kernel:        0000000000037249 ffff81043bc23c38 ffff81014cc608c0 0000000000037250
Mar 27 12:10:39 itstl140 kernel:        0000000037250fff 0000000000000001
Mar 27 12:10:39 itstl140 kernel: Call Trace: <ffffffff88366f24>{:kvm:kvm_mmu_pre_write+337}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff883635d5>{:kvm:emulator_write_emulated+132}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8836b8b0>{:kvm:x86_emulate_memop+10819} <ffffffff88367b09>{:kvm:paging32_walk_addr+120}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff88365cae>{:kvm:emulate_instruction+231} <ffffffff8837a0cd>{:kvm_intel:handle_exception+300}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff88379efb>{:kvm_intel:kvm_vmx_return+415} <ffffffff80128483>{find_busiest_group+356}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8836414f>{:kvm:kvm_vcpu_ioctl+623} <ffffffff802cf340>{thread_return+0}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8013ad0c>{__dequeue_signal+395} <ffffffff8013bf5a>{dequeue_signal+60}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8013c55c>{get_signal_to_deliver+345} <ffffffff8010a1fc>{do_signal+1722}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8011103d>{init_fpu+98} <ffffffff8010c398>{math_state_restore+35}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8010b695>{error_exit+0} <ffffffff8018b9bd>{do_ioctl+33}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8018bc4f>{vfs_ioctl+584} <ffffffff8010a594>{sys_rt_sigreturn+670}
Mar 27 12:10:39 itstl140 kernel:        <ffffffff8018bcca>{sys_ioctl+98} <ffffffff8010a7be>{system_call+126}
Mar 27 12:10:39 itstl140 kernel:
Mar 27 12:10:39 itstl140 kernel: Code: 4c 8b 08 41 0f 18 09 48 8d 70 d8 31 c0 e9 39 ff ff ff 48 63
Mar 27 12:10:39 itstl140 kernel: RIP <ffffffff88366aa6>{:kvm:mmu_page_remove_parent_pte+225} RSP <ffff81014e34f938>
Mar 27 12:10:39 itstl140 kernel: CR2: 000000030593a563

I tried disabling both microcode_ctl and kudzu; in that case the oops
happens later.
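
One reading of the register dump, offered as an observation rather than a diagnosis: RAX and CR2 both hold 000000030593a563, and the leading Code bytes 4c 8b 08 decode to mov (%rax),%r9. If the Code line starts at the faulting instruction (an assumption about this kernel's oops format), the fault is consistent with a load through a value that does not look like a kernel pointer, unlike RBX, RDX and RSI, which all start with ffff81. A minimal sketch of that sanity check follows; in_kernel_half() and looks_like_bad_deref() are hypothetical helpers, and 48-bit x86-64 virtual addressing is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64 with 48-bit virtual addresses, where kernel
 * addresses lie in the upper canonical half, i.e. bits 63..47 are all
 * ones. */
static int in_kernel_half(uint64_t addr)
{
    return (addr >> 47) == 0x1ffffu;
}

/* Nonzero if the faulting address equals the base register of the load
 * and that value is not in the kernel half, which hints at a corrupted
 * or stale pointer rather than a missing mapping. */
static int looks_like_bad_deref(uint64_t base_reg, uint64_t fault_addr)
{
    return base_reg == fault_addr && !in_kernel_half(base_reg);
}
```

Feeding in the values from the dump, looks_like_bad_deref(0x30593a563, 0x30593a563) is nonzero, while in_kernel_half(0xffff810085006000) holds for the RBX value.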
 
Thanks,
Sergey


_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel


* Re: Physical memory read: word crosses page boundary + host kernel oops
@ 2007-03-27 14:45   ` Avi Kivity
From: Avi Kivity @ 2007-03-27 14:45 UTC (permalink / raw)
  To: Kiselev, Sergey; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Kiselev, Sergey wrote:
> Hi,
>  
> 1. When booting an old Linux guest (RH7.1 based, 2.4.9, 32-bit) on kvm-18,
> the kvm userspace process crashes with 'Bus error' (the last output on the
> guest's screen is "Uncompressing Linux...").
> I did some debugging and found that kvm_readl() calls ldl_phys() with
> address 0x9FFFD, so the resulting double-word read crosses a page
> boundary.
> After looking at qemu/exec.c, it seems that the ld*_phys and st*_phys
> functions do not really care about crossing a page boundary (even though
> there is a comment saying "warning: addr must be aligned"). So either
> qemu/exec.c should be updated to check for this condition or (the more
> logical place) qemu/qemu-kvm.c should take care of it.
>  
> gdb backtrace:
> (gdb) bt
> #0  ldl_phys (addr=4093) at ../cpu-all.h:322
> #1  0x000000000047e08d in kvm_readl (opaque=0x9f, addr=159,
>     data=0x2b63605a5030) at /srv/src/kvm-18/qemu/qemu-kvm.c:543

This is quite surprising.  I agree that hacking kvm_readl() is the best fix.


> 2. After working around the first issue, I hit the following problem: at
> some point in the guest's Linux boot sequence (after running
> microcode_ctl, before running kudzu) the following oops happens:
>
>
> Mar 27 12:10:39 itstl140 kernel: Code: 4c 8b 08 41 0f 18 09 48 8d 70 
> d8 31 c0 e9 39 ff ff ff 48 63
> Mar 27 12:10:39 itstl140 kernel: RIP 
> <ffffffff88366aa6>{:kvm:mmu_page_remove_parent_pte+225} RSP 
> <ffff81014e34f938>
> Mar 27 12:10:39 itstl140 kernel: CR2: 000000030593a563
> I tried disabling both microcode_ctl and kudzu; in that case the oops
> happens later.

Strangely, I've seen this exact oops somewhere while booting Windows XP in
safe mode.  I haven't been able to reproduce it, though.

If this is reproducible, it may be debugged by turning on audit 
(s/#undef AUDIT/#define AUDIT/ in mmu.c).  Audit slows the guest down, 
but is a little faster if you reduce the amount of guest memory.  If 
this is reproducible using a publicly available image, I may have a go 
at it too.
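
For readers unfamiliar with the pattern, a compile-time audit switch generally looks like the sketch below. This is an illustration only, not the code in mmu.c: struct toy_list, audit_list() and toy_list_push() are hypothetical, and only the #undef/#define AUDIT toggle comes from the suggestion above.

```c
#include <assert.h>

/* Change this line to "#define AUDIT" to compile the checks in. */
#undef AUDIT

/* Hypothetical data structure standing in for whatever is being audited. */
struct toy_list {
    int count;
    int items[4];
};

#ifdef AUDIT
/* Audit build: validate the invariant on every call. This is the extra
 * work that makes an audited build slower. */
static int audit_list(const struct toy_list *l)
{
    return l->count >= 0 && l->count <= 4;
}
#else
/* Normal build: the check compiles down to nothing. */
static int audit_list(const struct toy_list *l)
{
    (void)l;
    return 1;
}
#endif

/* Append a value; returns 0 on success, -1 if the audit fails or the
 * list is full. */
static int toy_list_push(struct toy_list *l, int v)
{
    if (!audit_list(l))
        return -1;
    if (l->count >= 4)
        return -1;
    l->items[l->count++] = v;
    return 0;
}
```

Because the check runs on every operation, its cost grows with the amount of state being walked, which fits the remark that audit is a little faster with less guest memory.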


-- 
error compiling committee.c: too many arguments to function



* Re: Physical memory read: word crosses page boundary + host kernel oops
@ 2007-03-29 13:59       ` Kiselev, Sergey
From: Kiselev, Sergey @ 2007-03-29 13:59 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hi

1. It seems that the first problem happens because this particular
double-word (address 0x9FFFD) sits on the boundary between regular
memory and video memory. The address is probably accessed because of a
bug in that old kernel (I don't see any good reason to read this location).
Still, it would be nice to check for reads/writes to such addresses.
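
The arithmetic behind this: a 4-byte read at 0x9FFFD covers 0x9FFFD through 0xA0000, and 0xA0000 is where the legacy VGA window conventionally begins on a PC. A rough sketch of such a check is below; the region table, find_region() and straddles_regions() are hypothetical and only cover the low-memory layout assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
    const char *name;
};

/* Assumed low-memory layout of a PC: conventional RAM below 0xA0000,
 * legacy VGA window from 0xA0000 up to 0xC0000. */
static const struct region pc_low_map[] = {
    { 0x00000, 0xA0000, "ram" },
    { 0xA0000, 0xC0000, "vga" },
};

/* Return the region containing addr, or NULL if none matches. */
static const struct region *find_region(uint64_t addr)
{
    size_t n = sizeof(pc_low_map) / sizeof(pc_low_map[0]);
    for (size_t i = 0; i < n; i++)
        if (addr >= pc_low_map[i].start && addr < pc_low_map[i].end)
            return &pc_low_map[i];
    return NULL;
}

/* Nonzero if the first and last byte of the access fall in different
 * regions (a single byte can never straddle). */
static int straddles_regions(uint64_t addr, size_t len)
{
    return len > 1 && find_region(addr) != find_region(addr + len - 1);
}
```

Here straddles_regions(0x9FFFD, 4) is nonzero, whereas the aligned read at 0x9FFFC stays entirely in RAM.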

2. The second problem (the oops) has gone away in rev 4571. Not sure why.
It can still be reproduced reliably on kvm-18.

If needed, I can upload an image that reproduces these problems (~150MB
compressed size).

Thanks,
Sergey


* Re: Physical memory read: word crosses page boundary + host kernel oops
@ 2007-03-29 14:30           ` Avi Kivity
From: Avi Kivity @ 2007-03-29 14:30 UTC (permalink / raw)
  To: Kiselev, Sergey; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Kiselev, Sergey wrote:
> Hi
>
> 1. It seems that the first problem happens because this particular
> double-word (address 0x9FFFD) sits on the boundary between regular
> memory and video memory. The address is probably accessed because of a
> bug in that old kernel (I don't see any good reason to read this location).
> Still, it would be nice to check for reads/writes to such addresses.
>
>   

I agree.

> 2. The second problem (the oops) has gone away in rev 4571. Not sure why.
> It can still be reproduced reliably on kvm-18.
>
> If needed, I can upload an image that reproduces these problems (~150MB
> compressed size).
>   

I'd like to see it.

-- 
error compiling committee.c: too many arguments to function

