* Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
       [not found] <OF2C342ECB.2185877A-ON052571E1.0002BBE6-052571E1.00052087@LocalDomain>
@ 2006-09-07 20:04 ` Steve Dobbelstein
  2006-09-07 21:11   ` Keir Fraser
  0 siblings, 1 reply; 8+ messages in thread
From: Steve Dobbelstein @ 2006-09-07 20:04 UTC (permalink / raw)
  To: xen-devel

steved@us.ibm.com wrote on 09/05/2006 07:56:00 PM:

> I'm running 64-bit SLES 10 beta 10 (yes, we have to upgrade to the
> official release) on a machine with four Xeon 7020s.  I got xen-
> unstable changeset 11429:66dd34f2f439 and built 64-bit uniprocessor
> kernels for dom0 and the HVM domain (a 2.6.16.13 baremetal kernel
> and its initrd).  The HVM domain is also running SLES 10 beta 10.  I
> followed the instructions to build the paravirtualized drivers for
> an HVM domain.  When I run "modprobe xen_platform_pci" in the HVM
> domain I get a kernel oops.  Here is the output in dmesg.
>
> PCI: Found IRQ 10 for device 0000:00:03.0
> Xen version 3.0.
> Hypercall area is 1 pages (order 0 allocation)
> Unable to handle kernel paging request at ffff81002aca5220 RIP:
> [<ffff81002aca5220>]
> PGD 8063 PUD 9063 PMD 800000002ac001e3 PTE 31e031e031e031e
> Oops: 0011 [1]
> CPU 0
> Modules linked in: xen_platform_pci ext3 mbcache jbd edd processor
> lpfc mptspi mptscsih mptbase ata_
> piix libata
> Pid: 4000, comm: modprobe Not tainted 2.6.16.13-baremetal-up #1
> RIP: 0010:[<ffff81002aca5220>] [<ffff81002aca5220>]
> RSP: 0018:ffff8100265b5b60  EFLAGS: 00010282
> RAX: ffff81002aca5220 RBX: 000000002aca5000 RCX: 0000000040000000
> RDX: 0000000000000000 RSI: ffff8100265b5b68 RDI: 0000000000000006
> RBP: ffff8100265b5b78 R08: ffff81002aca5000 R09: ffffffff7fffffff
> R10: 00007f0000000000 R11: 0000000080000000 R12: ffff81002fea8000
> R13: 00000000f3000000 R14: 000000000000c100 R15: 0000000000000001
> FS:  00002b443d7726d0(0000) GS:ffffffff80533000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: ffff81002aca5220 CR3: 0000000026f89000 CR4: 00000000000006e0
> Process modprobe (pid: 4000, threadinfo ffff8100265b4000, task
> ffff81002fba0380)
> Stack: ffffffff88086c5c ffff810000000000 ffffffff80146693 ffff8100265b5c08
>        ffffffff88086635 0000000300000000 ffff8100265b5bb8 0000000000000000
>        0000000000000100 0000000001000000
> Call Trace: <ffffffff88086c5c>{:xen_platform_pci:setup_xen_features+40}
>        <ffffffff80146693>{__get_free_pages+49} <ffffffff88086635>{:
> xen_platform_pci:platform_pci_init+832}
>        <ffffffff80207ef2>{pci_device_probe+77}
> <ffffffff8024d32a>{driver_probe_device+92}
>        <ffffffff8024d3f2>{__driver_attach+0}
> <ffffffff8024d449>{__driver_attach+87}
>        <ffffffff8024cd16>{bus_for_each_dev+79}
> <ffffffff8024d25a>{driver_attach+28}
>        <ffffffff8024c913>{bus_add_driver+122}
> <ffffffff8024d6d4>{driver_register+143}
>        <ffffffff802080b1>{__pci_register_driver+111}
> <ffffffff8808e01c>{:xen_platform_pci:platform_pci_module_init+28}
>        <ffffffff8013daa5>{sys_init_module+5606}
> <ffffffff8013731f>{autoremove_wake_function+0}
>        <ffffffff8015efaa>{vfs_read+173} <ffffffff8010a8ba>{system_call+126}
>
> Code: b8 11 00 00 00 0f 01 c1 c3 00 00 00 00 00 00 00 00 00 00 00
> RIP [<ffff81002aca5220>] RSP <ffff8100265b5b60>
> CR2: ffff81002aca5220
>
> It is oopsing on line 25 in unmodified_drivers/linux-2.6/platform-
> pci/features.c (which is a sym link to ../../linux-2.6-xen-
> sparse/drivers/xen/core/features.c):
> if (HYPERVISOR_xen_version(XENVER_get_features, &fi) < 0)
>
> Looks like something went wrong with the hypercall.  I crawled
> through the code to see how the hypercall stubs are set up but got
> lost in the MSR stuff.  I'll take a look at it again tomorrow.
> Thought I should post it to the list in case anyone else can
> reproduce the problem and either find a fix or explain why it's a user error.
>
> Let me know if you need more info on my setup.
>
> Steve D.

Digging into this further, I found that the problem is that the hypercall
mechanism is trying to execute the instructions for the hypercall, which
reside in the hypercall stubs page.  However, the page table entry for the
page has the _PAGE_NX (no execute) bit set.  (I'm running a 64-bit OS with
PAE in the HVM domain.)  The error code in the oops (0x11) indicates that
the page fault is because of the _PAGE_NX bit:  0x01 -> access rights
violation, 0x10 -> the fault was caused by an instruction fetch.
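
For readers following along, here is a minimal sketch of how the 0x11 error
code breaks down.  The bit positions are the architectural x86 page-fault
error code bits; the macro and function names are made up for this
illustration and are not the kernel's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (names are illustrative only). */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* fault was caused by a write access */
#define PF_USER  (1u << 2)  /* fault happened in user mode */
#define PF_RSVD  (1u << 3)  /* reserved bit set in a paging entry */
#define PF_INSTR (1u << 4)  /* fault was caused by an instruction fetch */

/*
 * Hypothetical helper: true when the error code describes an instruction
 * fetch that hit a present page and was refused on permissions.  For a
 * kernel-mode fetch like the one in the oops, that points at a
 * no-execute bit somewhere in the page-table walk.
 */
static bool pf_is_exec_protection_fault(uint32_t error_code)
{
    return (error_code & PF_PROT) &&
           (error_code & PF_INSTR) &&
           !(error_code & PF_RSVD);
}
```

Called with 0x11, the helper returns true: both the protection-violation
bit and the instruction-fetch bit are set, and the user-mode bit is clear.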

I tried hacking some code to turn off the NX bit in the PTE for the
hypercall stubs page, but I still get the oops.  I'm thinking it's because
the NX bit is set in the PMD.

I'm quite new to the paging mechanism, so I'm not sure how to fix this at
the moment.  I'll keep poking around.  Thought I'd share my findings so
far.

Steve D.

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-07 20:04 ` Oops when loading xen_platform_pci module in HVM domain on CS 11429 Steve Dobbelstein
@ 2006-09-07 21:11   ` Keir Fraser
  2006-09-08  0:41     ` Steve Dobbelstein
  2006-09-08 17:03     ` Steven Smith
  0 siblings, 2 replies; 8+ messages in thread
From: Keir Fraser @ 2006-09-07 21:11 UTC (permalink / raw)
  To: Steve Dobbelstein, xen-devel




On 7/9/06 21:04, "Steve Dobbelstein" <steved@us.ibm.com> wrote:

> I tried hacking some code to turn off the NX bit in the PTE for the
> hypercall stubs page, but I still get the oops.  I'm thinking it's because
> the NX bit is set in the PMD.
> 
> I'm quite new to the paging mechanism, so I'm not sure how to fix this at
> the moment.   I'll keep poking around.  thought I'd share my findings so
> far.

Page directory entries use permissions _PAGE_TABLE, which does not include
_PAGE_NX. So clearing _PAGE_NX from the PTEs, using
change_page_attr(PAGE_KERNEL_EXEC), should suffice.

 -- Keir

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-07 21:11   ` Keir Fraser
@ 2006-09-08  0:41     ` Steve Dobbelstein
  2006-09-08 17:03     ` Steven Smith
  1 sibling, 0 replies; 8+ messages in thread
From: Steve Dobbelstein @ 2006-09-08  0:41 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote on 09/07/2006 04:11:37 PM:

> On 7/9/06 21:04, "Steve Dobbelstein" <steved@us.ibm.com> wrote:
>
> > I tried hacking some code to turn off the NX bit in the PTE for the
> > hypercall stubs page, but I still get the oops.  I'm thinking it's because
> > the NX bit is set in the PMD.
> >
> > I'm quite new to the paging mechanism, so I'm not sure how to fix this at
> > the moment.   I'll keep poking around.  thought I'd share my findings so
> > far.
>
> Page directory entries use permissions _PAGE_TABLE, which does not include
> _PAGE_NX. So clearing _PAGE_NX from the PTEs, using
> change_page_attr(PAGE_KERNEL_EXEC), should suffice.
>
>  -- Keir

Yes, it should suffice, but it doesn't.  What happens is that the PV driver
calls __get_free_page() and gets a page -- a large page, i.e. the _PAGE_PSE
bit is set in the PTE.  change_page_attr() sees that the pgprot is being
changed for only one 4KB page and splits the page.  It creates a PMD for the
4KB pages that made up the large page.  The PMD is given the pgprot of the
original large page, which in this case includes the _PAGE_NX bit.  So
while the new PTE for the 4KB page for the hypercall stubs has the _PAGE_NX
bit turned off, the PMD over the PTE has the _PAGE_NX bit on, which
effectively sets it for all the PTEs pointed to by the PMD. :(
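
The "effectively sets it for all the PTEs" behaviour can be modelled in a
few lines.  This is only a user-space sketch, assuming the usual x86-64
rule that a translation is non-executable if the NX bit is set in any entry
used to reach the page; the names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_NX (1ULL << 63)  /* no-execute bit in a 64-bit paging entry */

/*
 * entries[] holds the entries used to translate one address, in walk
 * order (for example PGD, PUD, PMD, PTE).  The page is executable only
 * if none of them has NX set.
 */
static bool walk_is_executable(const uint64_t *entries, int levels)
{
    for (int i = 0; i < levels; i++) {
        if (entries[i] & ENTRY_NX)
            return false;  /* one NX bit anywhere in the walk is enough */
    }
    return true;
}
```

With NX cleared in the PTE but still set in the PMD above it, the function
returns false, which matches the oops persisting after the PTE was changed.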


Thanks for any tips.

Steve D.

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-07 21:11   ` Keir Fraser
  2006-09-08  0:41     ` Steve Dobbelstein
@ 2006-09-08 17:03     ` Steven Smith
  2006-09-08 18:19       ` Keir Fraser
  1 sibling, 1 reply; 8+ messages in thread
From: Steven Smith @ 2006-09-08 17:03 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Steve Dobbelstein, xen-devel, sos22


> > I tried hacking some code to turn off the NX bit in the PTE for the
> > hypercall stubs page, but I still get the oops.  I'm thinking it's because
> > the NX bit is set in the PMD.
> > 
> > I'm quite new to the paging mechanism, so I'm not sure how to fix this at
> > the moment.   I'll keep poking around.  thought I'd share my findings so
> > far.
> Page directory entries use permissions _PAGE_TABLE, which does not include
> _PAGE_NX. So clearing _PAGE_NX from the PTEs, using
> change_page_attr(PAGE_KERNEL_EXEC), should suffice.
The oops message is fairly clear that _PAGE_NX is set on the PMD, and
I'd guess it probably got set from phys_pmd_init.
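
As an illustration of what the oops line "PGD 8063 PUD 9063 PMD
800000002ac001e3" shows, the sketch below picks a few flag bits out of each
entry.  The bit positions are the architectural ones; the macro and
function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ENTRY_PRESENT (1ULL << 0)   /* entry is valid */
#define ENTRY_RW      (1ULL << 1)   /* writable */
#define ENTRY_PSE     (1ULL << 7)   /* entry maps a large page directly */
#define ENTRY_NX      (1ULL << 63)  /* no-execute */

/* Write a short summary such as "PMD: P RW PSE NX" into buf. */
static void describe_entry(const char *level, uint64_t entry,
                           char *buf, size_t len)
{
    snprintf(buf, len, "%s:%s%s%s%s", level,
             (entry & ENTRY_PRESENT) ? " P"   : "",
             (entry & ENTRY_RW)      ? " RW"  : "",
             (entry & ENTRY_PSE)     ? " PSE" : "",
             (entry & ENTRY_NX)      ? " NX"  : "");
}
```

Fed the values from the oops, the PGD and PUD come out as "P RW" while the
PMD comes out as "P RW PSE NX".  Since PSE is set, the PMD maps a large page
directly, so the value printed as "PTE" after it is probably not a real
page-table entry.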

I think vmalloc_exec is probably the right answer here.  I'll have a
go at this over the weekend.

Steven.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-08 17:03     ` Steven Smith
@ 2006-09-08 18:19       ` Keir Fraser
  2006-09-11  8:34         ` Steven Smith
  2006-09-11 14:48         ` Steve Dobbelstein
  0 siblings, 2 replies; 8+ messages in thread
From: Keir Fraser @ 2006-09-08 18:19 UTC (permalink / raw)
  To: Steven Smith; +Cc: Steve Dobbelstein, xen-devel, sos22

On 8/9/06 18:03, "Steven Smith" <sos22-xen@srcf.ucam.org> wrote:

>> Page directory entries use permissions _PAGE_TABLE, which does not include
>> _PAGE_NX. So clearing _PAGE_NX from the PTEs, using
>> change_page_attr(PAGE_KERNEL_EXEC), should suffice.
> The oops message is fairly clear that _PAGE_NX is set on the PMD, and
> I'd guess it probably got set from phys_pmd_init.
> 
> I think vmalloc_exec is probably the right answer here.  I'll have a
> go at this over the weekend.

I've had a go (c/s 11435). It's complicated by the fact that vmalloc_exec()
and __PAGE_KERNEL_EXEC are not exported to modules.

 -- Keir

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-08 18:19       ` Keir Fraser
@ 2006-09-11  8:34         ` Steven Smith
  2006-09-11 14:48         ` Steve Dobbelstein
  1 sibling, 0 replies; 8+ messages in thread
From: Steven Smith @ 2006-09-11  8:34 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Steve Dobbelstein, xen-devel, sos22


> >> Page directory entries use permissions _PAGE_TABLE, which does not include
> >> _PAGE_NX. So clearing _PAGE_NX from the PTEs, using
> >> change_page_attr(PAGE_KERNEL_EXEC), should suffice.
> > The oops message is fairly clear that _PAGE_NX is set on the PMD, and
> > I'd guess it probably got set from phys_pmd_init.
> > 
> > I think vmalloc_exec is probably the right answer here.  I'll have a
> > go at this over the weekend.
> I've had a go (c/s 11435). It's complicated by the fact that vmalloc_exec()
> and __PAGE_KERNEL_EXEC are not exported to modules.
Thanks.

Steven.

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-08 18:19       ` Keir Fraser
  2006-09-11  8:34         ` Steven Smith
@ 2006-09-11 14:48         ` Steve Dobbelstein
  2006-09-11 15:30           ` Keir Fraser
  1 sibling, 1 reply; 8+ messages in thread
From: Steve Dobbelstein @ 2006-09-11 14:48 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, sos22

Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote on 09/08/2006 01:19:33 PM:

> On 8/9/06 18:03, "Steven Smith" <sos22-xen@srcf.ucam.org> wrote:
>
> >> Page directory entries use permissions _PAGE_TABLE, which does not include
> >> _PAGE_NX. So clearing _PAGE_NX from the PTEs, using
> >> change_page_attr(PAGE_KERNEL_EXEC), should suffice.
> > The oops message is fairly clear that _PAGE_NX is set on the PMD, and
> > I'd guess it probably got set from phys_pmd_init.
> >
> > I think vmalloc_exec is probably the right answer here.  I'll have a
> > go at this over the weekend.
>
> I've had a go (c/s 11435). It's complicated by the fact that vmalloc_exec()
> and __PAGE_KERNEL_EXEC are not exported to modules.
>
>  -- Keir

Is this something that should be fixed in the mainline kernel?  Basically,
a change_page_attr() to make a page executable doesn't work.  It seems to
me that split_large_page() in arch/x86_64/mm/pageattr.c should be changed
to not propagate the old pgprot to the new PMD (at least not the _PAGE_NX
bit) but rather propagate it into the new sub-PTEs that are created when
the large PTE is split.
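
To make the proposal concrete, here is a rough user-space model of a split
that keeps NX out of the new PMD entry and pushes it down into the 4KB
PTEs instead.  It is not the kernel's split_large_page(); the function
name, the constants and the simplified flag handling (PAT is ignored) are
all assumptions made for this sketch.

```c
#include <stdint.h>

#define ENTRY_PSE    (1ULL << 7)            /* large-page bit */
#define ENTRY_NX     (1ULL << 63)           /* no-execute bit */
#define ADDR_MASK    0x000ffffffffff000ULL  /* physical address bits */
#define LARGE_MASK   ((1ULL << 21) - 1)     /* offset within a 2MB page */
#define TABLE_FLAGS  0x63ULL                /* present, writable, accessed, dirty */
#define PTES_PER_PMD 512

/*
 * Split one 2MB entry into 512 4KB PTEs.  Every PTE inherits the flags
 * of the large entry, including NX.  The returned PMD entry points at
 * the new table (table_phys) and deliberately carries no NX bit, so a
 * later change to a single PTE can really take effect.
 */
static uint64_t split_large_entry(uint64_t large, uint64_t table_phys,
                                  uint64_t ptes[PTES_PER_PMD])
{
    uint64_t base  = large & ADDR_MASK & ~LARGE_MASK;
    uint64_t flags = large & ~ADDR_MASK & ~ENTRY_PSE;

    for (uint64_t i = 0; i < PTES_PER_PMD; i++)
        ptes[i] = (base + i * 4096) | flags;

    return (table_phys & ADDR_MASK) | TABLE_FLAGS;
}
```

After such a split, clearing NX in the one PTE that covers the hypercall
stubs page would make only that page executable; the other 511 PTEs keep
the NX bit they inherited from the large entry.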

Thoughts?

Steve D.

* Re: Re: Oops when loading xen_platform_pci module in HVM domain on CS 11429
  2006-09-11 14:48         ` Steve Dobbelstein
@ 2006-09-11 15:30           ` Keir Fraser
  0 siblings, 0 replies; 8+ messages in thread
From: Keir Fraser @ 2006-09-11 15:30 UTC (permalink / raw)
  To: Steve Dobbelstein; +Cc: xen-devel, sos22

On 11/9/06 3:48 pm, "Steve Dobbelstein" <steved@us.ibm.com> wrote:

> Is this something that should be fixed in the mainline kernel?  Basically,
> a change_page_attr() to make a page executable doesn't work.  It seems to
> me that split_large_page() in arch/x86_64/mm/pageattr.c should be changed
> to not propagate the old pgprot to the new PMD (at least not the _PAGE_NX
> bit) but rather propagate it into the new sub-PTEs that are created when
> the large PTE is split.
> 
> Thoughts?

Even just access to vmalloc_exec() from modules would be nice. I really had
to hack around the fact that vmalloc_exec() and even PAGE_KERNEL_EXEC are
not exported to modules. It almost seems deliberate, except that doesn't
really make sense since it's quite easy to work/hack around.

 -- Keir

Thread overview: 8+ messages
     [not found] <OF2C342ECB.2185877A-ON052571E1.0002BBE6-052571E1.00052087@LocalDomain>
2006-09-07 20:04 ` Oops when loading xen_platform_pci module in HVM domain on CS 11429 Steve Dobbelstein
2006-09-07 21:11   ` Keir Fraser
2006-09-08  0:41     ` Steve Dobbelstein
2006-09-08 17:03     ` Steven Smith
2006-09-08 18:19       ` Keir Fraser
2006-09-11  8:34         ` Steven Smith
2006-09-11 14:48         ` Steve Dobbelstein
2006-09-11 15:30           ` Keir Fraser