paging and shadow paging in xen: trying to implement split memory

* paging and shadow paging in xen: trying to implement split memory
@ 2008-12-11  2:59 Sina Bahram
  2008-12-11 10:20 ` Tim Deegan
  0 siblings, 1 reply; 5+ messages in thread
From: Sina Bahram @ 2008-12-11  2:59 UTC (permalink / raw)
  To: xen-devel, xen-research

Hi all,

I've been reading through the paging code, spending a lot of time in
mm/*.* as well as some of the other parts up a level or two, but I'm still
unclear on some key things.

Here's what I think I know:

I think I know how a domain's shadow page table is first allocated (the
hash_table is xmalloc'ed) and when it is destroyed (it is xfree'ed).

I believe I have identified the functions where a shadow page is inserted
and deleted with all the tlb modifications that go along with that.

I'm semi-comfortable with the format of the shadow page table itself. In
32-bit PAE, it follows the 2, 9, 9, 12 format.
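
For reference, here is a minimal sketch of what the two 32-bit layouts
mentioned in this thread mean for a virtual address: 2, 9, 9, 12 under PAE
and the classic 10, 10, 12 without it. The struct and function names are
invented for illustration and are not taken from the Xen source.

```c
#include <assert.h>
#include <stdint.h>

/* Indices obtained by splitting a 32-bit virtual address. */
struct va_split {
    unsigned l3;     /* PAE only: top-level index (2 bits) */
    unsigned l2;     /* page-directory index */
    unsigned l1;     /* page-table index */
    unsigned offset; /* byte offset within the 4 KiB page */
};

/* 32-bit PAE: 2 + 9 + 9 + 12 bits. */
static struct va_split split_pae(uint32_t va)
{
    struct va_split s;
    s.l3 = (va >> 30) & 0x3;
    s.l2 = (va >> 21) & 0x1ff;
    s.l1 = (va >> 12) & 0x1ff;
    s.offset = va & 0xfff;
    return s;
}

/* Classic 32-bit non-PAE: 10 + 10 + 12 bits (no third level). */
static struct va_split split_2level(uint32_t va)
{
    struct va_split s;
    s.l3 = 0;
    s.l2 = (va >> 22) & 0x3ff;
    s.l1 = (va >> 12) & 0x3ff;
    s.offset = va & 0xfff;
    return s;
}

/* Rebuild a PAE virtual address from its indices. */
static uint32_t join_pae(struct va_split s)
{
    assert(s.l3 < 4 && s.l2 < 512 && s.l1 < 512 && s.offset < 4096);
    return ((uint32_t)s.l3 << 30) | ((uint32_t)s.l2 << 21) |
           ((uint32_t)s.l1 << 12) | (uint32_t)s.offset;
}
```

The same address lands in different slots under the two layouts: the PAE
levels index 512-entry tables, the non-PAE levels index 1024-entry tables.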

Some questions:

Why do shadow page tables exist in xen for pv guests? What is their purpose,
and how do pv guests interact with them?

How does one activate this?

Does one have to have pae enabled for 32-bit pv guests? I thought I read
that I do, but when I look at the source, the classic 10, 10, 12 format for
page tables is supported. Is that not supported for shadow page tables, and
if so ... How can I learn more about this?

* general goal *

Here is what I'm trying to do, in a finite way.

I'd like to add a structure, for now a reference in the paging struct would
be fine, let's call it hash_table2 for lack of a better name.

I'd like to mirror all operations to the page table, to hash_table, in my
hash_table2.

Now to the purpose of why I'm doing this.

I'd like to make it so that if a page is accessed, with the supervisory bit
set, I direct all reads and writes to the original hash_table, but I want to
direct all executes to hash_table2, or vice versa, that hardly matters which
one gets which.

Eventually I'd like to not even mirror pages that are just data (read and
write only) or just code (execute only).

Again, I only want to do this for pages with the supervisory bit set so as
to only affect the kernel's pages. That's the kernel of the pv guest.

In this way, I hope to implement split memory as a way of preventing certain
attacks to the guest.

Is there anyone I can speak to about this, perhaps over detailed emails, IM,
or even phone?

Thanks so much

Take care,
Sina

* Re: paging and shadow paging in xen: trying to implement split memory
  2008-12-11  2:59 paging and shadow paging in xen: trying to implement split memory Sina Bahram
@ 2008-12-11 10:20 ` Tim Deegan
  2008-12-11 17:50   ` paging and shadow paging in xen: trying to implement " Sina Bahram
  0 siblings, 1 reply; 5+ messages in thread
From: Tim Deegan @ 2008-12-11 10:20 UTC (permalink / raw)
  To: Sina Bahram; +Cc: xen-devel

Hi, 

At 21:59 -0500 on 10 Dec (1228946361), Sina Bahram wrote:
> Why do shadow page tables exist in xen for pv guests? What is their purpose,
> and how do pv guests interact with them?

They're used in live migration, to track which of the guest's pages have
been written to.  It's described in the paper I mentioned before:
http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-nsdi-migration.pdf
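
To make the write-tracking idea concrete, here is a toy model in C. It is
not Xen's code: every name and the 64-page guest size are made up for
illustration. The assumption is the usual scheme in which pages are
write-protected in the shadow, the first write to each page faults and
sets a bit in a bitmap, and the migration tool periodically collects the
bitmap and starts a new round.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_PAGES 64 /* toy guest size in pages (assumption) */

struct logdirty {
    bool    writable[NR_PAGES];  /* write permission in the shadow entry */
    uint8_t dirty[NR_PAGES / 8]; /* one bit per guest page */
};

/* Start a tracking round: write-protect every page, clear the bitmap. */
static void logdirty_enable(struct logdirty *ld)
{
    memset(ld->writable, 0, sizeof(ld->writable));
    memset(ld->dirty, 0, sizeof(ld->dirty));
}

/*
 * Guest write to page pfn.  The first write to a protected page faults;
 * the handler records the page as dirty and restores write access.
 * Returns true if a fault was taken.
 */
static bool guest_write(struct logdirty *ld, unsigned pfn)
{
    assert(pfn < NR_PAGES);
    if (ld->writable[pfn])
        return false;
    ld->dirty[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
    ld->writable[pfn] = true;
    return true;
}

/*
 * Migration tool: copy the bitmap out, then begin the next round.
 * Returns the number of pages dirtied in the round just ended.
 */
static unsigned logdirty_clean(struct logdirty *ld, uint8_t out[NR_PAGES / 8])
{
    unsigned count = 0;

    memcpy(out, ld->dirty, sizeof(ld->dirty));
    for (unsigned pfn = 0; pfn < NR_PAGES; pfn++)
        if (ld->dirty[pfn / 8] & (1u << (pfn % 8)))
            count++;
    logdirty_enable(ld);
    return count;
}
```

Only the first write to a page in each round costs a fault; later writes
to the same page run at full speed until the bitmap is collected again.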

> How does one activate this?

Using the XEN_DOMCTL_shadow_op domctl (see xen/include/public/domctl.h
for details and the uses of xc_shadow_control() in libxc for examples).

> Does one have to have pae enabled for 32-bit pv guests?

Yes.

> I thought I read
> that I do, but when I look at the source, the classic 10, 10, 12 format for
> page tables is supported. Is that not supported for shadow page tables, and
> if so ... How can I learn more about this?

The shadow pagetable code does support non-PAE paging, because it has to
handle HVM guests, which can't be constrained to particular paging
behaviour.

> * general goal *
> 
> Here is what I'm trying to do, in a finite way.
> 
> I'd like to add a structure, for now a reference in the paging struct would
> be fine, let's call it hash_table2 for lack of a better name.
> 
> I'd like to mirror all operations to the page table, to hash_table, in my
> hash_table2.

I'm not sure what hash table you're talking about here.  The hash table
in the shadow code just contains the list of which shadows there are of
a guest pagetable, not any page permissions or such.

> Now to the purpose of why I'm doing this.
> 
> I'd like to make it so that if a page is accessed, with the supervisory bit
> set, I direct all reads and writes to the original hash_table, but I want to
> direct all executes to hash_table2, or vice versa, that hardly matters which
> one gets which.
> 
> Eventually I'd like to not even mirror pages that are just data (read and
> write only) or just code (execute only).
> 
> Again, I only want to do this for pages with the supervisory bit set so as
> to only affect the kernel's pages. That's the kernel of the pv guest.
> 
> In this way, I hope to implement split memory as a way of preventing certain
> attacks to the guest.

Are you thinking of building two sets of shadow pagetables, one with
only execute permissions and one with only write permissions?  The CPU
only ever uses one set of pagetables at a time, so you'd never be able
to use the non-executable one.

I think it makes more sense to have just one set of shadow pagetables
but switch the individual mappings of a page back and forth.

In fact, getting into the shadow pagetables is probably just making life
difficult for yourself; if you can use a recent AMD processor that
supports NPT, you could just change the p2m map back and forth, and use the
nested-pagefault handler to know when to make the change.  Much simpler,
and easier to get right.

By the way, it's not possible using x86 pagetables to have a page that's
executable but not readable.

> Is there anyone I can speak to about this, perhaps over detailed emails, IM,
> or even phone?

This mailing list (xen-devel) is the best place to discuss
implementation details.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]

* RE: paging and shadow paging in xen: trying to implement split memory
  2008-12-11 10:20 ` Tim Deegan
@ 2008-12-11 17:50   ` Sina Bahram
  2008-12-12 10:11     ` Tim Deegan
  0 siblings, 1 reply; 5+ messages in thread
From: Sina Bahram @ 2008-12-11 17:50 UTC (permalink / raw)
  To: xen-devel

Some things are more clear now, so thank you for that.

To elaborate, what I would like to do is direct all reads and writes from a
guest to one page table and all executes to another. It doesn't matter
whether the page is readable or not, because I would be directing all "read
and write operations" to one page and all "execute" operations to another
page.

What you mentioned with swapping the p2m mapping sounds like exactly what I
want to do, but I'm hoping it's not constrained to AMD. Can I not do this
inside of Xen, making it transparent to the pv guest?

How would I go about doing this?

Again, would trying to do this for HVM guests be easier, or even possible,
because of the layer of abstraction?

Take care,
Sina

* Re: paging and shadow paging in xen: trying to implement split memory
  2008-12-11 17:50   ` paging and shadow paging in xen: trying to implement " Sina Bahram
@ 2008-12-12 10:11     ` Tim Deegan
  2008-12-12 15:02       ` paging and shadow paging in xen: trying to implement " Sina Bahram
  0 siblings, 1 reply; 5+ messages in thread
From: Tim Deegan @ 2008-12-12 10:11 UTC (permalink / raw)
  To: Sina Bahram; +Cc: xen-devel

At 12:50 -0500 on 11 Dec (1228999819), Sina Bahram wrote:
> To elaborate, what I would like to do is direct all reads and writes from a
> guest to one page table and all executes to another. It doesn't matter
> whether the page is readable or not, because I would be directing all "read
> and write operations" to one page and all "execute" operations to another
> page.

OK, as I said, there's no way of making a page executable but not
readable, so I think 

> What you mentioned with swapping the p2m mapping sounds like exactly what I
> want to do, but I'm hoping it's not constrained to AMD. Can I not do this
> inside of Xen, making it transparent to the pv guest?

Sorry, I should have said - that mechanism only applies to HVM guests, 
not PV ones.   It would indeed be transparent to the guest.

The Intel equivalent will be available in the "Nehalem" processor line,
which I think is coming out early next year; support for it is already
in Xen from version 3.3 on.

> How would I go about doing this?

In the p2m tree (arch/x86/mm/p2m.c), mark every page as non-executable. 
I think a little bit of tinkering with p2m_change_type_global() would work.
Then in the NPT nested pagefault handler (arch/x86/hvm/svm/svm.c) when
the guest faults trying to execute a page, call back to the p2m code to
change its permissions so that page becomes executable but not
writeable.  Likewise, for write faults, switch the page back to being
writeable but not executable. 

Three drawbacks:
 - code that writes to the page it's on (self-modifying code or on-stack
   trampolines) will just spin forever unless you do something cunning
   like emulate the instruction.
 - when the page is executable it will also be readable.  As I said
   there's no way of specifying that a page should be executable but not
   readable.  (Intel's EPT spec will let you request that combination
   but the actual processors "may choose not to support" it.)
 - this affects _all_ memory, not just kernel memory.  Since it's
   dealing with physical memory you can't easily tell which frames 
   contain user-space data and which are kernel.
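
The flip-flop scheme and the first drawback above can be modelled with a
toy in C. This is a sketch under stated assumptions, not the p2m or SVM
code: all names are hypothetical, frames start out readable and writeable
but not executable, and the "fault handler" simply flips a frame between
read+write and read+execute.

```c
#include <assert.h>
#include <stdbool.h>

enum access { ACC_READ, ACC_WRITE, ACC_EXEC };

#define P_R 1u
#define P_W 2u
#define P_X 4u
#define NR_FRAMES 16 /* toy guest size in frames (assumption) */

struct toy_p2m {
    unsigned      perms[NR_FRAMES];
    unsigned long faults; /* nested page faults taken so far */
};

/* Mark every frame readable and writeable, but not executable. */
static void toy_p2m_init(struct toy_p2m *p)
{
    for (unsigned i = 0; i < NR_FRAMES; i++)
        p->perms[i] = P_R | P_W;
    p->faults = 0;
}

static bool allowed(unsigned perms, enum access a)
{
    switch (a) {
    case ACC_READ:  return (perms & P_R) != 0;
    case ACC_WRITE: return (perms & P_W) != 0;
    case ACC_EXEC:  return (perms & P_X) != 0;
    }
    return false;
}

/*
 * One guest access.  On a permission miss the toy fault handler flips
 * the frame: execute faults give R+X, write faults give R+W.  A frame is
 * always readable, so reads never fault.  Returns 1 if a fault was taken.
 */
static unsigned toy_access(struct toy_p2m *p, unsigned gfn, enum access a)
{
    assert(gfn < NR_FRAMES);
    if (allowed(p->perms[gfn], a))
        return 0;
    p->perms[gfn] = (a == ACC_EXEC) ? (P_R | P_X) : (P_R | P_W);
    p->faults++;
    return 1;
}

/*
 * One store instruction fetched from code_gfn that writes to data_gfn.
 * It is restarted after every fault.  Returns the number of attempts it
 * took to complete, or 0 if it made no progress within max_tries.
 */
static unsigned toy_run_store_insn(struct toy_p2m *p, unsigned code_gfn,
                                   unsigned data_gfn, unsigned max_tries)
{
    for (unsigned t = 1; t <= max_tries; t++) {
        if (toy_access(p, code_gfn, ACC_EXEC))
            continue; /* fetch faulted: restart the instruction */
        if (toy_access(p, data_gfn, ACC_WRITE))
            continue; /* store faulted: restart the instruction */
        return t;
    }
    return 0;
}
```

When code and data are on different frames, the instruction completes
after one fault. When both are on the same frame, each attempt flips the
frame the other way and the instruction never completes, which is the
"spin forever" case; breaking that loop needs something extra, such as
emulating the instruction.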

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]

* RE: paging and shadow paging in xen: trying to implement split memory
  2008-12-12 10:11     ` Tim Deegan
@ 2008-12-12 15:02       ` Sina Bahram
  0 siblings, 0 replies; 5+ messages in thread
From: Sina Bahram @ 2008-12-12 15:02 UTC (permalink / raw)
  To: xen-devel

Thanks for your response below.

Take care,
Sina
 