* Consult some concepts about shadow paging mechanism
@ 2009-04-22 13:14 Jui-Hao Chiang
  2009-04-23 15:46 ` Gianluca Guida
  0 siblings, 1 reply; 8+ messages in thread
From: Jui-Hao Chiang @ 2009-04-22 13:14 UTC (permalink / raw)
  To: xen-devel

Dear All:

I am pretty new to xen-devel, so please correct me if anything below is wrong.

Assume we have the following terms
GPT: guest page table
SPT: shadow page table

(Question a) When the guest OS is running, does it always use the SPT for
address translation? If so, how does the guest OS refer to and
modify its own GPT content? It seems that there is a page table entry
in SPT for the GPT page.

(Question b) The hypervisor synchronizes the GPT and the SPT. When the
guest OS increases access to some page (call it Normal_Page) by changing
'read only' GPT entries to 'read write', what is the read-write mode of
the GPT page at the beginning?
(1) If it's read-only in the SPT, then this modification will trigger a
page fault on the GPT page, so that the hypervisor can synchronize the two
tables at that moment.
(2) If it's read-write in the SPT, then the page fault will only occur
when the guest first writes to Normal_Page, since the SPT entry for
Normal_Page is still read-only. The hypervisor then changes the SPT entry
to read-write so that it matches the GPT, and returns to the guest
OS.

Thanks for your patience in looking at these questions.
Jui-Hao


* Re: Consult some concepts about shadow paging mechanism
  2009-04-22 13:14 Consult some concepts about shadow paging mechanism Jui-Hao Chiang
@ 2009-04-23 15:46 ` Gianluca Guida
  2009-04-24  4:23   ` Jui-Hao Chiang
  0 siblings, 1 reply; 8+ messages in thread
From: Gianluca Guida @ 2009-04-23 15:46 UTC (permalink / raw)
  To: Jui-Hao Chiang; +Cc: xen-devel

Hi,

On Wed, Apr 22, 2009 at 3:14 PM, Jui-Hao Chiang <windtracekimo@gmail.com> wrote:
> Assume we have the following terms
> GPT: guest page table
> SPT: shadow page table
>
> (Question a) When guest OS is running, is it always using SPT for
> address translation?

Yes, the guest always runs directly on shadow pagetables.

> If it is the case, how does guest OS refer and
> modify its own GPT content?

The guest will map its own pagetables (GPT) in the guest page tables
itself. It will access them transparently, and the shadow code will
kick in, doing the various magic needed to keep guest and shadows in sync.

> It seems that there is a page table entry
> in SPT for the GPT page.

Yes, exactly.


> (Question b) The hypervisor is performing synchronization between GPT
> and SPT. When guest OS increase access to some page (call it
> Normal_Page) by marking 'read only' GPT entries as 'read write',
> what's the read-write mode of the GPT page in the beginning?
> (1) If it's read-only in SPT, then this modification will trigger a
> page fault for GPT page, so that hypervisor can synchronize those two
> tables at this moment.

Yes, that's exactly what happens in general. GPT pages are always
mapped read only in shadows. Well, there's an exception at the moment:
level 1 pagetables (page tables, as opposed to page directories, etc.)
can be mapped writable, but this is a much longer discussion.
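
As a rough illustration of the general case (GPT page mapped read-only in
the shadow, so a guest write to it faults and the shadow gets resynced),
here is a toy model in C. It is only a sketch: every name in it is
hypothetical, entries are reduced to flag words, and none of it is taken
from the Xen shadow code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PRESENT 0x1u
#define TOY_RW      0x2u
#define TOY_ENTRIES 8u

/* Toy model only: one guest page table and its shadow, reduced to flags. */
struct toy_mmu {
    uint32_t gpt[TOY_ENTRIES];  /* entries as the guest wrote them */
    uint32_t spt[TOY_ENTRIES];  /* entries the hardware would walk */
    bool gpt_page_writable;     /* how the shadow maps the GPT page itself */
    unsigned faults;            /* write faults taken on the GPT page */
};

void toy_init(struct toy_mmu *m)
{
    memset(m, 0, sizeof(*m));
    m->gpt_page_writable = false;  /* case (1): GPT page is read-only */
}

/* The guest stores a new entry into its own page table. */
void toy_guest_write_gpte(struct toy_mmu *m, unsigned idx, uint32_t val)
{
    assert(idx < TOY_ENTRIES);
    if (!m->gpt_page_writable) {
        /* The store faults; the hypervisor applies it on the guest's
         * behalf and refreshes the matching shadow entry right away. */
        m->faults++;
        m->gpt[idx] = val;
        m->spt[idx] = val;
    } else {
        /* No fault: the guest entry changes, the shadow entry is stale
         * until something else resyncs it. */
        m->gpt[idx] = val;
    }
}
```

Calling toy_init() and then toy_guest_write_gpte() once takes one fault
and leaves gpt[idx] and spt[idx] equal; setting gpt_page_writable first
models the writable-L1 exception, where the two can differ for a while.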


Gianluca

-- 
It was a type of people I did not know, I found them very strange and
they did not inspire confidence at all. Later I learned that I had been
introduced to electronic engineers.
                                                  E. W. Dijkstra


* Re: Consult some concepts about shadow paging mechanism
  2009-04-23 15:46 ` Gianluca Guida
@ 2009-04-24  4:23   ` Jui-Hao Chiang
  2009-04-24 13:32     ` Gianluca Guida
  0 siblings, 1 reply; 8+ messages in thread
From: Jui-Hao Chiang @ 2009-04-24  4:23 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: xen-devel

Thanks, Gianluca:

I have some additional questions:
(1) For a normal data page, in order to propagate the Dirty or Accessed
bit from the SPTE to the GPTE, the hypervisor needs to set Read-Only in
the SPTE. When the write page fault on this data page arrives, the
hypervisor can propagate the Dirty or Accessed bit to the GPTE and set the
SPTE to R/W. My question is: when does the hypervisor make it Read-Only
again? Is there any place in the source code you can point to?

(2) How many shadow pages are maintained for each guest domain? If the
hypervisor keeps only one shadow page table for the active process in
each guest domain, then on a guest context switch it might
erase the entire shadow page table and reconstruct it for the new
process, which seems like a lot of overhead. I have looked at
sh_update_cr3(), but I am not sure of the detailed mechanism.

Thanks for your patience,
Jui-Hao



* Re: Consult some concepts about shadow paging mechanism
  2009-04-24  4:23   ` Jui-Hao Chiang
@ 2009-04-24 13:32     ` Gianluca Guida
  2009-05-02  2:47       ` Jui-Hao Chiang
  0 siblings, 1 reply; 8+ messages in thread
From: Gianluca Guida @ 2009-04-24 13:32 UTC (permalink / raw)
  To: Jui-Hao Chiang; +Cc: xen-devel

On Fri, Apr 24, 2009 at 6:23 AM, Jui-Hao Chiang <windtracekimo@gmail.com> wrote:
> I have some additional doubts as the following:
> (1) For normal data page, in order to propagate the Dirty or Access
> bit from SPTE to GPTE, the hypervisor needs to set Read-Only in the
> SPTE. When the write page fault of this data page comes, hypervisor
> can propagate the Dirty or Access bit to GPTE and set it to R/W. My
> question is when does the hypervisor make it Read-Only again? Is there
> any place inside the source code you can point out?

What happens is this: the guest has to clear the dirty/accessed bit
and then flush the TLB (or invlpg the entry).
If the pagetable is mapped read-only (as in levels > 1), the write to
the pagetable will trigger the emulator, which will update the entry.
Otherwise (if the page is out of sync, which means a writable guest
pagetable, and this happens when it's an L1) the TLB flush will do the
job of updating the shadow entry.

Look at how the sh_propagate function works and when it gets called. It's
what you're looking for.
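
A toy model of the rule discussed above may help: the shadow entry is
derived from the guest entry, and it only gets R/W while the guest entry
is already Dirty. This is a sketch under my own assumptions, not the
actual sh_propagate code; the function names are hypothetical, and only
the flag bit values follow the usual x86 layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRESENT  0x001u
#define TOY_RW       0x002u
#define TOY_ACCESSED 0x020u
#define TOY_DIRTY    0x040u

/* Derive shadow flags from guest flags (toy rule, see the text above). */
uint32_t toy_shadow_flags(uint32_t gflags)
{
    uint32_t sflags;

    /* Not present, or not yet accessed: leave the shadow empty so the
     * first access faults and the Accessed bit can be set in the guest. */
    if (!(gflags & TOY_PRESENT) || !(gflags & TOY_ACCESSED))
        return 0;

    sflags = gflags;
    /* Not yet dirty: drop R/W so the first write faults. */
    if (!(gflags & TOY_DIRTY))
        sflags &= ~TOY_RW;
    return sflags;
}

/* Write fault on a data page: update the guest entry, rebuild the shadow. */
uint32_t toy_write_fault(uint32_t *gflags)
{
    assert(gflags != NULL);
    /* Only a write the guest entry permits sets Accessed and Dirty. */
    if ((*gflags & (TOY_PRESENT | TOY_RW)) == (TOY_PRESENT | TOY_RW))
        *gflags |= TOY_ACCESSED | TOY_DIRTY;
    return toy_shadow_flags(*gflags);
}
```

In this model the entry becomes read-only again simply because the shadow
is recomputed after the guest clears Dirty: calling toy_shadow_flags() on
the updated guest flags yields an entry without R/W.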

> (2) How many shadow pages are maintained for each guest domain? If the
> hypervisor keep only one shadow page table for the active process in
> each guest domain, then during the guest context-switch, it might
> erase the entire shadow page table, and re-construct it for the new
> process, which seems a lot of overhead. I have checked the
> sh_update_cr3(), but not sure of the detailed mechanism.

There's a pool of shadow memory that gets reused in a pseudo-LRU
manner. Across cr3 switches, toplevel pagetables are kept in memory, and
unshadowed when evicted by the allocator or when other things happen,
mostly based on heuristics and reference counting.
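
To make the pool idea concrete, here is a small sketch of a fixed-size
pool keyed by the guest toplevel frame. It uses strict LRU as a stand-in
for the pseudo-LRU policy, ignores reference counting and the other
heuristics, and all names (toy_pool, toy_pool_get, ...) are hypothetical
rather than Xen identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_POOL_SLOTS 3

struct toy_slot {
    uint64_t gmfn;   /* guest toplevel frame this shadow belongs to */
    uint64_t stamp;  /* logical time of last use */
    bool in_use;
};

struct toy_pool {
    struct toy_slot slot[TOY_POOL_SLOTS];
    uint64_t clock;      /* advances on every lookup */
    unsigned evictions;  /* shadows thrown away to make room */
};

void toy_pool_init(struct toy_pool *p)
{
    memset(p, 0, sizeof(*p));
}

/* Return the slot shadowing gmfn, allocating (and evicting) on a miss. */
int toy_pool_get(struct toy_pool *p, uint64_t gmfn)
{
    int victim = 0;

    assert(p != NULL);
    p->clock++;

    /* Hit: the shadow survived earlier cr3 switches, just reuse it. */
    for (int i = 0; i < TOY_POOL_SLOTS; i++) {
        if (p->slot[i].in_use && p->slot[i].gmfn == gmfn) {
            p->slot[i].stamp = p->clock;
            return i;
        }
    }

    /* Miss: take a free slot if there is one, else the oldest one. */
    for (int i = 0; i < TOY_POOL_SLOTS; i++) {
        if (!p->slot[i].in_use) {
            victim = i;
            break;
        }
        if (p->slot[i].stamp < p->slot[victim].stamp)
            victim = i;
    }
    if (p->slot[victim].in_use)
        p->evictions++;
    p->slot[victim].gmfn = gmfn;
    p->slot[victim].stamp = p->clock;
    p->slot[victim].in_use = true;
    return victim;
}
```

Switching back and forth between a few address spaces only hits the
first loop, so nothing is rebuilt; a shadow is dropped only when a new
toplevel table arrives and the pool is full.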

Thanks,
Gianluca


* Re: Consult some concepts about shadow paging mechanism
  2009-04-24 13:32     ` Gianluca Guida
@ 2009-05-02  2:47       ` Jui-Hao Chiang
  2009-05-03 13:39         ` Jui-Hao Chiang
  0 siblings, 1 reply; 8+ messages in thread
From: Jui-Hao Chiang @ 2009-05-02  2:47 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: xen-devel

Hi, sorry for disturbing you guys again.

Assume the guest uses 2-level paging and the shadow uses 3-level PAE.
I am now trying to dump the L2 shadow page table information at the
beginning of sh_update_cr3(), as shown below (the code is essentially
copied from the sh_audit_l2_table and audit_gfn_to_mfn functions).

The code unexpectedly crashes in guest_l2e_get_flags(*gl2e) in the
sh_walk_l2_table function I wrote.
The weird part is that it doesn't crash in gfn =
guest_l2e_get_gfn(*gl2e), which accesses *gl2e in a similar way
to guest_l2e_get_flags.

static inline mfn_t
convert_gfn_to_mfn(struct vcpu *v, gfn_t gfn, mfn_t gmfn)
{
    p2m_type_t p2mt;
    if ( !shadow_mode_translate(v->domain) )
        return _mfn(gfn_x(gfn));

    if ( (mfn_to_page(gmfn)->u.inuse.type_info & PGT_type_mask)
         != PGT_writable_page )
        return _mfn(gfn_x(gfn)); // This is a paging-disabled shadow
    else
        return gfn_to_mfn(v->domain, gfn, &p2mt);
}

/* JuiHao: walk the l2 shadow page table based on input sl2mfn */
static int sh_walk_l2_table(struct vcpu *v, mfn_t sl2mfn, mfn_t x)
{
	guest_l2e_t *gl2e, *gp;
	shadow_l2e_t *sl2e;
	mfn_t sl1mfn, gl2mfn;
	gfn_t gfn;
	mfn_t gmfn;
	int done = 0;

	/* Follow the backpointer in struct shadow_page_info to get guest l2mfn */
	gl2mfn = _mfn(mfn_to_shadow_page(sl2mfn)->backpointer);
	gl2e = gp = sh_map_domain_page(gl2mfn);

	SHADOW_FOREACH_L2E(sl2mfn, sl2e, &gl2e, done, v->domain, {

		gfn = guest_l2e_get_gfn(*gl2e);  // ###!!!! Works Fine !!!!!####
		sl1mfn = shadow_l2e_get_mfn(*sl2e);
		
		if (mfn_valid(sl1mfn) && (shadow_l2e_get_flags(*sl2e) & _PAGE_PRESENT)) {

			// Look up gmfn only to double-check that it equals sl1mfn
			gmfn = (guest_l2e_get_flags(*gl2e) & _PAGE_PSE) // ###!!!! CRASH !!!!!####
				? get_fl1_shadow_status(v, gfn)
				: get_shadow_status(v, convert_gfn_to_mfn(v, gfn, gl2mfn),
				SH_type_l1_shadow);
			
			if (mfn_x(gmfn) != mfn_x(sl1mfn)) {
				printk("!! gmfn %" PRI_mfn " != sl1mfn %" PRI_mfn "\n", mfn_x(gmfn), mfn_x(sl1mfn));
			} else {
				printk("going down to traverse level 1 SPT\n");
			}
		}

	});
	sh_unmap_domain_page(gp);
	return 0;
}

Could you help a little bit on this?
Many thanks,
Jui-Hao


* Re: Consult some concepts about shadow paging mechanism
  2009-05-02  2:47       ` Jui-Hao Chiang
@ 2009-05-03 13:39         ` Jui-Hao Chiang
  2009-05-05  8:37           ` Tim Deegan
  2009-05-05  9:15           ` Gianluca Guida
  0 siblings, 2 replies; 8+ messages in thread
From: Jui-Hao Chiang @ 2009-05-03 13:39 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: xen-devel

I found the answer: I made the mistake of passing all four sl2mfn entries
in v->arch.paging.shadow.l3table[] to sh_walk_l2_table().
In fact I only need to pass v->arch.paging.shadow.l3table[0],
because SHADOW_FOREACH_L2E already loops over the
four sl2mfns.
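
The number four follows from the entry sizes: a 2-level guest L2 has 1024
4-byte entries of 4MB each, while a PAE shadow L2 page has 512 8-byte
entries of 2MB each. A quick arithmetic check (helper names are made up
for this sketch):

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Bytes of address space covered by one page-sized table. */
uint64_t toy_table_span(unsigned entry_bytes, uint64_t bytes_per_entry)
{
    assert(entry_bytes == 4 || entry_bytes == 8);
    return (uint64_t)(TOY_PAGE_SIZE / entry_bytes) * bytes_per_entry;
}

/* How many PAE shadow L2 pages cover one 2-level guest L2 page. */
unsigned toy_shadow_l2_pages_per_guest_l2(void)
{
    uint64_t guest_span  = toy_table_span(4, 4u << 20); /* 1024 x 4MB */
    uint64_t shadow_span = toy_table_span(8, 2u << 20); /*  512 x 2MB */
    return (unsigned)(guest_span / shadow_span);
}
```

The guest L2 spans 4GB and each shadow L2 page spans 1GB, so one guest L2
needs four shadow L2 pages, which is why the loop macro visits four.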

But I have another question about traversing the SPT from level 3 through
level 2 to level 1.
When I traverse down to the level 1 SPT, I find several
inconsistencies between the gl1e and sl1e contents (checked the same way
as in sh_audit_l1_table()). Is this normal? I thought
they should be kept consistent at all times.

My purpose is to walk down the SPT and GPT on each process context
switch (sh_update_cr3) and first collect some statistics, e.g. on the
dirty, accessed, and present bits.

I then tried another check in the level 2 SPT: skipping every sl1mfn
that does not pass the sh_mfn_is_a_page_table(sl1mfn) check. With that,
the inconsistencies in the level 1 SPT traversal are gone.

Can anyone give me a hint about how to do this the right way? Is there
some special type of SPTE that I should not traverse down?

Many thanks,
Jui-Hao




* Re: Consult some concepts about shadow paging mechanism
  2009-05-03 13:39         ` Jui-Hao Chiang
@ 2009-05-05  8:37           ` Tim Deegan
  2009-05-05  9:15           ` Gianluca Guida
  1 sibling, 0 replies; 8+ messages in thread
From: Tim Deegan @ 2009-05-05  8:37 UTC (permalink / raw)
  To: Jui-Hao Chiang; +Cc: Gianluca Guida, xen-devel

At 14:39 +0100 on 03 May (1241361566), Jui-Hao Chiang wrote:
> But I have another doubt in traversing SPT from level 3, level 2, and level1.
> When I am traversing down to the level 1 SPT, I found several
> inconsistency between gl1e and sl1e content, which is the same as the
> mechanism in sh_audit_l1_table(). Is this a normal case? I thought
> they should keep consistent at all times.

They should, as long as the GL1 is not out of sync.  Did you see
Gianluca's recent patch that fixes the audit code to understand the
dirty-VRAM tracking?

Cheers, 

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]


* Re: Consult some concepts about shadow paging mechanism
  2009-05-03 13:39         ` Jui-Hao Chiang
  2009-05-05  8:37           ` Tim Deegan
@ 2009-05-05  9:15           ` Gianluca Guida
  1 sibling, 0 replies; 8+ messages in thread
From: Gianluca Guida @ 2009-05-05  9:15 UTC (permalink / raw)
  To: Jui-Hao Chiang; +Cc: xen-devel

Hello,

On Sun, May 3, 2009 at 3:39 PM, Jui-Hao Chiang <windtracekimo@gmail.com> wrote:

> My purpose is to walk down the SPT and GPT during each process context
> switch (sh_update_cr3), and do some statistics first, e.g. dirty,
> access, present bit.
>
> Now I tried another checking in level 2 SPT by skipping those sl1mfn
> which does not pass sh_mfn_is_a_page_table(sl1mfn) check, then the
> inconsistency is gone in level 1 SPT traversing.

sh_mfn_is_a_page_table is meant to be used for *guest* pages, not shadow pages.

>
> Can anyone show some hint about how to do the right thing? Is there
> some special type of SPTE that I should not traverse down?

No. If a shadow is linked to other shadows (be sure to hold the shadow
lock while traversing them, or you can't be sure of what you're
reading) then it can be accessed by domains, so no funny things should
be mapped in there.

The only thing you should be careful of is splintered shadows: when a
guest has PSE set, we create an artificial L1 (called fl1)
containing all the individual 4k mappings. It's very important to note
that, for these shadows, the backpointer will not contain a link to the
guest pagetable, which in fact doesn't exist, but the *gfn* of the 2MB
range we're splintering.
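
A minimal sketch of what splintering means, as a toy in C: one superpage
entry is expanded into 512 individual 4k entries, and the only "origin"
recorded for the result is the base gfn of the range. The struct and
function names are hypothetical, and the toy keeps gfns where a real
shadow entry would hold translated mfns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRESENT     0x001u
#define TOY_RW          0x002u
#define TOY_PSE         0x080u
#define TOY_FL1_ENTRIES 512u   /* 512 x 4k = 2MB */

struct toy_l1e {
    uint64_t gfn;    /* frame mapped by this 4k entry (untranslated here) */
    uint32_t flags;
};

/* Expand one PSE mapping starting at base_gfn into 4k entries. */
void toy_splinter(uint64_t base_gfn, uint32_t l2_flags, struct toy_l1e *fl1)
{
    assert(fl1 != NULL);
    /* The 2MB range must start on a 512-frame boundary. */
    assert(base_gfn % TOY_FL1_ENTRIES == 0);

    for (unsigned i = 0; i < TOY_FL1_ENTRIES; i++) {
        fl1[i].gfn = base_gfn + i;
        /* PSE has no meaning in an L1 entry, so it is dropped. */
        fl1[i].flags = l2_flags & ~TOY_PSE;
    }
}
```

There is no guest L1 to compare such a table against, so a walker that
audits sl1e against gl1e has to special-case it and derive the expected
frames from base_gfn instead.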

Hope this is clear and useful,
Gianluca

