All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/4] add e500 platform support  for KVM
@ 2008-08-15  8:50 Liu Yu
  2008-08-21 20:54 ` Hollis Blanchard
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Liu Yu @ 2008-08-15  8:50 UTC (permalink / raw)
  To: kvm-ppc


These patches add the support of e500 platform for KVM.

The code is just in the primary stage,
so this time just for discussion.

I have tested them under branch 2.6.26, but not under the lastest code,
for my board can not boot up via current kvm tree.

---
 arch/powerpc/include/asm/kvm_asm.h      |    7 +-
 arch/powerpc/include/asm/kvm_host.h     |  112 ++++++
 arch/powerpc/include/asm/kvm_ppc.h      |    9 -
 arch/powerpc/kernel/asm-offsets.c       |    9 +
 arch/powerpc/kvm/44x_tlb.c              |  206 +++++++++--
 arch/powerpc/kvm/44x_tlb.h              |   54 +++-
 arch/powerpc/kvm/Kconfig                |    4 +-
 arch/powerpc/kvm/Makefile               |    7 +
 arch/powerpc/kvm/booke_fsl_interrupts.S |  447 ++++++++++++++++++++++
 arch/powerpc/kvm/booke_guest.c          |  136 ++-----
 arch/powerpc/kvm/booke_host.c           |   20 +-
 arch/powerpc/kvm/e500_tlb.c             |  638 +++++++++++++++++++++++++++++++
 arch/powerpc/kvm/e500_tlb.h             |  172 +++++++++
 arch/powerpc/kvm/emulate.c              |  261 +++++---------
 arch/powerpc/kvm/inst.h                 |   60 +++
 arch/powerpc/kvm/powerpc.c              |   18 +-
 arch/powerpc/kvm/powerpc.h              |   14 +
 17 files changed, 1842 insertions(+), 332 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
@ 2008-08-21 20:54 ` Hollis Blanchard
  2008-08-22 11:00 ` Liu Yu
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Hollis Blanchard @ 2008-08-21 20:54 UTC (permalink / raw)
  To: kvm-ppc

On Fri, 2008-08-15 at 16:50 +0800, Liu Yu wrote:
> These patches add the support of e500 platform for KVM.

Hi Yu, sorry it's taken me this long to take a look. I've been sick and
catching up from vacation...

> The code is just in the primary stage,
> so this time just for discussion.

Could you talk a little about how it differs from the 440
implementation? For example, how are you using TLB0 and TLB1?

Some of the refactoring you've done, like creating completely separate
kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
expect the DTLB miss code to be refactored like this:

        case BOOKE_INTERRUPT_DTLB_MISS:	
        	gtlbe = kvmppc_dtlb_search(vcpu, eaddr); <- CORE HOOK
        	if (!gtlbe) {
        		/* The guest didn't have a mapping for it. */
        		kvmppc_queue_exception(vcpu, exit_nr);
        		vcpu->arch.dear = vcpu->arch.fault_dear;
        		vcpu->arch.esr = vcpu->arch.fault_esr;
        		kvmppc_deliver_dtlb_miss(vcpu); <- CORE HOOK
        		vcpu->stat.dtlb_real_miss_exits++;
        		r = RESUME_GUEST;
        		break;
        	}
        
        	vcpu->arch.paddr_accessed = tlb_xlate(gtlbe, eaddr);
        	gfn = vcpu->arch.paddr_accessed >> PAGE_SHIFT;
        
        	if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
        		kvmppc_mmu_map(vcpu, eaddr, gfn, gtlbe->tid,
        		               gtlbe->word2); <- CORE HOOK
        		vcpu->stat.dtlb_virt_miss_exits++;
        		r = RESUME_GUEST;
        	} else
        		r = kvmppc_emulate_mmio(run, vcpu);
        
        	break;

booke_fsl_interrupts.S looks like a lot of code copied and pasted, with
the obvious exception of the TLB handlers. Can't we work out some better
way to share the rest?

Christian makes a great point about interrupts in
host_tlb_write_entry(), but I have an even bigger question: by not
tracking the state of the shadow TLB, you're implementing lazy
save/restore. In contrast, 440 is doing a full TLB reload, since I
assume that a) the TLB is so small that is almost certainly full of
useful entries, and b) TLB misses are much more expensive with KVM than
on bare metal.

> I have tested them under branch 2.6.26, but not under the lastest code,
> for my board can not boot up via current kvm tree.

Hmm, that's too bad.

-- 
Hollis Blanchard
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
  2008-08-21 20:54 ` Hollis Blanchard
@ 2008-08-22 11:00 ` Liu Yu
  2008-08-29 17:09 ` Hollis Blanchard
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Liu Yu @ 2008-08-22 11:00 UTC (permalink / raw)
  To: kvm-ppc


> -----Original Message-----
> From: Hollis Blanchard [mailto:hollisb@us.ibm.com] 
> Sent: Friday, August 22, 2008 4:54 AM
> To: Liu Yu
> Cc: kvm-ppc@vger.kernel.org
> Subject: Re: [patch 0/4] add e500 platform support for KVM
> 
> On Fri, 2008-08-15 at 16:50 +0800, Liu Yu wrote:
> > These patches add the support of e500 platform for KVM.
> 
> Hi Yu, sorry it's taken me this long to take a look. I've 
> been sick and
> catching up from vacation...

That's all right. ;-)

> 
> > The code is just in the primary stage,
> > so this time just for discussion.
> 
> Could you talk a little about how it differs from the 440
> implementation? For example, how are you using TLB0 and TLB1?

Sorry that I havent provided the detailed description.

The case of TLB1 is the same as 44x.
Since TLB0 is 4K-fixed map, it has different implementation.

Guest always has TLB0 and TLB1.
Host can chosen how to map them.

Currently if not define KVMPPC_E500_TLB0_ENABLE,
both guest TLB1 and guest TLB0 map to host TLB1.
Otherwise, guest TLB1 map to host TLB1 and guest TLB0 map to host TLB0.
KVMPPC_E500_TLB0_ENABLE is an intergradation,
as I first only use host TLB1, after TLB1 is ok then I debug host TLB0.

Unlike TLB1(one-to-many mapping, the same as 44x), guest TLB0 and shadow
TLB0 is one-one mapping. 
That is to say, you can use the entry select or index to find the
guest/shadow pair.

Moreover, TLB1 has fixed one-one mapping between shadow TLB1 and host
TLB1.
You can find the host TLB1 entry according to the index of shadow TLB1.
For TLB0, it's a dynamic one-one mapping, you can not find which host
entry the shadow ultimately be mapped.
This is due to the auto-entry-select mechanism of E500. And the only way
to destroy an entry is tibivax.

Unlike 44x use TID=0 map userspace and TID=1 map kernel space,
I plan to use host TLB1 to map kernel, and host TLB0 to map userspace.
This category can make it convient to handle privilege switch: that is
before enterring guest just need tlbivax TLB1.
Userspace tlb enties with differnet TID can exist in host TLB0 till host
kvm process switch or guest meet explicit tlbiax command.

I just have this idea and have not thought all the detail through.
What do you think of it?

> 
> Some of the refactoring you've done, like creating completely separate
> kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> expect the DTLB miss code to be refactored like this:

In fact, I don't have much experience on refactor,
I just thought we could merge as more common code as we can.

Do we need to separate dtlb and itlb?
I tried to merge them so that the code became what it looks like. :-|

> 
>         case BOOKE_INTERRUPT_DTLB_MISS:	
>         	gtlbe = kvmppc_dtlb_search(vcpu, eaddr); <- CORE HOOK
>         	if (!gtlbe) {
>         		/* The guest didn't have a mapping for it. */
>         		kvmppc_queue_exception(vcpu, exit_nr);
>         		vcpu->arch.dear = vcpu->arch.fault_dear;
>         		vcpu->arch.esr = vcpu->arch.fault_esr;
>         		kvmppc_deliver_dtlb_miss(vcpu); <- CORE HOOK
>         		vcpu->stat.dtlb_real_miss_exits++;
>         		r = RESUME_GUEST;
>         		break;
>         	}
>         
>         	vcpu->arch.paddr_accessed = tlb_xlate(gtlbe, eaddr);
>         	gfn = vcpu->arch.paddr_accessed >> PAGE_SHIFT;
>         
>         	if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
>         		kvmppc_mmu_map(vcpu, eaddr, gfn, gtlbe->tid,
>         		               gtlbe->word2); <- CORE HOOK
>         		vcpu->stat.dtlb_virt_miss_exits++;
>         		r = RESUME_GUEST;
>         	} else
>         		r = kvmppc_emulate_mmio(run, vcpu);
>         
>         	break;
> 
> booke_fsl_interrupts.S looks like a lot of code copied and 
> pasted, with
> the obvious exception of the TLB handlers. Can't we work out 
> some better
> way to share the rest?

That would be great.
I just considered it step-by-step, and I am not very familiar with gnu
assemble macro, define things,
and I nocited head_fsl_booke.S and head_44x.S...

I will think about it.

> 
> Christian makes a great point about interrupts in
> host_tlb_write_entry(), but I have an even bigger question: by not
> tracking the state of the shadow TLB, you're implementing lazy
> save/restore. In contrast, 440 is doing a full TLB reload, since I
> assume that a) the TLB is so small that is almost certainly full of
> useful entries, and b) TLB misses are much more expensive 
> with KVM than
> on bare metal.

Do you mean TLB0?
I explain it above.
When I found a TLB0 entry not mapped, I write it to hardware
immediately.
I just don't care about where this entry located and when it will be
replaced,
this is due to the auto-entry-select mechanism.
I hope the way I deal with host TLB0 could be consistent with the host
kernel's.

> 
> > I have tested them under branch 2.6.26, but not under the 
> lastest code,
> > for my board can not boot up via current kvm tree.
> 
> Hmm, that's too bad.

At lease I can still test the core code. :-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
  2008-08-21 20:54 ` Hollis Blanchard
  2008-08-22 11:00 ` Liu Yu
@ 2008-08-29 17:09 ` Hollis Blanchard
  2008-08-30  3:15 ` Liu Yu
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Hollis Blanchard @ 2008-08-29 17:09 UTC (permalink / raw)
  To: kvm-ppc

On Fri, 2008-08-22 at 19:00 +0800, Liu Yu wrote: 
> > 
> > Could you talk a little about how it differs from the 440
> > implementation? For example, how are you using TLB0 and TLB1?
> 
> Sorry that I havent provided the detailed description.
> 
> The case of TLB1 is the same as 44x.
> Since TLB0 is 4K-fixed map, it has different implementation.
> 
> Guest always has TLB0 and TLB1.
> Host can chosen how to map them.
> 
> Currently if not define KVMPPC_E500_TLB0_ENABLE,
> both guest TLB1 and guest TLB0 map to host TLB1.
> Otherwise, guest TLB1 map to host TLB1 and guest TLB0 map to host TLB0.
> KVMPPC_E500_TLB0_ENABLE is an intergradation,
> as I first only use host TLB1, after TLB1 is ok then I debug host TLB0.

OK, I think you're saying that KVMPPC_E500_TLB0_ENABLE is a temporary
hack, since Wei's original code used only TLB1, and you're now extending
that to also use TLB0.

> Unlike TLB1(one-to-many mapping, the same as 44x), guest TLB0 and shadow
> TLB0 is one-one mapping. 
> That is to say, you can use the entry select or index to find the
> guest/shadow pair.
> 
> Moreover, TLB1 has fixed one-one mapping between shadow TLB1 and host
> TLB1.
> You can find the host TLB1 entry according to the index of shadow TLB1.
> For TLB0, it's a dynamic one-one mapping, you can not find which host
> entry the shadow ultimately be mapped.
> This is due to the auto-entry-select mechanism of E500. And the only way
> to destroy an entry is tibivax.

You lost me a little bit here.

It looks like your kvm_vcpu_arch structure holds a full copy of both
TLB0 and TLB1, guest and shadow. However, I can't see where you write
into shadow_tlb[0]; instead it looks like you directly insert those
entries into the hardware without saving a copy in the vcpu.

> Unlike 44x use TID=0 map userspace and TID=1 map kernel space,
> I plan to use host TLB1 to map kernel, and host TLB0 to map userspace.
> This category can make it convient to handle privilege switch: that is
> before enterring guest just need tlbivax TLB1.
> Userspace tlb enties with differnet TID can exist in host TLB0 till host
> kvm process switch or guest meet explicit tlbiax command.
> 
> I just have this idea and have not thought all the detail through.
> What do you think of it?

You might run into problems if you ever get large guest userspace
mappings. I know hugetlbfs doesn't exist for e500 Linux right now, but
it could in the future, and plus there are other kernels to consider.

> > Some of the refactoring you've done, like creating completely separate
> > kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> > expect the DTLB miss code to be refactored like this:
> 
> In fact, I don't have much experience on refactor,
> I just thought we could merge as more common code as we can.
> 
> Do we need to separate dtlb and itlb?
> I tried to merge them so that the code became what it looks like. :-|

Your code does something similar to mine, just without the helper
functions:

        kvmppc_handle_tlb_miss(vcpu, eaddr, MSR_IS);
vs
        kvmppc_44x_itlb_search(...) {
                kvmppc_44x_tlb_index(..., !!(msr & MSR_IS))
        }

> > booke_fsl_interrupts.S looks like a lot of code copied and 
> > pasted, with
> > the obvious exception of the TLB handlers. Can't we work out 
> > some better
> > way to share the rest?
> 
> That would be great.
> I just considered it step-by-step, and I am not very familiar with gnu
> assemble macro, define things,
> and I nocited head_fsl_booke.S and head_44x.S...
> 
> I will think about it.

OK, so there are two basic differences in booke_interrupts.S, and
they're both in the common "lightweight exit" path: the additional PID
register switching and the modifications to the TLB save/restore.

The PID stuff should be easy enough to hide: just create a
"KVMPPC_SAVE_PID" macro.

For now we can probably do the same with the TLB manipulations.

-- 
Hollis Blanchard
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (2 preceding siblings ...)
  2008-08-29 17:09 ` Hollis Blanchard
@ 2008-08-30  3:15 ` Liu Yu
  2008-09-08 17:30 ` Hollis Blanchard
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Liu Yu @ 2008-08-30  3:15 UTC (permalink / raw)
  To: kvm-ppc


> -----Original Message-----
> From: Hollis Blanchard [mailto:hollisb@us.ibm.com] 
> Sent: Saturday, August 30, 2008 1:09 AM
> To: Liu Yu-B13201
> Cc: kvm-ppc@vger.kernel.org
> Subject: RE: [patch 0/4] add e500 platform support for KVM
> 
> On Fri, 2008-08-22 at 19:00 +0800, Liu Yu wrote: 
> > > 
> > > Could you talk a little about how it differs from the 440
> > > implementation? For example, how are you using TLB0 and TLB1?
> > 
> > Sorry that I havent provided the detailed description.
> > 
> > The case of TLB1 is the same as 44x.
> > Since TLB0 is 4K-fixed map, it has different implementation.
> > 
> > Guest always has TLB0 and TLB1.
> > Host can chosen how to map them.
> > 
> > Currently if not define KVMPPC_E500_TLB0_ENABLE,
> > both guest TLB1 and guest TLB0 map to host TLB1.
> > Otherwise, guest TLB1 map to host TLB1 and guest TLB0 map 
> to host TLB0.
> > KVMPPC_E500_TLB0_ENABLE is an intergradation,
> > as I first only use host TLB1, after TLB1 is ok then I 
> debug host TLB0.
> 
> OK, I think you're saying that KVMPPC_E500_TLB0_ENABLE is a temporary
> hack, since Wei's original code used only TLB1, and you're 
> now extending
> that to also use TLB0.

Yes.

> 
> > Unlike TLB1(one-to-many mapping, the same as 44x), guest 
> TLB0 and shadow
> > TLB0 is one-one mapping. 
> > That is to say, you can use the entry select or index to find the
> > guest/shadow pair.
> > 
> > Moreover, TLB1 has fixed one-one mapping between shadow 
> TLB1 and host
> > TLB1.
> > You can find the host TLB1 entry according to the index of 
> shadow TLB1.
> > For TLB0, it's a dynamic one-one mapping, you can not find 
> which host
> > entry the shadow ultimately be mapped.
> > This is due to the auto-entry-select mechanism of E500. And 
> the only way
> > to destroy an entry is tibivax.
> 
> You lost me a little bit here.
> 
> It looks like your kvm_vcpu_arch structure holds a full copy of both
> TLB0 and TLB1, guest and shadow. However, I can't see where you write
> into shadow_tlb[0]; instead it looks like you directly insert those
> entries into the hardware without saving a copy in the vcpu.

All the shadow tlb update exist in shadow_map()
TLB1 will call it from kvm_mmu_map(), and TLB0 will call it from
kvm_tlbe_map().

> 
> > Unlike 44x use TID=0 map userspace and TID=1 map kernel space,
> > I plan to use host TLB1 to map kernel, and host TLB0 to map 
> userspace.
> > This category can make it convient to handle privilege 
> switch: that is
> > before enterring guest just need tlbivax TLB1.
> > Userspace tlb enties with differnet TID can exist in host 
> TLB0 till host
> > kvm process switch or guest meet explicit tlbiax command.
> > 
> > I just have this idea and have not thought all the detail through.
> > What do you think of it?
> 
> You might run into problems if you ever get large guest userspace
> mappings. I know hugetlbfs doesn't exist for e500 Linux right now, but
> it could in the future, and plus there are other kernels to consider.

Can you give me more details? 
I'm not sure how hugetlbfs could be in the future, but TLB0 has a fixed
mapping size 4KB.

> 
> > > Some of the refactoring you've done, like creating 
> completely separate
> > > kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> > > expect the DTLB miss code to be refactored like this:
> > 
> > In fact, I don't have much experience on refactor,
> > I just thought we could merge as more common code as we can.
> > 
> > Do we need to separate dtlb and itlb?
> > I tried to merge them so that the code became what it looks 
> like. :-|
> 
> Your code does something similar to mine, just without the helper
> functions:
> 
>         kvmppc_handle_tlb_miss(vcpu, eaddr, MSR_IS);
> vs
>         kvmppc_44x_itlb_search(...) {
>                 kvmppc_44x_tlb_index(..., !!(msr & MSR_IS))
>         }

OK.
I will change them back.

> 
> > > booke_fsl_interrupts.S looks like a lot of code copied and 
> > > pasted, with
> > > the obvious exception of the TLB handlers. Can't we work out 
> > > some better
> > > way to share the rest?
> > 
> > That would be great.
> > I just considered it step-by-step, and I am not very 
> familiar with gnu
> > assemble macro, define things,
> > and I nocited head_fsl_booke.S and head_44x.S...
> > 
> > I will think about it.
> 
> OK, so there are two basic differences in booke_interrupts.S, and
> they're both in the common "lightweight exit" path: the additional PID
> register switching and the modifications to the TLB save/restore.
> 
> The PID stuff should be easy enough to hide: just create a
> "KVMPPC_SAVE_PID" macro.

Do we need a new header file to define this?

> 
> For now we can probably do the same with the TLB manipulations.

How do you think the way I define struct tlb_array and tlbe?
If you think it's fine, I think 44x can adopt it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (3 preceding siblings ...)
  2008-08-30  3:15 ` Liu Yu
@ 2008-09-08 17:30 ` Hollis Blanchard
  2008-09-09  2:08 ` Liu Yu-B13201
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Hollis Blanchard @ 2008-09-08 17:30 UTC (permalink / raw)
  To: kvm-ppc

On Sat, 2008-08-30 at 11:15 +0800, Liu Yu wrote:
> > > Unlike 44x use TID=0 map userspace and TID=1 map kernel space,
> > > I plan to use host TLB1 to map kernel, and host TLB0 to map 
> > userspace.
> > > This category can make it convient to handle privilege 
> > switch: that is
> > > before enterring guest just need tlbivax TLB1.
> > > Userspace tlb enties with differnet TID can exist in host 
> > TLB0 till host
> > > kvm process switch or guest meet explicit tlbiax command.
> > > 
> > > I just have this idea and have not thought all the detail through.
> > > What do you think of it?
> > 
> > You might run into problems if you ever get large guest userspace
> > mappings. I know hugetlbfs doesn't exist for e500 Linux right now, but
> > it could in the future, and plus there are other kernels to consider.
> 
> Can you give me more details? 
> I'm not sure how hugetlbfs could be in the future, but TLB0 has a fixed
> mapping size 4KB.

You can't assume that TLB1 does not contain user mappings, because
that's not true with hugetlbfs. Of course, hugetlbfs doesn't (yet?)
exist for e500, so the assumption is valid until that happens.

However, we *really* need large host page mappings to make KVM fast.
Right now we have to split guest large pages (covering the kernel) into
lots of 4K mappings, which means our TLB miss rate is *much* higher than
if we could use hugetlbfs on the host. In that case, we could use
hugetlbfs large user pages to back the guest kernel mappings.

> > > > booke_fsl_interrupts.S looks like a lot of code copied and 
> > > > pasted, with
> > > > the obvious exception of the TLB handlers. Can't we work out 
> > > > some better
> > > > way to share the rest?
> > > 
> > > That would be great.
> > > I just considered it step-by-step, and I am not very 
> > familiar with gnu
> > > assemble macro, define things,
> > > and I nocited head_fsl_booke.S and head_44x.S...
> > > 
> > > I will think about it.
> > 
> > OK, so there are two basic differences in booke_interrupts.S, and
> > they're both in the common "lightweight exit" path: the additional PID
> > register switching and the modifications to the TLB save/restore.
> > 
> > The PID stuff should be easy enough to hide: just create a
> > "KVMPPC_SAVE_PID" macro.
> 
> Do we need a new header file to define this?

Yup, we'd need something like e500_interrupts.h and 440_interrupts.h,
and these would only be included from booke_interrupts.S.

> > For now we can probably do the same with the TLB manipulations.
> 
> How do you think the way I define struct tlb_array and tlbe?
> If you think it's fine, I think 44x can adopt it.

I've been meaning to rename "struct tlbe" for a long time. It should
really be called "kvmppc_440_tlbe" or something.

As for the dynamic allocation of the TLB arrays, instead of building
them into struct kvm_vcpu_arch, I think that probably makes sense. A
little more annoying to manage, but makes sense.

-- 
Hollis Blanchard
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (4 preceding siblings ...)
  2008-09-08 17:30 ` Hollis Blanchard
@ 2008-09-09  2:08 ` Liu Yu-B13201
  2008-09-09 11:08 ` Hollis Blanchard
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Liu Yu-B13201 @ 2008-09-09  2:08 UTC (permalink / raw)
  To: kvm-ppc

> -----Original Message-----
> From: Hollis Blanchard [mailto:hollisb@us.ibm.com] 
> Sent: Tuesday, September 09, 2008 1:31 AM
> To: Liu Yu-B13201
> Cc: kvm-ppc@vger.kernel.org
> Subject: RE: [patch 0/4] add e500 platform support for KVM
> 
> On Sat, 2008-08-30 at 11:15 +0800, Liu Yu wrote:
> > > > Unlike 44x use TID=0 map userspace and TID=1 map kernel space,
> > > > I plan to use host TLB1 to map kernel, and host TLB0 to map 
> > > userspace.
> > > > This category can make it convient to handle privilege 
> > > switch: that is
> > > > before enterring guest just need tlbivax TLB1.
> > > > Userspace tlb enties with differnet TID can exist in host 
> > > TLB0 till host
> > > > kvm process switch or guest meet explicit tlbiax command.
> > > > 
> > > > I just have this idea and have not thought all the 
> detail through.
> > > > What do you think of it?
> > > 
> > > You might run into problems if you ever get large guest userspace
> > > mappings. I know hugetlbfs doesn't exist for e500 Linux 
> right now, but
> > > it could in the future, and plus there are other kernels 
> to consider.
> > 
> > Can you give me more details? 
> > I'm not sure how hugetlbfs could be in the future, but TLB0 
> has a fixed
> > mapping size 4KB.
> 
> You can't assume that TLB1 does not contain user mappings, because
> that's not true with hugetlbfs. Of course, hugetlbfs doesn't (yet?)
> exist for e500, so the assumption is valid until that happens.
> 
> However, we *really* need large host page mappings to make KVM fast.
> Right now we have to split guest large pages (covering the 
> kernel) into
> lots of 4K mappings, which means our TLB miss rate is *much* 
> higher than
> if we could use hugetlbfs on the host. In that case, we could use
> hugetlbfs large user pages to back the guest kernel mappings.
> 

Yes. It's a problem. But I afraid e500 would not use hugetlbfs,
as it means to give up 512-entry TLB0, while TLB1 has only 16 entries.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (5 preceding siblings ...)
  2008-09-09  2:08 ` Liu Yu-B13201
@ 2008-09-09 11:08 ` Hollis Blanchard
  2008-09-11  8:43 ` Liu Yu-B13201
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Hollis Blanchard @ 2008-09-09 11:08 UTC (permalink / raw)
  To: kvm-ppc

On Monday 08 September 2008 21:08:02 Liu Yu-B13201 wrote:
> 
> > You can't assume that TLB1 does not contain user mappings, because
> > that's not true with hugetlbfs. Of course, hugetlbfs doesn't (yet?)
> > exist for e500, so the assumption is valid until that happens.
> > 
> > However, we *really* need large host page mappings to make KVM fast.
> > Right now we have to split guest large pages (covering the 
> > kernel) into
> > lots of 4K mappings, which means our TLB miss rate is *much* 
> > higher than
> > if we could use hugetlbfs on the host. In that case, we could use
> > hugetlbfs large user pages to back the guest kernel mappings.
> > 
> 
> Yes. It's a problem. But I afraid e500 would not use hugetlbfs,
> as it means to give up 512-entry TLB0, while TLB1 has only 16 entries.

I don't think you'd need to "give up" anything -- some parts of user memory 
would be backed with 4K pages, and some parts with large pages.

-- 
Hollis Blanchard
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (6 preceding siblings ...)
  2008-09-09 11:08 ` Hollis Blanchard
@ 2008-09-11  8:43 ` Liu Yu-B13201
  2008-09-11 14:45 ` Hollis Blanchard
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Liu Yu-B13201 @ 2008-09-11  8:43 UTC (permalink / raw)
  To: kvm-ppc


> 
> Could you talk a little about how it differs from the 440
> implementation? For example, how are you using TLB0 and TLB1?
> 
> Some of the refactoring you've done, like creating completely separate
> kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> expect the DTLB miss code to be refactored like this:
> 
>         case BOOKE_INTERRUPT_DTLB_MISS:	
>         	gtlbe = kvmppc_dtlb_search(vcpu, eaddr); <- CORE HOOK
>         	if (!gtlbe) {
>         		/* The guest didn't have a mapping for it. */
>         		kvmppc_queue_exception(vcpu, exit_nr);
>         		vcpu->arch.dear = vcpu->arch.fault_dear;
>         		vcpu->arch.esr = vcpu->arch.fault_esr;
>         		kvmppc_deliver_dtlb_miss(vcpu); <- CORE HOOK
>         		vcpu->stat.dtlb_real_miss_exits++;
>         		r = RESUME_GUEST;
>         		break;
>         	}
>         
>         	vcpu->arch.paddr_accessed = tlb_xlate(gtlbe, eaddr);
>         	gfn = vcpu->arch.paddr_accessed >> PAGE_SHIFT;
>         
>         	if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
>         		kvmppc_mmu_map(vcpu, eaddr, gfn, gtlbe->tid,
>         		               gtlbe->word2); <- CORE HOOK
>         		vcpu->stat.dtlb_virt_miss_exits++;
>         		r = RESUME_GUEST;
>         	} else
>         		r = kvmppc_emulate_mmio(run, vcpu);
>         
>         	break;
> 

Hollis, I'm reconsiderring this place.
The problem is that e500 has 2 TLB, so I need to get the tlb index and
entry index from kvmppc_dtlb_search
And if kvm_is_visible_gfn() returns nonzero, the two index are needed to
manipulate TLB0 or TLB1.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (7 preceding siblings ...)
  2008-09-11  8:43 ` Liu Yu-B13201
@ 2008-09-11 14:45 ` Hollis Blanchard
  2008-09-12  2:28 ` Liu Yu-B13201
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Hollis Blanchard @ 2008-09-11 14:45 UTC (permalink / raw)
  To: kvm-ppc

On Thu, 2008-09-11 at 16:43 +0800, Liu Yu-B13201 wrote:
> > 
> > Could you talk a little about how it differs from the 440
> > implementation? For example, how are you using TLB0 and TLB1?
> > 
> > Some of the refactoring you've done, like creating completely separate
> > kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> > expect the DTLB miss code to be refactored like this:
> > 
> >         case BOOKE_INTERRUPT_DTLB_MISS:	
> >         	gtlbe = kvmppc_dtlb_search(vcpu, eaddr); <- CORE HOOK
> >         	if (!gtlbe) {
> >         		/* The guest didn't have a mapping for it. */
> >         		kvmppc_queue_exception(vcpu, exit_nr);
> >         		vcpu->arch.dear = vcpu->arch.fault_dear;
> >         		vcpu->arch.esr = vcpu->arch.fault_esr;
> >         		kvmppc_deliver_dtlb_miss(vcpu); <- CORE HOOK
> >         		vcpu->stat.dtlb_real_miss_exits++;
> >         		r = RESUME_GUEST;
> >         		break;
> >         	}
> >         
> >         	vcpu->arch.paddr_accessed = tlb_xlate(gtlbe, eaddr);
> >         	gfn = vcpu->arch.paddr_accessed >> PAGE_SHIFT;
> >         
> >         	if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
> >         		kvmppc_mmu_map(vcpu, eaddr, gfn, gtlbe->tid,
> >         		               gtlbe->word2); <- CORE HOOK
> >         		vcpu->stat.dtlb_virt_miss_exits++;
> >         		r = RESUME_GUEST;
> >         	} else
> >         		r = kvmppc_emulate_mmio(run, vcpu);
> >         
> >         	break;
> > 
> 
> Hollis, I'm reconsiderring this place.
> The problem is that e500 has 2 TLB, so I need to get the tlb index and
> entry index from kvmppc_dtlb_search
> And if kvm_is_visible_gfn() returns nonzero, the two index are needed to
> manipulate TLB0 or TLB1.

OK, sounds reasonable. Feel free to change the prototypes of these hooks
to whatever you need, as long as it's clean and still lets me implement
440.

For example, maybe something like this:
        int kvmppc_dtlb_index(struct kvm_vcpu *vcpu, u32 eaddr);

You could then encode the TLB index in the high bits of the return
index. Alternatively, you could pass &tlb and &index, but I think I like
that less.

-- 
Hollis Blanchard
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (8 preceding siblings ...)
  2008-09-11 14:45 ` Hollis Blanchard
@ 2008-09-12  2:28 ` Liu Yu-B13201
  2008-09-12  2:31 ` Liu Yu-B13201
  2008-09-12 14:49 ` Hollis Blanchard
  11 siblings, 0 replies; 13+ messages in thread
From: Liu Yu-B13201 @ 2008-09-12  2:28 UTC (permalink / raw)
  To: kvm-ppc


> > > Some of the refactoring you've done, like creating 
> completely separate
> > > kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> > > expect the DTLB miss code to be refactored like this:
> > > 
> > >         case BOOKE_INTERRUPT_DTLB_MISS:	
> > >         	gtlbe = kvmppc_dtlb_search(vcpu, eaddr); <- CORE HOOK
> > >         	if (!gtlbe) {
> > >         		/* The guest didn't have a mapping for it. */
> > >         		kvmppc_queue_exception(vcpu, exit_nr);
> > >         		vcpu->arch.dear = vcpu->arch.fault_dear;
> > >         		vcpu->arch.esr = vcpu->arch.fault_esr;
> > >         		kvmppc_deliver_dtlb_miss(vcpu); <- CORE HOOK
> > >         		vcpu->stat.dtlb_real_miss_exits++;
> > >         		r = RESUME_GUEST;
> > >         		break;
> > >         	}
> > >         
> > >         	vcpu->arch.paddr_accessed = tlb_xlate(gtlbe, eaddr);
> > >         	gfn = vcpu->arch.paddr_accessed >> PAGE_SHIFT;
> > >         
> > >         	if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
> > >         		kvmppc_mmu_map(vcpu, eaddr, gfn, gtlbe->tid,
> > >         		               gtlbe->word2); <- CORE HOOK
> > >         		vcpu->stat.dtlb_virt_miss_exits++;
> > >         		r = RESUME_GUEST;
> > >         	} else
> > >         		r = kvmppc_emulate_mmio(run, vcpu);
> > >         
> > >         	break;
> > > 
> > 
> > Hollis, I'm reconsiderring this place.
> > The problem is that e500 has 2 TLB, so I need to get the 
> tlb index and
> > entry index from kvmppc_dtlb_search
> > And if kvm_is_visible_gfn() returns nonzero, the two index 
> are needed to
> > manipulate TLB0 or TLB1.
> 
> OK, sounds reasonable. Feel free to change the prototypes of 
> these hooks
> to whatever you need, as long as it's clean and still lets me 
> implement
> 440.
> 
> For example, maybe something like this:
>         int kvmppc_dtlb_index(struct kvm_vcpu *vcpu, u32 eaddr);
> 
> You could then encode the TLB index in the high bits of the return
> index. Alternatively, you could pass &tlb and &index, but I 
> think I like
> that less.
> 

I just wanted you to know this place has some dependent factors.
Never mind. I think I could use the your former suggestion. 
I could calculate the index by offset or save index in vcpu->arch.
What's your opinion?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (9 preceding siblings ...)
  2008-09-12  2:28 ` Liu Yu-B13201
@ 2008-09-12  2:31 ` Liu Yu-B13201
  2008-09-12 14:49 ` Hollis Blanchard
  11 siblings, 0 replies; 13+ messages in thread
From: Liu Yu-B13201 @ 2008-09-12  2:31 UTC (permalink / raw)
  To: kvm-ppc


> 
> I just wanted you to know this place has some dependent factors.
> Never mind. I think I could use the your former suggestion. 
> I could calculate the index by offset or save index in vcpu->arch.
> What's your opinion?

Maybe we could decide what to do here after seeing more processores.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [patch 0/4] add e500 platform support  for KVM
  2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
                   ` (10 preceding siblings ...)
  2008-09-12  2:31 ` Liu Yu-B13201
@ 2008-09-12 14:49 ` Hollis Blanchard
  11 siblings, 0 replies; 13+ messages in thread
From: Hollis Blanchard @ 2008-09-12 14:49 UTC (permalink / raw)
  To: kvm-ppc

On Fri, 2008-09-12 at 10:28 +0800, Liu Yu-B13201 wrote:
> > > > Some of the refactoring you've done, like creating 
> > completely separate
> > > > kvmppc_handle_tlb_miss() functions, surprises me. For example, I'd
> > > > expect the DTLB miss code to be refactored like this:
> > > > 
> > > >         case BOOKE_INTERRUPT_DTLB_MISS:	
> > > >         	gtlbe = kvmppc_dtlb_search(vcpu, eaddr); <- CORE HOOK
> > > >         	if (!gtlbe) {
> > > >         		/* The guest didn't have a mapping for it. */
> > > >         		kvmppc_queue_exception(vcpu, exit_nr);
> > > >         		vcpu->arch.dear = vcpu->arch.fault_dear;
> > > >         		vcpu->arch.esr = vcpu->arch.fault_esr;
> > > >         		kvmppc_deliver_dtlb_miss(vcpu); <- CORE HOOK
> > > >         		vcpu->stat.dtlb_real_miss_exits++;
> > > >         		r = RESUME_GUEST;
> > > >         		break;
> > > >         	}
> > > >         
> > > >         	vcpu->arch.paddr_accessed = tlb_xlate(gtlbe, eaddr);
> > > >         	gfn = vcpu->arch.paddr_accessed >> PAGE_SHIFT;
> > > >         
> > > >         	if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
> > > >         		kvmppc_mmu_map(vcpu, eaddr, gfn, gtlbe->tid,
> > > >         		               gtlbe->word2); <- CORE HOOK
> > > >         		vcpu->stat.dtlb_virt_miss_exits++;
> > > >         		r = RESUME_GUEST;
> > > >         	} else
> > > >         		r = kvmppc_emulate_mmio(run, vcpu);
> > > >         
> > > >         	break;
> > > > 
> > > 
> > > Hollis, I'm reconsiderring this place.
> > > The problem is that e500 has 2 TLB, so I need to get the 
> > tlb index and
> > > entry index from kvmppc_dtlb_search
> > > And if kvm_is_visible_gfn() returns nonzero, the two index 
> > are needed to
> > > manipulate TLB0 or TLB1.
> > 
> > OK, sounds reasonable. Feel free to change the prototypes of 
> > these hooks
> > to whatever you need, as long as it's clean and still lets me 
> > implement
> > 440.
> > 
> > For example, maybe something like this:
> >         int kvmppc_dtlb_index(struct kvm_vcpu *vcpu, u32 eaddr);
> > 
> > You could then encode the TLB index in the high bits of the return
> > index. Alternatively, you could pass &tlb and &index, but I 
> > think I like
> > that less.
> > 
> 
> I just wanted you to know this place has some dependent factors.
> Never mind. I think I could use the your former suggestion. 
> I could calculate the index by offset or save index in vcpu->arch.
> What's your opinion?

I would definitely prefer using the index as a return value, rather than
saving to vcpu->arch.

-- 
Hollis Blanchard
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-09-12 14:49 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-08-15  8:50 [patch 0/4] add e500 platform support for KVM Liu Yu
2008-08-21 20:54 ` Hollis Blanchard
2008-08-22 11:00 ` Liu Yu
2008-08-29 17:09 ` Hollis Blanchard
2008-08-30  3:15 ` Liu Yu
2008-09-08 17:30 ` Hollis Blanchard
2008-09-09  2:08 ` Liu Yu-B13201
2008-09-09 11:08 ` Hollis Blanchard
2008-09-11  8:43 ` Liu Yu-B13201
2008-09-11 14:45 ` Hollis Blanchard
2008-09-12  2:28 ` Liu Yu-B13201
2008-09-12  2:31 ` Liu Yu-B13201
2008-09-12 14:49 ` Hollis Blanchard

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.