All of lore.kernel.org
 help / color / mirror / Atom feed
* Question about removing memslots
@ 2012-03-28  7:24 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28  7:24 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood

So I was chasing a bug today when I realized that some "drivers" in qemu
do interesting things with memory regions.

More specifically, cirrus emulation constantly flips the linear
framebuffer between being mapped into the guest and being emulated MMIO
(the latter for the purpose of image blits).

This made me ponder ... whenever a memslot is "removed" like that (in
the case for example where cirrus turns the fb into emulation), we need
to ensure that any cached translation that involve those GPAs are
flushed out of whatever caching (HW or SW) is done by the KVM arch
code...

So I started looking and the only thing I can find (let me know if I
missed something) is kvm_arch_flush_shadow(). Is that it ? Because it
doesn't take the memslot going away as an argument, so it doesn't know
-what- to flush...

Now I see that x86 just seems to flush everything, which is quite heavy
handed considering how often cirrus does it, but maybe it doesn't have a
choice (lack of reverse mapping from GPA ?).

I also noticed that powerpc ... doesn't do anything :-) Ooops....

So all translations still present in the TLB will remain there, all
translations present in the MMU hash table as well, etc...

Now, in order to implement that properly and efficiently, we would need
to at least get the GPA (if not the whole memslot).

Do you have any objection (provided I didn't completely misunderstand
something which is quite possible) to us adding that argument to
kvm_arch_flush_shadow() ? We can easily put in a small patch adding that
as an unused argument, and later get the proper implementation for
powerpc.

Another thing I noticed while at it is that my version of
__kvm_set_memory_region() appears to call kvm_iommu_map_pages() whenever
adding a memslot ... but never does the opposite unmapping when that
memory slot is removed.... isn't that potentially an issue ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Question about removing memslots
@ 2012-03-28  7:24 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28  7:24 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood

So I was chasing a bug today when I realized that some "drivers" in qemu
do interesting things with memory regions.

More specifically, cirrus emulation constantly flips the linear
framebuffer between being mapped into the guest and being emulated MMIO
(the latter for the purpose of image blits).

This made me ponder ... whenever a memslot is "removed" like that (in
the case for example where cirrus turns the fb into emulation), we need
to ensure that any cached translation that involve those GPAs are
flushed out of whatever caching (HW or SW) is done by the KVM arch
code...

So I started looking and the only thing I can find (let me know if I
missed something) is kvm_arch_flush_shadow(). Is that it ? Because it
doesn't take the memslot going away as an argument, so it doesn't know
-what- to flush...

Now I see that x86 just seems to flush everything, which is quite heavy
handed considering how often cirrus does it, but maybe it doesn't have a
choice (lack of reverse mapping from GPA ?).

I also noticed that powerpc ... doesn't do anything :-) Ooops....

So all translations still present in the TLB will remain there, all
translations present in the MMU hash table as well, etc...

Now, in order to implement that properly and efficiently, we would need
to at least get the GPA (if not the whole memslot).

Do you have any objection (provided I didn't completely misunderstand
something which is quite possible) to us adding that argument to
kvm_arch_flush_shadow() ? We can easily put in a small patch adding that
as an unused argument, and later get the proper implementation for
powerpc.

Another thing I noticed while at it is that my version of
__kvm_set_memory_region() appears to call kvm_iommu_map_pages() whenever
adding a memslot ... but never does the opposite unmapping when that
memory slot is removed.... isn't that potentially an issue ?

Cheers,
Ben.




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28  7:24 ` Benjamin Herrenschmidt
@ 2012-03-28  9:37   ` Avi Kivity
  -1 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-28  9:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 09:24 AM, Benjamin Herrenschmidt wrote:
> So I was chasing a bug today when I realized that some "drivers" in qemu
> do interesting things with memory regions.

They're usually called devices, drivers live in the guest.

> More specifically, cirrus emulation constantly flips the linear
> framebuffer between being mapped into the guest and being emulated MMIO
> (the latter for the purpose of image blits).
>
> This made me ponder ... whenever a memslot is "removed" like that (in
> the case for example where cirrus turns the fb into emulation), we need
> to ensure that any cached translation that involve those GPAs are
> flushed out of whatever caching (HW or SW) is done by the KVM arch
> code...
>
> So I started looking and the only thing I can find (let me know if I
> missed something) is kvm_arch_flush_shadow(). Is that it ? Because it
> doesn't take the memslot going away as an argument, so it doesn't know
> -what- to flush...
>
> Now I see that x86 just seems to flush everything, which is quite heavy
> handed considering how often cirrus does it, but maybe it doesn't have a
> choice (lack of reverse mapping from GPA ?).

We do have a reverse mapping, so we could easily flush just a single
slot.  The reason it hasn't been done is that slot changes are very are
on x86.  They're usually only done by 16-bit software; 32-bit software
just maps the entire framebuffer BAR and accesses it directly.  It's
also usually done in a tight loop, so flushing everything doesn't have a
large impact (and with a 20-bit address space, you couldn't cause a
large impact if you wanted to - memory is all of 256 pages).

> I also noticed that powerpc ... doesn't do anything :-) Ooops....
>
> So all translations still present in the TLB will remain there, all
> translations present in the MMU hash table as well, etc...
>
> Now, in order to implement that properly and efficiently, we would need
> to at least get the GPA (if not the whole memslot).
>
> Do you have any objection (provided I didn't completely misunderstand
> something which is quite possible) to us adding that argument to
> kvm_arch_flush_shadow() ? We can easily put in a small patch adding that
> as an unused argument, and later get the proper implementation for
> powerpc.

Sure, it makes sense.

> Another thing I noticed while at it is that my version of
> __kvm_set_memory_region() appears to call kvm_iommu_map_pages() whenever
> adding a memslot ... but never does the opposite unmapping when that
> memory slot is removed.... isn't that potentially an issue ?

It is.  Alex?

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-28  9:37   ` Avi Kivity
  0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-28  9:37 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 09:24 AM, Benjamin Herrenschmidt wrote:
> So I was chasing a bug today when I realized that some "drivers" in qemu
> do interesting things with memory regions.

They're usually called devices, drivers live in the guest.

> More specifically, cirrus emulation constantly flips the linear
> framebuffer between being mapped into the guest and being emulated MMIO
> (the latter for the purpose of image blits).
>
> This made me ponder ... whenever a memslot is "removed" like that (in
> the case for example where cirrus turns the fb into emulation), we need
> to ensure that any cached translation that involve those GPAs are
> flushed out of whatever caching (HW or SW) is done by the KVM arch
> code...
>
> So I started looking and the only thing I can find (let me know if I
> missed something) is kvm_arch_flush_shadow(). Is that it ? Because it
> doesn't take the memslot going away as an argument, so it doesn't know
> -what- to flush...
>
> Now I see that x86 just seems to flush everything, which is quite heavy
> handed considering how often cirrus does it, but maybe it doesn't have a
> choice (lack of reverse mapping from GPA ?).

We do have a reverse mapping, so we could easily flush just a single
slot.  The reason it hasn't been done is that slot changes are very are
on x86.  They're usually only done by 16-bit software; 32-bit software
just maps the entire framebuffer BAR and accesses it directly.  It's
also usually done in a tight loop, so flushing everything doesn't have a
large impact (and with a 20-bit address space, you couldn't cause a
large impact if you wanted to - memory is all of 256 pages).

> I also noticed that powerpc ... doesn't do anything :-) Ooops....
>
> So all translations still present in the TLB will remain there, all
> translations present in the MMU hash table as well, etc...
>
> Now, in order to implement that properly and efficiently, we would need
> to at least get the GPA (if not the whole memslot).
>
> Do you have any objection (provided I didn't completely misunderstand
> something which is quite possible) to us adding that argument to
> kvm_arch_flush_shadow() ? We can easily put in a small patch adding that
> as an unused argument, and later get the proper implementation for
> powerpc.

Sure, it makes sense.

> Another thing I noticed while at it is that my version of
> __kvm_set_memory_region() appears to call kvm_iommu_map_pages() whenever
> adding a memslot ... but never does the opposite unmapping when that
> memory slot is removed.... isn't that potentially an issue ?

It is.  Alex?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28  9:37   ` Avi Kivity
@ 2012-03-28  9:59     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28  9:59 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Wed, 2012-03-28 at 11:37 +0200, Avi Kivity wrote:

> > Now I see that x86 just seems to flush everything, which is quite heavy
> > handed considering how often cirrus does it, but maybe it doesn't have a
> > choice (lack of reverse mapping from GPA ?).
> 
> We do have a reverse mapping, so we could easily flush just a single
> slot.  The reason it hasn't been done is that slot changes are very are
> on x86.  They're usually only done by 16-bit software; 32-bit software
> just maps the entire framebuffer BAR and accesses it directly.  It's
> also usually done in a tight loop, so flushing everything doesn't have a
> large impact (and with a 20-bit address space, you couldn't cause a
> large impact if you wanted to - memory is all of 256 pages).

Right ... except that it definitely seems to be happening here with
cirrusfb in the guest kernel :-)

Every time it does an imageblit it switches the BAR to emulation and
back to direct mapping when the "upload" of the image is complete.

The X cirrus driver doesn't seem to trigger that (at least didn't from
my limited testing so far) so it may not be using host data blits ...
I'll have to compare what they do in more details.

In any case, we are doing something wrong on kvm powerpc, I just wanted
to make sure my analysis was correct.

> > I also noticed that powerpc ... doesn't do anything :-) Ooops....
> >
> > So all translations still present in the TLB will remain there, all
> > translations present in the MMU hash table as well, etc...
> >
> > Now, in order to implement that properly and efficiently, we would need
> > to at least get the GPA (if not the whole memslot).
> >
> > Do you have any objection (provided I didn't completely misunderstand
> > something which is quite possible) to us adding that argument to
> > kvm_arch_flush_shadow() ? We can easily put in a small patch adding that
> > as an unused argument, and later get the proper implementation for
> > powerpc.
> 
> Sure, it makes sense.
> 
> > Another thing I noticed while at it is that my version of
> > __kvm_set_memory_region() appears to call kvm_iommu_map_pages() whenever
> > adding a memslot ... but never does the opposite unmapping when that
> > memory slot is removed.... isn't that potentially an issue ?
> 
> It is.  Alex?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-28  9:59     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28  9:59 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Wed, 2012-03-28 at 11:37 +0200, Avi Kivity wrote:

> > Now I see that x86 just seems to flush everything, which is quite heavy
> > handed considering how often cirrus does it, but maybe it doesn't have a
> > choice (lack of reverse mapping from GPA ?).
> 
> We do have a reverse mapping, so we could easily flush just a single
> slot.  The reason it hasn't been done is that slot changes are very are
> on x86.  They're usually only done by 16-bit software; 32-bit software
> just maps the entire framebuffer BAR and accesses it directly.  It's
> also usually done in a tight loop, so flushing everything doesn't have a
> large impact (and with a 20-bit address space, you couldn't cause a
> large impact if you wanted to - memory is all of 256 pages).

Right ... except that it definitely seems to be happening here with
cirrusfb in the guest kernel :-)

Every time it does an imageblit it switches the BAR to emulation and
back to direct mapping when the "upload" of the image is complete.

The X cirrus driver doesn't seem to trigger that (at least didn't from
my limited testing so far) so it may not be using host data blits ...
I'll have to compare what they do in more details.

In any case, we are doing something wrong on kvm powerpc, I just wanted
to make sure my analysis was correct.

> > I also noticed that powerpc ... doesn't do anything :-) Ooops....
> >
> > So all translations still present in the TLB will remain there, all
> > translations present in the MMU hash table as well, etc...
> >
> > Now, in order to implement that properly and efficiently, we would need
> > to at least get the GPA (if not the whole memslot).
> >
> > Do you have any objection (provided I didn't completely misunderstand
> > something which is quite possible) to us adding that argument to
> > kvm_arch_flush_shadow() ? We can easily put in a small patch adding that
> > as an unused argument, and later get the proper implementation for
> > powerpc.
> 
> Sure, it makes sense.
> 
> > Another thing I noticed while at it is that my version of
> > __kvm_set_memory_region() appears to call kvm_iommu_map_pages() whenever
> > adding a memslot ... but never does the opposite unmapping when that
> > memory slot is removed.... isn't that potentially an issue ?
> 
> It is.  Alex?

Cheers,
Ben.




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28  9:59     ` Benjamin Herrenschmidt
@ 2012-03-28 10:05       ` Avi Kivity
  -1 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-28 10:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 11:59 AM, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-28 at 11:37 +0200, Avi Kivity wrote:
>
> > > Now I see that x86 just seems to flush everything, which is quite heavy
> > > handed considering how often cirrus does it, but maybe it doesn't have a
> > > choice (lack of reverse mapping from GPA ?).
> > 
> > We do have a reverse mapping, so we could easily flush just a single
> > slot.  The reason it hasn't been done is that slot changes are very are
> > on x86.  They're usually only done by 16-bit software; 32-bit software
> > just maps the entire framebuffer BAR and accesses it directly.  It's
> > also usually done in a tight loop, so flushing everything doesn't have a
> > large impact (and with a 20-bit address space, you couldn't cause a
> > large impact if you wanted to - memory is all of 256 pages).
>
> Right ... except that it definitely seems to be happening here with
> cirrusfb in the guest kernel :-)
>
> Every time it does an imageblit it switches the BAR to emulation and
> back to direct mapping when the "upload" of the image is complete.

That's strange, the cirrus BAR allows the framebuffer and bitblt region
to coexist:

0000000000000000-7ffffffffffffffe (prio 0, RW): pci
  00000000000a0000-00000000000bffff (prio 1, RW): cirrus-lowmem-container
    00000000000a0000-00000000000a7fff (prio 1, RW): alias vga.bank0
@vga.vram 0000000000000000-0000000000007fff
    00000000000a0000-00000000000bffff (prio 0, RW): cirrus-low-memory
    00000000000a8000-00000000000affff (prio 1, RW): alias vga.bank1
@vga.vram

^ those are continuously flipped when running 16-bit software

0000000000008000-000000000000ffff
  00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
  00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
  00000000fc000000-00000000fdffffff (prio 1, RW): cirrus-pci-bar0
    00000000fc000000-00000000fc7fffff (prio 1, RW): vga.vram
    00000000fc000000-00000000fc7fffff (prio 0, RW): cirrus-linear-io
    00000000fd000000-00000000fd3fffff (prio 0, RW): cirrus-bitblt-mmio

^ the cirrus BAR, write to 0xfc000000 and you hit vga.vram, write to
0xfd000000 and you trigger a bitblt.

  00000000feba0000-00000000febbffff (prio 1, RW): e1000-mmio
  00000000febf0000-00000000febf0fff (prio 1, RW): cirrus-mmio

A guest driver problem perhaps?


> The X cirrus driver doesn't seem to trigger that (at least didn't from
> my limited testing so far) so it may not be using host data blits ...
> I'll have to compare what they do in more details.

Or maybe it understands the BAR layout better.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-28 10:05       ` Avi Kivity
  0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-28 10:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 11:59 AM, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-28 at 11:37 +0200, Avi Kivity wrote:
>
> > > Now I see that x86 just seems to flush everything, which is quite heavy
> > > handed considering how often cirrus does it, but maybe it doesn't have a
> > > choice (lack of reverse mapping from GPA ?).
> > 
> > We do have a reverse mapping, so we could easily flush just a single
> > slot.  The reason it hasn't been done is that slot changes are very are
> > on x86.  They're usually only done by 16-bit software; 32-bit software
> > just maps the entire framebuffer BAR and accesses it directly.  It's
> > also usually done in a tight loop, so flushing everything doesn't have a
> > large impact (and with a 20-bit address space, you couldn't cause a
> > large impact if you wanted to - memory is all of 256 pages).
>
> Right ... except that it definitely seems to be happening here with
> cirrusfb in the guest kernel :-)
>
> Every time it does an imageblit it switches the BAR to emulation and
> back to direct mapping when the "upload" of the image is complete.

That's strange, the cirrus BAR allows the framebuffer and bitblt region
to coexist:

0000000000000000-7ffffffffffffffe (prio 0, RW): pci
  00000000000a0000-00000000000bffff (prio 1, RW): cirrus-lowmem-container
    00000000000a0000-00000000000a7fff (prio 1, RW): alias vga.bank0
@vga.vram 0000000000000000-0000000000007fff
    00000000000a0000-00000000000bffff (prio 0, RW): cirrus-low-memory
    00000000000a8000-00000000000affff (prio 1, RW): alias vga.bank1
@vga.vram

^ those are continuously flipped when running 16-bit software

0000000000008000-000000000000ffff
  00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
  00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
  00000000fc000000-00000000fdffffff (prio 1, RW): cirrus-pci-bar0
    00000000fc000000-00000000fc7fffff (prio 1, RW): vga.vram
    00000000fc000000-00000000fc7fffff (prio 0, RW): cirrus-linear-io
    00000000fd000000-00000000fd3fffff (prio 0, RW): cirrus-bitblt-mmio

^ the cirrus BAR, write to 0xfc000000 and you hit vga.vram, write to
0xfd000000 and you trigger a bitblt.

  00000000feba0000-00000000febbffff (prio 1, RW): e1000-mmio
  00000000febf0000-00000000febf0fff (prio 1, RW): cirrus-mmio

A guest driver problem perhaps?


> The X cirrus driver doesn't seem to trigger that (at least didn't from
> my limited testing so far) so it may not be using host data blits ...
> I'll have to compare what they do in more details.

Or maybe it understands the BAR layout better.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28 10:05       ` Avi Kivity
@ 2012-03-28 10:17         ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28 10:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Wed, 2012-03-28 at 12:05 +0200, Avi Kivity wrote:

> That's strange, the cirrus BAR allows the framebuffer and bitblt region
> to coexist:
> 
> 0000000000000000-7ffffffffffffffe (prio 0, RW): pci
>   00000000000a0000-00000000000bffff (prio 1, RW): cirrus-lowmem-container
>     00000000000a0000-00000000000a7fff (prio 1, RW): alias vga.bank0
> @vga.vram 0000000000000000-0000000000007fff
>     00000000000a0000-00000000000bffff (prio 0, RW): cirrus-low-memory
>     00000000000a8000-00000000000affff (prio 1, RW): alias vga.bank1
> @vga.vram
> 
> ^ those are continuously flipped when running 16-bit software
> 
> 0000000000008000-000000000000ffff
>   00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
>   00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
>   00000000fc000000-00000000fdffffff (prio 1, RW): cirrus-pci-bar0
>     00000000fc000000-00000000fc7fffff (prio 1, RW): vga.vram
>     00000000fc000000-00000000fc7fffff (prio 0, RW): cirrus-linear-io
>     00000000fd000000-00000000fd3fffff (prio 0, RW): cirrus-bitblt-mmio
> 
> ^ the cirrus BAR, write to 0xfc000000 and you hit vga.vram, write to
> 0xfd000000 and you trigger a bitblt.
> 
>   00000000feba0000-00000000febbffff (prio 1, RW): e1000-mmio
>   00000000febf0000-00000000febf0fff (prio 1, RW): cirrus-mmio
> 
> A guest driver problem perhaps?

Quite possibly, I'm not familiar with the cirrus HW. The trigger is an
MMIO register write done by cirrusfb, which causes
cirrus_update_memory_access() to switch the BAR to emulation as a result
of this becoming true:

	s->cirrus_srcptr != s->cirrus_srcptr_end

I haven't had a chance to dig further today (I'm home now), I can have a
look tomorrow.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-28 10:17         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28 10:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Wed, 2012-03-28 at 12:05 +0200, Avi Kivity wrote:

> That's strange, the cirrus BAR allows the framebuffer and bitblt region
> to coexist:
> 
> 0000000000000000-7ffffffffffffffe (prio 0, RW): pci
>   00000000000a0000-00000000000bffff (prio 1, RW): cirrus-lowmem-container
>     00000000000a0000-00000000000a7fff (prio 1, RW): alias vga.bank0
> @vga.vram 0000000000000000-0000000000007fff
>     00000000000a0000-00000000000bffff (prio 0, RW): cirrus-low-memory
>     00000000000a8000-00000000000affff (prio 1, RW): alias vga.bank1
> @vga.vram
> 
> ^ those are continuously flipped when running 16-bit software
> 
> 0000000000008000-000000000000ffff
>   00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
>   00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
>   00000000fc000000-00000000fdffffff (prio 1, RW): cirrus-pci-bar0
>     00000000fc000000-00000000fc7fffff (prio 1, RW): vga.vram
>     00000000fc000000-00000000fc7fffff (prio 0, RW): cirrus-linear-io
>     00000000fd000000-00000000fd3fffff (prio 0, RW): cirrus-bitblt-mmio
> 
> ^ the cirrus BAR, write to 0xfc000000 and you hit vga.vram, write to
> 0xfd000000 and you trigger a bitblt.
> 
>   00000000feba0000-00000000febbffff (prio 1, RW): e1000-mmio
>   00000000febf0000-00000000febf0fff (prio 1, RW): cirrus-mmio
> 
> A guest driver problem perhaps?

Quite possibly, I'm not familiar with the cirrus HW. The trigger is an
MMIO register write done by cirrusfb, which causes
cirrus_update_memory_access() to switch the BAR to emulation as a result
of this becoming true:

	s->cirrus_srcptr != s->cirrus_srcptr_end

I haven't had a chance to dig further today (I'm home now), I can have a
look tomorrow.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28 10:17         ` Benjamin Herrenschmidt
@ 2012-03-28 10:51           ` Avi Kivity
  -1 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-28 10:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 12:17 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-28 at 12:05 +0200, Avi Kivity wrote:
>
> > That's strange, the cirrus BAR allows the framebuffer and bitblt region
> > to coexist:
> > 
> > 0000000000000000-7ffffffffffffffe (prio 0, RW): pci
> >   00000000000a0000-00000000000bffff (prio 1, RW): cirrus-lowmem-container
> >     00000000000a0000-00000000000a7fff (prio 1, RW): alias vga.bank0
> > @vga.vram 0000000000000000-0000000000007fff
> >     00000000000a0000-00000000000bffff (prio 0, RW): cirrus-low-memory
> >     00000000000a8000-00000000000affff (prio 1, RW): alias vga.bank1
> > @vga.vram
> > 
> > ^ those are continuously flipped when running 16-bit software
> > 
> > 0000000000008000-000000000000ffff
> >   00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
> >   00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
> >   00000000fc000000-00000000fdffffff (prio 1, RW): cirrus-pci-bar0
> >     00000000fc000000-00000000fc7fffff (prio 1, RW): vga.vram
> >     00000000fc000000-00000000fc7fffff (prio 0, RW): cirrus-linear-io
> >     00000000fd000000-00000000fd3fffff (prio 0, RW): cirrus-bitblt-mmio
> > 
> > ^ the cirrus BAR, write to 0xfc000000 and you hit vga.vram, write to
> > 0xfd000000 and you trigger a bitblt.
> > 
> >   00000000feba0000-00000000febbffff (prio 1, RW): e1000-mmio
> >   00000000febf0000-00000000febf0fff (prio 1, RW): cirrus-mmio
> > 
> > A guest driver problem perhaps?
>
> Quite possibly, I'm not familiar with the cirrus HW. The trigger is an
> MMIO register write done by cirrusfb, which causes
> cirrus_update_memory_access() to switch the BAR to emulation as a result
> of this becoming true:
>
> 	s->cirrus_srcptr != s->cirrus_srcptr_end
>
> I haven't had a chance to dig further today (I'm home now), I can have a
> look tomorrow.
>

Ah, then it's not a guest problem, it's how the chip was designed. 
Newer chips do allow a workaround (GR31 bit 6):

6 System Source Location (Revision A): If this bit is ‘1’, the CL-GD546X
responds to write accesses at 000BC000h–000BFFFFh for color-expand
BitBLTs. This frees the linear address apertures for other, concurrent
accesses. If this bit is ‘0’, the CL-GD546X uses the linear aperture for
BitBLTs (compatible with CL-GD543X/’4X).
  System Source Location (Revision B): If this bit is ‘1’,
system-to-screen BitBLTs use the second 16-Mbyte window specified in
PCI10. This allows direct frame buffer accesses in the first window to
be mixed with system-to-screen writes in the second window without
restrictions.

If a system-to-screen BitBLT requiring data is not active, writes to the
second window complete in the minimum time and the data is discarded.
Writes to the first window are ignored by the BitBLT engine (but are
taken as direct writes to the frame buffer).

If this bit is ‘0’, system-to-screen BitBLTs use the first 16-Mbyte
window (compatible with CL-GD543X/’4X).

But it looks like this refers to 546x, even though I found it in the
5446 manual.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-28 10:51           ` Avi Kivity
  0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-28 10:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 12:17 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-28 at 12:05 +0200, Avi Kivity wrote:
>
> > That's strange, the cirrus BAR allows the framebuffer and bitblt region
> > to coexist:
> > 
> > 0000000000000000-7ffffffffffffffe (prio 0, RW): pci
> >   00000000000a0000-00000000000bffff (prio 1, RW): cirrus-lowmem-container
> >     00000000000a0000-00000000000a7fff (prio 1, RW): alias vga.bank0
> > @vga.vram 0000000000000000-0000000000007fff
> >     00000000000a0000-00000000000bffff (prio 0, RW): cirrus-low-memory
> >     00000000000a8000-00000000000affff (prio 1, RW): alias vga.bank1
> > @vga.vram
> > 
> > ^ those are continuously flipped when running 16-bit software
> > 
> > 0000000000008000-000000000000ffff
> >   00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
> >   00000000000e0000-00000000000fffff (prio 1, R-): isa-bios
> >   00000000fc000000-00000000fdffffff (prio 1, RW): cirrus-pci-bar0
> >     00000000fc000000-00000000fc7fffff (prio 1, RW): vga.vram
> >     00000000fc000000-00000000fc7fffff (prio 0, RW): cirrus-linear-io
> >     00000000fd000000-00000000fd3fffff (prio 0, RW): cirrus-bitblt-mmio
> > 
> > ^ the cirrus BAR, write to 0xfc000000 and you hit vga.vram, write to
> > 0xfd000000 and you trigger a bitblt.
> > 
> >   00000000feba0000-00000000febbffff (prio 1, RW): e1000-mmio
> >   00000000febf0000-00000000febf0fff (prio 1, RW): cirrus-mmio
> > 
> > A guest driver problem perhaps?
>
> Quite possibly, I'm not familiar with the cirrus HW. The trigger is an
> MMIO register write done by cirrusfb, which causes
> cirrus_update_memory_access() to switch the BAR to emulation as a result
> of this becoming true:
>
> 	s->cirrus_srcptr != s->cirrus_srcptr_end
>
> I haven't had a chance to dig further today (I'm home now), I can have a
> look tomorrow.
>

Ah, then it's not a guest problem, it's how the chip was designed. 
Newer chips do allow a workaround (GR31 bit 6):

6 System Source Location (Revision A): If this bit is ‘1’, the CL-GD546X
responds to write accesses at 000BC000h–000BFFFFh for color-expand
BitBLTs. This frees the linear address apertures for other, concurrent
accesses. If this bit is ‘0’, the CL-GD546X uses the linear aperture for
BitBLTs (compatible with CL-GD543X/’4X).
  System Source Location (Revision B): If this bit is ‘1’,
system-to-screen BitBLTs use the second 16-Mbyte window specified in
PCI10. This allows direct frame buffer accesses in the first window to
be mixed with system-to-screen writes in the second window without
restrictions.

If a system-to-screen BitBLT requiring data is not active, writes to the
second window complete in the minimum time and the data is discarded.
Writes to the first window are ignored by the BitBLT engine (but are
taken as direct writes to the frame buffer).

If this bit is ‘0’, system-to-screen BitBLTs use the first 16-Mbyte
window (compatible with CL-GD543X/’4X).

But it looks like this refers to 546x, even though I found it in the
5446 manual.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28 10:51           ` Avi Kivity
@ 2012-03-28 21:04             ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28 21:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Wed, 2012-03-28 at 12:51 +0200, Avi Kivity wrote:

> Ah, then it's not a guest problem, it's how the chip was designed. 
> Newer chips do allow a workaround (GR31 bit 6):
> 
> 6 System Source Location (Revision A): If this bit is ‘1’, the CL-GD546X
> responds to write accesses at 000BC000h–000BFFFFh for color-expand
> BitBLTs. This frees the linear address apertures for other, concurrent
> accesses. If this bit is ‘0’, the CL-GD546X uses the linear aperture for
> BitBLTs (compatible with CL-GD543X/’4X).
>   System Source Location (Revision B): If this bit is ‘1’,
> system-to-screen BitBLTs use the second 16-Mbyte window specified in
> PCI10. This allows direct frame buffer accesses in the first window to
> be mixed with system-to-screen writes in the second window without
> restrictions.
> 
> If a system-to-screen BitBLT requiring data is not active, writes to the
> second window complete in the minimum time and the data is discarded.
> Writes to the first window are ignored by the BitBLT engine (but are
> taken as direct writes to the frame buffer).
> 
> If this bit is ‘0’, system-to-screen BitBLTs use the first 16-Mbyte
> window (compatible with CL-GD543X/’4X).
> 
> But it looks like this refers to 546x, even though I found it in the
> 5446 manual.

Right. The first option uses legacy VGA memory which I don't always have
access to so that's not really an option for cirrusfb in general (in
fact I made it not use IO either, it's doing full MMIO for configuring
the CRTCs as well).

I could (I have patches to do so) open a ISA/VGA memory window on the
bridge, in fact I need that for the X driver to work for other reasons,
but I'd rather not have cirrusfb deal with that.

The funny thing here is we have "clean" APIs to let userspace map (when
available) and access legacy VGA memory & IO ports, but we don't have a
good in-kernel API to do the same thing :-) On x86 things are somewhat
easy because it's just all hard coded and there's really only one PCI do
main but on anything else it's a mess.

In any case, X seems to avoid it, maybe nobody does color expansion
nowadays (I suppose modern desktops don't, maybe using ancient X apps
might trigger that path). So no biggie. I'll have to fix KVM powerpc
dealing with memslot changes anyways.

Thanks for your help,

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-28 21:04             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-28 21:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Wed, 2012-03-28 at 12:51 +0200, Avi Kivity wrote:

> Ah, then it's not a guest problem, it's how the chip was designed. 
> Newer chips do allow a workaround (GR31 bit 6):
> 
> 6 System Source Location (Revision A): If this bit is ‘1’, the CL-GD546X
> responds to write accesses at 000BC000h–000BFFFFh for color-expand
> BitBLTs. This frees the linear address apertures for other, concurrent
> accesses. If this bit is ‘0’, the CL-GD546X uses the linear aperture for
> BitBLTs (compatible with CL-GD543X/’4X).
>   System Source Location (Revision B): If this bit is ‘1’,
> system-to-screen BitBLTs use the second 16-Mbyte window specified in
> PCI10. This allows direct frame buffer accesses in the first window to
> be mixed with system-to-screen writes in the second window without
> restrictions.
> 
> If a system-to-screen BitBLT requiring data is not active, writes to the
> second window complete in the minimum time and the data is discarded.
> Writes to the first window are ignored by the BitBLT engine (but are
> taken as direct writes to the frame buffer).
> 
> If this bit is ‘0’, system-to-screen BitBLTs use the first 16-Mbyte
> window (compatible with CL-GD543X/’4X).
> 
> But it looks like this refers to 546x, even though I found it in the
> 5446 manual.

Right. The first option uses legacy VGA memory which I don't always have
access to so that's not really an option for cirrusfb in general (in
fact I made it not use IO either, it's doing full MMIO for configuring
the CRTCs as well).

I could (I have patches to do so) open a ISA/VGA memory window on the
bridge, in fact I need that for the X driver to work for other reasons,
but I'd rather not have cirrusfb deal with that.

The funny thing here is we have "clean" APIs to let userspace map (when
available) and access legacy VGA memory & IO ports, but we don't have a
good in-kernel API to do the same thing :-) On x86 things are somewhat
easy because it's just all hard coded and there's really only one PCI do
main but on anything else it's a mess.

In any case, X seems to avoid it, maybe nobody does color expansion
nowadays (I suppose modern desktops don't, maybe using ancient X apps
might trigger that path). So no biggie. I'll have to fix KVM powerpc
dealing with memslot changes anyways.

Thanks for your help,

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28  9:37   ` Avi Kivity
@ 2012-03-29  5:15     ` Takuya Yoshikawa
  -1 siblings, 0 replies; 30+ messages in thread
From: Takuya Yoshikawa @ 2012-03-29  5:15 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Benjamin Herrenschmidt, kvm, Alexander Graf, kvm-ppc, Scott Wood,
	Alex Williamson

On Wed, 28 Mar 2012 11:37:38 +0200
Avi Kivity <avi@redhat.com> wrote:

> > Now I see that x86 just seems to flush everything, which is quite heavy
> > handed considering how often cirrus does it, but maybe it doesn't have a
> > choice (lack of reverse mapping from GPA ?).
> 
> We do have a reverse mapping, so we could easily flush just a single
> slot.  The reason it hasn't been done is that slot changes are very are
> on x86.  They're usually only done by 16-bit software; 32-bit software
> just maps the entire framebuffer BAR and accesses it directly.  It's
> also usually done in a tight loop, so flushing everything doesn't have a
> large impact (and with a 20-bit address space, you couldn't cause a
> large impact if you wanted to - memory is all of 256 pages).

Even without using reverse mapping we can restrict that flush easily:

	http://www.spinics.net/lists/kvm/msg68695.html
	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()

This would be better than using reverse mapping because we do not have so
many shadow pages when we are in a tight loop like you mensioned.

Anyway we could easily see tens of milliseconds difference by eliminating
unrelated flush.

	Takuya

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29  5:15     ` Takuya Yoshikawa
  0 siblings, 0 replies; 30+ messages in thread
From: Takuya Yoshikawa @ 2012-03-29  5:15 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Benjamin Herrenschmidt, kvm, Alexander Graf, kvm-ppc, Scott Wood,
	Alex Williamson

On Wed, 28 Mar 2012 11:37:38 +0200
Avi Kivity <avi@redhat.com> wrote:

> > Now I see that x86 just seems to flush everything, which is quite heavy
> > handed considering how often cirrus does it, but maybe it doesn't have a
> > choice (lack of reverse mapping from GPA ?).
> 
> We do have a reverse mapping, so we could easily flush just a single
> slot.  The reason it hasn't been done is that slot changes are very are
> on x86.  They're usually only done by 16-bit software; 32-bit software
> just maps the entire framebuffer BAR and accesses it directly.  It's
> also usually done in a tight loop, so flushing everything doesn't have a
> large impact (and with a 20-bit address space, you couldn't cause a
> large impact if you wanted to - memory is all of 256 pages).

Even without using reverse mapping we can restrict that flush easily:

	http://www.spinics.net/lists/kvm/msg68695.html
	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()

This would be better than using reverse mapping because we do not have so
many shadow pages when we are in a tight loop like you mensioned.

Anyway we could easily see tens of milliseconds difference by eliminating
unrelated flush.

	Takuya

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-28 21:04             ` Benjamin Herrenschmidt
@ 2012-03-29  9:36               ` Avi Kivity
  -1 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-29  9:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 11:04 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-28 at 12:51 +0200, Avi Kivity wrote:
>
> > Ah, then it's not a guest problem, it's how the chip was designed. 
> > Newer chips do allow a workaround (GR31 bit 6):
> > 
> > 6 System Source Location (Revision A): If this bit is ‘1’, the CL-GD546X
> > responds to write accesses at 000BC000h–000BFFFFh for color-expand
> > BitBLTs. This frees the linear address apertures for other, concurrent
> > accesses. If this bit is ‘0’, the CL-GD546X uses the linear aperture for
> > BitBLTs (compatible with CL-GD543X/’4X).
> >   System Source Location (Revision B): If this bit is ‘1’,
> > system-to-screen BitBLTs use the second 16-Mbyte window specified in
> > PCI10. This allows direct frame buffer accesses in the first window to
> > be mixed with system-to-screen writes in the second window without
> > restrictions.
> > 
> > If a system-to-screen BitBLT requiring data is not active, writes to the
> > second window complete in the minimum time and the data is discarded.
> > Writes to the first window are ignored by the BitBLT engine (but are
> > taken as direct writes to the frame buffer).
> > 
> > If this bit is ‘0’, system-to-screen BitBLTs use the first 16-Mbyte
> > window (compatible with CL-GD543X/’4X).
> > 
> > But it looks like this refers to 546x, even though I found it in the
> > 5446 manual.
>
> Right. The first option uses legacy VGA memory which I don't always have
> access to so that's not really an option for cirrusfb in general (in
> fact I made it not use IO either, it's doing full MMIO for configuring
> the CRTCs as well).
>
> I could (I have patches to do so) open a ISA/VGA memory window on the
> bridge, in fact I need that for the X driver to work for other reasons,
> but I'd rather not have cirrusfb deal with that.
>
> The funny thing here is we have "clean" APIs to let userspace map (when
> available) and access legacy VGA memory & IO ports, but we don't have a
> good in-kernel API to do the same thing :-) On x86 things are somewhat
> easy because it's just all hard coded and there's really only one PCI do
> main but on anything else it's a mess.
>
> In any case, X seems to avoid it, maybe nobody does color expansion
> nowadays (I suppose modern desktops don't, maybe using ancient X apps
> might trigger that path). So no biggie. I'll have to fix KVM powerpc
> dealing with memslot changes anyways.
>
> Thanks for your help,
>

As a workaround you can use -vga std or -vga qxl.  The latter will work
even better when we have a kernel driver.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29  9:36               ` Avi Kivity
  0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-29  9:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On 03/28/2012 11:04 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2012-03-28 at 12:51 +0200, Avi Kivity wrote:
>
> > Ah, then it's not a guest problem, it's how the chip was designed. 
> > Newer chips do allow a workaround (GR31 bit 6):
> > 
> > 6 System Source Location (Revision A): If this bit is ‘1’, the CL-GD546X
> > responds to write accesses at 000BC000h–000BFFFFh for color-expand
> > BitBLTs. This frees the linear address apertures for other, concurrent
> > accesses. If this bit is ‘0’, the CL-GD546X uses the linear aperture for
> > BitBLTs (compatible with CL-GD543X/’4X).
> >   System Source Location (Revision B): If this bit is ‘1’,
> > system-to-screen BitBLTs use the second 16-Mbyte window specified in
> > PCI10. This allows direct frame buffer accesses in the first window to
> > be mixed with system-to-screen writes in the second window without
> > restrictions.
> > 
> > If a system-to-screen BitBLT requiring data is not active, writes to the
> > second window complete in the minimum time and the data is discarded.
> > Writes to the first window are ignored by the BitBLT engine (but are
> > taken as direct writes to the frame buffer).
> > 
> > If this bit is ‘0’, system-to-screen BitBLTs use the first 16-Mbyte
> > window (compatible with CL-GD543X/’4X).
> > 
> > But it looks like this refers to 546x, even though I found it in the
> > 5446 manual.
>
> Right. The first option uses legacy VGA memory which I don't always have
> access to so that's not really an option for cirrusfb in general (in
> fact I made it not use IO either, it's doing full MMIO for configuring
> the CRTCs as well).
>
> I could (I have patches to do so) open a ISA/VGA memory window on the
> bridge, in fact I need that for the X driver to work for other reasons,
> but I'd rather not have cirrusfb deal with that.
>
> The funny thing here is we have "clean" APIs to let userspace map (when
> available) and access legacy VGA memory & IO ports, but we don't have a
> good in-kernel API to do the same thing :-) On x86 things are somewhat
> easy because it's just all hard coded and there's really only one PCI do
> main but on anything else it's a mess.
>
> In any case, X seems to avoid it, maybe nobody does color expansion
> nowadays (I suppose modern desktops don't, maybe using ancient X apps
> might trigger that path). So no biggie. I'll have to fix KVM powerpc
> dealing with memslot changes anyways.
>
> Thanks for your help,
>

As a workaround you can use -vga std or -vga qxl.  The latter will work
even better when we have a kernel driver.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-29  5:15     ` Takuya Yoshikawa
@ 2012-03-29  9:44       ` Avi Kivity
  -1 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-29  9:44 UTC (permalink / raw)
  To: Takuya Yoshikawa
  Cc: Benjamin Herrenschmidt, kvm, Alexander Graf, kvm-ppc, Scott Wood,
	Alex Williamson

On 03/29/2012 07:15 AM, Takuya Yoshikawa wrote:
> On Wed, 28 Mar 2012 11:37:38 +0200
> Avi Kivity <avi@redhat.com> wrote:
>
> > > Now I see that x86 just seems to flush everything, which is quite heavy
> > > handed considering how often cirrus does it, but maybe it doesn't have a
> > > choice (lack of reverse mapping from GPA ?).
> > 
> > We do have a reverse mapping, so we could easily flush just a single
> > slot.  The reason it hasn't been done is that slot changes are very are
> > on x86.  They're usually only done by 16-bit software; 32-bit software
> > just maps the entire framebuffer BAR and accesses it directly.  It's
> > also usually done in a tight loop, so flushing everything doesn't have a
> > large impact (and with a 20-bit address space, you couldn't cause a
> > large impact if you wanted to - memory is all of 256 pages).
>
> Even without using reverse mapping we can restrict that flush easily:
>
> 	http://www.spinics.net/lists/kvm/msg68695.html
> 	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()
>
> This would be better than using reverse mapping because we do not have so
> many shadow pages when we are in a tight loop like you mensioned.
>
> Anyway we could easily see tens of milliseconds difference by eliminating
> unrelated flush.

Hm, the patch uses ->slot_bitmap which we might want to kill if we
increase the number of slots dramatically, as some people want to do.

btw, what happened to that patch, did it just get ignored on the list?

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29  9:44       ` Avi Kivity
  0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-29  9:44 UTC (permalink / raw)
  To: Takuya Yoshikawa
  Cc: Benjamin Herrenschmidt, kvm, Alexander Graf, kvm-ppc, Scott Wood,
	Alex Williamson

On 03/29/2012 07:15 AM, Takuya Yoshikawa wrote:
> On Wed, 28 Mar 2012 11:37:38 +0200
> Avi Kivity <avi@redhat.com> wrote:
>
> > > Now I see that x86 just seems to flush everything, which is quite heavy
> > > handed considering how often cirrus does it, but maybe it doesn't have a
> > > choice (lack of reverse mapping from GPA ?).
> > 
> > We do have a reverse mapping, so we could easily flush just a single
> > slot.  The reason it hasn't been done is that slot changes are very are
> > on x86.  They're usually only done by 16-bit software; 32-bit software
> > just maps the entire framebuffer BAR and accesses it directly.  It's
> > also usually done in a tight loop, so flushing everything doesn't have a
> > large impact (and with a 20-bit address space, you couldn't cause a
> > large impact if you wanted to - memory is all of 256 pages).
>
> Even without using reverse mapping we can restrict that flush easily:
>
> 	http://www.spinics.net/lists/kvm/msg68695.html
> 	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()
>
> This would be better than using reverse mapping because we do not have so
> many shadow pages when we are in a tight loop like you mensioned.
>
> Anyway we could easily see tens of milliseconds difference by eliminating
> unrelated flush.

Hm, the patch uses ->slot_bitmap which we might want to kill if we
increase the number of slots dramatically, as some people want to do.

btw, what happened to that patch, did it just get ignored on the list?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-29  9:36               ` Avi Kivity
@ 2012-03-29 11:46                 ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-29 11:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Thu, 2012-03-29 at 11:36 +0200, Avi Kivity wrote:
> > In any case, X seems to avoid it, maybe nobody does color expansion
> > nowadays (I suppose modern desktops don't, maybe using ancient X apps
> > might trigger that path). So no biggie. I'll have to fix KVM powerpc
> > dealing with memslot changes anyways.
> >
> > Thanks for your help,
> >
> 
> As a workaround you can use -vga std or -vga qxl.  The latter will work
> even better when we have a kernel driver.

-vga std works fine, -vga qxl, I haven't tried on ppc yet at all...

The thing with cirrus is that it's the default in many cases (libvirt,
openstack, you name it it's there...) so it would be good to get it
working.

I already have fixes for the X driver (mostly some bugs where it stores
PCI physical addresses in 32-bit quantities) and I'm fixing cirrusfb now
but the later hits the color expansion path, so it looks like we'll have
to fix kvm/ppc to do the right thing there too :-)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29 11:46                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-29 11:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Alexander Graf, kvm-ppc, Scott Wood, Alex Williamson

On Thu, 2012-03-29 at 11:36 +0200, Avi Kivity wrote:
> > In any case, X seems to avoid it, maybe nobody does color expansion
> > nowadays (I suppose modern desktops don't, maybe using ancient X apps
> > might trigger that path). So no biggie. I'll have to fix KVM powerpc
> > dealing with memslot changes anyways.
> >
> > Thanks for your help,
> >
> 
> As a workaround you can use -vga std or -vga qxl.  The latter will work
> even better when we have a kernel driver.

-vga std works fine, -vga qxl, I haven't tried on ppc yet at all...

The thing with cirrus is that it's the default in many cases (libvirt,
openstack, you name it it's there...) so it would be good to get it
working.

I already have fixes for the X driver (mostly some bugs where it stores
PCI physical addresses in 32-bit quantities) and I'm fixing cirrusfb now
but the later hits the color expansion path, so it looks like we'll have
to fix kvm/ppc to do the right thing there too :-)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-29  9:36               ` Avi Kivity
@ 2012-03-29 13:49                 ` Gerd Hoffmann
  -1 siblings, 0 replies; 30+ messages in thread
From: Gerd Hoffmann @ 2012-03-29 13:49 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Benjamin Herrenschmidt, kvm, Alexander Graf, kvm-ppc, Scott Wood,
	Alex Williamson

  Hi,

> As a workaround you can use -vga std or -vga qxl.  The latter will work
> even better when we have a kernel driver.

There is a kernel driver for -vga std too ;)

http://patchwork.ozlabs.org/patch/145479/

Didn't try on ppc though.  There is a funky #ifdef TARGET_I386 in
vbe_portio_list[], no idea why, but it makes me think a little tweak
could be needed to make it fly.  There shouldn't be any major roadblocks
though.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29 13:49                 ` Gerd Hoffmann
  0 siblings, 0 replies; 30+ messages in thread
From: Gerd Hoffmann @ 2012-03-29 13:49 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Benjamin Herrenschmidt, kvm, Alexander Graf, kvm-ppc, Scott Wood,
	Alex Williamson

  Hi,

> As a workaround you can use -vga std or -vga qxl.  The latter will work
> even better when we have a kernel driver.

There is a kernel driver for -vga std too ;)

http://patchwork.ozlabs.org/patch/145479/

Didn't try on ppc though.  There is a funky #ifdef TARGET_I386 in
vbe_portio_list[], no idea why, but it makes me think a little tweak
could be needed to make it fly.  There shouldn't be any major roadblocks
though.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-29  9:44       ` Avi Kivity
@ 2012-03-29 15:21         ` Takuya Yoshikawa
  -1 siblings, 0 replies; 30+ messages in thread
From: Takuya Yoshikawa @ 2012-03-29 15:21 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Takuya Yoshikawa, Benjamin Herrenschmidt, kvm, Alexander Graf,
	kvm-ppc, Scott Wood, Alex Williamson

On Thu, 29 Mar 2012 11:44:12 +0200
Avi Kivity <avi@redhat.com> wrote:

> > Even without using reverse mapping we can restrict that flush easily:
> >
> > 	http://www.spinics.net/lists/kvm/msg68695.html
> > 	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()
> >
> > This would be better than using reverse mapping because we do not have so
> > many shadow pages when we are in a tight loop like you mensioned.
> >
> > Anyway we could easily see tens of milliseconds difference by eliminating
> > unrelated flush.
> 
> Hm, the patch uses ->slot_bitmap which we might want to kill if we
> increase the number of slots dramatically, as some people want to do.
> 
> btw, what happened to that patch, did it just get ignored on the list?

I did not get any comments, maybe because it was during around your vacation.

	Takuya

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29 15:21         ` Takuya Yoshikawa
  0 siblings, 0 replies; 30+ messages in thread
From: Takuya Yoshikawa @ 2012-03-29 15:21 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Takuya Yoshikawa, Benjamin Herrenschmidt, kvm, Alexander Graf,
	kvm-ppc, Scott Wood, Alex Williamson

On Thu, 29 Mar 2012 11:44:12 +0200
Avi Kivity <avi@redhat.com> wrote:

> > Even without using reverse mapping we can restrict that flush easily:
> >
> > 	http://www.spinics.net/lists/kvm/msg68695.html
> > 	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()
> >
> > This would be better than using reverse mapping because we do not have so
> > many shadow pages when we are in a tight loop like you mensioned.
> >
> > Anyway we could easily see tens of milliseconds difference by eliminating
> > unrelated flush.
> 
> Hm, the patch uses ->slot_bitmap which we might want to kill if we
> increase the number of slots dramatically, as some people want to do.
> 
> btw, what happened to that patch, did it just get ignored on the list?

I did not get any comments, maybe because it was during around your vacation.

	Takuya

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-29 15:21         ` Takuya Yoshikawa
@ 2012-03-29 15:26           ` Avi Kivity
  -1 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-29 15:26 UTC (permalink / raw)
  To: Takuya Yoshikawa
  Cc: Takuya Yoshikawa, Benjamin Herrenschmidt, kvm, Alexander Graf,
	kvm-ppc, Scott Wood, Alex Williamson

On 03/29/2012 05:21 PM, Takuya Yoshikawa wrote:
> On Thu, 29 Mar 2012 11:44:12 +0200
> Avi Kivity <avi@redhat.com> wrote:
>
> > > Even without using reverse mapping we can restrict that flush easily:
> > >
> > > 	http://www.spinics.net/lists/kvm/msg68695.html
> > > 	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()
> > >
> > > This would be better than using reverse mapping because we do not have so
> > > many shadow pages when we are in a tight loop like you mensioned.
> > >
> > > Anyway we could easily see tens of milliseconds difference by eliminating
> > > unrelated flush.
> > 
> > Hm, the patch uses ->slot_bitmap which we might want to kill if we
> > increase the number of slots dramatically, as some people want to do.
> > 
> > btw, what happened to that patch, did it just get ignored on the list?
>
> I did not get any comments, maybe because it was during around your vacation.

Care to refresh it?  I think it's worthwhile.

And please ping me (or Marcelo, or others) if you get no reviews.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29 15:26           ` Avi Kivity
  0 siblings, 0 replies; 30+ messages in thread
From: Avi Kivity @ 2012-03-29 15:26 UTC (permalink / raw)
  To: Takuya Yoshikawa
  Cc: Takuya Yoshikawa, Benjamin Herrenschmidt, kvm, Alexander Graf,
	kvm-ppc, Scott Wood, Alex Williamson

On 03/29/2012 05:21 PM, Takuya Yoshikawa wrote:
> On Thu, 29 Mar 2012 11:44:12 +0200
> Avi Kivity <avi@redhat.com> wrote:
>
> > > Even without using reverse mapping we can restrict that flush easily:
> > >
> > > 	http://www.spinics.net/lists/kvm/msg68695.html
> > > 	[PATCH] KVM: Avoid zapping unrelated shadows in __kvm_set_memory_region()
> > >
> > > This would be better than using reverse mapping because we do not have so
> > > many shadow pages when we are in a tight loop like you mensioned.
> > >
> > > Anyway we could easily see tens of milliseconds difference by eliminating
> > > unrelated flush.
> > 
> > Hm, the patch uses ->slot_bitmap which we might want to kill if we
> > increase the number of slots dramatically, as some people want to do.
> > 
> > btw, what happened to that patch, did it just get ignored on the list?
>
> I did not get any comments, maybe because it was during around your vacation.

Care to refresh it?  I think it's worthwhile.

And please ping me (or Marcelo, or others) if you get no reviews.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
  2012-03-29 15:26           ` Avi Kivity
@ 2012-03-29 15:35             ` Takuya Yoshikawa
  -1 siblings, 0 replies; 30+ messages in thread
From: Takuya Yoshikawa @ 2012-03-29 15:35 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Takuya Yoshikawa, Benjamin Herrenschmidt, kvm, Alexander Graf,
	kvm-ppc, Scott Wood, Alex Williamson

On Thu, 29 Mar 2012 17:26:59 +0200
Avi Kivity <avi@redhat.com> wrote:

> > > Hm, the patch uses ->slot_bitmap which we might want to kill if we
> > > increase the number of slots dramatically, as some people want to do.
> > > 
> > > btw, what happened to that patch, did it just get ignored on the list?
> >
> > I did not get any comments, maybe because it was during around your vacation.
> 
> Care to refresh it?  I think it's worthwhile.

No problem, I will do after my rmap-next patches gets finished.

> And please ping me (or Marcelo, or others) if you get no reviews.

OK.

	Takuya

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Question about removing memslots
@ 2012-03-29 15:35             ` Takuya Yoshikawa
  0 siblings, 0 replies; 30+ messages in thread
From: Takuya Yoshikawa @ 2012-03-29 15:35 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Takuya Yoshikawa, Benjamin Herrenschmidt, kvm, Alexander Graf,
	kvm-ppc, Scott Wood, Alex Williamson

On Thu, 29 Mar 2012 17:26:59 +0200
Avi Kivity <avi@redhat.com> wrote:

> > > Hm, the patch uses ->slot_bitmap which we might want to kill if we
> > > increase the number of slots dramatically, as some people want to do.
> > > 
> > > btw, what happened to that patch, did it just get ignored on the list?
> >
> > I did not get any comments, maybe because it was during around your vacation.
> 
> Care to refresh it?  I think it's worthwhile.

No problem, I will do after my rmap-next patches gets finished.

> And please ping me (or Marcelo, or others) if you get no reviews.

OK.

	Takuya

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2012-03-29 15:35 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-28  7:24 Question about removing memslots Benjamin Herrenschmidt
2012-03-28  7:24 ` Benjamin Herrenschmidt
2012-03-28  9:37 ` Avi Kivity
2012-03-28  9:37   ` Avi Kivity
2012-03-28  9:59   ` Benjamin Herrenschmidt
2012-03-28  9:59     ` Benjamin Herrenschmidt
2012-03-28 10:05     ` Avi Kivity
2012-03-28 10:05       ` Avi Kivity
2012-03-28 10:17       ` Benjamin Herrenschmidt
2012-03-28 10:17         ` Benjamin Herrenschmidt
2012-03-28 10:51         ` Avi Kivity
2012-03-28 10:51           ` Avi Kivity
2012-03-28 21:04           ` Benjamin Herrenschmidt
2012-03-28 21:04             ` Benjamin Herrenschmidt
2012-03-29  9:36             ` Avi Kivity
2012-03-29  9:36               ` Avi Kivity
2012-03-29 11:46               ` Benjamin Herrenschmidt
2012-03-29 11:46                 ` Benjamin Herrenschmidt
2012-03-29 13:49               ` Gerd Hoffmann
2012-03-29 13:49                 ` Gerd Hoffmann
2012-03-29  5:15   ` Takuya Yoshikawa
2012-03-29  5:15     ` Takuya Yoshikawa
2012-03-29  9:44     ` Avi Kivity
2012-03-29  9:44       ` Avi Kivity
2012-03-29 15:21       ` Takuya Yoshikawa
2012-03-29 15:21         ` Takuya Yoshikawa
2012-03-29 15:26         ` Avi Kivity
2012-03-29 15:26           ` Avi Kivity
2012-03-29 15:35           ` Takuya Yoshikawa
2012-03-29 15:35             ` Takuya Yoshikawa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.