All of lore.kernel.org
 help / color / mirror / Atom feed
* Checking  guest memory pages changes from host userspace
@ 2009-06-19 18:09 Passera, Pablo R
  2009-06-20  6:47 ` Amit Shah
  2009-06-21 15:51 ` Avi Kivity
  0 siblings, 2 replies; 9+ messages in thread
From: Passera, Pablo R @ 2009-06-19 18:09 UTC (permalink / raw)
  To: kvm

Hi list,
        I need to monitor some guest memory pages. I need to know whether the information in these pages has changed. For this, I was thinking of marking the guest memory pages in some way (like write protecting them) so that a page fault is generated, and then handling this fault inside qemu. Is there some API in libkvm that allows me to do this?

Thanks a lot,
Pablo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-19 18:09 Checking guest memory pages changes from host userspace Passera, Pablo R
@ 2009-06-20  6:47 ` Amit Shah
  2009-06-21 15:51 ` Avi Kivity
  1 sibling, 0 replies; 9+ messages in thread
From: Amit Shah @ 2009-06-20  6:47 UTC (permalink / raw)
  To: Passera, Pablo R; +Cc: kvm

On (Fri) Jun 19 2009 [12:09:10], Passera, Pablo R wrote:
> Hi list,
>         I need to monitor some guest memory pages. I need to know if the information in these pages was changed. For this, I was thinking to mark the guest memory pages in some way (like write protecting them) so a page fault is generated. Then manage this fault inside qemu. Is there some API in libkvm that allows me to do this?

If it's for debugging, you can use qemu without kvm. If you're only
interested in a few pages (and are proposing a solution based on that),
I don't think that can be done with EPT / NPT enabled so it won't be
something that'll be accepted upstream.

		Amit

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-19 18:09 Checking guest memory pages changes from host userspace Passera, Pablo R
  2009-06-20  6:47 ` Amit Shah
@ 2009-06-21 15:51 ` Avi Kivity
  2009-06-21 18:46   ` Alexander Graf
  1 sibling, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2009-06-21 15:51 UTC (permalink / raw)
  To: Passera, Pablo R; +Cc: kvm

On 06/19/2009 09:09 PM, Passera, Pablo R wrote:
> Hi list,
>          I need to monitor some guest memory pages. I need to know if the information in these pages was changed. For this, I was thinking to mark the guest memory pages in some way (like write protecting them) so a page fault is generated. Then manage this fault inside qemu. Is there some API in libkvm that allows me to do this?
>    

You can use the dirty memory logging API.  vga uses this to track which 
regions of the screen have changed, and live migration uses it to allow 
the guest to proceed while copying its memory to the other node.  It 
works exactly by write protecting guest memory and trapping the 
resultant fault.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-21 15:51 ` Avi Kivity
@ 2009-06-21 18:46   ` Alexander Graf
  2009-06-21 20:01     ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Alexander Graf @ 2009-06-21 18:46 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Passera, Pablo R, kvm


On 21.06.2009, at 17:51, Avi Kivity <avi@redhat.com> wrote:

> On 06/19/2009 09:09 PM, Passera, Pablo R wrote:
>> Hi list,
>>         I need to monitor some guest memory pages. I need to know  
>> if the information in these pages was changed. For this, I was  
>> thinking to mark the guest memory pages in some way (like write  
>> protecting them) so a page fault is generated. Then manage this  
>> fault inside qemu. Is there some API in libkvm that allows me to do  
>> this?
>>
>
> You can use the dirty memory logging API.  vga uses this to track  
> which regions of the screen have changed, and live migration uses it  
> to allow the guest to proceed while copying its memory to the other  
> node.  It works exactly by write protecting guest memory and  
> trapping the resultant fault.

I stumbled across this on my ppc implementation: Is there an obvious  
reason we don't use the pte's dirty bit?

I don't know which operation is more frequent - writing into dirty  
mapped memory or reading the dirty map. And I have no idea how long it  
would take to find the dirty pages...

Alex


>
> -- 
> error compiling committee.c: too many arguments to function
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-21 18:46   ` Alexander Graf
@ 2009-06-21 20:01     ` Avi Kivity
  2009-06-22  8:50       ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2009-06-21 20:01 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Passera, Pablo R, kvm

On 06/21/2009 09:46 PM, Alexander Graf wrote:
>> You can use the dirty memory logging API.  vga uses this to track 
>> which regions of the screen have changed, and live migration uses it 
>> to allow the guest to proceed while copying its memory to the other 
>> node.  It works exactly by write protecting guest memory and trapping 
>> the resultant fault.
>
>
> I stumbled across this on my ppc implementation: Is there an obvious 
> reason we don't use the pte's dirty bit?


Yes:

> I don't know which operation is more frequent - writing into dirty 
> mapped memory or reading the dirty map. And I have no idea how long it 
> would take to find out dirty pages...

The cost of write protection is one fault per dirtied spte.  The cost of 
looking at the dirty bit is a cache miss per spte (could be reduced by 
scanning in spte order rather than gfn order).

The problem is when you have a low percentage of memory dirtied.  Then 
you're scanning a lot of sptes to find a few dirty ones - so the cost 
per dirty page goes up.

We've talked about write-protecting the upper levels first, but given a 
random distribution of writes, that doesn't help much.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-21 20:01     ` Avi Kivity
@ 2009-06-22  8:50       ` Avi Kivity
  2009-06-22  9:42         ` Alexander Graf
  0 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2009-06-22  8:50 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Passera, Pablo R, kvm

On 06/21/2009 11:01 PM, Avi Kivity wrote:
>> I don't know which operation is more frequent - writing into dirty 
>> mapped memory or reading the dirty map. And I have no idea how long 
>> it would take to find out dirty pages...
>
> The cost of write protection is one fault per dirtied spte.  The cost 
> of looking at the dirty bit is a cache miss per spte (could be reduced 
> by scanning in spte order rather than gfn order).
>
> The problem is when you have a low percentage of memory dirtied.  Then 
> you're scanning a lot of sptes to find a few dirty ones - so the cost 
> per dirty page goes up.
>
> We've talked about write-protecting the upper levels first, but given 
> a random distribution of writes, that doesn't help much.
>

Thinking about it a bit more, when we write-protect pages we're O(spte) 
anyway, so that shouldn't be a barrier.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-22  8:50       ` Avi Kivity
@ 2009-06-22  9:42         ` Alexander Graf
  2009-06-22  9:48           ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Alexander Graf @ 2009-06-22  9:42 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Passera, Pablo R, kvm


On 22.06.2009, at 10:50, Avi Kivity wrote:

> On 06/21/2009 11:01 PM, Avi Kivity wrote:
>>> I don't know which operation is more frequent - writing into dirty  
>>> mapped memory or reading the dirty map. And I have no idea how  
>>> long it would take to find out dirty pages...
>>
>> The cost of write protection is one fault per dirtied spte.  The  
>> cost of looking at the dirty bit is a cache miss per spte (could be  
>> reduced by scanning in spte order rather than gfn order).
>>
>> The problem is when you have a low percentage of memory dirtied.   
>> Then you're scanning a lot of sptes to find a few dirty ones - so  
>> the cost per dirty page goes up.
>>
>> We've talked about write-protecting the upper levels first, but  
>> given a random distribution of writes, that doesn't help much.
>>
>
> Thinking about it a bit more, when we write-protect pages we're  
> O(spte) anyway, so that shouldn't be a barrier.

Yeah, the current implementation is probably the fastest you'll get. I  
didn't want to slow down shadow page setup due to the dirty update,  
but I guess compared to the rest of the overhead that doesn't really  
weigh that much.

So I'll go with the same approach on ppc as well :-).

Alex


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
  2009-06-22  9:42         ` Alexander Graf
@ 2009-06-22  9:48           ` Avi Kivity
       [not found]             ` <DC72E7E7-2494-48BF-96C6-F543A29888B1@suse.de>
  0 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2009-06-22  9:48 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Passera, Pablo R, kvm

On 06/22/2009 12:42 PM, Alexander Graf wrote:
>> Thinking about it a bit more, when we write-protect pages we're 
>> O(spte) anyway, so that shouldn't be a barrier.
>
>
> Yeah, the current implementation is probably the fastest you'll get. I 
> didn't want to slow down shadow page setup due to the dirty update, 
> but I guess compared to the rest of the overhead that doesn't really 
> weigh that much.

I didn't explain myself well, I now think using the dirty bits is better.

Currently we do the following:
1. sweep all sptes to drop write permissions
2. on write faults, mark the page dirty
3. retrieve the log

We could do instead:
1. sweep all sptes to drop the dirty bit
2. on writes, set the dirty bit (the cpu does this)
3. sweep all sptes to read the dirty bit, and return the log

Since step 1 occurs after step 3 of the previous iteration, we could 
merge them, and lose nothing.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Checking  guest memory pages changes from host userspace
       [not found]             ` <DC72E7E7-2494-48BF-96C6-F543A29888B1@suse.de>
@ 2009-06-22 11:38               ` Avi Kivity
  0 siblings, 0 replies; 9+ messages in thread
From: Avi Kivity @ 2009-06-22 11:38 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Passera, Pablo R, kvm

On 06/22/2009 12:57 PM, Alexander Graf wrote:
>>> Yeah, the current implementation is probably the fastest you'll get. 
>>> I didn't want to slow down shadow page setup due to the dirty 
>>> update, but I guess compared to the rest of the overhead that 
>>> doesn't really weigh that much.
>>
>> I didn't explain myself well, I now think using the dirty bits is better.
>>
>> Currently we do the following:
>> 1. sweep all sptes to drop write permissions
>
> sweep = flush / remove from spt?

sweep = iterate over all (dropping write permissions from each spte)

>> 2. on write faults, mark the page dirty
>> 3. retrieve the log
>>
>> We could do instead:
>> 1. sweep all sptes to drop the dirty bit
>
> sweep = modify pte to set dirty=0?

sweep = iterate over all (dropping dirty bits)


>> 2. on writes, set the dirty bit (the cpu does this)
>> 3. sweep all sptes to read the dirty bit, and return the log
>>
>> Since step 1 occurs after step 3 of the previous iteration, we could 
>> merge them, and lose nothing.
>
> Hm - so in both cases we need to loop through all PTEs anyway, 
> because we need to either remove them or clear their dirty bits?

Yes.  Although for the write-protect case, we could alternatively look 
at the bitmap to see which sptes we need to drop.

>
> Then it really does make sense to use the dirty bit :-).
> Also doing a #vmexit is rather expensive, so I'd rather loop through 
> 1000 entries in the host context than taking 10 #vmexits. And dirty 
> bits don't #vmexit.

It's not that trivial.  A #vmexit is about 2000 cycles (including mmu 
code), while a cache miss is 100-200 cycles.  So if we don't scan the 
sptes carefully, the cache miss cost could be greater.

> Maybe it'd make sense to use the higher order PTE dirty bits too (do 
> they have dirty bits on x86?) to not loop through all PTEs to generate 
> the dirty map. In most cases it'll be 0 anyways.

There are no higher dirty bits, but we can write protect the higher 
level.  I'm not sure it's worthwhile; if 1% of memory is dirty, but it's 
scattered randomly, then all 2MB ranges will be dirty.

> That way we'd save 90% of the loop time, because we only need to check 
> a couple of 2/4mb pte entries.

You have a 4MB guest?  Okay, you're only considering the vga tracking.  
I don't think that's a problem in practice, worst case is a few hundred 
faults in a 30ms time period.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-06-22 11:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-19 18:09 Checking guest memory pages changes from host userspace Passera, Pablo R
2009-06-20  6:47 ` Amit Shah
2009-06-21 15:51 ` Avi Kivity
2009-06-21 18:46   ` Alexander Graf
2009-06-21 20:01     ` Avi Kivity
2009-06-22  8:50       ` Avi Kivity
2009-06-22  9:42         ` Alexander Graf
2009-06-22  9:48           ` Avi Kivity
     [not found]             ` <DC72E7E7-2494-48BF-96C6-F543A29888B1@suse.de>
2009-06-22 11:38               ` Avi Kivity

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.