KVM ARM Archive on lore.kernel.org
* [Question] Hardware management of stage2 page dirty state
@ 2020-05-14  9:16 zhukeqian
  2020-05-14 16:14 ` Catalin Marinas
  0 siblings, 1 reply; 3+ messages in thread
From: zhukeqian @ 2020-05-14  9:16 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm; +Cc: catalin.marinas, Zengtao (B), Marc Zyngier

Hi Catalin,

I have some questions after a close reading of your patch
https://patchwork.kernel.org/patch/8824261/ which enables hardware updates
of the Access Flag for Stage 2 page tables.

I notice that at the end of the commit message you say:
"After some digging through the KVM code, I concluded that hardware DBM
(dirty bit management) support is not feasible for Stage 2. A potential
user would be dirty logging but this requires a different bitmap exposed
to Qemu and, to avoid races, the stage 2 mappings need to be mapped
read-only on clean, writable on fault. This assumption simplifies the
hardware Stage 2 AF support."

I have three questions here.

1. I do not fully understand why hardware S2 DBM is "not feasible". Is the main
   reason the "races" you referred to?

2. What do the "races" refer to? Do you mean the races between [hardware S2 DBM]
   and [the dirty-information collection performed by KVM]?

   During VM live migration, Qemu sends dirty pages iteratively and finally stops the
   VM once the number of remaining dirty pages is small enough. We may miss dirty pages
   during each iteration before the VM stops, but there are no races after it stops, so
   no dirty page is missed in the end. "Races" therefore does not seem a convincing
   reason for "not feasible".

3. You said that leaving out hardware S2 DBM support simplifies the hardware S2 AF
   support. Could you please explain the reason in detail?



Looking forward to your reply. Many thanks!

Thanks,
Keqian.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


* Re: [Question] Hardware management of stage2 page dirty state
  2020-05-14  9:16 [Question] Hardware management of stage2 page dirty state zhukeqian
@ 2020-05-14 16:14 ` Catalin Marinas
  2020-05-15  4:20   ` zhukeqian
  0 siblings, 1 reply; 3+ messages in thread
From: Catalin Marinas @ 2020-05-14 16:14 UTC (permalink / raw)
  To: zhukeqian; +Cc: Marc Zyngier, Zengtao (B), kvmarm, linux-arm-kernel

Hi Keqian,

On Thu, May 14, 2020 at 05:16:52PM +0800, zhukeqian wrote:
> I have some questions after a close reading of your patch
> https://patchwork.kernel.org/patch/8824261/ which enables hardware updates
> of the Access Flag for Stage 2 page tables.
> 
> I notice that at the end of the commit message you say:
> "After some digging through the KVM code, I concluded that hardware DBM
> (dirty bit management) support is not feasible for Stage 2. A potential
> user would be dirty logging but this requires a different bitmap exposed
> to Qemu and, to avoid races, the stage 2 mappings need to be mapped
> read-only on clean, writable on fault. This assumption simplifies the
> hardware Stage 2 AF support."
> 
> I have three questions here.
> 
> 1. I do not fully understand why hardware S2 DBM is "not feasible". Is the main
>    reason the "races" you referred to?

IIRC, dirty logging works by having a bitmap populated by the host
kernel when the guest writes a page. Such a write triggers a stage 2 fault
and the kernel populates the bitmap. With S2 DBM, you wouldn't get a
fault when the guest writes the page, so the host kernel would have to
periodically check which S2 entries became writable to update the qemu
bitmap.

I think the race I had in mind was that the bitmap still reports the
page as clean while the guest already updated it.

Looking at this again, it may not matter much as qemu can copy those
pages again when migrating and before control is handed over to the new
host.

> 2. What do the "races" refer to? Do you mean the races between [hardware S2 DBM]
>    and [the dirty-information collection performed by KVM]?

Yes.

>    During VM live migration, Qemu sends dirty pages iteratively and finally stops the
>    VM once the number of remaining dirty pages is small enough. We may miss dirty pages
>    during each iteration before the VM stops, but there are no races after it stops, so
>    no dirty page is missed in the end. "Races" therefore does not seem a convincing
>    reason for "not feasible".

You are probably right. But you'd have to change the dirty tracking from
a fault mechanism to a polling one checking the S2 page tables
periodically. Or, can you check them just once after the VM stops?

> 3. You said that leaving out hardware S2 DBM support simplifies the hardware S2 AF
>    support. Could you please explain the reason in detail?

I probably meant that it simplifies the patch rather than something
specific to the AF support. If you add DBM, you'd need to make sure that
making a pte read-only doesn't lose the dirty information (see
ptep_set_wrprotect(), not sure whether KVM uses the same macro).

-- 
Catalin

* Re: [Question] Hardware management of stage2 page dirty state
  2020-05-14 16:14 ` Catalin Marinas
@ 2020-05-15  4:20   ` zhukeqian
  0 siblings, 0 replies; 3+ messages in thread
From: zhukeqian @ 2020-05-15  4:20 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Marc Zyngier, Zengtao (B), kvmarm, linux-arm-kernel

Hi Catalin,

On 2020/5/15 0:14, Catalin Marinas wrote:
> Hi Keqian,
> 
> On Thu, May 14, 2020 at 05:16:52PM +0800, zhukeqian wrote:
>> I have some questions after a close reading of your patch
>> https://patchwork.kernel.org/patch/8824261/ which enables hardware updates
>> of the Access Flag for Stage 2 page tables.
>>
>> I notice that at the end of the commit message you say:
>> "After some digging through the KVM code, I concluded that hardware DBM
>> (dirty bit management) support is not feasible for Stage 2. A potential
>> user would be dirty logging but this requires a different bitmap exposed
>> to Qemu and, to avoid races, the stage 2 mappings need to be mapped
>> read-only on clean, writable on fault. This assumption simplifies the
>> hardware Stage 2 AF support."
>>
>> I have three questions here.
>>
>> 1. I do not fully understand why hardware S2 DBM is "not feasible". Is the main
>>    reason the "races" you referred to?
> 
> IIRC, dirty logging works by having a bitmap populated by the host
> kernel when the guest writes a page. Such a write triggers a stage 2 fault
> and the kernel populates the bitmap. With S2 DBM, you wouldn't get a
> fault when the guest writes the page, so the host kernel would have to
> periodically check which S2 entries became writable to update the qemu
> bitmap.
Indeed, the performance cost of traversing the page-table entries is a
drawback of the DBM mechanism.

> 
> I think the race I had in mind was that the bitmap still reports the
> page as clean while the guest already updated it.
> 
> Looking at this again, it may not matter much as qemu can copy those
> pages again when migrating and before control is handed over to the new
> host.
Yes, the race is not a problem. Qemu will not have missed any dirty pages by the
time control is handed over to the destination Qemu.

> 
>> 2. What do the "races" refer to? Do you mean the races between [hardware S2 DBM]
>>    and [the dirty-information collection performed by KVM]?
> 
> Yes.
> 
>>    During VM live migration, Qemu sends dirty pages iteratively and finally stops the
>>    VM once the number of remaining dirty pages is small enough. We may miss dirty pages
>>    during each iteration before the VM stops, but there are no races after it stops, so
>>    no dirty page is missed in the end. "Races" therefore does not seem a convincing
>>    reason for "not feasible".
> 
> You are probably right. But you'd have to change the dirty tracking from
> a fault mechanism to a polling one checking the S2 page tables
> periodically. Or, can you check them just once after the VM stops?

Our goal is to remove the performance impact that the fault mechanism has on the
guest, so we want to use DBM from beginning to end.

For now, the only problem with DBM that we can identify is the page-table traversal
performance. We have run some demo tests on this and the situation is not that bad.
Besides, we have come up with some optimizations which ease it effectively.

I plan to send all test data and an RFC patch series to the community next week. It
should be functionally correct but without any optimizations. After that I will add
the optimizations on top of the RFC and send PATCH v1.

> 
>> 3. You said that leaving out hardware S2 DBM support simplifies the hardware S2 AF
>>    support. Could you please explain the reason in detail?
> 
> I probably meant that it simplifies the patch rather than something
> specific to the AF support. If you add DBM, you'd need to make sure that
> making a pte read-only doesn't lose the dirty information (see
> ptep_set_wrprotect(), not sure whether KVM uses the same macro).
> 
OK, I will watch out for this problem, thanks!

Thanks,
Keqian