From: Mario Smarduch <m.smarduch@samsung.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: kvmarm@lists.cs.columbia.edu, marc.zyngier@arm.com,
	steve.capper@arm.com, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, gavin.guo@canonical.com,
	peter.maydell@linaro.org, jays.lee@samsung.com,
	sungjinn.chung@samsung.com
Subject: Re: [PATCH v7 4/4] arm: dirty page logging 2nd stage page fault handling support
Date: Tue, 10 Jun 2014 11:23:17 -0700
Message-ID: <53974D15.3070508@samsung.com>
In-Reply-To: <20140608120530.GH3279@lvm>

On 06/08/2014 05:05 AM, Christoffer Dall wrote:
> On Tue, Jun 03, 2014 at 04:19:27PM -0700, Mario Smarduch wrote:
>> This patch adds support for handling 2nd stage page faults during migration,
>> it disables faulting in huge pages, and disolves huge pages to page tables.
> 
> s/disolves/dissolves/g
Will do.
> 
>> In case migration is canceled huge pages will be used again.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/kvm/mmu.c |   36 ++++++++++++++++++++++++++++++++++--
>>  1 file changed, 34 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 1c546c9..aca4fbf 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -966,6 +966,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
>>  	struct vm_area_struct *vma;
>>  	pfn_t pfn;
>> +	/* Get logging status, if dirty_bitmap is not NULL then logging is on */
>> +	bool logging_active = !!memslot->dirty_bitmap;
> 
>>  
>>  	write_fault = kvm_is_write_fault(kvm_vcpu_get_hsr(vcpu));
>>  	if (fault_status == FSC_PERM && !write_fault) {
>> @@ -1019,10 +1021,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	spin_lock(&kvm->mmu_lock);
>>  	if (mmu_notifier_retry(kvm, mmu_seq))
>>  		goto out_unlock;
>> -	if (!hugetlb && !force_pte)
>> +
>> +	/* When logging don't spend cycles to check for huge pages */
> 
> drop the comment: either explain the entire clause (which would be too
> long) or don't explain anything.
> 
Ok.
>> +	if (!hugetlb && !force_pte && !logging_active)
> 
> instead of having all this, can't you just change 
> 
> if (is_vm_hugetlb_page(vma)) to
> if (is_vm_hugetlb_page(vma) && !logging_active)
> 
> then you're also not mucking around with the gfn etc.

I didn't want to modify this function too much, but if that's ok that 
simplifies things a lot.
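
To make sure I read the suggestion correctly, here is a minimal userspace
sketch of the resulting decision. It is not kernel code: struct fault_ctx,
enum map_kind and pick_mapping() are hypothetical names, and the thp_backed
flag only stands in for what transparent_hugepage_adjust() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the inputs user_mem_abort() decides on. */
struct fault_ctx {
	bool vma_is_hugetlb;	/* models is_vm_hugetlb_page(vma) */
	bool thp_backed;	/* models a successful transparent_hugepage_adjust() */
	bool force_pte;		/* models the existing force_pte flag */
	bool logging_active;	/* models !!memslot->dirty_bitmap */
};

enum map_kind { MAP_PTE, MAP_PMD };

/* Decide whether the fault is mapped with a huge PMD or a single PTE. */
static enum map_kind pick_mapping(const struct fault_ctx *c)
{
	bool hugetlb;

	assert(c != NULL);

	/* Suggested change: never treat the vma as hugetlb while logging. */
	hugetlb = c->vma_is_hugetlb && !c->logging_active;

	/* THP check from the patch: also skipped while logging is active. */
	if (!hugetlb && !c->force_pte && !c->logging_active)
		hugetlb = c->thp_backed;

	return hugetlb ? MAP_PMD : MAP_PTE;
}
```

With logging active every fault comes out as MAP_PTE, so the gfn and pfn
are never rounded to a PMD boundary in the first place.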

> 
>>  		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>>  
>> -	if (hugetlb) {
>> +	/*
>> +	 * Force all not present/perm faults to PTE handling, address both
>> +	 * PMD and PTE faults
>> +	 */
> 
> I don't understand this comment?  In which case does this apply?
> 
The cases I see here:
- a huge page permission fault is forced into the page table code while logging
- pte permission/not-present faults are handled by the page table code as before.
>> +	if (hugetlb && !logging_active) {
>>  		pmd_t new_pmd = pfn_pmd(pfn, PAGE_S2);
>>  		new_pmd = pmd_mkhuge(new_pmd);
>>  		if (writable) {
>> @@ -1034,6 +1042,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	} else {
>>  		pte_t new_pte = pfn_pte(pfn, PAGE_S2);
>>  		if (writable) {
>> +			/*
>> +			 * If pmd is  mapping a huge page then clear it and let
>> +			 * stage2_set_pte() create a pte table. At the sametime
>> +			 * you write protect the pte (PAGE_S2 pgprot_t).
>> +			 */
>> +			if (logging_active) {
>> +				pmd_t *pmd;
>> +				if (hugetlb) {
>> +					pfn += pte_index(fault_ipa);
>> +					gfn = fault_ipa >> PAGE_SHIFT;
>> +					new_pte = pfn_pte(pfn, PAGE_S2);
>> +				}
>> +				pmd = stage2_get_pmd(kvm, NULL, fault_ipa);
>> +				if (pmd && kvm_pmd_huge(*pmd))
>> +					clear_pmd_entry(kvm, pmd, fault_ipa);
>> +			}
> 
> now instead of all this, you just need to check for kvm_pmd_huge() in
> stage2_set_pte() and if that's true, you clear it, and then install
> your new pte.

Yes this really simplifies things!
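
A toy model of that flow, as I understand it, is below. Everything prefixed
toy_/TOY_ is made up for illustration and is not the kernel's stage-2 code;
in particular the sketch does no locking and no TLB maintenance, both of
which the real path would need.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PTRS_PER_PTE 512

enum toy_pmd_kind { TOY_PMD_NONE, TOY_PMD_HUGE, TOY_PMD_TABLE };

/* Hypothetical PMD entry: empty, a huge (block) mapping, or a pte table. */
struct toy_pmd {
	enum toy_pmd_kind kind;
	uint64_t huge_pfn;	/* valid when kind == TOY_PMD_HUGE */
	uint64_t *pte_table;	/* valid when kind == TOY_PMD_TABLE */
};

/*
 * Install one pte under pmd. If the pmd currently maps a huge page, the
 * huge mapping is dissolved first. Returns 0 on success, -1 on error.
 */
static int toy_set_pte(struct toy_pmd *pmd, unsigned int idx, uint64_t pte)
{
	assert(pmd != NULL);

	if (idx >= TOY_PTRS_PER_PTE)
		return -1;

	if (pmd->kind == TOY_PMD_HUGE) {
		/* Stands in for clear_pmd_entry(): drop the block mapping. */
		pmd->kind = TOY_PMD_NONE;
		pmd->huge_pfn = 0;
	}

	if (pmd->kind == TOY_PMD_NONE) {
		/* Allocate a zeroed pte table; untouched ptes stay empty. */
		uint64_t *table = calloc(TOY_PTRS_PER_PTE, sizeof(*table));

		if (!table)
			return -1;
		pmd->pte_table = table;
		pmd->kind = TOY_PMD_TABLE;
	}

	pmd->pte_table[idx] = pte;
	return 0;
}
```

Only the faulting pte is populated; the other entries of the former huge
page stay empty and get faulted in (and logged) one at a time.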

> 
>>  			kvm_set_s2pte_writable(&new_pte);
>>  			kvm_set_pfn_dirty(pfn);
>>  		}
>> @@ -1041,6 +1065,14 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, false);
>>  	}
>>  
>> +	/*
>> +	 * Log the dirty page in dirty_bitmap[], call regardless if logging is
>> +	 * disabled or enabled both cases handled safely.
>> +	 * TODO: for larger page size mark mulitple dirty page bits for each
>> +	 *       4k page.
>> +	 */
>> +	if (writable)
>> +		mark_page_dirty(kvm, gfn);
> 
> what if you just faulted in a page on a read which wasn't present
> before but it happens to belong to a writeable memslot, is that page
> then dirty? hmmm.
> 
A bug: we must also check that it was a write fault, not just that we're
dealing with a writable region. This one could be pretty bad for performance,
not to mention inaccurate. It will be interesting to see new test results;
glad you caught that.
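
Roughly what I have in mind for the fix, as a standalone sketch rather than
the actual patch. The toy_ names are hypothetical, the bitmap is a plain
array, and it assumes a read fault leaves the page write-protected at
stage 2 so that a later write faults again and is logged then.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 256
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_BITMAP_LONGS \
	((TOY_NPAGES + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Hypothetical memslot: a NULL dirty_bitmap means logging is off. */
struct toy_memslot {
	unsigned long base_gfn;
	unsigned long *dirty_bitmap;
};

/* Set the dirty bit for gfn; a no-op when logging is off or gfn is outside. */
static void toy_mark_page_dirty(struct toy_memslot *slot, unsigned long gfn)
{
	unsigned long rel;

	assert(slot != NULL);
	if (!slot->dirty_bitmap || gfn < slot->base_gfn)
		return;
	rel = gfn - slot->base_gfn;
	if (rel >= TOY_NPAGES)
		return;
	slot->dirty_bitmap[rel / TOY_BITS_PER_LONG] |=
		1UL << (rel % TOY_BITS_PER_LONG);
}

/* Report whether gfn is currently marked dirty in the slot. */
static bool toy_test_dirty(const struct toy_memslot *slot, unsigned long gfn)
{
	unsigned long rel;

	if (!slot->dirty_bitmap || gfn < slot->base_gfn)
		return false;
	rel = gfn - slot->base_gfn;
	if (rel >= TOY_NPAGES)
		return false;
	return (slot->dirty_bitmap[rel / TOY_BITS_PER_LONG] >>
		(rel % TOY_BITS_PER_LONG)) & 1UL;
}

/* Log only real writes: a read fault on a writable slot is not dirty. */
static void toy_log_fault(struct toy_memslot *slot, unsigned long gfn,
			  bool writable, bool write_fault)
{
	if (writable && write_fault)
		toy_mark_page_dirty(slot, gfn);
}
```

A read fault on a writable slot leaves the bitmap untouched, which should
keep the dirty set (and the amount of memory re-sent) down to pages that
were actually written.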

Thanks,
  Mario.
> 
>>  
>>  out_unlock:
>>  	spin_unlock(&kvm->mmu_lock);
>> -- 
>> 1.7.9.5
>>
> 
> Thanks,
> -Christoffer
> 

