From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: yuzhao@google.com, Steve.Capper@arm.com, marc.zyngier@arm.com,
	Catalin.Marinas@arm.com, suzuki.poulose@arm.com,
	will.deacon@arm.com, james.morse@arm.com,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH V2 1/6] KVM: ARM: Remove pgtable standard functions from stage-2 page tables
Date: Tue, 26 Feb 2019 09:45:05 +0530
Message-ID: <d5ccc1ae-b97a-5e01-78c7-a2f377b3024f@arm.com>
In-Reply-To: <20190225154922.GJ26236@lakrids.cambridge.arm.com>



On 02/25/2019 09:19 PM, Mark Rutland wrote:
> On Mon, Feb 25, 2019 at 07:50:27PM +0530, Anshuman Khandual wrote:
>>
>>
>> On 02/25/2019 04:30 PM, Mark Rutland wrote:
>>> Hi Anshuman,
>>>
>>> On Mon, Feb 25, 2019 at 10:33:54AM +0530, Anshuman Khandual wrote:
>>>> ARM64 standard pgtable functions are going to use pgtable_page_[ctor|dtor]
>>>> constructs. Certain stage-2 page table allocations are multi order which
>>>> cannot be allocated through a generic pgtable function as it does not exist
>>>> right now. This prevents all pgtable allocations including multi order ones
>>>> in stage-2 from moving into new ARM64 (pgtable_page_[ctor|dtor]) pgtable
>>>> functions. Hence remove all generic pgtable allocation function dependency
>>>> from stage-2 page tables till there is one multi-order allocation function
>>>> available.
>>>
>>> I'm a bit confused by this. Which allocations are multi-order?
>>>
>>
>> Stage-2 PGD.
>>
>> kvm_alloc_stage2_pgd -> alloc_pages_exact
>> kvm_free_stage2_pgd -> free_pages_exact
>>  
>>> Why does that prevent other allocations from using the regular routines?
>>
>> At present both the stage-2 PGD (kvm_alloc_stage2_pgd -> alloc_pages_exact) and
>> PUD|PMD (mmu_memory_cache_alloc) are allocated directly from the buddy allocator.
>> On the free side, the stage-2 PGD goes straight back to the buddy allocator via
>> free_pages_exact, but PUD|PMD get freed with stage2_[pud|pmd]_free, which calls
>> pud|pmd_free instead of calling free_pages() directly.
> 
> If we allocate/free the stage-2 PGD with {alloc,free}_pages_exact(),
> then the PGD level is balanced today.
> 
> I don't see what that has to do with the other levels of table.

The idea is that either the pages at all page table levels go through the
ctor/dtor constructs or none of them do. Not only does this make sense logically,
it also prevents bad page state errors when freeing without calling the dtor. For
other non-KVM use cases like the kernel linear mapping or vmemmap, it helps while
tearing down the page tables for memory hot-remove.
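
As a rough illustration of the all-or-none rule, here is a userspace toy model
(not kernel code). Every name in it (toy_page, toy_ctor, toy_dtor and so on) is
hypothetical, and the flag and counter are only stand-ins for whatever state the
real pgtable_page_ctor()/pgtable_page_dtor() maintain. It flags a mismatch in
either direction: a dtor on a page that never saw the ctor, or a raw free of a
page whose dtor was skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct page: only remembers whether the ctor ran. */
struct toy_page {
	bool is_table;
};

/* Toy stand-in for a "pages used as page tables" counter. */
static long toy_nr_tables;

/* Plain allocation, no ctor (the "none of them do" side). */
static struct toy_page *toy_alloc_raw(void)
{
	return calloc(1, sizeof(struct toy_page));
}

/* Marks the page as a page table and accounts for it. */
static void toy_ctor(struct toy_page *p)
{
	p->is_table = true;
	toy_nr_tables++;
}

/* Undoes toy_ctor(). Returns false if the ctor never ran on this page. */
static bool toy_dtor(struct toy_page *p)
{
	if (!p->is_table)
		return false;
	p->is_table = false;
	toy_nr_tables--;
	return true;
}

/* Raw free. Returns false if the page still carries ctor state. */
static bool toy_free_raw(struct toy_page *p)
{
	bool clean = !p->is_table;

	free(p);
	return clean;
}

/* Free path that always runs the dtor first. Returns false on a mismatch. */
static bool toy_free_with_dtor(struct toy_page *p)
{
	bool ok = toy_dtor(p);

	return toy_free_raw(p) && ok;
}
```

In this model, ctor plus toy_free_with_dtor() is balanced and so is
toy_alloc_raw() plus toy_free_raw(). Mixing the two reports a mismatch, which is
the situation the stage-2 PUD|PMD pages would end up in.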

> 
>> All of this worked fine because pud|pmd_free called free_pages() directly without
>> going through pgtable_page_dtor(). But now we are changing pud|pmd_free to use
>> pgtable_page_dtor() for both user and host kernel page tables. This will break
>> stage-2 page tables (bad page state errors) because the new free path would call
>> pgtable_page_dtor() whereas the alloc path never called pgtable_page_ctor().
> 
> I'm lost as to how that relates to the alloc/free of the PGD. AFAICT,
> that's unrelated to the problem at hand.

As mentioned above, either all page tables use ctor/dtor based allocation/free or
none of them do.

> 
> What subtlety am I missing?

pmd|pud_free() will call the dtor after this series, hence they cannot be used by
stage2_pmd|pud_free(), because stage-2 PUD|PMD pages get allocated through
mmu_memory_cache_alloc() from a pre-allocated cache that is filled with
__get_free_page(PGALLOC_GFP). Is that not right?

> 
>> To fix this situation, either we move all stage-2 page tables to use pte_alloc_one()
>> and pte_free(), which go through the pgtable_page_ctor/dtor cycle, or we just keep the
>> virtualization page tables (stage2/hyp) out of it and remove the only dependency
>> that breaks because of these changes. This series went with the second option.
> 
> It sounds to me like this is just a mismatch between the alloc/free
> paths for the PUD and PMD levels of table.
> 
> IIUC, we allocate those with {pmd,pud}_alloc_one(), and free them with
> stage2_{pud,pmd}_free(), which call {pud,pmd}_free().

AFAICS they get allocated from mmu_memory_cache_alloc() instead. pud|pmd_alloc_one()
are used only for the _hyp_ mappings, not for the stage-2 tables.

kvm_handle_guest_abort
	user_mem_abort
		stage2_set_pud_huge -> stage2_get_pud -> mmu_memory_cache_alloc
		stage2_set_pmd_huge -> stage2_get_pmd -> mmu_memory_cache_alloc
		stage2_set_pte -> mmu_memory_cache_alloc
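
To make the allocation side of that chain concrete, here is a minimal userspace
sketch of a pre-filled object cache, loosely modelled on the top-up/alloc
pattern described in this thread. All toy_* names, the capacity and the page size
are assumptions made for the sketch, and calloc() stands in for
__get_free_page(PGALLOC_GFP). The point it shows is that pages handed out this
way never pass through any page table ctor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NR_MEM_OBJS	40	/* arbitrary capacity for this sketch */
#define TOY_PAGE_SIZE	4096	/* assumed page size */

/* Stack of pre-allocated pages. */
struct toy_memory_cache {
	int nobjs;
	void *objects[TOY_NR_MEM_OBJS];
};

/*
 * If the cache holds fewer than min pages, fill it up to max.
 * Returns 0 on success and -1 if an allocation fails.
 */
static int toy_topup_memory_cache(struct toy_memory_cache *mc, int min, int max)
{
	assert(max <= TOY_NR_MEM_OBJS);

	if (mc->nobjs >= min)
		return 0;

	while (mc->nobjs < max) {
		/* Raw zeroed page: note that no ctor runs here. */
		void *page = calloc(1, TOY_PAGE_SIZE);

		if (!page)
			return -1;
		mc->objects[mc->nobjs++] = page;
	}
	return 0;
}

/* Pops one page. The caller must have topped up the cache beforehand. */
static void *toy_memory_cache_alloc(struct toy_memory_cache *mc)
{
	assert(mc && mc->nobjs > 0);
	return mc->objects[--mc->nobjs];
}

/* Releases whatever is still sitting in the cache. */
static void toy_free_memory_cache(struct toy_memory_cache *mc)
{
	while (mc->nobjs > 0)
		free(mc->objects[--mc->nobjs]);
}
```

A page popped from such a cache has to be released with the matching raw free
(plain free() in this sketch), not with a free routine that expects ctor state.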

Am I missing something here?

> 
> I would naively expect p?d_alloc_one() to pair with p?d_free(), so that
> being a problem is surprising to me.

They work absolutely fine. But that's not the case here. stage2_pud|pmd_free()
calls pud|pmd_free() for pages not allocated with pud|pmd_alloc_one().
