From: Suzuki K Poulose <suzuki.poulose@arm.com>
To: zhengxiang9@huawei.com, yuzenghui@huawei.com
Cc: marc.zyngier@arm.com, christoffer.dall@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com,
james.morse@arm.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
wanghaibin.wang@huawei.com, lious.lilei@hisilicon.com,
lishuo1@hisilicon.com
Subject: Re: [RFC] Question about TLB flush while set Stage-2 huge pages
Date: Fri, 15 Mar 2019 14:56:45 +0000
Message-ID: <6aea4049-7860-7144-a7be-14f856cdc789@arm.com>
In-Reply-To: <d322e126-4da2-6dfd-a86d-088dfb3bf0f4@huawei.com>
Hi Zhengui,
On 15/03/2019 08:21, Zheng Xiang wrote:
> Hi Suzuki,
>
> I have tested this patch, VM doesn't hang and we get expected WARNING log:
Thanks for the quick testing!
> However, we also get the following unexpected log:
>
> [ 908.329900] BUG: Bad page state in process qemu-kvm pfn:a2fb41cf
> [ 908.339415] page:ffff7e28bed073c0 count:-4 mapcount:0 mapping:0000000000000000 index:0x0
> [ 908.339416] flags: 0x4ffffe0000000000()
> [ 908.339418] raw: 4ffffe0000000000 dead000000000100 dead000000000200 0000000000000000
> [ 908.339419] raw: 0000000000000000 0000000000000000 fffffffcffffffff 0000000000000000
> [ 908.339420] page dumped because: nonzero _refcount
> [ 908.339437] CPU: 32 PID: 72599 Comm: qemu-kvm Kdump: loaded Tainted: G B W 5.0.0+ #1
> [ 908.339438] Call trace:
> [ 908.339439] dump_backtrace+0x0/0x188
> [ 908.339441] show_stack+0x24/0x30
> [ 908.339442] dump_stack+0xa8/0xcc
> [ 908.339443] bad_page+0xf0/0x150
> [ 908.339445] free_pages_check_bad+0x84/0xa0
> [ 908.339446] free_pcppages_bulk+0x4b8/0x750
> [ 908.339448] free_unref_page_commit+0x13c/0x198
> [ 908.339449] free_unref_page+0x84/0xa0
> [ 908.339451] __free_pages+0x58/0x68
> [ 908.339452] zap_huge_pmd+0x290/0x2d8
> [ 908.339454] unmap_page_range+0x2b4/0x470
> [ 908.339455] unmap_single_vma+0x94/0xe8
> [ 908.339457] unmap_vmas+0x8c/0x108
> [ 908.339458] exit_mmap+0xd4/0x178
> [ 908.339459] mmput+0x74/0x180
> [ 908.339460] do_exit+0x2b4/0x5b0
> [ 908.339462] do_group_exit+0x3c/0xe0
> [ 908.339463] __arm64_sys_exit_group+0x24/0x28
> [ 908.339465] el0_svc_common+0xa0/0x180
> [ 908.339466] el0_svc_handler+0x38/0x78
> [ 908.339467] el0_svc+0x8/0xc
That's bad; we seem to be making up to 4 unbalanced put_page() calls.
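For reference, the count:-4 in the dump can be cross-checked against the
"raw:" words. The sketch below is a userspace illustration only, not kernel
code: decode_counters() is a hypothetical helper, and it assumes a
little-endian 64-bit build where _mapcount sits in the low half and _refcount
in the high half of the word that reads fffffffcffffffff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: split one 64-bit word of the
 * "raw:" page dump into two signed 32-bit counters. Assumes a
 * little-endian 64-bit build where _mapcount occupies the low half
 * and _refcount the high half of that word.
 */
static void decode_counters(uint64_t word, int32_t *mapcount, int32_t *refcount)
{
	uint32_t lo = (uint32_t)(word & 0xffffffffu);
	uint32_t hi = (uint32_t)(word >> 32);

	assert(mapcount && refcount);
	/* memcpy reinterprets the bits as signed without relying on casts. */
	memcpy(mapcount, &lo, sizeof(lo));
	memcpy(refcount, &hi, sizeof(hi));
}
```

Feeding it 0xfffffffcffffffff should give a refcount of -4 and a raw _mapcount
of -1. As far as I recall the dump prints _mapcount + 1, which would match the
"mapcount:0" shown above.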
>>> ---
>>> virt/kvm/arm/mmu.c | 51 +++++++++++++++++++++++++++++++++++----------------
>>> 1 file changed, 35 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 66e0fbb5..04b0f9b 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1076,24 +1076,38 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>>> * Skip updating the page table if the entry is
>>> * unchanged.
>>> */
>>> - if (pmd_val(old_pmd) == pmd_val(*new_pmd))
>>> + if (pmd_val(old_pmd) == pmd_val(*new_pmd)) {
>>> return 0;
>>> -
>>> + } else if (WARN_ON_ONCE(!pmd_thp_or_huge(old_pmd))) {
>>> /*
>>> - * Mapping in huge pages should only happen through a
>>> - * fault. If a page is merged into a transparent huge
>>> - * page, the individual subpages of that huge page
>>> - * should be unmapped through MMU notifiers before we
>>> - * get here.
>>> - *
>>> - * Merging of CompoundPages is not supported; they
>>> - * should become splitting first, unmapped, merged,
>>> - * and mapped back in on-demand.
>>> + * If we have PTE level mapping for this block,
>>> + * we must unmap it to avoid inconsistent TLB
>>> + * state. We could end up in this situation if
>>> + * the memory slot was marked for dirty logging
>>> + * and was reverted, leaving PTE level mappings
>>> + * for the pages accessed during the period.
>>> + * Normal THP split/merge follows mmu_notifier
>>> + * callbacks and do get handled accordingly.
>>> */
>>> - VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
>>> + unmap_stage2_range(kvm, (addr & S2_PMD_MASK), S2_PMD_SIZE);
>
> It seems that kvm decreases the _refcount of the page twice in transparent_hugepage_adjust()
> and unmap_stage2_range().
But I thought we should be doing that on the head_page already, as this is THP.
I will take a look and get back to you on this. Btw, is it possible for you
to turn on CONFIG_DEBUG_VM and re-run with the above patch?
Kind regards
Suzuki