From: Yanan Wang <wangyanan55@huawei.com>
To: Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	James Morse <james.morse@arm.com>,
	"Julien Thierry" <julien.thierry.kdev@gmail.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	<kvmarm@lists.cs.columbia.edu>,
	<linux-arm-kernel@lists.infradead.org>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Cc: <wanghaibin.wang@huawei.com>, <yuzenghui@huawei.com>,
	Yanan Wang <wangyanan55@huawei.com>
Subject: [PATCH 0/2] Performance improvement about cache flush
Date: Mon, 25 Jan 2021 22:10:42 +0800	[thread overview]
Message-ID: <20210125141044.380156-1-wangyanan55@huawei.com> (raw)

Hi,
These two patches introduce a new method to distinguish more precisely the
cases in which memcache pages must be allocated, and to elide some
unnecessary cache flushes.

For patch-1:
On a guest translation fault, we don't really need the memcache pages when
we are only installing a new leaf entry into an existing page table, or
replacing a table entry with a block entry. On a guest permission fault, we
likewise don't need the memcache pages for a write fault taken during dirty
logging if the VM is not configured with huge mappings. A new method is
therefore introduced to distinguish these allocation cases more precisely.
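
As a rough sketch of the case analysis above (the helper name and its
parameters are hypothetical illustrations, not the actual patch interface):

    #include <stdbool.h>

    /*
     * Sketch: decide whether memcache pages must be topped up before
     * handling a stage-2 fault, per the cases described above.
     */
    static bool need_new_memcache_pages(bool translation_fault,
                                        bool only_reusing_tables,
                                        bool write_fault,
                                        bool dirty_logging,
                                        bool huge_mappings)
    {
            if (translation_fault)
                    /*
                     * Installing a leaf in an existing table, or
                     * replacing a table entry with a block entry,
                     * consumes no new page-table pages.
                     */
                    return !only_reusing_tables;

            /*
             * Permission fault: a write fault during dirty logging on
             * a VM without huge mappings only relaxes permissions on
             * an existing PTE, so no memcache pages are needed.
             */
            return !(write_fault && dirty_logging && !huge_mappings);
    }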

For patch-2:
If migration of a VM with hugepages is canceled midway, KVM will adjust the
stage-2 table mappings back to block mappings. With multiple vCPUs accessing
guest pages within the same 1G range, there can be a large number of
translation faults to handle, and KVM will flush the data cache for the
whole 1G range before handling each of them. Since flushing the data cache
for 1G of memory takes a long time (130ms on Kunpeng 920 servers, for
example), the cache flush repeated for every translation fault can leave a
vCPU stuck for seconds or even trigger a soft lockup. I have observed both
the stall and the soft lockup on Kunpeng servers where FWB is not supported.
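
As a back-of-the-envelope check of that number (the 64-byte cache line size
and the per-line cost are assumptions for illustration, not measurements):

    #include <stdio.h>

    int main(void)
    {
            const unsigned long long range = 1ULL << 30; /* 1GiB         */
            const unsigned long long line = 64;          /* assumed size */
            const double ns_per_line = 8.0;              /* assumed cost */
            unsigned long long ops = range / line;       /* ~16.8M lines */

            /* ~16.8M line flushes at ~8ns each is ~134ms, in line
             * with the 130ms figure quoted above. */
            printf("%llu ops, ~%.0f ms\n", ops, ops * ns_per_line / 1e6);
            return 0;
    }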

When KVM needs to recover the table mappings back to block mappings, we only
replace the existing page tables with a block entry, and the cacheability
has not changed; the cache maintenance operations can therefore be skipped.
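
A minimal sketch of that condition (hypothetical names; the real logic
lives in the stage-2 page-table fault handling):

    #include <stdbool.h>

    /*
     * Sketch: cache maintenance is only needed when the new mapping
     * changes what the guest can observe, not when a valid table
     * mapping is merely coalesced into a block mapping with
     * unchanged cacheability.
     */
    static bool stage2_needs_dcache_flush(bool had_valid_table_mapping,
                                          bool cacheability_changed)
    {
            if (had_valid_table_mapping && !cacheability_changed)
                    return false;

            return true;
    }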

Yanan Wang (2):
  KVM: arm64: Distinguish cases of allocating memcache more precisely
  KVM: arm64: Skip the cache flush when coalescing tables into a block

 arch/arm64/kvm/mmu.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

-- 
2.19.1


Thread overview: 24+ messages

2021-01-25 14:10 [PATCH 0/2] Performance improvement about cache flush Yanan Wang [this message]
2021-01-25 14:10 ` [PATCH 1/2] KVM: arm64: Distinguish cases of allocating memcache more precisely Yanan Wang
2021-03-08 16:35   ` Will Deacon
2021-01-25 14:10 ` [PATCH 2/2] KVM: arm64: Skip the cache flush when coalescing tables into a block Yanan Wang
2021-03-08 16:34   ` Will Deacon
2021-03-09  8:34     ` wangyanan (Y)
2021-03-09  8:43       ` Marc Zyngier
2021-03-09  9:02         ` wangyanan (Y)
