linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Gavin Shan <gshan@redhat.com>
To: Marc Zyngier <maz@kernel.org>
Cc: mark.rutland@arm.com, drjones@redhat.com, suzuki.poulose@arm.com,
	catalin.marinas@arm.com, eric.auger@redhat.com,
	james.morse@arm.com, shan.gavin@gmail.com, sudeep.holla@arm.com,
	will@kernel.org, kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH RFCv1 0/7] Support Async Page Fault
Date: Tue, 14 Apr 2020 15:39:56 +1000	[thread overview]
Message-ID: <6a1d7e8b-da10-409f-16d0-354004566a1a@redhat.com> (raw)
In-Reply-To: <d2882e806ad2f9793437160093c8d3fa@kernel.org>

Hi Marc,

On 4/10/20 10:52 PM, Marc Zyngier wrote:
> Hi Gavin,
> 
> On 2020-04-10 09:58, Gavin Shan wrote:
>> There are two stages of page faults and the stage one page fault is
>> handled by guest itself. The guest is trapped to host when the page
>> fault is caused by stage 2 page table, for example missing. The guest
>> is suspended until the requested page is populated. To populate the
>> requested page can be costly and might be related to IO activities
>> if the page was swapped out previously. In this case, the guest has
>> to suspend for a few of milliseconds at least, regardless of the
>> overall system load.
>>
>> The series adds support to asychornous page fault to improve above
>> situation. If it's costly to populate the requested page, a signal
>> (PAGE_NOT_PRESENT) is sent to guest so that the faulting process can
>> be rescheduled if it can be. Otherwise, it is put into power-saving
>> mode. Another signal (PAGE_READY) is sent to guest once the requested
>> page is populated so that the faulting process can be waken up either
>> from either waiting state or power-saving state.
>>
>> In order to fulfil the control flow and convey signals between host
>> and guest. A IMPDEF system register (SYS_ASYNC_PF_EL1) is introduced.
>> The register accepts control block's physical address, plus requested
>> features. Also, the signal is sent using data abort with the specific
>> IMPDEF Data Fault Status Code (DFSC). The specific signal is stored
>> in the control block by host, to be consumed by guest.
>>
>> Todo
>> ====
>> * CONFIG_KVM_ASYNC_PF_SYNC is disabled for now because the exception
>>    injection can't work in nested mode. It might be something to be
>>    improved in future.
>> * KVM_ASYNC_PF_SEND_ALWAYS is disabled even with CONFIG_PREEMPTION
>>    because it's simply not working reliably.
>> * Tracepoints, which should something to be done in short term.
>> * kvm-unit-test cases.
>> * More testing and debugging are needed. Sometimes, the guest can be
>>    stuck and the root cause needs to be figured out.
> 
> Let me add another few things:
> 
> - KVM/arm is (supposed to be) an architectural hypervisor. It means
>    that one of the design goal is to have as few differences as possible
>    from the actual hardware. I'm not keen on deviating from it (next
>    thing you know, you'll add all the PV horror from Xen, HV, VMware...).
> 
> - The idea of butchering the arm64 mm subsystem to handle a new exotic
>    style of exceptions is not something I am looking forward to. We
>    might as well PV the whole MMU, Xen style, and be done with it. I'll
>    let the arm64 maintainers comment on this though.
> 

Thanks for your comments. The feature won't be enabled on guest side until
CONFIG_KVM_GUEST is enabled. More details can be found from PATCH[7/7]. So
it would be one specific features supported by KVM. I'm not familiar with
xen and would like to learn how MMU is para-virtualized there. Do you have
documents recommended to start with? Otherwise, I will try google later.

> - We don't add IMPDEF sysregs, period. That's reserved for the HW. If
>    you want to trap, there's the HVC instruction to that effect.
> 

Yes, HVC can be used for trapping as PV stolen time did. However, I guess
it's guarded by specification? For example, the para-virtualized time calls
are specified by DEN0057A, as highlighted in include/linux/arm-smccc.h.

/* Paravirtualised time calls (defined by ARM DEN0057A) */
#define ARM_SMCCC_HV_PV_TIME_FEATURES                           \
         ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,                 \
                            ARM_SMCCC_SMC_64,                    \
                            ARM_SMCCC_OWNER_STANDARD_HYP,        \
                            0x20)

#define ARM_SMCCC_HV_PV_TIME_ST                                 \
         ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,                 \
                            ARM_SMCCC_SMC_64,                    \
                            ARM_SMCCC_OWNER_STANDARD_HYP,        \
                            0x21)

I really don't understand how IMPDEF sysreg is used by hardware vendors.
Do we have an existing functionality, which depends on IMPDEF sysreg?
I was thinking the IMPDEF sysreg can be used by software either, but
it seems I'm wrong.

> - If this is such a great improvement, where are the performance
>    numbers?
> 

Yep, Ineed. I'm still looking for appropriate workload currently and hopefully,
I can share performance data in RFCv2 :)

> - The fact that it apparently cannot work with nesting nor with
>    preemption tends to indicate that it isn't future proof.
> 

I didn't make myself clear about the nesting. The data abort exception is injected
by tweaking ELR_EL1/SPSR_EL1 if the guest is runing in 64-bits and EL1 mode. These
registers are loaded when the guest gets chance to run. However, it's impossible to
inject two (nested) data abort exception at once. It's something different from nested
VM.

There was a hot discusson about the preemption support. It's something in the TODO list
and needs to be sorted out in future.

https://lore.kernel.org/patchwork/patch/1206121/


> Thanks,
> 
> 	M.
> 

Thanks,
Gavin


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-04-14  5:40 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-10  8:58 [PATCH RFCv1 0/7] Support Async Page Fault Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 1/7] kvm/arm64: Rename kvm_vcpu_get_hsr() to kvm_vcpu_get_esr() Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 2/7] kvm/arm64: Detach ESR operator from vCPU struct Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 3/7] kvm/arm64: Replace hsr with esr Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 4/7] kvm/arm64: Export kvm_handle_user_mem_abort() with prefault mode Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 5/7] kvm/arm64: Allow inject data abort with specified DFSC Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 6/7] kvm/arm64: Support async page fault Gavin Shan
2020-04-10  8:58 ` [PATCH RFCv1 7/7] arm64: " Gavin Shan
2020-04-10 12:52 ` [PATCH RFCv1 0/7] Support Async Page Fault Marc Zyngier
2020-04-14  5:39   ` Gavin Shan [this message]
2020-04-14 11:05     ` Mark Rutland
2020-04-16  7:59       ` Gavin Shan
2020-04-16  9:16         ` Mark Rutland
2020-04-16  9:21           ` Will Deacon
2020-04-17 10:34           ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6a1d7e8b-da10-409f-16d0-354004566a1a@redhat.com \
    --to=gshan@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=drjones@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=james.morse@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=shan.gavin@gmail.com \
    --cc=sudeep.holla@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).