KVM Archive on lore.kernel.org
From: Zenghui Yu <yuzenghui@huawei.com>
To: James Morse <james.morse@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>,
	<kvmarm@lists.cs.columbia.edu>, <kvm@vger.kernel.org>,
	<linux-perf-users@vger.kernel.org>, <christoffer.dall@arm.com>,
	<marc.zyngier@arm.com>, <acme@redhat.com>, <peterz@infradead.org>,
	<mingo@redhat.com>, <ganapatrao.kulkarni@cavium.com>,
	<catalin.marinas@arm.com>, <will.deacon@arm.com>,
	<mark.rutland@arm.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <jolsa@redhat.com>,
	<namhyung@kernel.org>, <wanghaibin.wang@huawei.com>,
	<xiexiangyou@huawei.com>, <linuxarm@huawei.com>
Subject: Re: [PATCH v1 2/5] KVM: arm/arm64: Adjust entry/exit and trap related tracepoints
Date: Thu, 13 Jun 2019 19:28:10 +0800
Message-ID: <e78a9798-cce3-a360-37c3-0ad359944b85@huawei.com>
In-Reply-To: <977f8f8c-72b4-0287-4b1c-47a0d6f1fd6e@arm.com>

Hi James,

On 2019/6/12 20:49, James Morse wrote:
> Hi,
> On 12/06/2019 10:08, Zenghui Yu wrote:
>> Currently, we use trace_kvm_exit() to report exception type (e.g.,
>> "IRQ", "TRAP") and exception class (ESR_ELx's bit[31:26]) together.
> (They both caused an exit!)
>> But hardware only saves the exit class to ESR_ELx on synchronous
> EC is the 'Exception Class'. Exit is KVM/Linux's terminology.
Yes, a stupid mistake ;-)

>> exceptions, not on asynchronous exceptions. When the guest exits
>> due to external interrupts, we will get tracing output like:
>> 	"kvm_exit: IRQ: HSR_EC: 0x0000 (UNKNOWN), PC: 0xffff87259e30"
>> Obviously, "HSR_EC" here is meaningless.
> I assume we do it this way so there is only one guest-exit tracepoint that catches all exits.
> I don't think its a problem if user-space has to know the EC isn't set for asynchronous
> exceptions, this is a property of the architecture and anything using these trace-points
> is already arch specific.
Actually, there is *no* problem with the current implementation, and I'm
OK with keeping the EC in trace_kvm_exit().  What I really want to do is
add the EC to trace_kvm_trap_enter() (the new tracepoint); I will explain
why later.

>> This patch splits "exit" and "trap" events by adding two tracepoints
>> explicitly in handle_trap_exceptions(). Let trace_kvm_exit() report VM
>> exit events, and trace_kvm_trap_exit() report VM trap events.
>> These tracepoints are adjusted also in preparation for supporting
>> 'perf kvm stat' on arm64.
> Because the existing tracepoints are ABI, I don't think we can change them.
> We can add new ones if there is something that a user reasonably needs to trace, and can't
> be done any other way.
> What can't 'perf kvm stat' do with the existing trace points?
(A good question! I should have made it clear in the commit message,
  forgive me.)

First, how does 'perf kvm stat' interact with tracepoints?

We have three handlers for a specific event (e.g., "VM-EXIT") --
"is_begin_event", "is_end_event" and "decode_key". The first two handlers
use two existing tracepoints ("kvm:kvm_exit" & "kvm:kvm_entry") to detect
when a VM-EXIT event starts and ends, so that the time difference stats,
event start/end time etc. can be calculated.
The "is_begin_event" handler gets a *key* from the "ret" field (exit_code)
of the "kvm:kvm_exit" payload, and the "decode_key" handler uses that
*key* to find the reason for the VM-EXIT event. Of course, we have to
maintain the mapping between exit_code and exit_reason in userspace.
This is all that *patch #4* does; #4 is a simple patch to review!
Oh, we can also set "vcpu_id_str" to record events per vcpu, but
currently we only have the "vcpu_pc" field in "kvm:kvm_entry", without
anything like "vcpu_id".
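
To make the flow above concrete, here is a minimal sketch of how the
three handlers could cooperate. It is a toy model, not the real
tools/perf code: "struct sample", "struct event_key", the exit_reasons
table (including its values) and total_time_for() are all hypothetical
names and simplifications made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, flattened view of one tracepoint record. */
struct sample {
	const char *name;	/* e.g. "kvm:kvm_exit" */
	uint64_t time_ns;	/* timestamp of the record */
	int ret;		/* exit_code; only meaningful for kvm_exit */
};

/* Hypothetical key extracted by the begin handler. */
struct event_key {
	int key;
};

/* Userspace copy of the exit_code -> exit_reason mapping (assumed values). */
static const char *const exit_reasons[] = { "IRQ", "EL1_SERROR", "TRAP" };

/* A VM-EXIT event begins at kvm:kvm_exit; its "ret" field becomes the key. */
int is_begin_event(const struct sample *s, struct event_key *k)
{
	if (strcmp(s->name, "kvm:kvm_exit") != 0)
		return 0;
	k->key = s->ret;
	return 1;
}

/* A VM-EXIT event ends when we enter the guest again. */
int is_end_event(const struct sample *s)
{
	return strcmp(s->name, "kvm:kvm_entry") == 0;
}

/* Translate the key back into a human-readable exit reason. */
const char *decode_key(const struct event_key *k)
{
	size_t n = sizeof(exit_reasons) / sizeof(exit_reasons[0]);

	if (k->key < 0 || (size_t)k->key >= n)
		return "UNKNOWN";
	return exit_reasons[k->key];
}

/* Sum the begin->end time of every event whose key equals "key". */
uint64_t total_time_for(const struct sample *s, size_t n, int key)
{
	struct event_key k = { -1 };
	uint64_t begin = 0, total = 0;
	int open = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (is_begin_event(&s[i], &k)) {
			begin = s[i].time_ns;
			open = 1;
		} else if (open && is_end_event(&s[i])) {
			if (k.key == key)
				total += s[i].time_ns - begin;
			open = 0;
		}
	}
	return total;
}
```

Note that nothing in this sketch can tell two vcpus apart, because the
records carry no vcpu identifier; that is the gap a "vcpu_id" field
would fill.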

perf people must have a much deeper understanding of this.

OK, next comes the more important question - what should/can we do to
the tracepoints in preparation for 'perf kvm stat' on arm64?

From the article you've provided, it's clear that we can't remove the EC
from trace_kvm_exit(). But can we add something like "vcpu_id" to
(at least) trace_kvm_entry(), as this patch does?
If not, meaning we have to keep the existing tracepoints completely
unchanged, then 'perf kvm stat' will have no way to record/report per
vcpu VM-EXIT events (other archs like x86, powerpc, s390 etc. have this
capability, if I understand it correctly).

As for TRAP events, should we consider adding two new tracepoints --
"kvm_trap_enter" and "kvm_trap_exit" -- to keep track of the trap
handling process? We should also record the EC in "kvm_trap_enter",
which will be used as the *key* in the TRAP event's "is_begin_event"
handler. Patch #5 tells the whole story; it's simple too.
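
As a rough illustration of what using the EC as a key could look like
on the userspace side, here is a small sketch. The EC lives in
ESR_ELx bits [31:26]; the helper names esr_to_ec() and
decode_trap_key() and the short trap_reasons table are hypothetical
and only cover a few exception classes, so this is not the code from
patch #5.

```c
#include <stddef.h>
#include <stdint.h>

/* The EC field occupies ESR_ELx bits [31:26]. */
#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_MASK		0x3FU

/* Hypothetical entry in a userspace EC -> trap reason table. */
struct ec_name {
	uint8_t ec;
	const char *name;
};

/* A deliberately incomplete subset of the arm64 exception classes. */
static const struct ec_name trap_reasons[] = {
	{ 0x01, "WFx" },
	{ 0x16, "HVC64" },
	{ 0x17, "SMC64" },
	{ 0x18, "SYS64" },
	{ 0x20, "IABT_LOW" },
	{ 0x24, "DABT_LOW" },
};

/* Extract the exception class from a raw ESR_ELx value. */
uint8_t esr_to_ec(uint32_t esr)
{
	return (uint8_t)((esr >> ESR_ELx_EC_SHIFT) & ESR_ELx_EC_MASK);
}

/* Map an EC (the TRAP event key) to a printable reason. */
const char *decode_trap_key(uint8_t ec)
{
	size_t i;

	for (i = 0; i < sizeof(trap_reasons) / sizeof(trap_reasons[0]); i++) {
		if (trap_reasons[i].ec == ec)
			return trap_reasons[i].name;
	}
	return "UNRECOGNIZED";
}
```

In the patch itself the EC passed to trace_kvm_trap_enter() comes from
kvm_vcpu_trap_get_class(), so userspace would only need the table
lookup; esr_to_ec() is shown just to make the bit layout explicit.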

What do you suggest?

>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index 516aead..af3c732 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -264,7 +264,10 @@ static int handle_trap_exceptions(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>   		exit_handle_fn exit_handler;
>>   		exit_handler = kvm_get_exit_handler(vcpu);
>> +		trace_kvm_trap_enter(vcpu->vcpu_id,
>> +				     kvm_vcpu_trap_get_class(vcpu));
>>   		handled = exit_handler(vcpu, run);
>> +		trace_kvm_trap_exit(vcpu->vcpu_id);
>>   	}
> Why are there two? Are you using this to benchmark the exit_handler()?
Mostly, yes. They let perf know when the TRAP handling event starts and
ends, and ...

> As we can't remove the EC from the exit event, I don't think this tells us anything new.
As explained above, this EC is for 'perf kvm stat'.

>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 90cedeb..9f63fd9 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -758,7 +758,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>   		/**************************************************************
>>   		 * Enter the guest
>>   		 */
>> -		trace_kvm_entry(*vcpu_pc(vcpu));
>> +		trace_kvm_entry(vcpu->vcpu_id, *vcpu_pc(vcpu));
> Why do you need the PC? It was exported on exit.
> (its mostly junk for user-space anyway, you can't infer anything from it)
(I mainly wanted to add the "vcpu->vcpu_id" here.)
It seems that we can't just remove the PC, since that would be an ABI
change?

Thanks for your review!




Thread overview: 12+ messages
2019-06-12  9:08 [PATCH v1 0/5] perf kvm: Add stat support on arm64 Zenghui Yu
2019-06-12  9:08 ` [PATCH v1 1/5] KVM: arm/arm64: Remove kvm_mmio_emulate tracepoint Zenghui Yu
2019-06-12 12:48   ` James Morse
2019-06-13 11:20     ` Zenghui Yu
2019-06-12  9:08 ` [PATCH v1 2/5] KVM: arm/arm64: Adjust entry/exit and trap related tracepoints Zenghui Yu
2019-06-12 12:49   ` James Morse
2019-06-13 11:28     ` Zenghui Yu [this message]
2019-06-17 11:19       ` James Morse
2019-06-21 13:25         ` Zenghui Yu
2019-06-12  9:08 ` [PATCH v1 3/5] perf tools arm64: Add support for get_cpuid() function Zenghui Yu
2019-06-12  9:08 ` [PATCH v1 4/5] perf,kvm/arm64: Add stat support on arm64 Zenghui Yu
2019-06-12  9:08 ` [PATCH v1 5/5] perf,kvm/arm64: perf-kvm-stat to report VM TRAP Zenghui Yu
