From: Shijith Thotton <sthotton@marvell.com>
To: Julien Thierry <Julien.Thierry@arm.com>,
	Steven Price <Steven.Price@arm.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>
Cc: Mark Rutland <Mark.Rutland@arm.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	"acme@kernel.org" <acme@kernel.org>,
	"alexander.shishkin@linux.intel.com"
	<alexander.shishkin@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"namhyung@kernel.org" <namhyung@kernel.org>,
	"jolsa@redhat.com" <jolsa@redhat.com>,
	"liwei391@huawei.com" <liwei391@huawei.com>
Subject: Re: [PATCH v3 1/9] arm64: perf: avoid PMXEV* indirection
Date: Wed, 17 Jul 2019 04:45:47 +0000
Message-ID: <374a9f8f-6d1d-a43c-1e25-ab32fcb63b02@marvell.com>
In-Reply-To: <750864d6-543b-32a4-9b90-4a928c824a4b@arm.com>

Hi Julien,

On 7/16/19 3:54 AM, Julien Thierry wrote:
> On 16/07/2019 11:33, Shijith Thotton wrote:
>> On 7/10/19 4:01 AM, Julien Thierry wrote:
>>> On 10/07/2019 11:57, Steven Price wrote:
>>>> On 08/07/2019 15:32, Julien Thierry wrote:
>>>>> From: Mark Rutland <mark.rutland@arm.com>
>>>>>
>>>>> Currently we access the counter registers and their respective type
>>>>> registers indirectly. This requires us to write to PMSELR, issue an ISB,
>>>>> then access the relevant PMXEV* registers.
>>>>>
>>>>> This is unfortunate, because:
>>>>>
>>>>> * Under virtualization, accessing one register requires two traps to
>>>>>     the hypervisor, even though we could access the register directly with
>>>>>     a single trap.
>>>>>
>>>>> * We have to issue an ISB which we could otherwise avoid the cost of.
>>>>>
>>>>> * When we use NMIs, the NMI handler will have to save/restore the select
>>>>>     register in case the code it preempted was attempting to access a
>>>>>     counter or its type register.
>>>>>
>>>>> We can avoid these issues by directly accessing the relevant registers.
>>>>> This patch adds helpers to do so.
>>>>>
>>>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>>>> [Julien T.: Don't inline read/write functions to avoid big code-size
>>>>>     increase, remove unused read_pmevtypern function,
>>>>>     fix counter index issue.]
>>>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>> ---
>>>>>    arch/arm64/kernel/perf_event.c | 96 ++++++++++++++++++++++++++++++++++++------
>>>>>    1 file changed, 83 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>>>>> index 96e90e2..7759f8a 100644
>>>>> --- a/arch/arm64/kernel/perf_event.c
>>>>> +++ b/arch/arm64/kernel/perf_event.c
>>>>> @@ -369,6 +369,77 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>>>>>    #define  ARMV8_IDX_TO_COUNTER(x) \
>>>>>     (((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
>>>>>
>>>>> +/*
>>>>> + * This code is really good
>>>>> + */
>>>>> +
>>>>> +#define PMEVN_CASE(n, case_macro) \
>>>>> +  case n: case_macro(n); break
>>>>> +
>>>>> +#define PMEVN_SWITCH(x, case_macro)                               \
>>>>> +  do {                                                    \
>>>>> +          switch (x) {                                    \
>>>>> +          PMEVN_CASE(0,  case_macro);                     \
>>>>> +          PMEVN_CASE(1,  case_macro);                     \
>>>>> +          PMEVN_CASE(2,  case_macro);                     \
>>>>> +          PMEVN_CASE(3,  case_macro);                     \
>>>>> +          PMEVN_CASE(4,  case_macro);                     \
>>>>> +          PMEVN_CASE(5,  case_macro);                     \
>>>>> +          PMEVN_CASE(6,  case_macro);                     \
>>>>> +          PMEVN_CASE(7,  case_macro);                     \
>>>>> +          PMEVN_CASE(8,  case_macro);                     \
>>>>> +          PMEVN_CASE(9,  case_macro);                     \
>>>>> +          PMEVN_CASE(10, case_macro);                     \
>>>>> +          PMEVN_CASE(11, case_macro);                     \
>>>>> +          PMEVN_CASE(12, case_macro);                     \
>>>>> +          PMEVN_CASE(13, case_macro);                     \
>>>>> +          PMEVN_CASE(14, case_macro);                     \
>>>>> +          PMEVN_CASE(15, case_macro);                     \
>>>>> +          PMEVN_CASE(16, case_macro);                     \
>>>>> +          PMEVN_CASE(17, case_macro);                     \
>>>>> +          PMEVN_CASE(18, case_macro);                     \
>>>>> +          PMEVN_CASE(19, case_macro);                     \
>>>>
>>>> Is 20 missing on purpose?
>>>>
>>>
>>> That would have been fun to debug. Well spotted!
>>>
>>> I'll fix it in the next version.
>>>
>>> Thanks,
>>>
>>
>> Tried perf top/record with this patch and both are working fine.
>> Output of perf record on "iperf -c 127.0.0.1 -t 30" is below. (single core)
>>
>> With Pseudo-NMI:
>>       20.35%  [k] lock_acquire
>>       16.95%  [k] lock_release
>>       11.02%  [k] __arch_copy_from_user
>>        7.78%  [k] lock_is_held_type
>>        2.12%  [k] ipt_do_table
>>        1.34%  [k] kmem_cache_free
>>        1.25%  [k] _raw_spin_unlock_irqrestore
>>        1.21%  [k] __nf_conntrack_find_get
>>        1.06%  [k] get_page_from_freelist
>>        1.04%  [k] ktime_get
>>        1.03%  [k] kfree
>>        1.00%  [k] nf_conntrack_tcp_packet
>>        0.96%  [k] tcp_sendmsg_locked
>>        0.87%  [k] __softirqentry_text_start
>>        0.87%  [k] process_backlog
>>        0.76%  [k] __local_bh_enable_ip
>>        0.75%  [k] ip_finish_output2
>>        0.68%  [k] __tcp_transmit_skb
>>        0.62%  [k] enqueue_to_backlog
>>        0.60%  [k] __lock_acquire.isra.17
>>        0.58%  [k] __free_pages_ok
>>        0.54%  [k] nf_conntrack_in
>>
>> With IRQ:
>>       16.49%  [k] __arch_copy_from_user
>>       12.38%  [k] _raw_spin_unlock_irqrestore
>>        9.41%  [k] lock_acquire
>>        6.92%  [k] lock_release
>>        3.71%  [k] lock_is_held_type
>>        2.80%  [k] ipt_do_table
>>        2.06%  [k] get_page_from_freelist
>>        1.82%  [k] ktime_get
>>        1.73%  [k] process_backlog
>>        1.27%  [k] nf_conntrack_tcp_packet
>>        1.21%  [k] enqueue_to_backlog
>>        1.17%  [k] __tcp_transmit_skb
>>        1.12%  [k] ip_finish_output2
>>        1.11%  [k] tcp_sendmsg_locked
>>        1.06%  [k] __free_pages_ok
>>        1.05%  [k] tcp_ack
>>        0.99%  [k] __netif_receive_skb_core
>>        0.88%  [k] __nf_conntrack_find_get
>>        0.71%  [k] nf_conntrack_in
>>        0.61%  [k] kmem_cache_free
>>        0.59%  [k] kfree
>>        0.57%  [k] __alloc_pages_nodemask
>>
>> Thanks Julien and Wei,
>> Tested-by: Shijith Thotton <sthotton@marvell.com>
>>
> 
> Thanks for testing this and confirming the improvement.
> 
> I'm gonna post a new version soon. Is it alright if I apply this tag for
> the other arm64 patches that enable the use of Pseudo-NMI for the PMU?
> (I'm mostly thinking of patches 8 and 9 since there haven't been
> comments on them and they won't have behavioural changes in the next version).
> 

Yes please.

Thanks,
Shijith
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
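
Note on the macro quoted above: PMEVN_SWITCH expands to one case per
counter index, so a skipped case (such as the missing 20 that Steven
spotted) means that index silently matches nothing. The following is a
minimal user-space sketch of the same pattern, not the kernel code.
fake_pmevcntr, RETURN_READ_FAKE, read_counter and check_all_indices are
hypothetical names, a plain array stands in for the PMEVCNTR<n>_EL0
registers, and the 0..30 range assumes the 31 event counters that the
ARMv8 PMU architecture allows.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 31	/* assumption: event counters 0..30 */

/* Hypothetical stand-in for the PMEVCNTR<n>_EL0 registers. */
uint32_t fake_pmevcntr[NUM_COUNTERS];

#define PMEVN_CASE(n, case_macro) \
	case n: case_macro(n); break

#define PMEVN_SWITCH(x, case_macro)		\
	do {					\
		switch (x) {			\
		PMEVN_CASE(0,  case_macro);	\
		PMEVN_CASE(1,  case_macro);	\
		PMEVN_CASE(2,  case_macro);	\
		PMEVN_CASE(3,  case_macro);	\
		PMEVN_CASE(4,  case_macro);	\
		PMEVN_CASE(5,  case_macro);	\
		PMEVN_CASE(6,  case_macro);	\
		PMEVN_CASE(7,  case_macro);	\
		PMEVN_CASE(8,  case_macro);	\
		PMEVN_CASE(9,  case_macro);	\
		PMEVN_CASE(10, case_macro);	\
		PMEVN_CASE(11, case_macro);	\
		PMEVN_CASE(12, case_macro);	\
		PMEVN_CASE(13, case_macro);	\
		PMEVN_CASE(14, case_macro);	\
		PMEVN_CASE(15, case_macro);	\
		PMEVN_CASE(16, case_macro);	\
		PMEVN_CASE(17, case_macro);	\
		PMEVN_CASE(18, case_macro);	\
		PMEVN_CASE(19, case_macro);	\
		PMEVN_CASE(20, case_macro);	\
		PMEVN_CASE(21, case_macro);	\
		PMEVN_CASE(22, case_macro);	\
		PMEVN_CASE(23, case_macro);	\
		PMEVN_CASE(24, case_macro);	\
		PMEVN_CASE(25, case_macro);	\
		PMEVN_CASE(26, case_macro);	\
		PMEVN_CASE(27, case_macro);	\
		PMEVN_CASE(28, case_macro);	\
		PMEVN_CASE(29, case_macro);	\
		PMEVN_CASE(30, case_macro);	\
		default:			\
			break;			\
		}				\
	} while (0)

/*
 * A direct register access needs the register named as a literal, which
 * is why each index gets its own case. Here the per-case macro simply
 * returns the matching array slot.
 */
#define RETURN_READ_FAKE(n) \
	return fake_pmevcntr[n]

/* Returns the counter value, or -1 if idx matched no case. */
int64_t read_counter(int idx)
{
	PMEVN_SWITCH(idx, RETURN_READ_FAKE);
	return -1;
}

/* Walk every valid index so that a skipped case trips an assert. */
void check_all_indices(void)
{
	for (int i = 0; i < NUM_COUNTERS; i++) {
		fake_pmevcntr[i] = 100u + (uint32_t)i;
		assert(read_counter(i) == 100 + i);
	}
	/* One past the last counter must fall through to the default. */
	assert(read_counter(NUM_COUNTERS) == -1);
}
```

Calling check_all_indices() once is enough to catch a dropped case: with
case 20 removed from the switch, read_counter(20) would return -1 and the
assert inside the loop would fail.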

