From: Song Liu <songliubraving@fb.com>
To: linux-kernel@vger.kernel.org
Cc: Song Liu, Peter Zijlstra, Arnaldo Carvalho de Melo, Jiri Olsa,
	Alexey Budankov, Namhyung Kim, Tejun Heo
Subject: [PATCH v6] perf: Sharing PMU counters across compatible events
Date: Wed, 18 Sep 2019 22:23:14 -0700
Message-ID: <20190919052314.2925604-1-songliubraving@fb.com>

This patch tries to enable PMU sharing. To make perf event scheduling
fast, we use special data structures.

An array of "struct perf_event_dup" is added to the perf_event_context,
to remember all the duplicated events under this ctx. All the events
under this ctx have a "dup_id" pointing to their perf_event_dup.
Compatible events under the same ctx share the same perf_event_dup.

The following figure shows a simplified version of the data structure.

      ctx ->  perf_event_dup -> master
                             ^
                             |
                perf_event  /|
                             |
                perf_event  /

Connections between perf_event and perf_event_dup are built when events
are added to or removed from the ctx. So this work is not on the
critical path of schedule or perf_rotate_context().

On the critical paths (add, del, read), sharing PMU counters does not
increase the complexity. Helper functions event_pmu_[add|del|read]() are
introduced to cover these cases. All these functions have O(1) time
complexity.

We allocate a separate perf_event for perf_event_dup->master. This needs
extra attention, because perf_event_alloc() may sleep. To allocate the
master event properly, a new pointer, tmp_master, is added to perf_event.
tmp_master carries a separate perf_event into list_[add|del]_event().
The master event has a valid ->ctx and holds ctx->refcount. Details about
the handling of the master event are added to include/linux/perf_event.h,
before struct perf_event_dup.

Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Alexey Budankov
Cc: Namhyung Kim
Cc: Tejun Heo
Signed-off-by: Song Liu <songliubraving@fb.com>
---
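Note for reviewers: below is a small user-space model of the counting
scheme, for illustration only (it is not kernel code; dup_slot,
sync_from_master and the numbers are made up). It mirrors what
event_sync_dup_count() does against the shared counter: each event
remembers the master value from its last sync in dup_base_count and
folds in only the delta, so several events can sit on one hardware
counter and still read back as independent counts.

  #include <stdio.h>
  #include <stdint.h>

  struct dup_slot {                      /* models struct perf_event_dup */
          uint64_t master_count;         /* value of the shared HW counter */
          int active_event_count;
  };

  struct event {                         /* models the new perf_event fields */
          uint64_t count;                /* value reported for this event */
          uint64_t dup_base_count;       /* master value at the last sync */
  };

  /* models event_sync_dup_count(): fold in only the delta since last sync */
  static void sync_from_master(struct event *ev, struct dup_slot *slot)
  {
          ev->count += slot->master_count - ev->dup_base_count;
          ev->dup_base_count = slot->master_count;
  }

  int main(void)
  {
          struct dup_slot slot = { .master_count = 0, .active_event_count = 2 };
          struct event a = { 0, 0 }, b = { 0, 0 };

          slot.master_count = 1000;      /* shared counter advanced by 1000 */
          sync_from_master(&a, &slot);   /* a reads 1000 */

          slot.master_count = 1800;      /* advanced by another 800 */
          sync_from_master(&a, &slot);   /* a reads 1000 + 800 */
          sync_from_master(&b, &slot);   /* b reads 1800 in one step */

          printf("a=%llu b=%llu\n",
                 (unsigned long long)a.count, (unsigned long long)b.count);
          return 0;
  }
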
 include/linux/perf_event.h |  61 ++++++++
 kernel/events/core.c       | 294 ++++++++++++++++++++++++++++++++++---
 2 files changed, 332 insertions(+), 23 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 61448c19a132..a694e5eee80a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -722,6 +722,12 @@ struct perf_event {
 #endif
 
 	struct list_head		sb_list;
+
+	/* for PMU sharing */
+	struct perf_event		*tmp_master;
+	int				dup_id;
+	u64				dup_base_count;
+	u64				dup_base_child_count;
 #endif /* CONFIG_PERF_EVENTS */
 };
 
@@ -731,6 +737,58 @@ struct perf_event_groups {
 	u64		index;
 };
 
+/*
+ * Sharing PMU across compatible events
+ *
+ * If two perf_events in the same perf_event_context are counting the same
+ * hardware events (instructions, cycles, etc.), they could share the
+ * hardware PMU counter.
+ *
+ * When a perf_event is added to the ctx (list_add_event), it is compared
+ * against other events in the ctx. If they can share the PMU counter,
+ * a perf_event_dup is allocated to represent the sharing.
+ *
+ * Each perf_event_dup has a virtual master event, which is passed to
+ * pmu->add() and pmu->del(). We cannot call perf_event_alloc() in
+ * list_add_event(), so it is allocated and carried by event->tmp_master
+ * into list_add_event().
+ *
+ * Virtual master in different cases/paths:
+ *
+ * < I > perf_event_open() -> close() path:
+ *
+ * 1. Allocated by perf_event_alloc() in sys_perf_event_open();
+ * 2. event->tmp_master->ctx assigned in perf_install_in_context();
+ * 3.a. if used by ctx->dup_events, freed in perf_event_release_kernel();
+ * 3.b. if not used by ctx->dup_events, freed in perf_event_open().
+ *
+ * < II > inherit_event() path:
+ *
+ * 1. Allocated by perf_event_alloc() in inherit_event();
+ * 2. tmp_master->ctx assigned in inherit_event();
+ * 3.a. if used by ctx->dup_events, freed in perf_event_release_kernel();
+ * 3.b. if not used by ctx->dup_events, freed in inherit_event().
+ *
+ * < III > perf_pmu_migrate_context() path:
+ * all dup_events removed during migration (no sharing after the move).
+ *
+ * < IV > perf_event_create_kernel_counter() path:
+ * not supported yet.
+ */
+struct perf_event_dup {
+	/*
+	 * The master event passed to pmu->add() and pmu->del().
+	 * This event is allocated with perf_event_alloc(). When
+	 * attached to a ctx, this event should hold ctx->refcount.
+	 */
+	struct perf_event	*master;
+	/* number of events in the ctx that share the master */
+	int			total_event_count;
+	/* number of active events of the master */
+	int			active_event_count;
+};
+
+#define MAX_PERF_EVENT_DUP_PER_CTX	4
 /**
  * struct perf_event_context - event context structure
  *
@@ -791,6 +849,9 @@ struct perf_event_context {
 #endif
 	void			*task_ctx_data; /* pmu specific data */
 	struct rcu_head		rcu_head;
+
+	/* for PMU sharing. array is needed for O(1) access */
+	struct perf_event_dup	dup_events[MAX_PERF_EVENT_DUP_PER_CTX];
 };
 
 /*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4f08b17d6426..973425e9de9b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1657,6 +1657,142 @@ perf_event_groups_next(struct perf_event *event)
 		event = rb_entry_safe(rb_next(&event->group_node),	\
 				typeof(*event), group_node))
 
+static void _free_event(struct perf_event *event);
+
+/* free event->tmp_master */
+static inline void perf_event_free_tmp_master(struct perf_event *event)
+{
+	if (event->tmp_master) {
+		_free_event(event->tmp_master);
+		event->tmp_master = NULL;
+	}
+}
+
+static inline u64 dup_read_count(struct perf_event_dup *dup)
+{
+	return local64_read(&dup->master->count);
+}
+
+static inline u64 dup_read_child_count(struct perf_event_dup *dup)
+{
+	return atomic64_read(&dup->master->child_count);
+}
+
+/* Returns whether a perf_event can share a PMU counter with other events */
+static inline bool perf_event_can_share(struct perf_event *event)
+{
+	/* only do sharing for hardware events */
+	if (is_software_event(event))
+		return false;
+
+	/*
+	 * Limit sharing to counting events.
+	 * perf-stat sets PERF_SAMPLE_IDENTIFIER for counting events, so
+	 * let that in.
+	 */
+	if (event->attr.sample_type & ~PERF_SAMPLE_IDENTIFIER)
+		return false;
+
+	return true;
+}
+
+/*
+ * Returns whether the two events can share a PMU counter.
+ *
+ * Note: This function does NOT check perf_event_can_share() for the two
+ * events; that should be checked before calling this function.
+ */
+static inline bool perf_event_compatible(struct perf_event *event_a,
+					 struct perf_event *event_b)
+{
+	return event_a->attr.type == event_b->attr.type &&
+	       event_a->attr.config == event_b->attr.config &&
+	       event_a->attr.config1 == event_b->attr.config1 &&
+	       event_a->attr.config2 == event_b->attr.config2;
+}
+
+/*
+ * After adding an event to the ctx, try to find compatible event(s).
+ *
+ * This function should only be called at the end of list_add_event().
+ * Master event cannot be allocated or freed within this function. To add
+ * a new master event, the event should already have a master event
+ * allocated (event->tmp_master).
+ */
+static inline void perf_event_setup_dup(struct perf_event *event,
+					struct perf_event_context *ctx)
+
+{
+	struct perf_event *tmp;
+	int empty_slot = -1;
+	int match;
+	int i;
+
+	if (!perf_event_can_share(event))
+		return;
+
+	/* look for sharing with existing dup events */
+	for (i = 0; i < MAX_PERF_EVENT_DUP_PER_CTX; i++) {
+		if (ctx->dup_events[i].master == NULL) {
+			if (empty_slot == -1)
+				empty_slot = i;
+			continue;
+		}
+
+		if (perf_event_compatible(event, ctx->dup_events[i].master)) {
+			event->dup_id = i;
+			ctx->dup_events[i].total_event_count++;
+			return;
+		}
+	}
+
+	if (!event->tmp_master ||	/* perf_event_alloc() failed */
+	    empty_slot == -1)		/* no available dup_event */
+		return;
+
+	match = 0;
+	/* look for dup with other events */
+	list_for_each_entry(tmp, &ctx->event_list, event_entry) {
+		if (tmp == event || tmp->dup_id != -1 ||
+		    !perf_event_can_share(tmp) ||
+		    !perf_event_compatible(event, tmp))
+			continue;
+
+		tmp->dup_id = empty_slot;
+		match++;
+	}
+
+	/* found at least one dup */
+	if (match) {
+		ctx->dup_events[empty_slot].master = event->tmp_master;
+		ctx->dup_events[empty_slot].total_event_count = match + 1;
+		event->dup_id = empty_slot;
+		event->tmp_master = NULL;
+		return;
+	}
+}
+
+/*
+ * Remove the event from ctx->dup_events.
+ * This function should only be called from list_del_event(). Similar to
+ * perf_event_setup_dup(), we cannot call _free_event(master). If a master
+ * event should be freed, it is carried out of this function by the event
+ * (event->tmp_master).
+ */
+static void perf_event_remove_dup(struct perf_event *event,
+				  struct perf_event_context *ctx)
+
+{
+	if (event->dup_id == -1)
+		return;
+
+	if (--ctx->dup_events[event->dup_id].total_event_count == 0) {
+		event->tmp_master = ctx->dup_events[event->dup_id].master;
+		ctx->dup_events[event->dup_id].master = NULL;
+	}
+	event->dup_id = -1;
+}
+
 /*
  * Add an event from the lists for its context.
  * Must be called with ctx->mutex and ctx->lock held.
@@ -1689,6 +1825,7 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 		ctx->nr_stat++;
 
 	ctx->generation++;
+	perf_event_setup_dup(event, ctx);
 }
 
 /*
@@ -1855,6 +1992,7 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	WARN_ON_ONCE(event->ctx != ctx);
 	lockdep_assert_held(&ctx->lock);
 
+	perf_event_remove_dup(event, ctx);
 	/*
 	 * We can have double detach due to exit/hot-unplug + close.
 	 */
@@ -2069,6 +2207,84 @@ event_filter_match(struct perf_event *event)
 	       perf_cgroup_match(event) && pmu_filter_match(event);
 }
 
+/* PMU sharing aware version of event->pmu->add() */
+static int event_pmu_add(struct perf_event *event,
+			 struct perf_event_context *ctx)
+{
+	struct perf_event_dup *dup;
+	int ret;
+
+	/* no sharing, just do event->pmu->add() */
+	if (event->dup_id == -1)
+		return event->pmu->add(event, PERF_EF_START);
+
+	dup = &ctx->dup_events[event->dup_id];
+
+	if (!dup->active_event_count) {
+		ret = event->pmu->add(dup->master, PERF_EF_START);
+		if (ret)
+			return ret;
+	}
+
+	dup->active_event_count++;
+	dup->master->pmu->read(dup->master);
+	event->dup_base_count = dup_read_count(dup);
+	event->dup_base_child_count = dup_read_child_count(dup);
+	return 0;
+}
+
+/*
+ * sync data count from dup->master to event, called on event_pmu_read()
+ * and event_pmu_del()
+ */
+static void event_sync_dup_count(struct perf_event *event,
+				 struct perf_event_dup *dup)
+{
+	u64 new_count;
+	u64 new_child_count;
+
+	event->pmu->read(dup->master);
+	new_count = dup_read_count(dup);
+	new_child_count = dup_read_child_count(dup);
+
+	local64_add(new_count - event->dup_base_count, &event->count);
+	atomic64_add(new_child_count - event->dup_base_child_count,
+		     &event->child_count);
+
+	event->dup_base_count = new_count;
+	event->dup_base_child_count = new_child_count;
+}
+
+/* PMU sharing aware version of event->pmu->del() */
+static void event_pmu_del(struct perf_event *event,
+			  struct perf_event_context *ctx)
+{
+	struct perf_event_dup *dup;
+
+	if (event->dup_id == -1) {
+		event->pmu->del(event, 0);
+		return;
+	}
+
+	dup = &ctx->dup_events[event->dup_id];
+	event_sync_dup_count(event, dup);
+	if (--dup->active_event_count == 0)
+		event->pmu->del(dup->master, 0);
+}
+
+/* PMU sharing aware version of event->pmu->read() */
+static void event_pmu_read(struct perf_event *event)
+{
+	struct perf_event_dup *dup;
+
+	if (event->dup_id == -1) {
+		event->pmu->read(event);
+		return;
+	}
+	dup = &event->ctx->dup_events[event->dup_id];
+	event_sync_dup_count(event, dup);
+}
+
 static void
 event_sched_out(struct perf_event *event,
 		struct perf_cpu_context *cpuctx,
@@ -2091,7 +2307,7 @@ event_sched_out(struct perf_event *event,
 
 	perf_pmu_disable(event->pmu);
 
-	event->pmu->del(event, 0);
+	event_pmu_del(event, ctx);
 	event->oncpu = -1;
 
 	if (READ_ONCE(event->pending_disable) >= 0) {
@@ -2364,7 +2580,7 @@ event_sched_in(struct perf_event *event,
 
 	perf_log_itrace_start(event);
 
-	if (event->pmu->add(event, PERF_EF_START)) {
+	if (event_pmu_add(event, ctx)) {
 		perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
 		event->oncpu = -1;
 		ret = -EAGAIN;
@@ -2612,20 +2828,9 @@ static int __perf_install_in_context(void *info)
 		raw_spin_lock(&task_ctx->lock);
 	}
 
-#ifdef CONFIG_CGROUP_PERF
-	if (is_cgroup_event(event)) {
-		/*
-		 * If the current cgroup doesn't match the event's
-		 * cgroup, we should not try to schedule it.
-		 */
-		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
-		reprogram = cgroup_is_descendant(cgrp->css.cgroup,
-					event->cgrp->css.cgroup);
-	}
-#endif
-
 	if (reprogram) {
-		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
+		/* schedule out all events to set up dup properly */
+		ctx_sched_out(ctx, cpuctx, EVENT_ALL);
 		add_event_to_ctx(event, ctx);
 		ctx_resched(cpuctx, task_ctx, get_event_type(event));
 	} else {
@@ -2664,6 +2869,10 @@ perf_install_in_context(struct perf_event_context *ctx,
 	 * Ensures that if we can observe event->ctx, both the event and ctx
 	 * will be 'complete'. See perf_iterate_sb_cpu().
 	 */
+	if (event->tmp_master) {
+		event->tmp_master->ctx = ctx;
+		get_ctx(ctx);
+	}
 	smp_store_release(&event->ctx, ctx);
 
 	if (!task) {
@@ -3115,7 +3324,7 @@ static void __perf_event_sync_stat(struct perf_event *event,
 	 * don't need to use it.
 	 */
 	if (event->state == PERF_EVENT_STATE_ACTIVE)
-		event->pmu->read(event);
+		event_pmu_read(event);
 
 	perf_event_update_time(event);
 
@@ -3967,14 +4176,14 @@ static void __perf_event_read(void *info)
 		goto unlock;
 
 	if (!data->group) {
-		pmu->read(event);
+		event_pmu_read(event);
 		data->ret = 0;
 		goto unlock;
 	}
 
 	pmu->start_txn(pmu, PERF_PMU_TXN_READ);
 
-	pmu->read(event);
+	event_pmu_read(event);
 
 	for_each_sibling_event(sub, event) {
 		if (sub->state == PERF_EVENT_STATE_ACTIVE) {
@@ -3982,7 +4191,7 @@ static void __perf_event_read(void *info)
 			 * Use sibling's PMU rather than @event's since
 			 * sibling could be on different (eg: software) PMU.
 			 */
-			sub->pmu->read(sub);
+			event_pmu_read(sub);
 		}
 	}
 
@@ -4052,7 +4261,7 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 	 * oncpu == -1).
 	 */
 	if (event->oncpu == smp_processor_id())
-		event->pmu->read(event);
+		event_pmu_read(event);
 
 	*value = local64_read(&event->count);
 	if (enabled || running) {
@@ -4587,6 +4796,7 @@ static void free_event(struct perf_event *event)
 		return;
 	}
 
+	perf_event_free_tmp_master(event);
 	_free_event(event);
 }
 
@@ -4646,6 +4856,7 @@ static void put_event(struct perf_event *event)
 	if (!atomic_long_dec_and_test(&event->refcount))
 		return;
 
+	perf_event_free_tmp_master(event);
 	_free_event(event);
 }
 
@@ -6261,7 +6472,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 
 	if ((leader != event) &&
 	    (leader->state == PERF_EVENT_STATE_ACTIVE))
-		leader->pmu->read(leader);
+		event_pmu_read(leader);
 
 	values[n++] = perf_event_count(leader);
 	if (read_format & PERF_FORMAT_ID)
@@ -6274,7 +6485,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 
 		if ((sub != event) &&
 		    (sub->state == PERF_EVENT_STATE_ACTIVE))
-			sub->pmu->read(sub);
+			event_pmu_read(sub);
 
 		values[n++] = perf_event_count(sub);
 		if (read_format & PERF_FORMAT_ID)
@@ -9539,7 +9750,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
 	if (event->state != PERF_EVENT_STATE_ACTIVE)
 		return HRTIMER_NORESTART;
 
-	event->pmu->read(event);
+	event_pmu_read(event);
 
 	perf_sample_data_init(&data, 0, event->hw.last_period);
 	regs = get_irq_regs();
@@ -10430,6 +10641,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	event->id		= atomic64_inc_return(&perf_event_id);
 
 	event->state		= PERF_EVENT_STATE_INACTIVE;
+	event->dup_id		= -1;
 
 	if (task) {
 		event->attach_state = PERF_ATTACH_TASK;
@@ -10986,6 +11198,14 @@ SYSCALL_DEFINE5(perf_event_open,
 		goto err_cred;
 	}
 
+	if (perf_event_can_share(event)) {
+		event->tmp_master = perf_event_alloc(&event->attr, cpu,
+						     task, NULL, NULL,
+						     NULL, NULL, -1);
+		if (IS_ERR(event->tmp_master))
+			event->tmp_master = NULL;
+	}
+
 	if (is_sampling_event(event)) {
 		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
 			err = -EOPNOTSUPP;
@@ -11197,9 +11417,17 @@ SYSCALL_DEFINE5(perf_event_open,
 		perf_remove_from_context(group_leader, 0);
 		put_ctx(gctx);
 
+		/*
+		 * move_group only happens to sw events, from sw ctx to hw
+		 * ctx. The sw events should not have a valid dup_id, so it
+		 * is not necessary to handle dup_events.
+		 */
+		WARN_ON_ONCE(group_leader->dup_id != -1);
+
 		for_each_sibling_event(sibling, group_leader) {
 			perf_remove_from_context(sibling, 0);
 			put_ctx(gctx);
+			WARN_ON_ONCE(sibling->dup_id != -1);
 		}
 
 		/*
@@ -11257,6 +11485,9 @@ SYSCALL_DEFINE5(perf_event_open,
 		put_task_struct(task);
 	}
 
+	/* if event->tmp_master is not used by ctx->dup_events, free it */
+	perf_event_free_tmp_master(event);
+
 	mutex_lock(&current->perf_event_mutex);
 	list_add_tail(&event->owner_entry, &current->perf_event_list);
 	mutex_unlock(&current->perf_event_mutex);
@@ -11401,6 +11632,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
 		perf_remove_from_context(event, 0);
 		unaccount_event_cpu(event, src_cpu);
 		put_ctx(src_ctx);
+		perf_event_free_tmp_master(event);
 		list_add(&event->migrate_entry, &events);
 	}
 
@@ -11773,6 +12005,14 @@ inherit_event(struct perf_event *parent_event,
 	if (IS_ERR(child_event))
 		return child_event;
 
+	if (perf_event_can_share(child_event)) {
+		child_event->tmp_master = perf_event_alloc(&parent_event->attr,
+							   parent_event->cpu,
+							   child, NULL, NULL,
+							   NULL, NULL, -1);
+		if (IS_ERR(child_event->tmp_master))
+			child_event->tmp_master = NULL;
+	}
 
 	if ((child_event->attach_state & PERF_ATTACH_TASK_DATA) &&
 	    !child_ctx->task_ctx_data) {
@@ -11827,6 +12067,10 @@ inherit_event(struct perf_event *parent_event,
 	child_event->overflow_handler = parent_event->overflow_handler;
 	child_event->overflow_handler_context
 		= parent_event->overflow_handler_context;
+	if (child_event->tmp_master) {
+		child_event->tmp_master->ctx = child_ctx;
+		get_ctx(child_ctx);
+	}
 
 	/*
 	 * Precalculate sample_data sizes
@@ -11841,6 +12085,7 @@ inherit_event(struct perf_event *parent_event,
 	add_event_to_ctx(child_event, child_ctx);
 	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
 
+	perf_event_free_tmp_master(child_event);
 	/*
 	 * Link this into the parent event's child list
 	 */
@@ -12112,6 +12357,7 @@ static void perf_event_exit_cpu_context(int cpu)
 {
 	struct perf_cpu_context *cpuctx;
 	struct perf_event_context *ctx;
+	struct perf_event *event;
 	struct pmu *pmu;
 
 	mutex_lock(&pmus_lock);
@@ -12123,6 +12369,8 @@ static void perf_event_exit_cpu_context(int cpu)
 		smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1);
 		cpuctx->online = 0;
 		mutex_unlock(&ctx->mutex);
+		list_for_each_entry(event, &ctx->event_list, event_entry)
+			perf_event_free_tmp_master(event);
 	}
 	cpumask_clear_cpu(cpu, perf_online_mask);
 	mutex_unlock(&pmus_lock);
-- 
2.17.1
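P.S. For anyone who wants to poke at this from user space, here is a
minimal sketch (not part of the patch; error handling trimmed) of two
events that are "compatible" in the sense of perf_event_compatible():
same type/config/config1/config2 and a counting-only sample_type, opened
for the same task. Without this patch each fd consumes a hardware
counter; with it, both could be served by one shared counter and still
read back as independent counts.

  #include <linux/perf_event.h>
  #include <sys/syscall.h>
  #include <unistd.h>
  #include <string.h>
  #include <stdio.h>
  #include <stdint.h>

  static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
                             int cpu, int group_fd, unsigned long flags)
  {
          return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
  }

  int main(void)
  {
          struct perf_event_attr attr;
          uint64_t count1, count2;
          int fd1, fd2;

          memset(&attr, 0, sizeof(attr));
          attr.size = sizeof(attr);
          attr.type = PERF_TYPE_HARDWARE;
          attr.config = PERF_COUNT_HW_CPU_CYCLES;  /* counting, not sampling */

          /* two independent fds with identical attrs: candidates for sharing */
          fd1 = perf_event_open(&attr, 0, -1, -1, 0);
          fd2 = perf_event_open(&attr, 0, -1, -1, 0);
          if (fd1 < 0 || fd2 < 0) {
                  perror("perf_event_open");
                  return 1;
          }

          /* burn some cycles */
          for (volatile long i = 0; i < 10000000; i++)
                  ;

          read(fd1, &count1, sizeof(count1));
          read(fd2, &count2, sizeof(count2));
          printf("fd1: %llu cycles, fd2: %llu cycles\n",
                 (unsigned long long)count1, (unsigned long long)count2);

          close(fd1);
          close(fd2);
          return 0;
  }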