Message-ID: <594C6746.2060401@huawei.com>
Date: Fri, 23 Jun 2017 08:56:38 +0800
From: zhouchengming
To: Mark Rutland
CC: Alexander Shishkin, Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Li Bin
Subject: Re: [PATCH] perf/core: fix group {cpu,task} validation
References: <1498142498-15758-1-git-send-email-mark.rutland@arm.com>
In-Reply-To: <1498142498-15758-1-git-send-email-mark.rutland@arm.com>

On 2017/6/22 22:41, Mark Rutland wrote:
> Regardless of which events form a group, it does not make sense for the
> events to target different tasks and/or CPUs, as this leaves the group
> inconsistent and impossible to schedule. The core perf code assumes that
> these are consistent across (successfully initialised) groups.
>
> Core perf code only verifies this when moving SW events into a HW
> context. Thus, we can violate this requirement for pure SW groups and
> pure HW groups, unless the relevant PMU driver happens to perform this
> verification itself. These mismatched groups subsequently wreak havoc
> elsewhere.
>
> For example, we handle watchpoints as SW events, and reserve watchpoint
> HW on a per-cpu basis at pmu::event_init() time to ensure that any event
> that is initialised is guaranteed to have a slot at pmu::add() time.
> However, the core code only checks the group leader's cpu filter (via
> event_filter_match()), and can thus install follower events onto CPUs
> violating their (mismatched) CPU filters, potentially installing them
> into a CPU without sufficient reserved slots.
>
> This can be triggered with the below test case, resulting in warnings
> from arch backends.
>
> #define _GNU_SOURCE
> #include <linux/hw_breakpoint.h>
> #include <linux/perf_event.h>
> #include <sched.h>
> #include <stdio.h>
> #include <sys/prctl.h>
> #include <sys/syscall.h>
> #include <unistd.h>
>
> static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
>                            int group_fd, unsigned long flags)
> {
> 	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
> }
>
> char watched_char;
>
> struct perf_event_attr wp_attr = {
> 	.type = PERF_TYPE_BREAKPOINT,
> 	.bp_type = HW_BREAKPOINT_RW,
> 	.bp_addr = (unsigned long)&watched_char,
> 	.bp_len = 1,
> 	.size = sizeof(wp_attr),
> };
>
> int main(int argc, char *argv[])
> {
> 	int leader, ret;
> 	cpu_set_t cpus;
>
> 	/*
> 	 * Force use of CPU0 to ensure our CPU0-bound events get scheduled.
> 	 */
> 	CPU_ZERO(&cpus);
> 	CPU_SET(0, &cpus);
> 	ret = sched_setaffinity(0, sizeof(cpus), &cpus);
> 	if (ret) {
> 		printf("Unable to set cpu affinity\n");
> 		return 1;
> 	}
>
> 	/* open leader event, bound to this task, CPU0 only */
> 	leader = perf_event_open(&wp_attr, 0, 0, -1, 0);
> 	if (leader < 0) {
> 		printf("Couldn't open leader: %d\n", leader);
> 		return 1;
> 	}
>
> 	/*
> 	 * Open a follower event that is bound to the same task, but a
> 	 * different CPU. This means that the group should never be possible to
> 	 * schedule.
> 	 */
> 	ret = perf_event_open(&wp_attr, 0, 1, leader, 0);
> 	if (ret < 0) {
> 		printf("Couldn't open mismatched follower: %d\n", ret);
> 		return 1;
> 	} else {
> 		printf("Opened leader/follower with mismatched CPUs\n");
> 	}
>
> 	/*
> 	 * Open as many independent events as we can, all bound to the same
> 	 * task, CPU0 only.
> 	 */
> 	do {
> 		ret = perf_event_open(&wp_attr, 0, 0, -1, 0);
> 	} while (ret >= 0);
>
> 	/*
> 	 * Force enable/disable all events to trigger the erroneous
> 	 * installation of the follower event.
> 	 */
> 	printf("Opened all events. Toggling..\n");
> 	for (;;) {
> 		prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0);
> 		prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
> 	}
>
> 	return 0;
> }

Very good example!

>
> Fix this by validating this requirement regardless of whether we're
> moving events.
>
> Signed-off-by: Mark Rutland
> Cc: Alexander Shishkin
> Cc: Arnaldo Carvalho de Melo
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Zhou Chengming
> Cc: linux-kernel@vger.kernel.org
> ---
>  kernel/events/core.c | 39 +++++++++++++++++++--------------------
>  1 file changed, 19 insertions(+), 20 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 6c4e523..1dca484 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10010,28 +10010,27 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
>  		goto err_context;
>
>  	/*
> -	 * Do not allow to attach to a group in a different
> -	 * task or CPU context:
> +	 * Make sure we're both events for the same CPU;
> +	 * grouping events for different CPUs is broken; since
> +	 * you can never concurrently schedule them anyhow.
>  	 */
> -	if (move_group) {
> -		/*
> -		 * Make sure we're both on the same task, or both
> -		 * per-cpu events.
> -		 */
> -		if (group_leader->ctx->task != ctx->task)
> -			goto err_context;
> +	if (group_leader->cpu != event->cpu)
> +		goto err_context;
>
> -		/*
> -		 * Make sure we're both events for the same CPU;
> -		 * grouping events for different CPUs is broken; since
> -		 * you can never concurrently schedule them anyhow.
> -		 */
> -		if (group_leader->cpu != event->cpu)
> -			goto err_context;
> -	} else {
> -		if (group_leader->ctx != ctx)
> -			goto err_context;
> -	}
> +	/*
> +	 * Make sure we're both on the same task, or both
> +	 * per-cpu events.
> +	 */
> +	if (group_leader->ctx->task != ctx->task)
> +		goto err_context;
> +
> +	/*
> +	 * Do not allow to attach to a group in a different task
> +	 * or CPU context. If we're moving SW events, we'll fix
> +	 * this up later, so allow that.
> +	 */
> +	if (!move_group && group_leader->ctx != ctx)
> +		goto err_context;

We don't need to check move_group here: the previous two checks already
make sure the events are on the same task and the same CPU. So when
move_group is needed, they will be moved to the same taskctx or cpuctx
afterwards.

Thanks.

>
>  	/*
>  	 * Only a group leader can be exclusive or pinned
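To make the resulting check order easier to see, here is a minimal
stand-alone sketch. It is not kernel code; the struct and function names
below are invented for illustration only. It models what the patch
establishes: the CPU check and the task check always run, and only the
context-identity check is skipped when the group is going to be moved
into a new (HW) context later.

#include <stdbool.h>

/* Simplified, made-up stand-ins for the relevant parts of event/context. */
struct fake_ctx {
	void *task;		/* NULL would represent a per-cpu context */
};

struct fake_event {
	int cpu;
	struct fake_ctx *ctx;
};

/*
 * Post-patch ordering: CPU and task consistency are checked
 * unconditionally; context identity is only enforced when the group is
 * not about to be moved into a new context anyway.
 */
static bool group_attach_ok(const struct fake_event *group_leader,
			    const struct fake_event *event,
			    const struct fake_ctx *ctx, bool move_group)
{
	/* Events in one group must target the same CPU... */
	if (group_leader->cpu != event->cpu)
		return false;

	/* ...and the same task (or both be per-cpu events). */
	if (group_leader->ctx->task != ctx->task)
		return false;

	/* Context identity can be fixed up later if the group is moved. */
	if (!move_group && group_leader->ctx != ctx)
		return false;

	return true;
}

int main(void)
{
	struct fake_ctx task_ctx = { .task = (void *)0x1 };
	struct fake_event leader   = { .cpu = 0, .ctx = &task_ctx };
	struct fake_event follower = { .cpu = 1, .ctx = &task_ctx };

	/* Mismatched CPUs are rejected even though no context move happens. */
	return group_attach_ok(&leader, &follower, &task_ctx, false) ? 1 : 0;
}

Compiled on its own, this rejects the mismatched-CPU group from the test
case above even though no SW-to-HW context move is involved, which is the
case the pre-patch code missed.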