From: Yunlong Song
Subject: [PATCH 6/9] perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task
Date: Tue, 31 Mar 2015 21:46:33 +0800
Message-ID: <1427809596-29559-7-git-send-email-yunlong.song@huawei.com>
In-Reply-To: <1427809596-29559-1-git-send-email-yunlong.song@huawei.com>
References: <1427809596-29559-1-git-send-email-yunlong.song@huawei.com>

wait_for_tasks() calls sem_wait() for each task, e.g.
sem_wait(&task->work_done_sem). The sem_wait() returns only when
work_done_sem is greater than 0; otherwise it blocks. In perf sched
replay, one task may sem_post() the work_done_sem of another task, so
that the work_done_sem of that task is processed in a meaningful
sequence, e.g. sem_post, sem_wait, sem_wait, sem_post... This sequence
simulates the scheduling of the running tasks at the time perf sched
record ran. As a result, every task is required and its thread must be
created successfully.

If any task (task A) fails to create its thread, then another task
(task B), whose work_done_sem needs a sem_post from the failed task A,
will likely block in sem_wait. This is a dead halt, since task B's
thread_func cannot continue at all.

To solve this problem, perf sched replay should exit as soon as any task
fails to create its thread.

Example:

Test environment: x86_64 with 160 cores

Before this patch:

 $ perf sched replay
 ...
 Error: sys_perf_event_open() syscall returned with -1 (Too many open files)
 ------------------------------------------------------------ <- dead halt

After this patch:

 $ perf sched replay
 ...
 task 1551 ( : 0), nr_events: 10
 Error: sys_perf_event_open() syscall returned with -1 (Too many open files)
 $

As shown above, perf sched replay now exits after printing the error
message instead of blocking.

Signed-off-by: Yunlong Song
---
 tools/perf/builtin-sched.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 7fe3b3c..3261300 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -451,10 +451,12 @@ static int self_open_counters(void)
 	fd = sys_perf_event_open(&attr, 0, -1, -1,
 				 perf_event_open_cloexec_flag());
 
-	if (fd < 0)
+	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd,
 		       strerror_r(errno, sbuf, sizeof(sbuf)));
+		exit(EXIT_FAILURE);
+	}
 	return fd;
 }
 
-- 
1.8.5.2