From: Yang Jihong
To: Peter Zijlstra
Subject: Re: [PATCH v2] perf/core: Fix data race between perf_event_set_output and perf_mmap_close
Date: Wed, 6 Jul 2022 20:29:32 +0800
Message-ID: <1e28533a-33ed-cae3-0389-c68e7c52cead@huawei.com>
References: <20220704120006.98141-1-yangjihong1@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On 2022/7/5 21:07, Peter Zijlstra wrote:
> On Mon, Jul 04, 2022 at 05:26:04PM +0200, Peter Zijlstra wrote:
>> On Mon, Jul 04, 2022 at 08:00:06PM +0800, Yang Jihong wrote:
>>> Data race exists between perf_event_set_output and perf_mmap_close.
>>> The scenario is as follows:
>>>
>>>   CPU1                                   CPU2
>>>                                          perf_mmap_close(event2)
>>>                                            if (atomic_dec_and_test(&event2->rb->mmap_count))  // mmap_count 1 -> 0
>>>                                              detach_rest = true;
>>>   ioctl(event1, PERF_EVENT_IOC_SET_OUTPUT, event2)
>>>     perf_event_set_output(event1, event2)
>>>                                            if (!detach_rest)
>>>                                              goto out_put;
>>>                                            list_for_each_entry_rcu(event, &event2->rb->event_list, rb_entry)
>>>                                              ring_buffer_attach(event, NULL)
>>>                                            // because event1 has not been added to event2->rb->event_list,
>>>                                            // event1->rb is not set to NULL in these loops
>>>
>>>     ring_buffer_attach(event1, event2->rb)
>>>       list_add_rcu(&event1->rb_entry, &event2->rb->event_list)
>>>
>>> The above data race causes a problem: event1->rb is not NULL, but event1->rb->mmap_count is 0.
>>> If perf_mmap is then invoked on the fd of event1, the kernel loops forever in perf_mmap:
>>>
>>> again:
>>> 	mutex_lock(&event->mmap_mutex);
>>> 	if (event->rb) {
>>>
>>> 		if (!atomic_inc_not_zero(&event->rb->mmap_count)) {
>>> 			/*
>>> 			 * Raced against perf_mmap_close() through
>>> 			 * perf_event_set_output(). Try again, hope for better
>>> 			 * luck.
>>> 			 */
>>> 			mutex_unlock(&event->mmap_mutex);
>>> 			goto again;
>>> 		}
>>>
>>
>> Too tired, must look again tomorrow, little feedback below.
>
> With brain more awake I ended up with the below. Does that work?

Yes, I applied the patch to kernel 5.10 and to mainline, and it fixes the problem in both.
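For reference, the stuck state from the quoted description can be modelled in userspace. This is only a rough sketch, not the kernel code: struct model_rb, struct model_event, model_inc_not_zero() and model_perf_mmap() are made-up names, locking is left out, and the retry loop is bounded by max_tries so that it terminates, whereas the loop in perf_mmap has no such bound.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used here. */
struct model_rb {
	atomic_int mmap_count;
};

struct model_event {
	struct model_rb *rb;
};

/* Userspace analogue of atomic_inc_not_zero(): increment unless the value is 0. */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Bounded model of the "goto again" loop.
 * Returns the attempt number on which the loop was left,
 * or -1 if max_tries attempts were not enough.
 */
static int model_perf_mmap(struct model_event *event, int max_tries)
{
	for (int tries = 1; tries <= max_tries; tries++) {
		if (!event->rb)
			return tries;	/* no buffer attached: leaves the retry path */
		if (model_inc_not_zero(&event->rb->mmap_count))
			return tries;	/* took a reference on a live buffer */
		/* rb is set but mmap_count is 0: retry, as in the quoted code */
	}
	return -1;
}
```

With event->rb pointing at a buffer whose mmap_count is 0, model_perf_mmap() uses up every attempt, because nothing in the loop changes either value. That is the state event1 is left in after the race.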

Tested-by: Yang Jihong

Thanks,
Yang