From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752270AbdLFOMW (ORCPT ); Wed, 6 Dec 2017 09:12:22 -0500 Received: from mail.kernel.org ([198.145.29.99]:50188 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751628AbdLFOMU (ORCPT ); Wed, 6 Dec 2017 09:12:20 -0500 DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AC7F121881 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=acme@kernel.org Date: Wed, 6 Dec 2017 11:12:13 -0300 From: Arnaldo Carvalho de Melo To: Peter Zijlstra Cc: Namhyung Kim , Fengguang Wu , linux-kernel@vger.kernel.org, Wang Nan , Ingo Molnar , Alexander Shishkin , Jiri Olsa , Linus Torvalds , Will Deacon , lkp@01.org, Dmitry Vyukov , kasan-dev@googlegroups.com, kernel-team@lge.com Subject: Re: BUG: KASAN: slab-out-of-bounds in perf_callchain_user+0x494/0x530 Message-ID: <20171206141213.GD12234@kernel.org> References: <20171130023218.g2y35nn4zyufqk6t@wfg-t540p.sh.intel.com> <20171130082026.ih7esfpn4wfsfoge@hirez.programming.kicks-ass.net> <20171130193712.GU3298@kernel.org> <20171205081156.GB16663@sejong> <20171205133740.GA28405@kernel.org> <20171205144718.GA27916@danjae.aot.lge.com> <20171206134706.ahlr6ygnhtu2ik4s@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20171206134706.ahlr6ygnhtu2ik4s@hirez.programming.kicks-ass.net> X-Url: http://acmel.wordpress.com User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Em Wed, Dec 06, 2017 at 02:47:06PM +0100, Peter Zijlstra escreveu: > On Tue, Dec 05, 2017 at 11:47:18PM +0900, Namhyung Kim wrote: > > Sure, I mean the following code: > > > > mutex_lock(&callchain_mutex); > > > > count = atomic_inc_return(&nr_callchain_events); > > if (WARN_ON_ONCE(count < 1)) { > > err = -EINVAL; > > goto exit; > > } > > > > if (count > 1) { > > /* If the allocation failed, give up */ > > if (!callchain_cpus_entries) > > err = -ENOMEM; > > > > goto exit; > > } > > > > err = alloc_callchain_buffers(); > > exit: > > if (err) > > atomic_dec(&nr_callchain_events); > > > > mutex_unlock(&callchain_mutex); > > > > > > The callchain_cpus_entries is allocated in alloc_callchain_buffers() > > only when the count is 1. But if it failed to allocate, it decrease > > the count so next event would try to allocate it again. Thus it seems > > not possible to see the callchain_cpus_entries being NULL in the > > 'if (count > 1)' block. If you want to make next event give up, it'd > > need to take an additional count IMHO. > > There's also a race against put_callchain_buffers() there, consider: > > > get_callchain_buffers() put_callchain_buffers() > mutex_lock(); > inc() > dec_and_test() // false > > dec() // 0 > > > And the buffers leak. Yeah, this code is complicated, and there are several csets to consider, by Frédéric that may help to understando why the code ended up like that, I started from git blame going first to 9251f904f95175b4a1d8cbc0449e748f9edd7629, where the test seemed to make sense, to then go back, but still reading this... commit fc3b86d673e41ac66b4ba5b75a90c2fcafb90089 Author: Frederic Weisbecker Date: Fri Aug 2 18:29:54 2013 +0200 perf: Roll back callchain buffer refcount under the callchain mutex commit 90983b16078ab0fdc58f0dab3e8e3da79c9579a2 Author: Frederic Weisbecker Date: Tue Jul 23 02:31:00 2013 +0200 perf: Sanitize get_callchain_buffer() commit fd45c15f13e754f3c106427e857310f3e0813951 Author: Namhyung Kim Date: Fri Jan 20 10:12:45 2012 +0900 perf: Don't call release_callchain_buffers() if allocation fails commit 9251f904f95175b4a1d8cbc0449e748f9edd7629 Author: Borislav Petkov Date: Sun Oct 16 17:15:04 2011 +0200 perf: Carve out callchain functionality