From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933235AbbENNSd (ORCPT ); Thu, 14 May 2015 09:18:33 -0400 Received: from mail.kernel.org ([198.145.29.136]:60986 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932797AbbENNSc (ORCPT ); Thu, 14 May 2015 09:18:32 -0400 Date: Thu, 14 May 2015 10:18:27 -0300 From: Arnaldo Carvalho de Melo To: Namhyung Kim Cc: Ingo Molnar , linux-kernel@vger.kernel.org, Adrian Hunter , Ananth N Mavinakayanahalli , Borislav Petkov , David Ahern , Davidlohr Bueso , Deng-Cheng Zhu , Don Zickus , Frederic Weisbecker , Hemant Kumar , Jiri Olsa , Masami Hiramatsu , "Naveen N . Rao" , Peter Zijlstra , Ralf Baechle , Stephane Eranian Subject: Re: [GIT PULL 00/30] perf/core improvements and fixes Message-ID: <20150514131827.GG23588@kernel.org> References: <1431118602-5877-1-git-send-email-acme@kernel.org> <20150511050939.GA1630@sejong> <20150511140626.GL28183@kernel.org> <20150514082330.GB14231@sejong> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150514082330.GB14231@sejong> X-Url: http://acmel.wordpress.com User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Em Thu, May 14, 2015 at 05:23:30PM +0900, Namhyung Kim escreveu: > On Mon, May 11, 2015 at 11:06:26AM -0300, Arnaldo Carvalho de Melo wrote: > > Em Mon, May 11, 2015 at 02:09:39PM +0900, Namhyung Kim escreveu: > > > I'm seeing a segfault on 'perf report' with a large data file after > > > applying thread refcount change - it happens regardless of the atomic > > > operation. > > Any specific 'perf record' command line? Does it take a long time to > > reproduce? Any backtraces? I'll try to repro, its possible that we're > > doing one too many thread__put()... > It's a kernel build with '-j 20' and recorded data size is ~2.1GB. > It takes ~30 sec to reproduce. > > $ perf report -i threaded/kbuild7.data --header-only > # ======== > # captured on: Thu Dec 18 12:06:35 2014 > # hostname : sejong > # os release : 3.17.4-1-ARCH > # perf version : 3.18.rc3.gcb4774b > # arch : x86_64 > # nrcpus online : 12 > # nrcpus avail : 12 > # cpudesc : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz > # cpuid : GenuineIntel,6,45,7 > # total memory : 24646828 kB > # cmdline : /home/namhyung/project/linux/tools/perf/perf record -ag -o /home/namhyung/tmp/perf/threaded/kbuild7.data -- make -j20 > # event : name = cycles, , size = 104, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled = 1, inherit > # HEADER_CPU_TOPOLOGY info available, use -I to display > # HEADER_NUMA_TOPOLOGY info available, use -I to display > # pmu mappings: cpu = 4, software = 1, power = 24, uncore_pcu = 13, tracepoint = 2, uncore_imc_0 = 15, uncore_imc_1 = 16, uncore_imc_2 = 17, uncore_ > # ======== > # > > > $ perf data stat -i threaded/kbuild7.data > > Total event stats for 'threaded/kbuild7.data' file: > > TOTAL events: 25126492 > MMAP events: 114 > COMM events: 117957 > EXIT events: 240544 > THROTTLE events: 16 > UNTHROTTLE events: 16 > FORK events: 120488 > SAMPLE events: 23878219 > MMAP2 events: 745325 > FINISHED_ROUND events: 23813 > > Sample event stats: > > 20,579,564,471,104 cycles > 23,878,219 samples # sampling ratio 99.745% (3989/4000) > > 498.736917889 second time sampled > > > $ perf report -i threaded/kbuild7.data We need to improve this segfault backtrace, I have to always use addr2line to resolve those missing entries, i.e. if you try: addr2line -fe /path/to/your/perf 0x4dd9c8 addr2line -fe /path/to/your/perf 0x4e2580 We would have resolved those lines :-/ But I think this is a longstanding bug in handling hist_entries, i.e. probably we have more than one pointer to a hist_entry and are accessing it in two places at the same time, with one of them deleting it and possibly reusing the data. > perf: Segmentation fault > -------- backtrace -------- > perf[0x51c7cb] > /usr/lib/libc.so.6(+0x33540)[0x7f37eb37e540] > /usr/lib/libc.so.6(+0x9029e)[0x7f37eb3db29e] > perf[0x4dd9c8] > perf(__hists__add_entry+0x188)[0x4e2258] > perf[0x4e2580] > perf(hist_entry_iter__add+0x9d)[0x4e2a7d] > perf[0x437fda] > perf[0x4c4c8e] > perf[0x4c5176] > perf[0x4c8bab] > perf[0x4c53c2] > perf[0x4c5f0c] > perf(perf_session__process_events+0xb3)[0x4c6b23] > perf(cmd_report+0x12a0)[0x439310] > perf[0x483ec3] > perf(main+0x60a)[0x42979a] > /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f37eb36b800] > perf(_start+0x29)[0x4298b9] > [0x0] > > It seems like some memory area was corrupted.. Right, looks like use after free, for instance, freeing something still on a list or rbtree :-/ - Arnaldo