From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753253AbcEJPTA (ORCPT ); Tue, 10 May 2016 11:19:00 -0400 Received: from merlin.infradead.org ([205.233.59.134]:58531 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753139AbcEJPQX (ORCPT ); Tue, 10 May 2016 11:16:23 -0400 From: Arnaldo Carvalho de Melo To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, Chris Phlipot , Adrian Hunter , Peter Zijlstra , Arnaldo Carvalho de Melo Subject: [PATCH 07/15] perf symbols: Fix handling of zero-length symbols. Date: Tue, 10 May 2016 12:15:17 -0300 Message-Id: <1462893325-28646-8-git-send-email-acme@kernel.org> X-Mailer: git-send-email 2.5.5 In-Reply-To: <1462893325-28646-1-git-send-email-acme@kernel.org> References: <1462893325-28646-1-git-send-email-acme@kernel.org> X-SRS-Rewrite: SMTP reverse-path rewritten from by merlin.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Chris Phlipot This change introduces a fix to symbols__find, so that it is able to find symbols of length zero (where start == end). The current code has the following problem: - The current implementation of symbols__find is unable to find any symbols of length zero. - The db-export framework explicitly creates zero length symbols at locations where no symbol currently exists. The combination of the two above behaviors results in behavior similar to the example below. 1. addr_location is created for a sample, but symbol is unable to be resolved. 2. db export creates an "unknown" symbol of length zero at that address and inserts it into the dso. 3. A new sample comes in at the same address, but symbol__find is unable to find the zero length symbol, so it is still unresolved. 4. db export sees the symbol is unresolved, and allocated a duplicate symbol, even though it already did this in step 2. This behavior continues every time an address without symbol information is seen, which causes a very large number of these symbols to be allocated. The effect of this fix can be observed by looking at the contents of an exported database before/after the fix (generated with scripts/python/export-to-postgresql.py) Ex. BEFORE THE CHANGE: example_db=# select count(*) from symbols; count -------- 900213 (1 row) example_db=# select count(*) from symbols where symbols.name='unknown'; count -------- 897355 (1 row) example_db=# select count(*) from symbols where symbols.name!='unknown'; count ------- 2858 (1 row) AFTER THE CHANGE: example_db=# select count(*) from symbols; count ------- 25217 (1 row) example_db=# select count(*) from symbols where name='unknown'; count ------- 22359 (1 row) example_db=# select count(*) from symbols where name!='unknown'; count ------- 2858 (1 row) Signed-off-by: Chris Phlipot Cc: Adrian Hunter Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com [ Moved the test to later in the rb_tree tests, as this not the likely case ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/symbol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c index 415c4f6d98fd..2946295ca502 100644 --- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -301,7 +301,7 @@ static struct symbol *symbols__find(struct rb_root *symbols, u64 ip) if (ip < s->start) n = n->rb_left; - else if (ip >= s->end) + else if (ip > s->end || (ip == s->end && ip != s->start)) n = n->rb_right; else return s; -- 2.5.5