From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4C77C00144 for ; Fri, 29 Jul 2022 07:45:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235264AbiG2Hpj (ORCPT ); Fri, 29 Jul 2022 03:45:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53958 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235104AbiG2Ho6 (ORCPT ); Fri, 29 Jul 2022 03:44:58 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55B9F820FE for ; Fri, 29 Jul 2022 00:44:45 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-31f4450c963so36270677b3.19 for ; Fri, 29 Jul 2022 00:44:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc:content-transfer-encoding; bh=vsVx0RXggVgT0Vzeq33Ge2ECpCDo8PPDDZprUlFUzbU=; b=d7tM7m3wSx+Nl3LsHF3Qkp1URtOSaRrhcRzOpGhMvuDIzsJ6ORKxe9qw5CIL7pLEi+ DV986YF3a/OamH6P50M+iW7ja+NNZIQ7c65kuRHpi7trc7GXOzI0a+4GqbPXR20O63Wp gM5mSRJsf0xkjfsmmhaW0JsdpFPTMIzAi5arSo5Gqmkw4E6NtXmjicsj89vV5RPOzdCs SwnikVqD2c4lL0LQLpfcwfNdvi18dCIFcH/XWMAL9HoYlW1NfWScpIXCpZMC5nROT1Qm rbwloGq+SwztiHqkJfpMc3X2e+GdNcLApSmmAU4H7gRB/zrPyPU5j/YUGapUwZWBlTuA pmTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc:content-transfer-encoding; bh=vsVx0RXggVgT0Vzeq33Ge2ECpCDo8PPDDZprUlFUzbU=; b=Z61PQPAoObmuBr+IAP3QvOpRFauJF0gq8FybMzQ5v2aJeBhebjIcwgU+M9WsBQZPGG 2gHNSxYw61tIJ0ewl0JhDDDtmzj2YwL5xn/E9gRmZhbZIJO3WHj/sARL3S/qudmJddqs biLq/OZqJriNlfdGbWdFQTgj01bG5xFbNC585+q7GUClEl2GK5pqcVl8IgtNfSQSpZ9Z DVMP+d1RZmRnUqf50GVlc2SVzA2LiPm+Y32ld1BEuL1sNGBCpt5T5RHD9QBA+tPLVhop ZqmNe1OefM40GgthjQd4p7dkg4d5BWojeBzvy48iOcXYpm8QIBnnki3nTO9K9gd4yS3p rgPQ== X-Gm-Message-State: ACgBeo3EIxGINLMvs0P2cz3U2zYVH/VntoIeiF0YU0/bR6S1q//0fGtM 5nesVM8INrwoaBPIm1V8+ShvAWzkw4vu X-Google-Smtp-Source: AA6agR4Gdae2DAr0F8eHUkZlTJOf6KliV4pgoBZPuV2HWpQnhPdzjdAaC/qIhk5RnUThfkaBsRPBs6u1CKjy X-Received: from irogers.svl.corp.google.com ([2620:15c:2d4:203:524b:47b4:2aeb:1b49]) (user=irogers job=sendgmr) by 2002:a5b:585:0:b0:671:2b80:9d42 with SMTP id l5-20020a5b0585000000b006712b809d42mr1689021ybp.154.1659080685054; Fri, 29 Jul 2022 00:44:45 -0700 (PDT) Date: Fri, 29 Jul 2022 00:43:51 -0700 In-Reply-To: <20220729074351.138260-1-irogers@google.com> Message-Id: <20220729074351.138260-18-irogers@google.com> Mime-Version: 1.0 References: <20220729074351.138260-1-irogers@google.com> X-Mailer: git-send-email 2.37.1.455.g008518b4e5-goog Subject: [PATCH v3 17/17] perf jevents: Fold strings optimization From: Ian Rogers To: John Garry , Will Deacon , James Clark , Mike Leach , Leo Yan , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Andi Kleen , Zhengjun Xing , Ravi Bangoria , Kan Liang , Adrian Hunter , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org Cc: Stephane Eranian , Ian Rogers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org If a shorter string ends a longer string then the shorter string may reuse the longer string at an offset. For example, on x86 the event arith.cycles_div_busy and cycles_div_busy can be folded, even though they have difference names the strings are identical after 6 characters. cycles_div_busy can reuse the arith.cycles_div_busy string at an offset of 6. In pmu-events.c this looks like the following where the 'also:' lists folded strings: /* offset=3D177541 */ "arith.cycles_div_busy\000\000pipeline\000Cycles the = divider is busy\000\000\000event=3D0x14,period=3D2000000,umask=3D0x1\000\00= 0\000\000\000\000\000\000\000" /* also: cycles_div_busy\000\000pipeline\000= Cycles the divider is busy\000\000\000event=3D0x14,period=3D2000000,umask= =3D0x1\000\000\000\000\000\000\000\000\000 */ As jevents.py combines multiple strings for an event into a larger string, the amount of folding is minimal as all parts of the event must align. Other organizations can benefit more from folding, but lose space by say recording more offsets. Signed-off-by: Ian Rogers --- tools/perf/pmu-events/jevents.py | 55 ++++++++++++++++++++++++++++---- 1 file changed, 48 insertions(+), 7 deletions(-) diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jeven= ts.py index a5e162558994..e5545758c92d 100755 --- a/tools/perf/pmu-events/jevents.py +++ b/tools/perf/pmu-events/jevents.py @@ -80,7 +80,9 @@ class BigCString: are all the other C strings (to avoid memory issues the string itself is held as a list of strings). The offsets within the big string are recorded and when stored to disk these don't need - relocation. + relocation. To reduce the size of the string further, identical + strings are merged. If a longer string ends-with the same value as a + shorter string, these entries are also merged. """ strings: Set[str] big_string: Sequence[str] @@ -96,6 +98,33 @@ class BigCString: def compute(self) -> None: """Called once all strings are added to compute the string and offsets= .""" =20 + folded_strings =3D {} + # Determine if two strings can be folded, ie. let 1 string use the + # end of another. First reverse all strings and sort them. + sorted_reversed_strings =3D sorted([x[::-1] for x in self.strings]) + + # Strings 'xyz' and 'yz' will now be [ 'zy', 'zyx' ]. Scan forward + # for each string to see if there is a better candidate to fold it + # into, in the example rather than using 'yz' we can use'xyz' at + # an offset of 1. We record which string can be folded into which + # in folded_strings, we don't need to record the offset as it is + # trivially computed from the string lengths. + for pos,s in enumerate(sorted_reversed_strings): + best_pos =3D pos + for check_pos in range(pos + 1, len(sorted_reversed_strings)): + if sorted_reversed_strings[check_pos].startswith(s): + best_pos =3D check_pos + else: + break + if pos !=3D best_pos: + folded_strings[s[::-1]] =3D sorted_reversed_strings[best_pos][::-1= ] + + # Compute reverse mappings for debugging. + fold_into_strings =3D collections.defaultdict(set) + for key, val in folded_strings.items(): + if key !=3D val: + fold_into_strings[val].add(key) + # big_string_offset is the current location within the C string # being appended to - comments, etc. don't count. big_string is # the string contents represented as a list. Strings are immutable @@ -104,13 +133,25 @@ class BigCString: big_string_offset =3D 0 self.big_string =3D [] self.offsets =3D {} - # Emit all strings in a sorted manner. + + # Emit all strings that aren't folded in a sorted manner. for s in sorted(self.strings): - self.offsets[s] =3D big_string_offset - self.big_string.append(f'/* offset=3D{big_string_offset} */ "') - self.big_string.append(s) - self.big_string.append('"\n') - big_string_offset +=3D c_len(s) + if s not in folded_strings: + self.offsets[s] =3D big_string_offset + self.big_string.append(f'/* offset=3D{big_string_offset} */ "') + self.big_string.append(s) + self.big_string.append('"') + if s in fold_into_strings: + self.big_string.append(' /* also: ' + ', '.join(fold_into_string= s[s]) + ' */') + self.big_string.append('\n') + big_string_offset +=3D c_len(s) + continue + + # Compute the offsets of the folded strings. + for s in folded_strings.keys(): + assert s not in self.offsets + folded_s =3D folded_strings[s] + self.offsets[s] =3D self.offsets[folded_s] + c_len(folded_s) - c_len= (s) =20 _bcs =3D BigCString() =20 --=20 2.37.1.455.g008518b4e5-goog From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A1D58C00144 for ; Fri, 29 Jul 2022 08:35:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:References: Mime-Version:Message-Id:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=nKwizwrKcuxTKaoPYPJ8fm3855N/Yeb9BCb0autEMDQ=; b=IsCCj0rZBsEqQMrTXVYgqbOz5J Fy6qMy4+7wNOIQbZlFa3aEvJUwC6vm9GAPDlOYW0mIZohWfu5DR7m5yYEQXFDi2TEfX09eXHHb8OP W01mLd+HgIKuIuB8mLGD+ipm0RA1ZXsZdqO/MchMgZmGynxInlSYt3rgrMRoZ0vRlQacgTaycH1j2 0lEE/sn6s97CEiw/JvJiaA8j8pckOGYtt3eOAJUdpPdoV8NBTH2QZx/xSkUtj4p1V7sfgFtJiT3ZE XfQJ37Cn3p8DK7HEv4W/QcvpEBGsct0/thmm34KwTZU9gdagx7tcuFWV8cIdruLUyRPdzDgz3y8O3 X1qoMXqw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oHLRt-003KIU-4u; Fri, 29 Jul 2022 08:34:21 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oHLEf-0037vW-3X for linux-arm-kernel@bombadil.infradead.org; Fri, 29 Jul 2022 08:20:41 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:Content-Type :Cc:To:From:Subject:References:Mime-Version:Message-Id:In-Reply-To:Date: Sender:Reply-To:Content-ID:Content-Description; bh=vsVx0RXggVgT0Vzeq33Ge2ECpCDo8PPDDZprUlFUzbU=; b=LugNCTNth+jr3zDNppzOdix3O8 ulnLvGSwfMXjgQLYbERpALKhv/H4XTgwm1XU+m0x5A3cwRHbFV/nFfF3Nr61jkYbwD5GVQtAQpdfn 4AQzH3ee3/LpkIHWzREt9coHqfh5fv2B3EEZYJW3oAkPsDqxvkPRCk1HLVdIZJCOZo/7kmA/Y+u7W /GEYaBa256bkg/tgrH+w/mntlc9KqTWbbxrHEN+AsdYQJQSlVzhHeY4bHti8wAUR8GRJv0BbcDx5Q +JHowFwk0s6zzs4PCN1CoQe5bO+2YFT2U6Wm9nC9XUsidVfhH82toYMleGcJWDHu0CWinvj6guJmT CNvYP3bQ==; Received: from mail-yb1-xb49.google.com ([2607:f8b0:4864:20::b49]) by desiato.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oHKfu-001GhB-6h for linux-arm-kernel@lists.infradead.org; Fri, 29 Jul 2022 07:44:48 +0000 Received: by mail-yb1-xb49.google.com with SMTP id m11-20020a5b040b000000b0066fcc60d1a0so3288500ybp.19 for ; Fri, 29 Jul 2022 00:44:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc:content-transfer-encoding; bh=vsVx0RXggVgT0Vzeq33Ge2ECpCDo8PPDDZprUlFUzbU=; b=d7tM7m3wSx+Nl3LsHF3Qkp1URtOSaRrhcRzOpGhMvuDIzsJ6ORKxe9qw5CIL7pLEi+ DV986YF3a/OamH6P50M+iW7ja+NNZIQ7c65kuRHpi7trc7GXOzI0a+4GqbPXR20O63Wp gM5mSRJsf0xkjfsmmhaW0JsdpFPTMIzAi5arSo5Gqmkw4E6NtXmjicsj89vV5RPOzdCs SwnikVqD2c4lL0LQLpfcwfNdvi18dCIFcH/XWMAL9HoYlW1NfWScpIXCpZMC5nROT1Qm rbwloGq+SwztiHqkJfpMc3X2e+GdNcLApSmmAU4H7gRB/zrPyPU5j/YUGapUwZWBlTuA pmTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc:content-transfer-encoding; bh=vsVx0RXggVgT0Vzeq33Ge2ECpCDo8PPDDZprUlFUzbU=; b=zU+GU1uVFgzqk0ZjqqAwBSwVRM4FDK9zlRENIs8lNqGliszwO6pipczrjGcQ+5dbce IJOa7KdQgaOIcwn7Yb7ycomHFMef8B3TlvTicxFxjBjYZG9x6Kx9UiIdfHlOWBYuxgE0 andqoM7zYLxejQCVlokqfpH+4HwONEdYxFG8dDCbIYz7zUI3QqC+njgPSskbJ9egT2BK 9qNRXQD/TfLIBha0jzj/8VWe4Ny30tuioAFBmXubTGT9D7ZHXGzbInVkVBudSIIlwwbp ZNMAc9pM4Zc47hxnPuKXKNJ43jGXQgDQLEnXB5rvhaygqnQSqFpcYFCMerEhxoSHBSTl G+ew== X-Gm-Message-State: ACgBeo191X7MPQkAnTY+BtOKVUzQFaMlttO4MdbA/sORdXd+PIDW9PcB 3IWSolSYtZGsNuRkk0UgqxRP9DNDXGLu X-Google-Smtp-Source: AA6agR4Gdae2DAr0F8eHUkZlTJOf6KliV4pgoBZPuV2HWpQnhPdzjdAaC/qIhk5RnUThfkaBsRPBs6u1CKjy X-Received: from irogers.svl.corp.google.com ([2620:15c:2d4:203:524b:47b4:2aeb:1b49]) (user=irogers job=sendgmr) by 2002:a5b:585:0:b0:671:2b80:9d42 with SMTP id l5-20020a5b0585000000b006712b809d42mr1689021ybp.154.1659080685054; Fri, 29 Jul 2022 00:44:45 -0700 (PDT) Date: Fri, 29 Jul 2022 00:43:51 -0700 In-Reply-To: <20220729074351.138260-1-irogers@google.com> Message-Id: <20220729074351.138260-18-irogers@google.com> Mime-Version: 1.0 References: <20220729074351.138260-1-irogers@google.com> X-Mailer: git-send-email 2.37.1.455.g008518b4e5-goog Subject: [PATCH v3 17/17] perf jevents: Fold strings optimization From: Ian Rogers To: John Garry , Will Deacon , James Clark , Mike Leach , Leo Yan , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Andi Kleen , Zhengjun Xing , Ravi Bangoria , Kan Liang , Adrian Hunter , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org Cc: Stephane Eranian , Ian Rogers X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220729_084446_356992_00FB5A03 X-CRM114-Status: GOOD ( 19.87 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org If a shorter string ends a longer string then the shorter string may reuse the longer string at an offset. For example, on x86 the event arith.cycles_div_busy and cycles_div_busy can be folded, even though they have difference names the strings are identical after 6 characters. cycles_div_busy can reuse the arith.cycles_div_busy string at an offset of 6. In pmu-events.c this looks like the following where the 'also:' lists folded strings: /* offset=177541 */ "arith.cycles_div_busy\000\000pipeline\000Cycles the divider is busy\000\000\000event=0x14,period=2000000,umask=0x1\000\000\000\000\000\000\000\000\000" /* also: cycles_div_busy\000\000pipeline\000Cycles the divider is busy\000\000\000event=0x14,period=2000000,umask=0x1\000\000\000\000\000\000\000\000\000 */ As jevents.py combines multiple strings for an event into a larger string, the amount of folding is minimal as all parts of the event must align. Other organizations can benefit more from folding, but lose space by say recording more offsets. Signed-off-by: Ian Rogers --- tools/perf/pmu-events/jevents.py | 55 ++++++++++++++++++++++++++++---- 1 file changed, 48 insertions(+), 7 deletions(-) diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py index a5e162558994..e5545758c92d 100755 --- a/tools/perf/pmu-events/jevents.py +++ b/tools/perf/pmu-events/jevents.py @@ -80,7 +80,9 @@ class BigCString: are all the other C strings (to avoid memory issues the string itself is held as a list of strings). The offsets within the big string are recorded and when stored to disk these don't need - relocation. + relocation. To reduce the size of the string further, identical + strings are merged. If a longer string ends-with the same value as a + shorter string, these entries are also merged. """ strings: Set[str] big_string: Sequence[str] @@ -96,6 +98,33 @@ class BigCString: def compute(self) -> None: """Called once all strings are added to compute the string and offsets.""" + folded_strings = {} + # Determine if two strings can be folded, ie. let 1 string use the + # end of another. First reverse all strings and sort them. + sorted_reversed_strings = sorted([x[::-1] for x in self.strings]) + + # Strings 'xyz' and 'yz' will now be [ 'zy', 'zyx' ]. Scan forward + # for each string to see if there is a better candidate to fold it + # into, in the example rather than using 'yz' we can use'xyz' at + # an offset of 1. We record which string can be folded into which + # in folded_strings, we don't need to record the offset as it is + # trivially computed from the string lengths. + for pos,s in enumerate(sorted_reversed_strings): + best_pos = pos + for check_pos in range(pos + 1, len(sorted_reversed_strings)): + if sorted_reversed_strings[check_pos].startswith(s): + best_pos = check_pos + else: + break + if pos != best_pos: + folded_strings[s[::-1]] = sorted_reversed_strings[best_pos][::-1] + + # Compute reverse mappings for debugging. + fold_into_strings = collections.defaultdict(set) + for key, val in folded_strings.items(): + if key != val: + fold_into_strings[val].add(key) + # big_string_offset is the current location within the C string # being appended to - comments, etc. don't count. big_string is # the string contents represented as a list. Strings are immutable @@ -104,13 +133,25 @@ class BigCString: big_string_offset = 0 self.big_string = [] self.offsets = {} - # Emit all strings in a sorted manner. + + # Emit all strings that aren't folded in a sorted manner. for s in sorted(self.strings): - self.offsets[s] = big_string_offset - self.big_string.append(f'/* offset={big_string_offset} */ "') - self.big_string.append(s) - self.big_string.append('"\n') - big_string_offset += c_len(s) + if s not in folded_strings: + self.offsets[s] = big_string_offset + self.big_string.append(f'/* offset={big_string_offset} */ "') + self.big_string.append(s) + self.big_string.append('"') + if s in fold_into_strings: + self.big_string.append(' /* also: ' + ', '.join(fold_into_strings[s]) + ' */') + self.big_string.append('\n') + big_string_offset += c_len(s) + continue + + # Compute the offsets of the folded strings. + for s in folded_strings.keys(): + assert s not in self.offsets + folded_s = folded_strings[s] + self.offsets[s] = self.offsets[folded_s] + c_len(folded_s) - c_len(s) _bcs = BigCString() -- 2.37.1.455.g008518b4e5-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel