* [RFC PATCH 01/12] perf topdown-parser: Add a simple logging API.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 02/12] perf topdown-parser: Add utility functions Ian Rogers
` (11 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
A logging API that is simpler than, but inspired by, the one in Abseil.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
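Note (illustrative, not part of the patch): the macros in this patch paste
their `msg` argument straight into a stream expression, which is why call
sites can pass a whole chain such as `ERROR("bad value: " << v)`. A minimal
sketch of that mechanism follows. `SKETCH_ERROR_TO` and `FormatErrorSketch`
are hypothetical names; taking the stream as a parameter is an assumption
made here so the formatting can be checked without capturing stdout.

```cpp
#include <sstream>
#include <string>

// Hypothetical variant of the patch's ERROR macro. The only change is that
// the destination stream is a parameter instead of being fixed to std::cout.
#define SKETCH_ERROR_TO(os, msg)                                  \
    do {                                                          \
        (os) << __FILE__ << ":" << __LINE__                       \
             << " \033[1;31mError: " << msg << "\033[0m\n";       \
    } while (false)

// Formats one error line into a string. `what << value` is passed as a single
// macro argument and expands into the middle of the stream expression.
inline std::string FormatErrorSketch(const std::string &what, int value)
{
    std::ostringstream os;
    SKETCH_ERROR_TO(os, what << value);
    return os.str();
}
```

With a stream parameter the same macro body could also target std::cerr,
which may be preferable for error output; that is a suggestion, not something
the patch does.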
.../perf/pmu-events/topdown-parser/logging.h | 25 +++++++++++++++++++
1 file changed, 25 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/logging.h
diff --git a/tools/perf/pmu-events/topdown-parser/logging.h b/tools/perf/pmu-events/topdown-parser/logging.h
new file mode 100644
index 000000000000..9942018c4c75
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/logging.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// -------------------------------------
+// File: logging.h
+// -------------------------------------
+//
+// The header provides the macro definitions for logging errors/warnings
+
+#ifndef TOPDOWN_PARSER_LOGGING_H_
+#define TOPDOWN_PARSER_LOGGING_H_
+
+#include <iostream>
+
+#define INFO(msg) std::cout << "\033[1;35mInfo: " << msg << "\033[0m\n"
+#define ERROR(msg) \
+ std::cout << __FILE__ << ":" << __LINE__ \
+ << " \033[1;31mError: " << msg << "\033[0m\n"
+#define FATAL(msg) \
+ do { \
+ std::cout << __FILE__ << ":" << __LINE__ \
+ << " \033[1;31mFatal: " << msg << "\033[0m\n"; \
+ exit(1); \
+ } while (false)
+
+#endif // TOPDOWN_PARSER_LOGGING_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 02/12] perf topdown-parser: Add utility functions.
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Basic string, ostream and file functions.
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
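Note (illustrative, not part of the patch): the least obvious helper here is
NormalizeModel, which folds a "#Model in [ ... ]" test into a constant 0 or 1
for a given CPU. The sketch below is a condensed restatement under two
assumptions: the brackets arrive as separate tokens (which the
`tokens[i] == "["` comparisons in the patch require), and the condition is
reset at every '[' (the patch initialises it only once, so a formula with two
#Model tests would carry the first result into the second).
`NormalizeModelSketch` is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Replaces each "#Model in [ 'A' 'B' ]" group with "1" if `cpu` is listed in
// the group and "0" otherwise. All other tokens are copied through unchanged.
std::vector<std::string>
NormalizeModelSketch(const std::vector<std::string> &tokens,
                     const std::string &cpu)
{
    std::vector<std::string> out;
    bool in_list = false;  // true while between '[' and ']'
    int condition = 0;     // result of the current membership test

    for (const std::string &tok : tokens) {
        if (tok == "#Model" || tok == "in")
            continue;
        if (tok == "[") {
            in_list = true;
            condition = 0;  // assumption: each group is evaluated afresh
            continue;
        }
        if (tok == "]") {
            out.push_back(std::to_string(condition));
            in_list = false;
            continue;
        }
        if (in_list) {
            // Drop the single quotes around the CPU name before comparing.
            std::string name;
            for (char c : tok)
                if (c != '\'')
                    name += c;
            if (name == cpu)
                condition = 1;
            continue;
        }
        out.push_back(tok);
    }
    return out;
}
```

For the token list of "Expr1 if #Model in [ 'CPUX' 'CPUY' ] else Expr2",
calling the sketch with "CPUX" yields the tokens of "Expr1 if 1 else Expr2",
and calling it with "CPUZ" yields "Expr1 if 0 else Expr2".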
.../topdown-parser/general_utils.cpp | 173 ++++++++++++++++++
.../pmu-events/topdown-parser/general_utils.h | 131 +++++++++++++
2 files changed, 304 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/general_utils.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/general_utils.h
diff --git a/tools/perf/pmu-events/topdown-parser/general_utils.cpp b/tools/perf/pmu-events/topdown-parser/general_utils.cpp
new file mode 100644
index 000000000000..810c27cf3724
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/general_utils.cpp
@@ -0,0 +1,173 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "general_utils.h"
+
+#include <dirent.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include <regex>
+#include <sstream>
+
+#include "logging.h"
+
+namespace topdown_parser
+{
+std::string Trim(const std::string &str)
+{
+ const char *ws = " \t\n\r\f\v";
+ size_t endpos = str.find_last_not_of(ws);
+ if (endpos == std::string::npos)
+ return "";
+
+ size_t startpos = str.find_first_not_of(ws);
+ return str.substr(startpos, endpos - startpos + 1);
+}
+
+std::vector<std::string> Split(const std::string &str, char delim)
+{
+ std::vector<std::string> tokens;
+ std::string token;
+ std::istringstream tokenStream(str);
+ while (std::getline(tokenStream, token, delim)) {
+ tokens.push_back(Trim(token));
+ }
+ return tokens;
+}
+
+std::string Strip(const std::string &str, char delim)
+{
+ std::string retval("");
+ for (size_t i = 0; i < str.length(); ++i) {
+ if (str[i] != delim) {
+ retval += str[i];
+ }
+ }
+ return retval;
+}
+
+std::vector<std::string> WhitespaceSplit(const std::string &s)
+{
+ std::vector<std::string> split_tokens = Split(s, ' ');
+ std::vector<std::string> retval;
+ for (auto &split_token : split_tokens) {
+ if (split_token.empty() || split_token == " ") {
+ continue;
+ }
+ retval.push_back(split_token);
+ }
+ return retval;
+}
+
+bool IsOperator(const std::string &str)
+{
+ std::regex r(
+ "\\/|\\-|\\+|\\*|\\(|\\)|\\<|\\>|min|max|\\?|\\:|,|==|>=|<=|="
+ "|if|else|d_ratio|#Model|in|\\[|\\]");
+ return regex_match(Trim(str), r);
+}
+
+bool IsConstant(const std::string &str)
+{
+ std::regex integer("[-+]?[0-9]+");
+ std::regex floating("[-+]?[0-9]*\\.?[0-9]+");
+
+ return regex_match(str, integer) || regex_match(str, floating);
+}
+
+time_t GetTimestamp(const std::string &fname)
+{
+ struct stat st;
+ int ierr = stat(fname.c_str(), &st);
+ if (ierr != 0) {
+ ERROR("Error getting stat on file: " << fname);
+ return 0;
+ }
+ return st.st_mtime;
+}
+
+bool CheckDirPathExists(const std::string &dirname)
+{
+ return opendir(dirname.c_str()) != nullptr;
+}
+
+std::string ConvertToCIdentifier(const std::string &str)
+{
+ static const char *int_to_word[] = { "zero", "one", "two", "three",
+ "four", "five", "six", "seven",
+ "eight", "nine" };
+ std::regex r("\\/|#|\\.|-|:|=");
+ std::string retval = regex_replace(str, r, "_");
+
+ std::smatch sm;
+ if (regex_match(retval, sm, std::regex("^([0-9])(.*)"))) {
+ auto digit = stoi(sm[1].str());
+ std::string word = int_to_word[digit];
+ std::string rest = sm[2].str();
+ return word + "_" + rest;
+ }
+ return retval;
+}
+
+std::string ToLower(const std::string &str)
+{
+ std::string retval("");
+
+ for (auto &c : str) {
+ retval.append(1, std::tolower(c));
+ }
+ return retval;
+}
+
+std::vector<std::string> NormalizeModel(const std::vector<std::string> &tokens,
+ const std::string &cpu)
+{
+ std::vector<std::string> retval;
+ // Track the event if encountering a '['
+ bool match_start = false;
+ // The evaluated value of the sub-expression #Model in ['CPUX' 'CPUY']
+ int condition = 0;
+
+ for (size_t i = 0; i < tokens.size(); ++i) {
+ // Skip keywords like "#Model" and "in"
+ if (tokens[i] == "#Model" || tokens[i] == "in") {
+ continue;
+ }
+ if (tokens[i] == "[") {
+ match_start = true;
+ continue;
+ }
+
+ if (tokens[i] == "]") {
+ retval.push_back(std::to_string(condition));
+ match_start = false;
+ continue;
+ }
+
+ if (match_start) {
+ if (cpu == Strip(tokens[i], '\'')) {
+ condition = condition | 1;
+ }
+ continue;
+ }
+
+ // Rest of tokens
+ retval.push_back(tokens[i]);
+ }
+
+ return retval;
+}
+
+std::string InjectSanityChecksAndReturn(const std::string &str)
+{
+ std::string injected_string =
+ std::string("double retval = ") + str + ";\n\n";
+ injected_string += "\treturn retval < 0.0 ? 0.0 : retval;";
+
+ return injected_string;
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/general_utils.h b/tools/perf/pmu-events/topdown-parser/general_utils.h
new file mode 100644
index 000000000000..6e1213247011
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/general_utils.h
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// ------------------------------------------------
+// File: general_utils.h
+// ------------------------------------------------
+//
+// The header declares the interface of common utilities used by the
+// topdown generator.
+
+#ifndef TOPDOWN_PARSER_GENERAL_UTILS_H_
+#define TOPDOWN_PARSER_GENERAL_UTILS_H_
+
+#include <set>
+#include <string>
+#include <unordered_set>
+#include <vector>
+
+namespace topdown_parser
+{
+/**
+ * Overloading << operators for various STL containers.
+ */
+template <typename T>
+std::ostream &operator<<(std::ostream &OS, std::vector<T> V)
+{
+ for (size_t i = 0; i < V.size(); ++i)
+ OS << V[i] << ",";
+
+ return OS;
+}
+
+template <typename T> std::ostream &operator<<(std::ostream &OS, std::set<T> V)
+{
+ for (auto &f : V)
+ OS << f << "|";
+
+ return OS;
+}
+
+template <typename T>
+std::ostream &operator<<(std::ostream &OS, std::unordered_set<T> V)
+{
+ for (auto &f : V)
+ OS << f << "|";
+
+ return OS;
+}
+
+/**
+ * Function used for splitting a string 'str' based on a delimiter 'delim'.
+ */
+std::vector<std::string> Split(const std::string &str, char delim);
+
+/**
+ * Function used for
+ * (1) splitting a string 'str' based on a whitespace, and
+ * (2) pruning the splits resulting in empty string or string containing only
+ * whitespaces.
+ * Example: For an input string s = "a  b   c"
+ * Result: {"a", "b", "c"}
+ */
+std::vector<std::string> WhitespaceSplit(const std::string &str);
+
+/**
+ * Trim removes the leading and trailing whitespace of a string `str`.
+ */
+std::string Trim(const std::string &str);
+
+/**
+ * Remove a char 'delim' from anywhere in string 'str'.
+ */
+std::string Strip(const std::string &str, char delim);
+
+/**
+ * Check if the string `str` is an operator.
+ */
+bool IsOperator(const std::string &str);
+
+/**
+ * Check if the string `str` is a constant decimal number or float.
+ */
+bool IsConstant(const std::string &);
+
+/**
+ * Returns timestamp of a file `fname`
+ */
+time_t GetTimestamp(const std::string &fname);
+
+/**
+ * Check if a directory path `dirname` exists
+ */
+bool CheckDirPathExists(const std::string &dirname);
+
+/**
+ * Convert an arbitrary string `str` to a C identifier.
+ * Characters such as '/', '#', '.', '-', ':' and '=' are replaced with '_'
+ * wherever they appear in the string.
+ */
+std::string ConvertToCIdentifier(const std::string &str);
+
+/**
+ * Lowercase a string `str`
+ */
+std::string ToLower(const std::string &str);
+
+/**
+ * The input csv file might contain a formula like
+ * "Expr1 if #Model in ['CPUX' 'CPUY'] else Expr2"
+ * in a column specifying a list of CPUs as CPUX/CPUY/CPUZ
+ * We want to generate the following formulas for each cpu
+ * For CPUX: Expr1 if 1 else Expr2
+ * For CPUY: Expr1 if 1 else Expr2
+ * For CPUZ: Expr1 if 0 else Expr2
+ *
+ * `tokens`: A list of tokens representing the formula delimited by whitespace.
+ * `cpu`: The CPU for which we want to generate the formula.
+ */
+std::vector<std::string> NormalizeModel(const std::vector<std::string> &tokens,
+ const std::string &cpu);
+
+/**
+ * `InjectSanityChecksAndReturn` converts a formula 'str'
+ * to
+ * double retval = str;
+ * return retval < 0.0 ? 0.0 : retval;
+ */
+std::string InjectSanityChecksAndReturn(const std::string &str);
+
+} // namespace topdown_parser
+
+#endif // TOPDOWN_PARSER_GENERAL_UTILS_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 03/12] perf topdown-parser: Add a CSV file reader.
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Read a CSV file into a two dimensional vector of vectors. Open
parentheses are counted so that expressions like "min(a,b)" aren't
split. Escape characters and quotations aren't handled.
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
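Note (illustrative, not part of the patch): the parenthesis counting described
above can be sketched in isolation as follows. `SplitCsvLineSketch` is a
hypothetical name. Unlike getData() in the patch, the sketch does not trim
fields or strip double quotes, and it rejoins pieces with the delimiter itself
rather than with ", ".

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Splits `line` on `delim`, but glues a piece back onto the previous field
// while more '(' than ')' have been seen, so "min(a,b)" stays one field.
std::vector<std::string> SplitCsvLineSketch(const std::string &line,
                                            char delim = ',')
{
    std::vector<std::string> fields;
    long opens = 0;   // '(' seen so far on this line
    long closes = 0;  // ')' seen so far on this line

    std::istringstream in(line);
    std::string piece;
    while (std::getline(in, piece, delim)) {
        if (opens > closes && !fields.empty()) {
            // Still inside an open parenthesis: this delimiter was part
            // of an expression, so restore it and extend the last field.
            fields.back() += std::string(1, delim) + piece;
        } else {
            fields.push_back(piece);
        }
        opens += std::count(piece.begin(), piece.end(), '(');
        closes += std::count(piece.begin(), piece.end(), ')');
    }
    return fields;
}
```

The counters are cumulative per line, so nested calls such as
"max(min(a,b),c)" also stay in one field. An unbalanced '(' would swallow the
rest of the line, which matches the commit message's caveat that the reader is
deliberately simple.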
.../pmu-events/topdown-parser/csvreader.cpp | 49 ++++++++++++++++++
.../pmu-events/topdown-parser/csvreader.h | 51 +++++++++++++++++++
2 files changed, 100 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/csvreader.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/csvreader.h
diff --git a/tools/perf/pmu-events/topdown-parser/csvreader.cpp b/tools/perf/pmu-events/topdown-parser/csvreader.cpp
new file mode 100644
index 000000000000..142e0e7e5ce7
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/csvreader.cpp
@@ -0,0 +1,49 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "csvreader.h"
+
+#include <cassert>
+#include <algorithm>
+#include <fstream>
+
+#include "general_utils.h"
+#include "logging.h"
+
+namespace topdown_parser
+{
+std::vector<std::vector<std::string> > CsvReader::getData() const
+{
+ std::vector<std::vector<std::string> > dataList;
+ std::ifstream file(file_name_);
+ std::string line = "";
+ assert(file.is_open() && "unable to open csv file");
+
+ while (getline(file, line)) {
+ std::vector<std::string> tokens;
+ int opens = 0;
+ int closes = 0;
+ for (const std::string &str : Split(line, delimeter_)) {
+ std::string stripped_str = Strip(str, '"');
+ if (opens > closes) {
+ tokens.back() += ", " + stripped_str;
+ } else {
+ tokens.push_back(stripped_str);
+ }
+ opens += std::count(str.begin(), str.end(), '(');
+ closes += std::count(str.begin(), str.end(), ')');
+ }
+
+ dataList.push_back(tokens);
+ }
+
+ if (dataList.empty()) {
+ FATAL("Empty csv file: " << file_name_);
+ }
+
+ return dataList;
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/csvreader.h b/tools/perf/pmu-events/topdown-parser/csvreader.h
new file mode 100644
index 000000000000..a82470041145
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/csvreader.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// ---------------------------------------------
+// File: csvreader.h
+// ---------------------------------------------
+//
+// The header file provides the interface for parsing a csv file using
+// CsvReader::delimeter_ as the delimiter for parsing each line.
+//
+// The library provides the following utilities:
+// `getData`: Reads the input csv file `file_name_` and parses its
+// contents, based on the delimiter `delimeter_`, as strings.
+// The parsed data is returned as a 2D vector, V, of strings such
+// that V[r][c] is the same as the value of the input csv file at row r
+// and column c.
+//
+// For example, with the following content of a csv file,
+// a,b,c,
+// 1,2,3
+// and delimiter as ',', the return value is
+//
+// {
+// {"a", "b", "c"},
+// {"1", "2", "3"}
+// }
+
+#ifndef TOPDOWN_PARSER_CSV_READER_H_
+#define TOPDOWN_PARSER_CSV_READER_H_
+
+#include <string>
+#include <vector>
+
+namespace topdown_parser
+{
+class CsvReader {
+ public:
+ explicit CsvReader(std::string fname, char delm = ',')
+ : file_name_(fname), delimeter_(delm)
+ {
+ }
+
+ std::vector<std::vector<std::string> > getData() const;
+
+ private:
+ const std::string file_name_;
+ const char delimeter_;
+};
+
+} // namespace topdown_parser
+
+#endif // TOPDOWN_PARSER_CSV_READER_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 04/12] perf topdown-parser: Add a json file reader.
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Wrap jsmn as a "C" library. Add some utilities for working with tokens
and to read a vector of tokens.
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
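Note (illustrative, not part of the patch): the token helpers in this patch
rely on the fact that a jsmn token stores no text of its own, only `start` and
`end` offsets into the original JSON buffer. The sketch below shows that
indexing without depending on jsmn.h. `TokSketch` is a hypothetical stand-in
for the two jsmntok_t fields used, and `TokenEqualsSketch` returns a bool
where the patch's jsoneq() returns 0 on a match and -1 otherwise.

```cpp
#include <cstring>
#include <string>

// Stand-in for the `start`/`end` members of jsmntok_t: a half-open byte
// range [start, end) into the JSON text. For strings the quotes are excluded.
struct TokSketch {
    int start;
    int end;
};

// Copies the token's bytes out of the buffer, as get_primitive() does.
std::string TokenTextSketch(const char *js, const TokSketch &t)
{
    return std::string(js + t.start, t.end - t.start);
}

// True when the token's text is exactly `s` (same length, same bytes).
bool TokenEqualsSketch(const char *js, const TokSketch &t, const char *s)
{
    const int len = t.end - t.start;
    return static_cast<int>(std::strlen(s)) == len &&
           std::strncmp(js + t.start, s, len) == 0;
}
```

For the buffer `{"target": "perf"}` the key token spans offsets [2, 8) and the
value token spans [12, 16), so the helpers return "target" and "perf"
respectively.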
.../pmu-events/topdown-parser/jsmn_extras.cpp | 199 ++++++++++++++++++
.../pmu-events/topdown-parser/jsmn_extras.h | 42 ++++
2 files changed, 241 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/jsmn_extras.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/jsmn_extras.h
diff --git a/tools/perf/pmu-events/topdown-parser/jsmn_extras.cpp b/tools/perf/pmu-events/topdown-parser/jsmn_extras.cpp
new file mode 100644
index 000000000000..83a15b636378
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/jsmn_extras.cpp
@@ -0,0 +1,199 @@
+#include "jsmn_extras.h"
+
+#include <cassert>
+#include <cstring>
+#include <functional>
+#include <memory>
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include "logging.h"
+
+namespace topdown_parser
+{
+int jsoneq(const char *json, const jsmntok_t *tok, const char *s)
+{
+ if (tok->type == JSMN_STRING &&
+ static_cast<int>(strlen(s)) == tok->end - tok->start &&
+ strncmp(json + tok->start, s, tok->end - tok->start) == 0) {
+ return 0;
+ }
+ return -1;
+}
+
+int get_primitive(const char *js, const jsmntok_t *t, int i,
+ std::string *retval)
+{
+ if (t[i].type != JSMN_STRING && t[i].type != JSMN_PRIMITIVE) {
+ assert(0);
+ }
+ const jsmntok_t *g = t + i;
+ (*retval) = std::string(js + g->start, g->end - g->start);
+ return i;
+}
+
+// Parse the following pattern of key-values
+// A:B
+int get_key_val(const char *js, const jsmntok_t *t, int i,
+ std::pair<std::string, std::string> *P)
+{
+ assert(t[i].type == JSMN_STRING);
+ i = get_primitive(js, t, i, &((*P).first));
+
+ i++;
+ i = get_primitive(js, t, i, &((*P).second));
+
+ return i;
+}
+
+int get_array_of_primitives(const char *js, const jsmntok_t *t, int i,
+ std::vector<std::string> *V)
+{
+ int j;
+ if (t[i].type != JSMN_ARRAY) {
+ assert(0);
+ }
+ int size = t[i].size;
+ if (size == 0) {
+ return i;
+ }
+
+ i++;
+ std::string retval;
+
+ for (j = 0; j < size - 1; j++) {
+ i = get_primitive(js, t, i, &retval);
+ (*V).push_back(retval);
+ i++;
+ }
+ i = get_primitive(js, t, i, &retval);
+ (*V).push_back(retval);
+
+ return i;
+}
+
+int get_struct(const char *js, const jsmntok_t *t, int i,
+ std::map<std::string, std::string> *data)
+{
+ int j;
+ if (t[i].type != JSMN_OBJECT) {
+ assert(0);
+ }
+
+ int size = t[i].size;
+ i++;
+
+ for (j = 0; j < size - 2; j += 2) {
+ std::pair<std::string, std::string> P;
+ i = get_key_val(js, t, i, &P);
+ (*data).insert(P);
+ i++;
+ }
+ std::pair<std::string, std::string> P;
+ i = get_key_val(js, t, i, &P);
+ (*data).insert(P);
+ return i;
+}
+
+int get_struct_of_array(
+ const char *js, const jsmntok_t *t, int i,
+ std::unordered_map<std::string, std::vector<std::string> > *data)
+{
+ if (t[i].type != JSMN_OBJECT) {
+ assert(0);
+ }
+
+ int size = t[i].size;
+ i++;
+
+ std::string key;
+ for (int j = 0; j < size - 2; j += 2) {
+ i = get_primitive(js, t, i, &key);
+ i++;
+
+ i = get_array_of_primitives(js, t, i, &((*data)[key]));
+ i++;
+ }
+ i = get_primitive(js, t, i, &key);
+ i++;
+ i = get_array_of_primitives(js, t, i, &((*data)[key]));
+ return i;
+}
+
+/**
+ * ParseJson parses a json file 'fname' and delegates the processing of the
+ * parsed model to an external callback function 'callback' provided by the
+ * clients of the function.
+ *
+ * The clients using the following routine are:
+ * 1. ReadEventInfoFromJson: Parsing the event encoding json file for each CPU
+ * as downloaded from https://download.01.org/perfmon/
+ * 2. ReadConfig: Parsing the configuration.json file, which specifies the
+ * parameters for the topdown_parser tool.
+ */
+int ParseJson(const char *fname,
+ void (*callback)(const char *, const jsmntok_t *, int, void *),
+ void *metainfo)
+{
+ // Read the file fully into js.
+ int fd = open(fname, O_RDONLY);
+ if (fd == -1) {
+ ERROR("Failed to open '" << fname << "': " << strerror(errno));
+ return 1;
+ }
+ struct stat statbuf;
+ if (fstat(fd, &statbuf) == -1) {
+ ERROR("Failed to stat '" << fname << "': " << strerror(errno));
+ close(fd);
+ return 2;
+ }
+
+ std::unique_ptr<char[]> js(new char[statbuf.st_size]);
+ if (read(fd, js.get(), statbuf.st_size) < 0) {
+ ERROR("Failed to read '" << fname << "': " << strerror(errno));
+ close(fd);
+ return 3;
+ }
+ close(fd);
+
+ // Prepare parser.
+ jsmn_parser p;
+ jsmn_init(&p);
+
+ // Allocate some tokens as a start then iterate until resizing is
+ // unnecessary.
+ std::vector<jsmntok_t> tok;
+ tok.resize(32);
+
+ jsmnerr_t r;
+ do {
+ r = jsmn_parse(&p, js.get(), statbuf.st_size, tok.data(),
+ tok.size());
+ if (r == JSMN_ERROR_NOMEM) {
+ tok.resize(tok.size() * 2);
+ }
+ } while (r == JSMN_ERROR_NOMEM);
+
+ switch (r) {
+ default:
+ ERROR("Json parse error " << r << " in '" << fname << "' at "
+ << p.pos);
+ return 4;
+ case JSMN_ERROR_INVAL:
+ ERROR("Invalid character in '" << fname << "' at " << p.pos);
+ return 5;
+ case JSMN_ERROR_PART:
+ ERROR("Incomplete json packet in '" << fname << "' at "
+ << p.pos);
+ return 6;
+ case JSMN_SUCCESS:
+ break;
+ }
+ (*callback)(js.get(), tok.data(), p.toknext, metainfo);
+ return 0;
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/jsmn_extras.h b/tools/perf/pmu-events/topdown-parser/jsmn_extras.h
new file mode 100644
index 000000000000..b6721e50f064
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/jsmn_extras.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// --------------------------------------------------
+// File: jsmn_extras.h
+// --------------------------------------------------
+//
+
+// The header provides additional helpers based on the jsmn library.
+
+#ifndef JSMN_EXTRAS_H_
+#define JSMN_EXTRAS_H_
+
+#include <cstdlib>
+extern "C" {
+#include "../jsmn.h"
+}
+#include <map>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+namespace topdown_parser
+{
+int jsoneq(const char *json, const jsmntok_t *tok, const char *s);
+int get_primitive(const char *js, const jsmntok_t *t, int i,
+ std::string *retval);
+int get_key_val(const char *js, const jsmntok_t *t, int i,
+ std::pair<std::string, std::string> *P);
+int get_array_of_primitives(const char *js, const jsmntok_t *t, int i,
+ std::vector<std::string> *V);
+int get_struct(const char *js, const jsmntok_t *t, int i,
+ std::map<std::string, std::string> *data);
+int get_struct_of_array(
+ const char *js, const jsmntok_t *t, int i,
+ std::unordered_map<std::string, std::vector<std::string> > *data);
+int ParseJson(const char *fname,
+ void (*callback)(const char *, const jsmntok_t *, int, void *),
+ void *metainfo);
+
+} // namespace topdown_parser
+
+#endif // JSMN_EXTRAS_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 05/12] perf topdown-parser: Add a configuration.
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
The configuration.json holds configuration data that will be read
into the ConfigurationParameters class in configuration.h.
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
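Note (illustrative, not part of the patch): ParseConfigJson stores spreadsheet
style positions as zero-based indices. Row values have 1 subtracted
(`stoi(retval) - 1`) and column values are taken from a single letter
(`retval[0] - 'A'`). The sketch below shows the same conversions with input
validation added. The validation and the function names are assumptions made
for illustration; the patch performs the arithmetic directly and does not
check its input.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// "A" -> 0, "B" -> 1, ... Returns -1 unless `s` is one uppercase letter,
// so multi-letter columns such as "AA" are rejected rather than misread.
int ColumnLetterToIndexSketch(const std::string &s)
{
    if (s.size() != 1 || s[0] < 'A' || s[0] > 'Z')
        return -1;
    return s[0] - 'A';
}

// "1" -> 0, "2" -> 1, ... Returns -1 for non-numeric text, trailing junk,
// or row numbers below 1.
int RowNumberToIndexSketch(const std::string &s)
{
    try {
        std::size_t used = 0;
        const int row = std::stoi(s, &used);
        if (used != s.size() || row < 1)
            return -1;
        return row - 1;
    } catch (const std::logic_error &) {
        // std::stoi throws std::invalid_argument or std::out_of_range,
        // both of which derive from std::logic_error.
        return -1;
    }
}
```

Returning -1 is only one way to report bad input; routing the failure through
the FATAL macro from patch 01 would be closer to how the rest of the parser
handles unexpected configuration.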
.../topdown-parser/configuration.cpp | 198 ++++++++++++++++++
.../pmu-events/topdown-parser/configuration.h | 181 ++++++++++++++++
.../topdown-parser/configuration.json | 72 +++++++
3 files changed, 451 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/configuration.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/configuration.h
create mode 100644 tools/perf/pmu-events/topdown-parser/configuration.json
diff --git a/tools/perf/pmu-events/topdown-parser/configuration.cpp b/tools/perf/pmu-events/topdown-parser/configuration.cpp
new file mode 100644
index 000000000000..6cb4dffe7755
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/configuration.cpp
@@ -0,0 +1,198 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "configuration.h"
+
+#include <cassert>
+
+#include "jsmn_extras.h"
+#include "logging.h"
+
+namespace topdown_parser
+{
+/**
+ * kConfigParams is the set of all the parameters defined by the
+ * configuration.json file.
+ */
+ConfigurationParameters *const kConfigParams =
+ ConfigurationParameters::GetConfigurationParameters();
+
+ConfigurationParameters *ConfigurationParameters::config_param_instance_ =
+ nullptr;
+
+namespace
+{
+/**
+ * Parse the 'configuration.json' file to populate the configuration
+ * parameters `kConfigParams`. Each key in the Json file corresponds to
+ * a parameter in `kConfigParams`.
+ */
+void ParseConfigJson(const char *js, const jsmntok_t *t, int r,
+ void *metainfo __attribute__((unused)))
+{
+ for (int i = 1; i < r; ++i) {
+ if (jsoneq(js, &t[i], "configuration") == 0) {
+ i++;
+ assert(t[i].type == JSMN_OBJECT);
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "_COMMENT_") == 0) {
+ i++;
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "target") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ kConfigParams->target_ = retval;
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "metric_max_header") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->metric_max_header_ =
+ stoi(retval);
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "header_row") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->header_row = stoi(retval) - 1;
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "formula_start_colm") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->formula_start_colm_ =
+ retval[0] - 'A';
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "formula_end_colm") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->formula_end_colm_ =
+ retval[0] - 'A';
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "server_identifier_row") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->server_identifier_row_ =
+ stoi(retval) - 1;
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "first_level") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->first_level_ = stoi(retval);
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "last_level") == 0) {
+ i++;
+ std::string retval;
+ i = get_primitive(js, t, i, &retval);
+ if (!retval.empty()) {
+ kConfigParams->first_last_ = stoi(retval);
+ }
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "json_filename_hints") == 0) {
+ i++;
+ i = get_struct(js, t, i,
+ &(kConfigParams->json_filename_hints_));
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "output_directory_per_cpu") == 0) {
+ i++;
+ i = get_struct(
+ js, t, i,
+ &(kConfigParams->output_directory_per_cpu_));
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "perf_stat_switch_names") == 0) {
+ i++;
+ i = get_struct(
+ js, t, i,
+ &(kConfigParams->perf_stat_switch_names_));
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "filenames_for_fixed_events_vals") == 0) {
+ i++;
+ i = get_struct(
+ js, t, i,
+ &(kConfigParams
+ ->filenames_for_fixed_events_vals_));
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "cpu_to_model_number") == 0) {
+ i++;
+ i = get_struct_of_array(
+ js, t, i,
+ &(kConfigParams->cpu_to_model_number_));
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "selected_cpus") == 0) {
+ i++;
+ i = get_array_of_primitives(
+ js, t, i, &(kConfigParams->selected_cpus_));
+ continue;
+ }
+
+ if (jsoneq(js, &t[i], "dont_care_cpus") == 0) {
+ i++;
+ std::vector<std::string> retval;
+ i = get_array_of_primitives(js, t, i, &retval);
+ kConfigParams->dont_care_cpus_.insert(retval.begin(),
+ retval.end());
+ continue;
+ }
+
+ FATAL("Unexpected json key: "
+ << std::string(js + t[i].start, t[i].end - t[i].start));
+ }
+}
+
+} // namespace
+
+int ReadConfig()
+{
+ return ParseJson(kConfigParams->config_file_.c_str(), ParseConfigJson,
+ nullptr);
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/configuration.h b/tools/perf/pmu-events/topdown-parser/configuration.h
new file mode 100644
index 000000000000..4b0767c0c3ef
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/configuration.h
@@ -0,0 +1,181 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// -----------------------------------------------
+// File: configuration.h
+// -----------------------------------------------
+//
+// The configuration file "configuration.json" defines a set of parameters that
+// let the client control (1) generation of the auto-generated code, and (2)
+// parsing of the input csv file.
+// This header file provides the interface `ConfigurationParameters` to access
+// all those parameters.
+// Following is the list of variables defining each configuration parameter.
+// * output_path_
+// * filenames_for_fixed_events_vals_
+// * target_
+// * selected_cpus_
+// * dont_care_cpus_
+// * metric_max_header_
+// * header_row
+// * formula_start_colm_
+// * formula_end_colm_
+// * server_identifier_row_
+// * first_level_
+// * first_last_
+// * cpu_to_model_number_
+// * json_filename_hints_
+// * output_directory_per_cpu_
+// * perf_stat_switch_names_
+//
+// The implementation of interface `ConfigurationParameters` restricts its
+// instantiation to one object `kConfigParams`
+
+#ifndef TOPDOWN_PARSER_CONFIGURATION_H_
+#define TOPDOWN_PARSER_CONFIGURATION_H_
+
+#include <climits>
+#include <map>
+#include <string>
+#include <unordered_map>
+#include <unordered_set>
+#include <vector>
+
+namespace topdown_parser
+{
+/**
+ * Read configuration file "configuration.json"
+ */
+int ReadConfig();
+
+class ConfigurationParameters {
+ public:
+ /**
+ * The input configuration file name
+ */
+ std::string config_file_;
+
+ /**
+ * The path name of directory containing event encoding files
+ */
+ std::string event_data_dir_;
+
+ /**
+ * The directory to output the auto-generated file(s)
+ */
+ std::string output_path_;
+
+ /**
+ * Per CPU filename for reading fixed event values. Used in testing mode
+ */
+ std::map<std::string, std::string> filenames_for_fixed_events_vals_;
+
+ /**
+ * Generate topdown file for different projects or targets
+ */
+ std::string target_;
+
+ /**
+ * Generate topdown information only for selected CPUs
+ */
+ std::vector<std::string> selected_cpus_;
+
+ /**
+ * The CPUs to ignore
+ */
+ std::unordered_set<std::string> dont_care_cpus_;
+
+ /**
+ * Maximum length of the header printed on executing 'perf stat' command
+ */
+ size_t metric_max_header_;
+
+ /**
+ * 'header_row' is the row number of the header. The header of the input
+ * csv file specifies the information like level numbers, CPU product
+ * names, Metric Description etc. A typical header row looks like:
+ * Key | Level1 | Level2 | Level3 | SKX | SKL | ...
+ */
+ size_t header_row;
+
+ /**
+ * 'formula_start_colm_' is first column number specifying a formula
+ */
+ size_t formula_start_colm_;
+
+ /**
+ * 'formula_end_colm_' is last column number specifying a formula
+ */
+ size_t formula_end_colm_;
+
+ /**
+ * Row number of input csv file which identifies if a
+ * CPU product is a client or server.
+ */
+ size_t server_identifier_row_;
+
+ /**
+ * First and last topdown levels.
+ * A typical header row looks like:
+ * Key | Level1 | Level2 | Level3 | SKX | SKL | ...
+ *
+ * first_level_ is a number in [1, UINT_MAX] that selects the first
+ * level.
+ * first_last_ is a number in [1, UINT_MAX] that selects the last
+ * level.
+ */
+ size_t first_level_;
+ size_t first_last_;
+
+ /**
+ * Model numbers of CPUs
+ */
+ std::unordered_map<std::string, std::vector<std::string> >
+ cpu_to_model_number_;
+
+ /**
+ * Hints for event encoding JSON filenames
+ */
+ std::map<std::string, std::string> json_filename_hints_;
+
+ /**
+ * Output directory for perf-metric JSON files for each CPU
+ */
+ std::map<std::string, std::string> output_directory_per_cpu_;
+
+ /**
+ * The perf stat switch names for each top level metric
+ */
+ std::map<std::string, std::string> perf_stat_switch_names_;
+
+ /**
+ * GetConfigurationParameters return a single instance of
+ * ConfigurationParameters
+ */
+ static ConfigurationParameters *GetConfigurationParameters(void)
+ {
+ if (config_param_instance_ == nullptr)
+ config_param_instance_ = new ConfigurationParameters();
+
+ return config_param_instance_;
+ }
+
+ private:
+ static ConfigurationParameters *config_param_instance_;
+
+ ConfigurationParameters()
+ {
+ metric_max_header_ = UINT_MAX;
+ header_row = UINT_MAX;
+ formula_start_colm_ = UINT_MAX;
+ formula_end_colm_ = UINT_MAX;
+ server_identifier_row_ = UINT_MAX;
+ first_level_ = UINT_MAX;
+ first_last_ = UINT_MAX;
+ }
+};
+
+extern ConfigurationParameters *const kConfigParams;
+
+} // namespace topdown_parser
+
+#endif // TOPDOWN_PARSER_CONFIGURATION_H_
diff --git a/tools/perf/pmu-events/topdown-parser/configuration.json b/tools/perf/pmu-events/topdown-parser/configuration.json
new file mode 100644
index 000000000000..a9fddb54c8a1
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/configuration.json
@@ -0,0 +1,72 @@
+{
+ "configuration" : {
+ "_COMMENT_":"Generate topdown file for specific project or target.",
+ "target":"perf_json",
+
+ "_COMMENT_":"Hints for event encoding JSON filenames",
+ "json_filename_hints": {
+ "BDW":"broadwell",
+ "BDW-DE":"broadwellde",
+ "BDW-EP":"broadwellx",
+ "BDX":"broadwellx",
+ "CLX":"cascadelakex",
+ "HSW":"haswell",
+ "HSX":"haswellx",
+ "ICL":"icelake",
+ "IVB":"ivybridge",
+ "IVB-EP":"ivytown",
+ "IVT":"ivytown",
+ "JKT":"jaketown",
+ "SKL":"skylake",
+ "SKL-EP":"skylakex",
+ "SKX":"skylakex",
+ "SNB":"sandybridge",
+ "SNB-EP":"jaketown"
+ },
+
+ "_COMMENT_":"Output directory for perf-metric json files for each cpu",
+ "output_directory_per_cpu": {
+ "BDW":"broadwell",
+ "BDW-DE":"broadwellde",
+ "BDX":"broadwellx",
+ "CLX":"cascadelakex",
+ "HSW":"haswell",
+ "HSX":"haswellx",
+ "ICL":"icelake",
+ "IVB":"ivybridge",
+ "IVT":"ivytown",
+ "JKT":"jaketown",
+ "SKL":"skylake",
+ "SKX":"skylakex",
+ "SNB":"sandybridge"
+ },
+
+ "_COMMENT_":"Generate topdown information only for selected CPUs.",
+ "_COMMENT_":"The CPU name must conform to the naming convention in input csv file",
+ "_COMMENT_":"Recommended: Leave blank. The names will be inferred from input csv file.",
+ "selected_cpus":[ ],
+
+ "_COMMENT_":"The CPU names to ignore",
+ "dont_care_cpus":[ "CPX", "JKT", "CNL", "KBL", "KBLR", "CFL", "SNB-EP", "TGL" ],
+
+ "_COMMENT_":"header row is the row number of the header [1 - UINT_MAX]",
+ "header_row":"",
+
+ "_COMMENT_":"Formula start colm is first column number specifying a formula [A - Z]",
+ "formula_start_colm":"",
+
+ "_COMMENT_":"Formula end colm is last column number specifying a formula [A - Z]",
+ "formula_end_colm":"",
+
+ "_COMMENT_":"Row number of the input csv file specifying if a column is for Server or Client [1 - UINT_MAX]",
+ "server_identifier_row":"",
+
+ "_COMMENT_":"The first topdown level [1 - #Levels]",
+ "_COMMENT_":"All the levels before first_level will be ignored",
+ "first_level":"",
+
+ "_COMMENT_":"The last topdown level [1 - #Levels]",
+ "_COMMENT_":"All the levels after 'last_level' will be ignored",
+ "last_level":""
+ }
+}
--
2.29.2.222.g5d2a92d10f8-goog
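
Note on column letters: the comments in configuration.json give
formula_start_colm and formula_end_colm as letters [A - Z], while
ConfigurationParameters stores them as size_t. The sketch below shows one
way a single column letter could be mapped to a zero-based index and back.
ColumnLetterToIndex and IndexToColumnLetter are hypothetical names, not
functions from this series, and the zero-based convention is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the patch): map a single spreadsheet
// column letter ("A".."Z", either case) to a zero-based column index.
// Returns false when the input is not exactly one ASCII letter.
bool ColumnLetterToIndex(const std::string &letter, std::size_t *index)
{
	if (letter.size() != 1)
		return false;

	char c = letter[0];
	if (c >= 'a' && c <= 'z')
		c = static_cast<char>(c - 'a' + 'A');
	if (c < 'A' || c > 'Z')
		return false;

	*index = static_cast<std::size_t>(c - 'A');
	return true;
}

// Hypothetical inverse: zero-based index back to a column letter.
// Only single-letter columns (index 0..25) are representable.
char IndexToColumnLetter(std::size_t index)
{
	assert(index < 26);
	return static_cast<char>('A' + index);
}
```

With this convention "A" maps to 0 and "Z" to 25. Inputs such as "" or "AA"
are rejected rather than silently mapped, which keeps UINT_MAX available as
the "not set" marker used by the constructor.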
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 06/12] perf topdown-parser: Interface for TMA_Metrics.csv.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (4 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 05/12] perf topdown-parser: Add a configuration Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 07/12] perf topdown-parser: Metric expression parser Ian Rogers
` (6 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Reads the CSV file, then creates an in-memory model from the data.
Co-authored-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
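
Note on the hierarchy derivation: GetTopdownHierarchy walks the level
columns left to right and, within each column, attaches every non-empty
cell to the most recent non-empty cell in the column to its left. The
sketch below restates that rule on a table that holds only the level
columns. BuildHierarchy, Rows and Hierarchy are hypothetical names used
for illustration. Resetting last_parent per level and skipping children
that have no parent yet are simplifications of this sketch, and the table
is assumed to be rectangular.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Rows = std::vector<std::vector<std::string>>;
using Hierarchy = std::map<std::string, std::vector<std::string>>;

// Hypothetical sketch: rows[i][k] is the cell of row i in level column k
// (k == 0 is Level1). Every row is assumed to have the same width.
Hierarchy BuildHierarchy(const Rows &rows, const std::string &root)
{
	Hierarchy hierarchy;
	if (rows.empty())
		return hierarchy;

	const std::size_t num_levels = rows[0].size();
	for (std::size_t level = 0; level < num_levels; ++level) {
		std::string last_parent;
		for (const auto &row : rows) {
			const std::string &cell = row[level];

			// First-level metrics hang off the synthetic root.
			if (level == 0) {
				if (!cell.empty())
					hierarchy[root].push_back(cell);
				continue;
			}

			const std::string &left = row[level - 1];
			if (cell.empty() && !left.empty()) {
				// A metric one level up: remember it as parent.
				last_parent = left;
			} else if (!cell.empty() && left.empty() &&
				   !last_parent.empty()) {
				// A metric at this level: child of that parent.
				hierarchy[last_parent].push_back(cell);
			}
		}
	}
	return hierarchy;
}
```

For the seven-row example in the GetTopdownHierarchy comment (A, B, C, G,
D, E, F), this yields topdown -> {A}, A -> {B, G, D}, B -> {C} and
D -> {E, F}. G gets no entry because no Level3 row follows it.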
.../topdown-parser/dependence_dag_utils.cpp | 984 ++++++++++++++++++
.../topdown-parser/dependence_dag_utils.h | 178 ++++
2 files changed, 1162 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/dependence_dag_utils.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/dependence_dag_utils.h
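
Note on formula inheritance: PopulateEmptyFormulas scans each row from the
last formula column toward the first, so an empty cell takes its formula
from a client column to its right. The sketch below applies the same rule
to a single row. FillRow and the is_server vector are hypothetical
stand-ins for the per-column IsServer() lookup. As in the posted code,
only client formulas are carried forward, so a server formula is never
propagated to another column.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: fill the empty formula cells of one row.
// cells[j] is the formula in column j and is_server[j] tells whether
// column j is a server product. Both vectors must have the same size.
void FillRow(std::vector<std::string> *cells,
	     const std::vector<bool> &is_server)
{
	assert(cells->size() == is_server.size());

	std::string last_client_data;
	// Scan right to left; "j-- > 0" avoids unsigned underflow.
	for (std::size_t j = cells->size(); j-- > 0;) {
		std::string &cell = (*cells)[j];
		if (!is_server[j]) {
			// Clients update the carried formula or inherit it.
			if (!cell.empty())
				last_client_data = cell;
			else
				cell = last_client_data;
		} else if (cell.empty()) {
			// Servers only inherit; they never update the carry.
			cell = last_client_data;
		}
	}
}
```

The "j-- > 0" loop form is the one GetFormulaEndColm already uses. It
stays well defined even when the scan has to reach column 0.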
diff --git a/tools/perf/pmu-events/topdown-parser/dependence_dag_utils.cpp b/tools/perf/pmu-events/topdown-parser/dependence_dag_utils.cpp
new file mode 100644
index 000000000000..7c9eff06e2a9
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/dependence_dag_utils.cpp
@@ -0,0 +1,984 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "dependence_dag_utils.h"
+
+#include <cassert>
+#include <fstream>
+#include <regex>
+
+#include "configuration.h"
+#include "general_utils.h"
+#include "logging.h"
+
+namespace topdown_parser
+{
+char g_PerfmonVersion[VERSION_MAX_STRLEN];
+
+std::map<std::string, TopdownInfo> *g_TopdownHierarchy = nullptr;
+
+std::vector<std::string> *g_RelevantCpus = nullptr;
+
+std::vector<std::set<std::string> > *g_CpuAliasesForEventInfo = nullptr;
+
+namespace
+{
+/**
+ * Column number in the input csv file specifying 'Count Domain'
+ */
+size_t g_CountDomainColm = UINT_MAX;
+
+/**
+ * Column number in the input csv file specifying 'Metric Group'
+ */
+size_t g_MetricGroupColm = UINT_MAX;
+
+/**
+ * Column number in the input csv file specifying 'Description'
+ */
+size_t g_DescColm = UINT_MAX;
+
+/**
+ * header_rowKey is used to derive the row number of the header. The
+ * header of the input csv file specifies the information like level
+ * numbers, CPU product names, Metric Description etc. A typical header
+ * row looks like: Key | Level1 | Level2 | SKX | SKL | Count Domain |
+ * Metric Description | ...
+ */
+const char *header_rowKey = "level[0-9]+";
+
+/**
+ * formula_start_colm_Key is used to derive the first column number
+ * specifying a formula.
+ */
+const char *formula_start_colm_Key = "level[0-9]+";
+
+/**
+ * formula_end_colm_Key is used to derive the last column number
+ * specifying a formula.
+ */
+const char *formula_end_colm_Key = "locate-with";
+
+/**
+ * g_CountDomainColmKey is used to derive column number in the input csv
+ * file specifying 'Count Domain'.
+ */
+const char *g_CountDomainColmKey = "count";
+
+/**
+ * g_DescColmKey is used to derive column number in the input csv file
+ * specifying 'Description'.
+ */
+const char *g_DescColmKey = "description";
+
+/**
+ * g_MetricGroupColmKey is used to derive column number in the input csv
+ * file specifying 'Metric Group'.
+ */
+const char *g_MetricGroupColmKey = "group";
+
+/**
+ * Last row number in the input csv file specifying topdown levels
+ */
+size_t g_LevelEndRow = UINT_MAX;
+
+/**
+ * g_LevelEndRowKey is used to derive the last row number in the input
+ * csv file specifying topdown levels.
+ */
+const char *g_LevelEndRowKey = "\\.";
+
+/**
+ * First and last column numbers in the input csv file specifying
+ * topdown levels.
+ */
+size_t g_LevelStartColm = UINT_MAX;
+size_t g_LevelEndColm = UINT_MAX;
+
+/**
+ * Initialize globals.
+ */
+void InitGlobals()
+{
+ if (g_TopdownHierarchy == nullptr) {
+ g_TopdownHierarchy = new std::map<std::string, TopdownInfo>;
+ }
+
+ if (g_RelevantCpus == nullptr) {
+ g_RelevantCpus = new std::vector<std::string>;
+ }
+
+ if (g_CpuAliasesForEventInfo == nullptr) {
+ g_CpuAliasesForEventInfo =
+ new std::vector<std::set<std::string> >;
+ }
+}
+
+/**
+ * Plot the topdown hierarchy in graphviz dot.
+ */
+void PlotTopdownHierarchy(
+ const std::map<std::string, TopdownInfo> &g_TopdownHierarchy)
+{
+ std::string topdown_hierarchy_dot =
+ kConfigParams->output_path_ + "topdown_hierarchy.dot";
+ std::ofstream ofile_dot(topdown_hierarchy_dot);
+ if (!ofile_dot.is_open()) {
+ ERROR("Error opening file: " << topdown_hierarchy_dot);
+ exit(1);
+ }
+
+ INFO("Generating topdown hierarchy file: " << topdown_hierarchy_dot);
+
+ ofile_dot << "digraph graphname {\n";
+ int toggle = 0;
+ std::string color = "[color=blue]";
+ for (auto &p : g_TopdownHierarchy) {
+ auto metric_name = std::string("\"") + p.first + "\"";
+
+ if (toggle == 0) {
+ color = "[color=blue]";
+ } else {
+ color = "[color=red]";
+ }
+
+ for (size_t i = 0; i < p.second.child_metrics.size() - 1; ++i) {
+ ofile_dot << metric_name << " -> "
+ << "\"" << p.second.child_metrics[i] << "\" "
+ << color << std::endl;
+ }
+ ofile_dot
+ << metric_name << "->"
+ << "\""
+ << p.second.child_metrics[p.second.child_metrics.size() -
+ 1]
+ << "\" " << color << std::endl;
+ toggle = toggle ^ 1;
+ }
+ ofile_dot << "}\n";
+
+ ofile_dot.close();
+}
+
+/**
+ * GetTopdownHierarchy derives the topdown hierarchy
+ * from the csv file.
+ * The hierarchy looks like
+ *
+ *            g_LevelStartColm   g_LevelEndColm
+ *                    |                 |
+ *                    V                 V
+ * header_row     | Level1 | Level2 | Level3 |
+ * header_row + 1 | A      |        |        |
+ *                |        | B      |        |
+ *                |        |        | C      |
+ *                |        | G      |        |
+ *                |        | D      |        |
+ *                |        |        | E      |
+ * g_LevelEndRow  |        |        | F      |
+ *
+ * The function populates:
+ * g_TopdownHierarchy["topdown"] --> {"<perf-stat-switch-name>", {"A"}}
+ * g_TopdownHierarchy["A"] --> {"<perf-stat-switch-name>", {"B", "G", "D"}}
+ * g_TopdownHierarchy["B"] --> {"<perf-stat-switch-name>", {"C"}}
+ * g_TopdownHierarchy["D"] --> {"<perf-stat-switch-name>", {"E", "F"}}
+ *
+ * <perf-stat-switch-name> for a metric is the name of the switch that
+ * is used to invoke perf stat on that metric. These names are
+ * provided by the configuration parameter perf_stat_switch_names_ for
+ * each parent metric.
+ */
+void GetTopdownHierarchy(const std::vector<std::vector<std::string> > &records)
+{
+ assert((UINT_MAX != kConfigParams->header_row &&
+ UINT_MAX != g_LevelEndRow && UINT_MAX != g_LevelStartColm &&
+ UINT_MAX != g_LevelEndColm) &&
+ "Cannot find topdown hierarchy");
+
+ std::string last_parent("");
+ for (size_t level = g_LevelStartColm; level <= g_LevelEndColm;
+ level++) {
+ for (size_t i = kConfigParams->header_row + 1;
+ i <= g_LevelEndRow; ++i) {
+ if (records[i][level].empty() &&
+ records[i][level - 1].empty()) {
+ continue;
+ }
+
+ // All the metrics in the first level become
+ // the sub-metrics of the "topdown" metric.
+ if (g_LevelStartColm == level) {
+ if (!records[i][level].empty()) {
+ (*g_TopdownHierarchy)[std::string(
+ "topdown")]
+ .child_metrics.push_back(
+ records[i][level]);
+ }
+ continue;
+ }
+
+ // For the case
+ // Level1 | Level2 | Level3 |
+ // A | | |
+ // We register the parent metric "A"
+ if (records[i][level].empty() &&
+ !records[i][level - 1].empty()) {
+ last_parent = records[i][level - 1];
+ continue;
+ }
+
+ // For the case
+ // Level1 | Level2 | Level3 |
+ // | B | |
+ // We make "B" the sub-metric of the registered
+ // parent metric "A".
+ if (!records[i][level].empty() &&
+ records[i][level - 1].empty()) {
+ (*g_TopdownHierarchy)[last_parent]
+ .child_metrics.push_back(
+ records[i][level]);
+ continue;
+ }
+ }
+ }
+
+ // Assign perf stat switch names to parent metrics. The perf
+ // stat switch names are provided by a configuration parameter and
+ // used while invoking the perf stat command.
+ for (auto &p : *g_TopdownHierarchy) {
+ if (kConfigParams->perf_stat_switch_names_.count(p.first) !=
+ 0) {
+ p.second.perf_stat_switch_name =
+ kConfigParams->perf_stat_switch_names_.at(
+ p.first);
+ }
+ // For some targets, like perf_public,
+ // perf_stat_switch_names_ will not be available.
+ }
+}
+
+/**
+ * GetPerfmonVersion extracts the version number from the
+ * input csv file. The number is extracted as follows:
+ * 1. Find the column in the first row of input csv file having a regex
+ * match with keyword "version"
+ * 2. The version number is typically specified in the very next column
+ * of the same row.
+ */
+void GetPerfmonVersion(const std::vector<std::vector<std::string> > &records)
+{
+ std::regex r("version", std::regex_constants::icase);
+ std::string retval;
+
+ for (size_t j = 0; j < records[0].size(); j++) {
+ if (regex_match(records[0][j], r)) {
+ retval = (j + 1 < records[0].size()) ?
+ records[0][j + 1] :
+ "";
+ strncpy(g_PerfmonVersion, retval.c_str(),
+ sizeof(g_PerfmonVersion) - 1);
+ return;
+ }
+ }
+
+ strncpy(g_PerfmonVersion, "", sizeof(g_PerfmonVersion));
+}
+
+/**
+ * Determine the level end row. Level end row is defined as the
+ * last row number in the input csv file specifying a topdown
+ * level. Typically it is marked in the csv file with a 'dot'.
+ */
+size_t GetLevelEndRow(const std::vector<std::vector<std::string> > &records)
+{
+ std::regex r(g_LevelEndRowKey, std::regex_constants::icase);
+ for (size_t i = kConfigParams->header_row + 1; i < records.size();
+ ++i) {
+ if (regex_search(records[i][0], r)) {
+ return i - 1;
+ }
+ }
+
+ ERROR("Failed to derive the level end row using level end row"
+ " key: "
+ << g_LevelEndRowKey);
+ INFO("Level end row not found. Update the 'g_LevelEndRowKey' in"
+ " dependence_dag_utils.cpp");
+ exit(1);
+
+ return UINT_MAX;
+}
+
+/**
+ * The function determines the row number of input csv file which
+ * specifies if a CPU product is a client or server. It is typically the
+ * row above the header row.
+ */
+size_t GetServerIdentifierRow()
+{
+ if (UINT_MAX != kConfigParams->server_identifier_row_) {
+ return kConfigParams->server_identifier_row_;
+ }
+
+ return kConfigParams->header_row - 1;
+}
+
+/**
+ * Determine the last column letter in the input csv file specifying
+ * topdown levels. It is derived as the column before the one which
+ * starts specifying formulas.
+ *
+ * A typical header row looks like:
+ * Key | Level1 | Level2 | Level3 | SKX | SKL | ...
+ * The function returns the column number for Level3.
+ *
+ * In case the kConfigParams->first_last_ is provided, which specifies
+ * the last level number, then the function returns the column number
+ * corresponding to that level. For example, if
+ * kConfigParams->first_last_ == 2, then the function returns the column
+ * number for Level2.
+ */
+size_t GetLevelEndColm(const std::vector<std::vector<std::string> > &records)
+{
+ if (UINT_MAX != kConfigParams->first_last_) {
+ std::string search_string("level");
+ search_string += std::to_string(kConfigParams->first_last_);
+ std::regex r(search_string.c_str(),
+ std::regex_constants::icase);
+
+ for (size_t j = 1; j < records[kConfigParams->header_row].size(); j++) {
+ const std::string &cell_content =
+ records[kConfigParams->header_row][j];
+ if (regex_search(cell_content, r)) {
+ return j;
+ }
+ }
+ ERROR("Wrong specification of last level in "
+ "configuration file. Current Value: "
+ << kConfigParams->first_last_);
+ INFO("Assumption is levels are marked in the csv file "
+ "as Level1, Level2, "
+ "..., Leveln and the expected values of last level"
+ " (to be "
+ "provided in the configuration file) are [1 - n]");
+ exit(1);
+ }
+
+ if (kConfigParams->formula_start_colm_ == UINT_MAX) {
+ assert(0 && "kConfigParams->formula_start_colm_ not set");
+ }
+ return kConfigParams->formula_start_colm_ - 1;
+}
+
+/**
+ * Determine the first column letter in the input csv file specifying
+ * topdown levels. It is derived as the column, in the header row,
+ * having a regex match with formula_start_colm_Key.
+ *
+ * A typical header row looks like:
+ * Key | Level1 | Level2 | SKX | SKL | ...
+ * The function returns the column number for Level1.
+ *
+ * In case the kConfigParams->first_level_ is provided, which specifies
+ * the level number to begin with, then the function returns the column
+ * number corresponding to that level.
+ * For example, if kConfigParams->first_level_ == 2, then the function
+ * returns the column number for Level2.
+ */
+size_t GetLevelStartColm(const std::vector<std::vector<std::string> > &records)
+{
+ if (UINT_MAX != kConfigParams->first_level_) {
+ std::string search_string("level");
+ search_string += std::to_string(kConfigParams->first_level_);
+ std::regex r(search_string.c_str(),
+ std::regex_constants::icase);
+
+ for (size_t j = 1;
+ j < records[kConfigParams->header_row].size(); j++) {
+ const std::string &cell_content =
+ records[kConfigParams->header_row][j];
+ if (regex_search(cell_content, r)) {
+ return j;
+ }
+ }
+
+ ERROR("Wrong specification of first level in "
+ "configuration file. Current Value: "
+ << kConfigParams->first_level_);
+ INFO("Assumption is levels are marked in the csv file "
+ "as Level1, Level2, "
+ "..., Leveln and the expected values of first "
+ "level (to be "
+ "provided in the configuration file) are [1 - n]");
+ exit(1);
+ }
+
+ if (std::strcmp(formula_start_colm_Key, "") == 0) {
+ FATAL("Set formula_start_colm_Key in "
+ "dependence_dag_utils.cpp file");
+ }
+
+ std::regex r(formula_start_colm_Key, std::regex_constants::icase);
+ for (size_t j = 1; j < records[kConfigParams->header_row].size(); j++) {
+ if (regex_search(records[kConfigParams->header_row][j], r)) {
+ return j;
+ }
+ }
+
+ ERROR("Wrong specification of formula start column key. "
+ "Current Value: "
+ << formula_start_colm_Key);
+ INFO("Assumption is 'g_LevelStartColm' is derived as the first "
+ "column whose header matches the formula start column key."
+ " Try updating the formula_start_colm_Key in "
+ "dependence_dag_utils.cpp.");
+ exit(1);
+}
+
+/**
+ * Derives the header row as the first row in csv file counting from
+ * topmost row, that has a substring match with header_rowKey on any of
+ * its cells.
+ */
+size_t GetHeaderRow(const std::vector<std::vector<std::string> > &records)
+{
+ if (UINT_MAX != kConfigParams->header_row) {
+ return kConfigParams->header_row;
+ }
+
+ std::regex r(header_rowKey, std::regex_constants::icase);
+ for (size_t i = 0; i < records.size(); ++i) {
+ for (size_t j = 0; j < records[i].size(); j++) {
+ if (regex_search(records[i][j], r)) {
+ return i;
+ }
+ }
+ }
+
+ ERROR("Header row not found.");
+ INFO("Update the header row keys in dependence_dag_utils.cpp");
+ exit(1);
+ return UINT_MAX;
+}
+
+/**
+ * Derives "the first column number specifying a formula" as the first
+ * column in the csv file, counting from the left in the header row, that
+ * does not match formula_start_colm_Key. Counting starts at the second
+ * column from the left because the leftmost column holds "Key" in the
+ * header row.
+ */
+size_t
+GetFormulaStartColm(const std::vector<std::vector<std::string> > &records)
+{
+ if (UINT_MAX != kConfigParams->formula_start_colm_) {
+ return kConfigParams->formula_start_colm_;
+ }
+
+ std::regex r(formula_start_colm_Key, std::regex_constants::icase);
+ for (size_t j = 1; j < records[kConfigParams->header_row].size(); j++) {
+ if (!regex_search(records[kConfigParams->header_row][j], r)) {
+ return j;
+ }
+ }
+ assert(0 && "formula start column not found. update the formula "
+ "start column keys");
+ return UINT_MAX;
+}
+
+/**
+ * Derives "the last column number specifying a formula".
+ * To do so, we first find the first column, counting from the right
+ * in the header row, that matches formula_end_colm_Key.
+ * The desired column is the one to the left of that column.
+ */
+size_t GetFormulaEndColm(const std::vector<std::vector<std::string> > &records)
+{
+ if (UINT_MAX != kConfigParams->formula_end_colm_) {
+ return kConfigParams->formula_end_colm_;
+ }
+
+ std::regex r(formula_end_colm_Key, std::regex_constants::icase);
+ for (size_t j = records[kConfigParams->header_row].size(); j-- > 0;) {
+ if (regex_search(records[kConfigParams->header_row][j], r)) {
+ return j - 1;
+ }
+ }
+ assert(0 && "formula end column not found. update the formula end "
+ "column keys");
+ return UINT_MAX;
+}
+
+/**
+ * Derives "Column number in the input csv file specifying
+ * 'Count Domain'" as the column number, counting from leftmost, that
+ * has a substring match with g_CountDomainColmKey.
+ */
+size_t GetCountDomainColm(const std::vector<std::vector<std::string> > &records)
+{
+ std::regex r(g_CountDomainColmKey, std::regex_constants::icase);
+ for (size_t j = 1; j < records[kConfigParams->header_row].size(); j++) {
+ if (regex_search(records[kConfigParams->header_row][j], r)) {
+ return j;
+ }
+ }
+
+ ERROR("Count domain column not found.");
+ INFO("Update 'g_CountDomainColmKey' in "
+ "dependence_dag_utils.cpp");
+ exit(1);
+
+ return UINT_MAX;
+}
+
+/**
+ * Get the alias CPUs, marked in the csv file as CPUX/CPUY.
+ */
+void GetAliasCpus(const std::vector<std::vector<std::string> > &records)
+{
+ std::regex r("\\/");
+ for (size_t j = kConfigParams->formula_start_colm_;
+ j <= kConfigParams->formula_end_colm_; j++) {
+ const std::string &cell_content =
+ records[kConfigParams->header_row][j];
+ if (regex_search(cell_content, r)) {
+ std::set<std::string> alias_set;
+ std::vector<std::string> split_values =
+ Split(cell_content, '/');
+
+ for (auto &item : split_values) {
+ if (kConfigParams->dont_care_cpus_.count(
+ Trim(item)) == 0) {
+ alias_set.insert(Trim(item));
+ }
+ }
+ if (alias_set.size() > 1) {
+ g_CpuAliasesForEventInfo->push_back(alias_set);
+ }
+ }
+ }
+}
+
+/**
+ * Determine the cpus relevant to generate topdown hierarchy.
+ * If kConfigParams->selected_cpus_ is present (which are the selected
+ * CPUs provided using configuration parameter selected_cpus), then the
+ * function return value == kConfigParams->selected_cpus_.
+ * If not, return value =
+ * (cpu names derived from csv file) - kConfigParams->dont_care_cpus_
+ */
+std::vector<std::string>
+GetRelevantCpus(const std::vector<std::vector<std::string> > &records)
+{
+ if (!g_RelevantCpus->empty()) {
+ return *g_RelevantCpus;
+ }
+
+ if (!kConfigParams->selected_cpus_.empty()) {
+ return (kConfigParams->selected_cpus_);
+ }
+
+ std::vector<std::string> retval;
+
+ std::regex r("\\/");
+ for (size_t j = kConfigParams->formula_start_colm_;
+ j <= kConfigParams->formula_end_colm_; j++) {
+ const std::string &cell_content =
+ records[kConfigParams->header_row][j];
+
+ // Check if the CPUs names are provided as CPUx/CPUy
+ if (regex_search(cell_content, r)) {
+ std::vector<std::string> split_values =
+ Split(cell_content, '/');
+
+ for (auto &item : split_values) {
+ if (kConfigParams->dont_care_cpus_.count(
+ Trim(item)) == 0) {
+ retval.push_back(Trim(item));
+ }
+ }
+ } else {
+ if (kConfigParams->dont_care_cpus_.count(
+ Trim(cell_content)) == 0) {
+ retval.push_back(Trim(cell_content));
+ }
+ }
+ }
+
+ return retval;
+}
+
+/**
+ * Determines the column number in the input csv file specifying
+ * 'Description'. It is derived as the column number, counting from
+ * leftmost, that has a substring match with g_DescColmKey.
+ */
+size_t GetDescColm(const std::vector<std::vector<std::string> > &records)
+{
+ std::regex r(g_DescColmKey, std::regex_constants::icase);
+ for (size_t j = 1; j < records[kConfigParams->header_row].size(); j++) {
+ if (regex_search(records[kConfigParams->header_row][j], r)) {
+ return j;
+ }
+ }
+
+ ERROR("Description column not found.");
+ INFO("Update 'g_DescColmKey' in "
+ "dependence_dag_utils.cpp");
+ exit(1);
+
+ return UINT_MAX;
+}
+
+/**
+ * Determines the column number in the input csv file specifying
+ * 'Metric Group'. It is derived as the column number, counting from
+ * leftmost, that has a substring match with g_MetricGroupColmKey.
+ */
+size_t GetMetricGroupColm(const std::vector<std::vector<std::string> > &records)
+{
+ std::regex r(g_MetricGroupColmKey, std::regex_constants::icase);
+ for (size_t j = 1; j < records[kConfigParams->header_row].size(); j++) {
+ if (regex_search(records[kConfigParams->header_row][j], r)) {
+ return j;
+ }
+ }
+
+ ERROR("Metric Group column not found.");
+ INFO("Update 'g_MetricGroupColmKey' in "
+ "dependence_dag_utils.cpp");
+ exit(1);
+
+ return UINT_MAX;
+}
+
+/**
+ * 'IsServer' determines whether the product represented by a column
+ * number is a server or a client.
+ */
+bool IsServer(const std::vector<std::vector<std::string> > &records,
+ const size_t product_column_number)
+{
+ std::regex r("server", std::regex_constants::icase);
+ if (regex_match(records[kConfigParams->server_identifier_row_]
+ [product_column_number],
+ r)) {
+ return true;
+ }
+
+ return false;
+}
+
+/**
+ * The input csv file has intentionally omitted some formulas for many
+ * metrics. The idea is that those missing formulas can be derived
+ * using an inheritance rule which says:
+ *
+ * Client products (like SNB/IVB/HSW/BDW/SKL) inherit from their
+ * predecessors. E.g. BDW inherits HSW (which inherits IVB)
+ *
+ * Server products (like JKT/IVT/HSX/BDX) inherit a baseline core and
+ * build on predecessors. E.g. HSX inherits HSW and builds on IVT
+ * (which inherits IVB)
+ *
+ * PopulateEmptyFormulas modifies the `records` (which is the in-memory
+ * representation of the input csv file), to fill in the cell with
+ * missing formulas based on above inheritance rule.
+ *
+ */
+void PopulateEmptyFormulas(std::vector<std::vector<std::string> > *records)
+{
+ bool server_bool = false;
+
+ for (size_t i = kConfigParams->header_row + 1; i < records->size();
+ ++i) {
+ std::string last_client_data("");
+
+ for (size_t j = kConfigParams->formula_end_colm_;
+ j >= kConfigParams->formula_start_colm_; j--) {
+ server_bool = IsServer(*records, j);
+
+ if (!server_bool) {
+ // A client inherits missing data from
+ // its predecessor clients.
+ if (!(*records)[i][j].empty()) {
+ last_client_data = (*records)[i][j];
+ } else {
+ (*records)[i][j] = last_client_data;
+ }
+ } else {
+ // A server inherits missing data
+ // from its predecessor clients.
+ if ((*records)[i][j].empty()) {
+ (*records)[i][j] = last_client_data;
+ }
+ }
+ }
+ }
+}
+
+/**
+ * `records` is the in-memory representation of the input csv file.
+ * `ParseRecordToMappedData` parses each cell given by
+ * `records[row][column]` and extracts information as follows:
+ *
+ * For example: For the following csv entry
+ *   0   |   1    |    2    |      3       |      4      |      5
+ *  Key  | Level1 | SKX     | Count Domain | Description | Metric Group
+ *  P    | M      | Formula | Slots        | description | MG
+ *
+ * For the cell specifying "Formula", the information collected are:
+ * (1) row and column number: 0,2
+ * (2) Textual content of the cell: "Formula"
+ * (3) Name of the header: "SKX"
+ * (4) The count domain: "Slots"
+ * (5) Descriptive text: "Some description"
+ * (6) Metric group: "MG"
+ * (7) Key: "M"
+ * (8) Prefix: "P"
+ * (9) Aux data: A collection of data like Count Domain, Description,
+ * Metric Group etc.
+ */
+MappedData
+ParseRecordToMappedData(const std::vector<std::vector<std::string> > &records,
+ const size_t &row, const size_t &column)
+{
+ MappedData obj;
+ obj.row_ = row;
+ obj.column_ = column;
+ obj.cell_content_ = records[row][column];
+ obj.header_name_ = records[kConfigParams->header_row][column];
+ obj.count_domain_ = records[row][g_CountDomainColm];
+ obj.description_ = records[row][g_DescColm];
+ obj.metric_group_ = records[row][g_MetricGroupColm];
+
+ // Find metric name
+ // This is equal to the first non-empty string in `row` before
+ // the column starting formula/expression specification.
+ for (size_t j = kConfigParams->formula_start_colm_; j-- > 0;) {
+ if (!records[row][j].empty()) {
+ obj.metric_name_ = records[row][j];
+ break;
+ }
+ }
+
+ if (obj.metric_name_.empty()) {
+ std::cerr << "key missing for row: " << row
+ << " column: " << column << "\n";
+ assert(0);
+ }
+
+ // Find the prefix string.
+ bool flag = true;
+ for (size_t j = kConfigParams->formula_start_colm_; j-- > 0;) {
+ if (!records[row][j].empty()) {
+ if (flag) {
+ flag = false;
+ } else {
+ obj.prefix_ = records[row][j] + obj.prefix_;
+ }
+ }
+ }
+
+ // Find the aux_data string.
+ for (size_t j = kConfigParams->formula_end_colm_ + 1;
+ j < records[row].size(); j++) {
+ if (!records[row][j].empty()) {
+ obj.aux_data_ += "\t * " +
+ records[kConfigParams->header_row][j] +
+ ": " + records[row][j] + "\n";
+ }
+ }
+
+ return obj;
+}
+
+std::string GetKey(const MappedData &data)
+{
+ return data.metric_name_ + "_" + data.header_name_;
+}
+
+/**
+ * Create a dependence dag using the 'records' data-structure
+ * (which is the in-memory representation of the input csv file)
+ * "dependence dag" is implemented as a map as follows:
+ * A. Suppose we have rows
+ * Level1 Level2 || SKL BDW
+ * K L1 || P1*L2 P2*L2
+ * K L2 || P3*Q3 P4*Q4
+ *
+ * The information we will be storing in the map are as follows:
+ *
+ * Map Key -> an object of `MappedData`
+ * -----------------------------------------
+ * <metric>_<CPU> -> {<header>, <textual formula>, <prefix>, ...}
+ * L1_SKL -> {SKL, P1*L2, "K", ...}
+ * L1_BDW -> {BDW, P2*L2, "K", ...}
+ * L2_SKL -> {SKL, P3*Q3, "K", ...}
+ * L2_BDW -> {BDW, P4*Q4, "K", ...}
+ */
+std::unordered_map<std::string, MappedData>
+CreateDependenceDag(const std::vector<std::vector<std::string> > &records)
+{
+ std::unordered_map<std::string, MappedData> dependence_dag;
+
+ // Store the records in a dependence std::map
+ for (size_t i = kConfigParams->header_row + 1; i < records.size();
+ ++i) {
+ for (size_t j = kConfigParams->formula_start_colm_;
+ j <= kConfigParams->formula_end_colm_; j++) {
+ MappedData data =
+ ParseRecordToMappedData(records, i, j);
+
+ // Skip std::map population for irrelevant keys.
+ if (data.metric_name_.empty() ||
+ data.metric_name_ == ".") {
+ continue;
+ }
+
+ std::string key = GetKey(data);
+
+ if (dependence_dag.count(key) > 0) {
+ std::cerr << "Duplicate key: " << key
+ << " Row: " << i << " Colm: " << j
+ << "\n";
+ assert(0 && "duplicate!!");
+ } else {
+ dependence_dag[key] = data;
+ }
+ }
+ }
+
+ assert(!dependence_dag.empty() && "empty dependence dag");
+
+ // Replace each entry whose column header is an alias label such as
+ // SKL/BDW with one separate entry per CPU.
+ std::regex r("\\/");
+ std::vector<std::string> keys_to_remove;
+ std::vector<std::pair<std::string, MappedData> > keys_to_insert;
+ for (auto &p : dependence_dag) {
+ const std::string &key = p.first;
+ MappedData &value = p.second;
+
+ if (regex_search(value.header_name_, r)) {
+ std::vector<std::string> split_values =
+ Split(value.header_name_, '/');
+
+ for (auto &item : split_values) {
+ MappedData new_value = value;
+ new_value.header_name_ = Trim(item);
+ std::string new_mapkey = GetKey(new_value);
+ if (dependence_dag.count(new_mapkey) > 0) {
+ std::cerr << "Duplicate key: "
+ << new_mapkey << "\n";
+ assert(0 && "Duplicate 2");
+ }
+ keys_to_insert.push_back(
+ std::pair<std::string, MappedData>(
+ new_mapkey, new_value));
+ }
+ keys_to_remove.push_back(key);
+ }
+ }
+
+ for (auto &delkey : keys_to_remove) {
+ dependence_dag.erase(delkey);
+ }
+ for (auto &insertkey : keys_to_insert) {
+ dependence_dag.insert(insertkey);
+ }
+
+ // Adding description for dummy metric topdown.
+ for (auto &known_cpu : *g_RelevantCpus) {
+ MappedData &data =
+ dependence_dag[std::string("topdown") + "_" + known_cpu];
+ data.header_name_ = known_cpu;
+ data.metric_name_ = std::string("topdown") + "_" + known_cpu;
+ data.description_ = std::string(
+ "Intel Topdown analysis expressed in % of issue"
+ " slots");
+ }
+
+ return dependence_dag;
+}
+
+/**
+ * Print the configuration values derived from the csv file.
+ */
+void PrintConfigVars()
+{
+ std::cout << std::endl;
+ INFO("Important csv artifacts");
+ INFO(std::string("Header row number = ") +
+ std::to_string(kConfigParams->header_row + 1));
+ INFO(std::string("Server identifier row number = ") +
+ std::to_string(kConfigParams->server_identifier_row_ + 1));
+ INFO(std::string("Formula start column = ") +
+ std::string(1, static_cast<char>(
+ 'A' + kConfigParams->formula_start_colm_)));
+ INFO(std::string("Formula end column = ") +
+ std::string(1, static_cast<char>(
+ 'A' + kConfigParams->formula_end_colm_)));
+ INFO(std::string("Level start column = ") +
+ std::string(1, static_cast<char>('A' + g_LevelStartColm)));
+ INFO(std::string("Level end column = ") +
+ std::string(1, static_cast<char>('A' + g_LevelEndColm)));
+ INFO(std::string("Level end row number = ") +
+ std::to_string(g_LevelEndRow + 1));
+ INFO(std::string("Count Domain column = ") +
+ std::string(1, static_cast<char>('A' + g_CountDomainColm)));
+ INFO(std::string("Description column = ") +
+ std::string(1, static_cast<char>('A' + g_DescColm)));
+
+ std::cout << std::endl;
+ INFO("Relevant CPUs = " << *g_RelevantCpus);
+ INFO("Don't care CPUs = " << kConfigParams->dont_care_cpus_);
+ INFO("CPU alias sets for event encodings");
+ for (auto &alias_set : *g_CpuAliasesForEventInfo) {
+ INFO("\t{" << alias_set << "} ");
+ }
+}
+
+} // namespace
+
+std::unordered_map<std::string, MappedData>
+ProcessRecords(std::vector<std::vector<std::string> > *records)
+{
+ InitGlobals();
+
+ // Task 0
+ GetPerfmonVersion(*records);
+
+ // Task 1
+ kConfigParams->header_row = GetHeaderRow(*records);
+ kConfigParams->formula_start_colm_ = GetFormulaStartColm(*records);
+ kConfigParams->formula_end_colm_ = GetFormulaEndColm(*records);
+ g_CountDomainColm = GetCountDomainColm(*records);
+ g_DescColm = GetDescColm(*records);
+ g_MetricGroupColm = GetMetricGroupColm(*records);
+ g_LevelStartColm = GetLevelStartColm(*records);
+ g_LevelEndColm = GetLevelEndColm(*records);
+ g_LevelEndRow = GetLevelEndRow(*records);
+ (*g_RelevantCpus) = GetRelevantCpus(*records);
+ kConfigParams->server_identifier_row_ = GetServerIdentifierRow();
+
+ // Task 2
+ GetTopdownHierarchy(*records);
+ PlotTopdownHierarchy(*g_TopdownHierarchy);
+
+ // Task 3
+ GetAliasCpus(*records);
+
+ // Task 4
+ PopulateEmptyFormulas(records);
+
+ // Task 5
+ PrintConfigVars();
+
+ // Task 6
+ return CreateDependenceDag(*records);
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/dependence_dag_utils.h b/tools/perf/pmu-events/topdown-parser/dependence_dag_utils.h
new file mode 100644
index 000000000000..e7f992f98e45
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/dependence_dag_utils.h
@@ -0,0 +1,178 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// --------------------------------------------------------
+// File: dependence_dag_utils.h
+// --------------------------------------------------------
+//
+// The header provides the interface to read the input csv file, process it and
+// populate an in-memory model.
+//
+// The cells of the input csv file can be broadly divided into two types:
+// (1) Ones specifying the top down metric and (2) Ones specifying the
+// metric expression for a metric and CPU pair. (CPU is specified in the csv
+// file by the column and metric by the row).
+//
+// A formula might involve the following:
+// (1) Raw PMU events
+// (2) Constants
+// (3) External parameters: Such components are not defined in
+// the input csv file and must come from elsewhere. For example, a formula
+// component `SMT_on`, specifying whether hyper-threading is enabled on the
+// CPU, needs to be extracted from the host machine.
+// Example, ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else CLKS
+// (4) Another (sub-)metric, as in
+// 1 - ( Frontend_Bound + Bad_Speculation + Retiring )
+//
+// We represent the formula as a dependence dag where the root of the dag
+// represents a topdown metric (hence a formula), the intermediate nodes
+// represent sub-formulas and the leaves represent the PMU events, constants or
+// external parameters. We implement this dependence dag using a map.
+
+#ifndef TOPDOWN_PARSER_DEPENDENCE_DAG_UTILS_H_
+#define TOPDOWN_PARSER_DEPENDENCE_DAG_UTILS_H_
+
+#include <map>
+#include <set>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+namespace topdown_parser
+{
+/**
+ * For each metric, the data-structure `TopdownInfo` stores
+ * (1) the name of the perf-metric switch to be used for invoking perf stat.
+ * Note: This field is not required for all targets.
+ * (2) the names of all the sub-metrics.
+ */
+struct TopdownInfo {
+ std::string perf_stat_switch_name;
+ std::vector<std::string> child_metrics;
+};
+
+/**
+ * `g_TopdownHierarchy` stores the topdown hierarchy.
+ *
+ * An example: The metric `topdown` has four sub-metrics and each of the
+ * sub-metrics can be further broken down.
+ * Topdown
+ * Frontend_Bound
+ * Frontend_Latency
+ * Frontend_Bandwidth
+ * Backend_Bound
+ * ...
+ * Bad_Speculation
+ * ...
+ * Retiring
+ * ...
+ *
+ * g_TopdownHierarchy is a map from the parent metric name to an object of
+ * type TopdownInfo which contains
+ * 1. Name of perf stat switch: This is derived from the configuration
+ * parameter `perf_stat_switch_names_`.
+ * 2. Names of all the child metrics
+ *
+ * For example, in the context of the running example,
+ *
+ * g_TopdownHierarchy["Topdown"] --> {"topdown",
+ * {"Frontend_Bound", "Backend_Bound", "Bad_Speculation", "Retiring"}}
+ * g_TopdownHierarchy["Frontend_Bound"] --> {"topdown_fe",
+ * {"Frontend_Latency", "Frontend_Bandwidth"}}
+ */
+extern std::map<std::string, TopdownInfo> *g_TopdownHierarchy;
+
+/**
+ * The version number of the input csv file.
+ */
+#define VERSION_MAX_STRLEN 100
+extern char g_PerfmonVersion[VERSION_MAX_STRLEN];
+
+/**
+ * The CPUs actually used for generating topdown files. This takes into account
+ * the CPUs derived from the input csv file and the ones included or excluded
+ * by configuration parameters `selected_cpus_` and `dont_care_cpus_`
+ */
+extern std::vector<std::string> *g_RelevantCpus;
+
+/**
+ * List of unique CPU names which are specified in the input csv file as
+ * CPUX/CPUY
+ */
+extern std::vector<std::set<std::string> > *g_CpuAliasesForEventInfo;
+
+/**
+ * Each textual entry of the input csv file is parsed to the following
+ * data-structure.
+ */
+struct MappedData {
+ // Row and column of the textual entry.
+ size_t row_, column_;
+ // The textual content.
+ std::string cell_content_;
+ // Prefix is used to make the
+ // function name more informative.
+ std::string prefix_;
+ // Auxiliary data about the entry.
+ std::string aux_data_;
+ // The header value for the entry, which equals the CPU model.
+ std::string header_name_;
+ // Metric name
+ std::string metric_name_;
+ // The value of count domain
+ // for the entry.
+ std::string count_domain_;
+ // The value of description
+ // for the entry.
+ std::string description_;
+ // The value of metric group
+ // for the entry.
+ std::string metric_group_;
+};
+
+std::ostream &operator<<(std::ostream &, const MappedData &);
+std::ostream &operator<<(std::ostream &,
+ const std::unordered_map<std::string, MappedData> &);
+
+/**
+ * ProcessRecords parses and processes the entries of the csv file and
+ * creates an in-memory model. It processes the list of rows 'records' of the
+ * csv file in the following way.
+ * Task 0. Determine the version number of the input csv file.
+ *
+ * Task 1. Derive information from the input csv file.
+ *
+ * Task 2. Generate the topdown hierarchy.
+ *
+ * Task 3. Determine the alias CPUs.
+ * If the csv file has column headers like "CPUX/CPUY", then we consider the
+ * CPUs as aliases for event encoding look-up purposes. That is, if the event
+ * encoding JSON file for CPUX is missing, or an event is not found in the
+ * event encoding file for CPUX, then we look it up in the file of CPUY.
+ *
+ * Task 4. Populate the missing cell values.
+ *
+ * Task 5. Print the information derived in Tasks 1 and 3.
+ *
+ * Task 6. Create a map storing the records in the following fashion.
+ * Example
+ * A. Suppose we have rows
+ * Level1 Level2 || SKL BDW
+ * K L1 || P1*L2 P2*L2
+ * K L2 || P3*Q3 P4*Q4
+ *
+ * The information we will be storing in the map are as follows:
+ *
+ * Map Key -> Some of the mapped values
+ *
+ * <metric>_<CPU> -> {<header>, <textual formula>, <prefix>, ...}
+ * L1_SKL -> {SKL, P1*L2, "K", ...}
+ * L1_BDW -> {BDW, P2*L2, "K", ...}
+ * L2_SKL -> {SKL, P3*Q3, "K", ...}
+ * L2_BDW -> {BDW, P4*Q4, "K", ...}
+ */
+std::unordered_map<std::string, MappedData>
+ProcessRecords(std::vector<std::vector<std::string> > *);
+
+} // namespace topdown_parser
+
+#endif // TOPDOWN_PARSER_DEPENDENCE_DAG_UTILS_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 07/12] perf topdown-parser: Metric expression parser.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (5 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 06/12] perf topdown-parser: Interface for TMA_Metrics.csv Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 08/12] perf topdown-parser: Add event interface Ian Rogers
` (5 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
A parser capable of processing metrics found in TMA_Metrics.csv.
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
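Note on the numeric branch of yylex(): it uses `p` first as the start offset
and then as the out-parameter of std::stod, which is easy to misread. The same
idea can be sketched as a standalone helper. This is only an illustration; the
name LexNumber and its signature are hypothetical and not part of this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not in the patch): lex one number that begins at
// offset `start` of `input`. Returns the regularized text of the number and
// stores the offset just past the number in `*next`.
std::string LexNumber(const std::string &input, std::size_t start,
		      std::size_t *next)
{
	// std::stod throws on an empty string, so require a valid offset.
	assert(start < input.size());

	// std::stod reports how many characters it consumed, counted from
	// the beginning of the substring, not of the whole input.
	std::size_t consumed = 0;
	const double value = std::stod(input.substr(start), &consumed);

	*next = start + consumed;
	// std::to_string formats like printf("%f"), so ".5" -> "0.500000".
	return std::to_string(value);
}
```

With "%f"-style formatting (the std::to_string behavior before C++26), a
literal 0 or 1 reaches the grammar as "0.000000" or "1.000000", which appears
to be what the remove_false_branch check for "0.000" and "1.000" relies on.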
.../pmu-events/topdown-parser/expr_parser.y | 224 ++++++++++++++++++
1 file changed, 224 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/expr_parser.y
diff --git a/tools/perf/pmu-events/topdown-parser/expr_parser.y b/tools/perf/pmu-events/topdown-parser/expr_parser.y
new file mode 100644
index 000000000000..ddf635a6470c
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/expr_parser.y
@@ -0,0 +1,224 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+/* Topdown expression parser */
+%language "c++"
+%define api.value.type variant
+
+%code requires {
+#include <string>
+}
+
+/* Increase error verbosity. */
+%define parse.trace
+%define parse.error verbose
+/* Semantic value printer. */
+%printer { yyo << $$; } <*>;
+
+
+/* Inputs for the parser and yylex. */
+%param { const std::string &input }
+%param { size_t *cursor }
+/* Inputs/output for the parser. */
+%parse-param { bool convert_if_stmt }
+%parse-param { bool remove_false_branch }
+%parse-param { bool wrap_div_in_function }
+%parse-param { std::string *final_val }
+
+/* Tokens returned by yylex. */
+%define api.token.prefix {TOK_}
+%token
+ MIN
+ MAX
+ IF
+ ELSE
+ MODEL
+ IN
+ NEG
+ EOF 0
+%token <std::string> ID_OR_NUM
+
+/* Type of non-terminal expressions. */
+%type <std::string> expr if_expr IDS
+
+/* Precedence and associativity. */
+%left MIN MAX MODEL IN
+%right ELSE
+%right IF
+%left '='
+%left '<' '>'
+%left '-' '+'
+%left '*' '/' '%'
+%left NEG
+
+%code {
+static int yylex(yy::parser::semantic_type *res,
+ const std::string &input,
+ size_t *cursor);
+
+void yy::parser::error (const std::string& m)
+{
+// ERROR(m << '\n' << "Input:\n" << input << "\nCursor: " << *cursor);
+}
+
+}
+
+%%
+%start all_expr;
+all_expr: expr EOF { *final_val = $1; } ;
+
+IDS:
+'\'' ID_OR_NUM '\'' { $$ = std::string(" ") + $2 + " "; }
+|
+IDS '\'' ID_OR_NUM '\'' { $$ = $1 + " " + $3 + " "; }
+;
+
+if_expr:
+expr IF expr ELSE expr
+{
+ if (convert_if_stmt)
+ $$ = $3 + " ? " + $1 + " : " + $5;
+ else
+ $$ = $1 + " if " + $3 + " else " + $5;
+
+ if (remove_false_branch) {
+ if (std::string::npos != $3.find("0.000", 0))
+ $$ = $5;
+ else if (std::string::npos != $3.find("1.000", 0))
+ $$ = $1;
+ }
+}
+|
+expr IF MODEL IN '[' IDS ']' ELSE expr
+{
+ $$ = std::string("#Model in [ ") + $6 + " ] ? " + $1 + " : " + $9;
+}
+;
+
+expr:
+ID_OR_NUM { $$ = $1; }
+|
+expr '+' expr { $$ = $1 + " + " + $3; }
+|
+expr '-' expr { $$ = $1 + " - " + $3; }
+|
+expr '*' expr { $$ = $1 + " * " + $3; }
+|
+expr '>' expr { $$ = $1 + " > " + $3; }
+|
+expr '<' expr { $$ = $1 + " < " + $3; }
+|
+expr '%' expr { $$ = $1 + " % " + $3; }
+|
+'(' expr ')' { $$ = std::string("( ") + $2 + " )"; }
+|
+expr '=' '=' expr { $$ = $1 + " == " + $4; }
+|
+'-' expr %prec NEG { $$ = std::string(" - ") + $2; }
+|
+expr '/' expr
+{
+ if (wrap_div_in_function)
+ $$ = std::string("d_ratio ( ") + $1 + " , " + $3 + " )";
+ else
+ $$ = $1 + " / " + $3;
+}
+|
+MIN '(' expr ',' expr ')'
+{
+ $$ = std::string("min ( ") + $3 + " , " + $5 + " )";
+}
+|
+MAX '(' expr ',' expr ')'
+{
+ $$ = std::string("max ( ") + $3 + " , " + $5 + " )";
+}
+|
+if_expr
+{
+ if (convert_if_stmt)
+ $$ = std::string("( ") + $1 + " )";
+ else
+ $$ = $1;
+}
+;
+
+%%
+static int expr__symbol(yy::parser::semantic_type *res,
+ size_t p,
+ const std::string &input,
+ size_t *cursor)
+{
+ std::string dst;
+
+ if (input[p] == '#')
+ dst += input[p++];
+
+ while (p < input.size() &&
+ (isalnum(input[p]) ||
+ input[p] == '_' ||
+ input[p] == '.' ||
+ input[p] == ':' ||
+ input[p] == '@' ||
+ input[p] == '\\' ||
+ input[p] == '=')
+ ) {
+ if(input[p] == '\\') {
+ // Consume 2 consecutive '\\' and the escaped char.
+ dst += input[p++];
+ if (p >= input.size())
+ break;
+ dst += input[p++];
+ if (p >= input.size())
+ break;
+ }
+ dst += input[p++];
+ }
+ *cursor = p;
+ if (p >= input.size() && dst.empty())
+ return yy::parser::token::TOK_EOF;
+ if (dst == "min") return yy::parser::token::TOK_MIN;
+ if (dst == "max") return yy::parser::token::TOK_MAX;
+ if (dst == "if") return yy::parser::token::TOK_IF;
+ if (dst == "in") return yy::parser::token::TOK_IN;
+ if (dst == "else") return yy::parser::token::TOK_ELSE;
+ if (dst == "#Model") return yy::parser::token::TOK_MODEL;
+ res->emplace<std::string>(dst);
+ return yy::parser::token::TOK_ID_OR_NUM;
+}
+
+static int yylex(yy::parser::semantic_type *res,
+ const std::string &input,
+ size_t *cursor)
+{
+ size_t p = *cursor;
+
+ // Skip spaces.
+ while (p < input.size() && isspace(input[p]))
+ p++;
+
+ if (p >= input.size()) {
+ *cursor = p;
+ return yy::parser::token::TOK_EOF;
+ }
+ switch (input[p]) {
+ case '#':
+ case 'a' ... 'z':
+ case 'A' ... 'Z':
+ return expr__symbol(res, p, input, cursor);
+ case '0' ... '9': case '.': {
+ // Read the number, regularizing forms such as '.5'. std::stod
+ // overwrites p with the length consumed, so the new cursor is s + p.
+ const size_t s = p;
+ res->emplace<std::string>(
+ std::to_string(std::stod(input.substr(p), &p)));
+ *cursor = p + s;
+ return yy::parser::token::TOK_ID_OR_NUM;
+ }
+ default:
+ *cursor = p + 1;
+ return input[p];
+ }
+}
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 08/12] perf topdown-parser: Add event interface.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (6 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 07/12] perf topdown-parser: Metric expression parser Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 09/12] perf topdown-paser: Add code generation API Ian Rogers
` (4 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Add the ability to load and then query events from json files. Events
may be loaded from a single json file, such as those on
download.01.org/perfmon, or from multiple json files within a
directory.
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
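Note on the lookup order used when querying an event: the CPU's own table is
consulted first, and only then the tables of its alias CPUs. A minimal sketch
of that order, assuming simplified stand-in types (an event maps to an opaque
string rather than an EventInfo). FindEncoding, AliasFallbackWorks and the
event names are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-ins for the patch's types (assumption): event name ->
// opaque encoding string, one such table per CPU.
using EventTable = std::unordered_map<std::string, std::string>;
using CpuTables = std::unordered_map<std::string, EventTable>;
using AliasSets = std::vector<std::set<std::string>>;

// Returns the encoding of `event` for `cpu`, or nullptr if neither `cpu`
// nor any of its aliases knows the event.
const std::string *FindEncoding(const CpuTables &tables,
				const AliasSets &aliases,
				const std::string &cpu,
				const std::string &event)
{
	auto own = tables.find(cpu);
	if (own == tables.end())
		return nullptr; // No table at all for this CPU.

	auto hit = own->second.find(event);
	if (hit != own->second.end())
		return &hit->second;

	// Fall back to the tables of the CPUs that share an alias set.
	for (const auto &alias_set : aliases) {
		if (alias_set.count(cpu) == 0)
			continue;
		for (const auto &alias : alias_set) {
			if (alias == cpu)
				continue;
			auto other = tables.find(alias);
			if (other == tables.end())
				continue;
			auto alias_hit = other->second.find(event);
			if (alias_hit != other->second.end())
				return &alias_hit->second;
		}
	}
	return nullptr;
}

// Hypothetical data: SKL lacks EVT_B, so the lookup falls back to KBL.
bool AliasFallbackWorks()
{
	CpuTables tables;
	tables["SKL"]["EVT_A"] = "0x01";
	tables["KBL"]["EVT_A"] = "0x01";
	tables["KBL"]["EVT_B"] = "0x02";

	std::set<std::string> skl_kbl;
	skl_kbl.insert("SKL");
	skl_kbl.insert("KBL");
	AliasSets aliases;
	aliases.push_back(skl_kbl);

	const std::string *enc = FindEncoding(tables, aliases, "SKL", "EVT_B");
	assert(enc != nullptr);
	return *enc == "0x02";
}
```

Returning a pointer into the table mirrors the `const EventInfo **` out
parameter of the real lookup; a CPU with no table of its own gets no alias
fallback in this sketch.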
.../pmu-events/topdown-parser/event_info.cpp | 443 ++++++++++++++++++
.../pmu-events/topdown-parser/event_info.h | 114 +++++
2 files changed, 557 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/event_info.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/event_info.h
diff --git a/tools/perf/pmu-events/topdown-parser/event_info.cpp b/tools/perf/pmu-events/topdown-parser/event_info.cpp
new file mode 100644
index 000000000000..c5a6fa305fcb
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/event_info.cpp
@@ -0,0 +1,443 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "event_info.h"
+
+#include <dirent.h>
+
+#include <regex>
+
+#include "configuration.h"
+#include "dependence_dag_utils.h"
+#include "expr_parser-bison.hpp"
+#include "general_utils.h"
+#include "jsmn_extras.h"
+#include "logging.h"
+
+namespace topdown_parser
+{
+namespace
+{
+/**
+ * g_EventInfoMap stores the event information `EventInfo`
+ * corresponding to an event name and a cpu, using the following map
+ * structure.
+ *
+ * CPU -> (Event Name -> "Meta Information of that event")
+ *
+ * The data-structure is useful for querying event name for a particular
+ * cpu.
+ */
+using EventNameToEventInfo = std::unordered_map<std::string, EventInfo>;
+using CPUToEventInfo = std::unordered_map<std::string, EventNameToEventInfo>;
+CPUToEventInfo *g_EventInfoMap = nullptr;
+
+/**
+ * Initialize globals.
+ */
+void InitGlobals()
+{
+ if (g_EventInfoMap == nullptr) {
+ g_EventInfoMap = new std::unordered_map<
+ std::string,
+ std::unordered_map<std::string, EventInfo> >;
+ }
+}
+
+/**
+ * SearchEvent looks up `event_token` for CPU `cpu`, then for its alias CPUs.
+ */
+bool SearchEvent(const std::string &cpu, const std::string &event_token,
+ const EventInfo **event_data)
+{
+ // If there is no event encoding map for 'cpu', return false;
+ if (g_EventInfoMap->count(cpu) == 0) {
+ return false;
+ }
+
+ // If there is event encoding map for 'cpu' and event is found
+ // in the map, return true;
+ if (g_EventInfoMap->at(cpu).count(event_token)) {
+ *event_data = &g_EventInfoMap->at(cpu).at(event_token);
+ return true;
+ }
+
+ // At this point, we have an event encoding map for 'cpu', but
+ // event is NOT found in the map. Check for the alias CPUs and
+ // search for the event in their encoding maps.
+ for (auto &alias_set : *g_CpuAliasesForEventInfo) {
+ // Go over all the alias sets and find the one where
+ // `cpu` belongs.
+ if (alias_set.count(cpu) == 0) {
+ continue;
+ }
+
+ for (auto &alias : alias_set) {
+ if (alias == cpu) {
+ continue;
+ }
+ if (g_EventInfoMap->count(alias) &&
+ g_EventInfoMap->at(alias).count(event_token)) {
+ *event_data = &g_EventInfoMap->at(alias).at(
+ event_token);
+ return true;
+ }
+ }
+ }
+
+ return false;
+}
+
+void PopulateEventInfoMap(const char *js, const jsmntok_t *t, int r,
+ void *metainfo)
+{
+ std::unordered_map<std::string, EventInfo> *event_info =
+ (std::unordered_map<std::string, EventInfo> *)metainfo;
+
+ // Events are organized as an array of objects of key value pairs.
+ for (int i = 1; i < r;) {
+ if (t[i].type != JSMN_OBJECT) {
+ i++; /* advance, else the loop never terminates */
+ continue;
+ }
+ int size = t[i++].size;
+ std::unordered_map<std::string, std::string> working_set;
+ for (int j = 0; j < size; j += 2) {
+ std::pair<std::string, std::string> key_val;
+ i = get_key_val(js, t, i, &key_val);
+ i++;
+ working_set[key_val.first] = key_val.second;
+ }
+ auto name = working_set.find("EventName");
+ if (name != working_set.end()) {
+ (*event_info)[name->second] = EventInfo(
+ name->second, working_set["EventCode"],
+ working_set["UMask"], working_set["MSRValue"],
+ working_set["CounterMask"],
+ working_set["Invert"], working_set["AnyThread"],
+ working_set["EdgeDetect"],
+ working_set["Errata"]);
+ }
+ }
+}
+
+/**
+ * Extract the event information `event_info` (EventName, EventCode,
+ * etc.) from the event encoding json file `json_fname`.
+ */
+int ReadEventInfoFromJson(const char *json_fname,
+ std::unordered_map<std::string, EventInfo> *event_info)
+{
+ return ParseJson(json_fname, &PopulateEventInfoMap, event_info);
+}
+
+/**
+ * ProcessEventFiles does the following: 1. Read the version number of
+ * each Json file. 2. Print the candidate Json files for each CPU and
+ * mark the selected one with (*). 3. Read the event information from
+ * each Json file for a particular cpu and populate `g_EventInfoMap`.
+ */
+void ProcessEventFiles(
+ const std::unordered_map<std::string, std::vector<std::string> >
+ &cpu_to_json_filelist)
+{
+ for (const auto &entry : cpu_to_json_filelist) {
+ const std::string &cpu = entry.first;
+ const std::vector<std::string> &json_files = entry.second;
+ std::unordered_map<std::string, EventInfo> event_info;
+ for (const auto &jname : json_files) {
+ ReadEventInfoFromJson(jname.c_str(), &event_info);
+ }
+ g_EventInfoMap->insert(
+ std::pair<std::string,
+ std::unordered_map<std::string, EventInfo> >(
+ cpu, event_info));
+ }
+}
+
+/**
+ * Check if every permissible CPU has a Json file hint associated with
+ * it. If a particular CPU, CPUX does not have a Json hint, we check
+ * for alias CPUs, (like CPUX/CPUY as mentioned in the csv file), and
+ * assign the Json file hint of the alias, CPUY, to the CPU CPUX.
+ */
+void CheckJsonEventHints()
+{
+ // Check if the Json event file hints are provided for each
+ // CPU.
+ for (auto &cpu : *g_RelevantCpus) {
+ if (kConfigParams->json_filename_hints_.count(cpu) == 0) {
+ // Check for any alias to cpu
+ bool json_filename_hint_found = false;
+ for (auto &alias_set : *g_CpuAliasesForEventInfo) {
+ if (alias_set.count(cpu) == 0) {
+ continue;
+ }
+
+ for (auto &alias : alias_set) {
+ if (alias == cpu) {
+ continue;
+ }
+ if (0 !=
+ kConfigParams->json_filename_hints_
+ .count(alias)) {
+ kConfigParams
+ ->json_filename_hints_
+ [cpu] =
+ kConfigParams
+ ->json_filename_hints_
+ [alias];
+ json_filename_hint_found = true;
+ INFO("Using the same "
+ "Json file hint: \""
+ << kConfigParams
+ ->json_filename_hints_
+ [alias]
+ << "\" for alias CPUs: "
+ << alias << ", " << cpu);
+ break;
+ }
+ }
+ }
+
+ if (json_filename_hint_found) {
+ continue;
+ }
+
+ ERROR("Unspecified json filename hint for cpu: "
+ << cpu);
+ INFO("Specify a substring of the json file name "
+ "in 'kConfigParams->json_filename_hints_' "
+ "data structure in configuration file. "
+ "Else put the cpu into "
+ "'dont_care_cpus' in configuration file.");
+ exit(1);
+ }
+ }
+}
+
+/**
+ * Preprocess cell contents.
+ */
+std::vector<std::string> NormalizeFormula(const std::string &str)
+{
+ std::vector<std::string> body_tokens;
+
+ if (!str.length()) {
+ return body_tokens;
+ }
+
+ // Make the cell content amenable to split based on
+ // whitespace.
+ std::string cell_content;
+ size_t cursor = 0;
+ yy::parser parser(str, &cursor, false /* convert if stmt */,
+ false /* Remove false branch */,
+ false /* wrap div operator in a function */,
+ &cell_content);
+ if (parser.parse())
+ FATAL("Parsing error");
+
+ // Split the cell content based on whitespace.
+ body_tokens = WhitespaceSplit(cell_content);
+
+ return body_tokens;
+}
+
+} // namespace
+
+bool GetEventInfo(const std::string &input_str, const std::string &cpu,
+ const EventInfo **event_data,
+ std::vector<std::string> *tokens)
+{
+ std::string str(input_str);
+
+ // Check if the token is of the form
+ // OFFCORE_RESPONSE:request=A:response=B
+ // Replace it with OFFCORE_RESPONSE.A.B
+ if (regex_search(str, std::regex("OFFCORE_RESPONSE"))) {
+ str = regex_replace(str, std::regex(":request="), ".");
+ str = regex_replace(str, std::regex(":response="), ".");
+ }
+
+ // Handle PEBS event.
+ std::string event_token = regex_replace(str, std::regex("_PS$"), "");
+
+ // Check if the token is of form 'evt:c1:e1'; Extract the 'evt' part.
+ if (regex_search(str, std::regex("\\:"))) {
+ *tokens = Split(str, ':');
+ if (tokens->size() < 2) {
+ FATAL("Event Token: \"" << input_str
+ << "\" is not well formed:");
+ }
+ event_token = (*tokens)[0];
+ }
+
+ // Search the event token among known events.
+ return SearchEvent(cpu, event_token, event_data);
+}
+
+void ProcessEventEncodings()
+{
+ InitGlobals();
+
+ // Check that every permissible CPU has a Json file hint associated with
+ // it.
+ CheckJsonEventHints();
+
+ std::unordered_map<std::string, std::vector<std::string> >
+ cpu_to_json_filelist;
+ std::vector<std::string> event_data_dirs(
+ { kConfigParams->event_data_dir_ });
+
+ while (!event_data_dirs.empty()) {
+ std::string dir_str = event_data_dirs.back();
+ event_data_dirs.pop_back();
+ std::unique_ptr<DIR, std::function<int(DIR *)> > dir(
+ opendir(dir_str.c_str()), closedir);
+ if (dir == nullptr) {
+ FATAL("Cannot open data directory: " << dir_str);
+ }
+ for (struct dirent *ent = readdir(dir.get()); ent != nullptr;
+ ent = readdir(dir.get())) {
+ std::string fname = std::string(ent->d_name);
+ if (ent->d_type == DT_DIR) {
+ if (fname[0] != '.') {
+ event_data_dirs.push_back(dir_str +
+ fname + "/");
+ }
+ continue;
+ }
+ if (fname.find("json") == std::string::npos) {
+ continue;
+ }
+ for (auto &cpu : *g_RelevantCpus) {
+ const std::string &json_hint =
+ kConfigParams->json_filename_hints_.at(
+ cpu);
+ if (dir_str.find(json_hint + "/") ==
+ std::string::npos) {
+ continue;
+ }
+ cpu_to_json_filelist[cpu].push_back(dir_str +
+ fname);
+ }
+ }
+ }
+
+ // Check that every CPU got an event encoding Json file.
+ for (auto &cpu : *g_RelevantCpus) {
+ if (cpu_to_json_filelist.count(cpu) == 0) {
+ ERROR("Missing Json file for CPU: " << cpu);
+ INFO("In case no Json files are available for a CPU, "
+ "put the CPU into "
+ "'dont_care_cpus' in configuration file.");
+ }
+ }
+
+ ProcessEventFiles(cpu_to_json_filelist);
+}
+
+std::set<std::string>
+FindEvents(const std::string &token,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ const std::string &cpu)
+{
+ std::string search_key = token + "_" + cpu;
+ std::set<std::string> eventlist;
+
+ // Check if the 'token' corresponds to a metric.
+ if (dependence_dag.count(search_key) != 0) {
+ assert(dependence_dag.at(search_key).prefix_ != "Info.System" &&
+ "A Topdown formula referring to \'Info.System\'");
+ std::vector<std::string> body_tokens = NormalizeFormula(
+ dependence_dag.at(search_key).cell_content_);
+ for (auto &body_token : body_tokens) {
+ std::set<std::string> evlist =
+ FindEvents(body_token, dependence_dag, cpu);
+ eventlist.insert(evlist.begin(), evlist.end());
+ }
+ return eventlist;
+ }
+
+ // Check if the token is an operator, constant, or "NA".
+ if (IsOperator(token) || IsConstant(token) || token == "#NA" ||
+ token == "NA" || token == "N/A") {
+ return eventlist;
+ }
+
+ // At this point 'token' could be an event.
+ // Check if it is an event. If yes, then get the event information.
+ const EventInfo *event_data;
+ std::vector<std::string> tokens;
+ if (GetEventInfo(token, cpu, &event_data, &tokens)) {
+ eventlist.insert(token);
+ }
+
+ // At this point we might have token like
+ // 1. CPU names which arise out of parsing input csv entries like
+ // "#Model in ['SKL' 'KBL']" Such csv entries will be processed later
+ // using `NormalizeModel`
+ // 2. We would error out on any unexpected tokens in `ComputeBodyFormula`
+ // where we will have more context around the error.
+ return eventlist;
+}
+
+std::set<std::string>
+FindErrata(const std::string &token,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ const std::string &cpu)
+{
+ std::string search_key = token + "_" + cpu;
+ std::set<std::string> erratalist;
+
+ // Check if the 'token' corresponds to a metric.
+ if (dependence_dag.count(search_key) > 0) {
+ assert(dependence_dag.at(search_key).prefix_ != "Info.System" &&
+ "A Topdown formula referring to \'Info.System\'");
+ std::vector<std::string> body_tokens = NormalizeFormula(
+ dependence_dag.at(search_key).cell_content_);
+ for (auto &body_token : body_tokens) {
+ std::set<std::string> errlist =
+ FindErrata(body_token, dependence_dag, cpu);
+ erratalist.insert(errlist.begin(), errlist.end());
+ }
+ return erratalist;
+ }
+
+ // Check if the token is an operator, constant, or "NA".
+ if (IsOperator(token) || IsConstant(token) || token == "#NA" ||
+ token == "NA" || token == "N/A") {
+ return erratalist;
+ }
+
+ // At this point 'token' could be an event.
+ // Check if it is an event.
+ const EventInfo *event_data;
+ std::vector<std::string> tokens;
+ if (GetEventInfo(token, cpu, &event_data, &tokens)) {
+ const std::string &errata = event_data->errata_;
+ if (errata != "0" && errata != "null" && errata != "nullptr") {
+ if (regex_search(errata, std::regex(","))) {
+ tokens = Split(errata, ',');
+ for (auto &erratum : tokens) {
+ erratalist.insert(erratum);
+ }
+ } else {
+ erratalist.insert(errata);
+ }
+ }
+ }
+
+ // At this point we might have token like
+ // 1. CPU names which arise out of parsing input csv entries like
+ // "#Model in ['SKL' 'KBL']" Such csv entries will be processed later
+ // using `NormalizeModel`
+ // 2. We would error out on any unexpected tokens in `ComputeBodyFormula`
+ // where we will have more context around the error.
+ return erratalist;
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/event_info.h b/tools/perf/pmu-events/topdown-parser/event_info.h
new file mode 100644
index 000000000000..b5b7d1521fe2
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/event_info.h
@@ -0,0 +1,114 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// ---------------------------------------------
+// File: event_info.h
+// ---------------------------------------------
+//
+// The header provides the interface to
+// (1) Read/process the event information from event encoding JSON files.
+// (2) Query event information using an event name.
+
+#ifndef TOPDOWN_PARSER_EVENT_INFO_H_
+#define TOPDOWN_PARSER_EVENT_INFO_H_
+
+#include <time.h>
+
+#include <map>
+#include <set>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+namespace topdown_parser
+{
+struct MappedData;
+
+/**
+ * The following data-structure is used to store the various meta information
+ * of an event.
+ */
+class EventInfo {
+ public:
+ std::string eventname_;
+ std::string eventcode_;
+ std::string umask_;
+ std::string msrvalue_;
+ std::string countermask_;
+ std::string invert_;
+ std::string anythread_;
+ std::string edgedetect_;
+ std::string errata_;
+
+ bool operator==(const EventInfo &ei) const
+ {
+ return eventname_ == ei.eventname_ &&
+ eventcode_ == ei.eventcode_ && umask_ == ei.umask_ &&
+ countermask_ == ei.countermask_ &&
+ msrvalue_ == ei.msrvalue_ && invert_ == ei.invert_ &&
+ anythread_ == ei.anythread_ &&
+ edgedetect_ == ei.edgedetect_ && errata_ == ei.errata_;
+ }
+
+ bool operator!=(const EventInfo &ei) const
+ {
+ return !(*this == ei);
+ }
+ EventInfo() = default;
+ EventInfo(const std::string &en, const std::string &ec,
+ const std::string &um, const std::string &msrv,
+ const std::string &cm, const std::string &i,
+ const std::string &at, const std::string &ed,
+ const std::string &er)
+ : eventname_(en), eventcode_(ec), umask_(um), msrvalue_(msrv),
+ countermask_(cm), invert_(i), anythread_(at), edgedetect_(ed),
+ errata_(er)
+ {
+ }
+};
+
+/**
+ * Query the information for an event `input_str` for a cpu `cpu`. The
+ * `EventInfo` information is stored in `event_data`.
+ * If the token is of form 'evt:c1:e1', we tokenize it based on delimiter ':'
+ * and return the tokens. The tokens are used by some downstream functions, like
+ * GetEventString, to extract more information about the event.
+ */
+bool GetEventInfo(const std::string &input_str, const std::string &cpu,
+ const EventInfo **event_data,
+ std::vector<std::string> *tokens);
+
+/**
+ * Read and process the json files specifying the event encodings
+ */
+void ProcessEventEncodings();
+
+/**
+ * If `token` is the name of a metric, then 'FindEvents' returns a list of
+ * events used in the metric expression of that metric. If the metric expression
+ * contains sub-metrics, then 'FindEvents' recursively finds the events in
+ * those sub-metrics as well. An empty list is returned if `token` is not a
+ * metric name. The function uses `dependence_dag` (an in-memory model to
+ * store the input csv file information) and `cpu` to check if the `token`
+ * is a metric or not.
+ */
+std::set<std::string>
+FindEvents(const std::string &token,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ const std::string &cpu);
+
+/**
+ * If `token` is the name of a metric, then 'FindErrata' returns a list of
+ * errata corresponding to events used in the metric expression of that metric.
+ * If the metric expression contains sub-metrics, then 'FindErrata'
+ * recursively finds the errata for those sub-metrics as well. An empty list
+ * is returned if `token` is not a metric name. The function uses
+ * `dependence_dag` (an in-memory model to store the input csv file
+ * information) and `cpu` to check if the `token` is a metric or not.
+ */
+std::set<std::string>
+FindErrata(const std::string &token,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ const std::string &cpu);
+
+} // namespace topdown_parser
+#endif // TOPDOWN_PARSER_EVENT_INFO_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 09/12] perf topdown-parser: Add code generation API.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (7 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 08/12] perf topdown-parser: Add event interface Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 10/12] perf topdown-parser: Add json metric code generation Ian Rogers
` (3 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Add an API that is called to generate code for the registered target
selected in the configuration.
Co-authored-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
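Note (below the cut, not part of the commit message): the dispatch in
CodeGenTarget can be pictured with the minimal, self-contained sketch
below. It is an illustration under assumptions, not the code in this
patch: DemoTarget, kDemoTargets, Dispatch and the Gen* callbacks are
hypothetical names, and the callbacks append to a log instead of taking
the dependence DAG.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for TargetInfo: a name, a required code
// generation callback and an optional test harness callback.
struct DemoTarget {
	std::string name;
	void (*codegen)(std::vector<std::string> *log);
	void (*test_harness)(std::vector<std::string> *log); // may be nullptr
};

static void GenJson(std::vector<std::string> *log)
{
	log->push_back("json");
}

static void GenJsonTest(std::vector<std::string> *log)
{
	log->push_back("json-test");
}

static void GenOther(std::vector<std::string> *log)
{
	log->push_back("other");
}

// Hypothetical registry, playing the role of kRegisteredTargets.
static const DemoTarget kDemoTargets[] = {
	{ "perf_json", &GenJson, &GenJsonTest },
	{ "other", &GenOther, nullptr },
};

// Run the callbacks of the first target whose name equals `wanted`.
// Returns false when no registered target matches.
bool Dispatch(const std::string &wanted, std::vector<std::string> *log)
{
	assert(log != nullptr);
	for (const DemoTarget &target : kDemoTargets) {
		if (target.name != wanted)
			continue;
		target.codegen(log);
		// The test harness is optional, so check before calling.
		if (target.test_harness)
			target.test_harness(log);
		return true;
	}
	return false;
}
```

Returning a bool for "no target matched" is a choice made for the
sketch only; it lets a caller report a target name that is not
registered.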
.../topdown-parser/code_gen_target.cpp | 51 ++++++++++++
.../topdown-parser/code_gen_target.h | 77 +++++++++++++++++++
2 files changed, 128 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/code_gen_target.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/code_gen_target.h
diff --git a/tools/perf/pmu-events/topdown-parser/code_gen_target.cpp b/tools/perf/pmu-events/topdown-parser/code_gen_target.cpp
new file mode 100644
index 000000000000..c6d7ce8eb661
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/code_gen_target.cpp
@@ -0,0 +1,51 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "code_gen_target.h"
+
+#include "configuration.h"
+
+namespace topdown_parser
+{
+/**
+ * Dump event list. Used for testing of auto-generation.
+ */
+bool g_DumpEvents = false;
+
+namespace
+{
+/**
+ * `kRegisteredTargets` enumerates all the targets supported by the
+ * topdown generator tool. Each target is responsible for generating
+ * "code", which essentially encodes the topdown metric expressions, in
+ * a particular language or format.
+ */
+TargetInfo *kRegisteredTargets[] = {
+ &kTargetPerfJson /* target to generate JSON code */,
+};
+
+} // namespace
+
+void CodeGenTarget(
+ const std::unordered_map<std::string, MappedData> &dependence_dag)
+{
+ for (size_t i = 0;
+ i < sizeof(kRegisteredTargets) / sizeof(TargetInfo *); ++i) {
+ const std::string &target_name = kRegisteredTargets[i]->name;
+
+ if (target_name == kConfigParams->target_) {
+ kRegisteredTargets[i]->codegen_entry_point(
+ dependence_dag);
+ if (kRegisteredTargets[i]
+ ->codegen_test_harness_entry_point) {
+ kRegisteredTargets[i]
+ ->codegen_test_harness_entry_point();
+ }
+ break;
+ }
+ }
+}
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/code_gen_target.h b/tools/perf/pmu-events/topdown-parser/code_gen_target.h
new file mode 100644
index 000000000000..ab3e2b48bebc
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/code_gen_target.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// --------------------------------------------------
+// File: code_gen_target.h
+// --------------------------------------------------
+//
+
+// The header provides the interface `CodeGenTarget` to generate code, encoding
+// the topdown metric expressions, using the data `dependence_dag` read from
+// the input csv file. The language or format of the generated code is the one
+// supported by a specific project (e.g. perf or projects using perf) to encode
+// the topdown metric expressions. We define a `target` as a specific
+// project and use that to guide generation of topdown code in a language
+// supported by the project.
+
+#ifndef TOPDOWN_PARSER_CODE_GEN_TARGET_H_
+#define TOPDOWN_PARSER_CODE_GEN_TARGET_H_
+
+#include <string>
+#include <unordered_map>
+
+namespace topdown_parser
+{
+class MappedData;
+
+/**
+ * Dump event list. Used for testing of auto-generation.
+ */
+extern bool g_DumpEvents;
+
+/**
+ * The structure `TargetInfo` is used to specify a target.
+ */
+struct TargetInfo {
+ /**
+ * Name of the target. This will be used to invoke code generation for a
+ * particular target.
+ */
+ std::string name;
+
+ /**
+ * Descriptive information of the target (Optional).
+ */
+ std::string description;
+
+ /**
+ * The entry point function for generating code.
+ */
+ void (*codegen_entry_point)(
+ const std::unordered_map<std::string, MappedData>
+ &dependence_dag);
+
+ /**
+ * Function to generate golden reference for testing the auto-generated
+ * code.
+ * (Optional)
+ */
+ void (*codegen_test_harness_entry_point)();
+};
+
+/**
+ * Target information for generating JSON code for perf encoding the topdown
+ * metric expressions.
+ */
+extern TargetInfo kTargetPerfJson;
+
+/**
+ * `CodeGenTarget` dispatches an appropriate callback, based on the
+ * configuration variable `kConfigParams->target_`, to generate "code" for a
+ * particular target.
+ */
+void CodeGenTarget(
+ const std::unordered_map<std::string, MappedData> &dependence_dag);
+
+} // namespace topdown_parser
+
+#endif // TOPDOWN_PARSER_CODE_GEN_TARGET_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 10/12] perf topdown-parser: Add json metric code generation.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (8 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 09/12] perf topdown-parser: Add code generation API Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 11/12] perf topdown-parser: Main driver Ian Rogers
` (2 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Generate the json metric encoding from the TMA_Metrics.csv data that
was read in.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
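Note (below the cut, not part of the commit message): a minimal sketch
of how an event token such as "EVT:c4:e1" can be turned into the
"cpu@...@" form described for GetEventString. It is an illustration
under assumptions: DemoEventString and SplitOnColon are hypothetical
names, only the modifiers carried by the token are handled (the patch
also reads defaults such as "any" from the event encoding files), and
unknown modifiers are silently skipped where the patch logs an error.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Split "EVT:c4:e1" on ':' into {"EVT", "c4", "e1"}.
static std::vector<std::string> SplitOnColon(const std::string &s)
{
	std::vector<std::string> out;
	std::stringstream ss(s);
	std::string item;
	while (std::getline(ss, item, ':'))
		out.push_back(item);
	return out;
}

// Build the event string for `token`. The first field is the event
// name; the remaining fields are modifiers: cN (cmask), e1 (edge) and
// i1 (invert).
std::string DemoEventString(const std::string &token)
{
	const std::vector<std::string> parts = SplitOnColon(token);
	assert(!parts.empty());

	static const std::regex cmask_re("c([0-9]+)");
	std::string cmask;
	bool edge = false;
	bool inv = false;
	for (std::size_t i = 1; i < parts.size(); ++i) {
		std::smatch sm;
		if (std::regex_match(parts[i], sm, cmask_re))
			cmask = sm[1].str();
		else if (parts[i] == "e1")
			edge = true;
		else if (parts[i] == "i1")
			inv = true;
	}

	// Each separator is two backslash characters followed by ','.
	// Inside a JSON string the two backslashes decode to one.
	const std::string sep = "\\\\,";
	std::string out = "cpu@" + parts[0];
	if (!cmask.empty() && cmask != "0")
		out += sep + "cmask\\\\=" + cmask;
	if (edge)
		out += sep + "edge";
	if (inv)
		out += sep + "inv";
	out += "@";
	return out;
}
```

A token without modifiers, or with a cmask of 0, yields just the
"cpu@" prefix, the event name and the closing "@".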
.../code_gen_target_perf_json.cpp | 546 ++++++++++++++++++
.../code_gen_target_perf_json.h | 25 +
2 files changed, 571 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.cpp
create mode 100644 tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.h
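Note (not part of the commit): the flattening of metric expressions
done by GetMetricExpr and ComputeBodyFormula can be sketched as a
memoised recursion. The sketch is an illustration under assumptions:
DemoDag and Flatten are hypothetical names, a formula is modelled as a
pre-split token list, tokens are joined with spaces for readability
(the patch concatenates them and formats later), and the metric graph
is assumed to be acyclic.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model of the dependence DAG: metric name -> tokens of
// its formula. A token that is itself a key names a sub-metric.
using DemoDag = std::unordered_map<std::string, std::vector<std::string>>;

// Return the formula of `key` with every sub-metric replaced by its
// own parenthesised, flattened formula. Results are stored in `cache`
// so that each metric is expanded only once.
std::string Flatten(const std::string &key, const DemoDag &dag,
		    std::unordered_map<std::string, std::string> *cache)
{
	assert(cache != nullptr);
	const auto hit = cache->find(key);
	if (hit != cache->end())
		return hit->second;

	std::string body;
	for (const std::string &token : dag.at(key)) {
		if (!body.empty())
			body += " ";
		if (dag.count(token) != 0)
			body += Flatten(token, dag, cache); // sub-metric
		else
			body += token; // event, operator or constant
	}

	const std::string result = "(" + body + ")";
	(*cache)[key] = result;
	return result;
}
```

With M1 = e1 / e2, M2 = e3 - e4 and M = e5 + M1 * M2, flattening M
gives "(e5 + (e1 / e2) * (e3 - e4))" and leaves three entries in the
cache.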
diff --git a/tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.cpp b/tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.cpp
new file mode 100644
index 000000000000..70bb45de6675
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.cpp
@@ -0,0 +1,546 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include "code_gen_target_perf_json.h"
+
+#include <cassert>
+#include <fstream>
+#include <regex>
+
+#include "configuration.h"
+#include "dependence_dag_utils.h"
+#include "event_info.h"
+#include "expr_parser-bison.hpp"
+#include "general_utils.h"
+#include "logging.h"
+
+namespace topdown_parser
+{
+namespace
+{
+/**
+ * The input csv file does not define the formula for some metrics which
+ * are meant to be defined by the host machine. For example, the
+ * expression entry for Boolean metric `SMT_on` is empty in the input
+ * csv file. The perf tool evaluating the formula must extract information
+ * about the availability of hyper-threading from the host machine. We
+ * refer to such metrics as external parameters. While generating the
+ * metric json files (encoding the expression of each metric), we want
+ * to replace the expression for such metrics either with their
+ * definition or a symbol recognized by the perf tool so that it can
+ * parse the json file correctly. For example,
+ * `#SMT_on` is the symbol used by the perf tool to identify the csv
+ * Boolean metric `SMT_on`.
+ *
+ * 'CheckExternalParameter' checks if a name matches an external
+ * parameter name. If found, then `external_param_info` is used to
+ * return meta-information about the external parameter. The information
+ * includes: (1) The data-type of the metric, (2) The definition or
+ * the symbol used to replace the metric expression of the external
+ * parameter.
+ */
+bool CheckExternalParameter(
+ const std::string &sym_name,
+ std::pair<std::string, std::pair<std::string, std::string> >
+ *external_param_info)
+{
+ using ParamInfo = std::pair<std::string, std::string>;
+ using ExternalParamNameToParamInfo = std::map<std::string, ParamInfo>;
+
+ /**
+ * g_ExternalParameters stores the external parameters in the
+ * following format:
+ * Parameter name --> {Parameter Data Type, Definition or
+ * symbol to be used instead of the parameter}
+ */
+ static ExternalParamNameToParamInfo g_ExternalParameters = {
+ // SMT_on: Hyper-threading is ON on host machine.
+ { "SMT_on",
+ std::pair<std::string, std::string>("bool", "#SMT_on") },
+ // EBS_Mode: Event Based Sampling mode
+ { "EBS_Mode",
+ std::pair<std::string, std::string>("bool", "0") },
+ };
+
+ for (auto &exp : g_ExternalParameters) {
+ const std::string &exp_name = exp.first;
+ if (sym_name.find(exp_name) != std::string::npos) {
+ *external_param_info =
+ std::pair<std::string,
+ std::pair<std::string, std::string> >(
+ exp_name, exp.second);
+ return true;
+ }
+ }
+ // No match: `*external_param_info` is left unmodified.
+ return false;
+}
+
+/**
+ * Create the event string for event 'event_str'.
+ *
+ * For example:
+ * For the event "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD:c4",
+ * Return:
+ * "cpu@OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD\\,cmask\\=4@"
+ */
+std::string GetEventString(const std::string &event_str, const std::string &cpu)
+{
+ std::string retval("");
+ const EventInfo *event_data = nullptr;
+ std::vector<std::string> tokens;
+
+ if (!GetEventInfo(event_str, cpu, &event_data, &tokens))
+ FATAL("No event information for: " << event_str);
+ const std::string &event_name = event_data->eventname_;
+ const std::string msrvalue = Trim(event_data->msrvalue_);
+ std::string cmask = event_data->countermask_;
+
+ std::string edge = "";
+ if (event_data->edgedetect_ != "0") {
+ edge = "edge";
+ }
+
+ const std::string any = (event_data->anythread_ != "0") ? "any" : "";
+
+ std::string invert = "";
+ if (event_data->invert_ != "0") {
+ invert = "inv";
+ }
+
+ if (tokens.size() > 1) {
+ for (size_t i = 1; i < tokens.size(); ++i) {
+ std::smatch sm;
+ // Cmask
+ if (regex_match(tokens[i], sm,
+ std::regex("c([0-9]+)"))) {
+ cmask = sm[1].str();
+ continue;
+ }
+
+ // Edge
+ if (regex_match(tokens[i], std::regex("e1"))) {
+ edge = "edge";
+ continue;
+ }
+
+ // invert_
+ if (regex_match(tokens[i], std::regex("i1"))) {
+ invert = "inv";
+ continue;
+ }
+
+ ERROR("Unhandled token: " << tokens[i]
+ << " for Event: " << event_str
+ << " for CPU:" << cpu);
+ }
+ }
+
+ retval += "";
+ retval += "cpu@" + event_name;
+ // Cmask
+ if (!cmask.empty() && cmask != "0") {
+ retval += "\\\\,cmask\\\\=";
+ retval += cmask;
+ }
+
+ // Edge
+ if (!edge.empty()) {
+ retval += "\\\\,edge";
+ }
+
+ // Any
+ if (!any.empty()) {
+ retval += "\\\\,any";
+ }
+
+ // Invert
+ if (!invert.empty()) {
+ retval += "\\\\,inv";
+ }
+ retval += "@";
+
+ return retval;
+}
+
+/**
+ * Formatting the formula.
+ */
+std::string FormatFormula(const std::string &str)
+{
+ std::regex r_comma("(\\,)"); // For every occurrence of
+ // character ','
+ std::string repl_comma = "$1 "; // Replace with ", "
+
+ std::string retval = regex_replace(str, r_comma, repl_comma);
+
+ std::regex r_op("(\\<|\\>|\\+|\\-|\\*|\\/|\\%" // Every occurrence of
+ "|if|else)");
+ std::string repl_op = " $1 "; // operator '+',
+ retval = regex_replace(retval, r_op,
+ repl_op); // replace with ' + '
+
+ // The above formatting will make the event encoding
+ // cpu@OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD\\,cmask\\=4@
+ // look
+ // cpu@OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD\\, cmask\\=4@
+ // which is not acceptable.
+ // For the event attributes like cmask, invert, edge and any, we
+ // prevent such transformation.
+ retval = regex_replace(retval, std::regex("(\\s*)cmask"), "cmask");
+ retval = regex_replace(retval, std::regex("(\\s*)inv"), "inv");
+ retval = regex_replace(retval, std::regex("(\\s*)edge"), "edge");
+ retval = regex_replace(retval, std::regex("(\\s*)any"), "any");
+
+ return retval;
+}
+/**
+ * Preprocess cell contents.
+ */
+std::vector<std::string> NormalizeFormula(const std::string &str,
+ const std::string &header_name)
+{
+ std::vector<std::string> body_tokens;
+
+ if (!str.length()) {
+ return body_tokens;
+ }
+
+ // Make the cell content amenable to split based on whitespace.
+ std::string cell_content;
+ size_t cursor = 0;
+ yy::parser parser(str, &cursor, false /* do not convert if stmt */,
+ false /* do not remove false branch */,
+ false /* do not wrap div operator in a function */,
+ &cell_content);
+ if (parser.parse())
+ FATAL("Parsing error");
+
+ // Split the cell content based on whitespace.
+ body_tokens = WhitespaceSplit(cell_content);
+
+ // Handle 'if #Model in ['KBLR' 'CFL']'
+ if (regex_search(cell_content, std::regex("Model"))) {
+ body_tokens = NormalizeModel(body_tokens, header_name);
+ }
+
+ return body_tokens;
+}
+
+// Forward declaration
+std::string
+GetMetricExpr(const std::string &key,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ std::unordered_map<std::string, std::string> *formula_cache);
+
+std::string ComputeBodyFormula(
+ const MappedData &data,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ std::unordered_map<std::string, std::string> *formula_cache)
+{
+ // Uncore events are not supported here. For a cell containing an
+ // uncore event (name prefix "UNC_"), report a fatal error and bail
+ // out.
+
+ if (data.cell_content_.find("UNC_") != std::string::npos) {
+ FATAL("Found an uncore event in expr: " << data.cell_content_);
+ }
+
+ std::string retval("");
+ std::vector<std::string> retval_tokens;
+ const std::string &header_name = data.header_name_;
+ std::vector<std::string> body_tokens =
+ NormalizeFormula(data.cell_content_, header_name);
+
+ for (auto &body_token : body_tokens) {
+ std::string search_key = body_token + "_" + header_name;
+
+ // Check if the token corresponds to an existing cell.
+ if (dependence_dag.count(search_key) != 0) {
+ // If any of the cell tokens corresponds to an
+ // 'Info.System' cell, then report a fatal
+ // error and bail out.
+ if (dependence_dag.at(search_key).prefix_ ==
+ "Info.System") {
+ FATAL("Formula refers to Info.System: "
+ << data.cell_content_);
+ }
+
+ retval_tokens.push_back(GetMetricExpr(
+ search_key, dependence_dag, formula_cache));
+ continue;
+ }
+
+ // Check if the token is an operator or a constant.
+ if (IsOperator(body_token) || IsConstant(body_token)) {
+ retval_tokens.push_back(body_token);
+ continue;
+ }
+
+ // Check if the token is "NA"
+ if (body_token == "#NA" || body_token == "NA" ||
+ body_token == "N/A") {
+ retval_tokens.push_back("NOT_APPLICABLE");
+ continue;
+ }
+
+ // Check if the token is an event.
+ const EventInfo *event_data;
+ std::vector<std::string> tokens;
+ if (GetEventInfo(body_token, header_name, &event_data,
+ &tokens)) {
+ retval_tokens.push_back(
+ GetEventString(body_token, header_name));
+ continue;
+ }
+
+ // Unknown token: report an error but keep going, so that all
+ // the missing definition errors are emitted before we bail out.
+ ERROR("Missing definition of "
+ << body_token << " in the formula: " << data.cell_content_
+ << " for CPU: " << header_name);
+ retval_tokens.push_back(body_token);
+ }
+
+ for (auto &retval_token : retval_tokens) {
+ retval += retval_token;
+ }
+
+ return (retval);
+}
+
+std::string
+GetMetricExpr(const std::string &key,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ std::unordered_map<std::string, std::string> *formula_cache)
+{
+ std::string retval("0.0");
+ const MappedData &cell_data = dependence_dag.at(key);
+
+ // Check if the function name corresponds to an external
+ // parameter
+ std::pair<std::string, std::pair<std::string, std::string> >
+ external_param_info;
+ bool isExtParam = CheckExternalParameter(key, &external_param_info);
+
+ // Skip generating the function definitions
+ // for certain conditions.
+ if ((!isExtParam && cell_data.cell_content_.empty()) ||
+ cell_data.cell_content_ == "#NA" ||
+ cell_data.cell_content_ == "N/A" ||
+ cell_data.cell_content_ == "NA" || cell_data.cell_content_ == "-" ||
+ cell_data.prefix_ == "Info.System") {
+ return "NOT_APPLICABLE";
+ }
+
+ if (0 != formula_cache->count(key)) {
+ return (*formula_cache)[key];
+ }
+
+ if (isExtParam) {
+ retval = external_param_info.second.second;
+ } else {
+ retval = "(" +
+ ComputeBodyFormula(cell_data, dependence_dag,
+ formula_cache) +
+ ")";
+ }
+
+ (*formula_cache)[key] = retval;
+ return retval;
+}
+
+/**
+ * For a metric group of the form mg1;mg2, the function ProcessMetricGroup
+ * returns <prefix>mg1;<prefix>mg2
+ */
+std::string ProcessMetricGroup(const std::string &metric_group,
+ const std::string &prefix)
+{
+ std::string retval("");
+ std::vector<std::string> metric_group_tokens = Split(metric_group, ';');
+
+ for (size_t i = 0; i < metric_group_tokens.size(); ++i) {
+ if (i == 0) {
+ retval += prefix + metric_group_tokens[i];
+ continue;
+ }
+ retval += ";" + prefix + metric_group_tokens[i];
+ }
+ return retval;
+}
+
+/**
+ * Generate topdown json records. Each record contains
+ * 1. A BriefDescription of the metric.
+ * 2. A Metric Group as specified in the input csv file.
+ * 3. Name of the metric
+ * 4. The metric expression: For example, say the expressions for metrics
+ * M1 and M2 are (e1 op1 e2) and (e3 op2 e4) respectively, where ei
+ * is an event and opi is some operator. For a metric M with
+ * expression (e5 op3 M1 op4 M2), the flattened expression for M is
+ * e5 op3 (e1 op1 e2) op4 (e3 op2 e4)
+ */
+void GenTopdownRecords(
+ std::ofstream &ofile_json, const std::string &metric,
+ const std::string &child_metric,
+ const std::unordered_map<std::string, MappedData> &dependence_dag,
+ const std::string &cpu)
+{
+ std::string key = child_metric + "_" + cpu;
+
+ if (dependence_dag.count(key) == 0) {
+ FATAL("Topdown key: " << key << " not found for metric: "
+ << metric << ", CPU: " << cpu);
+ }
+
+ const MappedData &cell_data = dependence_dag.at(key);
+
+ // Get "BriefDescription" json key
+ std::string brief_description = cell_data.description_;
+
+ // Get flattened "MetricExpr" json key.
+ std::unordered_map<std::string, std::string> formula_cache;
+ std::string metric_expr =
+ GetMetricExpr(key, dependence_dag, &formula_cache);
+
+ // Format the expression
+ metric_expr = FormatFormula(metric_expr);
+
+ // Remove false branch.
+ std::string metric_expr_false_branch_removed;
+ size_t cursor = 0;
+ yy::parser parser(metric_expr, &cursor, false /* do not convert if stmt */,
+ true /* remove false branch */,
+ false /* do not wrap div operator in a function */,
+ &metric_expr_false_branch_removed);
+ if (parser.parse())
+ FATAL("Parsing error");
+
+ // Check if the flattened expression has a "NOT_APPLICABLE"
+ // string. If yes, it means that the metric expression is not valid
+ // for `cpu` and we can ignore the metric `child_metric`.
+ //
+ // Note: This check needs to be done after "Removing false
+ // branches". This is because we might have a flattened
+ // expression like (e1 op "NOT_APPLICABLE" if 0 else e2). Even
+ // though the expression contains "NOT_APPLICABLE", we
+ // should not ignore the metric as the "NOT_APPLICABLE" appears
+ // only in the branch that gets removed.
+ if (std::string::npos !=
+ metric_expr_false_branch_removed.find("NOT_APPLICABLE")) {
+ return;
+ }
+
+ // Get "MetricGroup" json key
+ std::string metric_group = cell_data.metric_group_;
+
+ // Get "MetricName" json key
+ std::string metric_name = cell_data.metric_name_;
+
+ ofile_json << " {\n";
+ ofile_json << "\t\t\"BriefDescription\": \"" << brief_description
+ << "\",\n";
+ ofile_json << "\t\t\"MetricExpr\": \""
+ << metric_expr_false_branch_removed << "\",\n";
+ ofile_json << "\t\t\"MetricGroup\": \""
+ << ProcessMetricGroup(metric_group, "Topdown_Group_")
+ << "\",\n";
+ ofile_json << "\t\t\"MetricName\": \""
+ << "Topdown_Metric_" + metric_name << "\"\n";
+ ofile_json << " },\n";
+}
+
+/**
+ * CodeGen generates metric json files (e.g. skx-topdown-metric.json)
+ */
+void CodeGenPerfJson(
+ const std::unordered_map<std::string, MappedData> &dependence_dag)
+{
+ const std::set<std::string> compact_cpus_to_handle(
+ g_RelevantCpus->begin(), g_RelevantCpus->end());
+
+ for (const std::string &cpu : compact_cpus_to_handle) {
+ // For the CPUs JKT and SNB-EP, generate output only for
+ // JKT.
+ // This is because:
+ // 1. All the members in a group share the same formula
+ // (as specified in the input csv file as JKT/SNB-EP)
+ // and same event encoding json files.
+ // 2. pmu-events/arch/x86 hosts directory only for
+ // jaketown
+ if ((cpu == "SNB-EP" &&
+ compact_cpus_to_handle.count("JKT") != 0)) {
+ continue;
+ }
+
+ std::string outfile = kConfigParams->output_path_ + "/";
+
+ // If (per CPU output directory is not specified or
+ // it is specified but does not exist)
+ // dump the JSON file in kConfigParams->output_path_
+ // Else
+ // dump the JSON file in
+ // kConfigParams->output_path_/<per cpu dir>
+ if (kConfigParams->output_directory_per_cpu_.count(cpu) == 0 ||
+ !CheckDirPathExists(
+ outfile +
+ kConfigParams->output_directory_per_cpu_.at(cpu))) {
+ INFO("No CPU specific directory found under"
+ << " Path " << outfile << " for CPU " << cpu);
+ INFO("Either directory "
+ << outfile
+ << "<per cpu directory> does not exist, "
+ "or there is no CPU specific "
+ "output directory "
+ "mentioned under JSON key "
+ "\"output_directory_per_cpu\" for "
+ << cpu);
+ outfile += ToLower(cpu) + "-topdown-metric.json";
+ } else {
+ outfile += kConfigParams->output_directory_per_cpu_.at(
+ cpu) +
+ "/" + ToLower(cpu) + "-topdown-metric.json";
+ }
+
+ std::ofstream ofile_json(outfile);
+
+ if (false == ofile_json.is_open()) {
+ FATAL("Cannot open metric json file: " << outfile);
+ }
+ INFO("Generating metric json file: " << outfile << "\n");
+
+ ofile_json << "[\n";
+
+ for (auto &p : *g_TopdownHierarchy) {
+ const std::string &parent_metric = p.first;
+ std::vector<std::string> &child_metrics =
+ p.second.child_metrics;
+
+ for (size_t i = 0; i < child_metrics.size(); ++i) {
+ GenTopdownRecords(ofile_json, parent_metric,
+ child_metrics[i],
+ dependence_dag, cpu);
+ }
+ }
+
+ ofile_json << "\n]";
+ ofile_json.close();
+ }
+}
+
+} // namespace
+
+TargetInfo kTargetPerfJson = {
+ .name = "perf_json",
+ .description = "The generated code includes:\n"
+ "<cpu>-topdown-metric.json:"
+ "Per cpu json file encoding the topdown "
+ "metric formulas\n",
+ .codegen_entry_point = &CodeGenPerfJson,
+ .codegen_test_harness_entry_point = nullptr,
+};
+
+} // namespace topdown_parser
diff --git a/tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.h b/tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.h
new file mode 100644
index 000000000000..bb4fe7776f2b
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/code_gen_target_perf_json.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+// --------------------------------------------------------------
+// File: code_gen_target_perf_json.h
+// -------------------------------------------------------------
+//
+// The header file provides the interface to generate JSON files encoding
+// topdown formulas to be used by upstream perf.
+
+#ifndef TOPDOWN_PARSER_CODE_GEN_TARGET_PERF_JSON_H_
+#define TOPDOWN_PARSER_CODE_GEN_TARGET_PERF_JSON_H_
+
+#include "code_gen_target.h"
+
+namespace topdown_parser
+{
+/**
+ * Target information for generating JSON code for perf encoding the
+ * topdown metric expressions.
+ */
+extern TargetInfo kTargetPerfJson;
+
+} // namespace topdown_parser
+
+#endif // TOPDOWN_PARSER_CODE_GEN_TARGET_PERF_JSON_H_
--
2.29.2.222.g5d2a92d10f8-goog
* [RFC PATCH 11/12] perf topdown-parser: Main driver.
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (9 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 10/12] perf topdown-parser: Add json metric code generation Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-10 10:03 ` [RFC PATCH 12/12] perf pmu-events: Topdown parser tool Ian Rogers
2020-11-11 21:46 ` [RFC PATCH 00/12] Topdown parser Andi Kleen
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
Invoke the necessary configuration reading and parsing, then code
generation. Handles command line arguments.
Add a minor README.
Co-authored-by: Stephane Eranian <eranian@google.com>
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
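Note (below the cut, not part of the commit message): a minimal sketch
of the getopt_long() pattern used by the driver. It is an illustration
under assumptions: DemoOptions and ParseDemoArgs are hypothetical
names and only three of the options are modelled. A long option is
matched through the table and reported as the character in its last
field; a short option such as -o is only accepted if it is also listed
in the short option string, followed by ':' when it takes an argument.

```cpp
#include <getopt.h>

#include <cassert>
#include <string>

// Hypothetical option bundle for the sketch.
struct DemoOptions {
	std::string csv_file;
	std::string output_path;
	bool dump_events = false;
};

// Parse argv into `opts`. Returns false on an unknown option or on a
// missing option argument.
bool ParseDemoArgs(int argc, char **argv, DemoOptions *opts)
{
	assert(opts != nullptr);

	const char *const short_opts = "f:o:d";
	const option long_opts[] = {
		{ "csv-file", required_argument, nullptr, 'f' },
		{ "output-path", required_argument, nullptr, 'o' },
		{ "dump-events", no_argument, nullptr, 'd' },
		{ nullptr, 0, nullptr, 0 }
	};

	optind = 1; // restart scanning so the function can be called again
	while (true) {
		const int opt = getopt_long(argc, argv, short_opts,
					    long_opts, nullptr);
		if (opt == -1)
			break;

		switch (opt) {
		case 'f':
			opts->csv_file = optarg;
			break;
		case 'o':
			opts->output_path = optarg;
			break;
		case 'd':
			opts->dump_events = true;
			break;
		default: // '?' for unknown options or missing arguments
			return false;
		}
	}
	return true;
}
```

Collecting the values in a struct that the caller owns is a choice
made for the sketch; it keeps the parser free of global state.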
tools/perf/pmu-events/topdown-parser/README | 5 +
.../topdown-parser/topdown_parser_main.cpp | 155 ++++++++++++++++++
2 files changed, 160 insertions(+)
create mode 100644 tools/perf/pmu-events/topdown-parser/README
create mode 100644 tools/perf/pmu-events/topdown-parser/topdown_parser_main.cpp
diff --git a/tools/perf/pmu-events/topdown-parser/README b/tools/perf/pmu-events/topdown-parser/README
new file mode 100644
index 000000000000..7f100792b00c
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/README
@@ -0,0 +1,5 @@
+Topdown parser and code generator
+=================================
+
+The topdown parser processes a TMA_Metrics.csv file and generates
+Intel specific metrics from the data in the spreadsheet cells.
\ No newline at end of file
diff --git a/tools/perf/pmu-events/topdown-parser/topdown_parser_main.cpp b/tools/perf/pmu-events/topdown-parser/topdown_parser_main.cpp
new file mode 100644
index 000000000000..ba9acd32726e
--- /dev/null
+++ b/tools/perf/pmu-events/topdown-parser/topdown_parser_main.cpp
@@ -0,0 +1,155 @@
+/*
+ * Copyright 2020 Google LLC.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include <getopt.h>
+#include <stdlib.h>
+
+#include <iostream>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+#include "code_gen_target.h"
+#include "configuration.h"
+#include "csvreader.h"
+#include "dependence_dag_utils.h"
+#include "event_info.h"
+#include "logging.h"
+
+namespace topdown_parser
+{
+namespace
+{
+/**
+ * Print the usage message
+ */
+[[noreturn]] void ShowUsage()
+{
+ std::cout
+ << "\n"
+ " Usage: topdown_parser --csv-file <csv input file>\n"
+ " --events-data-dir <path of event "
+ "encoding JSON files>\n"
+ " --config-file <path to "
+ "configuration.json>\n"
+ " --output-path <path to "
+ "output the topdown file(s)>\n"
+ " [Options]\n"
+ " Synopsis: Auto-generates topdown metric json files\n\n"
+ " Options\n"
+ "\t--dump-events : Dump the unique events for each metric.\n"
+ "\t                Used for testing.\n"
+ "\t--help : Show help\n";
+ exit(0);
+}
+
+/**
+ * The input csv file name specifying formula encoding for topdown
+ * metric
+ */
+char *g_CsvFile = nullptr;
+
+/**
+ * ProcessArgs parses command-line arguments
+ */
+bool ProcessArgs(int argc, char **argv)
+{
+ // The following command-line arguments to the program
+ // topdown_parser are required: --csv-file <file>,
+ // --events-data-dir <dir>, --config-file <file>, --output-path
+ // <path>
+ if (argc < 9) {
+ ShowUsage();
+ return false;
+ }
+
+ const char *const short_opts = "f:a:z:o:hd";
+ const option long_opts[] = {
+ { "csv-file", required_argument, nullptr, 'f' },
+ { "events-data-dir", required_argument, nullptr, 'a' },
+ { "config-file", required_argument, nullptr, 'z' },
+ { "output-path", required_argument, nullptr, 'o' },
+ { "dump-events", no_argument, nullptr, 'd' },
+ { "help", no_argument, nullptr, 'h' },
+ { nullptr, no_argument, nullptr, 0 }
+ };
+
+ while (true) {
+ const auto opt =
+ getopt_long(argc, argv, short_opts, long_opts, nullptr);
+
+ if (opt == -1)
+ break;
+
+ switch (opt) {
+ case 'f':
+ g_CsvFile = optarg;
+ break;
+
+ case 'a':
+ kConfigParams->event_data_dir_ = optarg;
+ kConfigParams->event_data_dir_ += "/";
+ break;
+
+ case 'z':
+ kConfigParams->config_file_ = optarg;
+ break;
+
+ case 'o':
+ kConfigParams->output_path_ = optarg;
+ break;
+
+ case 'd':
+ g_DumpEvents = true;
+ break;
+
+ case 'h':
+ case '?':
+ default:
+ ShowUsage();
+ return false;
+ }
+ }
+
+ INFO("csv filename: |" << g_CsvFile << "|");
+ INFO("events data dir: |" << kConfigParams->event_data_dir_ << "|");
+ INFO("config file : |" << kConfigParams->config_file_ << "|");
+ return true;
+}
+
+} // namespace
+
+} // namespace topdown_parser
+
+/**
+ * Main driver function for generating topdown files.
+ */
+int main(int argc, char *argv[])
+{
+ bool process_arg_stat = topdown_parser::ProcessArgs(argc, argv);
+ if (!process_arg_stat) {
+ FATAL("Failed to process the command-line arguments");
+ }
+
+ // Read the configuration file "configuration.json"
+ int read_config_stat = topdown_parser::ReadConfig();
+ if (read_config_stat != 0) {
+ FATAL("Failed to read configuration file");
+ }
+
+ // Read the input csv file
+ topdown_parser::CsvReader reader(topdown_parser::g_CsvFile);
+ std::vector<std::vector<std::string> > records = reader.getData();
+ std::unordered_map<std::string, topdown_parser::MappedData>
+ dependence_dag = topdown_parser::ProcessRecords(&records);
+
+ // Read and process the json files specifying the event encodings
+ topdown_parser::ProcessEventEncodings();
+
+ // Generate topdown files for a specific target (or purpose)
+ topdown_parser::CodeGenTarget(dependence_dag);
+
+ return 0;
+}
--
2.29.2.222.g5d2a92d10f8-goog
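A note on the option handling in ProcessArgs() above: the `argc < 9` check assumes every option and its value are separate argv entries, so `--csv-file=foo` style invocations would be rejected. A minimal sketch of an alternative that validates the parsed values instead is shown below. The `Options` struct and `ParseOptions()` name are hypothetical and only stand in for g_CsvFile and kConfigParams; the option table mirrors the one in the patch.

```cpp
#include <getopt.h>

#include <string>

// Hypothetical container for the parsed values; the patch itself stores
// them in g_CsvFile and kConfigParams.
struct Options {
	std::string csv_file;
	std::string events_data_dir;
	std::string config_file;
	std::string output_path;
	bool dump_events = false;
};

// Returns true only if all four required options were supplied.
bool ParseOptions(int argc, char *argv[], Options *opts)
{
	// Each short option that takes a value is followed by ':'.
	static const char *const short_opts = "f:a:z:o:dh";
	static const option long_opts[] = {
		{ "csv-file", required_argument, nullptr, 'f' },
		{ "events-data-dir", required_argument, nullptr, 'a' },
		{ "config-file", required_argument, nullptr, 'z' },
		{ "output-path", required_argument, nullptr, 'o' },
		{ "dump-events", no_argument, nullptr, 'd' },
		{ "help", no_argument, nullptr, 'h' },
		{ nullptr, 0, nullptr, 0 }
	};

	optind = 1; // Restart scanning so the function may be called again.
	while (true) {
		const int opt = getopt_long(argc, argv, short_opts, long_opts,
					    nullptr);
		if (opt == -1)
			break;

		switch (opt) {
		case 'f':
			opts->csv_file = optarg;
			break;
		case 'a':
			// As in the patch, keep a trailing '/' on the directory.
			opts->events_data_dir = std::string(optarg) + "/";
			break;
		case 'z':
			opts->config_file = optarg;
			break;
		case 'o':
			opts->output_path = optarg;
			break;
		case 'd':
			opts->dump_events = true;
			break;
		case 'h':
		case '?':
		default:
			return false;
		}
	}

	// Check the values rather than counting argv entries.
	return !opts->csv_file.empty() && !opts->events_data_dir.empty() &&
	       !opts->config_file.empty() && !opts->output_path.empty();
}
```

With this shape the caller prints the usage text whenever ParseOptions() returns false, which covers both `--help` and a missing required option.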
* [RFC PATCH 12/12] perf pmu-events: Topdown parser tool
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (10 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 11/12] perf topdown-parser: Main driver Ian Rogers
@ 2020-11-10 10:03 ` Ian Rogers
2020-11-11 21:46 ` [RFC PATCH 00/12] Topdown parser Andi Kleen
12 siblings, 0 replies; 16+ messages in thread
From: Ian Rogers @ 2020-11-10 10:03 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Andi Kleen, Jin Yao, John Garry, Paul Clarke,
kajoljain
Cc: Stephane Eranian, Sandeep Dasgupta, linux-perf-users, Ian Rogers
From: Sandeep Dasgupta <sdasgup@google.com>
A tool that processes Intel's TMA_Metrics.csv and generates metrics
encoded as json from it.
As an example, the build here is configured to wget TMA_Metrics.csv
from download.01.org/perfmon and then build the metric json using the
events encoded in the pmu-events directory. Because TMA_Metrics.csv is
newer than the in-tree event json, some event encodings are missing
and are warned about, in particular icelake PERF_METRICS.*.
On a Skylakex, 'perf list metricgroups' shows the following new
groups:
Topdown_Group_Backend
Topdown_Group_BadSpec
Topdown_Group_BrMispredicts
Topdown_Group_Cache_Misses
Topdown_Group_DSB
Topdown_Group_FLOPS
Topdown_Group_Fetch_BW
Topdown_Group_Fetch_Lat
Topdown_Group_Frontend
Topdown_Group_HPC
Topdown_Group_IcMiss
Topdown_Group_Machine_Clears
Topdown_Group_Memory_BW
Topdown_Group_Memory_Bound
Topdown_Group_Memory_Lat
Topdown_Group_MicroSeq
Topdown_Group_Offcore
Topdown_Group_Ports_Utilization
Topdown_Group_Retire
Topdown_Group_TLB
Topdown_Group_TopDownL1
Topdown_Group_TopDownL2
And the following new metrics:
Topdown_Metric_Backend_Bound
Topdown_Metric_Bad_Speculation
Topdown_Metric_Branch_Mispredicts
Topdown_Metric_Branch_Resteers
Topdown_Metric_Core_Bound
Topdown_Metric_DRAM_Bound
Topdown_Metric_DSB
Topdown_Metric_DSB_Switches
Topdown_Metric_DTLB_Load
Topdown_Metric_Divider
Topdown_Metric_FB_Full
Topdown_Metric_FP_Arith
Topdown_Metric_FP_Scalar
Topdown_Metric_FP_Vector
Topdown_Metric_Fetch_Bandwidth
Topdown_Metric_Fetch_Latency
Topdown_Metric_Frontend_Bound
Topdown_Metric_Heavy_Operations
Topdown_Metric_ICache_Misses
Topdown_Metric_ITLB_Misses
Topdown_Metric_L1_Bound
Topdown_Metric_L2_Bound
Topdown_Metric_L3_Bound
Topdown_Metric_Light_Operations
Topdown_Metric_MEM_Bandwidth
Topdown_Metric_MEM_Latency
Topdown_Metric_MITE
Topdown_Metric_MS_Switches
Topdown_Metric_Machine_Clears
Topdown_Metric_Memory_Bound
Topdown_Metric_Microcode_Sequencer
Topdown_Metric_Other
Topdown_Metric_Ports_Utilization
Topdown_Metric_Retiring
Topdown_Metric_Serializing_Operation
Topdown_Metric_Store_Bound
Using one of the metric groups shows:
$ perf stat -M Topdown_Group_TopDownL1 -a

 Performance counter stats for 'system wide':

    18,224,977,565  cpu/idq_uops_not_delivered.core,edge,any,inv/   # 0.38 Topdown_Metric_Frontend_Bound
                                                                    # 0.44 Topdown_Metric_Backend_Bound    (57.11%)
       450,438,658  cpu/int_misc.recovery_cycles,edge,any,inv/      # 0.07 Topdown_Metric_Bad_Speculation  (57.11%)
    11,981,273,993  cpu/cpu_clk_unhalted.thread,edge,any,inv/       # 0.11 Topdown_Metric_Retiring         (57.13%)
     5,288,258,009  cpu/uops_retired.retire_slots,edge,any,inv/                                            (57.17%)
     6,808,261,153  cpu/uops_issued.any,edge,any,inv/                                                      (57.19%)
       456,255,269  cpu/int_misc.recovery_cycles_any,edge,any,inv/                                         (57.17%)
    12,383,804,530  cpu/cpu_clk_unhalted.thread_any,edge,any,inv/                                          (57.12%)

      10.159307832 seconds time elapsed
Co-authored-by: Stephane Eranian <eranian@google.com>
Co-authored-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandeep Dasgupta <sdasgup@google.com>
---
tools/perf/Makefile.perf | 13 +++++++++-
tools/perf/pmu-events/Build | 50 ++++++++++++++++++++++++++++++++++---
2 files changed, 58 insertions(+), 5 deletions(-)
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 7ce3f2e8b9c7..b1f4145ca757 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -634,6 +634,11 @@ strip: $(PROGRAMS) $(OUTPUT)perf
PERF_IN := $(OUTPUT)perf-in.o
+TOPDOWN_PARSER := $(OUTPUT)pmu-events/topdown_parser
+TOPDOWN_PARSER_IN := $(OUTPUT)pmu-events/topdown_parser-in.o
+
+export TOPDOWN_PARSER
+
JEVENTS := $(OUTPUT)pmu-events/jevents
JEVENTS_IN := $(OUTPUT)pmu-events/jevents-in.o
@@ -646,13 +651,19 @@ build := -f $(srctree)/tools/build/Makefile.build dir=. obj
$(PERF_IN): prepare FORCE
$(Q)$(MAKE) $(build)=perf
+$(TOPDOWN_PARSER_IN): FORCE
+ $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events obj=topdown_parser
+
+$(TOPDOWN_PARSER): $(TOPDOWN_PARSER_IN)
+ $(QUIET_LINK)$(HOSTCC) $(TOPDOWN_PARSER_IN) -lstdc++ -o $@
+
$(JEVENTS_IN): FORCE
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events obj=jevents
$(JEVENTS): $(JEVENTS_IN)
$(QUIET_LINK)$(HOSTCC) $(JEVENTS_IN) -o $@
-$(PMU_EVENTS_IN): $(JEVENTS) FORCE
+$(PMU_EVENTS_IN): $(JEVENTS) $(TOPDOWN_PARSER) FORCE
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events obj=pmu-events
$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) $(LIBTRACEEVENT_DYNAMIC_LIST)
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build
index 215ba30b8534..d54bf9e8c224 100644
--- a/tools/perf/pmu-events/Build
+++ b/tools/perf/pmu-events/Build
@@ -1,15 +1,57 @@
-hostprogs := jevents
+hostprogs := jevents topdown_parser
jevents-y += json.o jsmn.o jevents.o
HOSTCFLAGS_jevents.o = -I$(srctree)/tools/include
-pmu-events-y += pmu-events.o
+
+topdown_parser-y += topdown-parser/code_gen_target.o
+topdown_parser-y += topdown-parser/code_gen_target_perf_json.o
+topdown_parser-y += topdown-parser/configuration.o
+topdown_parser-y += topdown-parser/csvreader.o
+topdown_parser-y += topdown-parser/dependence_dag_utils.o
+topdown_parser-y += topdown-parser/event_info.o
+topdown_parser-y += topdown-parser/expr_parser-bison.o
+topdown_parser-y += topdown-parser/general_utils.o
+topdown_parser-y += topdown-parser/jsmn_extras.o
+topdown_parser-y += topdown-parser/topdown_parser_main.o
+topdown_parser-y += jsmn.o
+CXXFLAGS_topdown_parser += -I$(OUTPUT)pmu-events/topdown-parser
+
+$(OUTPUT)pmu-events/topdown-parser/expr_parser-bison.cpp $(OUTPUT)pmu-events/topdown-parser/expr_parser-bison.hpp: pmu-events/topdown-parser/expr_parser.y
+ $(call rule_mkdir)
+ $(Q)$(call echo-cmd,bison)$(BISON) -v $< -d $(PARSER_DEBUG_BISON) -o $@
+
+$(OUTPUT)pmu-events/topdown-parser/code_gen_target_perf_json.o: pmu-events/topdown-parser/code_gen_target_perf_json.cpp $(OUTPUT)pmu-events/topdown-parser/expr_parser-bison.hpp
+ $(call rule_mkdir)
+ $(call if_changed_dep,cxx_o_c)
+
+$(OUTPUT)pmu-events/topdown-parser/event_info.o: pmu-events/topdown-parser/event_info.cpp $(OUTPUT)pmu-events/topdown-parser/expr_parser-bison.hpp
+ $(call rule_mkdir)
+ $(call if_changed_dep,cxx_o_c)
+
+TMA_METRICS = $(OUTPUT)pmu-events/TMA_Metrics.csv
+
+$(TMA_METRICS):
+ $(call rule_mkdir)
+ wget -O $@ https://download.01.org/perfmon/TMA_Metrics.csv
+
JDIR = pmu-events/arch/$(SRCARCH)
JSON = $(shell [ -d $(JDIR) ] && \
find $(JDIR) -name '*.json' -o -name 'mapfile.csv')
+$(OUTPUT)pmu-events/arch: pmu-events/topdown-parser/configuration.json $(TOPDOWN_PARSER) $(TMA_METRICS) $(JSON)
+ mkdir -p $(OUTPUT)pmu-events/arch
+ cp -R pmu-events/arch $(OUTPUT)pmu-events/
+ $(TOPDOWN_PARSER) \
+ --csv-file $(TMA_METRICS) \
+ --events-data-dir pmu-events/arch/x86 \
+ --config-file $< \
+ --output-path $(OUTPUT)pmu-events/arch/x86
+
+pmu-events-y += pmu-events.o
+
#
# Locate/process JSON files in pmu-events/arch/
# directory and create tables in pmu-events.c.
#
-$(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JEVENTS)
- $(Q)$(call echo-cmd,gen)$(JEVENTS) $(SRCARCH) pmu-events/arch $(OUTPUT)pmu-events/pmu-events.c $(V)
+$(OUTPUT)pmu-events/pmu-events.c: $(OUTPUT)pmu-events/arch $(JSON) $(JEVENTS)
+ $(Q)$(call echo-cmd,gen)$(JEVENTS) $(SRCARCH) $(OUTPUT)pmu-events/arch $(OUTPUT)pmu-events/pmu-events.c $(V)
--
2.29.2.222.g5d2a92d10f8-goog
* Re: [RFC PATCH 00/12] Topdown parser
2020-11-10 10:03 [RFC PATCH 00/12] Topdown parser Ian Rogers
` (11 preceding siblings ...)
2020-11-10 10:03 ` [RFC PATCH 12/12] perf pmu-events: Topdown parser tool Ian Rogers
@ 2020-11-11 21:46 ` Andi Kleen
[not found] ` <CAP-5=fXedJEZcYhxmPAzRVx5kdW2YA71Ks3BycqurAHydtXh8A@mail.gmail.com>
12 siblings, 1 reply; 16+ messages in thread
From: Andi Kleen @ 2020-11-11 21:46 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
linux-kernel, Jin Yao, John Garry, Paul Clarke, kajoljain,
Stephane Eranian, Sandeep Dasgupta, linux-perf-users
On Tue, Nov 10, 2020 at 02:03:34AM -0800, Ian Rogers wrote:
> This RFC is for a new tool that reads TMA_Metrics.csv as found on
> download.01.org/perfmon and generates metrics and metric groups from
> it. To show the functionality the TMA_Metrics.csv is downloaded, but
> an accepted change would most likely include a copy of this file from
> Intel. With this tool rather than just level 1 topdown metrics, a full
> set of topdown metrics to level 4 are generated.
I'm not sure I understand the motivation for making the spreadsheet parsing
part of the perf tool? It only needs to run once when the metrics are generated
to build perf.
FWIW I did something similar in python (that's how the current metrics
json files were generated from the spreadsheet) and it was a lot
simpler and shorter in a higher level language.
One problem I see with putting the full TopDown model into perf is
that to do a full job it requires a lot more infrastructure that is
currently not implemented in metrics: like an event scheduler,
hierarchical thresholding over different nodes, SMT mode support etc.
I implemented it all in toplev, but it was a lot of work to get it all right.
I'm not saying it's not doable, but it will be a lot of additional work
to work out all the quirks using the metrics infrastructure.
I think adding one or two more levels is probably ok, but doing all levels
without these mechanisms might be difficult to use in the end.
-Andi
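To make the "hierarchical thresholding" point concrete, here is a minimal sketch of the idea, under the assumption that a node is reported only when its own value exceeds its threshold and its parent was reported as well. The Node type, the function name and any threshold values used with it are hypothetical; this is not how toplev or the perf metric code is implemented.

```cpp
#include <string>
#include <vector>

// One node of the topdown hierarchy. Hypothetical type for illustration.
struct Node {
	std::string name;
	double value;     // Measured fraction for this node.
	double threshold; // Node is interesting only above this value.
	std::vector<Node> children;
};

// Walk the tree top down and collect the names of nodes worth reporting.
// A subtree is skipped entirely once its root is at or below its
// threshold, so a large child value under an unimportant parent is
// never shown.
void CollectReported(const Node &node, std::vector<std::string> *out)
{
	if (node.value <= node.threshold)
		return;

	out->push_back(node.name);
	for (const Node &child : node.children)
		CollectReported(child, out);
}
```

Without this pruning, printing every level 3 or level 4 metric unconditionally leaves the user to work out which of the many values actually matter for the workload.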