linux-trace-devel.vger.kernel.org archive mirror
* [RFC v2 0/6] NumPy Interface for KernelShark
@ 2019-04-05 10:14 Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPy interface Yordan Karadzhov
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

NumPy is an efficient multi-dimensional container of generic data.
It uses strong typing in order to provide fast data processing in
Python. The NumPy interface will allow sophisticated analysis of
tracing data via scripts, but it also opens the door for exposing
the kernel tracing data to the instruments provided by the scientific
toolkit of Python (stats, matplotlib, scikit-learn, ...) or maybe
even to PyTorch and TensorFlow in the future.

Disclaimer: I am not very good at Python. Please check as carefully
as possible :-)

Changes in v2:
 - Patch "[RFC 1/7] kernel-shark: kshark_string_config_alloc() must
   take no arguments" has been dropped from the patch-set because it
   was applied by Steven.
 - Proper clean up in the case when data_matrix_alloc() fails to
   allocate memory.
 - In kspy_read_event_field(), tep_find_event() is used instead of
   tep_find_event_by_name(). This makes the retrieval of the field
   value more efficient.
 - Fixed memory leak in kspy_new_session_file().
 - Python code follows the PEP8 Style Guide.
 - load_data() returns a Python dictionary. The user can specify the
   data-fields to be loaded via the named parameters of the function.
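
As a rough illustration of the last item, the sketch below mimics the
calling convention only: boolean named parameters select the columns, and
the result is a dictionary keyed by 'offset', 'cpu', 'time', 'pid' and
'event' (the keys used in patch 4/6). The function load_data_sketch() and
its sample records are hypothetical stand-ins, and the standard-library
array module takes the place of NumPy arrays so that the snippet runs
without the built extension.

```python
from array import array

# Hypothetical sample records: (offset, cpu, timestamp, pid, event_id).
SAMPLE = [
    (4096, 0, 1000, 12, 314),
    (4160, 1, 1005, 34, 316),
    (4224, 0, 1010, 12, 314),
]

# Dictionary key -> (array typecode, index of the value inside a record).
COLUMNS = {
    'offset': ('Q', 0),   # unsigned 64-bit
    'cpu':    ('B', 1),   # unsigned 8-bit
    'time':   ('Q', 2),   # unsigned 64-bit
    'pid':    ('H', 3),   # unsigned 16-bit
    'event':  ('i', 4),   # C int
}


def load_data_sketch(ofst_data=True, cpu_data=True, ts_data=True,
                     pid_data=True, evt_data=True):
    """Stand-in for load_data(): return only the requested columns."""
    wanted = {'offset': ofst_data, 'cpu': cpu_data, 'time': ts_data,
              'pid': pid_data, 'event': evt_data}
    data = {}
    for key, (typecode, idx) in COLUMNS.items():
        if wanted[key]:
            data[key] = array(typecode, (rec[idx] for rec in SAMPLE))
    return data


# Skip the offsets and the CPU ids; load everything else.
data = load_data_sketch(ofst_data=False, cpu_data=False)
print(sorted(data))
print(list(data['time']))
```

With the real interface the values of the dictionary would be NumPy
arrays, so the usual masking and vectorised operations apply to them.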

IMPORTANT: the patch-set must be applied on top of
[RFC 1/7] kernel-shark: kshark_string_config_alloc() must take no arguments

Yordan Karadzhov (6):
  kernel-shark: Add new dataloading method to be used by the NumPy
    interface
  kernel-shark: Prepare for building the NumPy interface
  kernel-shark: Add the core components of the NumPy API
  kernel-shark: Add Numpy Interface for processing of tracing data
  kernel-shark: Add automatic building of the NumPy interface
  kernel-shark: Add basic example demonstrating the NumPy interface

 kernel-shark/CMakeLists.txt                 |   3 +
 kernel-shark/README                         |  12 +-
 kernel-shark/bin/sched_wakeup.py            | 106 +++++++
 kernel-shark/build/FindNumPy.cmake          |  35 +++
 kernel-shark/build/py/libkshark_wrapper.pyx | 302 ++++++++++++++++++++
 kernel-shark/build/py/np_setup.py           | 101 +++++++
 kernel-shark/build/py/pybuild.sh            |  29 ++
 kernel-shark/src/CMakeLists.txt             |  59 +++-
 kernel-shark/src/libkshark-py.c             | 175 ++++++++++++
 kernel-shark/src/libkshark.c                | 136 +++++++++
 kernel-shark/src/libkshark.h                |   7 +
 11 files changed, 954 insertions(+), 11 deletions(-)
 create mode 100755 kernel-shark/bin/sched_wakeup.py
 create mode 100644 kernel-shark/build/FindNumPy.cmake
 create mode 100644 kernel-shark/build/py/libkshark_wrapper.pyx
 create mode 100644 kernel-shark/build/py/np_setup.py
 create mode 100755 kernel-shark/build/py/pybuild.sh
 create mode 100644 kernel-shark/src/libkshark-py.c

-- 
2.19.1



* [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPy interface
  2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
@ 2019-04-05 10:14 ` Yordan Karadzhov
  2019-04-08 15:07   ` Slavomir Kaslev
  2019-04-05 10:14 ` [RFC v2 2/6] kernel-shark: Prepare for building the NumPy interface Yordan Karadzhov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

The new function loads the content of the trace data file into a
table / matrix, made of columns / arrays of data having various integer
types. Later those arrays will be wrapped as NumPy arrays.
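
To make the layout concrete, here is a small sketch of the same idea in
Python rather than C: per-CPU record lists are merged into one
time-ordered stream and written out column by column. It assumes that
each per-CPU list is already ordered by timestamp and that the next
record taken is always the one with the smallest timestamp. The sample
data and the helper name load_matrix_sketch() are hypothetical.

```python
import heapq
from array import array

# Hypothetical per-CPU lists of (timestamp, offset, pid, event_id),
# each one already sorted by timestamp.
PER_CPU = {
    0: [(1000, 4096, 12, 314), (1010, 4224, 12, 314)],
    1: [(1005, 4160, 34, 316)],
}


def load_matrix_sketch(per_cpu):
    """Merge the per-CPU lists by time and fill one typed array per column."""
    offset = array('Q')  # uint64_t
    cpu = array('B')     # uint8_t
    ts = array('Q')      # uint64_t
    pid = array('H')     # uint16_t
    event = array('i')   # int

    # Tag every record with its CPU id so it survives the merge.
    streams = [
        [(t, cpu_id, off, p, ev) for (t, off, p, ev) in records]
        for cpu_id, records in per_cpu.items()
    ]

    # k-way merge: tuples compare by their first element, the timestamp.
    for t, cpu_id, off, p, ev in heapq.merge(*streams):
        ts.append(t)
        cpu.append(cpu_id)
        offset.append(off)
        pid.append(p)
        event.append(ev)

    return offset, cpu, ts, pid, event


offset, cpu, ts, pid, event = load_matrix_sketch(PER_CPU)
print(list(ts))
print(list(cpu))
```

Row i of the "matrix" is the i-th element of every column, so all columns
have the same length and share one time ordering.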

Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
---
 kernel-shark/src/libkshark.c | 136 +++++++++++++++++++++++++++++++++++
 kernel-shark/src/libkshark.h |   7 ++
 2 files changed, 143 insertions(+)

diff --git a/kernel-shark/src/libkshark.c b/kernel-shark/src/libkshark.c
index a886f80..98086a9 100644
--- a/kernel-shark/src/libkshark.c
+++ b/kernel-shark/src/libkshark.c
@@ -959,6 +959,142 @@ ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
 	return -ENOMEM;
 }
 
+static bool data_matrix_alloc(size_t n_rows, uint64_t **offset_array,
+					     uint8_t **cpu_array,
+					     uint64_t **ts_array,
+					     uint16_t **pid_array,
+					     int **event_array)
+{
+	if (offset_array) {
+		*offset_array = calloc(n_rows, sizeof(**offset_array));
+		if (!*offset_array)
+			goto free_all;
+	}
+
+	if (cpu_array) {
+		*cpu_array = calloc(n_rows, sizeof(**cpu_array));
+		if (!*cpu_array)
+			goto free_all;
+	}
+
+	if (ts_array) {
+		*ts_array = calloc(n_rows, sizeof(**ts_array));
+		if (!*ts_array)
+			goto free_all;
+	}
+
+	if (pid_array) {
+		*pid_array = calloc(n_rows, sizeof(**pid_array));
+		if (!*pid_array)
+			goto free_all;
+	}
+
+	if (event_array) {
+		*event_array = calloc(n_rows, sizeof(**event_array));
+		if (!*event_array)
+			goto free_all;
+	}
+
+	return true;
+
+ free_all:
+	fprintf(stderr, "Failed to allocate memory during data loading.\n");
+
+	if (offset_array)
+		free(*offset_array);
+
+	if (cpu_array)
+		free(*cpu_array);
+
+	if (ts_array)
+		free(*ts_array);
+
+	if (pid_array)
+		free(*pid_array);
+
+	if (event_array)
+		free(*event_array);
+
+	return false;
+}
+
+/**
+ * @brief Load the content of the trace data file into a table / matrix made
+ *	  of columns / arrays of data. The user is responsible for freeing the
+ *	  output arrays.
+ *
+ * @param kshark_ctx: Input location for the session context pointer.
+ * @param offset_array: Output location for the array of record offsets.
+ * @param cpu_array: Output location for the array of CPU Ids.
+ * @param ts_array: Output location for the array of timestamps.
+ * @param pid_array: Output location for the array of Process Ids.
+ * @param event_array: Output location for the array of Event Ids.
+ *
+ * @returns The size of the outputted arrays in the case of success, or a
+ *	    negative error code on failure.
+ */
+size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
+			       uint64_t **offset_array,
+			       uint8_t **cpu_array,
+			       uint64_t **ts_array,
+			       uint16_t **pid_array,
+			       int **event_array)
+{
+	enum rec_type type = REC_ENTRY;
+	struct rec_list **rec_list;
+	ssize_t count, total = 0;
+	bool status;
+	int n_cpus;
+
+	total = get_records(kshark_ctx, &rec_list, type);
+	if (total < 0)
+		goto fail;
+
+	status = data_matrix_alloc(total, offset_array,
+					  cpu_array,
+					  ts_array,
+					  pid_array,
+					  event_array);
+	if (!status)
+		goto fail;
+
+	n_cpus = tracecmd_cpus(kshark_ctx->handle);
+
+	for (count = 0; count < total; count++) {
+		int next_cpu;
+
+		next_cpu = pick_next_cpu(rec_list, n_cpus, type);
+		if (next_cpu >= 0) {
+			struct kshark_entry *e = &rec_list[next_cpu]->entry;
+
+			if (offset_array)
+				(*offset_array)[count] = e->offset;
+
+			if (cpu_array)
+				(*cpu_array)[count] = e->cpu;
+
+			if (ts_array)
+				(*ts_array)[count] = e->ts;
+
+			if (pid_array)
+				(*pid_array)[count] = e->pid;
+
+			if (event_array)
+				(*event_array)[count] = e->event_id;
+
+			rec_list[next_cpu] = rec_list[next_cpu]->next;
+			free(e);
+		}
+	}
+
+	free_rec_list(rec_list, n_cpus, type);
+	return total;
+
+ fail:
+	fprintf(stderr, "Failed to allocate memory during data loading.\n");
+	return -ENOMEM;
+}
+
 static const char *kshark_get_latency(struct tep_handle *pe,
 				      struct tep_record *record)
 {
diff --git a/kernel-shark/src/libkshark.h b/kernel-shark/src/libkshark.h
index c218b61..92ade41 100644
--- a/kernel-shark/src/libkshark.h
+++ b/kernel-shark/src/libkshark.h
@@ -149,6 +149,13 @@ ssize_t kshark_load_data_entries(struct kshark_context *kshark_ctx,
 ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
 				 struct tep_record ***data_rows);
 
+size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
+			       uint64_t **offset_array,
+			       uint8_t **cpu_array,
+			       uint64_t **ts_array,
+			       uint16_t **pid_array,
+			       int **event_array);
+
 ssize_t kshark_get_task_pids(struct kshark_context *kshark_ctx, int **pids);
 
 void kshark_close(struct kshark_context *kshark_ctx);
-- 
2.19.1



* [RFC v2 2/6] kernel-shark: Prepare for building the NumPy interface
  2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPy interface Yordan Karadzhov
@ 2019-04-05 10:14 ` Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 3/6] kernel-shark: Add the core components of the NumPy API Yordan Karadzhov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

This patch prepares the CMake build infrastructure for the
introduction of the NumPy interface.

We add building of a static version of the C API library to be used by
the interface. The NumPy interface itself will be added in the following
patches.
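
For readers less familiar with CMake, the detection done by
FindNumPy.cmake in this patch amounts to running a one-line Python
program and looking at its exit status. The sketch below restates that
logic in Python. It is an illustration, not part of the build: the helper
name probe() is hypothetical, and it uses the running interpreter
(sys.executable) instead of whatever "python" resolves to on the PATH.

```python
import subprocess
import sys


def probe(module):
    """Return (found, text): the version if the import works, else the error."""
    code = ("import {0}; "
            "print(getattr({0}, '__version__', 'unknown'))").format(module)
    res = subprocess.run([sys.executable, "-c", code],
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         universal_newlines=True)
    if res.returncode == 0:
        return True, res.stdout.strip()

    # Keep only the last line of the traceback, e.g. "ModuleNotFoundError: ...".
    lines = res.stderr.strip().splitlines()
    return False, lines[-1] if lines else ''


for name in ("Cython", "numpy"):
    found, text = probe(name)
    print("Found" if found else "Could not find", name, text)
```

A zero exit status corresponds to CYTHON_FOUND / NUMPY_FOUND being set to
TRUE, and the static library is only built when both probes and the
PythonLibs check succeed.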

Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
---
 kernel-shark/CMakeLists.txt        |  3 +++
 kernel-shark/README                | 12 ++++++++--
 kernel-shark/build/FindNumPy.cmake | 35 ++++++++++++++++++++++++++++
 kernel-shark/src/CMakeLists.txt    | 37 ++++++++++++++++++++++--------
 4 files changed, 76 insertions(+), 11 deletions(-)
 create mode 100644 kernel-shark/build/FindNumPy.cmake

diff --git a/kernel-shark/CMakeLists.txt b/kernel-shark/CMakeLists.txt
index e0778ba..c6a4abf 100644
--- a/kernel-shark/CMakeLists.txt
+++ b/kernel-shark/CMakeLists.txt
@@ -34,6 +34,9 @@ if (Qt5Widgets_FOUND)
 
 endif (Qt5Widgets_FOUND)
 
+find_package(PythonLibs)
+include(${KS_DIR}/build/FindNumPy.cmake)
+
 set(LIBRARY_OUTPUT_PATH    "${KS_DIR}/lib")
 set(EXECUTABLE_OUTPUT_PATH "${KS_DIR}/bin")
 
diff --git a/kernel-shark/README b/kernel-shark/README
index b238d3f..a75b08b 100644
--- a/kernel-shark/README
+++ b/kernel-shark/README
@@ -12,7 +12,11 @@ KernelShark has the following external dependencies:
     sudo apt-get install freeglut3-dev libxmu-dev libxi-dev -y
     sudo apt-get install qtbase5-dev -y
 
-1.1 I you want to be able to generate Doxygen documentation:
+1.1 If you want to be able to use the NumPy Interface of KernelShark:
+    sudo apt-get install libpython3-dev cython3 -y
+    sudo apt-get install python3-numpy python3-matplotlib -y
+
+1.2 If you want to be able to generate Doxygen documentation:
     sudo apt-get install graphviz doxygen-gui -y
 
 
@@ -21,7 +25,11 @@ KernelShark has the following external dependencies:
     dnf install freeglut-devel redhat-rpm-config -y
     dnf install qt5-qtbase-devel -y
 
-2.1 I you want to be able to generate Doxygen documentation:
+2.1 If you want to be able to use the NumPy Interface of KernelShark:
+    dnf install python3-devel python-Cython -y
+    dnf install python-numpy python3-matplotlib -y
+
+2.2 If you want to be able to generate Doxygen documentation:
     dnf install graphviz doxygen -y
 
 
diff --git a/kernel-shark/build/FindNumPy.cmake b/kernel-shark/build/FindNumPy.cmake
new file mode 100644
index 0000000..b23440c
--- /dev/null
+++ b/kernel-shark/build/FindNumPy.cmake
@@ -0,0 +1,35 @@
+execute_process(COMMAND python -c "import Cython; print(Cython.__version__)"
+                RESULT_VARIABLE CYTHON_RES
+                OUTPUT_VARIABLE CYTHON_VERSION
+                ERROR_VARIABLE CYTHON_ERR
+                OUTPUT_STRIP_TRAILING_WHITESPACE)
+
+IF (CYTHON_RES MATCHES 0)
+
+  SET(CYTHON_FOUND TRUE)
+  message(STATUS "Found Cython:  (version: ${CYTHON_VERSION})")
+
+ELSE (CYTHON_RES MATCHES 0)
+
+  SET(CYTHON_FOUND FALSE)
+  message(STATUS "\nCould not find CYTHON:  ${CYTHON_ERR}\n")
+
+ENDIF (CYTHON_RES MATCHES 0)
+
+execute_process(COMMAND python -c "import numpy; print(numpy.__version__)"
+                RESULT_VARIABLE NP_RES
+                OUTPUT_VARIABLE NUMPY_VERSION
+                ERROR_VARIABLE NP_ERR
+                OUTPUT_STRIP_TRAILING_WHITESPACE)
+
+IF (NP_RES MATCHES 0)
+
+  SET(NUMPY_FOUND TRUE)
+  message(STATUS "Found NumPy:  (version: ${NUMPY_VERSION})")
+
+ELSE (NP_RES MATCHES 0)
+
+  SET(NUMPY_FOUND FALSE)
+  message(STATUS "\nCould not find NumPy:  ${NP_ERR}\n")
+
+ENDIF (NP_RES MATCHES 0)
diff --git a/kernel-shark/src/CMakeLists.txt b/kernel-shark/src/CMakeLists.txt
index b7dbd7e..b9a05e1 100644
--- a/kernel-shark/src/CMakeLists.txt
+++ b/kernel-shark/src/CMakeLists.txt
@@ -1,16 +1,21 @@
 message("\n src ...")
 
 message(STATUS "libkshark")
-add_library(kshark SHARED libkshark.c
-                          libkshark-model.c
-                          libkshark-plugin.c
-                          libkshark-configio.c
-                          libkshark-collection.c)
 
-target_link_libraries(kshark ${CMAKE_DL_LIBS}
-                             ${JSONC_LIBRARY}
-                             ${TRACEEVENT_LIBRARY}
-                             ${TRACECMD_LIBRARY})
+set(LIBKSHARK_SOURCE libkshark.c
+                     libkshark-model.c
+                     libkshark-plugin.c
+                     libkshark-configio.c
+                     libkshark-collection.c)
+
+set(LIBKSHARK_LINK_LIBS ${CMAKE_DL_LIBS}
+                        ${JSONC_LIBRARY}
+                        ${TRACEEVENT_LIBRARY}
+                        ${TRACECMD_LIBRARY})
+
+add_library(kshark SHARED ${LIBKSHARK_SOURCE})
+
+target_link_libraries(kshark ${LIBKSHARK_LINK_LIBS})
 
 set_target_properties(kshark  PROPERTIES SUFFIX	".so.${KS_VERSION_STRING}")
 
@@ -28,6 +33,20 @@ if (OPENGL_FOUND AND GLUT_FOUND)
 
 endif (OPENGL_FOUND AND GLUT_FOUND)
 
+if (PYTHONLIBS_FOUND AND CYTHON_FOUND AND NUMPY_FOUND)
+
+    message(STATUS "kshark_wrapper")
+
+    add_library(kshark-static STATIC ${LIBKSHARK_SOURCE})
+
+    target_compile_options(kshark-static PUBLIC "-fPIC")
+
+    set_target_properties(kshark-static PROPERTIES OUTPUT_NAME kshark)
+
+    target_link_libraries(kshark-static ${LIBKSHARK_LINK_LIBS})
+
+endif (PYTHONLIBS_FOUND AND CYTHON_FOUND AND NUMPY_FOUND)
+
 if (Qt5Widgets_FOUND AND Qt5Network_FOUND)
 
     message(STATUS "libkshark-gui")
-- 
2.19.1



* [RFC v2 3/6] kernel-shark: Add the core components of the NumPy API
  2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPy interface Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 2/6] kernel-shark: Prepare for building the NumPy interface Yordan Karadzhov
@ 2019-04-05 10:14 ` Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 4/6] kernel-shark: Add Numpy Interface for processing of tracing data Yordan Karadzhov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

The NumPy API is meant to operate on top of the C APIs of trace-cmd and
KernelShark and to provide only the minimum of basic functionality needed
to process the tracing data. The NumPy API itself is made of two layers:
a bottom one, written in C, and a top one, written in Cython (C-Python),
which implements the interface. This patch introduces the C layer.

Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
---
 kernel-shark/src/libkshark-py.c | 175 ++++++++++++++++++++++++++++++++
 1 file changed, 175 insertions(+)
 create mode 100644 kernel-shark/src/libkshark-py.c

diff --git a/kernel-shark/src/libkshark-py.c b/kernel-shark/src/libkshark-py.c
new file mode 100644
index 0000000..a1dd450
--- /dev/null
+++ b/kernel-shark/src/libkshark-py.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: LGPL-2.1
+
+/*
+ * Copyright (C) 2019 VMware Inc, Yordan Karadzhov <y.karadz@gmail.com>
+ */
+
+ /**
+ *  @file    libkshark-py.c
+ *  @brief   Python API for processing of FTRACE (trace-cmd) data.
+ */
+
+// KernelShark
+#include "libkshark.h"
+#include "libkshark-model.h"
+
+bool kspy_open(const char *fname)
+{
+	struct kshark_context *kshark_ctx = NULL;
+
+	if (!kshark_instance(&kshark_ctx))
+		return false;
+
+	return kshark_open(kshark_ctx, fname);
+}
+
+void kspy_close(void)
+{
+	struct kshark_context *kshark_ctx = NULL;
+
+	if (!kshark_instance(&kshark_ctx))
+		return;
+
+	kshark_close(kshark_ctx);
+	kshark_free(kshark_ctx);
+}
+
+static int compare(const void * a, const void * b)
+{
+  return ( *(int*)a - *(int*)b );
+}
+
+size_t kspy_get_tasks(int **pids, char ***names)
+{
+	struct kshark_context *kshark_ctx = NULL;
+	const char *comm;
+	ssize_t i, n;
+	int ret;
+
+	if (!kshark_instance(&kshark_ctx))
+		return 0;
+
+	n = kshark_get_task_pids(kshark_ctx, pids);
+	if (n <= 0)
+		return 0;
+
+	qsort(*pids, n, sizeof(**pids), compare);
+
+	*names = calloc(n, sizeof(char*));
+	if (!(*names))
+		goto fail;
+
+	for (i = 0; i < n; ++i) {
+		comm = tep_data_comm_from_pid(kshark_ctx->pevent, (*pids)[i]);
+		ret = asprintf(&(*names)[i], "%s", comm);
+		if (ret < 1)
+			goto fail;
+	}
+
+	return n;
+
+  fail:
+	free(*pids);
+	free(*names);
+	return 0;
+}
+
+size_t kspy_trace2matrix(uint64_t **offset_array,
+			 uint8_t **cpu_array,
+			 uint64_t **ts_array,
+			 uint16_t **pid_array,
+			 int **event_array)
+{
+	struct kshark_context *kshark_ctx = NULL;
+	size_t total = 0;
+
+	if (!kshark_instance(&kshark_ctx))
+		return 0;
+
+	total = kshark_load_data_matrix(kshark_ctx, offset_array,
+					cpu_array,
+					ts_array,
+					pid_array,
+					event_array);
+
+	return total;
+}
+
+int kspy_get_event_id(const char *sys, const char *evt)
+{
+	struct kshark_context *kshark_ctx = NULL;
+	struct tep_event *event;
+
+	if (!kshark_instance(&kshark_ctx))
+		return -1;
+
+	event = tep_find_event_by_name(kshark_ctx->pevent, sys, evt);
+
+	return event ? event->id : -1;
+}
+
+uint64_t kspy_read_event_field(uint64_t offset, int id, const char *field)
+{
+	struct kshark_context *kshark_ctx = NULL;
+	struct tep_format_field *evt_field;
+	struct tep_record *record;
+	struct tep_event *event;
+	unsigned long long val;
+	int ret;
+
+	if (!kshark_instance(&kshark_ctx))
+		return 0;
+
+	event = tep_find_event(kshark_ctx->pevent, id);
+	if (!event)
+		return 0;
+
+	evt_field = tep_find_any_field(event, field);
+	if (!evt_field)
+		return 0;
+
+	record = tracecmd_read_at(kshark_ctx->handle, offset, NULL);
+	if (!record)
+		return 0;
+
+	ret = tep_read_number_field(evt_field, record->data, &val);
+	free_record(record);
+
+	if (ret != 0)
+		return 0;
+
+	return val;
+}
+
+void kspy_new_session_file(const char *data_file, const char *session_file)
+{
+	struct kshark_context *kshark_ctx = NULL;
+	struct kshark_trace_histo histo;
+	struct kshark_config_doc *session;
+	struct kshark_config_doc *filters;
+	struct kshark_config_doc *markers;
+	struct kshark_config_doc *model;
+	struct kshark_config_doc *file;
+
+	if (!kshark_instance(&kshark_ctx))
+		return;
+
+	session = kshark_config_new("kshark.config.session",
+				    KS_CONFIG_JSON);
+
+	file = kshark_export_trace_file(data_file, KS_CONFIG_JSON);
+	kshark_config_doc_add(session, "Data", file);
+
+	filters = kshark_export_all_filters(kshark_ctx, KS_CONFIG_JSON);
+	kshark_config_doc_add(session, "Filters", filters);
+
+	ksmodel_init(&histo);
+	model = kshark_export_model(&histo, KS_CONFIG_JSON);
+	kshark_config_doc_add(session, "Model", model);
+
+	markers = kshark_config_new("kshark.config.markers", KS_CONFIG_JSON);
+	kshark_config_doc_add(session, "Markers", markers);
+
+	kshark_save_config_file(session_file, session);
+	kshark_free_config_doc(session);
+}
-- 
2.19.1



* [RFC v2 4/6] kernel-shark: Add Numpy Interface for processing of tracing data
  2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
                   ` (2 preceding siblings ...)
  2019-04-05 10:14 ` [RFC v2 3/6] kernel-shark: Add the core components of the NumPy API Yordan Karadzhov
@ 2019-04-05 10:14 ` Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 5/6] kernel-shark: Add automatic building of the NumPy interface Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 6/6] kernel-shark: Add basic example demonstrating " Yordan Karadzhov
  5 siblings, 0 replies; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

This patch contains the Cython implementation of the Interface
together with a Python script used to build the corresponding
library (libkshark_wrapper.so).
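
The central trick of the wrapper is to expose a buffer allocated in C as
an array without copying it, while keeping the owner of the buffer alive
for as long as the array exists. The sketch below shows the same two
properties with standard-library types only. It is an analogy and not the
code of this patch: an array object plays the role of the C buffer and
its owner, and a memoryview plays the role of the NumPy array.

```python
from array import array

# Stand-in for a buffer of timestamps that was filled by the C layer.
owner = array('Q', [1000, 1005, 1010])

# A memoryview exposes the same memory; no element is copied.
view = memoryview(owner)

# A write through the owner is visible through the view ...
owner[0] = 999
print(view[0])

# ... and a write through the view is visible through the owner.
view[2] = 2000
print(owner[2])

# The view keeps a reference to its owner (compare 'ndarray.base'),
# so the buffer stays alive for as long as the view is in use.
print(view.obj is owner)
```

In the patch the corresponding roles are played by KsDataWrapper, which
frees the C buffer in __dealloc__(), and by the array created from it
with copy=False.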

Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
---
 kernel-shark/build/py/libkshark_wrapper.pyx | 302 ++++++++++++++++++++
 kernel-shark/build/py/np_setup.py           | 101 +++++++
 2 files changed, 403 insertions(+)
 create mode 100644 kernel-shark/build/py/libkshark_wrapper.pyx
 create mode 100644 kernel-shark/build/py/np_setup.py

diff --git a/kernel-shark/build/py/libkshark_wrapper.pyx b/kernel-shark/build/py/libkshark_wrapper.pyx
new file mode 100644
index 0000000..71d4bd7
--- /dev/null
+++ b/kernel-shark/build/py/libkshark_wrapper.pyx
@@ -0,0 +1,302 @@
+"""
+SPDX-License-Identifier: GPL-2.0
+
+Copyright (C) 2017 VMware Inc, Yordan Karadzhov <ykaradzhov@vmware.com>
+"""
+
+import ctypes
+
+# Import the Python-level symbols of numpy
+import numpy as np
+# Import the C-level symbols of numpy
+cimport numpy as np
+
+import json
+
+from libcpp cimport bool
+
+from libc.stdlib cimport free
+
+from cpython cimport PyObject, Py_INCREF
+
+
+cdef extern from 'stdint.h':
+    ctypedef unsigned char uint8_t
+    ctypedef unsigned short uint16_t
+    ctypedef unsigned long long uint64_t
+
+cdef extern from 'numpy/ndarraytypes.h':
+    int NPY_ARRAY_CARRAY
+
+# Declare all C functions we are going to call
+cdef extern from '../../src/libkshark-py.c':
+    bool kspy_open(const char *fname)
+
+cdef extern from '../../src/libkshark-py.c':
+    void kspy_close()
+
+cdef extern from '../../src/libkshark-py.c':
+    size_t kspy_trace2matrix(uint64_t **offset_array,
+                             uint8_t **cpu_array,
+                             uint64_t **ts_array,
+                             uint16_t **pid_array,
+                             int **event_array)
+
+cdef extern from '../../src/libkshark-py.c':
+    int kspy_get_event_id(const char *sys, const char *evt)
+
+cdef extern from '../../src/libkshark-py.c':
+    uint64_t kspy_read_event_field(uint64_t offset,
+                                   int event_id,
+                                   const char *field)
+
+cdef extern from '../../src/libkshark-py.c':
+    size_t kspy_get_tasks(int **pids, char ***names)
+
+cdef extern from '../../src/libkshark.h':
+    int KS_EVENT_OVERFLOW
+
+cdef extern from '../../src/libkshark-py.c':
+    void kspy_new_session_file(const char *data_file,
+                               const char *session_file)
+
+EVENT_OVERFLOW = KS_EVENT_OVERFLOW
+
+# Numpy must be initialized!!!
+np.import_array()
+
+
+cdef class KsDataWrapper:
+    cdef int item_size
+    cdef int data_size
+    cdef int data_type
+    cdef void* data_ptr
+
+    cdef init(self,
+              int data_type,
+              int data_size,
+              int item_size,
+              void* data_ptr):
+        """ This initialization cannot be done in the constructor because
+            we use C-level arguments.
+        """
+        self.item_size = item_size
+        self.data_size = data_size
+        self.data_type = data_type
+        self.data_ptr = data_ptr
+
+    def __array__(self):
+        """ Here we use the __array__ method, that is called when numpy
+            tries to get an array from the object.
+        """
+        cdef np.npy_intp shape[1]
+        shape[0] = <np.npy_intp> self.data_size
+
+        ndarray = np.PyArray_New(np.ndarray,
+                                 1, shape,
+                                 self.data_type,
+                                 NULL,
+                                 self.data_ptr,
+                                 self.item_size,
+                                 NPY_ARRAY_CARRAY,
+                                 <object>NULL)
+
+        return ndarray
+
+    def __dealloc__(self):
+        """ Free the data. This is called by Python when all the references to
+            the object are gone.
+        """
+        free(<void*>self.data_ptr)
+
+
+def c_str2py(char *c_str):
+    """ String conversion C -> Python
+    """
+    return ctypes.c_char_p(c_str).value.decode('utf-8')
+
+
+def py_str2c(py_str):
+    """ String conversion Python -> C
+    """
+    return py_str.encode('utf-8')
+
+
+def open_file(fname):
+    """ Open a tracing data file.
+    """
+    return kspy_open(py_str2c(fname))
+
+
+def close():
+    """ Close the session file.
+    """
+    kspy_close()
+
+
+def read_event_field(offset, event_id, field):
+    """ Read the value of a specific field of the trace event.
+    """
+    cdef uint64_t v
+
+    v = kspy_read_event_field(offset, event_id, py_str2c(field))
+    return v
+
+
+def event_id(system, event):
+    """ Get the unique Id of the event
+    """
+    return kspy_get_event_id(py_str2c(system), py_str2c(event))
+
+
+def get_tasks():
+    """ Get a dictionary of all task's PIDs
+    """
+    cdef int *pids
+    cdef char **names
+    cdef int size = kspy_get_tasks(&pids, &names)
+
+    task_dict = {}
+
+    for i in range(0, size):
+        task_dict.update({c_str2py(names[i]): pids[i]})
+
+    return task_dict
+
+
+def load_data(ofst_data=True, cpu_data=True,
+              ts_data=True, pid_data=True,
+              evt_data=True):
+    """ Python binding of the 'kshark_load_data_matrix' function that does not
+        copy the data. The input parameters can be used to avoid loading the
+        data from the unnecessary fields.
+    """
+    cdef uint64_t *ofst_c
+    cdef uint8_t *cpu_c
+    cdef uint64_t *ts_c
+    cdef uint16_t *pid_c
+    cdef int *evt_c
+
+    cdef np.ndarray ofst
+    cdef np.ndarray cpu
+    cdef np.ndarray ts
+    cdef np.ndarray pid
+    cdef np.ndarray evt
+
+    if not ofst_data:
+        ofst_c = NULL
+
+    if not cpu_data:
+        cpu_c = NULL
+
+    if not ts_data:
+        ts_c = NULL
+
+    if not pid_data:
+        pid_c = NULL
+
+    if not evt_data:
+        evt_c = NULL
+
+    data_dict = {}
+
+    # Call the C function
+    size = kspy_trace2matrix(&ofst_c, &cpu_c, &ts_c, &pid_c, &evt_c)
+
+    if ofst_data:
+        array_wrapper_ofst = KsDataWrapper()
+        array_wrapper_ofst.init(data_type=np.NPY_UINT64,
+                                item_size=0,
+                                data_size=size,
+                                data_ptr=<void *> ofst_c)
+
+
+        ofst = np.array(array_wrapper_ofst, copy=False)
+        ofst.base = <PyObject *> array_wrapper_ofst
+        data_dict.update({'offset': ofst})
+        Py_INCREF(array_wrapper_ofst)
+
+    if cpu_data:
+        array_wrapper_cpu = KsDataWrapper()
+        array_wrapper_cpu.init(data_type=np.NPY_UINT8,
+                               data_size=size,
+                               item_size=0,
+                               data_ptr=<void *> cpu_c)
+
+        cpu = np.array(array_wrapper_cpu, copy=False)
+        cpu.base = <PyObject *> array_wrapper_cpu
+        data_dict.update({'cpu': cpu})
+        Py_INCREF(array_wrapper_cpu)
+
+    if ts_data:
+        array_wrapper_ts = KsDataWrapper()
+        array_wrapper_ts.init(data_type=np.NPY_UINT64,
+                              data_size=size,
+                              item_size=0,
+                              data_ptr=<void *> ts_c)
+
+        ts = np.array(array_wrapper_ts, copy=False)
+        ts.base = <PyObject *> array_wrapper_ts
+        data_dict.update({'time': ts})
+        Py_INCREF(array_wrapper_ts)
+
+    if pid_data:
+        array_wrapper_pid = KsDataWrapper()
+        array_wrapper_pid.init(data_type=np.NPY_UINT16,
+                               data_size=size,
+                               item_size=0,
+                               data_ptr=<void *>pid_c)
+
+        pid = np.array(array_wrapper_pid, copy=False)
+        pid.base = <PyObject *> array_wrapper_pid
+        data_dict.update({'pid': pid})
+        Py_INCREF(array_wrapper_pid)
+
+    if evt_data:
+        array_wrapper_evt = KsDataWrapper()
+        array_wrapper_evt.init(data_type=np.NPY_INT,
+                               data_size=size,
+                               item_size=0,
+                               data_ptr=<void *>evt_c)
+
+        evt = np.array(array_wrapper_evt, copy=False)
+        evt.base = <PyObject *> array_wrapper_evt
+        data_dict.update({'event': evt})
+        Py_INCREF(array_wrapper_evt)
+
+    return data_dict
+
+
+def save_session(session, s):
+    """ Save a KernelShark session description to a JSON file.
+    """
+    s.seek(0)
+    json.dump(session, s, indent=4)
+    s.truncate()
+
+
+def new_session(fname, sname):
+    """ Generate and save a default KernelShark session description
+        file (JSON).
+    """
+    kspy_new_session_file(py_str2c(fname), py_str2c(sname))
+
+    with open(sname, 'r+') as s:
+        session = json.load(s)
+
+        session['Filters']['filter mask'] = 7
+        session['CPUPlots'] = []
+        session['TaskPlots'] = []
+        session['Splitter'] = [1, 1]
+        session['MainWindow'] = [1200, 800]
+        session['ViewTop'] = 0
+        session['ColorScheme'] = 0.75
+        session['Model']['bins'] = 1000
+
+        session['Markers']['markA'] = {}
+        session['Markers']['markA']['isSet'] = False
+        session['Markers']['markB'] = {}
+        session['Markers']['markB']['isSet'] = False
+        session['Markers']['Active'] = 'A'
+
+        save_session(session, s)
diff --git a/kernel-shark/build/py/np_setup.py b/kernel-shark/build/py/np_setup.py
new file mode 100644
index 0000000..0f0a26c
--- /dev/null
+++ b/kernel-shark/build/py/np_setup.py
@@ -0,0 +1,101 @@
+#!/usr/bin/env python3
+
+"""
+SPDX-License-Identifier: LGPL-2.1
+
+Copyright (C) 2017 VMware Inc, Yordan Karadzhov <ykaradzhov@vmware.com>
+"""
+
+import sys
+import getopt
+import numpy
+from Cython.Distutils import build_ext
+
+
+def libs(argv):
+    kslibdir = ''
+    evlibdir = ''
+    evincdir = ''
+    trlibdir = ''
+    trincdir = ''
+    jsnincdir = ''
+
+    try:
+        opts, args = getopt.getopt(
+            argv, 'l:t:i:e:n:j:', ['kslibdir=',
+                                   'trlibdir=',
+                                   'trincdir=',
+                                   'evlibdir=',
+                                   'evincdir=',
+                                   'jsnincdir=']
+        )
+
+    except getopt.GetoptError:
+        sys.exit(2)
+
+    for opt, arg in opts:
+        if opt in ('-l', '--kslibdir'):
+            kslibdir = arg
+        elif opt in ('-t', '--trlibdir'):
+            trlibdir = arg
+        elif opt in ('-i', '--trincdir'):
+            trincdir = arg
+        elif opt in ('-e', '--evlibdir'):
+            evlibdir = arg
+        elif opt in ('-n', '--evincdir'):
+            evincdir = arg
+        elif opt in ('-j', '--jsnincdir'):
+            jsnincdir = arg
+
+    cmd1 = 1
+    for i in range(len(sys.argv)):
+        if sys.argv[i] == 'build_ext':
+            cmd1 = i
+
+    sys.argv = sys.argv[:1] + sys.argv[cmd1:]
+
+    return kslibdir, evlibdir, evincdir, trlibdir, trincdir, jsnincdir
+
+
+def configuration(parent_package='',
+                  top_path=None,
+                  libs=['kshark', 'tracecmd', 'traceevent', 'json-c'],
+                  libdirs=['.'],
+                  incdirs=['.']):
+    """ Function used to build our configuration.
+    """
+    from numpy.distutils.misc_util import Configuration
+
+    # The configuration object that holds information on all the files
+    # to be built.
+    config = Configuration('', parent_package, top_path)
+    config.add_extension('ksharkpy',
+                         sources=['libkshark_wrapper.pyx'],
+                         libraries=libs,
+                         library_dirs=libdirs,
+                         depends=['../../src/libkshark.c', '../../src/libkshark-py.c'],
+                         include_dirs=incdirs)
+
+    return config
+
+
+def main(argv):
+    # Retrieve third-party libraries
+    kslibdir, evlibdir, evincdir, trlibdir, trincdir, jsnincdir = libs(sys.argv[1:])
+
+    # Retrieve the parameters of our local configuration
+    params = configuration(top_path='',
+                           libdirs=[kslibdir, trlibdir, evlibdir],
+                           incdirs=[trincdir, evincdir, jsnincdir]).todict()
+
+    # Override the C-extension building so that it knows about '.pyx'
+    # Cython files.
+    params['cmdclass'] = dict(build_ext=build_ext)
+
+    # Call the actual building/packaging function (see distutils docs)
+    from numpy.distutils.core import setup
+    setup(**params)
+
+
+if __name__ == '__main__':
+    main(sys.argv[1:])
-- 
2.19.1



* [RFC v2 5/6] kernel-shark: Add automatic building of the NumPy interface
  2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
                   ` (3 preceding siblings ...)
  2019-04-05 10:14 ` [RFC v2 4/6] kernel-shark: Add Numpy Interface for processing of tracing data Yordan Karadzhov
@ 2019-04-05 10:14 ` Yordan Karadzhov
  2019-04-05 10:14 ` [RFC v2 6/6] kernel-shark: Add basic example demonstrating " Yordan Karadzhov
  5 siblings, 0 replies; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

CMake will call the Python script that builds the NumPy interface.
It will also take care of cleaning.

Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
---
 kernel-shark/build/py/pybuild.sh | 29 +++++++++++++++++++++++++++++
 kernel-shark/src/CMakeLists.txt  | 22 ++++++++++++++++++++++
 2 files changed, 51 insertions(+)
 create mode 100755 kernel-shark/build/py/pybuild.sh

diff --git a/kernel-shark/build/py/pybuild.sh b/kernel-shark/build/py/pybuild.sh
new file mode 100755
index 0000000..fc07711
--- /dev/null
+++ b/kernel-shark/build/py/pybuild.sh
@@ -0,0 +1,29 @@
+#!/bin/bash
+
+# SPDX-License-Identifier: GPL-2.0
+
+python3 np_setup.py --kslibdir=$1 \
+                    --trlibdir=$2 \
+                    --trincdir=$3 \
+                    --evlibdir=$4 \
+                    --evincdir=$5 \
+                    --jsnincdir=$6 \
+                    build_ext -i &> pybuild.log
+
+WC=$(grep 'error' pybuild.log | wc -l)
+if ((WC > 2)); then
+   cat pybuild.log
+fi
+
+if grep -q 'Error' pybuild.log; then
+   cat pybuild.log
+fi
+
+WC=$(grep 'warning' pybuild.log | wc -l)
+if ((WC > 3)); then
+   cat pybuild.log
+fi
+
+if grep -q 'usage' pybuild.log; then
+   cat pybuild.log
+fi
diff --git a/kernel-shark/src/CMakeLists.txt b/kernel-shark/src/CMakeLists.txt
index b9a05e1..ba97c42 100644
--- a/kernel-shark/src/CMakeLists.txt
+++ b/kernel-shark/src/CMakeLists.txt
@@ -45,6 +45,28 @@ if (PYTHONLIBS_FOUND AND CYTHON_FOUND AND NUMPY_FOUND)
 
     target_link_libraries(kshark-static ${LIBKSHARK_LINK_LIBS})
 
+
+    add_custom_target(kshark_wrapper ALL DEPENDS kshark-static libkshark-py.c
+                                     COMMENT "Generating libkshark_wrapper.c")
+
+    add_custom_command(TARGET kshark_wrapper
+                       PRE_BUILD
+                       COMMAND rm -rf build
+                       COMMAND ./pybuild.sh ${KS_DIR}/lib/
+                                            ${TRACECMD_LIBRARY_DIR}   ${TRACECMD_INCLUDE_DIR}
+                                            ${TRACEEVENT_LIBRARY_DIR} ${TRACEEVENT_INCLUDE_DIR}
+                                            ${JSONC_INCLUDE_DIR}
+                       COMMAND mv ksharkpy.cpython-*.so ksharkpy.so
+                       COMMAND cp ksharkpy.so  ${KS_DIR}/bin
+                       WORKING_DIRECTORY ${CMAKE_BINARY_DIR}/py
+                       DEPENDS ${KS_DIR}/build/py/np_setup.py ${KS_DIR}/src/libkshark-py.c)
+
+    set_property(DIRECTORY APPEND PROPERTY ADDITIONAL_MAKE_CLEAN_FILES
+                                           "${KS_DIR}/bin/ksharkpy.so"
+                                           "${KS_DIR}/build/py/ksharkpy.so"
+                                           "${KS_DIR}/build/py/libkshark_wrapper.c"
+                                           "${KS_DIR}/build/py/build")
+
 endif (PYTHONLIBS_FOUND AND CYTHON_FOUND AND NUMPY_FOUND)
 
 if (Qt5Widgets_FOUND AND Qt5Network_FOUND)
-- 
2.19.1



* [RFC v2 6/6] kernel-shark: Add basic example demonstrating the NumPy interface
  2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
                   ` (4 preceding siblings ...)
  2019-04-05 10:14 ` [RFC v2 5/6] kernel-shark: Add automatic building of the NumPy interface Yordan Karadzhov
@ 2019-04-05 10:14 ` Yordan Karadzhov
  5 siblings, 0 replies; 9+ messages in thread
From: Yordan Karadzhov @ 2019-04-05 10:14 UTC (permalink / raw)
  To: rostedt; +Cc: linux-trace-devel

The example script plots the distribution (histogram) of the pulseaudio
wake-up times and finds the biggest latency. The script also generates
a KernelShark session descriptor file (JSON). The session descriptor file
can be used by the KernelShark GUI to open a session which will directly
visualize the largest wake-up latency.
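
For readers who want the idea without a trace file, here is a minimal
sketch of the latency pairing described above. It is not the code from
the patch: it uses a single forward pass over plain Python tuples instead
of the backward scan over the ksharkpy arrays, and the function name
wakeup_latencies(), the (time_ns, name, pid) tuple layout and the sample
trace are all assumptions made for illustration. Timestamps are assumed
to be in nanoseconds, as the division by 1000 in the script suggests.

```python
def wakeup_latencies(events, task_pid):
    """Pair each sched_waking of a task with its next sched_switch.

    events: list of (time_ns, name, pid) tuples sorted by time, where pid
    is the woken task for 'sched_waking' and the next task for
    'sched_switch' (hypothetical layout, not the ksharkpy one).
    Returns (latencies_us, max_pair), max_pair being the
    (waking_index, switch_index) of the largest latency, or None.
    """
    latencies = []
    max_pair = None
    max_latency = -1.0
    pending = None  # index of the most recent unmatched waking event

    for i, (time_ns, name, pid) in enumerate(events):
        if pid != task_pid:
            continue
        if name == 'sched_waking':
            pending = i
        elif name == 'sched_switch' and pending is not None:
            # Convert nanoseconds to microseconds.
            latency = (time_ns - events[pending][0]) / 1000.0
            latencies.append(latency)
            if latency > max_latency:
                max_latency = latency
                max_pair = (pending, i)
            pending = None

    return latencies, max_pair


# Synthetic trace (made-up data): task 42 is woken twice.
trace = [
    (1000, 'sched_waking', 42),
    (1500, 'sched_switch', 7),
    (4000, 'sched_switch', 42),
    (9000, 'sched_waking', 42),
    (11000, 'sched_switch', 42),
]
latencies, max_pair = wakeup_latencies(trace, task_pid=42)
```

The two indices in max_pair play the role of i_sw_max and i_ss_max in the
script: they are what would be written into the session file as the
positions of Marker A and Marker B.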

Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
---
 kernel-shark/bin/sched_wakeup.py | 106 +++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)
 create mode 100755 kernel-shark/bin/sched_wakeup.py

diff --git a/kernel-shark/bin/sched_wakeup.py b/kernel-shark/bin/sched_wakeup.py
new file mode 100755
index 0000000..8e2cfc1
--- /dev/null
+++ b/kernel-shark/bin/sched_wakeup.py
@@ -0,0 +1,106 @@
+#!/usr/bin/env python3
+
+""" The license to be determined !!!!
+"""
+
+import json
+import sys
+
+import matplotlib.pyplot as plt
+import scipy.stats as st
+import numpy as np
+
+import ksharkpy as ks
+
+fname = str(sys.argv[1])
+
+ks.open_file(fname)
+
+# We do not need the Process Ids of the records.
+# Do not load the "pid" data.
+data = ks.load_data(pid_data=False)
+
+tasks = ks.get_tasks()
+task_pid = tasks['pulseaudio']
+
+# Get the Event Ids of the sched_switch and sched_waking events.
+ss_eid = ks.event_id('sched', 'sched_switch')
+w_eid = ks.event_id('sched', 'sched_waking')
+
+# Get the size of the data.
+i = data['offset'].size
+
+dt = []
+delta_max = i_ss_max = i_sw_max = 0
+
+while i > 0:
+    i = i - 1
+    if data['event'][i] == ss_eid:
+        next_pid = ks.read_event_field(offset=data['offset'][i],
+                                       event_id=ss_eid,
+                                       field='next_pid')
+
+        if next_pid == task_pid:
+            time_ss = data['time'][i]
+            index_ss = i
+
+            while i > 0:
+                i = i - 1
+                if (data['event'][i] == w_eid):
+                    waking_pid = ks.read_event_field(offset=data['offset'][i],
+                                                     event_id=w_eid,
+                                                     field='pid')
+
+                    if waking_pid == task_pid:
+                        delta = (time_ss - data['time'][i]) / 1000.
+                        # print(delta)
+                        dt.append(delta)
+                        if delta > delta_max:
+                            print('lat. max: ', delta)
+                            i_ss_max = index_ss
+                            i_sw_max = i
+                            delta_max = delta
+
+                        break
+
+desc = st.describe(np.array(dt))
+print(desc)
+
+fig, ax = plt.subplots(nrows=1, ncols=1)
+fig.set_figheight(6)
+fig.set_figwidth(7)
+
+rect = fig.patch
+rect.set_facecolor('white')
+
+ax.set_xlabel(r'latency [$\mu$s]')
+plt.yscale('log')
+ax.hist(dt, bins=(200), histtype='step')
+plt.show()
+
+sname = 'sched.json'
+ks.new_session(fname, sname)
+
+with open(sname, 'r+') as s:
+    session = json.load(s)
+    session['TaskPlots'] = [task_pid]
+    session['CPUPlots'] = [
+        int(data['cpu'][i_sw_max]),
+        int(data['cpu'][i_ss_max])]
+
+    delta = data['time'][i_ss_max] - data['time'][i_sw_max]
+    tmin = int(data['time'][i_sw_max] - delta)
+    tmax = int(data['time'][i_ss_max] + delta)
+    session['Model']['range'] = [tmin, tmax]
+
+    session['Markers']['markA']['isSet'] = True
+    session['Markers']['markA']['row'] = int(i_sw_max)
+
+    session['Markers']['markB']['isSet'] = True
+    session['Markers']['markB']['row'] = int(i_ss_max)
+
+    session['ViewTop'] = int(i_sw_max) - 5
+
+    ks.save_session(session, s)
+
+ks.close()
-- 
2.19.1



* Re: [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPu interface
  2019-04-05 10:14 ` [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPu interface Yordan Karadzhov
@ 2019-04-08 15:07   ` Slavomir Kaslev
  2019-04-09 12:01     ` Yordan Karadzhov (VMware)
  0 siblings, 1 reply; 9+ messages in thread
From: Slavomir Kaslev @ 2019-04-08 15:07 UTC (permalink / raw)
  To: Yordan Karadzhov; +Cc: rostedt, linux-trace-devel

On Fri, Apr 05, 2019 at 01:14:06PM +0300, Yordan Karadzhov wrote:
> The new function loads the content of the trace data file into a
> table / matrix, made of columns / arrays of data having various integer
> types. Later those arrays will be wrapped as NumPy arrays.
> 
> Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
> ---
>  kernel-shark/src/libkshark.c | 136 +++++++++++++++++++++++++++++++++++
>  kernel-shark/src/libkshark.h |   7 ++
>  2 files changed, 143 insertions(+)
> 
> diff --git a/kernel-shark/src/libkshark.c b/kernel-shark/src/libkshark.c
> index a886f80..98086a9 100644
> --- a/kernel-shark/src/libkshark.c
> +++ b/kernel-shark/src/libkshark.c
> @@ -959,6 +959,142 @@ ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
>  	return -ENOMEM;
>  }
>  
> +static bool data_matrix_alloc(size_t n_rows, uint64_t **offset_array,
> +					     uint8_t **cpu_array,
> +					     uint64_t **ts_array,
> +					     uint16_t **pid_array,
> +					     int **event_array)
> +{
> +	if (offset_array) {
> +		*offset_array = calloc(n_rows, sizeof(**offset_array));
> +		if (!offset_array)

This should be

		if (!*offset_array)

and ditto for the rest.

-- Slavi

> +			goto free_all;
> +	}
> +
> +	if (cpu_array) {
> +		*cpu_array = calloc(n_rows, sizeof(**cpu_array));
> +		if (!cpu_array)
> +			goto free_all;
> +	}
> +
> +	if (ts_array) {
> +		*ts_array = calloc(n_rows, sizeof(**ts_array));
> +		if (!ts_array)
> +			goto free_all;
> +	}
> +
> +	if (pid_array) {
> +		*pid_array = calloc(n_rows, sizeof(**pid_array));
> +		if (!pid_array)
> +			goto free_all;
> +	}
> +
> +	if (event_array) {
> +		*event_array = calloc(n_rows, sizeof(**event_array));
> +		if (!event_array)
> +			goto free_all;
> +	}
> +
> +	return true;
> +
> + free_all:
> +	fprintf(stderr, "Failed to allocate memory during data loading.\n");
> +
> +	if (offset_array)
> +		free(*offset_array);
> +
> +	if (cpu_array)
> +		free(*cpu_array);
> +
> +	if (ts_array)
> +		free(*ts_array);
> +
> +	if (pid_array)
> +		free(*pid_array);
> +
> +	if (event_array)
> +		free(*event_array);
> +
> +	return false;
> +}
> +
> +/**
> + * @brief Load the content of the trace data file into a table / matrix made
> + *	  of columns / arrays of data. The user is responsible for freeing the
> + *	  elements of the outputted array
> + *
> + * @param kshark_ctx: Input location for the session context pointer.
> + * @param offset_array: Output location for the array of record offsets.
> + * @param cpu_array: Output location for the array of CPU Ids.
> + * @param ts_array: Output location for the array of timestamps.
> + * @param pid_array: Output location for the array of Process Ids.
> + * @param event_array: Output location for the array of Event Ids.
> + *
> + * @returns The size of the outputted arrays in the case of success, or a
> + *	    negative error code on failure.
> + */
> +size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
> +			       uint64_t **offset_array,
> +			       uint8_t **cpu_array,
> +			       uint64_t **ts_array,
> +			       uint16_t **pid_array,
> +			       int **event_array)
> +{
> +	enum rec_type type = REC_ENTRY;
> +	struct rec_list **rec_list;
> +	size_t count, total = 0;
> +	bool status;
> +	int n_cpus;
> +
> +	total = get_records(kshark_ctx, &rec_list, type);
> +	if (total < 0)
> +		goto fail;
> +
> +	status = data_matrix_alloc(total, offset_array,
> +					  cpu_array,
> +					  ts_array,
> +					  pid_array,
> +					  event_array);
> +	if (!status)
> +		goto fail;
> +
> +	n_cpus = tracecmd_cpus(kshark_ctx->handle);
> +
> +	for (count = 0; count < total; count++) {
> +		int next_cpu;
> +
> +		next_cpu = pick_next_cpu(rec_list, n_cpus, type);
> +		if (next_cpu >= 0) {
> +			struct kshark_entry *e = &rec_list[next_cpu]->entry;
> +
> +			if (offset_array)
> +				(*offset_array)[count] = e->offset;
> +
> +			if (cpu_array)
> +				(*cpu_array)[count] = e->cpu;
> +
> +			if (ts_array)
> +				(*ts_array)[count] = e->ts;
> +
> +			if (pid_array)
> +				(*pid_array)[count] = e->pid;
> +
> +			if (event_array)
> +				(*event_array)[count] = e->event_id;
> +
> +			rec_list[next_cpu] = rec_list[next_cpu]->next;
> +			free(e);
> +		}
> +	}
> +
> +	free_rec_list(rec_list, n_cpus, type);
> +	return total;
> +
> + fail:
> +	fprintf(stderr, "Failed to allocate memory during data loading.\n");
> +	return -ENOMEM;
> +}
> +
>  static const char *kshark_get_latency(struct tep_handle *pe,
>  				      struct tep_record *record)
>  {
> diff --git a/kernel-shark/src/libkshark.h b/kernel-shark/src/libkshark.h
> index c218b61..92ade41 100644
> --- a/kernel-shark/src/libkshark.h
> +++ b/kernel-shark/src/libkshark.h
> @@ -149,6 +149,13 @@ ssize_t kshark_load_data_entries(struct kshark_context *kshark_ctx,
>  ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
>  				 struct tep_record ***data_rows);
>  
> +size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
> +			       uint64_t **offset_array,
> +			       uint8_t **cpu_array,
> +			       uint64_t **ts_array,
> +			       uint16_t **pid_array,
> +			       int **event_array);
> +
>  ssize_t kshark_get_task_pids(struct kshark_context *kshark_ctx, int **pids);
>  
>  void kshark_close(struct kshark_context *kshark_ctx);
> -- 
> 2.19.1
> 


* Re: [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPu interface
  2019-04-08 15:07   ` Slavomir Kaslev
@ 2019-04-09 12:01     ` Yordan Karadzhov (VMware)
  0 siblings, 0 replies; 9+ messages in thread
From: Yordan Karadzhov (VMware) @ 2019-04-09 12:01 UTC (permalink / raw)
  To: Slavomir Kaslev, Yordan Karadzhov; +Cc: rostedt, linux-trace-devel



On 8.04.19 г. 18:07 ч., Slavomir Kaslev wrote:
> On Fri, Apr 05, 2019 at 01:14:06PM +0300, Yordan Karadzhov wrote:
>> The new function loads the content of the trace data file into a
>> table / matrix, made of columns / arrays of data having various integer
>> types. Later those arrays will be wrapped as NumPy arrays.
>>
>> Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com>
>> ---
>>   kernel-shark/src/libkshark.c | 136 +++++++++++++++++++++++++++++++++++
>>   kernel-shark/src/libkshark.h |   7 ++
>>   2 files changed, 143 insertions(+)
>>
>> diff --git a/kernel-shark/src/libkshark.c b/kernel-shark/src/libkshark.c
>> index a886f80..98086a9 100644
>> --- a/kernel-shark/src/libkshark.c
>> +++ b/kernel-shark/src/libkshark.c
>> @@ -959,6 +959,142 @@ ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
>>   	return -ENOMEM;
>>   }
>>   
>> +static bool data_matrix_alloc(size_t n_rows, uint64_t **offset_array,
>> +					     uint8_t **cpu_array,
>> +					     uint64_t **ts_array,
>> +					     uint16_t **pid_array,
>> +					     int **event_array)
>> +{
>> +	if (offset_array) {
>> +		*offset_array = calloc(n_rows, sizeof(**offset_array));
>> +		if (!offset_array)
> 
> This should be
> 
> 		if (!*offset_array)
> 
> and ditto for the rest.
> 

You are right. The whole function is a mess. I will try to fix it in the
following version.

Thanks!
Yordan



> -- Slavi
> 
>> +			goto free_all;
>> +	}
>> +
>> +	if (cpu_array) {
>> +		*cpu_array = calloc(n_rows, sizeof(**cpu_array));
>> +		if (!cpu_array)
>> +			goto free_all;
>> +	}
>> +
>> +	if (ts_array) {
>> +		*ts_array = calloc(n_rows, sizeof(**ts_array));
>> +		if (!ts_array)
>> +			goto free_all;
>> +	}
>> +
>> +	if (pid_array) {
>> +		*pid_array = calloc(n_rows, sizeof(**pid_array));
>> +		if (!pid_array)
>> +			goto free_all;
>> +	}
>> +
>> +	if (event_array) {
>> +		*event_array = calloc(n_rows, sizeof(**event_array));
>> +		if (!event_array)
>> +			goto free_all;
>> +	}
>> +
>> +	return true;
>> +
>> + free_all:
>> +	fprintf(stderr, "Failed to allocate memory during data loading.\n");
>> +
>> +	if (offset_array)
>> +		free(*offset_array);
>> +
>> +	if (cpu_array)
>> +		free(*cpu_array);
>> +
>> +	if (ts_array)
>> +		free(*ts_array);
>> +
>> +	if (pid_array)
>> +		free(*pid_array);
>> +
>> +	if (event_array)
>> +		free(*event_array);
>> +
>> +	return false;
>> +}
>> +
>> +/**
>> + * @brief Load the content of the trace data file into a table / matrix made
>> + *	  of columns / arrays of data. The user is responsible for freeing the
>> + *	  elements of the outputted array
>> + *
>> + * @param kshark_ctx: Input location for the session context pointer.
>> + * @param offset_array: Output location for the array of record offsets.
>> + * @param cpu_array: Output location for the array of CPU Ids.
>> + * @param ts_array: Output location for the array of timestamps.
>> + * @param pid_array: Output location for the array of Process Ids.
>> + * @param event_array: Output location for the array of Event Ids.
>> + *
>> + * @returns The size of the outputted arrays in the case of success, or a
>> + *	    negative error code on failure.
>> + */
>> +size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
>> +			       uint64_t **offset_array,
>> +			       uint8_t **cpu_array,
>> +			       uint64_t **ts_array,
>> +			       uint16_t **pid_array,
>> +			       int **event_array)
>> +{
>> +	enum rec_type type = REC_ENTRY;
>> +	struct rec_list **rec_list;
>> +	size_t count, total = 0;
>> +	bool status;
>> +	int n_cpus;
>> +
>> +	total = get_records(kshark_ctx, &rec_list, type);
>> +	if (total < 0)
>> +		goto fail;
>> +
>> +	status = data_matrix_alloc(total, offset_array,
>> +					  cpu_array,
>> +					  ts_array,
>> +					  pid_array,
>> +					  event_array);
>> +	if (!status)
>> +		goto fail;
>> +
>> +	n_cpus = tracecmd_cpus(kshark_ctx->handle);
>> +
>> +	for (count = 0; count < total; count++) {
>> +		int next_cpu;
>> +
>> +		next_cpu = pick_next_cpu(rec_list, n_cpus, type);
>> +		if (next_cpu >= 0) {
>> +			struct kshark_entry *e = &rec_list[next_cpu]->entry;
>> +
>> +			if (offset_array)
>> +				(*offset_array)[count] = e->offset;
>> +
>> +			if (cpu_array)
>> +				(*cpu_array)[count] = e->cpu;
>> +
>> +			if (ts_array)
>> +				(*ts_array)[count] = e->ts;
>> +
>> +			if (pid_array)
>> +				(*pid_array)[count] = e->pid;
>> +
>> +			if (event_array)
>> +				(*event_array)[count] = e->event_id;
>> +
>> +			rec_list[next_cpu] = rec_list[next_cpu]->next;
>> +			free(e);
>> +		}
>> +	}
>> +
>> +	free_rec_list(rec_list, n_cpus, type);
>> +	return total;
>> +
>> + fail:
>> +	fprintf(stderr, "Failed to allocate memory during data loading.\n");
>> +	return -ENOMEM;
>> +}
>> +
>>   static const char *kshark_get_latency(struct tep_handle *pe,
>>   				      struct tep_record *record)
>>   {
>> diff --git a/kernel-shark/src/libkshark.h b/kernel-shark/src/libkshark.h
>> index c218b61..92ade41 100644
>> --- a/kernel-shark/src/libkshark.h
>> +++ b/kernel-shark/src/libkshark.h
>> @@ -149,6 +149,13 @@ ssize_t kshark_load_data_entries(struct kshark_context *kshark_ctx,
>>   ssize_t kshark_load_data_records(struct kshark_context *kshark_ctx,
>>   				 struct tep_record ***data_rows);
>>   
>> +size_t kshark_load_data_matrix(struct kshark_context *kshark_ctx,
>> +			       uint64_t **offset_array,
>> +			       uint8_t **cpu_array,
>> +			       uint64_t **ts_array,
>> +			       uint16_t **pid_array,
>> +			       int **event_array);
>> +
>>   ssize_t kshark_get_task_pids(struct kshark_context *kshark_ctx, int **pids);
>>   
>>   void kshark_close(struct kshark_context *kshark_ctx);
>> -- 
>> 2.19.1
>>


end of thread, other threads:[~2019-04-09 12:02 UTC | newest]

Thread overview: 9+ messages
2019-04-05 10:14 [RFC v2 0/6] NumPy Interface for KernelShark Yordan Karadzhov
2019-04-05 10:14 ` [RFC v2 1/6] kernel-shark: Add new dataloading method to be used by the NumPu interface Yordan Karadzhov
2019-04-08 15:07   ` Slavomir Kaslev
2019-04-09 12:01     ` Yordan Karadzhov (VMware)
2019-04-05 10:14 ` [RFC v2 2/6] kernel-shark: Prepare for building the NumPy interface Yordan Karadzhov
2019-04-05 10:14 ` [RFC v2 3/6] kernel-shark: Add the core components of the NumPy API Yordan Karadzhov
2019-04-05 10:14 ` [RFC v2 4/6] kernel-shark: Add Numpy Interface for processing of tracing data Yordan Karadzhov
2019-04-05 10:14 ` [RFC v2 5/6] kernel-shark: Add automatic building of the NumPy interface Yordan Karadzhov
2019-04-05 10:14 ` [RFC v2 6/6] kernel-shark: Add basic example demonstrating " Yordan Karadzhov
