All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
To: linux-trace-devel@vger.kernel.org
Cc: rostedt@goodmis.org, Douglas.Raillard@arm.com,
	Valentin.Schneider@arm.com, nd@arm.com,
	"Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
Subject: [PATCH v2 03/12] trace-cruncher: Refactor NumPy based data wrapper
Date: Tue,  7 Jan 2020 19:03:03 +0200	[thread overview]
Message-ID: <20200107170312.27116-4-y.karadz@gmail.com> (raw)
In-Reply-To: <20200107170312.27116-1-y.karadz@gmail.com>

The data wrapper is the only thing that remains being built with Cython.
It is now a subpackage called "tracecruncher.datawrapper".

Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
---
 setup.py            |  12 ++-
 src/datawrapper.pyx | 201 ++++++++++++++++++++++++++++++++++++++++++++
 src/trace2matrix.c  |  29 +++++++
 3 files changed, 241 insertions(+), 1 deletion(-)
 create mode 100644 src/datawrapper.pyx
 create mode 100644 src/trace2matrix.c

diff --git a/setup.py b/setup.py
index 62912e2..526e1e7 100644
--- a/setup.py
+++ b/setup.py
@@ -9,12 +9,22 @@ Copyright 2019 VMware Inc, Yordan Karadzhov (VMware) <y.karadz@gmail.com>
 from setuptools import setup, find_packages
 from distutils.core import Extension
 from Cython.Build import cythonize
+import numpy as np
 
 def main():
     kshark_path = '/usr/local/lib/kernelshark'
     traceevent_path = '/usr/local/lib/traceevent/'
     tracecmd_path = '/usr/local/lib/trace-cmd/'
 
+    cythonize('src/datawrapper.pyx')
+    module_data = Extension('tracecruncher.datawrapper',
+                            sources=['src/datawrapper.c'],
+                            include_dirs=[np.get_include()],
+                            library_dirs=[kshark_path, traceevent_path, tracecmd_path],
+                            runtime_library_dirs=[kshark_path, traceevent_path, tracecmd_path],
+                            libraries=['kshark', 'traceevent', 'tracecmd']
+                            )
+
     module_ks = Extension('tracecruncher.ksharkpy',
                           sources=['src/ksharkpy.c'],
                           library_dirs=[kshark_path],
@@ -41,7 +51,7 @@ def main():
           url='https://github.com/vmware/trace-cruncher',
           license='LGPL-2.1',
           packages=find_packages(),
-          ext_modules=[module_ks, module_ft],
+          ext_modules=[module_data, module_ks, module_ft],
           classifiers=[
               'Development Status :: 3 - Alpha',
               'Programming Language :: Python :: 3',
diff --git a/src/datawrapper.pyx b/src/datawrapper.pyx
new file mode 100644
index 0000000..070d4e4
--- /dev/null
+++ b/src/datawrapper.pyx
@@ -0,0 +1,201 @@
+"""
+SPDX-License-Identifier: LGPL-2.1
+
+Copyright 2019 VMware Inc, Yordan Karadzhov (VMware) <y.karadz@gmail.com>
+"""
+
+import ctypes
+
+# Import the Python-level symbols of numpy
+import numpy as np
+# Import the C-level symbols of numpy
+cimport numpy as np
+
+import json
+
+from libcpp cimport bool
+
+from libc.stdlib cimport free
+
+from cpython cimport PyObject, Py_INCREF
+
+from libc cimport stdint
+ctypedef stdint.int16_t int16_t
+ctypedef stdint.uint16_t uint16_t
+ctypedef stdint.int32_t int32_t
+ctypedef stdint.uint32_t uint32_t
+ctypedef stdint.int64_t int64_t
+ctypedef stdint.uint64_t uint64_t
+
+cdef extern from 'numpy/ndarraytypes.h':
+    int NPY_ARRAY_CARRAY
+    
+# Numpy must be initialized!!!
+np.import_array()
+
+cdef extern from 'trace2matrix.c':
+    ssize_t trace2matrix(uint64_t **offset_array,
+			 uint16_t **cpu_array,
+			 uint64_t **ts_array,
+			 uint16_t **pid_array,
+			 int **event_array)
+
+data_column_types = {
+    'cpu': np.NPY_UINT16,
+    'pid': np.NPY_UINT16,
+    'event': np.NPY_INT,
+    'offset': np.NPY_UINT64,
+    'time': np.NPY_UINT64
+    }
+
+cdef class KsDataWrapper:
+    cdef int item_size
+    cdef int data_size
+    cdef int data_type
+    cdef void* data_ptr
+
+    cdef init(self, int data_type,
+                    int data_size,
+                    int item_size,
+                    void* data_ptr):
+        """ This initialization cannot be done in the constructor because
+            we use C-level arguments.
+        """
+        self.item_size = item_size
+        self.data_size = data_size
+        self.data_type = data_type
+        self.data_ptr = data_ptr
+
+    def __array__(self):
+        """ Here we use the __array__ method, that is called when numpy
+            tries to get an array from the object.
+        """
+        cdef np.npy_intp shape[1]
+        shape[0] = <np.npy_intp> self.data_size
+
+        ndarray = np.PyArray_New(np.ndarray,
+                                 1, shape,
+                                 self.data_type,
+                                 NULL,
+                                 self.data_ptr,
+                                 self.item_size,
+                                 NPY_ARRAY_CARRAY,
+                                 <object>NULL)
+
+        return ndarray
+
+    def __dealloc__(self):
+        """ Free the data. This is called by Python when all the references to
+            the object are gone.
+        """
+        free(<void*>self.data_ptr)
+
+
+def load(ofst_data=True, cpu_data=True, ts_data=True,
+         pid_data=True, evt_data=True):
+    """ Python binding of the 'kshark_load_data_matrix' function that does not
+        copy the data. The input parameters can be used to avoid loading the
+        data from the unnecessary fields.
+    """
+    cdef uint64_t *ofst_c
+    cdef uint16_t *cpu_c
+    cdef uint64_t *ts_c
+    cdef uint16_t *pid_c
+    cdef int *evt_c
+
+    cdef np.ndarray ofst
+    cdef np.ndarray cpu
+    cdef np.ndarray ts
+    cdef np.ndarray pid
+    cdef np.ndarray evt
+
+    if not ofst_data:
+        ofst_c = NULL
+
+    if not cpu_data:
+        cpu_c = NULL
+
+    if not ts_data:
+        ts_c = NULL
+
+    if not pid_data:
+        pid_c = NULL
+
+    if not evt_data:
+        evt_c = NULL
+
+    data_dict = {}
+
+    cdef ssize_t size
+
+    size = trace2matrix(&ofst_c, &cpu_c, &ts_c, &pid_c, &evt_c)
+    if size <= 0:
+        raise Exception('No data has been loaded.')
+
+    if cpu_data:
+        column = 'cpu'
+        array_wrapper_cpu = KsDataWrapper()
+        array_wrapper_cpu.init(data_type=data_column_types[column],
+                               data_size=size,
+                               item_size=0,
+                               data_ptr=<void *> cpu_c)
+
+        cpu = np.array(array_wrapper_cpu, copy=False)
+        cpu.base = <PyObject *> array_wrapper_cpu
+        data_dict.update({column: cpu})
+        Py_INCREF(array_wrapper_cpu)
+
+    if pid_data:
+        column = 'pid'
+        array_wrapper_pid = KsDataWrapper()
+        array_wrapper_pid.init(data_type=data_column_types[column],
+                               data_size=size,
+                               item_size=0,
+                               data_ptr=<void *>pid_c)
+
+        pid = np.array(array_wrapper_pid, copy=False)
+        pid.base = <PyObject *> array_wrapper_pid
+        data_dict.update({column: pid})
+        Py_INCREF(array_wrapper_pid)
+
+    if evt_data:
+        column = 'event'
+        array_wrapper_evt = KsDataWrapper()
+        array_wrapper_evt.init(data_type=data_column_types[column],
+                               data_size=size,
+                               item_size=0,
+                               data_ptr=<void *>evt_c)
+
+        evt = np.array(array_wrapper_evt, copy=False)
+        evt.base = <PyObject *> array_wrapper_evt
+        data_dict.update({column: evt})
+        Py_INCREF(array_wrapper_evt)
+
+    if ofst_data:
+        column = 'offset'
+        array_wrapper_ofst = KsDataWrapper()
+        array_wrapper_ofst.init(data_type=data_column_types[column],
+                                data_size=size,
+                                item_size=0,
+                                data_ptr=<void *> ofst_c)
+
+
+        ofst = np.array(array_wrapper_ofst, copy=False)
+        ofst.base = <PyObject *> array_wrapper_ofst
+        data_dict.update({column: ofst})
+        Py_INCREF(array_wrapper_ofst)
+        
+    if ts_data:
+        column = 'time'
+        array_wrapper_ts = KsDataWrapper()
+        array_wrapper_ts.init(data_type=data_column_types[column],
+                              data_size=size,
+                              item_size=0,
+                              data_ptr=<void *> ts_c)
+
+        ts = np.array(array_wrapper_ts, copy=False)
+        ts.base = <PyObject *> array_wrapper_ts
+        data_dict.update({column: ts})
+        Py_INCREF(array_wrapper_ts)
+
+    return data_dict
diff --git a/src/trace2matrix.c b/src/trace2matrix.c
new file mode 100644
index 0000000..aaf8322
--- /dev/null
+++ b/src/trace2matrix.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: LGPL-2.1
+
+/*
+ * Copyright 2019 VMware Inc, Yordan Karadzhov <ykaradzhov@vmware.com>
+ */
+
+// KernelShark
+#include "kernelshark/libkshark.h"
+
+ssize_t trace2matrix(uint64_t **offset_array,
+		     uint16_t **cpu_array,
+		     uint64_t **ts_array,
+		     uint16_t **pid_array,
+		     int **event_array)
+{
+	struct kshark_context *kshark_ctx = NULL;
+	ssize_t total = 0;
+
+	if (!kshark_instance(&kshark_ctx))
+		return -1;
+
+	total = kshark_load_data_matrix(kshark_ctx, offset_array,
+						    cpu_array,
+						    ts_array,
+						    pid_array,
+						    event_array);
+
+	return total;
+}
-- 
2.20.1


  parent reply	other threads:[~2020-01-07 17:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-07 17:03 [PATCH v2 00/12] Build trace-cruncher as Python pakage Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 01/12] trace-cruncher: Refactor the part of the interface that relies on libkshark Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 02/12] trace-cruncher: Refactor the part of the interface that relies on libtraceevent Yordan Karadzhov (VMware)
2020-01-07 17:03 ` Yordan Karadzhov (VMware) [this message]
2020-01-07 17:03 ` [PATCH v2 04/12] trace-cruncher: Add "utils" Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 05/12] trace-cruncher: Adapt sched_wakeup.py to use the new module Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 06/12] trace-cruncher: Add Makefile Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 07/12] trace-cruncher: Adapt gpareto_fit.py to use the new module Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 08/12] trace-cruncher: Adapt page_faults.py " Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 09/12] trace-cruncher: Automate the third-party build Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 10/12] trace-cruncher: Update README.md Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 11/12] trace-cruncher: Remove all leftover files Yordan Karadzhov (VMware)
2020-01-07 17:03 ` [PATCH v2 12/12] trace-cruncher: Improve Makefile Provide more robust and better looking build process Yordan Karadzhov (VMware)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200107170312.27116-4-y.karadz@gmail.com \
    --to=y.karadz@gmail.com \
    --cc=Douglas.Raillard@arm.com \
    --cc=Valentin.Schneider@arm.com \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=nd@arm.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.