* fiologparser_hist.py script patch and enhancements?
       [not found] <CY1PR0401MB11163E6C999998E0A2F13F9E81F70@CY1PR0401MB1116.namprd04.prod.outlook.com>
@ 2018-02-12 20:36 ` Kris Davis
  2018-02-12 21:38   ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-12 20:36 UTC (permalink / raw)
  To: fio; +Cc: Jens Axboe


In light of some related commits, I am reposting my enhancements to fiologparser_hist.py, and a suggested addition of fiologparser_hist_nw.py.
I've included a patch at the bottom.

Reasons for the changes:

1) The fiologparser_hist script didn't support the new nanosecond bin values.  I changed it to assume nanosecond histogram bins and added a new "--usbin" option that lets the user override this, so the same script can still process histogram logs from older fio versions.

2) The script appeared hardcoded to return only the 50% (median), 90%, 95%, and 99% values (along with min and max).
I added a "--percentiles" option to allow requesting additional values; 'median' is always printed, even if a duplicate 50% column is requested, for backward compatibility (see the first sketch below).

3) A recent commit made some changes to support python3.
I added a check to make sure the Python version is at least 2.7, and changed the shebang to call out "python" rather than "python2.7".

4) The process can be slow for large log files or when combining many of them.  I have some automation which generically processes many log files, and found I cut the processing time in half by loading the script as a module rather than invoking it as a command.  So I changed it so it can be loaded as a module and main() called directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error, rather than exit, when called as a module (see the second sketch below).
Someone might know of a better, more conventional Pythonic design pattern, but it works.

5) The script appears to assume that the log is never missing samples.  That is, it weights samples across the requested intervals, I think with the assumption that the log samples are at longer intervals than, or at least the same interval length as, the "--interval" value.  If the workload actually contains "thinktime" periods (which produce no sample when there is no data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.

In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked the script into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has the added advantage of much quicker performance: for example, fiologparser_hist took about half an hour to combine about 350 logs, while fiologparser_hist_nw took 45 seconds, which is much better for my automation.

Of course, percentiles with a larger number of nines will have additional inaccuracies when there are not enough operations in a sample period, but that is a case of user beware.
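
To make the --percentiles handling concrete, here is a rough standalone sketch of what the patched script does when it builds the output header.  The patch itself stores percs/columns in module globals inside gen_output_columns(); this return-based copy is only for illustration:

    import re

    def gen_output_columns(percentiles):
        # accept comma- or colon-separated values, e.g. "90,95,99" or "99:99.9:99.99"
        strpercs = re.split('[,:]', percentiles)
        percs = [50.0]                      # the 'median' column is always printed
        percs.extend(float(p) for p in strpercs)
        columns = ["end-time", "samples", "min", "avg", "median"]
        columns.extend(p + '%' for p in strpercs)
        columns.append("max")
        return percs, columns

    percs, columns = gen_output_columns("99:99.9:99.99")
    print(percs)    # [50.0, 99.0, 99.9, 99.99]
    print(columns)  # ['end-time', 'samples', 'min', 'avg', 'median', '99%', '99.9%', '99.99%', 'max']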

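And for item 4, this is roughly how my automation drives the script as a module instead of shelling out.  The file names here are made up, the Namespace fields just mirror the script's argparse options, and tools/hist is assumed to be on sys.path, so treat it as a sketch rather than the exact code I run:

    import argparse
    import fiologparser_hist as flp   # assumes tools/hist is on sys.path

    args = argparse.Namespace(
        FILE=['job_clat_hist.1.log', 'job_clat_hist.2.log'],  # hypothetical log file names
        interval=None,          # fall back to the job file value or 1000 ms
        divisor=1,
        decimals=3,
        warn=False,
        buff_size=10000,
        max_latency=20,
        group_nr=29,
        job_file=None,
        percentiles="90,95,99,99.9",
        usbin=False,            # logs came from fio >= 2.99, so bins are in ns
    )

    try:
        flp.main(args)
    except RuntimeError as e:
        # with the patch, guess_max_from_bins() raises instead of calling exit()
        # when the script is loaded as a module
        print("histogram parsing failed: %s" % e)
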
Thanks

Kris



diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
index 62a4eb4..c77bb11 100755
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python2.7
+#!/usr/bin/python
"""
     Utility for converting *_clat_hist* files generated by fio into latency statistics.

@@ -16,8 +16,16 @@
import os
import sys
import pandas
+import re
import numpy as np

+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
+
err = sys.stderr.write

 def weighted_percentile(percs, vs, ws):
@@ -64,8 +72,20 @@ def weights(start_ts, end_ts, start, end):
def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)

-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+
+

 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number
@@ -178,7 +198,11 @@ def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
     row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))

 def update_extreme(val, fncn, new_val):
@@ -207,7 +231,7 @@ def process_interval(ctx, samples, iStart, iEnd):

         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]

@@ -241,7 +265,7 @@ def guess_max_from_bins(ctx, hist_cols):
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,7 +274,12 @@ def guess_max_from_bins(ctx, hist_cols):
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg)
+
     return bins[idx[1][0]]

 def main(ctx):
@@ -274,10 +303,19 @@ def main(ctx):
                         ctx.interval = int(hist_msec)
                 except NoOptionError:
                     pass
+
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)

     if ctx.interval is None:
         ctx.interval = 1000

+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
+
     # Automatically detect how many columns are in the input files,
     # calculate the corresponding 'coarseness' parameter used to generate
     # those files, and calculate the appropriate bin latency values:
@@ -339,6 +377,7 @@ def main(ctx):

 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+')
@@ -385,5 +424,18 @@ if __name__ == '__main__':
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')

+    arg('--percentiles',
+        default="90:95:99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0:95.0:99.0".  min, median(50%%) and max percentiles are always printed')
+
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99. fio uses ns for version >= 2.99')
+
+
+
     main(p.parse_args())

diff --git a/tools/hist/fiologparser_hist_nw.py b/tools/hist/fiologparser_hist_nw.py
new file mode 100644
index 0000000..98a378d
--- /dev/null
+++ b/tools/hist/fiologparser_hist_nw.py
@@ -0,0 +1,383 @@
+#!/usr/bin/python
+"""
+    Utility for converting *_clat_hist* files generated by fio into latency statistics.
+
+    Example usage:
+
+            $ fiologparser_hist.py *_clat_hist*
+            end-time, samples, min, avg, median, 90%, 95%, 99%, max
+            1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
+            2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
+            4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
+            ...
+
+    @author Karl Cronburg <karl.cronburg@gmail.com<mailto:karl.cronburg@gmail.com>>
+"""
+import os
+import sys
+import re
+import numpy as np
+
+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
+
+err = sys.stderr.write
+
+class HistFileRdr():
+    """ Class to read a hist file line by line, buffering
+        a value array for the latest line, and allowing a preview
+        of the next timestamp in next line
+        Note: this does not follow a generator pattern, but must explicitly
+        get next bin array.
+    """
+    def __init__(self, file):
+        self.fp = open(file, 'r')
+        self.data = self.nextData()
+
+    def close(self):
+        self.fp.close()
+        self.fp = None
+
+    def nextData(self):
+        self.data = None
+        if self.fp:
+            line = self.fp.readline()
+            if line == "":
+                self.close()
+            else:
+                self.data = [int(x) for x in line.replace(' ', '').rstrip().split(',')]
+
+        return self.data
+
+    @property
+    def curTS(self):
+        ts = None
+        if self.data:
+            ts = self.data[0]
+        return ts
+
+    @property
+    def curBins(self):
+        return self.data[3:]
+
+
+
+def weighted_percentile(percs, vs, ws):
+    """ Use linear interpolation to calculate the weighted percentile.
+
+        Value and weight arrays are first sorted by value. The cumulative
+        distribution function (cdf) is then computed, after which np.interp
+        finds the two values closest to our desired weighted percentile(s)
+        and linearly interpolates them.
+
+        percs  :: List of percentiles we want to calculate
+        vs     :: Array of values we are computing the percentile of
+        ws     :: Array of weights for our corresponding values
+        return :: Array of percentiles
+    """
+    idx = np.argsort(vs)
+    vs, ws = vs[idx], ws[idx] # weights and values sorted by value
+    cdf = 100 * (ws.cumsum() - ws / 2.0) / ws.sum()
+    return np.interp(percs, cdf, vs) # linear interpolation
+
+def weighted_average(vs, ws):
+    return np.sum(vs * ws) / np.sum(ws)
+
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+
+
+
+def fmt_float_list(ctx, num=1):
+  """ Return a comma separated list of float formatters to the required number
+      of decimal places. For instance:
+
+        fmt_float_list(ctx.decimals=4, num=3) == "%.4f, %.4f, %.4f"
+  """
+  return ', '.join(["%%.%df" % ctx.decimals] * num)
+
+# Default values - see beginning of main() for how we detect number columns in
+# the input files:
+__HIST_COLUMNS = 1216
+__NON_HIST_COLUMNS = 3
+__TOTAL_COLUMNS = __HIST_COLUMNS + __NON_HIST_COLUMNS
+
+def get_min(fps, arrs):
+    """ Find the file with the current first row with the smallest start time """
+    return min([fp for fp in fps if not arrs[fp] is None], key=lambda fp: arrs.get(fp)[0][0])
+
+def _plat_idx_to_val(idx, edge=0.5, FIO_IO_U_PLAT_BITS=6, FIO_IO_U_PLAT_VAL=64):
+    """ Taken from fio's stat.c for calculating the latency value of a bin
+        from that bin's index.
+
+            idx  : the value of the index into the histogram bins
+            edge : fractional value in the range [0,1]** indicating how far into
+            the bin we wish to compute the latency value of.
+
+        ** edge = 0.0 and 1.0 computes the lower and upper latency bounds
+           respectively of the given bin index. """
+
+    # MSB <= (FIO_IO_U_PLAT_BITS-1), cannot be rounded off. Use
+    # all bits of the sample as index
+    if (idx < (FIO_IO_U_PLAT_VAL << 1)):
+        return idx
+
+    # Find the group and compute the minimum value of that group
+    error_bits = (idx >> FIO_IO_U_PLAT_BITS) - 1
+    base = 1 << (error_bits + FIO_IO_U_PLAT_BITS)
+
+    # Find its bucket number of the group
+    k = idx % FIO_IO_U_PLAT_VAL
+
+    # Return the mean (if edge=0.5) of the range of the bucket
+    return base + ((k + edge) * (1 << error_bits))
+
+def plat_idx_to_val_coarse(idx, coarseness, edge=0.5):
+    """ Converts the given *coarse* index into a non-coarse index as used by fio
+        in stat.h:plat_idx_to_val(), subsequently computing the appropriate
+        latency value for that bin.
+        """
+
+    # Multiply the index by the power of 2 coarseness to get the bin
+    # bin index with a max of 1536 bins (FIO_IO_U_PLAT_GROUP_NR = 24 in stat.h)
+    stride = 1 << coarseness
+    idx = idx * stride
+    lower = _plat_idx_to_val(idx, edge=0.0)
+    upper = _plat_idx_to_val(idx + stride, edge=1.0)
+    return lower + (upper - lower) * edge
+
+def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
+    ps = weighted_percentile(percs, vs, ws)
+
+    avg = weighted_average(vs, ws)
+    values = [mn, avg] + list(ps) + [mx]
+    row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
+    print (fmt % tuple(row))
+
+def update_extreme(val, fncn, new_val):
+    """ Calculate min / max in the presence of None values """
+    if val is None: return new_val
+    else: return fncn(val, new_val)
+
+# See beginning of main() for how bin_vals are computed
+bin_vals = []
+lower_bin_vals = [] # lower edge of each bin
+upper_bin_vals = [] # upper edge of each bin
+
+def process_interval(ctx, iHist, iEnd):
+    """ print estimated percentages for the given merged sample
+    """
+    ss_cnt = 0 # number of samples affecting this interval
+    mn_bin_val, mx_bin_val = None, None
+
+    # Update total number of samples affecting current interval histogram:
+    ss_cnt += np.sum(iHist)
+
+    # Update min and max bin values
+    idxs = np.nonzero(iHist != 0)[0]
+    if idxs.size > 0:
+        mn_bin_val = bin_vals[idxs[0]]
+        mx_bin_val = bin_vals[idxs[-1]]
+
+    if ss_cnt > 0: print_all_stats(ctx, iEnd, mn_bin_val, ss_cnt, bin_vals, iHist, mx_bin_val)
+
+def guess_max_from_bins(ctx, hist_cols):
+    """ Try to guess the GROUP_NR from given # of histogram
+        columns seen in an input file """
+    max_coarse = 8
+    if ctx.group_nr < 19 or ctx.group_nr > 26:
+        bins = [ctx.group_nr * (1 << 6)]
+    else:
+        bins = [1216,1280,1344,1408,1472,1536,1600,1664]
+    coarses = range(max_coarse + 1)
+    fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else -10, coarses))
+
+    arr = np.transpose(list(map(fncn, bins)))
+    idx = np.where(arr == hist_cols)
+    if len(idx[1]) == 0:
+        table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
+            "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
+            "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
+            "This number needs to be equal to one of the following numbers:\n\n"
+            + table + "\n\n"
+            "Possible reasons and corresponding solutions:\n"
+            "  - Input file(s) does not contain histograms.\n"
+            "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
+            "    new GROUP_NR on the command line with --group_nr\n")
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg)
+
+    return bins[idx[1][0]]
+
+def main(ctx):
+
+    if ctx.job_file:
+        try:
+            from configparser import SafeConfigParser, NoOptionError
+        except ImportError:
+            from ConfigParser import SafeConfigParser, NoOptionError
+
+        cp = SafeConfigParser(allow_no_value=True)
+        with open(ctx.job_file, 'r') as fp:
+            cp.readfp(fp)
+
+        if ctx.interval is None:
+            # Auto detect --interval value
+            for s in cp.sections():
+                try:
+                    hist_msec = cp.get(s, 'log_hist_msec')
+                    if hist_msec is not None:
+                        ctx.interval = int(hist_msec)
+                except NoOptionError:
+                    pass
+
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
+
+    if ctx.interval is None:
+        ctx.interval = 1000
+
+    # Automatically detect how many columns are in the input files,
+    # calculate the corresponding 'coarseness' parameter used to generate
+    # those files, and calculate the appropriate bin latency values:
+    with open(ctx.FILE[0], 'r') as fp:
+        global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS
+        __TOTAL_COLUMNS = len(fp.readline().split(','))
+        __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS
+
+        max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
+        coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
+        bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        lower_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 0.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        upper_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 1.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+
+    fps = [HistFileRdr(f) for f in ctx.FILE]
+
+    print(', '.join(columns))
+
+    start = 0
+    end = ctx.interval
+    while True:
+
+        more_data = False
+
+        # add bins from all files in target intervals
+        arr = None
+        numSamples = 0
+        while True:
+            foundSamples = False
+            for fp in fps:
+                ts = fp.curTS
+                if ts and ts+10 < end:  # shift sample time when very close to an end time
+                    numSamples += 1
+                    foundSamples = True
+                    if arr is None:
+                        arr = np.zeros(shape=(__HIST_COLUMNS), dtype=int)
+                    arr = np.add(arr, fp.curBins)
+                    more_data = True
+                    fp.nextData()
+                elif ts:
+                    more_data = True
+
+            # reached end of all files
+            # or gone through all files without finding sample in interval
+            if not more_data or not foundSamples:
+                break
+
+        if arr is not None:
+            #print("{} size({}) samples({}) nonzero({}):".format(end, arr.size, numSamples, np.count_nonzero(arr)), str(arr), )
+            process_interval(ctx, arr, end)
+
+        # reach end of all files
+        if not more_data:
+            break
+
+        start += ctx.interval
+        end = start + ctx.interval
+
+        #if end > 20000: break
+
+
+if __name__ == '__main__':
+    import argparse
+    runascmd = True
+    p = argparse.ArgumentParser()
+    arg = p.add_argument
+    arg("FILE", help='space separated list of latency log filenames', nargs='+')
+    arg('--buff_size',
+        default=10000,
+        type=int,
+        help='number of samples to buffer into numpy at a time')
+
+    arg('--max_latency',
+        default=20,
+        type=float,
+        help='number of seconds of data to process at a time')
+
+    arg('-i', '--interval',
+        type=int,
+        help='interval width (ms), default 1000 ms '
+        '(no weighting between samples performed, results represent sample period only)')
+
+    arg('-d', '--divisor',
+        required=False,
+        type=int,
+        default=1,
+        help='divide the results by this value.')
+
+    arg('--decimals',
+        default=3,
+        type=int,
+        help='number of decimal places to print floats to')
+
+    arg('--warn',
+        dest='warn',
+        action='store_true',
+        default=False,
+        help='print warning messages to stderr')
+
+    arg('--group_nr',
+        default=29,
+        type=int,
+        help='FIO_IO_U_PLAT_GROUP_NR as defined in stat.h')
+
+    arg('--job-file',
+        default=None,
+        type=str,
+        help='Optional argument pointing to the job file used to create the '
+             'given histogram files. Useful for auto-detecting --log_hist_msec and '
+             '--log_unix_epoch (in fio) values.')
+
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+
+
+    main(p.parse_args())
+




* fiologparser_hist.py script patch and enhancements?
  2018-02-12 20:36 ` fiologparser_hist.py script patch and enhancements? Kris Davis
@ 2018-02-12 21:38   ` Kris Davis
  2018-02-13  7:26     ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-12 21:38 UTC (permalink / raw)
  To: fio; +Cc: Jens Axboe

In light of some related commits, I am reposting my enhancements to fiologparser_hist.py, and a suggested addition of fiologparser_hist_nw.py.
I've included a patch at the bottom.

Reasons for the changes: 

1) The fiologparser_hist script didn't support the new nanosecond bin values.  I changed it to assume nanosecond histogram bins and added a new "--usbin" option that lets the user override this, so the same script can still process histogram logs from older fio versions.

2) The script appeared hardcoded to return only the 50% (median), 90%, 95%, and 99% values (along with min and max).
I added a "--percentiles" option to allow requesting additional values; 'median' is always printed, even if a duplicate 50% column is requested, for backward compatibility.

3) A recent commit made some changes to support python3.
I added a check to make sure the Python version is at least 2.7, and changed the shebang to call out "python" rather than "python2.7".

4) The process can be slow for large log files or when combining many of them.  I have some automation which generically processes many log files, and found I cut the processing time in half by loading the script as a module rather than invoking it as a command.  So I changed it so it can be loaded as a module and main() called directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error, rather than exit, when called as a module.
Someone might know of a better, more conventional Pythonic design pattern, but it works.

5) The script appears to assume that the log is never missing samples.  That is, it weights samples across the requested intervals, I think with the assumption that the log samples are at longer intervals than, or at least the same interval length as, the "--interval" value.  If the workload actually contains "thinktime" periods (which produce no sample when there is no data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.

In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked the script into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has the added advantage of much quicker performance: for example, fiologparser_hist took about half an hour to combine about 350 logs, while fiologparser_hist_nw took 45 seconds, which is much better for my automation.

Of course, percentiles with a larger number of nines will have additional inaccuracies when there are not enough operations in a sample period, but that is a case of user beware.

Thanks

Kris



diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
index 62a4eb4..c77bb11 100755
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python2.7
+#!/usr/bin/python
""" 
     Utility for converting *_clat_hist* files generated by fio into latency statistics.
     
@@ -16,8 +16,16 @@
import os
import sys
import pandas
+import re
import numpy as np

+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
+
err = sys.stderr.write

 def weighted_percentile(percs, vs, ws):
@@ -64,8 +72,20 @@ def weights(start_ts, end_ts, start, end):
def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)

-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+

 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number
@@ -178,7 +198,11 @@ def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
     row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))

 def update_extreme(val, fncn, new_val):
@@ -207,7 +231,7 @@ def process_interval(ctx, samples, iStart, iEnd):
             
         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]

@@ -241,7 +265,7 @@ def guess_max_from_bins(ctx, hist_cols):
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,7 +274,12 @@ def guess_max_from_bins(ctx, hist_cols):
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
     return bins[idx[1][0]]

 def main(ctx):
@@ -274,10 +303,19 @@ def main(ctx):
                         ctx.interval = int(hist_msec)
                 except NoOptionError:
                     pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)

     if ctx.interval is None:
         ctx.interval = 1000

+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
+
     # Automatically detect how many columns are in the input files,
     # calculate the corresponding 'coarseness' parameter used to generate
     # those files, and calculate the appropriate bin latency values:
@@ -339,6 +377,7 @@ def main(ctx):

 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+')
@@ -385,5 +424,18 @@ if __name__ == '__main__':
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')

+    arg('--percentiles',
+        default="90:95:99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0:95.0:99.0".  min, median(50%%) and max percentiles are always printed')
+    
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99. fio uses ns for version >= 2.99')
+    
+    
+
     main(p.parse_args())

diff --git a/tools/hist/fiologparser_hist_nw.py b/tools/hist/fiologparser_hist_nw.py
new file mode 100644
index 0000000..98a378d
--- /dev/null
+++ b/tools/hist/fiologparser_hist_nw.py
@@ -0,0 +1,383 @@
+#!/usr/bin/python
+""" 
+    Utility for converting *_clat_hist* files generated by fio into latency statistics.
+    
+    Example usage:
+    
+            $ fiologparser_hist.py *_clat_hist*
+            end-time, samples, min, avg, median, 90%, 95%, 99%, max
+            1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
+            2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
+            4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
+            ...
+    
+    @author Karl Cronburg <mailto:karl.cronburg@gmail.com>
+"""
+import os
+import sys
+import re
+import numpy as np
+
+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
+
+err = sys.stderr.write
+
+class HistFileRdr():
+    """ Class to read a hist file line by line, buffering 
+        a value array for the latest line, and allowing a preview
+        of the next timestamp in next line
+        Note: this does not follow a generator pattern, but must explicitly
+        get next bin array.
+    """
+    def __init__(self, file):
+        self.fp = open(file, 'r')
+        self.data = self.nextData()
+        
+    def close(self):
+        self.fp.close()
+        self.fp = None
+        
+    def nextData(self):
+        self.data = None
+        if self.fp: 
+            line = self.fp.readline()
+            if line == "":
+                self.close()
+            else:
+                self.data = [int(x) for x in line.replace(' ', '').rstrip().split(',')]
+                
+        return self.data
+ 
+    @property
+    def curTS(self):
+        ts = None
+        if self.data:
+            ts = self.data[0]
+        return ts
+             
+    @property
+    def curBins(self):
+        return self.data[3:]
+                
+    
+
+def weighted_percentile(percs, vs, ws):
+    """ Use linear interpolation to calculate the weighted percentile.
+        
+        Value and weight arrays are first sorted by value. The cumulative
+        distribution function (cdf) is then computed, after which np.interp
+        finds the two values closest to our desired weighted percentile(s)
+        and linearly interpolates them.
+        
+        percs  :: List of percentiles we want to calculate
+        vs     :: Array of values we are computing the percentile of
+        ws     :: Array of weights for our corresponding values
+        return :: Array of percentiles
+    """
+    idx = np.argsort(vs)
+    vs, ws = vs[idx], ws[idx] # weights and values sorted by value
+    cdf = 100 * (ws.cumsum() - ws / 2.0) / ws.sum()
+    return np.interp(percs, cdf, vs) # linear interpolation
+
+def weighted_average(vs, ws):
+    return np.sum(vs * ws) / np.sum(ws)
+
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
+
+def fmt_float_list(ctx, num=1):
+  """ Return a comma separated list of float formatters to the required number
+      of decimal places. For instance:
+
+        fmt_float_list(ctx.decimals=4, num=3) == "%.4f, %.4f, %.4f"
+  """
+  return ', '.join(["%%.%df" % ctx.decimals] * num)
+
+# Default values - see beginning of main() for how we detect number columns in
+# the input files:
+__HIST_COLUMNS = 1216
+__NON_HIST_COLUMNS = 3
+__TOTAL_COLUMNS = __HIST_COLUMNS + __NON_HIST_COLUMNS
+    
+def get_min(fps, arrs):
+    """ Find the file with the current first row with the smallest start time """
+    return min([fp for fp in fps if not arrs[fp] is None], key=lambda fp: arrs.get(fp)[0][0])
+
+def _plat_idx_to_val(idx, edge=0.5, FIO_IO_U_PLAT_BITS=6, FIO_IO_U_PLAT_VAL=64):
+    """ Taken from fio's stat.c for calculating the latency value of a bin
+        from that bin's index.
+        
+            idx  : the value of the index into the histogram bins
+            edge : fractional value in the range [0,1]** indicating how far into
+            the bin we wish to compute the latency value of.
+        
+        ** edge = 0.0 and 1.0 computes the lower and upper latency bounds
+           respectively of the given bin index. """
+
+    # MSB <= (FIO_IO_U_PLAT_BITS-1), cannot be rounded off. Use
+    # all bits of the sample as index
+    if (idx < (FIO_IO_U_PLAT_VAL << 1)):
+        return idx 
+
+    # Find the group and compute the minimum value of that group
+    error_bits = (idx >> FIO_IO_U_PLAT_BITS) - 1 
+    base = 1 << (error_bits + FIO_IO_U_PLAT_BITS)
+
+    # Find its bucket number of the group
+    k = idx % FIO_IO_U_PLAT_VAL
+
+    # Return the mean (if edge=0.5) of the range of the bucket
+    return base + ((k + edge) * (1 << error_bits))
+    
+def plat_idx_to_val_coarse(idx, coarseness, edge=0.5):
+    """ Converts the given *coarse* index into a non-coarse index as used by fio
+        in stat.h:plat_idx_to_val(), subsequently computing the appropriate
+        latency value for that bin.
+        """
+
+    # Multiply the index by the power of 2 coarseness to get the bin
+    # bin index with a max of 1536 bins (FIO_IO_U_PLAT_GROUP_NR = 24 in stat.h)
+    stride = 1 << coarseness
+    idx = idx * stride
+    lower = _plat_idx_to_val(idx, edge=0.0)
+    upper = _plat_idx_to_val(idx + stride, edge=1.0)
+    return lower + (upper - lower) * edge
+
+def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
+    ps = weighted_percentile(percs, vs, ws)
+
+    avg = weighted_average(vs, ws)
+    values = [mn, avg] + list(ps) + [mx]
+    row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
+    print (fmt % tuple(row))
+
+def update_extreme(val, fncn, new_val):
+    """ Calculate min / max in the presence of None values """
+    if val is None: return new_val
+    else: return fncn(val, new_val)
+
+# See beginning of main() for how bin_vals are computed
+bin_vals = []
+lower_bin_vals = [] # lower edge of each bin
+upper_bin_vals = [] # upper edge of each bin 
+
+def process_interval(ctx, iHist, iEnd):
+    """ print estimated percentages for the given merged sample
+    """
+    ss_cnt = 0 # number of samples affecting this interval
+    mn_bin_val, mx_bin_val = None, None
+   
+    # Update total number of samples affecting current interval histogram:
+    ss_cnt += np.sum(iHist)
+        
+    # Update min and max bin values
+    idxs = np.nonzero(iHist != 0)[0]
+    if idxs.size > 0:
+        mn_bin_val = bin_vals[idxs[0]]
+        mx_bin_val = bin_vals[idxs[-1]]
+
+    if ss_cnt > 0: print_all_stats(ctx, iEnd, mn_bin_val, ss_cnt, bin_vals, iHist, mx_bin_val)
+
+def guess_max_from_bins(ctx, hist_cols):
+    """ Try to guess the GROUP_NR from given # of histogram
+        columns seen in an input file """
+    max_coarse = 8
+    if ctx.group_nr < 19 or ctx.group_nr > 26:
+        bins = [ctx.group_nr * (1 << 6)]
+    else:
+        bins = [1216,1280,1344,1408,1472,1536,1600,1664]
+    coarses = range(max_coarse + 1)
+    fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else -10, coarses))
+    
+    arr = np.transpose(list(map(fncn, bins)))
+    idx = np.where(arr == hist_cols)
+    if len(idx[1]) == 0:
+        table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
+            "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
+            "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
+            "This number needs to be equal to one of the following numbers:\n\n"
+            + table + "\n\n"
+            "Possible reasons and corresponding solutions:\n"
+            "  - Input file(s) does not contain histograms.\n"
+            "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
+            "    new GROUP_NR on the command line with --group_nr\n")
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
+    return bins[idx[1][0]]
+
+def main(ctx):
+
+    if ctx.job_file:
+        try:
+            from configparser import SafeConfigParser, NoOptionError
+        except ImportError:
+            from ConfigParser import SafeConfigParser, NoOptionError
+
+        cp = SafeConfigParser(allow_no_value=True)
+        with open(ctx.job_file, 'r') as fp:
+            cp.readfp(fp)
+
+        if ctx.interval is None:
+            # Auto detect --interval value
+            for s in cp.sections():
+                try:
+                    hist_msec = cp.get(s, 'log_hist_msec')
+                    if hist_msec is not None:
+                        ctx.interval = int(hist_msec)
+                except NoOptionError:
+                    pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
+
+    if ctx.interval is None:
+        ctx.interval = 1000
+        
+    # Automatically detect how many columns are in the input files,
+    # calculate the corresponding 'coarseness' parameter used to generate
+    # those files, and calculate the appropriate bin latency values:
+    with open(ctx.FILE[0], 'r') as fp:
+        global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS
+        __TOTAL_COLUMNS = len(fp.readline().split(','))
+        __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS
+
+        max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
+        coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
+        bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        lower_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 0.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        upper_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 1.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+
+    fps = [HistFileRdr(f) for f in ctx.FILE]
+
+    print(', '.join(columns))
+
+    start = 0
+    end = ctx.interval
+    while True:
+        
+        more_data = False
+        
+        # add bins from all files in target intervals
+        arr = None
+        numSamples = 0
+        while True:
+            foundSamples = False
+            for fp in fps:
+                ts = fp.curTS
+                if ts and ts+10 < end:  # shift sample time when very close to an end time                 
+                    numSamples += 1
+                    foundSamples = True
+                    if arr is None: 
+                        arr = np.zeros(shape=(__HIST_COLUMNS), dtype=int)
+                    arr = np.add(arr, fp.curBins)
+                    more_data = True
+                    fp.nextData()
+                elif ts:
+                    more_data = True
+            
+            # reached end of all files
+            # or gone through all files without finding sample in interval 
+            if not more_data or not foundSamples:
+                break
+        
+        if arr is not None:
+            #print("{} size({}) samples({}) nonzero({}):".format(end, arr.size, numSamples, np.count_nonzero(arr)), str(arr), )
+            process_interval(ctx, arr, end)         
+        
+        # reach end of all files
+        if not more_data:
+            break
+            
+        start += ctx.interval
+        end = start + ctx.interval
+        
+        #if end > 20000: break
+
+
+if __name__ == '__main__':
+    import argparse
+    runascmd = True
+    p = argparse.ArgumentParser()
+    arg = p.add_argument
+    arg("FILE", help='space separated list of latency log filenames', nargs='+')
+    arg('--buff_size',
+        default=10000,
+        type=int,
+        help='number of samples to buffer into numpy at a time')
+
+    arg('--max_latency',
+        default=20,
+        type=float,
+        help='number of seconds of data to process at a time')
+
+    arg('-i', '--interval',
+        type=int,
+        help='interval width (ms), default 1000 ms '
+        '(no weighting between samples performed, results represent sample period only)')
+
+    arg('-d', '--divisor',
+        required=False,
+        type=int,
+        default=1,
+        help='divide the results by this value.')
+
+    arg('--decimals',
+        default=3,
+        type=int,
+        help='number of decimal places to print floats to')
+
+    arg('--warn',
+        dest='warn',
+        action='store_true',
+        default=False,
+        help='print warning messages to stderr')
+
+    arg('--group_nr',
+        default=29,
+        type=int,
+        help='FIO_IO_U_PLAT_GROUP_NR as defined in stat.h')
+
+    arg('--job-file',
+        default=None,
+        type=str,
+        help='Optional argument pointing to the job file used to create the '
+             'given histogram files. Useful for auto-detecting --log_hist_msec and '
+             '--log_unix_epoch (in fio) values.')
+
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+        
+
+    main(p.parse_args())
+




* Re: fiologparser_hist.py script patch and enhancements?
  2018-02-12 21:38   ` Kris Davis
@ 2018-02-13  7:26     ` Sitsofe Wheeler
  2018-02-14 17:51       ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-02-13  7:26 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Jens Axboe, Vincent Fu

(CC'ing Vincent)

On 12 February 2018 at 21:38, Kris Davis <Kris.Davis@wdc.com> wrote:
> In light of some related commits, I am reposting my enhancements to fiologparser_hist.py, and a suggested addition of fiologparser_hist_nw.py.
> I've included a patch at the bottom.

It might be an idea to post this up on github too (it might make it
easier for others to pull down).

> Reasons for the changes:
>
> 1) The fiologparser_hist script didn't support the new nanosecond bin values.  I changed it to assume nanosecond histogram bins and added a new "--usbin" option that lets the user override this, so the same script can still process histogram logs from older fio versions.
>
> 2) The script appeared hardcoded to return only the 50% (median), 90%, 95%, and 99% values (along with min and max).
> I added a "--percentiles" option to allow requesting additional values; 'median' is always printed, even if a duplicate 50% column is requested, for backward compatibility.

These sound good.

> 3) A recent commit made some changes to support python3.
> I added a check to make sure the Python version is at least 2.7, and changed the shebang to call out "python" rather than "python2.7".

Sadly the switch to python2.7 was done on purpose:
https://github.com/axboe/fio/commit/60023ade47e7817db1c18d9b7e511839de5c2c99
- Linux distros are clamping down on python and macOS doesn't have
python2. The whole python interpreter line business is a mess and
there's simply no common agreement - if you look you can find
conflicting PEPs and I'm starting to think packagers will just have to
include a function to rename lines to their preferred style. My hope
is one day all the scripts are converted to be both python2 and
python3 compatible, all OSes finally get around to shipping python3 by
default and then the interpreter line can be switched.

> 4) The process can be slow for large log files or when combining many of them.  I have some automation which generically processes many log files, and found I cut the processing time in half by loading the script as a module rather than invoking it as a command.  So I changed it so it can be loaded as a module and main() called directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error, rather than exit, when called as a module.
> Someone might know of a better, more conventional Pythonic design pattern, but it works.

I've no strong feeling on this.

> 5) The script appears to assume that the log is never missing samples.  That is, it weights samples across the requested intervals, I think with the assumption that the log samples are at longer intervals than, or at least the same interval length as, the "--interval" value.  If the workload actually contains "thinktime" periods (which produce no sample when there is no data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.
>
> In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked the script into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has the added advantage of much quicker performance: for example, fiologparser_hist took about half an hour to combine about 350 logs, while fiologparser_hist_nw took 45 seconds, which is much better for my automation.

Just out of interest does using pypy help you at all?

> Of course, percentiles with a larger number of nines will have additional inaccuracies when there are not enough operations in a sample period, but that is a case of user beware.
>
> Thanks
>
> Kris
>
>
>
> diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
> index 62a4eb4..c77bb11 100755
> --- a/tools/hist/fiologparser_hist.py
> +++ b/tools/hist/fiologparser_hist.py

<snip>

-- 
Sitsofe | http://sucs.org/~sits/



* RE: fiologparser_hist.py script patch and enhancements?
  2018-02-13  7:26     ` Sitsofe Wheeler
@ 2018-02-14 17:51       ` Kris Davis
  2018-02-21 14:41         ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-14 17:51 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe, Vincent Fu

I wasn't familiar with pypy.  I tried it out, but needed to build it locally on my CentOS 7.3 machine.  However, I found something interesting...
I ran fiologparser_hist.py in multiple trials against 9 log files to be combined.
*  with pypy  it consistently took about 51-52 seconds.
*  with python3.4.3, it also consistently took about 51-52 seconds.
*  with python2.7.5 it consistently took 10-11 seconds.

I also built python 3.4.3 from the distribution locally some months earlier.  Sure is suspicious.
But, this is off topic, so I'll investigate further without further updates.  Thanks for the reference Sitsofe.

Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Tuesday, February 13, 2018 1:27 AM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu <vincentfu@gmail.com>
Subject: Re: fiologparser_hist.py script patch and enhancements?

(CC'ing Vincent)

On 12 February 2018 at 21:38, Kris Davis <Kris.Davis@wdc.com> wrote:
> In light of some related commits, I am reposting my enhancements to
> fiologparser_hist.py, and a suggested addition of fiologparser_hist_nw.py. I've included a patch at the bottom.

It might be an idea to post this up on github too (it might make it easier for others to pull down).

> Reasons for the changes:
>
> 1) The fiologparser_hist script didn't support the new nanosecond bin values.  I changed it to assume nanosecond histogram bins and added a new "--usbin" option that lets the user override this, so the same script can still process histogram logs from older fio versions.
>
> 2) The script appeared hardcoded to return only the 50% (median), 90%, 95%, and 99% values (along with min and max).
> I added a "--percentiles" option to allow requesting additional values; 'median' is always printed, even if a duplicate 50% column is requested, for backward compatibility.

These sound good.

> 3) A recent commit made some changes to support python3.
> I added a check to make sure the Python version is at least 2.7, and changed the shebang to call out "python" rather than "python2.7".

Sadly the switch to python2.7 was done on purpose:
https://github.com/axboe/fio/commit/60023ade47e7817db1c18d9b7e511839de5c2c99
- Linux distros are clamping down on python and macOS doesn't have python2. The whole python interpreter line business is a mess and there's simply no common agreement - if you look you can find conflicting PEPs and I'm starting to think packagers will just have to include a function to rename lines to their preferred style. My hope is one day all the scripts are converted to be both python2 and
python3 compatible, all OSes finally get around to shipping python3 by default and then the interpreter line can be switched.

> 4) The process can be slow for large log files or when combining many of them.  I have some automation which generically processes many log files, and found I cut the processing time in half by loading the script as a module rather than invoking it as a command.  So I changed it so it can be loaded as a module and main() called directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error, rather than exit, when called as a module.
> Someone might know of a better, more conventional Pythonic design pattern, but it works.

I've no strong feeling on this.

> 5) The script appears to assume that the log is never missing samples.  That is, it weights samples across the requested intervals, I think with the assumption that the log samples are at longer intervals than, or at least the same interval length as, the "--interval" value.  If the workload actually contains "thinktime" periods (which produce no sample when there is no data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.
>
> In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked the script into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has the added advantage of much quicker performance: for example, fiologparser_hist took about half an hour to combine about 350 logs, while fiologparser_hist_nw took 45 seconds, which is much better for my automation.

Just out of interest does using pypy help you at all?

> Of course, percentiles with a larger number of nines will have additional inaccuracies when there are not enough operations in a sample period, but that is a case of user beware.
>
> Thanks
>
> Kris
>
>
>
> diff --git a/tools/hist/fiologparser_hist.py 
> b/tools/hist/fiologparser_hist.py index 62a4eb4..c77bb11 100755
> --- a/tools/hist/fiologparser_hist.py
> +++ b/tools/hist/fiologparser_hist.py

<snip>

--
Sitsofe | http://sucs.org/~sits/


* RE: fiologparser_hist.py script patch and enhancements?
  2018-02-14 17:51       ` Kris Davis
@ 2018-02-21 14:41         ` Kris Davis
  2018-02-21 16:52           ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-21 14:41 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe, Vincent Fu

After doing some more research, my best guess is that python3.4 being much slower than python2.7 is due to the difference in integer handling, so pypy 3 didn't do any better than cpython 3.4. I don't really have anything to verify that.

So, given no further feedback, is there any reason not to commit the suggested changes?   Is there anything else I need to do? 

Thanks
Kris Davis

-----Original Message-----
From: Kris Davis 
Sent: Wednesday, February 14, 2018 11:51 AM
To: 'Sitsofe Wheeler' <sitsofe@gmail.com>
Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu <vincentfu@gmail.com>
Subject: RE: fiologparser_hist.py script patch and enhancements?

I wasn't familiar with pypy.  I tried it out, but needed to build it locally on my CentOS 7.3 machine.  However, I found something interesting...
I ran fiologparser_hist.py in multiple trials against 9 log files to be combined.
*  with pypy  it consistently took about 51-52 seconds.
*  with python3.4.3, it also consistently took about 51-52 seconds.
*  with python2.7.5 it consistently took 10-11 seconds.

I also built python 3.4.3 from the distribution locally some months earlier.  Sure is suspicious.
But, this is off topic, so I'll investigate further without further updates.  Thanks for the reference Sitsofe.

Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com]
Sent: Tuesday, February 13, 2018 1:27 AM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu <vincentfu@gmail.com>
Subject: Re: fiologparser_hist.py script patch and enhancements?

(CC'ing Vincent)

On 12 February 2018 at 21:38, Kris Davis <Kris.Davis@wdc.com> wrote:
> In light of some related commits, I am reposting my enhancements to
> fiologparser_hist.py, and a suggested addition of fiologparser_hist_nw.py. I've included a patch at the bottom.

It might be an idea to post this up on github too (it might make it easier for others to pull down).

> Reasons for the changes:
>
> 1) The fiologparser_hist script didn't support the new nanosecond bin values.  I changed it to assume nanosecond histogram bins and added a new "--usbin" option that lets the user override this, so the same script can still process histogram logs from older fio versions.
>
> 2) The script appeared hardcoded to return only the 50% (median), 90%, 95%, and 99% values (along with min and max).
> I added a "--percentiles" option to allow requesting additional values; 'median' is always printed, even if a duplicate 50% column is requested, for backward compatibility.

These sound good.

> 3) A recent commit made some changes to support python3.
> I added a check to make sure the Python version is at least 2.7, and changed the shebang to call out "python" rather than "python2.7".

Sadly the switch to python2.7 was done on purpose:
https://github.com/axboe/fio/commit/60023ade47e7817db1c18d9b7e511839de5c2c99
- Linux distros are clamping down on python and macOS doesn't have python2. The whole python interpreter line business is a mess and there's simply no common agreement - if you look you can find conflicting PEPs and I'm starting to think packagers will just have to include a function to rename lines to their preferred style. My hope is one day all the scripts are converted to be both python2 and
python3 compatible, all OSes finally get around to shipping python3 by default and then the interpreter line can be switched.

> 4) The process can be slow for large or combining many log files.  I have some automation which will generically process many log files, and found I cut the process time in half if I loaded as a module rather than calling as a command.  So, changed so I can load as a module and call main directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error rather than exit, when called as a module.
> Someone might know of a better, more conventional pythonic design pattern to use, but it works.

I've no strong feeling on this.

> 5) The script appears to assume that the log is never missing samples.  That is, weight samples to the requested intervals, I think with the assumption that the log samples are at longer intervals, or at least the same interval length as the "--interval" value.  If the workload actually contains "thinktime" intervals (with missing sample when zero data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.
>
> In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has an added advantage of much quicker performance. For example, fiologparser_hist took about 1/2 hr to combine about 350 logs, but fiologparser_hist_nw took 45 seconds, way better for my automation.

Just out of interest does using pypy help you at all?

> Of course, larger number of 9's percentiles would have additional inaccuracies when there are not enough operations in a sample period, but that is just user beware.
>
> Thanks
>
> Kris
>
>
>
> diff --git a/tools/hist/fiologparser_hist.py 
> b/tools/hist/fiologparser_hist.py index 62a4eb4..c77bb11 100755
> --- a/tools/hist/fiologparser_hist.py
> +++ b/tools/hist/fiologparser_hist.py

<snip>

--
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: fiologparser_hist.py script patch and enhancements?
  2018-02-21 14:41         ` Kris Davis
@ 2018-02-21 16:52           ` Sitsofe Wheeler
  2018-02-21 17:00             ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-02-21 16:52 UTC (permalink / raw)
  To: Vincent Fu; +Cc: Vincent Fu, fio, Jens Axboe, Kris Davis

Hi Kris,

From what you're saying it seems to make sense to leave the
interpreter line at 2.7 for now. If you can repost your patch with
that and/or send a GitHub pull request to Jens' GitHub repo, the rest of
the changes sound fine to me.

Vincent: any comment on these changes?

On 21 February 2018 at 14:41, Kris Davis <Kris.Davis@wdc.com> wrote:
> After doing some more research, my best guess is that python3.4 being much slower than python2.7 is due to the difference in integer handling, so pypy 3 didn't do any better then cpython 3.4. I don't really have anything to verify that.
>
> So, given no further feedback, is there any reason not to commit the suggested changes?   Is there anything else I need to do?
>
> Thanks
> Kris Davis
>
> -----Original Message-----
> From: Kris Davis
> Sent: Wednesday, February 14, 2018 11:51 AM
> To: 'Sitsofe Wheeler' <sitsofe@gmail.com>
> Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu <vincentfu@gmail.com>
> Subject: RE: fiologparser_hist.py script patch and enhancements?
>
> I wasn't familiar with pypy.  I tried it out, but needed to build locally on my Centos 7.3 machine.  However, I found something interesting...
> I ran fiologparser_hist.py in multiple trials against 9 log files to be combined.
> *  with pypy  it consistently took about 51-52 seconds.
> *  with python3.4.3, it also consistently took about 51-52 seconds.
> *  with python2.7.5 it consistently took 10-11 seconds.
>
> I also built python 3.4.3 from the distribution locally some months earlier.  Sure is suspicious.
> But, this is off topic, so I'll investigate further without further updates.  Thanks for the reference Sitsofe.
>
> Kris Davis
>
> -----Original Message-----
> From: Sitsofe Wheeler [mailto:sitsofe@gmail.com]
> Sent: Tuesday, February 13, 2018 1:27 AM
> To: Kris Davis <Kris.Davis@wdc.com>
> Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu <vincentfu@gmail.com>
> Subject: Re: fiologparser_hist.py script patch and enhancements?
>
> (CC'ing Vincent)
>
> On 12 February 2018 at 21:38, Kris Davis <Kris.Davis@wdc.com> wrote:
>> In light of some related commits, I am reposting my enhancements to
>> fiologparser_hist.py, and a suggested addition of fiologparser_hist.nw.py I've included a patch at bottom.
>
> It might be an idea to post this up on github too (it might make it easier for others to pull down).
>
>> Reasons for the changes:
>>
>> 1) The fiologparser_hist script didn't support the new nanosecond bin values.  So I changed the operation to assume nanosecond histogram bins, and new "--usbin" option to allow user to override so same script can still process older version histogram logs.
>>
>> 2) The script asppeared hardcoded to only return 50% (median), 90%, 95%, and 99% values (along with min and max).
>> I added "--percentiles" option to allow a request for more values ('median' always printed, even if a duplicate 50% column is requested, for backward compatibility).
>
> These sound good.
>
>> 3) A recent commit made some changes to support python3.
>> I added a check to make sure the python version is at least 2.7 or above, and changed the "shbang" to only call out "python" rather than "python2.7"
>
> Sadly the switch to python2.7 was done on purpose:
> https://github.com/axboe/fio/commit/60023ade47e7817db1c18d9b7e511839de5c2c99
> - Linux distros are clamping down on python and macOS doesn't have python2. The whole python interpreter line business is a mess and there's simply no common agreement - if you look you can find conflicting PEPs and I'm starting to think packagers will just have to include a function to rename lines to their preferred style. My hope is one day all the scripts are converted to be both python2 and
> python3 compatible, all OSes finally get around to shipping python3 by default and then the interpreter line can be switched.
>
>> 4) The process can be slow for large or combining many log files.  I have some automation which will generically process many log files, and found I cut the process time in half if I loaded as a module rather than calling as a command.  So, changed so I can load as a module and call main directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error rather than exit, when called as a module.
>> Someone might know of a better, more conventional pythonic design pattern to use, but it works.
>
> I've no strong feeling on this.
>
>> 5) The script appears to assume that the log is never missing samples.  That is, weight samples to the requested intervals, I think with the assumption that the log samples are at longer intervals, or at least the same interval length as the "--interval" value.  If the workload actually contains "thinktime" intervals (with missing sample when zero data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.
>>
>> In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has an added advantage of much quicker performance. For example, fiologparser_hist took about 1/2 hr to combine about 350 logs, but fiologparser_hist_nw took 45 seconds, way better for my automation.
>
> Just out of interest does using pypy help you at all?
>
>> Of course, larger number of 9's percentiles would have additional inaccuracies when there are not enough operations in a sample period, but that is just user beware.
>>
>> diff --git a/tools/hist/fiologparser_hist.py
>> b/tools/hist/fiologparser_hist.py index 62a4eb4..c77bb11 100755
>> --- a/tools/hist/fiologparser_hist.py
>> +++ b/tools/hist/fiologparser_hist.py
>
> <snip>

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-02-21 16:52           ` Sitsofe Wheeler
@ 2018-02-21 17:00             ` Kris Davis
  2018-02-21 17:23               ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-21 17:00 UTC (permalink / raw)
  To: Sitsofe Wheeler, Vincent Fu; +Cc: Vincent Fu, fio, Jens Axboe

I had changed the interpreter to just be "python" rather than "python2.7", and added a check to ensure that the Python version is at least 2.7.  This allows the script to run under whatever version (2.7 or above) is associated with "python" (usually 2.7+ on recent Linux distributions).  What I've seen (unless a virtualenv is used) is that Python 3+ installs a symlink named python3, not python.
The point is, if the interpreter line is set to python2.7, the user is effectively forced to use 2.7 unless every invocation is prefixed with an explicit python interpreter.
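
For reference, a minimal sketch of the kind of guard I mean (not the exact patch text):

    import sys

    # bail out early on interpreters older than 2.7
    if sys.version_info < (2, 7):
        sys.stderr.write("ERROR: Python 2.7 or greater is required.\n")
        sys.exit(1)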

Does that address your concern?

Kris Davis

-----Original Message-----
From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On Behalf Of Sitsofe Wheeler
Sent: Wednesday, February 21, 2018 10:52 AM
To: Vincent Fu <vincentfu@gmail.com>
Cc: Vincent Fu <Vincent.Fu@wdc.com>; fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Kris Davis <Kris.Davis@wdc.com>
Subject: Re: fiologparser_hist.py script patch and enhancements?

Hi Kris,

From what you're saying it seems to make sense to leave the interpreter line at 2.7 for now. If you can repost your patch with that and/or send a github pull request to Jens' Github repo the rest of the changes sound fine to me.

Vincent: any comment on these changes?

On 21 February 2018 at 14:41, Kris Davis <Kris.Davis@wdc.com> wrote:
> After doing some more research, my best guess is that python3.4 being much slower than python2.7 is due to the difference in integer handling, so pypy 3 didn't do any better then cpython 3.4. I don't really have anything to verify that.
>
> So, given no further feedback, is there any reason not to commit the suggested changes?   Is there anything else I need to do?
>
> Thanks
> Kris Davis
>
> -----Original Message-----
> From: Kris Davis
> Sent: Wednesday, February 14, 2018 11:51 AM
> To: 'Sitsofe Wheeler' <sitsofe@gmail.com>
> Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu 
> <vincentfu@gmail.com>
> Subject: RE: fiologparser_hist.py script patch and enhancements?
>
> I wasn't familiar with pypy.  I tried it out, but needed to build locally on my Centos 7.3 machine.  However, I found something interesting...
> I ran fiologparser_hist.py in multiple trials against 9 log files to be combined.
> *  with pypy  it consistently took about 51-52 seconds.
> *  with python3.4.3, it also consistently took about 51-52 seconds.
> *  with python2.7.5 it consistently took 10-11 seconds.
>
> I also built python 3.4.3 from the distribution locally some months earlier.  Sure is suspicious.
> But, this is off topic, so I'll investigate further without further updates.  Thanks for the reference Sitsofe.
>
> Kris Davis
>
> -----Original Message-----
> From: Sitsofe Wheeler [mailto:sitsofe@gmail.com]
> Sent: Tuesday, February 13, 2018 1:27 AM
> To: Kris Davis <Kris.Davis@wdc.com>
> Cc: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu 
> <vincentfu@gmail.com>
> Subject: Re: fiologparser_hist.py script patch and enhancements?
>
> (CC'ing Vincent)
>
> On 12 February 2018 at 21:38, Kris Davis <Kris.Davis@wdc.com> wrote:
>> In light of some related commits, I am reposting my enhancements to 
>> fiologparser_hist.py, and a suggested addition of fiologparser_hist.nw.py I've included a patch at bottom.
>
> It might be an idea to post this up on github too (it might make it easier for others to pull down).
>
>> Reasons for the changes:
>>
>> 1) The fiologparser_hist script didn't support the new nanosecond bin values.  So I changed the operation to assume nanosecond histogram bins, and new "--usbin" option to allow user to override so same script can still process older version histogram logs.
>>
>> 2) The script asppeared hardcoded to only return 50% (median), 90%, 95%, and 99% values (along with min and max).
>> I added "--percentiles" option to allow a request for more values ('median' always printed, even if a duplicate 50% column is requested, for backward compatibility).
>
> These sound good.
>
>> 3) A recent commit made some changes to support python3.
>> I added a check to make sure the python version is at least 2.7 or above, and changed the "shbang" to only call out "python" rather than "python2.7"
>
> Sadly the switch to python2.7 was done on purpose:
> https://github.com/axboe/fio/commit/60023ade47e7817db1c18d9b7e511839de
> 5c2c99
> - Linux distros are clamping down on python and macOS doesn't have 
> python2. The whole python interpreter line business is a mess and 
> there's simply no common agreement - if you look you can find 
> conflicting PEPs and I'm starting to think packagers will just have to 
> include a function to rename lines to their preferred style. My hope 
> is one day all the scripts are converted to be both python2 and
> python3 compatible, all OSes finally get around to shipping python3 by default and then the interpreter line can be switched.
>
>> 4) The process can be slow for large or combining many log files.  I have some automation which will generically process many log files, and found I cut the process time in half if I loaded as a module rather than calling as a command.  So, changed so I can load as a module and call main directly, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error rather than exit, when called as a module.
>> Someone might know of a better, more conventional pythonic design pattern to use, but it works.
>
> I've no strong feeling on this.
>
>> 5) The script appears to assume that the log is never missing samples.  That is, weight samples to the requested intervals, I think with the assumption that the log samples are at longer intervals, or at least the same interval length as the "--interval" value.  If the workload actually contains "thinktime" intervals (with missing sample when zero data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.
>>
>> In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has an added advantage of much quicker performance. For example, fiologparser_hist took about 1/2 hr to combine about 350 logs, but fiologparser_hist_nw took 45 seconds, way better for my automation.
>
> Just out of interest does using pypy help you at all?
>
>> Of course, larger number of 9's percentiles would have additional inaccuracies when there are not enough operations in a sample period, but that is just user beware.
>>
>> diff --git a/tools/hist/fiologparser_hist.py 
>> b/tools/hist/fiologparser_hist.py index 62a4eb4..c77bb11 100755
>> --- a/tools/hist/fiologparser_hist.py
>> +++ b/tools/hist/fiologparser_hist.py
>
> <snip>

--
Sitsofe | http://sucs.org/~sits/
--
To unsubscribe from this list: send the line "unsubscribe fio" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: fiologparser_hist.py script patch and enhancements?
  2018-02-21 17:00             ` Kris Davis
@ 2018-02-21 17:23               ` Sitsofe Wheeler
  2018-02-21 17:45                 ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-02-21 17:23 UTC (permalink / raw)
  To: Kris Davis; +Cc: Vincent Fu, Vincent Fu, fio, Jens Axboe

Hi,

Alas no. It's less a case of "is the Python version at least 2.7"
and more a case of "not all platforms have a python2 link" (see
macOS) and "distros are banning interpreter lines that mention just
python" (see
https://fedoraproject.org/wiki/Packaging:Python#Multiple_Python_Runtimes).

On 21 February 2018 at 17:00, Kris Davis <Kris.Davis@wdc.com> wrote:
> I had changed the interpreter to just be "python" rather than "python2.7", and added a check to ensure that the python version was at least 2.7.  This allows it to use whatever version (2.7 or above) that has been associated with "python" (usually 2.7+ in recent linux os's).  What I've see (unless a virtualenv is used), python 3+ has a symlink set to python3.
> The point is, if the interpreter is set to python2.7, the user is generally "forced" to use 2.7, unless all command lines are prepended with the python??
>
> Does that address your concern?

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-02-21 17:23               ` Sitsofe Wheeler
@ 2018-02-21 17:45                 ` Kris Davis
  2018-02-21 18:24                   ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-21 17:45 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: Vincent Fu, Vincent Fu, fio, Jens Axboe

Sigh...
So much for trying to have it work for either python3 or python2.   

The doc you referenced does say: "If the executables provide the same functionality independent of whether they are run on top of Python 2 or Python 3, then only the Python 3 version of the executable should be packaged."

However, since 2.7 runs so much faster, and 2.7 is still the default for the near future, I agree we should stay with 2.7.   I'll back out that tweak and create a new patch.

Thanks
Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Wednesday, February 21, 2018 11:24 AM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: Vincent Fu <vincentfu@gmail.com>; Vincent Fu <Vincent.Fu@wdc.com>; fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>
Subject: Re: fiologparser_hist.py script patch and enhancements?

Hi,

Alas no. It's less of a case of "is the Python version at least 2.7"
and more a case of "not all platforms have a "python2" link (see
macOS) / distros are banning interpreters lines that mention just "python" (see https://fedoraproject.org/wiki/Packaging:Python#Multiple_Python_Runtimes
).

On 21 February 2018 at 17:00, Kris Davis <Kris.Davis@wdc.com> wrote:
> I had changed the interpreter to just be "python" rather than "python2.7", and added a check to ensure that the python version was at least 2.7.  This allows it to use whatever version (2.7 or above) that has been associated with "python" (usually 2.7+ in recent linux os's).  What I've see (unless a virtualenv is used), python 3+ has a symlink set to python3.
> The point is, if the interpreter is set to python2.7, the user is generally "forced" to use 2.7, unless all command lines are prepended with the python??
>
> Does that address your concern?

--
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: fiologparser_hist.py script patch and enhancements?
  2018-02-21 17:45                 ` Kris Davis
@ 2018-02-21 18:24                   ` Sitsofe Wheeler
  2018-02-26 21:56                     ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-02-21 18:24 UTC (permalink / raw)
  To: Kris Davis; +Cc: Vincent Fu, Vincent Fu, fio, Jens Axboe

For what it's worth I feel your pain and I think it's impossible to
win until there's only one version of Python actively left. My hope is
that by making it consistent across all the fio scripts everyone will
be in the same jam and will at least be able to do a find/replace.

Perhaps someone should suggest a PEP for an interpreter line that is
just something like preferredpython[2,3], where everyone is expected
to run a small conversion tool before starting to use it. Said
conversion tool can rewrite the line to be env based, absolute, prefer
python 3 to python 2 when it's a Sunday and the weather's good...

On 21 February 2018 at 17:45, Kris Davis <Kris.Davis@wdc.com> wrote:
> Sigh...
> So much for trying to have it work for either python3 or python2.
>
> The doc you referenced does say: "If the executables provide the same functionality independent of whether they are run on top of Python 2 or Python 3, then only the Python 3 version of the executable should be packaged."
>
> However, since 2.7 runs so much faster, and 2.7 is still the default for the near future, I agree we should stay with 2.7.   I'll back out that tweak and create a new patch.

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-02-21 18:24                   ` Sitsofe Wheeler
@ 2018-02-26 21:56                     ` Kris Davis
  2018-03-01 14:39                       ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-02-26 21:56 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: Vincent Fu, Vincent Fu, fio, Jens Axboe

Here is an updated patch against 'master' without a change to the shebang: 


diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
index 62a4eb4..f71e6d0
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -16,7 +16,10 @@
 import os
 import sys
 import pandas
+import re
 import numpy as np
+
+runascmd = False
 
 err = sys.stderr.write
 
@@ -64,8 +67,20 @@
 def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)
 
-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
 
 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number
@@ -178,7 +193,11 @@
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
     row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))
 
 def update_extreme(val, fncn, new_val):
@@ -207,7 +226,7 @@
             
         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]
 
@@ -241,7 +260,7 @@
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,7 +269,12 @@
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
     return bins[idx[1][0]]
 
 def main(ctx):
@@ -274,9 +298,18 @@
                         ctx.interval = int(hist_msec)
                 except NoOptionError:
                     pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
 
     if ctx.interval is None:
         ctx.interval = 1000
+
+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
 
     # Automatically detect how many columns are in the input files,
     # calculate the corresponding 'coarseness' parameter used to generate
@@ -339,6 +372,7 @@
 
 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+')
@@ -385,5 +419,18 @@
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')
 
+    arg('--percentiles',
+        default="90:95:99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0:95.0:99.0".  min, median(50%%) and max percentiles are always printed')
+    
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99. fio uses ns for version >= 2.99')
+    
+    
+
     main(p.parse_args())
 
diff --git a/tools/hist/fiologparser_hist_nw.py b/tools/hist/fiologparser_hist_nw.py
new file mode 100755
index 0000000..3e1be32
--- /dev/null
+++ b/tools/hist/fiologparser_hist_nw.py
@@ -0,0 +1,379 @@
+#!/usr/bin/python2.7
+""" 
+    Utility for converting *_clat_hist* files generated by fio into latency statistics.
+    
+    Example usage:
+    
+            $ fiologparser_hist.py *_clat_hist*
+            end-time, samples, min, avg, median, 90%, 95%, 99%, max
+            1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
+            2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
+            4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
+            ...
+    
+    @author Karl Cronburg <karl.cronburg@gmail.com>
+    modification by Kris Davis <shimrot@gmail.com>
+"""
+import os
+import sys
+import re
+import numpy as np
+
+runascmd = False
+
+err = sys.stderr.write
+
+class HistFileRdr():
+    """ Class to read a hist file line by line, buffering 
+        a value array for the latest line, and allowing a preview
+        of the next timestamp in next line
+        Note: this does not follow a generator pattern, but must explicitly
+        get next bin array.
+    """
+    def __init__(self, file):
+        self.fp = open(file, 'r')
+        self.data = self.nextData()
+        
+    def close(self):
+        self.fp.close()
+        self.fp = None
+        
+    def nextData(self):
+        self.data = None
+        if self.fp: 
+            line = self.fp.readline()
+            if line == "":
+                self.close()
+            else:
+                self.data = [int(x) for x in line.replace(' ', '').rstrip().split(',')]
+                
+        return self.data
+ 
+    @property
+    def curTS(self):
+        ts = None
+        if self.data:
+            ts = self.data[0]
+        return ts
+             
+    @property
+    def curBins(self):
+        return self.data[3:]
+                
+    
+
+def weighted_percentile(percs, vs, ws):
+    """ Use linear interpolation to calculate the weighted percentile.
+        
+        Value and weight arrays are first sorted by value. The cumulative
+        distribution function (cdf) is then computed, after which np.interp
+        finds the two values closest to our desired weighted percentile(s)
+        and linearly interpolates them.
+        
+        percs  :: List of percentiles we want to calculate
+        vs     :: Array of values we are computing the percentile of
+        ws     :: Array of weights for our corresponding values
+        return :: Array of percentiles
+    """
+    idx = np.argsort(vs)
+    vs, ws = vs[idx], ws[idx] # weights and values sorted by value
+    cdf = 100 * (ws.cumsum() - ws / 2.0) / ws.sum()
+    return np.interp(percs, cdf, vs) # linear interpolation
+
+def weighted_average(vs, ws):
+    return np.sum(vs * ws) / np.sum(ws)
+
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
+
+def fmt_float_list(ctx, num=1):
+  """ Return a comma separated list of float formatters to the required number
+      of decimal places. For instance:
+
+        fmt_float_list(ctx.decimals=4, num=3) == "%.4f, %.4f, %.4f"
+  """
+  return ', '.join(["%%.%df" % ctx.decimals] * num)
+
+# Default values - see beginning of main() for how we detect number columns in
+# the input files:
+__HIST_COLUMNS = 1216
+__NON_HIST_COLUMNS = 3
+__TOTAL_COLUMNS = __HIST_COLUMNS + __NON_HIST_COLUMNS
+    
+def get_min(fps, arrs):
+    """ Find the file with the current first row with the smallest start time """
+    return min([fp for fp in fps if not arrs[fp] is None], key=lambda fp: arrs.get(fp)[0][0])
+
+def _plat_idx_to_val(idx, edge=0.5, FIO_IO_U_PLAT_BITS=6, FIO_IO_U_PLAT_VAL=64):
+    """ Taken from fio's stat.c for calculating the latency value of a bin
+        from that bin's index.
+        
+            idx  : the value of the index into the histogram bins
+            edge : fractional value in the range [0,1]** indicating how far into
+            the bin we wish to compute the latency value of.
+        
+        ** edge = 0.0 and 1.0 computes the lower and upper latency bounds
+           respectively of the given bin index. """
+
+    # MSB <= (FIO_IO_U_PLAT_BITS-1), cannot be rounded off. Use
+    # all bits of the sample as index
+    if (idx < (FIO_IO_U_PLAT_VAL << 1)):
+        return idx 
+
+    # Find the group and compute the minimum value of that group
+    error_bits = (idx >> FIO_IO_U_PLAT_BITS) - 1 
+    base = 1 << (error_bits + FIO_IO_U_PLAT_BITS)
+
+    # Find its bucket number of the group
+    k = idx % FIO_IO_U_PLAT_VAL
+
+    # Return the mean (if edge=0.5) of the range of the bucket
+    return base + ((k + edge) * (1 << error_bits))
+    
+def plat_idx_to_val_coarse(idx, coarseness, edge=0.5):
+    """ Converts the given *coarse* index into a non-coarse index as used by fio
+        in stat.h:plat_idx_to_val(), subsequently computing the appropriate
+        latency value for that bin.
+        """
+
+    # Multiply the index by the power of 2 coarseness to get the bin
+    # bin index with a max of 1536 bins (FIO_IO_U_PLAT_GROUP_NR = 24 in stat.h)
+    stride = 1 << coarseness
+    idx = idx * stride
+    lower = _plat_idx_to_val(idx, edge=0.0)
+    upper = _plat_idx_to_val(idx + stride, edge=1.0)
+    return lower + (upper - lower) * edge
+
+def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
+    ps = weighted_percentile(percs, vs, ws)
+
+    avg = weighted_average(vs, ws)
+    values = [mn, avg] + list(ps) + [mx]
+    row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
+    print (fmt % tuple(row))
+
+def update_extreme(val, fncn, new_val):
+    """ Calculate min / max in the presence of None values """
+    if val is None: return new_val
+    else: return fncn(val, new_val)
+
+# See beginning of main() for how bin_vals are computed
+bin_vals = []
+lower_bin_vals = [] # lower edge of each bin
+upper_bin_vals = [] # upper edge of each bin 
+
+def process_interval(ctx, iHist, iEnd):
+    """ print estimated percentages for the given merged sample
+    """
+    ss_cnt = 0 # number of samples affecting this interval
+    mn_bin_val, mx_bin_val = None, None
+   
+    # Update total number of samples affecting current interval histogram:
+    ss_cnt += np.sum(iHist)
+        
+    # Update min and max bin values
+    idxs = np.nonzero(iHist != 0)[0]
+    if idxs.size > 0:
+        mn_bin_val = bin_vals[idxs[0]]
+        mx_bin_val = bin_vals[idxs[-1]]
+
+    if ss_cnt > 0: print_all_stats(ctx, iEnd, mn_bin_val, ss_cnt, bin_vals, iHist, mx_bin_val)
+
+def guess_max_from_bins(ctx, hist_cols):
+    """ Try to guess the GROUP_NR from given # of histogram
+        columns seen in an input file """
+    max_coarse = 8
+    if ctx.group_nr < 19 or ctx.group_nr > 26:
+        bins = [ctx.group_nr * (1 << 6)]
+    else:
+        bins = [1216,1280,1344,1408,1472,1536,1600,1664]
+    coarses = range(max_coarse + 1)
+    fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else -10, coarses))
+    
+    arr = np.transpose(list(map(fncn, bins)))
+    idx = np.where(arr == hist_cols)
+    if len(idx[1]) == 0:
+        table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
+            "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
+            "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
+            "This number needs to be equal to one of the following numbers:\n\n"
+            + table + "\n\n"
+            "Possible reasons and corresponding solutions:\n"
+            "  - Input file(s) does not contain histograms.\n"
+            "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
+            "    new GROUP_NR on the command line with --group_nr\n")
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
+    return bins[idx[1][0]]
+
+def main(ctx):
+
+    if ctx.job_file:
+        try:
+            from configparser import SafeConfigParser, NoOptionError
+        except ImportError:
+            from ConfigParser import SafeConfigParser, NoOptionError
+
+        cp = SafeConfigParser(allow_no_value=True)
+        with open(ctx.job_file, 'r') as fp:
+            cp.readfp(fp)
+
+        if ctx.interval is None:
+            # Auto detect --interval value
+            for s in cp.sections():
+                try:
+                    hist_msec = cp.get(s, 'log_hist_msec')
+                    if hist_msec is not None:
+                        ctx.interval = int(hist_msec)
+                except NoOptionError:
+                    pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
+
+    if ctx.interval is None:
+        ctx.interval = 1000
+        
+    # Automatically detect how many columns are in the input files,
+    # calculate the corresponding 'coarseness' parameter used to generate
+    # those files, and calculate the appropriate bin latency values:
+    with open(ctx.FILE[0], 'r') as fp:
+        global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS
+        __TOTAL_COLUMNS = len(fp.readline().split(','))
+        __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS
+
+        max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
+        coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
+        bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        lower_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 0.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        upper_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 1.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+
+    fps = [HistFileRdr(f) for f in ctx.FILE]
+
+    print(', '.join(columns))
+
+    start = 0
+    end = ctx.interval
+    while True:
+        
+        more_data = False
+        
+        # add bins from all files in target intervals
+        arr = None
+        numSamples = 0
+        while True:
+            foundSamples = False
+            for fp in fps:
+                ts = fp.curTS
+                if ts and ts+10 < end:  # shift sample time when very close to an end time                 
+                    numSamples += 1
+                    foundSamples = True
+                    if arr is None: 
+                        arr = np.zeros(shape=(__HIST_COLUMNS), dtype=int)
+                    arr = np.add(arr, fp.curBins)
+                    more_data = True
+                    fp.nextData()
+                elif ts:
+                    more_data = True
+            
+            # reached end of all files
+            # or gone through all files without finding sample in interval 
+            if not more_data or not foundSamples:
+                break
+        
+        if arr is not None:
+            #print("{} size({}) samples({}) nonzero({}):".format(end, arr.size, numSamples, np.count_nonzero(arr)), str(arr), )
+            process_interval(ctx, arr, end)         
+        
+        # reach end of all files
+        if not more_data:
+            break
+            
+        start += ctx.interval
+        end = start + ctx.interval
+        
+        #if end > 20000: break
+
+
+if __name__ == '__main__':
+    import argparse
+    runascmd = True
+    p = argparse.ArgumentParser()
+    arg = p.add_argument
+    arg("FILE", help='space separated list of latency log filenames', nargs='+')
+    arg('--buff_size',
+        default=10000,
+        type=int,
+        help='number of samples to buffer into numpy at a time')
+
+    arg('--max_latency',
+        default=20,
+        type=float,
+        help='number of seconds of data to process at a time')
+
+    arg('-i', '--interval',
+        type=int,
+        help='interval width (ms), default 1000 ms '
+        '(no weighting between samples performed, results represent sample period only)')
+
+    arg('-d', '--divisor',
+        required=False,
+        type=int,
+        default=1,
+        help='divide the results by this value.')
+
+    arg('--decimals',
+        default=3,
+        type=int,
+        help='number of decimal places to print floats to')
+
+    arg('--warn',
+        dest='warn',
+        action='store_true',
+        default=False,
+        help='print warning messages to stderr')
+
+    arg('--group_nr',
+        default=29,
+        type=int,
+        help='FIO_IO_U_PLAT_GROUP_NR as defined in stat.h')
+
+    arg('--job-file',
+        default=None,
+        type=str,
+        help='Optional argument pointing to the job file used to create the '
+             'given histogram files. Useful for auto-detecting --log_hist_msec and '
+             '--log_unix_epoch (in fio) values.')
+
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+        
+
+    main(p.parse_args())
+
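
For reference, a hypothetical invocation of the new non-weighting script (log filenames are placeholders) would look something like:

    $ tools/hist/fiologparser_hist_nw.py --percentiles 90,99 *_clat_hist*.log

with one output row per --interval, bins summed across all the given logs.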






Kris Davis

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@gmail.com] 
Sent: Wednesday, February 21, 2018 12:25 PM
To: Kris Davis <Kris.Davis@wdc.com>
Cc: Vincent Fu <vincentfu@gmail.com>; Vincent Fu <Vincent.Fu@wdc.com>; fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>
Subject: Re: fiologparser_hist.py script patch and enhancements?

For what it's worth I feel your pain and I think it's impossible to win until there's only one version of Python actively left. My hope is by making it consistent across all the fio scripts everyone will be the same jam and will at least be able to do a find/replace.

Perhaps someone should suggest a PEP which is an interpreter line that is just a line like preferredpython[2,3] and everyone is just expected to run a small conversion tool before starting to use it. Said conversion tool can rewrite the line to be env based, absolute, prefer python 3 to python 2 if when it's a Sunday and the weather's good...

On 21 February 2018 at 17:45, Kris Davis <Kris.Davis@wdc.com> wrote:
> Sigh...
> So much for trying to have it work for either python3 or python2.
>
> The doc you referenced does say: "If the executables provide the same functionality independent of whether they are run on top of Python 2 or Python 3, then only the Python 3 version of the executable should be packaged."
>
> However, since 2.7 runs so much faster, and 2.7 is still the default for the near future, I agree we should stay with 2.7.   I'll back out that tweak and create a new patch.

--
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: fiologparser_hist.py script patch and enhancements?
  2018-02-26 21:56                     ` Kris Davis
@ 2018-03-01 14:39                       ` Sitsofe Wheeler
  2018-03-15 17:02                         ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-03-01 14:39 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Jens Axboe

Hi, I've been going over this:

On 26 February 2018 at 21:56, Kris Davis <Kris.Davis@wdc.com> wrote:
> Here is an updated patch against 'master' without a change to the shabang:
>
>
> diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
> index 62a4eb4..f71e6d0
> --- a/tools/hist/fiologparser_hist.py
> +++ b/tools/hist/fiologparser_hist.py
> @@ -16,7 +16,10 @@
>  import os
>  import sys
>  import pandas
> +import re
>  import numpy as np
> +
> +runascmd = False
>
>  err = sys.stderr.write
>
> @@ -64,8 +67,20 @@
>  def weighted_average(vs, ws):
>      return np.sum(vs * ws) / np.sum(ws)
>
> -columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
> -percs   = [50, 90, 95, 99]
> +
> +percs = None
> +columns = None
> +
> +def gen_output_columns(percentiles):
> +    global percs,columns
> +    strpercs = re.split('[,:]', percentiles)
> +    percs = [50.0]  # always print 50% in 'median' column
> +    percs.extend(list(map(float,strpercs)))
> +    columns = ["end-time", "samples", "min", "avg", "median"]
> +    columns.extend(list(map(lambda x: x+'%', strpercs)))
> +    columns.append("max")
> +

Allegedly there's trailing whitespace here but it's hard to tell if
that's just how the patch was transferred...

> +
>
>  def fmt_float_list(ctx, num=1):
>    """ Return a comma separated list of float formatters to the required number
> @@ -178,7 +193,11 @@
>      avg = weighted_average(vs, ws)
>      values = [mn, avg] + list(ps) + [mx]
>      row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
> -    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
> +    if ctx.divisor > 1:
> +        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)

Does the + need space around it?

> +    else:
> +        # max and min are decimal values if no divisor
> +        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"

Same here - space around the "+" of +1?

<snip>

> diff --git a/tools/hist/fiologparser_hist_nw.py b/tools/hist/fiologparser_hist_nw.py
> new file mode 100755
> index 0000000..3e1be32
> --- /dev/null
> +++ b/tools/hist/fiologparser_hist_nw.py

It just seems a shame there's no way to optionally skip the
weighting step in fiologparser_hist.py... So much of the code is the
same between these two files that it's going to be a maintenance burden
keeping them in sync going forwards. I see two approaches:

1. Refactor the common code into a library and write two different
frontends that use the common library code.
2. Keep one script but somehow let it contain non-weighted vs weighted paths.

Thoughts?

At a minimum it would be good to land the changes to
fiologparser_hist.py as they seem good and not controversial...

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-03-01 14:39                       ` Sitsofe Wheeler
@ 2018-03-15 17:02                         ` Kris Davis
  2018-03-17 10:00                           ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-03-15 17:02 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe

Sorry, I've been unavailable for a couple of weeks. 

I chose option #2: 
2. Keep one script but somehow let it contain non-weighted vs weighted paths.

I merged the changes from my non-weighting version into the original script and added a "--noweight" option.  I didn't try to combine the file I/O methods, but just split the latter part of the "main" function into "output_interval_data" and "output_weighted_interval_data", for lack of more creative names :).
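
As an illustration (log filenames are placeholders), the merged script would then be run either as before, or with the new flag, e.g.:

    $ tools/hist/fiologparser_hist.py --noweight --interval 1000 --percentiles 90:99 *_clat_hist*.log

which sums the samples falling in each interval rather than weighting them across interval boundaries.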

Below is the new patch 
Thanks
Kris Davis


diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
index 62a4eb4..7acd2d3
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -16,9 +16,49 @@
 import os
 import sys
 import pandas
+import re
 import numpy as np
 
+runascmd = False
+
 err = sys.stderr.write
+
+class HistFileRdr():
+    """ Class to read a hist file line by line, buffering 
+        a value array for the latest line, and allowing a preview
+        of the next timestamp in next line
+        Note: this does not follow a generator pattern, but must explicitly
+        get next bin array.
+    """
+    def __init__(self, file):
+        self.fp = open(file, 'r')
+        self.data = self.nextData()
+        
+    def close(self):
+        self.fp.close()
+        self.fp = None
+        
+    def nextData(self):
+        self.data = None
+        if self.fp: 
+            line = self.fp.readline()
+            if line == "":
+                self.close()
+            else:
+                self.data = [int(x) for x in line.replace(' ', '').rstrip().split(',')]
+                
+        return self.data
+ 
+    @property
+    def curTS(self):
+        ts = None
+        if self.data:
+            ts = self.data[0]
+        return ts
+             
+    @property
+    def curBins(self):
+        return self.data[3:]
 
 def weighted_percentile(percs, vs, ws):
     """ Use linear interpolation to calculate the weighted percentile.
@@ -42,7 +82,7 @@
     """ Calculate weights based on fraction of sample falling in the
         given interval [start,end]. Weights computed using vector / array
         computation instead of for-loops.
-    
+
         Note that samples with zero time length are effectively ignored
         (we set their weight to zero).
 
@@ -64,8 +104,18 @@
 def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)
 
-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
 
 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number
@@ -80,7 +130,7 @@
 __HIST_COLUMNS = 1216
 __NON_HIST_COLUMNS = 3
 __TOTAL_COLUMNS = __HIST_COLUMNS + __NON_HIST_COLUMNS
-    
+
 def read_chunk(rdr, sz):
     """ Read the next chunk of size sz from the given reader. """
     try:
@@ -88,7 +138,7 @@
             occurs if rdr is None due to the file being empty. """
         new_arr = rdr.read().values
     except (StopIteration, AttributeError):
-        return None    
+        return None
 
     """ Extract array of just the times, and histograms matrix without times column. """
     times, rws, szs = new_arr[:,0], new_arr[:,1], new_arr[:,2]
@@ -178,7 +228,11 @@
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
     row = [end, ss_cnt] + [float(x) / ctx.divisor for x in values]
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))
 
 def update_extreme(val, fncn, new_val):
@@ -191,33 +245,51 @@
 lower_bin_vals = [] # lower edge of each bin
 upper_bin_vals = [] # upper edge of each bin 
 
-def process_interval(ctx, samples, iStart, iEnd):
+def process_interval(ctx, iHist, iEnd):
+    """ print estimated percentages for the given merged sample
+    """
+    ss_cnt = 0 # number of samples affecting this interval
+    mn_bin_val, mx_bin_val = None, None
+   
+    # Update total number of samples affecting current interval histogram:
+    ss_cnt += np.sum(iHist)
+        
+    # Update min and max bin values
+    idxs = np.nonzero(iHist != 0)[0]
+    if idxs.size > 0:
+        mn_bin_val = bin_vals[idxs[0]]
+        mx_bin_val = bin_vals[idxs[-1]]
+
+    if ss_cnt > 0: print_all_stats(ctx, iEnd, mn_bin_val, ss_cnt, bin_vals, iHist, mx_bin_val)
+
+
+def process_weighted_interval(ctx, samples, iStart, iEnd):
     """ Construct the weighted histogram for the given interval by scanning
         through all the histograms and figuring out which of their bins have
         samples with latencies which overlap with the given interval
         [iStart,iEnd].
     """
-    
+
     times, files, hists = samples[:,0], samples[:,1], samples[:,2:]
     iHist = np.zeros(__HIST_COLUMNS)
     ss_cnt = 0 # number of samples affecting this interval
     mn_bin_val, mx_bin_val = None, None
 
     for end_time,file,hist in zip(times,files,hists):
-            
+
         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]
 
         # Increment current interval histogram by weighted values of future histogram:
         ws = hs * weights(s_ts, end_time, iStart, iEnd)
         iHist[idx] += ws
-    
+
         # Update total number of samples affecting current interval histogram:
         ss_cnt += np.sum(hs)
-        
+
         # Update min and max bin values seen if necessary:
         idx = np.where(hs != 0)[0]
         if idx.size > 0:
@@ -241,7 +313,7 @@
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,47 +322,15 @@
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
     return bins[idx[1][0]]
 
-def main(ctx):
-
-    if ctx.job_file:
-        try:
-            from configparser import SafeConfigParser, NoOptionError
-        except ImportError:
-            from ConfigParser import SafeConfigParser, NoOptionError
-
-        cp = SafeConfigParser(allow_no_value=True)
-        with open(ctx.job_file, 'r') as fp:
-            cp.readfp(fp)
-
-        if ctx.interval is None:
-            # Auto detect --interval value
-            for s in cp.sections():
-                try:
-                    hist_msec = cp.get(s, 'log_hist_msec')
-                    if hist_msec is not None:
-                        ctx.interval = int(hist_msec)
-                except NoOptionError:
-                    pass
-
-    if ctx.interval is None:
-        ctx.interval = 1000
-
-    # Automatically detect how many columns are in the input files,
-    # calculate the corresponding 'coarseness' parameter used to generate
-    # those files, and calculate the appropriate bin latency values:
-    with open(ctx.FILE[0], 'r') as fp:
-        global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS
-        __TOTAL_COLUMNS = len(fp.readline().split(','))
-        __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS
-
-        max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
-        coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
-        bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness) for x in np.arange(__HIST_COLUMNS)], dtype=float)
-        lower_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 0.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
-        upper_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 1.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+def output_weighted_interval_data(ctx):
 
     fps = [open(f, 'r') for f in ctx.FILE]
     gen = histogram_generator(ctx, fps, ctx.buff_size)
@@ -322,7 +362,7 @@
                     start = start - (start % ctx.interval)
                     end = start + ctx.interval
 
-                process_interval(ctx, arr, start, end)
+                process_weighted_interval(ctx, arr, start, end)
                 
                 # Update arr to throw away samples we no longer need - samples which
                 # end before the start of the next interval, i.e. the end of the
@@ -335,10 +375,112 @@
     finally:
         for fp in fps:
             fp.close()
+ 
+def output_interval_data(ctx):
+    fps = [HistFileRdr(f) for f in ctx.FILE]
+
+    print(', '.join(columns))
+
+    start = 0
+    end = ctx.interval
+    while True:
+        
+        more_data = False
+        
+        # add bins from all files in target intervals
+        arr = None
+        numSamples = 0
+        while True:
+            foundSamples = False
+            for fp in fps:
+                ts = fp.curTS
+                if ts and ts+10 < end:  # shift sample time when very close to an end time                 
+                    numSamples += 1
+                    foundSamples = True
+                    if arr is None: 
+                        arr = np.zeros(shape=(__HIST_COLUMNS), dtype=int)
+                    arr = np.add(arr, fp.curBins)
+                    more_data = True
+                    fp.nextData()
+                elif ts:
+                    more_data = True
+            
+            # reached end of all files
+            # or gone through all files without finding sample in interval 
+            if not more_data or not foundSamples:
+                break
+        
+        if arr is not None:
+            #print("{} size({}) samples({}) nonzero({}):".format(end, arr.size, numSamples, np.count_nonzero(arr)), str(arr), )
+            process_interval(ctx, arr, end)         
+        
+        # reach end of all files
+        if not more_data:
+            break
+            
+        start += ctx.interval
+        end = start + ctx.interval
+
+ 
+def main(ctx):
+
+    if ctx.job_file:
+        try:
+            from configparser import SafeConfigParser, NoOptionError
+        except ImportError:
+            from ConfigParser import SafeConfigParser, NoOptionError
+
+        cp = SafeConfigParser(allow_no_value=True)
+        with open(ctx.job_file, 'r') as fp:
+            cp.readfp(fp)
+
+        if ctx.interval is None:
+            # Auto detect --interval value
+            for s in cp.sections():
+                try:
+                    hist_msec = cp.get(s, 'log_hist_msec')
+                    if hist_msec is not None:
+                        ctx.interval = int(hist_msec)
+                except NoOptionError:
+                    pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
+
+    if ctx.interval is None:
+        ctx.interval = 1000
+
+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
+
+
+    # Automatically detect how many columns are in the input files,
+    # calculate the corresponding 'coarseness' parameter used to generate
+    # those files, and calculate the appropriate bin latency values:
+    with open(ctx.FILE[0], 'r') as fp:
+        global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS
+        __TOTAL_COLUMNS = len(fp.readline().split(','))
+        __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS
+
+        max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
+        coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
+        bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        lower_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 0.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+        upper_bin_vals = np.array([plat_idx_to_val_coarse(x, coarseness, 1.0) for x in np.arange(__HIST_COLUMNS)], dtype=float)
+
+    
+    if ctx.noweight:
+        output_interval_data(ctx)
+    else:
+        output_weighted_interval_data(ctx)
 
 
 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+')
@@ -355,6 +497,11 @@
     arg('-i', '--interval',
         type=int,
         help='interval width (ms), default 1000 ms')
+
+    arg('--noweight',
+        action='store_true',
+        default=False,
+        help='do not perform weighting of samples between output intervals')
 
     arg('-d', '--divisor',
         required=False,
@@ -385,5 +532,16 @@
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')
 
+    arg('--percentiles',
+        default="90:95:99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0:95.0:99.0".  min, median(50%%) and max percentiles are always printed')
+    
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99); fio uses ns for versions >= 2.99')
+
     main(p.parse_args())


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: fiologparser_hist.py script patch and enhancements?
  2018-03-15 17:02                         ` Kris Davis
@ 2018-03-17 10:00                           ` Sitsofe Wheeler
  2018-03-19 15:45                             ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-03-17 10:00 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Jens Axboe

Hi,

On 15 March 2018 at 17:02, Kris Davis <Kris.Davis@wdc.com> wrote:
> Sorry I've been unavailable for a couple of weeks.
>
> Chose #2:
> 2. Keep one script but somehow let it contain non-weighted vs weighted paths.
>
> I merged the changes from my "no weighted" version with the original script, and added a "--noweight" option.  I didn't try to combine the file i/o methods, but just split the latter part of the "main" function into "output_interval_data" and "output_weighted_interval_data", for lack of more creative names :).
>
> Below is the new patch

By some chance you wouldn't have this up in a github repo too? GMail's
web client seems to do unspeakable whitespace changes to patches (I
wound up copying and pasting from
https://www.spinics.net/lists/fio/msg06872.html in the end). I still
get grumbles about trailing whitespace from git but again I don't know
if that's just how I got the patch.

Could you update the tool's manpage
(tools/hist/fiologparser_hist.py.1) with your changes too?

Shouldn't __HIST_COLUMNS be changed depending on whether micro or
nanoseconds are being used? It looks like it matches FIO_IO_U_PLAT_NR
which is calculated by (1 << FIO_IO_U_PLAT_BITS) *
FIO_IO_U_PLAT_GROUP_NR which for fio-3.5 is (1 << 6) * 29  = 1856 .
Commit https://github.com/axboe/fio/commit/d6bb626ef37d3905221ade2887b422717a07af09
seems to be the one that changed the value of FIO_IO_U_PLAT_GROUP_NR
...

Beyond that a simple run worked for me with the following:

cd /tmp/
fio --name=test --rw=read --runtime=5s --time_based
--filename=/tmp/fio.tmp --size=1M --write_hist_log=test
--log_hist_msec=100
fiologparser_hist.py test_clat_hist.1.log
fiologparser_hist.py --noweight --percentiles 0,1,50 test_clat_hist.1.log

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-03-17 10:00                           ` Sitsofe Wheeler
@ 2018-03-19 15:45                             ` Kris Davis
  2018-03-20  5:57                               ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-03-19 15:45 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe

Sitsofe, 

Thanks for being diligent with this one.  I haven't run into grumbles about whitespace before, so I'm unclear what you are seeing.  However, I'll try to set up a github clone and submit changes there.  I've not really used github for anything other than some coursework before, nothing collaborative.  I've only been submitting patches in email at the suggestion of others, since it seemed to be what others have done.

I'm unclear about __HIST_COLUMNS.  The actual value appears to be discovered from the input log files.  However, the --group_nr option was in the original script to override the expected FIO_IO_U_PLAT_GROUP_NR value.  I don't really understand the relationship between the FIO_IO_U_PLAT_GROUP_NR value, the HIST_COLUMNS, and whether the bins are in microseconds or nanoseconds.  If there is a known consistent relationship, maybe the options can be made simpler.

I've updated the man page, now that I understand the need for manual editing.  I've also implemented yet another option.  I found I personally needed separate results for reads and writes, so I added a "--directions=rwtm" option to let a person specify which directions they want output for.  If the --directions option is not provided, the output is the normal mixed/combined results.  If --directions is specified, its characters indicate which directions' results to create, and a "dir" column is added to the output.  You can have both directional and 'mixed' rows as desired.
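For example (just to illustrate the intent, not actual output), a run like

fiologparser_hist.py --directions=rw test_clat_hist.1.log

should print separate read and write rows, each tagged in the added "dir" column, and adding an 'm' to the option would include the mixed rows as well.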

I've implemented it and done some quick tests, but want to run some more careful variations to make sure the directional results are correct; the mixed results are easy enough to compare against old results.

So, be patient for the next chapter in the epic fiologparser_hist patches...
Thanks Much
Kris Davis


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-03-19 15:45                             ` Kris Davis
@ 2018-03-20  5:57                               ` Kris Davis
  2018-03-22 14:47                                 ` Sitsofe Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-03-20  5:57 UTC (permalink / raw)
  To: Kris Davis, Sitsofe Wheeler; +Cc: fio, Jens Axboe

I created a pull request on github with the latest changes, both to fiologparser_hist.py and the associated man page.  This includes a new "--directions=rwtm" option to allow independent directional results to be printed (with an added 'dir' column).

If there is a way to infer whether the histogram results are in 'ns' or 'us' from the number of columns, please let me know.  I wonder if the 'coarseness' option might be a factor that complicates things.  I'm mostly relying on the original script's operation, rather than having a more complete understanding of the dependencies.

Thanks

Kris Davis


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: fiologparser_hist.py script patch and enhancements?
  2018-03-20  5:57                               ` Kris Davis
@ 2018-03-22 14:47                                 ` Sitsofe Wheeler
  2018-03-23 17:22                                   ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Sitsofe Wheeler @ 2018-03-22 14:47 UTC (permalink / raw)
  To: Kris Davis; +Cc: fio, Jens Axboe

On 20 March 2018 at 05:57, Kris Davis <Kris.Davis@wdc.com> wrote:
> I created a pull request on github with the latest changes, both to fiologparser_hist.py and the associated man page.   This includes a new "--directions=rwtm" option to allow independent directional results to be printed (with an added 'dir' column).

I see Jens pulled it in so congrats on staying the course!

> If there is a way to infer whether the histogram results are in 'ns' or 'us' from the number of columns, please let me know.  I wonder if the 'coarseness' option might be a factor that complicates things.  I'm mostly relying on the original script's operation, rather than having a more complete understanding of the dependencies.

The coarseness definitely complicates things but here are my thoughts:

If you know the total columns and you're willing to assume fio compile
defaults you can try and guess whether FIO_IO_U_PLAT_GROUP_NR is 19
(in which case you assume usec) or 29 (in which case you assume nsec).

From iolog.c:
 723         int stride = 1 << hist_coarseness;
[...]
 734                 s = __get_sample(samples, log_offset, i);
 735
 736                 entry = s->data.plat_entry;
 737                 io_u_plat = entry->io_u_plat;
 738
 739                 entry_before = flist_first_entry(&entry->list,
struct io_u_plat_entry, list);
 740                 io_u_plat_before = entry_before->io_u_plat;
 741
 742                 fprintf(f, "%lu, %u, %u, ", (unsigned long) s->time,
 743                                                 io_sample_ddir(s), s->bs);
 744                 for (j = 0; j < FIO_IO_U_PLAT_NR - stride; j += stride) {
 745                         fprintf(f, "%llu, ", (unsigned long long)
 746                                 hist_sum(j, stride, io_u_plat,
io_u_plat_before));
 747                 }
 748                 fprintf(f, "%llu\n", (unsigned long long)
 749                         hist_sum(FIO_IO_U_PLAT_NR - stride,
stride, io_u_plat,
 750                                         io_u_plat_before));

So we can tell the number of columns is related to 3 +
(FIO_IO_U_PLAT_NR / (2 ^ coarseness)) + 1. The 3 comes from printing
the time, the data direction and the blocksize columns. In Python:

def main():
    for pgn in [19, 29]:
        print("pgn=%d:" % (pgn)),
        for c in range(7):
            pn = (1 << 6) * pgn
            s = (1 << c)
            print(((pn - s) / s) + 4),
        print("")

if __name__ == "__main__":
    main()

prints the following:
pgn=19: 1219 611 307 155 79 41 22
pgn=29: 1859 931 467 235 119 61 32

Quick comparison:
$ ./fio --version
fio-3.5-70-g40f1
$ ./fio --name=test --rw=read --runtime=200ms --time_based
--filename=/tmp/fio.tmp --size=1M --write_hist_log=test
--log_hist_msec=100 --log_hist_coarseness=0
$ head -1 test_clat_hist.1.log | awk -F',' '{print NF}'
1859
./fio --name=test --rw=read --runtime=200ms --time_based
--filename=/tmp/fio.tmp --size=1M --write_hist_log=test
--log_hist_msec=100 --log_hist_coarseness=6
$ head -1 test_clat_hist.1.log | awk -F',' '{print NF}'
32

If we approximate the equation we get
((64 * FIO_IO_U_PLAT_GROUP_NR) / (2 ^ coarseness)) + 4 = total_columns;
which can be rearranged to
(64 * FIO_IO_U_PLAT_GROUP_NR) / (total_columns - 4) = (2 ^ coarseness)
log2((64 * FIO_IO_U_PLAT_GROUP_NR) / (total_columns - 4)) = coarseness

This allows us to guess and check a bit quicker. For example with
FIO_IO_U_PLAT_GROUP_NR=19, columns=611

import math

def check_columns(column_actual, pgn_guess):
    approx_coarse_guess = int(math.log((64 * pgn_guess) / float(column_actual - 4), 2))
    stride_guess = 1 << approx_coarse_guess
    column_guess = ((64 * pgn_guess - stride_guess) / stride_guess) + 4
    return column_guess == column_actual

>>> check_columns(611, 19)
True

A bigger test:
>>> [check_columns(x, 19) for x in (1219, 611, 307, 155, 79, 41, 22)]
[True, True, True, True, True, True, True]
>>> [check_columns(x, 19) for x in (1859, 931, 467, 235, 119, 61, 32)]
[False, False, False, False, False, False, False]
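
Putting it together, a rough sketch of the guess (reusing the
check_columns() helper above, and assuming fio compile defaults) might be:

def guess_nsec_bins(total_columns):
    # True  -> assume nsec bins (FIO_IO_U_PLAT_GROUP_NR = 29, fio >= 2.99)
    # False -> assume usec bins (FIO_IO_U_PLAT_GROUP_NR = 19, older fio)
    if check_columns(total_columns, 29):
        return True
    if check_columns(total_columns, 19):
        return False
    raise ValueError("column count matches neither default; "
                     "specify --group_nr / --usbin explicitly")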

If someone explicitly sets --group_nr they ought to set whether the
time is in usec or nsec (or generally override the default) as we
can't really guess any more.

What do you think?

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-03-22 14:47                                 ` Sitsofe Wheeler
@ 2018-03-23 17:22                                   ` Kris Davis
  0 siblings, 0 replies; 20+ messages in thread
From: Kris Davis @ 2018-03-23 17:22 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio, Jens Axboe

Sitsofe,

Wow!  It looks like you went the extra mile checking this out.  It's going to take me a bit to grok this; I'll have to get back to you.

Thanks much

Kris Davis


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: fiologparser_hist.py script patch and enhancements?
  2018-01-30 19:13 Kris Davis
@ 2018-02-01 19:03 ` Kris Davis
  0 siblings, 0 replies; 20+ messages in thread
From: Kris Davis @ 2018-02-01 19:03 UTC (permalink / raw)
  To: fio; +Cc: karl.cronburg

Other than an internal business comment, this hasn't generated any discussion.
If nothing more: here is the patch for both the modified fiologparser_hist.py script and the added non-weighted fiologparser_hist_nw.py script, generated from a git diff operation:

Thanks
Kris Davis


diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
index 2e05b92..fe9a951 100755
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python2.7
+#!/usr/bin/python
 """ 
     Utility for converting *_clat_hist* files generated by fio into latency statistics.
     
@@ -16,8 +16,16 @@
 import os
 import sys
 import pandas
+import re
 import numpy as np
 
+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
+
 err = sys.stderr.write
 
 def weighted_percentile(percs, vs, ws):
@@ -64,8 +72,20 @@ def weights(start_ts, end_ts, start, end):
 def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)
 
-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
 
 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number
@@ -118,6 +138,7 @@ def histogram_generator(ctx, fps, sz):
 
     # Initial histograms from disk:
     arrs = {fp: read_chunk(rdr, sz) for fp,rdr in rdrs.items()}
+    
     while True:
 
         try:
@@ -177,8 +198,12 @@ def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
 
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
-    row = [end, ss_cnt] + map(lambda x: float(x) / ctx.divisor, values)
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    row = [end, ss_cnt] + list(map(lambda x: float(x) / ctx.divisor, values))
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))
 
 def update_extreme(val, fncn, new_val):
@@ -207,7 +232,7 @@ def process_interval(ctx, samples, iStart, iEnd):
             
         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]
 
@@ -241,7 +266,7 @@ def guess_max_from_bins(ctx, hist_cols):
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,7 +275,12 @@ def guess_max_from_bins(ctx, hist_cols):
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
     return bins[idx[1][0]]
 
 def main(ctx):
@@ -274,10 +304,19 @@ def main(ctx):
                         ctx.interval = int(hist_msec)
                 except NoOptionError:
                     pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
 
     if ctx.interval is None:
         ctx.interval = 1000
 
+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
+
     # Automatically detect how many columns are in the input files,
     # calculate the corresponding 'coarseness' parameter used to generate
     # those files, and calculate the appropriate bin latency values:
@@ -288,9 +327,9 @@ def main(ctx):
 
         max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
         coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
-        bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS)), dtype=float)
-        lower_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS)), dtype=float)
-        upper_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS)), dtype=float)
+        bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS))), dtype=float)
+        lower_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS))), dtype=float)
+        upper_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS))), dtype=float)
 
     fps = [open(f, 'r') for f in ctx.FILE]
     gen = histogram_generator(ctx, fps, ctx.buff_size)
@@ -304,7 +343,7 @@ def main(ctx):
         while more_data or len(arr) > 0:
             
             # Read up to ctx.max_latency (default 20 seconds) of data from end of current interval.
-            while len(arr) == 0 or arr[-1][0] < ctx.max_latency * 1000 + end:
+            while len(arr) == 0 or arr[-1][0] < ctx.max_latency + end:
                 try:
                     new_arr = next(gen)
                 except StopIteration:
@@ -338,6 +377,7 @@ def main(ctx):
 
 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+')
@@ -384,5 +424,18 @@ if __name__ == '__main__':
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')
 
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+    
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99); fio uses ns for versions >= 2.99')
+    
+    
+
     main(p.parse_args())
 
diff --git a/tools/hist/fiologparser_hist_nw.py b/tools/hist/fiologparser_hist_nw.py
new file mode 100644
index 0000000..c8c6e87
--- /dev/null
+++ b/tools/hist/fiologparser_hist_nw.py
@@ -0,0 +1,384 @@
+#!/usr/bin/python
+""" 
+    Utility for converting *_clat_hist* files generated by fio into latency statistics.
+    
+    Example usage:
+    
+            $ fiologparser_hist.py *_clat_hist*
+            end-time, samples, min, avg, median, 90%, 95%, 99%, max
+            1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000
+            2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000
+            4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744
+            ...
+    
+    @author Karl Cronburg <karl.cronburg@gmail.com>
+    modification by Kris Davis <shimrot@gmail.com>
+"""
+import os
+import sys
+import re
+import numpy as np
+
+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
+
+err = sys.stderr.write
+
+class HistFileRdr():
+    """ Class to read a hist file line by line, buffering 
+        a value array for the latest line, and allowing a preview
+        of the next timestamp in next line
+        Note: this does not follow a generator pattern, but must explicitly
+        get next bin array.
+    """
+    def __init__(self, file):
+        self.fp = open(file, 'r')
+        self.data = self.nextData()
+        
+    def close(self):
+        self.fp.close()
+        self.fp = None
+        
+    def nextData(self):
+        self.data = None
+        if self.fp: 
+            line = self.fp.readline()
+            if line == "":
+                self.close()
+            else:
+                self.data = [int(x) for x in line.replace(' ', '').rstrip().split(',')]
+                
+        return self.data
+ 
+    @property
+    def curTS(self):
+        ts = None
+        if self.data:
+            ts = self.data[0]
+        return ts
+             
+    @property
+    def curBins(self):
+        return self.data[3:]
+                
+    
+
+def weighted_percentile(percs, vs, ws):
+    """ Use linear interpolation to calculate the weighted percentile.
+        
+        Value and weight arrays are first sorted by value. The cumulative
+        distribution function (cdf) is then computed, after which np.interp
+        finds the two values closest to our desired weighted percentile(s)
+        and linearly interpolates them.
+        
+        percs  :: List of percentiles we want to calculate
+        vs     :: Array of values we are computing the percentile of
+        ws     :: Array of weights for our corresponding values
+        return :: Array of percentiles
+    """
+    idx = np.argsort(vs)
+    vs, ws = vs[idx], ws[idx] # weights and values sorted by value
+    cdf = 100 * (ws.cumsum() - ws / 2.0) / ws.sum()
+    return np.interp(percs, cdf, vs) # linear interpolation
+
+def weighted_average(vs, ws):
+    return np.sum(vs * ws) / np.sum(ws)
+
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
+
+def fmt_float_list(ctx, num=1):
+  """ Return a comma separated list of float formatters to the required number
+      of decimal places. For instance:
+
+        fmt_float_list(ctx.decimals=4, num=3) == "%.4f, %.4f, %.4f"
+  """
+  return ', '.join(["%%.%df" % ctx.decimals] * num)
+
+# Default values - see beginning of main() for how we detect number columns in
+# the input files:
+__HIST_COLUMNS = 1216
+__NON_HIST_COLUMNS = 3
+__TOTAL_COLUMNS = __HIST_COLUMNS + __NON_HIST_COLUMNS
+    
+def get_min(fps, arrs):
+    """ Find the file with the current first row with the smallest start time """
+    return min([fp for fp in fps if not arrs[fp] is None], key=lambda fp: arrs.get(fp)[0][0])
+
+def _plat_idx_to_val(idx, edge=0.5, FIO_IO_U_PLAT_BITS=6, FIO_IO_U_PLAT_VAL=64):
+    """ Taken from fio's stat.c for calculating the latency value of a bin
+        from that bin's index.
+        
+            idx  : the value of the index into the histogram bins
+            edge : fractional value in the range [0,1]** indicating how far into
+            the bin we wish to compute the latency value of.
+        
+        ** edge = 0.0 and 1.0 computes the lower and upper latency bounds
+           respectively of the given bin index. """
+
+    # MSB <= (FIO_IO_U_PLAT_BITS-1), cannot be rounded off. Use
+    # all bits of the sample as index
+    if (idx < (FIO_IO_U_PLAT_VAL << 1)):
+        return idx 
+
+    # Find the group and compute the minimum value of that group
+    error_bits = (idx >> FIO_IO_U_PLAT_BITS) - 1 
+    base = 1 << (error_bits + FIO_IO_U_PLAT_BITS)
+
+    # Find its bucket number of the group
+    k = idx % FIO_IO_U_PLAT_VAL
+
+    # Return the mean (if edge=0.5) of the range of the bucket
+    return base + ((k + edge) * (1 << error_bits))
+    
+def plat_idx_to_val_coarse(idx, coarseness, edge=0.5):
+    """ Converts the given *coarse* index into a non-coarse index as used by fio
+        in stat.h:plat_idx_to_val(), subsequently computing the appropriate
+        latency value for that bin.
+        """
+
+    # Multiply the index by the power of 2 coarseness to get the bin
+    # bin index with a max of 1536 bins (FIO_IO_U_PLAT_GROUP_NR = 24 in stat.h)
+    stride = 1 << coarseness
+    idx = idx * stride
+    lower = _plat_idx_to_val(idx, edge=0.0)
+    upper = _plat_idx_to_val(idx + stride, edge=1.0)
+    return lower + (upper - lower) * edge
+
+def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx):
+    ps = weighted_percentile(percs, vs, ws)
+
+    avg = weighted_average(vs, ws)
+    values = [mn, avg] + list(ps) + [mx]
+    row = [end, ss_cnt] + list(map(lambda x: float(x) / ctx.divisor, values))
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
+    print (fmt % tuple(row))
+
+def update_extreme(val, fncn, new_val):
+    """ Calculate min / max in the presence of None values """
+    if val is None: return new_val
+    else: return fncn(val, new_val)
+
+# See beginning of main() for how bin_vals are computed
+bin_vals = []
+lower_bin_vals = [] # lower edge of each bin
+upper_bin_vals = [] # upper edge of each bin 
+
+def process_interval(ctx, iHist, iEnd):
+    """ print estimated percentages for the given merged sample
+    """
+    ss_cnt = 0 # number of samples affecting this interval
+    mn_bin_val, mx_bin_val = None, None
+   
+    # Update total number of samples affecting current interval histogram:
+    ss_cnt += np.sum(iHist)
+        
+    # Update min and max bin values
+    idxs = np.nonzero(iHist != 0)[0]
+    if idxs.size > 0:
+        mn_bin_val = bin_vals[idxs[0]]
+        mx_bin_val = bin_vals[idxs[-1]]
+
+    if ss_cnt > 0: print_all_stats(ctx, iEnd, mn_bin_val, ss_cnt, bin_vals, iHist, mx_bin_val)
+
+def guess_max_from_bins(ctx, hist_cols):
+    """ Try to guess the GROUP_NR from given # of histogram
+        columns seen in an input file """
+    max_coarse = 8
+    if ctx.group_nr < 19 or ctx.group_nr > 26:
+        bins = [ctx.group_nr * (1 << 6)]
+    else:
+        bins = [1216,1280,1344,1408,1472,1536,1600,1664]
+    coarses = range(max_coarse + 1)
+    fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else -10, coarses))
+    
+    arr = np.transpose(list(map(fncn, bins)))
+    idx = np.where(arr == hist_cols)
+    if len(idx[1]) == 0:
+        table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
+            "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
+            "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
+            "This number needs to be equal to one of the following numbers:\n\n"
+            + table + "\n\n"
+            "Possible reasons and corresponding solutions:\n"
+            "  - Input file(s) does not contain histograms.\n"
+            "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
+            "    new GROUP_NR on the command line with --group_nr\n")
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
+    return bins[idx[1][0]]
+
+def main(ctx):
+
+    if ctx.job_file:
+        try:
+            from configparser import SafeConfigParser, NoOptionError
+        except ImportError:
+            from ConfigParser import SafeConfigParser, NoOptionError
+
+        cp = SafeConfigParser(allow_no_value=True)
+        with open(ctx.job_file, 'r') as fp:
+            cp.readfp(fp)
+
+        if ctx.interval is None:
+            # Auto detect --interval value
+            for s in cp.sections():
+                try:
+                    hist_msec = cp.get(s, 'log_hist_msec')
+                    if hist_msec is not None:
+                        ctx.interval = int(hist_msec)
+                except NoOptionError:
+                    pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
+
+    if ctx.interval is None:
+        ctx.interval = 1000
+        
+    # Automatically detect how many columns are in the input files,
+    # calculate the corresponding 'coarseness' parameter used to generate
+    # those files, and calculate the appropriate bin latency values:
+    with open(ctx.FILE[0], 'r') as fp:
+        global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS
+        __TOTAL_COLUMNS = len(fp.readline().split(','))
+        __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS
+
+        max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
+        coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
+        bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS))), dtype=float)
+        lower_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS))), dtype=float)
+        upper_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS))), dtype=float)
+
+    fps = [HistFileRdr(f) for f in ctx.FILE]
+
+    print(', '.join(columns))
+
+    start = 0
+    end = ctx.interval
+    while True:
+        
+        more_data = False
+        
+        # add bins from all files in target intervals
+        arr = None
+        numSamples = 0
+        while True:
+            foundSamples = False
+            for fp in fps:
+                ts = fp.curTS
+                if ts and ts+10 < end:  # shift sample time when very close to an end time                 
+                    numSamples += 1
+                    foundSamples = True
+                    if arr is None: 
+                        arr = np.zeros(shape=(__HIST_COLUMNS), dtype=int)
+                    arr = np.add(arr, fp.curBins)
+                    more_data = True
+                    fp.nextData()
+                elif ts:
+                    more_data = True
+            
+            # reached end of all files
+            # or gone through all files without finding sample in interval 
+            if not more_data or not foundSamples:
+                break
+        
+        if arr is not None:
+            #print("{} size({}) samples({}) nonzero({}):".format(end, arr.size, numSamples, np.count_nonzero(arr)), str(arr), )
+            process_interval(ctx, arr, end)         
+        
+        # reach end of all files
+        if not more_data:
+            break
+            
+        start += ctx.interval
+        end = start + ctx.interval
+        
+        #if end > 20000: break
+
+
+if __name__ == '__main__':
+    import argparse
+    runascmd = True
+    p = argparse.ArgumentParser()
+    arg = p.add_argument
+    arg("FILE", help='space separated list of latency log filenames', nargs='+')
+    arg('--buff_size',
+        default=10000,
+        type=int,
+        help='number of samples to buffer into numpy at a time')
+
+    arg('--max_latency',
+        default=20,
+        type=float,
+        help='number of seconds of data to process at a time')
+
+    arg('-i', '--interval',
+        type=int,
+        help='interval width (ms), default 1000 ms '
+        '(no weighting between samples performed, results represent sample period only)')
+
+    arg('-d', '--divisor',
+        required=False,
+        type=int,
+        default=1,
+        help='divide the results by this value.')
+
+    arg('--decimals',
+        default=3,
+        type=int,
+        help='number of decimal places to print floats to')
+
+    arg('--warn',
+        dest='warn',
+        action='store_true',
+        default=False,
+        help='print warning messages to stderr')
+
+    arg('--group_nr',
+        default=29,
+        type=int,
+        help='FIO_IO_U_PLAT_GROUP_NR as defined in stat.h')
+
+    arg('--job-file',
+        default=None,
+        type=str,
+        help='Optional argument pointing to the job file used to create the '
+             'given histogram files. Useful for auto-detecting --log_hist_msec and '
+             '--log_unix_epoch (in fio) values.')
+
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+        
+
+    main(p.parse_args())
+



-----Original Message-----
From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On Behalf Of Kris Davis
Sent: Tuesday, January 30, 2018 1:13 PM
To: fio@vger.kernel.org
Cc: karl.cronburg@gmail.com
Subject: fiologparser_hist.py script patch and enhancements?

Greetings all, 

While working with the clat histogram log data, I ran into a few difficulties with the "fiologparser_hist.py" script.  
I've created a patch to address these, but of course need some discussion and review.

The issues: 

1) The fiologparser_hist script didn't support the new nanosecond bin values.  So I changed the operation to assume nanosecond histogram bins, and new "--usbin" option to allow user to override so same script can still process older version histogram logs.

2) The script asppeared hardcoded to only return 50% (median), 90%, 95%, and 99% values (along with min and max). 
I added "--percentiles" option to allow a request for more values ('median' always printed, even if a duplicate 50% column is requested, for backward compatibility).

3) I use python >3.4 almost exclusively in my environment, and it was relatively simple to alter script so would run with either python 2.7 or 3x . 
I added a check to make sure the python version is at least 2.7 or above, but only actually tested with python 2.7.9 and 3.4.4 I understand there are some minor python differences between 3.0 and 3.4 that might be an issue, but haven't tried.

4) The process can be slow for large or combining many log files.  I have some automation which will generically process many log files, and found I cut the process time in half if I loaded as a module rather than calling as a command.  So, changed so can call main directly in a script, but needed to slightly alter the end of "guess_max_from_bins" to throw an exception on error rather than exit, when called as a module.  
Someone might know of a better, more conventional pythonic design pattern to use, but it works.

5) The script appears to assume that the log is never missing samples.  That is, weight samples to the requested intervals, I think with the assumption that the log samples are at longer intervals, or at least the same interval length as the "--interval" value.  If the workload actually contains "thinktime" intervals (with missing sample when zero data), the script cannot know this, and assumes the logged operations should still be spread across all the intervals.  

In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked into an alternate version I named 'fiologparser_hist_nw.py', which doesn't perform any weighting of samples.  It has an added advantage of much better performance. For example, fiologparser_hist took about 1/2 hr to combine about 350 logs, but fiologparser_hist_nw took 45 seconds, way better for my automation.

Of course, larger number of 9's percentiles would have additional inaccuracies when there are not enough operations in a sample period, but that is just user beware.

I've listed a patch below for fiologparser_hist.py.  I'm not sure if can actually "attach" files or would have included zipped copies of both fiologparser_hist.py and fiologparser_hist_nw.py.  But, I could include them in a github issue.

Thanks

Kris



diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py index 2e05b92..fe9a951
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python2.7
+#!/usr/bin/python
 """ 
     Utility for converting *_clat_hist* files generated by fio into latency statistics.
     
@@ -16,7 +16,15 @@
 import os
 import sys
 import pandas
+import re
 import numpy as np
+
+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
 
 err = sys.stderr.write
 
@@ -64,8 +72,20 @@
 def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)
 
-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
 
 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number @@ -118,6 +138,7 @@
 
     # Initial histograms from disk:
     arrs = {fp: read_chunk(rdr, sz) for fp,rdr in rdrs.items()}
+    
     while True:
 
         try:
@@ -177,8 +198,12 @@
 
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
-    row = [end, ss_cnt] + map(lambda x: float(x) / ctx.divisor, values)
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    row = [end, ss_cnt] + list(map(lambda x: float(x) / ctx.divisor, values))
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))
 
 def update_extreme(val, fncn, new_val):
@@ -207,7 +232,7 @@
             
         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 
+ ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]
 
@@ -241,7 +266,7 @@
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,7 +275,12 @@
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg)
+        
     return bins[idx[1][0]]
 
 def main(ctx):
@@ -274,9 +304,18 @@
                         ctx.interval = int(hist_msec)
                 except NoOptionError:
                     pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
 
     if ctx.interval is None:
         ctx.interval = 1000
+
+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
 
     # Automatically detect how many columns are in the input files,
     # calculate the corresponding 'coarseness' parameter used to generate @@ -288,9 +327,9 @@
 
         max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
         coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
-        bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS)), dtype=float)
-        lower_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS)), dtype=float)
-        upper_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS)), dtype=float)
+        bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS))), dtype=float)
+        lower_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS))), dtype=float)
+        upper_bin_vals = np.array(list(map(lambda x: 
+ plat_idx_to_val_coarse(x, coarseness, 1.0), 
+ np.arange(__HIST_COLUMNS))), dtype=float)
 
     fps = [open(f, 'r') for f in ctx.FILE]
     gen = histogram_generator(ctx, fps, ctx.buff_size) @@ -304,7 +343,7 @@
         while more_data or len(arr) > 0:
             
             # Read up to ctx.max_latency (default 20 seconds) of data from end of current interval.
-            while len(arr) == 0 or arr[-1][0] < ctx.max_latency * 1000 + end:
+            while len(arr) == 0 or arr[-1][0] < ctx.max_latency + end:
                 try:
                     new_arr = next(gen)
                 except StopIteration:
@@ -338,6 +377,7 @@
 
 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+') @@ -384,5 +424,18 @@
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')
 
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+    
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99. fio uses ns for version >= 2.99')
+    
+    
+
     main(p.parse_args())



* fiologparser_hist.py script patch and enhancements?
@ 2018-01-30 19:13 Kris Davis
  2018-02-01 19:03 ` Kris Davis
  0 siblings, 1 reply; 20+ messages in thread
From: Kris Davis @ 2018-01-30 19:13 UTC (permalink / raw)
  To: fio; +Cc: karl.cronburg

Greetings all, 

While working with the clat histogram log data, I ran into a few difficulties with the "fiologparser_hist.py" script.
I've created a patch to address these, but it of course needs some discussion and review.

The issues: 

1) The fiologparser_hist script didn't support the new nanosecond bin values.  So I changed the operation to assume nanosecond
histogram bins, and added a new "--usbin" option to let the user override that, so the same script can still process older-version histogram logs.
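
To make the unit handling concrete, here is a small illustration of the two divisors
the patch introduces (just an illustration, not code taken from the script):

    # fio >= 2.99 logs histogram bin values in nanoseconds (the new default here)
    bin_val_ns = 1250000
    print(bin_val_ns / 1000000.0)   # 1.25 ms

    # with --usbin the bins are treated as microseconds (older fio versions)
    bin_val_us = 1250
    print(bin_val_us / 1000.0)      # 1.25 ms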

2) The script appeared hardcoded to only return 50% (median), 90%, 95%, and 99% values (along with min and max).
I added a "--percentiles" option to allow requesting additional values ('median' is always printed, even if a duplicate 50% column is requested,
for backward compatibility).
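
For example, asking for "--percentiles 90,99,99.9" would produce a header along the lines of
(a sketch based on the column list the patch builds; exact formatting may differ):

    end-time, samples, min, avg, median, 90%, 99%, 99.9%, max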

3) I use python 3.4+ almost exclusively in my environment, and it was relatively simple to alter the script so it would run with either python 2.7 or 3.x.
I added a check to make sure the python version is at least 2.7, but I only actually tested with python 2.7.9 and 3.4.4.
I understand there are some minor python differences between 3.0 and 3.4 that might be an issue, but I haven't tried those versions.

4) The process can be slow for large log files or when combining many of them.  I have some automation which will generically process many log files, and I found
I cut the processing time in half by loading the script as a module rather than invoking it as a command.  So I changed it so main can be called directly from a script, but I needed to
slightly alter the end of "guess_max_from_bins" to throw an exception on error, rather than exit, when called as a module.
Someone might know of a better, more conventional pythonic design pattern to use, but it works.
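
A rough sketch of the module-style call from my automation (the attribute names on the
namespace are placeholders and have to mirror the script's command-line options; the main
point is that guess_max_from_bins now raises RuntimeError instead of calling exit):

    import argparse
    import fiologparser_hist

    # Build a namespace equivalent to what argparse would have produced.
    ctx = argparse.Namespace(
        FILE=['job_clat_hist.1.log'],   # placeholder log file name
        interval=None,                  # None lets main() fall back to its default
        usbin=False,                    # histogram bins are nanoseconds (fio >= 2.99)
        # ...plus the remaining attributes matching the other command-line
        # options (divisor, buff_size, max_latency, group_nr, ...)
    )

    try:
        fiologparser_hist.main(ctx)
    except RuntimeError as e:
        print("could not determine histogram bin count: %s" % e)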

5) The script appears to assume that the log is never missing samples.  That is, it weights samples across the requested intervals, I think with the
assumption that the log samples are at longer intervals than, or at least the same interval length as, the "--interval" value.  If the workload actually
contains "thinktime" gaps (where a sample is missing because there was no data), the script cannot know this, and assumes the logged operations should still be spread across
all the intervals.

In my case, I'm mostly interested in results at the same interval as gathered during logging, so I tweaked it into an alternate version I named 'fiologparser_hist_nw.py',
which doesn't perform any weighting of samples.  It has the added advantage of much better performance. For example, fiologparser_hist
took about half an hour to combine about 350 logs, while fiologparser_hist_nw took 45 seconds, which is far better for my automation.

Of course, percentiles with a larger number of 9's will carry additional inaccuracy when there are not enough operations in a sample period,
but that is just user beware.
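
To make that caveat concrete, a percentile taken from a single histogram sample boils down
to something like the following (a simplified illustration, not the actual code in either
script); with only a few hundred operations in an interval, the higher percentiles are
decided by a single operation:

    import numpy as np

    def hist_percentile(counts, bin_vals, p):
        # counts: per-bin operation counts for one histogram sample
        # bin_vals: latency value associated with each bin
        # p: requested percentile, 0-100
        cumulative = np.cumsum(counts)
        target = cumulative[-1] * p / 100.0
        idx = np.searchsorted(cumulative, target)   # first bin reaching p% of ops
        return bin_vals[min(idx, len(bin_vals) - 1)]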

I've listed a patch below for fiologparser_hist.py.  I'm not sure whether I can actually "attach" files here; otherwise I would have included zipped copies of both fiologparser_hist.py and fiologparser_hist_nw.py.  But I could include them in a github issue.

Thanks

Kris



diff --git a/tools/hist/fiologparser_hist.py b/tools/hist/fiologparser_hist.py
index 2e05b92..fe9a951
--- a/tools/hist/fiologparser_hist.py
+++ b/tools/hist/fiologparser_hist.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python2.7
+#!/usr/bin/python
 """ 
     Utility for converting *_clat_hist* files generated by fio into latency statistics.
     
@@ -16,7 +16,15 @@
 import os
 import sys
 import pandas
+import re
 import numpy as np
+
+runascmd = False
+
+if (sys.version_info < (2, 7)):
+    err("ERROR: Python version = %s; version 2.7 or greater is required.\n")
+    exit(1)
+
 
 err = sys.stderr.write
 
@@ -64,8 +72,20 @@
 def weighted_average(vs, ws):
     return np.sum(vs * ws) / np.sum(ws)
 
-columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"]
-percs   = [50, 90, 95, 99]
+
+percs = None
+columns = None
+
+def gen_output_columns(percentiles):
+    global percs,columns
+    strpercs = re.split('[,:]', percentiles)
+    percs = [50.0]  # always print 50% in 'median' column
+    percs.extend(list(map(float,strpercs)))
+    columns = ["end-time", "samples", "min", "avg", "median"]
+    columns.extend(list(map(lambda x: x+'%', strpercs)))
+    columns.append("max")
+        
+
 
 def fmt_float_list(ctx, num=1):
   """ Return a comma separated list of float formatters to the required number
@@ -118,6 +138,7 @@
 
     # Initial histograms from disk:
     arrs = {fp: read_chunk(rdr, sz) for fp,rdr in rdrs.items()}
+    
     while True:
 
         try:
@@ -177,8 +198,12 @@
 
     avg = weighted_average(vs, ws)
     values = [mn, avg] + list(ps) + [mx]
-    row = [end, ss_cnt] + map(lambda x: float(x) / ctx.divisor, values)
-    fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d"
+    row = [end, ss_cnt] + list(map(lambda x: float(x) / ctx.divisor, values))
+    if ctx.divisor > 1:
+        fmt = "%d, %d, " + fmt_float_list(ctx, len(percs)+3)
+    else:
+        # max and min are decimal values if no divisor
+        fmt = "%d, %d, %d, " + fmt_float_list(ctx, len(percs)+1) + ", %d"
     print (fmt % tuple(row))
 
 def update_extreme(val, fncn, new_val):
@@ -207,7 +232,7 @@
             
         # Only look at bins of the current histogram sample which
         # started before the end of the current time interval [start,end]
-        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0
+        start_times = (end_time - 0.5 * ctx.interval) - bin_vals / ctx.time_divisor
         idx = np.where(start_times < iEnd)
         s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx]
 
@@ -241,7 +266,7 @@
     idx = np.where(arr == hist_cols)
     if len(idx[1]) == 0:
         table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array','     ')
-        err("Unable to determine bin values from input clat_hist files. Namely \n"
+        errmsg = ("Unable to determine bin values from input clat_hist files. Namely \n"
             "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) +
             "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n"
             "This number needs to be equal to one of the following numbers:\n\n"
@@ -250,7 +275,12 @@
             "  - Input file(s) does not contain histograms.\n"
             "  - You recompiled fio with a different GROUP_NR. If so please specify this\n"
             "    new GROUP_NR on the command line with --group_nr\n")
-        exit(1)
+        if runascmd:
+            err(errmsg)
+            exit(1)
+        else:
+            raise RuntimeError(errmsg) 
+        
     return bins[idx[1][0]]
 
 def main(ctx):
@@ -274,9 +304,18 @@
                         ctx.interval = int(hist_msec)
                 except NoOptionError:
                     pass
+    
+    if not hasattr(ctx, 'percentiles'):
+        ctx.percentiles = "90,95,99"
+    gen_output_columns(ctx.percentiles)
 
     if ctx.interval is None:
         ctx.interval = 1000
+
+    if ctx.usbin:
+        ctx.time_divisor = 1000.0        # bins are in us
+    else:
+        ctx.time_divisor = 1000000.0     # bins are in ns
 
     # Automatically detect how many columns are in the input files,
     # calculate the corresponding 'coarseness' parameter used to generate
@@ -288,9 +327,9 @@
 
         max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS)
         coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS))
-        bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS)), dtype=float)
-        lower_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS)), dtype=float)
-        upper_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS)), dtype=float)
+        bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS))), dtype=float)
+        lower_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS))), dtype=float)
+        upper_bin_vals = np.array(list(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS))), dtype=float)
 
     fps = [open(f, 'r') for f in ctx.FILE]
     gen = histogram_generator(ctx, fps, ctx.buff_size)
@@ -304,7 +343,7 @@
         while more_data or len(arr) > 0:
             
             # Read up to ctx.max_latency (default 20 seconds) of data from end of current interval.
-            while len(arr) == 0 or arr[-1][0] < ctx.max_latency * 1000 + end:
+            while len(arr) == 0 or arr[-1][0] < ctx.max_latency + end:
                 try:
                     new_arr = next(gen)
                 except StopIteration:
@@ -338,6 +377,7 @@
 
 if __name__ == '__main__':
     import argparse
+    runascmd = True
     p = argparse.ArgumentParser()
     arg = p.add_argument
     arg("FILE", help='space separated list of latency log filenames', nargs='+')
@@ -384,5 +424,18 @@
              'given histogram files. Useful for auto-detecting --log_hist_msec and '
              '--log_unix_epoch (in fio) values.')
 
+    arg('--percentiles',
+        default="90,95,99",
+        type=str,
+        help='Optional argument of comma or colon separated percentiles to print. '
+             'The default is "90.0,95.0,99.0".  min, median(50%%) and max percentiles are always printed')
+    
+    arg('--usbin',
+        default=False,
+        action='store_true',
+        help='histogram bin latencies are in us (fio versions < 2.99. fio uses ns for version >= 2.99')
+    
+    
+
     main(p.parse_args())



end of thread, other threads:[~2018-03-23 17:22 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
     [not found] <CY1PR0401MB11163E6C999998E0A2F13F9E81F70@CY1PR0401MB1116.namprd04.prod.outlook.com>
2018-02-12 20:36 ` fiologparser_hist.py script patch and enhancements? Kris Davis
2018-02-12 21:38   ` Kris Davis
2018-02-13  7:26     ` Sitsofe Wheeler
2018-02-14 17:51       ` Kris Davis
2018-02-21 14:41         ` Kris Davis
2018-02-21 16:52           ` Sitsofe Wheeler
2018-02-21 17:00             ` Kris Davis
2018-02-21 17:23               ` Sitsofe Wheeler
2018-02-21 17:45                 ` Kris Davis
2018-02-21 18:24                   ` Sitsofe Wheeler
2018-02-26 21:56                     ` Kris Davis
2018-03-01 14:39                       ` Sitsofe Wheeler
2018-03-15 17:02                         ` Kris Davis
2018-03-17 10:00                           ` Sitsofe Wheeler
2018-03-19 15:45                             ` Kris Davis
2018-03-20  5:57                               ` Kris Davis
2018-03-22 14:47                                 ` Sitsofe Wheeler
2018-03-23 17:22                                   ` Kris Davis
2018-01-30 19:13 Kris Davis
2018-02-01 19:03 ` Kris Davis
