Linux-Block Archive on lore.kernel.org
 help / color / Atom feed
From: Tejun Heo <tj@kernel.org>
To: axboe@kernel.dk, newella@fb.com, clm@fb.com,
	josef@toxicpanda.com, dennisz@fb.com, lizefan@huawei.com,
	hannes@cmpxchg.org
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	Tejun Heo <tj@kernel.org>
Subject: [PATCH 10/10] blkcg: add tools/cgroup/iocost_coef_gen.py
Date: Wed, 28 Aug 2019 15:06:00 -0700
Message-ID: <20190828220600.2527417-11-tj@kernel.org> (raw)
In-Reply-To: <20190828220600.2527417-1-tj@kernel.org>

Add a script which can be used to generate device-specific iocost
linear model coefficients.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/admin-guide/cgroup-v2.rst |   3 +
 block/blk-iocost.c                      |   3 +
 tools/cgroup/iocost_coef_gen.py         | 178 ++++++++++++++++++++++++
 3 files changed, 184 insertions(+)
 create mode 100644 tools/cgroup/iocost_coef_gen.py

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 1521c7e554f5..3deacdc5e6d2 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1529,6 +1529,9 @@ IO Interface Files
 	The IO cost model isn't expected to be accurate in absolute
 	sense and is scaled to the device behavior dynamically.
 
+	If needed, tools/cgroup/iocost_coef_gen.py can be used to
+	generate device-specific coefficients.
+
   io.weight
 	A read-write flat-keyed file which exists on non-root cgroups.
 	The default is "default 100".
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 3208d2fdc55e..f04a4ed1cb45 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -46,6 +46,9 @@
  * If needed, tools/cgroup/iocost_coef_gen.py can be used to generate
  * device-specific coefficients.
  *
+ * If needed, tools/cgroup/iocost_coef_gen.py can be used to generate
+ * device-specific coefficients.
+ *
  * 2. Control Strategy
  *
  * The device virtual time (vtime) is used as the primary control metric.
diff --git a/tools/cgroup/iocost_coef_gen.py b/tools/cgroup/iocost_coef_gen.py
new file mode 100644
index 000000000000..df17a2ae80e5
--- /dev/null
+++ b/tools/cgroup/iocost_coef_gen.py
@@ -0,0 +1,178 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2019 Tejun Heo <tj@kernel.org>
+# Copyright (C) 2019 Andy Newell <newella@fb.com>
+# Copyright (C) 2019 Facebook
+
+desc = """
+Generate linear IO cost model coefficients used by the blk-iocost
+controller.  If the target raw testdev is specified, destructive tests
+are performed against the whole device; otherwise, on
+./iocost-coef-fio.testfile.  The result can be written directly to
+/sys/fs/cgroup/io.cost.model.
+
+On high performance devices, --numjobs > 1 is needed to achieve
+saturation.
+
+See Documentation/admin-guide/cgroup-v2.rst and block/blk-iocost.c
+for more details.
+"""
+
+import argparse
+import re
+import json
+import glob
+import os
+import sys
+import atexit
+import shutil
+import tempfile
+import subprocess
+
+parser = argparse.ArgumentParser(description=desc,
+                                 formatter_class=argparse.RawTextHelpFormatter)
+parser.add_argument('--testdev', metavar='DEV',
+                    help='Raw block device to use for testing, ignores --testfile-size')
+parser.add_argument('--testfile-size-gb', type=float, metavar='GIGABYTES', default=16,
+                    help='Testfile size in gigabytes (default: %(default)s)')
+parser.add_argument('--duration', type=int, metavar='SECONDS', default=120,
+                    help='Individual test run duration in seconds (default: %(default)s)')
+parser.add_argument('--seqio-block-mb', metavar='MEGABYTES', type=int, default=128,
+                    help='Sequential test block size in megabytes (default: %(default)s)')
+parser.add_argument('--seq-depth', type=int, metavar='DEPTH', default=64,
+                    help='Sequential test queue depth (default: %(default)s)')
+parser.add_argument('--rand-depth', type=int, metavar='DEPTH', default=64,
+                    help='Random test queue depth (default: %(default)s)')
+parser.add_argument('--numjobs', type=int, metavar='JOBS', default=1,
+                    help='Number of parallel fio jobs to run (default: %(default)s)')
+parser.add_argument('--quiet', action='store_true')
+parser.add_argument('--verbose', action='store_true')
+
+def info(msg):
+    if not args.quiet:
+        print(msg)
+
+def dbg(msg):
+    if args.verbose and not args.quiet:
+        print(msg)
+
+# determine ('DEVNAME', 'MAJ:MIN') for @path
+def dir_to_dev(path):
+    # find the block device the current directory is on
+    devname = subprocess.run(f'findmnt -nvo SOURCE -T{path}',
+                             stdout=subprocess.PIPE, shell=True).stdout
+    devname = os.path.basename(devname).decode('utf-8').strip()
+
+    # partition -> whole device
+    parents = glob.glob('/sys/block/*/' + devname)
+    if len(parents):
+        devname = os.path.basename(os.path.dirname(parents[0]))
+    rdev = os.stat(f'/dev/{devname}').st_rdev
+    return (devname, f'{os.major(rdev)}:{os.minor(rdev)}')
+
+def create_testfile(path, size):
+    global args
+
+    if os.path.isfile(path) and os.stat(path).st_size == size:
+        return
+
+    info(f'Creating testfile {path}')
+    subprocess.check_call(f'rm -f {path}', shell=True)
+    subprocess.check_call(f'touch {path}', shell=True)
+    subprocess.call(f'chattr +C {path}', shell=True)
+    subprocess.check_call(
+        f'pv -s {size} -pr /dev/urandom {"-q" if args.quiet else ""} | '
+        f'dd of={path} count={size} '
+        f'iflag=count_bytes,fullblock oflag=direct bs=16M status=none',
+        shell=True)
+
+def run_fio(testfile, duration, iotype, iodepth, blocksize, jobs):
+    global args
+
+    eta = 'never' if args.quiet else 'always'
+    outfile = tempfile.NamedTemporaryFile()
+    cmd = (f'fio --direct=1 --ioengine=libaio --name=coef '
+           f'--filename={testfile} --runtime={round(duration)} '
+           f'--readwrite={iotype} --iodepth={iodepth} --blocksize={blocksize} '
+           f'--eta={eta} --output-format json --output={outfile.name} '
+           f'--time_based --numjobs={jobs}')
+    if args.verbose:
+        dbg(f'Running {cmd}')
+    subprocess.check_call(cmd, shell=True)
+    with open(outfile.name, 'r') as f:
+        d = json.loads(f.read())
+    return sum(j['read']['bw_bytes'] + j['write']['bw_bytes'] for j in d['jobs'])
+
+def restore_elevator_nomerges():
+    global elevator_path, nomerges_path, elevator, nomerges
+
+    info(f'Restoring elevator to {elevator} and nomerges to {nomerges}')
+    with open(elevator_path, 'w') as f:
+        f.write(elevator)
+    with open(nomerges_path, 'w') as f:
+        f.write(nomerges)
+
+
+args = parser.parse_args()
+
+missing = False
+for cmd in [ 'findmnt', 'pv', 'dd', 'fio' ]:
+    if not shutil.which(cmd):
+        print(f'Required command "{cmd}" is missing', file=sys.stderr)
+        missing = True
+if missing:
+    sys.exit(1)
+
+if args.testdev:
+    devname = os.path.basename(args.testdev)
+    rdev = os.stat(f'/dev/{devname}').st_rdev
+    devno = f'{os.major(rdev)}:{os.minor(rdev)}'
+    testfile = f'/dev/{devname}'
+    info(f'Test target: {devname}({devno})')
+else:
+    devname, devno = dir_to_dev('.')
+    testfile = 'iocost-coef-fio.testfile'
+    testfile_size = int(args.testfile_size_gb * 2 ** 30)
+    create_testfile(testfile, testfile_size)
+    info(f'Test target: {testfile} on {devname}({devno})')
+
+elevator_path = f'/sys/block/{devname}/queue/scheduler'
+nomerges_path = f'/sys/block/{devname}/queue/nomerges'
+
+with open(elevator_path, 'r') as f:
+    elevator = re.sub(r'.*\[(.*)\].*', r'\1', f.read().strip())
+with open(nomerges_path, 'r') as f:
+    nomerges = f.read().strip()
+
+info(f'Temporarily disabling elevator and merges')
+atexit.register(restore_elevator_nomerges)
+with open(elevator_path, 'w') as f:
+    f.write('none')
+with open(nomerges_path, 'w') as f:
+    f.write('1')
+
+info('Determining rbps...')
+rbps = run_fio(testfile, args.duration, 'read',
+               1, args.seqio_block_mb * (2 ** 20), args.numjobs)
+info(f'\nrbps={rbps}, determining rseqiops...')
+rseqiops = round(run_fio(testfile, args.duration, 'read',
+                         args.seq_depth, 4096, args.numjobs) / 4096)
+info(f'\nrseqiops={rseqiops}, determining rrandiops...')
+rrandiops = round(run_fio(testfile, args.duration, 'randread',
+                          args.rand_depth, 4096, args.numjobs) / 4096)
+info(f'\nrrandiops={rrandiops}, determining wbps...')
+wbps = run_fio(testfile, args.duration, 'write',
+               1, args.seqio_block_mb * (2 ** 20), args.numjobs)
+info(f'\nwbps={wbps}, determining wseqiops...')
+wseqiops = round(run_fio(testfile, args.duration, 'write',
+                         args.seq_depth, 4096, args.numjobs) / 4096)
+info(f'\nwseqiops={wseqiops}, determining wrandiops...')
+wrandiops = round(run_fio(testfile, args.duration, 'randwrite',
+                          args.rand_depth, 4096, args.numjobs) / 4096)
+info(f'\nwrandiops={wrandiops}')
+restore_elevator_nomerges()
+atexit.unregister(restore_elevator_nomerges)
+info('')
+
+print(f'{devno} rbps={rbps} rseqiops={rseqiops} rrandiops={rrandiops} '
+      f'wbps={wbps} wseqiops={wseqiops} wrandiops={wrandiops}')
-- 
2.17.1


  parent reply index

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-28 22:05 [PATCHSET v3 block/for-linus] IO cost model based work-conserving porportional controller Tejun Heo
2019-08-28 22:05 ` [PATCH 01/10] blkcg: pass @q and @blkcg into blkcg_pol_alloc_pd_fn() Tejun Heo
2019-08-28 22:05 ` [PATCH 02/10] blkcg: make ->cpd_init_fn() optional Tejun Heo
2019-08-28 22:05 ` [PATCH 03/10] blkcg: separate blkcg_conf_get_disk() out of blkg_conf_prep() Tejun Heo
2019-08-28 22:05 ` [PATCH 04/10] block/rq_qos: add rq_qos_merge() Tejun Heo
2019-08-28 22:05 ` [PATCH 05/10] block/rq_qos: implement rq_qos_ops->queue_depth_changed() Tejun Heo
2019-08-28 22:05 ` [PATCH 06/10] blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ Tejun Heo
2019-08-28 22:05 ` [PATCH 07/10] blk-mq: add optional request->alloc_time_ns Tejun Heo
2019-08-28 22:05 ` [PATCH 08/10] blkcg: implement blk-iocost Tejun Heo
2019-08-29 15:53   ` [PATCH] blkcg: fix missing free on error path of blk_iocost_init() Tejun Heo
2019-09-10 12:55   ` [PATCH 08/10] blkcg: implement blk-iocost Michal Koutný
2019-09-10 16:08     ` Tejun Heo
2019-09-11  8:18       ` Paolo Valente
2019-09-11 14:16         ` Tejun Heo
2019-09-11 15:54           ` Tejun Heo
2019-09-11 16:44           ` Paolo Valente
2019-10-03 14:51       ` Michal Koutný
2019-10-03 16:45         ` Tejun Heo
2019-10-09 15:36           ` Michal Koutný
2019-10-14 15:36             ` Tejun Heo
2019-11-01 16:15               ` Michal Koutný
2019-11-01 16:56                 ` Paolo Valente
2019-08-28 22:05 ` [PATCH 09/10] blkcg: add tools/cgroup/iocost_monitor.py Tejun Heo
2019-08-28 22:06 ` Tejun Heo [this message]
2019-08-29  3:29 ` [PATCHSET v3 block/for-linus] IO cost model based work-conserving porportional controller Jens Axboe
     [not found] ` <20190829082248.6464-1-hdanton@sina.com>
2019-08-29 15:43   ` [PATCH 07/10] blk-mq: add optional request->alloc_time_ns Tejun Heo
     [not found] ` <20190829133928.16192-1-hdanton@sina.com>
2019-08-29 15:46   ` [PATCH 08/10] blkcg: implement blk-iocost Tejun Heo
2019-08-29 15:54 ` [PATCHSET v3 block/for-linus] IO cost model based work-conserving porportional controller Paolo Valente
2019-08-29 15:56   ` Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2019-07-10 20:51 [PATCHSET v2 " Tejun Heo
2019-07-10 20:51 ` [PATCH 10/10] blkcg: add tools/cgroup/iocost_coef_gen.py Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190828220600.2527417-11-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=clm@fb.com \
    --cc=dennisz@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=newella@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Block Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-block/0 linux-block/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-block linux-block/ https://lore.kernel.org/linux-block \
		linux-block@vger.kernel.org
	public-inbox-index linux-block

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-block


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git