From: Stefan Ring <stefanrin@gmail.com>
To: dm-devel@redhat.com
Subject: Significantly dropped dm-cache performance in 4.13 compared to 4.11
Date: Fri, 3 Nov 2017 19:50:23 +0100
Message-ID: <CAAxjCExHqssZ8vuCC19_NP1io=StFAYk48nqDfudhjD-PyLZCg@mail.gmail.com>
Having just upgraded from a 4.11 kernel to a 4.13 one, I see a
significantly higher scrub time for a ZFS on Linux (ZoL) pool that
lives on a dm-cache device consisting of an 800 GB partition on one
spinning 1 TB disk and one partition on an SSD (somewhere between 100
and 200 GB). A ZFS scrub reads everything stored in the pool from
start to finish, roughly in the order it was written. The data on the
pool is laid out mostly linearly, and the scrub used to achieve read
rates from the spinning disk in excess of 100 MB/s. With the old
kernel, that is. These are the scrub times for both kernels:
4.11.5-300.fc26: 1h56m
4.13.9-200.fc26: 4h32m
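A scrub like this can be started and timed with something along these
lines; the pool name "tank" is just a placeholder:

$ sudo zpool scrub tank   # kick off a full scrub of the pool
$ zpool status tank       # once finished, the "scan:" line shows the elapsed scrub time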
Nothing changed between those two runs except for the booted kernel.
ZoL is version 0.7.3 in both cases. Originally I suspected ZoL 0.7.x,
which I upgraded at the same time as the kernel (from 0.6.5.11), to be
the culprit. However, I built and installed it from the exact same
sources for both kernel versions, and scrub times are still comparable
to what they were before on my home system, which uses ZoL on four
spinning disks without an interposed dm-cache.
Typical output of iostat -dmx 3 with kernel 4.13 while the scrub is
running. Apart from the scrub, there is no I/O activity on the system:
Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda       300.67     0.00   462.67     0.00    68.16     0.00   301.69     2.63     5.61     5.61     0.00     2.16    99.90
sdb         0.00   194.67     6.00    83.33     0.38    14.01   329.82     0.20     2.22     0.50     2.34     1.58    14.13
dm-0        0.00     0.00     6.00   221.33     0.38    13.83   128.01     0.54     2.38     0.50     2.43     0.29     6.63
dm-1        0.00     0.00     0.00    53.67     0.00     0.17     6.31     0.12     2.28     0.00     2.28     2.06    11.07
dm-2        0.00     0.00   763.33     0.00    68.16     0.00   182.86     8.05    10.49    10.49     0.00     1.31    99.93
dm-3        0.00     0.00   440.00     0.00    54.70     0.00   254.60     1.98     4.41     4.41     0.00     2.27   100.03

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda       468.00     1.00   519.67    20.00    82.39     0.24   313.60     2.93     5.38     5.49     2.50     1.83    98.63
sdb         0.00   356.00    18.67   109.33     1.00    25.80   428.73     0.15     1.20     1.20     1.20     1.04    13.33
dm-0        0.00     0.00    18.67   426.00     1.00    25.75   123.20     0.52     1.16     1.20     1.16     0.19     8.33
dm-1        0.00     0.00     0.00    39.67     0.00     0.13     6.66     0.06     1.52     0.00     1.52     1.43     5.67
dm-2        0.00     0.00   988.00    21.00    82.68     0.24   168.31     9.63     8.97     9.11     2.38     0.98    98.60
dm-3        0.00     0.00   485.00    19.33    57.84     0.24   235.88     2.14     4.29     4.41     1.41     1.98    99.87
dm-3 is the cached device that ZoL reads from; sda/dm-2 is the
spinning disk, and sdb/dm-0 is the cache SSD.
It strikes me as odd that more data is read from the spinning disk
than comes out of the combined device in the end. With the older
kernel it is exactly the other way around, which makes much more
sense to me: the amount delivered by the combined device is the sum
of the reads from both underlying devices. Typical samples with
kernel 4.11:
Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda        87.67     0.00   618.33     0.00    62.53     0.00   207.12     1.58     2.56     2.56     0.00     1.36    84.37
sdb         0.67     0.00  1057.00     0.00    86.96     0.00   168.49     0.44     0.41     0.41     0.00     0.23    24.37
dm-0        0.00     0.00  1057.67     0.00    86.96     0.00   168.38     0.44     0.42     0.42     0.00     0.23    24.40
dm-1        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-2        0.00     0.00   706.00     0.00    62.56     0.00   181.48     1.74     2.46     2.46     0.00     1.19    84.33
dm-3        0.00     0.00  1488.33     0.00   149.52     0.00   205.74     1.97     1.32     1.32     0.00     0.67   100.00

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda       165.33     0.00   747.33     0.00    91.42     0.00   250.52     1.70     2.27     2.27     0.00     1.14    85.37
sdb         0.00     0.00   746.33     0.00    64.54     0.00   177.09     0.36     0.49     0.49     0.00     0.23    17.00
dm-0        0.00     0.00   746.33     0.00    64.54     0.00   177.09     0.37     0.49     0.49     0.00     0.23    17.07
dm-1        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-2        0.00     0.00   912.67     0.00    91.39     0.00   205.07     2.02     2.21     2.21     0.00     0.94    85.37
dm-3        0.00     0.00  1363.00     0.00   155.92     0.00   234.28     2.02     1.48     1.48     0.00     0.73   100.00

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda       161.00     0.00   684.67     0.00    84.63     0.00   253.14     1.93     2.83     2.83     0.00     1.45    99.27
sdb         0.00     0.00    62.67     0.00     6.05     0.00   197.57     0.03     0.48     0.48     0.00     0.32     2.03
dm-0        0.00     0.00    62.67     0.00     6.05     0.00   197.57     0.03     0.48     0.48     0.00     0.32     2.03
dm-1        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-2        0.00     0.00   845.67     0.00    84.63     0.00   204.94     2.26     2.68     2.68     0.00     1.17    99.30
dm-3        0.00     0.00   727.67     0.00    90.67     0.00   255.19     1.97     2.70     2.70     0.00     1.37   100.00

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sda       160.67     0.00   738.67     0.00    89.74     0.00   248.81     1.83     2.48     2.48     0.00     1.29    95.23
sdb         0.33     0.00   303.33     0.00    28.02     0.00   189.17     0.14     0.47     0.47     0.00     0.25     7.73
dm-0        0.00     0.00   303.67     0.00    28.02     0.00   188.96     0.14     0.47     0.47     0.00     0.26     7.87
dm-1        0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-2        0.00     0.00   899.33     0.00    89.74     0.00   204.36     2.17     2.41     2.41     0.00     1.06    95.27
dm-3        0.00     0.00   978.67     0.00   117.76     0.00   246.42     1.96     2.00     2.00     0.00     1.02   100.00
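To put numbers on the difference: in the first 4.13 sample, the
origin (dm-2) reads 68.16 MB/s and the cache SSD (dm-0) only 0.38
MB/s, yet the combined device (dm-3) delivers just 54.70 MB/s, less
than the origin alone. In the first 4.11 sample above, dm-2 at 62.56
MB/s plus dm-0 at 86.96 MB/s adds up to the 149.52 MB/s coming out of
dm-3.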
$ ls -l /dev/mapper/
total 0
crw------- 1 root root 10, 236 Nov 3 09:08 control
lrwxrwxrwx 1 root root 7 Nov 3 09:08 vg_zfs-lv_cachedata_cdata -> ../dm-0
lrwxrwxrwx 1 root root 7 Nov 3 09:08 vg_zfs-lv_cachedata_cmeta -> ../dm-1
lrwxrwxrwx 1 root root 7 Nov 3 09:08 vg_zfs-lv_zfsdisk -> ../dm-3
lrwxrwxrwx 1 root root 7 Nov 3 09:08 vg_zfs-lv_zfsdisk_corig -> ../dm-2
$ sudo dmsetup ls --tree
vg_zfs-lv_zfsdisk (253:3)
├─vg_zfs-lv_zfsdisk_corig (253:2)
│ └─ (8:6)
├─vg_zfs-lv_cachedata_cdata (253:0)
│ └─ (8:21)
└─vg_zfs-lv_cachedata_cmeta (253:1)
└─ (8:21)
$ sudo dmsetup table vg_zfs-lv_zfsdisk
0 1876041728 cache 253:1 253:0 253:2 1024 1 writethrough smq 0
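If I read the cache target's table format correctly, the arguments
after "cache" are <metadata dev> <cache dev> <origin dev> <block size>
<#feature args> <feature args> <policy> <#policy args>, so this cache
uses 1024-sector (512 KiB) blocks in writethrough mode with the smq
policy.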
$ sudo dmsetup status /dev/mapper/vg_zfs-lv_zfsdisk
0 1876041728 cache 8 1296/54272 1024 430706/430720 91621106 163624489
32345201 16417931 307686 307668 0 1 writethrough 2 migration_threshold
2048 smq 0 rw -
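Assuming the status fields are <metadata block size> <#used>/<#total
metadata blocks> <cache block size> <#used>/<#total cache blocks>
<read hits> <read misses> <write hits> <write misses> <demotions>
<promotions> <dirty> ..., the cache is practically full
(430706/430720 blocks) and read misses (163624489) far outnumber read
hits (91621106).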
Any ideas?
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel