Changing a workload results in performance drop

* Changing a workload results in performance drop
@ 2020-04-24 14:56 Konstantin Kharlamov
  2020-06-02 14:22 ` Konstantin Kharlamov
  0 siblings, 1 reply; 3+ messages in thread
From: Konstantin Kharlamov @ 2020-04-24 14:56 UTC (permalink / raw)
  To: linux-ext4

* SSDs are used in testing, so random access is not a concern. But I tried the
   "steps to reproduce" with raw block device, and IOPS always holds 9k for me.
* "Direct" IO is used to bypass file-system cache.
* The issue is way less visible on XFS, so it looks specific to file systems.
* The biggest difference I've seen is on 70% reads/30% writes workload. But for
   simplicity in "steps to reproduce" I'm using 100% write.
* it seems over time (perhaps a day) performance gets improved, so for best
   results when testing that you need to re-create ext4 anew.
* in "steps to reproduce" I grep fio stdout. That suppresses interactive
   output. Interactive output may be interesting though, I've often seen workload
   drops to 600-700 IOPS while average was 5-6k
* Original problem I worked with https://github.com/openzfs/zfs/issues/10231

# Steps to reproduce (in terms of terminal commands)

     $ cat fio_jobfile
     [job-section]
     name=temp-fio
     bs=8k
     ioengine=libaio
     rw=randrw
     rwmixread=0
     rwmixwrite=100
     filename=/mnt/test/file1
     iodepth=1
     numjobs=1
     group_reporting
     time_based
     runtime=1m
     direct=1
     filesize=4G
     $ mkfs.ext4 /dev/sdw1
     $ mount /dev/sdw1 /mnt/test
     $ truncate -s 100G /mnt/test/file1
     $ fio fio_jobfile | grep -i IOPS
       write: IOPS=12.5k, BW=97.0MiB/s (103MB/s)(5879MiB/60001msec)
        iops        : min=10966, max=14730, avg=12524.20, stdev=1240.27, samples=119
     $ sed -i 's/4G/100G/' fio_jobfile
     $ fio fio_jobfile | grep -i IOPS
       write: IOPS=5880, BW=45.9MiB/s (48.2MB/s)(2756MiB/60001msec)
        iops        : min= 4084, max= 6976, avg=5879.31, stdev=567.58, samples=119

## Expected

Performance should be more or less the same

## Actual

The second test is twice as slow

# Versions

* Kernel version: 5.6.2-050602-generic

It seems however that the problem is present at least in 4.19 and 5.4. as well, so not a regression.

^ permalink raw reply	[flat|nested] 3+ messages in thread