From: Mel Gorman <mgorman@suse.de>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [MMTests] Interactivity during IO on ext4
Date: Thu, 5 Jul 2012 15:57:50 +0100
Message-ID: <20120705145750.GO14154@suse.de>
In-Reply-To: <20120629111932.GA14154@suse.de>

Configuration: global-dhp__io-interactive-performance-ext4
Result:        http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-interactive-performance-ext4
Benchmarks:    postmark largedd fsmark-single fsmark-threaded micro

Summary
=======

Unlike ext3, these figures look generally good. There are a few wrinkles
in there but indications are that interactivity jitter experienced by
users may have a filesystem-specific component. One possibility is that
there are differences in how metadata reads are sent to the IO scheduler
but I did not confirm this.

Benchmark notes
===============

NOTE: This configuration is new and very experimental. This is my first
      time looking at the results of this type of test so flaws are
      inevitable. There is ample scope for improvement but I had to
      start somewhere.

This configuration is very different in that it is trying to analyse the
impact of IO on interactive performance. Some interactivity problems are
due to an application trying to read() cache-cold data such as
configuration files or cached images. If there is a lot of IO going on,
the application may stall while this happens. This is a limited scenario
for measuring interactivity but a common one.

These tests are fairly standard except that there is a background
application running in parallel. It begins by creating a 100M file and
using fadvise(POSIX_FADV_DONTNEED) to evict it from cache. Once that is
complete it will try to read 1M from the file every few seconds and
record the latency. When it reaches the end of the file, it dumps it
from cache and starts again.

This latency is a *proxy* measure of interactivity, not a true measure.
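
The background reader described above could be sketched roughly as follows. This is not the MMTests implementation, only a minimal illustration of the same idea. The names `probe`, `evict`, `interval` and `max_reads` are hypothetical, the 3 second default stands in for "every few seconds", and the sketch assumes Linux with Python 3, where `os.posix_fadvise` is available.

```python
import os
import time

CHUNK = 1 << 20  # 1M per timed read, as in the description above


def evict(fd, size):
    """Ask the kernel to drop the file's pages from the page cache."""
    os.posix_fadvise(fd, 0, size, os.POSIX_FADV_DONTNEED)


def probe(path, size=100 * CHUNK, interval=3.0, max_reads=None):
    """Yield the latency in seconds of each 1M read from a cache-cold file.

    interval and max_reads are assumptions made for this sketch.
    """
    # Create the file, flush it to disk (dirty pages cannot be dropped),
    # then evict it so that the first reads are cache-cold.
    with open(path, "wb") as f:
        for _ in range(size // CHUNK):
            f.write(bytes(CHUNK))
        f.flush()
        os.fsync(f.fileno())
        evict(f.fileno(), size)

    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        done = 0
        while max_reads is None or done < max_reads:
            start = time.monotonic()
            os.pread(fd, CHUNK, offset)
            yield time.monotonic() - start
            done += 1
            offset += CHUNK
            if offset >= size:
                # End of file: dump it from cache and start again.
                evict(fd, size)
                offset = 0
            time.sleep(interval)
    finally:
        os.close(fd)
```

Running `for latency in probe("/some/test/dir/probe.dat"): ...` next to a heavy IO workload would give one latency sample every few seconds. POSIX_FADV_DONTNEED is only advice to the kernel, so a careful version would also verify that the pages really left the cache.
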
A variation would be to measure the time for small writes for
applications that are logging data or applications like gnome-terminal
that do small writes to /tmp as part of their buffer management. The
main strength is that if we get this basic case wrong, then the complex
cases are almost certainly screwed as well.

There are two areas to pay attention to. One is completion time and how
it is affected by the small reads taking place in parallel. A
comprehensive analysis would show exactly how much the workload is
affected by a parallel read but right now I'm just looking at wall time.

The second area to pay attention to is the read latencies, in particular
the average latency and the max latencies. The variations are harder to
draw decent conclusions from. A sensible option would be to plot a CDF
to get a better idea of what the probability of a given read latency is
but for now that's a TODO item. As it is, the graphs are barely usable
and I'll be giving that more thought.

===========================================================
Machine:  arnold
Result:   http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-interactive-performance-ext4/arnold/comparison.html
Arch:     x86
CPUs:     1 socket, 2 threads
Model:    Pentium 4
Disk:     Single Rotary Disk
Status:
===========================================================

fsmark-single
-------------
  Completion times are more or less ok. 3.2 showed a big improvement
  which is not in line with what was experienced in ext3.

  As with ext3, kernel 2.6.32 was a disaster but otherwise our maximum
  read latencies were looking up until 3.3 when there was a big jump
  that was not fixed in 3.4. By and large though the average latencies
  are looking good and while the max latency is bad, the 99th percentile
  was looking good, implying that the worst latencies are rarely
  experienced.

fsmark-threaded
---------------
  Completion times look generally good with 3.1 being an exception.
  Latencies are also looking good.

postmark
--------
  Similar story. Completion times and latencies generally look good.

largedd
-------
  Completion times were higher from 2.6.39 up until 3.3, taking nearly
  two minutes to complete the copy in some cases. This is reflected in
  some of the maximum latencies in that window but by and large the read
  latencies are much improved.

micro
-----
  Looking good all round.

==========================================================
Machine:  hydra
Result:   http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-interactive-performance-ext4/hydra/comparison.html
Arch:     x86-64
CPUs:     1 socket, 4 threads
Model:    AMD Phenom II X4 940
Disk:     Single Rotary Disk
Status:   Ok
==========================================================

fsmark-single
-------------
  Completion times have degraded slightly but are acceptable. All the
  latency figures look good with some big improvements.

fsmark-threaded
---------------
  Same story, generally looking good with big improvements.

postmark
--------
  Completion times are a bit varied but latencies look good.

largedd
-------
  Completion times look good. Latency has improved since 2.6.32 but
  there is a big wrinkle in there. Maximum latency was 337ms in kernel
  3.2 but in 3.3 it was 707ms and in 3.4 it was 990ms. The 99th
  percentile figures look good but something happened to allow bigger
  outliers.

micro
-----
  For the most part, looks good but there was a big jump in the maximum
  latency in kernel 3.4. Like largedd, the 99th percentile did not look
  as bad so it might be an outlier.

==========================================================
Machine:  sandy
Result:   http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-interactive-performance-ext4/sandy/comparison.html
Arch:     x86-64
CPUs:     1 socket, 8 threads
Model:    Intel Core i7-2600
Disk:     Single Rotary Disk
Status:
==========================================================

fsmark-single
-------------
  Completion times have degraded slightly but are acceptable.
  All the latency figures look good with some big improvements.

fsmark-threaded
---------------
  Completion times are improved although curiously it is not reflected
  in the performance figures for fsmark itself. Maximum latency figures
  generally look good other than a mild jump in 3.2 that has almost been
  recovered.

postmark
--------
  Completion times have varied a lot and 3.4 is particularly high. The
  latency figures in general regressed in 3.4 in comparison to 3.3 but
  by and large the figures look good.

largedd
-------
  Completion times generally look good but were noticeably worse for a
  number of releases between 2.6.39 and 3.2. This same window showed
  much higher latency figures, with kernel 3.1 showing a maximum latency
  of 1.3 seconds for example. These were mostly outliers though as the
  99th percentile generally looked ok.

micro
-----
  Generally much improved.

-- 
Mel Gorman
SUSE Labs
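
The reports above repeatedly contrast the maximum latency with the 99th percentile, and the benchmark notes list a CDF as a TODO item. The following is one possible way to compute those figures from a list of read latencies. It is a sketch, not the MMTests reporting code: the function names are hypothetical, the percentile uses the nearest-rank definition (an assumption, since the post does not say which definition was used), and the sample data is made up purely for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def empirical_cdf(samples):
    """Return one (latency, fraction of samples <= latency) pair per
    sample, in ascending order of latency."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


def summarise(samples):
    """Mean, 99th percentile and maximum of a list of latencies."""
    return {
        "mean": sum(samples) / len(samples),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


# Made-up data in milliseconds: 99 fast reads and a single slow outlier.
latencies_ms = [10] * 99 + [990]
print(summarise(latencies_ms))
# {'mean': 19.8, 'p99': 10, 'max': 990}
```

With data shaped like this the maximum is dominated by a single read while the 99th percentile stays low, which is the pattern described for several of the largedd and micro results. Plotting the pairs returned by `empirical_cdf` would show the same thing graphically.
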