From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-block <linux-block@vger.kernel.org>,
Matthew Wilcox <willy@infradead.org>, Chris Mason <clm@fb.com>,
Dave Chinner <david@fromorbit.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED
Date: Wed, 11 Dec 2019 18:41:44 -0700
Message-ID: <6e2ca035-0e06-1def-5ea9-90a7466b2d49@kernel.dk>
In-Reply-To: <d8a8ea42-7f76-926c-ae9a-d49b11578153@kernel.dk>
On 12/11/19 6:29 PM, Jens Axboe wrote:
> On 12/11/19 6:22 PM, Linus Torvalds wrote:
>> On Wed, Dec 11, 2019 at 5:11 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> 15K is likely too slow to really show an issue, I'm afraid. The 970
>>> is no slouch, but your crypt setup will likely hamper it a lot. You
>>> don't have a non-encrypted partition on it?
>>
>> No. I normally don't need all that much disk, so I've never upgraded
>> my ssd from the 512G size.
>>
>> Which means that it's actually half full or so, and I never felt like
>> "I should keep an unencrypted partition for IO testing", since I don't
>> generally _do_ any IO testing.
>>
>> I can get my load up with "numjobs=8" and get my iops up to the 100k
>> range, though.
>>
>> But kswapd doesn't much seem to care, the CPU percentage actually goes
>> _down_ to 0.39% when I try that. Probably simply because now my CPUs
>> are busy, so they are running at 4.7GHz instead of the 800MHz "mostly
>> idle" state ...
>>
>> I guess I should be happy. It does mean that the situation you see
>> isn't exactly the normal case. I understand why you want to do the
>> non-cached case, but the case I think is the worrisome one is the
>> regular buffered one, so that's what I'm testing (not even trying the
>> noaccess patches).
>>
>> So from your report I went "uhhuh, that sounds like a bug". And it
>> appears that it largely isn't - you're seeing it because of pushing
>> the IO subsystem by another order of magnitude (and then I agree that
>> "under those kinds of IO loads, caching just won't help")
>
> I'd very much argue that it IS a bug, maybe just doesn't show on your
> system. My test box is a pretty standard 2 socket system, 24 cores / 48
> threads, 2 nodes. The last numbers I sent were 100K IOPS, so nothing
> crazy, and granted that's only 10% kswapd cpu time, but that still seems
> very high for those kinds of rates. I'm surprised you see essentially no
> kswapd time for the same data rate.
>
> We'll keep poking here, I know Johannes is spending some time looking
> into the reclaim side.
Out of curiosity, I just tried it on my laptop, which also has some
Samsung drive. Using 8 jobs, I get around 100K IOPS too, and this
is my top listing:
23308 axboe     20   0  623156   1304      8 D  10.3   0.0   0:03.81 fio
23309 axboe     20   0  623160   1304      8 D  10.3   0.0   0:03.81 fio
23311 axboe     20   0  623168   1304      8 D  10.3   0.0   0:03.82 fio
23313 axboe     20   0  623176   1304      8 D  10.3   0.0   0:03.82 fio
23314 axboe     20   0  623180   1304      8 D  10.3   0.0   0:03.81 fio
  162 root      20   0       0      0      0 S   9.9   0.0   0:12.97 kswapd0
23307 axboe     20   0  623152   1304      8 D   9.9   0.0   0:03.84 fio
23310 axboe     20   0  623164   1304      8 D   9.9   0.0   0:03.81 fio
23312 axboe     20   0  623172   1304      8 D   9.9   0.0   0:03.80 fio
kswapd stays between 9% and 11% the whole time, and the profile looks
very similar to what I saw on my test box:
    35.79%  kswapd0  [kernel.vmlinux]  [k] xas_create
     9.97%  kswapd0  [kernel.vmlinux]  [k] free_pcppages_bulk
     9.94%  kswapd0  [kernel.vmlinux]  [k] isolate_lru_pages
     7.78%  kswapd0  [kernel.vmlinux]  [k] shrink_page_list
     3.78%  kswapd0  [kernel.vmlinux]  [k] xas_clear_mark
     3.08%  kswapd0  [kernel.vmlinux]  [k] workingset_eviction
     2.48%  kswapd0  [kernel.vmlinux]  [k] __isolate_lru_page
     2.06%  kswapd0  [kernel.vmlinux]  [k] page_mapping
     1.95%  kswapd0  [kernel.vmlinux]  [k] __remove_mapping
So now I'm even more puzzled why your (desktop?) doesn't show it; it
must be more potent than my X1 laptop. But for me, the laptop and the 2
socket test box show EXACTLY the same behavior, the laptop is just too
slow to make it really pathological.
--
Jens Axboe