From: John Hubbard <jhubbard@nvidia.com>
To: Jerome Glisse <jglisse@redhat.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: <john.hubbard@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Christian Benvenuti <benve@cisco.com>,
Christoph Hellwig <hch@infradead.org>,
Christopher Lameter <cl@linux.com>,
Dan Williams <dan.j.williams@intel.com>,
Dave Chinner <david@fromorbit.com>,
Dennis Dalessandro <dennis.dalessandro@intel.com>,
Doug Ledford <dledford@redhat.com>,
Ira Weiny <ira.weiny@intel.com>, Jan Kara <jack@suse.cz>,
Jason Gunthorpe <jgg@ziepe.ca>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@kernel.org>,
Mike Rapoport <rppt@linux.ibm.com>,
Mike Marciniszyn <mike.marciniszyn@intel.com>,
Ralph Campbell <rcampbell@nvidia.com>,
Tom Talpey <tom@talpey.com>, LKML <linux-kernel@vger.kernel.org>,
<linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v4 1/1] mm: introduce put_user_page*(), placeholder versions
Date: Tue, 19 Mar 2019 12:24:00 -0700
Message-ID: <bf443287-2461-ea2d-5a15-251190782ab7@nvidia.com>
In-Reply-To: <20190319134724.GB3437@redhat.com>
On 3/19/19 6:47 AM, Jerome Glisse wrote:
> On Tue, Mar 19, 2019 at 03:04:17PM +0300, Kirill A. Shutemov wrote:
>> On Fri, Mar 08, 2019 at 01:36:33PM -0800, john.hubbard@gmail.com wrote:
>>> From: John Hubbard <jhubbard@nvidia.com>
>
> [...]
>>> +void put_user_pages_dirty(struct page **pages, unsigned long npages)
>>> +{
>>> + __put_user_pages_dirty(pages, npages, set_page_dirty);
>>
>> Have you checked if compiler is clever enough eliminate indirect function
>> call here? Maybe it's better to go with an opencodded approach and get rid
>> of callbacks?
>>
>
> Good point, dunno if John did check that.
Hi Kirill, Jerome,
The compiler does *not* eliminate the indirect function call, at least unless
I'm misunderstanding things. The __put_user_pages_dirty() function calls the
appropriate set_page_dirty*() call, via __x86_indirect_thunk_r12, which seems
pretty definitive.
ffffffff81a00ef0 <__x86_indirect_thunk_r12>:
ffffffff81a00ef0: 41 ff e4 jmpq *%r12
ffffffff81a00ef3: 90 nop
ffffffff81a00ef4: 90 nop
ffffffff81a00ef5: 90 nop
ffffffff81a00ef6: 90 nop
ffffffff81a00ef7: 90 nop
ffffffff81a00ef8: 90 nop
ffffffff81a00ef9: 90 nop
ffffffff81a00efa: 90 nop
ffffffff81a00efb: 90 nop
ffffffff81a00efc: 90 nop
ffffffff81a00efd: 90 nop
ffffffff81a00efe: 90 nop
ffffffff81a00eff: 90 nop
ffffffff81a00f00: 90 nop
ffffffff81a00f01: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
ffffffff81a00f08: 00 00 00 00
ffffffff81a00f0c: 0f 1f 40 00 nopl 0x0(%rax)
However, there is no visible overhead to doing so, at a macro level. An fio
O_DIRECT run with and without the full conversion patchset shows the same
numbers:
cat fio.conf
[reader]
direct=1
ioengine=libaio
blocksize=4096
size=1g
numjobs=1
rw=read
iodepth=64
=====================
Before (baseline):
=====================
reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
reader: (groupid=0, jobs=1): err= 0: pid=1828: Mon Mar 18 14:56:22 2019
read: IOPS=192k, BW=751MiB/s (787MB/s)(1024MiB/1364msec)
slat (nsec): min=1274, max=42375, avg=1564.12, stdev=682.65
clat (usec): min=168, max=12209, avg=331.01, stdev=184.95
lat (usec): min=171, max=12215, avg=332.61, stdev=185.11
clat percentiles (usec):
| 1.00th=[ 326], 5.00th=[ 326], 10.00th=[ 326], 20.00th=[ 326],
| 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
| 70.00th=[ 326], 80.00th=[ 326], 90.00th=[ 326], 95.00th=[ 326],
| 99.00th=[ 519], 99.50th=[ 523], 99.90th=[ 537], 99.95th=[ 594],
| 99.99th=[12125]
bw ( KiB/s): min=755280, max=783016, per=100.00%, avg=769148.00, stdev=19612.31, samples=2
iops : min=188820, max=195754, avg=192287.00, stdev=4903.08, samples=2
lat (usec) : 250=0.14%, 500=98.59%, 750=1.25%
lat (msec) : 20=0.02%
cpu : usr=12.69%, sys=48.20%, ctx=248836, majf=0, minf=73
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=751MiB/s (787MB/s), 751MiB/s-751MiB/s (787MB/s-787MB/s), io=1024MiB (1074MB), run=1364-1364msec
Disk stats (read/write):
nvme0n1: ios=220106/0, merge=0/0, ticks=70136/0, in_queue=704, util=91.19%
==================================================
After (with enough callsites converted to run fio):
==================================================
reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
reader: (groupid=0, jobs=1): err= 0: pid=2026: Mon Mar 18 14:35:07 2019
read: IOPS=192k, BW=751MiB/s (787MB/s)(1024MiB/1364msec)
slat (nsec): min=1263, max=41861, avg=1591.99, stdev=692.09
clat (usec): min=154, max=12205, avg=330.82, stdev=184.98
lat (usec): min=157, max=12212, avg=332.45, stdev=185.14
clat percentiles (usec):
| 1.00th=[ 322], 5.00th=[ 326], 10.00th=[ 326], 20.00th=[ 326],
| 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
| 70.00th=[ 326], 80.00th=[ 326], 90.00th=[ 326], 95.00th=[ 326],
| 99.00th=[ 502], 99.50th=[ 510], 99.90th=[ 523], 99.95th=[ 570],
| 99.99th=[12125]
bw ( KiB/s): min=746848, max=783088, per=99.51%, avg=764968.00, stdev=25625.55, samples=2
iops : min=186712, max=195772, avg=191242.00, stdev=6406.39, samples=2
lat (usec) : 250=0.09%, 500=98.88%, 750=1.01%
lat (msec) : 20=0.02%
cpu : usr=14.38%, sys=48.64%, ctx=248037, majf=0, minf=73
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=751MiB/s (787MB/s), 751MiB/s-751MiB/s (787MB/s-787MB/s), io=1024MiB (1074MB), run=1364-1364msec
Disk stats (read/write):
nvme0n1: ios=220228/0, merge=0/0, ticks=70426/0, in_queue=704, util=91.27%
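To put a rough number on "the same", here is a minimal sketch that computes the relative change in two of the averages reported above (avg IOPS and avg slat). The helper names are made up for illustration; the constants are copied from the two fio runs.

```c
#include <stdio.h>

/* Averages copied from the fio output above (before / after). */
static const double iops_before = 192287.00, iops_after = 191242.00;
static const double slat_before = 1564.12,   slat_after = 1591.99; /* nsec */

/* Relative change from "before" to "after", in percent. */
static double rel_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Hypothetical helper: print both deltas in one place. */
static void report_deltas(void)
{
	printf("avg IOPS change: %+.2f%%\n",
	       rel_change_pct(iops_before, iops_after));
	printf("avg slat change: %+.2f%%\n",
	       rel_change_pct(slat_before, slat_after));
}
```

The avg IOPS delta works out to about -0.54%, which is well inside the stdev that fio reports for either run (and each run only has samples=2), so this is noise rather than a signal. The avg slat moves by about +1.78%, or roughly 28 nsec.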
So, I could be persuaded either way. But given the lack of any visible perf
effects, and given that this code will get removed anyway because we'll
likely end up with set_page_dirty() called at GUP time instead...it seems
like it's probably OK to just leave it as is.
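For anyone following along, the two shapes being compared look roughly like the sketch below. This is a simplified userspace illustration, not the patch itself: struct sketch_page and every sketch_* name are hypothetical stand-ins, and the real code operates on struct page with the kernel's own helpers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct sketch_page {
	bool dirty;
	int refcount;
};

/* Hypothetical callback type for the "set dirty" routine. */
typedef int (*sketch_set_dirty_t)(struct sketch_page *page);

/* Stand-in for set_page_dirty(): just flips a flag. */
static int sketch_set_page_dirty(struct sketch_page *page)
{
	page->dirty = true;
	return 0;
}

/* Stand-in for put_user_page(): just drops one reference. */
static void sketch_put_user_page(struct sketch_page *page)
{
	page->refcount--;
}

/* Callback form: the dirtying routine is reached through a pointer. */
static void sketch_put_pages_dirty_cb(struct sketch_page **pages,
				      unsigned long npages,
				      sketch_set_dirty_t sdf)
{
	unsigned long i;

	for (i = 0; i < npages; i++) {
		if (!pages[i]->dirty)
			sdf(pages[i]);
		sketch_put_user_page(pages[i]);
	}
}

/* Open-coded form: same loop, but the callee is named directly. */
static void sketch_put_pages_dirty_open(struct sketch_page **pages,
					unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++) {
		if (!pages[i]->dirty)
			sketch_set_page_dirty(pages[i]);
		sketch_put_user_page(pages[i]);
	}
}
```

Both forms do the same work per page. The difference is only in how the dirtying routine is reached, and whether a compiler turns the callback form into a direct call depends on its inlining decisions, so a toy like this says nothing about what the kernel build actually generates.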
thanks,
--
John Hubbard
NVIDIA