From: John Hubbard <jhubbard@nvidia.com>
To: Jerome Glisse <jglisse@redhat.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: <john.hubbard@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christian Benvenuti <benve@cisco.com>,
	Christoph Hellwig <hch@infradead.org>,
	Christopher Lameter <cl@linux.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Doug Ledford <dledford@redhat.com>,
	Ira Weiny <ira.weiny@intel.com>, Jan Kara <jack@suse.cz>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@kernel.org>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Tom Talpey <tom@talpey.com>, LKML <linux-kernel@vger.kernel.org>,
	<linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v4 1/1] mm: introduce put_user_page*(), placeholder versions
Date: Tue, 19 Mar 2019 12:24:00 -0700
Message-ID: <bf443287-2461-ea2d-5a15-251190782ab7@nvidia.com>
In-Reply-To: <20190319134724.GB3437@redhat.com>

On 3/19/19 6:47 AM, Jerome Glisse wrote:
> On Tue, Mar 19, 2019 at 03:04:17PM +0300, Kirill A. Shutemov wrote:
>> On Fri, Mar 08, 2019 at 01:36:33PM -0800, john.hubbard@gmail.com wrote:
>>> From: John Hubbard <jhubbard@nvidia.com>
> 
> [...]
>>> +void put_user_pages_dirty(struct page **pages, unsigned long npages)
>>> +{
>>> +	__put_user_pages_dirty(pages, npages, set_page_dirty);
>>
>> Have you checked if the compiler is clever enough to eliminate the indirect
>> function call here? Maybe it's better to go with an open-coded approach and
>> get rid of callbacks?
>>
> 
> Good point, dunno if John did check that.

Hi Kirill, Jerome,

The compiler does *not* eliminate the indirect function call, at least unless
I'm misunderstanding things. The __put_user_pages_dirty() function calls the
appropriate set_page_dirty*() call, via __x86_indirect_thunk_r12, which seems
pretty definitive.

ffffffff81a00ef0 <__x86_indirect_thunk_r12>:
ffffffff81a00ef0:	41 ff e4             	jmpq   *%r12
ffffffff81a00ef3:	90                   	nop
ffffffff81a00ef4:	90                   	nop
ffffffff81a00ef5:	90                   	nop
ffffffff81a00ef6:	90                   	nop
ffffffff81a00ef7:	90                   	nop
ffffffff81a00ef8:	90                   	nop
ffffffff81a00ef9:	90                   	nop
ffffffff81a00efa:	90                   	nop
ffffffff81a00efb:	90                   	nop
ffffffff81a00efc:	90                   	nop
ffffffff81a00efd:	90                   	nop
ffffffff81a00efe:	90                   	nop
ffffffff81a00eff:	90                   	nop
ffffffff81a00f00:	90                   	nop
ffffffff81a00f01:	66 66 2e 0f 1f 84 00 	data16 nopw %cs:0x0(%rax,%rax,1)
ffffffff81a00f08:	00 00 00 00 
ffffffff81a00f0c:	0f 1f 40 00          	nopl   0x0(%rax)
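
For readers following along, here is a rough userspace sketch of the two
shapes under discussion: a helper that takes the dirtying function as a
callback (which is where an indirect call can come from), and an open-coded
variant that calls it directly. This is only an illustration under stated
assumptions, not the kernel code: struct page here is a hypothetical two-field
stand-in, and mock_set_page_dirty(), mock_put_user_page(),
put_pages_dirty_callback() and put_pages_dirty_opencoded() are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page (not the real layout). */
struct page {
	bool dirty;
	int refcount;
};

typedef int (*set_dirty_func)(struct page *page);

/* Mock of a set_page_dirty()-style function: returns 1 if newly dirtied. */
int mock_set_page_dirty(struct page *page)
{
	if (page->dirty)
		return 0;
	page->dirty = true;
	return 1;
}

/* Mock of a put_user_page()-style release: drops one reference. */
void mock_put_user_page(struct page *page)
{
	page->refcount--;
}

/*
 * Callback style: the dirtying function arrives as a pointer, so unless the
 * compiler inlines this helper into its caller, sdf() is an indirect call.
 */
void put_pages_dirty_callback(struct page **pages, unsigned long npages,
			      set_dirty_func sdf)
{
	unsigned long i;

	assert(pages != NULL || npages == 0);
	for (i = 0; i < npages; i++) {
		if (!pages[i]->dirty)
			sdf(pages[i]);		/* indirect call */
		mock_put_user_page(pages[i]);
	}
}

/*
 * Open-coded style: the loop is written out per variant, so the callee is
 * known at compile time and the call is direct.
 */
void put_pages_dirty_opencoded(struct page **pages, unsigned long npages)
{
	unsigned long i;

	assert(pages != NULL || npages == 0);
	for (i = 0; i < npages; i++) {
		if (!pages[i]->dirty)
			mock_set_page_dirty(pages[i]);	/* direct call */
		mock_put_user_page(pages[i]);
	}
}
```

The trade-off is the usual one: the callback form keeps a single loop for all
variants, while the open-coded form repeats the loop once per variant in
exchange for direct calls.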

However, there is no visible overhead to doing so, at a macro level. An fio
O_DIRECT run with and without the full conversion patchset shows the same 
numbers:

cat fio.conf 
[reader]
direct=1
ioengine=libaio
blocksize=4096
size=1g
numjobs=1
rw=read
iodepth=64

=====================
Before (baseline):
=====================

reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process

reader: (groupid=0, jobs=1): err= 0: pid=1828: Mon Mar 18 14:56:22 2019
   read: IOPS=192k, BW=751MiB/s (787MB/s)(1024MiB/1364msec)
    slat (nsec): min=1274, max=42375, avg=1564.12, stdev=682.65
    clat (usec): min=168, max=12209, avg=331.01, stdev=184.95
     lat (usec): min=171, max=12215, avg=332.61, stdev=185.11
    clat percentiles (usec):
     |  1.00th=[  326],  5.00th=[  326], 10.00th=[  326], 20.00th=[  326],
     | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[  326],
     | 70.00th=[  326], 80.00th=[  326], 90.00th=[  326], 95.00th=[  326],
     | 99.00th=[  519], 99.50th=[  523], 99.90th=[  537], 99.95th=[  594],
     | 99.99th=[12125]
   bw (  KiB/s): min=755280, max=783016, per=100.00%, avg=769148.00, stdev=19612.31, samples=2
   iops        : min=188820, max=195754, avg=192287.00, stdev=4903.08, samples=2
  lat (usec)   : 250=0.14%, 500=98.59%, 750=1.25%
  lat (msec)   : 20=0.02%
  cpu          : usr=12.69%, sys=48.20%, ctx=248836, majf=0, minf=73
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=751MiB/s (787MB/s), 751MiB/s-751MiB/s (787MB/s-787MB/s), io=1024MiB (1074MB), run=1364-1364msec

Disk stats (read/write):
  nvme0n1: ios=220106/0, merge=0/0, ticks=70136/0, in_queue=704, util=91.19%

==================================================
After (with enough callsites converted to run fio):
==================================================

reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process

reader: (groupid=0, jobs=1): err= 0: pid=2026: Mon Mar 18 14:35:07 2019
   read: IOPS=192k, BW=751MiB/s (787MB/s)(1024MiB/1364msec)
    slat (nsec): min=1263, max=41861, avg=1591.99, stdev=692.09
    clat (usec): min=154, max=12205, avg=330.82, stdev=184.98
     lat (usec): min=157, max=12212, avg=332.45, stdev=185.14
    clat percentiles (usec):
     |  1.00th=[  322],  5.00th=[  326], 10.00th=[  326], 20.00th=[  326],
     | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[  326],
     | 70.00th=[  326], 80.00th=[  326], 90.00th=[  326], 95.00th=[  326],
     | 99.00th=[  502], 99.50th=[  510], 99.90th=[  523], 99.95th=[  570],
     | 99.99th=[12125]
   bw (  KiB/s): min=746848, max=783088, per=99.51%, avg=764968.00, stdev=25625.55, samples=2
   iops        : min=186712, max=195772, avg=191242.00, stdev=6406.39, samples=2
  lat (usec)   : 250=0.09%, 500=98.88%, 750=1.01%
  lat (msec)   : 20=0.02%
  cpu          : usr=14.38%, sys=48.64%, ctx=248037, majf=0, minf=73
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=751MiB/s (787MB/s), 751MiB/s-751MiB/s (787MB/s-787MB/s), io=1024MiB (1074MB), run=1364-1364msec

Disk stats (read/write):
  nvme0n1: ios=220228/0, merge=0/0, ticks=70426/0, in_queue=704, util=91.27%
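
As a quick sanity check on the two runs above, the small sketch below compares
the reported average IOPS (192287.00 before, 191242.00 after) against the
reported standard deviations (4903.08 and 6406.39). The helper names
relative_change_percent() and within_one_stdev() are made up for this
illustration. Since fio reports only samples=2 for these lines, this is a
rough indication rather than a statistical test.

```c
#include <assert.h>

/* Relative change from baseline to after, in percent (negative = slower). */
double relative_change_percent(double baseline, double after)
{
	assert(baseline != 0.0);
	return (after - baseline) / baseline * 100.0;
}

/*
 * Returns 1 when the two means differ by less than the smaller of the two
 * reported standard deviations, i.e. the difference is plausibly noise.
 */
int within_one_stdev(double mean_a, double stdev_a,
		     double mean_b, double stdev_b)
{
	double diff = mean_a > mean_b ? mean_a - mean_b : mean_b - mean_a;
	double smaller = stdev_a < stdev_b ? stdev_a : stdev_b;

	return diff < smaller;
}
```

With the numbers above, relative_change_percent(192287.00, 191242.00) comes
out at roughly -0.5%, and within_one_stdev(192287.00, 4903.08, 191242.00,
6406.39) returns 1, since the 1045 IOPS gap is well under either stdev.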


So, I could be persuaded either way. But given the lack of any visible perf
effects, and given that this code will get removed anyway because we'll
likely end up with set_page_dirty() called at GUP time instead...it seems
like it's probably OK to just leave it as is.

thanks,
-- 
John Hubbard
NVIDIA

