linux-kernel.vger.kernel.org archive mirror
From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Kirill Smelkov <kirr@nexedi.com>, Michel Lespinasse <walken@google.com>
Cc: linux-man <linux-man@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: Why mmap(MAP_POPULATE | MAP_NONBLOCK) is needed (Re: [patch] mmap.2: Add link to commit which broke MAP_POPULATE | MAP_NONBLOCK to be noop)
Date: Mon, 20 Mar 2017 20:38:50 +0100
Message-ID: <CAKgNAkinN=4KeAOuZEsxhP_XPbiNjJ4ngF4Jb0WcB-UTVSjEGQ@mail.gmail.com>
In-Reply-To: <20170320155948.pgpp2uhgoppicdl4@deco.navytux.spb.ru>

[CC += Michel Lespinasse <walken@google.com>]

Kirill,

I need some help here.

On 20 March 2017 at 16:59, Kirill Smelkov <kirr@nexedi.com> wrote:
> On Sat, Mar 18, 2017 at 10:40:10PM +0300, Kirill Smelkov wrote:
>> Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
>> ---
>>  man2/mmap.2 | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/man2/mmap.2 b/man2/mmap.2
>> index 96875e486..f6fd56523 100644
>> --- a/man2/mmap.2
>> +++ b/man2/mmap.2
>> @@ -300,6 +300,7 @@ Don't perform read-ahead:
>>  create page tables entries only for pages
>>  that are already present in RAM.
>>  Since Linux 2.6.23, this flag causes
>> +.\" commit 54cb8821de07f2ffcd28c380ce9b93d5784b40d7
>>  .BR MAP_POPULATE
>>  to do nothing.
>>  One day, the combination of
>
> Please also find below benchmark which explains why
>
>         mmap(MAP_POPULATE | MAP_NONBLOCK)
>
> is actually needed.

Okay -- clearly things have changed (but I received no man-pages
patch). What do you believe the man page should now say?

Or, perhaps we can ask Michel:

commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
Author: Michel Lespinasse <walken@google.com>
Date:   Fri Feb 22 16:32:37 2013 -0800

The above commit (which went into Linux 3.9) seems to be the source of
the change.

Michel, can you suggest to us what the mmap() man page should now say
about MAP_POPULATE?

Thanks,

Michael


>
> Thanks,
> Kirill
>
> ---- 8< ---- (https://lab.nexedi.com/kirr/misc/blob/5a25f4ae/t_sysmmap_c.c)
> /* This program benchmarks pagefault time.
>  *
>  * Unfortunately as of 2017-Mar-20 for data in pagecache the situation is as
>  * follows (i7-6600U, Linux 4.9.13):
>  *
>  * 1. minor pagefault:                  ~ 1200ns
>  *    (this program)
>  *
>  * 2. read syscall + whole page copy:   ~  215ns
>  *    (https://github.com/golang/go/issues/19563#issuecomment-287423654)
>  *
>  * 3. it is not possible to mmap(MAP_POPULATE | MAP_NONBLOCK) (i.e. prefault
>  *    those PTE that are already in pagecache).
>  *    ( http://www.spinics.net/lists/linux-man/msg11420.html,
>  *      https://git.kernel.org/linus/54cb8821de07f2ffcd28c380ce9b93d5784b40d7 )
>  *
>  * 4. (Q) I'm not sure a mechanism exists in the kernel to automatically
>  *    subscribe a VMA so that when a page becomes pagecached, associated PTE is
>  *    adjusted so that programs won't need to pay minor pagefault time on
>  *    access.
>  *
>  * unless 3 and 4 are solved mmap unfortunately seems to be slower choice
>  * compared to just pread.
>  */
> #define _GNU_SOURCE
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <sys/user.h>
> #include <sys/mman.h>
>
> //               12345678
> #define NITER   500000
>
> // microtime returns current time as double
> double microtime() {
>         int err;
>         struct timeval tv;
>
>         err = gettimeofday(&tv, NULL);
>         if (err == -1) {
>                 perror("gettimeofday");
>                 abort();
>         }
>
>         return tv.tv_sec + 1E-6 * tv.tv_usec;
> }
>
>
> int main() {
>         unsigned char *addr, sum = 0;
>         int fd, err, i;
>         size_t size;
>         double Tstart, Tend;
>
>         fd = open("/dev/shm/y.dat", O_RDWR | O_CREAT | O_TRUNC, 0666);
>         if (fd == -1) {
>                 perror("open");
>                 abort();
>         }
>
>         size = NITER * PAGE_SIZE;
>
>         err = ftruncate(fd, size);
>         if (err == -1) {
>                 perror("ftruncate");
>                 abort();
>         }
>
> #if 1
>         // make sure RAM is actually allocated
>         Tstart = microtime();
>         err = fallocate(fd, /*mode*/0, 0, size);
>         Tend = microtime();
>         if (err == -1) {
>                 perror("fallocate");
>                 abort();
>         }
>         printf("T(fallocate):\t%.1f\t%6.1f ns / page\n", Tend - Tstart, (Tend - Tstart) * 1E9 / NITER);
> #endif
>
>         Tstart = microtime();
>         addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
>         //addr = mmap(NULL, size, PROT_READ, MAP_SHARED | MAP_POPULATE, fd, 0);
>         //addr = mmap(NULL, size, PROT_READ, MAP_SHARED | MAP_POPULATE | MAP_NONBLOCK, fd, 0);
>         if (addr == MAP_FAILED) {
>                 perror("mmap");
>                 abort();
>         }
>         Tend = microtime();
>         printf("T(mmap):\t%.1f\t%6.1f ns / page\n", Tend - Tstart, (Tend - Tstart) * 1E9 / NITER);
>
>         Tstart = microtime();
>         //for (int j=0; j < 100; j++)
>         for (i=0; i<NITER; i++) {
>                 sum += addr[i*PAGE_SIZE];
>         }
>         Tend = microtime();
>
>         printf("T(pagefault):\t%.1f\t%6.1f ns / page\t(%i)\n", Tend - Tstart, (Tend - Tstart) * 1E9 / NITER, sum);
>
>         return 0;
> }
> ---- 8< ----
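
For comparison with item 2 in the comment above (read syscall + whole
page copy), a minimal sketch of the pread() side of the measurement
might look as follows. The helper name pread_ns_per_page(), its
parameters, and the use of clock_gettime() are assumptions made for
illustration; none of them come from Kirill's program or from the
linked Go issue.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not part of the benchmark above): create a file of
 * npages pages at path, read every page once with pread(), and return the
 * average time per page in nanoseconds, or -1.0 on error.
 * The file is removed again before returning. */
double pread_ns_per_page(const char *path, size_t npages)
{
        long pagesize = sysconf(_SC_PAGESIZE);
        struct timespec t0, t1;
        unsigned char *buf;
        double ns = -1.0;
        size_t i;
        int fd;

        if (pagesize <= 0 || npages == 0)
                return -1.0;

        buf = malloc(pagesize);
        if (buf == NULL)
                return -1.0;

        fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
        if (fd == -1)
                goto out_free;

        /* Extend the file; unwritten ranges read back as zeros. */
        if (ftruncate(fd, (off_t)npages * pagesize) == -1)
                goto out_close;

        if (clock_gettime(CLOCK_MONOTONIC, &t0) == -1)
                goto out_close;
        for (i = 0; i < npages; i++) {
                /* one syscall plus one whole-page copy per page */
                if (pread(fd, buf, pagesize, (off_t)i * pagesize) != pagesize)
                        goto out_close;
        }
        if (clock_gettime(CLOCK_MONOTONIC, &t1) == -1)
                goto out_close;

        ns = ((t1.tv_sec - t0.tv_sec) * 1E9 + (t1.tv_nsec - t0.tv_nsec)) / npages;

out_close:
        close(fd);
        unlink(path);
out_free:
        free(buf);
        return ns;
}
```

Calling pread_ns_per_page("/dev/shm/y.dat", 500000) would mirror the
file and NITER used above. Unlike the original program, this sketch
does not call fallocate(), so the pages are holes and the result is
not directly comparable with the ~215ns figure quoted in the comment.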



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
