From: Andrea Arcangeli <aarcange@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Pavel Emelyanov <xemul@virtuozzo.com>,
	linux-mm <linux-mm@kvack.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE
Date: Thu, 1 Jun 2017 15:45:22 +0200
Message-ID: <20170601134522.GE302@redhat.com>
In-Reply-To: <20170601080909.GD32677@dhcp22.suse.cz>

On Thu, Jun 01, 2017 at 10:09:09AM +0200, Michal Hocko wrote:
> That is a bit surprising. I didn't think that the userfault syscall
> (ioctl) can be faster than a regular #PF but considering that
> __mcopy_atomic bypasses the page fault path and it can be optimized for
> the anon case suggests that we can save some cycles for each page and so
> the cumulative savings can be visible.

__mcopy_atomic works not just for anonymous memory: hugetlbfs and shmem
are covered too, and there are branches to handle those.

If you were to run more than one precopy pass, UFFDIO_COPY will become
slower than the userland access starting from the second pass.

In light of this, if CRIU can only do a single pass of precopy,
CRIU is probably better off using UFFDIO_COPY than using prctl or
madvise to temporarily turn off THP.
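
To make the two precopy options concrete, here is a minimal sketch of
both, not code taken from CRIU or QEMU. The helper names
(make_copy_req, precopy_with_uffdio_copy, precopy_with_memcpy) are
hypothetical, and uffd is assumed to be a userfaultfd descriptor that
was already opened and registered over the destination range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/* Build the argument for one UFFDIO_COPY call (hypothetical helper). */
static struct uffdio_copy make_copy_req(void *dst, const void *src, size_t len)
{
    struct uffdio_copy req;

    assert(len > 0);                 /* a zero-length copy is rejected anyway */
    memset(&req, 0, sizeof(req));
    req.dst = (uint64_t)(uintptr_t)dst;
    req.src = (uint64_t)(uintptr_t)src;
    req.len = len;
    req.mode = 0;                    /* default: wake up any waiting faulters */
    return req;
}

/*
 * Option 1: let the kernel copy and map the pages through the
 * userfaultfd ioctl. Returns 0 on success, -errno on failure.
 */
static int precopy_with_uffdio_copy(int uffd, void *dst, const void *src,
                                    size_t len)
{
    struct uffdio_copy req = make_copy_req(dst, src, len);

    if (ioctl(uffd, UFFDIO_COPY, &req) == -1)
        return -errno;
    return 0;
}

/*
 * Option 2: plain userland access. The first touch of each page goes
 * through the page fault path; writes to already mapped pages do not
 * enter the kernel at all.
 */
static void precopy_with_memcpy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}
```

Option 1 pays one ioctl per call on every pass, while option 2 pays
page faults only the first time a page is touched, which is the
trade-off described above.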

With QEMU, by contrast, we set MADV_HUGEPAGE during precopy on the
destination to maximize THP utilization for all those naturally
aligned 2M guest regions that aren't re-dirtied in the source. So we're
better off without UFFDIO_COPY in precopy, even during the first
pass, to avoid entering the kernel for subpages that are written to
the destination in an already instantiated THP. At least until we teach
QEMU to map 2M at once if possible (UFFDIO_COPY would then also
require an enhancement, because currently it won't map THP on the
fly).
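
As a rough illustration of the destination-side setup, the sketch
below marks a buffer with MADV_HUGEPAGE and computes which part of it
consists of naturally aligned 2M regions. This is not QEMU's code:
struct range, thp_aligned_subrange and advise_hugepage are
hypothetical names, and the 2M THP size is an assumption taken from
the figure used in this thread.

```c
#define _DEFAULT_SOURCE              /* for MADV_HUGEPAGE with strict -std= */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumed THP size: 2M, as discussed in this thread. */
#define THP_SIZE ((uintptr_t)2 << 20)

struct range {
    uintptr_t start;
    size_t len;
};

/*
 * Return the largest sub-range of [start, start + len) made only of
 * naturally aligned 2M regions, or a zero-length range if there is none.
 */
static struct range thp_aligned_subrange(uintptr_t start, size_t len)
{
    struct range r = { 0, 0 };
    uintptr_t end = start + len;
    uintptr_t first, last;

    assert(end >= start);            /* caller must not wrap the address space */
    first = (start + THP_SIZE - 1) & ~(THP_SIZE - 1);   /* round up */
    last = end & ~(THP_SIZE - 1);                       /* round down */
    if (first < last) {
        r.start = first;
        r.len = last - first;
    }
    return r;
}

/*
 * Ask for THP on a page-aligned mapping. Returns 0 on success and
 * -errno on failure, for example when the kernel has no THP support.
 */
static int advise_hugepage(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_HUGEPAGE) == -1)
        return -errno;
    return 0;
}
```

madvise itself only needs a page-aligned address. The helper just
shows which part of a buffer can be backed by huge pages at all; the
unaligned head and tail stay on 4K pages.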

Thanks,
Andrea


