linux-kernel.vger.kernel.org archive mirror
From: Pavel Emelyanov <xemul@parallels.com>
To: Matt Helsley <matthltc@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/5] pagemap: Introduce the /proc/PID/pagemap2 file
Date: Sat, 04 May 2013 13:47:40 +0400
Message-ID: <5184D93C.7000806@parallels.com>
In-Reply-To: <20130502170857.GB24627@us.ibm.com>

On 05/02/2013 09:08 PM, Matt Helsley wrote:
> On Thu, Apr 11, 2013 at 03:29:41PM +0400, Pavel Emelyanov wrote:
>> This file is the same as the pagemap one, but shows entries with bits
>> 55-60 being zero (reserved for future use). Next patch will occupy one
>> of them.
> 
> This approach doesn't scale as well as it could. As best I can see
> CRIU would do:
> 
> for each vma in /proc/<pid>/smaps
> 	for each page in /proc/<pid>/pagemap2
> 		if soft dirty bit
> 			copy page
> 
> (possibly with pfn checks to avoid copying the same page mapped in
> multiple locations..)

Comparing pfns from two subsequent pagemap reads doesn't help at all.
If they are equal, either the page is shared or (less likely, but still
possible) the page seen in the 1st pagemap read was reclaimed and then
mapped at the 2nd location between the two reads. If they differ, the pages
are either not shared (most likely) or shared (the pfns were equal, but the
page got reclaimed and swapped back in between the reads).
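
To make the ambiguity concrete, here is a minimal sketch in Python. It
assumes the documented pagemap entry layout (bit 63 = present, bits 0-54 =
pfn when present). The helper names read_entries, pfn_of and same_pfn are
made up for illustration and are not CRIU code. Note that recent kernels
report a zero pfn to readers without CAP_SYS_ADMIN.

```python
import os
import struct

PM_PFRAME_MASK = (1 << 55) - 1   # bits 0-54: pfn when the page is present
PM_PRESENT = 1 << 63             # bit 63: page is present in RAM

def read_entries(pid, vaddr, count):
    """Read `count` 64-bit pagemap entries starting at virtual address `vaddr`."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * 8)   # one 8-byte entry per virtual page
        data = f.read(count * 8)
    n = len(data) // 8
    return list(struct.unpack(f"={n}Q", data[:n * 8]))

def pfn_of(entry):
    """Return the pfn of a present page, or None if the page is not present."""
    if entry & PM_PRESENT:
        return entry & PM_PFRAME_MASK
    return None

def same_pfn(entry_a, entry_b):
    """True if both entries are present and carry the same pfn.

    This is only a hint: the two entries come from separate reads, so the
    frame may have been reclaimed and reused in between.
    """
    a, b = pfn_of(entry_a), pfn_of(entry_b)
    return a is not None and a == b
```

A True result from same_pfn() is compatible with both "shared" and "frame
reused between the reads", and a False result is compatible with both
"not shared" and "shared, but swapped out and back in".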

A better API for page sharing would be nice; such an API could probably
also be re-used for the user-space KSM :)

> However, if soft dirty bit changes could be queued up (from say the
> fault handler and page table ops that map/unmap pages) and accumulated
> in something like an interval tree it could be something like:
> 
> for each range of changed pages
> 	for each page in range
> 		copy page
> 
> IOW something that scales with the number of changed pages rather
> than the number of mapped pages.
> 
> So I wonder if CRIU would abandon pagemap2 in the future for something
> like this.

We'd surely adopt such an API if one exists. One thing to note is that
we'd also appreciate it if this API could batch the "present" bits as well
as the "swapped" and "page-file" ones. We use these three in CRIU as well,
and scanning for these bits can also be optimized.
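
For illustration, the kind of batching meant here can be emulated in
user space today, although it still costs one pass over every mapped page,
which is exactly what an in-kernel API would avoid. A rough sketch in
Python, assuming the pagemap bit layout (63 = present, 62 = swapped,
61 = page-file, 55 = soft-dirty as proposed in this series). The names
decode_entry and batch_ranges are hypothetical:

```python
PM_SOFT_DIRTY = 1 << 55   # proposed soft-dirty bit
PM_FILE = 1 << 61         # page is file-backed ("page-file")
PM_SWAP = 1 << 62         # page is swapped out
PM_PRESENT = 1 << 63      # page is present in RAM

def decode_entry(entry):
    """Split one 64-bit pagemap entry into the flags CRIU looks at."""
    return {
        "present": bool(entry & PM_PRESENT),
        "swapped": bool(entry & PM_SWAP),
        "file": bool(entry & PM_FILE),
        "soft_dirty": bool(entry & PM_SOFT_DIRTY),
    }

def batch_ranges(entries, mask):
    """Collapse consecutive pages with any bit of `mask` set into
    half-open (start, end) page-index ranges."""
    ranges = []
    start = None
    for i, entry in enumerate(entries):
        if entry & mask:
            if start is None:
                start = i            # a new run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous page
            start = None
    if start is not None:
        ranges.append((start, len(entries)))
    return ranges
```

With entries read from pagemap, batch_ranges(entries, PM_SOFT_DIRTY) gives
the ranges to copy, and the same call with PM_PRESENT, PM_SWAP or PM_FILE
gives the other three kinds of ranges.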

> Cheers,
> 	-Matt Helsley
> 

Thanks,
Pavel


Thread overview: 19+ messages
2013-04-11 11:28 [PATCH 0/5] mm: Ability to monitor task memory changes (v3) Pavel Emelyanov
2013-04-11 11:28 ` [PATCH 1/5] clear_refs: Sanitize accepted commands declaration Pavel Emelyanov
2013-04-11 21:17   ` Andrew Morton
2013-04-11 11:29 ` [PATCH 2/5] clear_refs: Introduce private struct for mm_walk Pavel Emelyanov
2013-04-11 11:29 ` [PATCH 3/5] pagemap: Introduce pagemap_entry_t without pmshift bits Pavel Emelyanov
2013-04-11 11:29 ` [PATCH 4/5] pagemap: Introduce the /proc/PID/pagemap2 file Pavel Emelyanov
2013-04-11 21:19   ` Andrew Morton
2013-04-12 13:10     ` Pavel Emelyanov
2013-05-02 17:08   ` Matt Helsley
2013-05-04  9:47     ` Pavel Emelyanov [this message]
2013-04-11 11:30 ` [PATCH 5/5] mm: Soft-dirty bits for user memory changes tracking Pavel Emelyanov
2013-04-11 21:24   ` Andrew Morton
2013-04-12 13:14     ` Pavel Emelyanov
2013-04-15 21:46       ` Andrew Morton
2013-04-15 23:57         ` Stephen Rothwell
2013-04-16 19:58         ` Pavel Emelyanov
2013-04-12 15:53   ` [PATCH 6/5] selftest: Add simple test for soft-dirty bit Pavel Emelyanov
2013-04-16 19:51 ` [PATCH 7/5] mem-soft-dirty: Reshuffle CONFIG_ options to be more Arch-friendly Pavel Emelyanov
2013-04-16 23:24   ` Stephen Rothwell
