From: Rik van Riel <riel@redhat.com>
To: Nai Xia <nai.xia@gmail.com>
Cc: Izik Eidus <izik.eidus@ravellosystems.com>,
	Avi Kivity <avi@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Chris Wright <chrisw@sous-sol.org>, linux-mm <linux-mm@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking
Date: Wed, 22 Jun 2011 19:28:22 -0400
Message-ID: <4E027A96.3040905@redhat.com>
In-Reply-To: <BANLkTikidXPzyxySbmrXK=EUXOzqMtm-0g@mail.gmail.com>

On 06/22/2011 07:13 PM, Nai Xia wrote:
> On Wed, Jun 22, 2011 at 11:39 PM, Rik van Riel<riel@redhat.com>  wrote:
>> On 06/22/2011 07:19 AM, Izik Eidus wrote:
>>
>>> So what we say here is: it is better to have a little junk in the unstable
>>> tree that gets flushed eventually anyway, instead of making the guest
>>> slower...
>>> This race does not affect the accuracy of ksm anyway, because of
>>> the full memcmp that we will eventually perform...
>>
>> With 2MB pages, I am not convinced they will get "flushed eventually",
>> because there is a good chance at least one of the 4kB pages inside
>> a 2MB page is in active use at all times.
>>
>> I worry that the proposed changes may end up effectively preventing
>> KSM from scanning inside 2MB pages, when even one 4kB page inside
>> is in active use.  This could mean increased swapping on systems
>> that run low on memory, which can be a much larger performance penalty
>> than ksmd CPU use.
>>
>> We need to scan inside 2MB pages when memory runs low, regardless
>> of the accessed or dirty bits.
>
> I agree on this point. The dirty bit and the young bit are by no means
> accurate. Even on 4kB pages, there is always a chance that the pte is dirty
> but the contents are actually the same. Yeah, the whole optimization contains
> trade-offs, and trade-offs always have the potential to annoy someone. Just
> like LRU approximations that rely on page bits, none of them is perfect. But
> I think it can benefit some people. So maybe we could just provide a generic
> balanced solution but also provide fine-tuning interfaces, to make sure that
> when it really gets in the way of someone, they have a way to work around it.
> Do you agree with my argument? :-)

That's not an argument.

That is an "if I wave my hands vigorously enough, maybe people
will let my patch in without thinking about what I wrote"
style argument.

I believe your optimization makes sense for 4kB pages, but
is going to be counter-productive for 2MB pages.

Your approach of "make ksmd skip over more pages, so it uses
less CPU" is likely to reduce the effectiveness of ksm by not
sharing some pages.

For 4kB pages that is fine, because you'll get around to them
eventually.

However, the internal use of a 2MB page is likely to be quite
different.  Chances are most 2MB pages will have actively used,
barely used and free pages inside.

You absolutely want ksm to get at the barely used and free
sub-pages.  Having just one actively used 4kB sub-page prevent
ksm from merging any of the other 511 sub-pages is a problem.
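
To make the 1-versus-511 arithmetic concrete, here is a minimal toy model.
It is a sketch under stated assumptions, not ksm code: it assumes a scanner
that skips any mapping whose dirty bit is set, and that a 2MB mapping carries
a single dirty bit covering all 512 of its 4kB sub-pages. The function name
and its parameters are hypothetical.

```python
# Toy model only: a 2MB huge page contains 512 sub-pages of 4kB each.
SUBPAGES_PER_HUGE_PAGE = (2 * 1024 * 1024) // (4 * 1024)


def scannable_subpages(dirty_subpages, per_subpage_tracking):
    """Count the 4kB sub-pages a dirty-bit-skipping scanner would still examine.

    dirty_subpages: indices (0..511) of sub-pages written since the last scan.
    per_subpage_tracking: True models 4kB mappings (one dirty bit per page);
        False models one 2MB mapping with a single shared dirty bit.
    """
    dirty = set(dirty_subpages)
    if per_subpage_tracking:
        # Only the sub-pages that were actually written are skipped.
        return SUBPAGES_PER_HUGE_PAGE - len(dirty)
    # One shared dirty bit: any write marks the whole 2MB mapping as dirty,
    # so the scanner skips all 512 sub-pages.
    return 0 if dirty else SUBPAGES_PER_HUGE_PAGE


if __name__ == "__main__":
    one_busy_subpage = [7]
    print(scannable_subpages(one_busy_subpage, per_subpage_tracking=True))
    print(scannable_subpages(one_busy_subpage, per_subpage_tracking=False))
```

In this model, a single busy sub-page costs the scanner one candidate with
4kB mappings, but all 512 candidates with a 2MB mapping.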

-- 
All rights reversed

