Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking

From: Nai Xia <nai.xia@gmail.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>,
	Izik Eidus <izik.eidus@ravellosystems.com>,
	Avi Kivity <avi@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Chris Wright <chrisw@sous-sol.org>, linux-mm <linux-mm@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking
Date: Thu, 23 Jun 2011 09:36:37 +0800	[thread overview]
Message-ID: <BANLkTik4+A0owJhKKb27KO0HtQ0v-KzU9g@mail.gmail.com> (raw)
In-Reply-To: <20110623004404.GE20843@redhat.com>

On Thu, Jun 23, 2011 at 8:44 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Thu, Jun 23, 2011 at 08:31:56AM +0800, Nai Xia wrote:
>> On Thu, Jun 23, 2011 at 7:59 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
>> > On Thu, Jun 23, 2011 at 07:37:47AM +0800, Nai Xia wrote:
>> >> On 2MB pages, I'd like to remind you and Rik that ksmd currently splits
>> >> huge pages before their sub pages gets really merged to stable tree.
>> >> So when there are many 2MB pages each having a 4kB subpage
>> >> changed for all time, this is already a concern for ksmd to judge
>> >> if it's worthwhile to split 2MB page and get its sub-pages merged.
>> >
>> > Hmm not sure to follow. KSM memory density with THP on and off should
>> > be identical. The cksum is computed on subpages so the fact the 4k
>> > subpage is actually mapped by a hugepmd is invisible to KSM up to the
>> > point we get a unstable_tree_search_insert/stable_tree_search lookup
>> > succeeding.
>>
>> I agree on your points.
>>
>> But, I mean splitting the huge page into normal pages when some subpages
>> need to be merged may increase the TLB lookside timing of CPU and
>> _might_ hurt the workload ksmd is scanning. If only a small portion of false
>> negative 2MB pages are really get merged eventually, maybe it's not worthwhile,
>> right?
>
> Yes, there's not threshold to say "only split if we could merge more
> than N subpages", 1 subpage match in two different hugepages is enough
> to split both and save just 4k but then memory accesses will be slower
> for both 2m ranges that have been splitted. But the point is that it
> won't be slower than if THP was off in the first place. So in the end
> all we gain is 4k saved but we still run faster than THP off, in the
> other hugepages that haven't been splitted yet.

Yes, so ksmd is still doing good compared to THP off.
Thanks for making my mind clearer :)

>
>> But, well, just like Rik said below, yes, ksmd should be more aggressive to
>> avoid much more time consuming cost for swapping.
>
> Correct the above logic also follows the idea to always maximize
> memory merging in KSM, which is why we've no threshold to wait N
> subpages to be mergeable before we split the hugepage.
>
> I'm unsure if admins in real life would then start to use those
> thresholds even if we'd implement them. The current way of enabling
> KSM a per-VM (not per-host) basis is pretty simple: the performance
> critical VM has KSM off, non-performance critical VM has KSM on and it
> prioritizes on memory merging.
>
Hmm, yes, you are right.

Thanks,
Nai