From: Nai Xia <nai.xia@gmail.com>
To: "Undisclosed.Recipients:"@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Izik Eidus <izik.eidus@ravellosystems.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Chris Wright <chrisw@sous-sol.org>,
	Rik van Riel <riel@redhat.com>, "linux-mm" <linux-mm@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	"linux-kernel" <linux-kernel@vger.kernel.org>,
	kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking
Date: Wed, 22 Jun 2011 19:24:36 +0800
Message-ID: <201106221924.36996.nai.xia@gmail.com>
In-Reply-To: <4E01C752.10405@redhat.com>

Hi Avi,

Thanks for reviewing!

On Wednesday 22 June 2011 18:43:30 Avi Kivity wrote:
> On 06/21/2011 04:32 PM, Nai Xia wrote:
> > Introduced kvm_mmu_notifier_test_and_clear_dirty(), kvm_mmu_notifier_dirty_update()
> > and their mmu_notifier interfaces to support KSM dirty bit tracking, which brings
> > significant performance gain in volatile pages scanning in KSM.
> > Currently, kvm_mmu_notifier_dirty_update() returns 0 if and only if intel EPT is
> > enabled to indicate that the dirty bits of underlying sptes are not updated by
> > hardware.
> >
> 
> 
> Can you quantify the performance gains?

Compared with the checksum-based approach, the speedup for a volatile host
working set is about 8 times on normal pages and 16 times on transhuge pages.
I have not collected the figures in a guest OS yet; I'll be back with the
guest numbers.

> 
> > +int kvm_test_and_clear_dirty_rmapp(struct kvm *kvm, unsigned long *rmapp,
> > +				   unsigned long data)
> > +{
> > +	u64 *spte;
> > +	int dirty = 0;
> > +
> > +	if (!shadow_dirty_mask) {
> > +		WARN(1, "KVM: do NOT try to test dirty bit in EPT\n");
> > +		goto out;
> > +	}
> > +
> > +	spte = rmap_next(kvm, rmapp, NULL);
> > +	while (spte) {
> > +		int _dirty;
> > +		u64 _spte = *spte;
> > +		BUG_ON(!(_spte & PT_PRESENT_MASK));
> > +		_dirty = _spte & PT_DIRTY_MASK;
> > +		if (_dirty) {
> > +			dirty = 1;
> > +			clear_bit(PT_DIRTY_SHIFT, (unsigned long *)spte);
> > +		}
> 
> Racy.  Also, needs a tlb flush eventually.
> 
> > +		spte = rmap_next(kvm, rmapp, spte);
> > +	}
> > +out:
> > +	return dirty;
> > +}
> > +
> >   #define RMAP_RECYCLE_THRESHOLD 1000
> >
> >
> >   struct mmu_notifier_ops {
> > +	int (*dirty_update)(struct mmu_notifier *mn,
> > +			     struct mm_struct *mm);
> > +
> 
> I prefer to have test_and_clear_dirty() always return 1 in this case (if 
> the spte is writeable), and drop this callback.

If test_and_clear_dirty() always returns 1, how can ksmd tell whether the page
is really dirty or the 1 is just caused by EPT, in which case ksmd should fall
back to the checksum-based approach?
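
To make the question concrete, here is a rough user-space sketch of the
decision I have in mind for ksmd. It is only an illustration, not code from
the patch: struct tracked_page, page_checksum() and page_is_volatile() are
hypothetical names, the hw_dirty field merely simulates a hardware dirty bit,
and the FNV-1a hash is a placeholder for whatever checksum ksmd really uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page tracked by ksmd; not a kernel structure. */
struct tracked_page {
	const uint8_t *data;	/* page contents */
	size_t len;		/* length of the contents in bytes */
	bool hw_dirty;		/* simulated hardware dirty bit */
	uint32_t old_checksum;	/* checksum recorded on the previous scan */
};

/* Placeholder checksum (FNV-1a); assumed here only for illustration. */
static uint32_t page_checksum(const uint8_t *p, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Return true if the page changed since the previous scan.
 * dirty_bits_updated models the answer of the dirty_update() query:
 * true means the dirty bit is maintained and can be trusted.
 */
static bool page_is_volatile(struct tracked_page *pg, bool dirty_bits_updated)
{
	if (dirty_bits_updated) {
		/* Cheap path: test and clear the (simulated) dirty bit. */
		bool dirty = pg->hw_dirty;

		pg->hw_dirty = false;
		return dirty;
	}

	/* Fallback path: the bit is never set, so compare checksums. */
	uint32_t sum = page_checksum(pg->data, pg->len);
	bool changed = (sum != pg->old_checksum);

	pg->old_checksum = sum;
	return changed;
}
```

If the test always reported 1, the first branch could not be told apart from a
page that is written on every scan, which is why the sketch needs a separate
input telling it which path is valid.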

> > +int __mmu_notifier_dirty_update(struct mm_struct *mm)
> > +{
> > +	struct mmu_notifier *mn;
> > +	struct hlist_node *n;
> > +	int dirty_update = 0;
> > +
> > +	rcu_read_lock();
> > +	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist) {
> > +		if (mn->ops->dirty_update)
> > +			dirty_update |= mn->ops->dirty_update(mn, mm);
> > +	}
> > +	rcu_read_unlock();
> > +
> 
> Should it not be &= instead?

I think the logic is: if _any_ underlying MMU is going to update the bit, then
the bit is not dead and we can query it through test_and_clear.... ksmd should
not care which one dirties the page; as long as it's dirty, it can be skipped.
Did I miss something?
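
A small user-space model of the two aggregation choices, in case it helps the
discussion. This is a sketch under my own assumptions, not the kernel code:
struct notifier and both helper functions are hypothetical, and the callbacks
simply return fixed answers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one registered notifier; not struct mmu_notifier. */
struct notifier {
	bool (*dirty_update)(void);	/* optional callback, may be NULL */
};

/* Models a secondary MMU that does update dirty bits. */
static bool updates_dirty(void) { return true; }

/* Models a secondary MMU that never updates dirty bits. */
static bool no_dirty(void) { return false; }

/* OR semantics: true if at least one notifier updates the bit. */
static bool any_dirty_update(const struct notifier *list, size_t n)
{
	bool result = false;

	for (size_t i = 0; i < n; i++)
		if (list[i].dirty_update)
			result |= list[i].dirty_update();
	return result;
}

/*
 * AND semantics: true only if every notifier with a callback updates the
 * bit. Note the accumulator has to start at true here; starting at 0 as in
 * the posted function would make the result always 0.
 */
static bool all_dirty_update(const struct notifier *list, size_t n)
{
	bool result = true;

	for (size_t i = 0; i < n; i++)
		if (list[i].dirty_update)
			result &= list[i].dirty_update();
	return result;
}
```

With one notifier of each kind registered, the OR version reports true and the
AND version reports false, so the choice decides whether ksmd trusts the bit
when only some of the secondary MMUs maintain it.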

Thanks,

Nai


> 
> > +	return dirty_update;
> > +}
> > +
> >   /*
> >    * This function can't run concurrently against mmu_notifier_register
> >    * because mm->mm_users > 0 during mmu_notifier_register and exit_mmap
> 
> 


