From: riel@redhat.com
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org,
	chegu_vinod@hp.com, aarcange@redhat.com,
	akpm@linux-foundation.org
Subject: [PATCH -mm 0/3] fix numa vs kvm scalability issue
Date: Tue, 18 Feb 2014 17:12:43 -0500
Message-ID: <1392761566-24834-1-git-send-email-riel@redhat.com>

The NUMA scanning code can end up iterating over many gigabytes
of unpopulated memory, especially in the case of a freshly started
KVM guest with lots of memory.

This results in the mmu notifier code being called even when
there are no mapped pages in a virtual address range. The amount
of time wasted can be enough to trigger soft lockup warnings
with very large (>2TB) KVM guests.

This series moves the mmu notifier call down to the pmd level,
where each page table covers a 1GB area of memory on x86-64.
Furthermore, the mmu notifier is only called starting from the
first address within that area where a present mapping is
encountered, so areas with no mappings never invoke it.
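
As a rough illustration of the idea (a userspace sketch, not the code
from the patches), the walk below covers one PMD table and only issues
a notifier start at the first present entry. The names notifier_log,
notify_range_start, notify_range_end and walk_pmd_table are made up for
this example, and the present[] array stands in for real page table
state. The constants assume x86-64 with 4KB pages, where a PMD entry
maps 2MB and a table holds 512 entries, so 512 * 2MB = 1GB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86-64, 4KB pages: one PMD entry maps 2MB, one table has 512 entries. */
#define PMD_SHIFT     21
#define PMD_SIZE      (1ULL << PMD_SHIFT)
#define PTRS_PER_PMD  512
#define PMD_TABLE_SPAN (PTRS_PER_PMD * PMD_SIZE)   /* 1GB */

/* Hypothetical stand-in for the mmu notifier: just records the calls. */
struct notifier_log {
	int      starts;       /* range-start notifications issued */
	int      ends;         /* range-end notifications issued */
	uint64_t first_start;  /* address passed to the first range-start */
};

static void notify_range_start(struct notifier_log *log, uint64_t start)
{
	if (log->starts == 0)
		log->first_start = start;
	log->starts++;
}

static void notify_range_end(struct notifier_log *log)
{
	log->ends++;
}

/*
 * Walk one PMD table covering [base, base + 1GB). present[i] says whether
 * entry i has a mapping. The notifier is started lazily, at the first
 * present entry, and ended only if it was started. Returns the number of
 * present entries visited.
 */
static int walk_pmd_table(const bool present[PTRS_PER_PMD], uint64_t base,
			  struct notifier_log *log)
{
	bool notified = false;
	int visited = 0;

	assert((base & (PMD_TABLE_SPAN - 1)) == 0);  /* base is 1GB aligned */

	for (size_t i = 0; i < PTRS_PER_PMD; i++) {
		uint64_t addr = base + i * PMD_SIZE;

		if (!present[i])
			continue;  /* hole: nothing to change, nothing to notify */

		if (!notified) {
			notify_range_start(log, addr);
			notified = true;
		}
		visited++;  /* a real walker would change protections here */
	}

	if (notified)
		notify_range_end(log);

	return visited;
}
```

With an all-false present[] array, walk_pmd_table() returns 0 and the
log stays untouched, which is the unpopulated-guest case described
above. With a few entries set, exactly one start/end pair is issued,
beginning at the address of the first present entry.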

The hugetlbfs code is left unchanged for now; hugetlb mappings are
not relocatable, so the NUMA code skips them and they should never
trigger this problem to begin with.

The series also adds a cond_resched to task_numa_work, to
fix another potential latency issue.