From: Alexander Duyck <alexander.duyck@gmail.com>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kvm list <kvm@vger.kernel.org>,
	Radim Krcmar <rkrcmar@redhat.com>, X86 ML <x86@kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	bp@alien8.de, Peter Anvin <hpa@zytor.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RFC PATCH 3/4] kvm: Add guest side support for free memory hints
Date: Mon, 4 Feb 2019 16:16:29 -0800
Message-ID: <CAKgT0UevPXAG7xGzEur731-EJ0tOSGeg+AwugnRt6ugmfEKeLw@mail.gmail.com>
In-Reply-To: <4DFBB378-8E7A-4905-A94D-D56B5FF6D42B@gmail.com>

On Mon, Feb 4, 2019 at 4:03 PM Nadav Amit <nadav.amit@gmail.com> wrote:
>
> > On Feb 4, 2019, at 3:37 PM, Alexander Duyck <alexander.h.duyck@linux.intel.com> wrote:
> >
> > On Mon, 2019-02-04 at 15:00 -0800, Nadav Amit wrote:
> >>> On Feb 4, 2019, at 10:15 AM, Alexander Duyck <alexander.duyck@gmail.com> wrote:
> >>>
> >>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>>
> >>> Add guest support for providing free memory hints to the KVM hypervisor for
> >>> freed pages huge TLB size or larger. I am restricting the size to
> >>> huge TLB order and larger because the hypercalls are too expensive to be
> >>> performing one per 4K page. Using the huge TLB order became the obvious
> >>> choice for the order to use as it allows us to avoid fragmentation of higher
> >>> order memory on the host.
> >>>
> >>> I have limited the functionality so that it doesn't work when page
> >>> poisoning is enabled. I did this because a write to the page after doing an
> >>> MADV_DONTNEED would effectively negate the hint, so it would be wasting
> >>> cycles to do so.
> >>>
> >>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>> ---
> >>> arch/x86/include/asm/page.h |   13 +++++++++++++
> >>> arch/x86/kernel/kvm.c       |   23 +++++++++++++++++++++++
> >>> 2 files changed, 36 insertions(+)
> >>>
> >>> diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
> >>> index 7555b48803a8..4487ad7a3385 100644
> >>> --- a/arch/x86/include/asm/page.h
> >>> +++ b/arch/x86/include/asm/page.h
> >>> @@ -18,6 +18,19 @@
> >>>
> >>> struct page;
> >>>
> >>> +#ifdef CONFIG_KVM_GUEST
> >>> +#include <linux/jump_label.h>
> >>> +extern struct static_key_false pv_free_page_hint_enabled;
> >>> +
> >>> +#define HAVE_ARCH_FREE_PAGE
> >>> +void __arch_free_page(struct page *page, unsigned int order);
> >>> +static inline void arch_free_page(struct page *page, unsigned int order)
> >>> +{
> >>> +	if (static_branch_unlikely(&pv_free_page_hint_enabled))
> >>> +		__arch_free_page(page, order);
> >>> +}
> >>> +#endif
> >>
> >> This patch and the following one assume that only KVM should be able to hook
> >> to these events. I do not think it is appropriate for __arch_free_page() to
> >> effectively mean “kvm_guest_free_page()”.
> >>
> >> Is it possible to use the paravirt infrastructure for this feature,
> >> similarly to other PV features? It is not the best infrastructure, but at least
> >> it is hypervisor-neutral.
> >
> > I could probably tie this into the paravirt infrastructure, but if I
> > did so I would probably want to pull the checks for the page order out
> > of the KVM specific bits and make it something we handle in the inline.
> > Doing that I would probably make it a paravirtual hint that only
> > operates at the PMD level. That way we wouldn't incur the cost of the
> > paravirt infrastructure at the per 4K page level.
>
> If I understand you correctly, you “complain” that this would affect
> performance.

It wasn't so much a "complaint" as an "observation". What I was getting
at is that if I am going to make it a PV operation I might set a hard
limit on it so that it will specifically only apply to huge pages and
larger. By doing that I can justify performing the screening based on
page order in the inline path and avoid any PV infrastructure overhead
unless I have to incur it.

> While it might be, you may want to check whether the already available
> tools can solve the problem:
>
> 1. You can use a combination of static-key and pv-ops - see for example
> steal_account_process_time()

Okay, I was kind of already heading in this direction. The static key
I am using now would probably stay put.

> 2. You can use callee-saved pv-ops.
>
> The latter might anyhow be necessary since, IIUC, you change a very hot
> path. So you may want have a look on the assembly code of free_pcp_prepare()
> (or at least its code-size) before and after your changes. If they are too
> big, a callee-saved function might be necessary.

I'll have to take a look. I will spend the next couple days
familiarizing myself with the pv-ops infrastructure.
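
For illustration only, and not taken from the patch series: a minimal userspace sketch of the shape discussed above, where the inline path first checks an enable flag, then screens on page order, and only then pays for an indirect hypervisor-neutral callback. The names `pv_free_page_hint_enabled` and `arch_free_page` come from the quoted patch; everything else (`pv_free_hint_ops`, `pv_free_page_hint_register`, `kvm_free_page_hint`, `HINT_MIN_ORDER`, `hints_sent`) is hypothetical. A plain `bool` stands in for the static key and a function pointer stands in for a pv-op, so none of the kernel's runtime code patching is modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: mirrors HUGETLB_PAGE_ORDER on x86-64 with 4K base pages
 * (2M huge pages, i.e. PMD level). */
#define HINT_MIN_ORDER 9u

struct page { unsigned long pfn; };

/* Hypothetical hypervisor-neutral hook table; a backend fills it in. */
struct pv_free_hint_ops {
	void (*free_page_hint)(struct page *page, unsigned int order);
};

static struct pv_free_hint_ops pv_hint_ops;  /* callback is NULL until registered */
static bool pv_free_page_hint_enabled;       /* stand-in for the static key */
static unsigned int hints_sent;              /* counts calls that reach the backend */

/* Hypothetical KVM backend; the real one would issue a hypercall. */
static void kvm_free_page_hint(struct page *page, unsigned int order)
{
	(void)page;
	(void)order;
	hints_sent++;
}

/* A backend registers its callback; the flag is only set with a valid hook. */
static void pv_free_page_hint_register(void (*fn)(struct page *, unsigned int))
{
	pv_hint_ops.free_page_hint = fn;
	pv_free_page_hint_enabled = (fn != NULL);
}

static inline void arch_free_page(struct page *page, unsigned int order)
{
	/* Cheap check first: do nothing unless a backend enabled hinting. */
	if (!pv_free_page_hint_enabled)
		return;

	/* Screen on order inline so small frees never take the indirect call. */
	if (order < HINT_MIN_ORDER)
		return;

	pv_hint_ops.free_page_hint(page, order);
}
```

With this layout, frees below the huge page order cost only two predictable branches, and a different hypervisor could register its own callback without touching `arch_free_page()`. Whether a callee-saved variant is needed would still depend on the generated code for the hot path, as noted in the quoted reply.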