From: Alexander Graf <agraf@suse.de>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org,
	David Gibson <david@gibson.dropbear.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm-ppc@vger.kernel.org
Subject: Re: [PATCH 8/8] KVM: PPC: Add hugepage support for IOMMU in-kernel handling
Date: Thu, 11 Jul 2013 11:52:38 +0200
Message-ID: <902F79B9-BB81-40A0-865D-94E7108DAC5E@suse.de>
In-Reply-To: <51DE7377.1060503@ozlabs.ru>


On 11.07.2013, at 10:57, Alexey Kardashevskiy wrote:

> On 07/10/2013 03:32 AM, Alexander Graf wrote:
>> On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
>>> This adds special support for huge pages (16MB).  The reference
>>> counting cannot be easily done for such pages in real mode (when
>>> MMU is off) so we added a list of huge pages.  It is populated in
>>> virtual mode and get_page is called just once per huge page.
>>> Real mode handlers check if the requested page is huge and in the list,
>>> then no reference counting is done, otherwise an exit to virtual mode
>>> happens.  The list is released at KVM exit.  At the moment the fastest
>>> card available for tests uses up to 9 huge pages so walking through this
>>> list is not very expensive.  However this can change and we may want
>>> to optimize this.
>>> 
>>> Signed-off-by: Paul Mackerras <paulus@samba.org>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> 
>>> ---
>>> 
>>> Changes:
>>> 2013/06/27:
>>> * list of huge pages replaced with hashtable for better performance
>> 
>> So the only thing your patch description really talks about is not true
>> anymore?
>> 
>>> * spinlock removed from real mode and only protects insertion of new
>>> huge page descriptors into the hashtable
>>> 
>>> 2013/06/05:
>>> * fixed compile error when CONFIG_IOMMU_API=n
>>> 
>>> 2013/05/20:
>>> * the real mode handler now searches for a huge page by gpa (used to be pte)
>>> * the virtual mode handler prints a warning if it is called twice for the
>>> same huge page as the real mode handler is expected to fail just once - when
>>> a huge page is not in the list yet.
>>> * the huge page is refcounted twice - when added to the hugepage list and
>>> when used in the virtual mode hcall handler (can be optimized but it will
>>> make the patch less nice).
>>> 
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>>  arch/powerpc/include/asm/kvm_host.h |  25 +++++++++
>>>  arch/powerpc/kernel/iommu.c         |   6 ++-
>>>  arch/powerpc/kvm/book3s_64_vio.c    | 104 +++++++++++++++++++++++++++++++++---
>>>  arch/powerpc/kvm/book3s_64_vio_hv.c |  21 ++++++--
>>>  4 files changed, 146 insertions(+), 10 deletions(-)
>>> 
>>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>>> index 53e61b2..a7508cf 100644
>>> --- a/arch/powerpc/include/asm/kvm_host.h
>>> +++ b/arch/powerpc/include/asm/kvm_host.h
>>> @@ -30,6 +30,7 @@
>>>  #include <linux/kvm_para.h>
>>>  #include <linux/list.h>
>>>  #include <linux/atomic.h>
>>> +#include <linux/hashtable.h>
>>>  #include <asm/kvm_asm.h>
>>>  #include <asm/processor.h>
>>>  #include <asm/page.h>
>>> @@ -182,10 +183,34 @@ struct kvmppc_spapr_tce_table {
>>>      u32 window_size;
>>>      struct iommu_group *grp;        /* used for IOMMU groups */
>>>      struct vfio_group *vfio_grp;        /* used for IOMMU groups */
>>> +    DECLARE_HASHTABLE(hash_tab, ilog2(64));    /* used for IOMMU groups */
>>> +    spinlock_t hugepages_write_lock;    /* used for IOMMU groups */
>>>      struct { struct { unsigned long put, indir, stuff; } rm, vm; } stat;
>>>      struct page *pages[0];
>>>  };
>>> 
>>> +/*
>>> + * The KVM guest can be backed with 16MB pages.
>>> + * In this case, we cannot do page counting from the real mode
>>> + * as the compound pages are used - they are linked in a list
>>> + * with pointers as virtual addresses which are inaccessible
>>> + * in real mode.
>>> + *
>>> + * The code below keeps a 16MB pages list and uses page struct
>>> + * in real mode if it is already locked in RAM and inserted into
>>> + * the list or switches to the virtual mode where it can be
>>> + * handled in a usual manner.
>>> + */
>>> +#define KVMPPC_SPAPR_HUGEPAGE_HASH(gpa)    hash_32(gpa >> 24, 32)
>>> +
>>> +struct kvmppc_spapr_iommu_hugepage {
>>> +    struct hlist_node hash_node;
>>> +    unsigned long gpa;    /* Guest physical address */
>>> +    unsigned long hpa;    /* Host physical address */
>>> +    struct page *page;    /* page struct of the very first subpage */
>>> +    unsigned long size;    /* Huge page size (always 16MB at the moment) */
>>> +};
>>> +
>>>  struct kvmppc_linear_info {
>>>      void        *base_virt;
>>>      unsigned long     base_pfn;
>>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>>> index 51678ec..e0b6eca 100644
>>> --- a/arch/powerpc/kernel/iommu.c
>>> +++ b/arch/powerpc/kernel/iommu.c
>>> @@ -999,7 +999,8 @@ int iommu_free_tces(struct iommu_table *tbl, unsigned long entry,
>>>              if (!pg) {
>>>                  ret = -EAGAIN;
>>>              } else if (PageCompound(pg)) {
>>> -                ret = -EAGAIN;
>>> +                /* Hugepages will be released at KVM exit */
>>> +                ret = 0;
>>>              } else {
>>>                  if (oldtce & TCE_PCI_WRITE)
>>>                      SetPageDirty(pg);
>>> @@ -1009,6 +1010,9 @@ int iommu_free_tces(struct iommu_table *tbl, unsigned long entry,
>>>              struct page *pg = pfn_to_page(oldtce >> PAGE_SHIFT);
>>>              if (!pg) {
>>>                  ret = -EAGAIN;
>>> +            } else if (PageCompound(pg)) {
>>> +                /* Hugepages will be released at KVM exit */
>>> +                ret = 0;
>>>              } else {
>>>                  if (oldtce & TCE_PCI_WRITE)
>>>                      SetPageDirty(pg);
>>> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
>>> index 2b51f4a..c037219 100644
>>> --- a/arch/powerpc/kvm/book3s_64_vio.c
>>> +++ b/arch/powerpc/kvm/book3s_64_vio.c
>>> @@ -46,6 +46,40 @@
>>> 
>>>  #define ERROR_ADDR      ((void *)~(unsigned long)0x0)
>>> 
>>> +#ifdef CONFIG_IOMMU_API
>> 
>> Can't you just make CONFIG_IOMMU_API mandatory in Kconfig?
> 
> 
> Where exactly (it is rather SPAPR_TCE_IOMMU but does not really matter)?
> Select it on KVM_BOOK3S_64? CONFIG_KVM_BOOK3S_64_HV?
> CONFIG_KVM_BOOK3S_64_PR? PPC_BOOK3S_64?

I'd say the most logical choice would be to check the Makefile and see when it gets compiled. For those cases we want it enabled.

> I am trying to imagine a configuration where we really do not want
> IOMMU_API. Ben mentioned PPC32 and embedded PPC64 and that's it so any of
> BOOK3S (KVM_BOOK3S_64 is the best) should be fine, no?

book3s_32 doesn't want this, but any book3s_64 implementation could potentially use it, yes. That's pretty much what the Makefile tells you too :).


Alex
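
For readers following the quoted patch, the scheme it describes (huge page
descriptors inserted in virtual mode, then looked up by gpa from real mode,
with a miss meaning "exit to virtual mode") might be sketched in plain
userspace C roughly as below. This is only an illustration under stated
assumptions, not the patch's code: the names hugepage_desc, bucket_of,
hugepage_add and hugepage_lookup are hypothetical, a simple multiplicative
hash stands in for the kernel's hash_32(), a singly linked chain stands in
for hlist, and locking and page refcounting are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HUGEPAGE_SHIFT 24                         /* 16MB huge pages */
#define HUGEPAGE_SIZE  ((uint64_t)1 << HUGEPAGE_SHIFT)
#define HASH_BITS      6                          /* 64 buckets, cf. ilog2(64) */
#define HASH_BUCKETS   (1u << HASH_BITS)

/* Hypothetical stand-in for struct kvmppc_spapr_iommu_hugepage. */
struct hugepage_desc {
    struct hugepage_desc *next;   /* bucket chain (hlist in the patch) */
    uint64_t gpa;                 /* guest physical address, 16MB aligned */
    uint64_t hpa;                 /* host physical address, 16MB aligned */
};

static struct hugepage_desc *buckets[HASH_BUCKETS];

/* Assumed hash: multiplicative hash of the huge page number. */
static unsigned int bucket_of(uint64_t gpa)
{
    uint32_t key = (uint32_t)(gpa >> HUGEPAGE_SHIFT);
    return (uint32_t)(key * 0x9E3779B1u) >> (32 - HASH_BITS);
}

/* "Virtual mode" side: register a huge page once. */
static void hugepage_add(struct hugepage_desc *d)
{
    unsigned int b;

    assert((d->hpa & (HUGEPAGE_SIZE - 1)) == 0);
    d->gpa &= ~(HUGEPAGE_SIZE - 1);
    b = bucket_of(d->gpa);
    d->next = buckets[b];
    buckets[b] = d;
}

/*
 * "Real mode" side: translate gpa if its huge page is already known.
 * Returns 0 and fills *hpa on a hit, -1 on a miss (the caller would
 * then fall back to the virtual mode handler).
 */
static int hugepage_lookup(uint64_t gpa, uint64_t *hpa)
{
    uint64_t base = gpa & ~(HUGEPAGE_SIZE - 1);
    struct hugepage_desc *d;

    for (d = buckets[bucket_of(base)]; d; d = d->next) {
        if (d->gpa == base) {
            *hpa = d->hpa + (gpa - base);
            return 0;
        }
    }
    return -1;
}
```

A caller would register a descriptor once with hugepage_add() and then call
hugepage_lookup() for every translation, treating -1 as the signal to retry
in virtual mode.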


