From: Christian Borntraeger <borntraeger@de.ibm.com>
To: David Hildenbrand <david@redhat.com>,
Janosch Frank <frankja@linux.vnet.ibm.com>
Cc: KVM <kvm@vger.kernel.org>, Cornelia Huck <cohuck@redhat.com>,
Thomas Huth <thuth@redhat.com>,
Ulrich Weigand <Ulrich.Weigand@de.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Andrea Arcangeli <aarcange@redhat.com>,
linux-s390 <linux-s390@vger.kernel.org>,
Michael Mueller <mimu@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 02/35] KVM: s390/interrupt: do not pin adapter interrupt pages
Date: Mon, 10 Feb 2020 19:38:53 +0100
Message-ID: <083a3fd0-7b56-e92b-bf15-3383b7f5488b@de.ibm.com>
In-Reply-To: <2cf62b84-8eb6-18d5-437b-7e86401b9c45@redhat.com>

On 10.02.20 13:26, David Hildenbrand wrote:
> On 07.02.20 12:39, Christian Borntraeger wrote:
>> From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
>>
>> The adapter interrupt page containing the indicator bits is currently
>> pinned. That means that a guest with many devices can pin a lot of
>> memory pages in the host. This also complicates the reference tracking
>> which is needed for memory management handling of protected virtual
>> machines.
>> We can reuse the pte notifiers to "cache" the page without pinning it.
>>
>> Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
>> Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
>> [borntraeger@de.ibm.com: patch merging, splitting, fixing]
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>
> So, instead of pinning explicitly, look up the page address, cache it,
> and glue its lifetime to the gmap table entry. When that entry is
> changed, invalidate the cached page. On re-access, look up the page
> again and register the gmap notifier for the table entry again.
I think I might want to split this into two parts:
part 1: a naive approach that always does get_user_pages_remote/put_page
part 2: the more complex caching

Ulrich mentioned that this could actually make map/unmap a no-op, as we
already have the address and bit in the irq route. In the end this might be
as fast as today's pinning, as we replace a list walk with a page table walk.
Plus it would simplify the code. I will check whether that is the case.
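
To make the difference between the two parts concrete, here is a small
userspace model. It is not kernel code and not what the patch does: every
sim_* name, the page size, the inclusive [start, end] range and the
MSB-first bit numbering are assumptions made up for illustration. The
lookup counter stands in for the cost of get_user_pages_remote or a page
table walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: none of these names exist in the kernel. */
#define SIM_PAGE_SHIFT 12
#define SIM_PAGE_SIZE  (1UL << SIM_PAGE_SHIFT)
#define SIM_NR_PAGES   4

/* Stand-in for the host memory that backs the guest. */
static uint8_t sim_backing[SIM_NR_PAGES][SIM_PAGE_SIZE];
/* Counts how often the slow lookup (the "page table walk") ran. */
static unsigned int sim_lookups;

struct sim_map {
	uint64_t guest_addr;  /* guest address of the indicator byte */
	uint8_t *cached_page; /* NULL means: not cached, look up again */
};

/* Slow path: resolve a guest address to its backing page. */
static uint8_t *sim_lookup_page(uint64_t guest_addr)
{
	uint64_t pfn = guest_addr >> SIM_PAGE_SHIFT;

	sim_lookups++;
	if (pfn >= SIM_NR_PAGES)
		return NULL;
	return sim_backing[pfn];
}

/* Notifier stand-in: drop the cached page if [start, end] covers it. */
static void sim_invalidate(struct sim_map *map, uint64_t start, uint64_t end)
{
	if (map->guest_addr >= start && map->guest_addr <= end)
		map->cached_page = NULL;
}

/* Set one indicator bit, resolving the page again only when not cached. */
static bool sim_set_indicator(struct sim_map *map, unsigned int bit)
{
	uint64_t offset = map->guest_addr & (SIM_PAGE_SIZE - 1);

	assert(bit < 8);
	if (!map->cached_page)
		map->cached_page = sim_lookup_page(map->guest_addr);
	if (!map->cached_page)
		return false;
	/* MSB-first bit numbering is a choice of this model. */
	map->cached_page[offset] |= (uint8_t)(0x80u >> bit);
	return true;
}
```

Part 1 corresponds to resetting cached_page to NULL before every
sim_set_indicator() call, so each access pays for one lookup. Part 2 keeps
the pointer until sim_invalidate() is called for a range covering the
address.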
>
> [...]
>
>> #define MAX_S390_IO_ADAPTERS ((MAX_ISC + 1) * 8)
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index c06c89d370a7..4bfb2f8fe57c 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -28,6 +28,7 @@
>> #include <asm/switch_to.h>
>> #include <asm/nmi.h>
>> #include <asm/airq.h>
>> +#include <linux/pagemap.h>
>> #include "kvm-s390.h"
>> #include "gaccess.h"
>> #include "trace-s390.h"
>> @@ -2328,8 +2329,8 @@ static int register_io_adapter(struct kvm_device *dev,
>> return -ENOMEM;
>>
>> INIT_LIST_HEAD(&adapter->maps);
>> - init_rwsem(&adapter->maps_lock);
>> - atomic_set(&adapter->nr_maps, 0);
>> + spin_lock_init(&adapter->maps_lock);
>> + adapter->nr_maps = 0;
>> adapter->id = adapter_info.id;
>> adapter->isc = adapter_info.isc;
>> adapter->maskable = adapter_info.maskable;
>> @@ -2375,19 +2376,15 @@ static int kvm_s390_adapter_map(struct kvm *kvm, unsigned int id, __u64 addr)
>> ret = -EFAULT;
>> goto out;
>> }
>> - ret = get_user_pages_fast(map->addr, 1, FOLL_WRITE, &map->page);
>> - if (ret < 0)
>> - goto out;
>> - BUG_ON(ret != 1);
>> - down_write(&adapter->maps_lock);
>> - if (atomic_inc_return(&adapter->nr_maps) < MAX_S390_ADAPTER_MAPS) {
>> + spin_lock(&adapter->maps_lock);
>> + if (adapter->nr_maps < MAX_S390_ADAPTER_MAPS) {
>> + adapter->nr_maps++;
>> list_add_tail(&map->list, &adapter->maps);
>
> I do wonder if we should check for duplicates. The unmap path will only
> remove exactly one entry. But maybe this can never happen or is already
> handled on a higher layer.
This would be a broken userspace, but I also do not see what would break
in the host if this happens.
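
A rough userspace sketch of the duplicate case, under stated assumptions:
the dup_* names are hypothetical, DUP_MAX_MAPS is an arbitrary stand-in for
MAX_S390_ADAPTER_MAPS, a hand-rolled singly linked list replaces the kernel
list (inserting at the head rather than the tail), and -1 replaces the real
error codes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Arbitrary stand-in for MAX_S390_ADAPTER_MAPS. */
#define DUP_MAX_MAPS 4

struct dup_map {
	uint64_t guest_addr;
	struct dup_map *next;
};

struct dup_adapter {
	struct dup_map *maps;
	int nr_maps;
};

/* Add a mapping; like the quoted hunk, no check for duplicates. */
static int dup_adapter_map(struct dup_adapter *a, uint64_t addr)
{
	struct dup_map *map;

	if (a->nr_maps >= DUP_MAX_MAPS)
		return -1;
	map = malloc(sizeof(*map));
	if (!map)
		return -1;
	map->guest_addr = addr;
	map->next = a->maps;
	a->maps = map;
	a->nr_maps++;
	return 0;
}

/* Remove exactly one entry matching addr, even if duplicates exist. */
static int dup_adapter_unmap(struct dup_adapter *a, uint64_t addr)
{
	struct dup_map **pp;

	for (pp = &a->maps; *pp; pp = &(*pp)->next) {
		if ((*pp)->guest_addr == addr) {
			struct dup_map *victim = *pp;

			assert(a->nr_maps > 0);
			*pp = victim->next;
			free(victim);
			a->nr_maps--;
			return 0;
		}
	}
	return -1;
}

/* Teardown: free whatever is left, duplicates included. */
static void dup_adapter_destroy(struct dup_adapter *a)
{
	while (a->maps) {
		struct dup_map *next = a->maps->next;

		free(a->maps);
		a->maps = next;
	}
	a->nr_maps = 0;
}
```

In this model, mapping the same address twice and unmapping it once leaves
one stale entry behind. It only counts against the per-adapter limit and is
freed at teardown.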
>
>> }
>> @@ -2430,7 +2426,6 @@ void kvm_s390_destroy_adapters(struct kvm *kvm)
>> list_for_each_entry_safe(map, tmp,
>> &kvm->arch.adapters[i]->maps, list) {
>> list_del(&map->list);
>> - put_page(map->page);
>> kfree(map);
>> }
>> kfree(kvm->arch.adapters[i]);
>
> Between the gmap being removed in kvm_arch_vcpu_destroy() and
> kvm_s390_destroy_adapters(), the entries would no longer properly get
> invalidated. AFAIK, removing/freeing the gmap will not trigger any
> notifiers.
>
> Not sure if that's an issue (IOW, if we can have some very weird race).
> But I guess we would have similar races already :)
This is only called when all file descriptors are closed, which also closes
all irq routes. So I guess no I/O should be going on any more.
>
>> @@ -2690,6 +2685,31 @@ struct kvm_device_ops kvm_flic_ops = {
>> .destroy = flic_destroy,
>> };
>>
>> +void kvm_s390_adapter_gmap_notifier(struct gmap *gmap, unsigned long start,
>> + unsigned long end)
>> +{
>> + struct kvm *kvm = gmap->private;
>> + struct s390_map_info *map, *tmp;
>> + int i;
>> +
>> + for (i = 0; i < MAX_S390_IO_ADAPTERS; i++) {
>> + struct s390_io_adapter *adapter = kvm->arch.adapters[i];
>> +
>> + if (!adapter)
>> + continue;
>
> I have to ask a very dumb question: How is kvm->arch.adapters[] protected?
We only ever add new ones, and they are removed only at guest teardown, it seems.
[...]
Let me have a look if we can simplify this.