From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: Janosch Frank <frankja@linux.ibm.com>
Cc: kvm@vger.kernel.org, borntraeger@de.ibm.com, thuth@redhat.com,
	pasic@linux.ibm.com, david@redhat.com,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	scgl@linux.ibm.com
Subject: Re: [PATCH v7 01/17] KVM: s390: pv: leak the topmost page table when destroy fails
Date: Mon, 7 Feb 2022 10:02:39 +0100
Message-ID: <20220207100239.1e043759@p-imbrenda>
In-Reply-To: <0939aac3-9427-ed04-17e4-3c1e4195d509@linux.ibm.com>

On Mon, 7 Feb 2022 09:56:39 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 2/4/22 16:53, Claudio Imbrenda wrote:
> > Each secure guest must have a unique ASCE (address space control
> > element); we must avoid that new guests use the same page for their
> > ASCE, to avoid errors.
> > 
> > Since the ASCE mostly consists of the address of the topmost page table
> > (plus some flags), we must not return that memory to the pool unless
> > the ASCE is no longer in use.
> > 
> > Only a successful Destroy Secure Configuration UVC will make the ASCE
> > reusable again.
> > 
> > If the Destroy Configuration UVC fails, the ASCE cannot be reused for a
> > secure guest (either for the ASCE or for other memory areas). To avoid
> > a collision, it must not be used again. This is a permanent error and
> > the page becomes in practice unusable, so we set it aside and leak it.
> > On failure we already leak other memory that belongs to the ultravisor
> > (i.e. the variable and base storage for a guest) and not leaking the
> > topmost page table was an oversight.
> > 
> > This error (and thus the leakage) should not happen unless the hardware
> > is broken or KVM has some unknown serious bug.
> > 
> > Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > Fixes: 29b40f105ec8d55 ("KVM: s390: protvirt: Add initial vm and cpu lifecycle handling")
> > ---
> >   arch/s390/include/asm/gmap.h |  2 ++
> >   arch/s390/kvm/pv.c           |  9 +++--
> >   arch/s390/mm/gmap.c          | 69 ++++++++++++++++++++++++++++++++++++
> >   3 files changed, 77 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
> > index 40264f60b0da..746e18bf8984 100644
> > --- a/arch/s390/include/asm/gmap.h
> > +++ b/arch/s390/include/asm/gmap.h
> > @@ -148,4 +148,6 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],
> >   			     unsigned long gaddr, unsigned long vmaddr);
> >   int gmap_mark_unmergeable(void);
> >   void s390_reset_acc(struct mm_struct *mm);
> > +void s390_remove_old_asce(struct gmap *gmap);
> > +int s390_replace_asce(struct gmap *gmap);
> >   #endif /* _ASM_S390_GMAP_H */
> > diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
> > index 7f7c0d6af2ce..3c59ef763dde 100644
> > --- a/arch/s390/kvm/pv.c
> > +++ b/arch/s390/kvm/pv.c
> > @@ -166,10 +166,13 @@ int kvm_s390_pv_deinit_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
> >   	atomic_set(&kvm->mm->context.is_protected, 0);
> >   	KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc);
> >   	WARN_ONCE(cc, "protvirt destroy vm failed rc %x rrc %x", *rc, *rrc);
> > -	/* Inteded memory leak on "impossible" error */
> > -	if (!cc)
> > +	/* Intended memory leak on "impossible" error */
> > +	if (!cc) {
> >   		kvm_s390_pv_dealloc_vm(kvm);
> > -	return cc ? -EIO : 0;
> > +		return 0;
> > +	}
> > +	s390_replace_asce(kvm->arch.gmap);
> > +	return -EIO;
> >   }
> >   
> >   int kvm_s390_pv_init_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
> > diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
> > index dfee0ebb2fac..ce6cac4463f2 100644
> > --- a/arch/s390/mm/gmap.c
> > +++ b/arch/s390/mm/gmap.c
> > @@ -2714,3 +2714,72 @@ void s390_reset_acc(struct mm_struct *mm)
> >   	mmput(mm);
> >   }
> >   EXPORT_SYMBOL_GPL(s390_reset_acc);
> > +
> > +/**
> > + * s390_remove_old_asce - Remove the topmost level of page tables from the
> > + * list of page tables of the gmap.
> > + * @gmap the gmap whose table is to be removed
> > + *
> > + * This means that it will not be freed when the VM is torn down, and needs
> > + * to be handled separately by the caller, unless an intentional leak is
> > + * intended.
> > + */
> > +void s390_remove_old_asce(struct gmap *gmap)
> > +{
> > +	struct page *old;
> > +
> > +	old = virt_to_page(gmap->table);
> > +	spin_lock(&gmap->guest_table_lock);
> > +	list_del(&old->lru);
> > +	/*
> > +	 * in case the ASCE needs to be "removed" multiple times, for example
> > +	 * if the VM is rebooted into secure mode several times
> > +	 * concurrently.
> > +	 */
> > +	INIT_LIST_HEAD(&old->lru);
> > +	spin_unlock(&gmap->guest_table_lock);  
> 
> The patch itself looks fine to me, but there's one oddity which made me 
> look twice:
> 
> You're not overwriting gmap->table here so you can use it in the 
> function below. I guess that's intentional so it can still be used as a 
> reference until we switch over to the new ASCE page?

Yes. Maybe I should rename the function, or add more comments explaining
that the page is only removed from the list so that it will not be freed
at teardown, while it is still in use by the VM (because we always need
an ASCE).
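
To illustrate the distinction between "unlinked from the list" and "no
longer in use", here is a minimal user-space sketch. It is not the kernel
code: every sketch_* name is hypothetical, and the tiny list only loosely
models the kernel's list_head. Locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's circular doubly linked list. */
struct sketch_list {
	struct sketch_list *next, *prev;
};

void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

void sketch_list_add(struct sketch_list *node, struct sketch_list *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink and re-initialise, so that a second unlink is harmless. */
void sketch_list_del_init(struct sketch_list *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	sketch_list_init(node);
}

int sketch_list_empty(const struct sketch_list *head)
{
	return head->next == head;
}

/* Simplified models of struct page and struct gmap (assumptions). */
struct sketch_page {
	struct sketch_list lru;
};

struct sketch_gmap {
	struct sketch_list crst_list;	/* tables freed at teardown */
	struct sketch_page *table;	/* top-level table still in use */
};

/*
 * Take the top-level table off the free-at-teardown list only.
 * gmap->table is deliberately left untouched: the VM keeps using it
 * until a replacement is installed.
 */
void sketch_unlink_top_table(struct sketch_gmap *gmap)
{
	assert(gmap->table != NULL);
	sketch_list_del_init(&gmap->table->lru);
}
```

After sketch_unlink_top_table() the list no longer contains the page, but
gmap->table still points at it, and calling the function a second time
changes nothing.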

> 
> 
> > +}
> > +EXPORT_SYMBOL_GPL(s390_remove_old_asce);
> > +
> > +/**
> > + * s390_replace_asce - Try to replace the current ASCE of a gmap with
> > + * another equivalent one.
> > + * @gmap the gmap
> > + *
> > + * If the allocation of the new top level page table fails, the ASCE is not
> > + * replaced.
> > + * In any case, the old ASCE is always removed from the list. Therefore the
> > + * caller has to make sure to save a pointer to it beforehands, unless an
> > + * intentional leak is intended.
> > + */
> > +int s390_replace_asce(struct gmap *gmap)
> > +{
> > +	unsigned long asce;
> > +	struct page *page;
> > +	void *table;
> > +
> > +	s390_remove_old_asce(gmap);
> > +
> > +	page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER);
> > +	if (!page)
> > +		return -ENOMEM;
> > +	table = page_to_virt(page);
> > +	memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
> > +
> > +	/*
> > +	 * The caller has to deal with the old ASCE, but here we make sure
> > +	 * the new one is properly added to the list of page tables, so that
> > +	 * it will be freed when the VM is torn down.
> > +	 */
> > +	spin_lock(&gmap->guest_table_lock);
> > +	list_add(&page->lru, &gmap->crst_list);
> > +	spin_unlock(&gmap->guest_table_lock);
> > +
> > +	asce = (gmap->asce & ~PAGE_MASK) | __pa(table);  
> 
> Please add a comment:
> Set the new table origin while preserving ASCE control bits like table 
> type and length.

will do
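
For reference, the masking that the requested comment describes can be
sketched in plain user-space C as below. This assumes 4 KiB pages and a
PAGE_MASK-style mask that clears the low 12 bits; the SKETCH_* and
sketch_* names are made up for this example and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; both macros are local to this sketch. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  (~((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1))

/*
 * Set the new table origin while preserving the ASCE control bits
 * (such as table type and length), which live in the sub-page bits.
 */
uint64_t sketch_replace_origin(uint64_t old_asce, uint64_t new_table_pa)
{
	/* A misaligned origin would overwrite the control bits. */
	assert((new_table_pa & ~SKETCH_PAGE_MASK) == 0);

	return (old_asce & ~SKETCH_PAGE_MASK) | new_table_pa;
}
```

With an old value of 0x1234500f and a new table at 0xabcde000, the result
is 0xabcde00f: the origin changes and the low control bits are kept.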

> 
> > +	WRITE_ONCE(gmap->asce, asce);
> > +	WRITE_ONCE(gmap->mm->context.gmap_asce, asce);
> > +	WRITE_ONCE(gmap->table, table);
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(s390_replace_asce);
> >   
> 

