From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>,
	qemu-s390x <qemu-s390x@nongnu.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	Cornelia Huck <cohuck@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Halil Pasic <pasic@linux.vnet.ibm.com>,
	Janosch Frank <frankja@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/1] s390/kvm: implement clearing part of IPL clear
Date: Thu, 1 Mar 2018 12:58:12 +0000
Message-ID: <20180301125811.GF2994@work-vm>
In-Reply-To: <fa1fed0b-4c7d-068e-01d3-cd34aa9d6864@de.ibm.com>

* Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> 
> 
> On 03/01/2018 01:35 PM, Christian Borntraeger wrote:
> > 
> > 
> > On 03/01/2018 01:28 PM, Dr. David Alan Gilbert wrote:
> >> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> >>>
> >>>
> >>> On 03/01/2018 12:45 PM, Dr. David Alan Gilbert wrote:
> >>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> >>>>>
> >>>>>
> >>>>> On 03/01/2018 10:24 AM, Dr. David Alan Gilbert wrote:
> >>>>>> * Thomas Huth (thuth@redhat.com) wrote:
> >>>>>>> On 28.02.2018 20:53, Christian Borntraeger wrote:
> >>>>>>>> When a guest reboots with diagnose 308 subcode 3 it requests the memory
> >>>>>>>> to be cleared. We did not do it so far. This does not only violate the
> >>>>>>>> architecture, it also misses the chance to free up that memory on
> >>>>>>>> reboot, which would help on host memory over commitment.  By using
> >>>>>>>> ram_block_discard_range we can cover both cases.
> >>>>>>>
> >>>>>>> Sounds like a good idea. I wonder whether that release_all_ram()
> >>>>>>> function should maybe rather reside in exec.c, so that other machines
> >>>>>>> that want to clear all RAM at reset time can use it, too?
> >>>>>>>
> >>>>>>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> >>>>>>>> ---
> >>>>>>>>  target/s390x/kvm.c | 19 +++++++++++++++++++
> >>>>>>>>  1 file changed, 19 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> >>>>>>>> index 8f3a422288..2e145ad5c3 100644
> >>>>>>>> --- a/target/s390x/kvm.c
> >>>>>>>> +++ b/target/s390x/kvm.c
> >>>>>>>> @@ -34,6 +34,8 @@
> >>>>>>>>  #include "qapi/error.h"
> >>>>>>>>  #include "qemu/error-report.h"
> >>>>>>>>  #include "qemu/timer.h"
> >>>>>>>> +#include "qemu/rcu_queue.h"
> >>>>>>>> +#include "sysemu/cpus.h"
> >>>>>>>>  #include "sysemu/sysemu.h"
> >>>>>>>>  #include "sysemu/hw_accel.h"
> >>>>>>>>  #include "hw/boards.h"
> >>>>>>>> @@ -41,6 +43,7 @@
> >>>>>>>>  #include "sysemu/device_tree.h"
> >>>>>>>>  #include "exec/gdbstub.h"
> >>>>>>>>  #include "exec/address-spaces.h"
> >>>>>>>> +#include "exec/ram_addr.h"
> >>>>>>>>  #include "trace.h"
> >>>>>>>>  #include "qapi-event.h"
> >>>>>>>>  #include "hw/s390x/s390-pci-inst.h"
> >>>>>>>> @@ -1841,6 +1844,14 @@ static int kvm_arch_handle_debug_exit(S390CPU *cpu)
> >>>>>>>>      return ret;
> >>>>>>>>  }
> >>>>>>>>  
> >>>>>>>> +static void release_all_rams(void)
> >>>>>>>
> >>>>>>> s/rams/ram/ maybe?
> >>>>>>>
> >>>>>>>> +{
> >>>>>>>> +    struct RAMBlock *rb;
> >>>>>>>> +
> >>>>>>>> +    QLIST_FOREACH_RCU(rb, &ram_list.blocks, next)
> >>>>>>>> +        ram_block_discard_range(rb, 0, rb->used_length);
> >>>>>>>
> >>>>>>> From a coding style point of view, I think there should be curly braces
> >>>>>>> around ram_block_discard_range() ?
> >>>>>>
> >>>>>> I think this might break if it happens during a postcopy migrate.
> >>>>>> The destination CPU is running, so it can do a reboot at just the wrong
> >>>>>> time; and then the pages (that are protected by userfaultfd) would get
> >>>>>> deallocated and trigger userfaultfd requests if accessed.
> >>>>>
> >>>>> Yes, userfaultfd/postcopy is really fragile and relies on things that are not
> >>>>> necessarily true (e.g. virtio-balloon can also invalidate pages).
> >>>>
> >>>> That's why we use qemu_balloon_inhibit around postcopy to stop
> >>>> ballooning; I'm not aware of anything else that does the same.
> >>>
> >>> we also have at least the pte_unused thing in mm/rmap.c that clearly
> >>> predates userfaultfd. We might need to look into this as well....
> >>
> >> I've not come across that; what does that do?
> > 
> > It can drop a page on page out if the page is no longer of value. It is used by
> > the CMMA (guest page hinting) code of s390x.
> > 
> > see kernel mm/rmap.c
> > 
> > 
> > static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> >                      unsigned long address, void *arg)
> > {
> > [...]
> >                 } else if (pte_unused(pteval)) {
> >                         /*
> >                          * The guest indicated that the page content is of no
> >                          * interest anymore. Simply discard the pte, vmscan
> >                          * will take care of the rest.
> >                          */
> >                         dec_mm_counter(mm, mm_counter(page));
> >                         /* We have to invalidate as we cleared the pte */
> >                         mmu_notifier_invalidate_range(mm, address,
> >                                                       address + PAGE_SIZE);
> >                 } else if (IS_ENABLED(CONFIG_MIGRATION) &&
> >                                 (flags & (TTU_MIGRATION|TTU_SPLIT_FREEZE))) {
> > [...]
> > 
> > 
> 
> Maybe something like this in the kernel
> 
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 47db27f8049e..9bdf4d448987 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1483,7 +1483,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>                                 set_pte_at(mm, address, pvmw.pte, pteval);
>                         }
>  
> -               } else if (pte_unused(pteval)) {
> +               } else if (pte_unused(pteval) && !vma->vm_userfaultfd_ctx.ctx) {
>                         /*
>                          * The guest indicated that the page content is of no
>                          * interest anymore. Simply discard the pte, vmscan
> 
> 
> could help?

I guess so, but please check with aarcange; I don't know the mm code.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
