All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Farman <farman@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Janosch Frank <frankja@linux.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Thomas Huth <thuth@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	kvm@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability
Date: Tue, 30 Nov 2021 15:11:23 -0500	[thread overview]
Message-ID: <c37473352b1fd2d2e6211911750b6061d4806090.camel@linux.ibm.com> (raw)
In-Reply-To: <ffa6e5da-e0d2-6b21-f0aa-589ee7aabe47@redhat.com>

On Tue, 2021-11-23 at 19:44 +0100, David Hildenbrand wrote:
> > > > I ALSO took a stab at folding this into the S390 IRQ paths [2],
> > > > similar
> > > > to what was done with kvm_s390_stop_info. This worked
> > > > reasonably
> > > > well,
> > > > except the QEMU interface kvm_s390_vcpu_interrupt() returns a
> > > > void,
> > > > and
> > > > so wouldn't notice an error sent back by KVM. Not a deal
> > > > breaker,
> > > > but
> > > > having not heard anything to this idea, I didn't go much
> > > > farther.
> > > 
> > > We only care about SIGP STOP* handling so far, if anybody is
> > > aware of
> > > other issues
> > > that need fixing, it would be helpful  to spell them out. 
> > 
> > Yes, I keep using SIGP STOP* as an example because it offers (A) a
> > clear miscommunication with the status bits returned by SIGP SENSE,
> > and
> > (B) has some special handling in QEMU that to my experience is
> > still
> > incomplete. But I'm seeing issues with other orders, like SIGP
> > RESTART
> > [1] [2] (where QEMU does a kvm_s390_vcpu_interrupt() with
> > KVM_S390_RESTART, and thus adds to pending_irq) and SIGP (INITIAL)
> > CPU
> > RESET [2] (admittedly not greatly researched).
> 
> Sorry I missed that discussion previously. Summarizing what the PoP
> says:
> 
> Once we have a pending:
> * start (-> synchronous)
> * stop (-> asynchronous)
> * restart (-> asynchronous)
> * stop-and-store-status (-> asynchronous)
> * set-prefix (-> synchronous)
> * store-status-at-address (-> synchronous)
> * store-additional-status-at-address (-> synchronous)
> * initial-CPU-reset (-> synchronous)
> * CPU-reset order (-> synchronous)

I would argue that these last two are not necessarily synchronous.
Looking at QEMU...

handle_sigp_single_dst()
sigp_(initial_)cpu_reset()
s390_cpu_reset()
>> If kvm_enabled(), call
   kvm_s390_reset_vcpu_initial()
   -or-
   kvm_s390_reset_vcpu_normal()
kvm_s390_reset_vcpu()
kvm_vcpu_ioctl()

(Switch to KVM code)
kvm_vcpu_ioctl()
 -- Acquire vcpu->mutex
kvm_arch_vcpu_ioctl()
kvm_arch_vcpu_ioctl_initial_reset()
 -and/or-
kvm_arch_vcpu_ioctl_normal_reset()

So, given that, couldn't it be possible for a SIGP SENSE to be sent to
a CPU that is currently processing one of the RESET orders? The mutex
acquired as part of the ioctl would end up gating one of them, when
according to POPS a subsequent SENSE should get CC2 until the reset
completes. Unlike STOP*/RESTART, there's no IRQ that we can key off of
to know that the RESET is still in process, unless we define another
IOCTL as in v2-v4 here.

> 
> The following orders have to return busy (my take: only a must if the
> guest could observe the difference):
> * sense
> * external call
> * emergency signal

These ones:

> * start
> * stop
> * restart
> * stop and store status
> * set prefix
> * store status at address
> * set architecture
> * set multithreading
> * store additional status at address

... are handled in userspace, which QEMU serializes in handle_sigp()
and returns CC2 today:

    if (qemu_mutex_trylock(&qemu_sigp_mutex)) {
        ret = SIGP_CC_BUSY;
        goto out;
    }

> 
> We have to ask ourselves
> 
> a) How could a single VCPU observe the difference if executing any
> other
> instruction immediately afterwards. My take would be that for
> synchronous orders we can't really. So we're left with:
> * stop (-> asynchronous)
> * restart (-> asynchronous)
> * stop-and-store-status (-> asynchronous)
> 
> b) How could multiple VCPUs observe the difference that a single VCPU
> can't observe. That will require more thought, but I think it will be
> hardly possible.
> 
> 
> We know that SIGP STOP * needs care.
> 
> SIGP RESTART is interesting. We inject it only for OPERATING cpus and
> it
> will only change the LC psw. What if we execute immediately
> afterwards:
> 
> * sense -> does not affect any bits
> * external call -> has higher IRQ priority. There could be a
> difference
>   if injected before or after the restart was delivered. Could be
> fixed
>   in the kernel (check IRQ_PEND_RESTART).
> * emergency signal -> has higher IRQ priority. There could be a
>   difference if injected before or after the restart was delivered.
>   Could be fixed in the kernel (check IRQ_PEND_RESTART).
> * start -> CPU is already operating
> * stop -> restart is delivered first
> * restart -> I think the lowcore will look different if we inject two
>   RESTARTs immediately afterwards instead of only a single
>   one. Could be fixed in the kernel (double-deliver the interrupt).
> * stop and store status -> restart is delivered first
> * set prefix -> CPU is operating, not possible
> * store status at address -> CPU is operating, not possible
> * set architecture -> don't care
> * set multithreading -> don't care
> * store additional status at address -> CPU is operating, not
> possible
> * initial-CPU-reset -> clears local IRQ. LC will look different if
>   RESTART was delivered or not. Could be fixed in the kernel quite
>   easily (deliver restart first before clearing interrupts).
> * CPU-reset -> clears local IRQs. LC will look different if
>   injected before vs. after. Could be fixed in the kernel quite
>   easily (deliver restart first before clearing interrupts)..

These might be of value. I don't yet have a clear order of events in
these scenarios, but will keep this in mind as I am seeing some
oddities there.

> 
> external call as handled by the SIGP interpretation facility will
> already also violate that description. We don't know that a SIGP
> restart
> is pending. We'd have to disable the SIGP interpretation facility
> temporarily.
> 
> /me shivers

/me too

> 
> This sounds like the kind of things we should happily not be fixing
> because nobody might really care :)
> 
> 

Hi, me again, really hoping I don't care about this aspect of it. :)

> 
> > The reason for why I have no spent a lot of time in the latter is
> > that
> > I have also mentioned that POPS has lists of orders that will
> > return
> > busy [3], and so something more general is perhaps warranted. The
> > QEMU
> > RFC's don't handle anything further than SIGP STOP*, on account of
> > it
> > makes sense to get the interface right first.
> 
> Right. My take is to have a look what we actually have to fix --
> where
> the guest can actually observe the difference. If the guest can't
> observe the difference there is no need to actually implement BUSY
> logic
> as instructed in the PoP.

My concern is largely with SIGP SENSE giving a clear answer for the
state of the cpu. Since most of the SIGP orders have some variation of
"may not occur/be completed during the execution of SIGNAL PROCESSOR"
in POPs, the SIGP SENSE is a quick mechanism (being defined as a "fast"
order) to determine if another order has completed or is still in-
process, and if the cpu was left in the expected state at the
completion of the order.

It does suggest that those synchronous orders could offer a misleading
answer, but since there's no (obvious) loss of control in those paths,
it's not as big a concern for me.

> 
> > > I'll keep assuming that
> > > only SIGP STOP*  needs fixing, as I already explained.
> > > 
> > > Whenever QEMU tells a CPU to stop asynchronously, it does so via
> > > a
> > > STOP IRQ from
> > > the destination CPU thread via KVM_S390_SIGP_STOP. Then, we know
> > > the
> > > CPU is busy
> > > ... until we clear that interrupt, which happens via
> > > kvm_s390_clear_stop_irq().
> > > 
> > > Interestingly, we clear that interrupt via two paths:
> > > 
> > > 1) kvm_s390_clear_local_irqs(), via KVM_S390_INITIAL_RESET and 
> > >    KVM_S390_NORMAL_RESET. Here, we expect that new user space
> > > also
> > > sets  
> > >    the CPU to stopped via KVM_MP_STATE_STOPPED. In fact, modern
> > > QEMU 
> > >    properly sets the CPU stopped before triggering clearing of
> > > the 
> > >    interrupts (s390_cpu_reset()).
> > > 2) kvm_s390_clear_stop_irq() via kvm_s390_vcpu_stop(), which is 
> > >    triggered via:
> > > 
> > > a) STOP intercept (handle_stop()), KVM_S390_INITIAL_RESET and 
> > >    KVM_S390_NORMAL_RESET with old user space -- 
> > >    !kvm_s390_user_cpu_state_ctrl().
> > > 
> > > b) KVM_MP_STATE_STOPPED for modern user space.
> > > 
> > > 
> > > 
> > > Would the following solve our SIGP STOP * issue w.o. uapi
> > > changes?
> > > 
> > 
> > A quick pass shows some promise, but I haven't the bandwidth to
> > throw
> > the battery of stuff at it. I'll have to take a closer look after
> > the
> > US Holiday to give a better answer. (Example: looking for
> > IRQ_PEND_SIGP_STOP || IRQ_PEND_RESTART is trivial.)
> 
> Yes, extending to IRQ_PEND_RESTART would make sense.
> 

Running my stressors at this combined patch goes well. Going to work on
some additional ones this week, with some debug on the reset orders.

Thanks,
Eric


  reply	other threads:[~2021-11-30 20:11 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-10 20:33 [RFC PATCH v3 0/2] s390x: Improvements to SIGP handling [KVM] Eric Farman
2021-11-10 20:33 ` [RFC PATCH v3 1/2] Capability/IOCTL/Documentation Eric Farman
2021-11-10 20:33 ` [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
2021-11-11  9:15   ` David Hildenbrand
2021-11-11 15:03     ` Eric Farman
2021-11-11 16:13       ` Janosch Frank
2021-11-11 17:48         ` Eric Farman
2021-11-11 18:29           ` David Hildenbrand
2021-11-11 19:05             ` Eric Farman
2021-11-11 19:15               ` David Hildenbrand
2021-11-11 19:44                 ` Eric Farman
2021-11-12  9:34                   ` David Hildenbrand
2021-11-12  9:35                     ` David Hildenbrand
2021-11-17  7:54               ` Christian Borntraeger
2021-11-19 20:20                 ` Eric Farman
2021-11-22 10:52                   ` David Hildenbrand
2021-11-23 17:42                     ` Eric Farman
2021-11-23 18:44                       ` David Hildenbrand
2021-11-30 20:11                         ` Eric Farman [this message]
2021-11-12  8:49           ` Janosch Frank
2021-11-12 16:09             ` Eric Farman
2021-11-12 20:30               ` Eric Farman
2021-11-11 16:16   ` Janosch Frank
2021-11-11 17:50     ` Eric Farman
2021-11-11 19:48   ` kernel test robot
2021-11-12 15:10   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c37473352b1fd2d2e6211911750b6061d4806090.camel@linux.ibm.com \
    --to=farman@linux.ibm.com \
    --cc=borntraeger@de.ibm.com \
    --cc=david@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.