kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Farman <farman@linux.ibm.com>
To: Janosch Frank <frankja@linux.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Thomas Huth <thuth@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	kvm@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability
Date: Fri, 12 Nov 2021 15:30:25 -0500	[thread overview]
Message-ID: <b206e7b73696907328bc4338664dea1ef572e8aa.camel@linux.ibm.com> (raw)
In-Reply-To: <42b1d53a11106f9f8e9d9b34a41b6bba738eb410.camel@linux.ibm.com>

On Fri, 2021-11-12 at 11:09 -0500, Eric Farman wrote:
> On Fri, 2021-11-12 at 09:49 +0100, Janosch Frank wrote:
> > On 11/11/21 18:48, Eric Farman wrote:
> > > On Thu, 2021-11-11 at 17:13 +0100, Janosch Frank wrote:
> > > > On 11/11/21 16:03, Eric Farman wrote:
> > > > > On Thu, 2021-11-11 at 10:15 +0100, David Hildenbrand wrote:
> > > > > > On 10.11.21 21:33, Eric Farman wrote:
> > > > > > > With commit 2444b352c3ac ("KVM: s390: forward most SIGP
> > > > > > > orders
> > > > > > > to
> > > > > > > user
> > > > > > > space") we have a capability that allows the "fast" SIGP
> > > > > > > orders
> > > > > > > (as
> > > > > > > defined by the Programming Notes for the SIGNAL PROCESSOR
> > > > > > > instruction in
> > > > > > > the Principles of Operation) to be handled in-kernel,
> > > > > > > while
> > > > > > > all
> > > > > > > others are
> > > > > > > sent to userspace for processing.
> > > > > > > 
> > > > > > > This works fine but it creates a situation when, for
> > > > > > > example, a
> > > > > > > SIGP SENSE
> > > > > > > might return CC1 (STATUS STORED, and status bits
> > > > > > > indicating
> > > > > > > the
> > > > > > > vcpu is
> > > > > > > stopped), when in actuality userspace is still processing
> > > > > > > a
> > > > > > > SIGP
> > > > > > > STOP AND
> > > > > > > STORE STATUS order, and the vcpu is not yet actually
> > > > > > > stopped.
> > > > > > > Thus,
> > > > > > > the
> > > > > > > SIGP SENSE should actually be returning CC2 (busy)
> > > > > > > instead
> > > > > > > of
> > > > > > > CC1.
> > > > > > > 
> > > > > > > To fix this, add another CPU capability, dependent on the
> > > > > > > USER_SIGP
> > > > > > > one,
> > > > > > > and two associated IOCTLs. One IOCTL will be used by
> > > > > > > userspace
> > > > > > > to
> > > > > > > mark a
> > > > > > > vcpu "busy" processing a SIGP order, and cause concurrent
> > > > > > > orders
> > > > > > > handled
> > > > > > > in-kernel to be returned with CC2 (busy). Another IOCTL
> > > > > > > will be
> > > > > > > used by
> > > > > > > userspace to mark the SIGP "finished", and the vcpu free
> > > > > > > to
> > > > > > > process
> > > > > > > additional orders.
> > > > > > > 
> > > > > > 
> > > > > > This looks much cleaner to me, thanks!
> > > > > > 
> > > > > > [...]
> > > > > > 
> > > > > > > diff --git a/arch/s390/kvm/kvm-s390.h
> > > > > > > b/arch/s390/kvm/kvm-
> > > > > > > s390.h
> > > > > > > index c07a050d757d..54371cede485 100644
> > > > > > > --- a/arch/s390/kvm/kvm-s390.h
> > > > > > > +++ b/arch/s390/kvm/kvm-s390.h
> > > > > > > @@ -82,6 +82,22 @@ static inline int is_vcpu_idle(struct
> > > > > > > kvm_vcpu
> > > > > > > *vcpu)
> > > > > > >    	return test_bit(vcpu->vcpu_idx, vcpu->kvm-
> > > > > > > > arch.idle_mask);
> > > > > > >    }
> > > > > > >    
> > > > > > > +static inline bool kvm_s390_vcpu_is_sigp_busy(struct
> > > > > > > kvm_vcpu
> > > > > > > *vcpu)
> > > > > > > +{
> > > > > > > +	return (atomic_read(&vcpu->arch.sigp_busy) == 1);
> > > > > > 
> > > > > > You can drop ()
> > > > > > 
> > > > > > > +}
> > > > > > > +
> > > > > > > +static inline bool kvm_s390_vcpu_set_sigp_busy(struct
> > > > > > > kvm_vcpu
> > > > > > > *vcpu)
> > > > > > > +{
> > > > > > > +	/* Return zero for success, or -EBUSY if another vcpu
> > > > > > > won */
> > > > > > > +	return (atomic_cmpxchg(&vcpu->arch.sigp_busy, 0, 1) ==
> > > > > > > 0) ? 0 :
> > > > > > > -EBUSY;
> > > > > > 
> > > > > > You can drop () as well.
> > > > > > 
> > > > > > We might not need the -EBUSY semantics after all. User
> > > > > > space
> > > > > > can
> > > > > > just
> > > > > > track if it was set, because it's in charge of setting it.
> > > > > 
> > > > > Hrm, I added this to distinguish a newer kernel with an older
> > > > > QEMU,
> > > > > but
> > > > > of course an older QEMU won't know the difference either.
> > > > > I'll
> > > > > doublecheck that this is works fine in the different
> > > > > permutations.
> > > > > 
> > > > > > > +}
> > > > > > > +
> > > > > > > +static inline void kvm_s390_vcpu_clear_sigp_busy(struct
> > > > > > > kvm_vcpu
> > > > > > > *vcpu)
> > > > > > > +{
> > > > > > > +	atomic_set(&vcpu->arch.sigp_busy, 0);
> > > > > > > +}
> > > > > > > +
> > > > > > >    static inline int kvm_is_ucontrol(struct kvm *kvm)
> > > > > > >    {
> > > > > > >    #ifdef CONFIG_KVM_S390_UCONTROL
> > > > > > > diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
> > > > > > > index 5ad3fb4619f1..a37496ea6dfa 100644
> > > > > > > --- a/arch/s390/kvm/sigp.c
> > > > > > > +++ b/arch/s390/kvm/sigp.c
> > > > > > > @@ -276,6 +276,10 @@ static int handle_sigp_dst(struct
> > > > > > > kvm_vcpu
> > > > > > > *vcpu, u8 order_code,
> > > > > > >    	if (!dst_vcpu)
> > > > > > >    		return SIGP_CC_NOT_OPERATIONAL;
> > > > > > >    
> > > > > > > +	if (kvm_s390_vcpu_is_sigp_busy(dst_vcpu)) {
> > > > > > > +		return SIGP_CC_BUSY;
> > > > > > > +	}
> > > > > > 
> > > > > > You can drop {}
> > > > > 
> > > > > Arg, I had some debug in there which needed the braces, and
> > > > > of
> > > > > course
> > > > > it's unnecessary now. Thanks.
> > > > > 
> > > > > > > +
> > > > > > >    	switch (order_code) {
> > > > > > >    	case SIGP_SENSE:
> > > > > > >    		vcpu->stat.instruction_sigp_sense++;
> > > > > > > @@ -411,6 +415,12 @@ int kvm_s390_handle_sigp(struct
> > > > > > > kvm_vcpu
> > > > > > > *vcpu)
> > > > > > >    	if (handle_sigp_order_in_user_space(vcpu,
> > > > > > > order_code,
> > > > > > > cpu_addr))
> > > > > > >    		return -EOPNOTSUPP;
> > > > > > >    
> > > > > > > +	/* Check the current vcpu, if it was a target from
> > > > > > > another vcpu
> > > > > > > */
> > > > > > > +	if (kvm_s390_vcpu_is_sigp_busy(vcpu)) {
> > > > > > > +		kvm_s390_set_psw_cc(vcpu, SIGP_CC_BUSY);
> > > > > > > +		return 0;
> > > > > > > +	}
> > > > > > 
> > > > > > I don't think we need this. I think the above (checking the
> > > > > > target of
> > > > > > a
> > > > > > SIGP order) is sufficient. Or which situation do you have
> > > > > > in
> > > > > > mind?
> > > > > > 
> > > > > 
> > > > > Hrm... I think you're right. I was thinking of this:
> > > > > 
> > > > > VCPU 1 - SIGP STOP CPU 2
> > > > > VCPU 2 - SIGP SENSE CPU 1
> > > > > 
> > > > > But of course either CPU2 is going to be marked "busy" first,
> > > > > and
> > > > > the
> > > > > sense doesn't get processed until it's reset, or the sense
> > > > > arrives
> > > > > first, and the busy/notbusy doesn't matter. Let me
> > > > > doublecheck
> > > > > my
> > > > > tests
> > > > > for the non-RFC version.
> > > > > 
> > > > > > I do wonder if we want to make this a kvm_arch_vcpu_ioctl()
> > > > > > instead,
> > > > > 
> > > > > In one of my original attempts between v1 and v2, I had put
> > > > > this
> > > > > there.
> > > > > This reliably deadlocks my guest, because the caller
> > > > > (kvm_vcpu_ioctl())
> > > > > tries to acquire vcpu->mutex, and racing SIGPs (via KVM_RUN)
> > > > > might
> > > > > already be holding it. Thus, it's an async ioctl. I could
> > > > > fold
> > > > > it
> > > > > into
> > > > > the existing interrupt ioctl, but as those are architected
> > > > > structs
> > > > > it
> > > > > seems more natural do it this way. Or I have mis-understood
> > > > > something
> > > > > along the way?
> > > > > 
> > > > > > essentially just providing a KVM_S390_SET_SIGP_BUSY *and*
> > > > > > providing
> > > > > > the
> > > > > > order. "order == 0" sets it to !busy.
> > > > > 
> > > > > I'd tried this too, since it provided some nice debug-
> > > > > ability.
> > > > > Unfortunately, I have a testcase (which I'll eventually get
> > > > > folded
> > > > > into
> > > > > kvm-unit-tests :)) that picks a random order between 0-255,
> > > > > knowing
> > > > > that there's only a couple handfuls of valid orders, to check
> > > > > the
> > > > > response. Zero is valid architecturally (POPS figure 4-29),
> > > > > even if
> > > > > it's unassigned. The likelihood of it becoming assigned is
> > > > > probably
> > > > > quite low, but I'm not sure that I like special-casing an
> > > > > order
> > > > > of
> > > > > zero
> > > > > in this way.
> > > > > 
> > > > 
> > > > Looking at the API I'd like to avoid having two IOCTLs
> > > 
> > > Since the order is a single byte, we could have the payload of an
> > > ioctl
> > > say "0-255 is an order that we're busy processing, anything
> > > higher
> > > than
> > > that resets the busy" or something. That would remove the need
> > > for
> > > a
> > > second IOCTL.
> > > 
> > > > and I'd love to
> > > > see some way to extend this without the need for a whole new
> > > > IOCTL.
> > > > 
> > > 
> > > Do you mean zero IOCTLs? Because... I think the only way we can
> > > do
> > > that
> > > is to get rid of USER_SIGP altogether, and handle everything in-
> > > kernel.
> > > Some weeks ago I played with QEMU not enabling USER_SIGP, but I
> > > can't
> > > say I've tried it since we went down this "mark a vcpu busy"
> > > path.
> > > If I
> > > do that busy/not-busy tagging in the kernel for !USER_SIGP, that
> > > might
> > > not be a bad thing anyway. But I don't know how we get the
> > > behavior
> > > straightened out for USER_SIGP without some type of handshake.
> > 
> > I'd move over to a very small struct argument with a command and a
> > flags 
> > field so we can extend the IOCTL at a later time without the need
> > to 
> > introduce a new IOCTL.
> > IMHO there's no real need to make this IOCTL as small as possible
> > and 
> > only handle an int as the argument with < 0 shenanigans. We should 
> > rather focus on making this a nice and sane API if we have the
> > option
> > to 
> > do so.
> > 
> 
> So then naming this as "USER_SIGP_BUSY" is too narrow. What about
> just
> "USER_BUSY" and something like this?
> 
> enum user_busy_function {
> 	SET,
> 	RESET,
> }
> 
> enum user_busy_reason {
> 	SIGP,
> }
> 
> struct user_busy {
> 	int function;
> 	int reason;
> 	long payload;		// Optional? In our case, SIGP order
> }
> 

Looking at this further, rather than pile on as its own thing, why
couldn't this be added as an IRQ type and embed such a structure into
kvm_s390_irq? Then the ioctl is moot.

> > > > > > Not that we would need the value
> > > > > > right now, but who knows for what we might reuse that
> > > > > > interface
> > > > > > in
> > > > > > the
> > > > > > future.
> > > > > > 
> > > > > > Thanks!
> > > > > > 


  reply	other threads:[~2021-11-12 20:30 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-10 20:33 [RFC PATCH v3 0/2] s390x: Improvements to SIGP handling [KVM] Eric Farman
2021-11-10 20:33 ` [RFC PATCH v3 1/2] Capability/IOCTL/Documentation Eric Farman
2021-11-10 20:33 ` [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability Eric Farman
2021-11-11  9:15   ` David Hildenbrand
2021-11-11 15:03     ` Eric Farman
2021-11-11 16:13       ` Janosch Frank
2021-11-11 17:48         ` Eric Farman
2021-11-11 18:29           ` David Hildenbrand
2021-11-11 19:05             ` Eric Farman
2021-11-11 19:15               ` David Hildenbrand
2021-11-11 19:44                 ` Eric Farman
2021-11-12  9:34                   ` David Hildenbrand
2021-11-12  9:35                     ` David Hildenbrand
2021-11-17  7:54               ` Christian Borntraeger
2021-11-19 20:20                 ` Eric Farman
2021-11-22 10:52                   ` David Hildenbrand
2021-11-23 17:42                     ` Eric Farman
2021-11-23 18:44                       ` David Hildenbrand
2021-11-30 20:11                         ` Eric Farman
2021-11-12  8:49           ` Janosch Frank
2021-11-12 16:09             ` Eric Farman
2021-11-12 20:30               ` Eric Farman [this message]
2021-11-11 16:16   ` Janosch Frank
2021-11-11 17:50     ` Eric Farman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b206e7b73696907328bc4338664dea1ef572e8aa.camel@linux.ibm.com \
    --to=farman@linux.ibm.com \
    --cc=borntraeger@de.ibm.com \
    --cc=david@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).