All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Farman <farman@linux.ibm.com>
To: Cornelia Huck <cohuck@redhat.com>, Halil Pasic <pasic@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, Pierre Morel <pmorel@linux.ibm.com>,
	kvm@vger.kernel.org, qemu-s390x@nongnu.org,
	Farhan Ali <alifm@linux.ibm.com>,
	qemu-devel@nongnu.org,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH v2 2/5] vfio-ccw: concurrent I/O handling
Date: Thu, 24 Jan 2019 14:16:21 -0500	[thread overview]
Message-ID: <b1f0a5f6-51b0-0986-00dd-51bdd79a4779@linux.ibm.com> (raw)
In-Reply-To: <20190123143429.7a8da112.cohuck@redhat.com>



On 01/23/2019 08:34 AM, Cornelia Huck wrote:
> On Wed, 23 Jan 2019 14:06:01 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> On Wed, 23 Jan 2019 11:34:47 +0100
>> Cornelia Huck <cohuck@redhat.com> wrote:
> 
>> Yes, one can usually think of interfaces as contracts: both sides need
>> to keep their end for things to work as intended. Unfortunately the
>> vfio-ccw iterface is not a very well specified one, and that makes
>> reasoning about right order so much harder.
> 
> That's probably where our disconnect comes from.
> 
>>
>> I was under the impression that the right ordering is dictated by the
>> SCSW in userspace. E.g. if there is an FC bit set there userspace is not
>> ought to issue a SSCH request (write to the io_region). The kernel part
>> however may say 'userspace read the actual SCSW' by signaling
>> the io_trigger eventfd. Userspace is supposed to read the IRB from the
>> region and update it's SCSW.
>>
>> Now if userspace reads a broken SCSW from the IRB, because of a race
>> (due to poorly written kernel part -- userspace not at fault), it is
>> going to make wrong assumptions about currently legal and illegal
>> operations (ordering).
> 
> My understanding of the interface was that writing to the I/O region
> triggers a ssch (unless rejected with error) and that reading it just
> gets whatever the kernel wrote there the last time it updated its
> internal structures. The eventfd simply triggers to say "the region has
> been updated with an IRB", not to say "userspace, read this".
> 
>>
>> Previously I described a scenario where IRB can break without userspace
>> being at fault (race between unsolicited interrupt -- can happen at any
>> time -- and a legit io request). I was under the impression we agreed on
>> this.
> 
> There is a bug in there (clearing the cp for non-final interrupts), and
> it needs to be fixed. I'm not so sure if the unsolicited interrupt
> thing is a bug (beyond that the internal state machine is confused).
> 
>>
>> This in turn could lead to userspace violating the contract, as perceived
>> by the kernel side.
> 
> Which contract? ;)
> 
> Also, I'm not sure if we'd rather get a deferred cc 1?

As I'm encountering dcc=1 quite regularly lately, it's a nice error. 
But we don't have a good way of recovering from it, and so my test tends 
to go down in a heap quite quickly.  This patch set will probably help; 
I should really get it applied and try it out.

> 
>>
>>> At this point, I'm mostly confused... I'd prefer to simply fix things
>>> as they come up so that we can finally move forward with the halt/clear
>>> handling (and probably rework the state machine on top of that.)

+1 for fixing things as we go.  I hear the complaints about this code 
(and probably say them too), but remain convinced that a large rewrite 
is unnecessary.  Lots of opportunities for improvement, with lots of 
willing and motivated participants, means it can only get better!

>>>    
>>
>> I understand. I guess you will want to send a new version because of the
>> stuff that got lost in the rebase, or?
> 
> Yes, I'll send a new version; but I'll wait for more feedback for a bit.
> 

I'll try to provide some now.  Still digging through the emails marked 
"todo" :)

  - Eric

WARNING: multiple messages have this Message-ID (diff)
From: Eric Farman <farman@linux.ibm.com>
To: Cornelia Huck <cohuck@redhat.com>, Halil Pasic <pasic@linux.ibm.com>
Cc: Farhan Ali <alifm@linux.ibm.com>,
	Pierre Morel <pmorel@linux.ibm.com>,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org,
	Alex Williamson <alex.williamson@redhat.com>,
	qemu-devel@nongnu.org, qemu-s390x@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling
Date: Thu, 24 Jan 2019 14:16:21 -0500	[thread overview]
Message-ID: <b1f0a5f6-51b0-0986-00dd-51bdd79a4779@linux.ibm.com> (raw)
In-Reply-To: <20190123143429.7a8da112.cohuck@redhat.com>



On 01/23/2019 08:34 AM, Cornelia Huck wrote:
> On Wed, 23 Jan 2019 14:06:01 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> On Wed, 23 Jan 2019 11:34:47 +0100
>> Cornelia Huck <cohuck@redhat.com> wrote:
> 
>> Yes, one can usually think of interfaces as contracts: both sides need
>> to keep their end for things to work as intended. Unfortunately the
>> vfio-ccw iterface is not a very well specified one, and that makes
>> reasoning about right order so much harder.
> 
> That's probably where our disconnect comes from.
> 
>>
>> I was under the impression that the right ordering is dictated by the
>> SCSW in userspace. E.g. if there is an FC bit set there userspace is not
>> ought to issue a SSCH request (write to the io_region). The kernel part
>> however may say 'userspace read the actual SCSW' by signaling
>> the io_trigger eventfd. Userspace is supposed to read the IRB from the
>> region and update it's SCSW.
>>
>> Now if userspace reads a broken SCSW from the IRB, because of a race
>> (due to poorly written kernel part -- userspace not at fault), it is
>> going to make wrong assumptions about currently legal and illegal
>> operations (ordering).
> 
> My understanding of the interface was that writing to the I/O region
> triggers a ssch (unless rejected with error) and that reading it just
> gets whatever the kernel wrote there the last time it updated its
> internal structures. The eventfd simply triggers to say "the region has
> been updated with an IRB", not to say "userspace, read this".
> 
>>
>> Previously I described a scenario where IRB can break without userspace
>> being at fault (race between unsolicited interrupt -- can happen at any
>> time -- and a legit io request). I was under the impression we agreed on
>> this.
> 
> There is a bug in there (clearing the cp for non-final interrupts), and
> it needs to be fixed. I'm not so sure if the unsolicited interrupt
> thing is a bug (beyond that the internal state machine is confused).
> 
>>
>> This in turn could lead to userspace violating the contract, as perceived
>> by the kernel side.
> 
> Which contract? ;)
> 
> Also, I'm not sure if we'd rather get a deferred cc 1?

As I'm encountering dcc=1 quite regularly lately, it's a nice error. 
But we don't have a good way of recovering from it, and so my test tends 
to go down in a heap quite quickly.  This patch set will probably help; 
I should really get it applied and try it out.

> 
>>
>>> At this point, I'm mostly confused... I'd prefer to simply fix things
>>> as they come up so that we can finally move forward with the halt/clear
>>> handling (and probably rework the state machine on top of that.)

+1 for fixing things as we go.  I hear the complaints about this code 
(and probably say them too), but remain convinced that a large rewrite 
is unnecessary.  Lots of opportunities for improvement, with lots of 
willing and motivated participants, means it can only get better!

>>>    
>>
>> I understand. I guess you will want to send a new version because of the
>> stuff that got lost in the rebase, or?
> 
> Yes, I'll send a new version; but I'll wait for more feedback for a bit.
> 

I'll try to provide some now.  Still digging through the emails marked 
"todo" :)

  - Eric

  reply	other threads:[~2019-01-24 19:16 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-21 11:03 [PATCH v2 0/5] vfio-ccw: support hsch/csch (kernel part) Cornelia Huck
2019-01-21 11:03 ` [Qemu-devel] " Cornelia Huck
2019-01-21 11:03 ` [PATCH v2 1/5] vfio-ccw: make it safe to access channel programs Cornelia Huck
2019-01-21 11:03   ` [Qemu-devel] " Cornelia Huck
2019-01-22 14:56   ` Halil Pasic
2019-01-22 14:56     ` [Qemu-devel] " Halil Pasic
2019-01-22 15:19     ` Cornelia Huck
2019-01-22 15:19       ` [Qemu-devel] " Cornelia Huck
2019-01-21 11:03 ` [PATCH v2 2/5] vfio-ccw: concurrent I/O handling Cornelia Huck
2019-01-21 11:03   ` [Qemu-devel] " Cornelia Huck
2019-01-21 20:20   ` Halil Pasic
2019-01-21 20:20     ` [Qemu-devel] " Halil Pasic
2019-01-22 10:29     ` Cornelia Huck
2019-01-22 10:29       ` [Qemu-devel] " Cornelia Huck
2019-01-22 11:17       ` Halil Pasic
2019-01-22 11:17         ` [Qemu-devel] " Halil Pasic
2019-01-22 11:53         ` Cornelia Huck
2019-01-22 11:53           ` [Qemu-devel] " Cornelia Huck
2019-01-22 12:46           ` Halil Pasic
2019-01-22 12:46             ` [Qemu-devel] " Halil Pasic
2019-01-22 17:26             ` Cornelia Huck
2019-01-22 17:26               ` [Qemu-devel] " Cornelia Huck
2019-01-22 19:03               ` Halil Pasic
2019-01-22 19:03                 ` [Qemu-devel] " Halil Pasic
2019-01-23 10:34                 ` Cornelia Huck
2019-01-23 10:34                   ` [Qemu-devel] " Cornelia Huck
2019-01-23 13:06                   ` Halil Pasic
2019-01-23 13:06                     ` [Qemu-devel] " Halil Pasic
2019-01-23 13:34                     ` Cornelia Huck
2019-01-23 13:34                       ` [Qemu-devel] " Cornelia Huck
2019-01-24 19:16                       ` Eric Farman [this message]
2019-01-24 19:16                         ` Eric Farman
2019-01-25 10:13                         ` Cornelia Huck
2019-01-25 10:13                           ` [Qemu-devel] " Cornelia Huck
2019-01-22 18:33   ` Halil Pasic
2019-01-22 18:33     ` [Qemu-devel] " Halil Pasic
2019-01-23 10:21     ` Cornelia Huck
2019-01-23 10:21       ` [Qemu-devel] " Cornelia Huck
2019-01-23 13:30       ` Halil Pasic
2019-01-23 13:30         ` [Qemu-devel] " Halil Pasic
2019-01-24 10:05         ` Cornelia Huck
2019-01-24 10:05           ` [Qemu-devel] " Cornelia Huck
2019-01-24 10:08       ` Pierre Morel
2019-01-24 10:08         ` [Qemu-devel] " Pierre Morel
2019-01-24 10:19         ` Cornelia Huck
2019-01-24 10:19           ` [Qemu-devel] " Cornelia Huck
2019-01-24 11:18           ` Pierre Morel
2019-01-24 11:18             ` [Qemu-devel] " Pierre Morel
2019-01-24 11:45           ` Halil Pasic
2019-01-24 11:45             ` [Qemu-devel] " Halil Pasic
2019-01-24 19:14           ` Eric Farman
2019-01-24 19:14             ` [Qemu-devel] " Eric Farman
2019-01-25  2:25   ` Eric Farman
2019-01-25  2:25     ` [Qemu-devel] " Eric Farman
2019-01-25  2:37     ` Eric Farman
2019-01-25  2:37       ` [Qemu-devel] " Eric Farman
2019-01-25 10:24       ` Cornelia Huck
2019-01-25 10:24         ` [Qemu-devel] " Cornelia Huck
2019-01-25 12:58         ` Cornelia Huck
2019-01-25 12:58           ` [Qemu-devel] " Cornelia Huck
2019-01-25 14:01           ` Halil Pasic
2019-01-25 14:01             ` [Qemu-devel] " Halil Pasic
2019-01-25 14:21             ` Cornelia Huck
2019-01-25 14:21               ` [Qemu-devel] " Cornelia Huck
2019-01-25 16:04               ` Halil Pasic
2019-01-25 16:04                 ` [Qemu-devel] " Halil Pasic
2019-01-28 17:13                 ` Cornelia Huck
2019-01-28 17:13                   ` [Qemu-devel] " Cornelia Huck
2019-01-28 19:30                   ` Halil Pasic
2019-01-28 19:30                     ` [Qemu-devel] " Halil Pasic
2019-01-29  9:58                     ` Cornelia Huck
2019-01-29  9:58                       ` [Qemu-devel] " Cornelia Huck
2019-01-29 19:39                       ` Halil Pasic
2019-01-29 19:39                         ` [Qemu-devel] " Halil Pasic
2019-01-30 13:29                         ` Cornelia Huck
2019-01-30 13:29                           ` [Qemu-devel] " Cornelia Huck
2019-01-30 14:32                           ` Farhan Ali
2019-01-30 14:32                             ` [Qemu-devel] " Farhan Ali
2019-01-28 17:09             ` Cornelia Huck
2019-01-28 17:09               ` [Qemu-devel] " Cornelia Huck
2019-01-28 19:15               ` Halil Pasic
2019-01-28 19:15                 ` [Qemu-devel] " Halil Pasic
2019-01-28 21:48                 ` Eric Farman
2019-01-28 21:48                   ` [Qemu-devel] " Eric Farman
2019-01-29 10:20                   ` Cornelia Huck
2019-01-29 10:20                     ` [Qemu-devel] " Cornelia Huck
2019-01-29 14:14                     ` Eric Farman
2019-01-29 14:14                       ` [Qemu-devel] " Eric Farman
2019-01-29 18:53                       ` Cornelia Huck
2019-01-29 18:53                         ` [Qemu-devel] " Cornelia Huck
2019-01-29 10:10                 ` Cornelia Huck
2019-01-29 10:10                   ` [Qemu-devel] " Cornelia Huck
2019-01-25 15:57           ` Eric Farman
2019-01-25 15:57             ` [Qemu-devel] " Eric Farman
2019-01-28 17:24             ` Cornelia Huck
2019-01-28 17:24               ` [Qemu-devel] " Cornelia Huck
2019-01-28 21:50               ` Eric Farman
2019-01-28 21:50                 ` [Qemu-devel] " Eric Farman
2019-01-25 20:22         ` Eric Farman
2019-01-25 20:22           ` [Qemu-devel] " Eric Farman
2019-01-28 17:31           ` Cornelia Huck
2019-01-28 17:31             ` [Qemu-devel] " Cornelia Huck
2019-01-25 13:09       ` Halil Pasic
2019-01-25 13:09         ` [Qemu-devel] " Halil Pasic
2019-01-25 12:58     ` Halil Pasic
2019-01-25 12:58       ` [Qemu-devel] " Halil Pasic
2019-01-25 20:21       ` Eric Farman
2019-01-25 20:21         ` [Qemu-devel] " Eric Farman
2019-01-21 11:03 ` [PATCH v2 3/5] vfio-ccw: add capabilities chain Cornelia Huck
2019-01-21 11:03   ` [Qemu-devel] " Cornelia Huck
2019-01-23 15:57   ` [qemu-s390x] " Halil Pasic
2019-01-23 15:57     ` [Qemu-devel] " Halil Pasic
2019-01-25 16:19   ` Eric Farman
2019-01-25 16:19     ` [Qemu-devel] " Eric Farman
2019-01-25 21:00     ` Eric Farman
2019-01-25 21:00       ` [Qemu-devel] " Eric Farman
2019-01-28 17:34       ` Cornelia Huck
2019-01-28 17:34         ` [Qemu-devel] " Cornelia Huck
2019-01-21 11:03 ` [PATCH v2 4/5] s390/cio: export hsch to modules Cornelia Huck
2019-01-21 11:03   ` [Qemu-devel] " Cornelia Huck
2019-01-22 15:21   ` [qemu-s390x] " Halil Pasic
2019-01-22 15:21     ` [Qemu-devel] " Halil Pasic
2019-01-21 11:03 ` [PATCH v2 5/5] vfio-ccw: add handling for async channel instructions Cornelia Huck
2019-01-21 11:03   ` [Qemu-devel] " Cornelia Huck
2019-01-23 15:51   ` Halil Pasic
2019-01-23 15:51     ` [Qemu-devel] " Halil Pasic
2019-01-24 10:06     ` Cornelia Huck
2019-01-24 10:06       ` [Qemu-devel] " Cornelia Huck
2019-01-24 10:37       ` Halil Pasic
2019-01-24 10:37         ` [Qemu-devel] " Halil Pasic
2019-01-25 21:00   ` Eric Farman
2019-01-25 21:00     ` [Qemu-devel] " Eric Farman
2019-01-28 17:40     ` Cornelia Huck
2019-01-28 17:40       ` [Qemu-devel] " Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b1f0a5f6-51b0-0986-00dd-51bdd79a4779@linux.ibm.com \
    --to=farman@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=alifm@linux.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=pasic@linux.ibm.com \
    --cc=pmorel@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.