* RFC Userspace hypercalls
From: Andrew Cooper @ 2016-01-06 11:44 UTC
  To: Xen-devel List; +Cc: Tim Deegan, Keir Fraser, Jan Beulich

Hi,

I am in the middle of getting my Xen Test Framework working and usable.

Embarrassingly, the unit test I hacked up for investigating XSA-106
(which was the inspiration to make the framework) correctly identifies
the regression caused by XSA-156.  To avoid similar situations in the
future, I am getting XTF into a usable state as a matter of priority.

The XTF uses a flat, shared address space, with the test free to change
CPL as part of normal operation.  For the XSA-106 usecase, this was to
confirm that the x86 emulator correctly performed DPL checks on emulated
exception injection.

All console logging is synchronous (to ensure that log messages have
escaped the VM before an action occurs) and by default, an HVM test will
use the qemu debug port, console_io hypercall, and PV console (which
uses evtchn hypercalls).
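
As a rough sketch, the first two of those channels look like this
(assuming the conventional qemu debug port 0xe9 and a hypercall page
already installed; hypercall_page, outb and log_sync are illustrative
names, not XTF's actual ones):

    /* Sketch: synchronous logging via the qemu debug port and the
     * console_io hypercall, for a 64-bit HVM guest. */
    #include <stdint.h>
    #include <stddef.h>

    #define __HYPERVISOR_console_io 18   /* from public/xen.h */
    #define CONSOLEIO_write          0

    extern uint8_t hypercall_page[];     /* set up via the Xen MSR */

    static void outb(uint16_t port, uint8_t val)
    {
        asm volatile ("outb %0, %1" :: "a" (val), "Nd" (port));
    }

    static long hypercall3(unsigned int op, unsigned long a1,
                           unsigned long a2, unsigned long a3)
    {
        long ret;

        /* 64-bit ABI: args in rdi/rsi/rdx; entries are 32 bytes apart. */
        asm volatile ("call *%[entry]"
                      : "=a" (ret)
                      : [entry] "r" (hypercall_page + op * 32),
                        "D" (a1), "S" (a2), "d" (a3)
                      : "memory");
        return ret;
    }

    void log_sync(const char *buf, size_t len)
    {
        for ( size_t i = 0; i < len; ++i )
            outb(0xe9, buf[i]);          /* qemu debug console */

        hypercall3(__HYPERVISOR_console_io, CONSOLEIO_write,
                   len, (unsigned long)buf);
    }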

This causes problems when the test moves into userspace.  The qemu debug
port can trivially be fixed by setting IOPL=3, but the hypercalls are
more problematic.  The HVM ABI (for whatever reason) unilaterally fails
a userspace hypercall with -EPERM, making it impossible for the kernel
to trap-and-forward even if it wanted to.
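
For reference, the refusal happens right at the top of Xen's HVM
hypercall dispatcher.  Paraphrasing the logic in
xen/arch/x86/hvm/hvm.c (a sketch, not a verbatim quote):

    /* Any hypercall from CPL > 0 is failed before dispatch, so the
     * guest kernel never sees a fault it could trap-and-forward. */
    hvm_get_segment_register(curr, x86_seg_ss, &sreg);
    if ( unlikely(sreg.attr.fields.dpl != 0) )
    {
        regs->eax = -EPERM;
        return HVM_HCALL_completed;
    }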

There are already scenarios under test where we cannot rely on the test
kernel having a fully functioning set of entry points (e.g. the DPL part
of the test above).  Therefore I specifically want to make it possible
to make userspace hypercalls, rather than simply making them possible to
be trapped-and-forwarded.


As a result, I propose introducing a hypercall which allows a domain
to adjust its entry criteria for hypercalls (e.g. set_hypercall_iopl).
Doing this for HVM guests is straightforward, but PV guests are harder,
as they bounce through Xen entrypoints.
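
To make the shape concrete, the interface might look something like
this (entirely hypothetical; the name and field are placeholders for
discussion, and nothing here exists in the ABI today):

    /* Hypothetical sketch of a set_hypercall_iopl argument structure. */
    struct xen_hvm_set_hypercall_iopl {
        /* Least-privileged ring (0-3) still permitted to make
         * hypercalls.  0 is today's behaviour; 3 would admit direct
         * userspace hypercalls. */
        uint8_t iopl;
    };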

For PV guests, I propose that userspace hypercalls get implemented with
the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
causes the hypercall page writing logic to consider the guest a ring1
kernel, and the int $0x82 entrypoint suitably delegates between a
regular hypercall and a compat hypercall.
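
Concretely, a userspace caller under that scheme would end up doing
just this (32-bit PV ABI shown, with eax holding the hypercall number
and arguments in ebx/ecx/...; a sketch, assuming Xen's gate accepts
ring 3):

    static inline long hypercall2(unsigned int op,
                                  unsigned long a1, unsigned long a2)
    {
        long ret;

        /* int $0x82 traps straight to Xen; no bounce through the
         * guest kernel's entrypoints is needed. */
        asm volatile ("int $0x82"
                      : "=a" (ret)
                      : "a" (op), "b" (a1), "c" (a2)
                      : "memory");
        return ret;
    }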

Thoughts?

~Andrew

* Re: RFC Userspace hypercalls
From: Jan Beulich @ 2016-01-06 14:14 UTC
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

>>> On 06.01.16 at 12:44, <andrew.cooper3@citrix.com> wrote:
> The HVM ABI (for whatever reason) unilaterally fails
> a userspace hypercall with -EPERM, making it impossible for the kernel
> to trap-and-forward even if it wanted to.

Perhaps just to match PV behavior?

> There are already scenarios under test where we cannot rely on the test
> kernel having a fully functioning set of entry points (e.g. the DPL part
> of the test above).  Therefore I specifically want to make it possible
> to make userspace hypercalls, rather than simply making them possible to
> be trapped-and-forwarded.
> 
> 
> As a result, I propose introducing a hypercall which allows a domain
> to adjust its entry criteria for hypercalls (e.g. set_hypercall_iopl).
> Doing this for HVM guests is straightforward, but PV guests are harder,
> as they bounce through Xen entrypoints.

The primary question I have is whether this proposal is going to be
of use to anything other than your test framework (i.e. namely any
"ordinary" guests). A second question then would be whether the PV
case really needs to be handled.

> For PV guests, I propose that userspace hypercalls get implemented with
> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
> causes the hypercall page writing logic to consider the guest a ring1
> kernel, and the int $0x82 entrypoint suitably delegates between a
> regular hypercall and a compat hypercall.

With int $0x82 being the primary hypercall path for 32-bit guests,
I'd be concerned about any code addition, especially that of further
conditionals.

Jan

* Re: RFC Userspace hypercalls
From: Andrew Cooper @ 2016-01-06 14:44 UTC
  To: Jan Beulich; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

On 06/01/16 14:14, Jan Beulich wrote:
>>>> On 06.01.16 at 12:44, <andrew.cooper3@citrix.com> wrote:
>> The HVM ABI (for whatever reason) unilaterally fails
>> a userspace hypercall with -EPERM, making it impossible for the kernel
>> to trap-and-forward even if it wanted to.
> Perhaps just to match PV behavior?

But it doesn't.  PV userspace hypercalls currently end up in the guest
kernel at the sysenter or int $0x82 handler.

>
>> There are already scenarios under test where we cannot rely on the test
>> kernel having a fully functioning set of entry points (e.g. the DPL part
>> of the test above).  Therefore I specifically want to make it possible
>> to make userspace hypercalls, rather than simply making them possible to
>> be trapped-and-forwarded.
>>
>>
>> As a result, I propose introducing a hypercall which allows a domain
>> to adjust its entry criteria for hypercalls (e.g. set_hypercall_iopl).
>> Doing this for HVM guests is straightforward, but PV guests are harder,
>> as they bounce through Xen entrypoints.
> The primary question I have is whether this proposal is going to be
> of use to anything other than your test framework (i.e. namely any
> "ordinary" guests).

We did have an internal request for an HVM guest userspace netfront
driver to be able to use evtchnop calls directly.

The use of userspace hypercalls is restricted to single appliances
(rather than general purpose VMs), but isn't limited to my test
framework specifically.

> A second question then would be whether the PV case really needs to be handled.

Yes - I am going out of my way to make the test environments as
equivalent as possible.

>
>> For PV guests, I propose that userspace hypercalls get implemented with
>> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
>> causes the hypercall page writing logic to consider the guest a ring1
>> kernel, and the int $0x82 entrypoint suitably delegates between a
>> regular hypercall and a compat hypercall.
> With int $0x82 being the primary hypercall path for 32-bit guests,
> I'd be concerned about any code addition, especially that of further
> conditionals.

The overhead of one extra conditional in the hypercall path is lost in
the noise, compared to the overhead of the task switch itself.

~Andrew

* Re: RFC Userspace hypercalls
From: Jan Beulich @ 2016-01-06 16:09 UTC
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

>>> On 06.01.16 at 15:44, <andrew.cooper3@citrix.com> wrote:
> On 06/01/16 14:14, Jan Beulich wrote:
>>>>> On 06.01.16 at 12:44, <andrew.cooper3@citrix.com> wrote:
>>> The HVM ABI (for whatever reason) unilaterally fails
>>> a userspace hypercall with -EPERM, making it impossible for the kernel
>>> to trap-and-forward even if it wanted to.
>> Perhaps just to match PV behavior?
> 
> But it doesn't.  PV userspace hypercalls currently end up in the guest
> kernel at the sysenter or int $0x82 handler.

That's not the part I meant it could have been intended to match
in behavior; I only referred to the privilege aspect.

>>> For PV guests, I propose that userspace hypercalls get implemented with
>>> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
>>> causes the hypercall page writing logic to consider the guest a ring1
>>> kernel, and the int $0x82 entrypoint suitably delegates between a
>>> regular hypercall and a compat hypercall.
>> With int $0x82 being the primary hypercall path for 32-bit guests,
>> I'd be concerned about any code addition, especially that of further
>> conditionals.
> 
> The overhead of one extra conditional in the hypercall path is lost in
> the noise, compared to the overhead of the task switch itself.

Task switch? On the hypercall path?

Jan

* Re: RFC Userspace hypercalls
From: Andrew Cooper @ 2016-01-06 16:20 UTC
  To: Jan Beulich; +Cc: Keir Fraser, Tim Deegan, Xen-devel List

On 06/01/16 16:09, Jan Beulich wrote:
>
>>>> For PV guests, I propose that userspace hypercalls get implemented with
>>>> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
>>>> causes the hypercall page writing logic to consider the guest a ring1
>>>> kernel, and the int $0x82 entrypoint suitably delegates between a
>>>> regular hypercall and a compat hypercall.
>>> With int $0x82 being the primary hypercall path for 32-bit guests,
>>> I'd be concerned about any code addition, especially that of further
>>> conditionals.
>> The overhead of one extra conditional in the hypercall path is lost in
>> the noise, compared to the overhead of the task switch itself.
> Task switch? On the hypercall path?

Apologies - I meant the context switch caused by `int $0x82`.

~Andrew

* Re: RFC Userspace hypercalls
From: Jan Beulich @ 2016-01-06 16:24 UTC
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

>>> On 06.01.16 at 17:20, <andrew.cooper3@citrix.com> wrote:
> On 06/01/16 16:09, Jan Beulich wrote:
>>
>>>>> For PV guests, I propose that userspace hypercalls get implemented with
>>>>> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
>>>>> causes the hypercall page writing logic to consider the guest a ring1
>>>>> kernel, and the int $0x82 entrypoint suitably delegates between a
>>>>> regular hypercall and a compat hypercall.
>>>> With int $0x82 being the primary hypercall path for 32-bit guests,
>>>> I'd be concerned about any code addition, especially that of further
>>>> conditionals.
>>> The overhead of one extra conditional in the hypercall path is lost in
>>> the noise, compared to the overhead of the task switch itself.
>> Task switch? On the hypercall path?
> 
> Apologies - I meant the context switch caused by `int $0x82`.

I don't think a software interrupt is unilaterally on all hardware so
slow that a mispredicted branch would be completely unnoticeable.

Jan

* Re: RFC Userspace hypercalls
From: Jan Beulich @ 2016-01-06 16:31 UTC
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

>>> On 06.01.16 at 15:44, <andrew.cooper3@citrix.com> wrote:
> We did have an internal request for an HVM guest userspace netfront
> driver to be able to use evtchnop calls directly.

And this can't be accomplished using the evtchn and/or privcmd
drivers?

Jan

* Re: RFC Userspace hypercalls
From: Andrew Cooper @ 2016-01-06 16:38 UTC
  To: Jan Beulich; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

On 06/01/16 16:31, Jan Beulich wrote:
>>>> On 06.01.16 at 15:44, <andrew.cooper3@citrix.com> wrote:
>> We did have an internal request for an HVM guest userspace netfront
>> driver to be able to use evtchnop calls directly.
> And this can't be accomplished using the evtchn and/or privcmd
> drivers?

It can, and I don't believe the worry about extra overhead is well
placed.  (There are many areas of lower-hanging fruit in this specific case.)

However, a userspace backend isn't in principle a bad idea.

~Andrew

* Re: RFC Userspace hypercalls
From: David Vrabel @ 2016-01-06 16:41 UTC
  To: Jan Beulich, Andrew Cooper; +Cc: Keir Fraser, Tim Deegan, Xen-devel List

On 06/01/16 16:31, Jan Beulich wrote:
>>>> On 06.01.16 at 15:44, <andrew.cooper3@citrix.com> wrote:
>> We did have an internal request for an HVM guest userspace netfront
>> driver to be able to use evtchnop calls directly.
> 
> And this can't be accomplished using the evtchn and/or privcmd
> drivers?

It can and should be done with the evtchn driver.
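
(For comparison, the evtchn-driver route from userspace is along these
lines.  A sketch against the Linux evtchn device; the port numbers are
made up and error handling is omitted:)

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <xen/sys/evtchn.h>   /* header path varies by environment */

    int main(void)
    {
        int fd = open("/dev/xen/evtchn", O_RDWR);

        struct ioctl_evtchn_bind_interdomain bind = {
            .remote_domain = 0,    /* example: a dom0 backend */
            .remote_port   = 42,   /* made-up remote port */
        };
        int port = ioctl(fd, IOCTL_EVTCHN_BIND_INTERDOMAIN, &bind);

        unsigned int pending;
        read(fd, &pending, sizeof(pending));   /* blocks for an event */

        struct ioctl_evtchn_notify notify = { .port = port };
        ioctl(fd, IOCTL_EVTCHN_NOTIFY, &notify);

        close(fd);
        return 0;
    }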

Even for tests I think they should return to kernel mode to perform
hypercalls.  Tests should not run in a "magic" mode that normal guests
won't use.

If there are failure cases where return to kernel isn't possible and
some logging would be useful, perhaps writing to a memory buffer and
retrieving this via a crash dump would be ok?
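
(Sketch of the sort of thing I mean: a fixed, magic-tagged buffer that
a post-mortem tool could locate in the dump.  The layout is purely
illustrative:)

    struct crash_log {
        uint32_t magic;        /* arbitrary tag to find it in the dump */
        uint32_t used;         /* bytes valid in buf[] */
        char     buf[4096 - 8];
    };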

David

* Re: RFC Userspace hypercalls
From: Jan Beulich @ 2016-01-06 16:49 UTC
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

>>> On 06.01.16 at 17:38, <andrew.cooper3@citrix.com> wrote:
> On 06/01/16 16:31, Jan Beulich wrote:
>>>>> On 06.01.16 at 15:44, <andrew.cooper3@citrix.com> wrote:
>>> We did have an internal request for an HVM guest userspace netfront
>>> driver to be able to use evtchnop calls directly.
>> And this can't be accomplished using the evtchn and/or privcmd
>> drivers?
> 
> It can, and I don't believe the worry about extra overhead is well
> placed.  (There are many areas of lower-hanging fruit in this specific case.)
> 
> However, a userspace backend isn't in principle a bad idea.

Backend? Earlier you said frontend. Nor can I see how using
the evtchn/privcmd devices would preclude that. After all, their
purpose is to avoid having to expose hypercalls directly.

Jan

* Re: RFC Userspace hypercalls
From: Andrew Cooper @ 2016-01-06 17:06 UTC
  To: Jan Beulich; +Cc: Keir Fraser, Tim Deegan, Xen-devel List

On 06/01/16 16:49, Jan Beulich wrote:
>>>> On 06.01.16 at 17:38, <andrew.cooper3@citrix.com> wrote:
>> On 06/01/16 16:31, Jan Beulich wrote:
>>>>>> On 06.01.16 at 15:44, <andrew.cooper3@citrix.com> wrote:
>>>> We did have an internal request for an HVM guest userspace netfront
>>>> driver to be able to use evtchnop calls directly.
>>> And this can't be accomplished using the evtchn and/or privcmd
>>> drivers?
>> It can, and I don't believe the worry about extra overhead is well
>> placed.  (There are many areas of lower-hanging fruit in this specific case.)
>>
>> However, a userspace backend isn't in principle a bad idea.
> Backend? Earlier you said frontend.

I did mean frontend, but it really doesn't matter as far as this is
concerned.

> Nor can I see how using the evtchn/privcmd devices would preclude that.

They don't.  I didn't imply that they would.

> After all, their purpose is to avoid having to expose hypercalls directly.

That is only one of their purposes.  Another is to enforce separation
between processes, and to handle allocation of global resources.

In a dedicated utility VM, where all components are trusted, none of
these reasons have as much weight as they do in a general purpose OS,
and there is a valid argument to be made for favouring performance over
isolation.

I am not suggesting that userspace hypercalls would make an
orders-of-magnitude difference, but they would make some difference, and
allow a Xen domain to take a more RDMA-like approach, if it chooses.

~Andrew

* Re: RFC Userspace hypercalls
From: Ian Campbell @ 2016-01-07 10:42 UTC
  To: Andrew Cooper, Xen-devel List; +Cc: Keir Fraser, Tim Deegan, Jan Beulich

On Wed, 2016-01-06 at 11:44 +0000, Andrew Cooper wrote:
> All console logging is synchronous (to ensure that log messages have
> escaped the VM before an action occurs) and by default, an HVM test will
> use the qemu debug port, console_io hypercall, and PV console (which
> uses evtchn hypercalls).

All three simultaneously, or it picks one depending on the scenario?

> There are already scenarios under test where we cannot rely on the test
> kernel having a fully functioning set of entry points (e.g. the DPL part
> of the test above).  Therefore I specifically want to make it possible
> to make userspace hypercalls, rather than simply making them possible to
> be trapped-and-forwarded.

And in these test cases there is useful logging to be done between the
break-the-world and repair-the-world phases, which I suppose follow if
things didn't crash?

> As a result, I propose introducing a hypercall which allows a domain
> to adjust its entry criteria for hypercalls (e.g. set_hypercall_iopl).
> Doing this for HVM guests is straightforward, but PV guests are harder,
> as they bounce through Xen entrypoints.
> 
> For PV guests, I propose that userspace hypercalls get implemented with
> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
> causes the hypercall page writing logic to consider the guest a ring1
> kernel, and the int $0x82 entrypoint suitably delegates between a
> regular hypercall and a compat hypercall.
> 
> Thoughts?

Would a xenconsoled mode which polls for updates (on specific guests
only), along with the guest spinning until the cons pointer catches the
prod one if it cares about synchronous logging, be sufficient for this
use case?

Other random ideas:
- Implement the debug io port for PV guests too.
- Log to an in-guest buffer, as David suggested, possibly using xenaccess
  or similar to trap updates or as a doorbell.

* Re: RFC Userspace hypercalls
From: Andrew Cooper @ 2016-01-07 10:55 UTC
  To: Ian Campbell, Xen-devel List; +Cc: Keir Fraser, Tim Deegan, Jan Beulich

On 07/01/16 10:42, Ian Campbell wrote:
> On Wed, 2016-01-06 at 11:44 +0000, Andrew Cooper wrote:
>> All console logging is synchronous (to ensure that log messages have
>> escaped the VM before an action occurs) and by default, an HVM test will
>> use the qemu debug port, console_io hypercall, and PV console (which
>> uses evtchn hypercalls).
> All three simultaneously, or it picks one depending on the scenario?

Currently all three (for simplicity), but I want to make the precise
setup configurable.

>
>> There are already scenarios under test where we cannot rely on the test
>> kernel having a fully functioning set of entry points (e.g. the DPL part
>> of the test above).  Therefore I specifically want to make it possible
>> to make userspace hypercalls, rather than simply making them possible to
>> be trapped-and-forwarded.
> And in these test cases there is useful logging to be done between the
> break-the-world and repair-the-world phases, which I suppose follow if
> things didn't crash?

Precisely.

>
>> As a result, I propose introducing a hypercall which allows a domain
>> to adjust its entry criteria for hypercalls (e.g. set_hypercall_iopl).
>> Doing this for HVM guests is straightforward, but PV guests are harder,
>> as they bounce through Xen entrypoints.
>>
>> For PV guests, I propose that userspace hypercalls get implemented with
>> the int $0x82 path exclusively.  i.e. enabling userspace hypercalls
>> causes the hypercall page writing logic to consider the guest a ring1
>> kernel, and the int $0x82 entrypoint suitably delegates between a
>> regular hypercall and a compat hypercall.
>>
>> Thoughts?
> Would a xenconsoled mode which polls for updates (on specific guests
> only), along with the guest spinning until the cons pointer catches the
> prod one if it cares about synchronous logging, be sufficient for this
> use case?

The framework already waits for cons to catch prod.
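
(That wait is just the usual PV console ring drain; roughly:)

    #include <xen/io/console.h>

    /* Spin until xenconsoled has consumed everything we produced. */
    static void console_flush(volatile struct xencons_interface *ring)
    {
        while ( ring->out_cons != ring->out_prod )
            asm volatile ("pause" ::: "memory");
    }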

>
> Other random ideas:
> Implement the debug io port for PV guests too
> Log to a in guest buffer, as David suggested, possibly use xenaccess or
> similar to trap updates or as a doorbell.

Specifically not.  I have been bitten by that one too many times already.

In the case of XSA regression tests, or indeed the random x86
instruction executor which discovered XSA-44, the logging needs to have
escaped the host before the action is taken, or it all gets lost in a
host crash.

This is why console_io hypercalls are also used.

~Andrew
