From: Volodymyr Babchuk <vlad.babchuk@gmail.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>,
	Artem Mygaiev <joculator@gmail.com>,
	Xen Devel <xen-devel@lists.xen.org>
Subject: Re: [ARM] Native application design and discussion (I hope)
Date: Fri, 7 Apr 2017 14:03:47 +0300
Message-ID: <CAOcqxo2n9gtwUz0xv0t22aA=mO8UZboT3ciuoN_e5q6isAWaNw@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.10.1704061349310.2759@sstabellini-ThinkPad-X260>

Hi Stefano,


On 7 April 2017 at 00:31, Stefano Stabellini <sstabellini@kernel.org> wrote:
> On Thu, 6 Apr 2017, Volodymyr Babchuk wrote:
>> Hello all,
>>
>> I want to discuss EL0 (native) applications for XEN. This will be a relatively
>> long e-mail with requirements, a proposed design and my PoC results.
>>
>> So, why do we want XEN native applications in the first place? I see the
>> following reasons:
>>
>> 1. Isolation. I see XEN as a sort of micro-kernel, so there is no place for
>> device drivers, emulators, specific SMC handlers, hypervisor extensions, etc.
>>
>> 2. Modularity. Just look at the Linux kernel: for different devices we
>> can load different drivers.
>>
>> 3. Performance. A native application should be faster than a stub domain,
>> or there is no point in it.
>>
>> 4. Ease of use. I want to make calling an EL0 app as easy as possible,
>> ideally as easy as a function call.
>>
>> Actually, no one wants extra code in the hypervisor, so reasons (1) and (2)
>> are the most important. I know that there were attempts to do a similar thing
>> on x86, but with a different approach. I want to describe my idea for arm64.
>>
>> A native application is another domain type. It has its own vCPU (only one
>> at the moment). A native app is loaded like any other kernel, using the ELF
>> loader. It looks like another stub domain such as MiniOS, but there are two
>> big differences:
>
> Could you describe the reason why you are suggesting it? Unless strictly
> necessary, I wouldn't go down the vcpu route, because as soon as we
> bring a vcpu into the picture, we have a number of problems, including
> scheduling, affinity, etc. It is also user-visible (xl vcpu-list), which
> I don't think it should be.
I used this in my PoC because I didn't want to do extra work. Also, it looks
very natural: a domain is essentially the same as a process, and a vcpu is like
a thread. But yes, I have already had some issues with the scheduler.
Manageable, though.

> I understand that one of the goals is "Modularity", which makes us think
> of an ELF loader, such as the one for a new domain. I agree that
> modularity is important, but I would solve it as a second step. In the first
> instance, I would limit the scope to running some code under
> /xen/arch/arm/apps or, better, /apps (for example) in a lower privilege
> mode. After that is done and working, I am sure we can find a way to
> dynamically load more apps at run time.
Again, using the existing domain framework was the easiest way. I needed some
container to hold the app, and a domain fits perfectly: I need to map pages
there, I need routines to copy to and from its memory, I need p2m code, and so on.

But yes, if we are going to implement this in the right way, then maybe we need
separate entities like 'app_container' and 'app_thread'. See below.

>> 1. MiniOS has an event loop that serves requests from the hypervisor. A native
>> application does not have such a loop. It has one entry point to which you jump
>> every time you need something from it.
>>
>> 2. A native application runs in EL0 mode, so it does not have access to the MMU;
>> it can't handle vIRQs, exceptions and so on. XEN does all this for it.
>>
>> You can find an example native application at [1]. I used exactly this one to
>> benchmark my implementation. It is mostly inspired by the approach used in a TEE.
>> Actually, I took some code directly from the OP-TEE Trusted Application library.
>> In app_entry.c you can find the entry point, __app_entry(). It takes a function
>> number and some parameters that will be passed to that function. I will probably
>> change the ABI a bit, but the basic idea will stay the same.
>>
>> The function number will be something like APP_INIT, APP_HANDLE_SMC
>> or APP_HANDLE_MMIO... I think you get the idea. I also implemented two syscalls
>> (via the plain old SVC instruction): app_log() writes to the XEN log, and
>> app_return() exits from the application back to the hypervisor. We will need
>> other syscalls like app_call_smc(), app_map_guest_page(), app_map_io(), etc.
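
To add a bit of detail to my own text above: here is a rough sketch of what
the app-side dispatch boils down to. The function codes, handler names and
stubbed syscalls below are illustrative only; they are not the actual code
from [1].

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Illustrative function codes; real values are defined by the app ABI. */
    enum app_fn { APP_INIT, APP_HANDLE_SMC, APP_HANDLE_MMIO };

    /* In the real application these are thin SVC wrappers; stubbed here so
     * the sketch stands on its own. */
    static void app_log(const char *msg) { fputs(msg, stderr); }
    static void app_return(uint64_t ret) { exit((int)ret); }

    /* Hypothetical per-app handlers, not part of the PoC. */
    static uint64_t handle_smc(uint64_t a0, uint64_t a1, uint64_t a2)
    {
        (void)a0; (void)a1; (void)a2;
        return 0;
    }
    static uint64_t handle_mmio(uint64_t a0, uint64_t a1, uint64_t a2)
    {
        (void)a0; (void)a1; (void)a2;
        return 0;
    }

    /* Entry point: the hypervisor sets the PC here and puts the function
     * code and parameters into r0-r7, which the C calling convention then
     * delivers as arguments. */
    void __app_entry(uint64_t fn, uint64_t a0, uint64_t a1, uint64_t a2)
    {
        uint64_t ret = 0;

        switch (fn) {
        case APP_INIT:        app_log("app: init\n");        break;
        case APP_HANDLE_SMC:  ret = handle_smc(a0, a1, a2);  break;
        case APP_HANDLE_MMIO: ret = handle_mmio(a0, a1, a2); break;
        default:              ret = (uint64_t)-1;            break;
        }

        app_return(ret);   /* hands control back to the hypervisor */
    }
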
>>
>> Now, back to XEN. The classic way to handle something with a stub domain is to
>> write a request to a ring buffer and fire an event through an event channel;
>> that will trigger a vIRQ in the stub domain, and the stub domain's vCPU will be
>> scheduled to handle the request. The problem is that you can't control the
>> scheduler, so you don't know when your request will actually be handled, which
>> is not acceptable in some embedded use cases.
>>
>> Here is how I see handling requests with a native application:
>>
>> 0. The hypervisor pauses the requester vCPU.
>> 1. The hypervisor either passes parameters via registers or writes the request
>> to a shared page/ring buffer.
>> 2. Then it sets the PC of the native app vCPU to the entry point and initializes
>> r0-r7 with the function code and other parameters.
>> 3. The hypervisor switches context to the native app vCPU.
>> 4. When the native app finishes handling the request, it calls the special
>> syscall app_exit().
>> 5. The hypervisor analyses the return code, updates the requester vCPU state
>> (if needed), switches back to that vCPU and unpauses it.
>>
>> Most of that was done at [2]. The most interesting part is in
>> arch/arm/domain.c: the functions call_el0_app() and return_from_el0_app() do
>> most of the work. I have also added syscall handlers (in the same way as
>> hypercalls are handled); you can find them in xen/arch/arm/syscall.c.
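
To make the flow above easier to follow, here is a minimal, self-contained
model of steps 0-5. All types and helpers are stand-ins invented for this
sketch; in particular, the signature of call_el0_app() here is illustrative
and does not match the actual PoC code in [2].

    #include <stdint.h>

    struct app_regs {
        uint64_t pc;
        uint64_t x[8];                 /* r0-r7: function code + parameters */
    };

    struct el0_app {
        uint64_t entry;                /* entry point taken from the ELF image */
        struct app_regs regs;
    };

    /* Stand-ins for hypervisor primitives (do nothing in this model). */
    static void guest_vcpu_pause(int vcpu_id)   { (void)vcpu_id; }
    static void guest_vcpu_unpause(int vcpu_id) { (void)vcpu_id; }

    /* Stand-in for "switch to EL0, run the app, come back on app_return()". */
    static uint64_t run_el0_until_app_return(struct el0_app *app)
    {
        (void)app;
        return 0;
    }

    static uint64_t call_el0_app(struct el0_app *app, int req_vcpu,
                                 uint64_t fn, const uint64_t *args, int nargs)
    {
        uint64_t ret;

        guest_vcpu_pause(req_vcpu);              /* step 0 */

        app->regs.pc = app->entry;               /* step 2: jump to entry point */
        app->regs.x[0] = fn;
        for (int i = 0; i < nargs && i < 7; i++)
            app->regs.x[i + 1] = args[i];        /* step 1: parameters in registers */

        ret = run_el0_until_app_return(app);     /* steps 3-4 */

        /* step 5: inspect ret, update the guest vCPU state if needed */
        guest_vcpu_unpause(req_vcpu);
        return ret;
    }
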
>
> This workflow is actually kind of OK. I would not use the term "vcpu"
> for anything related to an el0 app. Maybe we need to introduce a new
> concept, such as "app_context" for example. But I would not want to
> confuse "vcpu", which is the runtime environment exposed to guests, with
> the el0 Xen context.
>
> A vcpu is expected to be running simultaneously with other vcpus of the
> same domain or different domains. The scheduler is expected to choose
> when it is supposed to be running. On the other hand, an el0 app runs to
> handle/emulate a single request from a guest vcpu, which will be paused
> until the el0 app finishes. After that, the guest vcpu will resume.
Okay, but what should be stored in `current` while an el0 application is
running? Remember that it can issue syscalls, which will be handled in the
hypervisor.

We can create separate types for native applications. But then we can end up
having two parallel and mostly identical frameworks: one for domains and
another one for apps. What do you think?
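
Just to illustrate the kind of split I have in mind (all names here are
invented on the spot, not taken from any existing code):

    #include <stdint.h>

    /* The per-app "address space" part: pages mapped into the app, its own
     * stage-2 tables, the entry point from the ELF image, and so on. */
    struct app_container {
        void     *p2m;            /* the app's own translation tables */
        uint64_t  entry;          /* ELF entry point */
    };

    /* The per-invocation "execution" part: the saved EL0 register state that
     * `current` (or its app equivalent) would point to while the app runs
     * and issues syscalls. */
    struct app_thread {
        uint64_t  regs[31];       /* saved EL0 general-purpose registers */
        uint64_t  pc, spsr;
        struct app_container *container;
    };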

>> At the moment the entry point is hardcoded and you need to update it every time
>> you rebuild the native application. Also, there are no actual parameters passed.
>> And the whole code is a mess, because it was the first time I hacked on XEN.
>
> :-)
> I would start by introducing a proper way to pass parameters and return
> values.
>
>> I don't want to repeat the benchmark results, because they have already been
>> posted on the ML. You can find them at [3].
>>
>> I understand that I have missed many things:
>>
>> 1. How to ship and load native apps, because some of them will be needed even
>> before dom0 is created.
>
> I envision something like Linux's insmod, but I suggest postponing this
> problem. At the moment, it would be fine to assume that all apps need to
> be built statically and cannot be loaded at runtime.
Okay. Then we need to hold them in special sections of the hypervisor image,
and we also need some sort of loader in the hypervisor.
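
Roughly what I mean, as a sketch: the build links each statically built app
blob into a dedicated section, and a small in-hypervisor loader walks those
blobs at boot. The section symbols and the elf_load_app() helper below are
invented for illustration; nothing like this exists in the PoC yet.

    #include <stddef.h>

    /* Start/end symbols that a linker script would provide around the
     * section holding the embedded app images. */
    extern const char _app_blobs_start[], _app_blobs_end[];

    /* Hypothetical ELF loader living inside the hypervisor. */
    extern int elf_load_app(const void *image, size_t size);

    static void load_builtin_apps(void)
    {
        size_t size = (size_t)(_app_blobs_end - _app_blobs_start);

        if (size)
            elf_load_app(_app_blobs_start, size);
    }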

>> 2. How to distinguish multiple native apps
>
> Each app needs to specify a range of MMIO/SMC handlers. Xen will invoke
> the right one.
What about device drivers? Consider power management, for example. This is
crucial if we want to use XEN in mobile devices. Our idea (here at EPAM) is
to hold drivers for PM, drivers for coprocessors and so on in native apps.
We will probably need different types of apps: SMC handler, MMIO handler,
PM driver, and so on.
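
Putting your "range of MMIO/SMC handlers" together with the app types above,
I imagine a per-app descriptor along these lines (all names and fields are
invented for illustration):

    #include <stdint.h>

    enum app_type { APP_TYPE_SMC, APP_TYPE_MMIO, APP_TYPE_PM };

    struct app_desc {
        enum app_type type;
        /* For MMIO emulators: the guest-physical range the app handles. */
        uint64_t mmio_base, mmio_size;
        /* For SMC handlers: the SMC function-id range the app owns. */
        uint32_t smc_fn_first, smc_fn_last;
    };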

>> 3. Concurrency in native apps
>
> This is an interesting problem: what do we do if two guest vcpus make
> simultaneous requests that need to be handled by the same app?
> Technically, we could run the same app twice on two different pcpus
> simultaneously. But then, the apps would need to be able to cope with
> concurrency (spin_locks, etc.). From Xen's point of view, it should be OK
> though.
Yes. We can probably pass the id of the pcpu to the app, so it can have
per-cpu storage if it wants to. Plus spin_locks and no blocking syscalls.
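
For data that is only ever touched from one pcpu, per-cpu storage in the app
then becomes a simple array index and needs no locking. A tiny sketch, with
the names and the pcpu count invented:

    #include <stdint.h>

    #define MAX_PCPUS 8

    struct pcpu_scratch {
        uint64_t last_fn;
        uint64_t calls;
    };

    static struct pcpu_scratch scratch[MAX_PCPUS];

    /* pcpu_id would arrive as one of the parameters in r0-r7. */
    static void note_call(unsigned int pcpu_id, uint64_t fn)
    {
        struct pcpu_scratch *s = &scratch[pcpu_id % MAX_PCPUS];

        s->last_fn = fn;
        s->calls++;
    }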

>
>> 4. How to restart misbehaved apps.
>
> A related question is the following: do we expect to allocate each app
> once at boot or once per guest? Apps need to have some per-domain
> context, but it could be passed from Xen to the app on a shared page,
> possibly reducing the need for allocating the same app once per guest?
The SMC handler needs to be cross-domain, for example. Emulators can be tied
to guests, I think. Device drivers should be cross-domain as well.

>
>> But at this moment I want to discuss the basic approach. If there are no
>> objections to the basic concept, then we can work out the details.
>>
>> [1] https://github.com/lorc/xen_app_stub - native app
>> [2] https://github.com/lorc/xen/tree/el0_app - my branch with PoC
>> [3] http://marc.info/?l=xen-devel&m=149088856116797&w=2 - benchmark results



-- 
WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babchuk@gmail.com
