On Tue, 2017-07-04 at 11:12 -0400, Meng Xu wrote:
> On Tue, Jul 4, 2017 at 8:28 AM, Andrii Anisov
> wrote:
> >
> > So you are suggesting to introduce more RT schedulers with
> > different algorithms. Did I get you right?
>
> The EDF scheduling cares about the overall system's RT performance.
> If you want to guarantee the *soft* real-time performance of the IVI
> domains and allow the IVI domain to delay the two RT domains in some
> scheduling periods, the EDF scheduling is better than the RM
> scheduling. Note that we need to reserve enough CPU resources to
> make sure the delay from the IVI domain to the two RT domains won't
> cause the deadline miss of the two RT domains.
>
This is technically correct but, at the same time, I don't think it is
the best way to describe why and how one should use the RTDS
scheduler.

In fact, which scheduling and prioritization strategy is used,
internally in the scheduler, is (for now) not exposed to the user, and
hence it should not have an impact on deciding whether or not to adopt
the scheduler... Unless we've done things in a very wrong way! :-P

What I'd say, as a description of what RTDS can give, to people
interested in using it, would be as follows.

RTDS gives you the chance to provide your VMs with guarantees of CPU
utilization that are precise, and that have a well defined and
strictly enforced granularity. In fact, by using RTDS, it's possible
to specify two things:
- that a VM should at least be able to execute for a certain U% of
  total CPU time;
- that a VM will be able to exploit this 'reservation' with a time
  granularity of P milliseconds.

U, in fact, is expressed as U=B/P: P (called period) is how frequently
a VM is given a chance to run, while B (called budget) is for how long
it will be able to run, in every time interval of length P.

So, if, as an example, a VM has a budget of 10 milliseconds and a
period of 100 milliseconds, this means:
- the VM will be granted 10% CPU execution time;
- if an event for the VM arrives at time t1, the VM itself will be
  able to start processing it no later than t2=t1+2*P-2*B.

That's why, IMO, the period matters (a lot!). If one "just" knows that
a VM will roughly need, say, 40% CPU time, then it does not matter
whether the scheduling parameters are B/P=4/10, or B/P=40/100, or
B/P=40000/100000. OTOH, if one also cares about the latency, one has
to do the math and set the period properly (there's a small worked
example of this further down in this mail).

This capability of specifying the granularity of a reservation is, in
fact, one of the main differences between RTDS (and, in general,
real-time scheduling algorithms) and other general purpose algorithms.
It is possible with general purpose algorithms too (for example, using
weights in Credit1 and Credit2, or using `nice' in Linux's CFS) to
specify a certain utilization for a VM (task). But, in those
algorithms, it's impossible to specify precisely, and on a per-VM
basis, the granularity of such a reservation.

The caveat is that, unfortunately, the guarantee does not extend to
letting you exploit the full capacity. What I mean is that, while on
uniprocessor systems all that I have said above stays true, with the
only constraint of not giving, to the various VMs cumulatively, more
than 100% utilization, on multiprocessors that is not the case.
Therefore, even if you have 4 pCPUs, and you assign the parameters to
the various VMs in such a way that the sum of B/P of all of them is
<= 400%, it's not guaranteed that _all_ of them will actually get
their B in every interval of length P.
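To make the arithmetic above concrete, here's a quick back-of-the-
envelope sketch (plain Python, nothing Xen-specific; the VM names and
numbers are made up just for illustration, and the last check is only
the necessary-but-not-sufficient sanity check discussed below):

    # (budget, period) per VM, in milliseconds; U = B/P
    vms = {
        "rt1": (10, 100),   # 10% utilization, 100ms granularity
        "rt2": (4,  10),    # 40% utilization, 10ms granularity
        "ivi": (40, 100),   # 40% utilization, 100ms granularity
    }

    for name, (B, P) in vms.items():
        U = B / P
        # Worst-case activation latency from the text above: an event
        # arriving at t1 starts being processed by t1 + 2*P - 2*B.
        print(f"{name}: U = {U:.0%}, latency bound = {2*P - 2*B} ms")

    # Same utilization, very different periods: 4/10, 40/100 and
    # 40000/100000 all give U = 40%, but different latency bounds.
    for B, P in [(4, 10), (40, 100), (40000, 100000)]:
        print(f"B/P = {B}/{P}: U = {B/P:.0%}, "
              f"latency bound = {2*P - 2*B} ms")

    # Necessary (but, on SMP, not sufficient) sanity check: the sum of
    # all the utilizations must not exceed #pCPUs * 100%.
    n_pcpus = 4
    if sum(B / P for B, P in vms.values()) > n_pcpus:
        print("Over platform capacity: deadlines *will* be missed")
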
Knowing what the upper bound is, for a given number of pCPUs, is not
easy. A necessary and sufficient limit has (to the best of my
knowledge, which may not be up to date with the current state of the
art of the RT academic literature) yet to be found. There are various
limits, and various ways of computing them, none of which is suitable
to be implemented inside a hypervisor... so Xen won't tell you whether
or not your overall set of parameters is feasible. :-(

(Perhaps we could, at least, keep track of the total utilization and
warn the user when we exceed full capacity. Say, if with 4 pCPUs we go
over 400%, we can well print a warning saying that deadlines will be
missed. Meng?)

These limits also depend on the actual scheduling policy (e.g.,
Earliest Deadline First vs Rate Monotonic), but (again, to the best of
my knowledge) it has not been determined yet whether one is always
better than the other (again, that is for SMPs; on UPs, EDF wins), so
it's again improper to bother with which algorithm to choose.

> Supporting the RM scheduling policy in the RTDS scheduler is not
> difficult. Actually, the RTDS scheduler was designed to be able to
> extend to other scheduling policies, such as RM scheduling. In the
> RT-Xen project[1], it supports both RM and EDF scheduling policy. We
> just choose to upstream the EDF first.
>
Exactly. And I'm ok having RM, but we'll have to be careful about how
we document/advertise it, or we risk confusing people. :-)

In fact, I think that whether or not Andrii will find RTDS useful
depends really really really little, if at all, on the fact that we
implemented EDF or RM!

> I personally am very interested in the realistic use case, especially
> the automotive use cases, for the RTDS scheduler. If you have any use
> case that we can help to test, please don't hesitate to ask.
>
Indeed! :-) If you, or anyone from your team, have questions about
this, don't hesitate to fish me during the summit in Budapest. :-D

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)