* RTDS with extra time issue
@ 2018-02-09 12:20 Andrii Anisov
  2018-02-09 12:25 ` Andrii Anisov
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 12:20 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Dear Dario,

Now I'm experimenting with RTDS, in particular with "extra time" 
functionality.

My experimental setup is built on a Salvator-X board with the H3 SoC 
(running only the big-core cluster, 4x A57).
Domains are up and running, and their VCPUs are as follows:

root@generic-armv8-xt-dom0:/xt/dom.cfg# xl sched-rtds -v all
Cpupool Pool-0: sched=RTDS
Name                                ID VCPU    Period    Budget Extratime
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
Domain-0                             0    0     10000 1000        yes
Domain-0                             0    1     10000 1000        yes
Domain-0                             0    2     10000 1000        yes
Domain-0                             0    3     10000 1000        yes
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
DomR                                 3    0     10000 5000         no
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
DomA                                 5    0     10000 1000        yes
DomA                                 5    1     10000 1000        yes
DomA                                 5    2     10000 1000        yes
DomA                                 5    3     10000 1000        yes
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
DomD                                 6    0     10000 1000        yes
DomD                                 6    1     10000 1000        yes
DomD                                 6    2     10000 1000        yes
DomD                                 6    3     10000 1000        yes
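
For reference, parameters like the above can be set per vCPU with 
`xl sched-rtds`; a sketch, assuming the Xen 4.10 syntax where the 
-e/--extratime flag exists:

    xl sched-rtds -d DomR -v 0 -p 10000 -b 5000 -e 0
    xl sched-rtds -d DomA -v all -p 10000 -b 1000 -e 1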

The idea of this configuration is that only DomR really runs RT tasks, 
and their CPU utilization would be less than half a CPU. The rest are 
application domains whose tasks need no RT guarantees, but which may 
utilize as much CPU as they need and as is available at the moment.
I load application domains with `dd if=/dev/zero of=/dev/null` per VCPU.
In DomR I run one RT task with a period of 10ms and a WCET of 4ms (I'm 
using LITMUS-RT for DomR), and I see that this task sometimes misses its 
deadline, which means that DomR's only VCPU has not received its 5ms in 
every 10ms.
The ps output in DomR is as follows:

root@genericarmv8:~# ps
   PID USER       VSZ STAT COMMAND
     1 root      1764 S    init
     2 root         0 SW   [kthreadd]
     3 root         0 SW   [ksoftirqd/0]
     4 root         0 SW   [kworker/0:0]
     5 root         0 SW<  [kworker/0:0H]
     6 root         0 SW   [kworker/u2:0]
     7 root         0 SW   [rcu_preempt]
     8 root         0 SW   [rcu_sched]
     9 root         0 SW   [rcu_bh]
    10 root         0 SW   [migration/0]
    11 root         0 SW<  [lru-add-drain]
    12 root         0 SW   [watchdog/0]
    13 root         0 SW   [cpuhp/0]
    14 root         0 SW   [kdevtmpfs]
    15 root         0 SW<  [netns]
    16 root         0 SW   [kworker/u2:1]
    17 root         0 SW   [xenwatch]
    18 root         0 SW   [xenbus]
   360 root         0 SW   [khungtaskd]
   361 root         0 SW   [oom_reaper]
   362 root         0 SW<  [writeback]
   364 root         0 SW   [kcompactd0]
   365 root         0 SWN  [ksmd]
   366 root         0 SW<  [crypto]
   367 root         0 SW<  [kintegrityd]
   368 root         0 SW<  [bioset]
   370 root         0 SW<  [kblockd]
   388 root         0 SW<  [ata_sff]
   394 root         0 SW   [kworker/0:1]
   433 root         0 SW<  [watchdogd]
   519 root         0 SW<  [rpciod]
   520 root         0 SW<  [xprtiod]
   548 root         0 SW   [kswapd0]
   549 root         0 SW<  [vmstat]
   633 root         0 SW<  [nfsiod]
   802 root         0 SW   [khvcd]
   844 root         0 SW<  [bioset]
   847 root         0 SW<  [bioset]
   850 root         0 SW<  [bioset]
   853 root         0 SW<  [bioset]
   856 root         0 SW<  [bioset]
   859 root         0 SW<  [bioset]
   861 root         0 SW<  [bioset]
   864 root         0 SW<  [bioset]
   946 root         0 SW<  [vfio-irqfd-clea]
  1405 root      2912 S    {start_getty} /bin/sh /bin/start_getty 115200 hvc0 vt102
  1406 root      2976 S    /sbin/getty 38400 tty1
  1407 root      3256 S    -sh
  1512 root      2908 S    {st-trace-schedu} /bin/bash /usr/sbin/st-trace-schedule -s m1
  1523 root      1932 S    rtspin -a 194000 -w 48 100 120
  1527 root      1748 S    /usr/sbin/ftcat -p /tmp/tmp.3PXDeo/cpu0.pid /dev/litmus/sched_trace0 501 502 503 504 505 506 507 508 509 510 511
  1533 root      3224 R    ps

I noticed this behavior while running the EPAM demo setup, which consists 
of HW drivers and PV driver backends in DomD, DomA running real Android 
with PV drivers and GPU sharing, etc.
But I managed to reproduce the issue with all domains running a generic 
armv8 kernel with a minimal initramfs.
So I suspect an issue in RTDS.

-- 

*Andrii Anisov*



* Re: RTDS with extra time issue
  2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
@ 2018-02-09 12:25 ` Andrii Anisov
  2018-02-09 13:18 ` Dario Faggioli
  2018-02-09 15:34 ` Meng Xu
  2 siblings, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 12:25 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,


It seems I initially used your old email address.

Please take a look here.



-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
  2018-02-09 12:25 ` Andrii Anisov
@ 2018-02-09 13:18 ` Dario Faggioli
  2018-02-09 15:03   ` Andrii Anisov
  2018-02-09 15:34 ` Meng Xu
  2 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-09 13:18 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu, Dario Faggioli



On Fri, 2018-02-09 at 14:20 +0200, Andrii Anisov wrote:
> Dear Dario,
> 
Hi,

> My experimental setup is built on Salvator-X board with H3 SOC
> (running 
> only big cores cluster, 4xA57).
> Domains up and running, and their VCPU are as following:
> 
> root@generic-armv8-xt-dom0:/xt/dom.cfg# xl sched-rtds -v all
> Cpupool Pool-0: sched=RTDS
> Name                                ID VCPU    Period    Budget Extratime
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> Domain-0                             0    0     10000 1000        yes
> Domain-0                             0    1     10000 1000        yes
> Domain-0                             0    2     10000 1000        yes
> Domain-0                             0    3     10000 1000        yes
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomR                                 3    0     10000 5000         no
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomA                                 5    0     10000 1000        yes
> DomA                                 5    1     10000 1000        yes
> DomA                                 5    2     10000 1000        yes
> DomA                                 5    3     10000 1000        yes
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomD                                 6    0     10000 1000        yes
> DomD                                 6    1     10000 1000        yes
> DomD                                 6    2     10000 1000        yes
> DomD                                 6    3     10000 1000        yes
> 
Ok, so you're giving:
- 40% CPU time to Domain-0
- 50% CPU time to DomR
- 40% CPU time to DomA
- 40% CPU time to DomD

total utilization is 170%. As far as I've understood you have 4 CPUs,
right? If yes, there *should* be no problems. (Well, in theory, we'd
need a schedulability test to know for sure whether the system is
"feasible", but I'm going to assume that it sort of is, and leave to
Meng any further real-time scheduling analysis related configurations.
:-) ).

> The idea of such configuration is that only DomR really runs RT
> tasks, 
> and their CPU utilization would be less than half a CPU. Rest of the 
> domains are application domains without need of RT guarantees for
> their 
> tasks, but can utilize as much CPU as they need and is available at
> this 
> moment.
>
So, this should work, as allowing the other domains to use extratime
should *not* let them prevent DomR from getting its 50% share of CPU
time.

I wonder, though, whether this case would not be better served by
cpupools. E.g., you can leave the non-real-time domains in the default
pool (and have Credit or Credit2 there), and then have an RTDS cpupool
in which you put DomR, with its 50% share, and perhaps someone else
(just to avoid wasting the other 50%).

But that's a different story...
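
For concreteness, such a split might look like this (just a sketch of
the xl commands, assuming pCPU 3 gets dedicated to the RT pool):

    xl cpupool-cpu-remove Pool-0 3
    xl cpupool-create name=\"rt-pool\" sched=\"rtds\"
    xl cpupool-cpu-add rt-pool 3
    xl cpupool-migrate DomR rt-pool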

> I load application domains with `dd if=/dev/zero of=/dev/null` per
> VCPU.
> In DomR I run one RT task with period 10ms and wcet 4ms (I'm using 
> LITMUS-RT for DomR), and see that this task sometime misses its 
> deadline. Which means that the only VCPU of DomR haven't got its 5ms 
> each 10ms.
>
Well, that's a possibility, and (if the system is indeed schedulable,
which again, I'm assuming just out of laziness :-/ ) it would be a bug.
However, as a first thing, I'd make sure that this is actually what is
happening.

Basically, can you also fully load DomR (like with dd as above, or just
yes or a while(1) loop), and then check whether it is getting its 50%?
For a first approximation, you can check with xentop. If you want to be
even more sure/you want to know precisely, you can use tracing.
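
Something like this, I mean (a sketch: one busy loop per vCPU inside
DomR, then batch-mode xentop in Dom0, one sample per second):

    # inside DomR
    yes > /dev/null &
    # in Dom0
    xentop -b -d 1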

If DomR is not able to get its share, then we have an issue/bug in the
scheduler. If it does, then the scheduler is doing its job, and the
issue may be somewhere else (e.g., something inside the guest may eat
some of the budget, in such a way that not all of it is available when
you actually need it).

Let me know.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-09 13:18 ` Dario Faggioli
@ 2018-02-09 15:03   ` Andrii Anisov
  2018-02-09 15:18     ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 15:03 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,

On 09.02.18 15:18, Dario Faggioli wrote:
> Ok, so you're giving:
> - 40% CPU time to Domain-0
> - 50% CPU time to DomR
> - 40% CPU time to DomA
> - 40% CPU time to DomD
> total utilization is 170%. As far as I've understood you have 4 CPUs,
> right? If yes, there *should* be no problems. (Well, in theory, we'd
> need a schedulability test to know for sure whether the system is
> "feasible", but I'm going to assume that it sort of is, and leave to
> Meng any further real-time scheduling analysis related configurations.
> :-) ).
Being a bit more specific, I give:

  - 4*10% CPU time to Domain-0
  - 1*50% CPU time to DomR
  - 4*10% CPU time to DomA
  - 4*10% CPU time to DomD

Which seems to be schedulable on a 4*100% CPU. I guess Meng could shed 
more light on this topic from a theoretical point of view.

> So, this should work, as allowing the other domains to use extratime
> should *not* let them prevent DomR from getting its 50% share of CPU
> time.
That is my point.

> I wonder, though, whether this case would not be better served by
> cpupools. E.g., you can leave the non-real-time domains in the default
> pool (and have Credit or Credit2 there), and then have an RTDS cpupool
> in which you put DomR, with its 50% share, and perhaps someone else
> (just to avoid wasting the other 50%).
The problem here is that a domain cannot have its vCPUs in different 
cpupools, so we would waste that fraction of a CPU.
IMHO, in case of pCPU partitioning with cpupools, we lose the practical 
application of the RTDS scheduler: a pool with the null scheduler would 
do the job for the RT domain.
IMHO one benefits from the RTDS scheduler only when one has RT vCPU(s) 
utilizing a fraction of a pCPU and wants to save the remaining resources.

> Basically, can you also fully load DomR (like with dd as above, or
> just yes or a while(1) loop), and then check whether it is getting its
> 50%? For a first approximation, you can check with xentop.

For sure I did this run: xentop clearly shows 50% when DomR is loaded 
with dd, and an equal distribution of CPU resources among the other 
domains, i.e.:

        NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
        DomA -----r      20325  117.0    2178960   53.6    2180096      53.7     4    0        0        0    0        0        0        0          0          0   11
        DomD -----r      20282  116.4    1048464   25.8    1049600      25.8     4    0        0        0    0        0        0        0          0          0   11
    Domain-0 -----r      21123  117.2     262144    6.5   no limit       n/a     4    0        0        0    0        0        0        0          0          0    2
        DomR -----r        284   50.0     196496    4.8     197632       4.9     1    1        0        0    0        0        0        0          0          0   11

When I run my actual test, xentop shows something like the following:

        NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
    Domain-0 -----r     223493  120.2    2178960   53.6    2180096      53.7     4    0        0        0    0        0        0        0          0          0    2
        DomD -----r     215036  118.7    1048464   25.8    1049600      25.8     4    0        0        0    0        0        0        0          0          0   11
        DomA ------     215396  115.4    2178960   53.6    2180096      53.7     4    0        0        0    0        0        0        0          0          0   11
        DomR -----r       6145   38.7     196496    4.8     197632       4.9     1    1        0        0    0        0        0        0          0          0   11

Which is OK for the LITMUS-RT test load, which by default runs the task 
for 0.95 of the given WCET (0.95 * 4ms / 10ms = 38%, roughly matching 
the ~38.7% above).

I get several deadline misses in a 4-minute run of a task with a 10ms 
period. xentop would not show such deviations; it is too coarse-grained.

> If you want to be even more sure/you want to know precisely, you can use tracing.
Yep, maybe it's time for me to get familiar with tracing in Xen.

> If DomR is not able to get its share, then we have an issue/bug in the
> scheduler. If it does, then the scheduler is doing its job, and the
> issue may be somewhere else (e.g., something inside the guest may eat
> some of the budget, in such a way that not all of it is available when
> you actually need it).
The DomR guest is really lean; I have already shown its process list. I 
really doubt that init together with getty on the HVC console can eat 
another 10% of a CPU at any moment.


-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 15:03   ` Andrii Anisov
@ 2018-02-09 15:18     ` Dario Faggioli
  2018-02-09 15:36       ` Meng Xu
  2018-02-12 10:20       ` Andrii Anisov
  0 siblings, 2 replies; 25+ messages in thread
From: Dario Faggioli @ 2018-02-09 15:18 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu



On Fri, 2018-02-09 at 17:03 +0200, Andrii Anisov wrote:
> > If DomR is not able to get its share, then we have an issue/bug in
> > the
> > scheduler. If it does, then the scheduler is doing its job, and the
> > issue may be somewhere else (e.g., something inside the guest may
> > eat
> > some of the budget, in such a way that not all of it is available
> > when
> > you actually need it).
> 
> The DomR guest is really lean; I have already shown its process list.
> I really doubt that init together with getty on the HVC console can
> eat another 10% of a CPU at any moment.
> 
So, I'm a little bit in a hurry now, and I'll reply better later (or on
Monday). But for now, just to understand things better, can you enable
extratime for DomR as well, and report what you see in xentop, and
whether or not you still see deadline misses?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
  2018-02-09 12:25 ` Andrii Anisov
  2018-02-09 13:18 ` Dario Faggioli
@ 2018-02-09 15:34 ` Meng Xu
  2018-02-09 15:53   ` Andrii Anisov
  2018-02-09 16:04   ` Andrii Anisov
  2 siblings, 2 replies; 25+ messages in thread
From: Meng Xu @ 2018-02-09 15:34 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli



On Fri, Feb 9, 2018 at 7:20 AM, Andrii Anisov <andrii_anisov@epam.com>
wrote:

> Dear Dario,
>
> Now I'm experimenting with RTDS, in particular with "extra time"
> functionality.
>
> My experimental setup is built on Salvator-X board with H3 SOC (running
> only big cores cluster, 4xA57).
> Domains up and running, and their VCPU are as following:
>
> root@generic-armv8-xt-dom0:/xt/dom.cfg# xl sched-rtds -v all
> Cpupool Pool-0: sched=RTDS
> Name                                ID VCPU    Period    Budget Extratime
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> Domain-0                             0    0     10000 1000        yes
> Domain-0                             0    1     10000 1000        yes
> Domain-0                             0    2     10000 1000        yes
> Domain-0                             0    3     10000 1000        yes
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomR                                 3    0     10000 5000         no
>

To make sure a task on a VCPU misses no deadline, we must guarantee that:
1) The VCPU gets its configured time, which your following emails show it
does;
2) When the VCPU gets its configured time, the task on the VCPU can be
scheduled. <-- This can be achieved by configuring the VCPU's parameters
correctly.

The deadline miss problem in the test case you presented here is likely
caused by case (2).
Even if DomR gets 5ms in every 10ms, the task (period = 10ms, budget =
4ms) on the VCPU will still miss its deadline.
In theory, DomR with a 10ms period should be configured to have a budget
of 7ms. But here the budget is configured as 5ms, which is less than
required.

The fundamental reason is that the release times of the RT task and of the
task's VCPU are not synchronized.
This is why we cannot assign a task's own (or similar) parameters to its
VCPU: with the VCPU's budget at 5ms, the starvation interval is
2 * (period - budget) = 10ms, still enough to make the VCPU's task miss
its deadline.
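
To make the arithmetic explicit:

    blackout = 2 * (period - budget)
    budget 5ms: 2 * (10 - 5) = 10ms -> a (10ms, 4ms) task can miss
    budget 7ms: 2 * (10 - 7) =  6ms -> 10 - 6 = 4ms always remains, just enough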

[Forgive me for attaching a slide to explain this.]
[image: Inline image 1]

If you want to keep the same VCPU parameters, can you try to set the
task's period = 100ms and execution time = 40ms?
In theory (I used CARTS to compute this), a VCPU (10ms, 5ms) can schedule
a task (100ms, 40ms).
Note that the resource demand of two RT tasks with the same utilization
differs: the task with the smaller period has the larger demand.
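
(A hypothetical invocation, assuming rtspin's positional arguments are
WCET, period and duration, in ms, ms and seconds respectively:
`rtspin -w 40 100 60`.)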

Best,

Meng


-----------
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: RTDS with extra time issue
  2018-02-09 15:18     ` Dario Faggioli
@ 2018-02-09 15:36       ` Meng Xu
  2018-02-09 15:56         ` Andrii Anisov
  2018-02-12 10:20       ` Andrii Anisov
  1 sibling, 1 reply; 25+ messages in thread
From: Meng Xu @ 2018-02-09 15:36 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Andrii Anisov

On Fri, Feb 9, 2018 at 10:18 AM, Dario Faggioli <dfaggioli@suse.com> wrote:
>
> So, I'm a little bit in a hurry now, and I'll reply better later (or on
> Monday). But for now, just to understand things better, can you enable
> extratime for DomR as well, and report what you see in xentop, and
> whether or not you still see deadline misses?
>

Another way to check whether there is interference from services in domR
is to set period = budget for domR's VCPUs.
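
E.g., a sketch, with the same xl syntax as earlier in the thread:

    xl sched-rtds -d DomR -v 0 -p 10000 -b 10000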

Best Regards,

Meng

-----------
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: RTDS with extra time issue
  2018-02-09 15:34 ` Meng Xu
@ 2018-02-09 15:53   ` Andrii Anisov
  2018-02-09 16:04   ` Andrii Anisov
  1 sibling, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 15:53 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli

Hello Meng Xu,


Thank you for your explanation.

On 09.02.18 17:34, Meng Xu wrote:
> To make sure a task on a VCPU misses no deadline, we must guarantee
> that:
> 1) The VCPU gets its configured time, which your following emails show
> it does;
> 2) When the VCPU gets its configured time, the task on the VCPU can
> be scheduled. <-- This can be achieved by configuring the VCPU's
> parameters correctly.
>
> The deadline miss problem in the test case you presented here is
> likely caused by case (2).
> Even if DomR gets 5ms in every 10ms, the task (period = 10ms,
> budget = 4ms) on the VCPU will still miss its deadline.
> In theory, DomR with a 10ms period should be configured to have a
> budget of 7ms. But here the budget is configured as 5ms, which is
> less than required.
>
> The fundamental reason is that the release times of the RT task and of
> the task's VCPU are not synchronized.
> This is why we cannot assign a task's own (or similar) parameters to
> its VCPU: with the VCPU's budget at 5ms, the starvation interval is
> 2 * (period - budget) = 10ms, still enough to make the VCPU's task
> miss its deadline.
>
> [Forgive me for attaching a slide to explain this.]
I think I've got the point.

> If you want to keep the same VCPU parameters, can you try to set the
> task's period = 100ms and execution time = 40ms?
> In theory (I used CARTS to compute this), a VCPU (10ms, 5ms) can
> schedule a task (100ms, 40ms).
> Note that the resource demand of two RT tasks with the same
> utilization differs: the task with the smaller period has the larger
> demand.
I'll do more experiments.

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 15:36       ` Meng Xu
@ 2018-02-09 15:56         ` Andrii Anisov
  2018-02-09 17:51           ` Meng Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 15:56 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli

Hello Meng Xu,


On 09.02.18 17:36, Meng Xu wrote:
> Another way to check whether there is interference from services in
> domR is to set period = budget for domR's VCPUs.
Could you please explain how setting the budget equal to the period would 
help discover any interference from services in the domain?

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 15:34 ` Meng Xu
  2018-02-09 15:53   ` Andrii Anisov
@ 2018-02-09 16:04   ` Andrii Anisov
  2018-02-09 17:53     ` Meng Xu
  1 sibling, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 16:04 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli


On 09.02.18 17:34, Meng Xu wrote:
> If you want to keep the same VCPU parameter, can you try to set task's 
> period = 100ms and exe time = 40ms?
> By theory (I used CARTS to compute), a VCPU (10ms, 5ms) can schedule a 
> task (100ms, 40ms).
> Note that the resource demand of two RT tasks with the same 
> utilization is different: the task with smaller period has larger demand.
BTW, could you please share the model xml file to me?

-- 

*Andrii Anisov*



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: RTDS with extra time issue
  2018-02-09 15:56         ` Andrii Anisov
@ 2018-02-09 17:51           ` Meng Xu
  2018-02-10  0:14             ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Meng Xu @ 2018-02-09 17:51 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli

On Fri, Feb 9, 2018 at 10:56 AM, Andrii Anisov <andrii_anisov@epam.com> wrote:
> Hello Meng Xu,
>
>
> On 09.02.18 17:36, Meng Xu wrote:
>>
>> Another way to check whether there is interference from services in
>> domR is to set period = budget for domR's VCPUs.
>
> Could you please explain how setting the budget equal to the period
> would help discover any interference from services in the domain?
>
> --

Basically, setting period = budget is similar to what Dario suggests.
The only difference is that it can avoid some tiny scheduler overhead
in replenishing the VCPU's budget with the unused CPU time.

Best,

Meng



-----------
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: RTDS with extra time issue
  2018-02-09 16:04   ` Andrii Anisov
@ 2018-02-09 17:53     ` Meng Xu
  2018-02-09 18:07       ` Andrii Anisov
  0 siblings, 1 reply; 25+ messages in thread
From: Meng Xu @ 2018-02-09 17:53 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli


On Fri, Feb 9, 2018 at 11:04 AM, Andrii Anisov <andrii_anisov@epam.com> wrote:
>
> BTW, could you please share the model xml file to me?
>

Sure!
It's attached.

Meng

[-- Attachment #2: example-in.xml --]
[-- Type: text/xml, Size: 178 bytes --]

<system os_scheduler="EDF" period="2"> 
	<component name="C0" scheduler="EDF" period="10"> 
		 <task name="T0" p="100" d="100" e="40" > </task>
		 
	</component>
</system>

[-- Attachment #3: example-out.xml --]
[-- Type: text/xml, Size: 508 bytes --]

<component name="OS Scheduler" algorithm="PRM interface">
	<resource>
		<model period="2" bandwidth="1" deadline="2"> </model>
	</resource>
	<processed_task>
		<model period="2" execution_time="2" deadline="2"> </model>
	</processed_task>
	<component name="C0" algorithm="PRM interface">
		<resource>
			<model period="10" bandwidth="0.5" deadline="10"> </model>
		</resource>
		<processed_task>
			<model period="10" execution_time="5" deadline="10"> </model>
		</processed_task>
	</component>
</component>


* Re: RTDS with extra time issue
  2018-02-09 17:53     ` Meng Xu
@ 2018-02-09 18:07       ` Andrii Anisov
  0 siblings, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 18:07 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli

Thank you, I'll take a look.


On 09.02.18 19:53, Meng Xu wrote:
> Sure!
> It's attached.
>
> Meng

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 17:51           ` Meng Xu
@ 2018-02-10  0:14             ` Dario Faggioli
  2018-02-10  4:53               ` Meng Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-10  0:14 UTC (permalink / raw)
  To: Meng Xu, Andrii Anisov; +Cc: xen-devel



On Fri, 2018-02-09 at 12:51 -0500, Meng Xu wrote:
> > On 09.02.18 17:36, Meng Xu wrote:
> > > Another way to check whether there is interference from services
> > > in domR is to set period = budget for domR's VCPUs.
> > 
> > Could you please explain how setting the budget equal to the period
> > would help discover any interference from services in the domain?
> > 
> Basically, setting period = budget is similar to what Dario suggests.
>
The goal is to figure out where the problem is.

It looks like DomR's vCPU does get 50% of the CPU time, so it's not that
the other vCPUs are preventing it from exploiting its own reservation. If
that were not the case, there'd be a bug in the scheduler.

By giving the vCPU 100% (either via "budget == period" or with
extratime), we will figure out whether the real-time applications inside
can actually meet their deadlines. If they can't even with such a setup,
it would mean the problem is somewhere else (virtualization overhead, IRQ
latency, etc.).

If the applications can meet their deadline with 100% CPU time, but not
with 50%, then there are two possibilities:
1) they need more than 50%;
2) you're having "period synchronization" issues, as Meng was
describing.

Figuring out whether you're in case 1 should be as easy as trying to give
DomR 55%, then 60%, then 65%, etc., and seeing at which point the
deadline misses are gone for good.
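
A sketch of such a sweep, reusing the xl syntax from earlier in the
thread:

    for b in 5500 6000 6500 7000; do
        xl sched-rtds -d DomR -v 0 -p 10000 -b $b
        # re-run the LITMUS-RT experiment here and count the misses
    done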

If you can't get rid of the deadline misses, it probably means you are in
case 2, and you need to find a way to make sure that your real-time
applications inside the domain can actually exploit the domain's vCPU's
budget when it is there. I.e., you don't want them to activate when the
budget is about to run out, and hence suffer from the "blackout" shown in
Meng's diagram.

Unfortunately, that's not really trivial to do. If the workload is
really periodic, it may be enough to find a way to do some kind of
"calibration" at the beginning. But I'm not sure how robust this will
actually be.

Perhaps Meng has some more ideas on this as well. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-10  0:14             ` Dario Faggioli
@ 2018-02-10  4:53               ` Meng Xu
  2018-02-12 10:17                 ` Dario Faggioli
  2018-02-12 10:38                 ` Andrii Anisov
  0 siblings, 2 replies; 25+ messages in thread
From: Meng Xu @ 2018-02-10  4:53 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Andrii Anisov

> Unfortunately, that's not really trivial to do. If the workload is
> really periodic, it may be enough to find a way to do some kind of
> "calibration" at the beginning. But I'm not sure how robust this will
> actually be.
>
> Perhaps Meng has some more ideas on this as well. :-)

If the RT VCPU has only one RT task on it, we can synchronize the
release time of the VCPU and that of the RT task. In other words, the
release offsets of both the VCPU and the RT task are the same in terms
of wall clock. Then we can assign the task's parameters to the VCPU
and guarantee that the task misses no deadline if the VCPU misses no
deadline.
However, this only works under the assumption that one VCPU has only
one task in the RT domain. I'm not sure how practical that is, because
the observation cannot be generalized to multiple tasks on one VCPU.
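
(For what it's worth, on the LITMUS-RT side tasks started with
`rtspin -w` wait for a synchronous release triggered by the `release_ts`
tool; the hard part would be lining that release up with the VCPU's
replenishment instant, which, as far as I know, Xen does not expose to
the guest.)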

Andrii and Dario,
do you think the assumption that one VCPU runs only one RT task is
reasonable in practice?
If it is, are there use cases for this assumption?

Thanks,

Meng


* Re: RTDS with extra time issue
  2018-02-10  4:53               ` Meng Xu
@ 2018-02-12 10:17                 ` Dario Faggioli
  2018-02-12 11:08                   ` Andrii Anisov
  2018-02-12 10:38                 ` Andrii Anisov
  1 sibling, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-12 10:17 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Andrii Anisov



On Fri, 2018-02-09 at 23:53 -0500, Meng Xu wrote:
> > 
> > Perhaps Meng has some more ideas on this as well. :-)
> 
> 
> Andrii and Dario,
> Do you think the assumption that one VCPU runs only one RT task is
> reasonable in practice?
> If it is, are there use cases for this assumption?
> 
Well, I'll let Andrii reply, but honestly, I don't think it is.

See, for instance, the fact that DomR has only 1 vCPU: I find it
unlikely that the only thing running there is *just* *one* real-time
task. :-/

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-09 15:18     ` Dario Faggioli
  2018-02-09 15:36       ` Meng Xu
@ 2018-02-12 10:20       ` Andrii Anisov
  2018-02-12 18:44         ` Andrii Anisov
  1 sibling, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 10:20 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,


On 09.02.18 17:18, Dario Faggioli wrote:
> So, I'm a little bit in a hurry now, and I'll reply better later (or on
> Monday). But for now, just to understand things better, can you enable
> extratime for DomR as well, and report what you see in xentop, and
> whether or not you still see deadline misses?

Actually, as per Meng's explanation and calculations, the problem was on 
my side: wrong DomR task/VCPU parameters.
I re-ran the system with dummy loads and the values received from CARTS, 
and all seems to be OK (no deadline misses occurred).

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-10  4:53               ` Meng Xu
  2018-02-12 10:17                 ` Dario Faggioli
@ 2018-02-12 10:38                 ` Andrii Anisov
  1 sibling, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 10:38 UTC (permalink / raw)
  To: Meng Xu, Dario Faggioli; +Cc: xen-devel

Hello Meng,


On 10.02.18 06:53, Meng Xu wrote:
> If the RT VCPU has only one RT task on it, we can synchronize the
> release time of the VCPU and that of the RT task. In other words, the
> release offsets of both the VCPU and the RT task are the same in terms
> of wall clock. Then we can assign the task's parameters to the VCPU
> and guarantee that the task misses no deadline if the VCPU misses no
> deadline.
IMO, such a configuration could be useful for VCPU scheduling overhead 
estimation, though it seems unrealistic, because we would need some 
instruments in that domain to measure whether the task meets its deadline.

> Andrii and Dario,
> Do you think the assumption that one VCPU runs only one RT task is
> reasonable in practice?
> If it is, are there use cases for this assumption?
No, I doubt real-life use cases would fit this scheme.

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-12 10:17                 ` Dario Faggioli
@ 2018-02-12 11:08                   ` Andrii Anisov
  2018-02-12 14:52                     ` Meng Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 11:08 UTC (permalink / raw)
  To: Dario Faggioli, Meng Xu; +Cc: xen-devel

Dario, Meng,


On 12.02.18 12:17, Dario Faggioli wrote:
> Well, I'll let Andrii reply, but honestly, I don't think it is.
>
> See, for instance, the fact that DomR has only 1 vCPU: I find it
> unlikely that the only thing running there is *just* *one* real-time
> task. :-/
While I'm focused mainly on the topic discussed here [1], an RT domain 
will have some RTOS with its set of tasks (RT as well as non-RT), and it 
would also communicate with other domains. Likely there would be one RT 
domain per system.
So I'm trying to estimate whether RTDS has a practical use, or whether a 
dedicated cpupool with the null scheduler will do the job.

[1] https://lists.linuxfoundation.org/pipermail/automotive-discussions/2018-January/005590.html

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-12 11:08                   ` Andrii Anisov
@ 2018-02-12 14:52                     ` Meng Xu
  0 siblings, 0 replies; 25+ messages in thread
From: Meng Xu @ 2018-02-12 14:52 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli

On Mon, Feb 12, 2018 at 6:08 AM, Andrii Anisov <andrii_anisov@epam.com> wrote:
>
> Dario, Meng,
>
>
> While I'm focused mainly on the topic discussed here [1], an RT domain
> will have some RTOS with its set of tasks (RT as well as non-RT), and
> it would also communicate with other domains. Likely there would be one
> RT domain per system.
> So I'm trying to estimate whether RTDS has a practical use, or whether
> a dedicated cpupool with the null scheduler will do the job.
>
> [1] https://lists.linuxfoundation.org/pipermail/automotive-discussions/2018-January/005590.html

I see. This is interesting. I'm also interested in the practical use
cases of both RTDS and the null scheduler.

Best,

Meng


* Re: RTDS with extra time issue
  2018-02-12 10:20       ` Andrii Anisov
@ 2018-02-12 18:44         ` Andrii Anisov
  2018-02-16 18:37           ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 18:44 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Dario,


On 12.02.18 12:20, Andrii Anisov wrote:
> Actually, as per Meng's explanation and calculations, the problem was
> on my side: wrong DomR task/VCPU parameters.
> I re-ran the system with dummy loads and the values received from
> CARTS, and all seems to be OK (no deadline misses occurred).
Well, what I meant by dummy loads was: all domains are generic armv8 
kernels with minimal filesystems running `dd if=/dev/zero of=/dev/null`, 
except DomR. In that case no DL misses occurred with the parameters given 
by CARTS.

Now I have a real driver domain and Android with GPU sharing. The loads 
are things like youtube playback in DomA and dd from MMC through ssh in 
DomD. And I see unexpected DL misses for the same RT configuration.

Well, this provides some ground for another concern of mine about the Xen 
scheduling approach. My doubt is that scheduling is done within a softirq, 
so all the pCPU time spent on the exception itself and on possible timer 
actions is accounted to the vCPU whose context was interrupted. This 
seems not really fair and might be disruptive for RT scheduling.

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-12 18:44         ` Andrii Anisov
@ 2018-02-16 18:37           ` Dario Faggioli
  2018-02-20 11:34             ` Andrii Anisov
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-16 18:37 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu



On Mon, 2018-02-12 at 20:44 +0200, Andrii Anisov wrote:
> Dario,
> 
Hi,

> On 12.02.18 12:20, Andrii Anisov wrote:
> > Actually, as per Meng's explanation and calculations, the problem
> > was on my side: wrong DomR task/VCPU parameters.
> > I re-ran the system with dummy loads and the values received from
> > CARTS, and all seems to be OK (no deadline misses occurred).
> 
> Well, what I meant by dummy loads was: all domains are generic armv8
> kernels with minimal filesystems running `dd if=/dev/zero of=/dev/null`,
> except DomR. In that case no DL misses occurred with the parameters
> given by CARTS.
> 
> Now I have a real driver domain and Android with GPU sharing. The
> loads are things like youtube playback in DomA and dd from MMC through
> ssh in DomD. And I see unexpected DL misses for the same RT
> configuration.
> 
And what is it that is running in DomR, the same thing as before, when
the load was synthetic? And in any case, is the workload running in DomR
in its turn a synthetic real-time load, or a real real-time application?

If the latter, are you sure the misses are not due to the fact that, for
instance, the RT app does not always behave as it did when it was
measured, i.e. when its parameters were computed?

> Well, this provides some ground for another concern of mine about the
> Xen scheduling approach. My doubt is that scheduling is done within a
> softirq, so all the pCPU time spent on the exception itself and on
> possible timer actions is accounted to the vCPU whose context was
> interrupted.
>
I am not sure I fully understand this.

If you're worried that some kind of overhead may be consuming some of
your real-time reservation, try to increase the reservation itself a bit,
and see if the misses disappear.

Scheduling always happens as a consequence of some task/vcpu blocking,
some task/vcpu waking up, or of timer events, in all the OSes I have ever
seen, so I don't think Xen is really special in this regard.

One difference could be that Linux can be configured to be fully
preemptible --even the kernel-- while Xen is not. But I don't think
this is what you're hinting at, is it?

> This seems not really fair and might be disruptive for RT scheduling.
> 
Well, if you're saying that accounting can be improved, I do agree. It
always can (again, in all the OSes! :-D)

Note that it is not always evident how to do that, and I'm not talking
about the actual implementation. I think it would not be too hard to
track the time we spend inside the hypervisor. But then, what do we
do? 

Because if DomX was running, and we entered Xen because an interrupt
arrived to deal with a timer or whatever from DomY, then I agree it's not
fair to charge DomX for that. But, OTOH, if we are in Xen because DomX
itself made a hypercall, then it is indeed OK to charge DomX.

And note that this does not have much to do with how schedule() gets
called. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-16 18:37           ` Dario Faggioli
@ 2018-02-20 11:34             ` Andrii Anisov
  2018-02-22 17:53               ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-20 11:34 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,


On 16.02.18 20:37, Dario Faggioli wrote:
> And what is it that is running in DomR, the same thing as before, when
> the load was synthetic?
For sure I compare apples to apples.

> And in any case, is it, in its turn (I mean the
> workload running in DomR) a synthetic real-time load, or is it a real
> real-time application?
The real-time domain load is synthetic, though. I'm using a LITMUS-RT 
system for DomR, in particular with the amount-of-work-based 
configuration [1] introduced recently.

> If the latter, are you sure the misses are not due to the fact that,
> for instance, the rt app does not always behave as measured/expected,
> when computing the parameters?
Even for the synthetic RT workload, some deviations in execution time are 
noticed (~0.5%). But with no IRQ-intensive load in the application 
domains, no DL misses are noticed in the RT domain.

>> Well this provides some ground for another my concern about XEN
>> scheduling approach. My doubt is that scheduling is done within
>> softirq,
>> so all time spent with pcpu for exception itself and possible timer
>> actions is accounted for the vcpu which context was interrupted.
> I am not sure I fully understand this.
My idea is to not charge time spent in the hypervisor to the current 
vCPU's budget, except for serving that vCPU's hypercalls and handling 
interrupts targeting that vCPU. The same as you expressed:

> Because if DomX was running, and we entered Xen because an interrupt
> arrived to deal with a timer or whatever from DomY, then I agree it's
> not fair to charge DomX for that. But, OTOH, if we are in Xen because
> DomX itself made a hypercall, then it is indeed ok to charge DomX.
For RT scheduling this would make a big difference.

> If you're worried that some kind of overhead may be consuming some of
> your real-time reservation, try increasing the reservation itself a
> bit, and see if the misses disappear.
It's not about overhead, but about unfair time accounting. And this
unfairness is pretty arbitrary: it depends on the other domains' activity.

> One difference could be that Linux can be configured to be fully
> preemptible --even the kernel-- while Xen is not. But I don't think
> this is what you're hinting at, is it?
No, it is not.
If we are speaking about Linux, it is much closer to
CONFIG_IRQ_TIME_ACCOUNTING [2].

> And note that this does not have much to do with how schedule() gets
> called. :-)
In the current implementation it does matter *when* `schedule()` is
called, because time accounting is done by passing the `now` time value
to `sched->do_schedule()` right in `schedule()`.
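
To illustrate what I mean, this is roughly the relevant path, as I read
xen/common/schedule.c and xen/common/sched_rt.c (a simplified sketch,
with everything inessential dropped):

/* In schedule(), `now` is sampled only once, when the scheduling
 * softirq finally gets to run: */
s_time_t now = NOW();
next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);

/* For RTDS, do_schedule() ends up in burn_budget(), which does roughly: */
delta = now - svc->last_start;   /* the whole interval since last_start */
svc->cur_budget -= delta;        /* charged to the interrupted vcpu,
                                  * hypervisor time included */

So whatever Xen itself was doing between the interrupt and the moment
the softirq actually ran is burnt from the budget of the vcpu that
happened to be current.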

[1] https://github.com/LITMUS-RT/liblitmus/pull/3
[2] https://lkml.org/lkml/2011/2/10/135

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-20 11:34             ` Andrii Anisov
@ 2018-02-22 17:53               ` Dario Faggioli
  2018-02-26 12:00                 ` Andrii Anisov
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-22 17:53 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu



On Tue, 2018-02-20 at 13:34 +0200, Andrii Anisov wrote:
> Hello Dario,
> 
Hi,

> On 16.02.18 20:37, Dario Faggioli wrote:
> > And in any case, is it, in its turn (I mean the workload running in
> > DomR) a synthetic real-time load, or is it a real real-time
> > application?
> 
> The real-time workload is synthetic, though. I'm using the LITMUS-RT
> system in DomR, in particular with the amount-of-work-based
> configuration [1] introduced recently.
> 
Ah, nice! :-)

> Even for the synthetic RT workload, some deviations in execution time
> are noticed (~0.5%). But with no IRQ-intensive load in the application
> domains, no deadline misses are noticed in the RT domain.
> 
Ok, I see.

> > > Well, this provides some ground for another concern of mine about
> > > the Xen scheduling approach. My doubt is that scheduling is done
> > > within a softirq, so all the time the pcpu spends on the exception
> > > itself and on possible timer actions is accounted to the vcpu whose
> > > context was interrupted.
> > 
> > I am not sure I fully understand this.
> 
> My idea is to charge the time spent in the hypervisor to the current
> vcpu's budget, except for serving that vcpu's hypercalls and handling
> interrupts targeted at that vcpu. The same as you expressed:
> 
As I said already, improving the accounting would be more than welcome.
If you're planning on doing something like this already, I'll be happy
to look at the patches. :-)

> > If you're worried that some kind of overhead may be consuming some
> > of your real-time reservation, try increasing the reservation itself
> > a bit, and see if the misses disappear.
> 
> It's not about overhead, but about unfair time accounting. And this
> unfairness is pretty arbitrary: it depends on the other domains'
> activity.
> 
Sure, I agree it's pretty bad. It's indeed particularly bad for RTDS,
where it messes with the guarantees, but it's rather bad for the other
schedulers as well, as it messes with fairness.

> > One difference could be that Linux can be configured to be fully
> > preemptible --even the kernel-- while Xen is not. But I don't think
> > this is what you're hinting at, is it?
> 
> No, it is not.
>
Ok, I was just double checking.

> If we are speaking about Linux, it is much closer to
> CONFIG_IRQ_TIME_ACCOUNTING [2].
> 
Yes, I'm familiar with that. That exact same model can't be applied to
Xen, but at least tracking time spent in IRQ handling, and discounting
that from vCPU execution time, should not be too hard.
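
Something along these lines, I mean (pure hand-waving: the per-pCPU
counters and irq_time_since() below are invented for the example,
nothing like them exists in the tree today):

/* Hypothetical: accumulate IRQ-handling time on each pCPU... */
void irq_time_enter(void) { this_cpu(irq_t0) = NOW(); }
void irq_time_exit(void)  { this_cpu(irq_time) += NOW() - this_cpu(irq_t0); }

/* ...and let the scheduler discount it when burning budget: */
delta = (now - svc->last_start) - irq_time_since(svc->last_start);

Getting the bookkeeping right (nesting, and races with the scheduler
reading the counters) is where the actual work would be, of course.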

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-22 17:53               ` Dario Faggioli
@ 2018-02-26 12:00                 ` Andrii Anisov
  0 siblings, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-26 12:00 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,

On 22.02.18 19:53, Dario Faggioli wrote:
> As I said already, improving the accounting would be more than welcome.
> If you're planning on doing something like this already, I'll be happy
> to look at the patches. :-)
First I have to document my findings and draw some conclusions about
the applicability of Xen for building systems with real-time
requirements. Then I will hopefully get on to that.

-- 

*Andrii Anisov*




end of thread

Thread overview: 25+ messages
2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
2018-02-09 12:25 ` Andrii Anisov
2018-02-09 13:18 ` Dario Faggioli
2018-02-09 15:03   ` Andrii Anisov
2018-02-09 15:18     ` Dario Faggioli
2018-02-09 15:36       ` Meng Xu
2018-02-09 15:56         ` Andrii Anisov
2018-02-09 17:51           ` Meng Xu
2018-02-10  0:14             ` Dario Faggioli
2018-02-10  4:53               ` Meng Xu
2018-02-12 10:17                 ` Dario Faggioli
2018-02-12 11:08                   ` Andrii Anisov
2018-02-12 14:52                     ` Meng Xu
2018-02-12 10:38                 ` Andrii Anisov
2018-02-12 10:20       ` Andrii Anisov
2018-02-12 18:44         ` Andrii Anisov
2018-02-16 18:37           ` Dario Faggioli
2018-02-20 11:34             ` Andrii Anisov
2018-02-22 17:53               ` Dario Faggioli
2018-02-26 12:00                 ` Andrii Anisov
2018-02-09 15:34 ` Meng Xu
2018-02-09 15:53   ` Andrii Anisov
2018-02-09 16:04   ` Andrii Anisov
2018-02-09 17:53     ` Meng Xu
2018-02-09 18:07       ` Andrii Anisov
