* RTDS with extra time issue
@ 2018-02-09 12:20 Andrii Anisov
  2018-02-09 12:25 ` Andrii Anisov
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 12:20 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Dear Dario,

Now I'm experimenting with RTDS, in particular with "extra time" 
functionality.

My experimental setup is built on a Salvator-X board with the H3 SoC 
(running only the big-core cluster, 4x A57).
Domains are up and running, and their VCPUs are as follows:

root@generic-armv8-xt-dom0:/xt/dom.cfg# xl sched-rtds -v all
Cpupool Pool-0: sched=RTDS
Name                                ID VCPU    Period    Budget Extratime
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
Domain-0                             0    0     10000 1000        yes
Domain-0                             0    1     10000 1000        yes
Domain-0                             0    2     10000 1000        yes
Domain-0                             0    3     10000 1000        yes
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
DomR                                 3    0     10000 5000         no
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
DomA                                 5    0     10000 1000        yes
DomA                                 5    1     10000 1000        yes
DomA                                 5    2     10000 1000        yes
DomA                                 5    3     10000 1000        yes
(XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
DomD                                 6    0     10000 1000        yes
DomD                                 6    1     10000 1000        yes
DomD                                 6    2     10000 1000        yes
DomD                                 6    3     10000 1000        yes
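
For reference, parameters like the above can be set per vCPU with 
`xl sched-rtds`; a sketch, assuming the Xen 4.10 syntax where the 
-e/--extratime flag exists:

    xl sched-rtds -d DomR -v 0 -p 10000 -b 5000 -e 0
    xl sched-rtds -d DomA -v all -p 10000 -b 1000 -e 1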

The idea of this configuration is that only DomR really runs RT tasks, 
and their CPU utilization would be less than half a CPU. The rest are 
application domains whose tasks need no RT guarantees, but which may 
utilize as much CPU as they need and as is available at the moment.
I load application domains with `dd if=/dev/zero of=/dev/null` per VCPU.
In DomR I run one RT task with a period of 10ms and a WCET of 4ms (I'm 
using LITMUS-RT for DomR), and I see that this task sometimes misses its 
deadline, which means that DomR's only VCPU has not received its 5ms in 
every 10ms.
The ps output in DomR is as follows:

root@genericarmv8:~# ps
   PID USER       VSZ STAT COMMAND
     1 root      1764 S    init
     2 root         0 SW   [kthreadd]
     3 root         0 SW   [ksoftirqd/0]
     4 root         0 SW   [kworker/0:0]
     5 root         0 SW<  [kworker/0:0H]
     6 root         0 SW   [kworker/u2:0]
     7 root         0 SW   [rcu_preempt]
     8 root         0 SW   [rcu_sched]
     9 root         0 SW   [rcu_bh]
    10 root         0 SW   [migration/0]
    11 root         0 SW<  [lru-add-drain]
    12 root         0 SW   [watchdog/0]
    13 root         0 SW   [cpuhp/0]
    14 root         0 SW   [kdevtmpfs]
    15 root         0 SW<  [netns]
    16 root         0 SW   [kworker/u2:1]
    17 root         0 SW   [xenwatch]
    18 root         0 SW   [xenbus]
   360 root         0 SW   [khungtaskd]
   361 root         0 SW   [oom_reaper]
   362 root         0 SW<  [writeback]
   364 root         0 SW   [kcompactd0]
   365 root         0 SWN  [ksmd]
   366 root         0 SW<  [crypto]
   367 root         0 SW<  [kintegrityd]
   368 root         0 SW<  [bioset]
   370 root         0 SW<  [kblockd]
   388 root         0 SW<  [ata_sff]
   394 root         0 SW   [kworker/0:1]
   433 root         0 SW<  [watchdogd]
   519 root         0 SW<  [rpciod]
   520 root         0 SW<  [xprtiod]
   548 root         0 SW   [kswapd0]
   549 root         0 SW<  [vmstat]
   633 root         0 SW<  [nfsiod]
   802 root         0 SW   [khvcd]
   844 root         0 SW<  [bioset]
   847 root         0 SW<  [bioset]
   850 root         0 SW<  [bioset]
   853 root         0 SW<  [bioset]
   856 root         0 SW<  [bioset]
   859 root         0 SW<  [bioset]
   861 root         0 SW<  [bioset]
   864 root         0 SW<  [bioset]
   946 root         0 SW<  [vfio-irqfd-clea]
  1405 root      2912 S    {start_getty} /bin/sh /bin/start_getty 115200 hvc0 vt102
  1406 root      2976 S    /sbin/getty 38400 tty1
  1407 root      3256 S    -sh
  1512 root      2908 S    {st-trace-schedu} /bin/bash /usr/sbin/st-trace-schedule -s m1
  1523 root      1932 S    rtspin -a 194000 -w 48 100 120
  1527 root      1748 S    /usr/sbin/ftcat -p /tmp/tmp.3PXDeo/cpu0.pid /dev/litmus/sched_trace0 501 502 503 504 505 506 507 508 509 510 511
  1533 root      3224 R    ps

I noticed this behavior while running the EPAM demo setup, which consists 
of HW drivers and PV driver backends in DomD, DomA running real Android 
with PV drivers and GPU sharing, etc.
But I managed to reproduce the issue with all domains running a generic 
armv8 kernel with a minimal initramfs.
So I suspect an issue in RTDS.

-- 

*Andrii Anisov*



* Re: RTDS with extra time issue
  2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
@ 2018-02-09 12:25 ` Andrii Anisov
  2018-02-09 13:18 ` Dario Faggioli
  2018-02-09 15:34 ` Meng Xu
  2 siblings, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 12:25 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,


It seems I initially used your old email address.

Please take a look here.



-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
  2018-02-09 12:25 ` Andrii Anisov
@ 2018-02-09 13:18 ` Dario Faggioli
  2018-02-09 15:03   ` Andrii Anisov
  2018-02-09 15:34 ` Meng Xu
  2 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-09 13:18 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu, Dario Faggioli



On Fri, 2018-02-09 at 14:20 +0200, Andrii Anisov wrote:
> Dear Dario,
> 
Hi,

> My experimental setup is built on Salvator-X board with H3 SOC
> (running 
> only big cores cluster, 4xA57).
> Domains up and running, and their VCPU are as following:
> 
> root@generic-armv8-xt-dom0:/xt/dom.cfg# xl sched-rtds -v all
> Cpupool Pool-0: sched=RTDS
> Name                                ID VCPU    Period    Budget Extratime
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> Domain-0                             0    0     10000 1000        yes
> Domain-0                             0    1     10000 1000        yes
> Domain-0                             0    2     10000 1000        yes
> Domain-0                             0    3     10000 1000        yes
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomR                                 3    0     10000 5000         no
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomA                                 5    0     10000 1000        yes
> DomA                                 5    1     10000 1000        yes
> DomA                                 5    2     10000 1000        yes
> DomA                                 5    3     10000 1000        yes
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomD                                 6    0     10000 1000        yes
> DomD                                 6    1     10000 1000        yes
> DomD                                 6    2     10000 1000        yes
> DomD                                 6    3     10000 1000        yes
> 
Ok, so you're giving:
- 40% CPU time to Domain-0
- 50% CPU time to DomR
- 40% CPU time to DomA
- 40% CPU time to DomD

total utilization is 170%. As far as I've understood you have 4 CPUs,
right? If yes, there *should* be no problems. (Well, in theory, we'd
need a schedulability test to know for sure whether the system is
"feasible", but I'm going to assume that it sort of is, and leave to
Meng any further real-time scheduling analysis related configurations.
:-) ).

> The idea of such configuration is that only DomR really runs RT
> tasks, 
> and their CPU utilization would be less than half a CPU. Rest of the 
> domains are application domains without need of RT guarantees for
> their 
> tasks, but can utilize as much CPU as they need and is available at
> this 
> moment.
>
So, this should work, as allowing the other domains to use extratime
should *not* let them prevent DomR from getting its 50% share of CPU
time.

I wonder, though, whether this case would not be better served by
cpupools. E.g., you can leave the non-real-time domains in the default
pool (and have Credit or Credit2 there), and then have an RTDS cpupool
in which you put DomR, with its 50% share, and perhaps someone else
(just to avoid wasting the other 50%).

But that's a different story...
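
For concreteness, such a split might look like this (just a sketch of
the xl commands, assuming pCPU 3 gets dedicated to the RT pool):

    xl cpupool-cpu-remove Pool-0 3
    xl cpupool-create name=\"rt-pool\" sched=\"rtds\"
    xl cpupool-cpu-add rt-pool 3
    xl cpupool-migrate DomR rt-pool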

> I load application domains with `dd if=/dev/zero of=/dev/null` per
> VCPU.
> In DomR I run one RT task with period 10ms and wcet 4ms (I'm using 
> LITMUS-RT for DomR), and see that this task sometime misses its 
> deadline. Which means that the only VCPU of DomR haven't got its 5ms 
> each 10ms.
>
Well, that's a possibility, and (if the system is indeed schedulable,
which again, I'm assuming just out of laziness :-/ ) it would be a bug.
However, as a first thing, I'd make sure that this is actually what is
happening.

Basically, can you also fully load DomR (like with dd as above, or just
yes or a while(1) loop), and then check whether it is getting its 50%?
For a first approximation, you can check with xentop. If you want to be
even more sure/you want to know precisely, you can use tracing.
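
Something like this, I mean (a sketch: one busy loop per vCPU inside
DomR, then batch-mode xentop in Dom0, one sample per second):

    # inside DomR
    yes > /dev/null &
    # in Dom0
    xentop -b -d 1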

If DomR is not able to get its share, then we have an issue/bug in the
scheduler. If it does, then the scheduler is doing its job, and the
issue may be somewhere else (e.g., something inside the guest may eat
some of the budget, in such a way that not all of it is available when
you actually need it).

Let me know.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-09 13:18 ` Dario Faggioli
@ 2018-02-09 15:03   ` Andrii Anisov
  2018-02-09 15:18     ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 15:03 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,

On 09.02.18 15:18, Dario Faggioli wrote:
> Ok, so you're giving:
> - 40% CPU time to Domain-0
> - 50% CPU time to DomR
> - 40% CPU time to DomA
> - 40% CPU time to DomD
> total utilization is 170%. As far as I've understood you have 4 CPUs,
> right? If yes, there *should* be no problems. (Well, in theory, we'd
> need a schedulability test to know for sure whether the system is
> "feasible", but I'm going to assume that it sort of is, and leave to
> Meng any further real-time scheduling analysis related configurations.
> :-) ).
Being a bit more specific, I give:

  - 4*10% CPU time to Domain-0
  - 1*50% CPU time to DomR
  - 4*10% CPU time to DomA
  - 4*10% CPU time to DomD

Which seems to be schedulable on a 4*100% CPU. I guess Meng could shed 
more light on this topic from a theoretical point of view.

> So, this should work, as allowing the other domains to use extratime
> should *not* let them prevent DomR from getting its 50% share of CPU
> time.
That is my point.

> I wonder, though, whether this case would not be better served by
> cpupools. E.g., you can leave the non-real-time domains in the default
> pool (and have Credit or Credit2 there), and then have an RTDS cpupool
> in which you put DomR, with its 50% share, and perhaps someone else
> (just to avoid wasting the other 50%).
The problem here is that a domain cannot have its vCPUs in different 
cpupools, so we would waste that fraction of a CPU.
IMHO, in case of pCPU partitioning with cpupools, we lose the practical 
application of the RTDS scheduler: a pool with the null scheduler would 
do the job for the RT domain.
IMHO one benefits from the RTDS scheduler only when one has RT vCPU(s) 
utilizing a fraction of a pCPU and wants to save the remaining resources.

> Basically, can you also fully load DomR (like with dd as above, or
> just yes or a while(1) loop), and then check whether it is getting its
> 50%? For a first approximation, you can check with xentop.

For sure I did this run: xentop clearly shows 50% when DomR is loaded 
with dd, and an equal distribution of CPU resources among the other 
domains, i.e.:

        NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
        DomA -----r      20325  117.0    2178960   53.6    2180096      53.7     4    0        0        0    0        0        0        0          0          0   11
        DomD -----r      20282  116.4    1048464   25.8    1049600      25.8     4    0        0        0    0        0        0        0          0          0   11
    Domain-0 -----r      21123  117.2     262144    6.5   no limit       n/a     4    0        0        0    0        0        0        0          0          0    2
        DomR -----r        284   50.0     196496    4.8     197632       4.9     1    1        0        0    0        0        0        0          0          0   11

When I run my actual test, xentop shows something like the following:

        NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
    Domain-0 -----r     223493  120.2    2178960   53.6    2180096      53.7     4    0        0        0    0        0        0        0          0          0    2
        DomD -----r     215036  118.7    1048464   25.8    1049600      25.8     4    0        0        0    0        0        0        0          0          0   11
        DomA ------     215396  115.4    2178960   53.6    2180096      53.7     4    0        0        0    0        0        0        0          0          0   11
        DomR -----r       6145   38.7     196496    4.8     197632       4.9     1    1        0        0    0        0        0        0          0          0   11

Which is OK for the LITMUS-RT test load, which by default runs the task 
for 0.95 of the given WCET (0.95 * 4ms / 10ms = 38%, roughly matching 
the ~38.7% above).

I get several deadline misses in a 4-minute run of a task with a 10ms 
period. xentop would not show such deviations; it is too coarse-grained.

> If you want to be even more sure/you want to know precisely, you can use tracing.
Yep, maybe it's time for me to get familiar with tracing in Xen.

> If DomR is not able to get its share, then we have an issue/bug in the
> scheduler. If it does, then the scheduler is doing its job, and the
> issue may be somewhere else (e.g., something inside the guest may eat
> some of the budget, in such a way that not all of it is available when
> you actually need it).
The DomR guest is really lean; I have already shown its process list. I 
really doubt that init together with getty on the HVC console can eat 
another 10% of a CPU at any moment.


-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 15:03   ` Andrii Anisov
@ 2018-02-09 15:18     ` Dario Faggioli
  2018-02-09 15:36       ` Meng Xu
  2018-02-12 10:20       ` Andrii Anisov
  0 siblings, 2 replies; 25+ messages in thread
From: Dario Faggioli @ 2018-02-09 15:18 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu



On Fri, 2018-02-09 at 17:03 +0200, Andrii Anisov wrote:
> > If DomR is not able to get its share, then we have an issue/bug in
> > the
> > scheduler. If it does, then the scheduler is doing its job, and the
> > issue may be somewhere else (e.g., something inside the guest may
> > eat
> > some of the budget, in such a way that not all of it is available
> > when
> > you actually need it).
> 
> The DomR guest is really lean; I have already shown its process list.
> I really doubt that init together with getty on the HVC console can
> eat another 10% of a CPU at any moment.
> 
So, I'm a little bit in a hurry now, and I'll reply better later (or on
Monday). But for now, just to understand things better, can you enable
extratime for DomR as well, and report what you see in xentop, and
whether or not you still see deadline misses?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
  2018-02-09 12:25 ` Andrii Anisov
  2018-02-09 13:18 ` Dario Faggioli
@ 2018-02-09 15:34 ` Meng Xu
  2018-02-09 15:53   ` Andrii Anisov
  2018-02-09 16:04   ` Andrii Anisov
  2 siblings, 2 replies; 25+ messages in thread
From: Meng Xu @ 2018-02-09 15:34 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli



On Fri, Feb 9, 2018 at 7:20 AM, Andrii Anisov <andrii_anisov@epam.com>
wrote:

> Dear Dario,
>
> Now I'm experimenting with RTDS, in particular with "extra time"
> functionality.
>
> My experimental setup is built on Salvator-X board with H3 SOC (running
> only big cores cluster, 4xA57).
> Domains up and running, and their VCPU are as following:
>
> root@generic-armv8-xt-dom0:/xt/dom.cfg# xl sched-rtds -v all
> Cpupool Pool-0: sched=RTDS
> Name                                ID VCPU    Period    Budget Extratime
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> Domain-0                             0    0     10000 1000        yes
> Domain-0                             0    1     10000 1000        yes
> Domain-0                             0    2     10000 1000        yes
> Domain-0                             0    3     10000 1000        yes
> (XEN) FLASK: Allowing unknown domctl_scheduler_op: 3.
> DomR                                 3    0     10000 5000         no
>

To make sure a task on a VCPU misses no deadline, we must guarantee that:
1) The VCPU gets its configured time, which your following emails show it
does;
2) When the VCPU gets its configured time, the task on the VCPU can be
scheduled. <-- This can be achieved by configuring the VCPU's parameters
correctly.

The deadline miss problem in the test case you presented here is likely
caused by case (2).
Even if DomR gets 5ms in every 10ms, the task (period = 10ms, budget =
4ms) on the VCPU will still miss its deadline.
In theory, DomR with a 10ms period should be configured to have a budget
of 7ms. But here the budget is configured as 5ms, which is less than
required.

The fundamental reason is that the release times of the RT task and of the
task's VCPU are not synchronized.
This is why we cannot assign a task's own (or similar) parameters to its
VCPU: with the VCPU's budget at 5ms, the starvation interval is
2 * (period - budget) = 10ms, still enough to make the VCPU's task miss
its deadline.
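
To make the arithmetic explicit:

    blackout = 2 * (period - budget)
    budget 5ms: 2 * (10 - 5) = 10ms -> a (10ms, 4ms) task can miss
    budget 7ms: 2 * (10 - 7) =  6ms -> 10 - 6 = 4ms always remains, just enough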

[Forgive me for attaching a slide to explain this.]
[image: Inline image 1]

If you want to keep the same VCPU parameters, can you try to set the
task's period = 100ms and execution time = 40ms?
In theory (I used CARTS to compute this), a VCPU (10ms, 5ms) can schedule
a task (100ms, 40ms).
Note that the resource demand of two RT tasks with the same utilization
differs: the task with the smaller period has the larger demand.
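
(A hypothetical invocation, assuming rtspin's positional arguments are
WCET, period and duration, in ms, ms and seconds respectively:
`rtspin -w 40 100 60`.)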

Best,

Meng


-----------
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: RTDS with extra time issue
  2018-02-09 15:18     ` Dario Faggioli
@ 2018-02-09 15:36       ` Meng Xu
  2018-02-09 15:56         ` Andrii Anisov
  2018-02-12 10:20       ` Andrii Anisov
  1 sibling, 1 reply; 25+ messages in thread
From: Meng Xu @ 2018-02-09 15:36 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Andrii Anisov

On Fri, Feb 9, 2018 at 10:18 AM, Dario Faggioli <dfaggioli@suse.com> wrote:
>
> So, I'm a little bit in a hurry now, and I'll reply better later (or on
> Monday). But for now, just to understand things better, can you enable
> extratime for DomR as well, and report what you see in xentop, and
> whether or not you still see deadline misses?
>

Another way to check whether there is interference from services in domR
is to set period = budget for domR's VCPUs.
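
E.g., a sketch, with the same xl syntax as earlier in the thread:

    xl sched-rtds -d DomR -v 0 -p 10000 -b 10000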

Best Regards,

Meng

-----------
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: RTDS with extra time issue
  2018-02-09 15:34 ` Meng Xu
@ 2018-02-09 15:53   ` Andrii Anisov
  2018-02-09 16:04   ` Andrii Anisov
  1 sibling, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 15:53 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli

Hello Meng Xu,


Thank you for your explanation.

On 09.02.18 17:34, Meng Xu wrote:
> To make sure a task on a VCPU misses no deadline, we must guarantee
> that:
> 1) The VCPU gets its configured time, which your following emails show
> it does;
> 2) When the VCPU gets its configured time, the task on the VCPU can
> be scheduled. <-- This can be achieved by configuring the VCPU's
> parameters correctly.
>
> The deadline miss problem in the test case you presented here is
> likely caused by case (2).
> Even if DomR gets 5ms in every 10ms, the task (period = 10ms,
> budget = 4ms) on the VCPU will still miss its deadline.
> In theory, DomR with a 10ms period should be configured to have a
> budget of 7ms. But here the budget is configured as 5ms, which is
> less than required.
>
> The fundamental reason is that the release times of the RT task and of
> the task's VCPU are not synchronized.
> This is why we cannot assign a task's own (or similar) parameters to
> its VCPU: with the VCPU's budget at 5ms, the starvation interval is
> 2 * (period - budget) = 10ms, still enough to make the VCPU's task
> miss its deadline.
>
> [Forgive me for attaching a slide to explain this.]
I think I've got the point.

> If you want to keep the same VCPU parameters, can you try to set the
> task's period = 100ms and execution time = 40ms?
> In theory (I used CARTS to compute this), a VCPU (10ms, 5ms) can
> schedule a task (100ms, 40ms).
> Note that the resource demand of two RT tasks with the same
> utilization differs: the task with the smaller period has the larger
> demand.
I'll do more experiments.

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 15:36       ` Meng Xu
@ 2018-02-09 15:56         ` Andrii Anisov
  2018-02-09 17:51           ` Meng Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 15:56 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli

Hello Meng Xu,


On 09.02.18 17:36, Meng Xu wrote:
> Another way to check whether there is interference from services in
> domR is to set period = budget for domR's VCPUs.
Could you please explain how setting the budget equal to the period would 
help discover any interference from services in the domain?

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 15:34 ` Meng Xu
  2018-02-09 15:53   ` Andrii Anisov
@ 2018-02-09 16:04   ` Andrii Anisov
  2018-02-09 17:53     ` Meng Xu
  1 sibling, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 16:04 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli


On 09.02.18 17:34, Meng Xu wrote:
> If you want to keep the same VCPU parameter, can you try to set task's 
> period = 100ms and exe time = 40ms?
> By theory (I used CARTS to compute), a VCPU (10ms, 5ms) can schedule a 
> task (100ms, 40ms).
> Note that the resource demand of two RT tasks with the same 
> utilization is different: the task with smaller period has larger demand.
BTW, could you please share the model xml file to me?

-- 

*Andrii Anisov*



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: RTDS with extra time issue
  2018-02-09 15:56         ` Andrii Anisov
@ 2018-02-09 17:51           ` Meng Xu
  2018-02-10  0:14             ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Meng Xu @ 2018-02-09 17:51 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli

On Fri, Feb 9, 2018 at 10:56 AM, Andrii Anisov <andrii_anisov@epam.com> wrote:
> Hello Meng Xu,
>
>
> On 09.02.18 17:36, Meng Xu wrote:
>>
>> Another way to check whether there is interference from services in
>> domR is to set period = budget for domR's VCPUs.
>
> Could you please explain how setting the budget equal to the period
> would help discover any interference from services in the domain?
>
> --

Basically, setting period = budget is similar to what Dario suggests.
The only difference is that it can avoid some tiny scheduler overhead
in replenishing the VCPU's budget with the unused CPU time.

Best,

Meng



-----------
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: RTDS with extra time issue
  2018-02-09 16:04   ` Andrii Anisov
@ 2018-02-09 17:53     ` Meng Xu
  2018-02-09 18:07       ` Andrii Anisov
  0 siblings, 1 reply; 25+ messages in thread
From: Meng Xu @ 2018-02-09 17:53 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli


On Fri, Feb 9, 2018 at 11:04 AM, Andrii Anisov <andrii_anisov@epam.com> wrote:
>
> BTW, could you please share the model xml file to me?
>

Sure!
It's attached.

Meng

[-- Attachment #2: example-in.xml --]
[-- Type: text/xml, Size: 178 bytes --]

<system os_scheduler="EDF" period="2"> 
	<component name="C0" scheduler="EDF" period="10"> 
		 <task name="T0" p="100" d="100" e="40" > </task>
		 
	</component>
</system>

[-- Attachment #3: example-out.xml --]
[-- Type: text/xml, Size: 508 bytes --]

<component name="OS Scheduler" algorithm="PRM interface">
	<resource>
		<model period="2" bandwidth="1" deadline="2"> </model>
	</resource>
	<processed_task>
		<model period="2" execution_time="2" deadline="2"> </model>
	</processed_task>
	<component name="C0" algorithm="PRM interface">
		<resource>
			<model period="10" bandwidth="0.5" deadline="10"> </model>
		</resource>
		<processed_task>
			<model period="10" execution_time="5" deadline="10"> </model>
		</processed_task>
	</component>
</component>


* Re: RTDS with extra time issue
  2018-02-09 17:53     ` Meng Xu
@ 2018-02-09 18:07       ` Andrii Anisov
  0 siblings, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-09 18:07 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Dario Faggioli

Thank you, I'll take a look.


On 09.02.18 19:53, Meng Xu wrote:
> Sure!
> It's attached.
>
> Meng

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-09 17:51           ` Meng Xu
@ 2018-02-10  0:14             ` Dario Faggioli
  2018-02-10  4:53               ` Meng Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-10  0:14 UTC (permalink / raw)
  To: Meng Xu, Andrii Anisov; +Cc: xen-devel



On Fri, 2018-02-09 at 12:51 -0500, Meng Xu wrote:
> > On 09.02.18 17:36, Meng Xu wrote:
> > > Another way to check whether there is interference from services
> > > in domR is to set period = budget for domR's VCPUs.
> > 
> > Could you please explain how setting the budget equal to the period
> > would help discover any interference from services in the domain?
> > 
> Basically, setting period = budget is similar to what Dario suggests.
>
The goal is to figure out where the problem is.

It looks like DomR's vCPU does get 50% of the CPU time, so it's not that
the other vCPUs are preventing it from exploiting its own reservation. If
that were not the case, there'd be a bug in the scheduler.

By giving the vCPU 100% (either via "budget == period" or with
extratime), we will figure out whether the real-time applications inside
can actually meet their deadlines. If they can't even with such a setup,
it would mean the problem is somewhere else (virtualization overhead, IRQ
latency, etc.).

If the applications can meet their deadline with 100% CPU time, but not
with 50%, then there are two possibilities:
1) they need more than 50%;
2) you're having "period synchronization" issues, as Meng was
describing.

Figuring out whether you're in case 1 should be as easy as trying to give
DomR 55%, then 60%, then 65%, etc., and seeing at which point the
deadline misses are gone for good.
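
A sketch of such a sweep, reusing the xl syntax from earlier in the
thread:

    for b in 5500 6000 6500 7000; do
        xl sched-rtds -d DomR -v 0 -p 10000 -b $b
        # re-run the LITMUS-RT experiment here and count the misses
    done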

If you can't get rid of the deadline misses, it probably means you are in
case 2, and you need to find a way to make sure that your real-time
applications inside the domain can actually exploit the domain's vCPU's
budget when it is there. I.e., you don't want them to activate when the
budget is about to run out, and hence suffer from the "blackout" shown in
Meng's diagram.

Unfortunately, that's not really trivial to do. If the workload is
really periodic, it may be enough to find a way to do some kind of
"calibration" at the beginning. But I'm not sure how robust this will
actually be.

Perhaps Meng has some more ideas on this as well. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-10  0:14             ` Dario Faggioli
@ 2018-02-10  4:53               ` Meng Xu
  2018-02-12 10:17                 ` Dario Faggioli
  2018-02-12 10:38                 ` Andrii Anisov
  0 siblings, 2 replies; 25+ messages in thread
From: Meng Xu @ 2018-02-10  4:53 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Andrii Anisov

> Unfortunately, that's not really trivial to do. If the workload is
> really periodic, it may be enough to find a way to do some kind of
> "calibration" at the beginning. But I'm not sure how robust this will
> actually be.
>
> Perhaps Meng has some more ideas on this as well. :-)

If the RT VCPU has only one RT task on it, we can synchronize the
release time of the VCPU and that of the RT task. In other words, the
release offsets of both the VCPU and the RT task are the same in terms
of wall clock. Then we can assign the task's parameters to the VCPU
and guarantee that the task misses no deadline if the VCPU misses no
deadline.
However, this only works under the assumption that one VCPU has only
one task in the RT domain. I'm not sure how practical that is, because
the observation cannot be generalized to multiple tasks on one VCPU.
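
(For what it's worth, on the LITMUS-RT side tasks started with
`rtspin -w` wait for a synchronous release triggered by the `release_ts`
tool; the hard part would be lining that release up with the VCPU's
replenishment instant, which, as far as I know, Xen does not expose to
the guest.)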

Andrii and Dario,
do you think the assumption that one VCPU runs only one RT task is
reasonable in practice?
If it is, are there use cases for this assumption?

Thanks,

Meng


* Re: RTDS with extra time issue
  2018-02-10  4:53               ` Meng Xu
@ 2018-02-12 10:17                 ` Dario Faggioli
  2018-02-12 11:08                   ` Andrii Anisov
  2018-02-12 10:38                 ` Andrii Anisov
  1 sibling, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-12 10:17 UTC (permalink / raw)
  To: Meng Xu; +Cc: xen-devel, Andrii Anisov



On Fri, 2018-02-09 at 23:53 -0500, Meng Xu wrote:
> > 
> > Perhaps Meng has some more ideas on this as well. :-)
> 
> 
> Andrii and Dario,
> Do you think the assumption that one VCPU runs only one RT task is
> reasonable in practice?
> If it is, are there use cases for this assumption?
> 
Well, I'll let Andrii reply, but honestly, I don't think it is.

See, for instance, the fact that DomR has only 1 vCPU: I find it
unlikely that the only thing running there is *just* *one* real-time
task. :-/

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-09 15:18     ` Dario Faggioli
  2018-02-09 15:36       ` Meng Xu
@ 2018-02-12 10:20       ` Andrii Anisov
  2018-02-12 18:44         ` Andrii Anisov
  1 sibling, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 10:20 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,


On 09.02.18 17:18, Dario Faggioli wrote:
> So, I'm a little bit in a hurry now, and I'll reply better later (or on
> Monday). But for now, just to understand things better, can you enable
> extratime for DomR as well, and report what you see in xentop, and
> whether or not you still see deadline misses?

Actually, as per Meng's explanation and calculations, the problem was on 
my side: wrong DomR task/VCPU parameters.
I re-ran the system with dummy loads and the values received from CARTS, 
and all seems to be OK (no deadline misses occurred).

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-10  4:53               ` Meng Xu
  2018-02-12 10:17                 ` Dario Faggioli
@ 2018-02-12 10:38                 ` Andrii Anisov
  1 sibling, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 10:38 UTC (permalink / raw)
  To: Meng Xu, Dario Faggioli; +Cc: xen-devel

Hello Meng,


On 10.02.18 06:53, Meng Xu wrote:
> If the RT VCPU has only one RT task on it, we can synchronize the
> release time of the VCPU and that of the RT task. In other words, the
> release offsets of both the VCPU and the RT task are the same in terms
> of wall clock. Then we can assign the task's parameters to the VCPU
> and guarantee that the task misses no deadline if the VCPU misses no
> deadline.
IMO, such a configuration could be useful for VCPU scheduling overhead 
estimation, though it seems unrealistic, because we would need some 
instruments in that domain to measure whether the task meets its deadline.

> Andrii and Dario,
> Do you think the assumption that one VCPU runs only one RT task is
> reasonable in practice?
> If it is, are there use cases for this assumption?
No, I doubt real-life use cases would fit this scheme.

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-12 10:17                 ` Dario Faggioli
@ 2018-02-12 11:08                   ` Andrii Anisov
  2018-02-12 14:52                     ` Meng Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 11:08 UTC (permalink / raw)
  To: Dario Faggioli, Meng Xu; +Cc: xen-devel

Dario, Meng,


On 12.02.18 12:17, Dario Faggioli wrote:
> Well, I'll let Andrii reply, but honestly, I don't think it is.
>
> See, for instance, the fact that DomR has only 1 vCPU: I find it
> unlikely that the only thing running there is *just* *one* real-time
> task. :-/
While I'm focused mainly on the topic discussed here [1], an RT domain 
will have some RTOS with its set of tasks (RT as well as non-RT), and it 
would also communicate with other domains. Likely there would be one RT 
domain per system.
So I'm trying to estimate whether RTDS has a practical use, or whether a 
dedicated cpupool with the null scheduler will do the job.

[1] https://lists.linuxfoundation.org/pipermail/automotive-discussions/2018-January/005590.html

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-12 11:08                   ` Andrii Anisov
@ 2018-02-12 14:52                     ` Meng Xu
  0 siblings, 0 replies; 25+ messages in thread
From: Meng Xu @ 2018-02-12 14:52 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Dario Faggioli

On Mon, Feb 12, 2018 at 6:08 AM, Andrii Anisov <andrii_anisov@epam.com> wrote:
>
> Dario, Meng,
>
>
> While I'm focused mainly on the topic discussed here [1], an RT domain
> will have some RTOS with its set of tasks (RT as well as non-RT), and
> it would also communicate with other domains. Likely there would be one
> RT domain per system.
> So I'm trying to estimate whether RTDS has a practical use, or whether
> a dedicated cpupool with the null scheduler will do the job.
>
> [1] https://lists.linuxfoundation.org/pipermail/automotive-discussions/2018-January/005590.html

I see. This is interesting. I'm also interested in the practical use
cases of both RTDS and the null scheduler.

Best,

Meng


* Re: RTDS with extra time issue
  2018-02-12 10:20       ` Andrii Anisov
@ 2018-02-12 18:44         ` Andrii Anisov
  2018-02-16 18:37           ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-12 18:44 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Dario,


On 12.02.18 12:20, Andrii Anisov wrote:
> Actually, as per Meng's explanation and calculations, the problem was
> on my side: wrong DomR task/VCPU parameters.
> I re-ran the system with dummy loads and the values received from
> CARTS, and all seems to be OK (no deadline misses occurred).
Well, what I meant by dummy loads was: all domains are generic armv8 
kernels with minimal filesystems running `dd if=/dev/zero of=/dev/null`, 
except DomR. In that case no DL misses occurred with the parameters given 
by CARTS.

Now I have a real driver domain and Android with GPU sharing. The loads 
are things like youtube playback in DomA and dd from MMC through ssh in 
DomD. And I see unexpected DL misses for the same RT configuration.

Well, this provides some ground for another concern of mine about the Xen 
scheduling approach. My doubt is that scheduling is done within a softirq, 
so all the pCPU time spent on the exception itself and on possible timer 
actions is accounted to the vCPU whose context was interrupted. This 
seems not really fair and might be disruptive for RT scheduling.

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-12 18:44         ` Andrii Anisov
@ 2018-02-16 18:37           ` Dario Faggioli
  2018-02-20 11:34             ` Andrii Anisov
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-16 18:37 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu



On Mon, 2018-02-12 at 20:44 +0200, Andrii Anisov wrote:
> Dario,
> 
Hi,

> On 12.02.18 12:20, Andrii Anisov wrote:
> > Actually, as per Meng's explanation and calculations, the problem
> > was on my side: wrong DomR task/VCPU parameters.
> > I re-ran the system with dummy loads and the values received from
> > CARTS, and all seems to be OK (no deadline misses occurred).
> 
> Well, what I meant by dummy loads was: all domains are generic armv8
> kernels with minimal filesystems running `dd if=/dev/zero of=/dev/null`,
> except DomR. In that case no DL misses occurred with the parameters
> given by CARTS.
> 
> Now I have a real driver domain and Android with GPU sharing. The
> loads are things like youtube playback in DomA and dd from MMC through
> ssh in DomD. And I see unexpected DL misses for the same RT
> configuration.
> 
And what is it that is running in DomR, the same thing as before, when
the load was synthetic? And in any case, is the workload running in DomR
in its turn a synthetic real-time load, or a real real-time application?

If the latter, are you sure the misses are not due to the fact that, for
instance, the RT app does not always behave as it did when it was
measured, i.e. when its parameters were computed?

> Well, this provides some ground for another concern of mine about the
> Xen scheduling approach. My doubt is that scheduling is done within a
> softirq, so all the pCPU time spent on the exception itself and on
> possible timer actions is accounted to the vCPU whose context was
> interrupted.
>
I am not sure I fully understand this.

If you're worried that some kind of overhead may be consuming some of
your real-time reservation, try to increase the reservation itself a bit,
and see if the misses disappear.

Scheduling always happens as a consequence of some task/vcpu blocking,
some task/vcpu waking up, or of timer events, in all the OSes I have ever
seen, so I don't think Xen is really special in this regard.

One difference could be that Linux can be configured to be fully
preemptible --even the kernel-- while Xen is not. But I don't think
this is what you're hinting at, is it?

> This seems not really fair and might be disruptive for RT scheduling.
> 
Well, if you're saying that accounting can be improved, I do agree. It
always can (again, in all the OSes! :-D)

Note that it is not always evident how to do that, and I'm not talking
about the actual implementation. I think it would not be too hard to
track the time we spend inside the hypervisor. But then, what do we
do? 

Because if DomX was running, and we entered Xen because an interrupt
arrived to deal with a timer or whatever from DomY, then I agree it's not
fair to charge DomX for that. But, OTOH, if we are in Xen because DomX
itself made a hypercall, then it is indeed OK to charge DomX.

And note that this does not have much to do with how schedule() gets
called. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-16 18:37           ` Dario Faggioli
@ 2018-02-20 11:34             ` Andrii Anisov
  2018-02-22 17:53               ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Anisov @ 2018-02-20 11:34 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,


On 16.02.18 20:37, Dario Faggioli wrote:
> And what is it that is running in DomR, the same thing as before, when
> the load was synthetic?
For sure I compare apples to apples.

> And in any case, is it, in its turn (I mean the
> workload running in DomR) a synthetic real-time load, or is it a real
> real-time application?
The real-time domain load is synthetic, though. I'm using a LITMUS-RT 
system for DomR, in particular with the amount-of-work-based 
configuration [1] introduced recently.

> If the latter, are you sure the misses are not due to the fact that,
> for instance, the rt app does not always behave as measured/expected,
> when computing the parameters?
Even for the synthetic RT workload, some deviations in execution time are 
noticed (~0.5%). But with no IRQ-intensive load in the application 
domains, no DL misses are noticed in the RT domain.

>> Well this provides some ground for another my concern about XEN
>> scheduling approach. My doubt is that scheduling is done within
>> softirq,
>> so all time spent with pcpu for exception itself and possible timer
>> actions is accounted for the vcpu which context was interrupted.
> I am not sure I fully understand this.
My idea is to not charge time spent in the hypervisor to the current 
vCPU's budget, except for serving that vCPU's hypercalls and handling 
interrupts targeting that vCPU. The same as you expressed:

> Because if DomX was running, and we entered Xen because an interrupt
> arrived to deal with a timer or whatever from DomY, then I agree it's
> not fair to charge DomX for that. But, OTOH, if we are in Xen because
> DomX itself made a hypercall, then it is indeed ok to charge DomX.
For RT scheduling this would make a big difference.

> If you're worried that some kind of overhead may be consuming some of
> your real-time reservation, try increasing the reservation itself a
> bit, and see if the misses disappear.
It's not about overhead, but about unfair time accounting. And this
unfairness is pretty arbitrary: it depends on the other domains' activity.

> One difference could be that Linux can be configured to be fully
> preemptible --even the kernel-- while Xen is not. But I don't think
> this is what you're hinting at, is it?
No, it is not.
If we are speaking about Linux, it is much closer to
CONFIG_IRQ_TIME_ACCOUNTING [2].

> And note that this does not have much to do with how schedule() gets
> called. :-)
In the current implementation it does matter *when* `schedule()` is
called, because time accounting is done by passing the `now` time value
to `sched->do_schedule()` right in `schedule()`.
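
To illustrate what I mean, this is roughly the relevant path, as I read
xen/common/schedule.c and xen/common/sched_rt.c (a simplified sketch,
with everything inessential dropped):

/* In schedule(), `now` is sampled only once, when the scheduling
 * softirq finally gets to run: */
s_time_t now = NOW();
next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);

/* For RTDS, do_schedule() ends up in burn_budget(), which does roughly: */
delta = now - svc->last_start;   /* the whole interval since last_start */
svc->cur_budget -= delta;        /* charged to the interrupted vcpu,
                                  * hypervisor time included */

So whatever Xen itself was doing between the interrupt and the moment
the softirq actually ran is burnt from the budget of the vcpu that
happened to be current.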

[1] https://github.com/LITMUS-RT/liblitmus/pull/3
[2] https://lkml.org/lkml/2011/2/10/135

-- 

*Andrii Anisov*




* Re: RTDS with extra time issue
  2018-02-20 11:34             ` Andrii Anisov
@ 2018-02-22 17:53               ` Dario Faggioli
  2018-02-26 12:00                 ` Andrii Anisov
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2018-02-22 17:53 UTC (permalink / raw)
  To: Andrii Anisov; +Cc: xen-devel, Meng Xu



On Tue, 2018-02-20 at 13:34 +0200, Andrii Anisov wrote:
> Hello Dario,
> 
Hi,

> On 16.02.18 20:37, Dario Faggioli wrote:
> > And in any case, is it, in its turn (I mean the workload running in
> > DomR) a synthetic real-time load, or is it a real real-time
> > application?
> 
> The real-time workload is synthetic, though. I'm using the LITMUS-RT
> system in DomR, in particular with the amount-of-work-based
> configuration [1] introduced recently.
> 
Ah, nice! :-)

> Even for the synthetic RT workload, some deviations in execution time
> are noticed (~0.5%). But with no IRQ-intensive load in the application
> domains, no deadline misses are noticed in the RT domain.
> 
Ok, I see.

> > > Well, this provides some ground for another concern of mine about
> > > the Xen scheduling approach. My doubt is that scheduling is done
> > > within a softirq, so all the time the pcpu spends on the exception
> > > itself and on possible timer actions is accounted to the vcpu whose
> > > context was interrupted.
> > 
> > I am not sure I fully understand this.
> 
> My idea is to charge the time spent in the hypervisor to the current
> vcpu's budget, except for serving that vcpu's hypercalls and handling
> interrupts targeted at that vcpu. The same as you expressed:
> 
As I said already, improving the accounting would be more than welcome.
If you're planning on doing something like this already, I'll be happy
to look at the patches. :-)

> > If you're worried that some kind of overhead may be consuming some
> > of your real-time reservation, try increasing the reservation itself
> > a bit, and see if the misses disappear.
> 
> It's not about overhead, but about unfair time accounting. And this
> unfairness is pretty arbitrary: it depends on the other domains'
> activity.
> 
Sure, I agree it's pretty bad. It's indeed particularly bad for RTDS,
where it messes with the guarantees, but it's rather bad for the other
schedulers as well, as it messes with fairness.

> > One difference could be that Linux can be configured to be fully
> > preemptible --even the kernel-- while Xen is not. But I don't think
> > this is what you're hinting at, is it?
> 
> No, it is not.
>
Ok, I was just double checking.

> If we are speaking about Linux, it is much closer to
> CONFIG_IRQ_TIME_ACCOUNTING [2].
> 
Yes, I'm familiar with that. That exact same model can't be applied to
Xen, but at least tracking time spent in IRQ handling, and discounting
that from vCPU execution time, should not be too hard.
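
Something along these lines, I mean (pure hand-waving: the per-pCPU
counters and irq_time_since() below are invented for the example,
nothing like them exists in the tree today):

/* Hypothetical: accumulate IRQ-handling time on each pCPU... */
void irq_time_enter(void) { this_cpu(irq_t0) = NOW(); }
void irq_time_exit(void)  { this_cpu(irq_time) += NOW() - this_cpu(irq_t0); }

/* ...and let the scheduler discount it when burning budget: */
delta = (now - svc->last_start) - irq_time_since(svc->last_start);

Getting the bookkeeping right (nesting, and races with the scheduler
reading the counters) is where the actual work would be, of course.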

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


* Re: RTDS with extra time issue
  2018-02-22 17:53               ` Dario Faggioli
@ 2018-02-26 12:00                 ` Andrii Anisov
  0 siblings, 0 replies; 25+ messages in thread
From: Andrii Anisov @ 2018-02-26 12:00 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Meng Xu

Hello Dario,

On 22.02.18 19:53, Dario Faggioli wrote:
> As I said already, improving the accounting would be more than welcome.
> If you're planning on doing something like this already, I'll be happy
> to look at the patches. :-)
First I have to document my findings and draw some conclusions about
the applicability of Xen for building systems with real-time
requirements. Then I will hopefully get on to that.

-- 

*Andrii Anisov*




end of thread

Thread overview: 25+ messages
2018-02-09 12:20 RTDS with extra time issue Andrii Anisov
2018-02-09 12:25 ` Andrii Anisov
2018-02-09 13:18 ` Dario Faggioli
2018-02-09 15:03   ` Andrii Anisov
2018-02-09 15:18     ` Dario Faggioli
2018-02-09 15:36       ` Meng Xu
2018-02-09 15:56         ` Andrii Anisov
2018-02-09 17:51           ` Meng Xu
2018-02-10  0:14             ` Dario Faggioli
2018-02-10  4:53               ` Meng Xu
2018-02-12 10:17                 ` Dario Faggioli
2018-02-12 11:08                   ` Andrii Anisov
2018-02-12 14:52                     ` Meng Xu
2018-02-12 10:38                 ` Andrii Anisov
2018-02-12 10:20       ` Andrii Anisov
2018-02-12 18:44         ` Andrii Anisov
2018-02-16 18:37           ` Dario Faggioli
2018-02-20 11:34             ` Andrii Anisov
2018-02-22 17:53               ` Dario Faggioli
2018-02-26 12:00                 ` Andrii Anisov
2018-02-09 15:34 ` Meng Xu
2018-02-09 15:53   ` Andrii Anisov
2018-02-09 16:04   ` Andrii Anisov
2018-02-09 17:53     ` Meng Xu
2018-02-09 18:07       ` Andrii Anisov
