* Introduce rt real-time scheduler for Xen
@ 2014-08-24 22:58 Meng Xu
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 72+ messages in thread
From: Meng Xu @ 2014-08-24 22:58 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	dario.faggioli, ian.jackson, xumengpanda, JBeulich, chaowang,
	lichong659, dgolomb

Hi all,

This series of patches adds an rt real-time scheduler to Xen.

In summary, it supports:
1) Preemptive Global Earliest Deadline First scheduling policy, using a global RunQ for the scheduler;
2) Assigning/displaying the parameters of each VCPU of each domain;
3) CPU pools.

Compared with the RFC v2 version of this series, this set of patches has the following improvements:
    a) added an rt-scheduler-specific TRACE facility;
    b) a more efficient RunQ implementation that avoids scanning the whole RunQ when inserting a vcpu without budget;
    c) bug fixes for cpupool support.

-----------------------------------------------------------------------------------------------------------------------------
TODO:
    a) Burn budget at a finer granularity than 1 ms; [medium]
    b) Use a separate timer per vcpu for each vcpu's budget replenishment, instead of scanning the full runqueue every now and then; [medium]
    c) Handle time stolen from domU by the hypervisor. On a machine with many sockets and lots of cores, the spin-lock for the global RunQ used in the rt scheduler could eat up domU's time, which could leave a domU with less budget than it requires. [not sure about the difficulty right now] (Thanks to Konrad Rzeszutek for pointing this out at the Xen Summit. :-))

Plan:
    We will work on TODO items a) and b) and try to finish them before September 10th. (We will also address the comments raised in the review of this set of patches.)

-----------------------------------------------------------------------------------------------------------------------------
The design of this rt scheduler is as follows:
This rt scheduler follows the Preemptive Global Earliest Deadline First (GEDF) theory from the real-time field.
Each VCPU can have a dedicated period and budget. While it is scheduled, a VCPU burns its budget. Each VCPU has its budget replenished at the beginning of each of its periods, and discards any unused budget at the end of each of its periods. If a VCPU runs out of budget in a period, it has to wait until its next period.
How a VCPU's budget is burned depends on the server mechanism implemented for each VCPU.
The priority of a VCPU at each scheduling point is decided by the Preemptive Global Earliest Deadline First scheduling scheme.

Server mechanism: a VCPU is implemented as a deferrable server.
When a VCPU has a task running on it, its budget is continuously burned;
When a VCPU has no task but still has budget left, its budget is preserved.

Priority scheme: Global Earliest Deadline First (EDF).
At any scheduling point, the VCPU with earliest deadline has highest priority.

Queue scheme: A global runqueue for each CPU pool.
The runqueue holds all runnable VCPUs.
VCPUs in the runqueue are divided into two parts: those with and those without remaining budget.
Within each part, VCPUs are sorted based on the GEDF priority scheme.

Scheduling quantum: 1 ms.
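
To make the budget accounting above concrete, below is a minimal C sketch of the per-period replenishment rule (illustrative only, not the patch code; the names vcpu_budget and replenish are made up for this example): when a VCPU's current deadline is in the past, the deadline is pushed forward by whole periods and the budget is refilled, discarding whatever was left unused.

struct vcpu_budget {
    long long period;        /* period length, in ns */
    long long budget;        /* full budget per period, in ns */
    long long cur_deadline;  /* absolute deadline of the current period, in ns */
    long long cur_budget;    /* budget remaining in the current period, in ns */
};

static void replenish(struct vcpu_budget *v, long long now)
{
    if ( now >= v->cur_deadline )
    {
        /* number of whole periods the deadline has fallen behind by */
        long long missed = (now - v->cur_deadline) / v->period + 1;

        v->cur_deadline += missed * v->period;
        v->cur_budget = v->budget;   /* unused budget is discarded */
    }
}

For example, with the parameters used for domain litmus1 in the scenario below (period=10000, budget=4000), a VCPU can consume at most 40% of one physical CPU in each of its periods.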

If you are interested in the details of the design and evaluation of this rt scheduler, please refer to our paper "Real-Time Multi-Core Virtual Machine Scheduling in Xen" (http://www.cis.upenn.edu/~mengxu/emsoft14/emsoft14.pdf), which will be published at EMSOFT14. The paper covers:
    a) the design of this scheduler;
    b) measurements of the implementation overhead, e.g., scheduler overhead, context switch overhead, etc.;
    c) a comparison of this rt scheduler and the credit scheduler in terms of real-time performance.
-----------------------------------------------------------------------------------------------------------------------------
The following scenario demonstrates the functionality of this rt scheduler:
//list each VCPU's parameters of each domain in the cpupools that use the rt scheduler
#xl sched-rt
Cpupool Pool-0: sched=EDF
Name                                ID VCPU Period Budget
Domain-0                             0    0  10000  10000
Domain-0                             0    1  20000  20000
Domain-0                             0    2  30000  30000
Domain-0                             0    3  10000  10000
litmus1                              1    0  10000   4000
litmus1                              1    1  10000   4000

//set the parameters of vcpu 1 of domain litmus1:
# xl sched-rt -d litmus1 -v 1 -p 20000 -b 10000

//domain litmus1's vcpu 1's parameters have been changed; display each VCPU's parameters:
# xl sched-rt -d litmus1
Name                                ID VCPU Period Budget
litmus1                              1    0  10000   4000
litmus1                              1    1  20000  10000

// list cpupool information
# xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0              12        rt       y          2

//create a cpupool test
#xl cpupool-cpu-remove Pool-0 11
#xl cpupool-cpu-remove Pool-0 10
#xl cpupool-create name=\"test\" sched=\"credit\"
#xl cpupool-cpu-add test 11
#xl cpupool-cpu-add test 10
#xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0              10        rt       y          2
test                 2    credit       y          0   

//migrate litmus1 from cpupool Pool-0 to cpupool test.
#xl cpupool-migrate litmus1 test

//now litmus1 is in cpupool test
# xl sched-credit 
Cpupool test: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
litmus1                              1    256    0 

-----------------------------------------------------------------------------------------------------------------------------
This set of patches was tested using the above scenario while running CPU-intensive tasks inside each guest domain. We manually checked that each domain gets its required resources without interference from other domains; we also manually checked that the scheduling sequence of vcpus follows the Earliest Deadline First scheduling policy.

Any comments, questions, and concerns are more than welcome! :-)

Thank you very much!

Best,

Meng

[PATCH v1 1/4] xen: add real time scheduler rt
[PATCH v1 2/4] libxc: add rt scheduler
[PATCH v1 3/4] libxl: add rt scheduler
[PATCH v1 4/4] xl: introduce rt scheduler

---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-24 22:58 Introduce rt real-time scheduler for Xen Meng Xu
@ 2014-08-24 22:58 ` Meng Xu
  2014-08-26 14:27   ` Jan Beulich
                     ` (3 more replies)
  2014-08-24 22:58 ` [PATCH v1 2/4] libxc: add rt scheduler Meng Xu
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 72+ messages in thread
From: Meng Xu @ 2014-08-24 22:58 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	dario.faggioli, ian.jackson, xumengpanda, Meng Xu, JBeulich,
	chaowang, lichong659, dgolomb

This scheduler follows the preemptive Global EDF theory from the real-time field.
Each VCPU can have a dedicated period and budget.
While it is scheduled, a VCPU burns its budget.
A VCPU has its budget replenished at the beginning of each of its periods;
the VCPU discards its unused budget at the end of each of its periods.
If a VCPU runs out of budget in a period, it has to wait until its next period.
The mechanism of how to burn a VCPU's budget depends on the server mechanism
implemented for each VCPU.

Server mechanism: a VCPU is implemented as a deferrable server.
When a VCPU is scheduled to execute on a PCPU, its budget is continuously
burned.

Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
At any scheduling point, the VCPU with earliest deadline has highest
priority.

Queue scheme: A global Runqueue for each CPU pool.
The Runqueue holds all runnable VCPUs.
VCPUs in the Runqueue are divided into two parts: those with and those without remaining budget.
Within each part, VCPUs are sorted based on the gEDF priority scheme.

Scheduling quantum: 1 ms.

Note: cpumask and cpupool are supported.

This is still in the development phase.

Signed-off-by: Sisu Xi <xisisu@gmail.com>
Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
---
 xen/common/Makefile         |    1 +
 xen/common/sched_rt.c       | 1205 +++++++++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c       |    4 +-
 xen/include/public/domctl.h |   30 +-
 xen/include/public/trace.h  |    1 +
 xen/include/xen/sched-if.h  |    1 +
 6 files changed, 1239 insertions(+), 3 deletions(-)
 create mode 100644 xen/common/sched_rt.c

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 3683ae3..5a23aa4 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -26,6 +26,7 @@ obj-y += sched_credit.o
 obj-y += sched_credit2.o
 obj-y += sched_sedf.o
 obj-y += sched_arinc653.o
+obj-y += sched_rt.o
 obj-y += schedule.o
 obj-y += shutdown.o
 obj-y += softirq.o
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
new file mode 100644
index 0000000..b1d9e6a
--- /dev/null
+++ b/xen/common/sched_rt.c
@@ -0,0 +1,1205 @@
+/******************************************************************************
+ * Preemptive Global Earliest Deadline First (EDF) scheduler for Xen
+ * EDF scheduling is one of the most popular real-time scheduling algorithms
+ * used in the embedded field.
+ *
+ * by Sisu Xi, 2013, Washington University in Saint Louis
+ * and Meng Xu, 2014, University of Pennsylvania
+ *
+ * based on the code of credit Scheduler
+ */
+
+#include <xen/config.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+#include <xen/domain.h>
+#include <xen/delay.h>
+#include <xen/event.h>
+#include <xen/time.h>
+#include <xen/perfc.h>
+#include <xen/sched-if.h>
+#include <xen/softirq.h>
+#include <asm/atomic.h>
+#include <xen/errno.h>
+#include <xen/trace.h>
+#include <xen/cpu.h>
+#include <xen/keyhandler.h>
+#include <xen/trace.h>
+#include <xen/guest_access.h>
+
+/*
+ * TODO:
+ *
+ * Migration compensation and resist like credit2 to better use cache;
+ * Lock Holder Problem, using yield?
+ * Self switch problem: VCPUs of the same domain may preempt each other;
+ */
+
+/*
+ * Design:
+ *
+ * This scheduler follows the Preemptive Global EDF theory from the real-time field.
+ * Each VCPU can have a dedicated period and budget. 
+ * While scheduled, a VCPU burns its budget.
+ * A VCPU has its budget replenished at the beginning of each of its periods;
+ * The VCPU discards its unused budget at the end of each of its periods.
+ * If a VCPU runs out of budget in a period, it has to wait until next period.
+ * The mechanism of how to burn a VCPU's budget depends on the server mechanism
+ * implemented for each VCPU.
+ *
+ * Server mechanism: a VCPU is implemented as a deferrable server.
+ * When a VCPU has a task running on it, its budget is continuously burned;
+ * When a VCPU has no task but still has budget left, its budget is preserved.
+ *
+ * Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
+ * At any scheduling point, the VCPU with earliest deadline has highest priority.
+ *
+ * Queue scheme: A global runqueue for each CPU pool. 
+ * The runqueue holds all runnable VCPUs. 
+ * VCPUs in the runqueue are divided into two parts: with and without remaining budget. 
+ * Within each part, VCPUs are sorted based on the EDF priority scheme.
+ *
+ * Scheduling quantum: 1 ms; but budget accounting is done at microsecond granularity.
+ *
+ * Note: cpumask and cpupool are supported.
+ */
+
+/*
+ * Locking:
+ * Just like credit2, a global system lock is used to protect the RunQ.
+ * The global lock is referenced by schedule_data.schedule_lock from all physical cpus.
+ *
+ * The lock is already grabbed when the wake/sleep/schedule functions in schedule.c are called.
+ *
+ * The functions that involve the RunQ and need to grab the lock are:
+ *    dump, vcpu_insert, vcpu_remove, context_saved.
+ */
+
+
+/*
+ * Default parameters: period and budget default to 10 and 4 ms, respectively
+ */
+#define RT_DEFAULT_PERIOD     (MICROSECS(10))
+#define RT_DEFAULT_BUDGET     (MICROSECS(4))
+
+/*
+ * Flags
+ */
+/* RT_scheduled: Is this vcpu either running on, or context-switching off,
+ * a physical cpu?
+ * + Accessed only with Runqueue lock held.
+ * + Set when chosen as next in rt_schedule().
+ * + Cleared after context switch has been saved in rt_context_saved()
+ * + Checked in vcpu_wake to see if we can add to the Runqueue, or if we should
+ *   set RT_delayed_runq_add
+ * + Checked to be false in runq_insert.
+ */
+#define __RT_scheduled            1
+#define RT_scheduled (1<<__RT_scheduled)
+/* RT_delayed_runq_add: Do we need to add this to the Runqueue once it is done
+ * being context-switched out?
+ * + Set when scheduling out in rt_schedule() if prev is runnable
+ * + Set in rt_vcpu_wake if it finds RT_scheduled set
+ * + Read in rt_context_saved(). If set, it adds prev to the Runqueue and
+ *   clears the bit.
+ *
+ */
+#define __RT_delayed_runq_add     2
+#define RT_delayed_runq_add (1<<__RT_delayed_runq_add)
+
+/*
+ * Debug only. Used to print out debug information
+ */
+#define printtime()\
+        ({s_time_t now = NOW(); \
+          printk("%u : %3ld.%3ldus : %-19s ",\
+          smp_processor_id(), now/MICROSECS(1), now%MICROSECS(1)/1000, __func__);} )
+
+/*
+ * rt tracing events ("only" 512 available!). Check
+ * include/public/trace.h for more details.
+ */
+#define TRC_RT_TICKLE           TRC_SCHED_CLASS_EVT(RT, 1)
+#define TRC_RT_RUNQ_PICK        TRC_SCHED_CLASS_EVT(RT, 2)
+#define TRC_RT_BUDGET_BURN      TRC_SCHED_CLASS_EVT(RT, 3)
+#define TRC_RT_BUDGET_REPLENISH TRC_SCHED_CLASS_EVT(RT, 4)
+#define TRC_RT_SCHED_TASKLET    TRC_SCHED_CLASS_EVT(RT, 5)
+#define TRC_RT_VCPU_DUMP        TRC_SCHED_CLASS_EVT(RT, 6)
+
+/*
+ * System-wide private data, including a global RunQueue
+ * The global lock is referenced by schedule_data.schedule_lock from all physical cpus.
+ * It can be grabbed via vcpu_schedule_lock_irq()
+ */
+struct rt_private {
+    spinlock_t lock;           /* The global coarse-grained lock */
+    struct list_head sdom;     /* list of available domains, used for dump */
+    struct list_head runq;     /* Ordered list of runnable VMs */
+    struct rt_vcpu *flag_vcpu; /* position of the first depleted vcpu (zero budget) */
+    cpumask_t cpus;            /* cpumask_t of available physical cpus */
+    cpumask_t tickled;         /* another cpu in the queue already tickled this one */
+};
+
+/*
+ * Virtual CPU
+ */
+struct rt_vcpu {
+    struct list_head runq_elem; /* On the runqueue list */
+    struct list_head sdom_elem; /* On the domain VCPU list */
+
+    /* Up-pointers */
+    struct rt_dom *sdom;
+    struct vcpu *vcpu;
+
+    /* VCPU parameters, in milliseconds */
+    s_time_t period;
+    s_time_t budget;
+
+    /* Current VCPU information, in nanoseconds */
+    long cur_budget;             /* current budget */
+    s_time_t last_start;        /* last start time */
+    s_time_t cur_deadline;      /* current deadline for EDF */
+
+    unsigned flags;             /* mark __RT_scheduled, etc.. */
+};
+
+/*
+ * Domain
+ */
+struct rt_dom {
+    struct list_head vcpu;      /* link its VCPUs */
+    struct list_head sdom_elem; /* link list on rt_priv */
+    struct domain *dom;         /* pointer to upper domain */
+};
+
+/*
+ * Useful inline functions
+ */
+static inline struct rt_private *RT_PRIV(const struct scheduler *_ops)
+{
+    return _ops->sched_data;
+}
+
+static inline struct rt_vcpu *RT_VCPU(const struct vcpu *_vcpu)
+{
+    return _vcpu->sched_priv;
+}
+
+static inline struct rt_dom *RT_DOM(const struct domain *_dom)
+{
+    return _dom->sched_priv;
+}
+
+static inline struct list_head *RUNQ(const struct scheduler *_ops)
+{
+    return &RT_PRIV(_ops)->runq;
+}
+
+//#define RT_PRIV(_ops)     ((struct rt_private *)((_ops)->sched_data))
+//#define RT_VCPU(_vcpu)    ((struct rt_vcpu *)(_vcpu)->sched_priv)
+//#define RT_DOM(_dom)      ((struct rt_dom *)(_dom)->sched_priv)
+//#define RUNQ(_ops)              (&RT_PRIV(_ops)->runq)
+
+/*
+ * RunQueue helper functions
+ */
+static int
+__vcpu_on_runq(const struct rt_vcpu *svc)
+{
+   return !list_empty(&svc->runq_elem);
+}
+
+static struct rt_vcpu *
+__runq_elem(struct list_head *elem)
+{
+    return list_entry(elem, struct rt_vcpu, runq_elem);
+}
+
+/*
+ * Debug related code, dump vcpu/cpu information
+ */
+static void
+rt_dump_vcpu(const struct rt_vcpu *svc)
+{
+    char cpustr[1024];
+
+    ASSERT(svc != NULL);
+    /* flag vcpu */
+    if( svc->sdom == NULL )
+        return;
+
+    cpumask_scnprintf(cpustr, sizeof(cpustr), svc->vcpu->cpu_hard_affinity);
+    printk("[%5d.%-2u] cpu %u, (%"PRId64", %"PRId64"), cur_b=%"PRId64" cur_d=%"PRId64" last_start=%"PRId64" onR=%d runnable=%d cpu_hard_affinity=%s ",
+            svc->vcpu->domain->domain_id,
+            svc->vcpu->vcpu_id,
+            svc->vcpu->processor,
+            svc->period,
+            svc->budget,
+            svc->cur_budget,
+            svc->cur_deadline,
+            svc->last_start,
+            __vcpu_on_runq(svc),
+            vcpu_runnable(svc->vcpu),
+            cpustr);
+    memset(cpustr, 0, sizeof(cpustr));
+    cpumask_scnprintf(cpustr, sizeof(cpustr), cpupool_scheduler_cpumask(svc->vcpu->domain->cpupool));
+    printk("cpupool=%s\n", cpustr);
+
+    /* TRACE */
+    {
+        struct {
+            unsigned dom:16,vcpu:16;
+            unsigned processor;
+            unsigned cur_budget_lo, cur_budget_hi, cur_deadline_lo, cur_deadline_hi;
+            unsigned is_vcpu_on_runq:16,is_vcpu_runnable:16;
+        } d;
+        d.dom = svc->vcpu->domain->domain_id;
+        d.vcpu = svc->vcpu->vcpu_id;
+        d.processor = svc->vcpu->processor;
+        d.cur_budget_lo = (unsigned) svc->cur_budget;
+        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
+        d.cur_deadline_lo = (unsigned) svc->cur_deadline;
+        d.cur_deadline_hi = (unsigned) (svc->cur_deadline >> 32);
+        d.is_vcpu_on_runq = __vcpu_on_runq(svc);
+        d.is_vcpu_runnable = vcpu_runnable(svc->vcpu);
+        trace_var(TRC_RT_VCPU_DUMP, 1,
+                  sizeof(d),
+                  (unsigned char *)&d);
+    }
+}
+
+static void
+rt_dump_pcpu(const struct scheduler *ops, int cpu)
+{
+    struct rt_vcpu *svc = RT_VCPU(curr_on_cpu(cpu));
+
+    printtime();
+    rt_dump_vcpu(svc);
+}
+
+/*
+ * Should not need a lock here; only printing information.
+ */
+static void
+rt_dump(const struct scheduler *ops)
+{
+    struct list_head *iter_sdom, *iter_svc, *runq, *iter;
+    struct rt_private *prv = RT_PRIV(ops);
+    struct rt_vcpu *svc;
+    unsigned int cpu = 0;
+    unsigned int loop = 0;
+
+    printtime();
+    printk("Priority Scheme: EDF\n");
+
+    printk("PCPU info:\n");
+    for_each_cpu(cpu, &prv->cpus) 
+        rt_dump_pcpu(ops, cpu);
+
+    printk("Global RunQueue info:\n");
+    loop = 0;
+    runq = RUNQ(ops);
+    list_for_each( iter, runq ) 
+    {
+        svc = __runq_elem(iter);
+        printk("\t%3d: ", ++loop);
+        rt_dump_vcpu(svc);
+    }
+
+    printk("Domain info:\n");
+    loop = 0;
+    list_for_each( iter_sdom, &prv->sdom ) 
+    {
+        struct rt_dom *sdom;
+        sdom = list_entry(iter_sdom, struct rt_dom, sdom_elem);
+        printk("\tdomain: %d\n", sdom->dom->domain_id);
+
+        list_for_each( iter_svc, &sdom->vcpu ) 
+        {
+            svc = list_entry(iter_svc, struct rt_vcpu, sdom_elem);
+            printk("\t\t%3d: ", ++loop);
+            rt_dump_vcpu(svc);
+        }
+    }
+
+    printk("\n");
+}
+
+static inline void
+__runq_remove(struct rt_vcpu *svc)
+{
+    if ( __vcpu_on_runq(svc) )
+        list_del_init(&svc->runq_elem);
+}
+
+/*
+ * Insert a vcpu into the RunQ based on the vcpus' deadlines:
+ * EDF scheduling policy: a vcpu with a smaller deadline has higher priority;
+ * the vcpu svc to be inserted is placed just before the very first
+ * vcpu iter_svc in the RunQ whose deadline is equal to or larger than
+ * svc's deadline.
+ */
+static void
+__runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
+{
+    struct rt_private *prv = RT_PRIV(ops);
+    struct list_head *runq = RUNQ(ops);
+    struct list_head *iter;
+    spinlock_t *schedule_lock;
+    
+    schedule_lock = per_cpu(schedule_data, svc->vcpu->processor).schedule_lock;
+    ASSERT( spin_is_locked(schedule_lock) );
+    
+    /* Debug only */
+    if ( __vcpu_on_runq(svc) )
+    {
+        rt_dump(ops);
+    }
+    ASSERT( !__vcpu_on_runq(svc) );
+
+    /* svc still has budget */
+    if ( svc->cur_budget > 0 ) 
+    {
+        list_for_each(iter, runq) 
+        {
+            struct rt_vcpu * iter_svc = __runq_elem(iter);
+            if ( iter_svc->cur_budget == 0 ||
+                 svc->cur_deadline <= iter_svc->cur_deadline )
+                    break;
+         }
+        list_add_tail(&svc->runq_elem, iter);
+     }
+    else 
+    { /* svc has no budget */
+        list_add(&svc->runq_elem, &prv->flag_vcpu->runq_elem);
+    }
+}
+
+/*
+ * Init/Free related code
+ */
+static int
+rt_init(struct scheduler *ops)
+{
+    struct rt_private *prv = xzalloc(struct rt_private);
+
+    if ( prv == NULL )
+        return -ENOMEM;
+
+    spin_lock_init(&prv->lock);
+    INIT_LIST_HEAD(&prv->sdom);
+    INIT_LIST_HEAD(&prv->runq);
+
+    prv->flag_vcpu = xzalloc(struct rt_vcpu);
+    prv->flag_vcpu->cur_budget = 0;
+    prv->flag_vcpu->sdom = NULL; /* distinguish this vcpu from others */
+    list_add(&prv->flag_vcpu->runq_elem, &prv->runq);
+
+    ops->sched_data = prv;
+
+    printtime();
+    printk("\n");
+
+    return 0;
+}
+
+static void
+rt_deinit(const struct scheduler *ops)
+{
+    struct rt_private *prv = RT_PRIV(ops);
+
+    printtime();
+    printk("\n");
+    xfree(prv->flag_vcpu);
+    xfree(prv);
+}
+
+/* 
+ * Point the per_cpu spinlock to the global system lock; all cpus share the same global lock.
+ */
+static void *
+rt_alloc_pdata(const struct scheduler *ops, int cpu)
+{
+    struct rt_private *prv = RT_PRIV(ops);
+
+    cpumask_set_cpu(cpu, &prv->cpus);
+
+    per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
+
+    printtime();
+    printk("%s total cpus: %d", __func__, cpumask_weight(&prv->cpus));
+    /* same as credit2, not a bogus pointer */
+    return (void *)1;
+}
+
+static void
+rt_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
+{
+    struct rt_private * prv = RT_PRIV(ops);
+    cpumask_clear_cpu(cpu, &prv->cpus);
+    printtime();
+    printk("%s cpu=%d\n", __FUNCTION__, cpu);
+}
+
+static void *
+rt_alloc_domdata(const struct scheduler *ops, struct domain *dom)
+{
+    unsigned long flags;
+    struct rt_dom *sdom;
+    struct rt_private * prv = RT_PRIV(ops);
+
+    printtime();
+    printk("dom=%d\n", dom->domain_id);
+
+    sdom = xzalloc(struct rt_dom);
+    if ( sdom == NULL ) 
+    {
+        printk("%s, xzalloc failed\n", __func__);
+        return NULL;
+    }
+
+    INIT_LIST_HEAD(&sdom->vcpu);
+    INIT_LIST_HEAD(&sdom->sdom_elem);
+    sdom->dom = dom;
+
+    /* spinlock here to insert the dom */
+    spin_lock_irqsave(&prv->lock, flags);
+    list_add_tail(&sdom->sdom_elem, &(prv->sdom));
+    spin_unlock_irqrestore(&prv->lock, flags);
+
+    return sdom;
+}
+
+static void
+rt_free_domdata(const struct scheduler *ops, void *data)
+{
+    unsigned long flags;
+    struct rt_dom *sdom = data;
+    struct rt_private *prv = RT_PRIV(ops);
+
+    printtime();
+    printk("dom=%d\n", sdom->dom->domain_id);
+
+    spin_lock_irqsave(&prv->lock, flags);
+    list_del_init(&sdom->sdom_elem);
+    spin_unlock_irqrestore(&prv->lock, flags);
+    xfree(data);
+}
+
+static int
+rt_dom_init(const struct scheduler *ops, struct domain *dom)
+{
+    struct rt_dom *sdom;
+
+    printtime();
+    printk("dom=%d\n", dom->domain_id);
+
+    /* IDLE Domain does not link on rt_private */
+    if ( is_idle_domain(dom) ) 
+        return 0;
+
+    sdom = rt_alloc_domdata(ops, dom);
+    if ( sdom == NULL ) 
+    {
+        printk("%s, failed\n", __func__);
+        return -ENOMEM;
+    }
+    dom->sched_priv = sdom;
+
+    return 0;
+}
+
+static void
+rt_dom_destroy(const struct scheduler *ops, struct domain *dom)
+{
+    printtime();
+    printk("dom=%d\n", dom->domain_id);
+
+    rt_free_domdata(ops, RT_DOM(dom));
+}
+
+static void *
+rt_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
+{
+    struct rt_vcpu *svc;
+    s_time_t now = NOW();
+    long count;
+
+    /* Allocate per-VCPU info */
+    svc = xzalloc(struct rt_vcpu);
+    if ( svc == NULL ) 
+    {
+        printk("%s, xzalloc failed\n", __func__);
+        return NULL;
+    }
+
+    INIT_LIST_HEAD(&svc->runq_elem);
+    INIT_LIST_HEAD(&svc->sdom_elem);
+    svc->flags = 0U;
+    svc->sdom = dd;
+    svc->vcpu = vc;
+    svc->last_start = 0;            /* init last_start is 0 */
+
+    svc->period = RT_DEFAULT_PERIOD;
+    if ( !is_idle_vcpu(vc) )
+        svc->budget = RT_DEFAULT_BUDGET;
+
+    count = (now/MICROSECS(svc->period)) + 1;
+    /* sync all VCPUs' start times to 0 */
+    svc->cur_deadline += count * MICROSECS(svc->period);
+
+    svc->cur_budget = svc->budget*1000; /* counting in microseconds level */
+    /* Debug only: dump new vcpu's info */
+    printtime();
+    rt_dump_vcpu(svc);
+
+    return svc;
+}
+
+static void
+rt_free_vdata(const struct scheduler *ops, void *priv)
+{
+    struct rt_vcpu *svc = priv;
+
+    /* Debug only: dump freed vcpu's info */
+    printtime();
+    rt_dump_vcpu(svc);
+    xfree(svc);
+}
+
+/*
+ * TODO: Do we need to add vc to the new RunQ?
+ * This function is called in sched_move_domain() in schedule.c.
+ * When a domain is moved to a new cpupool,
+ * we may have to add vc to the RunQ of the new cpupool.
+ */
+static void
+rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
+{
+    struct rt_vcpu *svc = RT_VCPU(vc);
+
+    /* Debug only: dump info of vcpu to insert */
+    printtime();
+    rt_dump_vcpu(svc);
+
+    /* do not add the idle vcpu to the domain's vcpu list */
+    if ( is_idle_vcpu(vc) )
+        return;
+
+    list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);   /* add to dom vcpu list */
+}
+
+/*
+ * TODO: same as rt_vcpu_insert()
+ */
+static void
+rt_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
+{
+    struct rt_vcpu * const svc = RT_VCPU(vc);
+    struct rt_dom * const sdom = svc->sdom;
+
+    printtime();
+    rt_dump_vcpu(svc);
+
+    BUG_ON( sdom == NULL );
+    BUG_ON( __vcpu_on_runq(svc) );
+
+    if ( !is_idle_vcpu(vc) ) 
+        list_del_init(&svc->sdom_elem);
+}
+
+/* 
+ * Pick a valid CPU for the vcpu vc
+ * A valid CPU of a vcpu is in the intersection of the vcpu's affinity and the available cpus
+ */
+static int
+rt_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
+{
+    cpumask_t cpus;
+    cpumask_t *online;
+    int cpu;
+    struct rt_private * prv = RT_PRIV(ops);
+
+    online = cpupool_scheduler_cpumask(vc->domain->cpupool);
+    cpumask_and(&cpus, &prv->cpus, online);
+    cpumask_and(&cpus, &cpus, vc->cpu_hard_affinity);
+
+    cpu = cpumask_test_cpu(vc->processor, &cpus)
+            ? vc->processor 
+            : cpumask_cycle(vc->processor, &cpus);
+    ASSERT( !cpumask_empty(&cpus) && cpumask_test_cpu(cpu, &cpus) );
+
+    return cpu;
+}
+
+/*
+ * Burn budget at microsecond level. 
+ */
+static void
+burn_budgets(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now) 
+{
+    s_time_t delta;
+    long count = 0;
+
+    /* don't burn budget for idle VCPU */
+    if ( is_idle_vcpu(svc->vcpu) ) 
+    {
+        return;
+    }
+
+    /* first time called for this svc, update last_start */
+    if ( svc->last_start == 0 ) 
+    {
+        svc->last_start = now;
+        return;
+    }
+
+    /*
+     * update deadline info: when the deadline is in the past,
+     * it needs to be updated to the deadline of the current period,
+     * and the budget replenished
+     */
+    delta = now - svc->cur_deadline;
+    if ( delta >= 0 ) 
+    {
+        count = ( delta/MICROSECS(svc->period) ) + 1;
+        svc->cur_deadline += count * MICROSECS(svc->period);
+        svc->cur_budget = svc->budget * 1000;
+
+        /* TRACE */
+        {
+            struct {
+                unsigned dom:16,vcpu:16;
+                unsigned cur_budget_lo, cur_budget_hi;
+            } d;
+            d.dom = svc->vcpu->domain->domain_id;
+            d.vcpu = svc->vcpu->vcpu_id;
+            d.cur_budget_lo = (unsigned) svc->cur_budget;
+            d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
+            trace_var(TRC_RT_BUDGET_REPLENISH, 1,
+                      sizeof(d),
+                      (unsigned char *) &d);
+        }
+
+        return;
+    }
+
+    /* burn at nanoseconds level */
+    delta = now - svc->last_start;
+    /* 
+     * delta < 0 only happens in nested virtualization;
+     * TODO: how should we handle delta < 0 in a better way? */
+    if ( delta < 0 ) 
+    {
+        printk("%s, ATTENTION: now is behind last_start! delta = %ld for ",
+                __func__, delta);
+        rt_dump_vcpu(svc);
+        svc->last_start = now;  /* update last_start */
+        svc->cur_budget = 0;   /* FIXME: should we recover like this? */
+        return;
+    }
+
+    if ( svc->cur_budget == 0 ) 
+        return;
+
+    svc->cur_budget -= delta;
+    if ( svc->cur_budget < 0 ) 
+        svc->cur_budget = 0;
+
+    /* TRACE */
+    {
+        struct {
+            unsigned dom:16, vcpu:16;
+            unsigned cur_budget_lo;
+            unsigned cur_budget_hi;
+            int delta;
+        } d;
+        d.dom = svc->vcpu->domain->domain_id;
+        d.vcpu = svc->vcpu->vcpu_id;
+        d.cur_budget_lo = (unsigned) svc->cur_budget;
+        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
+        d.delta = delta;
+        trace_var(TRC_RT_BUDGET_BURN, 1,
+                  sizeof(d),
+                  (unsigned char *) &d);
+    }
+}
+
+/* 
+ * The RunQ is sorted. Pick the first vcpu within the cpumask; if there is none, return NULL.
+ * The lock is grabbed before calling this function.
+ */
+static struct rt_vcpu *
+__runq_pick(const struct scheduler *ops, cpumask_t mask)
+{
+    struct list_head *runq = RUNQ(ops);
+    struct list_head *iter;
+    struct rt_vcpu *svc = NULL;
+    struct rt_vcpu *iter_svc = NULL;
+    cpumask_t cpu_common;
+    cpumask_t *online;
+    struct rt_private * prv = RT_PRIV(ops);
+
+    list_for_each(iter, runq) 
+    {
+        iter_svc = __runq_elem(iter);
+
+        /* flag vcpu */
+        if(iter_svc->sdom == NULL)
+            break;
+
+        /* mask is intersection of cpu_hard_affinity and cpupool and priv->cpus */
+        online = cpupool_scheduler_cpumask(iter_svc->vcpu->domain->cpupool);
+        cpumask_and(&cpu_common, online, &prv->cpus);
+        cpumask_and(&cpu_common, &cpu_common, iter_svc->vcpu->cpu_hard_affinity);
+        cpumask_and(&cpu_common, &mask, &cpu_common);
+        if ( cpumask_empty(&cpu_common) )
+            continue;
+
+        ASSERT( iter_svc->cur_budget > 0 );
+
+        svc = iter_svc;
+        break;
+    }
+
+    /* TRACE */
+    {
+        if( svc != NULL )
+        {
+            struct {
+                unsigned dom:16, vcpu:16;
+                unsigned cur_deadline_lo, cur_deadline_hi;
+                unsigned cur_budget_lo, cur_budget_hi;
+            } d;
+            d.dom = svc->vcpu->domain->domain_id;
+            d.vcpu = svc->vcpu->vcpu_id;
+            d.cur_deadline_lo = (unsigned) svc->cur_deadline;
+            d.cur_deadline_hi = (unsigned) (svc->cur_deadline >> 32);
+            d.cur_budget_lo = (unsigned) svc->cur_budget;
+            d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
+            trace_var(TRC_RT_RUNQ_PICK, 1,
+                      sizeof(d),
+                      (unsigned char *) &d);
+        }
+        else
+            trace_var(TRC_RT_RUNQ_PICK, 1, 0, NULL);
+    }
+
+    return svc;
+}
+
+/*
+ * Update vcpus' budgets and deadlines, and keep the runq sorted by re-inserting each modified vcpu.
+ * The lock is grabbed before calling this function.
+ */
+static void
+__repl_update(const struct scheduler *ops, s_time_t now)
+{
+    struct list_head *runq = RUNQ(ops);
+    struct list_head *iter;
+    struct list_head *tmp;
+    struct rt_vcpu *svc = NULL;
+
+    s_time_t diff;
+    long count;
+
+    list_for_each_safe(iter, tmp, runq) 
+    {
+        svc = __runq_elem(iter);
+
+        /* do not update the flag_vcpu's budget */
+        if(svc->sdom == NULL)
+            continue;
+
+        diff = now - svc->cur_deadline;
+        if ( diff > 0 ) 
+        {
+            count = (diff/MICROSECS(svc->period)) + 1;
+            svc->cur_deadline += count * MICROSECS(svc->period);
+            svc->cur_budget = svc->budget * 1000;
+            __runq_remove(svc);
+            __runq_insert(ops, svc);
+        }
+    }
+}
+
+/* 
+ * schedule function for rt scheduler.
+ * The lock is already grabbed in schedule.c, no need to lock here 
+ */
+static struct task_slice
+rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled)
+{
+    const int cpu = smp_processor_id();
+    struct rt_private * prv = RT_PRIV(ops);
+    struct rt_vcpu * const scurr = RT_VCPU(current);
+    struct rt_vcpu * snext = NULL;
+    struct task_slice ret = { .migrated = 0 };
+
+    /* clear ticked bit now that we've been scheduled */
+    if ( cpumask_test_cpu(cpu, &prv->tickled) )
+        cpumask_clear_cpu(cpu, &prv->tickled);
+
+    /* burn_budget would return for IDLE VCPU */
+    burn_budgets(ops, scurr, now);
+
+    __repl_update(ops, now);
+
+    if ( tasklet_work_scheduled ) 
+    {
+        snext = RT_VCPU(idle_vcpu[cpu]);
+    } 
+    else 
+    {
+        cpumask_t cur_cpu;
+        cpumask_clear(&cur_cpu);
+        cpumask_set_cpu(cpu, &cur_cpu);
+        snext = __runq_pick(ops, cur_cpu);
+        if ( snext == NULL )
+            snext = RT_VCPU(idle_vcpu[cpu]);
+
+        /* if scurr has higher priority and budget, still pick scurr */
+        if ( !is_idle_vcpu(current) &&
+             vcpu_runnable(current) &&
+             scurr->cur_budget > 0 &&
+             ( is_idle_vcpu(snext->vcpu) ||
+               scurr->cur_deadline <= snext->cur_deadline ) ) 
+            snext = scurr;
+    }
+
+    if ( snext != scurr &&
+         !is_idle_vcpu(current) &&
+         vcpu_runnable(current) )
+        set_bit(__RT_delayed_runq_add, &scurr->flags);
+    
+
+    snext->last_start = now;
+    if ( !is_idle_vcpu(snext->vcpu) ) 
+    {
+        if ( snext != scurr ) 
+        {
+            __runq_remove(snext);
+            set_bit(__RT_scheduled, &snext->flags);
+        }
+        if ( snext->vcpu->processor != cpu ) 
+        {
+            snext->vcpu->processor = cpu;
+            ret.migrated = 1;
+        }
+    }
+
+    ret.time = MILLISECS(1); /* sched quantum */
+    ret.task = snext->vcpu;
+
+    /* TRACE */
+    {
+        struct {
+            unsigned dom:16,vcpu:16;
+            unsigned cur_deadline_lo, cur_deadline_hi;
+            unsigned cur_budget_lo, cur_budget_hi;
+        } d;
+        d.dom = snext->vcpu->domain->domain_id;
+        d.vcpu = snext->vcpu->vcpu_id;
+        d.cur_deadline_lo = (unsigned) snext->cur_deadline;
+        d.cur_deadline_hi = (unsigned) (snext->cur_deadline >> 32);
+        d.cur_budget_lo = (unsigned) snext->cur_budget;
+        d.cur_budget_hi = (unsigned) (snext->cur_budget >> 32);
+        trace_var(TRC_RT_SCHED_TASKLET, 1,
+                  sizeof(d),
+                  (unsigned char *)&d);
+    }
+
+    return ret;
+}
+
+/*
+ * Remove VCPU from RunQ
+ * The lock is already grabbed in schedule.c, no need to lock here 
+ */
+static void
+rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
+{
+    struct rt_vcpu * const svc = RT_VCPU(vc);
+
+    BUG_ON( is_idle_vcpu(vc) );
+
+    if ( curr_on_cpu(vc->processor) == vc ) 
+        cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
+    else if ( __vcpu_on_runq(svc) ) 
+        __runq_remove(svc);
+    else if ( test_bit(__RT_delayed_runq_add, &svc->flags) )
+        clear_bit(__RT_delayed_runq_add, &svc->flags);
+}
+
+/*
+ * Pick a vcpu on a cpu to kick out, to make room for the running candidate.
+ * Called by wake() and context_saved().
+ * We have a running candidate here; the kick logic is:
+ * Among all the cpus that are within the cpu affinity:
+ * 1) if new's cpu is idle, kick it. This benefits cache locality.
+ * 2) if there is any idle cpu, kick it.
+ * 3) now all pcpus are busy; among all the running vcpus, pick the one with
+ *    the lowest priority; if snext has higher priority, kick it.
+ *
+ * TODO:
+ * 1) what if these two vcpus belong to the same domain?
+ *    replacing a vcpu belonging to the same domain introduces more overhead
+ *
+ * lock is grabbed before calling this function 
+ */
+static void
+runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
+{
+    struct rt_private * prv = RT_PRIV(ops);
+    struct rt_vcpu * latest_deadline_vcpu = NULL;    /* lowest priority scheduled */
+    struct rt_vcpu * iter_svc;
+    struct vcpu * iter_vc;
+    int cpu = 0, cpu_to_tickle = 0;
+    cpumask_t not_tickled;
+    cpumask_t *online;
+
+    if ( new == NULL || is_idle_vcpu(new->vcpu) ) 
+        return;
+
+    online = cpupool_scheduler_cpumask(new->vcpu->domain->cpupool);
+    cpumask_and(&not_tickled, online, &prv->cpus);
+    cpumask_and(&not_tickled, &not_tickled, new->vcpu->cpu_hard_affinity);
+    cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
+
+    /* 1) if new's previous cpu is idle, kick it for cache benefit */
+    if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) ) 
+    {
+        cpu_to_tickle = new->vcpu->processor;
+        goto out;
+    }
+
+    /* 2) if there are any idle pcpu, kick it */
+    /* The same loop also find the one with lowest priority */
+    for_each_cpu(cpu, &not_tickled) 
+    {
+        iter_vc = curr_on_cpu(cpu);
+        if ( is_idle_vcpu(iter_vc) ) 
+        {
+            cpu_to_tickle = cpu;
+            goto out;
+        }
+        iter_svc = RT_VCPU(iter_vc);
+        if ( latest_deadline_vcpu == NULL || 
+             iter_svc->cur_deadline > latest_deadline_vcpu->cur_deadline )
+            latest_deadline_vcpu = iter_svc;
+    }
+
+    /* 3) candidate has higher priority, kick out the lowest-priority vcpu */
+    if ( latest_deadline_vcpu != NULL && new->cur_deadline < latest_deadline_vcpu->cur_deadline ) 
+    {
+        cpu_to_tickle = latest_deadline_vcpu->vcpu->processor;
+        goto out;
+    }
+
+out:
+    /* TRACE */ 
+    {
+        struct {
+            unsigned cpu:8, pad:24;
+        } d;
+        d.cpu = cpu_to_tickle;
+        d.pad = 0;
+        trace_var(TRC_RT_TICKLE, 0,
+                  sizeof(d),
+                  (unsigned char *)&d);
+    }
+
+    cpumask_set_cpu(cpu_to_tickle, &prv->tickled);
+    cpu_raise_softirq(cpu_to_tickle, SCHEDULE_SOFTIRQ);
+    return;    
+}
+
+/* 
+ * Should always wake up a runnable vcpu and put it back on the RunQ.
+ * Check priorities to decide whether to tickle a cpu.
+ * The lock is already grabbed in schedule.c, no need to lock here.
+ * TODO: what if these two vcpus belong to the same domain?
+ */
+static void
+rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
+{
+    struct rt_vcpu * const svc = RT_VCPU(vc);
+    s_time_t diff;
+    s_time_t now = NOW();
+    long count = 0;
+    struct rt_private * prv = RT_PRIV(ops);
+    struct rt_vcpu * snext = NULL;        /* highest priority on RunQ */
+
+    BUG_ON( is_idle_vcpu(vc) );
+
+    if ( unlikely(curr_on_cpu(vc->processor) == vc) ) 
+        return;
+
+    /* already on the RunQ, nothing to do */
+    if ( unlikely(__vcpu_on_runq(svc)) ) 
+        return;
+
+    /* If context hasn't been saved for this vcpu yet, we can't put it on
+     * the Runqueue. Instead, we set a flag so that it will be put on the Runqueue
+     * after the context has been saved. */
+    if ( unlikely(test_bit(__RT_scheduled, &svc->flags)) ) 
+    {
+        set_bit(__RT_delayed_runq_add, &svc->flags);
+        return;
+    }
+
+    /* update deadline info */
+    diff = now - svc->cur_deadline;
+    if ( diff >= 0 ) 
+    {
+        count = ( diff/MICROSECS(svc->period) ) + 1;
+        svc->cur_deadline += count * MICROSECS(svc->period);
+        svc->cur_budget = svc->budget * 1000;
+    }
+
+    __runq_insert(ops, svc);
+    __repl_update(ops, now);
+    snext = __runq_pick(ops, prv->cpus);    /* pick snext from ALL valid cpus */
+    runq_tickle(ops, snext);
+
+    return;
+}
+
+/* 
+ * scurr has finished context switch, insert it back to the RunQ,
+ * and then pick the highest priority vcpu from runq to run 
+ */
+static void
+rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
+{
+    struct rt_vcpu * svc = RT_VCPU(vc);
+    struct rt_vcpu * snext = NULL;
+    struct rt_private * prv = RT_PRIV(ops);
+    spinlock_t *lock = vcpu_schedule_lock_irq(vc);
+
+    clear_bit(__RT_scheduled, &svc->flags);
+    /* do not insert the idle vcpu into the runq */
+    if ( is_idle_vcpu(vc) ) 
+        goto out;
+
+    if ( test_and_clear_bit(__RT_delayed_runq_add, &svc->flags) && 
+         likely(vcpu_runnable(vc)) ) 
+    {
+        __runq_insert(ops, svc);
+        __repl_update(ops, NOW());
+        snext = __runq_pick(ops, prv->cpus);    /* pick snext from ALL cpus */
+        runq_tickle(ops, snext);
+    }
+out:
+    vcpu_schedule_unlock_irq(lock, vc);
+}
+
+/*
+ * set/get each vcpu info of each domain
+ */
+static int
+rt_dom_cntl(
+    const struct scheduler *ops, 
+    struct domain *d, 
+    struct xen_domctl_scheduler_op *op)
+{
+    struct rt_dom * const sdom = RT_DOM(d);
+    struct list_head *iter;
+    int vcpu_index = 0;
+    int rc = 0;
+
+    switch ( op->cmd )
+    {
+    case XEN_DOMCTL_SCHEDOP_getnumvcpus:
+        op->u.rt.nr_vcpus = 0;
+        list_for_each( iter, &sdom->vcpu ) 
+            vcpu_index++;
+        op->u.rt.nr_vcpus = vcpu_index;
+        break;
+    case XEN_DOMCTL_SCHEDOP_getinfo:
+        /* for debug use: whenever Dom0's parameters are adjusted, do a global dump */
+        if ( d->domain_id == 0 ) 
+            rt_dump(ops);
+
+        vcpu_index = 0;
+        list_for_each( iter, &sdom->vcpu ) 
+        {
+            xen_domctl_sched_rt_params_t local_sched;
+            struct rt_vcpu * svc = list_entry(iter, struct rt_vcpu, sdom_elem);
+
+            if( vcpu_index >= op->u.rt.nr_vcpus )
+                break;
+
+            local_sched.budget = svc->budget;
+            local_sched.period = svc->period;
+            local_sched.index = vcpu_index;
+            if( copy_to_guest_offset(op->u.rt.vcpu, vcpu_index, &local_sched, 1) )
+            {
+                rc = -EFAULT;
+                break;
+            }
+            vcpu_index++;
+        }
+        break;
+    case XEN_DOMCTL_SCHEDOP_putinfo:
+        list_for_each( iter, &sdom->vcpu ) 
+        {
+            struct rt_vcpu * svc = list_entry(iter, struct rt_vcpu, sdom_elem);
+
+            /* adjust per VCPU parameter */
+            if ( op->u.rt.vcpu_index == svc->vcpu->vcpu_id ) 
+            { 
+                vcpu_index = op->u.rt.vcpu_index;
+
+                if ( vcpu_index < 0 ) 
+                    printk("XEN_DOMCTL_SCHEDOP_putinfo: vcpu_index=%d\n",
+                            vcpu_index);
+                else
+                    printk("XEN_DOMCTL_SCHEDOP_putinfo: "
+                            "vcpu_index=%d, period=%"PRId64", budget=%"PRId64"\n",
+                            vcpu_index, op->u.rt.period, op->u.rt.budget);
+
+                svc->period = op->u.rt.period;
+                svc->budget = op->u.rt.budget;
+
+                break;
+            }
+        }
+        break;
+    }
+
+    return rc;
+}
+
+static struct rt_private _rt_priv;
+
+const struct scheduler sched_rt_def = {
+    .name           = "SMP RT DS Scheduler",
+    .opt_name       = "rt_ds",
+    .sched_id       = XEN_SCHEDULER_RT_DS,
+    .sched_data     = &_rt_priv,
+
+    .dump_cpu_state = rt_dump_pcpu,
+    .dump_settings  = rt_dump,
+    .init           = rt_init,
+    .deinit         = rt_deinit,
+    .alloc_pdata    = rt_alloc_pdata,
+    .free_pdata     = rt_free_pdata,
+    .alloc_domdata  = rt_alloc_domdata,
+    .free_domdata   = rt_free_domdata,
+    .init_domain    = rt_dom_init,
+    .destroy_domain = rt_dom_destroy,
+    .alloc_vdata    = rt_alloc_vdata,
+    .free_vdata     = rt_free_vdata,
+    .insert_vcpu    = rt_vcpu_insert,
+    .remove_vcpu    = rt_vcpu_remove,
+
+    .adjust         = rt_dom_cntl,
+
+    .pick_cpu       = rt_cpu_pick,
+    .do_schedule    = rt_schedule,
+    .sleep          = rt_vcpu_sleep,
+    .wake           = rt_vcpu_wake,
+    .context_saved  = rt_context_saved,
+};
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 55503e0..7d2c6d1 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -69,6 +69,7 @@ static const struct scheduler *schedulers[] = {
     &sched_credit_def,
     &sched_credit2_def,
     &sched_arinc653_def,
+    &sched_rt_def,
 };
 
 static struct scheduler __read_mostly ops;
@@ -1090,7 +1091,8 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
 
     if ( (op->sched_id != DOM2OP(d)->sched_id) ||
          ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
-          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
+          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
+          (op->cmd != XEN_DOMCTL_SCHEDOP_getnumvcpus)) )
         return -EINVAL;
 
     /* NB: the pluggable scheduler code needs to take care
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 5b11bbf..27d01c1 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
 typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
 
+/*
+ * This structure is used to pass rt scheduler parameters from a
+ * privileged domain to Xen
+ */
+struct xen_domctl_sched_rt_params {
+    /* get vcpus' info */
+    uint64_t period; /* s_time_t type */
+    uint64_t budget;
+    uint16_t index;
+    uint16_t padding[3];
+};
+typedef struct xen_domctl_sched_rt_params xen_domctl_sched_rt_params_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_sched_rt_params_t);
 
 /* XEN_DOMCTL_scheduler_op */
 /* Scheduler types. */
@@ -346,9 +359,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
 #define XEN_SCHEDULER_CREDIT   5
 #define XEN_SCHEDULER_CREDIT2  6
 #define XEN_SCHEDULER_ARINC653 7
+#define XEN_SCHEDULER_RT_DS    8
+
 /* Set or get info? */
-#define XEN_DOMCTL_SCHEDOP_putinfo 0
-#define XEN_DOMCTL_SCHEDOP_getinfo 1
+#define XEN_DOMCTL_SCHEDOP_putinfo      0
+#define XEN_DOMCTL_SCHEDOP_getinfo      1
+#define XEN_DOMCTL_SCHEDOP_getnumvcpus   2
 struct xen_domctl_scheduler_op {
     uint32_t sched_id;  /* XEN_SCHEDULER_* */
     uint32_t cmd;       /* XEN_DOMCTL_SCHEDOP_* */
@@ -367,6 +383,16 @@ struct xen_domctl_scheduler_op {
         struct xen_domctl_sched_credit2 {
             uint16_t weight;
         } credit2;
+        struct xen_domctl_sched_rt{
+            /* get vcpus' params */
+            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
+            uint16_t nr_vcpus;
+            /* set one vcpu's params */
+            uint16_t vcpu_index;
+            uint16_t padding[2];
+            uint64_t period;
+            uint64_t budget;
+        } rt;
     } u;
 };
 typedef struct xen_domctl_scheduler_op xen_domctl_scheduler_op_t;
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index cfcf4aa..87340c4 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -77,6 +77,7 @@
 #define TRC_SCHED_CSCHED2  1
 #define TRC_SCHED_SEDF     2
 #define TRC_SCHED_ARINC653 3
+#define TRC_SCHED_RT       4
 
 /* Per-scheduler tracing */
 #define TRC_SCHED_CLASS_EVT(_c, _e) \
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 4164dff..bcbe234 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -169,6 +169,7 @@ extern const struct scheduler sched_sedf_def;
 extern const struct scheduler sched_credit_def;
 extern const struct scheduler sched_credit2_def;
 extern const struct scheduler sched_arinc653_def;
+extern const struct scheduler sched_rt_def;
 
 
 struct cpupool
-- 
1.7.9.5


* [PATCH v1 2/4] libxc: add rt scheduler
  2014-08-24 22:58 Introduce rt real-time scheduler for Xen Meng Xu
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
@ 2014-08-24 22:58 ` Meng Xu
  2014-09-05 10:34   ` Dario Faggioli
  2014-08-24 22:58 ` [PATCH v1 3/4] libxl: " Meng Xu
  2014-08-24 22:58 ` [PATCH v1 4/4] xl: introduce " Meng Xu
  3 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-08-24 22:58 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	dario.faggioli, ian.jackson, xumengpanda, Meng Xu, JBeulich,
	chaowang, lichong659, dgolomb

Add xc_sched_rt_* functions to interact with Xen to set/get a domain's
parameters for the rt scheduler.
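
As a rough illustration (not part of this patch; error handling is simplified and the helper name adjust_vcpu0_period is made up), a caller could use these functions along the following lines to read a domain's per-VCPU parameters and then update one VCPU:

#include <stdlib.h>
#include <xenctrl.h>

/* Sketch: double vcpu 0's period for a domain using the new interfaces. */
static int adjust_vcpu0_period(xc_interface *xch, uint32_t domid)
{
    struct xen_domctl_sched_rt_params *params;
    uint16_t nr_vcpus;
    int rc;

    if ( xc_sched_rt_domain_get_num_vcpus(xch, domid, &nr_vcpus) )
        return -1;

    params = calloc(nr_vcpus, sizeof(*params));
    if ( params == NULL )
        return -1;

    rc = xc_sched_rt_domain_get(xch, domid, params, nr_vcpus);
    if ( rc == 0 )
    {
        params[0].period *= 2;   /* .index selects which vcpu is updated */
        rc = xc_sched_rt_domain_set(xch, domid, &params[0]);
    }

    free(params);
    return rc;
}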

Signed-off-by: Sisu Xi <xisisu@gmail.com>
Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
---
 tools/libxc/Makefile  |    1 +
 tools/libxc/xc_rt.c   |   90 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/libxc/xenctrl.h |   12 +++++++
 3 files changed, 103 insertions(+)
 create mode 100644 tools/libxc/xc_rt.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 22eef8e..c2b02a4 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -20,6 +20,7 @@ CTRL_SRCS-y       += xc_sedf.c
 CTRL_SRCS-y       += xc_csched.c
 CTRL_SRCS-y       += xc_csched2.c
 CTRL_SRCS-y       += xc_arinc653.c
+CTRL_SRCS-y       += xc_rt.c
 CTRL_SRCS-y       += xc_tbuf.c
 CTRL_SRCS-y       += xc_pm.c
 CTRL_SRCS-y       += xc_cpu_hotplug.c
diff --git a/tools/libxc/xc_rt.c b/tools/libxc/xc_rt.c
new file mode 100644
index 0000000..e2ddda5
--- /dev/null
+++ b/tools/libxc/xc_rt.c
@@ -0,0 +1,90 @@
+/****************************************************************************
+ *
+ *        File: xc_rt.c
+ *      Author: Sisu Xi 
+ *              Meng Xu
+ *
+ * Description: XC Interface to the rt scheduler
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include "xc_private.h"
+
+int xc_sched_rt_domain_set(xc_interface *xch,
+                           uint32_t domid,
+                           struct xen_domctl_sched_rt_params *sdom)
+{
+    int rc;
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_scheduler_op;
+    domctl.domain = (domid_t) domid;
+    domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_RT_DS;
+    domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_putinfo;
+    domctl.u.scheduler_op.u.rt.vcpu_index = sdom->index;
+    domctl.u.scheduler_op.u.rt.period = sdom->period;
+    domctl.u.scheduler_op.u.rt.budget = sdom->budget;
+
+    rc = do_domctl(xch, &domctl);
+
+    return rc;
+}
+
+int xc_sched_rt_domain_get(xc_interface *xch,
+                           uint32_t domid,
+                           struct xen_domctl_sched_rt_params *sdom,
+                           uint16_t num_vcpus)
+{
+    int rc;
+    DECLARE_DOMCTL;
+    DECLARE_HYPERCALL_BOUNCE(sdom, 
+        sizeof(*sdom) * num_vcpus, 
+        XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( xc_hypercall_bounce_pre(xch, sdom) )
+        return -1;
+
+    domctl.cmd = XEN_DOMCTL_scheduler_op;
+    domctl.domain = (domid_t) domid;
+    domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_RT_DS;
+    domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_getinfo;
+    domctl.u.scheduler_op.u.rt.nr_vcpus = num_vcpus;
+    set_xen_guest_handle(domctl.u.scheduler_op.u.rt.vcpu, sdom);
+
+    rc = do_domctl(xch, &domctl);
+
+    xc_hypercall_bounce_post(xch, sdom);
+
+    return rc;
+}
+
+int xc_sched_rt_domain_get_num_vcpus(xc_interface *xch,
+                                     uint32_t domid,
+                                     uint16_t *num_vcpus)
+{
+    int rc;
+    DECLARE_DOMCTL;
+
+    domctl.cmd = XEN_DOMCTL_scheduler_op;
+    domctl.domain = (domid_t) domid;
+    domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_RT_DS;
+    domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_getnumvcpus;
+
+    rc = do_domctl(xch, &domctl);
+
+    *num_vcpus = domctl.u.scheduler_op.u.rt.nr_vcpus;
+    return rc;
+}
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 1c5d0db..fd066cc 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -875,6 +875,18 @@ int xc_sched_credit2_domain_get(xc_interface *xch,
                                uint32_t domid,
                                struct xen_domctl_sched_credit2 *sdom);
 
+int xc_sched_rt_domain_set(xc_interface *xch,
+                          uint32_t domid,
+                          struct xen_domctl_sched_rt_params *sdom);
+int xc_sched_rt_domain_get(xc_interface *xch,
+                          uint32_t domid,
+                          struct xen_domctl_sched_rt_params *sdom,
+                          uint16_t num_vcpus);
+
+int xc_sched_rt_domain_get_num_vcpus(xc_interface *xch,
+                                    uint32_t domid,
+                                    uint16_t *num_vcpus);
+
 int
 xc_sched_arinc653_schedule_set(
     xc_interface *xch,
-- 
1.7.9.5


* [PATCH v1 3/4] libxl: add rt scheduler
  2014-08-24 22:58 Introduce rt real-time scheduler for Xen Meng Xu
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
  2014-08-24 22:58 ` [PATCH v1 2/4] libxc: add rt scheduler Meng Xu
@ 2014-08-24 22:58 ` Meng Xu
  2014-08-25 13:17   ` Wei Liu
                     ` (2 more replies)
  2014-08-24 22:58 ` [PATCH v1 4/4] xl: introduce " Meng Xu
  3 siblings, 3 replies; 72+ messages in thread
From: Meng Xu @ 2014-08-24 22:58 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	dario.faggioli, ian.jackson, xumengpanda, Meng Xu, JBeulich,
	chaowang, lichong659, dgolomb

Add libxl functions to set/get a domain's parameters for the rt scheduler.
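
For illustration only (a sketch of how a caller might use the new rt fields; not code from this patch, and the helper name set_vcpu1_params is made up), setting one VCPU's period and budget through libxl could look roughly like this, assuming the usual libxl init/dispose helpers for the type:

#include <libxl.h>

/* Sketch: set period/budget of vcpu 1 of a domain (values in the same
 * units as the xl examples in the cover letter). */
static int set_vcpu1_params(libxl_ctx *ctx, uint32_t domid)
{
    libxl_domain_sched_params p;
    int rc;

    libxl_domain_sched_params_init(&p);
    p.sched = LIBXL_SCHEDULER_RT;
    p.rt.vcpu_index = 1;
    p.rt.period = 20000;
    p.rt.budget = 10000;

    rc = libxl_domain_sched_params_set(ctx, domid, &p);
    libxl_domain_sched_params_dispose(&p);
    return rc;
}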

Signed-off-by: Sisu Xi <xisisu@gmail.com>
Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
---
 tools/libxl/libxl.c         |  139 +++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl.h         |    7 +++
 tools/libxl/libxl_types.idl |   15 +++++
 3 files changed, 161 insertions(+)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 3526539..440e8df31 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc, uint32_t domid,
     return 0;
 }
 
+static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
+                               libxl_domain_sched_params *scinfo)
+{
+    struct xen_domctl_sched_rt_params* sdom;
+    uint16_t num_vcpus;
+    int rc, i;
+
+    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
+    if (rc != 0) {
+        LOGE(ERROR, "getting num_vcpus of domain sched rt");
+        return ERROR_FAIL;
+    }
+    
+    /* FIXME: can malloc be used in libxl? seems it was used in the file */
+    sdom = (struct xen_domctl_sched_rt_params *)
+            malloc( sizeof(struct xen_domctl_sched_rt_params) * num_vcpus );
+    if ( !sdom ){
+        LOGE(ERROR, "Allocate sdom array fails\n");
+        return ERROR_INVAL;
+    }
+
+    rc = xc_sched_rt_domain_get(CTX->xch, domid, sdom, num_vcpus);
+    if (rc != 0) {
+        LOGE(ERROR, "getting domain sched rt");
+        return ERROR_FAIL;
+    }
+
+    /* FIXME: how to guarantee libxl_*_dispose be called exactly once? */
+    libxl_domain_sched_params_init(scinfo);
+    
+    scinfo->rt.num_vcpus = num_vcpus;
+    scinfo->sched = LIBXL_SCHEDULER_RT;
+    /* FIXME: can malloc be used in libxl? seems it was used in the file */
+    scinfo->rt.vcpus = (libxl_vcpu *)
+                       malloc( sizeof(libxl_vcpu) * scinfo->rt.num_vcpus );
+    if ( !scinfo->rt.vcpus ){
+        LOGE(ERROR, "Allocate lib_vcpu array fails\n");
+        return ERROR_INVAL;
+    }
+    for( i = 0; i < num_vcpus; ++i)
+    {
+        scinfo->rt.vcpus[i].period = sdom[i].period;
+        scinfo->rt.vcpus[i].budget = sdom[i].budget;
+        scinfo->rt.vcpus[i].index = sdom[i].index;
+    }
+    
+    free(sdom);
+    return 0;
+}
+
+#define SCHED_RT_VCPU_PERIOD_MAX    31536000000000 /* one year in microsecond*/
+#define SCHED_RT_VCPU_BUDGET_MAX    SCHED_RT_VCPU_PERIOD_MAX
+
+/*
+ * Sanity check of the scinfo parameters
+ * return 0 if all values are valid
+ * return 1 if one param is default value
+ * return 2 if the target vcpu's index, period or budget is out of range
+ */
+static int sched_rt_domain_set_validate_params(libxl__gc *gc,
+                                               const libxl_domain_sched_params *scinfo,
+                                               const uint16_t num_vcpus)
+{
+    int vcpu_index = scinfo->rt.vcpu_index;
+
+    if (vcpu_index == LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT ||
+        scinfo->rt.period == LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT ||
+        scinfo->rt.budget == LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT)
+    {
+        return 1;
+    }
+
+    if (vcpu_index < 0 || vcpu_index > num_vcpus)
+    {
+        LOG(ERROR, "VCPU index is not set or out of range, "
+                    "valid values are within range from 0 to %d", num_vcpus);
+        return 2;
+    }
+
+    if (scinfo->rt.period < 1 ||
+        scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX)
+    {
+        LOG(ERROR, "VCPU period is not set or out of range, "
+                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_PERIOD_MAX);
+        return 2;
+    }
+
+    if (scinfo->rt.budget < 1 ||
+        scinfo->rt.budget > SCHED_RT_VCPU_BUDGET_MAX)
+    {
+        LOG(ERROR, "VCPU budget is not set or out of range, "
+                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_BUDGET_MAX);
+        return 2;
+    }
+
+    return 0;
+
+}
+
+static int sched_rt_domain_set(libxl__gc *gc, uint32_t domid,
+                               const libxl_domain_sched_params *scinfo)
+{
+    struct xen_domctl_sched_rt_params sdom;
+    uint16_t num_vcpus;
+    int rc;
+ 
+    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
+    if (rc != 0) {
+        LOGE(ERROR, "getting domain sched rt");
+        return ERROR_FAIL;
+    }
+    
+    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
+    if (rc == 2)
+        return ERROR_INVAL;
+    if (rc == 1)
+        return 0;
+    if (rc == 0)
+    {
+        sdom.index = scinfo->rt.vcpu_index;
+        sdom.period = scinfo->rt.period;
+        sdom.budget = scinfo->rt.budget;
+    }
+
+    rc = xc_sched_rt_domain_set(CTX->xch, domid, &sdom);
+    if ( rc < 0 ) {
+        LOGE(ERROR, "setting domain sched rt");
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
 int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
                                   const libxl_domain_sched_params *scinfo)
 {
@@ -5177,6 +5310,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_ARINC653:
         ret=sched_arinc653_domain_set(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_RT:
+        ret=sched_rt_domain_set(gc, domid, scinfo);
+        break;
     default:
         LOG(ERROR, "Unknown scheduler");
         ret=ERROR_INVAL;
@@ -5207,6 +5343,9 @@ int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
     case LIBXL_SCHEDULER_CREDIT2:
         ret=sched_credit2_domain_get(gc, domid, scinfo);
         break;
+    case LIBXL_SCHEDULER_RT:
+        ret=sched_rt_domain_get(gc, domid, scinfo);
+        break;
     default:
         LOG(ERROR, "Unknown scheduler");
         ret=ERROR_INVAL;
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index bfeb3bc..4657056 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1240,6 +1240,13 @@ int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid,
 #define LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT   -1
 #define LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT -1
 
+#define LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT     -1
+#define LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT       -1
+#define LIBXL_DOMAIN_SCHED_PARAM_NUM_VCPUS_DEFAULT  -1
+#define LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT -1
+/* Consistent with XEN_LEGACY_MAX_VCPUS xen/arch-x86/xen.h*/
+#define LIBXL_XEN_LEGACY_MAX_VCPUS                  32 
+
 int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
                                   libxl_domain_sched_params *params);
 int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 0b3496f..c33a776 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -153,6 +153,7 @@ libxl_scheduler = Enumeration("scheduler", [
     (5, "credit"),
     (6, "credit2"),
     (7, "arinc653"),
+    (8, "rt"),
     ])
 
 # Consistent with SHUTDOWN_* in sched.h (apart from UNKNOWN)
@@ -303,6 +304,19 @@ libxl_domain_restore_params = Struct("domain_restore_params", [
     ("checkpointed_stream", integer),
     ])
 
+libxl_rt_vcpu = Struct("vcpu",[
+    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
+    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
+    ("index",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
+    ])
+
+libxl_domain_sched_rt_params = Struct("domain_sched_rt_params",[
+    ("vcpus",        Array(libxl_rt_vcpu, "num_vcpus")),
+    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
+    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
+    ("vcpu_index",   integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
+    ])
+
 libxl_domain_sched_params = Struct("domain_sched_params",[
     ("sched",        libxl_scheduler),
     ("weight",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}),
@@ -311,6 +325,7 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
     ("slice",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}),
     ("latency",      integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}),
     ("extratime",    integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}),
+    ("rt",           libxl_domain_sched_rt_params),
     ])
 
 libxl_domain_build_info = Struct("domain_build_info",[
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v1 4/4] xl: introduce rt scheduler
  2014-08-24 22:58 Introduce rt real-time scheduler for Xen Meng Xu
                   ` (2 preceding siblings ...)
  2014-08-24 22:58 ` [PATCH v1 3/4] libxl: " Meng Xu
@ 2014-08-24 22:58 ` Meng Xu
  2014-08-25 13:31   ` Wei Liu
  2014-09-03 15:52   ` George Dunlap
  3 siblings, 2 replies; 72+ messages in thread
From: Meng Xu @ 2014-08-24 22:58 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	dario.faggioli, ian.jackson, xumengpanda, Meng Xu, JBeulich,
	chaowang, lichong659, dgolomb

Add xl command for rt scheduler

Signed-off-by: Sisu Xi <xisisu@gmail.com>
Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
---
 docs/man/xl.pod.1         |   40 ++++++++++++++
 tools/libxl/xl.h          |    1 +
 tools/libxl/xl_cmdimpl.c  |  131 +++++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/xl_cmdtable.c |    9 ++++
 4 files changed, 181 insertions(+)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 9d1c2a5..bd26447 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1035,6 +1035,46 @@ Restrict output to domains in the specified cpupool.
 
 =back
 
+=item B<sched-rt> [I<OPTIONS>]
+
+Set or get rt (Real Time) scheduler parameters. This rt scheduler applies the
+Preemptive Global Earliest Deadline First real-time scheduling algorithm to
+schedule VCPUs in the system. Each VCPU has a dedicated period and budget.
+While scheduled, a VCPU burns its budget.
+A VCPU has its budget replenished at the beginning of each of its periods;
+the VCPU discards its unused budget at the end of each of its periods.
+
+B<OPTIONS>
+
+=over 4
+
+=item B<-d DOMAIN>, B<--domain=DOMAIN>
+
+Specify domain for which scheduler parameters are to be modified or retrieved.
+Mandatory for modifying scheduler parameters.
+
+=item B<-v VCPU>, B<--vcpu=VCPU>
+
+Specify the index of the VCPU whose parameters will be set.
+A domain can have multiple VCPUs, and each VCPU has a unique index within its
+domain; when setting a domain's parameters, the parameters of each VCPU of
+that domain need to be set.
+
+=item B<-p PERIOD>, B<--period=PERIOD>
+
+A VCPU replenishes its budget in every period. Time unit is millisecond.
+
+=item B<-b BUDGET>, B<--budget=BUDGET>
+
+A VCPU has BUDGET amount of time to run for each period. 
+Time unit is millisecond.
+
+=item B<-c CPUPOOL>, B<--cpupool=CPUPOOL>
+
+Restrict output to domains in the specified cpupool.
+
+=back
+
 =back
 
 =head1 CPUPOOLS COMMANDS
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 10a2e66..51b634a 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -67,6 +67,7 @@ int main_memset(int argc, char **argv);
 int main_sched_credit(int argc, char **argv);
 int main_sched_credit2(int argc, char **argv);
 int main_sched_sedf(int argc, char **argv);
+int main_sched_rt(int argc, char **argv);
 int main_domid(int argc, char **argv);
 int main_domname(int argc, char **argv);
 int main_rename(int argc, char **argv);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index f1c136a..22f7f9a 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -5175,6 +5175,52 @@ static int sched_sedf_domain_output(
     return 0;
 }
 
+
+static int sched_rt_domain_output(
+    int domid)
+{
+    char *domname;
+    libxl_domain_sched_params scinfo;
+    int rc = 0, i;
+
+    if (domid < 0) {
+        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", "VCPU", "Period", "Budget");
+        return 0;
+    }
+
+    libxl_domain_sched_params_init(&scinfo);
+    rc = sched_domain_get(LIBXL_SCHEDULER_RT, domid, &scinfo);
+    if (rc)
+        goto out;
+
+    domname = libxl_domid_to_name(ctx, domid);
+    for( i = 0; i < scinfo.rt.num_vcpus; i++ )
+    {
+        printf("%-33s %4d %4d %9"PRIu64" %9"PRIu64"\n",
+            domname,
+            domid,
+            scinfo.rt.vcpus[i].index,
+            scinfo.rt.vcpus[i].period,
+            scinfo.rt.vcpus[i].budget);
+    }
+    free(domname);
+
+out:
+    libxl_domain_sched_params_dispose(&scinfo);
+    return rc;
+}
+
+static int sched_rt_pool_output(uint32_t poolid)
+{
+    char *poolname;
+
+    poolname = libxl_cpupoolid_to_name(ctx, poolid);
+    printf("Cpupool %s: sched=EDF\n", poolname);
+
+    free(poolname);
+    return 0;
+}
+
 static int sched_default_pool_output(uint32_t poolid)
 {
     char *poolname;
@@ -5542,6 +5588,91 @@ int main_sched_sedf(int argc, char **argv)
     return 0;
 }
 
+/*
+ * <nothing>            : List all domain paramters and sched params
+ * -d [domid]           : List domain params for domain
+ * -d [domid] [params]  : Set domain params for domain 
+ */
+int main_sched_rt(int argc, char **argv)
+{
+    const char *dom = NULL;
+    const char *cpupool = NULL;
+    int period = 10, opt_p = 0;
+    int budget = 4, opt_b = 0;
+    int vcpu_index = 0, opt_v = 0;
+    int opt, rc;
+    static struct option opts[] = {
+        {"domain", 1, 0, 'd'},
+        {"period", 1, 0, 'p'},
+        {"budget", 1, 0, 'b'},
+        {"vcpu", 1, 0, 'v'},
+        {"cpupool", 1, 0, 'c'},
+        COMMON_LONG_OPTS,
+        {0, 0, 0, 0}
+    };
+
+    SWITCH_FOREACH_OPT(opt, "d:p:b:v:c:h", opts, "sched-rt", 0) {
+    case 'd':
+        dom = optarg;
+        break;
+    case 'p':
+        period = strtol(optarg, NULL, 10);
+        opt_p = 1;
+        break;
+    case 'b':
+        budget = strtol(optarg, NULL, 10);
+        opt_b = 1;
+        break;
+    case 'v':
+        vcpu_index = strtol(optarg, NULL, 10);
+        opt_v = 1;
+        break;
+    case 'c':
+        cpupool = optarg;
+        break;
+    }
+
+    if (cpupool && (dom || opt_p || opt_b || opt_v)) {
+        fprintf(stderr, "Specifying a cpupool is not allowed with other options.\n");
+        return 1;
+    }
+    if (!dom && (opt_p || opt_b || opt_v)) {
+        fprintf(stderr, "Must specify a domain.\n");
+        return 1;
+    }
+    if ( (opt_v || opt_p || opt_b) && (opt_p + opt_b + opt_v != 3) ) {
+        fprintf(stderr, "Must specify vcpu, period, budget\n");
+        return 1;
+    }
+    
+    if (!dom) { /* list all domain's rt scheduler info */
+        return -sched_domain_output(LIBXL_SCHEDULER_RT,
+                                    sched_rt_domain_output,
+                                    sched_rt_pool_output,
+                                    cpupool);
+    } else {
+        uint32_t domid = find_domain(dom);
+        if (!opt_p && !opt_b && !opt_v) { /* output rt scheduler info */
+            sched_rt_domain_output(-1);
+            return -sched_rt_domain_output(domid);
+        } else { /* set rt scheduler paramaters */
+            libxl_domain_sched_params scinfo;
+            libxl_domain_sched_params_init(&scinfo);
+            scinfo.sched = LIBXL_SCHEDULER_RT;
+            scinfo.rt.vcpu_index = vcpu_index;
+            scinfo.rt.period = period;
+            scinfo.rt.budget = budget;
+
+            rc = sched_domain_set(domid, &scinfo);
+            libxl_domain_sched_params_dispose(&scinfo);
+            if (rc)
+                return -rc;
+        }
+    }
+
+    return 0;
+}
+
 int main_domid(int argc, char **argv)
 {
     uint32_t domid;
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 7b7fa92..de4e954 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -277,6 +277,15 @@ struct cmd_spec cmd_table[] = {
       "                               --period/--slice)\n"
       "-c CPUPOOL, --cpupool=CPUPOOL  Restrict output to CPUPOOL"
     },
+    { "sched-rt",
+      &main_sched_rt, 0, 1,
+      "Get/set rt scheduler parameters",
+      "[-d <Domain> [-v[=VCPU]] [-p[=PERIOD]] [-b[=BUDGET]]]",
+      "-d DOMAIN, --domain=DOMAIN     Domain to modify\n"
+      "-v VCPU,   --vcpu=VCPU         VCPU\n"
+      "-p PERIOD, --period=PERIOD     Period (us)\n"
+      "-b BUDGET, --budget=BUDGET     Budget (us)\n"
+    },
     { "domid",
       &main_domid, 0, 0,
       "Convert a domain name to domain id",
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-08-24 22:58 ` [PATCH v1 3/4] libxl: " Meng Xu
@ 2014-08-25 13:17   ` Wei Liu
  2014-08-25 15:55     ` Meng Xu
  2014-09-03 15:33   ` George Dunlap
  2014-09-05 10:21   ` Dario Faggioli
  2 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2014-08-25 13:17 UTC (permalink / raw)
  To: Meng Xu
  Cc: wei.liu2, ian.campbell, xisisu, stefano.stabellini,
	george.dunlap, dario.faggioli, ian.jackson, xen-devel,
	xumengpanda, JBeulich, chaowang, lichong659, dgolomb

On Sun, Aug 24, 2014 at 06:58:44PM -0400, Meng Xu wrote:
> Add libxl functions to set/get domain's parameters for rt scheduler
> 
> Signed-off-by: Sisu Xi <xisisu@gmail.com>
> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
> ---
>  tools/libxl/libxl.c         |  139 +++++++++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl.h         |    7 +++
>  tools/libxl/libxl_types.idl |   15 +++++
>  3 files changed, 161 insertions(+)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 3526539..440e8df31 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc, uint32_t domid,
>      return 0;
>  }
>  
> +static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
> +                               libxl_domain_sched_params *scinfo)
> +{
> +    struct xen_domctl_sched_rt_params* sdom;
> +    uint16_t num_vcpus;
> +    int rc, i;
> +
> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting num_vcpus of domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +    
> +    /* FIXME: can malloc be used in libxl? seems it was used in the file */
> +    sdom = (struct xen_domctl_sched_rt_params *)
> +            malloc( sizeof(struct xen_domctl_sched_rt_params) * num_vcpus );

It's better to use libxl__malloc here. With that change you can also
omit the test of sdom.

> +    if ( !sdom ){

No need to add spaces inside the brackets; on the other hand, you need
one space before {.

I can see this issue appears repeatedly in this patch. Please check and
fix the other occurrences.

> +        LOGE(ERROR, "Allocate sdom array fails\n");
> +        return ERROR_INVAL;
> +    }
> +
> +    rc = xc_sched_rt_domain_get(CTX->xch, domid, sdom, num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +
> +    /* FIXME: how to guarantee libxl_*_dispose be called exactly once? */
> +    libxl_domain_sched_params_init(scinfo);
> +    

libxl_*_dispose should be idempotent (at least we intend to make it so).
And it's the caller's responsibility to ensure it's called once if it
wishes to.

> +    scinfo->rt.num_vcpus = num_vcpus;
> +    scinfo->sched = LIBXL_SCHEDULER_RT;
> +    /* FIXME: can malloc be used in libxl? seems it was used in the file */
> +    scinfo->rt.vcpus = (libxl_vcpu *)
> +                       malloc( sizeof(libxl_vcpu) * scinfo->rt.num_vcpus );

libxl__malloc.

If you don't want this allocation to be automatically freed, use NOGC
instead of gc.
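
As a rough sketch of what that could look like (assuming the
libxl__malloc()/libxl__calloc() helpers and the NOGC pseudo-gc from
libxl_internal.h, which never return NULL and, when given a real gc,
free the memory automatically when the gc is torn down):

    /* gc-tracked: no NULL check and no later free(sdom) needed */
    sdom = libxl__calloc(gc, num_vcpus, sizeof(*sdom));

    /* memory that must outlive the gc (the caller disposes of scinfo),
     * so allocate it with NOGC instead */
    scinfo->rt.vcpus = libxl__calloc(NOGC, num_vcpus,
                                     sizeof(*scinfo->rt.vcpus));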

> +    if ( !scinfo->rt.vcpus ){
> +        LOGE(ERROR, "Allocate lib_vcpu array fails\n");
> +        return ERROR_INVAL;
> +    }
> +    for( i = 0; i < num_vcpus; ++i)
> +    {
> +        scinfo->rt.vcpus[i].period = sdom[i].period;
> +        scinfo->rt.vcpus[i].budget = sdom[i].budget;
> +        scinfo->rt.vcpus[i].index = sdom[i].index;
> +    }
> +    
> +    free(sdom);

Remove this if you use libxl__malloc.

> +    return 0;
> +}
> +
> +#define SCHED_RT_VCPU_PERIOD_MAX    31536000000000 /* one year in microsecond*/
> +#define SCHED_RT_VCPU_BUDGET_MAX    SCHED_RT_VCPU_PERIOD_MAX
> +
> +/*
> + * Sanity check of the scinfo parameters
> + * return 0 if all values are valid
> + * return 1 if one param is default value
> + * return 2 if the target vcpu's index, period or budget is out of range
> + */
> +static int sched_rt_domain_set_validate_params(libxl__gc *gc,
> +                                               const libxl_domain_sched_params *scinfo,
> +                                               const uint16_t num_vcpus)
> +{
> +    int vcpu_index = scinfo->rt.vcpu_index;
> +
> +    if (vcpu_index == LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT ||
> +        scinfo->rt.period == LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT ||
> +        scinfo->rt.budget == LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT)
> +    {
> +        return 1;
> +    }
> +
> +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
> +    {

Libxl coding style normally has { on the same line as "if", "for" etc.
Please check and fix the other occurrences.

> +        LOG(ERROR, "VCPU index is not set or out of range, "
> +                    "valid values are within range from 0 to %d", num_vcpus);
> +        return 2;
> +    }
> +
> +    if (scinfo->rt.period < 1 ||
> +        scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX)
> +    {
> +        LOG(ERROR, "VCPU period is not set or out of range, "
> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_PERIOD_MAX);

Line too long. Move SCHED_RT_VCPU_PERIOD_MAX to a new line.

> +        return 2;
> +    }
> +
> +    if (scinfo->rt.budget < 1 ||
> +        scinfo->rt.budget > SCHED_RT_VCPU_BUDGET_MAX)
> +    {
> +        LOG(ERROR, "VCPU budget is not set or out of range, "
> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_BUDGET_MAX);

Ditto.

> +        return 2;
> +    }
> +
> +    return 0;
> +
> +}
> +
> +static int sched_rt_domain_set(libxl__gc *gc, uint32_t domid,
> +                               const libxl_domain_sched_params *scinfo)
> +{
> +    struct xen_domctl_sched_rt_params sdom;
> +    uint16_t num_vcpus;
> +    int rc;
> + 
> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +    
> +    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
> +    if (rc == 2)
> +        return ERROR_INVAL;
> +    if (rc == 1)
> +        return 0;
> +    if (rc == 0)
> +    {
> +        sdom.index = scinfo->rt.vcpu_index;
> +        sdom.period = scinfo->rt.period;
> +        sdom.budget = scinfo->rt.budget;
> +    }
> +

No need to test for 0 here, as there can't be any other value at this point.

> +    rc = xc_sched_rt_domain_set(CTX->xch, domid, &sdom);
> +    if ( rc < 0 ) {
> +        LOGE(ERROR, "setting domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;

The code structure can be rearranged to use "goto out" style so that
there's only one "return rc" at the end.
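
Something along these lines, for instance (just a sketch of the
single-exit pattern, reusing the names from the patch above; not a
complete function):

    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
    if (rc == 2) { rc = ERROR_INVAL; goto out; }
    if (rc == 1) { rc = 0; goto out; }

    sdom.index  = scinfo->rt.vcpu_index;
    sdom.period = scinfo->rt.period;
    sdom.budget = scinfo->rt.budget;

    rc = xc_sched_rt_domain_set(CTX->xch, domid, &sdom);
    if (rc < 0) {
        LOGE(ERROR, "setting domain sched rt");
        rc = ERROR_FAIL;
        goto out;
    }
    rc = 0;
out:
    return rc;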

> +}
> +
>  int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
>                                    const libxl_domain_sched_params *scinfo)
>  {
> @@ -5177,6 +5310,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_ARINC653:
>          ret=sched_arinc653_domain_set(gc, domid, scinfo);
>          break;
> +    case LIBXL_SCHEDULER_RT:
> +        ret=sched_rt_domain_set(gc, domid, scinfo);
> +        break;
>      default:
>          LOG(ERROR, "Unknown scheduler");
>          ret=ERROR_INVAL;
> @@ -5207,6 +5343,9 @@ int libxl_domain_sched_params_get(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_CREDIT2:
>          ret=sched_credit2_domain_get(gc, domid, scinfo);
>          break;
> +    case LIBXL_SCHEDULER_RT:
> +        ret=sched_rt_domain_get(gc, domid, scinfo);
> +        break;
>      default:
>          LOG(ERROR, "Unknown scheduler");
>          ret=ERROR_INVAL;
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index bfeb3bc..4657056 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -1240,6 +1240,13 @@ int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid,
>  #define LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT   -1
>  #define LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT -1
>  
> +#define LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT     -1
> +#define LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT       -1
> +#define LIBXL_DOMAIN_SCHED_PARAM_NUM_VCPUS_DEFAULT  -1
> +#define LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT -1
> +/* Consistent with XEN_LEGACY_MAX_VCPUS xen/arch-x86/xen.h*/
> +#define LIBXL_XEN_LEGACY_MAX_VCPUS                  32 
> +

Where is this macro used? I cannot find it in this patch or in the
following xl patch.

Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 4/4] xl: introduce rt scheduler
  2014-08-24 22:58 ` [PATCH v1 4/4] xl: introduce " Meng Xu
@ 2014-08-25 13:31   ` Wei Liu
  2014-08-25 16:12     ` Meng Xu
  2014-09-03 15:52   ` George Dunlap
  1 sibling, 1 reply; 72+ messages in thread
From: Wei Liu @ 2014-08-25 13:31 UTC (permalink / raw)
  To: Meng Xu
  Cc: wei.liu2, ian.campbell, xisisu, stefano.stabellini,
	george.dunlap, dario.faggioli, ian.jackson, xen-devel,
	xumengpanda, JBeulich, chaowang, lichong659, dgolomb

I know most of the code is copied from existing code, so I only
commented on some nits I found.

On Sun, Aug 24, 2014 at 06:58:45PM -0400, Meng Xu wrote:
[...]
>  =head1 CPUPOOLS COMMANDS
> diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
> index 10a2e66..51b634a 100644
> --- a/tools/libxl/xl.h
> +++ b/tools/libxl/xl.h
> @@ -67,6 +67,7 @@ int main_memset(int argc, char **argv);
>  int main_sched_credit(int argc, char **argv);
>  int main_sched_credit2(int argc, char **argv);
>  int main_sched_sedf(int argc, char **argv);
> +int main_sched_rt(int argc, char **argv);
>  int main_domid(int argc, char **argv);
>  int main_domname(int argc, char **argv);
>  int main_rename(int argc, char **argv);
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index f1c136a..22f7f9a 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -5175,6 +5175,52 @@ static int sched_sedf_domain_output(
>      return 0;
>  }
>  
> +
> +static int sched_rt_domain_output(
> +    int domid)

Join this line to previous line please.

> +{
> +    char *domname;
> +    libxl_domain_sched_params scinfo;
> +    int rc = 0, i;
> +
> +    if (domid < 0) {
> +        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", "VCPU", "Period", "Budget");
> +        return 0;
> +    }
> +
> +    libxl_domain_sched_params_init(&scinfo);
> +    rc = sched_domain_get(LIBXL_SCHEDULER_RT, domid, &scinfo);
> +    if (rc)
> +        goto out;
> +
> +    domname = libxl_domid_to_name(ctx, domid);
> +    for( i = 0; i < scinfo.rt.num_vcpus; i++ )
> +    {

Spaces and coding style.

Note that toolstack coding style is different from hypervisor coding
style.

> +        printf("%-33s %4d %4d %9"PRIu64" %9"PRIu64"\n",
> +            domname,
> +            domid,
> +            scinfo.rt.vcpus[i].index,
> +            scinfo.rt.vcpus[i].period,
> +            scinfo.rt.vcpus[i].budget);
> +    }
> +    free(domname);
> +
> +out:
> +    libxl_domain_sched_params_dispose(&scinfo);
> +    return rc;
> +}
> +
[...]
> +
> +    if (cpupool && (dom || opt_p || opt_b || opt_v)) {
> +        fprintf(stderr, "Specifying a cpupool is not allowed with other options.\n");

Line too long.

Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-08-25 13:17   ` Wei Liu
@ 2014-08-25 15:55     ` Meng Xu
  2014-08-26  9:51       ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-08-25 15:55 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Dario Faggioli, Ian Jackson, xen-devel, Meng Xu, Jan Beulich,
	Chao Wang, Chong Li, Dagaen Golomb



Hi Wei,

Thank you very much for your quick review! I really appreciate it. :-)

In summary, I have corrected the code you pointed out.


>
> > +        LOGE(ERROR, "Allocate sdom array fails\n");
> > +        return ERROR_INVAL;
> > +    }
> > +
> > +    rc = xc_sched_rt_domain_get(CTX->xch, domid, sdom, num_vcpus);
> > +    if (rc != 0) {
> > +        LOGE(ERROR, "getting domain sched rt");
> > +        return ERROR_FAIL;
> > +    }
> > +
> > +    /* FIXME: how to guarantee libxl_*_dispose be called exactly once?
> */
> > +    libxl_domain_sched_params_init(scinfo);
> > +
>
> libxl_*_dispose should be idempotent (as least we intent to make it so).
> And it's caller's responsibility to ensure it's called once if it wishes
> to.
>

​I see. Right now, it is disposed by the caller in xl.​


> > +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
> > +    {
>
> Libxl coding style normally has { in the same line of "if" "for" etc.
> Please
> check and fix other occurences.
>

​A quick question:
Should I use the style "if () { " for all files in the tool stack or just
in libxl files?

If I need to use this style in all tool stack files, I will check and
modify them. (Right now, I just made the correction for this exact patch.)


> > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> > index bfeb3bc..4657056 100644
> > --- a/tools/libxl/libxl.h
> > +++ b/tools/libxl/libxl.h
> > @@ -1240,6 +1240,13 @@ int libxl_sched_credit_params_set(libxl_ctx *ctx,
> uint32_t poolid,
> >  #define LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT   -1
> >  #define LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT -1
> >
> > +#define LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT     -1
> > +#define LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT       -1
> > +#define LIBXL_DOMAIN_SCHED_PARAM_NUM_VCPUS_DEFAULT  -1
> > +#define LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT -1
> > +/* Consistent with XEN_LEGACY_MAX_VCPUS xen/arch-x86/xen.h*/
> > +#define LIBXL_XEN_LEGACY_MAX_VCPUS                  32
> > +
>
> Where is this macro used? I cannot find it in this one and following xl
> patch.
>
>
My bad. This should not exist any more, since we now dynamically allocate the
vcpus structure based on the number of vcpus in a domain to display the
vcpus' information.
I will delete it now.

Thank you again for your useful comments, Wei!

Best,

Meng



-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 4/4] xl: introduce rt scheduler
  2014-08-25 13:31   ` Wei Liu
@ 2014-08-25 16:12     ` Meng Xu
  0 siblings, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-08-25 16:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



Hi Wei,

Thank you very much for your comments! All of the points you mentioned on
this patch have been addressed now.


> > +{
> > +    char *domname;
> > +    libxl_domain_sched_params scinfo;
> > +    int rc = 0, i;
> > +
> > +    if (domid < 0) {
> > +        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", "VCPU",
> "Period", "Budget");
> > +        return 0;
> > +    }
> > +
> > +    libxl_domain_sched_params_init(&scinfo);
> > +    rc = sched_domain_get(LIBXL_SCHEDULER_RT, domid, &scinfo);
> > +    if (rc)
> > +        goto out;
> > +
> > +    domname = libxl_domid_to_name(ctx, domid);
> > +    for( i = 0; i < scinfo.rt.num_vcpus; i++ )
> > +    {
>
> Spaces and coding style.
>
> Note that toolstack coding style is different from hypervisor coding
> style.
>

​I see. :-) Thanks!​


​Meng​


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-08-25 15:55     ` Meng Xu
@ 2014-08-26  9:51       ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2014-08-26  9:51 UTC (permalink / raw)
  To: Meng Xu
  Cc: Wei Liu, Ian Campbell, Sisu Xi, Stefano Stabellini,
	George Dunlap, Dario Faggioli, Ian Jackson, xen-devel, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb

On Mon, Aug 25, 2014 at 11:55:26AM -0400, Meng Xu wrote:
[...]
> 
> ​I see. Right now, it is disposed by the caller in xl.​
> 
> 
> > > +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
> > > +    {
> >
> > Libxl coding style normally has { in the same line of "if" "for" etc.
> > Please
> > check and fix other occurences.
> >
> 
> ​A quick question:
> Should I use the style "if () { " for all files in the tool stack or just
> in libxl files?
> 
> If I need to use this style in all tool stack files, I will check and
> modify them. (Right now, I just made the correction for this exact patch.)
> 

There is a CODING_STYLE file in the top-level directory for the hypervisor.
Libxl also has one in its own directory.

If there's no coding guideline, I think you need to follow the existing code.
In your case libxc doesn't have one, so you just follow what's already
there to be consistent.

Wei.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
@ 2014-08-26 14:27   ` Jan Beulich
  2014-08-27  2:07     ` Meng Xu
  2014-09-03 13:40   ` George Dunlap
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2014-08-26 14:27 UTC (permalink / raw)
  To: Meng Xu
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	dario.faggioli, ian.jackson, xen-devel, xumengpanda, chaowang,
	lichong659, dgolomb

>>> On 25.08.14 at 00:58, <mengxu@cis.upenn.edu> wrote:
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h

Just a couple of comments on the interface changes; I'll leave the
actual scheduler code to the scheduler specialists.

> @@ -367,6 +383,16 @@ struct xen_domctl_scheduler_op {
>          struct xen_domctl_sched_credit2 {
>              uint16_t weight;
>          } credit2;
> +        struct xen_domctl_sched_rt{

Missing blank before {.

> +            /* get vcpus' params */
> +            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;

Why does this need to be a handle? Do you permit setting these
to different values for different vCPU-s? Considering that other
schedulers don't do this, why does yours need to?

> +            uint16_t nr_vcpus;
> +            /* set one vcpu's params */
> +            uint16_t vcpu_index;
> +            uint16_t padding[2];
> +            uint64_t period;
> +            uint64_t budget;

Are values overflowing 32 bits here really useful/meaningful?

Jan

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-26 14:27   ` Jan Beulich
@ 2014-08-27  2:07     ` Meng Xu
  2014-08-27  6:26       ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-08-27  2:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Chao Wang, Chong Li, Dagaen Golomb



Hi Jan,


> > +            /* get vcpus' params */
> > +            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
>
> Why does this need to be a handle? Do you permit setting these
> to different values for different vCPU-s? Considering that other
> schedulers don't do this, why does yours need to?
>

​Yes, we need a handler here to get each vcpu's parameters of a domain.

Let me explain why we need to set and get the parameters of "each" vcpu:
1) A VCPU is the basic scheduling and accounting unit in Global
Earliest Deadline First (gEDF) scheduling. We account the budget
consumption for each vcpu instead of for each domain, while the credit and
credit2 schedulers account the credit consumption for each domain.
2) Based on Global Earliest Deadline First (gEDF) scheduling theory,
each vcpu's parameters are used to decide the scheduling sequence of
these vcpus. Two vcpus with the same utilization but different period and
budget can be scheduled differently. For example, a vcpu with budget
10ms and period 20ms is less responsive than a vcpu with budget 2ms and
period 8ms, although they have the same utilization of 0.5 (see the toy
sketch below).
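
A toy, standalone illustration of that example (plain C, not Xen code;
the struct and the values are made up for illustration):

    #include <stdio.h>

    struct vcpu_params { long budget_ms, period_ms; };

    int main(void)
    {
        struct vcpu_params a = { 10, 20 };  /* budget 10ms, period 20ms */
        struct vcpu_params b = {  2,  8 };  /* budget  2ms, period  8ms */

        /* Both consume half a physical CPU on average... */
        printf("utilization: a = %.2f, b = %.2f\n",
               (double)a.budget_ms / a.period_ms,
               (double)b.budget_ms / b.period_ms);

        /* ...but under EDF the earlier deadline wins: b's first deadline
         * (8ms away) beats a's (20ms away), so b tends to react sooner. */
        printf("first deadline: a = %ldms, b = %ldms\n",
               a.period_ms, b.period_ms);
        return 0;
    }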

Therefore, a domain's real-time performance is based on the parameters of
each VCPU of this domain.
Hence, users need to be able to set and get each vcpu's parameters of a
domain.

This gEDF scheduler is different from the credit and credit2 schedulers.
The existing credit and credit2 schedulers account the credit for each
domain, instead of for each vcpu; that's why they set parameters per domain
instead of per vcpu.

As I recall, we had a discussion on this question in the mailing list
after the first RFC patch of this rt scheduler was released. We agreed that
the real-time scheduler should support setting and getting each vcpu's
parameters. :-)


>
> > +            uint16_t nr_vcpus;
> > +            /* set one vcpu's params */
> > +            uint16_t vcpu_index;
> > +            uint16_t padding[2];
> > +            uint64_t period;
> > +            uint64_t budget;
>
> Are values overflowing 32 bits here really useful/meaningful?
>

We allow the period and budget to be at most 31536000000000 (which is one
year in microseconds) in libxl.c. 31536000000000 is larger than 2^32 =
4294967296, so we have to use a 64-bit type here for the period and budget.

In addition, this is consistent with the period and budget type s_time_t
in the kernel space. In the kernel space (sched_rt.c), we represent the
period and budget as s_time_t, which is signed 64-bit, so we use
uint64_t for the period and budget here to avoid some type conversion.
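
For reference, a quick standalone check of the numbers involved (plain C,
not Xen code):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* one year in microseconds: 31536000000000, needs more than 32 bits */
        uint64_t one_year_us = 365ULL * 24 * 3600 * 1000000;

        printf("one year   = %llu us\n", (unsigned long long)one_year_us);
        printf("UINT32_MAX = %llu us (about %.2f hours)\n",
               (unsigned long long)UINT32_MAX,
               (double)UINT32_MAX / 3600e6);
        return 0;
    }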

Thank you very much for your time, comment and advice in this patch!

Best,

Meng



-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-27  2:07     ` Meng Xu
@ 2014-08-27  6:26       ` Jan Beulich
  2014-08-27 14:28         ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2014-08-27  6:26 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Chao Wang, Chong Li, Dagaen Golomb

>>> On 27.08.14 at 04:07, <xumengpanda@gmail.com> wrote:
>> > +            /* get vcpus' params */
>> > +            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
>>
>> Why does this need to be a handle? Do you permit setting these
>> to different values for different vCPU-s? Considering that other
>> schedulers don't do this, why does yours need to?
>>
> 
> ​Yes, we need a handler here to get each vcpu's parameters of a domain.
> 
> let me explain why we need to set and get the parameters of "each" vcpu:
> 1) A VCPU is the basic scheduling and accounting unit in the Global
> Earliest Deadline First (gEDF) scheduling. We account the budget
> consumption for each vcpu instead of each domain, while the credit or
> credit2 scheduler account the credit consumption for each domain.
> 2) Based on the Global Earliest Deadline First (gEDF) scheduling theory,
> each vcpu's parameter will be used to decide the scheduling sequence of
> these vcpus.  Two vcpus with same utilization but different period and
>  budget can be scheduled differently. For example, the vcpu with budget
> 10ms and period 20ms is less responsive than the vcpu with budget 2ms and
> period 8ms, although they have the same utilization 0.5.
> 
> Therefore, a domain's real-time performance is based on the parameters of
> each VCPU of this domain.
> Hence, users need to be able to set and get each vcpu's parameters of a
> domain.
> 
> This gEDF scheduler is different from the credit and credit2 schedulers.
> The existing credit and credit2 scheduler account the credit for each
> domain, instead of each vcpu, that's why they set parameter per domain
> instead of per vcpu.

Parameter setting and accounting aren't tied together, and both
credit schedulers account on a per-vCPU basis afaict. Hence this
doesn't really answer the question.

> In my memory, we had such discussion on this question in the mailing list
> after the first RFC patch of this rt scheduler was released. We agreed that
> the real-time scheduler should supports setting and getting each vcpu's
> parameters. :-)

If so, can you point me to the specific mails rather than have me go
dig for them?

>> > +            uint16_t nr_vcpus;
>> > +            /* set one vcpu's params */
>> > +            uint16_t vcpu_index;
>> > +            uint16_t padding[2];
>> > +            uint64_t period;
>> > +            uint64_t budget;
>>
>> Are values overflowing 32 bits here really useful/meaningful?
>>
> 
> We allow the period and budget to be at most 31536000000000 (which is one
> year in microsecond) in the libxl.c. 31536000000000 is larger than 2^32
> =4294967296. So we have to use 64bit type here for period and budget.
> 
> ​In addition, This is consistent with the period and budget type s_time_t
> in the kernel space. In the kernel space (sched_rt.c), we represent the
> period and budget in the type s_time_t, which is signed 64bit. So we use
> the uint​64_t for period and budget here to avoid some type conversion.

Neither of this answers the question: Is this really a useful value
range?

Jan


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-27  6:26       ` Jan Beulich
@ 2014-08-27 14:28         ` Meng Xu
  2014-08-27 15:04           ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-08-27 14:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Chao Wang, Chong Li, Dagaen Golomb



Hi Jan,


2014-08-27 2:26 GMT-04:00 Jan Beulich <JBeulich@suse.com>:

> >>> On 27.08.14 at 04:07, <xumengpanda@gmail.com> wrote:
> >> > +            /* get vcpus' params */
> >> > +            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
> >>
> >> Why does this need to be a handle? Do you permit setting these
> >> to different values for different vCPU-s? Considering that other
> >> schedulers don't do this, why does yours need to?
> >>
> >
> > Yes, we need a handler here to get each vcpu's parameters of a domain.
> >
> > let me explain why we need to set and get the parameters of "each" vcpu:
> > 1) A VCPU is the basic scheduling and accounting unit in the Global
> > Earliest Deadline First (gEDF) scheduling. We account the budget
> > consumption for each vcpu instead of each domain, while the credit or
> > credit2 scheduler account the credit consumption for each domain.
> > 2) Based on the Global Earliest Deadline First (gEDF) scheduling theory,
> > each vcpu's parameter will be used to decide the scheduling sequence of
> > these vcpus.  Two vcpus with same utilization but different period and
> >  budget can be scheduled differently. For example, the vcpu with budget
> > 10ms and period 20ms is less responsive than the vcpu with budget 2ms and
> > period 8ms, although they have the same utilization 0.5.
> >
> > Therefore, a domain's real-time performance is based on the parameters of
> > each VCPU of this domain.
> > Hence, users need to be able to set and get each vcpu's parameters of a
> > domain.
> >
> > This gEDF scheduler is different from the credit and credit2 schedulers.
> > The existing credit and credit2 scheduler account the credit for each
> > domain, instead of each vcpu, that's why they set parameter per domain
> > instead of per vcpu.
>
> Parameter setting and accounting aren't tied together, and both
> credit schedulers account on a per-vCPU basis afaict. Hence this
> doesn't really answer the question.
>

Let me explain in another, shorter and clearer way.

Because each vcpu's parameters can affect the scheduling sequence and thus
affect the real-time performance of a domain, users may want to know what
the parameters of each vcpu of each domain are, so that they can have an
intuition of how the vcpus will be scheduled. (Do you agree? :-))
Users may also need to set each VCPU's parameters of a domain to achieve
their desired real-time performance for this domain. After they set a
vcpu's parameters of a domain, they need a way to check the new
parameters of this vcpu of this domain, right?

Because of the above two scenarios, users need to know each vcpu's
parameters of a domain. So we need the handler to pass each vcpu's
parameters from kernel to userspace to show to users.

One thing to note is that: this handler is only used to get each vcpu's
parameters of a domain. We don't need this handler to set a vcpu's
parameter.



> > In my memory, we had such discussion on this question in the mailing list
> > after the first RFC patch of this rt scheduler was released. We agreed
> that
> > the real-time scheduler should supports setting and getting each vcpu's
> > parameters. :-)
>
> If so, can you point me to the specific mails rather than have me go
> dig for them?
>

Sure! My bad.

We had a long discussion of the design of this functionality of getting
each vcpu's parameters. It's here:
http://www.gossamer-threads.com/lists/xen/devel/339146

Another thread that discusses the interface for improved SEDF also
discusses the idea of getting/setting each vcpu's parameters for a
real-time scheduler.  This rt scheduler is supposed to replace the existing
SEDF scheduler.

Here is the link to this thread:
http://www.gossamer-threads.com/lists/xen/devel/339056

I extract the interesting part related to this question, quoting Dario:

"I don't
think the renaming+SEDF deprecation should happen until proper SMP
support is implemented, and probably also not until support for per-VCPU
scheduling parameters (quite important for an advanced real-time
scheduling solution) is there."

"The problems SEDF has are:
1. it has really really really poor SMP support
2. it does not allow to specify scheduling parameters on a per-VCPU
basis, but only on a domain basis. This is fine for general purpose
schedulers, but can be quite important in real-time workloads "

Please let me know if you have further questions. Maybe Dario could also
give more insight on this, later. :-) 


> >> > +            uint16_t nr_vcpus;
> >> > +            /* set one vcpu's params */
> >> > +            uint16_t vcpu_index;
> >> > +            uint16_t padding[2];
> >> > +            uint64_t period;
> >> > +            uint64_t budget;
> >>
> >> Are values overflowing 32 bits here really useful/meaningful?
> >>
> >
> > W
> > e allow the period and budget to be at most 31536000000000 (which is one
> > year in microsecond) in the libxl.c. 31536000000000 is larger than 2^32
> > =4294967296. So we have to use 64bit type here for period and budget.
> >
> > In addition, This is consistent with the period and budget type s_time_t
> > in the kernel space. In the kernel space (sched_rt.c), we represent the
> > period and budget in the type s_time_t, which is signed 64bit. So we use
> > the uint64_t for period and budget here to avoid some type conversion.
>
> Neither of this answers the question: Is this really a useful value
> range?
>

I see the issue. Is 31536000000000 a good upper bound for the period and
budget?
Actually, I'm not sure. This totally depends on users' requirements.
4294967296us = 1.19 hours. I'm not sure whether 1.19 hours is long enough
for real-time applications.
If it's enough, I can definitely change the type from uint64 to uint32.
Do you have any suggestion of how we can get a proper upper bound for the
period and budget?

Thank you very much!

Best,

Meng


-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-27 14:28         ` Meng Xu
@ 2014-08-27 15:04           ` Jan Beulich
  2014-08-28 16:06             ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2014-08-27 15:04 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Chao Wang, Chong Li, Dagaen Golomb

>>> On 27.08.14 at 16:28, <xumengpanda@gmail.com> wrote:
> Because each vcpu's parameters can affect the scheduling sequence and thus
> affect the real-time performance of a domain, users may want to know what
> is the parameters of each vcpu of each domain so that they can have an
> intuition of how the vcpus will be scheduled.  (Do you agree? :-))

I agree, but that isn't a specific property of this scheduler, just
one of its specific intended uses (RT).

> Users may also need to set each VCPU's parameter of a domain to achieve
> their desired real-time performance for this domain. After they set a
> vcpu's parameter of a domain, they need to have a way to check the new
> parameters of this vcpu of this domain. right?

Of course.

> We had a long discussion of the design of this functionality of getting
> each vcpu's parameters. It's here:
> http://www.gossamer-threads.com/lists/xen/devel/339146 

This thread doesn't discuss this at all - Dario takes it as given in
his first reply: "My view here is, since per-VCPU scheduling
parameters are important for this scheduler, ..."

> Another thread that discusses the interface for improved SEDF also
> discusses the idea of getting/setting each vcpu's parameters for a
> real-time scheduler.  This rt scheduler is supposed to replace the existing
> SEDF scheduler.
> 
> Here is the link to this thread:
> http://www.gossamer-threads.com/lists/xen/devel/339056 

Again this one more states than discusses the need.

> I extract the interesting part related to this question:
> Quote from Dario
> :
> 
> 
> "I don't
> think the renaming+SEDF deprecation should happen until proper SMP
> support is implemented, and probably also not until support for per-VCPU
> scheduling parameters (quite important for an advanced real-time
> scheduling solution) is there."

That doesn't directly relate to the question I raised, it's more like a
follow-up assuming that per-vCPU parameters are needed here,
but not in other schedulers.

So bottom line - While I realize that RT may desire more fine
grained control, I still don't see why such wouldn't be applicable
uniformly to all schedulers.

>> >> > +            uint16_t nr_vcpus;
>> >> > +            /* set one vcpu's params */
>> >> > +            uint16_t vcpu_index;
>> >> > +            uint16_t padding[2];
>> >> > +            uint64_t period;
>> >> > +            uint64_t budget;
>> >>
>> >> Are values overflowing 32 bits here really useful/meaningful?
>> >>
>> >
>> > W
>> > e allow the period and budget to be at most 31536000000000 (which is one
>> > year in microsecond) in the libxl.c. 31536000000000 is larger than 2^32
>> > =4294967296. So we have to use 64bit type here for period and budget.
>> >
>> > In addition, This is consistent with the period and budget type s_time_t
>> > in the kernel space. In the kernel space (sched_rt.c), we represent the
>> > period and budget in the type s_time_t, which is signed 64bit. So we use
>> > the uint64_t for period and budget here to avoid some type conversion.
>>
>> Neither of this answers the question: Is this really a useful value
>> range?
>>
> 
> I see the issue. Is 31536000000000 a good upper bound for period and
> budget?
> Actually, I'm not sure. This totally depends on users' requirement.
> 4294967296us = 1.19hour. I'm not sure 1.19hour should be long enough for
> real-time applications?
> If it's enough, I can definitely change the type from uint64 to uint32.
> Do you have any suggestion of how we can get a proper upper bound for
> period and budget?

I think anything going into the seconds, not to speak of minutes or
hours, range is already beyond boundaries of being reasonable /
useful.

Jan

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-27 15:04           ` Jan Beulich
@ 2014-08-28 16:06             ` Meng Xu
  2014-08-29  9:05               ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-08-28 16:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Dario Faggioli, Ian Jackson, xen-devel,
	Linh Thi Xuan Phan, Meng Xu, Chao Wang, Chong Li, Dagaen Golomb



Hi Jan,

2014-08-27 11:04 GMT-04:00 Jan Beulich <JBeulich@suse.com>:

> >>> On 27.08.14 at 16:28, <xumengpanda@gmail.com> wrote:
> > Because each vcpu's parameters can affect the scheduling sequence and
> thus
> > affect the real-time performance of a domain, users may want to know what
> > is the parameters of each vcpu of each domain so that they can have an
> > intuition of how the vcpus will be scheduled.  (Do you agree? :-))
>
> I agree, but that isn't a specific property of this scheduler, just
> one of it's specific intended use (RT).
>
> > Users may also need to set each VCPU's parameter of a domain to achieve
> > their desired real-time performance for this domain. After they set a
> > vcpu's parameter of a domain, they need to have a way to check the new
> > parameters of this vcpu of this domain. right?
>
> Of course.
>
> > We had a long discussion of the design of this functionality of getting
> > each vcpu's parameters. It's here:
> > http://www.gossamer-threads.com/lists/xen/devel/339146
>
> This thread doesn't discuss this at all - Dario takes it for given in
> his first reply: "My view here is, since per-VCPU scheduling
> parameters are important for this scheduler, ..."
>
> > Another thread that discusses the interface for improved SEDF also
> > discusses the idea of getting/setting each vcpu's parameters for a
> > real-time scheduler.  This rt scheduler is supposed to replace the
> existing
> > SEDF scheduler.
> >
> > Here is the link to this thread:
> > http://www.gossamer-threads.com/lists/xen/devel/339056
>
> Again this one more states than discusses the need.
>
> > I extract the interesting part related to this question:
> > Quote from Dario
> > :
> >
> >
> > "I don't
> > think the renaming+SEDF deprecation should happen until proper SMP
> > support is implemented, and probably also not until support for per-VCPU
> > scheduling parameters (quite important for an advanced real-time
> > scheduling solution) is there."
>
> That doesn't directly relate to the question I raised, it's more like a
> follow-up assuming that per-vCPU parameters are needed here,
> but not in other schedulers.
>
> So bottom line - While I realize that RT may desire more fine
> grained control, I still don't see why such wouldn't be applicable
> uniformly to all schedulers.
>

Yes, RT needs more fine-grained control: it needs users to be able to
set/get each VCPU's parameters. (This states that the RT scheduler needs
such functionality for setting/getting each VCPU's parameters.)

As to your concern "I still don't see why such wouldn't be
applicable uniformly to all schedulers.", are you suggesting that the
credit and credit2 schedulers could also allow users to set/get each VCPU's
parameters? (I think that could be possible, but this should be a design
decision made by the credit and credit2 scheduler developers?)



>
> >> >> > +            uint16_t nr_vcpus;
> >> >> > +            /* set one vcpu's params */
> >> >> > +            uint16_t vcpu_index;
> >> >> > +            uint16_t padding[2];
> >> >> > +            uint64_t period;
> >> >> > +            uint64_t budget;
> >> >>
> >> >> Are values overflowing 32 bits here really useful/meaningful?
> >> >>
> >> >
> >> > W
> >> > e allow the period and budget to be at most 31536000000000 (which is
> one
> >> > year in microsecond) in the libxl.c. 31536000000000 is larger than
> 2^32
> >> > =4294967296. So we have to use 64bit type here for period and budget.
> >> >
> >> > In addition, This is consistent with the period and budget type
> s_time_t
> >> > in the kernel space. In the kernel space (sched_rt.c), we represent
> the
> >> > period and budget in the type s_time_t, which is signed 64bit. So we
> use
> >> > the uint64_t for period and budget here to avoid some type conversion.
> >>
> >> Neither of this answers the question: Is this really a useful value
> >> range?
> >>
> >
> > I see the issue. Is 31536000000000 a good upper bound for period and
> > budget?
> > Actually, I'm not sure; it totally depends on users' requirements.
> > 4294967296 us ~= 1.19 hours. I'm not sure whether 1.19 hours is long
> > enough for real-time applications.
> > If it's enough, I can definitely change the type from uint64 to uint32.
> > Do you have any suggestion of how we can get a proper upper bound for
> > period and budget?
>
> I think anything going into the seconds, not to speak of minutes or
> hours, range is already beyond boundaries of being reasonable /
> useful.
>

Hmm, that's fair. If we want to limit the range to seconds, then uint32 is
enough. However, what if some user really wants to set the period/budget to
hours/days? Then we couldn't support it with a uint32 type for
period/budget. (This is just a minor concern; I'm not arguing that we have
to use uint64 or uint32. As long as everyone agrees on a type, I can change
it to the agreed type very quickly.) Do you have any suggestion for how we
can reach an agreement and finalize the range?
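
(For reference: 2^32 us = 4294967296 us ~= 71.6 minutes ~= 1.19 hours,
while one year is 31536000000000 us ~= 2^44.8 us, so the one-year bound
only fits in 64 bits.)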


​Thank you very much for your time and help in this matter!

Best,

Meng​



-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 7580 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-28 16:06             ` Meng Xu
@ 2014-08-29  9:05               ` Jan Beulich
  2014-08-29 19:35                 ` Meng Xu
  2014-09-03 14:08                 ` George Dunlap
  0 siblings, 2 replies; 72+ messages in thread
From: Jan Beulich @ 2014-08-29  9:05 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Dario Faggioli, Ian Jackson, xen-devel,
	Linh Thi Xuan Phan, Meng Xu, Chao Wang, Chong Li, Dagaen Golomb

>>> On 28.08.14 at 18:06, <xumengpanda@gmail.com> wrote:
> As to your concern "I still don't see why such wouldn't be
> applicable uniformly to all schedulers.", are you suggesting that the
> credit and credit2 schedulers could also allow users to set/get each VCPU's
> parameters? (I think that could be possible, but it should be a design
> decision made by the credit and credit2 scheduler developers.)

Perhaps. But I think a more uniform interface to the individual
schedulers would help on the tools side too. But in the end I'm
just raising questions/concerns here, it's George who needs to
be okay with your approach.

>> I think anything going into the seconds, not to speak of minutes or
>> hours, range is already beyond boundaries of being reasonable /
>> useful.
> 
> ​Hmm, that's fair. If we want to limit the range to seconds, then uint32 is
> enough. However, what if some user really want to set the period/budget to
> hours/days? Then we couldn't support it if we use uint32 type for
> period/budget. (This is just my subtle concern. I'm not arguing that we
> have to use uint64 or uint32. As long as everyone agrees with a type, I can
> just change it to the agreed type very quickly.) Do you have any suggestion
> of how we can reach an agreement and get the range finalized? ​

Let's see what others, mainly George and Dario, think.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-29  9:05               ` Jan Beulich
@ 2014-08-29 19:35                 ` Meng Xu
  2014-09-03 14:08                 ` George Dunlap
  1 sibling, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-08-29 19:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Dario Faggioli, Ian Jackson, xen-devel,
	Linh Thi Xuan Phan, Meng Xu, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 3064 bytes --]

Hi Jan,


2014-08-29 5:05 GMT-04:00 Jan Beulich <JBeulich@suse.com>:

> >>> On 28.08.14 at 18:06, <xumengpanda@gmail.com> wrote:
> > As to your concern "I still don't see why such wouldn't be
> > applicable uniformly to all schedulers.", are you suggesting that the
> > credit and credit2 schedulers could also allow users to set/get each
> > VCPU's parameters? (I think that could be possible, but it should be a
> > design decision made by the credit and credit2 scheduler developers.)
>
> Perhaps. But I think a more uniform interface to the individual
> schedulers would help on the tools side too. But in the end I'm
> just raising questions/concerns here, it's George who needs to
> be okay with your approach.
>

​Thank you very much for your questions and suggestions!

BTW, the ARINC653 scheduler in Xen also specifies each VCPU's parameters.
It's just FYI. :-)

In xen/include/public/sysctl.h, each vcpu's parameters are specified in the
sched_entries[] array:

/*
 * This structure is used to pass a new ARINC653 schedule from a
 * privileged domain (ie dom0) to Xen.
 */
struct xen_sysctl_arinc653_schedule {
    /* major_frame holds the time for the new schedule's major frame
     * in nanoseconds. */
    uint64_aligned_t     major_frame;
    /* num_sched_entries holds how many of the entries in the
     * sched_entries[] array are valid. */
    uint8_t     num_sched_entries;
    /* The sched_entries array holds the actual schedule entries. */
    struct {
        /* dom_handle must match a domain's UUID */
        xen_domain_handle_t dom_handle;
        /* If a domain has multiple VCPUs, vcpu_id specifies which one
         * this schedule entry applies to. It should be set to 0 if
         * there is only one VCPU for the domain. */
        unsigned int vcpu_id;
        /* runtime specifies the amount of time that should be allocated
         * to this VCPU per major frame. It is specified in nanoseconds */
        uint64_aligned_t runtime;
    } sched_entries[ARINC653_MAX_DOMAINS_PER_SCHEDULE];
};


>
> >> I think anything going into the seconds, not to speak of minutes or
> >> hours, range is already beyond boundaries of being reasonable /
> >> useful.
> >
> > Hmm, that's fair. If we want to limit the range to seconds, then uint32 is
> > enough. However, what if some user really wants to set the period/budget to
> > hours/days? Then we couldn't support it with a uint32 type for
> > period/budget. (This is just a minor concern; I'm not arguing that we have
> > to use uint64 or uint32. As long as everyone agrees on a type, I can change
> > it to the agreed type very quickly.) Do you have any suggestion for how we
> > can reach an agreement and finalize the range?
>
> Let's see what others, mainly George and Dario, think.
>

​Sure! Thanks!

Meng​


-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 5086 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
  2014-08-26 14:27   ` Jan Beulich
@ 2014-09-03 13:40   ` George Dunlap
  2014-09-03 14:11     ` Meng Xu
  2014-09-05  9:46     ` Dario Faggioli
  2014-09-03 14:20   ` George Dunlap
  2014-09-05 17:17   ` Dario Faggioli
  3 siblings, 2 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-03 13:40 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, chaowang, Chong Li,
	dgolomb

On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> This scheduler follows the pre-emptive Global EDF theory in real-time field.
> Each VCPU can have a dedicated period and budget.
> While scheduled, a VCPU burns its budget.
> A VCPU has its budget replenished at the beginning of each of its periods;
> The VCPU discards its unused budget at the end of each of its periods.
> If a VCPU runs out of budget in a period, it has to wait until next period.
> The mechanism of how to burn a VCPU's budget depends on the server mechanism
> implemented for each VCPU.
>
> Server mechanism: a VCPU is implemented as a deferrable server.
> When a VCPU is scheduled to execute on a PCPU, its budget is continuously
> burned.
>
> Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
> At any scheduling point, the VCPU with earliest deadline has highest
> priority.
>
> Queue scheme: A global Runqueue for each CPU pool.
> The Runqueue holds all runnable VCPUs.
> VCPUs in the Runqueue are divided into two parts: with and without budget.
> At each part, VCPUs are sorted based on gEDF priority scheme.
>
> Scheduling quantum: 1 ms;
>
> Note: cpumask and cpupool are supported.
>
> This is still in the development phase.

You should probably take this out now that you've removed the RFC. :-)

I'm just doing a first pass, so just a few quick comments to begin with.

> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 55503e0..7d2c6d1 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -69,6 +69,7 @@ static const struct scheduler *schedulers[] = {
>      &sched_credit_def,
>      &sched_credit2_def,
>      &sched_arinc653_def,
> +    &sched_rt_def,
>  };
>
>  static struct scheduler __read_mostly ops;
> @@ -1090,7 +1091,8 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
>
>      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
>           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getnumvcpus)) )

Why are you introducing this as a schedop?  Isn't this information
already exposed in getdomaininfo?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-29  9:05               ` Jan Beulich
  2014-08-29 19:35                 ` Meng Xu
@ 2014-09-03 14:08                 ` George Dunlap
  2014-09-03 14:24                   ` Meng Xu
  1 sibling, 1 reply; 72+ messages in thread
From: George Dunlap @ 2014-09-03 14:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Meng Xu, Chao Wang, Chong Li, Dagaen Golomb

On Fri, Aug 29, 2014 at 10:05 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 28.08.14 at 18:06, <xumengpanda@gmail.com> wrote:
>> As to your concern "I still don't see why such wouldn't be
>> applicable uniformly to all schedulers.", are you suggesting that the
>> credit and credit2 scheduler could also allow users to set/get each VCPU's
>> parameters? (I think that could be possible but this should be some design
>> decision made by credit and credit2 scheduler developer?)
>
> Perhaps. But I think a more uniform interface to the individual
> schedulers would help on the tools side too. But in the end I'm
> just raising questions/concerns here, it's George who needs to
> be okay with your approach.

Well the domctls aren't part of the stable ABI, so we can always
refactor things if we want to.

One could imagine restructuring the hypercall so that you could
specify a list of vcpus at the top level; but given that none of the
other schedulers actually want that at the moment, it seems like a lot
of faff for no good reason.

>
>>> I think anything going into the seconds, not to speak of minutes or
>>> hours, range is already beyond boundaries of being reasonable /
>>> useful.
>>
>> Hmm, that's fair. If we want to limit the range to seconds, then uint32 is
>> enough. However, what if some user really want to set the period/budget to
>> hours/days? Then we couldn't support it if we use uint32 type for
>> period/budget. (This is just my subtle concern. I'm not arguing that we
>> have to use uint64 or uint32. As long as everyone agrees with a type, I can
>> just change it to the agreed type very quickly.) Do you have any suggestion
>> of how we can reach an agreement and get the range finalized?
>
> Let's see what others, mainly George and Dario, think.

Under what circumstances would anyone want a period of an hour?  At
this point we're talking about a vcpu being allowed to run
continuously without pre-emption for half an hour, and then not
*allowed* to run at all for another half hour.  That just seems really
ridiculous.

 -George

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 13:40   ` George Dunlap
@ 2014-09-03 14:11     ` Meng Xu
  2014-09-03 14:15       ` George Dunlap
  2014-09-05  9:46     ` Dario Faggioli
  1 sibling, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-03 14:11 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 3930 bytes --]

Hi George,


2014-09-03 9:40 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> > This scheduler follows the pre-emptive Global EDF theory in real-time
> field.
> > Each VCPU can have a dedicated period and budget.
> > While scheduled, a VCPU burns its budget.
> > A VCPU has its budget replenished at the beginning of each of its
> periods;
> > The VCPU discards its unused budget at the end of each of its periods.
> > If a VCPU runs out of budget in a period, it has to wait until next
> period.
> > The mechanism of how to burn a VCPU's budget depends on the server
> mechanism
> > implemented for each VCPU.
> >
> > Server mechanism: a VCPU is implemented as a deferable server.
> > When a VCPU is scheduled to execute on a PCPU, its budget is continuously
> > burned.
> >
> > Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
> > At any scheduling point, the VCPU with earliest deadline has highest
> > priority.
> >
> > Queue scheme: A global Runqueue for each CPU pool.
> > The Runqueue holds all runnable VCPUs.
> > VCPUs in the Runqueue are divided into two parts: with and without
> budget.
> > At each part, VCPUs are sorted based on gEDF priority scheme.
> >
> > Scheduling quantum: 1 ms;
> >
> > Note: cpumask and cpupool is supported.
> >
> > This is still in the development phase.
>
> You should probably take this out now that you've removed the RFC. :-)
>
>
Ditched now. Thanks!



> I'm just doing a first pass, so just a few quick comments to begin with.
>

​Thank you very much for your review! :-)​


> > diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> > index 55503e0..7d2c6d1 100644
> > --- a/xen/common/schedule.c
> > +++ b/xen/common/schedule.c
> > @@ -69,6 +69,7 @@ static const struct scheduler *schedulers[] = {
> >      &sched_credit_def,
> >      &sched_credit2_def,
> >      &sched_arinc653_def,
> > +    &sched_rt_def,
> >  };
> >
> >  static struct scheduler __read_mostly ops;
> > @@ -1090,7 +1091,8 @@ long sched_adjust(struct domain *d, struct
> xen_domctl_scheduler_op *op)
> >
> >      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
> >           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> > -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getnumvcpus)) )
>
> Why are you introducing this as a schedop?  Isn't this information
> already exposed in getdomaininfo?


I introduce XEN_DOMCTL_SCHEDOP_getnumvcpus as a schedop because we need to
know the number of vcpus a domain has when the tool stack wants to display
the parameters of EACH vcpu.

I think the operation you meant in getdomaininfo is XEN_DOMCTL_max_vcpus
(in xen/common/domctl.c)? If so, that operation sets the maximum number of
vcpus for a domain instead of getting the number of vcpus the domain has,
so I don't think I can reuse XEN_DOMCTL_max_vcpus here.

The detailed reason why I need the number of vcpus a domain has is as
follows:
When the tool stack (command xl sched-rt -d domain) displays the parameters
of EACH vcpu, it allocates an array of size "sizeof(struct
xen_domctl_sched_rt_params) * num_vcpus_of_this_domain" and bounces this
array to the hypervisor. After the hypervisor fills in the parameters of
each vcpu, the array is bounced back out to the tool stack for display.

In order to know how large this array should be, we need to know the number
of vcpus this domain has.
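
Roughly, the tool-stack side of this is the following sketch (error
handling trimmed; xc_sched_rt_domain_get_num_vcpus() and
xc_sched_rt_domain_get() are the libxc helpers added later in this series):

    uint16_t num_vcpus;
    struct xen_domctl_sched_rt_params *sdom;

    xc_sched_rt_domain_get_num_vcpus(xch, domid, &num_vcpus);
    sdom = calloc(num_vcpus, sizeof(*sdom));        /* array bounced to Xen */
    xc_sched_rt_domain_get(xch, domid, sdom, num_vcpus);
    /* sdom[i].period / sdom[i].budget now hold vcpu i's parameters */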

Please let me know if you have any other concerns or questions. :-)

Thank you very much!

Best,

​Meng​


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 6342 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:11     ` Meng Xu
@ 2014-09-03 14:15       ` George Dunlap
  2014-09-03 14:35         ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: George Dunlap @ 2014-09-03 14:15 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb

On Wed, Sep 3, 2014 at 3:11 PM, Meng Xu <xumengpanda@gmail.com> wrote:
>> > diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> > index 55503e0..7d2c6d1 100644
>> > --- a/xen/common/schedule.c
>> > +++ b/xen/common/schedule.c
>> > @@ -69,6 +69,7 @@ static const struct scheduler *schedulers[] = {
>> >      &sched_credit_def,
>> >      &sched_credit2_def,
>> >      &sched_arinc653_def,
>> > +    &sched_rt_def,
>> >  };
>> >
>> >  static struct scheduler __read_mostly ops;
>> > @@ -1090,7 +1091,8 @@ long sched_adjust(struct domain *d, struct
>> > xen_domctl_scheduler_op *op)
>> >
>> >      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
>> >           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
>> > -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
>> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
>> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getnumvcpus)) )
>>
>> Why are you introducing this as a schedop?  Isn't this information
>> already exposed in getdomaininfo?
>
>
> I introduce XEN_DOMCTL_SCHEDOP_getnumvcpus as a schedop because we need to
> know the number of vcpus a domain has when the tool stack wants to display
> the parameters of EACH vcpu.
>
> I think the operation you meant in getdomaininfo is XEN_DOMCTL_max_vcpus (in
> file xen/common/domctl.c)?

No, the operation I had in mind was XEN_DOMCTL_getdomaininfo, which
will give you nr_online_vcpus.

> When the tool stack (command xl sched-rt -d domain) displays the parameters
> of EACH vcpu, the tool stack will allocate an array whose size is
> "sizeof(struct xen_domctl_sched_rt_params) * num_vcpus_of_this_domain" and
> bounce this array to the hypervisor. After hypervisor fills out the
> parameters of each vcpu, this array will be bounced out to tool stack to
> display to users.

Sure, there are lots of operations the toolstack wants that needs the
number of vcpus, which is why it's already exposed with
XEN_DOMCTL_getdomaininfo. :-)

 -George

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
  2014-08-26 14:27   ` Jan Beulich
  2014-09-03 13:40   ` George Dunlap
@ 2014-09-03 14:20   ` George Dunlap
  2014-09-03 14:45     ` Jan Beulich
                       ` (2 more replies)
  2014-09-05 17:17   ` Dario Faggioli
  3 siblings, 3 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-03 14:20 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 5b11bbf..27d01c1 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
>  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>
> +/*
> + * This structure is used to pass to rt scheduler from a
> + * privileged domain to Xen
> + */
> +struct xen_domctl_sched_rt_params {
> +    /* get vcpus' info */
> +    uint64_t period; /* s_time_t type */
> +    uint64_t budget;
> +    uint16_t index;
> +    uint16_t padding[3];

Why the padding?

> +};
> +typedef struct xen_domctl_sched_rt_params xen_domctl_sched_rt_params_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_sched_rt_params_t);
>
>  /* XEN_DOMCTL_scheduler_op */
>  /* Scheduler types. */
> @@ -346,9 +359,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>  #define XEN_SCHEDULER_CREDIT   5
>  #define XEN_SCHEDULER_CREDIT2  6
>  #define XEN_SCHEDULER_ARINC653 7
> +#define XEN_SCHEDULER_RT_DS    8
> +
>  /* Set or get info? */
> -#define XEN_DOMCTL_SCHEDOP_putinfo 0
> -#define XEN_DOMCTL_SCHEDOP_getinfo 1
> +#define XEN_DOMCTL_SCHEDOP_putinfo      0
> +#define XEN_DOMCTL_SCHEDOP_getinfo      1
> +#define XEN_DOMCTL_SCHEDOP_getnumvcpus   2
>  struct xen_domctl_scheduler_op {
>      uint32_t sched_id;  /* XEN_SCHEDULER_* */
>      uint32_t cmd;       /* XEN_DOMCTL_SCHEDOP_* */
> @@ -367,6 +383,16 @@ struct xen_domctl_scheduler_op {
>          struct xen_domctl_sched_credit2 {
>              uint16_t weight;
>          } credit2;
> +        struct xen_domctl_sched_rt{
> +            /* get vcpus' params */
> +            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
> +            uint16_t nr_vcpus;
> +            /* set one vcpu's params */
> +            uint16_t vcpu_index;
> +            uint16_t padding[2];

And again, why the padding?  This isn't a performance-critical bit of
code: you can safely let the compiler deal with adding padding to the
structure or managing mis-aligned reads.  Or if it really matters to
you, you can re-order the elements of the array so that they're
aligned naturally (e.g., by putting period and budget before
nr_vcpus).

> +            uint64_t period;
> +            uint64_t budget;
> +        } rt;

So if I'm reading this right, you set the information for vcpus one
vcpu at a time, but you want to read the whole lot out all at once?

I don't like the inconsistency.  It would be better if you did the
same thing each direction:  Either pass in an array with info about
the vcpus, or just read the vcpu information one-by-one.

 -George

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:08                 ` George Dunlap
@ 2014-09-03 14:24                   ` Meng Xu
  2014-09-03 14:35                     ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-03 14:24 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 3197 bytes --]

2014-09-03 10:08 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Fri, Aug 29, 2014 at 10:05 AM, Jan Beulich <JBeulich@suse.com> wrote:
> >>>> On 28.08.14 at 18:06, <xumengpanda@gmail.com> wrote:
> >> As to your concern "I still don't see why such wouldn't be
> >> applicable uniformly to all schedulers.", are you suggesting that the
> >> credit and credit2 schedulers could also allow users to set/get each
> >> VCPU's parameters? (I think that could be possible, but it should be a
> >> design decision made by the credit and credit2 scheduler developers.)
> >
> > Perhaps. But I think a more uniform interface to the individual
> > schedulers would help on the tools side too. But in the end I'm
> > just raising questions/concerns here, it's George who needs to
> > be okay with your approach.
>
> Well the domctls aren't part of the stable ABI, so we can always
> refactor things if we want to.
>
> One could imagine restructuring the hypercall so that you could
> specify a list of vcpus at the top level; but given that none of the
> other schedulers actually want that at the moment, it seems like a lot
> of faff for no good reason.
>
> >
> >>> I think anything going into the seconds, not to speak of minutes or
> >>> hours, range is already beyond boundaries of being reasonable /
> >>> useful.
> >>
> >> Hmm, that's fair. If we want to limit the range to seconds, then uint32 is
> >> enough. However, what if some user really wants to set the period/budget to
> >> hours/days? Then we couldn't support it with a uint32 type for
> >> period/budget. (This is just a minor concern; I'm not arguing that we have
> >> to use uint64 or uint32. As long as everyone agrees on a type, I can change
> >> it to the agreed type very quickly.) Do you have any suggestion for how we
> >> can reach an agreement and finalize the range?
> >
> > Let's see what others, mainly George and Dario, think.
>
> Under what circumstances would anyone want a period of an hour?  At
> this point we're talking about a vcpu being allowed to run
> continuously without pre-emption for half an hour, and then not
> *allowed* to run at all for another half hour.  That just seems really
> ridiculous.
>

Honestly speaking, I don't think setting period and budget to hours is a
good idea either.

Actually, if users want to run a vcpu for half an hour every hour, they
could also set it to run 5 milliseconds every 10 milliseconds. So any large
period and budget can be scaled down to a smaller period and budget, and
the domain's performance will not degrade when the period and budget of
each of its vcpus are scaled down this way.

How about this:
I change the type to uint32 and use 2^32 us ~= 1.19 hours as the upper
bound of period and budget.
I document the range of period and budget on the website and also let
users know how to scale down the parameters of vcpus in case they have a
very large period and budget.

What do you guys think?

Thanks,

Meng​



-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 5143 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:15       ` George Dunlap
@ 2014-09-03 14:35         ` Meng Xu
  0 siblings, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-03 14:35 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 2548 bytes --]

Hi George,


2014-09-03 10:15 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Wed, Sep 3, 2014 at 3:11 PM, Meng Xu <xumengpanda@gmail.com> wrote:
> >> > diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> >> > index 55503e0..7d2c6d1 100644
> >> > --- a/xen/common/schedule.c
> >> > +++ b/xen/common/schedule.c
> >> > @@ -69,6 +69,7 @@ static const struct scheduler *schedulers[] = {
> >> >      &sched_credit_def,
> >> >      &sched_credit2_def,
> >> >      &sched_arinc653_def,
> >> > +    &sched_rt_def,
> >> >  };
> >> >
> >> >  static struct scheduler __read_mostly ops;
> >> > @@ -1090,7 +1091,8 @@ long sched_adjust(struct domain *d, struct
> >> > xen_domctl_scheduler_op *op)
> >> >
> >> >      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
> >> >           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> >> > -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> >> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> >> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getnumvcpus)) )
> >>
> >> Why are you introducing this as a schedop?  Isn't this information
> >> already exposed in getdomaininfo?
> >
> >
> > I introduce XEN_DOMCTL_SCHEDOP_getnumvcpus as a schedop because we need
> to
> > know the number of vcpus a domain has when the tool stack wants to
> display
> > the parameters of EACH vcpu.
> >
> > I think the operation you meant in getdomaininfo is XEN_DOMCTL_max_vcpus
> (in
> > file xen/common/domctl.c)?
>
> No, the operation I had in mind was XEN_DOMCTL_getdomaininfo, which
> will give you nr_online_vcpus.
>
> > When the tool stack (command xl sched-rt -d domain) displays the
> parameters
> > of EACH vcpu, the tool stack will allocate an array whose size is
> > "sizeof(struct xen_domctl_sched_rt_params) * num_vcpus_of_this_domain"
> and
> > bounce this array to the hypervisor. After hypervisor fills out the
> > parameters of each vcpu, this array will be bounced out to tool stack to
> > display to users.
>
> Sure, there are lots of operations the toolstack wants that needs the
> number of vcpus, which is why it's already exposed with
> XEN_DOMCTL_getdomaininfo. :-)


I see. Thank you very much for your information and advice! It's really
useful!

I will remove XEN_DOMCTL_SCHEDOP_getnumvcpus from the schedop interface and
use XEN_DOMCTL_getdomaininfo to get the number of vcpus instead! :-)

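Something along these lines on the libxc side should work (a rough sketch;
I'm assuming xc_domain_getinfo() and xc_dominfo_t behave as described in
xenctrl.h):

    xc_dominfo_t info;
    uint16_t num_vcpus = 0;

    /* XEN_DOMCTL_getdomaininfo via libxc; returns number of domains found */
    if ( xc_domain_getinfo(xch, domid, 1, &info) == 1 && info.domid == domid )
        num_vcpus = info.nr_online_vcpus;
    /* num_vcpus then sizes the xen_domctl_sched_rt_params array as before */
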
​Best,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 4084 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:24                   ` Meng Xu
@ 2014-09-03 14:35                     ` Dario Faggioli
  0 siblings, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-03 14:35 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 1961 bytes --]

On mer, 2014-09-03 at 10:24 -0400, Meng Xu wrote:

> 2014-09-03 10:08 GMT-04:00 George Dunlap
> <George.Dunlap@eu.citrix.com>:

>         Under what circumstances would anyone want a period of an
>         hour?  At this point we're talking about a vcpu being allowed
>         to run continuously without pre-emption for half an hour, and
>         then not *allowed* to run at all for another half hour.  That
>         just seems really ridiculous.
>
>
> Honestly speaking, I don't think setting period and budget to hours
> is a good idea either.
>
Indeed, it makes no sense to me either.

> How about this:
> I change the type to uint32 bit and use 2^32us ~= 1.19h as the upper
> bound of period and budget. 
>
Fine for me.

> ​I document the range of period and budget on the website and also let
> users know how to scale down the parameters of vcpus in case they have
> a very large period and budget.
> 
I guess something like that won't harm on any website (BTW, are you
talking about the RT-Xen website? The Xen wiki? Both? :-D). In the code, I
wouldn't "waste" too many lines of comments on this. I don't see any
special need for explaining the scaling down either; it's pretty
straightforward once you have even a very light knowledge of what the
scheduling algorithm does (the bare minimum you need to be able to use it!).

Just make sure that you specify, especially in the public headers of all
the various components (so Xen, libxc and libxl), what the time unit is;
that is _the_ important piece of information one needs to use the
scheduler properly.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:20   ` George Dunlap
@ 2014-09-03 14:45     ` Jan Beulich
  2014-09-03 14:59     ` Dario Faggioli
  2014-09-03 15:13     ` Meng Xu
  2 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2014-09-03 14:45 UTC (permalink / raw)
  To: Meng Xu, George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Chao Wang, Chong Li,
	Dagaen Golomb

>>> On 03.09.14 at 16:20, <George.Dunlap@eu.citrix.com> wrote:
> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> index 5b11bbf..27d01c1 100644
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
>>  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>>
>> +/*
>> + * This structure is used to pass to rt scheduler from a
>> + * privileged domain to Xen
>> + */
>> +struct xen_domctl_sched_rt_params {
>> +    /* get vcpus' info */
>> +    uint64_t period; /* s_time_t type */
>> +    uint64_t budget;
>> +    uint16_t index;
>> +    uint16_t padding[3];
> 
> Why the padding?

These structures must look the same for a 32-bit and 64-bit Dom0.

Jan

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:20   ` George Dunlap
  2014-09-03 14:45     ` Jan Beulich
@ 2014-09-03 14:59     ` Dario Faggioli
  2014-09-03 15:27       ` Meng Xu
  2014-09-03 15:13     ` Meng Xu
  2 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-03 14:59 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 1703 bytes --]

On mer, 2014-09-03 at 15:20 +0100, George Dunlap wrote:
> So if I'm reading this right, you set the information for vcpus one
> vcpu at a time, but you want to read the whole lot out all at once?
> 
> I don't like the inconsistency.  It would be better if you did the
> same thing each direction:  Either pass in an array with info about
> the vcpus, or just read the vcpu information one-by-one.
> 
I agree. One by one (in both the reading and writing direction) was what I
suggested, but others (I think it was Andrew) pointed out that, when
hypercalls are involved, batching is always a good thing. And in fact
(although maybe not in hot paths) it's quite likely that one wants to
retrieve the parameters for all the vcpus (e.g., that is what xl does), so
that's why, IIRC, Meng went for the batching approach.

I also continue to think that, in the case of this scheduler, a good API
would include both calls to set and get one vcpu at a time *AND* calls
to set and get all of them.

I'm fine with this version only having one of the two alternatives (and
about adding the other one later), but I agree with George it should be
consistent.

If we like batching, then batching it is, but I think in that case we
need a way to change the parameters of only one vcpu (perhaps treat
budget == 0 && period == 0 in the i-th element of the array as "don't
touch vcpu #i"?).
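
Something like this on the hypervisor side, just to sketch what I mean
(field names taken from the domctl in this patch; purely illustrative):

    for ( i = 0; i < op->u.rt.nr_vcpus; i++ )
    {
        struct xen_domctl_sched_rt_params local;

        if ( copy_from_guest_offset(&local, op->u.rt.vcpu, i, 1) )
            return -EFAULT;
        if ( local.period == 0 && local.budget == 0 )
            continue;                /* "don't touch vcpu #i" */
        /* ...validate and update vcpu i's period and budget... */
    }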

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:20   ` George Dunlap
  2014-09-03 14:45     ` Jan Beulich
  2014-09-03 14:59     ` Dario Faggioli
@ 2014-09-03 15:13     ` Meng Xu
  2014-09-03 16:06       ` George Dunlap
  2 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-03 15:13 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 5256 bytes --]

Hi George,


2014-09-03 10:20 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 5b11bbf..27d01c1 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
> >  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >
> > +/*
> > + * This structure is used to pass to rt scheduler from a
> > + * privileged domain to Xen
> > + */
> > +struct xen_domctl_sched_rt_params {
> > +    /* get vcpus' info */
> > +    uint64_t period; /* s_time_t type */
> > +    uint64_t budget;
> > +    uint16_t index;
> > +    uint16_t padding[3];
>
> Why the padding?
>
>
I did this because of Jan's comment "Also, you need to pad the structure to
a multiple of 8 bytes, or its layout will differ between 32- and 64-bit
(tool stack) callers." I think what he said makes sense, so I added the
padding here. :-)

Here is the link: http://marc.info/?l=xen-devel&m=140661680931179&w=2

> > +};
> > +typedef struct xen_domctl_sched_rt_params xen_domctl_sched_rt_params_t;
> > +DEFINE_XEN_GUEST_HANDLE(xen_domctl_sched_rt_params_t);
> >
> >  /* XEN_DOMCTL_scheduler_op */
> >  /* Scheduler types. */
> > @@ -346,9 +359,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >  #define XEN_SCHEDULER_CREDIT   5
> >  #define XEN_SCHEDULER_CREDIT2  6
> >  #define XEN_SCHEDULER_ARINC653 7
> > +#define XEN_SCHEDULER_RT_DS    8
> > +
> >  /* Set or get info? */
> > -#define XEN_DOMCTL_SCHEDOP_putinfo 0
> > -#define XEN_DOMCTL_SCHEDOP_getinfo 1
> > +#define XEN_DOMCTL_SCHEDOP_putinfo      0
> > +#define XEN_DOMCTL_SCHEDOP_getinfo      1
> > +#define XEN_DOMCTL_SCHEDOP_getnumvcpus   2
> >  struct xen_domctl_scheduler_op {
> >      uint32_t sched_id;  /* XEN_SCHEDULER_* */
> >      uint32_t cmd;       /* XEN_DOMCTL_SCHEDOP_* */
> > @@ -367,6 +383,16 @@ struct xen_domctl_scheduler_op {
> >          struct xen_domctl_sched_credit2 {
> >              uint16_t weight;
> >          } credit2;
> > +        struct xen_domctl_sched_rt{
> > +            /* get vcpus' params */
> > +            XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
> > +            uint16_t nr_vcpus;
> > +            /* set one vcpu's params */
> > +            uint16_t vcpu_index;
> > +            uint16_t padding[2];
>
> And again, why the padding?  This isn't a performance-critical bit of
> code: you can safely let the compiler deal with adding padding to the
> structure or managing mis-aligned reads.  Or if it really matters to
> you, you can re-order the elements of the array so that they're
> aligned naturally (e.g., by putting period and budget before
> nr_vcpus).
>

I agree that this is not a performance-critical bit of code, and I can do
it either way, as you suggested.
If I remove the padding, the structure won't be a multiple of 8 bytes and
its layout will differ between 32- and 64-bit (tool stack) callers. If
that's fine, I'm totally fine with it; we just need consensus. ;-)
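
For instance, the re-ordering you suggested would give something roughly
like this (just a sketch of the layout, not a final proposal):

    struct xen_domctl_sched_rt {
        /* 64-bit fields first, so everything is naturally aligned */
        uint64_t period;
        uint64_t budget;
        XEN_GUEST_HANDLE_64(xen_domctl_sched_rt_params_t) vcpu;
        uint16_t nr_vcpus;
        uint16_t vcpu_index;
        uint32_t padding;   /* keeps the size a multiple of 8 bytes */
    };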


>
> > +            uint64_t period;
> > +            uint64_t budget;
> > +        } rt;
>
> So if I'm reading this right, you set the information for vcpus one
> vcpu at a time, but you want to read the whole lot out all at once?


​Yes. ​



> I don't like the inconsistency.  It would be better if you did the
> same thing each direction:  Either pass in an array with info about
> the vcpus, or just read the vcpu information one-by-one.


I think it's a better idea to pass in an array with information about the
vcpus to get/set their parameters. I only need to change the code related
to setting a vcpu's parameters, and I have a question:
when we set a vcpu's parameters by using an array, we have two choices:

a) Create an array with a single vcpu element, and specify the index of the
vcpu to modify. The concern with this method is that we only use one
element of the array, so is it a good idea to use an array with only one
element?
b) Create an array with all vcpus of this domain, modify the parameters of
the vcpus users want to change, and then bounce the array to the hypervisor
to reset these vcpus' parameters. The concern with this method is that we
don't need any other vcpu's information to set a specific vcpu's
parameters; bouncing the whole array with all vcpus' information seems
expensive and unnecessary.

Do you have any suggestion/advice/preference on this?

I don't really like the idea of reading the vcpus' information one by
one. :-) If a domain has many vcpus, say 12, we would issue 12 hypercalls
to get all the vcpus' information, whereas a single hypercall can get
everything we want. Reading one by one would simplify the implementation,
but the extra hypercalls cause more overhead.

Thank you very much!

​Meng​​

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 8417 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 14:59     ` Dario Faggioli
@ 2014-09-03 15:27       ` Meng Xu
  2014-09-03 15:46         ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-03 15:27 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 2684 bytes --]

2014-09-03 10:59 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:

> On mer, 2014-09-03 at 15:20 +0100, George Dunlap wrote:
> > So if I'm reading this right, you set the information for vcpus one
> > vcpu at a time, but you want to read the whole lot out all at once?
> >
> > I don't like the inconsistency.  It would be better if you did the
> > same thing each direction:  Either pass in an array with info about
> > the vcpus, or just read the vcpu information one-by-one.
> >
> I agree. One by one (in both reading and writing direction) was what I
> suggested, but others (I think it was Andrew) pointed out that, when
> hypercall are involved, batching is always a good thing. And in fact
> (although maybe not in hot paths) it's quite likely that one wants to
> retrieve the parameters for all the vcpus (e.g., it is like that as far
> as xl is concerned), so that's why, IIRC, Meng went for the batching
> approach.
>

​Yes. We did discuss this before. Just now, I tried to find the link to the
thread but couldn't. :-(
​


>
> I also continue to think that, in the case of this scheduler, a good API
> would include both calls to set and get one vcpu at a time *AND* calls
> to set and get all of them.
>

Right now it can get all vcpus' parameters and set one vcpu's parameters;
the other two approaches are also not hard to implement.


>
> I'm fine with this version only having one of the two alternatives (and
> about adding the other one later), but I agree with George it should be
> consistent.
>
> If we like batching, then batching it is, but I think in that case we
> need a way to change the parameters of only one vcpu (perhaps treat
> budget == 0 && period == 0 in the i-th element of the array as "don't
> touch vcpu #i"?).
>

Actually there are three ways to implement this (if we go with batching):
1) As Dario mentioned, use period == 0 && budget == 0 to indicate that this
vcpu's parameters should not be changed;
2) We can first bounce out all vcpus' information, change the entries the
user specified, bounce the modified array back to the hypervisor, and have
the hypervisor set all vcpus' parameters based on the modified array; if a
vcpu's entry was not changed, its parameters won't change;
3) We only create an array that contains the vcpus to be modified; after we
bounce the array to the hypervisor, the hypervisor can use each entry's
vcpu index to decide which vcpu's parameters should be modified.

​Which method do you guys prefer?​

​Thanks,​

​Meng​



-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 4240 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-08-24 22:58 ` [PATCH v1 3/4] libxl: " Meng Xu
  2014-08-25 13:17   ` Wei Liu
@ 2014-09-03 15:33   ` George Dunlap
  2014-09-03 20:52     ` Meng Xu
  2014-09-04 14:27     ` George Dunlap
  2014-09-05 10:21   ` Dario Faggioli
  2 siblings, 2 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-03 15:33 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> Add libxl functions to set/get domain's parameters for rt scheduler
>
> Signed-off-by: Sisu Xi <xisisu@gmail.com>
> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
> ---
>  tools/libxl/libxl.c         |  139 +++++++++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl.h         |    7 +++
>  tools/libxl/libxl_types.idl |   15 +++++
>  3 files changed, 161 insertions(+)
>
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 3526539..440e8df31 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc, uint32_t domid,
>      return 0;
>  }
>
> +static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
> +                               libxl_domain_sched_params *scinfo)
> +{
> +    struct xen_domctl_sched_rt_params* sdom;
> +    uint16_t num_vcpus;
> +    int rc, i;
> +
> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting num_vcpus of domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +
> +    /* FIXME: can malloc be used in libxl? seems it was used in the file */
> +    sdom = (struct xen_domctl_sched_rt_params *)
> +            malloc( sizeof(struct xen_domctl_sched_rt_params) * num_vcpus );

There are two ways to do allocation; one which will be garbage
collected automatically (so you don't need to call free on it), and
one which returns something which the caller has to remember to free.

You can call libxl__calloc(gc, num_vcpus, sizeof(struct
xen_domctl_sched_rt_params)) with the garbage collector for this one;
then you won't need to free it below, as it will be freed by the
GC_FREE at the end of libxl_domain_params_get().  Then...

> +    if ( !sdom ){
> +        LOGE(ERROR, "Allocate sdom array fails\n");
> +        return ERROR_INVAL;
> +    }
> +
> +    rc = xc_sched_rt_domain_get(CTX->xch, domid, sdom, num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +
> +    /* FIXME: how to guarantee libxl_*_dispose be called exactly once? */
> +    libxl_domain_sched_params_init(scinfo);
> +
> +    scinfo->rt.num_vcpus = num_vcpus;
> +    scinfo->sched = LIBXL_SCHEDULER_RT;
> +    /* FIXME: can malloc be used in libxl? seems it was used in the file */
> +    scinfo->rt.vcpus = (libxl_vcpu *)
> +                       malloc( sizeof(libxl_vcpu) * scinfo->rt.num_vcpus );

...for this one, call libxl__calloc(NOGC, scinfo->rt.num_vcpus,
sizeof(libxl_vcpu)).  NOGC will cause the values not to be freed by
GC_FREE (so they should be freed by the caller).

And again, as Wei said, you don't need to check the return value for
either one, as libxl will handle memory allocation errors.

However, it looks like libxl_list_domain() managed to make the libxl
struct look exactly like the libxc struct, so that you don't need to
do this allocate-and-copy thing.  Could you try to arrange for that to
be the case for libxl_rt_vcpu?
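
In other words, the two allocations above would look roughly like this
(a sketch only):

    sdom = libxl__calloc(gc, num_vcpus,
                         sizeof(struct xen_domctl_sched_rt_params));
    /* no NULL check needed; freed by GC_FREE at the end of the entry point */

    scinfo->rt.vcpus = libxl__calloc(NOGC, scinfo->rt.num_vcpus,
                                     sizeof(libxl_vcpu));
    /* NOGC: ownership passes to the caller, who disposes of scinfo */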

> +/*
> + * Sanity check of the scinfo parameters
> + * return 0 if all values are valid
> + * return 1 if one param is default value
> + * return 2 if the target vcpu's index, period or budget is out of range
> + */

These should be checked in the hypervisor, not the toolstack.  The
hypervisor can return -EINVAL, and libxl can pass that error message
on to the caller.

So this function should go away; but in general, just as a style note,
you should avoid "magic constants" like this.  Ideally you'd re-use
some of the LIBXL error codes; but if that won't work, you should make
#define's with a suitable descriptive name.
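
In the hypervisor that check would boil down to something like this (a
sketch only, reusing the bounds from your libxl function; where exactly it
lives in sched_rt.c is up to you):

    if ( vcpu_index >= d->max_vcpus ||
         period < 1 || period > SCHED_RT_VCPU_PERIOD_MAX ||
         budget < 1 || budget > SCHED_RT_VCPU_BUDGET_MAX )
        return -EINVAL;  /* libxc/libxl just propagate this to the caller */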

> +static int sched_rt_domain_set_validate_params(libxl__gc *gc,
> +                                               const libxl_domain_sched_params *scinfo,
> +                                               const uint16_t num_vcpus)
> +{
> +    int vcpu_index = scinfo->rt.vcpu_index;
> +
> +    if (vcpu_index == LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT ||
> +        scinfo->rt.period == LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT ||
> +        scinfo->rt.budget == LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT)
> +    {
> +        return 1;
> +    }
> +
> +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
> +    {
> +        LOG(ERROR, "VCPU index is not set or out of range, "
> +                    "valid values are within range from 0 to %d", num_vcpus);
> +        return 2;
> +    }
> +
> +    if (scinfo->rt.period < 1 ||
> +        scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX)
> +    {
> +        LOG(ERROR, "VCPU period is not set or out of range, "
> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_PERIOD_MAX);
> +        return 2;
> +    }
> +
> +    if (scinfo->rt.budget < 1 ||
> +        scinfo->rt.budget > SCHED_RT_VCPU_BUDGET_MAX)
> +    {
> +        LOG(ERROR, "VCPU budget is not set or out of range, "
> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_BUDGET_MAX);
> +        return 2;
> +    }
> +
> +    return 0;
> +
> +}
> +
> +static int sched_rt_domain_set(libxl__gc *gc, uint32_t domid,
> +                               const libxl_domain_sched_params *scinfo)
> +{
> +    struct xen_domctl_sched_rt_params sdom;
> +    uint16_t num_vcpus;
> +    int rc;
> +
> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +
> +    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
> +    if (rc == 2)
> +        return ERROR_INVAL;
> +    if (rc == 1)
> +        return 0;

Er, one of the parameters is the default value, and so you don't set
any of them?

> @@ -303,6 +304,19 @@ libxl_domain_restore_params = Struct("domain_restore_params", [
>      ("checkpointed_stream", integer),
>      ])
>
> +libxl_rt_vcpu = Struct("vcpu",[
> +    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
> +    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> +    ("index",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
> +    ])
> +
> +libxl_domain_sched_rt_params = Struct("domain_sched_rt_params",[
> +    ("vcpus",        Array(libxl_rt_vcpu, "num_vcpus")),
> +    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
> +    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> +    ("vcpu_index",   integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
> +    ])
> +
>  libxl_domain_sched_params = Struct("domain_sched_params",[
>      ("sched",        libxl_scheduler),
>      ("weight",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}),
> @@ -311,6 +325,7 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
>      ("slice",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}),
>      ("latency",      integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}),
>      ("extratime",    integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}),
> +    ("rt",           libxl_domain_sched_rt_params),
>      ])

While the domctl interface is not stable, the libxl interface *is*
stable, so we definitely need to think carefully about what we want
this to look like.

Let me give that a think. :-)

 -George

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 15:27       ` Meng Xu
@ 2014-09-03 15:46         ` Dario Faggioli
  2014-09-03 17:13           ` George Dunlap
  0 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-03 15:46 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 2109 bytes --]

On mer, 2014-09-03 at 11:27 -0400, Meng Xu wrote:


> ​Actually it has three ways to implement this (if we go with
> batching):
> 1) As Dario mentioned, use period = 0 & budget=0 to indicate that this
> vcpu's parameters should not be changed;
> 2)​ We can first bounce out all VCPU's information, change the vcpus'
> information users specified, bounce the modified array back to
> hypervisor; and hypervisor set all vcpus' information based on the
> modified array. If a vcpu is not changed, its vcpu's information won't
> change, so its parameters won't change; 
>
That would mean issuing a hypercall to get the array of vcpu params, then
changing the elements corresponding to the vcpu(s) you want to update, and
then issuing the actual vcpus_set() hypercall.

Then, in Xen, you'll always change the parameters of _all_ the vcpus,
with some of the new values being exactly equal to the old ones.

This is viable, but quite unpleasant, especially the hypervisor part,
where you risk disturbing vcpus minding their own (possibly real-time)
business for no reason. If you go for it, you should at least avoid
this by recognizing, inside Xen, that the parameters are unchanged and
not messing with the vcpu.

Even with that avoided, you still need a large array, as for 1), and,
this time, you bounce it both up and down, so wrt 1) this looks like a
lose-lose to me.

> 3) We only create arrays that contain the vcpus to be modified. After we
> bounce the array to the hypervisor, the hypervisor can use the vcpu's index
> to decide which vcpu's parameters should be modified.
>
This is an option. I like it less than 1), but I can see that, with 1),
changing the parameters of 1 vcpu of a 32-vcpu domain means pushing
down to Xen quite a bit of 0-s. :-O

So, my preference is: "not 2". :-D

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 4/4] xl: introduce rt scheduler
  2014-08-24 22:58 ` [PATCH v1 4/4] xl: introduce " Meng Xu
  2014-08-25 13:31   ` Wei Liu
@ 2014-09-03 15:52   ` George Dunlap
  2014-09-03 22:28     ` Meng Xu
  1 sibling, 1 reply; 72+ messages in thread
From: George Dunlap @ 2014-09-03 15:52 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> Add xl command for rt scheduler

The main thing about this is that you probably want to have an
interface where you can set the parameters for multiple vcpus at once;
and that's probably the case regardless of whether we end up making
the hypercall able to batch vcpus or do them one-at-a-time.

 -George


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 15:13     ` Meng Xu
@ 2014-09-03 16:06       ` George Dunlap
  2014-09-03 16:57         ` Dario Faggioli
  2014-09-04  2:11         ` Meng Xu
  0 siblings, 2 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-03 16:06 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

On Wed, Sep 3, 2014 at 4:13 PM, Meng Xu <xumengpanda@gmail.com> wrote:
> Hi George,
>
>
> 2014-09-03 10:20 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:
>
>> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
>> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
>> > index 5b11bbf..27d01c1 100644
>> > --- a/xen/include/public/domctl.h
>> > +++ b/xen/include/public/domctl.h
>> > @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
>> >  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>> >
>> > +/*
>> > + * This structure is used to pass to rt scheduler from a
>> > + * privileged domain to Xen
>> > + */
>> > +struct xen_domctl_sched_rt_params {
>> > +    /* get vcpus' info */
>> > +    uint64_t period; /* s_time_t type */
>> > +    uint64_t budget;
>> > +    uint16_t index;
>> > +    uint16_t padding[3];
>>
>> Why the padding?
>>
>
> I did this because of Jan's comment "Also, you need to pad the structure to
> a multiple of 8 bytes, or
> its layout will differ between 32- and 64-bit (tool stack) callers." I think
> > what he said makes sense, so I added the padding here. :-)
>
> Here is the link: http://marc.info/?l=xen-devel&m=140661680931179&w=2

Right. :-)  I personally prefer to handle that by re-arranging the
elements rather than adding padding, unless absolutely necessary.  In
this case that shouldn't be too hard, particularly once we pare the
interface down so we only have one interface (either all one vcpu at a
time, or all batched vcpus).
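
(As a purely illustrative aside, not something decided in this thread: once the
interface always passes a full per-vcpu array, the vcpu index can be implied by
the position in the array, and the per-vcpu element reduces to two uint64_t
fields, which is already a multiple of 8 bytes for both 32- and 64-bit callers,
with no explicit padding. Sketch with a hypothetical struct name:)

    /* Hypothetical element layout: no index field, no explicit padding. */
    struct xen_domctl_sched_rt_vcpu {
        uint64_t period;
        uint64_t budget;
    };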

> I think it's a better idea to
>  pass in an array with information about vcpus to get/set vcpus'
> information.
>
> I only need to change the code related to setting a vcpu's information.
> I have a question:
> When we set a vcpu's information by using an array, we have two choices:
>
> a) just create an array with one vcpu element, and specify the index of the
> vcpu to modify; the concern with this method is that we only use one element
> of this array, so is it a good idea to use an array with only one element?
> b) create an array with all vcpus of this domain, modify the parameters of
> the vcpus the user wants to change, and then bounce the array to the hypervisor
> to reset these vcpus' parameters. The concern with this method is that we don't
> need any other vcpus' information to set a specific vcpu's parameters.
> Bouncing the whole array with all vcpus' information seems expensive and
> unnecessary?
>
> Do you have any suggestion/advice/preference on this?
>
> I don't really like the idea of reading the vcpus' information
> one-by-one. :-) If a domain has many vcpus, say 12 vcpus, we will issue 12
> hypercalls to get all vcpus' information for this domain. Because we only
> need to issue one hypercall to get all the information we want, the extra
> hypercalls cause more overhead. This does simplify the implementation, but
> may cause more overhead.

For convenience for users, I think it's definitely the case that libxl
should provide an interface to get and set all the vcpu parameters at
once.  Then it can either batch them all into a single hypercall (if
that's what we decide), or it can make the individual calls for each
vcpu.

But I don't really expect this to be a performance-critical operation.
Twelve hypercalls 1000 times a second is obviously way too many; but
12 hypercalls even once a second isn't really a big deal.

The main reason I would think to batch the hypercalls is for
consistency: it seems like you may want to change the period / budget
of vcpus atomically, rather than setting one, possibly having dom0
de-scheduled for a few hundred milliseconds, and then setting another.
Same thing goes for reading: I would think you would want a consistent
"snapshot" of some existing state, rather than having the possibility
of reading half the state, then having someone change it, and then
reading the other half.
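
(Purely as an illustration of the "all at once" shape being discussed; these
function names do not exist in libxl or in the patch, they are invented here:)

    /* Hypothetical libxl-level calls that get/set every vcpu in one go. */
    int libxl_domain_sched_rt_params_get_all(libxl_ctx *ctx, uint32_t domid,
                                             libxl_rt_vcpu *vcpus,
                                             int *num_vcpus);
    int libxl_domain_sched_rt_params_set_all(libxl_ctx *ctx, uint32_t domid,
                                             const libxl_rt_vcpu *vcpus,
                                             int num_vcpus);

Whether these internally issue one batched hypercall or a loop of per-vcpu
hypercalls would then be an implementation detail hidden from the caller.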

Re the per-vcpu settings, though: Is it really that common for RT
domains to want different parameters for different vcpus?  Are these
parameters exposed to the guest in any way, so that it can make more
reasonable decisions as to where to run what kinds of workloads?

 -George


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 16:06       ` George Dunlap
@ 2014-09-03 16:57         ` Dario Faggioli
  2014-09-03 17:18           ` George Dunlap
  2014-09-04  2:11         ` Meng Xu
  1 sibling, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-03 16:57 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb



On Wed, 2014-09-03 at 17:06 +0100, George Dunlap wrote:
> On Wed, Sep 3, 2014 at 4:13 PM, Meng Xu <xumengpanda@gmail.com> wrote:
> > Hi George,
> >
> >
> > 2014-09-03 10:20 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:
> >
> >> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> >> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> >> > index 5b11bbf..27d01c1 100644
> >> > --- a/xen/include/public/domctl.h
> >> > +++ b/xen/include/public/domctl.h
> >> > @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
> >> >  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
> >> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >> >
> >> > +/*
> >> > + * This structure is used to pass to rt scheduler from a
> >> > + * privileged domain to Xen
> >> > + */
> >> > +struct xen_domctl_sched_rt_params {
> >> > +    /* get vcpus' info */
> >> > +    uint64_t period; /* s_time_t type */
> >> > +    uint64_t budget;
> >> > +    uint16_t index;
> >> > +    uint16_t padding[3];
> >>
> >> Why the padding?
> >>
> >
> > I did this because of Jan's comment "Also, you need to pad the structure to
> > a multiple of 8 bytes, or
> > its layout will differ between 32- and 64-bit (tool stack) callers." I think
> > what he said make sense so I added the padding here. :-)
> >
> > Here is the link: http://marc.info/?l=xen-devel&m=140661680931179&w=2
> 
> Right. :-)  I personally prefer to handle that by re-arranging the
> elements rather than adding padding, unless absolutely necessary.  In
> this case that shouldn't be too hard, particularly once we pare the
> interface down so we only have one interface (either all one vcpu at a
> time, or all batched vcpus).
> 
> > I think it's a better idea to
> >  pass in an array with information about vcpus to get/set vcpus'
> > information.
> >
> > I only need to change the code related to setting a vcpu's information.
> > I have a question:
> > When we set a vcpu's information by using an array, we have two choices:
> >
> > a) just create an array with one vcpu element, and specify the index of the
> > vcpu to modify; The concern to this method is that we only uses one element
> > of this array, so is it a good idea to use an array with only one element?
> > b) create an array with all vcpus of this domain, modify the parameters of
> > the vcpu users want to change, and then bounce the array to hypervisor to
> > reset these vcpus' parameters. The concern to this method is that we don't
> > need any other vcpus' information to set a specific vcpu's parameters.
> > Bouncing the whole array with all vcpus information seems expensive and
> > unnecessary?
> >
> > Do you have any suggestion/advice/preference on this?
> >
> > I don't really like about the idea of reading the vcpu's information
> > one-by-one. :-) If a domain has many vcpus, say 12 vcpus, we will issue 12
> > hypercalls to get all vcpus' information of this domain. Because we only
> > need to issue one hypercall to get all information we want, the extra
> > hypercalls causes more overhead. This did simplify the implementation, but
> > may cause more overhead.
> 
> For convenience for users, I think it's definitely the case that libxl
> should provide an interface to get and set all the vcpu parameters at
> once.  Then it can either batch them all into a single hypercall (if
> that's what we decide), or it can make the individual calls for each
> vcpu.
> 
Indeed.

> The main reason I would think to batch the hypercalls is for
> consistency: it seems like you may want to change the period / budget
> of vcpus atomically, rather than setting one, possibly having dom0
> de-scheduled for a few hundred milliseconds, and then setting another.
> Same thing goes for reading: I would think you would want a consistent
> "snapshot" of some existing state, rather than having the possibility
> of reading half the state, then having someone change it, and then
> reading the other half.
> 
That is actually the reason why I'd have both things. A "change this
one" variant is handy if one actually has to change only one vcpu, or a
few, and does not mind the non-atomicity.

The batched variant, for both overhead and atomicity reasons.

> Re the per-vcpu settings, though: Is it really that common for RT
> domains to want different parameters for different vcpus?  
>
Whether it's common is hard to say, but yes, it has to be possible.

For instance, I can put, in an SMP guest, two real-time applications
with different timing requirements, and pin each one to a different
(v)cpu (I mean pin *inside* the guest). At this point, I'd like for each
vcpu to have a set of RT scheduling parameters, at the Xen level, that
matches the timing requirements of what's running inside.

This may not look so typical in a server/cloud environment, but can
happen (at least in my experience) in a mobile/embedded env.

> Are these
> parameters exposed to the guest in any way, so that it can make more
> reasonable decisions as to where to run what kinds of workloads?
> 
Not right now, AFAICS, but forms of 'scheduling paravirtualization', or
in general this kind of interaction/communication, could be very useful
in real-time virtualization, so we may want to support that in the future.

In any case, even without that in place right now, I think different
parameters for different vcpus is certainly something we want from an RT
scheduler.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 15:46         ` Dario Faggioli
@ 2014-09-03 17:13           ` George Dunlap
  0 siblings, 0 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-03 17:13 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb

On Wed, Sep 3, 2014 at 4:46 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
>> 3) We only create arrays that have the vcpus to be modified. After we
>> bound the array to hypervisor, hypervisor can use the vcpu's index to
>> decide which vcpu's parameters should be modified.
>>
> This is an option. I like it less than 1), but I can see that, with 1),
> changing the parameters of 1 vcpu of a 32 vcpus domain means pushing
> down to Xen quite a bit of 0-s,. :-O

I guess the real question is how often we expect to be updating the
parameters of a single vcpu, vs just setting all of the parameters
(e.g., during domain creation).

If we expect a lot of tweaking from user-space tools, then we should
go with 3.  If we expect people to set parameters and then leave them
alone, we should go with 1.

 -George


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 16:57         ` Dario Faggioli
@ 2014-09-03 17:18           ` George Dunlap
  2014-09-04  2:15             ` Meng Xu
  2014-09-04 14:27             ` Dario Faggioli
  0 siblings, 2 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-03 17:18 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb

On Wed, Sep 3, 2014 at 5:57 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
>> Re the per-vcpu settings, though: Is it really that common for RT
>> domains to want different parameters for different vcpus?
>>
> Whether it's common it is hard to say, but yes, it has to be possible.
>
> For instance, I can put, in an SMP guest, two real-time applications
> with different timing requirements, and pin each one to a different
> (v)cpu (I mean pin *inside* the guest). At this point, I'd like for each
> vcpu to have a set of RT scheduling parameters, at the Xen level, that
> matches the timing requirements of what's running inside.
>
> This may not look so typical in a server/cloud environment, but can
> happen (at least in my experience) in a mobile/embedded env.

But to play devil's advocate for a minute here: couldn't you just put
them in two different single-vcpu VMs then?

>> Are these
>> parameters exposed to the guest in any way, so that it can make more
>> reasonable decisions as to where to run what kinds of workloads?
>>
> Not right now, AFAICS, but forms of 'scheduling paravirtualization', or
> in general this kind of interaction/communication could be very useful
> in real-time virtualization, so we may want to support that in future.
>
> In any case, even without that in place right now, I think different
> parameters for different vcpus is certainly something we want from an RT
> scheduler.

Yeah, the "expose parameters to guests" was just thinking out loud
about what would be useful in the future.  I think possibly allowing a
VM to change its period (while keeping the budget / period ratio the
same) might make sense as well; that way you could have an RT
"appliance" VM that you could just pop onto a system and let it
configure itself, without the user having to do more than give it
basic parameters.

 -George


* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-03 15:33   ` George Dunlap
@ 2014-09-03 20:52     ` Meng Xu
  2014-09-04 14:27     ` George Dunlap
  1 sibling, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-03 20:52 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb



Hi George,


2014-09-03 11:33 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> > Add libxl functions to set/get domain's parameters for rt scheduler
> >
> > Signed-off-by: Sisu Xi <xisisu@gmail.com>
> > Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
> > ---
> >  tools/libxl/libxl.c         |  139
> +++++++++++++++++++++++++++++++++++++++++++
> >  tools/libxl/libxl.h         |    7 +++
> >  tools/libxl/libxl_types.idl |   15 +++++
> >  3 files changed, 161 insertions(+)
> >
> > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > index 3526539..440e8df31 100644
> > --- a/tools/libxl/libxl.c
> > +++ b/tools/libxl/libxl.c
> > @@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc,
> uint32_t domid,
> >      return 0;
> >  }
> >
> > +static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
> > +                               libxl_domain_sched_params *scinfo)
> > +{
> > +    struct xen_domctl_sched_rt_params* sdom;
> > +    uint16_t num_vcpus;
> > +    int rc, i;
> > +
> > +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> > +    if (rc != 0) {
> > +        LOGE(ERROR, "getting num_vcpus of domain sched rt");
> > +        return ERROR_FAIL;
> > +    }
> > +
> > +    /* FIXME: can malloc be used in libxl? seems it was used in the
> file */
> > +    sdom = (struct xen_domctl_sched_rt_params *)
> > +            malloc( sizeof(struct xen_domctl_sched_rt_params) *
> num_vcpus );
>
> There are two ways to do allocation; one which will be garbage
> collected automatically (so you don't need to call free on it), and
> one which returns something which the caller has to remember to free.
>
> You can call libxl__calloc(gc, num_vcpus, sizeof(struct
> xen_domctl_sched_rt_params)) with the garbage collector for this one;
> then you won't need to free it below, as it will be freed by the
> GC_FREE at the end of libxl_domain_params_get().  Then...
>

Thank you very much for your suggestion! I modified the code as you suggested
after Wei pointed this out. The next patch will use the GC
mechanism to handle the memory allocation.


>
> > +    if ( !sdom ){
> > +        LOGE(ERROR, "Allocate sdom array fails\n");
> > +        return ERROR_INVAL;
> > +    }
> > +
> > +    rc = xc_sched_rt_domain_get(CTX->xch, domid, sdom, num_vcpus);
> > +    if (rc != 0) {
> > +        LOGE(ERROR, "getting domain sched rt");
> > +        return ERROR_FAIL;
> > +    }
> > +
> > +    /* FIXME: how to guarantee libxl_*_dispose be called exactly once?
> */
> > +    libxl_domain_sched_params_init(scinfo);
> > +
> > +    scinfo->rt.num_vcpus = num_vcpus;
> > +    scinfo->sched = LIBXL_SCHEDULER_RT;
> > +    /* FIXME: can malloc be used in libxl? seems it was used in the
> file */
> > +    scinfo->rt.vcpus = (libxl_vcpu *)
> > +                       malloc( sizeof(libxl_vcpu) *
> scinfo->rt.num_vcpus );
>
> ...for this one, call libxl__calloc(NOGC, scinfo->rt.num_vcpus,
> sizeof(libxl_vcpu)).  NOGC will cause the values not to be freed by
> GC_FREE (so they should be freed by the caller).
>
> And again, as Wei said, you don't need to check the return value for
> either one, as libxl will handle memory allocation errors.
>
Yes, the next release will change this as you and Wei said. (I have actually
already changed it for the next release. :-))
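
(For reference, a minimal sketch of the allocation style George describes, using
the names from the patch; this is illustrative only, not the final code:)

    /* Temporary libxc array: allocated from the gc, so it is freed
     * automatically by GC_FREE at the end of the libxl entry point. */
    sdom = libxl__calloc(gc, num_vcpus, sizeof(*sdom));

    /* Array handed back to the caller: allocated with NOGC, so it survives
     * GC_FREE and must be freed by the caller (libxl_*_dispose()). */
    scinfo->rt.vcpus = libxl__calloc(NOGC, num_vcpus,
                                     sizeof(*scinfo->rt.vcpus));
    scinfo->rt.num_vcpus = num_vcpus;

    /* No NULL checks needed: libxl's allocators handle allocation
     * failure internally. */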


>
> However, it looks like libxl_list_domain() managed to make the libxl
> struct look exactly like the libxc struct, so that you don't need to
> do this allocate-and-copy thing.  Could you try to arrange for that to
> be the case for libxl_rt_vcpu?
>

Do you mean that I should pass the libxl_rt_vcpu structure as the parameter to
the function xc_sched_rt_domain_get(CTX->xch, domid, sdom, num_vcpus),
i.e., replace sdom with the libxl_rt_vcpu data, so that I can avoid
copying from the libxc sdom data to libxl_rt_vcpu?

If so, I think it's doable and I will change it.



>
> > +/*
> > + * Sanity check of the scinfo parameters
> > + * return 0 if all values are valid
> > + * return 1 if one param is default value
> > + * return 2 if the target vcpu's index, period or budget is out of range
> > + */
>
> These should be checked in the hypervisor, not the toolstack.  The
> hypervisor can return -EINVAL, and libxl can pass that error message
> on to the caller.
>
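
(Roughly, the hypervisor-side shape George is describing would be something like
the following; the macro names are borrowed from the quoted patch and the exact
check is only a sketch:)

    /* Validate in Xen and let libxl simply propagate the error. */
    if ( param->period < 1 || param->period > SCHED_RT_VCPU_PERIOD_MAX ||
         param->budget < 1 || param->budget > SCHED_RT_VCPU_BUDGET_MAX )
        return -EINVAL;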

I saw that the toolstack also checks the validity of the new parameters for
the credit and credit2 schedulers. That's why I check the validity of the
parameters in the toolstack.
For example, sched_credit_domain_set() in
tools/libxl/libxl.c does the check as follows:

    if (scinfo->weight != LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT) {
        if (scinfo->weight < 1 || scinfo->weight > 65535) {
            LOG(ERROR, "Cpu weight out of range, "
                "valid values are within range from 1 to 65535");
            return ERROR_INVAL;
        }
        sdom.weight = scinfo->weight;
    }

Just to confirm: should I not follow what credit does in libxl.c, and instead
check the validity of the parameters in the hypervisor as you suggested? Please
bear with me on this question, because I want to make sure I'm heading in
the correct direction. :-)

>
> So this function should go away; but in general, just as a style note,
> you should avoid "magic constants" like this.  Ideally you'd re-use
> some of the LIBXL error codes; but if that won't work, you should make
> #define's with a suitable descriptive name.
>

Got it! Thank you very much! This is my bad and I will change it. :-(
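
(For instance, if the helper were kept, the kind of descriptive names George
means might look like this; the names below are made up for illustration:)

    #define RT_VALIDATE_USE_DEFAULTS  1  /* a parameter was left at its default */
    #define RT_VALIDATE_OUT_OF_RANGE  2  /* index, period or budget out of range */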



>
> > +static int sched_rt_domain_set_validate_params(libxl__gc *gc,
> > +                                               const
> libxl_domain_sched_params *scinfo,
> > +                                               const uint16_t num_vcpus)
> > +{
> > +    int vcpu_index = scinfo->rt.vcpu_index;
> > +
> > +    if (vcpu_index == LIBXL_DOMAIN_SCHED_PARAM_VCPU_DEFAULT ||
> > +        scinfo->rt.period == LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT ||
> > +        scinfo->rt.budget == LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT)
> > +    {
> > +        return 1;
> > +    }
> > +
> > +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
> > +    {
> > +        LOG(ERROR, "VCPU index is not set or out of range, "
> > +                    "valid values are within range from 0 to %d",
> num_vcpus);
> > +        return 2;
> > +    }
> > +
> > +    if (scinfo->rt.period < 1 ||
> > +        scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX)
> > +    {
> > +        LOG(ERROR, "VCPU period is not set or out of range, "
> > +                    "valid values are within range from 0 to %lu",
> SCHED_RT_VCPU_PERIOD_MAX);
> > +        return 2;
> > +    }
> > +
> > +    if (scinfo->rt.budget < 1 ||
> > +        scinfo->rt.budget > SCHED_RT_VCPU_BUDGET_MAX)
> > +    {
> > +        LOG(ERROR, "VCPU budget is not set or out of range, "
> > +                    "valid values are within range from 0 to %lu",
> SCHED_RT_VCPU_BUDGET_MAX);
> > +        return 2;
> > +    }
> > +
> > +    return 0;
> > +
> > +}
> > +
> > +static int sched_rt_domain_set(libxl__gc *gc, uint32_t domid,
> > +                               const libxl_domain_sched_params *scinfo)
> > +{
> > +    struct xen_domctl_sched_rt_params sdom;
> > +    uint16_t num_vcpus;
> > +    int rc;
> > +
> > +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> > +    if (rc != 0) {
> > +        LOGE(ERROR, "getting domain sched rt");
> > +        return ERROR_FAIL;
> > +    }
> > +
> > +    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
> > +    if (rc == 2)
> > +        return ERROR_INVAL;
> > +    if (rc == 1)
> > +        return 0;
>
> Er, one of the parameters is the default value, and so you don't set
> any of them?
>

Hmm, yes, users should specify the vcpu index, the new period, and the new
budget to set a vcpu's parameters. They have to give us the index of the
vcpu to set; otherwise, we don't know which vcpu the new parameters apply
to. :-)

We can definitely allow users to specify only the period or only the budget, to
set just that one value. If you think that's a better way to do it, I can
modify it. :-)



>
> > @@ -303,6 +304,19 @@ libxl_domain_restore_params =
> Struct("domain_restore_params", [
> >      ("checkpointed_stream", integer),
> >      ])
> >
> > +libxl_rt_vcpu = Struct("vcpu",[
> > +    ("period",       uint64, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
> > +    ("budget",       uint64, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> > +    ("index",        integer, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
> > +    ])
> > +
> > +libxl_domain_sched_rt_params = Struct("domain_sched_rt_params",[
> > +    ("vcpus",        Array(libxl_rt_vcpu, "num_vcpus")),
> > +    ("period",       uint64, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
> > +    ("budget",       uint64, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> > +    ("vcpu_index",   integer, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
> > +    ])
> > +
> >  libxl_domain_sched_params = Struct("domain_sched_params",[
> >      ("sched",        libxl_scheduler),
> >      ("weight",       integer, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}),
> > @@ -311,6 +325,7 @@ libxl_domain_sched_params =
> Struct("domain_sched_params",[
> >      ("slice",        integer, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}),
> >      ("latency",      integer, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}),
> >      ("extratime",    integer, {'init_val':
> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}),
> > +    ("rt",           libxl_domain_sched_rt_params),
> >      ])
>
> While the domctl interface is not stable, the libxl interface *is*
> stable, so we definitely need to think carefully about what we want
> this to look like.
>
> Let me give that a think. :-)
>

Sure! I totally agree! When everyone agrees on the interface, I can
modify it accordingly very quickly. The difficult
thing is reaching consensus on what the interface should look like :-P

Thank you very much for your comments and suggestions!


Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 4/4] xl: introduce rt scheduler
  2014-09-03 15:52   ` George Dunlap
@ 2014-09-03 22:28     ` Meng Xu
  2014-09-05  9:40       ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-03 22:28 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



Hi George,

2014-09-03 11:52 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> > Add xl command for rt scheduler
>
> The main thing about this is that you probably want to have an
> interface where you can set the parameters for multiple vcpus at once;
> and that's probably the case regardless of whether we end up making
> the hypercall able to batch vcpus or do them one-at-a-time.
>

I'm totally fine with allowing users to set the parameters for multiple
vcpus at once.

My question is:
When users set multiple vcpus at once, what should the command look like?
For example, if they want to set two vcpus' parameters for dom 1 at once
(to be specific, set vcpu 0's period to 20 and budget to 10, and set vcpu
1's period to 30 and budget to 20), should they use a command like "xl
sched-rt -d 1 -v 0 -p 20 -b 10 -v 1 -p 30 -b 20"?
When several vcpus are set at once, the command will be very long. Is that
OK?

Thanks,

Meng


-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 16:06       ` George Dunlap
  2014-09-03 16:57         ` Dario Faggioli
@ 2014-09-04  2:11         ` Meng Xu
  2014-09-04 11:00           ` Dario Faggioli
  2014-09-04 13:03           ` George Dunlap
  1 sibling, 2 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-04  2:11 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



2014-09-03 12:06 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Wed, Sep 3, 2014 at 4:13 PM, Meng Xu <xumengpanda@gmail.com> wrote:
> > Hi George,
> >
> >
> > 2014-09-03 10:20 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:
> >
> >> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> >> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> >> > index 5b11bbf..27d01c1 100644
> >> > --- a/xen/include/public/domctl.h
> >> > +++ b/xen/include/public/domctl.h
> >> > @@ -339,6 +339,19 @@ struct xen_domctl_max_vcpus {
> >> >  typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
> >> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >> >
> >> > +/*
> >> > + * This structure is used to pass to rt scheduler from a
> >> > + * privileged domain to Xen
> >> > + */
> >> > +struct xen_domctl_sched_rt_params {
> >> > +    /* get vcpus' info */
> >> > +    uint64_t period; /* s_time_t type */
> >> > +    uint64_t budget;
> >> > +    uint16_t index;
> >> > +    uint16_t padding[3];
> >>
> >> Why the padding?
> >>
> >
> > I did this because of Jan's comment "Also, you need to pad the structure
> to
> > a multiple of 8 bytes, or
> > its layout will differ between 32- and 64-bit (tool stack) callers." I
> think
> > what he said make sense so I added the padding here. :-)
> >
> > Here is the link: http://marc.info/?l=xen-devel&m=140661680931179&w=2
>
> Right. :-)  I personally prefer to handle that by re-arranging the
> elements rather than adding padding, unless absolutely necessary.  In
> this case that shouldn't be too hard, particularly once we pare the
> interface down so we only have one interface (either all one vcpu at a
> time, or all batched vcpus).
>

I agree. When we settle on the interface, I will implement it. At that
point, I would prefer to handle the layout by rearranging the elements unless
padding is really necessary.

So let's first settle on the approach for setting/getting vcpus (either
one vcpu at a time, or all vcpus batched).

What do you guys think about the approach for setting/getting vcpus? Which
approach do you prefer? Can we vote on it? (Or maybe there is some
other way to reach agreement? :-))

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 17:18           ` George Dunlap
@ 2014-09-04  2:15             ` Meng Xu
  2014-09-04 14:27             ` Dario Faggioli
  1 sibling, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-04  2:15 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb



2014-09-03 13:18 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Wed, Sep 3, 2014 at 5:57 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
> >> Re the per-vcpu settings, though: Is it really that common for RT
> >> domains to want different parameters for different vcpus?
> >>
> > Whether it's common it is hard to say, but yes, it has to be possible.
> >
> > For instance, I can put, in an SMP guest, two real-time applications
> > with different timing requirements, and pin each one to a different
> > (v)cpu (I mean pin *inside* the guest). At this point, I'd like for each
> > vcpu to have a set of RT scheduling parameters, at the Xen level, that
> > matches the timing requirements of what's running inside.
> >
> > This may not look so typical in a server/cloud environment, but can
> > happen (at least in my experience) in a mobile/embedded env.
>
> But to play devil's advocate for a minute here: couldn't you just put
> them in two different single-vcpu VMs then?
>

Not really. What if these two applications (or threads) want to
communicate with each other? For example, one thread senses some data
and sends it to the other thread, which does the control algorithm's computation.

Well, they can still share information between VMs, but the overhead would be
larger than sharing within the same domain.

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-04  2:11         ` Meng Xu
@ 2014-09-04 11:00           ` Dario Faggioli
  2014-09-04 13:03           ` George Dunlap
  1 sibling, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-04 11:00 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



On Wed, 2014-09-03 at 22:11 -0400, Meng Xu wrote:


> So let's first settle on the approach for setting/getting vcpus
> (either one vcpu at a time, or all vcpus batched).
> 
> 
> What do you guys think about the approach for setting/getting vcpus?
> Which approach do you prefer? Can we vote on it? (Or maybe there is
> some other way to reach agreement? :-))
> 
I'm not a fan of voting for this kind of thing. :-)

I think we were close to nailing it down in previous messages within this
thread. George said, in
<CAFLBxZaV9kUtc0=Ew3yO0BvR6EvLsHjK2pEOj6yB4eFpwG8big@mail.gmail.com>

 <<I guess the real question is how often we expect to be updating the
   parameters of a single vcpu, vs just setting all of the parameters
   (e.g., during domain creation).

   If we expect a lot of tweaking from user-space tools, then we should
   go with 3.  If we expect people to set parameters and then leave them
   alone, we should go with 1.>>

(where 3 is the array with a variable number of elements, i.e., only
relative to the affected vcpus, and 1 is the full array, with 0-s [or
whatever else] for 'don't touch this')

It's hard to be sure, but I'm leaning toward the full array (so 1, in
Meng's list in that email). In fact, I expect parameters to be set for
all the vcpus at, or immediately after, domain creation and, if
something has to change at some point, it's quite likely that it will
involve most (if not all) the vcpus (possibly, to different params, of
course). I also do not expect that to happen too frequently.

1 also looks to me, although potentially inefficient, rather easier to
use and implement. E.g., imagine a toolstack --different from xl (and
perhaps even different from libxl)-- which has to live in a Dom0 with
limited support for dynamic memory allocation (are we thinking embedded
or not :-P). With 1, it can just prepare a static array and go with it,
while 3 may cause problems.

So, yes, I think I'd go for 1.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-04  2:11         ` Meng Xu
  2014-09-04 11:00           ` Dario Faggioli
@ 2014-09-04 13:03           ` George Dunlap
  2014-09-04 14:00             ` Meng Xu
  1 sibling, 1 reply; 72+ messages in thread
From: George Dunlap @ 2014-09-04 13:03 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb

On Thu, Sep 4, 2014 at 3:11 AM, Meng Xu <xumengpanda@gmail.com> wrote:
>
> What do you guys think about the approach of setting/getting vcpus? which
> approach do you prefer? Can we vote for it? (or maybe there exist some other
> ways to reach agreement? :-))

Well, the standard thing is for us to try to reach a consensus if
possible.  If there's not a consensus, I as the scheduler maintainer
would get to / have to decide; Keir or Jan might overrule me if they
thought I was being completely unreasonable, but that would be a
pretty exceptional case. :-)

On Thu, Sep 4, 2014 at 12:00 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> It's hard to be sure, but I'm leaning toward the full array (so 1, in
> Meng's list in that email). In fact, I expect parameters to be set for
> all the vcpus at, or immediately after, domain creation and, if
> something has to change at some point, it's quite likely that it will
> involve most (if not all) the vcpus (possibly, to different params, of
> course). I also do not expect for that to happen too frequently.

That would have been my guess -- that "set up once and don't change"
was the common case.

So let's go with #1:
* Always pass an array of all vcpus
* But have some magic values which mean "don't change this one".

 -George


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-04 13:03           ` George Dunlap
@ 2014-09-04 14:00             ` Meng Xu
  0 siblings, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-04 14:00 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Chenyang Lu,
	Dario Faggioli, Ian Jackson, xen-devel, Linh Thi Xuan Phan,
	Meng Xu, Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



Hi George and Dario,


2014-09-04 9:03 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Thu, Sep 4, 2014 at 3:11 AM, Meng Xu <xumengpanda@gmail.com> wrote:
> >
> > What do you guys think about the approach of setting/getting vcpus? which
> > approach do you prefer? Can we vote for it? (or maybe there exist some
> other
> > ways to reach agreement? :-))
>
> Well, the standard thing is for us to try to reach a consensus if
> possible.  If there's not a consensus, I as the scheduler maintainer
> would get to / have to decide; Keir or Jan might overrule me if they
> thought I was being completely unreasonable, but that would be a
> pretty exceptional case. :-)
>

I see. This is also a great idea! :-) Next time, when I encounter such
issues, I will summarize the strengths and weaknesses of each approach and let
you guys decide. :-)



>
> On Thu, Sep 4, 2014 at 12:00 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
> > It's hard to be sure, but I'm leaning toward the full array (so 1, in
> > Meng's list in that email). In fact, I expect parameters to be set for
> > all the vcpus at, or immediately after, domain creation and, if
> > something has to change at some point, it's quite likely that it will
> > involve most (if not all) the vcpus (possibly, to different params, of
> > course). I also do not expect for that to happen too frequently.
>
> That would have been my guess -- that "set up once and don't change"
> was the common case.
>
> So let's go with #1:
> * Always pass an array of all vcpus
> * But have some magic values which mean "don't change this one".
>
>
Great! I will do that! :-)

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 17:18           ` George Dunlap
  2014-09-04  2:15             ` Meng Xu
@ 2014-09-04 14:27             ` Dario Faggioli
  2014-09-04 15:30               ` Meng Xu
  1 sibling, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-04 14:27 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb



On Wed, 2014-09-03 at 18:18 +0100, George Dunlap wrote:
> On Wed, Sep 3, 2014 at 5:57 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:

> > For instance, I can put, in an SMP guest, two real-time applications
> > with different timing requirements, and pin each one to a different
> > (v)cpu (I mean pin *inside* the guest). At this point, I'd like for each
> > vcpu to have a set of RT scheduling parameters, at the Xen level, that
> > matches the timing requirements of what's running inside.
> >
> > This may not look so typical in a server/cloud environment, but can
> > happen (at least in my experience) in a mobile/embedded env.
> 
> But to play devil's advocate for a minute here: 
>
Hehe, please, be my guest! :-D :-D

> couldn't you just put
> them in two different single-vcpu VMs then?
> 
Sure I can! Or not. :-)

What I mean is that, yes, for a whole bunch of people, "let's just use 2
VMs" can be an answer. For others, it just can't, e.g., as Meng said, if
there are communication constraints or, perhaps, even just if one has
the application implemented like that already, running on a 2-way
baremetal system, and splitting would mean rewriting, instead of just
putting it inside a VM, as usually happens with virtualization.

All this to say that, even if there exist (as always?) workarounds, this
is a feature I really --at some point, if not now-- would like to
support.

One more reason for that, that I at least wanted to cite, comes from the
so-called "real-time theory". Yeah, yeah, I know, academia people are
just crazy (right Meng! :-P), but from time to time, something that
makes sense comes out of there (ehm... like Xen? :-P).

So, you'd have to bear with me if I don't have all the details of the
involved math fresh in my mind (perhaps Meng can help here, if anyone is
interested), and if I reference a couple of research papers.
The key point is there are theoretically sound and mathematically proven
reasons why, if you have a 2-vcpu VM and you want to give it 140%
pCPU bandwidth, it may be a very bad idea to give 70% to each of its
vcpus.

The papers are these two here:
[1] A Hierarchical Multiprocessor Bandwidth Reservation Scheme
    with Timing Guarantees
    http://www.cs.unc.edu/~anderson/papers/ecrts08c.pdf
[2] Hierarchical Scheduling Framework for Virtual Clustering of
    Multiprocessors
    http://cps.kaist.ac.kr/papers/08ecrts-virtual-clustering.pdf

They (especially the first one) don't mention virtualization explicitly.
Rather, they talk about something called 'hierarchical scheduling'. Just
think of the Xen scheduler as tier 0, and the DomU scheduler as tier 1, of
the hierarchy, and there you have the analogy!

I find Example 2 (pages 3 and 4) in [1] particularly well suited to
explaining the basic idea, while I guess Meng is a lot more familiar with
[2] (it comes right from his institution! :-) )

It all comes from the fact that execution (of a real-time task inside a
VM) is sequential. Suppose I have that 2-vcpu VM, to which I need to
assign 140% of pCPU bandwidth, and inside of which there can be one or
more real-time tasks, together with some other _non_ real-time
activities.
The times that I have more than one rt task active at the same time, inside
the VM, they can exploit the parallelism the 2 vcpus provide them, so
(to a certain extent) no big deal about the single vcpus' parameters. The
times I have *only* one rt task active, it either runs on vcpu1 or on
vcpu2. Well, if the two vcpus have 70% pCPU bandwidth each, it is more
likely for them to be scheduled together, with the rt task not being
able to exploit the parallelism, to the point that it may (depending
on its own bandwidth demand) miss its deadlines. If I give 100% to
vcpu1 and 40% to vcpu2, it is less likely for them to be scheduled
together (or, if you want, they will be scheduled together for less
time), and more likely for the rt task alone to fulfill its timing
constraints.
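
To make that concrete with one purely illustrative set of numbers (chosen here
for the sake of argument, and ignoring server period/phasing effects):

    single rt task in the VM : C = 8 ms of work every T = 10 ms (utilization 0.8)
    70% + 70% split          : whichever vcpu the task is on supplies at most
                               about 7 ms per 10 ms  ->  7 < 8, deadlines can be missed
    100% + 40% split         : pin the task to vcpu1, which can supply the full
                               8 ms per 10 ms  ->  the deadline can be met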

I can try to add more details and numbers to the example, if required/if
someone is interested.

Forgive the long digression, I could not resist. :-D

All this being said: if this scheduler goes into Xen 4.5 (which I think it
should), I'm fine with "taking it easy", especially at the libxl level,
where we can't change things later, and, for example, with leaving the
per-vcpu parameters support as a future development (perhaps for Xen 4.6). I'm
just saying that I think it is an important part, that I'm happy the
scheduling algorithm and the hypervisor interface support it, and that I
want it to be available _at_some_point_. :-)

> > In any case, even without that in place right now, I think different
> > parameters for different vcpus is certainly something we want from an RT
> > scheduler.
> 
> Yeah, the "expose parameters to guests" was just thinking out loud
> about what would be useful in the future.  
>
Yep.

> I think possibly allowing a
> VM to change its period (while keeping the budget / period ratio the
> same) might make sense as well; that way you could have an RT
> "appliance" VM that you could just pop onto a system and let it
> configure itself, without the user having to do more than give it
> basic parameters.
> 
Indeed. Definitely interesting as future work!

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-03 15:33   ` George Dunlap
  2014-09-03 20:52     ` Meng Xu
@ 2014-09-04 14:27     ` George Dunlap
  2014-09-04 14:45       ` Dario Faggioli
  2014-09-04 14:47       ` Meng Xu
  1 sibling, 2 replies; 72+ messages in thread
From: George Dunlap @ 2014-09-04 14:27 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
>
> While the domctl interface is not stable, the libxl interface *is*
> stable, so we definitely need to think carefully about what we want
> this to look like.
>
> Let me give that a think. :-)

OK, so we had a chat about this at our team meeting today, and here is
what we came up with.

The feature freeze for 4.5 is next Wednesday.

The core scheduler is in good enough shape to be checked in as an
"experimental" mode, so it would be really nice to be able to get this
checked in.

The DOMCTL interface isn't stable so we can change that if we need to;
however, the libxl interface *is* stable.

The current libxl scheduler parameter interface assumes one set of
parameters per domain; it's not yet setup for per-vcpu parameters.  It
is unlikely that we would be able to converge on a new interface by
next week.

So the suggestion was this: For the moment, use the existing libxl
interface on a per-domain basis.  Internally, this will set all vcpus
to the same values.  This will allow us to check in a useable version
of the scheduler for people to test and improve.  Then for 4.6 we can
start working on a suitable libxl interface for setting per-vcpu
scheduling parameters.
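
(In code terms, the 4.5 plan boils down to something like the following inside
libxl, before issuing the hypercall; the names are only illustrative:)

    /* Fan the single per-domain period/budget out to every vcpu. */
    for (i = 0; i < num_vcpus; i++) {
        sdom[i].period = scinfo->period;
        sdom[i].budget = scinfo->budget;
    }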

Dario / Ian, did I miss anything?

Meng / &c, does that sound reasonable?

 -George


* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 14:27     ` George Dunlap
@ 2014-09-04 14:45       ` Dario Faggioli
  2014-09-04 14:47       ` Meng Xu
  1 sibling, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-04 14:45 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb



On Thu, 2014-09-04 at 15:27 +0100, George Dunlap wrote:
> So the suggestion was this: For the moment, use the existing libxl
> interface on a per-domain basis.  Internally, this will set all vcpus
> to the same values.  This will allow us to check in a useable version
> of the scheduler for people to test and improve.  Then for 4.6 we can
> start working on a suitable libxl interface for setting per-vcpu
> scheduling parameters.
> 
> Dario / Ian, did I miss anything?
> 
Not that I can spot.

And let me say, here too, that I am fully ok with this plan.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 14:27     ` George Dunlap
  2014-09-04 14:45       ` Dario Faggioli
@ 2014-09-04 14:47       ` Meng Xu
  2014-09-04 14:51         ` George Dunlap
  2014-09-04 15:25         ` Dario Faggioli
  1 sibling, 2 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-04 14:47 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Dario Faggioli,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb



Hi George,


2014-09-04 10:27 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
> >
> > While the domctl interface is not stable, the libxl interface *is*
> > stable, so we definitely need to think carefully about what we want
> > this to look like.
> >
> > Let me give that a think. :-)
>
> OK, so we had a chat about this at our team meeting today, and here is
> what we came up with.
>
> The feature freeze for 4.5 is next Wednesday.
>
> The core scheduler is in good enough shape to be checked in as an
> "experimental" mode, so it would be really nice to be able to get this
> checked in.
>
> The DOMCTL interface isn't stable so we can change that if we need to;
> however, the libxl interface *is* stable.
>
> The current libxl scheduler parameter interface assumes one set of
> parameters per domain; it's not yet setup for per-vcpu parameters.  It
> is unlikely that we would be able to converge on a new interface by
> next week.
>
> So the suggestion was this: For the moment, use the existing libxl
> interface on a per-domain basis.  Internally, this will set all vcpus
> to the same values.  This will allow us to check in a useable version
> of the scheduler for people to test and improve.  Then for 4.6 we can
> start working on a suitable libxl interface for setting per-vcpu
> scheduling parameters.
>
> Dario / Ian, did I miss anything?
>
> Meng / &c, does that sound reasonable?
>

I have a question as to the user interface.
For 4.5, we only allow users to set all vcpus to the same values (I'm
totally fine with that);
but what about the get function? When users issue the command "xl sched-rt",
how should we display the parameters of the vcpus? We just give the "period",
"budget" and "#VCPU" for a domain? I'm fine with this display for 4.5.

However, my concern is: in 4.6, when we allow vcpus to have different
parameters and need to display every vcpu's parameters, how should we
display them when users use the command "xl sched-rt"? When vcpus have
different periods and budgets, we cannot display them the way we did in 4.5. :-(

It's just a thought, just in case we overlook it. :-)

Meng



-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 14:47       ` Meng Xu
@ 2014-09-04 14:51         ` George Dunlap
  2014-09-04 15:07           ` Meng Xu
  2014-09-04 15:25         ` Dario Faggioli
  1 sibling, 1 reply; 72+ messages in thread
From: George Dunlap @ 2014-09-04 14:51 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Jackson, Ian Campbell, Sisu Xi, Stefano Stabellini,
	Dario Faggioli, Ian Jackson, xen-devel, Meng Xu, Jan Beulich,
	Chao Wang, Chong Li, Dagaen Golomb



On 09/04/2014 03:47 PM, Meng Xu wrote:
> Hi George,
>
>
> 2014-09-04 10:27 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com 
> <mailto:George.Dunlap@eu.citrix.com>>:
>
>     On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
>     <George.Dunlap@eu.citrix.com <mailto:George.Dunlap@eu.citrix.com>>
>     wrote:
>     >
>     > While the domctl interface is not stable, the libxl interface *is*
>     > stable, so we definitely need to think carefully about what we want
>     > this to look like.
>     >
>     > Let me give that a think. :-)
>
>     OK, so we had a chat about this at our team meeting today, and here is
>     what we came up with.
>
>     The feature freeze for 4.5 is next Wednesday.
>
>     The core scheduler is in good enough shape to be checked in as an
>     "experimental" mode, so it would be really nice to be able to get this
>     checked in.
>
>     The DOMCTL interface isn't stable so we can change that if we need to;
>     however, the libxl interface *is* stable.
>
>     The current libxl scheduler parameter interface assumes one set of
>     parameters per domain; it's not yet setup for per-vcpu parameters.  It
>     is unlikely that we would be able to converge on a new interface by
>     next week.
>
>     So the suggestion was this: For the moment, use the existing libxl
>     interface on a per-domain basis.  Internally, this will set all vcpus
>     to the same values.  This will allow us to check in a useable version
>     of the scheduler for people to test and improve.  Then for 4.6 we can
>     start working on a suitable libxl interface for setting per-vcpu
>     scheduling parameters.
>
>     Dario / Ian, did I miss anything?
>
>     Meng / &c, does that sound reasonable?
>
>
> I have a question as to the user interface.
> For 4.5, we only allow users to set all vcpus to the same values (I'm 
> totally fine with it.);
> But how about the get function? When users issue the command "xl 
> sched-rt", how should we display the parameters of vcpus? We just give 
> the "period", "budget" and "#VCPU" for a domain? I'm fine with this 
> display for 4.5.
>
> However ,my concerns is: In 4.6, when we allow vcpus to have different 
> parameters and need to display every vcpu's parameters, how should we 
> display when users use command "xl sched-rt"? When vcpus have 
> different period and budget, we cannot display like what we did in 4.5 
> then. :-(
>
> It's just my thought, just in case we neglect it. :-)

I think the xl interface doesn't have quite the same consistency 
guarantees as the libxl interface.  For now, I think just make it print 
one budget / period for the domain; and we can change it later.

I'll ping Ian Jackson to make sure he's OK with that.

  -George

[-- Attachment #1.2: Type: text/html, Size: 5913 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 14:51         ` George Dunlap
@ 2014-09-04 15:07           ` Meng Xu
  2014-09-04 15:44             ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-04 15:07 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Jackson, Ian Campbell, Sisu Xi, Stefano Stabellini,
	Dario Faggioli, Ian Jackson, xen-devel, Meng Xu, Jan Beulich,
	Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 3081 bytes --]

2014-09-04 10:51 GMT-04:00 George Dunlap <george.dunlap@eu.citrix.com>:

>  On 09/04/2014 03:47 PM, Meng Xu wrote:
>
>  Hi George,
>
>
> 2014-09-04 10:27 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:
>
>> On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
>> <George.Dunlap@eu.citrix.com> wrote:
>> >
>> > While the domctl interface is not stable, the libxl interface *is*
>> > stable, so we definitely need to think carefully about what we want
>> > this to look like.
>> >
>> > Let me give that a think. :-)
>>
>>  OK, so we had a chat about this at our team meeting today, and here is
>> what we came up with.
>>
>> The feature freeze for 4.5 is next Wednesday.
>>
>> The core scheduler is in good enough shape to be checked in as an
>> "experimental" mode, so it would be really nice to be able to get this
>> checked in.
>>
>> The DOMCTL interface isn't stable so we can change that if we need to;
>> however, the libxl interface *is* stable.
>>
>> The current libxl scheduler parameter interface assumes one set of
>> parameters per domain; it's not yet setup for per-vcpu parameters.  It
>> is unlikely that we would be able to converge on a new interface by
>> next week.
>>
>> So the suggestion was this: For the moment, use the existing libxl
>> interface on a per-domain basis.  Internally, this will set all vcpus
>> to the same values.  This will allow us to check in a useable version
>> of the scheduler for people to test and improve.  Then for 4.6 we can
>> start working on a suitable libxl interface for setting per-vcpu
>> scheduling parameters.
>>
>> Dario / Ian, did I miss anything?
>>
>> Meng / &c, does that sound reasonable?
>>
>
>  I have a question as to the user interface.
>   For 4.5, we only allow users to set all vcpus to the same values (I'm
> totally fine with it.);
>  But how about the get function? When users issue the command "xl
> sched-rt", how should we display the parameters of vcpus? We just give the
> "period", "budget" and "#VCPU" for a domain? I'm fine with this display for
> 4.5.
>
>   However ,my concerns is: In 4.6, when we allow vcpus to have different
> parameters and need to display every vcpu's parameters, how should we
> display when users use command "xl sched-rt"? When vcpus have different
> period and budget, we cannot display like what we did in 4.5 then. :-(
>
>   It's just my thought, just in case we neglect it. :-)
>
>
> I think the xl interface doesn't have quite the same consistency
> guarantees as the libxl interface.  For now, I think just make it print one
> budget / period for the domain; and we can change it later.
>
>
I see. I'm totally ok with the decision! :-)
So I will only use the existing libxl interface, without adding an array to
it, to set/get the vcpus' parameters of a domain. Am I right? (Just to
confirm I understand correctly, and then I will go implement it. :-) )

Thanks,

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 6590 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 14:47       ` Meng Xu
  2014-09-04 14:51         ` George Dunlap
@ 2014-09-04 15:25         ` Dario Faggioli
  1 sibling, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-04 15:25 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 2465 bytes --]

On gio, 2014-09-04 at 10:47 -0400, Meng Xu wrote:

> 2014-09-04 10:27 GMT-04:00 George Dunlap
> <George.Dunlap@eu.citrix.com>:

>         
>         So the suggestion was this: For the moment, use the existing
>         libxl
>         interface on a per-domain basis.  Internally, this will set
>         all vcpus
>         to the same values.  This will allow us to check in a useable
>         version
>         of the scheduler for people to test and improve.  Then for 4.6
>         we can
>         start working on a suitable libxl interface for setting
>         per-vcpu
>         scheduling parameters.
> 

> I have a question as to the user interface.
> For 4.5, we only allow users to set all vcpus to the same values (I'm
> totally fine with it.); 
>
Right.

> But how about the get function? When users issue the command "xl
> sched-rt", how should we display the parameters of vcpus? We just give
> the "period", "budget" and "#VCPU" for a domain? I'm fine with this
> display for 4.5.
> 
xl builds on top of libxl. If setting and getting per-vcpu values is not
possible from libxl, it won't be possible from xl either.

I'd say printing just one set of params, the one that applies to all
the vcpus of the domain, is fine for 4.5. So, from xl, you'll get
something similar to this:

# xl sched-credit
Cpupool Pool-0: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
Domain-0                             0    256    0

> However ,my concerns is: In 4.6, when we allow vcpus to have different
> parameters and need to display every vcpu's parameters, how should we
> display when users use command "xl sched-rt"? When vcpus have
> different period and budget, we cannot display like what we did in 4.5
> then. :-(
> 
It is the libxl API that has stability constraints, not the output of the
xl sub-commands.

Of course, it's not very nice to turn something completely upside down.
But given the fact we're accepting the new scheduler as an experimental
feature, and the fact that `xl sched-rt' will be a new command being
introduced in 4.5, I don't think changing its output in 4.6 would be a
problem.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-04 14:27             ` Dario Faggioli
@ 2014-09-04 15:30               ` Meng Xu
  2014-09-05  9:36                 ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-04 15:30 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 1720 bytes --]

2014-09-04 10:27 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:

> On mer, 2014-09-03 at 18:18 +0100, George Dunlap wrote:
> > On Wed, Sep 3, 2014 at 5:57 PM, Dario Faggioli
> > <dario.faggioli@citrix.com> wrote:
>
> > > For instance, I can put, in an SMP guest, two real-time applications
> > > with different timing requirements, and pin each one to a different
> > > (v)cpu (I mean pin *inside* the guest). At this point, I'd like for
> each
> > > vcpu to have a set of RT scheduling parameters, at the Xen level, that
> > > matches the timing requirements of what's running inside.
> > >
> > > This may not look so typical in a server/cloud environment, but can
> > > happen (at least in my experience) in a mobile/embedded env.
> >
> > But to play devil's advocate for a minute here:
> >
> Hehe, please, be my guest! :-D :-D
>
> > couldn't you just put
> > them in two different single-vcpu VMs then?
> >
>

Well, let me give a simpler example:
Suppose we have three tasks in one VM, and each task has period 4ms and
budget 6ms (its utilization is 2/3). If all three tasks start execution at
the same time, we can use two full-capacity vcpus (200% of cpu capacity) to
schedule them.
However, if you instead use two VMs, each with one full-capacity vcpu (100%
of cpu capacity), we cannot schedule these three tasks: two tasks of
utilization 2/3 do not fit on a single 100% vcpu, and a task cannot (well,
at least not easily) migrate from one VM to another.

This is just a simple example; we could of course construct a similar
example in which the vcpus are not full-capacity vcpus. :-)

Meng



-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 2718 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 15:07           ` Meng Xu
@ 2014-09-04 15:44             ` Dario Faggioli
  2014-09-04 15:55               ` George Dunlap
  0 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-04 15:44 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chong Li, Ian Jackson, xen-devel, Meng Xu, Jan Beulich,
	Chao Wang, Ian Jackson, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 4832 bytes --]

On gio, 2014-09-04 at 11:07 -0400, Meng Xu wrote:

> 2014-09-04 10:51 GMT-04:00 George Dunlap
> <george.dunlap@eu.citrix.com>:
>         On 09/04/2014 03:47 PM, Meng Xu wrote:
>         
>         > Hi George,
>         > 
>         > 
>         > 2014-09-04 10:27 GMT-04:00 George Dunlap
>         > <George.Dunlap@eu.citrix.com>:
>         >         On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
>         >         <George.Dunlap@eu.citrix.com> wrote:
>         >         >
>         >         > While the domctl interface is not stable, the
>         >         libxl interface *is*
>         >         > stable, so we definitely need to think carefully
>         >         about what we want
>         >         > this to look like.
>         >         >
>         >         > Let me give that a think. :-)
>         >         
>         >         
>         >         OK, so we had a chat about this at our team meeting
>         >         today, and here is
>         >         what we came up with.
>         >         
>         >         The feature freeze for 4.5 is next Wednesday.
>         >         
>         >         The core scheduler is in good enough shape to be
>         >         checked in as an
>         >         "experimental" mode, so it would be really nice to
>         >         be able to get this
>         >         checked in.
>         >         
>         >         The DOMCTL interface isn't stable so we can change
>         >         that if we need to;
>         >         however, the libxl interface *is* stable.
>         >         
>         >         The current libxl scheduler parameter interface
>         >         assumes one set of
>         >         parameters per domain; it's not yet setup for
>         >         per-vcpu parameters.  It
>         >         is unlikely that we would be able to converge on a
>         >         new interface by
>         >         next week.
>         >         
>         >         So the suggestion was this: For the moment, use the
>         >         existing libxl
>         >         interface on a per-domain basis.  Internally, this
>         >         will set all vcpus
>         >         to the same values.  This will allow us to check in
>         >         a useable version
>         >         of the scheduler for people to test and improve.
>         >         Then for 4.6 we can
>         >         start working on a suitable libxl interface for
>         >         setting per-vcpu
>         >         scheduling parameters.
>         >         
>         >         Dario / Ian, did I miss anything?
>         >         
>         >         Meng / &c, does that sound reasonable?
>         > 
>         > 
>         > I have a question as to the user interface.
>         > For 4.5, we only allow users to set all vcpus to the same
>         > values (I'm totally fine with it.); 
>         > But how about the get function? When users issue the command
>         > "xl sched-rt", how should we display the parameters of
>         > vcpus? We just give the "period", "budget" and "#VCPU" for a
>         > domain? I'm fine with this display for 4.5.
>         > 
>         > 
>         > However ,my concerns is: In 4.6, when we allow vcpus to have
>         > different parameters and need to display every vcpu's
>         > parameters, how should we display when users use command "xl
>         > sched-rt"? When vcpus have different period and budget, we
>         > cannot display like what we did in 4.5 then. :-(
>         > 
>         > 
>         > It's just my thought, just in case we neglect it. :-)
>         
>         
>         I think the xl interface doesn't have quite the same
>         consistency guarantees as the libxl interface.  For now, I
>         think just make it print one budget / period for the domain;
>         and we can change it later.
>         
>         
> 
> 
> ​I see. I'm totally ​ok with the decision! :-)
> ​So I will only use the existing libxl interface without adding an
> array to it, to set/get the vcpus' parameters of a domain. Am I right?
>
Yep, no array. You just add 'period' and 'budget' fields _inside_
libxl_domain_sched_params, without putting them inside any wrapping
struct, union or array.

Then, in the implementation, you just take those two and apply them to
all the domain's vcpus (by using the array-based interface we agreed
upon for the hypervisor).

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 15:44             ` Dario Faggioli
@ 2014-09-04 15:55               ` George Dunlap
  2014-09-04 16:12                 ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: George Dunlap @ 2014-09-04 15:55 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	Ian Jackson, xen-devel, Meng Xu, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

On Thu, Sep 4, 2014 at 4:44 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On gio, 2014-09-04 at 11:07 -0400, Meng Xu wrote:
>
>> 2014-09-04 10:51 GMT-04:00 George Dunlap
>> <george.dunlap@eu.citrix.com>:
>>         On 09/04/2014 03:47 PM, Meng Xu wrote:
>>
>>         > Hi George,
>>         >
>>         >
>>         > 2014-09-04 10:27 GMT-04:00 George Dunlap
>>         > <George.Dunlap@eu.citrix.com>:
>>         >         On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
>>         >         <George.Dunlap@eu.citrix.com> wrote:
>>         >         >
>>         >         > While the domctl interface is not stable, the
>>         >         libxl interface *is*
>>         >         > stable, so we definitely need to think carefully
>>         >         about what we want
>>         >         > this to look like.
>>         >         >
>>         >         > Let me give that a think. :-)
>>         >
>>         >
>>         >         OK, so we had a chat about this at our team meeting
>>         >         today, and here is
>>         >         what we came up with.
>>         >
>>         >         The feature freeze for 4.5 is next Wednesday.
>>         >
>>         >         The core scheduler is in good enough shape to be
>>         >         checked in as an
>>         >         "experimental" mode, so it would be really nice to
>>         >         be able to get this
>>         >         checked in.
>>         >
>>         >         The DOMCTL interface isn't stable so we can change
>>         >         that if we need to;
>>         >         however, the libxl interface *is* stable.
>>         >
>>         >         The current libxl scheduler parameter interface
>>         >         assumes one set of
>>         >         parameters per domain; it's not yet setup for
>>         >         per-vcpu parameters.  It
>>         >         is unlikely that we would be able to converge on a
>>         >         new interface by
>>         >         next week.
>>         >
>>         >         So the suggestion was this: For the moment, use the
>>         >         existing libxl
>>         >         interface on a per-domain basis.  Internally, this
>>         >         will set all vcpus
>>         >         to the same values.  This will allow us to check in
>>         >         a useable version
>>         >         of the scheduler for people to test and improve.
>>         >         Then for 4.6 we can
>>         >         start working on a suitable libxl interface for
>>         >         setting per-vcpu
>>         >         scheduling parameters.
>>         >
>>         >         Dario / Ian, did I miss anything?
>>         >
>>         >         Meng / &c, does that sound reasonable?
>>         >
>>         >
>>         > I have a question as to the user interface.
>>         > For 4.5, we only allow users to set all vcpus to the same
>>         > values (I'm totally fine with it.);
>>         > But how about the get function? When users issue the command
>>         > "xl sched-rt", how should we display the parameters of
>>         > vcpus? We just give the "period", "budget" and "#VCPU" for a
>>         > domain? I'm fine with this display for 4.5.
>>         >
>>         >
>>         > However ,my concerns is: In 4.6, when we allow vcpus to have
>>         > different parameters and need to display every vcpu's
>>         > parameters, how should we display when users use command "xl
>>         > sched-rt"? When vcpus have different period and budget, we
>>         > cannot display like what we did in 4.5 then. :-(
>>         >
>>         >
>>         > It's just my thought, just in case we neglect it. :-)
>>
>>
>>         I think the xl interface doesn't have quite the same
>>         consistency guarantees as the libxl interface.  For now, I
>>         think just make it print one budget / period for the domain;
>>         and we can change it later.
>>
>>
>>
>>
>> I see. I'm totally ok with the decision! :-)
>> So I will only use the existing libxl interface without adding an
>> array to it, to set/get the vcpus' parameters of a domain. Am I right?
>>
> Yep, no array. You just add a 'period' and a 'budget' fields _inside_
> libxl_domain_sched_params, without putting them inside any wrapping
> struct, union or array.

Except that you don't need to add a "period" field, since there's
already one there (for the SEDF scheduler).

We could re-use the "slice" field instead of adding "budget", but I
think probably for clarity adding "budget" is better (although I'm
open to other opinions on that one).
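
Concretely, I guess that would just mean adding something like this to
libxl_domain_sched_params in libxl_types.idl (the exact type and default
name here are only illustrative, of course):

    libxl_domain_sched_params = Struct("domain_sched_params",[
        ...
        ("budget",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
        ])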

 -George

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 15:55               ` George Dunlap
@ 2014-09-04 16:12                 ` Meng Xu
  2014-09-05  9:19                   ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-04 16:12 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Jackson, Ian Campbell, Sisu Xi, Stefano Stabellini,
	Chenyang Lu, Dario Faggioli, Ian Jackson, xen-devel,
	Linh Thi Xuan Phan, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 5320 bytes --]

Hi George and Dario,


2014-09-04 11:55 GMT-04:00 George Dunlap <George.Dunlap@eu.citrix.com>:

> On Thu, Sep 4, 2014 at 4:44 PM, Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
> > On gio, 2014-09-04 at 11:07 -0400, Meng Xu wrote:
> >
> >> 2014-09-04 10:51 GMT-04:00 George Dunlap
> >> <george.dunlap@eu.citrix.com>:
> >>         On 09/04/2014 03:47 PM, Meng Xu wrote:
> >>
> >>         > Hi George,
> >>         >
> >>         >
> >>         > 2014-09-04 10:27 GMT-04:00 George Dunlap
> >>         > <George.Dunlap@eu.citrix.com>:
> >>         >         On Wed, Sep 3, 2014 at 4:33 PM, George Dunlap
> >>         >         <George.Dunlap@eu.citrix.com> wrote:
> >>         >         >
> >>         >         > While the domctl interface is not stable, the
> >>         >         libxl interface *is*
> >>         >         > stable, so we definitely need to think carefully
> >>         >         about what we want
> >>         >         > this to look like.
> >>         >         >
> >>         >         > Let me give that a think. :-)
> >>         >
> >>         >
> >>         >         OK, so we had a chat about this at our team meeting
> >>         >         today, and here is
> >>         >         what we came up with.
> >>         >
> >>         >         The feature freeze for 4.5 is next Wednesday.
> >>         >
> >>         >         The core scheduler is in good enough shape to be
> >>         >         checked in as an
> >>         >         "experimental" mode, so it would be really nice to
> >>         >         be able to get this
> >>         >         checked in.
> >>         >
> >>         >         The DOMCTL interface isn't stable so we can change
> >>         >         that if we need to;
> >>         >         however, the libxl interface *is* stable.
> >>         >
> >>         >         The current libxl scheduler parameter interface
> >>         >         assumes one set of
> >>         >         parameters per domain; it's not yet setup for
> >>         >         per-vcpu parameters.  It
> >>         >         is unlikely that we would be able to converge on a
> >>         >         new interface by
> >>         >         next week.
> >>         >
> >>         >         So the suggestion was this: For the moment, use the
> >>         >         existing libxl
> >>         >         interface on a per-domain basis.  Internally, this
> >>         >         will set all vcpus
> >>         >         to the same values.  This will allow us to check in
> >>         >         a useable version
> >>         >         of the scheduler for people to test and improve.
> >>         >         Then for 4.6 we can
> >>         >         start working on a suitable libxl interface for
> >>         >         setting per-vcpu
> >>         >         scheduling parameters.
> >>         >
> >>         >         Dario / Ian, did I miss anything?
> >>         >
> >>         >         Meng / &c, does that sound reasonable?
> >>         >
> >>         >
> >>         > I have a question as to the user interface.
> >>         > For 4.5, we only allow users to set all vcpus to the same
> >>         > values (I'm totally fine with it.);
> >>         > But how about the get function? When users issue the command
> >>         > "xl sched-rt", how should we display the parameters of
> >>         > vcpus? We just give the "period", "budget" and "#VCPU" for a
> >>         > domain? I'm fine with this display for 4.5.
> >>         >
> >>         >
> >>         > However ,my concerns is: In 4.6, when we allow vcpus to have
> >>         > different parameters and need to display every vcpu's
> >>         > parameters, how should we display when users use command "xl
> >>         > sched-rt"? When vcpus have different period and budget, we
> >>         > cannot display like what we did in 4.5 then. :-(
> >>         >
> >>         >
> >>         > It's just my thought, just in case we neglect it. :-)
> >>
> >>
> >>         I think the xl interface doesn't have quite the same
> >>         consistency guarantees as the libxl interface.  For now, I
> >>         think just make it print one budget / period for the domain;
> >>         and we can change it later.
> >>
> >>
> >>
> >>
> >> I see. I'm totally ok with the decision! :-)
> >> So I will only use the existing libxl interface without adding an
> >> array to it, to set/get the vcpus' parameters of a domain. Am I right?
> >>
> > Yep, no array. You just add a 'period' and a 'budget' fields _inside_
> > libxl_domain_sched_params, without putting them inside any wrapping
> > struct, union or array.
>
> Except that you don't need to add a "period" field, since there's
> already one there (for the SEDF scheduler).
>
> We could re-use the "slice" field instead of adding "budget", but I
> think probably for clarity adding "budget" is better (although I'm
> open to other opinions on that one).
>

I like the idea of adding "budget" because it's much clearer.

As to the other implementation details, I think I got it and will do that
now. :-)

Thanks,

Meng

-- 


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

[-- Attachment #1.2: Type: text/html, Size: 8095 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-04 16:12                 ` Meng Xu
@ 2014-09-05  9:19                   ` Dario Faggioli
  0 siblings, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05  9:19 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, Ian Jackson, xen-devel,
	Linh Thi Xuan Phan, Meng Xu, Jan Beulich, Chao Wang, Chong Li,
	Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 1454 bytes --]

On gio, 2014-09-04 at 12:12 -0400, Meng Xu wrote:

> 2014-09-04 11:55 GMT-04:00 George Dunlap 
>         
>         >> I see. I'm totally ok with the decision! :-)
>         >> So I will only use the existing libxl interface without
>         adding an
>         >> array to it, to set/get the vcpus' parameters of a domain.
>         Am I right?
>         >>
>         > Yep, no array. You just add a 'period' and a 'budget' fields
>         _inside_
>         > libxl_domain_sched_params, without putting them inside any
>         wrapping
>         > struct, union or array.
>         
>         
>         Except that you don't need to add a "period" field, since
>         there's
>         already one there (for the SEDF scheduler).
>         
Right, I overlooked that, sorry. :-P

>         We could re-use the "slice" field instead of adding "budget",
>         but I
>         think probably for clarity adding "budget" is better (although
>         I'm
>         open to other opinions on that one).
> 
> 
> ​I like the idea of adding "budget" because it's much clearer. ​
> 
I'm fine with adding a "budget" parameter too.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-04 15:30               ` Meng Xu
@ 2014-09-05  9:36                 ` Dario Faggioli
  2014-09-05 15:06                   ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05  9:36 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 3225 bytes --]

On gio, 2014-09-04 at 11:30 -0400, Meng Xu wrote:
> 
> 
> 2014-09-04 10:27 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:

>         > > For instance, I can put, in an SMP guest, two real-time
>         applications
>         > > with different timing requirements, and pin each one to a
>         different
>         > > (v)cpu (I mean pin *inside* the guest). At this point, I'd
>         like for each
>         > > vcpu to have a set of RT scheduling parameters, at the Xen
>         level, that
>         > > matches the timing requirements of what's running inside.
>         > >
>         > > This may not look so typical in a server/cloud
>         environment, but can
>         > > happen (at least in my experience) in a mobile/embedded
>         env.
>         >
>         > But to play devil's advocate for a minute here:
>         >
>         
>         Hehe, please, be my guest! :-D :-D
>         
>         > couldn't you just put
>         > them in two different single-vcpu VMs then?
>         >
>         
> 
> 
> ​Well, let me give a simpler example:
> Suppose we have three tasks in one VM, each task has period 4ms and
> budget ​6ms (its utilization is 2/3).
>
You mean budget=4ms and period=6ms, don't you? :-)

>  If all these three tasks starts execution at the same time, we can
> use two full-capacity vcpus (200% capacity cpu resource) to schedule
> these three tasks. 
> However if you want to get two VMs, each of which has a full capacity
> vcpu (100% capacity cpu), we cannot schedule these three tasks,
> because one tasks cannot (well, at least very hard) migrate from one
> VM to another. 
> 
But... In this case, in the former configuration (1 VM with 2 vcpus),
each vcpu would (or at least can) have the same bandwidth of 100%, i.e.,
the same parameters... or am I missing something?

What we're trying to assess here is the usefulness of being able to set
_different_ parameters (and hence a different pcpu bandwidth) for each
vcpu.

Also, it looks like you're assuming there is a real-time scheduler inside
the VM, which may or may not be the case.

> ​This is just a simple example, we could of course have an example
> like this but the vcpus are not full-capacity vcpus. :-)
> 
Yeah, well, perhaps it's a bit too simple. :-D

Don't get me wrong, I still think per-vcpu params are something we
really want; it's just the example I'm not sure I'm getting/liking.

I still think the example of multiple, concurrent and strictly related
activities having different timing requirements is a really sensible
one. In fact, in that case, especially if one does not have a real-time
scheduler inside the guest, mapping those requirements onto the Xen
scheduler is the easiest (only?) way to port the app from bare metal to
a virtual machine!

Regards,
Dario

PS. BTW, Meng, can you use plain text emails when sending to the list?

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 4/4] xl: introduce rt scheduler
  2014-09-03 22:28     ` Meng Xu
@ 2014-09-05  9:40       ` Dario Faggioli
  2014-09-05 14:43         ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05  9:40 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 1635 bytes --]

On mer, 2014-09-03 at 18:28 -0400, Meng Xu wrote:

> 2014-09-03 11:52 GMT-04:00 George Dunlap
> <George.Dunlap@eu.citrix.com>:

>         The main thing about this is that you probably want to have an
>         interface where you can set the parameters for multiple vcpus
>         at once;
>         and that's probably the case regardless of whether we end up
>         making
>         the hypercall able to batch vcpus or do them one-at-a-time.
> 
> 
> ​I'm totally fine with allowing users to set the parameters for
> multiple vcpus at once. 
> 
> 
> My question is:
> When users set multiple vcpus at once, how the command looks like? 
> For example, if they want to set two vcpus' parameters for dom 1's at
> once (to be specific, set vcpu 0's period to 20 and budget to 10, and
> set vcpu 1's period to 30 and budget to 20.),  they should use the
> command like "xl sched-rt -d 1 -v 0 -p 20 -b 10 -v 1 -p 30 -b 20"? 
>
So, since we agreed about only allowing, *for* *now*, per-domain
parameter setting, this is not a requirement anymore, is it?

I think we can just stick to a pretty basic implementation of `xl
sched-rt', acting similarly to its credit, credit2 (and sedf)
counterparts.

How to allow a user to set the parameters for multiple vcpus at the
same time from xl has become a Xen 4.6 issue. :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-03 13:40   ` George Dunlap
  2014-09-03 14:11     ` Meng Xu
@ 2014-09-05  9:46     ` Dario Faggioli
  1 sibling, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05  9:46 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, Ian Jackson,
	xen-devel, Meng Xu, Meng Xu, Jan Beulich, chaowang, Chong Li,
	dgolomb


[-- Attachment #1.1: Type: text/plain, Size: 2449 bytes --]

On mer, 2014-09-03 at 14:40 +0100, George Dunlap wrote:
> On Sun, Aug 24, 2014 at 11:58 PM, Meng Xu <mengxu@cis.upenn.edu> wrote:
> > This scheduler follows the pre-emptive Global EDF theory in real-time field.
> > Each VCPU can have a dedicated period and budget.
> > While scheduled, a VCPU burns its budget.
> > A VCPU has its budget replenished at the beginning of each of its periods;
> > The VCPU discards its unused budget at the end of each of its periods.
> > If a VCPU runs out of budget in a period, it has to wait until next period.
> > The mechanism of how to burn a VCPU's budget depends on the server mechanism
> > implemented for each VCPU.
> >
> > Server mechanism: a VCPU is implemented as a deferable server.
> > When a VCPU is scheduled to execute on a PCPU, its budget is continuously
> > burned.
> >
> > Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
> > At any scheduling point, the VCPU with earliest deadline has highest
> > priority.
> >
> > Queue scheme: A global Runqueue for each CPU pool.
> > The Runqueue holds all runnable VCPUs.
> > VCPUs in the Runqueue are divided into two parts: with and without budget.
> > At each part, VCPUs are sorted based on gEDF priority scheme.
> >
> > Scheduling quantum: 1 ms;
> >
> > Note: cpumask and cpupool is supported.
> >
> > This is still in the development phase.
> 
> You should probably take this out now that you've removed the RFC. :-)
> 
Should he?

I mean, AFAIUI, we are accepting (well, at least, that's the plan :-) )
the scheduler in an 'experimental' status.

When we merged ARM support for the first time, I remember there were
some similar claims around the code and docs (it was API/ABI/something
stability at the time, I think), and the same, IIRC, for PVH.

Credit2 still warns its users like this, on boot:

    printk("Initializing Credit2 scheduler\n" \
           " WARNING: This is experimental software in development.\n" \
           " Use at your own risk.\n");

So, probably "still in the development phase" may be a bit strong, but
I'd be inclined have something like that somewhere.
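
For the rt scheduler, that could simply be something like this in its init
function (just a sketch; the exact wording is of course up to Meng):

    printk("Initializing rt scheduler\n"
           " WARNING: This is experimental software in development.\n"
           " Use at your own risk.\n");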

Thoughts?

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-08-24 22:58 ` [PATCH v1 3/4] libxl: " Meng Xu
  2014-08-25 13:17   ` Wei Liu
  2014-09-03 15:33   ` George Dunlap
@ 2014-09-05 10:21   ` Dario Faggioli
  2014-09-05 15:45     ` Meng Xu
  2 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05 10:21 UTC (permalink / raw)
  To: Meng Xu
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	ian.jackson, xen-devel, xumengpanda, JBeulich, chaowang,
	lichong659, dgolomb


[-- Attachment #1.1: Type: text/plain, Size: 7474 bytes --]

On dom, 2014-08-24 at 18:58 -0400, Meng Xu wrote:
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc, uint32_t domid,
>      return 0;
>  }
>  
> +static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
> +                               libxl_domain_sched_params *scinfo)
> +{
> +    struct xen_domctl_sched_rt_params* sdom;
> +    uint16_t num_vcpus;
> +    int rc, i;
> +
> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting num_vcpus of domain sched rt");
> +        return ERROR_FAIL;
> +    }
>
As George pointed out already, you can get the num_vcpus via
xc_domain_getinfo().

I agree with George's review about the rest of this function
(appropriately updated to take what we decided about the interface into
account :-) ).

> +#define SCHED_RT_VCPU_PERIOD_MAX    31536000000000 /* one year in microsecond*/
> +#define SCHED_RT_VCPU_BUDGET_MAX    SCHED_RT_VCPU_PERIOD_MAX
> +
This may be me not remembering correctly the outcome of a preceding
discussion... did we say we were going with _RT_ or with something like
_RTDS_ ?

ISTR the latter...

Also, macros like INT_MAX, UINT_MAX, etc., or << and ~ "tricks" are,
IMO, preferable to open-coding the value.

Finally, I wonder whether these would be better placed in some header,
closer to the declarations of period and budget (where their type is also
visible) and, as a nice side effect of that, available to libxl callers
as well.
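
Just to show what I mean (the names and the actual bound being purely
illustrative here):

    #define SCHED_RT_VCPU_PERIOD_MAX    (~(uint64_t)0)  /* or UINT64_MAX, etc. */
    #define SCHED_RT_VCPU_BUDGET_MAX    SCHED_RT_VCPU_PERIOD_MAX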

> +/*
> + * Sanity check of the scinfo parameters
> + * return 0 if all values are valid
> + * return 1 if one param is default value
> + * return 2 if the target vcpu's index, period or budget is out of range
> + */
> +static int sched_rt_domain_set_validate_params(libxl__gc *gc,
> +                                               const libxl_domain_sched_params *scinfo,
> +                                               const uint16_t num_vcpus)
> +{
> +    int vcpu_index = scinfo->rt.vcpu_index;
> +
As per the low level interface (Xen and libxc) there should be no need
for any vcpu_index anymore, right?

I'm just double checking, as the discussion was --as it should, on these
things-- long and involved :-D

> +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
> +    {
> +        LOG(ERROR, "VCPU index is not set or out of range, "
> +                    "valid values are within range from 0 to %d", num_vcpus);
> +        return 2;
> +    }
> +
> +    if (scinfo->rt.period < 1 ||
> +        scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX)
> +    {
> +        LOG(ERROR, "VCPU period is not set or out of range, "
> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_PERIOD_MAX);
> +        return 2;
> +    }
> +
> +    if (scinfo->rt.budget < 1 ||
> +        scinfo->rt.budget > SCHED_RT_VCPU_BUDGET_MAX)
> +    {
> +        LOG(ERROR, "VCPU budget is not set or out of range, "
> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_BUDGET_MAX);
> +        return 2;
> +    }
> +
I think some basic sanity checks are fine here. E.g., you may
also add a (budget <= period) check. However, as George said already:
 - make sure you limit these to the kind of checks based on info the 
   toolstack should have, and defer the rest to the hypervisor
 - if something goes wrong, make sure to always return libxl error codes
   (typically, ERROR_INVAL), as sched_credit_domain_set() does.

Oh and, just double checking again, I think we decided to handle
(budget ==0 && period == 0) in a special way, so remember that! :-P

It's personal taste, I guess, but I think you don't really need a
helper function for this; it can live in sched_rt_domain_set.
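
Roughly, I mean something like this directly in sched_rt_domain_set() (just
a sketch, using the field names from this version of the patch; the exact
meaning of the 0/0 case is whatever we agreed upon):

    /* the special case we discussed: both unset, leave the params alone */
    if (scinfo->rt.period == 0 && scinfo->rt.budget == 0)
        return 0;

    if (scinfo->rt.period < 1 || scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX ||
        scinfo->rt.budget < 1 || scinfo->rt.budget > scinfo->rt.period)
    {
        LOG(ERROR, "period and/or budget out of range (budget must be <= period)");
        return ERROR_INVAL;
    }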

> +static int sched_rt_domain_set(libxl__gc *gc, uint32_t domid,
> +                               const libxl_domain_sched_params *scinfo)
> +{
> +    struct xen_domctl_sched_rt_params sdom;
> +    uint16_t num_vcpus;
> +    int rc;
> + 
> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> +    if (rc != 0) {
> +        LOGE(ERROR, "getting domain sched rt");
> +        return ERROR_FAIL;
> +    }
> +    
> +    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
> +    if (rc == 2)
> +        return ERROR_INVAL;
> +    if (rc == 1)
> +        return 0;
> +    if (rc == 0)
> +    {
> +        sdom.index = scinfo->rt.vcpu_index;
> +        sdom.period = scinfo->rt.period;
> +        sdom.budget = scinfo->rt.budget;
> +    }
> +
So, I see that you are actually returning libxl error codes, good. Well,
again, just put the code above directly here, instead of having to
define and then interpret ad-hoc error handling logic. :-)

> @@ -5177,6 +5310,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
>      case LIBXL_SCHEDULER_ARINC653:
>          ret=sched_arinc653_domain_set(gc, domid, scinfo);
>          break;
> +    case LIBXL_SCHEDULER_RT:
> +        ret=sched_rt_domain_set(gc, domid, scinfo);
> +        break;
Again, I seriously think I remember we agreed upon _SCHEDULER_RTDS (or
_RT_DS) as the name of this thing. Am I wrong?

> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -153,6 +153,7 @@ libxl_scheduler = Enumeration("scheduler", [
>      (5, "credit"),
>      (6, "credit2"),
>      (7, "arinc653"),
> +    (8, "rt"),
rtds? rt_ds?

> @@ -303,6 +304,19 @@ libxl_domain_restore_params = Struct("domain_restore_params", [
>      ("checkpointed_stream", integer),
>      ])
>  
> +libxl_rt_vcpu = Struct("vcpu",[
> +    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
> +    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> +    ("index",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
> +    ])
> +
> +libxl_domain_sched_rt_params = Struct("domain_sched_rt_params",[
> +    ("vcpus",        Array(libxl_rt_vcpu, "num_vcpus")),
> +    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
> +    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> +    ("vcpu_index",   integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
> +    ])
> +
>  libxl_domain_sched_params = Struct("domain_sched_params",[
>      ("sched",        libxl_scheduler),
>      ("weight",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}),
> @@ -311,6 +325,7 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
>      ("slice",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}),
>      ("latency",      integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}),
>      ("extratime",    integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}),
> +    ("rt",           libxl_domain_sched_rt_params),
>      ])
And about this part, we discussed and agreed already in the other
thread. :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 2/4] libxc: add rt scheduler
  2014-08-24 22:58 ` [PATCH v1 2/4] libxc: add rt scheduler Meng Xu
@ 2014-09-05 10:34   ` Dario Faggioli
  2014-09-05 17:17     ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05 10:34 UTC (permalink / raw)
  To: Meng Xu
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	ian.jackson, xen-devel, xumengpanda, JBeulich, chaowang,
	lichong659, dgolomb


[-- Attachment #1.1: Type: text/plain, Size: 3333 bytes --]

On dom, 2014-08-24 at 18:58 -0400, Meng Xu wrote:

> --- /dev/null
> +++ b/tools/libxc/xc_rt.c
> @@ -0,0 +1,90 @@
> +/****************************************************************************
> + *
> + *        File: xc_rt.c
> + *      Author: Sisu Xi 
> + *              Meng Xu
> + *
> + * Description: XC Interface to the rt scheduler
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation;
> + * version 2.1 of the License.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
> + */
> +
> +#include "xc_private.h"
> +
> +int xc_sched_rt_domain_set(xc_interface *xch,
> +                           uint32_t domid,
> +                           struct xen_domctl_sched_rt_params *sdom)
> +{
> +    int rc;
> +    DECLARE_DOMCTL;
> +
> +    domctl.cmd = XEN_DOMCTL_scheduler_op;
> +    domctl.domain = (domid_t) domid;
> +    domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_RT_DS;
>
Aha! So, I was not dreaming about the whole RT_DS thing! :-D

Perhaps we had that discussion about the low level side of things, then.
Well, I think that, if the name of the scheduler is RT_DS, it
should be that for libxl and xl as well.

After all, the reason why we're calling it RT_DS in Xen is that we want to
be able to add other RT_FOO, RT_BAR algorithms/schedulers in the future. If
that happens, we'll need a way to reference them from the higher layers of
the toolstack as well. When we have RT_DS and RT_CBS in Xen, and just RT in
libxl, which one will the RT in libxl refer to?

So, just push the RT_DS thing all the way up to libxl and xl. As we said
for the Xen part, you can keep the source filenames _rt.c, but functions
and defines need to be specific.

So, for instance, this file can continue being xc_rt.c, but this
function needs to be called xc_sched_rtds_domain_set()
(or .._rt_ds_domain_..).
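
I.e., just the prototype (keeping the existing params struct name here,
purely for illustration):

    int xc_sched_rtds_domain_set(xc_interface *xch,
                                 uint32_t domid,
                                 struct xen_domctl_sched_rt_params *sdom);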

> +    domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_putinfo;
> +    domctl.u.scheduler_op.u.rt.vcpu_index = sdom->index;
> +    domctl.u.scheduler_op.u.rt.period = sdom->period;
> +    domctl.u.scheduler_op.u.rt.budget = sdom->budget;
> +
> +    rc = do_domctl(xch, &domctl);
> +
> +    return rc;
> +}
> +
These functions are going to change quite a bit in the next version, due to
the interface changes we agreed upon. It'd then be quite pointless to
add many more comments but, overall, the hcall wrapping, the bouncing
logic, and everything I can see here look fine to me.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v1 4/4] xl: introduce rt scheduler
  2014-09-05  9:40       ` Dario Faggioli
@ 2014-09-05 14:43         ` Meng Xu
  0 siblings, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-05 14:43 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb


[-- Attachment #1.1: Type: text/plain, Size: 2460 bytes --]

Hi Dario,


2014-09-05 5:40 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:

> On mer, 2014-09-03 at 18:28 -0400, Meng Xu wrote:
>
> > 2014-09-03 11:52 GMT-04:00 George Dunlap
> > <George.Dunlap@eu.citrix.com>:
>
> >         The main thing about this is that you probably want to have an
> >         interface where you can set the parameters for multiple vcpus
> >         at once;
> >         and that's probably the case regardless of whether we end up
> >         making
> >         the hypercall able to batch vcpus or do them one-at-a-time.
> >
> >
> > ​I'm totally fine with allowing users to set the parameters for
> > multiple vcpus at once.
> >
> >
> > My question is:
> > When users set multiple vcpus at once, how the command looks like?
> > For example, if they want to set two vcpus' parameters for dom 1's at
> > once (to be specific, set vcpu 0's period to 20 and budget to 10, and
> > set vcpu 1's period to 30 and budget to 20.),  they should use the
> > command like "xl sched-rt -d 1 -v 0 -p 20 -b 10 -v 1 -p 30 -b 20"?
> >
> So, since we agreed about only allowing, *for* *now*, per-domain
> parameter setting, this is not a requirement anymore, is it?
>
>
Right! I did some happy coding yesterday and removed the "array" parts
that were required to set/get each vcpu's parameters, and I should be able
to send a new patch today or tomorrow.



> I think we can just stick to a pretty basic implementation of `xl
> sched-rt', acting similarly to its credit, credit2 (and sedf)
> counterparts.
>
>
Right! Now the 'xl sched-rt' output is:
/* Note: Domain-0 has four vcpus; each vcpu has period 20000us and budget
8000us */

# xl sched-rt
Cpupool Pool-0: sched=EDF
Name                                ID    Period    Budget
Domain-0                             0     20000      8000

We can change a domain's parameters as follows:

# xl sched-rt -d Domain-0 -p 40000 -b 20000

/* This will set each vcpu of domain 0 to have period 40000us and budget
20000us */

Is this ok? Or maybe we can discuss this after I send out the simplified
version?



> How to allow an user to set the parameters for multiple vcpus at the
> same time from xl has become a Xen 4.6 issue. :-)
>

Sure! That makes sense and makes this not urgent. :-P

Meng



-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-05  9:36                 ` Dario Faggioli
@ 2014-09-05 15:06                   ` Meng Xu
  2014-09-05 15:09                     ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-05 15:06 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb

Hi Dario,

2014-09-05 5:36 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:
> On gio, 2014-09-04 at 11:30 -0400, Meng Xu wrote:
>>
>>
>> 2014-09-04 10:27 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:
>
>>         > > For instance, I can put, in an SMP guest, two real-time
>>         applications
>>         > > with different timing requirements, and pin each one to a
>>         different
>>         > > (v)cpu (I mean pin *inside* the guest). At this point, I'd
>>         like for each
>>         > > vcpu to have a set of RT scheduling parameters, at the Xen
>>         level, that
>>         > > matches the timing requirements of what's running inside.
>>         > >
>>         > > This may not look so typical in a server/cloud
>>         environment, but can
>>         > > happen (at least in my experience) in a mobile/embedded
>>         env.
>>         >
>>         > But to play devil's advocate for a minute here:
>>         >
>>
>>         Hehe, please, be my guest! :-D :-D
>>
>>         > couldn't you just put
>>         > them in two different single-vcpu VMs then?
>>         >
>>
>>
>>
>> Well, let me give a simpler example:
>> Suppose we have three tasks in one VM, each task has period 4ms and
>> budget 6ms (its utilization is 2/3).
>>
> You mean budget=4ms and period=6ms, don't you? :-)

Right. My mistake. Thank you for the correction! :-)

>
>>  If all these three tasks starts execution at the same time, we can
>> use two full-capacity vcpus (200% capacity cpu resource) to schedule
>> these three tasks.
>> However if you want to get two VMs, each of which has a full capacity
>> vcpu (100% capacity cpu), we cannot schedule these three tasks,
>> because one tasks cannot (well, at least very hard) migrate from one
>> VM to another.
>>
> But... In this case, in the former configuration (1 VM with 2 vcpus),
> each vcpu would (or at least can) have the same bandwidth of 100%, i.e.,
> the same parameters... or am I missing something?
>
> What we're trying to assess here, is the usefulness of the possibility of
> setting _different_ parameters (and hence different pcpu bandwidth) for
> each vcpu.

I see. I tried to use the simple example to show why it is not
always a good idea to spread programs across several single-vcpu VMs.
(This is also the reason why global scheduling is better than
partitioned scheduling in many cases.) The simple example I made is
not a good one to show the usefulness of the possibility of
setting _different_ parameters for each vcpu. The example you raised is
the good one. :-)

>
> Also, it looks like you're assuming to have a real-time scheduler inside
> the VM, which may or may not be the case.
>
>> This is just a simple example, we could of course have an example
>> like this but the vcpus are not full-capacity vcpus. :-)
>>
> Yeah, well, perhaps it's a bit too simple. :-D
>
> Don't get me wrong, I continue thinking per-vcpu params is something we
> really want, it's just the example I'm not sure I'm getting/liking.

Sorry for the confusion. My example aims for a different goal as I
explained above. :-P

>
> I still think the example of multiple, concurrent and strictly related
> activities having different timing requirements to be a really sensible
> one. In fact, in that case, especially if one does not have a real-time
> scheduler inside the guest, mapping those requirements on the Xen
> scheduler is the easier (only?) way to port the app from baremetal to
> virtual machine!

Right! I think this may be the easiest one, if they don't have a
real-time scheduler inside guest domains.

>
> PS. BTW, Meng, can you use plain text emails when sending to the list?

Ah. It seems that I have used non-plain-text emails for a long time and
they must have "tortured" you for a long time. I'm really sorry for that,
since plain text is a rule for the mailing list. :-( Thank you very much
for letting me know.
If this one is not the plain text email, please let me know. (I
checked it's plain text by sending email to myself, but just in case.
:-) )

Thank you again for your advice!

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-05 15:06                   ` Meng Xu
@ 2014-09-05 15:09                     ` Dario Faggioli
  0 siblings, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05 15:09 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



On ven, 2014-09-05 at 11:06 -0400, Meng Xu wrote:

> >> 2014-09-04 10:27 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:

> > PS. BTW, Meng, can you use plain text emails when sending to the list?
> 
> Ah. It seems that I have used non-plain-text emails for a long time and
> they must have "tortured" you for a long time. I'm really sorry for that,
> since plain text is a rule for the mailing list. :-( Thank you very much
> for letting me know.
>
No big deal, since you seem to have got it now.

> If this one is not the plain text email, please let me know. (I
> checked it's plain text by sending email to myself, but just in case.
> :-) )
> 
This one is perfect. :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-05 10:21   ` Dario Faggioli
@ 2014-09-05 15:45     ` Meng Xu
  2014-09-05 17:41       ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-05 15:45 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

Hi Dario,

2014-09-05 6:21 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:
> On dom, 2014-08-24 at 18:58 -0400, Meng Xu wrote:
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc, uint32_t domid,
>>      return 0;
>>  }
>>
>> +static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
>> +                               libxl_domain_sched_params *scinfo)
>> +{
>> +    struct xen_domctl_sched_rt_params* sdom;
>> +    uint16_t num_vcpus;
>> +    int rc, i;
>> +
>> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
>> +    if (rc != 0) {
>> +        LOGE(ERROR, "getting num_vcpus of domain sched rt");
>> +        return ERROR_FAIL;
>> +    }
>>
> As George pointed out already, you can get the num_vcpus via
> xc_domain_getinfo().
>
> I agree with George's review about the rest of this function
> (appropriately updated to take what we decided about the interface into
> account :-) ).
>
>> +#define SCHED_RT_VCPU_PERIOD_MAX    31536000000000 /* one year in microsecond*/
>> +#define SCHED_RT_VCPU_BUDGET_MAX    SCHED_RT_VCPU_PERIOD_MAX
>> +
> This may be me not remembering correctly the outcome of a preceding
> discussion... did we say we were going with _RT_ or with something like
> _RTDS_ ?
>
> ISTR the latter...

So I also need to change all RT to RT_DS in the tool stack, except
keeping the command 'xl sched-rt'? If so, I will do that.

>
> Also, macros like INT_MAX, UINT_MAX, etc., or << and ~ "tricks" are,
> IMO, preferrable to the open coding of the value.

What does open coding of the value mean? Do you mean (2^32-1) instead
of 4294967295?
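
(Illustration only, not part of the exchange: "open coding" means writing the
bare literal; the alternative being suggested is deriving it from named
quantities, e.g. something like

    #define SCHED_RT_VCPU_PERIOD_MAX  (365ULL * 24 * 60 * 60 * 1000000) /* one year in us */

which evaluates to the same 31536000000000.)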

>
> Finally, I wonder whether these would better live in some headers,
> closer to the declaration of period and budget (where their type is also
> visible) and, as a nice side effect of that, available to libxl callers
> as well.

I actually considered the possibility of adding it to
xen/include/public/domctl.h, but other schedulers do not have such
range macros in domctl.h, so I'm not sure whether it would cause an
inconsistency.

>
>> +/*
>> + * Sanity check of the scinfo parameters
>> + * return 0 if all values are valid
>> + * return 1 if one param is default value
>> + * return 2 if the target vcpu's index, period or budget is out of range
>> + */
>> +static int sched_rt_domain_set_validate_params(libxl__gc *gc,
>> +                                               const libxl_domain_sched_params *scinfo,
>> +                                               const uint16_t num_vcpus)
>> +{
>> +    int vcpu_index = scinfo->rt.vcpu_index;
>> +
> As per the low level interface (Xen and libxc) there should be no need
> for any vcpu_index anymore, right?
>
> I'm just double checking, as the discussion was --as it should, on these
> things-- long and involved :-D
>
>> +    if (vcpu_index < 0 || vcpu_index > num_vcpus)
>> +    {
>> +        LOG(ERROR, "VCPU index is not set or out of range, "
>> +                    "valid values are within range from 0 to %d", num_vcpus);
>> +        return 2;
>> +    }
>> +
>> +    if (scinfo->rt.period < 1 ||
>> +        scinfo->rt.period > SCHED_RT_VCPU_PERIOD_MAX)
>> +    {
>> +        LOG(ERROR, "VCPU period is not set or out of range, "
>> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_PERIOD_MAX);
>> +        return 2;
>> +    }
>> +
>> +    if (scinfo->rt.budget < 1 ||
>> +        scinfo->rt.budget > SCHED_RT_VCPU_BUDGET_MAX)
>> +    {
>> +        LOG(ERROR, "VCPU budget is not set or out of range, "
>> +                    "valid values are within range from 0 to %lu", SCHED_RT_VCPU_BUDGET_MAX);
>> +        return 2;
>> +    }
>> +
> I think some basics sanity checking are fine done here. E.g., you may
> also add a (budget <= period) check. However, as George said already:
>  - make sure you limit these to the kind of checks based on info the
>    toolstack should have, and defer the rest to the hypervisor
>  - if something goes wrong, make sure to always return libxl error codes
>    (typically, ERROR_INVAL), as sched_credit_domain_set() does.
>
> Oh and, just double checking again, I think we decided to handle
> (budget ==0 && period == 0) in a special way, so remember that! :-P
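
(For illustration, a minimal sketch of such a basic check, using only names
already in the quoted code and libxl's usual error convention:

    if ( scinfo->rt.period < 1 || scinfo->rt.budget < 1 ||
         scinfo->rt.budget > scinfo->rt.period )
    {
        LOG(ERROR, "budget must be in [1, period]");
        return ERROR_INVAL;
    }

anything subtler is left to the hypervisor, as suggested above.)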

Yes. Now I don't use any array in the toolstack or the hypervisor to
set/get a domain's parameters, so we don't use (budget == 0 && period == 0)
to indicate that a vcpu is unchanged. (Whenever a user sets a domain's
parameters, we set all of that domain's vcpus' parameters, so we don't
need such an implication for the 4.5 release. :-P) The array only comes
back when we allow users to set/get a specific vcpu's parameters and the
vcpus have different periods and budgets.

As to the 4.5 release: since we decided not to have the functionality of
setting/getting each vcpu's parameters, and the current toolstack
implementation is aimed at setting/getting each vcpu's parameters, the
code change could be large (but the resulting 4.5 toolstack code will be
very similar to the toolstack code of the existing schedulers). I think I
will send out a version soon to let you guys have a look. What do you
think?

>
> It's personal taste, I guess, but I think you don't really need an
> helper function for this, and it can leave in sched_rt_domain_set.

Right!

>
>> +static int sched_rt_domain_set(libxl__gc *gc, uint32_t domid,
>> +                               const libxl_domain_sched_params *scinfo)
>> +{
>> +    struct xen_domctl_sched_rt_params sdom;
>> +    uint16_t num_vcpus;
>> +    int rc;
>> +
>> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
>> +    if (rc != 0) {
>> +        LOGE(ERROR, "getting domain sched rt");
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    rc = sched_rt_domain_set_validate_params(gc, scinfo, num_vcpus);
>> +    if (rc == 2)
>> +        return ERROR_INVAL;
>> +    if (rc == 1)
>> +        return 0;
>> +    if (rc == 0)
>> +    {
>> +        sdom.index = scinfo->rt.vcpu_index;
>> +        sdom.period = scinfo->rt.period;
>> +        sdom.budget = scinfo->rt.budget;
>> +    }
>> +
> So, I see that you are actually returning libxl error codes, good. Well,
> again, just put the code above directly here, instead of having to
> define and then interpret an ad-hoc error handling logic. :-)

Removed the ad-hoc error code. :-)

>
>> @@ -5177,6 +5310,9 @@ int libxl_domain_sched_params_set(libxl_ctx *ctx, uint32_t domid,
>>      case LIBXL_SCHEDULER_ARINC653:
>>          ret=sched_arinc653_domain_set(gc, domid, scinfo);
>>          break;
>> +    case LIBXL_SCHEDULER_RT:
>> +        ret=sched_rt_domain_set(gc, domid, scinfo);
>> +        break;
> Again, I seriously think I remember we agreed upon _SCHEDULER_RTDS (or
> _RT_DS) as thee name of this thing. Am I wrong?
>

Right! We agreed on RT_DS. I changed the hypervisor code and didn't
realize I also needed to change the tool stack part. Will change it in
the next patch.

>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -153,6 +153,7 @@ libxl_scheduler = Enumeration("scheduler", [
>>      (5, "credit"),
>>      (6, "credit2"),
>>      (7, "arinc653"),
>> +    (8, "rt"),
> rtds? rt_ds?
>
>> @@ -303,6 +304,19 @@ libxl_domain_restore_params = Struct("domain_restore_params", [
>>      ("checkpointed_stream", integer),
>>      ])
>>
>> +libxl_rt_vcpu = Struct("vcpu",[
>> +    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
>> +    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
>> +    ("index",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
>> +    ])
>> +
>> +libxl_domain_sched_rt_params = Struct("domain_sched_rt_params",[
>> +    ("vcpus",        Array(libxl_rt_vcpu, "num_vcpus")),
>> +    ("period",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}),
>> +    ("budget",       uint64, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
>> +    ("vcpu_index",   integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_VCPU_INDEX_DEFAULT'}),
>> +    ])
>> +
>>  libxl_domain_sched_params = Struct("domain_sched_params",[
>>      ("sched",        libxl_scheduler),
>>      ("weight",       integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}),
>> @@ -311,6 +325,7 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
>>      ("slice",        integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}),
>>      ("latency",      integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}),
>>      ("extratime",    integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}),
>> +    ("rt",           libxl_domain_sched_rt_params),
>>      ])
> And about this part, we discussed and agreed already in the other
> thread. :-)
>
Yes. Modified for the next patch.

Thank you very much!

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 2/4] libxc: add rt scheduler
  2014-09-05 10:34   ` Dario Faggioli
@ 2014-09-05 17:17     ` Meng Xu
  2014-09-05 17:50       ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-05 17:17 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb

2014-09-05 6:34 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:
> On dom, 2014-08-24 at 18:58 -0400, Meng Xu wrote:
>
>> --- /dev/null
>> +++ b/tools/libxc/xc_rt.c
>> @@ -0,0 +1,90 @@
>> +/****************************************************************************
>> + *
>> + *        File: xc_rt.c
>> + *      Author: Sisu Xi
>> + *              Meng Xu
>> + *
>> + * Description: XC Interface to the rt scheduler
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation;
>> + * version 2.1 of the License.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
>> + */
>> +
>> +#include "xc_private.h"
>> +
>> +int xc_sched_rt_domain_set(xc_interface *xch,
>> +                           uint32_t domid,
>> +                           struct xen_domctl_sched_rt_params *sdom)
>> +{
>> +    int rc;
>> +    DECLARE_DOMCTL;
>> +
>> +    domctl.cmd = XEN_DOMCTL_scheduler_op;
>> +    domctl.domain = (domid_t) domid;
>> +    domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_RT_DS;
>>
> Aha! So, I was not dreaming about the whole RT_DS thing! :-D
>
> Perhaps then had that discussion about the low level side of things
> then. Well, I think that, if the name of the scheduler is RT_DS, it
> should be that for libxl and xl as well.
>
> After all, the reason why we're calling RT_DS in Xen, is that we want to
> be able to add others RT_FOO, RT_BAR algorithm/schedulers, in future. If
> that will happen, we'll need a way to reference them from the higher
> layer of the toolstack as well. When we'll have RT_DS and RT_CBS in Xen,
> and RT in libxl, to which one the RT in libxl will refer?
>
> So, just push the RT_DS thing all the way up to libxl and xl. As we said
> for the Xen part, you can keep the source filenames _rt.c, but functions
> and defines needs to be specific.
>
> So, for instance, this file can continue being xc_rt.c, but this
> function needs to be called xc_sched_rtds_domain_set()
> (or .._rt_ds_domain_..)

As to the function name, if I change it to ..._rt_ds_domain_... in the
xc_rt.c file, do I need to change it in the libxl and xl files as well? In
addition, do I need to change the names of the rt_* functions in
xen/common/sched_rt.c in the hypervisor? This seems like a lot of (well,
easy) change, but we need some consensus to avoid changing it back
later. :-P

>
>> +    domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_putinfo;
>> +    domctl.u.scheduler_op.u.rt.vcpu_index = sdom->index;
>> +    domctl.u.scheduler_op.u.rt.period = sdom->period;
>> +    domctl.u.scheduler_op.u.rt.budget = sdom->budget;
>> +
>> +    rc = do_domctl(xch, &domctl);
>> +
>> +    return rc;
>> +}
>> +
> These functions are going to change quite a bit in next version, due to
> the interface changes we agreed upon. It'd then be quite pointless to
> put much more comments, but, overall, the hcall wrapping, the bouncing
> logic, and everything I can see here looks fine to me.
>

I will send the next version soon.

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
                     ` (2 preceding siblings ...)
  2014-09-03 14:20   ` George Dunlap
@ 2014-09-05 17:17   ` Dario Faggioli
  2014-09-07  3:56     ` Meng Xu
  3 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05 17:17 UTC (permalink / raw)
  To: Meng Xu
  Cc: ian.campbell, xisisu, stefano.stabellini, george.dunlap,
	ian.jackson, xen-devel, xumengpanda, JBeulich, chaowang,
	lichong659, dgolomb



On dom, 2014-08-24 at 18:58 -0400, Meng Xu wrote:
> The mechanism of how to burn a VCPU's budget depends on the server mechanism
> implemented for each VCPU.
> 
> Server mechanism: a VCPU is implemented as a deferable server.
> When a VCPU is scheduled to execute on a PCPU, its budget is continuously
> burned.
> 
Right now, we only support one 'server mechanism', i.e., the one
introduced in this very patch. I appreciate that you're trying to
highlight the fact that the budget burning and replenishment logic is
flexible and easy to change or extend with another one but, as of now, I
fear this is more confusing than anything else.

So, I'd say, just kill the first two lines above, avoid mentioning
'Server mechanism:' and go ahead and describe how the budget is burned.
It will be when we add another policy that we'll explain the analogies
and the differences; for now it's rather pointless.

> Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
> At any scheduling point, the VCPU with earliest deadline has highest
> priority.
> 
Same here. Put this paragraph above, when you first mention EDF, as a
quick explanation of what EDF is (e.g., below the first line "This
scheduler follows the pre-emptive Global EDF theory in real-time
field."). Again, since for now we support only that priority scheme, no
point in hinting at other ones.

Oh, and one thing that may be good to have here, is a few words about
how the budget and period of a vcpu relate to its scheduling deadline.
In fact, as the changelog stands, one reads about a vcpu having a period
and a budget, and being scheduled according to its deadline, without any
indication of how it is given one! :-)

It's evident to you and me, because we know the algorithm very well, but
may be useful to someone looking at it for the first time and trying to
understand what's happening.
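
A minimal sketch of that relation (illustration only, using the patch's
field names but not its literal code, and assuming period/budget are kept
in the same unit NOW() returns): the deadline always sits at the end of the
vcpu's current period, and the budget is refilled whenever the deadline is
moved forward.

    static void rt_update_deadline_sketch(struct rt_vcpu *svc, s_time_t now)
    {
        if ( svc->cur_deadline > now )
            return;                           /* still inside the current period */
        do
            svc->cur_deadline += svc->period; /* deadline = end of current period */
        while ( svc->cur_deadline <= now );
        svc->cur_budget = svc->budget;        /* full budget for the new period */
    }

EDF then simply runs, among the runnable vcpus, the one with the smallest
cur_deadline.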

> Scheduling quantum: 1 ms;
> 
What does this mean? I mean, what effect does this have on the
algorithm, as it's described above? It's the granularity of budget
accounting, right? Either state it, or avoid mentioning the information
at all.

> This is still in the development phase.
> 
Ditto (in another message) about this.

> --- /dev/null
> +++ b/xen/common/sched_rt.c
> @@ -0,0 +1,1205 @@
> +/******************************************************************************
> + * Preemptive Global Earliest Deadline First  (EDF) scheduler for Xen
> + * EDF scheduling is one of most popular real-time scheduling algorithm used in
> + * embedded field.
> + *
Is it? :-)

It's true it has existed since 1973, and it's a very effective, efficient
and, despite its age, advanced one, but I'm not sure how broadly it is
adopted in practice.

Actually, that's what makes this very cool: we're going to be among the
few platforms that ship it! :-P

> +/*
> + * Design:
> + *
> + * This scheduler follows the Preemptive Global EDF theory in real-time field.
> + * Each VCPU can have a dedicated period and budget. 
> + * While scheduled, a VCPU burns its budget.
> + * A VCPU has its budget replenished at the beginning of each of its periods;
> + * The VCPU discards its unused budget at the end of each of its periods.
> + * If a VCPU runs out of budget in a period, it has to wait until next period.
> + * The mechanism of how to burn a VCPU's budget depends on the server mechanism
> + * implemented for each VCPU.
> + *
> + * Server mechanism: a VCPU is implemented as a deferable server.
> + * When a VCPU has a task running on it, its budget is continuously burned;
> + * When a VCPU has no task but with budget left, its budget is preserved.
> + *
> + * Priority scheme: Preemptive Global Earliest Deadline First (gEDF).
> + * At any scheduling point, the VCPU with earliest deadline has highest priority.
> + *
> + * Queue scheme: A global runqueue for each CPU pool. 
> + * The runqueue holds all runnable VCPUs. 
> + * VCPUs in the runqueue are divided into two parts: with and without remaining budget. 
> + * At each part, VCPUs are sorted based on EDF priority scheme.
> + *
> + * Scheduling quanta: 1 ms; but accounting the budget is in microsecond.
> + *
> + * Note: cpumask and cpupool is supported.
> + */
> +
Of course, all I said about the changelog, applies here too.

> +/*
> + * Locking:
> + * Just like credit2, a global system lock is used to protect the RunQ.
> + * The global lock is referenced by schedule_data.schedule_lock from all physical cpus.
> + *
Well, credit2 has one RunQ per socket (or so it should, bugs [that I'm
trying to fix] apart). In any case, why referencing it, just go ahead
describing your solution. :-)

> +#define RT_DEFAULT_PERIOD     (MICROSECS(10))
> +#define RT_DEFAULT_BUDGET     (MICROSECS(4))
> +
RT_DS? Maybe not so important for now, though, as it's entirely an
implementation detail.

> +/*
> + * Virtual CPU
> + */
> +struct rt_vcpu {
> +    struct list_head runq_elem; /* On the runqueue list */
> +    struct list_head sdom_elem; /* On the domain VCPU list */
> +
> +    /* Up-pointers */
> +    struct rt_dom *sdom;
> +    struct vcpu *vcpu;
> +
> +    /* VCPU parameters, in milliseconds */
> +    s_time_t period;
> +    s_time_t budget;
> +
> +    /* VCPU current infomation in nanosecond */
> +    long cur_budget;             /* current budget */
>
Perhaps we said this already, in which case, remind me why cur_budget is
a long and not an s_time_t as the other params? It's a time value...

> +/*
> + * Useful inline functions
> + */
> +static inline struct rt_private *RT_PRIV(const struct scheduler *_ops)
> +{
> +    return _ops->sched_data;
> +}
> +
> +static inline struct rt_vcpu *RT_VCPU(const struct vcpu *_vcpu)
> +{
> +    return _vcpu->sched_priv;
> +}
> +
> +static inline struct rt_dom *RT_DOM(const struct domain *_dom)
> +{
> +    return _dom->sched_priv;
> +}
> +
> +static inline struct list_head *RUNQ(const struct scheduler *_ops)
> +{
> +    return &RT_PRIV(_ops)->runq;
> +}
> +
I see what's happening, and I remember the suggestion of using static
inline-s, with which I agree. At that point, however, I'd have the
function names lower-cased.

It's probably not a big deal, and I don't think we have any official
saying about this, but it looks more consistent to me.

Oh and, also, since they're functions now, no need for the '_' in
arguments' names.

> +//#define RT_PRIV(_ops)     ((struct rt_private *)((_ops)->sched_data))
> +//#define RT_VCPU(_vcpu)    ((struct rt_vcpu *)(_vcpu)->sched_priv)
> +//#define RT_DOM(_dom)      ((struct rt_dom *)(_dom)->sched_priv)
> +//#define RUNQ(_ops)              (&RT_PRIV(_ops)->runq)
> +
These obviously need to go away. :-)

> +/*
> + * RunQueue helper functions
> + */
> +static int
> +__vcpu_on_runq(const struct rt_vcpu *svc)
> +{
> +   return !list_empty(&svc->runq_elem);
> +}
> +
> +static struct rt_vcpu *
> +__runq_elem(struct list_head *elem)
> +{
> +    return list_entry(elem, struct rt_vcpu, runq_elem);
> +}
> +
> +/*
> + * Debug related code, dump vcpu/cpu information
> + */
> +static void
> +rt_dump_vcpu(const struct rt_vcpu *svc)
> +{
> +    char cpustr[1024];
> +
> +    ASSERT(svc != NULL);
> +    /* flag vcpu */
> +    if( svc->sdom == NULL )
> +        return;
> +
> +    cpumask_scnprintf(cpustr, sizeof(cpustr), svc->vcpu->cpu_hard_affinity);
> +    printk("[%5d.%-2u] cpu %u, (%"PRId64", %"PRId64"), cur_b=%"PRId64" cur_d=%"PRId64" last_start=%"PRId64" onR=%d runnable=%d cpu_hard_affinity=%s ",
>
long line.

Also, we have PRI_stime (look in xen/include/xen/time.h).

> +            svc->vcpu->domain->domain_id,
> +            svc->vcpu->vcpu_id,
> +            svc->vcpu->processor,
> +            svc->period,
> +            svc->budget,
> +            svc->cur_budget,
> +            svc->cur_deadline,
> +            svc->last_start,
> +            __vcpu_on_runq(svc),
> +            vcpu_runnable(svc->vcpu),
> +            cpustr);
> +    memset(cpustr, 0, sizeof(cpustr));
> +    cpumask_scnprintf(cpustr, sizeof(cpustr), cpupool_scheduler_cpumask(svc->vcpu->domain->cpupool));
here too.

> +    printk("cpupool=%s\n", cpustr);
> +
> +    /* TRACE */
> +    {
> +        struct {
> +            unsigned dom:16,vcpu:16;
> +            unsigned processor;
> +            unsigned cur_budget_lo, cur_budget_hi, cur_deadline_lo, cur_deadline_hi;
> +            unsigned is_vcpu_on_runq:16,is_vcpu_runnable:16;
> +        } d;
> +        d.dom = svc->vcpu->domain->domain_id;
> +        d.vcpu = svc->vcpu->vcpu_id;
> +        d.processor = svc->vcpu->processor;
> +        d.cur_budget_lo = (unsigned) svc->cur_budget;
> +        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
> +        d.cur_deadline_lo = (unsigned) svc->cur_deadline;
> +        d.cur_deadline_hi = (unsigned) (svc->cur_deadline >> 32);
> +        d.is_vcpu_on_runq = __vcpu_on_runq(svc);
> +        d.is_vcpu_runnable = vcpu_runnable(svc->vcpu);
> +        trace_var(TRC_RT_VCPU_DUMP, 1,
> +                  sizeof(d),
> +                  (unsigned char *)&d);
> +    }
Not too big a deal, but is it really useful to trace vcpu dumps? How?

> +/*
> + * should not need lock here. only showing stuff 
> + */
> +static void
> +rt_dump(const struct scheduler *ops)
> +{
> +    struct list_head *iter_sdom, *iter_svc, *runq, *iter;
> +    struct rt_private *prv = RT_PRIV(ops);
> +    struct rt_vcpu *svc;
> +    unsigned int cpu = 0;
> +    unsigned int loop = 0;
> +
> +    printtime();
> +    printk("Priority Scheme: EDF\n");
> +
No value in printing this.

> +/*
> + * Insert a vcpu in the RunQ based on vcpus' deadline: 
> + * EDF schedule policy: vcpu with smaller deadline has higher priority;
> + * The vcpu svc to be inserted will be inserted just before the very first 
> + * vcpu iter_svc in the Runqueue whose deadline is equal or larger than 
> + * svc's deadline.
> + */
>
/*
 * Insert svc in the RunQ according to EDF: vcpus with smaller
 * deadlines go first.
 */

Or, if you want, something like (possibly turn into proper English :-P)
"a vcpu with a smaller deadline goes before one with an higher
deadline", but I'd avoid mentioning 'priority'.

Also, what follows that part is not adding much, and it's evident from
the code.

> +static void
> +__runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
> +    struct rt_private *prv = RT_PRIV(ops);
> +    struct list_head *runq = RUNQ(ops);
> +    struct list_head *iter;
> +    spinlock_t *schedule_lock;
> +    
> +    schedule_lock = per_cpu(schedule_data, svc->vcpu->processor).schedule_lock;
> +    ASSERT( spin_is_locked(schedule_lock) );
> +    
> +    /* Debug only */
> +    if ( __vcpu_on_runq(svc) )
> +    {
> +        rt_dump(ops);
> +    }
> +    ASSERT( !__vcpu_on_runq(svc) );
> +
The 'Debug only' comment and the subsequent if must be killed, of
course! the ASSERT is already offering all the necessary debugging
help. :-)

> +    /* svc still has budget */
> +    if ( svc->cur_budget > 0 ) 
> +    {
> +        list_for_each(iter, runq) 
> +        {
> +            struct rt_vcpu * iter_svc = __runq_elem(iter);
> +            if ( iter_svc->cur_budget == 0 ||
> +                 svc->cur_deadline <= iter_svc->cur_deadline )
> +                    break;
> +         }
> +        list_add_tail(&svc->runq_elem, iter);
> +     }
> +    else 
> +    { /* svc has no budget */
>
I'd say put the comment on its own line, but I think you can well kill
it.
> +        list_add(&svc->runq_elem, &prv->flag_vcpu->runq_elem);
>
Mmm... flag_vcpu, eh? I missed it above where it's declared, but I
really don't like the name. 'depleted_vcpus'? Or something else (sorry,
not very creative in this very moment :-))... but flag_vcpu, I found it
confusing.

> +/* 
> + * point per_cpu spinlock to the global system lock; all cpu have same global system lock 
long line.

> + */
> +static void *
> +rt_alloc_pdata(const struct scheduler *ops, int cpu)
> +{
> +    struct rt_private *prv = RT_PRIV(ops);
> +
> +    cpumask_set_cpu(cpu, &prv->cpus);
> +
> +    per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
> +
> +    printtime();
> +    printk("%s total cpus: %d", __func__, cpumask_weight(&prv->cpus));
> +    /* same as credit2, not a bogus pointer */
> +    return (void *)1;
>
Can you put in the comment the reason why it is ok to do this, without
referencing credit2?

> +static void *
> +rt_alloc_domdata(const struct scheduler *ops, struct domain *dom)
> +{
> +    unsigned long flags;
> +    struct rt_dom *sdom;
> +    struct rt_private * prv = RT_PRIV(ops);
> +
> +    printtime();
> +    printk("dom=%d\n", dom->domain_id);
> +
I don't think we want to spam the console at each allocation,
deallocation, initialization and destruction.

Put in a tracepoint, if you want this information to be available while
debugging, but please, avoid the printk.

> +    sdom = xzalloc(struct rt_dom);
> +    if ( sdom == NULL ) 
> +    {
> +        printk("%s, xzalloc failed\n", __func__);
> +        return NULL;
> +    }
> +
> +    INIT_LIST_HEAD(&sdom->vcpu);
> +    INIT_LIST_HEAD(&sdom->sdom_elem);
> +    sdom->dom = dom;
> +
> +    /* spinlock here to insert the dom */
> +    spin_lock_irqsave(&prv->lock, flags);
> +    list_add_tail(&sdom->sdom_elem, &(prv->sdom));
> +    spin_unlock_irqrestore(&prv->lock, flags);
> +
> +    return sdom;
> +}
> +
> +static void
> +rt_free_domdata(const struct scheduler *ops, void *data)
> +{
> +    unsigned long flags;
> +    struct rt_dom *sdom = data;
> +    struct rt_private *prv = RT_PRIV(ops);
> +
> +    printtime();
> +    printk("dom=%d\n", sdom->dom->domain_id);
> +
Ditto.

> +    spin_lock_irqsave(&prv->lock, flags);
> +    list_del_init(&sdom->sdom_elem);
> +    spin_unlock_irqrestore(&prv->lock, flags);
> +    xfree(data);
> +}
> +
> +static int
> +rt_dom_init(const struct scheduler *ops, struct domain *dom)
> +{
> +    struct rt_dom *sdom;
> +
> +    printtime();
> +    printk("dom=%d\n", dom->domain_id);
> +
And here.

> +    /* IDLE Domain does not link on rt_private */
> +    if ( is_idle_domain(dom) ) 
> +        return 0;
> +
> +    sdom = rt_alloc_domdata(ops, dom);
> +    if ( sdom == NULL ) 
> +    {
> +        printk("%s, failed\n", __func__);
> +        return -ENOMEM;
> +    }
> +    dom->sched_priv = sdom;
> +
> +    return 0;
> +}
> +
> +static void
> +rt_dom_destroy(const struct scheduler *ops, struct domain *dom)
> +{
> +    printtime();
> +    printk("dom=%d\n", dom->domain_id);
> +
And here too.

> +    rt_free_domdata(ops, RT_DOM(dom));
> +}
> +
> +static void *
> +rt_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
> +{
> +    struct rt_vcpu *svc;
> +    s_time_t now = NOW();
> +    long count;
> +
> +    /* Allocate per-VCPU info */
> +    svc = xzalloc(struct rt_vcpu);
> +    if ( svc == NULL ) 
> +    {
> +        printk("%s, xzalloc failed\n", __func__);
> +        return NULL;
> +    }
> +
> +    INIT_LIST_HEAD(&svc->runq_elem);
> +    INIT_LIST_HEAD(&svc->sdom_elem);
> +    svc->flags = 0U;
> +    svc->sdom = dd;
> +    svc->vcpu = vc;
> +    svc->last_start = 0;            /* init last_start is 0 */
> +
As it is, the comment is not very useful... we can see you're
initializing it to zero. If there's something about why you do it
that you think is worth mentioning, go ahead; otherwise, kill it.

> +    svc->period = RT_DEFAULT_PERIOD;
> +    if ( !is_idle_vcpu(vc) )
> +        svc->budget = RT_DEFAULT_BUDGET;
> +
> +    count = (now/MICROSECS(svc->period)) + 1;
> +    /* sync all VCPU's start time to 0 */
> +    svc->cur_deadline += count * MICROSECS(svc->period);
> +
why "+="? What value do you expect cur_deadline to hold at this time?

Also, unless I'm missing something, or doing the math wrong, what you
want to do here is place the deadline one period ahead of NOW().

In my head, this is just:

    svc->cur_deadline = NOW() + svc->period;

What is it that fiddling with count is buying you that I don't see?

About time units, can we do the conversions, once and for all, when the
parameters are assigned to the domain/vcpu and, here inside the core of
the scheduler, only deal with quantities homogeneous with NOW() (since
this scheduler needs to compare and add to what NOW() returns a lot)?

However fast it is these days, or will become in future hardware, to do
a multiplication, I don't see the point of playing all this *1000 and/or
MICROSECS() game down here (and even less in more hot paths).

> +    svc->cur_budget = svc->budget*1000; /* counting in microseconds level */
>
Another perfect example. :-)
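
To make the suggestion concrete (a sketch only, reusing names from the
quoted code): do the conversion once, where the parameters are assigned, so
that everything the scheduler core handles is already homogeneous with
NOW().

    /* e.g. at parameter-set time: */
    svc->period = MICROSECS(op->u.rt.period);   /* stored as s_time_t, like NOW() */
    svc->budget = MICROSECS(op->u.rt.budget);

    /* ...and the hot paths need no *1000 or MICROSECS() games at all: */
    svc->cur_deadline = now + svc->period;
    svc->cur_budget   = svc->budget;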

> +    /* Debug only: dump new vcpu's info */
> +    printtime();
> +    rt_dump_vcpu(svc);
> +
As said before, this has to go, quite possibly by means of replacing it
by some tracing.

> +    return svc;
> +}
> +
> +static void
> +rt_free_vdata(const struct scheduler *ops, void *priv)
> +{
> +    struct rt_vcpu *svc = priv;
> +
> +    /* Debug only: dump freed vcpu's info */
> +    printtime();
> +    rt_dump_vcpu(svc);
>
To be ditched.

> +    xfree(svc);
> +}
> +
> +/*
> + * TODO: Do we need to add vc to the new Runqueue?
> + * This function is called in sched_move_domain() in schedule.c
> + * When move a domain to a new cpupool, 
> + * may have to add vc to the Runqueue of the new cpupool
> + */
> +static void
> +rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
> +{
> +    struct rt_vcpu *svc = RT_VCPU(vc);
> +
> +    /* Debug only: dump info of vcpu to insert */
> +    printtime();
> +    rt_dump_vcpu(svc);
> +
Ditch.

> +    /* not addlocate idle vcpu to dom vcpu list */
> +    if ( is_idle_vcpu(vc) )
> +        return;
> +
> +    list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);   /* add to dom vcpu list */
>
Comment above, to prevent the line from becoming too long.

> +}
> +
> +/*
> + * TODO: same as rt_vcpu_insert()
> + */
> +static void
> +rt_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
> +{
> +    struct rt_vcpu * const svc = RT_VCPU(vc);
> +    struct rt_dom * const sdom = svc->sdom;
> +
> +    printtime();
> +    rt_dump_vcpu(svc);
> +
> +    BUG_ON( sdom == NULL );
> +    BUG_ON( __vcpu_on_runq(svc) );
> +
> +    if ( !is_idle_vcpu(vc) ) 
> +        list_del_init(&svc->sdom_elem);
> +}
> +
I remember commenting about these two functions already, during the RFC
phase. It does not look like they've changed much in response to that
commenting.

Can you please re-look at
http://lists.xen.org/archives/html/xen-devel/2014-07/msg01624.html , and
either reply/explain, or cope with that? In summary, what I was asking
was whether adding/removing the vcpu to the list of vcpus of a domain
could be done somewhere or somehow else, as doing it like this means,
for some code-path, removing and then re-adding it right away.

Looking more carefully, I do see that credit2 does the same thing but,
for one, that doesn't make it perfect (I still don't like it :-P), and
for two, credit2 does a bunch of other stuff there, that you don't.

Looking even more carefully, this seems to be related to the TODO you
put (which may be a new addition wrt RFCv2, I can't remember). So, yes,
I seriously think you should take care of the fact that, when this
function is called for moving a domain from a cpupool to another, you
need to move the vcpus from the old cpupool's runqueue to the new one's
runqueue.

Have you tested doing such operation (moving domains between cpupools)?
Because, with the code looking like this, it seems a bit unlikely...

Finally, the insert_vcpu hook is called from sched_init_vcpu() too,
still in schedule.c. And in fact, all the other schedulers use this
function to put a vcpu on its runqueue for the first time. Once you do
this, you get fine behaviour, cpupool-wise, almost for free. You
apparently don't. Why is that? If you'd go for a similar approach, you
will get the same benefit. Where is it that you insert a vcpu in the
RunQ for the first time? Again, why there and not here?

> +/* 
> + * Pick a valid CPU for the vcpu vc
> + * Valid CPU of a vcpu is intesection of vcpu's affinity and available cpus
>
long line again. Please, fix all of these, even the ones I may be
missing in this review.

> + */
> +static int
> +rt_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
> +{
> +    cpumask_t cpus;
> +    cpumask_t *online;
> +    int cpu;
> +    struct rt_private * prv = RT_PRIV(ops);
> +
> +    online = cpupool_scheduler_cpumask(vc->domain->cpupool);
> +    cpumask_and(&cpus, &prv->cpus, online);
> +    cpumask_and(&cpus, &cpus, vc->cpu_hard_affinity);
> +
> +    cpu = cpumask_test_cpu(vc->processor, &cpus)
> +            ? vc->processor 
> +            : cpumask_cycle(vc->processor, &cpus);
> +    ASSERT( !cpumask_empty(&cpus) && cpumask_test_cpu(cpu, &cpus) );
> +
> +    return cpu;
> +}
> +
> +/*
> + * Burn budget at microsecond level. 
> + */
> +static void
> +burn_budgets(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now) 
> +{
> +    s_time_t delta;
> +    long count = 0;
> +
> +    /* don't burn budget for idle VCPU */
> +    if ( is_idle_vcpu(svc->vcpu) ) 
> +    {
> +        return;
> +    }
> +
> +    /* first time called for this svc, update last_start */
> +    if ( svc->last_start == 0 ) 
> +    {
> +        svc->last_start = now;
> +        return;
> +    }
> +
Should not last_start be set to NOW() for the first time that the vcpu is
selected for execution, rather than the first time this function is
called on the vcpu itself?

> +    /*
> +     * update deadline info: When deadline is in the past,
> +     * it need to be updated to the deadline of the current period,
> +     * and replenish the budget 
> +     */
> +    delta = now - svc->cur_deadline;
> +    if ( delta >= 0 ) 
> +    {
> +        count = ( delta/MICROSECS(svc->period) ) + 1;
> +        svc->cur_deadline += count * MICROSECS(svc->period);
> +        svc->cur_budget = svc->budget * 1000;
> +
Ditto, about time units and conversions.

I also am sure I said before that I prefer an approach like:

    while ( svc->cur_deadline < now )
        svc->cur_deadline += svc->period;
    svc->cur_budget = svc->budget;

Not necessarily for speed reasons, but because it's a lot easier to read
and understand.

> +        /* TRACE */
> +        {
> +            struct {
> +                unsigned dom:16,vcpu:16;
> +                unsigned cur_budget_lo, cur_budget_hi;
> +            } d;
> +            d.dom = svc->vcpu->domain->domain_id;
> +            d.vcpu = svc->vcpu->vcpu_id;
> +            d.cur_budget_lo = (unsigned) svc->cur_budget;
> +            d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
> +            trace_var(TRC_RT_BUDGET_REPLENISH, 1,
> +                      sizeof(d),
> +                      (unsigned char *) &d);
> +        }
> +
> +        return;
> +    }
> +
> +    /* burn at nanoseconds level */
>
The function's doc comment says 'microseconds'. Please, agree with
yourself. :-D :-D

> +    delta = now - svc->last_start;
> +    /* 
> +     * delta < 0 only happens in nested virtualization;
> +     * TODO: how should we handle delta < 0 in a better way? */
>
Ah, nice to hear you tested this in nested virt! What config is that,
Xen on top of VMWare (I'm assuming this from what I've seen on the
rt-xen mailing list)?

BTW, comment style: '*/' goes to newline.

> +    if ( delta < 0 ) 
> +    {
> +        printk("%s, ATTENTION: now is behind last_start! delta = %ld for ",
> +                __func__, delta);
> +        rt_dump_vcpu(svc);
> +        svc->last_start = now;  /* update last_start */
> +        svc->cur_budget = 0;   /* FIXME: should we recover like this? */
> +        return;
> +    }
> +
Bha, this just should not happen, and if it does, it's either a bug or
a hardware problem, isn't it? I don't think you should do much more
than this. I'd remove the rt_dump_vcpu() but, in this case, I'm fine
with the printk(), as it's something that really should not happen, and
we want to inform the sysadmin that something has gone very bad. :-)

Long lines, BTW.

> +    if ( svc->cur_budget == 0 ) 
> +        return;
> +
> +    svc->cur_budget -= delta;
> +    if ( svc->cur_budget < 0 ) 
> +        svc->cur_budget = 0;
> +
I see you're not dealing with overruns (where by dealing I mean trying to
compensate). That's fine for now, let's leave this as a future
development.

> +/* 
> + * RunQ is sorted. Pick first one within cpumask. If no one, return NULL
> + * lock is grabbed before calling this function 
> + */
> +static struct rt_vcpu *
> +__runq_pick(const struct scheduler *ops, cpumask_t mask)
> +{
> +    struct list_head *runq = RUNQ(ops);
> +    struct list_head *iter;
> +    struct rt_vcpu *svc = NULL;
> +    struct rt_vcpu *iter_svc = NULL;
> +    cpumask_t cpu_common;
> +    cpumask_t *online;
> +    struct rt_private * prv = RT_PRIV(ops);
> +
> +    list_for_each(iter, runq) 
> +    {
> +        iter_svc = __runq_elem(iter);
> +
> +        /* flag vcpu */
> +        if(iter_svc->sdom == NULL)
> +            break;
> +
> +        /* mask is intersection of cpu_hard_affinity and cpupool and priv->cpus */
> +        online = cpupool_scheduler_cpumask(iter_svc->vcpu->domain->cpupool);
> +        cpumask_and(&cpu_common, online, &prv->cpus);
> +        cpumask_and(&cpu_common, &cpu_common, iter_svc->vcpu->cpu_hard_affinity);
> +        cpumask_and(&cpu_common, &mask, &cpu_common);
> +        if ( cpumask_empty(&cpu_common) )
> +            continue;
> +
I remember asking this in
http://lists.xen.org/archives/html/xen-devel/2014-07/msg01624.html:

"What's in priv->cpus, BTW? How is it different form the cpupool online
mask (returned in 'online' by cpupool_scheduler_cpumask() )?"

while I don't remember having received an answer, and I see it's still
here in the code. Am I missing something? If not, can you explain?

> +/*
> + * Update vcpu's budget and sort runq by insert the modifed vcpu back to runq
> + * lock is grabbed before calling this function 
> + */
> +static void
> +__repl_update(const struct scheduler *ops, s_time_t now)
> +{
> +    struct list_head *runq = RUNQ(ops);
> +    struct list_head *iter;
> +    struct list_head *tmp;
> +    struct rt_vcpu *svc = NULL;
> +
> +    s_time_t diff;
> +    long count;
> +
> +    list_for_each_safe(iter, tmp, runq) 
> +    {
> +        svc = __runq_elem(iter);
> +
> +        /* not update flag_vcpu's budget */
> +        if(svc->sdom == NULL)
> +            continue;
> +
> +        diff = now - svc->cur_deadline;
> +        if ( diff > 0 ) 
> +        {
> +            count = (diff/MICROSECS(svc->period)) + 1;
> +            svc->cur_deadline += count * MICROSECS(svc->period);
> +            svc->cur_budget = svc->budget * 1000;
> +            __runq_remove(svc);
> +            __runq_insert(ops, svc);
> +        }
> +    }
> +}
> +
Ok. As said when reviewing the RFC for this, I see a lot of room for
optimization here (event based replenishments, using timers). However,
I'd be fine with such optimizations happening later on, e.g., if this
makes it into 4.5, in the 4.6 cycle.

All I said before about time units and convertions applies here too.
Actually, as I also already said in
http://lists.xen.org/archives/html/xen-devel/2014-07/msg01624.html ,
I think you can wrap this functionality in a helper function, and call
that from here and from all the other places that right now do the same
things.
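
A sketch of that consolidation, reusing the helper sketched earlier in this
mail (rt_update_deadline_sketch() is an assumed name, not code from the
patch):

    /* in burn_budgets(), __repl_update() and rt_vcpu_wake(), instead of
     * repeating the count/MICROSECS() arithmetic: */
    rt_update_deadline_sketch(svc, now);

__repl_update() would then still do its __runq_remove()/__runq_insert()
pair, so the RunQ stays sorted on the new deadlines.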

> +/* 
> + * schedule function for rt scheduler.
> + * The lock is already grabbed in schedule.c, no need to lock here 
> + */
> +static struct task_slice
> +rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled)
> +{
> +    const int cpu = smp_processor_id();
> +    struct rt_private * prv = RT_PRIV(ops);
> +    struct rt_vcpu * const scurr = RT_VCPU(current);
> +    struct rt_vcpu * snext = NULL;
> +    struct task_slice ret = { .migrated = 0 };
> +
> +    /* clear ticked bit now that we've been scheduled */
                tickled

> +/*
> + * Pick a vcpu on a cpu to kick out to place the running candidate
> + * Called by wake() and context_saved()
> + * We have a running candidate here, the kick logic is:
> + * Among all the cpus that are within the cpu affinity
> + * 1) if the new->cpu is idle, kick it. This could benefit cache hit
> + * 2) if there are any idle vcpu, kick it.
> + * 3) now all pcpus are busy, among all the running vcpus, pick lowest priority one
> + *    if snext has higher priority, kick it.
> + *
> + * TODO:
> + * 1) what if these two vcpus belongs to the same domain?
> + *    replace a vcpu belonging to the same domain introduces more overhead
> + *
> + * lock is grabbed before calling this function 
> + */
> +static void
> +runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
> +{
> +    struct rt_private * prv = RT_PRIV(ops);
> +    struct rt_vcpu * latest_deadline_vcpu = NULL;    /* lowest priority scheduled */
> +    struct rt_vcpu * iter_svc;
> +    struct vcpu * iter_vc;
> +    int cpu = 0, cpu_to_tickle = 0;
> +    cpumask_t not_tickled;
> +    cpumask_t *online;
> +
> +    if ( new == NULL || is_idle_vcpu(new->vcpu) ) 
> +        return;
> +
> +    online = cpupool_scheduler_cpumask(new->vcpu->domain->cpupool);
> +    cpumask_and(&not_tickled, online, &prv->cpus);
> +    cpumask_and(&not_tickled, &not_tickled, new->vcpu->cpu_hard_affinity);
> +    cpumask_andnot(&not_tickled, &not_tickled, &prv->tickled);
> +
What I said about prv->cpus applies here too...

> +/* 
> + * Should always wake up runnable vcpu, put it back to RunQ. 
> + * Check priority to raise interrupt 
> + * The lock is already grabbed in schedule.c, no need to lock here 
> + * TODO: what if these two vcpus belongs to the same domain? 
> + */
> +static void
> +rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
> +{
> +    struct rt_vcpu * const svc = RT_VCPU(vc);
> +    s_time_t diff;
> +    s_time_t now = NOW();
> +    long count = 0;
> +    struct rt_private * prv = RT_PRIV(ops);
> +    struct rt_vcpu * snext = NULL;        /* highest priority on RunQ */
> +
> +    BUG_ON( is_idle_vcpu(vc) );
> +
> +    if ( unlikely(curr_on_cpu(vc->processor) == vc) ) 
> +        return;
> +
> +    /* on RunQ, just update info is ok */
> +    if ( unlikely(__vcpu_on_runq(svc)) ) 
> +        return;
> +
> +    /* If context hasn't been saved for this vcpu yet, we can't put it on
> +     * the Runqueue. Instead, we set a flag so that it will be put on the Runqueue
> +     * After the context has been saved. */
> +    if ( unlikely(test_bit(__RT_scheduled, &svc->flags)) ) 
> +    {
> +        set_bit(__RT_delayed_runq_add, &svc->flags);
> +        return;
> +    }
> +
> +    /* update deadline info */
> +    diff = now - svc->cur_deadline;
> +    if ( diff >= 0 ) 
> +    {
> +        count = ( diff/MICROSECS(svc->period) ) + 1;
> +        svc->cur_deadline += count * MICROSECS(svc->period);
> +        svc->cur_budget = svc->budget * 1000;
> +    }
> +
Time units and helper function.

> +    __runq_insert(ops, svc);
> +    __repl_update(ops, now);
> +    snext = __runq_pick(ops, prv->cpus);    /* pick snext from ALL valid cpus */
> +    runq_tickle(ops, snext);
> +
> +    return;
> +}

> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -69,6 +69,7 @@ static const struct scheduler *schedulers[] = {
>      &sched_credit_def,
>      &sched_credit2_def,
>      &sched_arinc653_def,
> +    &sched_rt_def,
>
rt_ds?

So, overall, I think this is in a reasonably good state. There are some
issues, but mostly not critical stuff, easy to solve in another round or
two.

One thing I feel like asking is, this time, if you can make sure to
address all the points being raised during the review... I know it's
easy to miss/forget things, especially in long and complex pieces of
code, like in this patch, but please, try not to. :-)

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 3/4] libxl: add rt scheduler
  2014-09-05 15:45     ` Meng Xu
@ 2014-09-05 17:41       ` Dario Faggioli
  0 siblings, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05 17:41 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb



On ven, 2014-09-05 at 11:45 -0400, Meng Xu wrote:
> 2014-09-05 6:21 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:
> > On dom, 2014-08-24 at 18:58 -0400, Meng Xu wrote:
> >> --- a/tools/libxl/libxl.c
> >> +++ b/tools/libxl/libxl.c
> >> @@ -5154,6 +5154,139 @@ static int sched_sedf_domain_set(libxl__gc *gc, uint32_t domid,
> >>      return 0;
> >>  }
> >>
> >> +static int sched_rt_domain_get(libxl__gc *gc, uint32_t domid,
> >> +                               libxl_domain_sched_params *scinfo)
> >> +{
> >> +    struct xen_domctl_sched_rt_params* sdom;
> >> +    uint16_t num_vcpus;
> >> +    int rc, i;
> >> +
> >> +    rc = xc_sched_rt_domain_get_num_vcpus(CTX->xch, domid, &num_vcpus);
> >> +    if (rc != 0) {
> >> +        LOGE(ERROR, "getting num_vcpus of domain sched rt");
> >> +        return ERROR_FAIL;
> >> +    }
> >>
> > As George pointed out already, you can get the num_vcpus via
> > xc_domain_getinfo().
> >
> > I agree with George's review about the rest of this function
> > (appropriately updated to take what we decided about the interface into
> > account :-) ).
> >
> >> +#define SCHED_RT_VCPU_PERIOD_MAX    31536000000000 /* one year in microsecond*/
> >> +#define SCHED_RT_VCPU_BUDGET_MAX    SCHED_RT_VCPU_PERIOD_MAX
> >> +
> > This may be me not remembering correctly the outcome of a preceding
> > discussion... did we say we were going with _RT_ or with something like
> > _RTDS_ ?
> >
> > ISTR the latter...
> 
> So I also need to change all RT to RT_DS in the tool stack, except
> keeping the command 'xl sched-rt'? If so, I will do that.
> 
As I said in another email, the same reasons that called for renaming
inside Xen apply to the other layers too, I think.

I'm not sure about `xl sched-rt', but yes, I think that would be fine
like that. For one, xl interface/commands are not set in stone.
Moreover, when/if we have more policies, I think it could be fine to
add a new 'policy' parameter to `xl sched-rt', rather than adding
another `xl sched-xxx' thing.

But then again, we can decide this later.

> >
> > Also, macros like INT_MAX, UINT_MAX, etc., or << and ~ "tricks" are,
> > IMO, preferrable to the open coding of the value.
> 
> What does open coding of the value mean? Do you mean (2^32-1) instead
> of 4294967295?
> 
I mean use INT_MAX, UINT_MAX, or similar, if possible. :-)
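
Something like this is all I mean (a totally made-up example, just to
show the idea):

    #define FOO_MAX    UINT_MAX          /* rather than 4294967295 */
    #define BAR_LIMIT  (~(1ULL << 63))   /* rather than spelling out the decimal */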

> >
> > Finally, I wonder whether these would better live in some headers,
> > closer to the declaration of period and budget (where their type is also
> > visible) and, as a nice side effect of that, available to libxl callers
> > as well.
> 
> I actually considered the possibility of adding it to
> xen/include/public/domctl.h, but other schedulers do not have such
> range macro in the domctl.h, so I'm not sure if it will cause
> inconsistence?
> 
This is toolstack stuff, libxl stuff, to be more precise, so the public
header I'm mentioning is something like tools/libxl/libxl*.h.

> Yes. Now I didn't use any array in toolstack or kernel to set/get
> domain's parameters. So we don't use (budget == 0 && period == 0) to
> indicate the vcpus not changed. (Whenever user set a domain's vcpu's
> parameters, we set all vcpus' parameters of this domain, so we don't
> need such implication for 4.5 release. :-P) The array comes "only"
> when we allow users to set/get a specific vcpu's parameters and vcpus
> have different periods and budgets.
> 
True. I was thinking of something different, but hey, let's see the new
version with the new interface implemented and comment on that! :-)

regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 2/4] libxc: add rt scheduler
  2014-09-05 17:17     ` Meng Xu
@ 2014-09-05 17:50       ` Dario Faggioli
  0 siblings, 0 replies; 72+ messages in thread
From: Dario Faggioli @ 2014-09-05 17:50 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Meng Xu, Jan Beulich, Chao Wang,
	Chong Li, Dagaen Golomb



On ven, 2014-09-05 at 13:17 -0400, Meng Xu wrote:
> 2014-09-05 6:34 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:

> > So, for instance, this file can continue being xc_rt.c, but this
> > function needs to be called xc_sched_rtds_domain_set()
> > (or .._rt_ds_domain_..)
> 
> As to the function name, if I change it to ..._rt_ds_domain_... in
> xc_rt.c file, do I need to change it in libxl and xl files? In
> addition, do I need to change the name rt_* functions in
> xen/common/sched_rt.c in the hypervisor? This seems a lot of (well
> easy) change, but we need some consensus to avoid changing it back
> later. :-P
>
I see your point, and I'm actually not so sure... Would appreciate
George's and others' view here.

IIRC, we decided for RT_DS for two reasons:
 * make it more clear what kind of real-time scheduler this is, among
   all the existing real-time schedulers
 * make it possible to add more and different RT_xxx schedulers without
   having to rename this one and/or cause confusion

I think both these arguments are valid for whatever component of the Xen
architecture we are dealing with. We decided to keep the filename
sched_rt.c (and xc_rt.c, etc), because it is at least quite likely that
these other real-time scheduling policies, if ever implemented, could
all live there.

About the function names, I would apply the same approach. Most of them,
I think, should be named after the scheduler they deal with. Some of
them may be general enough to just be called bla_rt_foo(), but it's
really hard to judge now.

However, since this also is about interfaces, I'd really appreciate
seeing what others think.

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-05 17:17   ` Dario Faggioli
@ 2014-09-07  3:56     ` Meng Xu
  2014-09-08 10:33       ` Dario Faggioli
  0 siblings, 1 reply; 72+ messages in thread
From: Meng Xu @ 2014-09-07  3:56 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb

Hi Dario,

I have modified the code based on your comments in the previous email.
Since there are a lot of points to modify, if I don't reply to one
point, it means I agree with your comments and have changed it. :-)

>
>> +/*
>> + * Useful inline functions
>> + */
>> +static inline struct rt_private *RT_PRIV(const struct scheduler *_ops)
>> +{
>> +    return _ops->sched_data;
>> +}
>> +
>> +static inline struct rt_vcpu *RT_VCPU(const struct vcpu *_vcpu)
>> +{
>> +    return _vcpu->sched_priv;
>> +}
>> +
>> +static inline struct rt_dom *RT_DOM(const struct domain *_dom)
>> +{
>> +    return _dom->sched_priv;
>> +}
>> +
>> +static inline struct list_head *RUNQ(const struct scheduler *_ops)
>> +{
>> +    return &RT_PRIV(_ops)->runq;
>> +}
>> +
> I see what's happening, and I remember the suggestion of using static
> inline-s, with which I agree. At that point, however, I'd have the
> function names lower case-ed.
>
> It's probably not a big deal, and I don't think we have any official
> saying about this, but it looks more consistent to me.

Because other schedulers use the upper-case names (because they are
macros instead of static inlines), in order to keep the style
consistent with other schedulers, I think I will keep the names in
upper-case. I can send another patch set to change all other
schedulers' macro definitions of these functions to static inlines,
and then change the names to lower-case. What do you think?

>> +/*
>> + * Debug related code, dump vcpu/cpu information
>> + */
>> +static void
>> +rt_dump_vcpu(const struct rt_vcpu *svc)
>> +{
>> +    char cpustr[1024];
>> +
>> +    ASSERT(svc != NULL);
>> +    /* flag vcpu */
>> +    if( svc->sdom == NULL )
>> +        return;
>> +
>> +    cpumask_scnprintf(cpustr, sizeof(cpustr), svc->vcpu->cpu_hard_affinity);
>> +    printk("[%5d.%-2u] cpu %u, (%"PRId64", %"PRId64"), cur_b=%"PRId64" cur_d=%"PRId64" last_start=%"PRId64" onR=%d runnable=%d cpu_hard_affinity=%s ",
>>
> long line.
>
> Also, we have PRI_stime (look in xen/include/xen/time.h).
>
>> +            svc->vcpu->domain->domain_id,
>> +            svc->vcpu->vcpu_id,
>> +            svc->vcpu->processor,
>> +            svc->period,
>> +            svc->budget,
>> +            svc->cur_budget,
>> +            svc->cur_deadline,
>> +            svc->last_start,
>> +            __vcpu_on_runq(svc),
>> +            vcpu_runnable(svc->vcpu),
>> +            cpustr);
>> +    memset(cpustr, 0, sizeof(cpustr));
>> +    cpumask_scnprintf(cpustr, sizeof(cpustr), cpupool_scheduler_cpumask(svc->vcpu->domain->cpupool));
> here too.
>
>> +    printk("cpupool=%s\n", cpustr);
>> +
>> +    /* TRACE */
>> +    {
>> +        struct {
>> +            unsigned dom:16,vcpu:16;
>> +            unsigned processor;
>> +            unsigned cur_budget_lo, cur_budget_hi, cur_deadline_lo, cur_deadline_hi;
>> +            unsigned is_vcpu_on_runq:16,is_vcpu_runnable:16;
>> +        } d;
>> +        d.dom = svc->vcpu->domain->domain_id;
>> +        d.vcpu = svc->vcpu->vcpu_id;
>> +        d.processor = svc->vcpu->processor;
>> +        d.cur_budget_lo = (unsigned) svc->cur_budget;
>> +        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
>> +        d.cur_deadline_lo = (unsigned) svc->cur_deadline;
>> +        d.cur_deadline_hi = (unsigned) (svc->cur_deadline >> 32);
>> +        d.is_vcpu_on_runq = __vcpu_on_runq(svc);
>> +        d.is_vcpu_runnable = vcpu_runnable(svc->vcpu);
>> +        trace_var(TRC_RT_VCPU_DUMP, 1,
>> +                  sizeof(d),
>> +                  (unsigned char *)&d);
>> +    }
> Not a too big deal, but is it really useful to trace vcpu dumps? How?

Actually, this function is called in other functions in this file.
Whenever we want to look into the details of a vcpu (for example, when
we insert a vcpu into the RunQ), we call this function and it will trace
the vcpu information. I extracted this as a function to avoid writing
this trace again and again in all of those functions, such as
rt_alloc_vdata, rt_free_vdata, burn_budgets.

In addition, we use xl sched-rt to dump the RunQ information to check
that the RunQ is properly sorted, for debugging reasons.

When it is upstreamed, I will change this function to static inline,
remove the printk in this function and leave the trace there.

>> +        list_add(&svc->runq_elem, &prv->flag_vcpu->runq_elem);
>>
> Mmm... flag_vcpu, eh? I missed it above where it's declared, but I
> really don't like the name. 'depleted_vcpus'? Or something else (sorry,
> not very creative in this very moment :-))... but flag_vcpu, I found it
> confusing.

It's declared in struct rt_private and initialized in the rt_init
function.

The reason why I need this vcpu is:
The RunQ has two parts: the first part holds the vcpus with budget and
is sorted based on the priority of the vcpus. The second part holds the
vcpus without budget and is not sorted (because we won't schedule a
depleted vcpu, we don't need to sort them). The flag_vcpu is the dummy
vcpu that separates the two parts, so that we don't have to scan the
whole RunQ when we insert a vcpu without budget into the RunQ.
(If you remember, you suggested splitting the RunQ into two queues:
one keeps the vcpus with budget and one keeps the vcpus without budget;
this is my implementation of your suggestion. It virtually splits the
RunQ. If I had a separate depleted queue, I would need to add some more
functions, like depleted_queue_insert, etc., which adds much more code
than this current implementation. :-P)
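
To make it concrete, the insertion logic is roughly like this (a
simplified sketch, not the exact code in the patch):

    /* 'runq' is prv->runq; the flag vcpu is the only element with
     * sdom == NULL */
    if ( svc->cur_budget <= 0 )
    {
        /* depleted: park it right after the flag vcpu, no sorting needed */
        list_add(&svc->runq_elem, &prv->flag_vcpu->runq_elem);
    }
    else
    {
        struct list_head *iter;

        /* with budget: walk only the first part (it ends at the flag
         * vcpu) and insert in EDF order */
        list_for_each ( iter, runq )
        {
            struct rt_vcpu *iter_svc = __runq_elem(iter);

            if ( iter_svc->sdom == NULL ||
                 svc->cur_deadline < iter_svc->cur_deadline )
                break;
        }
        list_add_tail(&svc->runq_elem, iter);
    }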

>
>> +/*
>> + * point per_cpu spinlock to the global system lock; all cpu have same global system lock
> long line.
>
>> + */
>> +static void *
>> +rt_alloc_pdata(const struct scheduler *ops, int cpu)
>> +{
>> +    struct rt_private *prv = RT_PRIV(ops);
>> +
>> +    cpumask_set_cpu(cpu, &prv->cpus);
>> +
>> +    per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
>> +
>> +    printtime();
>> +    printk("%s total cpus: %d", __func__, cpumask_weight(&prv->cpus));
>> +    /* same as credit2, not a bogus pointer */
>> +    return (void *)1;
>>
> Can you put in the comment the reason why it is ok to do this, without
> referencing credit2?

Explained in the comment in the next patch. In schedule.c, they use
the return value to check if this function correctly allocated the
pdata, by checking if it returns 1. (Well, this is inconsistent with
other code, which uses 0 to indicate success.) The function definition
needs to return a void *, so we have to cast the 1 to void *. That's
the reason. :-P

>> +static void *
>> +rt_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
>> +{
>> +    struct rt_vcpu *svc;
>> +    s_time_t now = NOW();
>> +    long count;
>> +
>> +    /* Allocate per-VCPU info */
>> +    svc = xzalloc(struct rt_vcpu);
>> +    if ( svc == NULL )
>> +    {
>> +        printk("%s, xzalloc failed\n", __func__);
>> +        return NULL;
>> +    }
>> +
>> +    INIT_LIST_HEAD(&svc->runq_elem);
>> +    INIT_LIST_HEAD(&svc->sdom_elem);
>> +    svc->flags = 0U;
>> +    svc->sdom = dd;
>> +    svc->vcpu = vc;
>> +    svc->last_start = 0;            /* init last_start is 0 */
>> +
> As it is, the comment is not very useful... we can see you're
> initializing it to zero. If there's something about the why you do it
> that you think it's worth mentioning, go ahead, otherwise, kill it.
>
>> +    svc->period = RT_DEFAULT_PERIOD;
>> +    if ( !is_idle_vcpu(vc) )
>> +        svc->budget = RT_DEFAULT_BUDGET;
>> +
>> +    count = (now/MICROSECS(svc->period)) + 1;
>> +    /* sync all VCPU's start time to 0 */
>> +    svc->cur_deadline += count * MICROSECS(svc->period);
>> +
> why "+="? What value do you expect cur_deadline to hold at this time?

It should be =. This does not cause a bug, because it's in
rt_alloc_vdata() and svc->cur_deadline is 0 there. But "+=" is not
correct logically. I modified it. Thank you!

>
> Also, unless I'm missing something, or doing the math wrong, what you
> want to do here is place the deadline one period ahead of NOW().

Ah, no. I think you are thinking about the CBS server mechanism. (Now
I start to use some notation from the real-time academic field.) For
the deferrable server, if the release offset of an implicit-deadline
deferrable vcpu is O_i, its deadlines will be O_i + p_i * k, where k is
a natural number. So we want to set the deadline to the end of the
period in which NOW() falls.

>
> In my head, this is just:
>
>     svc->cur_deadline = NOW() + svc->period;
>
> What is it that fiddling with count is buying you that I don't see?

So this calculation is incorrect. :-P

>
>> +    /* Debug only: dump new vcpu's info */
>> +    printtime();
>> +    rt_dump_vcpu(svc);
>> +
> As said before, this has to go, quite possibly by means of replacing it
> by some tracing.

When no more other comments to this patch set, I will make
rt_dump_vcpu as a static inline function and it only has the tracing
inside. Is that OK? Now I just want to use some dump to help me with
some quick debug. :-)

>> +/*
>> + * TODO: same as rt_vcpu_insert()
>> + */
>> +static void
>> +rt_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
>> +{
>> +    struct rt_vcpu * const svc = RT_VCPU(vc);
>> +    struct rt_dom * const sdom = svc->sdom;
>> +
>> +    printtime();
>> +    rt_dump_vcpu(svc);
>> +
>> +    BUG_ON( sdom == NULL );
>> +    BUG_ON( __vcpu_on_runq(svc) );
>> +
>> +    if ( !is_idle_vcpu(vc) )
>> +        list_del_init(&svc->sdom_elem);
>> +}
>> +
> I remember commenting about these two functions already, during RFC
> phase. It does not look like they've changed much, in response to that
> commenting.
>
> Can you please re-look at
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01624.html , and
> either reply/explain, or cope with that? In summary, what I was asking
> was whether adding/removing the vcpu to the list of vcpus of a domain
> could be done somewhere or somehow else, as doing it like this means,
> for some code-path, removing and then re-adding it right away.
>
> Looking more carefully, I do see that credit2 does the same thing but,
> for one, that doesn't make it perfect (I still don't like it :-P), and
> for two, credit2 does a bunch of other stuff there, that you don't.
>
> Looking even more carefully, this seems to be related to the TODO you
> put (which may be a new addition wrt RFCv2, I can't remember). So, yes,
> I seriously think you should take care of the fact that, when this
> function is called for moving a domain from a cpupool to another, you
> need to move the vcpus from the old cpupool's runqueue to the new one's
> runqueue.
>
> Have you tested doing such operation (moving domains between cpupools)?
> Because, with the code looking like this, it seems a bit unlikely...

Hmm, I actually did test this operation. I migrated a dom from Pool-0
to the newly created cpupool 'test' with the credit scheduler. That's
also the scenario I showed in the cover page of this patch set. :-)
But I didn't try to migrate the domU back to Pool-0 and check if its
vcpus are inserted back into the RunQ.

Now I have added the code for removing the vcpu from the RunQ, and
tested that it is removed from the RunQ when I move the domU to the new
cpupool.
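
In code, the change is basically this (a sketch of the v2 direction,
not the final code):

    static void
    rt_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
    {
        struct rt_vcpu * const svc = RT_VCPU(vc);

        /* new: take the vcpu off the (old pool's) RunQ, if it is there */
        if ( __vcpu_on_runq(svc) )
            __runq_remove(svc);

        if ( !is_idle_vcpu(vc) )
            list_del_init(&svc->sdom_elem);
    }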

As to removing the vcpu from the domain's vcpu list: I'm not operating
on the vcpu list of the structure domain{}, which is used by other
components in Xen. Actually, we have a scheduler-specific private
structure rt_vcpu, which uses its field runq_elem to link the vcpu to
the RunQ of the scheduler; we also have a scheduler-specific private
structure rt_dom to record the scheduler-specific data for the domain
(it's defined in sched_rt.c). Inside rt_dom, there is a vcpu list to
record the scheduler-specific vcpus of this domain. When we move a
domU from one cpupool to another cpupool (say, with the rt scheduler),
the domU's scheduler-specific vcpus should be moved to the destination
cpupool's RunQ. So we need to remove the scheduler-specific vcpu from
the source cpupool's RunQ by using this rt_vcpu_remove() and then
insert it into the dest. cpupool's RunQ by using rt_vcpu_insert().

Does this make sense to you?

>
> Finally, the insert_vcpu hook is called from sched_init_vcpu() too,
> still in schedule.c. And in fact, all the other scheduler use this
> function to put a vcpu on its runqueue for the first time. Once you do
> this, you a fine behaviour, cpupool-wise, almost for free. You
> apparently don't. Why is that? If you'd go for a similar approach, you
> will get the same benefit. Where is it that you insert a vcpu in the
> RunQ for the first time? Again, why there and not here?

Added the runq_insert() statement in this function to add the vcpu to
the RunQ now.
In the old patch, the vcpu was inserted in the wake-up function. :-(

>> +     * update deadline info: When deadline is in the past,
>> +     * it need to be updated to the deadline of the current period,
>> +     * and replenish the budget
>> +     */
>> +    delta = now - svc->cur_deadline;
>> +    if ( delta >= 0 )
>> +    {
>> +        count = ( delta/MICROSECS(svc->period) ) + 1;
>> +        svc->cur_deadline += count * MICROSECS(svc->period);
>> +        svc->cur_budget = svc->budget * 1000;
>> +
> Ditto, about time units and conversions.
>
> I also am sure I said before that I prefer an approach like:
>
>     while ( svc->cur_deadline < now )
> >         svc->cur_deadline += svc->period;

I explained this in the previous comment. :-) NOW() could be several
periods past the current deadline. And this is a deferrable server. :-)
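
For instance (numbers made up just for illustration): with a period of
10ms, cur_deadline = 30ms and NOW() = 57ms, we get
count = (27ms / 10ms) + 1 = 3, so the new deadline is
30ms + 3 * 10ms = 60ms, i.e. the end of the period that 57ms falls in,
and the budget is replenished once (not three times).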

>> +    delta = now - svc->last_start;
>> +    /*
>> +     * delta < 0 only happens in nested virtualization;
>> +     * TODO: how should we handle delta < 0 in a better way? */
>>
> Ah, nice to hear you tested this in nested virt? Whay config is that,
> Xen on top of VMWare (I'm assuming this from what I've seen on the
> rt-xen mailing list).

I run Xen in VirtualBox on a MacBook Air. VirtualBox needs Intel VT-x
enabled in the VM's configuration. Actually, I developed the code in
VirtualBox, which can reboot much faster. After it worked well in
VirtualBox, I tested it on the bare machine.

>
>> +    if ( delta < 0 )
>> +    {
>> +        printk("%s, ATTENTION: now is behind last_start! delta = %ld for ",
>> +                __func__, delta);
>> +        rt_dump_vcpu(svc);
>> +        svc->last_start = now;  /* update last_start */
>> +        svc->cur_budget = 0;   /* FIXME: should we recover like this? */
>> +        return;
>> +    }
>> +
> Bha, this just should not happen, and if it does, it's either a bug or
> an hardware problem, isn't it? I don't think you should do much more
> than this. I'd remove the rt_dump_vcpu() but, in this case, I'm fine
> with the printk(), as it's something that really should not happen, and
> we want to inform sysadmin that something has gone very bad. :-)

When I run Xen in VirtualBox and configure 4 (virtual) cores for the
VM, these 4 cores are four threads on my host machine (i.e., the
MacBook Air). I'm guessing that, if the four virtual cores do not have
good (or timely) time synchronization, then when a vcpu migrates to
another virtual core, the current time might be behind its last_start
time.

This never happens when I run Xen on the bare machine, but it happens
when I run Xen inside VirtualBox on the MacBook Air.


>> +    if ( svc->cur_budget == 0 )
>> +        return;
>> +
>> +    svc->cur_budget -= delta;
>> +    if ( svc->cur_budget < 0 )
>> +        svc->cur_budget = 0;
>> +
> I see you're not dealing with overruns (where with dealing I mean try to
> compensate). That's fine for now, let's leave this as a future
> development.
>

Sure.

>> +/*
>> + * RunQ is sorted. Pick first one within cpumask. If no one, return NULL
>> + * lock is grabbed before calling this function
>> + */
>> +static struct rt_vcpu *
>> +__runq_pick(const struct scheduler *ops, cpumask_t mask)
>> +{
>> +    struct list_head *runq = RUNQ(ops);
>> +    struct list_head *iter;
>> +    struct rt_vcpu *svc = NULL;
>> +    struct rt_vcpu *iter_svc = NULL;
>> +    cpumask_t cpu_common;
>> +    cpumask_t *online;
>> +    struct rt_private * prv = RT_PRIV(ops);
>> +
>> +    list_for_each(iter, runq)
>> +    {
>> +        iter_svc = __runq_elem(iter);
>> +
>> +        /* flag vcpu */
>> +        if(iter_svc->sdom == NULL)
>> +            break;
>> +
>> +        /* mask is intersection of cpu_hard_affinity and cpupool and priv->cpus */
>> +        online = cpupool_scheduler_cpumask(iter_svc->vcpu->domain->cpupool);
>> +        cpumask_and(&cpu_common, online, &prv->cpus);
>> +        cpumask_and(&cpu_common, &cpu_common, iter_svc->vcpu->cpu_hard_affinity);
>> +        cpumask_and(&cpu_common, &mask, &cpu_common);
>> +        if ( cpumask_empty(&cpu_common) )
>> +            continue;
>> +
> I remember asking this in
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01624.html:
>
> "What's in priv->cpus, BTW? How is it different form the cpupool online
> mask (returned in 'online' by cpupool_scheduler_cpumask() )?"
>
> while I don't remember having received an answer, and I see it's still.
> here in the code. Am I missing something? If not, can you explain?

prv->cpus is the set of online cpus for the scheduler. It should have
the same value as cpupool_scheduler_cpumask().

Should I remove this? (I plan to release the next version this weekend;
I will remove this once you confirm. :-))


Thank you very much for your time and review!
Hope I have addressed all of them, except the ones I asked about in
this email. (I think I solved all of them now. :-P )

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-07  3:56     ` Meng Xu
@ 2014-09-08 10:33       ` Dario Faggioli
  2014-09-09 13:43         ` Meng Xu
  0 siblings, 1 reply; 72+ messages in thread
From: Dario Faggioli @ 2014-09-08 10:33 UTC (permalink / raw)
  To: Meng Xu
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb



On sab, 2014-09-06 at 23:56 -0400, Meng Xu wrote:
> Hi Dario,
> 
> I have modified the code based on your comments in the previous email.
>
Ok. :-)

Let me reply to some of your replies, then I'll go looking at the new
version of the series.

> >> +/*
> >> + * Useful inline functions
> >> + */
> >> +static inline struct rt_private *RT_PRIV(const struct scheduler *_ops)
> >> +{
> >> +    return _ops->sched_data;
> >> +}
> >> +
> >> +static inline struct rt_vcpu *RT_VCPU(const struct vcpu *_vcpu)
> >> +{
> >> +    return _vcpu->sched_priv;
> >> +}
> >> +
> >> +static inline struct rt_dom *RT_DOM(const struct domain *_dom)
> >> +{
> >> +    return _dom->sched_priv;
> >> +}
> >> +
> >> +static inline struct list_head *RUNQ(const struct scheduler *_ops)
> >> +{
> >> +    return &RT_PRIV(_ops)->runq;
> >> +}
> >> +
> > I see what's happening, and I remember the suggestion of using static
> > inline-s, with which I agree. At that point, however, I'd have the
> > function names lower case-ed.
> >
> > It's probably not a big deal, and I don't think we have any official
> > saying about this, but it looks more consistent to me.
> 
> Because other schedulers uses the upper-case name, (because they are
> macros instead of static inline,) in order to keep the style
> consistent with other schedulers, I think I will keep the name in
> upper-case. 
>
All upper case is to stress the fact that these are macros. I find that
making the reader aware of something like this (i.e., whether he's
dealing with a macro or a function) is much more important than
consistency between different and unrelated source files.

I recommend lower casing these, although, I guess it's George's, and
maybe Jan's, call to actually decide.

George?

> I can send another patch set to change all other
> schedulers'  macro definition of these functions to static inline, and
> then change the name to lower-case. What do you think?
> 
No need to do that.

> >> +    printk("cpupool=%s\n", cpustr);
> >> +
> >> +    /* TRACE */
> >> +    {
> >> +        struct {
> >> +            unsigned dom:16,vcpu:16;
> >> +            unsigned processor;
> >> +            unsigned cur_budget_lo, cur_budget_hi, cur_deadline_lo, cur_deadline_hi;
> >> +            unsigned is_vcpu_on_runq:16,is_vcpu_runnable:16;
> >> +        } d;
> >> +        d.dom = svc->vcpu->domain->domain_id;
> >> +        d.vcpu = svc->vcpu->vcpu_id;
> >> +        d.processor = svc->vcpu->processor;
> >> +        d.cur_budget_lo = (unsigned) svc->cur_budget;
> >> +        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
> >> +        d.cur_deadline_lo = (unsigned) svc->cur_deadline;
> >> +        d.cur_deadline_hi = (unsigned) (svc->cur_deadline >> 32);
> >> +        d.is_vcpu_on_runq = __vcpu_on_runq(svc);
> >> +        d.is_vcpu_runnable = vcpu_runnable(svc->vcpu);
> >> +        trace_var(TRC_RT_VCPU_DUMP, 1,
> >> +                  sizeof(d),
> >> +                  (unsigned char *)&d);
> >> +    }
> > Not a too big deal, but is it really useful to trace vcpu dumps? How?
> 
> Actually, this function is called in other functions in this file.
>
Yeah, I saw that. And IIRC, I asked to remove pretty much all of the
calls of this function from around the file, except the one from the
handling of the debug key, which is actually meant for dumping this info.

> Whenever we want to look into the details of a vcpu (for example, when
> we insert a vcpu to the RunQ), we call this function and it will trace
> the vcpu information. 
>
Which does not make sense. It does make sense (I still find it very
'chatty', though) for development and debugging purposes, but there
should not be anything like this in an upstreamable patch series.

Also, you're filling up the trace with TRC_RT_VCPU_DUMP events, while
what you're doing is not actually that, but rather allocating a new
vcpu, deallocating one, etc.

If you need to see these events, add specific tracing events, like
you're doing already for a bunch of other ones. You've done the right
thing (according to me, at least), introducing those tracing events and
tracing points. Broaden it, if you need, instead of "abusing" dumping
vcpus. :-)

> I extract this as a function to avoid writing
> this trace again and again in all of those functions, such as
> rt_alloc_vdata, rt_free_vdata, burn_budgets.
> 
Ditto. BTW, burning budget seems to me to deserve its own trace point,
as I am suggesting above, rather than using this function. Doing the
same everywhere you think it's needed is TRT(^TM).
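
E.g., for budget burning, something like this (just a sketch, mirroring
the TRACE block you already have, with a hypothetical TRC_RT_BUDGET_BURN
event) would do:

    /* TRACE */
    {
        struct {
            unsigned dom:16, vcpu:16;
            unsigned cur_budget_lo, cur_budget_hi;
            int delta;
        } d;
        d.dom = svc->vcpu->domain->domain_id;
        d.vcpu = svc->vcpu->vcpu_id;
        d.cur_budget_lo = (unsigned) svc->cur_budget;
        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
        d.delta = delta;
        trace_var(TRC_RT_BUDGET_BURN, 1, sizeof(d), (unsigned char *)&d);
    }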

> In addition, we use xl sched-rt to dump the RunQ information to check
> the RunQ is properly sorted for debug reason.
> 
> When it is upstreamed, I will change this function to static inline,
> remove the printf in this function and leave the trace there.
> 
This makes even less sense! :-O

Dumping info via printk() is a valid means of gathering debug info,
supported by all the other schedulers, and by other Xen subsystems as
well. That happens when a debug key is sent to the Xen console, and
that absolutely needs to stay.

What you need to do is not removing the printk. It's rather removing
the trace point from within here, and avoiding calling this function
from anywhere other than the debug key handling.

Also, from a methodology perspective, there is no "when it's upstream I
will change ...". What you send to xen-devel should always be the code
that, if properly acked, gets committed upstream as it is, without any
further modification, either right before or right after that.

If you need some more aggressive debugging for your day-to-day
development (which is fine, I often do need something like that :-D),
what you can do is have it in a separate patch, at the bottom of the
series. Then, when sending the patches in, you just do not include that
one, and you're done. This is how I do it, at least, and it works for
me. Both plain git and tools like stgit or quilt/guilt make it very
easy to deal with this workflow.


> >> +        list_add(&svc->runq_elem, &prv->flag_vcpu->runq_elem);
> >>
> > Mmm... flag_vcpu, eh? I missed it above where it's declared, but I
> > really don't like the name. 'depleted_vcpus'? Or something else (sorry,
> > not very creative in this very moment :-))... but flag_vcpu, I found it
> > confusing.
> 
> It's declared in struct rt_private. It's inited in rt_init function.
> 
> The reason why I need this vcpu is:
> The RunQ has two parts, the first part is the vcpus with budget and
> the first part is sorted based on priority of vcpus. The second part
> [...]
>
Wow, wo, woow... hold your horses! :-P :-P

I saw where it lives, and I understood what it's for. All I was asking
was to change the name of the variable. :-)

> >> + */
> >> +static void *
> >> +rt_alloc_pdata(const struct scheduler *ops, int cpu)
> >> +{
> >> +    struct rt_private *prv = RT_PRIV(ops);
> >> +
> >> +    cpumask_set_cpu(cpu, &prv->cpus);
> >> +
> >> +    per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
> >> +
> >> +    printtime();
> >> +    printk("%s total cpus: %d", __func__, cpumask_weight(&prv->cpus));
> >> +    /* same as credit2, not a bogus pointer */
> >> +    return (void *)1;
> >>
> > Can you put in the comment the reason why it is ok to do this, without
> > referencing credit2?
> 
> Explained in the comment in next patch. In schedule.c, they use the
> return value to check if this function correctly allocate pdata by
> checking if it returns 1. (Well, this is inconsistent with other code
> which uses 0 to indicate success.) The function definition needs
> return a void *, so we have to cast the 1 to void *. That's the
> reason. :-P
> 
Same here. I do see the reason. My point was: "Either state the reason
in the comment, or remove it." The purpose of a comment like this should
be to explain _why_ something is done in a certain way. Now, about this
code, either one sees (or is able to quickly find out) the reason for
the (void*)1 by herself, in which case the comment is useless anyway.
OTOH, if one needs help in understanding that, you're not helping him
much by saying <<dude, this is fine, credit2 does the same thing!>>,
i.e., the comment could have been useful in this case, but it's useless
again.

So, if I had to choose between a useless comment and no comment, I'd
go for the latter. And that's what I'm saying: make it a useful comment
for someone different than you (or even for you, e.g., in 5 years from
now :-P), or kill it. :-)

> >> +    svc->period = RT_DEFAULT_PERIOD;
> >> +    if ( !is_idle_vcpu(vc) )
> >> +        svc->budget = RT_DEFAULT_BUDGET;
> >> +
> >> +    count = (now/MICROSECS(svc->period)) + 1;
> >> +    /* sync all VCPU's start time to 0 */
> >> +    svc->cur_deadline += count * MICROSECS(svc->period);
> >> +
> > why "+="? What value do you expect cur_deadline to hold at this time?
> 
> It should be =. This does not cause a bug because it's in the
> rt_alloc_vdata() and the svc->cur_deadline is 0.  But "+=" is not
> correct in logic. I modified it. Thank you!
> 
I appreciate it does not harm, but it's harder to read, so thanks for
changing it.

> >
> > Also, unless I'm missing something, or doing the math wrong, what you
> > want to do here is place the deadline one period ahead of NOW().
> 
> Ah, no. I think you are thinking about the CBS server mechanisms. (Now
> I start to use some notations in real-time academic field) For the
> deferrable server, if the release offset of a implicit-deadline
> deferrable vcpu is O_i, its deadline will be O_i + p_i * k, where k is
> a natural number. So we want to set deadline to the end of the period
> during which NOW() is fall in.
> 
Mmm... ok, yes, I have to admit I probably was not considering that. So,
for the first time you're setting the deadline, which is what's
happening here, I don't think I see a quick way to avoid the div+1 to
emulate the ceiling.

It probably can still be avoided in other places, though. In fact, in
those cases, you're advancing the deadline (perhaps more than one time)
from a previous one, and that should manage to get you to the end of
the right period, shouldn't it? Anyway, I guess I'll comment more (if
I find it necessary) about this directly on v2.

I'm still convinced about the time conversion part of what I said about
this code, though.
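
IOW, what I keep suggesting is something like this (just a sketch of the
idea, with hypothetical variable names): convert once, at the point where
the parameters enter the scheduler, e.g.

    /* wherever period/budget get set, in whatever unit they arrive */
    svc->period = MICROSECS(period_in_us);
    svc->budget = MICROSECS(budget_in_us);

so that the hot paths can then simply do

    svc->cur_deadline += svc->period;
    svc->cur_budget = svc->budget;

without any MICROSECS() or '* 1000' scattered around them.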

> >> +    /* Debug only: dump new vcpu's info */
> >> +    printtime();
> >> +    rt_dump_vcpu(svc);
> >> +
> > As said before, this has to go, quite possibly by means of replacing it
> > by some tracing.
> 
> When no more other comments to this patch set, I will make
> rt_dump_vcpu as a static inline function and it only has the tracing
> inside. Is that OK? Now I just want to use some dump to help me with
> some quick debug. :-)
> 
Please don't. Do ditch all the occurrences of this, and add tracing if
required.

Personally, I think that in most of the places where you are now calling
rt_dump_vcpu(), you don't even need to add tracing. Perhaps do what I
said above about keeping a private debug patch. In this spot here, I
wouldn't see it bad to have a trace point, and one with its own specific
event and data, as setting the first deadline is quite important an
event to see in a trace, if you what to understand how things works/are
going.

> > Can you please re-look at
> > http://lists.xen.org/archives/html/xen-devel/2014-07/msg01624.html , and
> > either reply/explain, or cope with that? In summary, what I was asking
> > was whether adding/removing the vcpu to the list of vcpus of a domain
> > could be done somewhere or somehow else, as doing it like this means,
> > for some code-path, removing and then re-adding it right away.
> >
> > Looking more carefully, I do see that credit2 does the same thing but,
> > for one, that doesn't make it perfect (I still don't like it :-P), and
> > for two, credit2 does a bunch of other stuff there, that you don't.
> >
> > Looking even more carefully, this seems to be related to the TODO you
> > put (which may be a new addition wrt RFCv2, I can't remember). So, yes,
> > I seriously think you should take care of the fact that, when this
> > function is called for moving a domain from a cpupool to another, you
> > need to move the vcpus from the old cpupool's runqueue to the new one's
> > runqueue.
> >
> > Have you tested doing such operation (moving domains between cpupools)?
> > Because, with the code looking like this, it seems a bit unlikely...
> 
> Hmm, I actually did test this operations. I migrate a dom from Pool-o
> to the newly created cpupool test with credit scheduler. That's also
> the scenario I showed in the cover page of this patch set. :-)
>
Yes, I saw it. TBF, you only have part of what allows one to tell
whether things are properly working after the migration; e.g., the
output of `xl vcpu-list', to see where the vcpus are actually running,
is missing.

In any case, working or not, I think the right thing to do is having
some runq management bits in these functions.

> Now I add the code of removing the vcpu from the RunQ, and tested it
> was removed from the RunQ when I move the domU to the new cpupool.
> 
Perfect, thanks. Let's comment and try this then.

> As to remove the vcpu from the domain's vcpu list. I'm not operating
> the vcpu list of the structure domain{}, which are used by other
> components in Xen. 
>
I know.

> Actually, we have a scheduler-specific private
> structure rt_vcpu which uses its field runq_elem to link the vcpu to
> the RunQ of the scheduler; 
>
I know.

> We also have a scheduler-specific private
> structure rt_dom to record the scheduler-specific data for the domain
> (it's defined in sched_rt.c). 
>
I know.

> Inside the rt_dom, it has a vcpu list to
> record the scheduler-specific vcpus of this domain. 
>
I know.

> When we move a
> domU from a cpupool to another cpupool (say, with rt scheduler), the
> domU's scheduler-specific vcpu should be moved to the destination
> cpupool's RunQ. So we need to remove the scheduler-specific vcpu from
> the source cpupool's RunQ by using this rt_vcpu_remove() and then
> insert it to the dest. cpupool's RunQ by using the rt_vcpu_insert().
> 
I know. And I think this is exactly what you are *NOT* doing in v1.
Since you often look at what credit2 does, you can see it includes a
call to runq_assign(), the equivalent of which is what you're missing.

> > Finally, the insert_vcpu hook is called from sched_init_vcpu() too,
> > still in schedule.c. And in fact, all the other scheduler use this
> > function to put a vcpu on its runqueue for the first time. Once you do
> > this, you a fine behaviour, cpupool-wise, almost for free. You
> > apparently don't. Why is that? If you'd go for a similar approach, you
> > will get the same benefit. Where is it that you insert a vcpu in the
> > RunQ for the first time? Again, why there and not here?
> 
> Added the runq_insert() statement in this function to add the vcpu to
> the RunQ now.
> In the old patch, the vcpu is inserted in the wake-up function. :-(
> 
Right, which may be what makes things (at least partially) work when
cpupools are involved as well.

> >> +     * update deadline info: When deadline is in the past,
> >> +     * it need to be updated to the deadline of the current period,
> >> +     * and replenish the budget
> >> +     */
> >> +    delta = now - svc->cur_deadline;
> >> +    if ( delta >= 0 )
> >> +    {
> >> +        count = ( delta/MICROSECS(svc->period) ) + 1;
> >> +        svc->cur_deadline += count * MICROSECS(svc->period);
> >> +        svc->cur_budget = svc->budget * 1000;
> >> +
> > Ditto, about time units and conversions.
> >
> > I also am sure I said before that I prefer an approach like:
> >
> >     while ( svc->cur_deadline < now )
> >         svc->cur_deadline += svc->period;
> 
> I explained in the previous comment. :-) Now could be  several periods
> late after the current deadline. And this is deferrable server. :-)
> 
Are you sure this is not ok this time? As I said, I agree the count
thing is right when assigning the first deadline. However:
 1) I know it can be several periods away, that's the purpose of the 
    while()
 2) advancing in steps of period, starting from the last set deadline, 
    should get you to the end of the right period.

Or am I missing something else? :-)
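
E.g., picking some numbers (made up, just for the sake of the example):
with a period of 10 and the old deadline at 30, if now is 57 the while()
walks 30 -> 40 -> 50 -> 60, and the count-based formula gives
30 + (27/10 + 1) * 10 = 60 as well. So they land on the same deadline;
the while() just trades the division for (usually very few) iterations.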

> >> +    if ( delta < 0 )
> >> +    {
> >> +        printk("%s, ATTENTION: now is behind last_start! delta = %ld for ",
> >> +                __func__, delta);
> >> +        rt_dump_vcpu(svc);
> >> +        svc->last_start = now;  /* update last_start */
> >> +        svc->cur_budget = 0;   /* FIXME: should we recover like this? */
> >> +        return;
> >> +    }
> >> +
> > Bha, this just should not happen, and if it does, it's either a bug or
> > an hardware problem, isn't it? I don't think you should do much more
> > than this. I'd remove the rt_dump_vcpu() but, in this case, I'm fine
> > with the printk(), as it's something that really should not happen, and
> > we want to inform sysadmin that something has gone very bad. :-)
> 
> When I run Xen in virtualBox and configure 4 (virtual) cores to the VM
> of the virtualBox, these 4 cores are four threads for my host machine
> (i.e., Macbookair). I'm guessing if the four virtual cores do not have
> a good (or timely) time synchronization, when a vcpu migrates to
> another virtual core, the time might be late after its last_start
> time.
> 
That's likely what happens, yes.

> > "What's in priv->cpus, BTW? How is it different form the cpupool online
> > mask (returned in 'online' by cpupool_scheduler_cpumask() )?"
> >
> > while I don't remember having received an answer, and I see it's still.
> > here in the code. Am I missing something? If not, can you explain?
> 
> prv->cpus is the online cpus for the scheduler.  It should have the
> same value with the cpupool_scheduler_cpumask().
> 
Should?

> Should I remove this? (I plan to release the next version this
> weekend, I will remove this after receiving the  command. :-))
> 
Well, you tell me. :-) Is there a particular reason why you're keeping
the same information in two places? If there's one, explain it to us. If
there isn't, well... :-)

> Thank you very much for your time and review!
> Hope I have solved all of them, except the ones I asked in the email.
> (I think I solved all of them now. :-P )
> 
Thanks to you for taking care of them.

I'll have a look at v2 and let you know.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v1 1/4] xen: add real time scheduler rt
  2014-09-08 10:33       ` Dario Faggioli
@ 2014-09-09 13:43         ` Meng Xu
  0 siblings, 0 replies; 72+ messages in thread
From: Meng Xu @ 2014-09-09 13:43 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Ian Campbell, Sisu Xi, Stefano Stabellini, George Dunlap,
	Chenyang Lu, Ian Jackson, xen-devel, Linh Thi Xuan Phan, Meng Xu,
	Jan Beulich, Chao Wang, Chong Li, Dagaen Golomb

2014-09-08 6:33 GMT-04:00 Dario Faggioli <dario.faggioli@citrix.com>:
> On sab, 2014-09-06 at 23:56 -0400, Meng Xu wrote:
>> Hi Dario,
>>
>> I have modified the code based on your comments in the previous email.
>>
> Ok. :-)
>
> Let me reply to some of your replies, then I'll go looking at the new
> version of the series.
>
>> >> +/*
>> >> + * Useful inline functions
>> >> + */
>> >> +static inline struct rt_private *RT_PRIV(const struct scheduler *_ops)
>> >> +{
>> >> +    return _ops->sched_data;
>> >> +}
>> >> +
>> >> +static inline struct rt_vcpu *RT_VCPU(const struct vcpu *_vcpu)
>> >> +{
>> >> +    return _vcpu->sched_priv;
>> >> +}
>> >> +
>> >> +static inline struct rt_dom *RT_DOM(const struct domain *_dom)
>> >> +{
>> >> +    return _dom->sched_priv;
>> >> +}
>> >> +
>> >> +static inline struct list_head *RUNQ(const struct scheduler *_ops)
>> >> +{
>> >> +    return &RT_PRIV(_ops)->runq;
>> >> +}
>> >> +
>> > I see what's happening, and I remember the suggestion of using static
>> > inline-s, with which I agree. At that point, however, I'd have the
>> > function names lower case-ed.
>> >
>> > It's probably not a big deal, and I don't think we have any official
>> > saying about this, but it looks more consistent to me.
>>
>> Because other schedulers uses the upper-case name, (because they are
>> macros instead of static inline,) in order to keep the style
>> consistent with other schedulers, I think I will keep the name in
>> upper-case.
>>
> All upper case is to stress the fact that these are macro. I find that
> making the reader aware of something like this (i.e., whether he's
> dealing with a macro or a function), is much more important than
> consistency between different and unrelated source files.
>
> I recommend lower casing these, although, I guess it's George's, and
> maybe Jan's, call to actually decide.
>
> George?

I saw George's reply. Now I will change them to lower-case.
>
>> I can send another patch set to change all other
>> schedulers'  macro definition of these functions to static inline, and
>> then change the name to lower-case. What do you think?
>>
> No need to do that.

Roger.

>
>> >> +    printk("cpupool=%s\n", cpustr);
>> >> +
>> >> +    /* TRACE */
>> >> +    {
>> >> +        struct {
>> >> +            unsigned dom:16,vcpu:16;
>> >> +            unsigned processor;
>> >> +            unsigned cur_budget_lo, cur_budget_hi, cur_deadline_lo, cur_deadline_hi;
>> >> +            unsigned is_vcpu_on_runq:16,is_vcpu_runnable:16;
>> >> +        } d;
>> >> +        d.dom = svc->vcpu->domain->domain_id;
>> >> +        d.vcpu = svc->vcpu->vcpu_id;
>> >> +        d.processor = svc->vcpu->processor;
>> >> +        d.cur_budget_lo = (unsigned) svc->cur_budget;
>> >> +        d.cur_budget_hi = (unsigned) (svc->cur_budget >> 32);
>> >> +        d.cur_deadline_lo = (unsigned) svc->cur_deadline;
>> >> +        d.cur_deadline_hi = (unsigned) (svc->cur_deadline >> 32);
>> >> +        d.is_vcpu_on_runq = __vcpu_on_runq(svc);
>> >> +        d.is_vcpu_runnable = vcpu_runnable(svc->vcpu);
>> >> +        trace_var(TRC_RT_VCPU_DUMP, 1,
>> >> +                  sizeof(d),
>> >> +                  (unsigned char *)&d);
>> >> +    }
>> > Not a too big deal, but is it really useful to trace vcpu dumps? How?
>>
>> Actually, this function is called in other functions in this file.
>>
> Yeah, I saw that. And IIRC, I asked to remove pretty much all of the
> calls of this function from around the file, except then one from the
> handling of the debug key that is actually meant at dumping these info.

I see. Then I will remove them.

>
>> Whenever we want to look into the details of a vcpu (for example, when
>> we insert a vcpu to the RunQ), we call this function and it will trace
>> the vcpu information.
>>
> Which does not make sense. It does (I still find it very 'chatty',
> though), for developing and debugging reasons, but there should not be
> anything like this in an upstreamable patch series.
>
> Also, you're filling up the trace of TRC_RT_VCPU_DUMP events while what
> you're doing is not actually that, but rather, allocating a new vcpu,
> deallocating, etc.
>
> If you need to see these events, add specific tracing events, like
> you're doing already for a bunch of other ones. You've done the right
> thing (according to me, at least), introducing those tracing events and
> tracing points. Broaden it, if you need, instead of "abusing" dumping
> vcpus. :-)

I see. Then I will remove these dumps and add more trace events.

>
>> I extract this as a function to avoid writing
>> this trace again and again in all of those functions, such as
>> rt_alloc_vdata, rt_free_vdata, burn_budgets.
>>
> Ditto. BTW, burning budget seems to me to have its own trace point, as I
> am suggesting above, and not using this function. Doing the same
> everywhere you think you need is TRT(^TM).

OK. Now I understand and will remove those dumps.

>
>> In addition, we use xl sched-rt to dump the RunQ information to check
>> the RunQ is properly sorted for debug reason.
>>
>> When it is upstreamed, I will change this function to static inline,
>> remove the printf in this function and leave the trace there.
>>
> This makes even less sense! :-O
>
> Dumping info via printk() is a valid mean of gathering debug info,
> supported by all other schedulers, and by other Xen subsystem as well.
> That happens whey a debug key is sent to the Xen console, and that
> absolutely needs to stay.
>
> What you need to do is not removing the printk. It's rather removing the
> trace point from within here, and avoiding calling this function from
> other place than debug key handling.
>
> Also, from a methodology perspective, there is no "when upstream I will
> change ...". What you send to xen-devel should always be the code, that,
> if properly acked, gets committed upstream as it is, without any further
> modification, either right before or right after that.

Thank you very much for correcting my incorrect perspective. :-)

>
> If you need some more aggressive debugging for your day-to-day
> development (which is fine, I often do need something like that :-D),
> what you can do is have it in a separate patch, at the bottom of the
> series. Then, when sending the patches in, you just do not include that
> one, and you're done. This is how I do it, at least, and it works for
> me. Both plain git and tools like stgit or quilt/guilt makes it very
> easy to deal with this workflow.

This is a very good approach! I will do that!

>
>
>> >> +        list_add(&svc->runq_elem, &prv->flag_vcpu->runq_elem);
>> >>
>> > Mmm... flag_vcpu, eh? I missed it above where it's declared, but I
>> > really don't like the name. 'depleted_vcpus'? Or something else (sorry,
>> > not very creative in this very moment :-))... but flag_vcpu, I found it
>> > confusing.
>>
>> It's declared in struct rt_private. It's inited in rt_init function.
>>
>> The reason why I need this vcpu is:
>> The RunQ has two parts, the first part is the vcpus with budget and
>> the first part is sorted based on priority of vcpus. The second part
>> [...]
>>
> Wow, wo, woow... hold your horses! :-P :-P
>
> I saw where it lives, and I understood what's it for. All I was asking
> was to change the name of the variable. :-)

Haha. :-P

I also saw George's comment on this. It seems that both you and George
have the idea of using two separate queues instead of one RunQ with a
flag vcpu. Using two separate queues is easier for developers to
understand. So I will change it to use two separate queues and add
some simple helper functions.
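
Something like this is what I'm thinking of (just a sketch; the names
are placeholders and will follow the lower-case convention we agreed
on):

    /* in struct rt_private: a second list, only for depleted vcpus */
    struct list_head depletedq;

    static inline struct list_head *rt_depletedq(const struct scheduler *ops)
    {
        return &rt_priv(ops)->depletedq;
    }

    /* no sorting needed here: depleted vcpus are never picked to run,
     * they are only scanned for replenishment */
    static void
    __depletedq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
    {
        list_add_tail(&svc->runq_elem, rt_depletedq(ops));
    }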

>
>> >> + */
>> >> +static void *
>> >> +rt_alloc_pdata(const struct scheduler *ops, int cpu)
>> >> +{
>> >> +    struct rt_private *prv = RT_PRIV(ops);
>> >> +
>> >> +    cpumask_set_cpu(cpu, &prv->cpus);
>> >> +
>> >> +    per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
>> >> +
>> >> +    printtime();
>> >> +    printk("%s total cpus: %d", __func__, cpumask_weight(&prv->cpus));
>> >> +    /* same as credit2, not a bogus pointer */
>> >> +    return (void *)1;
>> >>
>> > Can you put in the comment the reason why it is ok to do this, without
>> > referencing credit2?
>>
>> Explained in the comment in the next patch. In schedule.c, they use the
>> return value to check whether this function correctly allocated pdata by
>> checking if it returns 1. (Well, this is inconsistent with other code
>> which uses 0 to indicate success.) The function definition needs to
>> return a void *, so we have to cast the 1 to void *. That's the
>> reason. :-P
>>
> Same here. I do see the reason. My point was: "Either state the reason
> in the comment, or remove it." The purpose of a comment like this should
> be to explain _why_ something is done in a certain way. Now, about this
> code, either one sees (or is able to quickly find out) the reason for
> the (void*)1, by herself, in which case the comment is useless anyway.
> OTOH, if one needs help in understanding that, you're not helping him
> much by saying <<dude, this is fine, credit2 does the same thing!>>,
> i.e., the comment could have been useful in this case, but it's useless
> again.
>
> So, if I had to choose between a useless comment and no comment, I'd
> go for the latter. And that's what I'm saying: make it a useful comment
> for someone other than you (or even for you, e.g., in 5 years from
> now :-P), or kill it. :-)

I see the point! This is a very useful guideline for adding comments! :-)
In the version 2 patch set I sent, I added the comment
/* 1 indicates alloc. succeed in schedule.c */. Is this ok? (I think this
could be useful for other developers because it tells them how the
return value will be used.)
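
For reference, my understanding of the caller side is roughly the
following (a simplified sketch of the idea, not the literal schedule.c
code):

    /* simplified: how schedule.c uses the return value of alloc_pdata */
    sd->sched_priv = ops.alloc_pdata(&ops, cpu);
    if ( sd->sched_priv == NULL )
        return -ENOMEM;
    /*
     * rt has no real per-pcpu data to allocate, so returning the
     * non-NULL token (void *)1 is just a way of reporting success.
     */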

>
>> >> +    svc->period = RT_DEFAULT_PERIOD;
>> >> +    if ( !is_idle_vcpu(vc) )
>> >> +        svc->budget = RT_DEFAULT_BUDGET;
>> >> +
>> >> +    count = (now/MICROSECS(svc->period)) + 1;
>> >> +    /* sync all VCPU's start time to 0 */
>> >> +    svc->cur_deadline += count * MICROSECS(svc->period);
>> >> +
>> > why "+="? What value do you expect cur_deadline to hold at this time?
>>
>> It should be =. This does not cause a bug because it's in the
>> rt_alloc_vdata() and the svc->cur_deadline is 0.  But "+=" is not
>> correct in logic. I modified it. Thank you!
>>
> I appreciate it does not harm, but it's harder to read, so thanks for
> changing it.
>
>> >
>> > Also, unless I'm missing something, or doing the math wrong, what you
>> > want to do here is place the deadline one period ahead of NOW().
>>
>> Ah, no. I think you are thinking about the CBS server mechanism. (Now
>> I start to use some notations from the real-time academic field.) For the
>> deferrable server, if the release offset of an implicit-deadline
>> deferrable vcpu is O_i, its deadline will be O_i + p_i * k, where k is
>> a natural number. So we want to set the deadline to the end of the period
>> in which NOW() falls.
>>
> Mmm... ok, yes, I have to admit I probably was not considering that. So,
> for the first time you're setting the deadline, which is what's
> happening here, I don't think I see a quick way to avoid the div+1, to
> emulate the ceiling.
>
> It probably can still be avoided in other places, though. In fact, in
> that case, you're advancing the deadline (perhaps more than once)
> from a previous one, and that should manage to get you to the end of
> the right period, shouldn't it? Anyway, I guess I'll comment more (if
> I'll find it necessary) about this directly on v2.
>
> I'm still convinced about the time conversion part of what I said about
> this code, though.

Sure! I will avoid using the division when advancing the deadline.
As to the time conversion part, all code in the hypervisor is in nanoseconds.
The only place we do a conversion is in the function rt_dom_cntl,
when we set/get vcpus' parameters (since the toolstack uses us as the time
unit and the hypervisor uses ns).
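
Concretely, in rt_dom_cntl the only conversions would look roughly like
this (the op->u.rt field names are placeholders; the point is just where
the us <-> ns conversion happens):

    /* toolstack speaks microseconds, the hypervisor keeps nanoseconds */
    svc->period = MICROSECS(op->u.rt.period);      /* us -> ns */
    svc->budget = MICROSECS(op->u.rt.budget);      /* us -> ns */

    /* ...and back when reporting parameters to the toolstack */
    op->u.rt.period = svc->period / MICROSECS(1);  /* ns -> us */
    op->u.rt.budget = svc->budget / MICROSECS(1);  /* ns -> us */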

>
>> >> +     * update deadline info: When deadline is in the past,
>> >> +     * it need to be updated to the deadline of the current period,
>> >> +     * and replenish the budget
>> >> +     */
>> >> +    delta = now - svc->cur_deadline;
>> >> +    if ( delta >= 0 )
>> >> +    {
>> >> +        count = ( delta/MICROSECS(svc->period) ) + 1;
>> >> +        svc->cur_deadline += count * MICROSECS(svc->period);
>> >> +        svc->cur_budget = svc->budget * 1000;
>> >> +
>> > Ditto, about time units and conversions.
>> >
>> > I also am sure I said before that I prefer an approach like:
>> >
>> >     while ( svc->cur_deadline < now )
>> >         svc->cur_deadline += svc->period;
>>
>> I explained in the previous comment. :-) NOW() could be several periods
>> past the current deadline. And this is a deferrable server. :-)
>>
> Are you sure this is not ok this time? As I said, I agree the count
> thing is right when assigning the first deadline. However:
>  1) I know it can be several periods away, that's the purpose of the
>     while()
>  2) advancing in steps of period, starting from the last set deadline,
>     should get you to the end of the right period.
>
> Or am I missing something else? :-)

Yes, you are right! I will change it to a while loop instead of
using the division, roughly as below. :-P
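
I.e., roughly (assuming cur_deadline, period and budget are all kept in
ns by then):

    /* only reached when the deadline is in the past, i.e. now >= cur_deadline */
    do
        svc->cur_deadline += svc->period;
    while ( svc->cur_deadline <= now );
    svc->cur_budget = svc->budget;    /* replenish budget for the new period */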


>
>> > "What's in priv->cpus, BTW? How is it different form the cpupool online
>> > mask (returned in 'online' by cpupool_scheduler_cpumask() )?"
>> >
> while I don't remember having received an answer, and I see it's still
>> > here in the code. Am I missing something? If not, can you explain?
>>
>> prv->cpus is the set of online cpus used by the scheduler.  It should have the
>> same value as cpupool_scheduler_cpumask().
>>
> Should?
>
>> Should I remove this? (I plan to release the next version this
>> weekend; I will remove this once you confirm. :-))
>>
> Well, you tell me. :-) Is there a particular reason why you're keeping
> the same information in two places? If there's one, explain it to us. If
> there isn't, well... :-)

Will remove in version 3. :-)

Thank you again!

Best,

Meng


-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania


end of thread

Thread overview: 72+ messages
2014-08-24 22:58 Introduce rt real-time scheduler for Xen Meng Xu
2014-08-24 22:58 ` [PATCH v1 1/4] xen: add real time scheduler rt Meng Xu
2014-08-26 14:27   ` Jan Beulich
2014-08-27  2:07     ` Meng Xu
2014-08-27  6:26       ` Jan Beulich
2014-08-27 14:28         ` Meng Xu
2014-08-27 15:04           ` Jan Beulich
2014-08-28 16:06             ` Meng Xu
2014-08-29  9:05               ` Jan Beulich
2014-08-29 19:35                 ` Meng Xu
2014-09-03 14:08                 ` George Dunlap
2014-09-03 14:24                   ` Meng Xu
2014-09-03 14:35                     ` Dario Faggioli
2014-09-03 13:40   ` George Dunlap
2014-09-03 14:11     ` Meng Xu
2014-09-03 14:15       ` George Dunlap
2014-09-03 14:35         ` Meng Xu
2014-09-05  9:46     ` Dario Faggioli
2014-09-03 14:20   ` George Dunlap
2014-09-03 14:45     ` Jan Beulich
2014-09-03 14:59     ` Dario Faggioli
2014-09-03 15:27       ` Meng Xu
2014-09-03 15:46         ` Dario Faggioli
2014-09-03 17:13           ` George Dunlap
2014-09-03 15:13     ` Meng Xu
2014-09-03 16:06       ` George Dunlap
2014-09-03 16:57         ` Dario Faggioli
2014-09-03 17:18           ` George Dunlap
2014-09-04  2:15             ` Meng Xu
2014-09-04 14:27             ` Dario Faggioli
2014-09-04 15:30               ` Meng Xu
2014-09-05  9:36                 ` Dario Faggioli
2014-09-05 15:06                   ` Meng Xu
2014-09-05 15:09                     ` Dario Faggioli
2014-09-04  2:11         ` Meng Xu
2014-09-04 11:00           ` Dario Faggioli
2014-09-04 13:03           ` George Dunlap
2014-09-04 14:00             ` Meng Xu
2014-09-05 17:17   ` Dario Faggioli
2014-09-07  3:56     ` Meng Xu
2014-09-08 10:33       ` Dario Faggioli
2014-09-09 13:43         ` Meng Xu
2014-08-24 22:58 ` [PATCH v1 2/4] libxc: add rt scheduler Meng Xu
2014-09-05 10:34   ` Dario Faggioli
2014-09-05 17:17     ` Meng Xu
2014-09-05 17:50       ` Dario Faggioli
2014-08-24 22:58 ` [PATCH v1 3/4] libxl: " Meng Xu
2014-08-25 13:17   ` Wei Liu
2014-08-25 15:55     ` Meng Xu
2014-08-26  9:51       ` Wei Liu
2014-09-03 15:33   ` George Dunlap
2014-09-03 20:52     ` Meng Xu
2014-09-04 14:27     ` George Dunlap
2014-09-04 14:45       ` Dario Faggioli
2014-09-04 14:47       ` Meng Xu
2014-09-04 14:51         ` George Dunlap
2014-09-04 15:07           ` Meng Xu
2014-09-04 15:44             ` Dario Faggioli
2014-09-04 15:55               ` George Dunlap
2014-09-04 16:12                 ` Meng Xu
2014-09-05  9:19                   ` Dario Faggioli
2014-09-04 15:25         ` Dario Faggioli
2014-09-05 10:21   ` Dario Faggioli
2014-09-05 15:45     ` Meng Xu
2014-09-05 17:41       ` Dario Faggioli
2014-08-24 22:58 ` [PATCH v1 4/4] xl: introduce " Meng Xu
2014-08-25 13:31   ` Wei Liu
2014-08-25 16:12     ` Meng Xu
2014-09-03 15:52   ` George Dunlap
2014-09-03 22:28     ` Meng Xu
2014-09-05  9:40       ` Dario Faggioli
2014-09-05 14:43         ` Meng Xu
