Should we mark RTDS as supported feature from experimental feature?

* Should we mark RTDS as supported feature from experimental feature?
@ 2016-04-26  1:44 Meng Xu
  2016-04-26  7:56 ` Dario Faggioli
  0 siblings, 1 reply; 14+ messages in thread
From: Meng Xu @ 2016-04-26  1:44 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Dario Faggioli

Hi Dario and all,

When the RTDS scheduler is initialized, it prints a message saying that
the scheduler is an experimental feature:

    printk("Initializing RTDS scheduler\n"
           "WARNING: This is experimental software in development.\n"
           "Use at your own risk.\n");

The RTDS wiki page [1] also says that the RTDS scheduler is an
experimental feature.

None of the above information has been updated since Xen 4.5.

However, in the MAINTAINERS file, the status of the RTDS scheduler is
marked as Supported (see commit 28041371 by Dario Faggioli on
2015-06-25).

In my opinion, the RTDS scheduler's functionality is finished and
tested. So should I send a patch to change the message printed out
when the scheduler is initialized?

If I understand correctly, the status in the MAINTAINERS file should
have the highest priority, and information from other sources should be
kept up to date with what the MAINTAINERS file says?

Please correct me if I'm wrong.

[1] http://wiki.xenproject.org/wiki/RTDS-Based-Scheduler

Thanks,

Meng

-- 
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26  1:44 Should we mark RTDS as supported feature from experimental feature? Meng Xu
@ 2016-04-26  7:56 ` Dario Faggioli
  2016-04-26  8:56   ` Andrew Cooper
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Dario Faggioli @ 2016-04-26  7:56 UTC (permalink / raw)
  To: Meng Xu, xen-devel; +Cc: George Dunlap



On Mon, 2016-04-25 at 21:44 -0400, Meng Xu wrote:
> Hi Dario and all,
> 
Hi,

> When RTDS scheduler is initialized, it will print out that the
> scheduler is an experimental feature with the following lines:
> 
>     printk("Initializing RTDS scheduler\n"
> 
>            "WARNING: This is experimental software in development.\n"
> 
>            "Use at your own risk.\n");
> 
> On RTDS' wiki [1], it says the RTDS scheduler is experimental
> feature.
> 
Yes.

> However, inside MAINTAINERS file, the status of RTDS scheduler is
> marked as Supported (refer to commit point 28041371 by Dario Faggioli
> on 2015-06-25).
> 
There's indeed a discrepancy between the way one can read that bit of
MAINTAINERS, and what is generally considered Supported (e.g., subject
to security support, etc).

This is true in general, not only for RTDS (more about this below).

> In my opinion, the RTDS scheduler's functionality is finished and
> tested. So should I send a patch to change the message printed out
> when the scheduler is initialized?
> 
So, yes, the scheduler is now feature complete (with the per-vcpu
parameters) and adheres to a much more sensible and scalable design
(event driven). Yet, these features have been merged very recently,
so when you say "tested", I'm not so sure I agree. In fact, we
do test it on OSSTest, but only in a couple of tests. The combination
of these two things makes me think that we should allow for at least
another development cycle before considering switching.

And speaking of OSSTest, there have been occasional failures on ARM,
which I haven't yet found the time to properly analyze. It may be just
something related to the fact that the specific board was very slow,
but I'm not sure yet.

And even in that case, I wonder how we should handle such a
situation... I was thinking of adding a work-conserving mode, what do
you think? You may have something similar in RT-Xen already but, even
if you don't, there are a number of ways for achieving that without
disrupting the real-time guarantees.

What do you think?

> If I understand correctly, the status in MAINTAINERS file should have
> the highest priority and information from other sources should keep
> updated with what the MAINTAINERS file says?
> 
> Please correct me if I'm wrong.
> 
This has been discussed before. Have a look at these messages:

http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg00972.html
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01775.html

And at this:
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01992.html

The feature document template has been put together:
http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01929.html

And there are feature documents in tree already.

Actually, writing one for RTDS would be a rather interesting and useful
thing to do, IMO! :-)

Regards,
Dario
---
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26  7:56 ` Dario Faggioli
@ 2016-04-26  8:56   ` Andrew Cooper
  2016-04-26 18:41     ` Meng Xu
  2016-04-26 15:35   ` George Dunlap
  2016-04-26 18:38   ` Meng Xu
  2 siblings, 1 reply; 14+ messages in thread
From: Andrew Cooper @ 2016-04-26  8:56 UTC (permalink / raw)
  To: Dario Faggioli, Meng Xu, xen-devel; +Cc: George Dunlap


>> However, inside MAINTAINERS file, the status of RTDS scheduler is
>> marked as Supported (refer to commit point 28041371 by Dario Faggioli
>> on 2015-06-25).
>>
> There's indeed a discrepancy between the way one can read that bit of
> MAINTAINERS, and what is generally considered Supported (e.g., subject
> to security support, etc).
>
> This is true in general, not only for RTDS (more about this below).

The purpose of starting the feature docs (in docs/features/) was to
identify the technical status of a feature, alongside some
documentation pertinent to its use.

I am tempted to suggest a requirement of "no security support without a
feature doc" for new features, in an effort to resolve the current
uncertainty as to what is supported and what is not.

As for the MAINTAINERS file, supported has a different meaning.  From
the file itself,

Descriptions of section entries:

M: Mail patches to: FullName <address@domain>
L: Mailing list that is relevant to this area
W: Web-page with status/info
T: SCM tree type and location.  Type is one of: git, hg, quilt, stgit.
S: Status, one of the following:
           Supported:   Someone is actually paid to look after this.
           Maintained:  Someone actually looks after it.
           Odd Fixes:   It has a maintainer but they don't have time to do
                        much other than throw the odd patch in. See below..
           Orphan:      No current maintainer [but maybe you could take the
                        role as you write your new code].
           Obsolete:    Old code. Something tagged obsolete generally means
                        it has been replaced by a better system and you
                        should be using that.

Nothing in the MAINTAINERS file constitutes a security statement.

~Andrew


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26  7:56 ` Dario Faggioli
  2016-04-26  8:56   ` Andrew Cooper
@ 2016-04-26 15:35   ` George Dunlap
  2016-04-26 20:00     ` Meng Xu
  2016-04-26 22:38     ` Dario Faggioli
  2016-04-26 18:38   ` Meng Xu
  2 siblings, 2 replies; 14+ messages in thread
From: George Dunlap @ 2016-04-26 15:35 UTC (permalink / raw)
  To: Dario Faggioli, Meng Xu, xen-devel; +Cc: George Dunlap

On 26/04/16 08:56, Dario Faggioli wrote:
> On Mon, 2016-04-25 at 21:44 -0400, Meng Xu wrote:
>> Hi Dario and all,
>>
> Hi,
> 
>> When RTDS scheduler is initialized, it will print out that the
>> scheduler is an experimental feature with the following lines:
>>
>>     printk("Initializing RTDS scheduler\n"
>>
>>            "WARNING: This is experimental software in development.\n"
>>
>>            "Use at your own risk.\n");
>>
>> On RTDS' wiki [1], it says the RTDS scheduler is experimental
>> feature.
>>
> Yes.
> 
>> However, inside MAINTAINERS file, the status of RTDS scheduler is
>> marked as Supported (refer to commit point 28041371 by Dario Faggioli
>> on 2015-06-25).
>>
> There's indeed a discrepancy between the way one can read that bit of
> MAINTAINERS, and what is generally considered Supported (e.g., subject
> to security support, etc).
> 
> This is true in general, not only for RTDS (more about this below).
> 
>> In my opinion, the RTDS scheduler's functionality is finished and
>> tested. So should I send a patch to change the message printed out
>> when the scheduler is initialized?
>>
> So, yes, the scheduler is now feature complete (with the per-vcpu
> parameters) and adheres to a much more sensible and scalable design
> (event driven). Yet, these features have been merged very recently,
> therefore, when you say "tested", I'm not so sure I agree. In fact, we
> do test it on OSSTest, but only in a couple of tests. The combination
> of these two things make me think that we should allow for at least
> another development cycle, before considering switching.
> 
> And speaking of OSSTest, there have been occasional failures, on ARM,
> which I haven't yet found the time to properly analyze. It may be just
> something related to the fact that the specific board was very slow,
> but I'm not sure yet.
> 
> And even in that case, I wonder how we should handle such a
> situation... I was thinking of adding a work-conserving mode, what do
> you think? You may have something similar in RT-Xen already but, even
> if you don't, there are a number of ways for achieving that without
> disrupting the real-time guarantees.
> 
> What do you think?
> 
>> If I understand correctly, the status in MAINTAINERS file should have
>> the highest priority and information from other sources should keep
>> updated with what the MAINTAINERS file says?
>>
>> Please correct me if I'm wrong.
>>
> This has been discussed before. Have a look at this thread/messages.
> 
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg00972.html
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01775.html
> 
> And at this:
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01992.html
> 
> The feature document template has been put together:
> http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01929.html
> 
> And there are feature documents in tree already.
> 
> Actually, writing one for RTDS would be a rather interesting and useful
> thing to do, IMO! :-)

I think it would be helpful to try to spell out what we think are the
criteria for marking RTDS non-experimental.  Reading your e-mail, Dario,
I might infer the following criteria:

1. New event-driven code spends most of a full release cycle in the tree
being tested
2. Better tests in osstest (which ones?)
3. A feature doc
4. A work-conserving mode

Is that about right?

#3 definitely sounds like a good idea.  #1 is probably reasonable.

I don't think #4 should be a blocker; we have plenty of work-conserving
schedulers. :-)

Regarding #2, did you have specific tests in mind?

Thoughts?

 -George


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26  7:56 ` Dario Faggioli
  2016-04-26  8:56   ` Andrew Cooper
  2016-04-26 15:35   ` George Dunlap
@ 2016-04-26 18:38   ` Meng Xu
  2016-04-26 22:49     ` Dario Faggioli
  2 siblings, 1 reply; 14+ messages in thread
From: Meng Xu @ 2016-04-26 18:38 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel

>> When RTDS scheduler is initialized, it will print out that the
>> scheduler is an experimental feature with the following lines:
>>
>>     printk("Initializing RTDS scheduler\n"
>>
>>            "WARNING: This is experimental software in development.\n"
>>
>>            "Use at your own risk.\n");
>>
>> On RTDS' wiki [1], it says the RTDS scheduler is experimental
>> feature.
>>
> Yes.
>
>> However, inside MAINTAINERS file, the status of RTDS scheduler is
>> marked as Supported (refer to commit point 28041371 by Dario Faggioli
>> on 2015-06-25).
>>
> There's indeed a discrepancy between the way one can read that bit of
> MAINTAINERS, and what is generally considered Supported (e.g., subject
> to security support, etc).
>
> This is true in general, not only for RTDS (more about this below).

Ah-ha, I see.

>
>> In my opinion, the RTDS scheduler's functionality is finished and
>> tested. So should I send a patch to change the message printed out
>> when the scheduler is initialized?
>>
> So, yes, the scheduler is now feature complete (with the per-vcpu
> parameters) and adheres to a much more sensible and scalable design
> (event driven). Yet, these features have been merged very recently,
> therefore, when you say "tested", I'm not so sure I agree. In fact, we
> do test it on OSSTest, but only in a couple of tests. The combination
> of these two things make me think that we should allow for at least
> another development cycle, before considering switching.

I see. So should we mark it as Completed for Xen 4.7? Or should we
wait until Xen 4.8 to mark it as Completed, if nothing bad happens to
the scheduler?

>
> And speaking of OSSTest, there have been occasional failures, on ARM,
> which I haven't yet found the time to properly analyze. It may be just
> something related to the fact that the specific board was very slow,
> but I'm not sure yet.

Hmm, I see. I plan to have a look at Xen on ARM this summer. When I
boot Xen on ARM, I probably could have a look at it as well.

>
> And even in that case, I wonder how we should handle such a
> situation... I was thinking of adding a work-conserving mode, what do
> you think?

Hmm, I can see why a work-conserving mode is necessary and useful. I'm
thinking about the tradeoff between the complexity it adds to the
scheduler and the benefit it brings.

The work-conserving mode is useful. However, there are other
real-time features of the scheduler that may also be useful. For
example, I heard from a company that they want to run RT VMs
alongside non-RT VMs, which is supported in RT-Xen 2.1 but not
in RTDS.

There are other RT-related issues we may need to solve to make it more
suitable for the real-time or embedded field, such as protocols to
handle shared resources.

Since the scheduler is aimed at embedded and real-time applications,
those RT-related features seem to me more important than the
work-conserving feature.

What do you think?

> You may have something similar in RT-Xen already but, even
> if you don't, there are a number of ways for achieving that without
> disrupting the real-time guarantees.

Actually, in RT-Xen, we don't have a work-conserving version yet.
The work-conserving feature may not affect the real-time guarantees,
but in theory it may not improve them either. When an embedded system
designer wants to use the RTDS scheduler "with the work-conserving
feature" (suppose we implement it), he cannot pack more workload into
the system by leveraging the work-conserving feature. In practice, the
system may run faster than he expects, but he won't know how much
faster unless we provide a theoretical guarantee.

>
> What do you think?

IMHO, handling the other real-time features related to the scheduler
may be more important than the work-conserving feature, in order to
make the scheduler more readily adopted in the embedded world.

>
>> If I understand correctly, the status in MAINTAINERS file should have
>> the highest priority and information from other sources should keep
>> updated with what the MAINTAINERS file says?
>>
>> Please correct me if I'm wrong.
>>
> This has been discussed before. Have a look at this thread/messages.
>
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg00972.html
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01775.html

I remembered this. Always keep an eye on ARINC653 as well. :-)

>
> And at this:
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01992.html

Yes. I read this before I asked. :-)

>
> The feature document template has been put together:
> http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01929.html

This is great!

>
> And there are feature documents in tree already.
I see.

>
> Actually, writing one for RTDS would be a rather interesting and useful
> thing to do, IMO! :-)

Agree. I can do it in the summer.  Put it on my TODO list now.

Thanks,

Meng

---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26  8:56   ` Andrew Cooper
@ 2016-04-26 18:41     ` Meng Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Meng Xu @ 2016-04-26 18:41 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, xen-devel, Dario Faggioli

On Tue, Apr 26, 2016 at 4:56 AM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
>
>>> However, inside MAINTAINERS file, the status of RTDS scheduler is
>>> marked as Supported (refer to commit point 28041371 by Dario Faggioli
>>> on 2015-06-25).
>>>
>> There's indeed a discrepancy between the way one can read that bit of
>> MAINTAINERS, and what is generally considered Supported (e.g., subject
>> to security support, etc).
>>
>> This is true in general, not only for RTDS (more about this below).
>
> The purpose of starting the feature docs (in docs/features/) was to
> identify the technical status of a feature, along side some
> documentation pertinent to its use.
>
> I am tempted to suggest a requirement of "no security support without a
> feature doc" for new features, in an effort to resolve the current
> uncertainty as to what is supported and what is not.

I see. As I said in my reply to Dario, I will add a feature doc about
the RTDS scheduler in the summer.

>
> As for the MAINTAINERS file, supported has a different meaning.  From
> the file itself,

Right. I read this doc before asking. :-)

>
> Descriptions of section entries:
>
> M: Mail patches to: FullName <address@domain>
> L: Mailing list that is relevant to this area
> W: Web-page with status/info
> T: SCM tree type and location.  Type is one of: git, hg, quilt, stgit.
> S: Status, one of the following:
>            Supported:   Someone is actually paid to look after this.
>            Maintained:  Someone actually looks after it.
>            Odd Fixes:   It has a maintainer but they don't have time to do
>             much other than throw the odd patch in. See below..
>            Orphan:      No current maintainer [but maybe you could take the
>                     role as you write your new code].
>            Obsolete:    Old code. Something tagged obsolete generally means
>             it has been replaced by a better system and you
>                         should be using that.
>
> Nothing in the MAINTAINERS file constitutes a security statement.

I didn't realize this before.

Thank you very much for clarification!

Meng

-- 
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26 15:35   ` George Dunlap
@ 2016-04-26 20:00     ` Meng Xu
  2016-04-26 23:01       ` Dario Faggioli
  2016-04-26 22:38     ` Dario Faggioli
  1 sibling, 1 reply; 14+ messages in thread
From: Meng Xu @ 2016-04-26 20:00 UTC (permalink / raw)
  To: George Dunlap; +Cc: George Dunlap, xen-devel, Dario Faggioli

>> The feature document template has been put together:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01929.html
>>
>> And there are feature documents in tree already.
>>
>> Actually, writing one for RTDS would be a rather interesting and useful
>> thing to do, IMO! :-)
>
> I think it would be helpful to try to spell out what we think are the
> criteria for marking RTDS non-experimental.  Reading your e-mail, Dario,
> I might infer the following criteria:
>
> 1. New event-driven code spends most of a full release cycle in the tree
> being tested
> 2. Better tests in osstest (which ones?)
> 3. A feature doc

I agree with the above three items.

> 4. A work-conserving mode

I think we need to consider item 4 carefully. A work-conserving mode
is not a must for real-time schedulers, and it is not the main
purpose/goal of the RTDS scheduler.

>
> #3 definitely sounds like a good idea.  #1 is probably reasonable.
>
> I don't think #4 should be a blocker; we have plenty of work-conserving
> schedulers. :-)

Exactly. Actually, work conservation is not a top priority for
real-time applications. Resource-sharing issues, which interact with
the scheduler, are more important than the work-conserving "issue" for
complex, non-independent real-time applications.

>
> Regarding #2, did you have specific tests in mind?

I've been thinking about how to confirm the correctness of (RTDS)
schedulers. It is actually quite challenging to prove the scheduler is
correct.

I'm wondering what the goal of the tests is. It will determine how the
scheduler should be tested, IMHO.
There are three possible goals, in increasing order of difficulty:
(1) make sure the scheduler won't crash the system, or
(2) make sure the performance of the scheduler is correct, or
(3) prove the scheduler is correct.

Which one are we talking about here? (maybe item 1?)

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26 15:35   ` George Dunlap
  2016-04-26 20:00     ` Meng Xu
@ 2016-04-26 22:38     ` Dario Faggioli
  1 sibling, 0 replies; 14+ messages in thread
From: Dario Faggioli @ 2016-04-26 22:38 UTC (permalink / raw)
  To: George Dunlap, Meng Xu, xen-devel; +Cc: George Dunlap


On Tue, 2016-04-26 at 16:35 +0100, George Dunlap wrote:
> On 26/04/16 08:56, Dario Faggioli wrote:
> > 
> > On Mon, 2016-04-25 at 21:44 -0400, Meng Xu wrote:
> > > 
> > Actually, writing one for RTDS would be a rather interesting and
> > useful
> > thing to do, IMO! :-)
> I think it would be helpful to try to spell out what we think are the
> criteria for marking RTDS non-experimental.
>
Indeed.

>   Reading your e-mail, Dario,
> I might infer the following criteria:
> 
Thanks for this :-)

> 1. New event-driven code spends most of a full release cycle in the
> tree
> being tested
> 2. Better tests in osstest (which ones?)
> 3. A feature doc
> 4. A work-conserving mode
> 
> Is that about right?
> 
I think it is.

> #3 definitely sounds like a good idea.  #1 is probably reasonable.
> 
Good that we agree on this.

> I don't think #4 should be a blocker; we have plenty of work-
> conserving
> schedulers. :-)
> 
I am not absolutely sure about this either.

We do have work conserving schedulers, and one can partition the system
in cpupools and assign each VM to the one that best suits its needs.

Yet, think of someone wanting to boot Xen with "sched=rtds". This may
be someone wanting to play with/try the RTDS scheduler, it could be our
OSSTest jobs (the ones that test RTDS), or it could be someone
with a small enough system that partitioning it with cpupools is not
desirable.

What this 'someone' would get is a dom0 that only has
(4*nr_dom0_vcpus)% CPU capacity available. If (let's assume we are in
the small system case, which as a matter of fact is also the case of
some of our test jobs) dom0 has 2 vcpus, this means 8% CPU total
capacity. The rest 92% of the time, the CPUs will just stay idle.
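
To make the arithmetic above concrete, here is a minimal sketch in C. It
assumes, as in the paragraph above, a fixed 4% share per dom0 vcpu; the
macro and function names are made up for illustration and are not taken
from Xen's sources.

```c
#include <assert.h>

/*
 * Assumed per-vcpu share of CPU capacity, in percent. The value 4 comes
 * from the "(4*nr_dom0_vcpus)%" estimate above, not from Xen's code.
 */
#define ASSUMED_VCPU_SHARE_PCT 4u

/* Capacity reserved for a domain with nr_vcpus vcpus, in percent. */
unsigned int reserved_pct(unsigned int nr_vcpus)
{
    /* The sketch only makes sense while the total stays within 100%. */
    assert(nr_vcpus * ASSUMED_VCPU_SHARE_PCT <= 100u);
    return nr_vcpus * ASSUMED_VCPU_SHARE_PCT;
}

/*
 * Capacity left unused when this domain is the only busy one and the
 * scheduler is not work conserving, in percent.
 */
unsigned int idle_pct(unsigned int nr_vcpus)
{
    return 100u - reserved_pct(nr_vcpus);
}
```

For a dom0 with 2 vcpus, reserved_pct(2) gives 8 and idle_pct(2) gives
92, which are the figures used above.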

Let's assume that our 'someone' tries doing a local migration (OSSTest
does that), or connecting with SSH and/or copying some medium to large
files with rsync. What would happen (and in fact, this is what happens
to OSSTest, as far as I can tell) is that things will time out all the
time, and migrations, sessions and file transfers will be incredibly slow.

And therefore, our dear 'someone' would, IMO, just turn away and look
at something else. Or will email xen-devel reporting a bug about
migration being slow, or connections timing out on Xen.

Increasing the reservation for dom0 (maybe even by default) would
certainly help mitigate this, but at the cost of having less
bandwidth available to be guaranteed to actual guests, which is
certainly not desirable.

Of course, the same exact scenario just described, applies even if the
system is fully booked by guest domains, but all or most of them are
idle. There will again be a lot of idle time, while a couple of domains
(in the example at hand, just dom0) struggle to get done all they're
asked to do in their 8%! :-O

A work-conserving mode, selectable together with the other scheduling
parameters (and maybe enabled by default for dom0, with a dedicated
boot parameter to change that) would, in my opinion, provide a
more than decent solution, in a very simple way. It's not the perfect
solution. Probably not even the best one. There are more advanced
techniques (like adaptive reservations, and others) but they all come
at a high price in terms of development and maintenance effort.

So, yeah, I'm not entirely sure yet, but I think a work conserving mode
could be very useful and should be regarded as important...
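
As a rough sketch of what such a mode could look like (an illustration
only, not the RTDS code): vcpus that still have budget are picked
earliest-deadline-first, and only when none of them is runnable does a
vcpu that has run out of budget, but has a hypothetical work_conserving
flag set, get the CPU instead of leaving it idle. The struct, its fields
and the function name are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified vcpu record; NOT Xen's actual data structure. */
struct sketch_vcpu {
    uint64_t deadline;     /* absolute deadline of the current period */
    uint64_t budget_left;  /* budget remaining in the current period */
    bool runnable;         /* has work to do right now */
    bool work_conserving;  /* hypothetical per-vcpu opt-in flag */
};

/*
 * Return the index of the vcpu to run next, or -1 to leave the CPU idle.
 * Vcpus with budget always win, so they are never delayed by the
 * work-conserving fallback in this simplified model.
 */
int sketch_pick_next(const struct sketch_vcpu *v, size_t n)
{
    int best = -1;      /* earliest deadline among vcpus with budget */
    int fallback = -1;  /* earliest deadline among depleted, opted-in vcpus */

    assert(v != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!v[i].runnable)
            continue;
        if (v[i].budget_left > 0) {
            if (best < 0 || v[i].deadline < v[best].deadline)
                best = (int)i;
        } else if (v[i].work_conserving) {
            /* Ordering the fallback by deadline is an arbitrary choice. */
            if (fallback < 0 || v[i].deadline < v[fallback].deadline)
                fallback = (int)i;
        }
    }

    return best >= 0 ? best : fallback;
}
```

Without the flag, a depleted vcpu waits for its next period even if the
CPU has nothing else to run, which is the situation described above for
dom0.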

> Regarding #2, did you have specific tests in mind?
> 
Nothing too advanced. This is a special-purpose scheduler and needs to
be tested by people who actually need it, on their own workloads.

Still, there's a test case stressing the cpupools code, which also
involves this scheduler, that I've been working on, on and off, for a
while now. I think it would be really useful (and in fact, I want to
finish it).

I also want to add a test (not sure how yet: a new job, a phase of a
job, etc.) that plays with the scheduling parameters of all schedulers
(weights for Credit 1 and 2, budget and period for this one).

There's not much else that I can think of, but this would be already
quite a bit better than now.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26 18:38   ` Meng Xu
@ 2016-04-26 22:49     ` Dario Faggioli
  2016-04-27  0:02       ` Meng Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Dario Faggioli @ 2016-04-26 22:49 UTC (permalink / raw)
  To: Meng Xu; +Cc: George Dunlap, xen-devel



On Tue, 2016-04-26 at 14:38 -0400, Meng Xu wrote:
> > So, yes, the scheduler is now feature complete (with the per-vcpu
> > parameters) and adheres to a much more sensible and scalable design
> > (event driven). Yet, these features have been merged very recently,
> > therefore, when you say "tested", I'm not so sure I agree. In fact,
> > we
> > do test it on OSSTest, but only in a couple of tests. The
> > combination
> > of these two things make me think that we should allow for at least
> > another development cycle, before considering switching.
> I see. So should we mark it as Completed for Xen 4.7? or should we
> wait until Xen 4.8 to mark it as Completed if nothing bad happens to
> the scheduler?
> 
We should define the criteria. :-)

In any case, not earlier than 4.8, IMO.

> > And even in that case, I wonder how we should handle such a
> > situation... I was thinking of adding a work-conserving mode, what
> > do
> > you think?
> Hmm, I can get why work-conserving mode is necessary and useful. I'm
> thinking about the tradeoff  between the scheduler's complexity and
> the benefit brought by introducing complexity.
> 
> The work-conserving mode is useful. However, there are other real
> time
> features in terms of the scheduler that may be also useful. For
> example, I heard from some company that they want to run RT VM with
> non-RT VM, which is supported in RT-Xen 2.1 version, but not
> supported
> in RTDS.
> 
I remember that, but I'm not sure what "running a non-RT VM" inside
RTDS would mean. According to what algorithm would these non-real-time
VMs be scheduled?

Since you mentioned complexity, adding a work-conserving mode should be
easy enough, and if you allow a VM to be in work-conserving mode and to
have a very small (or even zero) budget, there you have a non-real-time
VM.

> There are other RT-related issues we may need to solve to make it
> more
> suitable for real-time or embedded field, such as protocols to handle
> the shared resource.
> 
> Since the scheduler aims for the embedded and real-time applications,
> those RT-related features seems to me more important than the
> work-conserving feature.
> 
> What do you think?
> 
There always will be new/other features... But that's not the point.

What we need here is to agree on the _minimum_ set of them that
allows us to call the scheduler complete and usable. I think we're
pretty close, with this work-conserving mode I'm talking about being
the only candidate I can think of.

> > 
> > You may have something similar in RT-Xen already but, even
> > if you don't, there are a number of ways for achieving that without
> > disrupting the real-time guarantees.
> Actually, in RT-Xen, we don't have the work-conserving version yet.
>
Yeah, sorry, I probably was confusing it with the "RT / non-RT" flag.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26 20:00     ` Meng Xu
@ 2016-04-26 23:01       ` Dario Faggioli
  2016-04-27  1:16         ` Meng Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Dario Faggioli @ 2016-04-26 23:01 UTC (permalink / raw)
  To: Meng Xu, George Dunlap; +Cc: George Dunlap, xen-devel


On Tue, 2016-04-26 at 16:00 -0400, Meng Xu wrote:
> > 
> > > 
> > > The feature document template has been put together:
> > > http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01
> > > 929.html
> > > 
> > > And there are feature documents in tree already.
> > > 
> > > Actually, writing one for RTDS would be a rather interesting and
> > > useful
> > > thing to do, IMO! :-)
> > I think it would be helpful to try to spell out what we think are
> > the
> > criteria for marking RTDS non-experimental.  Reading your e-mail,
> > Dario,
> > I might infer the following criteria:
> > 
> > 1. New event-driven code spends most of a full release cycle in the
> > tree
> > being tested
> > 2. Better tests in osstest (which ones?)
> > 3. A feature doc
> I agree with the above three items.
> 
Great!

> > 4. A work-conserving mode
> I think we need to consider the item 4 carefully. Work-conserving
> mode
> is not a must for real-time schedulers and it is not the main
> purpose/goal of the RTDS scheduler.
> 
It's indeed not a must for real-time schedulers. In fact, it's only
important if one wants the system to be overall usable, when using a
real-time scheduler. :-P

Also, I may be wrong but it should not be too hard to implement...
I.e., a win-win. :-)

> > Regarding #2, did you have specific tests in mind?
> I've been thinking about how to confirm the correctness of (RTDS)
> schedulers. It is actually quite challenging to prove the scheduler
> is
> correct.
> 
> I'm thinking what the goal of the tests is? It will determine how the
> scheduler should be tested, IMHO.
> There are three possible goals in increasing difficulty:
> (1) Make sure the scheduler won't crash the system, or
> (2) make sure the performance of the scheduler is correct, or
> (3) prove the scheduler is correct?
> 
> Which one are we talking about here? (maybe item 1?)
> 
Definitely 1, with all the security related implications that apply
to it (e.g., if a guest running within RTDS could crash or DoS the
entire host, that would be a security issue).

Good performance is important, but we'd need to define what 'good
performance' means. And since this is the only (online) real-time
scheduler we have, maybe that is not really necessary for now (e.g.,
'good performance', as compared to what?).

Assessing correctness of the actual schedule is interesting, but beyond
the scope of what I'd call 'supported status'. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26 22:49     ` Dario Faggioli
@ 2016-04-27  0:02       ` Meng Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Meng Xu @ 2016-04-27  0:02 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel

On Tue, Apr 26, 2016 at 6:49 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Tue, 2016-04-26 at 14:38 -0400, Meng Xu wrote:
>> > So, yes, the scheduler is now feature complete (with the per-vcpu
>> > parameters) and adheres to a much more sensible and scalable design
>> > (event driven). Yet, these features have been merged very recently,
>> > therefore, when you say "tested", I'm not so sure I agree. In fact,
>> > we
>> > do test it on OSSTest, but only in a couple of tests. The
>> > combination
>> > of these two things make me think that we should allow for at least
>> > another development cycle, before considering switching.
>> I see. So should we mark it as Completed for Xen 4.7? or should we
>> wait until Xen 4.8 to mark it as Completed if nothing bad happens to
>> the scheduler?
>>
> We should define the criteria. :-)
>
> In any case, not earlier than 4.8, IMO.
>
>> > And even in that case, I wonder how we should handle such a
>> > situation... I was thinking of adding a work-conserving mode, what
>> > do
>> > you think?
>> Hmm, I can get why work-conserving mode is necessary and useful. I'm
>> thinking about the tradeoff  between the scheduler's complexity and
>> the benefit brought by introducing complexity.
>>
>> The work-conserving mode is useful. However, there are other real
>> time
>> features in terms of the scheduler that may be also useful. For
>> example, I heard from some company that they want to run RT VM with
>> non-RT VM, which is supported in RT-Xen 2.1 version, but not
>> supported
>> in RTDS.
>>
> I remember that, but I'm not sure what "running a non-RT VM" inside
> RTDS would mean. According to what algorithm these non real-time VMs
> would be scheduled?

A non-RT VM means a VM whose priority is lower than that of any RT VM.
The non-RT VMs won't get scheduled until all RT VMs have been scheduled.
We can use the same gEDF scheduling policy to schedule non-RT VMs.
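
The ordering described above can be sketched as a two-level comparison.
This is only an illustration under stated assumptions: the struct, its
fields and the function name are hypothetical, and it is not the RT-Xen
or RTDS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified VCPU record; not the real struct rt_vcpu. */
struct sketch_vcpu {
    bool is_rt;            /* true for a real-time VCPU */
    uint64_t cur_deadline; /* absolute deadline of the current period */
};

/*
 * Return true if a should run before b: any RT VCPU beats any non-RT
 * VCPU; within the same class the earlier deadline wins (EDF).
 */
bool sketch_runs_before(const struct sketch_vcpu *a,
                        const struct sketch_vcpu *b)
{
    assert(a && b);
    if (a->is_rt != b->is_rt)
        return a->is_rt;
    return a->cur_deadline < b->cur_deadline;
}
```

With this comparison a non-RT VCPU with a very early deadline still
sorts behind every RT VCPU.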

>
> Since you mentioned complexity, adding a work conserving mode should be
> easy enough, and if you allow a VM to be in work conserving mode, and
> have a very small (or even zero) budget, here you are a non real-time
> VM.

OK. I think it depends on what algorithm we want to use for the work
conserving mode. Do you have some algorithm in mind?

>
>> There are other RT-related issues we may need to solve to make it
>> more
>> suitable for real-time or embedded field, such as protocols to handle
>> the shared resource.
>>
>> Since the scheduler aims for the embedded and real-time applications,
>> those RT-related features seems to me more important than the
>> work-conserving feature.
>>
>> What do you think?
>>
> There always will be new/other features... But that's not the point.

OK.

>
> What we need, here, is agree on what is the _minimum_ set of them that
> allows us to call the scheduler complete and usable. I think we're
> pretty close, with this work conserving mode I'm talking about the only
> candidate I can think of.

So the point you raised here is that the work-conserving mode is
(probably) a must.

>
>> >
>> > You may have something similar in RT-Xen already but, even
>> > if you don't, there are a number of ways for achieving that without
>> > disrupting the real-time guarantees.
>> Actually, in RT-Xen, we don't have the work-conserving version yet.
>>
> Yeah, sorry, I probably was confusing it with the "RT / non-RT" flag.

I see. :-)

Best regards,

Meng

-- 
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-26 23:01       ` Dario Faggioli
@ 2016-04-27  1:16         ` Meng Xu
  2016-04-27 12:27           ` Dario Faggioli
  0 siblings, 1 reply; 14+ messages in thread
From: Meng Xu @ 2016-04-27  1:16 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel, George Dunlap

>
>> > 4. A work-conserving mode
>> I think we need to consider the item 4 carefully. Work-conserving
>> mode
>> is not a must for real-time schedulers and it is not the main
>> purpose/goal of the RTDS scheduler.
>>
> It's indeed not a must for real-time schedulers. In fact, it's only
> important if one wants the system to be overall usable, when using a
> real-time scheduler. :-P
>
> Also, I may be wrong but it should not be too hard to implement...
> I.e., a win-win. :-)

I'm wondering, if we want to implement a work-conserving policy in
RTDS, how we should allocate the unused resources to domains. Should
this allocation be proportional to the budget/period each domain is
configured with?
I guess the complexity totally depends on which work-conserving
algorithm we want to encode into RTDS.
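
As a worked example of the proportional option only (one possible
policy, not something the thread settles on): two domains with
budget/period of 2/10 and 6/10 have utilizations 0.2 and 0.6, which
leaves 0.2 of the CPU idle. Splitting that slack by utilization gives
0.2 * 0.2 / 0.8 = 0.05 and 0.2 * 0.6 / 0.8 = 0.15. The helper below is
hypothetical and uses floating point purely for readability.

```c
#include <assert.h>

/*
 * Hypothetical helper: portion of the idle CPU fraction `slack` handed
 * to one domain, weighted by its utilization (budget / period) relative
 * to the sum of all utilizations `total_util`.
 */
double slack_share(double budget, double period,
                   double total_util, double slack)
{
    assert(period > 0.0 && total_util > 0.0);
    return slack * (budget / period) / total_util;
}
```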

For example, we can have priority bands, so that when a VCPU depletes
its budget, it goes to the next lower priority band. A VCPU in a lower
priority band will not be scheduled until all VCPUs in the higher
priority bands have been scheduled.
This policy seems easy to incorporate into RTDS. (But I have to
think harder to make sure there is no catch... :-) )
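
A minimal sketch of the priority-band idea, under assumptions that are
not in the mail: the budget is refilled each time a VCPU drops one
band, the band is reset at every period boundary, and all names
(band_vcpu, band_charge, band_replenish, band_runs_before) are
hypothetical rather than taken from RTDS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified VCPU state; not the real struct rt_vcpu. */
struct band_vcpu {
    uint64_t period;       /* replenishment period */
    uint64_t budget;       /* budget granted per period */
    uint64_t cur_budget;   /* budget left in the current band */
    uint64_t cur_deadline; /* absolute deadline of the current period */
    unsigned int band;     /* 0 = highest priority band */
};

/*
 * Charge `ran` time units. On depletion the VCPU drops one band and
 * gets a fresh budget there (assumption); any overrun is discarded.
 */
void band_charge(struct band_vcpu *v, uint64_t ran)
{
    assert(v);
    if (ran < v->cur_budget) {
        v->cur_budget -= ran;
        return;
    }
    v->band++;
    v->cur_budget = v->budget;
}

/* Period boundary: back to the top band with a full budget. */
void band_replenish(struct band_vcpu *v, uint64_t now)
{
    assert(v);
    v->band = 0;
    v->cur_budget = v->budget;
    v->cur_deadline = now + v->period;
}

/* Lower band number runs first; EDF breaks ties within a band. */
bool band_runs_before(const struct band_vcpu *a, const struct band_vcpu *b)
{
    assert(a && b);
    if (a->band != b->band)
        return a->band < b->band;
    return a->cur_deadline < b->cur_deadline;
}
```

Because a demoted VCPU only sorts behind VCPUs in higher bands, it can
still use CPU time that would otherwise be idle.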

Best,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-27  1:16         ` Meng Xu
@ 2016-04-27 12:27           ` Dario Faggioli
  2016-04-27 20:04             ` Meng Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Dario Faggioli @ 2016-04-27 12:27 UTC (permalink / raw)
  To: Meng Xu; +Cc: George Dunlap, xen-devel, George Dunlap

On Tue, 2016-04-26 at 21:16 -0400, Meng Xu wrote:
> > It's indeed not a must for real-time schedulers. In fact, it's only
> > important if one wants the system to be overall usable, when using
> > a
> > real-time scheduler. :-P
> > 
> > Also, I may be wrong but it should not be too hard to implement...
> > I.e., a win-win. :-)
> I'm thinking if we want to implement work-conserving policy in RTDS,
> how should we allocate the unused resource to domains. Should this
> allocation be proportional to the budget/period each domain is
> configured with?
> I guess the complexity totally depends on which work-conserving
> algorithm we want to encode into RTDS.
> 
Indeed it does.

Everything works for me, basically. As you say, it would not be a
critical aspect of this scheduler, and the implementation details of
the work conserving mode are not going to be the reason why people
choose it anyway... It's just to avoid people running away from it
(and from Xen) screaming! :-)

So, for instance, how do you manage non real-time VMs in RT Xen? You
say you still use EDF, how do you do that? When does one non real-time
VM preempt another non real-time VM? (Ideally, I'd go and check the RT-
Xen code that does this myself, but right now, I can't, sorry.)

We could go with what you already have, and as soon as a VM
exhausts its budget, we demote it to non real-time, until it receives
the replenishment. Or something like that.

In this case, we basically get two features at the cost of one (support
for non real-time VMs and work conserving mode for real-time VMs). Not
to mention that you basically have the code already, and "only" need to
upstream it! :-DD

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: Should we mark RTDS as supported feature from experimental feature?
  2016-04-27 12:27           ` Dario Faggioli
@ 2016-04-27 20:04             ` Meng Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Meng Xu @ 2016-04-27 20:04 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel, George Dunlap

On Wed, Apr 27, 2016 at 8:27 AM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> On Tue, 2016-04-26 at 21:16 -0400, Meng Xu wrote:
>> > It's indeed not a must for real-time schedulers. In fact, it's only
>> > important if one wants the system to be overall usable, when using
>> > a
>> > real-time scheduler. :-P
>> >
>> > Also, I may be wrong but it should not be too hard to implement...
>> > I.e., a win-win. :-)
>> I'm thinking if we want to implement work-conserving policy in RTDS,
>> how should we allocate the unused resource to domains. Should this
>> allocation be proportional to the budget/period each domain is
>> configured with?
>> I guess the complexity totally depends on which work-conserving
>> algorithm we want to encode into RTDS.
>>
> Indeed it does.
>
> Everything works for me, basically. As you say, it would not be a
> critical aspect of this scheduler, and the implementation details of
> the work conserving mode is not going to be the reason why people
> choose it anyway... It's just to avoid that people runs away from it
> (and from Xen) screaming! :-)

I see. Right! This is a good point.

>
> So, for instance, how do you manage non real-time VMs in RT Xen?

RT-Xen is not work-conserving right now. The way we handle non-RT
VMs in RT-Xen 2.1 (not the latest version) is that we use another bit
in rt_vcpu to indicate whether a VCPU is RT or not.

The non-RT VCPUs always have lower priority than the RT VCPUs.

> You
> say you still use EDF, how do you do that?

When all RT VCPUs have depleted their budgets, the non-RT VCPUs are
scheduled by the gEDF scheduling policy.

> When does one non real-time
> VM preempt another non real-time VM? (Ideally, I'd go and check the RT-
> Xen code that does this myself, but right now, I can't, sorry.)

The non-RT VCPU cannot be scheduled if any RT VCPU still has budget.
Once non-RT VCPUs are scheduled, they are preempted/scheduled based on
gEDF, since a non-RT VCPU also has budget and period parameters.
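
The rule above can be sketched as a pick-next function over a flat
array of runnable VCPUs. This is a simplified illustration, not the
RT-Xen 2.1 code: the struct, its fields and q_pick_next are
hypothetical, and a real run queue would be kept sorted rather than
scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runnable-VCPU entry; not the real struct rt_vcpu. */
struct q_vcpu {
    bool is_rt;            /* the extra RT / non-RT bit */
    uint64_t cur_budget;   /* remaining budget in this period */
    uint64_t cur_deadline; /* absolute deadline of this period */
};

/* Return the index of the VCPU to run next, or -1 if none has budget. */
int q_pick_next(const struct q_vcpu *q, size_t n)
{
    int best = -1;

    assert(q || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (q[i].cur_budget == 0)
            continue;                  /* depleted: not eligible */
        if (best < 0) {
            best = (int)i;
            continue;
        }
        if (q[i].is_rt != q[best].is_rt) {
            if (q[i].is_rt)
                best = (int)i;         /* RT always beats non-RT */
        } else if (q[i].cur_deadline < q[best].cur_deadline) {
            best = (int)i;             /* EDF within the same class */
        }
    }
    return best;
}
```

A non-RT VCPU is returned only when no RT VCPU with remaining budget is
present in the array.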

>
> We could go for this that you have already, and as soon as a VM
> exhausts its budget, we demote it to non real-time, until it receives
> the replenishment. Or something like that.

Right. To make it work-conserving, we will have to keep decreasing a
VCPU's priority whenever it runs out of budget at that priority, until
there is no idle resource left in the system.

>
> In this case, we basically get two features at the cost of one (support
> for non real-time VMs and work conserving mode for real-time VMs). Not
> to mention that you basically have the code already, and "only" need to
> upstream it! :-DD
>

Right. That is true... Let me think about it and send out a design later.

Meng


-- 
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


Thread overview: 14 messages
2016-04-26  1:44 Should we mark RTDS as supported feature from experimental feature? Meng Xu
2016-04-26  7:56 ` Dario Faggioli
2016-04-26  8:56   ` Andrew Cooper
2016-04-26 18:41     ` Meng Xu
2016-04-26 15:35   ` George Dunlap
2016-04-26 20:00     ` Meng Xu
2016-04-26 23:01       ` Dario Faggioli
2016-04-27  1:16         ` Meng Xu
2016-04-27 12:27           ` Dario Faggioli
2016-04-27 20:04             ` Meng Xu
2016-04-26 22:38     ` Dario Faggioli
2016-04-26 18:38   ` Meng Xu
2016-04-26 22:49     ` Dario Faggioli
2016-04-27  0:02       ` Meng Xu
