On Mon, 2017-03-27 at 11:48 +0100, George Dunlap wrote:
> On Mon, Mar 27, 2017 at 11:31 AM, George Dunlap wrote:
> >
> > Would it be possible instead to have domain assignment, vcpu-add /
> > remove, pcpu remove, &c just fail (perhaps with -ENOSPC and/or
> > -EBUSY) if we ever reach a situation where |vcpus| > |pcpus|?
> >
> > Or, to fail as many operations *as possible* that would bring us
> > to that state, and use the `waitqueue` idea as a backup for
> > situations where we can't really avoid it?
>
> I suppose one reason is that it looks like a lot of the operations
> can't really fail -- insert_vcpu and deinit_pdata both return void,
> and the scheduler isn't even directly involved in setting the hard
> affinity, so it doesn't get a chance to object that, with the new
> hard affinity, there is nowhere to run the vcpu.
>
This is exactly how it is. The waitqueue handling is the most
complicated thing to deal with in this scheduler, and I expect it to
be completely useless, at least if the scheduler is used in the way we
think it should actually be used. *But* assuming that this will happen
100% of the time feels unrealistic to me, and a waitqueue was the best
fallback I could come up with.

As you say, there is a whole bunch of operations that the scheduler
just can't force to fail. E.g., I won't be able to forbid removing a
pCPU from a sched_null pool because it still has a vCPU assigned to
it, nor adding a domain (and hence its vCPUs) to such a pool if there
are not enough free pCPUs. :-/

> I don't want to wait to re-write the interfaces to get this
> scheduler in, so I suppose the waitqueue thing will have to do for
> now. :-)
>
Yep. :-D

Let me add that, FWIW, I've tested situations where a (Linux) VM with
4 vCPUs was in a null pool with 4 pCPUs and then, with all the vCPUs
running, I removed and re-added 3 of the 4 pCPUs of the pool. While I
agree that this should not be done, and that it carries a high risk of
confusing, stalling or deadlocking the guest kernel, nothing exploded.
Doing the same thing to dom0, for instance, proved to be a lot less
safe. :-)

What I certainly can do is add a warning when a vCPU hits the
waitqueue (a sketch of what that could look like follows, after the
signature). Chatty indeed, but we _do_ want to be a bit nasty to the
ones that misuse the scheduler... it's for their own good! :-P

Thanks and Regards,
Dario
--
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
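For illustration, here is a minimal, self-contained C sketch of the
"park on a waitqueue and warn" fallback discussed above. Everything in
it (toy_vcpu, null_assign_vcpu, pcpu_slot, and so on) is invented for
the example; it is not the actual sched_null code, just a toy model of
why an assignment hook that returns void needs a path that cannot fail.

/* Toy model only: NOT the real Xen sched_null implementation. */
#include <stdio.h>
#include <stddef.h>

#define NR_PCPUS 4

struct toy_vcpu {
    int id;
    struct toy_vcpu *wq_next;            /* link in the waitqueue */
};

static struct toy_vcpu *pcpu_slot[NR_PCPUS]; /* one vCPU per pCPU, or NULL */
static struct toy_vcpu *waitqueue;           /* vCPUs with no pCPU to run on */

/*
 * Give v a dedicated pCPU if one is free; park it on the waitqueue
 * otherwise. Like insert_vcpu in the hypervisor, this returns void:
 * the caller offers no way to refuse the vCPU, hence the fallback.
 */
static void null_assign_vcpu(struct toy_vcpu *v)
{
    for (int cpu = 0; cpu < NR_PCPUS; cpu++) {
        if (pcpu_slot[cpu] == NULL) {
            pcpu_slot[cpu] = v;
            printf("vCPU %d assigned to pCPU %d\n", v->id, cpu);
            return;
        }
    }
    /* No free pCPU: last resort. Be noisy, since whoever caused this
     * is misusing the scheduler and risks stalling the guest. */
    printf("WARNING: no free pCPU for vCPU %d, parking it on the waitqueue\n",
           v->id);
    v->wq_next = waitqueue;
    waitqueue = v;
}

/* A pCPU becoming free (e.g. its vCPU went away) immediately picks up
 * the first waiting vCPU, if there is one. */
static void null_free_pcpu(int cpu)
{
    pcpu_slot[cpu] = NULL;
    if (waitqueue != NULL) {
        struct toy_vcpu *v = waitqueue;
        waitqueue = v->wq_next;
        pcpu_slot[cpu] = v;
        printf("vCPU %d leaves the waitqueue for pCPU %d\n", v->id, cpu);
    }
}

int main(void)
{
    struct toy_vcpu vcpus[5] = { {0}, {1}, {2}, {3}, {4} };

    for (int i = 0; i < 5; i++)
        null_assign_vcpu(&vcpus[i]);     /* the 5th vCPU hits the waitqueue */

    null_free_pcpu(2);                   /* a waiting vCPU takes the slot */
    return 0;
}

The warning in null_assign_vcpu is the "chatty" part: in a correctly
sized pool it never fires, and when it does fire it points straight at
the configuration mistake.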