From mboxrd@z Thu Jan  1 00:00:00 1970
From: Markus Armbruster
Subject: Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread
Date: Thu, 07 Sep 2017 14:59:28 +0200
Message-ID: <87inguaclr.fsf@dusky.pond.sub.org>
In-Reply-To: <20170907081341.GA23040@pxdev.xzpeter.org> (Peter Xu's message of "Thu, 7 Sep 2017 16:13:41 +0800")
References: <1503471071-2233-1-git-send-email-peterx@redhat.com> <20170829110357.GG3783@redhat.com> <20170906094846.GA2215@work-vm> <20170906104603.GK15510@redhat.com> <20170906104850.GB2215@work-vm> <20170906105414.GL15510@redhat.com> <20170906105704.GC2215@work-vm> <20170906110629.GM15510@redhat.com> <20170906113157.GD2215@work-vm> <20170906115428.GP15510@redhat.com> <20170907081341.GA23040@pxdev.xzpeter.org>
To: Peter Xu
Cc: "Daniel P. Berrange", Laurent Vivier, Fam Zheng, Juan Quintela, qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, "Dr. David Alan Gilbert", Paolo Bonzini, John Snow

Peter Xu writes:

> On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
>> On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert wrote:
>> > * Daniel P.
>> > Berrange (berrange@redhat.com) wrote:
>> > > This does imply that you need a separate monitor I/O processing, from the
>> > > command execution thread, but I see no need for all commands to suddenly
>> > > become async.  Just allowing interleaved replies is sufficient from the
>> > > POV of the protocol definition.  This interleaving is easy to handle from
>> > > the client POV - just requires a unique 'serial' in the request by the
>> > > client, that is copied into the reply by QEMU.
>> >
>> > OK, so for that we can just take Marc-André's syntax and call it 'id':
>> > https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
>> >
>> > then it's up to the caller to ensure those id's are unique.
>>
>> Libvirt has in fact generated a unique 'id' for every monitor command
>> since day 1 of supporting QMP.
>>
>> > I do worry about two things:
>> >   a) With this the caller doesn't really know which commands could be
>> >      in parallel - for example if we've got a recovery command that's
>> >      executed by this non-locking thread that's OK, we expect that
>> >      to be doable in parallel.  If in the future though we do
>> >      what you initially suggested and have a bunch of commands get
>> >      routed to the migration thread (say) then those would suddenly
>> >      operate in parallel with other commands that were previously
>> >      synchronous.
>>
>> We could still have an opt-in for async commands, e.g. default to executing
>> all commands in the main thread, unless the client issues an explicit
>> "make it async" command, to switch to allowing the migration thread to
>> process it async.
>>
>>   { "execute": "qmp_allow_async",
>>     "data": { "commands": [
>>       "migrate_cancel",
>>     ] } }
>>
>>   { "return": { "commands": [
>>       "migrate_cancel",
>>     ] } }
>>
>> The server response contains the subset of commands from the request
>> for which async is supported.
>>
>> That gives good negotiation ability going forward as we incrementally
>> support async on more commands.
>
> I think this goes back to the discussion on which design we'd like to
> choose.  IMHO the whole async idea plus the per-command-id is indeed
> cleaner and nicer, and I believe that can benefit not only libvirt,

The following may be a bit harsh in places.  I apologize in advance.  A
better writer than me wouldn't have to resort to that.

I've tried a few times to make my point that "async QMP" is neither
necessary nor sufficient for monitor availability, but apparently
without luck, since there's still talk as if it were.  I hope this
attempt will work.

> but also other QMP users.  The problem is, I have no idea how long
> it'll take to let us have such a feature - I believe that will require
> both QEMU and Libvirt to support it.  And it'll be a pity if the
> postcopy recovery cannot work only because we cannot guarantee a
> stable monitor.
>
> I'm curious whether there are other requirements (besides postcopy
> recovery) that would want an always-alive monitor to run some
> lock-free commands?  If there are, I'd be more inclined to first
> provide a work-around solution like "-qmp-lockfree", and we can
> provide a better solution afterwards when the whole async QMP work
> is ready.

Yes, there are other requirements for "async QMP", and no, "async QMP"
isn't a solution, but at best a part of a solution.

Before I talk about QMP requirements, I need to ask a whole raft of
questions, because so far this thread feels like dreaming up grand
designs with only superficial understanding of the subject matter.
Quite possibly because *my* understanding is superficial.  If yours
isn't, great!  Go answer my questions :)

The root problem is main loop hangs.  QMP monitor hangs are merely a
special case.

The main loop should not hang.  We've always violated that design
assumption in places, e.g.
in monitor commands that write to disk, and thus can hang indefinitely
with NFS.  Post-copy adds more violations, as Stefan pointed out.

I can't say whether solving the special case "QMP monitor hangs"
without also solving "main loop hangs" is useful.  A perfectly
available QMP monitor buys you nothing if it feeds a command queue
that isn't being emptied because its consumers all hang.

So, what exactly is going to drain the command queue?  If there's more
than one consumer, how exactly are commands from the queue dispatched
to the consumers?

What are the "no hang" guarantees (if any) and conditions for each of
these consumers?

We can have any number of QMP monitors today.  Would each of them feed
its own queue?  Would they all feed a shared queue?

How exactly is opt-in asynchronous to work?  Per QMP monitor?  Per
command?

What does it mean when an asynchronous command follows a synchronous
command in the same QMP monitor?  I would expect the synchronous
command to complete before the asynchronous command, because that's
what synchronous means, isn't it?

To keep your QMP monitor available, you then must not send synchronous
commands that can hang.  How can we determine whether a certain
synchronous command can hang?  Note that with opt-in async, *all*
commands are also synchronous commands.

In short, explain to me how exactly you plan to ensure that certain
QMP commands (such as post-copy recovery) can always "get through", in
the presence of multiple monitors, hanging main loop, hanging
synchronous commands, hanging whatever-else-can-now-hang-in-this-post-copy-world.

Now let's talk about QMP requirements.

Any addition to QMP must consider what exists already.

You may add more of the same.

You may generalize existing stuff.

You may change existing stuff if you have sufficient reason, subject
to backward compatibility constraints.

But attempts to add new ways to do the same old stuff without properly
integrating the existing ways are not going to fly.
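As an aside, Dan's interleaved-reply scheme quoted above is at least
simple on the client side.  Here's a minimal sketch (my own toy code,
not QEMU's or libvirt's; the class and method names are made up) of a
client that tags each request with a unique "id" and matches replies
back to requests regardless of arrival order:

```python
import itertools
import json

class QmpClient:
    """Toy QMP client demonstrating reply correlation via 'id'."""

    def __init__(self):
        self._serial = itertools.count(1)   # unique id generator
        self._pending = {}                  # id -> command name awaiting reply

    def request(self, command, arguments=None):
        """Build a request carrying a unique 'id'; returns the wire string."""
        req_id = next(self._serial)
        self._pending[req_id] = command
        req = {"execute": command, "id": req_id}
        if arguments:
            req["arguments"] = arguments
        return json.dumps(req)

    def on_reply(self, line):
        """Match a reply to its request via the echoed 'id'."""
        reply = json.loads(line)
        command = self._pending.pop(reply["id"])
        return command, reply.get("return")

client = QmpClient()
client.request("query-status")    # gets id 1
client.request("migrate_cancel")  # gets id 2
# Replies may arrive out of order; the echoed 'id' disambiguates:
cmd, _ = client.on_reply('{"return": {}, "id": 2}')
print(cmd)   # migrate_cancel
```

The point being: interleaving is cheap for the client; the hard part
remains what happens on the QEMU side of the socket.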
In particular, any new way to start some job, monitor and control it
while it lives, get notified about its state changes and so forth must
integrate the existing ways.  These include block jobs (probably the
most sophisticated of the lot), migration, dump-guest-memory, and
possibly more.  They all work the same way: synchronous command to
kick off the job, more synchronous commands to monitor and control,
events to notify.  They do differ in detail.

Asynchronous commands are a new way to do this.  When you only need to
be notified on "done", and don't need to monitor / control, they fit
the bill quite neatly.

However, we can't just ignore the cases where we need more than that!
For those, we want a single generic solution instead of the several ad
hoc solutions we have now.

If we add asynchronous commands *now*, and for simple cases only, we
add yet another special case for a future generic solution to
integrate.  I'm not going to let that happen.

I figure the closest to a generic solution we have is block jobs.
Perhaps a generic solution could be had by abstracting away the
"block" from "block jobs", leaving just "jobs".

Another approach is generalizing the asynchronous command proposal to
fully cover the not-so-simple cases.

If you'd rather make progress on monitor availability without cracking
the "jobs" problem, you're in luck!  Use your license to "add more of
the same": synchronous command to start a job, query to monitor, event
to notify.

If you insist on tying your monitor availability solution to
asynchronous commands, then I'm in luck!  I just found volunteers to
solve the "jobs" problem for me.
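For concreteness, the "add more of the same" recipe above (synchronous
start, synchronous query, event on completion) can be sketched in
miniature.  This is a toy model of mine, not QEMU code; the class,
command and event names are all illustrative:

```python
class Job:
    """Toy model of the existing QMP job pattern: block jobs, migration
    and dump-guest-memory all follow this start/query/notify shape."""

    def __init__(self, name, emit_event):
        self.name = name
        self.status = "created"
        self._emit = emit_event      # callback standing in for QMP events

    def start(self):
        # Synchronous "kick off" command: returns immediately.
        self.status = "running"
        return {"return": {}}

    def query(self):
        # Synchronous monitoring command: poll state at any time.
        return {"return": {"name": self.name, "status": self.status}}

    def conclude(self):
        # Called when the work finishes; fires the completion event.
        self.status = "concluded"
        self._emit({"event": "JOB_STATUS_CHANGE",
                    "data": {"id": self.name, "status": self.status}})

events = []
job = Job("postcopy-recovery", events.append)
job.start()
print(job.query()["return"]["status"])   # running
job.conclude()
print(events[0]["data"]["status"])       # concluded
```

Nothing here needs asynchronous commands: the client drives the job
with ordinary synchronous commands and listens for the event.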