From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60119) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yb2rP-0002Ls-Qn for qemu-devel@nongnu.org; Thu, 26 Mar 2015 04:09:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Yb2rL-0006PN-M6 for qemu-devel@nongnu.org; Thu, 26 Mar 2015 04:09:51 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50919) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yb2rL-0006Mc-70 for qemu-devel@nongnu.org; Thu, 26 Mar 2015 04:09:47 -0400 From: Markus Armbruster References: <1427227433-5030-1-git-send-email-eblake@redhat.com> <1427227433-5030-2-git-send-email-eblake@redhat.com> <87vbhpc4j7.fsf@blackfin.pond.sub.org> <55131689.8030107@redhat.com> Date: Thu, 26 Mar 2015 09:09:41 +0100 In-Reply-To: <55131689.8030107@redhat.com> (Eric Blake's message of "Wed, 25 Mar 2015 14:11:53 -0600") Message-ID: <87y4mk89iy.fsf@blackfin.pond.sub.org> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: [Qemu-devel] [PATCH v5 01/28] qapi: Document type-safety considerations List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake Cc: kwolf@redhat.com, famz@redhat.com, qemu-devel@nongnu.org, wenchaoqemu@gmail.com, lcapitulino@redhat.com Eric Blake writes: > On 03/25/2015 12:31 PM, Markus Armbruster wrote: >> Eric Blake writes: >> >>> Go into more details about the various types of valid expressions >>> in a qapi schema, including tweaks to document fixes being done >>> later in the current patch series. Also fix some stale and missing >>> documentation in the QMP specification. >>> >>> Signed-off-by: Eric Blake >>> >>> --- >>> >>> I'm not sure if it is okay to assert GPLv2+ licensing without an >>> explicit Copyright, but as I am not the original author, I don't >>> know who to attribute any original Copyright to. Advice? Should >>> I split the license addition to a separate patch? > > No thoughts to this question? Separate message. >>> >>> +There are six top-level expressions recognized by the parser: >>> +'include', 'command', 'type', 'enum', 'union', and 'event'. There are >>> +several built-in types, such as 'int' and 'str'; additionally, the >>> +top-level expressions can define complex types, enumeration types, and >>> +several flavors of union types. The 'command' expression can refer to >>> +existing types by name, or list an anonymous type as a dictionary. >> >> 'event' can do that, too. > > Yep. I can fix that on top if no respin is needed. > > >>> +Types, commands, and events share a common namespace. Therefore, >>> +generally speaking, type definitions should always use CamelCase for >>> +user-defined type names, while built-in types are lowercase. Type >>> +definitions should not end in 'Kind', as this namespace is used for >>> +creating implicit C enums for visiting union types. Command names, >>> +and field names within a type, should be all lower case with words >>> +separated by a hyphen. However, some existing older commands and >>> +complex types use underscore; when extending such expressions, >>> +consistency is preferred over blindly avoiding underscore. Event >>> +names should be ALL_CAPS with words separated by underscore. The >>> +special string '**' appears for some commands that manually perform >>> +their own type checking rather than relying on the type-safe code >>> +produced by the qapi code generators. >> >> The generator generates C identifiers pretty thoughtlessly, and is >> therefore prone to generate clashes. Fixable, but we've got bigger fish >> to fry, and this series is hefty enough as it is. Documenting the mess >> instead makes sense. > > And I actually DO tighten up the parser to flag some of these problems > later in the series. > > >>> The QAPI schema definitions can be modularized using the 'include' >>> directive: >>> >>> { 'include': 'path/to/file.json'} >> >> If you need to respin for some reason, insert a space before the closing >> brace. > > Sure; added to my on-top-or-respin pile. > > >>> >>> -A complex type is a dictionary containing a single key whose value is a >>> -dictionary. This corresponds to a struct in C or an Object in JSON. An >>> -example of a complex type is: >>> +Usage: { 'type': 'str', 'data': 'dict', '*base': 'complex-type-name' } >> >> Here, the reader needs to get that 'type' and 'data' and 'base' are >> literals, but 'str', 'dict' and 'complex-type-name' are placeholders. I >> like setting apart placeholders with typograhic conventions. If we want >> to do that, let's do it on top. > > I was trying to borrow from qapi syntax, in that the 'data' dictionary > uses '*name':'type' for a literal "name" coupled with a > value-of-that-type pair in the dictionary sent over the wire for > "arguments". As for a typographical convention, would either of these > be any easier to read? > > Usage: { 'type': 'STR', 'data': 'DICT', '*base': 'COMPLEX-TYPE-NAME' } > > or > > Usage: { 'type': string, 'data': dict [, 'base': complex-type-name] } Or Usage: { 'type': STRING, 'data': DICT [, 'base': COMPLEX-TYPE-NAME] } Makes the placeholders stand out. Unambiguous as long as we don't use ALL-CAPS for anything else outside quotes. Using QAPI's *-prefix instead of EBNF's brackets Usage: { 'type': STRING, 'data': DICT , '*base': COMPLEX-TYPE-NAME } would also work, but I feel the *-prefix needs to be explained earlier then. >> >> '*base' is interesting. It's a tacit use of the schema language's >> "'*name' means member 'name' is optional" rule on meta-schema-level. We >> can take care of that when/if we clarify placeholders. >> >>> + >>> +A complex type is a dictionary containing a single 'data' key whose >>> +value is a dictionary. This corresponds to a struct in C or an Object >>> +in JSON. Each value of the 'data' dictionary must be the name of a >>> +complex, enum, union, or built-in type, >> >> Are there any others? If not, we can simplify to >> >> Each value of the 'data' dictionary must be a type name > > Works for me, especially since the addition of 'alternate' later in this > series is also a type that works in this context. > >> >>> or a one-element array >>> +containing a type name. An example of a complex type is: >>> >>> { 'type': 'MyType', >>> 'data': { 'member1': 'str', 'member2': 'int', '*member3': 'str' } } >>> >>> -The use of '*' as a prefix to the name means the member is optional. >>> +The use of '*' as a prefix to the name means the member is optional in >>> +the corresponding QMP usage. >> >> Not just in QMP, also QGA and (internal) C interfaces. Since explaining >> that is tiresome, I'd be tempted to stick to the original sentence here. >> >> Preexisting: we sometimes say "QMP wire format", and sometimes simply >> "JSON". Again, we got bigger fish to fry. > > If I do anything for this, it will be a separate patch on top scrubbing > for all uses. Absolutely. >>> +Additionally, an implicit C enum NameKind is created, corresponding to >>> +the union Name, for accessing the various branches of the union. No >>> +branch of the union can be named 'max', as this would collide with the >>> +implicit enum. >> >> Aside: I guess this implicit enum is just like a normal QAPI enum, >> except it doesn't have a name in the schema. > > Implemented identically; and why I want to eventually add: > > { 'enum': 'Enum', 'data': [ 'one', 'two' ] } > { 'union': 'Name', 'discriminator': 'Enum', > 'data': { 'one': 'str', 'two': 'int' } } > { 'command': 'foo', 'data': 'Name' } > > to allow this on the wire: > > { "execute": "foo", "arguments": { "type": "one", "data": "string" } } > { "execute": "foo", "arguments": { "type": "two", "data": 1 } } > >>> { 'type': 'BlockdevCommonOptions', 'data': { 'readonly': 'bool' } } >>> { 'union': 'BlockdevOptions', >> >> Worth mentioning explicitly that the base type's members are common to >> all the union's cases? > > Sure. > > >>> +The final flavor of unions is an anonymous union. While the other two >>> +union types are always passed as a dictionary in the wire format, an >>> +anonymous union instead allows the direct use of different types in >>> +its place. As they aren't structured, they don't have any explicit >>> +discriminator but use the (QObject) data type of their value as an >>> +implicit discriminator. This means that they are restricted to using >>> +only one discriminator value per QObject type. For example, you cannot >>> +have two different complex types in an anonymous union, or two >>> +different integer types. >> >> This is the first mention of "QObject type". How it's related to JSON >> is left to the reader's deductive skills :) >> >> The QObject types are QTYPE_NONE, QTYPE_QINT, QTYPE_QSTRING, >> QTYPE_QDICT, QTYPE_QLIST, QTYPE_QFLOAT, QTYPE_QBOOL, QTYPE_QERROR. >> >> The connections JSON string - QTYPE_QSTRING, JSON object - QTYPE_QDICT, >> JSON array - QTYPE_QLIST and JSON boolean - QTYPE_QBOOL are obvious >> enough. >> >> If I remember correctly, our JSON parser chokes on the JSON keyword >> null. >> >> That leaves just JSON numbers - QTYPE_QINT or QTYPE_QFLOAT. Can an >> anonymous union have a separate case for each of the two? >> >> For completeness: on the QTYPE_ side, it leaves QTYPE_QERROR and >> QTYPE_NONE. QTYPE_QERROR is internal only, and will hopefully be gone >> soon. I can't see QTYPE_NONE objects being created anywhere. >> >> Should we explain this in terms of JSON types instead of QObject types? > > Yeah, makes sense. Sounds like my on-top patch is getting bigger. > >> >>> >>> Anonymous unions are declared using an empty dictionary as their >>> discriminator. >>> The discriminator values never appear on the wire, they are only used in the >>> @@ -200,23 +350,93 @@ This example allows using both of the >>> following example objects: >>> >>> === Commands === >>> >>> -Commands are defined by using a list containing three members. The first >>> -member is the command name, the second member is a dictionary containing >>> -arguments, and the third member is the return type. >>> +Usage: { 'command': 'str', '*data': 'dict-or-complex-type-name', >>> + '*returns': 'type', >> >> Shoudn't this be '*returns': 'dict-or-type-name'? > > Yep. Good catch. > >> >>> + '*gen': false, '*success-response': false } >>> >>> -An example command is: >>> +Commands are defined by using a dictionary containing several members, >>> +where three members are most common. The 'command' member is a >>> +mandatory string, and determines the "execute" value passed in a QMP >>> +command exchange. >>> + >>> +The 'data' member is optional; if absent, the command accepts an >>> +optional empty dictionary. >> >> Suggest: The 'data' member is optional, and defaults to {}. >> >>> + If present, it must be the string name of >>> +a complex type, a one-element array containing the name of a complex >>> +type, or a dictionary that declares an anonymous type with the same >>> +semantics as a 'type' expression, with one exception noted below when >>> +'gen' is used. The 'data' argument maps to the "arguments" dictionary >>> +passed in as part of a QMP command. >> >> Suggest to move the last sentence to the beginning of the paragaph. > > Good ideas. > >> >>> + >>> +The 'returns' member describes what will appear in the "return" field >>> +of a QMP reply on successful completion of a command. The member is >>> +optional from the command declaration; if absent, the "return" field >>> +will be an empty dictionary. If 'returns' is present, it must be the >>> +string name of a complex or built-in type, a one-element array >>> +containing the name of a complex or built-in type, or a dictionary >>> +that declares an anonymous type with the same semantics as a 'type' >>> +expression, with one exception noted below when 'gen' is used. >> >> Oh, 'gen' affect 'returns', too? Live and learn... > > Yes; thanks to 'qom-get'. > > >>> + >>> + { 'command': 'netdev_add', >>> + 'data': {'type': 'str', 'id': 'str', '*props': '**'}, >>> + 'gen': false } >> >> We use 'gen': 'no' until PATCH 18. > > Yeah. I debated about that one, whether to do all my docs to their > final state up front, or whether to doc existing practice and then tweak > it to be more specific later in the series as fixes were made. I guess > you can see which way I chose. And of course, if I do more followups on > top of the series, I no longer have all changes in one place. Oh well; > doc changes don't affect bisection :) > > (well, I did more docs later when creating 'alternate' in place of > anonymous union, but still limited to two patches) Not enough OCD to make you evolve the documentation in lockstep with the code 100% ;-) >>> + >>> +Normally, the QAPI schema is used to describe synchronous exchanges, >>> +where a response is expected. But in some cases, the action of a >>> +command is expected to change state in a way that a successful >>> +response is not possible (the command still returns a normal >>> +dictionary error on failure). When a successful reply is not >>> +possible, the command expression should include the optional key >>> +'success-response' with boolean value false. So far, only the >>> +qemu-guest-agent makes use of this field. >> >> I had no idea :) > > You said that last time :) > https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg04254.html I guess we're approaching the state where we'll have forgotten more about the QAPI schema language than the average QEMU developer needs to know... >>> >>> http://www.ietf.org/rfc/rfc4627.txt >> >> Upgrade to RFC 7159? > > Sure; wasn't even aware it was out there. Me neither, until I stumbled over it the other day. > /me reads it > > Looks like it added hints about objects being true dictionaries > (repeating a name leads to unpredictable results, you can't rely on > order being preserved in the dictionary by default), clarified that > arrays need not be homogeneous types; gave more details on the limits of > numbers, spoke more about consequences of encoding in other than UTF-8, > and required that escape sequences be normalized before comparing > strings for equality. None of those should really hurt our usage (well, > we DO document that our use of json-object for 'data' member is > special-cased in that order is significant to the generated C code but > not to the QMP wire format). > > >>> >>> The format of a success response is: >>> >>> -{ "return": json-object, "id": json-value } >>> +{ "return": json-entity, "id": json-value } >> >> Unlike the other json-FOOs we use, "entity" isn't defined in RFC4627. >> "value" is, and we already use json-value. What's the difference >> between the two? > > Umm, when I wrote that, I was thinking "id":json-value meant integer, so > I wanted something that was not an integer. But sure enough, json-value > is precisely the term I wanted to use: > > $ qemu-kvm -qmp stdio -nodefaults > {"QMP": {"version": {"qemu": {"micro": 90, "minor": 2, "major": 2}, > "package": " (qemu-2.3.0-0.1.rc0.fc21)"}, "capabilities": []}} > {"execute":"qmp_capabilities","id":[]} > {"return": {}, "id": []} > {"execute":"huh", "id":{"a":1,"b":true}} > {"id": {"a": 1, "b": true}, "error": {"class": "CommandNotFound", > "desc": "The command huh has not been found"}} > > So _I_ learned something (the id can be anything at all!) > > Looks like I have some wording improvements to add. > >> >>> >>> Where, >>> >>> -- The "return" member contains the command returned data, which is defined >>> - in a per-command basis or an empty json-object if the command does not >>> - return data >>> +- The "return" member contains the data returned by the command, which >>> + is defined on a per-command basis (usually a json-object or >>> + json-array of json-objects, but sometimes a json-value, json-string, >>> + or json-array of json-strings); it is an empty json-object if the >>> + command does not return data >> >> Err, aren't json-object, -string, -array all json-value? At least >> that's how the JSON RFC uses "value". > > Yep. > >> >> Massive improvement. We can always improve some more on top, thus: >> >> Reviewed-by: Markus Armbruster > > Thank you. If there's cause for respin, I can fold in some improvements > then; otherwise, I'm already working on putting my changes into a > followup patch. Followup patch works for me.