Notes on Generating Python signatures for QMP RPCs

From: John Snow @ 2022-01-26 18:58 UTC
  To: Markus Armbruster, Marc-André Lureau,
	Victor Toso de Carvalho, Andrea Bolognani
  Cc: qemu-devel

Hiya, I was experimenting with $subject and ran into a few points of
interest. This is basically an informal status report from me. I've
CC'd some of the usual suspects who care about SDKs and API design and
such.

This is just a list of some observations I had, so not everything
below is a question or an action item. Just sharing some notes.

(0) This experiment concerned generating signatures based on
introspection data, dynamically at runtime. In this environment type
hints are not required, as they are not actually used at runtime.
However, I added them anyway as an exercise for dynamic documentation
purposes. (i.e. `help(auto_generated_function)` showing type hints can
still be useful -- especially without access to QAPI doc blocks.)
Determining type information is also necessary for generating the
marshaling/unmarshaling functions to communicate with the server.
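
As a rough sketch of how a function generated at runtime can carry a
signature that `help()` will display: the names `make_command` and
`example-command` below are made up for illustration and are not part
of any existing library, and the sketch is synchronous for brevity.

```python
import inspect
from typing import Any, Callable, Dict


def make_command(name: str, signature: inspect.Signature) -> Callable[..., Dict[str, Any]]:
    """Build a function whose visible signature is `signature`."""
    def command(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        # bind() raises TypeError if the call does not match the signature.
        bound = signature.bind(*args, **kwargs)
        return {"execute": name, "arguments": dict(bound.arguments)}

    command.__name__ = name.replace("-", "_")
    # inspect.signature() (and therefore help()) honours __signature__.
    command.__signature__ = signature
    return command


# Hypothetical command with one required integer argument.
sig = inspect.Signature(
    [inspect.Parameter("arg", inspect.Parameter.POSITIONAL_OR_KEYWORD, annotation=int)],
    return_annotation=None,
)
example_command = make_command("example-command", sig)
```

Calling `help(example_command)` then shows the generated parameters
rather than `*args, **kwargs`.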

(1) QAPI types the return of many commands as an empty object. That's
literally what happens on the wire, and it makes sense in that if
these commands were ever to return anything, it is a "compatible
evolution" to include new fields in such an object. In Python, though,
this does not make much sense, as it is somewhat hard to annotate:

async def stop() -> Literal[{}]: ...

The more pythonic signature is:

async def stop() -> None: ...

I feel like it's spiritually equivalent, but I am aware it is a
distinct choice that is being made. It could theoretically interfere
with a choice made in QAPI later to explicitly return Null. I don't
think we'd do that, but it's still a choice of abstraction that
reduces the resolution of distinct return signatures.
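
A minimal sketch of that abstraction, assuming a hypothetical helper
`unwrap_return` that a generated binding would call on the "return"
value of a reply:

```python
from typing import Any, Optional


def unwrap_return(value: Any, returns_empty_object: bool) -> Optional[Any]:
    """Map the wire-level `{}` to None for commands typed as returning an empty object."""
    if returns_empty_object:
        # Any fields added later by a "compatible evolution" would be
        # dropped here; that is the loss of resolution described above.
        return None
    return value
```

The flag would come from the introspection data; commands with a real
return type pass through unchanged.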

(1.5) Do we have a formal definition for what we consider to be a
"compatible evolution" of the schema? I've got a fairly good idea, but
I am not sure how it's enforced. Is it just Markus being very
thorough? If we add more languages to the generator, we probably can't
burden Markus with knowing how to protect the compatibility of every
generator. We might need more assertions for invariants in the
generator itself ... but to protect "evolution", we need points of
reference to test against. Do we have anything for this? Do we need
one? Should I write a test?

(2) There are five commands that are exempted from returning an
object. qom-get is one. However, what I didn't really explicitly
realize is that this doesn't mean that only five commands don't return
an object -- we also actually allow for a list of objects, which
*many* commands use. There's no technical issue here, just an
observation. It is no problem at all to annotate Python commands as
"-> SomeReturnType" or "-> List[SomeDifferentReturnType]" or even "->
str:" as needed.

(3) Over the wire, the order of arguments to QMP commands is not
specified. In generating commands procedurally from introspection
data, I noticed that there are several commands in which "optional"
arguments precede "required" arguments. This means that function
generation in Python cannot match the stated order 1:1.

That's not a crisis per se. For generating functions, we can use a
stable sort to bubble-up the required arguments, leaving the optional
ones trailing. However, it does mean that depending on how the QAPI
schema is modified in the future, the argument order may change
between versions of a generated SDK. I'd like to avoid that, if I
can.

One trick I have available to me in Python is the ability to stipulate
that all (QAPI) "optional" arguments are keyword-only. This means that
optional parameters can be re-ordered arbitrarily without any breakage
in the generated Python API. The only remaining concern is if the
*mandatory* arguments are re-ordered.

(In fact, I could stipulate that ALL arguments in Python bindings are
keyword-only, but I think that's going overboard and hurts usability
and readability.)
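
A sketch of the stable sort plus keyword-only approach, using
`inspect.Signature`. The argument list is invented for illustration
and does not describe a real QMP command; `build_signature` is a
hypothetical helper name.

```python
import inspect
from typing import Any, List, Optional, Tuple

# (name, type, optional?) in schema order; note optional args come first here.
ARGS: List[Tuple[str, Any, bool]] = [
    ("node-name", str, True),
    ("device", str, False),
    ("size", int, False),
    ("read-only", bool, True),
]


def build_signature(args: List[Tuple[str, Any, bool]]) -> inspect.Signature:
    # sorted() is stable: required arguments keep their relative schema
    # order, and so do the optional ones.
    ordered = sorted(args, key=lambda a: a[2])
    params = []
    for name, anno, optional in ordered:
        params.append(inspect.Parameter(
            name.replace("-", "_"),
            kind=(inspect.Parameter.KEYWORD_ONLY if optional
                  else inspect.Parameter.POSITIONAL_OR_KEYWORD),
            default=None if optional else inspect.Parameter.empty,
            annotation=Optional[anno] if optional else anno,
        ))
    return inspect.Signature(params, return_annotation=None)


sig = build_signature(ARGS)
```

Because the optional parameters are keyword-only, shuffling them in
the schema changes nothing for callers; only a re-ordering of
`device` and `size` would be visible.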

Marc-Andre has mentioned this before, but it might be nice to actually
specify a canonical ordering of arguments for languages that require
such things, and to make sure that we do not break this ordering
without good reason.

(Of course, SDK stability is not fully possible, and if this
functionality is desired, then it's time to use libvirt, hint hint
hint! However, we can avoid pointless churn in generated code and make
it easier to use and experiment with.)

(4) StrOrNull is a tricky design problem.

In Python, generally, omitted arguments are typed like this:
async def example_command(arg: Optional[int] = None) -> None: ...

Most Python programmers would recognize that signature as meaning that
they can omit 'arg' and some hopefully good default will be chosen.
However, in QAPI we do have the case where "omitted" is distinct from
"explicitly provided null". This is ... a bit challenging to convey
semantically. Python does not offer the ability to tell "what kind of
None" it received; i.e. unlike our generated QMP marshalling
functions, we do not have a "has_arg" boolean we can inspect.

So how do we write a function signature that conveys the difference
between "omitted" and "explicitly nulled" ...?

One common trick in Python is to create a new sentinel singleton, and
name it something like "Default" or "Unspecified" or "Undefined". Many
programmers use the ellipsis `...` value for this purpose. Then, we
can check if a value was omitted (`...`) or explicitly provided
(`None`). It is very unlikely that these sentinels would organically
collide with user-provided values (unless the user was explicitly
trying to invoke the default behavior).
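
For illustration, the sentinel trick might look like the following.
`UNSPECIFIED`, `build_arguments` and `example_command` are
hypothetical names, and the function is synchronous and untyped to
keep the sketch short.

```python
from typing import Any, Dict


class _Unspecified:
    """Sentinel type; its single instance means 'argument omitted'."""

    def __repr__(self) -> str:
        return "UNSPECIFIED"


UNSPECIFIED = _Unspecified()


def build_arguments(**kwargs: Any) -> Dict[str, Any]:
    # Drop omitted arguments; keep None so it is sent as JSON null.
    return {k: v for k, v in kwargs.items() if v is not UNSPECIFIED}


def example_command(arg: Any = UNSPECIFIED) -> Dict[str, Any]:
    return build_arguments(arg=arg)
```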

However, `...` isn't usable as a type annotation, and using it as the
default value invalidates the typing of the field. As far as I can
tell, it CANNOT be typed, at least not before Python 3.10 (which adds
`types.EllipsisType`). We could create our own sentinel, but IMO, this
creates a much less readable signature:

async def example_command(arg: Union[int, qmp.Default] = qmp.Default) -> None: ...

This probably doesn't communicate "This parameter is actually
optional" to a casual Python programmer, so I think it's a dead end.

The last thing I can think of here is to instead introduce a special
sentinel that represents the explicit Null instead. We could use a
special Null() type that means "Explicitly send a null over the wire."

This value comes up fairly infrequently, so most signatures will
appear "Pythonic" and the jankiness will be confined to the few
commands that require it, e.g.

async def example_command(arg: Optional[Union[int, Null]] = None) -> None: ...

The above would imply an optional argument that can be omitted, can be
provided with an int, or can be provided with an explicit Null. I
think this is a good compromise.
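
A sketch of how marshalling could treat such a `Null` marker. The
`Null` class, `marshal` helper and "example-command" name are
assumptions made for illustration; the command returns the JSON text
instead of sending it, so the sketch stays synchronous and
self-contained.

```python
import json
from typing import Any, Dict, Optional, Union


class Null:
    """Hypothetical marker meaning 'explicitly send null over the wire'."""


def marshal(**kwargs: Any) -> Dict[str, Any]:
    out: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if value is None:
            continue  # Python None means "omitted": leave the key out
        out[key] = None if isinstance(value, Null) else value
    return out


def example_command(arg: Optional[Union[int, Null]] = None) -> str:
    message = {"execute": "example-command", "arguments": marshal(arg=arg)}
    return json.dumps(message)
```

With this, `example_command()` omits `arg`, `example_command(5)`
sends an integer, and `example_command(Null())` sends a JSON null.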

(5) Generating functions from introspection data is difficult because
all of the structures are anonymous. The base type for most objects
becomes `Dict[str, Any]` but this isn't very descriptive. For Python
3.8+, we can do a little better and use `Dict[Literal["name", "node"],
Any]` to help suggest what keys are valid, but we don't have access to
an in-line definition that pairs key names with values.

Python 3.8+ would allow us to use TypedDict, but those have to be
generated separately ... AND we still don't have a name for them, so
it'd be a bit silly to have a function like:

async def some_command(arg: Anon321) -> None: ...

That doesn't really tell me, the human, much of anything. The best
that could perhaps be done is to create type aliases based on the name
of the argument it is the data type for, like "ArgObject". It's a bit
messy. For now, I've just stuck with the boring `Dict[Literal[...],
Any]` definition.
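
For reference, the alias idea could be sketched with the functional
TypedDict syntax, deriving a name from the command and argument. The
`make_arg_type` helper, the naming scheme and the "example-command"
data are all hypothetical. Static type checkers generally won't
follow a TypedDict built at runtime like this, so the benefit is
limited to runtime documentation.

```python
from typing import Any, Dict, TypedDict  # TypedDict needs Python 3.8+


def make_arg_type(command: str, arg: str, members: Dict[str, Any]) -> Any:
    """Create a named TypedDict for an otherwise anonymous argument object."""
    # "example-command" + "options" -> "ExampleCommandOptions"
    name = "".join(part.capitalize() for part in f"{command}-{arg}".split("-"))
    return TypedDict(name, members)


ExampleCommandOptions = make_arg_type(
    "example-command", "options", {"name": str, "node": str}
)
```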

(6) Dealing with variants is hard. I didn't get a working
implementation for them within one day of hacking, so I stubbed them
out. There's no major blocker here, just reporting that I still have
to finish this part of the experiment. I'm pretty happy that Markus
simplified the union types we have, though. To my knowledge, I got
everything else working perfectly.

(7) I have no idea what to do about functions that "may not return".
The QGA stuff in particular, I believe, is prone to some weirdness
that violates the core principles of the QMP spec. Maybe we can add a
"NORETURN" feature flag to those commands in the schema so that
clients can be aware of which commands may break the expectation of
always getting an RPC reply?

(8) Thanks for reading. I'm still buried under my holiday inbox, but I
am trying like hell to catch up on everything. I know I missed a few
calls in which API design was discussed, and I apologize for that.
Please send me invitations using "to: jsnow@redhat.com" to ensure I do
not miss them. I am also frantically trying to clean up the Async QMP
project I was working on to have more mental bandwidth for other
tasks, but it's dragging on a bit longer than I had anticipated.
Please accept my apologies for being somewhat reclusive lately.

I'll (try to) send a status overview of the various projects I'm
working on later to help set priority and discuss with the community
what my goals are and what I'd like to do. I have an awful lot of code
I've built up in local branches that I would like to share, but I'm
already sending code upstream as fast as I can, so maybe I'll just do
an overview at some point and point to unfinished code/experiments so
it's at least not completely unwitnessed work.

I hope 2022 is treating you all well,
--John Snow
