qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <michael.roth@amd.com>,
	Cleber Rosa <crosa@redhat.com>,
	qemu-devel@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>
Subject: Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
Date: Wed, 10 Feb 2021 12:31:48 -0500	[thread overview]
Message-ID: <b74f2441-8f53-6dbd-f6ed-35ad798a327c@redhat.com> (raw)
In-Reply-To: <87lfbxvcds.fsf@dusky.pond.sub.org>

On 2/9/21 4:06 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
>> On 2/5/21 8:42 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:

> https://www.python.org/dev/peps/pep-0484/#type-aliases
> 
>      Note that we recommend capitalizing alias names, since they
>      represent user-defined types, which (like user-defined classes) are
>      typically spelled that way.
> 
> I think this wants names like _Scalar, _NonScalar, _Value, TreeValue.
> 

Yeah, that seems like the only way to make it consistent and not have 
pylint yell at me. I will try to adhere to this in the future, but maybe 
pylint needs a bug report to make it complain in these cases, too.

[...]

>>
>> ... as long as you don't feel that's incorrect to do. We are free to
>> name those structures SchemaInfo but type _tree_to_qlit() in terms of
>> generic Dict objects, leaving us without a middle-abstract thing to name
>> at all.
>>
>> Based on your review of the "dummy types" patch, I'm going to assume
>> that's fine.
> 
> I guess it's okayish enough.  It still feels more complicated to me than
> it needs to be.
> 
> QAPISchemaGenIntrospectVisitor an abstract representation of "QObject
> with #if and comments" for each SchemaInfo.
> 
> This is not really a representation of SchemaInfo.  We can choose to
> name it that way regardless, if it helps, and we explain it properly.

In that: SchemaInfo do not have annotations, but we do. Our SchemaInfo 
objects here are in a kind of superposition in that we have not yet 
applied the if conditionals.

Still, I do think it is *very* helpful to name those instances after the 
SchemaInfo types, because that is absolutely the interface we are 
matching. The keys are not arbitrary. The types of the values associated 
with those keys are not arbitrary.

So, I am not sure how useful it is to make such a careful distinction. 
My instinct is "not very, especially for passers-by to this module."

> 
> Once we hand off the data to _tree_to_qlit(), we can't name it that way
> anymore, simply because _tree_to_qlit() treats it as the stupid
> recursive data structure it is, and doesn't need or want to know about
> SchemaInfo.
> 

Yes, this is fine: the data is being interpreted in a new semantic 
context. It keeps the mechanical type but loses the semantic 
information. That sounds normal to me.

"Why bother, then?"

Mostly for the notational benefit in the code BUILDING the objects. 
_tree_to_qlit is so generic you can barely describe it, but the objects 
we build to feed it are quite concrete and have names and definitions 
that can be referenced.

> I think I'd dispense with _DObject entirely, and use TreeValue
> throughout.  Yes, we'd use Any a bit more.  I doubt the additional
> complexity to *sometimes* use object instead is worthwhile.  This data

I have gotten rid of _DObject entirely in v5; though using "Any" 
everywhere doesn't seem like an obvious win to me, because I'd need to 
turn this:

_NonScalar = Union[Dict[str, _Stub], List[_Stub]]
[...]
SchemaInfo = Dict[str, object]
[...]

into this:

_JSONObject = Dict[str, _Stub]
_JSONArray = List[_Stub]
_NonScalar = Union[_JSONObject, _JSONArray]
[...]
SchemaInfo = _JSONObject
[...]

Which doesn't really strike me as any simpler or nicer on either the 
semantic or mechanical axes.

> structure is used only within this file.  It pretty much never changes
> (because JSON doesn't).  It's basically write-only in
> QAPISchemaGenIntrospectVisitor.  This means all the extra typing work

Write-only variables need typing! mypy will assume these objects are 
Dict[str, str] otherwise. We have to type them as something.

And the way I typed them ... is correct, and avoided having to name two 
more intermediary types.

> buys us is use of object instead of Any where it doesn't actually
> matter.
> 

Maybe so. Comes down to habits. My current belief is "Do not use Any if 
you do not have to." I did not have to, so I didn't.

> I would use a more telling name than TreeValue, though.  One that
> actually hints at the kind of value "representation of QObject with #if
> and comment".
> 

We discussed this on IRC; ultimately I wasn't convinced of the utility 
of naming the tree type "QObject" on the logic that if QLit is a 
QObject, a function named "QObject to QLit" didn't make sense to me anymore.

>>> Since each (sub-)tree represents a JSON value / QObject, possibly with
>>> annotations, I'd like to propose a few "obvious" (hahaha) names:
>>>
>>> * an unannotated QObject: QObject
>>>
>>> * an annotated QObject: AnnotatedQObject
>>>
>>> * a possibly annotated QObject: PossiblyAnnotatedQObject
>>>
>>>     Too long.  Rename QObject to BareQObject, then call this one QObject.
>>>
>>> This gives us:
>>>
>>>       _BareQObject = Union[None, str, bool, Dict[str, Any], List[Any]]
>>>       _AnnotatedQObject = Annotated[_QObject]
>>>       _QObject = Union[_BareQObject, _AnnotatedQObject]
>>>
>>> Feel free to replace QObject by JsonValue in these names if you like
>>> that better.  I think I'd slightly prefer JsonValue right now.
>>>

On IRC, We agreed to disagree on the semantic name and use the more 
mechanically suggestive JsonValue instead. I'll give that a spin.

(It's also kinda-sorta wrong, but everything has felt kinda-sorta wrong 
to me so far. Guess it's no better or worse.)

>>> Now back to _DObject:
>>>
>>>> (See patch 12/14 for A More Betterer Understanding of what _DObject is
>>>> used for. It will contribute to A Greater Understanding.)
>>>>
>>>> Anyway, to your questions;
>>>>
>>>> (1) _DObject was my shorthand garbage way of saying "This is a Python
>>>> Dict that represents a JSON object". Hence Dict-Object, "DObject". I
>>>> have also derisively called this a "dictly-typed" object at times.
>>>
>>> In the naming system I proposed, this is BareQDict, with an additional
>>> complication: we actually have two different types for the same thing,
>>> an anonymous one within _BareQObject, and a named one.
>>>
>>>> (2) Dict[str, Any] and Dict[str, object] are similar, but do have a
>>>
>>> The former is the anonymous one, the latter the named one.
>>>
>>
>> Kinda-sorta. I am talking about pure mypy here, and the differences
>> between typing two things this way.
>>
>> Though I think you're right: I used the "Any" form for the anonymous
>> type (inherent to the structure of a JSON compound type) and the
>> "object" form for the named forms (The SchemaInfo objects we build in
>> the visitors to pass to the generator later).
>>
>>>> semantic difference. I alluded to it by calling this a "(strict) alias";
>>>> which does not help you understand any of the following points:
>>>>
>>>> Whenever you use "Any", it basically turns off type-checking at that
>>>> boundary; it is the gradually typed boundary type. Avoid it whenever
>>>> reasonably possible.
>>>>
>>>> Example time:
>>>>
>>>>
>>>> def foo(thing: Any) -> None:
>>>>        print(thing.value)  # Sure, I guess! We'll believe you.
>>>>
>>>>
>>>> def foo(thing: object) -> None:
>>>>        print(thing.value)  # BZZT, Python object has no value prop.
>>>>
>>>>
>>>> Use "Any" when you really just cannot constrain the type, because you're
>>>> out of bourbon or you've decided to give in to the darkness inside your
>>>> heart.
>>>>
>>>> Use "object" when the type of the value actually doesn't matter, because
>>>> you are only passing it on to something else later that will do its own
>>>> type analysis or introspection on the object.
>>>>
>>>> For introspect.py, 'object' is actually a really good type when we can
>>>> use it, because we interrogate the type exhaustively upon receipt in
>>>> _tree_to_qlit.
>>>>
>>>>
>>>> That leaves one question you would almost assuredly ask as a followup:
>>>>
>>>> "Why didn't you use object for the stub type to begin with?"
>>>>
>>>> Let's say we define _stub as `object` instead, the Python object. When
>>>> _tree_to_qlit recurses on non-scalar structures, the held value there is
>>>> only known as "object" and not as str/bool/None, which causes a typing
>>>> error at that point.
>>>>
>>>> Moving the stub to "Any" tells mypy to ... not worry about what type we
>>>> actually passed here. I gave in to the darkness in my heart. It's just
>>>> too annoying without real recursion.
>>>
>>> May I have an abridged version of this in the comments?  It might look
>>> quaint in ten years, when we're all fluent in Python type annotations.
>>> But right now, at least some readers aren't, and they can use a bit of
>>> help.
>>>
>>
>> Yeah, I'm sympathetic to that.... though I'm not sure what to write or
>> where. I can add some reference points in the commit message, like this one:
>>
>> https://mypy.readthedocs.io/en/stable/dynamic_typing.html#any-vs-object
>>
>> maybe in conjunction with the named type aliases patch this is actually
>> sufficient?
> 
> I can see two solutions right now:
> 
> 1. Use Dict[str, Any] throughout
> 
>     All we need to explain is
> 
>     * What the data structure is about (JSON annotated with ifconds and
>       comments; got that, could use improvement perhaps)
> 
>     * Your work-around for the lack of recursive types (got that
>       already)
> 
>     * That the use of Any bypasses type static checking on use (shouldn't
>       be hard)
> 
>     * Where such uses are (I believe only in _tree_to_qlit(), were Any
>       can't be avoided anyway).
> 
> 2. Use Dict[str, object] where we can
> 
>     Now we get to explain a few more things:
> 
>     * Why we bother (to get stricter static type checks on use)
> 
>     * Where such uses are (I can't see any offhand)
> 
>     * Maybe also where we go from one static type to the other.
> 
> In either case, we also need to pick names that need no explanation, or
> explain them.
> 

"that need no explanation" (to whom?) Suspect this is impossible; 
there's gonna be explanations anyway.

--js



  reply	other threads:[~2021-02-10 17:41 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
2021-02-02 17:46 ` [PATCH v4 01/14] qapi/introspect.py: assert schema is not None John Snow
2021-02-02 17:46 ` [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes John Snow
2021-02-03 13:49   ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 03/14] qapi/introspect.py: add _gen_features helper John Snow
2021-02-02 17:46 ` [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse John Snow
2021-02-03 14:08   ` Markus Armbruster
2021-02-03 20:42     ` John Snow
2021-02-03 21:18       ` Eduardo Habkost
2021-02-04 15:06       ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 05/14] qapi/introspect.py: Unify return type of _make_tree() John Snow
2021-02-02 17:46 ` [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument John Snow
2021-02-03 14:23   ` Markus Armbruster
2021-02-03 21:21     ` John Snow
2021-02-04  8:37       ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing John Snow
2021-02-03 14:30   ` Markus Armbruster
2021-02-03 21:40     ` John Snow
2021-02-02 17:46 ` [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure John Snow
2021-02-03 14:47   ` Markus Armbruster
2021-02-03 21:50     ` Eduardo Habkost
2021-02-04 15:37       ` Markus Armbruster
2021-02-04 16:20         ` John Snow
2021-02-04 16:28         ` Eduardo Habkost
2021-02-05  8:45           ` Markus Armbruster
2021-02-03 23:12     ` John Snow
2021-02-05  9:10       ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 09/14] qapi/introspect.py: improve _tree_to_qlit error message John Snow
2021-02-02 17:46 ` [PATCH v4 10/14] qapi/introspect.py: improve readability of _tree_to_qlit John Snow
2021-02-02 17:46 ` [PATCH v4 11/14] qapi/introspect.py: add type hint annotations John Snow
2021-02-03 15:15   ` Markus Armbruster
2021-02-03 23:27     ` John Snow
2021-02-05 13:42       ` Markus Armbruster
2021-02-08 21:39         ` John Snow
2021-02-08 21:53           ` John Snow
2021-02-09  9:06           ` Markus Armbruster
2021-02-10 17:31             ` John Snow [this message]
2021-02-02 17:46 ` [PATCH v4 12/14] qapi/introspect.py: add introspect.json dummy types John Snow
2021-02-02 17:46 ` [PATCH v4 13/14] qapi/introspect.py: Add docstring to _tree_to_qlit John Snow
2021-02-02 17:46 ` [PATCH v4 14/14] qapi/introspect.py: Update copyright and authors list John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b74f2441-8f53-6dbd-f6ed-35ad798a327c@redhat.com \
    --to=jsnow@redhat.com \
    --cc=armbru@redhat.com \
    --cc=crosa@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=michael.roth@amd.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).