All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/14] qapi: static typing conversion, pt2
@ 2021-02-02 17:46 John Snow
  2021-02-02 17:46 ` [PATCH v4 01/14] qapi/introspect.py: assert schema is not None John Snow
                   ` (13 more replies)
  0 siblings, 14 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Hi, this series adds static type hints to the QAPI module.
This is part two, and covers introspect.py.

Part 2: https://gitlab.com/jsnow/qemu/-/tree/python-qapi-cleanup-pt2
Everything: https://gitlab.com/jsnow/qemu/-/tree/python-qapi-cleanup-pt6

- Requires Python 3.6+
- Requires mypy 0.770 or newer (for type analysis only)
- Requires pylint 2.6.0 or newer (for lint checking only)

Type hints are added in patches that add *only* type hints and change no
other behavior. Any necessary changes to behavior to accommodate typing
are split out into their own tiny patches.

Every commit should pass with (from ./scripts):
 - flake8 qapi/
 - pylint --rcfile=qapi/pylintrc qapi/
 - mypy --config-file=qapi/mypy.ini qapi/
 - isort -c qapi/

V4:
 - Rebased on "pt1.5" v4
 - signatures updated to use Optional[QAPISourceInfo]
 - Changelog from V3 still relevant (That series went unreviewed.)

V3:
 - Dropped all the R-Bs again...
 - Re-re-ordered to put type annotations last again.
 - Rebased on top of "pt1.5".
 - Ensured compliance with strict-optional typing.
 - Forgive me if I missed a specific critique;
   I probably just lost it in the shuffle.

V2:
 - Dropped all R-B from previous series; enough has changed.
 - pt2 is now introspect.py, expr.py is pushed to pt3.
 - Reworked again to have less confusing (?) type names
 - Added an assertion to prevent future accidental breakage

John Snow (14):
  qapi/introspect.py: assert schema is not None
  qapi/introspect.py: use _make_tree for features nodes
  qapi/introspect.py: add _gen_features helper
  qapi/introspect.py: guard against ifcond/comment misuse
  qapi/introspect.py: Unify return type of _make_tree()
  qapi/introspect.py: replace 'extra' dict with 'comment' argument
  qapi/introspect.py: Introduce preliminary tree typing
  qapi/introspect.py: create a typed 'Annotated' data strutcure
  qapi/introspect.py: improve _tree_to_qlit error message
  qapi/introspect.py: improve readability of _tree_to_qlit
  qapi/introspect.py: add type hint annotations
  qapi/introspect.py: add introspect.json dummy types
  qapi/introspect.py: Add docstring to _tree_to_qlit
  qapi/introspect.py: Update copyright and authors list

 scripts/qapi/introspect.py | 311 ++++++++++++++++++++++++++-----------
 scripts/qapi/mypy.ini      |   5 -
 scripts/qapi/schema.py     |   2 +-
 3 files changed, 221 insertions(+), 97 deletions(-)

-- 
2.29.2




^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 01/14] qapi/introspect.py: assert schema is not None
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes John Snow
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

The introspect visitor is stateful, but expects that it will have a
schema to refer to. Add assertions that state this.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index fafec94e022..43ab4be1f77 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -147,6 +147,8 @@ def _name(self, name):
         return self._name_map[name]
 
     def _use_type(self, typ):
+        assert self._schema is not None
+
         # Map the various integer types to plain int
         if typ.json_type() == 'int':
             typ = self._schema.lookup_type('int')
@@ -225,6 +227,8 @@ def visit_alternate_type(self, name, info, ifcond, features, variants):
     def visit_command(self, name, info, ifcond, features,
                       arg_type, ret_type, gen, success_response, boxed,
                       allow_oob, allow_preconfig, coroutine):
+        assert self._schema is not None
+
         arg_type = arg_type or self._schema.the_empty_object_type
         ret_type = ret_type or self._schema.the_empty_object_type
         obj = {'arg-type': self._use_type(arg_type),
@@ -234,6 +238,7 @@ def visit_command(self, name, info, ifcond, features,
         self._gen_tree(name, 'command', obj, ifcond, features)
 
     def visit_event(self, name, info, ifcond, features, arg_type, boxed):
+        assert self._schema is not None
         arg_type = arg_type or self._schema.the_empty_object_type
         self._gen_tree(name, 'event', {'arg-type': self._use_type(arg_type)},
                        ifcond, features)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
  2021-02-02 17:46 ` [PATCH v4 01/14] qapi/introspect.py: assert schema is not None John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-03 13:49   ` Markus Armbruster
  2021-02-02 17:46 ` [PATCH v4 03/14] qapi/introspect.py: add _gen_features helper John Snow
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

At present, we open-code this in _make_tree itself; but if the structure
of the tree changes, this is brittle. Use an explicit recursive call to
_make_tree when appropriate to help keep the interior node typing
consistent.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 43ab4be1f77..3295a15c98e 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -30,7 +30,9 @@ def _make_tree(obj, ifcond, features, extra=None):
     if ifcond:
         extra['if'] = ifcond
     if features:
-        obj['features'] = [(f.name, {'if': f.ifcond}) for f in features]
+        obj['features'] = [
+            _make_tree(f.name, f.ifcond, None) for f in features
+        ]
     if extra:
         return (obj, extra)
     return obj
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 03/14] qapi/introspect.py: add _gen_features helper
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
  2021-02-02 17:46 ` [PATCH v4 01/14] qapi/introspect.py: assert schema is not None John Snow
  2021-02-02 17:46 ` [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse John Snow
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

_make_tree might receive a dict (a SchemaInfo object) or some other type
(usually, a string) for its obj parameter. Adding features information
should arguably be performed by the caller at such a time when we know
the type of the object and don't have to re-interrogate it.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 3295a15c98e..4749f65ea3c 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -24,15 +24,11 @@
 )
 
 
-def _make_tree(obj, ifcond, features, extra=None):
+def _make_tree(obj, ifcond, extra=None):
     if extra is None:
         extra = {}
     if ifcond:
         extra['if'] = ifcond
-    if features:
-        obj['features'] = [
-            _make_tree(f.name, f.ifcond, None) for f in features
-        ]
     if extra:
         return (obj, extra)
     return obj
@@ -169,6 +165,10 @@ def _use_type(self, typ):
             return '[' + self._use_type(typ.element_type) + ']'
         return self._name(typ.name)
 
+    @staticmethod
+    def _gen_features(features):
+        return [_make_tree(f.name, f.ifcond) for f in features]
+
     def _gen_tree(self, name, mtype, obj, ifcond, features):
         extra = None
         if mtype not in ('command', 'event', 'builtin', 'array'):
@@ -179,13 +179,17 @@ def _gen_tree(self, name, mtype, obj, ifcond, features):
             name = self._name(name)
         obj['name'] = name
         obj['meta-type'] = mtype
-        self._trees.append(_make_tree(obj, ifcond, features, extra))
+        if features:
+            obj['features'] = self._gen_features(features)
+        self._trees.append(_make_tree(obj, ifcond, extra))
 
     def _gen_member(self, member):
         obj = {'name': member.name, 'type': self._use_type(member.type)}
         if member.optional:
             obj['default'] = None
-        return _make_tree(obj, member.ifcond, member.features)
+        if member.features:
+            obj['features'] = self._gen_features(member.features)
+        return _make_tree(obj, member.ifcond)
 
     def _gen_variants(self, tag_name, variants):
         return {'tag': tag_name,
@@ -193,7 +197,7 @@ def _gen_variants(self, tag_name, variants):
 
     def _gen_variant(self, variant):
         obj = {'case': variant.name, 'type': self._use_type(variant.type)}
-        return _make_tree(obj, variant.ifcond, None)
+        return _make_tree(obj, variant.ifcond)
 
     def visit_builtin_type(self, name, info, json_type):
         self._gen_tree(name, 'builtin', {'json-type': json_type}, [], None)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (2 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 03/14] qapi/introspect.py: add _gen_features helper John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-03 14:08   ` Markus Armbruster
  2021-02-02 17:46 ` [PATCH v4 05/14] qapi/introspect.py: Unify return type of _make_tree() John Snow
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

_tree_to_qlit is called recursively on dict values alone; at such a
point in generating output it is too late to apply an ifcond. Similarly,
comments do not necessarily have a "tidy" place they can be printed in
such a circumstance.

Forbid this usage.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 4749f65ea3c..ccdf4f1c0d0 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -43,6 +43,12 @@ def indent(level):
         ifobj, extra = obj
         ifcond = extra.get('if')
         comment = extra.get('comment')
+
+        # NB: _tree_to_qlit is called recursively on the values of a key:value
+        # pair; those values can't be decorated with comments or conditionals.
+        msg = "dict values cannot have attached comments or if-conditionals."
+        assert not suppress_first_indent, msg
+
         ret = ''
         if comment:
             ret += indent(level) + '/* %s */\n' % comment
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 05/14] qapi/introspect.py: Unify return type of _make_tree()
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (3 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument John Snow
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Returning two different types conditionally can be complicated to
type. Return one type for consistency.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index ccdf4f1c0d0..d3fbf694ad2 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -29,9 +29,7 @@ def _make_tree(obj, ifcond, extra=None):
         extra = {}
     if ifcond:
         extra['if'] = ifcond
-    if extra:
-        return (obj, extra)
-    return obj
+    return (obj, extra)
 
 
 def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (4 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 05/14] qapi/introspect.py: Unify return type of _make_tree() John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-03 14:23   ` Markus Armbruster
  2021-02-02 17:46 ` [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing John Snow
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

This is only used to pass in a dictionary with a comment already set, so
skip the runaround and just accept the comment.

This works because _tree_to_qlit() treats 'if': None; 'comment': None
exactly like absent 'if'; 'comment'.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index d3fbf694ad2..0aa3b77109f 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -10,6 +10,8 @@
 See the COPYING file in the top-level directory.
 """
 
+from typing import Optional
+
 from .common import (
     c_name,
     gen_endif,
@@ -24,11 +26,11 @@
 )
 
 
-def _make_tree(obj, ifcond, extra=None):
-    if extra is None:
-        extra = {}
-    if ifcond:
-        extra['if'] = ifcond
+def _make_tree(obj, ifcond, comment=None):
+    extra = {
+        'if': ifcond,
+        'comment': comment,
+    }
     return (obj, extra)
 
 
@@ -174,18 +176,18 @@ def _gen_features(features):
         return [_make_tree(f.name, f.ifcond) for f in features]
 
     def _gen_tree(self, name, mtype, obj, ifcond, features):
-        extra = None
+        comment: Optional[str] = None
         if mtype not in ('command', 'event', 'builtin', 'array'):
             if not self._unmask:
                 # Output a comment to make it easy to map masked names
                 # back to the source when reading the generated output.
-                extra = {'comment': '"%s" = %s' % (self._name(name), name)}
+                comment = f'"{self._name(name)}" = {name}'
             name = self._name(name)
         obj['name'] = name
         obj['meta-type'] = mtype
         if features:
             obj['features'] = self._gen_features(features)
-        self._trees.append(_make_tree(obj, ifcond, extra))
+        self._trees.append(_make_tree(obj, ifcond, comment))
 
     def _gen_member(self, member):
         obj = {'name': member.name, 'type': self._use_type(member.type)}
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (5 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-03 14:30   ` Markus Armbruster
  2021-02-02 17:46 ` [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure John Snow
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

The types will be used in forthcoming patches to add typing. These types
describe the layout and structure of the objects passed to
_tree_to_qlit, but lack the power to describe annotations until the next
commit.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 0aa3b77109f..b82efe16f6e 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -10,7 +10,13 @@
 See the COPYING file in the top-level directory.
 """
 
-from typing import Optional
+from typing import (
+    Any,
+    Dict,
+    List,
+    Optional,
+    Union,
+)
 
 from .common import (
     c_name,
@@ -26,6 +32,28 @@
 )
 
 
+# This module constructs a tree data structure that is used to
+# generate the introspection information for QEMU. It behaves similarly
+# to a JSON value.
+#
+# A complexity over JSON is that our values may or may not be annotated.
+#
+# Un-annotated values may be:
+#     Scalar: str, bool, None.
+#     Non-scalar: List, Dict
+# _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
+#
+# With optional annotations, the type of all values is:
+# TreeValue = Union[_value, Annotated[_value]]
+#
+# Sadly, mypy does not support recursive types, so we must approximate this.
+_stub = Any
+_scalar = Union[str, bool, None]
+_nonscalar = Union[Dict[str, _stub], List[_stub]]
+_value = Union[_scalar, _nonscalar]
+# TreeValue = Union[_value, 'Annotated[_value]']
+
+
 def _make_tree(obj, ifcond, comment=None):
     extra = {
         'if': ifcond,
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (6 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-03 14:47   ` Markus Armbruster
  2021-02-02 17:46 ` [PATCH v4 09/14] qapi/introspect.py: improve _tree_to_qlit error message John Snow
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Presently, we use a tuple to attach a dict containing annotations
(comments and compile-time conditionals) to a tree node. This is
undesirable because dicts are difficult to strongly type; promoting it
to a real class allows us to name the values and types of the
annotations we are expecting.

In terms of typing, the Annotated<T> type serves as a generic container
where the annotated node's type is preserved, allowing for greater
specificity than we'd be able to provide without a generic.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 77 ++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 33 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index b82efe16f6e..2b90a52f016 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -13,8 +13,12 @@
 from typing import (
     Any,
     Dict,
+    Generic,
+    Iterable,
     List,
     Optional,
+    Tuple,
+    TypeVar,
     Union,
 )
 
@@ -51,15 +55,25 @@
 _scalar = Union[str, bool, None]
 _nonscalar = Union[Dict[str, _stub], List[_stub]]
 _value = Union[_scalar, _nonscalar]
-# TreeValue = Union[_value, 'Annotated[_value]']
+TreeValue = Union[_value, 'Annotated[_value]']
 
 
-def _make_tree(obj, ifcond, comment=None):
-    extra = {
-        'if': ifcond,
-        'comment': comment,
-    }
-    return (obj, extra)
+_NodeT = TypeVar('_NodeT', bound=TreeValue)
+
+
+class Annotated(Generic[_NodeT]):
+    """
+    Annotated generally contains a SchemaInfo-like type (as a dict),
+    But it also used to wrap comments/ifconds around scalar leaf values,
+    for the benefit of features and enums.
+    """
+    # Remove after 3.7 adds @dataclass:
+    # pylint: disable=too-few-public-methods
+    def __init__(self, value: _NodeT, ifcond: Iterable[str],
+                 comment: Optional[str] = None):
+        self.value = value
+        self.comment: Optional[str] = comment
+        self.ifcond: Tuple[str, ...] = tuple(ifcond)
 
 
 def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
@@ -67,24 +81,20 @@ def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
     def indent(level):
         return level * 4 * ' '
 
-    if isinstance(obj, tuple):
-        ifobj, extra = obj
-        ifcond = extra.get('if')
-        comment = extra.get('comment')
-
+    if isinstance(obj, Annotated):
         # NB: _tree_to_qlit is called recursively on the values of a key:value
         # pair; those values can't be decorated with comments or conditionals.
         msg = "dict values cannot have attached comments or if-conditionals."
         assert not suppress_first_indent, msg
 
         ret = ''
-        if comment:
-            ret += indent(level) + '/* %s */\n' % comment
-        if ifcond:
-            ret += gen_if(ifcond)
-        ret += _tree_to_qlit(ifobj, level)
-        if ifcond:
-            ret += '\n' + gen_endif(ifcond)
+        if obj.comment:
+            ret += indent(level) + '/* %s */\n' % obj.comment
+        if obj.ifcond:
+            ret += gen_if(obj.ifcond)
+        ret += _tree_to_qlit(obj.value, level, suppress_first_indent)
+        if obj.ifcond:
+            ret += '\n' + gen_endif(obj.ifcond)
         return ret
 
     ret = ''
@@ -201,7 +211,7 @@ def _use_type(self, typ):
 
     @staticmethod
     def _gen_features(features):
-        return [_make_tree(f.name, f.ifcond) for f in features]
+        return [Annotated(f.name, f.ifcond) for f in features]
 
     def _gen_tree(self, name, mtype, obj, ifcond, features):
         comment: Optional[str] = None
@@ -215,7 +225,7 @@ def _gen_tree(self, name, mtype, obj, ifcond, features):
         obj['meta-type'] = mtype
         if features:
             obj['features'] = self._gen_features(features)
-        self._trees.append(_make_tree(obj, ifcond, comment))
+        self._trees.append(Annotated(obj, ifcond, comment))
 
     def _gen_member(self, member):
         obj = {'name': member.name, 'type': self._use_type(member.type)}
@@ -223,7 +233,7 @@ def _gen_member(self, member):
             obj['default'] = None
         if member.features:
             obj['features'] = self._gen_features(member.features)
-        return _make_tree(obj, member.ifcond)
+        return Annotated(obj, member.ifcond)
 
     def _gen_variants(self, tag_name, variants):
         return {'tag': tag_name,
@@ -231,16 +241,17 @@ def _gen_variants(self, tag_name, variants):
 
     def _gen_variant(self, variant):
         obj = {'case': variant.name, 'type': self._use_type(variant.type)}
-        return _make_tree(obj, variant.ifcond)
+        return Annotated(obj, variant.ifcond)
 
     def visit_builtin_type(self, name, info, json_type):
         self._gen_tree(name, 'builtin', {'json-type': json_type}, [], None)
 
     def visit_enum_type(self, name, info, ifcond, features, members, prefix):
-        self._gen_tree(name, 'enum',
-                       {'values': [_make_tree(m.name, m.ifcond, None)
-                                   for m in members]},
-                       ifcond, features)
+        self._gen_tree(
+            name, 'enum',
+            {'values': [Annotated(m.name, m.ifcond) for m in members]},
+            ifcond, features
+        )
 
     def visit_array_type(self, name, info, ifcond, element_type):
         element = self._use_type(element_type)
@@ -257,12 +268,12 @@ def visit_object_type_flat(self, name, info, ifcond, features,
         self._gen_tree(name, 'object', obj, ifcond, features)
 
     def visit_alternate_type(self, name, info, ifcond, features, variants):
-        self._gen_tree(name, 'alternate',
-                       {'members': [
-                           _make_tree({'type': self._use_type(m.type)},
-                                      m.ifcond, None)
-                           for m in variants.variants]},
-                       ifcond, features)
+        self._gen_tree(name, 'alternate', {'members': [
+                Annotated({'type': self._use_type(m.type)}, m.ifcond)
+                for m in variants.variants
+            ]},
+            ifcond, features
+        )
 
     def visit_command(self, name, info, ifcond, features,
                       arg_type, ret_type, gen, success_response, boxed,
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 09/14] qapi/introspect.py: improve _tree_to_qlit error message
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (7 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 10/14] qapi/introspect.py: improve readability of _tree_to_qlit John Snow
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Trivial; make the error message just a pinch more explicit in case we
trip this by accident in the future.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 2b90a52f016..c72beeeed27 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -124,7 +124,9 @@ def indent(level):
     elif isinstance(obj, bool):
         ret += 'QLIT_QBOOL(%s)' % ('true' if obj else 'false')
     else:
-        assert False                # not implemented
+        raise NotImplementedError(
+            f"type '{type(obj).__name__}' not implemented"
+        )
     if level > 0:
         ret += ','
     return ret
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 10/14] qapi/introspect.py: improve readability of _tree_to_qlit
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (8 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 09/14] qapi/introspect.py: improve _tree_to_qlit error message John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 11/14] qapi/introspect.py: add type hint annotations John Snow
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Subjective, but I find getting rid of the comprehensions helps. Also,
divide the sections into scalar and non-scalar sections, and remove
old-style string formatting.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index c72beeeed27..60ec326d2c7 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -89,7 +89,7 @@ def indent(level):
 
         ret = ''
         if obj.comment:
-            ret += indent(level) + '/* %s */\n' % obj.comment
+            ret += indent(level) + f"/* {obj.comment} */\n"
         if obj.ifcond:
             ret += gen_if(obj.ifcond)
         ret += _tree_to_qlit(obj.value, level, suppress_first_indent)
@@ -100,33 +100,36 @@ def indent(level):
     ret = ''
     if not suppress_first_indent:
         ret += indent(level)
+
+    # Scalars:
     if obj is None:
         ret += 'QLIT_QNULL'
     elif isinstance(obj, str):
-        ret += 'QLIT_QSTR(' + to_c_string(obj) + ')'
+        ret += f"QLIT_QSTR({to_c_string(obj)})"
+    elif isinstance(obj, bool):
+        ret += f"QLIT_QBOOL({str(obj).lower()})"
+
+    # Non-scalars:
     elif isinstance(obj, list):
-        elts = [_tree_to_qlit(elt, level + 1).strip('\n')
-                for elt in obj]
-        elts.append(indent(level + 1) + "{}")
         ret += 'QLIT_QLIST(((QLitObject[]) {\n'
-        ret += '\n'.join(elts) + '\n'
+        for value in obj:
+            ret += _tree_to_qlit(value, level + 1).strip('\n') + '\n'
+        ret += indent(level + 1) + '{}\n'
         ret += indent(level) + '}))'
     elif isinstance(obj, dict):
-        elts = []
-        for key, value in sorted(obj.items()):
-            elts.append(indent(level + 1) + '{ %s, %s }' %
-                        (to_c_string(key),
-                         _tree_to_qlit(value, level + 1, True)))
-        elts.append(indent(level + 1) + '{}')
         ret += 'QLIT_QDICT(((QLitDictEntry[]) {\n'
-        ret += ',\n'.join(elts) + '\n'
+        for key, value in sorted(obj.items()):
+            ret += indent(level + 1) + "{{ {:s}, {:s} }},\n".format(
+                to_c_string(key),
+                _tree_to_qlit(value, level + 1, suppress_first_indent=True)
+            )
+        ret += indent(level + 1) + '{}\n'
         ret += indent(level) + '}))'
-    elif isinstance(obj, bool):
-        ret += 'QLIT_QBOOL(%s)' % ('true' if obj else 'false')
     else:
         raise NotImplementedError(
             f"type '{type(obj).__name__}' not implemented"
         )
+
     if level > 0:
         ret += ','
     return ret
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (9 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 10/14] qapi/introspect.py: improve readability of _tree_to_qlit John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-03 15:15   ` Markus Armbruster
  2021-02-02 17:46 ` [PATCH v4 12/14] qapi/introspect.py: add introspect.json dummy types John Snow
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 115 ++++++++++++++++++++++++++-----------
 scripts/qapi/mypy.ini      |   5 --
 scripts/qapi/schema.py     |   2 +-
 3 files changed, 82 insertions(+), 40 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 60ec326d2c7..b7f2a6cf260 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -30,10 +30,19 @@
 )
 from .gen import QAPISchemaMonolithicCVisitor
 from .schema import (
+    QAPISchema,
     QAPISchemaArrayType,
     QAPISchemaBuiltinType,
+    QAPISchemaEntity,
+    QAPISchemaEnumMember,
+    QAPISchemaFeature,
+    QAPISchemaObjectType,
+    QAPISchemaObjectTypeMember,
     QAPISchemaType,
+    QAPISchemaVariant,
+    QAPISchemaVariants,
 )
+from .source import QAPISourceInfo
 
 
 # This module constructs a tree data structure that is used to
@@ -57,6 +66,8 @@
 _value = Union[_scalar, _nonscalar]
 TreeValue = Union[_value, 'Annotated[_value]']
 
+# This is a (strict) alias for an arbitrary object non-scalar, as above:
+_DObject = Dict[str, object]
 
 _NodeT = TypeVar('_NodeT', bound=TreeValue)
 
@@ -76,9 +87,11 @@ def __init__(self, value: _NodeT, ifcond: Iterable[str],
         self.ifcond: Tuple[str, ...] = tuple(ifcond)
 
 
-def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
+def _tree_to_qlit(obj: TreeValue,
+                  level: int = 0,
+                  suppress_first_indent: bool = False) -> str:
 
-    def indent(level):
+    def indent(level: int) -> str:
         return level * 4 * ' '
 
     if isinstance(obj, Annotated):
@@ -135,21 +148,21 @@ def indent(level):
     return ret
 
 
-def to_c_string(string):
+def to_c_string(string: str) -> str:
     return '"' + string.replace('\\', r'\\').replace('"', r'\"') + '"'
 
 
 class QAPISchemaGenIntrospectVisitor(QAPISchemaMonolithicCVisitor):
 
-    def __init__(self, prefix, unmask):
+    def __init__(self, prefix: str, unmask: bool):
         super().__init__(
             prefix, 'qapi-introspect',
             ' * QAPI/QMP schema introspection', __doc__)
         self._unmask = unmask
-        self._schema = None
-        self._trees = []
-        self._used_types = []
-        self._name_map = {}
+        self._schema: Optional[QAPISchema] = None
+        self._trees: List[Annotated[_DObject]] = []
+        self._used_types: List[QAPISchemaType] = []
+        self._name_map: Dict[str, str] = {}
         self._genc.add(mcgen('''
 #include "qemu/osdep.h"
 #include "%(prefix)sqapi-introspect.h"
@@ -157,10 +170,10 @@ def __init__(self, prefix, unmask):
 ''',
                              prefix=prefix))
 
-    def visit_begin(self, schema):
+    def visit_begin(self, schema: QAPISchema) -> None:
         self._schema = schema
 
-    def visit_end(self):
+    def visit_end(self) -> None:
         # visit the types that are actually used
         for typ in self._used_types:
             typ.visit(self)
@@ -182,18 +195,18 @@ def visit_end(self):
         self._used_types = []
         self._name_map = {}
 
-    def visit_needed(self, entity):
+    def visit_needed(self, entity: QAPISchemaEntity) -> bool:
         # Ignore types on first pass; visit_end() will pick up used types
         return not isinstance(entity, QAPISchemaType)
 
-    def _name(self, name):
+    def _name(self, name: str) -> str:
         if self._unmask:
             return name
         if name not in self._name_map:
             self._name_map[name] = '%d' % len(self._name_map)
         return self._name_map[name]
 
-    def _use_type(self, typ):
+    def _use_type(self, typ: QAPISchemaType) -> str:
         assert self._schema is not None
 
         # Map the various integer types to plain int
@@ -215,10 +228,13 @@ def _use_type(self, typ):
         return self._name(typ.name)
 
     @staticmethod
-    def _gen_features(features):
+    def _gen_features(features: List[QAPISchemaFeature]
+                      ) -> List[Annotated[str]]:
         return [Annotated(f.name, f.ifcond) for f in features]
 
-    def _gen_tree(self, name, mtype, obj, ifcond, features):
+    def _gen_tree(self, name: str, mtype: str, obj: _DObject,
+                  ifcond: List[str],
+                  features: Optional[List[QAPISchemaFeature]]) -> None:
         comment: Optional[str] = None
         if mtype not in ('command', 'event', 'builtin', 'array'):
             if not self._unmask:
@@ -232,47 +248,67 @@ def _gen_tree(self, name, mtype, obj, ifcond, features):
             obj['features'] = self._gen_features(features)
         self._trees.append(Annotated(obj, ifcond, comment))
 
-    def _gen_member(self, member):
-        obj = {'name': member.name, 'type': self._use_type(member.type)}
+    def _gen_member(self, member: QAPISchemaObjectTypeMember
+                    ) -> Annotated[_DObject]:
+        obj: _DObject = {
+            'name': member.name,
+            'type': self._use_type(member.type)
+        }
         if member.optional:
             obj['default'] = None
         if member.features:
             obj['features'] = self._gen_features(member.features)
         return Annotated(obj, member.ifcond)
 
-    def _gen_variants(self, tag_name, variants):
+    def _gen_variants(self, tag_name: str,
+                      variants: List[QAPISchemaVariant]) -> _DObject:
         return {'tag': tag_name,
                 'variants': [self._gen_variant(v) for v in variants]}
 
-    def _gen_variant(self, variant):
-        obj = {'case': variant.name, 'type': self._use_type(variant.type)}
+    def _gen_variant(self, variant: QAPISchemaVariant) -> Annotated[_DObject]:
+        obj: _DObject = {
+            'case': variant.name,
+            'type': self._use_type(variant.type)
+        }
         return Annotated(obj, variant.ifcond)
 
-    def visit_builtin_type(self, name, info, json_type):
+    def visit_builtin_type(self, name: str, info: Optional[QAPISourceInfo],
+                           json_type: str) -> None:
         self._gen_tree(name, 'builtin', {'json-type': json_type}, [], None)
 
-    def visit_enum_type(self, name, info, ifcond, features, members, prefix):
+    def visit_enum_type(self, name: str, info: Optional[QAPISourceInfo],
+                        ifcond: List[str], features: List[QAPISchemaFeature],
+                        members: List[QAPISchemaEnumMember],
+                        prefix: Optional[str]) -> None:
         self._gen_tree(
             name, 'enum',
             {'values': [Annotated(m.name, m.ifcond) for m in members]},
             ifcond, features
         )
 
-    def visit_array_type(self, name, info, ifcond, element_type):
+    def visit_array_type(self, name: str, info: Optional[QAPISourceInfo],
+                         ifcond: List[str],
+                         element_type: QAPISchemaType) -> None:
         element = self._use_type(element_type)
         self._gen_tree('[' + element + ']', 'array', {'element-type': element},
                        ifcond, None)
 
-    def visit_object_type_flat(self, name, info, ifcond, features,
-                               members, variants):
-        obj = {'members': [self._gen_member(m) for m in members]}
+    def visit_object_type_flat(self, name: str, info: Optional[QAPISourceInfo],
+                               ifcond: List[str],
+                               features: List[QAPISchemaFeature],
+                               members: List[QAPISchemaObjectTypeMember],
+                               variants: Optional[QAPISchemaVariants]) -> None:
+        obj: _DObject = {'members': [self._gen_member(m) for m in members]}
         if variants:
             obj.update(self._gen_variants(variants.tag_member.name,
                                           variants.variants))
 
         self._gen_tree(name, 'object', obj, ifcond, features)
 
-    def visit_alternate_type(self, name, info, ifcond, features, variants):
+    def visit_alternate_type(self, name: str, info: Optional[QAPISourceInfo],
+                             ifcond: List[str],
+                             features: List[QAPISchemaFeature],
+                             variants: QAPISchemaVariants) -> None:
         self._gen_tree(name, 'alternate', {'members': [
                 Annotated({'type': self._use_type(m.type)}, m.ifcond)
                 for m in variants.variants
@@ -280,27 +316,38 @@ def visit_alternate_type(self, name, info, ifcond, features, variants):
             ifcond, features
         )
 
-    def visit_command(self, name, info, ifcond, features,
-                      arg_type, ret_type, gen, success_response, boxed,
-                      allow_oob, allow_preconfig, coroutine):
+    def visit_command(self, name: str, info: Optional[QAPISourceInfo],
+                      ifcond: List[str],
+                      features: List[QAPISchemaFeature],
+                      arg_type: Optional[QAPISchemaObjectType],
+                      ret_type: Optional[QAPISchemaType], gen: bool,
+                      success_response: bool, boxed: bool, allow_oob: bool,
+                      allow_preconfig: bool, coroutine: bool) -> None:
         assert self._schema is not None
 
         arg_type = arg_type or self._schema.the_empty_object_type
         ret_type = ret_type or self._schema.the_empty_object_type
-        obj = {'arg-type': self._use_type(arg_type),
-               'ret-type': self._use_type(ret_type)}
+        obj: _DObject = {
+            'arg-type': self._use_type(arg_type),
+            'ret-type': self._use_type(ret_type)
+        }
         if allow_oob:
             obj['allow-oob'] = allow_oob
         self._gen_tree(name, 'command', obj, ifcond, features)
 
-    def visit_event(self, name, info, ifcond, features, arg_type, boxed):
+    def visit_event(self, name: str, info: Optional[QAPISourceInfo],
+                    ifcond: List[str], features: List[QAPISchemaFeature],
+                    arg_type: Optional[QAPISchemaObjectType],
+                    boxed: bool) -> None:
         assert self._schema is not None
+
         arg_type = arg_type or self._schema.the_empty_object_type
         self._gen_tree(name, 'event', {'arg-type': self._use_type(arg_type)},
                        ifcond, features)
 
 
-def gen_introspect(schema, output_dir, prefix, opt_unmask):
+def gen_introspect(schema: QAPISchema, output_dir: str, prefix: str,
+                   opt_unmask: bool) -> None:
     vis = QAPISchemaGenIntrospectVisitor(prefix, opt_unmask)
     schema.visit(vis)
     vis.write(output_dir)
diff --git a/scripts/qapi/mypy.ini b/scripts/qapi/mypy.ini
index 04bd5db5278..0a000d58b37 100644
--- a/scripts/qapi/mypy.ini
+++ b/scripts/qapi/mypy.ini
@@ -13,11 +13,6 @@ disallow_untyped_defs = False
 disallow_incomplete_defs = False
 check_untyped_defs = False
 
-[mypy-qapi.introspect]
-disallow_untyped_defs = False
-disallow_incomplete_defs = False
-check_untyped_defs = False
-
 [mypy-qapi.parser]
 disallow_untyped_defs = False
 disallow_incomplete_defs = False
diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
index 353e8020a27..ff16578f6de 100644
--- a/scripts/qapi/schema.py
+++ b/scripts/qapi/schema.py
@@ -28,7 +28,7 @@
 class QAPISchemaEntity:
     meta: Optional[str] = None
 
-    def __init__(self, name, info, doc, ifcond=None, features=None):
+    def __init__(self, name: str, info, doc, ifcond=None, features=None):
         assert name is None or isinstance(name, str)
         for f in features or []:
             assert isinstance(f, QAPISchemaFeature)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 12/14] qapi/introspect.py: add introspect.json dummy types
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (10 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 11/14] qapi/introspect.py: add type hint annotations John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 13/14] qapi/introspect.py: Add docstring to _tree_to_qlit John Snow
  2021-02-02 17:46 ` [PATCH v4 14/14] qapi/introspect.py: Update copyright and authors list John Snow
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Add some aliases that declare intent for some of the "dictly-typed"
objects we pass around in introspect.py.

Signed-off-by: John Snow <jsnow@redhat.com>

---

This patch is optional, it can be dropped if desired. It is purely for
the sake of demonstrating what the _DObject was meant to convey: a
Python Dict that represents some JSON Object. It does not add any type
safety in and of itself, but does have some (light) annotational
benefit. In this case, it's usually a specific data structure from the
QAPI Schema we are referencing, but do not have "compile-time" access to
in Python.

These are loosely typed and don't bother reproducing the exact structure
of the real types. Python 3.6 does not have support for TypedDict
structures, so this is as good as we can do without involving a
third-party library (e.g. Pydantic), in which we might be able to say:


class SchemaMetaType(str, enum.Enum):
    BUILTIN = "builtin"
    ENUM = "enum"
    ARRAY = "array"
    OBJECT = "object"
    ALTERNATE = "alternate"
    COMMAND = "command"
    EVENT = "event"


class SchemaInfo(pydantic.BaseModel):
    name: str
    meta-type: SchemaMetaType
    features: Optional[List[str]]
    data: Union[SchemaInfoBuiltin, SchemaInfoEnum, SchemaInfoArray,
                SchemaInfoObject, SchemaInfoAlternate, SchemaInfoCommand,
                SchemaInfoEvent]


However, the cost of reproducing and keeping these structure definitions
in-sync between the user-defined portion of the schema and the code
generator is likely not worth doing any such thing. However, it does
illustrate an interesting dependency the generator has on the
user-defined schema itself in terms of types.

So, I settled on using some light types that suggest the form of the
object instead of enforcing the form.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 49 ++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index b7f2a6cf260..eb5d34cdb42 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -66,8 +66,15 @@
 _value = Union[_scalar, _nonscalar]
 TreeValue = Union[_value, 'Annotated[_value]']
 
-# This is a (strict) alias for an arbitrary object non-scalar, as above:
-_DObject = Dict[str, object]
+# These types are based on structures defined in QEMU's schema, so we lack
+# precise types for them here. Python 3.6 does not offer TypedDict constructs,
+# so they are loosely typed here as simple python Dicts.
+SchemaInfo = Dict[str, object]
+SchemaInfoObject = Dict[str, object]
+SchemaInfoObjectVariant = Dict[str, object]
+SchemaInfoObjectMember = Dict[str, object]
+SchemaInfoCommand = Dict[str, object]
+
 
 _NodeT = TypeVar('_NodeT', bound=TreeValue)
 
@@ -160,7 +167,7 @@ def __init__(self, prefix: str, unmask: bool):
             ' * QAPI/QMP schema introspection', __doc__)
         self._unmask = unmask
         self._schema: Optional[QAPISchema] = None
-        self._trees: List[Annotated[_DObject]] = []
+        self._trees: List[Annotated[SchemaInfo]] = []
         self._used_types: List[QAPISchemaType] = []
         self._name_map: Dict[str, str] = {}
         self._genc.add(mcgen('''
@@ -232,9 +239,18 @@ def _gen_features(features: List[QAPISchemaFeature]
                       ) -> List[Annotated[str]]:
         return [Annotated(f.name, f.ifcond) for f in features]
 
-    def _gen_tree(self, name: str, mtype: str, obj: _DObject,
+    def _gen_tree(self, name: str, mtype: str, obj: Dict[str, object],
                   ifcond: List[str],
                   features: Optional[List[QAPISchemaFeature]]) -> None:
+        """
+        Build and append a SchemaInfo object to self._trees.
+
+        :param name: The entity's name.
+        :param mtype: The entity's meta-type.
+        :param obj: Additional entity fields, as appropriate for the meta-type.
+        :param ifcond: List of conditionals that apply to this entire entity.
+        :param features: Optional features field for SchemaInfo.
+        """
         comment: Optional[str] = None
         if mtype not in ('command', 'event', 'builtin', 'array'):
             if not self._unmask:
@@ -249,8 +265,8 @@ def _gen_tree(self, name: str, mtype: str, obj: _DObject,
         self._trees.append(Annotated(obj, ifcond, comment))
 
     def _gen_member(self, member: QAPISchemaObjectTypeMember
-                    ) -> Annotated[_DObject]:
-        obj: _DObject = {
+                    ) -> Annotated[SchemaInfoObjectMember]:
+        obj: SchemaInfoObjectMember = {
             'name': member.name,
             'type': self._use_type(member.type)
         }
@@ -260,13 +276,9 @@ def _gen_member(self, member: QAPISchemaObjectTypeMember
             obj['features'] = self._gen_features(member.features)
         return Annotated(obj, member.ifcond)
 
-    def _gen_variants(self, tag_name: str,
-                      variants: List[QAPISchemaVariant]) -> _DObject:
-        return {'tag': tag_name,
-                'variants': [self._gen_variant(v) for v in variants]}
-
-    def _gen_variant(self, variant: QAPISchemaVariant) -> Annotated[_DObject]:
-        obj: _DObject = {
+    def _gen_variant(self, variant: QAPISchemaVariant
+                     ) -> Annotated[SchemaInfoObjectVariant]:
+        obj: SchemaInfoObjectVariant = {
             'case': variant.name,
             'type': self._use_type(variant.type)
         }
@@ -298,11 +310,12 @@ def visit_object_type_flat(self, name: str, info: Optional[QAPISourceInfo],
                                features: List[QAPISchemaFeature],
                                members: List[QAPISchemaObjectTypeMember],
                                variants: Optional[QAPISchemaVariants]) -> None:
-        obj: _DObject = {'members': [self._gen_member(m) for m in members]}
+        obj: SchemaInfoObject = {
+            'members': [self._gen_member(m) for m in members]
+        }
         if variants:
-            obj.update(self._gen_variants(variants.tag_member.name,
-                                          variants.variants))
-
+            obj['tag'] = variants.tag_member.name
+            obj['variants'] = [self._gen_variant(v) for v in variants.variants]
         self._gen_tree(name, 'object', obj, ifcond, features)
 
     def visit_alternate_type(self, name: str, info: Optional[QAPISourceInfo],
@@ -327,7 +340,7 @@ def visit_command(self, name: str, info: Optional[QAPISourceInfo],
 
         arg_type = arg_type or self._schema.the_empty_object_type
         ret_type = ret_type or self._schema.the_empty_object_type
-        obj: _DObject = {
+        obj: SchemaInfoCommand = {
             'arg-type': self._use_type(arg_type),
             'ret-type': self._use_type(ret_type)
         }
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 13/14] qapi/introspect.py: Add docstring to _tree_to_qlit
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (11 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 12/14] qapi/introspect.py: add introspect.json dummy types John Snow
@ 2021-02-02 17:46 ` John Snow
  2021-02-02 17:46 ` [PATCH v4 14/14] qapi/introspect.py: Update copyright and authors list John Snow
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index eb5d34cdb42..3e59caf7d4d 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -97,6 +97,13 @@ def __init__(self, value: _NodeT, ifcond: Iterable[str],
 def _tree_to_qlit(obj: TreeValue,
                   level: int = 0,
                   suppress_first_indent: bool = False) -> str:
+    """
+    Convert the type tree into a QLIT C string, recursively.
+
+    :param obj: The value to convert.
+    :param level: The indentation level for this particular value.
+    :param suppress_first_indent: True for dict value children.
+    """
 
     def indent(level: int) -> str:
         return level * 4 * ' '
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 14/14] qapi/introspect.py: Update copyright and authors list
  2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
                   ` (12 preceding siblings ...)
  2021-02-02 17:46 ` [PATCH v4 13/14] qapi/introspect.py: Add docstring to _tree_to_qlit John Snow
@ 2021-02-02 17:46 ` John Snow
  13 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-02 17:46 UTC (permalink / raw)
  To: qemu-devel, Markus Armbruster
  Cc: Michael Roth, John Snow, Eduardo Habkost, Cleber Rosa

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/introspect.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 3e59caf7d4d..df2bb50df32 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -1,10 +1,11 @@
 """
 QAPI introspection generator
 
-Copyright (C) 2015-2018 Red Hat, Inc.
+Copyright (C) 2015-2020 Red Hat, Inc.
 
 Authors:
  Markus Armbruster <armbru@redhat.com>
+ John Snow <jsnow@redhat.com>
 
 This work is licensed under the terms of the GNU GPL, version 2.
 See the COPYING file in the top-level directory.
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes
  2021-02-02 17:46 ` [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes John Snow
@ 2021-02-03 13:49   ` Markus Armbruster
  0 siblings, 0 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-03 13:49 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> At present, we open-code this in _make_tree itself; but if the structure
> of the tree changes, this is brittle. Use an explicit recursive call to
> _make_tree when appropriate to help keep the interior node typing
> consistent.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/introspect.py | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index 43ab4be1f77..3295a15c98e 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -30,7 +30,9 @@ def _make_tree(obj, ifcond, features, extra=None):
>      if ifcond:
>          extra['if'] = ifcond
>      if features:
> -        obj['features'] = [(f.name, {'if': f.ifcond}) for f in features]
> +        obj['features'] = [
> +            _make_tree(f.name, f.ifcond, None) for f in features
> +        ]
>      if extra:
>          return (obj, extra)
>      return obj

The commit message made me expect a patch that improves the code without
changing its behavior.  This is not the case.  We go from

    obj['features'] = [(f.name, {'if': f.ifcond}) for f in features]

to

    obj['features'] = [_make_tree(f.name, f.ifcond) for f in features]

where

    _make_tree(f.name, f.ifcond)
    = (f.name, {'if': f.ifcond})       if f.ifcond
    = f.name                           else

Differs when not f.ifcond.  Feels like an improvement.  However, the
commit message did not prepare me for it.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse
  2021-02-02 17:46 ` [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse John Snow
@ 2021-02-03 14:08   ` Markus Armbruster
  2021-02-03 20:42     ` John Snow
  0 siblings, 1 reply; 40+ messages in thread
From: Markus Armbruster @ 2021-02-03 14:08 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> _tree_to_qlit is called recursively on dict values alone; at such a
> point in generating output it is too late to apply an ifcond. Similarly,
> comments do not necessarily have a "tidy" place they can be printed in
> such a circumstance.
>
> Forbid this usage.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/introspect.py | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index 4749f65ea3c..ccdf4f1c0d0 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -43,6 +43,12 @@ def indent(level):
>          ifobj, extra = obj
>          ifcond = extra.get('if')
>          comment = extra.get('comment')
> +
> +        # NB: _tree_to_qlit is called recursively on the values of a key:value
> +        # pair; those values can't be decorated with comments or conditionals.
> +        msg = "dict values cannot have attached comments or if-conditionals."
> +        assert not suppress_first_indent, msg
> +
>          ret = ''
>          if comment:
>              ret += indent(level) + '/* %s */\n' % comment

This uses @suppress_first_indent as a proxy for "@obj is a value in a
dict".  Works, because we pass suppress_first_indent=True exactly
there.  Took me a minute to see, though.

Do you need this assertion to help mypy over the hump?

Perhaps we'd be better off with two functions, one that takes possibly
annotated @obj, and one that takes only plain @obj.  "Yes, but not now"
woule be one acceptable answer to that.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument
  2021-02-02 17:46 ` [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument John Snow
@ 2021-02-03 14:23   ` Markus Armbruster
  2021-02-03 21:21     ` John Snow
  0 siblings, 1 reply; 40+ messages in thread
From: Markus Armbruster @ 2021-02-03 14:23 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> This is only used to pass in a dictionary with a comment already set, so
> skip the runaround and just accept the comment.
>
> This works because _tree_to_qlit() treats 'if': None; 'comment': None
> exactly like absent 'if'; 'comment'.

Confusing, because the two paragraphs talk about two different things:

1. Actual arguments for @extra are either None or {'comment': comment}.
   Simplify: replace parameter @extra by parameter @comment.

2. Dumb down the return value to always be of the form

    (obj {'if': ifcond, 'comment': comment})

I suspect splitting the patch is easier than crafting a clear commit
message for the combined one.

>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/introspect.py | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index d3fbf694ad2..0aa3b77109f 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -10,6 +10,8 @@
>  See the COPYING file in the top-level directory.
>  """
>  
> +from typing import Optional
> +
>  from .common import (
>      c_name,
>      gen_endif,
> @@ -24,11 +26,11 @@
>  )
>  
>  
> -def _make_tree(obj, ifcond, extra=None):
> -    if extra is None:
> -        extra = {}
> -    if ifcond:
> -        extra['if'] = ifcond
> +def _make_tree(obj, ifcond, comment=None):
> +    extra = {
> +        'if': ifcond,
> +        'comment': comment,
> +    }
>      return (obj, extra)

Obvious way to do just 1.:

   def _make_tree(obj, ifcond, comment=None):
       extra = {}
       if ifcond:
           extra['if'] = ifcond
       if comment:
           extra['comment'] = comment

>  
>  
> @@ -174,18 +176,18 @@ def _gen_features(features):
>          return [_make_tree(f.name, f.ifcond) for f in features]
>  
>      def _gen_tree(self, name, mtype, obj, ifcond, features):
> -        extra = None
> +        comment: Optional[str] = None
>          if mtype not in ('command', 'event', 'builtin', 'array'):
>              if not self._unmask:
>                  # Output a comment to make it easy to map masked names
>                  # back to the source when reading the generated output.
> -                extra = {'comment': '"%s" = %s' % (self._name(name), name)}
> +                comment = f'"{self._name(name)}" = {name}'
>              name = self._name(name)
>          obj['name'] = name
>          obj['meta-type'] = mtype
>          if features:
>              obj['features'] = self._gen_features(features)
> -        self._trees.append(_make_tree(obj, ifcond, extra))
> +        self._trees.append(_make_tree(obj, ifcond, comment))
>  
>      def _gen_member(self, member):
>          obj = {'name': member.name, 'type': self._use_type(member.type)}



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing
  2021-02-02 17:46 ` [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing John Snow
@ 2021-02-03 14:30   ` Markus Armbruster
  2021-02-03 21:40     ` John Snow
  0 siblings, 1 reply; 40+ messages in thread
From: Markus Armbruster @ 2021-02-03 14:30 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> The types will be used in forthcoming patches to add typing. These types
> describe the layout and structure of the objects passed to
> _tree_to_qlit, but lack the power to describe annotations until the next
> commit.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/introspect.py | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index 0aa3b77109f..b82efe16f6e 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -10,7 +10,13 @@
>  See the COPYING file in the top-level directory.
>  """
>  
> -from typing import Optional
> +from typing import (
> +    Any,
> +    Dict,
> +    List,
> +    Optional,
> +    Union,
> +)
>  
>  from .common import (
>      c_name,
> @@ -26,6 +32,28 @@
>  )
>  
>  
> +# This module constructs a tree data structure that is used to
> +# generate the introspection information for QEMU. It behaves similarly
> +# to a JSON value.
> +#
> +# A complexity over JSON is that our values may or may not be annotated.
> +#
> +# Un-annotated values may be:
> +#     Scalar: str, bool, None.
> +#     Non-scalar: List, Dict
> +# _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
> +#
> +# With optional annotations, the type of all values is:
> +# TreeValue = Union[_value, Annotated[_value]]
> +#
> +# Sadly, mypy does not support recursive types, so we must approximate this.
> +_stub = Any
> +_scalar = Union[str, bool, None]
> +_nonscalar = Union[Dict[str, _stub], List[_stub]]
> +_value = Union[_scalar, _nonscalar]
> +# TreeValue = Union[_value, 'Annotated[_value]']

Why is TreeValue commented out?  Oh, because Annotated doesn't exist,
yet.

Possibly less confusing:

   # A complexity over JSON is that our values may or may not be annotated.
   #
   # Un-annotated values may be:
   #     Scalar: str, bool, None.
   #     Non-scalar: List, Dict
   # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
   #
   # With optional annotations, the type of all values is:
   # TODO
   #
   # Sadly, mypy does not support recursive types, so we must approximate this.
   _stub = Any
   _scalar = Union[str, bool, None]
   _nonscalar = Union[Dict[str, _stub], List[_stub]]
   _value = Union[_scalar, _nonscalar]

or even just

   # A complexity over JSON is that our values may or may not be annotated.
   #
   # Un-annotated values may be:
   #     Scalar: str, bool, None.
   #     Non-scalar: List, Dict
   # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
   #
   # Sadly, mypy does not support recursive types, so we must approximate this.
   _stub = Any
   _scalar = Union[str, bool, None]
   _nonscalar = Union[Dict[str, _stub], List[_stub]]
   _value = Union[_scalar, _nonscalar]

> +
> +
>  def _make_tree(obj, ifcond, comment=None):
>      extra = {
>          'if': ifcond,



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-02 17:46 ` [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure John Snow
@ 2021-02-03 14:47   ` Markus Armbruster
  2021-02-03 21:50     ` Eduardo Habkost
  2021-02-03 23:12     ` John Snow
  0 siblings, 2 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-03 14:47 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> Presently, we use a tuple to attach a dict containing annotations
> (comments and compile-time conditionals) to a tree node. This is
> undesirable because dicts are difficult to strongly type; promoting it
> to a real class allows us to name the values and types of the
> annotations we are expecting.
>
> In terms of typing, the Annotated<T> type serves as a generic container
> where the annotated node's type is preserved, allowing for greater
> specificity than we'd be able to provide without a generic.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/introspect.py | 77 ++++++++++++++++++++++----------------
>  1 file changed, 44 insertions(+), 33 deletions(-)
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index b82efe16f6e..2b90a52f016 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -13,8 +13,12 @@
>  from typing import (
>      Any,
>      Dict,
> +    Generic,
> +    Iterable,
>      List,
>      Optional,
> +    Tuple,
> +    TypeVar,
>      Union,
>  )
>  
> @@ -51,15 +55,25 @@
>  _scalar = Union[str, bool, None]
>  _nonscalar = Union[Dict[str, _stub], List[_stub]]
>  _value = Union[_scalar, _nonscalar]
> -# TreeValue = Union[_value, 'Annotated[_value]']
> +TreeValue = Union[_value, 'Annotated[_value]']

You need to quote 'Annotated[_value]', because it's a forward
reference.

Dependency cycle:

        +-----------------------------------------------+
        |                                               |
    TreeValue = Union[_value, 'Annotated[_value]']      |
                                   |                    |
        +--------------------------+                    |
        |                                               |
    Annotated(Generic[_NodeT])                          |
                         |                              |
       +-----------------+                              |
       |                                                |
    _NodeT = TypeVar('_NodeT', bound=TreeValue          |
                                         |              |
                                         +--------------+

Meh.  We'll live.

>  
> -def _make_tree(obj, ifcond, comment=None):
> -    extra = {
> -        'if': ifcond,
> -        'comment': comment,
> -    }
> -    return (obj, extra)
> +_NodeT = TypeVar('_NodeT', bound=TreeValue)

Can you teach me what NodeT is about?

> +
> +
> +class Annotated(Generic[_NodeT]):
> +    """
> +    Annotated generally contains a SchemaInfo-like type (as a dict),
> +    But it also used to wrap comments/ifconds around scalar leaf values,
> +    for the benefit of features and enums.
> +    """
> +    # Remove after 3.7 adds @dataclass:

Make this

       # TODO Remove after Python 3.7 ...

to give us a fighting chance to remember.

> +    # pylint: disable=too-few-public-methods
> +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
> +                 comment: Optional[str] = None):

Why not simply value: _value?

> +        self.value = value
> +        self.comment: Optional[str] = comment
> +        self.ifcond: Tuple[str, ...] = tuple(ifcond)
>  
>  
>  def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
> @@ -67,24 +81,20 @@ def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
>      def indent(level):
>          return level * 4 * ' '
>  
> -    if isinstance(obj, tuple):
> -        ifobj, extra = obj
> -        ifcond = extra.get('if')
> -        comment = extra.get('comment')
> -
> +    if isinstance(obj, Annotated):
>          # NB: _tree_to_qlit is called recursively on the values of a key:value
>          # pair; those values can't be decorated with comments or conditionals.
>          msg = "dict values cannot have attached comments or if-conditionals."
>          assert not suppress_first_indent, msg
>  
>          ret = ''
> -        if comment:
> -            ret += indent(level) + '/* %s */\n' % comment
> -        if ifcond:
> -            ret += gen_if(ifcond)
> -        ret += _tree_to_qlit(ifobj, level)
> -        if ifcond:
> -            ret += '\n' + gen_endif(ifcond)
> +        if obj.comment:
> +            ret += indent(level) + '/* %s */\n' % obj.comment
> +        if obj.ifcond:
> +            ret += gen_if(obj.ifcond)
> +        ret += _tree_to_qlit(obj.value, level, suppress_first_indent)

Why do you need to pass suppress_first_indent now?

> +        if obj.ifcond:
> +            ret += '\n' + gen_endif(obj.ifcond)
>          return ret
>  
>      ret = ''
> @@ -201,7 +211,7 @@ def _use_type(self, typ):
>  
>      @staticmethod
>      def _gen_features(features):
> -        return [_make_tree(f.name, f.ifcond) for f in features]
> +        return [Annotated(f.name, f.ifcond) for f in features]
>  
>      def _gen_tree(self, name, mtype, obj, ifcond, features):
>          comment: Optional[str] = None
> @@ -215,7 +225,7 @@ def _gen_tree(self, name, mtype, obj, ifcond, features):
>          obj['meta-type'] = mtype
>          if features:
>              obj['features'] = self._gen_features(features)
> -        self._trees.append(_make_tree(obj, ifcond, comment))
> +        self._trees.append(Annotated(obj, ifcond, comment))
>  
>      def _gen_member(self, member):
>          obj = {'name': member.name, 'type': self._use_type(member.type)}
> @@ -223,7 +233,7 @@ def _gen_member(self, member):
>              obj['default'] = None
>          if member.features:
>              obj['features'] = self._gen_features(member.features)
> -        return _make_tree(obj, member.ifcond)
> +        return Annotated(obj, member.ifcond)
>  
>      def _gen_variants(self, tag_name, variants):
>          return {'tag': tag_name,
> @@ -231,16 +241,17 @@ def _gen_variants(self, tag_name, variants):
>  
>      def _gen_variant(self, variant):
>          obj = {'case': variant.name, 'type': self._use_type(variant.type)}
> -        return _make_tree(obj, variant.ifcond)
> +        return Annotated(obj, variant.ifcond)
>  
>      def visit_builtin_type(self, name, info, json_type):
>          self._gen_tree(name, 'builtin', {'json-type': json_type}, [], None)
>  
>      def visit_enum_type(self, name, info, ifcond, features, members, prefix):
> -        self._gen_tree(name, 'enum',
> -                       {'values': [_make_tree(m.name, m.ifcond, None)
> -                                   for m in members]},
> -                       ifcond, features)
> +        self._gen_tree(
> +            name, 'enum',
> +            {'values': [Annotated(m.name, m.ifcond) for m in members]},
> +            ifcond, features
> +        )
>  
>      def visit_array_type(self, name, info, ifcond, element_type):
>          element = self._use_type(element_type)
> @@ -257,12 +268,12 @@ def visit_object_type_flat(self, name, info, ifcond, features,
>          self._gen_tree(name, 'object', obj, ifcond, features)
>  
>      def visit_alternate_type(self, name, info, ifcond, features, variants):
> -        self._gen_tree(name, 'alternate',
> -                       {'members': [
> -                           _make_tree({'type': self._use_type(m.type)},
> -                                      m.ifcond, None)
> -                           for m in variants.variants]},
> -                       ifcond, features)
> +        self._gen_tree(name, 'alternate', {'members': [

I think I'd prefer another line break after 'alternate', to get
{'members': align...

> +                Annotated({'type': self._use_type(m.type)}, m.ifcond)
> +                for m in variants.variants
> +            ]},

... with ]}.

> +            ifcond, features
> +        )
>  
>      def visit_command(self, name, info, ifcond, features,
>                        arg_type, ret_type, gen, success_response, boxed,



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-02 17:46 ` [PATCH v4 11/14] qapi/introspect.py: add type hint annotations John Snow
@ 2021-02-03 15:15   ` Markus Armbruster
  2021-02-03 23:27     ` John Snow
  0 siblings, 1 reply; 40+ messages in thread
From: Markus Armbruster @ 2021-02-03 15:15 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/introspect.py | 115 ++++++++++++++++++++++++++-----------
>  scripts/qapi/mypy.ini      |   5 --
>  scripts/qapi/schema.py     |   2 +-
>  3 files changed, 82 insertions(+), 40 deletions(-)
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index 60ec326d2c7..b7f2a6cf260 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -30,10 +30,19 @@
>  )
>  from .gen import QAPISchemaMonolithicCVisitor
>  from .schema import (
> +    QAPISchema,
>      QAPISchemaArrayType,
>      QAPISchemaBuiltinType,
> +    QAPISchemaEntity,
> +    QAPISchemaEnumMember,
> +    QAPISchemaFeature,
> +    QAPISchemaObjectType,
> +    QAPISchemaObjectTypeMember,
>      QAPISchemaType,
> +    QAPISchemaVariant,
> +    QAPISchemaVariants,
>  )
> +from .source import QAPISourceInfo
>  
>  
>  # This module constructs a tree data structure that is used to
> @@ -57,6 +66,8 @@
   # generate the introspection information for QEMU. It behaves similarly
   # to a JSON value.
   #
   # A complexity over JSON is that our values may or may not be annotated.
   #
   # Un-annotated values may be:
   #     Scalar: str, bool, None.
   #     Non-scalar: List, Dict
   # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
   #
   # With optional annotations, the type of all values is:
   # TreeValue = Union[_value, Annotated[_value]]
   #
   # Sadly, mypy does not support recursive types, so we must approximate this.
   _stub = Any
   _scalar = Union[str, bool, None]
   _nonscalar = Union[Dict[str, _stub], List[_stub]]
>  _value = Union[_scalar, _nonscalar]
>  TreeValue = Union[_value, 'Annotated[_value]']
>  
> +# This is a (strict) alias for an arbitrary object non-scalar, as above:
> +_DObject = Dict[str, object]

Sounds greek :)

It's almost the Dict part of _nonscalar, but not quite: object vs. Any.

I naively expect something closer to

   _scalar = ...
   _object = Dict[str, _stub]
   _nonscalar = Union[_object, List[_stub]

and (still naively) expect _object to be good enough to serve as type
annotation for dicts representing JSON objects.

[...]



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse
  2021-02-03 14:08   ` Markus Armbruster
@ 2021-02-03 20:42     ` John Snow
  2021-02-03 21:18       ` Eduardo Habkost
  2021-02-04 15:06       ` Markus Armbruster
  0 siblings, 2 replies; 40+ messages in thread
From: John Snow @ 2021-02-03 20:42 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

On 2/3/21 9:08 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> _tree_to_qlit is called recursively on dict values alone; at such a
>> point in generating output it is too late to apply an ifcond. Similarly,
>> comments do not necessarily have a "tidy" place they can be printed in
>> such a circumstance.
>>
>> Forbid this usage.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/introspect.py | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>> index 4749f65ea3c..ccdf4f1c0d0 100644
>> --- a/scripts/qapi/introspect.py
>> +++ b/scripts/qapi/introspect.py
>> @@ -43,6 +43,12 @@ def indent(level):
>>           ifobj, extra = obj
>>           ifcond = extra.get('if')
>>           comment = extra.get('comment')
>> +
>> +        # NB: _tree_to_qlit is called recursively on the values of a key:value
>> +        # pair; those values can't be decorated with comments or conditionals.
>> +        msg = "dict values cannot have attached comments or if-conditionals."
>> +        assert not suppress_first_indent, msg
>> +
>>           ret = ''
>>           if comment:
>>               ret += indent(level) + '/* %s */\n' % comment
> 
> This uses @suppress_first_indent as a proxy for "@obj is a value in a
> dict".  Works, because we pass suppress_first_indent=True exactly
> there.  Took me a minute to see, though.
> 

Yes, the link is a little tenuous; in truth, the field could be renamed 
"dict_value: bool" or so to make it more clear, at the expense of making 
the inner workings of _tree_to_qlit more opaque.

So we happen to know that only dict values want to suppress the indent; 
and the error message explains what went wrong in language 
(subjectively, again) more directly helpful to the theoretical hapless user.

(Tentatively: I'll amend the parameter name to make the relationship 
more concrete, but I expect you'll have more to say.)

> Do you need this assertion to help mypy over the hump?
> 

It was added as a result of an observation by Eduardo that by always 
generating annotation data (To make the return type from _make_tree not 
conditional) that there were cases where you could conceivably call 
_tree_to_qlit that didn't make sense; or at least we couldn't prove 
easily that it wouldn't happen.

(Of course, in practice, it does not.)

I added the assertion to call attention to the limitations of this 
existing code: passing annotations alongside dict values made no sense.

(Or maybe made no sense.)

Conceptually it makes sense that some keys of a dict might be 
conditionally compiled out, but in terms of the actual data structures 
we use to convey this information, we don't actually use dicts to 
represent keys like that ... we use a list, actually.

(See visit_object_type_flat)

Anyway, this was an attempt to clear up that misunderstanding for 
reviewers less familiar with these structures, and to guard against 
future code in particular that may misunderstand it.

It isn't (to my recollection) necessary for mypy. If you want to remove 
it, I think I'd like Eduardo to sign off on that matter.

(We both found this code very, very confusing to read and modify. For 
whatever reason, I still find it fairly hard to reason about clearly.)

> Perhaps we'd be better off with two functions, one that takes possibly
> annotated @obj, and one that takes only plain @obj.  "Yes, but not now"
> woule be one acceptable answer to that.
> 

Yes, I tried to prototype this a few times but found that it quickly 
touched too many things and I was losing appetite for re-spins. Recent 
reviews urged a focus on "typing what we have, where possible" before 
making alterations. The debate between "fix-then-type" or 
"type-then-fix" is valid, but largely intractable.

Since my only immediate goal was "Get everything typed", the 
"type-then-fix" approach has the side-effect of dropping improvements 
that aren't strictly needed whenever possible.

LONG STORY SHORT: Yes, I think that design would be better overall, but 
I think it should wait for later. In particular, if you embark upon your 
more radical rewrite of introspection, it could just be handled at that 
time.

(My careful separation of scalars vs non-scalars in the typing comment 
later in this series is an artifact of the time spent playing around 
with splitting this function out into two mutually recursive functions, 
but found it was too noisy in an already long-challenged series.)

--js



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse
  2021-02-03 20:42     ` John Snow
@ 2021-02-03 21:18       ` Eduardo Habkost
  2021-02-04 15:06       ` Markus Armbruster
  1 sibling, 0 replies; 40+ messages in thread
From: Eduardo Habkost @ 2021-02-03 21:18 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, Markus Armbruster, qemu-devel

On Wed, Feb 03, 2021 at 03:42:54PM -0500, John Snow wrote:
> On 2/3/21 9:08 AM, Markus Armbruster wrote:
> > John Snow <jsnow@redhat.com> writes:
> > 
> > > _tree_to_qlit is called recursively on dict values alone; at such a
> > > point in generating output it is too late to apply an ifcond. Similarly,
> > > comments do not necessarily have a "tidy" place they can be printed in
> > > such a circumstance.
> > > 
> > > Forbid this usage.
> > > 
> > > Signed-off-by: John Snow <jsnow@redhat.com>
> > > ---
> > >   scripts/qapi/introspect.py | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> > > index 4749f65ea3c..ccdf4f1c0d0 100644
> > > --- a/scripts/qapi/introspect.py
> > > +++ b/scripts/qapi/introspect.py
> > > @@ -43,6 +43,12 @@ def indent(level):
> > >           ifobj, extra = obj
> > >           ifcond = extra.get('if')
> > >           comment = extra.get('comment')
> > > +
> > > +        # NB: _tree_to_qlit is called recursively on the values of a key:value
> > > +        # pair; those values can't be decorated with comments or conditionals.
> > > +        msg = "dict values cannot have attached comments or if-conditionals."
> > > +        assert not suppress_first_indent, msg
> > > +
> > >           ret = ''
> > >           if comment:
> > >               ret += indent(level) + '/* %s */\n' % comment
> > 
> > This uses @suppress_first_indent as a proxy for "@obj is a value in a
> > dict".  Works, because we pass suppress_first_indent=True exactly
> > there.  Took me a minute to see, though.
> > 
[...]
> Anyway, this was an attempt to clear up that misunderstanding for reviewers
> less familiar with these structures, and to guard against future code in
> particular that may misunderstand it.
> 
> It isn't (to my recollection) necessary for mypy. If you want to remove it,
> I think I'd like Eduardo to sign off on that matter.

Thanks for taking it into consideration.

I like having extra comments and extra asserts on cases like this
one.  It helps us see mistakes more easily, and to identify
future opportunities for cleaning up the code.

But I don't want to delay important work because of details that
don't seem to make the code objectively worse.  So I won't argue
about this.

-- 
Eduardo



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument
  2021-02-03 14:23   ` Markus Armbruster
@ 2021-02-03 21:21     ` John Snow
  2021-02-04  8:37       ` Markus Armbruster
  0 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-03 21:21 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

On 2/3/21 9:23 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> This is only used to pass in a dictionary with a comment already set, so
>> skip the runaround and just accept the comment.
>>
>> This works because _tree_to_qlit() treats 'if': None; 'comment': None
>> exactly like absent 'if'; 'comment'.
> 
> Confusing, because the two paragraphs talk about two different things:
> 
> 1. Actual arguments for @extra are either None or {'comment': comment}.
>     Simplify: replace parameter @extra by parameter @comment.
> 
> 2. Dumb down the return value to always be of the form
> 
>      (obj {'if': ifcond, 'comment': comment})
> 

I think you are drawing attention to the fact that 'if' and 'comment' 
are now always present in this dict instead of conditionally present.

(else, I have misread you. (I think you are missing a comma.))

> I suspect splitting the patch is easier than crafting a clear commit
> message for the combined one.
> 

I wouldn't have considered to break out such a small change into two 
even smaller changes, but as you are in charge here ... Okey Dokey.

(meta-tangent: [1])

>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/introspect.py | 18 ++++++++++--------
>>   1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>> index d3fbf694ad2..0aa3b77109f 100644
>> --- a/scripts/qapi/introspect.py
>> +++ b/scripts/qapi/introspect.py
>> @@ -10,6 +10,8 @@
>>   See the COPYING file in the top-level directory.
>>   """
>>   
>> +from typing import Optional
>> +
>>   from .common import (
>>       c_name,
>>       gen_endif,
>> @@ -24,11 +26,11 @@
>>   )
>>   
>>   
>> -def _make_tree(obj, ifcond, extra=None):
>> -    if extra is None:
>> -        extra = {}
>> -    if ifcond:
>> -        extra['if'] = ifcond
>> +def _make_tree(obj, ifcond, comment=None):
>> +    extra = {
>> +        'if': ifcond,
>> +        'comment': comment,
>> +    }
>>       return (obj, extra)
> 
> Obvious way to do just 1.:
> 
>     def _make_tree(obj, ifcond, comment=None):
>         extra = {}
>         if ifcond:
>             extra['if'] = ifcond
>         if comment:
>             extra['comment'] = comment
> 

OK.

>>   
>>   
>> @@ -174,18 +176,18 @@ def _gen_features(features):
>>           return [_make_tree(f.name, f.ifcond) for f in features]
>>   
>>       def _gen_tree(self, name, mtype, obj, ifcond, features):
>> -        extra = None
>> +        comment: Optional[str] = None
>>           if mtype not in ('command', 'event', 'builtin', 'array'):
>>               if not self._unmask:
>>                   # Output a comment to make it easy to map masked names
>>                   # back to the source when reading the generated output.
>> -                extra = {'comment': '"%s" = %s' % (self._name(name), name)}
>> +                comment = f'"{self._name(name)}" = {name}'
>>               name = self._name(name)
>>           obj['name'] = name
>>           obj['meta-type'] = mtype
>>           if features:
>>               obj['features'] = self._gen_features(features)
>> -        self._trees.append(_make_tree(obj, ifcond, extra))
>> +        self._trees.append(_make_tree(obj, ifcond, comment))
>>   
>>       def _gen_member(self, member):
>>           obj = {'name': member.name, 'type': self._use_type(member.type)}


[1] As a matter of process, I sometimes find it cumbersome to 
intentionally engineer an intermediary state when I jumped straight from 
A->C in my actual editing.

I will usually keep such intermediary forms when they come about 
naturally in the course of development, but rarely seek to add them 
artificially -- it feels like a major bummer to engineer, test, and 
scrutinize code that's only bound to be deleted immediately after. 
Sometimes, it feels like a waste of reviewer effort, too.

It's been years and I still don't think I have any real intuitive sense 
for this, which is ...unfortunate.

--js



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing
  2021-02-03 14:30   ` Markus Armbruster
@ 2021-02-03 21:40     ` John Snow
  0 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-03 21:40 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

On 2/3/21 9:30 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> The types will be used in forthcoming patches to add typing. These types
>> describe the layout and structure of the objects passed to
>> _tree_to_qlit, but lack the power to describe annotations until the next
>> commit.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/introspect.py | 30 +++++++++++++++++++++++++++++-
>>   1 file changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>> index 0aa3b77109f..b82efe16f6e 100644
>> --- a/scripts/qapi/introspect.py
>> +++ b/scripts/qapi/introspect.py
>> @@ -10,7 +10,13 @@
>>   See the COPYING file in the top-level directory.
>>   """
>>   
>> -from typing import Optional
>> +from typing import (
>> +    Any,
>> +    Dict,
>> +    List,
>> +    Optional,
>> +    Union,
>> +)
>>   
>>   from .common import (
>>       c_name,
>> @@ -26,6 +32,28 @@
>>   )
>>   
>>   
>> +# This module constructs a tree data structure that is used to
>> +# generate the introspection information for QEMU. It behaves similarly
>> +# to a JSON value.
>> +#
>> +# A complexity over JSON is that our values may or may not be annotated.
>> +#
>> +# Un-annotated values may be:
>> +#     Scalar: str, bool, None.
>> +#     Non-scalar: List, Dict
>> +# _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>> +#
>> +# With optional annotations, the type of all values is:
>> +# TreeValue = Union[_value, Annotated[_value]]
>> +#
>> +# Sadly, mypy does not support recursive types, so we must approximate this.
>> +_stub = Any
>> +_scalar = Union[str, bool, None]
>> +_nonscalar = Union[Dict[str, _stub], List[_stub]]
>> +_value = Union[_scalar, _nonscalar]
>> +# TreeValue = Union[_value, 'Annotated[_value]']
> 
> Why is TreeValue commented out?  Oh, because Annotated doesn't exist,
> yet.
> 

In the comment region specifically, the intent was to give a standalone 
grammar of the structure without regard to implementation limitations. 
It is the "real" type of the tree.

i.e., the grammar is complete and accurate, but abstract.

 From the commit message: "These types describe the layout and structure 
of the objects passed to _tree_to_qlit, but lack the power to describe 
annotations until the next commit."

The second occurrence of that type, commented out, could be removed -- 
see below.

> Possibly less confusing:
> 
>     # A complexity over JSON is that our values may or may not be annotated.
>     #
>     # Un-annotated values may be:
>     #     Scalar: str, bool, None.
>     #     Non-scalar: List, Dict
>     # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>     #
>     # With optional annotations, the type of all values is:
>     # TODO

I'd actually prefer to keep that one in;

>     #
>     # Sadly, mypy does not support recursive types, so we must approximate this.
>     _stub = Any
>     _scalar = Union[str, bool, None]
>     _nonscalar = Union[Dict[str, _stub], List[_stub]]
>     _value = Union[_scalar, _nonscalar]

and augment it here instead, with:

# _TreeValue = TODO - defined in a forthcoming commit.

--js

> 
> or even just
> 
>     # A complexity over JSON is that our values may or may not be annotated.
>     #
>     # Un-annotated values may be:
>     #     Scalar: str, bool, None.
>     #     Non-scalar: List, Dict
>     # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>     #
>     # Sadly, mypy does not support recursive types, so we must approximate this.
>     _stub = Any
>     _scalar = Union[str, bool, None]
>     _nonscalar = Union[Dict[str, _stub], List[_stub]]
>     _value = Union[_scalar, _nonscalar]
> 
>> +
>> +
>>   def _make_tree(obj, ifcond, comment=None):
>>       extra = {
>>           'if': ifcond,



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-03 14:47   ` Markus Armbruster
@ 2021-02-03 21:50     ` Eduardo Habkost
  2021-02-04 15:37       ` Markus Armbruster
  2021-02-03 23:12     ` John Snow
  1 sibling, 1 reply; 40+ messages in thread
From: Eduardo Habkost @ 2021-02-03 21:50 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, John Snow, qemu-devel, Cleber Rosa

On Wed, Feb 03, 2021 at 03:47:36PM +0100, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
> > Presently, we use a tuple to attach a dict containing annotations
> > (comments and compile-time conditionals) to a tree node. This is
> > undesirable because dicts are difficult to strongly type; promoting it
> > to a real class allows us to name the values and types of the
> > annotations we are expecting.
> >
> > In terms of typing, the Annotated<T> type serves as a generic container
> > where the annotated node's type is preserved, allowing for greater
> > specificity than we'd be able to provide without a generic.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
[...]
> > +class Annotated(Generic[_NodeT]):
> > +    """
> > +    Annotated generally contains a SchemaInfo-like type (as a dict),
> > +    But it also used to wrap comments/ifconds around scalar leaf values,
> > +    for the benefit of features and enums.
> > +    """
> > +    # Remove after 3.7 adds @dataclass:
> 
> Make this
> 
>        # TODO Remove after Python 3.7 ...
> 
> to give us a fighting chance to remember.
> 
> > +    # pylint: disable=too-few-public-methods
> > +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
> > +                 comment: Optional[str] = None):
> 
> Why not simply value: _value?

Example:
  x = C(1)
  y: C[int]
  y = C('x')  # mistake

Declaring value as _NodeT does:
- Make the inferred type of x be Annotated[int].
- Catch the mistake above.

-- 
Eduardo



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-03 14:47   ` Markus Armbruster
  2021-02-03 21:50     ` Eduardo Habkost
@ 2021-02-03 23:12     ` John Snow
  2021-02-05  9:10       ` Markus Armbruster
  1 sibling, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-03 23:12 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

On 2/3/21 9:47 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> Presently, we use a tuple to attach a dict containing annotations
>> (comments and compile-time conditionals) to a tree node. This is
>> undesirable because dicts are difficult to strongly type; promoting it
>> to a real class allows us to name the values and types of the
>> annotations we are expecting.
>>
>> In terms of typing, the Annotated<T> type serves as a generic container
>> where the annotated node's type is preserved, allowing for greater
>> specificity than we'd be able to provide without a generic.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/introspect.py | 77 ++++++++++++++++++++++----------------
>>   1 file changed, 44 insertions(+), 33 deletions(-)
>>
>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>> index b82efe16f6e..2b90a52f016 100644
>> --- a/scripts/qapi/introspect.py
>> +++ b/scripts/qapi/introspect.py
>> @@ -13,8 +13,12 @@
>>   from typing import (
>>       Any,
>>       Dict,
>> +    Generic,
>> +    Iterable,
>>       List,
>>       Optional,
>> +    Tuple,
>> +    TypeVar,
>>       Union,
>>   )
>>   
>> @@ -51,15 +55,25 @@
>>   _scalar = Union[str, bool, None]
>>   _nonscalar = Union[Dict[str, _stub], List[_stub]]
>>   _value = Union[_scalar, _nonscalar]
>> -# TreeValue = Union[_value, 'Annotated[_value]']
>> +TreeValue = Union[_value, 'Annotated[_value]']
> 
> You need to quote 'Annotated[_value]', because it's a forward
> reference.
> 
> Dependency cycle:
> 
>          +-----------------------------------------------+
>          |                                               |
>      TreeValue = Union[_value, 'Annotated[_value]']      |
>                                     |                    |
>          +--------------------------+                    |
>          |                                               |
>      Annotated(Generic[_NodeT])                          |
>                           |                              |
>         +-----------------+                              |
>         |                                                |
>      _NodeT = TypeVar('_NodeT', bound=TreeValue          |
>                                           |              |
>                                           +--------------+
> 
> Meh.  We'll live.
> 

Python 3.10 (!) will fix this sort of thing. It fixes it in a funny way, 
though: all annotations are treated as delayed-evaluation strings. :)

See https://www.python.org/dev/peps/pep-0563/ which now becomes the 
*default* behavior.

We do not gain access to this behavior at all, because it is exclusive 
to 3.7+. Boo.

For now, (and for the foreseeable future of the QEMU project, as Python 
3.7 support will not be available to us for many, many, many more years 
[1]) forward references in the global scope need to be quoted.

>>   
>> -def _make_tree(obj, ifcond, comment=None):
>> -    extra = {
>> -        'if': ifcond,
>> -        'comment': comment,
>> -    }
>> -    return (obj, extra)
>> +_NodeT = TypeVar('_NodeT', bound=TreeValue)
> 
> Can you teach me what NodeT is about?
> 

It's a TypeVar. If you're a TAPL sort of person, canonically you use 
T,U,V and so on. Relevant tutorial section for mypy is here: 
https://mypy.readthedocs.io/en/stable/generics.html

(Yes, a weird thing about Generics in Python is that you have to declare 
them. No, I don't know why. Yes, it's weird.)

I have bound it to TreeValue, indicating that this type variable may 
only *ever* represent something that isa TreeValue. Because of this, I 
use "NodeT" to indicate that it's a Generic type T, but with some 
constraint.

(Though, it really should have been bound to _value instead ... my 
mistake. I shouldn't allow for nested annotations ...!)

Using it allows me to define the Annotated class below in terms of some 
input parameter instead of a fixed, opaque type.

>> +
>> +
>> +class Annotated(Generic[_NodeT]):

Annotated here is defined as Annotated[T], but T is our special _NodeT, 
so Annotated may only hold other TreeValues*.

(* Again, as per above, this is an oopsie.)

>> +    """
>> +    Annotated generally contains a SchemaInfo-like type (as a dict),
>> +    But it also used to wrap comments/ifconds around scalar leaf values,
>> +    for the benefit of features and enums.
>> +    """
>> +    # Remove after 3.7 adds @dataclass:
> 
> Make this
> 
>         # TODO Remove after Python 3.7 ...
> 
> to give us a fighting chance to remember.
> 

Having done a lot of the python 2 excision work, I only ever grepped for 
e.g. "3.7" or "sys.version". I was targeting that.

Adding TODO seems fine enough to do, and I'm here anyway.

>> +    # pylint: disable=too-few-public-methods
>> +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
>> +                 comment: Optional[str] = None):
> 
> Why not simply value: _value?
> 

This is what connects back with the Generic instantiation of this 
annotation. `class Annotated(Generic[_NodeT])` says that this class is a 
higher-kinded type parameterized on ... something. We don't know yet.

In the initializer, we use the TypeVar to describe which argument 
determines that parameterization.

>> +        self.value = value

That parameter flows down and connects to this property. Thus, this 
field actually has a dynamic type that will track the original type used 
to instantiate it. If it was a Dict, this will also be a Dict.

(Of course, it's limited to what mypy knows about the constraint of that 
value when passed. It isn't interrogated at runtime.)

This is a way to achieve dynamism in container-style classes without 
losing precision in data types. It allows us to declare and keep a more 
specific "inner type" that survives the boxing and unboxing process.

e.g. Annotated[Dict[str, object]] and Annotated[str] are two different 
types, and can be differentiated by mypy.

(*cough*, and yes, if you have a function that takes Annotated[Any], you 
lose that precision at that point. So, we aren't taking full advantage 
of that power here in introspect.py, but it still seemed like the 
"right" way to type something like this.)

>> +        self.comment: Optional[str] = comment
>> +        self.ifcond: Tuple[str, ...] = tuple(ifcond)
>>   
>>   
>>   def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
>> @@ -67,24 +81,20 @@ def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
>>       def indent(level):
>>           return level * 4 * ' '
>>   
>> -    if isinstance(obj, tuple):
>> -        ifobj, extra = obj
>> -        ifcond = extra.get('if')
>> -        comment = extra.get('comment')
>> -
>> +    if isinstance(obj, Annotated):
>>           # NB: _tree_to_qlit is called recursively on the values of a key:value
>>           # pair; those values can't be decorated with comments or conditionals.
>>           msg = "dict values cannot have attached comments or if-conditionals."
>>           assert not suppress_first_indent, msg
>>   
>>           ret = ''
>> -        if comment:
>> -            ret += indent(level) + '/* %s */\n' % comment
>> -        if ifcond:
>> -            ret += gen_if(ifcond)
>> -        ret += _tree_to_qlit(ifobj, level)
>> -        if ifcond:
>> -            ret += '\n' + gen_endif(ifcond)
>> +        if obj.comment:
>> +            ret += indent(level) + '/* %s */\n' % obj.comment
>> +        if obj.ifcond:
>> +            ret += gen_if(obj.ifcond)
>> +        ret += _tree_to_qlit(obj.value, level, suppress_first_indent)
> 
> Why do you need to pass suppress_first_indent now?
> 

UH, technically I guess I don't. Holdover from when that was maybe not 
clear.

>> +        if obj.ifcond:
>> +            ret += '\n' + gen_endif(obj.ifcond)
>>           return ret
>>   
>>       ret = ''
>> @@ -201,7 +211,7 @@ def _use_type(self, typ):
>>   
>>       @staticmethod
>>       def _gen_features(features):
>> -        return [_make_tree(f.name, f.ifcond) for f in features]
>> +        return [Annotated(f.name, f.ifcond) for f in features]
>>   
>>       def _gen_tree(self, name, mtype, obj, ifcond, features):
>>           comment: Optional[str] = None
>> @@ -215,7 +225,7 @@ def _gen_tree(self, name, mtype, obj, ifcond, features):
>>           obj['meta-type'] = mtype
>>           if features:
>>               obj['features'] = self._gen_features(features)
>> -        self._trees.append(_make_tree(obj, ifcond, comment))
>> +        self._trees.append(Annotated(obj, ifcond, comment))
>>   
>>       def _gen_member(self, member):
>>           obj = {'name': member.name, 'type': self._use_type(member.type)}
>> @@ -223,7 +233,7 @@ def _gen_member(self, member):
>>               obj['default'] = None
>>           if member.features:
>>               obj['features'] = self._gen_features(member.features)
>> -        return _make_tree(obj, member.ifcond)
>> +        return Annotated(obj, member.ifcond)
>>   
>>       def _gen_variants(self, tag_name, variants):
>>           return {'tag': tag_name,
>> @@ -231,16 +241,17 @@ def _gen_variants(self, tag_name, variants):
>>   
>>       def _gen_variant(self, variant):
>>           obj = {'case': variant.name, 'type': self._use_type(variant.type)}
>> -        return _make_tree(obj, variant.ifcond)
>> +        return Annotated(obj, variant.ifcond)
>>   
>>       def visit_builtin_type(self, name, info, json_type):
>>           self._gen_tree(name, 'builtin', {'json-type': json_type}, [], None)
>>   
>>       def visit_enum_type(self, name, info, ifcond, features, members, prefix):
>> -        self._gen_tree(name, 'enum',
>> -                       {'values': [_make_tree(m.name, m.ifcond, None)
>> -                                   for m in members]},
>> -                       ifcond, features)
>> +        self._gen_tree(
>> +            name, 'enum',
>> +            {'values': [Annotated(m.name, m.ifcond) for m in members]},
>> +            ifcond, features
>> +        )
>>   
>>       def visit_array_type(self, name, info, ifcond, element_type):
>>           element = self._use_type(element_type)
>> @@ -257,12 +268,12 @@ def visit_object_type_flat(self, name, info, ifcond, features,
>>           self._gen_tree(name, 'object', obj, ifcond, features)
>>   
>>       def visit_alternate_type(self, name, info, ifcond, features, variants):
>> -        self._gen_tree(name, 'alternate',
>> -                       {'members': [
>> -                           _make_tree({'type': self._use_type(m.type)},
>> -                                      m.ifcond, None)
>> -                           for m in variants.variants]},
>> -                       ifcond, features)
>> +        self._gen_tree(name, 'alternate', {'members': [
> 
> I think I'd prefer another line break after 'alternate', to get
> {'members': align...
> 
>> +                Annotated({'type': self._use_type(m.type)}, m.ifcond)
>> +                for m in variants.variants
>> +            ]},
> 
> ... with ]}.
> 

How's about

```
self._gen_tree(
     name, 'alternate',
     {'members': [Annotated({'type': self._use_type(m.type)}, m.ifcond)
                  for m in variants.variants]},
     ifcond, features
)
```

(It'd look cooler if I re-arranged the other short parameters first, or 
I could create a temporary local. In-line objects with in-line 
comprehensions are bound to look sorta ugly. I'm going with what I wrote 
above for now, though.)

>> +            ifcond, features
>> +        )
>>   
>>       def visit_command(self, name, info, ifcond, features,
>>                         arg_type, ret_type, gen, success_response, boxed,

[1] Python 3.6 EOL is this December, but OpenSuSE Leap 15.2 was released 
2020-07-02 and only offers Python 3.6. Ouch! It is not clear if they 
will add support for Python 3.7 at any point, but otherwise we are stuck 
supporting 15.2 for two years after the next OpenSUSE is released, which 
hasn't happened yet. Ouch!!! So we don't even have a date on the tin for 
when we might conceivably inch up our requirements again.

I think the situation for RHEL and Debian is also sad, IIRC.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-03 15:15   ` Markus Armbruster
@ 2021-02-03 23:27     ` John Snow
  2021-02-05 13:42       ` Markus Armbruster
  0 siblings, 1 reply; 40+ messages in thread
From: John Snow @ 2021-02-03 23:27 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

On 2/3/21 10:15 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/introspect.py | 115 ++++++++++++++++++++++++++-----------
>>   scripts/qapi/mypy.ini      |   5 --
>>   scripts/qapi/schema.py     |   2 +-
>>   3 files changed, 82 insertions(+), 40 deletions(-)
>>
>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>> index 60ec326d2c7..b7f2a6cf260 100644
>> --- a/scripts/qapi/introspect.py
>> +++ b/scripts/qapi/introspect.py
>> @@ -30,10 +30,19 @@
>>   )
>>   from .gen import QAPISchemaMonolithicCVisitor
>>   from .schema import (
>> +    QAPISchema,
>>       QAPISchemaArrayType,
>>       QAPISchemaBuiltinType,
>> +    QAPISchemaEntity,
>> +    QAPISchemaEnumMember,
>> +    QAPISchemaFeature,
>> +    QAPISchemaObjectType,
>> +    QAPISchemaObjectTypeMember,
>>       QAPISchemaType,
>> +    QAPISchemaVariant,
>> +    QAPISchemaVariants,
>>   )
>> +from .source import QAPISourceInfo
>>   
>>   
>>   # This module constructs a tree data structure that is used to
>> @@ -57,6 +66,8 @@
>     # generate the introspection information for QEMU. It behaves similarly
>     # to a JSON value.
>     #
>     # A complexity over JSON is that our values may or may not be annotated.
>     #
>     # Un-annotated values may be:
>     #     Scalar: str, bool, None.
>     #     Non-scalar: List, Dict
>     # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>     #
>     # With optional annotations, the type of all values is:
>     # TreeValue = Union[_value, Annotated[_value]]
>     #
>     # Sadly, mypy does not support recursive types, so we must approximate this.
>     _stub = Any
>     _scalar = Union[str, bool, None]
>     _nonscalar = Union[Dict[str, _stub], List[_stub]]
>>   _value = Union[_scalar, _nonscalar]
>>   TreeValue = Union[_value, 'Annotated[_value]']
>>   
>> +# This is a (strict) alias for an arbitrary object non-scalar, as above:
>> +_DObject = Dict[str, object]
> 
> Sounds greek :)
> 

Admittedly it is still not explained well ... until the next patch. I'm 
going to leave it alone for now until you have a chance to respond to 
these walls of text.

> It's almost the Dict part of _nonscalar, but not quite: object vs. Any.
> 
> I naively expect something closer to
> 
>     _scalar = ...
>     _object = Dict[str, _stub]
>     _nonscalar = Union[_object, List[_stub]
> 
> and (still naively) expect _object to be good enough to serve as type
> annotation for dicts representing JSON objects.

"_object" would be good, except ... I am trying to avoid using that word 
because what does it mean? Python object? JSON object? Here at the 
boundary between two worlds, nothing makes sense.

(See patch 12/14 for A More Betterer Understanding of what _DObject is 
used for. It will contribute to A Greater Understanding.)

Anyway, to your questions;

(1) _DObject was my shorthand garbage way of saying "This is a Python 
Dict that represents a JSON object". Hence Dict-Object, "DObject". I 
have also derisively called this a "dictly-typed" object at times.

(2) Dict[str, Any] and Dict[str, object] are similar, but do have a 
semantic difference. I alluded to it by calling this a "(strict) alias"; 
which does not help you understand any of the following points:

Whenever you use "Any", it basically turns off type-checking at that 
boundary; it is the gradually typed boundary type. Avoid it whenever 
reasonably possible.

Example time:


def foo(thing: Any) -> None:
     print(thing.value)  # Sure, I guess! We'll believe you.


def foo(thing: object) -> None:
     print(thing.value)  # BZZT, Python object has no value prop.


Use "Any" when you really just cannot constrain the type, because you're 
out of bourbon or you've decided to give in to the darkness inside your 
heart.

Use "object" when the type of the value actually doesn't matter, because 
you are only passing it on to something else later that will do its own 
type analysis or introspection on the object.

For introspect.py, 'object' is actually a really good type when we can 
use it, because we interrogate the type exhaustively upon receipt in 
_tree_to_qlit.


That leaves one question you would almost assuredly ask as a followup:

"Why didn't you use object for the stub type to begin with?"

Let's say we define _stub as `object` instead, the Python object. When 
_tree_to_qlit recurses on non-scalar structures, the held value there is 
only known as "object" and not as str/bool/None, which causes a typing 
error at that point.

Moving the stub to "Any" tells mypy to ... not worry about what type we 
actually passed here. I gave in to the darkness in my heart. It's just 
too annoying without real recursion.

--js



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument
  2021-02-03 21:21     ` John Snow
@ 2021-02-04  8:37       ` Markus Armbruster
  0 siblings, 0 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-04  8:37 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, qemu-devel, Eduardo Habkost, Cleber Rosa

John Snow <jsnow@redhat.com> writes:

> On 2/3/21 9:23 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> This is only used to pass in a dictionary with a comment already set, so
>>> skip the runaround and just accept the comment.
>>>
>>> This works because _tree_to_qlit() treats 'if': None; 'comment': None
>>> exactly like absent 'if'; 'comment'.
>> Confusing, because the two paragraphs talk about two different
>> things:
>> 1. Actual arguments for @extra are either None or {'comment':
>> comment}.
>>     Simplify: replace parameter @extra by parameter @comment.
>> 2. Dumb down the return value to always be of the form
>>      (obj {'if': ifcond, 'comment': comment})
>> 
>
> I think you are drawing attention to the fact that 'if' and 'comment'
> are now always present in this dict instead of conditionally present.

Correct.

> (else, I have misread you. (I think you are missing a comma.))

I am!  I meant to write

    (obj, {'if': ifcond, 'comment': comment})

>> I suspect splitting the patch is easier than crafting a clear commit
>> message for the combined one.
>> 
>
> I wouldn't have considered to break out such a small change into two
> even smaller changes, but as you are in charge here ... Okey Dokey.
>
> (meta-tangent: [1])
[...]
> [1] As a matter of process, I sometimes find it cumbersome to
> intentionally engineer an intermediary state when I jumped straight
> from A->C in my actual editing.

Yes, the extra work can be cumbersome.  But then writing a neat commit
message for a commit that does two things can also be cumbersome.
"Split and write two straightforward commit messages" has proven easier
for me many times.

> I will usually keep such intermediary forms when they come about
> naturally in the course of development, but rarely seek to add them 
> artificially -- it feels like a major bummer to engineer, test, and
> scrutinize code that's only bound to be deleted immediately after. 
> Sometimes, it feels like a waste of reviewer effort, too.

It depends.  Sometimes "don't split and write a complicated commit
message" is easier.

Which way you get to "commit message(s) don't confuse Markus" in this
particular case is up to you :)

> It's been years and I still don't think I have any real intuitive
> sense for this, which is ...unfortunate.

It's been years, and my intuition still evolves.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse
  2021-02-03 20:42     ` John Snow
  2021-02-03 21:18       ` Eduardo Habkost
@ 2021-02-04 15:06       ` Markus Armbruster
  1 sibling, 0 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-04 15:06 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, qemu-devel, Eduardo Habkost, Cleber Rosa

John Snow <jsnow@redhat.com> writes:

> On 2/3/21 9:08 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> _tree_to_qlit is called recursively on dict values alone; at such a
>>> point in generating output it is too late to apply an ifcond. Similarly,
>>> comments do not necessarily have a "tidy" place they can be printed in
>>> such a circumstance.
>>>
>>> Forbid this usage.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   scripts/qapi/introspect.py | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>>> index 4749f65ea3c..ccdf4f1c0d0 100644
>>> --- a/scripts/qapi/introspect.py
>>> +++ b/scripts/qapi/introspect.py
>>> @@ -43,6 +43,12 @@ def indent(level):
>>>           ifobj, extra = obj
>>>           ifcond = extra.get('if')
>>>           comment = extra.get('comment')
>>> +
>>> +        # NB: _tree_to_qlit is called recursively on the values of a key:value
>>> +        # pair; those values can't be decorated with comments or conditionals.
>>> +        msg = "dict values cannot have attached comments or if-conditionals."
>>> +        assert not suppress_first_indent, msg
>>> +
>>>           ret = ''
>>>           if comment:
>>>               ret += indent(level) + '/* %s */\n' % comment
>> This uses @suppress_first_indent as a proxy for "@obj is a value in
>> a
>> dict".  Works, because we pass suppress_first_indent=True exactly
>> there.  Took me a minute to see, though.
>> 
>
> Yes, the link is a little tenuous; in truth, the field could be
> renamed "dict_value: bool" or so to make it more clear, at the expense
> of making the inner workings of _tree_to_qlit more opaque.
>
> So we happen to know that only dict values want to suppress the
> indent; and the error message explains what went wrong in language 
> (subjectively, again) more directly helpful to the theoretical hapless user.
>
> (Tentatively: I'll amend the parameter name to make the relationship
> more concrete, but I expect you'll have more to say.)
>
>> Do you need this assertion to help mypy over the hump?
>> 
>
> It was added as a result of an observation by Eduardo that by always
> generating annotation data (To make the return type from _make_tree
> not conditional) that there were cases where you could conceivably
> call _tree_to_qlit that didn't make sense; or at least we couldn't
> prove easily that it wouldn't happen.
>
> (Of course, in practice, it does not.)
>
> I added the assertion to call attention to the limitations of this
> existing code: passing annotations alongside dict values made no
> sense.
>
> (Or maybe made no sense.)
>
> Conceptually it makes sense that some keys of a dict might be
> conditionally compiled out, but in terms of the actual data structures 
> we use to convey this information, we don't actually use dicts to
> represent keys like that ... we use a list, actually.
>
> (See visit_object_type_flat)
>
> Anyway, this was an attempt to clear up that misunderstanding for
> reviewers less familiar with these structures, and to guard against 
> future code in particular that may misunderstand it.

I doubt the guard is needed for code.  This stuff hasn't changed in a
long time.  JSON is set in stone.  If we ever need extras beyond ifcond
and comment (unlikely), we can stuff them in just like ifcond and
comment.

I accept readers and reviewers may find it useful.

> It isn't (to my recollection) necessary for mypy. If you want to
> remove it, I think I'd like Eduardo to sign off on that matter.
>
> (We both found this code very, very confusing to read and modify. For
> whatever reason, I still find it fairly hard to reason about clearly.)

Sorry about that.  If I had known how much trouble the cheap and
somewhat hacky extension of the QLit-generating code for ifcond (commit
d626b6c1ae7) would give you[*], I would've nacked it.

>> Perhaps we'd be better off with two functions, one that takes possibly
>> annotated @obj, and one that takes only plain @obj.  "Yes, but not now"
>> woule be one acceptable answer to that.
>> 
>
> Yes, I tried to prototype this a few times but found that it quickly
> touched too many things and I was losing appetite for re-spins. Recent 
> reviews urged a focus on "typing what we have, where possible" before
> making alterations. The debate between "fix-then-type" or 
> "type-then-fix" is valid, but largely intractable.

Yes, we need to focus, and resist mission creep.

When review uncovers improvement opportunities, we need to decide
whether to pursue within the series, in a follow-up series, or
"later"[**].

This should *not* stop us from pointing out improvement opportunities!

With the benefit of hindsight: I wish we had kicked this one down the
road some.  Sunk cost, though.

> Since my only immediate goal was "Get everything typed", the
> "type-then-fix" approach has the side-effect of dropping improvements 
> that aren't strictly needed whenever possible.
>
> LONG STORY SHORT: Yes, I think that design would be better overall,
> but I think it should wait for later. In particular, if you embark
> upon your more radical rewrite of introspection, it could just be
> handled at that time.

Got it.

> (My careful separation of scalars vs non-scalars in the typing comment
> later in this series is an artifact of the time spent playing around 
> with splitting this function out into two mutually recursive
> functions, but found it was too noisy in an already long-challenged
> series.)
>
> --js


[*] And then indirectly me, to be honest.

[**] Which may well mean "never" in practice.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-03 21:50     ` Eduardo Habkost
@ 2021-02-04 15:37       ` Markus Armbruster
  2021-02-04 16:20         ` John Snow
  2021-02-04 16:28         ` Eduardo Habkost
  0 siblings, 2 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-04 15:37 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Michael Roth, John Snow, qemu-devel, Cleber Rosa

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Wed, Feb 03, 2021 at 03:47:36PM +0100, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>> > Presently, we use a tuple to attach a dict containing annotations
>> > (comments and compile-time conditionals) to a tree node. This is
>> > undesirable because dicts are difficult to strongly type; promoting it
>> > to a real class allows us to name the values and types of the
>> > annotations we are expecting.
>> >
>> > In terms of typing, the Annotated<T> type serves as a generic container
>> > where the annotated node's type is preserved, allowing for greater
>> > specificity than we'd be able to provide without a generic.
>> >
>> > Signed-off-by: John Snow <jsnow@redhat.com>
> [...]
>> > +class Annotated(Generic[_NodeT]):
>> > +    """
>> > +    Annotated generally contains a SchemaInfo-like type (as a dict),
>> > +    But it also used to wrap comments/ifconds around scalar leaf values,
>> > +    for the benefit of features and enums.
>> > +    """
>> > +    # Remove after 3.7 adds @dataclass:
>> 
>> Make this
>> 
>>        # TODO Remove after Python 3.7 ...
>> 
>> to give us a fighting chance to remember.
>> 
>> > +    # pylint: disable=too-few-public-methods
>> > +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
>> > +                 comment: Optional[str] = None):
>> 
>> Why not simply value: _value?
>
> Example:
>   x = C(1)
>   y: C[int]
>   y = C('x')  # mistake
>
> Declaring value as _NodeT does:
> - Make the inferred type of x be Annotated[int].
> - Catch the mistake above.

I smell overengineering.  I may well be wrong.

Without doubt, there are uses for using the type system for keeping
SomeGenericType[SomeType] and SomeGenericType[AnotherType] apart.

But what do we gain by keeping the Annotated[T] for the possible T
apart?

_tree_to_qlit() doesn't care: it peels off the wrapper holding ifcond
and comment, and recurses for the JSON so wrapped.  Regardless of what
was wrapped, i.e. what kind of T we got.

Heck, it works just fine even if you wrap your JSON multiple times.  It
doesn't give a hoot whether that makes sense.  Making sense is the
caller's business.

So what does care?

Or am I simply confused?


PS: As far as I can tell, _tree_to_qlit() doesn't give a hoot whether a
dictionary's values are wrapped, either.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-04 15:37       ` Markus Armbruster
@ 2021-02-04 16:20         ` John Snow
  2021-02-04 16:28         ` Eduardo Habkost
  1 sibling, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-04 16:20 UTC (permalink / raw)
  To: Markus Armbruster, Eduardo Habkost; +Cc: Michael Roth, qemu-devel, Cleber Rosa

On 2/4/21 10:37 AM, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
>> On Wed, Feb 03, 2021 at 03:47:36PM +0100, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:
>>>
>>>> Presently, we use a tuple to attach a dict containing annotations
>>>> (comments and compile-time conditionals) to a tree node. This is
>>>> undesirable because dicts are difficult to strongly type; promoting it
>>>> to a real class allows us to name the values and types of the
>>>> annotations we are expecting.
>>>>
>>>> In terms of typing, the Annotated<T> type serves as a generic container
>>>> where the annotated node's type is preserved, allowing for greater
>>>> specificity than we'd be able to provide without a generic.
>>>>
>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>> [...]
>>>> +class Annotated(Generic[_NodeT]):
>>>> +    """
>>>> +    Annotated generally contains a SchemaInfo-like type (as a dict),
>>>> +    But it also used to wrap comments/ifconds around scalar leaf values,
>>>> +    for the benefit of features and enums.
>>>> +    """
>>>> +    # Remove after 3.7 adds @dataclass:
>>>
>>> Make this
>>>
>>>         # TODO Remove after Python 3.7 ...
>>>
>>> to give us a fighting chance to remember.
>>>
>>>> +    # pylint: disable=too-few-public-methods
>>>> +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
>>>> +                 comment: Optional[str] = None):
>>>
>>> Why not simply value: _value?
>>
>> Example:
>>    x = C(1)
>>    y: C[int]
>>    y = C('x')  # mistake
>>
>> Declaring value as _NodeT does:
>> - Make the inferred type of x be Annotated[int].
>> - Catch the mistake above.
> 
> I smell overengineering.  I may well be wrong.
> 
> Without doubt, there are uses for using the type system for keeping
> SomeGenericType[SomeType] and SomeGenericType[AnotherType] apart.
> 
> But what do we gain by keeping the Annotated[T] for the possible T
> apart?
> 
> _tree_to_qlit() doesn't care: it peels off the wrapper holding ifcond
> and comment, and recurses for the JSON so wrapped.  Regardless of what
> was wrapped, i.e. what kind of T we got.
> 
> Heck, it works just fine even if you wrap your JSON multiple times.  It
> doesn't give a hoot whether that makes sense.  Making sense is the
> caller's business.
> 
> So what does care?
> 
> Or am I simply confused?
> 
> 
> PS: As far as I can tell, _tree_to_qlit() doesn't give a hoot whether a
> dictionary's values are wrapped, either.
> 

You're right that it offers no necessary power to the automated 
checking, I cede as much in my other replies to you.

(1) I hope the explanation was helpful, because the technique will 
return for structures like QAPISchemaMember where it does become more 
obviously useful to disambiguate the wrapped types.

(2) It still has a notational benefit to the human for documentation and 
IDE purposes, e.g. at the end of this series:

```
self._trees: List[Annotated[SchemaInfo]] = []
```

That's nice! It tells us exactly what's in there.

Does it work if I dumb it down to just:

```
self.trees: List[AnnotatedThingy] = []
```

Yes, but it's worse to the human and the IDE both. I consider type hints 
as serving a dual purpose; both provability and declaration of intent.

--js



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-04 15:37       ` Markus Armbruster
  2021-02-04 16:20         ` John Snow
@ 2021-02-04 16:28         ` Eduardo Habkost
  2021-02-05  8:45           ` Markus Armbruster
  1 sibling, 1 reply; 40+ messages in thread
From: Eduardo Habkost @ 2021-02-04 16:28 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, John Snow, qemu-devel, Cleber Rosa

On Thu, Feb 04, 2021 at 04:37:45PM +0100, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Wed, Feb 03, 2021 at 03:47:36PM +0100, Markus Armbruster wrote:
> >> John Snow <jsnow@redhat.com> writes:
> >> 
> >> > Presently, we use a tuple to attach a dict containing annotations
> >> > (comments and compile-time conditionals) to a tree node. This is
> >> > undesirable because dicts are difficult to strongly type; promoting it
> >> > to a real class allows us to name the values and types of the
> >> > annotations we are expecting.
> >> >
> >> > In terms of typing, the Annotated<T> type serves as a generic container
> >> > where the annotated node's type is preserved, allowing for greater
> >> > specificity than we'd be able to provide without a generic.
> >> >
> >> > Signed-off-by: John Snow <jsnow@redhat.com>
> > [...]
> >> > +class Annotated(Generic[_NodeT]):
> >> > +    """
> >> > +    Annotated generally contains a SchemaInfo-like type (as a dict),
> >> > +    But it also used to wrap comments/ifconds around scalar leaf values,
> >> > +    for the benefit of features and enums.
> >> > +    """
> >> > +    # Remove after 3.7 adds @dataclass:
> >> 
> >> Make this
> >> 
> >>        # TODO Remove after Python 3.7 ...
> >> 
> >> to give us a fighting chance to remember.
> >> 
> >> > +    # pylint: disable=too-few-public-methods
> >> > +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
> >> > +                 comment: Optional[str] = None):
> >> 
> >> Why not simply value: _value?
> >
> > Example:
> >   x = C(1)
> >   y: C[int]
> >   y = C('x')  # mistake
> >
> > Declaring value as _NodeT does:
> > - Make the inferred type of x be Annotated[int].
> > - Catch the mistake above.
> 
> I smell overengineering.  I may well be wrong.

To me it's just regular and idiomatic use of Generic.

> 
> Without doubt, there are uses for using the type system for keeping
> SomeGenericType[SomeType] and SomeGenericType[AnotherType] apart.
> 
> But what do we gain by keeping the Annotated[T] for the possible T
> apart?

I understand this as (valid) criticism of the use of Generic.
If we don't want to make Generic[T1] and Generic[T2] be
different types, there's no point in using Generic at all.


> 
> _tree_to_qlit() doesn't care: it peels off the wrapper holding ifcond
> and comment, and recurses for the JSON so wrapped.  Regardless of what
> was wrapped, i.e. what kind of T we got.
> 
> Heck, it works just fine even if you wrap your JSON multiple times.  It
> doesn't give a hoot whether that makes sense.  Making sense is the
> caller's business.
> 
> So what does care?
> 
> Or am I simply confused?

Those are valid questions.  Maybe using Generic will be useful
in the future, but maybe we don't need it right now.

Personally, I don't care either way.  I just wish this small
choice don't became another obstacle for doing useful work.

> 
> 
> PS: As far as I can tell, _tree_to_qlit() doesn't give a hoot whether a
> dictionary's values are wrapped, either.

-- 
Eduardo



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-04 16:28         ` Eduardo Habkost
@ 2021-02-05  8:45           ` Markus Armbruster
  0 siblings, 0 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-05  8:45 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Michael Roth, John Snow, qemu-devel, Cleber Rosa

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Thu, Feb 04, 2021 at 04:37:45PM +0100, Markus Armbruster wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>> 
>> > On Wed, Feb 03, 2021 at 03:47:36PM +0100, Markus Armbruster wrote:
>> >> John Snow <jsnow@redhat.com> writes:
>> >> 
>> >> > Presently, we use a tuple to attach a dict containing annotations
>> >> > (comments and compile-time conditionals) to a tree node. This is
>> >> > undesirable because dicts are difficult to strongly type; promoting it
>> >> > to a real class allows us to name the values and types of the
>> >> > annotations we are expecting.
>> >> >
>> >> > In terms of typing, the Annotated<T> type serves as a generic container
>> >> > where the annotated node's type is preserved, allowing for greater
>> >> > specificity than we'd be able to provide without a generic.
>> >> >
>> >> > Signed-off-by: John Snow <jsnow@redhat.com>
>> > [...]
>> >> > +class Annotated(Generic[_NodeT]):
>> >> > +    """
>> >> > +    Annotated generally contains a SchemaInfo-like type (as a dict),
>> >> > +    But it also used to wrap comments/ifconds around scalar leaf values,
>> >> > +    for the benefit of features and enums.
>> >> > +    """
>> >> > +    # Remove after 3.7 adds @dataclass:
>> >> 
>> >> Make this
>> >> 
>> >>        # TODO Remove after Python 3.7 ...
>> >> 
>> >> to give us a fighting chance to remember.
>> >> 
>> >> > +    # pylint: disable=too-few-public-methods
>> >> > +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
>> >> > +                 comment: Optional[str] = None):
>> >> 
>> >> Why not simply value: _value?
>> >
>> > Example:
>> >   x = C(1)
>> >   y: C[int]
>> >   y = C('x')  # mistake
>> >
>> > Declaring value as _NodeT does:
>> > - Make the inferred type of x be Annotated[int].
>> > - Catch the mistake above.
>> 
>> I smell overengineering.  I may well be wrong.
>
> To me it's just regular and idiomatic use of Generic.

Bear in mind that I'm (ab)using these reviews to learn Python's way of
static typing.  My ignorant questions may evolve into mere grumblings as
I learn.  Or into requests for change.

Grumbling: the notational overhead is regrettable.

>> 
>> Without doubt, there are uses for using the type system for keeping
>> SomeGenericType[SomeType] and SomeGenericType[AnotherType] apart.
>> 
>> But what do we gain by keeping the Annotated[T] for the possible T
>> apart?
>
> I understand this as (valid) criticism of the use of Generic.
> If we don't want to make Generic[T1] and Generic[T2] be
> different types, there's no point in using Generic at all.

True.

>> _tree_to_qlit() doesn't care: it peels off the wrapper holding ifcond
>> and comment, and recurses for the JSON so wrapped.  Regardless of what
>> was wrapped, i.e. what kind of T we got.
>> 
>> Heck, it works just fine even if you wrap your JSON multiple times.  It
>> doesn't give a hoot whether that makes sense.  Making sense is the
>> caller's business.
>> 
>> So what does care?
>> 
>> Or am I simply confused?
>
> Those are valid questions.  Maybe using Generic will be useful
> in the future, but maybe we don't need it right now.
>
> Personally, I don't care either way.  I just wish this small
> choice don't became another obstacle for doing useful work.

I agree this one's minor.

The general problem is not.

Some invariants can be elegantly expressed as static types.  Easier to
read than assertions and comments, and statically checkable.  Lovely, me
want.

There are also cases we can express, but the notational overhead
compromises readability.  We effectively trade readability for static
checking.  I believe this is a bad trade for the QAPI generator.

"Compromises readability" is highly subjective.  Lots of gray area.

My point is: please don't aim for maximally tight types.  Try to
optimize comprehension instead.  Be pragmatic.

There are plenty of languages "where you have to sit with a teacup of
types balanced on your knee and make polite conversation with a strict
old aunt of a compiler."[*]  Let's not press Python into that role :)

>> PS: As far as I can tell, _tree_to_qlit() doesn't give a hoot whether a
>> dictionary's values are wrapped, either.


[*] Paul Graham



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure
  2021-02-03 23:12     ` John Snow
@ 2021-02-05  9:10       ` Markus Armbruster
  0 siblings, 0 replies; 40+ messages in thread
From: Markus Armbruster @ 2021-02-05  9:10 UTC (permalink / raw)
  To: John Snow
  Cc: qemu-devel, Michael Roth, Markus Armbruster, Eduardo Habkost,
	Cleber Rosa

John Snow <jsnow@redhat.com> writes:

> On 2/3/21 9:47 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> Presently, we use a tuple to attach a dict containing annotations
>>> (comments and compile-time conditionals) to a tree node. This is
>>> undesirable because dicts are difficult to strongly type; promoting it
>>> to a real class allows us to name the values and types of the
>>> annotations we are expecting.
>>>
>>> In terms of typing, the Annotated<T> type serves as a generic container
>>> where the annotated node's type is preserved, allowing for greater
>>> specificity than we'd be able to provide without a generic.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   scripts/qapi/introspect.py | 77 ++++++++++++++++++++++----------------
>>>   1 file changed, 44 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>>> index b82efe16f6e..2b90a52f016 100644
>>> --- a/scripts/qapi/introspect.py
>>> +++ b/scripts/qapi/introspect.py
>>> @@ -13,8 +13,12 @@
>>>   from typing import (
>>>       Any,
>>>       Dict,
>>> +    Generic,
>>> +    Iterable,
>>>       List,
>>>       Optional,
>>> +    Tuple,
>>> +    TypeVar,
>>>       Union,
>>>   )
>>>   
>>> @@ -51,15 +55,25 @@
>>>   _scalar = Union[str, bool, None]
>>>   _nonscalar = Union[Dict[str, _stub], List[_stub]]
>>>   _value = Union[_scalar, _nonscalar]
>>> -# TreeValue = Union[_value, 'Annotated[_value]']
>>> +TreeValue = Union[_value, 'Annotated[_value]']
>> 
>> You need to quote 'Annotated[_value]', because it's a forward
>> reference.
>> 
>> Dependency cycle:
>> 
>>          +-----------------------------------------------+
>>          |                                               |
>>      TreeValue = Union[_value, 'Annotated[_value]']      |
>>                                     |                    |
>>          +--------------------------+                    |
>>          |                                               |
>>      Annotated(Generic[_NodeT])                          |
>>                           |                              |
>>         +-----------------+                              |
>>         |                                                |
>>      _NodeT = TypeVar('_NodeT', bound=TreeValue          |
>>                                           |              |
>>                                           +--------------+
>> 
>> Meh.  We'll live.
>> 
>
> Python 3.10 (!) will fix this sort of thing. It fixes it in a funny way, 
> though: all annotations are treated as delayed-evaluation strings. :)
>
> See https://www.python.org/dev/peps/pep-0563/ which now becomes the 
> *default* behavior.
>
> We do not gain access to this behavior at all, because it is exclusive 
> to 3.7+. Boo.
>
> For now, (and for the foreseeable future of the QEMU project, as Python 
> 3.7 support will not be available to us for many, many, many more years 
> [1]) forward references in the global scope need to be quoted.
>
>>>   
>>> -def _make_tree(obj, ifcond, comment=None):
>>> -    extra = {
>>> -        'if': ifcond,
>>> -        'comment': comment,
>>> -    }
>>> -    return (obj, extra)
>>> +_NodeT = TypeVar('_NodeT', bound=TreeValue)
>> 
>> Can you teach me what NodeT is about?
>> 
>
> It's a TypeVar. If you're a TAPL sort of person, canonically you use 
> T,U,V and so on. Relevant tutorial section for mypy is here: 
> https://mypy.readthedocs.io/en/stable/generics.html
>
> (Yes, a weird thing about Generics in Python is that you have to declare 
> them. No, I don't know why. Yes, it's weird.)

The notational overhead is regrettable.  Not your fault.

> I have bound it to TreeValue, indicating that this type variable may 
> only *ever* represent something that isa TreeValue. Because of this, I 
> use "NodeT" to indicate that it's a Generic type T, but with some 
> constraint.
>
> (Though, it really should have been bound to _value instead ... my 
> mistake. I shouldn't allow for nested annotations ...!)

Fixed in v5.  Good.

> Using it allows me to define the Annotated class below in terms of some 
> input parameter instead of a fixed, opaque type.
>
>>> +
>>> +
>>> +class Annotated(Generic[_NodeT]):
>
> Annotated here is defined as Annotated[T], but T is our special _NodeT, 
> so Annotated may only hold other TreeValues*.
>
> (* Again, as per above, this is an oopsie.)
>
>>> +    """
>>> +    Annotated generally contains a SchemaInfo-like type (as a dict),
>>> +    But it also used to wrap comments/ifconds around scalar leaf values,
>>> +    for the benefit of features and enums.
>>> +    """
>>> +    # Remove after 3.7 adds @dataclass:
>> 
>> Make this
>> 
>>         # TODO Remove after Python 3.7 ...
>> 
>> to give us a fighting chance to remember.
>> 
>
> Having done a lot of the python 2 excision work, I only ever grepped for 
> e.g. "3.7" or "sys.version". I was targeting that.
>
> Adding TODO seems fine enough to do, and I'm here anyway.

Done in v5.  Thanks!

>>> +    # pylint: disable=too-few-public-methods
>>> +    def __init__(self, value: _NodeT, ifcond: Iterable[str],
>>> +                 comment: Optional[str] = None):
>> 
>> Why not simply value: _value?
>> 
>
> This is what connects back with the Generic instantiation of this 
> annotation. `class Annotated(Generic[_NodeT])` says that this class is a 
> higher-kinded type parameterized on ... something. We don't know yet.
>
> In the initializer, we use the TypeVar to describe which argument 
> determines that parameterization.
>
>>> +        self.value = value
>
> That parameter flows down and connects to this property. Thus, this 
> field actually has a dynamic type that will track the original type used 
> to instantiate it. If it was a Dict, this will also be a Dict.
>
> (Of course, it's limited to what mypy knows about the constraint of that 
> value when passed. It isn't interrogated at runtime.)
>
> This is a way to achieve dynamism in container-style classes without 
> losing precision in data types. It allows us to declare and keep a more 
> specific "inner type" that survives the boxing and unboxing process.
>
> e.g. Annotated[Dict[str, object]] and Annotated[str] are two different 
> types, and can be differentiated by mypy.
>
> (*cough*, and yes, if you have a function that takes Annotated[Any], you 
> lose that precision at that point. So, we aren't taking full advantage 
> of that power here in introspect.py, but it still seemed like the 
> "right" way to type something like this.)

A generic type feels like overkill here.  This is code that fits on a
page or two.  It solves a simple problem in a simple way.  It's also
badly structured, and virtually undocumented.  I think it needs a
restructuring much more than it needs maximally tight typing.

I'm not demanding you dumb down the types now.  Let's move on.

>>> +        self.comment: Optional[str] = comment
>>> +        self.ifcond: Tuple[str, ...] = tuple(ifcond)
>>>   
>>>   
>>>   def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
>>> @@ -67,24 +81,20 @@ def _tree_to_qlit(obj, level=0, suppress_first_indent=False):
>>>       def indent(level):
>>>           return level * 4 * ' '
>>>   
>>> -    if isinstance(obj, tuple):
>>> -        ifobj, extra = obj
>>> -        ifcond = extra.get('if')
>>> -        comment = extra.get('comment')
>>> -
>>> +    if isinstance(obj, Annotated):
>>>           # NB: _tree_to_qlit is called recursively on the values of a key:value
>>>           # pair; those values can't be decorated with comments or conditionals.
>>>           msg = "dict values cannot have attached comments or if-conditionals."
>>>           assert not suppress_first_indent, msg
>>>   
>>>           ret = ''
>>> -        if comment:
>>> -            ret += indent(level) + '/* %s */\n' % comment
>>> -        if ifcond:
>>> -            ret += gen_if(ifcond)
>>> -        ret += _tree_to_qlit(ifobj, level)
>>> -        if ifcond:
>>> -            ret += '\n' + gen_endif(ifcond)
>>> +        if obj.comment:
>>> +            ret += indent(level) + '/* %s */\n' % obj.comment
>>> +        if obj.ifcond:
>>> +            ret += gen_if(obj.ifcond)
>>> +        ret += _tree_to_qlit(obj.value, level, suppress_first_indent)
>> 
>> Why do you need to pass suppress_first_indent now?
>> 
>
> UH, technically I guess I don't. Holdover from when that was maybe not 
> clear.

Dropped in v5.  Good.

>>> +        if obj.ifcond:
>>> +            ret += '\n' + gen_endif(obj.ifcond)
>>>           return ret
>>>   
>>>       ret = ''
>>> @@ -201,7 +211,7 @@ def _use_type(self, typ):
>>>   
>>>       @staticmethod
>>>       def _gen_features(features):
>>> -        return [_make_tree(f.name, f.ifcond) for f in features]
>>> +        return [Annotated(f.name, f.ifcond) for f in features]
>>>   
>>>       def _gen_tree(self, name, mtype, obj, ifcond, features):
>>>           comment: Optional[str] = None
>>> @@ -215,7 +225,7 @@ def _gen_tree(self, name, mtype, obj, ifcond, features):
>>>           obj['meta-type'] = mtype
>>>           if features:
>>>               obj['features'] = self._gen_features(features)
>>> -        self._trees.append(_make_tree(obj, ifcond, comment))
>>> +        self._trees.append(Annotated(obj, ifcond, comment))
>>>   
>>>       def _gen_member(self, member):
>>>           obj = {'name': member.name, 'type': self._use_type(member.type)}
>>> @@ -223,7 +233,7 @@ def _gen_member(self, member):
>>>               obj['default'] = None
>>>           if member.features:
>>>               obj['features'] = self._gen_features(member.features)
>>> -        return _make_tree(obj, member.ifcond)
>>> +        return Annotated(obj, member.ifcond)
>>>   
>>>       def _gen_variants(self, tag_name, variants):
>>>           return {'tag': tag_name,
>>> @@ -231,16 +241,17 @@ def _gen_variants(self, tag_name, variants):
>>>   
>>>       def _gen_variant(self, variant):
>>>           obj = {'case': variant.name, 'type': self._use_type(variant.type)}
>>> -        return _make_tree(obj, variant.ifcond)
>>> +        return Annotated(obj, variant.ifcond)
>>>   
>>>       def visit_builtin_type(self, name, info, json_type):
>>>           self._gen_tree(name, 'builtin', {'json-type': json_type}, [], None)
>>>   
>>>       def visit_enum_type(self, name, info, ifcond, features, members, prefix):
>>> -        self._gen_tree(name, 'enum',
>>> -                       {'values': [_make_tree(m.name, m.ifcond, None)
>>> -                                   for m in members]},
>>> -                       ifcond, features)
>>> +        self._gen_tree(
>>> +            name, 'enum',
>>> +            {'values': [Annotated(m.name, m.ifcond) for m in members]},
>>> +            ifcond, features
>>> +        )
>>>   
>>>       def visit_array_type(self, name, info, ifcond, element_type):
>>>           element = self._use_type(element_type)
>>> @@ -257,12 +268,12 @@ def visit_object_type_flat(self, name, info, ifcond, features,
>>>           self._gen_tree(name, 'object', obj, ifcond, features)
>>>   
>>>       def visit_alternate_type(self, name, info, ifcond, features, variants):
>>> -        self._gen_tree(name, 'alternate',
>>> -                       {'members': [
>>> -                           _make_tree({'type': self._use_type(m.type)},
>>> -                                      m.ifcond, None)
>>> -                           for m in variants.variants]},
>>> -                       ifcond, features)
>>> +        self._gen_tree(name, 'alternate', {'members': [
>> 
>> I think I'd prefer another line break after 'alternate', to get
>> {'members': align...
>> 
>>> +                Annotated({'type': self._use_type(m.type)}, m.ifcond)
>>> +                for m in variants.variants
>>> +            ]},
>> 
>> ... with ]}.
>> 
>
> How's about
>
> ```
> self._gen_tree(
>      name, 'alternate',
>      {'members': [Annotated({'type': self._use_type(m.type)}, m.ifcond)
>                   for m in variants.variants]},
>      ifcond, features
> )
> ```
>
> (It'd look cooler if I re-arranged the other short parameters first, or 
> I could create a temporary local. In-line objects with in-line 
> comprehensions are bound to look sorta ugly. I'm going with what I wrote 
> above for now, though.)

I'll see how it comes out in v5.

>>> +            ifcond, features
>>> +        )
>>>   
>>>       def visit_command(self, name, info, ifcond, features,
>>>                         arg_type, ret_type, gen, success_response, boxed,
>
> [1] Python 3.6 EOL is this December, but OpenSuSE Leap 15.2 was released 
> 2020-07-02 and only offers Python 3.6. Ouch! It is not clear if they 
> will add support for Python 3.7 at any point, but otherwise we are stuck 
> supporting 15.2 for two years after the next OpenSUSE is released, which 
> hasn't happened yet. Ouch!!! So we don't even have a date on the tin for 
> when we might conceivably inch up our requirements again.
>
> I think the situation for RHEL and Debian is also sad, IIRC.

Things like these are why your salary is called "compensation".  The
"for personal suffering" part you figure out quickly enough.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-03 23:27     ` John Snow
@ 2021-02-05 13:42       ` Markus Armbruster
  2021-02-08 21:39         ` John Snow
  0 siblings, 1 reply; 40+ messages in thread
From: Markus Armbruster @ 2021-02-05 13:42 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, qemu-devel, Eduardo Habkost, Cleber Rosa

John Snow <jsnow@redhat.com> writes:

> On 2/3/21 10:15 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   scripts/qapi/introspect.py | 115 ++++++++++++++++++++++++++-----------
>>>   scripts/qapi/mypy.ini      |   5 --
>>>   scripts/qapi/schema.py     |   2 +-
>>>   3 files changed, 82 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>>> index 60ec326d2c7..b7f2a6cf260 100644
>>> --- a/scripts/qapi/introspect.py
>>> +++ b/scripts/qapi/introspect.py
>>> @@ -30,10 +30,19 @@
>>>   )
>>>   from .gen import QAPISchemaMonolithicCVisitor
>>>   from .schema import (
>>> +    QAPISchema,
>>>       QAPISchemaArrayType,
>>>       QAPISchemaBuiltinType,
>>> +    QAPISchemaEntity,
>>> +    QAPISchemaEnumMember,
>>> +    QAPISchemaFeature,
>>> +    QAPISchemaObjectType,
>>> +    QAPISchemaObjectTypeMember,
>>>       QAPISchemaType,
>>> +    QAPISchemaVariant,
>>> +    QAPISchemaVariants,
>>>   )
>>> +from .source import QAPISourceInfo
>>>   
>>>   
>>>   # This module constructs a tree data structure that is used to
>>> @@ -57,6 +66,8 @@
>>     # generate the introspection information for QEMU. It behaves similarly
>>     # to a JSON value.
>>     #
>>     # A complexity over JSON is that our values may or may not be annotated.
>>     #
>>     # Un-annotated values may be:
>>     #     Scalar: str, bool, None.
>>     #     Non-scalar: List, Dict
>>     # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>>     #
>>     # With optional annotations, the type of all values is:
>>     # TreeValue = Union[_value, Annotated[_value]]
>>     #
>>     # Sadly, mypy does not support recursive types, so we must approximate this.
>>     _stub = Any
>>     _scalar = Union[str, bool, None]
>>     _nonscalar = Union[Dict[str, _stub], List[_stub]]
>>>   _value = Union[_scalar, _nonscalar]
>>>   TreeValue = Union[_value, 'Annotated[_value]']

I'm once again terminally confused about when to use _lower_case and
when to use CamelCase for such variables.

The reader has to connect _stub = Any back "we must approximate this".
Hmm... "we approximate with Any"?

>>>   
>>> +# This is a (strict) alias for an arbitrary object non-scalar, as above:
>>> +_DObject = Dict[str, object]
>> 
>> Sounds greek :)
>> 
>
> Admittedly it is still not explained well ... until the next patch. I'm 
> going to leave it alone for now until you have a chance to respond to 
> these walls of text.

You explain it some futher down.

>> It's almost the Dict part of _nonscalar, but not quite: object vs. Any.
>> 
>> I naively expect something closer to
>> 
>>     _scalar = ...
>>     _object = Dict[str, _stub]
>>     _nonscalar = Union[_object, List[_stub]
>> 
>> and (still naively) expect _object to be good enough to serve as type
>> annotation for dicts representing JSON objects.
>
> "_object" would be good, except ... I am trying to avoid using that word 
> because what does it mean? Python object? JSON object? Here at the 
> boundary between two worlds, nothing makes sense.

Naming is hard.

We talked about these names in review of v2.  Let me try again.

introspect.py needs to generate (a suitable C representation of) an
instance of QAPI type '[SchemaInfo]'.

Its current choice of "suitable C representation" is "a QLitQObject
initializer with #if and comments".  This is a "lose" representation:
QLitQObject can encode pretty much anything, not just instances of
'[SchemaInfo]'.

C code converts this QLitQObject to a SchemaInfoList object[*].
SchemaInfoList is the C type for QAPI type '[SchemaInfo]'.  Automated
tests ensure this conversion cannot fail, i.e. the "lose" QLitQObject
actually encodes a '[SchemaInfo]'.

introspect.py separates concerns: it first builds an abstract
representation of "set of QObject with #if and comments", then generates
C code from that.

Why "QObject with #if and comments", and not "QLitQObject with #if and
comments"?  Because QLitQObject is *one* way to represent QObject, and
we don't care which way outside C code generation.

A QObject represents a JSON value.  We could just as well say "JSON
value with #if and comments".

So, the abstract representation of "JSON value with #if and comments" is
what we're trying to type.  If you'd rather say "QObject with #if and
comments", that's fine.

Our abstract representation is a tree, where

* JSON null / QNull is represented as Python None

* JSON string / QString as str

* JSON true and false / QBool as bool

* JSON number / QNum is not implemented

* JSON object / QDict is dict mapping string keys to sub-trees

* JSON array / QList is list of sub-trees

* #if and comment tacked to a sub-tree is represented by wrapping the
  subtree in Annotated

  Wrapping a sub-tree that is already wrapped seems mostly useless, but
  the code doesn't care.

  Wrapping dictionary values makes no sense.  The code doesn't care, and
  gives you GIGO.

  Making the code reject these two feels out of scope.  If you want to
  anyway, I won't object unless it gets in the way of "in scope" stuff
  (right now it doesn't seem to).

Let me stress once again: this is *not* an abstract representation of a
'SchemaInfo'.  Such a representation would also work, and you might like
it better, but it's simply not what we have.  Evidence: _tree_to_qlit()
works fine for *any* tree, not just for trees that encode instances of
'SchemaInfo'.

Since each (sub-)tree represents a JSON value / QObject, possibly with
annotations, I'd like to propose a few "obvious" (hahaha) names:

* an unannotated QObject: QObject

* an annotated QObject: AnnotatedQObject

* a possibly annotated QObject: PossiblyAnnotatedQObject

  Too long.  Rename QObject to BareQObject, then call this one QObject.

This gives us:

    _BareQObject = Union[None, str, bool, Dict[str, Any], List[Any]]
    _AnnotatedQObject = Annotated[_QObject]
    _QObject = Union[_BareQObject, _AnnotatedQObject]

Feel free to replace QObject by JsonValue in these names if you like
that better.  I think I'd slightly prefer JsonValue right now.

Now back to _DObject:

> (See patch 12/14 for A More Betterer Understanding of what _DObject is 
> used for. It will contribute to A Greater Understanding.)
>
> Anyway, to your questions;
>
> (1) _DObject was my shorthand garbage way of saying "This is a Python 
> Dict that represents a JSON object". Hence Dict-Object, "DObject". I 
> have also derisively called this a "dictly-typed" object at times.

In the naming system I proposed, this is BareQDict, with an additional
complication: we actually have two different types for the same thing,
an anonymous one within _BareQObject, and a named one.

> (2) Dict[str, Any] and Dict[str, object] are similar, but do have a 

The former is the anonymous one, the latter the named one.

> semantic difference. I alluded to it by calling this a "(strict) alias"; 
> which does not help you understand any of the following points:
>
> Whenever you use "Any", it basically turns off type-checking at that 
> boundary; it is the gradually typed boundary type. Avoid it whenever 
> reasonably possible.
>
> Example time:
>
>
> def foo(thing: Any) -> None:
>      print(thing.value)  # Sure, I guess! We'll believe you.
>
>
> def foo(thing: object) -> None:
>      print(thing.value)  # BZZT, Python object has no value prop.
>
>
> Use "Any" when you really just cannot constrain the type, because you're 
> out of bourbon or you've decided to give in to the darkness inside your 
> heart.
>
> Use "object" when the type of the value actually doesn't matter, because 
> you are only passing it on to something else later that will do its own 
> type analysis or introspection on the object.
>
> For introspect.py, 'object' is actually a really good type when we can 
> use it, because we interrogate the type exhaustively upon receipt in 
> _tree_to_qlit.
>
>
> That leaves one question you would almost assuredly ask as a followup:
>
> "Why didn't you use object for the stub type to begin with?"
>
> Let's say we define _stub as `object` instead, the Python object. When 
> _tree_to_qlit recurses on non-scalar structures, the held value there is 
> only known as "object" and not as str/bool/None, which causes a typing 
> error at that point.
>
> Moving the stub to "Any" tells mypy to ... not worry about what type we 
> actually passed here. I gave in to the darkness in my heart. It's just 
> too annoying without real recursion.

May I have an abridged version of this in the comments?  It might look
quaint in ten years, when we're all fluent in Python type annotations.
But right now, at least some readers aren't, and they can use a bit of
help.


[*] Actually, we take a shortcut and convert straight to QObject, but
that's just laziness.  See qmp_query_qmp_schema()'s "Minor hack:"
comment.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-05 13:42       ` Markus Armbruster
@ 2021-02-08 21:39         ` John Snow
  2021-02-08 21:53           ` John Snow
  2021-02-09  9:06           ` Markus Armbruster
  0 siblings, 2 replies; 40+ messages in thread
From: John Snow @ 2021-02-08 21:39 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, qemu-devel, Eduardo Habkost, Cleber Rosa

On 2/5/21 8:42 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> On 2/3/21 10:15 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:
>>>
>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>> ---
>>>>    scripts/qapi/introspect.py | 115 ++++++++++++++++++++++++++-----------
>>>>    scripts/qapi/mypy.ini      |   5 --
>>>>    scripts/qapi/schema.py     |   2 +-
>>>>    3 files changed, 82 insertions(+), 40 deletions(-)
>>>>
>>>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>>>> index 60ec326d2c7..b7f2a6cf260 100644
>>>> --- a/scripts/qapi/introspect.py
>>>> +++ b/scripts/qapi/introspect.py
>>>> @@ -30,10 +30,19 @@
>>>>    )
>>>>    from .gen import QAPISchemaMonolithicCVisitor
>>>>    from .schema import (
>>>> +    QAPISchema,
>>>>        QAPISchemaArrayType,
>>>>        QAPISchemaBuiltinType,
>>>> +    QAPISchemaEntity,
>>>> +    QAPISchemaEnumMember,
>>>> +    QAPISchemaFeature,
>>>> +    QAPISchemaObjectType,
>>>> +    QAPISchemaObjectTypeMember,
>>>>        QAPISchemaType,
>>>> +    QAPISchemaVariant,
>>>> +    QAPISchemaVariants,
>>>>    )
>>>> +from .source import QAPISourceInfo
>>>>    
>>>>    
>>>>    # This module constructs a tree data structure that is used to
>>>> @@ -57,6 +66,8 @@
>>>      # generate the introspection information for QEMU. It behaves similarly
>>>      # to a JSON value.
>>>      #
>>>      # A complexity over JSON is that our values may or may not be annotated.
>>>      #
>>>      # Un-annotated values may be:
>>>      #     Scalar: str, bool, None.
>>>      #     Non-scalar: List, Dict
>>>      # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>>>      #
>>>      # With optional annotations, the type of all values is:
>>>      # TreeValue = Union[_value, Annotated[_value]]
>>>      #
>>>      # Sadly, mypy does not support recursive types, so we must approximate this.
>>>      _stub = Any
>>>      _scalar = Union[str, bool, None]
>>>      _nonscalar = Union[Dict[str, _stub], List[_stub]]
>>>>    _value = Union[_scalar, _nonscalar]
>>>>    TreeValue = Union[_value, 'Annotated[_value]']
> 
> I'm once again terminally confused about when to use _lower_case and
> when to use CamelCase for such variables.
> 

That's my fault for not using them consistently.

Generally:

TitleCase: Classes, Real Type Names :tm:
lowercase: instance names (and certain built-in types like str/bool/int)
UPPERCASE: "Constants". This is an extremely loose idea in Python.

I use the "_" prefix for any of the above categories to indicate 
something not intended to be used outside of the current scope. These 
types won't be accessible outside the module by default.

TypeVars I use "T", "U", "V", etc unless I bind them to another type; 
then I use e.g. NodeT instead.

When it comes to things like type aliases, I believe I instinctively 
used lowercase because I am not creating a new Real Type and wanted some 
visual distinction from a real class name. (aliases created in this way 
cannot be used with isinstance and hold no significance to mypy.)

That's why I used _stub, _scalar, _nonscalar, and _value for those types 
there. Then I disregarded my own convention and used TreeValue; perhaps 
that ought to be tree_value for consistency as it's not a Real Type :tm:

...but then we have the SchemaInfo type aliases, which I named using the 
same type name as they use in QAPI to help paint the association (and 
pick up 'git grep' searchers.)

Not fantastically consistent, sorry. Feel free to express a preference, 
I clearly don't have a universally applied one.

(Current leaning: rename TreeValue to tree_value, but leave everything 
else as it is.)

> The reader has to connect _stub = Any back "we must approximate this".
> Hmm... "we approximate with Any"?
> 

I can try to be more explicit about it.

>>>>    
>>>> +# This is a (strict) alias for an arbitrary object non-scalar, as above:
>>>> +_DObject = Dict[str, object]
>>>
>>> Sounds greek :)
>>>
>>
>> Admittedly it is still not explained well ... until the next patch. I'm
>> going to leave it alone for now until you have a chance to respond to
>> these walls of text.
> 
> You explain it some futher down.
> 
>>> It's almost the Dict part of _nonscalar, but not quite: object vs. Any.
>>>
>>> I naively expect something closer to
>>>
>>>      _scalar = ...
>>>      _object = Dict[str, _stub]
>>>      _nonscalar = Union[_object, List[_stub]
>>>
>>> and (still naively) expect _object to be good enough to serve as type
>>> annotation for dicts representing JSON objects.
>>
>> "_object" would be good, except ... I am trying to avoid using that word
>> because what does it mean? Python object? JSON object? Here at the
>> boundary between two worlds, nothing makes sense.
> 
> Naming is hard.
> 

Yep. We can skip this debate by just naming the incoming types 
SchemaInfo and similar... (cont'd below)

> We talked about these names in review of v2.  Let me try again.
> 
> introspect.py needs to generate (a suitable C representation of) an
> instance of QAPI type '[SchemaInfo]'.
> 
> Its current choice of "suitable C representation" is "a QLitQObject
> initializer with #if and comments".  This is a "lose" representation:
> QLitQObject can encode pretty much anything, not just instances of
> '[SchemaInfo]'.
> 
> C code converts this QLitQObject to a SchemaInfoList object[*].
> SchemaInfoList is the C type for QAPI type '[SchemaInfo]'.  Automated
> tests ensure this conversion cannot fail, i.e. the "lose" QLitQObject
> actually encodes a '[SchemaInfo]'.
> 
> introspect.py separates concerns: it first builds an abstract
> representation of "set of QObject with #if and comments", then generates
> C code from that.
> 
> Why "QObject with #if and comments", and not "QLitQObject with #if and
> comments"?  Because QLitQObject is *one* way to represent QObject, and
> we don't care which way outside C code generation.
> 
> A QObject represents a JSON value.  We could just as well say "JSON
> value with #if and comments".
> 
> So, the abstract representation of "JSON value with #if and comments" is
> what we're trying to type.  If you'd rather say "QObject with #if and
> comments", that's fine.
> 
> Our abstract representation is a tree, where
> 
> * JSON null / QNull is represented as Python None
> 
> * JSON string / QString as str
> 
> * JSON true and false / QBool as bool
> 
> * JSON number / QNum is not implemented
> 
> * JSON object / QDict is dict mapping string keys to sub-trees
> 
> * JSON array / QList is list of sub-trees
> 
> * #if and comment tacked to a sub-tree is represented by wrapping the
>    subtree in Annotated
> 
>    Wrapping a sub-tree that is already wrapped seems mostly useless, but
>    the code doesn't care.
> 
>    Wrapping dictionary values makes no sense.  The code doesn't care, and
>    gives you GIGO.
> 
>    Making the code reject these two feels out of scope.  If you want to
>    anyway, I won't object unless it gets in the way of "in scope" stuff
>    (right now it doesn't seem to).
> 
> Let me stress once again: this is *not* an abstract representation of a
> 'SchemaInfo'.  Such a representation would also work, and you might like
> it better, but it's simply not what we have.  Evidence: _tree_to_qlit()
> works fine for *any* tree, not just for trees that encode instances of
> 'SchemaInfo'.
> 

... as long as you don't feel that's incorrect to do. We are free to 
name those structures SchemaInfo but type _tree_to_qlit() in terms of 
generic Dict objects, leaving us without a middle-abstract thing to name 
at all.

Based on your review of the "dummy types" patch, I'm going to assume 
that's fine.

> Since each (sub-)tree represents a JSON value / QObject, possibly with
> annotations, I'd like to propose a few "obvious" (hahaha) names:
> 
> * an unannotated QObject: QObject
> 
> * an annotated QObject: AnnotatedQObject
> 
> * a possibly annotated QObject: PossiblyAnnotatedQObject
> 
>    Too long.  Rename QObject to BareQObject, then call this one QObject.
> 
> This gives us:
> 
>      _BareQObject = Union[None, str, bool, Dict[str, Any], List[Any]]
>      _AnnotatedQObject = Annotated[_QObject]
>      _QObject = Union[_BareQObject, _AnnotatedQObject]
> 
> Feel free to replace QObject by JsonValue in these names if you like
> that better.  I think I'd slightly prefer JsonValue right now.
> 
> Now back to _DObject:
> 
>> (See patch 12/14 for A More Betterer Understanding of what _DObject is
>> used for. It will contribute to A Greater Understanding.)
>>
>> Anyway, to your questions;
>>
>> (1) _DObject was my shorthand garbage way of saying "This is a Python
>> Dict that represents a JSON object". Hence Dict-Object, "DObject". I
>> have also derisively called this a "dictly-typed" object at times.
> 
> In the naming system I proposed, this is BareQDict, with an additional
> complication: we actually have two different types for the same thing,
> an anonymous one within _BareQObject, and a named one.
> 
>> (2) Dict[str, Any] and Dict[str, object] are similar, but do have a
> 
> The former is the anonymous one, the latter the named one.
> 

Kinda-sorta. I am talking about pure mypy here, and the differences 
between typing two things this way.

Though I think you're right: I used the "Any" form for the anonymous 
type (inherent to the structure of a JSON compound type) and the 
"object" form for the named forms (The SchemaInfo objects we build in 
the visitors to pass to the generator later).

>> semantic difference. I alluded to it by calling this a "(strict) alias";
>> which does not help you understand any of the following points:
>>
>> Whenever you use "Any", it basically turns off type-checking at that
>> boundary; it is the gradually typed boundary type. Avoid it whenever
>> reasonably possible.
>>
>> Example time:
>>
>>
>> def foo(thing: Any) -> None:
>>       print(thing.value)  # Sure, I guess! We'll believe you.
>>
>>
>> def foo(thing: object) -> None:
>>       print(thing.value)  # BZZT, Python object has no value prop.
>>
>>
>> Use "Any" when you really just cannot constrain the type, because you're
>> out of bourbon or you've decided to give in to the darkness inside your
>> heart.
>>
>> Use "object" when the type of the value actually doesn't matter, because
>> you are only passing it on to something else later that will do its own
>> type analysis or introspection on the object.
>>
>> For introspect.py, 'object' is actually a really good type when we can
>> use it, because we interrogate the type exhaustively upon receipt in
>> _tree_to_qlit.
>>
>>
>> That leaves one question you would almost assuredly ask as a followup:
>>
>> "Why didn't you use object for the stub type to begin with?"
>>
>> Let's say we define _stub as `object` instead, the Python object. When
>> _tree_to_qlit recurses on non-scalar structures, the held value there is
>> only known as "object" and not as str/bool/None, which causes a typing
>> error at that point.
>>
>> Moving the stub to "Any" tells mypy to ... not worry about what type we
>> actually passed here. I gave in to the darkness in my heart. It's just
>> too annoying without real recursion.
> 
> May I have an abridged version of this in the comments?  It might look
> quaint in ten years, when we're all fluent in Python type annotations.
> But right now, at least some readers aren't, and they can use a bit of
> help.
> 

Yeah, I'm sympathetic to that.... though I'm not sure what to write or 
where. I can add some reference points in the commit message, like this one:

https://mypy.readthedocs.io/en/stable/dynamic_typing.html#any-vs-object

maybe in conjunction with the named type aliases patch this is actually 
sufficient?

> 
> [*] Actually, we take a shortcut and convert straight to QObject, but
> that's just laziness.  See qmp_query_qmp_schema()'s "Minor hack:"
> comment.
> 

:)



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-08 21:39         ` John Snow
@ 2021-02-08 21:53           ` John Snow
  2021-02-09  9:06           ` Markus Armbruster
  1 sibling, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-08 21:53 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, qemu-devel, Eduardo Habkost, Cleber Rosa

On 2/8/21 4:39 PM, John Snow wrote:
>>
>> I'm once again terminally confused about when to use _lower_case and
>> when to use CamelCase for such variables.
>>
> 
> That's my fault for not using them consistently.
> 
> Generally:
> 
> TitleCase: Classes, Real Type Names :tm:
> lowercase: instance names (and certain built-in types like str/bool/int)
> UPPERCASE: "Constants". This is an extremely loose idea in Python.
> 
> I use the "_" prefix for any of the above categories to indicate 
> something not intended to be used outside of the current scope. These 
> types won't be accessible outside the module by default.
> 
> TypeVars I use "T", "U", "V", etc unless I bind them to another type; 
> then I use e.g. NodeT instead.
> 
> When it comes to things like type aliases, I believe I instinctively 
> used lowercase because I am not creating a new Real Type and wanted some 
> visual distinction from a real class name. (aliases created in this way 
> cannot be used with isinstance and hold no significance to mypy.)
> 
> That's why I used _stub, _scalar, _nonscalar, and _value for those types 
> there. Then I disregarded my own convention and used TreeValue; perhaps 
> that ought to be tree_value for consistency as it's not a Real Type :tm:
> 
> ...but then we have the SchemaInfo type aliases, which I named using the 
> same type name as they use in QAPI to help paint the association (and 
> pick up 'git grep' searchers.)
> 
> Not fantastically consistent, sorry. Feel free to express a preference, 
> I clearly don't have a universally applied one.
> 
> (Current leaning: rename TreeValue to tree_value, but leave everything 
> else as it is.)

Addendum: pylint wants any non-underscored type alias to be treated like 
a class name, as CamelCase.

I guess it just exempts underscore prefixed things. So, it does have to 
stay "TreeValue".

--js



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-08 21:39         ` John Snow
  2021-02-08 21:53           ` John Snow
@ 2021-02-09  9:06           ` Markus Armbruster
  2021-02-10 17:31             ` John Snow
  1 sibling, 1 reply; 40+ messages in thread
From: Markus Armbruster @ 2021-02-09  9:06 UTC (permalink / raw)
  To: John Snow; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

John Snow <jsnow@redhat.com> writes:

> On 2/5/21 8:42 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> On 2/3/21 10:15 AM, Markus Armbruster wrote:
>>>> John Snow <jsnow@redhat.com> writes:
>>>>
>>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>>> ---
>>>>>    scripts/qapi/introspect.py | 115 ++++++++++++++++++++++++++-----------
>>>>>    scripts/qapi/mypy.ini      |   5 --
>>>>>    scripts/qapi/schema.py     |   2 +-
>>>>>    3 files changed, 82 insertions(+), 40 deletions(-)
>>>>>
>>>>> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
>>>>> index 60ec326d2c7..b7f2a6cf260 100644
>>>>> --- a/scripts/qapi/introspect.py
>>>>> +++ b/scripts/qapi/introspect.py
>>>>> @@ -30,10 +30,19 @@
>>>>>    )
>>>>>    from .gen import QAPISchemaMonolithicCVisitor
>>>>>    from .schema import (
>>>>> +    QAPISchema,
>>>>>        QAPISchemaArrayType,
>>>>>        QAPISchemaBuiltinType,
>>>>> +    QAPISchemaEntity,
>>>>> +    QAPISchemaEnumMember,
>>>>> +    QAPISchemaFeature,
>>>>> +    QAPISchemaObjectType,
>>>>> +    QAPISchemaObjectTypeMember,
>>>>>        QAPISchemaType,
>>>>> +    QAPISchemaVariant,
>>>>> +    QAPISchemaVariants,
>>>>>    )
>>>>> +from .source import QAPISourceInfo
>>>>>    
>>>>>    
>>>>>    # This module constructs a tree data structure that is used to
>>>>> @@ -57,6 +66,8 @@
>>>>      # generate the introspection information for QEMU. It behaves similarly
>>>>      # to a JSON value.
>>>>      #
>>>>      # A complexity over JSON is that our values may or may not be annotated.
>>>>      #
>>>>      # Un-annotated values may be:
>>>>      #     Scalar: str, bool, None.
>>>>      #     Non-scalar: List, Dict
>>>>      # _value = Union[str, bool, None, Dict[str, TreeValue], List[TreeValue]]
>>>>      #
>>>>      # With optional annotations, the type of all values is:
>>>>      # TreeValue = Union[_value, Annotated[_value]]
>>>>      #
>>>>      # Sadly, mypy does not support recursive types, so we must approximate this.
>>>>      _stub = Any
>>>>      _scalar = Union[str, bool, None]
>>>>      _nonscalar = Union[Dict[str, _stub], List[_stub]]
>>>>>    _value = Union[_scalar, _nonscalar]
>>>>>    TreeValue = Union[_value, 'Annotated[_value]']
>> 
>> I'm once again terminally confused about when to use _lower_case and
>> when to use CamelCase for such variables.
>> 
>
> That's my fault for not using them consistently.
>
> Generally:
>
> TitleCase: Classes, Real Type Names :tm:
> lowercase: instance names (and certain built-in types like str/bool/int)
> UPPERCASE: "Constants". This is an extremely loose idea in Python.
>
> I use the "_" prefix for any of the above categories to indicate 
> something not intended to be used outside of the current scope. These 
> types won't be accessible outside the module by default.
>
> TypeVars I use "T", "U", "V", etc unless I bind them to another type; 
> then I use e.g. NodeT instead.
>
> When it comes to things like type aliases, I believe I instinctively 
> used lowercase because I am not creating a new Real Type and wanted some 
> visual distinction from a real class name. (aliases created in this way 
> cannot be used with isinstance and hold no significance to mypy.)
>
> That's why I used _stub, _scalar, _nonscalar, and _value for those types 
> there. Then I disregarded my own convention and used TreeValue; perhaps 
> that ought to be tree_value for consistency as it's not a Real Type :tm:
>
> ...but then we have the SchemaInfo type aliases, which I named using the 
> same type name as they use in QAPI to help paint the association (and 
> pick up 'git grep' searchers.)
>
> Not fantastically consistent, sorry. Feel free to express a preference, 
> I clearly don't have a universally applied one.
>
> (Current leaning: rename TreeValue to tree_value, but leave everything 
> else as it is.)

https://www.python.org/dev/peps/pep-0484/#type-aliases

    Note that we recommend capitalizing alias names, since they
    represent user-defined types, which (like user-defined classes) are
    typically spelled that way.

I think this wants names like _Scalar, _NonScalar, _Value, TreeValue.

>> The reader has to connect _stub = Any back "we must approximate this".
>> Hmm... "we approximate with Any"?
>> 
>
> I can try to be more explicit about it.
>
>>>>>    
>>>>> +# This is a (strict) alias for an arbitrary object non-scalar, as above:
>>>>> +_DObject = Dict[str, object]
>>>>
>>>> Sounds greek :)
>>>>
>>>
>>> Admittedly it is still not explained well ... until the next patch. I'm
>>> going to leave it alone for now until you have a chance to respond to
>>> these walls of text.
>> 
>> You explain it some futher down.
>> 
>>>> It's almost the Dict part of _nonscalar, but not quite: object vs. Any.
>>>>
>>>> I naively expect something closer to
>>>>
>>>>      _scalar = ...
>>>>      _object = Dict[str, _stub]
>>>>      _nonscalar = Union[_object, List[_stub]
>>>>
>>>> and (still naively) expect _object to be good enough to serve as type
>>>> annotation for dicts representing JSON objects.
>>>
>>> "_object" would be good, except ... I am trying to avoid using that word
>>> because what does it mean? Python object? JSON object? Here at the
>>> boundary between two worlds, nothing makes sense.
>> 
>> Naming is hard.
>> 
>
> Yep. We can skip this debate by just naming the incoming types 
> SchemaInfo and similar... (cont'd below)
>
>> We talked about these names in review of v2.  Let me try again.
>> 
>> introspect.py needs to generate (a suitable C representation of) an
>> instance of QAPI type '[SchemaInfo]'.
>> 
>> Its current choice of "suitable C representation" is "a QLitQObject
>> initializer with #if and comments".  This is a "lose" representation:
>> QLitQObject can encode pretty much anything, not just instances of
>> '[SchemaInfo]'.
>> 
>> C code converts this QLitQObject to a SchemaInfoList object[*].
>> SchemaInfoList is the C type for QAPI type '[SchemaInfo]'.  Automated
>> tests ensure this conversion cannot fail, i.e. the "lose" QLitQObject
>> actually encodes a '[SchemaInfo]'.
>> 
>> introspect.py separates concerns: it first builds an abstract
>> representation of "set of QObject with #if and comments", then generates
>> C code from that.
>> 
>> Why "QObject with #if and comments", and not "QLitQObject with #if and
>> comments"?  Because QLitQObject is *one* way to represent QObject, and
>> we don't care which way outside C code generation.
>> 
>> A QObject represents a JSON value.  We could just as well say "JSON
>> value with #if and comments".
>> 
>> So, the abstract representation of "JSON value with #if and comments" is
>> what we're trying to type.  If you'd rather say "QObject with #if and
>> comments", that's fine.
>> 
>> Our abstract representation is a tree, where
>> 
>> * JSON null / QNull is represented as Python None
>> 
>> * JSON string / QString as str
>> 
>> * JSON true and false / QBool as bool
>> 
>> * JSON number / QNum is not implemented
>> 
>> * JSON object / QDict is dict mapping string keys to sub-trees
>> 
>> * JSON array / QList is list of sub-trees
>> 
>> * #if and comment tacked to a sub-tree is represented by wrapping the
>>    subtree in Annotated
>> 
>>    Wrapping a sub-tree that is already wrapped seems mostly useless, but
>>    the code doesn't care.
>> 
>>    Wrapping dictionary values makes no sense.  The code doesn't care, and
>>    gives you GIGO.
>> 
>>    Making the code reject these two feels out of scope.  If you want to
>>    anyway, I won't object unless it gets in the way of "in scope" stuff
>>    (right now it doesn't seem to).
>> 
>> Let me stress once again: this is *not* an abstract representation of a
>> 'SchemaInfo'.  Such a representation would also work, and you might like
>> it better, but it's simply not what we have.  Evidence: _tree_to_qlit()
>> works fine for *any* tree, not just for trees that encode instances of
>> 'SchemaInfo'.
>> 
>
> ... as long as you don't feel that's incorrect to do. We are free to 
> name those structures SchemaInfo but type _tree_to_qlit() in terms of 
> generic Dict objects, leaving us without a middle-abstract thing to name 
> at all.
>
> Based on your review of the "dummy types" patch, I'm going to assume 
> that's fine.

I guess it's okayish enough.  It still feels more complicated to me than
it needs to be.

QAPISchemaGenIntrospectVisitor an abstract representation of "QObject
with #if and comments" for each SchemaInfo.

This is not really a representation of SchemaInfo.  We can choose to
name it that way regardless, if it helps, and we explain it properly.

Once we hand off the data to _tree_to_qlit(), we can't name it that way
anymore, simply because _tree_to_qlit() treats it as the stupid
recursive data structure it is, and doesn't need or want to know about
SchemaInfo.

I think I'd dispense with _DObject entirely, and use TreeValue
throughout.  Yes, we'd use Any a bit more.  I doubt the additional
complexity to *sometimes* use object instead is worthwhile.  This data
structure is used only within this file.  It pretty much never changes
(because JSON doesn't).  It's basically write-only in
QAPISchemaGenIntrospectVisitor.  This means all the extra typing work
buys us is use of object instead of Any where it doesn't actually
matter.

I would use a more telling name than TreeValue, though.  One that
actually hints at the kind of value "representation of QObject with #if
and comment".

>> Since each (sub-)tree represents a JSON value / QObject, possibly with
>> annotations, I'd like to propose a few "obvious" (hahaha) names:
>> 
>> * an unannotated QObject: QObject
>> 
>> * an annotated QObject: AnnotatedQObject
>> 
>> * a possibly annotated QObject: PossiblyAnnotatedQObject
>> 
>>    Too long.  Rename QObject to BareQObject, then call this one QObject.
>> 
>> This gives us:
>> 
>>      _BareQObject = Union[None, str, bool, Dict[str, Any], List[Any]]
>>      _AnnotatedQObject = Annotated[_QObject]
>>      _QObject = Union[_BareQObject, _AnnotatedQObject]
>> 
>> Feel free to replace QObject by JsonValue in these names if you like
>> that better.  I think I'd slightly prefer JsonValue right now.
>> 
>> Now back to _DObject:
>> 
>>> (See patch 12/14 for A More Betterer Understanding of what _DObject is
>>> used for. It will contribute to A Greater Understanding.)
>>>
>>> Anyway, to your questions;
>>>
>>> (1) _DObject was my shorthand garbage way of saying "This is a Python
>>> Dict that represents a JSON object". Hence Dict-Object, "DObject". I
>>> have also derisively called this a "dictly-typed" object at times.
>> 
>> In the naming system I proposed, this is BareQDict, with an additional
>> complication: we actually have two different types for the same thing,
>> an anonymous one within _BareQObject, and a named one.
>> 
>>> (2) Dict[str, Any] and Dict[str, object] are similar, but do have a
>> 
>> The former is the anonymous one, the latter the named one.
>> 
>
> Kinda-sorta. I am talking about pure mypy here, and the differences 
> between typing two things this way.
>
> Though I think you're right: I used the "Any" form for the anonymous 
> type (inherent to the structure of a JSON compound type) and the 
> "object" form for the named forms (The SchemaInfo objects we build in 
> the visitors to pass to the generator later).
>
>>> semantic difference. I alluded to it by calling this a "(strict) alias";
>>> which does not help you understand any of the following points:
>>>
>>> Whenever you use "Any", it basically turns off type-checking at that
>>> boundary; it is the gradually typed boundary type. Avoid it whenever
>>> reasonably possible.
>>>
>>> Example time:
>>>
>>>
>>> def foo(thing: Any) -> None:
>>>       print(thing.value)  # Sure, I guess! We'll believe you.
>>>
>>>
>>> def foo(thing: object) -> None:
>>>       print(thing.value)  # BZZT, Python object has no value prop.
>>>
>>>
>>> Use "Any" when you really just cannot constrain the type, because you're
>>> out of bourbon or you've decided to give in to the darkness inside your
>>> heart.
>>>
>>> Use "object" when the type of the value actually doesn't matter, because
>>> you are only passing it on to something else later that will do its own
>>> type analysis or introspection on the object.
>>>
>>> For introspect.py, 'object' is actually a really good type when we can
>>> use it, because we interrogate the type exhaustively upon receipt in
>>> _tree_to_qlit.
>>>
>>>
>>> That leaves one question you would almost assuredly ask as a followup:
>>>
>>> "Why didn't you use object for the stub type to begin with?"
>>>
>>> Let's say we define _stub as `object` instead, the Python object. When
>>> _tree_to_qlit recurses on non-scalar structures, the held value there is
>>> only known as "object" and not as str/bool/None, which causes a typing
>>> error at that point.
>>>
>>> Moving the stub to "Any" tells mypy to ... not worry about what type we
>>> actually passed here. I gave in to the darkness in my heart. It's just
>>> too annoying without real recursion.
>> 
>> May I have an abridged version of this in the comments?  It might look
>> quaint in ten years, when we're all fluent in Python type annotations.
>> But right now, at least some readers aren't, and they can use a bit of
>> help.
>> 
>
> Yeah, I'm sympathetic to that.... though I'm not sure what to write or 
> where. I can add some reference points in the commit message, like this one:
>
> https://mypy.readthedocs.io/en/stable/dynamic_typing.html#any-vs-object
>
> maybe in conjunction with the named type aliases patch this is actually 
> sufficient?

I can see two solutions right now:

1. Use Dict[str, Any] throughout

   All we need to explain is

   * What the data structure is about (JSON annotated with ifconds and
     comments; got that, could use improvement perhaps)

   * Your work-around for the lack of recursive types (got that
     already)

   * That the use of Any bypasses type static checking on use (shouldn't
     be hard)

   * Where such uses are (I believe only in _tree_to_qlit(), were Any
     can't be avoided anyway).

2. Use Dict[str, object] where we can

   Now we get to explain a few more things:

   * Why we bother (to get stricter static type checks on use)

   * Where such uses are (I can't see any offhand)

   * Maybe also where we go from one static type to the other.

In either case, we also need to pick names that need no explanation, or
explain them.

>> [*] Actually, we take a shortcut and convert straight to QObject, but
>> that's just laziness.  See qmp_query_qmp_schema()'s "Minor hack:"
>> comment.
>> 
>
> :)



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 11/14] qapi/introspect.py: add type hint annotations
  2021-02-09  9:06           ` Markus Armbruster
@ 2021-02-10 17:31             ` John Snow
  0 siblings, 0 replies; 40+ messages in thread
From: John Snow @ 2021-02-10 17:31 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Michael Roth, Cleber Rosa, qemu-devel, Eduardo Habkost

On 2/9/21 4:06 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
>> On 2/5/21 8:42 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:

> https://www.python.org/dev/peps/pep-0484/#type-aliases
> 
>      Note that we recommend capitalizing alias names, since they
>      represent user-defined types, which (like user-defined classes) are
>      typically spelled that way.
> 
> I think this wants names like _Scalar, _NonScalar, _Value, TreeValue.
> 

Yeah, that seems like the only way to make it consistent and not have 
pylint yell at me. I will try to adhere to this in the future, but maybe 
pylint needs a bug report to make it complain in these cases, too.

[...]

>>
>> ... as long as you don't feel that's incorrect to do. We are free to
>> name those structures SchemaInfo but type _tree_to_qlit() in terms of
>> generic Dict objects, leaving us without a middle-abstract thing to name
>> at all.
>>
>> Based on your review of the "dummy types" patch, I'm going to assume
>> that's fine.
> 
> I guess it's okayish enough.  It still feels more complicated to me than
> it needs to be.
> 
> QAPISchemaGenIntrospectVisitor an abstract representation of "QObject
> with #if and comments" for each SchemaInfo.
> 
> This is not really a representation of SchemaInfo.  We can choose to
> name it that way regardless, if it helps, and we explain it properly.

In that: SchemaInfo do not have annotations, but we do. Our SchemaInfo 
objects here are in a kind of superposition in that we have not yet 
applied the if conditionals.

Still, I do think it is *very* helpful to name those instances after the 
SchemaInfo types, because that is absolutely the interface we are 
matching. The keys are not arbitrary. The types of the values associated 
with those keys are not arbitrary.

So, I am not sure how useful it is to make such a careful distinction. 
My instinct is "not very, especially for passers-by to this module."

> 
> Once we hand off the data to _tree_to_qlit(), we can't name it that way
> anymore, simply because _tree_to_qlit() treats it as the stupid
> recursive data structure it is, and doesn't need or want to know about
> SchemaInfo.
> 

Yes, this is fine: the data is being interpreted in a new semantic 
context. It keeps the mechanical type but loses the semantic 
information. That sounds normal to me.

"Why bother, then?"

Mostly for the notational benefit in the code BUILDING the objects. 
_tree_to_qlit is so generic you can barely describe it, but the objects 
we build to feed it are quite concrete and have names and definitions 
that can be referenced.

> I think I'd dispense with _DObject entirely, and use TreeValue
> throughout.  Yes, we'd use Any a bit more.  I doubt the additional
> complexity to *sometimes* use object instead is worthwhile.  This data

I have gotten rid of _DObject entirely in v5; though using "Any" 
everywhere doesn't seem like an obvious win to me, because I'd need to 
turn this:

_NonScalar = Union[Dict[str, _Stub], List[_Stub]]
[...]
SchemaInfo = Dict[str, object]
[...]

into this:

_JSONObject = Dict[str, _Stub]
_JSONArray = List[_Stub]
_NonScalar = Union[_JSONObject, _JSONArray]
[...]
SchemaInfo = _JSONObject
[...]

Which doesn't really strike me as any simpler or nicer on either the 
semantic or mechanical axes.

> structure is used only within this file.  It pretty much never changes
> (because JSON doesn't).  It's basically write-only in
> QAPISchemaGenIntrospectVisitor.  This means all the extra typing work

Write-only variables need typing! mypy will assume these objects are 
Dict[str, str] otherwise. We have to type them as something.

And the way I typed them ... is correct, and avoided having to name two 
more intermediary types.

> buys us is use of object instead of Any where it doesn't actually
> matter.
> 

Maybe so. Comes down to habits. My current belief is "Do not use Any if 
you do not have to." I did not have to, so I didn't.

> I would use a more telling name than TreeValue, though.  One that
> actually hints at the kind of value "representation of QObject with #if
> and comment".
> 

We discussed this on IRC; ultimately I wasn't convinced of the utility 
of naming the tree type "QObject" on the logic that if QLit is a 
QObject, a function named "QObject to QLit" didn't make sense to me anymore.

>>> Since each (sub-)tree represents a JSON value / QObject, possibly with
>>> annotations, I'd like to propose a few "obvious" (hahaha) names:
>>>
>>> * an unannotated QObject: QObject
>>>
>>> * an annotated QObject: AnnotatedQObject
>>>
>>> * a possibly annotated QObject: PossiblyAnnotatedQObject
>>>
>>>     Too long.  Rename QObject to BareQObject, then call this one QObject.
>>>
>>> This gives us:
>>>
>>>       _BareQObject = Union[None, str, bool, Dict[str, Any], List[Any]]
>>>       _AnnotatedQObject = Annotated[_QObject]
>>>       _QObject = Union[_BareQObject, _AnnotatedQObject]
>>>
>>> Feel free to replace QObject by JsonValue in these names if you like
>>> that better.  I think I'd slightly prefer JsonValue right now.
>>>

On IRC, We agreed to disagree on the semantic name and use the more 
mechanically suggestive JsonValue instead. I'll give that a spin.

(It's also kinda-sorta wrong, but everything has felt kinda-sorta wrong 
to me so far. Guess it's no better or worse.)

>>> Now back to _DObject:
>>>
>>>> (See patch 12/14 for A More Betterer Understanding of what _DObject is
>>>> used for. It will contribute to A Greater Understanding.)
>>>>
>>>> Anyway, to your questions;
>>>>
>>>> (1) _DObject was my shorthand garbage way of saying "This is a Python
>>>> Dict that represents a JSON object". Hence Dict-Object, "DObject". I
>>>> have also derisively called this a "dictly-typed" object at times.
>>>
>>> In the naming system I proposed, this is BareQDict, with an additional
>>> complication: we actually have two different types for the same thing,
>>> an anonymous one within _BareQObject, and a named one.
>>>
>>>> (2) Dict[str, Any] and Dict[str, object] are similar, but do have a
>>>
>>> The former is the anonymous one, the latter the named one.
>>>
>>
>> Kinda-sorta. I am talking about pure mypy here, and the differences
>> between typing two things this way.
>>
>> Though I think you're right: I used the "Any" form for the anonymous
>> type (inherent to the structure of a JSON compound type) and the
>> "object" form for the named forms (The SchemaInfo objects we build in
>> the visitors to pass to the generator later).
>>
>>>> semantic difference. I alluded to it by calling this a "(strict) alias";
>>>> which does not help you understand any of the following points:
>>>>
>>>> Whenever you use "Any", it basically turns off type-checking at that
>>>> boundary; it is the gradually typed boundary type. Avoid it whenever
>>>> reasonably possible.
>>>>
>>>> Example time:
>>>>
>>>>
>>>> def foo(thing: Any) -> None:
>>>>        print(thing.value)  # Sure, I guess! We'll believe you.
>>>>
>>>>
>>>> def foo(thing: object) -> None:
>>>>        print(thing.value)  # BZZT, Python object has no value prop.
>>>>
>>>>
>>>> Use "Any" when you really just cannot constrain the type, because you're
>>>> out of bourbon or you've decided to give in to the darkness inside your
>>>> heart.
>>>>
>>>> Use "object" when the type of the value actually doesn't matter, because
>>>> you are only passing it on to something else later that will do its own
>>>> type analysis or introspection on the object.
>>>>
>>>> For introspect.py, 'object' is actually a really good type when we can
>>>> use it, because we interrogate the type exhaustively upon receipt in
>>>> _tree_to_qlit.
>>>>
>>>>
>>>> That leaves one question you would almost assuredly ask as a followup:
>>>>
>>>> "Why didn't you use object for the stub type to begin with?"
>>>>
>>>> Let's say we define _stub as `object` instead, the Python object. When
>>>> _tree_to_qlit recurses on non-scalar structures, the held value there is
>>>> only known as "object" and not as str/bool/None, which causes a typing
>>>> error at that point.
>>>>
>>>> Moving the stub to "Any" tells mypy to ... not worry about what type we
>>>> actually passed here. I gave in to the darkness in my heart. It's just
>>>> too annoying without real recursion.
>>>
>>> May I have an abridged version of this in the comments?  It might look
>>> quaint in ten years, when we're all fluent in Python type annotations.
>>> But right now, at least some readers aren't, and they can use a bit of
>>> help.
>>>
>>
>> Yeah, I'm sympathetic to that.... though I'm not sure what to write or
>> where. I can add some reference points in the commit message, like this one:
>>
>> https://mypy.readthedocs.io/en/stable/dynamic_typing.html#any-vs-object
>>
>> maybe in conjunction with the named type aliases patch this is actually
>> sufficient?
> 
> I can see two solutions right now:
> 
> 1. Use Dict[str, Any] throughout
> 
>     All we need to explain is
> 
>     * What the data structure is about (JSON annotated with ifconds and
>       comments; got that, could use improvement perhaps)
> 
>     * Your work-around for the lack of recursive types (got that
>       already)
> 
>     * That the use of Any bypasses type static checking on use (shouldn't
>       be hard)
> 
>     * Where such uses are (I believe only in _tree_to_qlit(), were Any
>       can't be avoided anyway).
> 
> 2. Use Dict[str, object] where we can
> 
>     Now we get to explain a few more things:
> 
>     * Why we bother (to get stricter static type checks on use)
> 
>     * Where such uses are (I can't see any offhand)
> 
>     * Maybe also where we go from one static type to the other.
> 
> In either case, we also need to pick names that need no explanation, or
> explain them.
> 

"that need no explanation" (to whom?) Suspect this is impossible; 
there's gonna be explanations anyway.

--js



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2021-02-10 17:41 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-02 17:46 [PATCH v4 00/14] qapi: static typing conversion, pt2 John Snow
2021-02-02 17:46 ` [PATCH v4 01/14] qapi/introspect.py: assert schema is not None John Snow
2021-02-02 17:46 ` [PATCH v4 02/14] qapi/introspect.py: use _make_tree for features nodes John Snow
2021-02-03 13:49   ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 03/14] qapi/introspect.py: add _gen_features helper John Snow
2021-02-02 17:46 ` [PATCH v4 04/14] qapi/introspect.py: guard against ifcond/comment misuse John Snow
2021-02-03 14:08   ` Markus Armbruster
2021-02-03 20:42     ` John Snow
2021-02-03 21:18       ` Eduardo Habkost
2021-02-04 15:06       ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 05/14] qapi/introspect.py: Unify return type of _make_tree() John Snow
2021-02-02 17:46 ` [PATCH v4 06/14] qapi/introspect.py: replace 'extra' dict with 'comment' argument John Snow
2021-02-03 14:23   ` Markus Armbruster
2021-02-03 21:21     ` John Snow
2021-02-04  8:37       ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 07/14] qapi/introspect.py: Introduce preliminary tree typing John Snow
2021-02-03 14:30   ` Markus Armbruster
2021-02-03 21:40     ` John Snow
2021-02-02 17:46 ` [PATCH v4 08/14] qapi/introspect.py: create a typed 'Annotated' data strutcure John Snow
2021-02-03 14:47   ` Markus Armbruster
2021-02-03 21:50     ` Eduardo Habkost
2021-02-04 15:37       ` Markus Armbruster
2021-02-04 16:20         ` John Snow
2021-02-04 16:28         ` Eduardo Habkost
2021-02-05  8:45           ` Markus Armbruster
2021-02-03 23:12     ` John Snow
2021-02-05  9:10       ` Markus Armbruster
2021-02-02 17:46 ` [PATCH v4 09/14] qapi/introspect.py: improve _tree_to_qlit error message John Snow
2021-02-02 17:46 ` [PATCH v4 10/14] qapi/introspect.py: improve readability of _tree_to_qlit John Snow
2021-02-02 17:46 ` [PATCH v4 11/14] qapi/introspect.py: add type hint annotations John Snow
2021-02-03 15:15   ` Markus Armbruster
2021-02-03 23:27     ` John Snow
2021-02-05 13:42       ` Markus Armbruster
2021-02-08 21:39         ` John Snow
2021-02-08 21:53           ` John Snow
2021-02-09  9:06           ` Markus Armbruster
2021-02-10 17:31             ` John Snow
2021-02-02 17:46 ` [PATCH v4 12/14] qapi/introspect.py: add introspect.json dummy types John Snow
2021-02-02 17:46 ` [PATCH v4 13/14] qapi/introspect.py: Add docstring to _tree_to_qlit John Snow
2021-02-02 17:46 ` [PATCH v4 14/14] qapi/introspect.py: Update copyright and authors list John Snow

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.