All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Snow <jsnow@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <michael.roth@amd.com>,
	Cleber Rosa <crosa@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Eduardo Habkost <ehabkost@redhat.com>
Subject: Re: [PATCH v2 3/6] qapi/parser: add type hint annotations (QAPIDoc)
Date: Tue, 28 Sep 2021 19:25:29 -0400	[thread overview]
Message-ID: <CAFn=p-Y4uqnLEvemPkTwYAdnFfoPLojyjRsWbD2zsEBpfvpqhQ@mail.gmail.com> (raw)
In-Reply-To: <87o8943brt.fsf@dusky.pond.sub.org>

[-- Attachment #1: Type: text/plain, Size: 15968 bytes --]

On Tue, Sep 7, 2021 at 6:44 AM Markus Armbruster <armbru@redhat.com> wrote:

> John Snow <jsnow@redhat.com> writes:
>
> > Annotations do not change runtime behavior.
> >
> > This commit adds mostly annotations, but uses a TYPE_CHECKING runtime
> > check to conditionally import dependencies, which only triggers during
> > runs of mypy.
>
> Please add a reference to
> https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles
>
>
OK.


> > Signed-off-by: John Snow <jsnow@redhat.com>
> >
> > ---
> >
> > TopLevelExpr, an idea from previous drafts, makes a return here in order
> > to give a semantic meaning to check_expr(). The type is intended to be
> > used more in forthcoming commits (pt5c), but I opted to include it now
> > instead of creating yet-another Dict[str, object] type hint that I would
> > forget to change later.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  scripts/qapi/parser.py | 77 ++++++++++++++++++++++++++----------------
> >  1 file changed, 48 insertions(+), 29 deletions(-)
> >
> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> > index 3ddde318376..b1e2fa5c577 100644
> > --- a/scripts/qapi/parser.py
> > +++ b/scripts/qapi/parser.py
> > @@ -18,6 +18,7 @@
> >  import os
> >  import re
> >  from typing import (
> > +    TYPE_CHECKING,
> >      Dict,
> >      List,
> >      Optional,
> > @@ -30,6 +31,15 @@
> >  from .source import QAPISourceInfo
> >
> >
> > +if TYPE_CHECKING:
> > +    # pylint: disable=cyclic-import
> > +    # TODO: Remove cycle. [schema -> expr -> parser -> schema]
>

WRT this todo: you mentioned that you'd prefer having some idea or plan for
how to eliminate the cycle in order to let this band-aid fly. How about
adding a 'doc' member to e.g. QAPISchemaFeature and doing the connection
entirely inside of schema.py, and dropping connect_member() and
connect_feature()?

Would that be serviceable?


> > +    from .schema import QAPISchemaFeature, QAPISchemaMember
> > +
> > +
> > +#: Represents a single Top Level QAPI schema expression.
> > +TopLevelExpr = Dict[str, object]
>
> Related: _ExprValue below, and _JSONObject in expr.py.  The latter's
> comment gives the best rationale (except I don't get "the purpose of
> this module is to interrogate that type").
>
>
in expr.py, the purpose of that module (expr) is explicitly to interrogate
(check, validate) the shape of arbitrary JSON objects. I am saying that a
more strict definition specifically here in expr.py is not necessary
because the entire purpose of expr.py is to, at runtime, verify the shape
of any such variables that might be annotated that way. I am drawing some
distinction to introspect.py, where we're generating that data ourselves --
the stronger types are more viable there, because we know what they are
already.
(again, sorry about mypy's lack of recursive typing, I hate it too, I
promise)


> I think we'd like to have
>
> * A recursive type for JSON value (in our bastardized version of JSON)
>
>   This is Union of bool, str, List[Self], Dict[str, Self].  It's what
>   .get_expr() returns.
>
>   Since mypy can't do recursive, we approximate with _ExprValue.
>
> * A recursive type for JSON object
>
>   This is the Dict branch of the above.  It's what check_keys() &
>   friends take as argument.
>
>   We approximate with _JSONObject.
>
>   Same for the List branch would make sense if we had a use for the
>   type.
>
> * A recursive type for TOP-LEVEL-EXPR
>
>   Actually the same type as the previous one, to be used only for the
>   schema's top-level expressions.  It's the elements of
>   QAPISchemaParser.exprs[], and what check_exprs() takes as argument.
>
>   We approximate with TopLevelExpr, but so far use it only for
>   check_exprs().
>
>   Doesn't really improve type checking, but may serve as documentation.
>
>
That's the intended effect here -- to help highlight which functions
operate on those top-level expressions, and which might be invoked in more
arbitrary cases.
Consider also a hypothetical TOP-LEVEL-EXPR that is actually a bona-fide
object with additional metadata, too. In these cases, the type really will
be different.


> Shouldn't these types be all defined in one place, namely right here?
> Bonus: we need to explain the mypy sadness just once then.
>
> Shouldn't their names be more systematic?  _ExprValue, _JSONObject and
> TopLevelExpr hardly suggest any relation...
>
>
I drop _JSONObject in pt5c, it was a stop-gap. For the rest, I'll see about
a more rigorous consolidation now that we're this far in.

pt5c was intended as a "cleanup" step that did some of that consolidation
of nearly-redundant types; but I wanted all of parser.py under the mypy gun
*first*.
Will you take a raincheck here and we'll focus on the consolidation in the
next series? I agree it's worth doing.

(I can add a 'FIXME' that will 100% need to be fixed before I move
scripts/qapi under python/qemu/qapi -- the linter config there prohibits
them, so you can be sure I can't ignore it.)


> > +
> >  # Return value alias for get_expr().
> >  _ExprValue = Union[List[object], Dict[str, object], str, bool]
> >
> > @@ -447,7 +457,8 @@ class QAPIDoc:
> >      """
> >
> >      class Section:
> > -        def __init__(self, parser, name=None, indent=0):
> > +        def __init__(self, parser: QAPISchemaParser,
> > +                     name: Optional[str] = None, indent: int = 0):
> >              # parser, for error messages about indentation
> >              self._parser = parser
> >              # optional section name (argument/member or section name)
> > @@ -459,7 +470,7 @@ def __init__(self, parser, name=None, indent=0):
> >          def __bool__(self) -> bool:
> >              return bool(self.name or self.text)
> >
> > -        def append(self, line):
> > +        def append(self, line: str) -> None:
> >              # Strip leading spaces corresponding to the expected indent
> level
> >              # Blank lines are always OK.
> >              if line:
> > @@ -474,39 +485,40 @@ def append(self, line):
> >              self.text += line.rstrip() + '\n'
> >
> >      class ArgSection(Section):
> > -        def __init__(self, parser, name, indent=0):
> > +        def __init__(self, parser: QAPISchemaParser,
> > +                     name: Optional[str] = None, indent: int = 0):
>
> Why not name: str?  All callers pass a str argument...
>
>
Maybe a holdover from when I was trying to stick to LSP for initializers. I
think I have since learned that mypy will only whine about LSP if you hold
pointers to the initializer to be invoked in factory functions. I'll fix it
and we'll worry about the LSP warnings if they ever decide to show up.


> >              super().__init__(parser, name, indent)
> > -            self.member = None
> > +            self.member: Optional['QAPISchemaMember'] = None
>
> I guess you need to quote 'QAPISchemaMember', because we actually import
> it only if TYPE_CHECKING.  More of the same below.
>
>
Yep. Future Python versions may be better about this (You don't have to
quote any types, ever -- they're parsed in a later pass) but we just don't
have that luxury yet.

See https://www.python.org/dev/peps/pep-0563/
Also see
https://www.reddit.com/r/Python/comments/muyz5h/pep_563_getting_rolled_back_from_python_310/
for some discussion on why this feature was not made the default for 3.10

(TLDR: It breaks a library called pydantic that largely does exactly what
expr.py does, using type annotations to define the valid shapes of objects.
I'd love to use it in QAPI, actually, ... a discussion for yet another day.)


> >
> > -        def connect(self, member):
> > +        def connect(self, member: 'QAPISchemaMember') -> None:
> >              self.member = member
> >
> > -    def __init__(self, parser, info):
> > +    def __init__(self, parser: QAPISchemaParser, info: QAPISourceInfo):
> >          # self._parser is used to report errors with QAPIParseError.
> The
> >          # resulting error position depends on the state of the parser.
> >          # It happens to be the beginning of the comment.  More or less
> >          # servicable, but action at a distance.
> >          self._parser = parser
> >          self.info = info
> > -        self.symbol = None
> > +        self.symbol: Optional[str] = None
> >          self.body = QAPIDoc.Section(parser)
> >          # dict mapping parameter name to ArgSection
> > -        self.args = OrderedDict()
> > -        self.features = OrderedDict()
> > +        self.args: Dict[str, QAPIDoc.ArgSection] = OrderedDict()
> > +        self.features: Dict[str, QAPIDoc.ArgSection] = OrderedDict()
> >          # a list of Section
> > -        self.sections = []
> > +        self.sections: List[QAPIDoc.Section] = []
> >          # the current section
> >          self._section = self.body
> >          self._append_line = self._append_body_line
> >
> > -    def has_section(self, name):
> > +    def has_section(self, name: str) -> bool:
> >          """Return True if we have a section with this name."""
> >          for i in self.sections:
> >              if i.name == name:
> >                  return True
> >          return False
> >
> > -    def append(self, line):
> > +    def append(self, line: str) -> None:
> >          """
> >          Parse a comment line and add it to the documentation.
> >
> > @@ -527,18 +539,18 @@ def append(self, line):
> >          line = line[1:]
> >          self._append_line(line)
> >
> > -    def end_comment(self):
> > +    def end_comment(self) -> None:
> >          self._end_section()
> >
> >      @staticmethod
> > -    def _is_section_tag(name):
> > +    def _is_section_tag(name: str) -> bool:
> >          return name in ('Returns:', 'Since:',
> >                          # those are often singular or plural
> >                          'Note:', 'Notes:',
> >                          'Example:', 'Examples:',
> >                          'TODO:')
> >
> > -    def _append_body_line(self, line):
> > +    def _append_body_line(self, line: str) -> None:
> >          """
> >          Process a line of documentation text in the body section.
> >
> > @@ -578,7 +590,7 @@ def _append_body_line(self, line):
> >              # This is a free-form documentation block
> >              self._append_freeform(line)
> >
> > -    def _append_args_line(self, line):
> > +    def _append_args_line(self, line: str) -> None:
> >          """
> >          Process a line of documentation text in an argument section.
> >
> > @@ -624,7 +636,7 @@ def _append_args_line(self, line):
> >
> >          self._append_freeform(line)
> >
> > -    def _append_features_line(self, line):
> > +    def _append_features_line(self, line: str) -> None:
> >          name = line.split(' ', 1)[0]
> >
> >          if name.startswith('@') and name.endswith(':'):
> > @@ -656,7 +668,7 @@ def _append_features_line(self, line):
> >
> >          self._append_freeform(line)
> >
> > -    def _append_various_line(self, line):
> > +    def _append_various_line(self, line: str) -> None:
> >          """
> >          Process a line of documentation text in an additional section.
> >
> > @@ -692,7 +704,11 @@ def _append_various_line(self, line):
> >
> >          self._append_freeform(line)
> >
> > -    def _start_symbol_section(self, symbols_dict, name, indent):
> > +    def _start_symbol_section(
> > +            self,
> > +            symbols_dict: Dict[str, 'QAPIDoc.ArgSection'],
>
> The need to quote this within the very class that defines it is
> annoying.
>

Yep. Python has weird scoping rules for nested classes. In my own hobby
code I've largely started avoiding ever using them, because they don't seem
to be worth the trouble.

Long story short: the name isn't available yet because at the time this
annotation is evaluated, we're still in the middle of defining it. I wish
there was a syntax for $thisclass for a nascent self-reference, but alas.

(Yes, I dislike it too. Nothing I can do about it, to my knowledge.)


>
> > +            name: str,
> > +            indent: int) -> None:
> >          # FIXME invalid names other than the empty string aren't flagged
> >          if not name:
> >              raise QAPIParseError(self._parser, "invalid parameter name")
> > @@ -704,13 +720,14 @@ def _start_symbol_section(self, symbols_dict,
> name, indent):
> >          self._section = QAPIDoc.ArgSection(self._parser, name, indent)
> >          symbols_dict[name] = self._section
> >
> > -    def _start_args_section(self, name, indent):
> > +    def _start_args_section(self, name: str, indent: int) -> None:
> >          self._start_symbol_section(self.args, name, indent)
> >
> > -    def _start_features_section(self, name, indent):
> > +    def _start_features_section(self, name: str, indent: int) -> None:
> >          self._start_symbol_section(self.features, name, indent)
> >
> > -    def _start_section(self, name=None, indent=0):
> > +    def _start_section(self, name: Optional[str] = None,
> > +                       indent: int = 0) -> None:
> >          if name in ('Returns', 'Since') and self.has_section(name):
> >              raise QAPIParseError(self._parser,
> >                                   "duplicated '%s' section" % name)
> > @@ -718,7 +735,7 @@ def _start_section(self, name=None, indent=0):
> >          self._section = QAPIDoc.Section(self._parser, name, indent)
> >          self.sections.append(self._section)
> >
> > -    def _end_section(self):
> > +    def _end_section(self) -> None:
> >          if self._section:
> >              text = self._section.text = self._section.text.strip()
> >              if self._section.name and (not text or text.isspace()):
> > @@ -727,7 +744,7 @@ def _end_section(self):
> >                      "empty doc section '%s'" % self._section.name)
> >              self._section = QAPIDoc.Section(self._parser)
> >
> > -    def _append_freeform(self, line):
> > +    def _append_freeform(self, line: str) -> None:
> >          match = re.match(r'(@\S+:)', line)
> >          if match:
> >              raise QAPIParseError(self._parser,
> > @@ -735,28 +752,30 @@ def _append_freeform(self, line):
> >                                   % match.group(1))
> >          self._section.append(line)
> >
> > -    def connect_member(self, member):
> > +    def connect_member(self, member: 'QAPISchemaMember') -> None:
> >          if member.name not in self.args:
> >              # Undocumented TODO outlaw
> >              self.args[member.name] = QAPIDoc.ArgSection(self._parser,
> >                                                          member.name)
> >          self.args[member.name].connect(member)
> >
> > -    def connect_feature(self, feature):
> > +    def connect_feature(self, feature: 'QAPISchemaFeature') -> None:
> >          if feature.name not in self.features:
> >              raise QAPISemError(feature.info,
> >                                 "feature '%s' lacks documentation"
> >                                 % feature.name)
> >          self.features[feature.name].connect(feature)
> >
> > -    def check_expr(self, expr):
> > +    def check_expr(self, expr: TopLevelExpr) -> None:
> >          if self.has_section('Returns') and 'command' not in expr:
> >              raise QAPISemError(self.info,
> >                                 "'Returns:' is only valid for commands")
> >
> > -    def check(self):
> > +    def check(self) -> None:
> >
> > -        def check_args_section(args, what):
> > +        def check_args_section(
> > +                args: Dict[str, QAPIDoc.ArgSection], what: str
> > +        ) -> None:
>
> This is the fourth use of Dict[str, QAPIDoc.ArgSection].  Still fine,
> but if we acquire even more, we should consider giving it a name.
>
>
If they share something in common that makes it name-able, sure. For now:
eh?


> >              bogus = [name for name, section in args.items()
> >                       if not section.member]
> >              if bogus:
>
>

[-- Attachment #2: Type: text/html, Size: 22830 bytes --]

  reply	other threads:[~2021-09-28 23:26 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20 22:57 [PATCH v2 0/6] qapi: static typing conversion, pt5b John Snow
2021-05-20 22:57 ` [PATCH v2 1/6] qapi/parser: fix unused check_args_section arguments John Snow
2021-05-20 22:57 ` [PATCH v2 2/6] qapi/parser: Allow empty QAPIDoc Sections John Snow
2021-09-07  8:28   ` Markus Armbruster
2021-09-24 22:37     ` John Snow
2021-05-20 22:57 ` [PATCH v2 3/6] qapi/parser: add type hint annotations (QAPIDoc) John Snow
2021-09-07 10:44   ` Markus Armbruster
2021-09-28 23:25     ` John Snow [this message]
2021-09-30 10:04       ` Markus Armbruster
2021-05-20 22:57 ` [PATCH v2 4/6] qapi/parser: enable mypy checks John Snow
2021-05-20 22:57 ` [PATCH v2 5/6] qapi/parser: Silence too-few-public-methods warning John Snow
2021-05-20 22:57 ` [PATCH v2 6/6] qapi/parser: enable pylint checks John Snow
2021-08-05  0:20 ` [PATCH v2 0/6] qapi: static typing conversion, pt5b John Snow
2021-09-07 10:56 ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFn=p-Y4uqnLEvemPkTwYAdnFfoPLojyjRsWbD2zsEBpfvpqhQ@mail.gmail.com' \
    --to=jsnow@redhat.com \
    --cc=armbru@redhat.com \
    --cc=crosa@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=michael.roth@amd.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.