From: John Snow <jsnow@redhat.com>
Date: Thu, 30 Sep 2021 13:43:41 -0400
Subject: Re: [PATCH v3 05/13] qapi/parser: improve detection of '@symbol:' preface
To: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth, Cleber Rosa, Eric Blake, qemu-devel, Eduardo Habkost
References: <20210929194428.1038496-1-jsnow@redhat.com> <20210929194428.1038496-6-jsnow@redhat.com> <875yuisbdr.fsf@dusky.pond.sub.org>
In-Reply-To: <875yuisbdr.fsf@dusky.pond.sub.org>

On Thu, Sep 30, 2021 at 4:42 AM Markus Armbruster <armbru@redhat.com> wrote:

> John Snow <jsnow@redhat.com> writes:
>
> > Leading and trailing whitespace are now discarded, addressing the FIXME
> > comment. A new error is raised to detect this accidental case.
> >
> > Parsing for args sections is left alone here; the 'name' variable is
> > moved into the only block where it is used.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> >
> > ---
> >
> > Tangentially related to delinting in that removing 'FIXME' comments is a
> > goal for pylint. My goal is to allow 'TODO' to be checked in, but
> > 'FIXME' should be fixed prior to inclusion.
> >
> > Arbitrary, but that's life for you.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  scripts/qapi/parser.py                              | 13 ++++++++-----
> >  tests/qapi-schema/doc-whitespace-leading-symbol.err |  1 +
> >  .../qapi-schema/doc-whitespace-leading-symbol.json  |  6 ++++++
> >  tests/qapi-schema/doc-whitespace-leading-symbol.out |  0
> >  .../qapi-schema/doc-whitespace-trailing-symbol.err  |  1 +
> >  .../qapi-schema/doc-whitespace-trailing-symbol.json |  6 ++++++
> >  .../qapi-schema/doc-whitespace-trailing-symbol.out  |  0
> >  tests/qapi-schema/meson.build                       |  2 ++
> >  8 files changed, 24 insertions(+), 5 deletions(-)
> >  create mode 100644 tests/qapi-schema/doc-whitespace-leading-symbol.err
> >  create mode 100644 tests/qapi-schema/doc-whitespace-leading-symbol.json
> >  create mode 100644 tests/qapi-schema/doc-whitespace-leading-symbol.out
> >  create mode 100644 tests/qapi-schema/doc-whitespace-trailing-symbol.err
> >  create mode 100644 tests/qapi-schema/doc-whitespace-trailing-symbol.json
> >  create mode 100644 tests/qapi-schema/doc-whitespace-trailing-symbol.out
> >
> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> > index bfd2dbfd9a2..2f93a752f66 100644
> > --- a/scripts/qapi/parser.py
> > +++ b/scripts/qapi/parser.py
> > @@ -549,18 +549,21 @@ def _append_body_line(self, line):
> >
> >          Else, append the line to the current section.
> >          """
> > -        name = line.split(' ', 1)[0]
> > -        # FIXME not nice: things like '#  @foo:' and '# @foo: ' aren't
> > -        # recognized, and get silently treated as ordinary text
> > -        if not self.symbol and not self.body.text and line.startswith('@'):
> > -            if not line.endswith(':'):
> > +        stripped = line.strip()
> > +
> > +        if not self.symbol and not self.body.text and stripped.startswith('@'):
> > +            if not stripped.endswith(':'):
> >                  raise QAPIParseError(self._parser, "line should end with ':'")
> > +            if not stripped == line:
> > +                raise QAPIParseError(
> > +                    self._parser, "extra whitespace around symbol declaration")
>
> This rejects both leading and trailing whitespace.  Rejecting leading
> whitespace is good.  Rejecting trailing whitespace feels a bit pedantic,
> and it might not extend to the related case I'll point out below.
>

I erred on the conservative side. I wasn't sure how permissive we really
wanted to be.

> Have you considered a regexp instead?  Say
>
>            match = re.match(r'(\s*)@([^:]*)(:?)(\s*)(.*)$', line)
>
> Then match.group(n) is
>
>      n=1  leading whitespace, if any
>      n=2  symbol
>      n=3  trailing colon, if any
>      n=4  trailing whitespace, if any
>      n=5  trailing text, if any
>
> Omit the subgroups you don't need.
>

Sensible, for a more comprehensive refactoring.

> >              self.symbol = line[1:-1]
> >              # FIXME invalid names other than the empty string aren't flagged
> >              if not self.symbol:
> >                  raise QAPIParseError(self._parser, "invalid name")
> >          elif self.symbol:
> >              # This is a definition documentation block
> > +            name = line.split(' ', 1)[0]
> >              if name.startswith('@') and name.endswith(':'):
> >                  self._append_line = self._append_args_line
> >                  self._append_args_line(line)
>
> Same issue here, and in _append_args_line().  To reproduce, I hacked up
> doc-good.json like so
>
>     diff --git a/tests/qapi-schema/doc-good.json b/tests/qapi-schema/doc-good.json
>     index 86dc25d2bd..977fcbad48 100644
>     --- a/tests/qapi-schema/doc-good.json
>     +++ b/tests/qapi-schema/doc-good.json
>     @@ -133,7 +133,7 @@
>      ##
>      # @cmd:
>      #
>     -# @arg1: the first argument
>     +#  @arg1: the first argument
>      #
>      # @arg2: the second
>      #        argument
>
> and got
>
>     $ PYTHONPATH=/work/armbru/qemu/scripts python3 /work/armbru/qemu/tests/qapi-schema/test-qapi.py -d tests/qapi-schema doc-good.json
>     doc-good FAIL
>     --- tests/qapi-schema/doc-good.out
>     +++
>     @@ -149,12 +149,12 @@
>      == Another subsection
>      doc symbol=cmd
>          body=
>     -
>     -    arg=arg1
>     -the first argument
>     +@arg1: the first argument
>          arg=arg2
>      the second
>      argument
>     +    arg=arg1
>     +
>          arg=arg3
>
>          feature=cmd-feat1
>
> [...]
>

OK, this one needs more time in the oven, so I will tackle it separately
and later, possibly as part of the sphinx-docs work I want to get to soon.
We may drop it from this series to avoid holding it up.

(The FIXME again keeps me honest here ... !)

Thanks for the reviews!

--js


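For readers following the thread, here is a minimal sketch of how the regexp suggested in the review splits a doc comment line, next to the `line.split(' ', 1)[0]` check that the patch keeps for argument sections. The helper names `split_symbol_line` and `first_word` are hypothetical and exist only for this illustration; neither is part of scripts/qapi/parser.py, and the sketch assumes the line has already had its leading '#' and one following space removed, as in the reproduction above.

```python
import re

# Pattern quoted verbatim from the review; the groups are, in order:
# leading whitespace, symbol, optional colon, trailing whitespace, text.
SYMBOL_RE = re.compile(r'(\s*)@([^:]*)(:?)(\s*)(.*)$')


def split_symbol_line(line):
    """Return the five groups as a tuple, or None if there is no match.

    Hypothetical helper for illustration only.
    """
    match = SYMBOL_RE.match(line)
    if match is None:
        return None
    return match.groups()


def first_word(line):
    """Mirror the `line.split(' ', 1)[0]` expression from the quoted patch."""
    return line.split(' ', 1)[0]


if __name__ == '__main__':
    samples = (
        '@foo:',
        ' @foo:',
        '@foo: ',
        '@arg1: the first argument',
        ' @arg1: the first argument',
        'plain text',
    )
    for sample in samples:
        # Show both views of the same line side by side.
        print(repr(sample), split_symbol_line(sample), repr(first_word(sample)))
```

With one extra leading space, `first_word` returns an empty string, so the `name.startswith('@')` test cannot succeed and the line falls through as ordinary text, which matches the doc-good.json reproduction. The regexp still captures the symbol and reports the extra whitespace in its first group, leaving the caller free to reject or tolerate it.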