linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
@ 2019-09-24 23:02 Kees Cook
  2019-09-24 23:02 ` [PATCH 1/2] doc-rst: Reduce CSS padding around Field Kees Cook
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Kees Cook @ 2019-09-24 23:02 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Kees Cook, Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

Commit log from Patch 2 repeated here for cover letter:

In order to have the MAINTAINERS file visible in the rendered ReST
output, this makes some small changes to the existing MAINTAINERS file
to allow for better machine processing, and adds a new Sphinx directive
"maintainers-include" to perform the rendering.

Features include:
- Per-subsystem reference links: subsystem maintainer entries can be
  trivially linked to both internally and external. For example:
  https://www.kernel.org/doc/html/latest/process/maintainers.html#secure-computing

- Internally referenced .rst files are linked so they can be followed
  when browsing the resulting rendering. This allows, for example, the
  future addition of maintainer profiles to be automatically linked.

- Field name expansion: instead of the short fields (e.g. "M", "F",
  "K"), use the indicated inline "full names" for the fields (which are
  marked with "*"s in MAINTAINERS) so that a rendered subsystem entry
  is more human readable. Email lists are additionally comma-separated.
  For example:

    SECURE COMPUTING
        Mail:     Kees Cook <keescook@chromium.org>
        Reviewer: Andy Lutomirski <luto@amacapital.net>,
                  Will Drewry <wad@chromium.org>
        SCM:      git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git seccomp
        Status:   Supported
        Files:    kernel/seccomp.c include/uapi/linux/seccomp.h
                  include/linux/seccomp.h tools/testing/selftests/seccomp/*
                  tools/testing/selftests/kselftest_harness.h
                  userspace-api/seccomp_filter
        Content regex:  \bsecure_computing \bTIF_SECCOMP\b

---
Kees Cook (2):
  doc-rst: Reduce CSS padding around Field
  doc-rst: Programmatically render MAINTAINERS into ReST

 Documentation/conf.py                         |   3 +-
 Documentation/process/index.rst               |   1 +
 Documentation/process/maintainers.rst         |   1 +
 .../sphinx-static/theme_overrides.css         |  10 +
 Documentation/sphinx/maintainers_include.py   | 194 ++++++++++++++++++
 MAINTAINERS                                   |  62 +++---
 6 files changed, 240 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/process/maintainers.rst
 create mode 100755 Documentation/sphinx/maintainers_include.py

-- 
2.17.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] doc-rst: Reduce CSS padding around Field
  2019-09-24 23:02 [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST Kees Cook
@ 2019-09-24 23:02 ` Kees Cook
  2019-09-24 23:02 ` [PATCH 2/2] doc-rst: Programmatically render MAINTAINERS into ReST Kees Cook
  2019-10-01 14:31 ` [PATCH 0/2] docs: " Jonathan Corbet
  2 siblings, 0 replies; 11+ messages in thread
From: Kees Cook @ 2019-09-24 23:02 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Kees Cook, Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

Right now any ":Field Name: Field Contents" lines end up with significant
padding due to CSS from the "table" CSS which rightly needs padding to
make tables readable. However, field lists don't need this as they tend
to be stacked together. The future heavy use of fields in the parsed
MAINTAINERS file needs this cleaned up, and existing users look better
too. Note the needless white space (and misalignment of name/contents)
between "Date" and "Author":

https://www.kernel.org/doc/html/latest/accounting/psi.html

This patch fixes this by lowering the padding with a more specific CSS.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 Documentation/sphinx-static/theme_overrides.css | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/sphinx-static/theme_overrides.css b/Documentation/sphinx-static/theme_overrides.css
index e21e36cd6761..459ec5b29d68 100644
--- a/Documentation/sphinx-static/theme_overrides.css
+++ b/Documentation/sphinx-static/theme_overrides.css
@@ -53,6 +53,16 @@ div[class^="highlight"] pre {
     line-height: normal;
 }
 
+/* Keep fields from being strangely far apart due to inheirited table CSS. */
+.rst-content table.field-list th.field-name {
+    padding-top: 1px;
+    padding-bottom: 1px;
+}
+.rst-content table.field-list td.field-body {
+    padding-top: 1px;
+    padding-bottom: 1px;
+}
+
 @media screen {
 
     /* content column
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] doc-rst: Programmatically render MAINTAINERS into ReST
  2019-09-24 23:02 [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST Kees Cook
  2019-09-24 23:02 ` [PATCH 1/2] doc-rst: Reduce CSS padding around Field Kees Cook
@ 2019-09-24 23:02 ` Kees Cook
  2019-10-01 14:31 ` [PATCH 0/2] docs: " Jonathan Corbet
  2 siblings, 0 replies; 11+ messages in thread
From: Kees Cook @ 2019-09-24 23:02 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Kees Cook, Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

In order to have the MAINTAINERS file visible in the rendered ReST
output, this makes some small changes to the existing MAINTAINERS file
to allow for better machine processing, and adds a new Sphinx directive
"maintainers-include" to perform the rendering.

Features include:
- Per-subsystem reference links: subsystem maintainer entries can be
  trivially linked to both internally and external. For example:
  https://www.kernel.org/doc/html/latest/process/maintainers.html#secure-computing

- Internally referenced .rst files are linked so they can be followed
  when browsing the resulting rendering. This allows, for example, the
  future addition of maintainer profiles to be automatically linked.

- Field name expansion: instead of the short fields (e.g. "M", "F",
  "K"), use the indicated inline "full names" for the fields (which are
  marked with "*"s in MAINTAINERS) so that a rendered subsystem entry
  is more human readable. Email lists are additionally comma-separated.
  For example:

    SECURE COMPUTING
	Mail:	  Kees Cook <keescook@chromium.org>
	Reviewer: Andy Lutomirski <luto@amacapital.net>,
		  Will Drewry <wad@chromium.org>
	SCM:	  git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git seccomp
	Status:	  Supported
	Files:	  kernel/seccomp.c include/uapi/linux/seccomp.h
		  include/linux/seccomp.h tools/testing/selftests/seccomp/*
		  tools/testing/selftests/kselftest_harness.h
		  userspace-api/seccomp_filter
	Content regex:	\bsecure_computing \bTIF_SECCOMP\b

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 Documentation/conf.py                       |   3 +-
 Documentation/process/index.rst             |   1 +
 Documentation/process/maintainers.rst       |   1 +
 Documentation/sphinx/maintainers_include.py | 194 ++++++++++++++++++++
 MAINTAINERS                                 |  62 ++++---
 5 files changed, 230 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/process/maintainers.rst
 create mode 100755 Documentation/sphinx/maintainers_include.py

diff --git a/Documentation/conf.py b/Documentation/conf.py
index a8fe845832bc..3c7bdf4cd31f 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -37,7 +37,8 @@ needs_sphinx = '1.3'
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
 extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain',
-              'kfigure', 'sphinx.ext.ifconfig', 'automarkup']
+              'kfigure', 'sphinx.ext.ifconfig', 'automarkup',
+              'maintainers_include']
 
 # The name of the math extension changed on Sphinx 1.4
 if (major == 1 and minor > 3) or (major > 1):
diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
index e2c9ffc682c5..e2fb0c9652ac 100644
--- a/Documentation/process/index.rst
+++ b/Documentation/process/index.rst
@@ -46,6 +46,7 @@ Other guides to the community that are of interest to most developers are:
    kernel-docs
    deprecated
    embargoed-hardware-issues
+   maintainers
 
 These are some overall technical guides that have been put here for now for
 lack of a better place.
diff --git a/Documentation/process/maintainers.rst b/Documentation/process/maintainers.rst
new file mode 100644
index 000000000000..6174cfb4138f
--- /dev/null
+++ b/Documentation/process/maintainers.rst
@@ -0,0 +1 @@
+.. maintainers-include::
diff --git a/Documentation/sphinx/maintainers_include.py b/Documentation/sphinx/maintainers_include.py
new file mode 100755
index 000000000000..de62dd3dabd5
--- /dev/null
+++ b/Documentation/sphinx/maintainers_include.py
@@ -0,0 +1,194 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+# -*- coding: utf-8; mode: python -*-
+# pylint: disable=R0903, C0330, R0914, R0912, E0401
+
+u"""
+    maintainers-include
+    ~~~~~~~~~~~~~~~~~~~
+
+    Implementation of the ``maintainers-include`` reST-directive.
+
+    :copyright:  Copyright (C) 2019  Kees Cook <keescook@chromium.org>
+    :license:    GPL Version 2, June 1991 see linux/COPYING for details.
+
+    The ``maintainers-include`` reST-directive performs extensive parsing
+    specific to the Linux kernel's standard "MAINTAINERS" file, in an
+    effort to avoid needing to heavily mark up the original plain text.
+"""
+
+import re
+import os.path
+
+from docutils import statemachine
+from docutils.utils.error_reporting import ErrorString
+from docutils.parsers.rst import Directive
+from docutils.parsers.rst.directives.misc import Include
+
+__version__  = '1.0'
+
+def setup(app):
+    app.add_directive("maintainers-include", MaintainersInclude)
+    return dict(
+        version = __version__,
+        parallel_read_safe = True,
+        parallel_write_safe = True
+    )
+
+class MaintainersInclude(Include):
+    u"""MaintainersInclude (``maintainers-include``) directive"""
+    required_arguments = 0
+
+    def parse_maintainers(self, path):
+        """Parse all the MAINTAINERS lines into ReST for human-readability"""
+
+        result = list()
+        result.append(".. _maintainers:")
+        result.append("")
+
+        # Poor man's state machine.
+        descriptions = False
+        maintainers = False
+        subsystems = False
+
+        # Field letter to field name mapping.
+        field_letter = None
+        fields = dict()
+
+        prev = None
+        field_prev = ""
+        field_content = ""
+
+        for line in open(path):
+            # Have we reached the end of the preformatted Descriptions text?
+            if descriptions and line.startswith('Maintainers'):
+                descriptions = False
+                # Ensure a blank line following the last "|"-prefixed line.
+                result.append("")
+
+            # Start subsystem processing? This is to skip processing the text
+            # between the Maintainers heading and the first subsystem name.
+            if maintainers and not subsystems:
+                if re.search('^[A-Z0-9]', line):
+                    subsystems = True
+
+            # Drop needless input whitespace.
+            line = line.rstrip()
+
+            # Linkify all non-wildcard refs to ReST files in Documentation/.
+            pat = '(Documentation/([^\s\?\*]*)\.rst)'
+            m = re.search(pat, line)
+            if m:
+                # maintainers.rst is in a subdirectory, so include "../".
+                line = re.sub(pat, ':doc:`%s <../%s>`' % (m.group(2), m.group(2)), line)
+
+            # Check state machine for output rendering behavior.
+            output = None
+            if descriptions:
+                # Escape the escapes in preformatted text.
+                output = "| %s" % (line.replace("\\", "\\\\"))
+                # Look for and record field letter to field name mappings:
+                #   R: Designated *reviewer*: FullName <address@domain>
+                m = re.search("\s(\S):\s", line)
+                if m:
+                    field_letter = m.group(1)
+                if field_letter and not field_letter in fields:
+                    m = re.search("\*([^\*]+)\*", line)
+                    if m:
+                        fields[field_letter] = m.group(1)
+            elif subsystems:
+                # Skip empty lines: subsystem parser adds them as needed.
+                if len(line) == 0:
+                    continue
+                # Subsystem fields are batched into "field_content"
+                if line[1] != ':':
+                    # Render a subsystem entry as:
+                    #   SUBSYSTEM NAME
+                    #   ~~~~~~~~~~~~~~
+
+                    # Flush pending field content.
+                    output = field_content + "\n\n"
+                    field_content = ""
+
+                    # Collapse whitespace in subsystem name.
+                    heading = re.sub("\s+", " ", line)
+                    output = output + "%s\n%s" % (heading, "~" * len(heading))
+                    field_prev = ""
+                else:
+                    # Render a subsystem field as:
+                    #   :Field: entry
+                    #           entry...
+                    field, details = line.split(':', 1)
+                    details = details.strip()
+
+                    # Mark paths (and regexes) as literal text for improved
+                    # readability and to escape any escapes.
+                    if field in ['F', 'N', 'X', 'K']:
+                        # But only if not already marked :)
+                        if not ':doc:' in details:
+                            details = '``%s``' % (details)
+
+                    # Comma separate email field continuations.
+                    if field == field_prev and field_prev in ['M', 'R', 'L']:
+                        field_content = field_content + ","
+
+                    # Do not repeat field names, so that field entries
+                    # will be collapsed together.
+                    if field != field_prev:
+                        output = field_content + "\n"
+                        field_content = ":%s:" % (fields.get(field, field))
+                    field_content = field_content + "\n\t%s" % (details)
+                    field_prev = field
+            else:
+                output = line
+
+            # Re-split on any added newlines in any above parsing.
+            if output != None:
+                for separated in output.split('\n'):
+                    result.append(separated)
+
+            # Update the state machine when we find heading separators.
+            if line.startswith('----------'):
+                if prev.startswith('Descriptions'):
+                    descriptions = True
+                if prev.startswith('Maintainers'):
+                    maintainers = True
+
+            # Retain previous line for state machine transitions.
+            prev = line
+
+        # Flush pending field contents.
+        if field_content != "":
+            for separated in field_content.split('\n'):
+                result.append(separated)
+
+        output = "\n".join(result)
+        # For debugging the pre-rendered results...
+        #print(output, file=open("/tmp/MAINTAINERS.rst", "w"))
+
+        self.state_machine.insert_input(
+          statemachine.string2lines(output), path)
+
+    def run(self):
+        """Include the MAINTAINERS file as part of this reST file."""
+        if not self.state.document.settings.file_insertion_enabled:
+            raise self.warning('"%s" directive disabled.' % self.name)
+
+        # Walk up source path directories to find Documentation/../
+        path = self.state_machine.document.attributes['source']
+        path = os.path.realpath(path)
+        tail = path
+        while tail != "Documentation" and tail != "":
+            (path, tail) = os.path.split(path)
+
+        # Append "MAINTAINERS"
+        path = os.path.join(path, "MAINTAINERS")
+
+        try:
+            self.state.document.settings.record_dependencies.add(path)
+            lines = self.parse_maintainers(path)
+        except IOError as error:
+            raise self.severe('Problems with "%s" directive path:\n%s.' %
+                      (self.name, ErrorString(error)))
+
+        return []
diff --git a/MAINTAINERS b/MAINTAINERS
index 2b6f10ea1573..6c5d86aaa7dc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1,12 +1,14 @@
-
-
-	List of maintainers and how to submit kernel changes
+List of maintainers and how to submit kernel changes
+====================================================
 
 Please try to follow the guidelines below.  This will make things
 easier on the maintainers.  Not all of these guidelines matter for every
 trivial patch so apply some common sense.
 
-1.	Always _test_ your changes, however small, on at least 4 or
+Tips for patch submitters
+-------------------------
+
+1.	Always *test* your changes, however small, on at least 4 or
 	5 people, preferably many more.
 
 2.	Try to release a few ALPHA test versions to the net. Announce
@@ -25,7 +27,7 @@ trivial patch so apply some common sense.
 	testing and await feedback.
 
 5.	Make a patch available to the relevant maintainer in the list. Use
-	'diff -u' to make the patch easy to merge. Be prepared to get your
+	``diff -u`` to make the patch easy to merge. Be prepared to get your
 	changes sent back with seemingly silly requests about formatting
 	and variable names.  These aren't as silly as they seem. One
 	job the maintainers (and especially Linus) do is to keep things
@@ -38,7 +40,7 @@ trivial patch so apply some common sense.
 	See Documentation/process/coding-style.rst for guidance here.
 
 	PLEASE CC: the maintainers and mailing lists that are generated
-	by scripts/get_maintainer.pl.  The results returned by the
+	by ``scripts/get_maintainer.pl.`` The results returned by the
 	script will be best if you have git installed and are making
 	your changes in a branch derived from Linus' latest git tree.
 	See Documentation/process/submitting-patches.rst for details.
@@ -70,26 +72,27 @@ trivial patch so apply some common sense.
 	not represent an immediate threat and are better handled publicly,
 	and ideally, should come with a patch proposal. Please do not send
 	automated reports to this list either. Such bugs will be handled
-	better and faster in the usual public places.
+	better and faster in the usual public places. See
+	Documentation/admin-guide/security-bugs.rst for details.
 
 8.	Happy hacking.
 
-Descriptions of section entries:
+Descriptions of section entries
+-------------------------------
 
-	P: Person (obsolete)
-	M: Mail patches to: FullName <address@domain>
-	R: Designated reviewer: FullName <address@domain>
+	M: *Mail* patches to: FullName <address@domain>
+	R: Designated *Reviewer*: FullName <address@domain>
 	   These reviewers should be CCed on patches.
-	L: Mailing list that is relevant to this area
-	W: Web-page with status/info
-	B: URI for where to file bugs. A web-page with detailed bug
+	L: *Mailing list* that is relevant to this area
+	W: *Web-page* with status/info
+	B: URI for where to file *bugs*. A web-page with detailed bug
 	   filing info, a direct bug tracker link, or a mailto: URI.
-	C: URI for chat protocol, server and channel where developers
+	C: URI for *chat* protocol, server and channel where developers
 	   usually hang out, for example irc://server/channel.
-	Q: Patchwork web based patch tracking system site
-	T: SCM tree type and location.
+	Q: *Patchwork* web based patch tracking system site
+	T: *SCM* tree type and location.
 	   Type is one of: git, hg, quilt, stgit, topgit
-	S: Status, one of the following:
+	S: *Status*, one of the following:
 	   Supported:	Someone is actually paid to look after this.
 	   Maintained:	Someone actually looks after it.
 	   Odd Fixes:	It has a maintainer but they don't have time to do
@@ -99,13 +102,13 @@ Descriptions of section entries:
 	   Obsolete:	Old code. Something tagged obsolete generally means
 			it has been replaced by a better system and you
 			should be using that.
-	F: Files and directories with wildcard patterns.
+	F: *Files* and directories wildcard patterns.
 	   A trailing slash includes all files and subdirectory files.
 	   F:	drivers/net/	all files in and below drivers/net
 	   F:	drivers/net/*	all files in drivers/net, but not below
 	   F:	*/net/*		all files in "any top level directory"/net
 	   One pattern per line.  Multiple F: lines acceptable.
-	N: Files and directories with regex patterns.
+	N: Files and directories *Regex* patterns.
 	   N:	[^a-z]tegra	all files whose path contains the word tegra
 	   One pattern per line.  Multiple N: lines acceptable.
 	   scripts/get_maintainer.pl has different behavior for files that
@@ -113,14 +116,14 @@ Descriptions of section entries:
 	   get_maintainer will not look at git log history when an F: pattern
 	   match occurs.  When an N: match occurs, git log history is used
 	   to also notify the people that have git commit signatures.
-	X: Files and directories that are NOT maintained, same rules as F:
-	   Files exclusions are tested before file matches.
+	X: *Excluded* files and directories that are NOT maintained, same
+	   rules as F:. Files exclusions are tested before file matches.
 	   Can be useful for excluding a specific subdirectory, for instance:
 	   F:	net/
 	   X:	net/ipv6/
 	   matches all files in and below net excluding net/ipv6/
-	K: Keyword perl extended regex pattern to match content in a
-	   patch or file.  For instance:
+	K: *Content regex* (perl extended) pattern match in a patch or file.
+	   For instance:
 	   K: of_get_profile
 	      matches patches or files that contain "of_get_profile"
 	   K: \b(printk|pr_(info|err))\b
@@ -128,13 +131,12 @@ Descriptions of section entries:
 	      printk, pr_info or pr_err
 	   One regex pattern per line.  Multiple K: lines acceptable.
 
-Note: For the hard of thinking, this list is meant to remain in alphabetical
-order. If you could add yourselves to it in alphabetical order that would be
-so much easier [Ed]
-
-Maintainers List (try to look for most precise areas first)
+Maintainers List
+----------------
 
-		-----------------------------------
+.. note:: When reading this list, please look for the most precise areas
+          first. When adding to this list, please keep the entries in
+          alphabetical order.
 
 3C59X NETWORK DRIVER
 M:	Steffen Klassert <klassert@kernel.org>
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-09-24 23:02 [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST Kees Cook
  2019-09-24 23:02 ` [PATCH 1/2] doc-rst: Reduce CSS padding around Field Kees Cook
  2019-09-24 23:02 ` [PATCH 2/2] doc-rst: Programmatically render MAINTAINERS into ReST Kees Cook
@ 2019-10-01 14:31 ` Jonathan Corbet
  2019-10-01 15:09   ` Mauro Carvalho Chehab
  2019-10-01 16:27   ` Kees Cook
  2 siblings, 2 replies; 11+ messages in thread
From: Jonathan Corbet @ 2019-10-01 14:31 UTC (permalink / raw)
  To: Kees Cook; +Cc: Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

On Tue, 24 Sep 2019 16:02:06 -0700
Kees Cook <keescook@chromium.org> wrote:

> Commit log from Patch 2 repeated here for cover letter:
> 
> In order to have the MAINTAINERS file visible in the rendered ReST
> output, this makes some small changes to the existing MAINTAINERS file
> to allow for better machine processing, and adds a new Sphinx directive
> "maintainers-include" to perform the rendering.

I finally got around to trying this out.  After the usual warnings, the
build comes to a screeching halt with this:

  Sphinx parallel build error:
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)

For extra fun, the build process simply hangs, requiring a ^C to blow it
away.  You've managed to get new behavior out of Sphinx that I've not seen
before, congratulations :)

This almost certainly has to do with the fact that I'm (intentionally)
running the Python2 Sphinx build here.  Something's not doing unicode that
should be.

I would suggest that we might just want to repair this before merging this
feature.  Either that, or we bite the bullet and deprecate the use of
Python 2 entirely - something that's probably not too far into our future
regardless.  Anybody have thoughts on that matter?

On a separate note...it occurred to me, rather belatedly as usual, that
last time we discussed doing this that there was some opposition to adding
a second MAINTAINERS parser to the kernel; future changes to the format of
that file may force both to be adjusted, and somebody will invariably
forget one.  Addressing that, if we feel a need to do so, probably requires
tweaking get_maintainer.pl to output the information in a useful format.

Thanks,

jon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 14:31 ` [PATCH 0/2] docs: " Jonathan Corbet
@ 2019-10-01 15:09   ` Mauro Carvalho Chehab
  2019-10-01 16:16     ` Kees Cook
  2019-10-01 17:35     ` Jonathan Corbet
  2019-10-01 16:27   ` Kees Cook
  1 sibling, 2 replies; 11+ messages in thread
From: Mauro Carvalho Chehab @ 2019-10-01 15:09 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Kees Cook, Jani Nikula, linux-doc, linux-kernel

Em Tue, 1 Oct 2019 08:31:47 -0600
Jonathan Corbet <corbet@lwn.net> escreveu:

> On Tue, 24 Sep 2019 16:02:06 -0700
> Kees Cook <keescook@chromium.org> wrote:
> 
> > Commit log from Patch 2 repeated here for cover letter:
> > 
> > In order to have the MAINTAINERS file visible in the rendered ReST
> > output, this makes some small changes to the existing MAINTAINERS file
> > to allow for better machine processing, and adds a new Sphinx directive
> > "maintainers-include" to perform the rendering.  
> 
> I finally got around to trying this out.  After the usual warnings, the
> build comes to a screeching halt with this:
> 
>   Sphinx parallel build error:
>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
> 
> For extra fun, the build process simply hangs, requiring a ^C to blow it
> away.  You've managed to get new behavior out of Sphinx that I've not seen
> before, congratulations :)
> 
> This almost certainly has to do with the fact that I'm (intentionally)
> running the Python2 Sphinx build here.  Something's not doing unicode that
> should be.
> 
> I would suggest that we might just want to repair this before merging this
> feature.  Either that, or we bite the bullet and deprecate the use of
> Python 2 entirely - something that's probably not too far into our future
> regardless.  Anybody have thoughts on that matter?

I'm almost sure we got this already. If I'm not mistaken, the solution
is to add the encoding line after shebang. Looking at 
Documentation/sphinx/maintainers_include.py (patch 2/2), the script
starts with:

	#!/usr/bin/env python
	# SPDX-License-Identifier: GPL-2.0
	# -*- coding: utf-8; mode: python -*-
	# pylint: disable=R0903, C0330, R0914, R0912, E0401

But, as I pointed before, the SPDX header at the wrong place is causing the 
crash, as the encoding line should be the second line, not the third one,
e. g.:

	#!/usr/bin/env python
	# -*- coding: utf-8; mode: python -*-
	# SPDX-License-Identifier: GPL-2.0
	# pylint: disable=R0903, C0330, R0914, R0912, E0401

I also suspect that this would happen even with python3, depending on
how LC_ALL, LANG, ... are set on the distro you use.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 15:09   ` Mauro Carvalho Chehab
@ 2019-10-01 16:16     ` Kees Cook
  2019-10-01 17:05       ` Jonathan Corbet
  2019-10-01 17:35     ` Jonathan Corbet
  1 sibling, 1 reply; 11+ messages in thread
From: Kees Cook @ 2019-10-01 16:16 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Jonathan Corbet, Jani Nikula, linux-doc, linux-kernel

On Tue, Oct 01, 2019 at 12:09:30PM -0300, Mauro Carvalho Chehab wrote:
> Em Tue, 1 Oct 2019 08:31:47 -0600
> Jonathan Corbet <corbet@lwn.net> escreveu:
> 
> > On Tue, 24 Sep 2019 16:02:06 -0700
> > Kees Cook <keescook@chromium.org> wrote:
> > 
> > > Commit log from Patch 2 repeated here for cover letter:
> > > 
> > > In order to have the MAINTAINERS file visible in the rendered ReST
> > > output, this makes some small changes to the existing MAINTAINERS file
> > > to allow for better machine processing, and adds a new Sphinx directive
> > > "maintainers-include" to perform the rendering.  
> > 
> > I finally got around to trying this out.  After the usual warnings, the
> > build comes to a screeching halt with this:
> > 
> >   Sphinx parallel build error:
> >   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
> > 
> > For extra fun, the build process simply hangs, requiring a ^C to blow it
> > away.  You've managed to get new behavior out of Sphinx that I've not seen
> > before, congratulations :)
> > 
> > This almost certainly has to do with the fact that I'm (intentionally)
> > running the Python2 Sphinx build here.  Something's not doing unicode that
> > should be.

Given this would be for v5.5, and python2 is EOL in 2 months[1], I don't
think it's unreasonable to deprecate it. Especially considering there
are already explicit "python3" shebangs in existing code in the sphinx/
directory.

[1] https://pythonclock.org/

> > I would suggest that we might just want to repair this before merging this
> > feature.  Either that, or we bite the bullet and deprecate the use of
> > Python 2 entirely - something that's probably not too far into our future
> > regardless.  Anybody have thoughts on that matter?
> 
> I'm almost sure we got this already. If I'm not mistaken, the solution
> is to add the encoding line after shebang. Looking at 
> Documentation/sphinx/maintainers_include.py (patch 2/2), the script
> starts with:
> 
> 	#!/usr/bin/env python
> 	# SPDX-License-Identifier: GPL-2.0
> 	# -*- coding: utf-8; mode: python -*-
> 	# pylint: disable=R0903, C0330, R0914, R0912, E0401
> 
> But, as I pointed before, the SPDX header at the wrong place is causing the 
> crash, as the encoding line should be the second line, not the third one,
> e. g.:
> 
> 	#!/usr/bin/env python
> 	# -*- coding: utf-8; mode: python -*-
> 	# SPDX-License-Identifier: GPL-2.0
> 	# pylint: disable=R0903, C0330, R0914, R0912, E0401
> 
> I also suspect that this would happen even with python3, depending on
> how LC_ALL, LANG, ... are set on the distro you use.

Oh that's a delightful bug. :P I haven't been able to reproduce this
failure (maybe all my LANG junk is accidentally correct)?

Jon, if this fixes it for you, should I respin the patches?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 14:31 ` [PATCH 0/2] docs: " Jonathan Corbet
  2019-10-01 15:09   ` Mauro Carvalho Chehab
@ 2019-10-01 16:27   ` Kees Cook
  2019-10-01 17:36     ` Jonathan Corbet
  1 sibling, 1 reply; 11+ messages in thread
From: Kees Cook @ 2019-10-01 16:27 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

On Tue, Oct 01, 2019 at 08:31:47AM -0600, Jonathan Corbet wrote:
> On a separate note...it occurred to me, rather belatedly as usual, that
> last time we discussed doing this that there was some opposition to adding
> a second MAINTAINERS parser to the kernel; future changes to the format of
> that file may force both to be adjusted, and somebody will invariably
> forget one.  Addressing that, if we feel a need to do so, probably requires
> tweaking get_maintainer.pl to output the information in a useful format.

That's a reasonable point, but I would make two observations:

- get_maintainers.pl is written in Perl and I really don't want to write
  more Perl. ;)

- the parsing methods in get_maintainers is much more focused on the
  file/pattern matching and is blind to the structure of the rest
  of the document (it only examines '^[A-Z]:' and blank lines), and
  does so "on demand", in that it hunts through the entire MAINTAINERS
  file contents for each path match.

So I don't think it's suitable to merge functionality here...

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 16:16     ` Kees Cook
@ 2019-10-01 17:05       ` Jonathan Corbet
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Corbet @ 2019-10-01 17:05 UTC (permalink / raw)
  To: Kees Cook; +Cc: Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

On Tue, 1 Oct 2019 09:16:41 -0700
Kees Cook <keescook@chromium.org> wrote:

> > > This almost certainly has to do with the fact that I'm (intentionally)
> > > running the Python2 Sphinx build here.  Something's not doing unicode that
> > > should be.  
> 
> Given this would be for v5.5, and python2 is EOL in 2 months[1], I don't
> think it's unreasonable to deprecate it. Especially considering there
> are already explicit "python3" shebangs in existing code in the sphinx/
> directory.

Two months and 30 days, don't exaggerate, man! :)

FWIW, the "shebangs" in that directory are no-ops.  Those are extensions
read into sphinx, not standalone programs; they go into a Python 2 sphinx
instance just fine.

Which isn't an argument against dropping Python 2, of course.  That's
really just a matter of when.

> > I'm almost sure we got this already. If I'm not mistaken, the solution
> > is to add the encoding line after shebang. Looking at 
> > Documentation/sphinx/maintainers_include.py (patch 2/2), the script
> > starts with:
> > 
> > 	#!/usr/bin/env python
> > 	# SPDX-License-Identifier: GPL-2.0
> > 	# -*- coding: utf-8; mode: python -*-
> > 	# pylint: disable=R0903, C0330, R0914, R0912, E0401
> > 
> > But, as I pointed before, the SPDX header at the wrong place is causing the 
> > crash, as the encoding line should be the second line, not the third one,
> > e. g.:
> > 
> > 	#!/usr/bin/env python
> > 	# -*- coding: utf-8; mode: python -*-
> > 	# SPDX-License-Identifier: GPL-2.0
> > 	# pylint: disable=R0903, C0330, R0914, R0912, E0401
> > 
> > I also suspect that this would happen even with python3, depending on
> > how LC_ALL, LANG, ... are set on the distro you use.  
> 
> Oh that's a delightful bug. :P I haven't been able to reproduce this
> failure (maybe all my LANG junk is accidentally correct)?

Delightful - and wrong; moving that line around doesn't change anything.
The crash comes from deep within sphinx; the more I look at it the more
confused I get.  Stay tuned...

jon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 15:09   ` Mauro Carvalho Chehab
  2019-10-01 16:16     ` Kees Cook
@ 2019-10-01 17:35     ` Jonathan Corbet
  2019-10-01 18:26       ` Kees Cook
  1 sibling, 1 reply; 11+ messages in thread
From: Jonathan Corbet @ 2019-10-01 17:35 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Kees Cook, Jani Nikula, linux-doc, linux-kernel

On Tue, 1 Oct 2019 12:09:30 -0300
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> wrote:

> >   Sphinx parallel build error:
> >   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
> > 
> > For extra fun, the build process simply hangs, requiring a ^C to blow it
> > away.  You've managed to get new behavior out of Sphinx that I've not seen
> > before, congratulations :)
> > 
> > This almost certainly has to do with the fact that I'm (intentionally)
> > running the Python2 Sphinx build here.  Something's not doing unicode that
> > should be.
> > 
> > I would suggest that we might just want to repair this before merging this
> > feature.  Either that, or we bite the bullet and deprecate the use of
> > Python 2 entirely - something that's probably not too far into our future
> > regardless.  Anybody have thoughts on that matter?  
> 
> I'm almost sure we got this already. If I'm not mistaken, the solution
> is to add the encoding line after shebang.

As mentioned before, that's not it.  The problem is that we're feeding
UTF8 into Sphinx as an ordinary str, then sphinx is trying to decode it
as ASCII. The attached hack makes things work.

Kees, I can either just keep this fix (breaking bisectability of the docs
build :) or you can send a new version - up to you.

Thanks,

jon

From c32bfbb07a7840662ba3b367c61b7f2946028b27 Mon Sep 17 00:00:00 2001
From: Jonathan Corbet <corbet@lwn.net>
Date: Tue, 1 Oct 2019 11:26:20 -0600
Subject: [PATCH] docs: make maintainers_include work with Python 2

The MAINTAINERS file contains UTF-8 data, so Python 2 code has to be
explicit about moving it into the Unicode realm.  Explicitly decode all
data from the file as soon as we read it.

This hack can go away once we deprecate Python 2 support.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 Documentation/sphinx/maintainers_include.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/sphinx/maintainers_include.py b/Documentation/sphinx/maintainers_include.py
index de62dd3dabd5..705ba9fd5336 100755
--- a/Documentation/sphinx/maintainers_include.py
+++ b/Documentation/sphinx/maintainers_include.py
@@ -19,6 +19,7 @@ u"""
 
 import re
 import os.path
+import sys
 
 from docutils import statemachine
 from docutils.utils.error_reporting import ErrorString
@@ -60,6 +61,8 @@ class MaintainersInclude(Include):
         field_content = ""
 
         for line in open(path):
+            if sys.version_info.major == 2:
+                line = unicode(line, 'utf-8')
             # Have we reached the end of the preformatted Descriptions text?
             if descriptions and line.startswith('Maintainers'):
                 descriptions = False
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 16:27   ` Kees Cook
@ 2019-10-01 17:36     ` Jonathan Corbet
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Corbet @ 2019-10-01 17:36 UTC (permalink / raw)
  To: Kees Cook; +Cc: Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

On Tue, 1 Oct 2019 09:27:29 -0700
Kees Cook <keescook@chromium.org> wrote:

> On Tue, Oct 01, 2019 at 08:31:47AM -0600, Jonathan Corbet wrote:
> > On a separate note...it occurred to me, rather belatedly as usual, that
> > last time we discussed doing this that there was some opposition to adding
> > a second MAINTAINERS parser to the kernel; future changes to the format of
> > that file may force both to be adjusted, and somebody will invariably
> > forget one.  Addressing that, if we feel a need to do so, probably requires
> > tweaking get_maintainer.pl to output the information in a useful format.  
> 
> That's a reasonable point, but I would make two observations:
> 
> - get_maintainers.pl is written in Perl and I really don't want to write
>   more Perl. ;)

Trust me, I get it!

> - the parsing methods in get_maintainers is much more focused on the
>   file/pattern matching and is blind to the structure of the rest
>   of the document (it only examines '^[A-Z]:' and blank lines), and
>   does so "on demand", in that it hunts through the entire MAINTAINERS
>   file contents for each path match.
> 
> So I don't think it's suitable to merge functionality here...

Makes sense to me.  If anybody out there objects, speak now ...

jon

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST
  2019-10-01 17:35     ` Jonathan Corbet
@ 2019-10-01 18:26       ` Kees Cook
  0 siblings, 0 replies; 11+ messages in thread
From: Kees Cook @ 2019-10-01 18:26 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Jani Nikula, linux-doc, linux-kernel

On Tue, Oct 01, 2019 at 11:35:06AM -0600, Jonathan Corbet wrote:
>          for line in open(path):
> +            if sys.version_info.major == 2:
> +                line = unicode(line, 'utf-8')

Ah-ha! Thanks. I've sent v2 now. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-10-01 18:26 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-24 23:02 [PATCH 0/2] docs: Programmatically render MAINTAINERS into ReST Kees Cook
2019-09-24 23:02 ` [PATCH 1/2] doc-rst: Reduce CSS padding around Field Kees Cook
2019-09-24 23:02 ` [PATCH 2/2] doc-rst: Programmatically render MAINTAINERS into ReST Kees Cook
2019-10-01 14:31 ` [PATCH 0/2] docs: " Jonathan Corbet
2019-10-01 15:09   ` Mauro Carvalho Chehab
2019-10-01 16:16     ` Kees Cook
2019-10-01 17:05       ` Jonathan Corbet
2019-10-01 17:35     ` Jonathan Corbet
2019-10-01 18:26       ` Kees Cook
2019-10-01 16:27   ` Kees Cook
2019-10-01 17:36     ` Jonathan Corbet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).