linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab@kernel.org>
To: "Michal Suchánek" <msuchanek@suse.de>
Cc: linux-doc@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>
Subject: Re: Sphinx parallel build error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)
Date: Thu, 6 May 2021 16:24:42 +0200	[thread overview]
Message-ID: <20210506162442.7d118ac3@coco.lan> (raw)
In-Reply-To: <20210506133212.GF6564@kitsune.suse.cz>

Em Thu, 6 May 2021 15:32:12 +0200
Michal Suchánek <msuchanek@suse.de> escreveu:

> On Thu, May 06, 2021 at 01:20:06PM +0200, Mauro Carvalho Chehab wrote:
> > Em Thu, 6 May 2021 12:39:13 +0200
> > Michal Suchánek <msuchanek@suse.de> escreveu:
> >   
> > > When building HTML documentation I get this output:
> > > 
> > > [  120s] + make O=/home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/html PYTHON=python3 htmldocs
> > > [  120s] make[1]: Entering directory '/home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/html'
> > > [  120s] cat: /etc/os-release: No such file or directory
> > > [  121s]   SPHINX  htmldocs --> file:///home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/html/Documentation/output
> > > [  121s]   PARSE   include/uapi/linux/dvb/audio.h
> > > [  121s]   PARSE   include/uapi/linux/dvb/ca.h
> > > [  121s]   PARSE   include/uapi/linux/dvb/dmx.h
> > > [  121s]   PARSE   include/uapi/linux/dvb/frontend.h
> > > [  122s]   PARSE   include/uapi/linux/dvb/net.h
> > > [  122s]   PARSE   include/uapi/linux/dvb/video.h
> > > [  122s]   PARSE   include/uapi/linux/videodev2.h
> > > [  122s]   PARSE   include/uapi/linux/media.h
> > > [  122s]   PARSE   include/uapi/linux/cec.h
> > > [  122s]   PARSE   include/uapi/linux/lirc.h
> > > [  190s] ../include/linux/dcache.h:318: warning: expecting prototype for dget, dget_dlock(). Prototype was for dget_dlock() instead
> > > [  203s] ../include/linux/regulator/driver.h:388: warning: Function parameter or member 'ramp_reg' not described in 'regulator_desc'
> > > [  203s] ../include/linux/regulator/driver.h:388: warning: Function parameter or member 'ramp_mask' not described in 'regulator_desc'
> > > [  203s] ../include/linux/regulator/driver.h:388: warning: Function parameter or member 'ramp_delay_table' not described in 'regulator_desc'
> > > [  203s] ../include/linux/regulator/driver.h:388: warning: Function parameter or member 'n_ramp_values' not described in 'regulator_desc'
> > > [  203s] ../include/linux/spi/spi.h:671: warning: Function parameter or member 'devm_allocated' not described in 'spi_controller'
> > > [  203s] ../drivers/usb/dwc3/core.h:865: warning: Function parameter or member 'hwparams9' not described in 'dwc3_hwparams'
> > > [  233s] ../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2808: warning: Excess function parameter 'vm_context' description in 'amdgpu_vm_init'
> > > [  233s] ../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2808: warning: Excess function parameter 'vm_context' description in 'amdgpu_vm_init'
> > > [  233s] ../drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:426: warning: Function parameter or member 'disable_hpd_irq' not described in 'amdgpu_display_manager'
> > > [  233s] ../drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser'
> > > [  233s] ../drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'jump_whitelist' not described in 'intel_engine_cmd_parser'
> > > [  233s] ../drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'shadow_map' not described in 'intel_engine_cmd_parser'
> > > [  233s] ../drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'batch_map' not described in 'intel_engine_cmd_parser'
> > > [  233s] ../drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser'
> > > [  233s] ../drivers/gpu/host1x/bus.c:774: warning: Excess function parameter 'key' description in '__host1x_client_register'
> > > [  234s] /home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Documentation/ABI/testing/sysfs-platform-intel-pmc:2: WARNING: Definition list ends without a blank line; unexpected unindent.
> > > [  234s] /home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Documentation/driver-api/serial/index.rst:17: WARNING: toctree contains reference to nonexisting document 'driver-api/serial/rocket'
> > > [  234s] /home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Documentation/gpu/amdgpu:96: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:323: WARNING: Unexpected indentation.
> > > [  234s] /home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Documentation/gpu/amdgpu:96: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:324: WARNING: Block quote ends without a blank line; unexpected unindent.
> > > [  234s] /home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Documentation/gpu/amdgpu:96: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:327: WARNING: Definition list ends without a blank line; unexpected unindent.
> > > [  307s] /home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Documentation/driver-api/usb/writing_usb_driver.rst:129: WARNING: undefined label: usb_header (if the link has no caption the label must precede a section header)
> > > [  412s] 
> > > [  412s] Sphinx parallel build error:
> > > [  412s] UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)
> > > [  431s] make[2]: *** [../Documentation/Makefile:91: htmldocs] Error 2
> > > [  431s] make[1]: *** [/home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/Makefile:1784: htmldocs] Error 2
> > > [  431s] make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/kernel-docs-5.12.0.next.20210506/linux-5.12-next-20210506/html'
> > > [  431s] make: *** [Makefile:222: __sub-make] Error 2
> > > [  431s] error: Bad exit status from /var/tmp/rpm-tmp.npkyVx (%build)
> > > 
> > > It does not say which input file contains the offending character so I can't tell which file is broken.
> > > 
> > > Any idea how to debug?  
> > 
> > Yes. You probably has some weird file under Documentation/ABI.
> > Some text editors like kate tend to keep temporary files sometimes.
> > 
> > The scripts/get_ABI.pl (currently) doesn't have any logic
> > to recognize valid ABI files from trash stuff added at
> > the ABI dirs.
> > 
> > Just doing a git status (or a git clean) and removing such
> > files should fix the build.  
> 
> This is clean git-archived tarball uploaded to a build service so the
> likehood of some garbage files popping out in Documentation/ABI out of
> nowhere is quite small.

Well, it could be something completely different ;-) 

This crash happens when Sphinx/python finds a character it doesn't 
recognize as valid, like if you run something like:

	$ echo -en "What:\tBROKEN description\ndescription:\n";dd if=/dev/random count=10 ) > Documentation/ABI/testing/foobar
	$ ./scripts/get_abi.pl  rest 2>/dev/null|grep BROK
	Binary file (standard input) matches

On such case, non UTF-8 chars end being inserted, causing python and/or Sphinx
exceptions, causing it to crash:

	WARNING: The kernel documentation build process
	        support for Sphinx v3.0 and above is brand new. Be prepared for
	        possible issues in the generated output.
	enabling CJK for LaTeX builder

	Sphinx parallel build error:
	UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa8 in position 1985806: invalid start byte
	make[1]: *** [Documentation/Makefile:91: htmldocs] Error 2
	make: *** [Makefile:1790: htmldocs] Error 2

Yet, this sounds a weird to me:

	UnicodeEncodeError: 'latin-1' 

It sounds that it is somehow trying to use latin-1 alphabet, instead of utf-8.

This will certainly cause troubles, as there are non-latin-1 characters
at the docs (specially at Japanese and Chinese translations, but there
are also a few utf-8 graphic symbols somewhere else).

I remember there were a change in the past that made utf-8 to be default
for Sphinx, but can't remember the details.

Thanks,
Mauro

  reply	other threads:[~2021-05-06 14:24 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-06 10:39 Sphinx parallel build error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256) Michal Suchánek
2021-05-06 11:20 ` Mauro Carvalho Chehab
2021-05-06 13:32   ` Michal Suchánek
2021-05-06 14:24     ` Mauro Carvalho Chehab [this message]
2021-05-06 14:35       ` Michal Suchánek
2021-05-06 15:57 ` Markus Heiser
2021-05-06 16:46   ` Mauro Carvalho Chehab
2021-05-06 17:04     ` Markus Heiser
2021-05-06 17:27       ` Mauro Carvalho Chehab
2021-05-06 17:53         ` Markus Heiser
2021-05-06 18:06           ` Michal Suchánek
2021-05-07  8:52             ` Mauro Carvalho Chehab
2021-05-06 17:57         ` Randy Dunlap
2021-05-06 18:08           ` Matthew Wilcox
2021-05-06 21:21             ` Randy Dunlap
2021-05-07  6:39               ` Mauro Carvalho Chehab
2021-05-07  6:49                 ` Randy Dunlap
2021-05-07  8:04                 ` Mauro Carvalho Chehab
2021-05-07  8:35                   ` Michal Suchánek
2021-05-07  8:56                     ` Markus Heiser
2021-05-07  9:14                       ` Mauro Carvalho Chehab
2021-05-07  9:51                         ` Markus Heiser
2021-05-07 10:29                           ` Michal Suchánek
2021-05-07  9:02                     ` Mauro Carvalho Chehab
2021-05-08  9:22                 ` Mauro Carvalho Chehab
2021-05-08 10:41                   ` Michal Suchánek
2021-05-08 14:41                     ` Mauro Carvalho Chehab
2021-05-08 15:55                       ` Randy Dunlap
2021-05-08 17:09                         ` Michal Suchánek
2021-05-08 17:46                           ` Randy Dunlap
2021-05-10  6:22                             ` Mauro Carvalho Chehab
2021-05-10  8:17                         ` Mauro Carvalho Chehab
2021-05-06 17:48       ` Michal Suchánek
2021-05-06 17:59         ` Markus Heiser
2021-05-06 18:16           ` Michal Suchánek
2021-05-12  6:22         ` Mauro Carvalho Chehab
2021-05-12  7:01           ` Michal Suchánek
2021-05-12  7:18             ` Markus Heiser
2021-05-12  7:37               ` Markus Heiser
2021-05-12  7:59             ` Mauro Carvalho Chehab
2021-05-17 13:10               ` Michal Suchánek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210506162442.7d118ac3@coco.lan \
    --to=mchehab@kernel.org \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=msuchanek@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).