From: Markus Heiser <markus.heiser@darmarit.de>
To: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
"Linux Media Mailing List" <linux-media@vger.kernel.org>,
"Mauro Carvalho Chehab" <mchehab@infradead.org>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Joe Perches" <joe@perches.com>,
linux-kernel@vger.kernel.org,
"Arnaldo Carvalho de Melo" <acme@kernel.org>,
"Sven Eckelmann" <sven@narfation.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Doug Smythies" <doug.smythies@gmail.com>,
"Aurélien Cedeyn" <aurelien.cedeyn@gmail.com>,
"Vincenzo Frascino" <vincenzo.frascino@arm.com>,
linux-doc@vger.kernel.org,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Thierry Reding" <treding@nvidia.com>,
"Armijn Hemel" <armijn@tjaldur.nl>,
"Jiri Olsa" <jolsa@redhat.com>,
"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
"Namhyung Kim" <namhyung@kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Federico Vaga" <federico.vaga@vaga.pv.it>,
"Allison Randal" <allison@lohutok.net>,
"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
"Shuah Khan" <skhan@linuxfoundation.org>
Subject: Re: [PATCH 0/6] Address issues with SPDX requirements and PEP-263
Date: Sat, 7 Sep 2019 20:37:13 +0200
Message-ID: <686101df-f40c-916e-2730-353a3852cc84@darmarit.de>
In-Reply-To: <20190907150442.583b44c2@coco.lan>

On 07.09.19 at 20:04, Mauro Carvalho Chehab wrote:
> On Sat, 7 Sep 2019 19:33:06 +0200
> Markus Heiser <markus.heiser@darmarit.de> wrote:
>> An (uncaught) exception is thrown when writing UTF-8 to a stream which
>> does not support UTF-8 .. this is not a crash, it mostly indicates that the
>> developer made a wrong assumption about the use case.
>
> A not-handled exception is a crash in Python. I've seen python scripts
> crash countless times with non-English names.
This has nothing to do with the language; ask the developers of those scripts.
>> There exists
>> also the possibility to encode the UTF-8 to ASCII and replace unknown
>> code points in the out-stream, or to catch the exception.
>
> Yeah, but getting this right is very painful. I use patchwork since 2013.
> It took *years* for it to not crash with non-ASCII chars[1]. That's, btw,
> the primary reason why I don't usually use python: with other languages,
> an alien char doesn't cause a crash.
Python cares about encoded (text) string types, while other languages and
applications just pipe bytes to streams .. if you care about the
encoding, you need exceptions when someone wants to write UTF-8 text to an
ASCII output stream.
Anyway, this is a bit of nitpicking / not helping here ..
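
To make the point concrete, here is a minimal sketch (Python 3; the helper name write_text and the sample string are made up for illustration) of the three behaviours discussed above: a strict ASCII stream raises, the same stream with a "replace" error handler substitutes unknown code points, and a UTF-8 stream accepts the text as-is.

```python
import io

# Sample text with one non-ASCII character (written as an escape so this
# snippet itself stays pure ASCII).
text = "Aur\u00e9lien Cedeyn"


def write_text(stream_encoding, errors):
    """Write `text` through a text stream and return the bytes that came out."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=stream_encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Strict ASCII stream: Python refuses to guess and raises an exception.
try:
    write_text("ascii", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Same target encoding, but unknown code points are replaced instead.
replaced = write_text("ascii", "replace")

# A UTF-8 stream takes the text unchanged.
utf8 = write_text("utf-8", "strict")
```

Which of the three is right depends on the application, which is exactly why the developer has to decide it and the language cannot.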
>
> [1] I might be wrong, but the last patch I saw addressing an issue
> there was applied this year.
I already posted an example [1]
<snip>
This means your application has to know the encoding of a stream/file.
E.g. we handle the output of the external Perl script
scripts/kernel-doc by decoding the byte stream from the process call's
stdout as UTF-8:

    out, err = codecs.decode(out, 'utf-8'), codecs.decode(err, 'utf-8')

see patch
https://github.com/torvalds/linux/commit/86c0f046a8b0c23fca65f77333c233a06c25ef9a
Again, this is talking about application development and has
nothing to do with the encoding of the source files.
<snap>
[1] https://www.mail-archive.com/linux-doc@vger.kernel.org/msg33240.html
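
A self-contained sketch of the decoding step quoted above, assuming Python 3. It is not the code from the patch: a child Python process stands in for the external script, and the variable name child_code is invented for this example. The point is only that the pipe delivers bytes and the caller has to say which encoding they are in.

```python
import codecs
import subprocess
import sys

# Stand-in for an external tool: a child process that writes raw UTF-8
# bytes to its stdout.
child_code = r"import sys; sys.stdout.buffer.write('K\u00f6nig\n'.encode('utf-8'))"

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()   # both are bytes at this point

# The application knows the tool emits UTF-8, so it decodes explicitly.
out, err = codecs.decode(out, 'utf-8'), codecs.decode(err, 'utf-8')
```

After the last line, out and err are text strings, and any later write to a terminal or file is a separate encoding decision.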
>>
>> But this was only academic; where do we have such problems in practice?
>>
>>> At least on media, we define that some Kernel strings can be UTF-8.
>>> See, for example the model field at the media_entity struct:
>>>
>>> https://linuxtv.org/downloads/v4l-dvb-apis/kapi/mc-core.html
>>>
>>> As stated there:
>>>
>>> "media_entity.model must be filled with the device model name as
>>> a NUL-terminated UTF-8 string. The device/model revision must
>>> not be stored in this field."
>>>
>>> I've no idea if the two perf scripts that contain the encoding data are
>>> meant to print some strings that may be UTF-8 encoding (like those that
>>> we have at the media subsystem), or if it is just that whoever added
>>> it was using Emacs and wanted to make his life simpler. As it is better
>>> to be safe than sorry, on patches 2 and 3, I'm assuming the first case.
>>
>> Hm, I'm unsure if I understand you correctly: using UTF-8 in the .rst
>> files is fine .. where do we have scripts generating UTF-8 output
>> (except the HTML output)?
>
> In theory, perf scripts may be reading strings from the Kernel, which
> might be using UTF-8 encoding.
>
>>
>>>
>>> In any case, we do need the encoding line at Sphinx extensions,
>>> although there, the shebang line is optional.
>>>
>>> In other words, we have those alternatives:
>>>
>>> 1) Neither shebang nor coding -> SPDX will be at first line;
>>> 2) shebang + SPDX -> SPDX will be at the second line;
>>> 3) shebang + coding + SPDX -> SPDX will be at the third line;
>>> 4) coding + SPDX
>>>
>>> This is something that only makes sense for Sphinx extensions.
>>>
>>> IMHO, I would place SPDX at the second line too, but I *guess* Python
>>> may accept it at the first line and would still properly evaluate
>>> coding (as this technically satisfies the text at PEP-263).
>>
>> Why are you so restrictive ..
>
> No idea. I would actually prefer to just remove the restriction, and let
> the SPDX header to be anywhere inside the first comment block inside a
> file [2].
>
> That's basically how this thread started: other developers think
> that it is a good idea to be pedantic. So, be it, but let's then fix
> the documentation, as the way it is, it is implicitly forbidding the
> addition of encoding lines for Python scripts.
>
> [2] I *suspect* that the restriction was added in order to make
> ./scripts/spdxcheck.py to run faster and to avoid false positives.
> Right now, if the maximum limit is removed (or set to a very high
> value), there will be one false positive:
>
> Documentation/dev-tools/kselftest.rst
>
> This doc has a SPDX-like tag at line 230, asking people to add SPDX
> headers on files, but the file itself doesn't have its own SPDX tag.
>
>> what we normally do:
>>
>> - write a shebang line if this file is called directly from the
>> command line .. but we do not need shebangs on py modules which
>> are imported from other modules or scripts
>>
>> - write an encoding line if it is needed or helpful / mostly it is helpful
>> to know the encoding of a text/code file.
>>
>> - add a SPDX tag
>
> Yes, but this violates the current documentation, as it doesn't allow the
> SPDX tag after line #2.
That's what I mean: the documentation was written with only a small set of
use cases in mind .. there is no real need for the SPDX tag to be on line one
or two ... let's fix the documentation as I described before.
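
As an illustration of the proposed order (shebang, then encoding line, then SPDX tag), here is a small sketch that reports on which line each of the three appears. This is not what scripts/spdxcheck.py does; the function name describe_header, the max_lines limit of 5 and the GPL-2.0 identifier are assumptions for the example. The encoding regular expression is the one given in PEP 263, which only honours a declaration on line 1 or 2.

```python
import re

# Regular expression from PEP 263 for an encoding declaration.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(\S.*)")

# Example file using all three header lines in the proposed order.
EXAMPLE = """\
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# SPDX-License-Identifier: GPL-2.0

print("hello")
"""


def describe_header(source, max_lines=5):
    """Return 1-based line numbers of shebang, coding and SPDX lines (None if absent)."""
    found = {"shebang": None, "coding": None, "spdx": None}
    for number, line in enumerate(source.splitlines()[:max_lines], start=1):
        if number == 1 and line.startswith("#!"):
            found["shebang"] = number
        elif number <= 2 and found["coding"] is None and CODING_RE.match(line):
            # PEP 263: the declaration must be on line 1 or 2.
            found["coding"] = number
        elif found["spdx"] is None and SPDX_RE.search(line):
            found["spdx"] = number
    return found


header = describe_header(EXAMPLE)
```

For the example file this gives shebang on line 1, coding on line 2 and SPDX on line 3; a plain module with only an SPDX comment reports it on line 1, so all four alternatives listed in the quoted text fit the same rule.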
Side note: if I can help you with perf or your build systems, don't hesitate
to contact me directly.
-- Markus --
>> In the end we will have files with one, two or all three of these lines.
>> And the order of these lines is what I wrote:
>>
>>>>
>>>> That's what I mean [1] .. let's patch the description in the license-rules.rst::
>>>>
>>>> - first line for the OS (shebang)
>>>> - second line for environment (python-encoding, editor-mode, ...)
>>>> - third and more lines for application (SPDX use) ..
>>>>
>>>> [1] https://www.mail-archive.com/linux-doc@vger.kernel.org/msg33240.html
>>>>
>>>> -- Markus --
>>>>
>>>>> This suggests to me that we're adding a bunch of complications that we
>>>>> don't necessarily need. What am I missing here?
>>>>>
>>>>> Educate me properly and I'll not try to stand in the way of all this...
>>>>>
>>
>>
>> It seems like it is not only me who is missing something .. what are
>> the use cases where we have py-Exceptions, what are the use cases to be so
>> restrictive as you described above?
>>
>> .. or did alice get lost in the cave?
>>
>> Thanks for your patience with me
>>
>> -- Markus --
>
>
>
> Thanks,
> Mauro
>