On Thu, Sep 16, 2021 at 11:40 AM John Snow wrote: > > > On Thu, Sep 16, 2021 at 11:39 AM Daniel P. Berrangé > wrote: > >> On Thu, Sep 16, 2021 at 09:42:30AM -0400, John Snow wrote: >> > On Thu, Sep 16, 2021 at 8:59 AM Daniel P. Berrangé > > >> > wrote: >> > >> > > On Wed, Sep 15, 2021 at 11:40:31AM -0400, John Snow wrote: >> > > > A few new annoyances. Of note is the new warning for an unspecified >> > > > encoding when opening a text file, which actually does indicate a >> > > > potentially real problem; see >> > > > https://www.python.org/dev/peps/pep-0597/#motivation >> > > > >> > > > Use LC_CTYPE to determine an encoding to use for interpreting QEMU's >> > > > terminal output. Note that Python states: "language code and >> encoding >> > > > may be None if their values cannot be determined" -- use a platform >> > > > default as a backup. >> > > > >> > > > Signed-off-by: John Snow >> > > > --- >> > > > python/qemu/machine/machine.py | 9 ++++++++- >> > > > python/setup.cfg | 1 + >> > > > 2 files changed, 9 insertions(+), 1 deletion(-) >> > > > >> > > > diff --git a/python/qemu/machine/machine.py >> > > b/python/qemu/machine/machine.py >> > > > index a7081b1845..51b6e79a13 100644 >> > > > --- a/python/qemu/machine/machine.py >> > > > +++ b/python/qemu/machine/machine.py >> > > > @@ -19,6 +19,7 @@ >> > > > >> > > > import errno >> > > > from itertools import chain >> > > > +import locale >> > > > import logging >> > > > import os >> > > > import shutil >> > > > @@ -290,8 +291,14 @@ def get_pid(self) -> Optional[int]: >> > > > return self._subp.pid >> > > > >> > > > def _load_io_log(self) -> None: >> > > > + # Assume that the output encoding of QEMU's terminal output >> > > > + # is defined by our locale. If indeterminate, use a >> platform >> > > default. >> > > > + _, encoding = locale.getlocale() >> > > > + if encoding is None: >> > > > + encoding = >> locale.getpreferredencoding(do_setlocale=False) >> > > >> > > Do we really need this getpreferredencoding ? IIUC, this is a sign >> > > that the application is buggy by not calling >> > > >> > > locale.setlocale(locale.LC_ALL, '') >> > > >> > > during its main() method, which I think we can just delegate to the >> > > code in question to fix. Missing setlocale will affect everything >> > > they run, so doing workarounds in only 1 place is not worth it IMHO >> > > >> > > >> > I genuinely don't know! (And, I try to keep the Python code free from >> > assuming Linux as much as I can help it.) >> > >> > Python's getlocale documentation states: "language code and encoding >> may be >> > None if their values cannot be determined." >> > https://docs.python.org/3/library/locale.html#locale.getlocale >> > >> > But it is quiet as to the circumstances under which this may happen. >> > Browsing the cpython source code, (3.9ish): >> > >> > ``` >> > def getlocale(category=LC_CTYPE): >> > localename = _setlocale(category) >> > if category == LC_ALL and ';' in localename: >> > raise TypeError('category LC_ALL is not supported') >> > return _parse_localename(localename) >> > ``` >> > _setlocale is ultimately a call to (I think) _localemodule.c's >> > PyLocale_setlocale(PyObject *self, PyObject *args) C function. >> > It calls `result = setlocale(category, locale)` where the category is >> going >> > to be LC_CTYPE, so this should be equivalent to locale(3) (LC_CTYPE, >> NULL). >> > >> > locale(3) says that "The return value is NULL if the request cannot be >> > honored." >> > >> > Python parses that string according to _parse_localename, which in turn >> > calls normalize(localename). >> > Normalization looks quite involved, but has a fallback of returning the >> > string verbatim. If the normalized locale string is "C", we return the >> > tuple (None, None)! >> > >> > So I figured there was a non-zero chance that we'd see a value of `None` >> > here. >> > >> > Source code is in cpython/Lib/locale.py and >> cpython/Modules/_localemodule.c >> > if you want to nose around yourself. >> > >> > I also have no idea how this will all shake out on Windows, so I >> decided to >> > add the fallback here just in case. (Does the Python package work on >> > Windows? I don't know, but I avoid assuming it won't EVER run there... >> > Certainly, I have an interest in having the QMP packages I am building >> work >> > on all platforms.) >> > >> > Thoughts? >> >> Well this machine.py is using UNIX domain sockets and FD passing, >> so Windows is out of the question. >> >> I'd be inclined to just keep it simple unless someone reports a >> real bug with it. >> >> > What about the case where getlocale finds "C" and can't/won't return an > encoding? Is that not a real circumstance? > > --js > So, a few things: (1) machine.py can also use TCP, it doesn't have to use UNIX sockets. (2) The FD passing is optional and already checks for the ability to do so before it is invoked (3) jsnow@scv ~/s/q/python (python-package-pylint-210)> LC_ALL="C" python3 -c "import locale; print(locale.getlocale());" (None, None) (4) If you specify an encoding of "None", open() itself does exactly what I wrote: getpreferredencoding(do_setlocale=False) I'll drop the manual fallback -- but not because it wasn't actually necessary. It just so happens that it's handled for us. --js