On Thu, Sep 16, 2021 at 8:59 AM Daniel P. Berrangé wrote: > On Wed, Sep 15, 2021 at 11:40:31AM -0400, John Snow wrote: > > A few new annoyances. Of note is the new warning for an unspecified > > encoding when opening a text file, which actually does indicate a > > potentially real problem; see > > https://www.python.org/dev/peps/pep-0597/#motivation > > > > Use LC_CTYPE to determine an encoding to use for interpreting QEMU's > > terminal output. Note that Python states: "language code and encoding > > may be None if their values cannot be determined" -- use a platform > > default as a backup. > > > > Signed-off-by: John Snow > > --- > > python/qemu/machine/machine.py | 9 ++++++++- > > python/setup.cfg | 1 + > > 2 files changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/python/qemu/machine/machine.py > b/python/qemu/machine/machine.py > > index a7081b1845..51b6e79a13 100644 > > --- a/python/qemu/machine/machine.py > > +++ b/python/qemu/machine/machine.py > > @@ -19,6 +19,7 @@ > > > > import errno > > from itertools import chain > > +import locale > > import logging > > import os > > import shutil > > @@ -290,8 +291,14 @@ def get_pid(self) -> Optional[int]: > > return self._subp.pid > > > > def _load_io_log(self) -> None: > > + # Assume that the output encoding of QEMU's terminal output > > + # is defined by our locale. If indeterminate, use a platform > default. > > + _, encoding = locale.getlocale() > > + if encoding is None: > > + encoding = locale.getpreferredencoding(do_setlocale=False) > > Do we really need this getpreferredencoding ? IIUC, this is a sign > that the application is buggy by not calling > > locale.setlocale(locale.LC_ALL, '') > > during its main() method, which I think we can just delegate to the > code in question to fix. Missing setlocale will affect everything > they run, so doing workarounds in only 1 place is not worth it IMHO > > I genuinely don't know! (And, I try to keep the Python code free from assuming Linux as much as I can help it.) Python's getlocale documentation states: "language code and encoding may be None if their values cannot be determined." https://docs.python.org/3/library/locale.html#locale.getlocale But it is quiet as to the circumstances under which this may happen. Browsing the cpython source code, (3.9ish): ``` def getlocale(category=LC_CTYPE): localename = _setlocale(category) if category == LC_ALL and ';' in localename: raise TypeError('category LC_ALL is not supported') return _parse_localename(localename) ``` _setlocale is ultimately a call to (I think) _localemodule.c's PyLocale_setlocale(PyObject *self, PyObject *args) C function. It calls `result = setlocale(category, locale)` where the category is going to be LC_CTYPE, so this should be equivalent to locale(3) (LC_CTYPE, NULL). locale(3) says that "The return value is NULL if the request cannot be honored." Python parses that string according to _parse_localename, which in turn calls normalize(localename). Normalization looks quite involved, but has a fallback of returning the string verbatim. If the normalized locale string is "C", we return the tuple (None, None)! So I figured there was a non-zero chance that we'd see a value of `None` here. Source code is in cpython/Lib/locale.py and cpython/Modules/_localemodule.c if you want to nose around yourself. I also have no idea how this will all shake out on Windows, so I decided to add the fallback here just in case. (Does the Python package work on Windows? I don't know, but I avoid assuming it won't EVER run there... Certainly, I have an interest in having the QMP packages I am building work on all platforms.) Thoughts? > if self._qemu_log_path is not None: > > - with open(self._qemu_log_path, "r") as iolog: > > + with open(self._qemu_log_path, "r", > > + encoding=encoding) as iolog: > > self._iolog = iolog.read() > Regards, > Daniel > -- > |: https://berrange.com -o- > https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- > https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- > https://www.instagram.com/dberrange :| > >