toaster.lists.yoctoproject.org archive mirror
* Database errors due to UTF-8 filenames
@ 2020-11-16 12:56 holger.sebert.ext
  2020-11-16 13:17 ` Reyna, David
  0 siblings, 1 reply; 3+ messages in thread
From: holger.sebert.ext @ 2020-11-16 12:56 UTC
  To: toaster

Hi,

I've set up Toaster and a MySQL Docker container, all running on Ubuntu 16.04.
I am encountering the following database error when building my Yocto project:

	ERROR: (1366, "Incorrect string value: '\\xC5\\x91tan\\xC3...' for column 'path' at row 1")
	Traceback (most recent call last):
	  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
	    return self.cursor.execute(sql, params)
	  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/mysql/base.py", line 71, in execute
	    return self.cursor.execute(query, args)
	  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/cursors.py", line 206, in execute
	    res = self._query(query)
	  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/cursors.py", line 319, in _query
	    db.query(q)
	  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/connections.py", line 260, in query
	    _mysql.connection.query(self, query)
	MySQLdb._exceptions.OperationalError: (1366, "Incorrect string value: '\\xC5\\x91tan\\xC3...' for column 'path' at row 1")

The query that raised this error looks as follows:

	INSERT INTO `orm_target_file`
		(`target_id`, `path`, `size`, `inodetype`, `permission`,
		`owner`, `group`, `directory_id`, `sym_target_id`)
	VALUES (19,
		'/usr/share/ca-certificates/mozilla/NetLock_Arany_=Class_Gold=_F\xc5\x91tan\xc3\xbas\xc3\xadtv\xc3\xa1ny.crt',
		1476, 1, 'rw-r--r--', 'root', 'root', NULL, NULL)

The file causing this error has the following UTF-8 encoded filename:

	NetLock_Arany_=Class_Gold=_Főtanúsítvány.crt

When looking into the database I found out that the column `path` of table
`orm_target_file` has the following properties:

	CHARACTER_SET_NAME: latin1
	COLLATION_NAME: latin1_swedish_ci

Apparently, the column `path` cannot store UTF-8 strings. I can fix that
manually by running the following statement in the `mysql` client:

	ALTER TABLE orm_target_file
	CONVERT TO CHARACTER SET utf8
	COLLATE utf8_general_ci;

This change makes the database error disappear.
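
The mismatch can also be reproduced outside the database. The following is a
minimal Python sketch, assuming that MySQL's `latin1` behaves like Python's
`cp1252` codec for this purpose (an approximation only); `fits_charset` is a
helper name made up for this illustration.

```python
# The filename from the failing INSERT, written with escapes so the
# example does not depend on the encoding of this source file.
filename = "NetLock_Arany_=Class_Gold=_F\u0151tan\u00fas\u00edtv\u00e1ny.crt"

# UTF-8 encodes U+0151 as the bytes C5 91, which is exactly the sequence
# MySQL quotes in "Incorrect string value: '\xC5\x91tan\xC3...'".
utf8_bytes = filename.encode("utf-8")


def fits_charset(text, codec):
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Assumption: cp1252 stands in for MySQL's latin1 character set here.
print(fits_charset(filename, "cp1252"))  # False: U+0151 has no latin1 code point
print(fits_charset(filename, "utf-8"))   # True
```

The accented letters other than U+0151 do exist in latin1, which is why the
quoted error value starts at that character.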

I would like to fix that directly in Toaster's `orm/models.py`. I found the
following definition in class `Target_File`:

    path = models.FilePathField()

It seems like I need to pass some clever options to `FilePathField`, but which?
My own research in that direction has brought up nothing useful so far.

My questions are thus:

* How can I parametrize `FilePathField` to properly handle UTF-8 encoded
  filenames in the underlying database?

* What should a corresponding migration file in `orm/migrations` look like?
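
One possible shape for the migration logic, offered as a sketch and not a
tested patch: `FilePathField` does not appear to take a character-set
argument, so the sketch issues the same kind of `ALTER TABLE` from a function
with the `(apps, schema_editor)` signature that Django's
`migrations.RunPython` expects. The table list, the function names, and the
choice of `utf8mb4` with `utf8mb4_unicode_ci` are assumptions.

```python
# Assumption: only orm_target_file is known to be affected (from the error
# above); other tables would have to be added after checking them.
TABLES = ["orm_target_file"]

CHARSET = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci"


def alter_statements(tables=TABLES, charset=CHARSET, collation=COLLATION):
    """Build one ALTER TABLE ... CONVERT TO statement per table."""
    return [
        "ALTER TABLE `{}` CONVERT TO CHARACTER SET {} COLLATE {}".format(
            table, charset, collation
        )
        for table in tables
    ]


def convert_to_utf8mb4(apps, schema_editor):
    """Forward step of a data migration (hypothetical name).

    Does nothing on non-MySQL backends such as SQLite, where the
    CONVERT TO CHARACTER SET syntax is not available.
    """
    if schema_editor.connection.vendor != "mysql":
        return
    for sql in alter_statements():
        schema_editor.execute(sql)
```

In a file under `orm/migrations`, this function could be wired up as
`migrations.RunPython(convert_to_utf8mb4, migrations.RunPython.noop)` inside
the `operations` list of a `Migration` class whose `dependencies` name the
latest existing `orm` migration.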

Thanks!

Best,
Holger

* Re: Database errors due to UTF-8 filenames
  2020-11-16 12:56 Database errors due to UTF-8 filenames holger.sebert.ext
@ 2020-11-16 13:17 ` Reyna, David
  2020-11-26 17:32   ` Sebert, Holger.ext
  0 siblings, 1 reply; 3+ messages in thread
From: Reyna, David @ 2020-11-16 13:17 UTC
  To: Sebert, Holger.ext, toaster

Hi Holger,

This is an interesting problem. I will investigate.

We should see if there are any other localization fields that might have to support UTF-8 strings. Certainly all local path names will need to be supported.

I am also curious about how the local time zone support is working for you.

David

* Re: Database errors due to UTF-8 filenames
  2020-11-16 13:17 ` Reyna, David
@ 2020-11-26 17:32   ` Sebert, Holger.ext
  0 siblings, 0 replies; 3+ messages in thread
From: Sebert, Holger.ext @ 2020-11-26 17:32 UTC
  To: Reyna, David, toaster

Hi David,

As far as I can tell, Toaster doesn't set the character set and collation
itself but uses the server's defaults.

The problem can be solved by passing adequate parameters when starting up
the MySQL server, like so:

    docker run -dit --network host --name running-toaster-db toaster-db --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci

If this is the right solution, maybe we can put this somewhere in the documentation?
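
A complementary setting exists on the Django side: the MySQL backend accepts a
`charset` entry under `OPTIONS`, which sets the connection character set. The
fragment below is a sketch of what that could look like in a Toaster
`settings.py`; the database name, user, password, host, and port are
placeholders, not values from this setup. It only affects the connection and
does not change tables that were already created as latin1.

```python
# Sketch of a Django DATABASES entry for a MySQL backend.
# All credentials and names below are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "toaster_data",      # placeholder database name
        "USER": "toaster",           # placeholder user
        "PASSWORD": "change-me",     # placeholder password
        "HOST": "127.0.0.1",         # placeholder host
        "PORT": "3306",
        # Ask the MySQL client library to talk to the server in utf8mb4.
        "OPTIONS": {"charset": "utf8mb4"},
    }
}
```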

Best,
Holger
