linux-block.vger.kernel.org archive mirror
* uring regression - lost write request
@ 2021-10-22  3:12 Daniel Black
  2021-10-22  9:10 ` Pavel Begunkov
  0 siblings, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-10-22  3:12 UTC (permalink / raw)
  To: linux-block

Somewhere after 5.11, and fixed by 5.15-rcX (rc6 extensively tested
over the last few days), there is a kernel regression we are tracking in
https://jira.mariadb.org/browse/MDEV-26674 and
https://jira.mariadb.org/browse/MDEV-26555
5.10 and earlier, across many distros and hardware, appear not to have the problem.

I'd appreciate some help identifying a suitable patch for the 5.14 linux
stable tree, as I observe the fault in mainline 5.14.14 (built from
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/). This is of
interest to Debian (sid)
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996951 , Ubuntu
(Impish) and Fedora fc33-35 (TODO bug report).

Marko, in https://jira.mariadb.org/browse/MDEV-26555?focusedCommentId=198601&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-198601,
traced this down to an io_uring_wait_cqe() call never returning after a
request was pushed.
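
To illustrate the shape of the hang (this is only a hand-written sketch,
not MariaDB/InnoDB source; the file name, buffer size and minimal error
handling are my own simplifications), the pattern is a plain liburing
submit-then-wait. If the completion for the queued write is never posted,
the final io_uring_wait_cqe() below blocks forever:

/* sketch.c: illustrative only. Build with: gcc sketch.c -o sketch -luring */
#include <fcntl.h>
#include <liburing.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    struct io_uring ring;
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;
    char buf[4096];
    int fd, ret;

    memset(buf, 'x', sizeof(buf));
    fd = open("testfile", O_WRONLY | O_CREAT, 0644);
    if (fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
        return 1;

    sqe = io_uring_get_sqe(&ring);
    io_uring_prep_write(sqe, fd, buf, sizeof(buf), 0); /* queue one write */
    io_uring_sqe_set_data(sqe, buf);                   /* tag the request */
    io_uring_submit(&ring);

    /* The completion thread does the equivalent of this wait; if the
     * request's CQE is never posted, it hangs here and InnoDB's
     * fatal-semaphore watchdog eventually fires. */
    ret = io_uring_wait_cqe(&ring, &cqe);
    if (ret == 0) {
        printf("write completed: res=%d\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);
    }

    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}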

The observed behavior below uses the mariadb-test package for 10.6:

dan@impish:~$ uname -a
Linux impish 5.14.14-051414-generic #202110201037 SMP Wed Oct 20
11:04:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

dan@impish:~$ cd /usr/share/mysql/mysql-test/

dan@impish:/usr/share/mysql/mysql-test$ ./mtr --vardir=/tmp/var
--parallel=4 stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb    stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb  stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb
Logging: ./mtr  --vardir=/tmp/var --parallel=4 stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
vardir: /tmp/var
Removing old var directory...
Creating var directory '/tmp/var'...
Checking supported features...
MariaDB Version 10.6.5-MariaDB-1:10.6.5+maria~impish
 - SSL connections supported
 - binaries built with wsrep patch
Collecting tests...
Installing system database...

==============================================================================

TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[4] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
worker[3] Using MTR_BUILD_THREAD 302, with reserved ports 16040..16059
worker[2] Using MTR_BUILD_THREAD 303, with reserved ports 16060..16079
stress.ddl_innodb 'innodb'               w3 [ pass ]  185605
stress.ddl_innodb 'innodb'               w4 [ pass ]  186292
stress.ddl_innodb 'innodb'               w2 [ pass ]  193053
stress.ddl_innodb 'innodb'               w1 [ pass ]  202529
stress.ddl_innodb 'innodb'               w4 [ pass ]  213972
stress.ddl_innodb 'innodb'               w3 [ pass ]  214661
stress.ddl_innodb 'innodb'               w1 [ pass ]  213266
stress.ddl_innodb 'innodb'               w4 [ pass ]  181716
stress.ddl_innodb 'innodb'               w3 [ pass ]  194047
stress.ddl_innodb 'innodb'               w1 [ pass ]  208319
stress.ddl_innodb 'innodb'               w2 [ fail ]
        Test ended at 2021-10-22 01:24:22

----------SERVER LOG START-----------
2021-10-22  1:24:20 0 [ERROR] [FATAL] InnoDB:
innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch.
Please refer to
https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

This threshold is 10 minutes, so it's not as if the hardware is that slow.

To my frustration, the hirsute-based container (below), created as a
test framework for you, has never produced a fault, even though it runs
on the same 5.14.14-200.fc34.x86_64 kernel that would otherwise fail
after 2-3 stress.ddl_innodb tests.

$ podman run   --rm --privileged=true
quay.io/danielgblack/mariadb-test:uring    --vardir=/var/tmp
stress.ddl_innodb{,,,,,,,,,,,,,}
...
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 2908.065 of 822 seconds executing testcases

Completed: All 18 tests were successful.

Looking at the server test logs in /var/tmp/[0-9]/*/*err*, the mariadbd
processes are using io_uring.

I hope this provides a hint.

In the meantime, the complete reproduction is to pull a 10.6 distro
package from https://mariadb.org/download/?tab=repo-config
It has to be a distro that provides liburing, like:
Debian sid
Ubuntu - groovy+
RHEL 8
Fedora

(centos/rhel currently have an incorrect baseurl; replace the last
fragment of the path with [rhel|centos][7|8]-$arch )
Install the repo.
Install the package mariadb-test (it pulls in the MariaDB server as a dependency).
ldd /usr/{s}bin/mariadbd to check that liburing is there.
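
As a quick aside (not one of the packaging steps above, just a hedged
sketch of my own), a tiny C probe can also confirm that the running kernel
supports io_uring at all; io_uring_queue_init() fails with -ENOSYS where it
does not:

/* probe.c: minimal io_uring availability check (illustrative only, not
 * part of the MariaDB test suite). Build with: gcc probe.c -o probe -luring */
#include <liburing.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct io_uring ring;
    int ret = io_uring_queue_init(4, &ring, 0);   /* 4 entries, no flags */

    if (ret < 0) {
        fprintf(stderr, "io_uring unavailable: %s\n", strerror(-ret));
        return 1;
    }
    io_uring_queue_exit(&ring);
    printf("io_uring available\n");
    return 0;
}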

cd /usr/share/mysql/mysql-test
./mtr --vardir=/tmp/var   --parallel=4 encryption.innochecksum{,,,,,}
./mtr --vardir=/tmp/var   --parallel=4 stress.ddl_innodb{,,,,,}

These should generate a backtrace like the one above.

With gdb and xterm installed, the following mtr argument will set a
breakpoint so the application stops in this state:
 --gdb='b ib::fatal::~fatal;r'

I'm happy to build from a tree like
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=io_uring-5.15
if you'd like me to test something locally.

I can also run bpftrace scripts to pull out info if required.


* Re: uring regression - lost write request
  2021-10-22  3:12 uring regression - lost write request Daniel Black
@ 2021-10-22  9:10 ` Pavel Begunkov
  2021-10-25  9:57   ` Pavel Begunkov
  0 siblings, 1 reply; 36+ messages in thread
From: Pavel Begunkov @ 2021-10-22  9:10 UTC (permalink / raw)
  To: Daniel Black, linux-block; +Cc: io-uring

On 10/22/21 04:12, Daniel Black wrote:
> Sometime after 5.11 and is fixed in 5.15-rcX (rc6 extensively tested
> over last few days) is a kernel regression we are tracing in
> https://jira.mariadb.org/browse/MDEV-26674 and
> https://jira.mariadb.org/browse/MDEV-26555
> 5.10 and early across many distros and hardware appear not to have a problem.
> 
> I'd appreciate some help identifying a 5.14 linux stable patch
> suitable as I observe the fault in mainline 5.14.14 (built

Cc: io-uring@vger.kernel.org

Let me try to remember anything relevant from 5.15.
Thanks for letting us know.

-- 
Pavel Begunkov


* Re: uring regression - lost write request
  2021-10-22  9:10 ` Pavel Begunkov
@ 2021-10-25  9:57   ` Pavel Begunkov
  2021-10-25 11:09     ` Daniel Black
  0 siblings, 1 reply; 36+ messages in thread
From: Pavel Begunkov @ 2021-10-25  9:57 UTC (permalink / raw)
  To: Daniel Black, linux-block; +Cc: io-uring

On 10/22/21 10:10, Pavel Begunkov wrote:
> On 10/22/21 04:12, Daniel Black wrote:
>> Sometime after 5.11 and is fixed in 5.15-rcX (rc6 extensively tested
>> over last few days) is a kernel regression we are tracing in
>> https://jira.mariadb.org/browse/MDEV-26674 and
>> https://jira.mariadb.org/browse/MDEV-26555
>> 5.10 and early across many distros and hardware appear not to have a problem.
>>
>> I'd appreciate some help identifying a 5.14 linux stable patch
>> suitable as I observe the fault in mainline 5.14.14 (built
> 
> Cc: io-uring@vger.kernel.org
> 
> Let me try to remember anything relevant from 5.15,
> Thanks for letting know

Daniel, following the links I found this:

"From: Daniel Black <daniel@mariadb.org>
...
The good news is I've validated that the linux mainline 5.14.14 build
from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/ has
actually fixed this problem."

To be clear, is the mainline 5.14 kernel affected by the issue?
Or does the problem exist only in debian/etc. kernel trees?

-- 
Pavel Begunkov


* Re: uring regression - lost write request
  2021-10-25  9:57   ` Pavel Begunkov
@ 2021-10-25 11:09     ` Daniel Black
  2021-10-25 11:25       ` Pavel Begunkov
  0 siblings, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-10-25 11:09 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: linux-block, io-uring

On Mon, Oct 25, 2021 at 8:59 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 10/22/21 10:10, Pavel Begunkov wrote:
> > On 10/22/21 04:12, Daniel Black wrote:
> >> Sometime after 5.11 and is fixed in 5.15-rcX (rc6 extensively tested
> >> over last few days) is a kernel regression we are tracing in
> >> https://jira.mariadb.org/browse/MDEV-26674 and
> >> https://jira.mariadb.org/browse/MDEV-26555
> >> 5.10 and early across many distros and hardware appear not to have a problem.
> >>
> >> I'd appreciate some help identifying a 5.14 linux stable patch
> >> suitable as I observe the fault in mainline 5.14.14 (built
> >
> > Cc: io-uring@vger.kernel.org
> >
> > Let me try to remember anything relevant from 5.15,
> > Thanks for letting know
>
> Daniel, following the links I found this:
>
> "From: Daniel Black <daniel@mariadb.org>
> ...
> The good news is I've validated that the linux mainline 5.14.14 build
> from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/ has
> actually fixed this problem."
>
> To be clear, is the mainline 5.14 kernel affected with the issue?
> Or does the problem exists only in debian/etc. kernel trees?
>
> --
> Pavel Begunkov


Thanks Pavel for looking.

I'm retesting https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/
in earnest. I did get some assertions, but they may have been
unrelated. The testing continues...

The problem with debian trees on 5.14.12 (as
linux-image-5.14.0-3-amd64_5.14.12-1_amd64.deb) was quite real
https://jira.mariadb.org/browse/MDEV-26674?focusedCommentId=203155&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-203155


What is concrete is the fc34 package of 5.14.14 (which does carry a
Red Hat delta,
https://src.fedoraproject.org/rpms/kernel/blob/f34/f/patch-5.14-redhat.patch,
though I am unsure of its significance). Output below:

https://koji.fedoraproject.org/koji/buildinfo?buildID=1847210

$ uname -a
Linux localhost.localdomain 5.14.14-200.fc34.x86_64 #1 SMP Wed Oct 20
16:15:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

~/repos/mariadb-server-10.6 10.6
$ bd

~/repos/build-mariadb-server-10.6
$ mysql-test/mtr  --parallel=4 encryption.innochecksum{,,,,,}
Logging: /home/dan/repos/mariadb-server-10.6/mysql-test/mariadb-test-run.pl
 --parallel=4 encryption.innochecksum encryption.innochecksum
encryption.innochecksum encryption.innochecksum
encryption.innochecksum encryption.innochecksum
vardir: /home/dan/repos/build-mariadb-server-10.6/mysql-test/var
Removing old var directory...
 - WARNING: Using the 'mysql-test/var' symlink
The destination for symlink
/home/dan/repos/build-mariadb-server-10.6/mysql-test/var does not
exist; Removing it and creating a new var directory
Creating var directory
'/home/dan/repos/build-mariadb-server-10.6/mysql-test/var'...
Checking supported features...
MariaDB Version 10.6.5-MariaDB
 - SSL connections supported
 - binaries built with wsrep patch
Collecting tests...
Installing system database...

==============================================================================

TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[3] Using MTR_BUILD_THREAD 302, with reserved ports 16040..16059
worker[2] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
worker[4] Using MTR_BUILD_THREAD 303, with reserved ports 16060..16079
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w3 [ pass ]   5460
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w2 [ pass ]   5418
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w1 [ pass ]   9391
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w3 [ pass ]   8682
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w3 [ pass ]   3873
encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   9133
encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]  11074
encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   5253
encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   4019
encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   6318
encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   6176
encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   7305
encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   4430
encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]  10005
encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   6878
encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   3613
encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   3875
encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   6612
encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   4901
encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   3853
encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   5080
encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   7072
encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   6774
encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   7037
encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   4961
encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   5692
encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   8449
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   5515
encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   5650
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   3722
encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   6691
encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   4611
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   4587
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   5465
encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   6900
encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   8333
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   4691
encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   5077
encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   6319
encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   4590
encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   9683
encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   5404
encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6775
encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   6190
encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   9354
encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   7734
encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   4993
encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6280
encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   4487
encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6971
encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   5172
encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6317
encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   3371
encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   3472
encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   6707
encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   9337
encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   9176
encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]  11817
encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   3419
encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   5256
encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   9291
encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   6508
encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w2 [ pass ]   6294
encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   6327
encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w2 [ pass ]   4579
encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w1 [ pass ]   4764
encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w2 [ pass ]   4469
encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w1 [ pass ]   4677
encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w2 [ pass ]   4696
encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w1 [ pass ]   3898
encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]  127358
encryption.innochecksum '16k,cbc,innodb,strict_crc32' w4 [ fail ]
        Test ended at 2021-10-25 21:39:13

CURRENT_TEST: encryption.innochecksum
mysqltest: At line 41: query 'INSERT INTO t3 SELECT * FROM t1' failed:
<Unknown> (2013): Lost connection to server during query

The result from queries just before the failure was:
SET GLOBAL innodb_file_per_table = ON;
set global innodb_compression_algorithm = 1;
# Create and populate a tables
CREATE TABLE t1 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
ENGINE=InnoDB ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
CREATE TABLE t2 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
ENGINE=InnoDB ROW_FORMAT=COMPRESSED ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
CREATE TABLE t3 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
ENGINE=InnoDB ROW_FORMAT=COMPRESSED ENCRYPTED=NO;
CREATE TABLE t4 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
ENGINE=InnoDB PAGE_COMPRESSED=1;
CREATE TABLE t5 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
ENGINE=InnoDB PAGE_COMPRESSED=1 ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
CREATE TABLE t6 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB;


Server [mysqld.1 - pid: 15380, winpid: 15380, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
$ /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd
--defaults-group-suffix=.1
--defaults-file=/home/dan/repos/build-mariadb-server-10.6/mysql-test/var/4/my.cnf
--log-output=file --innodb-page-size=16K
--skip-innodb-read-only-compressed
--innodb-checksum-algorithm=strict_crc32 --innodb-flush-sync=OFF
--innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx
--innodb-locks --innodb-lock-waits --innodb-metrics
--innodb-buffer-pool-stats --innodb-buffer-page
--innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields
--innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes
--innodb-sys-tables --innodb-sys-virtual
--plugin-load-add=file_key_management.so --loose-file-key-management
--loose-file-key-management-filename=/home/dan/repos/mariadb-server-10.6/mysql-test/std_data/keys.txt
--file-key-management-encryption-algorithm=aes_cbc
--skip-innodb-read-only-compressed --core-file
--loose-debug-sync-timeout=300
2021-10-25 21:28:56 0 [Note]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd (server
10.6.5-MariaDB-log) starting as process 15381 ...
2021-10-25 21:28:56 0 [Warning] Could not increase number of
max_open_files to more than 1024 (request: 32190)
2021-10-25 21:28:56 0 [Warning] Changed limits: max_open_files: 1024
max_connections: 151 (was 151)  table_cache: 421 (was 2000)
2021-10-25 21:28:56 0 [Note] Plugin 'partition' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'SEQUENCE' is disabled.
2021-10-25 21:28:56 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-10-25 21:28:56 0 [Note] InnoDB: Number of pools: 1
2021-10-25 21:28:56 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2021-10-25 21:28:56 0 [Note] InnoDB: Using liburing
2021-10-25 21:28:56 0 [Note] InnoDB: Initializing buffer pool, total
size = 8388608, chunk size = 8388608
2021-10-25 21:28:56 0 [Note] InnoDB: Completed initialization of buffer pool
2021-10-25 21:28:56 0 [Note] InnoDB: 128 rollback segments are active.
2021-10-25 21:28:56 0 [Note] InnoDB: Creating shared tablespace for
temporary tables
2021-10-25 21:28:56 0 [Note] InnoDB: Setting file './ibtmp1' size to
12 MB. Physically writing the file full; Please wait ...
2021-10-25 21:28:56 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-10-25 21:28:56 0 [Note] InnoDB: 10.6.5 started; log sequence
number 43637; transaction id 17
2021-10-25 21:28:56 0 [Note] InnoDB: Loading buffer pool(s) from
/home/dan/repos/build-mariadb-server-10.6/mysql-test/var/4/mysqld.1/data/ib_buffer_pool
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMP' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_WAITS' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_QUEUES' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_GROUPS' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'user_variables' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_STATS' is disabled.
2021-10-25 21:28:56 0 [Note] Plugin 'unix_socket' is disabled.
2021-10-25 21:28:56 0 [Warning]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
variable 'loose-feedback-debug-startup-interval=20'
2021-10-25 21:28:56 0 [Warning]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
variable 'loose-feedback-debug-first-interval=60'
2021-10-25 21:28:56 0 [Warning]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
variable 'loose-feedback-debug-interval=60'
2021-10-25 21:28:56 0 [Warning]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown option
'--loose-pam-debug'
2021-10-25 21:28:56 0 [Warning]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown option
'--loose-aria'
2021-10-25 21:28:56 0 [Warning]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
variable 'loose-debug-sync-timeout=300'
2021-10-25 21:28:56 0 [Note] Server socket created on IP: '127.0.0.1'.
2021-10-25 21:28:56 0 [Note]
/home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: ready for
connections.
Version: '10.6.5-MariaDB-log'  socket:
'/home/dan/repos/build-mariadb-server-10.6/mysql-test/var/tmp/4/mysqld.1.sock'
 port: 16060  Source distribution
2021-10-25 21:28:56 0 [Note] InnoDB: Buffer pool(s) load completed at
211025 21:28:56
2021-10-25 21:39:11 0 [ERROR] [FATAL] InnoDB:
innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch


* Re: uring regression - lost write request
  2021-10-25 11:09     ` Daniel Black
@ 2021-10-25 11:25       ` Pavel Begunkov
  2021-10-30  7:30         ` Salvatore Bonaccorso
  0 siblings, 1 reply; 36+ messages in thread
From: Pavel Begunkov @ 2021-10-25 11:25 UTC (permalink / raw)
  To: Daniel Black; +Cc: linux-block, io-uring

On 10/25/21 12:09, Daniel Black wrote:
> On Mon, Oct 25, 2021 at 8:59 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 10/22/21 10:10, Pavel Begunkov wrote:
>>> On 10/22/21 04:12, Daniel Black wrote:
>>>> Sometime after 5.11 and is fixed in 5.15-rcX (rc6 extensively tested
>>>> over last few days) is a kernel regression we are tracing in
>>>> https://jira.mariadb.org/browse/MDEV-26674 and
>>>> https://jira.mariadb.org/browse/MDEV-26555
>>>> 5.10 and early across many distros and hardware appear not to have a problem.
>>>>
>>>> I'd appreciate some help identifying a 5.14 linux stable patch
>>>> suitable as I observe the fault in mainline 5.14.14 (built
>>>
>>> Cc: io-uring@vger.kernel.org
>>>
>>> Let me try to remember anything relevant from 5.15,
>>> Thanks for letting know
>>
>> Daniel, following the links I found this:
>>
>> "From: Daniel Black <daniel@mariadb.org>
>> ...
>> The good news is I've validated that the linux mainline 5.14.14 build
>> from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/ has
>> actually fixed this problem."
>>
>> To be clear, is the mainline 5.14 kernel affected with the issue?
>> Or does the problem exists only in debian/etc. kernel trees?
> 
> Thanks Pavel for looking.
> 
> I'm retesting https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/
> in earnest. I did get some assertions, but they may have been
> unrelated. The testing continues...

Thanks for the work on pinpointing it. I'll wait for your conclusion
then; it'll give us an idea of what we should look for.


> The problem with debian trees on 5.14.12 (as
> linux-image-5.14.0-3-amd64_5.14.12-1_amd64.deb) was quite real
> https://jira.mariadb.org/browse/MDEV-26674?focusedCommentId=203155&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-203155
> 
> 
> What is concrete is the fc34 package of 5.14.14 (which obviously does
> have a Red Hat delta
> https://src.fedoraproject.org/rpms/kernel/blob/f34/f/patch-5.14-redhat.patch),
> but unsure of significance. Output below:
> 
> https://koji.fedoraproject.org/koji/buildinfo?buildID=1847210
> 
> $ uname -a
> Linux localhost.localdomain 5.14.14-200.fc34.x86_64 #1 SMP Wed Oct 20
> 16:15:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
> 
> ~/repos/mariadb-server-10.6 10.6
> $ bd
> 
> ~/repos/build-mariadb-server-10.6
> $ mysql-test/mtr  --parallel=4 encryption.innochecksum{,,,,,}
> Logging: /home/dan/repos/mariadb-server-10.6/mysql-test/mariadb-test-run.pl
>   --parallel=4 encryption.innochecksum encryption.innochecksum
> encryption.innochecksum encryption.innochecksum
> encryption.innochecksum encryption.innochecksum
> vardir: /home/dan/repos/build-mariadb-server-10.6/mysql-test/var
> Removing old var directory...
>   - WARNING: Using the 'mysql-test/var' symlink
> The destination for symlink
> /home/dan/repos/build-mariadb-server-10.6/mysql-test/var does not
> exist; Removing it and creating a new var directory
> Creating var directory
> '/home/dan/repos/build-mariadb-server-10.6/mysql-test/var'...
> Checking supported features...
> MariaDB Version 10.6.5-MariaDB
>   - SSL connections supported
>   - binaries built with wsrep patch
> Collecting tests...
> Installing system database...
> 
> ==============================================================================
> 
> TEST                                  WORKER RESULT   TIME (ms) or COMMENT
> --------------------------------------------------------------------------
> 
> worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
> worker[3] Using MTR_BUILD_THREAD 302, with reserved ports 16040..16059
> worker[2] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
> worker[4] Using MTR_BUILD_THREAD 303, with reserved ports 16060..16079
> encryption.innochecksum '16k,cbc,innodb,strict_crc32' w3 [ pass ]   5460
> encryption.innochecksum '16k,cbc,innodb,strict_crc32' w2 [ pass ]   5418
> encryption.innochecksum '16k,cbc,innodb,strict_crc32' w1 [ pass ]   9391
> encryption.innochecksum '16k,cbc,innodb,strict_crc32' w3 [ pass ]   8682
> encryption.innochecksum '16k,cbc,innodb,strict_crc32' w3 [ pass ]   3873
> encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   9133
> encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]  11074
> encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   5253
> encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   4019
> encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   6318
> encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   6176
> encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   7305
> encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   4430
> encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]  10005
> encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   6878
> encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   3613
> encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   3875
> encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   6612
> encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   4901
> encryption.innochecksum '16k,cbc,innodb,strict_full_crc32' w3 [ pass ]   3853
> encryption.innochecksum '8k,cbc,innodb,strict_crc32' w1 [ pass ]   5080
> encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   7072
> encryption.innochecksum '4k,cbc,innodb,strict_crc32' w2 [ pass ]   6774
> encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   7037
> encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   4961
> encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   5692
> encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   8449
> encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   5515
> encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   5650
> encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   3722
> encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   6691
> encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   4611
> encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   4587
> encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   5465
> encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   6900
> encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   8333
> encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   4691
> encryption.innochecksum '8k,cbc,innodb,strict_full_crc32' w1 [ pass ]   5077
> encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]   6319
> encryption.innochecksum '16k,ctr,innodb,strict_crc32' w2 [ pass ]   4590
> encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   9683
> encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   5404
> encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6775
> encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   6190
> encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   9354
> encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   7734
> encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   4993
> encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6280
> encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   4487
> encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6971
> encryption.innochecksum '8k,ctr,innodb,strict_crc32' w2 [ pass ]   5172
> encryption.innochecksum '4k,ctr,innodb,strict_crc32' w1 [ pass ]   6317
> encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   3371
> encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   3472
> encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   6707
> encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   9337
> encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   9176
> encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]  11817
> encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   3419
> encryption.innochecksum '16k,ctr,innodb,strict_full_crc32' w2 [ pass ]   5256
> encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   9291
> encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   6508
> encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w2 [ pass ]   6294
> encryption.innochecksum '4k,ctr,innodb,strict_full_crc32' w1 [ pass ]   6327
> encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w2 [ pass ]   4579
> encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w1 [ pass ]   4764
> encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w2 [ pass ]   4469
> encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w1 [ pass ]   4677
> encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w2 [ pass ]   4696
> encryption.innochecksum '8k,ctr,innodb,strict_full_crc32' w1 [ pass ]   3898
> encryption.innochecksum '4k,cbc,innodb,strict_full_crc32' w3 [ pass ]  127358
> encryption.innochecksum '16k,cbc,innodb,strict_crc32' w4 [ fail ]
>          Test ended at 2021-10-25 21:39:13
> 
> CURRENT_TEST: encryption.innochecksum
> mysqltest: At line 41: query 'INSERT INTO t3 SELECT * FROM t1' failed:
> <Unknown> (2013): Lost connection to server during query
> 
> The result from queries just before the failure was:
> SET GLOBAL innodb_file_per_table = ON;
> set global innodb_compression_algorithm = 1;
> # Create and populate a tables
> CREATE TABLE t1 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
> ENGINE=InnoDB ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
> CREATE TABLE t2 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
> ENGINE=InnoDB ROW_FORMAT=COMPRESSED ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
> CREATE TABLE t3 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
> ENGINE=InnoDB ROW_FORMAT=COMPRESSED ENCRYPTED=NO;
> CREATE TABLE t4 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
> ENGINE=InnoDB PAGE_COMPRESSED=1;
> CREATE TABLE t5 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT)
> ENGINE=InnoDB PAGE_COMPRESSED=1 ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
> CREATE TABLE t6 (a INT AUTO_INCREMENT PRIMARY KEY, b TEXT) ENGINE=InnoDB;
> 
> 
> Server [mysqld.1 - pid: 15380, winpid: 15380, exit: 256] failed during test run
> Server log from this test:
> ----------SERVER LOG START-----------
> $ /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd
> --defaults-group-suffix=.1
> --defaults-file=/home/dan/repos/build-mariadb-server-10.6/mysql-test/var/4/my.cnf
> --log-output=file --innodb-page-size=16K
> --skip-innodb-read-only-compressed
> --innodb-checksum-algorithm=strict_crc32 --innodb-flush-sync=OFF
> --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx
> --innodb-locks --innodb-lock-waits --innodb-metrics
> --innodb-buffer-pool-stats --innodb-buffer-page
> --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields
> --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes
> --innodb-sys-tables --innodb-sys-virtual
> --plugin-load-add=file_key_management.so --loose-file-key-management
> --loose-file-key-management-filename=/home/dan/repos/mariadb-server-10.6/mysql-test/std_data/keys.txt
> --file-key-management-encryption-algorithm=aes_cbc
> --skip-innodb-read-only-compressed --core-file
> --loose-debug-sync-timeout=300
> 2021-10-25 21:28:56 0 [Note]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd (server
> 10.6.5-MariaDB-log) starting as process 15381 ...
> 2021-10-25 21:28:56 0 [Warning] Could not increase number of
> max_open_files to more than 1024 (request: 32190)
> 2021-10-25 21:28:56 0 [Warning] Changed limits: max_open_files: 1024
> max_connections: 151 (was 151)  table_cache: 421 (was 2000)
> 2021-10-25 21:28:56 0 [Note] Plugin 'partition' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'SEQUENCE' is disabled.
> 2021-10-25 21:28:56 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
> 2021-10-25 21:28:56 0 [Note] InnoDB: Number of pools: 1
> 2021-10-25 21:28:56 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
> 2021-10-25 21:28:56 0 [Note] InnoDB: Using liburing
> 2021-10-25 21:28:56 0 [Note] InnoDB: Initializing buffer pool, total
> size = 8388608, chunk size = 8388608
> 2021-10-25 21:28:56 0 [Note] InnoDB: Completed initialization of buffer pool
> 2021-10-25 21:28:56 0 [Note] InnoDB: 128 rollback segments are active.
> 2021-10-25 21:28:56 0 [Note] InnoDB: Creating shared tablespace for
> temporary tables
> 2021-10-25 21:28:56 0 [Note] InnoDB: Setting file './ibtmp1' size to
> 12 MB. Physically writing the file full; Please wait ...
> 2021-10-25 21:28:56 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
> 2021-10-25 21:28:56 0 [Note] InnoDB: 10.6.5 started; log sequence
> number 43637; transaction id 17
> 2021-10-25 21:28:56 0 [Note] InnoDB: Loading buffer pool(s) from
> /home/dan/repos/build-mariadb-server-10.6/mysql-test/var/4/mysqld.1/data/ib_buffer_pool
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMP' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_WAITS' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_QUEUES' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'FEEDBACK' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_GROUPS' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'user_variables' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'THREAD_POOL_STATS' is disabled.
> 2021-10-25 21:28:56 0 [Note] Plugin 'unix_socket' is disabled.
> 2021-10-25 21:28:56 0 [Warning]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
> variable 'loose-feedback-debug-startup-interval=20'
> 2021-10-25 21:28:56 0 [Warning]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
> variable 'loose-feedback-debug-first-interval=60'
> 2021-10-25 21:28:56 0 [Warning]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
> variable 'loose-feedback-debug-interval=60'
> 2021-10-25 21:28:56 0 [Warning]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown option
> '--loose-pam-debug'
> 2021-10-25 21:28:56 0 [Warning]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown option
> '--loose-aria'
> 2021-10-25 21:28:56 0 [Warning]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: unknown
> variable 'loose-debug-sync-timeout=300'
> 2021-10-25 21:28:56 0 [Note] Server socket created on IP: '127.0.0.1'.
> 2021-10-25 21:28:56 0 [Note]
> /home/dan/repos/build-mariadb-server-10.6/sql/mariadbd: ready for
> connections.
> Version: '10.6.5-MariaDB-log'  socket:
> '/home/dan/repos/build-mariadb-server-10.6/mysql-test/var/tmp/4/mysqld.1.sock'
>   port: 16060  Source distribution
> 2021-10-25 21:28:56 0 [Note] InnoDB: Buffer pool(s) load completed at
> 211025 21:28:56
> 2021-10-25 21:39:11 0 [ERROR] [FATAL] InnoDB:
> innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch
> 

-- 
Pavel Begunkov


* Re: uring regression - lost write request
  2021-10-25 11:25       ` Pavel Begunkov
@ 2021-10-30  7:30         ` Salvatore Bonaccorso
  2021-11-01  7:28           ` Daniel Black
  0 siblings, 1 reply; 36+ messages in thread
From: Salvatore Bonaccorso @ 2021-10-30  7:30 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Daniel Black, linux-block, io-uring

Hi Daniel,

On Mon, Oct 25, 2021 at 12:25:01PM +0100, Pavel Begunkov wrote:
> On 10/25/21 12:09, Daniel Black wrote:
> > On Mon, Oct 25, 2021 at 8:59 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
> > > 
> > > On 10/22/21 10:10, Pavel Begunkov wrote:
> > > > On 10/22/21 04:12, Daniel Black wrote:
> > > > > Sometime after 5.11 and is fixed in 5.15-rcX (rc6 extensively tested
> > > > > over last few days) is a kernel regression we are tracing in
> > > > > https://jira.mariadb.org/browse/MDEV-26674 and
> > > > > https://jira.mariadb.org/browse/MDEV-26555
> > > > > 5.10 and early across many distros and hardware appear not to have a problem.
> > > > > 
> > > > > I'd appreciate some help identifying a 5.14 linux stable patch
> > > > > suitable as I observe the fault in mainline 5.14.14 (built
> > > > 
> > > > Cc: io-uring@vger.kernel.org
> > > > 
> > > > Let me try to remember anything relevant from 5.15,
> > > > Thanks for letting know
> > > 
> > > Daniel, following the links I found this:
> > > 
> > > "From: Daniel Black <daniel@mariadb.org>
> > > ...
> > > The good news is I've validated that the linux mainline 5.14.14 build
> > > from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/ has
> > > actually fixed this problem."
> > > 
> > > To be clear, is the mainline 5.14 kernel affected with the issue?
> > > Or does the problem exists only in debian/etc. kernel trees?
> > 
> > Thanks Pavel for looking.
> > 
> > I'm retesting https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/
> > in earnest. I did get some assertions, but they may have been
> > unrelated. The testing continues...
> 
> Thanks for the work on pinpointing it. I'll wait for your conclusion
> then, it'll give us an idea what we should look for.

Were you able to pinpoint the issue?

Regards,
Salvatore


* Re: uring regression - lost write request
  2021-10-30  7:30         ` Salvatore Bonaccorso
@ 2021-11-01  7:28           ` Daniel Black
  2021-11-09 22:58             ` Daniel Black
  0 siblings, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-11-01  7:28 UTC (permalink / raw)
  To: Salvatore Bonaccorso; +Cc: Pavel Begunkov, linux-block, io-uring

[-- Attachment #1: Type: text/plain, Size: 1393 bytes --]

On Sat, Oct 30, 2021 at 6:30 PM Salvatore Bonaccorso <carnil@debian.org> wrote:

> > > I'm retesting https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/
> > > in earnest. I did get some assertions, but they may have been
> > > unrelated. The testing continues...
> >
> > Thanks for the work on pinpointing it. I'll wait for your conclusion
> > then, it'll give us an idea what we should look for.
>
> Were you able to pinpoint the issue?

Retesting on the Ubuntu mainline 5.14.14 and 5.14.15 kernels, I was unable
to reproduce the issue in a VM.

Using the Fedora 34 5.14.14 and 5.14.15 kernels I am reasonably reliably
able to reproduce this, and it is now reported as
https://bugzilla.redhat.com/show_bug.cgi?id=2018882.

I've so far been unable to reproduce this issue on 5.15.0-rc7 inside an
(Ubuntu 21.10) VM.

Marko, using another heavy-flushing sysbench script (a modified version
is attached - slightly lower specs, and it can be used on a distro
install), was able to see the fault (qps drops to 0) using Debian sid
userspace and the 5.15-rc6/5.15-rc7 Ubuntu mainline kernels:
https://jira.mariadb.org/browse/MDEV-26674?focusedCommentId=203645&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-203645

Note that if you are using mariadb-10.6.5 (not quite released), there is a
change of defaults to avoid this bug; the mtr options
--mysqld=--innodb_use_native_aio=1 --nowarnings will still exercise it,
however.

[-- Attachment #2: Mariarebench-MDEV-23855.sh --]
[-- Type: application/x-shellscript, Size: 3163 bytes --]


* Re: uring regression - lost write request
  2021-11-01  7:28           ` Daniel Black
@ 2021-11-09 22:58             ` Daniel Black
  2021-11-09 23:24               ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-11-09 22:58 UTC (permalink / raw)
  To: Salvatore Bonaccorso; +Cc: Pavel Begunkov, linux-block, io-uring

> On Sat, Oct 30, 2021 at 6:30 PM Salvatore Bonaccorso <carnil@debian.org> wrote:
> > Were you able to pinpoint the issue?

While I have been unable to reproduce this on a single-CPU machine, Marko
can repeat a stall on a dual Broadwell chipset on these kernels:

* 5.15.1 - https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.1
* 5.14.16 - https://packages.debian.org/sid/linux-image-5.14.0-4-amd64

Detailed observations:
https://jira.mariadb.org/browse/MDEV-26674

The previous script has been adapted to use the MariaDB 10.6 packages and
sysbench to demonstrate a workload; I've changed Marko's script to
work with the distro packages and to use innodb_use_native_aio=1.

MariaDB packages:

https://mariadb.org/download/?t=repo-config
(this needs a distro that ships the liburing userspace library as standard)

Script:

https://jira.mariadb.org/secure/attachment/60358/Mariabench-MDEV-26674-io_uring-1

The failure state is reached either when the sysbench prepare stalls, or
when the tps printed at 5-second intervals falls to 0.


* Re: uring regression - lost write request
  2021-11-09 22:58             ` Daniel Black
@ 2021-11-09 23:24               ` Jens Axboe
  2021-11-10 18:01                 ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-09 23:24 UTC (permalink / raw)
  To: Daniel Black, Salvatore Bonaccorso; +Cc: Pavel Begunkov, linux-block, io-uring

On 11/9/21 3:58 PM, Daniel Black wrote:
>> On Sat, Oct 30, 2021 at 6:30 PM Salvatore Bonaccorso <carnil@debian.org> wrote:
>>> Were you able to pinpoint the issue?
> 
> While I have been unable to reproduce this on a single cpu, Marko can
> repeat a stall on a dual Broadwell chipset on kernels:
> 
> * 5.15.1 - https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.1
> * 5.14.16 - https://packages.debian.org/sid/linux-image-5.14.0-4-amd64
> 
> Detailed observations:
> https://jira.mariadb.org/browse/MDEV-26674
> 
> The previous script has been adapted to use MariaDB-10.6 package and
> sysbench to demonstrate a workload, I've changed Marko's script to
> work with the distro packages and use innodb_use_native_aio=1.
> 
> MariaDB packages:
> 
> https://mariadb.org/download/?t=repo-config
> (needs a distro that has liburing userspace libraries as standard support)
> 
> Script:
> 
> https://jira.mariadb.org/secure/attachment/60358/Mariabench-MDEV-26674-io_uring-1
> 
> The state is achieved either when the sysbench prepare stalls, or the
> tps printed at 5 second intervals falls to 0.

Thanks, this is most useful! I'll take a look at this.

-- 
Jens Axboe



* Re: uring regression - lost write request
  2021-11-09 23:24               ` Jens Axboe
@ 2021-11-10 18:01                 ` Jens Axboe
  2021-11-11  6:52                   ` Daniel Black
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-10 18:01 UTC (permalink / raw)
  To: Daniel Black, Salvatore Bonaccorso; +Cc: Pavel Begunkov, linux-block, io-uring

On 11/9/21 4:24 PM, Jens Axboe wrote:
> On 11/9/21 3:58 PM, Daniel Black wrote:
>>> On Sat, Oct 30, 2021 at 6:30 PM Salvatore Bonaccorso <carnil@debian.org> wrote:
>>>> Were you able to pinpoint the issue?
>>
>> While I have been unable to reproduce this on a single cpu, Marko can
>> repeat a stall on a dual Broadwell chipset on kernels:
>>
>> * 5.15.1 - https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.1
>> * 5.14.16 - https://packages.debian.org/sid/linux-image-5.14.0-4-amd64
>>
>> Detailed observations:
>> https://jira.mariadb.org/browse/MDEV-26674
>>
>> The previous script has been adapted to use MariaDB-10.6 package and
>> sysbench to demonstrate a workload, I've changed Marko's script to
>> work with the distro packages and use innodb_use_native_aio=1.
>>
>> MariaDB packages:
>>
>> https://mariadb.org/download/?t=repo-config
>> (needs a distro that has liburing userspace libraries as standard support)
>>
>> Script:
>>
>> https://jira.mariadb.org/secure/attachment/60358/Mariabench-MDEV-26674-io_uring-1
>>
>> The state is achieved either when the sysbench prepare stalls, or the
>> tps printed at 5 second intervals falls to 0.
> 
> Thanks, this is most useful! I'll take a look at this.

Would it be possible to turn this into a full reproducer script?
Something that someone that knows nothing about mysqld/mariadb can just
run and have it reproduce. If I install the 10.6 packages from above,
then it doesn't seem to use io_uring or be linked against liburing.

The script also seems to assume that various things are setup
appropriately, like SRCTREE, MDIR, etc.

-- 
Jens Axboe



* Re: uring regression - lost write request
  2021-11-10 18:01                 ` Jens Axboe
@ 2021-11-11  6:52                   ` Daniel Black
  2021-11-11 14:30                     ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-11-11  6:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

> Would it be possible to turn this into a full reproducer script?
> Something that someone that knows nothing about mysqld/mariadb can just
> run and have it reproduce. If I install the 10.6 packages from above,
> then it doesn't seem to use io_uring or be linked against liburing.

Sorry Jens.

Hope containers are ok.

mkdir ~/mdbtest/

$ podman run -d -e MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=1 -e
MARIADB_USER=sbtest -e MARIADB_PASSWORD=sbtest -e
MARIADB_DATABASE=sbtest  --name mdb10.6-uring_test -v
$HOME/mdbtest:/var/lib/mysql:Z  --security-opt seccomp=unconfined
quay.io/danielgblack/mariadb-test:10.6-impish-sysbench
--innodb_log_file_size=1G  --innodb_buffer_pool_size=50G
--innodb_io_capacity=5000  --innodb_io_capacity_max=9000
--innodb_flush_log_at_trx_commit=0   --innodb_adaptive_flushing_lwm=0
 --innodb-adaptive-flushing=1   --innodb_flush_neighbors=1
--innodb-use-native-aio=1   --innodb_file-per-table=1
--innodb-fast-shutdown=0   --innodb-flush-method=O_DIRECT
--innodb_lru_scan_depth=1024   --innodb_lru_flush_size=256


# Drop the 50G pool size down if you don't have that much memory; it is not
critical to reproduction. The IO capacity here should be roughly what the
hardware can actually do, otherwise gaps of 0 tps will appear without the
bug being the cause.

$ podman logs mdb10.6-uring_test
...
2021-11-11  6:06:49 0 [Warning] innodb_use_native_aio may cause hangs
with this kernel 5.15.0-0.rc7.20211028git1fc596a56b33.56.fc36.x86_64;
see https://jira.mariadb.org/browse/MDEV-26674
2021-11-11  6:06:49 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-11-11  6:06:49 0 [Note] InnoDB: Number of pools: 1
2021-11-11  6:06:49 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2021-11-11  6:06:49 0 [Note] mysqld: O_TMPFILE is not supported on
/tmp (disabling future attempts)
2021-11-11  6:06:49 0 [Note] InnoDB: Using liburing

The log should contain the first and last lines shown above.

$ podman exec  mdb10.6-uring_test sysbench
/usr/share/sysbench/oltp_update_index.lua --mysql-password=sbtest
--percentile=99  --tables=8 --table_size=2000000 prepare

Creating table 'sbtest1'...
Inserting 2000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
Creating table 'sbtest2'...
Inserting 2000000 records into 'sbtest2'
Creating a secondary index on 'sbtest2'...
Creating table 'sbtest3'...
Inserting 2000000 records into 'sbtest3'
Creating a secondary index on 'sbtest3'...
Creating table 'sbtest4'...
Inserting 2000000 records into 'sbtest4'
Creating a secondary index on 'sbtest4'...
Creating table 'sbtest5'...
Inserting 2000000 records into 'sbtest5'
Creating a secondary index on 'sbtest5'...
Creating table 'sbtest6'...
Inserting 2000000 records into 'sbtest6'
Creating a secondary index on 'sbtest6'...
Creating table 'sbtest7'...
Inserting 2000000 records into 'sbtest7'
Creating a secondary index on 'sbtest7'...
Creating table 'sbtest8'...
Inserting 2000000 records into 'sbtest8'
Creating a secondary index on 'sbtest8'...


# Adjust --threads to the number of hardware threads available; --time is
# the length of the test in seconds.

$ podman exec  mdb10.6-uring_test sysbench
/usr/share/sysbench/oltp_update_index.lua --mysql-password=sbtest
--percentile=99  --tables=8 --table_size=2000000 --rand-seed=42
--rand-type=uniform --max-requests=0 --time=600 --report-interval=5
--threads=64 run



Eventually, after the
https://mariadb.com/kb/en/innodb-system-variables/#innodb_fatal_semaphore_wait_threshold
of 600 seconds, podman logs mdb10.6-uring_test will contain an
error like:

2021-10-07 17:06:43 0 [ERROR] [FATAL] InnoDB:
innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch.
Please refer to
https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
211007 17:06:43 [ERROR] mysqld got signal 6 ;


Restarting the container on the same populated ~/mdbtest volume can be
slow due to recovery time; remove its contents and repeat the prepare
step instead.

cleanup:

podman kill mdb10.6-uring_test
podman rm mdb10.6-uring_test
sudo rm -rf ~/mdbtest


* Re: uring regression - lost write request
  2021-11-11  6:52                   ` Daniel Black
@ 2021-11-11 14:30                     ` Jens Axboe
  2021-11-11 14:58                       ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 14:30 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/10/21 11:52 PM, Daniel Black wrote:
>> Would it be possible to turn this into a full reproducer script?
>> Something that someone that knows nothing about mysqld/mariadb can just
>> run and have it reproduce. If I install the 10.6 packages from above,
>> then it doesn't seem to use io_uring or be linked against liburing.
> 
> Sorry Jens.
> 
> Hope containers are ok.

I don't think I have a way to run that - I don't even know what podman is,
nor does my distro. I'll google a bit and see if I can get this
running.

I'm fine building from source and running from there, as long as I
know what to do. Would that make it any easier? It definitely would
for me :-)

-- 
Jens Axboe



* Re: uring regression - lost write request
  2021-11-11 14:30                     ` Jens Axboe
@ 2021-11-11 14:58                       ` Jens Axboe
  2021-11-11 15:29                         ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 14:58 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/11/21 7:30 AM, Jens Axboe wrote:
> On 11/10/21 11:52 PM, Daniel Black wrote:
>>> Would it be possible to turn this into a full reproducer script?
>>> Something that someone that knows nothing about mysqld/mariadb can just
>>> run and have it reproduce. If I install the 10.6 packages from above,
>>> then it doesn't seem to use io_uring or be linked against liburing.
>>
>> Sorry Jens.
>>
>> Hope containers are ok.
> 
> Don't think I have a way to run that, don't even know what podman is
> and nor does my distro. I'll google a bit and see if I can get this
> running.
> 
> I'm fine building from source and running from there, as long as I
> know what to do. Would that make it any easier? It definitely would
> for me :-)

The podman approach seemed to work, and I was able to run all three
steps. Didn't see any hangs. I'm going to try again dropping down
the innodb pool size (box only has 32G of RAM).

The storage can do a lot more than 5k IOPS; I'm going to try ramping
that up.

Does your reproducer box have multiple NUMA nodes, or is it a single
socket/node box?

-- 
Jens Axboe



* Re: uring regression - lost write request
  2021-11-11 14:58                       ` Jens Axboe
@ 2021-11-11 15:29                         ` Jens Axboe
  2021-11-11 16:19                           ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 15:29 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/11/21 7:58 AM, Jens Axboe wrote:
> On 11/11/21 7:30 AM, Jens Axboe wrote:
>> On 11/10/21 11:52 PM, Daniel Black wrote:
>>>> Would it be possible to turn this into a full reproducer script?
>>>> Something that someone that knows nothing about mysqld/mariadb can just
>>>> run and have it reproduce. If I install the 10.6 packages from above,
>>>> then it doesn't seem to use io_uring or be linked against liburing.
>>>
>>> Sorry Jens.
>>>
>>> Hope containers are ok.
>>
>> Don't think I have a way to run that, don't even know what podman is
>> and nor does my distro. I'll google a bit and see if I can get this
>> running.
>>
>> I'm fine building from source and running from there, as long as I
>> know what to do. Would that make it any easier? It definitely would
>> for me :-)
> 
> The podman approach seemed to work, and I was able to run all three
> steps. Didn't see any hangs. I'm going to try again dropping down
> the innodb pool size (box only has 32G of RAM).
> 
> The storage can do a lot more than 5k IOPS, I'm going to try ramping
> that up.
> 
> Does your reproducer box have multiple NUMA nodes, or is it a single
> socket/nod box?

Doesn't seem to reproduce for me on current -git. What file system are
you using?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-11 15:29                         ` Jens Axboe
@ 2021-11-11 16:19                           ` Jens Axboe
  2021-11-11 16:55                             ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 16:19 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/11/21 8:29 AM, Jens Axboe wrote:
> On 11/11/21 7:58 AM, Jens Axboe wrote:
>> On 11/11/21 7:30 AM, Jens Axboe wrote:
>>> On 11/10/21 11:52 PM, Daniel Black wrote:
>>>>> Would it be possible to turn this into a full reproducer script?
>>>>> Something that someone that knows nothing about mysqld/mariadb can just
>>>>> run and have it reproduce. If I install the 10.6 packages from above,
>>>>> then it doesn't seem to use io_uring or be linked against liburing.
>>>>
>>>> Sorry Jens.
>>>>
>>>> Hope containers are ok.
>>>
>>> Don't think I have a way to run that, don't even know what podman is
>>> and nor does my distro. I'll google a bit and see if I can get this
>>> running.
>>>
>>> I'm fine building from source and running from there, as long as I
>>> know what to do. Would that make it any easier? It definitely would
>>> for me :-)
>>
>> The podman approach seemed to work, and I was able to run all three
>> steps. Didn't see any hangs. I'm going to try again dropping down
>> the innodb pool size (box only has 32G of RAM).
>>
>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
>> that up.
>>
>> Does your reproducer box have multiple NUMA nodes, or is it a single
>> socket/nod box?
> 
> Doesn't seem to reproduce for me on current -git. What file system are
> you using?

I seem to be able to hit it with ext4, guessing it has more cases that
punt to buffered IO. As I initially suspected, I think this is a race
with buffered file write hashing. I have a debug patch that just turns
a regular non-NUMA box into multiple nodes, which may or may not be
needed to hit this, but I definitely can now. Looks like this:

Node7 DUMP                                                                      
index=0, nr_w=1, max=128, r=0, f=1, h=0                                         
  w=ffff8f5e8b8470c0, hashed=1/0, flags=2                                       
  w=ffff8f5e95a9b8c0, hashed=1/0, flags=2                                       
index=1, nr_w=0, max=127877, r=0, f=0, h=0                                      
free_list                                                                       
  worker=ffff8f5eaf2e0540                                                       
all_list                                                                        
  worker=ffff8f5eaf2e0540

where we see node7 in this case having two work items pending, but the
worker state is stalled on hash.

The hash logic was rewritten as part of the io-wq worker threads being
changed for 5.11 iirc, which is why that was my initial suspicion here.
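
For readers not familiar with io-wq: buffered writes are queued as "hashed"
work, keyed roughly on the file's inode, so only one buffered write per file
runs at a time, and a worker that only finds hashed work for busy buckets
parks on a wait queue until the bucket owner clears it. Below is a
stand-alone, simplified model of that selection logic (user-space C with
invented names, not the kernel code):

/*
 * Hashed work for a bucket that is already running elsewhere is skipped;
 * if nothing else is runnable the caller parks on a wait queue until the
 * bucket is cleared.  The stall being chased here is a missed wake-up on
 * that wait queue.
 */
#include <stdbool.h>
#include <stddef.h>

#define HASH_BUCKETS 32

struct work {
	struct work *next;
	bool hashed;          /* e.g. a buffered write, serialized per file */
	unsigned int bucket;  /* hash bucket, < HASH_BUCKETS                */
};

static unsigned long busy_map;  /* one bit per currently-running bucket */
static struct work *work_list;  /* pending work, singly linked          */

/* Return the next runnable item, or NULL if everything left is hashed
 * to a busy bucket (the caller would then wait on the hash). */
struct work *get_next_work(void)
{
	struct work **pp, *w;

	for (pp = &work_list; (w = *pp) != NULL; pp = &w->next) {
		if (w->hashed && (busy_map & (1UL << w->bucket)))
			continue;                       /* someone else owns this file */
		if (w->hashed)
			busy_map |= 1UL << w->bucket;   /* claim the bucket */
		*pp = w->next;
		return w;
	}
	return NULL;
}

/* Completion side: release the bucket; the real code must then wake any
 * worker parked waiting for it, which is what the fix below is about. */
void finish_work(struct work *w)
{
	if (w->hashed)
		busy_map &= ~(1UL << w->bucket);
}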

I'll take a look at this and make a test patch. Looks like you are able
to test self-built kernels, is that correct?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-11 16:19                           ` Jens Axboe
@ 2021-11-11 16:55                             ` Jens Axboe
  2021-11-11 17:28                               ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 16:55 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/11/21 9:19 AM, Jens Axboe wrote:
> On 11/11/21 8:29 AM, Jens Axboe wrote:
>> On 11/11/21 7:58 AM, Jens Axboe wrote:
>>> On 11/11/21 7:30 AM, Jens Axboe wrote:
>>>> On 11/10/21 11:52 PM, Daniel Black wrote:
>>>>>> Would it be possible to turn this into a full reproducer script?
>>>>>> Something that someone that knows nothing about mysqld/mariadb can just
>>>>>> run and have it reproduce. If I install the 10.6 packages from above,
>>>>>> then it doesn't seem to use io_uring or be linked against liburing.
>>>>>
>>>>> Sorry Jens.
>>>>>
>>>>> Hope containers are ok.
>>>>
>>>> Don't think I have a way to run that, don't even know what podman is
>>>> and nor does my distro. I'll google a bit and see if I can get this
>>>> running.
>>>>
>>>> I'm fine building from source and running from there, as long as I
>>>> know what to do. Would that make it any easier? It definitely would
>>>> for me :-)
>>>
>>> The podman approach seemed to work, and I was able to run all three
>>> steps. Didn't see any hangs. I'm going to try again dropping down
>>> the innodb pool size (box only has 32G of RAM).
>>>
>>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
>>> that up.
>>>
>>> Does your reproducer box have multiple NUMA nodes, or is it a single
>>> socket/nod box?
>>
>> Doesn't seem to reproduce for me on current -git. What file system are
>> you using?
> 
> I seem to be able to hit it with ext4, guessing it has more cases that
> punt to buffered IO. As I initially suspected, I think this is a race
> with buffered file write hashing. I have a debug patch that just turns
> a regular non-numa box into multi nodes, may or may not be needed be
> needed to hit this, but I definitely can now. Looks like this:
> 
> Node7 DUMP                                                                      
> index=0, nr_w=1, max=128, r=0, f=1, h=0                                         
>   w=ffff8f5e8b8470c0, hashed=1/0, flags=2                                       
>   w=ffff8f5e95a9b8c0, hashed=1/0, flags=2                                       
> index=1, nr_w=0, max=127877, r=0, f=0, h=0                                      
> free_list                                                                       
>   worker=ffff8f5eaf2e0540                                                       
> all_list                                                                        
>   worker=ffff8f5eaf2e0540
> 
> where we seed node7 in this case having two work items pending, but the
> worker state is stalled on hash.
> 
> The hash logic was rewritten as part of the io-wq worker threads being
> changed for 5.11 iirc, which is why that was my initial suspicion here.
> 
> I'll take a look at this and make a test patch. Looks like you are able
> to test self-built kernels, is that correct?

Can you try with this patch? It's against -git, but it will apply to
5.15 as well.


diff --git a/fs/io-wq.c b/fs/io-wq.c
index afd955d53db9..7917b8866dcc 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -423,9 +423,10 @@ static inline unsigned int io_get_work_hash(struct io_wq_work *work)
 	return work->flags >> IO_WQ_HASH_SHIFT;
 }
 
-static void io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
+static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 {
 	struct io_wq *wq = wqe->wq;
+	bool ret = false;
 
 	spin_lock_irq(&wq->hash->wait.lock);
 	if (list_empty(&wqe->wait.entry)) {
@@ -433,9 +434,11 @@ static void io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 		if (!test_bit(hash, &wq->hash->map)) {
 			__set_current_state(TASK_RUNNING);
 			list_del_init(&wqe->wait.entry);
+			ret = true;
 		}
 	}
 	spin_unlock_irq(&wq->hash->wait.lock);
+	return ret;
 }
 
 static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
@@ -447,6 +450,7 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 	unsigned int stall_hash = -1U;
 	struct io_wqe *wqe = worker->wqe;
 
+retry:
 	wq_list_for_each(node, prev, &acct->work_list) {
 		unsigned int hash;
 
@@ -475,14 +479,18 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 	}
 
 	if (stall_hash != -1U) {
+		bool do_retry;
+
 		/*
 		 * Set this before dropping the lock to avoid racing with new
 		 * work being added and clearing the stalled bit.
 		 */
 		set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 		raw_spin_unlock(&wqe->lock);
-		io_wait_on_hash(wqe, stall_hash);
+		do_retry = io_wait_on_hash(wqe, stall_hash);
 		raw_spin_lock(&wqe->lock);
+		if (do_retry)
+			goto retry;
 	}
 
 	return NULL;

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-11 16:55                             ` Jens Axboe
@ 2021-11-11 17:28                               ` Jens Axboe
  2021-11-11 23:44                                 ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 17:28 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/11/21 9:55 AM, Jens Axboe wrote:
> On 11/11/21 9:19 AM, Jens Axboe wrote:
>> On 11/11/21 8:29 AM, Jens Axboe wrote:
>>> On 11/11/21 7:58 AM, Jens Axboe wrote:
>>>> On 11/11/21 7:30 AM, Jens Axboe wrote:
>>>>> On 11/10/21 11:52 PM, Daniel Black wrote:
>>>>>>> Would it be possible to turn this into a full reproducer script?
>>>>>>> Something that someone that knows nothing about mysqld/mariadb can just
>>>>>>> run and have it reproduce. If I install the 10.6 packages from above,
>>>>>>> then it doesn't seem to use io_uring or be linked against liburing.
>>>>>>
>>>>>> Sorry Jens.
>>>>>>
>>>>>> Hope containers are ok.
>>>>>
>>>>> Don't think I have a way to run that, don't even know what podman is
>>>>> and nor does my distro. I'll google a bit and see if I can get this
>>>>> running.
>>>>>
>>>>> I'm fine building from source and running from there, as long as I
>>>>> know what to do. Would that make it any easier? It definitely would
>>>>> for me :-)
>>>>
>>>> The podman approach seemed to work, and I was able to run all three
>>>> steps. Didn't see any hangs. I'm going to try again dropping down
>>>> the innodb pool size (box only has 32G of RAM).
>>>>
>>>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
>>>> that up.
>>>>
>>>> Does your reproducer box have multiple NUMA nodes, or is it a single
>>>> socket/nod box?
>>>
>>> Doesn't seem to reproduce for me on current -git. What file system are
>>> you using?
>>
>> I seem to be able to hit it with ext4, guessing it has more cases that
>> punt to buffered IO. As I initially suspected, I think this is a race
>> with buffered file write hashing. I have a debug patch that just turns
>> a regular non-numa box into multi nodes, may or may not be needed be
>> needed to hit this, but I definitely can now. Looks like this:
>>
>> Node7 DUMP                                                                      
>> index=0, nr_w=1, max=128, r=0, f=1, h=0                                         
>>   w=ffff8f5e8b8470c0, hashed=1/0, flags=2                                       
>>   w=ffff8f5e95a9b8c0, hashed=1/0, flags=2                                       
>> index=1, nr_w=0, max=127877, r=0, f=0, h=0                                      
>> free_list                                                                       
>>   worker=ffff8f5eaf2e0540                                                       
>> all_list                                                                        
>>   worker=ffff8f5eaf2e0540
>>
>> where we seed node7 in this case having two work items pending, but the
>> worker state is stalled on hash.
>>
>> The hash logic was rewritten as part of the io-wq worker threads being
>> changed for 5.11 iirc, which is why that was my initial suspicion here.
>>
>> I'll take a look at this and make a test patch. Looks like you are able
>> to test self-built kernels, is that correct?
> 
> Can you try with this patch? It's against -git, but it will apply to
> 5.15 as well.

I think that one covered one potential gap, but I just managed to
reproduce a stall even with it. So hang on testing that one, I'll send
you something more complete when I have confidence in it.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-11 17:28                               ` Jens Axboe
@ 2021-11-11 23:44                                 ` Jens Axboe
  2021-11-12  6:25                                   ` Daniel Black
  2021-11-14 20:33                                   ` Daniel Black
  0 siblings, 2 replies; 36+ messages in thread
From: Jens Axboe @ 2021-11-11 23:44 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/11/21 10:28 AM, Jens Axboe wrote:
> On 11/11/21 9:55 AM, Jens Axboe wrote:
>> On 11/11/21 9:19 AM, Jens Axboe wrote:
>>> On 11/11/21 8:29 AM, Jens Axboe wrote:
>>>> On 11/11/21 7:58 AM, Jens Axboe wrote:
>>>>> On 11/11/21 7:30 AM, Jens Axboe wrote:
>>>>>> On 11/10/21 11:52 PM, Daniel Black wrote:
>>>>>>>> Would it be possible to turn this into a full reproducer script?
>>>>>>>> Something that someone that knows nothing about mysqld/mariadb can just
>>>>>>>> run and have it reproduce. If I install the 10.6 packages from above,
>>>>>>>> then it doesn't seem to use io_uring or be linked against liburing.
>>>>>>>
>>>>>>> Sorry Jens.
>>>>>>>
>>>>>>> Hope containers are ok.
>>>>>>
>>>>>> Don't think I have a way to run that, don't even know what podman is
>>>>>> and nor does my distro. I'll google a bit and see if I can get this
>>>>>> running.
>>>>>>
>>>>>> I'm fine building from source and running from there, as long as I
>>>>>> know what to do. Would that make it any easier? It definitely would
>>>>>> for me :-)
>>>>>
>>>>> The podman approach seemed to work, and I was able to run all three
>>>>> steps. Didn't see any hangs. I'm going to try again dropping down
>>>>> the innodb pool size (box only has 32G of RAM).
>>>>>
>>>>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
>>>>> that up.
>>>>>
>>>>> Does your reproducer box have multiple NUMA nodes, or is it a single
>>>>> socket/nod box?
>>>>
>>>> Doesn't seem to reproduce for me on current -git. What file system are
>>>> you using?
>>>
>>> I seem to be able to hit it with ext4, guessing it has more cases that
>>> punt to buffered IO. As I initially suspected, I think this is a race
>>> with buffered file write hashing. I have a debug patch that just turns
>>> a regular non-numa box into multi nodes, may or may not be needed be
>>> needed to hit this, but I definitely can now. Looks like this:
>>>
>>> Node7 DUMP                                                                      
>>> index=0, nr_w=1, max=128, r=0, f=1, h=0                                         
>>>   w=ffff8f5e8b8470c0, hashed=1/0, flags=2                                       
>>>   w=ffff8f5e95a9b8c0, hashed=1/0, flags=2                                       
>>> index=1, nr_w=0, max=127877, r=0, f=0, h=0                                      
>>> free_list                                                                       
>>>   worker=ffff8f5eaf2e0540                                                       
>>> all_list                                                                        
>>>   worker=ffff8f5eaf2e0540
>>>
>>> where we seed node7 in this case having two work items pending, but the
>>> worker state is stalled on hash.
>>>
>>> The hash logic was rewritten as part of the io-wq worker threads being
>>> changed for 5.11 iirc, which is why that was my initial suspicion here.
>>>
>>> I'll take a look at this and make a test patch. Looks like you are able
>>> to test self-built kernels, is that correct?
>>
>> Can you try with this patch? It's against -git, but it will apply to
>> 5.15 as well.
> 
> I think that one covered one potential gap, but I just managed to
> reproduce a stall even with it. So hang on testing that one, I'll send
> you something more complete when I have confidence in it.

Alright, give this one a go if you can. Against -git, but will apply to
5.15 as well.


diff --git a/fs/io-wq.c b/fs/io-wq.c
index afd955d53db9..88202de519f6 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -423,9 +423,10 @@ static inline unsigned int io_get_work_hash(struct io_wq_work *work)
 	return work->flags >> IO_WQ_HASH_SHIFT;
 }
 
-static void io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
+static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 {
 	struct io_wq *wq = wqe->wq;
+	bool ret = false;
 
 	spin_lock_irq(&wq->hash->wait.lock);
 	if (list_empty(&wqe->wait.entry)) {
@@ -433,9 +434,11 @@ static void io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 		if (!test_bit(hash, &wq->hash->map)) {
 			__set_current_state(TASK_RUNNING);
 			list_del_init(&wqe->wait.entry);
+			ret = true;
 		}
 	}
 	spin_unlock_irq(&wq->hash->wait.lock);
+	return ret;
 }
 
 static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
@@ -475,14 +478,21 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 	}
 
 	if (stall_hash != -1U) {
+		bool unstalled;
+
 		/*
 		 * Set this before dropping the lock to avoid racing with new
 		 * work being added and clearing the stalled bit.
 		 */
 		set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 		raw_spin_unlock(&wqe->lock);
-		io_wait_on_hash(wqe, stall_hash);
+		unstalled = io_wait_on_hash(wqe, stall_hash);
 		raw_spin_lock(&wqe->lock);
+		if (unstalled) {
+			clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+			if (wq_has_sleeper(&wqe->wq->hash->wait))
+				wake_up(&wqe->wq->hash->wait);
+		}
 	}
 
 	return NULL;
@@ -564,8 +574,11 @@ static void io_worker_handle_work(struct io_worker *worker)
 				io_wqe_enqueue(wqe, linked);
 
 			if (hash != -1U && !next_hashed) {
+				/* serialize hash clear with wake_up() */
+				spin_lock_irq(&wq->hash->wait.lock);
 				clear_bit(hash, &wq->hash->map);
 				clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+				spin_unlock_irq(&wq->hash->wait.lock);
 				if (wq_has_sleeper(&wq->hash->wait))
 					wake_up(&wq->hash->wait);
 				raw_spin_lock(&wqe->lock);
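
For what it's worth, the last hunk is the classic lost-wakeup fix: the hash
bit is now cleared under the same lock the stalled worker takes to queue
itself and re-check the bit, so either the waiter sees the bit already clear,
or the clearing side sees the waiter on the queue and wakes it. A user-space
model of the resulting protocol (pthreads, invented names, not the kernel
code; a condition variable gives the queue-and-recheck atomicity that the
open-coded waitqueue needed the extra wait.lock for):

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wait_cond = PTHREAD_COND_INITIALIZER;
static bool hash_busy = true;   /* models one bit in wq->hash->map */

/* Stalled worker: queue up and re-check the bit under wait_lock,
 * the equivalent of io_wait_on_hash(). */
void wait_on_hash(void)
{
	pthread_mutex_lock(&wait_lock);
	while (hash_busy)
		pthread_cond_wait(&wait_cond, &wait_lock);  /* queue + sleep atomically */
	pthread_mutex_unlock(&wait_lock);
}

/* Worker finishing a hashed item: clear the bit under the same lock,
 * then wake anyone parked.  (The real code only calls wake_up() when
 * wq_has_sleeper() says there is someone to wake.) */
void clear_hash_and_wake(void)
{
	pthread_mutex_lock(&wait_lock);
	hash_busy = false;
	pthread_mutex_unlock(&wait_lock);
	pthread_cond_broadcast(&wait_cond);
}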

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-11 23:44                                 ` Jens Axboe
@ 2021-11-12  6:25                                   ` Daniel Black
  2021-11-12 19:19                                     ` Salvatore Bonaccorso
  2021-11-14 20:33                                   ` Daniel Black
  1 sibling, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-11-12  6:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 11/11/21 10:28 AM, Jens Axboe wrote:
> > On 11/11/21 9:55 AM, Jens Axboe wrote:
> >> On 11/11/21 9:19 AM, Jens Axboe wrote:
> >>> On 11/11/21 8:29 AM, Jens Axboe wrote:
> >>>> On 11/11/21 7:58 AM, Jens Axboe wrote:
> >>>>> On 11/11/21 7:30 AM, Jens Axboe wrote:
> >>>>>> On 11/10/21 11:52 PM, Daniel Black wrote:
> >>>>>>>> Would it be possible to turn this into a full reproducer script?
> >>>>>>>> Something that someone that knows nothing about mysqld/mariadb can just
> >>>>>>>> run and have it reproduce. If I install the 10.6 packages from above,
> >>>>>>>> then it doesn't seem to use io_uring or be linked against liburing.
> >>>>>>>
> >>>>>>> Sorry Jens.
> >>>>>>>
> >>>>>>> Hope containers are ok.
> >>>>>>
> >>>>>> Don't think I have a way to run that, don't even know what podman is
> >>>>>> and nor does my distro. I'll google a bit and see if I can get this
> >>>>>> running.
> >>>>>>
> >>>>>> I'm fine building from source and running from there, as long as I
> >>>>>> know what to do. Would that make it any easier? It definitely would
> >>>>>> for me :-)
> >>>>>
> >>>>> The podman approach seemed to work,

Thanks for bearing with it.

> >>>>> and I was able to run all three
> >>>>> steps. Didn't see any hangs. I'm going to try again dropping down
> >>>>> the innodb pool size (box only has 32G of RAM).
> >>>>>
> >>>>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
> >>>>> that up.

Good.

> >>>>>
> >>>>> Does your reproducer box have multiple NUMA nodes, or is it a single
> >>>>> socket/nod box?

It was NUMA. Pre 5.14.14 I could produce it on a simpler test on a single node.

> >>>>
> >>>> Doesn't seem to reproduce for me on current -git. What file system are
> >>>> you using?

Yes ext4.

> >>>
> >>> I seem to be able to hit it with ext4, guessing it has more cases that
> >>> punt to buffered IO. As I initially suspected, I think this is a race
> >>> with buffered file write hashing. I have a debug patch that just turns
> >>> a regular non-numa box into multi nodes, may or may not be needed be
> >>> needed to hit this, but I definitely can now. Looks like this:
> >>>
> >>> Node7 DUMP
> >>> index=0, nr_w=1, max=128, r=0, f=1, h=0
> >>>   w=ffff8f5e8b8470c0, hashed=1/0, flags=2
> >>>   w=ffff8f5e95a9b8c0, hashed=1/0, flags=2
> >>> index=1, nr_w=0, max=127877, r=0, f=0, h=0
> >>> free_list
> >>>   worker=ffff8f5eaf2e0540
> >>> all_list
> >>>   worker=ffff8f5eaf2e0540
> >>>
> >>> where we seed node7 in this case having two work items pending, but the
> >>> worker state is stalled on hash.
> >>>
> >>> The hash logic was rewritten as part of the io-wq worker threads being
> >>> changed for 5.11 iirc, which is why that was my initial suspicion here.
> >>>
> >>> I'll take a look at this and make a test patch. Looks like you are able
> >>> to test self-built kernels, is that correct?

I've been libreating prebuilt kernels, however on the path to self-built again.

Just searching for the holy penguin pee (from yaboot da(ze|ys)) to
peesign(sic) EFI kernels.
jk, working through docs:
https://docs.fedoraproject.org/en-US/quick-docs/kernel/build-custom-kernel/

> >> Can you try with this patch? It's against -git, but it will apply to
> >> 5.15 as well.
> >
> > I think that one covered one potential gap, but I just managed to
> > reproduce a stall even with it. So hang on testing that one, I'll send
> > you something more complete when I have confidence in it.
>
> Alright, give this one a go if you can. Against -git, but will apply to
> 5.15 as well.

Applied, built, attempting to boot....

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-12  6:25                                   ` Daniel Black
@ 2021-11-12 19:19                                     ` Salvatore Bonaccorso
  0 siblings, 0 replies; 36+ messages in thread
From: Salvatore Bonaccorso @ 2021-11-12 19:19 UTC (permalink / raw)
  To: Daniel Black; +Cc: Jens Axboe, Pavel Begunkov, linux-block, io-uring

Daniel,

On Fri, Nov 12, 2021 at 05:25:31PM +1100, Daniel Black wrote:
> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On 11/11/21 10:28 AM, Jens Axboe wrote:
> > > On 11/11/21 9:55 AM, Jens Axboe wrote:
> > >> On 11/11/21 9:19 AM, Jens Axboe wrote:
> > >>> On 11/11/21 8:29 AM, Jens Axboe wrote:
> > >>>> On 11/11/21 7:58 AM, Jens Axboe wrote:
> > >>>>> On 11/11/21 7:30 AM, Jens Axboe wrote:
> > >>>>>> On 11/10/21 11:52 PM, Daniel Black wrote:
> > >>>>>>>> Would it be possible to turn this into a full reproducer script?
> > >>>>>>>> Something that someone that knows nothing about mysqld/mariadb can just
> > >>>>>>>> run and have it reproduce. If I install the 10.6 packages from above,
> > >>>>>>>> then it doesn't seem to use io_uring or be linked against liburing.
> > >>>>>>>
> > >>>>>>> Sorry Jens.
> > >>>>>>>
> > >>>>>>> Hope containers are ok.
> > >>>>>>
> > >>>>>> Don't think I have a way to run that, don't even know what podman is
> > >>>>>> and nor does my distro. I'll google a bit and see if I can get this
> > >>>>>> running.
> > >>>>>>
> > >>>>>> I'm fine building from source and running from there, as long as I
> > >>>>>> know what to do. Would that make it any easier? It definitely would
> > >>>>>> for me :-)
> > >>>>>
> > >>>>> The podman approach seemed to work,
> 
> Thanks for bearing with it.
> 
> > >>>>> and I was able to run all three
> > >>>>> steps. Didn't see any hangs. I'm going to try again dropping down
> > >>>>> the innodb pool size (box only has 32G of RAM).
> > >>>>>
> > >>>>> The storage can do a lot more than 5k IOPS, I'm going to try ramping
> > >>>>> that up.
> 
> Good.
> 
> > >>>>>
> > >>>>> Does your reproducer box have multiple NUMA nodes, or is it a single
> > >>>>> socket/nod box?
> 
> It was NUMA. Pre 5.14.14 I could produce it on a simpler test on a single node.
> 
> > >>>>
> > >>>> Doesn't seem to reproduce for me on current -git. What file system are
> > >>>> you using?
> 
> Yes ext4.
> 
> > >>>
> > >>> I seem to be able to hit it with ext4, guessing it has more cases that
> > >>> punt to buffered IO. As I initially suspected, I think this is a race
> > >>> with buffered file write hashing. I have a debug patch that just turns
> > >>> a regular non-numa box into multi nodes, may or may not be needed be
> > >>> needed to hit this, but I definitely can now. Looks like this:
> > >>>
> > >>> Node7 DUMP
> > >>> index=0, nr_w=1, max=128, r=0, f=1, h=0
> > >>>   w=ffff8f5e8b8470c0, hashed=1/0, flags=2
> > >>>   w=ffff8f5e95a9b8c0, hashed=1/0, flags=2
> > >>> index=1, nr_w=0, max=127877, r=0, f=0, h=0
> > >>> free_list
> > >>>   worker=ffff8f5eaf2e0540
> > >>> all_list
> > >>>   worker=ffff8f5eaf2e0540
> > >>>
> > >>> where we seed node7 in this case having two work items pending, but the
> > >>> worker state is stalled on hash.
> > >>>
> > >>> The hash logic was rewritten as part of the io-wq worker threads being
> > >>> changed for 5.11 iirc, which is why that was my initial suspicion here.
> > >>>
> > >>> I'll take a look at this and make a test patch. Looks like you are able
> > >>> to test self-built kernels, is that correct?
> 
> I've been libreating prebuilt kernels, however on the path to self-built again.
> 
> Just searching for the holy penguin pee (from yaboot da(ze|ys)) to
> peesign(sic) EFI kernels.
> jk, working through docs:
> https://docs.fedoraproject.org/en-US/quick-docs/kernel/build-custom-kernel/
> 
> > >> Can you try with this patch? It's against -git, but it will apply to
> > >> 5.15 as well.
> > >
> > > I think that one covered one potential gap, but I just managed to
> > > reproduce a stall even with it. So hang on testing that one, I'll send
> > > you something more complete when I have confidence in it.
> >
> > Alright, give this one a go if you can. Against -git, but will apply to
> > 5.15 as well.
> 
> Applied, built, attempting to boot....

If you want to do the same for Debian based system, the following
might help to get the package built:

https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2

Otherwise I might be able to provide you with a prebuilt package containing
the patch (unsigned, though; it is best if you build and test it directly).

Regards,
Salvatore

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-11 23:44                                 ` Jens Axboe
  2021-11-12  6:25                                   ` Daniel Black
@ 2021-11-14 20:33                                   ` Daniel Black
  2021-11-14 20:55                                     ` Jens Axboe
  1 sibling, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-11-14 20:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Alright, give this one a go if you can. Against -git, but will apply to
> 5.15 as well.


Works. Thank you very much.

https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599

Tested-by: Marko Mäkelä <marko.makela@mariadb.com>


>
>
> diff --git a/fs/io-wq.c b/fs/io-wq.c
> index afd955d53db9..88202de519f6 100644
> --- a/fs/io-wq.c
> +++ b/fs/io-wq.c
> @@ -423,9 +423,10 @@ static inline unsigned int io_get_work_hash(struct io_wq_work *work)
>         return work->flags >> IO_WQ_HASH_SHIFT;
>  }
>
> -static void io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
> +static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
>  {
>         struct io_wq *wq = wqe->wq;
> +       bool ret = false;
>
>         spin_lock_irq(&wq->hash->wait.lock);
>         if (list_empty(&wqe->wait.entry)) {
> @@ -433,9 +434,11 @@ static void io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
>                 if (!test_bit(hash, &wq->hash->map)) {
>                         __set_current_state(TASK_RUNNING);
>                         list_del_init(&wqe->wait.entry);
> +                       ret = true;
>                 }
>         }
>         spin_unlock_irq(&wq->hash->wait.lock);
> +       return ret;
>  }
>
>  static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
> @@ -475,14 +478,21 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
>         }
>
>         if (stall_hash != -1U) {
> +               bool unstalled;
> +
>                 /*
>                  * Set this before dropping the lock to avoid racing with new
>                  * work being added and clearing the stalled bit.
>                  */
>                 set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
>                 raw_spin_unlock(&wqe->lock);
> -               io_wait_on_hash(wqe, stall_hash);
> +               unstalled = io_wait_on_hash(wqe, stall_hash);
>                 raw_spin_lock(&wqe->lock);
> +               if (unstalled) {
> +                       clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
> +                       if (wq_has_sleeper(&wqe->wq->hash->wait))
> +                               wake_up(&wqe->wq->hash->wait);
> +               }
>         }
>
>         return NULL;
> @@ -564,8 +574,11 @@ static void io_worker_handle_work(struct io_worker *worker)
>                                 io_wqe_enqueue(wqe, linked);
>
>                         if (hash != -1U && !next_hashed) {
> +                               /* serialize hash clear with wake_up() */
> +                               spin_lock_irq(&wq->hash->wait.lock);
>                                 clear_bit(hash, &wq->hash->map);
>                                 clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
> +                               spin_unlock_irq(&wq->hash->wait.lock);
>                                 if (wq_has_sleeper(&wq->hash->wait))
>                                         wake_up(&wq->hash->wait);
>                                 raw_spin_lock(&wqe->lock);
>
> --
> Jens Axboe
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-14 20:33                                   ` Daniel Black
@ 2021-11-14 20:55                                     ` Jens Axboe
  2021-11-14 21:02                                       ` Salvatore Bonaccorso
  2021-11-24  3:27                                       ` Daniel Black
  0 siblings, 2 replies; 36+ messages in thread
From: Jens Axboe @ 2021-11-14 20:55 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On 11/14/21 1:33 PM, Daniel Black wrote:
> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Alright, give this one a go if you can. Against -git, but will apply to
>> 5.15 as well.
> 
> 
> Works. Thank you very much.
> 
> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
> 
> Tested-by: Marko Mäkelä <marko.makela@mariadb.com>

Awesome, thanks so much for reporting and testing. All bugs are shallow
when given a reproducer, that certainly helped a ton in figuring out
what this was and nailing a fix.

The patch is already upstream (and in the 5.15 stable queue), and I
provided 5.14 patches too.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-14 20:55                                     ` Jens Axboe
@ 2021-11-14 21:02                                       ` Salvatore Bonaccorso
  2021-11-14 21:03                                         ` Jens Axboe
  2021-11-24  3:27                                       ` Daniel Black
  1 sibling, 1 reply; 36+ messages in thread
From: Salvatore Bonaccorso @ 2021-11-14 21:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Daniel Black, Pavel Begunkov, linux-block, io-uring

Hi,

On Sun, Nov 14, 2021 at 01:55:20PM -0700, Jens Axboe wrote:
> On 11/14/21 1:33 PM, Daniel Black wrote:
> > On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> Alright, give this one a go if you can. Against -git, but will apply to
> >> 5.15 as well.
> > 
> > 
> > Works. Thank you very much.
> > 
> > https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
> > 
> > Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
> 
> Awesome, thanks so much for reporting and testing. All bugs are shallow
> when given a reproducer, that certainly helped a ton in figuring out
> what this was and nailing a fix.
> 
> The patch is already upstream (and in the 5.15 stable queue), and I
> provided 5.14 patches too.

FTR, I cherry-picked as well the respective commit for Debian's upload
of 5.15.2-1~exp1 to experimental as
https://salsa.debian.org/kernel-team/linux/-/commit/657413869fa29b97ec886cf62a420ab43b935fff
.

Regards,
Salvatore

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-14 21:02                                       ` Salvatore Bonaccorso
@ 2021-11-14 21:03                                         ` Jens Axboe
  0 siblings, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2021-11-14 21:03 UTC (permalink / raw)
  To: Salvatore Bonaccorso; +Cc: Daniel Black, Pavel Begunkov, linux-block, io-uring

On 11/14/21 2:02 PM, Salvatore Bonaccorso wrote:
> Hi,
> 
> On Sun, Nov 14, 2021 at 01:55:20PM -0700, Jens Axboe wrote:
>> On 11/14/21 1:33 PM, Daniel Black wrote:
>>> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> Alright, give this one a go if you can. Against -git, but will apply to
>>>> 5.15 as well.
>>>
>>>
>>> Works. Thank you very much.
>>>
>>> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
>>>
>>> Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
>>
>> Awesome, thanks so much for reporting and testing. All bugs are shallow
>> when given a reproducer, that certainly helped a ton in figuring out
>> what this was and nailing a fix.
>>
>> The patch is already upstream (and in the 5.15 stable queue), and I
>> provided 5.14 patches too.
> 
> FTR, I cherry-picked as well the respective commit for Debian's upload
> of 5.15.2-1~exp1 to experimental as
> https://salsa.debian.org/kernel-team/linux/-/commit/657413869fa29b97ec886cf62a420ab43b935fff

Great thanks, you're beating stable :-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-14 20:55                                     ` Jens Axboe
  2021-11-14 21:02                                       ` Salvatore Bonaccorso
@ 2021-11-24  3:27                                       ` Daniel Black
  2021-11-24 15:28                                         ` Jens Axboe
  1 sibling, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-11-24  3:27 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 11/14/21 1:33 PM, Daniel Black wrote:
> > On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> Alright, give this one a go if you can. Against -git, but will apply to
> >> 5.15 as well.
> >
> >
> > Works. Thank you very much.
> >
> > https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
> >
> > Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
>
> The patch is already upstream (and in the 5.15 stable queue), and I
> provided 5.14 patches too.

Jens,

I'm getting the same reproducer on 5.14.20
(https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
backport change logs indicate 5.14.19 has the patch.

Anything missing?

ext4 again (my mount is /dev/mapper/fedora_localhost--live-home on
/home type ext4 (rw,relatime,seclabel)).

The previous container should work, though a source option is also there:

build deps: liburing-dev, bison, libevent-dev, ncurses-dev, c++
libraries/compiler

git clone --branch 10.6 --single-branch \
    https://github.com/MariaDB/server mariadb-server
(cd mariadb-server; git submodule update --init --recursive)
mkdir build-mariadb-server
cd build-mariadb-server
cmake -DPLUGIN_{MROONGA,ROCKSDB,CONNECT,SPIDER,SPHINX,S3,COLUMNSTORE}=NO \
    ../mariadb-server
(ensure liburing userspace is picked up)
cmake --build . --parallel
mysql-test/mtr --mysqld=--innodb_use_native_aio=1 --nowarnings \
    --parallel=4 --force encryption.innochecksum{,,,,,}

Adding to mtr: --mysqld=--innodb_io_capacity=50000
--mysqld=--innodb_io_capacity_max=90000 will probably trip this
quicker.


5.15.3 is good (https://jira.mariadb.org/browse/MDEV-26674?focusedCommentId=206787&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-206787).

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24  3:27                                       ` Daniel Black
@ 2021-11-24 15:28                                         ` Jens Axboe
  2021-11-24 16:10                                           ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-24 15:28 UTC (permalink / raw)
  To: Daniel Black; +Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

On 11/23/21 8:27 PM, Daniel Black wrote:
> On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 11/14/21 1:33 PM, Daniel Black wrote:
>>> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> Alright, give this one a go if you can. Against -git, but will apply to
>>>> 5.15 as well.
>>>
>>>
>>> Works. Thank you very much.
>>>
>>> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
>>>
>>> Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
>>
>> The patch is already upstream (and in the 5.15 stable queue), and I
>> provided 5.14 patches too.
> 
> Jens,
> 
> I'm getting the same reproducer on 5.14.20
> (https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
> backport change logs indicate 5.14.19 has the patch.
> 
> Anything missing?

We might also need another patch that isn't in stable, I'm attaching
it here. Any chance you can run 5.14.20/21 with this applied? If not,
I'll do some sanity checking here and push it to -stable.

-- 
Jens Axboe


[-- Attachment #2: 0001-io-wq-split-bounded-and-unbounded-work-into-separate.patch --]
[-- Type: text/x-patch, Size: 13384 bytes --]

From 99e6a29dbda79e5e050be1ffd38dd36622f61af5 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 24 Nov 2021 08:26:11 -0700
Subject: [PATCH] io-wq: split bounded and unbounded work into separate lists

commit f95dc207b93da9c88ddbb7741ec3730c6657b88e upstream.

We've got a few issues that all boil down to the fact that we have one
list of pending work items, yet two different types of workers to
serve them. This causes some oddities around workers switching type and
even hashed work vs regular work on the same bounded list.

Just separate them out cleanly, similarly to how we already do
accounting of what is running. That provides a clean separation and
removes some corner cases that can cause stalls when handling IO
that is punted to io-wq.

Fixes: ecc53c48c13d ("io-wq: check max_worker limits if a worker transitions bound state")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io-wq.c | 156 +++++++++++++++++++++++------------------------------
 1 file changed, 68 insertions(+), 88 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 0890d85ba285..7d63299b4776 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -32,7 +32,7 @@ enum {
 };
 
 enum {
-	IO_WQE_FLAG_STALLED	= 1,	/* stalled on hash */
+	IO_ACCT_STALLED_BIT	= 0,	/* stalled on hash */
 };
 
 /*
@@ -71,25 +71,24 @@ struct io_wqe_acct {
 	unsigned max_workers;
 	int index;
 	atomic_t nr_running;
+	struct io_wq_work_list work_list;
+	unsigned long flags;
 };
 
 enum {
 	IO_WQ_ACCT_BOUND,
 	IO_WQ_ACCT_UNBOUND,
+	IO_WQ_ACCT_NR,
 };
 
 /*
  * Per-node worker thread pool
  */
 struct io_wqe {
-	struct {
-		raw_spinlock_t lock;
-		struct io_wq_work_list work_list;
-		unsigned flags;
-	} ____cacheline_aligned_in_smp;
+	raw_spinlock_t lock;
+	struct io_wqe_acct acct[2];
 
 	int node;
-	struct io_wqe_acct acct[2];
 
 	struct hlist_nulls_head free_list;
 	struct list_head all_list;
@@ -195,11 +194,10 @@ static void io_worker_exit(struct io_worker *worker)
 	do_exit(0);
 }
 
-static inline bool io_wqe_run_queue(struct io_wqe *wqe)
-	__must_hold(wqe->lock)
+static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
 {
-	if (!wq_list_empty(&wqe->work_list) &&
-	    !(wqe->flags & IO_WQE_FLAG_STALLED))
+	if (!wq_list_empty(&acct->work_list) &&
+	    !test_bit(IO_ACCT_STALLED_BIT, &acct->flags))
 		return true;
 	return false;
 }
@@ -208,7 +206,8 @@ static inline bool io_wqe_run_queue(struct io_wqe *wqe)
  * Check head of free list for an available worker. If one isn't available,
  * caller must create one.
  */
-static bool io_wqe_activate_free_worker(struct io_wqe *wqe)
+static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
+					struct io_wqe_acct *acct)
 	__must_hold(RCU)
 {
 	struct hlist_nulls_node *n;
@@ -222,6 +221,10 @@ static bool io_wqe_activate_free_worker(struct io_wqe *wqe)
 	hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
 		if (!io_worker_get(worker))
 			continue;
+		if (io_wqe_get_acct(worker) != acct) {
+			io_worker_release(worker);
+			continue;
+		}
 		if (wake_up_process(worker->task)) {
 			io_worker_release(worker);
 			return true;
@@ -340,7 +343,7 @@ static void io_wqe_dec_running(struct io_worker *worker)
 	if (!(worker->flags & IO_WORKER_F_UP))
 		return;
 
-	if (atomic_dec_and_test(&acct->nr_running) && io_wqe_run_queue(wqe)) {
+	if (atomic_dec_and_test(&acct->nr_running) && io_acct_run_queue(acct)) {
 		atomic_inc(&acct->nr_running);
 		atomic_inc(&wqe->wq->worker_refs);
 		io_queue_worker_create(wqe, worker, acct);
@@ -355,29 +358,10 @@ static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker,
 			     struct io_wq_work *work)
 	__must_hold(wqe->lock)
 {
-	bool worker_bound, work_bound;
-
-	BUILD_BUG_ON((IO_WQ_ACCT_UNBOUND ^ IO_WQ_ACCT_BOUND) != 1);
-
 	if (worker->flags & IO_WORKER_F_FREE) {
 		worker->flags &= ~IO_WORKER_F_FREE;
 		hlist_nulls_del_init_rcu(&worker->nulls_node);
 	}
-
-	/*
-	 * If worker is moving from bound to unbound (or vice versa), then
-	 * ensure we update the running accounting.
-	 */
-	worker_bound = (worker->flags & IO_WORKER_F_BOUND) != 0;
-	work_bound = (work->flags & IO_WQ_WORK_UNBOUND) == 0;
-	if (worker_bound != work_bound) {
-		int index = work_bound ? IO_WQ_ACCT_UNBOUND : IO_WQ_ACCT_BOUND;
-		io_wqe_dec_running(worker);
-		worker->flags ^= IO_WORKER_F_BOUND;
-		wqe->acct[index].nr_workers--;
-		wqe->acct[index ^ 1].nr_workers++;
-		io_wqe_inc_running(worker);
-	 }
 }
 
 /*
@@ -419,44 +403,23 @@ static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 	return ret;
 }
 
-/*
- * We can always run the work if the worker is currently the same type as
- * the work (eg both are bound, or both are unbound). If they are not the
- * same, only allow it if incrementing the worker count would be allowed.
- */
-static bool io_worker_can_run_work(struct io_worker *worker,
-				   struct io_wq_work *work)
-{
-	struct io_wqe_acct *acct;
-
-	if (!(worker->flags & IO_WORKER_F_BOUND) !=
-	    !(work->flags & IO_WQ_WORK_UNBOUND))
-		return true;
-
-	/* not the same type, check if we'd go over the limit */
-	acct = io_work_get_acct(worker->wqe, work);
-	return acct->nr_workers < acct->max_workers;
-}
-
-static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
+static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 					   struct io_worker *worker)
 	__must_hold(wqe->lock)
 {
 	struct io_wq_work_node *node, *prev;
 	struct io_wq_work *work, *tail;
 	unsigned int stall_hash = -1U;
+	struct io_wqe *wqe = worker->wqe;
 
-	wq_list_for_each(node, prev, &wqe->work_list) {
+	wq_list_for_each(node, prev, &acct->work_list) {
 		unsigned int hash;
 
 		work = container_of(node, struct io_wq_work, list);
 
-		if (!io_worker_can_run_work(worker, work))
-			break;
-
 		/* not hashed, can run anytime */
 		if (!io_wq_is_hashed(work)) {
-			wq_list_del(&wqe->work_list, node, prev);
+			wq_list_del(&acct->work_list, node, prev);
 			return work;
 		}
 
@@ -467,7 +430,7 @@ static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
 		/* hashed, can run if not already running */
 		if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
 			wqe->hash_tail[hash] = NULL;
-			wq_list_cut(&wqe->work_list, &tail->list, prev);
+			wq_list_cut(&acct->work_list, &tail->list, prev);
 			return work;
 		}
 		if (stall_hash == -1U)
@@ -483,12 +446,12 @@ static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
 		 * Set this before dropping the lock to avoid racing with new
 		 * work being added and clearing the stalled bit.
 		 */
-		wqe->flags |= IO_WQE_FLAG_STALLED;
+		set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 		raw_spin_unlock(&wqe->lock);
 		unstalled = io_wait_on_hash(wqe, stall_hash);
 		raw_spin_lock(&wqe->lock);
 		if (unstalled) {
-			wqe->flags &= ~IO_WQE_FLAG_STALLED;
+			clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 			if (wq_has_sleeper(&wqe->wq->hash->wait))
 				wake_up(&wqe->wq->hash->wait);
 		}
@@ -525,6 +488,7 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
 static void io_worker_handle_work(struct io_worker *worker)
 	__releases(wqe->lock)
 {
+	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
 	bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
@@ -539,7 +503,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 		 * can't make progress, any work completion or insertion will
 		 * clear the stalled flag.
 		 */
-		work = io_get_next_work(wqe, worker);
+		work = io_get_next_work(acct, worker);
 		if (work)
 			__io_worker_busy(wqe, worker, work);
 
@@ -575,7 +539,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 				/* serialize hash clear with wake_up() */
 				spin_lock_irq(&wq->hash->wait.lock);
 				clear_bit(hash, &wq->hash->map);
-				wqe->flags &= ~IO_WQE_FLAG_STALLED;
+				clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 				spin_unlock_irq(&wq->hash->wait.lock);
 				if (wq_has_sleeper(&wq->hash->wait))
 					wake_up(&wq->hash->wait);
@@ -594,6 +558,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 static int io_wqe_worker(void *data)
 {
 	struct io_worker *worker = data;
+	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
 	char buf[TASK_COMM_LEN];
@@ -609,7 +574,7 @@ static int io_wqe_worker(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 loop:
 		raw_spin_lock_irq(&wqe->lock);
-		if (io_wqe_run_queue(wqe)) {
+		if (io_acct_run_queue(acct)) {
 			io_worker_handle_work(worker);
 			goto loop;
 		}
@@ -777,12 +742,13 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
 
 static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
 {
+	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
 	unsigned int hash;
 	struct io_wq_work *tail;
 
 	if (!io_wq_is_hashed(work)) {
 append:
-		wq_list_add_tail(&work->list, &wqe->work_list);
+		wq_list_add_tail(&work->list, &acct->work_list);
 		return;
 	}
 
@@ -792,7 +758,7 @@ static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
 	if (!tail)
 		goto append;
 
-	wq_list_add_after(&work->list, &tail->list, &wqe->work_list);
+	wq_list_add_after(&work->list, &tail->list, &acct->work_list);
 }
 
 static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
@@ -814,10 +780,10 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 
 	raw_spin_lock_irqsave(&wqe->lock, flags);
 	io_wqe_insert_work(wqe, work);
-	wqe->flags &= ~IO_WQE_FLAG_STALLED;
+	clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 
 	rcu_read_lock();
-	do_create = !io_wqe_activate_free_worker(wqe);
+	do_create = !io_wqe_activate_free_worker(wqe, acct);
 	rcu_read_unlock();
 
 	raw_spin_unlock_irqrestore(&wqe->lock, flags);
@@ -870,6 +836,7 @@ static inline void io_wqe_remove_pending(struct io_wqe *wqe,
 					 struct io_wq_work *work,
 					 struct io_wq_work_node *prev)
 {
+	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
 	unsigned int hash = io_get_work_hash(work);
 	struct io_wq_work *prev_work = NULL;
 
@@ -881,7 +848,7 @@ static inline void io_wqe_remove_pending(struct io_wqe *wqe,
 		else
 			wqe->hash_tail[hash] = NULL;
 	}
-	wq_list_del(&wqe->work_list, &work->list, prev);
+	wq_list_del(&acct->work_list, &work->list, prev);
 }
 
 static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
@@ -890,22 +857,27 @@ static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
 	struct io_wq_work_node *node, *prev;
 	struct io_wq_work *work;
 	unsigned long flags;
+	int i;
 
 retry:
 	raw_spin_lock_irqsave(&wqe->lock, flags);
-	wq_list_for_each(node, prev, &wqe->work_list) {
-		work = container_of(node, struct io_wq_work, list);
-		if (!match->fn(work, match->data))
-			continue;
-		io_wqe_remove_pending(wqe, work, prev);
-		raw_spin_unlock_irqrestore(&wqe->lock, flags);
-		io_run_cancel(work, wqe);
-		match->nr_pending++;
-		if (!match->cancel_all)
-			return;
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
 
-		/* not safe to continue after unlock */
-		goto retry;
+		wq_list_for_each(node, prev, &acct->work_list) {
+			work = container_of(node, struct io_wq_work, list);
+			if (!match->fn(work, match->data))
+				continue;
+			io_wqe_remove_pending(wqe, work, prev);
+			raw_spin_unlock_irqrestore(&wqe->lock, flags);
+			io_run_cancel(work, wqe);
+			match->nr_pending++;
+			if (!match->cancel_all)
+				return;
+
+			/* not safe to continue after unlock */
+			goto retry;
+		}
 	}
 	raw_spin_unlock_irqrestore(&wqe->lock, flags);
 }
@@ -966,18 +938,24 @@ static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
 			    int sync, void *key)
 {
 	struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
+	int i;
 
 	list_del_init(&wait->entry);
 
 	rcu_read_lock();
-	io_wqe_activate_free_worker(wqe);
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		struct io_wqe_acct *acct = &wqe->acct[i];
+
+		if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
+			io_wqe_activate_free_worker(wqe, acct);
+	}
 	rcu_read_unlock();
 	return 1;
 }
 
 struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 {
-	int ret, node;
+	int ret, node, i;
 	struct io_wq *wq;
 
 	if (WARN_ON_ONCE(!data->free_work || !data->do_work))
@@ -1012,18 +990,20 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 		cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
 		wq->wqes[node] = wqe;
 		wqe->node = alloc_node;
-		wqe->acct[IO_WQ_ACCT_BOUND].index = IO_WQ_ACCT_BOUND;
-		wqe->acct[IO_WQ_ACCT_UNBOUND].index = IO_WQ_ACCT_UNBOUND;
 		wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
-		atomic_set(&wqe->acct[IO_WQ_ACCT_BOUND].nr_running, 0);
 		wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
 					task_rlimit(current, RLIMIT_NPROC);
-		atomic_set(&wqe->acct[IO_WQ_ACCT_UNBOUND].nr_running, 0);
-		wqe->wait.func = io_wqe_hash_wake;
 		INIT_LIST_HEAD(&wqe->wait.entry);
+		wqe->wait.func = io_wqe_hash_wake;
+		for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+			struct io_wqe_acct *acct = &wqe->acct[i];
+
+			acct->index = i;
+			atomic_set(&acct->nr_running, 0);
+			INIT_WQ_LIST(&acct->work_list);
+		}
 		wqe->wq = wq;
 		raw_spin_lock_init(&wqe->lock);
-		INIT_WQ_LIST(&wqe->work_list);
 		INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
 		INIT_LIST_HEAD(&wqe->all_list);
 	}
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24 15:28                                         ` Jens Axboe
@ 2021-11-24 16:10                                           ` Jens Axboe
  2021-11-24 16:18                                             ` Greg Kroah-Hartman
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-24 16:10 UTC (permalink / raw)
  To: Daniel Black
  Cc: Salvatore Bonaccorso, Pavel Begunkov, linux-block, io-uring,
	stable, Greg Kroah-Hartman

[-- Attachment #1: Type: text/plain, Size: 1265 bytes --]

On 11/24/21 8:28 AM, Jens Axboe wrote:
> On 11/23/21 8:27 PM, Daniel Black wrote:
>> On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 11/14/21 1:33 PM, Daniel Black wrote:
>>>> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> Alright, give this one a go if you can. Against -git, but will apply to
>>>>> 5.15 as well.
>>>>
>>>>
>>>> Works. Thank you very much.
>>>>
>>>> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
>>>>
>>>> Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
>>>
>>> The patch is already upstream (and in the 5.15 stable queue), and I
>>> provided 5.14 patches too.
>>
>> Jens,
>>
>> I'm getting the same reproducer on 5.14.20
>> (https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
>> backport change logs indicate 5.14.19 has the patch.
>>
>> Anything missing?
> 
> We might also need another patch that isn't in stable, I'm attaching
> it here. Any chance you can run 5.14.20/21 with this applied? If not,
> I'll do some sanity checking here and push it to -stable.

Looks good to me - Greg, would you mind queueing this up for
5.14-stable?

-- 
Jens Axboe


[-- Attachment #2: 0001-io-wq-split-bounded-and-unbounded-work-into-separate.patch --]
[-- Type: text/x-patch, Size: 13384 bytes --]

From 99e6a29dbda79e5e050be1ffd38dd36622f61af5 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 24 Nov 2021 08:26:11 -0700
Subject: [PATCH] io-wq: split bounded and unbounded work into separate lists

commit f95dc207b93da9c88ddbb7741ec3730c6657b88e upstream.

We've got a few issues that all boil down to the fact that we have one
list of pending work items, yet two different types of workers to
serve them. This causes some oddities around workers switching type and
even hashed work vs regular work on the same bounded list.

Just separate them out cleanly, similarly to how we already do
accounting of what is running. That provides a clean separation and
removes some corner cases that can cause stalls when handling IO
that is punted to io-wq.

Fixes: ecc53c48c13d ("io-wq: check max_worker limits if a worker transitions bound state")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io-wq.c | 156 +++++++++++++++++++++++------------------------------
 1 file changed, 68 insertions(+), 88 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 0890d85ba285..7d63299b4776 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -32,7 +32,7 @@ enum {
 };
 
 enum {
-	IO_WQE_FLAG_STALLED	= 1,	/* stalled on hash */
+	IO_ACCT_STALLED_BIT	= 0,	/* stalled on hash */
 };
 
 /*
@@ -71,25 +71,24 @@ struct io_wqe_acct {
 	unsigned max_workers;
 	int index;
 	atomic_t nr_running;
+	struct io_wq_work_list work_list;
+	unsigned long flags;
 };
 
 enum {
 	IO_WQ_ACCT_BOUND,
 	IO_WQ_ACCT_UNBOUND,
+	IO_WQ_ACCT_NR,
 };
 
 /*
  * Per-node worker thread pool
  */
 struct io_wqe {
-	struct {
-		raw_spinlock_t lock;
-		struct io_wq_work_list work_list;
-		unsigned flags;
-	} ____cacheline_aligned_in_smp;
+	raw_spinlock_t lock;
+	struct io_wqe_acct acct[2];
 
 	int node;
-	struct io_wqe_acct acct[2];
 
 	struct hlist_nulls_head free_list;
 	struct list_head all_list;
@@ -195,11 +194,10 @@ static void io_worker_exit(struct io_worker *worker)
 	do_exit(0);
 }
 
-static inline bool io_wqe_run_queue(struct io_wqe *wqe)
-	__must_hold(wqe->lock)
+static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
 {
-	if (!wq_list_empty(&wqe->work_list) &&
-	    !(wqe->flags & IO_WQE_FLAG_STALLED))
+	if (!wq_list_empty(&acct->work_list) &&
+	    !test_bit(IO_ACCT_STALLED_BIT, &acct->flags))
 		return true;
 	return false;
 }
@@ -208,7 +206,8 @@ static inline bool io_wqe_run_queue(struct io_wqe *wqe)
  * Check head of free list for an available worker. If one isn't available,
  * caller must create one.
  */
-static bool io_wqe_activate_free_worker(struct io_wqe *wqe)
+static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
+					struct io_wqe_acct *acct)
 	__must_hold(RCU)
 {
 	struct hlist_nulls_node *n;
@@ -222,6 +221,10 @@ static bool io_wqe_activate_free_worker(struct io_wqe *wqe)
 	hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
 		if (!io_worker_get(worker))
 			continue;
+		if (io_wqe_get_acct(worker) != acct) {
+			io_worker_release(worker);
+			continue;
+		}
 		if (wake_up_process(worker->task)) {
 			io_worker_release(worker);
 			return true;
@@ -340,7 +343,7 @@ static void io_wqe_dec_running(struct io_worker *worker)
 	if (!(worker->flags & IO_WORKER_F_UP))
 		return;
 
-	if (atomic_dec_and_test(&acct->nr_running) && io_wqe_run_queue(wqe)) {
+	if (atomic_dec_and_test(&acct->nr_running) && io_acct_run_queue(acct)) {
 		atomic_inc(&acct->nr_running);
 		atomic_inc(&wqe->wq->worker_refs);
 		io_queue_worker_create(wqe, worker, acct);
@@ -355,29 +358,10 @@ static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker,
 			     struct io_wq_work *work)
 	__must_hold(wqe->lock)
 {
-	bool worker_bound, work_bound;
-
-	BUILD_BUG_ON((IO_WQ_ACCT_UNBOUND ^ IO_WQ_ACCT_BOUND) != 1);
-
 	if (worker->flags & IO_WORKER_F_FREE) {
 		worker->flags &= ~IO_WORKER_F_FREE;
 		hlist_nulls_del_init_rcu(&worker->nulls_node);
 	}
-
-	/*
-	 * If worker is moving from bound to unbound (or vice versa), then
-	 * ensure we update the running accounting.
-	 */
-	worker_bound = (worker->flags & IO_WORKER_F_BOUND) != 0;
-	work_bound = (work->flags & IO_WQ_WORK_UNBOUND) == 0;
-	if (worker_bound != work_bound) {
-		int index = work_bound ? IO_WQ_ACCT_UNBOUND : IO_WQ_ACCT_BOUND;
-		io_wqe_dec_running(worker);
-		worker->flags ^= IO_WORKER_F_BOUND;
-		wqe->acct[index].nr_workers--;
-		wqe->acct[index ^ 1].nr_workers++;
-		io_wqe_inc_running(worker);
-	 }
 }
 
 /*
@@ -419,44 +403,23 @@ static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 	return ret;
 }
 
-/*
- * We can always run the work if the worker is currently the same type as
- * the work (eg both are bound, or both are unbound). If they are not the
- * same, only allow it if incrementing the worker count would be allowed.
- */
-static bool io_worker_can_run_work(struct io_worker *worker,
-				   struct io_wq_work *work)
-{
-	struct io_wqe_acct *acct;
-
-	if (!(worker->flags & IO_WORKER_F_BOUND) !=
-	    !(work->flags & IO_WQ_WORK_UNBOUND))
-		return true;
-
-	/* not the same type, check if we'd go over the limit */
-	acct = io_work_get_acct(worker->wqe, work);
-	return acct->nr_workers < acct->max_workers;
-}
-
-static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
+static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 					   struct io_worker *worker)
 	__must_hold(wqe->lock)
 {
 	struct io_wq_work_node *node, *prev;
 	struct io_wq_work *work, *tail;
 	unsigned int stall_hash = -1U;
+	struct io_wqe *wqe = worker->wqe;
 
-	wq_list_for_each(node, prev, &wqe->work_list) {
+	wq_list_for_each(node, prev, &acct->work_list) {
 		unsigned int hash;
 
 		work = container_of(node, struct io_wq_work, list);
 
-		if (!io_worker_can_run_work(worker, work))
-			break;
-
 		/* not hashed, can run anytime */
 		if (!io_wq_is_hashed(work)) {
-			wq_list_del(&wqe->work_list, node, prev);
+			wq_list_del(&acct->work_list, node, prev);
 			return work;
 		}
 
@@ -467,7 +430,7 @@ static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
 		/* hashed, can run if not already running */
 		if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
 			wqe->hash_tail[hash] = NULL;
-			wq_list_cut(&wqe->work_list, &tail->list, prev);
+			wq_list_cut(&acct->work_list, &tail->list, prev);
 			return work;
 		}
 		if (stall_hash == -1U)
@@ -483,12 +446,12 @@ static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
 		 * Set this before dropping the lock to avoid racing with new
 		 * work being added and clearing the stalled bit.
 		 */
-		wqe->flags |= IO_WQE_FLAG_STALLED;
+		set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 		raw_spin_unlock(&wqe->lock);
 		unstalled = io_wait_on_hash(wqe, stall_hash);
 		raw_spin_lock(&wqe->lock);
 		if (unstalled) {
-			wqe->flags &= ~IO_WQE_FLAG_STALLED;
+			clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 			if (wq_has_sleeper(&wqe->wq->hash->wait))
 				wake_up(&wqe->wq->hash->wait);
 		}
@@ -525,6 +488,7 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
 static void io_worker_handle_work(struct io_worker *worker)
 	__releases(wqe->lock)
 {
+	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
 	bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
@@ -539,7 +503,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 		 * can't make progress, any work completion or insertion will
 		 * clear the stalled flag.
 		 */
-		work = io_get_next_work(wqe, worker);
+		work = io_get_next_work(acct, worker);
 		if (work)
 			__io_worker_busy(wqe, worker, work);
 
@@ -575,7 +539,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 				/* serialize hash clear with wake_up() */
 				spin_lock_irq(&wq->hash->wait.lock);
 				clear_bit(hash, &wq->hash->map);
-				wqe->flags &= ~IO_WQE_FLAG_STALLED;
+				clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 				spin_unlock_irq(&wq->hash->wait.lock);
 				if (wq_has_sleeper(&wq->hash->wait))
 					wake_up(&wq->hash->wait);
@@ -594,6 +558,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 static int io_wqe_worker(void *data)
 {
 	struct io_worker *worker = data;
+	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
 	char buf[TASK_COMM_LEN];
@@ -609,7 +574,7 @@ static int io_wqe_worker(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 loop:
 		raw_spin_lock_irq(&wqe->lock);
-		if (io_wqe_run_queue(wqe)) {
+		if (io_acct_run_queue(acct)) {
 			io_worker_handle_work(worker);
 			goto loop;
 		}
@@ -777,12 +742,13 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
 
 static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
 {
+	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
 	unsigned int hash;
 	struct io_wq_work *tail;
 
 	if (!io_wq_is_hashed(work)) {
 append:
-		wq_list_add_tail(&work->list, &wqe->work_list);
+		wq_list_add_tail(&work->list, &acct->work_list);
 		return;
 	}
 
@@ -792,7 +758,7 @@ static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
 	if (!tail)
 		goto append;
 
-	wq_list_add_after(&work->list, &tail->list, &wqe->work_list);
+	wq_list_add_after(&work->list, &tail->list, &acct->work_list);
 }
 
 static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
@@ -814,10 +780,10 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 
 	raw_spin_lock_irqsave(&wqe->lock, flags);
 	io_wqe_insert_work(wqe, work);
-	wqe->flags &= ~IO_WQE_FLAG_STALLED;
+	clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 
 	rcu_read_lock();
-	do_create = !io_wqe_activate_free_worker(wqe);
+	do_create = !io_wqe_activate_free_worker(wqe, acct);
 	rcu_read_unlock();
 
 	raw_spin_unlock_irqrestore(&wqe->lock, flags);
@@ -870,6 +836,7 @@ static inline void io_wqe_remove_pending(struct io_wqe *wqe,
 					 struct io_wq_work *work,
 					 struct io_wq_work_node *prev)
 {
+	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
 	unsigned int hash = io_get_work_hash(work);
 	struct io_wq_work *prev_work = NULL;
 
@@ -881,7 +848,7 @@ static inline void io_wqe_remove_pending(struct io_wqe *wqe,
 		else
 			wqe->hash_tail[hash] = NULL;
 	}
-	wq_list_del(&wqe->work_list, &work->list, prev);
+	wq_list_del(&acct->work_list, &work->list, prev);
 }
 
 static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
@@ -890,22 +857,27 @@ static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
 	struct io_wq_work_node *node, *prev;
 	struct io_wq_work *work;
 	unsigned long flags;
+	int i;
 
 retry:
 	raw_spin_lock_irqsave(&wqe->lock, flags);
-	wq_list_for_each(node, prev, &wqe->work_list) {
-		work = container_of(node, struct io_wq_work, list);
-		if (!match->fn(work, match->data))
-			continue;
-		io_wqe_remove_pending(wqe, work, prev);
-		raw_spin_unlock_irqrestore(&wqe->lock, flags);
-		io_run_cancel(work, wqe);
-		match->nr_pending++;
-		if (!match->cancel_all)
-			return;
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
 
-		/* not safe to continue after unlock */
-		goto retry;
+		wq_list_for_each(node, prev, &acct->work_list) {
+			work = container_of(node, struct io_wq_work, list);
+			if (!match->fn(work, match->data))
+				continue;
+			io_wqe_remove_pending(wqe, work, prev);
+			raw_spin_unlock_irqrestore(&wqe->lock, flags);
+			io_run_cancel(work, wqe);
+			match->nr_pending++;
+			if (!match->cancel_all)
+				return;
+
+			/* not safe to continue after unlock */
+			goto retry;
+		}
 	}
 	raw_spin_unlock_irqrestore(&wqe->lock, flags);
 }
@@ -966,18 +938,24 @@ static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
 			    int sync, void *key)
 {
 	struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
+	int i;
 
 	list_del_init(&wait->entry);
 
 	rcu_read_lock();
-	io_wqe_activate_free_worker(wqe);
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		struct io_wqe_acct *acct = &wqe->acct[i];
+
+		if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
+			io_wqe_activate_free_worker(wqe, acct);
+	}
 	rcu_read_unlock();
 	return 1;
 }
 
 struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 {
-	int ret, node;
+	int ret, node, i;
 	struct io_wq *wq;
 
 	if (WARN_ON_ONCE(!data->free_work || !data->do_work))
@@ -1012,18 +990,20 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 		cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
 		wq->wqes[node] = wqe;
 		wqe->node = alloc_node;
-		wqe->acct[IO_WQ_ACCT_BOUND].index = IO_WQ_ACCT_BOUND;
-		wqe->acct[IO_WQ_ACCT_UNBOUND].index = IO_WQ_ACCT_UNBOUND;
 		wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
-		atomic_set(&wqe->acct[IO_WQ_ACCT_BOUND].nr_running, 0);
 		wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
 					task_rlimit(current, RLIMIT_NPROC);
-		atomic_set(&wqe->acct[IO_WQ_ACCT_UNBOUND].nr_running, 0);
-		wqe->wait.func = io_wqe_hash_wake;
 		INIT_LIST_HEAD(&wqe->wait.entry);
+		wqe->wait.func = io_wqe_hash_wake;
+		for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+			struct io_wqe_acct *acct = &wqe->acct[i];
+
+			acct->index = i;
+			atomic_set(&acct->nr_running, 0);
+			INIT_WQ_LIST(&acct->work_list);
+		}
 		wqe->wq = wq;
 		raw_spin_lock_init(&wqe->lock);
-		INIT_WQ_LIST(&wqe->work_list);
 		INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
 		INIT_LIST_HEAD(&wqe->all_list);
 	}
-- 
2.34.0
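
For quick reference, a condensed (non-compilable) sketch of the layout the
patch above produces; fields not visible in the diff are elided. Pending
work and the stalled state move from the shared io_wqe into each per-type
io_wqe_acct:

	struct io_wqe_acct {
		unsigned max_workers;
		int index;				/* IO_WQ_ACCT_BOUND or IO_WQ_ACCT_UNBOUND */
		atomic_t nr_running;
		struct io_wq_work_list work_list;	/* new: per-type pending list */
		unsigned long flags;			/* new: holds IO_ACCT_STALLED_BIT */
		/* ... */
	};

	struct io_wqe {
		raw_spinlock_t lock;
		struct io_wqe_acct acct[2];		/* IO_WQ_ACCT_NR entries */
		int node;
		/* ... free_list, all_list, hash_tail[], wait, wq ... */
	};

Enqueue and dequeue then operate on acct->work_list (io_wqe_insert_work(),
io_get_next_work()), so a stall on one accounting type no longer holds up
pending work of the other type.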


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24 16:10                                           ` Jens Axboe
@ 2021-11-24 16:18                                             ` Greg Kroah-Hartman
  2021-11-24 16:22                                               ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-24 16:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable

On Wed, Nov 24, 2021 at 09:10:25AM -0700, Jens Axboe wrote:
> On 11/24/21 8:28 AM, Jens Axboe wrote:
> > On 11/23/21 8:27 PM, Daniel Black wrote:
> >> On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>>
> >>> On 11/14/21 1:33 PM, Daniel Black wrote:
> >>>> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>
> >>>>> Alright, give this one a go if you can. Against -git, but will apply to
> >>>>> 5.15 as well.
> >>>>
> >>>>
> >>>> Works. Thank you very much.
> >>>>
> >>>> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
> >>>>
> >>>> Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
> >>>
> >>> The patch is already upstream (and in the 5.15 stable queue), and I
> >>> provided 5.14 patches too.
> >>
> >> Jens,
> >>
> >> I'm getting the same reproducer on 5.14.20
> >> (https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
> >> backport change logs indicate 5.14.19 has the patch.
> >>
> >> Anything missing?
> > 
> > We might also need another patch that isn't in stable, I'm attaching
> > it here. Any chance you can run 5.14.20/21 with this applied? If not,
> > I'll do some sanity checking here and push it to -stable.
> 
> Looks good to me - Greg, would you mind queueing this up for
> 5.14-stable?

5.14 is end-of-life and not getting any more releases (the front page of
kernel.org should show that.)

If this needs to go anywhere else, please let me know.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24 16:18                                             ` Greg Kroah-Hartman
@ 2021-11-24 16:22                                               ` Jens Axboe
  2021-11-24 22:52                                                 ` Stefan Metzmacher
  2021-11-24 22:57                                                 ` Daniel Black
  0 siblings, 2 replies; 36+ messages in thread
From: Jens Axboe @ 2021-11-24 16:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable

On 11/24/21 9:18 AM, Greg Kroah-Hartman wrote:
> On Wed, Nov 24, 2021 at 09:10:25AM -0700, Jens Axboe wrote:
>> On 11/24/21 8:28 AM, Jens Axboe wrote:
>>> On 11/23/21 8:27 PM, Daniel Black wrote:
>>>> On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 11/14/21 1:33 PM, Daniel Black wrote:
>>>>>> On Fri, Nov 12, 2021 at 10:44 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>
>>>>>>> Alright, give this one a go if you can. Against -git, but will apply to
>>>>>>> 5.15 as well.
>>>>>>
>>>>>>
>>>>>> Works. Thank you very much.
>>>>>>
>>>>>> https://jira.mariadb.org/browse/MDEV-26674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=205599#comment-205599
>>>>>>
>>>>>> Tested-by: Marko Mäkelä <marko.makela@mariadb.com>
>>>>>
>>>>> The patch is already upstream (and in the 5.15 stable queue), and I
>>>>> provided 5.14 patches too.
>>>>
>>>> Jens,
>>>>
>>>> I'm getting the same reproducer on 5.14.20
>>>> (https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
>>>> backport change logs indicate 5.14.19 has the patch.
>>>>
>>>> Anything missing?
>>>
>>> We might also need another patch that isn't in stable, I'm attaching
>>> it here. Any chance you can run 5.14.20/21 with this applied? If not,
>>> I'll do some sanity checking here and push it to -stable.
>>
>> Looks good to me - Greg, would you mind queueing this up for
>> 5.14-stable?
> 
> 5.14 is end-of-life and not getting any more releases (the front page of
> kernel.org should show that.)

Oh, well I guess that settles that...

> If this needs to go anywhere else, please let me know.

Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
already has the patch.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24 16:22                                               ` Jens Axboe
@ 2021-11-24 22:52                                                 ` Stefan Metzmacher
  2021-11-25  0:58                                                   ` Jens Axboe
  2021-11-24 22:57                                                 ` Daniel Black
  1 sibling, 1 reply; 36+ messages in thread
From: Stefan Metzmacher @ 2021-11-24 22:52 UTC (permalink / raw)
  To: Jens Axboe, Greg Kroah-Hartman
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable

Hi Jens,

>>> Looks good to me - Greg, would you mind queueing this up for
>>> 5.14-stable?
>>
>> 5.14 is end-of-life and not getting any more releases (the front page of
>> kernel.org should show that.)
> 
> Oh, well I guess that settles that...
> 
>> If this needs to go anywhere else, please let me know.
> 
> Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
> already has the patch.

Are 5.11 and 5.13 also affected? These are HWE kernels for Ubuntu;
I may need to open a bug for them...

Thanks!
metze

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24 16:22                                               ` Jens Axboe
  2021-11-24 22:52                                                 ` Stefan Metzmacher
@ 2021-11-24 22:57                                                 ` Daniel Black
  1 sibling, 0 replies; 36+ messages in thread
From: Daniel Black @ 2021-11-24 22:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Greg Kroah-Hartman, Salvatore Bonaccorso, Pavel Begunkov,
	linux-block, io-uring, stable

On Thu, Nov 25, 2021 at 3:22 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 11/24/21 9:18 AM, Greg Kroah-Hartman wrote:
> > On Wed, Nov 24, 2021 at 09:10:25AM -0700, Jens Axboe wrote:
> >> On 11/24/21 8:28 AM, Jens Axboe wrote:
> >>> On 11/23/21 8:27 PM, Daniel Black wrote:
> >>>> On Mon, Nov 15, 2021 at 7:55 AM Jens Axboe <axboe@kernel.dk> wrote:

> >>>> I'm getting the same reproducer on 5.14.20
> >>>> (https://bugzilla.redhat.com/show_bug.cgi?id=2018882#c3) though the
> >>>> backport change logs indicate 5.14.19 has the patch.
> >>>>
> >>>> Anything missing?
> >>>
> >>> We might also need another patch that isn't in stable, I'm attaching
> >>> it here. Any chance you can run 5.14.20/21 with this applied? If not,
> >>> I'll do some sanity checking here and push it to -stable.
> >>
> >> Looks good to me - Greg, would you mind queueing this up for
> >> 5.14-stable?
> >
> > 5.14 is end-of-life and not getting any more releases (the front page of
> > kernel.org should show that.)
>
> Oh, well I guess that settles that...

Certainly does. Thanks for looking and finding the patch.

> > If this needs to go anywhere else, please let me know.
>
> Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
> already has the patch.

Thank you

https://github.com/MariaDB/server/commit/de7db5517de11a58d57d2a41d0bc6f38b6f92dd8

On Thu, Nov 25, 2021 at 9:52 AM Stefan Metzmacher <metze@samba.org> wrote:
> Are 5.11 and 5.13 also affected?

Yes.

> These are HWE kernels for Ubuntu;
> I may need to open a bug for them...

Yes please.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-24 22:52                                                 ` Stefan Metzmacher
@ 2021-11-25  0:58                                                   ` Jens Axboe
  2021-11-25 16:35                                                     ` Stefan Metzmacher
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2021-11-25  0:58 UTC (permalink / raw)
  To: Stefan Metzmacher, Greg Kroah-Hartman
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable

On 11/24/21 3:52 PM, Stefan Metzmacher wrote:
> Hi Jens,
> 
>>>> Looks good to me - Greg, would you mind queueing this up for
>>>> 5.14-stable?
>>>
>>> 5.14 is end-of-life and not getting any more releases (the front page of
>>> kernel.org should show that.)
>>
>> Oh, well I guess that settles that...
>>
>>> If this needs to go anywhere else, please let me know.
>>
>> Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
>> already has the patch.
> 
> Are 5.11 and 5.13 also affected? These are HWE kernels for Ubuntu;
> I may need to open a bug for them...

Please do, then we can help get the appropriate patches lined up for
5.11/13. They should need the same set, basically what ended up in 5.14
plus the one I posted today.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-25  0:58                                                   ` Jens Axboe
@ 2021-11-25 16:35                                                     ` Stefan Metzmacher
  2021-11-25 17:11                                                       ` Jens Axboe
  2022-02-09 23:01                                                       ` Stefan Metzmacher
  0 siblings, 2 replies; 36+ messages in thread
From: Stefan Metzmacher @ 2021-11-25 16:35 UTC (permalink / raw)
  To: Jens Axboe, Greg Kroah-Hartman
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable

Am 25.11.21 um 01:58 schrieb Jens Axboe:
> On 11/24/21 3:52 PM, Stefan Metzmacher wrote:
>> Hi Jens,
>>
>>>>> Looks good to me - Greg, would you mind queueing this up for
>>>>> 5.14-stable?
>>>>
>>>> 5.14 is end-of-life and not getting any more releases (the front page of
>>>> kernel.org should show that.)
>>>
>>> Oh, well I guess that settles that...
>>>
>>>> If this needs to go anywhere else, please let me know.
>>>
>>> Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
>>> already has the patch.
>>
>> Are 5.11 and 5.13 also affected? These are HWE kernels for Ubuntu;
>> I may need to open a bug for them...
> 
> Please do, then we can help get the appropriate patches lined up for
> 5.11/13. They should need the same set, basically what ended up in 5.14
> plus the one I posted today.

Ok, I've created https://bugs.launchpad.net/bugs/1952222

Let's see what happens...

metze


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-25 16:35                                                     ` Stefan Metzmacher
@ 2021-11-25 17:11                                                       ` Jens Axboe
  2022-02-09 23:01                                                       ` Stefan Metzmacher
  1 sibling, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2021-11-25 17:11 UTC (permalink / raw)
  To: Stefan Metzmacher, Greg Kroah-Hartman
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable

On 11/25/21 9:35 AM, Stefan Metzmacher wrote:
> Am 25.11.21 um 01:58 schrieb Jens Axboe:
>> On 11/24/21 3:52 PM, Stefan Metzmacher wrote:
>>> Hi Jens,
>>>
>>>>>> Looks good to me - Greg, would you mind queueing this up for
>>>>>> 5.14-stable?
>>>>>
>>>>> 5.14 is end-of-life and not getting any more releases (the front page of
>>>>> kernel.org should show that.)
>>>>
>>>> Oh, well I guess that settles that...
>>>>
>>>>> If this needs to go anywhere else, please let me know.
>>>>
>>>> Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
>>>> already has the patch.
>>>
>>> Are 5.11 and 5.13 also affected? These are HWE kernels for Ubuntu;
>>> I may need to open a bug for them...
>>
>> Please do, then we can help get the appropriate patches lined up for
>> 5.11/13. They should need the same set, basically what ended up in 5.14
>> plus the one I posted today.
> 
> Ok, I've created https://bugs.launchpad.net/bugs/1952222
> 
> Let's see what happens...

Let me know if I can help. I should probably prepare a set for 5.11-stable
and 5.13-stable, but I don't know whether the above kernels already have some
patches applied past the last stable release of each...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2021-11-25 16:35                                                     ` Stefan Metzmacher
  2021-11-25 17:11                                                       ` Jens Axboe
@ 2022-02-09 23:01                                                       ` Stefan Metzmacher
  2022-02-10  0:10                                                         ` Daniel Black
  1 sibling, 1 reply; 36+ messages in thread
From: Stefan Metzmacher @ 2022-02-09 23:01 UTC (permalink / raw)
  To: Jens Axboe, Greg Kroah-Hartman
  Cc: Daniel Black, Salvatore Bonaccorso, Pavel Begunkov, linux-block,
	io-uring, stable


Hi Jens,

>>>>>> Looks good to me - Greg, would you mind queueing this up for
>>>>>> 5.14-stable?
>>>>>
>>>>> 5.14 is end-of-life and not getting any more releases (the front page of
>>>>> kernel.org should show that.)
>>>>
>>>> Oh, well I guess that settles that...
>>>>
>>>>> If this needs to go anywhere else, please let me know.
>>>>
>>>> Should be fine, previous 5.10 isn't affected and 5.15 is fine too as it
>>>> already has the patch.
>>>
>>> Are 5.11 and 5.13 also affected? These are HWE kernels for Ubuntu;
>>> I may need to open a bug for them...
>>
>> Please do, then we can help get the appropriate patches lined up for
>> 5.11/13. They should need the same set, basically what ended up in 5.14
>> plus the one I posted today.
> 
> Ok, I've created https://bugs.launchpad.net/bugs/1952222

At least for 5.14 the patch is included in

https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-oem/+git/focal/log/?h=Ubuntu-oem-5.14-5.14.0-1023.25

https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-oem/+git/focal/commit/?h=Ubuntu-oem-5.14-5.14.0-1023.25&id=9e2b95e7c9dd103297e6a3ccd98a7bf11ef66921

apt-get install -V -t focal-proposed linux-oem-20.04d linux-tools-oem-20.04d
installs linux-image-5.14.0-1023-oem (5.14.0-1023.25)

Do we have any reproducer I can use to reproduce the problem
and demonstrate that the bug is fixed?

Thanks!
metze

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: uring regression - lost write request
  2022-02-09 23:01                                                       ` Stefan Metzmacher
@ 2022-02-10  0:10                                                         ` Daniel Black
  0 siblings, 0 replies; 36+ messages in thread
From: Daniel Black @ 2022-02-10  0:10 UTC (permalink / raw)
  To: Stefan Metzmacher
  Cc: Jens Axboe, Greg Kroah-Hartman, Salvatore Bonaccorso,
	Pavel Begunkov, linux-block, io-uring, stable

Stefan,

On Thu, Feb 10, 2022 at 10:01 AM Stefan Metzmacher <metze@samba.org> wrote:
> > Ok, I've created https://bugs.launchpad.net/bugs/1952222
>
> At least for 5.14 the patch is included in
>
> https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-oem/+git/focal/log/?h=Ubuntu-oem-5.14-5.14.0-1023.25
>
> https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-oem/+git/focal/commit/?h=Ubuntu-oem-5.14-5.14.0-1023.25&id=9e2b95e7c9dd103297e6a3ccd98a7bf11ef66921
>
> apt-get install -V -t focal-proposed linux-oem-20.04d linux-tools-oem-20.04d
> installs linux-image-5.14.0-1023-oem (5.14.0-1023.25)

Thanks!

> Do we have any reproducer I can use to reproduce the problem
> and demonstrate that the bug is fixed?
>

The original container and test from
https://lore.kernel.org/linux-block/CABVffEOpuViC9OyOuZg28sRfGK4GRc8cV0CnkOU2cM0RJyRhPw@mail.gmail.com/
will be sufficient.
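
Not a substitute for that test, but for context, here is a minimal liburing
sketch (an illustrative assumption, not code from this thread) of the
submit-then-wait pattern involved; the calls are standard liburing APIs, and
this trivial single write is not expected to trigger the race on its own:

	/* Illustrative only: one write submitted through io_uring, then a
	 * blocking wait for its completion.  Build with: gcc demo.c -luring
	 * (demo.c is a placeholder name). */
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <liburing.h>

	int main(void)
	{
		struct io_uring ring;
		struct io_uring_sqe *sqe;
		struct io_uring_cqe *cqe;
		char buf[4096];
		int fd, ret;

		memset(buf, 'x', sizeof(buf));
		fd = open("io_uring_demo.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
			return 1;

		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_write(sqe, fd, buf, sizeof(buf), 0);
		io_uring_submit(&ring);

		/* the wait that could hang on affected kernels if the
		 * completion for the punted write was lost */
		ret = io_uring_wait_cqe(&ring, &cqe);
		if (ret == 0) {
			printf("write completed, res=%d\n", cqe->res);
			io_uring_cqe_seen(&ring, cqe);
		}

		io_uring_queue_exit(&ring);
		close(fd);
		return 0;
	}

On an affected kernel the stress workload could leave a wait like this stuck
because the completion for a punted write was lost; with the fixes discussed
above applied, the wait returns and cqe->res reports the bytes written.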

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2022-02-10  1:56 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-22  3:12 uring regression - lost write request Daniel Black
2021-10-22  9:10 ` Pavel Begunkov
2021-10-25  9:57   ` Pavel Begunkov
2021-10-25 11:09     ` Daniel Black
2021-10-25 11:25       ` Pavel Begunkov
2021-10-30  7:30         ` Salvatore Bonaccorso
2021-11-01  7:28           ` Daniel Black
2021-11-09 22:58             ` Daniel Black
2021-11-09 23:24               ` Jens Axboe
2021-11-10 18:01                 ` Jens Axboe
2021-11-11  6:52                   ` Daniel Black
2021-11-11 14:30                     ` Jens Axboe
2021-11-11 14:58                       ` Jens Axboe
2021-11-11 15:29                         ` Jens Axboe
2021-11-11 16:19                           ` Jens Axboe
2021-11-11 16:55                             ` Jens Axboe
2021-11-11 17:28                               ` Jens Axboe
2021-11-11 23:44                                 ` Jens Axboe
2021-11-12  6:25                                   ` Daniel Black
2021-11-12 19:19                                     ` Salvatore Bonaccorso
2021-11-14 20:33                                   ` Daniel Black
2021-11-14 20:55                                     ` Jens Axboe
2021-11-14 21:02                                       ` Salvatore Bonaccorso
2021-11-14 21:03                                         ` Jens Axboe
2021-11-24  3:27                                       ` Daniel Black
2021-11-24 15:28                                         ` Jens Axboe
2021-11-24 16:10                                           ` Jens Axboe
2021-11-24 16:18                                             ` Greg Kroah-Hartman
2021-11-24 16:22                                               ` Jens Axboe
2021-11-24 22:52                                                 ` Stefan Metzmacher
2021-11-25  0:58                                                   ` Jens Axboe
2021-11-25 16:35                                                     ` Stefan Metzmacher
2021-11-25 17:11                                                       ` Jens Axboe
2022-02-09 23:01                                                       ` Stefan Metzmacher
2022-02-10  0:10                                                         ` Daniel Black
2021-11-24 22:57                                                 ` Daniel Black
