linux-block.vger.kernel.org archive mirror
* uring regression - lost write request
@ 2021-10-22  3:12 Daniel Black
  2021-10-22  9:10 ` Pavel Begunkov
  0 siblings, 1 reply; 36+ messages in thread
From: Daniel Black @ 2021-10-22  3:12 UTC
  To: linux-block

We are tracing a kernel regression that appeared sometime after 5.11
and is fixed in 5.15-rcX (rc6 has been tested extensively over the
last few days); see
https://jira.mariadb.org/browse/MDEV-26674 and
https://jira.mariadb.org/browse/MDEV-26555
5.10 and earlier, across many distros and hardware, appear not to have
the problem.

I'd appreciate some help identifying a patch suitable for 5.14 linux
stable, as I observe the fault in mainline 5.14.14 (built from
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/). This is of
interest to Debian (sid)
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996951 , Ubuntu
(Impish) and Fedora fc33-35 (TODO bug report).

Marko in https://jira.mariadb.org/browse/MDEV-26555?focusedCommentId=198601&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-198601
traced this down to an io_uring_wait_cqe call never returning after a
request was submitted.

The behavior was observed using the mariadb-test package for 10.6:

dan@impish:~$ uname -a
Linux impish 5.14.14-051414-generic #202110201037 SMP Wed Oct 20
11:04:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

dan@impish:~$ cd /usr/share/mysql/mysql-test/

dan@impish:/usr/share/mysql/mysql-test$ ./mtr --vardir=/tmp/var
--parallel=4 stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb    stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb  stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb
Logging: ./mtr  --vardir=/tmp/var --parallel=4 stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
vardir: /tmp/var
Removing old var directory...
Creating var directory '/tmp/var'...
Checking supported features...
MariaDB Version 10.6.5-MariaDB-1:10.6.5+maria~impish
 - SSL connections supported
 - binaries built with wsrep patch
Collecting tests...
Installing system database...

==============================================================================

TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[4] Using MTR_BUILD_THREAD 301, with reserved ports 16020..16039
worker[3] Using MTR_BUILD_THREAD 302, with reserved ports 16040..16059
worker[2] Using MTR_BUILD_THREAD 303, with reserved ports 16060..16079
stress.ddl_innodb 'innodb'               w3 [ pass ]  185605
stress.ddl_innodb 'innodb'               w4 [ pass ]  186292
stress.ddl_innodb 'innodb'               w2 [ pass ]  193053
stress.ddl_innodb 'innodb'               w1 [ pass ]  202529
stress.ddl_innodb 'innodb'               w4 [ pass ]  213972
stress.ddl_innodb 'innodb'               w3 [ pass ]  214661
stress.ddl_innodb 'innodb'               w1 [ pass ]  213266
stress.ddl_innodb 'innodb'               w4 [ pass ]  181716
stress.ddl_innodb 'innodb'               w3 [ pass ]  194047
stress.ddl_innodb 'innodb'               w1 [ pass ]  208319
stress.ddl_innodb 'innodb'               w2 [ fail ]
        Test ended at 2021-10-22 01:24:22

----------SERVER LOG START-----------
2021-10-22  1:24:20 0 [ERROR] [FATAL] InnoDB:
innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch.
Please refer to
https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/

This threshold is 10 minutes, so it's not as if the hardware is that slow.
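When running many repetitions, it can help to summarize the mtr result lines mechanically. The following is a minimal sketch, assuming the result lines keep the layout shown in the output above (test name, quoted combination, `wN` worker, bracketed result, optional time in ms). The function name `summarize_mtr` and the regular expression are illustrative and not part of mtr.

```python
import re
from collections import Counter

# Matches mtr result lines as they appear in the output above, e.g.
#   stress.ddl_innodb 'innodb'               w3 [ pass ]  185605
#   stress.ddl_innodb 'innodb'               w2 [ fail ]
# The layout is an assumption based on this one log excerpt.
RESULT_RE = re.compile(
    r"^(?P<test>\S+)\s+'(?P<combo>[^']*)'\s+w(?P<worker>\d+)\s+"
    r"\[\s*(?P<result>\w+)\s*\](?:\s+(?P<ms>\d+))?\s*$"
)


def summarize_mtr(log_text):
    """Return (Counter of results, list of (test, worker) that did not pass)."""
    counts = Counter()
    not_passed = []
    for line in log_text.splitlines():
        m = RESULT_RE.match(line.strip())
        if m is None:
            continue  # worker banners, separators, server log lines, ...
        counts[m["result"]] += 1
        if m["result"] != "pass":
            not_passed.append((m["test"], int(m["worker"])))
    return counts, not_passed
```

Feeding it the captured output of a run (for example the contents of a saved terminal log) gives the pass/fail counts and the workers on which a test did not pass.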

To my frustration, the hirsute-based container (below), created as a
test framework for you, has never produced a fault, even though it
runs on the same 5.14.14-200.fc34.x86_64 kernel that otherwise fails
after 2-3 stress.ddl_innodb tests.

$ podman run   --rm --privileged=true
quay.io/danielgblack/mariadb-test:uring    --vardir=/var/tmp
stress.ddl_innodb{,,,,,,,,,,,,,}
...
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 2908.065 of 822 seconds executing testcases

Completed: All 18 tests were successful.

Looking at the server test logs in /var/tmp/[0-9]/*/*err*, the mariadbd
processes are using uring.

I hope this provides a hint.

In the meantime, the complete reproducer is to pull a 10.6 distro
package from https://mariadb.org/download/?tab=repo-config
It has to be a distro that provides liburing, such as:
Debian sid
Ubuntu - groovy+
RHEL 8
Fedora

(centos/rhel currently get an incorrect baseurl; replace the
last fragment of the path with [rhel|centos][7|8]-$arch )
Install the repo.
Install the package mariadb-test (this pulls in MariaDB server as a dependency).
Run ldd on /usr/sbin/mariadbd (or /usr/bin/mariadbd) to check that liburing is linked.
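The liburing check can be scripted. This is a rough sketch, assuming `ldd` prints one shared object per line with the library name as the first field; the helper names and the example path are illustrative only.

```python
import subprocess


def links_liburing(ldd_output):
    """True if any line of ldd output names a liburing shared object."""
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("liburing.so"):
            return True
    return False


def binary_uses_liburing(path):
    """Run ldd on `path` and report whether liburing is listed."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return links_liburing(result.stdout)
```

For example, `binary_uses_liburing("/usr/sbin/mariadbd")` should return True on a build that can use uring; the path is an assumption and differs between distros.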

cd /usr/share/mysql/mysql-test
./mtr --vardir=/tmp/var   --parallel=4 encryption.innochecksum{,,,,,}
./mtr --vardir=/tmp/var   --parallel=4 stress.ddl_innodb{,,,,,}

This should generate a failure like the one above.
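The `{,,,,,}` suffix in the mtr commands relies on bash brace expansion: each comma separates an empty alternative, so N commas yield N+1 copies of the test name on the command line. The sketch below emulates only that special case to make the count explicit; the function name is hypothetical and it is not a general brace-expansion implementation.

```python
def repeat_by_empty_braces(word):
    """Emulate bash expansion of 'name{,,,}' where every alternative is empty.

    Handles only a single trailing brace group containing nothing but commas;
    anything else is returned unchanged as a one-element list.
    """
    prefix, brace, rest = word.partition("{")
    if not brace or not rest.endswith("}"):
        return [word]
    inner = rest[:-1]
    if not inner or set(inner) - {","}:
        # bash leaves 'name{}' alone; other contents are out of scope here
        return [word]
    return [prefix] * (inner.count(",") + 1)


args = repeat_by_empty_braces("stress.ddl_innodb{,,,,,}")
print(len(args))  # five commas give six copies of the test name
```

With --parallel=4 these copies are spread over the four workers, which matches the w1..w4 columns in the output above.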

With gdb and xterm installed, passing the following mtr argument will
stop the application at a breakpoint in this state:
 --gdb='b ib::fatal::~fatal;r'

I'm happy to build from a tree like
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=io_uring-5.15
if you'd like me to test something locally.

I can also run bpftrace scripts to pull out info if required.


end of thread, other threads:[~2022-02-10  1:56 UTC | newest]

Thread overview: 36+ messages
2021-10-22  3:12 uring regression - lost write request Daniel Black
2021-10-22  9:10 ` Pavel Begunkov
2021-10-25  9:57   ` Pavel Begunkov
2021-10-25 11:09     ` Daniel Black
2021-10-25 11:25       ` Pavel Begunkov
2021-10-30  7:30         ` Salvatore Bonaccorso
2021-11-01  7:28           ` Daniel Black
2021-11-09 22:58             ` Daniel Black
2021-11-09 23:24               ` Jens Axboe
2021-11-10 18:01                 ` Jens Axboe
2021-11-11  6:52                   ` Daniel Black
2021-11-11 14:30                     ` Jens Axboe
2021-11-11 14:58                       ` Jens Axboe
2021-11-11 15:29                         ` Jens Axboe
2021-11-11 16:19                           ` Jens Axboe
2021-11-11 16:55                             ` Jens Axboe
2021-11-11 17:28                               ` Jens Axboe
2021-11-11 23:44                                 ` Jens Axboe
2021-11-12  6:25                                   ` Daniel Black
2021-11-12 19:19                                     ` Salvatore Bonaccorso
2021-11-14 20:33                                   ` Daniel Black
2021-11-14 20:55                                     ` Jens Axboe
2021-11-14 21:02                                       ` Salvatore Bonaccorso
2021-11-14 21:03                                         ` Jens Axboe
2021-11-24  3:27                                       ` Daniel Black
2021-11-24 15:28                                         ` Jens Axboe
2021-11-24 16:10                                           ` Jens Axboe
2021-11-24 16:18                                             ` Greg Kroah-Hartman
2021-11-24 16:22                                               ` Jens Axboe
2021-11-24 22:52                                                 ` Stefan Metzmacher
2021-11-25  0:58                                                   ` Jens Axboe
2021-11-25 16:35                                                     ` Stefan Metzmacher
2021-11-25 17:11                                                       ` Jens Axboe
2022-02-09 23:01                                                       ` Stefan Metzmacher
2022-02-10  0:10                                                         ` Daniel Black
2021-11-24 22:57                                                 ` Daniel Black
