From: Brad Fitzpatrick <brad@danga.com>
To: linux-kernel <linux-kernel@vger.kernel.org>
Subject: 2.6.9: unkillable processes during heavy IO
Date: Sun, 14 Nov 2004 14:15:49 -0800 (PST)
Message-ID: <Pine.LNX.4.58.0411141403040.22805@danga.com>
We have two database servers which freeze up during heavy IO load. The
machines themselves are responsive, but the mysqld processes are locked
forever, unkillable even with kill -9. I can't restart MySQL without
rebooting the machines.
I can reasonably rule out hardware, since this is happening in the
same way on two identical machines.
I'd like to know how I can debug this, to file a proper bug report.
The hardware/software stack is:
- Dual Opteron 246, SMP kernel, w/ NUMA
- 9 GB of memory (4GB in one zone, 5GB in the other)
- MySQL, running mostly InnoDB, but some MyISAM
- MegaRAID raid-10
- device mapper
- XFS (used as both O_DIRECT from InnoDB and regularly from MyISAM)
At this point I'm going to try changing different variables on
different machines in order to try and isolate it, but it's a slow
process.
- on raw partitions, instead of device mapper
- ext3 instead of XFS
- not using O_DIRECT
"Screenshot":
roast:~# killall -9 mysqld
roast:~# killall -9 mysqld
roast:~# ps afx | grep mysqld
31495 ? D 0:08 /usr/local/mysql/bin/mysqld --defaults-extra-file=/usr/local/mysql/data/my.cnf --basedir=/usr/local/mysql/ --datadir=/data/mydb --user=root
--pid-file=/var/run/mysqld/mysqld.pid --skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
32391 ? D 0:01 /usr/local/mysql/bin/mysqld --defaults-extra-file=/usr/local/mysql/data/my.cnf --basedir=/usr/local/mysql/ --datadir=/data/mydb --user=root
--pid-file=/var/run/mysqld/mysqld.pid --skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
515 ? D 0:00 /usr/local/mysql/bin/mysqld --defaults-extra-file=/usr/local/mysql/data/my.cnf --basedir=/usr/local/mysql/ --datadir=/data/mydb --user=root
--pid-file=/var/run/mysqld/mysqld.pid --skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
517 ? Z 0:00 \_ [mysqld] <defunct>
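As a rough sketch, the stuck processes in the listing above could be picked out together with the kernel function each one is sleeping in. The helper names filter_d_state and show_blocked are made up for illustration, and the wchan:32 column width assumes a procps-style ps. WCHAN only shows a symbol name when it can be resolved, so the column may hold a bare address or a dash instead.

```shell
#!/bin/sh
# Keep the header line plus every row whose STAT column starts with D
# (uninterruptible sleep). Reads ps-style output on stdin.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Print PID, state, kernel wait channel and command name for all tasks,
# then keep only the ones in D state. Assumes a procps-style ps.
show_blocked() {
    ps -eo pid,stat,wchan:32,comm | filter_d_state
}
```

Running show_blocked a few times while the machine is hung shows whether the mysqld processes always sit in the same wait channel.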
Next time it hangs like this, how can I get a kernel backtrace or other useful information
for a certain process?
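One way to get such a backtrace is the SysRq 't' command, which makes the kernel log the state and kernel stack of every task. The sketch below assumes the kernel was built with CONFIG_MAGIC_SYSRQ and that the shell is running as root. The function name dump_task_states and its optional path argument are hypothetical. The argument exists only so the function can be pointed at a scratch file for a dry run.

```shell
#!/bin/sh
# Ask the kernel to log every task's state and kernel stack (SysRq 't').
# $1 optionally overrides the trigger file; this is for dry runs only.
dump_task_states() {
    trigger=${1:-/proc/sysrq-trigger}
    echo t > "$trigger"
}
```

After calling dump_task_states with no argument, the traces appear in the kernel log (dmesg or the console). The entries for the mysqld tasks are the ones worth attaching to a bug report.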
Thanks!
- Brad