* bcache deadlock
@ 2015-08-01  6:08 Stefan Priebe
  2015-08-03  6:21 ` Ming Lin
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Priebe @ 2015-08-01  6:08 UTC (permalink / raw)
  To: linux-bcache

Hi,

Any ideas about this deadlock?
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task xfsaild/bcache5:2437 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task xfsaild/bcache4:2433 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task bcache_writebac:683 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task bcache_writebac:679 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task bcache_writebac:675 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task bcache_writebac:671 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task bcache_writebac:667 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task bcache_writebac:654 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task kswapd0:79 blocked for more than 120 seconds.
2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
2015-08-01 00:05:05     INFO: task kthreadd:2 blocked for more than 120 seconds.
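
For triage, the hung-task warnings above can be condensed into a list of task names and PIDs. The following is a minimal Python sketch and not something posted in the thread: the names `HungTask` and `parse_hung_tasks` are hypothetical, and it assumes the log text has the "INFO: task NAME:PID blocked for more than N seconds." shape shown above, possibly wrapped across lines.

```python
import re
from typing import List, NamedTuple


class HungTask(NamedTuple):
    """One hung-task warning: task name, PID, and the timeout in seconds."""
    name: str
    pid: int
    seconds: int


# \s+ between the words lets the pattern match even when the mail client
# wrapped the warning in the middle of the sentence.
_HUNG_RE = re.compile(
    r"INFO:\s+task\s+(\S+):(\d+)\s+blocked\s+for\s+more\s+than\s+(\d+)\s+seconds"
)


def parse_hung_tasks(log_text: str) -> List[HungTask]:
    """Return every hung-task warning found in log_text, in log order."""
    return [
        HungTask(name, int(pid), int(seconds))
        for name, pid, seconds in _HUNG_RE.findall(log_text)
    ]
```

Applied to the log above, this would list ten blocked tasks: two xfsaild threads, six bcache_writebac threads, kswapd0 and kthreadd. The sketch assumes task names contain no whitespace, which holds for the names in this log but not for every kernel thread.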

Stefan

* Re: bcache deadlock
  2015-08-01  6:08 bcache deadlock Stefan Priebe
@ 2015-08-03  6:21 ` Ming Lin
  2015-08-03  6:25   ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 7+ messages in thread
From: Ming Lin @ 2015-08-03  6:21 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: linux-bcache

On Fri, Jul 31, 2015 at 11:08 PM, Stefan Priebe <s.priebe@profihost.ag> wrote:
> Hi,
>
> any ideas about this deadlock:
> 2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> 2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
> 2015-08-01 00:05:05     INFO: task xfsaild/bcache5:2437 blocked for more
> than 120 seconds.

No backtrace?

* Re: bcache deadlock
  2015-08-03  6:21 ` Ming Lin
@ 2015-08-03  6:25   ` Stefan Priebe - Profihost AG
  2015-08-10 14:51     ` Stefan Priebe
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Priebe - Profihost AG @ 2015-08-03  6:25 UTC (permalink / raw)
  To: Ming Lin; +Cc: linux-bcache



On 03.08.2015 at 08:21, Ming Lin wrote:
> On Fri, Jul 31, 2015 at 11:08 PM, Stefan Priebe <s.priebe@profihost.ag> wrote:
>> Hi,
>>
>> any ideas about this deadlock:
>> 2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> 2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
>> 2015-08-01 00:05:05     INFO: task xfsaild/bcache5:2437 blocked for more
>> than 120 seconds.
> 
> No backtrace?
> 

Yes, no backtrace.

Stefan

* Re: bcache deadlock
  2015-08-03  6:25   ` Stefan Priebe - Profihost AG
@ 2015-08-10 14:51     ` Stefan Priebe
  2015-08-12 13:39       ` Jack Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Priebe @ 2015-08-10 14:51 UTC (permalink / raw)
  To: Ming Lin; +Cc: linux-bcache

On 03.08.2015 at 08:25, Stefan Priebe - Profihost AG wrote:
>
>
> On 03.08.2015 at 08:21, Ming Lin wrote:
>> On Fri, Jul 31, 2015 at 11:08 PM, Stefan Priebe <s.priebe@profihost.ag> wrote:
>>> Hi,
>>>
>>> any ideas about this deadlock:
>>> 2015-08-01 00:05:05     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> 2015-08-01 00:05:05     Tainted: G O 3.18.19+47-ph #1
>>> 2015-08-01 00:05:05     INFO: task xfsaild/bcache5:2437 blocked for more
>>> than 120 seconds.
>>
>> No backtrace?
>>
>
> Yes, no backtrace.

Is there any chance of a fix, or any idea how to fix this? It happens
every day on a different server and is really annoying.

Stefan

* Re: bcache deadlock
  2015-08-10 14:51     ` Stefan Priebe
@ 2015-08-12 13:39       ` Jack Wang
  2015-08-12 13:57         ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 7+ messages in thread
From: Jack Wang @ 2015-08-12 13:39 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: Ming Lin, linux-bcache

Have you checked on the server when this deadlock happened?

In my experience, the warning comes with a trace.

* Re: bcache deadlock
  2015-08-12 13:39       ` Jack Wang
@ 2015-08-12 13:57         ` Stefan Priebe - Profihost AG
  2015-08-22 21:48           ` Stefan Priebe
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Priebe - Profihost AG @ 2015-08-12 13:57 UTC (permalink / raw)
  To: Jack Wang; +Cc: Ming Lin, linux-bcache

Hi,
On 12.08.2015 at 15:39, Jack Wang wrote:
> Have you checked on the server when this deadlock happened?
> 
> From my experience, you will get a trace for the warning.

Sadly, there is no trace; it seems the kworker is running in an endless
loop.

I am not able to log in: the system is running with a load of 2000 or
even 3000.

From the logs I've gathered the following information:

top, showing running processes, lists only a kworker at 100% CPU.

top - 15:02:31 up 10 days, 16:20,  1 user,  load average: 2494,67, 1878,69, 905,
Tasks: 226 total,   2 running, 222 sleeping,   0 stopped,   2 zombie
%Cpu(s):  0,9 us, 12,7 sy,  0,0 ni, 36,4 id, 50,0 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem:  49431532 total, 48672808 used,   758724 free,       52 buffers
KiB Swap:  3906556 total,   152772 used,  3753784 free, 40328600 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
21963 root      20   0     0    0    0 R 100,5  0,0   9:15.48 [kworker/u16:3]
29978 root      20   0 62488  20m 6892 S   8,0  0,0   0:02.59 /usr/bin/python /

iotop shows the same kworker writing continuously at more than 1400 MB/s.

Total DISK READ:       0.00 B/s | Total DISK WRITE:       0.00 B/s
  PID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
29978 be/4 root        0.00 B/s   14.69 K/s  0.00 %  0.00 % python /usr/sbin/iotop -b -d 1 -n 30 -P
21963 be/4 root        0.00 B/s 1428.89 M/s  0.00 %  0.00 % [kworker/u16:3]

To me this looks like an endless loop, which could also explain why there
is no stack trace.
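
When the hung-task detector prints no backtrace, the kernel stack of a suspect task can sometimes be read from /proc/<pid>/stack (root only, and only on kernels built with stack trace support). Below is a minimal Python sketch, assuming shell access is still possible, which was not the case on this system. `read_kernel_stack` and its `proc_root` parameter are hypothetical names introduced for illustration.

```python
from pathlib import Path
from typing import List, Optional


def read_kernel_stack(pid: int, proc_root: str = "/proc") -> Optional[List[str]]:
    """Return the kernel stack frames of a task, or None if unavailable.

    Reads <proc_root>/<pid>/stack. The file is missing when the task has
    exited or the kernel lacks stack trace support, and unreadable without
    root, so any OSError is treated as "no stack available".
    """
    stack_file = Path(proc_root) / str(pid) / "stack"
    try:
        text = stack_file.read_text()
    except OSError:
        return None
    # One frame per line; drop blank lines and surrounding whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

For a task that is spinning on a CPU, such as the kworker with PID 21963 above, this file may be empty or unreliable. Writing `l` (backtrace of all active CPUs) or `w` (dump of blocked tasks) to /proc/sysrq-trigger sends the traces to the kernel log instead, provided sysrq is enabled.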

Greets,
Stefan

* Re: bcache deadlock
  2015-08-12 13:57         ` Stefan Priebe - Profihost AG
@ 2015-08-22 21:48           ` Stefan Priebe
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Priebe @ 2015-08-22 21:48 UTC (permalink / raw)
  To: Jack Wang; +Cc: Ming Lin, linux-bcache

It seems to work since I disabled irqbalance. Is irqbalance problematic
for bcache?

Stefan

