* md raid6 deadlock on write
       [not found] <20120629194600.GA23859@calhariz.com>
@ 2012-07-02 22:15 ` Jose Manuel dos Santos Calhariz
  2012-07-04  1:38   ` NeilBrown
  2012-07-04  2:43   ` Igor M Podlesny
  0 siblings, 2 replies; 6+ messages in thread
From: Jose Manuel dos Santos Calhariz @ 2012-07-02 22:15 UTC
  To: linux-raid; +Cc: ns-list


[-- Attachment #1.1: Type: text/plain, Size: 1766 bytes --]


We have a group of servers with LVM on top of a 16-drive RAID6.
Under normal workloads, the md RAID sometimes deadlocks on writes,
and only a power cycle recovers the machine.

The raid was created some time ago with something like:

  mdadm --create /dev/md2 --level=6 --raid-devices=16 /dev/sd[a-p]

Following an old discussion on this list,
http://www.spinics.net/lists/raid/msg37708.html, we confirmed that a
single fio command is enough to make the RAID deadlock.  The command
used was:

  fio --name=global --rw=randwrite --size=4G --bsrange=1k-128k \
      --filename=/dev/stor04-vg0/stressraid6 --name=job1 --name=job2 \
      --name=job3 --name=job4 --fsync=1000 --end_fsync=1

The running kernel is a vanilla 3.4.0 from kernel.org.  The problem
was found on kernels 3.4.0, 3.4.0-rc2 and 3.2.0.


On June 28, one of the servers was hit by this deadlock twice in a
row.  The first time was during normal operation, running kernel
3.4.0-rc2.  The second was after business hours, while running fio to
check whether the problem was solved in kernel 3.4.0.

For the deadlock triggered by running fio on kernel 3.4.0, we
observed the following on the RAID:

  - there were some read operations every 5 or 6 seconds,

  - increasing the stripe_cache_size allowed some extra IO,

  - we have the output of "SysRq : Show State", not attached because
    it is too big,

  - the output of "iostat -dx 1" is attached,

  - the "avgqu-sz" of the logical volume used for the fio tests was
    76280.00,

  - the output of "ps ax" is attached.
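
The stalled-queue and stripe_cache_size observations above can be
checked with a small shell sketch.  This is only an illustration, not
something that was run on stor04: the function names stalled_devices
and stripe_cache_bytes are made up here, the column positions assume
the iostat header shown in the attached log, and the memory estimate
assumes the formula given in the kernel md documentation (page size *
number of member devices * stripe_cache_size) with 4 KiB pages.

```shell
#!/bin/sh

# Read "iostat -dx" output on stdin and print the name and avgqu-sz of
# every device that has queued requests but completes no reads or writes.
# Assumed columns (as in the attached log):
#   Device rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
stalled_devices() {
    awk '
        # Skip column headers, the banner line and blank lines.
        $1 == "Device:" || NF != 12 { next }
        # r/s and w/s are both zero while the queue is not empty.
        ($4 + 0) == 0 && ($5 + 0) == 0 && ($9 + 0) > 0 { print $1, $9 }
    '
}

# Estimate the memory used by the md stripe cache, in bytes.
#   $1: stripe_cache_size (number of cache entries)
#   $2: number of member devices in the array
#   $3: page size in bytes (assumption: defaults to 4096)
stripe_cache_bytes() {
    echo $(( ${3:-4096} * $2 * $1 ))
}
```

It could be used as "iostat -dx 1 | stalled_devices".  Fed the later
reports of the attached log, it would list dm-2, dm-3, dm-17, dm-18,
dm-22 and dm-25.  The current stripe cache setting of md2 can be read
from /sys/block/md2/md/stripe_cache_size, and "stripe_cache_bytes 4096
16" gives the estimated cost of raising it to 4096 on a 16-drive array.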


        Jose Calhariz




-- 
--
"There are 3 sovereign powers:
God in heaven, the Pope in the Vatican and Dadá Maravilha in the penalty area."
--Dadá Maravilha

[-- Attachment #1.2: iostat-dx-1-20120628-2011.log --]
[-- Type: text/plain, Size: 24967 bytes --]

Linux 3.4.0 (stor04) 	06/28/2012 	_x86_64_	(8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda             546.26   112.81   92.19   39.10  5544.81  1178.33    51.21     1.77   13.48   1.63  21.37
sdb             545.78   112.87   93.14   38.89  5553.20  1177.24    50.98     1.72   13.00   1.59  20.98
sdc             543.02   116.08   92.99   39.58  5530.35  1208.50    50.83     1.76   13.27   1.63  21.65
sdd             541.38   116.82   93.32   39.24  5515.31  1211.62    50.74     1.70   12.80   1.59  21.14
sde             537.54   121.11   93.08   40.16  5481.87  1253.31    50.55     1.79   13.43   1.64  21.82
sdf             536.82   120.78   92.76   40.67  5475.16  1254.73    50.44     1.83   13.72   1.64  21.89
sdg             532.15   125.53   93.54   41.20  5440.44  1296.98    50.00     1.91   14.16   1.64  22.15
sdh             527.25   132.22   93.02   42.34  5398.18  1359.77    49.92     1.94   14.34   1.70  22.97
sdi             530.67   129.27   90.03   43.50  5401.08  1309.88    50.26     1.79   13.39   1.64  21.96
sdj             536.77   120.68   91.22   42.33  5453.39  1231.66    50.06     1.77   13.22   1.65  22.02
sdk             541.01   115.87   93.55   39.35  5504.97  1169.49    50.22     1.70   12.80   1.60  21.21
sdl             539.99   116.68   93.80   38.90  5501.38  1172.46    50.29     1.67   12.59   1.60  21.20
sdm             539.76   115.13   93.80   39.28  5501.33  1163.01    50.08     1.69   12.71   1.61  21.37
sdn             540.13   115.98   94.28   39.03  5505.44  1167.81    50.06     1.70   12.71   1.59  21.23
sdo             542.52   113.19   94.99   38.29  5534.05  1139.46    50.07     1.63   12.21   1.57  20.88
sdp             543.15   111.88   95.57   38.07  5547.23  1127.28    49.94     1.68   12.55   1.55  20.77
md0               0.00     0.00    0.30    0.68    20.11    42.32    63.67     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.01    0.00     0.08     0.00     7.85     0.00    0.00   0.00   0.00
md2               0.00     0.00  102.99  259.21  7619.98 11472.84    52.71     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    4.06    0.06   210.20     2.88    51.79     0.07   16.57   7.69   3.16
dm-1              0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    7.16   7.16   0.00
dm-2              0.00     0.00    4.29    1.27   442.44   168.75   109.92     0.26   26.22  24.22  13.47
dm-3              0.00     0.00    6.79    2.27   428.55    88.98    57.15     0.32   23.18  29.63  26.83
dm-4              0.00     0.00    0.05    0.00     0.34     0.25    12.26     0.00    2.69   1.40   0.01
dm-5              0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    6.06   6.06   0.00
dm-6              0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    4.26   4.26   0.00
dm-7              0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    3.29   3.29   0.00
dm-8              0.00     0.00    0.05    0.00     0.34     0.25    12.34     0.00    6.55   1.76   0.01
dm-9              0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    4.45   4.45   0.00
dm-10             0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    2.97   2.97   0.00
dm-11             0.00     0.00   15.20    3.08  1428.45   375.30    98.68     0.42   22.98   5.48  10.01
dm-12             0.00     0.00    0.00    0.00     0.01     0.00     7.29     0.00    6.91   6.91   0.00
dm-13             0.00     0.00    0.00    0.00     0.02     0.00     6.34     0.00    7.36   7.36   0.00
dm-14             0.00     0.00   19.58   26.06  1607.61  3065.03   102.38     2.47   54.04   3.94  17.99
dm-15             0.00     0.00   34.67   11.53  2656.75  1012.33    79.42     1.22   26.43   2.69  12.42
dm-16             0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    4.06   4.06   0.00
dm-17             0.00     0.00    9.98   47.20   778.63  4134.18    85.92     3.31   47.92  15.47  88.46
dm-18             0.00     0.00    0.05    0.27     0.47    76.22   241.46     0.76  224.90 418.73  13.30
dm-19             0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    5.61   5.61   0.00
dm-20             0.00     0.00    0.03    0.00     0.32     0.25    16.13     0.00    2.22   1.59   0.01
dm-21             0.00     0.00    0.00    0.00     0.01     0.00     8.00     0.00    4.13   4.13   0.00
dm-22             0.00     0.00    0.05    1.74     0.41   594.38   332.04     1.49   87.57  65.00  11.64
dm-23             0.00     0.00    0.04    1.33     0.32   639.87   468.33     0.31  228.77   9.59   1.31
sdq               0.00    75.12    0.05    0.59     0.40   605.75   941.91     2.16 3358.32  29.27   1.88
dm-24             0.00     0.00    0.00    0.00     0.02     0.00     9.71     0.00    2.52   2.52   0.00
dm-25             0.00     0.00    8.11  164.27    64.90  1314.19     8.00    88.64  109.82   0.85  14.72

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdi               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdj               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdk               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdo               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-11             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-12             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-13             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-14             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-15             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-16             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-17             0.00     0.00    0.00    0.00     0.00     0.00     0.00     5.00    0.00   0.00 100.00
dm-18             0.00     0.00    0.00    0.00     0.00     0.00     0.00    10.00    0.00   0.00 100.00
dm-19             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00    0.00     0.00     0.00     0.00    18.00    0.00   0.00 100.00
dm-23             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdq               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-24             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-25             0.00     0.00    0.00    0.00     0.00     0.00     0.00 76280.00    0.00   0.00 100.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdi               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdj               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdk               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdo               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-11             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-12             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-13             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-14             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-15             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-16             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-17             0.00     0.00    0.00    0.00     0.00     0.00     0.00     5.00    0.00   0.00 100.00
dm-18             0.00     0.00    0.00    0.00     0.00     0.00     0.00    10.00    0.00   0.00 100.00
dm-19             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00    0.00     0.00     0.00     0.00    18.00    0.00   0.00 100.00
dm-23             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdq               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-24             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-25             0.00     0.00    0.00    0.00     0.00     0.00     0.00 76280.00    0.00   0.00 100.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdi               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdj               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdk               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdo               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.10
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.10
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-11             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-12             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-13             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-14             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-15             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-16             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-17             0.00     0.00    0.00    0.00     0.00     0.00     0.00     5.00    0.00   0.00 100.10
dm-18             0.00     0.00    0.00    0.00     0.00     0.00     0.00    10.01    0.00   0.00 100.10
dm-19             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00    0.00     0.00     0.00     0.00    18.02    0.00   0.00 100.10
dm-23             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdq               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-24             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-25             0.00     0.00    0.00    0.00     0.00     0.00     0.00 76356.28    0.00   0.00 100.10

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdi               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdj               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdk               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdo               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-11             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-12             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-13             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-14             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-15             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-16             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-17             0.00     0.00    0.00    0.00     0.00     0.00     0.00     5.00    0.00   0.00 100.00
dm-18             0.00     0.00    0.00    0.00     0.00     0.00     0.00    10.00    0.00   0.00 100.00
dm-19             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00    0.00     0.00     0.00     0.00    18.00    0.00   0.00 100.00
dm-23             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdq               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-24             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-25             0.00     0.00    0.00    0.00     0.00     0.00     0.00 76280.00    0.00   0.00 100.00


[-- Attachment #1.3: ps_ax.txt --]
[-- Type: text/plain, Size: 11770 bytes --]

  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 init [2]  
    2 ?        S      0:00 [kthreadd]
    3 ?        S      0:00 [ksoftirqd/0]
    6 ?        S      0:00 [migration/0]
    7 ?        S      0:00 [migration/1]
    9 ?        S      0:00 [ksoftirqd/1]
   11 ?        S      0:00 [migration/2]
   13 ?        S      0:00 [ksoftirqd/2]
   14 ?        S      0:00 [migration/3]
   16 ?        S      0:00 [ksoftirqd/3]
   17 ?        S      0:00 [migration/4]
   19 ?        S      0:00 [ksoftirqd/4]
   20 ?        S      0:00 [migration/5]
   22 ?        S      0:00 [ksoftirqd/5]
   23 ?        S      0:00 [migration/6]
   25 ?        S      0:00 [ksoftirqd/6]
   26 ?        S      0:00 [migration/7]
   28 ?        S      0:00 [ksoftirqd/7]
   29 ?        S<     0:00 [khelper]
  199 ?        S      0:00 [sync_supers]
  201 ?        S      0:00 [bdi-default]
  203 ?        S<     0:00 [kblockd]
  350 ?        S<     0:00 [ata_sff]
  360 ?        S<     0:00 [md]
  390 ?        S      0:00 [kworker/6:1]
  391 ?        S      0:01 [kworker/7:1]
  434 ?        Ss     0:00 sshd: ctpm [priv]
  446 ?        S      0:00 sshd: ctpm@pts/9 
  447 pts/9    Ss+    0:00 -bash
  528 ?        S      0:00 [khungtaskd]
  533 ?        S      0:00 [kswapd0]
  597 ?        S      0:00 [fsnotify_mark]
  623 ?        S<     0:00 [xfsalloc]
  624 ?        S<     0:00 [xfs_mru_cache]
  626 ?        S<     0:00 [xfslogd]
  637 ?        S<     0:00 [crypto]
  701 pts/10   Ss+    0:00 /bin/bash
  809 ?        S      0:00 [scsi_eh_0]
  812 ?        S      0:00 [scsi_eh_1]
  815 ?        S      0:00 [scsi_eh_2]
  818 ?        S      0:00 [scsi_eh_3]
  821 ?        S      0:00 [scsi_eh_4]
  824 ?        S      0:00 [scsi_eh_5]
  837 ?        S<     0:00 [mpt_poll_0]
  838 ?        S<     0:00 [mpt/0]
  839 ?        S      0:00 [scsi_eh_6]
  934 ?        S<     0:00 [mpt_poll_1]
  935 ?        S<     0:00 [mpt/1]
  968 ?        S      0:00 [scsi_eh_7]
 1005 ?        S      0:00 [kworker/2:1]
 1075 ?        S<     0:00 [kpsmoused]
 1088 ?        S<     0:00 [edac-poller]
 1119 ?        S<     0:00 [deferwq]
 1275 ?        S      0:00 [khubd]
 1277 ?        S      0:00 [kworker/7:2]
 1293 ?        S      0:00 [kworker/5:2]
 1352 ?        S      0:00 [kworker/6:2]
 1485 ?        S      0:00 [kworker/4:1]
 1494 ?        S      0:00 [md0_raid1]
 1507 ?        S      0:00 [md1_raid1]
 1531 ?        S      7:45 [md2_raid6]
 1551 ?        S      0:00 [xfsbufd/md0]
 1552 ?        S<     0:00 [xfs-data/md0]
 1553 ?        S<     0:00 [xfs-conv/md0]
 1554 ?        S      0:03 [xfsaild/md0]
 1599 ?        S<s    0:00 udevd --daemon
 2357 ?        S<     0:00 [kmpathd]
 2358 ?        S<     0:00 [kmpath_handlerd]
 2651 ?        S<     0:00 [kdmflush]
 2669 ?        S<     0:00 [kdmflush]
 2686 ?        S<     0:00 [kdmflush]
 2704 ?        S<     0:00 [kdmflush]
 2722 ?        S<     0:00 [kdmflush]
 2740 ?        S<     0:00 [kdmflush]
 2757 ?        S<     0:00 [kdmflush]
 2774 ?        S<     0:00 [kdmflush]
 2791 ?        S<     0:00 [kdmflush]
 2815 ?        S<     0:00 [kdmflush]
 2832 ?        S<     0:00 [kdmflush]
 2849 ?        S<     0:00 [kdmflush]
 2866 ?        S<     0:00 [kdmflush]
 2883 ?        S<     0:00 [kdmflush]
 2900 ?        S<     0:00 [kdmflush]
 2917 ?        S<     0:00 [kdmflush]
 2934 ?        S<     0:00 [kdmflush]
 2952 ?        S<     0:00 [kdmflush]
 2970 ?        S<     0:00 [kdmflush]
 2994 ?        S<     0:00 [kdmflush]
 3012 ?        S<     0:00 [kdmflush]
 3030 ?        S<     0:00 [kdmflush]
 3045 ?        S      0:00 [flush-9:0]
 3048 ?        S<     0:00 [kdmflush]
 3065 ?        S<     0:00 [kdmflush]
 3101 ?        S      0:00 [xfsbufd/dm-8]
 3102 ?        S<     0:00 [xfs-data/dm-8]
 3103 ?        S<     0:00 [xfs-conv/dm-8]
 3104 ?        S      0:00 [xfsaild/dm-8]
 3111 ?        D      1:29 [md2_resync]
 3112 ?        S      0:00 [xfsbufd/dm-0]
 3113 ?        S<     0:00 [xfs-data/dm-0]
 3114 ?        S<     0:00 [xfs-conv/dm-0]
 3115 ?        S      0:00 [xfsaild/dm-0]
 3122 ?        S      0:00 [xfsbufd/dm-4]
 3123 ?        S<     0:00 [xfs-data/dm-4]
 3124 ?        S<     0:00 [xfs-conv/dm-4]
 3125 ?        S      0:00 [xfsaild/dm-4]
 3126 ?        S      0:00 [xfsbufd/dm-18]
 3127 ?        S<     0:00 [xfs-data/dm-18]
 3128 ?        S<     0:00 [xfs-conv/dm-18]
 3129 ?        S      0:03 [xfsaild/dm-18]
 3136 ?        S      0:00 [xfsbufd/dm-20]
 3137 ?        S<     0:00 [xfs-data/dm-20]
 3138 ?        S<     0:00 [xfs-conv/dm-20]
 3139 ?        S      0:00 [xfsaild/dm-20]
 3250 ?        Ss     0:00 /sbin/portmap
 3263 ?        S<     0:00 [rpciod]
 3265 ?        S<     0:00 [nfsiod]
 3272 ?        Ss     0:00 /usr/sbin/rpc.idmapd
 3373 ?        S<     0:00 [iscsi_eh]
 3377 ?        Ss     0:00 /usr/sbin/iscsid
 3378 ?        S<Ls   0:00 /usr/sbin/iscsid
 3454 ?        Sl     0:00 /usr/sbin/rsyslogd -c4
 3483 ?        S      0:00 [lockd]
 3484 ?        S<     0:00 [nfsd4]
 3485 ?        S<     0:00 [nfsd4_callbacks]
 3486 ?        S      0:01 [nfsd]
 3487 ?        D      0:01 [nfsd]
 3488 ?        D      0:02 [nfsd]
 3489 ?        D      0:06 [nfsd]
 3490 ?        D      0:02 [nfsd]
 3491 ?        D      0:01 [nfsd]
 3492 ?        D      0:03 [nfsd]
 3493 ?        D      0:03 [nfsd]
 3494 ?        D      0:01 [nfsd]
 3495 ?        S      0:05 [nfsd]
 3496 ?        D      0:06 [nfsd]
 3497 ?        S      0:04 [nfsd]
 3498 ?        S      0:03 [nfsd]
 3499 ?        D      0:06 [nfsd]
 3500 ?        D      0:02 [nfsd]
 3501 ?        D      0:01 [nfsd]
 3577 ?        Ss     0:00 /usr/sbin/acpid
 3578 ?        Ss     0:00 /usr/sbin/rpc.mountd --manage-gids
 3605 ?        SLl    0:00 /sbin/multipathd
 3625 ?        Ss     0:00 /usr/sbin/atd
 3632 ?        Ss     0:00 /sbin/mdadm --monitor --pid-file /var/run/mdadm/monitor.pid --daemonise --scan --syslog
 3643 ?        Ss     0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
 3661 ?        Ss     0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 106:110
 3697 ?        Ss     0:00 ha_logd: read process        
 3703 ?        Ss     0:00 /usr/sbin/cron
 3718 ?        S      0:00 ha_logd: write process       
 3789 ?        Ss     0:00 /usr/sbin/sshd
 3857 ?        Ssl    0:03 /usr/bin/ceph-mds -i stor04 --pid-file /var/run/ceph/mds.stor04.pid -c /etc/ceph/ceph.conf
 4012 ?        S      0:00 [kworker/5:0]
 4161 ?        Ss     0:00 /usr/lib/postfix/master
 4168 ?        S      0:00 qmgr -l -t fifo -u
 4174 ?        S<     0:00 [target_completi]
 4176 ?        S      0:00 [LIO_rd_mcp]
 4201 ?        S      0:01 [LIO_iblock]
 4222 ?        S      0:00 [LIO_iblock]
 4243 ?        S      0:00 [LIO_iblock]
 4264 ?        S      0:00 [LIO_iblock]
 4285 ?        D      0:02 [LIO_iblock]
 4306 ?        S      0:00 [LIO_iblock]
 4327 ?        S      0:00 [LIO_iblock]
 4348 ?        S      0:00 [LIO_iblock]
 4369 ?        S      0:00 [LIO_iblock]
 4390 ?        S      0:01 [LIO_iblock]
 4411 ?        S      0:15 [LIO_iblock]
 4437 ?        S      0:12 [LIO_iblock]
 4465 ?        S      0:00 [LIO_iblock]
 4486 ?        D      0:30 [LIO_iblock]
 4510 ?        S      0:01 [LIO_iblock]
 4527 ?        S      0:08 [iscsi_ttx]
 4528 ?        D      0:11 [iscsi_trx]
 4529 ?        S      0:00 [iscsi_ttx]
 4530 ?        S      0:00 [iscsi_trx]
 4531 ?        S      0:32 [iscsi_ttx]
 4532 ?        D      1:56 [iscsi_trx]
 4533 ?        S      0:00 [iscsi_ttx]
 4534 ?        S      0:00 [iscsi_trx]
 4536 ?        S      0:00 [iscsi_np]
 4551 ?        D      0:00 [iscsi_np]
 4679 ?        S      0:00 [iscsi_ttx]
 4680 ?        S      0:00 [iscsi_trx]
 4682 ?        S      0:03 [iscsi_ttx]
 4683 ?        S      0:03 [iscsi_trx]
 4684 ?        S      0:18 [iscsi_ttx]
 4685 ?        S      0:41 [iscsi_trx]
 4686 ?        S      0:00 [iscsi_ttx]
 4687 ?        S      0:00 [iscsi_trx]
 4688 ?        S      0:02 [iscsi_ttx]
 4689 ?        D      0:03 [iscsi_trx]
 4690 ?        S      0:05 [iscsi_ttx]
 4691 ?        D      0:15 [iscsi_trx]
 4704 ?        S      0:00 /usr/sbin/smartd --pidfile /var/run/smartd.pid --interval=1800
 4782 ?        Ss     0:00 /usr/sbin/munin-node
 4791 tty1     Ss+    0:00 /sbin/getty 38400 tty1
 4792 tty2     Ss+    0:00 /sbin/getty 38400 tty2
 4793 tty3     Ss+    0:00 /sbin/getty 38400 tty3
 4794 tty4     Ss+    0:00 /sbin/getty 38400 tty4
 4795 tty5     Ss+    0:00 /sbin/getty 38400 tty5
 4796 tty6     Ss+    0:00 /sbin/getty 38400 tty6
 5680 ?        S      0:00 [scsi_eh_8]
 5681 ?        S<     0:00 [iscsi_q_8]
 5682 ?        S<     0:00 [scsi_wq_8]
 5685 ?        S      0:27 [iscsi_ttx]
 5686 ?        S      1:27 [iscsi_trx]
 5698 ?        S<     0:00 [kdmflush]
 5787 ?        S      0:00 [iscsi_ttx]
 5788 ?        S      0:00 [iscsi_trx]
 6499 ?        S      0:20 [iscsi_ttx]
 6500 ?        S      0:48 [iscsi_trx]
 8136 ?        Ss     0:00 sshd: root@pts/7 
 8151 pts/7    Ss     0:00 -bash
 8309 pts/6    Ss+    0:00 /bin/bash
 9165 pts/7    R+     0:00 ps ax
11042 ?        S      0:00 pickup -l -t fifo -u -c
11686 ?        S      0:00 [kworker/0:2]
16121 ?        S      0:01 [iscsi_ttx]
16122 ?        S      0:08 [iscsi_trx]
17509 ?        S      0:00 [kworker/3:1]
20047 ?        S      0:00 [kworker/2:0]
20052 ?        Ss     0:00 sshd: root@pts/2 
20529 pts/2    Ss     0:00 -bash
20951 ?        S      0:00 [kworker/3:2]
21402 ?        S      0:00 [kworker/u:2]
21475 ?        S<     0:00 [kdmflush]
21476 ?        S<     0:00 udevd --daemon
22312 pts/0    S+     0:00 screen watch cat stripe_cache_active stripe_cache_size
22313 ?        Ss     0:00 SCREEN watch cat stripe_cache_active stripe_cache_size
22314 pts/1    Ss+    0:02 watch cat stripe_cache_active stripe_cache_size
22469 pts/2    S+     0:00 screen
22470 ?        Ss     0:01 SCREEN
22471 pts/3    Ss     0:00 /bin/bash
23111 pts/3    S+     0:02 watch cat /sys/block/md2/md/stripe_cache_active /sys/block/md2/md/stripe_cache_size
23129 pts/4    Ss     0:00 /bin/bash
23168 ?        D      0:04 [flush-253:25]
23283 pts/5    Ss     0:00 /bin/bash
23297 pts/5    S+     0:01 iostat -k 2
24909 pts/4    S+     0:02 fio --name=global --rw=randwrite --size=4G --bsrange=1k-128k --filename=/dev/stor04-vg0/stressraid6 --name=job1 --name=job2 --name=job3 --name=job4 --fsync=1000 --end_fsync=1
24910 ?        Ds     0:01 fio --name=global --rw=randwrite --size=4G --bsrange=1k-128k --filename=/dev/stor04-vg0/stressraid6 --name=job1 --name=job2 --name=job3 --name=job4 --fsync=1000 --end_fsync=1
24911 ?        Ds     0:01 fio --name=global --rw=randwrite --size=4G --bsrange=1k-128k --filename=/dev/stor04-vg0/stressraid6 --name=job1 --name=job2 --name=job3 --name=job4 --fsync=1000 --end_fsync=1
24912 ?        Ds     0:01 fio --name=global --rw=randwrite --size=4G --bsrange=1k-128k --filename=/dev/stor04-vg0/stressraid6 --name=job1 --name=job2 --name=job3 --name=job4 --fsync=1000 --end_fsync=1
24913 ?        Ds     0:08 fio --name=global --rw=randwrite --size=4G --bsrange=1k-128k --filename=/dev/stor04-vg0/stressraid6 --name=job1 --name=job2 --name=job3 --name=job4 --fsync=1000 --end_fsync=1
25102 ?        D      0:00 [kworker/4:2]
25890 ?        S      0:00 [kworker/0:1]
25927 ?        S      0:00 [kworker/1:6]
25929 ?        S      0:00 [kworker/1:8]
29686 ?        S      0:00 [kworker/4:0]
29977 ?        S      0:00 [kworker/4:4]
30101 ?        S      0:00 [kworker/u:1]
31244 ?        Ss     0:00 sshd: root@pts/0 
31250 pts/0    Ss     0:00 -bash
31970 pts/8    Ss+    0:00 /bin/bash
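
In the listing above, the rows worth attention are those whose STAT column starts with `D` (uninterruptible sleep), such as `md2_resync`, several `nfsd` threads, and the four `fio` jobs. The following is a minimal sketch, not part of the original report, of how such rows could be pulled out of saved `ps ax` output. The helper name `blocked_tasks` and the regular expression are illustrative assumptions, and the sample rows are copied from the listing with one command line shortened.

```python
import re

# One row of `ps ax` output: PID, TTY, STAT, TIME, COMMAND.
# The header row does not match because its first field is not numeric.
PS_ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+:\d+)\s+(.*?)\s*$")


def blocked_tasks(ps_text):
    """Return (pid, stat, command) for rows whose STAT starts with 'D'."""
    rows = []
    for line in ps_text.splitlines():
        m = PS_ROW.match(line)
        if m and m.group(3).startswith("D"):
            rows.append((int(m.group(1)), m.group(3), m.group(5)))
    return rows


# Sample rows taken from the listing (fio command line shortened).
sample = """\
 3111 ?        D      1:29 [md2_resync]
 3486 ?        S      0:01 [nfsd]
 3487 ?        D      0:01 [nfsd]
24910 ?        Ds     0:01 fio --name=global --rw=randwrite
"""

for pid, stat, cmd in blocked_tasks(sample):
    print(pid, stat, cmd)
```

Run against the full attachment, a filter like this would reduce the listing to the stuck tasks, which is usually the part most relevant to a deadlock report.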

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


* Re: md raid6 deadlock on write
  2012-07-02 22:15 ` md raid6 deadlock on write Jose Manuel dos Santos Calhariz
@ 2012-07-04  1:38   ` NeilBrown
       [not found]     ` <20120704102411.GG15287@calhariz.com>
  2012-07-04  2:43   ` Igor M Podlesny
  1 sibling, 1 reply; 6+ messages in thread
From: NeilBrown @ 2012-07-04  1:38 UTC (permalink / raw)
  To: jose.spam; +Cc: jose.calhariz, linux-raid, ns-list

[-- Attachment #1: Type: text/plain, Size: 912 bytes --]

On Mon, 2 Jul 2012 23:15:08 +0100 Jose Manuel dos Santos Calhariz
<jose.calhariz@netvisao.pt> wrote:

> 
> We have a group of servers with LVM over a 16-drive RAID6.
> Under normal workloads, the md RAID sometimes deadlocks on
> writes, and only a power cycle recovers the machine.

This might be fixed by the following commit which was recently included in
3.5-rc. If you could test with that, I'd appreciate it.

> 
>   - there is information from "SysRq : Show State", not attached
>     because it is too big,

How big is too big?  It is very hard to see if there is anything useful in
there if I cannot see it....

NeilBrown


> 
>   - attached is the output of "iostat -dx 1",
> 
>   - the "avgqu-sz" of the logical volume used for fio tests was
>     76280.00,
> 
>   - attached is the output of "ps ax".
> 
> 
>         Jose Calhariz
> 
> 
> 
> 


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]


* Re: md raid6 deadlock on write
  2012-07-02 22:15 ` md raid6 deadlock on write Jose Manuel dos Santos Calhariz
  2012-07-04  1:38   ` NeilBrown
@ 2012-07-04  2:43   ` Igor M Podlesny
  1 sibling, 0 replies; 6+ messages in thread
From: Igor M Podlesny @ 2012-07-04  2:43 UTC (permalink / raw)
  To: jose.spam; +Cc: linux-raid, ns-list

On 3 July 2012 06:15, Jose Manuel dos Santos Calhariz
<jose.calhariz@netvisao.pt> wrote:
[...]
>   - there is information from "SysRq : Show State", not attached
>     because it is too big,

   You can get (hopefully) much more than that just using netconsole
-- http://wiki.openvz.org/Remote_console_setup#Netconsole

--


* Re: md raid6 deadlock on write
       [not found]     ` <20120704102411.GG15287@calhariz.com>
@ 2012-07-06 13:55       ` Jose Manuel dos Santos Calhariz
  2012-07-09  3:46         ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Jose Manuel dos Santos Calhariz @ 2012-07-06 13:55 UTC (permalink / raw)
  To: jose.spam; +Cc: NeilBrown, jose.calhariz, linux-raid, ns-list

[-- Attachment #1: Type: text/plain, Size: 1428 bytes --]

On Wed, Jul 04, 2012 at 11:24:11AM +0100, Jose Manuel dos Santos Calhariz wrote:
> On Wed, Jul 04, 2012 at 11:38:11AM +1000, NeilBrown wrote:
> > On Mon, 2 Jul 2012 23:15:08 +0100 Jose Manuel dos Santos Calhariz
> > <jose.calhariz@netvisao.pt> wrote:
> > 
> > > 
> > > We have a group of servers with LVM over a 16-drive RAID6.
> > > Under normal workloads, the md RAID sometimes deadlocks on
> > > writes, and only a power cycle recovers the machine.
> > 
> > This might be fixed by the following commit which was recently included in
> > 3.5-rc. If you could test with that, I'd appreciate it.
> 
> We will do it at the first opportunity.

We have two machines that have been running fio for 24 hours without
problems.  So the bug seems to be fixed, thank you.

Any possibility of the fix being ported to kernel 3.2?

> 
> > 
> > > 
> > >   - there is information from "SysRq : Show State", not attached
> > >     because it is too big,
> > 
> > How big is too big?  It is very hard to see if there is anything useful in
> > there if I cannot see it....
> 
> Big enough to be blocked by the mailing list ;-)
> 
> I am attaching it now, so you can see it.
> 
> > 
> > NeilBrown
> > 
> 
> 
>         Jose Calhariz
> 

      Jose Calhariz





-- 
--
"There are 3 sovereign powers:
God in heaven, the Pope in the Vatican, and Dadá Maravilha in the penalty area."
--Dadá Maravilha

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


* Re: md raid6 deadlock on write
  2012-07-06 13:55       ` Jose Manuel dos Santos Calhariz
@ 2012-07-09  3:46         ` NeilBrown
  2012-07-09 11:22           ` Jose Manuel dos Santos Calhariz
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2012-07-09  3:46 UTC (permalink / raw)
  To: jose.spam; +Cc: jose.calhariz, linux-raid, ns-list

[-- Attachment #1: Type: text/plain, Size: 1215 bytes --]

On Fri, 6 Jul 2012 14:55:24 +0100 Jose Manuel dos Santos Calhariz
<jose.calhariz@netvisao.pt> wrote:

> On Wed, Jul 04, 2012 at 11:24:11AM +0100, Jose Manuel dos Santos Calhariz wrote:
> > On Wed, Jul 04, 2012 at 11:38:11AM +1000, NeilBrown wrote:
> > > On Mon, 2 Jul 2012 23:15:08 +0100 Jose Manuel dos Santos Calhariz
> > > <jose.calhariz@netvisao.pt> wrote:
> > > 
> > > > 
> > > > We have a group of servers with LVM over a 16-drive RAID6.
> > > > Under normal workloads, the md RAID sometimes deadlocks on
> > > > writes, and only a power cycle recovers the machine.
> > > 
> > > This might be fixed by the following commit which was recently included in
> > > 3.5-rc. If you could test with that, I'd appreciate it.
> > 
> > We will do it at the first opportunity.
> 
> We have two machines that have been running fio for 24 hours without
> problems.  So the bug seems to be fixed, thank you.

Thanks for testing and reporting.


> 
> Any possibility of the fix being ported to kernel 3.2?

It seems that I didn't tag that patch for -stable so it won't automatically
get included.  I'll send it the old way - maybe it'll get into 3.2.23.

Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]


* Re: md raid6 deadlock on write
  2012-07-09  3:46         ` NeilBrown
@ 2012-07-09 11:22           ` Jose Manuel dos Santos Calhariz
  0 siblings, 0 replies; 6+ messages in thread
From: Jose Manuel dos Santos Calhariz @ 2012-07-09 11:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: jose.spam, jose.calhariz, linux-raid, ns-list

[-- Attachment #1: Type: text/plain, Size: 1455 bytes --]

On Mon, Jul 09, 2012 at 01:46:53PM +1000, NeilBrown wrote:
> On Fri, 6 Jul 2012 14:55:24 +0100 Jose Manuel dos Santos Calhariz
> <jose.calhariz@netvisao.pt> wrote:
> 
> > On Wed, Jul 04, 2012 at 11:24:11AM +0100, Jose Manuel dos Santos Calhariz wrote:
> > > On Wed, Jul 04, 2012 at 11:38:11AM +1000, NeilBrown wrote:
> > > > On Mon, 2 Jul 2012 23:15:08 +0100 Jose Manuel dos Santos Calhariz
> > > > <jose.calhariz@netvisao.pt> wrote:
> > > > 
> > > > > 
> > > > > We have a group of servers with LVM over a 16-drive RAID6.
> > > > > Under normal workloads, the md RAID sometimes deadlocks on
> > > > > writes, and only a power cycle recovers the machine.
> > > > 
> > > > This might be fixed by the following commit which was recently included in
> > > > 3.5-rc. If you could test with that, I'd appreciate it.
> > > 
> > > We will do it at the first opportunity.
> > 
> > We have two machines that have been running fio for 24 hours without
> > problems.  So the bug seems to be fixed, thank you.
> 
> Thanks for testing and reporting.
> 
> 
> > 
> > Any possibility of the fix being ported to kernel 3.2?
> 
> It seems that I didn't tag that patch for -stable so it won't automatically
> get included.  I'll send it the old way - maybe it'll get into
> 3.2.23.

That would be great. 

> 
> Thanks,
> NeilBrown
> 

     Jose Calhariz

-- 
--
Laziness is the habit of resting before fatigue.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


Thread overview: 6+ messages
     [not found] <20120629194600.GA23859@calhariz.com>
2012-07-02 22:15 ` md raid6 deadlock on write Jose Manuel dos Santos Calhariz
2012-07-04  1:38   ` NeilBrown
     [not found]     ` <20120704102411.GG15287@calhariz.com>
2012-07-06 13:55       ` Jose Manuel dos Santos Calhariz
2012-07-09  3:46         ` NeilBrown
2012-07-09 11:22           ` Jose Manuel dos Santos Calhariz
2012-07-04  2:43   ` Igor M Podlesny
