From: Xiao Ni <xni@redhat.com>
To: NeilBrown <neilb@suse.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
Date: Thu, 14 Sep 2017 00:55:15 -0400 (EDT)
Message-ID: <446747392.10694917.1505364915884.JavaMail.zimbra@redhat.com>
In-Reply-To: <87o9qe9p3j.fsf@notabene.neil.brown.name>
----- Original Message -----
> From: "NeilBrown" <neilb@suse.com>
> To: "Xiao Ni" <xni@redhat.com>
> Cc: linux-raid@vger.kernel.org
> Sent: Thursday, September 14, 2017 7:05:20 AM
> Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
>
> On Wed, Sep 13 2017, Xiao Ni wrote:
> >
> > Hi Neil
> >
> > Sorry for the bad news. The test is still running and it's stuck again.
>
> Any details? Anything at all? Just a little hint maybe?
>
> Just saying "it's stuck again" is very nearly useless.
>
Hi Neil
/var/log/messages doesn't show any useful information. I also enabled dynamic debug for raid5.c:
echo file raid5.c +p > /sys/kernel/debug/dynamic_debug/control
That doesn't produce any messages either.
It looks like a different problem.
[root@dell-pr1700-02 ~]# ps auxf | grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      8381  0.0  0.0      0     0 ?        D    Sep13   0:00  \_ [kworker/u8:1]
root      8966  0.0  0.0      0     0 ?        D    Sep13   0:00  \_ [jbd2/md0-8]
root       824  0.0  0.1 216856  8492 ?        Ss   Sep03   0:06 /usr/bin/abrt-watch-log -F BUG: WARNING: at WARNING: CPU: INFO: possible recursive locking detected ernel BUG at list_del corruption list_add corruption do_IRQ: stack overflow: ear stack overflow (cur: eneral protection fault nable to handle kernel ouble fault: RTNL: assertion failed eek! page_mapcount(page) went negative! adness at NETDEV WATCHDOG ysctl table check failed : nobody cared IRQ handler type mismatch Machine Check Exception: Machine check events logged divide error: bounds: coprocessor segment overrun: invalid TSS: segment not present: invalid opcode: alignment check: stack segment: fpu exception: simd exception: iret exception: /var/log/messages -- /usr/bin/abrt-dump-oops -xtD
root       836  0.0  0.0 195052  3200 ?        Ssl  Sep03   0:00 /usr/sbin/gssproxy -D
root      1225  0.0  0.0 106008  7436 ?        Ss   Sep03   0:00 /usr/sbin/sshd -D
root     12411  0.0  0.0 112672  2264 pts/0    S+   00:50   0:00  \_ grep --color=auto D
root      8987  0.0  0.0 109000  2728 pts/2    D+   Sep13   0:04  \_ dd if=/dev/urandom of=/mnt/md_test/testfile bs=1M count=1000
root      8983  0.0  0.0   7116  2080 ?        Ds   Sep13   0:00 /usr/sbin/mdadm --grow --continue /dev/md0
[root@dell-pr1700-02 ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 loop6[7] loop4[6] loop5[5](S) loop3[3] loop2[2] loop1[1] loop0[0]
      2039808 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6] [UUUUUU]
      [>....................]  reshape =  0.0% (1/509952) finish=1059.5min speed=7K/sec
unused devices: <none>
It looks like the reshape never starts. This time I didn't add the code that checks
mddev->suspended and active_stripes; I only applied the patches to the source.
Do you have any suggestions for what else I should check?
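To narrow down where the blocked tasks are waiting, the kernel stack of each task in uninterruptible sleep (STAT starting with D) could be collected. The following is only a sketch under some assumptions: the function names are made up for illustration, and /proc/<pid>/stack is readable only as root on a kernel built with CONFIG_STACKTRACE. Filtering on the STAT column also avoids the unrelated matches (sshd -D, gssproxy -D) that a plain "grep D" picks up.

```shell
# Keep only rows whose STAT column (2nd field) starts with "D", i.e. tasks
# in uninterruptible sleep. Reads "pid stat comm" rows on stdin and prints
# the matching PIDs, one per line.
filter_dstate() {
    awk '$2 ~ /^D/ { print $1 }'
}

# List the PIDs of D-state tasks on the live system. The separate -o options
# with empty headers suppress the ps header line.
dstate_pids() {
    ps -e -o pid= -o stat= -o comm= | filter_dstate
}

# Print the kernel stack of every blocked task. Assumes root and a kernel
# that exposes /proc/<pid>/stack; otherwise a placeholder is printed.
dump_dstate_stacks() {
    for pid in $(dstate_pids); do
        echo "== pid $pid ($(cat "/proc/$pid/comm" 2>/dev/null)) =="
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
    done
}
```

Running dump_dstate_stacks as root prints one stack per blocked task. If sysrq is enabled, "echo w > /proc/sysrq-trigger" is an alternative that makes the kernel log all blocked tasks to dmesg.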
Best Regards
Xiao