From: Guoqing Jiang <gqjiang@suse.com>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org, NeilBrown <neilb@suse.com>
Subject: Re: [PATCH 3/3] MD: hold mddev lock for md-cluster receive thread
Date: Wed, 3 Aug 2016 11:18:24 +0800
Message-ID: <57A16280.9050503@suse.com>
In-Reply-To: <20160802224456.GD98613@kernel.org>



On 08/03/2016 06:44 AM, Shaohua Li wrote:
> On Tue, Aug 02, 2016 at 05:52:41PM +0800, Guoqing Jiang wrote:
>>
>> On 08/02/2016 05:45 AM, Shaohua Li wrote:
>>> On Mon, Aug 01, 2016 at 04:38:59PM +0800, Guoqing Jiang wrote:
>>>> Hi,
>>>>
>>>> On 07/31/2016 07:54 AM, shli@kernel.org wrote:
>>>>> From: Shaohua Li <shli@fb.com>
>>>>>
>>>>> md-cluster's receive thread calls .quiesce too, so let it hold the mddev lock.
>>>> I'd suggest holding off on this patchset; I can trigger a lock problem
>>>> easily with it applied. Take a resyncing clustered raid1 as an example.
>>>>
>>>> The md127_raid1 thread holds reconfig_mutex and then updates the sb, so it
>>>> needs the DLM token lock. Meanwhile, the md127_resync thread holds the
>>>> token lock and wants EX on the ack lock, but recv_daemon can't release the
>>>> ack lock since it can't get reconfig_mutex.
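
(To spell out the circular wait described above:

	md127_raid1:  holds reconfig_mutex, waits for the DLM token lock
	md127_resync: holds the token lock, waits for EX on the ack lock
	recv_daemon:  would release the ack lock, but waits for reconfig_mutex

a classic three-way lock cycle, so all three threads block forever.)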
>>> Thanks, I'll drop this one. The other two patches are still safe for
>>> md-cluster, right?
>> From the latest test, I can't find lock issues with the first two patches,
>> but I suspect they could have a side effect on resync performance.
> No need to worry about that; the .quiesce() call is way heavier than
> holding/releasing the mutex.
>
>>> I really hope to have consistent locking for .quiesce. For
>>> process_recvd_msg, I'm wondering what's protecting the data? For example,
>>> md-cluster uses md_find_rdev_nr_rcu, which accesses the disks list without
>>> locking. Is there a race?
>> Yes, it should be protected by the RCU lock; I will post a patch for it.
>> Thanks for the reminder.
>>
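
(For reference, the fix I have in mind looks roughly like the following,
taking process_remove_disk() as an example; this is only a sketch, not the
final patch:

	rcu_read_lock();
	rdev = md_find_rdev_nr_rcu(mddev, le32_to_cpu(msg->raid_slot));
	if (rdev)
		md_kick_rdev_from_array(rdev);
	rcu_read_unlock();

i.e. the lookup and every use of the returned rdev stay inside a single RCU
read-side critical section.)
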
>>> Does it work if we move the mddev lock to
>>> process_recvd_msg?
>> I tried that, but it still has a lock issue, e.g. when nodes B and C are in
>> the "resync=PENDING" state and we then try to stop the resyncing array on
>> node A.
> Can you elaborate?

I didn't get as far as yesterday's test; I can't even assemble the clustered
raid1 on the other nodes.

1. node135: mdadm --create md0 --bitmap=clustered --raid-devices=2 --level=mirror /dev/vdb /dev/vdc
2. node240 and node244 then try to assemble it, but both of them hang.

betalinux135:~ # cat /proc/mdstat
Personalities : [raid1]
md127 : active raid1 vdc[1] vdb[0]
       2095104 blocks super 1.2 [2/2] [UU]
       [=>...................]  resync =  6.2% (130816/2095104) finish=1.5min speed=21802K/sec
       bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
betalinux135:~ # ssh betalinux240
Last login: Wed Aug  3 11:11:47 2016 from 192.168.100.1
betalinux240:~ # ps aux|grep md|grep D
root      1901  0.0  0.2  20896  2592 pts/0    D+   11:12   0:00 mdadm --assemble md0 /dev/vdb /dev/vdc
root      1914  0.0  0.2  19852  2032 ?        S    11:12   0:00 /sbin/mdadm --incremental --export /dev/vdb --offroot ${DEVLINKS}
root      1915  0.0  0.1  19852  1940 ?        S    11:12   0:00 /sbin/mdadm --incremental --export /dev/vdc --offroot ${DEVLINKS}
betalinux240:~ # cat /proc/1901/stack
[<ffffffffa069d4cb>] dlm_lock_sync+0x6b/0x80 [md_cluster]
[<ffffffffa069e7e6>] join+0x286/0x430 [md_cluster]
[<ffffffffa068a9f4>] bitmap_create+0x5f4/0x980 [md_mod]
[<ffffffffa0682f35>] md_run+0x595/0xa60 [md_mod]
[<ffffffffa06834cf>] do_md_run+0xf/0xb0 [md_mod]
[<ffffffffa0686781>] md_ioctl+0x11b1/0x1680 [md_mod]
[<ffffffff812ed158>] blkdev_ioctl+0x258/0x920
[<ffffffff8122f81d>] block_ioctl+0x3d/0x40
[<ffffffff8120ac0d>] do_vfs_ioctl+0x2cd/0x4a0
[<ffffffff8120ae54>] SyS_ioctl+0x74/0x80
[<ffffffff815e936e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff
betalinux240:~ # exit
logout
Connection to betalinux240 closed.
betalinux135:~ # ssh betalinux244
Last login: Wed Aug  3 11:11:49 2016 from 192.168.100.1
betalinux244:~ # ps aux|grep md|grep D
root      1903  0.0  0.2  20896  2660 pts/0    D+   11:12   0:00 mdadm --assemble md0 /dev/vdb /dev/vdc
root      1923  0.0  0.2  19852  2112 ?        S    11:12   0:00 /sbin/mdadm --incremental --export /dev/vdc --offroot ${DEVLINKS}
root      1928  0.0  0.2  19852  2092 ?        S    11:12   0:00 /sbin/mdadm --incremental --export /dev/vdb --offroot ${DEVLINKS}
root      1938  0.0  0.0      0     0 ?        D    11:12   0:00 [md0_cluster_rec]
betalinux244:~ # cat /proc/1903/stack
[<ffffffffa06904cb>] dlm_lock_sync+0x6b/0x80 [md_cluster]
[<ffffffffa06904fb>] lock_token+0x1b/0x50 [md_cluster]
[<ffffffffa06905fd>] metadata_update_start+0x3d/0xb0 [md_cluster]
[<ffffffffa06751ee>] md_update_sb.part.50+0x8e/0x810 [md_mod]
[<ffffffffa067646e>] md_allow_write+0x6e/0xc0 [md_mod]
[<ffffffffa0676505>] do_md_run+0x45/0xb0 [md_mod]
[<ffffffffa0679781>] md_ioctl+0x11b1/0x1680 [md_mod]
[<ffffffff812ed158>] blkdev_ioctl+0x258/0x920
[<ffffffff8122f81d>] block_ioctl+0x3d/0x40
[<ffffffff8120ac0d>] do_vfs_ioctl+0x2cd/0x4a0
[<ffffffff8120ae54>] SyS_ioctl+0x74/0x80
[<ffffffff815e936e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff
betalinux244:~ # cat /proc/1938/stack
[<ffffffffa0691d90>] recv_daemon+0xc0/0x4a0 [md_cluster]
[<ffffffffa0670d30>] md_thread+0x130/0x150 [md_mod]
[<ffffffff810995ed>] kthread+0xbd/0xe0
[<ffffffff815e96bf>] ret_from_fork+0x3f/0x70
[<ffffffff81099530>] kthread+0x0/0xe0
[<ffffffffffffffff>] 0xffffffffffffffff
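
My reading of the stacks: node240's mdadm is stuck in dlm_lock_sync() while
joining the lockspace from bitmap_create(); node244's mdadm holds
reconfig_mutex (taken in the ioctl path before do_md_run) and is stuck in
dlm_lock_sync() waiting for the token lock; and node244's md0_cluster_rec
receive thread is also in D state inside recv_daemon(), so the cluster
messages that would unblock the others are never processed.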

> For the raid5-cache issue, ignoring the md-cluster .quiesce() call is fine
> for now, as we don't support raid5 in a cluster. We should probably add
> another parameter to .quiesce in the future to indicate whether the mddev
> lock is held.

Please keep me updated on that change in the future, since it may have a big
impact on md-cluster.
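
If it helps, here is a rough sketch of what I would expect such an interface
change to look like (purely hypothetical, names made up, not a real patch):

	/* extra argument telling .quiesce whether the caller already
	 * holds mddev->reconfig_mutex */
	void (*quiesce)(struct mddev *mddev, int state, bool mddev_locked);

	/* a caller that holds the lock, e.g. the sync path */
	mddev->pers->quiesce(mddev, 1, true);

	/* the md-cluster receive path, which cannot take the lock */
	mddev->pers->quiesce(mddev, 1, false);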

Thanks,
Guoqing

