From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932445AbcFIQfj (ORCPT ); Thu, 9 Jun 2016 12:35:39 -0400
Received: from mail.kernel.org ([198.145.29.136]:46274 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932089AbcFIQfe (ORCPT ); Thu, 9 Jun 2016 12:35:34 -0400
Date: Thu, 9 Jun 2016 09:35:30 -0700
From: Shaohua Li
To: Cong Wang
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] md: use a mutex to protect a global list
Message-ID: <20160609163530.GA16840@kernel.org>
References: <1465402816-10882-1-git-send-email-xiyou.wangcong@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1465402816-10882-1-git-send-email-xiyou.wangcong@gmail.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 08, 2016 at 09:20:16AM -0700, Cong Wang wrote:
> We saw a list corruption in the list all_detected_devices:
>
> WARNING: CPU: 16 PID: 226 at lib/list_debug.c:29 __list_add+0x3c/0xa9()
> list_add corruption. next->prev should be prev (ffff880859d58320), but was ffff880859ce74c0. (next=ffffffff81abfdb0).
> Modules linked in: ahci libahci libata sd_mod scsi_mod
> CPU: 16 PID: 226 Comm: kworker/u241:4 Not tainted 4.1.20 #1
> Hardware name: Dell Inc. PowerEdge C6220/04GD66, BIOS 2.2.3 11/07/2013
> Workqueue: events_unbound async_run_entry_fn
> 0000000000000000 ffff880859a5baf8 ffffffff81502872 ffff880859a5bb48
> 0000000000000009 ffff880859a5bb38 ffffffff810692a5 ffff880859ee8828
> ffffffff812ad02c ffff880859d58320 ffffffff81abfdb0 ffff880859eb90c0
> Call Trace:
> [] dump_stack+0x4d/0x63
> [] warn_slowpath_common+0xa1/0xbb
> [] ? __list_add+0x3c/0xa9
> [] warn_slowpath_fmt+0x46/0x48
> [] __list_add+0x3c/0xa9
> [] md_autodetect_dev+0x41/0x62
> [] rescan_partitions+0x25f/0x29d
> [] ? mutex_lock+0x13/0x31
> [] __blkdev_get+0x1aa/0x3cd
> [] blkdev_get+0x5f/0x294
> [] ? put_device+0x17/0x19
> [] ? disk_put_part+0x12/0x14
> [] add_disk+0x29d/0x407
> [] ? __pm_runtime_use_autosuspend+0x5c/0x64
> [] sd_probe_async+0x115/0x1af [sd_mod]
> [] async_run_entry_fn+0x72/0x12c
> [] process_one_work+0x198/0x2ce
> [] worker_thread+0x1dd/0x2bb
> [] ? cancel_delayed_work_sync+0x15/0x15
> [] ? cancel_delayed_work_sync+0x15/0x15
> [] kthread+0xae/0xb6
> [] ? param_array_set+0x40/0xfa
> [] ? __kthread_parkme+0x61/0x61
> [] ret_from_fork+0x42/0x70
> [] ? __kthread_parkme+0x61/0x61
>
> I suspect it is because there is no lock protecting this
> global list, autostart_arrays() is called in ioctl() path
> where there is no lock.
>
> Cc: Shaohua Li
> Signed-off-by: Cong Wang

Applied, thanks! This is probably because drivers can do async probe now.
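
For readers following the thread, here is a rough userspace sketch of the
idea (concurrent probe paths adding to a global list while another path
drains it, with a mutex serializing both). It is not the applied patch and
not kernel code: it uses pthreads instead of the kernel mutex API, and the
names detected_dev, detected_add(), detected_drain() and run_probes() are
made up for illustration.

```c
/* Userspace analogue only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS   4
#define PER_THREAD 1000

struct detected_dev {
	int dev;
	struct detected_dev *next;
};

static struct detected_dev *detected_head;
static pthread_mutex_t detected_lock = PTHREAD_MUTEX_INITIALIZER;

/* "Probe" path: may run in several threads at once. */
static int detected_add(int dev)
{
	struct detected_dev *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->dev = dev;

	pthread_mutex_lock(&detected_lock);
	node->next = detected_head;
	detected_head = node;
	pthread_mutex_unlock(&detected_lock);
	return 0;
}

/* "Autostart" path: pop entries one by one, return how many were seen. */
static int detected_drain(void)
{
	int count = 0;

	pthread_mutex_lock(&detected_lock);
	while (detected_head) {
		struct detected_dev *node = detected_head;

		detected_head = node->next;
		/* Drop the lock while handling the entry. */
		pthread_mutex_unlock(&detected_lock);
		free(node);
		count++;
		pthread_mutex_lock(&detected_lock);
	}
	pthread_mutex_unlock(&detected_lock);
	return count;
}

static void *probe_worker(void *arg)
{
	int base = *(int *)arg;

	for (int i = 0; i < PER_THREAD; i++)
		detected_add(base + i);
	return NULL;
}

/* Start concurrent "probes", wait for them, then drain the list. */
static int run_probes(void)
{
	pthread_t tid[NTHREADS];
	int base[NTHREADS];

	for (int i = 0; i < NTHREADS; i++) {
		int rc;

		base[i] = i * PER_THREAD;
		rc = pthread_create(&tid[i], NULL, probe_worker, &base[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	return detected_drain();
}
```

Without the lock in detected_add(), two threads can both read the same
detected_head and one insertion is lost or the list is corrupted, which is
the userspace equivalent of the __list_add warning quoted above.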