From: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
To: Douglas Miller <dougmill@linux.vnet.ibm.com>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] block: Fix kernel panic occurs while creating second raid disk
Date: Thu, 3 Nov 2016 10:45:43 +0530
Message-ID: <CAK=zhgrsCO+7gLQ3uHfZ2AMN46kBadqys63FV0YZ3MnQ2ZP9EQ@mail.gmail.com>
In-Reply-To: <69529c36-39e3-c30f-f3c7-ecd45e8aa7bc@linux.vnet.ibm.com>

On Tue, Nov 1, 2016 at 11:52 PM, Douglas Miller
<dougmill@linux.vnet.ibm.com> wrote:
> On 10/24/2016 01:54 PM, Sreekanth Reddy wrote:
>>
>> Observing below kernel panic while creating second raid disk
>> on LSI SAS3008 HBA card.
>>
>> [  +0.000055] ------------[ cut here ]------------
>> [  +0.000007] WARNING: CPU: 2 PID: 281 at fs/sysfs/dir.c:31
>> sysfs_warn_dup+0x62/0x80
>> [  +0.000002] sysfs: cannot create duplicate filename
>> '/devices/virtual/bdi/8:32'
>> [  +0.000001] Modules linked in: mptctl mptbase xt_CHECKSUM iptable_mangle
>> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack tun bridge
>> stp llc ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl
>> sb_edac edac_core x86_pkg_temp_pclmul joydev ghash_clmulni_intel iTCO_wdt
>> ipmi_ssif mei_me pcspkr mei iTCO_vendor_support ipmi_si i2c_i801 lpc_ich
>> mfd_corema acpi_pad wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd
>> grace binfmt_misc sunrpc xfs libcrc32c ast i2c_algo_bit drm_kore raid_class
>> nvme_core scsi_transport_sas dca
>> [  +0.000067] CPU: 2 PID: 281 Comm: kworker/u49:5 Not tainted 4.9.0-rc2 #1
>> [  +0.000002] Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS
>> 1.1 07/22/2015
>> [  +0.000005] Workqueue: events_unbound async_run_entry_fn
>> [  +0.000004] Call Trace:
>> [  +0.000009]  [<ffffffff813ca51e>] dump_stack+0x63/0x85
>> [  +0.000005]  [<ffffffff810a5bfb>] __warn+0xcb/0xf0
>> [  +0.000004]  [<ffffffff810a5c7f>] warn_slowpath_fmt+0x5f/0x80
>> [  +0.000006]  [<ffffffff812bf17f>] ? kernfs_path_from_node+0x4f/0x60
>> [  +0.000002]  [<ffffffff812c2942>] sysfs_warn_dup+0x62/0x80
>> [  +0.000002]  [<ffffffff812c2a27>] sysfs_create_dir_ns+0x77/0x90
>> [  +0.000004]  [<ffffffff813ccef9>] kobject_add_internal+0x99/0x330
>> [  +0.000003]  [<ffffffff813d6efb>] ? vsnprintf+0x35b/0x4c0
>> [  +0.000003]  [<ffffffff813cd6f5>] kobject_add+0x75/0xd0
>> [  +0.000006]  [<ffffffff81514e43>] ? device_private_init+0x23/0x70
>> [  +0.000007]  [<ffffffff817cb652>] ? mutex_lock+0x12/0x30
>> [  +0.000003]  [<ffffffff81514fa9>] device_add+0x119/0x670
>> [  +0.000004]  [<ffffffff815156f0>] device_create_groups_vargs+0xe0/0xf0
>> [  +0.000003]  [<ffffffff8151571c>] device_create_vargs+0x1c/0x20
>> [  +0.000006]  [<ffffffff811d712c>] bdi_register+0x8c/0x180
>> [  +0.000003]  [<ffffffff811d7506>] bdi_register_owner+0x36/0x60
>> [  +0.000006]  [<ffffffff813ad778>] device_add_disk+0x168/0x480
>> [  +0.000005]  [<ffffffff81524891>] ? update_autosuspend+0x51/0x60
>> [  +0.000005]  [<ffffffff81557770>] sd_probe_async+0x110/0x1c0
>> [  +0.000002]  [<ffffffff810c8a49>] async_run_entry_fn+0x39/0x140
>> [  +0.000003]  [<ffffffff810bfa5f>] process_one_work+0x15f/0x430
>> [  +0.000002]  [<ffffffff810bfd7e>] worker_thread+0x4e/0x490
>> [  +0.000002]  [<ffffffff810bfd30>] ? process_one_work+0x430/0x430
>> [  +0.000003]  [<ffffffff810c55a9>] kthread+0xd9/0xf0
>> [  +0.000003]  [<ffffffff810c54d0>] ? kthread_park+0x60/0x60
>> [  +0.000003]  [<ffffffff817ce595>] ret_from_fork+0x25/0x30
>> [  +0.000002] ------------[ cut here ]------------
>> [  +0.000004] WARNING: CPU: 2 PID: 281 at lib/kobject.c:240
>> kobject_add_internal+0x2bd/0x330
>> [  +0.000001] kobject_add_internal failed for 8:32 with -EEXIST, don't try
>> to register things with the same name in the same
>> [  +0.000001] Modules linked in: mptctl mptbase xt_CHECKSUM iptable_mangle
>> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack tun bridge
>> stp llc ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl
>> sb_edac edac_core x86_pkg_temp_pclmul joydev ghash_clmulni_intel iTCO_wdt
>> ipmi_ssif mei_me pcspkr mei iTCO_vendor_support ipmi_si i2c_i801 lpc_ich
>> mfd_corema acpi_pad wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd
>> grace binfmt_misc sunrpc xfs libcrc32c ast i2c_algo_bit drm_kore raid_class
>> nvme_core scsi_transport_sas dca
>> [  +0.000043] CPU: 2 PID: 281 Comm: kworker/u49:5 Tainted: G        W
>> 4.9.0-rc2 #1
>> [  +0.000001] Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS
>> 1.1 07/22/2015
>> [  +0.000002] Workqueue: events_unbound async_run_entry_fn
>> [  +0.000003] Call Trace:
>> [  +0.000003]  [<ffffffff813ca51e>] dump_stack+0x63/0x85
>> [  +0.000003]  [<ffffffff810a5bfb>] __warn+0xcb/0xf0
>> [  +0.000004]  [<ffffffff810a5c7f>] warn_slowpath_fmt+0x5f/0x80
>> [  +0.000002]  [<ffffffff812c294a>] ? sysfs_warn_dup+0x6a/0x80
>> [  +0.000003]  [<ffffffff813cd11d>] kobject_add_internal+0x2bd/0x330
>> [  +0.000003]  [<ffffffff813d6efb>] ? vsnprintf+0x35b/0x4c0
>> [  +0.000003]  [<ffffffff813cd6f5>] kobject_add+0x75/0xd0
>> [  +0.000003]  [<ffffffff81514e43>] ? device_private_init+0x23/0x70
>> [  +0.000004]  [<ffffffff817cb652>] ? mutex_lock+0x12/0x30
>> [  +0.000002]  [<ffffffff81514fa9>] device_add+0x119/0x670
>> [  +0.000004]  [<ffffffff815156f0>] device_create_groups_vargs+0xe0/0xf0
>> [  +0.000003]  [<ffffffff8151571c>] device_create_vargs+0x1c/0x20
>> [  +0.000003]  [<ffffffff811d712c>] bdi_register+0x8c/0x180
>> [  +0.000003]  [<ffffffff811d7506>] bdi_register_owner+0x36/0x60
>> [  +0.000004]  [<ffffffff813ad778>] device_add_disk+0x168/0x480
>> [  +0.000003]  [<ffffffff81524891>] ? update_autosuspend+0x51/0x60
>> [  +0.000002]  [<ffffffff81557770>] sd_probe_async+0x110/0x1c0
>> [  +0.000002]  [<ffffffff810c8a49>] async_run_entry_fn+0x39/0x140
>> [  +0.000002]  [<ffffffff810bfa5f>] process_one_work+0x15f/0x430
>> [  +0.000002]  [<ffffffff810bfd7e>] worker_thread+0x4e/0x490
>> [  +0.000002]  [<ffffffff810bfd30>] ? process_one_work+0x430/0x430
>> [  +0.000003]  [<ffffffff810c55a9>] kthread+0xd9/0xf0
>> [  +0.000003]  [<ffffffff810c54d0>] ? kthread_park+0x60/0x60
>> [  +0.000003]  [<ffffffff817ce595>] ret_from_fork+0x25/0x30
>> [  +0.000949] BUG: unable to handle kernel
>> [  +0.005263] NULL pointer dereference
>> [  +0.002853] IP: [<ffffffff812c2c64>]
>> sysfs_do_create_link_sd.isra.2+0x34/0xb0
>> [  +0.008584] PGD 0
>>
>> [  +0.006115] Oops: 0000 [#1] SMP
>> [  +0.004531] Modules linked in: mptctl mptbase xt_CHECKSUM iptable_mangle
>> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack tun bridge
>> stp llc ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl
>> sb_edac edac_core x86_pkg_temp_pclmul joydev ghash_clmulni_intel iTCO_wdt
>> ipmi_ssif mei_me pcspkr mei iTCO_vendor_support ipmi_si i2c_i801 lpc_ich
>> mfd_corema acpi_pad wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd
>> grace binfmt_misc sunrpc xfs libcrc32c ast i2c_algo_bit drm_kore raid_class
>> nvme_core scsi_transport_sas dca
>> [  +0.080566] CPU: 17 PID: 281 Comm: kworker/u49:5 Tainted: G        W
>> 4.9.0-rc2 #1
>> [  +0.009472] Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS
>> 1.1 07/22/2015
>> [  +0.009169] Workqueue: events_unbound async_run_entry_fn
>> [  +0.007340] RIP: 0010:[<ffffffff812c2c64>] [<ffffffff812c2c64>]
>> sysfs_do_create_link_sd.isra.2+0x34/0xb0
>> [  +0.010294] Call Trace:
>> [  +0.005269]  [<ffffffff812c2d05>] sysfs_create_link+0x25/0x40
>> [  +0.008568]  [<ffffffff813ad80c>] device_add_disk+0x1fc/0x480
>> [  +0.008551]  [<ffffffff81557770>] sd_probe_async+0x110/0x1c0
>> [  +0.008456]  [<ffffffff810c8a49>] async_run_entry_fn+0x39/0x140
>> [  +0.010021]  [<ffffffff810bfa5f>] process_one_work+0x15f/0x430
>> [  +0.009623]  [<ffffffff810bfd7e>] worker_thread+0x4e/0x490
>> [  +0.007422]  [<ffffffff810bfd30>] ? process_one_work+0x430/0x430
>> [  +0.008728]  [<ffffffff810c55a9>] kthread+0xd9/0xf0
>> [  +0.007578]  [<ffffffff810c54d0>] ? kthread_park+0x60/0x60
>> [  +0.006816]  [<ffffffff817ce595>] ret_from_fork+0x25/0x30
>> [  +0.006814] Code: 75 48 85 ff 74 70 55 48 89 e5 41 57 41 56 41 55 41 54
>> 49 89 fe 53 48 c7 c7 90 74 01 82 48 89 f3 41 89 cc  c5 ff ff c6 05 15 48 d5
>> [  +0.022853] RIP  [<ffffffff812c2c64>]
>> sysfs_do_create_link_sd.isra.2+0x34/0xb0
>> [  +0.008679]  RSP <ffffc90019c3fd10>
>> [  +0.006129] BUG: unable to handle kernel
>>
>> While analyzing this issue, I observed that when the first raid disk is
>> created, we hide its PD devices (i.e. the devices are still there, but
>> they have no block device entries). However, the kernel does not remove
>> the BDI entries of these PD devices from the /sys/devices/virtual/bdi/
>> path; it still shows bdi device entries for these PDs even though they
>> no longer have block device names.
>>
>> e.g.
>> output of 'ls -l /dev/sd*' after creating first raid disk
>> [root@dhcp ~]# ls -l /dev/sd*
>> brw-rw---- 1 root disk 8,   0 Oct 24 17:37 /dev/sda
>> brw-rw---- 1 root disk 8,   1 Oct 24 17:37 /dev/sda1
>> brw-rw---- 1 root disk 8,   2 Oct 24 17:37 /dev/sda2
>> brw-rw---- 1 root disk 8,   3 Oct 24 17:37 /dev/sda3
>> brw-rw---- 1 root disk 8,  16 Oct 24 17:37 /dev/sdb
>> brw-rw---- 1 root disk 8,  64 Oct 24 17:37 /dev/sde
>> brw-rw---- 1 root disk 8,  80 Oct 24 17:37 /dev/sdf
>> brw-rw---- 1 root disk 8,  96 Oct 24 17:37 /dev/sdg
>> brw-rw---- 1 root disk 8, 112 Oct 24 17:37 /dev/sdh
>> brw-rw---- 1 root disk 8, 128 Oct 24 17:37 /dev/sdi
>> brw-rw---- 1 root disk 8, 144 Oct 24 17:37 /dev/sdj
>> brw-rw---- 1 root disk 8, 160 Oct 24 17:41 /dev/sdk
>>
>> output of 'ls -l /sys/devices/virtual/bdi/'
>> [root@dhcp-135-24-192-127 ~]# ls -l /sys/devices/virtual/bdi/
>> total 0
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 259:0
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:0
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:112
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:128
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:144
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:16
>> drwxr-xr-x 3 root root 0 Oct 24 17:41 8:160
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:32
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:48
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:64
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:80
>> drwxr-xr-x 3 root root 0 Oct 24 17:39 8:96
>>
>> Here we can observe that there are no block devices for the
>> '8:32' & '8:48' bdi entries, which belong to the PDs of raid disk /dev/sdk.
>>
>> Now, while creating a second raid disk, the kernel tries to use
>> MAJOR:MINOR 8:32 for it, and we observe the above kernel oops.
>>
>> Calling bdi_unregister() in the del_gendisk() function resolves this
>> issue.
>>
>> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
>> ---
>>  block/genhd.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/block/genhd.c b/block/genhd.c
>> index fcd6d4f..b95f2fa 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -658,6 +658,7 @@ void del_gendisk(struct gendisk *disk)
>>      disk->flags &= ~GENHD_FL_UP;
>>
>>      sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
>> +    bdi_unregister(&disk->queue->backing_dev_info);
>>      blk_unregister_queue(disk);
>>      blk_unregister_region(disk_devt(disk), disk->minors);
>>
> There is a problem with this patch. bdi_unregister() is also called by
> blk_cleanup_queue(), and both that and del_gendisk() may be called by
> cleanup_mapped_device(). This results in a panic when bdi_unregister() is
> called for the second time.

To fix this problem, I have already posted a version 2 patch. Here is
the patch URL:
https://patchwork.kernel.org/patch/9394471/

Please check this patch and let me know if any changes are needed.
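
For illustration only, here is a minimal user-space sketch in C of the
double-call hazard described in the quoted review, together with one generic
way to make a teardown idempotent. It is not the version 2 patch and it is not
kernel code: struct model_bdi, the registered flag, and every model_* function
are hypothetical names that only imitate the call pattern of del_gendisk(),
blk_cleanup_queue() and cleanup_mapped_device().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's backing_dev_info; not the real type. */
struct model_bdi {
    bool registered;   /* true while the sysfs entry (e.g. "8:32") exists */
    int teardown_runs; /* how many times the teardown body actually executed */
};

void model_bdi_register(struct model_bdi *bdi)
{
    bdi->registered = true;
    bdi->teardown_runs = 0;
}

/* Idempotent unregister: the flag turns a repeated call into a no-op. */
void model_bdi_unregister(struct model_bdi *bdi)
{
    if (!bdi->registered)
        return;
    bdi->registered = false;
    bdi->teardown_runs++;
}

/* Models del_gendisk() with the extra unregister call from the first patch. */
void model_del_gendisk(struct model_bdi *bdi)
{
    model_bdi_unregister(bdi);
}

/* Models blk_cleanup_queue(), which also unregisters the bdi. */
void model_blk_cleanup_queue(struct model_bdi *bdi)
{
    model_bdi_unregister(bdi);
}

/*
 * Models a caller that runs both teardown paths on the same bdi, as
 * cleanup_mapped_device() may. Returns how often the teardown really ran.
 */
int model_cleanup_both_paths(void)
{
    struct model_bdi bdi;

    model_bdi_register(&bdi);
    model_del_gendisk(&bdi);
    model_blk_cleanup_queue(&bdi);
    return bdi.teardown_runs;
}
```

In this model, model_cleanup_both_paths() returns 1 because the second
unregister call is skipped. Without the registered check, the teardown body
would run twice, which is the situation the review above warns about.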

Thanks,
Sreekanth
>
