linux-kernel.vger.kernel.org archive mirror
From: John Garry <john.garry@huawei.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "stable@vger.kernel.org" <stable@vger.kernel.org>,
	yanaijie <yanaijie@huawei.com>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	Ewan Milne <emilne@redhat.com>, Christoph Hellwig <hch@lst.de>,
	Tomas Henzl <thenzl@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Hannes Reinecke <hare@suse.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sasha Levin <sashal@kernel.org>
Subject: Re: [PATCH 4.14 01/51] scsi: libsas: direct call probe and destruct
Date: Mon, 3 Aug 2020 13:57:14 +0100
Message-ID: <8743227b-adb3-ed1f-3559-e562555ac045@huawei.com>
In-Reply-To: <20200803121849.564535738@linuxfoundation.org>

On 03/08/2020 13:19, Greg Kroah-Hartman wrote:
> From: Jason Yan <yanaijie@huawei.com>
> 
> [ Upstream commit 0558f33c06bb910e2879e355192227a8e8f0219d ]
> 

Hi Greg,

This patch was one of a series from Jason to fix the WARN issue described here:

https://lore.kernel.org/linux-scsi/8f6e3763-2b04-23e8-f1ec-8ed3c58f55d3@huawei.com/

I'm doubtful that it should be taken in isolation. One or two other
patches from the series may also be required.

The WARN was really annoying, so we could spend a bit of time testing a
backport of only what is strictly required. Let us know.

Thanks,
John

> Commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery
> competing with ata error handling") introduced the disco mutex to prevent
> rediscovery from competing with ATA error handling, and put the whole
> revalidation under the mutex. But the rphy add/remove needs to wait for
> error handling, which also grabs the disco mutex. This may lead to a
> deadlock. So the probe and destruct events were introduced to do the rphy
> add/remove asynchronously and outside the lock.
> 
> The asynchronously processed workers make the whole discovery process
> non-atomic, so other events may interrupt it. For example, if a loss
> of signal event is inserted before the probe event,
> sas_deform_port() is called and the port is deleted.
> 
> sas_port_delete() may also run before the destruct event, even though
> port-x:x is the top parent of the end device or expander. This leads to
> a kernel WARNING such as:
> 
> [   82.042979] sysfs group 'power' not found for kobject 'phy-1:0:22'
> [   82.042983] ------------[ cut here ]------------
> [   82.042986] WARNING: CPU: 54 PID: 1714 at fs/sysfs/group.c:237
> sysfs_remove_group+0x94/0xa0
> [   82.043059] Call trace:
> [   82.043082] [<ffff0000082e7624>] sysfs_remove_group+0x94/0xa0
> [   82.043085] [<ffff00000864e320>] dpm_sysfs_remove+0x60/0x70
> [   82.043086] [<ffff00000863ee10>] device_del+0x138/0x308
> [   82.043089] [<ffff00000869a2d0>] sas_phy_delete+0x38/0x60
> [   82.043091] [<ffff00000869a86c>] do_sas_phy_delete+0x6c/0x80
> [   82.043093] [<ffff00000863dc20>] device_for_each_child+0x58/0xa0
> [   82.043095] [<ffff000008696f80>] sas_remove_children+0x40/0x50
> [   82.043100] [<ffff00000869d1bc>] sas_destruct_devices+0x64/0xa0
> [   82.043102] [<ffff0000080e93bc>] process_one_work+0x1fc/0x4b0
> [   82.043104] [<ffff0000080e96c0>] worker_thread+0x50/0x490
> [   82.043105] [<ffff0000080f0364>] kthread+0xfc/0x128
> [   82.043107] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
> 
> Make probe and destruct direct calls in the disco and revalidate functions,
> but keep them outside the lock. The whole discovery or revalidation then
> cannot be interrupted by other events. The DISCE_PROBE and DISCE_DESTRUCT
> events are deleted as a result of the direct calls.
> 
> Introduce a new list for destructing the sas_port, and do the port delete
> after the destruct. This ensures that the sysfs kobjects are destroyed in
> the right order and fixes the warning above.
> 
> sas_ex_revalidate_domain() has a loop to find all broadcasted
> devices, and we can sometimes find the same expander twice.
> Because the sas_port is deleted only at the end of the whole revalidation
> process, a sas_port with the same name cannot be added before then;
> otherwise sysfs will complain about creating a duplicate filename. Since
> the LLDD sends a broadcast for every device change, we can process
> only one expander's revalidation.
> 
> [mkp: kbuild test robot warning]
> 
> Signed-off-by: Jason Yan <yanaijie@huawei.com>
> CC: John Garry <john.garry@huawei.com>
> CC: Johannes Thumshirn <jthumshirn@suse.de>
> CC: Ewan Milne <emilne@redhat.com>
> CC: Christoph Hellwig <hch@lst.de>
> CC: Tomas Henzl <thenzl@redhat.com>
> CC: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
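
The ordering problem described in the quoted commit message can be
modelled in a few lines of userspace C. This is only a sketch under
stated assumptions, not libsas code: struct teardown_log, log_step(),
destruct_devices(), delete_port(), teardown_async() and
teardown_direct() are hypothetical names invented for illustration, and
the race is hard-coded rather than produced by real workqueues. It only
shows why deferring the port delete until after a direct destruct call
gives the child-before-parent order that sysfs expects.

```c
#include <assert.h>
#include <string.h>

#define MAX_STEPS 8

/* Records the order of teardown steps: 'D' = destruct devices, 'P' = delete port. */
struct teardown_log {
	char steps[MAX_STEPS];
	int count;
};

static void log_step(struct teardown_log *log, char step)
{
	assert(log->count < MAX_STEPS - 1);
	log->steps[log->count++] = step;
	log->steps[log->count] = '\0';
}

/* Hypothetical stand-in for tearing down the child devices under a port. */
static void destruct_devices(struct teardown_log *log)
{
	log_step(log, 'D');
}

/* Hypothetical stand-in for deleting the parent port object. */
static void delete_port(struct teardown_log *log)
{
	log_step(log, 'P');
}

/*
 * Model of the old behaviour: destruct is only queued as deferred work,
 * so a racing event can delete the parent port before the work runs.
 */
void teardown_async(struct teardown_log *log)
{
	int destruct_pending = 1;	/* queued, not yet executed */

	delete_port(log);		/* racing event removes the parent first */
	if (destruct_pending)
		destruct_devices(log);	/* children torn down too late */
}

/*
 * Model of the new behaviour: the port is only counted on a deferred
 * list, destruct is called directly, and the list is drained afterwards.
 */
void teardown_direct(struct teardown_log *log)
{
	int ports_to_delete = 0;

	ports_to_delete++;		/* defer the port delete */
	destruct_devices(log);		/* direct call: children go first */
	while (ports_to_delete-- > 0)
		delete_port(log);	/* parent goes last */
}
```

Calling teardown_async() on a zero-initialised struct teardown_log
records the parent being removed before its children, which is the
order that triggers the sysfs WARNING in this model; teardown_direct()
records the children first.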