linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Dongdong Liu <liudongdong3@huawei.com>
Cc: helgaas@kernel.org, lorenzo.pieralisi@arm.com,
	linux-pci@vger.kernel.org, linuxarm@huawei.com,
	john.garry@huawei.com,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [RFC PATCH] PCI: hotplug: Fix surprise removal report card present and link failed
Date: Wed, 16 Jan 2019 15:22:35 +0100	[thread overview]
Message-ID: <20190116142235.6drd6avwk3e7rtwd@wunner.de> (raw)
In-Reply-To: <1547649064-19019-1-git-send-email-liudongdong3@huawei.com>

[cc += Mika]

On Wed, Jan 16, 2019 at 10:31:04PM +0800, Dongdong Liu wrote:
> The lspci -tv topology is as below.
>  +-[0000:80]-+-00.0-[81]----00.0  Huawei Technologies Co., Ltd. Device 3714
>  |           +-02.0-[82]----00.0  Huawei Technologies Co., Ltd. Device 3714
>  |           +-04.0-[83]----00.0  Huawei Technologies Co., Ltd. Device 3714
>  |           +-06.0-[84]----00.0  Huawei Technologies Co., Ltd. Device 3714
>  |           +-10.0-[87]----00.0  Huawei Technologies Co., Ltd. Device 3714
> 
> Then surprise removal 87:00.0 NVME SSD card. The message is as below.
> 
> pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
> iommu: Removing device 0000:87:00.0 from group 12
> pciehp 0000:80:10.0:pcie004: Slot(36): Card present
> pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
> pciehp 0000:80:10.0:pcie004: Failed to check link status

What is the problem that you're trying to fix?  That these messages
are logged?  Or is there a bigger issue?  If the only problem are the
messages, then I feel that the current behavior is a feature, not a bug.
We could probably tone down the "Failed to check link status" message's
severity.  (Currently it's KERN_ERR, all the other messages are KERN_INFO.)


> The NVME SSD card is permited to surprise removal with slot power on
> on the D06 board. The hotplug port's POWER_CTRL(ctrl) is false.

I don't quite follow:  Does the hotplug port have a Power Controller but
misrepresents that fact in the Slot Capabilities register?


> Data link layer state changed (link down) event reported prior to presence
> detect changed (card is not present) when surprise removal.

It's normal that PDS and DLLSC events are received in varying order and with
significant delays.  (I've seen 244 msec once with Thunderbolt between
PDS- and DLLSC-.)


> Current code pciehp_handle_presence_or_link_change() handle link down event,
> then check the card present, but at this time presence detect state have
> not updated, so have such issue. If surprise removal and power controller
> present is not implemented, wait for at least 1 second before checking
> card present to fix the issue.

Thunderbolt doesn't have a Power Controller either and this change would
imply that if Thunderbolt devices are quickly swapped, bringing up the
replaced device would take longer than it takes now.  So I'm not really
happy about this change.

Thanks,

Lukas

> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/pci/hotplug/pciehp_ctrl.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 3f3df4c..ef8952d 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -216,6 +216,7 @@ void pciehp_handle_disable_request(struct controller *ctrl)
>  void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  {
>  	bool present, link_active;
> +	bool safe_removal = SAFE_REMOVAL;
>  
>  	/*
>  	 * If the slot is on and presence or link has changed, turn it off.
> @@ -236,6 +237,7 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  			ctrl_info(ctrl, "Slot(%s): Card not present\n",
>  				  slot_name(ctrl));
>  		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
> +		safe_removal = SURPRISE_REMOVAL;
>  		break;
>  	default:
>  		mutex_unlock(&ctrl->state_lock);
> @@ -244,6 +246,15 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  
>  	/* Turn the slot on if it's occupied or link is up */
>  	mutex_lock(&ctrl->state_lock);
> +	/*
> +	 * if surprise removal and power controller present is not implemented,
> +	 * wait for at least 1 second before checking card present as
> +	 * data link layer state changed (link down) event reported
> +	 * prior to presence detect changed (card is not present).
> +	 */
> +	if (!safe_removal && !POWER_CTRL(ctrl))
> +		msleep(1000);
> +
>  	present = pciehp_card_present(ctrl);
>  	link_active = pciehp_check_link_active(ctrl);
>  	if (!present && !link_active) {
> -- 
> 1.9.1

  reply	other threads:[~2019-01-16 14:22 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-16 14:31 [RFC PATCH] PCI: hotplug: Fix surprise removal report card present and link failed Dongdong Liu
2019-01-16 14:22 ` Lukas Wunner [this message]
2019-01-17 12:07   ` Dongdong Liu
2019-01-17 18:30     ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190116142235.6drd6avwk3e7rtwd@wunner.de \
    --to=lukas@wunner.de \
    --cc=helgaas@kernel.org \
    --cc=john.garry@huawei.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=liudongdong3@huawei.com \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=mika.westerberg@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).