All of lore.kernel.org
 help / color / mirror / Atom feed
* NVMe: Invalid controller status value, device hotplug out
@ 2014-07-22 17:25 Mundu
  2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu
  2014-07-23 14:24 ` NVMe: Invalid controller status value, device hotplug out J Freyensee
  0 siblings, 2 replies; 5+ messages in thread
From: Mundu @ 2014-07-22 17:25 UTC (permalink / raw)


 If device hotplug out, controller status become 0xFFFF_FFFF
 1. Controller fatal status become true in nvme_kthread()
 2. Ready bit nvere reset to 0 in nvme_wait_ready() for
    controller disable.
    above two scenarios controller status data is not a valid
    data, since device/controller is already removed (hotplug)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/1]  If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since  device/controller is already removed (hotplug)
  2014-07-22 17:25 NVMe: Invalid controller status value, device hotplug out Mundu
@ 2014-07-22 17:25 ` Mundu
  2014-07-22 17:40   ` Keith Busch
  2014-07-23 14:24 ` NVMe: Invalid controller status value, device hotplug out J Freyensee
  1 sibling, 1 reply; 5+ messages in thread
From: Mundu @ 2014-07-22 17:25 UTC (permalink / raw)


Signed-off-by: Mundu <mundu2510 at gmail.com>
---
 drivers/block/nvme-core.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 28aec2d..6c6998e 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -1396,10 +1396,14 @@ static int nvme_wait_ready(struct nvme_dev *dev, u64 cap, bool enabled)
 {
 	unsigned long timeout;
 	u32 bit = enabled ? NVME_CSTS_RDY : 0;
+	u32 csts;
 
 	timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
 
-	while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) {
+	while (((csts = readl(&dev->bar->csts)) & NVME_CSTS_RDY) != bit) {
+		/* If device is hot plugout, csts become 0xFFFFFFFF */
+		if (csts == -1)
+			return -ENODEV;
 		msleep(100);
 		if (fatal_signal_pending(current))
 			return -EINTR;
@@ -1905,14 +1909,19 @@ static int nvme_submit_async_req(struct nvme_queue *nvmeq)
 static int nvme_kthread(void *data)
 {
 	struct nvme_dev *dev, *next;
+	u32 csts;
 
 	while (!kthread_should_stop()) {
 		set_current_state(TASK_INTERRUPTIBLE);
 		spin_lock(&dev_list_lock);
 		list_for_each_entry_safe(dev, next, &dev_list, node) {
 			int i;
-			if (readl(&dev->bar->csts) & NVME_CSTS_CFS &&
+			if ((csts = readl(&dev->bar->csts)) & NVME_CSTS_CFS &&
 							dev->initialized) {
+				/* If device is hot plugout,
+				   csts become 0xFFFFFFFF */
+				if (csts == -1)
+					continue;
 				if (work_busy(&dev->reset_work))
 					continue;
 				list_del_init(&dev->node);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 1/1]  If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since  device/controller is already removed (hotplug)
  2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu
@ 2014-07-22 17:40   ` Keith Busch
  2014-07-22 17:57     ` mundu agarwal
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2014-07-22 17:40 UTC (permalink / raw)


On Tue, 22 Jul 2014, Mundu wrote:
> Signed-off-by: Mundu <mundu2510 at gmail.com>
> ---
> drivers/block/nvme-core.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
> index 28aec2d..6c6998e 100644
> --- a/drivers/block/nvme-core.c
> +++ b/drivers/block/nvme-core.c
> @@ -1396,10 +1396,14 @@ static int nvme_wait_ready(struct nvme_dev *dev, u64 cap, bool enabled)
> {
> 	unsigned long timeout;
> 	u32 bit = enabled ? NVME_CSTS_RDY : 0;
> +	u32 csts;
>
> 	timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
>
> -	while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) {
> +	while (((csts = readl(&dev->bar->csts)) & NVME_CSTS_RDY) != bit) {
> +		/* If device is hot plugout, csts become 0xFFFFFFFF */
> +		if (csts == -1)
> +			return -ENODEV;

In nvme_dev_map, we already return -ENODEV if reading CSTS returns
-1. That happens before we call nvme_wait_ready. Are you just shorting
the probe in the event someone hot removes a drive the moment after the
driver ioremaps the BAR, but before it enables the device?

> 		msleep(100);
> 		if (fatal_signal_pending(current))
> 			return -EINTR;
> @@ -1905,14 +1909,19 @@ static int nvme_submit_async_req(struct nvme_queue *nvmeq)
> static int nvme_kthread(void *data)
> {
> 	struct nvme_dev *dev, *next;
> +	u32 csts;
>
> 	while (!kthread_should_stop()) {
> 		set_current_state(TASK_INTERRUPTIBLE);
> 		spin_lock(&dev_list_lock);
> 		list_for_each_entry_safe(dev, next, &dev_list, node) {
> 			int i;
> -			if (readl(&dev->bar->csts) & NVME_CSTS_CFS &&
> +			if ((csts = readl(&dev->bar->csts)) & NVME_CSTS_CFS &&
> 							dev->initialized) {
> +				/* If device is hot plugout,
> +				   csts become 0xFFFFFFFF */
> +				if (csts == -1)
> +					continue;

You definitely want to let the reset work get scheduled here instead of
continuing. Some platforms do not support "surprise removal", so this
check guards against that by triggering a reset which will bail on the
device if it appears gone and remove it from pci.

> 				if (work_busy(&dev->reset_work))
> 					continue;
> 				list_del_init(&dev->node);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug)
  2014-07-22 17:40   ` Keith Busch
@ 2014-07-22 17:57     ` mundu agarwal
  0 siblings, 0 replies; 5+ messages in thread
From: mundu agarwal @ 2014-07-22 17:57 UTC (permalink / raw)


>> In nvme_dev_map, we already return -ENODEV if reading CSTS returns
>> -1. That happens before we call nvme_wait_ready. Are you just shorting
>> the probe in the event someone hot removes a drive the moment after the
>> driver ioremaps the BAR, but before it enables the device?

Yes

On Tue, Jul 22, 2014@11:10 PM, Keith Busch <keith.busch@intel.com> wrote:
> On Tue, 22 Jul 2014, Mundu wrote:
>>
>> Signed-off-by: Mundu <mundu2510 at gmail.com>
>> ---
>> drivers/block/nvme-core.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
>> index 28aec2d..6c6998e 100644
>> --- a/drivers/block/nvme-core.c
>> +++ b/drivers/block/nvme-core.c
>> @@ -1396,10 +1396,14 @@ static int nvme_wait_ready(struct nvme_dev *dev,
>> u64 cap, bool enabled)
>> {
>>         unsigned long timeout;
>>         u32 bit = enabled ? NVME_CSTS_RDY : 0;
>> +       u32 csts;
>>
>>         timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
>>
>> -       while ((readl(&dev->bar->csts) & NVME_CSTS_RDY) != bit) {
>> +       while (((csts = readl(&dev->bar->csts)) & NVME_CSTS_RDY) != bit) {
>> +               /* If device is hot plugout, csts become 0xFFFFFFFF */
>> +               if (csts == -1)
>> +                       return -ENODEV;
>
>
> In nvme_dev_map, we already return -ENODEV if reading CSTS returns
> -1. That happens before we call nvme_wait_ready. Are you just shorting
> the probe in the event someone hot removes a drive the moment after the
> driver ioremaps the BAR, but before it enables the device?
>
>
>>                 msleep(100);
>>                 if (fatal_signal_pending(current))
>>                         return -EINTR;
>> @@ -1905,14 +1909,19 @@ static int nvme_submit_async_req(struct nvme_queue
>> *nvmeq)
>> static int nvme_kthread(void *data)
>> {
>>         struct nvme_dev *dev, *next;
>> +       u32 csts;
>>
>>         while (!kthread_should_stop()) {
>>                 set_current_state(TASK_INTERRUPTIBLE);
>>                 spin_lock(&dev_list_lock);
>>                 list_for_each_entry_safe(dev, next, &dev_list, node) {
>>                         int i;
>> -                       if (readl(&dev->bar->csts) & NVME_CSTS_CFS &&
>> +                       if ((csts = readl(&dev->bar->csts)) &
>> NVME_CSTS_CFS &&
>>                                                         dev->initialized)
>> {
>> +                               /* If device is hot plugout,
>> +                                  csts become 0xFFFFFFFF */
>> +                               if (csts == -1)
>> +                                       continue;
>
>
> You definitely want to let the reset work get scheduled here instead of
> continuing. Some platforms do not support "surprise removal", so this
> check guards against that by triggering a reset which will bail on the
> device if it appears gone and remove it from pci.
>
>
>>                                 if (work_busy(&dev->reset_work))
>>                                         continue;
>>                                 list_del_init(&dev->node);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* NVMe: Invalid controller status value, device hotplug out
  2014-07-22 17:25 NVMe: Invalid controller status value, device hotplug out Mundu
  2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu
@ 2014-07-23 14:24 ` J Freyensee
  1 sibling, 0 replies; 5+ messages in thread
From: J Freyensee @ 2014-07-23 14:24 UTC (permalink / raw)


On 07/22/2014 10:25 AM, Mundu wrote:
>   If device hotplug out, controller status become 0xFFFF_FFFF
>   1. Controller fatal status become true in nvme_kthread()
>   2. Ready bit nvere reset to 0 in nvme_wait_ready() for
>      controller disable.
>      above two scenarios controller status data is not a valid
>      data, since device/controller is already removed (hotplug)
>

I believe you want this comment to be in the actual patch you sent?? 
The patch you sent after this email has no description, something 
Matthew pointed out previously.

> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-07-23 14:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-22 17:25 NVMe: Invalid controller status value, device hotplug out Mundu
2014-07-22 17:25 ` [PATCH 1/1] If device hotplug out, controller status become 0xFFFF_FFFF 1. Controller fatal status become true in nvme_kthread() 2. Ready bit will not be reset to 0 in nvme_wait_ready() for controller disable above two scenarios controller status data is not a valid data, since device/controller is already removed (hotplug) Mundu
2014-07-22 17:40   ` Keith Busch
2014-07-22 17:57     ` mundu agarwal
2014-07-23 14:24 ` NVMe: Invalid controller status value, device hotplug out J Freyensee

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.