* Linux host behavior when CSTS.CFS bit is set to 1
From: Ken Chen (kchena) @ 2017-01-25 23:20 UTC


Hi All,

From a recent message exchange in this mail list, I saw the following statement:

"In the nvme Linux driver in function nvme_kthread() the CSTS register is read once a second to check for controller status failure." 

If this statement is true, is Linux supposed to initiate a reset to the drive if CSTS.CFS bit is set to 1?

I am working on SSD firmware. When certain hardware exceptions occur in the drive, the firmware sets the CSTS.CFS bit to 1 and expects the host to reset the controller. I am using CentOS 7 with kernel 4.2.2-1.el7.elrepo.x86_64. In my tests, the host does not seem to reset the drive after the firmware sets CFS: CC.EN never transitions from 1 to 0, CC.SHN is never set to 1, and "NVMe" is never written to the NSSR register. Does the firmware need to do anything else to trigger a reset from the host? Or does some configuration (such as PCIe AER) need to be enabled for Linux to support this functionality?
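
For reference, the register fields named above can be sketched as follows. The offsets, bit positions, and the "NVMe" value come from the NVMe base specification; the helper name classify_host_write is hypothetical and only illustrates which register writes the firmware would see for each kind of host-initiated reset or shutdown.

```python
import struct

# Controller register offsets within BAR0 (NVMe base specification).
REG_CC = 0x14    # Controller Configuration
REG_CSTS = 0x1C  # Controller Status
REG_NSSR = 0x20  # NVM Subsystem Reset

CC_EN = 1 << 0           # CC.EN, bit 0
CC_SHN_MASK = 0x3 << 14  # CC.SHN, bits 15:14

# The ASCII string "NVMe" as a 32-bit value (0x4E564D65).
NSSR_MAGIC = struct.unpack(">I", b"NVMe")[0]


def classify_host_write(offset, old_value, new_value):
    """Name the reset-related action, if any, implied by one register write.

    Hypothetical helper: old_value is the register content before the
    write, new_value is what the host wrote.
    """
    if offset == REG_NSSR and new_value == NSSR_MAGIC:
        return "subsystem reset"
    if offset == REG_CC:
        # CC.EN going from 1 to 0 is a controller reset.
        if (old_value & CC_EN) and not (new_value & CC_EN):
            return "controller reset"
        # CC.SHN going from 00b to a non-zero value is a shutdown notification.
        if (new_value & CC_SHN_MASK) and not (old_value & CC_SHN_MASK):
            return "shutdown notification"
    return None
```

None of these three writes appears in the tests described above.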

Any advice will be appreciated.

Thanks,

Ken

* Linux host behavior when CSTS.CFS bit is set to 1
From: Keith Busch @ 2017-01-25 23:39 UTC


On Wed, Jan 25, 2017 at 11:20:45PM +0000, Ken Chen (kchena) wrote:
> Hi All,
> 
> From a recent message exchange in this mail list, I saw the following statement:
> 
> "In the nvme Linux driver in function nvme_kthread() the CSTS register is read once a second to check for controller status failure." 

That's probably not a recent exchange on this list. We got rid of the
kthread almost a year ago. It was replaced with a per-controller timer.
 
> If this statement is true, is Linux supposed to initiate a reset to the drive if CSTS.CFS bit is set to 1?

Yes, that is correct.
 
> I am working on SSD firmware. When certain hardware exceptions occur in the drive, the firmware sets the CSTS.CFS bit to 1 and expects the host to reset the controller. I am using CentOS 7 with kernel 4.2.2-1.el7.elrepo.x86_64. In my tests, the host does not seem to reset the drive after the firmware sets CFS: CC.EN never transitions from 1 to 0, CC.SHN is never set to 1, and "NVMe" is never written to the NSSR register. Does the firmware need to do anything else to trigger a reset from the host? Or does some configuration (such as PCIe AER) need to be enabled for Linux to support this functionality?
> 
> Any advice will be appreciated.

There's no additional driver dependency required for this.

I've tested this part quite a bit, and it's always worked as designed
as far as I know. Are you sure the controller is really raising the
CSTS.CFS bit for the host to see? How are you verifying that it is
really set?
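
One way to check from the host side is to read CSTS directly out of BAR0 through the PCI sysfs resource file, independent of the driver. The following is only a minimal sketch. It assumes root access, that BAR0 is exposed as /sys/bus/pci/devices/<BDF>/resource0, and that the kernel permits mapping it. The function names are hypothetical, and Python does not guarantee that the read is issued as a single 32-bit access, so treat the result as a sanity check rather than proof.

```python
import mmap
import os
import struct

REG_CSTS = 0x1C     # Controller Status offset in BAR0 (NVMe base specification)
CSTS_RDY = 1 << 0   # CSTS.RDY, bit 0
CSTS_CFS = 1 << 1   # CSTS.CFS, bit 1
BAR_MAP_LEN = 0x1000  # assumption: mapping the first 4 KiB is enough to reach CSTS


def read_csts(resource_path):
    """Map the start of BAR0 read-only and return the raw 32-bit CSTS value."""
    fd = os.open(resource_path, os.O_RDONLY)
    try:
        with mmap.mmap(fd, BAR_MAP_LEN, mmap.MAP_SHARED, mmap.PROT_READ) as bar:
            # NVMe controller registers are little-endian.
            return struct.unpack_from("<I", bar, REG_CSTS)[0]
    finally:
        os.close(fd)


def describe_csts(csts):
    """Decode the fields of interest from a raw CSTS value."""
    return {
        # An all-ones value is what a failed PCIe read typically returns,
        # so CFS appearing set in that case says nothing about the firmware.
        "valid": csts != 0xFFFFFFFF,
        "rdy": bool(csts & CSTS_RDY),
        "cfs": bool(csts & CSTS_CFS),
    }
```

Calling describe_csts(read_csts(path)) with the resource0 path of the device under test, right after the firmware raises the fault, shows whether CFS is visible on the host side of the link.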

The only reason the driver may skip the reset if CFS is raised is if the
driver is already trying to reset the controller.
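
The rule above can be sketched as a small predicate. This only illustrates the behaviour as described in this message and is not the driver's actual code; should_reset and reset_in_progress are hypothetical names.

```python
CSTS_CFS = 1 << 1  # CSTS.CFS, bit 1 of Controller Status (NVMe base specification)


def should_reset(csts, reset_in_progress):
    """Return True if a periodic status check should start a controller reset."""
    if reset_in_progress:
        # A reset is already under way, so a second one is not started.
        return False
    return bool(csts & CSTS_CFS)
```

With CFS set and no reset pending, should_reset(0x2, False) is True; the same status with reset_in_progress=True gives False.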
