linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] nvme: hwmon: provide temperature threshold features
@ 2019-11-14 15:39 Akinobu Mita
  2019-11-14 15:40 ` [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor Akinobu Mita
  2019-11-14 15:40 ` [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold Akinobu Mita
  0 siblings, 2 replies; 10+ messages in thread
From: Akinobu Mita @ 2019-11-14 15:39 UTC (permalink / raw)
  To: linux-nvme, linux-hwmon
  Cc: Jean Delvare, Sagi Grimberg, Akinobu Mita, Jens Axboe,
	Keith Busch, Christoph Hellwig, Guenter Roeck

According to the NVMe specification, the over temperature threshold and
under temperature threshold features shall be implemented for Composite
Temperature if a non-zero WCTEMP field value is reported in the Identify
Controller data structure.  The features are also implemented for all
implemented temperature sensors (i.e., all Temperature Sensor fields that
report a non-zero value).

This provides the over temperature threshold and under temperature
threshold for each sensor as temperature min and max values of hwmon
sysfs attributes.

This patch depends on the patch "nvme: Add hardware monitoring support".
(http://lists.infradead.org/pipermail/linux-nvme/2019-November/027883.html)

* v2
- Add helper macros for kelvin from/to milli Celsius conversion
- Remove alarm attributes for each implemented temperature sensor
- Use u32 variable for the last parameter of nvme_get_features()
- Use NULL for the unused last parameter of nvme_set_features()
- Avoid ternary operator
- Adjust temperature value ranges with clamp_val()
- Don't use WCTEMP after initialization
- Avoid accessing negative index when WCTEMP == 0
- Add a new quirk to avoid changing temperature threshold

Akinobu Mita (2):
  nvme: hwmon: provide temperature min and max values for each sensor
  nvme: hwmon: add quirk to avoid changing temperature threshold

 drivers/nvme/host/nvme-hwmon.c | 110 +++++++++++++++++++++++++++++++++++------
 drivers/nvme/host/nvme.h       |   5 ++
 drivers/nvme/host/pci.c        |   3 +-
 include/linux/nvme.h           |   6 +++
 4 files changed, 107 insertions(+), 17 deletions(-)

Cc: Keith Busch <kbusch@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
-- 
2.7.4


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor
  2019-11-14 15:39 [PATCH v2 0/2] nvme: hwmon: provide temperature threshold features Akinobu Mita
@ 2019-11-14 15:40 ` Akinobu Mita
  2019-11-14 19:06   ` Guenter Roeck
  2019-11-20 18:48   ` Christoph Hellwig
  2019-11-14 15:40 ` [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold Akinobu Mita
  1 sibling, 2 replies; 10+ messages in thread
From: Akinobu Mita @ 2019-11-14 15:40 UTC (permalink / raw)
  To: linux-nvme, linux-hwmon
  Cc: Jean Delvare, Sagi Grimberg, Akinobu Mita, Jens Axboe,
	Keith Busch, Christoph Hellwig, Guenter Roeck

According to the NVMe specification, the over temperature threshold and
under temperature threshold features shall be implemented for Composite
Temperature if a non-zero WCTEMP field value is reported in the Identify
Controller data structure.  The features are also implemented for all
implemented temperature sensors (i.e., all Temperature Sensor fields that
report a non-zero value).

This provides the over temperature threshold and under temperature
threshold for each sensor as temperature min and max values of hwmon
sysfs attributes.

The WCTEMP is already provided as a temperature max value for Composite
Temperature, but this change isn't incompatible.  Because the default
value of the over temperature threshold for Composite Temperature is
the WCTEMP.

Now the alarm attribute for Composite Temperature indicates one of the
temperature is outside of a temperature threshold.  Because there is only
a single bit in Critical Warning field that indicates a temperature is
outside of a threshold.

Example output from the "sensors" command:

nvme-pci-0100
Adapter: PCI adapter
Composite:    +33.9°C  (low  = -273.1°C, high = +69.8°C)
                       (crit = +79.8°C)
Sensor 1:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +31.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 5:     +47.9°C  (low  = -273.1°C, high = +65261.8°C)

This also adds helper macros for kelvin from/to milli Celsius conversion,
and replaces the repeated code in nvme-hwmon.c.

Cc: Keith Busch <kbusch@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
 drivers/nvme/host/nvme-hwmon.c | 106 ++++++++++++++++++++++++++++++++++-------
 include/linux/nvme.h           |   6 +++
 2 files changed, 96 insertions(+), 16 deletions(-)

diff --git a/drivers/nvme/host/nvme-hwmon.c b/drivers/nvme/host/nvme-hwmon.c
index 5480cbb..97a84b4 100644
--- a/drivers/nvme/host/nvme-hwmon.c
+++ b/drivers/nvme/host/nvme-hwmon.c
@@ -9,12 +9,57 @@
 
 #include "nvme.h"
 
+/* These macros should be moved to linux/temperature.h */
+#define MILLICELSIUS_TO_KELVIN(t) DIV_ROUND_CLOSEST((t) + 273150, 1000)
+#define KELVIN_TO_MILLICELSIUS(t) ((t) * 1000L - 273150)
+
 struct nvme_hwmon_data {
 	struct nvme_ctrl *ctrl;
 	struct nvme_smart_log log;
 	struct mutex read_lock;
 };
 
+static int nvme_get_temp_thresh(struct nvme_ctrl *ctrl, int sensor, bool under,
+				long *temp)
+{
+	unsigned int threshold = sensor << NVME_TEMP_THRESH_SELECT_SHIFT;
+	u32 status;
+	int ret;
+
+	if (under)
+		threshold |= NVME_TEMP_THRESH_TYPE_UNDER;
+
+	ret = nvme_get_features(ctrl, NVME_FEAT_TEMP_THRESH, threshold, NULL, 0,
+				&status);
+	if (ret > 0)
+		return -EIO;
+	if (ret < 0)
+		return ret;
+	*temp = KELVIN_TO_MILLICELSIUS(status & NVME_TEMP_THRESH_MASK);
+
+	return 0;
+}
+
+static int nvme_set_temp_thresh(struct nvme_ctrl *ctrl, int sensor, bool under,
+				long temp)
+{
+	unsigned int threshold = sensor << NVME_TEMP_THRESH_SELECT_SHIFT;
+	int ret;
+
+	temp = MILLICELSIUS_TO_KELVIN(temp);
+	threshold |= clamp_val(temp, 0, NVME_TEMP_THRESH_MASK);
+
+	if (under)
+		threshold |= NVME_TEMP_THRESH_TYPE_UNDER;
+
+	ret = nvme_set_features(ctrl, NVME_FEAT_TEMP_THRESH, threshold, NULL, 0,
+				NULL);
+	if (ret > 0)
+		return -EIO;
+
+	return ret;
+}
+
 static int nvme_hwmon_get_smart_log(struct nvme_hwmon_data *data)
 {
 	int ret;
@@ -39,10 +84,11 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
 	 */
 	switch (attr) {
 	case hwmon_temp_max:
-		*val = (data->ctrl->wctemp - 273) * 1000;
-		return 0;
+		return nvme_get_temp_thresh(data->ctrl, channel, false, val);
+	case hwmon_temp_min:
+		return nvme_get_temp_thresh(data->ctrl, channel, true, val);
 	case hwmon_temp_crit:
-		*val = (data->ctrl->cctemp - 273) * 1000;
+		*val = KELVIN_TO_MILLICELSIUS(data->ctrl->cctemp);
 		return 0;
 	default:
 		break;
@@ -59,7 +105,7 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
 			temp = get_unaligned_le16(log->temperature);
 		else
 			temp = le16_to_cpu(log->temp_sensor[channel - 1]);
-		*val = (temp - 273) * 1000;
+		*val = KELVIN_TO_MILLICELSIUS(temp);
 		break;
 	case hwmon_temp_alarm:
 		*val = !!(log->critical_warning & NVME_SMART_CRIT_TEMPERATURE);
@@ -73,6 +119,23 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
 	return err;
 }
 
+static int nvme_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
+			    u32 attr, int channel, long val)
+{
+	struct nvme_hwmon_data *data = dev_get_drvdata(dev);
+
+	switch (attr) {
+	case hwmon_temp_max:
+		return nvme_set_temp_thresh(data->ctrl, channel, false, val);
+	case hwmon_temp_min:
+		return nvme_set_temp_thresh(data->ctrl, channel, true, val);
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
+
 static const char * const nvme_hwmon_sensor_names[] = {
 	"Composite",
 	"Sensor 1",
@@ -105,8 +168,10 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
 			return 0444;
 		break;
 	case hwmon_temp_max:
-		if (!channel && data->ctrl->wctemp)
-			return 0444;
+	case hwmon_temp_min:
+		if ((!channel && data->ctrl->wctemp) ||
+		    (channel && data->log.temp_sensor[channel - 1]))
+			return 0644;
 		break;
 	case hwmon_temp_alarm:
 		if (!channel)
@@ -126,16 +191,24 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
 static const struct hwmon_channel_info *nvme_hwmon_info[] = {
 	HWMON_CHANNEL_INFO(chip, HWMON_C_REGISTER_TZ),
 	HWMON_CHANNEL_INFO(temp,
-			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
-				HWMON_T_LABEL | HWMON_T_ALARM,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL),
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_CRIT | HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL),
 	NULL
 };
 
@@ -143,6 +216,7 @@ static const struct hwmon_ops nvme_hwmon_ops = {
 	.is_visible	= nvme_hwmon_is_visible,
 	.read		= nvme_hwmon_read,
 	.read_string	= nvme_hwmon_read_string,
+	.write		= nvme_hwmon_write,
 };
 
 static const struct hwmon_chip_info nvme_hwmon_chip_info = {
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 269b2e8..347a44f 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -803,6 +803,12 @@ struct nvme_write_zeroes_cmd {
 
 /* Features */
 
+enum {
+	NVME_TEMP_THRESH_MASK		= 0xffff,
+	NVME_TEMP_THRESH_SELECT_SHIFT	= 16,
+	NVME_TEMP_THRESH_TYPE_UNDER	= 0x100000,
+};
+
 struct nvme_feat_auto_pst {
 	__le64 entries[32];
 };
-- 
2.7.4


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold
  2019-11-14 15:39 [PATCH v2 0/2] nvme: hwmon: provide temperature threshold features Akinobu Mita
  2019-11-14 15:40 ` [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor Akinobu Mita
@ 2019-11-14 15:40 ` Akinobu Mita
  2019-11-14 19:14   ` Guenter Roeck
  2019-11-20 18:48   ` Christoph Hellwig
  1 sibling, 2 replies; 10+ messages in thread
From: Akinobu Mita @ 2019-11-14 15:40 UTC (permalink / raw)
  To: linux-nvme, linux-hwmon
  Cc: Jean Delvare, Sagi Grimberg, Akinobu Mita, Jens Axboe,
	Keith Busch, Christoph Hellwig, Guenter Roeck

This adds a new quirk NVME_QUIRK_NO_TEMP_THRESH_CHANGE to avoid changing
the value of the temperature threshold feature for specific devices that
show undesirable behavior.

Guenter reported:

"On my Intel NVME drive (SSDPEKKW512G7), writing any minimum limit on the
Composite temperature sensor results in a temperature warning, and that
warning is sticky until I reset the controller.

It doesn't seem to matter which temperature I write; writing -273000 has
the same result."

The Intel NVMe has the latest firmware version installed, so this isn't
a problem that was ever fixed.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
 drivers/nvme/host/nvme-hwmon.c | 6 +++++-
 drivers/nvme/host/nvme.h       | 5 +++++
 drivers/nvme/host/pci.c        | 3 ++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/nvme-hwmon.c b/drivers/nvme/host/nvme-hwmon.c
index 97a84b4..a5af21f 100644
--- a/drivers/nvme/host/nvme-hwmon.c
+++ b/drivers/nvme/host/nvme-hwmon.c
@@ -170,8 +170,12 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
 	case hwmon_temp_max:
 	case hwmon_temp_min:
 		if ((!channel && data->ctrl->wctemp) ||
-		    (channel && data->log.temp_sensor[channel - 1]))
+		    (channel && data->log.temp_sensor[channel - 1])) {
+			if (data->ctrl->quirks &
+			    NVME_QUIRK_NO_TEMP_THRESH_CHANGE)
+				return 0444;
 			return 0644;
+		}
 		break;
 	case hwmon_temp_alarm:
 		if (!channel)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 000a3d9..19e5e87 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -115,6 +115,11 @@ enum nvme_quirks {
 	 * Prevent tag overlap between queues
 	 */
 	NVME_QUIRK_SHARED_TAGS                  = (1 << 13),
+
+	/*
+	 * Don't change the value of the temperature threshold feature
+	 */
+	NVME_QUIRK_NO_TEMP_THRESH_CHANGE	= (1 << 14),
 };
 
 /*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 931d4a9..2c0206b 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3529,7 +3529,8 @@ static const struct pci_device_id nvme_id_table[] = {
 				NVME_QUIRK_DEALLOCATE_ZEROES, },
 	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
 		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
-				NVME_QUIRK_MEDIUM_PRIO_SQ },
+				NVME_QUIRK_MEDIUM_PRIO_SQ |
+				NVME_QUIRK_NO_TEMP_THRESH_CHANGE },
 	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
 		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
 	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
-- 
2.7.4


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor
  2019-11-14 15:40 ` [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor Akinobu Mita
@ 2019-11-14 19:06   ` Guenter Roeck
  2019-11-20 18:48   ` Christoph Hellwig
  1 sibling, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2019-11-14 19:06 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, linux-nvme, Jens Axboe,
	Keith Busch, Christoph Hellwig

On Fri, Nov 15, 2019 at 12:40:00AM +0900, Akinobu Mita wrote:
> According to the NVMe specification, the over temperature threshold and
> under temperature threshold features shall be implemented for Composite
> Temperature if a non-zero WCTEMP field value is reported in the Identify
> Controller data structure.  The features are also implemented for all
> implemented temperature sensors (i.e., all Temperature Sensor fields that
> report a non-zero value).
> 
> This provides the over temperature threshold and under temperature
> threshold for each sensor as temperature min and max values of hwmon
> sysfs attributes.
> 
> The WCTEMP is already provided as a temperature max value for Composite
> Temperature, but this change isn't incompatible.  Because the default
> value of the over temperature threshold for Composite Temperature is
> the WCTEMP.
> 
> Now the alarm attribute for Composite Temperature indicates one of the
> temperature is outside of a temperature threshold.  Because there is only
> a single bit in Critical Warning field that indicates a temperature is
> outside of a threshold.
> 
> Example output from the "sensors" command:
> 
> nvme-pci-0100
> Adapter: PCI adapter
> Composite:    +33.9°C  (low  = -273.1°C, high = +69.8°C)
>                        (crit = +79.8°C)
> Sensor 1:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)
> Sensor 2:     +31.9°C  (low  = -273.1°C, high = +65261.8°C)
> Sensor 5:     +47.9°C  (low  = -273.1°C, high = +65261.8°C)
> 
> This also adds helper macros for kelvin from/to milli Celsius conversion,
> and replaces the repeated code in nvme-hwmon.c.
> 
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Jens Axboe <axboe@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>

Tested with:
	INTEL SSDPEKKW512G7
	Samsung SSD 970 EVO 500GB
	THNSN5256GPU7 NVMe TOSHIBA 256GB

> ---
>  drivers/nvme/host/nvme-hwmon.c | 106 ++++++++++++++++++++++++++++++++++-------
>  include/linux/nvme.h           |   6 +++
>  2 files changed, 96 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/nvme/host/nvme-hwmon.c b/drivers/nvme/host/nvme-hwmon.c
> index 5480cbb..97a84b4 100644
> --- a/drivers/nvme/host/nvme-hwmon.c
> +++ b/drivers/nvme/host/nvme-hwmon.c
> @@ -9,12 +9,57 @@
>  
>  #include "nvme.h"
>  
> +/* These macros should be moved to linux/temperature.h */
> +#define MILLICELSIUS_TO_KELVIN(t) DIV_ROUND_CLOSEST((t) + 273150, 1000)
> +#define KELVIN_TO_MILLICELSIUS(t) ((t) * 1000L - 273150)
> +
>  struct nvme_hwmon_data {
>  	struct nvme_ctrl *ctrl;
>  	struct nvme_smart_log log;
>  	struct mutex read_lock;
>  };
>  
> +static int nvme_get_temp_thresh(struct nvme_ctrl *ctrl, int sensor, bool under,
> +				long *temp)
> +{
> +	unsigned int threshold = sensor << NVME_TEMP_THRESH_SELECT_SHIFT;
> +	u32 status;
> +	int ret;
> +
> +	if (under)
> +		threshold |= NVME_TEMP_THRESH_TYPE_UNDER;
> +
> +	ret = nvme_get_features(ctrl, NVME_FEAT_TEMP_THRESH, threshold, NULL, 0,
> +				&status);
> +	if (ret > 0)
> +		return -EIO;
> +	if (ret < 0)
> +		return ret;
> +	*temp = KELVIN_TO_MILLICELSIUS(status & NVME_TEMP_THRESH_MASK);
> +
> +	return 0;
> +}
> +
> +static int nvme_set_temp_thresh(struct nvme_ctrl *ctrl, int sensor, bool under,
> +				long temp)
> +{
> +	unsigned int threshold = sensor << NVME_TEMP_THRESH_SELECT_SHIFT;
> +	int ret;
> +
> +	temp = MILLICELSIUS_TO_KELVIN(temp);
> +	threshold |= clamp_val(temp, 0, NVME_TEMP_THRESH_MASK);
> +
> +	if (under)
> +		threshold |= NVME_TEMP_THRESH_TYPE_UNDER;
> +
> +	ret = nvme_set_features(ctrl, NVME_FEAT_TEMP_THRESH, threshold, NULL, 0,
> +				NULL);
> +	if (ret > 0)
> +		return -EIO;
> +
> +	return ret;
> +}
> +
>  static int nvme_hwmon_get_smart_log(struct nvme_hwmon_data *data)
>  {
>  	int ret;
> @@ -39,10 +84,11 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
>  	 */
>  	switch (attr) {
>  	case hwmon_temp_max:
> -		*val = (data->ctrl->wctemp - 273) * 1000;
> -		return 0;
> +		return nvme_get_temp_thresh(data->ctrl, channel, false, val);
> +	case hwmon_temp_min:
> +		return nvme_get_temp_thresh(data->ctrl, channel, true, val);
>  	case hwmon_temp_crit:
> -		*val = (data->ctrl->cctemp - 273) * 1000;
> +		*val = KELVIN_TO_MILLICELSIUS(data->ctrl->cctemp);
>  		return 0;
>  	default:
>  		break;
> @@ -59,7 +105,7 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
>  			temp = get_unaligned_le16(log->temperature);
>  		else
>  			temp = le16_to_cpu(log->temp_sensor[channel - 1]);
> -		*val = (temp - 273) * 1000;
> +		*val = KELVIN_TO_MILLICELSIUS(temp);
>  		break;
>  	case hwmon_temp_alarm:
>  		*val = !!(log->critical_warning & NVME_SMART_CRIT_TEMPERATURE);
> @@ -73,6 +119,23 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
>  	return err;
>  }
>  
> +static int nvme_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
> +			    u32 attr, int channel, long val)
> +{
> +	struct nvme_hwmon_data *data = dev_get_drvdata(dev);
> +
> +	switch (attr) {
> +	case hwmon_temp_max:
> +		return nvme_set_temp_thresh(data->ctrl, channel, false, val);
> +	case hwmon_temp_min:
> +		return nvme_set_temp_thresh(data->ctrl, channel, true, val);
> +	default:
> +		break;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
>  static const char * const nvme_hwmon_sensor_names[] = {
>  	"Composite",
>  	"Sensor 1",
> @@ -105,8 +168,10 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
>  			return 0444;
>  		break;
>  	case hwmon_temp_max:
> -		if (!channel && data->ctrl->wctemp)
> -			return 0444;
> +	case hwmon_temp_min:
> +		if ((!channel && data->ctrl->wctemp) ||
> +		    (channel && data->log.temp_sensor[channel - 1]))
> +			return 0644;
>  		break;
>  	case hwmon_temp_alarm:
>  		if (!channel)
> @@ -126,16 +191,24 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
>  static const struct hwmon_channel_info *nvme_hwmon_info[] = {
>  	HWMON_CHANNEL_INFO(chip, HWMON_C_REGISTER_TZ),
>  	HWMON_CHANNEL_INFO(temp,
> -			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> -				HWMON_T_LABEL | HWMON_T_ALARM,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL,
> -			   HWMON_T_INPUT | HWMON_T_LABEL),
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_CRIT | HWMON_T_LABEL | HWMON_T_ALARM,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL,
> +			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
> +				HWMON_T_LABEL),
>  	NULL
>  };
>  
> @@ -143,6 +216,7 @@ static const struct hwmon_ops nvme_hwmon_ops = {
>  	.is_visible	= nvme_hwmon_is_visible,
>  	.read		= nvme_hwmon_read,
>  	.read_string	= nvme_hwmon_read_string,
> +	.write		= nvme_hwmon_write,
>  };
>  
>  static const struct hwmon_chip_info nvme_hwmon_chip_info = {
> diff --git a/include/linux/nvme.h b/include/linux/nvme.h
> index 269b2e8..347a44f 100644
> --- a/include/linux/nvme.h
> +++ b/include/linux/nvme.h
> @@ -803,6 +803,12 @@ struct nvme_write_zeroes_cmd {
>  
>  /* Features */
>  
> +enum {
> +	NVME_TEMP_THRESH_MASK		= 0xffff,
> +	NVME_TEMP_THRESH_SELECT_SHIFT	= 16,
> +	NVME_TEMP_THRESH_TYPE_UNDER	= 0x100000,
> +};
> +
>  struct nvme_feat_auto_pst {
>  	__le64 entries[32];
>  };
> -- 
> 2.7.4
> 

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold
  2019-11-14 15:40 ` [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold Akinobu Mita
@ 2019-11-14 19:14   ` Guenter Roeck
  2019-11-20 18:48   ` Christoph Hellwig
  1 sibling, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2019-11-14 19:14 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, linux-nvme, Jens Axboe,
	Keith Busch, Christoph Hellwig

On Fri, Nov 15, 2019 at 12:40:01AM +0900, Akinobu Mita wrote:
> This adds a new quirk NVME_QUIRK_NO_TEMP_THRESH_CHANGE to avoid changing
> the value of the temperature threshold feature for specific devices that
> show undesirable behavior.
> 
> Guenter reported:
> 
> "On my Intel NVME drive (SSDPEKKW512G7), writing any minimum limit on the
> Composite temperature sensor results in a temperature warning, and that
> warning is sticky until I reset the controller.
> 
> It doesn't seem to matter which temperature I write; writing -273000 has
> the same result."
> 
> The Intel NVMe has the latest firmware version installed, so this isn't
> a problem that was ever fixed.
> 
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Jens Axboe <axboe@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>

Tested on:
	INTEL SSDPEKKW512G7

Specifically verified that min/max attributes are indeed read-only
on this drive.

nvme-pci-0100
Adapter: PCI adapter
Composite:    +32.9°C  (low  = -273.1°C, high = +69.8°C)
                       (crit = +79.8°C)

groeck@jupiter:/sys/class/hwmon/hwmon0$ ls -l
total 0
lrwxrwxrwx 1 root root    0 Nov 14 10:59 device -> ../../../0000:01:00.0
-r--r--r-- 1 root root 4096 Nov 14 10:59 name
drwxr-xr-x 2 root root    0 Nov 14 10:59 power
lrwxrwxrwx 1 root root    0 Nov 14 10:59 subsystem -> ../../../../../../class/hwmon
-r--r--r-- 1 root root 4096 Nov 14 10:59 temp1_alarm
-r--r--r-- 1 root root 4096 Nov 14 10:59 temp1_crit
-r--r--r-- 1 root root 4096 Nov 14 10:59 temp1_input
-r--r--r-- 1 root root 4096 Nov 14 10:59 temp1_label
-r--r--r-- 1 root root 4096 Nov 14 10:59 temp1_max
-r--r--r-- 1 root root 4096 Nov 14 10:59 temp1_min
-rw-r--r-- 1 root root 4096 Nov 14 10:59 uevent

groeck@jupiter:/sys/class/hwmon/hwmon0$ sudo nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     BTPY65250EQN512F     INTEL SSDPEKKW512G7                      1         512.11  GB / 512.11  GB    512   B +  0 B    PSF121C

> ---
>  drivers/nvme/host/nvme-hwmon.c | 6 +++++-
>  drivers/nvme/host/nvme.h       | 5 +++++
>  drivers/nvme/host/pci.c        | 3 ++-
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/nvme-hwmon.c b/drivers/nvme/host/nvme-hwmon.c
> index 97a84b4..a5af21f 100644
> --- a/drivers/nvme/host/nvme-hwmon.c
> +++ b/drivers/nvme/host/nvme-hwmon.c
> @@ -170,8 +170,12 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
>  	case hwmon_temp_max:
>  	case hwmon_temp_min:
>  		if ((!channel && data->ctrl->wctemp) ||
> -		    (channel && data->log.temp_sensor[channel - 1]))
> +		    (channel && data->log.temp_sensor[channel - 1])) {
> +			if (data->ctrl->quirks &
> +			    NVME_QUIRK_NO_TEMP_THRESH_CHANGE)
> +				return 0444;
>  			return 0644;
> +		}
>  		break;
>  	case hwmon_temp_alarm:
>  		if (!channel)
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 000a3d9..19e5e87 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -115,6 +115,11 @@ enum nvme_quirks {
>  	 * Prevent tag overlap between queues
>  	 */
>  	NVME_QUIRK_SHARED_TAGS                  = (1 << 13),
> +
> +	/*
> +	 * Don't change the value of the temperature threshold feature
> +	 */
> +	NVME_QUIRK_NO_TEMP_THRESH_CHANGE	= (1 << 14),
>  };
>  
>  /*
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 931d4a9..2c0206b 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3529,7 +3529,8 @@ static const struct pci_device_id nvme_id_table[] = {
>  				NVME_QUIRK_DEALLOCATE_ZEROES, },
>  	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
>  		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
> -				NVME_QUIRK_MEDIUM_PRIO_SQ },
> +				NVME_QUIRK_MEDIUM_PRIO_SQ |
> +				NVME_QUIRK_NO_TEMP_THRESH_CHANGE },
>  	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
>  		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
>  	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
> -- 
> 2.7.4
> 

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor
  2019-11-14 15:40 ` [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor Akinobu Mita
  2019-11-14 19:06   ` Guenter Roeck
@ 2019-11-20 18:48   ` Christoph Hellwig
  2019-11-21 13:48     ` Akinobu Mita
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2019-11-20 18:48 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, linux-nvme, Jens Axboe,
	Keith Busch, Christoph Hellwig, Guenter Roeck

On Fri, Nov 15, 2019 at 12:40:00AM +0900, Akinobu Mita wrote:
> +/* These macros should be moved to linux/temperature.h */
> +#define MILLICELSIUS_TO_KELVIN(t) DIV_ROUND_CLOSEST((t) + 273150, 1000)
> +#define KELVIN_TO_MILLICELSIUS(t) ((t) * 1000L - 273150)

Didn't we want to move this to a generic header?

Otherwise this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold
  2019-11-14 15:40 ` [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold Akinobu Mita
  2019-11-14 19:14   ` Guenter Roeck
@ 2019-11-20 18:48   ` Christoph Hellwig
  1 sibling, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2019-11-20 18:48 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, linux-nvme, Jens Axboe,
	Keith Busch, Christoph Hellwig, Guenter Roeck

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor
  2019-11-20 18:48   ` Christoph Hellwig
@ 2019-11-21 13:48     ` Akinobu Mita
  2019-11-21 14:27       ` Guenter Roeck
  0 siblings, 1 reply; 10+ messages in thread
From: Akinobu Mita @ 2019-11-21 13:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, linux-nvme, Jens Axboe,
	Keith Busch, Guenter Roeck

2019年11月21日(木) 3:48 Christoph Hellwig <hch@lst.de>:
>
> On Fri, Nov 15, 2019 at 12:40:00AM +0900, Akinobu Mita wrote:
> > +/* These macros should be moved to linux/temperature.h */
> > +#define MILLICELSIUS_TO_KELVIN(t) DIV_ROUND_CLOSEST((t) + 273150, 1000)
> > +#define KELVIN_TO_MILLICELSIUS(t) ((t) * 1000L - 273150)
>
> Didn't we want to move this to a generic header?

Yes.  I start preparing for <linux/temperature.h>.
Once the change is accepted, I'll move the macros to the header.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor
  2019-11-21 13:48     ` Akinobu Mita
@ 2019-11-21 14:27       ` Guenter Roeck
  2019-11-21 15:15         ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Guenter Roeck @ 2019-11-21 14:27 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, linux-nvme, Jens Axboe,
	Keith Busch, Christoph Hellwig

On Thu, Nov 21, 2019 at 10:48:21PM +0900, Akinobu Mita wrote:
> 2019年11月21日(木) 3:48 Christoph Hellwig <hch@lst.de>:
> >
> > On Fri, Nov 15, 2019 at 12:40:00AM +0900, Akinobu Mita wrote:
> > > +/* These macros should be moved to linux/temperature.h */
> > > +#define MILLICELSIUS_TO_KELVIN(t) DIV_ROUND_CLOSEST((t) + 273150, 1000)
> > > +#define KELVIN_TO_MILLICELSIUS(t) ((t) * 1000L - 273150)
> >
> > Didn't we want to move this to a generic header?
> 
> Yes.  I start preparing for <linux/temperature.h>.
> Once the change is accepted, I'll move the macros to the header.

I agree, it is better to keep that change separate.

Guenter

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor
  2019-11-21 14:27       ` Guenter Roeck
@ 2019-11-21 15:15         ` Keith Busch
  0 siblings, 0 replies; 10+ messages in thread
From: Keith Busch @ 2019-11-21 15:15 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-hwmon, Jean Delvare, Sagi Grimberg, Akinobu Mita,
	linux-nvme, Jens Axboe, Christoph Hellwig

On Thu, Nov 21, 2019 at 06:27:12AM -0800, Guenter Roeck wrote:
> On Thu, Nov 21, 2019 at 10:48:21PM +0900, Akinobu Mita wrote:
> > 2019年11月21日(木) 3:48 Christoph Hellwig <hch@lst.de>:
> > >
> > > On Fri, Nov 15, 2019 at 12:40:00AM +0900, Akinobu Mita wrote:
> > > > +/* These macros should be moved to linux/temperature.h */
> > > > +#define MILLICELSIUS_TO_KELVIN(t) DIV_ROUND_CLOSEST((t) + 273150, 1000)
> > > > +#define KELVIN_TO_MILLICELSIUS(t) ((t) * 1000L - 273150)
> > >
> > > Didn't we want to move this to a generic header?
> > 
> > Yes.  I start preparing for <linux/temperature.h>.
> > Once the change is accepted, I'll move the macros to the header.
> 
> I agree, it is better to keep that change separate.

I've added these two patches as-is for the next nvme-5.5 pull
request.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-11-21 15:15 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-14 15:39 [PATCH v2 0/2] nvme: hwmon: provide temperature threshold features Akinobu Mita
2019-11-14 15:40 ` [PATCH v2 1/2] nvme: hwmon: provide temperature min and max values for each sensor Akinobu Mita
2019-11-14 19:06   ` Guenter Roeck
2019-11-20 18:48   ` Christoph Hellwig
2019-11-21 13:48     ` Akinobu Mita
2019-11-21 14:27       ` Guenter Roeck
2019-11-21 15:15         ` Keith Busch
2019-11-14 15:40 ` [PATCH v2 2/2] nvme: hwmon: add quirk to avoid changing temperature threshold Akinobu Mita
2019-11-14 19:14   ` Guenter Roeck
2019-11-20 18:48   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).