All of lore.kernel.org
 help / color / mirror / Atom feed
From: Akinobu Mita <akinobu.mita@gmail.com>
To: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-nvme@lists.infradead.org, linux-pm@vger.kernel.org,
	Zhang Rui <rui.zhang@intel.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Keith Busch <keith.busch@intel.com>, Jens Axboe <axboe@fb.com>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	Minwoo Im <minwoo.im.dev@gmail.com>,
	Kenneth Heitke <kenneth.heitke@intel.com>,
	Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
Subject: Re: [PATCH v3 2/3] nvme: add thermal zone devices
Date: Sat, 8 Jun 2019 00:21:12 +0900	[thread overview]
Message-ID: <CAC5umyhx-Ger4ZGcEu+mDNxgPPBCBEROhAEODAKqSv_p9QRw4g@mail.gmail.com> (raw)
In-Reply-To: <20190606040506.GA1913@localhost.localdomain>

2019年6月6日(木) 13:05 Eduardo Valentin <edubezval@gmail.com>:

> > > > +static struct thermal_zone_params nvme_tz_params = {
> > > > +     .governor_name = "user_space",
> > >
> > > Also,
> > >
> > > Was there any particular reason why defaulting to user_space here?
> >
> > I only tested with the user_space governor.  There is no cooling device
> > to bind with this.
> >
>
> hmmm.. I see. But was there any particular reason to pick the thermal
> subsystem here if you do not have any cooling? Or is there any specific
> cooling that can be written in future?
>
> If no cooling is expected, why not a simple hwmon?

For example, we can use USB port powered cooling fan with USB hub that
supports per-port power switching.  It can be turned on and off by user
space tool.

> > > > +     .no_hwmon = true,
> > > > +};
> > > > +
> > > > +static struct thermal_zone_device *
> > > > +nvme_thermal_zone_register(struct nvme_ctrl *ctrl, unsigned int sensor)
> > > > +{
> > > > +     struct thermal_zone_device *tzdev;
> > > > +     char name[THERMAL_NAME_LENGTH];
> > > > +     int ret;
> > > > +
> > > > +     snprintf(name, sizeof(name), "nvme%d_temp%u", ctrl->instance, sensor);
> > > > +
> > > > +     tzdev = thermal_zone_device_register(name, 1, 1, ctrl, &nvme_tz_ops,
> > > > +                                          &nvme_tz_params, 0, 0);
> > > > +     if (IS_ERR(tzdev)) {
> > > > +             dev_err(ctrl->device,
> > > > +                     "Failed to register thermal zone device: %ld\n",
> > > > +                     PTR_ERR(tzdev));
> > > > +             return tzdev;
> > > > +     }
> > > > +
> > > > +     snprintf(name, sizeof(name), "temp%d", sensor);
> > > > +     ret = sysfs_create_link(&ctrl->ctrl_device.kobj, &tzdev->device.kobj,
> > > > +                             name);
> > > > +     if (ret)
> > > > +             goto device_unregister;
> > > > +
> > > > +     ret = sysfs_create_link(&tzdev->device.kobj,
> > > > +                             &ctrl->ctrl_device.kobj, "device");
> > > > +     if (ret)
> > > > +             goto remove_link;
> > > > +
> > > > +     return tzdev;
> > > > +
> > > > +remove_link:
> > > > +     sysfs_remove_link(&ctrl->ctrl_device.kobj, name);
> > > > +device_unregister:
> > > > +     thermal_zone_device_unregister(tzdev);
> > > > +
> > > > +     return ERR_PTR(ret);
> > > > +}
> > > > +
> > > > +/**
> > > > + * nvme_thermal_zones_register() - register nvme thermal zone devices
> > > > + * @ctrl: controller instance
> > > > + *
> > > > + * This function creates up to nine thermal zone devices for all implemented
> > > > + * temperature sensors including the composite temperature.
> > > > + * Each thermal zone device provides a single trip point temperature that is
> > > > + * associated with an over temperature threshold.
> > > > + */
> > > > +static int nvme_thermal_zones_register(struct nvme_ctrl *ctrl)
> > > > +{
> > > > +     struct nvme_smart_log *log;
> > > > +     int ret;
> > > > +     int i;
> > > > +
> > > > +     log = kzalloc(sizeof(*log), GFP_KERNEL);
> > > > +     if (!log)
> > > > +             return 0; /* non-fatal error */
> > >
> > > I am not sure about this API design here. I would leave the error
> > > handling and judging if this is fatal or not to the caller.
> > > I mean, if I ask to register a nvme thermal zone and I get
> > > a 0 as response, I would assume the thermal zone exists from
> > > now on, right?
> >
> > This routine is designed to return error code only when we're unable to
> > communicate with the device at all (i.e. nvme_submit_sync_cmd returns a
> > negative value).
> >
> > We don't want to abandon device initialization just due to thermal zone
> > failures.  Because thermal zone device isn't mandatory to manage the
> > controllers, and the device like qemu doesn't provide smart log (according
> > to Keith).
>
> That is fair.. it is just really weird to continue your business when
> the kernel cannot even allocate memory for you..

Right.  The system is not unlikely to continue working properly when under
page size allocation with GFP_KERNEL is failed, unless fault injection for
allocation is enabled.

WARNING: multiple messages have this Message-ID (diff)
From: akinobu.mita@gmail.com (Akinobu Mita)
Subject: [PATCH v3 2/3] nvme: add thermal zone devices
Date: Sat, 8 Jun 2019 00:21:12 +0900	[thread overview]
Message-ID: <CAC5umyhx-Ger4ZGcEu+mDNxgPPBCBEROhAEODAKqSv_p9QRw4g@mail.gmail.com> (raw)
In-Reply-To: <20190606040506.GA1913@localhost.localdomain>

2019?6?6?(?) 13:05 Eduardo Valentin <edubezval at gmail.com>:

> > > > +static struct thermal_zone_params nvme_tz_params = {
> > > > +     .governor_name = "user_space",
> > >
> > > Also,
> > >
> > > Was there any particular reason why defaulting to user_space here?
> >
> > I only tested with the user_space governor.  There is no cooling device
> > to bind with this.
> >
>
> hmmm.. I see. But was there any particular reason to pick the thermal
> subsystem here if you do not have any cooling? Or is there any specific
> cooling that can be written in future?
>
> If no cooling is expected, why not a simple hwmon?

For example, we can use USB port powered cooling fan with USB hub that
supports per-port power switching.  It can be turned on and off by user
space tool.

> > > > +     .no_hwmon = true,
> > > > +};
> > > > +
> > > > +static struct thermal_zone_device *
> > > > +nvme_thermal_zone_register(struct nvme_ctrl *ctrl, unsigned int sensor)
> > > > +{
> > > > +     struct thermal_zone_device *tzdev;
> > > > +     char name[THERMAL_NAME_LENGTH];
> > > > +     int ret;
> > > > +
> > > > +     snprintf(name, sizeof(name), "nvme%d_temp%u", ctrl->instance, sensor);
> > > > +
> > > > +     tzdev = thermal_zone_device_register(name, 1, 1, ctrl, &nvme_tz_ops,
> > > > +                                          &nvme_tz_params, 0, 0);
> > > > +     if (IS_ERR(tzdev)) {
> > > > +             dev_err(ctrl->device,
> > > > +                     "Failed to register thermal zone device: %ld\n",
> > > > +                     PTR_ERR(tzdev));
> > > > +             return tzdev;
> > > > +     }
> > > > +
> > > > +     snprintf(name, sizeof(name), "temp%d", sensor);
> > > > +     ret = sysfs_create_link(&ctrl->ctrl_device.kobj, &tzdev->device.kobj,
> > > > +                             name);
> > > > +     if (ret)
> > > > +             goto device_unregister;
> > > > +
> > > > +     ret = sysfs_create_link(&tzdev->device.kobj,
> > > > +                             &ctrl->ctrl_device.kobj, "device");
> > > > +     if (ret)
> > > > +             goto remove_link;
> > > > +
> > > > +     return tzdev;
> > > > +
> > > > +remove_link:
> > > > +     sysfs_remove_link(&ctrl->ctrl_device.kobj, name);
> > > > +device_unregister:
> > > > +     thermal_zone_device_unregister(tzdev);
> > > > +
> > > > +     return ERR_PTR(ret);
> > > > +}
> > > > +
> > > > +/**
> > > > + * nvme_thermal_zones_register() - register nvme thermal zone devices
> > > > + * @ctrl: controller instance
> > > > + *
> > > > + * This function creates up to nine thermal zone devices for all implemented
> > > > + * temperature sensors including the composite temperature.
> > > > + * Each thermal zone device provides a single trip point temperature that is
> > > > + * associated with an over temperature threshold.
> > > > + */
> > > > +static int nvme_thermal_zones_register(struct nvme_ctrl *ctrl)
> > > > +{
> > > > +     struct nvme_smart_log *log;
> > > > +     int ret;
> > > > +     int i;
> > > > +
> > > > +     log = kzalloc(sizeof(*log), GFP_KERNEL);
> > > > +     if (!log)
> > > > +             return 0; /* non-fatal error */
> > >
> > > I am not sure about this API design here. I would leave the error
> > > handling and judging if this is fatal or not to the caller.
> > > I mean, if I ask to register a nvme thermal zone and I get
> > > a 0 as response, I would assume the thermal zone exists from
> > > now on, right?
> >
> > This routine is designed to return error code only when we're unable to
> > communicate with the device at all (i.e. nvme_submit_sync_cmd returns a
> > negative value).
> >
> > We don't want to abandon device initialization just due to thermal zone
> > failures.  Because thermal zone device isn't mandatory to manage the
> > controllers, and the device like qemu doesn't provide smart log (according
> > to Keith).
>
> That is fair.. it is just really weird to continue your business when
> the kernel cannot even allocate memory for you..

Right.  The system is not unlikely to continue working properly when under
page size allocation with GFP_KERNEL is failed, unless fault injection for
allocation is enabled.

  reply	other threads:[~2019-06-07 15:21 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-26 16:29 [PATCH v3 0/3] nvme: add thermal zone devices Akinobu Mita
2019-05-26 16:29 ` Akinobu Mita
2019-05-26 16:29 ` [PATCH v3 1/3] nvme: Export get and set features Akinobu Mita
2019-05-26 16:29   ` Akinobu Mita
2019-05-26 16:45   ` Chaitanya Kulkarni
2019-05-26 16:45     ` Chaitanya Kulkarni
2019-05-29 15:19   ` Minwoo Im
2019-05-29 15:19     ` Minwoo Im
2019-05-26 16:29 ` [PATCH v3 2/3] nvme: add thermal zone devices Akinobu Mita
2019-05-26 16:29   ` Akinobu Mita
2019-05-29 15:15   ` Minwoo Im
2019-05-29 15:15     ` Minwoo Im
2019-05-29 16:47     ` Akinobu Mita
2019-05-29 16:47       ` Akinobu Mita
2019-05-30 10:18       ` Minwoo Im
2019-05-30 10:18         ` Minwoo Im
2019-06-01  9:02   ` Christoph Hellwig
2019-06-01  9:02     ` Christoph Hellwig
2019-06-02 13:19     ` Akinobu Mita
2019-06-02 13:19       ` Akinobu Mita
2019-06-04  7:31       ` Christoph Hellwig
2019-06-04  7:31         ` Christoph Hellwig
2019-06-05 15:42         ` Akinobu Mita
2019-06-05 15:42           ` Akinobu Mita
2019-06-03  2:36   ` Eduardo Valentin
2019-06-03  2:36     ` Eduardo Valentin
2019-06-03 15:20     ` Akinobu Mita
2019-06-03 15:20       ` Akinobu Mita
2019-06-06  4:05       ` Eduardo Valentin
2019-06-06  4:05         ` Eduardo Valentin
2019-06-07 15:21         ` Akinobu Mita [this message]
2019-06-07 15:21           ` Akinobu Mita
2019-05-26 16:29 ` [PATCH v3 3/3] nvme: notify thermal framework when temperature threshold events occur Akinobu Mita
2019-05-26 16:29   ` Akinobu Mita
2019-06-01  9:03   ` Christoph Hellwig
2019-06-01  9:03     ` Christoph Hellwig
2019-06-02 13:46     ` Akinobu Mita
2019-06-02 13:46       ` Akinobu Mita
2019-06-04  7:32       ` Christoph Hellwig
2019-06-04  7:32         ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAC5umyhx-Ger4ZGcEu+mDNxgPPBCBEROhAEODAKqSv_p9QRw4g@mail.gmail.com \
    --to=akinobu.mita@gmail.com \
    --cc=Chaitanya.Kulkarni@wdc.com \
    --cc=axboe@fb.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=edubezval@gmail.com \
    --cc=hch@lst.de \
    --cc=keith.busch@intel.com \
    --cc=kenneth.heitke@intel.com \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=minwoo.im.dev@gmail.com \
    --cc=rui.zhang@intel.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.