From: Akinobu Mita <akinobu.mita@gmail.com> To: Eduardo Valentin <edubezval@gmail.com> Cc: linux-nvme@lists.infradead.org, linux-pm@vger.kernel.org, Zhang Rui <rui.zhang@intel.com>, Daniel Lezcano <daniel.lezcano@linaro.org>, Keith Busch <keith.busch@intel.com>, Jens Axboe <axboe@fb.com>, Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>, Minwoo Im <minwoo.im.dev@gmail.com>, Kenneth Heitke <kenneth.heitke@intel.com>, Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com> Subject: Re: [PATCH v3 2/3] nvme: add thermal zone devices Date: Sat, 8 Jun 2019 00:21:12 +0900 [thread overview] Message-ID: <CAC5umyhx-Ger4ZGcEu+mDNxgPPBCBEROhAEODAKqSv_p9QRw4g@mail.gmail.com> (raw) In-Reply-To: <20190606040506.GA1913@localhost.localdomain> 2019年6月6日(木) 13:05 Eduardo Valentin <edubezval@gmail.com>: > > > > +static struct thermal_zone_params nvme_tz_params = { > > > > + .governor_name = "user_space", > > > > > > Also, > > > > > > Was there any particular reason why defaulting to user_space here? > > > > I only tested with the user_space governor. There is no cooling device > > to bind with this. > > > > hmmm.. I see. But was there any particular reason to pick the thermal > subsystem here if you do not have any cooling? Or is there any specific > cooling that can be written in future? > > If no cooling is expected, why not a simple hwmon? For example, we can use USB port powered cooling fan with USB hub that supports per-port power switching. It can be turned on and off by user space tool. > > > > + .no_hwmon = true, > > > > +}; > > > > + > > > > +static struct thermal_zone_device * > > > > +nvme_thermal_zone_register(struct nvme_ctrl *ctrl, unsigned int sensor) > > > > +{ > > > > + struct thermal_zone_device *tzdev; > > > > + char name[THERMAL_NAME_LENGTH]; > > > > + int ret; > > > > + > > > > + snprintf(name, sizeof(name), "nvme%d_temp%u", ctrl->instance, sensor); > > > > + > > > > + tzdev = thermal_zone_device_register(name, 1, 1, ctrl, &nvme_tz_ops, > > > > + &nvme_tz_params, 0, 0); > > > > + if (IS_ERR(tzdev)) { > > > > + dev_err(ctrl->device, > > > > + "Failed to register thermal zone device: %ld\n", > > > > + PTR_ERR(tzdev)); > > > > + return tzdev; > > > > + } > > > > + > > > > + snprintf(name, sizeof(name), "temp%d", sensor); > > > > + ret = sysfs_create_link(&ctrl->ctrl_device.kobj, &tzdev->device.kobj, > > > > + name); > > > > + if (ret) > > > > + goto device_unregister; > > > > + > > > > + ret = sysfs_create_link(&tzdev->device.kobj, > > > > + &ctrl->ctrl_device.kobj, "device"); > > > > + if (ret) > > > > + goto remove_link; > > > > + > > > > + return tzdev; > > > > + > > > > +remove_link: > > > > + sysfs_remove_link(&ctrl->ctrl_device.kobj, name); > > > > +device_unregister: > > > > + thermal_zone_device_unregister(tzdev); > > > > + > > > > + return ERR_PTR(ret); > > > > +} > > > > + > > > > +/** > > > > + * nvme_thermal_zones_register() - register nvme thermal zone devices > > > > + * @ctrl: controller instance > > > > + * > > > > + * This function creates up to nine thermal zone devices for all implemented > > > > + * temperature sensors including the composite temperature. > > > > + * Each thermal zone device provides a single trip point temperature that is > > > > + * associated with an over temperature threshold. > > > > + */ > > > > +static int nvme_thermal_zones_register(struct nvme_ctrl *ctrl) > > > > +{ > > > > + struct nvme_smart_log *log; > > > > + int ret; > > > > + int i; > > > > + > > > > + log = kzalloc(sizeof(*log), GFP_KERNEL); > > > > + if (!log) > > > > + return 0; /* non-fatal error */ > > > > > > I am not sure about this API design here. I would leave the error > > > handling and judging if this is fatal or not to the caller. > > > I mean, if I ask to register a nvme thermal zone and I get > > > a 0 as response, I would assume the thermal zone exists from > > > now on, right? > > > > This routine is designed to return error code only when we're unable to > > communicate with the device at all (i.e. nvme_submit_sync_cmd returns a > > negative value). > > > > We don't want to abandon device initialization just due to thermal zone > > failures. Because thermal zone device isn't mandatory to manage the > > controllers, and the device like qemu doesn't provide smart log (according > > to Keith). > > That is fair.. it is just really weird to continue your business when > the kernel cannot even allocate memory for you.. Right. The system is not unlikely to continue working properly when under page size allocation with GFP_KERNEL is failed, unless fault injection for allocation is enabled.
WARNING: multiple messages have this Message-ID (diff)
From: akinobu.mita@gmail.com (Akinobu Mita) Subject: [PATCH v3 2/3] nvme: add thermal zone devices Date: Sat, 8 Jun 2019 00:21:12 +0900 [thread overview] Message-ID: <CAC5umyhx-Ger4ZGcEu+mDNxgPPBCBEROhAEODAKqSv_p9QRw4g@mail.gmail.com> (raw) In-Reply-To: <20190606040506.GA1913@localhost.localdomain> 2019?6?6?(?) 13:05 Eduardo Valentin <edubezval at gmail.com>: > > > > +static struct thermal_zone_params nvme_tz_params = { > > > > + .governor_name = "user_space", > > > > > > Also, > > > > > > Was there any particular reason why defaulting to user_space here? > > > > I only tested with the user_space governor. There is no cooling device > > to bind with this. > > > > hmmm.. I see. But was there any particular reason to pick the thermal > subsystem here if you do not have any cooling? Or is there any specific > cooling that can be written in future? > > If no cooling is expected, why not a simple hwmon? For example, we can use USB port powered cooling fan with USB hub that supports per-port power switching. It can be turned on and off by user space tool. > > > > + .no_hwmon = true, > > > > +}; > > > > + > > > > +static struct thermal_zone_device * > > > > +nvme_thermal_zone_register(struct nvme_ctrl *ctrl, unsigned int sensor) > > > > +{ > > > > + struct thermal_zone_device *tzdev; > > > > + char name[THERMAL_NAME_LENGTH]; > > > > + int ret; > > > > + > > > > + snprintf(name, sizeof(name), "nvme%d_temp%u", ctrl->instance, sensor); > > > > + > > > > + tzdev = thermal_zone_device_register(name, 1, 1, ctrl, &nvme_tz_ops, > > > > + &nvme_tz_params, 0, 0); > > > > + if (IS_ERR(tzdev)) { > > > > + dev_err(ctrl->device, > > > > + "Failed to register thermal zone device: %ld\n", > > > > + PTR_ERR(tzdev)); > > > > + return tzdev; > > > > + } > > > > + > > > > + snprintf(name, sizeof(name), "temp%d", sensor); > > > > + ret = sysfs_create_link(&ctrl->ctrl_device.kobj, &tzdev->device.kobj, > > > > + name); > > > > + if (ret) > > > > + goto device_unregister; > > > > + > > > > + ret = sysfs_create_link(&tzdev->device.kobj, > > > > + &ctrl->ctrl_device.kobj, "device"); > > > > + if (ret) > > > > + goto remove_link; > > > > + > > > > + return tzdev; > > > > + > > > > +remove_link: > > > > + sysfs_remove_link(&ctrl->ctrl_device.kobj, name); > > > > +device_unregister: > > > > + thermal_zone_device_unregister(tzdev); > > > > + > > > > + return ERR_PTR(ret); > > > > +} > > > > + > > > > +/** > > > > + * nvme_thermal_zones_register() - register nvme thermal zone devices > > > > + * @ctrl: controller instance > > > > + * > > > > + * This function creates up to nine thermal zone devices for all implemented > > > > + * temperature sensors including the composite temperature. > > > > + * Each thermal zone device provides a single trip point temperature that is > > > > + * associated with an over temperature threshold. > > > > + */ > > > > +static int nvme_thermal_zones_register(struct nvme_ctrl *ctrl) > > > > +{ > > > > + struct nvme_smart_log *log; > > > > + int ret; > > > > + int i; > > > > + > > > > + log = kzalloc(sizeof(*log), GFP_KERNEL); > > > > + if (!log) > > > > + return 0; /* non-fatal error */ > > > > > > I am not sure about this API design here. I would leave the error > > > handling and judging if this is fatal or not to the caller. > > > I mean, if I ask to register a nvme thermal zone and I get > > > a 0 as response, I would assume the thermal zone exists from > > > now on, right? > > > > This routine is designed to return error code only when we're unable to > > communicate with the device at all (i.e. nvme_submit_sync_cmd returns a > > negative value). > > > > We don't want to abandon device initialization just due to thermal zone > > failures. Because thermal zone device isn't mandatory to manage the > > controllers, and the device like qemu doesn't provide smart log (according > > to Keith). > > That is fair.. it is just really weird to continue your business when > the kernel cannot even allocate memory for you.. Right. The system is not unlikely to continue working properly when under page size allocation with GFP_KERNEL is failed, unless fault injection for allocation is enabled.
next prev parent reply other threads:[~2019-06-07 15:21 UTC|newest] Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-05-26 16:29 [PATCH v3 0/3] nvme: add thermal zone devices Akinobu Mita 2019-05-26 16:29 ` Akinobu Mita 2019-05-26 16:29 ` [PATCH v3 1/3] nvme: Export get and set features Akinobu Mita 2019-05-26 16:29 ` Akinobu Mita 2019-05-26 16:45 ` Chaitanya Kulkarni 2019-05-26 16:45 ` Chaitanya Kulkarni 2019-05-29 15:19 ` Minwoo Im 2019-05-29 15:19 ` Minwoo Im 2019-05-26 16:29 ` [PATCH v3 2/3] nvme: add thermal zone devices Akinobu Mita 2019-05-26 16:29 ` Akinobu Mita 2019-05-29 15:15 ` Minwoo Im 2019-05-29 15:15 ` Minwoo Im 2019-05-29 16:47 ` Akinobu Mita 2019-05-29 16:47 ` Akinobu Mita 2019-05-30 10:18 ` Minwoo Im 2019-05-30 10:18 ` Minwoo Im 2019-06-01 9:02 ` Christoph Hellwig 2019-06-01 9:02 ` Christoph Hellwig 2019-06-02 13:19 ` Akinobu Mita 2019-06-02 13:19 ` Akinobu Mita 2019-06-04 7:31 ` Christoph Hellwig 2019-06-04 7:31 ` Christoph Hellwig 2019-06-05 15:42 ` Akinobu Mita 2019-06-05 15:42 ` Akinobu Mita 2019-06-03 2:36 ` Eduardo Valentin 2019-06-03 2:36 ` Eduardo Valentin 2019-06-03 15:20 ` Akinobu Mita 2019-06-03 15:20 ` Akinobu Mita 2019-06-06 4:05 ` Eduardo Valentin 2019-06-06 4:05 ` Eduardo Valentin 2019-06-07 15:21 ` Akinobu Mita [this message] 2019-06-07 15:21 ` Akinobu Mita 2019-05-26 16:29 ` [PATCH v3 3/3] nvme: notify thermal framework when temperature threshold events occur Akinobu Mita 2019-05-26 16:29 ` Akinobu Mita 2019-06-01 9:03 ` Christoph Hellwig 2019-06-01 9:03 ` Christoph Hellwig 2019-06-02 13:46 ` Akinobu Mita 2019-06-02 13:46 ` Akinobu Mita 2019-06-04 7:32 ` Christoph Hellwig 2019-06-04 7:32 ` Christoph Hellwig
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CAC5umyhx-Ger4ZGcEu+mDNxgPPBCBEROhAEODAKqSv_p9QRw4g@mail.gmail.com \ --to=akinobu.mita@gmail.com \ --cc=Chaitanya.Kulkarni@wdc.com \ --cc=axboe@fb.com \ --cc=daniel.lezcano@linaro.org \ --cc=edubezval@gmail.com \ --cc=hch@lst.de \ --cc=keith.busch@intel.com \ --cc=kenneth.heitke@intel.com \ --cc=linux-nvme@lists.infradead.org \ --cc=linux-pm@vger.kernel.org \ --cc=minwoo.im.dev@gmail.com \ --cc=rui.zhang@intel.com \ --cc=sagi@grimberg.me \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.