From: Vadim Pasternak <vadimp@nvidia.com>
To: Daniel Lezcano <daniel.lezcano@linaro.org>,
	"davem@davemloft.net" <davem@davemloft.net>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux@roeck-us.net" <linux@roeck-us.net>,
	"rui.zhang@intel.com" <rui.zhang@intel.com>,
	"edubezval@gmail.com" <edubezval@gmail.com>,
	"jiri@resnulli.us" <jiri@resnulli.us>,
	Ido Schimmel <idosch@nvidia.com>
Subject: RE: [patch net-next RFC v1] mlxsw: core: Add the hottest thermal zone detection
Date: Wed, 15 Jun 2022 22:06:27 +0000	[thread overview]
Message-ID: <BN9PR12MB53814C07F1FF66C06BCDC5FAAFAD9@BN9PR12MB5381.namprd12.prod.outlook.com> (raw)
In-Reply-To: <f3c62ebe-7d59-c537-a010-bff366c8aeba@linaro.org>

Hi Daniel,

> -----Original Message-----
> From: Daniel Lezcano <daniel.lezcano@linaro.org>
> Sent: Wednesday, June 15, 2022 11:32 PM
> To: Vadim Pasternak <vadimp@nvidia.com>; davem@davemloft.net
> Cc: netdev@vger.kernel.org; linux@roeck-us.net; rui.zhang@intel.com;
> edubezval@gmail.com; jiri@resnulli.us; Ido Schimmel <idosch@nvidia.com>
> Subject: Re: [patch net-next RFC v1] mlxsw: core: Add the hottest thermal
> zone detection
> 
> 
> Hi Vadim,
> 
> On 29/05/2019 15:52, Vadim Pasternak wrote:
> > When multiple sensors are mapped to the same cooling device, the
> > cooling device should be set according to the worst sensor among the
> > sensors associated with this cooling device.
> >
> > Provide the hottest thermal zone detection and force the cooling device
> > to follow the temperature trends of the hottest zone only.
> > Prevent competition for the cooling device control from other zones
> > by a "stable trend" indication. A cooling device will not perform any
> > actions associated with a zone with "stable trend".
> >
> > When another thermal zone is detected as the hottest, the cooling device
> > is switched to follow the temperature trends of the new hottest zone.
> >
> > Thermal zone score is represented by a 32-bit unsigned integer and
> > calculated according to the following formula:
> > For T < TZ<t><i>, where t from {normal trip = 0, high trip = 1, hot
> > trip = 2, critical = 3}:
> > TZ<i> score = (T + (TZ<t><i> - T) / 2) / (TZ<t><i> - T) * 256 ** j;
> > Highest thermal zone score s is set as MAX(TZ<i>score); Following this
> > formula, if TZ<i> is in trip point higher than TZ<k>, the higher score
> > is to be always assigned to TZ<i>.
> >
> > For two thermal zones located at the same kind of trip point, the
> > higher score will be assigned to the zone that is closer to the next
> > trip point.
> > Thus, the highest score will always be assigned objectively to the
> > hottest thermal zone.
> 
> While reading the code I noticed this change and I was wondering why it was
> needed.
> 
> The thermal framework already aggregates the mitigation decisions,
> taking the highest cooling state [1].
> 
> That allows for instance a spanning fan on a dual socket. Two thermal zones
> for one cooling device.

Here the hottest thermal zone is calculated across different thermal zone devices: for example,
each optical transceiver or gearbox is a separate 'tzdev', while all of them share the same
cooling device. There could be up to 128 transceivers.
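
For reference, the score from the quoted commit message can be sketched in
userspace C as below. This is only an illustration under stated assumptions,
not the driver code: the exponent "j" in the formula is assumed to be the trip
index t, all divisions are assumed to be integer divisions, temperatures are
assumed to be non-negative, and the names tz_score(), hottest_zone() and
example_trips, as well as the trip values, are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_TRIPS 4	/* normal = 0, high = 1, hot = 2, critical = 3 */

/* Hypothetical trip temperatures in millidegrees Celsius. */
static const int example_trips[NUM_TRIPS] = { 75000, 85000, 105000, 110000 };

/*
 * Score for one zone: (T + (trip - T) / 2) / (trip - T) * 256**t, where t is
 * the index of the first trip point above T (assumption: j == t).
 * A zone at or above the critical trip gets the maximum score.
 * Assumes temp >= 0.
 */
static uint32_t tz_score(int temp, const int trips[NUM_TRIPS])
{
	for (size_t t = 0; t < NUM_TRIPS; t++) {
		if (temp < trips[t]) {
			int delta = trips[t] - temp;
			uint32_t ratio = (uint32_t)((temp + delta / 2) / delta);

			return ratio * ((uint32_t)1 << (8 * t));
		}
	}
	return UINT32_MAX;
}

/* Index of the zone with the highest score; the first zone wins a tie. */
static size_t hottest_zone(const int *temps, size_t n,
			   const int trips[NUM_TRIPS])
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (tz_score(temps[i], trips) > tz_score(temps[best], trips))
			best = i;
	return best;
}
```

With these example trips, a zone at 80000 (between the normal and high trips)
scores 16 * 256 = 4096, while a zone at 74000 (still below the normal trip)
scores 74, so the first one is selected as the hottest.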

The intention was also to avoid competition between thermal zones when some of them
are in a trend-up state and some in a trend-down state.

Are you saying that the code below will work for such a case?

	/* Make sure cdev enters the deepest cooling state */
	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
			instance->tz->id, instance->target);
		if (instance->target == THERMAL_NO_TARGET)
			continue;
		if (instance->target > target)
			target = instance->target;
	}
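
To make the question concrete, the loop above can be modelled in userspace as
follows. This is a simplified sketch, not the thermal core code: struct
zone_request, aggregate_target() and NO_TARGET are hypothetical stand-ins for
the thermal instance list, the surrounding helper and THERMAL_NO_TARGET.

```c
#include <stddef.h>

#define NO_TARGET (~0UL)	/* stand-in for THERMAL_NO_TARGET */

/* Hypothetical stand-in for one zone's request on a shared cooling device. */
struct zone_request {
	int zone_id;
	unsigned long target;	/* requested cooling state, or NO_TARGET */
};

/* Return the deepest cooling state requested by any zone, 0 if none. */
static unsigned long aggregate_target(const struct zone_request *req, size_t n)
{
	unsigned long target = 0;

	for (size_t i = 0; i < n; i++) {
		if (req[i].target == NO_TARGET)
			continue;	/* this zone has no opinion */
		if (req[i].target > target)
			target = req[i].target;
	}
	return target;
}
```

In this model a zone that asks for state 2 and another that asks for state 5
result in state 5, regardless of which zone was evaluated last; zones with no
target are skipped.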

> 
> AFAICS, the code hijacked the get_trend function just for the sake of
> returning 1 for the hotter thermal zone leading to a computation of the trend
> in the thermal core code.

Yes, get_trend() returns one just to indicate that the cooling device should not be
touched for a thermal zone that is not the hottest.
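
The behaviour discussed here can be sketched in userspace C as below. It is
an illustration of the idea only, assuming that a nonzero return from the
callback makes the caller compute the trend from the temperature history, as
described in the quoted text. enum trend, zone_get_trend() and
effective_trend() are hypothetical names, not the kernel API.

```c
enum trend { TREND_STABLE, TREND_RAISING, TREND_DROPPING };

/*
 * Zone callback: return nonzero for the hottest zone so that the caller
 * computes the trend itself; report a stable trend for every other zone.
 */
static int zone_get_trend(int zone_id, int hottest_id, enum trend *trend)
{
	if (zone_id == hottest_id)
		return 1;
	*trend = TREND_STABLE;
	return 0;
}

/* Caller: use the callback's answer, else derive it from temperatures. */
static enum trend effective_trend(int zone_id, int hottest_id,
				  int temp, int last_temp)
{
	enum trend trend;

	if (zone_get_trend(zone_id, hottest_id, &trend) == 0)
		return trend;
	if (temp > last_temp)
		return TREND_RAISING;
	if (temp < last_temp)
		return TREND_DROPPING;
	return TREND_STABLE;
}
```

In this model only the hottest zone ever reports a raising or dropping trend,
so only that zone can drive the shared cooling device.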

> 
> I would like to get rid of the get_trend ops in the thermal framework, and
> the changes in this patch seem pointless, as the aggregation of the cooling
> action is already handled in the thermal framework.
> 
> Given the above, it would make sense to revert commit 6f73862fabd93 and
> 2dc2f760052da ?

I believe we should run thermal emulation to validate we are OK.

Thanks,
Vadim.

> 
> Thanks
> 
>    -- Daniel
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git/tree/drivers/thermal/thermal_helpers.c#n190
> 
> 
> [ ... ]
> 
> 
> --
> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog

Thread overview: 5+ messages
2019-05-29 13:52 [patch net-next RFC v1] mlxsw: core: Add the hottest thermal zone detection Vadim Pasternak
2022-06-15 20:31 ` Daniel Lezcano
2022-06-15 22:06   ` Vadim Pasternak [this message]
2022-06-15 22:27     ` Daniel Lezcano
2022-06-16  3:41       ` [EXTERNAL][patch " Eduardo Valentin
