linux-kernel.vger.kernel.org archive mirror
* [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
@ 2016-07-11 15:18 Prarit Bhargava
  2016-07-11 16:07 ` Coelho, Luciano
  2016-07-11 18:00 ` Emmanuel Grumbach
  0 siblings, 2 replies; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-11 15:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Kalle Valo, Chaya Rachel Ivgi, Sara Sharon,
	linux-wireless, netdev

Didn't get any feedback or review comments on this patch.  Resending ...

P.

---8<---

The iwlwifi driver implements a thermal zone and hwmon device, but
returns -EIO on temperature reads if the firmware isn't loaded.  This
results in the error

iwlwifi-virtual-0
Adapter: Virtual device
ERROR: Can't get value of subfeature temp1_input: I/O error
temp1:            N/A

being output when using sensors from the lm-sensors package.  Since
the temperature cannot be read unless the ucode is loaded there is no
reason to add the interface only to have it return an error 100% of
the time.

This patch moves the firmware check to iwl_mvm_thermal_zone_register() and
stops the thermal zone from being created if the ucode hasn't been loaded.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Intel Linux Wireless <linuxwifi@intel.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
Cc: Sara Sharon <sara.sharon@intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 drivers/net/wireless/intel/iwlwifi/mvm/tt.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
index 58fc7b3c711c..64802659711f 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
@@ -634,11 +634,6 @@ static int iwl_mvm_tzone_get_temp(struct thermal_zone_device *device,
 
 	mutex_lock(&mvm->mutex);
 
-	if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
-		ret = -EIO;
-		goto out;
-	}
-
 	ret = iwl_mvm_get_temp(mvm, &temp);
 	if (ret)
 		goto out;
@@ -684,11 +679,6 @@ static int iwl_mvm_tzone_set_trip_temp(struct thermal_zone_device *device,
 
 	mutex_lock(&mvm->mutex);
 
-	if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
-		ret = -EIO;
-		goto out;
-	}
-
 	if (trip < 0 || trip >= IWL_MAX_DTS_TRIPS) {
 		ret = -EINVAL;
 		goto out;
@@ -750,6 +740,9 @@ static void iwl_mvm_thermal_zone_register(struct iwl_mvm *mvm)
 		return;
 	}
 
+	if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR))
+		return;
+
 	BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH);
 
 	mvm->tz_device.tzone = thermal_zone_device_register(name,
-- 
1.7.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 15:18 [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded Prarit Bhargava
@ 2016-07-11 16:07 ` Coelho, Luciano
  2016-07-11 17:00   ` Prarit Bhargava
  2016-07-11 18:00 ` Emmanuel Grumbach
  1 sibling, 1 reply; 19+ messages in thread
From: Coelho, Luciano @ 2016-07-11 16:07 UTC (permalink / raw)
  To: linux-kernel, prarit
  Cc: linuxwifi, Berg, Johannes, kvalo, Ivgi, Chaya Rachel, netdev,
	Sharon, Sara, linux-wireless, Grumbach, Emmanuel

On Mon, 2016-07-11 at 11:18 -0400, Prarit Bhargava wrote:
> Didn't get any feedback or review comments on this patch.  Resending
> ...
> 
> P.

Sorry, this got flooded down my inbox.


> ---8<---
> 
> The iwlwifi driver implements a thermal zone and hwmon device, but
> returns -EIO on temperature reads if the firmware isn't loaded.  This
> results in the error
> 
> iwlwifi-virtual-0
> Adapter: Virtual device
> ERROR: Can't get value of subfeature temp1_input: I/O error
> temp1:            N/A
> 
> being output when using sensors from the lm-sensors package.  Since
> the temperature cannot be read unless the ucode is loaded there is no
> reason to add the interface only to have it return an error 100% of
> the time.
> 
> This patch moves the firmware check to
> iwl_mvm_thermal_zone_register() and
> stops the thermal zone from being created if the ucode hasn't been
> loaded.
> 
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Cc: Johannes Berg <johannes.berg@intel.com>
> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
> Cc: Luca Coelho <luciano.coelho@intel.com>
> Cc: Intel Linux Wireless <linuxwifi@intel.com>
> Cc: Kalle Valo <kvalo@codeaurora.org>
> Cc: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
> Cc: Sara Sharon <sara.sharon@intel.com>
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org
> ---

I have now sent it for review on our internal tree.

--
Luca.


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 16:07 ` Coelho, Luciano
@ 2016-07-11 17:00   ` Prarit Bhargava
  0 siblings, 0 replies; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-11 17:00 UTC (permalink / raw)
  To: Coelho, Luciano, linux-kernel
  Cc: linuxwifi, Berg, Johannes, kvalo, Ivgi, Chaya Rachel, netdev,
	Sharon, Sara, linux-wireless, Grumbach, Emmanuel



On 07/11/2016 12:07 PM, Coelho, Luciano wrote:
> On Mon, 2016-07-11 at 11:18 -0400, Prarit Bhargava wrote:
>> Didn't get any feedback or review comments on this patch.  Resending
>> ...
>>
>> P.
> 
> Sorry, this got flooded down my inbox.

NP, Luciano -- My worry was that it hadn't been seen or didn't make it out to
the list.

I'm being a bit impatient too ;)

P.

> 
> 
>> ---8<---
>>
>> The iwlwifi driver implements a thermal zone and hwmon device, but
>> returns -EIO on temperature reads if the firmware isn't loaded.  This
>> results in the error
>>
>> iwlwifi-virtual-0
>> Adapter: Virtual device
>> ERROR: Can't get value of subfeature temp1_input: I/O error
>> temp1:            N/A
>>
>> being output when using sensors from the lm-sensors package.  Since
>> the temperature cannot be read unless the ucode is loaded there is no
>> reason to add the interface only to have it return an error 100% of
>> the time.
>>
>> This patch moves the firmware check to
>> iwl_mvm_thermal_zone_register() and
>> stops the thermal zone from being created if the ucode hasn't been
>> loaded.
>>
>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>> Cc: Johannes Berg <johannes.berg@intel.com>
>> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
>> Cc: Luca Coelho <luciano.coelho@intel.com>
>> Cc: Intel Linux Wireless <linuxwifi@intel.com>
>> Cc: Kalle Valo <kvalo@codeaurora.org>
>> Cc: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
>> Cc: Sara Sharon <sara.sharon@intel.com>
>> Cc: linux-wireless@vger.kernel.org
>> Cc: netdev@vger.kernel.org
>> ---
> 
> I have now sent it for review on our internal tree.
> 
> --
> Luca.
> 


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 15:18 [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded Prarit Bhargava
  2016-07-11 16:07 ` Coelho, Luciano
@ 2016-07-11 18:00 ` Emmanuel Grumbach
  2016-07-11 18:19   ` Prarit Bhargava
  1 sibling, 1 reply; 19+ messages in thread
From: Emmanuel Grumbach @ 2016-07-11 18:00 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Kalle Valo, Chaya Rachel Ivgi, Sara Sharon,
	linux-wireless, netdev

On Mon, Jul 11, 2016 at 6:18 PM, Prarit Bhargava <prarit@redhat.com> wrote:
>
> Didn't get any feedback or review comments on this patch.  Resending ...
>
> P.

This change is obviously completely broken. It simply disables the
registration to thermal zone core.

>
> ---8<---
>
> The iwlwifi driver implements a thermal zone and hwmon device, but
> returns -EIO on temperature reads if the firmware isn't loaded.  This
> results in the error
>
> iwlwifi-virtual-0
> Adapter: Virtual device
> ERROR: Can't get value of subfeature temp1_input: I/O error
> temp1:            N/A
>
> being output when using sensors from the lm-sensors package.  Since
> the temperature cannot be read unless the ucode is loaded there is no
> reason to add the interface only to have it return an error 100% of
> the time.
>
> This patch moves the firmware check to iwl_mvm_thermal_zone_register() and
> stops the thermal zone from being created if the ucode hasn't been loaded.
>
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Cc: Johannes Berg <johannes.berg@intel.com>
> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
> Cc: Luca Coelho <luciano.coelho@intel.com>
> Cc: Intel Linux Wireless <linuxwifi@intel.com>
> Cc: Kalle Valo <kvalo@codeaurora.org>
> Cc: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
> Cc: Sara Sharon <sara.sharon@intel.com>
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org
> ---
>  drivers/net/wireless/intel/iwlwifi/mvm/tt.c |   13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> index 58fc7b3c711c..64802659711f 100644
> --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> @@ -634,11 +634,6 @@ static int iwl_mvm_tzone_get_temp(struct thermal_zone_device *device,
>
>         mutex_lock(&mvm->mutex);
>
> -       if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
> -               ret = -EIO;
> -               goto out;
> -       }
> -
>         ret = iwl_mvm_get_temp(mvm, &temp);
>         if (ret)
>                 goto out;
> @@ -684,11 +679,6 @@ static int iwl_mvm_tzone_set_trip_temp(struct thermal_zone_device *device,
>
>         mutex_lock(&mvm->mutex);
>
> -       if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
> -               ret = -EIO;
> -               goto out;
> -       }
> -
>         if (trip < 0 || trip >= IWL_MAX_DTS_TRIPS) {
>                 ret = -EINVAL;
>                 goto out;
> @@ -750,6 +740,9 @@ static void iwl_mvm_thermal_zone_register(struct iwl_mvm *mvm)
>                 return;
>         }
>
> +       if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR))
> +               return;
> +
>         BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH);
>
>         mvm->tz_device.tzone = thermal_zone_device_register(name,
> --
> 1.7.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 18:00 ` Emmanuel Grumbach
@ 2016-07-11 18:19   ` Prarit Bhargava
  2016-07-11 18:27     ` Grumbach, Emmanuel
  0 siblings, 1 reply; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-11 18:19 UTC (permalink / raw)
  To: Emmanuel Grumbach
  Cc: linux-kernel, Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Kalle Valo, Chaya Rachel Ivgi, Sara Sharon,
	linux-wireless, netdev



On 07/11/2016 02:00 PM, Emmanuel Grumbach wrote:
> On Mon, Jul 11, 2016 at 6:18 PM, Prarit Bhargava <prarit@redhat.com> wrote:
>>
>> Didn't get any feedback or review comments on this patch.  Resending ...
>>
>> P.
> 
> This change is obviously completely broken. It simply disables the
> registration to thermal zone core.

No it is not broken, and yes, that is exactly what should happen IMO.

The problem is that the iwlwifi driver implements the thermal zone even when the
device doesn't support it.

As can be seen in the current code base, iwl_mvm_tzone_get_temp() will return
-EIO 100% of the time when the firmware doesn't support reading the
temperature[1].  In this case a read of sysfs will result in a return of -EIO,
and this breaks existing userspace programs such as lm-sensors (which by all
accounts is bad to do).

Note that in my patch I have removed the -EIO return in favor of not registering
the non-existent thermal zone.  I'm not removing any functionality by changing
this, nor am I adding functionality.  In both cases the thermal zone is not
functional, and with my patch userspace continues to work.

P.

[1] iwl_mvm_tzone_set_trip_temp() also returns -EIO, so setting and getting of
the temperature is non-functional.


> 
>>
>> ---8<---
>>
>> The iwlwifi driver implements a thermal zone and hwmon device, but
>> returns -EIO on temperature reads if the firmware isn't loaded.  This
>> results in the error
>>
>> iwlwifi-virtual-0
>> Adapter: Virtual device
>> ERROR: Can't get value of subfeature temp1_input: I/O error
>> temp1:            N/A
>>
>> being output when using sensors from the lm-sensors package.  Since
>> the temperature cannot be read unless the ucode is loaded there is no
>> reason to add the interface only to have it return an error 100% of
>> the time.
>>
>> This patch moves the firmware check to iwl_mvm_thermal_zone_register() and
>> stops the thermal zone from being created if the ucode hasn't been loaded.
>>
>> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
>> Cc: Johannes Berg <johannes.berg@intel.com>
>> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
>> Cc: Luca Coelho <luciano.coelho@intel.com>
>> Cc: Intel Linux Wireless <linuxwifi@intel.com>
>> Cc: Kalle Valo <kvalo@codeaurora.org>
>> Cc: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
>> Cc: Sara Sharon <sara.sharon@intel.com>
>> Cc: linux-wireless@vger.kernel.org
>> Cc: netdev@vger.kernel.org
>> ---
>>  drivers/net/wireless/intel/iwlwifi/mvm/tt.c |   13 +++----------
>>  1 file changed, 3 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
>> index 58fc7b3c711c..64802659711f 100644
>> --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
>> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
>> @@ -634,11 +634,6 @@ static int iwl_mvm_tzone_get_temp(struct thermal_zone_device *device,
>>
>>         mutex_lock(&mvm->mutex);
>>
>> -       if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
>> -               ret = -EIO;
>> -               goto out;
>> -       }
>> -
>>         ret = iwl_mvm_get_temp(mvm, &temp);
>>         if (ret)
>>                 goto out;
>> @@ -684,11 +679,6 @@ static int iwl_mvm_tzone_set_trip_temp(struct thermal_zone_device *device,
>>
>>         mutex_lock(&mvm->mutex);
>>
>> -       if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
>> -               ret = -EIO;
>> -               goto out;
>> -       }
>> -
>>         if (trip < 0 || trip >= IWL_MAX_DTS_TRIPS) {
>>                 ret = -EINVAL;
>>                 goto out;
>> @@ -750,6 +740,9 @@ static void iwl_mvm_thermal_zone_register(struct iwl_mvm *mvm)
>>                 return;
>>         }
>>
>> +       if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR))
>> +               return;
>> +
>>         BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH);
>>
>>         mvm->tz_device.tzone = thermal_zone_device_register(name,
>> --
>> 1.7.9.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 18:19   ` Prarit Bhargava
@ 2016-07-11 18:27     ` Grumbach, Emmanuel
  2016-07-11 20:31       ` Prarit Bhargava
  2016-07-14  9:24       ` Stanislaw Gruszka
  0 siblings, 2 replies; 19+ messages in thread
From: Grumbach, Emmanuel @ 2016-07-11 18:27 UTC (permalink / raw)
  To: prarit
  Cc: linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes, kvalo,
	Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless

On Mon, 2016-07-11 at 14:19 -0400, Prarit Bhargava wrote:
> 
> On 07/11/2016 02:00 PM, Emmanuel Grumbach wrote:
> > On Mon, Jul 11, 2016 at 6:18 PM, Prarit Bhargava <prarit@redhat.com
> > > wrote:
> > > 
> > > Didn't get any feedback or review comments on this patch. 
> > >  Resending ...
> > > 
> > > P.
> > 
> > This change is obviously completely broken. It simply disables the
> > registration to thermal zone core.
> 
> No it is not broken, and yes, that is exactly what should happen IMO.
> 
> The problem is that the iwlwifi driver implements the thermal zone
> even when the
> device doesn't support it.

We implement thermal zone because we do support it, but the problem is
that we need the firmware to be loaded for that. So you can argue that
we should register *later* when the firmware is loaded. But this is
really not helping all that much because the firmware can also be
stopped at any time. So you'd want us to register / unregister the
thermal zone anytime the firmware is loaded / unloaded?
I guess that works, but it seems wrong to me. Usually, registration
should happen only upon INIT, and yes, at that time the firmware is not
ready to provide the information yet.
Maybe returning -EBUSY would help lm-sensors not to get confused?
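
For illustration, the register-on-load / unregister-on-stop alternative raised
above can be sketched in plain C. This is a toy model under stated assumptions,
not driver code: `struct fake_mvm` and the `fake_*` helpers are hypothetical
stand-ins, and a real driver would go through the thermal core's registration
calls instead of flipping a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; NOT the real struct iwl_mvm. */
struct fake_mvm {
	bool ucode_loaded;     /* firmware is currently running */
	bool tzone_registered; /* thermal zone is currently exposed */
	int temp_mc;           /* last temperature in millidegrees Celsius */
};

/* Firmware start: expose the zone only once readings can succeed. */
static void fake_fw_start(struct fake_mvm *mvm, int temp_mc)
{
	mvm->ucode_loaded = true;
	mvm->temp_mc = temp_mc;
	mvm->tzone_registered = true;
}

/* Firmware stop: withdraw the zone before the firmware goes away. */
static void fake_fw_stop(struct fake_mvm *mvm)
{
	mvm->tzone_registered = false;
	mvm->ucode_loaded = false;
}

/* Returns 0 and fills *temp_mc, or -ENODEV when no zone is exposed. */
static int fake_get_temp(const struct fake_mvm *mvm, int *temp_mc)
{
	if (!mvm->tzone_registered)
		return -ENODEV;

	/* Invariant of this model: a registered zone implies loaded firmware. */
	assert(mvm->ucode_loaded);
	*temp_mc = mvm->temp_mc;
	return 0;
}
```

In this model a reader never sees a zone that cannot answer; the cost is that
the zone appears and disappears with every firmware start and stop, which is
the concern raised in this message.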

> 
> As can be seen in the current code base, iwl_mvm_tzone_get_temp()
> will return
> -EIO 100% of the time when the firmware doesn't support reading the
> temperature[1].  In this case a read of sysfs will result in a return
> of -EIO,
> and this breaks existing userspace programs such as lm-sensors (which
> by all
> accounts is bad to do).

Right, but I don't understand why the userspace is broken because of
that? Unless we register / unregister anytime the firmware is loaded, I
don't see any proper way to fix this. And yes, I'd expect the userspace
to handle gracefully failures in its requests.

> 
> Note that in my patch I have removed the -EIO return in favor of not
> registering
> the non-existent thermal zone.  I'm not removing any functionality by
> changing
> this, nor am I adding functionality.  In both cases the thermal zone
> is not
> functional, and with my patch userspace continues to work.

You are removing the thermal zone functionality since even when the
firmware will be loaded (which typically happens fairly quickly),
thermal zone won't work.

> 
> P.
> 
> [1] iwl_mvm_tzone_set_trip_temp() also returns -EIO, so setting and
> getting of
> the temperature is non-functional.
> 
> 
> > 
> > > 
> > > ---8<---
> > > 
> > > The iwlwifi driver implements a thermal zone and hwmon device,
> > > but
> > > returns -EIO on temperature reads if the firmware isn't loaded. 
> > >  This
> > > results in the error
> > > 
> > > iwlwifi-virtual-0
> > > Adapter: Virtual device
> > > ERROR: Can't get value of subfeature temp1_input: I/O error
> > > temp1:            N/A
> > > 
> > > being output when using sensors from the lm-sensors package. 
> > >  Since
> > > the temperature cannot be read unless the ucode is loaded there
> > > is no
> > > reason to add the interface only to have it return an error 100%
> > > of
> > > the time.
> > > 
> > > This patch moves the firmware check to
> > > iwl_mvm_thermal_zone_register() and
> > > stops the thermal zone from being created if the ucode hasn't
> > > been loaded.
> > > 
> > > Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> > > Cc: Johannes Berg <johannes.berg@intel.com>
> > > Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
> > > Cc: Luca Coelho <luciano.coelho@intel.com>
> > > Cc: Intel Linux Wireless <linuxwifi@intel.com>
> > > Cc: Kalle Valo <kvalo@codeaurora.org>
> > > Cc: Chaya Rachel Ivgi <chaya.rachel.ivgi@intel.com>
> > > Cc: Sara Sharon <sara.sharon@intel.com>
> > > Cc: linux-wireless@vger.kernel.org
> > > Cc: netdev@vger.kernel.org
> > > ---
> > >  drivers/net/wireless/intel/iwlwifi/mvm/tt.c |   13 +++----------
> > >  1 file changed, 3 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> > > b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> > > index 58fc7b3c711c..64802659711f 100644
> > > --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> > > +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> > > @@ -634,11 +634,6 @@ static int iwl_mvm_tzone_get_temp(struct
> > > thermal_zone_device *device,
> > > 
> > >         mutex_lock(&mvm->mutex);
> > > 
> > > -       if (!mvm->ucode_loaded || !(mvm->cur_ucode ==
> > > IWL_UCODE_REGULAR)) {
> > > -               ret = -EIO;
> > > -               goto out;
> > > -       }
> > > -
> > >         ret = iwl_mvm_get_temp(mvm, &temp);
> > >         if (ret)
> > >                 goto out;
> > > @@ -684,11 +679,6 @@ static int
> > > iwl_mvm_tzone_set_trip_temp(struct thermal_zone_device *device,
> > > 
> > >         mutex_lock(&mvm->mutex);
> > > 
> > > -       if (!mvm->ucode_loaded || !(mvm->cur_ucode ==
> > > IWL_UCODE_REGULAR)) {
> > > -               ret = -EIO;
> > > -               goto out;
> > > -       }
> > > -
> > >         if (trip < 0 || trip >= IWL_MAX_DTS_TRIPS) {
> > >                 ret = -EINVAL;
> > >                 goto out;
> > > @@ -750,6 +740,9 @@ static void
> > > iwl_mvm_thermal_zone_register(struct iwl_mvm *mvm)
> > >                 return;
> > >         }
> > > 
> > > +       if (!mvm->ucode_loaded || !(mvm->cur_ucode ==
> > > IWL_UCODE_REGULAR))
> > > +               return;
> > > +
> > >         BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH);
> > > 
> > >         mvm->tz_device.tzone = thermal_zone_device_register(name,
> > > --
> > > 1.7.9.3
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux
> > > -wireless" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  
> > > http://vger.kernel.org/majordomo-info.html


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 18:27     ` Grumbach, Emmanuel
@ 2016-07-11 20:31       ` Prarit Bhargava
  2016-07-13  6:50         ` Kalle Valo
  2016-07-14  9:24       ` Stanislaw Gruszka
  1 sibling, 1 reply; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-11 20:31 UTC (permalink / raw)
  To: Grumbach, Emmanuel
  Cc: linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes, kvalo,
	Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless



On 07/11/2016 02:27 PM, Grumbach, Emmanuel wrote:
> On Mon, 2016-07-11 at 14:19 -0400, Prarit Bhargava wrote:
>>
>> On 07/11/2016 02:00 PM, Emmanuel Grumbach wrote:
>>> On Mon, Jul 11, 2016 at 6:18 PM, Prarit Bhargava <prarit@redhat.com
>>>> wrote:
>>>>
>>>> Didn't get any feedback or review comments on this patch. 
>>>>  Resending ...
>>>>
>>>> P.
>>>
>>> This change is obviously completely broken. It simply disables the
>>> registration to thermal zone core.
>>
>> No it is not broken, and yes, that is exactly what should happen IMO.
>>
>> The problem is that the iwlwifi driver implements the thermal zone
>> even when the
>> device doesn't support it.
> 
> We implement thermal zone because we do support it, but the problem is
> that we need the firmware to be loaded for that. So you can argue that
> we should register *later* when the firmware is loaded. But this is
> really not helping all that much because the firmware can also be
> stopped at any time. So you'd want us to register / unregister the
> thermal zone anytime the firmware is loaded / unloaded?

You might have to do that.  I think that if the firmware enables a feature then
the act of loading the firmware should run the code that enables the feature.
IMO of course.

> I guess that works, but it seems wrong to me. Usually, registration
> should happen only upon INIT, and yes, at that time the firmware is not
> ready to provide the information yet.
> Maybe returning -EBUSY would help lm-sensors not to get confused?

I'll give that a shot, but I expect that won't work either as an error message
will still be displayed.

> 
>>
>> As can be seen in the current code base, iwl_mvm_tzone_get_temp()
>> will return
>> -EIO 100% of the time when the firmware doesn't support reading the
>> temperature[1].  In this case a read of sysfs will result in a return
>> of -EIO,
>> and this breaks existing userspace programs such as lm-sensors (which
>> by all
>> accounts is bad to do).
> 
> Right, but I don't understand why the userspace is broken because of
> that? 

Before the iwlwifi change, sensors successfully returned.  Now, because of the
error, it doesn't.

> Unless we register / unregister anytime the firmware is loaded, I
> don't see any proper way to fix this. And yes, I'd expect the userspace
> to handle gracefully failures in its requests.

I agree with you in principle *and there's a great many things I wish userspace
would do gracefully* but updating the kernel shouldn't result in userspace
programs failing.

> 
>>
>> Note that in my patch I have removed the -EIO return in favor of not
>> registering
>> the non-existent thermal zone.  I'm not removing any functionality by
>> changing
>> this, nor am I adding functionality.  In both cases the thermal zone
>> is not
>> functional, and with my patch userspace continues to work.
> 
> You are removing the thermal zone functionality since even when the
> firmware will be loaded (which typically happens fairly quickly),
> thermal zone won't work.

Then I agree with your suggestion above that you need to enable the thermal zone
on a successful load of the firmware.  [Aside: I wonder what other drivers do in
this situation?  While this does seem like an odd case, I can't believe that the
iwlwifi driver is the only driver to enable features based on firmware.]

P.


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 20:31       ` Prarit Bhargava
@ 2016-07-13  6:50         ` Kalle Valo
  2016-07-13  7:24           ` Luca Coelho
  2016-07-13 10:01           ` Prarit Bhargava
  0 siblings, 2 replies; 19+ messages in thread
From: Kalle Valo @ 2016-07-13  6:50 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Grumbach, Emmanuel, linux-kernel, linuxwifi, Coelho, Luciano,
	Berg, Johannes, Ivgi, Chaya Rachel, netdev, Sharon, Sara,
	linux-wireless

Prarit Bhargava <prarit@redhat.com> writes:

>> We implement thermal zone because we do support it, but the problem is
>> that we need the firmware to be loaded for that. So you can argue that
>> we should register *later* when the firmware is loaded. But this is
>> really not helping all that much because the firmware can also be
>> stopped at any time. So you'd want us to register / unregister the
>> thermal zone anytime the firmware is loaded / unloaded?
>
> You might have to do that.  I think that if the firmware enables a feature then
> the act of loading the firmware should run the code that enables the feature.
> IMO of course.

But I suspect that the iwlwifi firmware is loaded during interface up
(and unloaded during interface down) and in that case
register/unregister would be happening all the time. That doesn't sound
like a good idea. I would rather try to fix the thermal interface to
handle the cases when the measurement is not available.

-- 
Kalle Valo


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-13  6:50         ` Kalle Valo
@ 2016-07-13  7:24           ` Luca Coelho
  2016-07-13 10:20             ` Prarit Bhargava
  2016-07-13 10:01           ` Prarit Bhargava
  1 sibling, 1 reply; 19+ messages in thread
From: Luca Coelho @ 2016-07-13  7:24 UTC (permalink / raw)
  To: Kalle Valo, Prarit Bhargava
  Cc: Grumbach, Emmanuel, linux-kernel, linuxwifi, Berg, Johannes,
	Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless

On Wed, 2016-07-13 at 09:50 +0300, Kalle Valo wrote:
> Prarit Bhargava <prarit@redhat.com> writes:
> 
> > > We implement thermal zone because we do support it, but the
> > > problem is
> > > that we need the firmware to be loaded for that. So you can argue
> > > that
> > > we should register *later* when the firmware is loaded. But this
> > > is
> > > really not helping all that much because the firmware can also be
> > > stopped at any time. So you'd want us to register / unregister
> > > the
> > > thermal zone anytime the firmware is loaded / unloaded?
> > 
> > You might have to do that.  I think that if the firmware enables a
> > feature then
> > the act of loading the firmware should run the code that enables
> > the feature.
> > IMO of course.
> 
> But I suspect that the iwlwifi firmware is loaded during interface up
> (and unloaded during interface down) and in that case
> register/unregister would be happening all the time. That doesn't
> sound
> like a good idea. I would rather try to fix the thermal interface to
> handle the cases when the measurement is not available.

I totally agree with Emmanuel and Kalle.  We should not change this.
It is a design decision to return an error when the interface is down,
this is very common with other subsystems as well.  The userspace
should be able to handle errors and report something like "unavailable"
when this kind of error is returned.

I'm not sure EIO is the best we can have, but for me that's exactly
what it is.  The thermal zone *is* there, but cannot be accessed
because the firmware is not available.  I'm okay to change it to EBUSY,
if that would help userspace, but I think that's a bit misleading.  The
device is not busy, on the contrary, it's not even running at all.

Furthermore, I don't think this is "breaking userspace" in the sense of
being a regression.  The userspace API has always been implemented with
the possibility of returning errors.  It's not a good design if a
single device returning an error causes all the other devices to also
fail.
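
As a sketch of the userspace behaviour described above, a reader can treat any
failed read as "unavailable" for that one sensor and carry on with the rest.
The helper names `read_temp_mc` and `format_temp` are hypothetical, and this is
not how lm-sensors is implemented; it only assumes a sysfs-style attribute that
holds millidegrees Celsius as text.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Read a millidegree value from a sysfs-style file; returns 0 or -errno. */
static int read_temp_mc(const char *path, long *out)
{
	char buf[32];
	ssize_t n;
	int fd;

	assert(path != NULL && out != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, buf, sizeof(buf) - 1);
	if (n < 0) {
		int err = errno; /* save before close() can overwrite it */

		close(fd);
		return -err;     /* e.g. -EIO while the firmware is down */
	}
	close(fd);

	if (n == 0)
		return -ENODATA;

	buf[n] = '\0';
	*out = strtol(buf, NULL, 10);
	return 0;
}

/* Format one sensor line; an error degrades to "N/A" instead of aborting. */
static void format_temp(const char *path, char *dst, size_t len)
{
	long mc;

	if (read_temp_mc(path, &mc) == 0)
		snprintf(dst, len, "%.1f C", mc / 1000.0);
	else
		snprintf(dst, len, "N/A");
}
```

With this shape, an -EIO from one device only affects that device's line of
output; other sensors are read and printed independently.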

--
Cheers,
Luca.


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-13  6:50         ` Kalle Valo
  2016-07-13  7:24           ` Luca Coelho
@ 2016-07-13 10:01           ` Prarit Bhargava
  2016-07-14  7:13             ` Kalle Valo
  1 sibling, 1 reply; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-13 10:01 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Grumbach, Emmanuel, linux-kernel, linuxwifi, Coelho, Luciano,
	Berg, Johannes, Ivgi, Chaya Rachel, netdev, Sharon, Sara,
	linux-wireless



On 07/13/2016 02:50 AM, Kalle Valo wrote:
> Prarit Bhargava <prarit@redhat.com> writes:
> 
>>> We implement thermal zone because we do support it, but the problem is
>>> that we need the firmware to be loaded for that. So you can argue that
>>> we should register *later* when the firmware is loaded. But this is
>>> really not helping all that much because the firmware can also be
>>> stopped at any time. So you'd want us to register / unregister the
>>> thermal zone anytime the firmware is loaded / unloaded?
>>
>> You might have to do that.  I think that if the firmware enables a feature then
>> the act of loading the firmware should run the code that enables the feature.
>> IMO of course.
> 
> But I suspect that the iwlwifi firmware is loaded during interface up
> (and unloaded during interface down) and in that case
> register/unregister would be happening all the time. 

You make it sound like the interface is coming and going 1000 times a second.
Maybe this happens once during runtime & during suspend/resume cycles?  What
about the cases when the firmware isn't present (and that's what led me to this
bug)?

> That doesn't sound
> like a good idea. I would rather try to fix the thermal interface to
> handle the cases when the measurement is not available.
> 

Userspace is broken because of this change.  I've had to make another horrible
change to cpufreq for a similar change so I don't see the argument here to just
blame userspace and ignore the outcome of the patch.

P.


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-13  7:24           ` Luca Coelho
@ 2016-07-13 10:20             ` Prarit Bhargava
  2016-07-14  8:01               ` Kalle Valo
  0 siblings, 1 reply; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-13 10:20 UTC (permalink / raw)
  To: Luca Coelho, Kalle Valo
  Cc: Grumbach, Emmanuel, linux-kernel, linuxwifi, Berg, Johannes,
	Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless



On 07/13/2016 03:24 AM, Luca Coelho wrote:
> On Wed, 2016-07-13 at 09:50 +0300, Kalle Valo wrote:
>> Prarit Bhargava <prarit@redhat.com> writes:
>>
>>>> We implement thermal zone because we do support it, but the
>>>> problem is
>>>> that we need the firmware to be loaded for that. So you can argue
>>>> that
>>>> we should register *later* when the firmware is loaded. But this
>>>> is
>>>> really not helping all that much because the firmware can also be
>>>> stopped at any time. So you'd want us to register / unregister
>>>> the
>>>> thermal zone anytime the firmware is loaded / unloaded?
>>>
>>> You might have to do that.  I think that if the firmware enables a
>>> feature then
>>> the act of loading the firmware should run the code that enables
>>> the feature.
>>> IMO of course.
>>
>> But I suspect that the iwlwifi firmware is loaded during interface up
>> (and unloaded during interface down) and in that case
>> register/unregister would be happening all the time. That doesn't
>> sound
>> like a good idea. I would rather try to fix the thermal interface to
>> handle the cases when the measurement is not available.
> 
> I totally agree with Emmanuel and Kalle.  We should not change this.
>  It is a design decision to return an error when the interface is down,
> this is very common with other subsystems as well.  

Please show me another subsystem or driver that does this.  I've looked around
the kernel but cannot find one that updates the firmware and implements new
features on the fly like this.  I have come across several drivers that allow
for an update, but they do not implement new features based on the firmware.

Additionally, what happens when someone back revs firmware versions (which
happens far more than you and I would expect)?  Does that mean I now go from a
functional system to a non-functional system wrt to userspace?

> The userspace
> should be able to handle errors and report something like "unavailable"
> when this kind of error is returned.
> 

I myself have made the same arguments wrt to cpufreq code & bad userspace
choices.  I just went through this a few months back with what went from a
simple patch and turned out to be a hideous patch in cpufreq.  You cannot break
userspace like this.

See commit 51443fbf3d2c ("cpufreq: intel_pstate: Fix intel_pstate powersave
min_perf_pct value").  What should have been a trivial change resulted in a
massive change because of broken userspace.

> I'm not sure EIO is the best we can have, but for me that's exactly
> what it is.  The thermal zone *is* there, but cannot be accessed
> because the firmware is not available.  I'm okay to change it to EBUSY,
> if that would help userspace, but I think that's a bit misleading.  The
> device is not busy, on the contrary, it's not even running at all.
> 

I understand that, but by returning -EIO we end up with an error.

> Furthermore, I don't think this is "breaking userspace" in the sense of
> being a regression.  

I run (let's say 4.5 kernel).  sensors works.  I update to 4.7.  sensors doesn't
work.  How is that not a regression?  That's _exactly_ what it should be
reported as.

> The userspace API has always been implemented with
> the possibility of returning errors.  It's not a good design if a
> single device returning an error causes all the other devices to also
> fail.
> 

If that were the case we would never have to worry about "breaking userspace"?
For any kernel change I could just say that the userspace design was bad and be
done with it.  Why fix anything then?

I don't see any harm in waiting to register the sysfs files for hwmon until the
firmware has been validated.  IIUC, the up/down'ing of the device doesn't happen
that often (during initial boot, and suspend/resume, switching wifi connections,
shutdown?).  This would make the iwlwifi community happy (IMO) and sensors would
still work.  At the same time I could write a patch for lm-sensors to fix this
issue if it comes up in future versions.  [Aside: I'm going to have the
reproducing system available today and will test this out.  It looks like just
moving some code around.]

The bottom line is that lm-sensors is currently broken with this change in
iwlwifi.  AFAICT, no other thermal device returns an error this way, and IMO
that means the iwlwifi driver is doing something new and unexpected wrt to
userspace.

P.


> --
> Cheers,
> Luca.
> 


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-13 10:01           ` Prarit Bhargava
@ 2016-07-14  7:13             ` Kalle Valo
  0 siblings, 0 replies; 19+ messages in thread
From: Kalle Valo @ 2016-07-14  7:13 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Grumbach, Emmanuel, linux-kernel, linuxwifi, Coelho, Luciano,
	Berg, Johannes, Ivgi, Chaya Rachel, netdev, Sharon, Sara,
	linux-wireless

Prarit Bhargava <prarit@redhat.com> writes:

> On 07/13/2016 02:50 AM, Kalle Valo wrote:
>> Prarit Bhargava <prarit@redhat.com> writes:
>> 
>>>> We implement thermal zone because we do support it, but the problem is
>>>> that we need the firmware to be loaded for that. So you can argue that
>>>> we should register *later* when the firmware is loaded. But this is
>>>> really not helping all that much because the firmware can also be
>>>> stopped at any time. So you'd want us to register / unregister the
>>>> thermal zone anytime the firmware is loaded / unloaded?
>>>
>>> You might have to do that.  I think that if the firmware enables a feature then
>>> the act of loading the firmware should run the code that enables the feature.
>>> IMO of course.
>> 
>> But I suspect that the iwlwifi firmware is loaded during interface up
>> (and unloaded during interface down) and in that case
>> register/unregister would be happening all the time. 
>
> You make it sound like the interface is coming and going a 1000 times a second.
>  Maybe this happens once during runtime & during suspend/resume
>  cycles?

Of course it doesn't happen 1000 times a second but it depends on user
space behaviour. In some cases, when the wlan interface is always up,
the firmware is loaded only once. But in some cases the user space might
change the interface state more frequently.

Moreover, registering services like a thermal zone should happen at
driver probe time, not on an interface up event.

> What about the cases when the firmware isn't present (and that's what
> led me to this bug)?

In that case the kernel could return a predefined error value like
-EAGAIN or -ENETDOWN so that the user space knows that a value is not
available at this time (but might be available later).
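
A minimal sketch of that approach, assuming -EAGAIN is the value chosen.
This is plain C that compiles in userspace, not iwlwifi code: struct
fake_wifi_dev and fake_get_temp() are hypothetical stand-ins that only
mirror the shape of a get_temp callback (int return, value passed back
through a pointer).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct iwl_mvm. */
struct fake_wifi_dev {
	bool fw_running;	/* true once the firmware is loaded (interface up) */
	int last_temp_mc;	/* last reading, in millidegrees Celsius */
};

/*
 * Returns 0 and stores the reading through *temp, or a negative errno.
 * *temp is left untouched when no reading is available.
 */
static int fake_get_temp(const struct fake_wifi_dev *dev, int *temp)
{
	if (!dev->fw_running)
		return -EAGAIN;	/* not available now, might be later */

	*temp = dev->last_temp_mc;
	return 0;
}
```

With this shape the zone stays registered from probe time, and a caller
can tell "try again later" apart from a hard failure such as -EIO.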

-- 
Kalle Valo


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-13 10:20             ` Prarit Bhargava
@ 2016-07-14  8:01               ` Kalle Valo
  2016-07-14  9:08                 ` Grumbach, Emmanuel
  0 siblings, 1 reply; 19+ messages in thread
From: Kalle Valo @ 2016-07-14  8:01 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Luca Coelho, Grumbach, Emmanuel, linux-kernel, linuxwifi, Berg,
	Johannes, Ivgi, Chaya Rachel, netdev, Sharon, Sara,
	linux-wireless

Prarit Bhargava <prarit@redhat.com> writes:

> On 07/13/2016 03:24 AM, Luca Coelho wrote:
>
>> I totally agree with Emmanuel and Kalle. We should not change this.
>> It is a design decision to return an error when the interface is
>> down, this is very common with other subsystems as well.
>
> Please show me another subsystem or driver that does this.  I've looked around
> the kernel but cannot find one that updates the firmware and implements new
> features on the fly like this.  I have come across several drivers that allow
> for an update, but they do not implement new features based on the
> firmware.
>
> Additionally, what happens when someone back revs firmware versions (which
> happens far more than you and I would expect)?  Does that mean I now go from a
> functional system to a non-functional system wrt to userspace?

I'm not following, what do you mean exactly? Why are you talking about
updating the firmware?

So when we talk about "loading firmware" we mean that the driver pushes
the firmware image to the chipset. And when the interface is down the
chipset is powered down and the RAM on it will be erased. That's the
general idea anyway, I haven't checked how iwlwifi exactly works in this
case but Luca or Emmanuel can correct me.

>> The userspace should be able to handle errors and report something
>> like "unavailable" when this kind of error is returned.
>
> I myself have made the same arguments wrt to cpufreq code & bad userspace
> choices.  I just went through this a few months back with what went from a
> simple patch and turned out to be a hideous patch in cpufreq.  You cannot break
> userspace like this.

Don't get me wrong, I'm a strong supporter of stable user space
interfaces and I always try to adhere to that. But there's a limit to
everything. If I'm understanding correctly, what you mean is that the
kernel should never return an error because an application doesn't
handle errors gracefully. Sorry, but that doesn't make sense to me.

> See commit 51443fbf3d2c ("cpufreq: intel_pstate: Fix intel_pstate powersave
> min_perf_pct value").  What should have been a trivial change resulted in a
> massive change because of broken userspace.

In that cpufreq case I understand, it was about a combination of
configuration values which broke the user space. But here we are just
dealing with a simple error value, nothing fancy.

>> I'm not sure EIO is the best we can have, but for me that's exactly
>> what it is.  The thermal zone *is* there, but cannot be accessed
>> because the firmware is not available.  I'm okay to change it to EBUSY,
>> if that would help userspace, but I think that's a bit misleading.  The
>> device is not busy, on the contrary, it's not even running at all.
>> 
>
> I understand that, but by returning -EIO we end up with an error.
>
>> Furthermore, I don't think this is "breaking userspace" in the sense of
>> being a regression.  
>
> I run (let's say 4.5 kernel).  sensors works.  I update to 4.7.  sensors doesn't
> work.  How is that not a regression?  That's _exactly_ what it should be
> reported as.

Sure, it's a regression in a way. But that's how the user space app you
are using is implemented, the same problem would happen with any driver
returning errors.

>> The userspace API has always been implemented with
>> the possibility of returning errors.  It's not a good design if a
>> single device returning an error causes all the other devices to also
>> fail.
>> 
>
> If that were the case we would never have to worry about "breaking userspace"?
> For any kernel change I could just say that the userspace design was bad and be
> done with it.  Why fix anything then?

Because we are talking about a simple error value.

> I don't see any harm in waiting to register the sysfs files for hwmon until the
> firmware has been validated.

I'm against that because it's bad software design. It's standard
practise in Linux that drivers register their capabilities during driver
probe time so that user space can query them whenever needed. I assume a
properly behaving user space app would want to know about all the
available sensors once the driver is initialised and your suggestion
would break that.

> IIUC, the up/down'ing of the device doesn't happen that often (during
> initial boot, and suspend/resume, switching wifi connections,
> shutdown?).

Basically it can happen anytime, this is fully controlled by user space.
There's no point in trying to make any assumptions as they won't hold
anyway.

> This would make the iwlwifi community happy (IMO) and
> sensors would still work. At the same time I could write a patch for
> lm-sensors to fix this issue if it comes up in future versions.
> [Aside: I'm going to have the reproducing system available today and
> will test this out. It looks like just moving some code around.]

Another option, but still a bad one I don't like, is that you change the
kernel interface to ignore all errors from drivers (like iwlwifi). This
way drivers don't need to make ugly workarounds.

> The bottom line is that lm-sensors is currently broken with this change in
> iwlwifi.  AFAICT, no other thermal device returns an error this way, and IMO
> that means the iwlwifi driver is doing something new and unexpected wrt to
> userspace.

I haven't checked but I suspect ath10k has a similar problem when
interface is down.

-- 
Kalle Valo


* RE: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-14  8:01               ` Kalle Valo
@ 2016-07-14  9:08                 ` Grumbach, Emmanuel
  0 siblings, 0 replies; 19+ messages in thread
From: Grumbach, Emmanuel @ 2016-07-14  9:08 UTC (permalink / raw)
  To: Kalle Valo, Prarit Bhargava
  Cc: Luca Coelho, linux-kernel, linuxwifi, Berg, Johannes, Ivgi,
	Chaya Rachel, netdev, Sharon, Sara, linux-wireless

> 
> Prarit Bhargava <prarit@redhat.com> writes:
> 
> > On 07/13/2016 03:24 AM, Luca Coelho wrote:
> >
> >> I totally agree with Emmanuel and Kalle. We should not change this.
> >> It is a design decision to return an error when the interface is
> >> down, this is very common with other subsystems as well.
> >
> > Please show me another subsystem or driver that does this.  I've
> > looked around the kernel but cannot find one that updates the firmware
> > and implements new features on the fly like this.  I have come across
> > several drivers that allow for an update, but they do not implement
> > new features based on the firmware.
> >
> > Additionally, what happens when someone back revs firmware versions
> > (which happens far more than you and I would expect)?  Does that mean
> > I now go from a functional system to a non-functional system wrt to
> > userspace?
> 
> I'm not following, what do you mean exactly? Why are you talking about
> updating the firmware?
> 
> So when we talk about "loading firmware" we mean that the driver pushes
> the firmware image to the chipset. And when the interface is down the
> chipset is powered down and the RAM on it will be erased. That's the general
> idea anyway, I haven't checked how iwlwifi exactly works in this case but
> Luca or Emmanuel can correct me.

This is correct.


> 
> >> The userspace should be able to handle errors and report something
> >> like "unavailable" when this kind of error is returned.
> >
> > I myself have made the same arguments wrt to cpufreq code & bad
> > userspace choices.  I just went through this a few months back with
> > what went from a simple patch and turned out to be a hideous patch in
> > cpufreq.  You cannot break userspace like this.
> 
> Don't get me wrong, I'm a strong supporter of stable user space interfaces
> and I always try to adhere to that. But there's a limit to everything. If I'm
> understanding correctly, what you mean is that the kernel should never
> return an error because an application doesn't handle errors gracefully.
> Sorry, but that doesn't make sense to me.
> 
> > See commit 51443fbf3d2c ("cpufreq: intel_pstate: Fix intel_pstate
> > powersave min_perf_pct value").  What should have been a trivial
> > change resulted in a massive change because of broken userspace.
> 
> In that cpufreq case I understand, it was about a combination of
> configuration values which broke the user space. But here we are just
> dealing with a simple error value, nothing fancy.
> 
> >> I'm not sure EIO is the best we can have, but for me that's exactly
> >> what it is.  The thermal zone *is* there, but cannot be accessed
> >> because the firmware is not available.  I'm okay to change it to
> >> EBUSY, if that would help userspace, but I think that's a bit
> >> misleading.  The device is not busy, on the contrary, it's not even running
> >> at all.
> >>
> >
> > I understand that, but by returning -EIO we end up with an error.
> >
> >> Furthermore, I don't think this is "breaking userspace" in the sense
> >> of being a regression.
> >
> > I run (let's say 4.5 kernel).  sensors works.  I update to 4.7.
> > sensors doesn't work.  How is that not a regression?  That's _exactly_
> > what it should be reported as.
> 
> Sure, it's a regression in a way. But that's how the user space app you are
> using is implemented, the same problem would happen with any driver
> returning errors.
> 
> >> The userspace API has always been implemented with the possibility of
> >> returning errors.  It's not a good design if a single device
> >> returning an error causes all the other devices to also fail.
> >>
> >
> > If that were the case we would never have to worry about "breaking
> > userspace"?
> > For any kernel change I could just say that the userspace design was
> > bad and be done with it.  Why fix anything then?
> 
> Because we are talking about a simple error value.
> 
> > I don't see any harm in waiting to register the sysfs files for hwmon
> > until the firmware has been validated.
> 
> I'm against that because it's bad software design. It's standard practise in
> Linux that drivers register their capabilities during driver probe time so that
> user space can query them whenever needed. I assume a properly behaving
> user space app would want to know about all the available sensors once the
> driver is initialised and your suggestion would break that.
> 
> > IIUC, the up/down'ing of the device doesn't happen that often (during
> > initial boot, and suspend/resume, switching wifi connections,
> > shutdown?).
> 
> Basically it can happen anytime, this is fully controlled by user space.
> There's no point in trying to make any assumptions as they won't hold
> anyway.
> 
> > This would make the iwlwifi community happy (IMO) and sensors would
> > still work. At the same time I could write a patch for lm-sensors to
> > fix this issue if it comes up in future versions.
> > [Aside: I'm going to have the reproducing system available today and
> > will test this out. It looks like just moving some code around.]
> 
> Another option, but still a bad one I don't like, is that you change the kernel
> interface to ignore all errors from drivers (like iwlwifi). This way drivers don't
> need to make ugly workarounds.
> 
> > The bottom line is that lm-sensors is currently broken with this
> > change in iwlwifi.  AFAICT, no other thermal device returns an error
> > this way, and IMO that means the iwlwifi driver is doing something new
> > and unexpected wrt to userspace.
> 
> I haven't checked but I suspect ath10k has a similar problem when interface
> is down.
> 
> --
> Kalle Valo


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-11 18:27     ` Grumbach, Emmanuel
  2016-07-11 20:31       ` Prarit Bhargava
@ 2016-07-14  9:24       ` Stanislaw Gruszka
  2016-07-14  9:44         ` Grumbach, Emmanuel
  1 sibling, 1 reply; 19+ messages in thread
From: Stanislaw Gruszka @ 2016-07-14  9:24 UTC (permalink / raw)
  To: Grumbach, Emmanuel
  Cc: prarit, linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes,
	kvalo, Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless

On Mon, Jul 11, 2016 at 06:27:30PM +0000, Grumbach, Emmanuel wrote:
> I guess that works, but it seems wrong to me. Usually, registration
> should happen only upon INIT, and yes, at that time the firmware is not
> ready to provide the information yet.
<snip>
> > 
> > As can be seen in the current code base, iwl_mvm_tzone_get_temp()
> > will return
> > -EIO 100% of the time when the firmware doesn't support reading the

If I understand correctly, this error happens 100% of the time, not only
during init. Hence it seems there is an issue here, i.e. cur_ucode is not
marked correctly as IWL_UCODE_REGULAR, or iwl_mvm_get_temp() fails
100% of the time (iwl_mvm_is_tt_in_fw() incorrectly returns true on
Prarit's device?).

BTW, you implement a thermal_zone device, but do you also need a hwmon
device? Perhaps using the thermal_zone_params no_hwmon option would be
proper here?

Stanislaw


* RE: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-14  9:24       ` Stanislaw Gruszka
@ 2016-07-14  9:44         ` Grumbach, Emmanuel
  2016-07-15 11:25           ` Stanislaw Gruszka
  0 siblings, 1 reply; 19+ messages in thread
From: Grumbach, Emmanuel @ 2016-07-14  9:44 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: prarit, linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes,
	kvalo, Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless

> 
> On Mon, Jul 11, 2016 at 06:27:30PM +0000, Grumbach, Emmanuel wrote:
> > I guess that works, but it seems wrong to me. Usually, registration
> > should happen only upon INIT, and yes, at that time the firmware is
> > not ready to provide the information yet.
> <snip>
> > >
> > > As can be seen in the current code base, iwl_mvm_tzone_get_temp()
> > > will return -EIO 100% of the time when the firmware doesn't support
> > > reading the
> 
> If I understand correctly, this error happens 100% of the time, not only during
> init. Hence it seems there is an issue here, i.e. cur_ucode is not marked
> correctly as IWL_UCODE_REGULAR, or iwl_mvm_get_temp() fails 100% of the
> time (iwl_mvm_is_tt_in_fw() incorrectly returns true on Prarit's device?).

cur_ucode will not be IWL_UCODE_REGULAR until you load the firmware, which
will happen upon ifup.

> 
> BTW, you implement a thermal_zone device, but do you also need a hwmon
> device? Perhaps using the thermal_zone_params no_hwmon option would be
> proper here?

That's an interesting direction. I'd have to check, but TBH, I am not familiar with
that code. Luca was very involved during the development but he is not available
right now. I will be back more or less when the merge window closes :)

> 
> Stanislaw


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-14  9:44         ` Grumbach, Emmanuel
@ 2016-07-15 11:25           ` Stanislaw Gruszka
  2016-07-15 12:14             ` Prarit Bhargava
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislaw Gruszka @ 2016-07-15 11:25 UTC (permalink / raw)
  To: Grumbach, Emmanuel
  Cc: prarit, linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes,
	kvalo, Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless

On Thu, Jul 14, 2016 at 09:44:22AM +0000, Grumbach, Emmanuel wrote:
> > If I understand correctly, this error happens 100% of the time, not only during
> > init. Hence it seems there is an issue here, i.e. cur_ucode is not marked
> > correctly as IWL_UCODE_REGULAR, or iwl_mvm_get_temp() fails 100% of the
> > time (iwl_mvm_is_tt_in_fw() incorrectly returns true on Prarit's device?).
> 
> cur_ucode will not be IWL_UCODE_REGULAR until you load the firmware, which
> will happen upon ifup.

Then creating the thermal_device on ifup looks more reasonable to me.
Otherwise we can create a device that can be non-functional virtually
forever, e.g. when soft RFKILL is enabled. However I admit that
creating the thermal_device when HW is detected has some advantages
too.

Stanislaw


* Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-15 11:25           ` Stanislaw Gruszka
@ 2016-07-15 12:14             ` Prarit Bhargava
  2016-07-17  6:13               ` Grumbach, Emmanuel
  0 siblings, 1 reply; 19+ messages in thread
From: Prarit Bhargava @ 2016-07-15 12:14 UTC (permalink / raw)
  To: Stanislaw Gruszka, Grumbach, Emmanuel
  Cc: linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes, kvalo,
	Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless



On 07/15/2016 07:25 AM, Stanislaw Gruszka wrote:
> On Thu, Jul 14, 2016 at 09:44:22AM +0000, Grumbach, Emmanuel wrote:
>>> If I understand correctly, this error happens 100% of the time, not only during
>>> init. Hence it seems there is an issue here, i.e. cur_ucode is not marked
>>> correctly as IWL_UCODE_REGULAR, or iwl_mvm_get_temp() fails 100% of the
>>> time (iwl_mvm_is_tt_in_fw() incorrectly returns true on Prarit's device?).
>>
>> cur_ucode will not be IWL_UCODE_REGULAR until you load the firmware, which
>> will happen upon ifup.
> 
> Then creating the thermal_device on ifup looks more reasonable to me.
> Otherwise we can create a device that can be non-functional virtually
> forever, e.g. when soft RFKILL is enabled. However I admit that
> creating the thermal_device when HW is detected has some advantages
> too.

That's my plan right now.  Unfortunately something else in the kernel seems
recently broken and is preventing me from testing.  I will get back to this
early next week.

P.
> 
> Stanislaw
> 


* RE: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
  2016-07-15 12:14             ` Prarit Bhargava
@ 2016-07-17  6:13               ` Grumbach, Emmanuel
  0 siblings, 0 replies; 19+ messages in thread
From: Grumbach, Emmanuel @ 2016-07-17  6:13 UTC (permalink / raw)
  To: Prarit Bhargava, Stanislaw Gruszka
  Cc: linux-kernel, linuxwifi, Coelho, Luciano, Berg, Johannes, kvalo,
	Ivgi, Chaya Rachel, netdev, Sharon, Sara, linux-wireless

> On 07/15/2016 07:25 AM, Stanislaw Gruszka wrote:
> > On Thu, Jul 14, 2016 at 09:44:22AM +0000, Grumbach, Emmanuel wrote:
> >>> If I understand correctly, this error happens 100% of the time, not
> >>> only during init. Hence it seems there is an issue here, i.e. cur_ucode
> >>> is not marked correctly as IWL_UCODE_REGULAR, or iwl_mvm_get_temp()
> >>> fails 100% of the time (iwl_mvm_is_tt_in_fw() incorrectly returns true
> >>> on Prarit's device?).
> >>
> >> cur_ucode will not be IWL_UCODE_REGULAR until you load the firmware,
> >> which will happen upon ifup.
> >
> > Then creating the thermal_device on ifup looks more reasonable to me.
> > Otherwise we can create a device that can be non-functional virtually
> > forever, e.g. when soft RFKILL is enabled. However I admit that
> > creating the thermal_device when HW is detected has some advantages too.
> 
> That's my plan right now.  Unfortunately something else in the kernel seems
> recently broken and is preventing me from testing.  I will get back to this
> early next week.
> 

But we already said that this won't work, since you may have the device
enabled upon boot and then disabled. So unless you unregister from the
thermal zone subsystem upon wifi disable, you won't solve the problem.
Kalle and Luca already refused that solution.

I glanced (again) at the thermal zone API, and since the callback is
allowed to return an int, the subsystem itself should handle the failures
and/or the userspace problems. The API itself is awful: it has no
documentation whatsoever, not even variable names, only types... You
can't really blame the subsystem's users for assuming that a method which
returns an int, and passes its out value through a pointer, is allowed to
fail. Of course, you have to guess that this is the expected behavior,
since you don't have any hint about the meaning of the parameters.

I think that the right place to "fix" this problem is the subsystem. This
way, you will fix it for iwlwifi and for any other (future) users that
may fall into the trap opened by the API itself.
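
One possible shape for handling this in the subsystem, as a rough sketch
only.  Everything below is hypothetical and simplified (struct zone,
struct zone_ops, enum temp_status and zone_read_temp() are not the real
thermal core types or functions), and the choice of which errno values
count as "unavailable" is an assumption.  The idea is that a single
core-side wrapper interprets driver errors, so every driver can simply
return an error from its get_temp callback.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real thermal core types. */
struct zone_ops {
	int (*get_temp)(void *data, int *temp);	/* 0 or negative errno */
};

struct zone {
	const struct zone_ops *ops;
	void *data;
};

enum temp_status {
	TEMP_OK,		/* *temp holds a valid reading */
	TEMP_UNAVAILABLE,	/* no reading right now, may work later */
	TEMP_ERROR,		/* any other failure */
};

/* Core-side wrapper: the only place that interprets driver errors. */
static enum temp_status zone_read_temp(const struct zone *z, int *temp)
{
	int ret = z->ops->get_temp(z->data, temp);

	if (ret == 0)
		return TEMP_OK;
	if (ret == -EAGAIN || ret == -ENETDOWN)
		return TEMP_UNAVAILABLE;
	return TEMP_ERROR;
}

/* Example driver callbacks (made up for illustration). */
static int temp_fw_down(void *data, int *temp)
{
	(void)data;
	(void)temp;
	return -ENETDOWN;
}

static int temp_fixed(void *data, int *temp)
{
	(void)data;
	*temp = 38000;	/* millidegrees Celsius */
	return 0;
}
```

A layer built on zone_read_temp() could then present TEMP_UNAVAILABLE to
userspace in one consistent way instead of passing a raw errno through.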


Thread overview: 19+ messages
2016-07-11 15:18 [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded Prarit Bhargava
2016-07-11 16:07 ` Coelho, Luciano
2016-07-11 17:00   ` Prarit Bhargava
2016-07-11 18:00 ` Emmanuel Grumbach
2016-07-11 18:19   ` Prarit Bhargava
2016-07-11 18:27     ` Grumbach, Emmanuel
2016-07-11 20:31       ` Prarit Bhargava
2016-07-13  6:50         ` Kalle Valo
2016-07-13  7:24           ` Luca Coelho
2016-07-13 10:20             ` Prarit Bhargava
2016-07-14  8:01               ` Kalle Valo
2016-07-14  9:08                 ` Grumbach, Emmanuel
2016-07-13 10:01           ` Prarit Bhargava
2016-07-14  7:13             ` Kalle Valo
2016-07-14  9:24       ` Stanislaw Gruszka
2016-07-14  9:44         ` Grumbach, Emmanuel
2016-07-15 11:25           ` Stanislaw Gruszka
2016-07-15 12:14             ` Prarit Bhargava
2016-07-17  6:13               ` Grumbach, Emmanuel
