All of lore.kernel.org
 help / color / mirror / Atom feed
* Bug when using both "set" and "blink" functions
@ 2017-11-22  0:55 Craig McQueen
  2017-11-22  3:36 ` Craig McQueen
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Craig McQueen @ 2017-11-22  0:55 UTC (permalink / raw)
  To: linux-leds

I'd like to control a LED to possibly be any of:

* Off
* On
* Blinking

I've had this working fine in 3.14.x kernel.

But when upgrading to 4.4.x, I found that the transition from "blinking" to "on" didn't work. Namely, if it's blinking and then I call led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't remember if it turned off, or remained blinking; it wasn't on anyway). I worked around it by calling led_set_brightness(led_cdev, LED_OFF) just before led_set_brightness(led_cdev, LED_FULL).

Now I have upgraded to 4.9.x, and found that the transition from "blinking" to "on" again isn't working. The LED ends up being off instead of on.

Examining the code of led_set_brightness():

* Behaviour is different depending on whether the brightness is LED_OFF (it schedules the blink to be turned off "soon") or other (it alters the brightness of subsequent blinks).
* It looks as though there are race conditions in the transition from blinking to steady off -- it schedules the blink to be turned off "soon", so it's difficult to guarantee a reliable transition from blinking to on/off.

The combination of the above two points makes it seem difficult to robustly go from blinking to off/on.

So, my questions are:

* What is the correct way to use the API to reliably control an LED for a combination of off/on/blinking?
* Is my above analysis of the code correct, so that there are indeed race conditions going from blinking to off, leading to undefined behaviour? Can that be fixed?

-- 
Craig McQueen

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-22  0:55 Bug when using both "set" and "blink" functions Craig McQueen
@ 2017-11-22  3:36 ` Craig McQueen
  2017-11-22 12:36 ` Pavel Machek
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Craig McQueen @ 2017-11-22  3:36 UTC (permalink / raw)
  To: linux-leds

I wrote:
> I'd like to control a LED to possibly be any of:
>
> * Off
> * On
> * Blinking
>
> I've had this working fine in 3.14.x kernel.
> 
> But when upgrading to 4.4.x, I found that the transition from "blinking"
> to "on" didn't work. Namely, if it's blinking and then I call
> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't
> remember if it turned off, or remained blinking; it wasn't on anyway).
> I worked around it by calling led_set_brightness(led_cdev, LED_OFF) just
> before led_set_brightness(led_cdev, LED_FULL).
> 
> Now I have upgraded to 4.9.x, and found that the transition from
> "blinking" to "on" again isn't working. The LED ends up being off
> instead of on.
> 
> Examining the code of led_set_brightness():
> 
> * Behaviour is different depending on whether the brightness is LED_OFF
>   (it schedules the blink to be turned off "soon") or other (it alters
>   the brightness of subsequent blinks).
> * It looks as though there are race conditions in the transition from
>   blinking to steady off -- it schedules the blink to be turned off
>   "soon", so it's difficult to guarantee a reliable transition from
>   blinking to on/off.
> 
> The combination of the above two points makes it seem difficult to
> robustly go from blinking to off/on.
> 
> So, my questions are:
> 
> * What is the correct way to use the API to reliably control an LED for
>   a combination of off/on/blinking?
> * Is my above analysis of the code correct, so that there are indeed
>   race conditions going from blinking to off, leading to undefined
>   behaviour? Can that be fixed?

I also thought I could perhaps use led_blink_set_oneshot() to effectively
turn the LED on or off in a blink-compatible way. But, I see that if a one-shot
blink is in-progress, then any attempt to change the 'invert' parameter is
ineffective because the function exits early.

-- 
Craig McQueen

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-22  0:55 Bug when using both "set" and "blink" functions Craig McQueen
  2017-11-22  3:36 ` Craig McQueen
@ 2017-11-22 12:36 ` Pavel Machek
  2017-11-23  0:14   ` Craig McQueen
  2017-11-22 19:53 ` Jacek Anaszewski
  2017-11-25 21:42 ` Jacek Anaszewski
  3 siblings, 1 reply; 17+ messages in thread
From: Pavel Machek @ 2017-11-22 12:36 UTC (permalink / raw)
  To: Craig McQueen; +Cc: linux-leds

[-- Attachment #1: Type: text/plain, Size: 1744 bytes --]

On Wed 2017-11-22 00:55:12, Craig McQueen wrote:
> I'd like to control a LED to possibly be any of:
> 
> * Off
> * On
> * Blinking
> 
> I've had this working fine in 3.14.x kernel.
> 
> But when upgrading to 4.4.x, I found that the transition from "blinking" to "on" didn't work. Namely, if it's blinking and then I call led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't remember if it turned off, or remained blinking; it wasn't on anyway). I worked around it by calling led_set_brightness(led_cdev, LED_OFF) just before led_set_brightness(led_cdev, LED_FULL).
> 
> Now I have upgraded to 4.9.x, and found that the transition from "blinking" to "on" again isn't working. The LED ends up being off instead of on.
> 
> Examining the code of led_set_brightness():
> 
> * Behaviour is different depending on whether the brightness is LED_OFF (it schedules the blink to be turned off "soon") or other (it alters the brightness of subsequent blinks).
> * It looks as though there are race conditions in the transition from blinking to steady off -- it schedules the blink to be turned off "soon", so it's difficult to guarantee a reliable transition from blinking to on/off.
> 
> The combination of the above two points makes it seem difficult to robustly go from blinking to off/on.
> 
> So, my questions are:
> 
> * What is the correct way to use the API to reliably control an LED
> for a combination of off/on/blinking?

You should be able to use sysfs from the userland. .. if that is
broken we need to fix it.

What are you trying to do?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-22  0:55 Bug when using both "set" and "blink" functions Craig McQueen
  2017-11-22  3:36 ` Craig McQueen
  2017-11-22 12:36 ` Pavel Machek
@ 2017-11-22 19:53 ` Jacek Anaszewski
  2017-11-23  0:55   ` Craig McQueen
  2017-11-25 21:42 ` Jacek Anaszewski
  3 siblings, 1 reply; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-22 19:53 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

Hi Craig,

On 11/22/2017 01:55 AM, Craig McQueen wrote:
> I'd like to control a LED to possibly be any of:
> 
> * Off
> * On
> * Blinking
> 
> I've had this working fine in 3.14.x kernel.
> 
> But when upgrading to 4.4.x, I found that the transition from "blinking" to "on" didn't work. Namely, if it's blinking and then I call led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't remember if it turned off, or remained blinking; it wasn't on anyway). I worked around it by calling led_set_brightness(led_cdev, LED_OFF) just before led_set_brightness(led_cdev, LED_FULL).

This is by design. Setting brightness to LED_OFF deactivates any active
trigger. There were some inconsistencies in the API semantics and it
was fixed.

See Documentation/ABI/testing/sysfs-class-led

What:           /sys/class/leds/<led>/brightness
Date:           March 2006
KernelVersion:  2.6.17
Contact:        Richard Purdie <rpurdie@rpsys.net>
Description:
                Set the brightness of the LED. Most LEDs don't
                have hardware brightness support, so will just be turned
on for
                non-zero brightness settings. The value is between 0 and
                /sys/class/leds/<led>/max_brightness.

                Writing 0 to this file clears active trigger.

                Writing non-zero to this file while trigger is active
changes the
                top brightness trigger is going to use.

> Now I have upgraded to 4.9.x, and found that the transition from "blinking" to "on" again isn't working. The LED ends up being off instead of on.
> 
> Examining the code of led_set_brightness():
> 
> * Behaviour is different depending on whether the brightness is LED_OFF (it schedules the blink to be turned off "soon") or other (it alters the brightness of subsequent blinks).
> * It looks as though there are race conditions in the transition from blinking to steady off -- it schedules the blink to be turned off "soon", so it's difficult to guarantee a reliable transition from blinking to on/off.
> 
> The combination of the above two points makes it seem difficult to robustly go from blinking to off/on.

Please make sure you have the following patch:

Author: Hans de Goede <hdegoede@redhat.com>  2016-11-08 14:38:46
Committer: Jacek Anaszewski <j.anaszewski@samsung.com>  2016-11-22 12:07:05
Parent: 8338eab50ffb3399a938d723f9605383ed9f8473 (leds: Add user LED
driver for NIC78bx device)
Follows: v4.9-rc1
Precedes: leds_for_4.10

    led: core: Use atomic bit-field for the blink-flags

    All the LED_BLINK* flags are accessed read-modify-write from e.g.
    led_set_brightness and led_blink_set_oneshot while both
    set_brightness_work and the blink_timer may be running.

    If these race then the modify step done by one of them may be lost,
    switch the LED_BLINK* flags to a new atomic work_flags bit-field
    to avoid this race.

> So, my questions are:
> 
> * What is the correct way to use the API to reliably control an LED for a combination of off/on/blinking?
> * Is my above analysis of the code correct, so that there are indeed race conditions going from blinking to off, leading to undefined behaviour? Can that be fixed?
> 

Frankly speaking I have never experienced the problem even before the
aforementioned patch. Could you share what driver are you using?
Is it implementing brightness_set or brightness_seT_blocking op?

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-22 12:36 ` Pavel Machek
@ 2017-11-23  0:14   ` Craig McQueen
  0 siblings, 0 replies; 17+ messages in thread
From: Craig McQueen @ 2017-11-23  0:14 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-leds

Pavel Machek wrote:
> On Wed 2017-11-22 00:55:12, Craig McQueen wrote:
> > I'd like to control a LED to possibly be any of:
> >
> > * Off
> > * On
> > * Blinking
> >
> > I've had this working fine in 3.14.x kernel.
> >
> > But when upgrading to 4.4.x, I found that the transition from "blinking" to
> "on" didn't work. Namely, if it's blinking and then I call
> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't
> remember if it turned off, or remained blinking; it wasn't on anyway). I
> worked around it by calling led_set_brightness(led_cdev, LED_OFF) just
> before led_set_brightness(led_cdev, LED_FULL).
> >
> > Now I have upgraded to 4.9.x, and found that the transition from "blinking"
> to "on" again isn't working. The LED ends up being off instead of on.
> >
> > Examining the code of led_set_brightness():
> >
> > * Behaviour is different depending on whether the brightness is LED_OFF
> (it schedules the blink to be turned off "soon") or other (it alters the
> brightness of subsequent blinks).
> > * It looks as though there are race conditions in the transition from blinking
> to steady off -- it schedules the blink to be turned off "soon", so it's difficult
> to guarantee a reliable transition from blinking to on/off.
> >
> > The combination of the above two points makes it seem difficult to
> robustly go from blinking to off/on.
> >
> > So, my questions are:
> >
> > * What is the correct way to use the API to reliably control an LED
> > for a combination of off/on/blinking?
> 
> You should be able to use sysfs from the userland. .. if that is broken we
> need to fix it.
> 
> What are you trying to do?

I've written a kernel driver for a power supply interface on some custom hardware. I'm using LED trigger API to make two LED triggers, "power" and "battery". The "battery" LED trigger should be off when no battery, on when battery is present and okay, or flashing when battery is present but in fault condition.

* Off: led_trigger_event(trigger, LED_OFF)
* On: led_trigger_event(trigger, LED_FULL)
* Flashing: led_trigger_blink(trigger, &delay_on, &delay_off)

But for kernel 4.4, going from flashing to on didn't work. I had to change the On case to:
* On: led_trigger_event(trigger, LED_OFF); led_trigger_event(trigger, LED_FULL);

But for kernel 4.9, that again doesn't work when going from flashing to on (the LED is turned off).

My driver is using triggers, but the trigger calls lead fairly directly to:

* Off: led_trigger_event(trigger, LED_OFF) --> led_set_brightness(led_cdev, LED_OFF)
* On: led_trigger_event(trigger, LED_FULL) --> led_set_brightness(led_cdev, LED_FULL)
* Flashing: led_trigger_blink(trigger, &delay_on, &delay_off) --> led_blink_set(led_cdev, delay_on, delay_off)

-- 
Craig McQueen

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-22 19:53 ` Jacek Anaszewski
@ 2017-11-23  0:55   ` Craig McQueen
  2017-11-23 21:36     ` Jacek Anaszewski
  0 siblings, 1 reply; 17+ messages in thread
From: Craig McQueen @ 2017-11-23  0:55 UTC (permalink / raw)
  To: Jacek Anaszewski, linux-leds

Jacek Anaszewski wrote:
> Hi Craig,
> 
> On 11/22/2017 01:55 AM, Craig McQueen wrote:
> > I'd like to control a LED to possibly be any of:
> >
> > * Off
> > * On
> > * Blinking
> >
> > I've had this working fine in 3.14.x kernel.
> >
> > But when upgrading to 4.4.x, I found that the transition from "blinking" to
> "on" didn't work. Namely, if it's blinking and then I call
> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't
> remember if it turned off, or remained blinking; it wasn't on anyway). I
> worked around it by calling led_set_brightness(led_cdev, LED_OFF) just
> before led_set_brightness(led_cdev, LED_FULL).
> 
> This is by design. Setting brightness to LED_OFF deactivates any active
> trigger. There were some inconsistencies in the API semantics and it was
> fixed.
> 
> See Documentation/ABI/testing/sysfs-class-led
> 
> What:           /sys/class/leds/<led>/brightness
> Date:           March 2006
> KernelVersion:  2.6.17
> Contact:        Richard Purdie <rpurdie@rpsys.net>
> Description:
>                 Set the brightness of the LED. Most LEDs don't
>                 have hardware brightness support, so will just be turned on for
>                 non-zero brightness settings. The value is between 0 and
>                 /sys/class/leds/<led>/max_brightness.
> 
>                 Writing 0 to this file clears active trigger.
> 
>                 Writing non-zero to this file while trigger is active changes the
>                 top brightness trigger is going to use.

That makes sense. Which is why, when I changed to kernel 4.4.x, I made the "on" case do led_set_brightness(led_cdev, LED_OFF);  led_set_brightness(led_cdev, LED_FULL);. As I described in another email, I am actually creating LED triggers, so my code calls led_trigger_event(trigger, LED_OFF); led_trigger_event(trigger, LED_FULL);. That seems to meet the design intent of the API.

> > Now I have upgraded to 4.9.x, and found that the transition from "blinking"
> to "on" again isn't working. The LED ends up being off instead of on.
> >
> > Examining the code of led_set_brightness():
> >
> > * Behaviour is different depending on whether the brightness is LED_OFF
> (it schedules the blink to be turned off "soon") or other (it alters the
> brightness of subsequent blinks).
> > * It looks as though there are race conditions in the transition from blinking
> to steady off -- it schedules the blink to be turned off "soon", so it's difficult
> to guarantee a reliable transition from blinking to on/off.
> >
> > The combination of the above two points makes it seem difficult to
> robustly go from blinking to off/on.
> 
> Please make sure you have the following patch:
> 
> Author: Hans de Goede <hdegoede@redhat.com>  2016-11-08 14:38:46
> Committer: Jacek Anaszewski <j.anaszewski@samsung.com>  2016-11-22
> 12:07:05
> Parent: 8338eab50ffb3399a938d723f9605383ed9f8473 (leds: Add user LED
> driver for NIC78bx device)
> Follows: v4.9-rc1
> Precedes: leds_for_4.10
> 
>     led: core: Use atomic bit-field for the blink-flags
> 
>     All the LED_BLINK* flags are accessed read-modify-write from e.g.
>     led_set_brightness and led_blink_set_oneshot while both
>     set_brightness_work and the blink_timer may be running.
> 
>     If these race then the modify step done by one of them may be lost,
>     switch the LED_BLINK* flags to a new atomic work_flags bit-field
>     to avoid this race.

I don't have that patch, I don't think. I see it in commit a9c6ce57ec2f136d08141e8220a0ffaca216f7b0 which goes into kernel 4.10. But it looks as though that's not the essence of the issue.

> > So, my questions are:
> >
> > * What is the correct way to use the API to reliably control an LED for a
> combination of off/on/blinking?

It sounds as though calling led_set_brightness(led_cdev, LED_OFF);  led_set_brightness(led_cdev, LED_FULL); in-principle should behave according to API design intent.

> > * Is my above analysis of the code correct, so that there are indeed race
> conditions going from blinking to off, leading to undefined behaviour? Can
> that be fixed?

Looking at the code, it looks problematic that led_set_brightness(led_cdev, LED_OFF) does not immediately disable soft blinking, but schedules it for "soon in the future". Then, when led_set_brightness(led_cdev, LED_FULL) is called, it is not effective because the previous call with LED_OFF has not actually done anything yet.

> Frankly speaking I have never experienced the problem even before the
> aforementioned patch. Could you share what driver are you using?
> Is it implementing brightness_set or brightness_seT_blocking op?

I'm using gpio-leds driver (on a BeagleBone Black derived system).

> --
> Best regards,
> Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-23  0:55   ` Craig McQueen
@ 2017-11-23 21:36     ` Jacek Anaszewski
  2017-11-24  5:23       ` Craig McQueen
  0 siblings, 1 reply; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-23 21:36 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

On 11/23/2017 01:55 AM, Craig McQueen wrote:
> Jacek Anaszewski wrote:
>> Hi Craig,
>>
>> On 11/22/2017 01:55 AM, Craig McQueen wrote:
>>> I'd like to control a LED to possibly be any of:
>>>
>>> * Off
>>> * On
>>> * Blinking
>>>
>>> I've had this working fine in 3.14.x kernel.
>>>
>>> But when upgrading to 4.4.x, I found that the transition from "blinking" to
>> "on" didn't work. Namely, if it's blinking and then I call
>> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't
>> remember if it turned off, or remained blinking; it wasn't on anyway). I
>> worked around it by calling led_set_brightness(led_cdev, LED_OFF) just
>> before led_set_brightness(led_cdev, LED_FULL).
>>
>> This is by design. Setting brightness to LED_OFF deactivates any active
>> trigger. There were some inconsistencies in the API semantics and it was
>> fixed.
>>
>> See Documentation/ABI/testing/sysfs-class-led
>>
>> What:           /sys/class/leds/<led>/brightness
>> Date:           March 2006
>> KernelVersion:  2.6.17
>> Contact:        Richard Purdie <rpurdie@rpsys.net>
>> Description:
>>                 Set the brightness of the LED. Most LEDs don't
>>                 have hardware brightness support, so will just be turned on for
>>                 non-zero brightness settings. The value is between 0 and
>>                 /sys/class/leds/<led>/max_brightness.
>>
>>                 Writing 0 to this file clears active trigger.
>>
>>                 Writing non-zero to this file while trigger is active changes the
>>                 top brightness trigger is going to use.
> 
> That makes sense. Which is why, when I changed to kernel 4.4.x, I made the "on" case do led_set_brightness(led_cdev, LED_OFF);  led_set_brightness(led_cdev, LED_FULL);. As I described in another email, I am actually creating LED triggers, so my code calls led_trigger_event(trigger, LED_OFF); led_trigger_event(trigger, LED_FULL);. That seems to meet the design intent of the API.
> 
>>> Now I have upgraded to 4.9.x, and found that the transition from "blinking"
>> to "on" again isn't working. The LED ends up being off instead of on.
>>>
>>> Examining the code of led_set_brightness():
>>>
>>> * Behaviour is different depending on whether the brightness is LED_OFF
>> (it schedules the blink to be turned off "soon") or other (it alters the
>> brightness of subsequent blinks).
>>> * It looks as though there are race conditions in the transition from blinking
>> to steady off -- it schedules the blink to be turned off "soon", so it's difficult
>> to guarantee a reliable transition from blinking to on/off.
>>>
>>> The combination of the above two points makes it seem difficult to
>> robustly go from blinking to off/on.
>>
>> Please make sure you have the following patch:
>>
>> Author: Hans de Goede <hdegoede@redhat.com>  2016-11-08 14:38:46
>> Committer: Jacek Anaszewski <j.anaszewski@samsung.com>  2016-11-22
>> 12:07:05
>> Parent: 8338eab50ffb3399a938d723f9605383ed9f8473 (leds: Add user LED
>> driver for NIC78bx device)
>> Follows: v4.9-rc1
>> Precedes: leds_for_4.10
>>
>>     led: core: Use atomic bit-field for the blink-flags
>>
>>     All the LED_BLINK* flags are accessed read-modify-write from e.g.
>>     led_set_brightness and led_blink_set_oneshot while both
>>     set_brightness_work and the blink_timer may be running.
>>
>>     If these race then the modify step done by one of them may be lost,
>>     switch the LED_BLINK* flags to a new atomic work_flags bit-field
>>     to avoid this race.
> 
> I don't have that patch, I don't think. I see it in commit a9c6ce57ec2f136d08141e8220a0ffaca216f7b0 which goes into kernel 4.10. But it looks as though that's not the essence of the issue.

It's actually eb1610b4c273370f491c5e194e5a56e3470d81e8.

> 
>>> So, my questions are:
>>>
>>> * What is the correct way to use the API to reliably control an LED for a
>> combination of off/on/blinking?
> 
> It sounds as though calling led_set_brightness(led_cdev, LED_OFF);  led_set_brightness(led_cdev, LED_FULL); in-principle should behave according to API design intent.
> 
>>> * Is my above analysis of the code correct, so that there are indeed race
>> conditions going from blinking to off, leading to undefined behaviour? Can
>> that be fixed?
> 
> Looking at the code, it looks problematic that led_set_brightness(led_cdev, LED_OFF) does not immediately disable soft blinking, but schedules it for "soon in the future". 

Comments in led_set_brightness() explain the reason for which
delaying brightness setting is required, i.e. to avoid the problem
when led_set_brightness() is called from hard irq context with
timer trigger enabled in the same time. See the original patch
introducing deferred brightness setting:

commit d23a22a74fded23a12434c9463fe66cec2b0afcd
Author: Fabio Baltieri <fabio.baltieri@gmail.com>
Date:   Wed Aug 15 21:44:34 2012 +0800

    leds: delay led_set_brightness if stopping soft-blink

    Delay execution of led_set_brightness() if need to stop soft-blink
    timer.

    This allows led_set_brightness to be called in hard-irq context even if
    soft-blink was activated on that LED.

    Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
    Cc: Pawel Moll <pawel.moll@arm.com>
    Signed-off-by: Bryan Wu <bryan.wu@canonical.com>


In general it allows to avoid conflict explained in [0].

> Then, when led_set_brightness(led_cdev, LED_FULL) is called, it is
> not effective because the previous call with LED_OFF has not actually
> done anything yet.

You really should test your code with the patch
eb1610b4c27 ("led: core: Use atomic bit-field for the blink-flags').

Changing the brightness to any non-zero value while blinking is on
shouldn't cause races after introducing flag
LED_BLINK_BRIGHTNESS_CHANGE, which is set in led_brightness_set() when
blinking is on, and new brightness is applied upon next blink transition
to on. This doesn't stop blinking though, which is in line with doc.

Now I see that there can be a problem if you set brightness to 0
while blinking and immediately set it to any non-zero value, before
LED_BLINK_DISABLE is handled. The last operation can have no effect
then. I'll check it.

> 
>> Frankly speaking I have never experienced the problem even before the
>> aforementioned patch. Could you share what driver are you using?
>> Is it implementing brightness_set or brightness_seT_blocking op?
> 
> I'm using gpio-leds driver (on a BeagleBone Black derived system).
> 
>> --
>> Best regards,
>> Jacek Anaszewski
> 

[0]
https://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/c188.html

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-23 21:36     ` Jacek Anaszewski
@ 2017-11-24  5:23       ` Craig McQueen
  2017-11-24 20:13         ` Jacek Anaszewski
  0 siblings, 1 reply; 17+ messages in thread
From: Craig McQueen @ 2017-11-24  5:23 UTC (permalink / raw)
  To: Jacek Anaszewski, linux-leds

Jacek Anaszewski wrote:
> On 11/23/2017 01:55 AM, Craig McQueen wrote:
> > Jacek Anaszewski wrote:
> >> Hi Craig,
> >>
> >> On 11/22/2017 01:55 AM, Craig McQueen wrote:
> >>> I'd like to control a LED to possibly be any of:
> >>>
> >>> * Off
> >>> * On
> >>> * Blinking
> >>>
> >>> I've had this working fine in 3.14.x kernel.
> >>>
> >>> But when upgrading to 4.4.x, I found that the transition from
> >>> "blinking" to
> >> "on" didn't work. Namely, if it's blinking and then I call
> >> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I
> >> can't remember if it turned off, or remained blinking; it wasn't on
> >> anyway). I worked around it by calling led_set_brightness(led_cdev,
> >> LED_OFF) just before led_set_brightness(led_cdev, LED_FULL).
> >>
> >> This is by design. Setting brightness to LED_OFF deactivates any
> >> active trigger. There were some inconsistencies in the API semantics
> >> and it was fixed.
> >>
> >> See Documentation/ABI/testing/sysfs-class-led
> >>
> >> What:           /sys/class/leds/<led>/brightness
> >> Date:           March 2006
> >> KernelVersion:  2.6.17
> >> Contact:        Richard Purdie <rpurdie@rpsys.net>
> >> Description:
> >>                 Set the brightness of the LED. Most LEDs don't
> >>                 have hardware brightness support, so will just be turned on for
> >>                 non-zero brightness settings. The value is between 0 and
> >>                 /sys/class/leds/<led>/max_brightness.
> >>
> >>                 Writing 0 to this file clears active trigger.
> >>
> >>                 Writing non-zero to this file while trigger is active changes the
> >>                 top brightness trigger is going to use.
> >
> > That makes sense. Which is why, when I changed to kernel 4.4.x, I made
> the "on" case do led_set_brightness(led_cdev, LED_OFF);
> led_set_brightness(led_cdev, LED_FULL);. As I described in another email, I
> am actually creating LED triggers, so my code calls led_trigger_event(trigger,
> LED_OFF); led_trigger_event(trigger, LED_FULL);. That seems to meet the
> design intent of the API.
> >
> >>> Now I have upgraded to 4.9.x, and found that the transition from
> "blinking"
> >> to "on" again isn't working. The LED ends up being off instead of on.
> >>>
> >>> Examining the code of led_set_brightness():
> >>>
> >>> * Behaviour is different depending on whether the brightness is
> >>> LED_OFF
> >> (it schedules the blink to be turned off "soon") or other (it alters
> >> the brightness of subsequent blinks).
> >>> * It looks as though there are race conditions in the transition
> >>> from blinking
> >> to steady off -- it schedules the blink to be turned off "soon", so
> >> it's difficult to guarantee a reliable transition from blinking to on/off.
> >>>
> >>> The combination of the above two points makes it seem difficult to
> >> robustly go from blinking to off/on.
> >>
> >> Please make sure you have the following patch:
> >>
> >> Author: Hans de Goede <hdegoede@redhat.com>  2016-11-08 14:38:46
> >> Committer: Jacek Anaszewski <j.anaszewski@samsung.com>  2016-11-22
> >> 12:07:05
> >> Parent: 8338eab50ffb3399a938d723f9605383ed9f8473 (leds: Add user LED
> >> driver for NIC78bx device)
> >> Follows: v4.9-rc1
> >> Precedes: leds_for_4.10
> >>
> >>     led: core: Use atomic bit-field for the blink-flags
> >>
> >>     All the LED_BLINK* flags are accessed read-modify-write from e.g.
> >>     led_set_brightness and led_blink_set_oneshot while both
> >>     set_brightness_work and the blink_timer may be running.
> >>
> >>     If these race then the modify step done by one of them may be lost,
> >>     switch the LED_BLINK* flags to a new atomic work_flags bit-field
> >>     to avoid this race.
> >
> > I don't have that patch, I don't think. I see it in commit
> a9c6ce57ec2f136d08141e8220a0ffaca216f7b0 which goes into kernel 4.10. But
> it looks as though that's not the essence of the issue.
> 
> It's actually eb1610b4c273370f491c5e194e5a56e3470d81e8.
> 
> >
> >>> So, my questions are:
> >>>
> >>> * What is the correct way to use the API to reliably control an LED
> >>> for a
> >> combination of off/on/blinking?
> >
> > It sounds as though calling led_set_brightness(led_cdev, LED_OFF);
> led_set_brightness(led_cdev, LED_FULL); in-principle should behave
> according to API design intent.
> >
> >>> * Is my above analysis of the code correct, so that there are indeed
> >>> race
> >> conditions going from blinking to off, leading to undefined
> >> behaviour? Can that be fixed?
> >
> > Looking at the code, it looks problematic that led_set_brightness(led_cdev,
> LED_OFF) does not immediately disable soft blinking, but schedules it for
> "soon in the future".
> 
> Comments in led_set_brightness() explain the reason for which delaying
> brightness setting is required, i.e. to avoid the problem when
> led_set_brightness() is called from hard irq context with timer trigger
> enabled in the same time. See the original patch introducing deferred
> brightness setting:
> 
> commit d23a22a74fded23a12434c9463fe66cec2b0afcd
> Author: Fabio Baltieri <fabio.baltieri@gmail.com>
> Date:   Wed Aug 15 21:44:34 2012 +0800
> 
>     leds: delay led_set_brightness if stopping soft-blink
> 
>     Delay execution of led_set_brightness() if need to stop soft-blink
>     timer.
> 
>     This allows led_set_brightness to be called in hard-irq context even if
>     soft-blink was activated on that LED.
> 
>     Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
>     Cc: Pawel Moll <pawel.moll@arm.com>
>     Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
> 
> 
> In general it allows to avoid conflict explained in [0].
> 
> [0]
> https://www.kernel.org/pub/linux/kernel/people/rusty/kernel-
> locking/c188.html
> 
> > Then, when led_set_brightness(led_cdev, LED_FULL) is called, it is not
> > effective because the previous call with LED_OFF has not actually done
> > anything yet.
> 
> You really should test your code with the patch
> eb1610b4c27 ("led: core: Use atomic bit-field for the blink-flags').
> 
> Changing the brightness to any non-zero value while blinking is on shouldn't
> cause races after introducing flag LED_BLINK_BRIGHTNESS_CHANGE, which is
> set in led_brightness_set() when blinking is on, and new brightness is applied
> upon next blink transition to on. This doesn't stop blinking though, which is in
> line with doc.
> 
> Now I see that there can be a problem if you set brightness to 0 while
> blinking and immediately set it to any non-zero value, before
> LED_BLINK_DISABLE is handled. The last operation can have no effect then.
> I'll check it.

I've considered the following fix:

diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
index ef1360445413..5e60a6c3a52f 100644
--- a/drivers/leds/led-core.c
+++ b/drivers/leds/led-core.c
@@ -109,7 +109,6 @@ static void set_brightness_delayed(struct work_struct *ws)
        int ret = 0;
 
        if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
-               led_cdev->delayed_set_value = LED_OFF;
                led_stop_software_blink(led_cdev);
        }
 
@@ -241,7 +240,10 @@ void led_set_brightness(struct led_classdev *led_cdev,
                 */
                if (brightness == LED_OFF) {
                        set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
+                       led_cdev->delayed_set_value = brightness;
                        schedule_work(&led_cdev->set_brightness_work);
+               } else if (test_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
+                       led_cdev->delayed_set_value = brightness;
                } else {
                        set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
                                &led_cdev->work_flags);


However, I think it has problems with concurrency/race conditions. I'm having difficulty thinking of an elegant solution that is free of race conditions, without resorting to spinlocks etc. I'll be very interested to see what you propose.

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-24  5:23       ` Craig McQueen
@ 2017-11-24 20:13         ` Jacek Anaszewski
  0 siblings, 0 replies; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-24 20:13 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

On 11/24/2017 06:23 AM, Craig McQueen wrote:
> Jacek Anaszewski wrote:
>> On 11/23/2017 01:55 AM, Craig McQueen wrote:
>>> Jacek Anaszewski wrote:
>>>> Hi Craig,
>>>>
>>>> On 11/22/2017 01:55 AM, Craig McQueen wrote:
>>>>> I'd like to control a LED to possibly be any of:
>>>>>
>>>>> * Off
>>>>> * On
>>>>> * Blinking
>>>>>
>>>>> I've had this working fine in 3.14.x kernel.
>>>>>
>>>>> But when upgrading to 4.4.x, I found that the transition from
>>>>> "blinking" to
>>>> "on" didn't work. Namely, if it's blinking and then I call
>>>> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I
>>>> can't remember if it turned off, or remained blinking; it wasn't on
>>>> anyway). I worked around it by calling led_set_brightness(led_cdev,
>>>> LED_OFF) just before led_set_brightness(led_cdev, LED_FULL).
>>>>
>>>> This is by design. Setting brightness to LED_OFF deactivates any
>>>> active trigger. There were some inconsistencies in the API semantics
>>>> and it was fixed.
>>>>
>>>> See Documentation/ABI/testing/sysfs-class-led
>>>>
>>>> What:           /sys/class/leds/<led>/brightness
>>>> Date:           March 2006
>>>> KernelVersion:  2.6.17
>>>> Contact:        Richard Purdie <rpurdie@rpsys.net>
>>>> Description:
>>>>                 Set the brightness of the LED. Most LEDs don't
>>>>                 have hardware brightness support, so will just be turned on for
>>>>                 non-zero brightness settings. The value is between 0 and
>>>>                 /sys/class/leds/<led>/max_brightness.
>>>>
>>>>                 Writing 0 to this file clears active trigger.
>>>>
>>>>                 Writing non-zero to this file while trigger is active changes the
>>>>                 top brightness trigger is going to use.
>>>
>>> That makes sense. Which is why, when I changed to kernel 4.4.x, I made
>> the "on" case do led_set_brightness(led_cdev, LED_OFF);
>> led_set_brightness(led_cdev, LED_FULL);. As I described in another email, I
>> am actually creating LED triggers, so my code calls led_trigger_event(trigger,
>> LED_OFF); led_trigger_event(trigger, LED_FULL);. That seems to meet the
>> design intent of the API.
>>>
>>>>> Now I have upgraded to 4.9.x, and found that the transition from
>> "blinking"
>>>> to "on" again isn't working. The LED ends up being off instead of on.
>>>>>
>>>>> Examining the code of led_set_brightness():
>>>>>
>>>>> * Behaviour is different depending on whether the brightness is
>>>>> LED_OFF
>>>> (it schedules the blink to be turned off "soon") or other (it alters
>>>> the brightness of subsequent blinks).
>>>>> * It looks as though there are race conditions in the transition
>>>>> from blinking
>>>> to steady off -- it schedules the blink to be turned off "soon", so
>>>> it's difficult to guarantee a reliable transition from blinking to on/off.
>>>>>
>>>>> The combination of the above two points makes it seem difficult to
>>>> robustly go from blinking to off/on.
>>>>
>>>> Please make sure you have the following patch:
>>>>
>>>> Author: Hans de Goede <hdegoede@redhat.com>  2016-11-08 14:38:46
>>>> Committer: Jacek Anaszewski <j.anaszewski@samsung.com>  2016-11-22
>>>> 12:07:05
>>>> Parent: 8338eab50ffb3399a938d723f9605383ed9f8473 (leds: Add user LED
>>>> driver for NIC78bx device)
>>>> Follows: v4.9-rc1
>>>> Precedes: leds_for_4.10
>>>>
>>>>     led: core: Use atomic bit-field for the blink-flags
>>>>
>>>>     All the LED_BLINK* flags are accessed read-modify-write from e.g.
>>>>     led_set_brightness and led_blink_set_oneshot while both
>>>>     set_brightness_work and the blink_timer may be running.
>>>>
>>>>     If these race then the modify step done by one of them may be lost,
>>>>     switch the LED_BLINK* flags to a new atomic work_flags bit-field
>>>>     to avoid this race.
>>>
>>> I don't have that patch, I don't think. I see it in commit
>> a9c6ce57ec2f136d08141e8220a0ffaca216f7b0 which goes into kernel 4.10. But
>> it looks as though that's not the essence of the issue.
>>
>> It's actually eb1610b4c273370f491c5e194e5a56e3470d81e8.
>>
>>>
>>>>> So, my questions are:
>>>>>
>>>>> * What is the correct way to use the API to reliably control an LED
>>>>> for a
>>>> combination of off/on/blinking?
>>>
>>> It sounds as though calling led_set_brightness(led_cdev, LED_OFF);
>> led_set_brightness(led_cdev, LED_FULL); in-principle should behave
>> according to API design intent.
>>>
>>>>> * Is my above analysis of the code correct, so that there are indeed
>>>>> race
>>>> conditions going from blinking to off, leading to undefined
>>>> behaviour? Can that be fixed?
>>>
>>> Looking at the code, it looks problematic that led_set_brightness(led_cdev,
>> LED_OFF) does not immediately disable soft blinking, but schedules it for
>> "soon in the future".
>>
>> Comments in led_set_brightness() explain the reason for which delaying
>> brightness setting is required, i.e. to avoid the problem when
>> led_set_brightness() is called from hard irq context with timer trigger
>> enabled in the same time. See the original patch introducing deferred
>> brightness setting:
>>
>> commit d23a22a74fded23a12434c9463fe66cec2b0afcd
>> Author: Fabio Baltieri <fabio.baltieri@gmail.com>
>> Date:   Wed Aug 15 21:44:34 2012 +0800
>>
>>     leds: delay led_set_brightness if stopping soft-blink
>>
>>     Delay execution of led_set_brightness() if need to stop soft-blink
>>     timer.
>>
>>     This allows led_set_brightness to be called in hard-irq context even if
>>     soft-blink was activated on that LED.
>>
>>     Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
>>     Cc: Pawel Moll <pawel.moll@arm.com>
>>     Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
>>
>>
>> In general it allows to avoid conflict explained in [0].
>>
>> [0]
>> https://www.kernel.org/pub/linux/kernel/people/rusty/kernel-
>> locking/c188.html
>>
>>> Then, when led_set_brightness(led_cdev, LED_FULL) is called, it is not
>>> effective because the previous call with LED_OFF has not actually done
>>> anything yet.
>>
>> You really should test your code with the patch
>> eb1610b4c27 ("led: core: Use atomic bit-field for the blink-flags').
>>
>> Changing the brightness to any non-zero value while blinking is on shouldn't
>> cause races after introducing flag LED_BLINK_BRIGHTNESS_CHANGE, which is
>> set in led_brightness_set() when blinking is on, and new brightness is applied
>> upon next blink transition to on. This doesn't stop blinking though, which is in
>> line with doc.
>>
>> Now I see that there can be a problem if you set brightness to 0 while
>> blinking and immediately set it to any non-zero value, before
>> LED_BLINK_DISABLE is handled. The last operation can have no effect then.
>> I'll check it.
> 
> I've considered the following fix:
> 
> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
> index ef1360445413..5e60a6c3a52f 100644
> --- a/drivers/leds/led-core.c
> +++ b/drivers/leds/led-core.c
> @@ -109,7 +109,6 @@ static void set_brightness_delayed(struct work_struct *ws)
>         int ret = 0;
>  
>         if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
> -               led_cdev->delayed_set_value = LED_OFF;
>                 led_stop_software_blink(led_cdev);
>         }
>  
> @@ -241,7 +240,10 @@ void led_set_brightness(struct led_classdev *led_cdev,
>                  */
>                 if (brightness == LED_OFF) {
>                         set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
> +                       led_cdev->delayed_set_value = brightness;
>                         schedule_work(&led_cdev->set_brightness_work);
> +               } else if (test_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
> +                       led_cdev->delayed_set_value = brightness;
>                 } else {
>                         set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
>                                 &led_cdev->work_flags);

Thanks, I'll test that.

> 
> However, I think it has problems with concurrency/race conditions. I'm having difficulty thinking of an elegant solution that is free of race conditions, without resorting to spinlocks etc. I'll be very interested to see what you propose.

spin_lock is not an option since some LED class drivers can sleep while
setting brightness, which is forbidden while holding a spin_lock.
That's why we have to resort to such convoluted constructions.

As a first shot, I'd try to debug your case in detail.
You could add logs to your brightness_set op and to the LED core
functions involved in the operations in question.

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-22  0:55 Bug when using both "set" and "blink" functions Craig McQueen
                   ` (2 preceding siblings ...)
  2017-11-22 19:53 ` Jacek Anaszewski
@ 2017-11-25 21:42 ` Jacek Anaszewski
  2017-11-27  6:51   ` Craig McQueen
  3 siblings, 1 reply; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-25 21:42 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

On 11/22/2017 01:55 AM, Craig McQueen wrote:
> I'd like to control a LED to possibly be any of:
> 
> * Off
> * On
> * Blinking
> 
> I've had this working fine in 3.14.x kernel.
> 
> But when upgrading to 4.4.x, I found that the transition from "blinking" to "on" didn't work. Namely, if it's blinking and then I call led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't remember if it turned off, or remained blinking; it wasn't on anyway). I worked around it by calling led_set_brightness(led_cdev, LED_OFF) just before led_set_brightness(led_cdev, LED_FULL).

Could you please check it with 4.14?

The expected behavior in 4.14 is:

use case 1: led_set_brightness(led_cdev, LED_OFF) while timer trigger
            is active.
expected result: LED is off, trigger is deactivated.

use case 2: led_set_brightness(led_cdev, LED_FULL) while timer trigger
            is active.
expected result: LED blink brightness is LED_FULL, trigger remains active.

> Now I have upgraded to 4.9.x, and found that the transition from "blinking" to "on" again isn't working. The LED ends up being off instead of on.

Could you provide the exact steps to reproduce this case?
LED shouldn't be in any way off after setting brightness to non-zero
value when blinking.

There were indeed changes in 4.9 in this area, that modified semantics
of setting brightness to non-zero value when blinking. See the
reasoning:

commit 7cfe749fad5158247282f2fee30773fd454029ab
Author: Tony Makkiel <tony.makkiel@daqri.com>
Date:   Wed May 18 17:22:45 2016 +0100

    leds: core: Fix brightness setting upon hardware blinking enabled

    Commit 76931edd54f8 ("leds: fix brightness changing when software
blinking
    is active") changed the semantics of led_set_brightness() which
according
    to the documentation should disable blinking upon any brightness
setting.
    Moreover it made it different for soft blink case, where it was possible
    to change blink brightness, and for hardware blink case, where setting
    any brightness greater than 0 was ignored.

    While the change itself is against the documentation claims, it was
driven
    also by the fact that timer trigger remained active after turning
blinking
    off. Fixing that would have required major refactoring in the led-core,
    led-class, and led-triggers because of cyclic dependencies.

    Finally, it has been decided that allowing for brightness change during
    blinking is beneficial as it can be accomplished without disturbing
    blink rhythm.

    The change in brightness setting semantics will not affect existing
    LED class drivers that implement blink_set op thanks to the LED_BLINK_SW
    flag introduced by this patch. The flag state will be from now on
checked
    in led_set_brightness() which will allow to distinguish between software
    and hardware blink mode. In the latter case the control will be passed
    directly to the drivers which apply their semantics on brightness set,
    which is disable the blinking in case of most such drivers. New drivers
    will apply new semantics and just change the brightness while hardware
    blinking is on, if possible.

    The issue was smuggled by subsequent LED core improvements, which
modified
    the code that originally introduced the problem.



> Examining the code of led_set_brightness():
> 
> * Behaviour is different depending on whether the brightness is LED_OFF (it schedules the blink to be turned off "soon") or other (it alters the brightness of subsequent blinks).
> * It looks as though there are race conditions in the transition from blinking to steady off -- it schedules the blink to be turned off "soon", so it's difficult to guarantee a reliable transition from blinking to on/off.
> 
> The combination of the above two points makes it seem difficult to robustly go from blinking to off/on.
> 
> So, my questions are:
> 
> * What is the correct way to use the API to reliably control an LED for a combination of off/on/blinking?
> * Is my above analysis of the code correct, so that there are indeed race conditions going from blinking to off, leading to undefined behaviour? Can that be fixed?

I checked earlier discussed case i.e. disabling trigger and immediately
setting brightness to non-zero value and it seems to work fine. Checked
with uleds driver:

# echo "timer" > trigger
# echo 0 > brightness; echo 100 > brightness

Brightness always is 100 after that, even when setting
delay_on/off to 1.

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-25 21:42 ` Jacek Anaszewski
@ 2017-11-27  6:51   ` Craig McQueen
  2017-11-27 19:26     ` Jacek Anaszewski
  0 siblings, 1 reply; 17+ messages in thread
From: Craig McQueen @ 2017-11-27  6:51 UTC (permalink / raw)
  To: Jacek Anaszewski, linux-leds

Jacek Anaszewski wrote:
> On 11/22/2017 01:55 AM, Craig McQueen wrote:
> > I'd like to control a LED to possibly be any of:
> >
> > * Off
> > * On
> > * Blinking
> >
> > I've had this working fine in 3.14.x kernel.
> >
> > But when upgrading to 4.4.x, I found that the transition from "blinking" to
> "on" didn't work. Namely, if it's blinking and then I call
> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't
> remember if it turned off, or remained blinking; it wasn't on anyway). I
> worked around it by calling led_set_brightness(led_cdev, LED_OFF) just
> before led_set_brightness(led_cdev, LED_FULL).
> 
> Could you please check it with 4.14?

It isn't easy for me to try 4.14 in my current setup (Yocto build with meta-ti).

> The expected behavior in 4.14 is:
> 
> use case 1: led_set_brightness(led_cdev, LED_OFF) while timer trigger
>             is active.
> expected result: LED is off, trigger is deactivated.
> 
> use case 2: led_set_brightness(led_cdev, LED_FULL) while timer trigger
>             is active.
> expected result: LED blink brightness is LED_FULL, trigger remains active.

Yes, I understand that API design.

> > Now I have upgraded to 4.9.x, and found that the transition from "blinking"
> to "on" again isn't working. The LED ends up being off instead of on.
> 
> Could you provide the exact steps to reproduce this case?
> LED shouldn't be in any way off after setting brightness to non-zero value
> when blinking.

I'm using a custom kernel driver that tries to go from blinking to steady-on using these function calls:

		led_trigger_event(trigger, LED_OFF);
		led_trigger_event(trigger, LED_FULL);

> There were indeed changes in 4.9 in this area, that modified semantics of
> setting brightness to non-zero value when blinking. See the
> reasoning:
> 
> commit 7cfe749fad5158247282f2fee30773fd454029ab
> Author: Tony Makkiel <tony.makkiel@daqri.com>
> Date:   Wed May 18 17:22:45 2016 +0100
> 
>     leds: core: Fix brightness setting upon hardware blinking enabled
> 
>     Commit 76931edd54f8 ("leds: fix brightness changing when software
> blinking
>     is active") changed the semantics of led_set_brightness() which according
>     to the documentation should disable blinking upon any brightness setting.
>     Moreover it made it different for soft blink case, where it was possible
>     to change blink brightness, and for hardware blink case, where setting
>     any brightness greater than 0 was ignored.
> 
>     While the change itself is against the documentation claims, it was driven
>     also by the fact that timer trigger remained active after turning blinking
>     off. Fixing that would have required major refactoring in the led-core,
>     led-class, and led-triggers because of cyclic dependencies.
> 
>     Finally, it has been decided that allowing for brightness change during
>     blinking is beneficial as it can be accomplished without disturbing
>     blink rhythm.
> 
>     The change in brightness setting semantics will not affect existing
>     LED class drivers that implement blink_set op thanks to the LED_BLINK_SW
>     flag introduced by this patch. The flag state will be from now on checked
>     in led_set_brightness() which will allow to distinguish between software
>     and hardware blink mode. In the latter case the control will be passed
>     directly to the drivers which apply their semantics on brightness set,
>     which is disable the blinking in case of most such drivers. New drivers
>     will apply new semantics and just change the brightness while hardware
>     blinking is on, if possible.
> 
>     The issue was smuggled by subsequent LED core improvements, which
> modified
>     the code that originally introduced the problem.
> 
> 
> 
> > Examining the code of led_set_brightness():
> >
> > * Behaviour is different depending on whether the brightness is LED_OFF
> (it schedules the blink to be turned off "soon") or other (it alters the
> brightness of subsequent blinks).
> > * It looks as though there are race conditions in the transition from blinking
> to steady off -- it schedules the blink to be turned off "soon", so it's difficult
> to guarantee a reliable transition from blinking to on/off.
> >
> > The combination of the above two points makes it seem difficult to
> robustly go from blinking to off/on.
> >
> > So, my questions are:
> >
> > * What is the correct way to use the API to reliably control an LED for a
> combination of off/on/blinking?
> > * Is my above analysis of the code correct, so that there are indeed race
> conditions going from blinking to off, leading to undefined behaviour? Can
> that be fixed?
> 
> I checked earlier discussed case i.e. disabling trigger and immediately setting
> brightness to non-zero value and it seems to work fine. Checked with uleds
> driver:
> 
> # echo "timer" > trigger
> # echo 0 > brightness; echo 100 > brightness
> 
> Brightness always is 100 after that, even when setting delay_on/off to 1.

I still think there's a race condition, but you don't discover it when using userspace, because there's sufficient time between setting the brightness to 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.

But I'm using a custom kernel driver that tries to go from blinking to steady-on using these function calls:

		led_trigger_event(trigger, LED_OFF);   /* which calls led_set_brightness(led_cdev, LED_OFF) */
		led_trigger_event(trigger, LED_FULL);  /* which calls led_set_brightness(led_cdev, LED_FULL) */

In this case, LED_BLINK_DISABLE is not yet processed before the second function call, so the second function call becomes ineffective, because the pending LED_BLINK_DISABLE is later processed, turning off the LED and leaving it off.

-- 
Craig McQueen

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-27  6:51   ` Craig McQueen
@ 2017-11-27 19:26     ` Jacek Anaszewski
  2017-11-28  4:32       ` Craig McQueen
  0 siblings, 1 reply; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-27 19:26 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

On 11/27/2017 07:51 AM, Craig McQueen wrote:
> Jacek Anaszewski wrote:
>> On 11/22/2017 01:55 AM, Craig McQueen wrote:
>>> I'd like to control a LED to possibly be any of:
>>>
>>> * Off
>>> * On
>>> * Blinking
>>>
>>> I've had this working fine in 3.14.x kernel.
>>>
>>> But when upgrading to 4.4.x, I found that the transition from "blinking" to
>> "on" didn't work. Namely, if it's blinking and then I call
>> led_set_brightness(led_cdev, LED_FULL), then it wouldn't work (I can't
>> remember if it turned off, or remained blinking; it wasn't on anyway). I
>> worked around it by calling led_set_brightness(led_cdev, LED_OFF) just
>> before led_set_brightness(led_cdev, LED_FULL).
>>
>> Could you please check it with 4.14?
> 
> It isn't easy for me to try 4.14 in my current setup (Yocto build with meta-ti).

You could backport related led-core.c patches.
They don't have out-of-LED tree dependencies AFAIR.

>> The expected behavior in 4.14 is:
>>
>> use case 1: led_set_brightness(led_cdev, LED_OFF) while timer trigger
>>             is active.
>> expected result: LED is off, trigger is deactivated.
>>
>> use case 2: led_set_brightness(led_cdev, LED_FULL) while timer trigger
>>             is active.
>> expected result: LED blink brightness is LED_FULL, trigger remains active.
> 
> Yes, I understand that API design.
> 
>>> Now I have upgraded to 4.9.x, and found that the transition from "blinking"
>> to "on" again isn't working. The LED ends up being off instead of on.
>>
>> Could you provide the exact steps to reproduce this case?
>> LED shouldn't be in any way off after setting brightness to non-zero value
>> when blinking.
> 
> I'm using a custom kernel driver that tries to go from blinking to steady-on using these function calls:
> 
> 		led_trigger_event(trigger, LED_OFF);
> 		led_trigger_event(trigger, LED_FULL);
> 
>> There were indeed changes in 4.9 in this area, that modified semantics of
>> setting brightness to non-zero value when blinking. See the
>> reasoning:
>>
>> commit 7cfe749fad5158247282f2fee30773fd454029ab
>> Author: Tony Makkiel <tony.makkiel@daqri.com>
>> Date:   Wed May 18 17:22:45 2016 +0100
>>
>>     leds: core: Fix brightness setting upon hardware blinking enabled
>>
>>     Commit 76931edd54f8 ("leds: fix brightness changing when software
>> blinking
>>     is active") changed the semantics of led_set_brightness() which according
>>     to the documentation should disable blinking upon any brightness setting.
>>     Moreover it made it different for soft blink case, where it was possible
>>     to change blink brightness, and for hardware blink case, where setting
>>     any brightness greater than 0 was ignored.
>>
>>     While the change itself is against the documentation claims, it was driven
>>     also by the fact that timer trigger remained active after turning blinking
>>     off. Fixing that would have required major refactoring in the led-core,
>>     led-class, and led-triggers because of cyclic dependencies.
>>
>>     Finally, it has been decided that allowing for brightness change during
>>     blinking is beneficial as it can be accomplished without disturbing
>>     blink rhythm.
>>
>>     The change in brightness setting semantics will not affect existing
>>     LED class drivers that implement blink_set op thanks to the LED_BLINK_SW
>>     flag introduced by this patch. The flag state will be from now on checked
>>     in led_set_brightness() which will allow to distinguish between software
>>     and hardware blink mode. In the latter case the control will be passed
>>     directly to the drivers which apply their semantics on brightness set,
>>     which is disable the blinking in case of most such drivers. New drivers
>>     will apply new semantics and just change the brightness while hardware
>>     blinking is on, if possible.
>>
>>     The issue was smuggled by subsequent LED core improvements, which
>> modified
>>     the code that originally introduced the problem.
>>
>>
>>
>>> Examining the code of led_set_brightness():
>>>
>>> * Behaviour is different depending on whether the brightness is LED_OFF
>> (it schedules the blink to be turned off "soon") or other (it alters the
>> brightness of subsequent blinks).
>>> * It looks as though there are race conditions in the transition from blinking
>> to steady off -- it schedules the blink to be turned off "soon", so it's difficult
>> to guarantee a reliable transition from blinking to on/off.
>>>
>>> The combination of the above two points makes it seem difficult to
>> robustly go from blinking to off/on.
>>>
>>> So, my questions are:
>>>
>>> * What is the correct way to use the API to reliably control an LED for a
>> combination of off/on/blinking?
>>> * Is my above analysis of the code correct, so that there are indeed race
>> conditions going from blinking to off, leading to undefined behaviour? Can
>> that be fixed?
>>
>> I checked earlier discussed case i.e. disabling trigger and immediately setting
>> brightness to non-zero value and it seems to work fine. Checked with uleds
>> driver:
>>
>> # echo "timer" > trigger
>> # echo 0 > brightness; echo 100 > brightness
>>
>> Brightness always is 100 after that, even when setting delay_on/off to 1.
> 
> I still think there's a race condition, but you don't discover it when using userspace, because there's sufficient time between setting the brightness to 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.
> 
> But I'm using a custom kernel driver that tries to go from blinking to steady-on using these function calls:
> 
> 		led_trigger_event(trigger, LED_OFF);   /* which calls led_set_brightness(led_cdev, LED_OFF) */
> 		led_trigger_event(trigger, LED_FULL);  /* which calls led_set_brightness(led_cdev, LED_FULL) */
> 
> In this case, LED_BLINK_DISABLE is not yet processed before the second function call, so the second function call becomes ineffective, because the pending LED_BLINK_DISABLE is later processed, turning off the LED and leaving it off.

Have you verified it by inserting printks in the led_set_brightness()
and brightness_set op of your driver?

If there is a race then the call order should be as follows:

led_set_brightness LED_OFF
led_set_brightness LED_FULL
brightness_set LED_FULL
brightness_set LED_OFF

Regardless of that, I strongly advise backporting the mainline patches.
If that doesn't help then make sure that brightness_set op of your
driver implements proper locking scheme. If that doesn't help too, then
you could try to come up with your patch for the LED core, that fixes
the issue for you. It can be hard to address your particular case
otherwise.

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-27 19:26     ` Jacek Anaszewski
@ 2017-11-28  4:32       ` Craig McQueen
  2017-11-28 21:35         ` Jacek Anaszewski
  0 siblings, 1 reply; 17+ messages in thread
From: Craig McQueen @ 2017-11-28  4:32 UTC (permalink / raw)
  To: Jacek Anaszewski, linux-leds

Jacek Anaszewski wrote:
> On 11/27/2017 07:51 AM, Craig McQueen wrote:
> > Jacek Anaszewski wrote:
> >> I checked earlier discussed case i.e. disabling trigger and
> >> immediately setting brightness to non-zero value and it seems to work
> >> fine. Checked with uleds
> >> driver:
> >>
> >> # echo "timer" > trigger
> >> # echo 0 > brightness; echo 100 > brightness
> >>
> >> Brightness always is 100 after that, even when setting delay_on/off to 1.
> >
> > I still think there's a race condition, but you don't discover it when using
> userspace, because there's sufficient time between setting the brightness to
> 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.
> >
> > But I'm using a custom kernel driver that tries to go from blinking to steady-
> on using these function calls:
> >
> > 		led_trigger_event(trigger, LED_OFF);   /* which calls
> led_set_brightness(led_cdev, LED_OFF) */
> > 		led_trigger_event(trigger, LED_FULL);  /* which calls
> > led_set_brightness(led_cdev, LED_FULL) */
> >
> > In this case, LED_BLINK_DISABLE is not yet processed before the second
> function call, so the second function call becomes ineffective, because the
> pending LED_BLINK_DISABLE is later processed, turning off the LED and
> leaving it off.
> 
> Have you verified it by inserting printks in the led_set_brightness() and
> brightness_set op of your driver?

No, I didn't verify it. I deduced it by code review of led-core.c, and its use of LED_BLINK_DISABLE in led_set_brightness() and set_brightness_delayed().

Then, I made a patch (in an earlier email) which modified led_set_brightness(), and that fixed the issue. However, I was concerned that there still could be a race condition if set_brightness_delayed() happened to run in the middle of a call to led_set_brightness(led_cdev, LED_FULL). In that case, the call led_set_brightness(led_cdev, LED_FULL) could still fail.

> Regardless of that, I strongly advise backporting the mainline patches.
> If that doesn't help then make sure that brightness_set op of your driver
> implements proper locking scheme.

The problem seems to be nothing to do with the brightness_set op of the driver.

> If that doesn't help too, then you could
> try to come up with your patch for the LED core, that fixes the issue for you.
> It can be hard to address your particular case otherwise.

This code in led-core.c seems quite tricky to avoid any race conditions, without using a spinlock or semaphore. Here is another proposal which I am considering:

diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
index ef1360445413..5fe7826deab2 100644
--- a/drivers/leds/led-core.c
+++ b/drivers/leds/led-core.c
@@ -109,8 +109,8 @@ static void set_brightness_delayed(struct work_struct *ws)
        int ret = 0;
 
        if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
-               led_cdev->delayed_set_value = LED_OFF;
                led_stop_software_blink(led_cdev);
+               led_cdev->delayed_set_value = led_cdev->new_blink_brightness;
        }
 
        ret = __led_set_brightness(led_cdev, led_cdev->delayed_set_value);
@@ -239,15 +239,24 @@ void led_set_brightness(struct led_classdev *led_cdev,
                 * work queue task to avoid problems in case we are called
                 * from hard irq context.
                 */
+               led_cdev->new_blink_brightness = brightness;
                if (brightness == LED_OFF) {
                        set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
                        schedule_work(&led_cdev->set_brightness_work);
                } else {
                        set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
                                &led_cdev->work_flags);
-                       led_cdev->new_blink_brightness = brightness;
                }
-               return;
+
+               if (test_bit(LED_BLINK_SW, &led_cdev->work_flags)) {
+                       /*
+                        * Test LED_BLINK_SW again, to handle race condition
+                        * with set_brightness_delayed(). If it's no longer
+                        * set, then blink has just been stopped, so continue
+                        * with led_set_brightness_nosleep() below.
+                        */
+                       return;
+               }
        }
 
        led_set_brightness_nosleep(led_cdev, brightness);


-- 
Craig McQueen

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-28  4:32       ` Craig McQueen
@ 2017-11-28 21:35         ` Jacek Anaszewski
  2017-11-28 21:44           ` Jacek Anaszewski
  0 siblings, 1 reply; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-28 21:35 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

On 11/28/2017 05:32 AM, Craig McQueen wrote:
> Jacek Anaszewski wrote:
>> On 11/27/2017 07:51 AM, Craig McQueen wrote:
>>> Jacek Anaszewski wrote:
>>>> I checked earlier discussed case i.e. disabling trigger and
>>>> immediately setting brightness to non-zero value and it seems to work
>>>> fine. Checked with uleds
>>>> driver:
>>>>
>>>> # echo "timer" > trigger
>>>> # echo 0 > brightness; echo 100 > brightness
>>>>
>>>> Brightness always is 100 after that, even when setting delay_on/off to 1.
>>>
>>> I still think there's a race condition, but you don't discover it when using
>> userspace, because there's sufficient time between setting the brightness to
>> 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.
>>>
>>> But I'm using a custom kernel driver that tries to go from blinking to steady-
>> on using these function calls:
>>>
>>> 		led_trigger_event(trigger, LED_OFF);   /* which calls
>> led_set_brightness(led_cdev, LED_OFF) */
>>> 		led_trigger_event(trigger, LED_FULL);  /* which calls
>>> led_set_brightness(led_cdev, LED_FULL) */
>>>
>>> In this case, LED_BLINK_DISABLE is not yet processed before the second
>> function call, so the second function call becomes ineffective, because the
>> pending LED_BLINK_DISABLE is later processed, turning off the LED and
>> leaving it off.
>>
>> Have you verified it by inserting printks in the led_set_brightness() and
>> brightness_set op of your driver?
> 
> No, I didn't verify it. I deduced it by code review of led-core.c, and its use of LED_BLINK_DISABLE in led_set_brightness() and set_brightness_delayed().
> 
> Then, I made a patch (in an earlier email) which modified led_set_brightness(), and that fixed the issue. However, I was concerned that there still could be a race condition if set_brightness_delayed() happened to run in the middle of a call to led_set_brightness(led_cdev, LED_FULL). In that case, the call led_set_brightness(led_cdev, LED_FULL) could still fail.
> 
>> Regardless of that, I strongly advise backporting the mainline patches.
>> If that doesn't help then make sure that brightness_set op of your driver
>> implements proper locking scheme.
> 
> The problem seems to be nothing to do with the brightness_set op of the driver.
> 
>> If that doesn't help too, then you could
>> try to come up with your patch for the LED core, that fixes the issue for you.
>> It can be hard to address your particular case otherwise.
> 
> This code in led-core.c seems quite tricky to avoid any race conditions, without using a spinlock or semaphore. Here is another proposal which I am considering:
> 
> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
> index ef1360445413..5fe7826deab2 100644
> --- a/drivers/leds/led-core.c
> +++ b/drivers/leds/led-core.c
> @@ -109,8 +109,8 @@ static void set_brightness_delayed(struct work_struct *ws)
>         int ret = 0;
>  
>         if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
> -               led_cdev->delayed_set_value = LED_OFF;
>                 led_stop_software_blink(led_cdev);
> +               led_cdev->delayed_set_value = led_cdev->new_blink_brightness;
>         }
>  
>         ret = __led_set_brightness(led_cdev, led_cdev->delayed_set_value);
> @@ -239,15 +239,24 @@ void led_set_brightness(struct led_classdev *led_cdev,
>                  * work queue task to avoid problems in case we are called
>                  * from hard irq context.
>                  */
> +               led_cdev->new_blink_brightness = brightness;
>                 if (brightness == LED_OFF) {
>                         set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
>                         schedule_work(&led_cdev->set_brightness_work);
>                 } else {
>                         set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
>                                 &led_cdev->work_flags);
> -                       led_cdev->new_blink_brightness = brightness;
>                 }
> -               return;
> +
> +               if (test_bit(LED_BLINK_SW, &led_cdev->work_flags)) {
> +                       /*
> +                        * Test LED_BLINK_SW again, to handle race condition
> +                        * with set_brightness_delayed(). If it's no longer
> +                        * set, then blink has just been stopped, so continue
> +                        * with led_set_brightness_nosleep() below.
> +                        */
> +                       return;
> +               }
>         }
>  
>         led_set_brightness_nosleep(led_cdev, brightness);

It will be also prone to races. Every solution not employing
mutual exclusive section will be. I'm starting to think if the
best we can do isn't just preventing brightness setting when
blink disable is pending.


diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
index fd83c7f..9c775a2 100644
--- a/drivers/leds/led-core.c
+++ b/drivers/leds/led-core.c
@@ -228,6 +228,11 @@ EXPORT_SYMBOL_GPL(led_stop_software_blink);
 void led_set_brightness(struct led_classdev *led_cdev,
                        enum led_brightness brightness)
 {
+       if (test_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
+               dev_err(led_cdev->dev,
+                       "Setting an LED's brightness failed - blink
disable pending\n");
+               return;
+       }
        /*
         * If software blink is active, delay brightness setting
         * until the next timer tick.



-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-28 21:35         ` Jacek Anaszewski
@ 2017-11-28 21:44           ` Jacek Anaszewski
  2017-11-28 23:40             ` Craig McQueen
  0 siblings, 1 reply; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-28 21:44 UTC (permalink / raw)
  To: Craig McQueen, linux-leds

On 11/28/2017 10:35 PM, Jacek Anaszewski wrote:
> On 11/28/2017 05:32 AM, Craig McQueen wrote:
>> Jacek Anaszewski wrote:
>>> On 11/27/2017 07:51 AM, Craig McQueen wrote:
>>>> Jacek Anaszewski wrote:
>>>>> I checked earlier discussed case i.e. disabling trigger and
>>>>> immediately setting brightness to non-zero value and it seems to work
>>>>> fine. Checked with uleds
>>>>> driver:
>>>>>
>>>>> # echo "timer" > trigger
>>>>> # echo 0 > brightness; echo 100 > brightness
>>>>>
>>>>> Brightness always is 100 after that, even when setting delay_on/off to 1.
>>>>
>>>> I still think there's a race condition, but you don't discover it when using
>>> userspace, because there's sufficient time between setting the brightness to
>>> 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.
>>>>
>>>> But I'm using a custom kernel driver that tries to go from blinking to steady-
>>> on using these function calls:
>>>>
>>>> 		led_trigger_event(trigger, LED_OFF);   /* which calls
>>> led_set_brightness(led_cdev, LED_OFF) */
>>>> 		led_trigger_event(trigger, LED_FULL);  /* which calls
>>>> led_set_brightness(led_cdev, LED_FULL) */
>>>>
>>>> In this case, LED_BLINK_DISABLE is not yet processed before the second
>>> function call, so the second function call becomes ineffective, because the
>>> pending LED_BLINK_DISABLE is later processed, turning off the LED and
>>> leaving it off.
>>>
>>> Have you verified it by inserting printks in the led_set_brightness() and
>>> brightness_set op of your driver?
>>
>> No, I didn't verify it. I deduced it by code review of led-core.c, and its use of LED_BLINK_DISABLE in led_set_brightness() and set_brightness_delayed().
>>
>> Then, I made a patch (in an earlier email) which modified led_set_brightness(), and that fixed the issue. However, I was concerned that there still could be a race condition if set_brightness_delayed() happened to run in the middle of a call to led_set_brightness(led_cdev, LED_FULL). In that case, the call led_set_brightness(led_cdev, LED_FULL) could still fail.
>>
>>> Regardless of that, I strongly advise backporting the mainline patches.
>>> If that doesn't help then make sure that brightness_set op of your driver
>>> implements proper locking scheme.
>>
>> The problem seems to be nothing to do with the brightness_set op of the driver.
>>
>>> If that doesn't help too, then you could
>>> try to come up with your patch for the LED core, that fixes the issue for you.
>>> It can be hard to address your particular case otherwise.
>>
>> This code in led-core.c seems quite tricky to avoid any race conditions, without using a spinlock or semaphore. Here is another proposal which I am considering:
>>
>> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
>> index ef1360445413..5fe7826deab2 100644
>> --- a/drivers/leds/led-core.c
>> +++ b/drivers/leds/led-core.c
>> @@ -109,8 +109,8 @@ static void set_brightness_delayed(struct work_struct *ws)
>>         int ret = 0;
>>  
>>         if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
>> -               led_cdev->delayed_set_value = LED_OFF;
>>                 led_stop_software_blink(led_cdev);
>> +               led_cdev->delayed_set_value = led_cdev->new_blink_brightness;
>>         }
>>  
>>         ret = __led_set_brightness(led_cdev, led_cdev->delayed_set_value);
>> @@ -239,15 +239,24 @@ void led_set_brightness(struct led_classdev *led_cdev,
>>                  * work queue task to avoid problems in case we are called
>>                  * from hard irq context.
>>                  */
>> +               led_cdev->new_blink_brightness = brightness;
>>                 if (brightness == LED_OFF) {
>>                         set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
>>                         schedule_work(&led_cdev->set_brightness_work);
>>                 } else {
>>                         set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
>>                                 &led_cdev->work_flags);
>> -                       led_cdev->new_blink_brightness = brightness;
>>                 }
>> -               return;
>> +
>> +               if (test_bit(LED_BLINK_SW, &led_cdev->work_flags)) {
>> +                       /*
>> +                        * Test LED_BLINK_SW again, to handle race condition
>> +                        * with set_brightness_delayed(). If it's no longer
>> +                        * set, then blink has just been stopped, so continue
>> +                        * with led_set_brightness_nosleep() below.
>> +                        */
>> +                       return;
>> +               }
>>         }
>>  
>>         led_set_brightness_nosleep(led_cdev, brightness);
> 
> It will be also prone to races. Every solution not employing
> mutual exclusive section will be. I'm starting to think if the
> best we can do isn't just preventing brightness setting when
> blink disable is pending.
> 
> 
> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
> index fd83c7f..9c775a2 100644
> --- a/drivers/leds/led-core.c
> +++ b/drivers/leds/led-core.c
> @@ -228,6 +228,11 @@ EXPORT_SYMBOL_GPL(led_stop_software_blink);
>  void led_set_brightness(struct led_classdev *led_cdev,
>                         enum led_brightness brightness)
>  {
> +       if (test_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
> +               dev_err(led_cdev->dev,
> +                       "Setting an LED's brightness failed - blink
> disable pending\n");
> +               return;
> +       }

Of course it will not work too since we can be preempted here
by the other process that will set LED_BLINK_DISABLE.
After waking up this one will not be aware of the flag state
change.

>         /*
>          * If software blink is active, delay brightness setting
>          * until the next timer tick.
> 
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Bug when using both "set" and "blink" functions
  2017-11-28 21:44           ` Jacek Anaszewski
@ 2017-11-28 23:40             ` Craig McQueen
  2017-11-29 20:45               ` Jacek Anaszewski
  0 siblings, 1 reply; 17+ messages in thread
From: Craig McQueen @ 2017-11-28 23:40 UTC (permalink / raw)
  To: Jacek Anaszewski, linux-leds

Jacek Anaszewski wrote:
> On 11/28/2017 10:35 PM, Jacek Anaszewski wrote:
> > On 11/28/2017 05:32 AM, Craig McQueen wrote:
> >> Jacek Anaszewski wrote:
> >>> On 11/27/2017 07:51 AM, Craig McQueen wrote:
> >>>> Jacek Anaszewski wrote:
> >>>>> I checked earlier discussed case i.e. disabling trigger and
> >>>>> immediately setting brightness to non-zero value and it seems to
> >>>>> work fine. Checked with uleds
> >>>>> driver:
> >>>>>
> >>>>> # echo "timer" > trigger
> >>>>> # echo 0 > brightness; echo 100 > brightness
> >>>>>
> >>>>> Brightness always is 100 after that, even when setting delay_on/off to
> 1.
> >>>>
> >>>> I still think there's a race condition, but you don't discover it
> >>>> when using
> >>> userspace, because there's sufficient time between setting the
> >>> brightness to
> >>> 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.
> >>>>
> >>>> But I'm using a custom kernel driver that tries to go from blinking
> >>>> to steady-
> >>> on using these function calls:
> >>>>
> >>>> 		led_trigger_event(trigger, LED_OFF);   /* which calls
> >>> led_set_brightness(led_cdev, LED_OFF) */
> >>>> 		led_trigger_event(trigger, LED_FULL);  /* which calls
> >>>> led_set_brightness(led_cdev, LED_FULL) */
> >>>>
> >>>> In this case, LED_BLINK_DISABLE is not yet processed before the
> >>>> second
> >>> function call, so the second function call becomes ineffective,
> >>> because the pending LED_BLINK_DISABLE is later processed, turning
> >>> off the LED and leaving it off.
> >>>
> >>> Have you verified it by inserting printks in the
> >>> led_set_brightness() and brightness_set op of your driver?
> >>
> >> No, I didn't verify it. I deduced it by code review of led-core.c, and its use
> of LED_BLINK_DISABLE in led_set_brightness() and
> set_brightness_delayed().
> >>
> >> Then, I made a patch (in an earlier email) which modified
> led_set_brightness(), and that fixed the issue. However, I was concerned
> that there still could be a race condition if set_brightness_delayed()
> happened to run in the middle of a call to led_set_brightness(led_cdev,
> LED_FULL). In that case, the call led_set_brightness(led_cdev, LED_FULL)
> could still fail.
> >>
> >>> Regardless of that, I strongly advise backporting the mainline patches.
> >>> If that doesn't help then make sure that brightness_set op of your
> >>> driver implements proper locking scheme.
> >>
> >> The problem seems to be nothing to do with the brightness_set op of the
> driver.
> >>
> >>> If that doesn't help too, then you could try to come up with your
> >>> patch for the LED core, that fixes the issue for you.
> >>> It can be hard to address your particular case otherwise.
> >>
> >> This code in led-core.c seems quite tricky to avoid any race conditions,
> without using a spinlock or semaphore. Here is another proposal which I am
> considering:
> >>
> >> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index
> >> ef1360445413..5fe7826deab2 100644
> >> --- a/drivers/leds/led-core.c
> >> +++ b/drivers/leds/led-core.c
> >> @@ -109,8 +109,8 @@ static void set_brightness_delayed(struct
> work_struct *ws)
> >>         int ret = 0;
> >>
> >>         if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags))
> {
> >> -               led_cdev->delayed_set_value = LED_OFF;
> >>                 led_stop_software_blink(led_cdev);
> >> +               led_cdev->delayed_set_value =
> >> + led_cdev->new_blink_brightness;
> >>         }
> >>
> >>         ret = __led_set_brightness(led_cdev,
> >> led_cdev->delayed_set_value); @@ -239,15 +239,24 @@ void
> led_set_brightness(struct led_classdev *led_cdev,
> >>                  * work queue task to avoid problems in case we are called
> >>                  * from hard irq context.
> >>                  */
> >> +               led_cdev->new_blink_brightness = brightness;
> >>                 if (brightness == LED_OFF) {
> >>                         set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
> >>                         schedule_work(&led_cdev->set_brightness_work);
> >>                 } else {
> >>                         set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
> >>                                 &led_cdev->work_flags);
> >> -                       led_cdev->new_blink_brightness = brightness;
> >>                 }
> >> -               return;
> >> +
> >> +               if (test_bit(LED_BLINK_SW, &led_cdev->work_flags)) {
> >> +                       /*
> >> +                        * Test LED_BLINK_SW again, to handle race condition
> >> +                        * with set_brightness_delayed(). If it's no longer
> >> +                        * set, then blink has just been stopped, so continue
> >> +                        * with led_set_brightness_nosleep() below.
> >> +                        */
> >> +                       return;
> >> +               }
> >>         }
> >>
> >>         led_set_brightness_nosleep(led_cdev, brightness);
> >
> > It will be also prone to races. Every solution not employing mutual
> > exclusive section will be. I'm starting to think if the best we can do
> > isn't just preventing brightness setting when blink disable is
> > pending.

It's not good behaviour to prevent brightness setting when blink disable is pending. That would be an API that "may or may not" do what the caller wants, depending on indeterminate timing. That would force the caller to insert an indeterminate delay between led_set_brightness(led_cdev, LED_OFF) and led_set_brightness(led_cdev, LED_FULL), to achieve a transition from flashing to steady-on. That would be an unwieldy API.

I think this is about as good as I can manage with a minimal change to the existing code, without using a spinlock. Honestly, I think led-core.c needs to be reworked to use a spinlock for led_set_brightness(), led_blink_set() et al. The code as it stands is very difficult to analyse/maintain confidently with its absence of any locking.

> > diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index
> > fd83c7f..9c775a2 100644
> > --- a/drivers/leds/led-core.c
> > +++ b/drivers/leds/led-core.c
> > @@ -228,6 +228,11 @@ EXPORT_SYMBOL_GPL(led_stop_software_blink);
> >  void led_set_brightness(struct led_classdev *led_cdev,
> >                         enum led_brightness brightness)  {
> > +       if (test_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
> > +               dev_err(led_cdev->dev,
> > +                       "Setting an LED's brightness failed - blink
> > disable pending\n");
> > +               return;
> > +       }
> 
> Of course it will not work too since we can be preempted here by the other
> process that will set LED_BLINK_DISABLE.
> After waking up this one will not be aware of the flag state change.

That could only occur if led_set_brightness() is being called from more than one process. Is that a practical scenario? It doesn't seem sensible to have multiple processes controlling one LED.

Or, do you mean preempted by set_brightness_delayed(), which _clears_ LED_BLINK_DISABLE? In that case, it would also clear LED_BLINK_SW (via the call to led_stop_software_blink()). So I think it would behave as expected. It's more difficult to analyse in an SMP system, but I think in that case the worst that can happen is __led_set_brightness() gets called twice, redundantly.

> >         /*
> >          * If software blink is active, delay brightness setting
> >          * until the next timer tick.
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug when using both "set" and "blink" functions
  2017-11-28 23:40             ` Craig McQueen
@ 2017-11-29 20:45               ` Jacek Anaszewski
  0 siblings, 0 replies; 17+ messages in thread
From: Jacek Anaszewski @ 2017-11-29 20:45 UTC (permalink / raw)
  To: Craig McQueen; +Cc: linux-leds

On 11/29/2017 12:40 AM, Craig McQueen wrote:
> Jacek Anaszewski wrote:
>> On 11/28/2017 10:35 PM, Jacek Anaszewski wrote:
>>> On 11/28/2017 05:32 AM, Craig McQueen wrote:
>>>> Jacek Anaszewski wrote:
>>>>> On 11/27/2017 07:51 AM, Craig McQueen wrote:
>>>>>> Jacek Anaszewski wrote:
>>>>>>> I checked earlier discussed case i.e. disabling trigger and
>>>>>>> immediately setting brightness to non-zero value and it seems to
>>>>>>> work fine. Checked with uleds
>>>>>>> driver:
>>>>>>>
>>>>>>> # echo "timer" > trigger
>>>>>>> # echo 0 > brightness; echo 100 > brightness
>>>>>>>
>>>>>>> Brightness always is 100 after that, even when setting delay_on/off to
>> 1.
>>>>>>
>>>>>> I still think there's a race condition, but you don't discover it
>>>>>> when using
>>>>> userspace, because there's sufficient time between setting the
>>>>> brightness to
>>>>> 0 and later 100, for LED_BLINK_DISABLE to be processed in-between.
>>>>>>
>>>>>> But I'm using a custom kernel driver that tries to go from blinking
>>>>>> to steady-
>>>>> on using these function calls:
>>>>>>
>>>>>> 		led_trigger_event(trigger, LED_OFF);   /* which calls
>>>>> led_set_brightness(led_cdev, LED_OFF) */
>>>>>> 		led_trigger_event(trigger, LED_FULL);  /* which calls
>>>>>> led_set_brightness(led_cdev, LED_FULL) */
>>>>>>
>>>>>> In this case, LED_BLINK_DISABLE is not yet processed before the
>>>>>> second
>>>>> function call, so the second function call becomes ineffective,
>>>>> because the pending LED_BLINK_DISABLE is later processed, turning
>>>>> off the LED and leaving it off.
>>>>>
>>>>> Have you verified it by inserting printks in the
>>>>> led_set_brightness() and brightness_set op of your driver?
>>>>
>>>> No, I didn't verify it. I deduced it by code review of led-core.c, and its use
>> of LED_BLINK_DISABLE in led_set_brightness() and
>> set_brightness_delayed().
>>>>
>>>> Then, I made a patch (in an earlier email) which modified
>> led_set_brightness(), and that fixed the issue. However, I was concerned
>> that there still could be a race condition if set_brightness_delayed()
>> happened to run in the middle of a call to led_set_brightness(led_cdev,
>> LED_FULL). In that case, the call led_set_brightness(led_cdev, LED_FULL)
>> could still fail.
>>>>
>>>>> Regardless of that, I strongly advise backporting the mainline patches.
>>>>> If that doesn't help then make sure that brightness_set op of your
>>>>> driver implements proper locking scheme.
>>>>
>>>> The problem seems to be nothing to do with the brightness_set op of the
>> driver.
>>>>
>>>>> If that doesn't help too, then you could try to come up with your
>>>>> patch for the LED core, that fixes the issue for you.
>>>>> It can be hard to address your particular case otherwise.
>>>>
>>>> This code in led-core.c seems quite tricky to avoid any race conditions,
>> without using a spinlock or semaphore. Here is another proposal which I am
>> considering:
>>>>
>>>> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index
>>>> ef1360445413..5fe7826deab2 100644
>>>> --- a/drivers/leds/led-core.c
>>>> +++ b/drivers/leds/led-core.c
>>>> @@ -109,8 +109,8 @@ static void set_brightness_delayed(struct
>> work_struct *ws)
>>>>         int ret = 0;
>>>>
>>>>         if (test_and_clear_bit(LED_BLINK_DISABLE, &led_cdev->work_flags))
>> {
>>>> -               led_cdev->delayed_set_value = LED_OFF;
>>>>                 led_stop_software_blink(led_cdev);
>>>> +               led_cdev->delayed_set_value =
>>>> + led_cdev->new_blink_brightness;
>>>>         }
>>>>
>>>>         ret = __led_set_brightness(led_cdev,
>>>> led_cdev->delayed_set_value); @@ -239,15 +239,24 @@ void
>> led_set_brightness(struct led_classdev *led_cdev,
>>>>                  * work queue task to avoid problems in case we are called
>>>>                  * from hard irq context.
>>>>                  */
>>>> +               led_cdev->new_blink_brightness = brightness;
>>>>                 if (brightness == LED_OFF) {
>>>>                         set_bit(LED_BLINK_DISABLE, &led_cdev->work_flags);
>>>>                         schedule_work(&led_cdev->set_brightness_work);
>>>>                 } else {
>>>>                         set_bit(LED_BLINK_BRIGHTNESS_CHANGE,
>>>>                                 &led_cdev->work_flags);
>>>> -                       led_cdev->new_blink_brightness = brightness;
>>>>                 }
>>>> -               return;
>>>> +
>>>> +               if (test_bit(LED_BLINK_SW, &led_cdev->work_flags)) {
>>>> +                       /*
>>>> +                        * Test LED_BLINK_SW again, to handle race condition
>>>> +                        * with set_brightness_delayed(). If it's no longer
>>>> +                        * set, then blink has just been stopped, so continue
>>>> +                        * with led_set_brightness_nosleep() below.
>>>> +                        */
>>>> +                       return;
>>>> +               }
>>>>         }
>>>>
>>>>         led_set_brightness_nosleep(led_cdev, brightness);
>>>
>>> It will be also prone to races. Every solution not employing mutual
>>> exclusive section will be. I'm starting to think if the best we can do
>>> isn't just preventing brightness setting when blink disable is
>>> pending.
> 
> It's not good behaviour to prevent brightness setting when blink disable is pending. That would be an API that "may or may not" do what the caller wants, depending on indeterminate timing. That would force the caller to insert an indeterminate delay between led_set_brightness(led_cdev, LED_OFF) and led_set_brightness(led_cdev, LED_FULL), to achieve a transition from flashing to steady-on. That would be an unwieldy API.

I agree.

> I think this is about as good as I can manage with a minimal change to the existing code, without using a spinlock. Honestly, I think led-core.c needs to be reworked to use a spinlock for led_set_brightness(), led_blink_set() et al. The code as it stands is very difficult to analyse/maintain confidently with its absence of any locking.

OK, so maybe this is a good opportunity to get rid of this painful
design allowing for calling led_set_brightness() from hard irq
context. AFAIK there are no mainline drivers doing that, and if there
are some, it should not be a problem to rework them to defer LED
brightness setting. This is to be checked, and all such drivers would
have to be updated in the same patch as the change to the led-core.c

The modifications would have to include:
- addition of spin_lock_bh() sections to:
	* led_timer_function()
	* set_brightness_delayed()
	* led_blink_set()
	* led_blink_set_oneshot()
	* led_stop_software_blink()
	* led_set_brightness()
	* led_set_brightness_nopm()
- reworking some of the above functions to simplify the code:
	* at least LED_BLINK_DISABLE flag won't be longer needed
        * add BUG_ON(in_irq()) to led_set_brightness()

>>> diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index
>>> fd83c7f..9c775a2 100644
>>> --- a/drivers/leds/led-core.c
>>> +++ b/drivers/leds/led-core.c
>>> @@ -228,6 +228,11 @@ EXPORT_SYMBOL_GPL(led_stop_software_blink);
>>>  void led_set_brightness(struct led_classdev *led_cdev,
>>>                         enum led_brightness brightness)  {
>>> +       if (test_bit(LED_BLINK_DISABLE, &led_cdev->work_flags)) {
>>> +               dev_err(led_cdev->dev,
>>> +                       "Setting an LED's brightness failed - blink
>>> disable pending\n");
>>> +               return;
>>> +       }
>>
>> Of course it will not work too since we can be preempted here by the other
>> process that will set LED_BLINK_DISABLE.
>> After waking up this one will not be aware of the flag state change.
> 
> That could only occur if led_set_brightness() is being called from more than one process. Is that a practical scenario? It doesn't seem sensible to have multiple processes controlling one LED.

Think of LED triggers. More than one driver can call
led_trigger_event() for the same trigger..

> Or, do you mean preempted by set_brightness_delayed(), which _clears_ LED_BLINK_DISABLE? In that case, it would also clear LED_BLINK_SW (via the call to led_stop_software_blink()). So I think it would behave as expected. It's more difficult to analyse in an SMP system, but I think in that case the worst that can happen is __led_set_brightness() gets called twice, redundantly.
> 
>>>         /*
>>>          * If software blink is active, delay brightness setting
>>>          * until the next timer tick.
>>>
> 
> 

-- 
Best regards,
Jacek Anaszewski

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-11-29 20:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-22  0:55 Bug when using both "set" and "blink" functions Craig McQueen
2017-11-22  3:36 ` Craig McQueen
2017-11-22 12:36 ` Pavel Machek
2017-11-23  0:14   ` Craig McQueen
2017-11-22 19:53 ` Jacek Anaszewski
2017-11-23  0:55   ` Craig McQueen
2017-11-23 21:36     ` Jacek Anaszewski
2017-11-24  5:23       ` Craig McQueen
2017-11-24 20:13         ` Jacek Anaszewski
2017-11-25 21:42 ` Jacek Anaszewski
2017-11-27  6:51   ` Craig McQueen
2017-11-27 19:26     ` Jacek Anaszewski
2017-11-28  4:32       ` Craig McQueen
2017-11-28 21:35         ` Jacek Anaszewski
2017-11-28 21:44           ` Jacek Anaszewski
2017-11-28 23:40             ` Craig McQueen
2017-11-29 20:45               ` Jacek Anaszewski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.