watchdog:alim1535_wdt: Fix data race in ali_settimer() concerning ali_timeout_bits variable. variable.
diff mbox series

Message ID 20190718155238.3066-1-mbalant3@gmail.com
State New
Headers show
Series
  • watchdog:alim1535_wdt: Fix data race in ali_settimer() concerning ali_timeout_bits variable. variable.
Related show

Commit Message

Mark Balantzyan July 18, 2019, 3:52 p.m. UTC
---
 drivers/watchdog/alim1535_wdt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Guenter Roeck July 18, 2019, 4:34 p.m. UTC | #1
On Thu, Jul 18, 2019 at 08:52:38AM -0700, Mark Balantzyan wrote:
> ---

Subject and description are all messed up.

>  drivers/watchdog/alim1535_wdt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/watchdog/alim1535_wdt.c b/drivers/watchdog/alim1535_wdt.c
> index 60f0c2eb..4ba2b860 100644
> --- a/drivers/watchdog/alim1535_wdt.c
> +++ b/drivers/watchdog/alim1535_wdt.c
> @@ -107,6 +107,7 @@ static void ali_keepalive(void)
>  
>  static int ali_settimer(int t)
>  {
> +    spin_lock(&ali_lock);
>      if (t < 0)
>          return -EINVAL;
>      else if (t < 60)
> @@ -117,7 +118,7 @@ static int ali_settimer(int t)
>          ali_timeout_bits = (t / 300)|(1 << 6)|(1 << 7);
>      else
>          return -EINVAL;

This return and the return above will exit the function with the
spinlock still active, which will guarantee a hangup if/when the
function is re-entered.

> -
> +    spin_unlock(&ali_lock);
>      timeout = t;

timeout is still unprotected and may have no relation to the
stored value of ali_timeout_bits.

Overall your patch would introduce much more severe problems
than the problem it tries to fix, and it doesn't even completely
fix that problem either.

I would suggest to leave the driver alone, unless you have the hardware
to test your changes. And, if you do, it would be much more valuable
to convert the driver to use the watchdog subsystem.

Thanks,
Guenter
Mark Balantzyan July 31, 2019, 4:17 p.m. UTC | #2
Hi Guenter,

If it's not too much too ask, I also propose to rewrite alim1535_wdt to 
use the watchdog subsystem as I believe we are making progress toward the 
similar end in pc87413_wdt, as my evaluation ends in some weeks.

Thank you,
Mark
Guenter Roeck July 31, 2019, 4:43 p.m. UTC | #3
Hi Mark,

On Wed, Jul 31, 2019 at 09:17:13AM -0700, Mark Balantzyan wrote:
> Hi Guenter,
> 
> If it's not too much too ask, I also propose to rewrite alim1535_wdt to use
> the watchdog subsystem as I believe we are making progress toward the
> similar end in pc87413_wdt, as my evaluation ends in some weeks.
> 

Please, no. We still have ways to go with that one driver, and we'll be
stuck with a patch which I can't accept because of lack of testing.
I (and you) really need to talk to your evaluators why they ask you
to make those changes. This is highly inappropriate. The Linux kernel
is not an feasible target for such an evaluation. This is setting
you up for failure, and it is a waste of both your and my time.

Are you really working for or on belalf of the Linux Foundation ?
They should know better. And if Google is involved, I am embarassed
for my employer.  If they really want people to do work like this,
they should also provide reviewers and coaching staff. They should
most definitely not expect kernel maintainers to do it for them.

Thanks,
Guenter
Mark Balantzyan July 31, 2019, 5:04 p.m. UTC | #4
My employer (and yes, I am working for the Linux Foundation) has me 
working on analysing race condition warnings in the Linux kernel. They 
have a driver verification project running under the umbrella of the ELISA 
project involved in the research, investigation, experimentation, and 
establishment of linux kernel verification measures and tools.

I actually do have assigned to me coaches and/or mentors that I have been 
corresponding with. They are aware of what is going on and are being cc'd 
to (most of) our emails.

pc87413_wdt was detected by our race condition analysis tool as having 
warning. Even outside this work we've been doing, I've been trying to apply
the reasoning of the race condition analysis tool to different kernel modules,
as part of my menteeship.

I hope you can respect that this is a process primarily for learning and 
experimentation. I'm sorry if I'm creating too much work for you at once. 
If so, let me know and I'll try to spread it out.

Thank you,
Mark
Mark Balantzyan July 31, 2019, 6:23 p.m. UTC | #5
Hi Guenter, all,

It's alright if you still don't wish to review my patch on alim1535_wdt, 
but my employer and I, using our race condition analysis tool, detected it 
to contain a race condition warning. I believe any possible issues could 
be resolved if it were rewritten to use the watchdog subsystem as you've 
previously stated.

Now, I don't wish to bother you too much, but it seems I forgot to work 
mainly with my assigned mentor prior to submitting patches..sorry. So, 
after I have worked on rewriting the alim1535 driver into common watchdog 
subsystem with my mentor, may I submit it to you then?

Thank you,
Mark
Guenter Roeck July 31, 2019, 7:24 p.m. UTC | #6
Mark,

On Wed, Jul 31, 2019 at 11:23:19AM -0700, Mark Balantzyan wrote:
> Hi Guenter, all,
> 
> It's alright if you still don't wish to review my patch on alim1535_wdt, but
> my employer and I, using our race condition analysis tool, detected it to
> contain a race condition warning. I believe any possible issues could be
> resolved if it were rewritten to use the watchdog subsystem as you've
> previously stated.
> 
> Now, I don't wish to bother you too much, but it seems I forgot to work
> mainly with my assigned mentor prior to submitting patches..sorry. So, after
> I have worked on rewriting the alim1535 driver into common watchdog
> subsystem with my mentor, may I submit it to you then?
> 

Similar to pc87413, this driver very likely has zero users left, there
won't be any hardware to test the patch, and we won't be able to accept
such a patch because it wasn't tested.

On top of that, the only race condition I can see in that driver is in
ali_settimer(), between ali_timeout_bits and timeout. Yet, that is not
really a race condition because the driver can only be opened once,
and thus there is no means for two threads entering ali_settimer()
at the same time.

I don't really understand this focus on fixing theoretic/irrelevant
race conditions in drivers which no one uses anymore. Maybe someone
can enlighten me ?

Thanks,
Guenter
Mark Balantzyan July 31, 2019, 11:37 p.m. UTC | #7
Hi Guenter, all,

 	I don't really understand this focus on fixing theoretic/irrelevant
 	race conditions in drivers which no one uses anymore. Maybe someone
 	can enlighten me ?

In conjunction with linuxtesting.org and The Linux Foundation, I've been 
enlisted to test and work on helping to test tools they use for 
reliability testing of linux-based systems. In particular, two tools, 
RaceHound (whose command is 'lines2insns' and which isolates race 
conditions in kernel modules) and Klever, which is browser-based, are
under continual development by the Center and I aim to help them improve
their throughput by aiding in investigating where, in the automated nature
particularly of Klever (requiring considerable configuration as well),
there may areas to improve.

Hence, yes, a large amount of our findings result in manifesting as 
theoretical and possible only, but relevant to improving the tools 
nonetheless.

Hope that helps with the 'enlightening' :), regards,
Mark

Patch
diff mbox series

diff --git a/drivers/watchdog/alim1535_wdt.c b/drivers/watchdog/alim1535_wdt.c
index 60f0c2eb..4ba2b860 100644
--- a/drivers/watchdog/alim1535_wdt.c
+++ b/drivers/watchdog/alim1535_wdt.c
@@ -107,6 +107,7 @@  static void ali_keepalive(void)
 
 static int ali_settimer(int t)
 {
+    spin_lock(&ali_lock);
     if (t < 0)
         return -EINVAL;
     else if (t < 60)
@@ -117,7 +118,7 @@  static int ali_settimer(int t)
         ali_timeout_bits = (t / 300)|(1 << 6)|(1 << 7);
     else
         return -EINVAL;
-
+    spin_unlock(&ali_lock);
     timeout = t;
     return 0;
 }