linux-rtc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Infinite loop on edge cases
       [not found] <CADjOgT97-8PM1=Nm+Mk4jw_8vzt3i9KXUohDAkUsptxS7JuU6Q@mail.gmail.com>
@ 2019-04-09  9:01 ` Alexandre Belloni
  2019-04-09 10:46   ` Mastro Gippo
  0 siblings, 1 reply; 6+ messages in thread
From: Alexandre Belloni @ 2019-04-09  9:01 UTC (permalink / raw)
  To: Mastro Gippo; +Cc: a.zummo, linux-rtc

Hi,

On 09/04/2019 10:44:27+0200, Mastro Gippo wrote:
> Hello, I have been testing the latest RTC drivers on an OpenWrt build on a
> MTK7688 SOC and I got an infinite loop on boot that would block the entire
> system.

You should not use hctosys that would solve you boot issue.

> Basically, if I don't have a RTC connected, any read would return 0xFF.
> That may be also be a problem of the I2C driver, that instead of generating
> an error just returns that value. But this may happen in other cases, like
> if we add an EEPROM or another IC with the same address as the RTC, or the
> RTC IC fails in some unpredictable way.
> The current implementation of that piece of code has some bad practices
> that lead to infinite loops due to this error.

You have to fix the i2c bus driver because this one has to return an
error so regmap can handle it properly.

Regarding other cases, like adding an EEPROM or another IC with the same
address as the RTC. Then the answer is to fix your device tree. If it
doesn't properly describe your hardware, then you shouldn't expect Linux
to run properly.

But again, this infinite loop doesn't have to be fatal anyway, as long
as you are not using hctosys.

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Infinite loop on edge cases
  2019-04-09  9:01 ` Infinite loop on edge cases Alexandre Belloni
@ 2019-04-09 10:46   ` Mastro Gippo
  2019-04-09 11:27     ` Alexandre Belloni
  0 siblings, 1 reply; 6+ messages in thread
From: Mastro Gippo @ 2019-04-09 10:46 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: a.zummo, linux-rtc

Hi Alexandre,
thank you for your comments.
Since failure behaviour is not specified in the Maxim datasheet, the
gracious way to handle this would be a retry counter, as that would
also solve the problem of a DS1307 that for some reason doesn't set
the CH bit (maybe an internal oscillator failure?). A similar erratic
behavior happened to me once, due to an unstable power supply to the
RTC.
IMHO, I think that a hardware failure and/or a problematic I2C driver
should not produce an infinite loop - be it fatal or not.
I already solved my boot issue with this patch, and I will ship it
with the devices I sell, but may I ask you what approach would you
recommend instead of hctosys? I tried looking online for solutions but
couldn't find an "official" one. I will gladly investigate better
approaches if available.

Thank you,
Best regards
Cristiano Griletti - Mastro Gippo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Infinite loop on edge cases
  2019-04-09 10:46   ` Mastro Gippo
@ 2019-04-09 11:27     ` Alexandre Belloni
  2019-04-09 14:35       ` Mastro Gippo
  0 siblings, 1 reply; 6+ messages in thread
From: Alexandre Belloni @ 2019-04-09 11:27 UTC (permalink / raw)
  To: Mastro Gippo; +Cc: a.zummo, linux-rtc

On 09/04/2019 12:46:20+0200, Mastro Gippo wrote:
> Hi Alexandre,
> thank you for your comments.
> Since failure behaviour is not specified in the Maxim datasheet, the
> gracious way to handle this would be a retry counter, as that would
> also solve the problem of a DS1307 that for some reason doesn't set
> the CH bit (maybe an internal oscillator failure?). A similar erratic
> behavior happened to me once, due to an unstable power supply to the
> RTC.

Ok, I had a closer look at the tissue, the solution here is to not
bother and simply not handle DS1307_BIT_CH, DS1338_BIT_OSF,
DS1340_BIT_nEOSC, DS1340_BIT_OSF, MCP794XX_BIT_VBATEN or
MCP794XX_BIT_ST. they should only be handled in the read_time and
set_time callbacks, were they really matters. I'll send a patch for
that.

> IMHO, I think that a hardware failure and/or a problematic I2C driver
> should not produce an infinite loop - be it fatal or not.

This is correct and this has to be fixed. I must admit I was too quick
and didn't see this was in the probe function.

> I already solved my boot issue with this patch, and I will ship it
> with the devices I sell, but may I ask you what approach would you
> recommend instead of hctosys? I tried looking online for solutions but
> couldn't find an "official" one. I will gladly investigate better
> approaches if available.
> 

Many distributions have initscripts that are using hwclock to set the
system time at boot and save it at shutdown. The kernel is not doing a
good job at setting the system time and saving it for technical and
historical reasons. hctosys and systohc should not be used.


-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Infinite loop on edge cases
  2019-04-09 11:27     ` Alexandre Belloni
@ 2019-04-09 14:35       ` Mastro Gippo
  2019-04-15 15:34         ` Alexandre Belloni
  0 siblings, 1 reply; 6+ messages in thread
From: Mastro Gippo @ 2019-04-09 14:35 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: a.zummo, linux-rtc

Thanks Alexandre, let me know if you need help testing the patches
when they're available.
I can "simulate" various failure modes on my hardware.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Infinite loop on edge cases
  2019-04-09 14:35       ` Mastro Gippo
@ 2019-04-15 15:34         ` Alexandre Belloni
  0 siblings, 0 replies; 6+ messages in thread
From: Alexandre Belloni @ 2019-04-15 15:34 UTC (permalink / raw)
  To: Mastro Gippo; +Cc: a.zummo, linux-rtc

Hi,

On 09/04/2019 16:35:29+0200, Mastro Gippo wrote:
> Thanks Alexandre, let me know if you need help testing the patches
> when they're available.
> I can "simulate" various failure modes on my hardware.

Did you have a chance to test the patch I sent?


-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Infinite loop on edge cases
@ 2019-04-09  8:46 Mastro Gippo
  0 siblings, 0 replies; 6+ messages in thread
From: Mastro Gippo @ 2019-04-09  8:46 UTC (permalink / raw)
  To: linux-rtc

[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]

Hello, I have been testing the latest RTC drivers on an OpenWrt build
on a MTK7688 SOC and I got an infinite loop on boot that would block
the entire system.
Basically, if I don't have a RTC connected, any read would return
0xFF. That may be also be a problem of the I2C driver, that instead of
generating an error just returns that value. But this may happen in
other cases, like if we add an EEPROM or another IC with the same
address as the RTC, or the RTC IC fails in some unpredictable way.
The current implementation of that piece of code has some bad
practices that lead to infinite loops due to this error.
Since on these RTCs the byte at address 0x00 will never be 0xFF (check
the datasheet), I added a sanity check for this special case.
A better way to do this would be to add a retry counter when setting
the CH bit, but that's  an even more edge case that will probably
never happen.
The same infinite loop will happen on other RTC ICs that do a similar
check in other parts of code, but I don't have them so I can't edit
the code and test it.
The code has been tested and works; please consider merging it.
Thanks,
Best regards
Cristiano Griletti - Mastro Gippo

[-- Attachment #2: 0001-Fixed-infinite-loop-if-RTC-is-broken-disconnected.patch --]
[-- Type: application/octet-stream, Size: 1841 bytes --]

From 57c81bb7c9c15d8d1f29ec1e5e2c5cdc980fb943 Mon Sep 17 00:00:00 2001
From: Cristiano Griletti - Mastro Gippo <gipmad@gmail.com>
Date: Mon, 8 Apr 2019 17:10:09 +0200
Subject: [PATCH] Fixed infinite loop if RTC is broken/disconnected

If the RTC IC is missing or there's some other kind of error in the I2C bus reading wrong data, reading address 0x00 may always return 0xFF without returning an [err] code at line 1710.
When this happens, we get stuck in an infinite loop, stopping the device from booting and continuously printing this error:
"rtc-ds1307 0-0068: SET TIME!"
I added a sanity check of the byte returned from address 0x00, as it can never be 0xFF in normal operation and that can only mean that there's a read error (see datasheet).
This has been tested only with DS1307, but I'm sure the same bug will happen on the other devices where there is a "goto read_rtc;" (lines 1755 and 1788). I don't have the hardware to test them so I didn't touch that code.
---
 drivers/rtc/rtc-ds1307.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
index 07530fe..db56539 100644
--- a/drivers/rtc/rtc-ds1307.c
+++ b/drivers/rtc/rtc-ds1307.c
@@ -1727,9 +1727,16 @@ static int ds1307_probe(struct i2c_client *client,
 	case m41t11:
 		/* clock halted?  turn it on, so clock can tick. */
 		if (tmp & DS1307_BIT_CH) {
-			regmap_write(ds1307->regmap, DS1307_REG_SECS, 0);
-			dev_warn(ds1307->dev, "SET TIME!\n");
-			goto read_rtc;
+			if (tmp == 0xFF) {
+				/* read error; avoid loop. */
+				dev_dbg(ds1307->dev, "read error DS1307_REG_SECS\n", err);
+				goto exit;
+			}
+			else {
+				regmap_write(ds1307->regmap, DS1307_REG_SECS, 0);
+				dev_warn(ds1307->dev, "SET TIME!\n");
+				goto read_rtc;
+			}
 		}
 		break;
 	case ds_1308:
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-04-15 15:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CADjOgT97-8PM1=Nm+Mk4jw_8vzt3i9KXUohDAkUsptxS7JuU6Q@mail.gmail.com>
2019-04-09  9:01 ` Infinite loop on edge cases Alexandre Belloni
2019-04-09 10:46   ` Mastro Gippo
2019-04-09 11:27     ` Alexandre Belloni
2019-04-09 14:35       ` Mastro Gippo
2019-04-15 15:34         ` Alexandre Belloni
2019-04-09  8:46 Mastro Gippo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).