imx25 ADC values wrong after sporadic time

* imx25 ADC values wrong after sporadic time
@ 2019-01-24 10:48 Benjamin Beckmeyer
  2019-01-26 17:58 ` Jonathan Cameron
  2019-01-30  8:04 ` Benjamin Beckmeyer
  0 siblings, 2 replies; 3+ messages in thread
From: Benjamin Beckmeyer @ 2019-01-24 10:48 UTC
  To: linux-iio

Hey all,

I have a problem with an i.MX25 device, specifically with its ADC. The ADC driver is already built as a kernel module (so it can be reloaded when the error occurs), and at first everything works fine. Then the ADC suddenly delivers wrong values, and even reloading the kernel module doesn't fix it.

The interesting part: the problem is so sporadic that the devices in our company never show it; it only occurs on our customers' devices. Even there, some devices run for 2 months and others for only a few hours.

I got dmesg output from a customer (on a device where the error is currently present). I think the last line is the only interesting part, at least for the ADC.

[467450.903249] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[613458.872789] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x5bec3543 reg = 0x00000000
[2974587.954034] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x5c103c70 reg = 0x00000000
[3149932.971010] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x5c12e961 reg = 0x00000000
[4212751.737165] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[4648608.098370] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[5089481.865850] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[5609097.665957] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[6126834.383266] iio iio:device0: ADC wait for measurement failed

So there was a timeout while the driver was waiting for an interrupt to complete, if I'm right.

The message never shows up again, even though the ADC values are read every 200 ms or so.

So my thinking is that this has something to do with my error. But the messages before the ADC message show the same kind of timeout in a similar function. So maybe there is a deeper problem somewhere?
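
To compare the RTC and ADC messages, it can help to put them on a common timeline. The following is a minimal sketch, not part of the original report: it parses the dmesg lines quoted above, counts the messages per driver, and converts each timestamp (seconds since boot) into days of uptime. The regular expression and the function names are assumptions made for this illustration.

```python
import re
from collections import Counter

# dmesg lines quoted in the post above.
SAMPLE = """\
[467450.903249] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[613458.872789] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x5bec3543 reg = 0x00000000
[2974587.954034] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x5c103c70 reg = 0x00000000
[3149932.971010] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x5c12e961 reg = 0x00000000
[4212751.737165] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[4648608.098370] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[5089481.865850] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[5609097.665957] imxdi_rtc 53ffc000.dryice: Write-wait timeout val = 0x00000000 reg = 0x00000004
[6126834.383266] iio iio:device0: ADC wait for measurement failed
"""

# Assumed line layout: "[seconds.micros] <driver> <device>: <message>"
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+(?P<dev>\S+):\s+(?P<msg>.*)$"
)

SECONDS_PER_DAY = 86400.0


def parse_dmesg(text):
    """Return a list of (timestamp_seconds, driver, device, message) tuples."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            events.append((float(m["ts"]), m["driver"], m["dev"], m["msg"]))
    return events


def count_by_driver(events):
    """Count how many messages each driver logged."""
    return Counter(driver for _, driver, _, _ in events)


def uptime_days(ts):
    """Convert a dmesg timestamp (seconds since boot) to days."""
    return ts / SECONDS_PER_DAY


EVENTS = parse_dmesg(SAMPLE)

for ts, driver, dev, msg in EVENTS:
    print(f"{uptime_days(ts):7.2f} d  {driver:10s} {dev:16s} {msg}")
print(dict(count_by_driver(EVENTS)))
```

Run against a full log, a listing like this makes it easier to see whether the ADC failure follows an RTC timeout closely or is separated from it by days.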

I'm running Linux kernel 4.14.95 at the moment. At this point I'm not able to reproduce the error myself; only that friendly customer is helping us.

What I can say is that the earlier kernel version, 3.7.2, used a custom kernel driver module for this ADC, which worked fine for years and still does. But when I came in, the device moved to the current kernel, and I wanted to use the existing Linux driver.

What I have changed at this point is that the driver is running in POWER MODE instead of POWER SAVE MODE. 

I'm sure the driver is working properly, but then after an unknown time it suddenly starts to return wrong values. At first, while it runs properly, it returns values close to the maximum of 4095, and then suddenly values of almost 0, though not exactly 0.
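
The failure signature described here (readings near 4095 that suddenly fall to almost 0) could be caught in the field with a small userspace watcher, so that the exact moment of the drop can be matched against dmesg. The following is a minimal sketch under several assumptions: the sysfs path and channel name (`in_voltage0_raw` on `iio:device0`), the two thresholds, and all function names are hypothetical and would need to be adapted to the real channel and signal.

```python
import time
from pathlib import Path

# Assumptions for illustration only: channel path and band thresholds.
RAW_PATH = Path("/sys/bus/iio/devices/iio:device0/in_voltage0_raw")
HIGH_MIN = 3900   # "close to the max value of 4095" (assumed threshold)
LOW_MAX = 200     # "almost 0 but not only 0" (assumed threshold)


def classify(raw):
    """Sort a raw ADC reading into the 'high', 'low' or 'mid' band."""
    if raw >= HIGH_MIN:
        return "high"
    if raw <= LOW_MAX:
        return "low"
    return "mid"


def find_collapse(samples):
    """Return the index of the first 'low' sample seen after any 'high'
    sample, or None if no such drop occurs."""
    seen_high = False
    for i, raw in enumerate(samples):
        band = classify(raw)
        if band == "high":
            seen_high = True
        elif band == "low" and seen_high:
            return i
    return None


def poll(path=RAW_PATH, interval=0.2, count=10):
    """Read the raw value `count` times, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append(int(path.read_text().strip()))
        time.sleep(interval)
    return samples


if __name__ == "__main__":
    if RAW_PATH.exists():
        readings = poll()
        idx = find_collapse(readings)
        if idx is None:
            print("no drop seen in", readings)
        else:
            print(f"drop at sample {idx}: {readings}")
    else:
        print(f"{RAW_PATH} not found; adjust RAW_PATH for the target device")
```

The 0.2 s default interval mirrors the roughly 200 ms read period mentioned in the post. Logging a wall-clock timestamp together with the drop would allow it to be lined up with kernel messages.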

So do any of you have an idea what we can do about it, or how we can get closer to the problem? Any help would be appreciated. In the next few days I want to check whether the RTC of the device is running properly, because of the dmesg output. Maybe that could lead me to a deeper problem with the interrupt controller. But this is only guesswork.

Best Regards,

Benjamin Beckmeyer
