linux-kernel.vger.kernel.org archive mirror
* 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
@ 2006-01-14 19:23 Andrey Borzenkov
  2006-01-14 21:20 ` [lm-sensors] " Jean Delvare
  0 siblings, 1 reply; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-14 19:23 UTC (permalink / raw)
  To: lm-sensors; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vanilla 2.6.15 on Toshiba Portege 4000. I get constant messages in dmesg:

i2c_adapter i2c-0: Error: command never completed
lm90 0-004c: Register 0x1 read failed (-1)
i2c_adapter i2c-0: Error: command never completed
lm90 0-004c: Register 0x14 read failed (-1)
i2c_adapter i2c-0: Error: command never completed
lm90 0-004c: Register 0x8 read failed (-1)
i2c_adapter i2c-0: Error: command never completed
lm90 0-004c: Register 0x0 read failed (-1)

for quite a number of registers. Apparently I can still read the sensors just
fine, but I am uneasy seeing those.

{pts/1}% lspci
00:00.0 Host bridge: ALi Corporation M1644/M1644T Northbridge+Trident (rev 01)
00:01.0 PCI bridge: ALi Corporation PCI to AGP Controller
00:02.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
00:04.0 IDE interface: ALi Corporation M5229 IDE (rev c3)
00:06.0 Multimedia audio controller: ALi Corporation M5451 PCI AC-Link Controller Audio Device (rev 01)
00:07.0 ISA bridge: ALi Corporation M1533 PCI to ISA Bridge [Aladdin IV]
00:08.0 Bridge: ALi Corporation M7101 Power Management Controller [PMU]
00:0a.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
00:10.0 CardBus bridge: Texas Instruments PCI1410 PC card Cardbus Controller (rev 01)
00:11.0 CardBus bridge: Toshiba America Info Systems ToPIC100 PCI to Cardbus Bridge with ZV Support (rev 32)
00:11.1 CardBus bridge: Toshiba America Info Systems ToPIC100 PCI to Cardbus Bridge with ZV Support (rev 32)
00:12.0 System peripheral: Toshiba America Info Systems SD TypA Controller (rev 03)
01:00.0 VGA compatible controller: Trident Microsystems CyberBlade XPAi1 (rev 82)
{pts/1}% sensors
eeprom-i2c-0-50
Adapter: SMBus ALI1535 adapter at ef00
Memory type:            SDR SDRAM DIMM
Memory size (MB):       256

adm1032-i2c-0-4c
Adapter: SMBus ALI1535 adapter at ef00
M/B Temp:    +43°C  (low  =   -65°C, high =  +127°C)
CPU Temp:  +47.6°C  (low  = +43.0°C, high = +51.0°C)   ALARM
M/B Crit:   +127°C  (hyst =  +122°C)
CPU Crit:   +100°C  (hyst =   +95°C)


- -andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDyU+3R6LMutpd94wRAhIpAJ9jAaVmEx6v3FF5f7pDvmD/Xu7GnQCeO/5O
RSvVH1lgezCRTdrAQdLD0js=
=i2dt
-----END PGP SIGNATURE-----


* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-14 19:23 2.6.15: lm90 0-004c: Register 0x13 read failed (-1) Andrey Borzenkov
@ 2006-01-14 21:20 ` Jean Delvare
  2006-01-14 21:45   ` Andrey Borzenkov
  0 siblings, 1 reply; 11+ messages in thread
From: Jean Delvare @ 2006-01-14 21:20 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: lm-sensors, linux-kernel

Hi Andrey,

> Vanilla 2.6.15 on Toshiba Portege 4000. I get constant messages in dmesg:
> 
> i2c_adapter i2c-0: Error: command never completed
> lm90 0-004c: Register 0x1 read failed (-1)
> i2c_adapter i2c-0: Error: command never completed
> lm90 0-004c: Register 0x14 read failed (-1)
> i2c_adapter i2c-0: Error: command never completed
> lm90 0-004c: Register 0x8 read failed (-1)
> i2c_adapter i2c-0: Error: command never completed
> lm90 0-004c: Register 0x0 read failed (-1)
> 
> for quite a number of registers. Apparently I can read sensors just fine still 
> I am uneasy seeing those.

Before 2.6.15, the lm90 driver did not handle read errors in any way,
so they were probably already there, you simply were not aware of it.
However, I guess that you already had the "command never completed"
errors? These come from the i2c-ali1535 bus driver.

It would be possible to add a retry-on-failure mechanism in the lm90
driver. However, the real problem is more likely in the i2c-ali1535
driver so fixing this one driver would be preferable.
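
For illustration only, a minimal sketch of what a retry-on-failure wrapper
could look like. This is not lm90 code: fake_read_byte() and
read_reg_retry() are hypothetical names, and the fake reader simply fails a
fixed number of times to imitate a transient "command never completed"
error.

```c
#include <assert.h>

/* Hypothetical stand-in for a bus read: returns 0..255 on success and
   -1 on failure. It fails fake_failures_left times before succeeding. */
static int fake_failures_left = 2;

static int fake_read_byte(int reg)
{
	if (fake_failures_left > 0) {
		fake_failures_left--;
		return -1;
	}
	return (reg ^ 0x5a) & 0xff;	/* arbitrary but deterministic value */
}

/* Call read_byte() up to max_tries times. Returns the first successful
   value, or the last error code if every attempt failed. */
static int read_reg_retry(int (*read_byte)(int), int reg, int max_tries)
{
	int val = -1;
	int i;

	assert(max_tries > 0);
	for (i = 0; i < max_tries; i++) {
		val = read_byte(reg);
		if (val >= 0)
			break;
	}
	return val;
}
```

With two injected failures, read_reg_retry(fake_read_byte, 0x08, 3) succeeds
on the third attempt. A wrapper like this only hides the symptom, which is
why fixing the bus driver is the better option.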

> eeprom-i2c-0-50
> Adapter: SMBus ALI1535 adapter at ef00
> Memory type:            SDR SDRAM DIMM
> Memory size (MB):       256
> 
> adm1032-i2c-0-4c
> Adapter: SMBus ALI1535 adapter at ef00
> M/B Temp:    +43°C  (low  =   -65°C, high =  +127°C)
> CPU Temp:  +47.6°C  (low  = +43.0°C, high = +51.0°C)   ALARM
> M/B Crit:   +127°C  (hyst =  +122°C)
> CPU Crit:   +100°C  (hyst =   +95°C)

Do you also have "command never completed" errors without an associated
error from the lm90 driver? This would suggest that the eeprom driver
too is triggering errors, which in turn would confirm that we need to
fix the i2c-ali1535 driver rather than adding a workaround to the lm90
driver.

It looks like the i2c-ali1535 driver as it exists in the lm_sensors CVS
repository (for Linux 2.4 kernels) did receive a major change in March
2005. These changes were supposed to "fix stability problems" (by
adding delay loops pretty much everywhere). They were never ported to
the Linux 2.6 version of the driver. Maybe we should try doing so now.

This is a 400-line patch, and porting it won't be trivial: I am not
familiar with this driver myself and I don't have a chip to test my
changes on. If someone else wants to take his/her chance, go ahead. If
not, I'll do it.

Andrey, will you be able to test an i2c-ali1535 patch if we come up with
one?

-- 
Jean Delvare


* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-14 21:20 ` [lm-sensors] " Jean Delvare
@ 2006-01-14 21:45   ` Andrey Borzenkov
  2006-01-15 19:12     ` Andrey Borzenkov
  0 siblings, 1 reply; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-14 21:45 UTC (permalink / raw)
  To: Jean Delvare; +Cc: lm-sensors, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 15 January 2006 00:20, Jean Delvare wrote:
> Hi Andrey,
>
> > Vanilla 2.6.15 on Toshiba Portege 4000. I get constant messages in dmesg:
> >
> > i2c_adapter i2c-0: Error: command never completed
> > lm90 0-004c: Register 0x1 read failed (-1)
> > i2c_adapter i2c-0: Error: command never completed
> > lm90 0-004c: Register 0x14 read failed (-1)
> > i2c_adapter i2c-0: Error: command never completed
> > lm90 0-004c: Register 0x8 read failed (-1)
> > i2c_adapter i2c-0: Error: command never completed
> > lm90 0-004c: Register 0x0 read failed (-1)
> >
> > for quite a number of registers. Apparently I can read sensors just fine
> > still I am uneasy seeing those.
>
> Before 2.6.15, the lm90 driver did not handle read errors in any way,
> so they were probably already there, you simply were not aware of it.
> However, I guess that you already had the "command never completed"
> errors? These come from the i2c-ali1535 bus driver.
>

Before 2.6.15 I ran Mandriva kernel 2.6.12-12mdk. I do not remember them, but
maybe I just never actually looked in dmesg :)

> It would be possible to add a retry-on-failure mechanism in the lm90
> driver. However, the real problem is more likely in the i2c-ali1535
> driver so fixing this one driver would be preferable.
>
> > eeprom-i2c-0-50
> > Adapter: SMBus ALI1535 adapter at ef00
> > Memory type:            SDR SDRAM DIMM
> > Memory size (MB):       256
> >
> > adm1032-i2c-0-4c
> > Adapter: SMBus ALI1535 adapter at ef00
> > M/B Temp:    +43°C  (low  =   -65°C, high =  +127°C)
> > CPU Temp:  +47.6°C  (low  = +43.0°C, high = +51.0°C)   ALARM
> > M/B Crit:   +127°C  (hyst =  +122°C)
> > CPU Crit:   +100°C  (hyst =   +95°C)
>
> Do you also have "command never completed" errors without an associated
> error from the lm90 driver?

yes, on boot.

> This would suggest that the eeprom driver 
> too is triggering errors, which in turn would confirm that we need to
> fix the i2c-ali1535 driver rather than adding a workaround to the lm90
> driver.
>
> It looks like the i2c-ali1535 driver as it exists in the lm_sensors CVS
> repository (for Linux 2.4 kernels) did receive a major change in March
> 2005. These changes were supposed to "fix stability problems" (by
> adding delay loops pretty much everywhere). They were never ported to
> the Linux 2.6 version of the driver. Maybe we should try doing so now.
>
> This is a 400 lines patch, porting it won't be trivial, I am not
> familiar with this driver myself and I don't have a chip to test my
> changes on, so if someone else wants to take his/her chance, go. If
> not, I'll do it.
>
> Andrey, will you be able to test a i2c-ali1535 patch if we come up with
> one?

Yes. Send me a patch (or give a link) and I'll see what I can do to port it.
I'll ask if I have a question :)

BTW that reminds me: I actually have two 256M modules. Sensors shows just one.
Both are from Toshiba, so it is unlikely that one does not have SPD. Any idea
why eeprom does not find the second one? Oh, and it was the same when I had
two 128M modules, so it is unlikely to be caused by the modules.

Thank you for the reply.

- -andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDyXD6R6LMutpd94wRAsIqAJwP5CdEisSKsA/iGqv2ouZ58xLe8ACgvRIY
WfuwZrsE996ZEtSoYvElgnQ=
=SSCR
-----END PGP SIGNATURE-----


* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-14 21:45   ` Andrey Borzenkov
@ 2006-01-15 19:12     ` Andrey Borzenkov
  2006-01-15 19:48       ` Andrey Borzenkov
  0 siblings, 1 reply; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-15 19:12 UTC (permalink / raw)
  To: Jean Delvare; +Cc: lm-sensors, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 15 January 2006 00:45, Andrey Borzenkov wrote:
> > It looks like the i2c-ali1535 driver as it exists in the lm_sensors CVS
> > repository (for Linux 2.4 kernels) did receive a major change in March
> > 2005. These changes were supposed to "fix stability problems" (by
> > adding delay loops pretty much everywhere). They were never ported to
> > the Linux 2.6 version of the driver. Maybe we should try doing so now.

Do you mean revision 1.21 with date: 2005/03/27 02:22:10;  author: mds? I
checked, and this one seems to be in the current 2.6.15.1 kernel. I did not
check whether there were any omissions compared with CVS, but the current
kernel does contain and use ali1535_transaction(), added by the mentioned patch.

- -andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDyp6fR6LMutpd94wRAnqNAKCn3kW51rt3YrPatfVibeU1WPClvQCfTWAG
u3RJ0TZnP3izDyPS1HwbVg0=
=35MJ
-----END PGP SIGNATURE-----


* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-15 19:12     ` Andrey Borzenkov
@ 2006-01-15 19:48       ` Andrey Borzenkov
  2006-01-15 20:33         ` Rudolf Marek
  0 siblings, 1 reply; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-15 19:48 UTC (permalink / raw)
  To: Jean Delvare; +Cc: lm-sensors, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 15 January 2006 22:12, Andrey Borzenkov wrote:
> Do you mean revision 1.21 with date: 2005/03/27 02:22:10;  author: mds? I
> checked and this one seems to be in current 2.6.15.1 kernel. I did not
> check if there were any omissions comparing with CVS but current kernel
> does contain and use ali1535_transaction() added by mentioned patch.
>


I compiled i2c-ali1535 with debugging. I have two types of errors. The first
block is:

Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=10, CMD=03, ADD=99, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=9a, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=9a
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=9a, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=a0, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=00, CMD=03, ADD=a0, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=a0, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=00, CMD=03, ADD=a0, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=a2, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=a2
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=a2, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=a4, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=a4
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=a4, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=a6, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=a6
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=a6, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=a8, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=a8
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=a8, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=aa, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=aa
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=aa, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=ac, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=ac
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=ac, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=00, CMD=03, ADD=ae, DAT0=00, DAT1=10
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: no response or bus collision ADD=ae
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:17:53 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=44, TYP=00, CMD=03, ADD=ae, DAT0=00, DAT1=10
Jan 15 22:17:57 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=10, CMD=00, ADD=98, DAT0=00, DAT1=10
Jan 15 22:17:57 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=10, CMD=00, ADD=98, DAT0=00, DAT1=10

This appears to be simply probing for non-existent i2c ports (correct me if I
am wrong), presumably by the eeprom driver.

The second block contains errors from lm90 for different registers:

Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=10, CMD=01, ADD=99, DAT0=a0, DAT1=10
Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=10, CMD=01, ADD=99, DAT0=29, DAT1=10
Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=10, CMD=08, ADD=98, DAT0=29, DAT1=10
Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=04, TYP=10, CMD=08, ADD=98, DAT0=29, DAT1=10
Jan 15 22:24:02 cooker kernel: lm90 0-004c: Register 0x8 read failed (-1)
Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=10, CMD=07, ADD=98, DAT0=29, DAT1=10
Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=10, CMD=07, ADD=98, DAT0=29, DAT1=10

Here I do not see SMBus errors; it really appears that the i2c device did not
respond. OTOH it is interesting that there is no timeout. Apparently the
command completed without setting the DONE bit. As I have zero knowledge about
the hardware I cannot interpret it. Next, the driver resets the SMBus and it
works for some time again. Judging by comments in the source, it apparently
signifies a hung ali1535, not an external i2c device (it is using KILL, and
"this doesn't seem to clear the controller if an external device is hung").
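
To make the STS values in the logs above easier to read, here is a small
decoding sketch. It borrows the status bit values from the i2c-ali15x3.c
file attached later in this thread and assumes, without a datasheet to
confirm it, that the ali1535 status register uses the same layout. The
function name decode_sts() is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit values copied from the attached i2c-ali15x3.c; assumed to apply
   to the ali1535 as well. */
static const struct {
	unsigned char bit;
	const char *name;
} sts_bits[] = {
	{ 0x04, "IDLE" },
	{ 0x08, "BUSY" },
	{ 0x10, "DONE" },
	{ 0x20, "DEV" },	/* device error */
	{ 0x40, "COLL" },	/* collision or no response */
	{ 0x80, "TERM" },	/* terminated by abort */
};

/* Write the names of all set status bits into buf, separated by '|'.
   Returns buf. Output is truncated if buf is too small. */
static const char *decode_sts(unsigned char sts, char *buf, size_t len)
{
	size_t i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(sts_bits) / sizeof(sts_bits[0]); i++) {
		if (!(sts & sts_bits[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, sts_bits[i].name, len - strlen(buf) - 1);
	}
	return buf;
}
```

Under that assumption, STS=14 decodes to IDLE|DONE (a completed transfer),
STS=44 to IDLE|COLL (the probes of absent addresses), and the failing lm90
read ends with STS=04, that is IDLE with neither DONE nor any error bit set.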

I am ready to test any patch.

- -andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDyqb3R6LMutpd94wRAkvsAJ4/nD91TVzezwLIIcRzasBMjVbvewCeKxqa
I563XEGbgfGG239rAQZzJ/A=
=E7Yd
-----END PGP SIGNATURE-----


* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-15 19:48       ` Andrey Borzenkov
@ 2006-01-15 20:33         ` Rudolf Marek
  2006-01-15 20:58           ` Andrey Borzenkov
                             ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Rudolf Marek @ 2006-01-15 20:33 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: Jean Delvare, linux-kernel, lm-sensors

[-- Attachment #1: Type: text/plain, Size: 3169 bytes --]

Hello all,

> 
> this appears simply a probing for non-existent i2c ports (correct me if I am 
> wrong) presumably by eeprom driver.

Yes, I think you are right. (ADD/2 is the address of the chip that it tries to access.)
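
As a quick illustration of the ADD/2 rule, a minimal sketch follows. It
assumes the ADD value in the debug output is the usual I2C address byte,
with the 7-bit chip address in the upper bits and the read/write flag in
bit 0. The helper names are made up for this example.

```c
#include <assert.h>

/* Upper 7 bits of the address byte: the chip address (ADD/2). */
static unsigned int add_to_chip_addr(unsigned char add)
{
	return add >> 1;
}

/* Bit 0 of the address byte: 1 for a read, 0 for a write. */
static int add_is_read(unsigned char add)
{
	return add & 1;
}
```

With this reading, ADD=98 and ADD=99 are a write and a read at 0x4c (the
lm90 device 0-004c), ADD=a0 is 0x50 (the eeprom that was found), and
ADD=a2 through ADD=ae are the probes of 0x51 to 0x57.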

> Second block are errors from lm90 for different registers:
> 
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, 
> TYP=10, CMD=01, ADD=99, DAT0=a0, DAT1=10
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, 
> TYP=10, CMD=01, ADD=99, DAT0=29, DAT1=10
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, 
> TYP=10, CMD=08, ADD=98, DAT0=29, DAT1=10
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Error: command never 
> completed
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=04, 
> TYP=10, CMD=08, ADD=98, DAT0=29, DAT1=10
> Jan 15 22:24:02 cooker kernel: lm90 0-004c: Register 0x8 read failed (-1)
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, 
> TYP=10, CMD=07, ADD=98, DAT0=29, DAT1=10
> Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, 
> TYP=10, CMD=07, ADD=98, DAT0=29, DAT1=10
> 
> Here I do not see SMBus errors - it appears really that i2c device did not 
> respond. OTOH interesting is that there is no timeout. Apparently command 
> completed without setting DONE bit. As I have zero knowledge about hardware I 
> cannot interpret it. Next driver resets SMBus and it works for some time 
> again. Judging by comments in source, it apprently signifies hung ali1535, 
> not external i2c device (it is using KILL, and "this doesn't seem to clear 
> the controller if an external device is hung")

Well, it seems this ALi 15x3 may have the same hardware bug? It was already mentioned here:
http://www2.lm-sensors.nu/~lm78/readticket.cgi?ticket=2030

> In the log below you can see that the ALI15X3 chip seems to keep in idle-state
> without reporting "done", but it does not turn in "busy" state. I patched the
> driver to do the reset procedure (with ALI15X3_T_OUT) after the error, but
> afterwards, the chip turns to "busy" state until next reboot.

And it continued:

http://lists.lm-sensors.org/pipermail/lm-sensors/2005-October/013808.html

I asked for a patch, and what I received about a month later is a patch that works for them:

> Dear Rudolf,
> 
> unfortunately i do not have cvs installed on my machine. I hope it's okay if
> i send you the complete patched module (the only file i changed was the
> i2c-ali15x3.c) so you can do the patch yourself. Since i'm not a experienced
> driver developer i do not know what you ment with your last sentence and i
> did not find any remarks on the website.
> 
> However, feel free to contact me if you have still any questions.
> 
> This version works fine and without any problems over many days in our test
> system.
> 
> Regards,
> Claudio Klingler

I'm attaching it. (This is against the lm_sensors CVS, so it is the 2.4 driver.)

Since I don't own a motherboard with this chip (nor the datasheet) and the resulting driver was hard to read, I just left this issue.
I hope it can help now.

Regards
Rudolf

[-- Attachment #2: i2c-ali15x3.c --]
[-- Type: text/x-csrc, Size: 16990 bytes --]

/*
    ali15x3.c - Part of lm_sensors, Linux kernel modules for hardware
              monitoring
    Copyright (c) 1999  Frodo Looijaard <frodol@dds.nl> and
    Philip Edelbrock <phil@netroedge.com> and
    Mark D. Studebaker <mdsxyz123@yahoo.com>

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/

/*
    This is the driver for the SMB Host controller on
    Acer Labs Inc. (ALI) M1541 and M1543C South Bridges.

    The M1543C is a South bridge for desktop systems.
    The M1533 is a South bridge for portable systems.
    They are part of the following ALI chipsets:
       "Aladdin Pro 2": Includes the M1621 Slot 1 North bridge
       with AGP and 100MHz CPU Front Side bus
       "Aladdin V": Includes the M1541 Socket 7 North bridge
       with AGP and 100MHz CPU Front Side bus
       "Aladdin IV": Includes the M1541 Socket 7 North bridge
       with host bus up to 83.3 MHz.
    For an overview of these chips see http://www.acerlabs.com

    The M1533/M1543C devices appear as FOUR separate devices
    on the PCI bus. An output of lspci will show something similar
    to the following:

	00:02.0 USB Controller: Acer Laboratories Inc. M5237
	00:03.0 Bridge: Acer Laboratories Inc. M7101
	00:07.0 ISA bridge: Acer Laboratories Inc. M1533
	00:0f.0 IDE interface: Acer Laboratories Inc. M5229

    The SMB controller is part of the 7101 device, which is an
    ACPI-compliant Power Management Unit (PMU).

    The whole 7101 device has to be enabled for the SMB to work.
    You can't just enable the SMB alone.
    The SMB and the ACPI have separate I/O spaces.
    We make sure that the SMB is enabled. We leave the ACPI alone.

    This driver controls the SMB Host only.
    The SMB Slave controller on the M15X3 is not enabled.

    This driver does not use interrupts.
*/

/* Note: we assume there can only be one ALI15X3, with one SMBus interface */

/* #define DEBUG 1 */

#include <linux/module.h>
#include <linux/pci.h>
#include <linux/kernel.h>
#include <linux/stddef.h>
#include <linux/sched.h>
#include <linux/ioport.h>
#include <linux/i2c.h>
#include <linux/init.h>
#include <asm/io.h>
#include "version.h"
#include "sensors_compat.h"

/* ALI15X3 SMBus address offsets */
#define SMBHSTSTS	(0 + ali15x3_smba)
#define SMBHSTCNT	(1 + ali15x3_smba)
#define SMBHSTSTART	(2 + ali15x3_smba)
#define SMBHSTCMD	(7 + ali15x3_smba)
#define SMBHSTADD	(3 + ali15x3_smba)
#define SMBHSTDAT0	(4 + ali15x3_smba)
#define SMBHSTDAT1	(5 + ali15x3_smba)
#define SMBBLKDAT	(6 + ali15x3_smba)

/* PCI Address Constants */
#define SMBCOM		0x004
#define SMBBA		0x014
#define SMBATPC		0x05B	/* used to unlock xxxBA registers */
#define SMBHSTCFG	0x0E0
#define SMBSLVC		0x0E1
#define SMBCLK		0x0E2
#define SMBREV		0x008

/* Other settings */
#define MAX_TIMEOUT		50	/* times 1/100 sec */
#define MAX_RETRIES		3	/* maximum reset retries */
#define ALI15X3_SMB_IOSIZE	32

/* this is what the Award 1004 BIOS sets them to on a ASUS P5A MB.
   We don't use these here. If the bases aren't set to some value we
   tell user to upgrade BIOS and we fail.
*/
#define ALI15X3_SMB_DEFAULTBASE	0xE800

/* ALI15X3 address lock bits */
#define ALI15X3_LOCK		0x06

/* ALI15X3 command constants */
#define ALI15X3_ABORT		0x02
#define ALI15X3_T_OUT		0x04
#define ALI15X3_QUICK		0x00
#define ALI15X3_BYTE		0x10
#define ALI15X3_BYTE_DATA	0x20
#define ALI15X3_WORD_DATA	0x30
#define ALI15X3_BLOCK_DATA	0x40
#define ALI15X3_BLOCK_CLR	0x80

/* ALI15X3 status register bits */
#define ALI15X3_STS_IDLE	0x04
#define ALI15X3_STS_BUSY	0x08
#define ALI15X3_STS_DONE	0x10
#define ALI15X3_STS_DEV		0x20	/* device error */
#define ALI15X3_STS_COLL	0x40	/* collision or no response */
#define ALI15X3_STS_TERM	0x80	/* terminated by abort */
#define ALI15X3_STS_ERR		0xE0	/* all the bad error bits */


#define E_TIMEOUT	-2

#define	PRINTK(fmt, arg...) {;}


/* If force_addr is set to anything different from 0, we forcibly enable
   the device at the given address. */
static int force_addr = 0;
MODULE_PARM(force_addr, "i");
MODULE_PARM_DESC(force_addr,
		 "Initialize the base address of the i2c controller");

static unsigned short ali15x3_smba = 0;

static int ali15x3_setup(struct pci_dev *ALI15X3_dev)
{
	u16 a;
	unsigned char temp;

	/* Check the following things:
		- SMB I/O address is initialized
		- Device is enabled
		- We can use the addresses
	*/

	/* Unlock the register.
	   The data sheet says that the address registers are read-only
	   if the lock bits are 1, but in fact the address registers
	   are zero unless you clear the lock bits.
	*/
	pci_read_config_byte(ALI15X3_dev, SMBATPC, &temp);
	if (temp & ALI15X3_LOCK) {
		temp &= ~ALI15X3_LOCK;
		pci_write_config_byte(ALI15X3_dev, SMBATPC, temp);
	}

	/* Determine the address of the SMBus area */
	pci_read_config_word(ALI15X3_dev, SMBBA, &ali15x3_smba);
	ali15x3_smba &= (0xffff & ~(ALI15X3_SMB_IOSIZE - 1));
	if (ali15x3_smba == 0 && force_addr == 0) {
		dev_err(ALI15X3_dev, "ALI15X3_smb region uninitialized "
			"- upgrade BIOS or use force_addr=0xaddr\n");
		return -ENODEV;
	}

	if(force_addr)
		ali15x3_smba = force_addr & ~(ALI15X3_SMB_IOSIZE - 1);

	if (!request_region(ali15x3_smba, ALI15X3_SMB_IOSIZE, "ali15x3-smb")) {
		dev_err(ALI15X3_dev,
			"ALI15X3_smb region 0x%x already in use!\n",
			ali15x3_smba);
		return -ENODEV;
	}

	if(force_addr) {
		dev_info(ALI15X3_dev, "forcing ISA address 0x%04X\n",
			ali15x3_smba);
		if (PCIBIOS_SUCCESSFUL !=
		    pci_write_config_word(ALI15X3_dev, SMBBA, ali15x3_smba))
			return -ENODEV;
		if (PCIBIOS_SUCCESSFUL !=
		    pci_read_config_word(ALI15X3_dev, SMBBA, &a))
			return -ENODEV;
		if ((a & ~(ALI15X3_SMB_IOSIZE - 1)) != ali15x3_smba) {
			/* make sure it works */
			dev_err(ALI15X3_dev,
				"force address failed - not supported?\n");
			return -ENODEV;
		}
	}
	/* check if whole device is enabled */
	pci_read_config_byte(ALI15X3_dev, SMBCOM, &temp);
	if ((temp & 1) == 0) {
		dev_info(ALI15X3_dev, "enabling SMBus device\n");
		pci_write_config_byte(ALI15X3_dev, SMBCOM, temp | 0x01);
	}

	/* Is SMB Host controller enabled? */
	pci_read_config_byte(ALI15X3_dev, SMBHSTCFG, &temp);
	if ((temp & 1) == 0) {
		dev_info(ALI15X3_dev, "enabling SMBus controller\n");
		pci_write_config_byte(ALI15X3_dev, SMBHSTCFG, temp | 0x01);
	}

	/* set SMB clock to 74KHz as recommended in data sheet */
	/* ??CK This may be what causes the bus timeouts, so for now */
	/* we leave it at the default settings */
	pci_write_config_byte(ALI15X3_dev, SMBCLK, 0x20);

	/*
	  The interrupt routing for SMB is set up in register 0x77 in the
	  1533 ISA Bridge device, NOT in the 7101 device.
	  Don't bother with finding the 1533 device and reading the register.
	if ((....... & 0x0F) == 1)
		dev_dbg(ALI15X3_dev, "ALI15X3 using Interrupt 9 for SMBus.\n");
	*/
	pci_read_config_byte(ALI15X3_dev, SMBREV, &temp);
	dev_dbg(ALI15X3_dev, "SMBREV = 0x%X\n", temp);
	dev_dbg(ALI15X3_dev, "ALI15X3_smba = 0x%X\n", ali15x3_smba);

	return 0;
}

static int ali15x3_wait_for_idle (struct i2c_adapter *adap)
{
	int temp;
	int timeout;

	/* make sure SMBus is idle */
	temp = inb_p(SMBHSTSTS);
	for (timeout = 0;
	     (timeout < MAX_TIMEOUT) && !(temp & ALI15X3_STS_IDLE);
	     timeout++) {
		i2c_delay(1);
		temp = inb_p(SMBHSTSTS);
	}
	if (timeout >= MAX_TIMEOUT) {
		dev_err(adap, "Idle wait Timeout! STS=0x%02x\n", temp);
		return -1;
	}

	return 0;
}

static int ali15x3_do_reset (struct i2c_adapter *adap)
{
	int temp;
	
	/*
	   If the host controller is still busy, it may have timed out in the
	   previous transaction, resulting in a "SMBus Timeout" message.
	   I've tried the following to reset a stuck busy bit.
		1. Reset the controller with an ABORT command.
		   (this doesn't seem to clear the controller if an external
		   device is hung)
		2. Reset the controller and the other SMBus devices with a
		   T_OUT command.  (this clears the host busy bit if an
		   external device is hung, but it comes back upon a new access
		   to a device)
		3. Disable and reenable the controller in SMBHSTCFG
	   Worst case, nothing seems to work except power reset.
	*/
	/* Abort - reset the host controller */
	/*
	   Try resetting entire SMB bus, including other devices -
	   This may not work either - it clears the BUSY bit but
	   then the BUSY bit may come back on when you try and use the chip again.
	   If that's the case you are stuck.
	*/
	/* read the status first so the debug message does not print garbage */
	temp = inb_p(SMBHSTSTS);
	dev_dbg(adap, "Resetting entire SMB Bus (abort) to "
		"clear busy condition (%02x)\n", temp);
	outb_p(ALI15X3_T_OUT, SMBHSTCNT);
	i2c_delay(1);
	temp = inb_p(SMBHSTSTS);
	dev_dbg(adap, "Status after reset: %02x\n", temp);

	/* now check the error bits and the busy bit */
	if (temp & (ALI15X3_STS_ERR | ALI15X3_STS_BUSY)) {
		/* do a clear-on-write */
		dev_dbg(adap, "Doing a clear on write\n");
		outb_p(0xFF, SMBHSTSTS);
		if ((temp = inb_p(SMBHSTSTS)) &
		    (ALI15X3_STS_ERR | ALI15X3_STS_BUSY)) {
			/* this is probably going to be correctable only by a power reset
			   as one of the bits now appears to be stuck */
			/* This may be a bus or device with electrical problems. */
			dev_err(adap, "SMBus reset failed! (0x%02x) - "
				"controller or device on bus is probably hung\n",
				temp);
			return -1;
		}
	} else {
		/* check and clear done bit */
		if (temp & ALI15X3_STS_DONE) {
			dev_info(adap, "Resetting done flag\n");
			outb_p(temp, SMBHSTSTS);
		}
	}
	
	return 0;
}

/* Another internally used function */
static int ali15x3_transaction(struct i2c_adapter *adap)
{
	int temp;
	int result = 0;
	int timeout = 0;

	/*dev_dbg(adap, "Transaction (pre): STS=%02x, CNT=%02x, CMD=%02x, "
		"ADD=%02x, DAT0=%02x, DAT1=%02x\n", inb_p(SMBHSTSTS),
		inb_p(SMBHSTCNT), inb_p(SMBHSTCMD), inb_p(SMBHSTADD),
		inb_p(SMBHSTDAT0), inb_p(SMBHSTDAT1)); */

	/* start the transaction by writing anything to the start register */
	outb_p(0xFF, SMBHSTSTART);

	/* We will always wait for a fraction of a second! */
	timeout = 0;
	do {
		i2c_delay(1);
		temp = inb_p(SMBHSTSTS);
	} while ((!(temp & (ALI15X3_STS_ERR | ALI15X3_STS_DONE)))
		 && (timeout++ < MAX_TIMEOUT));

	/* If the SMBus is still busy, we give up */
	if (timeout >= MAX_TIMEOUT) {
		//dev_warn(adap, "SMBus Timeout, doing a reset\n");
		ali15x3_do_reset(adap);
		return E_TIMEOUT;
	}

	if (temp & ALI15X3_STS_TERM) {
		result = -1;
		//dev_dbg(adap, "Error: Failed bus transaction\n");
	}

	/*
	  Unfortunately the ALI SMB controller maps "no response" and "bus
	  collision" into a single bit. No response is the usual case so don't
	  do a PRINTK.
	  This means that bus collisions go unreported.
	*/
	if (temp & ALI15X3_STS_COLL) {
		result = -1;
		/*dev_dbg(adap,
			"Error: no response or bus collision ADD=%02x\n",
			inb_p(SMBHSTADD));  */
	}

	/* haven't ever seen this */
	if (temp & ALI15X3_STS_DEV) {
		result = -1;
		dev_err(adap, "Error: device error\n");
	}
	dev_dbg(adap, "Transaction (post): STS=%02x, CNT=%02x, CMD=%02x, "
		"ADD=%02x, DAT0=%02x, DAT1=%02x\n", inb_p(SMBHSTSTS),
		inb_p(SMBHSTCNT), inb_p(SMBHSTCMD), inb_p(SMBHSTADD),
		inb_p(SMBHSTDAT0), inb_p(SMBHSTDAT1));
	return result;
}

/* Return -1 on error. */
static s32 ali15x3_access(struct i2c_adapter * adap, u16 addr,
		   unsigned short flags, char read_write, u8 command,
		   int size, union i2c_smbus_data * data)
{
	int i, len;
	int retries;
	int tsize;
	
	for (retries = 0; retries < MAX_RETRIES; retries++)
	{	
		/* clear all the bits (clear-on-write) */
		outb_p(0xFF, SMBHSTSTS);
	
		/* ali15x3_do_reset(adap); */
		ali15x3_wait_for_idle(adap);
	
		switch (size) {
		case I2C_SMBUS_QUICK:
			outb_p(((addr & 0x7f) << 1) | (read_write & 0x01),
			SMBHSTADD);
			tsize = ALI15X3_QUICK;
			break;
		case I2C_SMBUS_BYTE:
			outb_p(((addr & 0x7f) << 1) | (read_write & 0x01),
			SMBHSTADD);
			if (read_write == I2C_SMBUS_WRITE)
				outb_p(command, SMBHSTCMD);
			tsize = ALI15X3_BYTE;
			break;
		case I2C_SMBUS_BYTE_DATA:
			outb_p(((addr & 0x7f) << 1) | (read_write & 0x01),
			SMBHSTADD);
			outb_p(command, SMBHSTCMD);
			if (read_write == I2C_SMBUS_WRITE)
				outb_p(data->byte, SMBHSTDAT0);
			tsize = ALI15X3_BYTE_DATA;
			break;
		case I2C_SMBUS_WORD_DATA:
			outb_p(((addr & 0x7f) << 1) | (read_write & 0x01),
			SMBHSTADD);
			outb_p(command, SMBHSTCMD);
			if (read_write == I2C_SMBUS_WRITE) {
				outb_p(data->word & 0xff, SMBHSTDAT0);
				outb_p((data->word & 0xff00) >> 8, SMBHSTDAT1);
			}
			tsize = ALI15X3_WORD_DATA;
			break;
		case I2C_SMBUS_BLOCK_DATA:
			outb_p(((addr & 0x7f) << 1) | (read_write & 0x01),
			SMBHSTADD);
			outb_p(command, SMBHSTCMD);
			if (read_write == I2C_SMBUS_WRITE) {
				len = data->block[0];
				if (len < 0) {
					len = 0;
					data->block[0] = len;
				}
				if (len > 32) {
					len = 32;
					data->block[0] = len;
				}
				outb_p(len, SMBHSTDAT0);
				/* Reset SMBBLKDAT */
				outb_p(inb_p(SMBHSTCNT) | ALI15X3_BLOCK_CLR, SMBHSTCNT);
				for (i = 1; i <= len; i++)
					outb_p(data->block[i], SMBBLKDAT);
			}
			tsize = ALI15X3_BLOCK_DATA;
			break;
		default:
			PRINTK
			(KERN_WARNING "i2c-ali15x3.o: Unsupported transaction %d\n", size);
			return -1;
		}
	
		outb_p(tsize, SMBHSTCNT);	/* output command */
	
		switch (ali15x3_transaction(adap))
		{
			case 0:
				break;	/* no error, continue */
			case E_TIMEOUT:	
				PRINTK(KERN_INFO "i2c-ali15x3.o: timeout, doing a retry (%d)\n", retries);
				continue;
			default:
				return -1;
		}
		
		break;
	}
	
	if (retries == MAX_RETRIES)
	{
		PRINTK(KERN_WARNING "i2c-ali15x3.o: transaction failed, maximum retries reached\n");	
		return -1;
	}
		
	if ((read_write == I2C_SMBUS_WRITE) || (size == I2C_SMBUS_QUICK))
		return 0;


	switch (tsize) {
	case ALI15X3_BYTE:	/* Result put in SMBHSTDAT0 */
		data->byte = inb_p(SMBHSTDAT0);
		break;
	case ALI15X3_BYTE_DATA:
		data->byte = inb_p(SMBHSTDAT0);
		break;
	case ALI15X3_WORD_DATA:
		data->word = inb_p(SMBHSTDAT0) + (inb_p(SMBHSTDAT1) << 8);
		break;
	case ALI15X3_BLOCK_DATA:
		len = inb_p(SMBHSTDAT0);
		if (len > 32)
			len = 32;
		data->block[0] = len;
		/* Reset SMBBLKDAT */
		outb_p(inb_p(SMBHSTCNT) | ALI15X3_BLOCK_CLR, SMBHSTCNT);
		for (i = 1; i <= data->block[0]; i++) {
			data->block[i] = inb_p(SMBBLKDAT);
			dev_dbg(adap, "Blk: len=%d, i=%d, data=%02x\n",
				len, i, data->block[i]);
		}
		break;
	}
	return 0;
}

static void ali15x3_inc(struct i2c_adapter *adapter)
{
#ifdef MODULE
	MOD_INC_USE_COUNT;
#endif
}

static void ali15x3_dec(struct i2c_adapter *adapter)
{
#ifdef MODULE
	MOD_DEC_USE_COUNT;
#endif
}

static u32 ali15x3_func(struct i2c_adapter *adapter)
{
	return I2C_FUNC_SMBUS_QUICK | I2C_FUNC_SMBUS_BYTE |
	    I2C_FUNC_SMBUS_BYTE_DATA | I2C_FUNC_SMBUS_WORD_DATA |
	    I2C_FUNC_SMBUS_BLOCK_DATA;
}

static struct i2c_algorithm smbus_algorithm = {
	.name		= "Non-I2C SMBus adapter",
	.id		= I2C_ALGO_SMBUS,
	.smbus_xfer	= ali15x3_access,
	.functionality	= ali15x3_func,
};

static struct i2c_adapter ali15x3_adapter = {
	.id		= I2C_ALGO_SMBUS | I2C_HW_SMBUS_ALI15X3,
	.algo		= &smbus_algorithm,
	.name		= "unset",
	.inc_use	= ali15x3_inc,
	.dec_use	= ali15x3_dec,
};

static struct pci_device_id ali15x3_ids[] __devinitdata = {
	{
	.vendor =	PCI_VENDOR_ID_AL,
	.device =	PCI_DEVICE_ID_AL_M7101,
	.subvendor =	PCI_ANY_ID,
	.subdevice =	PCI_ANY_ID,
	},
	{ 0, }
};

static int __devinit ali15x3_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
	if (ali15x3_setup(dev)) {
		dev_err(dev,
			"ALI15X3 not detected, module not inserted.\n");
		return -ENODEV;
	}

	snprintf(ali15x3_adapter.name, 32,
		"SMBus ALI15X3 adapter at %04x", ali15x3_smba);
	return i2c_add_adapter(&ali15x3_adapter);
}

static void __devexit ali15x3_remove(struct pci_dev *dev)
{
	i2c_del_adapter(&ali15x3_adapter);
	release_region(ali15x3_smba, ALI15X3_SMB_IOSIZE);
}

static struct pci_driver ali15x3_driver = {
	.name		= "ali15x3 smbus",
	.id_table	= ali15x3_ids,
	.probe		= ali15x3_probe,
	.remove		= __devexit_p(ali15x3_remove),
};

static int __init i2c_ali15x3_init(void)
{
	PRINTK("i2c-ali15x3.o version %s (%s)\n", LM_VERSION, LM_DATE);
	return pci_module_init(&ali15x3_driver);
}

static void __exit i2c_ali15x3_exit(void)
{
	pci_unregister_driver(&ali15x3_driver);
}

MODULE_AUTHOR ("Frodo Looijaard <frodol@dds.nl>, "
		"Philip Edelbrock <phil@netroedge.com>, "
		"and Mark D. Studebaker <mdsxyz123@yahoo.com>");
MODULE_DESCRIPTION("ALI15X3 SMBus driver");
MODULE_LICENSE("GPL");

module_init(i2c_ali15x3_init);
module_exit(i2c_ali15x3_exit);


* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-15 20:33         ` Rudolf Marek
@ 2006-01-15 20:58           ` Andrey Borzenkov
  2006-01-16 19:40           ` Andrey Borzenkov
  2006-01-27  4:15           ` Andrey Borzenkov
  2 siblings, 0 replies; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-15 20:58 UTC (permalink / raw)
  To: Rudolf Marek; +Cc: Jean Delvare, linux-kernel, lm-sensors

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 15 January 2006 23:33, Rudolf Marek wrote:
>
> Well it seems this ali 15x3 has maybe same hardware bug? It was mentioned
> already here: http://www2.lm-sensors.nu/~lm78/readticket.cgi?ticket=2030
>
> > In the log below you can see that the ALI15X3 chip seems to keep in
> > idle-state without reporting "done", but it does not turn in "busy"
> > state. I patched the driver to do the reset procedure (with
> > ALI15X3_T_OUT) after the error, but afterwards, the chip turns to "busy"
> > state until next reboot.
>

This is already done in i2c-ali1535 in the current kernel, so it looks like a
hardware issue that can at best be ignored. After the reset the SMBus continues
to work. The only question is whether we should provide an option to silence
those errors: assuming the user knows (s)he has a buggy controller, there is no
reason to spam dmesg with a known issue. Would such a patch be accepted? I would
emit the first occurrence of this error to let users know something is fishy
and suppress further ones. But this has to wait until next week; it is already
too late here.
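
A minimal sketch of the report-once idea, written as standalone userspace C
rather than driver code. All names are made up for illustration and none of
them come from i2c-ali1535.c; the counter exists only so the behaviour can be
checked. It prints the warning for the first recovered hang, stays quiet for
later ones, and always reports a hang that the retry could not fix.

```c
#include <stdbool.h>
#include <stdio.h>

/* All names here are illustrative; none come from i2c-ali1535.c. */
static bool hang_reported;	/* has the one-time warning been printed? */
static int messages_emitted;	/* counter, only so the behaviour can be checked */

/* Call after every hung transaction; retry_ok says whether the retry worked. */
static void report_hang(bool retry_ok)
{
	if (!retry_ok) {
		/* An unrecovered failure is always reported and re-arms the warning. */
		fprintf(stderr, "command failed even after controller reset\n");
		messages_emitted++;
		hang_reported = false;
		return;
	}
	if (!hang_reported) {
		fprintf(stderr, "command never completed, probably a hardware bug; "
			"retried after reset, further occurrences suppressed\n");
		messages_emitted++;
		hang_reported = true;
	}
}
```

Re-arming the warning after an unrecovered failure is a design choice of this
sketch, not something the thread has settled.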

Thank you for the information.

-andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDyrd9R6LMutpd94wRAuNVAKCwq+yTwvFt6jYLS1wL5pIDr68IMwCbBHb+
yXAnHp+jzVFW1ddKVbZVkY8=
=ABky
-----END PGP SIGNATURE-----

* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-15 20:33         ` Rudolf Marek
  2006-01-15 20:58           ` Andrey Borzenkov
@ 2006-01-16 19:40           ` Andrey Borzenkov
  2006-01-16 21:17             ` Rudolf Marek
  2006-01-27  4:15           ` Andrey Borzenkov
  2 siblings, 1 reply; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-16 19:40 UTC (permalink / raw)
  To: Rudolf Marek; +Cc: Jean Delvare, linux-kernel, lm-sensors

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 15 January 2006 23:33, Rudolf Marek wrote:
> Well it seems this ali 15x3 has maybe same hardware bug? It was mentioned
> already here: http://www2.lm-sensors.nu/~lm78/readticket.cgi?ticket=2030
>
[...]
> Since I dont own the motherboard with this chip (nor the datasheet) and the
> resulting driver was hard to read I just left this issue. I hope it can
> help now.

Actually it did. I realized that the 15x3 driver you sent attempts recovery
while the current 1535 driver does not. After some experiments I came up with
this patch (it is not meant for inclusion, only for discussion) that seems to
work. I had a hard time finding the exact place to retry the command, but now
I get

Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=10, CMD=01, ADD=99, DAT0=05, DAT1=10
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=10, CMD=01, ADD=99, DAT0=2c, DAT1=10
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=04, TYP=10, CMD=10, ADD=98, DAT0=2c, DAT1=10
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Error: command never completed
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=04, TYP=10, CMD=10, ADD=98, DAT0=2c, DAT1=10
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Adapter hung, retrying after reset
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Transaction (pre): STS=00, TYP=00, CMD=10, ADD=98, DAT0=2c, DAT1=10
Jan 16 22:20:14 cooker kernel: i2c_adapter i2c-0: Transaction (post): STS=14, TYP=00, CMD=10, ADD=98, DAT0=2c, DAT1=10

so it appears to recover nicely. Does it look like it returns the correct value
after the retry?

I intend to squash the errors, leaving only the first occurrence but making it
more verbose. Probably:

Error: command never completed. It is probably a hardware bug
Command will be retried after controller is reset
further occurrences of this error won't be reported as long as retry is
successful

Is the wording OK (I am not a native English speaker)?

regards

-andrey

--- linux-2.6.15/drivers/i2c/busses/i2c-ali1535.c	2006-01-03 06:21:10.000000000 +0300
+++ i2c-ali1535.c	2006-01-16 22:22:51.000000000 +0300
@@ -311,8 +311,8 @@ static int ali1535_transaction(struct i2
 	}
 
 	/* check to see if the "command complete" indication is set */
-	if (!(temp & ALI1535_STS_DONE)) {
-		result = -1;
+	if (!result && !(temp & ALI1535_STS_DONE)) {
+		result = -2;
 		dev_err(&adap->dev, "Error: command never completed\n");
 	}
 
@@ -344,6 +344,7 @@ static s32 ali1535_access(struct i2c_ada
 	int temp;
 	int timeout;
 	s32 result = 0;
+	int retry = 1;
 
 	down(&i2c_ali1535_sem);
 	/* make sure SMBus is idle */
@@ -360,6 +361,7 @@ static s32 ali1535_access(struct i2c_ada
 	/* clear status register (clear-on-write) */
 	outb_p(0xFF, SMBHSTSTS);
 
+retry:
 	switch (size) {
 	case I2C_SMBUS_PROC_CALL:
 		dev_err(&adap->dev, "I2C_SMBUS_PROC_CALL not supported!\n");
@@ -424,7 +426,14 @@ static s32 ali1535_access(struct i2c_ada
 		break;
 	}
 
-	if (ali1535_transaction(adap)) {
+	if (((result = ali1535_transaction(adap)) == -2) && retry--) {
+		/* Adapter hung and was reset; retry */
+		dev_dbg(&adap->dev, "Adapter hung, retrying after reset\n");
+		result = 0;
+		goto retry;
+	}
+
+	if (result) {
 		/* Error in transaction */
 		result = -1;
 		goto EXIT;
 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD4DBQFDy/aZR6LMutpd94wRAoBlAJ0ZLlhPMIBC5Fmz0Iw4NBoNjM7wfwCUCB0t
+sFjdErqBnZatcpLmiPTKA==
=MW/Q
-----END PGP SIGNATURE-----

* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-16 19:40           ` Andrey Borzenkov
@ 2006-01-16 21:17             ` Rudolf Marek
  2006-01-21 21:02               ` Andrey Borzenkov
  0 siblings, 1 reply; 11+ messages in thread
From: Rudolf Marek @ 2006-01-16 21:17 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: Jean Delvare, linux-kernel, lm-sensors, klingler

Hello all,

> Actually it did. I realized that 15x3 you sent attempted recovery while 
> current 1535 not. After some experiments I came up with this patch (it is not 
> meant for inclusion but only for discussion) that seems to work. I had hard 
> time finding the exact place where to retry command but now I get
> 
> so it appears to recover nicely. Does it look like it returns correct value 
> after retry?

That is possible. I may know why this is happening. I now have the datasheets
for both the ALi 1535 and the 15x3, and I found that there are special bits
that control when the bus is considered idle.

Those bits are in the PCI config space of the same device that holds the SMBus
base address.

For the ali 15x3 the register is located at 0xe2 and bits are:

Bit        Description
7-5 (001b) SMB Clock Select.
           [7:5] : "clock"
           000 :     149K
           001 :     74K (recommended)
           010 :      37K
           100 :     223K
           101 :     111K
           110 :     55K
           These three bits are used to select the base clock for internal state machine. All the
           timings will be based on this clock. The clock is derived from OSC14M.
4-3 (0h)   Idle Delay Setting.
           [4:3] :   "idle time"
           00 :       BaseClk*64 53.76 us ref. 1.19M base clock. (default)
           01 :       BaseClk*32
           10 :       BaseClk*128
           Others : Reserved
           These two bits are used to decide the idle time to qualify SMBus is in idle state. The
           time is calculated based on the base clock defined in bits[7:5].
2-0 (0h)   Reserved.

For the 1535 the register is at offset 0xF2:

Bit       Description
7-5 (001) The base clock referenced by the SMB host controller.
          000: 149K.
          001: 74K.
          010: 31K.
          100: 223K.
          101: 111K.
          110: 55K.
4-3       Bus Delay Timer Setting. The base clock is set in the previous field. This timer decides
          when the SMB bus is actually idle.
          00: Base Clock × 4.
          01: Base Clock × 2.
          10: Base Clock × 8.
          11: Reserved.
2-0       Reserved.

What is interesting is that both drivers set this to 0x20, overwriting the reserved bits - this is no good.
       /* set SMB clock to 74KHz as recommended in data sheet */
        pci_write_config_byte(dev, SMBCLK, 0x20);

Andrey and Claudio,
could you please send the output of lspci -d 10b9:7101 -x -x -x from before you load the ali driver?

You can both also try changing the delay a bit after the driver loads (or remove the line above that sets it).

for andrey (1535):  setpci -d 10b9:7101 f2.b=28
(this should set it to base*8)

for Claudio:
I don't know if you want to dig into this, but if you do, please try it with a driver that reports when it resets the controller.
setpci -d 10b9:7101 e2.b=28
(this should set it to base*128)
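
To make it easier to check which field a given byte touches, here is a minimal
userspace sketch that splits a clock register value according to the two
tables above. The struct and function names are made up for illustration, and
the layout (bits 7-5 clock, 4-3 idle delay, 2-0 reserved) is assumed to be
transcribed correctly from the datasheets.

```c
#include <stdint.h>

/* Field layout taken from the two register tables quoted above. */
struct smbclk_fields {
	unsigned clock_sel;	/* bits 7-5: base clock select */
	unsigned idle_sel;	/* bits 4-3: idle / bus delay select */
	unsigned reserved;	/* bits 2-0: reserved */
};

/* Split a raw register byte into its three fields. */
static struct smbclk_fields smbclk_decode(uint8_t reg)
{
	struct smbclk_fields f;

	f.clock_sel = (reg >> 5) & 0x07;
	f.idle_sel  = (reg >> 3) & 0x03;
	f.reserved  = reg & 0x07;
	return f;
}
```

Decoded this way, 0x20 is clock code 001 with idle code 00, and 0x28 is clock
code 001 with idle code 01. Idle code 10, which the tables list as the longest
delay, would correspond to 0x30, so if the tables are right that value may be
worth trying as well.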

When done, please load your chip driver, let it run, and observe whether it resets more or less often. You may play with the SMBus clock too if you want.
I hope this helps.

> I intend to squash errors, leaving only the first occurrence but making it more 
> verbose. Probably:
> 
> Error: command never completed. It is probably hardware bug
> Command will be retried after controller is reset
> further occurrences of this error won't be reported as long as retry is 
> successful
> 
> is wording OK (I am not native english speaker)?

I guess it would be best to emit some kind of error only after all retries have
failed, but the question is how to do it cleanly.
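
One possible shape for this, as a standalone userspace sketch rather than
driver code: the transaction is passed in as a callback, hung attempts are
retried silently, and a single message is printed only when every attempt has
hung. The -2 "never completed" code follows Andrey's patch; every name
prefixed with demo_ or DEMO_ and the retry count are assumptions made for the
example.

```c
#include <stdio.h>

#define DEMO_MAX_RETRIES 3	/* arbitrary choice for the sketch */
#define DEMO_HUNG	(-2)	/* "never completed" code, as in Andrey's patch */

/* Hypothetical transaction callback: returns 0, -1 (hard error) or DEMO_HUNG. */
typedef int (*demo_xfer_fn)(void *ctx);

static int demo_errors_logged;	/* only here so the behaviour can be checked */

/* Retry hung transactions silently; report once if every attempt hangs. */
static int demo_xfer_with_retry(demo_xfer_fn xfer, void *ctx)
{
	int attempt, result;

	for (attempt = 0; attempt <= DEMO_MAX_RETRIES; attempt++) {
		result = xfer(ctx);
		if (result != DEMO_HUNG)
			return result ? -1 : 0;	/* success or hard error: no retry */
	}

	fprintf(stderr, "command never completed after %d retries\n",
		DEMO_MAX_RETRIES);
	demo_errors_logged++;
	return -1;
}

/* Example callback: hangs as many times as *ctx says, then succeeds. */
static int demo_hang_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return DEMO_HUNG;
	}
	return 0;
}
```

With this structure the transaction function itself never prints anything for
a hang; only the caller that owns the retry budget does.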


> regards
> 
> -andrey
> 
> --- linux-2.6.15/drivers/i2c/busses/i2c-ali1535.c	2006-01-03 
> 06:21:10.000000000 +0300
> +++ i2c-ali1535.c	2006-01-16 22:22:51.000000000 +0300
> @@ -311,8 +311,8 @@ static int ali1535_transaction(struct i2
>  	}
>  
>  	/* check to see if the "command complete" indication is set */
> -	if (!(temp & ALI1535_STS_DONE)) {
> -		result = -1;
> +	if (!result && !(temp & ALI1535_STS_DONE)) {
> +		result = -2;
>  		dev_err(&adap->dev, "Error: command never completed\n");

Perhaps this dev_err can be moved down.

>  	}
>  
> @@ -344,6 +344,7 @@ static s32 ali1535_access(struct i2c_ada
>  	int temp;
>  	int timeout;
>  	s32 result = 0;
> +	int retry = 1;
>  
>  	down(&i2c_ali1535_sem);
>  	/* make sure SMBus is idle */
> @@ -360,6 +361,7 @@ static s32 ali1535_access(struct i2c_ada
>  	/* clear status register (clear-on-write) */
>  	outb_p(0xFF, SMBHSTSTS);
>  
> +retry:
>  	switch (size) {
>  	case I2C_SMBUS_PROC_CALL:
>  		dev_err(&adap->dev, "I2C_SMBUS_PROC_CALL not supported!\n");
> @@ -424,7 +426,14 @@ static s32 ali1535_access(struct i2c_ada
>  		break;
>  	}
>  
> -	if (ali1535_transaction(adap)) {
> +	if (((result = ali1535_transaction(adap)) == -2) && retry--) {
> +		/* Adapter hung and was reset; retry */
> +		dev_dbg(&adap->dev, "Adapter hung, retrying after reset\n");
> +		result = 0;
> +		goto retry;
> +	}
> +
> +	if (result) {

Perhaps test here whether result is -2 and tell the user that the command never completed?

>  		/* Error in transaction */
>  		result = -1;
>  		goto EXIT;
>  

That's all from me,

regards
Rudolf

* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-16 21:17             ` Rudolf Marek
@ 2006-01-21 21:02               ` Andrey Borzenkov
  0 siblings, 0 replies; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-21 21:02 UTC (permalink / raw)
  To: Rudolf Marek; +Cc: Jean Delvare, linux-kernel, lm-sensors, klingler

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday 17 January 2006 00:17, Rudolf Marek wrote:
> Hello all,
>
> > Actually it did. I realized that 15x3 you sent attempted recovery while
> > current 1535 not. After some experiments I came up with this patch (it is
> > not meant for inclusion but only for discussion) that seems to work. I
> > had hard time finding the exact place where to retry command but now I
> > get
> >
> > so it appears to recover nicely. Does it look like it returns correct
> > value after retry?
>
> This can be possible. I may know why this is happening. I have now the
> datasheets for both ali 1535 and 15x3.

Any chance I can see it (for 1535)?

> I found that there are special bits 
> that are used to control somehow when bus is considered idle.
>
> Those bits are in PCI config space of same device as the smbus base addr
> is.
>
> For the ali 15x3 the register is located at 0xe2 and bits are:
>
> Bit        Description
> 7-5 (001b) SMB Clock Select.
>            [7:5] : "clock"
>            000 :     149K
>            001 :     74K (recommended)
>            010 :      37K
>            100 :     223K
>            101 :     111K
>            110 :     55K
>            These three bits are used to select the base clock for internal
>            state machine. All the timings will be based on this clock. The
>            clock is derived from OSC14M.
> 4-3 (0h)   Idle Delay Setting.
>            [4:3] :   "idle time"
>            00 :       BaseClk*64 53.76 us ref. 1.19M base clock. (default)
>            01 :       BaseClk*32
>            10 :       BaseClk*128
>            Others : Reserved
>            These two bits are used to decide the idle time to qualify SMBus
>            is in idle state. The time is calculated based on the base clock
>            defined in bits[7:5].
> 2-0 (0h)   Reserved.
>
> For the 1535 is the register offset 0xF2
>
> Bit       Description
> 7-5 (001) The base clock referenced by the SMB host controller.
>           000: 149K.
>           001: 74K.
>           010: 31K.
>           100: 223K.
>           101: 111K.
>           110: 55K.
> 4-3       Bus Delay Timer Setting. The base clock is set in the previous
>           field. This timer decides when the SMB bus is actually idle.
>           00: Base Clock × 4.
>           01: Base Clock × 2.
>           10: Base Clock × 8.
>           11: Reserved.
> 2-0       Reserved.
>
> What is interesting is that both drivers set this to 0x20, overwriting the
> reserved bits - this is no good.

Fixed in attached patch.
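
For reference, the read-modify-write idea can be sketched in plain C as
follows. This is only an illustration with made-up names: it assumes bits 2-0
are the reserved field, as in the quoted table (the mask actually used in a
driver may differ), and it leaves out the pci_read_config_byte() and
pci_write_config_byte() calls that would surround it.

```c
#include <stdint.h>

#define SMBCLK_RESERVED_MASK	0x07	/* bits 2-0, per the quoted tables */

/* Build a new register value from field codes, keeping the reserved bits of old. */
static uint8_t smbclk_update(uint8_t old, unsigned clock_sel, unsigned idle_sel)
{
	return (uint8_t)((old & SMBCLK_RESERVED_MASK) |
			 ((clock_sel & 0x07) << 5) |
			 ((idle_sel & 0x03) << 3));
}
```

Passing the field codes separately, instead of a ready-made constant, makes it
harder to set a different idle code than intended.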

>        /* set SMB clock to 74KHz as recommended in data sheet */
>         pci_write_config_byte(dev, SMBCLK, 0x20);
>
> Andrey and Claudio,
> please can you send back output of lspci -d 10b9:7101 -x -x -x  before you
> load the ali driver?
>

It is exactly 0x20 as set by driver anyway.

> Also you both can try to change the delay a bit, after the driver loads (or
> kill the above line that sets it).
>
> for andrey (1535):  setpci -d 10b9:7101 f2.b=28
> (this should set it to base*8)
>

This did not completely eliminate the problems but made them far less frequent
than before. Combined with the patch below, it results in something like:

i2c_adapter i2c-0: Error: adapter did not idle after transaction
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: Error: command never completed
i2c_adapter i2c-0: Adapter hung, retrying after reset
i2c_adapter i2c-0: Error: adapter did not idle after transaction
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: Failed to execute command after 3 retries status: 00
lm90 0-004c: Register 0x1 read failed (-1)
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: adapter not idle before command; retrying
i2c_adapter i2c-0: Error: command never completed
i2c_adapter i2c-0: Adapter hung, retrying after reset

So sometimes it still cannot recover. Unfortunately I cannot do much at this
stage without the data sheet, as everything else is just guesswork.

> I guess best would be to to emit some kind of error after all retries, but
> question is how to do it cleanly.
>

Below is the proposed patch. After it has been sufficiently discussed and
tested I intend to replace most of the dev_{err,info,warn} calls in the retry
path with dev_dbg and leave only one message after the retry has failed. Could
you comment on it now that you have the datasheet? Is there a better way to
reset the adapter after an error?

regards

-andrey


Subject: [PATCH] ali1535 error recovery cleanup, PCI config fix

- fix interpretation of the BUSY flag. The old code apparently assumed it was
  asserted while a transaction was active. My tests show it is asserted
  in response to the transaction start command if the transaction could not
  be initiated successfully

- introduce retry logic. If a transaction does not complete, retry. The same
  goes for waiting for the idle condition. The number of retries is rather
  arbitrary.

- preserve reserved bits in PCI config byte f2 (clock timing), suggested
  by Rudolf Marek

- set bus delay multiplier to x8, suggested by Rudolf Marek

- restructure the overall code; after that it actually became very similar
  to the patch for ali15x3 from Claudio Klingler, except that I try to ensure
  the adapter is in a sane state before a command is started

Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>

---

 drivers/i2c/busses/i2c-ali1535.c |  203 ++++++++++++++++++++------------------
 1 files changed, 109 insertions(+), 94 deletions(-)

10f6fcf00b69ea2861eb2321e2f4d204a9fcfa2c
diff --git a/drivers/i2c/busses/i2c-ali1535.c b/drivers/i2c/busses/i2c-ali1535.c
index 3eb4789..ae48573 100644
--- a/drivers/i2c/busses/i2c-ali1535.c
+++ b/drivers/i2c/busses/i2c-ali1535.c
@@ -50,7 +50,6 @@
     This driver does not use interrupts.
 */
 
-
 /* Note: we assume there can only be one ALI1535, with one SMBus interface */
 
 #include <linux/module.h>
@@ -86,6 +85,9 @@
 
 /* Other settings */
 #define MAX_TIMEOUT		500	/* times 1/100 sec */
+#define MAX_RETRIES		3	/* times to retry hung transaction */
+#define STATUS_SET		1
+#define STATUS_UNSET		0
 #define ALI1535_SMB_IOSIZE	32
 
 #define ALI1535_SMB_DEFAULTBASE	0x8040
@@ -138,6 +140,22 @@ static struct pci_driver ali1535_driver;
 static unsigned short ali1535_smba;
 static DECLARE_MUTEX(i2c_ali1535_sem);
 
+static inline s32 ali1535_wait_for_status(int set, int status)
+{
+	int timeout = 0;
+	int temp = 0;
+
+	/* clear status register (clear-on-write) */
+	outb_p(0xFF, SMBHSTSTS);
+	do {
+		msleep(1);
+		timeout += 1;
+		temp = inb_p(SMBHSTSTS);
+	} while (!!(temp & status) != set && timeout <= MAX_TIMEOUT);
+
+	return temp;
+}
+
 /* Detect whether a ALI1535 can be found, and initialize it, where necessary.
    Note the differences between kernels with the old PCI BIOS interface and
    newer kernels with the real PCI interface. In compat.h some things are
@@ -184,7 +202,11 @@ static int ali1535_setup(struct pci_dev 
 	}
 
 	/* set SMB clock to 74KHz as recommended in data sheet */
-	pci_write_config_byte(dev, SMBCLK, 0x20);
+	/* set bus delay multiplier to x8 as suggested by Rudolf Marek
+	 * also preserve reserved bits (also from Rudolf Marek)
+	 */
+	pci_read_config_byte(dev, SMBCLK, &temp);
+	pci_write_config_byte(dev, SMBCLK, (temp & 3) | 0x28);
 
 	/*
 	  The interrupt routing for SMB is set up in register 0x77 in the
@@ -210,83 +232,18 @@ static int ali1535_transaction(struct i2
 {
 	int temp;
 	int result = 0;
-	int timeout = 0;
 
 	dev_dbg(&adap->dev, "Transaction (pre): STS=%02x, TYP=%02x, "
 		"CMD=%02x, ADD=%02x, DAT0=%02x, DAT1=%02x\n",
 		inb_p(SMBHSTSTS), inb_p(SMBHSTTYP), inb_p(SMBHSTCMD),
 		inb_p(SMBHSTADD), inb_p(SMBHSTDAT0), inb_p(SMBHSTDAT1));
 
-	/* get status */
-	temp = inb_p(SMBHSTSTS);
-
-	/* Make sure the SMBus host is ready to start transmitting */
-	/* Check the busy bit first */
-	if (temp & ALI1535_STS_BUSY) {
-		/* If the host controller is still busy, it may have timed out
-		 * in the previous transaction, resulting in a "SMBus Timeout"
-		 * printk.  I've tried the following to reset a stuck busy bit.
-		 *   1. Reset the controller with an KILL command. (this
-		 *      doesn't seem to clear the controller if an external
-		 *      device is hung)
-		 *   2. Reset the controller and the other SMBus devices with a
-		 *      T_OUT command. (this clears the host busy bit if an
-		 *      external device is hung, but it comes back upon a new
-		 *      access to a device)
-		 *   3. Disable and reenable the controller in SMBHSTCFG. Worst
-		 *      case, nothing seems to work except power reset.
-		 */
-
-		/* Try resetting entire SMB bus, including other devices - This
-		 * may not work either - it clears the BUSY bit but then the
-		 * BUSY bit may come back on when you try and use the chip
-		 * again.  If that's the case you are stuck.
-		 */
-		dev_info(&adap->dev,
-			"Resetting entire SMB Bus to clear busy condition (%02x)\n",
-			temp);
-		outb_p(ALI1535_T_OUT, SMBHSTTYP);
-		temp = inb_p(SMBHSTSTS);
-	}
-
-	/* now check the error bits and the busy bit */
-	if (temp & (ALI1535_STS_ERR | ALI1535_STS_BUSY)) {
-		/* do a clear-on-write */
-		outb_p(0xFF, SMBHSTSTS);
-		if ((temp = inb_p(SMBHSTSTS)) &
-		    (ALI1535_STS_ERR | ALI1535_STS_BUSY)) {
-			/* This is probably going to be correctable only by a
-			 * power reset as one of the bits now appears to be
-			 * stuck */
-			/* This may be a bus or device with electrical problems. */
-			dev_err(&adap->dev,
-				"SMBus reset failed! (0x%02x) - controller or "
-				"device on bus is probably hung\n", temp);
-			return -1;
-		}
-	} else {
-		/* check and clear done bit */
-		if (temp & ALI1535_STS_DONE) {
-			outb_p(temp, SMBHSTSTS);
-		}
-	}
-
 	/* start the transaction by writing anything to the start register */
 	outb_p(0xFF, SMBHSTPORT);
 
 	/* We will always wait for a fraction of a second! */
-	timeout = 0;
-	do {
-		msleep(1);
-		temp = inb_p(SMBHSTSTS);
-	} while (((temp & ALI1535_STS_BUSY) && !(temp & ALI1535_STS_IDLE))
- -		 && (timeout++ < MAX_TIMEOUT));
- -
- -	/* If the SMBus is still busy, we give up */
- -	if (timeout >= MAX_TIMEOUT) {
- -		result = -1;
- -		dev_err(&adap->dev, "SMBus Timeout!\n");
- -	}
+	temp = ali1535_wait_for_status(STATUS_SET,
+			ALI1535_STS_ERR | ALI1535_STS_DONE | ALI1535_STS_BUSY);
 
 	if (temp & ALI1535_STS_FAIL) {
 		result = -1;
@@ -311,9 +268,16 @@ static int ali1535_transaction(struct i2
 	}
 
 	/* check to see if the "command complete" indication is set */
- -	if (!(temp & ALI1535_STS_DONE)) {
- -		result = -1;
- -		dev_err(&adap->dev, "Error: command never completed\n");
+	if (!result) {
+		if (temp & ALI1535_STS_BUSY) {
+			result = -2;
+			dev_err(&adap->dev, "Error: adapter busy\n");
+		} else if (!(temp & ALI1535_STS_DONE)) {
+			result = -2;
+			dev_err(&adap->dev, "Error: command never completed\n");
+		}
+		if (!(temp & ALI1535_STS_IDLE))
+			dev_err(&adap->dev, "Error: adapter did not idle after transaction\n");
 	}
 
 	dev_dbg(&adap->dev, "Transaction (post): STS=%02x, TYP=%02x, "
@@ -321,41 +285,83 @@ static int ali1535_transaction(struct i2
 		inb_p(SMBHSTSTS), inb_p(SMBHSTTYP), inb_p(SMBHSTCMD),
 		inb_p(SMBHSTADD), inb_p(SMBHSTDAT0), inb_p(SMBHSTDAT1));
 
- -	/* take consequent actions for error conditions */
- -	if (!(temp & ALI1535_STS_DONE)) {
- -		/* issue "kill" to reset host controller */
- -		outb_p(ALI1535_KILL,SMBHSTTYP);
- -		outb_p(0xFF,SMBHSTSTS);
- -	} else if (temp & ALI1535_STS_ERR) {
- -		/* issue "timeout" to reset all devices on bus */
- -		outb_p(ALI1535_T_OUT,SMBHSTTYP);
- -		outb_p(0xFF,SMBHSTSTS);
- -	}
- -
 	return result;
 }
 
+static void ali1535_reset(struct i2c_adapter *adap)
+{
+	int temp = inb_p(SMBHSTSTS);
+
+	dev_dbg(&adap->dev, "reset(pre): STS=%02x\n", temp);
+
+	/* If the host controller is still busy, it may have timed out
+	 * in the previous transaction, resulting in a "SMBus Timeout"
+	 * printk.  I've tried the following to reset a stuck busy bit.
+	 *   1. Reset the controller with a KILL command. (this
+	 *      doesn't seem to clear the controller if an external
+	 *      device is hung)
+	 *   2. Reset the controller and the other SMBus devices with a
+	 *      T_OUT command. (this clears the host busy bit if an
+	 *      external device is hung, but it comes back upon a new
+	 *      access to a device)
+	 *   3. Disable and reenable the controller in SMBHSTCFG. Worst
+	 *      case, nothing seems to work except power reset.
+	 */
+
+	/* Try resetting entire SMB bus, including other devices - This
+	 * may not work either - it clears the BUSY bit but then the
+	 * BUSY bit may come back on when you try and use the chip
+	 * again.  If that's the case you are stuck.
+	 */
+
+	if ((temp & ALI1535_STS_ERR) || !(temp & ALI1535_STS_IDLE))
+		outb_p(ALI1535_T_OUT, SMBHSTTYP);
+	else if (!(temp & ALI1535_STS_DONE))
+		outb_p(ALI1535_KILL, SMBHSTTYP);
+
+	dev_dbg(&adap->dev, "reset(post): STS=%02x\n", inb_p(SMBHSTSTS));
+}
+
+static inline s32 ali1535_wait_for_idle(struct i2c_adapter *adap)
+{
+	int temp;
+
+	temp = inb_p(SMBHSTSTS);
+
+	dev_dbg(&adap->dev, "wait_for_idle(pre): STS=%02x\n", temp);
+
+	temp = ali1535_wait_for_status(STATUS_SET, ALI1535_STS_IDLE);
+
+	dev_dbg(&adap->dev, "wait_for_idle(post): STS=%02x\n", temp);
+
+	return !(temp & ALI1535_STS_IDLE);
+}
+
 /* Return -1 on error. */
 static s32 ali1535_access(struct i2c_adapter *adap, u16 addr,
 			  unsigned short flags, char read_write, u8 command,
 			  int size, union i2c_smbus_data *data)
 {
 	int i, len;
- -	int temp;
- -	int timeout;
 	s32 result = 0;
+	int retry = 0;
 
 	down(&i2c_ali1535_sem);
+retry:
+	if (retry >= MAX_RETRIES) {
+		dev_err(&adap->dev, "Failed to execute command after %d retries"
+			" status: %02x\n", MAX_RETRIES, inb_p(SMBHSTSTS));
+		result = -1;
+		goto EXIT;
+	}
+
 	/* make sure SMBus is idle */
- -	temp = inb_p(SMBHSTSTS);
- -	for (timeout = 0;
- -	     (timeout < MAX_TIMEOUT) && !(temp & ALI1535_STS_IDLE);
- -	     timeout++) {
- -		msleep(1);
- -		temp = inb_p(SMBHSTSTS);
+	if (ali1535_wait_for_idle(adap)) {
+		dev_warn(&adap->dev, "adapter not idle before command; retrying\n");
+		retry++;
+		ali1535_reset(adap);
+		goto retry;
 	}
- -	if (timeout >= MAX_TIMEOUT)
- -		dev_warn(&adap->dev, "Idle wait Timeout! STS=0x%02x\n", temp);
 
 	/* clear status register (clear-on-write) */
 	outb_p(0xFF, SMBHSTSTS);
@@ -424,7 +430,16 @@ static s32 ali1535_access(struct i2c_ada
 		break;
 	}
 
- -	if (ali1535_transaction(adap)) {
+	if ((result = ali1535_transaction(adap)) == -2) {
+		/* Adapter hung; reset it and retry */
+		dev_err(&adap->dev, "Adapter hung, retrying after reset\n");
+		result = 0;
+		retry++;
+		ali1535_reset(adap);
+		goto retry;
+	}
+
+	if (result) {
 		/* Error in transaction */
 		result = -1;
 		goto EXIT;
- -- 
1.1.3
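
The exit condition of the polling helper in the patch above can be tried
out in user space. This is only a sketch under assumptions: fake_read() is
a hypothetical stand-in for inb_p(SMBHSTSTS), MAX_TIMEOUT is set to 500
here as an assumed value, and the initial clear-on-write of the status
register is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TIMEOUT 500	/* assumed value for this sketch */

/* Hypothetical stand-in for inb_p(SMBHSTSTS): replays canned values. */
struct fake_reg {
	const unsigned char *vals;
	size_t n, pos;
};

static int fake_read(struct fake_reg *r)
{
	assert(r->n > 0);
	if (r->pos < r->n - 1)
		return r->vals[r->pos++];
	return r->vals[r->n - 1];	/* keep returning the last value */
}

/* Poll until any bit of `status` is set (set == 1) or all of them are
 * clear (set == 0), or until the timeout counter passes MAX_TIMEOUT.
 * Returns the last value read; *polls gets the number of reads done. */
static int wait_for_status(struct fake_reg *r, int set, int status,
			   int *polls)
{
	int timeout = 0;
	int temp;

	do {
		/* the driver sleeps here with msleep(1) */
		timeout += 1;
		temp = fake_read(r);
	} while (!!(temp & status) != set && timeout <= MAX_TIMEOUT);

	*polls = timeout;
	return temp;
}
```

Note that with "timeout <= MAX_TIMEOUT" a register that never changes is
read MAX_TIMEOUT + 1 times before the loop gives up, and the caller has to
inspect the returned value to tell a timeout from success.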
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFD0qF0R6LMutpd94wRAqJZAKCGgz7BKIZBsZFZWy4xUdPnidt3AgCfRrA9
3vT8vnL7YWJf2iOGBF1I9RI=
=ZFOW
-----END PGP SIGNATURE-----

* Re: [lm-sensors] 2.6.15: lm90 0-004c: Register 0x13 read failed (-1)
  2006-01-15 20:33         ` Rudolf Marek
  2006-01-15 20:58           ` Andrey Borzenkov
  2006-01-16 19:40           ` Andrey Borzenkov
@ 2006-01-27  4:15           ` Andrey Borzenkov
  2 siblings, 0 replies; 11+ messages in thread
From: Andrey Borzenkov @ 2006-01-27  4:15 UTC (permalink / raw)
  To: Rudolf Marek; +Cc: Jean Delvare, linux-kernel, lm-sensors

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 15 January 2006 23:33, Rudolf Marek wrote:
> Hello all,
>
> > this appears simply a probing for non-existent i2c ports (correct me if I
> > am wrong) presumably by eeprom driver.
>
> yes I think you are right. (ADD/2 is the address of chip, that it tries to
> access)
>
> > Second block are errors from lm90 for different registers:
> >
> > Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre):
> > STS=04, TYP=10, CMD=01, ADD=99, DAT0=a0, DAT1=10
> > Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post):
> > STS=14, TYP=10, CMD=01, ADD=99, DAT0=29, DAT1=10
> > Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (pre):
> > STS=04, TYP=10, CMD=08, ADD=98, DAT0=29, DAT1=10
> > Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Error: command never
> > completed
> > Jan 15 22:24:02 cooker kernel: i2c_adapter i2c-0: Transaction (post):
> > STS=04, TYP=10, CMD=08, ADD=98, DAT0=29, DAT1=10
> > Jan 15 22:24:02 cooker kernel: lm90 0-004c: Register 0x8 read failed (-1)
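
For reference, the ADD and STS values in the log above can be decoded by
hand. The following is a minimal sketch, not driver code: the helper names
are made up for illustration, and the status bit values are an assumption
recalled from the i2c-ali1535 driver's defines, which are not quoted in
this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Status bits: assumed values, recalled from the i2c-ali1535 driver. */
#define STS_IDLE 0x04
#define STS_BUSY 0x08
#define STS_DONE 0x10
#define STS_ERR  0xE0	/* all error bits */

/* ADD is an 8-bit register: 7-bit slave address in bits 7..1,
 * direction in bit 0 (1 = read, 0 = write). */
static unsigned add_to_addr(unsigned add)
{
	assert(add <= 0xff);
	return add >> 1;
}

static int add_is_read(unsigned add)
{
	return add & 1;
}

static int sts_done(unsigned sts) { return (sts & STS_DONE) != 0; }
static int sts_idle(unsigned sts) { return (sts & STS_IDLE) != 0; }

/* Print one decoded log line, e.g. for ADD=99 STS=14. */
static void print_decoded(unsigned add, unsigned sts)
{
	printf("addr=0x%02x %s idle=%d busy=%d done=%d err=%d\n",
	       add_to_addr(add), add_is_read(add) ? "read" : "write",
	       sts_idle(sts), (sts & STS_BUSY) != 0,
	       sts_done(sts), (sts & STS_ERR) != 0);
}
```

Under that assumption, ADD=99 is address 0x4c with the read bit set, which
matches the "lm90 0-004c" device name, and ADD=98 is the same address with
the read bit clear. STS=14 is IDLE plus DONE, while STS=04 is IDLE only,
so DONE was never raised in the failing transaction.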

I still have not had much time to spend on this, but when booting today I
suddenly got

i2c_adapter i2c-0: Unsupported chip (man_id=0x41, chip_id=0x42).

I am beginning to suspect that it is still lm90 (at least partly). The
transaction did not fail (otherwise we would not have got this far), but it
returned a strange value. Does anyone know whether such a chip really exists?

TIA

- -andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFD2Z5UR6LMutpd94wRAmaYAKCAdwCutdUWK+RFbQu9nMiLuIl6jACdGgj9
IHiDsWm37Xr4UWmQYbvwIOk=
=a1Ao
-----END PGP SIGNATURE-----

Thread overview: 11+ messages
2006-01-14 19:23 2.6.15: lm90 0-004c: Register 0x13 read failed (-1) Andrey Borzenkov
2006-01-14 21:20 ` [lm-sensors] " Jean Delvare
2006-01-14 21:45   ` Andrey Borzenkov
2006-01-15 19:12     ` Andrey Borzenkov
2006-01-15 19:48       ` Andrey Borzenkov
2006-01-15 20:33         ` Rudolf Marek
2006-01-15 20:58           ` Andrey Borzenkov
2006-01-16 19:40           ` Andrey Borzenkov
2006-01-16 21:17             ` Rudolf Marek
2006-01-21 21:02               ` Andrey Borzenkov
2006-01-27  4:15           ` Andrey Borzenkov
