* another must-fix: major PS/2 mouse problem
@ 2003-06-01  1:46 Albert Cahalan
  2003-06-04  5:47 ` Yoann
       [not found] ` <3EDCF47A.1060605@ifrance.com>
  0 siblings, 2 replies; 24+ messages in thread
From: Albert Cahalan @ 2003-06-01  1:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm

Lots of people (check Google) get this message
from the kernel:

psmouse.c: Lost synchronization, throwing 2 bytes away.

(the number of bytes will be 1, 2, or 3)

At work, I get it when there is heavy NFS traffic.
The mouse goes crazy, jumping around and doing
random cut-and-paste all over everything. This is
with a decently fast and modern PC.

I'll guess that NFS and the mouse both have worker
threads fighting for CPU time, and neither is RT.

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem
  2003-06-01  1:46 another must-fix: major PS/2 mouse problem Albert Cahalan
@ 2003-06-04  5:47 ` Yoann
       [not found]   ` <20030603232155.1488c02f.akpm@digeo.com>
       [not found] ` <3EDCF47A.1060605@ifrance.com>
  1 sibling, 1 reply; 24+ messages in thread
From: Yoann @ 2003-06-04  5:47 UTC (permalink / raw)
  To: linux-kernel

Is there a patch for this bug?

I have the same problem with my laptop: SiS630 chip, Celeron 1.2 GHz,
256 MB of RAM (32 MB for video), PS/2 mouse (ImPS/2), while reading
MP3s through an NFS partition (100 Mb/s Ethernet). I haven't tried
without NFS traffic yet, but I will the next time I boot 2.5.70
(currently I'm running 2.4.20).

Yoann

Albert Cahalan wrote:
> Lots of people (check Google) get this message
> from the kernel:
>
> psmouse.c: Lost synchronization, throwing 2 bytes away.
>
> (the number of bytes will be 1, 2, or 3)
>
> At work, I get it when there is heavy NFS traffic.
> The mouse goes crazy, jumping around and doing
> random cut-and-paste all over everything. This is
> with a decently fast and modern PC.
>
> I'll guess that NFS and the mouse both have worker
> threads fighting for CPU time, and neither is RT.
[parent not found: <20030603232155.1488c02f.akpm@digeo.com>]
* Re: another must-fix: major PS/2 mouse problem [not found] ` <20030603232155.1488c02f.akpm@digeo.com> @ 2003-06-04 7:47 ` Vojtech Pavlik 2003-06-04 7:53 ` Andrew Morton 0 siblings, 1 reply; 24+ messages in thread From: Vojtech Pavlik @ 2003-06-04 7:47 UTC (permalink / raw) To: Andrew Morton; +Cc: Yoann, linux-kernel, Vojtech Pavlik, Albert D.Cahalan On Tue, Jun 03, 2003 at 11:21:55PM -0700, Andrew Morton wrote: > We believe that it may be due to the ethernet driver holding interrupts off > for too long when the traffic is heavy. Note that this doesn't necessarily mean that the ethernet driver disables the interrupts for a too long time, it just means that the computer is only servicing the network interrupts at that time, and since the mouse interrupt does have a lower priority, it's serviced not very often and with huge delays. In such a case the network driver should either use interrupt mitigation if the cards supports it (reading many packets per one interrupt) or switch to a polled mode. > Does that seem to match your observations? Does the problem happen when > the net traffic is high? > > Which ethernet driver are you using? -- Vojtech Pavlik SuSE Labs, SuSE CR ^ permalink raw reply [flat|nested] 24+ messages in thread
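The mitigation Vojtech describes (reading many packets per interrupt) can be sketched roughly as follows. This is a minimal userspace model, not driver code: `struct rx_ring`, `rx_interrupt()`, `drain_ring()` and the `budget` parameter are hypothetical names invented for illustration, and no real NIC is involved. It only shows why draining several packets per handler run reduces the number of interrupts the CPU has to take.

```c
#include <stddef.h>

/* Hypothetical receive ring: a simulated NIC queue, not a real driver structure. */
struct rx_ring {
    size_t pending;     /* packets waiting in the simulated hardware */
    size_t handled;     /* packets processed so far */
    size_t interrupts;  /* number of handler invocations */
};

/*
 * One handler invocation. Rather than taking a single packet per
 * interrupt, drain up to 'budget' packets and then return, so that
 * lower-priority interrupts (such as the mouse) can be serviced.
 * Returns the number of packets processed in this invocation.
 */
static size_t rx_interrupt(struct rx_ring *ring, size_t budget)
{
    size_t done = 0;

    if (budget == 0)
        budget = 1;     /* always make progress */

    ring->interrupts++;
    while (ring->pending > 0 && done < budget) {
        ring->pending--;
        ring->handled++;
        done++;
    }
    return done;
}

/* Invoke the handler until the ring is empty; return the interrupt count. */
static size_t drain_ring(struct rx_ring *ring, size_t budget)
{
    while (ring->pending > 0)
        rx_interrupt(ring, budget);
    return ring->interrupts;
}
```

With 100 queued packets, a budget of 1 costs 100 handler runs while a budget of 32 costs 4. Real hardware mitigation and polled modes are more involved; the model only captures the packets-per-interrupt ratio.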
* Re: another must-fix: major PS/2 mouse problem 2003-06-04 7:47 ` Vojtech Pavlik @ 2003-06-04 7:53 ` Andrew Morton 2003-06-04 8:00 ` Vojtech Pavlik 0 siblings, 1 reply; 24+ messages in thread From: Andrew Morton @ 2003-06-04 7:53 UTC (permalink / raw) To: Vojtech Pavlik; +Cc: linux-yoann, linux-kernel, vojtech, acahalan Vojtech Pavlik <vojtech@ucw.cz> wrote: > > On Tue, Jun 03, 2003 at 11:21:55PM -0700, Andrew Morton wrote: > > > We believe that it may be due to the ethernet driver holding interrupts off > > for too long when the traffic is heavy. > > Note that this doesn't necessarily mean that the ethernet driver > disables the interrupts for a too long time, it just means that the > computer is only servicing the network interrupts at that time, and > since the mouse interrupt does have a lower priority, it's serviced > not very often and with huge delays. > > In such a case the network driver should either use interrupt mitigation > if the cards supports it (reading many packets per one interrupt) or > switch to a polled mode. Has this problem been observed in 2.4 kernels? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-06-04 7:53 ` Andrew Morton @ 2003-06-04 8:00 ` Vojtech Pavlik 2003-06-04 8:14 ` Andrew Morton 0 siblings, 1 reply; 24+ messages in thread From: Vojtech Pavlik @ 2003-06-04 8:00 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-yoann, linux-kernel, vojtech, acahalan On Wed, Jun 04, 2003 at 12:53:02AM -0700, Andrew Morton wrote: > Vojtech Pavlik <vojtech@ucw.cz> wrote: > > > > On Tue, Jun 03, 2003 at 11:21:55PM -0700, Andrew Morton wrote: > > > > > We believe that it may be due to the ethernet driver holding interrupts off > > > for too long when the traffic is heavy. > > > > Note that this doesn't necessarily mean that the ethernet driver > > disables the interrupts for a too long time, it just means that the > > computer is only servicing the network interrupts at that time, and > > since the mouse interrupt does have a lower priority, it's serviced > > not very often and with huge delays. > > > > In such a case the network driver should either use interrupt mitigation > > if the cards supports it (reading many packets per one interrupt) or > > switch to a polled mode. > > Has this problem been observed in 2.4 kernels? No, since 2.4 doesn't have the re-sync code in the mouse driver which is triggering in this case. But problems with the machine being flooded with interrupts from the NIC so hard that it actually cannot do anything are quite common. -- Vojtech Pavlik SuSE Labs, SuSE CR ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-06-04 8:00 ` Vojtech Pavlik @ 2003-06-04 8:14 ` Andrew Morton 2003-06-04 8:40 ` Vojtech Pavlik 2003-06-04 23:09 ` Albert Cahalan 0 siblings, 2 replies; 24+ messages in thread From: Andrew Morton @ 2003-06-04 8:14 UTC (permalink / raw) To: Vojtech Pavlik; +Cc: linux-yoann, linux-kernel, vojtech, acahalan Vojtech Pavlik <vojtech@ucw.cz> wrote: > > > Has this problem been observed in 2.4 kernels? > > No, since 2.4 doesn't have the re-sync code in the mouse driver which is > triggering in this case. But problems with the machine being flooded > with interrupts from the NIC so hard that it actually cannot do anything > are quite common. So is the resync code doing more good than harm? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-06-04 8:14 ` Andrew Morton @ 2003-06-04 8:40 ` Vojtech Pavlik 2003-06-04 19:20 ` Yoann 2003-06-04 23:09 ` Albert Cahalan 1 sibling, 1 reply; 24+ messages in thread From: Vojtech Pavlik @ 2003-06-04 8:40 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-yoann, linux-kernel, vojtech, acahalan On Wed, Jun 04, 2003 at 01:14:13AM -0700, Andrew Morton wrote: > > > Has this problem been observed in 2.4 kernels? > > > > No, since 2.4 doesn't have the re-sync code in the mouse driver which is > > triggering in this case. But problems with the machine being flooded > > with interrupts from the NIC so hard that it actually cannot do anything > > are quite common. > > So is the resync code doing more good than harm? Hard to tell. The people for which it does good don't complain. -- Vojtech Pavlik SuSE Labs, SuSE CR ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem
  2003-06-04  8:40           ` Vojtech Pavlik
@ 2003-06-04 19:20             ` Yoann
  0 siblings, 0 replies; 24+ messages in thread
From: Yoann @ 2003-06-04 19:20 UTC (permalink / raw)
  To: Vojtech Pavlik; +Cc: Andrew Morton, linux-kernel, vojtech, acahalan

Vojtech Pavlik wrote:
> On Wed, Jun 04, 2003 at 01:14:13AM -0700, Andrew Morton wrote:
>
>>>> Has this problem been observed in 2.4 kernels?
>>>
>>> No, since 2.4 doesn't have the re-sync code in the mouse driver which is
>>> triggering in this case. But problems with the machine being flooded
>>> with interrupts from the NIC so hard that it actually cannot do anything
>>> are quite common.
>>
>> So is the resync code doing more good than harm?
>
> Hard to tell. The people for which it does good don't complain.

I haven't rebooted my PC yet, so I'm still running 2.4.20 without any
mouse problems. When I boot 2.5.70, what should I do to find where the
bug comes from? I'm a little bit new here, and I have never tried to
locate a bug in a kernel.

Thanks for your advice.

Yoann

--
Jugglers, like programmers, handle objects which, at first sight, seem
complex and difficult to control. Some of them, with time and patience,
manage to control one or the other or both at the same time, and thus
become aware of what they are doing.
* Re: another must-fix: major PS/2 mouse problem 2003-06-04 8:14 ` Andrew Morton 2003-06-04 8:40 ` Vojtech Pavlik @ 2003-06-04 23:09 ` Albert Cahalan 1 sibling, 0 replies; 24+ messages in thread From: Albert Cahalan @ 2003-06-04 23:09 UTC (permalink / raw) To: Andrew Morton; +Cc: Vojtech Pavlik, linux-yoann, linux-kernel, vojtech On Wed, 2003-06-04 at 04:14, Andrew Morton wrote: > Vojtech Pavlik <vojtech@ucw.cz> wrote: > >> Has this problem been observed in 2.4 kernels? > > > > No, since 2.4 doesn't have the re-sync code in the mouse driver which is > > triggering in this case. But problems with the machine being flooded > > with interrupts from the NIC so hard that it actually cannot do anything > > are quite common. > > So is the resync code doing more good than harm? The log message is useful. I think the resync code is a bit like the OOM killer. We need it, but something is wrong if it ever gets used. It also doesn't quite work the way it should. Anyway... I only get the problem with NFS traffic. It may be that NFS traffic is the only way I've yet found to generate extreme network usage though. The system with problems is an NFSv3 client that gets abused by an in-house version control system based on SCCS. I suppose this is like running "tar xf foo.tar" or "tar xf foo.tar foo" over NFS. 
The hardware is:

Pentium III (Coppermine)
1002.822 MHz
Apollo chipset

# lspci -s 00:0d.0 -v
00:0d.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 74)
        Subsystem: 3Com Corporation 3C905C-TX Fast Etherlink for PC Management NIC
        Flags: bus master, medium devsel, latency 32, IRQ 11
        I/O ports at ec00 [size=128]
        Memory at df000000 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at <unassigned> [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2

# nfsstat -c
Client rpc stats:
calls      retrans    authrefrsh
118380     7843       0

Client nfs v2:
null        getattr     setattr     root        lookup      readlink
0        0% 0        0% 0        0% 0        0% 0        0% 0        0%
read        wrcache     write       create      remove      rename
0        0% 0        0% 0        0% 0        0% 0        0% 0        0%
link        symlink     mkdir       rmdir       readdir     fsstat
0        0% 0        0% 0        0% 0        0% 0        0% 0        0%

Client nfs v3:
null        getattr     setattr     lookup      access      readlink
0        0% 12501   10% 114      0% 68765   58% 25538   21% 4        0%
read        write       create      mkdir       symlink     mknod
8830     7% 725      0% 377      0% 3        0% 1        0% 0        0%
remove      rmdir       rename      link        readdir     readdirplus
498      0% 0        0% 367      0% 173      0% 0        0% 10       0%
fsstat      fsinfo      pathconf    commit
2        0% 2        0% 0        0% 470      0%
[parent not found: <3EDCF47A.1060605@ifrance.com>]
[parent not found: <1054681254.22103.3750.camel@cube>]
[parent not found: <3EDD8850.9060808@ifrance.com>]
* Re: another must-fix: major PS/2 mouse problem [not found] ` <3EDD8850.9060808@ifrance.com> @ 2003-07-23 0:44 ` Albert Cahalan 2003-07-24 17:30 ` Andrew Morton 0 siblings, 1 reply; 24+ messages in thread From: Albert Cahalan @ 2003-07-23 0:44 UTC (permalink / raw) To: Yoann; +Cc: linux-kernel, Andrew Morton, vortex, jgarzik I may have found the problem! On Tue, 2003-06-03 at 15:18, Yoann wrote: > I have the same problem with my laptop, chip sis630, > celeron 1.2Ghz, 256MB of RAM (32MB for video), mouse > on PS/2 (ImPS/2) abd read mp3 throught nfs partition > (ethernet 100MB). I haven't try without traffic on > nfs but I will try next time I boot on the 2.5.70 Using the lockmeter on a 2.5.75 kernel, I discovered that boomerang_interrupt() grabs a spinlock for over 1/4 second. No joke, 253 ms. Interrupts are off AFAIK. Mouse behavior is terrible. It should be no surprise that NTP isn't working too well either. The ntpd daemon keeps complaining about losing sync and having to advance the clock by amounts of over 100 seconds. Could somebody with the hardware manual take a look at that function? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-07-23 0:44 ` Albert Cahalan @ 2003-07-24 17:30 ` Andrew Morton 2003-07-25 1:46 ` Albert Cahalan 0 siblings, 1 reply; 24+ messages in thread From: Andrew Morton @ 2003-07-24 17:30 UTC (permalink / raw) To: Albert Cahalan; +Cc: linux-yoann, linux-kernel, akpm, vortex, jgarzik Albert Cahalan <albert@users.sourceforge.net> wrote: > > Using the lockmeter on a 2.5.75 kernel, I discovered > that boomerang_interrupt() grabs a spinlock for over > 1/4 second. No joke, 253 ms. Interrupts are off AFAIK. boomerang_interrupt() doesn't disable interrupts. Is the NIC sharing the mouse's IRQ line? boomerang_interrupt() is only used by nasty old NICs and yes, I guess it is possible that something has gone wrong and is causing occasional long spins in there. But I am more suspecting that you're not really using boomerang_interrupt() at all, and that something has gone wrong with lockmeter. What sort of NIC are you using? Bear in mind that if some other device generates an interrupt while the CPU is running boomerang_interrupt(), lockmeter will count the time spent in that other device's interrupt as "time spent in boomerand_interrupt()". Which is very true, but it is not much help when one is trying to identify the source of the problem. Perhaps what you should do is to do an rdtsc on entry and exit of do_IRQ() and print stuff out when "long" periods of time in do_IRQ() are noticed. ^ permalink raw reply [flat|nested] 24+ messages in thread
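Andrew's suggestion (read the cycle counter on entry and exit of the handler, and print something when the interval is "long") might look roughly like the sketch below. It is a userspace model under stated assumptions: the two counter readings are passed in by the caller instead of being taken with `rdtsc`, `SKETCH_CYCLES_PER_US` assumes a 1 GHz clock, and `struct irq_timing` and `irq_timing_record()` are hypothetical names, not kernel interfaces.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_NR_IRQS       16     /* assumption: 16 legacy IRQ lines */
#define SKETCH_CYCLES_PER_US 1000u  /* assumption: 1 GHz cycle counter */

/* Longest observed handler run per IRQ, in cycles. */
struct irq_timing {
    uint64_t max_cycles[SKETCH_NR_IRQS];
};

/*
 * Record one handler run from counter readings taken at entry and exit.
 * Updates the per-IRQ maximum. Prints a line and returns 1 when the run
 * exceeded threshold_us microseconds; returns 0 otherwise.
 */
static int irq_timing_record(struct irq_timing *t, unsigned irq,
                             uint64_t entry_cycles, uint64_t exit_cycles,
                             uint64_t threshold_us)
{
    uint64_t elapsed;

    if (irq >= SKETCH_NR_IRQS)
        return 0;

    elapsed = exit_cycles - entry_cycles;
    if (elapsed > t->max_cycles[irq])
        t->max_cycles[irq] = elapsed;

    if (elapsed > threshold_us * SKETCH_CYCLES_PER_US) {
        printf("irq %u: %llu us in handler\n", irq,
               (unsigned long long)(elapsed / SKETCH_CYCLES_PER_US));
        return 1;
    }
    return 0;
}
```

Note that an interval measured this way is wall-clock time between entry and exit, so it includes any other interrupt that ran in between, which is the caveat Andrew raises about lockmeter above.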
* Re: another must-fix: major PS/2 mouse problem
  2003-07-24 17:30         ` Andrew Morton
@ 2003-07-25  1:46           ` Albert Cahalan
  2003-07-26  3:19             ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Albert Cahalan @ 2003-07-25  1:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Cahalan, linux-yoann, linux-kernel mailing list,
      Andrew Morton, vortex, jgarzik

On Thu, 2003-07-24 at 13:30, Andrew Morton wrote:
> Albert Cahalan <albert@users.sourceforge.net> wrote:

> > Using the lockmeter on a 2.5.75 kernel, I discovered
> > that boomerang_interrupt() grabs a spinlock for over
> > 1/4 second. No joke, 253 ms. Interrupts are off AFAIK.
>
> boomerang_interrupt() doesn't disable interrupts. Is the NIC sharing the
> mouse's IRQ line?

No.

           CPU0
  0:     746770          XT-PIC  timer
  1:        936          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  4:          9          XT-PIC  serial
  5:          0          XT-PIC  uhci-hcd, uhci-hcd
 11:       2417          XT-PIC  eth0
 12:         60          XT-PIC  i8042
 14:      13844          XT-PIC  ide0
 15:          2          XT-PIC  ide1
NMI:          0
LOC:     751552
ERR:          0
MIS:          0

> boomerang_interrupt() is only used by nasty old NICs and yes, I guess it is
> possible that something has gone wrong and is causing occasional long spins
> in there.
>
> But I am more suspecting that you're not really using boomerang_interrupt()
> at all, and that something has gone wrong with lockmeter. What sort of NIC
> are you using?

I hope you don't consider a 100 Mb/s PCI device to be
a nasty old NIC. It's not an NE2000 you know! I have this:

00:0d.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 74)
        Subsystem: 3Com Corporation 3C905C-TX Fast Etherlink for PC Management NIC
        Flags: bus master, medium devsel, latency 32, IRQ 11
        I/O ports at ec00 [size=128]
        Memory at df001000 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at <unassigned> [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2

Without heavy net usage, boomerang_interrupt can take
as long as 1950 microseconds. That would be from mounting
an NFS filesystem and receiving broadcast packets. I didn't
have an opportunity to hit NFS hard today. That's from rdtsc
on a 1002-MHz Pentium III.

> Bear in mind that if some other device generates an interrupt while the CPU
> is running boomerang_interrupt(), lockmeter will count the time spent in
> that other device's interrupt as "time spent in boomerand_interrupt()".
> Which is very true, but it is not much help when one is trying to identify
> the source of the problem.

Do the Intel IRQ controller priority rules play a role here?

> Perhaps what you should do is to do an rdtsc on entry and exit of do_IRQ()
> and print stuff out when "long" periods of time in do_IRQ() are noticed.

I added code to the top and bottom of do_IRQ, as well as to
the top and bottom of boomerang_interrupt. The lockmeter was
compiled into the kernel but never enabled. I record the
minimum and maximum time in microseconds.

-------------------------------
IRQ  num     use    min   max
--- ------ -------- --- -------
  0 746770 timer     40  103595
  1    936 i8042     13  389773
  2      0 cascade    -       -
  3      - -          -       -
  4      9 serial    28      56
  5      0 uhci-hcd   -       -
  6      - -        711     711
  7      - -         25      25
  8      - -          -       -
  9      - -          -       -
 10      - -          -       -
 11   2417 eth0      87 1535331
 12     60 i8042     18  102895
 13      - -          -       -
 14  13844 ide0       8   51944
 15      2 ide1       7      11
NMI      0
LOC 751552
ERR      0
MIS      0
-------------------------------
* Re: another must-fix: major PS/2 mouse problem 2003-07-25 1:46 ` Albert Cahalan @ 2003-07-26 3:19 ` Andrew Morton 2003-07-26 15:16 ` Zwane Mwaikambo 0 siblings, 1 reply; 24+ messages in thread From: Andrew Morton @ 2003-07-26 3:19 UTC (permalink / raw) To: Albert Cahalan; +Cc: albert, linux-yoann, linux-kernel, akpm, vortex, jgarzik Albert Cahalan <albert@users.sourceforge.net> wrote: > > I hope you don't consider a 100 Mb/s PCI device to be > a nasty old NIC. It's not an NE2000 you know! I have this: Sorry, I got my boomerangs and vortices mixed up. Vortex is the ancient one. > I added code to the top and bottom of do_IRQ, as well as to > the top and bottom of boomerang_interrupt. The lockmeter was > compiled into the kernel but never enabled. I record the > minimum and maximum time in microseconds. > > ------------------------------- > IRQ num use min max > --- ------ -------- --- ------- > 0 746770 timer 40 103595 > 1 936 i8042 13 389773 > 2 0 cascade - - > 3 - - - - > 4 9 serial 28 56 > 5 0 uhci-hcd - - > 6 - - 711 711 > 7 - - 25 25 > 8 - - - - > 9 - - - - > 10 - - - - > 11 2417 eth0 87 1535331 > 12 60 i8042 18 102895 > 13 - - - - > 14 13844 ide0 8 51944 > 15 2 ide1 7 11 But did your instrumentation account for nested interrupts? What happens if a slow i8042 interrupt happens in the middle of a 3c59x interrupt? Still, that probably doesn't account for the stalls. I don't know what does account for it, frankly. You could try dropping the 2.4 driver into the 2.5 tree just to verify that it is not a driver problem. The driver has hardly changed at all. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-07-26 3:19 ` Andrew Morton @ 2003-07-26 15:16 ` Zwane Mwaikambo 2003-07-29 2:55 ` Albert Cahalan 0 siblings, 1 reply; 24+ messages in thread From: Zwane Mwaikambo @ 2003-07-26 15:16 UTC (permalink / raw) To: Andrew Morton Cc: Albert Cahalan, linux-yoann, linux-kernel, akpm, vortex, jgarzik On Fri, 25 Jul 2003, Andrew Morton wrote: > But did your instrumentation account for nested interrupts? What happens > if a slow i8042 interrupt happens in the middle of a 3c59x interrupt? Just to verify that, he could remove the local_irq_enable for !SA_INTERRUPT. Zwane -- function.linuxpower.ca ^ permalink raw reply [flat|nested] 24+ messages in thread
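The nesting question raised here (a slow i8042 interrupt landing in the middle of a 3c59x interrupt) is the difference between inclusive and exclusive time. One way to separate the two, sketched below as a userspace model, is to keep a small stack of active handlers and charge each finished handler's total time to its parent as "child" time. All names (`struct nest_acct`, `nest_enter()`, `nest_exit()`) are hypothetical, timestamps are supplied by the caller in arbitrary units, and this is not what the instrumentation in the thread did.

```c
#include <stdint.h>

#define NEST_MAX_DEPTH 8   /* assumption: nesting never gets deeper than this */
#define NEST_NR_IRQS   16  /* assumption: 16 legacy IRQ lines */

struct nest_frame {
    unsigned irq;    /* IRQ number of this active handler */
    uint64_t start;  /* timestamp at handler entry */
    uint64_t child;  /* time spent in handlers nested inside this one */
};

struct nest_acct {
    struct nest_frame stack[NEST_MAX_DEPTH];
    int depth;
    uint64_t total_max[NEST_NR_IRQS]; /* longest inclusive time per IRQ */
    uint64_t self_max[NEST_NR_IRQS];  /* longest exclusive time per IRQ */
};

/* Call at handler entry. Returns 0 on success, -1 if the event cannot be tracked. */
static int nest_enter(struct nest_acct *a, unsigned irq, uint64_t now)
{
    if (a->depth >= NEST_MAX_DEPTH || irq >= NEST_NR_IRQS)
        return -1;
    a->stack[a->depth].irq = irq;
    a->stack[a->depth].start = now;
    a->stack[a->depth].child = 0;
    a->depth++;
    return 0;
}

/* Call at handler exit, once for each successful nest_enter(). */
static void nest_exit(struct nest_acct *a, uint64_t now)
{
    struct nest_frame *f;
    uint64_t total, self;

    if (a->depth == 0)
        return;
    f = &a->stack[--a->depth];
    total = now - f->start;
    self = total - f->child;  /* exclude time spent in nested handlers */

    if (total > a->total_max[f->irq])
        a->total_max[f->irq] = total;
    if (self > a->self_max[f->irq])
        a->self_max[f->irq] = self;

    /* Charge this handler's whole run to the interrupted handler, if any. */
    if (a->depth > 0)
        a->stack[a->depth - 1].child += total;
}
```

For example, if IRQ 11 is entered at time 0, IRQ 1 runs from 10 to 400 inside it, and IRQ 11 exits at 450, the inclusive time for IRQ 11 is 450 but its exclusive time is only 60.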
* Re: another must-fix: major PS/2 mouse problem
  2003-07-26 15:16               ` Zwane Mwaikambo
@ 2003-07-29  2:55                 ` Albert Cahalan
  2003-07-29  3:14                   ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Albert Cahalan @ 2003-07-29  2:55 UTC (permalink / raw)
  To: Zwane Mwaikambo
  Cc: Andrew Morton, Albert Cahalan, linux-yoann,
      linux-kernel mailing list, Andrew Morton, vortex, jgarzik

On Sat, 2003-07-26 at 11:16, Zwane Mwaikambo wrote:
> On Fri, 25 Jul 2003, Andrew Morton wrote:
>
> > But did your instrumentation account for nested interrupts? What happens
> > if a slow i8042 interrupt happens in the middle of a 3c59x interrupt?
>
> Just to verify that, he could remove the local_irq_enable for
> !SA_INTERRUPT.

OK, I did this. Now, in microseconds, I get:

------------------------
IRQ   use    min   max
--- -------- --- -------
  0 timer     40  103968
  1 i8042     14    1138  (was 389773)
  2 cascade    -       -
  3 -          -       -
  4 serial    29      56
  5 uhci-hcd   -       -
  6 -        690     690
  7 -         40      40
  8 -          -       -
  9 -          -       -
 10 -          -       -
 11 eth0      73   31332  (was 1535331)
 12 i8042     18     215  (was 102895)
 13 -          -       -
 14 ide0       7   43846
 15 ide1       7      12
------------------------

boomerang_interrupt itself takes 4 to 59 microseconds.

Then I switched to 2.6.0-test2. Testing more, I get the
problem with or without SMP and with or without
preemption. Here's a chunk of my log file:

Loosing too many ticks!
TSC cannot be used as a timesource. (Are you running with SpeedStep?)
Falling back to a sane timesource.
psmouse.c: Lost synchronization, throwing 3 bytes away.
psmouse.c: Lost synchronization, throwing 1 bytes away.

Arrrrgh! The TSC is my only good time source!

Remember that this is a pretty normal system. I have
a Red Hat 8 install w/ required upgrades, ext3, IDE,
a 1-GHz Pentium III, a boring VIA chipset, etc.

To reproduce, I do some PS/2 mouse movement while
doing one of:

a. Lots of concurrent write() and sync() activity to ext3.
b. Lots of NFSv3 traffic.
* Re: another must-fix: major PS/2 mouse problem 2003-07-29 2:55 ` Albert Cahalan @ 2003-07-29 3:14 ` Andrew Morton 2003-07-29 12:40 ` Albert Cahalan 2003-07-30 5:08 ` Pavel Machek 0 siblings, 2 replies; 24+ messages in thread From: Andrew Morton @ 2003-07-29 3:14 UTC (permalink / raw) To: Albert Cahalan Cc: zwane, albert, linux-yoann, linux-kernel, akpm, vortex, jgarzik Albert Cahalan <albert@users.sourceforge.net> wrote: > > OK, I did this. Now, in microseconds, I get: > > ------------------------ > IRQ use min max > --- -------- --- ------- > 0 timer 40 103968 > 1 i8042 14 1138 (was 389773) > 2 cascade - - > 3 - - - > 4 serial 29 56 > 5 uhci-hcd - - > 6 - 690 690 > 7 - 40 40 > 8 - - - > 9 - - - > 10 - - - > 11 eth0 73 31332 (was 1535331) > 12 i8042 18 215 (was 102895) > 13 - - - > 14 ide0 7 43846 > 15 ide1 7 12 > ------------------------ > > boomerang_interrupt itself takes 4 to 59 microseconds. So this looks OK, yes? (Is that instrumentation patch productisable? Looks handly, albeit a subset of microstate accounting) > Then I switched to 2.6.0-test2. Testing more, I get the > problem with or without SMP and with or without > preemption. Here's a chunk of my log file: > > Loosing too many ticks! > TSC cannot be used as a timesource. (Are you running with SpeedStep?) > Falling back to a sane timesource. > psmouse.c: Lost synchronization, throwing 3 bytes away. > psmouse.c: Lost synchronization, throwing 1 bytes away. > > Arrrrgh! The TSC is my only good time source! Arrrgh! More PS/2 problems! I think the lost synchronisation is the problem, would you agree? The person who fixes this gets a Nobel prize. > Remember that this is a pretty normal system. I have > a Red Hat 8 install w/ required upgrades, ext3, IDE, > a 1-GHz Pentium III, a boring VIA chipset, etc. > > To reproduce, I do some PS/2 mouse movement while > doing one of: > > a. Lots of concurrent write() and sync() activity to ext3. > b. Lots of NFSv3 traffic. 
ie: lots of interrupt traffic causes the PS2 driver to go whacky?
* Re: another must-fix: major PS/2 mouse problem
  2003-07-29  3:14                   ` Andrew Morton
@ 2003-07-29 12:40                     ` Albert Cahalan
  2003-07-29 18:58                       ` Andrew Morton
  2003-07-30  5:08                     ` Pavel Machek
  1 sibling, 1 reply; 24+ messages in thread
From: Albert Cahalan @ 2003-07-29 12:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Cahalan, zwane, linux-yoann, linux-kernel mailing list,
      Andrew Morton, vortex, jgarzik

On Mon, 2003-07-28 at 23:14, Andrew Morton wrote:
> Albert Cahalan <albert@users.sourceforge.net> wrote:

> > OK, I did this. Now, in microseconds, I get:
> >
> > ------------------------
> > IRQ   use    min   max
> > --- -------- --- -------
> >   0 timer     40  103968
> >   1 i8042     14    1138  (was 389773)
> >   2 cascade    -       -
> >   3 -          -       -
> >   4 serial    29      56
> >   5 uhci-hcd   -       -
> >   6 -        690     690
> >   7 -         40      40
> >   8 -          -       -
> >   9 -          -       -
> >  10 -          -       -
> >  11 eth0      73   31332  (was 1535331)
> >  12 i8042     18     215  (was 102895)
> >  13 -          -       -
> >  14 ide0       7   43846
> >  15 ide1       7      12
> > ------------------------
> >
> > boomerang_interrupt itself takes 4 to 59 microseconds.
>
> So this looks OK, yes?

I suppose boomerang_interrupt itself is OK.
Spending 104 ms in IRQ 0, 31 ms in IRQ 11, and
44 ms in IRQ 14 is not at all OK. I was hoping
to get under 200 microseconds for everything.

> (Is that instrumentation patch productisable?
> Looks handly, albeit a subset of microstate accounting)

Not really. I printk() when a value exceeds the
saved maximum, then scan my logs for the first
and last values. There's also hard-coded knowledge
of my 1-GHz CPU, which lets me convert to microseconds
as follows:

    us = (unsigned)(ns64>>3)/125u;

(that lets me handle up to 32 seconds)

Huh. So the minimum value is really the first value.
Later values could be less, but that's not important.

I suppose that true min/max via a /proc file would
be pretty easy to implement. I like my 1-GHz hack.
I like a TSC that measures in nanoseconds too.

> > Then I switched to 2.6.0-test2. Testing more, I get the
> > problem with or without SMP and with or without
> > preemption. Here's a chunk of my log file:
> >
> > Loosing too many ticks!
> > TSC cannot be used as a timesource. (Are you running with SpeedStep?)
> > Falling back to a sane timesource.
> > psmouse.c: Lost synchronization, throwing 3 bytes away.
> > psmouse.c: Lost synchronization, throwing 1 bytes away.
> >
> > Arrrrgh! The TSC is my only good time source!
>
> Arrrgh! More PS/2 problems!
>
> I think the lost synchronisation is the problem, would you agree?

It's one problem. It's a problem other people have seen.

My TSC should be good though; I'd like to use it.
At times ntpd (the NTP daemon) gets really unhappy
with the situation, yanking my clock ahead by up to
10 minutes to compensate for lost time.

> The person who fixes this gets a Nobel prize.
>
> > Remember that this is a pretty normal system. I have
> > a Red Hat 8 install w/ required upgrades, ext3, IDE,
> > a 1-GHz Pentium III, a boring VIA chipset, etc.
> >
> > To reproduce, I do some PS/2 mouse movement while
> > doing one of:
> >
> > a. Lots of concurrent write() and sync() activity to ext3.
> > b. Lots of NFSv3 traffic.
>
> ie: lots of interrupt traffic causes the PS2 driver to go whacky?

I guess so. The ext3+IDE behavior seems to lift the blame
from boomerang_interrupt. Using ext3+IDE, I seem to need
a couple minutes to reproduce the problem. NFSv3+Ethernet
will give me the problem almost instantly.
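The conversion Albert quotes works because shifting right by 3 divides by 8 and 8 * 125 = 1000, so `(ns64>>3)/125` is nanoseconds divided by 1000 while keeping the division itself in 32 bits. The cast truncates to `unsigned`, so (assuming a 32-bit `unsigned`) the result is only valid while `ns64 >> 3` fits in 32 bits, that is for intervals below 2^35 ns, a little over 34 seconds; this is consistent with his "up to 32 seconds" remark. The sketch below wraps the expression in a function and adds the "true min/max" tracking he mentions as easy to implement. The names `ns_to_us()`, `struct us_range` and `us_range_update()` are hypothetical.

```c
#include <stdint.h>

/*
 * Convert nanoseconds to microseconds as in the message above:
 * (ns >> 3) / 125 == ns / 1000. Assumes 'unsigned' is 32 bits wide,
 * so the input must stay below 2^35 ns (about 34 seconds).
 */
static unsigned ns_to_us(uint64_t ns64)
{
    return (unsigned)(ns64 >> 3) / 125u;
}

/* Running minimum and maximum, valid once 'seen' is nonzero. */
struct us_range {
    unsigned min;
    unsigned max;
    int seen;
};

/* Fold one sample into the range; the first sample sets both bounds. */
static void us_range_update(struct us_range *r, unsigned us)
{
    if (!r->seen) {
        r->min = us;
        r->max = us;
        r->seen = 1;
        return;
    }
    if (us < r->min)
        r->min = us;
    if (us > r->max)
        r->max = us;
}
```

Unlike logging only when a new maximum is seen, this keeps a genuine minimum, since a later sample that is smaller than the first one still lowers `min`.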
* Re: another must-fix: major PS/2 mouse problem 2003-07-29 12:40 ` Albert Cahalan @ 2003-07-29 18:58 ` Andrew Morton 2003-07-29 19:36 ` Zwane Mwaikambo 2003-07-29 19:43 ` Chris Friesen 0 siblings, 2 replies; 24+ messages in thread From: Andrew Morton @ 2003-07-29 18:58 UTC (permalink / raw) To: Albert Cahalan Cc: albert, zwane, linux-yoann, linux-kernel, akpm, vortex, jgarzik Albert Cahalan <albert@users.sourceforge.net> wrote: > > > So this looks OK, yes? > > I suppose boomerang_interrupt itself is OK. > Spending 104 ms in IRQ 0, 31 ms in IRQ 11, and > 44 ms in IRQ 14 is not at all OK. I was hoping > to get under 200 microseconds for everything. I misread that. Last time I checked (which was about 18 months ago) the maximum interrupts-off time on a 500MHz desktop-class machine was 80 microseconds. Something is broken there. Do you have another machine to sanity check against? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-07-29 18:58 ` Andrew Morton @ 2003-07-29 19:36 ` Zwane Mwaikambo 2003-07-29 19:43 ` Chris Friesen 1 sibling, 0 replies; 24+ messages in thread From: Zwane Mwaikambo @ 2003-07-29 19:36 UTC (permalink / raw) To: Andrew Morton Cc: Albert Cahalan, linux-yoann, linux-kernel, akpm, vortex, jgarzik On Tue, 29 Jul 2003, Andrew Morton wrote: > Albert Cahalan <albert@users.sourceforge.net> wrote: > > > > > So this looks OK, yes? > > > > I suppose boomerang_interrupt itself is OK. > > Spending 104 ms in IRQ 0, 31 ms in IRQ 11, and > > 44 ms in IRQ 14 is not at all OK. I was hoping > > to get under 200 microseconds for everything. > > I misread that. > > Last time I checked (which was about 18 months ago) the maximum > interrupts-off time on a 500MHz desktop-class machine was 80 microseconds. IDE has traditionally been a small headache in that department. I need to find out how it fares in 2.5 -- function.linuxpower.ca ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-07-29 18:58 ` Andrew Morton 2003-07-29 19:36 ` Zwane Mwaikambo @ 2003-07-29 19:43 ` Chris Friesen 1 sibling, 0 replies; 24+ messages in thread From: Chris Friesen @ 2003-07-29 19:43 UTC (permalink / raw) To: Andrew Morton Cc: Albert Cahalan, zwane, linux-yoann, linux-kernel, akpm, vortex, jgarzik Andrew Morton wrote: > Last time I checked (which was about 18 months ago) the maximum > interrupts-off time on a 500MHz desktop-class machine was 80 microseconds. You might want to bump that up a little bit. Querying carrier signal on a tulip chip is 100usecs with interrupts off. Doesn't make any difference here though. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem 2003-07-29 3:14 ` Andrew Morton 2003-07-29 12:40 ` Albert Cahalan @ 2003-07-30 5:08 ` Pavel Machek 2003-07-30 6:32 ` Andrew Morton 2003-07-30 12:29 ` Albert Cahalan 1 sibling, 2 replies; 24+ messages in thread From: Pavel Machek @ 2003-07-30 5:08 UTC (permalink / raw) To: Andrew Morton Cc: Albert Cahalan, zwane, linux-yoann, linux-kernel, akpm, vortex, jgarzik, vojtech Hi! > > Loosing too many ticks! > > TSC cannot be used as a timesource. (Are you running with SpeedStep?) > > Falling back to a sane timesource. > > psmouse.c: Lost synchronization, throwing 3 bytes away. > > psmouse.c: Lost synchronization, throwing 1 bytes away. > > > > Arrrrgh! The TSC is my only good time source! > > Arrrgh! More PS/2 problems! > > I think the lost synchronisation is the problem, would you agree? > > The person who fixes this gets a Nobel prize. If you set ps/2 synchronization timeout to 20 seconds, you are going to make vojtech unhappy (he likes that code :-), but at least 2.6.0 will not be worse than 2.4.x... Do you want me to create a patch? -- Pavel Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need... ^ permalink raw reply [flat|nested] 24+ messages in thread
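To make the role of the synchronization timeout concrete, here is a small userspace model of a packet assembler with such a timeout. It is a sketch under assumptions, not the code in psmouse.c: it assumes the basic three-byte PS/2 mouse packet, takes arrival times in milliseconds from the caller, and uses hypothetical names (`struct mouse_sync`, `mouse_byte()`). If a byte arrives more than `timeout_ms` after the previous one while a packet is only partly collected, the partial packet is thrown away.

```c
#include <stdint.h>

#define SKETCH_PKT_LEN 3  /* basic PS/2 mouse packet: three bytes */

struct mouse_sync {
    uint8_t  buf[SKETCH_PKT_LEN];
    unsigned count;      /* bytes collected for the current packet */
    uint64_t last_ms;    /* arrival time of the previous byte */
    unsigned packets;    /* complete packets assembled */
    unsigned discarded;  /* bytes thrown away after a timeout */
};

/* Feed one byte that arrived at now_ms into the assembler. */
static void mouse_byte(struct mouse_sync *m, uint8_t byte,
                       uint64_t now_ms, uint64_t timeout_ms)
{
    /* A long gap in the middle of a packet is treated as lost sync. */
    if (m->count > 0 && now_ms - m->last_ms > timeout_ms) {
        m->discarded += m->count;
        m->count = 0;
    }

    m->last_ms = now_ms;
    m->buf[m->count++] = byte;

    if (m->count == SKETCH_PKT_LEN) {
        m->packets++;
        m->count = 0;
    }
}
```

In this model, if the third byte of a packet is merely delayed (say by 300 ms of interrupt latency) and the timeout is short, the first two bytes are discarded and the late byte is taken as the start of a new packet, so later packets are framed wrongly. With a very long timeout the same byte sequence is assembled correctly, which is the trade-off Pavel's proposal is about.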
* Re: another must-fix: major PS/2 mouse problem 2003-07-30 5:08 ` Pavel Machek @ 2003-07-30 6:32 ` Andrew Morton 2003-07-30 12:29 ` Albert Cahalan 1 sibling, 0 replies; 24+ messages in thread From: Andrew Morton @ 2003-07-30 6:32 UTC (permalink / raw) To: Pavel Machek; +Cc: linux-kernel, vojtech Pavel Machek <pavel@ucw.cz> wrote: > > Hi! > > > > Loosing too many ticks! > > > TSC cannot be used as a timesource. (Are you running with SpeedStep?) > > > Falling back to a sane timesource. > > > psmouse.c: Lost synchronization, throwing 3 bytes away. > > > psmouse.c: Lost synchronization, throwing 1 bytes away. > > > > > > Arrrrgh! The TSC is my only good time source! > > > > Arrrgh! More PS/2 problems! > > > > I think the lost synchronisation is the problem, would you agree? > > > > The person who fixes this gets a Nobel prize. > > > If you set ps/2 synchronization timeout to 20 seconds, you are going to make vojtech > unhappy (he likes that code :-), but at least 2.6.0 will not be worse than 2.4.x... 2.6 is currently much worse than 2.4: we're buried in what appear to be many different varieties of PS/2 bug reports. > Do you want me to create a patch? Well I do not know what the problem with synchronisation is, not what solution you propose. But yeah, I like patches ;) ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: another must-fix: major PS/2 mouse problem
  2003-07-30 5:08 ` Pavel Machek
  2003-07-30 6:32 ` Andrew Morton
@ 2003-07-30 12:29 ` Albert Cahalan
  1 sibling, 0 replies; 24+ messages in thread
From: Albert Cahalan @ 2003-07-30 12:29 UTC
To: linux-kernel mailing list
Cc: mikpe, 0, pavel, Andrew Morton, Albert Cahalan, zwane, linux-yoann, vojtech

On Wed, 2003-07-30 at 01:08, Pavel Machek wrote:
> Hi!
>
> > > Loosing too many ticks!
> > > TSC cannot be used as a timesource. (Are you running with SpeedStep?)
> > > Falling back to a sane timesource.
> > > psmouse.c: Lost synchronization, throwing 3 bytes away.
> > > psmouse.c: Lost synchronization, throwing 1 bytes away.
> > >
> > > Arrrrgh! The TSC is my only good time source!
> >
> > Arrrgh! More PS/2 problems!
> >
> > I think the lost synchronisation is the problem, would you agree?
> >
> > The person who fixes this gets a Nobel prize.
>
>
> If you set ps/2 synchronization timeout to 20 seconds, you are going to make vojtech
> unhappy (he likes that code :-), but at least 2.6.0 will not be worse than 2.4.x...
>
> Do you want me to create a patch?

No. That will just hide one symptom of the problem,
making things more difficult to debug.

It won't fix my clock, which the ntpd program keeps
complaining about. Under heavy load, my clock falls
behind so much that ntpd gives up on the gentle
treatment and just yanks the clock forward by as
much as 10 minutes.

It won't make the mouse run well. Maybe you'd stop the
mouse from going crazy from time to time, but there'd
still be temporary freezes from time to time. (not OK!)

It won't convince Linux that my TSC isn't broken.

It won't solve Mikael Pettersson's problem, posted under
the subject "[BUG] 2.6.0-test2 loses time on 486".
He writes:

"My old 486 test box is losing time at an alarming rate
when running 2.6.0-test kernels. It loses almost 2 minutes
per hour, less if it sits idle. This problem does not occur
when it's running a 2.4 kernel."

Gee, I get that too, on a 1 GHz Pentium III.
It seems we're all losing LOTS of clock ticks and
other interrupts.

I took the net-related email addresses off the Cc: list.
Please leave me on it so I don't have to break threading.
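
To put the quoted "almost 2 minutes per hour" in perspective, here is a small C sketch that converts observed clock drift into an average number of dropped timer ticks. It assumes the drift comes entirely from lost timer interrupts, which is the suspicion voiced above rather than an established cause. The function names are hypothetical, and the tick rate is a parameter because the thread does not state it.

```c
/* Hypothetical helpers for illustration; not taken from any kernel source. */

/* Fraction of wall-clock time lost, given seconds lost over an interval. */
static double lost_fraction(double lost_s, double interval_s)
{
    return lost_s / interval_s;
}

/* Average timer ticks dropped per real second at a tick rate of hz,
 * assuming every lost second is explained by missed ticks. */
static double lost_ticks_per_second(double lost_s, double interval_s, double hz)
{
    return lost_fraction(lost_s, interval_s) * hz;
}
```

For a loss of 2 minutes per hour, lost_fraction(120, 3600) is 1/30, or about 3.3% of all ticks. At an assumed rate of 1000 ticks per second, lost_ticks_per_second(120, 3600, 1000) works out to roughly 33 ticks dropped every second.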
* Re: another must-fix: major PS/2 mouse problem
@ 2003-07-30 19:18 Mikael Pettersson
  0 siblings, 0 replies; 24+ messages in thread
From: Mikael Pettersson @ 2003-07-30 19:18 UTC
To: albert, linux-kernel; +Cc: 0, akpm, linux-yoann, pavel, vojtech, zwane

On 30 Jul 2003 08:29:32 -0400, Albert Cahalan wrote:
>> > > psmouse.c: Lost synchronization, throwing 3 bytes away.
>> > > psmouse.c: Lost synchronization, throwing 1 bytes away.
>> > >
>> > > Arrrrgh! The TSC is my only good time source!
>> >
>> > Arrrgh! More PS/2 problems!
>> >
>> > I think the lost synchronisation is the problem, would you agree?
>> >
>> > The person who fixes this gets a Nobel prize.
...
>It won't make the mouse run well. Maybe you'd stop the
>mouse from going crazy from time to time, but there'd
>still be temporary freezes from time to time. (not OK!)

FWIW, the problems my Dell Latitude had with the external
mice I use with it were significantly reduced once I added
"psmouse_noext" to the kernel's command line. That one
change eliminated all lost sync messages and general
craziness after resumes from suspended state.

To make the mouse move at proper speed w/o jerkiness I also
had to tweak the rate and scaling programmed into it to
match 2.4 defaults. (rate 100, scale 2:1)

In fairness, only my old Latitude has these PS/2 issues.

/Mikael
end of thread

Thread overview: 24+ messages
2003-06-01 1:46 another must-fix: major PS/2 mouse problem Albert Cahalan
2003-06-04 5:47 ` Yoann
[not found] ` <20030603232155.1488c02f.akpm@digeo.com>
2003-06-04 7:47 ` Vojtech Pavlik
2003-06-04 7:53 ` Andrew Morton
2003-06-04 8:00 ` Vojtech Pavlik
2003-06-04 8:14 ` Andrew Morton
2003-06-04 8:40 ` Vojtech Pavlik
2003-06-04 19:20 ` Yoann
2003-06-04 23:09 ` Albert Cahalan
[not found] ` <3EDCF47A.1060605@ifrance.com>
[not found] ` <1054681254.22103.3750.camel@cube>
[not found] ` <3EDD8850.9060808@ifrance.com>
2003-07-23 0:44 ` Albert Cahalan
2003-07-24 17:30 ` Andrew Morton
2003-07-25 1:46 ` Albert Cahalan
2003-07-26 3:19 ` Andrew Morton
2003-07-26 15:16 ` Zwane Mwaikambo
2003-07-29 2:55 ` Albert Cahalan
2003-07-29 3:14 ` Andrew Morton
2003-07-29 12:40 ` Albert Cahalan
2003-07-29 18:58 ` Andrew Morton
2003-07-29 19:36 ` Zwane Mwaikambo
2003-07-29 19:43 ` Chris Friesen
2003-07-30 5:08 ` Pavel Machek
2003-07-30 6:32 ` Andrew Morton
2003-07-30 12:29 ` Albert Cahalan
2003-07-30 19:18 Mikael Pettersson