* 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
@ 2009-10-11 22:41 ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 22:41 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

[Note:
  10 new reports in the last 10 days, but fortunately we're fixing them faster
  than they're being reported.]

This message contains a list of some regressions introduced between 2.6.30 and
2.6.31, for which there are no fixes in the mainline I know of.  If any of them
have been fixed already, please let me know.

If you know of any other unresolved regressions introduced between 2.6.30
and 2.6.31, please let me know as well and I'll add them to the list.
Also, please let me know if any of the entries below are invalid.

Each entry on the list will also be sent as an automatic reply to this
message, with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-12      161       45          35
  2009-10-02      151       49          42
  2009-09-06      123       34          27
  2009-08-26      108       33          26
  2009-08-20      102       32          29
  2009-08-10       89       27          24
  2009-08-02       76       36          28
  2009-07-27       70       51          43
  2009-07-07       35       25          21
  2009-06-29       22       22          15
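
The table above can be cross-checked with a few lines of Python. This is an
illustrative sketch only and not part of the original report: the rows are
copied from the table, and the helper name `deltas` is made up for this
example. It shows how the Total, Pending and Unresolved counts moved between
consecutive snapshots. In the latest row, Pending minus Unresolved is 10,
which matches the ten entries listed under "Regressions with patches";
whether the same relation holds for the older rows is an assumption.

```python
# Rows copied from the statistics table: (date, total, pending, unresolved).
SNAPSHOTS = [
    ("2009-10-12", 161, 45, 35),
    ("2009-10-02", 151, 49, 42),
    ("2009-09-06", 123, 34, 27),
    ("2009-08-26", 108, 33, 26),
    ("2009-08-20", 102, 32, 29),
    ("2009-08-10", 89, 27, 24),
    ("2009-08-02", 76, 36, 28),
    ("2009-07-27", 70, 51, 43),
    ("2009-07-07", 35, 25, 21),
    ("2009-06-29", 22, 22, 15),
]


def deltas(rows):
    """Yield (date, d_total, d_pending, d_unresolved) versus the previous snapshot."""
    ordered = sorted(rows)  # ISO dates sort chronologically as plain strings
    for prev, cur in zip(ordered, ordered[1:]):
        yield (cur[0], cur[1] - prev[1], cur[2] - prev[2], cur[3] - prev[3])


if __name__ == "__main__":
    for date, d_total, d_pending, d_unresolved in deltas(SNAPSHOTS):
        print(f"{date}  total {d_total:+d}  pending {d_pending:+d}  "
              f"unresolved {d_unresolved:+d}")
```

A positive change in Total together with a negative change in Pending, as in
the last snapshot, corresponds to the remark in the note that regressions are
being fixed faster than they are reported.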


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14391
Subject		: use after free of struct powernow_k8_data
Submitter	: Michal Schmidt <mschmidt@redhat.com>
Date		: 2009-09-24 14:51 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125380383515615&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14388
Subject		: keyboard under X with 2.6.31
Submitter	: Frédéric L. W. Meunier <fredlwm@gmail.com>
Date		: 2009-10-07 20:19 (5 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
References	: http://marc.info/?l=linux-kernel&m=125494753228217&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14385
Subject		: DMAR regression in 2.6.31 leads to ext4 corruption?
Submitter	: Andy Isaacson <adi@hexapodia.org>
Date		: 2009-10-08 23:56 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125504643703877&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14377
Subject		: "conservative" cpufreq governor broken
Submitter	: Steven Noonan <steven@uplinklabs.net>
Date		: 2009-10-05 16:32 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e21c9610991e95621a81407cdbab881226419b
References	: http://marc.info/?l=linux-kernel&m=125476067108252&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14329
Subject		: Sata disk doesn't wake up after S3 suspend
Submitter	:  <frodone@gmail.com>
Date		: 2009-10-05 22:58 (7 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14309
Subject		: MCA on hp rx8640
Submitter	: Andrew Patterson <andrew.patterson@hp.com>
Date		: 2009-09-29 17:20 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
References	: http://www.spinics.net/lists/linux-usb/msg22799.html


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
Subject		: kernel BUG at drivers/ide/ide-disk.c:187
Submitter	: Santiago Garcia Mantinan <manty@manty.net>
Date		: 2009-09-30 11:05 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
Handled-By	: David Miller <davem@davemloft.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
Subject		: Disassociating atheros wlan
Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Date		: 2009-09-24 10:16 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
Subject		: regression in page writeback
Submitter	: Shaohua Li <shaohua.li@intel.com>
Date		: 2009-09-22 5:49 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
Handled-By	: Wu Fengguang <fengguang.wu@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Date		: 2009-09-15 12:05 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
Subject		: ehci problem - mouse dead on scroll
Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
Date		: 2009-09-12 7:46 (30 days old)
References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14257
Subject		: Not able to boot on 32 bit System
Submitter	: Rishikesh <risrajak@linux.vnet.ibm.com>
Date		: 2009-09-21 15:25 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125354604314412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14256
Subject		: kernel BUG at fs/ext3/super.c:435
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2009-09-21 7:29 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125351816109264&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-09-20 11:26 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14249
Subject		: BUG: oops in gss_validate on 2.6.31
Submitter	: Bastian Blank <bastian@waldi.eu.org>
Date		: 2009-09-16 10:29 (26 days old)
References	: http://marc.info/?l=linux-kernel&m=125309700417283&w=4
Handled-By	: Trond Myklebust <trond.myklebust@fys.uio.no>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14248
Subject		: 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
Submitter	: Jurriaan <thunder8@xs4all.nl>
Date		: 2009-09-13 7:32 (29 days old)
References	: http://marc.info/?l=linux-kernel&m=125282721113553&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14204
Subject		: MCE prevent booting on my computer(pentium iii @500Mhz)
Submitter	: GNUtoo <GNUtoo@no-log.org>
Date		: 2009-09-21 20:36 (21 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
Subject		: Oops in driversbasefirmware_class
Submitter	:  <lars_ericsson@telia.com>
Date		: 2009-09-17 05:09 (25 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
Subject		: b43 causes panic at ifconfig down / shutdown
Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
Date		: 2009-09-15 18:34 (27 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14157
Subject		: end_request: I/O error, dev cciss/cXdX, sector 0
Submitter	:  <jiri.harcarik@gmail.com>
Date		: 2009-09-11 07:42 (31 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
Subject		: OOPS when setting nr_requests for md devices
Submitter	: aCaB <acab@clamav.net>
Date		: 2009-09-08 08:48 (34 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
Subject		: order 2 page allocation failures in iwlagn
Submitter	: Frans Pop <elendil@planet.nl>
Date		: 2009-09-06 7:40 (36 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ff05b2b4eac2e63d345fc731ea151a060247f53
References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
		  http://lkml.org/lkml/2009/10/2/86
		  http://lkml.org/lkml/2009/10/5/24
Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14114
Subject		: Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
Submitter	: Tsvety Petrov <Tsvetoslav.Petrov@itron.com>
Date		: 2009-09-03 21:06 (39 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14090
Subject		: WARNING: at fs/notify/inotify/inotify_user.c:394
Submitter	: Joerg Platte <bugzilla@jako.ping.de>
Date		: 2009-08-30 15:21 (43 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14070
Subject		: lockdep warning triggered by dup_fd
Submitter	: Bart Van Assche <bart.vanassche@gmail.com>
Date		: 2009-08-23 09:36 (50 days old)
References	: http://lkml.org/lkml/2009/8/23/8


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14058
Subject		: Oops in fsnotify
Submitter	: Grant Wilson <grant.wilson@zen.co.uk>
Date		: 2009-08-20 15:48 (53 days old)
References	: http://marc.info/?l=linux-kernel&m=125078450923133&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14013
Subject		: hd don't show up
Submitter	: Tim Blechmann <tim@klingt.org>
Date		: 2009-08-14 8:26 (59 days old)
References	: http://marc.info/?l=linux-kernel&m=125023842514480&w=4
Handled-By	: Tejun Heo <tj@kernel.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13987
Subject		: Received NMI interrupt at resume
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-08-15 07:55 (58 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13943
Subject		: WARNING: at net/mac80211/mlme.c:2292 with ath5k
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-08-06 20:15 (67 days old)
References	: http://marc.info/?l=linux-kernel&m=124958978600600&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13941
Subject		: x86 Geode issue
Submitter	: Martin-Éric Racine <q-funk@iki.fi>
Date		: 2009-08-03 12:58 (70 days old)
References	: http://marc.info/?l=linux-kernel&m=124930434732481&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13906
Subject		: Huawei E169 GPRS connection causes Ooops
Submitter	: Clemens Eisserer <linuxhippy@gmail.com>
Date		: 2009-08-04 09:02 (69 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13836
Subject		: suspend script fails, related to stdout?
Submitter	: Tomas M. <tmezzadra@gmail.com>
Date		: 2009-07-17 21:24 (87 days old)
References	: http://marc.info/?l=linux-kernel&m=124785853811667&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13809
Subject		: oprofile: possible circular locking dependency detected
Submitter	: Jerome Marchand <jmarchan@redhat.com>
Date		: 2009-07-22 13:35 (82 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13733
Subject		: 2.6.31-rc2: irq 16: nobody cared
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-07-06 18:32 (98 days old)
References	: http://marc.info/?l=linux-kernel&m=124690524027166&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13645
Subject		: NULL pointer dereference at (null) (level2_spare_pgt)
Submitter	: poornima nayak <mpnayak@linux.vnet.ibm.com>
Date		: 2009-06-17 17:56 (117 days old)
References	: http://lkml.org/lkml/2009/6/17/194


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14301
Subject		: WARNING: at net/ipv4/af_inet.c:154
Submitter	: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
Date		: 2009-09-30 12:24 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125431350218137&w=4
Handled-By	: Eric Dumazet <eric.dumazet@gmail.com>
Patch		: http://patchwork.kernel.org/patch/52743/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14275
Subject		: kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
Submitter	: gabriele balducci <balducci@units.it>
Date		: 2009-09-30 15:02 (12 days old)
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=14275#c0


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
Submitter	: Nix <nix@esperi.org.uk>
Date		: 2009-09-26 11:16 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
Patch		: http://patchwork.kernel.org/patch/50277/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14258
Subject		: Memory leak in SCSI initialization
Submitter	: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Date		: 2009-09-22 4:18 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125359311312243&w=4
Handled-By	: Michael Ellerman <michael@ellerman.id.au>
		  James Bottomley <James.Bottomley@suse.de>
Patch		: http://patchwork.kernel.org/patch/51412/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
Subject		: Oops in driversbasefirmware_class
Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
Date		: 2009-09-16 20:44 (26 days old)
References	: http://lkml.org/lkml/2009/9/16/461
Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Patch		: http://patchwork.kernel.org/patch/49914/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14137
Subject		: usb console regressions
Submitter	: Jason Wessel <jason.wessel@windriver.com>
Date		: 2009-09-05 21:08 (37 days old)
References	: http://marc.info/?l=linux-kernel&m=125218501310512&w=4
Handled-By	: Jason Wessel <jason.wessel@windriver.com>
Patch		: http://patchwork.kernel.org/patch/45953/
		  http://patchwork.kernel.org/patch/45952/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14129
Subject		: 2.6.31 regression - pci_get_slot oops, udev boot hang - toshiba X200
Submitter	: chepioq <chepioq@gmail.com>
Date		: 2009-09-06 07:01 (36 days old)
Handled-By	: Alex Chiang <achiang@hp.com>
		  Rafael J. Wysocki <rjw@sisk.pl>
Patch		: http://patchwork.kernel.org/patch/51834/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14017
Subject		: _end symbol missing from Symbol.map
Submitter	: Hannes Reinecke <hare@suse.de>
Date		: 2009-08-13 6:45 (60 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=091e52c3551d3031343df24b573b770b4c6c72b6
References	: http://marc.info/?l=linux-kernel&m=125014649102253&w=4
Handled-By	: Hannes Reinecke <hare@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=125014649102253&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13948
Subject		: ath5k broken after suspend-to-ram
Submitter	: Johannes Stezenbach <js@sig21.net>
Date		: 2009-08-07 21:51 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968192727854&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>
Patch		: http://patchwork.kernel.org/patch/38550/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13940
Subject		: 2.6.31-rc1 - iwlagn and sky2 stopped working when ACPI enabled - Toshiba U400-17b, Acer Aspire 8935G
Submitter	: Ricardo Jorge da Fonseca Marques Ferreira <storm@sys49152.net>
Date		: 2009-08-07 22:33 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968457731107&w=4
Handled-By	: Len Brown <lenb@kernel.org>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23280


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There is also a Bugzilla entry used for tracking the regressions introduced
between 2.6.30 and 2.6.31, unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=13615

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael



* 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
@ 2009-10-11 22:41 ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 22:41 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: DRI, Linux SCSI List, Network Development, Linux Wireless List,
	Natalie Protasevich, Linux ACPI, Andrew Morton,
	Kernel Testers List, Linus Torvalds, Linux PM List

[Note:
  10 new reports in the last 10 days, but fortunately we're fixing them faster
  than they're being reported.]

This message contains a list of some regressions introduced between 2.6.30 and
2.6.31, for which there are no fixes in the mainline I know of.  If any of them
have been fixed already, please let me know.

If you know of any other unresolved regressions introduced between 2.6.30
and 2.6.31, please let me know either and I'll add them to the list.
Also, please let me know if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-12      161       45          35
  2009-10-02      151       49          42
  2009-09-06      123       34          27
  2009-08-26      108       33          26
  2009-08-20      102       32          29
  2009-08-10       89       27          24
  2009-08-02       76       36          28
  2009-07-27       70       51          43
  2009-07-07       35       25          21
  2009-06-29       22       22          15


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14391
Subject		: use after free of struct powernow_k8_data
Submitter	: Michal Schmidt <mschmidt@redhat.com>
Date		: 2009-09-24 14:51 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125380383515615&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14388
Subject		: keyboard under X with 2.6.31
Submitter	: Frédéric L. W. Meunier <fredlwm@gmail.com>
Date		: 2009-10-07 20:19 (5 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
References	: http://marc.info/?l=linux-kernel&m=125494753228217&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14385
Subject		: DMAR regression in 2.6.31 leads to ext4 corruption?
Submitter	: Andy Isaacson <adi@hexapodia.org>
Date		: 2009-10-08 23:56 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125504643703877&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14377
Subject		: "conservative" cpufreq governor broken
Submitter	: Steven Noonan <steven@uplinklabs.net>
Date		: 2009-10-05 16:32 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e21c9610991e95621a81407cdbab881226419b
References	: http://marc.info/?l=linux-kernel&m=125476067108252&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14329
Subject		: Sata disk doesn't wake up after S3 suspend
Submitter	:  <frodone@gmail.com>
Date		: 2009-10-05 22:58 (7 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14309
Subject		: MCA on hp rx8640
Submitter	: Andrew Patterson <andrew.patterson@hp.com>
Date		: 2009-09-29 17:20 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
References	: http://www.spinics.net/lists/linux-usb/msg22799.html


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
Subject		: kernel BUG at drivers/ide/ide-disk.c:187
Submitter	: Santiago Garcia Mantinan <manty@manty.net>
Date		: 2009-09-30 11:05 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
Handled-By	: David Miller <davem@davemloft.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
Subject		: Disassociating atheros wlan
Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Date		: 2009-09-24 10:16 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
Subject		: regression in page writeback
Submitter	: Shaohua Li <shaohua.li@intel.com>
Date		: 2009-09-22 5:49 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
Handled-By	: Wu Fengguang <fengguang.wu@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Date		: 2009-09-15 12:05 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
Subject		: ehci problem - mouse dead on scroll
Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
Date		: 2009-09-12 7:46 (30 days old)
References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14257
Subject		: Not able to boot on 32 bit System
Submitter	: Rishikesh <risrajak@linux.vnet.ibm.com>
Date		: 2009-09-21 15:25 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125354604314412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14256
Subject		: kernel BUG at fs/ext3/super.c:435
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2009-09-21 7:29 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125351816109264&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-09-20 11:26 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14249
Subject		: BUG: oops in gss_validate on 2.6.31
Submitter	: Bastian Blank <bastian@waldi.eu.org>
Date		: 2009-09-16 10:29 (26 days old)
References	: http://marc.info/?l=linux-kernel&m=125309700417283&w=4
Handled-By	: Trond Myklebust <trond.myklebust@fys.uio.no>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14248
Subject		: 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
Submitter	: Jurriaan <thunder8@xs4all.nl>
Date		: 2009-09-13 7:32 (29 days old)
References	: http://marc.info/?l=linux-kernel&m=125282721113553&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14204
Subject		: MCE prevent booting on my computer(pentium iii @500Mhz)
Submitter	: GNUtoo <GNUtoo@no-log.org>
Date		: 2009-09-21 20:36 (21 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
Subject		: Oops in driversbasefirmware_class
Submitter	:  <lars_ericsson@telia.com>
Date		: 2009-09-17 05:09 (25 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
Subject		: b43 causes panic at ifconfig down / shutdown
Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
Date		: 2009-09-15 18:34 (27 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14157
Subject		: end_request: I/O error, dev cciss/cXdX, sector 0
Submitter	:  <jiri.harcarik@gmail.com>
Date		: 2009-09-11 07:42 (31 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
Subject		: OOPS when setting nr_requests for md devices
Submitter	: aCaB <acab@clamav.net>
Date		: 2009-09-08 08:48 (34 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
Subject		: order 2 page allocation failures in iwlagn
Submitter	: Frans Pop <elendil@planet.nl>
Date		: 2009-09-06 7:40 (36 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ff05b2b4eac2e63d345fc731ea151a060247f53
References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
		  http://lkml.org/lkml/2009/10/2/86
		  http://lkml.org/lkml/2009/10/5/24
Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14114
Subject		: Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
Submitter	: Tsvety Petrov <Tsvetoslav.Petrov@itron.com>
Date		: 2009-09-03 21:06 (39 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14090
Subject		: WARNING: at fs/notify/inotify/inotify_user.c:394
Submitter	: Joerg Platte <bugzilla@jako.ping.de>
Date		: 2009-08-30 15:21 (43 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14070
Subject		: lockdep warning triggered by dup_fd
Submitter	: Bart Van Assche <bart.vanassche@gmail.com>
Date		: 2009-08-23 09:36 (50 days old)
References	: http://lkml.org/lkml/2009/8/23/8


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14058
Subject		: Oops in fsnotify
Submitter	: Grant Wilson <grant.wilson@zen.co.uk>
Date		: 2009-08-20 15:48 (53 days old)
References	: http://marc.info/?l=linux-kernel&m=125078450923133&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14013
Subject		: hd don't show up
Submitter	: Tim Blechmann <tim@klingt.org>
Date		: 2009-08-14 8:26 (59 days old)
References	: http://marc.info/?l=linux-kernel&m=125023842514480&w=4
Handled-By	: Tejun Heo <tj@kernel.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13987
Subject		: Received NMI interrupt at resume
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-08-15 07:55 (58 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13943
Subject		: WARNING: at net/mac80211/mlme.c:2292 with ath5k
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-08-06 20:15 (67 days old)
References	: http://marc.info/?l=linux-kernel&m=124958978600600&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13941
Subject		: x86 Geode issue
Submitter	: Martin-Éric Racine <q-funk@iki.fi>
Date		: 2009-08-03 12:58 (70 days old)
References	: http://marc.info/?l=linux-kernel&m=124930434732481&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13906
Subject		: Huawei E169 GPRS connection causes Ooops
Submitter	: Clemens Eisserer <linuxhippy@gmail.com>
Date		: 2009-08-04 09:02 (69 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13836
Subject		: suspend script fails, related to stdout?
Submitter	: Tomas M. <tmezzadra@gmail.com>
Date		: 2009-07-17 21:24 (87 days old)
References	: http://marc.info/?l=linux-kernel&m=124785853811667&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13809
Subject		: oprofile: possible circular locking dependency detected
Submitter	: Jerome Marchand <jmarchan@redhat.com>
Date		: 2009-07-22 13:35 (82 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13733
Subject		: 2.6.31-rc2: irq 16: nobody cared
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-07-06 18:32 (98 days old)
References	: http://marc.info/?l=linux-kernel&m=124690524027166&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13645
Subject		: NULL pointer dereference at (null) (level2_spare_pgt)
Submitter	: poornima nayak <mpnayak@linux.vnet.ibm.com>
Date		: 2009-06-17 17:56 (117 days old)
References	: http://lkml.org/lkml/2009/6/17/194


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14301
Subject		: WARNING: at net/ipv4/af_inet.c:154
Submitter	: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
Date		: 2009-09-30 12:24 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125431350218137&w=4
Handled-By	: Eric Dumazet <eric.dumazet@gmail.com>
Patch		: http://patchwork.kernel.org/patch/52743/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14275
Subject		: kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
Submitter	: gabriele balducci <balducci@units.it>
Date		: 2009-09-30 15:02 (12 days old)
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=14275#c0


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
Submitter	: Nix <nix@esperi.org.uk>
Date		: 2009-09-26 11:16 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
Patch		: http://patchwork.kernel.org/patch/50277/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14258
Subject		: Memory leak in SCSI initialization
Submitter	: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Date		: 2009-09-22 4:18 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125359311312243&w=4
Handled-By	: Michael Ellerman <michael@ellerman.id.au>
		  James Bottomley <James.Bottomley@suse.de>
Patch		: http://patchwork.kernel.org/patch/51412/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
Subject		: Oops in driversbasefirmware_class
Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
Date		: 2009-09-16 20:44 (26 days old)
References	: http://lkml.org/lkml/2009/9/16/461
Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Patch		: http://patchwork.kernel.org/patch/49914/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14137
Subject		: usb console regressions
Submitter	: Jason Wessel <jason.wessel@windriver.com>
Date		: 2009-09-05 21:08 (37 days old)
References	: http://marc.info/?l=linux-kernel&m=125218501310512&w=4
Handled-By	: Jason Wessel <jason.wessel@windriver.com>
Patch		: http://patchwork.kernel.org/patch/45953/
		  http://patchwork.kernel.org/patch/45952/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14129
Subject		: 2.6.31 regression - pci_get_slot oops, udev boot hang - toshiba X200
Submitter	: chepioq <chepioq@gmail.com>
Date		: 2009-09-06 07:01 (36 days old)
Handled-By	: Alex Chiang <achiang@hp.com>
		  Rafael J. Wysocki <rjw@sisk.pl>
Patch		: http://patchwork.kernel.org/patch/51834/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14017
Subject		: _end symbol missing from Symbol.map
Submitter	: Hannes Reinecke <hare@suse.de>
Date		: 2009-08-13 6:45 (60 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=091e52c3551d3031343df24b573b770b4c6c72b6
References	: http://marc.info/?l=linux-kernel&m=125014649102253&w=4
Handled-By	: Hannes Reinecke <hare@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=125014649102253&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13948
Subject		: ath5k broken after suspend-to-ram
Submitter	: Johannes Stezenbach <js@sig21.net>
Date		: 2009-08-07 21:51 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968192727854&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>
Patch		: http://patchwork.kernel.org/patch/38550/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13940
Subject		: 2.6.31-rc1 - iwlagn and sky2 stopped working when ACPI enabled - Toshiba U400-17b, Acer Aspire 8935G
Submitter	: Ricardo Jorge da Fonseca Marques Ferreira <storm@sys49152.net>
Date		: 2009-08-07 22:33 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968457731107&w=4
Handled-By	: Len Brown <lenb@kernel.org>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23280


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions introduced
between 2.6.30 and 2.6.31, unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=13615

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13645] NULL pointer dereference at (null) (level2_spare_pgt)
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 22:41   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 22:41 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, poornima nayak

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13645
Subject		: NULL pointer dereference at (null) (level2_spare_pgt)
Submitter	: poornima nayak <mpnayak@linux.vnet.ibm.com>
Date		: 2009-06-17 17:56 (117 days old)
References	: http://lkml.org/lkml/2009/6/17/194



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13733] 2.6.31-rc2: irq 16: nobody cared
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 22:49   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 22:49 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Niel Lambrechts

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13733
Subject		: 2.6.31-rc2: irq 16: nobody cared
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-07-06 18:32 (98 days old)
References	: http://marc.info/?l=linux-kernel&m=124690524027166&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13941] x86 Geode issue
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Al Viro, Martin-Éric Racine

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13941
Subject		: x86 Geode issue
Submitter	: Martin-Éric Racine <q-funk@iki.fi>
Date		: 2009-08-03 12:58 (70 days old)
References	: http://marc.info/?l=linux-kernel&m=124930434732481&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13809] oprofile: possible circular locking dependency detected
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Jerome Marchand

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13809
Subject		: oprofile: possible circular locking dependency detected
Submitter	: Jerome Marchand <jmarchan@redhat.com>
Date		: 2009-07-22 13:35 (82 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13836] suspend script fails, related to stdout?
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Tomas M.

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13836
Subject		: suspend script fails, related to stdout?
Submitter	: Tomas M. <tmezzadra@gmail.com>
Date		: 2009-07-17 21:24 (87 days old)
References	: http://marc.info/?l=linux-kernel&m=124785853811667&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13906] Huawei E169 GPRS connection causes Ooops
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Clemens Eisserer

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13906
Subject		: Huawei E169 GPRS connection causes Ooops
Submitter	: Clemens Eisserer <linuxhippy@gmail.com>
Date		: 2009-08-04 09:02 (69 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13940] 2.6.31-rc1 - iwlagn and sky2 stopped working when ACPI enabled - Toshiba U400-17b, Acer Aspire 8935G
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Len Brown,
	Ricardo Jorge da Fonseca Marques Ferreira

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13940
Subject		: 2.6.31-rc1 - iwlagn and sky2 stopped working when ACPI enabled - Toshiba U400-17b, Acer Aspire 8935G
Submitter	: Ricardo Jorge da Fonseca Marques Ferreira <storm@sys49152.net>
Date		: 2009-08-07 22:33 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968457731107&w=4
Handled-By	: Len Brown <lenb@kernel.org>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23280



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13948] ath5k broken after suspend-to-ram
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bob Copeland, Johannes Stezenbach, Nick Kossifidis

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13948
Subject		: ath5k broken after suspend-to-ram
Submitter	: Johannes Stezenbach <js@sig21.net>
Date		: 2009-08-07 21:51 (66 days old)
References	: http://marc.info/?l=linux-kernel&m=124968192727854&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>
Patch		: http://patchwork.kernel.org/patch/38550/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13943] WARNING: at net/mac80211/mlme.c:2292 with ath5k
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Fabio Comolli, Luis R. Rodriguez

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13943
Subject		: WARNING: at net/mac80211/mlme.c:2292 with ath5k
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-08-06 20:15 (67 days old)
References	: http://marc.info/?l=linux-kernel&m=124958978600600&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #13987] Received NMI interrupt at resume
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Christian Casteyde

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13987
Subject		: Received NMI interrupt at resume
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-08-15 07:55 (58 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14070] lockdep warning triggered by dup_fd
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (10 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-12 17:10   ` Bart Van Assche
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Bart Van Assche

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14070
Subject		: lockdep warning triggered by dup_fd
Submitter	: Bart Van Assche <bart.vanassche@gmail.com>
Date		: 2009-08-23 09:36 (50 days old)
References	: http://lkml.org/lkml/2009/8/23/8



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14058] Oops in fsnotify
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Eric Paris, Grant Wilson

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14058
Subject		: Oops in fsnotify
Submitter	: Grant Wilson <grant.wilson@zen.co.uk>
Date		: 2009-08-20 15:48 (53 days old)
References	: http://marc.info/?l=linux-kernel&m=125078450923133&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14013] hd don't show up
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Tejun Heo, Tim Blechmann

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14013
Subject		: hd don't show up
Submitter	: Tim Blechmann <tim@klingt.org>
Date		: 2009-08-14 08:26 (59 days old)
References	: http://marc.info/?l=linux-kernel&m=125023842514480&w=4
Handled-By	: Tejun Heo <tj@kernel.org>



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14017] _end symbol missing from Symbol.map
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Hannes Reinecke

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14017
Subject		: _end symbol missing from Symbol.map
Submitter	: Hannes Reinecke <hare@suse.de>
Date		: 2009-08-13 06:45 (60 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=091e52c3551d3031343df24b573b770b4c6c72b6
References	: http://marc.info/?l=linux-kernel&m=125014649102253&w=4
Handled-By	: Hannes Reinecke <hare@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=125014649102253&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14141] order 2 page allocation failures in iwlagn
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (16 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-11 23:57     ` Frans Pop
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, David Rientjes, Frans Pop, Pekka Enberg,
	Reinette Chatre

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
Subject		: order 2 page allocation failures in iwlagn
Submitter	: Frans Pop <elendil@planet.nl>
Date		: 2009-09-06 07:40 (36 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ff05b2b4eac2e63d345fc731ea151a060247f53
References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
		  http://lkml.org/lkml/2009/10/2/86
		  http://lkml.org/lkml/2009/10/5/24
Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14090] WARNING: at fs/notify/inotify/inotify_user.c:394
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (15 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Joerg Platte

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14090
Subject		: WARNING: at fs/notify/inotify/inotify_user.c:394
Submitter	: Joerg Platte <bugzilla@jako.ping.de>
Date		: 2009-08-30 15:21 (43 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14137] usb console regressions
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (14 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Jason Wessel

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14137
Subject		: usb console regressions
Submitter	: Jason Wessel <jason.wessel@windriver.com>
Date		: 2009-09-05 21:08 (37 days old)
References	: http://marc.info/?l=linux-kernel&m=125218501310512&w=4
Handled-By	: Jason Wessel <jason.wessel@windriver.com>
Patch		: http://patchwork.kernel.org/patch/45953/
		  http://patchwork.kernel.org/patch/45952/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14114] Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Tsvety Petrov

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14114
Subject		: Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
Submitter	: Tsvety Petrov <Tsvetoslav.Petrov@itron.com>
Date		: 2009-09-03 21:06 (39 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14129] 2.6.31 regression - pci_get_slot oops, udev boot hang - toshiba X200
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (18 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alexander Chiang, Alex Chiang,
	Bjorn Helgaas, chepioq, Len Brown, Rafael J. Wysocki

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14129
Subject		: 2.6.31 regression - pci_get_slot oops, udev boot hang - toshiba X200
Submitter	: chepioq <chepioq@gmail.com>
Date		: 2009-09-06 07:01 (36 days old)
Handled-By	: Alex Chiang <achiang@hp.com>
		  Rafael J. Wysocki <rjw@sisk.pl>
Patch		: http://patchwork.kernel.org/patch/51834/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14181] b43 causes panic at ifconfig down / shutdown
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Jeremy Huddleston

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
Subject		: b43 causes panic at ifconfig down / shutdown
Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
Date		: 2009-09-15 18:34 (27 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14143] OOPS when setting nr_requests for md devices
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, aCaB

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
Subject		: OOPS when setting nr_requests for md devices
Submitter	: aCaB <acab@clamav.net>
Date		: 2009-09-08 08:48 (34 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14157] end_request: I/O error, dev cciss/cXdX, sector 0
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, jiri.harcarik

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14157
Subject		: end_request: I/O error, dev cciss/cXdX, sector 0
Submitter	:  <jiri.harcarik@gmail.com>
Date		: 2009-09-11 07:42 (31 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14248] 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (26 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Jurriaan

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14248
Subject		: 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
Submitter	: Jurriaan <thunder8@xs4all.nl>
Date		: 2009-09-13 7:32 (29 days old)
References	: http://marc.info/?l=linux-kernel&m=125282721113553&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14252] WARNING: at include/linux/skbuff.h:1382 w/ e1000
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (22 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-12 10:49   ` David Miller
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Stephan von Krawczynski

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-09-20 11:26 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14204] MCE prevent booting on my computer(pentium iii @500Mhz)
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, GNUtoo, Ingo Molnar

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14204
Subject		: MCE prevent booting on my computer(pentium iii @500Mhz)
Submitter	: GNUtoo <GNUtoo@no-log.org>
Date		: 2009-09-21 20:36 (21 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14185] Oops in driversbasefirmware_class
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, David Woodhouse, Frederik Deweerdt, lars_ericsson

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
Subject		: Oops in driversbasefirmware_class
Submitter	:  <lars_ericsson@telia.com>
Date		: 2009-09-17 05:09 (25 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14249] BUG: oops in gss_validate on 2.6.31
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (24 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bastian Blank, Trond Myklebust

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14249
Subject		: BUG: oops in gss_validate on 2.6.31
Submitter	: Bastian Blank <bastian@waldi.eu.org>
Date		: 2009-09-16 10:29 (26 days old)
References	: http://marc.info/?l=linux-kernel&m=125309700417283&w=4
Handled-By	: Trond Myklebust <trond.myklebust@fys.uio.no>



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14257] Not able to boot on 32 bit System
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (28 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Rishikesh

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14257
Subject		: Not able to boot on 32 bit System
Submitter	: Rishikesh <risrajak@linux.vnet.ibm.com>
Date		: 2009-09-21 15:25 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125354604314412&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14256] kernel BUG at fs/ext3/super.c:435
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (29 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Mikael Pettersson

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14256
Subject		: kernel BUG at fs/ext3/super.c:435
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2009-09-21 7:29 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125351816109264&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14258] Memory leak in SCSI initialization
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, James Bottomley, Michael Ellerman, Tetsuo Handa

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14258
Subject		: Memory leak in SCSI initialization
Submitter	: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Date		: 2009-09-22 4:18 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125359311312243&w=4
Handled-By	: Michael Ellerman <michael@ellerman.id.au>
		  James Bottomley <James.Bottomley@suse.de>
Patch		: http://patchwork.kernel.org/patch/51412/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14253] Oops in driversbasefirmware_class
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Frederik Deweerdt, Lars Ericsson

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
Subject		: Oops in driversbasefirmware_class
Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
Date		: 2009-09-16 20:44 (26 days old)
References	: http://lkml.org/lkml/2009/9/16/461
Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Patch		: http://patchwork.kernel.org/patch/49914/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14261] e1000e jumbo frames no longer work: 'Unsupported MTU setting'
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Alexander Duyck, Nix

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
Submitter	: Nix <nix@esperi.org.uk>
Date		: 2009-09-26 11:16 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
Patch		: http://patchwork.kernel.org/patch/50277/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14265] ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (32 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-12 11:05   ` David Miller
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Karol Lewandowski, Mel Gorman

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Date		: 2009-09-15 12:05 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14264] ehci problem - mouse dead on scroll
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Stern, Oliver Neukum, Volker Armin Hemmann

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
Subject		: ehci problem - mouse dead on scroll
Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
Date		: 2009-09-12 7:46 (30 days old)
References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
Handled-By	: Alan Stern <stern@rowland.harvard.edu>



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14275] kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (36 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-12 14:39     ` Chuck Ebbert
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, gabriele balducci

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14275
Subject		: kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
Submitter	: gabriele balducci <balducci@units.it>
Date		: 2009-09-30 15:02 (12 days old)
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=14275#c0



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14294] kernel BUG at drivers/ide/ide-disk.c:187
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (35 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-12 10:51   ` David Miller
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bartlomiej Zolnierkiewicz, David Miller,
	Santiago Garcia Mantinan

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
Subject		: kernel BUG at drivers/ide/ide-disk.c:187
Submitter	: Santiago Garcia Mantinan <manty@manty.net>
Date		: 2009-09-30 11:05 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
Handled-By	: David Miller <davem@davemloft.net>



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14267] Disassociating atheros wlan
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Johannes Berg, John W. Linville,
	Justin P. Mattock, Kristoffer Ericson

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
Subject		: Disassociating atheros wlan
Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Date		: 2009-09-24 10:16 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14266] regression in page writeback
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Morton, Chris Mason,
	Christoph Hellwig, Dave Chinner, Linus Torvalds, Peter Zijlstra,
	Richard Kennedy, Shaohua Li, Theodore Tso, Wu Fengguang

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
Subject		: regression in page writeback
Submitter	: Shaohua Li <shaohua.li@intel.com>
Date		: 2009-09-22 5:49 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
Handled-By	: Wu Fengguang <fengguang.wu@intel.com>



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14301] WARNING: at net/ipv4/af_inet.c:154
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (40 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Eric Dumazet, Ralf Hildebrandt

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14301
Subject		: WARNING: at net/ipv4/af_inet.c:154
Submitter	: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>
Date		: 2009-09-30 12:24 (12 days old)
References	: http://marc.info/?l=linux-kernel&m=125431350218137&w=4
Handled-By	: Eric Dumazet <eric.dumazet@gmail.com>
Patch		: http://patchwork.kernel.org/patch/52743/



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14385] DMAR regression in 2.6.31 leads to ext4 corruption?
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (38 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andy Isaacson, Chris Wright

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14385
Subject		: DMAR regression in 2.6.31 leads to ext4 corruption?
Submitter	: Andy Isaacson <adi@hexapodia.org>
Date		: 2009-10-08 23:56 (4 days old)
References	: http://marc.info/?l=linux-kernel&m=125504643703877&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14377] "conservative" cpufreq governor broken
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (41 preceding siblings ...)
  (?)
@ 2009-10-11 23:01 ` Rafael J. Wysocki
  2009-10-12  1:47   ` Steven Noonan
  -1 siblings, 1 reply; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Eero Nurkkala, Rik van Riel, Steven Noonan,
	Thomas Gleixner, Venkatesh Pallipadi

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14377
Subject		: "conservative" cpufreq governor broken
Submitter	: Steven Noonan <steven@uplinklabs.net>
Date		: 2009-10-05 16:32 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e21c9610991e95621a81407cdbab881226419b
References	: http://marc.info/?l=linux-kernel&m=125476067108252&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14329] Sata disk doesn't wake up after S3 suspend
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, frodone

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14329
Subject		: Sata disk doesn't wake up after S3 suspend
Submitter	:  <frodone@gmail.com>
Date		: 2009-10-05 22:58 (7 days old)



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14309] MCA on hp rx8640
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Patterson, David Woodhouse,
	David Woodhouse, Greg Kroah-Hartman

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14309
Subject		: MCA on hp rx8640
Submitter	: Andrew Patterson <andrew.patterson@hp.com>
Date		: 2009-09-29 17:20 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
References	: http://www.spinics.net/lists/linux-usb/msg22799.html



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14388] keyboard under X with 2.6.31
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Boyan, Dmitry Torokhov, Ed Tomlinson,
	Frédéric L. W. Meunier, Justin P. Mattock,
	Linus Torvalds, OGAWA Hirofumi

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14388
Subject		: keyboard under X with 2.6.31
Submitter	: Frédéric L. W. Meunier <fredlwm@gmail.com>
Date		: 2009-10-07 20:19 (5 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
References	: http://marc.info/?l=linux-kernel&m=125494753228217&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* [Bug #14391] use after free of struct powernow_k8_data
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:01   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-11 23:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Morton, Michal Schmidt,
	Naga Chumbalkar, Rusty Russell

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14391
Subject		: use after free of struct powernow_k8_data
Submitter	: Michal Schmidt <mschmidt@redhat.com>
Date		: 2009-09-24 14:51 (18 days old)
References	: http://marc.info/?l=linux-kernel&m=125380383515615&w=4



^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14267] Disassociating atheros wlan
  2009-10-11 23:01   ` Rafael J. Wysocki
@ 2009-10-11 23:11     ` Justin P. Mattock
  -1 siblings, 0 replies; 248+ messages in thread
From: Justin P. Mattock @ 2009-10-11 23:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Johannes Berg,
	John W. Linville, Kristoffer Ericson

Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
> Subject		: Disassociating atheros wlan
> Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
> Date		: 2009-09-24 10:16 (18 days old)
> References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4
>
>
>
>    
I attached my bisect log to the bug report, but I have not individually
tested some of the commits in the log to see whether it really finds the
issue.  (I can try that later on and see.)

So for now I say yes, keep it open.

Justin P. Mattock


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-11 23:24   ` Larry Finger
  -1 siblings, 0 replies; 248+ messages in thread
From: Larry Finger @ 2009-10-11 23:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI

On 10/11/2009 05:41 PM, Rafael J. Wysocki wrote:
> [Note:
>   10 new reports in the last 10 days, but fortunately we're fixing them faster
>   than they're being reported.]

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
> Subject		: b43 causes panic at ifconfig down / shutdown
> Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
> Date		: 2009-09-15 18:34 (27 days old)

A patch to fix this one is in the hands of the OP. It should be tested
within the next couple of days.

Larry

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14141] order 2 page allocation failures in iwlagn
  2009-10-11 23:01 ` [Bug #14141] order 2 page allocation failures in iwlagn Rafael J. Wysocki
@ 2009-10-11 23:57     ` Frans Pop
  0 siblings, 0 replies; 248+ messages in thread
From: Frans Pop @ 2009-10-11 23:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, David Rientjes,
	Pekka Enberg, Reinette Chatre

On Monday 12 October 2009, you wrote:
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
>
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
> Subject	: order 2 page allocation failures in iwlagn
> Submitter	: Frans Pop <elendil@planet.nl>
> Date		: 2009-09-06 7:40 (36 days old)
> References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
> 		  http://lkml.org/lkml/2009/10/2/86
> 		  http://lkml.org/lkml/2009/10/5/24
> Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>

See: http://lkml.indiana.edu/hypermail/linux/kernel/0910.1/01395.html

I don't see that message on lkml yet :-(

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #13948] ath5k broken after suspend-to-ram
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-12  0:19   ` Bob Copeland
  2009-10-12 21:24       ` Rafael J. Wysocki
  -1 siblings, 1 reply; 248+ messages in thread
From: Bob Copeland @ 2009-10-12  0:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List,
	Johannes Stezenbach, Nick Kossifidis

On Sun, Oct 11, 2009 at 7:01 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13948
> Subject         : ath5k broken after suspend-to-ram
> Submitter       : Johannes Stezenbach <js@sig21.net>
> Date            : 2009-08-07 21:51 (66 days old)
> References      : http://marc.info/?l=linux-kernel&m=124968192727854&w=4
> Handled-By      : Nick Kossifidis <mickflemm@gmail.com>
> Patch           : http://patchwork.kernel.org/patch/38550/

This patch was included in 2.6.31.2, so I believe this can go.

-- 
Bob Copeland %% www.bobcopeland.com

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14266] regression in page writeback
  2009-10-11 23:01   ` Rafael J. Wysocki
@ 2009-10-12  1:02     ` Shaohua Li
  -1 siblings, 0 replies; 248+ messages in thread
From: Shaohua Li @ 2009-10-12  1:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Andrew Morton,
	Chris Mason, Christoph Hellwig, Dave Chinner, Linus Torvalds,
	Peter Zijlstra, Richard Kennedy, Theodore Tso, Wu, Fengguang

On Mon, Oct 12, 2009 at 07:01:09AM +0800, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
> Subject		: regression in page writeback
> Submitter	: Shaohua Li <shaohua.li@intel.com>
> Date		: 2009-09-22 5:49 (20 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
> References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
> Handled-By	: Wu Fengguang <fengguang.wu@intel.com>
The regression has disappeared in the latest git tree.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14377] "conservative" cpufreq governor broken
  2009-10-11 23:01 ` [Bug #14377] "conservative" cpufreq governor broken Rafael J. Wysocki
@ 2009-10-12  1:47   ` Steven Noonan
  2009-10-12 21:39     ` Rafael J. Wysocki
  0 siblings, 1 reply; 248+ messages in thread
From: Steven Noonan @ 2009-10-12  1:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Eero Nurkkala,
	Rik van Riel, Thomas Gleixner, Venkatesh Pallipadi

Hi Rafael,

There's a commit to fix this in the stable queue for 2.6.31.x and said
fix is already in the 2.6.32 tree. The commit is titled "NOHZ: update
idle state also when NOHZ is inactive" (fdc6f192e7).

- Steven

On Sun, Oct 11, 2009 at 4:01 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=14377
> Subject         : "conservative" cpufreq governor broken
> Submitter       : Steven Noonan <steven@uplinklabs.net>
> Date            : 2009-10-05 16:32 (7 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f2e21c9610991e95621a81407cdbab881226419b
> References      : http://marc.info/?l=linux-kernel&m=125476067108252&w=4
>
>
>

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14261] e1000e jumbo frames no longer work: 'Unsupported MTU setting'
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-12  3:12   ` David Miller
  2009-10-12 21:32     ` Rafael J. Wysocki
  -1 siblings, 1 reply; 248+ messages in thread
From: David Miller @ 2009-10-12  3:12 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, kernel-testers, alexander.duyck, nix

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Mon, 12 Oct 2009 01:01:08 +0200 (CEST)

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
> Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
> Submitter	: Nix <nix@esperi.org.uk>
> Date		: 2009-09-26 11:16 (16 days old)
> References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
> Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
> Patch		: http://patchwork.kernel.org/patch/50277/

Fixed by:

commit a825e00c98a2ee37eb2a0ad93b352e79d2bc1593
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Date:   Fri Oct 2 12:30:42 2009 +0000

    e1000e: swap max hw supported frame size between 82574 and 82583
    
    There appears to have been a mixup in the max supported jumbo frame size
    between 82574 and 82583 which ended up disabling jumbo frames on the 82574
    as a result.  This patch swaps the two so that this issue is resolved.
    
    This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14261
    
    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #13943] WARNING: at net/mac80211/mlme.c:2292 with ath5k
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-12  7:24   ` Fabio Comolli
  2009-10-12 21:23       ` Rafael J. Wysocki
  -1 siblings, 1 reply; 248+ messages in thread
From: Fabio Comolli @ 2009-10-12  7:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Luis R. Rodriguez

Actually I switched to -32rc so I can't test it anymore... Sorry. I
can confirm it for 2.6.31.1.

Regards,
Fabio

On Mon, Oct 12, 2009 at 1:01 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13943
> Subject         : WARNING: at net/mac80211/mlme.c:2292 with ath5k
> Submitter       : Fabio Comolli <fabio.comolli@gmail.com>
> Date            : 2009-08-06 20:15 (67 days old)
> References      : http://marc.info/?l=linux-kernel&m=124958978600600&w=4
>
>
>

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14252] WARNING: at include/linux/skbuff.h:1382 w/ e1000
  2009-10-11 23:01 ` [Bug #14252] WARNING: at include/linux/skbuff.h:1382 w/ e1000 Rafael J. Wysocki
@ 2009-10-12 10:49   ` David Miller
  2009-10-12 11:44     ` Stephan von Krawczynski
  0 siblings, 1 reply; 248+ messages in thread
From: David Miller @ 2009-10-12 10:49 UTC (permalink / raw)
  To: rjw
  Cc: linux-kernel, kernel-testers, skraw, netdev, jeffrey.t.kirsher,
	jesse.brandeburg, peter.p.waskiewicz.jr

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Mon, 12 Oct 2009 01:01:06 +0200 (CEST)

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
> Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
> Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
> Date		: 2009-09-20 11:26 (22 days old)
> References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4

Hmmm... e1000 calls skb_trim() on both jumbo and non-jumbo ring
buffers which get recycled.

At least for the Jumbo case, that's illegal as you cannot call
skb_trim() on an SKB with paged data.

But this assertion is triggering for the non-jumbo ring where
only linear packets should be present as far as I can tell.
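
As a rough illustration of the rule above (a user-space sketch, not the
kernel code): an SKB counts as paged when part of its length lives outside
the linear buffer, and trimming is only valid when it does not.  The names
used here (toy_skb, toy_is_nonlinear, toy_trim) are hypothetical stand-ins;
the real helpers live in include/linux/skbuff.h and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sk_buff: only the two length fields matter here. */
struct toy_skb {
	unsigned int len;      /* total bytes: linear part plus paged part */
	unsigned int data_len; /* bytes held in page fragments (0 if linear) */
};

/* An skb is nonlinear (paged) when some of its data sits in fragments. */
static bool toy_is_nonlinear(const struct toy_skb *skb)
{
	return skb->data_len != 0;
}

/*
 * Shrink a linear skb to new_len bytes.  Returns false and leaves the skb
 * untouched when it carries paged data, which is the situation the
 * assertion discussed here is meant to catch.
 */
static bool toy_trim(struct toy_skb *skb, unsigned int new_len)
{
	if (toy_is_nonlinear(skb))
		return false;
	if (new_len < skb->len)
		skb->len = new_len;
	return true;
}
```

In this model a recycled non-jumbo buffer should always have data_len == 0,
so a failing trim there would point at a buffer that unexpectedly picked up
paged data.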

Some Intel folks need to take a look, CC:'d, and people need
to CC: their networking bug reports to netdev@vger.kernel.org
so that the proper folks see it.

Thanks.


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14294] kernel BUG at drivers/ide/ide-disk.c:187
  2009-10-11 23:01 ` [Bug #14294] kernel BUG at drivers/ide/ide-disk.c:187 Rafael J. Wysocki
@ 2009-10-12 10:51   ` David Miller
  2009-10-12 12:09     ` Santiago Garcia Mantinan
  0 siblings, 1 reply; 248+ messages in thread
From: David Miller @ 2009-10-12 10:51 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, kernel-testers, bzolnier, manty

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Mon, 12 Oct 2009 01:01:09 +0200 (CEST)

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
> Subject		: kernel BUG at drivers/ide/ide-disk.c:187
> Submitter	: Santiago Garcia Mantinan <manty@manty.net>
> Date		: 2009-09-30 11:05 (12 days old)
> References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
> Handled-By	: David Miller <davem@davemloft.net>

I gave the user a debugging patch, but they reported that they
can no longer trigger the issue with 2.6.31.1

See:

	http://marc.info/?l=linux-ide&m=125469615425454&w=4

Maybe that's enough to close this, I dunno.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14265] ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
  2009-10-11 23:01 ` [Bug #14265] ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100 Rafael J. Wysocki
@ 2009-10-12 11:05   ` David Miller
  2009-10-13 12:29     ` Karol Lewandowski
  0 siblings, 1 reply; 248+ messages in thread
From: David Miller @ 2009-10-12 11:05 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, kernel-testers, karol.k.lewandowski, mel, netdev

From: "Rafael J. Wysocki" <rjw@sisk.pl>
Date: Mon, 12 Oct 2009 01:01:08 +0200 (CEST)

[ Netdev CC:'d ]

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
> Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
> Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
> Date		: 2009-09-15 12:05 (27 days old)
> References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4

A 128K memory allocation fails after resume, film at 11.
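
For reference, the 128K figure follows from the "order:5" in the subject:
an order-N request asks the page allocator for 2^N physically contiguous
pages.  A minimal sketch of that arithmetic, assuming a 4 KiB page size
(TOY_PAGE_SIZE and toy_order_to_bytes are made-up names for illustration,
not kernel symbols):

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size: 4 KiB, as on x86.  Other architectures may differ. */
#define TOY_PAGE_SIZE 4096u

/* Bytes covered by a page allocation of the given order (2^order pages). */
static size_t toy_order_to_bytes(unsigned int order)
{
	return (size_t)TOY_PAGE_SIZE << order;
}
```

Under that assumption order 5 means 32 contiguous pages, i.e. 131072 bytes.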

That e100 driver code has been that way forever, so likely it's
something in the page allocator or similar that is making this more
likely to happen now.  Perhaps it's related to the iwlagn allocation
failures being tracked down in another thread.

It's a shame that pci_alloc_consistent() has to always use GFP_ATOMIC
for compatibility.

As far as I can tell, these code paths can sleep.  So maybe the
following hack would fix this for now.  Could someone test this?

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 679965c..c71729f 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -1780,9 +1780,9 @@ static void e100_clean_cbs(struct nic *nic)
 			nic->cb_to_clean = nic->cb_to_clean->next;
 			nic->cbs_avail++;
 		}
-		pci_free_consistent(nic->pdev,
-			sizeof(struct cb) * nic->params.cbs.count,
-			nic->cbs, nic->cbs_dma_addr);
+		dma_free_coherent(&nic->pdev->dev,
+				  sizeof(struct cb) * nic->params.cbs.count,
+				  nic->cbs, nic->cbs_dma_addr);
 		nic->cbs = NULL;
 		nic->cbs_avail = 0;
 	}
@@ -1800,8 +1800,10 @@ static int e100_alloc_cbs(struct nic *nic)
 	nic->cb_to_use = nic->cb_to_send = nic->cb_to_clean = NULL;
 	nic->cbs_avail = 0;
 
-	nic->cbs = pci_alloc_consistent(nic->pdev,
-		sizeof(struct cb) * count, &nic->cbs_dma_addr);
+	nic->cbs = dma_alloc_coherent(&nic->pdev->dev,
+				      sizeof(struct cb) * count,
+				      &nic->cbs_dma_addr,
+				      GFP_KERNEL);
 	if (!nic->cbs)
 		return -ENOMEM;
 
@@ -2655,16 +2657,16 @@ static int e100_do_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
 
 static int e100_alloc(struct nic *nic)
 {
-	nic->mem = pci_alloc_consistent(nic->pdev, sizeof(struct mem),
-		&nic->dma_addr);
+	nic->mem = dma_alloc_coherent(&nic->pdev->dev, sizeof(struct mem),
+				      &nic->dma_addr, GFP_KERNEL);
 	return nic->mem ? 0 : -ENOMEM;
 }
 
 static void e100_free(struct nic *nic)
 {
 	if (nic->mem) {
-		pci_free_consistent(nic->pdev, sizeof(struct mem),
-			nic->mem, nic->dma_addr);
+		dma_free_coherent(&nic->pdev->dev, sizeof(struct mem),
+				  nic->mem, nic->dma_addr);
 		nic->mem = NULL;
 	}
 }
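
For reference, the "order:5" in the bug subject and the 128K figure are
the same quantity.  A rough userspace sketch of the arithmetic, assuming
4 KiB pages (the helper names are made up for illustration and are not
kernel functions):

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, as on typical x86.  Other architectures differ. */
#define PAGE_SHIFT_ASSUMED 12
#define PAGE_SIZE_ASSUMED  ((size_t)1 << PAGE_SHIFT_ASSUMED)

/*
 * Smallest order such that (page size << order) >= size.
 * Illustration only; this is not the kernel's get_order().
 */
static unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;

	while ((PAGE_SIZE_ASSUMED << order) < size)
		order++;
	return order;
}

/* Bytes in one physically contiguous block of the given order. */
static size_t bytes_for_order(unsigned int order)
{
	return PAGE_SIZE_ASSUMED << order;
}
```

Under that assumption, order 5 is 32 contiguous pages, i.e. 128 KiB.  A
block that large can be hard to find once memory is fragmented, and a
GFP_ATOMIC request is not allowed to sleep while the allocator tries to
free some up, which is why switching to GFP_KERNEL may help here.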

^ permalink raw reply related	[flat|nested] 248+ messages in thread

* Re: [Bug #14252] WARNING: at include/linux/skbuff.h:1382 w/ e1000
  2009-10-12 10:49   ` David Miller
@ 2009-10-12 11:44     ` Stephan von Krawczynski
  0 siblings, 0 replies; 248+ messages in thread
From: Stephan von Krawczynski @ 2009-10-12 11:44 UTC (permalink / raw)
  To: David Miller
  Cc: rjw, linux-kernel, kernel-testers, netdev, jeffrey.t.kirsher,
	jesse.brandeburg, peter.p.waskiewicz.jr

On Mon, 12 Oct 2009 03:49:19 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> Date: Mon, 12 Oct 2009 01:01:06 +0200 (CEST)
> 
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.30 and 2.6.31.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
> > Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
> > Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
> > Date		: 2009-09-20 11:26 (22 days old)
> > References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4
> 
> Hmmm... e1000 calls skb_trim() on both jumbo and non-jumbo ring
> buffers which get recycled.
> 
> At least for the Jumbo case, that's illegal as you cannot call
> skb_trim() on an SKB with paged data.
> 
> But this assertion is triggering for the non-jumbo ring where
> only linear packets should be present as far as I can tell.
> 
> Some Intel folks need to take a look, CC:'d, and people need
> to CC: their networking bug reports to netdev@vger.kernel.org
> so that the proper folks see it.
> 
> Thanks.
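
To illustrate the linear vs. paged distinction above, here is a heavily
simplified userspace model.  It assumes the warning at skbuff.h:1382 is
the check that refuses to trim a buffer holding paged data; the struct
and function names are hypothetical, not the kernel's:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, stripped-down stand-in for struct sk_buff. */
struct model_skb {
	unsigned int len;      /* total bytes: linear part plus paged part */
	unsigned int data_len; /* bytes held in page fragments; 0 if linear */
};

/*
 * Trim the buffer to at most len bytes.  Returns true on success.
 * A buffer with paged data is left untouched and a warning is printed,
 * mirroring the assertion discussed above.
 */
static bool model_trim(struct model_skb *skb, unsigned int len)
{
	if (skb->data_len) {
		fprintf(stderr, "WARNING: trim on skb with paged data\n");
		return false;
	}
	if (len < skb->len)
		skb->len = len;
	return true;
}
```

In this model a recycled non-jumbo buffer (data_len == 0) trims fine,
so the warning firing on that ring would suggest a buffer with paged
data ended up there somehow.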

Really, this was a lucky catch, because most of the time the box goes dead right away.
Don't interpret "most of the time" as "continuously, every day". It just happens sometimes. I am not that surprised, because it's a box on the "frontline" and you can find a lot of trash going on there, like:

Oct 12 12:21:01 box kernel: TCP: Peer 217.231.204.133:61124/80 unexpectedly shrunk window 2348821413:2348838837 (repaired)
Oct 12 12:21:02 box kernel: TCP: Peer 217.231.204.133:61124/80 unexpectedly shrunk window 2348821413:2348838837 (repaired)

-- 
Regards,
Stephan


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14294] kernel BUG at drivers/ide/ide-disk.c:187
  2009-10-12 10:51   ` David Miller
@ 2009-10-12 12:09     ` Santiago Garcia Mantinan
  2009-10-12 21:38       ` Rafael J. Wysocki
  2009-10-12 23:21         ` David Miller
  0 siblings, 2 replies; 248+ messages in thread
From: Santiago Garcia Mantinan @ 2009-10-12 12:09 UTC (permalink / raw)
  To: David Miller; +Cc: rjw, linux-kernel, kernel-testers, bzolnier

Hi!

> I gave the user a debugging patch, but they reported that they
> can no longer trigger the issue with 2.6.31.1
> 
> See:
> 
> 	http://marc.info/?l=linux-ide&m=125469615425454&w=4
> 
> Maybe that's enough to close this, I dunno.

I'm the user with the debugging patch, I'd say no, don't close it.

Even though in my first tests the computer crashed on both weekends I
tested, it seems that crashing on a weekend was only luck and that
triggering the bug is more difficult than I had thought.

So... my comments saying that 2.6.31.1 seemed OK are probably wrong (I have
read the .1 changelog and there is an ata patch, but it is probably
unrelated). It is true that I had 2.6.31.1 running for a whole week without
it crashing, but it seems that crashing it may need more time.

I'm now back on 2.6.31 with the debugging patch (3 days uptime) in
order to gather more info, and after I get the info from 2.6.31 I'd test
2.6.31.latest or even 2.6.32whatever in case you find that better.

Regards...
-- 
Manty/BestiaTester -> http://manty.net

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
  2009-10-11 22:41 ` Rafael J. Wysocki
@ 2009-10-12 12:22   ` Frederik Deweerdt
  -1 siblings, 0 replies; 248+ messages in thread
From: Frederik Deweerdt @ 2009-10-12 12:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI

Hi Rafael,

On Mon, Oct 12, 2009 at 12:41:30AM +0200, Rafael J. Wysocki wrote:
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
> Subject		: Oops in drivers/base/firmware_class
> Submitter	:  <lars_ericsson@telia.com>
> Date		: 2009-09-17 05:09 (25 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
> Subject		: Oops in drivers/base/firmware_class
> Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
> Date		: 2009-09-16 20:44 (26 days old)
> References	: http://lkml.org/lkml/2009/9/16/461
> Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
> Patch		: http://patchwork.kernel.org/patch/49914/
> 
Those two are referring to the same bug.

Regards,
Frederik

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14143] OOPS when setting nr_requests for md devices
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-12 14:21   ` Chuck Ebbert
  2009-10-12 21:30     ` Rafael J. Wysocki
  -1 siblings, 1 reply; 248+ messages in thread
From: Chuck Ebbert @ 2009-10-12 14:21 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Kernel Testers List, aCaB

On Mon, 12 Oct 2009 01:01:05 +0200 (CEST)
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
> Subject		: OOPS when setting nr_requests for md devices
> Submitter	: aCaB <acab@clamav.net>
> Date		: 2009-09-08 08:48 (34 days old)
> 

Fixed in 2.6.32 by commit b8a9ae77 ("block: don't assume device has a
request list backing in nr_requests store")

Also in 2.6.31.1

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14275] kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
  2009-10-11 23:01 ` [Bug #14275] kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more? Rafael J. Wysocki
@ 2009-10-12 14:39     ` Chuck Ebbert
  0 siblings, 0 replies; 248+ messages in thread
From: Chuck Ebbert @ 2009-10-12 14:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, gabriele balducci

On Mon, 12 Oct 2009 01:01:09 +0200 (CEST)
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14275
> Subject		: kernel>=2.6.31: ahci.c: do not force unconditionally sb600 to 32bit dma any more?
> Submitter	: gabriele balducci <balducci@units.it>
> Date		: 2009-09-30 15:02 (12 days old)
> Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=14275#c0
> 

Already marked fixed in bugzilla.

Fixed by commit 2fcad9d271
("ahci: disable 64bit DMA by default on SB600s")

Not in 2.6.31-stable.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14070] lockdep warning triggered by dup_fd
  2009-10-11 23:01 ` [Bug #14070] lockdep warning triggered by dup_fd Rafael J. Wysocki
@ 2009-10-12 17:10   ` Bart Van Assche
  2009-10-12 21:26       ` Rafael J. Wysocki
  0 siblings, 1 reply; 248+ messages in thread
From: Bart Van Assche @ 2009-10-12 17:10 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Kernel Testers List

On Mon, Oct 12, 2009 at 1:01 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=14070
> Subject         : lockdep warning triggered by dup_fd
> Submitter       : Bart Van Assche <bart.vanassche@gmail.com>
> Date            : 2009-08-23 09:36 (50 days old)
> References      : http://lkml.org/lkml/2009/8/23/8
>

Hello Rafael,

Since I reported the above issue I haven't seen it reappearing yet.

Bart.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-12 18:53   ` Justin P. Mattock
  2009-10-12 21:41       ` Rafael J. Wysocki
  2009-10-12 22:59     ` Nix
  -1 siblings, 2 replies; 248+ messages in thread
From: Justin P. Mattock @ 2009-10-12 18:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	Linus Torvalds, OGAWA Hirofumi

Not sure where this stands. Right now all three machines I have seem
to be having no issues with the keyboard (xserver 1.6.*). I can go and
build the latest xserver (1.7) to see if I hit something.

Justin P. Mattock



On Oct 11, 2009, at 4:01 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry    : http://bugzilla.kernel.org/show_bug.cgi?id=14388
> Subject        : keyboard under X with 2.6.31
> Submitter    : Frédéric L. W. Meunier <fredlwm@gmail.com>
> Date        : 2009-10-07 20:19 (5 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
> References    : http://marc.info/?l=linux-kernel&m=125494753228217&w=4
>
>

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
  2009-10-11 22:41 ` Rafael J. Wysocki
                   ` (49 preceding siblings ...)
  (?)
@ 2009-10-12 19:58 ` Andrew Patterson
  2009-10-12 21:48   ` Rafael J. Wysocki
  2009-10-12 21:48     ` Rafael J. Wysocki
  -1 siblings, 2 replies; 248+ messages in thread
From: Andrew Patterson @ 2009-10-12 19:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI

On Mon, 2009-10-12 at 00:41 +0200, Rafael J. Wysocki wrote:
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14309
> Subject		: MCA on hp rx8640
> Submitter	: Andrew Patterson <andrew.patterson@hp.com>
> Date		: 2009-09-29 17:20 (13 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
> References	: http://www.spinics.net/lists/linux-usb/msg22799.html
> 

Linus fixed this one with d93a8f829fe1d2f3002f2c6ddb553d12db420412.  It
also looks like a duplicate of
http://bugzilla.kernel.org/show_bug.cgi?id=14374

Thanks,

Andrew
-- 
Andrew Patterson
Hewlett-Packard


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #13943] WARNING: at net/mac80211/mlme.c:2292 with ath5k
@ 2009-10-12 21:23       ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:23 UTC (permalink / raw)
  To: Fabio Comolli
  Cc: Linux Kernel Mailing List, Kernel Testers List, Luis R. Rodriguez

On Monday 12 October 2009, Fabio Comolli wrote:
> Actually I switched to -32rc so I can't test it anymore... Sorry. I
> can confirm it for 2.6.31.1.

Does it happen in -32-rc?

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #13948] ath5k broken after suspend-to-ram
@ 2009-10-12 21:24       ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:24 UTC (permalink / raw)
  To: Bob Copeland
  Cc: Linux Kernel Mailing List, Kernel Testers List,
	Johannes Stezenbach, Nick Kossifidis

On Monday 12 October 2009, Bob Copeland wrote:
> On Sun, Oct 11, 2009 at 7:01 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13948
> > Subject         : ath5k broken after suspend-to-ram
> > Submitter       : Johannes Stezenbach <js@sig21.net>
> > Date            : 2009-08-07 21:51 (66 days old)
> > References      : http://marc.info/?l=linux-kernel&m=124968192727854&w=4
> > Handled-By      : Nick Kossifidis <mickflemm@gmail.com>
> > Patch           : http://patchwork.kernel.org/patch/38550/
> 
> This patch was included in 2.6.31.2, so I believe this can go.

Thanks, closing.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14070] lockdep warning triggered by dup_fd
@ 2009-10-12 21:26       ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:26 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Linux Kernel Mailing List, Kernel Testers List

On Monday 12 October 2009, Bart Van Assche wrote:
> On Mon, Oct 12, 2009 at 1:01 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.30 and 2.6.31.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=14070
> > Subject         : lockdep warning triggered by dup_fd
> > Submitter       : Bart Van Assche <bart.vanassche@gmail.com>
> > Date            : 2009-08-23 09:36 (50 days old)
> > References      : http://lkml.org/lkml/2009/8/23/8
> >
> 
> Hello Rafael,
> 
> Since I reported the above issue I haven't seen it reappearing yet.

Thanks, closing for now.  Please reopen if you see it again.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14141] order 2 page allocation failures in iwlagn
@ 2009-10-12 21:29       ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:29 UTC (permalink / raw)
  To: Frans Pop
  Cc: Linux Kernel Mailing List, Kernel Testers List, David Rientjes,
	Pekka Enberg, Reinette Chatre

On Monday 12 October 2009, Frans Pop wrote:
> On Monday 12 October 2009, you wrote:
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> >
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
> > Subject	: order 2 page allocation failures in iwlagn
> > Submitter	: Frans Pop <elendil@planet.nl>
> > Date		: 2009-09-06 7:40 (36 days old)
> > References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
> > 		  http://lkml.org/lkml/2009/10/2/86
> > 		  http://lkml.org/lkml/2009/10/5/24
> > Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>
> 
> See: http://lkml.indiana.edu/hypermail/linux/kernel/0910.1/01395.html
> 
> I don't see that message on lkml yet :-(

I've added the link to the bug entry, thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14143] OOPS when setting nr_requests for md devices
  2009-10-12 14:21   ` Chuck Ebbert
@ 2009-10-12 21:30     ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:30 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: Linux Kernel Mailing List, Kernel Testers List, aCaB

On Monday 12 October 2009, Chuck Ebbert wrote:
> On Mon, 12 Oct 2009 01:01:05 +0200 (CEST)
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.30 and 2.6.31.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
> > Subject		: OOPS when setting nr_requests for md devices
> > Submitter	: aCaB <acab@clamav.net>
> > Date		: 2009-09-08 08:48 (34 days old)
> > 
> 
> Fixed in 2.6.32 by commit b8a9ae77 ("block: don't assume device has a
> request list backing in nr_requests store")
> 
> Also in 2.6.31.1

Thanks, closing.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14261] e1000e jumbo frames no longer work: 'Unsupported MTU setting'
  2009-10-12  3:12   ` David Miller
@ 2009-10-12 21:32     ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:32 UTC (permalink / raw)
  To: David Miller; +Cc: linux-kernel, kernel-testers, alexander.duyck, nix

On Monday 12 October 2009, David Miller wrote:
> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> Date: Mon, 12 Oct 2009 01:01:08 +0200 (CEST)
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14261
> > Subject		: e1000e jumbo frames no longer work: 'Unsupported MTU setting'
> > Submitter	: Nix <nix@esperi.org.uk>
> > Date		: 2009-09-26 11:16 (16 days old)
> > References	: http://marc.info/?l=linux-kernel&m=125396433321342&w=4
> > Handled-By	: Alexander Duyck <alexander.duyck@gmail.com>
> > Patch		: http://patchwork.kernel.org/patch/50277/
> 
> Fixed by:
> 
> commit a825e00c98a2ee37eb2a0ad93b352e79d2bc1593
> Author: Alexander Duyck <alexander.h.duyck@intel.com>
> Date:   Fri Oct 2 12:30:42 2009 +0000
> 
>     e1000e: swap max hw supported frame size between 82574 and 82583
>     
>     There appears to have been a mixup in the max supported jumbo frame size
>     between 82574 and 82583 which ended up disabling jumbo frames on the 82574
>     as a result.  This patch swaps the two so that this issue is resolved.
>     
>     This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14261
>     
>     Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>     Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>

Thanks, closing.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14266] regression in page writeback
  2009-10-12  1:02     ` Shaohua Li
  (?)
@ 2009-10-12 21:34     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:34 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Linux Kernel Mailing List, Kernel Testers List, Andrew Morton,
	Chris Mason, Christoph Hellwig, Dave Chinner, Linus Torvalds,
	Peter Zijlstra, Richard Kennedy, Theodore Tso, Wu, Fengguang

On Monday 12 October 2009, Shaohua Li wrote:
> On Mon, Oct 12, 2009 at 07:01:09AM +0800, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.30 and 2.6.31.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14266
> > Subject		: regression in page writeback
> > Submitter	: Shaohua Li <shaohua.li@intel.com>
> > Date		: 2009-09-22 5:49 (20 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7831a0bdf06b9f722b947bb0c205ff7d77cebd8
> > References	: http://marc.info/?l=linux-kernel&m=125359858117176&w=4
> > Handled-By	: Wu Fengguang <fengguang.wu@intel.com>
> The regression has disappeared in the latest git tree.

Thanks, closing.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14267] Disassociating atheros wlan
  2009-10-11 23:11     ` Justin P. Mattock
  (?)
@ 2009-10-12 21:35     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:35 UTC (permalink / raw)
  To: Justin P. Mattock
  Cc: Linux Kernel Mailing List, Kernel Testers List, Johannes Berg,
	John W. Linville, Kristoffer Ericson

On Monday 12 October 2009, Justin P. Mattock wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.30 and 2.6.31.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
> > Subject		: Disassociating atheros wlan
> > Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
> > Date		: 2009-09-24 10:16 (18 days old)
> > References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4
> >
> >
> >
> >    
> I attached my bisect log to the bug report, but I did not individually test
> some of the commits in the log to confirm that the bisect finds the issue.
> (I can try later on and see.)
> 
> So for now I say yes, keep it open.

OK, thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14294] kernel BUG at drivers/ide/ide-disk.c:187
  2009-10-12 12:09     ` Santiago Garcia Mantinan
@ 2009-10-12 21:38       ` Rafael J. Wysocki
  2009-10-12 23:21         ` David Miller
  1 sibling, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:38 UTC (permalink / raw)
  To: Santiago Garcia Mantinan
  Cc: David Miller, linux-kernel, kernel-testers, bzolnier

On Monday 12 October 2009, Santiago Garcia Mantinan wrote:
> Hi!
> 
> > I gave the user a debugging patch, but they reported that they
> > can no longer trigger the issue with 2.6.31.1
> > 
> > See:
> > 
> > 	http://marc.info/?l=linux-ide&m=125469615425454&w=4
> > 
> > Maybe that's enough to close this, I dunno.
> 
> I'm the user with the debugging patch, I'd say no, don't close it.

OK, still open.

Thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14377] "conservative" cpufreq governor broken
  2009-10-12  1:47   ` Steven Noonan
@ 2009-10-12 21:39     ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:39 UTC (permalink / raw)
  To: Steven Noonan
  Cc: Linux Kernel Mailing List, Kernel Testers List, Eero Nurkkala,
	Rik van Riel, Thomas Gleixner, Venkatesh Pallipadi

On Monday 12 October 2009, Steven Noonan wrote:
> Hi Rafael,
> 
> There's a commit to fix this in the stable queue for 2.6.31.x and said
> fix is already in the 2.6.32 tree. The commit is titled "NOHZ: update
> idle state also when NOHZ is inactive" (fdc6f192e7).

Thanks, closing.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-12 21:41       ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:41 UTC (permalink / raw)
  To: Justin P. Mattock
  Cc: Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,  Frédéric L. W. Meunier,
	Linus Torvalds, OGAWA Hirofumi

On Monday 12 October 2009, Justin P. Mattock wrote:
> Not sure where this stands. Right now all three machines I have seem
> to be having no issues with the keyboard
> (xserver 1.6.*). I can go and build the latest xserver (1.7) to see if I
> hit something.

Thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
  2009-10-11 23:24   ` Larry Finger
@ 2009-10-12 21:43     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:43 UTC (permalink / raw)
  To: Larry Finger
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI

On Monday 12 October 2009, Larry Finger wrote:
> On 10/11/2009 05:41 PM, Rafael J. Wysocki wrote:
> > [Note:
> >   10 new reports in the last 10 days, but fortunately we're fixing them faster
> >   than they're being reported.]
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
> > Subject		: b43 causes panic at ifconfig down / shutdown
> > Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
> > Date		: 2009-09-15 18:34 (27 days old)
> 
> A patch to fix this one is in the hands of the OP. It should be tested
> within the next couple of days.

Great, thanks for the update.

Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
  2009-10-12 12:22   ` Frederik Deweerdt
@ 2009-10-12 21:46     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:46 UTC (permalink / raw)
  To: Frederik Deweerdt
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI

On Monday 12 October 2009, Frederik Deweerdt wrote:
> Hi Rafael,
> 
> On Mon, Oct 12, 2009 at 12:41:30AM +0200, Rafael J. Wysocki wrote:
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14185
> > Subject		: Oops in driversbasefirmware_class
> > Submitter	:  <lars_ericsson@telia.com>
> > Date		: 2009-09-17 05:09 (25 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e03a201bbe8137487f340d26aa662110e324b20
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
> > Subject		: Oops in driversbasefirmware_class
> > Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
> > Date		: 2009-09-16 20:44 (26 days old)
> > References	: http://lkml.org/lkml/2009/9/16/461
> > Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
> > Patch		: http://patchwork.kernel.org/patch/49914/
> > 
> Those two refer to the same bug.

Thanks, I closed #14185 as a duplicate of #14253.

Best,
Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: 2.6.32-rc4: Reported regressions 2.6.30 -> 2.6.31
  2009-10-12 19:58 ` Andrew Patterson
@ 2009-10-12 21:48     ` Rafael J. Wysocki
  2009-10-12 21:48     ` Rafael J. Wysocki
  1 sibling, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-12 21:48 UTC (permalink / raw)
  To: Andrew Patterson
  Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI

On Monday 12 October 2009, Andrew Patterson wrote:
> On Mon, 2009-10-12 at 00:41 +0200, Rafael J. Wysocki wrote:
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14309
> > Subject		: MCA on hp rx8640
> > Submitter	: Andrew Patterson <andrew.patterson@hp.com>
> > Date		: 2009-09-29 17:20 (13 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db8be50c4307dac2b37305fc59c8dc0f978d09ea
> > References	: http://www.spinics.net/lists/linux-usb/msg22799.html
> > 
> 
> Linus fixed this one with d93a8f829fe1d2f3002f2c6ddb553d12db420412.  It
> also looks like a duplicate of
> http://bugzilla.kernel.org/show_bug.cgi?id=14374

Thanks, I closed #14309 as a duplicate of #14374 that's already closed.

Best,
Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-12 18:53   ` Justin P. Mattock
  2009-10-12 21:41       ` Rafael J. Wysocki
@ 2009-10-12 22:59     ` Nix
  2009-10-12 23:38       ` Alan Cox
  2009-10-13  0:16       ` Linus Torvalds
  1 sibling, 2 replies; 248+ messages in thread
From: Nix @ 2009-10-12 22:59 UTC (permalink / raw)
  To: Justin P. Mattock
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Dmitry Torokhov, Ed Tomlinson,
	Frédéric L. W. Meunier, Linus Torvalds, OGAWA Hirofumi

On 12 Oct 2009, Justin P. Mattock uttered the following:

> Not sure where this stands. Right now all three machines I have seem to be having no issues with the keyboard
> (xserver 1.6.*). I can go and build the latest xserver (1.7) to see if I hit something.
[...]
>> Bug-Entry    : http://bugzilla.kernel.org/show_bug.cgi?id=14388
>> Subject        : keyboard under X with 2.6.31
>> Submitter    : Frédéric L. W. Meunier <fredlwm@gmail.com>
>> Date        : 2009-10-07 20:19 (5 days old)
>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
>> References    : http://marc.info/?l=linux-kernel&m=125494753228217&w=4

I have been seeing problems precisely like those described (sometimes
the keyboard dies, sometimes it gets 'stuck' with a key held down, until
I switch TTYs, which generally means killing X as I'm not aware of an
easy way to switch VTs using only the mouse), since I moved to 2.6.31,
using the kbd driver from X.org git with head commit
158d33c15df60696946031a0319e2bd2ec8b9541, and version 1.6.3.901 of the X
server, old enough that I'd been using it for a couple of weeks before
switching to 2.6.31 without incident. (Note, I'm using kbd, not
evdev. Maybe this is a common factor among everyone seeing failures: I
don't know.)

So it seems likely to me that this is a kernel bug, somewhere, and the
TTY layer seems like a good place to look (OK, a horrible place, but a
*likely* place).

I'm about to try reverting the suggested commit and will report back. I
see this failure about once a day, so I'll give it three days to go
wrong and then (if it doesn't) will presume it works and so inform you.


(Of course with this commit reverted Emacsen start dropping data from
their ptys, and as bad luck would have it I live in (X)Emacs, but that's
on a different machine! so I can have my compile buffer data *and* not
destroy X ;} )

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14294] kernel BUG at drivers/ide/ide-disk.c:187
@ 2009-10-12 23:21         ` David Miller
  0 siblings, 0 replies; 248+ messages in thread
From: David Miller @ 2009-10-12 23:21 UTC (permalink / raw)
  To: manty; +Cc: rjw, linux-kernel, kernel-testers, bzolnier

From: Santiago Garcia Mantinan <manty@manty.net>
Date: Mon, 12 Oct 2009 14:09:43 +0200

> I'm now back to the 2.6.31 with the patch to debug it (3 days uptime) in
> order to gather more info and after I get the info from 2.6.31 I'd test
> 2.6.31.latest or even 2.6.32whatever in case you find that better.

Ok, thanks for the update.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-12 22:59     ` Nix
@ 2009-10-12 23:38       ` Alan Cox
  2009-10-12 23:46           ` Dmitry Torokhov
  2009-10-13  2:00         ` Daniel Hazelton
  2009-10-13  0:16       ` Linus Torvalds
  1 sibling, 2 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-12 23:38 UTC (permalink / raw)
  To: Nix
  Cc: Justin P. Mattock, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Dmitry Torokhov, Ed Tomlinson,
	Frédéric L. W. Meunier, Linus Torvalds, OGAWA Hirofumi

> So it seems likely to me that this is a kernel bug, somewhere, and the
> TTY layer seems like a good place to look (OK, a horrible place, but a
> *likely* place).

Somewhere around 2.6.29-30 various things went funny in the keyboard
layer for me - notably characters "bleeding" across console switches.

> 
> I'm about to try reverting the suggested commit and will report back. I
> see this failure about once a day, so I'll give it three days to go
> wrong and then (if it doesn't) will presume it works and so inform you.
> 
> 
> (Of course with this commit reverted Emacsen start dropping data from
> their ptys, and as bad luck would have it I live in (X)Emacs, but that's
> on a different machine! so I can have my compile buffer data *and* not
> destroy X ;} )

X doesn't touch the pty layer. It touches vt (extensively) and the input
layers. Its vt/kbd access is also very raw so bypasses much of that
layer. That isn't to say tty isn't the cause but look for input layer
changes too.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-12 23:46           ` Dmitry Torokhov
  0 siblings, 0 replies; 248+ messages in thread
From: Dmitry Torokhov @ 2009-10-12 23:46 UTC (permalink / raw)
  To: Alan Cox
  Cc: Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Ed Tomlinson, Frédéric L. W. Meunier, Linus Torvalds,
	OGAWA Hirofumi

On Tue, Oct 13, 2009 at 12:38:41AM +0100, Alan Cox wrote:
> > So it seems likely to me that this is a kernel bug, somewhere, and the
> > TTY layer seems like a good place to look (OK, a horrible place, but a
> > *likely* place).
> 
> Somewhere around 2.6.29-30 various things went funny in the keyboard
> layer for me - notably characters "bleeding" across console switches.
> 

What do you mean by "bleeding"? Are you sure it is not autorepeat
kicking in?

> > 
> > I'm about to try reverting the suggested commit and will report back. I
> > see this failure about once a day, so I'll give it three days to go
> > wrong and then (if it doesn't) will presume it works and so inform you.
> > 
> > 
> > (Of course with this commit reverted Emacsen start dropping data from
> > their ptys, and as bad luck would have it I live in (X)Emacs, but that's
> > on a different machine! so I can have my compile buffer data *and* not
> > destroy X ;} )
> 
> X doesn't touch the pty layer. It touches vt (extensively) and the input
> layers. Its vt/kbd access is also very raw so bypasses much of that
> layer. That isn't to say tty isn't the cause but look for input layer
> changes too.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-12 23:46           ` Dmitry Torokhov
  (?)
@ 2009-10-13  0:14           ` Justin P. Mattock
  -1 siblings, 0 replies; 248+ messages in thread
From: Justin P. Mattock @ 2009-10-13  0:14 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Alan Cox, Nix, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Ed Tomlinson,
	"Frédéric L. W. Meunier",
	Linus Torvalds, OGAWA Hirofumi

Dmitry Torokhov wrote:
> On Tue, Oct 13, 2009 at 12:38:41AM +0100, Alan Cox wrote:
>    
>>> So it seems likely to me that this is a kernel bug, somewhere, and the
>>> TTY layer seems like a good place to look (OK, a horrible place, but a
>>> *likely* place).
>>>        
>> Somewhere around 2.6.29-30 various things went funny in the keyboard
>> layer for me - notably characters "bleeding" across console switches.
>>
>>      
>
> What do you mean by "bleeding"? Are you sure it is not autorepeat
> kicking in?
>
>    
>>> I'm about to try reverting the suggested commit and will report back. I
>>> see this failure about once a day, so I'll give it three days to go
>>> wrong and then (if it doesn't) will presume it works and so inform you.
>>>
>>>
>>> (Of course with this commit reverted Emacsen start dropping data from
>>> their ptys, and as bad luck would have it I live in (X)Emacs, but that's
>>> on a different machine! so I can have my compile buffer data *and* not
>>> destroy X ;} )
>>>        
>> X doesn't touch the pty layer. It touches vt (extensively) and the input
>> layers. Its vt/kbd access is also very raw so bypasses much of that
>> layer. That isn't to say tty isn't the cause but look for input layer
>> changes too.
>>      
>
>    
FWIW:
Something I noticed with the latest Fedora/Ubuntu is that, with a terminal
open, the history (e.g. pressing the up arrow) will just start firing off
as if I had pressed the up arrow key and held it, all the way to the end of
the history file (.bash_history). It seems to happen at random while I'm
compiling, most notably during ./configure.
(When this happens the screen will be garbled with characters similar to
this: ^C)

During my CLFS build I used Fedora as the host system, and this behavior
went right into the newly created system. When I built another system, I
used Ubuntu and it seems to be not as bad, but still present. (I'm thinking
that if this is what others are experiencing, it must be something in
userspace.)

Justin P. Mattock

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-12 22:59     ` Nix
  2009-10-12 23:38       ` Alan Cox
@ 2009-10-13  0:16       ` Linus Torvalds
  2009-10-13  2:54           ` Frédéric L. W. Meunier
  2009-10-13  3:24         ` Linus Torvalds
  1 sibling, 2 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13  0:16 UTC (permalink / raw)
  To: Nix
  Cc: Justin P. Mattock, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Dmitry Torokhov, Ed Tomlinson,
	Frédéric L. W. Meunier, OGAWA Hirofumi



On Mon, 12 Oct 2009, Nix wrote:
> On 12 Oct 2009, Justin P. Mattock uttered the following:
> 
> > Not sure where this stands. Right now all three machines I have seem to be having no issues with the keyboard
> > (xserver 1.6.*). I can go and build the latest xserver (1.7) to see if I hit something.
> [...]
> >> Bug-Entry    : http://bugzilla.kernel.org/show_bug.cgi?id=14388
> >> Subject        : keyboard under X with 2.6.31
> >> Submitter    : Frédéric L. W. Meunier <fredlwm@gmail.com>
> >> Date        : 2009-10-07 20:19 (5 days old)
> >> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
> >> References    : http://marc.info/?l=linux-kernel&m=125494753228217&w=4
> 
> I have been seeing problems precisely like those described (sometimes
> the keyboard dies, sometimes it gets 'stuck' with a key held down, until
> I switch TTYs, which generally means killing X as I'm not aware of an
> easy way to switch VTs using only the mouse), since I moved to 2.6.31

The particular commit that was bisected to should really not matter for X, 
except perhaps from a timing standpoint.

The problem it fixed was in pty's, and X doesn't use them much if at all 
(various X _programs_ may, of course, but the symptoms don't sound like 
it's just a particular X app that has issues, but more of a generic X 
keyboard handling thing)

But for non-pty's, there should be no semantic changes from that commit 
outside of some general tty timing differences by doing that 
tty_flush_to_ldisc() at new points.

I could fairly easily imagine that some timing difference does expose 
another longer-standing problem in either the kernel or X itself. So the 
bisection isn't necessarily wrong, it's just not likely telling us what 
the real problem is.

Of course, maybe there is some race condition in the tty_buffer.c code. We 
_used_ to not call flush_to_ldisc() except through the workqueue code, so
races would not be seen in normal circumstances. Now flush_to_ldisc()
could easily get called synchronously from tty_read()/tty_poll() while
also being hit from the workqueues.

Alan, Ogawa-san, do either of you see some problem in tty_buffer.c, 
perhaps?

		Linus



^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-12 23:38       ` Alan Cox
  2009-10-12 23:46           ` Dmitry Torokhov
@ 2009-10-13  2:00         ` Daniel Hazelton
  1 sibling, 0 replies; 248+ messages in thread
From: Daniel Hazelton @ 2009-10-13  2:00 UTC (permalink / raw)
  To: Alan Cox
  Cc: Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,  Frédéric L. W. Meunier,
	Linus Torvalds, OGAWA Hirofumi

On Monday 12 October 2009 07:38:41 pm Alan Cox wrote:
> > So it seems likely to me that this is a kernel bug, somewhere, and the
> > TTY layer seems like a good place to look (OK, a horrible place, but a
> > *likely* place).
> 
> Somewhere around 2.6.29-30 various things went funny in the keyboard
> layer for me - notably characters "bleeding" across console switches.

Possibly not even related... I was seeing a similar problem in .26 or so and
it turned out to be caused, IIRC, by SCIM (in Gnome) and the KDE "Input
Methods" system. However, my problem might not be related, because it even
stopped Alt-SysRq-K from working, and it only happened if I was holding down
a "modifier key" for longer than average (say 30 seconds). I never reported
this because soon afterwards I got tired of that distro (I was giving it a try
at the urging of friends) and replaced it.

The problem has not re-occurred since.

DRH

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13  2:54           ` Frédéric L. W. Meunier
  0 siblings, 0 replies; 248+ messages in thread
From: Frédéric L. W. Meunier @ 2009-10-13  2:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, OGAWA Hirofumi

On Mon, 12 Oct 2009, Linus Torvalds wrote:

> On Mon, 12 Oct 2009, Nix wrote:
>> On 12 Oct 2009, Justin P. Mattock uttered the following:
>>
>>> Not sure where this stands. Right now all three machines I have seem to be having no issues with the keyboard
>>> (xserver 1.6.*). I can go and build the latest xserver (1.7) to see if I hit something.
>> [...]
>>>> Bug-Entry    : http://bugzilla.kernel.org/show_bug.cgi?id=14388
>>>> Subject        : keyboard under X with 2.6.31
>>>> Submitter    : Frédéric L. W. Meunier <fredlwm@gmail.com>
>>>> Date        : 2009-10-07 20:19 (5 days old)
>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
>>>> References    : http://marc.info/?l=linux-kernel&m=125494753228217&w=4
>>
>> I have been seeing problems precisely like those described (sometimes
>> the keyboard dies, sometimes it gets 'stuck' with a key held down, until
>> I switch TTYs, which generally means killing X as I'm not aware of an
>> easy way to switch VTs using only the mouse), since I moved to 2.6.31
>
> The particular commit that was bisected to should really not matter for X,
> except perhaps from a timing standpoint.
>
> The problem it fixed was in pty's, and X doesn't use them much if at all
> (various X _programs_ may, of course, but the symptoms don't sound like
> it's just a particular X app that has issues, but more of a generic X
> keyboard handling thing)
>
> But for non-pty's, there should be no semantic changes from that commit
> outside of some general tty timing differences by doing that
> tty_flush_to_ldisc() at new points.
>
> I could fairly easily imagine that some timing difference does expose
> another longer-standing problem in either the kernel or X itself. So the
> bisection isn't necessarily wrong, it's just not likely telling us what
> the real problem is.
>
> Of course, maybe there is some race condition in the tty_buffer.c code. We
> _used_ to not call flush_to_ldisc() except through the workqueue code, so
> races would not be seen in normal circumstances. Now flush_to_ldisc()
> could easily get called synchronously from tty_read()/tty_poll() while
> also being hit from the workqueues.
>
> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c,
> perhaps?

Just a note. With me, all the keyboard problems happened while I 
was under X, but doing something in a terminal running screen. 
Reverting the commit stopped the problem.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  0:16       ` Linus Torvalds
  2009-10-13  2:54           ` Frédéric L. W. Meunier
@ 2009-10-13  3:24         ` Linus Torvalds
  2009-10-13  3:43           ` Justin P. Mattock
                             ` (2 more replies)
  1 sibling, 3 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13  3:24 UTC (permalink / raw)
  To: Nix, Alan Cox, Paul Fulghum
  Cc: Justin P. Mattock, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Dmitry Torokhov, Ed Tomlinson,
	Frédéric L. W. Meunier, OGAWA Hirofumi


[ Alan, Paulkf - the tty buffering and locking is originally your code, 
  although from about three years ago, when it used to be in tty_io.c..
  Any comment? ]

On Mon, 12 Oct 2009, Linus Torvalds wrote:
> 
> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c, 
> perhaps?

Hmm. I see one, at least.

The "tty_insert_flip_string()" locking seems totally bogus. 

It does that "tty_buffer_request_room()" call and subsequent copying with 
no locking at all - sure, the tty_buffer_request_room() function itself 
locks the buffers, but then unlocks it when returning, so when we actually 
do the memcpy() etc, we can race with anybody.

I don't really see who would care, but it does look totally broken.

I dunno, this patch seems to make sense to me. Am I missing something?

[ NOTE! The patch is totally untested. It compiled for me on x86-64, and
  apart from that I'm just going to say that it looks obvious, and the old 
  code looks obviously buggy. Also, any remaining users of

	tty_prepare_flip_string
	tty_prepare_flip_string_flags

  are still fundamentally broken and buggy, while users of

	tty_buffer_request_room

  are pretty damn odd and suspect (but a lot of them seem to be just 
  pointless: they then call tty_insert_flip_string(), which means that the 
  tty_buffer_request_room() call was totally redundant) ]

Comments? Does this work? Does it make any difference? It seems fairly 
unlikely, but it's the only obvious problem I've seen in the tty buffering 
code so far.

And that code is literally 3 years old, and it seems unlikely that a 
regular _keyboard_ buffer would be able to hit the (rather small) race 
condition. But other serialization may have hidden it, and timing 
differences could certainly have caused it to trigger much more easily.

			Linus

---
 drivers/char/tty_buffer.c |   33 +++++++++++++++++++++++++--------
 1 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 3108991..25ab538 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -196,13 +196,10 @@ static struct tty_buffer *tty_buffer_find(struct tty_struct *tty, size_t size)
  *
  *	Locking: Takes tty->buf.lock
  */
-int tty_buffer_request_room(struct tty_struct *tty, size_t size)
+static int locked_tty_buffer_request_room(struct tty_struct *tty, size_t size)
 {
 	struct tty_buffer *b, *n;
 	int left;
-	unsigned long flags;
-
-	spin_lock_irqsave(&tty->buf.lock, flags);
 
 	/* OPTIMISATION: We could keep a per tty "zero" sized buffer to
 	   remove this conditional if its worth it. This would be invisible
@@ -225,9 +222,20 @@ int tty_buffer_request_room(struct tty_struct *tty, size_t size)
 			size = left;
 	}
 
-	spin_unlock_irqrestore(&tty->buf.lock, flags);
 	return size;
 }
+
+int tty_buffer_request_room(struct tty_struct *tty, size_t size)
+{
+	int retval;
+	unsigned long flags;
+
+	spin_lock_irqsave(&tty->buf.lock, flags);
+	retval = locked_tty_buffer_request_room(tty, size);
+	spin_unlock_irqrestore(&tty->buf.lock, flags);
+	return retval;
+}
+
 EXPORT_SYMBOL_GPL(tty_buffer_request_room);
 
 /**
@@ -239,16 +247,20 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
  *	Queue a series of bytes to the tty buffering. All the characters
  *	passed are marked as without error. Returns the number added.
  *
- *	Locking: Called functions may take tty->buf.lock
+ *	Locking: We take tty->buf.lock
  */
 
 int tty_insert_flip_string(struct tty_struct *tty, const unsigned char *chars,
 				size_t size)
 {
 	int copied = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&tty->buf.lock, flags);
 	do {
-		int space = tty_buffer_request_room(tty, size - copied);
+		int space = locked_tty_buffer_request_room(tty, size - copied);
 		struct tty_buffer *tb = tty->buf.tail;
+
 		/* If there is no space then tb may be NULL */
 		if (unlikely(space == 0))
 			break;
@@ -260,6 +272,7 @@ int tty_insert_flip_string(struct tty_struct *tty, const unsigned char *chars,
 		/* There is a small chance that we need to split the data over
 		   several buffers. If this is the case we must loop */
 	} while (unlikely(size > copied));
+	spin_unlock_irqrestore(&tty->buf.lock, flags);
 	return copied;
 }
 EXPORT_SYMBOL(tty_insert_flip_string);
@@ -282,8 +295,11 @@ int tty_insert_flip_string_flags(struct tty_struct *tty,
 		const unsigned char *chars, const char *flags, size_t size)
 {
 	int copied = 0;
+	unsigned long irqflags;
+
+	spin_lock_irqsave(&tty->buf.lock, irqflags);
 	do {
-		int space = tty_buffer_request_room(tty, size - copied);
+		int space = locked_tty_buffer_request_room(tty, size - copied);
 		struct tty_buffer *tb = tty->buf.tail;
 		/* If there is no space then tb may be NULL */
 		if (unlikely(space == 0))
@@ -297,6 +313,7 @@ int tty_insert_flip_string_flags(struct tty_struct *tty,
 		/* There is a small chance that we need to split the data over
 		   several buffers. If this is the case we must loop */
 	} while (unlikely(size > copied));
+	spin_unlock_irqrestore(&tty->buf.lock, irqflags);
 	return copied;
 }
 EXPORT_SYMBOL(tty_insert_flip_string_flags);
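
For anyone following along outside the kernel tree, here is a minimal
userspace sketch of the pattern the patch above applies: a helper that
assumes the lock is already held, a public wrapper that takes the lock
around it, and an insert routine that keeps the lock across both the room
check and the memcpy(). Everything in it (struct demo_buf, DEMO_BUF_SIZE,
the demo_* function names) is hypothetical and made up for illustration,
and it uses a pthread mutex and a single fixed array where the kernel code
uses spin_lock_irqsave() and a chain of tty buffers. It is not the kernel
implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 4096

/* Hypothetical stand-in for the tty flip buffer, not the kernel struct. */
struct demo_buf {
	pthread_mutex_t lock;
	size_t used;
	unsigned char data[DEMO_BUF_SIZE];
};

static void demo_buf_init(struct demo_buf *b)
{
	pthread_mutex_init(&b->lock, NULL);
	b->used = 0;
}

/* Caller must hold b->lock. Returns how many of 'size' bytes still fit. */
static size_t locked_demo_request_room(struct demo_buf *b, size_t size)
{
	size_t left = DEMO_BUF_SIZE - b->used;

	return size < left ? size : left;
}

/*
 * Public variant: takes the lock itself. The answer can be stale as soon
 * as the lock is dropped, so it must not be trusted for a later copy.
 */
static size_t demo_request_room(struct demo_buf *b, size_t size)
{
	size_t ret;

	pthread_mutex_lock(&b->lock);
	ret = locked_demo_request_room(b, size);
	pthread_mutex_unlock(&b->lock);
	return ret;
}

/* Room check and copy happen under one lock hold, so nobody can race. */
static size_t demo_insert_string(struct demo_buf *b,
				 const unsigned char *chars, size_t size)
{
	size_t space;

	pthread_mutex_lock(&b->lock);
	space = locked_demo_request_room(b, size);
	memcpy(b->data + b->used, chars, space);
	b->used += space;
	assert(b->used <= DEMO_BUF_SIZE);
	pthread_mutex_unlock(&b->lock);
	return space;
}
```

A caller would run demo_buf_init() once and then call demo_insert_string()
from any number of threads; the return value is the number of bytes
actually stored, which may be less than requested when the buffer is full.
The unlocked pre-patch shape corresponds to calling demo_request_room()
and then doing the memcpy() afterwards without the lock.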

^ permalink raw reply related	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  3:24         ` Linus Torvalds
@ 2009-10-13  3:43           ` Justin P. Mattock
  2009-10-13  7:13               ` Frédéric L. W. Meunier
  2009-10-13 10:34             ` Alan Cox
  2009-10-13 10:32           ` Alan Cox
  2009-10-17 16:40             ` Pavel Machek
  2 siblings, 2 replies; 248+ messages in thread
From: Justin P. Mattock @ 2009-10-13  3:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nix, Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,
	"Frédéric L. W. Meunier",
	OGAWA Hirofumi

Linus Torvalds wrote:
> [ Alan, Paulkf - the tty buffering and locking is originally your code,
>    although from about three years ago, when it used to be in tty_io.c..
>    Any comment? ]
>
> On Mon, 12 Oct 2009, Linus Torvalds wrote:
>    
>> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c,
>> perhaps?
>>      
>
> Hmm. I see one, at least.
>
> The "tty_insert_flip_string()" locking seems totally bogus.
>
> It does that "tty_buffer_request_room()" call and subsequent copying with
> no locking at all - sure, the tty_buffer_request_room() function itself
> locks the buffers, but then unlocks it when returning, so when we actually
> do the memcpy() etc, we can race with anybody.
>
> I don't really see who would care, but it does look totally broken.
>
> I dunno, this patch seems to make sense to me. Am I missing something?
>
> [ NOTE! The patch is totally untested. It compiled for me on x86-64, and
>    apart from that I'm just going to say that it looks obvious, and the old
>    code looks obviously buggy. Also, any remaining users of
>
> 	tty_prepare_flip_string
> 	tty_prepare_flip_string_flags
>
>    are still fundamentally broken and buggy, while users of
>
> 	tty_buffer_request_room
>
>    are pretty damn odd and suspect (but a lot of them seem to be just
>    pointless: they then call tty_insert_flip_string(), which means that the
>    tty_buffer_request_room() call was totally redundant) ]
>
> Comments? Does this work? Does it make any difference? It seems fairly
> unlikely, but it's the only obvious problem I've seen in the tty buffering
> code so far.
>
> And that code is literally 3 years old, and it seems unlikely that a
> regular _keyboard_ buffer would be able to hit the (rather small) race
> condition. But other serialization may have hidden it, and timing
> differences could certainly have caused it to trigger much more easily.
>
> 			Linus
>
> ---
>   drivers/char/tty_buffer.c |   33 +++++++++++++++++++++++++--------
>   1 files changed, 25 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
> index 3108991..25ab538 100644
> --- a/drivers/char/tty_buffer.c
> +++ b/drivers/char/tty_buffer.c
> @@ -196,13 +196,10 @@ static struct tty_buffer *tty_buffer_find(struct tty_struct *tty, size_t size)
>    *
>    *	Locking: Takes tty->buf.lock
>    */
> -int tty_buffer_request_room(struct tty_struct *tty, size_t size)
> +static int locked_tty_buffer_request_room(struct tty_struct *tty, size_t size)
>   {
>   	struct tty_buffer *b, *n;
>   	int left;
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&tty->buf.lock, flags);
>
>   	/* OPTIMISATION: We could keep a per tty "zero" sized buffer to
>   	   remove this conditional if its worth it. This would be invisible
> @@ -225,9 +222,20 @@ int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>   			size = left;
>   	}
>
> -	spin_unlock_irqrestore(&tty->buf.lock, flags);
>   	return size;
>   }
> +
> +int tty_buffer_request_room(struct tty_struct *tty, size_t size)
> +{
> +	int retval;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&tty->buf.lock, flags);
> +	retval = locked_tty_buffer_request_room(tty, size);
> +	spin_unlock_irqrestore(&tty->buf.lock, flags);
> +	return retval;
> +}
> +
>   EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>
>   /**
> @@ -239,16 +247,20 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>    *	Queue a series of bytes to the tty buffering. All the characters
>    *	passed are marked as without error. Returns the number added.
>    *
> - *	Locking: Called functions may take tty->buf.lock
> + *	Locking: We take tty->buf.lock
>    */
>
>   int tty_insert_flip_string(struct tty_struct *tty, const unsigned char *chars,
>   				size_t size)
>   {
>   	int copied = 0;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&tty->buf.lock, flags);
>   	do {
> -		int space = tty_buffer_request_room(tty, size - copied);
> +		int space = locked_tty_buffer_request_room(tty, size - copied);
>   		struct tty_buffer *tb = tty->buf.tail;
> +
>   		/* If there is no space then tb may be NULL */
>   		if (unlikely(space == 0))
>   			break;
> @@ -260,6 +272,7 @@ int tty_insert_flip_string(struct tty_struct *tty, const unsigned char *chars,
>   		/* There is a small chance that we need to split the data over
>   		   several buffers. If this is the case we must loop */
>   	} while (unlikely(size > copied));
> +	spin_unlock_irqrestore(&tty->buf.lock, flags);
>   	return copied;
>   }
>   EXPORT_SYMBOL(tty_insert_flip_string);
> @@ -282,8 +295,11 @@ int tty_insert_flip_string_flags(struct tty_struct *tty,
>   		const unsigned char *chars, const char *flags, size_t size)
>   {
>   	int copied = 0;
> +	unsigned long irqflags;
> +
> +	spin_lock_irqsave(&tty->buf.lock, irqflags);
>   	do {
> -		int space = tty_buffer_request_room(tty, size - copied);
> +		int space = locked_tty_buffer_request_room(tty, size - copied);
>   		struct tty_buffer *tb = tty->buf.tail;
>   		/* If there is no space then tb may be NULL */
>   		if (unlikely(space == 0))
> @@ -297,6 +313,7 @@ int tty_insert_flip_string_flags(struct tty_struct *tty,
>   		/* There is a small chance that we need to split the data over
>   		   several buffers. If this is the case we must loop */
>   	} while (unlikely(size > copied));
> +	spin_unlock_irqrestore(&tty->buf.lock, irqflags);
>   	return copied;
>   }
>   EXPORT_SYMBOL(tty_insert_flip_string_flags);
>
>    
I can throw your patch in over here for the heck of it.
The results would be more useful from somebody who is really hitting this
bug, if this is the area that is causing it. (From here the only issue I'm
seeing is history commands spinning in the terminal from time to time,
nothing like the unusable keys others are reporting.)

Justin P. Mattock

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13  7:13               ` Frédéric L. W. Meunier
  0 siblings, 0 replies; 248+ messages in thread
From: Frédéric L. W. Meunier @ 2009-10-13  7:13 UTC (permalink / raw)
  To: Justin P. Mattock
  Cc: Linus Torvalds, Nix, Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, OGAWA Hirofumi

On Mon, 12 Oct 2009, Justin P. Mattock wrote:

> Linus Torvalds wrote:
>> [ Alan, Paulkf - the tty buffering and locking is originally your code,
>>    although from about three years ago, when it used to be in tty_io.c..
>>    Any comment? ]
>> 
>> On Mon, 12 Oct 2009, Linus Torvalds wrote:
>> 
>>> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c,
>>> perhaps?
>>> 
>> 
>> Hmm. I see one, at least.
>> 
>> The "tty_insert_flip_string()" locking seems totally bogus.
>> 
>> It does that "tty_buffer_request_room()" call and subsequent copying with
>> no locking at all - sure, the tty_buffer_request_room() function itself
>> locks the buffers, but then unlocks it when returning, so when we actually
>> do the memcpy() etc, we can race with anybody.
>> 
>> I don't really see who would care, but it does look totally broken.
>> 
>> I dunno, this patch seems to make sense to me. Am I missing something?
>> 
>> [ NOTE! The patch is totally untested. It compiled for me on x86-64, and
>>    apart from that I'm just going to say that it looks obvious, and the old
>>    code looks obviously buggy. Also, any remaining users of
>>
>> 	tty_prepare_flip_string
>> 	tty_prepare_flip_string_flags
>>
>>    are still fundamentally broken and buggy, while users of
>>
>> 	tty_buffer_request_room
>>
>>    are pretty damn odd and suspect (but a lot of them seem to be just
>>    pointless: they then call tty_insert_flip_string(), which means that the
>>    tty_buffer_request_room() call was totally redundant) ]
>> 
>> Comments? Does this work? Does it make any difference? It seems fairly
>> unlikely, but it's the only obvious problem I've seen in the tty buffering
>> code so far.
>> 
>> And that code is literally 3 years old, and it seems unlikely that a
>> regular _keyboard_ buffer would be able to hit the (rather small) race
>> condition. But other serialization may have hidden it, and timing
>> differences could certainly have caused it to trigger much more easily.
>>
>> 			Linus
>> 
>> ---
>>   drivers/char/tty_buffer.c |   33 +++++++++++++++++++++++++--------
>>   1 files changed, 25 insertions(+), 8 deletions(-)
>> 
>> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
>> index 3108991..25ab538 100644
>> --- a/drivers/char/tty_buffer.c
>> +++ b/drivers/char/tty_buffer.c
>> @@ -196,13 +196,10 @@ static struct tty_buffer *tty_buffer_find(struct 
>> tty_struct *tty, size_t size)
>>    *
>>    *	Locking: Takes tty->buf.lock
>>    */
>> -int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>> +static int locked_tty_buffer_request_room(struct tty_struct *tty, size_t 
>> size)
>>   {
>>   	struct tty_buffer *b, *n;
>>   	int left;
>> -	unsigned long flags;
>> -
>> -	spin_lock_irqsave(&tty->buf.lock, flags);
>>
>>   	/* OPTIMISATION: We could keep a per tty "zero" sized buffer to
>>   	   remove this conditional if its worth it. This would be invisible
>> @@ -225,9 +222,20 @@ int tty_buffer_request_room(struct tty_struct *tty, 
>> size_t size)
>>   			size = left;
>>   	}
>> 
>> -	spin_unlock_irqrestore(&tty->buf.lock, flags);
>>   	return size;
>>   }
>> +
>> +int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>> +{
>> +	int retval;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&tty->buf.lock, flags);
>> +	retval = locked_tty_buffer_request_room(tty, size);
>> +	spin_unlock_irqrestore(&tty->buf.lock, flags);
>> +	return retval;
>> +}
>> +
>>   EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>
>>   /**
>> @@ -239,16 +247,20 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>    *	Queue a series of bytes to the tty buffering. All the characters
>>    *	passed are marked as without error. Returns the number added.
>>    *
>> - *	Locking: Called functions may take tty->buf.lock
>> + *	Locking: We take tty->buf.lock
>>    */
>>
>>   int tty_insert_flip_string(struct tty_struct *tty, const unsigned char 
>> *chars,
>>   				size_t size)
>>   {
>>   	int copied = 0;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&tty->buf.lock, flags);
>>   	do {
>> -		int space = tty_buffer_request_room(tty, size - copied);
>> +		int space = locked_tty_buffer_request_room(tty, size - 
>> copied);
>>   		struct tty_buffer *tb = tty->buf.tail;
>> +
>>   		/* If there is no space then tb may be NULL */
>>   		if (unlikely(space == 0))
>>   			break;
>> @@ -260,6 +272,7 @@ int tty_insert_flip_string(struct tty_struct *tty, 
>> const unsigned char *chars,
>>   		/* There is a small chance that we need to split the data 
>> over
>>   		   several buffers. If this is the case we must loop */
>>   	} while (unlikely(size > copied));
>> +	spin_unlock_irqrestore(&tty->buf.lock, flags);
>>   	return copied;
>>   }
>>   EXPORT_SYMBOL(tty_insert_flip_string);
>> @@ -282,8 +295,11 @@ int tty_insert_flip_string_flags(struct tty_struct 
>> *tty,
>>   		const unsigned char *chars, const char *flags, size_t size)
>>   {
>>   	int copied = 0;
>> +	unsigned long irqflags;
>> +
>> +	spin_lock_irqsave(&tty->buf.lock, irqflags);
>>   	do {
>> -		int space = tty_buffer_request_room(tty, size - copied);
>> +		int space = locked_tty_buffer_request_room(tty, size - 
>> copied);
>>   		struct tty_buffer *tb = tty->buf.tail;
>>   		/* If there is no space then tb may be NULL */
>>   		if (unlikely(space == 0))
>> @@ -297,6 +313,7 @@ int tty_insert_flip_string_flags(struct tty_struct 
>> *tty,
>>   		/* There is a small chance that we need to split the data 
>> over
>>   		   several buffers. If this is the case we must loop */
>>   	} while (unlikely(size > copied));
>> +	spin_unlock_irqrestore(&tty->buf.lock, irqflags);
>>   	return copied;
>>   }
>>   EXPORT_SYMBOL(tty_insert_flip_string_flags);
>>
>> 
> I can throw your patch in over here for the heck of it.
> The results would be more useful from somebody who is really hitting this
> bug, if this is the area that is causing it. (From here the only issue I'm
> seeing is history commands spinning in the terminal from time to time,
> nothing like the unusable keys others are reporting.)

I tested it on top of 2.6.31.4 (after putting back 
e043e42bdb66885b3ac10d27a01ccb9972e2b0a3), and the keyboard is 
fine after almost 3h. Before that, the problems would appear in 
less than 1h. Maybe I spoke too soon, but...

Boyan, does it work for you?

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  7:13               ` Frédéric L. W. Meunier
  (?)
@ 2009-10-13  8:19               ` Boyan
  2009-10-13  9:17                 ` Dmitry Torokhov
                                   ` (2 more replies)
  -1 siblings, 3 replies; 248+ messages in thread
From: Boyan @ 2009-10-13  8:19 UTC (permalink / raw)
  To: "Frédéric L. W. Meunier"
  Cc: Justin P. Mattock, Linus Torvalds, Nix, Alan Cox, Paul Fulghum,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi

Frédéric L. W. Meunier wrote:
> On Mon, 12 Oct 2009, Justin P. Mattock wrote:
> 
>> Linus Torvalds wrote:
>>> [ Alan, Paulkf - the tty buffering and locking is originally your code,
>>>    although from about three years ago, when it used to be in tty_io.c..
>>>    Any comment? ]
>>>
>>> On Mon, 12 Oct 2009, Linus Torvalds wrote:
>>>
>>>> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c,
>>>> perhaps?
>>>>
>>>
>>> Hmm. I see one, at least.
>>>
>>> The "tty_insert_flip_string()" locking seems totally bogus.
>>>
>>> It does that "tty_buffer_request_room()" call and subsequent copying 
>>> with
>>> no locking at all - sure, the tty_buffer_request_room() function itself
>>> locks the buffers, but then unlocks it when returning, so when we 
>>> actually
>>> do the memcpy() etc, we can race with anybody.
>>>
>>> I don't really see who would care, but it does look totally broken.
>>>
>>> I dunno, this patch seems to make sense to me. Am I missing something?
>>>
>>> [ NOTE! The patch is totally untested. It compiled for me on x86-64, and
>>>    apart from that I'm just going to say that it looks obvious, and 
>>> the old
>>>    code looks obviously buggy. Also, any remaining users of
>>>
>>>     tty_prepare_flip_string
>>>     tty_prepare_flip_string_flags
>>>
>>>    are still fundamentally broken and buggy, while users of
>>>
>>>     tty_buffer_request_room
>>>
>>>    are pretty damn odd and suspect (but a lot of them seem to be just
>>>    pointless: they then call tty_insert_flip_string(), which means 
>>> that the
>>>    tty_buffer_request_room() call was totally redundant ]
>>>
>>> Comments? Does this work? Does it make any difference? It seems fairly
>>> unlikely, but it's the only obvious problem I've seen in the tty 
>>> buffering
>>> code so far.
>>>
>>> And that code is literally 3 years old, and it seems unlikely that a
>>> regular _keyboard_ buffer would be able to hit the (rather small) race
>>> condition. But other serialization may have hidden it, and timing
>>> differences could certainly have caused it to trigger much more easily.
>>>
>>>             Linus
>>>
>>> ---
>>>   drivers/char/tty_buffer.c |   33 +++++++++++++++++++++++++--------
>>>   1 files changed, 25 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
>>> index 3108991..25ab538 100644
>>> --- a/drivers/char/tty_buffer.c
>>> +++ b/drivers/char/tty_buffer.c
>>> @@ -196,13 +196,10 @@ static struct tty_buffer 
>>> *tty_buffer_find(struct tty_struct *tty, size_t size)
>>>    *
>>>    *    Locking: Takes tty->buf.lock
>>>    */
>>> -int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>>> +static int locked_tty_buffer_request_room(struct tty_struct *tty, 
>>> size_t size)
>>>   {
>>>       struct tty_buffer *b, *n;
>>>       int left;
>>> -    unsigned long flags;
>>> -
>>> -    spin_lock_irqsave(&tty->buf.lock, flags);
>>>
>>>       /* OPTIMISATION: We could keep a per tty "zero" sized buffer to
>>>          remove this conditional if its worth it. This would be 
>>> invisible
>>> @@ -225,9 +222,20 @@ int tty_buffer_request_room(struct tty_struct 
>>> *tty, size_t size)
>>>               size = left;
>>>       }
>>>
>>> -    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>       return size;
>>>   }
>>> +
>>> +int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>>> +{
>>> +    int retval;
>>> +    unsigned long flags;
>>> +
>>> +    spin_lock_irqsave(&tty->buf.lock, flags);
>>> +    retval = locked_tty_buffer_request_room(tty, size);
>>> +    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>> +    return retval;
>>> +}
>>> +
>>>   EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>>
>>>   /**
>>> @@ -239,16 +247,20 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>>    *    Queue a series of bytes to the tty buffering. All the characters
>>>    *    passed are marked as without error. Returns the number added.
>>>    *
>>> - *    Locking: Called functions may take tty->buf.lock
>>> + *    Locking: We take tty->buf.lock
>>>    */
>>>
>>>   int tty_insert_flip_string(struct tty_struct *tty, const unsigned 
>>> char *chars,
>>>                   size_t size)
>>>   {
>>>       int copied = 0;
>>> +    unsigned long flags;
>>> +
>>> +    spin_lock_irqsave(&tty->buf.lock, flags);
>>>       do {
>>> -        int space = tty_buffer_request_room(tty, size - copied);
>>> +        int space = locked_tty_buffer_request_room(tty, size - copied);
>>>           struct tty_buffer *tb = tty->buf.tail;
>>> +
>>>           /* If there is no space then tb may be NULL */
>>>           if (unlikely(space == 0))
>>>               break;
>>> @@ -260,6 +272,7 @@ int tty_insert_flip_string(struct tty_struct 
>>> *tty, const unsigned char *chars,
>>>           /* There is a small chance that we need to split the data over
>>>              several buffers. If this is the case we must loop */
>>>       } while (unlikely(size>  copied));
>>> +    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>       return copied;
>>>   }
>>>   EXPORT_SYMBOL(tty_insert_flip_string);
>>> @@ -282,8 +295,11 @@ int tty_insert_flip_string_flags(struct 
>>> tty_struct *tty,
>>>           const unsigned char *chars, const char *flags, size_t size)
>>>   {
>>>       int copied = 0;
>>> +    unsigned long irqflags;
>>> +
>>> +    spin_lock_irqsave(&tty->buf.lock, irqflags);
>>>       do {
>>> -        int space = tty_buffer_request_room(tty, size - copied);
>>> +        int space = locked_tty_buffer_request_room(tty, size - copied);
>>>           struct tty_buffer *tb = tty->buf.tail;
>>>           /* If there is no space then tb may be NULL */
>>>           if (unlikely(space == 0))
>>> @@ -297,6 +313,7 @@ int tty_insert_flip_string_flags(struct 
>>> tty_struct *tty,
>>>           /* There is a small chance that we need to split the data over
>>>              several buffers. If this is the case we must loop */
>>>       } while (unlikely(size>  copied));
>>> +    spin_unlock_irqrestore(&tty->buf.lock, irqflags);
>>>       return copied;
>>>   }
>>>   EXPORT_SYMBOL(tty_insert_flip_string_flags);
>>>
>>>
>> I can throw your patch in over here for the heck of it.
>> If there's somebody who's really hitting this bug
>> then the results would be better  if this is the area that causing
>> this bug.(from here the only issue I'm seeing is spinning
>> history commands in the terminal  from time to time,
>> nothing of any unusable keys like others are reporting).
> 
> I tested it on top of 2.6.31.4 (after putting back 
> e043e42bdb66885b3ac10d27a01ccb9972e2b0a3), and the keyboard is fine 
> after almost 3h. Before that, the problems would appear in less than 1h. 
> Maybe I spoke too soon, but...
> 
> Boyan, does it work for you ?
> 

I've just tested it on top of 2.6.31.3 and it doesn't work. As I
mentioned in a previous email, I can usually trigger the problem easily
by viewing pictures with gthumb: this is a combination of CPU-intensive
operations and keyboard usage, and when a fix doesn't work it takes me
no more than a minute to trigger the problem.

I thought the problem might be triggered more easily because of the
newer X (1.6.4 RC) in Fedora, which is slower on my ATI Radeon cards,
but now I'm on the older version 1.6.1.901, which is fine in speed, so
the version of X doesn't matter.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #13943] WARNING: at net/mac80211/mlme.c:2292 with ath5k
@ 2009-10-13  8:46         ` Fabio Comolli
  0 siblings, 0 replies; 248+ messages in thread
From: Fabio Comolli @ 2009-10-13  8:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Luis R. Rodriguez

Hi Rafael.

On Mon, Oct 12, 2009 at 11:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday 12 October 2009, Fabio Comolli wrote:
>> Actually I switched to -32rc so I can't test it anymore... Sorry. I
>> can confirm it for 2.6.31.1.
>
> Does it happen in -32-rc?

Please have a look at my other bug report which applies to 32rc (#14372).
The result is the same (wifi down after resume) but the warning does
not show up.

The "cure" is also the same (rfkill-off followed by rfkill-on).

>
> Rafael
>

Regards,
Fabio

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  8:19               ` Boyan
@ 2009-10-13  9:17                 ` Dmitry Torokhov
  2009-10-13 14:33                   ` Frédéric L. W. Meunier
  2009-10-13 15:05                 ` Linus Torvalds
  2 siblings, 0 replies; 248+ messages in thread
From: Dmitry Torokhov @ 2009-10-13  9:17 UTC (permalink / raw)
  To: Boyan
  Cc: "Frédéric L. W. Meunier",
	Justin P. Mattock, Linus Torvalds, Nix, Alan Cox, Paul Fulghum,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Ed Tomlinson, OGAWA Hirofumi

On Tue, Oct 13, 2009 at 11:19:05AM +0300, Boyan wrote:
> Frédéric L. W. Meunier wrote:
>> On Mon, 12 Oct 2009, Justin P. Mattock wrote:
>>
>>> Linus Torvalds wrote:
>>>> [ Alan, Paulkf - the tty buffering and locking is originally your code,
>>>>    although from about three years ago, when it used to be in tty_io.c..
>>>>    Any comment? ]
>>>>
>>>> On Mon, 12 Oct 2009, Linus Torvalds wrote:
>>>>
>>>>> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c,
>>>>> perhaps?
>>>>>
>>>>
>>>> Hmm. I see one, at least.
>>>>
>>>> The "tty_insert_flip_string()" locking seems totally bogus.
>>>>
>>>> It does that "tty_buffer_request_room()" call and subsequent 
>>>> copying with
>>>> no locking at all - sure, the tty_buffer_request_room() function itself
>>>> locks the buffers, but then unlocks it when returning, so when we  
>>>> actually
>>>> do the memcpy() etc, we can race with anybody.
>>>>
>>>> I don't really see who would care, but it does look totally broken.
>>>>
>>>> I dunno, this patch seems to make sense to me. Am I missing something?
>>>>
>>>> [ NOTE! The patch is totally untested. It compiled for me on x86-64, and
>>>>    apart from that I'm just going to say that it looks obvious, and 
>>>> the old
>>>>    code looks obviously buggy. Also, any remaining users of
>>>>
>>>>     tty_prepare_flip_string
>>>>     tty_prepare_flip_string_flags
>>>>
>>>>    are still fundamentally broken and buggy, while users of
>>>>
>>>>     tty_buffer_request_room
>>>>
>>>>    are pretty damn odd and suspect (but a lot of them seem to be just
>>>>    pointless: they then call tty_insert_flip_string(), which means  
>>>> that the
>>>>    tty_buffer_request_room() call was totally redundant ]
>>>>
>>>> Comments? Does this work? Does it make any difference? It seems fairly
>>>> unlikely, but it's the only obvious problem I've seen in the tty  
>>>> buffering
>>>> code so far.
>>>>
>>>> And that code is literally 3 years old, and it seems unlikely that a
>>>> regular _keyboard_ buffer would be able to hit the (rather small) race
>>>> condition. But other serialization may have hidden it, and timing
>>>> differences could certainly have caused it to trigger much more easily.
>>>>
>>>>             Linus
>>>>
>>>> ---
>>>>   drivers/char/tty_buffer.c |   33 +++++++++++++++++++++++++--------
>>>>   1 files changed, 25 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
>>>> index 3108991..25ab538 100644
>>>> --- a/drivers/char/tty_buffer.c
>>>> +++ b/drivers/char/tty_buffer.c
>>>> @@ -196,13 +196,10 @@ static struct tty_buffer  
>>>> *tty_buffer_find(struct tty_struct *tty, size_t size)
>>>>    *
>>>>    *    Locking: Takes tty->buf.lock
>>>>    */
>>>> -int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>>>> +static int locked_tty_buffer_request_room(struct tty_struct *tty,  
>>>> size_t size)
>>>>   {
>>>>       struct tty_buffer *b, *n;
>>>>       int left;
>>>> -    unsigned long flags;
>>>> -
>>>> -    spin_lock_irqsave(&tty->buf.lock, flags);
>>>>
>>>>       /* OPTIMISATION: We could keep a per tty "zero" sized buffer to
>>>>          remove this conditional if its worth it. This would be  
>>>> invisible
>>>> @@ -225,9 +222,20 @@ int tty_buffer_request_room(struct tty_struct  
>>>> *tty, size_t size)
>>>>               size = left;
>>>>       }
>>>>
>>>> -    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>>       return size;
>>>>   }
>>>> +
>>>> +int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>>>> +{
>>>> +    int retval;
>>>> +    unsigned long flags;
>>>> +
>>>> +    spin_lock_irqsave(&tty->buf.lock, flags);
>>>> +    retval = locked_tty_buffer_request_room(tty, size);
>>>> +    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>> +    return retval;
>>>> +}
>>>> +
>>>>   EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>>>
>>>>   /**
>>>> @@ -239,16 +247,20 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>>>    *    Queue a series of bytes to the tty buffering. All the characters
>>>>    *    passed are marked as without error. Returns the number added.
>>>>    *
>>>> - *    Locking: Called functions may take tty->buf.lock
>>>> + *    Locking: We take tty->buf.lock
>>>>    */
>>>>
>>>>   int tty_insert_flip_string(struct tty_struct *tty, const unsigned 
>>>> char *chars,
>>>>                   size_t size)
>>>>   {
>>>>       int copied = 0;
>>>> +    unsigned long flags;
>>>> +
>>>> +    spin_lock_irqsave(&tty->buf.lock, flags);
>>>>       do {
>>>> -        int space = tty_buffer_request_room(tty, size - copied);
>>>> +        int space = locked_tty_buffer_request_room(tty, size - copied);
>>>>           struct tty_buffer *tb = tty->buf.tail;
>>>> +
>>>>           /* If there is no space then tb may be NULL */
>>>>           if (unlikely(space == 0))
>>>>               break;
>>>> @@ -260,6 +272,7 @@ int tty_insert_flip_string(struct tty_struct  
>>>> *tty, const unsigned char *chars,
>>>>           /* There is a small chance that we need to split the data over
>>>>              several buffers. If this is the case we must loop */
>>>>       } while (unlikely(size>  copied));
>>>> +    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>>       return copied;
>>>>   }
>>>>   EXPORT_SYMBOL(tty_insert_flip_string);
>>>> @@ -282,8 +295,11 @@ int tty_insert_flip_string_flags(struct  
>>>> tty_struct *tty,
>>>>           const unsigned char *chars, const char *flags, size_t size)
>>>>   {
>>>>       int copied = 0;
>>>> +    unsigned long irqflags;
>>>> +
>>>> +    spin_lock_irqsave(&tty->buf.lock, irqflags);
>>>>       do {
>>>> -        int space = tty_buffer_request_room(tty, size - copied);
>>>> +        int space = locked_tty_buffer_request_room(tty, size - copied);
>>>>           struct tty_buffer *tb = tty->buf.tail;
>>>>           /* If there is no space then tb may be NULL */
>>>>           if (unlikely(space == 0))
>>>> @@ -297,6 +313,7 @@ int tty_insert_flip_string_flags(struct  
>>>> tty_struct *tty,
>>>>           /* There is a small chance that we need to split the data over
>>>>              several buffers. If this is the case we must loop */
>>>>       } while (unlikely(size>  copied));
>>>> +    spin_unlock_irqrestore(&tty->buf.lock, irqflags);
>>>>       return copied;
>>>>   }
>>>>   EXPORT_SYMBOL(tty_insert_flip_string_flags);
>>>>
>>>>
>>> I can throw your patch in over here for the heck of it.
>>> If there's somebody who's really hitting this bug
>>> then the results would be better  if this is the area that causing
>>> this bug.(from here the only issue I'm seeing is spinning
>>> history commands in the terminal  from time to time,
>>> nothing of any unusable keys like others are reporting).
>>
>> I tested it on top of 2.6.31.4 (after putting back  
>> e043e42bdb66885b3ac10d27a01ccb9972e2b0a3), and the keyboard is fine  
>> after almost 3h. Before that, the problems would appear in less than 
>> 1h. Maybe I spoke too soon, but...
>>
>> Boyan, does it work for you ?
>>
>
> I've just tested it on top of 2.6.31.3 and it doesn't work. As I've
> mentioned in previous email - I usually trigger the problem easily
> watching pictures with gthumb - this is combination of cpu intensive
> operations and keyboard usage and if it doesn't work it takes me no more
> than a minute to trigger the problem.
>
> I thought the problem may be more easily triggered because of the newer
> (1.6.4 RC) in fedora which is slower for my ati radeon cards, but now
> I'm with older version 1.6.1.901 which is fine in speed - so it doesn't
> matter what is the version of X.


Can you reproduce it in the console while loading the CPU?

-- 
Dmitry

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  3:24         ` Linus Torvalds
  2009-10-13  3:43           ` Justin P. Mattock
@ 2009-10-13 10:32           ` Alan Cox
  2009-10-13 13:25             ` Paul Fulghum
  2009-10-13 14:39               ` Linus Torvalds
  2009-10-17 16:40             ` Pavel Machek
  2 siblings, 2 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-13 10:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nix, Paul Fulghum, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi

> It does that "tty_buffer_request_room()" call and subsequent copying with 
> no locking at all - sure, the tty_buffer_request_room() function itself 
> locks the buffers, but then unlocks it when returning, so when we actually 
> do the memcpy() etc, we can race with anybody.

The tty_buffer_request_room() is a hint to help better allocation. It's
also only safe to run from the receiving path of the driver (which
has always been assumed not to make two parallel calls to the function at
once).

There is a simple reason the locking is sufficient. If you can call the
function from two places at once in your serial driver at the same time,
you've scrambled the data order so you've already lost.

So - not a bug - and the lock changes don't actually "fix" any behaviour
either because the ordering must be imposed by the caller.
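
The two locking patterns under discussion can be sketched in userspace C.
This is only an illustration under stated assumptions: struct demo_buf and
all demo_* helpers are hypothetical stand-ins, not the kernel's tty_buffer
code, and a C11 atomic_flag spinlock stands in for tty->buf.lock. The first
insert function drops the lock between the room check and the copy, as the
code did before the patch; the second holds it across both, as the patch
does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 64

/* Hypothetical stand-in for a tty buffer; not the kernel's struct tty_buffer. */
struct demo_buf {
	atomic_flag lock;	/* stands in for tty->buf.lock */
	char data[DEMO_BUF_SIZE];
	size_t used;
};

static void demo_buf_init(struct demo_buf *b)
{
	atomic_flag_clear(&b->lock);
	b->used = 0;
	memset(b->data, 0, sizeof(b->data));
}

static void demo_lock(struct demo_buf *b)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;
}

static void demo_unlock(struct demo_buf *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Caller must hold the lock. Returns how many of `size` bytes fit. */
static size_t demo_request_room_locked(struct demo_buf *b, size_t size)
{
	size_t left;

	assert(b->used <= DEMO_BUF_SIZE);
	left = DEMO_BUF_SIZE - b->used;
	return size < left ? size : left;
}

/* Old pattern: the lock covers only the room check. */
static size_t demo_insert_check_only(struct demo_buf *b, const char *src,
				     size_t size)
{
	size_t space;

	demo_lock(b);
	space = demo_request_room_locked(b, size);
	demo_unlock(b);

	/* Unlocked window: a concurrent caller could change b->used here. */
	memcpy(b->data + b->used, src, space);
	b->used += space;
	return space;
}

/* Patched pattern: the lock is held across the check and the copy. */
static size_t demo_insert_locked(struct demo_buf *b, const char *src,
				 size_t size)
{
	size_t space;

	demo_lock(b);
	space = demo_request_room_locked(b, size);
	memcpy(b->data + b->used, src, space);
	b->used += space;
	demo_unlock(b);
	return space;
}
```

With a single caller both functions behave identically. The extra locking
only matters if two callers insert concurrently, and in that case the data
order is undefined unless the caller serializes its own calls.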

>   pointless: they then call tty_insert_flip_string(), which means that the 
>   tty_buffer_request_room() call was totally redundant ]

It's a performance tweak. With a 3G USB modem or similar device running
at 20 Mbit/s or more, being able to generate one allocation per chunk
received for DMA made a measurable performance difference on some
platforms.
 
Alan

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  3:43           ` Justin P. Mattock
  2009-10-13  7:13               ` Frédéric L. W. Meunier
@ 2009-10-13 10:34             ` Alan Cox
  2009-10-13 15:16                 ` Justin P. Mattock
  1 sibling, 1 reply; 248+ messages in thread
From: Alan Cox @ 2009-10-13 10:34 UTC (permalink / raw)
  To: Justin P. Mattock
  Cc: Linus Torvalds, Nix, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,  Frédéric L. W. Meunier,
	OGAWA Hirofumi

> I can throw your patch in over here for the heck of it.
> If there's somebody who's really hitting this bug
> then the results would be better  if this is the area that causing
> this bug.(from here the only issue I'm seeing is spinning
> history commands in the terminal  from time to time,
> nothing of any unusable keys like others are reporting).

That sounds more like a lost key-up event somewhere higher up the stack.
Is it a USB keyboard? And does it stop if you press the key in question?
Also, does it stop if you touch the mouse wheel (assuming you've got the
mouse wheel bound to shell history somewhere)?

Alan

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 11:00             ` Alan Cox
  0 siblings, 0 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-13 11:00 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Ed Tomlinson, Frédéric L. W. Meunier, Linus Torvalds,
	OGAWA Hirofumi

On Mon, 12 Oct 2009 16:46:41 -0700
Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:

> On Tue, Oct 13, 2009 at 12:38:41AM +0100, Alan Cox wrote:
> > > So it seems likely to me that this is a kernel bug, somewhere, and the
> > > TTY layer seems like a good place to look (OK, a horrible place, but a
> > > *likely* place).
> > 
> > Somewhere around 2.6.29-30 various things went funny in the keyboard
> > layer for me - notably characters "bleeding" across console switches.
> > 
> 
> What do you mean by "bleeding"? Are you sure it is not autorepeat
> kicking in?

Fairly. Just now and then I'll do something like type

"blahblah<alt-f1>"

e.g. when flipping consoles to check something and the last letter or two
ends up on the screen after the flip (as if the alt-f1 vc switch passes
the data somewhere). I suspect it's some kind of asynchronous handling
using the "current console" rather than the "current console at the time
the letter was typed" but it doesn't occur to order so isn't bisectable
and I've never managed to pin down where in the keyboard/vt/tty stack it's
occurring.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14265] ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
  2009-10-12 11:05   ` David Miller
@ 2009-10-13 12:29     ` Karol Lewandowski
  0 siblings, 0 replies; 248+ messages in thread
From: Karol Lewandowski @ 2009-10-13 12:29 UTC (permalink / raw)
  To: David Miller
  Cc: rjw, linux-kernel, kernel-testers, karol.k.lewandowski, mel, netdev

On Mon, Oct 12, 2009 at 04:05:36AM -0700, David Miller wrote:
> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> Date: Mon, 12 Oct 2009 01:01:08 +0200 (CEST)
> 
> [ Netdev CC:'d ]
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
> > Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
> > Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
> > Date		: 2009-09-15 12:05 (27 days old)
> > References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4
> 
> A 128K memory allocation fails after resume, film at 11.
> 
> That e100 driver code has been that way forever, so likely it's
> something in the page allocator or similar that is making this happen
> more likely now.  Perhaps it's related to the iwlagn allocation
> failures being tracked down in another thread.
> 
> It's a shame that pci_alloc_consistent() has to always use GFP_ATOMIC
> for compatibility.
> 
> As far as I can tell, these code paths can sleep.  So maybe the
> following hack would fix this for now.  Could someone test this?

Sadly, this patch doesn't help.  I've tested it on a post-2.6.32-rc4
kernel, and got failures after a few tries.  Frans has been more
successful[1] at tracking this problem down than I've been (I failed
miserably, to be honest).

[1] http://lkml.org/lkml/2009/10/11/247

Thanks.
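
As a worked check of the numbers in the log below: an order:5 request is
2^5 contiguous pages, which is the 128K allocation mentioned above,
assuming 4 kB pages. The sketch below recomputes the zone totals and counts
the free blocks of at least that size. All demo_* names are hypothetical;
the counts are copied from the "DMA:" and "Normal:" free-list lines of the
first failure report.

```c
#include <assert.h>

#define DEMO_PAGE_KB	4UL	/* assumption: 4 kB pages, as in the "1*4kB" column */
#define DEMO_NR_ORDERS	11	/* orders 0..10, i.e. 4kB..4096kB */

/* counts[i] = number of free blocks of order i, from the first report */
static const unsigned long demo_dma[DEMO_NR_ORDERS] =
	{1, 2, 2, 14, 5, 2, 0, 0, 0, 0, 0};
static const unsigned long demo_normal[DEMO_NR_ORDERS] =
	{550, 106, 53, 4, 1, 0, 0, 0, 0, 0, 0};

/* Size in kB of one block of the given buddy order. */
static unsigned long demo_order_kb(unsigned int order)
{
	assert(order < DEMO_NR_ORDERS);
	return DEMO_PAGE_KB << order;
}

/* Total free memory in kB described by one free-list line. */
static unsigned long demo_total_kb(const unsigned long *counts)
{
	unsigned long total = 0;
	unsigned int order;

	for (order = 0; order < DEMO_NR_ORDERS; order++)
		total += counts[order] * demo_order_kb(order);
	return total;
}

/* Number of free blocks big enough to satisfy a request of `order`. */
static unsigned long demo_blocks_at_least(const unsigned long *counts,
					  unsigned int order)
{
	unsigned long n = 0;
	unsigned int o;

	for (o = order; o < DEMO_NR_ORDERS; o++)
		n += counts[o];
	return n;
}
```

For the first report this gives no suitable block in the Normal zone and
two in the DMA zone, although about 4 MB is free in Normal overall, which
suggests fragmentation rather than a plain shortage of memory.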


e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
e100 0000:00:03.0: PCI INT A -> Link[LNKC] -> GSI 9 (level, low) -> IRQ 9
e100 0000:00:03.0: PME# disabled
e100: eth0: e100_probe: addr 0xe8120000, irq 9, MAC addr 00:10:a4:89:e8:84
ifconfig: page allocation failure. order:5, mode:0x80d0
Pid: 4528, comm: ifconfig Tainted: G        W  2.6.32-rc4-00001-gd93a8f8-dirty #2
Call Trace:
 [<c0161034>] ? __alloc_pages_nodemask+0x43e/0x4a8
 [<c0104d7f>] ? dma_generic_alloc_coherent+0x4a/0xa7
 [<c0104d35>] ? dma_generic_alloc_coherent+0x0/0xa7
 [<d0933b68>] ? e100_alloc_cbs+0xc0/0x16d [e100]
 [<d0934be9>] ? e100_up+0x1b/0xf5 [e100]
 [<d0934cda>] ? e100_open+0x17/0x41 [e100]
 [<c0305f11>] ? dev_open+0x8f/0xc5
 [<c03056d0>] ? dev_change_flags+0xa2/0x155
 [<c033c103>] ? devinet_ioctl+0x22a/0x51b
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c02f92be>] ? sock_ioctl+0x1c0/0x1e4
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c01872da>] ? vfs_ioctl+0x16/0x4a
 [<c0187ba6>] ? do_vfs_ioctl+0x48f/0x4c6
 [<c016dfb3>] ? handle_mm_fault+0x214/0x462
 [<c0356e8e>] ? do_page_fault+0x2ce/0x2e4
 [<c0187c09>] ? sys_ioctl+0x2c/0x42
 [<c0102748>] ? sysenter_do_call+0x12/0x26
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:   90, btch:  15 usd:   0
active_anon:26670 inactive_anon:28253 isolated_anon:0
 active_file:2153 inactive_file:2367 isolated_file:0
 unevictable:0 dirty:15 writeback:24 unstable:0 buffer:151
 free:1291 slab_reclaimable:682 slab_unreclaimable:1101
 mapped:2234 shmem:70 pagetables:519 bounce:0
DMA free:1076kB min:124kB low:152kB high:184kB active_anon:5032kB inactive_anon:5116kB active_file:296kB inactive_file:364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:300kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 238 238
Normal free:4088kB min:1908kB low:2384kB high:2860kB active_anon:101648kB inactive_anon:107896kB active_file:8316kB inactive_file:9104kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:243776kB mlocked:0kB dirty:60kB writeback:96kB mapped:8636kB shmem:280kB slab_reclaimable:2720kB slab_unreclaimable:4364kB kernel_stack:472kB pagetables:2068kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 2*16kB 14*32kB 5*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1076kB
Normal: 550*4kB 106*8kB 53*16kB 4*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4088kB
10052 total pagecache pages
5462 pages in swap cache
Swap cache stats: add 42476, delete 37014, find 32571/34728
Free swap  = 412384kB
Total swap = 514040kB
65520 pages RAM
1689 pages reserved
8293 pages shared
58694 pages non-shared
ifconfig: page allocation failure. order:5, mode:0x80d0
Pid: 4528, comm: ifconfig Tainted: G        W  2.6.32-rc4-00001-gd93a8f8-dirty #2
Call Trace:
 [<c0161034>] ? __alloc_pages_nodemask+0x43e/0x4a8
 [<c0104d7f>] ? dma_generic_alloc_coherent+0x4a/0xa7
 [<c0104d35>] ? dma_generic_alloc_coherent+0x0/0xa7
 [<d0933b68>] ? e100_alloc_cbs+0xc0/0x16d [e100]
 [<d0934be9>] ? e100_up+0x1b/0xf5 [e100]
 [<d0934cda>] ? e100_open+0x17/0x41 [e100]
 [<c0305f11>] ? dev_open+0x8f/0xc5
 [<c03056d0>] ? dev_change_flags+0xa2/0x155
 [<c033c103>] ? devinet_ioctl+0x22a/0x51b
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c02f92be>] ? sock_ioctl+0x1c0/0x1e4
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c01872da>] ? vfs_ioctl+0x16/0x4a
 [<c0187ba6>] ? do_vfs_ioctl+0x48f/0x4c6
 [<c017dd04>] ? vfs_write+0xf4/0x105
 [<c0187c09>] ? sys_ioctl+0x2c/0x42
 [<c0102748>] ? sysenter_do_call+0x12/0x26
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:   90, btch:  15 usd:   0
active_anon:26162 inactive_anon:28360 isolated_anon:27
 active_file:2077 inactive_file:2461 isolated_file:5
 unevictable:0 dirty:14 writeback:262 unstable:0 buffer:149
 free:1639 slab_reclaimable:682 slab_unreclaimable:1103
 mapped:2184 shmem:70 pagetables:519 bounce:0
DMA free:1076kB min:124kB low:152kB high:184kB active_anon:5032kB inactive_anon:5116kB active_file:296kB inactive_file:364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:300kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 238 238
Normal free:5480kB min:1908kB low:2384kB high:2860kB active_anon:99616kB inactive_anon:108324kB active_file:8012kB inactive_file:9480kB unevictable:0kB isolated(anon):108kB isolated(file):20kB present:243776kB mlocked:0kB dirty:56kB writeback:1048kB mapped:8436kB shmem:280kB slab_reclaimable:2720kB slab_unreclaimable:4372kB kernel_stack:472kB pagetables:2068kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 2*16kB 14*32kB 5*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1076kB
Normal: 596*4kB 143*8kB 70*16kB 16*32kB 3*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5480kB
10454 total pagecache pages
5845 pages in swap cache
Swap cache stats: add 43307, delete 37462, find 32586/34751
Free swap  = 409320kB
Total swap = 514040kB
65520 pages RAM
1689 pages reserved
8220 pages shared
58380 pages non-shared
ifconfig: page allocation failure. order:5, mode:0x80d0
Pid: 4562, comm: ifconfig Tainted: G        W  2.6.32-rc4-00001-gd93a8f8-dirty #2
Call Trace:
 [<c0161034>] ? __alloc_pages_nodemask+0x43e/0x4a8
 [<c0104d7f>] ? dma_generic_alloc_coherent+0x4a/0xa7
 [<c0104d35>] ? dma_generic_alloc_coherent+0x0/0xa7
 [<d0933b68>] ? e100_alloc_cbs+0xc0/0x16d [e100]
 [<d0934be9>] ? e100_up+0x1b/0xf5 [e100]
 [<d0934cda>] ? e100_open+0x17/0x41 [e100]
 [<c0305f11>] ? dev_open+0x8f/0xc5
 [<c03056d0>] ? dev_change_flags+0xa2/0x155
 [<c033c103>] ? devinet_ioctl+0x22a/0x51b
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c02f92be>] ? sock_ioctl+0x1c0/0x1e4
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c01872da>] ? vfs_ioctl+0x16/0x4a
 [<c0187ba6>] ? do_vfs_ioctl+0x48f/0x4c6
 [<c016dfb3>] ? handle_mm_fault+0x214/0x462
 [<c0356e8e>] ? do_page_fault+0x2ce/0x2e4
 [<c0187c09>] ? sys_ioctl+0x2c/0x42
 [<c0102748>] ? sysenter_do_call+0x12/0x26
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:   90, btch:  15 usd:   0
active_anon:22485 inactive_anon:31175 isolated_anon:0
 active_file:1840 inactive_file:3750 isolated_file:0
 unevictable:0 dirty:24 writeback:2374 unstable:0 buffer:149
 free:1431 slab_reclaimable:675 slab_unreclaimable:1173
 mapped:2106 shmem:69 pagetables:509 bounce:0
DMA free:1076kB min:124kB low:152kB high:184kB active_anon:5032kB inactive_anon:5116kB active_file:296kB inactive_file:364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:300kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 238 238
Normal free:4648kB min:1908kB low:2384kB high:2860kB active_anon:84908kB inactive_anon:119584kB active_file:7064kB inactive_file:14636kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:243776kB mlocked:0kB dirty:96kB writeback:9496kB mapped:8124kB shmem:276kB slab_reclaimable:2692kB slab_unreclaimable:4652kB kernel_stack:464kB pagetables:2028kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 2*16kB 14*32kB 5*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1076kB
Normal: 430*4kB 64*8kB 25*16kB 45*32kB 7*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4648kB
16003 total pagecache pages
10343 pages in swap cache
Swap cache stats: add 49019, delete 38676, find 32912/35144
Free swap  = 387940kB
Total swap = 514040kB
65520 pages RAM
1689 pages reserved
6947 pages shared
58627 pages non-shared
ifconfig: page allocation failure. order:5, mode:0x80d0
Pid: 4562, comm: ifconfig Tainted: G        W  2.6.32-rc4-00001-gd93a8f8-dirty #2
Call Trace:
 [<c0161034>] ? __alloc_pages_nodemask+0x43e/0x4a8
 [<c0104d7f>] ? dma_generic_alloc_coherent+0x4a/0xa7
 [<c0104d35>] ? dma_generic_alloc_coherent+0x0/0xa7
 [<d0933b68>] ? e100_alloc_cbs+0xc0/0x16d [e100]
 [<d0934be9>] ? e100_up+0x1b/0xf5 [e100]
 [<d0934cda>] ? e100_open+0x17/0x41 [e100]
 [<c0305f11>] ? dev_open+0x8f/0xc5
 [<c03056d0>] ? dev_change_flags+0xa2/0x155
 [<c033c103>] ? devinet_ioctl+0x22a/0x51b
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c02f92be>] ? sock_ioctl+0x1c0/0x1e4
 [<c02f90fe>] ? sock_ioctl+0x0/0x1e4
 [<c01872da>] ? vfs_ioctl+0x16/0x4a
 [<c0187ba6>] ? do_vfs_ioctl+0x48f/0x4c6
 [<c016dfb3>] ? handle_mm_fault+0x214/0x462
 [<c011c0a1>] ? finish_task_switch+0x23/0x61
 [<c0356e8e>] ? do_page_fault+0x2ce/0x2e4
 [<c0187c09>] ? sys_ioctl+0x2c/0x42
 [<c0102748>] ? sysenter_do_call+0x12/0x26
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:   90, btch:  15 usd:   0
active_anon:19062 inactive_anon:33723 isolated_anon:0
 active_file:1517 inactive_file:4598 isolated_file:0
 unevictable:0 dirty:26 writeback:2979 unstable:0 buffer:149
 free:1762 slab_reclaimable:670 slab_unreclaimable:1196
 mapped:1952 shmem:65 pagetables:509 bounce:0
DMA free:1076kB min:124kB low:152kB high:184kB active_anon:5032kB inactive_anon:5116kB active_file:296kB inactive_file:364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:300kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:8kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 238 238
Normal free:5972kB min:1908kB low:2384kB high:2860kB active_anon:71216kB inactive_anon:129776kB active_file:5772kB inactive_file:18028kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:243776kB mlocked:0kB dirty:104kB writeback:11916kB mapped:7508kB shmem:260kB slab_reclaimable:2672kB slab_unreclaimable:4744kB kernel_stack:464kB pagetables:2028kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 2*8kB 2*16kB 14*32kB 5*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1076kB
Normal: 423*4kB 45*8kB 47*16kB 61*32kB 19*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5972kB
20642 total pagecache pages
14462 pages in swap cache
Swap cache stats: add 54216, delete 39754, find 33267/35545
Free swap  = 367980kB
Total swap = 514040kB
65520 pages RAM
1689 pages reserved
6413 pages shared
58337 pages non-shared
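
For readers decoding the dumps above: "order:5" means 2^5 contiguous pages, which is 128kB with the 4kB pages these free lists imply. The following is a minimal sketch, not kernel code, that redoes the arithmetic of a per-zone free-list line; the helper names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Size in kB of one free block of the given order (hypothetical helper). */
static unsigned long order_to_kb(unsigned order, unsigned long page_kb)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return page_kb << order;
}

/* Total free kB for one zone line; counts[i] is the number of order-i blocks. */
static unsigned long free_kb(const unsigned long *counts, size_t n,
			     unsigned long page_kb)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		total += counts[i] * order_to_kb((unsigned)i, page_kb);
	return total;
}

/* Number of free blocks large enough to satisfy a request of 'order'. */
static unsigned long blocks_at_or_above(const unsigned long *counts, size_t n,
					unsigned order)
{
	unsigned long blocks = 0;

	for (size_t i = order; i < n; i++)
		blocks += counts[i];
	return blocks;
}
```

Fed the last "Normal:" line above (423, 45, 47, 61, 19 and then zeros), free_kb() gives 5972 and blocks_at_or_above(..., 5) gives 0, so that zone has no free 128kB block. The allocator also applies per-zone watermark checks, so a nonzero count (as in the DMA line) does not by itself guarantee success.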

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 10:32           ` Alan Cox
@ 2009-10-13 13:25             ` Paul Fulghum
  2009-10-13 14:39               ` Linus Torvalds
  1 sibling, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-13 13:25 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,
	"Frédéric L. W. Meunier",
	OGAWA Hirofumi

Alan Cox wrote:
> The tty_buffer_request_room() is a hint to help better allocation. It's
> also only safe to run from the receiving path of the driver (which
> has always been assumed not to make two parallel calls to the function at
> once).

Yes, the locking only synchronizes between
producer and consumer. It does not coordinate between
multiple producers as it provides a lot of flexibility
(and responsibility) to the producer in how to fill the buffers.

-- 
Paul Fulghum
MicroGate Systems, Ltd.
=Customer Driven, by Design=
(800)444-1982
(512)345-7791 (Direct)
(512)343-9046 (Fax)
Central Time Zone (GMT -5h)
www.microgate.com

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 14:33                   ` Frédéric L. W. Meunier
  0 siblings, 0 replies; 248+ messages in thread
From: Frédéric L. W. Meunier @ 2009-10-13 14:33 UTC (permalink / raw)
  To: Boyan
  Cc: Frédéric L. W. Meunier, Justin P. Mattock,
	Linus Torvalds, Nix, Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

[-- Attachment #1: Type: TEXT/PLAIN, Size: 7743 bytes --]

On Tue, 13 Oct 2009, Boyan wrote:

> Frédéric L. W. Meunier wrote:
>> On Mon, 12 Oct 2009, Justin P. Mattock wrote:
>> 
>>> Linus Torvalds wrote:
>>>> [ Alan, Paulkf - the tty buffering and locking is originally your code,
>>>>    although from about three years ago, when it used to be in tty_io.c..
>>>>    Any comment? ]
>>>> 
>>>> On Mon, 12 Oct 2009, Linus Torvalds wrote:
>>>> 
>>>>> Alan, Ogawa-san, do either of you see some problem in tty_buffer.c,
>>>>> perhaps?
>>>>> 
>>>> 
>>>> Hmm. I see one, at least.
>>>> 
>>>> The "tty_insert_flip_string()" locking seems totally bogus.
>>>> 
>>>> It does that "tty_buffer_request_room()" call and subsequent copying with
>>>> no locking at all - sure, the tty_buffer_request_room() function itself
>>>> locks the buffers, but then unlocks it when returning, so when we 
>>>> actually
>>>> do the memcpy() etc, we can race with anybody.
>>>> 
>>>> I don't really see who would care, but it does look totally broken.
>>>> 
>>>> I dunno, this patch seems to make sense to me. Am I missing something?
>>>> 
>>>> [ NOTE! The patch is totally untested. It compiled for me on x86-64, and
>>>>    apart from that I'm just going to say that it looks obvious, and the 
>>>> old
>>>>    code looks obviously buggy. Also, any remaining users of
>>>>
>>>>     tty_prepare_flip_string
>>>>     tty_prepare_flip_string_flags
>>>>
>>>>    are still fundamentally broken and buggy, while users of
>>>>
>>>>     tty_buffer_request_room
>>>>
>>>>    are pretty damn odd and suspect (but a lot of them seem to be just
>>>>    pointless: they then call tty_insert_flip_string(), which means that 
>>>> the
>>>>    tty_buffer_request_room() call was totally redundant ]
>>>> 
>>>> Comments? Does this work? Does it make any difference? It seems fairly
>>>> unlikely, but it's the only obvious problem I've seen in the tty 
>>>> buffering
>>>> code so far.
>>>> 
>>>> And that code is literally 3 years old, and it seems unlikely that a
>>>> regular _keyboard_ buffer would be able to hit the (rather small) race
>>>> condition. But other serialization may have hidden it, and timing
>>>> differences could certainly have caused it to trigger much more easily.
>>>>
>>>>             Linus
>>>> 
>>>> ---
>>>>   drivers/char/tty_buffer.c |   33 +++++++++++++++++++++++++--------
>>>>   1 files changed, 25 insertions(+), 8 deletions(-)
>>>> 
>>>> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
>>>> index 3108991..25ab538 100644
>>>> --- a/drivers/char/tty_buffer.c
>>>> +++ b/drivers/char/tty_buffer.c
>>>> @@ -196,13 +196,10 @@ static struct tty_buffer *tty_buffer_find(struct 
>>>> tty_struct *tty, size_t size)
>>>>    *
>>>>    *    Locking: Takes tty->buf.lock
>>>>    */
>>>> -int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>>>> +static int locked_tty_buffer_request_room(struct tty_struct *tty, size_t 
>>>> size)
>>>>   {
>>>>       struct tty_buffer *b, *n;
>>>>       int left;
>>>> -    unsigned long flags;
>>>> -
>>>> -    spin_lock_irqsave(&tty->buf.lock, flags);
>>>>
>>>>       /* OPTIMISATION: We could keep a per tty "zero" sized buffer to
>>>>          remove this conditional if its worth it. This would be invisible
>>>> @@ -225,9 +222,20 @@ int tty_buffer_request_room(struct tty_struct *tty, 
>>>> size_t size)
>>>>               size = left;
>>>>       }
>>>> 
>>>> -    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>>       return size;
>>>>   }
>>>> +
>>>> +int tty_buffer_request_room(struct tty_struct *tty, size_t size)
>>>> +{
>>>> +    int retval;
>>>> +    unsigned long flags;
>>>> +
>>>> +    spin_lock_irqsave(&tty->buf.lock, flags);
>>>> +    retval = locked_tty_buffer_request_room(tty, size);
>>>> +    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>> +    return retval;
>>>> +}
>>>> +
>>>>   EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>>>
>>>>   /**
>>>> @@ -239,16 +247,20 @@ EXPORT_SYMBOL_GPL(tty_buffer_request_room);
>>>>    *    Queue a series of bytes to the tty buffering. All the characters
>>>>    *    passed are marked as without error. Returns the number added.
>>>>    *
>>>> - *    Locking: Called functions may take tty->buf.lock
>>>> + *    Locking: We take tty->buf.lock
>>>>    */
>>>>
>>>>   int tty_insert_flip_string(struct tty_struct *tty, const unsigned char 
>>>> *chars,
>>>>                   size_t size)
>>>>   {
>>>>       int copied = 0;
>>>> +    unsigned long flags;
>>>> +
>>>> +    spin_lock_irqsave(&tty->buf.lock, flags);
>>>>       do {
>>>> -        int space = tty_buffer_request_room(tty, size - copied);
>>>> +        int space = locked_tty_buffer_request_room(tty, size - copied);
>>>>           struct tty_buffer *tb = tty->buf.tail;
>>>> +
>>>>           /* If there is no space then tb may be NULL */
>>>>           if (unlikely(space == 0))
>>>>               break;
>>>> @@ -260,6 +272,7 @@ int tty_insert_flip_string(struct tty_struct *tty, 
>>>> const unsigned char *chars,
>>>>           /* There is a small chance that we need to split the data over
>>>>              several buffers. If this is the case we must loop */
>>>>       } while (unlikely(size>  copied));
>>>> +    spin_unlock_irqrestore(&tty->buf.lock, flags);
>>>>       return copied;
>>>>   }
>>>>   EXPORT_SYMBOL(tty_insert_flip_string);
>>>> @@ -282,8 +295,11 @@ int tty_insert_flip_string_flags(struct tty_struct 
>>>> *tty,
>>>>           const unsigned char *chars, const char *flags, size_t size)
>>>>   {
>>>>       int copied = 0;
>>>> +    unsigned long irqflags;
>>>> +
>>>> +    spin_lock_irqsave(&tty->buf.lock, irqflags);
>>>>       do {
>>>> -        int space = tty_buffer_request_room(tty, size - copied);
>>>> +        int space = locked_tty_buffer_request_room(tty, size - copied);
>>>>           struct tty_buffer *tb = tty->buf.tail;
>>>>           /* If there is no space then tb may be NULL */
>>>>           if (unlikely(space == 0))
>>>> @@ -297,6 +313,7 @@ int tty_insert_flip_string_flags(struct tty_struct 
>>>> *tty,
>>>>           /* There is a small chance that we need to split the data over
>>>>              several buffers. If this is the case we must loop */
>>>>       } while (unlikely(size>  copied));
>>>> +    spin_unlock_irqrestore(&tty->buf.lock, irqflags);
>>>>       return copied;
>>>>   }
>>>>   EXPORT_SYMBOL(tty_insert_flip_string_flags);
>>>> 
>>>> 
>>> I can throw your patch in over here for the heck of it.
>>> If there's somebody who's really hitting this bug
>>> then the results would be better  if this is the area that causing
>>> this bug.(from here the only issue I'm seeing is spinning
>>> history commands in the terminal  from time to time,
>>> nothing of any unusable keys like others are reporting).
>> 
>> I tested it on top of 2.6.31.4 (after putting back 
>> e043e42bdb66885b3ac10d27a01ccb9972e2b0a3), and the keyboard is fine after 
>> almost 3h. Before that, the problems would appear in less than 1h. Maybe I 
>> spoke too soon, but...
>> 
>> Boyan, does it work for you ?
>> 
>
> I've just tested it on top of 2.6.31.3 and it doesn't work. As I've
> mentioned in previous email - I usually trigger the problem easily
> watching pictures with gthumb - this is combination of cpu intensive
> operations and keyboard usage and if it doesn't work it takes me no more
> than a minute to trigger the problem.
>
> I thought the problem may be more easily triggered because of the newer
> (1.6.4 RC) in fedora which is slower for my ati radeon cards, but now
> I'm with older version 1.6.1.901 which is fine in speed - so it doesn't
> matter what is the version of X.

It happened again here. I was running screen inside a terminal 
under X, moved to the window running mc, used the up arrow key, 
and it locked the keyboard with that key pressed.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 14:39               ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 14:39 UTC (permalink / raw)
  To: Alan Cox
  Cc: Nix, Paul Fulghum, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Alan Cox wrote:
> 
> There is a simple reason the locking is sufficient. If you can call the
> function from two places at once in your serial driver at the same time
> you've scrambled the data order so you've already lost.

Umm. No, Alan.

You also can race with:

 - whoever is _reading_ the buffer, and due to memory ordering may see the 
   update to the buffer length _before_ it actually sees the data itself. 
   That spinlock does all the memory ordering too.

 - scrambling the data order with two writers is certainly less annoying 
   than potentially screwing up ->used entirely, and having the memcpy's 
   overflow the buffer. Both writers may have decided that there is enough 
   room for each one - but that does not mean that there is enough room 
   for _both_.

Now, I do agree that generally there should be locking at a higher level, 
and you should never see two concurrent writers. But even if the locking 
is only for reading, the old locking is simply _wrong_.
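
The two-writer case can be replayed deterministically in userspace. This is a minimal sketch under stated assumptions: struct model_buf and the model_* helpers are hypothetical stand-ins, not the kernel's struct tty_buffer or its functions, and the "lock" is modelled simply by which steps are allowed to interleave.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tty buffer; not the kernel structure. */
struct model_buf {
	size_t size;	/* capacity in bytes */
	size_t used;	/* bytes already queued */
};

/* Step 1 of an insert: how many of 'want' bytes fit right now? */
static size_t model_request_room(const struct model_buf *b, size_t want)
{
	size_t left;

	assert(b->used <= b->size);
	left = b->size - b->used;
	return want < left ? want : left;
}

/* Step 2 of an insert: account for 'n' bytes as copied. */
static void model_commit(struct model_buf *b, size_t n)
{
	b->used += n;
}

/*
 * Bad interleaving: both writers finish step 1 before either does step 2,
 * which is possible when the lock is dropped between the two steps.
 */
static size_t racy_interleaving(struct model_buf *b, size_t a, size_t c)
{
	size_t room_a = model_request_room(b, a);
	size_t room_c = model_request_room(b, c);	/* stale view of 'used' */

	model_commit(b, room_a);
	model_commit(b, room_c);
	return b->used;
}

/* Check and commit as one unit, as if one lock were held across both. */
static size_t serialized(struct model_buf *b, size_t a, size_t c)
{
	model_commit(b, model_request_room(b, a));
	model_commit(b, model_request_room(b, c));
	return b->used;
}
```

With an 8-byte buffer and two 6-byte writers, racy_interleaving() ends with used == 12, past the end of the buffer, while serialized() truncates the second writer and ends with used == 8.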

> >   pointless: they then call tty_insert_flip_string(), which means that the 
> >   tty_buffer_request_room() call was totally redundant ]
> 
> It's a performance tweak. With a 3G USB modem or similar device running
> at 20Mbits or more being able to generate one allocation per chunk
> received for DMA made a measurable performance difference on some
> platforms. 

Have you even _read_ the code, Alan?

It's not a f*cking performance tweak, and you're ludicrous to claim it is. 
It's pointless, and it's making the code _slower_ rather than faster.

Lookie here, Alan - the common sequence is crap like this:

	tty_buffer_request_room(tty, buf->size);
	tty_insert_flip_string(tty, buf->base, buf->size);

and anybody who claims that is a "performance tweak" doesn't know what the 
hell he is talking about.

Look again.

The first thing that tty_insert_flip_string() does is to re-do the same 
tty_buffer_request_room() call. 
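
A small userspace model makes the redundancy countable. It is only a sketch: struct model_tty and the model_* functions are hypothetical, use a single fixed buffer, and merely mirror the loop shape of tty_insert_flip_string() as it appears in the patch quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-buffer model; not the kernel's tty_struct. */
struct model_tty {
	unsigned char data[64];
	size_t used;
	unsigned request_calls;	/* how often room was requested */
};

/* Returns how many of 'size' bytes fit, and counts the call. */
static size_t model_request_room(struct model_tty *t, size_t size)
{
	size_t left;

	assert(t->used <= sizeof(t->data));
	left = sizeof(t->data) - t->used;
	t->request_calls++;
	return size < left ? size : left;
}

/* Same shape as the quoted insert loop: it asks for room itself. */
static size_t model_insert_string(struct model_tty *t,
				  const unsigned char *chars, size_t size)
{
	size_t copied = 0;

	do {
		size_t space = model_request_room(t, size - copied);

		if (space == 0)
			break;
		memcpy(t->data + t->used, chars + copied, space);
		t->used += space;
		copied += space;
	} while (size > copied);
	return copied;
}
```

Calling model_request_room(&t, 5) followed by model_insert_string(&t, ..., 5) leaves request_calls at 2, while the insert alone leaves it at 1 with identical buffer contents.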

Performance tweak? No. Most of them are stupid, pointless, and worthless. 
Many of them do it for a single character too.

Not all, no. One or two seem to do one tty_buffer_request_room() call, and 
then some one-byte-at-a-time thing, but quite frankly, those are sure as 
hell not going to push lots of data quickly that way either.

Maybe there is some driver where there's a point to it, but from a quick 
grep, I couldn't find any.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 11:00             ` Alan Cox
  (?)
@ 2009-10-13 14:51             ` Jiri Kosina
  2009-10-13 15:56               ` Andi Kleen
  -1 siblings, 1 reply; 248+ messages in thread
From: Jiri Kosina @ 2009-10-13 14:51 UTC (permalink / raw)
  To: Alan Cox
  Cc: Dmitry Torokhov, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Ed Tomlinson, Frédéric L. W. Meunier, Linus Torvalds,
	OGAWA Hirofumi, Andi Kleen

On Tue, 13 Oct 2009, Alan Cox wrote:

> > > > So it seems likely to me that this is a kernel bug, somewhere, and the
> > > > TTY layer seems like a good place to look (OK, a horrible place, but a
> > > > *likely* place).
> > > 
> > > Somewhere around 2.6.29-30 various things went funny in the keyboard
> > > layer for me - notably characters "bleeding" across console switches.
> > 
> > What do you mean by "bleeding"? Are you sure it is not autorepeat
> > kicking in?
> 
> Fairly. Just now and then I'll do something like type
> 
> "blahblah<alt-f1>"
> 
> eg when flipping consoles to check something and the last letter or two
> ends up on the screen after the flip (as if the alt-f1 vc switch passes
> the data somewhere). I suspect it's some kind of asynchronous handling
> using the "current console" rather than the "current console at the time
> the letter was typed" but it doesn't occur to order so isn't bisectable
> and I've never managed to pin down where in the keyboard/vt/tty stack it's
> occurring.

This was reported by Andi Kleen some time ago [1] [2]. He seems to 
have had a clear idea between which kernel versions this started happening 
and seemed to be able to reproduce it very reliably (which wasn't the case 
on my side), but I don't think he has bisected it down to a single commit yet.

Andi?

[1] http://marc.info/?l=linux-kernel&m=124695628924382&w=4
[2] http://bugzilla.kernel.org/show_bug.cgi?id=13739

-- 
Jiri Kosina
SUSE Labs, Novell Inc.


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 15:02                 ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 15:02 UTC (permalink / raw)
  To: Alan Cox
  Cc: Nix, Paul Fulghum, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Linus Torvalds wrote:
> 
>  - whoever is _reading_ the buffer, and due to memory ordering may see the 
>    update to the buffer length _before_ it actually sees the data itself. 
>    That spinlock does all the memory ordering too.

Hmm. This one looks like it's ok, because whenever we commit it, we do 
take the spinlock, so '->commit' is protected for the reader side.

>  - scrambling the data order with two writers is certainly less annoying 
>    than potentially screwing up ->used entirely, and having the memcpy's 
>    overflow the buffer. Both writers may have decided that there is enough 
>    room for each one - but that does not mean that there is enough room 
>    for _both_.

.. but this one is still true. Anybody who doesn't lock writers at a 
higher level could easily end up causing some really subtle memory 
corruption.
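
A minimal sketch of that two-writer problem, replayed by hand in plain
userspace C rather than with real threads. Everything named toy_* is
hypothetical and is not the kernel's tty buffer code; the backing array is
deliberately larger than the logical size so the demo itself stays
memory-safe while still showing ->used running past the capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tty buffer; not the kernel's struct tty_buffer. */
struct toy_buf {
	size_t size;      /* logical capacity */
	size_t used;      /* bytes written so far */
	char data[16];    /* backing store, deliberately larger than size */
};

/* Step 1 of an unlocked writer: decide whether len bytes fit. */
static int toy_has_room(const struct toy_buf *b, size_t len)
{
	return b->size - b->used >= len;
}

/* Step 2 of an unlocked writer: copy and advance used. */
static void toy_copy(struct toy_buf *b, const char *src, size_t len)
{
	assert(b->used + len <= sizeof(b->data)); /* keep the demo memory-safe */
	memcpy(b->data + b->used, src, len);
	b->used += len;
}

/*
 * Replays the bad interleaving: both writers check before either copies.
 * Returns the number of bytes written past the logical capacity.
 */
static size_t toy_replay_race(void)
{
	struct toy_buf b = { .size = 8, .used = 0 };
	int a_ok = toy_has_room(&b, 6);   /* writer A: 8 free, 6 fits */
	int b_ok = toy_has_room(&b, 6);   /* writer B: still sees 8 free */

	if (a_ok)
		toy_copy(&b, "AAAAAA", 6);
	if (b_ok)
		toy_copy(&b, "BBBBBB", 6);

	return b.used > b.size ? b.used - b.size : 0;
}
```

Each check was correct when it was made; the overrun only appears because
nothing ties a writer's check to its own copy.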

But maybe all users really are safe. 

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13  8:19               ` Boyan
  2009-10-13  9:17                 ` Dmitry Torokhov
  2009-10-13 14:33                   ` Frédéric L. W. Meunier
@ 2009-10-13 15:05                 ` Linus Torvalds
  2009-10-13 20:08                   ` Boyan
  2 siblings, 1 reply; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 15:05 UTC (permalink / raw)
  To: Boyan
  Cc: Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Tue, 13 Oct 2009, Boyan wrote:
> 
> I've just tested it on top of 2.6.31.3 and it doesn't work. As I've
> mentioned in previous email - I usually trigger the problem easily
> watching pictures with gthumb - this is combination of cpu intensive
> operations and keyboard usage and if it doesn't work it takes me no more
> than a minute to trigger the problem.

The whole "CPU intensive" thing makes me wonder..

Do you have 'CONFIG_PREEMPT' enabled? Normally, "CPU intensive" does not 
at all increase the likelihood of any kernel races, but with kernel 
preemption we may well hit some preemption point and switch away, and make 
some race window much bigger.

So if you do have CONFIG_PREEMPT on, try to turn it off and see if it 
makes the problem go away. Also, are people seeing this always running SMP 
kernels, or are there UP kernels out there too (on UP _without_ preemption 
it is almost impossible to hit 99% of all race conditions, so if anybody 
is running an UP kernel with no preemption, then I'd be very surprised if 
it is a kernel issue).

But I also still wonder if it might be user-space races, and just the 
timing differences in the kernel. I don't know the input layer in X well 
enough, I'm wondering if things like composition engine/window manager 
could screw up here. Is there some pattern to the X versions (and/or 
window managers and composition engines)?

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 15:08                 ` Paul Fulghum
  0 siblings, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-13 15:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi

On Tue, 2009-10-13 at 07:39 -0700, Linus Torvalds wrote:

> You also can race with:
> 
>  - whoever is _reading_ the buffer, and due to memory ordering may see the 
>    update to the buffer length _before_ it actually sees the data itself. 
>    That spinlock does all the memory ordering too.

The only reader is flush_to_ldisc() which operates on the
'commit' and 'read' fields of the buffer.

tty_prepare_xxx and tty_insert_xxx operate on the 'used'
field of the buffer.

'commit' is updated with 'used' only under spinlock when
tty_flip_buffer_push() is called after the producer is
finished filling a buffer or in tty_buffer_request_room()
when allocating a new buffer.
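
The division of labour between 'used', 'commit' and 'read' can be sketched
as a small userspace model. This is only an illustration under stated
assumptions: the toy_* names are invented, the "spinlock" is a flag with
assertions, and the reader holds it for the whole copy, which is a
simplification of whatever flush_to_ldisc() really does.

```c
#include <assert.h>
#include <string.h>

/* Toy model of one tty buffer; field names follow the mail, the rest is made up. */
struct toy_tb {
	char data[32];
	int used;       /* producer side: bytes written so far */
	int commit;     /* bytes published to the reader */
	int read;       /* reader side: bytes already consumed */
	int lock_held;  /* stand-in for the buffer spinlock */
};

static void toy_lock(struct toy_tb *tb)   { assert(!tb->lock_held); tb->lock_held = 1; }
static void toy_unlock(struct toy_tb *tb) { assert(tb->lock_held);  tb->lock_held = 0; }

/* Producer: fills the buffer and advances only 'used'. */
static void toy_insert(struct toy_tb *tb, const char *src, int len)
{
	assert(tb->used + len <= (int)sizeof(tb->data));
	memcpy(tb->data + tb->used, src, len);
	tb->used += len;
}

/* Producer is done: publish everything written so far, under the lock. */
static void toy_push(struct toy_tb *tb)
{
	toy_lock(tb);
	tb->commit = tb->used;
	toy_unlock(tb);
}

/* Reader: consumes only the range [read, commit); returns bytes copied to out. */
static int toy_flush(struct toy_tb *tb, char *out)
{
	int n;

	toy_lock(tb);
	n = tb->commit - tb->read;
	memcpy(out, tb->data + tb->read, n);
	tb->read += n;
	toy_unlock(tb);
	return n;
}

/* Returns 1 if unpublished bytes stay invisible until toy_push() runs. */
static int toy_demo_commit(void)
{
	struct toy_tb tb = { .used = 0 };
	char out[32];
	int before, after;

	toy_insert(&tb, "abc", 3);
	before = toy_flush(&tb, out);   /* nothing committed yet */
	toy_push(&tb);
	after = toy_flush(&tb, out);    /* now "abc" is visible */
	return before == 0 && after == 3 && memcmp(out, "abc", 3) == 0;
}
```

In this model the reader never looks at 'used', so a half-filled buffer is
invisible to it until the producer publishes it.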

--
Paul Fulghum
Microgate Systems, Ltd


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 15:16                 ` Justin P. Mattock
  0 siblings, 0 replies; 248+ messages in thread
From: Justin P. Mattock @ 2009-10-13 15:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, Nix, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,
	"Frédéric L. W. Meunier",
	OGAWA Hirofumi

Alan Cox wrote:
>> I can throw your patch in over here for the heck of it.
>> If there's somebody who's really hitting this bug
>> then the results would be better  if this is the area that causing
>> this bug.(from here the only issue I'm seeing is spinning
>> history commands in the terminal  from time to time,
>> nothing of any unusable keys like others are reporting).
>>      
>
> That sounds more like a lost key-up event somewhere higher up the stack.
> USB keyboard ? and does it stop if you take the key in question. Also does
> it stop if you touch the mouse wheel (assuming you've got mousewheel
> bound to shell history somewhere ?)
>
> Alan
>
>    
This seems like a new mechanism in
Fedora/Ubuntu, but I could be wrong
(smart keys or something).

It is a USB keyboard using evdev/keyboard
as the X modules (iMac9,1).
The way I've been able to get this to stop is to
open another terminal; then, when the history starts
spinning, I click on the other terminal
and everything seems to stop.
As for reproducing it, this thing seems to have a mind of its own: some timer
seems to fire and tell it to start searching for the last good word for the
"user" (but I could be wrong), and that usually ends up being something
completely wrong. A couple of times it did actually work and went right to
the word I had in mind, but most of the time this thing just causes
irritation.


Justin P. Mattock


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 15:33                 ` Paul Fulghum
  0 siblings, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-13 15:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi

On Tue, 2009-10-13 at 07:39 -0700, Linus Torvalds wrote:
> It's not a f*cking performance tweak, and you're ludicrous to claim it is. 
> It's pointless, and it's making the code _slower_ rather than faster.
> 
> Lookie here, Alan - the common sequence is crap like this:
> 
> 	tty_buffer_request_room(tty, buf->size);
> 	tty_insert_flip_string(tty, buf->base, buf->size);

The performance tweak of tty_prepare_xxx is that you fill
the tty_buffer directly instead of writing data first to a staging
buffer and then calling tty_insert_flip_string, which just copies
from the staging buffer to the tty_buffer. So it saves a copy operation.
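
A rough userspace sketch of the copy being saved, assuming a made-up device
that can write its data wherever it is told to. The toy_* functions are
hypothetical and only mirror the shape of the insert-style and prepare-style
paths; they are not the kernel helpers themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static size_t toy_bytes_copied;   /* counts bytes moved by the extra memcpy */

struct toy_tb {
	unsigned char data[32];
	size_t used;
};

/* Pretend hardware read: the "device" writes len bytes wherever it is told to. */
static void toy_device_read(unsigned char *dst, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = (unsigned char)('a' + i);
}

/* Insert-style path: the caller already holds the data somewhere else. */
static void toy_insert_string(struct toy_tb *tb, const unsigned char *src, size_t len)
{
	assert(tb->used + len <= sizeof(tb->data));
	memcpy(tb->data + tb->used, src, len);
	toy_bytes_copied += len;
	tb->used += len;
}

/* Prepare-style path: reserve len bytes and hand back a pointer into the buffer. */
static unsigned char *toy_prepare_string(struct toy_tb *tb, size_t len)
{
	unsigned char *p = tb->data + tb->used;

	assert(tb->used + len <= sizeof(tb->data));
	tb->used += len;
	return p;
}

/* Returns 1 if both paths store the same bytes and only staging pays for a copy. */
static int toy_demo_copy_saved(void)
{
	struct toy_tb a = { .used = 0 }, b = { .used = 0 };
	unsigned char staging[8];
	size_t staged, direct;

	toy_bytes_copied = 0;
	toy_device_read(staging, sizeof(staging));       /* device -> staging */
	toy_insert_string(&a, staging, sizeof(staging)); /* staging -> buffer */
	staged = toy_bytes_copied;

	toy_bytes_copied = 0;
	toy_device_read(toy_prepare_string(&b, 8), 8);   /* device -> buffer */
	direct = toy_bytes_copied;

	return staged == 8 && direct == 0 && memcmp(a.data, b.data, 8) == 0;
}
```

The saving only exists when the driver can point its data source at the
reserved region before the data arrives.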
 
--
Paul Fulghum
Microgate Systems, Ltd


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14264] ehci problem - mouse dead on scroll
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-13 15:35   ` Alan Stern
  2009-10-13 15:55       ` Volker Armin Hemmann
  -1 siblings, 1 reply; 248+ messages in thread
From: Alan Stern @ 2009-10-13 15:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Oliver Neukum,
	Volker Armin Hemmann

On Mon, 12 Oct 2009, Rafael J. Wysocki wrote:

> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.30 and 2.6.31.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
> Subject		: ehci problem - mouse dead on scroll
> Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
> Date		: 2009-09-12 7:46 (30 days old)
> References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
> Handled-By	: Alan Stern <stern@rowland.harvard.edu>

This is probably a hardware problem in the mouse or the Logitech
receiver.  It affected both EHCI and OHCI, and it was not reproducible
with a different mouse.  But Volker hasn't reported any results since
the end of September.

Volker, another good test would be to try plugging your mouse into 
someone else's computer.

Alan Stern


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 15:33                 ` Paul Fulghum
  (?)
@ 2009-10-13 15:41                 ` Linus Torvalds
  2009-10-13 15:59                   ` Alan Cox
  2009-10-13 17:28                     ` Paul Fulghum
  -1 siblings, 2 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 15:41 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Alan Cox, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Paul Fulghum wrote:

> On Tue, 2009-10-13 at 07:39 -0700, Linus Torvalds wrote:
> > It's not a f*cking performance tweak, and you're ludicrous to claim it is. 
> > It's pointless, and it's making the code _slower_ rather than faster.
> > 
> > Lookie here, Alan - the common sequence is crap like this:
> > 
> > 	tty_buffer_request_room(tty, buf->size);
> > 	tty_insert_flip_string(tty, buf->base, buf->size);
> 
> The performance tweak of tty_prepare_xxx is that you fill
> the tty_buffer directly instead of writing data first to a staging
> buffer and then calling tty_insert_flip_string, which just copies
> from the staging buffer to the tty_buffer. So it saves a copy operation.

Read the above again. Read what that common sequence is. Please just READ 
the f*cking code, and read my emails, instead of talking about something 
totally different that I'm not talking about at all.

The _most_common_ use of "tty_buffer_request_room()" is literally just the 
above insane sequence I quoted, not the case you talk about at all. Don't 
believe me? Use grep.

What _you_ are talking about is something else, namely the 
tty_prepare_flip stuff. But dammit, that has nothing what-so-ever to do 
with "tty_buffer_request_room()".

What I was pointing out is that there are a lot of 
"tty_buffer_request_room()" calls, and as far as I can see, all of them 
(or at least a large percentage) are just pure and utter crap.

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14264] ehci problem - mouse dead on scroll
@ 2009-10-13 15:55       ` Volker Armin Hemmann
  0 siblings, 0 replies; 248+ messages in thread
From: Volker Armin Hemmann @ 2009-10-13 15:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Oliver Neukum

On Dienstag 13 Oktober 2009, Alan Stern wrote:
> On Mon, 12 Oct 2009, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.30 and 2.6.31.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > be listed and let me know (either way).
> >
> >
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
> > Subject		: ehci problem - mouse dead on scroll
> > Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
> > Date		: 2009-09-12 7:46 (30 days old)
> > References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
> > Handled-By	: Alan Stern <stern@rowland.harvard.edu>
> 
> This is probably a hardware problem in the mouse or the Logitech
> receiver.  It affected both EHCI and OHCI, and it was not reproducible
> with a different mouse.  But Volker hasn't reported any results since
> the end of September.
> 
> Volker, another good test would be to try plugging your mouse into
> someone else's computer.
> 
> Alan Stern
> 

Yeah, that is a problem: I am pretty 'alone' as far as Linux users go. I know
very few, and they either have only servers without X or run some stable
distributions with old kernels.

It is probably hardware related. I have tried two other mice and both were ok.
Both had a lower resolution and were slower, but that shouldn't make any
difference.

I wanted to try some of the 32-rcs to see if they make any difference but the 
reiser4 for 2.6.31 patch does not result in a buildable kernel anymore, thanks 
to changes in writeback.h. So all I can say is:
2.6.31 is ok with the logitech, using ehci+hub+Translator settings and that is 
good enough for me.
every other mouse is ok with either ohci or attached to the hub.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 14:51             ` Jiri Kosina
@ 2009-10-13 15:56               ` Andi Kleen
  0 siblings, 0 replies; 248+ messages in thread
From: Andi Kleen @ 2009-10-13 15:56 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Alan Cox, Dmitry Torokhov, Nix, Justin P. Mattock,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Ed Tomlinson,
	Frédéric L. W. Meunier, Linus Torvalds, OGAWA Hirofumi,
	Andi Kleen

> This has been reported by Andi Kleen some time ago [1] [2]. He seems to 
> have had clear idea between which kernel versions this started happening 
> and seemed to be able to reproduce it very reliably (which wasn't the case 
> on my side), but I don't think he bisected it down to single commit yet.
> 
> Andi?

I've never tried to bisect it, but I think it was introduced between
.29->.30. Or at least I had never noticed the problem before upgrading
to some .30rc*

My symptoms were slightly different from Alan's though: for me the actual
console switch leaked. I can't remember seeing keys before the switch
leaking too.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 15:41                 ` Linus Torvalds
@ 2009-10-13 15:59                   ` Alan Cox
  2009-10-13 16:42                     ` Linus Torvalds
  2009-10-13 17:28                     ` Paul Fulghum
  1 sibling, 1 reply; 248+ messages in thread
From: Alan Cox @ 2009-10-13 15:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Fulghum, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi

> What I was pointing out is that there are a lot of 
> "tty_buffer_request_room()" calls, and as far as I can see, all of them 
> (or at least a large percentage) are just pure and utter crap.

Almost certainly. When the original conversion was done all the code
which tried to peer into the flip buffer and check what bytes were left
was converted systematically to call request_room() as well in order to
make the conversion easy and reliable. For most devices that's a fairly
naïve conversion as with the new buffering the tty device really
shouldn't care about overruns. If we overrun now it's because we really
do want to dump stuff not because of crappy buffering.

The semantic of request_room actually trying to produce big buffers was
added because some of the high speed DMA based adapters (notably 3G USB
ones) would hand large blocks of data over each time at rates upwards of
10-20Mbits. Without that request_room tweak they tended to drop data.

Because of the way they work the DMA buffers are allocated when the URB
is submitted so they can't use prepare_* ops but had to copy in some form.

With those in place we top out at about 40-50Mbits over a USB serial
link, which ought to be enough for anyone sane for the moment.

Your change to tty_insert_flip_string() is irrelevant for any practical
situation simply because the caller has to provide ordering of the blocks
it submits (if it receives 5 bytes and they all queue asynchronously then
it doesn't matter one iota whether tty_insert_flip_string has extra
internal locking "linus" is still going to turn randomly into things like
"sunil" in the serial stream and cause much confusion).

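The ordering point can be made concrete with a small userspace sketch. It
assumes five one-byte blocks whose handlers happen to run in reverse order,
and it uses invented toy_* names throughout; every insert is atomic under a
toy lock, yet the stream still comes out scrambled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_stream {
	char data[16];
	size_t used;
	int lock_takes;
};

/* Each insert is atomic on its own: one lock, one append. */
static void toy_insert_locked(struct toy_stream *s, const char *src, size_t len)
{
	s->lock_takes++;                          /* toy "spin_lock" */
	assert(s->used + len < sizeof(s->data));  /* leave room for the NUL */
	memcpy(s->data + s->used, src, len);
	s->used += len;
	s->data[s->used] = '\0';
	/* toy "spin_unlock" would go here */
}

/*
 * Queues the bytes of msg one at a time, in the order given by 'order'
 * (order[k] is the index of the byte whose handler happens to run k-th).
 */
static void toy_queue(struct toy_stream *s, const char *msg,
		      const size_t *order, size_t n)
{
	for (size_t k = 0; k < n; k++)
		toy_insert_locked(s, &msg[order[k]], 1);
}

/* Returns 1 if five bytes completing in reverse order turn "linus" into "sunil". */
static int toy_demo_scramble(void)
{
	static const size_t reversed[] = { 4, 3, 2, 1, 0 };
	struct toy_stream s = { .used = 0 };

	toy_queue(&s, "linus", reversed, 5);
	return strcmp(s.data, "sunil") == 0 && s.lock_takes == 5;
}
```

The lock keeps each append intact but says nothing about which append goes
first, so ordering has to come from the caller.
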
I actually think you should make the tty_insert_flip_string internal
length checking change because:
- It makes the consistency of tty_insert_flip_string clearer to any
  future reader
- It's a very very mindbogglingly slight performance win
- It'll no doubt make you feel less grumpy ;)

Alan

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 15:59                   ` Alan Cox
@ 2009-10-13 16:42                     ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 16:42 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paul Fulghum, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, Frédéric L. W. Meunier,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Alan Cox wrote:
> 
> I actually think you should make the tty_insert_flip_string internal
> length checking change because:
> - It makes the consistency of tty_insert_flip_string clearer to any
>   future reader

It does that, although then you're still stuck looking at the (few) 
tty_prepare_flip_*() calls that have that "racy feel".

But there aren't _that_ many callers, and it would probably be easier to 
at least comment on those.

> - It's a very very mindbogglingly slight performance win

I suspect the loop is always done exactly once in practice, so I'm not 
sure it will matter for performance one way or the other.

However, if we do hold the spinlock over the whole operation, what we 
_could_ do is to then just combine it with tty_flip_buffer_push(), and do 
that
	tb->commit = tb->used;

part inside the lock. There's a number of people who effectively do

	tty_buffer_request_room(tty, size);
	tty_insert_flip_string(tty, buffer, size);
	tty_flip_buffer_push(tty);

and that just takes that silly spinlock _three_ times for no good reason.
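
A sketch of that lock counting in userspace C, under two assumptions that
are not spelled out above: that the insert helper asks for room again
internally, and that a combined helper would do the room check, the copy and
the commit under a single lock. toy_insert_and_push() is hypothetical and
does not exist in the kernel; the other toy_* names are stand-ins as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_tty {
	char data[64];
	size_t used;
	size_t commit;
	int lock_takes;   /* how many times the toy "spinlock" was acquired */
};

static void toy_lock(struct toy_tty *t)   { t->lock_takes++; }
static void toy_unlock(struct toy_tty *t) { (void)t; }

/* Returns how many of the requested bytes fit; takes the lock to look. */
static size_t toy_request_room(struct toy_tty *t, size_t size)
{
	size_t left;

	toy_lock(t);
	left = sizeof(t->data) - t->used;
	toy_unlock(t);
	return size < left ? size : left;
}

/* Assumed to ask for room again internally before copying. */
static size_t toy_insert_string(struct toy_tty *t, const char *src, size_t size)
{
	size_t n = toy_request_room(t, size);

	memcpy(t->data + t->used, src, n);
	t->used += n;
	return n;
}

static void toy_push(struct toy_tty *t)
{
	toy_lock(t);
	t->commit = t->used;
	toy_unlock(t);
}

/* Hypothetical combined helper: room check, copy and commit under one lock. */
static size_t toy_insert_and_push(struct toy_tty *t, const char *src, size_t size)
{
	size_t left, n;

	toy_lock(t);
	left = sizeof(t->data) - t->used;
	n = size < left ? size : left;
	memcpy(t->data + t->used, src, n);
	t->used += n;
	t->commit = t->used;
	toy_unlock(t);
	return n;
}

/* Lock acquisitions for the three-call sequence. */
static int toy_count_three_calls(void)
{
	struct toy_tty t = { .used = 0 };

	toy_request_room(&t, 5);
	toy_insert_string(&t, "hello", 5);
	toy_push(&t);
	assert(t.commit == 5);
	return t.lock_takes;
}

/* Lock acquisitions for the combined helper. */
static int toy_count_combined(void)
{
	struct toy_tty t = { .used = 0 };

	toy_insert_and_push(&t, "hello", 5);
	assert(t.commit == 5);
	return t.lock_takes;
}
```

Both paths end with the same five bytes committed; only the number of lock
round trips differs.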

> - It'll no doubt make you feel less grumpy ;)

Not likely. Grumpy is my baseline.

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 17:28                     ` Paul Fulghum
  0 siblings, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-13 17:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Nix, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson,
	"Frédéric L. W. Meunier",
	OGAWA Hirofumi

Linus Torvalds wrote:
> What _you_ are talking about is something else, namely the 
> tty_prepare_flip stuff. But dammit, that has nothing what-so-ever to do 
> with "tty_buffer_request_room()".

OK, I should have followed your argument more closely.
I was trying to interpret it in terms of your patch,
which touched the tty_prepare stuff, but that is separate
from your comments about optimizations.

-- 
Paul Fulghum
MicroGate Systems, Ltd.
=Customer Driven, by Design=
(800)444-1982
(512)345-7791 (Direct)
(512)343-9046 (Fax)
Central Time Zone (GMT -5h)
www.microgate.com

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 19:32             ` Nix
  0 siblings, 0 replies; 248+ messages in thread
From: Nix @ 2009-10-13 19:32 UTC (permalink / raw)
  To: Frédéric L. W. Meunier
  Cc: Linus Torvalds, Justin P. Mattock, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Boyan,
	Dmitry Torokhov, Ed Tomlinson, OGAWA Hirofumi

On 13 Oct 2009, Frédéric L. W. Meunier uttered the following:
> Just a note. With me, all the keyboard problems happened while I was
> under X, but doing something in a terminal running screen. Reverting
> the commit stopped the problem.

Here are some more specifics of the failure mode I see.

 - This is an SMP box (quad-core Nehalem with hyperthreading enabled and
   a preemptive kernel), so we can't rule out SMP-specific stuff.

 - I have seen it both in konsoles and in XEmacs in an X frame, so it
   isn't specific to screen, or specific to PTYs :)

 - I have a PS/2 keyboard (albeit of a very strange type: Maltron), so
   it's not USB: but I've seen this with a USB keyboard plugged in
   as well (dual-keyboarding).

 - As you might imagine it's hard to keep this box's CPU busy! I've
   seen it when totally idle (other than keystroke-triggered CPU
   activity, of course). It happens every few hours, normally.

 - I haven't seen it on the raw TTY, but I spend almost all my
   time in X, so this may well be sheer statistics.

 - I have *not* seen anything that looks like this on my headless server,
   which is also an HT quad Nehalem, but not preemptive. As Alan
   suggested, the VT or input layer or something near it is screaming
   (bashing keys mindlessly into whatever has focus under X): I've
   never seen this cause screaming on a remote machine but not on the
   local one, or in one ssh session on the local machine but not in
   others. It's always all of X that is affected.

 - Zapping X makes it go away. Next time it goes wrong I'll dig out
   an old machine with another screen, and ssh in, and see if I
   can make the problem go away by switching VTs without killing X
   (via chvt) and if it comes back when X restarts.

Here's my .config, in case it's of any use. (I'm using the TuxOnIce
patch, but I've also seen it without that patch, so we can rule that
out. I suspect we could rule it out anyway, as I doubt everyone here is
using TuxOnIce :) )

The .config of the affected machine:

CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_HAVE_DYNAMIC_PER_CPU_AREA=y
CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_KERNEL_GZIP=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_CLASSIC_RCU=y
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_USER_SCHED=y
CONFIG_CGROUPS=y
CONFIG_RELAY=y
CONFIG_NAMESPACES=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE="usr/initramfs.spindle"
CONFIG_INITRAMFS_ROOT_UID=99
CONFIG_INITRAMFS_ROOT_GID=101
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_INITRAMFS_COMPRESSION_GZIP=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_HAVE_PERF_COUNTERS=y
CONFIG_PERF_COUNTERS=y
CONFIG_EVENT_PROFILE=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
CONFIG_STRIP_ASM_SYMS=y
CONFIG_SLAB=y
CONFIG_TRACEPOINTS=y
CONFIG_MARKERS=y
CONFIG_HAVE_OPROFILE=y
CONFIG_KPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_KRETPROBES=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLOCK_COMPAT=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=m
CONFIG_IOSCHED_DEADLINE=m
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_CFQ=y
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_SPARSE_IRQ=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_MCORE2=y
CONFIG_X86_CPU=y
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_P6_NOP=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
CONFIG_IOMMU_API=y
CONFIG_NR_CPUS=8
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_NEW_MCE=y
CONFIG_X86_MCE_INTEL=y
CONFIG_X86_MCE_THRESHOLD=y
CONFIG_X86_THERMAL_VECTOR=y
CONFIG_MICROCODE=m
CONFIG_MICROCODE_INTEL=y
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=y
CONFIG_X86_CPU_DEBUG=m
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_DIRECT_GBPAGES=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_HAVE_MLOCK=y
CONFIG_HAVE_MLOCKED_PAGE_BIT=y
CONFIG_MMU_NOTIFIER=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_MTRR=y
CONFIG_X86_PAT=y
CONFIG_HZ_100=y
CONFIG_HZ=100
CONFIG_SCHED_HRTICK=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_ACPI_PROC_EVENT=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_STAT_DETAILS=y
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
CONFIG_I7300_IDLE_IOAT_CHANNEL=y
CONFIG_I7300_IDLE=y
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_DMAR=y
CONFIG_DMAR_DEFAULT_ON=y
CONFIG_DMAR_FLOPPY_WA=y
CONFIG_PCIEPORTBUS=y
CONFIG_PCIEAER=y
CONFIG_PCIEASPM=y
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_IOV=y
CONFIG_ISA_DMA_API=y
CONFIG_K8_NB=y
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_MISC=y
CONFIG_IA32_EMULATION=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_FIB_HASH=y
CONFIG_IP_PNP=y
CONFIG_INET_LRO=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_UEVENT_HELPER_PATH=""
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
CONFIG_PNP=y
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_CDROM_PKTCDVD=y
CONFIG_CDROM_PKTCDVD_BUFFERS=16
CONFIG_MISC_DEVICES=y
CONFIG_HAVE_IDE=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_BLK_DEV_SD=y
CONFIG_BLK_DEV_SR=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_SCSI_ARCMSR=y
CONFIG_SCSI_ARCMSR_AER=y
CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_SATA_AHCI=y
CONFIG_MD=y
CONFIG_BLK_DEV_DM=y
CONFIG_DM_CRYPT=y
CONFIG_DM_SNAPSHOT=y
CONFIG_DM_MIRROR=y
CONFIG_DM_ZERO=y
CONFIG_FIREWIRE=m
CONFIG_FIREWIRE_OHCI=m
CONFIG_FIREWIRE_OHCI_DEBUG=y
CONFIG_FIREWIRE_SBP2=m
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_TUN=y
CONFIG_NETDEV_1000=y
CONFIG_E1000E=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1680
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=1050
CONFIG_INPUT_EVDEV=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_INPUT_JOYSTICK=y
CONFIG_JOYSTICK_ANALOG=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_LIBPS2=y
CONFIG_GAMEPORT=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_IPMI_HANDLER=m
CONFIG_IPMI_PANIC_EVENT=y
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_POWEROFF=m
CONFIG_NVRAM=m
CONFIG_HPET=y
CONFIG_HPET_MMAP=y
CONFIG_DEVPORT=y
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_HELPER_AUTO=y
CONFIG_I2C_I801=y
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
CONFIG_HWMON=y
CONFIG_HWMON_VID=y
CONFIG_SENSORS_W83793=y
CONFIG_THERMAL=y
CONFIG_THERMAL_HWMON=y
CONFIG_SSB_POSSIBLE=y
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_SOUND=y
CONFIG_SOUND_OSS_CORE=y
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_JACK=y
CONFIG_SND_SEQUENCER=y
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_HRTIMER=y
CONFIG_SND_SEQ_HRTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_VERBOSE_PROCFS=y
CONFIG_SND_VMASTER=y
CONFIG_SND_PCI=y
CONFIG_SND_HDA_INTEL=y
CONFIG_SND_HDA_INPUT_JACK=y
CONFIG_SND_HDA_CODEC_INTELHDMI=y
CONFIG_SND_HDA_ELD=y
CONFIG_SND_HDA_GENERIC=y
CONFIG_SND_HDA_POWER_SAVE=y
CONFIG_SND_HDA_POWER_SAVE_DEFAULT=0
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_USB_HID=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=y
CONFIG_USB_DEVICEFS=y
CONFIG_USB_DYNAMIC_MINORS=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_UHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_PL2303=m
CONFIG_EDAC=y
CONFIG_EDAC_MM_EDAC=y
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
CONFIG_RTC_DRV_CMOS=y
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_DMIID=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_XATTR=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_JBD2=y
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
CONFIG_FS_POSIX_ACL=y
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
CONFIG_QUOTA_TREE=y
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_FUSE_FS=y
CONFIG_CUSE=y
CONFIG_GENERIC_ACL=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_PROC_FS=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_MISC_FILESYSTEMS=y
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_ROOT_NFS=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_PARTITION_ADVANCED=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_UTF8=m
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_PRINTK_TIME=y
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_TIMER_STATS=y
CONFIG_STACKTRACE=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
CONFIG_LATENCYTOP=y
CONFIG_SYSCTL_SYSCALL_CHECK=y
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FTRACE_NMI_ENTER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_FTRACE_SYSCALLS=y
CONFIG_RING_BUFFER=y
CONFIG_FTRACE_NMI_ENTER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_SYSPROF_TRACER=y
CONFIG_BRANCH_PROFILE_NONE=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_HAVE_ARCH_KMEMCHECK=y
CONFIG_STRICT_DEVMEM=y
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_RODATA=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
CONFIG_DEFAULT_IO_DELAY_TYPE=0
CONFIG_SECURITY_FILE_CAPABILITIES=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CBC=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y
CONFIG_KVM_INTEL=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_RING=y
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=y
CONFIG_BINARY_PRINTF=y
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
CONFIG_CRC16=y
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
CONFIG_ZLIB_INFLATE=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y
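
When a problem shows up on one machine but not on another, as with the headless server mentioned above, comparing the two .config files can narrow the search. The following is only a minimal sketch: the helper names `parse_config` and `diff_configs` and both sample excerpts are assumptions made for illustration, not the real files from either machine.

```python
def parse_config(text):
    """Parse kernel .config text into a dict of option -> value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_"):
            options[name] = value.strip('"')
    return options


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    An option that is missing (or "not set") on one side shows up as None.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical excerpts, NOT the real files from either machine.
desktop = parse_config("CONFIG_SMP=y\nCONFIG_HZ=100\nCONFIG_VT=y\n")
server = parse_config("CONFIG_SMP=y\nCONFIG_HZ=250\n# CONFIG_VT is not set\n")

for name, (left, right) in diff_configs(desktop, server).items():
    print(f"{name}: {left} -> {right}")
```

Options that are identical on both sides are dropped, so only the candidates worth looking at remain.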

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 15:05                 ` Linus Torvalds
@ 2009-10-13 20:08                   ` Boyan
  2009-10-13 20:53                       ` Linus Torvalds
  0 siblings, 1 reply; 248+ messages in thread
From: Boyan @ 2009-10-13 20:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: "Frédéric L. W. Meunier",
	Justin P. Mattock, Nix, Alan Cox, Paul Fulghum,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi

Linus Torvalds wrote:
> The whole "CPU intensive" thing makes me wonder..

Once it has broken the first time and I have switched to a text console
and back to X, it is really easy to trigger: just "make modules_install"
is enough to stop the keyboard. In that state, starting a kernel compile
keeps the keyboard non-functional until the compile finishes.
I don't know the internals of X, but to me it looks like something in X
is broken: if the system is busy and X takes too long to "realize" that
a key has been pressed, it decides the keyboard is broken and "switches
it off"; switching to a text console and back to X then "switches it on"
again.

>
> Do you have 'CONFIG_PREEMPT' enabled? Normally, "CPU intensive" does not 
> at all increase the likelihood of any kernel races, but with kernel 
> preemption we may well hit some preemption point and switch away, and make 
> some race window much bigger.

Yes, CONFIG_PREEMPT=y

> 
> So if you do have CONFIG_PREEMPT on, try to turn it off and see if it 
> makes the problem go away. Also, are people seeing this always running SMP 
> kernels, or are there UP kernels out there too (on UP _without_ preemption 
> it is almost impossible to hit 99% of all race conditions, so if anybody 
> is running an UP kernel with no preemption, then I'd be very surprised if 
> it is a kernel issue).

My system is UP, Athlon XP, 1.83GHz, video ATI 9550. Now I've tested
with:
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set

and I couldn't trigger the problem.
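
For anyone else checking their own setup, here is a minimal sketch of reading the preemption model out of a .config. The function name `preempt_model` is an assumption made for illustration; the three option names are the ones shown above.

```python
def preempt_model(config_text):
    """Return the preemption option that is set to 'y', or None."""
    models = ("CONFIG_PREEMPT_NONE", "CONFIG_PREEMPT_VOLUNTARY", "CONFIG_PREEMPT")
    for line in config_text.splitlines():
        # "# CONFIG_FOO is not set" lines contain no "=", so they are skipped.
        name, sep, value = line.strip().partition("=")
        if sep and name in models and value == "y":
            return name
    return None


# The three lines from the test above.
tested = """\
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
"""
print(preempt_model(tested))
```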

> 
> But I also still wonder if it might be user-space races, and just the 
> timing differences in the kernel. I don't know the input layer in X well 
> enough, I'm wondering if things like composition engine/window manager 
> could screw up here. Is there some pattern to the X versions (and/or 
> window managers and composition engines)?

In my case the X version doesn't matter: 1.6.1 was the previous Fedora
11 X, and it worked for me for a couple of months without such problems.
In the middle of September they updated it to 1.6.4 (only X, not the
ati driver I'm using) and it started to behave really slowly on my
system. I see it as slower redrawing of windows, which is rather
irritating, and I thought the keyboard problem was related to this, but
then I tested with the older version and it was the same.
Finally, last weekend I found time to bisect this, and the result was
the commit mentioned earlier: e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
(pty: avoid forcing 'low_latency' tty flag).
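
For readers who have not bisected before: the bisection mentioned above is a binary search over the commit range, which `git bisect` automates. The following is only a sketch of the idea, not of git itself. The commit names, the `first_bad` value and the `is_bad` check are hypothetical stand-ins for "build, boot and see whether the keyboard dies".

```python
def bisect(commits, is_bad):
    """Return the first bad commit in `commits` (ordered oldest to newest).

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the culprit is at mid or earlier
        else:
            good = mid  # the culprit is after mid
    return commits[bad]


history = [f"c{i}" for i in range(16)]  # hypothetical commit names
first_bad = "c11"                       # hypothetical culprit
tests_run = []


def is_bad(commit):
    # Stand-in for building and testing a kernel at this commit.
    tests_run.append(commit)
    return history.index(commit) >= history.index(first_bad)


print(bisect(history, is_bad), "found after", len(tests_run), "tests")
```

Each test halves the remaining range, which is why even a large range between two releases needs only a modest number of builds.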

Composite is enabled in my X config, but I don't have compiz or
anything like it enabled. DRI is enabled.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14264] ehci problem - mouse dead on scroll
  2009-10-13 15:55       ` Volker Armin Hemmann
  (?)
@ 2009-10-13 20:39       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-13 20:39 UTC (permalink / raw)
  To: Volker Armin Hemmann
  Cc: Alan Stern, Linux Kernel Mailing List, Kernel Testers List,
	Oliver Neukum

On Tuesday 13 October 2009, Volker Armin Hemmann wrote:
> On Dienstag 13 Oktober 2009, Alan Stern wrote:
> > On Mon, 12 Oct 2009, Rafael J. Wysocki wrote:
> > > This message has been generated automatically as a part of a report
> > > of regressions introduced between 2.6.30 and 2.6.31.
> > >
> > > The following bug entry is on the current list of known regressions
> > > introduced between 2.6.30 and 2.6.31.  Please verify if it still should
> > > be listed and let me know (either way).
> > >
> > >
> > > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14264
> > > Subject		: ehci problem - mouse dead on scroll
> > > Submitter	: Volker Armin Hemmann <volkerarmin@googlemail.com>
> > > Date		: 2009-09-12 7:46 (30 days old)
> > > References	: http://marc.info/?l=linux-kernel&m=125274202707893&w=4
> > > Handled-By	: Alan Stern <stern@rowland.harvard.edu>
> > 
> > This is probably a hardware problem in the mouse or the Logitech
> > receiver.  It affected both EHCI and OHCI, and it was not reproducible
> > with a different mouse.  But Volker hasn't reported any results since
> > the end of September.
> > 
> > Volker, another good test would be to try plugging your mouse into
> > someone else's computer.
> > 
> > Alan Stern
> > 
> 
> yeah, that is a problem - I am pretty 'alone' in regard of linux users. I know 
> very few, and they have either only servers without X or run some stable 
> distributions with old kernels.
> 
> It is probably hardware related. I have tried two other mice and both were ok. 
> Both had a lesser resolution and were slower, but that shouldn't make any 
> difference.

OK, I'm closing the bug.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 20:53                       ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 20:53 UTC (permalink / raw)
  To: Boyan
  Cc: Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Tue, 13 Oct 2009, Boyan wrote:
> 
> Composite is enabled in my X config, but I don't have compiz or
> something like that enabled. DRI is enabled.

I think I may actually see the problem. And if I'm right, then the bug you 
guys bisected down to is really the fundamental reason. Embarrassing. I 
was so convinced it should only change flush timing, that I didn't think 
through all the possibilities.

The reason for thinking that it only really changes timing is fairly 
simple: the only thing it really does is to call "flush_to_ldisc()" 
synchronously when needed. On the face of it, that should be perfectly 
safe.

But flush_to_ldisc() itself has a real oddity: it uses "tty->buf.lock" to 
protect everything, BUT NOT THE ACTUAL CALL TO ->receive_buf()!

So even though that function looks _trivially_ atomic, once you look 
deeper it suddenly becomes clear how it's not really atomic at all: it 
will do all the buffer handling with the spinlock held, but then after it 
has figured out the buffer, it does:

	...
        spin_unlock_irqrestore(&tty->buf.lock, flags);
        disc->ops->receive_buf(tty, char_buf,
                                        flag_buf, count);
        spin_lock_irqsave(&tty->buf.lock, flags);
	...

and by releasing that lock it actually seems to break all the buffering 
guarantees! What can happen is:

	CPU1 (or thread1 with PREEMPTION)
					CPU2 (or thread2 with PREEMPTION)

	flush_to_ldisc()
	...
	spin_lock_irqsave(..)
	.. get one buffer..
	spin_unlock_irqrestore(..);

			<- PREEMPTION POINT, anything can happen ->
			<- more buffers can be added, etc ->

					flush_to_ldisc()
					spin_lock_irqsave(..)
					.. get second buffer..
					spin_unlock_irqrestore(..);
					->receive_buf(tty, char_buf, ...
					spin_lock_irqsave(..)
					.. all done ..


	->receive_buf(tty, char_buf, ...
	spin_lock_irqsave(...)

Notice how the "->receive_buf()" calls were done out of order, even if the 
data was perfectly in-order in the buffers.

And you can get the same race on SMP even without preemption, just thanks 
to CPUs hitting that lock just right. CONFIG_PREEMPT just makes it easier 
(probably _much_ easier) to trigger, and possible even on UP.
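
The interleaving in the diagram can be replayed deterministically in
user space. What follows is only a minimal sketch under assumptions:
struct model, take_buffer(), deliver() and replay_race() are made-up
names for illustration, not kernel code, and the "lock" is implied by
the order of the calls rather than modelled.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 2

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	int queue[NBUF];	/* buffers in arrival order */
	size_t head;		/* index of the next buffer to hand out */
	int delivered[NBUF];	/* order in which the ldisc saw them */
	size_t ndelivered;
};

/* Step done with the lock held: pick the next buffer off the queue. */
static int take_buffer(struct model *m)
{
	assert(m->head < NBUF);
	return m->queue[m->head++];
}

/* Step done after the lock is dropped: the ->receive_buf() call. */
static void deliver(struct model *m, int buf)
{
	assert(m->ndelivered < NBUF);
	m->delivered[m->ndelivered++] = buf;
}

/* Replays the diagram; returns 1 if delivery order matched queue order. */
static int replay_race(struct model *m)
{
	int first, second;

	first = take_buffer(m);		/* context 1 takes a buffer, unlocks */
	second = take_buffer(m);	/* context 2 runs, takes the next one */
	deliver(m, second);		/* context 2 delivers right away */
	deliver(m, first);		/* context 1 resumes and delivers late */

	return m->delivered[0] == m->queue[0] &&
	       m->delivered[1] == m->queue[1];
}
```

Calling replay_race() on a model whose queue holds two distinct values
returns 0: both buffers were taken in order, yet they reach the
consumer swapped.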

As far as I can tell, this is not really a new bug (it could have happened 
with low_latency before too), but on a tty without low_latency it would 
never happen until the commit you bisected to because the workqueue itself 
would serialize everything, and only one flush would ever be pending.

Anyway, the above explanation "feels right". It would easily explain the 
behavior, because if the ->receive_buf() calls get re-ordered, then the 
events get re-ordered, and one simple case of that would be to see the key 
"release" event before the key "press" event.

It also explains how that commit seems to be indicated so consistently. It 
still requires some specific timing, but now it's not timing _introduced_ 
by the commit, it's an old bug that that commit exposed, and then needs 
some unlucky timing to actually happen.

The sane fix would be to just run ->receive_buf() under the tty->buf.lock, 
but I assume we'd have a lot of unhappy ldiscs if we did that (and 
possibly irq latency problems too).

I think the

	tty->buf.head = NULL;
	...
	/* Restore the queue head */
	tty->buf.head = head;

around that loop is actually there to try to avoid this whole problem, but 
whoever did that didn't realize that there are other things that could set 
buf.head (in particular, tty_buffer_request_room() while the lock is 
dropped), so that whole logic is totally broken anyway and might even 
conspire to make the problem worse (ie if somebody tries to add data while 
->receive_buf() is running and the lock is gone, you are now _really_ 
screwing things up).

So instead of playing games with buf.head, I think we should just rely on 
the TTY_FLUSHING bit. I'm not _entirely_ happy with this, because now if 
we call flush_to_ldisc() while somebody else is busy flushing it, it will 
return early even though the flush hasn't completed yet. But that was 
always true to some degree (ie the "buffer full" case).
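
The "rely on the flag, return early if somebody else is flushing" idea
can be sketched outside the kernel with a C11 atomic_flag. This is an
illustration under assumptions, not the patch itself: try_flush(),
reenter() and the counters are hypothetical names, and the flag only
plays the role of TTY_FLUSHING.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of this is kernel code. */
static atomic_flag flushing = ATOMIC_FLAG_INIT;	/* plays TTY_FLUSHING */
static int active;		/* callers currently inside the flush body */
static int flushes_run;		/* how many calls actually flushed */

/* Returns true if this call flushed, false if a flush was already running. */
static bool try_flush(void (*body)(void))
{
	if (atomic_flag_test_and_set(&flushing))
		return false;	/* somebody else is busy flushing */

	active++;
	assert(active == 1);	/* the flag admits one flusher at a time */
	flushes_run++;
	if (body != NULL)
		body();		/* stands in for the buffer loop */
	active--;

	atomic_flag_clear(&flushing);
	return true;
}

static bool nested_result = true;

/* A body that calls back into the flush, like a second context would. */
static void reenter(void)
{
	nested_result = try_flush(NULL);
}
```

try_flush(reenter) returns true while the nested call it makes returns
false. That nested false is the drawback described above: the second
caller comes back before the first flush has finished.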

Anyway, I'm not entirely happy with this patch, and I haven't actually 
TESTED it so it might well be totally broken, but something along the 
lines of the appended may just fix it. It would be good if people who see 
this problem tried it out.

			Linus
---
 drivers/char/tty_buffer.c |   31 +++++++++++++------------------
 1 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 3108991..da59334 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -402,28 +402,24 @@ static void flush_to_ldisc(struct work_struct *work)
 		container_of(work, struct tty_struct, buf.work.work);
 	unsigned long 	flags;
 	struct tty_ldisc *disc;
-	struct tty_buffer *tbuf, *head;
-	char *char_buf;
-	unsigned char *flag_buf;
 
 	disc = tty_ldisc_ref(tty);
 	if (disc == NULL)	/*  !TTY_LDISC */
 		return;
 
 	spin_lock_irqsave(&tty->buf.lock, flags);
-	/* So we know a flush is running */
-	set_bit(TTY_FLUSHING, &tty->flags);
-	head = tty->buf.head;
-	if (head != NULL) {
-		tty->buf.head = NULL;
-		for (;;) {
-			int count = head->commit - head->read;
+
+	if (test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
+		struct tty_buffer *head;
+		while ((head = tty->buf.head) != NULL) {
+			int count;
+			char *char_buf;
+			unsigned char *flag_buf;
+
+			count = head->commit - head->read;
 			if (!count) {
-				if (head->next == NULL)
-					break;
-				tbuf = head;
-				head = head->next;
-				tty_buffer_free(tty, tbuf);
+				tty->buf.head = head->next;
+				tty_buffer_free(tty, head);
 				continue;
 			}
 			/* Ldisc or user is trying to flush the buffers
@@ -445,9 +441,9 @@ static void flush_to_ldisc(struct work_struct *work)
 							flag_buf, count);
 			spin_lock_irqsave(&tty->buf.lock, flags);
 		}
-		/* Restore the queue head */
-		tty->buf.head = head;
+		clear_bit(TTY_FLUSHING, &tty->flags);
 	}
+
 	/* We may have a deferred request to flush the input buffer,
 	   if so pull the chain under the lock and empty the queue */
 	if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
@@ -455,7 +451,6 @@ static void flush_to_ldisc(struct work_struct *work)
 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
 		wake_up(&tty->read_wait);
 	}
-	clear_bit(TTY_FLUSHING, &tty->flags);
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
 
 	tty_ldisc_deref(disc);

^ permalink raw reply related	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 21:02                         ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 21:02 UTC (permalink / raw)
  To: Boyan
  Cc: Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Tue, 13 Oct 2009, Linus Torvalds wrote:
> 
> Anyway, I'm not entirely happy with this patch, and I haven't actually 
> TESTED it so it might well be totally broken [..]

It is.

Looking over the patch a bit more, at a minimum that

	if (test_and_set_bit(TTY_FLUSHING, &tty->flags)) {

line should be

	if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {

(ie add the '!') since we want to do things if the bit wasn't set before 
(and if it was already set it all turns into a no-op).
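
The reason the '!' matters is that test_and_set_bit() returns the
value the bit had before the call. A small single-threaded sketch,
assuming a hypothetical model_test_and_set_bit() helper in place of
the kernel's atomic one and a made-up body_runs() driver, shows what
each form of the test does over repeated calls:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for the kernel's atomic helper. */
static bool model_test_and_set_bit(int nr, unsigned long *addr)
{
	unsigned long mask;
	bool old;

	assert(nr >= 0 && nr < (int)(8 * sizeof(*addr)));
	mask = 1UL << nr;
	old = (*addr & mask) != 0;
	*addr |= mask;
	return old;		/* the value the bit had before the call */
}

/* Counts how often the guarded body runs across back-to-back calls. */
static int body_runs(bool negate, int calls)
{
	unsigned long flags = 0;
	int runs = 0;
	int i;

	for (i = 0; i < calls; i++) {
		bool was_set = model_test_and_set_bit(0, &flags);

		if (negate ? !was_set : was_set) {
			runs++;		/* the flush loop would go here */
			flags &= ~1UL;	/* clear the bit at the end of the body */
		}
	}
	return runs;
}
```

In this model the negated test runs the body on every call, while the
un-negated test leaves the bit set on one call and only does the work
on the next, so the body runs every other time.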

But apart from that obvious typo, the patch still looks good even after 
looking it through a bit more. It's still TOTALLY UNTESTED, though!

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 20:53                       ` Linus Torvalds
  (?)
  (?)
@ 2009-10-13 21:13                       ` Linus Torvalds
  2009-10-14  0:55                           ` Frédéric L. W. Meunier
  2009-10-14  7:45                           ` Boyan
  -1 siblings, 2 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 21:13 UTC (permalink / raw)
  To: Boyan
  Cc: Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi


Another bug:

On Tue, 13 Oct 2009, Linus Torvalds wrote:
>  			if (!count) {
> -				if (head->next == NULL)
> -					break;

Those two lines should _not_ be deleted. I cleaned up a bit too much.

The rule is that we must not free the last buffer, because it's also going 
to be 'tail'.
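
A sketch of that rule on a toy list, under assumptions: struct buf,
struct buf_list, build() and drop_empty() are hypothetical and far
simpler than the real tty buffer code. The point is only that the
loop stops at the final buffer so that 'tail' never dangles.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of the buffer list; field names are illustrative. */
struct buf {
	struct buf *next;
	int used;		/* stands in for commit - read */
};

struct buf_list {
	struct buf *head;
	struct buf *tail;	/* producers append here */
};

/* Frees consumed buffers from the front but always keeps the final one. */
static void drop_empty(struct buf_list *l)
{
	struct buf *head;

	while ((head = l->head) != NULL) {
		if (head->used)
			break;		/* still has data to deliver */
		if (head->next == NULL)
			break;		/* last buffer: it is also l->tail */
		l->head = head->next;
		free(head);
	}
}

/* Builds a list of n empty buffers; returns 0 on success, -1 on failure. */
static int build(struct buf_list *l, int n)
{
	assert(n > 0);
	l->head = l->tail = NULL;
	while (n-- > 0) {
		struct buf *b = calloc(1, sizeof(*b));

		if (b == NULL)
			return -1;
		if (l->tail != NULL)
			l->tail->next = b;
		else
			l->head = b;
		l->tail = b;
	}
	return 0;
}
```

After drop_empty() on a list of three empty buffers, exactly one
buffer remains and head and tail both point at it.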

So here's a new version with that fixed (and the previous bug I already 
mentioned).

Whether it _works_ is still not clear. It might eat your pet goldfish, or 
make farting noises in your general direction. Or it might fix the bug. 
Who knows?

		Linus

---
 drivers/char/tty_buffer.c |   29 +++++++++++++----------------
 1 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 3108991..0296612 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -402,28 +402,26 @@ static void flush_to_ldisc(struct work_struct *work)
 		container_of(work, struct tty_struct, buf.work.work);
 	unsigned long 	flags;
 	struct tty_ldisc *disc;
-	struct tty_buffer *tbuf, *head;
-	char *char_buf;
-	unsigned char *flag_buf;
 
 	disc = tty_ldisc_ref(tty);
 	if (disc == NULL)	/*  !TTY_LDISC */
 		return;
 
 	spin_lock_irqsave(&tty->buf.lock, flags);
-	/* So we know a flush is running */
-	set_bit(TTY_FLUSHING, &tty->flags);
-	head = tty->buf.head;
-	if (head != NULL) {
-		tty->buf.head = NULL;
-		for (;;) {
-			int count = head->commit - head->read;
+
+	if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
+		struct tty_buffer *head;
+		while ((head = tty->buf.head) != NULL) {
+			int count;
+			char *char_buf;
+			unsigned char *flag_buf;
+
+			count = head->commit - head->read;
 			if (!count) {
 				if (head->next == NULL)
 					break;
-				tbuf = head;
-				head = head->next;
-				tty_buffer_free(tty, tbuf);
+				tty->buf.head = head->next;
+				tty_buffer_free(tty, head);
 				continue;
 			}
 			/* Ldisc or user is trying to flush the buffers
@@ -445,9 +443,9 @@ static void flush_to_ldisc(struct work_struct *work)
 							flag_buf, count);
 			spin_lock_irqsave(&tty->buf.lock, flags);
 		}
-		/* Restore the queue head */
-		tty->buf.head = head;
+		clear_bit(TTY_FLUSHING, &tty->flags);
 	}
+
 	/* We may have a deferred request to flush the input buffer,
 	   if so pull the chain under the lock and empty the queue */
 	if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
@@ -455,7 +453,6 @@ static void flush_to_ldisc(struct work_struct *work)
 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
 		wake_up(&tty->read_wait);
 	}
-	clear_bit(TTY_FLUSHING, &tty->flags);
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
 
 	tty_ldisc_deref(disc);

^ permalink raw reply related	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 21:32                         ` Alan Cox
  0 siblings, 0 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-13 21:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Paul Fulghum, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi

> But flush_to_ldisc() itself has a real oddity: it uses "tty->buf.lock" to 
> protect everything, BUT NOT THE ACTUAL CALL TO ->receive_buf()!

Indeed or it deadlocks

> Anyway, the above explanation "feels right". It would easily explain the 
> behavior, because if the ->receive_buf() calls get re-ordered, then the 
> events get re-ordered, and one simple case of that would be to see the key 
> "release" event before the key "press" event.

And you would only see it in X11 because only X11 deals in raw key events.

> The sane fix would be to just run ->receive_buf() under the tty->buf.lock, 
> but I assume we'd have a lot of unhappy ldiscs if we did that (and 
> possibly irq latency problems too).

You bet

However there is nothing stopping you stuffing that lot into a per tty
mutex solely used for serializing those submissions. It can't really be a
mutex for anything else as we call back into the ldisc to send stuff. You
aren't allowed to stuff data into the ldisc unless it can sleep so a
mutex is fine.

I can't help feeling a mutex might be simpler. It would also then fix
tiocsti() which is most definitely broken right now and documented as
racing.
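
A user-space sketch of the per-tty mutex idea, using POSIX threads.
Everything here is an assumption made for illustration: struct port,
deliver_lock, flusher() and run_two_flushers() are invented names, and
the model folds the buffer lock and the delivery mutex into one lock
for brevity, which the kernel could not do as-is.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NITEMS 1000

/* Hypothetical model; the mutex name and layout are assumptions. */
struct port {
	pthread_mutex_t deliver_lock;	/* serializes take + deliver */
	size_t next;			/* next queued item to hand out */
	int delivered[NITEMS];
	size_t ndelivered;
};

/* One flusher; several may run concurrently on the same port. */
static void *flusher(void *arg)
{
	struct port *p = arg;

	for (;;) {
		pthread_mutex_lock(&p->deliver_lock);
		if (p->next == NITEMS) {
			pthread_mutex_unlock(&p->deliver_lock);
			break;
		}
		/* Take and deliver under one lock, so no other flusher
		 * can slip in between the two steps. */
		assert(p->ndelivered < NITEMS);
		p->delivered[p->ndelivered++] = (int)p->next++;
		pthread_mutex_unlock(&p->deliver_lock);
	}
	return NULL;
}

/* Runs two flushers; returns 1 if in order, 0 if not, -1 on error. */
static int run_two_flushers(struct port *p)
{
	pthread_t a, b;
	size_t i;
	int ok = 1;

	p->next = 0;
	p->ndelivered = 0;
	if (pthread_mutex_init(&p->deliver_lock, NULL))
		return -1;
	if (pthread_create(&a, NULL, flusher, p)) {
		pthread_mutex_destroy(&p->deliver_lock);
		return -1;
	}
	if (pthread_create(&b, NULL, flusher, p))
		ok = -1;
	else
		pthread_join(b, NULL);
	pthread_join(a, NULL);
	pthread_mutex_destroy(&p->deliver_lock);
	if (ok < 0)
		return -1;

	for (i = 0; i < NITEMS; i++)
		if (p->delivered[i] != (int)i)
			return 0;
	return p->ndelivered == NITEMS;
}
```

Build with -pthread. Whichever thread wins the lock at each step, the
items arrive in queue order, because picking an item and handing it
over are a single critical section.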

Alan

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 20:53                       ` Linus Torvalds
                                         ` (3 preceding siblings ...)
  (?)
@ 2009-10-13 21:46                       ` Paul Fulghum
  2009-10-13 22:42                           ` Linus Torvalds
  -1 siblings, 1 reply; 248+ messages in thread
From: Paul Fulghum @ 2009-10-13 21:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, "Frédéric L. W. Meunier",
	Justin P. Mattock, Nix, Alan Cox, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

Linus Torvalds wrote:

> and by releasing that lock it actually seems to break all the buffering 
> guarantees! What can happen is:
> 
> 	CPU1 (or thread1 with PREEMPTION)
> 					CPU2 (or thread2 with PREEMPTION)
> 
> 	flush_to_ldisc()
> 	...
> 	spin_lock_irqsave(..)
> 	.. get one buffer..
> 	spin_unlock_irqrestore(..);
> 
> 			<- PREEMPTION POINT, anything can happen ->
> 			<- more buffers can be added, etc ->
> 
> 					flush_to_ldisc()
> 					spin_lock_irqsave(..)
> 					.. get second buffer..
> 					spin_unlock_irqrestore(..);
> 					->receive_buf(tty, char_buf, ...
> 					spin_lock_irqsave(..)
> 					.. all done ..
> 
> 
> 	->receive_buf(tty, char_buf, ...
> 	spin_lock_irqsave(...)
> 
> Notice how the "->receive_buf()" calls were done out of order, even if the 
> data was perfectly in-order in the buffers.

The buffer head is removed and set to null just before
the flushing loop.

If flush_to_ldisc() is reentered with the head set to null, nothing
is done. New buffers can be added where you say, but they are
added to the tail. So the order of flushed data is retained.

This existing mechanism essentially does the same thing as your patch.


-- 
Paul Fulghum
MicroGate Systems, Ltd.
=Customer Driven, by Design=
(800)444-1982
(512)345-7791 (Direct)
(512)343-9046 (Fax)
Central Time Zone (GMT -5h)
www.microgate.com

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 22:42                           ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 22:42 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Boyan, "Frédéric L. W. Meunier",
	Justin P. Mattock, Nix, Alan Cox, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Tue, 13 Oct 2009, Paul Fulghum wrote:
> 
> If flush_to_ldisc() is reentered with the head set to null, nothing
> is done. New buffers can be added where you say, but they are
> added to the tail. So the order of flushed data is retained.

They are added to the tail only if the tail is non-NULL.

And buf.tail, in turn, is protected by the TTY_FLUSHING bit.

And look what happens to TTY_FLUSHING if flush_to_ldisc() is called by 
multiple contexts - it doesn't nest right. The inner "flush_to_ldisc()" 
will clear the bit (your "nothing is done" case).
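
A tiny model of that nesting problem, as a sketch only: old_flush()
and the two flags are hypothetical, and the recursion merely stands in
for a second context entering while the first is mid-flush. It mimics
the unconditional set at the top and clear at the bottom of the
pre-patch function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pre-patch flag handling; not kernel code. */
static bool flushing;			/* stands in for TTY_FLUSHING */
static bool seen_clear_mid_flush;

static void old_flush(int depth)
{
	assert(depth == 0 || depth == 1);

	flushing = true;		/* set_bit(): unconditional */
	if (depth == 0) {
		old_flush(1);		/* a second caller arrives mid-flush */
		/* The outer flush is still running; record what the flag says. */
		seen_clear_mid_flush = !flushing;
	}
	flushing = false;		/* clear_bit(): also unconditional */
}
```

After old_flush(0), seen_clear_mid_flush is true: the inner call did
no work but still cleared the flag while the outer flush was under
way.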

Now, I agree that we can solve things differently. We could, for example, 
get rid of TTY_FLUSHING entirely. If you want to keep the crazy "head = 
NULL" special case, we could basically replace all tests of TTY_FLUSHING 
with "tty->buf.tail && !tty->buf.head" instead, and use _that_ as a "the 
TTY is in the middle of a flush" operation. That should be 100% equivalent 
to my patch.

I do object to the whole crazy subtle TTY locking. I'm convinced it's 
wrong, and I'm convinced it's wrong exactly _because_ it tries to be so 
subtle and does non-obvious things.

That's why my patch also changed the whole loop logic: it's not subtle any 
more. Not only did I make TTY_FLUSHING nest correctly, I also stopped 
playing games with buf.head: it's now purely a list, rather than "a list 
and a failed attempt to lock".

And no, I'm not sure my patch helps. I'd have expected 
'tty_buffer_flush()' to be something very rare, for example. But I also 
didn't really check if we may do it some other way.

But I _am_ sure that it makes the code a whole lot more straightforward. 
Bits that say "we're busy flushing" suddenly actually act that way, and 
pointers that say "this is the head of the buffers" also act that way.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-13 21:32                         ` Alan Cox
  (?)
@ 2009-10-13 22:54                         ` Linus Torvalds
  2009-10-13 23:11                             ` Alan Cox
  -1 siblings, 1 reply; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 22:54 UTC (permalink / raw)
  To: Alan Cox
  Cc: Boyan, Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Paul Fulghum, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Alan Cox wrote:
> 
> I can't help feeling a mutex might be simpler. It would also then fix
> tiocsti() which is most definitely broken right now and documented as
> racing.

Hmm. Those tty's have too many different locks already.

But maybe we could just have one generic mutex, and use it for termios and 
IO locking. It makes perfect sense to serialize the ->receive_buf() code 
with any termios changes, since termios is what affects _how_ that 
->receive_buf() function works.

I do wonder why tiocsti() doesn't just use the tty buffering layer, 
though? Maybe that harks back to the whole "pty's did things differently" 
thing? Why does it go directly to ->receive_buf() in the first place?

And let's see if my patch even makes a difference. Maybe the breakage is 
somewhere else. The "oh, now we call flush_to_ldisc() from two different 
contexts" thing seems to be a promising lead, but ...

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 23:01                             ` Alan Cox
  0 siblings, 0 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-13 23:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Fulghum, Boyan,  Frédéric L. W. Meunier,
	Justin P. Mattock, Nix, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

> And no, I'm not sure my patch helps. I'd have expected 
> 'tty_buffer_flush()' to be something very rare, for example. But I also 
> didn't really check if we may do it some other way.

It is rare for most applications.

> But I _am_ sure that it makes the code a whole lot more straightforward. 
> Bits that say "we're busy flushing" suddenly actually act that way, and 
> pointers that say "this is the head of the buffers" also act that way.

The more I look the more I think a mutex is the right answer. It also
provides us with a "stop feeding me" lock for ldisc changes and tty close
down bits.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 23:11                             ` Alan Cox
  0 siblings, 0 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-13 23:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Paul Fulghum, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi

> > I can't help feeling a mutex might be simpler. It would also then fix
> > tiocsti() which is most definitely broken right now and documented as
> > racing.
> 
> Hmm. Those tty's have too many different locks already.
> 
> But maybe we could just have one generic mutex, and use it for termios and 
> IO locking. It makes perfect sense to serialize the ->receive_buf() code 
> with any termios changes, since termios is what affects _how_ that 
> ->receive_buf() function works.

You cannot trivially just take the same lock for receive_buf and termios
locking at the moment. The reason is that receive_buf can cause the tty to
throttle which causes us to call the throttle methods which take the lock.

tty_throttle() and tty_unthrottle() can be called from both receive_buf
and non receive_buf paths so you can't just remove the locking from them.
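
A rough user-space sketch of that call chain, under loud assumptions: the
toy_* names are invented for illustration, and toy_lock() reports the
self-deadlock by returning false where a real non-recursive mutex would
simply hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock (assumption: stands in for a kernel mutex). */
struct toy_mutex { bool held; };

static bool toy_lock(struct toy_mutex *m)
{
	if (m->held)
		return false;	/* already held by this context: deadlock */
	m->held = true;
	return true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);	/* unlocking an unheld lock is a bug */
	m->held = false;
}

/* Stand-in for the throttle path, which takes the termios lock itself. */
static bool toy_throttle(struct toy_mutex *termios_lock)
{
	if (!toy_lock(termios_lock))
		return false;
	/* ... tell the driver to stop sending ... */
	toy_unlock(termios_lock);
	return true;
}

/* Stand-in for ->receive_buf(): the input fills up, so we throttle. */
static bool toy_receive_buf(struct toy_mutex *termios_lock)
{
	return toy_throttle(termios_lock);
}

/* Calls receive_buf with or without the termios lock already held.
 * Returns true if the chain completed, false if it would deadlock. */
static bool toy_flush(struct toy_mutex *termios_lock, bool hold_lock)
{
	bool ok;

	if (hold_lock && !toy_lock(termios_lock))
		return false;
	ok = toy_receive_buf(termios_lock);
	if (hold_lock)
		toy_unlock(termios_lock);
	return ok;
}
```

Calling toy_flush() with hold_lock set to false completes, while setting it
to true fails inside toy_throttle(), which is the nesting described above.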

The better existing lock is probably tty->ldisc_mutex which we take when
doing ldisc changes (which are an even more dramatic change during
receive_buf). We don't do ldisc changes from the receive_buf path and it
opens a path for further simplification of the ldisc logic if we can get
to the point where the ldisc doesn't get called randomly from the tty
layer when changing.

> I do wonder why tiocsti() doesn't just use the tty buffering layer, 
> though? Maybe that harks back to the whole "pty's did things differently" 
> thing? Why does it go directly to ->receive_buf() in the first place?

Historical question. I don't know, and at the time I commented on it there
was no quick fix and there were bigger problems to sort out first.

Alan

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-13 23:16                               ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-13 23:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Boyan, Frédéric L. W. Meunier, Justin P. Mattock, Nix,
	Paul Fulghum, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi



On Wed, 14 Oct 2009, Alan Cox wrote:
> 
> The better existing lock is probably tty->ldisc_mutex which we take when
> doing ldisc changes (which are an even more dramatic change during
> receive_buf).

Yeah, that makes sense. And then we'd automatically also solve the 
"somebody tries to write during ldisc changes" issue. Not that I've 
checked how much it could help, but maybe we could get rid of _some_ of 
the special "tty_get_ldisc_wait()" kind of hacks.

And having that then protect flushing too would get rid of the 
TTY_FLUSHING and TTY_FLUSHPENDING logic. So it does smell like a good 
solution (without me looking at the code any closer right now, I can't 
take any more tty code reading just now ;)

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  0:08                             ` Paul Fulghum
  0 siblings, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-14  0:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, "Frédéric L. W. Meunier",
	Justin P. Mattock, Nix, Alan Cox, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

On Tue, 2009-10-13 at 15:42 -0700, Linus Torvalds wrote:

> They are added to the tail only if the tail is non-NULL.

True

tail is null only after:
* initialization - no flush_to_ldisc in progress
* tty_buffer_flush - protected from flush_to_ldisc by TTY_FLUSHING

flush_to_ldisc does not currently set tail to null after
flushing data from the last buffer. Each buffer is marked
individually as consumed so no more data will be added to it.
The buffer is marked empty again as it is recycled.

However, looking at this does reveal a problem.

If an individual buffer of 512 bytes or larger gets into the
free and full lists, that buffer is kfreed in tty_buffer_free()
instead of being recycled. That means that tty->buf.tail can
point to a freed block of memory, at least until another buffer
is allocated.

If that buffer is written to while in this state, the fields
will become incoherent and may result in more data being added to it.

I think the patch below fixes this problem. It sets tail to null
when all buffers are flushed. This is only executed after the buffer
has been passed to the ldisc and the spinlock is held so there is
no place for more data to be added incorrectly. I will test it myself
tomorrow when I get back to the office.

> I do object to the whole crazy subtle TTY locking. I'm convinced it's 
> wrong, and I'm convinced it's wrong exactly _because_ it tries to be so 
> subtle and does non-obvious things.

I understand. The patch below should fix the hole above, and I'm not
aware of any other hole. But if you prefer reworking the locking
to make things more obvious, I have no objection.

--- a/drivers/char/tty_buffer.c	2009-09-09 17:13:59.000000000 -0500
+++ b/drivers/char/tty_buffer.c	2009-10-13 18:34:34.000000000 -0500
@@ -423,6 +423,8 @@ static void flush_to_ldisc(struct work_s
 					break;
 				tbuf = head;
 				head = head->next;
+				if (!head)
+					tty->buf.tail = NULL;
 				tty_buffer_free(tty, tbuf);
 				continue;
 			}



^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  0:55                           ` Frédéric L. W. Meunier
  0 siblings, 0 replies; 248+ messages in thread
From: Frédéric L. W. Meunier @ 2009-10-14  0:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, Justin P. Mattock, Nix, Alan Cox, Paul Fulghum,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi

On Tue, 13 Oct 2009, Linus Torvalds wrote:

>
> Another bug:
>
> On Tue, 13 Oct 2009, Linus Torvalds wrote:
>>  			if (!count) {
>> -				if (head->next == NULL)
>> -					break;
>
> Those two lines should _not_ be deleted. I cleaned up a bit too much.
>
> The rule is that we must not free the last buffer, because it's also going
> to be 'tail'.
>
> So here's a new version with that fixed (and the previous bug I already
> mentioned).
>
> Whether it _works_ is still not clear. It might eat your pet goldfish, or
> make farting noises in your general direction. Or it might fix the bug.
> Who knows?
>
> 		Linus
>
> ---
> drivers/char/tty_buffer.c |   29 +++++++++++++----------------
> 1 files changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
> index 3108991..0296612 100644
> --- a/drivers/char/tty_buffer.c
> +++ b/drivers/char/tty_buffer.c
> @@ -402,28 +402,26 @@ static void flush_to_ldisc(struct work_struct *work)
> 		container_of(work, struct tty_struct, buf.work.work);
> 	unsigned long 	flags;
> 	struct tty_ldisc *disc;
> -	struct tty_buffer *tbuf, *head;
> -	char *char_buf;
> -	unsigned char *flag_buf;
>
> 	disc = tty_ldisc_ref(tty);
> 	if (disc == NULL)	/*  !TTY_LDISC */
> 		return;
>
> 	spin_lock_irqsave(&tty->buf.lock, flags);
> -	/* So we know a flush is running */
> -	set_bit(TTY_FLUSHING, &tty->flags);
> -	head = tty->buf.head;
> -	if (head != NULL) {
> -		tty->buf.head = NULL;
> -		for (;;) {
> -			int count = head->commit - head->read;
> +
> +	if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
> +		struct tty_buffer *head;
> +		while ((head = tty->buf.head) != NULL) {
> +			int count;
> +			char *char_buf;
> +			unsigned char *flag_buf;
> +
> +			count = head->commit - head->read;
> 			if (!count) {
> 				if (head->next == NULL)
> 					break;
> -				tbuf = head;
> -				head = head->next;
> -				tty_buffer_free(tty, tbuf);
> +				tty->buf.head = head->next;
> +				tty_buffer_free(tty, head);
> 				continue;
> 			}
> 			/* Ldisc or user is trying to flush the buffers
> @@ -445,9 +443,9 @@ static void flush_to_ldisc(struct work_struct *work)
> 							flag_buf, count);
> 			spin_lock_irqsave(&tty->buf.lock, flags);
> 		}
> -		/* Restore the queue head */
> -		tty->buf.head = head;
> +		clear_bit(TTY_FLUSHING, &tty->flags);
> 	}
> +
> 	/* We may have a deferred request to flush the input buffer,
> 	   if so pull the chain under the lock and empty the queue */
> 	if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
> @@ -455,7 +453,6 @@ static void flush_to_ldisc(struct work_struct *work)
> 		clear_bit(TTY_FLUSHPENDING, &tty->flags);
> 		wake_up(&tty->read_wait);
> 	}
> -	clear_bit(TTY_FLUSHING, &tty->flags);
> 	spin_unlock_irqrestore(&tty->buf.lock, flags);
>
> 	tty_ldisc_deref(disc);

For now (more than 3h), it isn't doing any harm. And no keyboard 
lockups.

BTW, the old version of the patch was funny. It booted, but at 
the login prompt I could only enter the first letter.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  1:03                                 ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14  1:03 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Boyan, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Paul Fulghum wrote:
> 
> This is correct, the last buffer is not passed to tty_buffer_free()
> if it is the last in the list so tail is maintained.
> There is no free space in it so no new data can be added.
> There is no place where tail is null while the spinlock
> is released in preparation for calling receive_buf.
> I still can't spot any flaw in the current locking.

Do you even bother reading my emails?

Let me walk through an example of where the locking F*CKS UP, exactly 
because it's broken.

	thread1		thread2		thread3

	flush_to_ldisc
	set_bit(TTY_FLUSHING)
	buf.head = NULL
	...
	..release lock..
	.. sleep in ->receive_buf ..

			flush_to_ldisc
			set_bit(TTY_FLUSHING)
			.. head==NULL ..
			clear_bit(TTY_FLUSHING)
			.. release lock ..

					tty_ldisc_flush()
					-> tty_buffer_flush()
					TTY_FLUSHING not set!
					-> __tty_buffer_flush()
					-> tty->buf.tail = NULL

and now you're screwed. See? You have 'buf.tail' and 'buf.head' both 
being NULL, and look what happens in 'tty_buffer_request_room()' in that 
case if some new data comes in. Right: it will add the buffer to both tail 
and head.

And notice how 'thread1' is still inside flush_to_ldisc()! The buffer that 
got added will be overwritten by the old one, and now tail and head no 
longer match. Or another flush_to_ldisc() comes in, and now it won't be a 
no-op any more, and it will find the new data, and run ->receive_buf 
concurrently with the old receive_buf from thread1.

And the whole reason was that there were some very odd locking rules: 
buf.head=NULL meant "don't flush", and "TTY_FLUSHING is set" meant "don't 
clear 'buf.head'", but the "don't flush" case still cleared 
TTY_FLUSHING (after not flushing), and it all messed up.

I could just have fixed it (move the "clear_bit(TTY_FLUSHING)" up), but 
the fact is, once you fix that, it then becomes obvious that 
"buf.head=NULL" really is the wrong thing to test in the first place, and 
we should just use TTY_FLUSHING instead, and simply _remove_ the odd 
"buf.head=NULL is special" case. Which is what my patch did.
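
The nesting problem can be replayed in a small user-space toy model. This is 
only a sketch under stated assumptions: struct toy_tty and the old_/new_ 
helpers are made-up names, one bool stands in for the TTY_FLUSHING bit, and 
the spinlock and the buffers themselves are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the flag handling only. */
struct toy_tty {
	bool flushing;		/* stands in for TTY_FLUSHING */
	bool head_detached;	/* old scheme: buf.head == NULL mid-flush */
};

/* Old scheme, entry: always set the bit, then try to detach head.
 * Returns true if this caller owns the flush, false for the
 * "nothing to do" case. */
static bool old_flush_enter(struct toy_tty *t)
{
	t->flushing = true;
	if (t->head_detached)
		return false;	/* somebody else is mid-flush */
	t->head_detached = true;
	return true;
}

/* Old scheme, exit: every caller clears the bit, owner or not. */
static void old_flush_exit(struct toy_tty *t, bool owner)
{
	if (owner)
		t->head_detached = false;
	t->flushing = false;
}

/* New scheme: test-and-set on entry, only the owner clears on exit. */
static bool new_flush_enter(struct toy_tty *t)
{
	if (t->flushing)
		return false;	/* bit was already set */
	t->flushing = true;
	return true;
}

static void new_flush_exit(struct toy_tty *t, bool owner)
{
	if (owner)
		t->flushing = false;
}

/* Replays the interleaving: thread1 enters and sleeps in ->receive_buf,
 * thread2 enters and leaves. Returns what thread3 would then read from
 * the flushing bit while thread1 is still inside the flush. */
static bool flushing_seen_by_thread3(bool use_new_scheme)
{
	struct toy_tty t = { false, false };
	bool t1, t2;

	if (use_new_scheme) {
		t1 = new_flush_enter(&t);
		t2 = new_flush_enter(&t);
		new_flush_exit(&t, t2);
	} else {
		t1 = old_flush_enter(&t);
		t2 = old_flush_enter(&t);
		old_flush_exit(&t, t2);
	}
	assert(t1 && !t2);	/* thread1 owns the flush in both schemes */
	return t.flushing;
}
```

With the old helpers thread3 reads the bit as clear even though thread1 is 
still flushing; with the test-and-set variant it stays set until the owner 
leaves.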

> Your statement that the locking is too clever/subtle is
> clearly true since I am struggling to work this out again.

I have to say that the only case I could make up that is _clearly_ a bug 
is the above very contrived example. I don't really think something like 
the above happens in reality. But it's an example of bad locking, and what 
happens when the locking logic isn't obvious.

There may be other cases where the locking fails, and I just didn't find 
them. 

Or the patch may simply not fix anything in practice, and nobody has ever 
actually triggered the bad locking in real life. I dunno. I just do know 
that the locking was too damn subtle.

Any time people do ad-hoc locking with "clever" schemes, it's almost 
invariably buggy. So the rule is: just don't do that. Make the locking 
rules "obvious".  Don't have subtle rules about "if head is NULL, then 
we're not going to add any new buffers to it, except if tail is also 
NULL". Because look above what happens, and see how complicated it was to 
even see the bug.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  1:05                                   ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14  1:05 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Boyan, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi


Oops, you'll probably get this twice, because 'alpine' core-dumped on me 
and I'm not sure the first one actually made it out. 

		Linus

On Tue, 13 Oct 2009, Linus Torvalds wrote:
> 
> 
> On Tue, 13 Oct 2009, Paul Fulghum wrote:
> > 
> > This is correct, the last buffer is not passed to tty_buffer_free()
> > if it is the last in the list so tail is maintained.
> > There is no free space in it so no new data can be added.
> > There is no place where tail is null while the spinlock
> > is released in preparation for calling receive_buf.
> > I still can't spot any flaw in the current locking.
> 
> Do you even bother reading my emails?
> 
> Let me walk through an example of where the locking F*CKS UP, exactly 
> because it's broken.
> 
> 	thread1		thread2		thread3
> 
> 	flush_to_ldisc
> 	set_bit(TTY_FLUSHING)
> 	buf.head = NULL
> 	...
> 	..release lock..
> 	.. sleep in ->receive_buf ..
> 
> 			flush_to_ldisc
> 			set_bit(TTY_FLUSHING)
> 			.. head==NULL ..
> 			clear_bit(TTY_FLUSHING)
> 			.. release lock ..
> 
> 					tty_ldisc_flush()
> 					-> tty_buffer_flush()
> 					TTY_FLUSHING not set!
> 					-> __tty_buffer_flush()
> 					-> tty->buf.tail = NULL
> 
> and now you're screwed. See? You have 'buf.tail' and 'buf.head' both 
> being NULL, and look what happens in 'tty_buffer_request_room()' in that 
> case if some new data comes in. Right: it will add the buffer to both tail 
> and head.
> 
> And notice how 'thread1' is still inside flush_to_ldisc()! The buffer that 
> got added will be overwritten by the old one, and now tail and head no 
> longer match. Or another flush_to_ldisc() comes in, and now it won't be a 
> no-op any more, and it will find the new data, and run ->receive_buf 
> concurrently with the old receive_buf from thread1.
> 
> And the whole reason was that there were some very odd locking rules: 
> buf.head=NULL meant "don't flush", and "TTY_FLUSHING is set" meant "don't 
> clear 'buf.head'", but the "don't flush" case still cleared 
> TTY_FLUSHING (after not flushing), and it all messed up.
> 
> I could just have fixed it (move the "clear_bit(TTY_FLUSHING)" up), but 
> the fact is, once you fix that, it then becomes obvious that 
> "buf.head=NULL" really is the wrong thing to test in the first place, and 
> we should just use TTY_FLUSHING instead, and simply _remove_ the odd 
> "buf.head=NULL is special" case. Which is what my patch did.
> 
> > Your statement that the locking is too clever/subtle is
> > clearly true since I am struggling to work this out again.
> 
> I have to say that the only case I could make up that is _clearly_ a bug 
> is the above very contrived example. I don't really think something like 
> the above happens in reality. But it's an example of bad locking, and what 
> happens when the locking logic isn't obvious.
> 
> There may be other cases where the locking fails, and I just didn't find 
> them. 
> 
> Or the patch may simply not fix anything in practice, and nobody has ever 
> actually triggered the bad locking in real life. I dunno. I just do know 
> that the locking was too damn subtle.
> 
> Any time people do ad-hoc locking with "clever" schemes, it's almost 
> invariably buggy. So the rule is: just don't do that. Make the locking 
> rules "obvious".  Don't have subtle rules about "if head is NULL, then 
> we're not going to add any new buffers to it, except if tail is also 
> NULL". Because look above what happens, and see how complicated it was to 
> even see the bug.
> 
> 			Linus
> 

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  1:05                                   ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14  1:05 UTC (permalink / raw)
  To: Paul Fulghum
  Cc: Boyan, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi


Oops, you'll probably get this twice, because 'alpine' core-dumped on me 
and I'm not sure the first one actually made it out. 

		Linus

On Tue, 13 Oct 2009, Linus Torvalds wrote:
> 
> 
> On Tue, 13 Oct 2009, Paul Fulghum wrote:
> > 
> > This is correct, the last buffer is not passed to tty_buffer_free()
> > if it is the last in the list so tail is maintained.
> > There is no free space in it so no new data can be added.
> > There is no place where tail is null while the spinlock
> > is released in preparation for calling receive_buf.
> > I still can't spot any flaw in the current locking.
> 
> Do you even bother reading my emails?
> 
> Let me walk through an example of where the locking F*CKS UP, exactly 
> because it's broken.
> 
> 	thread1		thread2		thread3
> 
> 	flush_to_ldisc
> 	set_bit(TTY_FLUSHING)
> 	buf.head = NULL
> 	...
> 	..release lock..
> 	.. sleep in ->receive_buf ..
> 
> 			flush_to_ldisc
> 			set_bit(TTY_FLUSHING)
> 			.. head==NULL ..
> 			clear_bit(TTY_FLUSHING)
> 			.. release lock ..
> 
> 					tty_ldisc_flush()
> 					-> tty_buffer_flush()
> 					TTY_FLUSHING not set!
> 					-> __tty_buffer_flush()
> 					-> tty->buf.tail = NULL
> 
> and now you're screwed. See? You have 'buf.tail' and 'buf.head' both 
> being NULL, and look what happens in that case in 'tty_buffer_request_room()' 
> if some new data comes in. Right: it will add the buffer to both tail and 
> head.
> 
> And notice how 'thread1' is still inside flush_to_ldisc()! The buffer that 
> got added will be overwritten by the old one, and now tail and head no 
> longer match. Or another flush_to_ldisc() comes in, and now it won't be a 
> no-op any more, and it will find the new data, and run ->receive_buf 
> concurrently with the old receive_buf from thread1.
> 
> And the whole reason was that there were some very odd locking rules: 
> buf.head=NULL meant "don't flush", and "TTY_FLUSHING is set" meant "don't 
> clear 'buf.head'", but the "don't flush" case still cleared 
> TTY_FLUSHING (after not flushing), and it all messed up.
> 
> I could just have fixed it (move the "clear_bit(TTY_FLUSHING)" bit up), but 
> the fact is, once you fix that, it then becomes obvious that 
> "buf.head=NULL" really is the wrong thing to test in the first place, and 
> we should just use TTY_FLUSHING instead, and simply _remove_ the odd 
> "buf.head=NULL is special" case. Which is what my patch did.
> 
> > Your statement that the locking is too clever/subtle is
> > clearly true since I am struggling to work this out again.
> 
> I have to say that the only case I could make up that is _clearly_ a bug 
> is the above very contrived example. I don't really think something like 
> the above happens in reality. But it's an example of bad locking, and what 
> happens when the locking logic isn't obvious.
> 
> There may be other cases where the locking fails, and I just didn't find 
> them. 
> 
> Or the patch may simply not fix anything in practice, and nobody has ever 
> actually triggered the bad locking in real life. I dunno. I just do know 
> that the locking was too damn subtle.
> 
> Any time people do ad-hoc locking with "clever" schemes, it's almost 
> invariably buggy. So the rule is: just don't do that. Make the locking 
> rules "obvious".  Don't have subtle rules about "if head is NULL, then 
> we're not going to add any new buffers to it, except if tail is also 
> NULL". Because look above what happens, and see how complicated it was to 
> even see the bug.
> 
> 			Linus
> 
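
For anyone who wants to step through the interleaving quoted above, here is a
small single-threaded user-space replay of it. This is only a sketch under
heavy assumptions: 'struct queue', old_flush_begin(), old_flush_end(),
buffer_flush(), add_buffer() and replay_race() are made-up stand-ins for the
pre-patch logic, there is no real locking or freeing, and the three "threads"
are just calls made in the order of the diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the tty buffer queue; not kernel code. */
struct buf { struct buf *next; };

struct queue {
    struct buf *head;
    struct buf *tail;
    bool flushing;              /* stands in for the TTY_FLUSHING bit */
};

/* Old flush_to_ldisc() entry: set the flag and detach head (returned). */
static struct buf *old_flush_begin(struct queue *q)
{
    struct buf *head;

    q->flushing = true;
    head = q->head;
    q->head = NULL;
    return head;
}

/* Old flush_to_ldisc() exit: restore head if one was taken, always clear. */
static void old_flush_end(struct queue *q, struct buf *head)
{
    if (head)
        q->head = head;
    q->flushing = false;
}

/* tty_buffer_flush() model: drops the queue only if no flush seems to run. */
static bool buffer_flush(struct queue *q)
{
    if (q->flushing)
        return false;           /* the real code defers via TTY_FLUSHPENDING */
    q->head = NULL;
    q->tail = NULL;
    return true;
}

/* tty_buffer_request_room() model: link a new buffer at the tail. */
static void add_buffer(struct queue *q, struct buf *b)
{
    b->next = NULL;
    if (q->tail)
        q->tail->next = b;
    else
        q->head = b;
    q->tail = b;
}

/* Replays the diagram; returns true if head and tail end up inconsistent. */
bool replay_race(void)
{
    struct buf old = { NULL }, fresh = { NULL };
    struct queue q = { &old, &old, false };

    struct buf *t1 = old_flush_begin(&q);   /* thread1, sleeps in receive_buf */
    struct buf *t2 = old_flush_begin(&q);   /* thread2 sees head == NULL */
    old_flush_end(&q, t2);                  /* ...and clears the flag anyway */
    buffer_flush(&q);                       /* thread3: flag clear, tail = NULL */
    assert(q.head == NULL && q.tail == NULL);
    add_buffer(&q, &fresh);                 /* new data: head = tail = fresh */
    old_flush_end(&q, t1);                  /* thread1 wakes, restores old head */

    /* head points at the old buffer, tail at one not reachable from head */
    return q.head == &old && q.tail == &fresh && old.next == NULL;
}
```

In this model replay_race() ends with head on the old buffer and tail on a
buffer that cannot be reached from head. The real case would be worse, since
the old buffer would already have been freed by the flush.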

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14  0:55                           ` Frédéric L. W. Meunier
  (?)
@ 2009-10-14  1:12                           ` Linus Torvalds
  2009-10-14  1:20                               ` david-gFPdbfVZQbY
  -1 siblings, 1 reply; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14  1:12 UTC (permalink / raw)
  To: Frédéric L. W. Meunier
  Cc: Boyan, Justin P. Mattock, Nix, Alan Cox, Paul Fulghum,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi



On Tue, 13 Oct 2009, Frédéric L. W. Meunier wrote:
> 
> For now (more than 3h), it isn't doing any harm. And no keyboard lockups.

I think it was Boyan who said he could trigger it "easily", and everybody 
else had a hard time to reproduce the problem, so I'll consider your "good 
for 3h" to not really be a confirmation either way. But at least it's not 
totally broken.

> BTW, the old version of the patch was funny. It booted, but at the login
> prompt I could only enter the first letter.

Yeah, each time somebody read from a tty, the reading would also get rid 
of all the buffers, but would leave buf.tail set to the last one (that had 
been freed). 

Which then resulted in all subsequent IO going to that tail buffer, but 
nobody ever seeing it, because 'head' was NULL, and would stay that way as 
long as 'tail' existed (which it would until the tty was flushed, which it 
would never be).

So you'd only ever see the first read (which could obviously be more than 
one character, but you'd have to type REALLY FAST to get there ;^)

			Linus
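
The "only the first letter" symptom can be reproduced in a toy model. The
sketch below is an assumption-laden illustration, not the code from either
patch version: 'struct queue', drain(), append() and
chars_seen_after_first_read() are hypothetical names, buffers are never really
freed, and the tail buffer is assumed to always have room.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: 'unread' counts characters not yet seen by a reader. */
struct buf { struct buf *next; int unread; };
struct queue { struct buf *head, *tail; };

/*
 * Reader side: consume everything reachable from head. keep_last selects
 * whether the final, drained buffer stays on the list (later patch) or is
 * dropped while tail still points at it (first patch version).
 */
static int drain(struct queue *q, bool keep_last)
{
    int seen = 0;
    struct buf *head;

    while ((head = q->head) != NULL) {
        seen += head->unread;
        head->unread = 0;
        if (head->next == NULL && keep_last)
            break;
        q->head = head->next;   /* may become NULL; tail is left untouched */
    }
    return seen;
}

/* Writer side: new input lands in the tail buffer if there is one. */
static bool append(struct queue *q, int n)
{
    assert(n > 0);
    if (q->tail == NULL)
        return false;
    q->tail->unread += n;
    return true;
}

/* How many characters the second read sees after four more are typed. */
int chars_seen_after_first_read(bool keep_last)
{
    struct buf first = { NULL, 1 };
    struct queue q = { &first, &first };

    drain(&q, keep_last);           /* first read: one character */
    append(&q, 4);                  /* user keeps typing */
    return drain(&q, keep_last);    /* what the next read gets */
}
```

With keep_last set the second read sees all four characters. Without it, head
is NULL while tail still exists, so the input is stored but never found.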

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  1:20                               ` david-gFPdbfVZQbY
  0 siblings, 0 replies; 248+ messages in thread
From: david @ 2009-10-14  1:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Frédéric L. W. Meunier, Boyan, Justin P. Mattock, Nix,
	Alan Cox, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

On Tue, 13 Oct 2009, Linus Torvalds wrote:

> On Tue, 13 Oct 2009, Frédéric L. W. Meunier wrote:
>>
>> For now (more than 3h), it isn't doing any harm. And no keyboard lockups.
>
> I think it was Boyan who said he could trigger it "easily", and everybody
> else had a hard time to reproduce the problem, so I'll consider your "good
> for 3h" to not really be a confirmation either way. But at least it's not
> totally broken.
>
>> BTW, the old version of the patch was funny. It booted, but at the login
>> prompt I could only enter the first letter.
>
> Yeah, each time somebody read from a tty, the reading would also get rid
> of all the buffers, but would leave buf.tail set to the last one (that had
> been freed).
>
> Which then resulted in all subsequent IO going to that tail buffer, but
> nobody ever seeing it, because 'head' was NULL, and would stay that way as
> long as 'tail' existed (which it would until the tty was flushed, which it
> would never be).
>
> So you'd only ever see the first read (which could obviously be more than
> one character, but you'd have to type REALLY FAST to get there ;^)

Interesting, I had a handful of times in the last several months where I 
ran into something like this, but switching virtual terminals cleared it 
up.

David Lang

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  1:34                                   ` Paul Fulghum
  0 siblings, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-14  1:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson, hirofumi

Linus Torvalds wrote:
> Do you even bother reading my emails?

Yes, they are very colorful.

> Let me walk through an example of where the locking F*CKS UP, exactly 
> because it's broken.

OK, I got it this time.

-- 
Paul Fulghum
MicroGate Systems, Ltd.
=Customer Driven, by Design=
(800)444-1982
(512)345-7791 (Direct)
(512)343-9046 (Fax)
Central Time Zone (GMT -5h)
www.microgate.com

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14  7:45                           ` Boyan
  0 siblings, 0 replies; 248+ messages in thread
From: Boyan @ 2009-10-14  7:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: "Frédéric L. W. Meunier",
	Justin P. Mattock, Nix, Alan Cox, Paul Fulghum,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson,
	OGAWA Hirofumi

Linus Torvalds wrote:
> Another bug:
> 
> On Tue, 13 Oct 2009, Linus Torvalds wrote:
>>  			if (!count) {
>> -				if (head->next == NULL)
>> -					break;
> 
> Those two lines should _not_ be deleted. I cleaned up a bit too much.
> 
> The rule is that we must not free the last buffer, because it's also going 
> to be 'tail'.
> 
> So here's a new version with that fixed (and the previous bug I already 
> mentioned).
> 
> Whether it _works_ is still not clear. It might eat your pet goldfish, or 
> make farting noises in your general direction. Or it might fix the bug. 
> Who knows?
> 
> 		Linus
> 
> ---
>  drivers/char/tty_buffer.c |   29 +++++++++++++----------------
>  1 files changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
> index 3108991..0296612 100644
> --- a/drivers/char/tty_buffer.c
> +++ b/drivers/char/tty_buffer.c
> @@ -402,28 +402,26 @@ static void flush_to_ldisc(struct work_struct *work)
>  		container_of(work, struct tty_struct, buf.work.work);
>  	unsigned long 	flags;
>  	struct tty_ldisc *disc;
> -	struct tty_buffer *tbuf, *head;
> -	char *char_buf;
> -	unsigned char *flag_buf;
>  
>  	disc = tty_ldisc_ref(tty);
>  	if (disc == NULL)	/*  !TTY_LDISC */
>  		return;
>  
>  	spin_lock_irqsave(&tty->buf.lock, flags);
> -	/* So we know a flush is running */
> -	set_bit(TTY_FLUSHING, &tty->flags);
> -	head = tty->buf.head;
> -	if (head != NULL) {
> -		tty->buf.head = NULL;
> -		for (;;) {
> -			int count = head->commit - head->read;
> +
> +	if (!test_and_set_bit(TTY_FLUSHING, &tty->flags)) {
> +		struct tty_buffer *head;
> +		while ((head = tty->buf.head) != NULL) {
> +			int count;
> +			char *char_buf;
> +			unsigned char *flag_buf;
> +
> +			count = head->commit - head->read;
>  			if (!count) {
>  				if (head->next == NULL)
>  					break;
> -				tbuf = head;
> -				head = head->next;
> -				tty_buffer_free(tty, tbuf);
> +				tty->buf.head = head->next;
> +				tty_buffer_free(tty, head);
>  				continue;
>  			}
>  			/* Ldisc or user is trying to flush the buffers
> @@ -445,9 +443,9 @@ static void flush_to_ldisc(struct work_struct *work)
>  							flag_buf, count);
>  			spin_lock_irqsave(&tty->buf.lock, flags);
>  		}
> -		/* Restore the queue head */
> -		tty->buf.head = head;
> +		clear_bit(TTY_FLUSHING, &tty->flags);
>  	}
> +
>  	/* We may have a deferred request to flush the input buffer,
>  	   if so pull the chain under the lock and empty the queue */
>  	if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
> @@ -455,7 +453,6 @@ static void flush_to_ldisc(struct work_struct *work)
>  		clear_bit(TTY_FLUSHPENDING, &tty->flags);
>  		wake_up(&tty->read_wait);
>  	}
> -	clear_bit(TTY_FLUSHING, &tty->flags);
>  	spin_unlock_irqrestore(&tty->buf.lock, flags);
>  
>  	tty_ldisc_deref(disc);
> 


It works for me. I couldn't reproduce the problem with this patch on top
of 2.6.31.3 with CONFIG_PREEMPT=y.
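
As a rough illustration of the rule the quoted patch moves to (one flag
decides who flushes, head is advanced in place, and the last buffer is never
freed), here is a user-space sketch. It is not the kernel code: test_and_set()
is a plain non-atomic stand-in for test_and_set_bit(), 'struct queue', flush()
and check_fixed_flush() are hypothetical names, there is no spinlock, and the
TTY_FLUSHPENDING handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; the kernel uses atomic bitops under buf.lock. */
struct buf { struct buf *next; int unread; };

struct queue {
    struct buf *head, *tail;
    bool flushing;      /* stands in for TTY_FLUSHING */
    int freed;          /* counts buffers that would go to tty_buffer_free() */
};

/* Non-atomic stand-in for test_and_set_bit(): returns the previous value. */
static bool test_and_set(bool *flag)
{
    bool old = *flag;

    *flag = true;
    return old;
}

/* Models the patched loop; returns how many units reached the ldisc. */
static int flush(struct queue *q)
{
    int delivered = 0;
    struct buf *head;

    if (test_and_set(&q->flushing))
        return 0;               /* another flusher owns the queue */

    while ((head = q->head) != NULL) {
        if (head->unread == 0) {
            if (head->next == NULL)
                break;          /* never free the last buffer: it is also tail */
            q->head = head->next;
            q->freed++;
            continue;
        }
        delivered += head->unread;  /* receive_buf() would run here, unlocked */
        head->unread = 0;
    }
    /* invariant: head is only NULL when tail is NULL too */
    assert(q->head != NULL || q->tail == NULL);
    q->flushing = false;
    return delivered;
}

/* Returns true if the model behaves the way the patch intends. */
bool check_fixed_flush(void)
{
    struct buf b = { NULL, 2 }, a = { &b, 3 };
    struct queue q = { &a, &b, false, 0 };

    q.flushing = true;                      /* pretend thread1 is mid-flush */
    if (flush(&q) != 0 || q.head != &a)     /* thread2 must be a no-op */
        return false;
    q.flushing = false;                     /* thread1 finished */

    return flush(&q) == 5 && q.head == &b && q.tail == &b && q.freed == 1;
}
```

The point of the design is that a second caller never touches head at all, so
there is no "restore the queue head" step left to get wrong.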

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14 11:58                                   ` Alan Cox
  0 siblings, 0 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-14 11:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

Stop a moment. The code wasn't designed to permit two parallel calls of
flush_to_ldisc to the same tty. That was always forbidden when that code
was designed.

It got broken because someone changed that rule recently (and because
the locking was too clever)

Alan

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 11:58                                   ` Alan Cox
  (?)
@ 2009-10-14 15:07                                   ` Linus Torvalds
  2009-10-14 16:34                                       ` Paul Fulghum
  2009-10-14 16:38                                       ` Linus Torvalds
  -1 siblings, 2 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14 15:07 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Wed, 14 Oct 2009, Alan Cox wrote:
>
> Stop a moment. The code wasn't designed to permit two parallel calls of
> flush_to_ldisc to the same tty. That was always forbidden when that code
> was designed.

No, the code was clearly _designed_ for it - that's the whole and only 
point of the

	tty->buf.head = NULL;

line.

But it's certainly true that it just never happened before. At least for 
the !low_latency case, I'm not so sure about the low_latency=1 case, but I 
haven't checked either - it would depend on any higher-level 
serialization.

		Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14 16:34                                       ` Paul Fulghum
  0 siblings, 0 replies; 248+ messages in thread
From: Paul Fulghum @ 2009-10-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Boyan, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Dmitry Torokhov, Ed Tomlinson, hirofumi

Linus Torvalds wrote:
> No, the code was clearly _designed_ for it - that's the whole and only 
> point of the
> 
> 	tty->buf.head = NULL;

I don't know about the original flush_to_ldisc, but
I designed the above code to protect against parallel calls
knowing I could not hold a spinlock when calling receive_buf.

TTY_FLUSHING came later. The problem associated
with that addition proves my code is obscure enough to make
maintenance difficult. I got confused reviewing my own code
yesterday.

Boo hoo, live and learn.

-- 
Paul Fulghum
MicroGate Systems, Ltd.
=Customer Driven, by Design=
(800)444-1982
(512)345-7791 (Direct)
(512)343-9046 (Fax)
Central Time Zone (GMT -5h)
www.microgate.com

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14 16:38                                       ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14 16:38 UTC (permalink / raw)
  To: Alan Cox, Oleg Nesterov
  Cc: Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Wed, 14 Oct 2009, Linus Torvalds wrote:
> 
> But it's certainly true that it just never happened before. At least for 
> the !low_latency case, I'm not so sure about the low_latency=1 case, but I 
> haven't checked either - it would depend on any higher-level 
> serialization.

Btw, we _could_ try to solve this by adding some workqueue function to 
"run delayed work now", and then always doing the 'flush_to_ldisc()' 
through the workqueue logic.

So this is an "alternate patch": instead of making flush_to_ldisc() be 
safe to re-enter, we try to make sure it's always called through the whole 
workqueue logic and thus serialized by that.

Of course, keventd itself is multi-threaded, so I'm not entirely sure even 
-that- guarantees that one 'flush_to_ldisc()' couldn't be pending on one 
CPU while it is then scheduled and then run on another CPU concurrently 
too. The WORK_STRUCT_PENDING bit guarantees exclusion from the lists and 
from being pending, but the work might be both pending and _running_ at 
the same time, afaik.

I'm adding Oleg to the Cc, because he's the workqueue-master. Oleg?

The patch below is - surprise, surprise - entirely untested. I'm not sure 
my 'flush_delayed_work()' implementation is entirely kosher. But it looks 
like it might work, and it did compile for me (technically this is on top 
of my flush_to_ldisc() patch, but they should be independent of each 
other).

			Linus

---
 drivers/char/tty_buffer.c |    2 +-
 include/linux/workqueue.h |    1 +
 kernel/workqueue.c        |   18 ++++++++++++++++++
 3 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 0296612..66fa4e1 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -468,7 +468,7 @@ static void flush_to_ldisc(struct work_struct *work)
  */
 void tty_flush_to_ldisc(struct tty_struct *tty)
 {
-	flush_to_ldisc(&tty->buf.work.work);
+	flush_delayed_work(&tty->buf.work);
 }
 
 /**
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 7ef0c7b..cf24c20 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -207,6 +207,7 @@ extern int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
 
 extern void flush_workqueue(struct workqueue_struct *wq);
 extern void flush_scheduled_work(void);
+extern void flush_delayed_work(struct delayed_work *work);
 
 extern int schedule_work(struct work_struct *work);
 extern int schedule_work_on(int cpu, struct work_struct *work);
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index addfe2d..ccefe57 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -640,6 +640,24 @@ int schedule_delayed_work(struct delayed_work *dwork,
 EXPORT_SYMBOL(schedule_delayed_work);
 
 /**
+ * flush_delayed_work - block until a dwork_struct's callback has terminated
+ * @dwork: the delayed work which is to be flushed
+ *
+ * Any timeout is cancelled, and any pending work is run immediately.
+ */
+void flush_delayed_work(struct delayed_work *dwork)
+{
+	if (del_timer(&dwork->timer)) {
+		struct cpu_workqueue_struct *cwq;
+		cwq = wq_per_cpu(keventd_wq, get_cpu());
+		__queue_work(cwq, &dwork->work);
+		put_cpu();
+	}
+	flush_work(&dwork->work);
+}
+EXPORT_SYMBOL(flush_delayed_work);
+
+/**
  * schedule_delayed_work_on - queue work in global workqueue on CPU after delay
  * @cpu: cpu to use
  * @dwork: job to be done

^ permalink raw reply related	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 16:38                                       ` Linus Torvalds
  (?)
@ 2009-10-14 18:20                                       ` Oleg Nesterov
  2009-10-14 18:51                                           ` Linus Torvalds
  -1 siblings, 1 reply; 248+ messages in thread
From: Oleg Nesterov @ 2009-10-14 18:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

On 10/14, Linus Torvalds wrote:
>
> Of course, keventd itself is multi-threaded, so I'm not entirely sure even
> -that- guarantees that one 'flush_to_ldisc()' couldn't be pending on one
> CPU while it is then scheduled and then run on another CPU concurrently
> too. The WORK_STRUCT_PENDING bit guarantees exclusion from the lists and
> from being pending, but the work might be both pending and _running_ at
> the same time, afaik.

Yes.

>  void tty_flush_to_ldisc(struct tty_struct *tty)
>  {
> -	flush_to_ldisc(&tty->buf.work.work);
> +	flush_delayed_work(&tty->buf.work);
>  }

Can't comment on this change because I don't understand the problem.

> + * flush_delayed_work - block until a dwork_struct's callback has terminated
> + * @dwork: the delayed work which is to be flushed
> + *
> + * Any timeout is cancelled, and any pending work is run immediately.
> + */
> +void flush_delayed_work(struct delayed_work *dwork)
> +{
> +	if (del_timer(&dwork->timer)) {
> +		struct cpu_workqueue_struct *cwq;
> +		cwq = wq_per_cpu(keventd_wq, get_cpu());
> +		__queue_work(cwq, &dwork->work);
> +		put_cpu();
> +	}
> +	flush_work(&dwork->work);
> +}

I think this is correct. If del_timer() succeeds, we "own" _PENDING bit and
dwork->work must not be queued. But afaics this helper needs del_timer_sync(),
otherwise I am not sure about the "flush" part.

Let's suppose this dwork was pending and del_timer() returns 0. Since we use
del_timer, not del_timer_sync, it is possible that delayed_work_timer_fn() is
running in parallel, and the queueing is in progress. In this case flush_work()
can just return, before delayed_work_timer_fn() actually queues this dwork.
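
To make that window concrete, here is a small userspace model of it. This is
only a sketch under my own assumptions: it is single-threaded, the
interleaving is fixed by hand, and every model_* name is made up for the
illustration (none of them is a kernel API). It only shows the ordering
problem, not the real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names are kernel APIs. */
struct model_dwork {
	bool timer_armed;	/* timer is pending and has not fired yet */
	bool timer_fn_running;	/* timer fired, handler has not queued the work yet */
	bool queued;		/* work sits on a workqueue list */
	int runs;		/* how many times the callback has completed */
};

/* Timer expiry, step 1: the handler starts on "another CPU". */
void model_timer_fire(struct model_dwork *w)
{
	assert(w->timer_armed);
	w->timer_armed = false;
	w->timer_fn_running = true;
}

/* Timer expiry, step 2: the handler queues the work and returns. */
void model_timer_finish(struct model_dwork *w)
{
	assert(w->timer_fn_running);
	w->queued = true;
	w->timer_fn_running = false;
}

/* Models del_timer(): true only if the timer was still armed; never waits. */
bool model_del_timer(struct model_dwork *w)
{
	bool was_armed = w->timer_armed;

	w->timer_armed = false;
	return was_armed;
}

/* Models del_timer_sync(): also waits for a handler that is mid-flight. */
bool model_del_timer_sync(struct model_dwork *w)
{
	bool was_armed = model_del_timer(w);

	if (w->timer_fn_running)
		model_timer_finish(w);	/* stands in for spinning until it is done */
	return was_armed;
}

/* Models flush_work(): runs the work if it is queued, otherwise does nothing. */
void model_flush_work(struct model_dwork *w)
{
	if (w->queued) {
		w->queued = false;
		w->runs++;
	}
}

/* The helper under discussion; returns the completed runs at return time. */
int model_flush_delayed_work(struct model_dwork *w, bool use_sync)
{
	bool deleted = use_sync ? model_del_timer_sync(w) : model_del_timer(w);

	if (deleted)
		w->queued = true;	/* we own the work, queue it right away */
	model_flush_work(w);
	return w->runs;
}

/* Scenario: the timer handler is mid-flight when the flush is called. */
int model_race(bool use_sync)
{
	struct model_dwork w = { .timer_armed = true };

	model_timer_fire(&w);
	return model_flush_delayed_work(&w, use_sync);
}
```

In this model, model_race(false) returns with zero completed runs (the flush
came back before the work was ever queued), while model_race(true) returns
with one.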

And just in case... Of course, if dwork was pending and running on another CPU,
then flush_delayed_work(dwork) can return before the running callback terminates.
But I guess this is what we want.


As for tty_flush_to_ldisc(), what if tty->buf.work.work was not scheduled?
In this case flush_delayed_work() does nothing. Is it OK?

Oleg.


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14 18:51                                           ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14 18:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Wed, 14 Oct 2009, Oleg Nesterov wrote:
> 
> >  void tty_flush_to_ldisc(struct tty_struct *tty)
> >  {
> > -	flush_to_ldisc(&tty->buf.work.work);
> > +	flush_delayed_work(&tty->buf.work);
> >  }
> 
> Can't comment on this change because I don't understand the problem.

The work function is "flush_to_ldisc()", and what we want to make sure of 
is that the work has been called. We used to just call the work function 
directly - but that meant that now one CPU might be running that "direct" 
call, while another CPU might be running flush_to_ldisc through keventd.

So with this change, "flush_to_ldisc()" is now instead always called through 
keventd (but there's still a possibility that two keventd threads run it 
concurrently - although that is going to be _very_ rare).

> 
> > + * flush_delayed_work - block until a dwork_struct's callback has terminated
> > + * @dwork: the delayed work which is to be flushed
> > + *
> > + * Any timeout is cancelled, and any pending work is run immediately.
> > + */
> > +void flush_delayed_work(struct delayed_work *dwork)
> > +{
> > +	if (del_timer(&dwork->timer)) {
> > +		struct cpu_workqueue_struct *cwq;
> > +		cwq = wq_per_cpu(keventd_wq, get_cpu());
> > +		__queue_work(cwq, &dwork->work);
> > +		put_cpu();
> > +	}
> > +	flush_work(&dwork->work);
> > +}
> 
> I think this is correct. If del_timer() succeeds, we "own" _PENDING bit and
> dwork->work must not be queued. But afaics this helper needs del_timer_sync(),
> otherwise I am not sure about the "flush" part.

Hmm. I wanted to avoid del_timer_sync(), because it's so expensive for the 
case when the timer isn't running at all, but I do think you're correct.

If the del_timer() fails, the timer might still be running on another CPU 
right at that moment, but not quite have queued the work yet. And then 
we'd potentially get the wrong 'cwq' in flush_work() (we'd use the 'saved' 
work), and not wait for it.

I wonder if we could mark the case of "workqueue is on timer" by setting 
the "work->entry" list to some special value. That way

	list_empty(&work->entry)

would always mean "it's neither pending _nor_ scheduled", and 
flush_delayed_work() could have a fast-case check that at the top:

	if (list_empty(&work->entry))
		return;

or similar. When we do the 'insert_work()' in the timer function, the 
'list_empty()' invariant wouldn't change, so you could do that locklessly.

Of course, I've just talked about how much I hate subtle locking in the 
tty layer. This would be subtle, but we could document it, and it would be 
in the core kernel rather than a driver layer. 

> And just in case... Of course, if dwork was pending and running on another CPU,
> then flush_delayed_work(dwork) can return before the running callback terminates.
> But I guess this is what we want.

No, we want to wait for the callback to terminate, so we do want to hit 
that 'flush_work()' case.

> As for tty_flush_to_ldisc(), what if tty->buf.work.work was not scheduled?
> In this case flush_delayed_work() does nothing. Is it OK?

Yes. In fact, it would be a bonus over our current "we always call that 
flush function whether it was scheduled or not" code.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 18:51                                           ` Linus Torvalds
  (?)
@ 2009-10-14 19:52                                           ` Oleg Nesterov
  2009-10-14 20:55                                               ` Linus Torvalds
  2009-10-14 21:16                                             ` Alan Cox
  -1 siblings, 2 replies; 248+ messages in thread
From: Oleg Nesterov @ 2009-10-14 19:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

On 10/14, Linus Torvalds wrote:
>
> On Wed, 14 Oct 2009, Oleg Nesterov wrote:
> >
> > >  void tty_flush_to_ldisc(struct tty_struct *tty)
> > >  {
> > > -	flush_to_ldisc(&tty->buf.work.work);
> > > +	flush_delayed_work(&tty->buf.work);
> > >  }
> >
> > Can't comment on this change because I don't understand the problem.
>
> The work function is "flush_to_ldisc()", and what we want to make sure of
> is that the work has been called.

Thanks... This contradicts with

> > As for tty_flush_to_ldisc(), what if tty->buf.work.work was not scheduled?
> > In this case flush_delayed_work() does nothing. Is it OK?
>
> Yes. In fact, it would be a bonus over our current "we always call that
> flush function whether it was scheduled or not" code.

But I guess I understand what you meant.

> If the del_timer() fails, the timer might still be running on another CPU
> right at that moment, but not quite have queued the work yet. And then
> we'd potentially get the wrong 'cwq' in flush_work() (we'd use the 'saved'
> work), and not wait for it.

Or we can get the right cwq, but since the work is not queued and it is not
cwq->current_work, flush_work() correctly assumes there is nothing to do.

> I wonder if we could mark the case of "workqueue is on timer" by setting
> the "work->entry" list to some special value. That way
>
> 	list_empty(&work->entry)
>
> would always mean "it's neither pending _nor_ scheduled", and
> flush_delayed_work() could have a fast-case check that at the top:
>
> 	if (list_empty(&work->entry))
> 		return;

Yes, but we already have this - delayed_work_pending(). If it is
false, it is neither pending nor scheduled. But it may be running,
we can check cwq->current_work.

The problem is, should we check all CPUs to detect the running case?
please see below.

> > And just in case... Of course, if dwork was pending and running on another CPU,
> > then flush_delayed_work(dwork) can return before the running callback terminates.
> > But I guess this is what we want.
>
> No, we want to wait for the callback to terminate, so we do want to hit
> that 'flush_work()' case.

Hmm. Now I am confused.

OK. Let's suppose dwork's callback is running on CPU 0.

A thread running on CPU 1 does queue_delayed_work(dwork, delay).

Now, flush_workqueue() will flush the 2nd "queue_delayed_work" correctly,
but it can return before "running on CPU 0" completes.

If this is not what we want, then we have to iterate over all CPUs.
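
The scenario above can be sketched as a userspace model. Again, this is only
an illustration under my own assumptions: two "CPUs", a hand-picked state,
and made-up model_* names. It assumes, as I read the current code, that
flush_work() only looks at the per-CPU queue the work was last queued on.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Userspace model only; the names below are made up for illustration. */
struct model_cwq {
	bool queued;	/* the work is on this CPU's list */
	bool running;	/* this CPU's worker is inside the callback */
};

struct model_wq {
	struct model_cwq cpu[MODEL_NR_CPUS];
	int last_cpu;	/* CPU the work was most recently queued on */
};

/* Wait on one CPU: run the work if it is queued, let a running one finish. */
void model_flush_one(struct model_cwq *cwq)
{
	cwq->queued = false;
	cwq->running = false;
}

/* Models flush_work(): only the CPU the work was last queued on is checked. */
void model_flush_work(struct model_wq *wq)
{
	model_flush_one(&wq->cpu[wq->last_cpu]);
}

/* The alternative: iterate over all CPUs. */
void model_flush_all_cpus(struct model_wq *wq)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		model_flush_one(&wq->cpu[i]);
}

/* True if the callback is still running somewhere once the flush returns. */
bool model_still_running_after_flush(bool all_cpus)
{
	struct model_wq wq = { .last_cpu = 0 };

	wq.cpu[0].running = true;	/* callback is running on CPU 0 */
	wq.cpu[1].queued = true;	/* CPU 1 queued the same work again */
	wq.last_cpu = 1;

	if (all_cpus)
		model_flush_all_cpus(&wq);
	else
		model_flush_work(&wq);

	return wq.cpu[0].running || wq.cpu[1].running;
}
```

With all_cpus == false the model reports the CPU 0 run as still in progress
after the flush returns; with all_cpus == true it does not.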


As for optimization, I think you are right and flush_delayed_work()
can do
	if (delayed_work_pending() && del_timer_sync()) {
		...
	}
	flush_work();

Assuming that we should not check all CPUs. And in this case perhaps we can
even do something like

	if (!delayed_work_pending() &&
	    get_wq_data()->current_work != dwork)
		return;

but this needs barriers, and run_workqueue() needs mb__before_clear_bit().
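
A userspace sketch of the first variant above (the delayed_work_pending()
check in front of del_timer_sync()), just to show what the fast path buys.
It is not kernel code: the model_* names and the lock counter are
hypothetical, and the barriers mentioned above are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names are kernel APIs. */
struct model_dwork {
	bool pending;		/* models the PENDING bit: on timer or queued */
	bool timer_armed;	/* timer has not fired yet */
	bool queued;		/* work sits on a workqueue list */
	int runs;		/* completed callback runs */
	int timer_lock_taken;	/* how often the "expensive" timer path was hit */
};

/* Models del_timer_sync(): always pays for the timer lock. */
bool model_del_timer_sync(struct model_dwork *w)
{
	bool was_armed = w->timer_armed;

	w->timer_lock_taken++;
	w->timer_armed = false;
	return was_armed;
}

/* Models flush_work(): runs the work if it is queued. */
void model_flush_work(struct model_dwork *w)
{
	if (w->queued) {
		w->queued = false;
		w->pending = false;
		w->runs++;
	}
}

/* Check the pending bit first so an idle dwork never touches the timer. */
void model_flush_delayed_work(struct model_dwork *w)
{
	if (w->pending && model_del_timer_sync(w))
		w->queued = true;	/* timer cancelled: queue the work now */
	model_flush_work(w);
}

/* Idle dwork: how many times did the flush take the timer lock? */
int model_idle_flush_lock_count(void)
{
	struct model_dwork w = { 0 };

	model_flush_delayed_work(&w);
	return w.timer_lock_taken;
}

/* Armed dwork: how many callback runs completed by the time flush returns? */
int model_armed_flush_runs(void)
{
	struct model_dwork w = { .pending = true, .timer_armed = true };

	model_flush_delayed_work(&w);
	return w.runs;
}
```

In the model, flushing an idle dwork never takes the timer lock, and flushing
an armed one still runs the callback exactly once before returning.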


I'll try to think more tomorrow, but I doubt it is possible to avoid
del_timer_sync() logic. Whatever we do, if we hit the queueing in progress
we should spin until it is finished.

Oleg.


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14 19:59                                         ` Boyan
  0 siblings, 0 replies; 248+ messages in thread
From: Boyan @ 2009-10-14 19:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Oleg Nesterov, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, hirofumi

Linus Torvalds wrote:
> 
> On Wed, 14 Oct 2009, Linus Torvalds wrote:
>> But it's certainly true that it just never happened before. At least for 
>> the !low_latency case, I'm not so sure about the low_latency=1 case, but I 
>> haven't checked either - it would depend on any higher-level 
>> serialization.
> 
> Btw, we _could_ try to solve this by adding some workqueue function to 
> "run delayed work now", and then always doing the 'flush_to_ldisc()' 
> through the workqueue logic.
> 
> So this is an "alternate patch": instead of making flush_to_ldisc() be 
> safe to re-enter, we try to make sure it's always called through the whole 
> workqueue logic and thus serialized by that.
> 
> Of course, keventd itself is multi-threaded, so I'm not entirely sure even 
> -that- guarantees that one 'flush_to_ldisc()' couldn't be pending on one 
> CPU while it is then scheduled and then run on another CPU concurrently 
> too. The WORK_STRUCT_PENDING bit guarantees exclusion from the lists and 
> from being pending, but the work might be both pending and _running_ at 
> the same time, afaik.
> 
> I'm adding Oleg to the Cc, because he's the workqueue-master. Oleg?
> 
> The patch below is - surprise, surprise - entirely untested. I'm not sure 
> my 'flush_delayed_work()' implementation is entirely kosher. But it looks 
> like it might work, and it did compile for me (technically this is on top 
> of my flush_to_ldisc() patch, but they should be independent of each 
> other).
> 
> 			Linus
> 
> ---
>  drivers/char/tty_buffer.c |    2 +-
>  include/linux/workqueue.h |    1 +
>  kernel/workqueue.c        |   18 ++++++++++++++++++
>  3 files changed, 20 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
> index 0296612..66fa4e1 100644
> --- a/drivers/char/tty_buffer.c
> +++ b/drivers/char/tty_buffer.c
> @@ -468,7 +468,7 @@ static void flush_to_ldisc(struct work_struct *work)
>   */
>  void tty_flush_to_ldisc(struct tty_struct *tty)
>  {
> -	flush_to_ldisc(&tty->buf.work.work);
> +	flush_delayed_work(&tty->buf.work);
>  }
>  
>  /**
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 7ef0c7b..cf24c20 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -207,6 +207,7 @@ extern int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
>  
>  extern void flush_workqueue(struct workqueue_struct *wq);
>  extern void flush_scheduled_work(void);
> +extern void flush_delayed_work(struct delayed_work *work);
>  
>  extern int schedule_work(struct work_struct *work);
>  extern int schedule_work_on(int cpu, struct work_struct *work);
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index addfe2d..ccefe57 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -640,6 +640,24 @@ int schedule_delayed_work(struct delayed_work *dwork,
>  EXPORT_SYMBOL(schedule_delayed_work);
>  
>  /**
> + * flush_delayed_work - block until a dwork_struct's callback has terminated
> + * @dwork: the delayed work which is to be flushed
> + *
> + * Any timeout is cancelled, and any pending work is run immediately.
> + */
> +void flush_delayed_work(struct delayed_work *dwork)
> +{
> +	if (del_timer(&dwork->timer)) {
> +		struct cpu_workqueue_struct *cwq;
> +		cwq = wq_per_cpu(keventd_wq, get_cpu());
> +		__queue_work(cwq, &dwork->work);
> +		put_cpu();
> +	}
> +	flush_work(&dwork->work);
> +}
> +EXPORT_SYMBOL(flush_delayed_work);
> +
> +/**
>   * schedule_delayed_work_on - queue work in global workqueue on CPU after delay
>   * @cpu: cpu to use
>   * @dwork: job to be done
> 

Works for me. I couldn't reproduce the problem with only this patch on
top of  2.6.31.4.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-14 20:55                                               ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14 20:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Wed, 14 Oct 2009, Oleg Nesterov wrote:

> On 10/14, Linus Torvalds wrote:
> >
> > On Wed, 14 Oct 2009, Oleg Nesterov wrote:
> > >
> > > >  void tty_flush_to_ldisc(struct tty_struct *tty)
> > > >  {
> > > > -	flush_to_ldisc(&tty->buf.work.work);
> > > > +	flush_delayed_work(&tty->buf.work);
> > > >  }
> > >
> > > Can't comment on this change because I don't understand the problem.
> >
> > The work function is "flush_to_ldisc()", and what we want to make sure of
> > is that the work has been called.
> 
> Thanks... This contradicts with
> 
> > > As for tty_flush_to_ldisc(), what if tty->buf.work.work was not scheduled?
> > > In this case flush_delayed_work() does nothing. Is it OK?
> >
> > Yes. In fact, it would be a bonus over our current "we always call that
> > flush function whether it was scheduled or not" code.
> 
> But I guess I understand what you meant.

Yeah. Basically, we want to make sure that it has been called *since it 
was scheduled*. In case it has already been called and is no longer 
pending at all, not calling it again is fine.

It's just that we didn't have any way to do that "force the pending 
delayed work to be scheduled", so instead we ran the scheduled function by 
hand synchronously. Which then seems to have triggered other problems.

> > If the del_timer() fails, the timer might still be running on another CPU
> > right at that moment, but not quite have queued the work yet. And then
> > we'd potentially get the wrong 'cwq' in flush_work() (we'd use the 'saved'
> > work), and not wait for it.
> 
> Or we can get the right cwq, but since the work is not queued and it is not
> cwq->current_work, flush_work() correctly assumes there is nothing to do.

Yes.

> > I wonder if we could mark the case of "workqueue is on timer" by setting
> > the "work->entry" list to some special value. That way
> >
> > 	list_empty(&work->entry)
> >
> > would always mean "it's neither pending _nor_ scheduled", and
> > flush_delayed_work() could have a fast-case check that at the top:
> >
> > 	if (list_empty(&work->entry))
> > 		return;
> 
> Yes, but we already have this - delayed_work_pending(). If it is
> false, it is neither pending nor scheduled. But it may be running,
> we can check cwq->current_work.

Yes. But I was more worried about the locks that "del_timer_sync()" does: 
the timer locks are more likely to be contended than the workqueue locks.

Maybe. I dunno.

> > > And just in case... Of course, if dwork was pending and running on another CPU,
> > > then flush_delayed_work(dwork) can return before the running callback terminates.
> > > But I guess this is what we want.
> >
> > No, we want to wait for the callback to terminate, so we do want to hit
> > that 'flush_work()' case.
> 
> Hmm. Now I am confused.
> 
> OK. Let's suppose dwork's callback is running on CPU 0.
> 
> A thread running on CPU 1 does queue_delayed_work(dwork, delay).
> 
> Now, flush_workqueue() will flush the 2nd "queue_delayed_work" correctly,
> but it can return before "running on CPU 0" completes.

Well, this is actually similar to the larger issue of "the tty layer 
doesn't want to ever run two works concurrently". So we already hit the 
concurrency bug.

That said, I had an earlier patch that should make that concurrency case 
be ok (you were not cc'd on that, because that was purely internal to the 
tty layer). And I think we want to do that regardless, especially since it 
_can_ happen with workqueues too (although I suspect it's rare enough in 
practice that nobody cares).

And to some degree, true races are ok. If somebody is writing data on 
another CPU at the same time as we are trying to flush it, not getting the 
flush is fine. In the case we have really cared about, we had real 
synchronization between the writer and the reader (ie the writer who added 
the delayed work will have done a wakeup and other things to let the 
reader know).

The reason for flushing it was that without the flush, the reader wouldn't 
necessarily see the data even though it's "old" by then - a delay of a 
jiffy is a _loong_ time.. So the flush doesn't need to be horribly exact, 
and after we have flushed, we will take locks that should serialize with 
the flusher.

So I don't think it really matters in practice, but I do think that we 
have that nasty hole in workqueues in general with overlapping work. I 
wish I could think of a way to fix it.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 19:59                                         ` Boyan
  (?)
@ 2009-10-14 21:02                                         ` Linus Torvalds
  2009-10-14 21:39                                           ` Alan Cox
  2009-10-15  7:24                                             ` Boyan
  -1 siblings, 2 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-14 21:02 UTC (permalink / raw)
  To: Boyan
  Cc: Alan Cox, Oleg Nesterov, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, hirofumi



On Wed, 14 Oct 2009, Boyan wrote:
> 
> Works for me. I couldn't reproduce the problem with only this patch on
> top of  2.6.31.4.

So just to verify: both the flush_to_ldisc() patch _and_ the 
"flush_delayed_work()" one fixed the problem for you? And you tested them 
independently? And you said you could reliably trigger it before?

Ok, that makes me happy, because it implies that this really is the root 
cause, with two different approaches to fixing the same problem both 
working independently of each other.

So if you confirm that I understand your test situation right, I will 
probably commit them both: I think the flush_to_ldisc() patch is a real 
locking improvement regardless of whether we then avoid calling it in a 
nested manner or not, and the flush_delayed_work() thing seems to be the 
right thing(tm) to do too.

Besides, even with the flush_delayed_work() thing, we're still faced with 
the theory of multiple keventd threads running the flush_to_ldisc on 
separate CPU's. Even though it's probably unlikely enough to never happen.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 19:52                                           ` Oleg Nesterov
  2009-10-14 20:55                                               ` Linus Torvalds
@ 2009-10-14 21:16                                             ` Alan Cox
  2009-10-14 21:51                                               ` David Miller
  1 sibling, 1 reply; 248+ messages in thread
From: Alan Cox @ 2009-10-14 21:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

> Assuming that we should not check all CPUs. And in this case perhaps we can
> even do something like
> 
> 	if (!delayed_work_pending() &&
> 	    get_wq_data()->current_work != dwork)
> 		return;
> 
> but this needs barriers, and run_workqueue() needs mb__before_clear_bit().

Linus correctly said that we got into the mess because the locking was
too clever before. At this point the mutex approach seems to be rather
preferable to the other approaches which are at least as "clever" as the
current locking

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 21:02                                         ` Linus Torvalds
@ 2009-10-14 21:39                                           ` Alan Cox
  2009-10-15  7:24                                             ` Boyan
  1 sibling, 0 replies; 248+ messages in thread
From: Alan Cox @ 2009-10-14 21:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Boyan, Oleg Nesterov, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, hirofumi

On Wed, 14 Oct 2009 14:02:10 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Wed, 14 Oct 2009, Boyan wrote:
> > 
> > Works for me. I couldn't reproduce the problem with only this patch on
> > top of  2.6.31.4.
> 
> So just to verify: both the flush_to_ldisc() patch _and_ the 
> "flush_delayed_work()" one fixed the problem for you? And you tested them 
> independently? And you said you could reliably trigger it before?
> 
> Ok, that makes me happy, because it implies that this really is the root 
> cause, with two different approaches to fixing the same problem both 
> working independently of each other.

It also seems to fix the "letters leaking across a console switch". Can't
be sure as it's not trivial to reproduce, but it seems to have gone too.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 21:16                                             ` Alan Cox
@ 2009-10-14 21:51                                               ` David Miller
  0 siblings, 0 replies; 248+ messages in thread
From: David Miller @ 2009-10-14 21:51 UTC (permalink / raw)
  To: alan
  Cc: oleg, torvalds, paulkf, btanastasov, rjw, linux-kernel,
	kernel-testers, dmitry.torokhov, edt, hirofumi

From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Wed, 14 Oct 2009 22:16:33 +0100

>> Assuming that we should not check all CPUs. And in this case perhaps we can
>> even do something like
>> 
>> 	if (!delayed_work_pending() &&
>> 	    get_wq_data()->current_work != dwork)
>> 		return;
>> 
>> but this needs barriers, and run_workqueue() needs mb__before_clear_bit().
> 
> Linus correctly said that we got into the mess because the locking was
> too clever before. At this point the mutex approach seems to be rather
> preferable to the other approaches which are at least as "clever" as the
> current locking

FWIW I've been seeing behavior that is almost certainly caused by this
bug for a while: I lose 'control' or 'shift' key-down events, and this
is entirely independent of the keyboard attached (I've tried several) and
seems to only happen on an SMP box.

It's most acutely annoying if it triggers while playing quake3 :-)

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14258] Memory leak in SCSI initialization
  2009-10-11 23:01   ` Rafael J. Wysocki
  (?)
@ 2009-10-15  2:30   ` Tetsuo Handa
  -1 siblings, 0 replies; 248+ messages in thread
From: Tetsuo Handa @ 2009-10-15  2:30 UTC (permalink / raw)
  To: James.Bottomley, rjw; +Cc: kernel-testers, michael, linux-kernel

I got the messages below in 2.6.32-rc4.

# dmesg | grep kmemleak
[    7.612391] kmemleak: Kernel memory leak detector initialized
[    7.615675] kmemleak: Automatic memory scanning thread started
[   78.641096] kmemleak: 13 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xdac2c478 (size 32):
  comm "swapper", pid 1, jiffies 4294894406
  hex dump (first 32 bytes):
    30 3a 30 3a 32 3a 30 00 5a 5a 5a 5a 5a 5a 5a 5a  0:0:2:0.ZZZZZZZZ
    5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a a5  ZZZZZZZZZZZZZZZ.
  backtrace:
    [<c10d3944>] create_object+0xe4/0x220
    [<c1322663>] kmemleak_alloc+0x83/0xd0
    [<c10d01d4>] __kmalloc+0x1b4/0x220
    [<c11ab450>] kvasprintf+0x30/0x60
    [<c11a3131>] kobject_set_name_vargs+0x21/0x60
    [<c11f8cd9>] dev_set_name+0x19/0x20
    [<c122c2b3>] scsi_sysfs_device_initialize+0xc3/0x120
    [<c1228ac4>] scsi_alloc_sdev+0x194/0x230
    [<c1229b50>] scsi_probe_and_add_lun+0x320/0x340
    [<c122a477>] __scsi_scan_target+0xb7/0x100
    [<c122a5f6>] scsi_scan_channel+0x86/0xa0
    [<c122a6f9>] scsi_scan_host_selected+0xe9/0x150
    [<c122aabc>] do_scsi_scan_host+0x7c/0x80
    [<c122ab6d>] scsi_scan_host+0x8d/0x90
    [<c1520c75>] BusLogic_init+0x355/0x420
    [<c100105c>] do_one_initcall+0x2c/0x1d0
(...snipped...)
unreferenced object 0xdac2cc58 (size 32):
  comm "swapper", pid 1, jiffies 4294894414
  hex dump (first 32 bytes):
    30 3a 30 3a 31 35 3a 30 00 5a 5a 5a 5a 5a 5a 5a  0:0:15:0.ZZZZZZZ
    5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a a5  ZZZZZZZZZZZZZZZ.
  backtrace:
    [<c10d3944>] create_object+0xe4/0x220
    [<c1322663>] kmemleak_alloc+0x83/0xd0
    [<c10d01d4>] __kmalloc+0x1b4/0x220
    [<c11ab450>] kvasprintf+0x30/0x60
    [<c11a3131>] kobject_set_name_vargs+0x21/0x60
    [<c11f8cd9>] dev_set_name+0x19/0x20
    [<c122c2b3>] scsi_sysfs_device_initialize+0xc3/0x120
    [<c1228ac4>] scsi_alloc_sdev+0x194/0x230
    [<c1229b50>] scsi_probe_and_add_lun+0x320/0x340
    [<c122a477>] __scsi_scan_target+0xb7/0x100
    [<c122a5f6>] scsi_scan_channel+0x86/0xa0
    [<c122a6f9>] scsi_scan_host_selected+0xe9/0x150
    [<c122aabc>] do_scsi_scan_host+0x7c/0x80
    [<c122ab6d>] scsi_scan_host+0x8d/0x90
    [<c1520c75>] BusLogic_init+0x355/0x420
    [<c100105c>] do_one_initcall+0x2c/0x1d0

In my environment, 0:0:0:0 and 0:0:1:0 are used by SCSI hard disks, 0:0:7:0 is
reserved. 0:0:X:0 (where X = 2-6, 8-15) are unused and reported as memory leaks.

After applying http://patchwork.kernel.org/patch/51412/ , the above messages
no longer appear. Please apply that patch to 2.6.32-rcX as well as 2.6.31.Y .

Regards.

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-15  7:24                                             ` Boyan
  0 siblings, 0 replies; 248+ messages in thread
From: Boyan @ 2009-10-15  7:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Oleg Nesterov, Paul Fulghum, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, hirofumi

Linus Torvalds wrote:
> 
> On Wed, 14 Oct 2009, Boyan wrote:
>> Works for me. I couldn't reproduce the problem with only this patch on
>> top of  2.6.31.4.
> 
> So just to verify: both the flush_to_ldisc() patch _and_ the 
> "flush_delayed_work()" one fixed the problem for you? And you tested them 
> independently? And you said you could reliably trigger it before?

Yes, both patches independently fix the problem for me.
I've tested with both patches applied too and couldn't trigger the
problem.


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 20:55                                               ` Linus Torvalds
  (?)
@ 2009-10-15 12:47                                               ` Oleg Nesterov
  2009-10-15 15:29                                                   ` Oleg Nesterov
  2009-10-15 15:53                                                 ` Linus Torvalds
  -1 siblings, 2 replies; 248+ messages in thread
From: Oleg Nesterov @ 2009-10-15 12:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

On 10/14, Linus Torvalds wrote:
>
> Yeah. Basically, we want to make sure that it has been called *since it
> was scheduled*. In case it has already been called and is no longer
> pending at all, not calling it again is fine.
>
> It's just that we didn't have any way to do that "force the pending
> delayed work to be scheduled", so instead we ran the scheduled function by
> hand synchronously. Which then seems to have triggered other problems.

Yes. But I am not sure these problems are new. I do not understand this
code even remotely, but from a quick grep it seems to me it is possible
that flush_to_ldisc() can race with itself even without tty_flush_to_ldisc()
which calls work->func() by hand.

> So I don't think it really matters in practice, but I do think that we
> have that nasty hole in workqueues in general with overlapping work. I
> wish I could think of a way to fix it.

I don't entirely agree this is a hole, I mean everything works "as expected".
But yes, I agree, users often do not realize that multithreaded workqueues
imply overlapping works (unless the caller takes care). And in this case
work->func() should solve the races itself.

Perhaps it makes sense to introduce something like

	// same as queue_work(), but ensures work->func() can't race with itself

	int queue_work_xxx(struct workqueue_struct *wq, struct work_struct *work)
	{
		int ret = 0;

		if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
			struct cpu_workqueue_struct *cwq = get_wq_data(work);
			int cpu = get_cpu();

			// "cwq->current_work != work" is not strictly needed,
			// but we don't want to pin this work to the single CPU.

			if (!cwq || cwq->current_work != work)
				cwq = wq_per_cpu(wq, cpu);

			__queue_work(cwq, work);
			put_cpu();
			ret = 1;
		}

		return ret;
	}

This way we can never have multiple instances of the same work running on
different CPUs. Assuming, of course, the caller never mixes queue_work_xxx()
with queue_work(). The logic for queue_delayed_work_xxx() is similar.
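
The queue-selection rule above can be modelled in userspace. This is only a
sketch under stated assumptions, not kernel code: the model_* types and
functions are hypothetical stand-ins for cpu_workqueue_struct, work_struct and
the choice made before __queue_work(), and locking and CPU hotplug are ignored
entirely.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2

struct model_work;

/* Hypothetical stand-in for cpu_workqueue_struct. */
struct model_cwq {
	int cpu;
	struct model_work *current_work;	/* work whose handler is running now */
};

/* Hypothetical stand-in for work_struct; last_cwq models get_wq_data(). */
struct model_work {
	struct model_cwq *last_cwq;
};

static struct model_cwq model_cwqs[MODEL_NR_CPUS] = { { 0, NULL }, { 1, NULL } };

/*
 * Stay on the old queue only while the handler is still running there;
 * otherwise use the queue of the CPU doing the queueing.
 */
static struct model_cwq *model_pick_cwq(struct model_work *w, int this_cpu)
{
	struct model_cwq *old = w->last_cwq;

	if (old && old->current_work == w)
		return old;
	return &model_cwqs[this_cpu];
}

int run_sticky_example(void)
{
	struct model_work w = { NULL };

	/* Never queued before: goes to the local CPU. */
	assert(model_pick_cwq(&w, 0) == &model_cwqs[0]);

	/* Handler is running on CPU 0; a requeue from CPU 1 must stay on CPU 0. */
	w.last_cwq = &model_cwqs[0];
	model_cwqs[0].current_work = &w;
	assert(model_pick_cwq(&w, 1) == &model_cwqs[0]);

	/* Handler finished: the work is free to move to CPU 1. */
	model_cwqs[0].current_work = NULL;
	assert(model_pick_cwq(&w, 1) == &model_cwqs[1]);
	return 0;
}
```

Calling run_sticky_example() from a test harness exercises the three cases.
Because a requeue made while the handler runs lands behind that handler on the
same queue, the two instances are serialized rather than overlapped.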

But, this can race with cpu_down(). I think this is solvable but needs
more locking. I mean, the caller of queue_work_xxx() must not use the old
get_wq_data(work) if this CPU is already dead, but a simple cpu_online()
is not enough, we can race with workqueue_cpu_callback(CPU_POST_DEAD)
flushing this cwq, in this case we should carefully insert this work
into the almost-dead queue.

Or, perhaps better, instead of new helper, we can probably use the free
bit in work_struct->data to mark this work/dwork as "single-instance-work".
In this case __queue_work and queue_delayed_work_on should check this bit.
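
For readers unfamiliar with the "free bit" idea: the data word holds a pointer
whose two low bits are free because of alignment, so flags can live there. The
following is a minimal userspace sketch of that packing, assuming a 4-byte
aligned queue structure. The MODEL_* macros and model_* functions are
hypothetical names chosen for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag layout modelled on WORK_STRUCT_FLAG_MASK (3UL). */
#define MODEL_FLAG_PENDING	((uintptr_t)1 << 0)	/* models WORK_STRUCT_PENDING */
#define MODEL_FLAG_SINGLE	((uintptr_t)1 << 1)	/* models the proposed extra bit */
#define MODEL_FLAG_MASK		((uintptr_t)3)
#define MODEL_WQ_DATA_MASK	(~MODEL_FLAG_MASK)

/* Forced to 4-byte alignment so the two low pointer bits are always zero. */
struct model_cwq {
	_Alignas(4) int cpu;
};

struct model_work {
	uintptr_t data;		/* pointer bits plus flag bits */
};

/* Replace the pointer bits while keeping whatever flags are already set. */
static void model_set_wq_data(struct model_work *w, struct model_cwq *cwq)
{
	w->data = (uintptr_t)cwq | (w->data & MODEL_FLAG_MASK);
}

/* Recover the pointer by masking the flag bits off. */
static struct model_cwq *model_get_wq_data(const struct model_work *w)
{
	return (struct model_cwq *)(w->data & MODEL_WQ_DATA_MASK);
}

static int model_work_single(const struct model_work *w)
{
	return (w->data & MODEL_FLAG_SINGLE) != 0;
}

int run_flag_example(void)
{
	static struct model_cwq cwq = { 1 };
	struct model_work w = { MODEL_FLAG_SINGLE };

	assert(((uintptr_t)&cwq & MODEL_FLAG_MASK) == 0);

	model_set_wq_data(&w, &cwq);
	assert(model_get_wq_data(&w) == &cwq);
	assert(model_work_single(&w));		/* flag survived the pointer update */

	w.data |= MODEL_FLAG_PENDING;
	assert(model_get_wq_data(&w) == &cwq);	/* pointer unaffected by flags */
	return 0;
}
```

The point of the sketch is only that a second flag fits without growing the
structure, which is why marking the work item itself is cheaper than adding a
new queueing helper.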

Do you think this makes sense and can close the hole?

If yes, I'll try to do this over the weekend.

Oleg.


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-15 15:29                                                   ` Oleg Nesterov
  0 siblings, 0 replies; 248+ messages in thread
From: Oleg Nesterov @ 2009-10-15 15:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi

On 10/15, Oleg Nesterov wrote:
>
> But, this can race with cpu_down(). I think this is solvable but needs
> more locking. I mean, the caller of queue_work_xxx() must not use the old
> get_wq_data(work) if this CPU is already dead, but a simple cpu_online()
> is not enough, we can race with workqueue_cpu_callback(CPU_POST_DEAD)
> flushing this cwq, in this case we should carefully insert this work
> into the almost-dead queue.
>
> Or, perhaps better, instead of new helper, we can probably use the free
> bit in work_struct->data to mark this work/dwork as "single-instance-work".
> In this case __queue_work and queue_delayed_work_on should check this bit.

Actually, this looks simple. Please see the patch below.

Of course, the horror in __queue_work() should be cleaned up somehow.
The change to queue_delayed_work_on() probably needs a separate patch.


All, what do you think? Do we need this?

Oleg.

If the work_struct/delayed_work has WORK_STRUCT_XXX bit set, it can never
race with itself.

Note: queue_work_on() or queue_delayed_work_on() must not be used if it is
work_xxx().

Also, we can optimize flush/cancel operations to not scan all CPUs if this
work is "singlethreaded".

PROBLEM: work_xxx() work can block cpu_down() if it constantly re-queues
itself, hopefully we shouldn't have such stupid users.
---

--- TTT_32/include/linux/workqueue.h~WORK_XXX	2009-09-23 21:12:03.000000000 +0200
+++ TTT_32/include/linux/workqueue.h	2009-10-15 16:49:25.000000000 +0200
@@ -24,7 +24,8 @@ typedef void (*work_func_t)(struct work_
 
 struct work_struct {
 	atomic_long_t data;
-#define WORK_STRUCT_PENDING 0		/* T if work item pending execution */
+#define WORK_STRUCT_PENDING	0	/* T if work item pending execution */
+#define WORK_STRUCT_XXX		1	/* deny multiple running instances */
 #define WORK_STRUCT_FLAG_MASK (3UL)
 #define WORK_STRUCT_WQ_DATA_MASK (~WORK_STRUCT_FLAG_MASK)
 	struct list_head entry;
@@ -148,6 +149,9 @@ struct execute_work {
 #define work_pending(work) \
 	test_bit(WORK_STRUCT_PENDING, work_data_bits(work))
 
+#define work_xxx(work) \
+	test_bit(WORK_STRUCT_XXX, work_data_bits(work))
+
 /**
  * delayed_work_pending - Find out whether a delayable work item is currently
  * pending
--- TTT_32/kernel/workqueue.c~WORK_XXX	2009-09-12 21:40:11.000000000 +0200
+++ TTT_32/kernel/workqueue.c	2009-10-15 17:09:51.000000000 +0200
@@ -145,6 +145,35 @@ static void __queue_work(struct cpu_work
 {
 	unsigned long flags;
 
+	if (work_xxx(work)) {
+		struct cpu_workqueue_struct *old = get_wq_data(work);
+		bool done = false;
+
+		if (!old)
+			goto fallback;
+
+		// This lockless check is racy. We should either remove it
+		// or add mb__before_clear_bit() into run_workqueue().
+		if (old->current_work != work)
+			goto fallback;
+
+		// OK, we should keep this old cwq. But its CPU can be dead,
+		// we have to recheck under old->lock
+		spin_lock_irqsave(&old->lock, flags);
+		if (old->current_work == work) {
+			// It is still running, queue the work here.
+			// even if this CPU is dead, run_workqueue()
+			// can't return without noticing this work
+			insert_work(old, work, &old->worklist);
+			done = true;
+		}
+		spin_unlock_irqrestore(&old->lock, flags);
+
+		if (done)
+			return;
+	}
+
+fallback:
 	spin_lock_irqsave(&cwq->lock, flags);
 	insert_work(cwq, work, &cwq->worklist);
 	spin_unlock_irqrestore(&cwq->lock, flags);
@@ -246,7 +275,8 @@ int queue_delayed_work_on(int cpu, struc
 		timer_stats_timer_set_start_info(&dwork->timer);
 
 		/* This stores cwq for the moment, for the timer_fn */
-		set_wq_data(work, wq_per_cpu(wq, raw_smp_processor_id()));
+		if (!get_wq_data(work))
+			set_wq_data(work, wq_per_cpu(wq, raw_smp_processor_id()));
 		timer->expires = jiffies + delay;
 		timer->data = (unsigned long)dwork;
 		timer->function = delayed_work_timer_fn;


^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-15 12:47                                               ` Oleg Nesterov
  2009-10-15 15:29                                                   ` Oleg Nesterov
@ 2009-10-15 15:53                                                 ` Linus Torvalds
  1 sibling, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-15 15:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Thu, 15 Oct 2009, Oleg Nesterov wrote:

> On 10/14, Linus Torvalds wrote:
> >
> > It's just that we didn't have any way to do that "force the pending
> > delayed work to be scheduled", so instead we ran the scheduled function by
> > hand synchronously. Which then seems to have triggered other problems.
> 
> Yes. But I am not sure these problems are new.

Oh, I agree. I think the locking bug in the tty layer was long-standing, 
it just was impossible to trigger in practice as long as it was only 
called through keventd. The window is fairly small, and for many things 
(like the X keyboard), the amount of data transferred is tiny and you 
don't actually schedule the workqueue very often at all.

> I do not understand this code even remotely, but from a quick grep it 
> seems to me it is possible that flush_to_ldisc() can race with itself 
> even without tty_flush_to_ldisc() which calls work->func() by hand.

Yes, but only in the _very_ special case of scheduling the callback at 
just the right moment. And in fact, traditionally it was always scheduled 
with a timeout, which makes it even less likely to happen.

> Perhaps it makes sense to introduce something like
> 
> 	// same as queue_work(), but ensures work->func() can't race with itself
> 
> 	int queue_work_xxx(struct workqueue_struct *wq, struct work_struct *work)
> 	{
> 		int ret = 0;
> 
> 		if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
> 			struct cpu_workqueue_struct *cwq = get_wq_data(work);
> 			int cpu = get_cpu();
> 
> 			// "cwq->current_work != work" is not strictly needed,
> 			// but we don't want to pin this work to the single CPU.
> 
> 			if (!cwq || cwq->current_work != work)
> 				cwq = wq_per_cpu(wq, cpu);
> 
> 			__queue_work(cwq, work);

Yes. Looks good to me. Just forcing the new one to be on the same CPU as 
the previous one should solve it.

And it should even be good for performance to make it "sticky" to the CPU, 
so I think this could even be done without any new flags or functions.

The people who actually want to run work on multiple CPU's in parallel end 
up always having multiple work structures, so I think the CPU-stickiness 
is good for everybody.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-15 16:04                                                     ` Linus Torvalds
  0 siblings, 0 replies; 248+ messages in thread
From: Linus Torvalds @ 2009-10-15 16:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson, OGAWA Hirofumi



On Thu, 15 Oct 2009, Oleg Nesterov wrote:
> 
> Actually, this looks simple. Please see the patch below.

Ok, I should have read all my emails before responding to the previous 
one.

But my response ends up being the same: I think your patch is fine, and I 
think you should drop the conditional flag. I don't think anybody _ever_ 
wants to run the same work entry on multiple CPU's at once. Anybody who 
wants parallelism needs to use multiple work entries _anyway_ (since the 
accidental parallelism you *can* get with a single one is really very 
accidental indeed).

Of course, it should be tested in -next for a while regardless. And talk 
to the networking people, who are the main users of workqueues.

			Linus

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-14 16:38                                       ` Linus Torvalds
                                                         ` (2 preceding siblings ...)
  (?)
@ 2009-10-15 17:38                                       ` OGAWA Hirofumi
  2009-10-15 19:00                                         ` Oleg Nesterov
  2009-10-15 21:49                                         ` Linus Torvalds
  -1 siblings, 2 replies; 248+ messages in thread
From: OGAWA Hirofumi @ 2009-10-15 17:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Oleg Nesterov, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson

Hi,

Linus Torvalds <torvalds@linux-foundation.org> writes:

> diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
> index 0296612..66fa4e1 100644
> --- a/drivers/char/tty_buffer.c
> +++ b/drivers/char/tty_buffer.c
> @@ -468,7 +468,7 @@ static void flush_to_ldisc(struct work_struct *work)
>   */
>  void tty_flush_to_ldisc(struct tty_struct *tty)
>  {
> -	flush_to_ldisc(&tty->buf.work.work);
> +	flush_delayed_work(&tty->buf.work);
>  }

This might wait unnecessarily for scheduled work in input_available_p().
It is a nitpick, but we could call tty_flush_to_ldisc() only when no data
is available.

I.e. something like the following:

static inline int input_available_p(struct tty_struct *tty, int amt)
{
	int checked = 0;

retry:
	if (tty->icanon) {
		if (tty->canon_data)
			return 1;
	} else if (tty->read_cnt >= (amt ? amt : 1))
		return 1;

	if (!checked) {
		tty_flush_to_ldisc(tty);
		checked = 1;
		goto retry;
	}

	return 0;
}
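
The check-first, flush-once pattern above can be tried outside the kernel.
Below is a minimal userspace sketch, assuming a simplified tty in which a
flush just moves buffered bytes into read_cnt. struct model_tty and the
model_* functions are hypothetical, and the flush_calls counter exists only
to make the behaviour observable.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the fields input_available_p() reads. */
struct model_tty {
	int icanon;
	int canon_data;
	int read_cnt;
	int buffered;		/* bytes not yet pushed to the line discipline */
	int flush_calls;	/* how often the (expensive) flush was needed */
};

/* Models tty_flush_to_ldisc(): push buffered bytes to the reader side. */
static void model_flush_to_ldisc(struct model_tty *tty)
{
	tty->flush_calls++;
	tty->read_cnt += tty->buffered;
	tty->buffered = 0;
}

/* Check first; flush at most once, and only if nothing was available. */
static int model_input_available(struct model_tty *tty, int amt)
{
	int flushed = 0;

	for (;;) {
		if (tty->icanon) {
			if (tty->canon_data)
				return 1;
		} else if (tty->read_cnt >= (amt ? amt : 1)) {
			return 1;
		}

		if (flushed)
			return 0;
		model_flush_to_ldisc(tty);
		flushed = 1;
	}
}

int run_input_example(void)
{
	struct model_tty ready = { 0, 0, 4, 0, 0 };	/* data already readable */
	struct model_tty late  = { 0, 0, 0, 4, 0 };	/* data still buffered */
	struct model_tty empty = { 0, 0, 0, 0, 0 };	/* no data anywhere */

	assert(model_input_available(&ready, 1) == 1 && ready.flush_calls == 0);
	assert(model_input_available(&late, 1) == 1 && late.flush_calls == 1);
	assert(model_input_available(&empty, 1) == 0 && empty.flush_calls == 1);
	return 0;
}
```

The first case is the one the suggestion is about: when data is already
readable, the flush (and any wait it implies) is skipped entirely.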

> +void flush_delayed_work(struct delayed_work *dwork)
> +{
> +	if (del_timer(&dwork->timer)) {
> +		struct cpu_workqueue_struct *cwq;
> +		cwq = wq_per_cpu(keventd_wq, get_cpu());
> +		__queue_work(cwq, &dwork->work);
> +		put_cpu();
> +	}
> +	flush_work(&dwork->work);
> +}
> +EXPORT_SYMBOL(flush_delayed_work);
> +
> +/**

Sorry if I'm missing the point. Doesn't this have a (possible) race with
schedule_delayed_work() (i.e. by the tty writer)?

             cpu0                                      cpu1

    if (del_timer(&dwork->timer)) {
                                            // cpu0 doesn't set _PENDING
                                            schedule_delayed_work()
        cwq = wq_per_cpu();
        __queue_work(cwq, &dwork->work);
        put_cpu();
    }
                                            // run timer
                                            delayed_work_timer_fn()
                                                __queue_work()
                                                    list_add_tail()
                                                // re-add without list_del(),
                                                // so this will break the list?
    flush_work(&dwork->work);

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 248+ messages in thread

* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-15 17:38                                       ` OGAWA Hirofumi
@ 2009-10-15 19:00                                         ` Oleg Nesterov
  2009-10-15 21:49                                         ` Linus Torvalds
  1 sibling, 0 replies; 248+ messages in thread
From: Oleg Nesterov @ 2009-10-15 19:00 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Linus Torvalds, Alan Cox, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson

On 10/16, OGAWA Hirofumi wrote:
>
> > +void flush_delayed_work(struct delayed_work *dwork)
> > +{
> > +	if (del_timer(&dwork->timer)) {
> > +		struct cpu_workqueue_struct *cwq;
> > +		cwq = wq_per_cpu(keventd_wq, get_cpu());
> > +		__queue_work(cwq, &dwork->work);
> > +		put_cpu();
> > +	}
> > +	flush_work(&dwork->work);
> > +}
> > +EXPORT_SYMBOL(flush_delayed_work);
> > +
> > +/**
>
> Sorry if I'm missing the point. Doesn't this have (possible) race with
> schedule_delayed_work() (i.e. by tty writer)?
>
>              cpu0                                      cpu1
>
>     if (del_timer(&dwork->timer)) {

If dwork->timer is pending - _PENDING must be set.
If del_timer() succeeds, nobody else can clear this bit.

>                                             // cpu0 doesn't set _PENDING
>                                             schedule_delayed_work()

and in this case schedule_delayed_work()->queue_delayed_work_on()
can't succeed because it does test_and_set_bit(_PENDING).
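
The ownership argument above can be checked with a small single-threaded
model. This is a sketch under stated assumptions, not the kernel code: the
model_* names are hypothetical, a C11 atomic_flag stands in for the _PENDING
bit, a plain bool stands in for the timer, and the "concurrent" call is simply
placed at the point in the sequence where the race was suspected.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_dwork {
	atomic_flag pending;	/* models WORK_STRUCT_PENDING */
	bool timer_armed;	/* models a pending dwork->timer */
	int queued;		/* how many times the work sits on a queue */
};

/* Models schedule_delayed_work(): only the caller that sets PENDING arms the timer. */
static int model_schedule_delayed(struct model_dwork *d)
{
	if (atomic_flag_test_and_set(&d->pending))
		return 0;	/* already pending, nothing is armed or queued */
	d->timer_armed = true;
	return 1;
}

/* Models del_timer(): returns 1 only if it deactivated an armed timer. */
static int model_del_timer(struct model_dwork *d)
{
	if (!d->timer_armed)
		return 0;
	d->timer_armed = false;
	return 1;
}

/* Models the worker taking the item: PENDING is cleared before the handler runs. */
static void model_run(struct model_dwork *d)
{
	d->queued--;
	atomic_flag_clear(&d->pending);
}

int run_pending_example(void)
{
	struct model_dwork d = { ATOMIC_FLAG_INIT, false, 0 };

	assert(model_schedule_delayed(&d) == 1);	/* sets PENDING, arms timer */
	assert(model_schedule_delayed(&d) == 0);	/* second caller backs off */

	/* Flush path: del_timer() succeeds, PENDING is still set ... */
	assert(model_del_timer(&d) == 1);
	/* ... so a writer arriving in this window cannot queue a second time. */
	assert(model_schedule_delayed(&d) == 0);
	d.queued++;					/* the flusher queues the work */
	assert(d.queued == 1);

	model_run(&d);
	assert(d.queued == 0);
	assert(model_schedule_delayed(&d) == 1);	/* free to be scheduled again */
	return 0;
}
```

In this model the work is never on a list twice, which is the property the
diagram in the quoted message was worried about.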


But. Since this helper was merged, I think it should use del_timer_sync()
to be correct. Yes, it is slower, but otherwise flush is racy.

And I think it should return a boolean to match flush_work(). IOW,

	int flush_delayed_work(struct delayed_work *dwork)
	{
		int requeued = false;

		if (del_timer(&dwork->timer)) {
			struct cpu_workqueue_struct *cwq;
			cwq = wq_per_cpu(keventd_wq, get_cpu());
			__queue_work(cwq, &dwork->work);
			put_cpu();

			requeued = true;
		}

		return flush_work(&dwork->work) || requeued;
	}

Not that I think this is terribly important, but still.

I'll send the patch.

Oleg.



* Re: [Bug #13948] ath5k broken after suspend-to-ram
@ 2009-10-15 21:38         ` Johannes Stezenbach
  0 siblings, 0 replies; 248+ messages in thread
From: Johannes Stezenbach @ 2009-10-15 21:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bob Copeland, Linux Kernel Mailing List, Kernel Testers List,
	Nick Kossifidis

On Mon, Oct 12, 2009 at 11:24:29PM +0200, Rafael J. Wysocki wrote:
> On Monday 12 October 2009, Bob Copeland wrote:
> > On Sun, Oct 11, 2009 at 7:01 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13948
> > > Subject         : ath5k broken after suspend-to-ram
> > > Submitter       : Johannes Stezenbach <js@sig21.net>
> > > Date            : 2009-08-07 21:51 (66 days old)
> > > References      : http://marc.info/?l=linux-kernel&m=124968192727854&w=4
> > > Handled-By      : Nick Kossifidis <mickflemm@gmail.com>
> > > Patch           : http://patchwork.kernel.org/patch/38550/
> > 
> > This patch was included in 2.6.31.2, so I believe this can go.
> 
> Thanks, closing.

Sorry for not responding earlier.  I now installed
2.6.31.4 on my old Thinkpad T42p and can confirm that
the issue is resolved.

Thanks
Johannes


* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-15 17:38                                       ` OGAWA Hirofumi
  2009-10-15 19:00                                         ` Oleg Nesterov
@ 2009-10-15 21:49                                         ` Linus Torvalds
  2009-10-15 22:29                                           ` OGAWA Hirofumi
  1 sibling, 1 reply; 248+ messages in thread
From: Linus Torvalds @ 2009-10-15 21:49 UTC (permalink / raw)
  To: OGAWA Hirofumi
  Cc: Alan Cox, Oleg Nesterov, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson



On Fri, 16 Oct 2009, OGAWA Hirofumi wrote:
> 
> I.e. the following or something,
> 
> static inline int input_available_p(struct tty_struct *tty, int amt)
> {
> 	int try = 0;
> 
> retry:
> 	if (tty->icanon) {
> 		if (tty->canon_data)
> 			return 1;
> 	} else if (tty->read_cnt >= (amt ? amt : 1))
> 		return 1;
> 
> 	if (!try) {
> 		tty_flush_to_ldisc(tty);
> 		try = 1;
> 		goto retry;
> 	}
> 
> 	return 0;
> }

Yeah, we could do that. Especially if we ever see this in any profiles. I 
doubt we do, but..

> Sorry if I'm missing the point. Doesn't this have (possible) race with
> schedule_delayed_work() (i.e. by tty writer)?
> 
>              cpu0                                      cpu1
> 
>     if (del_timer(&dwork->timer)) {
>                                             // cpu0 doesn't set _PENDING
>                                             schedule_delayed_work()

We don't care.

We want to make sure that a writer that wrote the data strictly _before_ 
the reader is reading will always have the data show up.

But if the writer is exactly concurrent with the reader, it's fine to not 
see the data. Because at that point, we will rely not on the tty buffers, 
but on the writer doing a tty_wakeup() to notify us that there is new 
data.
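
A small user-space sketch of that ordering rule, under stated
assumptions: the model_* names and the two counters are hypothetical
stand-ins, not the real tty structures. Data written strictly before the
read sits in "buffered" and becomes visible after one flush. Data from a
concurrent writer is simply not there yet, and the reader reports "no
input" and would wait for the wakeup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tty (illustration only). */
struct model_tty {
	size_t buffered;	/* bytes still in the modelled tty buffers */
	size_t read_cnt;	/* bytes already visible to the reader */
};

/* Models tty_flush_to_ldisc(): push everything buffered to the reader. */
static void model_flush_to_ldisc(struct model_tty *tty)
{
	tty->read_cnt += tty->buffered;
	tty->buffered = 0;
}

/* Cheap check first; flush and re-check only when that check fails. */
static bool model_input_available(struct model_tty *tty, size_t amt)
{
	size_t want = amt ? amt : 1;

	if (tty->read_cnt >= want)
		return true;

	model_flush_to_ldisc(tty);
	assert(tty->buffered == 0);	/* nothing written earlier is left behind */

	/* Still short: the caller would sleep until the writer's wakeup. */
	return tty->read_cnt >= want;
}
```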

			Linus


* Re: [Bug #14388] keyboard under X with 2.6.31
  2009-10-15 21:49                                         ` Linus Torvalds
@ 2009-10-15 22:29                                           ` OGAWA Hirofumi
  0 siblings, 0 replies; 248+ messages in thread
From: OGAWA Hirofumi @ 2009-10-15 22:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Oleg Nesterov, Paul Fulghum, Boyan, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Dmitry Torokhov,
	Ed Tomlinson

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Fri, 16 Oct 2009, OGAWA Hirofumi wrote:
>> 
>> I.e. the following or something,
>> 
>> static inline int input_available_p(struct tty_struct *tty, int amt)
>> {
>> 	int try = 0;
>> 
>> retry:
>> 	if (tty->icanon) {
>> 		if (tty->canon_data)
>> 			return 1;
>> 	} else if (tty->read_cnt >= (amt ? amt : 1))
>> 		return 1;
>> 
>> 	if (!try) {
>> 		tty_flush_to_ldisc(tty);
>> 		try = 1;
>> 		goto retry;
>> 	}
>> 
>> 	return 0;
>> }
>
> Yeah, we could do that. Especially if we ever see this in any profiles. I 
> doubt we do, but..

Yes.  Or, FWIW, I was previously thinking of removing
schedule_delayed_work() for n_tty by means of a flag or something.  I.e.
disable the background flush_to_ldisc() done by the writer for n_tty, so
that only n_tty_read() checks tty.buf synchronously.

So, with that, the unnecessary flush_to_ldisc() would be removed
completely... Well...

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>


* Re: [Bug #14388] keyboard under X with 2.6.31
@ 2009-10-17 16:40             ` Pavel Machek
  0 siblings, 0 replies; 248+ messages in thread
From: Pavel Machek @ 2009-10-17 16:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Nix, Alan Cox, Paul Fulghum, Justin P. Mattock,
	Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Boyan, Dmitry Torokhov, Ed Tomlinson,
	Frédéric L. W. Meunier, OGAWA Hirofumi

Hi!

> Comments? Does this work? Does it make any difference? It seems fairly 
> unlikely, but it's the only obvious problem I've seen in the tty buffering 
> code so far.
> 
> And that code is literally 3 years old, and it seems unlikely that a 
> regular _keyboard_ buffer would be able to hit the (rather small) race 
> condition. But other serialization may have hidden it, and timing 
> differences could certainly have caused it to trigger much more easily.

I use this (run as root) to trigger various problems in this
area... (portable between i386 and x86-64).

								Pavel

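#include <sys/io.h>	/* iopl() */
#include <unistd.h>	/* sleep() */

int
main(void)
{
	int i;

	/* Raise the I/O privilege level so cli/sti are allowed in user space. */
	if (iopl(3))
		return 1;
	while (1) {
		asm volatile("cli");	/* interrupts off */
		//		for (i=0; i<20000000; i++)
		for (i=0; i<1000000000; i++)
			asm volatile("");	/* keep the busy loop from being optimized away */
		asm volatile("sti");	/* interrupts back on */
		sleep(1);
	}
}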


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* [Bug #14143] OOPS when setting nr_requests for md devices
  2009-10-01 19:53 2.6.32-rc1-git2: " Rafael J. Wysocki
@ 2009-10-01 19:55 ` Rafael J. Wysocki
  0 siblings, 0 replies; 248+ messages in thread
From: Rafael J. Wysocki @ 2009-10-01 19:55 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, aCaB

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.30 and 2.6.31.

The following bug entry is on the current list of known regressions
introduced between 2.6.30 and 2.6.31.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14143
Subject		: OOPS when setting nr_requests for md devices
Submitter	: aCaB <acab@clamav.net>
Date		: 2009-09-08 08:48 (24 days old)




