* Oops in 2.6.10-rc1
@ 2004-10-28 13:12 Christian
  2004-10-28 13:29 ` [Alsa-devel] " Jaroslav Kysela
  0 siblings, 1 reply; 70+ messages in thread
From: Christian @ 2004-10-28 13:12 UTC (permalink / raw)
  To: alsa-devel

[repost to alsa-devel as suggested by lkml]

hi,

yesterday i was updating to recent 2.6.10-rc1-BK and booting gives:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
dfc10ce0
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: snd_ens1371 snd_rawmidi snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc rtc
CPU: 0
EIP: 0060:[<dfc10ce0>] Not tainted VLI
EFLAGS: 00010282 (2.6.10-rc1)
EIP is at 0xdfc10ce0
eax: 00000000 ebx: dff1f800 ecx: dfc10ce0 edx: dff1f9c4
esi: ffffffed edi: dff1f800 ebp: dff1f800 esp: de613e50
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 186, threadinfo=de612000 task=deb5e5a0)
Stack: c01fc7b8 dff1f800 000007ff dff1f800 c01fc7ef dff1f800 000007ff dfc1e400
       e082729d dff1f800 dfc1e400 00000000 e08469cf dfc1e400 000001f8 000000d0
       c01667f7 de36da8c c0171759 dffe79e0 dfc1e400 ffffffed dff1f800 dff1f800
Call Trace:
 [<c01fc7b8>] pci_enable_device_bars+0x28/0x40
 [<c01fc7ef>] pci_enable_device+0x1f/0x40
 [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371]
 [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd]
 [<c01667f7>] __lookup_hash+0xa7/0xe0
 [<c0171759>] alloc_inode+0x129/0x150
 [<e0827867>] snd_audiopci_probe+0x87/0x1e0 [snd_ens1371]
 [<c016f6c2>] dput+0x92/0x250
 [<c01fd202>] pci_device_probe_static+0x52/0x70
 [<c01fd24c>] __pci_device_probe+0x2c/0x30
 [<c01fd27c>] pci_device_probe+0x2c/0x60
 [<c025adff>] bus_match+0x3f/0x80
 [<c025af52>] driver_attach+0x52/0xa0
 [<c025b478>] bus_add_driver+0x98/0xe0
 [<c025ba8f>] driver_register+0x2f/0x40
 [<c01fd530>] pci_register_driver+0x40/0x50
 [<e08279cf>] alsa_card_ens137x_init+0xf/0x13 [snd_ens1371]
 [<c01341ba>] sys_init_module+0x18a/0x270
 [<c01041fb>] syscall_call+0x7/0xb
Code: 5f 64 65 76 38 62 00 00 00 00 00 00 00 00 00 02 00 00 00 88 0c c1 df 08 0d c1 df 10 fa 3a c0 00 fa 3a c0 00 00 00 00 6c 5a c1 df <0a> 00 00 00 36 46 37 46 00 00 00 00 f0 0c c1 df 69 6e 74 31 33

full dmesg output here:
www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg.txt

updating to an even more recent (read: updated now) does not help and the
problem is really triggered when loading snd_ens1371. well, the only
"problem" is the oops and i have no sound :-(

just strange that nobody else cries out loud. or am i just lacking
enough information? ok, this is debian/unstable (i386), gcc3.4.2,
libc2.3.2, pls tell me if you need more information.

thank you,
Christian.
--
BOFH excuse #374:

It's the InterNIC's fault.

-------------------------------------------------------
This SF.Net email is sponsored by:
Sybase ASE Linux Express Edition - download now for FREE
LinuxWorld Reader's Choice Award Winner for best database on Linux.
http://ads.osdn.com/?ad_id=5588&alloc_id=12065&op=click

^ permalink raw reply [flat|nested] 70+ messages in thread
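A quick sanity check on a "Code:" line like the one in this oops is to see whether the bytes around the faulting EIP contain readable ASCII. The sketch below is only a heuristic and not part of any kernel tooling; the helper names `oops_code_bytes` and `printable_runs` are made up for illustration. Finding string fragments there would be consistent with the CPU having been sent to an address that holds data rather than instructions, but it does not prove where the bad jump came from.

```python
import re

# "Code:" bytes copied from the oops above; <0a> marks the byte at the faulting EIP.
CODE_LINE = (
    "5f 64 65 76 38 62 00 00 00 00 00 00 00 00 00 02 00 00 00 88 0c c1 df "
    "08 0d c1 df 10 fa 3a c0 00 fa 3a c0 00 00 00 00 6c 5a c1 df <0a> 00 00 "
    "00 36 46 37 46 00 00 00 00 f0 0c c1 df 69 6e 74 31 33"
)


def oops_code_bytes(code_line):
    """Turn the hex dump into bytes, dropping the <..> marker around the EIP byte."""
    tokens = code_line.replace("<", "").replace(">", "").split()
    return bytes(int(tok, 16) for tok in tokens)


def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    print(printable_runs(oops_code_bytes(CODE_LINE)))
```

For the dump above this prints `['_dev8b', '6F7F', 'int13']`, so the memory at the faulting address looks more like name strings than like machine code.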
* Re: [Alsa-devel] Oops in 2.6.10-rc1 2004-10-28 13:12 Oops in 2.6.10-rc1 Christian @ 2004-10-28 13:29 ` Jaroslav Kysela 2004-10-28 14:09 ` Christian 0 siblings, 1 reply; 70+ messages in thread From: Jaroslav Kysela @ 2004-10-28 13:29 UTC (permalink / raw) To: Christian; +Cc: alsa-devel, LKML On Thu, 28 Oct 2004, Christian wrote: > [<c01fc7b8>] pci_enable_device_bars+0x28/0x40 > [<c01fc7ef>] pci_enable_device+0x1f/0x40 > [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371] > [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd] It's a bit dead-lock, because we cannot help you. It seems that the pci structure passed to our code is broken. The driver has had no changes in initialization for a long time. Jaroslav ----- Jaroslav Kysela <perex@suse.cz> Linux Kernel Sound Maintainer ALSA Project, SUSE Labs ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [Alsa-devel] Oops in 2.6.10-rc1 2004-10-28 13:29 ` [Alsa-devel] " Jaroslav Kysela @ 2004-10-28 14:09 ` Christian 2004-11-04 15:16 ` Christian Kujau 0 siblings, 1 reply; 70+ messages in thread From: Christian @ 2004-10-28 14:09 UTC (permalink / raw) To: LKML; +Cc: alsa-devel Jaroslav Kysela wrote: > On Thu, 28 Oct 2004, Christian wrote: > > >> [<c01fc7b8>] pci_enable_device_bars+0x28/0x40 >> [<c01fc7ef>] pci_enable_device+0x1f/0x40 >> [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371] >> [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd] > > > It's a bit dead-lock, because we cannot help you. It seems that > the pci structure passed to our code is broken. The driver has had > no changes in initialization for a long time. so, it's a kernel problem again, not related to the alsa framework? i see in http://www.kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.10-rc1 [...] <rddunlap@osdl.org> [PATCH] i386/io_apic init section fixups <wli@holomorphy.com> [PATCH] vm: convert users of remap_page_range() under sound/ to use remap_pfn_range() [...] so i'll revert the patches and see what it gives. thank you, Christian -- BOFH excuse #131: telnet: Unable to connect to remote host: Connection refused ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [Alsa-devel] Oops in 2.6.10-rc1 2004-10-28 14:09 ` Christian @ 2004-11-04 15:16 ` Christian Kujau 2004-11-05 2:35 ` Christian Kujau 2004-11-07 1:24 ` Christian Kujau 0 siblings, 2 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-04 15:16 UTC (permalink / raw) To: LKML; +Cc: alsa-devel -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 hm, still no sound with snd_ens1371 but now i spent some time to find out how to revert a patch with bk. while compiling is still ongoing, let me tell you how i tried to revert the patch with bk, because i am not entirely sure if i do the right thing here: bk changes > ../changes-04-11-2004.txt as written before, i suspect (!) two changes here: > [...] > <rddunlap@osdl.org> > [PATCH] i386/io_apic init section fixups > > <wli@holomorphy.com> > [PATCH] vm: convert users of remap_page_range() under sound/ to > use remap_pfn_range() > [...] > > so i'll revert the patches and see what it gives. in ../changes-04-11-2004.txt i found out the ChangeSet numbers: 1.1988.72.76 + 1.2000.5.77. then i did bk undo -a1.1988.72.76 only to find out that i misread the manual and 1.1988.72.76 is still in place. i did bk changes > ../changes-1.1988.72.76.txt and the very patch has a different ChangeSet now: 1.2202. so i did bk undo -a1.2201 is this the right way to revert patches when subsequent patches might not allow to simply "bk undo -r<vers>" (because subsequent patches rely on this single ChangeSet)? thank you for your assistance, Christian - -- BOFH excuse #182: endothermal recalibration -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBike6+A7rjkF8z0wRAl/DAKDAMP31cXrzjBnnl+713F1zJ5ShQQCdFYRr TpRkMTwdhZq9SvoZEPR2Plw= =sm2q -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [Alsa-devel] Oops in 2.6.10-rc1 2004-11-04 15:16 ` Christian Kujau @ 2004-11-05 2:35 ` Christian Kujau 2004-11-05 11:40 ` holborn 2004-11-07 1:24 ` Christian Kujau 1 sibling, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-05 2:35 UTC (permalink / raw) To: LKML; +Cc: alsa-devel, perex -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 hi again, i *think* i found the ChangeSet leading to the bug i tried to report in http://marc.theaimsgroup.com/?l=linux-kernel&m=109888178603516&w=2 the error is still present here (and only here? strange...), the latest -BK does not fix it. i had some difficulties in telling BK to do the right thing. to summarise the error: - - upon loading of snd_ens1371 the Oops occurs. system is still stable then, but no sound available. - - this occurred somewhere between 2.6.9 (released 15-Oct-2004) and 2.6.9-10 (released 22-Oct-2004) one interesting changeset was: ChangeSet@1.2000.7.1, 2004-10-20 20:33:06+02:00, perex@suse.cz Merge suse.cz:/home/perex/bk/linux-sound/linux-2.5 into suse.cz:/home/perex/bk/linux-sound/linux-sound i tried to back it out: $ bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test but the said ChangeSet was still there (of course). i tried to back it out (now for sure): $ bk undo -a1.2010 (hm: the changesets get renumbered every time i "do" something with the tree) this one reverted quite a few ChangeSets but i let it happen. compiling & booting this thing goes fine and i am now running 2.6.9-BK(?) with working snd_ens1371. if someone could give me a hint here what to do next or perhaps tell me that the whole thing was totally pointless - please say so. i am somehow lost as to which is the right person to bug here. thank you for your time, Christian. 
- -- BOFH excuse #328: Fiber optics caused gas main leak -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBiucN+A7rjkF8z0wRAkpKAJ0bbevHqmpU/Ut3r5TbWgfu42cGBACgsrhm X8euqIjgc8KNCWl50oys/Yw= =8VM9 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-05 2:35 ` Christian Kujau @ 2004-11-05 11:40 ` holborn 0 siblings, 0 replies; 70+ messages in thread From: holborn @ 2004-11-05 11:40 UTC (permalink / raw) To: alsa-devel I use snd_ens1371 and linux-2.6.10-rc1 and works .... Josep ^ permalink raw reply [flat|nested] 70+ messages in thread
* Oops in 2.6.10-rc1 2004-11-04 15:16 ` Christian Kujau 2004-11-05 2:35 ` Christian Kujau @ 2004-11-07 1:24 ` Christian Kujau 2004-11-07 7:02 ` Linus Torvalds 2004-11-07 13:05 ` Pekka Enberg 1 sibling, 2 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-07 1:24 UTC (permalink / raw) To: LKML; +Cc: alsa-devel, perex -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 hi again, i *think* i found the ChangeSet leading to the bug i tried to report in http://marc.theaimsgroup.com/?l=linux-kernel&m=109888178603516&w=2 the error is still present here (and only here? strange...), the latest -BK does not fix it. i had some difficulties in telling BK to do the right thing. to summarise the error: - - upon loading of snd_ens1371 the Oops occurs. system is still stable then, but no sound available. - - this occurred somewhere between 2.6.9 (released 15-Oct-2004) and 2.6.9-10 (released 22-Oct-2004) one interesting changeset was: ChangeSet@1.2000.7.1, 2004-10-20 20:33:06+02:00, perex@suse.cz Merge suse.cz:/home/perex/bk/linux-sound/linux-2.5 into suse.cz:/home/perex/bk/linux-sound/linux-sound i tried to back it out: $ bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test but the said ChangeSet was still there (of course). i tried to back it out (now for sure): $ bk undo -a1.2010 (hm: the changesets get renumbered every time i "do" something with the tree) this one reverted quite a few ChangeSets but i let it happen. compiling & booting this thing goes fine and i am now running 2.6.9-BK(?) with working snd_ens1371. if someone could give me a hint here what to do next or perhaps tell me that the whole thing was totally pointless - please say so. i am somehow lost as to which is the right person to bug here. thank you for your time, Christian. 
- -- BOFH excuse #328: Fiber optics caused gas main leak -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBjXlZ+A7rjkF8z0wRAqaVAJ9ljiIpxi01SblgEg/ce/Vd/uYksQCfeuJ9 hRGA0/17ttZ83xRQDb8jfhs= =DQYp -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-07 1:24 ` Christian Kujau @ 2004-11-07 7:02 ` Linus Torvalds 2004-11-07 13:10 ` Christian Kujau 2004-11-07 13:05 ` Pekka Enberg 1 sibling, 1 reply; 70+ messages in thread From: Linus Torvalds @ 2004-11-07 7:02 UTC (permalink / raw) To: Christian Kujau; +Cc: LKML, alsa-devel, perex On Sun, 7 Nov 2004, Christian Kujau wrote: > > if someone could give me a hint here what to do next or perhaps tell me > that the whole thing was totally pointless - please say so. > i am somehow lost as to which is the right person to bug here. Since you seem to be a BK user, try doing a bk revtool sound/pci/ens1370.c and see if you can find the change that caused your problem. Of course, the real change might be somewhere else in the sound driver initialization path, so it's not like just that one file might be the cause. Regardless, the more you can pinpoint when the problem started, the better. Also, if you enable frame pointers (under kernel debugging), the traceback will look a bit better. As it is, your oops looks like something has jumped off into la-la-land by jumping through a bad pointer (the value is still in %ecx), but it's definitely not clear _where_ that happened. Your trace points to pci_enable_device_bars(), but that may well be just stale stack contents. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
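The observation that the bad pointer value is still sitting in %ecx can be checked mechanically against the register dump from the original report. The following is a minimal sketch, assuming the i386 oops layout used in this thread; `registers_matching_eip` is a hypothetical helper name, not an existing tool. A match only hints at an indirect call or jump through that register; it does not identify the caller.

```python
import re

# Register lines copied from the first oops in this thread.
OOPS_TEXT = """\
EIP is at 0xdfc10ce0
eax: 00000000 ebx: dff1f800 ecx: dfc10ce0 edx: dff1f9c4
esi: ffffffed edi: dff1f800 ebp: dff1f800 esp: de613e50
"""


def registers_matching_eip(oops):
    """Return the general-purpose registers whose value equals the faulting EIP."""
    eip = int(re.search(r"EIP is at 0x([0-9a-f]+)", oops).group(1), 16)
    regs = re.findall(
        r"\b(eax|ebx|ecx|edx|esi|edi|ebp|esp):\s*([0-9a-f]{8})", oops
    )
    return [name for name, value in regs if int(value, 16) == eip]


if __name__ == "__main__":
    print(registers_matching_eip(OOPS_TEXT))
```

On the dump above only `ecx` matches, which agrees with the reading that the jump went through a pointer held in that register.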
* Re: Oops in 2.6.10-rc1 2004-11-07 7:02 ` Linus Torvalds @ 2004-11-07 13:10 ` Christian Kujau 2004-11-07 16:02 ` Christian Kujau 0 siblings, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-07 13:10 UTC (permalink / raw) To: linux-kernel; +Cc: alsa-devel, perex On Sat, 6 Nov 2004 23:02:28 -0800 (PST), Linus Torvalds wrote > > Since you seem to be a BK user, try doing a s/BK user/BK beginner/ > > bk revtool sound/pci/ens1370.c > > and see if you can find the change that caused your problem. hm, i already found the ChangeSet (ChangeSet@1.2000.7.1), but it seems the ChangeSets get renumbered when linux makes progress. the issuer of this changeset did not comment yet. > Of course, the real change might be somewhere else in the > sound driver initialization path, so it's not like just that > one file might be the cause. Regardless, the more you can > pinpoint when the problem started, the better. yes. > > Also, if you enable frame pointers (under kernel debugging), > the traceback will look a bit better. As it is, your oops ah, ok, will do. thank you for your time, Christian. -- BOFH excuse #206: Police are examining all internet packets in the search for a narco-net-trafficker ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-07 13:10 ` Christian Kujau @ 2004-11-07 16:02 ` Christian Kujau 2004-11-07 16:57 ` Linus Torvalds 0 siblings, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-07 16:02 UTC (permalink / raw) To: evil; +Cc: linux-kernel, alsa-devel, perex, torvalds -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >> bk revtool sound/pci/ens1370.c >> >>and see if you can find the change that caused your problem. since i got this oops between 2.6.9 and 2.6.10-rc1 i am still assuming that the change was made somewere between 15-Oct-2004 (2.6.9) and 22-Oct-2004 (2.6.10-rc1). so the only Changeset matching this timespan is: - ------------------------- ChangeSet@1.2011, 2004-10-20 08:10:43-07:00, rusty@rustcorp.com.au [PATCH] module_param_array() should take a pointer module_param_array() takes a variable to put the number of elements in. Looking through the uses, many people don't care, so they declare a dummy or share one variable between several parameters. The latter is problematic because sysfs uses that number to decide how many to display. The solution is to change the variable arg to a pointer, and if the pointer is NULL, use the "max" value. This change is fairly small, but fixing up the callers is a lot of (trivial) churn. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> - ------------------------- >>Also, if you enable frame pointers (under kernel debugging), >>the traceback will look a bit better. As it is, your oops http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config the new config has this enabled: CONFIG_DEBUG_DRIVER=y CONFIG_DEBUG_INFO=y CONFIG_DEBUG_KOBJECT=y CONFIG_DEBUG_SLAB=y CONFIG_DEBUG_SPINLOCK=y CONFIG_FRAME_POINTER=y CONFIG_KPROBES=y shows the output of dmesg after doing "modprobe snd-ens1371". 
after this, snd-ens1371 seems to be loaded: Module Size Used by snd_ens1371 29928 1 snd_rawmidi 25952 1 snd_ens1371 snd_ac97_codec 77856 1 snd_ens1371 snd_pcm 101768 2 snd_ens1371,snd_ac97_codec snd_timer 31940 1 snd_pcm snd 51620 5 snd_ens1371,snd_rawmidi,snd_ac97_codec,snd_pcm,snd_timer soundcore 9440 1 snd snd_page_alloc 7620 1 snd_pcm ipv6 260480 8 psmouse 20424 0 rtc 20188 0 but is not working and cannot be unloaded: prinz:~$ rmmod snd_ens1371 ERROR: Module snd_ens1371 is in use there was an answer from the alsa-devel folks here: http://marc.theaimsgroup.com/?l=linux-kernel&m=109897024116288&w=2 "It's a bit dead-lock, because we cannot help you. It seems that the pci structure passed to our code is broken. The driver has had no changes in initialization for a long time." i hope these information will help a bit. thank you for your assistance, i really appreciate it Christian (still wondering why nobody else has this bug, 1370 is not *that* weird, i thought) PS: if someone could explain me, why the ChangeSet numbers are always different: i've used "bk revtool sound/pci/ens1370.c" to find out the changes for this file and the suspicious patch reads sound/pci/ens1370.c@1.54.1.1, 2004-10-20.... in "bk revtool". the changelog however reads: ChangeSet@1.2011, 2004-10-20 08:10:43-07:00, rusty@rustcorp.com.au - -- BOFH excuse #62: need to wrap system in aluminum foil to fix problem -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBjkcE+A7rjkF8z0wRAkR/AJ98DKSv5dZfOSJdKGWdz1LWPlItgQCgvS1A iS1wUtTgHzsx4JFpqsQGt68= =Hv9R -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-07 16:02 ` Christian Kujau @ 2004-11-07 16:57 ` Linus Torvalds 2004-11-07 18:31 ` Christian Kujau 0 siblings, 1 reply; 70+ messages in thread From: Linus Torvalds @ 2004-11-07 16:57 UTC (permalink / raw) To: Christian Kujau; +Cc: linux-kernel, alsa-devel, perex On Sun, 7 Nov 2004, Christian Kujau wrote: > > since i got this oops between 2.6.9 and 2.6.10-rc1 i am still assuming > that the change was made somewere between 15-Oct-2004 (2.6.9) and > 22-Oct-2004 (2.6.10-rc1). Not necessarily. The ALSA merge is the most likely reason for the oops, and since ALSA development does not merge with the kernel very often, it may be some much older change in the ALSA tree. You can check the ALSA tree _before_ the merge, by doing (in the current tree): bk undo -a1.2000.7.2 which should give you a tree without any of "my" stuff, ie it was what Jaroslav was working on before he merged it into the standard tree. (BK revision numbers change on merges, so the above number is not necessarily the right one unless you have the current -bk tree. It should have a changeset something like: ChangeSet@1.2000.7.2, 2004-10-20 20:51:33+02:00, perex@suse.cz Merge suse.cz:/home/perex/bk/linux-sound/linux-sound into suse.cz:/home/perex/bk/linux-sound/work so that you can double-check). > http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops.txt Yup, it's a call through a bad pointer again, and again the EIP value can be found in %ecx. But the source of the bug is not clear. The stack trace implies "show_stack()", but that function doesn't do any indirect calls, so I suspect the frame pointer didn't help in this case. And it's not "pci_enable_device()" either (which was there last time too), since that one calls "pci_enable_device_bars()" at the point it shows in the stack trace. 
Quite frankly, it looks like something smashed the stack, and the fact that it happens _around_ when "pci_enable_device()" was called makes me seriously suspect the IRQ handler for the device. That's when IRQ routing is enabled, so often the interrupts start at that point. And since FRAME_POINTER didn't make the stack frame look sane, it's very possible that the bogus call isn't due to a real "call", but due to a return from a broken stack. > there was an answer from the alsa-devel folks here: > http://marc.theaimsgroup.com/?l=linux-kernel&m=109897024116288&w=2 > > "It's a bit dead-lock, because we cannot help you. It seems that > the pci structure passed to our code is broken. The driver has had > no changes in initialization for a long time." I seriously doubt that it's the PCI structure being broken. It's the ALSA merge, almost certainly - it's just that the stack is so confused that it's hard to tell where the bug has happened. And I'll double-check the "regparm" changes, just in case. They change some irq calling conventions, although none of the involved stuff seems to be implied here. A quick suggestion: make sure that there is not some stale object file lying around confusing things about memory layout, and do a "make clean" and make sure that all old modules are clean too and re-installed. The kernel dependencies should be correct, but even then there can be problems with clocks that are off a bit etc. > (still wondering why nobody else has this bug, 1370 is not *that* weird, i > thought) Yes, that makes me suspicious, and is one reason why I wonder if it's just your tree not being built right. > PS: if someone could explain me, why the ChangeSet numbers are always > different: i've used "bk revtool sound/pci/ens1370.c" to find out the > changes for this file and the suspicious patch reads > > sound/pci/ens1370.c@1.54.1.1, 2004-10-20.... > > in "bk revtool". 
the changelog however reads: > > ChangeSet@1.2011, 2004-10-20 08:10:43-07:00, rusty@rustcorp.com.au There are different revision numbers: there's the revision number for the _file_, and there is the revision number for the _change_. Also, both (or one) of them can change when a merge occurs, since other people may have had different merge histories, and in a distributed environment the revision numbers are a lot more fluid than in CVS. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-07 16:57 ` Linus Torvalds @ 2004-11-07 18:31 ` Christian Kujau 2004-11-07 18:44 ` Linus Torvalds 2004-11-07 23:45 ` Christian Kujau 0 siblings, 2 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-07 18:31 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel, alsa-devel, perex On Sun, 7 Nov 2004 08:57:40 -0800 (PST), Linus Torvalds wrote > > You can check the ALSA tree _before_ the merge, by doing (in > the current tree): > > bk undo -a1.2000.7.2 > > which should give you a tree without any of "my" stuff, ie it > was what Jaroslav was working on before he merged it into the > standard tree. yes, i already did so, i think: http://marc.theaimsgroup.com/?l=linux-kernel&m=109979092216919&w=2 but i did it this way: bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test bk undo -a1.2010 (probably wrong, so i'll repeat it as you suggested) > (BK revision numbers change on merges, so the above number is > not necessarily the right one unless you have the current -bk aha! > A quick suggestion: make sure that there is not some stale > object file lying around confusing things about memory layout, > and do a "make clean" and make sure that all old modules are > clean too and re-installed. really: i always do "make clean", even "make mrproper" sometimes, just to be sure. and i am quite certain, that i did not forget to install the modules. but i'll keep my eyes open, yes. > The kernel dependencies should be correct, but even then there can be > problems with clocks that are off a bit etc. i'm updating via "ntpdate" on every boot. i am even using a (faster) 2nd machine for my build and the bk things right now: building a current -bk on both hosts gives me this error. > Yes, that makes me suspicious, and is one reason why I wonder > if it's just your tree not being built right. i'll build a -bk snapshot from a tar.bz2 later on and see what it gives. 
> There are different revision numbers: there's the revision > number for the _file_, and there is the revision number for > the _change_. aha. it was kinda confusing...now i got it, i think ;) again: thank you for your time on this rainy weekend, Christian. -- BOFH excuse #8: static buildup ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-07 18:31 ` Christian Kujau @ 2004-11-07 18:44 ` Linus Torvalds 2004-11-07 23:45 ` Christian Kujau 1 sibling, 0 replies; 70+ messages in thread From: Linus Torvalds @ 2004-11-07 18:44 UTC (permalink / raw) To: Christian Kujau; +Cc: linux-kernel, alsa-devel, perex On Sun, 7 Nov 2004, Christian Kujau wrote: > On Sun, 7 Nov 2004 08:57:40 -0800 (PST), Linus Torvalds wrote > > > > You can check the ALSA tree _before_ the merge, by doing (in > > the current tree): > > > > bk undo -a1.2000.7.2 > > > > which should give you a tree without any of "my" stuff, ie it > > was what Jaroslav was working on before he merged it into the > > standard tree. > > yes, i already did so, i think: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=109979092216919&w=2 > > but i did it this way: > bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test > bk undo -a1.2010 Hmm.. That may well have worked fine, but it sounds in that post like you tried to undo the ALSA stuff, and what I suggested was really to do the reverse: take _only_ the ALSA changes, and then if it still fails, at least you have now pinpointed it a bit more (admittedly to the _likely_ source, but that's as it should be: you narrow down the "known bad" source base until you've narrowed it down to the smallest change you can find that causes the problem). > > Yes, that makes me suspicious, and is one reason why I wonder > > if it's just your tree not being built right. > > i'll build a -bk snapshot from a tar.bz2 later on and see what it gives. Sounds like you're doing everything right, but hey, it can't hurt to double-check. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
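The narrowing-down strategy described in this message (shrink the "known bad" range until the smallest offending change is left) amounts to a binary search over an ordered list of changes. Below is a sketch under simplifying assumptions: the history is treated as linear, the first entry is known good, the last is known bad, and `is_bad` stands in for the manual "build, boot, modprobe, look for the oops" step. The names `first_bad` and the `cs0`..`cs9` labels are hypothetical; real BK history contains merges, so this is an illustration of the idea rather than a recipe for `bk undo`.

```python
def first_bad(changesets, is_bad):
    """Binary-search an ordered list of changesets for the first bad one.

    Assumes every changeset before the culprit tests good and every
    changeset from the culprit onwards tests bad.
    """
    if not changesets or not is_bad(changesets[-1]):
        raise ValueError("the newest changeset must reproduce the problem")
    lo, hi = 0, len(changesets) - 1  # invariant: changesets[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(changesets[mid]):
            hi = mid        # culprit is at mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return changesets[lo]


if __name__ == "__main__":
    history = [f"cs{i}" for i in range(10)]      # hypothetical labels
    oopses = lambda cs: int(cs[2:]) >= 6         # pretend cs6 introduced the oops
    print(first_bad(history, oopses))
```

With ten candidate changes this needs about four test builds after the initial check instead of ten, which matters when each test is a full kernel compile and reboot.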
* Re: Oops in 2.6.10-rc1 2004-11-07 18:31 ` Christian Kujau @ 2004-11-07 23:45 ` Christian Kujau 2004-11-07 23:45 ` Christian Kujau 1 sibling, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-07 23:45 UTC (permalink / raw) To: linux-kernel; +Cc: Linus Torvalds, alsa-devel, linux-sound -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Christian Kujau schrieb: > On Sun, 7 Nov 2004 08:57:40 -0800 (PST), Linus Torvalds wrote > >> bk undo -a1.2000.7.2 >> >>which should give you a tree without any of "my" stuff, ie it >>was what Jaroslav was working on before he merged it into the >>standard tree. i did so from a current tree (bk pull, undo, -r get) and it's working fine (url wraps): http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops-2.6.9_a1.2000.7.2.txt so i can see with "bk changes" that the ChangeSet is still there. this is what i expected, because -a says: - -a<rev> Remove all changesets which occurred after <rev>. what i did not expect is that this ChangeSet is now *not* the culprit, because there is no oops. am i right? [1] >>Yes, that makes me suspicious, and is one reason why I wonder >>if it's just your tree not being built right. > > i'll build a -bk snapshot from a tar.bz2 later on and see what it gives. i've build from linux-2.6.10-rc1.tar.bz2 with patch-2.6.10-rc1-bk17.bz2 from kernel.org with the same .config and "modprobe snd-ens1371" oopses as expected :( > Hmm.. That may well have worked fine, but it sounds in that post like > you tried to undo the ALSA stuff, and what I suggested was really to > do the reverse: take _only_ the ALSA changes, and then if it still yes, i wanted to undo the alsa changes because i suspected the alsa framework (sorry guys) and wanted to see if it still oopses when the latest alsa patch was not appied. i did another thing: i enabled the (deprecated) OSS driver (es1371.ko) tried to load this thing: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-OSS.txt it oopses. 
- - you said it's not a b0rken pci thingy - - i have to assume now that it's not an ALSA issue (since oss oopses too) - - it is OSS? the driver? i've CC'ed linux-sound... > fails, at least you have now pinpointed it a bit more (admittedly to > the _likely_ source, but that's as it should be: you narrow down the > "known bad" source base until you've narrowed it down to the smallest > change you can find that causes the problem). yes, like Documentation/BUG-HUNTING says. but i seem to have difficulties in using my tools (bk). sorry for that. > Sounds like you're doing everything right, but hey, it can't hurt to > double-check. yes, i really hope that it's not just a user error (on my side). building kernels since 2.0...but you never know... thanks again for help, Christian (whose only wish these days is to get over this strange thing and not wasting peoples precious time with a "sound driver". hey, at least the box is booting...) - -- BOFH excuse #224: Jan 9 16:41:27 huber su: 'su root' succeeded for .... on /dev/pts/1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBjrOp+A7rjkF8z0wRAl59AKCEbRRzsGujcOlLUA74taFZJb8H0ACfUUxQ nVQHjBXRBBn9BgSs7cLhTlY= =wb90 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-07 23:45 ` Christian Kujau @ 2004-11-08 1:16 ` Linus Torvalds -1 siblings, 0 replies; 70+ messages in thread From: Linus Torvalds @ 2004-11-08 1:16 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List, alsa-devel, linux-sound, Greg KH On Mon, 8 Nov 2004, Christian Kujau wrote: > > what i did not expect is that this ChangeSet is now *not* the culprit, > because there is no oops. am i right? [1] Yes. So now I'd like to know _where_ the culprit is, since it turned out to be not the ALSA code. > i did another thing: i enabled the (deprecated) OSS driver (es1371.ko) > tried to load this thing: > > http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-OSS.txt > > it oopses. > - you said it's not a b0rken pci thingy > - i have to assume now that it's not an ALSA issue (since oss oopses too) > - it is OSS? the driver? i've CC'ed linux-sound... Sounds like something else changed, and likely the ALSA _and_ the OSS driver both broke. Which is not all that unlikely, since I suspect they share a lot of history. > yes, like Documentation/BUG-HUNTING says. but i seem to have difficulties > in using my tools (bk). sorry for that. Not your fault. Think of this as a learning experience ;) Anyway, now that the _other_ driver also oopses, and with a very similar oops too, so it looks like they both depended on some undocumented (or changed) detail in the PCI layer. Next step would be to see if the thing that breaks is this merge: ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux which merges Greg's PCI/driver model changes. It's all the same steps you took with the ALSA merge, you're a professional by now ;) Greg, have you followed this thread? > (whose only wish these days is to get over this strange thing and not > wasting peoples precious time with a "sound driver". hey, at least the > box is booting...) 
Hey, sound is important. And especially if you somehow found something non-sound that just broke sound by mistake, all the more important to fix it. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-08 1:16 ` Linus Torvalds @ 2004-11-08 13:01 ` Christian Kujau -1 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-08 13:01 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Linus Torvalds, alsa-devel, linux-sound, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Linus Torvalds wrote: > > Not your fault. Think of this as a learning experience ;) it definitely is, yes. > Anyway, now that the _other_ driver also oopses, and with a very similar > oops too, so it looks like they both depended on some undocumented (or > changed) detail in the PCI layer. Next step would be to see if the thing > that breaks is this merge: may i ask how you come to this conclusion? by technical knowledge or could this be deduced by some bk magic too? > > ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org > Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6 > into ppc970.osdl.org:/home/torvalds/v2.6/linux > > which merges Greg's PCI/driver model changes. > > It's all the same steps you took with the ALSA merge, you're a > professional by now ;) i did "bk undo -a1.2463" from a current -BK tree and it oopses: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-a1.2463.txt (i've booted with different boot options this time, because i noticed that i always booted with "acpi=force". changing this did not help either.) next i wanted to do "bk undo -r1.2463" now to see if it does *not* break without this ChangeSet (because i already know it *breaks* with this ChangeSet) but that would leave some parentless child deltas. i read in the BK docs that "bk cset -x<version>" would help here.
but "bk cset -x1.2463" aborts: - --------------------- evil@atlant:~/kernel/linux-2.6-BK$ bk changes | head -n3 ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux evil@atlant:~/kernel/linux-2.6-BK$ bk cset -x1.2463 cset: Merge cset found in revision list: (1.2463). Aborting. (cset1) - --------------------- i've put everything on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/ the .configs, the oopses are there. i've double checked a kernel built from "bk undo -a1.2000.7.2" yesterday but the result was the same (no oops) thank you, Christian. - -- BOFH excuse #121: halon system went off and killed the operators. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBj24z+A7rjkF8z0wRAu0tAJ9g7mfG0iz/LvSAafD7LWKNu9qvLQCg3fjW 1oMRRK8oSqH5oZsudyIQVtw= =f8CQ -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-08 13:01 ` Christian Kujau @ 2004-11-08 18:13 ` Linus Torvalds -1 siblings, 0 replies; 70+ messages in thread From: Linus Torvalds @ 2004-11-08 18:13 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List, alsa-devel, linux-sound, Greg KH On Mon, 8 Nov 2004, Christian Kujau wrote: > > > Anyway, now that the _other_ driver also oopses, and with a very similar > > oops too, so it looks like they both depended on some undocumented (or > > changed) detail in the PCI layer. Next step would be to see if the thing > > that breaks is this merge: > > may i ask how you come to this conclusion? by technical knowledge or could > this be deduced by some bk magic too? No, just gut feel. If the pre-merge ALSA works, and the post-merge one doesn't, and the oops in both cases happen somewhere close to where it does "pci_enable_device()", there's not a lot left. There are interrupts, and there is the PCI layer... > > ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org > > Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6 > > into ppc970.osdl.org:/home/torvalds/v2.6/linux > > > > which merges Greg's PCI/driver model changes. > > > > It's all the same steps you took with the ALSA merge, you're a > > professional by now ;) > > i did "bk undo -a1.2463" from a current -BK tree and it oopses: Note that "bk undo -axxx" will _leave_ xxx in place, and undo everything after. So what you did still has the merge in the tree, and that it still oopses is thus to be expected. BUT, we're getting closer. > next i wanted to do "bk undo -r1.2463" now to see if it does *not* break > without this ChangeSet (because i already know it *breaks* with this > ChangeSet) but that would leave some parentless child deltas. i read in > the BK docs that "bk cset -x<version>" would help here. but "bk cset > - -x1.2463" aborts: "cset -x" only works on patches, not on complex operations. 
You still want "bk undo", but you want to use "bk revtool" to see what the merge point was, and tell _which_ of the merged top-of-trees you want to get to. In other words, you can't just undo a merge, you need to tell which _way_ to undo it. See? It does actually make sense, and "bk revtool" will show you the relationships of merges (at least if the time range is big enough to show enough info). Anyway, if you have the top-of-tree-is-1.2463, then go to "bk revtool", and select that node in the graph by clicking on it. Notice how those edges turned white, and you can now easily see which children were pre-merge. In this case, the top-of-tree tree _without_ the PCI merge is 1.2462: ChangeSet@1.2462, 2004-11-04 17:06:13-08:00, torvalds@ppc970.osdl.org Merge bk://kernel.bkbits.net/gregkh/linux/usb-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux (you won't see it in "bk changes", since it's a trivial merge: use "bk changes -a" to see it). So just before I merged Greg's PCI changes, I merged his USB changes. Now, that's fine - the USB merge is likely to be ok, so try doing bk undo -a1.2462 and you will now have a tree that is exactly the same as before, except it does _not_ have the PCI merge from Greg. And if this one does not oops, you can now officially blame Greg. Now, if you want to get _really_ fancy, you can now look at each changeset that differed, with something like bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - which is black magic that does a set operation and shows all the changes in between the sets of "bk at 1.2462" and "bk at 1.2463". (This is _not_ the same as "bk changes -r1.2462..1.2463", because that one just shows the single merge change that is on the direct _path_ from one changeset to another. The black magic thing shows the set difference of changesets that comes from the full graph at two points). Then you can look at each change individually and see if they matter.
And once you can do the set operations, you're officially a BK poweruser. Me, I just have a script, I'm a BK dabbler. Looking at the list (appended), I don't see anything obvious, but hey, if it was obvious it wouldn't have been merged in the first place. Thanks for your willingness to pursue this thing, Linus ----- <maneesh@in.ibm.com> [PATCH] sysfs: fix sysfs backing store error path confusion o sysfs_new_dirent to retrun 0 if kmalloc fails. Thanks to Milton Miller for spotting this. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <bunk@stusta.de> [PATCH] small sysfs cleanups The patch below does the following cleanups for the sysfs code: - remove the unused global function sysfs_mknod - make some structs and functions static Please check whether this patch is correct, or whether some of the things I made static should be used globally in the forseeable future. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <kay.sievers@vrfy.org> [PATCH] add the physical device and the bus to the hotplug environment Add the sysfs path of the physical device to the hotplug event of class and block devices. This should solve the userspace issue not to know if the device is a virtual one and the "device" symlink will never be created, but we sit there and wait for it to show up not knowing when we should give up. Also the bus name is added to the hotplug event, so we don't need to reverse lookup in the /sys/bus/* directory which bus our physical device belongs to. This is e.g. the value matched against the BUS= key, that may be used in an udev rule. 
This is a PCI network card: ACTION=add SUBSYSTEM=net DEVPATH=/class/net/eth0 PHYSDEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:02:01.0 PHYSDEVBUS=pci INTERFACE=eth0 SEQNUM=827 PATH=/sbin:/bin:/usr/sbin:/usr/bin HOME=/ This is a IDE CDROM: ACTION=add SUBSYSTEM=block DEVPATH=/block/hdc PHYSDEVPATH=/devices/pci0000:00/0000:00:1f.1/ide1/1.0 PHYSDEVBUS=ide SEQNUM=1017 PATH=/sbin:/bin:/usr/sbin:/usr/bin HOME=/ This is an USB-stick partition: ACTION=add SUBSYSTEM=block DEVPATH=/block/sda/sda1 PHYSDEVPATH=/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1:1.0/host1/target1:0:0/1:0:0:0 PHYSDEVBUS=scsi SEQNUM=1032 PATH=/sbin:/bin:/usr/sbin:/usr/bin HOME=/ Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <tj@home-tj.org> [PATCH] driver-model: comment fix in bus.c df_01_driver_attach_comment_fix.patch bus_match() was renamed to driver_probe_device() but the comment for device_attach() wasn't updated. This patch updates it. Signed-off-by: Tejun Heo <tj@home-tj.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <tj@home-tj.org> [PATCH] driver-model: bus_recan_devices() locking fix df_02_bus_rescan_devcies_fix.patch bus_rescan_devices() eventually calls device_attach() and thus requires write locking the corresponding bus. The original code just called bus_for_each_dev() which only read locks the bus. This patch separates __bus_for_each_dev() and __bus_for_each_drv(), which don't do locking themselves, out from the original functions and call them with read lock in the original functions and with write lock in bus_rescan_devices(). Signed-off-by: Tejun Heo <tj@home-tj.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <tj@home-tj.org> [PATCH] driver-model: sysfs_release() dangling pointer reference fix df_03_sysfs_release_fix.patch Some attributes are allocated dynamically (e.g. module and device parameters) and are usually deallocated when the assoicated kobject is released. 
So, it's not safe to access attr after putting the kobject. Signed-off-by: Tejun Heo <tj@home-tj.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <tj@home-tj.org> [PATCH] driver-model: kobject_add() error path reference counting fix df_04_kobject_add_ref_fix.patch In kobject_add(), @kobj wasn't put'd properly on error path. This patch fixes it. Signed-off-by: Tejun Heo <tj@home-tj.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <tj@home-tj.org> [PATCH] driver-model: device_add() error path reference counting fix df_05_device_add_ref_fix.patch In device_add(), @dev wan't put'd properly when it has zero length bus_id (error path). Fixed. Signed-off-by: Tejun Heo <tj@home-tj.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <greg@kroah.com> kevent: fix build error if CONFIG_KOBJECT_UEVENT is not selected. Thanks to Serge Hallyn <serue@us.ibm.com> for pointing this out. Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <rml@novell.com> [PATCH] kobject_uevent: fix init ordering Looks like kobject_uevent_init is executed before netlink_proto_init and consequently always fails. Not cool. Attached patch switches the initialization over from core_initcall (init level 1) to postcore_initcall (init level 2). Netlink's initialization is done in core_initcall, so this should fix the problem. We should be fine waiting until postcore_initcall. Also a couple white space changes mixed in, because I am anal. Signed-Off-By: Robert Love <rml@novell.com> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <rml@novell.com> [PATCH] kobject_uevent: add MAINTAINER entry Attached patch adds a MAINTAINER entry for the kernel event layer. Signed-Off-By: Robert Love <rml@novell.com> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <greg@kroah.com> Merge kroah.com:/home/greg/linux/BK/bleed-2.6 into kroah.com:/home/greg/linux/BK/driver-2.6 <maneesh@in.ibm.com> [PATCH] fix kernel BUG at fs/sysfs/dir.c:20! 
On Thu, Nov 04, 2004 at 12:52:38PM -0800, Greg KH wrote: > Hi, > > I get the following BUG in the sysfs code when I do: > - plug in a usb-serial device. > - open the port with 'cat /dev/ttyUSB0' > - unplug the device. > - stop the 'cat' process with control-C > > This used to work just fine before your big sysfs changes. There is a similar problem reported by s390 people where we see parent kobject (directory) going away before child kobject (sub-directory). It seems kobject code is able to handle this, but not the sysfs. What could be happening that in sysfs_remove_dir() of parent directory, we try to remove its contents. It works well with the regular files as it is the final removal for sysfs_dirent corresponding to the files. But in case of sub-directory we are doing an extra sysfs_put(). Once while removing parent and the other one being the one from when sysfs_remove_dir() is called for the child. The following patch worked for the s390 people, I hope same will work in this case also. o Do not remove sysfs_dirents corresponding to the sub-directory in sysfs_remove_dir(). They will be removed in the sysfs_remove_dir() call for the specific sub-directory. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> <torvalds@ppc970.osdl.org> Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux ^ permalink raw reply [flat|nested] 70+ messages in thread
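[Editor's illustration] The two ideas in the message above ("undo -a<rev>" leaves <rev> itself in the tree, and a set difference over the full graph shows more than the single merge changeset on the direct path) can be modelled with a toy changeset graph. This is only a sketch under stated assumptions: it does not call BitKeeper, the graph shape is invented, and every revision name other than 1.2462 and 1.2463 ("base", "drv.1", "drv.2", "later") is hypothetical.

```python
# Toy changeset graph: each revision maps to its list of parents.
# Only 1.2462 and 1.2463 come from the thread; "base", "drv.1", "drv.2"
# and "later" are hypothetical names made up for this illustration.
PARENTS = {
    "base": [],
    "1.2462": ["base"],              # tip before the driver merge
    "drv.1": ["base"],               # hypothetical changeset in the merged driver tree
    "drv.2": ["drv.1"],              # hypothetical
    "1.2463": ["1.2462", "drv.2"],   # the merge changeset
    "later": ["1.2463"],             # hypothetical changeset after the merge
}


def ancestors(rev, parents=PARENTS):
    """Return rev itself plus every changeset reachable through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def tree_after_undo(rev, parents=PARENTS):
    """Model of undoing everything after rev: rev itself stays in the tree."""
    return ancestors(rev, parents)


def set_difference(old, new, parents=PARENTS):
    """Changesets present in the tree at new but not in the tree at old."""
    return ancestors(new, parents) - ancestors(old, parents)


if __name__ == "__main__":
    # Undoing after the merge keeps the merge and everything it pulled in.
    print(sorted(tree_after_undo("1.2463")))
    # Undoing after the pre-merge tip drops the merged driver changesets.
    print(sorted(tree_after_undo("1.2462")))
    # The set difference lists the merge plus the changesets it brought in.
    print(sorted(set_difference("1.2462", "1.2463")))
```

In this model the tree left by undoing after 1.2463 still contains the driver changesets, which matches the observation that it still oopses, while the set difference between the two tips is the list of candidates to inspect one by one.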
* Re: Oops in 2.6.10-rc1 2004-11-08 18:13 ` Linus Torvalds @ 2004-11-08 20:59 ` Christian Kujau -1 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-08 20:59 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Linus Torvalds, alsa-devel, linux-sound, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Linus Torvalds wrote: > > No, just gut feel. If the pre-merge ALSA works, and the post-merge one > doesn't, and the oops in both cases happen somewhere close to where it > does "pci_enable_device()", there's not a lot left. There are interrupts, > and there is the PCI layer... yes, makes sense. >> >>i did "bk undo -a1.2463" from a current -BK tree and it oopses: > > Note that "bk undo -axxx" will _leave_ xxx in place, and undo everything > after. > > So what you did still has the merge in the tree, and that it still oopses > is thus to be expected. BUT, we're getting closer. yes, i think i understood that. that's why i wanted to revert 1.2463 too. [...] > > Now, that's fine - the USB merge is likely to be ok, so try doing > > bk undo -a1.2462 for now i appreciate your work here but i have to postpone the "bk revtool" stuff because i have no X _and_ bk here. (but i'm a good student and will do my homework) > and you will now have a tree that is exactly the same as before, except it > does _not_ have the PCI merge from Greg. > > And if this one does not oops, you can now officially blame Greg. i can't wait... ;) >> Now, if you want to get _really_ fancy, you can now look at each changeset > that differed, with something like > > bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - > > which is black magic that does a set operation and shows all the changes > in between the sets of "bk at 1.2462" and "bk at 1.2463". > > (This is _not_ the same as "bk changes -r1.2462..1.2463", because that one > just shows the single merge change that is on the direct _path_ from one > changeset to another.
The black magic thing shows the set difference of > changesets that comes from the full graph at two points). > > Then you can look at each change individually and see if they matter. will do, after the build > > And once you can do the set operations, you're officially a BK poweruser. > Me, I just have a script, I'm a BK dabbler. > > Looking at the list (appended), I don't see anything obvious, but hey, if > it was obvious it wouldn't have been merged in the first place. > > Thanks for your willingness to pursue this thing, hey, thanks to you and to the folks in the Cc: field to chase a bug which only _i_ encounter until now. /me is building now.... thanks, Christian. - -- BOFH excuse #111: The salesman drove over the CPU board. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBj94f+A7rjkF8z0wRAm/uAJ0eTBa20JnX+250GpFiSED4b+arQwCggSgo CO/MQ+1jeOOvb7WaJRKg7uY= =Qlt1 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-08 20:59 ` Christian Kujau (?) @ 2004-11-08 23:49 ` Christian Kujau 2004-11-09 1:05 ` Linus Torvalds 2004-11-09 1:31 ` Christian Kujau -1 siblings, 2 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-08 23:49 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Linus Torvalds, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >>>Now, that's fine - the USB merge is likely to be ok, so try doing >>> >>> bk undo -a1.2462 i did so, 1.2463 went away, building as usual - but the oops resists :( http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-a1.2462.txt > > for now i appreciate your work here but i have to postpone the the "bk > revtool" stuff because i have no X _and_ bk here. (but i'm a good student > and will do my homework) ...in progress... >>> >>> bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - >>> >>>which is black magic that does a set operation and shows all the changes >>>in between the sets of "bk at 1.2462" and "bk at 1.2463". hm, i guess this has to wait now. >>>Looking at the list (appended), I don't see anything obvious, but hey, if >>>it was obvious it wouldn't have been merged in the first place. yes, i'll look for changes regarding PCI. i've started to compile the -bk snapshots too. there i can do less wrong things. when i have the "bad" -bk snapshot i'll use "bk" itself again to find the detailed change leading to the oops. i hope to get another machine with a another es1371 tomorrow and see if the error is reproduceable. thanks, Christian. PS: i've taken linux-sound and alsa-devel from CC. - -- BOFH excuse #74: You're out of memory -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkAXx+A7rjkF8z0wRAttsAJ9sOI7FVw+Lx8rBYHusHILQvIkeJACfZWDX zMY4MtVYCCxU3y0Tb/muG5Y= =CBO/ -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-08 23:49 ` Christian Kujau @ 2004-11-09 1:05 ` Linus Torvalds 2004-11-09 1:41 ` Christian Kujau 2004-11-09 1:31 ` Christian Kujau 1 sibling, 1 reply; 70+ messages in thread From: Linus Torvalds @ 2004-11-09 1:05 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List, Greg KH On Tue, 9 Nov 2004, Christian Kujau wrote: > > >>>Looking at the list (appended), I don't see anything obvious, but hey, if > >>>it was obvious it wouldn't have been merged in the first place. > > yes, i'll look for changes regarding PCI. i've started to compile the -bk > snapshots too. there i can do less wrong things. when i have the "bad" -bk > snapshot i'll use "bk" itself again to find the detailed change leading to > the oops. Actually, looking a bit closer, I think the PCI merge we just looked at was the PCI merge that happened _after_ 2.6.10-rc1. And since 2.6.10-rc1 already oopsed for you, it shouldn't be an issue. I think the _real_ PCI merge we should have looked at is: ChangeSet@1.2000.1.7, 2004-10-19 16:59:19-07:00, torvalds@ppc970.osdl.org Merge PCI updates and in particular, that merged the PCI changes from ChangeSet@1.1988.2.81, 2004-10-19 14:48:04-07:00, greg@kroah.com PCI: fix up pci_save/restore_state in via-agp due to api change. Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> with my pre-PCI-merge tree at: ChangeSet@1.2000.1.6, 2004-10-19 15:06:19-07:00, torvalds@ppc970.osdl.org Merge bk://bart.bkbits.net/ide-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux (all of these revision numbers are relative to a pristine 2.6.10-rc1 tree: remember that they change with merges, so they may not be the same in your tree. "bk changes -a" is your friend). 
So what I'd like you to do is to take the pre-PCI-merge tree, and see if that works for you # assuming a 2.6.10-rc1 tree bk undo -a1.2000.1.6 and if that works, then try the post-PCI-merge tree: # assuming a 2.6.10-rc1 tree bk undo -a1.2000.1.7 (I just checked: the above numbers are actually valid even in the current -bk tree, so you don't have to first go to 2.6.10-rc1, you can just start from a current tree) Thanks for testing, and sorry for the confusion with the more recent PCI merge. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-09 1:05 ` Linus Torvalds @ 2004-11-09 1:41 ` Christian Kujau 0 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-09 1:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Linus Torvalds schrieb: > > So what I'd like you to do is to take the pre-PCI-merge tree, and see if > that works for you > > # assuming a 2.6.10-rc1 tree > bk undo -a1.2000.1.6 > > and if that works, then try the post-PCI-merge tree: > > # assuming a 2.6.10-rc1 tree > bk undo -a1.2000.1.7 > > (I just checked: the above numbers are actually valid even in the current > -bk tree, so you don't have to first go to 2.6.10-rc1, you can just start > from a current tree) thanks, Linus. i'll do all this tomorrow, see my other mail i just sent. i'll definitely do all this 'cause i'm really curious about this thing. (it's not even the need of sound any more. heck, i could just put in another soundcard but that'd be too easy :) > > Thanks for testing, and sorry for the confusion with the more recent PCI > merge. doh, you can't image how thankful i am for your (and the other people's!) help here. but don't waste too many cycles on this weird issue here. if it does not break for a million users out there now - why bother at all? perhaps it'll break later on but then we have the lkml-archives and someone will eventually remember this thing. but no, i don't want to discourage anyone here ;-) regards, Christian. - -- BOFH excuse #19: floating point processor overflow -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkCAs+A7rjkF8z0wRAu2pAKDBw1Cj3fFBXbtbkpfagkpgbxiK+ACcC2gn HXmcjnhFFX8vAjK0IawPQgI= =T1C6 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-08 23:49 ` Christian Kujau 2004-11-09 1:05 ` Linus Torvalds @ 2004-11-09 1:31 ` Christian Kujau 2004-11-09 7:40 ` Pekka Enberg 1 sibling, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-09 1:31 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Linus Torvalds, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 ok, i've done some other things here and built kernels from 2.6.10-rc1-bk13 and all were giving the oops: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1-bk13 http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-2.6.10-rc1-bk13.txt the config is the same config i am usually using, never gave me a headache, new options (due to new kernel version) were left to default in most cases. anyway - i've pulled again a recent tree, did "bk undo -a1.2463" again but this time i stripped down my .config (via menuconfig) to the absolute necessary things: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_a1.2463_take2 ...and it did *NOT* oops: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops-2.6.10-rc1_a1.2463.txt i'll investigate further, building former -bk snapshots, using other configs before i'll fiddle around with bk again (to get the smaller changes). but this is a tomorrow thing, real life calls in :( Thank you all so far, Christian. - -- BOFH excuse #92: Stale file handle (next time use Tupperware(tm)!) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkB3v+A7rjkF8z0wRAjU/AKCGPnfuJiBzamcRwU9hIiH+GXZNSwCgi2YK kwN9O4z/1MzWEakWX0p6IGo= =d8GA -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-09 1:31 ` Christian Kujau @ 2004-11-09 7:40 ` Pekka Enberg 2004-11-09 12:33 ` Christian Kujau 0 siblings, 1 reply; 70+ messages in thread From: Pekka Enberg @ 2004-11-09 7:40 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List, Linus Torvalds, Greg KH Hi, On Tue, 09 Nov 2004 02:31:28 +0100, Christian Kujau <evil@g-house.de> wrote: > the config is the same config i am usually using, never gave me a > headache, new options (due to new kernel version) were left to default in > most cases. anyway - i've pulled again a recent tree, did > "bk undo -a1.2463" again but this time i stripped down my .config (via > menuconfig) to the absolute necessary things: > > http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_a1.2463_take2 > > ...and it did *NOT* oops: > > http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops-2.6.10-rc1_a1.2463.txt > > i'll investigate further, building former -bk snapshots, using other > configs before i'll fiddle around with bk again (to get the smaller > changes). but this is a tomorrow thing, real life calls in :( CONFIG_PREEMPT is one obvious candidate (you have that enabled in the original config and disabled in the non-oopsing one). Pekka ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-09 7:40 ` Pekka Enberg @ 2004-11-09 12:33 ` Christian Kujau 2004-11-09 17:26 ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau 0 siblings, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-09 12:33 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Pekka Enberg, Linus Torvalds, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 this damn thread is far too long already... Pekka Enberg schrieb: > CONFIG_PREEMPT is one obvious candidate (you have that enabled in the > original config and disabled in the non-oopsing one). i've disabled *only* CONFIG_PREEMPT in another .config but it still oopses: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-2.6.10-rc1_no-preempt.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_no-preempt.txt 2.6.9 with preempt enabled does not oops: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.9_preempt.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops_2.6.9_preempt.txt i was a fool to test further -bk snapshots but it was kinda late yesterday and i was confused: patch-2.6.9.bz2 -> 19-Oct-2004 patch-2.6.10-rc1.bz2 -> 23-Oct-2004 00:12 patch-2.6.10-rc1-bk1.bz2 -> 23-Oct-2004 13:34 2.6.9 is not oopsing *here*, plain 2.6.10-rc1 is oopsing. so i can *not* use -bk snapshots any more and i will go on with BK (undo the ChangeSets Linus told me about) and use different .configs now. sorry for the confusion and especially sorry to my bk mentor: we seem to be so close to the right ChangeSet and then i started to use *snapshots* again. Thanks, Christian - -- BOFH excuse #76: Unoptimized hard drive -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkLkQ+A7rjkF8z0wRAhqLAJ9bZm+B5LKR+sY7V+yi/fSrhJuGrwCfcumS GwsGsjKson9vwRMCDtT9/Zk= =ailz -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-09 12:33 ` Christian Kujau @ 2004-11-09 17:26 ` Christian Kujau 2004-11-09 18:53 ` Linus Torvalds 0 siblings, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-09 17:26 UTC (permalink / raw) To: Christian Kujau, Kernel Mailing List Cc: Pekka Enberg, Linus Torvalds, Greg KH

On Tue, 09 Nov 2004 13:33:20 +0100, Christian Kujau wrote
> i've disabled *only* CONFIG_PREEMPT in another .config but it
> still oopses:

at least i finally found the "bad" .config option: it's CONFIG_EDD. when i disable this option (and only this option; i can use the same .config as usual, only disabling this very option. diff is my witness.) i can boot a current (!) 2.6.10-rc1-bk with a working snd-ens1371! i'll test with CONFIG_EDD=m later on.

here is a short summary:

2.6.9          CONFIG_EDD=y - OK
2.6.10-rc1-bk  CONFIG_EDD=y - OOPS!
2.6.10-rc1-bk  CONFIG_EDD=n - OK
2.6.10-rc1-bk  CONFIG_EDD=m - ??

yes, i'll continue to find out the ChangeSet but now i (and perhaps you too, if you are as curious as me) will know where to look. i must admit that i was not entirely sure why i wanted to enable CONFIG_EDD at all. if i had never enabled it, it'd have saved me a week of bug chasing, but learning is fun, too.

thanks, Christian.
--
BOFH excuse #209: Only people with names beginning with 'A' are getting mail this week (a la Microsoft)

^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-09 17:26 ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau @ 2004-11-09 18:53 ` Linus Torvalds 2004-11-09 19:04 ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH 2004-11-09 23:30 ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau 0 siblings, 2 replies; 70+ messages in thread From: Linus Torvalds @ 2004-11-09 18:53 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List, Pekka Enberg, Greg KH, Matt_Domsch On Tue, 9 Nov 2004, Christian Kujau wrote: > > at least i finally found the "bad" .config option: it's CONFIG_EDD. > when i disable this option (and only this option; i can use the same > .config as usual, only disabling this very option. diff is my witness.) > i can boot a current (!) 2.6.10-rc1-bk with a working snd-ens1371! Very strange. There's not a lot of stuff that affects EDD directly that I can see, but there is: ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR Some controller BIOSes have problems with the legacy int13 fn02 READ SECTORS command. int13 fn42 EXTENDED READ is used in preference by most boot loaders today, so let's use that. If EXTENDED READ fails or isn't supported, fall back to READ SECTORS. This hopefully resolves the three reports of BIOSes which would either long-pause (30+ seconds) or hang completely on the legacy READ SECTORS command. This also adds CONFIG_EDD_SKIP_MBR to eliminate reading the MBR on each BIOS-presented disk, in case there are further problems in this area. Signed-off-by: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> which might fit the bill. However, even that would just change the EDD _data_, it doesn't change the code that actually runs in the kernel. And I _really_ don't see what EDD has got to do with anything. 
I wonder if the EDD stuff corrupts the sysfs tree or something, and you're just seeing some strange kobject interference. Greg, you'd likely still be on the line for that one. Christian, finding which change triggers this would be very good indeed. I think the merge with greg is still a good place to start, although even just doing the snapshot trees (from _before_ -rc1: ie the patches in /pub/linux/kernel/v2.6/snapshots/old: patch-2.6.9-bk*.gz) is actually also a good way to narrow things down. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 18:53 ` Linus Torvalds @ 2004-11-09 19:04 ` Greg KH 2004-11-09 19:08 ` Greg KH 2004-11-09 19:09 ` Linus Torvalds 2004-11-09 23:30 ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau 1 sibling, 2 replies; 70+ messages in thread From: Greg KH @ 2004-11-09 19:04 UTC (permalink / raw) To: Linus Torvalds Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Matt_Domsch

This fixes a problem introduced in the previous set of driver model changes that has been seen by a lot of people (most notably the greater than 256 pty users, but others might also be hitting this without realizing it.)

Also add a comment so we don't try to "fix" this again.

Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

--- a/lib/kobject.c 2004-11-05 10:06:33 -08:00
+++ b/lib/kobject.c 2004-11-08 23:58:02 -08:00
@@ -181,10 +181,10 @@ int kobject_add(struct kobject * kobj)
 	error = create_dir(kobj);
 	if (error) {
+		/* unlink does the kobject_put() for us */
 		unlink(kobj);
 		if (parent)
 			kobject_put(parent);
-		kobject_put(kobj);
 	} else {
 		kobject_hotplug(kobj, KOBJ_ADD);
 	}

^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 19:04 ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH @ 2004-11-09 19:08 ` Greg KH 2004-11-09 20:19 ` Pekka Enberg ` (2 more replies) 2004-11-09 19:09 ` Linus Torvalds 1 sibling, 3 replies; 70+ messages in thread From: Greg KH @ 2004-11-09 19:08 UTC (permalink / raw) To: Linus Torvalds Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Matt_Domsch On Tue, Nov 09, 2004 at 11:04:21AM -0800, Greg KH wrote: > This fixes a problem introduced in the previous set of driver model > changes that has been seen by a lot of people (most notably the greater > than 256 pty users, but others might also be hitting this without > realizing it.) > > Also add a comment so we don't try to "fix" this again. > > Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> Christian, I don't know if this patch explicitly fixes your problem, but it fixes problems other people have been having with the driver core lately. I'd appreciate it if you could test it out and let me know if it solves your problem, with CONFIG_EDD enabled, or if it doesn't help at all. thanks, greg k-h ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 19:08 ` Greg KH @ 2004-11-09 20:19 ` Pekka Enberg 2004-11-09 21:21 ` Christian Kujau 2004-11-09 21:31 ` Christian Kujau 2 siblings, 0 replies; 70+ messages in thread From: Pekka Enberg @ 2004-11-09 20:19 UTC (permalink / raw) To: Greg KH; +Cc: Linus Torvalds, Christian Kujau, Kernel Mailing List, matt_domsch Hi Greg, On Tue, 9 Nov 2004 11:08:09 -0800, Greg KH <greg@kroah.com> wrote: > Christian, I don't know if this patch explicitly fixes your problem, but > it fixes problems other people have been having with the driver core > lately. I'd appreciate it if you could test it out and let me know if > it solves your problem, with CONFIG_EDD enabled, or if it doesn't help > at all. The broken kobject_add fix is not in -rc1 proper which oopses on Christian's machine. I don't think this patch has anything to do with his problem. Pekka ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 19:08 ` Greg KH 2004-11-09 20:19 ` Pekka Enberg @ 2004-11-09 21:21 ` Christian Kujau 2004-11-09 21:31 ` Christian Kujau 2 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-09 21:21 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Greg KH, Linus Torvalds, Pekka Enberg, Matt_Domsch -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Greg KH schrieb: > > Christian, I don't know if this patch explicitly fixes your problem, but > it fixes problems other people have been having with the driver core > lately. I'd appreciate it if you could test it out and let me know if > it solves your problem, with CONFIG_EDD enabled, or if it doesn't help > at all. > yes, i'll do so and test the patch. is this in current -BK yet? because applying your patch [1] to 2.6.10-rc1 gives: Hunk #1 FAILED at 181. 1 out of 1 hunk FAILED -- saving rejects to file lib/kobject.c.rej i've done a few other things before, let me just post the results before i go on with your suggestions: i've compiled a recent (BK) 2.6.10-rc1 again with CONFIG_EDD=m|y|n http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_edd-modular.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_edd.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_no-edd.txt the results: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd-modular.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_no-edd.txt the interesting thing (for me) was, that when CONFIG_EDD=m was set, my sound card was working properly and i could do "modprobe edd" and "rmmod edd" as i like: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/typescript-2.6.10-rc1_edd-modular.txt again: i double checked and compiled on 2 different hosts, each having it's own -BK tree. thanks, Christian. 
[1] http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/edd-fix.patch - -- BOFH excuse #22: monitor resolution too high -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkTTg+A7rjkF8z0wRAvFPAKCCM05vqhg4u2NH2wklRRbxdVSpcwCff9a3 /KodSmgp9J4Nf2LDcTiBOCo= =B/3X -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 19:08 ` Greg KH 2004-11-09 20:19 ` Pekka Enberg 2004-11-09 21:21 ` Christian Kujau @ 2004-11-09 21:31 ` Christian Kujau 2 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-09 21:31 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Greg KH wrote: > lately. I'd appreciate it if you could test it out and let me know if > it solves your problem, with CONFIG_EDD enabled, or if it doesn't help > at all. please ignore my first mail (the part about not being able to patch), it's already in BK i can see now, sorry. compiling now... - -- BOFH excuse #22: monitor resolution too high -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkTc3+A7rjkF8z0wRAl7LAJ9/mXV4/uFet5aqpJB/02+J/654bACbBz/k Px9muqjJ+e7OiRPDHbmyS1s= =Q+hA -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 19:04 ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH 2004-11-09 19:08 ` Greg KH @ 2004-11-09 19:09 ` Linus Torvalds 2004-11-09 22:06 ` Christian Kujau 1 sibling, 1 reply; 70+ messages in thread From: Linus Torvalds @ 2004-11-09 19:09 UTC (permalink / raw) To: Greg KH; +Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Matt_Domsch On Tue, 9 Nov 2004, Greg KH wrote: > > This fixes a problem introduced in the previous set of driver model > changes that has been seen by a lot of people (most notably the greater > than 256 pty users, but others might also be hitting this without > realizing it.) Ahh.. Christian, pls test this one. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add() 2004-11-09 19:09 ` Linus Torvalds @ 2004-11-09 22:06 ` Christian Kujau 0 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-09 22:06 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Linus Torvalds, Greg KH, Pekka Enberg, Matt_Domsch -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 i'm sorry to say that it did not help: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd__kobject_put.txt i'll go on and try to exclude ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR (or just test /pub/linux/kernel/v2.6/snapshots/old/patch-2.6.9-bk*.gz ...) thanks, Christian. - -- BOFH excuse #200: The monitor needs another box of pixels. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkT9q+A7rjkF8z0wRArHjAJ4qSyZf+ioC4VkvPxk2fCNWUrl18QCeLK85 8e2EyGuWgBviGETlV25t/XE= =Qvnz -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-09 18:53 ` Linus Torvalds 2004-11-09 19:04 ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH @ 2004-11-09 23:30 ` Christian Kujau 2004-11-09 23:40 ` Matt Domsch 1 sibling, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-09 23:30 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Linus Torvalds, Pekka Enberg, Greg KH, Matt_Domsch -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Linus Torvalds schrieb: > > Very strange. There's not a lot of stuff that affects EDD directly that I > can see, but there is: > > ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com > [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR and i say: good catch! that does it! i did "bk undo -a1.2000.5.108" on a current tree, booting this still gives an oops: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_a1.2000.5.108.txt excluding this single ChangeSet with "bk undo -r1.2118" does work with CONFIG_EDD=y: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_r1.2000.5.108.txt (the filename here should really read "...r1.2118.txt" because that was the number of the changeset representing the above [PATCH] *after* i did "bk undo -a1.2000.5.108". right?) > However, even that would just change the EDD _data_, it doesn't change the > code that actually runs in the kernel. And I _really_ don't see what EDD > has got to do with anything. understanding a lot less of all this than you guys i also wonder why only this single driver broke. i've always loaded a couple of drivers here, maybe i could play around a bit e.g. CONFIG_SND_ENS1371=y instead of =m or see if other hw drivers break too. > I wonder if the EDD stuff corrupts the sysfs tree or something, and you're > just seeing some strange kobject interference. do userspace tools matter here? 
there is "sysfsutils-1.1.0-1" and "libsysfs1-1.1.0-1" (both debian/unstable) installed here, /sys is mounted: sysfs on /sys type sysfs (rw) > Christian, finding which change triggers this would be very good indeed. I > think the merge with greg is still a good place to start, although even i'll look again over the -bk magic you told me about and see what it gives. thanks so far to all involved here, i really enjoyed "working" with you. first class support at no charge...it's just incredible. you guys rock, Christian. - -- BOFH excuse #112: The monitor is plugged into the serial port -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkVMN+A7rjkF8z0wRAqu4AKCtxZxE2spjZGgSnxTWzTTB0CWCkACgi2f3 RmHQXbnkcI1OEcLORhP1dmA= =5Dot -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-09 23:30 ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau @ 2004-11-09 23:40 ` Matt Domsch 2004-11-10 0:21 ` Christian Kujau 2004-11-11 22:43 ` Matt Domsch 0 siblings, 2 replies; 70+ messages in thread From: Matt Domsch @ 2004-11-09 23:40 UTC (permalink / raw) To: Christian Kujau Cc: Kernel Mailing List, Linus Torvalds, Pekka Enberg, Greg KH On Wed, Nov 10, 2004 at 12:30:21AM +0100, Christian Kujau wrote: > > ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com > > [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR > > and i say: good catch! that does it! > > i did "bk undo -a1.2000.5.108" on a current tree, booting this still gives > an oops: > > http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_a1.2000.5.108.txt > > excluding this single ChangeSet with "bk undo -r1.2118" does work with > CONFIG_EDD=y: > > http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_r1.2000.5.108.txt OK, thanks, that helps. From the diff of those dmesg: -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found So with the latest EDD patch noted above, it's finding more disks than before. How many disks do you actually have in the system? I'll review the assembly again to see where I could have miscounted, and see how that may affect the EDD sysfs exports. Likely no answer from me before tomorrow though. Thanks, Matt -- Matt Domsch Sr. Software Engineer, Lead Engineer Dell Linux Solutions linux.dell.com & www.dell.com/linux Linux on Dell mailing lists @ http://lists.us.dell.com ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-09 23:40 ` Matt Domsch @ 2004-11-10 0:21 ` Christian Kujau 2004-11-10 1:01 ` Linus Torvalds 2004-11-11 22:43 ` Matt Domsch 1 sibling, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-10 0:21 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Matt Domsch, Linus Torvalds, Pekka Enberg, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Matt Domsch wrote: > > -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found > +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found > > So with the latest EDD patch noted above, it's finding more disks than > before. How many disks do you actually have in the system? i have one scsi disk (sda) and two atapi cdrom drives: hda: CRD-8483B, ATAPI CD/DVD-ROM drive hdb: AOPEN CD-RW CRW3248 1.17 20020620, ATAPI CD/DVD-ROM drive ... SCSI device sda: 35548320 512-byte hdwr sectors (18201 MB) SCSI device sda: drive cache: write back the "scsi0 : sym-2.1.18k" is on a pci card, the atapi devices are connected onboard. if it helps: http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-v.txt http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-vv.txt > I'll review the assembly again to see where I could have miscounted, > and see how that may affect the EDD sysfs exports. Likely no answer > from me before tomorrow though. that's ok, real life kicks in here too... thanks, Christian. PS: do you have *any* idea how this could be related to the snd-ens1371 driver (which is producing the oops then)? - -- BOFH excuse #449: greenpeace free'd the mallocs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkV75+A7rjkF8z0wRAl67AJ9P+SF1WfRe7r2zoF9D/b/fyDeD0QCfe6/f Uxt5DVlb/IzW9VSWuFJqLlI= =Hpg9 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-10 0:21 ` Christian Kujau @ 2004-11-10 1:01 ` Linus Torvalds 0 siblings, 0 replies; 70+ messages in thread From: Linus Torvalds @ 2004-11-10 1:01 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List, Matt Domsch, Pekka Enberg, Greg KH On Wed, 10 Nov 2004, Christian Kujau wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Matt Domsch wrote: > > > > -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found > > +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found > > > > So with the latest EDD patch noted above, it's finding more disks than > > before. How many disks do you actually have in the system? > > i have one scsi disk (sda) and two atapi cdrom drives: Interestingly, "16" is also EDD_MBR_SIG_MAX, so my suspicion is that it overflowed some EDD data area. edd_num_devices() (which is what reports the above number) does min_t(unsigned char, max_t(unsigned char, edd.edd_info_nr, edd.mbr_signature_nr), max_t(unsigned char, EDD_MBR_SIG_MAX, EDDMAXNR)); where EDDMAXNR is 6, and EDD_MBR_SIG_MAX is the afore-mentioned 16, so we know that either edd.edd_info_nr or edd.mbr_signature_nr is actually _bigger_ than 16. Which is clearly totally bogus. In fact, even your old "6 devices found" thing looks suspiciously bogus. > PS: do you have *any* idea how this could be related to the snd-ens1371 > driver (which is producing the oops then)? I bet it's overwriting some array, and just corrupting memory after it. For example, the edd_info[] array only has 6 entries, and for example, the EDD_MBR_SIG_BUFFER is quite close to where we save the E820MAP memory map at bootup, so if something stomps on that, the kernel might be confused about where PCI memory can be allocated or similar. Or it might have overwritten some ACPI memory data, who knows. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
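The clamp Linus describes can be tried out in userspace. The following is only a sketch under stated assumptions: the min_t()/max_t() macros are simplified stand-ins for the kernel helpers, the constants 6 and 16 are the values quoted in the message, and edd_num_devices_sketch() is a hypothetical rewrite that takes the two counters as parameters instead of reading the edd structure. It shows that a reported 16 only tells us the larger counter was 16 or more, while a reported 6 means the larger counter was exactly 6.

```c
#include <stdio.h>

/* Simplified stand-ins for the kernel's min_t()/max_t() helpers. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))
#define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

#define EDDMAXNR        6   /* edd_info[] entries, value quoted in the thread */
#define EDD_MBR_SIG_MAX 16  /* MBR signature slots, value quoted in the thread */

/*
 * Hypothetical parameterised version of the expression quoted above:
 * take the larger of the two counters, then cap it at the larger of
 * the two array limits (which is 16).
 */
unsigned char edd_num_devices_sketch(unsigned char edd_info_nr,
                                     unsigned char mbr_signature_nr)
{
    return min_t(unsigned char,
                 max_t(unsigned char, edd_info_nr, mbr_signature_nr),
                 max_t(unsigned char, EDD_MBR_SIG_MAX, EDDMAXNR));
}

/* Print the reported count for a few made-up counter values. */
void edd_print_examples(void)
{
    static const unsigned char samples[][2] = {
        { 1, 1 }, { 6, 3 }, { 16, 0 }, { 0, 200 },
    };
    size_t i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        printf("edd_info_nr=%u mbr_signature_nr=%u -> %u devices found\n",
               (unsigned)samples[i][0], (unsigned)samples[i][1],
               (unsigned)edd_num_devices_sketch(samples[i][0], samples[i][1]));
}
```

Calling edd_print_examples() from a main() prints one line per sample. Because the cap hides how far past 16 a counter went, the printed number alone cannot say how much of the neighbouring memory may have been written.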
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-09 23:40 ` Matt Domsch 2004-11-10 0:21 ` Christian Kujau @ 2004-11-11 22:43 ` Matt Domsch 2004-11-11 22:53 ` Linus Torvalds 2004-11-12 0:27 ` Christian Kujau 1 sibling, 2 replies; 70+ messages in thread From: Matt Domsch @ 2004-11-11 22:43 UTC (permalink / raw) To: Christian Kujau Cc: Kernel Mailing List, Linus Torvalds, Pekka Enberg, Greg KH On Tue, Nov 09, 2004 at 05:40:54PM -0600, Matt Domsch wrote: > OK, thanks, that helps. From the diff of those dmesg: > > -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found > +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found As Linus points out, those are the magic numbers in EDD for number of device entries stored. Your BIOS seems to be reporting that it has more devices than it does, or the EDD assembly is horked in a way I have not yet deciphered. > I'll review the assembly again to see where I could have miscounted, > and see how that may affect the EDD sysfs exports. Likely no answer > from me before tomorrow though. I haven't been able to find a solution to your problem yet, and given some external time constraints I've got, won't be able to look into this again for another week or more. Thanks, Matt -- Matt Domsch Sr. Software Engineer, Lead Engineer Dell Linux Solutions linux.dell.com & www.dell.com/linux Linux on Dell mailing lists @ http://lists.us.dell.com ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-11 22:43 ` Matt Domsch @ 2004-11-11 22:53 ` Linus Torvalds 2004-11-11 22:55 ` Matt Domsch 2004-11-12 0:27 ` Christian Kujau 1 sibling, 1 reply; 70+ messages in thread From: Linus Torvalds @ 2004-11-11 22:53 UTC (permalink / raw) To: Matt Domsch, Andrew Morton Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Greg KH On Thu, 11 Nov 2004, Matt Domsch wrote: > > I haven't been able to find a solution to your problem yet, and given > some external time constraints I've got, won't be able to look into > this again for another week or more. Matt, I'll revert the EXTENDED READ change for now, then. The random behaviour of the problem it causes makes me really dislike this bug, and I'd like to release a -rc2 and start calming down the 2.6.10 stuff, but having known random stuff happen really disturbs me. We can re-do it once it's more obvious why it broke.. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-11 22:53 ` Linus Torvalds @ 2004-11-11 22:55 ` Matt Domsch 0 siblings, 0 replies; 70+ messages in thread From: Matt Domsch @ 2004-11-11 22:55 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Christian Kujau, Kernel Mailing List, Pekka Enberg, Greg KH On Thu, Nov 11, 2004 at 02:53:15PM -0800, Linus Torvalds wrote: > Matt, I'll revert the EXTENDED READ change for now, then. The random > behaviour of the problem it causes makes me really dislike this bug, and > I'd like to release a -rc2 and start calming down the 2.6.10 stuff, but > having known random stuff happen really disturbs me. > > We can re-do it once it's more obvious why it broke.. Good plan, thanks. -- Matt Domsch Sr. Software Engineer, Lead Engineer Dell Linux Solutions linux.dell.com & www.dell.com/linux Linux on Dell mailing lists @ http://lists.us.dell.com ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-11 22:43 ` Matt Domsch 2004-11-11 22:53 ` Linus Torvalds @ 2004-11-12 0:27 ` Christian Kujau 2004-11-12 0:49 ` Linus Torvalds 1 sibling, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-12 0:27 UTC (permalink / raw) To: Matt Domsch; +Cc: Kernel Mailing List, Linus Torvalds, Pekka Enberg, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Matt Domsch wrote: > > As Linus points out, those are the magic numbers in EDD for number of > device entries stored. Your BIOS seems to be reporting that it has > more devices than it does, or the EDD assembly is horked in a way I > have not yet deciphered. actually, my BIOS is even too old for e.g. ACPI, with latest firmware installed. i had no issues so far with the board/bios, but perhaps this is no longer true. however, it's still strange that this thing is only triggered with your change and CONFIG_EDD=y. > > I haven't been able to find a solution to your problem yet, and given > some external time constraints I've got, won't be able to look into > this again for another week or more. nevermind then. as nobody else seems to be bothered by this i am happy with the workaround (CONFIG_EDD=n) and since the lkml-archives exist we could get back to it when it's bothering more people (n>1) thank you for your time, Christian. - -- BOFH excuse #396: Mail server hit by UniSpammer. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBlAOE+A7rjkF8z0wRAkyLAJ4uy4LYBHWk8Wxwr/heQRVm7VOXfwCfW30C Zv1RdMYf1VOBEGkUnkQ+k0Q= =f2hG -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-12 0:27 ` Christian Kujau @ 2004-11-12 0:49 ` Linus Torvalds 2004-11-12 1:27 ` Christian Kujau 0 siblings, 1 reply; 70+ messages in thread From: Linus Torvalds @ 2004-11-12 0:49 UTC (permalink / raw) To: Christian Kujau; +Cc: Matt Domsch, Kernel Mailing List, Pekka Enberg, Greg KH On Fri, 12 Nov 2004, Christian Kujau wrote: > > nevermind then. as nobody else seems to be bothered by this i am happy with > the workaround (CONFIG_EDD=n) and since the lkml-archives exist we could > get back to it when it's bothering more people (n>1) The problem with that approach is that very few people are willing to spend the time and effort to really try to figure out where the problem triggers for them. Thanks again for testing lots of kernels, and different configurations. Basically, if it's a problem that only happens for a smallish percentage of people, and an even smaller percentage of those is willing to dig down and find it, it's not a problem we can afford to ignore. Ignoring it just means that there will be "a few" error reports that we will either waste time on, or (even worse) we'll dismiss as "known problems" and then possibly miss _another_ bug. This is why I take random unexplained (but pinpointed) problems so seriously. If it wasn't as apparently random, we could file it under "known problem" and decide to try to fix it later. As it is, it's filed under "known cause", but since we don't know _why_, it might cause totally different problems on another machine, and that just makes it too painful for words. So the changeset is reverted for now in the current -bk tree, and I'll make a -rc2 this weekend and hope that we can stabilize for 2.6.10. Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved) 2004-11-12 0:49 ` Linus Torvalds @ 2004-11-12 1:27 ` Christian Kujau 0 siblings, 0 replies; 70+ messages in thread From: Christian Kujau @ 2004-11-12 1:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: Matt Domsch, Kernel Mailing List, Pekka Enberg, Greg KH -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Linus Torvalds wrote: > > This is why I take random unexplained (but pinpointed) problems so > seriously. If it wasn't as apparently random, we could file it under > "known problem" and decide to try to fix it later. As it is, it's filed > under "known cause", but since we don't know _why_, it might cause totally > different problems on another machine, and that just makes it too painful > for words. just after sending my last mail i too (re)thought about this and i'd have begged Matt to revert the patch if it was not *only* me having this issue. but i can see your point here and i appreciate your decision. > So the changeset is reverted for now in the current -bk tree, and I'll > make a -rc2 this weekend and hope that we can stabilize for 2.6.10. yay! thanks, Christian. - -- BOFH excuse #96: Vendor no longer supports the product -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBlBFw+A7rjkF8z0wRAld5AJ40MjbzFbVXepXkJr1tLZCvYy7z2QCeMYCe QQyekHBs1cjuebPZTEuPZZ0= =wwF6 -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-08 18:13 ` Linus Torvalds (?) (?) @ 2004-11-10 0:12 ` Christian Kujau 2004-11-10 0:23 ` Linus Torvalds -1 siblings, 1 reply; 70+ messages in thread From: Christian Kujau @ 2004-11-10 0:12 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Linus Torvalds wrote: > > Now, if you want to get _really_ fancy, you can now look at each changeset > that differed, with something like > > bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - > > which is black magic that does a set operation and shows all the changes > in between the sets of "bk at 1.2462" and "bk at 1.2463". > > (This is _not_ the same as "bk changes -r1.2462..1.2463", because that one > just shows the single merge change that is on the direct _path_ from one > changeset to another. The black magic thing shows the set difference of > changesets that comes from the full graph at two points). hm, i still fail to see the "magic" part here. from a current tree i get: - --------------- $ bk set -n -d -r1.2000.5.107 -r1.2000.5.108 | bk -R prs -h \ - -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - | head -n5 <Matt_Domsch@dell.com> [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR Some controller BIOSes have problems with the legacy int13 fn02 READ SECTORS command. int13 fn42 EXTENDED READ is used in preference by most - --------------- which looks similar to the next one, but with "bk changes" i get the ChangeSet number again: - --------------- $ bk changes -r1.2000.5.108 | head -n5 ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR Some controller BIOSes have problems with the legacy int13 fn02 READ SECTORS command. int13 fn42 EXTENDED READ is used in preference by most - --------------- ...or was i supposed to alter your cmdline? i just copy'n'pasted it... 
anyway, i've seen that i have a lot of "bk help" ahead of me, thanks for the course, though ;) greetings, Christian. - -- BOFH excuse #297: Too many interrupts -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBkVzi+A7rjkF8z0wRAte6AKCO8isFqWGyFK53IpVtEnAImvQq8gCfeePr rzMnTyR3EPMqpv7+qz9iR6c= =BB+K -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 2004-11-10 0:12 ` Oops in 2.6.10-rc1 Christian Kujau @ 2004-11-10 0:23 ` Linus Torvalds 0 siblings, 0 replies; 70+ messages in thread From: Linus Torvalds @ 2004-11-10 0:23 UTC (permalink / raw) To: Christian Kujau; +Cc: Kernel Mailing List On Wed, 10 Nov 2004, Christian Kujau wrote: > > > > bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - > > > > which is black magic that does a set operation and shows all the changes > > in between the sets of "bk at 1.2462" and "bk at 1.2463". > > hm, i still fail to see the "magic" part here. from a current tree i get: You don't see any magic, unless there are merges involved. And you've already narrowed the thing down to a single non-merge changeset, at which point the "magic" way is just a very slow way of doing the same thing. The magic hits you only when you have non-trivial merges, in which case the set operation shows you more than the "just walk from one top-of-tree to the other". Linus ^ permalink raw reply [flat|nested] 70+ messages in thread
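The difference between walking the path and taking the set difference can be illustrated on a toy history. This is only a sketch and says nothing about how BitKeeper works internally: the six-changeset graph, the bitmask representation and the function names are all made up for the example. Changeset 5 merges trunk changeset 4 with a side branch (2, 3). Going from 4 to 5 along the direct path shows just the merge, while the set difference also shows the two branch changesets it pulled in.

```c
#include <stdio.h>

#define NCSETS 6

/* parent[i][0..1]: indices of the parents of changeset i, -1 if none. */
static const int parent[NCSETS][2] = {
    { -1, -1 },  /* 0: root                          */
    {  0, -1 },  /* 1: trunk                         */
    {  0, -1 },  /* 2: side branch                   */
    {  2, -1 },  /* 3: side branch                   */
    {  1, -1 },  /* 4: trunk, the "before" point     */
    {  4,  3 },  /* 5: merge of 4 and 3, the "after" */
};

/* Bitmask of every changeset reachable from tip, including tip itself. */
unsigned reachable(int tip)
{
    unsigned set = 1u << tip;
    int p;

    for (p = 0; p < 2; p++)
        if (parent[tip][p] >= 0)
            set |= reachable(parent[tip][p]);
    return set;
}

/* Changesets that are in the tree at 'after' but not at 'before'. */
unsigned set_difference(int before, int after)
{
    return reachable(after) & ~reachable(before);
}

/* Print the members of a changeset bitmask, e.g. "{ 2 3 5 }". */
void print_set(unsigned set)
{
    int i;

    printf("{");
    for (i = 0; i < NCSETS; i++)
        if (set & (1u << i))
            printf(" %d", i);
    printf(" }\n");
}
```

With this graph, set_difference(4, 5) contains 2, 3 and 5, whereas set_difference(1, 4) contains only 4. The second case matches what Christian saw: between two adjacent non-merge changesets the set operation and the plain walk give the same single result.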
* Re: Oops in 2.6.10-rc1 2004-11-08 13:01 ` Christian Kujau @ 2004-11-08 18:44 ` Pekka Enberg -1 siblings, 0 replies; 70+ messages in thread From: Pekka Enberg @ 2004-11-08 18:44 UTC (permalink / raw) To: Christian Kujau Cc: Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound, Greg KH, penberg Hi Christian, On Mon, 08 Nov 2004 14:01:39 +0100, Christian Kujau <evil@g-house.de> wrote: > i've put everthing on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/ > the .configs, the oopses are there. i've double checked a kernel built > from "bk -a a1.2000.7.2" yesterday but the result was the same (no oops) Just to update, I cannot reproduce the oops with your config (nor mine) on my machine running 2.6.10-rc1-bk14. Pekka 0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03) Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard Flags: bus master, medium devsel, latency 8 Memory at e7000000 (32-bit, prefetchable) Capabilities: [a0] AGP version 2.0 Capabilities: [c0] Power Management version 2 0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP] (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, medium devsel, latency 0 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 0000d000-0000dfff Memory behind bridge: d7000000-d7efffff Prefetchable memory behind bridge: d7f00000-e6ffffff Expansion ROM at 0000d000 [disabled] [size=4K] Capabilities: [80] Power Management version 2 0000:00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40) Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard Flags: bus master, stepping, medium devsel, latency 0 Capabilities: [c0] Power Management version 2 0000:00:04.1 IDE interface: VIA Technologies, Inc. 
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP]) Flags: bus master, medium devsel, latency 32 I/O ports at b800 [size=16] Capabilities: [c0] Power Management version 2 0000:00:04.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 16) (prog-if 00 [UHCI]) Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at b400 [size=32] Capabilities: [80] Power Management version 2 0000:00:04.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 16) (prog-if 00 [UHCI]) Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at b000 [size=32] Capabilities: [80] Power Management version 2 0000:00:04.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40) Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard Flags: medium devsel, IRQ 9 Capabilities: [68] Power Management version 2 0000:00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. RT8139 Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at 9400 Memory at d6800000 (32-bit, non-prefetchable) [size=256] Capabilities: [50] Power Management version 2 0000:00:0a.0 Multimedia audio controller: Ensoniq 5880 AudioPCI (rev 04) Subsystem: Ensoniq Sound Blaster 16PCI 4.1ch Flags: bus master, slow devsel, latency 32, IRQ 11 I/O ports at 9000 Capabilities: [dc] Power Management version 2 0000:00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. 
RT8139 Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at 8800 Memory at d6000000 (32-bit, non-prefetchable) [size=256] Capabilities: [50] Power Management version 2 0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] (prog-if 00 [VGA]) Subsystem: Hightech Information System Ltd.: Unknown device 0f02 Flags: bus master, stepping, 66Mhz, medium devsel, latency 64 Memory at d8000000 (32-bit, prefetchable) [size=d7fe0000] I/O ports at d800 [size=256] Memory at d7000000 (32-bit, non-prefetchable) [size=64K] Expansion ROM at 00020000 [disabled] Capabilities: [58] AGP version 2.0 Capabilities: [50] Power Management version 2 Linux version 2.6.10-rc1-bk14 (root@cherry) (gcc version 3.4.2 (Gentoo Linux 3.4.2-r2, ssp-3.4.1-1, pie-8.7.6.5)) #8 Mon Nov 8 20:18:45 EET 2004 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003ffec000 (usable) BIOS-e820: 000000003ffec000 - 000000003ffef000 (ACPI data) BIOS-e820: 000000003ffef000 - 000000003ffff000 (reserved) BIOS-e820: 000000003ffff000 - 0000000040000000 (ACPI NVS) BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved) 127MB HIGHMEM available. 896MB LOWMEM available. On node 0 totalpages: 262124 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 225280 pages, LIFO batch:16 HighMem zone: 32748 pages, LIFO batch:7 DMI 2.3 present. 
ACPI: RSDP (v000 ASUS ) @ 0x000f6a80 ACPI: RSDT (v001 ASUS A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec000 ACPI: FADT (v001 ASUS A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec080 ACPI: BOOT (v001 ASUS A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec040 ACPI: DSDT (v001 ASUS A7V133-C 0x00001000 MSFT 0x0100000b) @ 0x00000000 ACPI: PM-Timer IO Port: 0xe408 Built 1 zonelists Kernel command line: root=/dev/ram0 init=/linuxrc real_root=/dev/hda3 acpi=force No local APIC present or hardware disabled Initializing CPU#0 PID hash table entries: 4096 (order: 12, 65536 bytes) Detected 1009.328 MHz processor. Using pmtmr for high-res timesource Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 1034128k/1048496k available (2582k kernel code, 13664k reserved, 770k data, 148k init, 130992k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 1998.84 BogoMIPS (lpj=999424) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383f9ff c1c7f9ff 00000000 00000000 CPU: After vendor identify, caps: 0383f9ff c1c7f9ff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 64K (64 bytes/line) CPU: After all inits, caps: 0383f9ff c1c7f9ff 00000000 00000020 CPU: AMD Duron(tm) Processor stepping 00 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. ACPI: IRQ9 SCI: Edge set to Level Trigger. checking if image is initramfs...it isn't (no cpio magic); looks like an initrd Freeing initrd memory: 885k freed kobject_uevent: unable to create netlink socket! 
NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xf1180, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20040816 ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15) ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] PCI: Using ACPI for IRQ routing ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10 ACPI: PCI interrupt 0000:00:04.2[D] -> GSI 10 (level, low) -> IRQ 10 ACPI: PCI interrupt 0000:00:04.3[D] -> GSI 10 (level, low) -> IRQ 10 ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10 ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11 ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11 ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 10 (level, low) -> IRQ 10 Simple Boot Flag at 0x3a set to 0x1 highmem bounce pool size: 64 pages devfs: 2004-01-31 Richard Gooch (rgooch@atnf.csiro.au) devfs: boot_options: 0x0 SGI XFS with ACLs, realtime, no debug enabled SGI XFS Quota Management subsystem Applying VIA southbridge workaround. PCI: Disabling Via external APIC routing Real Time Clock Driver v1.12 serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize Equalizer2002: Simon Janes (simon@ncm.com) and David S. 
Miller (davem@redhat.com) Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot 0000:00:04.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci0000:00:04.1 ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:DMA, hdd:pio Probing IDE interface ide0... hda: Maxtor 4D060H3, ATA DISK drive elevator: using anticipatory as default io scheduler ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... hdc: Hewlett-Packard CD-Writer Plus 8200a, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 Probing IDE interface ide2... ide2: Wait for ready failed before probe ! Probing IDE interface ide3... ide3: Wait for ready failed before probe ! Probing IDE interface ide4... ide4: Wait for ready failed before probe ! Probing IDE interface ide5... ide5: Wait for ready failed before probe ! hda: max request size: 128KiB hda: 120069936 sectors (61475 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100) hda: cache flushes not supported /dev/ide/host0/bus0/target0/lun0: p1 p2 p3 hdc: ATAPI 32X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 ide-floppy driver 0.99.newide mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard on isa0060/serio0 input: ImPS/2 Logitech Wheel Mouse on isa0060/serio1 NET: Registered protocol family 2 IP: routing cache hash table of 8192 buckets, 64Kbytes TCP: Hash tables configured (established 262144 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 10 IPv6 over IPv4 tunneling driver NET: Registered protocol family 17 ACPI: (supports S0 S1 S4 S5) ACPI wakeup devices: PWRB PCI0 UAR1 UAR2 USB0 USB1 RAMDISK: Compressed image found at block 0 VFS: Mounted root (ext2 filesystem) readonly. 
Freeing unused kernel memory: 148k freed usbcore: registered new driver usbfs usbcore: registered new driver hub usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.0:USB HID core driver SCSI subsystem initialized Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI) ReiserFS: hda3: warning: sh-2021: reiserfs_fill_super: can not find reiserfs on hda3 kjournald starting. Commit interval 5 seconds EXT3 FS on hda3, internal journal EXT3-fs: mounted filesystem with ordered data mode. Adding 2040244k swap on /dev/hda2. Priority:-1 extents:1 EXT3 FS on hda3, internal journal 8139too Fast Ethernet driver 0.9.27 PCI: Enabling device 0000:00:09.0 (0004 -> 0007) ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10 eth0: RealTek RTL8139 at 0xf8814000, 00:06:4f:01:66:57, IRQ 10 eth0: Identified 8139 chip type 'RTL-8139C' PCI: Enabling device 0000:00:0d.0 (0004 -> 0007) ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 10 (level, low) -> IRQ 10 eth1: RealTek RTL8139 at 0xf8816000, 00:06:4f:01:66:58, IRQ 10 eth1: Identified 8139 chip type 'RTL-8139C' eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 [drm] Initialized radeon 1.11.0 20020828 on minor 0: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] [drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held [drm:radeon_unlock] *ERROR* Process 6283 using kernel context 0 inserting floppy driver for 2.6.10-rc1-bk14 Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 PCI: Enabling device 0000:00:0a.0 (0004 -> 0005) ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11 ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 @ 2004-11-08 18:44 ` Pekka Enberg 0 siblings, 0 replies; 70+ messages in thread From: Pekka Enberg @ 2004-11-08 18:44 UTC (permalink / raw) To: Christian Kujau Cc: Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound, Greg KH, penberg Hi Christian, On Mon, 08 Nov 2004 14:01:39 +0100, Christian Kujau <evil@g-house.de> wrote: > i've put everthing on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/ > the .configs, the oopses are there. i've double checked a kernel built > from "bk -a a1.2000.7.2" yesterday but the result was the same (no oops) Just to update, I cannot reproduce the oops with your config (nor mine) on my machine running 2.6.10-rc1-bk14. Pekka 0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03) Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard Flags: bus master, medium devsel, latency 8 Memory at e7000000 (32-bit, prefetchable) Capabilities: [a0] AGP version 2.0 Capabilities: [c0] Power Management version 2 0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP] (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, medium devsel, latency 0 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 0000d000-0000dfff Memory behind bridge: d7000000-d7efffff Prefetchable memory behind bridge: d7f00000-e6ffffff Expansion ROM at 0000d000 [disabled] [size=4K] Capabilities: [80] Power Management version 2 0000:00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40) Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard Flags: bus master, stepping, medium devsel, latency 0 Capabilities: [c0] Power Management version 2 0000:00:04.1 IDE interface: VIA Technologies, Inc. 
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP]) Flags: bus master, medium devsel, latency 32 I/O ports at b800 [size=16] Capabilities: [c0] Power Management version 2 0000:00:04.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 16) (prog-if 00 [UHCI]) Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at b400 [size=32] Capabilities: [80] Power Management version 2 0000:00:04.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 16) (prog-if 00 [UHCI]) Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at b000 [size=32] Capabilities: [80] Power Management version 2 0000:00:04.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40) Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard Flags: medium devsel, IRQ 9 Capabilities: [68] Power Management version 2 0000:00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. RT8139 Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at 9400 Memory at d6800000 (32-bit, non-prefetchable) [size=256] Capabilities: [50] Power Management version 2 0000:00:0a.0 Multimedia audio controller: Ensoniq 5880 AudioPCI (rev 04) Subsystem: Ensoniq Sound Blaster 16PCI 4.1ch Flags: bus master, slow devsel, latency 32, IRQ 11 I/O ports at 9000 Capabilities: [dc] Power Management version 2 0000:00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. 
RT8139 Flags: bus master, medium devsel, latency 32, IRQ 10 I/O ports at 8800 Memory at d6000000 (32-bit, non-prefetchable) [size=256] Capabilities: [50] Power Management version 2 0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] (prog-if 00 [VGA]) Subsystem: Hightech Information System Ltd.: Unknown device 0f02 Flags: bus master, stepping, 66Mhz, medium devsel, latency 64 Memory at d8000000 (32-bit, prefetchable) [size=d7fe0000] I/O ports at d800 [size=256] Memory at d7000000 (32-bit, non-prefetchable) [size=64K] Expansion ROM at 00020000 [disabled] Capabilities: [58] AGP version 2.0 Capabilities: [50] Power Management version 2 Linux version 2.6.10-rc1-bk14 (root@cherry) (gcc version 3.4.2 (Gentoo Linux 3.4.2-r2, ssp-3.4.1-1, pie-8.7.6.5)) #8 Mon Nov 8 20:18:45 EET 2004 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003ffec000 (usable) BIOS-e820: 000000003ffec000 - 000000003ffef000 (ACPI data) BIOS-e820: 000000003ffef000 - 000000003ffff000 (reserved) BIOS-e820: 000000003ffff000 - 0000000040000000 (ACPI NVS) BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved) 127MB HIGHMEM available. 896MB LOWMEM available. On node 0 totalpages: 262124 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 225280 pages, LIFO batch:16 HighMem zone: 32748 pages, LIFO batch:7 DMI 2.3 present. 
ACPI: RSDP (v000 ASUS ) @ 0x000f6a80 ACPI: RSDT (v001 ASUS A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec000 ACPI: FADT (v001 ASUS A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec080 ACPI: BOOT (v001 ASUS A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec040 ACPI: DSDT (v001 ASUS A7V133-C 0x00001000 MSFT 0x0100000b) @ 0x00000000 ACPI: PM-Timer IO Port: 0xe408 Built 1 zonelists Kernel command line: root=/dev/ram0 init=/linuxrc real_root=/dev/hda3 acpi=force No local APIC present or hardware disabled Initializing CPU#0 PID hash table entries: 4096 (order: 12, 65536 bytes) Detected 1009.328 MHz processor. Using pmtmr for high-res timesource Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 1034128k/1048496k available (2582k kernel code, 13664k reserved, 770k data, 148k init, 130992k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 1998.84 BogoMIPS (lpj=999424) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383f9ff c1c7f9ff 00000000 00000000 CPU: After vendor identify, caps: 0383f9ff c1c7f9ff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 64K (64 bytes/line) CPU: After all inits, caps: 0383f9ff c1c7f9ff 00000000 00000020 CPU: AMD Duron(tm) Processor stepping 00 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. ACPI: IRQ9 SCI: Edge set to Level Trigger. checking if image is initramfs...it isn't (no cpio magic); looks like an initrd Freeing initrd memory: 885k freed kobject_uevent: unable to create netlink socket! 
NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xf1180, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20040816 ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15) ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] PCI: Using ACPI for IRQ routing ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10 ACPI: PCI interrupt 0000:00:04.2[D] -> GSI 10 (level, low) -> IRQ 10 ACPI: PCI interrupt 0000:00:04.3[D] -> GSI 10 (level, low) -> IRQ 10 ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10 ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11 ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11 ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 10 (level, low) -> IRQ 10 Simple Boot Flag at 0x3a set to 0x1 highmem bounce pool size: 64 pages devfs: 2004-01-31 Richard Gooch (rgooch@atnf.csiro.au) devfs: boot_options: 0x0 SGI XFS with ACLs, realtime, no debug enabled SGI XFS Quota Management subsystem Applying VIA southbridge workaround. PCI: Disabling Via external APIC routing Real Time Clock Driver v1.12 serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize Equalizer2002: Simon Janes (simon@ncm.com) and David S. 
Miller (davem@redhat.com) Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot 0000:00:04.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci0000:00:04.1 ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:DMA, hdd:pio Probing IDE interface ide0... hda: Maxtor 4D060H3, ATA DISK drive elevator: using anticipatory as default io scheduler ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... hdc: Hewlett-Packard CD-Writer Plus 8200a, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 Probing IDE interface ide2... ide2: Wait for ready failed before probe ! Probing IDE interface ide3... ide3: Wait for ready failed before probe ! Probing IDE interface ide4... ide4: Wait for ready failed before probe ! Probing IDE interface ide5... ide5: Wait for ready failed before probe ! hda: max request size: 128KiB hda: 120069936 sectors (61475 MB) w/2048KiB Cache, CHSe535/16/63, UDMA(100) hda: cache flushes not supported /dev/ide/host0/bus0/target0/lun0: p1 p2 p3 hdc: ATAPI 32X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 ide-floppy driver 0.99.newide mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard on isa0060/serio0 input: ImPS/2 Logitech Wheel Mouse on isa0060/serio1 NET: Registered protocol family 2 IP: routing cache hash table of 8192 buckets, 64Kbytes TCP: Hash tables configured (established 262144 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 10 IPv6 over IPv4 tunneling driver NET: Registered protocol family 17 ACPI: (supports S0 S1 S4 S5) ACPI wakeup devices: PWRB PCI0 UAR1 UAR2 USB0 USB1 RAMDISK: Compressed image found at block 0 VFS: Mounted root (ext2 filesystem) readonly. 
Freeing unused kernel memory: 148k freed usbcore: registered new driver usbfs usbcore: registered new driver hub usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.0:USB HID core driver SCSI subsystem initialized Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI) ReiserFS: hda3: warning: sh-2021: reiserfs_fill_super: can not find reiserfs on hda3 kjournald starting. Commit interval 5 seconds EXT3 FS on hda3, internal journal EXT3-fs: mounted filesystem with ordered data mode. Adding 2040244k swap on /dev/hda2. Priority:-1 extents:1 EXT3 FS on hda3, internal journal 8139too Fast Ethernet driver 0.9.27 PCI: Enabling device 0000:00:09.0 (0004 -> 0007) ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10 eth0: RealTek RTL8139 at 0xf8814000, 00:06:4f:01:66:57, IRQ 10 eth0: Identified 8139 chip type 'RTL-8139C' PCI: Enabling device 0000:00:0d.0 (0004 -> 0007) ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 10 (level, low) -> IRQ 10 eth1: RealTek RTL8139 at 0xf8816000, 00:06:4f:01:66:58, IRQ 10 eth1: Identified 8139 chip type 'RTL-8139C' eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 [drm] Initialized radeon 1.11.0 20020828 on minor 0: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] [drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held [drm:radeon_unlock] *ERROR* Process 6283 using kernel context 0 inserting floppy driver for 2.6.10-rc1-bk14 Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 PCI: Enabling device 0000:00:0a.0 (0004 -> 0005) ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11 ^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1
2004-11-08 18:44 ` Pekka Enberg
@ 2004-11-08 19:00 ` Greg KH
-1 siblings, 0 replies; 70+ messages in thread
From: Greg KH @ 2004-11-08 19:00 UTC (permalink / raw)
To: Pekka Enberg
Cc: Christian Kujau, Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound, penberg

On Mon, Nov 08, 2004 at 08:44:37PM +0200, Pekka Enberg wrote:
> Hi Christian,
>
> On Mon, 08 Nov 2004 14:01:39 +0100, Christian Kujau <evil@g-house.de> wrote:
> > i've put everything on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/
> > the .configs, the oopses are there. i've double checked a kernel built
> > from "bk -a a1.2000.7.2" yesterday but the result was the same (no oops)
>
> Just to update, I cannot reproduce the oops with your config (nor
> mine) on my machine running 2.6.10-rc1-bk14.

But 2.6.10-rc1-bk15 does have the problem?

Trying to figure out where the issue is...

greg k-h
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1
2004-11-08 19:00 ` Greg KH
@ 2004-11-08 19:18 ` Pekka Enberg
-1 siblings, 0 replies; 70+ messages in thread
From: Pekka Enberg @ 2004-11-08 19:18 UTC (permalink / raw)
To: Greg KH
Cc: Christian Kujau, Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound, penberg

Hi,

On Mon, 8 Nov 2004 11:00:40 -0800, Greg KH <greg@kroah.com> wrote:
> But 2.6.10-rc1-bk15 does have the problem?
>
> Trying to figure out where the issue is...

No, -bk14 is just the kernel I am running right now (I haven't tried
-bk15) and I haven't had the problem. I cannot reproduce the oops _at
all_ which is why I suspect it's his hardware. I included my lspci and
dmesg output because we have similar (but not exactly the same)
setups.

FWIW, I've asked Christian for an objdump of the kernel to see if I can
track down where it oopses, because I cannot find anything in the
code. I suspected pcibios_enable_irq (which is a function pointer)
might be wrong but looking at his logs, I don't think we get that far.

Pekka
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1
2004-11-08 19:18 ` Pekka Enberg
@ 2004-11-08 19:30 ` Pekka Enberg
-1 siblings, 0 replies; 70+ messages in thread
From: Pekka Enberg @ 2004-11-08 19:30 UTC (permalink / raw)
To: Greg KH
Cc: Christian Kujau, Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound, penberg

On Mon, 8 Nov 2004 11:00:40 -0800, Greg KH <greg@kroah.com> wrote:
> > But 2.6.10-rc1-bk15 does have the problem?
> >
> > Trying to figure out where the issue is...

On Mon, 8 Nov 2004 21:18:09 +0200, Pekka Enberg <penberg@gmail.com> wrote:
> No, -bk14 is just the kernel I am running right now (I haven't tried
> -bk15) and I haven't had the problem.

Sorry for not being clear, any kernel after 2.6.10-rc1 oopses
according to Christian which is why I haven't bothered to test
anything else except -bk14.

Pekka
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1
2004-11-08 19:18 ` Pekka Enberg
@ 2004-11-08 20:31 ` Christian Kujau
-1 siblings, 0 replies; 70+ messages in thread
From: Christian Kujau @ 2004-11-08 20:31 UTC (permalink / raw)
To: Pekka Enberg
Cc: Greg KH, Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Pekka Enberg wrote:
> Hi,
>
> On Mon, 8 Nov 2004 11:00:40 -0800, Greg KH <greg@kroah.com> wrote:
>
>>But 2.6.10-rc1-bk15 does have the problem?
>>
>>Trying to figure out where the issue is...

i could use the -bk snapshots too, but since i am using bk myself (i
try), i think we can narrow it down a bit more.

>
> No, -bk14 is just the kernel I am running right now (I haven't tried
> -bk15) and I haven't had the problem. I cannot reproduce the oops _at
> all_ which is why I suspect it's his hardware. I included my lspci and
> dmesg output because we have similar (but not exactly the same)
> setups.

i've put an lspci output here:
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-v.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-vv.txt

i do not suspect hw problems *yet*, because kernels up to 2.6.9 (tracking
bk) do not show this behaviour.

> FWIW, I've asked Christian for an objdump of the kernel to see if I can

will show up in a couple of minutes here:
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/objdump-d_a1.2463.txt.bz2

this is from the vmlinux from a "bk undo -a1.2463" kernel, IOW it still
contains:

ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
into ppc970.osdl.org:/home/torvalds/v2.6/linux

thank you for the hints,
Christian.

PS: should we un-CC linux-sound and alsa-devel, now that we are sure it's
a pci thing?
- --
BOFH excuse #228:

That function is not currently supported, but Bill Gates assures us it
will be featured in the next upgrade.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBj9e9+A7rjkF8z0wRAregAJ9TyK5Mt00CFmCcgA1pOKmzvIxv2QCg0OBi
/9eNZ41Kp2GAOg4J5l0QR8E=
=OkFI
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1
2004-11-07 1:24 ` Christian Kujau
2004-11-07 7:02 ` Linus Torvalds
@ 2004-11-07 13:05 ` Pekka Enberg
2004-11-07 13:43 ` Christian Kujau
1 sibling, 1 reply; 70+ messages in thread
From: Pekka Enberg @ 2004-11-07 13:05 UTC (permalink / raw)
To: Christian Kujau; +Cc: LKML, alsa-devel, perex, penberg

Hi Christian,

On Sun, 07 Nov 2004 02:24:41 +0100, Christian Kujau <evil@g-house.de> wrote:
> if someone could give me a hint here what to do next or perhaps tell me
> that the whole thing was totally pointless - please say so.
> i am somehow lost as to which is the right person to bug here.

I am running 2.6.10-rc1-bk14 with ens-1371 working ok. Could you
please post your .config so I can try to reproduce your oops?

Pekka
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1
2004-11-07 13:05 ` Pekka Enberg
@ 2004-11-07 13:43 ` Christian Kujau
0 siblings, 0 replies; 70+ messages in thread
From: Christian Kujau @ 2004-11-07 13:43 UTC (permalink / raw)
To: LKML

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Pekka Enberg wrote:
>
> I am running 2.6.10-rc1-bk14 with ens-1371 working ok. Could you
> please post your .config so I can try to reproduce your oops?

i put it on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config

thank you,
Christian.
- --
BOFH excuse #361:

Communist revolutionaries taking over the server room and demanding all
the computers in the building or they shoot the sysadmin. Poor misguided
fools.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBjiae+A7rjkF8z0wRAqo9AJ0e0iHAXi2Q6oI/UKl1vBw/dPvODQCfSjfh
ucfAhJkoCMS5gGxt/HtSKrw=
=pqTN
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved)
@ 2004-11-13 3:45 Chuck Ebbert
2004-11-13 14:28 ` Matt Domsch
0 siblings, 1 reply; 70+ messages in thread
From: Chuck Ebbert @ 2004-11-13 3:45 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, Matt Domsch
On Tue, 9 Nov 2004 at 17:01:10 -0800 Linus Torvalds <torvalds@osdl.org> wrote:
> > PS: do you have *any* idea how this could be related to the snd-es1371
> > driver (which is producing the oops then)?
>
> I bet it's overwriting some array, and just corrupting memory after it.
> For example, the edd_info[] array only has 6 entries,
That's almost certainly the problem. There can be up to 16 EDD devices
as of the Jun 30 update to the EDD code.
And sound_class is the next item after edd_info[] in my System.map...
--Chuck Ebbert 12-Nov-04 22:21:27
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-13 3:45 Oops in 2.6.10-rc1 (almost solved) Chuck Ebbert
@ 2004-11-13 14:28 ` Matt Domsch
2004-11-13 18:55 ` Matt Domsch
2004-11-14 2:58 ` Matt Domsch
0 siblings, 2 replies; 70+ messages in thread
From: Matt Domsch @ 2004-11-13 14:28 UTC (permalink / raw)
To: Chuck Ebbert, Christian Kujau; +Cc: Linus Torvalds, linux-kernel

On Fri, Nov 12, 2004 at 10:45:12PM -0500, Chuck Ebbert wrote:
> On Tue, 9 Nov 2004 at 17:01:10 -0800 Linus Torvalds <torvalds@osdl.org> wrote:
>
> > > PS: do you have *any* idea how this could be related to the snd-es1371
> > > driver (which is producing the oops then)?
> >
> > I bet it's overwriting some array, and just corrupting memory after it.
> > For example, the edd_info[] array only has 6 entries,
>
> That's almost certainly the problem. There can be up to 16 EDD devices
> as of the Jun 30 update to the EDD code.

Bingo... edd_devices[] was too short. When we keep more
than 6 signatures, it overruns the end.

Also, I rewrote edd_num_devices to be clearer about its goal.

This patch is necessary even after the last edd.S patch was reverted.
It still doesn't explain why Christian's BIOS reports more devices
than he has, that's still UI, so don't re-apply the edd.S patch just reverted.

Signed-off-by: Matt Domsch

--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

===== drivers/firmware/edd.c 1.30 vs edited =====
--- 1.30/drivers/firmware/edd.c 2004-06-29 09:44:48 -05:00
+++ edited/drivers/firmware/edd.c 2004-11-13 07:56:00 -06:00
@@ -70,7 +70,7 @@
 static int edd_dev_is_type(struct edd_device *edev, const char *type);
 static struct pci_dev *edd_get_pci_dev(struct edd_device *edev);
 
-static struct edd_device *edd_devices[EDDMAXNR];
+static struct edd_device *edd_devices[EDD_MBR_SIG_MAX];
 
 #define EDD_DEVICE_ATTR(_name,_mode,_show,_test) \
 struct edd_attribute edd_attr_##_name = { \
@@ -728,9 +728,9 @@
 static inline int
 edd_num_devices(void)
 {
-        return min_t(unsigned char,
-                     max_t(unsigned char, edd.edd_info_nr, edd.mbr_signature_nr),
-                     max_t(unsigned char, EDD_MBR_SIG_MAX, EDDMAXNR));
+        return max_t(unsigned char,
+                     min_t(unsigned char, EDD_MBR_SIG_MAX, edd.mbr_signature_nr),
+                     min_t(unsigned char, EDDMAXNR, edd.edd_info_nr));
 }
 
 /**
^ permalink raw reply [flat|nested] 70+ messages in thread
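[Editorial sketch, not part of the original mail.] The two versions of edd_num_devices() in the patch above can be compared in isolation. The following is a minimal user-space sketch, assuming EDDMAXNR is 6 and EDD_MBR_SIG_MAX is 16 as the thread states; min_u8() and max_u8() are hypothetical stand-ins for the kernel's min_t()/max_t() with type unsigned char, and the _old/_new function names are made up for illustration.

```c
#include <assert.h>

/* Limits as described in the thread: edd_info[] has 6 entries,
 * while up to 16 MBR signatures can be kept. (Assumed values.) */
#define EDDMAXNR        6
#define EDD_MBR_SIG_MAX 16

/* Hypothetical stand-ins for min_t(unsigned char, ...) / max_t(unsigned char, ...). */
static unsigned char min_u8(unsigned char a, unsigned char b) { return a < b ? a : b; }
static unsigned char max_u8(unsigned char a, unsigned char b) { return a > b ? a : b; }

/* Pre-patch logic: take the larger count, cap it at the larger limit.
 * The result can reach 16 even though edd_devices[] had only EDDMAXNR slots. */
static int edd_num_devices_old(unsigned char info_nr, unsigned char sig_nr)
{
    return min_u8(max_u8(info_nr, sig_nr),
                  max_u8(EDD_MBR_SIG_MAX, EDDMAXNR));
}

/* Patched logic: clamp each count to its own limit, then take the larger.
 * The result never exceeds EDD_MBR_SIG_MAX, the new size of edd_devices[]. */
static int edd_num_devices_new(unsigned char info_nr, unsigned char sig_nr)
{
    return max_u8(min_u8(EDD_MBR_SIG_MAX, sig_nr),
                  min_u8(EDDMAXNR, info_nr));
}
```

With 16 reported signatures both versions return 16; what changes is that the array now has room for them. With a bogus edd_info_nr of 16 and only 2 signatures, the old version still returns 16 while the patched one returns 6.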
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-13 14:28 ` Matt Domsch
@ 2004-11-13 18:55 ` Matt Domsch
2004-11-14 2:58 ` Matt Domsch
1 sibling, 0 replies; 70+ messages in thread
From: Matt Domsch @ 2004-11-13 18:55 UTC (permalink / raw)
To: Chuck Ebbert, Christian Kujau; +Cc: Linus Torvalds, linux-kernel

On Sat, Nov 13, 2004 at 08:28:35AM -0600, Matt Domsch wrote:
> On Fri, Nov 12, 2004 at 10:45:12PM -0500, Chuck Ebbert wrote:
> > On Tue, 9 Nov 2004 at 17:01:10 -0800 Linus Torvalds <torvalds@osdl.org> wrote:
> >
> > > > PS: do you have *any* idea how this could be related to the snd-es1371
> > > > driver (which is producing the oops then)?
> > >
> > > I bet it's overwriting some array, and just corrupting memory after it.
> > > For example, the edd_info[] array only has 6 entries,
> >
> > That's almost certainly the problem. There can be up to 16 EDD devices
> > as of the Jun 30 update to the EDD code.
>
> Bingo... edd_devices[] was too short. When we keep more
> than 6 signatures, it overruns the end.

In particular, depending on your .config, with EDD=y it overwrites 40
bytes past the end of edd_devices (here I've already extended it by
the necessary amount, but the 40 bytes past its end are all subject to
be overwritten):

c043a880 b edd_devices
c043a8c0 b pci_bios_present
c043a8c4 B pci_mmcfg_base_addr
c043a8c8 b mmcfg_last_accessed_device
c043a8cc b called.0
c043a8d0 B pcibios_enable_irq
c043a8d4 b eisa_irq_mask.0
c043a8d8 b broken_hp_bios_irq9
c043a8dc b acer_tm360_irqrouting
c043a8e0 b pirq_table
c043a8e4 b pirq_router

hence the failure Christian saw and attributed to the sound drivers:

EIP is at 0xc15d5820
eax: 00000000 ebx: dff20400 ecx: c15d5820 edx: dff205c4
esi: ffffffed edi: dff20400 ebp: dff20400 esp: c17a3e58
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 178, threadinfo=c17a2000 task=dfcf05a0)
Stack: c01fa5c8 dff20400 000007ff dff20400 c01fa5ff dff20400 000007ff c15ea400
       e082729d dff20400 c15ea400 00000000 e08469df c15ea400 000001f8 000000d0
       000000d0 df45ed14 00000000 c018e14e c15ea400 ffffffed dff20400 dff20400
Call Trace:
[<c01fa5c8>] pci_enable_device_bars+0x28/0x40
[<c01fa5ff>] pci_enable_device+0x1f/0x40
[<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371]
[<e08469df>] snd_card_new+0x1cf/0x2c0 [snd]
[<c018e14e>] sysfs_new_dirent+0x2e/0x90
[<e0827867>] snd_audiopci_probe+0x87/0x1e0 [snd_ens1371]
[<c01fb012>] pci_device_probe_static+0x52/0x70
[<c01fb05c>] __pci_device_probe+0x2c/0x30
[<c01fb08c>] pci_device_probe+0x2c/0x60
[<c0258f4f>] driver_probe_device+0x2f/0x80
[<c02590b2>] driver_attach+0x52/0xa0
[<c02595f8>] bus_add_driver+0x98/0xe0
[<c0259c5f>] driver_register+0x2f/0x40
[<c01fb340>] pci_register_driver+0x40/0x50
[<e08279cf>] alsa_card_ens137x_init+0xf/0x13 [snd_ens1371]
[<c0134279>] sys_init_module+0x169/0x240
[<c01041eb>] syscall_call+0x7/0xb

With CONFIG_EDD=m, there just wasn't anything interesting in memory
following edd_devices[] (thanks module loader for using whole pages I
believe).

-Matt

--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
^ permalink raw reply [flat|nested] 70+ messages in thread
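[Editorial sketch, not part of the original mail.] The "40 bytes" figure and the hit on pcibios_enable_irq can be checked with simple arithmetic. The sketch below models the neighbouring .bss words as a plain array so that the overrun stays well-defined in user space. It assumes 4-byte pointers (i386) and assumes that, before the patch, the symbols from the System.map listing above directly followed a 6-entry edd_devices[]; overrun_bytes(), clobbers_enable_irq() and the constants in them are hypothetical illustration names and values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDDMAXNR        6   /* original length of edd_devices[] */
#define EDD_MBR_SIG_MAX 16  /* entries the code could end up writing */
#define PTR_SIZE        4   /* pointer size on i386 (assumption) */

/* Bytes written beyond the end of an array of `capacity` pointers. */
static size_t overrun_bytes(size_t written, size_t capacity)
{
    return written > capacity ? (written - capacity) * PTR_SIZE : 0;
}

/* Word index of pcibios_enable_irq in the modelled region: in the listing
 * it sits 0x10 bytes (4 words) after the end of the array. */
enum { MODEL_WORDS = 16, ENABLE_IRQ_SLOT = EDDMAXNR + 4 };

/* Returns 1 if storing `written` device pointers overwrites the slot that
 * stands in for the pcibios_enable_irq function pointer, 0 otherwise. */
static int clobbers_enable_irq(size_t written)
{
    uint32_t bss[MODEL_WORDS] = {0};
    const uint32_t handler = 0xc01f0000u;   /* made-up "valid" handler address */

    bss[ENABLE_IRQ_SLOT] = handler;
    for (size_t i = 0; i < written && i < MODEL_WORDS; i++)
        bss[i] = 0xc15d0000u + (uint32_t)i; /* made-up heap pointer */

    return bss[ENABLE_IRQ_SLOT] != handler;
}
```

Under these assumptions, 16 writes into a 6-slot array spill (16 - 6) * 4 = 40 bytes, and the fifth spilled word lands on pcibios_enable_irq. That would be consistent with the oops, where EIP equals ecx and looks like a data address rather than kernel text, but this model does not prove it.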
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-13 14:28 ` Matt Domsch
2004-11-13 18:55 ` Matt Domsch
@ 2004-11-14 2:58 ` Matt Domsch
2004-11-14 4:43 ` Linus Torvalds
` (2 more replies)
1 sibling, 3 replies; 70+ messages in thread
From: Matt Domsch @ 2004-11-14 2:58 UTC (permalink / raw)
To: Christian Kujau; +Cc: Linus Torvalds, linux-kernel, Chuck Ebbert

On Sat, Nov 13, 2004 at 08:28:35AM -0600, Matt Domsch wrote:
> It still doesn't explain why Christian's BIOS reports more devices
> than he has, that's still UI, so don't re-apply the edd.S patch just reverted.

Alexander van Heukelum noted to me that addw here modifies CF, so I
think something like this should fix that. Christian, if you're in a
position to test this, I'd really appreciate it. You've been a
fantastic bug reporter / tester!

Not ready for Linus yet, and you'll need to re-apply the previous
edd.S patch which is now reverted in Linus's tree. As your BIOS
reports via CHECK EXTENSIONS PRESENT that you've got more devices than
you actually have, hopefully the int13 EXTENDED READ won't succeed for
nonexistent devices anymore, and then neither will the READ SECTORS
call.

--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

===== arch/i386/boot/edd.S 1.3 vs edited =====
--- 1.3/arch/i386/boot/edd.S 2004-10-20 03:37:11 -05:00
+++ edited/arch/i386/boot/edd.S 2004-11-13 20:31:58 -06:00
@@ -58,8 +58,12 @@
         sti                                     # work around buggy BIOSes
         popw %dx
         popw %si
-        addw $EDD_DEV_ADDR_PACKET_LEN, %sp     # remove packet from stack
-        jnc edd_mbr_store_sig
+        pushfl                                  # save EFLAGS into ebx
+        popl %ebx                               # because addw modifies CF
+        addw $EDD_DEV_ADDR_PACKET_LEN, %sp     # remove packet from stack
+        pushl %ebx                              # get back right CF
+        popfl
+        jnc edd_mbr_store_sig
 
 # otherwise, fall through to the legacy read function
 edd_mbr_read_sectors:
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-14 2:58 ` Matt Domsch
@ 2004-11-14 4:43 ` Linus Torvalds
2004-11-14 11:45 ` Christian
2004-11-14 20:02 ` Christian Kujau
2 siblings, 0 replies; 70+ messages in thread
From: Linus Torvalds @ 2004-11-14 4:43 UTC (permalink / raw)
To: Matt Domsch; +Cc: Christian Kujau, linux-kernel, Chuck Ebbert

On Sat, 13 Nov 2004, Matt Domsch wrote:
>
> Not ready for Linus yet

Indeed. Please don't use pushfl/popfl to save the carry flag. There are
tons of better ways.

For example, use "lea" instead of "add" to not write the flags (and add a
comment). Or save the carry flag in a register with

sbb %bx,%bx

and test %bx later. Or any of a million other _standard_ ways to handle
this problem.

Linus
^ permalink raw reply [flat|nested] 70+ messages in thread
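[Editorial sketch, not part of the original mail.] Why the stack adjustment matters for the following jnc can be shown with a tiny model of the carry flag. This is C rather than real-mode assembly and only imitates the flag behaviour under discussion: an add rewrites CF from its own result, while an lea-style address computation leaves the flags alone. The struct cpu type, the helper names, and the value 0x10 for EDD_DEV_ADDR_PACKET_LEN are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EDD_DEV_ADDR_PACKET_LEN 0x10  /* assumed value, for illustration only */

/* Minimal CPU model: 16-bit stack pointer plus the carry flag. */
struct cpu {
    uint16_t sp;
    bool     cf;  /* set by the BIOS int13 call on error */
};

/* add-style adjustment: CF becomes the carry out of the 16-bit addition,
 * so whatever the BIOS left in CF is lost. */
static void add_to_sp(struct cpu *c, uint16_t imm)
{
    uint32_t sum = (uint32_t)c->sp + imm;
    c->cf = sum > 0xFFFFu;
    c->sp = (uint16_t)sum;
}

/* lea-style adjustment: same arithmetic on sp, flags untouched. */
static void lea_to_sp(struct cpu *c, uint16_t imm)
{
    c->sp = (uint16_t)(c->sp + imm);
}

/* jnc edd_mbr_store_sig: the branch is taken (read treated as a success)
 * whenever CF is clear. */
static bool read_treated_as_success(const struct cpu *c)
{
    return !c->cf;
}
```

In this model, a failed read (CF set) followed by add_to_sp() ends with CF clear, so the failure is treated as a success. That is one plausible way for nonexistent devices to be counted, though the thread later shows the edd.S change alone did not bring the count down from 16. With lea_to_sp() the error indication survives the stack cleanup.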
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-14 2:58 ` Matt Domsch
2004-11-14 4:43 ` Linus Torvalds
@ 2004-11-14 11:45 ` Christian
2004-11-14 20:02 ` Christian Kujau
2 siblings, 0 replies; 70+ messages in thread
From: Christian @ 2004-11-14 11:45 UTC (permalink / raw)
To: Matt Domsch; +Cc: Linus Torvalds, linux-kernel, Chuck Ebbert

Matt Domsch wrote:
>
> Alexander van Heukelum noted to me that addw here modifies CF, so I
> think something like this should fix that. Christian, if you're in a
> position to test this, I'd really appreciate it. You've been a

yes, i'll do so. right now i am off (and late) to sth. else, but i'll test
this in the evening.

thank you,
Christian.
--
BOFH excuse #318:

Your EMAIL is now being delivered by the USPS.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-14 2:58 ` Matt Domsch
2004-11-14 4:43 ` Linus Torvalds
2004-11-14 11:45 ` Christian
@ 2004-11-14 20:02 ` Christian Kujau
2004-11-14 21:55 ` Matt Domsch
2 siblings, 1 reply; 70+ messages in thread
From: Christian Kujau @ 2004-11-14 20:02 UTC (permalink / raw)
To: Matt Domsch; +Cc: Linus Torvalds, linux-kernel, Chuck Ebbert

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

sorry, took me a bit longer to get to the testing.

Matt Domsch wrote:
>
> Not ready for Linus yet, and you'll need to re-apply the previous
> edd.S patch which is now reverted in Linus's tree. As your BIOS

i've applied the patch to a pristine 2.6.10-rc1, so the (currently
reverted) EDD change is still there. tell me, if the patch had to be
applied to sth. else.

but for now i have to say, that it still oopses:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd-2.txt

...
BIOS EDD facility v0.16 2004-Jun-25, 16 devices found
...

(oh, i've added an ide-disk yesterday, so hde will show up in dmesg.)

sorry,
Christian.
- --
BOFH excuse #401:

Sales staff sold a product we don't offer.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBl7nZ+A7rjkF8z0wRAvuHAKCX8TWiDt5DP25OqBEWKecfM6x3HwCeNRoM
1IzHqKpcbWOABXWJ4vC4d1w=
=FiKX
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: Oops in 2.6.10-rc1 (almost solved)
2004-11-14 20:02 ` Christian Kujau
@ 2004-11-14 21:55 ` Matt Domsch
0 siblings, 0 replies; 70+ messages in thread
From: Matt Domsch @ 2004-11-14 21:55 UTC (permalink / raw)
To: Christian Kujau; +Cc: Linus Torvalds, linux-kernel, Chuck Ebbert

On Sun, Nov 14, 2004 at 09:02:33PM +0100, Christian Kujau wrote:
> > Not ready for Linus yet, and you'll need to re-apply the previous
> > edd.S patch which is now reverted in Linus's tree. As your BIOS
>
> i've applied the patch to a pristine 2.6.10-rc1, so the (currently
> reverted) EDD change is still there. tell me, if the patch had to be
> applied to sth. else.
>
> but for now i have to say, that it still oopses:
>
> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd-2.txt

OK, the patch below (which Linus applied to his tree yesterday) should
fix the oopses.

> BIOS EDD facility v0.16 2004-Jun-25, 16 devices found

but the patch to edd.S doesn't resolve that EDD believes you've got 16
devices (I would expect it to report 2, as you have only 2 disks).

Thanks for the quick testing. Back to the drawing board though for
this second part.

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

===== drivers/firmware/edd.c 1.30 vs edited =====
--- 1.30/drivers/firmware/edd.c 2004-06-29 09:44:48 -05:00
+++ edited/drivers/firmware/edd.c 2004-11-13 07:56:00 -06:00
@@ -70,7 +70,7 @@
 static int edd_dev_is_type(struct edd_device *edev, const char *type);
 static struct pci_dev *edd_get_pci_dev(struct edd_device *edev);
 
-static struct edd_device *edd_devices[EDDMAXNR];
+static struct edd_device *edd_devices[EDD_MBR_SIG_MAX];
 
 #define EDD_DEVICE_ATTR(_name,_mode,_show,_test) \
 struct edd_attribute edd_attr_##_name = { \
@@ -728,9 +728,9 @@
 static inline int
 edd_num_devices(void)
 {
-        return min_t(unsigned char,
-                     max_t(unsigned char, edd.edd_info_nr, edd.mbr_signature_nr),
-                     max_t(unsigned char, EDD_MBR_SIG_MAX, EDDMAXNR));
+        return max_t(unsigned char,
+                     min_t(unsigned char, EDD_MBR_SIG_MAX, edd.mbr_signature_nr),
+                     min_t(unsigned char, EDDMAXNR, edd.edd_info_nr));
 }
 
 /**
^ permalink raw reply [flat|nested] 70+ messages in thread