* Oops in 2.6.10-rc1
@ 2004-10-28 13:12 Christian
  2004-10-28 13:29 ` [Alsa-devel] " Jaroslav Kysela
  0 siblings, 1 reply; 62+ messages in thread
From: Christian @ 2004-10-28 13:12 UTC (permalink / raw)
  To: alsa-devel

[repost to alsa-devel as suggested by lkml]

hi,

yesterday i updated to a recent 2.6.10-rc1-BK and booting gives:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
  printing eip:
dfc10ce0
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: snd_ens1371 snd_rawmidi snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc rtc
CPU:    0
EIP:    0060:[<dfc10ce0>]    Not tainted VLI
EFLAGS: 00010282   (2.6.10-rc1)
EIP is at 0xdfc10ce0
eax: 00000000   ebx: dff1f800   ecx: dfc10ce0   edx: dff1f9c4
esi: ffffffed   edi: dff1f800   ebp: dff1f800   esp: de613e50
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 186, threadinfo=de612000 task=deb5e5a0)
Stack: c01fc7b8 dff1f800 000007ff dff1f800 c01fc7ef dff1f800 000007ff dfc1e400
        e082729d dff1f800 dfc1e400 00000000 e08469cf dfc1e400 000001f8 000000d0
        c01667f7 de36da8c c0171759 dffe79e0 dfc1e400 ffffffed dff1f800 dff1f800
Call Trace:
  [<c01fc7b8>] pci_enable_device_bars+0x28/0x40
  [<c01fc7ef>] pci_enable_device+0x1f/0x40
  [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371]
  [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd]
  [<c01667f7>] __lookup_hash+0xa7/0xe0
  [<c0171759>] alloc_inode+0x129/0x150
  [<e0827867>] snd_audiopci_probe+0x87/0x1e0 [snd_ens1371]
  [<c016f6c2>] dput+0x92/0x250
  [<c01fd202>] pci_device_probe_static+0x52/0x70
  [<c01fd24c>] __pci_device_probe+0x2c/0x30
  [<c01fd27c>] pci_device_probe+0x2c/0x60
  [<c025adff>] bus_match+0x3f/0x80
  [<c025af52>] driver_attach+0x52/0xa0
  [<c025b478>] bus_add_driver+0x98/0xe0
  [<c025ba8f>] driver_register+0x2f/0x40
  [<c01fd530>] pci_register_driver+0x40/0x50
  [<e08279cf>] alsa_card_ens137x_init+0xf/0x13 [snd_ens1371]
  [<c01341ba>] sys_init_module+0x18a/0x270
  [<c01041fb>] syscall_call+0x7/0xb
Code: 5f 64 65 76 38 62 00 00 00 00 00 00 00 00 00 02 00 00 00 88 0c c1 df
08 0d c1 df 10 fa 3a c0 00 fa 3a c0 00 00 00 00 6c 5a c1 df <0a> 00 00 00
36 46 37 46 00 00 00 00 f0 0c c1 df 69 6e 74 31 33
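
The call trace above follows a regular layout: an address in `[<...>]`, then `symbol+0xoffset/0xsize`, and an optional module name in square brackets. A minimal Python sketch, assuming that layout holds for every frame, splits a few of the lines from this oops into fields. The names `TRACE_LINE`, `SAMPLE` and `parse_trace` are made up for this illustration and are not part of any kernel tooling.

```python
import re

# One call-trace frame: "[<addr>] symbol+0xoff/0xsize [module]" (module optional).
TRACE_LINE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\]\s+"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Four frames copied from the oops in this message.
SAMPLE = """\
  [<c01fc7b8>] pci_enable_device_bars+0x28/0x40
  [<c01fc7ef>] pci_enable_device+0x1f/0x40
  [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371]
  [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd]
"""


def parse_trace(text):
    """Return one dict per recognisable call-trace line in text."""
    frames = []
    for line in text.splitlines():
        m = TRACE_LINE.search(line)
        if m is None:
            continue  # not a trace line, skip it
        frames.append({
            "addr": int(m["addr"], 16),
            "symbol": m["symbol"],
            "offset": int(m["offset"], 16),
            "size": int(m["size"], 16),
            # Frames without a bracketed module belong to the core kernel image.
            "module": m["module"] or "(built-in)",
        })
    return frames


if __name__ == "__main__":
    for frame in parse_trace(SAMPLE):
        print(f'{frame["module"]:>12}  {frame["symbol"]}+{frame["offset"]:#x}')
```

Grouping frames by module this way makes it easier to see which entries come from snd_ens1371 and which from the core kernel. The parsed list is only a set of candidates: without frame pointers, entries in a trace can be leftovers on the stack rather than a real call chain.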

full dmesg output here:
www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg.txt

updating to an even more recent tree (read: updated just now) does not help, and 
the problem is really triggered when loading snd_ens1371. well, the only
"problem" is the oops, and that i have no sound :-(

just strange that nobody else cries out loud. or am i just lacking 
enough information? ok, this is debian/unstable (i386), gcc3.4.2, 
libc2.3.2, pls tell me if you need more information.

thank you,
Christian.
-- 
BOFH excuse #374:

It's the InterNIC's fault.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Alsa-devel] Oops in 2.6.10-rc1
  2004-10-28 13:12 Oops in 2.6.10-rc1 Christian
@ 2004-10-28 13:29 ` Jaroslav Kysela
  2004-10-28 14:09   ` Christian
  0 siblings, 1 reply; 62+ messages in thread
From: Jaroslav Kysela @ 2004-10-28 13:29 UTC (permalink / raw)
  To: Christian; +Cc: alsa-devel, LKML

On Thu, 28 Oct 2004, Christian wrote:

>   [<c01fc7b8>] pci_enable_device_bars+0x28/0x40
>   [<c01fc7ef>] pci_enable_device+0x1f/0x40
>   [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371]
>   [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd]

It's a bit dead-lock, because we cannot help you. It seems that
the pci structure passed to our code is broken. The driver has had
no changes in initialization for a long time.

						Jaroslav

-----
Jaroslav Kysela <perex@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SUSE Labs

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Alsa-devel] Oops in 2.6.10-rc1
  2004-10-28 13:29 ` [Alsa-devel] " Jaroslav Kysela
@ 2004-10-28 14:09   ` Christian
  2004-11-04 15:16     ` Christian Kujau
  0 siblings, 1 reply; 62+ messages in thread
From: Christian @ 2004-10-28 14:09 UTC (permalink / raw)
  To: LKML; +Cc: alsa-devel

Jaroslav Kysela wrote:
> On Thu, 28 Oct 2004, Christian wrote:
> 
> 
>>  [<c01fc7b8>] pci_enable_device_bars+0x28/0x40
>>  [<c01fc7ef>] pci_enable_device+0x1f/0x40
>>  [<e082729d>] snd_ensoniq_create+0x1d/0x480 [snd_ens1371]
>>  [<e08469cf>] snd_card_new+0x1cf/0x2c0 [snd]
> 
> 
> It's a bit dead-lock, because we cannot help you. It seems that
> the pci structure passed to our code is broken. The driver has had
> no changes in initialization for a long time.

so, it's a kernel problem again, not related to the alsa framework?

i see in

http://www.kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.10-rc1

[...]
<rddunlap@osdl.org>
	[PATCH] i386/io_apic init section fixups

<wli@holomorphy.com>
	[PATCH] vm: convert users of remap_page_range() under sound/ to
	use remap_pfn_range()
[...]

so i'll revert the patches and see what it gives.

thank you,
Christian
-- 
BOFH excuse #131:

telnet: Unable to connect to remote host: Connection refused

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Alsa-devel] Oops in 2.6.10-rc1
  2004-10-28 14:09   ` Christian
@ 2004-11-04 15:16     ` Christian Kujau
  2004-11-05  2:35       ` Christian Kujau
  2004-11-07  1:24       ` Christian Kujau
  0 siblings, 2 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-04 15:16 UTC (permalink / raw)
  To: LKML; +Cc: alsa-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hm,

still no sound with snd_ens1371 but now i spent some time to find out how
to revert a patch with bk. while compiling is still ongoing, let me tell
you how i tried to revert the patch with bk, because i am not entirely
sure if i do the right thing here:

bk changes > ../changes-04-11-2004.txt

as written before, i suspect (!) two changes here:

> [...]
> <rddunlap@osdl.org>
>     [PATCH] i386/io_apic init section fixups
> 
> <wli@holomorphy.com>
>     [PATCH] vm: convert users of remap_page_range() under sound/ to
>     use remap_pfn_range()
> [...]
> 
> so i'll revert the patches and see what it gives.

in ../changes-04-11-2004.txt i found out the ChangeSet numbers:
1.1988.72.76 + 1.2000.5.77. then i did

bk undo -a1.1988.72.76

only to find out that i misread the manual and 1.1988.72.76 is still in
place. i did

bk changes > ../changes-1.1988.72.76.txt

and the very patch has a different ChangeSet now: 1.2202. so i did

bk undo -a1.2201

is this the right way to revert patches when subsequent patches might not
allow a simple "bk undo -r<vers>" (because subsequent patches rely on
this single ChangeSet)?

thank you for your assistance,
Christian
- --
BOFH excuse #182:

endothermal recalibration
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBike6+A7rjkF8z0wRAl/DAKDAMP31cXrzjBnnl+713F1zJ5ShQQCdFYRr
TpRkMTwdhZq9SvoZEPR2Plw=
=sm2q
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Alsa-devel] Oops in 2.6.10-rc1
  2004-11-04 15:16     ` Christian Kujau
@ 2004-11-05  2:35       ` Christian Kujau
  2004-11-05 11:40         ` holborn
  2004-11-07  1:24       ` Christian Kujau
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-05  2:35 UTC (permalink / raw)
  To: LKML; +Cc: alsa-devel, perex

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi again,

i *think* i found the ChangeSet leading to the bug i tried to report in
 http://marc.theaimsgroup.com/?l=linux-kernel&m=109888178603516&w=2

the error is still present here (and only here? strange...), the latest -BK
does not fix it. i had some difficulties in telling BK to do the right
thing. to summarise the error:

- - upon loading of snd_ens1371 the Oops occurs. system is still stable
then, but no sound available.
- - this occurred somewhere between 2.6.9 (released 15-Oct-2004) and 2.6.10-rc1
(released 22-Oct-2004)

one interesting changeset was:

ChangeSet@1.2000.7.1, 2004-10-20 20:33:06+02:00, perex@suse.cz
  Merge suse.cz:/home/perex/bk/linux-sound/linux-2.5
  into suse.cz:/home/perex/bk/linux-sound/linux-sound

i tried to back it out:

$ bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test

but the said ChangeSet was still there (of course). i tried to back it out
(now for sure):

$ bk undo -a1.2010
(hm: the changesets get renumbered every time i "do" something with the
tree) this one reverted quite a few ChangeSets but i let it happen.

compiling & booting this thing goes fine and i am now running 2.6.9-BK(?)
with working snd_ens1371.

if someone could give me a hint here what to do next or perhaps tell me
that the whole thing was totally pointless - please say so.
i am somehow lost as to which is the right person to bug here.

thank you for your time,
Christian.
- --
BOFH excuse #328:

Fiber optics caused gas main leak
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBiucN+A7rjkF8z0wRAkpKAJ0bbevHqmpU/Ut3r5TbWgfu42cGBACgsrhm
X8euqIjgc8KNCWl50oys/Yw=
=8VM9
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-05  2:35       ` Christian Kujau
@ 2004-11-05 11:40         ` holborn
  0 siblings, 0 replies; 62+ messages in thread
From: holborn @ 2004-11-05 11:40 UTC (permalink / raw)
  To: alsa-devel



I use snd_ens1371 and linux-2.6.10-rc1 and works ....

Josep





^ permalink raw reply	[flat|nested] 62+ messages in thread

* Oops in 2.6.10-rc1
  2004-11-04 15:16     ` Christian Kujau
  2004-11-05  2:35       ` Christian Kujau
@ 2004-11-07  1:24       ` Christian Kujau
  2004-11-07  7:02         ` Linus Torvalds
  2004-11-07 13:05         ` Pekka Enberg
  1 sibling, 2 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-07  1:24 UTC (permalink / raw)
  To: LKML; +Cc: alsa-devel, perex

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi again,

i *think* i found the ChangeSet leading to the bug i tried to report in
 http://marc.theaimsgroup.com/?l=linux-kernel&m=109888178603516&w=2

the error is still present here (and only here? strange...), the latest -BK
does not fix it. i had some difficulties in telling BK to do the right
thing. to summarise the error:

- - upon loading of snd_ens1371 the Oops occurs. system is still stable
then, but no sound available.
- - this occurred somewhere between 2.6.9 (released 15-Oct-2004) and 2.6.10-rc1
(released 22-Oct-2004)

one interesting changeset was:

ChangeSet@1.2000.7.1, 2004-10-20 20:33:06+02:00, perex@suse.cz
  Merge suse.cz:/home/perex/bk/linux-sound/linux-2.5
  into suse.cz:/home/perex/bk/linux-sound/linux-sound

i tried to back it out:

$ bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test

but the said ChangeSet was still there (of course). i tried to back it out
(now for sure):

$ bk undo -a1.2010
(hm: the changesets get renumbered every time i "do" something with the
tree) this one reverted quite a few ChangeSets but i let it happen.

compiling & booting this thing goes fine and i am now running 2.6.9-BK(?)
with working snd_ens1371.

if someone could give me a hint here what to do next or perhaps tell me
that the whole thing was totally pointless - please say so.
i am somehow lost as to which is the right person to bug here.

thank you for your time,
Christian.
- --
BOFH excuse #328:

Fiber optics caused gas main leak
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBjXlZ+A7rjkF8z0wRAqaVAJ9ljiIpxi01SblgEg/ce/Vd/uYksQCfeuJ9
hRGA0/17ttZ83xRQDb8jfhs=
=DQYp
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07  1:24       ` Christian Kujau
@ 2004-11-07  7:02         ` Linus Torvalds
  2004-11-07 13:10           ` Christian Kujau
  2004-11-07 13:05         ` Pekka Enberg
  1 sibling, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2004-11-07  7:02 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, alsa-devel, perex



On Sun, 7 Nov 2004, Christian Kujau wrote:
> 
> if someone could give me a hint here what to do next or perhaps tell me
> that the whole thing was totally pointless - please say so.
> i am somehow lost as to which is the right person to bug here.

Since you seem to be a BK user, try doing a

	bk revtool sound/pci/ens1370.c 

and see if you can find the change that caused your problem. Of course, 
the real change might be somewhere else in the sound driver initialization 
path, so it's not like just that one file might be the cause. Regardless, 
the more you can pinpoint when the problem started, the better.

Also, if you enable frame pointers (under kernel debugging), the traceback
will look a bit better. As it is, your oops looks like something has
jumped off into la-la-land by jumping through a bad pointer (the value is
still in %ecx), but it's definitely not clear _where_ that happened.  
Your trace points to pci_enable_device_bars(), but that may well be just
stale stack contents.

		Linus
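
The observation that the bad jump target is still sitting in %ecx can be checked mechanically against the register dump. A minimal Python sketch, assuming the i386 oops layout shown in the original report: it extracts EIP and lists every general-purpose register holding the same value. `OOPS_REGISTERS` and `registers_matching_eip` are names invented for this illustration.

```python
import re

# Register lines copied from the oops in the original report.
OOPS_REGISTERS = """\
EIP:    0060:[<dfc10ce0>]    Not tainted VLI
eax: 00000000   ebx: dff1f800   ecx: dfc10ce0   edx: dff1f9c4
esi: ffffffed   edi: dff1f800   ebp: dff1f800   esp: de613e50
"""


def registers_matching_eip(text):
    """Return names of general-purpose registers whose value equals EIP."""
    eip_match = re.search(r"EIP:\s+[0-9a-f]{4}:\[<([0-9a-f]{8})>\]", text)
    if eip_match is None:
        raise ValueError("no EIP line found")
    eip = eip_match.group(1)
    # Lower-case three-letter names (eax ... esp) followed by an 8-digit hex value.
    registers = re.findall(r"\b(e[a-z]{2}):\s+([0-9a-f]{8})", text)
    return [name for name, value in registers if value == eip]


if __name__ == "__main__":
    print(registers_matching_eip(OOPS_REGISTERS))
```

A register that matches EIP is consistent with an indirect call or jump through that register, but it does not show where in the code the call was made.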

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07  1:24       ` Christian Kujau
  2004-11-07  7:02         ` Linus Torvalds
@ 2004-11-07 13:05         ` Pekka Enberg
  2004-11-07 13:43           ` Christian Kujau
  1 sibling, 1 reply; 62+ messages in thread
From: Pekka Enberg @ 2004-11-07 13:05 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, alsa-devel, perex, penberg

Hi Christian,

On Sun, 07 Nov 2004 02:24:41 +0100, Christian Kujau <evil@g-house.de> wrote:
> if someone could give me a hint here what to do next or perhaps tell me
> that the whole thing was totally pointless - please say so.
> i am somehow lost as to which is the right person to bug here.

I am running 2.6.10-rc1-bk14 with ens-1371 working ok. Could you
please post your .config so I can try to reproduce your oops?

               Pekka

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07  7:02         ` Linus Torvalds
@ 2004-11-07 13:10           ` Christian Kujau
  2004-11-07 16:02             ` Christian Kujau
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-07 13:10 UTC (permalink / raw)
  To: linux-kernel; +Cc: alsa-devel, perex

On Sat, 6 Nov 2004 23:02:28 -0800 (PST), Linus Torvalds wrote
>
> Since you seem to be a BK user, try doing a

s/BK user/BK beginner/

> 
> 	bk revtool sound/pci/ens1370.c
> 
> and see if you can find the change that caused your problem.

hm, i already found the ChangeSet (ChangeSet@1.2000.7.1), but it seems
the ChangeSets get renumbered when linux makes progress. the issuer of
this changeset did not comment yet.

> Of course, the real change might be somewhere else in the 
> sound driver initialization path, so it's not like just that 
> one file might be the cause. Regardless, the more you can 
> pinpoint when the problem started, the better.

yes.

> 
> Also, if you enable frame pointers (under kernel debugging), 
> the traceback will look a bit better. As it is, your oops 

ah, ok, will do.

thank you for your time,
Christian.
-- 
BOFH excuse #206:

Police are examining all internet packets in the search for a
narco-net-trafficker

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 13:05         ` Pekka Enberg
@ 2004-11-07 13:43           ` Christian Kujau
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-07 13:43 UTC (permalink / raw)
  To: LKML

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Pekka Enberg wrote:
>
> I am running 2.6.10-rc1-bk14 with ens-1371 working ok. Could you
> please post your .config so I can try to reproduce your oops?

i put it on
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config

thank you,
Christian.
- --
BOFH excuse #361:

Communist revolutionaries taking over the server room and demanding all
the computers in the building or they shoot the sysadmin. Poor misguided
fools.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBjiae+A7rjkF8z0wRAqo9AJ0e0iHAXi2Q6oI/UKl1vBw/dPvODQCfSjfh
ucfAhJkoCMS5gGxt/HtSKrw=
=pqTN
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 13:10           ` Christian Kujau
@ 2004-11-07 16:02             ` Christian Kujau
  2004-11-07 16:57               ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-07 16:02 UTC (permalink / raw)
  To: evil; +Cc: linux-kernel, alsa-devel, perex, torvalds

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>>	bk revtool sound/pci/ens1370.c
>>
>>and see if you can find the change that caused your problem.

since i got this oops between 2.6.9 and 2.6.10-rc1 i am still assuming
that the change was made somewhere between 15-Oct-2004 (2.6.9) and
22-Oct-2004 (2.6.10-rc1). so the only Changeset matching this timespan is:

- -------------------------
ChangeSet@1.2011, 2004-10-20 08:10:43-07:00, rusty@rustcorp.com.au
[PATCH] module_param_array() should take a pointer

module_param_array() takes a variable to put the number of elements in.
Looking through the uses, many people don't care, so they declare a dummy
or share one variable between several parameters.  The latter is
problematic because sysfs uses that number to decide how many to display.

The solution is to change the variable arg to a pointer, and if the
pointer  is NULL, use the "max" value.  This change is fairly small, but
fixing up the callers is a lot of (trivial) churn.

  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Signed-off-by: Andrew Morton <akpm@osdl.org>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- -------------------------

>>Also, if you enable frame pointers (under kernel debugging), 
>>the traceback will look a bit better. As it is, your oops 

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config

the new config has this enabled:

CONFIG_DEBUG_DRIVER=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_KOBJECT=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_FRAME_POINTER=y
CONFIG_KPROBES=y

the first link shows the output of dmesg after doing "modprobe snd-ens1371". after this,
snd-ens1371 seems to be loaded:

Module                  Size  Used by
snd_ens1371            29928  1
snd_rawmidi            25952  1 snd_ens1371
snd_ac97_codec         77856  1 snd_ens1371
snd_pcm               101768  2 snd_ens1371,snd_ac97_codec
snd_timer              31940  1 snd_pcm
snd                    51620  5 snd_ens1371,snd_rawmidi,snd_ac97_codec,snd_pcm,snd_timer
soundcore               9440  1 snd
snd_page_alloc          7620  1 snd_pcm
ipv6                  260480  8
psmouse                20424  0
rtc                    20188  0

but is not working and cannot be unloaded:

prinz:~$ rmmod snd_ens1371
ERROR: Module snd_ens1371 is in use

there was an answer from the alsa-devel folks here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=109897024116288&w=2

"It's a bit dead-lock, because we cannot help you. It seems that
the pci structure passed to our code is broken. The driver has had
no changes in initialization for a long time."

i hope this information will help a bit.
thank you for your assistance, i really appreciate it
Christian

(still wondering why nobody else has this bug, 1370 is not *that* weird, i
thought)


PS: if someone could explain to me why the ChangeSet numbers are always
different: i've used "bk revtool sound/pci/ens1370.c" to find out the
changes for this file and the suspicious patch reads

     sound/pci/ens1370.c@1.54.1.1, 2004-10-20....

in "bk revtool". the changelog however reads:

     ChangeSet@1.2011, 2004-10-20 08:10:43-07:00, rusty@rustcorp.com.au

- --
BOFH excuse #62:

need to wrap system in aluminum foil to fix problem
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBjkcE+A7rjkF8z0wRAkR/AJ98DKSv5dZfOSJdKGWdz1LWPlItgQCgvS1A
iS1wUtTgHzsx4JFpqsQGt68=
=Hv9R
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 16:02             ` Christian Kujau
@ 2004-11-07 16:57               ` Linus Torvalds
  2004-11-07 18:31                 ` Christian Kujau
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2004-11-07 16:57 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-kernel, alsa-devel, perex



On Sun, 7 Nov 2004, Christian Kujau wrote:
> 
> since i got this oops between 2.6.9 and 2.6.10-rc1 i am still assuming
> that the change was made somewhere between 15-Oct-2004 (2.6.9) and
> 22-Oct-2004 (2.6.10-rc1).

Not necessarily. The ALSA merge is the most likely reason for the oops, 
and since ALSA development does not merge with the kernel very often, it 
may be some much older change in the ALSA tree.

You can check the ALSA tree _before_ the merge, by doing (in the current 
tree):

	bk undo -a1.2000.7.2

which should give you a tree without any of "my" stuff, ie it was what 
Jaroslav was working on before he merged it into the standard tree.

(BK revision numbers change on merges, so the above number is not 
necessarily the right one unless you have the current -bk tree. It should 
have a changeset something like:

	ChangeSet@1.2000.7.2, 2004-10-20 20:51:33+02:00, perex@suse.cz
	  Merge suse.cz:/home/perex/bk/linux-sound/linux-sound
	  into suse.cz:/home/perex/bk/linux-sound/work

so that you can double-check).

> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops.txt

Yup, it's a call through a bad pointer again, and again the EIP value 
can be found in %ecx. But the source of the bug is not clear. The stack 
trace implies "show_stack()", but that function doesn't do any indirect 
calls, so I suspect the frame pointer didn't help in this case. 

And it's not "pci_enable_device()" either (which was there last time too),
since that one calls "pci_enable_device_bars()" at the point it shows in
the stack trace.

Quite frankly, it looks like something smashed the stack, and the fact 
that it happens _around_ when "pci_enable_device()" was called makes me 
seriously suspect the IRQ handler for the device. That's when IRQ routing 
is enabled, so often the interrupts start at that point. And since 
FRAME_POINTER didn't make the stack frame look sane, it's very possible 
that the bogus call isn't due to a real "call", but due to a return from a 
broken stack.

> there was an answer from the alsa-devel folks here:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=109897024116288&w=2
> 
> "It's a bit dead-lock, because we cannot help you. It seems that
> the pci structure passed to our code is broken. The driver has had
> no changes in initialization for a long time."

I seriously doubt that it's the PCI structure being broken.  It's the ALSA 
merge, almost certainly - it's just that the stack is so confused that 
it's hard to tell where the bug has happened.

And I'll double-check the "regparm" changes, just in case. They change
some irq calling conventions, although none of the involved stuff seems to
be implied here.

A quick suggestion: make sure that there is not some stale object file 
lying around confusing things about memory layout, and do a "make clean" 
and make sure that all old modules are clean too and re-installed. The 
kernel dependencies should be correct, but even then there can be problems 
with clocks that are off a bit etc.

> (still wondering why nobody else has this bug, 1370 is not *that* weird, i
> thought)

Yes, that makes me suspicious, and is one reason why I wonder if it's just 
your tree not being built right.

> PS: if someone could explain to me why the ChangeSet numbers are always
> different: i've used "bk revtool sound/pci/ens1370.c" to find out the
> changes for this file and the suspicious patch reads
> 
>      sound/pci/ens1370.c@1.54.1.1, 2004-10-20....
> 
> in "bk revtool". the changelog however reads:
> 
>      ChangeSet@1.2011, 2004-10-20 08:10:43-07:00, rusty@rustcorp.com.au

There are different revision numbers: there's the revision number for the 
_file_, and there is the revision number for the _change_. 

Also, both (or one) of them can change when a merge occurs, since other 
people may have had different merge histories, and in a distributed 
environment the revision numbers are a lot more fluid than in CVS.

		Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 16:57               ` Linus Torvalds
@ 2004-11-07 18:31                 ` Christian Kujau
  2004-11-07 18:44                   ` Linus Torvalds
  2004-11-07 23:45                     ` Christian Kujau
  0 siblings, 2 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-07 18:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, alsa-devel, perex

On Sun, 7 Nov 2004 08:57:40 -0800 (PST), Linus Torvalds wrote
> 
> You can check the ALSA tree _before_ the merge, by doing (in 
> the current tree):
> 
> 	bk undo -a1.2000.7.2
> 
> which should give you a tree without any of "my" stuff, ie it 
> was what Jaroslav was working on before he merged it into the 
> standard tree.

yes, i already did so, i think:

http://marc.theaimsgroup.com/?l=linux-kernel&m=109979092216919&w=2

but i did it this way:
 bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test
 bk undo -a1.2010

(probably wrong, so i'll repeat it as you suggested)

> (BK revision numbers change on merges, so the above number is 
> not necessarily the right one unless you have the current -bk 

aha!

> A quick suggestion: make sure that there is not some stale 
> object file lying around confusing things about memory layout, 
> and do a "make clean" and make sure that all old modules are 
> clean too and re-installed.

really: i always do "make clean", even "make mrproper" sometimes, just
to be sure. and i am quite certain, that i did not forget to install the
modules. but i'll keep my eyes open, yes.

> The kernel dependencies should be correct, but even then there can be
> problems with clocks that are off a bit etc.

i'm updating via "ntpdate" on every boot. i am even using a (faster) 2nd
machine for my build and the bk things right now: building a current -bk
on both hosts gives me this error.

> Yes, that makes me suspicious, and is one reason why I wonder 
> if it's just your tree not being built right.

i'll build a -bk snapshot from a tar.bz2 later on and see what it gives.

> There are different revision numbers: there's the revision 
> number for the _file_, and there is the revision number for 
> the _change_.

aha. it was kinda confusing...now i got it, i think ;)

again: thank you for your time on this rainy weekend,
Christian.
-- 
BOFH excuse #8:

static buildup

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 18:31                 ` Christian Kujau
@ 2004-11-07 18:44                   ` Linus Torvalds
  2004-11-07 23:45                     ` Christian Kujau
  1 sibling, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-07 18:44 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-kernel, alsa-devel, perex



On Sun, 7 Nov 2004, Christian Kujau wrote:

> On Sun, 7 Nov 2004 08:57:40 -0800 (PST), Linus Torvalds wrote
> > 
> > You can check the ALSA tree _before_ the merge, by doing (in 
> > the current tree):
> > 
> > 	bk undo -a1.2000.7.2
> > 
> > which should give you a tree without any of "my" stuff, ie it 
> > was what Jaroslav was working on before he merged it into the 
> > standard tree.
> 
> yes, i already did so, i think:
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=109979092216919&w=2
> 
> but i did it this way:
>  bk clone -r1.2000.7.1 linux-2.6-BK linux-2.6-BK-test
>  bk undo -a1.2010

Hmm.. That may well have worked fine, but it sounds in that post like you
tried to undo the ALSA stuff, and what I suggested was really to do the
reverse: take _only_ the ALSA changes, and then if it still fails, at
least you have now pinpointed it a bit more (admittedly to the _likely_
source, but that's as it should be: you narrow down the "known bad" source
base until you've narrowed it down to the smallest change you can find
that causes the problem).

> > Yes, that makes me suspicious, and is one reason why I wonder 
> > if it's just your tree not being built right.
> 
> i'll build a -bk snapshot from a tar.bz2 later on and see what it gives.

Sounds like you're doing everything right, but hey, it can't hurt to 
double-check.

		Linus
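
The narrowing-down strategy described above is a binary search over history. A minimal Python sketch under two loud assumptions: the candidate changesets can be put in a single oldest-to-newest order (real BK history has merges, so this is a simplification), and `is_bad` is a hypothetical callback standing in for "check out that state, build, boot, modprobe snd-ens1371, look for the oops". The changeset labels below are placeholders, not real BK revision numbers.

```python
def first_bad_changeset(changesets, is_bad):
    """Find the earliest changeset for which is_bad() returns True.

    changesets: list ordered oldest to newest (assumed linear history).
    is_bad: callable returning True if the tree at that changeset oopses.
    Assumes every changeset after the first bad one is also bad.
    """
    if not changesets:
        raise ValueError("no changesets to search")
    if not is_bad(changesets[-1]):
        raise ValueError("newest changeset is good, nothing to find")

    lo, hi = 0, len(changesets) - 1  # invariant: the first bad one is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(changesets[mid]):
            hi = mid        # culprit is mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return changesets[lo]


if __name__ == "__main__":
    # Placeholder history of 16 changesets; pretend the 11th introduced the oops.
    history = [f"cs{i}" for i in range(1, 17)]
    tested = []

    def is_bad(cs):
        tested.append(cs)
        return history.index(cs) >= 10

    print(first_bad_changeset(history, is_bad), "after", len(tested), "builds")
```

Each round halves the suspect range, so the number of build-and-boot cycles grows with the logarithm of the number of changesets rather than linearly.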

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 18:31                 ` Christian Kujau
@ 2004-11-07 23:45                     ` Christian Kujau
  2004-11-07 23:45                     ` Christian Kujau
  1 sibling, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-07 23:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, alsa-devel, linux-sound

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christian Kujau wrote:
> On Sun, 7 Nov 2004 08:57:40 -0800 (PST), Linus Torvalds wrote
> 
>>	bk undo -a1.2000.7.2
>>
>>which should give you a tree without any of "my" stuff, ie it 
>>was what Jaroslav was working on before he merged it into the 
>>standard tree.

i did so from a current tree (bk pull, undo, -r get) and it's working
fine (url wraps):

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops-2.6.9_a1.2000.7.2.txt

so i can see with "bk changes" that the ChangeSet is still there. this is
what i expected, because -a says:

- -a<rev>   Remove all changesets which occurred after <rev>.

what i did not expect is that this ChangeSet is now *not* the culprit,
because there is no oops. am i right? [1]

>>Yes, that makes me suspicious, and is one reason why I wonder 
>>if it's just your tree not being built right.
> 
> i'll build a -bk snapshot from a tar.bz2 later on and see what it gives.

i've built from linux-2.6.10-rc1.tar.bz2 with patch-2.6.10-rc1-bk17.bz2
from kernel.org with the same .config and "modprobe snd-ens1371" oopses as
expected :(

> Hmm.. That may well have worked fine, but it sounds in that post like
> you tried to undo the ALSA stuff, and what I suggested was really to
> do the reverse: take _only_ the ALSA changes, and then if it still

yes, i wanted to undo the alsa changes because i suspected the alsa
framework (sorry guys) and wanted to see if it still oopses when the
latest alsa patch was not applied.

i did another thing: i enabled the (deprecated) OSS driver (es1371.ko)
and tried to load it:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-OSS.txt

it oopses.
- - you said it's not a b0rken pci thingy
- - i have to assume now that it's not an ALSA issue (since oss oopses too)
- - it is OSS? the driver? i've CC'ed linux-sound...


> fails, at least you have now pinpointed it a bit more (admittedly to
> the _likely_ source, but that's as it should be: you narrow down the
> "known bad" source base until you've narrowed it down to the smallest
> change you can find that causes the problem).

yes, like Documentation/BUG-HUNTING says. but i seem to have difficulties
in using my tools (bk). sorry for that.

> Sounds like you're doing everything right, but hey, it can't hurt to
> double-check.

yes, i really hope that it's not just a user error (on my side). building
kernels since 2.0...but you never know...


thanks again for help,
Christian
(whose only wish these days is to get over this strange thing and not
waste people's precious time with a "sound driver". hey, at least the
box is booting...)

- --
BOFH excuse #224:

Jan  9 16:41:27 huber su: 'su root' succeeded for .... on /dev/pts/1
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBjrOp+A7rjkF8z0wRAl59AKCEbRRzsGujcOlLUA74taFZJb8H0ACfUUxQ
nVQHjBXRBBn9BgSs7cLhTlY=
=wb90
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-07 23:45                     ` Christian Kujau
@ 2004-11-08  1:16                       ` Linus Torvalds
  -1 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-08  1:16 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, alsa-devel, linux-sound, Greg KH



On Mon, 8 Nov 2004, Christian Kujau wrote:
> 
> what i did not expect is that this ChangeSet is now *not* the culprit,
> because there is no oops. am i right? [1]

Yes.

So now I'd like to know _where_ the culprit is, since it turned out to be 
not the ALSA code. 

> i did another thing: i enabled the (deprecated) OSS driver (es1371.ko)
> tried to load this thing:
> 
> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-OSS.txt
> 
> it oopses.
>  - you said it's not a b0rken pci thingy
>  - i have to assume now that it's not an ALSA issue (since oss oopses too)
>  - it is OSS? the driver? i've CC'ed linux-sound...

Sounds like something else changed, and likely the ALSA _and_ the OSS 
driver both broke. Which is not all that unlikely, since I suspect they 
share a lot of history.

> yes, like Documentation/BUG-HUNTING says. but i seem to have difficulties
> in using my tools (bk). sorry for that.

Not your fault. Think of this as a learning experience ;)

Anyway, now that the _other_ driver also oopses, and with a very similar 
oops too, it looks like they both depended on some undocumented (or 
changed) detail in the PCI layer. Next step would be to see if the thing 
that breaks is this merge:

	ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
	  Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
	  into ppc970.osdl.org:/home/torvalds/v2.6/linux

which merges Greg's PCI/driver model changes.

It's all the same steps you took with the ALSA merge, you're a
professional by now ;)

Greg, have you followed this thread?

> (whose only wish these days is to get over this strange thing and not
> wasting peoples precious time with a "sound driver". hey, at least  the
> box is booting...)

Hey, sound is important. And especially if you somehow found something 
non-sound that just broke sound by mistake, all the more important to fix 
it.

		Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08  1:16                       ` Linus Torvalds
@ 2004-11-08 13:01                         ` Christian Kujau
  -1 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-08 13:01 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Linus Torvalds, alsa-devel, linux-sound, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds schrieb:
> 
> Not your fault. Think of this as a learning experience ;)

it definitely is, yes.

> Anyway, now that the _other_ driver also oopses, and with a very similar 
> oops too, so it looks like they both depended on some undocumented (or 
> changed) detail in the PCI layer. Next step would be to see if the thing 
> that breaks is this merge:

may i ask how you come to this conclusion? by technical knowledge or could
this be deduced by some bk magic too?

> 
> 	ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
> 	  Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
> 	  into ppc970.osdl.org:/home/torvalds/v2.6/linux
> 
> which merges Greg's PCI/driver model changes.
> 
> It's all the same steps you took with the ALSA merge, you're a
> professional by now ;)

i did "bk undo -a1.2463" from a current -BK tree and it oopses:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-a1.2463.txt

(i've booted with different boot options this time, because i noticed that
i always booted with "acpi=force". changing this did not help either.)

next i wanted to do "bk undo -r1.2463" to see if it does *not* break
without this ChangeSet (because i already know it *breaks* with this
ChangeSet) but that would leave some parentless child deltas. i read in
the BK docs that "bk cset -x<version>" would help here. but
"bk cset -x1.2463" aborts:

- ---------------------
evil@atlant:~/kernel/linux-2.6-BK$ bk changes | head -n3
ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
  Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
  into ppc970.osdl.org:/home/torvalds/v2.6/linux

evil@atlant:~/kernel/linux-2.6-BK$ bk cset -x1.2463
cset: Merge cset found in revision list: (1.2463).  Aborting. (cset1)
- ---------------------

i've put everything on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/
(the .configs and the oopses are there). i've double checked a kernel built
from "bk undo -a1.2000.7.2" yesterday but the result was the same (no oops)

thank you,
Christian.
- --
BOFH excuse #121:

halon system went off and killed the operators.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBj24z+A7rjkF8z0wRAu0tAJ9g7mfG0iz/LvSAafD7LWKNu9qvLQCg3fjW
1oMRRK8oSqH5oZsudyIQVtw=
=f8CQ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 13:01                         ` Christian Kujau
@ 2004-11-08 18:13                           ` Linus Torvalds
  -1 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-08 18:13 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, alsa-devel, linux-sound, Greg KH



On Mon, 8 Nov 2004, Christian Kujau wrote:
> 
> > Anyway, now that the _other_ driver also oopses, and with a very similar 
> > oops too, so it looks like they both depended on some undocumented (or 
> > changed) detail in the PCI layer. Next step would be to see if the thing 
> > that breaks is this merge:
> 
> may i ask how you come to this conclusion? by technical knowledge or could
> this be deduced by some bk magic too?

No, just gut feel. If the pre-merge ALSA works, and the post-merge one 
doesn't, and the oops in both cases happen somewhere close to where it 
does "pci_enable_device()", there's not a lot left. There are interrupts, 
and there is the PCI layer...

> > 	ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
> > 	  Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
> > 	  into ppc970.osdl.org:/home/torvalds/v2.6/linux
> > 
> > which merges Greg's PCI/driver model changes.
> > 
> > It's all the same steps you took with the ALSA merge, you're a
> > professional by now ;)
> 
> i did "bk undo -a1.2463" from a current -BK tree and it oopses:

Note that "bk undo -axxx" will _leave_ xxx in place, and undo everything 
after. 

So what you did still has the merge in the tree, and that it still oopses 
is thus to be expected. BUT, we're getting closer.

> next i wanted to do "bk undo -r1.2463" now to see if it does *not* break
> without this ChangeSet (because i already know it *breaks* with this
> ChangeSet) but that would leave some parentless child deltas. i read in
> the BK docs that "bk cset -x<version>" would help here. but "bk cset
> - -x1.2463" aborts:

"cset -x" only works on patches, not on complex operations. You still want 
"bk undo", but you want to use "bk revtool" to see what the merge point 
was, and tell _which_ of the merged top-of-trees you want to get to. 

In other words, you can't just undo a merge, you need to tell which _way_
to undo it. See? It does actually make sense, and "bk revtool" will show 
you the relationships of merges (at least if the time range is big enough 
to show enough info).

Anyway, if you have the top-of-tree-is-1.2463, then go to "bk revtool", 
and select that node in the graph by clicking on it. Notice how those 
edges turned white, and you can now easily see which children were 
pre-merge.

In this case, the top-of-tree _without_ the PCI merge is 1.2462:

	ChangeSet@1.2462, 2004-11-04 17:06:13-08:00, torvalds@ppc970.osdl.org
	  Merge bk://kernel.bkbits.net/gregkh/linux/usb-2.6
	  into ppc970.osdl.org:/home/torvalds/v2.6/linux

(you won't see it in "bk changes", since it's a trivial merge: use "bk 
changes -a" to see it). So just before I merged Greg's PCI changes, I 
merged his USB changes.

Now, that's fine - the USB merge is likely to be ok, so try doing

	bk undo -a1.2462

and you will now have a tree that is exactly the same as before, except it 
does _not_ have the PCI merge from Greg.

And if this one does not oops, you can now officially blame Greg.

Now, if you want to get _really_ fancy, you can now look at each changeset 
that differed, with something like

	bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' -

which is black magic that does a set operation and shows all the changes 
in between the sets of "bk at 1.2462" and "bk at 1.2463".

(This is _not_ the same as "bk changes -r1.2462..1.2463", because that one 
just shows the single merge change that is on the direct _path_ from one 
changeset to another. The black magic thing shows the set difference of 
changesets that comes from the full graph at two points).
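
The difference between the two views can be shown with a small Python sketch. The assumptions are loud ones: the graph below is invented (only 1.2462 and 1.2463 are real revision numbers), and the "direct path" is modelled as a walk along each changeset's first parent, which is a simplification for illustration and not a statement about BK's actual range semantics.

```python
# Hypothetical parent graph; the first parent is the tree merged *into*.
PARENTS = {
    "base": [],
    "1.2462": ["base"],
    "pci-a": ["base"],      # made-up changesets on the merged-in side
    "pci-b": ["pci-a"],
    "1.2463": ["1.2462", "pci-b"],
}


def reachable(rev):
    """All changesets contained in the tree at rev (rev plus ancestors)."""
    seen, stack = set(), [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def set_difference(old, new):
    """Changesets present at new but not at old."""
    return reachable(new) - reachable(old)


def direct_path(old, new):
    """Changesets met walking first parents from new back to old."""
    path, rev = [], new
    while rev != old:
        path.append(rev)
        if not PARENTS[rev]:
            raise ValueError(f"{old} is not on the first-parent chain of {new}")
        rev = PARENTS[rev][0]
    return path


print(direct_path("1.2462", "1.2463"))             # only the merge itself
print(sorted(set_difference("1.2462", "1.2463")))  # merge plus what it brought in
```

The path view sees one changeset, while the set view also lists the changesets the merge carried in, which are the ones worth reviewing individually.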

Then you can look at each change individually and see if they matter.

And once you can do the set operations, you're officially a BK poweruser.  
Me, I just have a script, I'm a BK dabbler.

Looking at the list (appended), I don't see anything obvious, but hey, if 
it was obvious it wouldn't have been merged in the first place. 

Thanks for your willingness to pursue this thing,

		Linus

-----
<maneesh@in.ibm.com>
	[PATCH] sysfs: fix sysfs backing store error path confusion
	
	o sysfs_new_dirent to retrun 0 if kmalloc fails. Thanks to Milton Miller
	  for spotting this.
	
	Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<bunk@stusta.de>
	[PATCH] small sysfs cleanups
	
	The patch below does the following cleanups for the sysfs code:
	- remove the unused global function sysfs_mknod
	- make some structs and functions static
	
	Please check whether this patch is correct, or whether some of the
	things I made static should be used globally in the forseeable future.
	
	
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<kay.sievers@vrfy.org>
	[PATCH] add the physical device and the bus to the hotplug environment
	
	Add the sysfs path of the physical device to the hotplug event of class
	and block devices. This should solve the userspace issue not to know if
	the device is a virtual one and the "device" symlink will never be created,
	but we sit there and wait for it to show up not knowing when we should
	give up.
	
	Also the bus name is added to the hotplug event, so we don't need to
	reverse lookup in the /sys/bus/* directory which bus our physical
	device belongs to. This is e.g. the value matched against the BUS= key,
	that may be used in an udev rule.
	
	This is a PCI network card:
	  ACTION=add
	  SUBSYSTEM=net
	  DEVPATH=/class/net/eth0
	  PHYSDEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:02:01.0
	  PHYSDEVBUS=pci
	  INTERFACE=eth0
	  SEQNUM=827
	  PATH=/sbin:/bin:/usr/sbin:/usr/bin
	  HOME=/
	
	This is a IDE CDROM:
	  ACTION=add
	  SUBSYSTEM=block
	  DEVPATH=/block/hdc
	  PHYSDEVPATH=/devices/pci0000:00/0000:00:1f.1/ide1/1.0
	  PHYSDEVBUS=ide
	  SEQNUM=1017
	  PATH=/sbin:/bin:/usr/sbin:/usr/bin
	  HOME=/
	
	This is an USB-stick partition:
	  ACTION=add
	  SUBSYSTEM=block
	  DEVPATH=/block/sda/sda1
	  PHYSDEVPATH=/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1:1.0/host1/target1:0:0/1:0:0:0
	  PHYSDEVBUS=scsi
	  SEQNUM=1032
	  PATH=/sbin:/bin:/usr/sbin:/usr/bin
	  HOME=/
	
	
	Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: comment fix in bus.c
	
	 df_01_driver_attach_comment_fix.patch
	
	bus_match() was renamed to driver_probe_device() but the comment for
	device_attach() wasn't updated.  This patch updates it.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: bus_recan_devices() locking fix
	
	 df_02_bus_rescan_devcies_fix.patch
	
	 bus_rescan_devices() eventually calls device_attach() and thus
	requires write locking the corresponding bus.  The original code just
	called bus_for_each_dev() which only read locks the bus.  This patch
	separates __bus_for_each_dev() and __bus_for_each_drv(), which don't
	do locking themselves, out from the original functions and call them
	with read lock in the original functions and with write lock in
	bus_rescan_devices().
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: sysfs_release() dangling pointer reference fix
	
	 df_03_sysfs_release_fix.patch
	
	Some attributes are allocated dynamically (e.g. module and device
	parameters) and are usually deallocated when the assoicated kobject is
	released.  So, it's not safe to access attr after putting the kobject.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: kobject_add() error path reference counting fix
	
	 df_04_kobject_add_ref_fix.patch
	
	In kobject_add(), @kobj wasn't put'd properly on error path.  This
	patch fixes it.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: device_add() error path reference counting fix
	
	 df_05_device_add_ref_fix.patch
	
	 In device_add(), @dev wan't put'd properly when it has zero length
	bus_id (error path).  Fixed.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<greg@kroah.com>
	kevent: fix build error if CONFIG_KOBJECT_UEVENT is not selected.
	
	Thanks to Serge Hallyn <serue@us.ibm.com> for pointing this out.
	
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<rml@novell.com>
	[PATCH] kobject_uevent: fix init ordering
	
	Looks like kobject_uevent_init is executed before netlink_proto_init and
	consequently always fails.  Not cool.
	
	Attached patch switches the initialization over from core_initcall (init
	level 1) to postcore_initcall (init level 2).  Netlink's initialization
	is done in core_initcall, so this should fix the problem.  We should be
	fine waiting until postcore_initcall.
	
	Also a couple white space changes mixed in, because I am anal.
	
	Signed-Off-By: Robert Love <rml@novell.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<rml@novell.com>
	[PATCH] kobject_uevent: add MAINTAINER entry
	
	Attached patch adds a MAINTAINER entry for the kernel event layer.
	
	
	Signed-Off-By: Robert Love <rml@novell.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<greg@kroah.com>
	Merge kroah.com:/home/greg/linux/BK/bleed-2.6
	into kroah.com:/home/greg/linux/BK/driver-2.6

<maneesh@in.ibm.com>
	[PATCH] fix kernel BUG at fs/sysfs/dir.c:20!
	
	On Thu, Nov 04, 2004 at 12:52:38PM -0800, Greg KH wrote:
	> Hi,
	>
	> I get the following BUG in the sysfs code when I do:
	> 	- plug in a usb-serial device.
	> 	- open the port with 'cat /dev/ttyUSB0'
	> 	- unplug the device.
	> 	- stop the 'cat' process with control-C
	>
	> This used to work just fine before your big sysfs changes.
	
	There is a similar problem reported by s390 people where we see parent
	kobject (directory) going away before child kobject (sub-directory). It
	seems kobject code is able to handle this, but not the sysfs. What could
	be happening that in sysfs_remove_dir() of parent directory, we try to
	remove its contents. It works well with the regular files as it is the
	final removal for sysfs_dirent corresponding to the files. But in case
	of sub-directory we are doing an extra sysfs_put().  Once while removing
	parent and the other one being the one from when sysfs_remove_dir() is
	called for the child.
	
	The following patch worked for the s390 people, I hope same will work in
	this case also.
	
	
	o Do not remove sysfs_dirents corresponding to the sub-directory in
	  sysfs_remove_dir(). They will be removed in the sysfs_remove_dir() call
	  for the specific sub-directory.
	
	Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<torvalds@ppc970.osdl.org>
	Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
	into ppc970.osdl.org:/home/torvalds/v2.6/linux


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
@ 2004-11-08 18:13                           ` Linus Torvalds
  0 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-08 18:13 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, alsa-devel, linux-sound, Greg KH



On Mon, 8 Nov 2004, Christian Kujau wrote:
> 
> > Anyway, now that the _other_ driver also oopses, and with a very similar 
> > oops too, so it looks like they both depended on some undocumented (or 
> > changed) detail in the PCI layer. Next step would be to see if the thing 
> > that breaks is this merge:
> 
> may i ask how you come to this conclusion? by technical knowledge or could
> this be deduced by some bk magic too?

No, just gut feel. If the pre-merge ALSA works, and the post-merge one 
doesn't, and the oops in both cases happen somewhere close to where it 
does "pci_enable_device()", there's not a lot left. There are interrupts, 
and there is the PCI layer...

> > 	ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
> > 	  Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
> > 	  into ppc970.osdl.org:/home/torvalds/v2.6/linux
> > 
> > which merges Greg's PCI/driver model changes.
> > 
> > It's all the same steps you took with the ALSA merge, you're a
> > professional by now ;)
> 
> i did "bk undo -a1.2463" from a current -BK tree and it oopses:

Note that "bk undo -axxx" will _leave_ xxx in place, and undo everything 
after. 

So what you did still has the merge in the tree, and that it still oopses 
is thus to be expected. BUT, we're getting closer.

> next i wanted to do "bk undo -r1.2463" now to see if it does *not* break
> without this ChangeSet (because i already know it *breaks* with this
> ChangeSet) but that would leave some parentless child deltas. i read in
> the BK docs that "bk cset -x<version>" would help here. but "bk cset
> - -x1.2463" aborts:

"cset -x" only works on patches, not on complex operations. You still want 
"bk undo", but you want to use "bk revtool" to see what the merge point 
was, and tell _which_ of the merged top-of-trees you want to get to. 

In other words, you can't just undo a merge, you need to tell which _way_
to undo it. See? It does actually make sense, and "bk revtool" will show 
you the relationships of merges (at least if the time range is big enough 
to show enough info).

Anyway, if you have the top-of-tree-is-1.2463, then go to "bk revtool", 
and select that node in the graph by clicking on it. Notice how those 
edges turned white, and you can now easily see which children were 
pre-merge.

In this case, the top-of-tree _without_ the PCI merge is 1.2462:

	ChangeSet@1.2462, 2004-11-04 17:06:13-08:00, torvalds@ppc970.osdl.org
	  Merge bk://kernel.bkbits.net/gregkh/linux/usb-2.6
	  into ppc970.osdl.org:/home/torvalds/v2.6/linux

(you won't see it in "bk changes", since it's a trivial merge: use "bk 
changes -a" to see it). So just before I merged Greg's PCI changes, I 
merged his USB changes.

Now, that's fine - the USB merge is likely to be ok, so try doing

	bk undo -a1.2462

and you will now have a tree that is exactly the same as before, except it 
does _not_ have the PCI merge from Greg.

And if this one does not oops, you can now officially blame Greg.

Now, if you want to get _really_ fancy, you can now look at each changeset 
that differed, with something like

	bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' -

which is black magic that does a set operation and shows all the changes 
in between the sets of "bk at 1.2462" and "bk at 1.2463".

(This is _not_ the same as "bk changes -r1.2462..1.2463", because that one 
just shows the single merge change that is on the direct _path_ from one 
changeset to another. The black magic thing shows the set difference of 
changesets that comes from the full graph at two points).

Then you can look at each change individually and see if they matter.

And once you can do the set operations, you're officially a BK poweruser.  
Me, I just have a script, I'm a BK dabbler.

Looking at the list (appended), I don't see anything obvious, but hey, if 
it was obvious it wouldn't have been merged in the first place. 

Thanks for your willingness to pursue this thing,

		Linus

-----
<maneesh@in.ibm.com>
	[PATCH] sysfs: fix sysfs backing store error path confusion
	
	o sysfs_new_dirent to retrun 0 if kmalloc fails. Thanks to Milton Miller
	  for spotting this.
	
	Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<bunk@stusta.de>
	[PATCH] small sysfs cleanups
	
	The patch below does the following cleanups for the sysfs code:
	- remove the unused global function sysfs_mknod
	- make some structs and functions static
	
	Please check whether this patch is correct, or whether some of the
	things I made static should be used globally in the forseeable future.
	
	
	Signed-off-by: Adrian Bunk <bunk@stusta.de>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<kay.sievers@vrfy.org>
	[PATCH] add the physical device and the bus to the hotplug environment
	
	Add the sysfs path of the physical device to the hotplug event of class
	and block devices. This should solve the userspace issue not to know if
	the device is a virtual one and the "device" symlink will never be created,
	but we sit there and wait for it to show up not knowing when we should
	give up.
	
	Also the bus name is added to the hotplug event, so we don't need to
	reverse lookup in the /sys/bus/* directory which bus our physical
	device belongs to. This is e.g. the value matched against the BUS= key,
	that may be used in an udev rule.
	
	This is a PCI network card:
	  ACTION=add
	  SUBSYSTEM=net
	  DEVPATH=/class/net/eth0
	  PHYSDEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:02:01.0
	  PHYSDEVBUS=pci
	  INTERFACE=eth0
	  SEQNUM=827
	  PATH=/sbin:/bin:/usr/sbin:/usr/bin
	  HOME=/
	
	This is a IDE CDROM:
	  ACTION=add
	  SUBSYSTEM=block
	  DEVPATH=/block/hdc
	  PHYSDEVPATH=/devices/pci0000:00/0000:00:1f.1/ide1/1.0
	  PHYSDEVBUS=ide
	  SEQNUM=1017
	  PATH=/sbin:/bin:/usr/sbin:/usr/bin
	  HOME=/
	
	This is an USB-stick partition:
	  ACTION=add
	  SUBSYSTEM=block
	  DEVPATH=/block/sda/sda1
	  PHYSDEVPATH=/devices/pci0000:00/0000:00:1d.1/usb3/3-1/3-1:1.0/host1/target1:0:0/1:0:0:0
	  PHYSDEVBUS=scsi
	  SEQNUM=1032
	  PATH=/sbin:/bin:/usr/sbin:/usr/bin
	  HOME=/
	
	
	Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: comment fix in bus.c
	
	 df_01_driver_attach_comment_fix.patch
	
	bus_match() was renamed to driver_probe_device() but the comment for
	device_attach() wasn't updated.  This patch updates it.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: bus_rescan_devices() locking fix
	
	 df_02_bus_rescan_devcies_fix.patch
	
	bus_rescan_devices() eventually calls device_attach() and thus
	requires write locking the corresponding bus.  The original code just
	called bus_for_each_dev(), which only read locks the bus.  This patch
	separates __bus_for_each_dev() and __bus_for_each_drv(), which don't
	do locking themselves, out from the original functions and calls them
	with the read lock held in the original functions and with the write
	lock held in bus_rescan_devices().
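
The split described above (an unlocked helper plus thin wrappers that pick the lock mode) can be modelled in userspace roughly as follows. This is a Python toy, not the kernel code: the Bus class, the mode attribute that records whether the lock is held for reading or writing, and the check inside device_attach() are all assumptions made for illustration. A plain threading.Lock stands in for the bus read/write semaphore.

```python
import threading


class Bus:
    """Toy model of a bus with an unlocked iterator and locked wrappers."""

    def __init__(self, devices):
        self.devices = list(devices)
        self._lock = threading.Lock()  # stand-in for the bus rwsem
        self.mode = None               # "read", "write" or None

    def _for_each_dev(self, fn):
        # Does no locking itself; the caller must already hold the lock.
        assert self.mode is not None, "caller must hold the bus lock"
        for dev in self.devices:
            fn(self, dev)

    def for_each_dev(self, fn):
        # Public iterator: a read lock is enough for read-only callbacks.
        with self._lock:
            self.mode = "read"
            try:
                self._for_each_dev(fn)
            finally:
                self.mode = None

    def rescan_devices(self, attach):
        # Attaching modifies bus state, so hold the lock for writing.
        with self._lock:
            self.mode = "write"
            try:
                self._for_each_dev(attach)
            finally:
                self.mode = None


def device_attach(bus, dev):
    # Hypothetical attach step that insists on the write lock.
    if bus.mode != "write":
        raise RuntimeError("device_attach() needs the bus write lock")
    dev["attached"] = True


bus = Bus([{"name": "0000:00:0a.0", "attached": False}])
bus.rescan_devices(device_attach)
```

Calling bus.for_each_dev(device_attach) instead raises RuntimeError in this model, which mirrors the bug: the read-locking iterator is the wrong entry point for a callback that attaches devices.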
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: sysfs_release() dangling pointer reference fix
	
	 df_03_sysfs_release_fix.patch
	
	Some attributes are allocated dynamically (e.g. module and device
	parameters) and are usually deallocated when the associated kobject is
	released.  So, it's not safe to access attr after putting the kobject.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: kobject_add() error path reference counting fix
	
	 df_04_kobject_add_ref_fix.patch
	
	In kobject_add(), @kobj wasn't put'd properly on error path.  This
	patch fixes it.
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<tj@home-tj.org>
	[PATCH] driver-model: device_add() error path reference counting fix
	
	 df_05_device_add_ref_fix.patch
	
	In device_add(), @dev wasn't put'd properly when it has zero length
	bus_id (error path).  Fixed.
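
The pattern behind this kind of fix is that every reference taken at the top of an "add" function must be dropped again on each error return. The Python model below is a sketch of that rule only; KObj, kobj_add() and the registry dictionary are hypothetical names, and the two error cases (empty name, duplicate name) are chosen for illustration rather than copied from the kernel.

```python
import errno


class KObj:
    """Toy reference-counted object (not the kernel's struct kobject)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1  # reference held by the creator

    def get(self):
        self.refcount += 1
        return self

    def put(self):
        self.refcount -= 1


def kobj_add(kobj, registry):
    """Register kobj; return 0 on success or a negative errno value."""
    kobj.get()  # reference that the registry will own on success
    if not kobj.name:
        kobj.put()  # error path: drop the reference taken above
        return -errno.EINVAL
    if kobj.name in registry:
        kobj.put()  # error path: drop the reference taken above
        return -errno.EEXIST
    registry[kobj.name] = kobj
    return 0


registry = {}
ok = KObj("snd")
unnamed = KObj("")
print(kobj_add(ok, registry), ok.refcount)
print(kobj_add(unnamed, registry), unnamed.refcount)
```

On failure the caller's object comes back with the same reference count it had before the call, so the caller's own put() can still release it.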
	
	
	Signed-off-by: Tejun Heo <tj@home-tj.org>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<greg@kroah.com>
	kevent: fix build error if CONFIG_KOBJECT_UEVENT is not selected.
	
	Thanks to Serge Hallyn <serue@us.ibm.com> for pointing this out.
	
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<rml@novell.com>
	[PATCH] kobject_uevent: fix init ordering
	
	Looks like kobject_uevent_init is executed before netlink_proto_init and
	consequently always fails.  Not cool.
	
	Attached patch switches the initialization over from core_initcall (init
	level 1) to postcore_initcall (init level 2).  Netlink's initialization
	is done in core_initcall, so this should fix the problem.  We should be
	fine waiting until postcore_initcall.
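
The ordering argument can be checked with a small simulation. The sketch below is a Python model under stated assumptions: run_initcalls(), the state dictionary and the -1 failure value are invented for the example, and the order of initcalls within one level is simply the order of the list. Only the level numbers (core_initcall = 1, postcore_initcall = 2) come from the text above.

```python
# Levels as given in the changelog entry.
CORE, POSTCORE = 1, 2


def run_initcalls(initcalls):
    """Run (level, name, fn) tuples in level order and log the results.

    sorted() is stable, so initcalls of the same level keep their list
    order; that stands in for link order in this sketch.
    """
    state = {"netlink_ready": False}
    log = []
    for level, name, fn in sorted(initcalls, key=lambda c: c[0]):
        log.append((name, fn(state)))
    return log


def netlink_proto_init(state):
    state["netlink_ready"] = True
    return 0


def kobject_uevent_init(state):
    # Creating the socket only works once netlink is initialised.
    return 0 if state["netlink_ready"] else -1


# Both at level 1, kobject_uevent_init happening to come first.
before = run_initcalls([
    (CORE, "kobject_uevent_init", kobject_uevent_init),
    (CORE, "netlink_proto_init", netlink_proto_init),
])
# After the patch: kobject_uevent_init moved to level 2.
after = run_initcalls([
    (POSTCORE, "kobject_uevent_init", kobject_uevent_init),
    (CORE, "netlink_proto_init", netlink_proto_init),
])
print(before)
print(after)
```

Moving the dependent initcall one level later removes the dependence on ordering within a level, which is the design choice the patch makes.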
	
	Also a couple white space changes mixed in, because I am anal.
	
	Signed-Off-By: Robert Love <rml@novell.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<rml@novell.com>
	[PATCH] kobject_uevent: add MAINTAINER entry
	
	Attached patch adds a MAINTAINER entry for the kernel event layer.
	
	
	Signed-Off-By: Robert Love <rml@novell.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<greg@kroah.com>
	Merge kroah.com:/home/greg/linux/BK/bleed-2.6
	into kroah.com:/home/greg/linux/BK/driver-2.6

<maneesh@in.ibm.com>
	[PATCH] fix kernel BUG at fs/sysfs/dir.c:20!
	
	On Thu, Nov 04, 2004 at 12:52:38PM -0800, Greg KH wrote:
	> Hi,
	>
	> I get the following BUG in the sysfs code when I do:
	> 	- plug in a usb-serial device.
	> 	- open the port with 'cat /dev/ttyUSB0'
	> 	- unplug the device.
	> 	- stop the 'cat' process with control-C
	>
	> This used to work just fine before your big sysfs changes.
	
	There is a similar problem reported by s390 people where we see the parent
	kobject (directory) going away before the child kobject (sub-directory). It
	seems the kobject code is able to handle this, but sysfs is not. What could
	be happening is that in sysfs_remove_dir() of the parent directory, we try
	to remove its contents. This works well for regular files, as it is the
	final removal of the sysfs_dirent corresponding to each file. But in the
	case of a sub-directory we do an extra sysfs_put(): once while removing
	the parent, and once more when sysfs_remove_dir() is called for the
	child.
	
	The following patch worked for the s390 people; I hope the same will work
	in this case too.
	
	
	o Do not remove sysfs_dirents corresponding to the sub-directory in
	  sysfs_remove_dir(). They will be removed in the sysfs_remove_dir() call
	  for the specific sub-directory.
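
The reference-count reasoning can be modelled in a few lines of Python. This is a toy, not the sysfs code: Dirent, remove_dir() and the entry names are hypothetical, and a put on an entry whose count is already zero raises an error to stand in for the kernel BUG.

```python
class Dirent:
    """Toy stand-in for a sysfs_dirent with a reference count."""

    def __init__(self, name, is_dir=False):
        self.name = name
        self.is_dir = is_dir
        self.refcount = 1
        self.children = []

    def put(self):
        if self.refcount == 0:
            raise RuntimeError("extra put on " + self.name)
        self.refcount -= 1


def remove_dir(d):
    # Drop regular files here; leave sub-directories for their own
    # remove_dir() call so that each dirent is put exactly once.
    for child in list(d.children):
        if not child.is_dir:
            child.put()
            d.children.remove(child)
    d.put()


parent = Dirent("ttyUSB0", is_dir=True)   # illustrative names only
attr = Dirent("dev")
sub = Dirent("power", is_dir=True)
parent.children = [attr, sub]

remove_dir(parent)   # parent goes away first
remove_dir(sub)      # child directory removed later, no extra put
```

If remove_dir() also put the sub-directory entries, the second call would perform one put too many and raise in this model, which corresponds to the double sysfs_put() described above.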
	
	Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
	Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

<torvalds@ppc970.osdl.org>
	Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
	into ppc970.osdl.org:/home/torvalds/v2.6/linux


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 13:01                         ` Christian Kujau
@ 2004-11-08 18:44                           ` Pekka Enberg
  -1 siblings, 0 replies; 62+ messages in thread
From: Pekka Enberg @ 2004-11-08 18:44 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound,
	Greg KH, penberg

Hi Christian,

On Mon, 08 Nov 2004 14:01:39 +0100, Christian Kujau <evil@g-house.de> wrote:
> i've put everthing on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/
> the .configs, the oopses are there. i've double checked a kernel built
> from "bk -a a1.2000.7.2" yesterday but the result was the same (no oops)

Just to update, I cannot reproduce the oops with your config (nor
mine) on my machine running 2.6.10-rc1-bk14.

                       Pekka

0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365
[KT133/KM133] (rev 03)
        Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard
        Flags: bus master, medium devsel, latency 8
        Memory at e7000000 (32-bit, prefetchable)
        Capabilities: [a0] AGP version 2.0
        Capabilities: [c0] Power Management version 2

0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365
[KT133/KM133 AGP] (prog-if 00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 0000d000-0000dfff
        Memory behind bridge: d7000000-d7efffff
        Prefetchable memory behind bridge: d7f00000-e6ffffff
        Expansion ROM at 0000d000 [disabled] [size=4K]
        Capabilities: [80] Power Management version 2

0000:00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super
South] (rev 40)
        Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard
        Flags: bus master, stepping, medium devsel, latency 0
        Capabilities: [c0] Power Management version 2

0000:00:04.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
(prog-if
 8a [Master SecP PriP])
        Flags: bus master, medium devsel, latency 32
        I/O ports at b800 [size=16]
        Capabilities: [c0] Power Management version 2

0000:00:04.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB
1.1 Controller (rev 16) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
        Flags: bus master, medium devsel, latency 32, IRQ 10
        I/O ports at b400 [size=32]
        Capabilities: [80] Power Management version 2

0000:00:04.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB
1.1 Controller (rev 16) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
        Flags: bus master, medium devsel, latency 32, IRQ 10
        I/O ports at b000 [size=32]
        Capabilities: [80] Power Management version 2

0000:00:04.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super
ACPI] (rev 40)
        Subsystem: ASUSTeK Computer Inc. A7V133/A7V133-C Mainboard
        Flags: medium devsel, IRQ 9
        Capabilities: [68] Power Management version 2

0000:00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Flags: bus master, medium devsel, latency 32, IRQ 10
        I/O ports at 9400
        Memory at d6800000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2

0000:00:0a.0 Multimedia audio controller: Ensoniq 5880 AudioPCI (rev 04)
        Subsystem: Ensoniq Sound Blaster 16PCI 4.1ch
        Flags: bus master, slow devsel, latency 32, IRQ 11
        I/O ports at 9000
        Capabilities: [dc] Power Management version 2

0000:00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Flags: bus master, medium devsel, latency 32, IRQ 10
        I/O ports at 8800
        Memory at d6000000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2

0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon
RV100 QY [Radeon 7000/VE] (prog-if 00 [VGA])
        Subsystem: Hightech Information System Ltd.: Unknown device 0f02
        Flags: bus master, stepping, 66Mhz, medium devsel, latency 64
        Memory at d8000000 (32-bit, prefetchable) [size=d7fe0000]
        I/O ports at d800 [size=256]
        Memory at d7000000 (32-bit, non-prefetchable) [size=64K]
        Expansion ROM at 00020000 [disabled]
        Capabilities: [58] AGP version 2.0
        Capabilities: [50] Power Management version 2



Linux version 2.6.10-rc1-bk14 (root@cherry) (gcc version 3.4.2 (Gentoo
Linux 3.4.2-r2, ssp-3.4.1-1, pie-8.7.6.5)) #8 Mon Nov 8 20:18:45 EET
2004
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003ffec000 (usable)
 BIOS-e820: 000000003ffec000 - 000000003ffef000 (ACPI data)
 BIOS-e820: 000000003ffef000 - 000000003ffff000 (reserved)
 BIOS-e820: 000000003ffff000 - 0000000040000000 (ACPI NVS)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
On node 0 totalpages: 262124
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 225280 pages, LIFO batch:16
  HighMem zone: 32748 pages, LIFO batch:7
DMI 2.3 present.
ACPI: RSDP (v000 ASUS                                  ) @ 0x000f6a80
ACPI: RSDT (v001 ASUS   A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec000
ACPI: FADT (v001 ASUS   A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec080
ACPI: BOOT (v001 ASUS   A7V133-C 0x30303031 MSFT 0x31313031) @ 0x3ffec040
ACPI: DSDT (v001   ASUS A7V133-C 0x00001000 MSFT 0x0100000b) @ 0x00000000
ACPI: PM-Timer IO Port: 0xe408
Built 1 zonelists
Kernel command line: root=/dev/ram0 init=/linuxrc real_root=/dev/hda3 acpi=force
No local APIC present or hardware disabled
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 1009.328 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1034128k/1048496k available (2582k kernel code, 13664k
reserved, 770k data, 148k init, 130992k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 1998.84 BogoMIPS (lpj=999424)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0383f9ff c1c7f9ff 00000000 00000000
CPU: After vendor identify, caps:  0383f9ff c1c7f9ff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 64K (64 bytes/line)
CPU: After all inits, caps:        0383f9ff c1c7f9ff 00000000 00000020
CPU: AMD Duron(tm) Processor stepping 00
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
ACPI: IRQ9 SCI: Edge set to Level Trigger.
checking if image is initramfs...it isn't (no cpio magic); looks like an initrd
Freeing initrd memory: 885k freed
kobject_uevent: unable to create netlink socket!
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xf1180, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
PCI: Using ACPI for IRQ routing
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:04.2[D] -> GSI 10 (level, low) -> IRQ 10
ACPI: PCI interrupt 0000:00:04.3[D] -> GSI 10 (level, low) -> IRQ 10
ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11
ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 10 (level, low) -> IRQ 10
Simple Boot Flag at 0x3a set to 0x1
highmem bounce pool size: 64 pages
devfs: 2004-01-31 Richard Gooch (rgooch@atnf.csiro.au)
devfs: boot_options: 0x0
SGI XFS with ACLs, realtime, no debug enabled
SGI XFS Quota Management subsystem
Applying VIA southbridge workaround.
PCI: Disabling Via external APIC routing
Real Time Clock Driver v1.12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Equalizer2002: Simon Janes (simon@ncm.com) and David S. Miller
(davem@redhat.com)
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:04.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci0000:00:04.1
    ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
hda: Maxtor 4D060H3, ATA DISK drive
elevator: using anticipatory as default io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: Hewlett-Packard CD-Writer Plus 8200a, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
hda: 120069936 sectors (61475 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
hda: cache flushes not supported
 /dev/ide/host0/bus0/target0/lun0: p1 p2 p3
hdc: ATAPI 32X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImPS/2 Logitech Wheel Mouse on isa0060/serio1
NET: Registered protocol family 2
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
NET: Registered protocol family 1
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
ACPI: (supports S0 S1 S4 S5)
ACPI wakeup devices:
PWRB PCI0 UAR1 UAR2 USB0 USB1
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 148k freed
usbcore: registered new driver usbfs
usbcore: registered new driver hub
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
SCSI subsystem initialized
Initializing USB Mass Storage driver...
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ReiserFS: hda3: warning: sh-2021: reiserfs_fill_super: can not find
reiserfs on hda3
kjournald starting.  Commit interval 5 seconds
EXT3 FS on hda3, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 2040244k swap on /dev/hda2.  Priority:-1 extents:1
EXT3 FS on hda3, internal journal
8139too Fast Ethernet driver 0.9.27
PCI: Enabling device 0000:00:09.0 (0004 -> 0007)
ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 10 (level, low) -> IRQ 10
eth0: RealTek RTL8139 at 0xf8814000, 00:06:4f:01:66:57, IRQ 10
eth0:  Identified 8139 chip type 'RTL-8139C'
PCI: Enabling device 0000:00:0d.0 (0004 -> 0007)
ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 10 (level, low) -> IRQ 10
eth1: RealTek RTL8139 at 0xf8816000, 00:06:4f:01:66:58, IRQ 10
eth1:  Identified 8139 chip type 'RTL-8139C'
eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
[drm] Initialized radeon 1.11.0 20020828 on minor 0: ATI Technologies
Inc Radeon RV100 QY [Radeon 7000/VE]
[drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held
[drm:radeon_unlock] *ERROR* Process 6283 using kernel context 0
inserting floppy driver for 2.6.10-rc1-bk14
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
PCI: Enabling device 0000:00:0a.0 (0004 -> 0005)
ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 18:44                           ` Pekka Enberg
@ 2004-11-08 19:00                             ` Greg KH
  -1 siblings, 0 replies; 62+ messages in thread
From: Greg KH @ 2004-11-08 19:00 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christian Kujau, Kernel Mailing List, Linus Torvalds, alsa-devel,
	linux-sound, penberg

On Mon, Nov 08, 2004 at 08:44:37PM +0200, Pekka Enberg wrote:
> Hi Christian,
> 
> On Mon, 08 Nov 2004 14:01:39 +0100, Christian Kujau <evil@g-house.de> wrote:
> > i've put everthing on http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/
> > the .configs, the oopses are there. i've double checked a kernel built
> > from "bk -a a1.2000.7.2" yesterday but the result was the same (no oops)
> 
> Just to update, I cannot reproduce the oops with your config (nor
> mine) on my machine running 2.6.10-rc1-bk14.

But 2.6.10-rc1-bk15 does have the problem?

Trying to figure out where the issue is...

greg k-h

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 19:00                             ` Greg KH
@ 2004-11-08 19:18                               ` Pekka Enberg
  -1 siblings, 0 replies; 62+ messages in thread
From: Pekka Enberg @ 2004-11-08 19:18 UTC (permalink / raw)
  To: Greg KH
  Cc: Christian Kujau, Kernel Mailing List, Linus Torvalds, alsa-devel,
	linux-sound, penberg

Hi,

On Mon, 8 Nov 2004 11:00:40 -0800, Greg KH <greg@kroah.com> wrote:
> But 2.6.10-rc1-bk15 does have the problem?
> 
> Trying to figure out where the issue is...

No, -bk14 is just the kernel I am running right now (I haven't tried
-bk15) and I haven't had the problem. I cannot reproduce the oops _at
all_ which is why I suspect it's his hardware. I included my lspci and
dmesg output because we have similar (but not exactly the same)
setups.

FWIW, I've asked Christian for an objdump of the kernel to see if I can
track down where it oopses at because I cannot find anything in the
code. I suspected pcibios_enable_irq  (which is a function pointer)
might be wrong but looking at his logs, I don't think we get that far.

                          Pekka

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 19:18                               ` Pekka Enberg
@ 2004-11-08 19:30                                 ` Pekka Enberg
  -1 siblings, 0 replies; 62+ messages in thread
From: Pekka Enberg @ 2004-11-08 19:30 UTC (permalink / raw)
  To: Greg KH
  Cc: Christian Kujau, Kernel Mailing List, Linus Torvalds, alsa-devel,
	linux-sound, penberg

On Mon, 8 Nov 2004 11:00:40 -0800, Greg KH <greg@kroah.com> wrote:
> > But 2.6.10-rc1-bk15 does have the problem?
> >
> > Trying to figure out where the issue is...

On Mon, 8 Nov 2004 21:18:09 +0200, Pekka Enberg <penberg@gmail.com> wrote: 
> No, -bk14 is just the kernel I am running right now (I haven't tried
> -bk15) and I haven't had the problem.

Sorry for not being clear, any kernel after 2.6.10-rc1 oopses
according to Christian which is why I haven't bothered to test
anything else except -bk14.

                           Pekka

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 19:18                               ` Pekka Enberg
@ 2004-11-08 20:31                                 ` Christian Kujau
  -1 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-08 20:31 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Greg KH, Kernel Mailing List, Linus Torvalds, alsa-devel, linux-sound

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Pekka Enberg schrieb:
> Hi,
> 
> On Mon, 8 Nov 2004 11:00:40 -0800, Greg KH <greg@kroah.com> wrote:
> 
>>But 2.6.10-rc1-bk15 does have the problem?
>>
>>Trying to figure out where the issue is...

i could use the -bk snapshots too, but since i am using bk myself (i try),
i think we can narrow it down a bit more.

> 
> No, -bk14 is just the kernel I am running right now (I haven't tried
> -bk15) and I haven't had the problem. I cannot reproduce the oops _at
> all_ which is why I suspect it's his hardware. I included my lspci and
> dmesg output because we have similar (but not exactly the same)
> setups.

i've put an lspci output here:
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-v.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-vv.txt

i do not suspect hw problems *yet*, because kernels up to 2.6.9 (tracking
bk) do not show this behaviour.

> FWIW, I've asked Christian for an objdump of the kernel to see if I can

will show up in a couple of minutes here:
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/objdump-d_a1.2463.txt.bz2

this is from the vmlinux from a "bk undo -a1.2463" kernel, IOW it still
contains:

ChangeSet@1.2463, 2004-11-04 17:07:16-08:00, torvalds@ppc970.osdl.org
  Merge bk://kernel.bkbits.net/gregkh/linux/driver-2.6
  into ppc970.osdl.org:/home/torvalds/v2.6/linux


thank you for the hints,
Christian.

PS: should i un-CC linux-sound and alsa-devel, now that we are sure it's a
pci thing?
- --
BOFH excuse #228:

That function is not currently supported, but Bill Gates assures us it
will be featured in the next upgrade.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBj9e9+A7rjkF8z0wRAregAJ9TyK5Mt00CFmCcgA1pOKmzvIxv2QCg0OBi
/9eNZ41Kp2GAOg4J5l0QR8E=
=OkFI
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 18:13                           ` Linus Torvalds
@ 2004-11-08 20:59                             ` Christian Kujau
  -1 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-08 20:59 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Linus Torvalds, alsa-devel, linux-sound, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds schrieb:
>
> No, just gut feel. If the pre-merge ALSA works, and the post-merge one 
> doesn't, and the oops in both cases happen somewhere close to where it 
> does "pci_enable_device()", there's not a lot left. There are interrupts, 
> and there is the PCI layer...

yes, makes sense.

>>
>>i did "bk undo -a1.2463" from a current -BK tree and it oopses:
> 
> Note that "bk undo -axxx" will _leave_ xxx in place, and undo everything 
> after. 
> 
> So what you did still has the merge in the tree, and that it still oopses 
> is thus to be expected. BUT, we're getting closer.

yes, i think i understood that. that's why i wanted to revert 1.2463 too.

[...]

> 
> Now, that's fine - the USB merge is likely to be ok, so try doing
> 
> 	bk undo -a1.2462

for now i appreciate your work here but i have to postpone the "bk
revtool" stuff because i have no X _and_ bk here. (but i'm a good student
and will do my homework)

> and you will now have a tree that is exactly the same as before, except it 
> does _not_ have the PCI merge from Greg.
> 
> And if this one does not oops, you can now officially blame Greg.

i can't wait... ;)

> Now, if you want to get _really_ fancy, you can now look at each changeset 
> that differed, with something like
> 
> 	bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' -
> 
> which is black magic that does a set operation and shows all the changes 
> in between the sets of "bk at 1.2462" and "bk at 1.2463".
> 
> (This is _not_ the same as "bk changes -r1.2462..1.2463", because that one 
> just shows the single merge change that is on the direct _path_ from one 
> changeset to another. The black magic thing shows the set difference of 
> changesets that comes from the full graph at two points).
> 
> Then you can look at each change individually and see if they matter.

will do, after the build

> 
> And once you can do the set operations, you're officially a BK poweruser.  
> Me, I just have a script, I'm a BK dabbler.
> 
> Looking at the list (appended), I don't see anything obvious, but hey, if 
> it was obvious it wouldn't have been merged in the first place. 
> 
> Thanks for your willingness to pursue this thing,

hey, thanks to you and to the folks in the Cc: field for chasing a bug which
only _i_ have encountered so far.

/me is building now....
thanks,
Christian.
- --
BOFH excuse #111:

The salesman drove over the CPU board.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBj94f+A7rjkF8z0wRAm/uAJ0eTBa20JnX+250GpFiSED4b+arQwCggSgo
CO/MQ+1jeOOvb7WaJRKg7uY=
=Qlt1
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 20:59                             ` Christian Kujau
  (?)
@ 2004-11-08 23:49                             ` Christian Kujau
  2004-11-09  1:05                               ` Linus Torvalds
  2004-11-09  1:31                               ` Christian Kujau
  -1 siblings, 2 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-08 23:49 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Linus Torvalds, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>>>Now, that's fine - the USB merge is likely to be ok, so try doing
>>>
>>>	bk undo -a1.2462

i did so, 1.2463 went away, building as usual - but the oops persists :(

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-a1.2462.txt

> 
> for now i appreciate your work here but i have to postpone the "bk
> revtool" stuff because i have no X _and_ bk here. (but i'm a good student
> and will do my homework)

...in progress...



>>>
>>>	bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' -
>>>
>>>which is black magic that does a set operation and shows all the changes 
>>>in between the sets of "bk at 1.2462" and "bk at 1.2463".

hm, i guess this has to wait now.

>>>Looking at the list (appended), I don't see anything obvious, but hey, if 
>>>it was obvious it wouldn't have been merged in the first place. 

yes, i'll look for changes regarding PCI. i've started to compile the -bk
snapshots too. there i can do less wrong things. when i have the "bad" -bk
snapshot i'll use "bk" itself again to find the detailed change leading to
the oops.

i hope to get another machine with another es1371 tomorrow and see if
the error is reproducible.

thanks,
Christian.

PS: i've taken linux-sound and alsa-devel from CC.
- --
BOFH excuse #74:

You're out of memory
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkAXx+A7rjkF8z0wRAttsAJ9sOI7FVw+Lx8rBYHusHILQvIkeJACfZWDX
zMY4MtVYCCxU3y0Tb/muG5Y=
=CBO/
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 23:49                             ` Christian Kujau
@ 2004-11-09  1:05                               ` Linus Torvalds
  2004-11-09  1:41                                 ` Christian Kujau
  2004-11-09  1:31                               ` Christian Kujau
  1 sibling, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2004-11-09  1:05 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, Greg KH



On Tue, 9 Nov 2004, Christian Kujau wrote:
> 
> >>>Looking at the list (appended), I don't see anything obvious, but hey, if 
> >>>it was obvious it wouldn't have been merged in the first place. 
> 
> yes, i'll look for changes regarding PCI. i've started to compile the -bk
> snapshots too. there i can do less wrong things. when i have the "bad" -bk
> snapshot i'll use "bk" itself again to find the detailed change leading to
> the oops.

Actually, looking a bit closer, I think the PCI merge we just looked at 
was the PCI merge that happened _after_ 2.6.10-rc1. And since 2.6.10-rc1 
already oopsed for you, it shouldn't be an issue.

I think the _real_ PCI merge we should have looked at is:

	ChangeSet@1.2000.1.7, 2004-10-19 16:59:19-07:00, torvalds@ppc970.osdl.org
	  Merge PCI updates

and in particular, that merged the PCI changes from

	ChangeSet@1.1988.2.81, 2004-10-19 14:48:04-07:00, greg@kroah.com
	  PCI: fix up pci_save/restore_state in via-agp due to api change.
	  
	  Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

with my pre-PCI-merge tree at:

	ChangeSet@1.2000.1.6, 2004-10-19 15:06:19-07:00, torvalds@ppc970.osdl.org
	  Merge bk://bart.bkbits.net/ide-2.6
	  into ppc970.osdl.org:/home/torvalds/v2.6/linux

(all of these revision numbers are relative to a pristine 2.6.10-rc1 
tree: remember that they change with merges, so they may not be the same 
in your tree. "bk changes -a" is your friend).

So what I'd like you to do is to take the pre-PCI-merge tree, and see if 
that works for you

	# assuming a 2.6.10-rc1 tree
	bk undo -a1.2000.1.6

and if that works, then try the post-PCI-merge tree:

	# assuming a 2.6.10-rc1 tree
	bk undo -a1.2000.1.7

(I just checked: the above numbers are actually valid even in the current
-bk tree, so you don't have to first go to 2.6.10-rc1, you can just start 
from a current tree)

Thanks for testing, and sorry for the confusion with the more recent PCI 
merge.

		Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-08 23:49                             ` Christian Kujau
  2004-11-09  1:05                               ` Linus Torvalds
@ 2004-11-09  1:31                               ` Christian Kujau
  2004-11-09  7:40                                 ` Pekka Enberg
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-09  1:31 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Linus Torvalds, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

ok, i've done some other things here and built kernels from
2.6.10-rc1-bk13 and all were giving the oops:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1-bk13
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-2.6.10-rc1-bk13.txt

the config is the same config i am usually using, never gave me a
headache, new options (due to new kernel version) were left to default in
most cases. anyway - i've pulled again a recent tree, did
"bk undo -a1.2463" again but this time i stripped down my .config (via
menuconfig) to the absolute necessary things:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_a1.2463_take2

...and  it did *NOT* oops:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops-2.6.10-rc1_a1.2463.txt

i'll investigate further, building former -bk snapshots, using other
configs before i'll fiddle around with bk again (to get the smaller
changes). but this is a tomorrow thing, real life calls in :(

Thank you all so far,
Christian.
- --
BOFH excuse #92:

Stale file handle (next time use Tupperware(tm)!)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkB3v+A7rjkF8z0wRAjU/AKCGPnfuJiBzamcRwU9hIiH+GXZNSwCgi2YK
kwN9O4z/1MzWEakWX0p6IGo=
=d8GA
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-09  1:05                               ` Linus Torvalds
@ 2004-11-09  1:41                                 ` Christian Kujau
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-09  1:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds schrieb:
> 
> So what I'd like you to do is to take the pre-PCI-merge tree, and see if 
> that works for you
> 
> 	# assuming a 2.6.10-rc1 tree
> 	bk undo -a1.2000.1.6
> 
> and if that works, then try the post-PCI-merge tree:
> 
> 	# assuming a 2.6.10-rc1 tree
> 	bk undo -a1.2000.1.7
> 
> (I just checked: the above numbers are actually valid even in the current
> -bk tree, so you don't have to first go to 2.6.10-rc1, you can just start 
> from a current tree)

thanks, Linus. i'll do all this tomorrow, see my other mail i just sent.
i'll definitely do all this 'cause i'm really curious about this thing.
(it's not even the need of sound any more. heck, i could just put in
another soundcard but that'd be too easy :)

> 
> Thanks for testing, and sorry for the confusion with the more recent PCI 
> merge.

doh, you can't imagine how thankful i am for your (and the other people's!)
help here. but don't waste too many cycles on this weird issue here. if it
does not break for a million users out there now - why bother at all?
perhaps it'll break later on but then we have the lkml-archives and
someone will eventually remember this thing. but no, i don't want to
discourage anyone here ;-)

regards,
Christian.
- --
BOFH excuse #19:

floating point processor overflow
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkCAs+A7rjkF8z0wRAu2pAKDBw1Cj3fFBXbtbkpfagkpgbxiK+ACcC2gn
HXmcjnhFFX8vAjK0IawPQgI=
=T1C6
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-09  1:31                               ` Christian Kujau
@ 2004-11-09  7:40                                 ` Pekka Enberg
  2004-11-09 12:33                                   ` Christian Kujau
  0 siblings, 1 reply; 62+ messages in thread
From: Pekka Enberg @ 2004-11-09  7:40 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, Linus Torvalds, Greg KH

Hi,

On Tue, 09 Nov 2004 02:31:28 +0100, Christian Kujau <evil@g-house.de> wrote:
> the config is the same config i am usually using, never gave me a
> headache, new options (due to new kernel version) were left to default in
> most cases. anyway - i've pulled again a recent tree, did
> "bk undo -a1.2463" again but this time i stripped down my .config (via
> menuconfig) to the absolute necessary things:
> 
> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_a1.2463_take2
> 
> ...and  it did *NOT* oops:
> 
> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops-2.6.10-rc1_a1.2463.txt
> 
> i'll investigate further, building former -bk snapshots, using other
> configs before i'll fiddle around with bk again (to get the smaller
> changes). but this is a tomorrow thing, real life calls in :(

CONFIG_PREEMPT is one obvious candidate (you have that enabled in the
original config and disabled in the non-oopsing one).

                       Pekka

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1
  2004-11-09  7:40                                 ` Pekka Enberg
@ 2004-11-09 12:33                                   ` Christian Kujau
  2004-11-09 17:26                                     ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-09 12:33 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Pekka Enberg, Linus Torvalds, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

this damn thread is far too long already...


Pekka Enberg schrieb:
> CONFIG_PREEMPT is one obvious candidate (you have that enabled in the
> original config and disabled in the non-oopsing one).

i've disabled *only* CONFIG_PREEMPT in another .config but it still oopses:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-debug_oops-2.6.10-rc1_no-preempt.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_no-preempt.txt

2.6.9 with preempt enabled does not oops:
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.9_preempt.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-no-oops_2.6.9_preempt.txt

i was a fool to test further -bk snapshots but it was kinda late yesterday
 and i was confused:

patch-2.6.9.bz2          -> 19-Oct-2004
patch-2.6.10-rc1.bz2     -> 23-Oct-2004 00:12
patch-2.6.10-rc1-bk1.bz2 -> 23-Oct-2004 13:34

2.6.9 is not oopsing *here*, plain 2.6.10-rc1 is oopsing. so i can *not*
use -bk snapshots any more and i will go on with BK (undo the ChangeSets
Linus told me about) and use different .configs now. sorry for the
confusion and especially sorry to my bk mentor: we seem to be so close to
the right ChangeSet and then i started to use *snapshots* again.

Thanks,
Christian
- --
BOFH excuse #76:

Unoptimized hard drive
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkLkQ+A7rjkF8z0wRAhqLAJ9bZm+B5LKR+sY7V+yi/fSrhJuGrwCfcumS
GwsGsjKson9vwRMCDtT9/Zk=
=ailz
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-09 12:33                                   ` Christian Kujau
@ 2004-11-09 17:26                                     ` Christian Kujau
  2004-11-09 18:53                                       ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-09 17:26 UTC (permalink / raw)
  To: Christian Kujau, Kernel Mailing List
  Cc: Pekka Enberg, Linus Torvalds, Greg KH

On Tue, 09 Nov 2004 13:33:20 +0100, Christian Kujau wrote
> i've disabled *only* CONFIG_PREEMPT in another .config but it 
> still oopses:

at least i finally found the "bad" .config option: it's CONFIG_EDD.
when i disable this option (and only this option. i can use the same
.config as usual only disabling this very option. diff is my witness.)
i can boot a current (!) 2.6.10-rc1-bk and a working snd-ens1371!

i'll test with CONFIG_EDD=m later on. here a short summary:

2.6.9         CONFIG_EDD=y   - OK
2.6.10-rc1-bk CONFIG_EDD=y   - OOPS!
2.6.10-rc1-bk CONFIG_EDD=n   - OK
2.6.10-rc1-bk CONFIG_EDD=m   - ??

yes, i'll continue to find out the ChangeSet but now i (and perhaps you
too, if you are as curious as me) will know where to look at.
i must admit that i was not entirely sure why i wanted to enable
CONFIG_EDD at all. if i had never enabled it, it'd have saved me a week
of bug chasing, but learning is fun, too.

thanks,
Christian.
-- 
BOFH excuse #209:

Only people with names beginning with 'A' are getting mail this week (a
la Microsoft)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-09 17:26                                     ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
@ 2004-11-09 18:53                                       ` Linus Torvalds
  2004-11-09 19:04                                         ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH
  2004-11-09 23:30                                         ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
  0 siblings, 2 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-09 18:53 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, Pekka Enberg, Greg KH, Matt_Domsch



On Tue, 9 Nov 2004, Christian Kujau wrote:
> 
> at least i finally found the "bad" .config option: it's CONFIG_EDD.
> when i disable this option (and only this option. i can use the same
> .config as usual only disabling this very option. diff is my witness.)
> i can boot a current (!) 2.6.10-rc1-bk and a working snd-ens1371!

Very strange. There's not a lot of stuff that affects EDD directly that I 
can see, but there is:

	ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com
	  [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR
	  
	  Some controller BIOSes have problems with the legacy int13 fn02 READ
	  SECTORS command.  int13 fn42 EXTENDED READ is used in preference by most
	  boot loaders today, so lets use that.  If EXTENDED READ fails or isn't
	  supported, fall back to READ SECTORS.
	  
	  This hopefully resolves the three reports of BIOSes which would either
	  long-pause (30+ seconds) or hang completely on the legacy READ SECTORS
	  command.
	  
	  This also adds CONFIG_EDD_SKIP_MBR to eliminate reading the MBR on each
	  BIOS-presented disk, in case there are further problems in this area.
	  
	  Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
	  Signed-off-by: Andrew Morton <akpm@osdl.org>
	  Signed-off-by: Linus Torvalds <torvalds@osdl.org>

which might fit the bill.

However, even that would just change the EDD _data_, it doesn't change the 
code that actually runs in the kernel. And I _really_ don't see what EDD 
has got to do with anything.

I wonder if the EDD stuff corrupts the sysfs tree or something, and you're
just seeing some strange kobject interference. Greg, you'd likely still be
on the line for that one.

Christian, finding which change triggers this would be very good indeed. I 
think the merge with greg is still a good place to start, although even 
just doing the snapshot trees (from _before_ -rc1: ie the patches in 
/pub/linux/kernel/v2.6/snapshots/old: patch-2.6.9-bk*.gz) is actually also 
a good way to narrow things down.

		Linus


* [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 18:53                                       ` Linus Torvalds
@ 2004-11-09 19:04                                         ` Greg KH
  2004-11-09 19:08                                           ` Greg KH
  2004-11-09 19:09                                           ` Linus Torvalds
  2004-11-09 23:30                                         ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
  1 sibling, 2 replies; 62+ messages in thread
From: Greg KH @ 2004-11-09 19:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Matt_Domsch

This fixes a problem introduced in the previous set of driver model
changes that has been seen by a lot of people (most notably the greater
than 256 pty users, but others might also be hitting this without
realizing it.)

Also add a comment so we don't try to "fix" this again.

Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

--- a/lib/kobject.c	2004-11-05 10:06:33 -08:00
+++ b/lib/kobject.c	2004-11-08 23:58:02 -08:00
@@ -181,10 +181,10 @@ int kobject_add(struct kobject * kobj)
 
 	error = create_dir(kobj);
 	if (error) {
+		/* unlink does the kobject_put() for us */
 		unlink(kobj);
 		if (parent)
 			kobject_put(parent);
-		kobject_put(kobj);
 	} else {
 		kobject_hotplug(kobj, KOBJ_ADD);
 	}

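For anyone following along, the refcount imbalance can be shown with a toy
model. This is a minimal sketch, not the kernel's kobject code: struct
toy_kobject and the toy_*() helpers are hypothetical, and it assumes the add
path takes one reference of its own before the directory creation fails.

```c
/* Toy refcounted object; names are made up for illustration only. */
struct toy_kobject {
	int refcount;	/* number of outstanding references */
	int linked;	/* nonzero while the object is linked in */
};

void toy_get(struct toy_kobject *k)
{
	k->refcount++;
}

void toy_put(struct toy_kobject *k)
{
	k->refcount--;
}

/* Like unlink() in the patch comment: unlinking also drops a reference. */
void toy_unlink(struct toy_kobject *k)
{
	k->linked = 0;
	toy_put(k);
}

/*
 * Simulated error path of an "add" operation.  Returns the refcount
 * that is left afterwards.  extra_put != 0 models the code before the
 * patch, which did one more put after the unlink.
 */
int toy_add_error_path(struct toy_kobject *k, int extra_put)
{
	toy_get(k);		/* reference held while adding (assumed) */
	k->linked = 1;

	/* pretend the directory creation failed at this point */
	toy_unlink(k);		/* already drops the reference taken above */
	if (extra_put)
		toy_put(k);	/* the second put that the patch removes */

	return k->refcount;
}
```

Starting from a refcount of 1 (the caller's own reference), the patched path
returns 1, while the unpatched path returns 0 even though the caller still
holds a pointer to the object.
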

* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 19:04                                         ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH
@ 2004-11-09 19:08                                           ` Greg KH
  2004-11-09 20:19                                             ` Pekka Enberg
                                                               ` (2 more replies)
  2004-11-09 19:09                                           ` Linus Torvalds
  1 sibling, 3 replies; 62+ messages in thread
From: Greg KH @ 2004-11-09 19:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Matt_Domsch

On Tue, Nov 09, 2004 at 11:04:21AM -0800, Greg KH wrote:
> This fixes a problem introduced in the previous set of driver model
> changes that has been seen by a lot of people (most notably the greater
> than 256 pty users, but others might also be hitting this without
> realizing it.)
> 
> Also add a comment so we don't try to "fix" this again.
> 
> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>

Christian, I don't know if this patch explicitly fixes your problem, but
it fixes problems other people have been having with the driver core
lately.  I'd appreciate it if you could test it out and let me know if
it solves your problem, with CONFIG_EDD enabled, or if it doesn't help
at all.

thanks,

greg k-h


* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 19:04                                         ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH
  2004-11-09 19:08                                           ` Greg KH
@ 2004-11-09 19:09                                           ` Linus Torvalds
  2004-11-09 22:06                                             ` Christian Kujau
  1 sibling, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2004-11-09 19:09 UTC (permalink / raw)
  To: Greg KH; +Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Matt_Domsch



On Tue, 9 Nov 2004, Greg KH wrote:
>
> This fixes a problem introduced in the previous set of driver model
> changes that has been seen by a lot of people (most notably the greater
> than 256 pty users, but others might also be hitting this without
> realizing it.)

Ahh.. Christian, pls test this one.

		Linus


* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 19:08                                           ` Greg KH
@ 2004-11-09 20:19                                             ` Pekka Enberg
  2004-11-09 21:21                                             ` Christian Kujau
  2004-11-09 21:31                                             ` Christian Kujau
  2 siblings, 0 replies; 62+ messages in thread
From: Pekka Enberg @ 2004-11-09 20:19 UTC (permalink / raw)
  To: Greg KH; +Cc: Linus Torvalds, Christian Kujau, Kernel Mailing List, matt_domsch

Hi Greg,

On Tue, 9 Nov 2004 11:08:09 -0800, Greg KH <greg@kroah.com> wrote:
> Christian, I don't know if this patch explicitly fixes your problem, but
> it fixes problems other people have been having with the driver core
> lately.  I'd appreciate it if you could test it out and let me know if
> it solves your problem, with CONFIG_EDD enabled, or if it doesn't help
> at all.

The broken kobject_add fix is not in -rc1 proper which oopses on
Christian's machine. I don't think this patch has anything to do with
his problem.

                               Pekka


* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 19:08                                           ` Greg KH
  2004-11-09 20:19                                             ` Pekka Enberg
@ 2004-11-09 21:21                                             ` Christian Kujau
  2004-11-09 21:31                                             ` Christian Kujau
  2 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-09 21:21 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Greg KH, Linus Torvalds, Pekka Enberg, Matt_Domsch

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greg KH wrote:
> 
> Christian, I don't know if this patch explicitly fixes your problem, but
> it fixes problems other people have been having with the driver core
> lately.  I'd appreciate it if you could test it out and let me know if
> it solves your problem, with CONFIG_EDD enabled, or if it doesn't help
> at all.
> 

yes, i'll do so and test the patch. is this in current -BK yet? because
applying your patch [1] to 2.6.10-rc1 gives:

Hunk #1 FAILED at 181.
1 out of 1 hunk FAILED -- saving rejects to file lib/kobject.c.rej

i've done a few other things before, let me just post the results before i
go on with your suggestions:

i've compiled a recent (BK) 2.6.10-rc1 again with CONFIG_EDD=m|y|n

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_edd-modular.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_edd.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/config-2.6.10-rc1_no-edd.txt

the results:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd-modular.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_no-edd.txt

the interesting thing (for me) was, that when CONFIG_EDD=m was set, my
sound card was working properly and i could do "modprobe edd" and "rmmod
edd" as i like:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/typescript-2.6.10-rc1_edd-modular.txt

again: i double checked and compiled on 2 different hosts, each having
its own -BK tree.

thanks,
Christian.

[1] http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/edd-fix.patch
- --
BOFH excuse #22:

monitor resolution too high
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkTTg+A7rjkF8z0wRAvFPAKCCM05vqhg4u2NH2wklRRbxdVSpcwCff9a3
/KodSmgp9J4Nf2LDcTiBOCo=
=B/3X
-----END PGP SIGNATURE-----


* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 19:08                                           ` Greg KH
  2004-11-09 20:19                                             ` Pekka Enberg
  2004-11-09 21:21                                             ` Christian Kujau
@ 2004-11-09 21:31                                             ` Christian Kujau
  2 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-09 21:31 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greg KH wrote:
> lately.  I'd appreciate it if you could test it out and let me know if
> it solves your problem, with CONFIG_EDD enabled, or if it doesn't help
> at all.

please ignore my first mail (the part about not being able to patch), it's
already in BK i can see now, sorry.

compiling now...

- --
BOFH excuse #22:

monitor resolution too high
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkTc3+A7rjkF8z0wRAl7LAJ9/mXV4/uFet5aqpJB/02+J/654bACbBz/k
Px9muqjJ+e7OiRPDHbmyS1s=
=Q+hA
-----END PGP SIGNATURE-----


* Re: [PATCH] kobject: fix double kobject_put() in error path of kobject_add()
  2004-11-09 19:09                                           ` Linus Torvalds
@ 2004-11-09 22:06                                             ` Christian Kujau
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-09 22:06 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Linus Torvalds, Greg KH, Pekka Enberg, Matt_Domsch

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

i'm sorry to say that it did not help:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.10-rc1_edd__kobject_put.txt

i'll go on and try to exclude

ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com
	  [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR

(or just test /pub/linux/kernel/v2.6/snapshots/old/patch-2.6.9-bk*.gz ...)

thanks,
Christian.
- --
BOFH excuse #200:

The monitor needs another box of pixels.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkT9q+A7rjkF8z0wRArHjAJ4qSyZf+ioC4VkvPxk2fCNWUrl18QCeLK85
8e2EyGuWgBviGETlV25t/XE=
=Qvnz
-----END PGP SIGNATURE-----


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-09 18:53                                       ` Linus Torvalds
  2004-11-09 19:04                                         ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH
@ 2004-11-09 23:30                                         ` Christian Kujau
  2004-11-09 23:40                                           ` Matt Domsch
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-09 23:30 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Linus Torvalds, Pekka Enberg, Greg KH, Matt_Domsch

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> Very strange. There's not a lot of stuff that affects EDD directly that I 
> can see, but there is:
> 
> 	ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com
> 	  [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR

and i say: good catch! that does it!

i did "bk undo -a1.2000.5.108" on a current tree, booting this still gives
an oops:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_a1.2000.5.108.txt

excluding this single ChangeSet with "bk undo -r1.2118" does work with
CONFIG_EDD=y:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_r1.2000.5.108.txt

(the filename here should really read "...r1.2118.txt" because that was
the number of the changeset representing the above [PATCH] *after* i did
"bk undo -a1.2000.5.108". right?)

> However, even that would just change the EDD _data_, it doesn't change the 
> code that actually runs in the kernel. And I _really_ don't see what EDD 
> has got to do with anything.

understanding a lot less of all this than you guys i also wonder why only
this single driver broke. i've always loaded a couple of drivers here,
maybe i could play around a bit e.g. CONFIG_SND_ENS1371=y instead of =m or
see if other hw drivers break too.

> I wonder if the EDD stuff corrupts the sysfs tree or something, and you're
> just seeing some strange kobject interference.

do userspace tools matter here? there is "sysfsutils-1.1.0-1" and
"libsysfs1-1.1.0-1" (both debian/unstable) installed here, /sys is mounted:

   sysfs on /sys type sysfs (rw)

> Christian, finding which change triggers this would be very good indeed. I 
> think the merge with greg is still a good place to start, although even 

i'll look again over the -bk magic you told me about and see what it gives.

thanks so far to all involved here, i really enjoyed "working" with you.
first class support at no charge...it's just incredible.

you guys rock,
Christian.
- --
BOFH excuse #112:

The monitor is plugged into the serial port
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkVMN+A7rjkF8z0wRAqu4AKCtxZxE2spjZGgSnxTWzTTB0CWCkACgi2f3
RmHQXbnkcI1OEcLORhP1dmA=
=5Dot
-----END PGP SIGNATURE-----


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-09 23:30                                         ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
@ 2004-11-09 23:40                                           ` Matt Domsch
  2004-11-10  0:21                                             ` Christian Kujau
  2004-11-11 22:43                                             ` Matt Domsch
  0 siblings, 2 replies; 62+ messages in thread
From: Matt Domsch @ 2004-11-09 23:40 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Kernel Mailing List, Linus Torvalds, Pekka Enberg, Greg KH

On Wed, Nov 10, 2004 at 12:30:21AM +0100, Christian Kujau wrote:
> > 	ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com
> > 	  [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR
> 
> and i say: good catch! that does it!
> 
> i did "bk undo -a1.2000.5.108" on a current tree, booting this still gives
> an oops:
> 
> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_a1.2000.5.108.txt
> 
> excluding this single ChangeSet with "bk undo -r1.2118" does work with
> CONFIG_EDD=y:
> 
> http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/dmesg-2.6.9_r1.2000.5.108.txt

OK, thanks, that helps.  From the diff of those dmesg:

-BIOS EDD facility v0.16 2004-Jun-25, 16 devices found
+BIOS EDD facility v0.16 2004-Jun-25, 6 devices found

So with the latest EDD patch noted above, it's finding more disks than
before.  How many disks do you actually have in the system?

I'll review the assembly again to see where I could have miscounted,
and see how that may affect the EDD sysfs exports.  Likely no answer
from me before tomorrow though.

Thanks,
Matt

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


* Re: Oops in 2.6.10-rc1
  2004-11-08 18:13                           ` Linus Torvalds
  (?)
  (?)
@ 2004-11-10  0:12                           ` Christian Kujau
  2004-11-10  0:23                             ` Linus Torvalds
  -1 siblings, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-10  0:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> Now, if you want to get _really_ fancy, you can now look at each changeset 
> that differed, with something like
> 
> 	bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' -
> 
> which is black magic that does a set operation and shows all the changes 
> in between the sets of "bk at 1.2462" and "bk at 1.2463".
> 
> (This is _not_ the same as "bk changes -r1.2462..1.2463", because that one 
> just shows the single merge change that is on the direct _path_ from one 
> changeset to another. The black magic thing shows the set difference of 
> changesets that comes from the full graph at two points).

hm, i still fail to see the "magic" part here. from a current tree i get:

- ---------------
$ bk set -n -d -r1.2000.5.107 -r1.2000.5.108 | bk -R prs -h \
- -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' - | head -n5
<Matt_Domsch@dell.com>
  [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR

  Some controller BIOSes have problems with the legacy int13 fn02 READ
  SECTORS command.  int13 fn42 EXTENDED READ is used in preference by most
- ---------------

which looks similar to the next one, but with "bk changes" i get the
ChangeSet number again:

- ---------------
$ bk changes -r1.2000.5.108 | head -n5
ChangeSet@1.2000.5.108, 2004-10-20 08:36:22-07:00, Matt_Domsch@dell.com
  [PATCH] EDD: use EXTENDED READ command, add CONFIG_EDD_SKIP_MBR

  Some controller BIOSes have problems with the legacy int13 fn02 READ
  SECTORS command.  int13 fn42 EXTENDED READ is used in preference by most
- ---------------

...or was i supposed to alter your cmdline? i just copy'n'pasted it...
anyway, i've seen that i have a lot of "bk help" ahead of me, thanks for
the course, though ;)

greetings,
Christian.
- --
BOFH excuse #297:

Too many interrupts
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkVzi+A7rjkF8z0wRAte6AKCO8isFqWGyFK53IpVtEnAImvQq8gCfeePr
rzMnTyR3EPMqpv7+qz9iR6c=
=BB+K
-----END PGP SIGNATURE-----


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-09 23:40                                           ` Matt Domsch
@ 2004-11-10  0:21                                             ` Christian Kujau
  2004-11-10  1:01                                               ` Linus Torvalds
  2004-11-11 22:43                                             ` Matt Domsch
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-10  0:21 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: Matt Domsch, Linus Torvalds, Pekka Enberg, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matt Domsch wrote:
> 
> -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found
> +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found
> 
> So with the latest EDD patch noted above, it's finding more disks than
> before.  How many disks do you actually have in the system?

i have one scsi disk (sda) and two atapi cdrom drives:

hda: CRD-8483B, ATAPI CD/DVD-ROM drive
hdb: AOPEN CD-RW CRW3248 1.17 20020620, ATAPI CD/DVD-ROM drive
...
SCSI device sda: 35548320 512-byte hdwr sectors (18201 MB)
SCSI device sda: drive cache: write back

the "scsi0 : sym-2.1.18k" is on a pci card, the atapi devices are
connected onboard. if it helps:

http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-v.txt
http://www.nerdbynature.de/bits/prinz/2.6.10-rc1/lspci-vv.txt

> I'll review the assembly again to see where I could have miscounted,
> and see how that may affect the EDD sysfs exports.  Likely no answer
> from me before tomorrow though.

that's ok, real life kicks in here too...

thanks,
Christian.

PS: do you have *any* idea how this could be related to the snd-ens1371
driver (which is producing the oops then)?
- --
BOFH excuse #449:

greenpeace free'd the mallocs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBkV75+A7rjkF8z0wRAl67AJ9P+SF1WfRe7r2zoF9D/b/fyDeD0QCfe6/f
Uxt5DVlb/IzW9VSWuFJqLlI=
=Hpg9
-----END PGP SIGNATURE-----


* Re: Oops in 2.6.10-rc1
  2004-11-10  0:12                           ` Oops in 2.6.10-rc1 Christian Kujau
@ 2004-11-10  0:23                             ` Linus Torvalds
  0 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-10  0:23 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List



On Wed, 10 Nov 2004, Christian Kujau wrote:
> > 
> > 	bk set -n -d -r1.2462 -r1.2463 | bk -R prs -h -d'<:P:@:HOST:>\n$each(:C:){\t(:C:)\n}\n' -
> > 
> > which is black magic that does a set operation and shows all the changes 
> > in between the sets of "bk at 1.2462" and "bk at 1.2463".
> 
> hm, i still fail to see the "magic" part here. from a current tree i get:

You don't see any magic, unless there are merges involved. And you've 
already narrowed the thing down to a single non-merge changeset, at which 
point the "magic" way is just a very slow way of doing the same thing.

The magic hits you only when you have non-trivial merges, in which case 
the set operation shows you more than the "just walk from one top-of-tree 
to the other".
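
To make the difference concrete, here is a toy model in C. It is not
BitKeeper code: the graph, the changeset numbering 0..5, and the names
ancestors() and set_difference() are all made up for illustration.
Changeset 5 merges trunk changeset 2 with a side branch 3 -> 4 that forked
off at 1. Walking from 2 to 5 passes only through the merge itself, while
the set difference also contains the two branch changesets the merge
pulled in.

```c
#include <stdint.h>

#define NO_PARENT (-1)

/* A changeset has at most two parents; a merge has both slots filled. */
struct changeset {
	int parent[2];
};

/* Hypothetical history: 0 - 1 - 2 - 5 (trunk), 1 - 3 - 4 - 5 (branch). */
const struct changeset graph[] = {
	{ { NO_PARENT, NO_PARENT } },	/* 0: root */
	{ { 0, NO_PARENT } },		/* 1 */
	{ { 1, NO_PARENT } },		/* 2: trunk tip before the merge */
	{ { 1, NO_PARENT } },		/* 3: first branch changeset */
	{ { 3, NO_PARENT } },		/* 4: second branch changeset */
	{ { 2, 4 } },			/* 5: merge of trunk and branch */
};

/* Bitmask of a changeset and everything reachable through its parents. */
uint32_t ancestors(int id)
{
	uint32_t set;

	if (id == NO_PARENT)
		return 0;

	set = UINT32_C(1) << id;
	set |= ancestors(graph[id].parent[0]);
	set |= ancestors(graph[id].parent[1]);
	return set;
}

/* Changesets present in the tree at "newer" but not at "older". */
uint32_t set_difference(int older, int newer)
{
	return ancestors(newer) & ~ancestors(older);
}
```

With this graph, set_difference(2, 5) has bits 3, 4 and 5 set, whereas
set_difference(0, 1) has only bit 1 set, which is the non-merge case where
both approaches show the same single changeset.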

		Linus


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-10  0:21                                             ` Christian Kujau
@ 2004-11-10  1:01                                               ` Linus Torvalds
  0 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2004-11-10  1:01 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Kernel Mailing List, Matt Domsch, Pekka Enberg, Greg KH



On Wed, 10 Nov 2004, Christian Kujau wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Matt Domsch wrote:
> > 
> > -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found
> > +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found
> > 
> > So with the latest EDD patch noted above, it's finding more disks than
> > before.  How many disks do you actually have in the system?
> 
> i have one scsi disk (sda) and two atapi cdrom drives:

Interestingly, "16" is also EDD_MBR_SIG_MAX, so my suspicion is that it 
overflowed some EDD data area. edd_num_devices() (which is what reports 
the above number) does

	min_t(unsigned char,
		max_t(unsigned char, edd.edd_info_nr, edd.mbr_signature_nr),
		max_t(unsigned char, EDD_MBR_SIG_MAX, EDDMAXNR));

where EDDMAXNR is 6, and EDD_MBR_SIG_MAX is the afore-mentioned 16, so we 
know that either edd.edd_info_nr or edd.mbr_signature_nr is actually 
_at least_ 16.

Which is clearly totally bogus. In fact, even your old "6 devices found" 
thing looks suspiciously bogus.
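
The clamp can be tried out in isolation. A minimal sketch, assuming plain
unsigned char helpers in place of the kernel's min_t()/max_t() macros;
edd_num_devices_sketch() is a hypothetical stand-in, and only the two
constants come from the discussion above.

```c
#define EDDMAXNR	6	/* entries in edd_info[], as stated above */
#define EDD_MBR_SIG_MAX	16	/* MBR signature slots, as stated above */

static unsigned char min_uc(unsigned char a, unsigned char b)
{
	return a < b ? a : b;
}

static unsigned char max_uc(unsigned char a, unsigned char b)
{
	return a > b ? a : b;
}

/*
 * Same shape as the expression quoted above: take the larger of the two
 * counters, then cap it at the larger of the two table sizes (16).
 */
unsigned char edd_num_devices_sketch(unsigned char edd_info_nr,
				     unsigned char mbr_signature_nr)
{
	return min_uc(max_uc(edd_info_nr, mbr_signature_nr),
		      max_uc(EDD_MBR_SIG_MAX, EDDMAXNR));
}
```

Inputs of (3, 3) give 3, while (6, 16) and (6, 200) both give 16: the
result saturates at 16, so a reported "16 devices" means one of the two
counters was 16 or more.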

> PS: do you have *any* idea how this could be related to the snd-ens1371
> driver (which is producing the oops then)?

I bet it's overwriting some array, and just corrupting memory after it. 
For example, the edd_info[] array only has 6 entries, and the 
EDD_MBR_SIG_BUFFER is quite close to where we save the E820MAP memory map 
at bootup, so if something stomps on that, the kernel might be confused 
about where PCI memory can be allocated or similar. Or it might have 
overwritten some ACPI memory data, who knows.
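
A rough sketch of that failure mode, using one flat byte buffer so that the
overrun stays well-defined C. The slot size, the buffer size, and the
store_entries() helper are assumptions for illustration; they are not the
real boot-parameter layout.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE	4	/* bytes per entry (made up) */
#define SLOTS		6	/* capacity of the table */
#define NEIGHBOR_OFF	(SLOTS * SLOT_SIZE)	/* first byte after the table */
#define AREA_SIZE	128	/* table plus whatever is stored after it */

/*
 * Write "count" entries starting at offset 0 of a buffer of AREA_SIZE
 * bytes.  With clamp != 0 the count is capped at SLOTS first.  Returns
 * how many bytes beyond the table were modified by the write.
 */
size_t store_entries(unsigned char *area, size_t count, int clamp)
{
	unsigned char before[AREA_SIZE];
	size_t i, damaged = 0;

	memcpy(before, area, AREA_SIZE);

	if (clamp && count > SLOTS)
		count = SLOTS;
	if (count > AREA_SIZE / SLOT_SIZE)	/* keep the toy in bounds */
		count = AREA_SIZE / SLOT_SIZE;

	memset(area, 0xEE, count * SLOT_SIZE);

	for (i = NEIGHBOR_OFF; i < AREA_SIZE; i++)
		if (area[i] != before[i])
			damaged++;

	return damaged;
}
```

Storing 16 entries without the clamp spills 40 bytes past the 6-entry table
in this model; with the clamp, or with 6 entries or fewer, the neighbouring
bytes stay untouched.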

			Linus


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-09 23:40                                           ` Matt Domsch
  2004-11-10  0:21                                             ` Christian Kujau
@ 2004-11-11 22:43                                             ` Matt Domsch
  2004-11-11 22:53                                               ` Linus Torvalds
  2004-11-12  0:27                                               ` Christian Kujau
  1 sibling, 2 replies; 62+ messages in thread
From: Matt Domsch @ 2004-11-11 22:43 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Kernel Mailing List, Linus Torvalds, Pekka Enberg, Greg KH

On Tue, Nov 09, 2004 at 05:40:54PM -0600, Matt Domsch wrote:
> OK, thanks, that helps.  From the diff of those dmesg:
> 
> -BIOS EDD facility v0.16 2004-Jun-25, 16 devices found
> +BIOS EDD facility v0.16 2004-Jun-25, 6 devices found

As Linus points out, those are the magic numbers in EDD for number of
device entries stored.  Your BIOS seems to be reporting that it has
more devices than it does, or the EDD assembly is horked in a way I
have not yet deciphered.
 
> I'll review the assembly again to see where I could have miscounted,
> and see how that may affect the EDD sysfs exports.  Likely no answer
> from me before tomorrow though.

I haven't been able to find a solution to your problem yet, and given
some external time constraints I've got, won't be able to look into
this again for another week or more.

Thanks,
Matt

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-11 22:43                                             ` Matt Domsch
@ 2004-11-11 22:53                                               ` Linus Torvalds
  2004-11-11 22:55                                                 ` Matt Domsch
  2004-11-12  0:27                                               ` Christian Kujau
  1 sibling, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2004-11-11 22:53 UTC (permalink / raw)
  To: Matt Domsch, Andrew Morton
  Cc: Christian Kujau, Kernel Mailing List, Pekka Enberg, Greg KH



On Thu, 11 Nov 2004, Matt Domsch wrote:
> 
> I haven't been able to find a solution to your problem yet, and given
> some external time constraints I've got, won't be able to look into
> this again for another week or more.

Matt, I'll revert the EXTENDED READ change for now, then. The random
behaviour of the problem it causes makes me really dislike this bug, and
I'd like to release a -rc2 and start calming down the 2.6.10 stuff, but
having known random stuff happen really disturbs me.

We can re-do it once it's more obvious why it broke..

		Linus


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-11 22:53                                               ` Linus Torvalds
@ 2004-11-11 22:55                                                 ` Matt Domsch
  0 siblings, 0 replies; 62+ messages in thread
From: Matt Domsch @ 2004-11-11 22:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Christian Kujau, Kernel Mailing List,
	Pekka Enberg, Greg KH

On Thu, Nov 11, 2004 at 02:53:15PM -0800, Linus Torvalds wrote:
> Matt, I'll revert the EXTENDED READ change for now, then. The random
> behaviour of the problem it causes makes me really dislike this bug, and
> I'd like to release a -rc2 and start calming down the 2.6.10 stuff, but
> having known random stuff happen really disturbs me.
> 
> We can re-do it once it's more obvious why it broke..

Good plan, thanks.

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-11 22:43                                             ` Matt Domsch
  2004-11-11 22:53                                               ` Linus Torvalds
@ 2004-11-12  0:27                                               ` Christian Kujau
  2004-11-12  0:49                                                 ` Linus Torvalds
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Kujau @ 2004-11-12  0:27 UTC (permalink / raw)
  To: Matt Domsch; +Cc: Kernel Mailing List, Linus Torvalds, Pekka Enberg, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matt Domsch wrote:
> 
> As Linus points out, those are the magic numbers in EDD for number of
> device entries stored.  Your BIOS seems to be reporting that it has
> more devices than it does, or the EDD assembly is horked in a way I
> have not yet deciphered.

actually, my BIOS is even too old for e.g. ACPI, with latest firmware
installed. i had no issues so far with the board/bios, but perhaps this is
no longer true. however, it's still strange that this thing is only
triggered with your change and CONFIG_EDD=y.

> 
> I haven't been able to find a solution to your problem yet, and given
> some external time constraints I've got, won't be able to look into
> this again for another week or more.

nevermind then. as nobody else seems to be bothered by this i am happy with
the workaround (CONFIG_EDD=n) and since the lkml-archives exist we could
get back to it when it's bothering more people (n>1)

thank you for your time,
Christian.
- --
BOFH excuse #396:

Mail server hit by UniSpammer.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBlAOE+A7rjkF8z0wRAkyLAJ4uy4LYBHWk8Wxwr/heQRVm7VOXfwCfW30C
Zv1RdMYf1VOBEGkUnkQ+k0Q=
=f2hG
-----END PGP SIGNATURE-----


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-12  0:27                                               ` Christian Kujau
@ 2004-11-12  0:49                                                 ` Linus Torvalds
  2004-11-12  1:27                                                   ` Christian Kujau
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2004-11-12  0:49 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Matt Domsch, Kernel Mailing List, Pekka Enberg, Greg KH



On Fri, 12 Nov 2004, Christian Kujau wrote:
> 
> nevermind then. as nobody else seems to be bothered by this i am happy with
> the workaround (CONFIG_EDD=n) and since the lkml-archives exist we could
> get back to it when it's bothering more people (n>1)

The problem with that approach is that very few people are willing to 
spend the time and effort to really try to figure out where the problem 
triggers for them. Thanks again for testing lots of kernels, and different 
configurations.

Basically, if it's a problem that only happens for a smallish percentage
of people, and an even smaller percentage of those is willing to dig down
and find it, it's not a problem we can afford to ignore. Ignoring it just
means that there will be "a few" error reports that we will either waste
time on, or (even worse) we'll dismiss as "known problems" and then
possibly miss _another_ bug.

This is why I take random unexplained (but pinpointed) problems so 
seriously. If it wasn't as apparently random, we could file it under 
"known problem" and decide to try to fix it later. As it is, it's filed 
under "known cause", but since we don't know _why_, it might cause totally 
different problems on another machine, and that just makes it too painful 
for words. 

So the changeset is reverted for now in the current -bk tree, and I'll 
make a -rc2 this weekend and hope that we can stabilize for 2.6.10.

		Linus


* Re: Oops in 2.6.10-rc1 (almost solved)
  2004-11-12  0:49                                                 ` Linus Torvalds
@ 2004-11-12  1:27                                                   ` Christian Kujau
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Kujau @ 2004-11-12  1:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Matt Domsch, Kernel Mailing List, Pekka Enberg, Greg KH

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> This is why I take random unexplained (but pinpointed) problems so 
> seriously. If it wasn't as apparently random, we could file it under 
> "known problem" and decide to try to fix it later. As it is, it's filed 
> under "known cause", but since we don't know _why_, it might cause totally 
> different problems on another machine, and that just makes it too painful 
> for words. 

just after sending my last mail i too (re)thought about this and i'd have
begged Matt to revert the patch if it was not *only* me having this issue.

but i can see your point here and i appreciate your decision.

> So the changeset is reverted for now in the current -bk tree, and I'll 
> make a -rc2 this weekend and hope that we can stabilize for 2.6.10.

yay!

thanks,
Christian.
- --
BOFH excuse #96:

Vendor no longer supports the product
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBlBFw+A7rjkF8z0wRAld5AJ40MjbzFbVXepXkJr1tLZCvYy7z2QCeMYCe
QQyekHBs1cjuebPZTEuPZZ0=
=wwF6
-----END PGP SIGNATURE-----

end of thread (~2004-11-12  1:27 UTC)

Thread overview: 62+ messages
2004-10-28 13:12 Oops in 2.6.10-rc1 Christian
2004-10-28 13:29 ` [Alsa-devel] " Jaroslav Kysela
2004-10-28 14:09   ` Christian
2004-11-04 15:16     ` Christian Kujau
2004-11-05  2:35       ` Christian Kujau
2004-11-05 11:40         ` holborn
2004-11-07  1:24       ` Christian Kujau
2004-11-07  7:02         ` Linus Torvalds
2004-11-07 13:10           ` Christian Kujau
2004-11-07 16:02             ` Christian Kujau
2004-11-07 16:57               ` Linus Torvalds
2004-11-07 18:31                 ` Christian Kujau
2004-11-07 18:44                   ` Linus Torvalds
2004-11-07 23:45                   ` Christian Kujau
2004-11-07 23:45                     ` Christian Kujau
2004-11-08  1:16                     ` Linus Torvalds
2004-11-08  1:16                       ` Linus Torvalds
2004-11-08 13:01                       ` Christian Kujau
2004-11-08 13:01                         ` Christian Kujau
2004-11-08 18:13                         ` Linus Torvalds
2004-11-08 18:13                           ` Linus Torvalds
2004-11-08 20:59                           ` Christian Kujau
2004-11-08 20:59                             ` Christian Kujau
2004-11-08 23:49                             ` Christian Kujau
2004-11-09  1:05                               ` Linus Torvalds
2004-11-09  1:41                                 ` Christian Kujau
2004-11-09  1:31                               ` Christian Kujau
2004-11-09  7:40                                 ` Pekka Enberg
2004-11-09 12:33                                   ` Christian Kujau
2004-11-09 17:26                                     ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
2004-11-09 18:53                                       ` Linus Torvalds
2004-11-09 19:04                                         ` [PATCH] kobject: fix double kobject_put() in error path of kobject_add() Greg KH
2004-11-09 19:08                                           ` Greg KH
2004-11-09 20:19                                             ` Pekka Enberg
2004-11-09 21:21                                             ` Christian Kujau
2004-11-09 21:31                                             ` Christian Kujau
2004-11-09 19:09                                           ` Linus Torvalds
2004-11-09 22:06                                             ` Christian Kujau
2004-11-09 23:30                                         ` Oops in 2.6.10-rc1 (almost solved) Christian Kujau
2004-11-09 23:40                                           ` Matt Domsch
2004-11-10  0:21                                             ` Christian Kujau
2004-11-10  1:01                                               ` Linus Torvalds
2004-11-11 22:43                                             ` Matt Domsch
2004-11-11 22:53                                               ` Linus Torvalds
2004-11-11 22:55                                                 ` Matt Domsch
2004-11-12  0:27                                               ` Christian Kujau
2004-11-12  0:49                                                 ` Linus Torvalds
2004-11-12  1:27                                                   ` Christian Kujau
2004-11-10  0:12                           ` Oops in 2.6.10-rc1 Christian Kujau
2004-11-10  0:23                             ` Linus Torvalds
2004-11-08 18:44                         ` Pekka Enberg
2004-11-08 18:44                           ` Pekka Enberg
2004-11-08 19:00                           ` Greg KH
2004-11-08 19:00                             ` Greg KH
2004-11-08 19:18                             ` Pekka Enberg
2004-11-08 19:18                               ` Pekka Enberg
2004-11-08 19:30                               ` Pekka Enberg
2004-11-08 19:30                                 ` Pekka Enberg
2004-11-08 20:31                               ` Christian Kujau
2004-11-08 20:31                                 ` Christian Kujau
2004-11-07 13:05         ` Pekka Enberg
2004-11-07 13:43           ` Christian Kujau
