linux-kernel.vger.kernel.org archive mirror
* Re: Qn: Queer "Unable to handle kernel NULL pointer dereference at ..." error in kernel
       [not found] <OFAFB946CE.4BD085AE-ON87256D10.006801CE-88256D10.00683E0D@us.ibm.com>
@ 2003-04-23 17:53 ` Yours Lovingly
  2003-04-23 18:15   ` Jan Harkes
  0 siblings, 1 reply; 2+ messages in thread
From: Yours Lovingly @ 2003-04-23 17:53 UTC (permalink / raw)
  To: Bryan Henderson, Jan Harkes; +Cc: linux-kernel

i am sorry for that useless information. i am
attaching  a ksymoops output of that problem (focus on
the register operation (that ksymoops identifies as
the fault triggering instruction) in the "code"
section at the end of the ksymoops output)

FIRST THE CHAOTIC CODE REVISITED:-

static void nfs_print_path(struct dentry *d) {
	struct dentry *parent;
	struct qstr *qs;
	char name[64];
	struct inode *inode_parent, *inode;
	void *p, *me;

	if(!d) { 
		return;
	}	
	parent = d->d_parent;
	qs = &d->d_name;
	
	if(parent)  {
		inode_parent = parent->d_inode;
		inode = d->d_inode;
		p = (void *)inode_parent;
		me = (void *)inode;
// Till here things work just fine. I am DEAD SURE of that as i put printk()
// followed by return here and there and checked.

// My analysis with printk's and return's shows that the next statement, or
// somewhere after that is the problem. ksymoops identifies the fault
// triggering instrxn as 'cmp' (see output below) which (??) could be for
// this statement.
		if( p - me != 0 ) {
			//nfs_print_path(parent);
			printk("return 3\n");
			return;
		}




KSYMOOPS output:

<1>Unable to handle kernel NULL pointer dereference at virtual address 0000000f
c88b6956
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c88b6956>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00000006   ebx: c6675024   ecx: 00000001   edx: 00000007
esi: 429d3663   edi: 0000005d   ebp: c6598cdc   esp: c5b21f08
ds: 0018   es: 0018   ss: 0018
Process ls (pid: 1996, stackpage=c5b21000)
Stack: 00000246 ffffffd2 c011b91b 0001ea92 0001ea96 00000282 0001ea96 0001ea8f
       00000001 00000282 00000001 c033c964 00000000 00000004 c6598cdc c011bb41
       c6675044 429d3663 c6675044 c88b80e2 c6675024 bffff930 c014f67e c71bf000
Call Trace: [<c011b91b>] [<c011bb41>] [<c88b80e2>] [<c014f67e>] [<c0151111>]
   [<c014cd04>] [<c010775c>] [<c010766b>]
Code: 39 42 08 74 0d 83 ec 0c 68 2c 81 8c c8 eb 0b 8d 76 00 83 ec

>>EIP; c88b6956 <[nfs]nfs_print_path+2e/58> <=====
Trace; c011b91b <call_console_drivers+eb/100>
Trace; c011bb41 <printk+1a1/1f0>
Trace; c88b80e2 <[nfs]nfs_revalidate+116/1b0>
Trace; c014f67e <getname+5e/a0>
Trace; c0151111 <__user_walk+41/50>
Trace; c014cd04 <sys_lstat64+34/70>
Trace; c010775c <error_code+34/3c>
Trace; c010766b <system_call+33/38>
Code;  c88b6956 <[nfs]nfs_print_path+2e/58>
00000000 <_EIP>:
Code;  c88b6956 <[nfs]nfs_print_path+2e/58> <=====
   0:   39 42 08                  cmp    %eax,0x8(%edx)   <=====
Code;  c88b6959 <[nfs]nfs_print_path+31/58>
   3:   74 0d                     je     12 <_EIP+0x12> c88b6968 <[nfs]nfs_print_path+40/58>
Code;  c88b695b <[nfs]nfs_print_path+33/58>
   5:   83 ec 0c                  sub    $0xc,%esp
Code;  c88b695e <[nfs]nfs_print_path+36/58>
   8:   68 2c 81 8c c8            push   $0xc88c812c
Code;  c88b6963 <[nfs]nfs_print_path+3b/58>
   d:   eb 0b                     jmp    1a <_EIP+0x1a> c88b6970 <[nfs]nfs_print_path+48/58>
Code;  c88b6965 <[nfs]nfs_print_path+3d/58>
   f:   8d 76 00                  lea    0x0(%esi),%esi
Code;  c88b6968 <[nfs]nfs_print_path+40/58>
  12:   83 ec 00                  sub    $0x0,%esp


--- Jan Harkes <jaharkes@cs.cmu.edu> wrote:
> > I guess your 'small' hack puts too much on the stack resulting in a
> > stackoverflow.
>
> More likely an error writing to 'p' and 'me' variables on the stack.
> They are probably past the end of the stack, resulting in a pagefault
> which simply triggers the NULL ptr dereference message, because
> pagefaults should never occur in kernel space and typically indicate a
> bad pointer dereference.

--- Bryan Henderson <hbryan@us.ibm.com> wrote:
> I agree that the statements in question can't cause that symptom.  (I don't
> even think the stack is being accessed -- looks like register operations to
> me).  I suspect your analysis is in error.  You may need to use a kernel
> debugger to figure this out.

so it does look like a plain register operation. so
what is so wrong??
if you still think this info is insufficient for
finding out what is wrong, can you suggest a good kernel
debugger (ksymoops is all i use)? note that this error is
highly reproducible and not something that just went wrong
once.


regards
abhishek




* Re: Qn: Queer "Unable to handle kernel NULL pointer dereference at ..." error in kernel
  2003-04-23 17:53 ` Qn: Queer "Unable to handle kernel NULL pointer dereference at ..." error in kernel Yours Lovingly
@ 2003-04-23 18:15   ` Jan Harkes
  0 siblings, 0 replies; 2+ messages in thread
From: Jan Harkes @ 2003-04-23 18:15 UTC (permalink / raw)
  To: Yours Lovingly; +Cc: Bryan Henderson, linux-kernel

On Wed, Apr 23, 2003 at 06:53:33PM +0100, Yours Lovingly wrote:
> i am sorry for that useless information. i am
> attaching  a ksymoops output of that problem (focus on
> the register operation (that ksymoops identifies as
> the fault triggering instruction) in the "code"
> section at the end of the ksymoops output)
...
> 		inode_parent = parent->d_inode;
> 		inode = d->d_inode;
> 		p = (void *)inode_parent;
> 		me = (void *)inode;
> // Till here things work just fine. I am DEAD SURE of that as i put printk()
> // followed by return here and there and checked.
> 
> 		if( p - me != 0 ) {
> 			//nfs_print_path(parent);
> 			printk("return 3\n");
> 			return;
> 		}

What I typically do in these cases is,

- remove the object file where the oops occurs
- rerun make
- copy the gcc line that is responsible for compiling the object
- run the same gcc line again, but add a '-g' flag

Now we have an object with debugging symbols, and can use 'objdump
--source <file>.o | less' and get something that has both the source
lines and the related assembly code. In this case the faulting
instruction is about 0x2e bytes past the beginning of nfs_print_path.
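
The "+2e/58" notation in the oops is just offset/size relative to the start
of the symbol, so the addresses can be cross-checked with plain arithmetic.
A minimal user-space sketch, assuming the EIP and offsets printed by ksymoops
in this thread; the helper names are hypothetical and not part of any kernel
or ksymoops interface:

```c
#include <assert.h>
#include <stdint.h>

/* ksymoops prints a location as symbol+offset/size, e.g.
 * nfs_print_path+2e/58. Given the absolute EIP and that offset,
 * recover the address where the function starts. */
static uint32_t symbol_start(uint32_t eip, uint32_t offset)
{
    assert(eip >= offset);  /* an offset larger than EIP makes no sense */
    return eip - offset;
}

/* True if addr lies inside the half-open range [start, start + size). */
static int inside_symbol(uint32_t addr, uint32_t start, uint32_t size)
{
    return addr >= start && addr - start < size;
}
```

With EIP c88b6956 and offset 0x2e this gives a function start of c88b6928,
and the branch target c88b6968 from the disassembly falls at +0x40 inside the
0x58-byte function, which agrees with the "+40/58" that ksymoops printed.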

But I can get pretty far just from reading the oops and it is probably
the test you added. Ok, it looks like -O2 is actually optimizing away
some of those intermediate variables for you.

> Code;  c88b6956 <[nfs]nfs_print_path+2e/58> <=====
>    0:   39 42 08                  cmp   %eax,0x8(%edx)   <=====
>    3:   74 0d                     je     12 <_EIP+0x12>

Ok, so we're comparing the contents of %eax to something that is at an
8-byte offset from the address stored in %edx, and then jumping out if
they are equal.
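
The memory operand can be checked against the fault address reported at the
top of the oops. A small sketch of the effective-address arithmetic, assuming
the register values from the oops (edx = 00000007) and using a hypothetical
helper name:

```c
#include <stdint.h>

/* Effective address of an AT&T-syntax memory operand disp(%base),
 * e.g. 0x8(%edx): the CPU accesses memory at base + disp. */
static uint32_t effective_address(uint32_t base, int32_t disp)
{
    return base + (uint32_t)disp;
}
```

For 0x8(%edx) with edx = 0x7 the result is 0xf, which is exactly the
"virtual address 0000000f" in the first line of the oops.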

So this is something like an if (eax != edx->bar) { } construct. And in
fact the test you added reduces, step by step, to

    if ( p - me != 0 )
    if ( p != me )
    if ( inode_parent != inode )
    if ( parent->d_inode != d->d_inode )

So the contents of either parent->d_inode or d->d_inode were stored in
register eax, and we're dereferencing the other pointer during the test
operation.
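
To illustrate why the comparison ends up reading memory, here is a user-space
sketch with mock structures. The layout is an assumption modelled loosely on
a dentry with two 4-byte fields ahead of d_inode, which would put d_inode at
offset 8 and match the 0x8 displacement in the cmp; the mock_* names and both
helper functions are hypothetical. The subtraction goes through uintptr_t
because subtracting pointers to unrelated objects is not defined in standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_inode {
    unsigned long i_ino;
};

/* Assumed layout: two 4-byte fields, then the inode pointer. */
struct mock_dentry {
    int d_count;
    unsigned int d_flags;
    struct mock_inode *d_inode;
    struct mock_dentry *d_parent;
};

/* The test as written in the patch, with intermediate temporaries. */
static int differs_via_temporaries(const struct mock_dentry *d)
{
    assert(d != NULL && d->d_parent != NULL);
    uintptr_t p = (uintptr_t)d->d_parent->d_inode;
    uintptr_t me = (uintptr_t)d->d_inode;
    return p - me != 0;
}

/* What an optimizing compiler may reduce it to: load one d_inode into a
 * register and compare it against the other one directly in memory. */
static int differs_direct(const struct mock_dentry *d)
{
    assert(d != NULL && d->d_parent != NULL);
    return d->d_parent->d_inode != d->d_inode;
}
```

Both functions have to load d_inode through a dentry pointer, so if either d
or d->d_parent holds garbage, the load faults even though the source line
looks like pure pointer arithmetic.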

Why does it crash? Because the pointer we are dereferencing is 0x7, which
is not a valid address; in fact it already looks like an inode number.

> eax: 00000006   ebx: c6675024   ecx: 00000001   edx: 00000007

And eax looks pretty suspicious as well. These are not pointers to inode
structures, but possibly the i_ino numbers.
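
A rough way to express "this does not look like a pointer" in code, as a
sketch only. It assumes an i386 kernel with the default 3G/1G split, where
kernel virtual addresses start at 0xC0000000; the macro and function names
are hypothetical and this is not a kernel API:

```c
#include <stdint.h>

/* Assumption: default i386 split, kernel addresses begin at 0xC0000000. */
#define ASSUMED_PAGE_OFFSET 0xC0000000u

/* Returns nonzero if value could be a kernel virtual address under the
 * assumption above. It says nothing about whether the address is mapped. */
static int plausible_kernel_pointer(uint32_t value)
{
    return value >= ASSUMED_PAGE_OFFSET;
}
```

Applied to the register dump, ebx (c6675024) and ebp (c6598cdc) pass this
check, while eax (00000006) and edx (00000007) do not.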

Jan

