* Debugging a Stall or a Freeze
@ 2013-07-25 17:56 Salam Farhat
  2013-07-25 18:23 ` Valdis.Kletnieks at vt.edu
  0 siblings, 1 reply; 4+ messages in thread
From: Salam Farhat @ 2013-07-25 17:56 UTC (permalink / raw)
  To: kernelnewbies

I added the tracefs functionality to wrapfs.
wrapfs is a stackable file system that intercepts file system calls and
Tracefs basically records the VFS calls into a file on disk.
wrapfs worked fine on its own; it only started failing after I added the
logging functionality.

The issue I am having is that when I run a workload on the mounted file
system (wrapfs+tracefs), the system freezes. The mouse pointer does not even
move.

I set it up in VirtualBox and had the kernel dump its messages to a serial
port, which I then read from my host OS. I did this by following the
instructions on this page:
http://linuxdeveloper.blogspot.com/2012/05/debugging-linux-kernel-over-serial-port.html

When the guest OS freezes I get the messages shown below. I would like to
know what a good approach for debugging this issue is. I am not sure what a
process stall is. Is that a deadlock?


[  780.357876] BUG: soft lockup - CPU#0 stuck for 22s! [nautilus:1382]
[  780.361658] Process nautilus (pid: 1382, ti=dca12000 task=dc837230 task.ti=d)
[  780.361658]
Stack:
[  780.361658] Call Trace:
[  780.361658] Code: 90 b8 43 64 03 c1 b9 40 64 03 c1 e9 49 ff ff ff 90 55 ba 0
[  808.356372] BUG: soft lockup - CPU#0 stuck for 22s! [nautilus:1382]
[  808.360223] Process nautilus (pid: 1382, ti=dca12000 task=dc837230 task.ti=d)
[  808.360223]
Stack:
[  808.360223] Call Trace:
[  808.360223] Code: ff ff 90 b8 43 64 03 c1 b9 40 64 03 c1 e9 49 ff ff ff 90 5
[  814.876223] INFO: rcu_sched detected stall on CPU 0 (t=15000 jiffies)
[  814.876223] Process nautilus (pid: 1382, ti=dca12000 task=dc837230 task.ti=d)
[  814.876223]
Stack:
[  814.876223] Call Trace:
[  814.876223] Code: 00 c3 ff ff 80 e5 10 75 ee c1 e6 18 89 b0 10 c3 ff ff 89 d




Thanks.
Salam

* Debugging a Stall or a Freeze
  2013-07-25 17:56 Debugging a Stall or a Freeze Salam Farhat
@ 2013-07-25 18:23 ` Valdis.Kletnieks at vt.edu
  2013-08-16 16:38   ` Salam Farhat
  0 siblings, 1 reply; 4+ messages in thread
From: Valdis.Kletnieks at vt.edu @ 2013-07-25 18:23 UTC (permalink / raw)
  To: kernelnewbies

On Thu, 25 Jul 2013 13:56:47 -0400, Salam Farhat said:

> When the guest OS freezes I get the following messages seen below. I would
> like to know what is a good approach for debugging this issue. I am not
> sure what a process stall is. Is that a deadlock?
>
>
> [  780.357876] BUG: soft lockup - CPU#0 stuck for 22s! [nautilus:1382]
> [  780.361658] Process nautilus (pid: 1382, ti=dca12000 task=dc837230 task.ti=d)
> [  780.361658]
> Stack:
> [  780.361658] Call Trace:
> [  780.361658] Code: 90 b8 43 64 03 c1 b9 40 64 03 c1 e9 49 ff ff ff 90 55 ba 0
> [  808.356372] BUG: soft lockup - CPU#0 stuck for 22s!

That's probably not a deadlock.  That's code stuck in an infinite loop,
probably while running in a non-interruptible state.

Too bad we didn't get a stack dump out of it, that would tell us what
code is hung in a loop.

For debugging deadlocks, turning on CONFIG_PROVE_LOCKING=y in the .config
is the best bet - that will fire an alert not only when the kernel *does*
lock up, but also if there's even a *possible* deadlock (for instance, if one
section takes 2 locks in the order A B, it will trigger if it ever spots
another chunk of code taking B and then A - even if that doesn't actually
trigger a deadlock because neither lock is held at the time).
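
The A-then-B versus B-then-A check described above can be illustrated with a
small userspace toy. This is only a sketch of the idea, not how the kernel's
lock validator is implemented: the names tracker_reset, tracker_acquire and
tracker_release, the MAX_LOCKS limit, and the integer lock ids are all made up
for illustration, and it only catches direct two-lock inversions, not longer
chains.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8   /* hypothetical limit for this toy model */

/* seen_before[a][b] is true once lock a was held while lock b was taken. */
static bool seen_before[MAX_LOCKS][MAX_LOCKS];
static int held[MAX_LOCKS];   /* stack of lock ids currently held */
static int nheld;

/* Forget every recorded ordering and every held lock. */
void tracker_reset(void)
{
    memset(seen_before, 0, sizeof(seen_before));
    nheld = 0;
}

/*
 * Note that lock `id` is being taken. Returns false if the opposite order
 * (id first, then one of the locks held right now) was recorded earlier,
 * even though nothing is actually deadlocked at this moment.
 */
bool tracker_acquire(int id)
{
    bool ok = true;

    assert(id >= 0 && id < MAX_LOCKS && nheld < MAX_LOCKS);
    for (int i = 0; i < nheld; i++) {
        if (seen_before[id][held[i]])
            ok = false;                   /* inverse order is on record */
        seen_before[held[i]][id] = true;  /* remember: held[i] before id */
    }
    held[nheld++] = id;
    return ok;
}

/* Drop the most recently taken lock (strict nesting, for simplicity). */
void tracker_release(void)
{
    assert(nheld > 0);
    nheld--;
}
```

Taking lock 0 and then lock 1, releasing both, and later taking lock 1 and
then lock 0 makes the last tracker_acquire call return false, although no
thread ever blocked. That is the "possible deadlock" case: the report comes
from the recorded ordering, not from an actual hang.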



* Debugging a Stall or a Freeze
  2013-07-25 18:23 ` Valdis.Kletnieks at vt.edu
@ 2013-08-16 16:38   ` Salam Farhat
  2013-08-17  2:14     ` Sankar P
  0 siblings, 1 reply; 4+ messages in thread
From: Salam Farhat @ 2013-08-16 16:38 UTC (permalink / raw)
  To: kernelnewbies

I posted a question earlier, and I have since confirmed that this is running
in an infinite loop. However, I discovered that the infinite loop is
happening inside kernel code, specifically inside the kmalloc function. I
know this is highly improbable, but I believe that this is the case.

The line of code that causes the infinite loop is marked with asterisks below
and starts with buf =

If I comment this line out, it does not hang. If I uncomment it, it does.
Furthermore, none of the print statements after that line produce any output,
even though I have the line surrounded by print statements.

KMALLOC is a macro defined as
# define KMALLOC(a,b)    kmalloc((a),(b))

The last line being printed:
b0b0b0    4096
4096 being the size of buffer.


The get_buffer function is called quite a few times before the final call,
where it goes into an infinite loop. Could this be caused by a memory leak,
or by the system running low on memory?

Any advice on how to tackle this issue would be greatly appreciated.

Thanks.



static inline struct buffer *get_buffer(void)
{
    /* XXX:  __get_free_page should be used.  KMALLOC is for small stuff < PAGE_SIZE */
    struct buffer *buf;
    printk(KERN_EMERG " b0b0b0   %d\n", sizeof(struct buffer));
 *   buf = KMALLOC(sizeof(struct buffer), GFP_KERNEL);*
    print_entry_location();
    printk(KERN_EMERG " b1b1b1\n");
    //if (buf)  //i commented these out
    //buf->ptr = buf->data + INIT_LOC;  //i commented these out
    printk(KERN_EMERG " b1b1b1\n");
    print_exit_location();
    return NULL;  //i changed it to return null so the next function just exits
}



Additional info
struct buffer {
    char *ptr;
    char data[DATA_SIZE];
};
#define DATA_SIZE (PAGE_SIZE - sizeof(int))
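
The printed size of 4096 can be checked with a little arithmetic. The
userspace sketch below assumes 4 KiB pages (PAGE_SIZE_ASSUMED is a stand-in,
since the kernel's real PAGE_SIZE depends on the architecture), and the two
helper functions are hypothetical names added only for illustration. With
4-byte pointers and a 4-byte int, the members add up to 4 + (4096 - 4) = 4096,
which matches the output above. With 8-byte pointers they add up to 4100, and
the compiler typically pads the struct beyond that, so it would no longer fit
in one page. Since sizeof yields a size_t, %zu is the matching printk format
rather than %d.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel's real PAGE_SIZE is per-architecture. */
#define PAGE_SIZE_ASSUMED 4096UL
#define DATA_SIZE (PAGE_SIZE_ASSUMED - sizeof(int))

struct buffer {
    char *ptr;
    char data[DATA_SIZE];
};

/* Size of the two members laid end to end, before any trailing padding. */
size_t buffer_members_size(void)
{
    return sizeof(char *) + DATA_SIZE;
}

/* Size the compiler actually uses, i.e. what the allocator is asked for. */
size_t buffer_alloc_size(void)
{
    /* Padding can only grow the struct, never shrink it. */
    assert(sizeof(struct buffer) >= buffer_members_size());
    return sizeof(struct buffer);
}
```

Comparing buffer_alloc_size() against the page size on the target build shows
whether each request stays within a single page.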



* Debugging a Stall or a Freeze
  2013-08-16 16:38   ` Salam Farhat
@ 2013-08-17  2:14     ` Sankar P
  0 siblings, 0 replies; 4+ messages in thread
From: Sankar P @ 2013-08-17  2:14 UTC (permalink / raw)
  To: kernelnewbies

On Fri, Aug 16, 2013 at 10:08 PM, Salam Farhat <salalimo@gmail.com> wrote:
> I have posted a question earlier and I have confirmed that this is running
> in an infinite loop. However, I discovered that the infinite loop is
> happening inside kernel code. Specifically inside the kmalloc function. I
> know this is highly improbable, but I believe that this is the case.
>
> The line of code that cause the infinite loop is in bold below and starts
> with buf =
>
> If I comment this line out then it does not hang. If I uncomment it then it
> does. Further, more no print statements after that line are being printed
> and I have it surrounded by print statements.
>
> KMALLOC is a macro defined as
> # define KMALLOC(a,b)    kmalloc((a),(b))
>
> The last line being printed:
> b0b0b0    4096
> 4096 being the size of buffer.
>
>
> The get_buffer method is called quite a few times before the last time where
> it goes into an infinite loop. I am thinking there could be a memory leak or
> if memory is low this can happen?
>
> An advice on how to tackle this issue would be greatly appreciated.
>

The GFP_KERNEL flag allows the kmalloc call to sleep in low-memory
situations. If you pass GFP_ATOMIC instead, kmalloc will not sleep: if the
memory cannot be allocated, it fails immediately and returns NULL. You can
try that.

Alternatively, Google showed me that a show_page_info call can tell you about
the memory usage. You could call it just before the kmalloc call to see the
memory status.

If you are interested in debugging memory leaks, you can try kmemleak
http://psankar.blogspot.in/2010/11/detecting-memory-leaks-in-kernel.html




-- 
Sankar P
http://psankar.blogspot.com
