100% reliable Oops on xen 4.0.1

From: Peter Moody @ 2012-08-14  0:03 UTC
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2555 bytes --]

The attached program crashes my machine 100% of the time. The cause
seems to be some interaction between Xen and the audit subsystem.

Steps to reproduce the crash:

 1) compile with gcc -m32
 2) start auditd and install any rule (I've only tested syscall
    auditing, but any syscall seems to work):
      /etc/init.d/auditd start ; auditctl -D ; auditctl -a exit,always -F arch=64 -S chmod
 3) run and wait (this only loops twice for me before dying):
      ./a.out
 4) bask in instantaneous kernel oops.

Here's xm info from dom0:

[xen2.atl] root@gntb1:~# xm info
host                   : gntb1.atl.corp.google.com
release                : 3.2.13-ganeti-rx6-xen0
version                : #1 SMP Thu Jun 7 12:59:40 CEST 2012
machine                : x86_64
nr_cpus                : 12
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 2660
hw_caps                : bfebfbff:2c100800:00000000:00001f40:029ee3ff:00000000:00000001:00000000
virt_caps              : hvm
total_memory           : 32755
free_memory            : 22665
node_to_cpu            : node0:0,2,4,6,8,10
                         node1:1,3,5,7,9,11
node_to_memory         : node0:13083
                         node1:9582
node_to_dma32_mem      : node0:0
                         node1:3235
max_node_id            : 1
xen_major              : 4
xen_minor              : 0
xen_extra              : .1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : placeholder dom0_mem=1024M loglvl=all com1=115200,8n1 console=com1 iommu=0
cc_compiler            : gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
cc_compile_by          : pmacedo
cc_compile_domain      : google.com
cc_compile_date        : Wed Mar 16 15:24:06 UTC 2011
xend_config_format     : 4

I'm not sure what you need from the domU. It's running kernel 2.6.38.8,
but I've seen this bug on kernels all the way up to 3.5.0-rc7, the
latest I've tested. It's a fairly beefy setup: 32G of memory and 6 CPUs.

I suspect Xen rather than auditd because:

 a) this only happens on our Xen machines (though not all of them)
 b) one of my stack traces started with

[172577.560441]  [<ffffffff810065ad>] ? xen_force_evtchn_callback+0xd/0x10
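
Since the crash only shows up on Xen machines, it may help to confirm
from inside each guest that it really is running under Xen. The
following is a minimal sketch, assuming the guest kernel exposes
/sys/hypervisor/type (that depends on the kernel configuration); the
helper name hypervisor_is_xen is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads the first line of the file at `path`
 * (normally /sys/hypervisor/type) and compares it with "xen".
 * Returns 1 for Xen, 0 for anything else, -1 if the file cannot
 * be opened or read.
 */
int hypervisor_is_xen(const char *path) {
  char buf[32];
  FILE *f = fopen(path, "r");

  if (f == NULL)
    return -1;
  if (fgets(buf, sizeof(buf), f) == NULL) {
    fclose(f);
    return -1;
  }
  fclose(f);
  buf[strcspn(buf, "\n")] = '\0';  /* drop the trailing newline */
  return strcmp(buf, "xen") == 0;
}
```

Calling hypervisor_is_xen("/sys/hypervisor/type") on each machine
would give a quick yes/no; a return of -1 only means the file is not
there, not that the machine is bare metal.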

Does anyone have any idea what's going on?

Cheers,
peter

-- 
Peter Moody      Google    1.650.253.7306
Security Engineer  pgp:0xC3410038

[-- Attachment #2: crasher.c --]
[-- Type: text/x-csrc, Size: 3827 bytes --]

/*
 * steps:
 *  1) compile with gcc -m32
 *  2) start auditd, install any rule (I've only tested syscall auditing, but any syscall seems to work).
 *     /etc/init.d/auditd start ; auditctl -D ; auditctl -a exit,always -F arch=64 -S chmod
 *  3) run'n wait (this only loops twice for me before dying)
 *     ./a.out
 *  4) bask in instantaneous kernel oops.
 [  571.282777] ------------[ cut here ]------------
 [  571.282786] kernel BUG at fs/buffer.c:1263!
 [  571.282790] invalid opcode: 0000 [#1] SMP
 [  571.282795] last sysfs file: /sys/devices/system/cpu/sched_mc_power_savings
 [  571.282798] CPU 0
 [  571.282802] Pid: 7457, comm: a.out Not tainted 2.6.38.8-gg868-ganetixenu #1
 [  571.282808] RIP: e030:[<ffffffff81153853>]  [<ffffffff81153853>] __find_get_block+0x1f3/0x200
 [  571.282819] RSP: e02b:ffff88079b7ddc78  EFLAGS: 00010046
 [  571.282822] RAX: ffff8807bc290000 RBX: ffff8806d9bb9a98 RCX: 00000000023dc17c
 [  571.282826] RDX: 0000000000001000 RSI: 00000000023dc17c RDI: ffff8807fec29a00
 [  571.282830] RBP: ffff88079b7ddcd8 R08: 0000000000000001 R09: ffff8806d9bb99c0
 [  571.282834] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8806d9bb99c4
 [  571.282839] R13: ffff8806d9bb99f0 R14: ffff8807feff9060 R15: 00000000023dc17c
 [  571.282845] FS:  00007f8f6a76a7c0(0000) GS:ffff8807fff26000(0063) knlGS:0000000000000000
 [  571.282849] CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
 [  571.282853] CR2: 00000000f76c6970 CR3: 00000007a250b000 CR4: 0000000000002660
 [  571.282857] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [  571.282861] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 [  571.282866] Process a.out (pid: 7457, threadinfo ffff88079b7dc000, task ffff8807786843e0)
 [  571.282870] Stack:
 [  571.282872]  ffff88079b7ddc98 ffffffff81654cd1 ffff88079b7ddca8 ffff8806d9bba440
 [  571.282879]  ffff88079b7ddd08 ffffffff811c9294 ffff8807ffffffc3 0000000000000014
 [  571.282887]  ffff8806d9bb9a98 ffff8806d9bb99c4 ffff8806d9bb99f0 ffff8807feff9060
 [  571.282895] Call Trace:
 [  571.282901]  [<ffffffff81654cd1>] ? down_read+0x11/0x30
 [  571.282907]  [<ffffffff811c9294>] ? ext3_xattr_get+0xf4/0x2b0
 [  571.282913]  [<ffffffff811baf88>] ext3_clear_blocks+0x128/0x190
 [  571.282918]  [<ffffffff811bb104>] ext3_free_data+0x114/0x160
 [  571.282923]  [<ffffffff811bbc0a>] ext3_truncate+0x87a/0x950
 [  571.282928]  [<ffffffff812133f5>] ? journal_start+0xb5/0x100
 [  571.282933]  [<ffffffff811bc840>] ext3_evict_inode+0x180/0x1a0
 [  571.282938]  [<ffffffff8114065f>] evict+0x1f/0xb0
 [  571.282945]  [<ffffffff81006d52>] ? check_events+0x12/0x20
 [  571.282949]  [<ffffffff81140c14>] iput+0x1a4/0x290
 [  571.282955]  [<ffffffff8113ed05>] dput+0x265/0x310
 [  571.282959]  [<ffffffff81132435>] path_put+0x15/0x30
 [  571.282965]  [<ffffffff810a5d31>] audit_syscall_exit+0x171/0x260
 [  571.282971]  [<ffffffff8103ed9a>] sysexit_audit+0x21/0x5f
 [  571.282974] Code: 82 00 05 01 00 85 c0 75 de 65 48 89 1c 25 00 05 01 00 e9 87 fe ff ff 48 89 df e8 e9 fc ff ff 4c 89 f7 e9 02 ff ff ff 0f 0b eb fe <0f> 0b eb fe 0f 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 49 89
 [  571.283027] RIP  [<ffffffff81153853>] __find_get_block+0x1f3/0x200
 [  571.283033]  RSP <ffff88079b7ddc78>
 [  571.283036] ---[ end trace 5975ffe20808ecd2 ]---
 *
 */

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

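/* The parent directory (/usr/local/tmp/crasher) must already exist. */
#define KILLDIR "/usr/local/tmp/crasher/kill_dir"

int main(void) {
  FILE *f;
  char fullpath[512];
  int i = 0;  /* iteration counter, printed before each pass */

  while (1) {
    fprintf(stderr, "%d ", i++);
    mkdir(KILLDIR, 0777);
    chdir(KILLDIR);
    snprintf(fullpath, sizeof(fullpath), "%s/file", KILLDIR);
    f = fopen(fullpath, "w+");
    if (f == NULL) {  /* e.g. the parent directory is missing */
      perror("fopen");
      return 1;
    }
    fprintf(f, "nothing to see here");
    fclose(f);
    unlink(fullpath);  /* remove the file just written */
    rmdir(KILLDIR);    /* then the directory we are still chdir'd into */
  }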
  return 0;
}
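
For testing on a machine that is not expected to oops, a bounded
variant of the loop above may be handy. This is only a sketch: the
function names churn_once and churn, the directory argument, and the
iteration count are all made up for illustration, and the sequence of
calls is meant to mirror crasher.c (mkdir, chdir, fopen, write,
fclose, unlink, rmdir). The directory must be given as an absolute
path whose parent exists, because each pass chdir()s into it.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: one mkdir/chdir/write/unlink/rmdir pass on `dir`.
 * Returns 0 if every call succeeded, -1 otherwise.
 */
int churn_once(const char *dir) {
  char fullpath[512];
  FILE *f;
  int n, ok;

  n = snprintf(fullpath, sizeof(fullpath), "%s/file", dir);
  if (n < 0 || (size_t)n >= sizeof(fullpath))
    return -1;  /* path would not fit in the buffer */
  if (mkdir(dir, 0777) != 0 || chdir(dir) != 0)
    return -1;
  f = fopen(fullpath, "w+");
  if (f == NULL)
    return -1;
  ok = fprintf(f, "nothing to see here") >= 0;
  ok = (fclose(f) == 0) && ok;
  ok = (unlink(fullpath) == 0) && ok;
  ok = (rmdir(dir) == 0) && ok;  /* cwd is now a removed directory */
  return ok ? 0 : -1;
}

/*
 * Hypothetical helper: runs up to `count` passes and stops at the
 * first failure. Returns the number of passes that completed.
 */
int churn(const char *dir, int count) {
  int done = 0;

  while (done < count && churn_once(dir) == 0)
    done++;
  return done;
}
```

On an affected domU with an audit rule loaded, I would expect this to
die in the same place as the original; elsewhere churn() should simply
return the requested count.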
