* linux-next: Tree for March 3
@ 2010-03-03  6:46 Stephen Rothwell
  2010-03-03 15:44 ` -next March 3: Boot failure on x86 (Oops) Sachin Sant
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Rothwell @ 2010-03-03  6:46 UTC
  To: linux-next; +Cc: LKML

Hi all,

Changes since 20100302:

Dropped tree:	blackfin (temporarily, at the maintainer's request)

My fixes tree contains:
	a patch for a dell-laptop build failure from Ingo Molnar
	a patch for a SCSI build error from Randy Dunlap
	a patch for a pktcdvd build error from Arnd Bergmann
	
We are seeing conflicts migrate from being between two trees in linux-next
to being between a tree and Linus' tree as things start to get merged.

The omap tree lost its conflicts.

The ext3 tree lost its build failure.

The nfs tree removed a commit so a build fix for the ceph tree is no longer
required.

The net tree lost its conflict.

The mtd tree still has its 2 build failures for which I applied patches.

The mfd tree gained a conflict against the i2c tree.

The driver-core tree (interacting with the wireless tree) gained a build
failure for which I applied a patch.

----------------------------------------------------------------------------

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/v2.6/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc
and sparc64 defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 158 trees (counting Linus' and 22 trees of patches pending
for Linus' tree); more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to adding
more builds.

Thanks to Jan Dittmer for adding the linux-next tree to his build tests
at http://l4x.org/k/ , the guys at http://test.kernel.org/ and Randy
Dunlap for doing many randconfig builds.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master
Merging fixes/fixes
Merging arm-current/master
Merging m68k-current/for-linus
Merging powerpc-merge/merge
Merging sparc-current/master
Merging scsi-rc-fixes/master
Merging net-current/master
Merging sound-current/for-linus
Merging pci-current/for-linus
Merging wireless-current/master
Merging kbuild-current/for-linus
Merging quilt/driver-core.current
Merging quilt/tty.current
Merging quilt/usb.current
Merging quilt/staging.current
Merging cpufreq-current/fixes
Merging input-current/for-linus
Merging md-current/for-linus
Merging audit-current/for-linus
Merging crypto-current/master
Merging ide-curent/master
Merging dwmw2/master
Merging arm/devel
Merging davinci/davinci-next
Merging i.MX/for-next
CONFLICT (content): Merge conflict in arch/arm/Makefile
Merging msm/for-next
Merging omap/for-next
Merging pxa/for-next
Merging samsung/next-samsung
CONFLICT (content): Merge conflict in arch/arm/Kconfig
CONFLICT (content): Merge conflict in arch/arm/Makefile
Applying: arm: fix bad merge of arch/arm/Kconfig
Merging avr32/avr32-arch
Merging cris/for-next
Merging ia64/test
Merging m68k/for-next
Merging m68knommu/for-next
Merging microblaze/next
CONFLICT (content): Merge conflict in arch/microblaze/include/asm/prom.h
Merging mips/mips-for-linux-next
CONFLICT (content): Merge conflict in arch/mips/alchemy/common/platform.c
CONFLICT (content): Merge conflict in arch/mips/alchemy/common/setup.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/db1200/platform.c
CONFLICT (content): Merge conflict in arch/mips/alchemy/devboards/db1x00/board_setup.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/db1x00/platform.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/pb1100/platform.c
CONFLICT (content): Merge conflict in arch/mips/alchemy/devboards/pb1200/platform.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/pb1500/platform.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/pb1550/platform.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/platform.c
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/devboards/platform.h
CONFLICT (add/add): Merge conflict in arch/mips/alchemy/xxs1500/platform.c
CONFLICT (content): Merge conflict in arch/mips/cavium-octeon/octeon-irq.c
CONFLICT (content): Merge conflict in arch/mips/configs/db1200_defconfig
CONFLICT (content): Merge conflict in arch/mips/include/asm/mach-au1x00/au1000.h
CONFLICT (content): Merge conflict in drivers/pcmcia/au1000_generic.h
CONFLICT (add/add): Merge conflict in drivers/pcmcia/db1xxx_ss.c
CONFLICT (add/add): Merge conflict in drivers/pcmcia/xxs1500_ss.c
Merging parisc/next
Merging powerpc/next
Merging 4xx/next
Merging 52xx-and-virtex/next
Merging galak/next
Merging s390/features
Merging sh/master
Merging genesis/master
CONFLICT (content): Merge conflict in arch/arm/Kconfig
Merging sparc/master
Merging xtensa/master
Merging ceph/for-next
CONFLICT (content): Merge conflict in fs/gfs2/super.c
CONFLICT (content): Merge conflict in fs/xfs/linux-2.6/xfs_super.c
Merging cifs/master
Merging configfs/linux-next
Merging ecryptfs/next
Merging ext3/for_next
Merging ext4/next
Merging fatfs/master
Merging fuse/for-next
Merging gfs2/master
Merging jfs/next
Merging logfs/master
Merging nfs/linux-next
CONFLICT (content): Merge conflict in fs/gfs2/super.c
Merging nfsd/nfsd-next
Merging nilfs2/for-next
Merging ocfs2/linux-next
Merging squashfs/master
Merging udf/for_next
Merging v9fs/for-next
Merging ubifs/linux-next
Merging xfs/master
Applying: nfsd: fixup for _xfs_log_force API change
Merging reiserfs-bkl/reiserfs/kill-bkl
Merging vfs/for-next
Applying: logfs: fixup for write_inode API change
Merging pci/linux-next
Merging hid/for-next
Merging quilt/i2c
Merging bjdooks-i2c/next-i2c
CONFLICT (content): Merge conflict in drivers/i2c/busses/Kconfig
CONFLICT (content): Merge conflict in drivers/i2c/busses/Makefile
Merging quilt/jdelvare-hwmon
Merging quilt/kernel-doc
Merging v4l-dvb/master
CONFLICT (add/add): Merge conflict in drivers/media/video/tvp7002.c
Merging kbuild/for-next
Merging kconfig/for-next
Merging ide/master
Merging libata/NEXT
Merging infiniband/for-next
CONFLICT (content): Merge conflict in drivers/infiniband/core/uverbs_main.c
Merging acpi/test
Merging ieee1394/for-next
Merging ubi/linux-next
Merging kvm/linux-next
CONFLICT (content): Merge conflict in Documentation/feature-removal-schedule.txt
Merging dlm/next
Merging ibft/master
Merging scsi/master
Merging async_tx/next
Merging net/master
Merging wireless/master
Merging mtd/master
CONFLICT (content): Merge conflict in drivers/mtd/nand/sh_flctl.c
Applying: mtd: nand: fix name space clash
Applying: mtd: declare inline functions static
Merging crypto/master
Merging sound/for-next
Merging cpufreq/next
CONFLICT (content): Merge conflict in drivers/acpi/processor_core.c
Applying: cpufreq: merge fixup for processor_core.c rename
Merging quilt/rr
CONFLICT (content): Merge conflict in MAINTAINERS
CONFLICT (content): Merge conflict in drivers/char/hvc_console.c
CONFLICT (content): Merge conflict in drivers/char/hvc_console.h
Merging mmc/next
Merging tmio-mmc/linux-next
CONFLICT (content): Merge conflict in drivers/mfd/asic3.c
CONFLICT (content): Merge conflict in drivers/mfd/t7l66xb.c
CONFLICT (content): Merge conflict in drivers/mfd/tc6387xb.c
CONFLICT (content): Merge conflict in drivers/mfd/tc6393xb.c
CONFLICT (add/add): Merge conflict in drivers/mfd/tmio_core.c
CONFLICT (content): Merge conflict in drivers/mmc/host/tmio_mmc.c
CONFLICT (content): Merge conflict in drivers/mmc/host/tmio_mmc.h
CONFLICT (content): Merge conflict in include/linux/mfd/tmio.h
Merging input/next
Merging lsm/for-next
Merging block/for-next
Merging quilt/device-mapper
Merging embedded/master
Merging firmware/master
Merging pcmcia/master
Merging battery/master
Merging leds/for-mm
Merging backlight/for-mm
Merging kgdb/kgdb-next
Merging slab/for-next
Merging uclinux/for-next
Merging md/for-next
Merging mfd/for-next
CONFLICT (content): Merge conflict in drivers/i2c/busses/i2c-isch.c
Merging hdlc/hdlc-next
Merging drm/drm-next
CONFLICT (content): Merge conflict in drivers/gpu/vga/Kconfig
Merging voltage/for-next
Merging security-testing/next
CONFLICT (content): Merge conflict in security/tomoyo/realpath.c
Merging lblnet/master
Merging agp/agp-next
Merging uwb/for-upstream
Merging watchdog/master
Merging bdev/master
Merging dwmw2-iommu/master
Merging cputime/cputime
Merging osd/linux-next
Merging jc_docs/docs-next
Merging nommu/master
Merging trivial/for-next
CONFLICT (content): Merge conflict in arch/arm/mach-u300/include/mach/debug-macro.S
CONFLICT (content): Merge conflict in drivers/net/qlge/qlge_ethtool.c
CONFLICT (content): Merge conflict in drivers/net/qlge/qlge_main.c
CONFLICT (content): Merge conflict in drivers/net/typhoon.c
Merging audit/for-next
Merging quilt/aoe
Merging suspend/linux-next
Merging bluetooth/master
Merging fsnotify/for-next
CONFLICT (content): Merge conflict in fs/notify/inotify/inotify_user.c
CONFLICT (content): Merge conflict in kernel/audit_tree.c
Merging irda/for-next
CONFLICT (content): Merge conflict in drivers/net/irda/irda-usb.c
Merging hwlat/for-linus
CONFLICT (content): Merge conflict in MAINTAINERS
CONFLICT (content): Merge conflict in drivers/misc/Makefile
Merging drbd/for-jens
Merging catalin/for-next
Merging alacrity/linux-next
CONFLICT (content): Merge conflict in include/linux/Kbuild
CONFLICT (content): Merge conflict in lib/Kconfig
Merging i7core_edac/linux_next
Merging devicetree/next-devicetree
Merging spi/next-spi
Merging limits/writable_limits
CONFLICT (content): Merge conflict in arch/x86/ia32/ia32entry.S
CONFLICT (content): Merge conflict in arch/x86/include/asm/unistd_32.h
CONFLICT (content): Merge conflict in arch/x86/include/asm/unistd_64.h
CONFLICT (content): Merge conflict in arch/x86/kernel/syscall_table_32.S
Merging omap_dss2/for-next
Merging als/for-next
Merging tip/auto-latest
Merging edac-amd/for-next
Merging oprofile/for-next
Merging percpu/for-next
Applying: slab: update for percpu API change
Merging workqueues/for-next
Merging sfi/sfi-test
Merging asm-generic/next
Merging hwpoison/hwpoison
Merging sysctl/master
Merging quilt/driver-core
CONFLICT (content): Merge conflict in drivers/base/power/main.c
CONFLICT (content): Merge conflict in drivers/pcmcia/ds.c
CONFLICT (content): Merge conflict in include/linux/device.h
Applying: i2c: update for semaphore to mutex conversion of devices
Applying: sysfs: fix for thinko with sysfs_bin_attr_init()
Merging quilt/tty
Merging quilt/usb
CONFLICT (content): Merge conflict in arch/arm/mach-mx2/devices.c
CONFLICT (content): Merge conflict in arch/arm/mach-mx2/devices.h
CONFLICT (content): Merge conflict in drivers/usb/early/ehci-dbgp.c
CONFLICT (content): Merge conflict in drivers/usb/storage/scsiglue.c
Merging quilt/staging
CONFLICT (content): Merge conflict in drivers/staging/rtl8192su/ieee80211/ieee80211_rx.c
CONFLICT (delete/modify): drivers/staging/sm7xx/smtc2d.c deleted in quilt/staging and modified in HEAD. Version HEAD of drivers/staging/sm7xx/smtc2d.c left in tree.
CONFLICT (delete/modify): drivers/staging/sm7xx/smtc2d.h deleted in quilt/staging and modified in HEAD. Version HEAD of drivers/staging/sm7xx/smtc2d.h left in tree.
$ git rm -f drivers/staging/sm7xx/smtc2d.c drivers/staging/sm7xx/smtc2d.h
Applying: ar9170: fix for driver-core ABI change
Merging scsi-post-merge/master


* -next March 3: Boot failure on x86 (Oops)
  2010-03-03  6:46 linux-next: Tree for March 3 Stephen Rothwell
@ 2010-03-03 15:44 ` Sachin Sant
  2010-03-04  1:28   ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Sachin Sant @ 2010-03-03 15:44 UTC
  To: linux-next; +Cc: LKML, Tejun Heo

Today's next failed to boot on an x86 box with the following trace:

Unpacking initramfs...
Freeing initrd memory: 10584k freed
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e
*pdpt = 00000000005dd001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
last sysfs file:
Modules linked in:

Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100303 #1 /eserver xSeries 235 -[86717AX]-
EIP: 0060:[<c01a6b12>] EFLAGS: 00010002 CPU: 1
EIP is at pcpu_alloc+0x1cb/0x75e
EAX: 00000000 EBX: c05c4100 ECX: cccccccc EDX: 00000000
ESI: 000000b0 EDI: 00000005 EBP: f5c69fa8 ESP: f5c69f2c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=f5c68000 task=f5c66ce0 task.ti=f5c68000)
Stack:
 00000000 00000005 61746164 0a383d29 5f656400 61746164 00000004 000000b0
<0> 000a3036 00000286 61637061 29656863 0a36313d c0579100 c058da30 f5c69f94
<0> c01310be c05c41cc 00000028 c058da30 f5c69fa4 c0116a73 c0492091 c0492089
Call Trace:
 [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
 [<c01310be>] ? log_buf_kexec_setup+0x3f/0x67
 [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
 [<c0116a73>] ? arch_crash_save_vmcoreinfo+0x37/0x3c
 [<c058d9ff>] ? crash_notes_memory_init+0x0/0x31
 [<c01a70be>] ? __alloc_percpu+0xa/0xc
 [<c058da11>] ? crash_notes_memory_init+0x12/0x31
 [<c0101139>] ? do_one_initcall+0x4c/0x131
 [<c057b352>] ? kernel_init+0x127/0x178
 [<c057b22b>] ? kernel_init+0x0/0x178
 [<c0102df6>] ? kernel_thread_helper+0x6/0x10
Code: 45 a8 e9 65 ff ff ff 8b 4d 9c 8b 55 a0 8b 45 84 e8 31 fa ff ff 85 c0 89 45 a4 0f 89 fd 00 00 00 8b 45 84 8b 00 89 45 84 8b 55 84 <8b> 02 0f 18 00 90 8b 45 cc 03 05 a0 9b 57 c0 39 c2 0f 85 67 ff
EIP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e SS:ESP 0068:f5c69f2c
CR2: 0000000000000000
---[ end trace a7919e7f17c0a725 ]---

x86_64 boots fine. Have attached dmesg log.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------


[-- Attachment #2: next-0303-log --]
[-- Type: text/plain, Size: 13585 bytes --]

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.33-autotest-next-20100303 (root@x235b) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Wed Mar 3 13:56:30 IST 2010
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
 BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000005ffd8740 (usable)
 BIOS-e820: 000000005ffd8740 - 000000005ffe0000 (ACPI data)
 BIOS-e820: 000000005ffe0000 - 0000000060000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
DMI 2.3 present.
last_pfn = 0x5ffd8 max_arch_pfn = 0x1000000
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
found SMP MP-table at [c009c140] 9c140
init_memory_mapping: 0000000000000000-00000000371fe000
RAMDISK: 37599000 - 37fef152
Allocated new RAMDISK: 0077b000 - 011d1152
Move RAMDISK from 0000000037599000 - 0000000037fef151 to 0077b000 - 011d1151
ACPI: RSDP 000fdfc0 00014 (v00 IBM   )
ACPI: RSDT 5ffdff80 00030 (v01 IBM    SERONYXP 00001000 IBM  45444F43)
ACPI: FACP 5ffdff00 00074 (v01 IBM    SERONYXP 00001000 IBM  45444F43)
ACPI: DSDT 5ffd8740 075D2 (v01 IBM    SERGEODE 00001000 MSFT 0100000B)
ACPI: FACS 5ffdfe00 00040
ACPI: APIC 5ffdfe40 00092 (v01 IBM    SERONYXP 00001000 IBM  45444F43)
ACPI: ASF! 5ffdfd80 0004B (v16 IBM    SERONYXP 00000001 IBM  45444F43)
Node: 0, start_pfn: 0, end_pfn: 5ffd8
Reserving total of e00 pages for numa KVA remap
kva_start_pfn ~ 36200 max_low_pfn ~ 371fe
max_pfn = 5ffd8
653MB HIGHMEM available.
881MB LOWMEM available.
  mapped low ram: 0 - 371fe000
  low ram: 0 - 371fe000
Zone PFN ranges:
  DMA      0x00000001 -> 0x00001000
  Normal   0x00001000 -> 0x000371fe
  HighMem  0x000371fe -> 0x0005ffd8
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
    0: 0x00000001 -> 0x0000009c
    0: 0x00000100 -> 0x0005f000
    0: 0x0005fe00 -> 0x0005ffd8
Using APIC driver default
ACPI: PM-Timer IO Port: 0x488
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-15
ACPI: IOAPIC (id[0x0d] address[0xfec01000] gsi_base[16])
IOAPIC[1]: apic_id 13, version 17, address 0xfec01000, GSI 16-31
ACPI: IOAPIC (id[0x0c] address[0xfec02000] gsi_base[32])
IOAPIC[2]: apic_id 12, version 17, address 0xfec02000, GSI 32-47
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Using ACPI (MADT) for SMP configuration information
SMP: Allowing 4 CPUs, 0 hotplug CPUs
PM: Registered nosave memory: 000000000009c000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
Allocating PCI resources starting at 60000000 (gap: 60000000:9ec00000)
setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:4 nr_node_ids:8
PERCPU: Embedded 13 pages/cpu @c1200000 s29140 r0 d24108 u524288
pcpu-alloc: s29140 r0 d24108 u524288 alloc=1*2097152
pcpu-alloc: [0] 0 1 2 3 
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 386419
Policy zone: HighMem
Kernel command line: root=/dev/sda2 console=tty0 console=ttyS0,9600n1 IDENT=1267607421
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
allocated 7863500 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
Subtract (48 early reservations)
  #1 [0000001000 - 0000002000]   EX TRAMPOLINE
  #2 [0000100000 - 00007731c4]   TEXT DATA BSS
  #3 [0000774000 - 000077a1d8]             BRK
  #4 [000009c000 - 000009c140]   BIOS reserved
  #5 [000009c140 - 000009c150]    MP-table mpf
  #6 [000009c150 - 000009d0a0]   BIOS reserved
  #7 [000009d22c - 0000100000]   BIOS reserved
  #8 [000009d0a0 - 000009d22c]    MP-table mpc
  #9 [0000002000 - 0000003000]      TRAMPOLINE
  #10 [0000003000 - 0000007000]     ACPI WAKEUP
  #11 [0000007000 - 000000e000]         PGTABLE
  #12 [000077b000 - 00011d1152]     NEW RAMDISK
  #13 [005f000000 - 005fe00000]         KVA RAM
  #14 [0036200000 - 0037000000]          KVA PG
  #15 [00011d2000 - 00011d3000]         BOOTMEM
  #16 [00011d1180 - 00011d1184]         BOOTMEM
  #17 [00011d11c0 - 00011d1280]         BOOTMEM
  #18 [00011d1280 - 00011d1324]         BOOTMEM
  #19 [00011d3000 - 00011d6000]         BOOTMEM
  #20 [00011d1340 - 00011d13bc]         BOOTMEM
  #21 [00011d6000 - 00011d9000]         BOOTMEM
  #22 [00011d13c0 - 00011d144d]         BOOTMEM
  #23 [00011d1480 - 00011d15a0]         BOOTMEM
  #24 [00011d15c0 - 00011d1600]         BOOTMEM
  #25 [00011d1600 - 00011d1640]         BOOTMEM
  #26 [00011d1640 - 00011d1680]         BOOTMEM
  #27 [00011d1680 - 00011d16c0]         BOOTMEM
  #28 [00011d16c0 - 00011d1700]         BOOTMEM
  #29 [00011d1700 - 00011d1740]         BOOTMEM
  #30 [00011d1740 - 00011d1780]         BOOTMEM
  #31 [00011d1780 - 00011d1790]         BOOTMEM
  #32 [00011d17c0 - 00011d1802]         BOOTMEM
  #33 [00011d1840 - 00011d1882]         BOOTMEM
  #34 [0001200000 - 000120d000]         BOOTMEM
  #35 [0001280000 - 000128d000]         BOOTMEM
  #36 [0001300000 - 000130d000]         BOOTMEM
  #37 [0001380000 - 000138d000]         BOOTMEM
  #38 [00011d18c0 - 00011d18c4]         BOOTMEM
  #39 [00011d1900 - 00011d1904]         BOOTMEM
  #40 [00011d1940 - 00011d1950]         BOOTMEM
  #41 [00011d1980 - 00011d1990]         BOOTMEM
  #42 [00011d19c0 - 00011d1a58]         BOOTMEM
  #43 [00011d1a80 - 00011d1ab8]         BOOTMEM
  #44 [00011d9000 - 00011dd000]         BOOTMEM
  #45 [000138d000 - 000140d000]         BOOTMEM
  #46 [000120d000 - 000124d000]         BOOTMEM
  #47 [000140d000 - 0001b8cccc]         BOOTMEM
Initializing HighMem for node 0 (000371fe:0005ffd8)
Memory: 1517652k/1572704k available (2744k kernel code, 40312k reserved, 1842k data, 356k init, 655208k highmem)
virtual kernel memory layout:
    fixmap  : 0xff5b6000 - 0xfffff000   (10532 kB)
    pkmap   : 0xff200000 - 0xff400000   (2048 kB)
    vmalloc : 0xf79fe000 - 0xff1fe000   ( 120 MB)
    lowmem  : 0xc0000000 - 0xf71fe000   ( 881 MB)
      .init : 0xc057b000 - 0xc05d4000   ( 356 kB)
      .data : 0xc03ae3e8 - 0xc057ad88   (1842 kB)
      .text : 0xc0100000 - 0xc03ae3e8   (2744 kB)
Checking if this processor honours the WP bit even in supervisor mode...Ok.
Hierarchical RCU implementation.
RCU-based detection of stalled CPUs is enabled.
NR_IRQS:2304
Console: colour VGA+ 80x25
console [tty0] enabled
console [ttyS0] enabled
Fast TSC calibration using PIT
Detected 2793.676 MHz processor.
Calibrating delay loop (skipped), value calculated using timer frequency.. 5587.35 BogoMIPS (lpj=11174704)
Security Framework initialized
SELinux:  Disabled at boot.
Mount-cache hash table entries: 512
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 4 MCE banks
CPU0: Thermal monitoring enabled (TM1)
Performance Events: no PMU driver, software events only.
Checking 'hlt' instruction... OK.
ACPI: Core revision 20100121
Enabling APIC mode:  Flat.  Using 3 I/O APICs
Mapping cpu 0 to node 0
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Xeon(TM) CPU 2.80GHz stepping 07
Booting Node   0, Processors  #1
Initializing CPU#1
Mapping cpu 1 to node 0
 #2
Initializing CPU#2
Mapping cpu 2 to node 0
 #3 Ok.
Initializing CPU#3
Mapping cpu 3 to node 0
Brought up 4 CPUs
Total of 4 processors activated (22352.10 BogoMIPS).
devtmpfs: initialized
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfd7bc, last bus=10
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: Interpreter enabled
ACPI: (supports S0 S4 S5)
ACPI: Using IOAPIC for interrupt routing
PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug
ACPI: PCI Root Bridge [PCI0] (0000:00)
ACPI: PCI Root Bridge [PCI1] (0000:02)
ACPI: PCI Root Bridge [PCI2] (0000:05)
ACPI: PCI Root Bridge [PCI3] (0000:07)
ACPI: PCI Root Bridge [PCI4] (0000:09)
vgaarb: device added: PCI:0000:00:09.0,decodes=io+mem,owns=io+mem,locks=none
vgaarb: loaded
PCI: Using ACPI for IRQ routing
Switching to clocksource tsc
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 17 devices
ACPI: ACPI bus type pnp unregistered
PnPBIOS: Disabled by ACPI PNP
system 00:01: [io  0x0900-0x093f] has been reserved
system 00:01: [io  0x0510-0x0517] has been reserved
system 00:01: [io  0x0504-0x0507] has been reserved
system 00:01: [io  0x0500-0x0503] has been reserved
system 00:01: [io  0x0520-0x053f] has been reserved
system 00:01: [io  0x0420-0x0427] has been reserved
system 00:01: [io  0x0460-0x0461] has been reserved
system 00:0c: [io  0x01ec-0x01ef] has been reserved
system 00:0c: [io  0x0400-0x04fe] could not be reserved
system 00:0c: [io  0x0600] has been reserved
system 00:0c: [io  0x0800-0x080f] has been reserved
system 00:0c: [io  0x0c00-0x0cfe] could not be reserved
system 00:0c: [io  0x0f50-0x0f58] has been reserved
system 00:0c: [mem 0xfec00000-0xffffffff] could not be reserved
pci 0000:00:09.0: BAR 6: assigned [mem 0x60000000-0x6001ffff pref]
pci 0000:05:03.0: BAR 6: assigned [mem 0x60020000-0x60027fff pref]
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
Unpacking initramfs...
Freeing initrd memory: 10584k freed
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e
*pdpt = 00000000005dd001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 
last sysfs file: 
Modules linked in:

Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100303 #1 /eserver xSeries 235 -[86717AX]-
EIP: 0060:[<c01a6b12>] EFLAGS: 00010002 CPU: 1
EIP is at pcpu_alloc+0x1cb/0x75e
EAX: 00000000 EBX: c05c4100 ECX: cccccccc EDX: 00000000
ESI: 000000b0 EDI: 00000005 EBP: f5c69fa8 ESP: f5c69f2c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=f5c68000 task=f5c66ce0 task.ti=f5c68000)
Stack:
 00000000 00000005 61746164 0a383d29 5f656400 61746164 00000004 000000b0
<0> 000a3036 00000286 61637061 29656863 0a36313d c0579100 c058da30 f5c69f94
<0> c01310be c05c41cc 00000028 c058da30 f5c69fa4 c0116a73 c0492091 c0492089
Call Trace:
 [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
 [<c01310be>] ? log_buf_kexec_setup+0x3f/0x67
 [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
 [<c0116a73>] ? arch_crash_save_vmcoreinfo+0x37/0x3c
 [<c058d9ff>] ? crash_notes_memory_init+0x0/0x31
 [<c01a70be>] ? __alloc_percpu+0xa/0xc
 [<c058da11>] ? crash_notes_memory_init+0x12/0x31
 [<c0101139>] ? do_one_initcall+0x4c/0x131
 [<c057b352>] ? kernel_init+0x127/0x178
 [<c057b22b>] ? kernel_init+0x0/0x178
 [<c0102df6>] ? kernel_thread_helper+0x6/0x10
Code: 45 a8 e9 65 ff ff ff 8b 4d 9c 8b 55 a0 8b 45 84 e8 31 fa ff ff 85 c0 89 45 a4 0f 89 fd 00 00 00 8b 45 84 8b 00 89 45 84 8b 55 84 <8b> 02 0f 18 00 90 8b 45 cc 03 05 a0 9b 57 c0 39 c2 0f 85 67 ff 
EIP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e SS:ESP 0068:f5c69f2c
CR2: 0000000000000000
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D    2.6.33-autotest-next-20100303 #1
Call Trace:
 [<c03a72e2>] ? printk+0xf/0x15
 [<c03a7227>] panic+0x39/0xe5
 [<c0132dae>] do_exit+0x5c/0x5e3
 [<c0130a7c>] ? kmsg_dump+0xe4/0xf8
 [<c012fddb>] ? oops_exit+0x2a/0x2f
 [<c03aa5a1>] oops_end+0x93/0x9b
 [<c0119f8e>] no_context+0x13f/0x149
 [<c0289edf>] ? number+0x11f/0x1da
 [<c011a0f8>] __bad_area_nosemaphore+0x160/0x168
 [<c028a4b3>] ? string+0x33/0x81
 [<c028b3f1>] ? vsnprintf+0x4fe/0x6d1
 [<c011a10d>] bad_area_nosemaphore+0xd/0x10
 [<c03abf1c>] do_page_fault+0x199/0x2f7
 [<c03abd83>] ? do_page_fault+0x0/0x2f7
 [<c03a9c86>] error_code+0x66/0x70
 [<c015007b>] ? sys_futex+0x93/0xff
 [<c03abd83>] ? do_page_fault+0x0/0x2f7
 [<c01a6b12>] ? pcpu_alloc+0x1cb/0x75e
 [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
 [<c01310be>] ? log_buf_kexec_setup+0x3f/0x67
 [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
 [<c0116a73>] ? arch_crash_save_vmcoreinfo+0x37/0x3c
 [<c058d9ff>] ? crash_notes_memory_init+0x0/0x31
 [<c01a70be>] __alloc_percpu+0xa/0xc
 [<c058da11>] crash_notes_memory_init+0x12/0x31
 [<c0101139>] do_one_initcall+0x4c/0x131
 [<c057b352>] kernel_init+0x127/0x178
 [<c057b22b>] ? kernel_init+0x0/0x178
 [<c0102df6>] kernel_thread_helper+0x6/0x10



* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-03 15:44 ` -next March 3: Boot failure on x86 (Oops) Sachin Sant
@ 2010-03-04  1:28   ` Tejun Heo
  2010-03-04  5:23     ` Sachin Sant
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2010-03-04  1:28 UTC
  To: Sachin Sant; +Cc: linux-next, LKML

On 03/04/2010 12:44 AM, Sachin Sant wrote:
> Today's next failed to boot on an x86 box with the following trace:
> 
> Unpacking initramfs...
> Freeing initrd memory: 10584k freed
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e
> *pdpt = 00000000005dd001 *pde = 0000000000000000
> Oops: 0000 [#1] SMP
> last sysfs file:
> Modules linked in:
> 
> Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100303 #1
> /eserver xSeries 235 -[86717AX]-
> EIP: 0060:[<c01a6b12>] EFLAGS: 00010002 CPU: 1
> EIP is at pcpu_alloc+0x1cb/0x75e
> EAX: 00000000 EBX: c05c4100 ECX: cccccccc EDX: 00000000
> ESI: 000000b0 EDI: 00000005 EBP: f5c69fa8 ESP: f5c69f2c
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process swapper (pid: 1, ti=f5c68000 task=f5c66ce0 task.ti=f5c68000)
> Stack:
> 00000000 00000005 61746164 0a383d29 5f656400 61746164 00000004 000000b0
> <0> 000a3036 00000286 61637061 29656863 0a36313d c0579100 c058da30 f5c69f94
> <0> c01310be c05c41cc 00000028 c058da30 f5c69fa4 c0116a73 c0492091 c0492089
> Call Trace:
> [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
> [<c01310be>] ? log_buf_kexec_setup+0x3f/0x67
> [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
> [<c0116a73>] ? arch_crash_save_vmcoreinfo+0x37/0x3c
> [<c058d9ff>] ? crash_notes_memory_init+0x0/0x31
> [<c01a70be>] ? __alloc_percpu+0xa/0xc
> [<c058da11>] ? crash_notes_memory_init+0x12/0x31
> [<c0101139>] ? do_one_initcall+0x4c/0x131
> [<c057b352>] ? kernel_init+0x127/0x178
> [<c057b22b>] ? kernel_init+0x0/0x178
> [<c0102df6>] ? kernel_thread_helper+0x6/0x10
> Code: 45 a8 e9 65 ff ff ff 8b 4d 9c 8b 55 a0 8b 45 84 e8 31 fa ff ff 85
> c0 89 45 a4 0f 89 fd 00 00 00 8b 45 84 8b 00 89 45 84 8b 55 84 <8b> 02
> 0f 18 00 90 8b 45 cc 03 05 a0 9b 57 c0 39 c2 0f 85 67 ff
> EIP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e SS:ESP 0068:f5c69f2c
> CR2: 0000000000000000
> ---[ end trace a7919e7f17c0a725 ]---
> 
> x86_64 boots fine. Have attached dmesg log.

Can you please feed the address to gdb and get the line number?  Also,
is it reproducible on mainline?

Thanks.

-- 
tejun
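
For readers following along, the lookup requested above can be sketched
as follows.  This is a minimal illustration and not part of the original
thread: it assumes a vmlinux built from the same tree with
CONFIG_DEBUG_INFO enabled, and the helper names parse_oops_ip and
gdb_list_command are hypothetical.  It splits the oops IP line into its
parts and prints an expression that could be pasted into a gdb session
started with "gdb vmlinux".

```python
import re

# Matches oops lines such as "IP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e".
OOPS_IP_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>[\w.]+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_oops_ip(line):
    """Split an oops IP line into address, symbol, offset and function size."""
    m = OOPS_IP_RE.search(line)
    if m is None:
        raise ValueError("no [<addr>] symbol+0xoff/0xsize pattern in: %r" % line)
    addr = int(m.group("addr"), 16)
    off = int(m.group("off"), 16)
    return {
        "addr": addr,
        "symbol": m.group("sym"),
        "offset": off,
        "size": int(m.group("size"), 16),
        # Start of the function: faulting address minus the offset into it.
        "func_start": addr - off,
    }


def gdb_list_command(info):
    """Build the gdb command that lists source around the faulting address."""
    return "list *(%s+0x%x)" % (info["symbol"], info["offset"])


# The IP line from the oops reported in this thread.
info = parse_oops_ip("IP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e")
print(gdb_list_command(info))
print("function starts at 0x%x" % info["func_start"])
```

In gdb, "list *(symbol+offset)" shows the source lines around that
address when debug info is present.  Passing the raw address instead
("list *0xc01a6b12") should give the same result as long as the vmlinux
matches the kernel that produced the oops.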


* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-04  1:28   ` Tejun Heo
@ 2010-03-04  5:23     ` Sachin Sant
  2010-03-05  6:08       ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Sachin Sant @ 2010-03-04  5:23 UTC
  To: Tejun Heo; +Cc: linux-next, LKML

Tejun Heo wrote:
> On 03/04/2010 12:44 AM, Sachin Sant wrote:
>   
>> Today's next failed to boot on an x86 box with the following trace:
>>
>> Unpacking initramfs...
>> Freeing initrd memory: 10584k freed
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e
>> *pdpt = 00000000005dd001 *pde = 0000000000000000
>> Oops: 0000 [#1] SMP
>> last sysfs file:
>> Modules linked in:
>>
>> Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100303 #1
>> /eserver xSeries 235 -[86717AX]-
>> EIP: 0060:[<c01a6b12>] EFLAGS: 00010002 CPU: 1
>> EIP is at pcpu_alloc+0x1cb/0x75e
>> EAX: 00000000 EBX: c05c4100 ECX: cccccccc EDX: 00000000
>> ESI: 000000b0 EDI: 00000005 EBP: f5c69fa8 ESP: f5c69f2c
>> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> Process swapper (pid: 1, ti=f5c68000 task=f5c66ce0 task.ti=f5c68000)
>> Stack:
>> 00000000 00000005 61746164 0a383d29 5f656400 61746164 00000004 000000b0
>> <0> 000a3036 00000286 61637061 29656863 0a36313d c0579100 c058da30 f5c69f94
>> <0> c01310be c05c41cc 00000028 c058da30 f5c69fa4 c0116a73 c0492091 c0492089
>> Call Trace:
>> [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
>> [<c01310be>] ? log_buf_kexec_setup+0x3f/0x67
>> [<c058da30>] ? crash_save_vmcoreinfo_init+0x0/0x31a
>> [<c0116a73>] ? arch_crash_save_vmcoreinfo+0x37/0x3c
>> [<c058d9ff>] ? crash_notes_memory_init+0x0/0x31
>> [<c01a70be>] ? __alloc_percpu+0xa/0xc
>> [<c058da11>] ? crash_notes_memory_init+0x12/0x31
>> [<c0101139>] ? do_one_initcall+0x4c/0x131
>> [<c057b352>] ? kernel_init+0x127/0x178
>> [<c057b22b>] ? kernel_init+0x0/0x178
>> [<c0102df6>] ? kernel_thread_helper+0x6/0x10
>> Code: 45 a8 e9 65 ff ff ff 8b 4d 9c 8b 55 a0 8b 45 84 e8 31 fa ff ff 85
>> c0 89 45 a4 0f 89 fd 00 00 00 8b 45 84 8b 00 89 45 84 8b 55 84 <8b> 02
>> 0f 18 00 90 8b 45 cc 03 05 a0 9b 57 c0 39 c2 0f 85 67 ff
>> EIP: [<c01a6b12>] pcpu_alloc+0x1cb/0x75e SS:ESP 0068:f5c69f2c
>> CR2: 0000000000000000
>> ---[ end trace a7919e7f17c0a725 ]---
>>
>> x86_64 boots fine. Have attached dmesg log.
>>     
>
> Can you please feed the address to gdb and get the line number?  Also,
> is it reproducible on mainline?
>   
I can recreate this with the latest git as well (2.6.33-git9 [eaa5eec7..])

Disassembly from the 2.6.33-git9 code base follows:

/usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1137
                        if (off >= 0)
     e91:       0f 89 fd 00 00 00       jns    f94 <pcpu_alloc+0x2bd>
/usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1116
        }

restart:
        /* search through normal chunks */
        for (slot = pcpu_size_to_slot(size); slot < pcpu_nr_slots; slot++) {
                list_for_each_entry(chunk, &pcpu_slot[slot], list) {
     e97:       8b 45 84                mov    -0x7c(%ebp),%eax
     e9a:       8b 00                   mov    (%eax),%eax
     e9c:       89 45 84                mov    %eax,-0x7c(%ebp)
prefetch():
/usr/local/autobench/var/tmp/build/linux/arch/x86/include/asm/processor.h:886
     e9f:       8b 55 84                mov    -0x7c(%ebp),%edx
     ea2:       8b 02                   mov    (%edx),%eax

^^^^^^^^^^^^^^^^^^^ EIP corresponds to this line

     ea4:       8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
pcpu_alloc():
/usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1116
     ea8:       8b 45 cc                mov    -0x34(%ebp),%eax
     eab:       03 05 30 00 00 00       add    0x30,%eax
     eb1:       39 c2                   cmp    %eax,%edx

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------



* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-04  5:23     ` Sachin Sant
@ 2010-03-05  6:08       ` Tejun Heo
  2010-03-05  6:09         ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2010-03-05  6:08 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linux-next, LKML

Hello,

On 03/04/2010 02:23 PM, Sachin Sant wrote:
>> Can you please feed the address to gdb and get the line number?  Also,
>> is it reproducible on mainline?
>>   
> I can recreate this with latest git as well (2.6.33-git9 [eaa5eec7..])
> 
> Disassembly from 2.6.33-git9 code base follows :
> 
> /usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1137
>                        if (off >= 0)
>     e91:       0f 89 fd 00 00 00       jns    f94 <pcpu_alloc+0x2bd>
> /usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1116
>        }
> 
> restart:
>        /* search through normal chunks */
>        for (slot = pcpu_size_to_slot(size); slot < pcpu_nr_slots; slot++) {
>                list_for_each_entry(chunk, &pcpu_slot[slot], list) {
>     e97:       8b 45 84                mov    -0x7c(%ebp),%eax
>     e9a:       8b 00                   mov    (%eax),%eax
>     e9c:       89 45 84                mov    %eax,-0x7c(%ebp)
> prefetch():
> /usr/local/autobench/var/tmp/build/linux/arch/x86/include/asm/processor.h:886
> 
>     e9f:       8b 55 84                mov    -0x7c(%ebp),%edx
>     ea2:       8b 02                   mov    (%edx),%eax
> 
> ^^^^^^^^^^^^^^^^^^^ EIP corresponds to this line

Hmmm... this means that on one of the chunks, chunk->list.next was
NULL (BTW, the disassembly is from an unlinked object, right?).  The main
allocation code hasn't seen much change lately.  The only changes are:

22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new
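
For anyone following along, here is a minimal userspace sketch of why a
NULL ->next turns into a fault at address 0 during the chunk walk.  The
struct list_head stand-in, struct chunk and walk_chunks() are
illustrative names, not code from mm/percpu.c.  The kernel loop is
list_for_each_entry(), which follows ->next without any such check.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's circular, doubly linked list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical cut-down chunk: only the list link matters here. */
struct chunk {
	struct list_head list;
	int free_size;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Walk the list in the same order list_for_each_entry() would, but test
 * each link before following it.  Returns the number of entries visited,
 * or -1 when a NULL link is reached, which is the point where an
 * unchecked loop would load from address 0.
 */
static int walk_chunks(struct list_head *head)
{
	struct list_head *pos = head->next;
	int n = 0;

	while (pos != head) {
		if (pos == NULL)
			return -1;	/* the previous entry's ->next was NULL */
		n++;
		pos = pos->next;
	}
	return n;
}
```

With two healthy entries the walk returns 2.  Clearing the first
entry's ->next makes it return -1 on the second step, which matches
EDX being 0 at the faulting load in the oops.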

Another possibility is that the data structure before it was
overrun and corrupted the list part.  pcpu_chunk is allocated with a
variable-size array attached at the end, so maybe I screwed up the
calculation somewhere?  This could explain the difference between 64
and 32 bits.  If you add padding at the head of struct pcpu_chunk, say,
unsigned long pad[16], does the problem go away?
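
To make the padding idea concrete, a rough sketch follows.  The two
structs are hypothetical, cut-down layouts (list link first, a
variable-size array last), not the real struct pcpu_chunk.  They only
show how far an overrun from the preceding object would have to reach
before it touches ->list.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical layout: the list link is the very first member. */
struct chunk_plain {
	struct list_head list;
	int map_used;
	int map_alloc;
	unsigned long tail[];	/* variable-size array at the end */
};

/* Same layout with the debugging pad placed in front of the link. */
struct chunk_padded {
	unsigned long pad[16];
	struct list_head list;
	int map_used;
	int map_alloc;
	unsigned long tail[];
};

/* Bytes an overrun from the previous object must cross to reach ->list. */
static size_t bytes_before_list_plain(void)
{
	return offsetof(struct chunk_plain, list);
}

static size_t bytes_before_list_padded(void)
{
	return offsetof(struct chunk_padded, list);
}
```

In the plain layout the distance is 0, so a one-entry overrun lands
directly on list.next.  With the pad it is 16 * sizeof(unsigned long),
so a small overrun would scribble on the pad instead.  If the oops then
disappears, that points at an overrun rather than at the list code.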

Thanks.

-- 
tejun


* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05  6:08       ` Tejun Heo
@ 2010-03-05  6:09         ` Tejun Heo
  2010-03-05  6:17           ` Tejun Heo
  2010-03-05 10:44           ` Sachin Sant
  0 siblings, 2 replies; 12+ messages in thread
From: Tejun Heo @ 2010-03-05  6:09 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linux-next, LKML

On 03/05/2010 03:08 PM, Tejun Heo wrote:
> Hmmm... this means that on one of the chunks, chunk->list.next was
> NULL (BTW, the disassembly is from unlinked object, right?).  The main
> allocation code hasn't seen much change lately.  The only changes are,
> 
> 22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
> 833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new

Can you also please try reverting the above two commits?

Thanks.

-- 
tejun


* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05  6:09         ` Tejun Heo
@ 2010-03-05  6:17           ` Tejun Heo
  2010-03-05  7:47             ` Sachin Sant
  2010-03-05 10:44           ` Sachin Sant
  1 sibling, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2010-03-05  6:17 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linux-next, LKML

Hello,

On 03/05/2010 03:09 PM, Tejun Heo wrote:
> On 03/05/2010 03:08 PM, Tejun Heo wrote:
>> Hmmm... this means that on one of the chunks, chunk->list.next was
>> NULL (BTW, the disassembly is from unlinked object, right?).  The main
>> allocation code hasn't seen much change lately.  The only changes are,
>>
>> 22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
>> 833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new
> 
> Can you also please try reverting the above two commits?

Sorry about all the fuss but I think this could be it.  It looks like
I forgot to update the need_to_extend logic while adding the simultaneous
head/tail split for alignment, so the array might be overrun by one
entry.  Can you please try this one first?

Thanks.

diff --git a/mm/percpu.c b/mm/percpu.c
index 768419d..f1ed9ea 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -373,11 +373,11 @@ static int pcpu_need_to_extend(struct pcpu_chunk *chunk)
 {
 	int new_alloc;
 
-	if (chunk->map_alloc >= chunk->map_used + 2)
+	if (chunk->map_alloc >= chunk->map_used + 3)
 		return 0;
 
 	new_alloc = PCPU_DFL_MAP_ALLOC;
-	while (new_alloc < chunk->map_used + 2)
+	while (new_alloc < chunk->map_used + 3)
 		new_alloc *= 2;
 
 	return new_alloc;
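
For reference, a standalone sketch of the sizing logic the patch
touches, with the headroom passed in as a parameter so the behaviour
with 2 and with 3 can be compared side by side.  need_to_extend() and
its arguments are illustrative names, and the starting size of 16 for
PCPU_DFL_MAP_ALLOC is an assumption here, not taken from this thread.

```c
/* Assumed initial map size; the real value lives in mm/percpu.c. */
#define PCPU_DFL_MAP_ALLOC 16

/*
 * Return 0 if a map with map_alloc entries, map_used of them in use,
 * still has `headroom` spare entries.  Otherwise return the new size:
 * the default size doubled until it covers map_used + headroom.
 */
static int need_to_extend(int map_alloc, int map_used, int headroom)
{
	int new_alloc;

	if (map_alloc >= map_used + headroom)
		return 0;

	new_alloc = PCPU_DFL_MAP_ALLOC;
	while (new_alloc < map_used + headroom)
		new_alloc *= 2;

	return new_alloc;
}
```

The two variants only differ at the boundary.  With 16 entries
allocated and 14 used, a headroom of 2 reports that no extension is
needed, while a headroom of 3 asks for 32.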

-- 
tejun


* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05  6:17           ` Tejun Heo
@ 2010-03-05  7:47             ` Sachin Sant
  2010-03-05 13:25               ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Sachin Sant @ 2010-03-05  7:47 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-next, LKML

Tejun Heo wrote:
> Hello,
>
> On 03/05/2010 03:09 PM, Tejun Heo wrote:
>   
>> On 03/05/2010 03:08 PM, Tejun Heo wrote:
>>     
>>> Hmmm... this means that on one of the chunks, chunk->list.next was
>>> NULL (BTW, the disassembly is from unlinked object, right?).  The main
>>> allocation code hasn't seen much change lately.  The only changes are,
>>>
>>> 22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
>>> 833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new
>>>       
>> Can you also please try reverting the above two commits?
>>     
>
> Sorry about all the fuss but I think this could be it.  It looks like
> I forgot to update need_to_extend logic while adding simultaneous
> head/tail split for alignment, so the array might be overrun by one
> entry.  Can you please try this one first?
>
> Thanks.
>
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 768419d..f1ed9ea 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -373,11 +373,11 @@ static int pcpu_need_to_extend(struct pcpu_chunk *chunk)
>  {
>  	int new_alloc;
>
> -	if (chunk->map_alloc >= chunk->map_used + 2)
> +	if (chunk->map_alloc >= chunk->map_used + 3)
>  		return 0;
>
>  	new_alloc = PCPU_DFL_MAP_ALLOC;
> -	while (new_alloc < chunk->map_used + 2)
> +	while (new_alloc < chunk->map_used + 3)
>  		new_alloc *= 2;
>
>  	return new_alloc;
>
>   
This did not help. With this patch applied I ran into the
following:

Unpacking initramfs...
Freeing initrd memory: 11780k freed
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c01df0e2>] pcpu_alloc+0x1cb/0x75e
*pdpt = 0000000000661001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
last sysfs file:
Modules linked in:

Pid: 1, comm: swapper Not tainted 2.6.33-git10-autotest #2 /eserver xSeries 235 -[86717AX]-
EIP: 0060:[<c01df0e2>] EFLAGS: 00010002 CPU: 1
EIP is at pcpu_alloc+0x1cb/0x75e
EAX: 00000000 EBX: c0647a00 ECX: cccccccc EDX: 00000000
ESI: 00000040 EDI: 00000004 EBP: f5c3ff74 ESP: f5c3fef8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=f5c3e000 task=f5c48000 task.ti=f5c3e000)
Stack:
 00000000 00000004 00000002 00000002 00000002 f6227fa0 00000040 00000040
<0> 00000002 00000286 f62294a0 00000000 f5c3ff54 00000202 00000000 f6200400
<0> 00000202 00000000 00000020 f6200000 00000000 f62294a0 00000000 f5c3ff64
Call Trace:
 [<c01b364b>] ? free_hot_page+0x31/0x35
 [<c01df68e>] ? __alloc_percpu+0xa/0xc
 [<c0163c29>] ? __create_workqueue_key+0x74/0x1c8
 [<c05fe3d3>] ? irqfd_module_init+0x0/0x2c
 [<c05fe3ed>] ? irqfd_module_init+0x1a/0x2c
 [<c0101139>] ? do_one_initcall+0x4c/0x131
 [<c05fc352>] ? kernel_init+0x127/0x1a8
 [<c05fc22b>] ? kernel_init+0x0/0x1a8
 [<c01220b6>] ? kernel_thread_helper+0x6/0x10
Code: 45 a8 e9 65 ff ff ff 8b 4d 9c 8b 55 a0 8b 45 84 e8 31 fa ff ff 85 c0 89 45 a4 0f 89 fd 00 00 00 8b 45 84 8b 00 89 45 84 8b 55 84 <8b> 02 0f 18 00 90 8b 45 cc 03 05 a0 9f 5f c0 39 c2 0f 85 67 ff
EIP: [<c01df0e2>] pcpu_alloc+0x1cb/0x75e SS:ESP 0068:f5c3fef8
CR2: 0000000000000000
---[ end trace a7919e7f17c0a725 ]---

I will try reverting the two commits you mentioned and see if that
helps.

Thanks
-Sachin

-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------



* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05  6:09         ` Tejun Heo
  2010-03-05  6:17           ` Tejun Heo
@ 2010-03-05 10:44           ` Sachin Sant
  2010-03-05 13:24             ` Tejun Heo
  1 sibling, 1 reply; 12+ messages in thread
From: Sachin Sant @ 2010-03-05 10:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-next, LKML

Tejun Heo wrote:
> On 03/05/2010 03:08 PM, Tejun Heo wrote:
>   
>> Hmmm... this means that on one of the chunks, chunk->list.next was
>> NULL (BTW, the disassembly is from unlinked object, right?).  The main
>> allocation code hasn't seen much change lately.  The only changes are,
>>
>> 22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
>> 833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new
>>     
>
> Can you also please try reverting the above two commits?
>
> Thanks.
>
>   
Reverting both commits allows the machine to boot.
If I just apply 22b737f4c75197372d64afc6ed1bccd58c00e549 the
box fails to boot with the following kobject-related traces:

registered taskstats version 1
kobject '' (c11d5fdc): tried to add an uninitialized object, something is seriously wrong.
Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100305 #3
Call Trace:
 [<c03a7678>] ? printk+0xf/0x17
 [<c028766f>] kobject_add+0x28/0x49
 [<c05a1a8e>] memmap_init+0x4f/0x89
 [<c05a1a3f>] ? memmap_init+0x0/0x89
 [<c0101139>] do_one_initcall+0x4c/0x131
 [<c057b352>] kernel_init+0x127/0x1a8
 [<c057b22b>] ? kernel_init+0x0/0x1a8
 [<c0102db6>] kernel_thread_helper+0x6/0x10
------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
Hardware name: eserver xSeries 235 -[86717AX]-
kobject: '' (c11d5fdc): is not initialized, yet kobject_put() is being called.
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100305 #3
Call Trace:
 [<c012fe66>] warn_slowpath_common+0x60/0x90
 [<c012feca>] warn_slowpath_fmt+0x24/0x27
 [<c02872c6>] kobject_put+0x27/0x3c
 [<c05a1a9c>] memmap_init+0x5d/0x89
 [<c05a1a3f>] ? memmap_init+0x0/0x89
 [<c0101139>] do_one_initcall+0x4c/0x131
 [<c057b352>] kernel_init+0x127/0x1a8
 [<c057b22b>] ? kernel_init+0x0/0x1a8
 [<c0102db6>] kernel_thread_helper+0x6/0x10
---[ end trace 7b6574301a0037c2 ]---

The results are with today's next, but I think the same applies to
Linus' tree as well.

Thanks
-Sachin



-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------



* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05 10:44           ` Sachin Sant
@ 2010-03-05 13:24             ` Tejun Heo
  2010-03-06  7:39               ` Sachin Sant
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2010-03-05 13:24 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linux-next, LKML

Hello,

On 03/05/2010 07:44 PM, Sachin Sant wrote:
> Tejun Heo wrote:
>> On 03/05/2010 03:08 PM, Tejun Heo wrote:
>>  
>>> Hmmm... this means that on one of the chunks, chunk->list.next was
>>> NULL (BTW, the disassembly is from unlinked object, right?).  The main
>>> allocation code hasn't seen much change lately.  The only changes are,
>>>
>>> 22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
>>> 833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new
>>>     
>>
>> Can you also please try reverting the above two commits?
>>
>> Thanks.
>>
>>   
> Reverting both the commits allows the machine to boot.
> If i just apply 22b737f4c75197372d64afc6ed1bccd58c00e549 the
> box fails to boot with following kobject related traces:
> 
> registered taskstats version 1
> kobject '' (c11d5fdc): tried to add an uninitialized object, something
> is seriously wrong.
> Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100305 #3
> Call Trace:
> [<c03a7678>] ? printk+0xf/0x17
> [<c028766f>] kobject_add+0x28/0x49
> [<c05a1a8e>] memmap_init+0x4f/0x89
> [<c05a1a3f>] ? memmap_init+0x0/0x89
> [<c0101139>] do_one_initcall+0x4c/0x131
> [<c057b352>] kernel_init+0x127/0x1a8
> [<c057b22b>] ? kernel_init+0x0/0x1a8
> [<c0102db6>] kernel_thread_helper+0x6/0x10
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: eserver xSeries 235 -[86717AX]-
> kobject: '' (c11d5fdc): is not initialized, yet kobject_put() is being
> called.
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.33-autotest-next-20100305 #3
> Call Trace:
> [<c012fe66>] warn_slowpath_common+0x60/0x90
> [<c012feca>] warn_slowpath_fmt+0x24/0x27
> [<c02872c6>] kobject_put+0x27/0x3c
> [<c05a1a9c>] memmap_init+0x5d/0x89
> [<c05a1a3f>] ? memmap_init+0x0/0x89
> [<c0101139>] do_one_initcall+0x4c/0x131
> [<c057b352>] kernel_init+0x127/0x1a8
> [<c057b22b>] ? kernel_init+0x0/0x1a8
> [<c0102db6>] kernel_thread_helper+0x6/0x10
> ---[ end trace 7b6574301a0037c2 ]---
> 
> The results are with today's next, but i think same applies to Linus
> tree as well.

I'm having a very difficult time imagining how 22b737f4 could have
affected this as the patch is an identical transformation of the previous
code.  Also, 833af842 was released with 2.6.32 and stayed that way, so
it really looks like a memory overrun / random corruption thing.  Can
you please retry with kmalloc debug stuff turned on?

Thanks.

-- 
tejun


* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05  7:47             ` Sachin Sant
@ 2010-03-05 13:25               ` Tejun Heo
  0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2010-03-05 13:25 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linux-next, LKML

On 03/05/2010 04:47 PM, Sachin Sant wrote:
>> diff --git a/mm/percpu.c b/mm/percpu.c
>> index 768419d..f1ed9ea 100644
>> --- a/mm/percpu.c
>> +++ b/mm/percpu.c
>> @@ -373,11 +373,11 @@ static int pcpu_need_to_extend(struct pcpu_chunk
>> *chunk)
>>  {
>>      int new_alloc;
>>
>> -    if (chunk->map_alloc >= chunk->map_used + 2)
>> +    if (chunk->map_alloc >= chunk->map_used + 3)
>>          return 0;
>>
>>      new_alloc = PCPU_DFL_MAP_ALLOC;
>> -    while (new_alloc < chunk->map_used + 2)
>> +    while (new_alloc < chunk->map_used + 3)
>>          new_alloc *= 2;
>>
>>      return new_alloc;

This was a red herring.  +2 is correct.

Thanks.

-- 
tejun


* Re: -next March 3: Boot failure on x86 (Oops)
  2010-03-05 13:24             ` Tejun Heo
@ 2010-03-06  7:39               ` Sachin Sant
  0 siblings, 0 replies; 12+ messages in thread
From: Sachin Sant @ 2010-03-06  7:39 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-next, LKML

Tejun Heo wrote:
>
> I'm having very difficult time imagining how 22b737f4 could have
> affected this as the patch is identical transformation of the previous
> code.  Also, 833af842 was released with 2.6.32 and stayed that way, so
> it really looks like a memory overrun / random corruption thing.  Can
> you please retry with kmalloc debug stuff turned on?
>   
With the latest git (2.6.33-git11) the machine boots fine. So as you
mentioned this could be a memory overrun / random corruption issue.

I will try a few more boots with the latest / old git snapshots, and will
try to collect debug information if I can recreate it.

Thanks
-Sachin

> Thanks.
>
>   


-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

