linux-kernel.vger.kernel.org archive mirror
* kernel BUG at mm/filemap.c:332!
From: Mihai RUSU @ 2003-12-04 14:59 UTC
  To: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3642 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi

I have started moving some of the machines here to 2.6.0 (currently
test11). On one of them I got this kernel message:

Bad page state at free_hot_cold_page
flags:0x01020005 mapping:00000000 mapped:0 count:0
Backtrace:
Call Trace:
 [<c0135172>] bad_page+0x46/0x6c
 [<c01356ac>] free_hot_cold_page+0x60/0xec
 [<c013573f>] free_hot_page+0x7/0x8
 [<c013a23b>] __page_cache_release+0xa7/0xac
 [<c013a9a2>] invalidate_complete_page+0xba/0xc4
 [<c013ac9b>] invalidate_mapping_pages+0x73/0xd4
 [<c014e617>] remove_inode_buffers+0x13/0x94
 [<c013ad0a>] invalidate_inode_pages+0xe/0x14
 [<c0163e09>] prune_icache+0x11d/0x260
 [<c0163f61>] shrink_icache_memory+0x15/0x20
 [<c013afec>] shrink_slab+0x110/0x168
 [<c013c561>] balance_pgdat+0x13d/0x1e8
 [<c013c70f>] kswapd+0x103/0x108
 [<c013c60c>] kswapd+0x0/0x108
 [<c011bdd8>] autoremove_wake_function+0x0/0x40
 [<c011bdd8>] autoremove_wake_function+0x0/0x40
 [<c0107011>] kernel_thread_helper+0x5/0xc

Trying to fix it up, but a reboot is needed
- ------------[ cut here ]------------
kernel BUG at mm/filemap.c:332!
invalid operand: 0000 [#1]
CPU:    0
EIP:    0060:[<c0131fff>]    Not tainted
EFLAGS: 00010246
EIP is at unlock_page+0x1b/0x3c
eax: 00000000   ebx: c14ecdc0   ecx: 00000016   edx: c1780120
esi: c1781728   edi: 00000001   ebp: efd0be6c   esp: efd0be04
ds: 007b   es: 007b   ss: 0068
Process kswapd0 (pid: 13, threadinfo=efd0a000 task=efd226b0)
Stack: c14ecdc0 00000002 c013aca8 d4048c24 d4048c2c efd0a000 efd0a000 00000002 
       00000003 00000000 c1726f28 c14ecdc0 c1743290 efd0bec8 efd0bf14 c02f8900 
       000000fa 00000000 00000001 efd0be70 efd0be68 000000fb 000000fa d48f0704 
Call Trace:
 [<c013aca8>] invalidate_mapping_pages+0x80/0xd4
 [<c014e617>] remove_inode_buffers+0x13/0x94
 [<c013ad0a>] invalidate_inode_pages+0xe/0x14
 [<c0163e09>] prune_icache+0x11d/0x260
 [<c0163f61>] shrink_icache_memory+0x15/0x20
 [<c013afec>] shrink_slab+0x110/0x168
 [<c013c561>] balance_pgdat+0x13d/0x1e8
 [<c013c70f>] kswapd+0x103/0x108
 [<c013c60c>] kswapd+0x0/0x108
 [<c011bdd8>] autoremove_wake_function+0x0/0x40
 [<c011bdd8>] autoremove_wake_function+0x0/0x40
 [<c0107011>] kernel_thread_helper+0x5/0xc

Code: 0f 0b 4c 01 d3 d0 2b c0 8d 46 04 39 46 04 74 0e 31 c9 ba 03 
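
Each frame in the traces above has the form "[<address>] symbol+0xoffset/0xsize". As a rough aid for reading them, the sketch below (Python; the helper name parse_frame is made up for illustration) splits one frame into its parts and derives the start address of the function by subtracting the offset from the address.

```python
import re

# One call-trace frame: [<address>] symbol+0xoffset/0xsize
FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frame(line):
    """Return address, symbol, offset and size for a trace line, or None."""
    match = FRAME.search(line)
    if match is None:
        return None
    addr, symbol, offset, size = match.groups()
    return {
        "addr": int(addr, 16),
        "symbol": symbol,
        "offset": int(offset, 16),
        "size": int(size, 16),
    }


frame = parse_frame(" [<c013aca8>] invalidate_mapping_pages+0x80/0xd4")
# The function start is the frame address minus the offset into the function.
start = frame["addr"] - frame["offset"]
print(frame["symbol"], hex(start))
```

Applied to both traces, the invalidate_mapping_pages frames (+0x73 in the first, +0x80 in the second) resolve to the same function start, so they are two nearby return addresses in one function.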

It seems that this bug is related to a directory on an XFS partition
with lots of entries (several thousand). I didn't get any messages like
this on other systems that don't have directories with so many file
entries. Any 'ls' I try in that directory, and any other process trying to
list its contents, gets stuck in 'D' state.

I have attached the .config file, and here is what ver_linux says:
Linux status 2.6.0-test11 #1 SMP Tue Dec 2 17:19:31 EET 2003 i686 unknown

Gnu C                  2.95.3
Gnu make               3.79.1
util-linux             2.11r
mount                  2.11r
module-init-tools      implemented
e2fsprogs              1.27
jfsutils               1.0.18
reiserfsprogs          3.x.1b
xfsprogs               2.0.3
quota-tools            3.06.
Linux C Library        2.2.5
Dynamic linker (ldd)   2.2.5
Procps                 2.0.16
Net-tools              1.60
Kbd                    1.06
Sh-utils               2.0

Please help, thanks!

- -- 
Mihai RUSU                                    Email: dizzy@roedu.net
GPG : http://dizzy.roedu.net/dizzy-gpg.txt    WWW: http://dizzy.roedu.net
                       "Linux is obsolete" -- AST
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/z0vAPZzOzrZY/1QRAi0jAKDhTGbfetbch5XBOV31+0sMAtjBwQCfbdFb
DCeSCZHDSVQZqaZfhUI6sbo=
=oFXW
-----END PGP SIGNATURE-----

[-- Attachment #2: Type: TEXT/PLAIN, Size: 13546 bytes --]

#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y

#
# Code maturity level options
#
# CONFIG_EXPERIMENTAL is not set
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y

#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=15
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y

#
# Loadable module support
#
# CONFIG_MODULES is not set

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
CONFIG_MPENTIUMIII=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MELAN is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y

#
# Power management options (ACPI, APM)
#
# CONFIG_PM is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BOOT=y

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
# CONFIG_HOTPLUG is not set

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
# CONFIG_BINFMT_MISC is not set

#
# Device Drivers
#

#
# Generic Driver Options
#

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play support
#
# CONFIG_PNP is not set

#
# Block devices
#
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_LBD is not set

#
# ATA/ATAPI/MFM/RLL support
#
# CONFIG_IDE is not set

#
# SCSI device support
#
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_REPORT_LUNS is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set

#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_ACARD is not set
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=2000
# CONFIG_AIC7XXX_PROBE_EISA_VL is not set
# CONFIG_AIC7XXX_BUILD_FIRMWARE is not set
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_MEGARAID is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set

#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set

#
# Fusion MPT device support
#
# CONFIG_FUSION is not set

#
# I2O device support
#
# CONFIG_I2O is not set

#
# Networking support
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_INET_ECN is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set

#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set

#
# IP: Netfilter Configuration
#
# CONFIG_IP_NF_CONNTRACK is not set
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=y
# CONFIG_IP_NF_MATCH_LIMIT is not set
# CONFIG_IP_NF_MATCH_IPRANGE is not set
# CONFIG_IP_NF_MATCH_MAC is not set
# CONFIG_IP_NF_MATCH_PKTTYPE is not set
# CONFIG_IP_NF_MATCH_MARK is not set
# CONFIG_IP_NF_MATCH_MULTIPORT is not set
# CONFIG_IP_NF_MATCH_TOS is not set
# CONFIG_IP_NF_MATCH_RECENT is not set
# CONFIG_IP_NF_MATCH_ECN is not set
# CONFIG_IP_NF_MATCH_DSCP is not set
# CONFIG_IP_NF_MATCH_AH_ESP is not set
# CONFIG_IP_NF_MATCH_LENGTH is not set
# CONFIG_IP_NF_MATCH_TTL is not set
# CONFIG_IP_NF_MATCH_TCPMSS is not set
# CONFIG_IP_NF_MATCH_OWNER is not set
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_TARGET_LOG is not set
# CONFIG_IP_NF_TARGET_ULOG is not set
# CONFIG_IP_NF_TARGET_TCPMSS is not set
# CONFIG_IP_NF_ARPTABLES is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
CONFIG_NETDEVICES=y

#
# ARCnet devices
#
# CONFIG_ARCNET is not set
CONFIG_DUMMY=y
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set

#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
CONFIG_E100=y
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_TIGON3 is not set

#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_FDDI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_NET_FC is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# Bluetooth support
#
# CONFIG_BT is not set

#
# ISDN subsystem
#
# CONFIG_ISDN_BOOL is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256

#
# I2C support
#
# CONFIG_I2C is not set

#
# I2C Algorithms
#

#
# I2C Hardware Bus support
#

#
# I2C Hardware Sensors Chip support
#
# CONFIG_I2C_SENSOR is not set

#
# Mice
#
# CONFIG_BUSMOUSE is not set
# CONFIG_QIC02_TAPE is not set

#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set

#
# Ftape, the floppy tape device driver
#
# CONFIG_AGP is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_HANGCHECK_TIMER is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set

#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set

#
# Graphics support
#
# CONFIG_FB is not set
# CONFIG_VIDEO_SELECT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y

#
# Sound
#
# CONFIG_SOUND is not set

#
# USB support
#
# CONFIG_USB is not set

#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_XFS_FS=y
# CONFIG_XFS_QUOTA is not set
# CONFIG_XFS_POSIX_ACL is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
# CONFIG_FAT_FS is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_DEVPTS_FS=y
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y

#
# Miscellaneous filesystems
#
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_EXPORTFS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y

#
# Kernel hacking
#
# CONFIG_DEBUG_KERNEL is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_FRAME_POINTER is not set
CONFIG_X86_EXTRA_IRQS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y

#
# Security options
#
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=y
# CONFIG_SECURITY_SELINUX is not set

#
# Cryptographic options
#
# CONFIG_CRYPTO is not set

#
# Library routines
#
# CONFIG_CRC32 is not set
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_PC=y

* Re: kernel BUG at mm/filemap.c:332!
From: Linus Torvalds @ 2003-12-04 16:45 UTC
  To: Nathan Scott; +Cc: Kernel Mailing List, Mihai RUSU, Jens Axboe, Neil Brown


Nathan,
 you're not off the hook yet. This is a smoking gun on XFS, and this time
with a big clue: large directories, and a low-memory situation.

Mihai RUSU <dizzy@roedu.net> wrote:
>
> It seems that this bug is related to a directory on an XFS partition
> with lots of entries (several thousand). I didn't get any messages like
> this on other systems that don't have directories with so many file
> entries. Any 'ls' I try in that directory, and any other process trying to
> list its contents, gets stuck in 'D' state.

Also, this time the config file doesn't have any MD/RAID support according
to the attachment:

	# Multi-device support (RAID and LVM)
	#
	# CONFIG_MD is not set

so it looks like the XFS and MD issues really are totally unrelated.
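
The check made above, reading the attached .config to see whether MD is built in, can be sketched in a few lines of Python. This is only an illustration: the function name parse_config and the four-line sample are assumptions, with the sample lines copied from the attachment. An option that is "not set" is stored as None, anything else keeps its value string.

```python
def parse_config(text):
    """Map kernel .config option names to their value, or None if not set."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # "# CONFIG_MD is not set" -> "CONFIG_MD"
            name = line[2:-len(" is not set")]
            options[name] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


# A few lines taken from the attached .config
sample = """\
# CONFIG_MD is not set
CONFIG_XFS_FS=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2
"""

opts = parse_config(sample)
print("MD enabled:", opts["CONFIG_MD"] is not None)
print("XFS enabled:", opts["CONFIG_XFS_FS"] == "y")
```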

Basically, as far as I can see a pattern, we have:

 - MD/RAID causes a double bio free, and as a result we see the reports of
   slab corruption from people who have slab debugging enabled.
 - the XFS case causes some kind of "struct page" corruption, and as a
   result we see the "Bad page state" messages followed by an eventual
   oops due to corrupted page lists.

Two different bugs, two different behaviours. It's just that some people
have _both_ ;)

Mihai: the oops itself is in this case not very telling, since it's just a
result of corruption of some fundamental data structures (probably
somebody using a page cache page after having free'd it - and it probably
only shows up when memory gets low and pages have to be cleaned). Can you
tell Nathan more about the filesystem setup (block size, as much as
possible about the affected directory, etc).
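
To make the "struct page" state in the report a little more concrete, the flags word from the "Bad page state" line (flags:0x01020005) can be split into the bit positions that are set. The Python sketch below does only that arithmetic. Turning bit numbers into PG_* names requires include/linux/page-flags.h from the exact kernel tree, which is not reproduced here. If bit 0 is PG_locked in that tree (an assumption), the page was still marked locked when it was freed.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


flags = 0x01020005  # from "flags:0x01020005" in the report
print(set_bits(flags))  # [0, 2, 17, 24]
```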

		Linus


* Re: kernel BUG at mm/filemap.c:332!
From: Mihai RUSU @ 2003-12-04 17:26 UTC
  To: Linus Torvalds; +Cc: Nathan Scott, Kernel Mailing List, Jens Axboe, Neil Brown

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Linus

First of all thanks for the answer!

On Thu, 4 Dec 2003, Linus Torvalds wrote:

> 
> Nathan,
>  you're not off the hook yet. This is a smoking gun on XFS, and this time
> with a big clue: large directories, and a low-memory situation.

Sorry for misleading you guys in my first post. After rebooting the
machine I have some more information: the directory actually has only
a few hundred entries (~400), not thousands as I previously speculated
(I didn't know the exact number until I could ls it, and I couldn't do that
until I had rebooted the machine).

Since it is just a few hundred entries, I know that I have at least one more
2.6.0-test11 machine (SMP, no MD but hw DAC960 RAID) with more entries in
one directory, and I haven't gotten any such message there (yet). It has only
5 days of uptime, so we will see if I get anything there too. It could just be
that on the other machine there isn't much activity in the directories
with many entries. The machine which got the kernel error has a lot
going on in that directory (mostly stats gathered
every 5 minutes from cron with mrtg and written to binary image files).

However, I have some more useful (I hope) information about the subject.
Before rebooting I wanted to first install a do_brk()-patched 2.4.21-xfs
kernel with lilo. Unfortunately lilo got stuck in an fsync() call after
printing that it had added all kernel images to the MBR as configured in
lilo.conf. On reboot I had no problem booting the new do_brk()-fixed
kernel, so lilo seems to have done its job; I don't know why it got stuck
in fsync().

ctrl-alt-del didn't do the trick (I had a live ssh session on the machine
which was working, I could run ps ax, vmstat etc., but init was probably
doing something which also got stuck in D state), so I had to reboot it "hard".
After power-on, one colleague complained that a file he had worked on a
couple of minutes before I took the machine down had NULL bytes instead of
its actual content. I know that "dirty" data gets flushed to disk every 30
seconds, so this seems a little strange (in general I know that XFS
leaves NULL bytes in files modified just before an unclean reboot, but this
file was modified some 5 minutes before the "hard" reboot).

> Also, this time the config file doesn't have any MD/RAID support according
> to the attachment:
> 
> 	# Multi-device support (RAID and LVM)
> 	#
> 	# CONFIG_MD is not set
> 
> so it looks like the XFS and MD issues really are totally unrelated.

Yep, I'm very conservative about the features I use in the kernel :)

> Mihai: the oops itself is in this case not very telling, since it's just a
> result of corruption of some fundamental data structures (probably
> somebody using a page cache page after having free'd it - and it probably
> only shows up when memory gets low and pages have to be cleaned). Can you
> tell Nathan more about the filesystem setup (block size, as much as
> possible about the affected directory, etc).

Ok.

$ xfs_info /var
meta-data=/var                   isize=256    agcount=18, agsize=262144 blks
data     =                       bsize=4096   blocks=4482127, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096  
log      =internal               bsize=4096   blocks=1200
realtime =none                   extsz=65536  blocks=0, rtextents=0

Mount options are "rw,noatime".

Please let me know if you need any other info. Thanks!

> 		Linus

- -- 
Mihai RUSU                                    Email: dizzy@roedu.net
GPG : http://dizzy.roedu.net/dizzy-gpg.txt    WWW: http://dizzy.roedu.net
                       "Linux is obsolete" -- AST
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/z25QPZzOzrZY/1QRAnvLAKDmlFPQEYyzVmxgAgopuar3hhGZ5ACeOq6H
Zwty+roqa5JqjBZBJDF0xnc=
=gPtb
-----END PGP SIGNATURE-----

* Re: kernel BUG at mm/filemap.c:332!
From: Nathan Scott @ 2003-12-04 21:16 UTC
  To: Mihai RUSU, Linus Torvalds; +Cc: Kernel Mailing List, Jens Axboe, Neil Brown

On Thu, Dec 04, 2003 at 07:26:38PM +0200, Mihai RUSU wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi Linus
> 
> First of all thanks for the answer!
> 
> On Thu, 4 Dec 2003, Linus Torvalds wrote:
> 
> > 
> > Nathan,
> >  you're not off the hook yet. This is a smoking gun on XFS, and this time
> > with a big clue: large directories, and a low-memory situation.
> 
> Sorry for misleading you guys in my first post. After rebooting the
> machine I have some more information: the directory actually has only
> a few hundred entries (~400), not thousands as I previously speculated

Still, it's a clue - all metadata I/O in XFS goes through the
pagebuf code, so we're looking in the right place.

> However, I have some more useful (I hope) information about the subject.
> Before rebooting I wanted to first install a do_brk()-patched 2.4.21-xfs
> kernel with lilo. Unfortunately lilo got stuck in an fsync() call after
> printing that it had added all kernel images to the MBR as configured in
> lilo.conf. On reboot I had no problem booting the new do_brk()-fixed
> kernel, so lilo seems to have done its job; I don't know why it got stuck
> in fsync().

Was your filesystem near full?  There was a 2.4 deadlock fixed
recently which could be what you hit there.

> After power-on, one colleague complained that a file he had worked on a
> couple of minutes before I took the machine down had NULL bytes instead of
> its actual content. I know that "dirty" data gets flushed to disk every 30
> seconds, so this seems a little strange (in general I know that XFS
> leaves NULL bytes in files modified just before an unclean reboot, but this
> file was modified some 5 minutes before the "hard" reboot).

You'll want a more recent 2.4 XFS kernel I suspect - Steve made
several improvements in this area a while back.

> > Also, this time the config file doesn't have any MD/RAID support according
> > to the attachment:
> > 
> > 	# Multi-device support (RAID and LVM)
> > 	#
> > 	# CONFIG_MD is not set
> > 
> > so it looks like the XFS and MD issues really are totally unrelated.

Sure does.

> > Mihai: the oops itself is in this case not very telling, since it's just a
> > result of corruption of some fundamental data structures (probably
> > somebody using a page cache page after having free'd it - and it probably
> > only shows up when memory gets low and pages have to be cleaned). Can you
> > tell Nathan more about the filesystem setup (block size, as much as
> > possible about the affected directory, etc).
> 
> Ok.
> 
> $ xfs_info /var
> meta-data=/var                   isize=256    agcount=18, agsize=262144 blks
> data     =                       bsize=4096   blocks=4482127, imaxpct=25
>          =                       sunit=0      swidth=0 blks, unwritten=0
> naming   =version 2              bsize=4096  
> log      =internal               bsize=4096   blocks=1200
> realtime =none                   extsz=65536  blocks=0, rtextents=0

OK, looks like a default mkfs then (with an old-ish mkfs binary)?
Newer mkfs versions will give you a better AG layout, and unwritten extents
would be turned on - not relevant to this problem at all though.
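
For reference, the geometry in the xfs_info output quoted above can be worked out with a little arithmetic. The Python sketch below uses only the bsize, blocks, agsize and agcount values from that output; the variable names are chosen here for illustration.

```python
import math

# Values copied from the xfs_info output for /var
bsize = 4096        # bytes per filesystem block (data bsize)
blocks = 4482127    # data blocks
agsize = 262144     # blocks per allocation group
agcount = 18        # allocation groups reported

fs_bytes = blocks * bsize
ag_bytes = agsize * bsize
last_ag_blocks = blocks - (agcount - 1) * agsize

print("filesystem size:", round(fs_bytes / 2**30, 2), "GiB")
print("allocation group size:", ag_bytes // 2**30, "GiB")
print("blocks in the last, partial AG:", last_ag_blocks)
# The block count needs 18 groups of 262144 blocks, matching agcount.
print(math.ceil(blocks / agsize) == agcount)
```

So this layout is eighteen allocation groups of 1 GiB each, the last one only partly used, on a filesystem of roughly 17 GiB.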

An "ls -ld" and "xfs_bmap -v" on the directory would also provide
me a bit more info to work with -- thanks!

I have a few ideas about what this might be, let me stew on those
for a bit and try a few things.

cheers.

-- 
Nathan

* Re: kernel BUG at mm/filemap.c:332!
From: Mihai RUSU @ 2003-12-05  7:14 UTC
  To: Nathan Scott; +Cc: Linus Torvalds, Kernel Mailing List, Jens Axboe, Neil Brown

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 5 Dec 2003, Nathan Scott wrote:

> Was your filesystem near full?  There was a 2.4 deadlock fixed
> recently which could be what you hit there.

No, it wasn't. It is at ~50% usage.

> You'll want a more recent 2.4 XFS kernel I suspect - Steve made
> several improvements in this area a while back.

OK. I know about the improvements since XFS 1.1, and I assumed that with a
recent kernel (i.e. 2.6.0-test11) the XFS bits in it would be recent too and
thus have those improvements.

> OK, looks like a default mkfs then (with an old-ish mkfs binary)?

True. It's a general /var partition; there wasn't any reason to give
mkfs any parameters for it.

> Newer mkfs versions will give you a better AG layout, and unwritten extents
> would be turned on - not relevant to this problem at all though.

Ok, noted :)

> An "ls -ld" and "xfs_bmap -v" on the directory would also provide
> me a bit more info to work with -- thanks!

$ ls -ld interfaces/
drwxr-xr-x    2 root     root        16384 Dec  5 09:06 interfaces/

$ /usr/sbin/xfs_bmap interfaces/
interfaces/:
        0: [0..7]: 25238288..25238295
        1: [8..31]: 25238304..25238327
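
As a quick consistency check on the two outputs above: xfs_bmap reports file offsets and disk blocks in 512-byte units, so the extent list should add up to the 16384 bytes that ls -ld shows for the directory. The Python sketch below parses the two extent lines and sums them; the regular expression and variable names are illustrative only.

```python
import re

# The two extent lines printed by xfs_bmap for interfaces/
BMAP_OUTPUT = """\
        0: [0..7]: 25238288..25238295
        1: [8..31]: 25238304..25238327
"""

# [file_start..file_end]: disk_start..disk_end, all in 512-byte blocks
EXTENT = re.compile(r"\[(\d+)\.\.(\d+)\]:\s+(\d+)\.\.(\d+)")

extents = [tuple(int(n) for n in m.groups()) for m in EXTENT.finditer(BMAP_OUTPUT)]
total_blocks = sum(end - start + 1 for start, end, _, _ in extents)
# Gap on disk between the end of the first extent and the start of the second
gap = extents[1][2] - extents[0][3] - 1

print("512-byte blocks mapped:", total_blocks)
print("bytes mapped:", total_blocks * 512)
print("gap between extents (blocks):", gap)
```

The 32 mapped blocks come to 16384 bytes, which agrees with the directory size, and the two extents are close together on disk but not contiguous.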


> I have a few ideas about what this might be, let me stew on those
> for a bit and try a few things.

Thanks!

> -- 
> Nathan

- -- 
Mihai RUSU                                    Email: dizzy@roedu.net
GPG : http://dizzy.roedu.net/dizzy-gpg.txt    WWW: http://dizzy.roedu.net
                       "Linux is obsolete" -- AST
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/0DA+PZzOzrZY/1QRAnRoAJ9/VKw3okVloX1gTdayWXf1zxeJqACg1h9S
P9hQSHgK/K1CmlgT9/2L+H8=
=8PPr
-----END PGP SIGNATURE-----
