From mboxrd@z Thu Jan 1 00:00:00 1970
From: Coly Li
Date: Fri, 09 Apr 2010 15:58:48 +0800
Subject: [Ocfs2-devel] [PATCH] ocfs2: avoid direct write if we fall back to buffered
In-Reply-To: <4BBE2356.4010103@oracle.com>
References: <201004081547.24593.lidongyang@novell.com> <4BBE2356.4010103@oracle.com>
Message-ID: <4BBEDE38.3030207@suse.de>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ocfs2-devel@oss.oracle.com

On 04/09/2010 02:41 AM, Sunil Mushran Wrote:
> I cannot read the bugzilla. Now it maybe that that bz
> cannot be made public. That's ok. But if that's the case,
> can you explain the problem encountered. I am not qs
> the fix... rather trying to understand why this has not
> been reported before.
>

Hi Sunil,

This issue was reported by Jiaju Zhang, another Novell ocfs2/dlm developer.
When he ran an I/O stress test (fsstress from the LTP package), the following
dmesg output was observed:

Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717421] (11411,2):ocfs2_truncate_file:465 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717437] (11411,2):ocfs2_truncate_file:465 ERROR: Inode 241893, inode i_size = 1540096 != di i_size = 1535498, i_flags = 0x1
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717462] ------------[ cut here]------------
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717465] kernel BUG at /usr/src/packages/BUILD/ocfs2-1.4/xen/ocfs2/file.c:465!
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717468] invalid opcode: 0000 [#2] SMP
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717471] last sysfs file: /sys/kernel/uevent_seqnum
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717474] Modules linked in: ocfs2 jbd2 ocfs2_nodemanager quota_tree ocfs2_stack_user ocfs2_stackglue dlm configfs sg sd_mod crc_t10dif crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod af_packet microcode softdog fuse loop dm_mod rtc_core rtc_lib joydev xennet ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717516] Supported: Yes
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717518]
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717521] Pid: 11411, comm: fsstress Tainted: G D (2.6.32.9-0.5-xen #1)
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717525] EIP: 0061:[] EFLAGS: 00010296 CPU: 2
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717538] EIP is at ocfs2_setattr+0xc1a/0x1d10 [ocfs2]
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717542] EAX: 00000089 EBX: cd8e25f0 ECX: c056c0ec EDX: 00000000
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717545] ESI: cc4c2000 EDI: cae4e908 EBP: 00068f02 ESP: c0a43e54
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717548] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717552] Process fsstress (pid: 11411, ti=c0a42000 task=cd8e25f0 task.ti=c0a42000)
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717555] Stack:
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717557] d24cfc30 00002c93 00000002 d24c809c 000001d1 0003b0e5 00000000 00178000
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717564] <0> 00000000 00176e0a 00000000 00000001 00110f02 00000000 00000000 00000000
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717572] <0> 00000000 00000000 00000000 00110f02 d24628e9 00008282 c0a43f44 ca5c4000
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717582] Call Trace:
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717606] [] notify_change+0x141/0x320
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717614] [] do_truncate+0x68/0xa0
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717619] [] do_sys_truncate+0x177/0x220
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717624] [] syscall_call+0x7/0xb
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717629] [] 0xf57fe424
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717631] Code: 69 e8 ed f6 05 82 9f d7 d1 80 75 09 f6 05 84 9f d7 d1 01 74 16 f6 05 8a 9f d7 d1 80 75 0d f6 05 8c 9f d7 d1 01 0f 84 48 06 00 00 <0f> 0b eb fe 66 90 8b 44 24 68 31 c9 e8 b5 2f c9 ed 31 c9 89 44
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717675] EIP: [] ocfs2_setattr+0xc1a/0x1d10 [ocfs2] SS:ESP 0069:c0a43e54
Mar 24 15:59:07 xen-sp1-2 kernel: [ 2309.717688] ---[ end trace cce1004f6a64f124 ]---

Jiaju, Dongyang, and I can all reproduce the above error. Dongyang also
reproduced this issue on a vanilla kernel.

We find that these steps make the error easier to reproduce:
1) fill the ocfs2 volume to 97%-98% full (dd a big file onto the ocfs2 volume)
2) then run fsstress

Jan Kara also helped review Dongyang's patch and had no objections.

I hope this explanation is informative.

--
Coly Li
SuSE Labs
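
For step 1 of the reproduction steps above, it can help to know how much data dd needs to write before the volume reaches the target fill level. The following is only a minimal sketch and not part of the original report: the function names and the default of 97 percent are illustrative assumptions, and the script only computes a size; it does not write any data or run fsstress.

```python
import shutil
import sys


def bytes_to_reach_fill(total: int, used: int, target_percent: int) -> int:
    """Return how many more bytes must be written so that `used`
    reaches `target_percent` percent of `total` (0 if already there)."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    target_used = total * target_percent // 100
    return max(0, target_used - used)


def fill_plan(mount_point: str, target_percent: int = 97) -> int:
    """Hypothetical helper: bytes to write on the volume mounted at
    `mount_point` to reach the target fill level."""
    usage = shutil.disk_usage(mount_point)
    return bytes_to_reach_fill(usage.total, usage.used, target_percent)


if __name__ == "__main__":
    # The mount point is an assumption; pass the ocfs2 mount as argv[1].
    mount = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"bytes to write on {mount}: {fill_plan(mount)}")
```

The printed byte count can then be used to size the big file written with dd before starting fsstress on the same volume.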