* xfs_iflush_int: Bad inode, xfs_do_force_shutdown from xfs_inode.c during file copy
From: Marcel Giannelia @ 2014-05-03 19:49 UTC (permalink / raw)
  To: xfs

Hi,

I just had the following happen on a server that's never had any
previous issues with XFS:

I was copying an 8 GB file onto an XFS filesystem when the copy
aborted with an I/O error message and I was forced to unmount and
xfs_repair the filesystem before it would mount again. Relevant dmesg
messages below.

Some other information that might be relevant:

- Distribution & kernel version: Debian 7, uname -a returns:

Linux hostname 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2+deb7u2 i686 GNU/Linux

- This filesystem is on a software (mdadm) RAID 1 array, backed by
  2 disks.

- The drives underneath the RAID reported no errors, and there were no
  errors from either the RAID layer or the disk layer in dmesg. One hard
  drive shows no change in SMART status; the other one's reallocated
  sector count has increased by 1 since the last time I checked (1
  month ago), but there are no logged errors.

- The filesystem is 174 GB, contains 2.5 million (i.e. 2484996)
  files, and has 20 GB of free space. A day or two ago it contained
  around 4 million files and was almost completely full (2 GB free
  space), and the most recent activity before this problem was the
  deletion of about 1.5 million files. (Over 24 hours passed between
  deletion of these files and the attempt to copy the 8 GB file.)

- The file I was copying was a virtual disk image from a virtual
  machine that I had shut down about 5 seconds before starting the 
  copy, so it is likely that some of the file's data had not yet
  been committed to disk on the source filesystem, and it is also
  possible that the source file's metadata and contents
  (but not size) changed during the copy. The source filesystem is
  also XFS.

- A few weeks ago, one of the disks in the RAID 1 array failed and I
  replaced it with two disks, bringing the number of disks in the array
  up to 3. This seemed to cause performance issues (writes took too
  long to complete, causing disk timeouts in the virtual machines), so
  the third disk was removed and the array was changed back to 2 disks.

- I have never seen memory issues on this server before (I ran an
  overnight memtest when it was first set up, but that was a year ago),
  and after this error I ran the userspace program "memtester", as I
  cannot take the server offline. This is a fairly limited test (I was
  only able to test 1 GB of free memory), but it found no problems.

- The problem has not recurred -- after doing xfs_repair and remounting
  the filesystem, I was able to copy that same file without issues. I
  have since also copied another virtual machine disk of the same size,
  while the VM was running, without problems.


Output of xfs_info for the filesystem:

meta-data=/dev/md126             isize=256    agcount=4, agsize=11375416 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=45501664, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=22217, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


dmesg entries:

> Immediately after the cp command exited with "i/o error":

XFS (md126): xfs_iflush_int: Bad inode 939480132, ptr 0xd12fa080, magic number 0x494d
XFS (md126): xfs_do_force_shutdown(0x8) called from line 2606 of file /build/buildd-linux_3.2.41-2+deb7u2-i386-G4jjsr/linux-3.2.41/fs/xfs/xfs_inode.c.  Return address = 0xf88319b4
XFS (md126): Corruption of in-memory data detected.  Shutting down filesystem
XFS (md126): Please umount the filesystem and rectify the problem(s)
Buffer I/O error on device md126, logical block 5673596
lost page write due to I/O error on md126

(about 10 more of the "Buffer I/O error / lost page" messages)


> I then unmounted the filesystem, which caused these lines in dmesg:

XFS (md126): xfs_log_force: error 5 returned.
XFS (md126): xfs_log_force: error 5 returned.
XFS (md126): xfs_log_force: error 5 returned.
XFS (md126): xfs_do_force_shutdown(0x1) called from line 1033 of file /build/buildd-linux_3.2.41-2+deb7u2-i386-G4jjsr/linux-3.2.41/fs/xfs/xfs_buf.c.  Return address = 0xf87fd1a7
XFS (md126): xfs_log_force: error 5 returned.
XFS (md126): xfs_log_force: error 5 returned.


> I then tried to mount the filesystem again and got:

XFS (md126): Mounting Filesystem
XFS (md126): Starting recovery (logdev: internal)
XFS (md126): xlog_recover_inode_pass2: Bad inode log record, rec ptr 0xe153d720, ino 939480132
XFS (md126): Internal error xlog_recover_inode_pass2(2) at line 2259 of file /build/buildd-linux_3.2.41-2+deb7u2-i386-G4jjsr/linux-3.2.41/fs/xfs/xfs_log_recover.c.  Caller 0xf8835e66

Pid: 30323, comm: mount Tainted: G           O 3.2.0-4-686-pae #1 Debian 3.2.41-2+deb7u2
Call Trace:
 [<f8834fb9>] ? xlog_recover_inode_pass2+0x206/0x60a [xfs]
 [<f8835e66>] ? xlog_recover_commit_trans+0x5f/0x80 [xfs]
 [<f8835e66>] ? xlog_recover_commit_trans+0x5f/0x80 [xfs]
 [<f8835fb8>] ? xlog_recover_process_data+0x131/0x1bd [xfs]
 [<f8836277>] ? xlog_do_recovery_pass+0x233/0x57b [xfs]
 [<c1038fa5>] ? vprintk+0x3ae/0x3df
 [<f8836636>] ? xlog_do_log_recovery+0x77/0x92 [xfs]
 [<f8836663>] ? xlog_do_recover+0x12/0xab [xfs]
 [<f8836876>] ? xlog_recover+0x6d/0x79 [xfs]
 [<f883cb53>] ? xfs_log_mount+0xab/0x109 [xfs]
 [<f8838c30>] ? xfs_mountfs+0x2b6/0x501 [xfs]
 [<f8806ff8>] ? xfs_mru_cache_create+0xeb/0x112 [xfs]
 [<f8808839>] ? xfs_fs_fill_super+0x145/0x205 [xfs]
 [<c10ce843>] ? mount_bdev+0x11f/0x16d
 [<f88086f4>] ? xfs_parseargs+0x8da/0x8da [xfs]
 [<c10a0036>] ? move_active_pages_to_lru+0xab/0xde
 [<c10c1536>] ? __kmalloc_track_caller+0x9b/0xa7
 [<f88071f3>] ? xfs_fs_mount+0xe/0x11 [xfs]
 [<f88086f4>] ? xfs_parseargs+0x8da/0x8da [xfs]
 [<c10cef2b>] ? mount_fs+0x55/0x122
 [<c10df139>] ? vfs_kern_mount+0x4a/0x77
 [<c10df459>] ? do_kern_mount+0x2f/0xac
 [<c10e0763>] ? do_mount+0x5d0/0x61e
 [<c12c1631>] ? _cond_resched+0x5/0x18
 [<c10a6b88>] ? memdup_user+0x26/0x43
 [<c10e09ea>] ? sys_mount+0x67/0x96
 [<c12c6e1f>] ? sysenter_do_call+0x12/0x28
XFS (md126): log mount/recovery failed: error 117
XFS (md126): log mount failed


After this, I ran xfs_repair with -L. xfs_repair noted the same bad inode
number and deleted the file I had tried to copy, but otherwise made no changes
that I could see. After that, the filesystem mounted normally and there were no
further issues.

~Felix.


* Re: xfs_iflush_int: Bad inode, xfs_do_force_shutdown from xfs_inode.c during file copy
From: Dave Chinner @ 2014-05-04  0:17 UTC (permalink / raw)
  To: Marcel Giannelia; +Cc: xfs

On Sat, May 03, 2014 at 12:49:33PM -0700, Marcel Giannelia wrote:
> Hi,
> 
> I just had the following happen on a server that's never had any
> previous issues with XFS:
> 
> I was copying an 8 GB file onto an XFS filesystem when the copy
> aborted with an I/O error message and I was forced to unmount and
> xfs_repair the filesystem before it would mount again. Relevant dmesg
> messages below.
> 
> Some other information that might be relevant:
> 
> - Distribution & kernel version: Debian 7, uname -a returns:
> 
> Linux hostname 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2+deb7u2 i686 GNU/Linux

So, old hardware...

> dmesg entries:
> 
> > Immediately after the cp command exited with "i/o error":
> 
> XFS (md126): xfs_iflush_int: Bad inode 939480132, ptr 0xd12fa080, magic number 0x494d

The magic number has a single bit error in it.

#define XFS_DINODE_MAGIC                0x494e  /* 'IN' */

That's the in-memory inode, not the on-disk inode. It caught the
problem before writing the bad magic number to disk - the in-memory
disk buffer was checked immediately before the in-memory copy, and
it checked out OK...
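
To make that concrete, here is a minimal userspace sketch of that
kind of magic check. This is illustrative only, not the kernel's
actual xfs_iflush_int() code (the #define is real, the rest is
simplified):

#include <stdint.h>
#include <stdio.h>

#define XFS_DINODE_MAGIC        0x494e  /* 'IN' */

int main(void)
{
        uint16_t magic = 0x494d;  /* the corrupt value from the dmesg above */

        if (magic != XFS_DINODE_MAGIC) {
                /* XOR shows which bits differ: 0x494e ^ 0x494d = 0x0003 */
                printf("Bad inode magic 0x%04x (expected 0x%04x, diff 0x%04x)\n",
                       magic, XFS_DINODE_MAGIC, magic ^ XFS_DINODE_MAGIC);
                return 1;
        }
        return 0;
}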

> After this, I ran xfs_repair with -L. xfs_repair noted the same bad inode
> number and deleted the file I had tried to copy, but otherwise made no changes
> that I could see. After that, the filesystem mounted normally and there were no
> further issues.

What was the error that xfs_repair returned? There may have been
other things wrong with the inode that weren't caught when it was
loaded into memory.

However, I'd almost certainly be checking your hardware at this
point, as software doesn't usually cause random single bit flips...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_iflush_int: Bad inode, xfs_do_force_shutdown from xfs_inode.c during file copy
From: Marcel Giannelia @ 2014-05-04  6:18 UTC (permalink / raw)
  To: xfs

On Sun, 4 May 2014 10:17:46 +1000
Dave Chinner <david@fromorbit.com> wrote:

> > 
> > - Distribution & kernel version: Debian 7, uname -a returns:
> > 
> > Linux hostname 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2+deb7u2 i686
> > GNU/Linux
> 
> So, old hardware...

Actually no, fairly new underlying hardware -- but this is for a
not-for-profit with no hardware budget, and that one new machine is
the exception. At the time they had a lot more 32-bit hardware lying
around to build spares with, so I built it to run on that if needed :)

> 
> > dmesg entries:
> > 
> > > Immediately after the cp command exited with "i/o error":
> > 
> > XFS (md126): xfs_iflush_int: Bad inode 939480132, ptr 0xd12fa080,
> > magic number 0x494d
> 
> The magic number has a single bit error in it.
> 
> #define XFS_DINODE_MAGIC                0x494e  /* 'IN' */
> 
> That's the in-memory inode, not the on-disk inode. It caught the
> problem before writing the bad magic number to disk - the in-memory
> disk buffer was checked immediately before the in-memory copy, and
> it checked out OK...
> 
> > After this, I ran xfs_repair with -L. xfs_repair noted the same bad
> > inode number and deleted the file I had tried to copy, but
> > otherwise made no changes that I could see. After that, the
> > filesystem mounted normally and there were no further issues.
> 
> What was the error that xfs_repair returned? There may have been
> other things wrong with the inode that weren't caught when it was
> loaded into memory.

Sorry; I didn't capture that output. From what I remember, the only
line different from a clean xfs_repair run was a line quite similar to
what was in the dmesg, about inode # 939480132.

> 
> However, I'd almost certainly be checking your hardware at this
> point, as software doesn't usually cause random single bit flips...

Yeah, going to take that server offline for a full memtest next time
I'm out there.

I also discovered that the third disk I mentioned from that RAID array
was actually having serious problems (hardware ECC recovery and
reallocated sectors through the roof), which explains the performance
issues it was causing -- and that disk was still part of the array
containing the root filesystem.

Since mdadm RAID1 reads from whichever individual disk is least busy
(and doesn't read from all disks in the array and compare them during
normal I/O), is it conceivable that this is what happened?

Copying that file busied out the two healthy disks, which were both in
the array I was writing to. So of the three disks backing the root
filesystem, the only one not busy was the failing one. During the file
copy, some piece of code was needed that was not in the memory cache
and had to be read from disk, so mdadm read it from the failing drive,
which silently flipped a bit (or several).
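
For illustration, a toy model of the read selection I mean (the real
logic is read_balance() in drivers/md/raid1.c and is considerably
more involved; the names below are made up):

struct mirror {
        int nr_pending;         /* in-flight requests on this member disk */
        int faulty;             /* kicked from the array; silent corruption
                                 * does not set this */
};

/* Pick the non-faulty mirror with the fewest in-flight requests. */
static int pick_read_mirror(const struct mirror *m, int n)
{
        int i, best = -1;

        for (i = 0; i < n; i++) {
                if (m[i].faulty)
                        continue;
                if (best < 0 || m[i].nr_pending < m[best].nr_pending)
                        best = i;
        }
        return best;            /* a failing-but-idle disk can win here */
}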

A memory problem still seems more likely to me, as I wouldn't expect
the part of the xfs filesystem driver containing the definition of that
magic number to ever need to be re-read from disk after boot -- but I
don't know. Memory load on this system is always very high; there's
almost no room for cache, so maybe... And is there a definition of
that constant someplace that might only be needed when creating a file
bigger than 2 or 4 GB (a very infrequent operation on this filesystem)?

Thanks,

~Felix.


* Re: xfs_iflush_int: Bad inode, xfs_do_force_shutdown from xfs_inode.c during file copy
From: Dave Chinner @ 2014-05-04 21:39 UTC (permalink / raw)
  To: Marcel Giannelia; +Cc: xfs

On Sat, May 03, 2014 at 11:18:38PM -0700, Marcel Giannelia wrote:
> On Sun, 4 May 2014 10:17:46 +1000
> Dave Chinner <david@fromorbit.com> wrote:
> 
> > > 
> > > - Distribution & kernel version: Debian 7, uname -a returns:
> > > 
> > > Linux hostname 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2+deb7u2 i686
> > > GNU/Linux
> > 
> > So, old hardware...
> 
> Actually no, fairly new underlying hardware -- but this is for a
> not-for-profit with no hardware budget, and that one new machine is
> the exception. At the time they had a lot more 32-bit hardware lying
> around to build spares with, so I built it to run on that if needed :)

OK. If it were me, I would run an x86-64 kernel with a 32-bit
userspace rather than put up with all the highmem weirdness of a
686-pae kernel...

> > > dmesg entries:
> > > 
> > > > Immediately after the cp command exited with "i/o error":
> > > 
> > > XFS (md126): xfs_iflush_int: Bad inode 939480132, ptr 0xd12fa080,
> > > magic number 0x494d
> > 
> > The magic number has a single bit error in it.
> > 
> > #define XFS_DINODE_MAGIC                0x494e  /* 'IN' */
> > 
> > That's the in-memory inode, not the on-disk inode. It caught the
> > problem before writing the bad magic number to disk - the in-memory
> > disk buffer was checked immediately before the in-memory copy, and
> > it checked out OK...
...
> > However, I'd almost certainly be checking your hardware at this
> > point, as software doesn't usually cause random single bit flips...
> 
> Yeah, going to take that server offline for a full memtest next time
> I'm out there.
> 
> I also discovered that the third disk I mentioned from that RAID array
> was actually having serious problems (hardware ECC recovery and
> reallocated sectors through the roof), which explains the performance
> issues it was causing -- and that disk was still part of the array
> containing the root filesystem.

Ok, so it may be that the error came from disk in the first place.
The kernel you are running is old enough that it doesn't rigorously
check every inode that is read from disk, so maybe it slipped
through and was only caught by the writeback checks.


> A memory problem still seems more likely to me, as I wouldn't expect
> the part of the xfs filesystem driver containing the definition of that
> magic number to ever need to be re-read from disk after boot

The magic number is in every inode that is allocated on disk...
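
That is, the constant is embedded in the data itself, not just in
driver code. A simplified sketch of the start of the on-disk inode
(abbreviated from the struct xfs_dinode layout in fs/xfs/xfs_dinode.h
of that era; the struct name here is made up):

#include <stdint.h>

#define XFS_DINODE_MAGIC        0x494e  /* 'IN' */

struct dinode_prefix {            /* illustrative; really struct xfs_dinode */
        uint16_t di_magic;        /* XFS_DINODE_MAGIC, big-endian on disk */
        uint16_t di_mode;         /* file type and permission bits */
        /* ... the rest of the on-disk inode core follows ... */
};

So a bad sector under an inode cluster, or a flipped bit in the
cached buffer, surfaces as a bad magic regardless of which parts of
the driver happen to be paged in.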

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

