* Internal error xfs_trans_cancel
@ 2016-06-01 5:52 ` Daniel Wagner
0 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-01 5:52 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
Hi,
I got the error message below while compiling a kernel
on that system. I can't really say if I did something
which made the file system unhappy before the crash.
[Jun 1 07:41] XFS (sde1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c. Caller xfs_rename+0x453/0x960 [xfs]
[ +0.000095] CPU: 22 PID: 8640 Comm: gcc Not tainted 4.7.0-rc1 #16
[ +0.000035] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
[ +0.000048] 0000000000000286 00000000c8be6bc3 ffff885fa9473cb0 ffffffff813d146e
[ +0.000056] ffff885fa9ac5ed0 0000000000000001 ffff885fa9473cc8 ffffffffa0213cdc
[ +0.000053] ffffffffa02257b3 ffff885fa9473cf0 ffffffffa022eb36 ffff883faa502d00
[ +0.000053] Call Trace:
[ +0.000028] [<ffffffff813d146e>] dump_stack+0x63/0x85
[ +0.000069] [<ffffffffa0213cdc>] xfs_error_report+0x3c/0x40 [xfs]
[ +0.000065] [<ffffffffa02257b3>] ? xfs_rename+0x453/0x960 [xfs]
[ +0.000064] [<ffffffffa022eb36>] xfs_trans_cancel+0xb6/0xe0 [xfs]
[ +0.000065] [<ffffffffa02257b3>] xfs_rename+0x453/0x960 [xfs]
[ +0.000062] [<ffffffffa021fa63>] xfs_vn_rename+0xb3/0xf0 [xfs]
[ +0.000040] [<ffffffff8124f92c>] vfs_rename+0x58c/0x8d0
[ +0.000032] [<ffffffff81253fb1>] SyS_rename+0x371/0x390
[ +0.000036] [<ffffffff817d2032>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[ +0.000040] XFS (sde1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffa022eb4f
[ +0.027680] XFS (sde1): Corruption of in-memory data detected. Shutting down filesystem
[ +0.000057] XFS (sde1): Please umount the filesystem and rectify the problem(s)
[Jun 1 07:42] XFS (sde1): xfs_log_force: error -5 returned.
[ +30.081016] XFS (sde1): xfs_log_force: error -5 returned.
cheers,
daniel
* Re: Internal error xfs_trans_cancel
2016-06-01 5:52 ` Daniel Wagner
@ 2016-06-01 7:10 ` Dave Chinner
-1 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2016-06-01 7:10 UTC (permalink / raw)
To: Daniel Wagner; +Cc: linux-fsdevel, linux-kernel, xfs
On Wed, Jun 01, 2016 at 07:52:31AM +0200, Daniel Wagner wrote:
> Hi,
>
> I got the error message below while compiling a kernel
> on that system. I can't really say if I did something
> which made the file system unhappy before the crash.
>
>
> [Jun 1 07:41] XFS (sde1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c. Caller xfs_rename+0x453/0x960 [xfs]
Anything in the log before this?
> [ +0.000095] CPU: 22 PID: 8640 Comm: gcc Not tainted 4.7.0-rc1 #16
> [ +0.000035] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
> [ +0.000048] 0000000000000286 00000000c8be6bc3 ffff885fa9473cb0 ffffffff813d146e
> [ +0.000056] ffff885fa9ac5ed0 0000000000000001 ffff885fa9473cc8 ffffffffa0213cdc
> [ +0.000053] ffffffffa02257b3 ffff885fa9473cf0 ffffffffa022eb36 ffff883faa502d00
> [ +0.000053] Call Trace:
> [ +0.000028] [<ffffffff813d146e>] dump_stack+0x63/0x85
> [ +0.000069] [<ffffffffa0213cdc>] xfs_error_report+0x3c/0x40 [xfs]
> [ +0.000065] [<ffffffffa02257b3>] ? xfs_rename+0x453/0x960 [xfs]
> [ +0.000064] [<ffffffffa022eb36>] xfs_trans_cancel+0xb6/0xe0 [xfs]
> [ +0.000065] [<ffffffffa02257b3>] xfs_rename+0x453/0x960 [xfs]
> [ +0.000062] [<ffffffffa021fa63>] xfs_vn_rename+0xb3/0xf0 [xfs]
> [ +0.000040] [<ffffffff8124f92c>] vfs_rename+0x58c/0x8d0
> [ +0.000032] [<ffffffff81253fb1>] SyS_rename+0x371/0x390
> [ +0.000036] [<ffffffff817d2032>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> [ +0.000040] XFS (sde1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffa022eb4f
> [ +0.027680] XFS (sde1): Corruption of in-memory data detected. Shutting down filesystem
> [ +0.000057] XFS (sde1): Please umount the filesystem and rectify the problem(s)
> [Jun 1 07:42] XFS (sde1): xfs_log_force: error -5 returned.
> [ +30.081016] XFS (sde1): xfs_log_force: error -5 returned.
Doesn't normally happen, and there's not a lot to go on here. Can
you provide the info listed in the link below so we have some idea
of what configuration the error occurred on?
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
You didn't run out of space or something unusual like that? Does
'xfs_repair -n <dev>' report any errors?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Internal error xfs_trans_cancel
2016-06-01 7:10 ` Dave Chinner
@ 2016-06-01 13:50 ` Daniel Wagner
-1 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-01 13:50 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
On 06/01/2016 09:10 AM, Dave Chinner wrote:
> On Wed, Jun 01, 2016 at 07:52:31AM +0200, Daniel Wagner wrote:
>> I got the error message below while compiling a kernel
>> on that system. I can't really say if I did something
>> which made the file system unhappy before the crash.
>>
>>
>> [Jun 1 07:41] XFS (sde1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c. Caller xfs_rename+0x453/0x960 [xfs]
>
> Anything in the log before this?
Just the usual stuff, as I remember. Sorry, I haven't copied the whole log.
>> [ +0.000095] CPU: 22 PID: 8640 Comm: gcc Not tainted 4.7.0-rc1 #16
>> [ +0.000035] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
>> [ +0.000048] 0000000000000286 00000000c8be6bc3 ffff885fa9473cb0 ffffffff813d146e
>> [ +0.000056] ffff885fa9ac5ed0 0000000000000001 ffff885fa9473cc8 ffffffffa0213cdc
>> [ +0.000053] ffffffffa02257b3 ffff885fa9473cf0 ffffffffa022eb36 ffff883faa502d00
>> [ +0.000053] Call Trace:
>> [ +0.000028] [<ffffffff813d146e>] dump_stack+0x63/0x85
>> [ +0.000069] [<ffffffffa0213cdc>] xfs_error_report+0x3c/0x40 [xfs]
>> [ +0.000065] [<ffffffffa02257b3>] ? xfs_rename+0x453/0x960 [xfs]
>> [ +0.000064] [<ffffffffa022eb36>] xfs_trans_cancel+0xb6/0xe0 [xfs]
>> [ +0.000065] [<ffffffffa02257b3>] xfs_rename+0x453/0x960 [xfs]
>> [ +0.000062] [<ffffffffa021fa63>] xfs_vn_rename+0xb3/0xf0 [xfs]
>> [ +0.000040] [<ffffffff8124f92c>] vfs_rename+0x58c/0x8d0
>> [ +0.000032] [<ffffffff81253fb1>] SyS_rename+0x371/0x390
>> [ +0.000036] [<ffffffff817d2032>] entry_SYSCALL_64_fastpath+0x1a/0xa4
>> [ +0.000040] XFS (sde1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffa022eb4f
>> [ +0.027680] XFS (sde1): Corruption of in-memory data detected. Shutting down filesystem
>> [ +0.000057] XFS (sde1): Please umount the filesystem and rectify the problem(s)
>> [Jun 1 07:42] XFS (sde1): xfs_log_force: error -5 returned.
>> [ +30.081016] XFS (sde1): xfs_log_force: error -5 returned.
>
> Doesn't normally happen, and there's not a lot to go on here.
Restarted the box and did a couple of kernel builds and
everything was fine.
> Can
> you provide the info listed in the link below so we have some idea
> of what configuration the error occurred on?
Sure, forgot that in the first post.
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
# uname -r
4.7.0-rc1-00003-g1f55b0d
# xfs_repair -V
xfs_repair version 4.5.0
# cat /proc/cpuinfo | grep CPU | wc -l
64
# cat /proc/meminfo
MemTotal: 528344752 kB
MemFree: 526838036 kB
MemAvailable: 525265612 kB
Buffers: 2716 kB
Cached: 216896 kB
SwapCached: 0 kB
Active: 119924 kB
Inactive: 116552 kB
Active(anon): 17416 kB
Inactive(anon): 1108 kB
Active(file): 102508 kB
Inactive(file): 115444 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 16972 kB
Mapped: 25288 kB
Shmem: 1616 kB
Slab: 184920 kB
SReclaimable: 60028 kB
SUnreclaim: 124892 kB
KernelStack: 13120 kB
PageTables: 2292 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 264172376 kB
Committed_AS: 270612 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 232256 kB
DirectMap2M: 7061504 kB
DirectMap1G: 531628032 kB
# cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=264153644k,nr_inodes=66038411,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/sda2 / xfs rw,relatime,attr2,inode64,noquota 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=27,pgrp=1,timeout=0,minproto=5,maxproto=5,direct 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
tmpfs /tmp tmpfs rw 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sdc1 /mnt/sdc1 xfs rw,relatime,attr2,inode64,noquota 0 0
/dev/sda1 /boot ext4 rw,relatime,data=ordered 0 0
/dev/sde2 /mnt/yocto xfs rw,relatime,attr2,inode64,noquota 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=52834476k,mode=700 0 0
/dev/sde1 /home xfs rw,relatime,attr2,inode64,noquota 0 0
# cat /proc/partitions
major minor #blocks name
11 0 1048575 sr0
8 64 249430016 sde
8 65 104857600 sde1
8 66 144571375 sde2
8 48 142737408 sdd
8 16 142737408 sdb
8 32 142737408 sdc
8 33 142736367 sdc1
8 0 142737408 sda
8 1 5120000 sda1
8 2 104857600 sda2
No RAID
No LVM
HDD (sda, sdb, sdc, sdd):
Manufacturer TOSHIBA
Product ID MK1401GRRB
SSD (sde):
Manufacturer Samsung
Product ID Samsung SSD 850
Revision 1B6Q
# hdparm -I /dev/sde
/dev/sde:
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0d 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ATA device, with non-removable media
Standards:
Likely used: 1
Configuration:
Logical max current
cylinders 0 0
heads 0 0
sectors/track 0 0
--
Logical/Physical Sector size: 512 bytes
device size with M = 1024*1024: 0 MBytes
device size with M = 1000*1000: 0 MBytes
cache/buffer size = unknown
Capabilities:
IORDY not likely
Cannot perform double-word IO
R/W multiple sector transfer: not supported
DMA: not supported
PIO: pio0
# xfs_info /dev/sde1
meta-data=/dev/sde1 isize=256 agcount=4, agsize=6553600 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0 finobt=0 spinodes=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
> You didn't run out of space or something unusual like that?
It should have enough space for building the kernel. I haven't
experienced any problems with that disk or partition in the
last half year. It's my test box, so it gets exposed to
many -rc kernels and test patches. I've never seen any
problems in xfs so far.
Filesystem Size Used Avail Use% Mounted on
/dev/sde1 100G 72G 29G 72% /home
> Does 'xfs_repair -n <dev>' report any errors?
# xfs_repair -n /dev/sde1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 3
- agno = 2
- agno = 1
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
cheers,
daniel
* Re: Internal error xfs_trans_cancel
2016-06-01 13:50 ` Daniel Wagner
@ 2016-06-01 14:13 ` Daniel Wagner
-1 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-01 14:13 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
[-- Attachment #1: Type: text/plain, Size: 8297 bytes --]
>> Anything in the log before this?
>
> Just the usual stuff, as I remember. Sorry, I haven't copied the whole log.
Just triggered it again. My steps for it are:
- run all lockperf tests
git://git.samba.org/jlayton/lockperf.git
via my test script:
#!/bin/sh

run_tests () {
    echo $1
    for i in `seq 10`; do
        rm -rf /tmp/a
        $1 /tmp/a > /dev/null
        sync
    done
    for i in `seq 100`; do
        rm -rf /tmp/a
        $1 /tmp/a >> $2
        sync
    done
}

PATH=~/src/lockperf:$PATH
DIR=$1-`uname -r`

if [ ! -d "$DIR" ]; then
    mkdir $DIR
fi

CPUSET=`cat /sys/devices/system/node/node0/cpulist`
taskset -pc $CPUSET $$

sudo sh -c 'for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i ; done'

for c in `seq 8 32 128`; do
    for l in `seq 100 100 500`; do
        time run_tests "posix01 -n $c -l $l " $DIR/posix01-$c-$l.data
        time run_tests "posix02 -n $c -l $l " $DIR/posix02-$c-$l.data
        time run_tests "posix03 -n $c -l $l " $DIR/posix03-$c-$l.data
        time run_tests "posix04 -n $c -l $l " $DIR/posix04-$c-$l.data
        time run_tests "flock01 -n $c -l $l " $DIR/flock01-$c-$l.data
        time run_tests "flock02 -n $c -l $l " $DIR/flock02-$c-$l.data
        time run_tests "lease01 -n $c -l $l " $DIR/lease01-$c-$l.data
        time run_tests "lease02 -n $c -l $l " $DIR/lease02-$c-$l.data
    done
done

sudo sh -c 'for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo powersave > $i ; done'
And after that rebuild a new kernel. That was all.
This time I saved the logs. xfs_repair was not so happy either.
cheers,
daniel
[-- Attachment #2: dmesg.log.xz --]
[-- Type: application/x-xz, Size: 19596 bytes --]
[-- Attachment #3: xfs_repair.log.xz --]
[-- Type: application/x-xz, Size: 16412 bytes --]
* Re: Internal error xfs_trans_cancel
2016-06-01 14:13 ` Daniel Wagner
@ 2016-06-01 14:19 ` Daniel Wagner
-1 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-01 14:19 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
> via my test script:
Looks like my email client did not agree with my formatting of the script.
https://www.monom.org/data/lglock/run-tests.sh
* Re: Internal error xfs_trans_cancel
@ 2016-06-01 14:19 ` Daniel Wagner
0 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-01 14:19 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
> via my test script:
Looks like my email client did not agree with my formatting of the script.
https://www.monom.org/data/lglock/run-tests.sh
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Internal error xfs_trans_cancel
2016-06-01 14:13 ` Daniel Wagner
@ 2016-06-02 0:26 ` Dave Chinner
-1 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2016-06-02 0:26 UTC (permalink / raw)
To: Daniel Wagner; +Cc: linux-fsdevel, linux-kernel, xfs
On Wed, Jun 01, 2016 at 04:13:10PM +0200, Daniel Wagner wrote:
> >> Anything in the log before this?
> >
> > Just the usual stuff, as I remember. Sorry, I haven't copied the whole log.
>
> Just triggered it again. My steps for it are:
>
> - run all lockperf test
>
> git://git.samba.org/jlayton/lockperf.git
>
> via my test script:
>
> #!/bin/sh
>
> run_tests () {
.....
> for c in `seq 8 32 128`; do
> for l in `seq 100 100 500`; do
> time run_tests "posix01 -n $c -l $l " $DIR/posix01-$c-$l.data
> time run_tests "posix02 -n $c -l $l " $DIR/posix02-$c-$l.data
> time run_tests "posix03 -n $c -l $l " $DIR/posix03-$c-$l.data
> time run_tests "posix04 -n $c -l $l " $DIR/posix04-$c-$l.data
posix03 and posix04 just emit error messages:
posix04 -n 40 -l 100
posix04: invalid option -- 'l'
posix04: Usage: posix04 [-i iterations] [-n nr_children] [-s] <filename>
.....
So I changed them to run "-i $l" instead, and that has a somewhat
undesired effect:
static void
kill_children()
{
	siginfo_t infop;

	signal(SIGINT, SIG_IGN);
>>>>>	kill(0, SIGINT);
	while (waitid(P_ALL, 0, &infop, WEXITED) != -1);
}
Yeah, it sends a SIGINT to every process in the caller's process group. It
kills the parent shell:
$ ./run-lockperf-tests.sh /mnt/scratch/
pid 9597's current affinity list: 0-15
pid 9597's new affinity list: 0,4,8,12
sh: 1: cannot create /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor: Directory nonexistent
posix01 -n 8 -l 100
posix02 -n 8 -l 100
posix03 -n 8 -i 100
$
So, I've just removed those tests from your script. I'll see if I
have any luck with reproducing the problem now.
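A simple way to guard against this without patching lockperf is to launch
each test under setsid(1), so a kill(0, SIGINT) inside the test can only
reach the test's own process group. A sketch, assuming util-linux setsid;
the sh -c command below just stands in for a lockperf test:

```shell
#!/bin/sh
# Run a command in its own session so that a kill(0, SIGINT) issued
# inside it cannot reach the invoking shell.

# The inner shell signals its own process group; after setsid that
# group contains only the inner shell, so only it dies.
setsid sh -c 'kill -INT 0; echo "not reached"'

# The invoking shell is in a different process group and survives.
echo "parent shell survived"
```

The same wrapping works in the run-tests script, e.g.
`setsid posix03 -n $c -i $l`.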
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Internal error xfs_trans_cancel
2016-06-02 0:26 ` Dave Chinner
@ 2016-06-02 5:23 ` Daniel Wagner
-1 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-02 5:23 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
> posix03 and posix04 just emit error messages:
>
> posix04 -n 40 -l 100
> posix04: invalid option -- 'l'
> posix04: Usage: posix04 [-i iterations] [-n nr_children] [-s] <filename>
> .....
I screwed this up. I have patched my version of lockperf to make
all tests use the same option names, but forgot to send those
patches. Will do now.
In this case you can use '-i' instead of '-l'.
> So I changed them to run "-i $l" instead, and that has a somewhat
> undesired effect:
>
> static void
> kill_children()
> {
> siginfo_t infop;
>
> signal(SIGINT, SIG_IGN);
>>>>>> kill(0, SIGINT);
> while (waitid(P_ALL, 0, &infop, WEXITED) != -1);
> }
>
> Yeah, it sends a SIGINT to everything with a process group id. It
> kills the parent shell:
Ah, that rings a bell. I tuned the parameters so that I did not run into
this problem. I'll do a patch for this one. It's pretty annoying.
> $ ./run-lockperf-tests.sh /mnt/scratch/
> pid 9597's current affinity list: 0-15
> pid 9597's new affinity list: 0,4,8,12
> sh: 1: cannot create /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor: Directory nonexistent
> posix01 -n 8 -l 100
> posix02 -n 8 -l 100
> posix03 -n 8 -i 100
>
> $
>
> So, I've just removed those tests from your script. I'll see if I
> have any luck with reproducing the problem now.
I was able to reproduce it again with the same steps.
* Re: Internal error xfs_trans_cancel
2016-06-02 5:23 ` Daniel Wagner
@ 2016-06-02 6:35 ` Dave Chinner
-1 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2016-06-02 6:35 UTC (permalink / raw)
To: Daniel Wagner; +Cc: linux-fsdevel, linux-kernel, xfs
On Thu, Jun 02, 2016 at 07:23:24AM +0200, Daniel Wagner wrote:
> > posix03 and posix04 just emit error messages:
> >
> > posix04 -n 40 -l 100
> > posix04: invalid option -- 'l'
> > posix04: Usage: posix04 [-i iterations] [-n nr_children] [-s] <filename>
> > .....
>
> I screwed that this up. I have patched my version of lockperf to make
> all test using the same options names. Though forgot to send those
> patches. Will do now.
>
> In this case you can use use '-i' instead of '-l'.
>
> > So I changed them to run "-i $l" instead, and that has a somewhat
> > undesired effect:
> >
> > static void
> > kill_children()
> > {
> > siginfo_t infop;
> >
> > signal(SIGINT, SIG_IGN);
> >>>>>> kill(0, SIGINT);
> > while (waitid(P_ALL, 0, &infop, WEXITED) != -1);
> > }
> >
> > Yeah, it sends a SIGINT to everything with a process group id. It
> > kills the parent shell:
>
> Ah that rings a bell. I tuned the parameters so that I did not run into
> this problem. I'll do patch for this one. It's pretty annoying.
>
> > $ ./run-lockperf-tests.sh /mnt/scratch/
> > pid 9597's current affinity list: 0-15
> > pid 9597's new affinity list: 0,4,8,12
> > sh: 1: cannot create /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor: Directory nonexistent
> > posix01 -n 8 -l 100
> > posix02 -n 8 -l 100
> > posix03 -n 8 -i 100
> >
> > $
> >
> > So, I've just removed those tests from your script. I'll see if I
> > have any luck with reproducing the problem now.
>
> I was able to reproduce it again with the same steps.
Hmmm, OK. I've been running the lockperf tests and kernel builds all
day on a filesystem that is identical in shape and size to yours
(i.e. the xfs_info output is the same), but I haven't reproduced it yet.
Is it possible to get a metadump image of your filesystem to see if
I can reproduce it on that?
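The usual workflow for that is roughly the following sketch; the device
name, mount point and file names are placeholders, and the filesystem
should be unmounted first:

```shell
#!/bin/sh
# Capture an XFS metadata-only dump and rebuild it as a sparse image
# for offline analysis. /dev/sdXN and the file names are placeholders.

umount /dev/sdXN
xfs_metadump -g /dev/sdXN fs.metadump   # -g prints progress; file names are
                                        # obfuscated by default (-o disables that)
xz fs.metadump                          # compress before sending

# On the receiving end, rebuild a sparse image from the dump:
xz -d fs.metadump.xz
xfs_mdrestore fs.metadump fs.img
xfs_repair -nf fs.img                   # -n: check only, -f: image is a regular file
```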
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Internal error xfs_trans_cancel
2016-06-02 6:35 ` Dave Chinner
@ 2016-06-02 13:29 ` Daniel Wagner
-1 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-02 13:29 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, xfs
> Hmmm, Ok. I've been running the lockperf test and kernel builds all
> day on a filesystem that is identical in shape and size to yours
> (i.e. xfs_info output is the same) but I haven't reproduced it yet.
I don't know if this is important: I run the lockperf tests, and after
they have finished I do a kernel build.
> Is it possible to get a metadump image of your filesystem to see if
> I can reproduce it on that?
Sure, see private mail.
* Re: Internal error xfs_trans_cancel
2016-06-01 5:52 ` Daniel Wagner
@ 2016-06-14 4:29 ` Josh Poimboeuf
-1 siblings, 0 replies; 24+ messages in thread
From: Josh Poimboeuf @ 2016-06-14 4:29 UTC (permalink / raw)
To: Daniel Wagner; +Cc: Dave Chinner, linux-fsdevel, linux-kernel, xfs
On Wed, Jun 01, 2016 at 07:52:31AM +0200, Daniel Wagner wrote:
> Hi,
>
> I got the error message below while compiling a kernel
> on that system. I can't really say if I did something
> which made the file system unhappy before the crash.
>
>
> [Jun 1 07:41] XFS (sde1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c. Caller xfs_rename+0x453/0x960 [xfs]
> [ +0.000095] CPU: 22 PID: 8640 Comm: gcc Not tainted 4.7.0-rc1 #16
> [ +0.000035] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
> [ +0.000048] 0000000000000286 00000000c8be6bc3 ffff885fa9473cb0 ffffffff813d146e
> [ +0.000056] ffff885fa9ac5ed0 0000000000000001 ffff885fa9473cc8 ffffffffa0213cdc
> [ +0.000053] ffffffffa02257b3 ffff885fa9473cf0 ffffffffa022eb36 ffff883faa502d00
> [ +0.000053] Call Trace:
> [ +0.000028] [<ffffffff813d146e>] dump_stack+0x63/0x85
> [ +0.000069] [<ffffffffa0213cdc>] xfs_error_report+0x3c/0x40 [xfs]
> [ +0.000065] [<ffffffffa02257b3>] ? xfs_rename+0x453/0x960 [xfs]
> [ +0.000064] [<ffffffffa022eb36>] xfs_trans_cancel+0xb6/0xe0 [xfs]
> [ +0.000065] [<ffffffffa02257b3>] xfs_rename+0x453/0x960 [xfs]
> [ +0.000062] [<ffffffffa021fa63>] xfs_vn_rename+0xb3/0xf0 [xfs]
> [ +0.000040] [<ffffffff8124f92c>] vfs_rename+0x58c/0x8d0
> [ +0.000032] [<ffffffff81253fb1>] SyS_rename+0x371/0x390
> [ +0.000036] [<ffffffff817d2032>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> [ +0.000040] XFS (sde1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffa022eb4f
> [ +0.027680] XFS (sde1): Corruption of in-memory data detected. Shutting down filesystem
> [ +0.000057] XFS (sde1): Please umount the filesystem and rectify the problem(s)
> [Jun 1 07:42] XFS (sde1): xfs_log_force: error -5 returned.
> [ +30.081016] XFS (sde1): xfs_log_force: error -5 returned.
I saw this today. I was just building/installing kernels, rebooting,
running kexec, running perf.
[ 1359.005573] ------------[ cut here ]------------
[ 1359.010191] WARNING: CPU: 4 PID: 6031 at fs/inode.c:280 drop_nlink+0x3e/0x50
[ 1359.017231] Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_powerclamp coretemp kvm_intel kvm nfsd ipmi_ssif ipmi_devintf ipmi_si iTCO_wdt irqbypass iTCO_vendor_support ipmi_msghandler i7core_edac shpchp sg edac_core pcspkr wmi lpc_ich dcdbas mfd_core acpi_power_meter auth_rpcgss acpi_cpufreq nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod sr_mod cdrom iw_cxgb3 ib_core mgag200 ata_generic pata_acpi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mptsas scsi_transport_sas ata_piix mptscsih libata cxgb3 crc32c_intel i2c_core serio_raw mptbase bnx2 fjes mdio dm_mirror dm_region_hash dm_log dm_mod
[ 1359.088447] CPU: 4 PID: 6031 Comm: depmod Tainted: G I 4.7.0-rc3+ #4
[ 1359.095911] Hardware name: Dell Inc. PowerEdge R410/0N051F, BIOS 1.11.0 07/20/2012
[ 1359.103461] 0000000000000286 00000000a0bc39d9 ffff8802143dfd18 ffffffff8134bb7f
[ 1359.110871] 0000000000000000 0000000000000000 ffff8802143dfd58 ffffffff8108b671
[ 1359.118280] 00000118575f7d13 ffff880222c9a6e8 ffff8803ec3874d8 ffff880428827000
[ 1359.125693] Call Trace:
[ 1359.128133] [<ffffffff8134bb7f>] dump_stack+0x63/0x84
[ 1359.133259] [<ffffffff8108b671>] __warn+0xd1/0xf0
[ 1359.138037] [<ffffffff8108b7ad>] warn_slowpath_null+0x1d/0x20
[ 1359.143855] [<ffffffff81238fde>] drop_nlink+0x3e/0x50
[ 1359.149017] [<ffffffffa0327148>] xfs_droplink+0x28/0x60 [xfs]
[ 1359.154864] [<ffffffffa0328c81>] xfs_remove+0x231/0x350 [xfs]
[ 1359.160682] [<ffffffff812cd70a>] ? security_inode_permission+0x3a/0x60
[ 1359.167309] [<ffffffffa03235e8>] xfs_vn_unlink+0x58/0xa0 [xfs]
[ 1359.173213] [<ffffffff812d7e33>] ? selinux_inode_unlink+0x13/0x20
[ 1359.179379] [<ffffffff8122b29a>] vfs_unlink+0xda/0x190
[ 1359.184590] [<ffffffff8122df53>] do_unlinkat+0x263/0x2a0
[ 1359.189974] [<ffffffff8122ea1b>] SyS_unlinkat+0x1b/0x30
[ 1359.195272] [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[ 1359.200743] [<ffffffff816d7961>] entry_SYSCALL64_slow_path+0x25/0x25
[ 1359.207178] ---[ end trace 0d397afdaff9f340 ]---
[ 1359.211830] XFS (dm-0): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c. Caller xfs_remove+0x1d1/0x350 [xfs]
[ 1359.223723] CPU: 4 PID: 6031 Comm: depmod Tainted: G W I 4.7.0-rc3+ #4
[ 1359.231185] Hardware name: Dell Inc. PowerEdge R410/0N051F, BIOS 1.11.0 07/20/2012
[ 1359.238736] 0000000000000286 00000000a0bc39d9 ffff8802143dfd60 ffffffff8134bb7f
[ 1359.246147] ffff8803ec3874d8 0000000000000001 ffff8802143dfd78 ffffffffa03176bb
[ 1359.253559] ffffffffa0328c21 ffff8802143dfda0 ffffffffa03327a6 ffff880222e7e180
[ 1359.260969] Call Trace:
[ 1359.263407] [<ffffffff8134bb7f>] dump_stack+0x63/0x84
[ 1359.268560] [<ffffffffa03176bb>] xfs_error_report+0x3b/0x40 [xfs]
[ 1359.274755] [<ffffffffa0328c21>] ? xfs_remove+0x1d1/0x350 [xfs]
[ 1359.280778] [<ffffffffa03327a6>] xfs_trans_cancel+0xb6/0xe0 [xfs]
[ 1359.286973] [<ffffffffa0328c21>] xfs_remove+0x1d1/0x350 [xfs]
[ 1359.292820] [<ffffffffa03235e8>] xfs_vn_unlink+0x58/0xa0 [xfs]
[ 1359.298724] [<ffffffff812d7e33>] ? selinux_inode_unlink+0x13/0x20
[ 1359.304890] [<ffffffff8122b29a>] vfs_unlink+0xda/0x190
[ 1359.310100] [<ffffffff8122df53>] do_unlinkat+0x263/0x2a0
[ 1359.315486] [<ffffffff8122ea1b>] SyS_unlinkat+0x1b/0x30
[ 1359.320784] [<ffffffff81003b12>] do_syscall_64+0x62/0x110
[ 1359.326256] [<ffffffff816d7961>] entry_SYSCALL64_slow_path+0x25/0x25
[ 1359.332692] XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffa03327bf
[ 1360.461638] XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem
[ 1360.469729] XFS (dm-0): Please umount the filesystem and rectify the problem(s)
[ 1360.595843] XFS (dm-0): xfs_log_force: error -5 returned.
# uname -a
Linux dell-per410-01.khw.lab.eng.bos.redhat.com 4.7.0-rc3+ #5 SMP Mon Jun 13 23:35:14 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
# xfs_repair -V
xfs_repair version 3.2.2
# cat /proc/cpuinfo | grep CPU | wc -l
16
# cat /proc/meminfo
MemTotal: 16415296 kB
MemFree: 15723380 kB
MemAvailable: 15796192 kB
Buffers: 964 kB
Cached: 350700 kB
SwapCached: 0 kB
Active: 248992 kB
Inactive: 223000 kB
Active(anon): 121176 kB
Inactive(anon): 8116 kB
Active(file): 127816 kB
Inactive(file): 214884 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 8257532 kB
SwapFree: 8257532 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 120340 kB
Mapped: 40136 kB
Shmem: 8964 kB
Slab: 80092 kB
SReclaimable: 25208 kB
SUnreclaim: 54884 kB
KernelStack: 5872 kB
PageTables: 5468 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 16465180 kB
Committed_AS: 355084 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 51200 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 177572 kB
DirectMap2M: 16586752 kB
# cat /proc/mounts
sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=8161852k,nr_inodes=2040463,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,seclabel,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/rhel_dell--per410--01-root / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=28,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
debugfs /sys/kernel/debug debugfs rw,seclabel,relatime 0 0
mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
/dev/mapper/rhel_dell--per410--01-home /home xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sda1 /boot xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
tmpfs /run/user/0 tmpfs rw,seclabel,nosuid,nodev,relatime,size=1641532k,mode=700 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,relatime 0 0
# cat /proc/partitions
major minor #blocks name
11 0 1048575 sr0
8 16 143374740 sdb
8 17 143373312 sdb1
8 0 143374740 sda
8 1 512000 sda1
8 2 142861312 sda2
253 0 52428800 dm-0
253 1 8257536 dm-1
253 2 225480704 dm-2
# pvdisplay
--- Physical volume ---
PV Name /dev/sda2
VG Name rhel_dell-per410-01
PV Size 136.24 GiB / not usable 0
Allocatable yes
PE Size 4.00 MiB
Total PE 34878
Free PE 16
Allocated PE 34862
PV UUID cTa6X3-dz3E-HmdE-bY1J-XEoo-USwY-dl2lRm
--- Physical volume ---
PV Name /dev/sdb1
VG Name rhel_dell-per410-01
PV Size 136.73 GiB / not usable 0
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 35003
Free PE 0
Allocated PE 35003
PV UUID ZzXTQx-9CN1-TaMu-UfrN-Usuz-aFvl-A6PKJS
# lvdisplay
--- Logical volume ---
LV Path /dev/rhel_dell-per410-01/swap
LV Name swap
VG Name rhel_dell-per410-01
LV UUID E6Y5qQ-URKt-9wc6-3fRc-2wbZ-ev2n-IliB7s
LV Write Access read/write
LV Creation host, time dell-per410-01.khw.lab.eng.bos.redhat.com, 2016-06-13 12:55:31 -0400
LV Status available
# open 2
LV Size 7.88 GiB
Current LE 2016
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:1
--- Logical volume ---
LV Path /dev/rhel_dell-per410-01/home
LV Name home
VG Name rhel_dell-per410-01
LV UUID Zq6BIP-0Yem-3NAp-gJ5K-2c6Q-Zc67-mdp51k
LV Write Access read/write
LV Creation host, time dell-per410-01.khw.lab.eng.bos.redhat.com, 2016-06-13 12:55:31 -0400
LV Status available
# open 1
LV Size 215.04 GiB
Current LE 55049
Segments 2
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:2
--- Logical volume ---
LV Path /dev/rhel_dell-per410-01/root
LV Name root
VG Name rhel_dell-per410-01
LV UUID T4rKVg-cuiW-jc6c-grNW-DJQQ-mCt8-N8Ig5l
LV Write Access read/write
LV Creation host, time dell-per410-01.khw.lab.eng.bos.redhat.com, 2016-06-13 12:55:35 -0400
LV Status available
# open 1
LV Size 50.00 GiB
Current LE 12800
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:0
# hdparm -i /dev/sda
/dev/sda:
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 28 00 00 00 00 20 00 00 00 00 00 00 85 55 06 01 00 00 00 00 00 00 00 00 00
HDIO_GET_IDENTITY failed: Invalid argument
# hdparm -i /dev/sdb
/dev/sdb:
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 28 00 00 00 00 20 00 00 00 00 00 00 85 55 06 01 00 00 00 00 00 00 00 00 00
HDIO_GET_IDENTITY failed: Invalid argument
(I don't know anything about the disks, but I can try to find out if you
need it.)
# xfs_info /dev/mapper/rhel_dell--per410--01-root
meta-data=/dev/mapper/rhel_dell--per410--01-root isize=256 agcount=4, agsize=3276800 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=13107200, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal bsize=4096 blocks=6400, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
--
Josh
HardwareCorrupted: 0 kB
AnonHugePages: 51200 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 177572 kB
DirectMap2M: 16586752 kB
# cat /proc/mounts
sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=8161852k,nr_inodes=2040463,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0
devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,seclabel,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/rhel_dell--per410--01-root / xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=28,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
debugfs /sys/kernel/debug debugfs rw,seclabel,relatime 0 0
mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
/dev/mapper/rhel_dell--per410--01-home /home xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sda1 /boot xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0
tmpfs /run/user/0 tmpfs rw,seclabel,nosuid,nodev,relatime,size=1641532k,mode=700 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,relatime 0 0
# cat /proc/partitions
major minor #blocks name
11 0 1048575 sr0
8 16 143374740 sdb
8 17 143373312 sdb1
8 0 143374740 sda
8 1 512000 sda1
8 2 142861312 sda2
253 0 52428800 dm-0
253 1 8257536 dm-1
253 2 225480704 dm-2
# pvdisplay
--- Physical volume ---
PV Name /dev/sda2
VG Name rhel_dell-per410-01
PV Size 136.24 GiB / not usable 0
Allocatable yes
PE Size 4.00 MiB
Total PE 34878
Free PE 16
Allocated PE 34862
PV UUID cTa6X3-dz3E-HmdE-bY1J-XEoo-USwY-dl2lRm
--- Physical volume ---
PV Name /dev/sdb1
VG Name rhel_dell-per410-01
PV Size 136.73 GiB / not usable 0
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 35003
Free PE 0
Allocated PE 35003
PV UUID ZzXTQx-9CN1-TaMu-UfrN-Usuz-aFvl-A6PKJS
# lvdisplay
--- Logical volume ---
LV Path /dev/rhel_dell-per410-01/swap
LV Name swap
VG Name rhel_dell-per410-01
LV UUID E6Y5qQ-URKt-9wc6-3fRc-2wbZ-ev2n-IliB7s
LV Write Access read/write
LV Creation host, time dell-per410-01.khw.lab.eng.bos.redhat.com, 2016-06-13 12:55:31 -0400
LV Status available
# open 2
LV Size 7.88 GiB
Current LE 2016
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:1
--- Logical volume ---
LV Path /dev/rhel_dell-per410-01/home
LV Name home
VG Name rhel_dell-per410-01
LV UUID Zq6BIP-0Yem-3NAp-gJ5K-2c6Q-Zc67-mdp51k
LV Write Access read/write
LV Creation host, time dell-per410-01.khw.lab.eng.bos.redhat.com, 2016-06-13 12:55:31 -0400
LV Status available
# open 1
LV Size 215.04 GiB
Current LE 55049
Segments 2
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:2
--- Logical volume ---
LV Path /dev/rhel_dell-per410-01/root
LV Name root
VG Name rhel_dell-per410-01
LV UUID T4rKVg-cuiW-jc6c-grNW-DJQQ-mCt8-N8Ig5l
LV Write Access read/write
LV Creation host, time dell-per410-01.khw.lab.eng.bos.redhat.com, 2016-06-13 12:55:35 -0400
LV Status available
# open 1
LV Size 50.00 GiB
Current LE 12800
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:0
# hdparm -i /dev/sda
/dev/sda:
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 28 00 00 00 00 20 00 00 00 00 00 00 85 55 06 01 00 00 00 00 00 00 00 00 00
HDIO_GET_IDENTITY failed: Invalid argument
# hdparm -i /dev/sdb
/dev/sdb:
SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 28 00 00 00 00 20 00 00 00 00 00 00 85 55 06 01 00 00 00 00 00 00 00 00 00
HDIO_GET_IDENTITY failed: Invalid argument
(I don't know anything about the disks, but I can try to find out if you
need it.)
# xfs_info /dev/mapper/rhel_dell--per410--01-root
meta-data=/dev/mapper/rhel_dell--per410--01-root isize=256 agcount=4, agsize=3276800 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=13107200, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal bsize=4096 blocks=6400, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
--
Josh
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Internal error xfs_trans_cancel
2016-06-02 13:29 ` Daniel Wagner
@ 2016-06-26 12:16 ` Thorsten Leemhuis
-1 siblings, 0 replies; 24+ messages in thread
From: Thorsten Leemhuis @ 2016-06-26 12:16 UTC (permalink / raw)
To: Daniel Wagner, Dave Chinner
Cc: linux-fsdevel, linux-kernel, xfs, Josh Poimboeuf
On 02.06.2016 15:29, Daniel Wagner wrote:
>> Hmmm, Ok. I've been running the lockperf test and kernel builds all
>> day on a filesystem that is identical in shape and size to yours
>> (i.e. xfs_info output is the same) but I haven't reproduced it yet.
> I don't know if that is important: I run the lockperf test and after
> they have finished I do a kernel build.
>
>> Is it possible to get a metadump image of your filesystem to see if
>> I can reproduce it on that?
> Sure, see private mail.
Dave, Daniel, what's the latest status on this issue? It made it to my
list of known 4.7 regressions after Christoph suggested it should be
listed. But this thread looks stalled: afaics nothing has happened for
three weeks, apart from Josh (added to CC) mentioning he also saw it. Or
is this discussed elsewhere? Or fixed already?
Sincerely, your regression tracker for Linux 4.7 (http://bit.ly/28JRmJo)
Thorsten
* Re: Internal error xfs_trans_cancel
2016-06-26 12:16 ` Thorsten Leemhuis
@ 2016-06-26 15:13 ` Daniel Wagner
-1 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2016-06-26 15:13 UTC (permalink / raw)
To: Thorsten Leemhuis, Dave Chinner
Cc: linux-fsdevel, linux-kernel, xfs, Josh Poimboeuf
On 06/26/2016 02:16 PM, Thorsten Leemhuis wrote:
> On 02.06.2016 15:29, Daniel Wagner wrote:
>>> Hmmm, Ok. I've been running the lockperf test and kernel builds all
>>> day on a filesystem that is identical in shape and size to yours
>>> (i.e. xfs_info output is the same) but I haven't reproduced it yet.
>> I don't know if that is important: I run the lockperf test and after
>> they have finished I do a kernel build.
>>
>>> Is it possible to get a metadump image of your filesystem to see if
>>> I can reproduce it on that?
>> Sure, see private mail.
>
> Dave, Daniel, what's the latest status on this issue?
I had no time to do more testing in the last couple of weeks. Tomorrow
I'll try to reproduce it again, though last time I tried I couldn't
trigger it.
> It made it to my
> list of know 4.7 regressions after Christoph suggested it should be
> listed. But this thread looks stalled, as afaics nothing happened for
> three weeks apart from Josh (added to CC) mentioning he also saw it. Or
> is this discussed elsewhere? Or fixed already?
The discussion wandered over to the thread called 'crash in xfs in
current', where Al posted some instructions on what to test:
Message-ID: <20160622014253.GS12670@dastard>
cheers,
daniel
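[Editorial aside on the metadump request earlier in the thread: capturing an
obfuscated XFS metadata image for a bug report generally looks like the sketch
below. The device and output paths are placeholders for illustration, not
values taken from this thread, and the commands are printed rather than
executed since a real, unmounted block device is required.]

```shell
# Illustrative sketch only: how an xfs_metadump for a bug report is
# typically captured. xfs_metadump copies metadata only (no file data)
# and obfuscates filenames by default. Paths below are placeholders.
DEV=/dev/dm-0
OUT=/tmp/dm-0.metadump

# Print the commands instead of running them; a real device is needed.
echo "umount $DEV"
echo "xfs_metadump -g $DEV $OUT"   # -g shows dump progress
echo "xz -9 $OUT"                  # compress the image before mailing it
```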
end of thread, other threads:[~2016-06-26 15:13 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01 5:52 Internal error xfs_trans_cancel Daniel Wagner
2016-06-01 7:10 ` Dave Chinner
2016-06-01 13:50 ` Daniel Wagner
2016-06-01 14:13 ` Daniel Wagner
2016-06-01 14:19 ` Daniel Wagner
2016-06-02 0:26 ` Dave Chinner
2016-06-02 5:23 ` Daniel Wagner
2016-06-02 6:35 ` Dave Chinner
2016-06-02 13:29 ` Daniel Wagner
2016-06-26 12:16 ` Thorsten Leemhuis
2016-06-26 15:13 ` Daniel Wagner
2016-06-14 4:29 ` Josh Poimboeuf