* xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
@ 2013-04-15 23:14 Brian Foster
  2013-04-16 16:24 ` Dave Chinner
  2013-04-22 19:59 ` Eric Sandeen
  0 siblings, 2 replies; 50+ messages in thread
From: Brian Foster @ 2013-04-15 23:14 UTC
  To: yongtaofu; +Cc: sandeen, xfs

Hi,

Thanks for the data in the previous thread:

http://oss.sgi.com/archives/xfs/2013-04/msg00327.html

I'm spinning off a new thread specifically for this because the original
thread is already too large and scattered to track. As Eric stated,
please try to keep data contained in as few messages as possible.

The data confirms Dave's theory: we walk off the end of the unlinked
list when attempting to remove an inode, pass NULLAGINO to
xfs_inotobp(), and the attempted conversion to a global inode number
leads to EINVAL. The next question is why the inode reported in the
probe output was not on the unlinked inode list.
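
As a rough illustration of the failure mode described above, here is a
minimal userspace sketch (not the kernel code). The toy_* names, the
array standing in for the on-disk di_next_unlinked pointers, and the
8-inode size are all assumptions made for the example. The loop only
stops when it finds the target, so a target that is not on the list
ends up handing NULLAGINO to the lookup, which fails with EINVAL:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NULLAGINO	((uint32_t)-1)	/* all-ones "end of list" marker */
#define TOY_NR_INODES	8u		/* hypothetical toy AG size */

/* toy_next[i] stands in for di_next_unlinked of AG-relative inode i. */
static uint32_t toy_next[TOY_NR_INODES];

/* Stand-in for the inode lookup: rejects anything out of range. */
static int toy_lookup(uint32_t agino, uint32_t *next)
{
	if (agino >= TOY_NR_INODES)
		return -EINVAL;		/* NULLAGINO lands here */
	*next = toy_next[agino];
	return 0;
}

/*
 * Walk from the bucket head until we find the inode pointing at 'agino'.
 * Precondition: head != agino (the bucket-head case is handled elsewhere).
 */
static int toy_find_prev(uint32_t head, uint32_t agino, uint32_t *prev)
{
	uint32_t next_agino = head;

	assert(prev != NULL);
	while (next_agino != agino) {
		uint32_t cur = next_agino;
		int error = toy_lookup(cur, &next_agino);

		if (error)
			return error;	/* walked off the end of the list */
		*prev = cur;
	}
	return 0;
}

/* Build the toy list 5 -> 3 -> 1 -> NULLAGINO. */
static void toy_setup(void)
{
	toy_next[5] = 3;
	toy_next[3] = 1;
	toy_next[1] = NULLAGINO;
}
```

After toy_setup(), searching for inode 1 from head 5 succeeds with
prev set to 3, while searching for inode 6 (not on the list) returns
-EINVAL.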

Unfortunately, we will probably need to start making some
debug-level changes to the kernel to make progress on this issue. If you
are able to recompile a kernel and/or xfs module (which you referred to
doing in the previous thread), could you start with the patch appended
to this message[1] and collect the xfs_iunlink and xfs_iunlink_remove
tracepoint data the next time the problem occurs? E.g.,

	echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
	echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
	... reproduce ...
	cat /sys/kernel/debug/tracing/trace > trace.output

Please also include all data relevant to the crash (in a single mail):
trace output, stap script output (so we know what inode to search for in
the trace), references to metadump images, etc. Thanks.

Brian

[1] - Note that the code we're sending here is debug-level and lightly
tested.

---
 fs/xfs/linux-2.6/xfs_trace.h |    2 ++
 fs/xfs/xfs_inode.c           |    4 ++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index e8ce644..227eb33 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
 DEFINE_INODE_EVENT(xfs_destroy_inode);
 DEFINE_INODE_EVENT(xfs_write_inode);
 DEFINE_INODE_EVENT(xfs_clear_inode);
+DEFINE_INODE_EVENT(xfs_iunlink);
+DEFINE_INODE_EVENT(xfs_iunlink_remove);

 DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
 DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 796edce..a43bec5 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1670,6 +1670,8 @@ xfs_iunlink(
 		(sizeof(xfs_agino_t) * bucket_index);
 	xfs_trans_log_buf(tp, agibp, offset,
 			  (offset + sizeof(xfs_agino_t) - 1));
+
+	trace_xfs_iunlink(ip);
 	return 0;
 }

@@ -1820,6 +1822,8 @@ xfs_iunlink_remove(
 				  (offset + sizeof(xfs_agino_t) - 1));
 		xfs_inobp_check(mp, last_ibp);
 	}
+
+	trace_xfs_iunlink_remove(ip);
 	return 0;
 }

-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-15 23:14 xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging Brian Foster
@ 2013-04-16 16:24 ` Dave Chinner
  2013-04-16 17:18   ` Brian Foster
  2013-04-22 19:59 ` Eric Sandeen
  1 sibling, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2013-04-16 16:24 UTC
  To: Brian Foster; +Cc: sandeen, yongtaofu, xfs

On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
> Hi,
> 
> Thanks for the data in the previous thread:
> 
> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> 
> I'm spinning off a new thread specifically for this because the original
> thread is already too large and scattered to track. As Eric stated,
> please try to keep data contained in as few messages as possible.
> 
> The data confirms Dave's theory: we walk off the end of the unlinked
> list when attempting to remove an inode, pass NULLAGINO to
> xfs_inotobp(), and the attempted conversion to a global inode number
> leads to EINVAL. The next question is why the inode reported in the
> probe output was not on the unlinked inode list.
> 
> Unfortunately, we will probably need to start making some
> debug-level changes to the kernel to make progress on this issue. If you
> are able to recompile a kernel and/or xfs module (which you referred to
> doing in the previous thread), could you start with the patch appended
> to this message[1] and collect the xfs_iunlink and xfs_iunlink_remove
> tracepoint data the next time the problem occurs? E.g.,
> 
> 	echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
> 	echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
> 	... reproduce ...
> 	cat /sys/kernel/debug/tracing/trace > trace.output

It's better to use trace-cmd for this, as it will result in fewer
dropped events, i.e.:

	$ trace-cmd record -e xfs_iunlink\*
	... reproduce ...
	^C
	$ trace-cmd report > trace.output

> --- a/fs/xfs/linux-2.6/xfs_trace.h
> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>  DEFINE_INODE_EVENT(xfs_destroy_inode);
>  DEFINE_INODE_EVENT(xfs_write_inode);
>  DEFINE_INODE_EVENT(xfs_clear_inode);
> +DEFINE_INODE_EVENT(xfs_iunlink);
> +DEFINE_INODE_EVENT(xfs_iunlink_remove);
> 
>  DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>  DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 796edce..a43bec5 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1670,6 +1670,8 @@ xfs_iunlink(
>  		(sizeof(xfs_agino_t) * bucket_index);
>  	xfs_trans_log_buf(tp, agibp, offset,
>  			  (offset + sizeof(xfs_agino_t) - 1));
> +
> +	trace_xfs_iunlink(ip);
>  	return 0;
>  }
> 
> @@ -1820,6 +1822,8 @@ xfs_iunlink_remove(
>  				  (offset + sizeof(xfs_agino_t) - 1));
>  		xfs_inobp_check(mp, last_ibp);
>  	}
> +
> +	trace_xfs_iunlink_remove(ip);
>  	return 0;

I would suggest that the tracing should be at the entry of the
function, otherwise we won't get a tracepoint for the operation that
triggers the shutdown. (That's the reason most tracepoints in XFS
are at function entry...)
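
To make the point about tracepoint placement concrete, here is a small
userspace sketch. It is only an analogy: toy_trace() and the two
toy_remove_* functions are hypothetical stand-ins, not XFS code. A
trace call placed just before the final return is never reached when
an earlier error path returns, whereas one placed at entry always
fires:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

static int toy_events;	/* number of trace events emitted so far */

/* Hypothetical stand-in for a tracepoint: count and print the event. */
static void toy_trace(const char *name, unsigned long ino)
{
	toy_events++;
	printf("%s: ino 0x%lx\n", name, ino);
}

/* Tracepoint at exit: skipped whenever an early error return fires. */
static int toy_remove_trace_at_exit(unsigned long ino, int fail)
{
	if (fail)
		return -EINVAL;		/* the failing call leaves no event */
	toy_trace("remove_exit", ino);
	return 0;
}

/* Tracepoint at entry: emitted before anything can fail. */
static int toy_remove_trace_at_entry(unsigned long ino, int fail)
{
	toy_trace("remove_entry", ino);
	if (fail)
		return -EINVAL;
	return 0;
}
```

Calling both with fail set records an event only for the entry
variant, which is the call we most want to see in the trace.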

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-16 16:24 ` Dave Chinner
@ 2013-04-16 17:18   ` Brian Foster
  2013-04-17  1:04     ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Brian Foster @ 2013-04-16 17:18 UTC
  To: Dave Chinner; +Cc: sandeen, yongtaofu, xfs

On 04/16/2013 12:24 PM, Dave Chinner wrote:
> On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
>> Hi,
>>
>> Thanks for the data in the previous thread:
>>
>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>
...
>>
>> 	echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>> 	echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>> 	... reproduce ...
>> 	cat /sys/kernel/debug/tracing/trace > trace.output
> 
> It's better to use trace-cmd for this, as it will result in fewer
> dropped events, i.e.:
> 
> 	$ trace-cmd record -e xfs_iunlink\*
> 	... reproduce ...
> 	^C
> 	$ trace-cmd report > trace.output
> 
>> --- a/fs/xfs/linux-2.6/xfs_trace.h
>> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
...
> 
> I would suggest that the tracing should be at the entry of the
> function, otherwise we won't get a tracepoint for the operation that
> triggers the shutdown. (That's the reason most tracepoints in XFS
> are at function entry...)
> 

Good points, thanks Dave. A v2 that pulls up the tracepoints towards
function entry is appended.

Brian

From 280943e78ebe0b97a774cba51e7815c42f044b55 Mon Sep 17 00:00:00 2001
From: Brian Foster <bfoster@redhat.com>
Date: Mon, 15 Apr 2013 18:16:24 -0400
Subject: [PATCH v2] xfs: add tracepoints for xfs_iunlink and
xfs_iunlink_remove

---
 fs/xfs/linux-2.6/xfs_trace.h |    2 ++
 fs/xfs/xfs_inode.c           |    4 ++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index adc6ec4..338a0f9 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -583,6 +583,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
 DEFINE_INODE_EVENT(xfs_destroy_inode);
 DEFINE_INODE_EVENT(xfs_dirty_inode);
 DEFINE_INODE_EVENT(xfs_clear_inode);
+DEFINE_INODE_EVENT(xfs_iunlink);
+DEFINE_INODE_EVENT(xfs_iunlink_remove);

 DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
 DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 19900f0..d705c77 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1615,6 +1615,8 @@ xfs_iunlink(

 	mp = tp->t_mountp;

+	trace_xfs_iunlink(ip);
+
 	/*
 	 * Get the agi buffer first.  It ensures lock ordering
 	 * on the list.
@@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
 	mp = tp->t_mountp;
 	agno = XFS_INO_TO_AGNO(mp, ip->i_ino);

+	trace_xfs_iunlink_remove(ip);
+
 	/*
 	 * Get the agi buffer first.  It ensures lock ordering
 	 * on the list.
-- 
1.7.7.6


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-16 17:18   ` Brian Foster
@ 2013-04-17  1:04     ` 符永涛
  2013-04-17  1:35       ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-17  1:04 UTC
  To: Brian Foster; +Cc: Eric Sandeen, xfs


Hi Brian,
Thank you for the update. I have applied your last kernel patch.
However, the problem is not easy to reproduce, especially in our test
environment, and it has not happened again so far. I'll update the
kernel patch now. BTW, are there any findings from the logs in the
previous thread?
http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
I guess it tends to happen during a glusterfs rebalance, because
glusterfs moves a lot of files from one server to another and then
unlinks them.

Thank you.



-- 
符永涛


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-17  1:04     ` 符永涛
@ 2013-04-17  1:35       ` 符永涛
  2013-04-17  3:15         ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-17  1:35 UTC
  To: Brian Foster; +Cc: Eric Sandeen, xfs


Hi Brian,
I have a question about the shutdown trace. The ino in
xfs_iunlink_remove is 0x113, so why did xfs_imap get ino=0xffffffff?

--- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return -- return=0x16
vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff

--- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return -- return=0x16
vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=? dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=? last_dip=0xffff882000000000 bucket_index=? offset=? last_offset=0xffffffffffff8810 error=? __func__=[...]
ip: i_ino = 0x113, i_flags = 0x0

Thank you.
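
For reference, the arithmetic behind the two values can be sketched in
userspace as follows. This assumes the global inode number is built
as (agno << agino_bits) | agino and that NULLAGINO is the all-ones
32-bit value; toy_agino_to_ino() is a hypothetical helper, and the
assumption that inode 0x113 lives in AG 0 is only suggested by its
small value:

```c
#include <assert.h>
#include <stdint.h>

#define NULLAGINO	((uint32_t)-1)	/* all-ones "end of list" marker */

/*
 * Combine an AG number and an AG-relative inode number into a global
 * inode number. 'agino_bits' depends on the filesystem geometry; any
 * value passed by a caller here is an assumption for illustration.
 */
static uint64_t toy_agino_to_ino(uint32_t agno, uint32_t agino,
				 unsigned int agino_bits)
{
	return ((uint64_t)agno << agino_bits) | agino;
}
```

With agno 0 the shift width does not matter, so converting NULLAGINO
gives 0xffffffff, which would match the ino seen in xfs_imap if the
walk reached the end of the list in AG 0.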





-- 
符永涛


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-17  1:35       ` 符永涛
@ 2013-04-17  3:15         ` 符永涛
  2013-04-17  3:48           ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-17  3:15 UTC
  To: Brian Foster; +Cc: Eric Sandeen, xfs


Hi Brian,
If the problem is that NULLAGINO is passed in to xfs_inotobp(),
can I move the following two ASSERT lines before the xfs_inotobp() call?

For example:

 	while (next_agino != agino) {
 		/*
 		 * If the last inode wasn't the one pointing to
 		 * us, then release its buffer since we're not
 		 * going to do anything with it.
 		 */
 		if (last_ibp != NULL) {
 			xfs_trans_brelse(tp, last_ibp);
 		}
 		next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
+		ASSERT(next_agino != NULLAGINO);
+		ASSERT(next_agino != 0);
 		error = xfs_inotobp(mp, tp, next_ino, &last_dip,
 				    &last_ibp, &last_offset, 0);
 		if (error) {
 			xfs_warn(mp,
 				"%s: xfs_inotobp() returned error %d.",
 				__func__, error);
 			return error;
 		}
 		next_agino = be32_to_cpu(last_dip->di_next_unlinked);
-		ASSERT(next_agino != NULLAGINO);
-		ASSERT(next_agino != 0);
 	}

I don't understand XFS well, so please correct me if I'm totally wrong.
Thank you very much.




-- 
符永涛


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-17  3:15         ` 符永涛
@ 2013-04-17  3:48           ` 符永涛
  2013-04-17  4:28             ` Eric Sandeen
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-17  3:48 UTC
  To: Brian Foster; +Cc: Eric Sandeen, xfs


Hi Brian,
Can I make the following change?
--- a/xfs_inode.c
+++ b/xfs_inode.c
@@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
                        if (last_ibp != NULL) {
                                xfs_trans_brelse(tp, last_ibp);
                        }
+                        ASSERT(next_agino != NULLAGINO);
+                        ASSERT(next_agino != 0);
                        next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
                        error = xfs_inotobp(mp, tp, next_ino, &last_dip,
                                            &last_ibp, &last_offset, 0);
@@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
                                return error;
                        }
                        next_agino = be32_to_cpu(last_dip->di_next_unlinked);
-                       ASSERT(next_agino != NULLAGINO);
-                       ASSERT(next_agino != 0);
                }
                /*
                 * Now last_ibp points to the buffer previous to us on

Thank you.


2013/4/17 符永涛 <yongtaofu@gmail.com>

> Hi Brain,
> If it is because NULLAGINO is passed in  to xfs_inotobp().
> Can I move the following two lines before xfs_inotobp?
>
> For example:
>
> 1767                 while (next_agino != agino) {
> 1768                         /*
> 1769                          * If the last inode wasn't the one pointing
> to
> 1770                          * us, then release its buffer since we're not
> 1771                          * going to do anything with it.
> 1772                          */
> 1773                         if (last_ibp != NULL) {
> 1774                                 xfs_trans_brelse(tp, last_ibp);
> 1775                         }
> 1776                         next_ino = XFS_AGINO_TO_INO(mp, agno,
> next_agino);
> +                               ASSERT(next_agino != NULLAGINO);
> +                               ASSERT(next_agino != 0);
> 1777                         error = xfs_inotobp(mp, tp, next_ino,
> &last_dip,
> 1778                                             &last_ibp, &last_offset,
> 0);
> 1779                         if (error) {
> 1780                                 xfs_warn(mp,
> 1781                                         "%s: xfs_inotobp() returned
> error %d.",
> 1782                                         __func__, error);
> 1783                                 return error;
> 1784                         }
> 1785                         next_agino =
> be32_to_cpu(last_dip->di_next_unlinked);
> -                               //ASSERT(next_agino != NULLAGINO);
> -                               //ASSERT(next_agino != 0);
> 1788                 }
> I don't understand xfs well and correct me if I'm totally wrong.
> Thank you very much.
>
>
> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>
>> Hi Brain,
>> I want to ask a question, according to the shutdown trace. The ino in  xfs_iunlink_remove
>> is 0x113, why xfs_imap got ino=0xffffffff ?
>>
>> --- xfs_imap -- module("xfs").function("xfs_imap@fs
>> /xfs/xfs_ialloc.c:1257").return -- return=0x16
>> vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
>>
>> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs
>> /xfs/xfs_inode.c:1680").return -- return=0x16
>> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
>> dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
>> last_dip=0xffff882000000000 bucket_index=? offset=?
>> last_offset=0xffffffffffff8810 error=? __func__=[...]
>> ip: i_ino = 0x113, i_flags = 0x0
>>
>> Thank you.
>>
>>
>>
>> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>>
>>> Hi Brain,
>>> Thank you for your update, and I have applied your last kernel patch.
>>> However it is not easy to reproduce especially in out test environment.
>>> Till now is not happens again. I'll update the kernel patch now. BTW is
>>> there any findings in the logs of previous thread?
>>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>> I guess it tend to happen during glusterfs rebalance because glusterfs
>>> moves a lot of file from one server to another and then unlink it.
>>>
>>> Thank you.
>>>
>>>
>>> 2013/4/17 Brian Foster <bfoster@redhat.com>
>>>
>>>> On 04/16/2013 12:24 PM, Dave Chinner wrote:
>>>> > On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
>>>> >> Hi,
>>>> >>
>>>> >> Thanks for the data in the previous thread:
>>>> >>
>>>> >> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>>> >>
>>>> ...
>>>> >>
>>>> >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>>>> >>      echo 1 >
>>>> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>>>> >>      ... reproduce ...
>>>> >>      cat /sys/kernel/debug/tracing/trace > trace.output
>>>> >
>>>> > It's better to use trace-cmd for this. it will result in less
>>>> > dropped events. i.e.:
>>>> >
>>>> >       $ trace-cmd record -e xfs_iunlink\*
>>>> >       ... reproduce ...
>>>> >       ^C
>>>> >       $ trace-cmd report > trace.output
>>>> >
>>>> >> --- a/fs/xfs/linux-2.6/xfs_trace.h
>>>> >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>>>> >> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>>>> ...
>>>> >
>>>> > I would suggest that the the tracing shoul dbe at entry of the
>>>> > function, otherwise we won't get a tracepoint for the operation that
>>>> > triggers the shutdown. (That's the reason most tracepoints in XFS
>>>> > are at function entry...)
>>>> >
>>>>
>>>> Good points, thanks Dave. A v2 that pulls up the tracepoints towards
>>>> function entry is appended.
>>>>
>>>> Brian
>>>>
>>>> From 280943e78ebe0b97a774cba51e7815c42f044b55 Mon Sep 17 00:00:00 2001
>>>> From: Brian Foster <bfoster@redhat.com>
>>>> Date: Mon, 15 Apr 2013 18:16:24 -0400
>>>> Subject: [PATCH v2] xfs: add tracepoints for xfs_iunlink and
>>>> xfs_iunlink_remove
>>>>
>>>> ---
>>>>  fs/xfs/linux-2.6/xfs_trace.h |    2 ++
>>>>  fs/xfs/xfs_inode.c           |    4 ++++
>>>>  2 files changed, 6 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
>>>> index adc6ec4..338a0f9 100644
>>>> --- a/fs/xfs/linux-2.6/xfs_trace.h
>>>> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>>>> @@ -583,6 +583,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>>>>  DEFINE_INODE_EVENT(xfs_destroy_inode);
>>>>  DEFINE_INODE_EVENT(xfs_dirty_inode);
>>>>  DEFINE_INODE_EVENT(xfs_clear_inode);
>>>> +DEFINE_INODE_EVENT(xfs_iunlink);
>>>> +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>>>>
>>>>  DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>>>>  DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
>>>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>>>> index 19900f0..d705c77 100644
>>>> --- a/fs/xfs/xfs_inode.c
>>>> +++ b/fs/xfs/xfs_inode.c
>>>> @@ -1615,6 +1615,8 @@ xfs_iunlink(
>>>>
>>>>         mp = tp->t_mountp;
>>>>
>>>> +       trace_xfs_iunlink(ip);
>>>> +
>>>>         /*
>>>>          * Get the agi buffer first.  It ensures lock ordering
>>>>          * on the list.
>>>> @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
>>>>         mp = tp->t_mountp;
>>>>         agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
>>>>
>>>> +       trace_xfs_iunlink_remove(ip);
>>>> +
>>>>         /*
>>>>          * Get the agi buffer first.  It ensures lock ordering
>>>>          * on the list.
>>>> --
>>>> 1.7.7.6
>>>>
>>>>



-- 
符永涛

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-17  3:48           ` 符永涛
@ 2013-04-17  4:28             ` Eric Sandeen
  2013-04-18  1:30               ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-17  4:28 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs


On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu@gmail.com> wrote:

> Hi Brian,
> Can I change it as follows?

ASSERTs are no-ops in a non-debug kernel, so this won't change any behavior.  I hope we'll know more if we get new traces from your patched kernel.
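
To illustrate the point about assertions compiling away, here is a minimal userspace sketch. It is not the XFS definition of ASSERT; the macro body, the assert_failures counter and the check_next_agino() helper are all hypothetical names made up for this example. The only things taken from the thread are the two conditions being checked and the NULLAGINO marker value 0xffffffff.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a kernel-style ASSERT (not the XFS macro):
 * the condition is only checked when DEBUG is defined at build time.
 * Without DEBUG the macro expands to nothing, so moving it around in
 * the code cannot change what a non-debug build does.
 */
static int assert_failures;	/* failed checks seen in this sketch */

#ifdef DEBUG
#define ASSERT(expr) \
	((expr) ? (void)0 : (void)(assert_failures++, \
		fprintf(stderr, "ASSERT failed: %s\n", #expr)))
#else
#define ASSERT(expr)	((void)0)
#endif

#define NULLAGINO	0xffffffffu	/* end-of-list marker from the thread */

/* Returns how many of the two assertions fired for this value. */
static int check_next_agino(unsigned int next_agino)
{
	assert_failures = 0;
	ASSERT(next_agino != NULLAGINO);
	ASSERT(next_agino != 0);
	return assert_failures;
}
```

Built without -DDEBUG, check_next_agino() returns 0 for every input, including NULLAGINO, because both checks have been compiled out. Built with -DDEBUG, the same call reports a failure for NULLAGINO or 0.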

Eric

> --- a/xfs_inode.c
> +++ b/xfs_inode.c
> @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
>                         if (last_ibp != NULL) {
>                                 xfs_trans_brelse(tp, last_ibp);
>                         }
> +                        ASSERT(next_agino != NULLAGINO);
> +                        ASSERT(next_agino != 0);
>                         next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>                         error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>                                             &last_ibp, &last_offset, 0);
> @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
>                                 return error;
>                         }
>                         next_agino = be32_to_cpu(last_dip->di_next_unlinked);
> -                       ASSERT(next_agino != NULLAGINO);
> -                       ASSERT(next_agino != 0);
>                 }
>                 /*
>                  * Now last_ibp points to the buffer previous to us on
> 
> Thank you.
> 
> 
> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>> Hi Brian,
>> If the failure is because NULLAGINO is passed in to xfs_inotobp(),
>> can I move the following two ASSERT lines before the xfs_inotobp() call?
>> 
>> For example:
>> 
>>                 while (next_agino != agino) {
>>                         /*
>>                          * If the last inode wasn't the one pointing to
>>                          * us, then release its buffer since we're not
>>                          * going to do anything with it.
>>                          */
>>                         if (last_ibp != NULL) {
>>                                 xfs_trans_brelse(tp, last_ibp);
>>                         }
>>                         next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>> +                       ASSERT(next_agino != NULLAGINO);
>> +                       ASSERT(next_agino != 0);
>>                         error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>>                                             &last_ibp, &last_offset, 0);
>>                         if (error) {
>>                                 xfs_warn(mp,
>>                                         "%s: xfs_inotobp() returned error %d.",
>>                                         __func__, error);
>>                                 return error;
>>                         }
>>                         next_agino = be32_to_cpu(last_dip->di_next_unlinked);
>> -                       ASSERT(next_agino != NULLAGINO);
>> -                       ASSERT(next_agino != 0);
>>                 }
>> I don't understand XFS well, so please correct me if I'm totally wrong.
>> Thank you very much.
>> 
>> 
>> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>>> Hi Brian,
>>> I have a question about the shutdown trace: the ino in xfs_iunlink_remove is 0x113, so why did xfs_imap get ino=0xffffffff?
>>> 
>>> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return -- return=0x16
>>> vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
>>> 
>>> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return -- return=0x16
>>> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=? dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=? last_dip=0xffff882000000000 bucket_index=? offset=? last_offset=0xffffffffffff8810 error=? __func__=[...]
>>> ip: i_ino = 0x113, i_flags = 0x0
>>> 
>>> Thank you.
>>> 
>>> 
>>> 
>>> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>>>> Hi Brian,
>>>> Thank you for the update. I have applied your last kernel patch. However, the problem is not easy to reproduce, especially in our test environment, and so far it has not happened again. I'll update to the new kernel patch now. BTW, are there any findings from the logs in the previous thread?
>>>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>>> I guess it tends to happen during glusterfs rebalance, because glusterfs moves a lot of files from one server to another and then unlinks them.
>>>> 
>>>> Thank you.
>>>> 
>>>> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-17  4:28             ` Eric Sandeen
@ 2013-04-18  1:30               ` 符永涛
  2013-04-18  6:45                 ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-18  1:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


Hi Eric,
The shutdown issue has still not been reproduced, but I got the following
hung task report today during testing.

Apr 18 07:42:51 10 kernel: Call Trace:
Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_buf_cond_lock+0x2f/0xc0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ? _xfs_buf_find+0x102/0x280 [xfs]
Apr 18 07:42:51 10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0     0 14522      1 0x00000083
Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082 0000000000000000 0000000000000000
Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef ffff882015a639d8 0000000000000246
Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8 000000000000fb88 ffff880e70491af8
Apr 18 07:42:51 10 kernel: Call Trace:
Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_buf_cond_lock+0x2f/0xc0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ? _xfs_buf_find+0x102/0x280 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>] xfs_buf_lock+0x53/0x110 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] _xfs_buf_find+0x102/0x280 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>] xfs_buf_get+0x6b/0x1a0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>] xfs_buf_read+0x2c/0x100 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>] xfs_trans_read_buf+0x1f8/0x400 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>] xfs_read_agi+0x74/0x100 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>] xfs_iunlink+0x5b/0x180 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ? current_fs_time+0x27/0x30
Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ? xfs_trans_ichgtime+0x27/0xa0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>] xfs_droplink+0x5b/0x70 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>] xfs_remove+0x27e/0x3a0 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ? generic_permission+0x23/0xb0
Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>] xfs_vn_unlink+0x48/0x90 [xfs]
Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>] vfs_unlink+0x9f/0xe0
Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ? lookup_hash+0x3a/0x50
Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>] do_unlinkat+0x183/0x1c0
Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ? syscall_trace_enter+0x1d8/0x1e0
Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>] sys_unlink+0x16/0x20
Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>] tracesys+0xd9/0xde

Thank you.




-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18  1:30               ` 符永涛
@ 2013-04-18  6:45                 ` 符永涛
  2013-04-18  8:25                   ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-18  6:45 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


Hi Brian and Eric,
If the problem is that the agino can't be found in the unlinked list, can we
just bypass it instead of passing ino=0xffffffff to xfs_inotobp()?
Thank you.
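
For reference, the situation being asked about can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not kernel code: struct toy_ag, toy_agino_to_ino(), toy_find_prev(), MAX_INODES and the agino_bits parameter are hypothetical names invented for the example, and returning -EINVAL at the end of the list is a choice made for the sketch. It shows two things: why combining AG 0 with the NULLAGINO marker yields ino=0xffffffff, and what a walk looks like if it checks for the end-of-list marker before converting it. Whether skipping the removal would be safe for the filesystem is a separate question that this sketch does not answer.

```c
#include <stdint.h>
#include <errno.h>

#define NULLAGINO	0xffffffffu	/* end-of-list marker from the thread */
#define MAX_INODES	8		/* size of this toy inode table */

/* Toy stand-in for one unlinked bucket and its next pointers. */
struct toy_ag {
	uint32_t bucket_head;			/* first agino on the list */
	uint32_t next_unlinked[MAX_INODES];	/* per-inode next pointer */
};

/*
 * Combine an AG number and an AG-relative inode number into a global
 * inode number. agino_bits is an assumed parameter of this sketch.
 */
static uint64_t toy_agino_to_ino(uint32_t agno, uint32_t agino,
				 unsigned int agino_bits)
{
	return ((uint64_t)agno << agino_bits) | agino;
}

/*
 * Find the list entry that points at agino. Returns 0 and stores the
 * predecessor in *prev (NULLAGINO if agino is the list head), or
 * -EINVAL if the walk runs off the end without finding agino.
 */
static int toy_find_prev(const struct toy_ag *ag, uint32_t agino,
			 uint32_t *prev)
{
	uint32_t cur = ag->bucket_head;
	unsigned int steps = 0;

	*prev = NULLAGINO;
	while (cur != agino) {
		/* Stop before treating the end marker as a real inode. */
		if (cur == NULLAGINO || cur >= MAX_INODES ||
		    steps++ >= MAX_INODES)
			return -EINVAL;
		*prev = cur;
		cur = ag->next_unlinked[cur];
	}
	return 0;
}
```

With a list of 3 -> 5 -> end, looking up 5 finds predecessor 3, while looking up 6 (which is not on the list) fails at the end marker instead of converting it. In AG 0 the conversion of NULLAGINO gives 0xffffffff regardless of agino_bits, which matches the value seen in the xfs_imap probe output.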





-- 
符永涛

[-- Attachment #1.2: Type: text/html, Size: 22520 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18  6:45                 ` 符永涛
@ 2013-04-18  8:25                   ` 符永涛
  2013-04-18 11:41                     ` Brian Foster
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-18  8:25 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 15247 bytes --]

Hi Brian and Eric,
Can I make the following change to bypass it?
--- a/xfs_inode.c
+++ b/xfs_inode.c
@@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
                 */
                next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
                last_ibp = NULL;
-               while (next_agino != agino) {
+               while (next_agino != agino && next_agino != NULLAGINO) {
                        /*
                         * If the last inode wasn't the one pointing to
                         * us, then release its buffer since we're not
@@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
                        ASSERT(next_agino != NULLAGINO);
                        ASSERT(next_agino != 0);
                }
+               if (next_agino == NULLAGINO) {
+                       /*
+                        * After searching the list for the inode being freed,
+                        * we still can't find it.
+                        */
+                       xfs_err(mp, "%s ino %lld not found in unlinked list.",
+                                    __func__, (unsigned long long)ip->i_ino);
+               }
                /*
                 * Now last_ibp points to the buffer previous to us on
                 * the unlinked list.  Pull us from the list.
@@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
                } else {
                        xfs_trans_brelse(tp, ibp);
                }
-               /*
-                * Point the previous inode on the list to the next inode.
-                */
-               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
-               ASSERT(next_agino != 0);
-               offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
-               xfs_trans_inode_buf(tp, last_ibp);
-               xfs_trans_log_buf(tp, last_ibp, offset,
-                                 (offset + sizeof(xfs_agino_t) - 1));
-               xfs_inobp_check(mp, last_ibp);
+               if (next_agino != NULLAGINO) {
+                       /*
+                       * Modify the unlinked list only if we found the
+                       * inode being freed.
+                       * Point the previous inode on the list to the next inode.
+                       */
+                       last_dip->di_next_unlinked = cpu_to_be32(next_agino);
+                       ASSERT(next_agino != 0);
+                       offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
+                       xfs_trans_inode_buf(tp, last_ibp);
+                       xfs_trans_log_buf(tp, last_ibp, offset,
+                                         (offset + sizeof(xfs_agino_t) - 1));
+                       xfs_inobp_check(mp, last_ibp);
+               }
        }
        return 0;
 }

Thank you.
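
For context, the ino=0xffffffff value in the xfs_imap probe output quoted
below is consistent with NULLAGINO being converted to a global inode
number. A minimal userspace sketch, assuming the conversion is a
shift-and-OR that places the AG number above the AG-relative inode bits;
agino_to_ino() and the agino_bits argument are illustrative stand-ins,
not the kernel macro or this filesystem's geometry:

```c
#include <stdint.h>

/* End-of-list sentinel for AG-relative inode numbers (all ones). */
#define NULLAGINO ((uint32_t)-1)

/*
 * Hypothetical model of an agino -> ino conversion: the AG number is
 * shifted above the AG-relative inode bits and OR'd with the agino.
 * agino_bits is filesystem-specific; callers here pass a made-up value.
 */
static uint64_t agino_to_ino(uint32_t agno, uint32_t agino,
			     unsigned int agino_bits)
{
	return ((uint64_t)agno << agino_bits) | agino;
}
```

With AG number 0 the conversion returns the AG-relative value unchanged,
so agino_to_ino(0, NULLAGINO, bits) is 0xffffffff for any bits below 64.
That matches the probe output if inode 0x113 lives in AG 0, which is an
assumption here.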


2013/4/18 符永涛 <yongtaofu@gmail.com>

> Hi Brian and Eric,
> If the problem is that the agino can't be found in the unlinked list, can we
> just bypass it instead of passing ino=0xffffffff to xfs_inotobp?
> Thank you.
>
>
> 2013/4/18 符永涛 <yongtaofu@gmail.com>
>
>> Hi Eric,
>> The shutdown issue has still not been reproduced, but I got the following
>> error today during testing.
>>
>> Apr 18 07:42:51 10 kernel: Call Trace:
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ? _xfs_buf_find+0x102/0x280 [xfs]
>> Apr 18 07:42:51 10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0     0 14522      1 0x00000083
>> Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082 0000000000000000 0000000000000000
>> Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef ffff882015a639d8 0000000000000246
>> Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8 000000000000fb88 ffff880e70491af8
>> Apr 18 07:42:51 10 kernel: Call Trace:
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ? _xfs_buf_find+0x102/0x280 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>] xfs_buf_lock+0x53/0x110 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] _xfs_buf_find+0x102/0x280 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>] xfs_buf_get+0x6b/0x1a0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>] xfs_buf_read+0x2c/0x100 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>] xfs_trans_read_buf+0x1f8/0x400 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>] xfs_read_agi+0x74/0x100 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>] xfs_iunlink+0x5b/0x180 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ? current_fs_time+0x27/0x30
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ? xfs_trans_ichgtime+0x27/0xa0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>] xfs_droplink+0x5b/0x70 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>] xfs_remove+0x27e/0x3a0 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ? generic_permission+0x23/0xb0
>> Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>] xfs_vn_unlink+0x48/0x90 [xfs]
>> Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>] vfs_unlink+0x9f/0xe0
>> Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ? lookup_hash+0x3a/0x50
>> Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>] do_unlinkat+0x183/0x1c0
>> Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ? syscall_trace_enter+0x1d8/0x1e0
>> Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>] sys_unlink+0x16/0x20
>> Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>] tracesys+0xd9/0xde
>>
>> Thank you.
>>
>>
>> 2013/4/17 Eric Sandeen <sandeen@sandeen.net>
>>
>>> On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu@gmail.com> wrote:
>>>
>>> Hi Brian,
>>> Can I make the following change?
>>>
>>>
>>> ASSERTS are no-ops in a non-debug kernel, so this won't change any
>>> behavior.  I hope we'll know more if we get new traces from your patched
>>> kernel....
>>>
>>> Eric
>>>
>>> --- a/xfs_inode.c
>>> +++ b/xfs_inode.c
>>> @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
>>>                         if (last_ibp != NULL) {
>>>                                 xfs_trans_brelse(tp, last_ibp);
>>>                         }
>>> +                        ASSERT(next_agino != NULLAGINO);
>>> +                        ASSERT(next_agino != 0);
>>>                         next_ino = XFS_AGINO_TO_INO(mp, agno,
>>> next_agino);
>>>                         error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>>>                                             &last_ibp, &last_offset, 0);
>>> @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
>>>                                 return error;
>>>                         }
>>>                         next_agino =
>>> be32_to_cpu(last_dip->di_next_unlinked);
>>> -                       ASSERT(next_agino != NULLAGINO);
>>> -                       ASSERT(next_agino != 0);
>>>                 }
>>>                 /*
>>>                  * Now last_ibp points to the buffer previous to us on
>>>
>>> Thank you.
>>>
>>>
>>> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>>>
>>>> Hi Brian,
>>>> If this is because NULLAGINO is passed in to xfs_inotobp(),
>>>> can I move the following two lines before xfs_inotobp()?
>>>>
>>>> For example:
>>>>
>>>>                 while (next_agino != agino) {
>>>>                         /*
>>>>                          * If the last inode wasn't the one pointing to
>>>>                          * us, then release its buffer since we're not
>>>>                          * going to do anything with it.
>>>>                          */
>>>>                         if (last_ibp != NULL) {
>>>>                                 xfs_trans_brelse(tp, last_ibp);
>>>>                         }
>>>>                         next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>>>> +                               ASSERT(next_agino != NULLAGINO);
>>>> +                               ASSERT(next_agino != 0);
>>>>                         error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>>>>                                             &last_ibp, &last_offset, 0);
>>>>                         if (error) {
>>>>                                 xfs_warn(mp,
>>>>                                         "%s: xfs_inotobp() returned error %d.",
>>>>                                         __func__, error);
>>>>                                 return error;
>>>>                         }
>>>>                         next_agino = be32_to_cpu(last_dip->di_next_unlinked);
>>>> -                               //ASSERT(next_agino != NULLAGINO);
>>>> -                               //ASSERT(next_agino != 0);
>>>>                 }
>>>> I don't understand XFS well, so please correct me if I'm totally wrong.
>>>> Thank you very much.
>>>>
>>>>
>>>> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>>>>
>>>>> Hi Brian,
>>>>> I have a question about the shutdown trace. The ino in
>>>>> xfs_iunlink_remove is 0x113, so why did xfs_imap get ino=0xffffffff?
>>>>>
>>>>> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return -- return=0x16
>>>>> vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
>>>>>
>>>>> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return -- return=0x16
>>>>> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=?
>>>>> agi=? dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=?
>>>>> last_ibp=? last_dip=0xffff882000000000 bucket_index=? offset=?
>>>>> last_offset=0xffffffffffff8810 error=? __func__=[...]
>>>>> ip: i_ino = 0x113, i_flags = 0x0
>>>>>
>>>>> Thank you.
>>>>>
>>>>>
>>>>>
>>>>> 2013/4/17 符永涛 <yongtaofu@gmail.com>
>>>>>
>>>>>> Hi Brian,
>>>>>> Thank you for your update. I have applied your last kernel patch.
>>>>>> However, it is not easy to reproduce, especially in our test environment.
>>>>>> So far it has not happened again. I'll update the kernel patch now. BTW, are
>>>>>> there any findings in the logs from the previous thread?
>>>>>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>>>>> I guess it tends to happen during glusterfs rebalance because
>>>>>> glusterfs moves a lot of files from one server to another and then unlinks them.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>> 2013/4/17 Brian Foster <bfoster@redhat.com>
>>>>>>
>>>>>>> On 04/16/2013 12:24 PM, Dave Chinner wrote:
>>>>>>> > On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
>>>>>>> >> Hi,
>>>>>>> >>
>>>>>>> >> Thanks for the data in the previous thread:
>>>>>>> >>
>>>>>>> >> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>>>>>> >>
>>>>>>> ...
>>>>>>> >>
>>>>>>> >>      echo 1 >
>>>>>>> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>>>>>>> >>      echo 1 >
>>>>>>> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>>>>>>> >>      ... reproduce ...
>>>>>>> >>      cat /sys/kernel/debug/tracing/trace > trace.output
>>>>>>> >
>>>>>>> > It's better to use trace-cmd for this. it will result in less
>>>>>>> > dropped events. i.e.:
>>>>>>> >
>>>>>>> >       $ trace-cmd record -e xfs_iunlink\*
>>>>>>> >       ... reproduce ...
>>>>>>> >       ^C
>>>>>>> >       $ trace-cmd report > trace.output
>>>>>>> >
>>>>>>> >> --- a/fs/xfs/linux-2.6/xfs_trace.h
>>>>>>> >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>>>>>>> >> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>>>>>>> ...
>>>>>>> >
>>>>>>> > I would suggest that the tracing should be at entry of the
>>>>>>> > function, otherwise we won't get a tracepoint for the operation that
>>>>>>> > triggers the shutdown. (That's the reason most tracepoints in XFS
>>>>>>> > are at function entry...)
>>>>>>> >
>>>>>>>
>>>>>>> Good points, thanks Dave. A v2 that pulls up the tracepoints towards
>>>>>>> function entry is appended.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> From 280943e78ebe0b97a774cba51e7815c42f044b55 Mon Sep 17 00:00:00
>>>>>>> 2001
>>>>>>> From: Brian Foster <bfoster@redhat.com>
>>>>>>> Date: Mon, 15 Apr 2013 18:16:24 -0400
>>>>>>> Subject: [PATCH v2] xfs: add tracepoints for xfs_iunlink and
>>>>>>> xfs_iunlink_remove
>>>>>>>
>>>>>>> ---
>>>>>>>  fs/xfs/linux-2.6/xfs_trace.h |    2 ++
>>>>>>>  fs/xfs/xfs_inode.c           |    4 ++++
>>>>>>>  2 files changed, 6 insertions(+), 0 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/xfs/linux-2.6/xfs_trace.h
>>>>>>> b/fs/xfs/linux-2.6/xfs_trace.h
>>>>>>> index adc6ec4..338a0f9 100644
>>>>>>> --- a/fs/xfs/linux-2.6/xfs_trace.h
>>>>>>> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>>>>>>> @@ -583,6 +583,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>>>>>>>  DEFINE_INODE_EVENT(xfs_destroy_inode);
>>>>>>>  DEFINE_INODE_EVENT(xfs_dirty_inode);
>>>>>>>  DEFINE_INODE_EVENT(xfs_clear_inode);
>>>>>>> +DEFINE_INODE_EVENT(xfs_iunlink);
>>>>>>> +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>>>>>>>
>>>>>>>  DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>>>>>>>  DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
>>>>>>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>>>>>>> index 19900f0..d705c77 100644
>>>>>>> --- a/fs/xfs/xfs_inode.c
>>>>>>> +++ b/fs/xfs/xfs_inode.c
>>>>>>> @@ -1615,6 +1615,8 @@ xfs_iunlink(
>>>>>>>
>>>>>>>         mp = tp->t_mountp;
>>>>>>>
>>>>>>> +       trace_xfs_iunlink(ip);
>>>>>>> +
>>>>>>>         /*
>>>>>>>          * Get the agi buffer first.  It ensures lock ordering
>>>>>>>          * on the list.
>>>>>>> @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
>>>>>>>         mp = tp->t_mountp;
>>>>>>>         agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
>>>>>>>
>>>>>>> +       trace_xfs_iunlink_remove(ip);
>>>>>>> +
>>>>>>>         /*
>>>>>>>          * Get the agi buffer first.  It ensures lock ordering
>>>>>>>          * on the list.
>>>>>>> --
>>>>>>> 1.7.7.6
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> 符永涛
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> 符永涛
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> 符永涛
>>>>
>>>
>>>
>>>
>>> --
>>> 符永涛
>>>
>>> _______________________________________________
>>> xfs mailing list
>>> xfs@oss.sgi.com
>>> http://oss.sgi.com/mailman/listinfo/xfs
>>>
>>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18  8:25                   ` 符永涛
@ 2013-04-18 11:41                     ` Brian Foster
  2013-04-18 15:23                       ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Brian Foster @ 2013-04-18 11:41 UTC (permalink / raw)
  To: 符永涛; +Cc: Eric Sandeen, xfs

On 04/18/2013 04:25 AM, 符永涛 wrote:
> Hi Brian and Eric,
> Can I make the following change to bypass it?

This is probably not a wise thing to do. The problem we're seeing here
is indicative of a potentially larger problem than this particular error
path. An inode is being unlinked and inactivated, but we aren't finding
it on the list where we expect it to be. Killing the error return doesn't
eliminate the larger problem.
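
To make the failure mode concrete, here is a minimal userspace sketch of
a singly linked unlinked-list walk, assuming a toy AG of eight inodes
with one bucket. The names next_unlinked, toy_bucket_init() and
find_prev() are hypothetical and only model the shape of the loop in
xfs_iunlink_remove(); this is not the kernel code:

```c
#include <errno.h>
#include <stdint.h>

#define NULLAGINO ((uint32_t)-1)	/* end-of-list sentinel */
#define NINODES   8			/* toy AG size (assumption) */

/* next_unlinked[i] models the next pointer stored in inode i. */
static uint32_t next_unlinked[NINODES];

/* Build one bucket: 5 -> 2 -> 7 -> end. Inode 3 is not on the list. */
static void toy_bucket_init(void)
{
	for (int i = 0; i < NINODES; i++)
		next_unlinked[i] = NULLAGINO;
	next_unlinked[5] = 2;
	next_unlinked[2] = 7;
}

/*
 * Walk the bucket from head looking for agino. On success return 0 and
 * store the predecessor in *prev (NULLAGINO if agino is the head).
 * Return -EINVAL if the walk reaches the sentinel first, i.e. the
 * inode is not on the list it is expected to be on.
 */
static int find_prev(uint32_t head, uint32_t agino, uint32_t *prev)
{
	uint32_t cur = head;

	*prev = NULLAGINO;
	while (cur != agino) {
		if (cur == NULLAGINO || cur >= NINODES)
			return -EINVAL;
		*prev = cur;
		cur = next_unlinked[cur];
	}
	return 0;
}
```

After toy_bucket_init(), find_prev(5, 7, &prev) succeeds with prev set
to 2, while find_prev(5, 3, &prev) returns -EINVAL. Returning early on
the sentinel only reports the missing inode; it says nothing about how
the inode came to be absent from the list, which is the question that
still needs an answer.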

So while changes could end up being made in this area as part of a fix,
I would not suggest making any changes beyond those designed to help
debug until we have a better idea of root cause.

Brian

> --- a/xfs_inode.c
> +++ b/xfs_inode.c
> @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
>                  */
>                 next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
>                 last_ibp = NULL;
> -               while (next_agino != agino) {
> +               while (next_agino != agino && next_agino != NULLAGINO) {
>                         /*
>                          * If the last inode wasn't the one pointing to
>                          * us, then release its buffer since we're not
> @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
>                         ASSERT(next_agino != NULLAGINO);
>                         ASSERT(next_agino != 0);
>                 }
> +               if (next_agino == NULLAGINO) {
> +                       /*
> +                        *After search the list for the inode being free
> +                        *we still can't find it.
> +                        */
> +                       xfs_err(mp, "%s ino %lld not found in unlinked
> list.",
> +                                    __func__, (unsigned long
> long)ip->i_ino);
> +               }
>                 /*
>                  * Now last_ibp points to the buffer previous to us on
>                  * the unlinked list.  Pull us from the list.
> @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
>                 } else {
>                         xfs_trans_brelse(tp, ibp);
>                 }
> -               /*
> -                * Point the previous inode on the list to the next inode.
> -                */
> -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
> -               ASSERT(next_agino != 0);
> -               offset = last_offset + offsetof(xfs_dinode_t,
> di_next_unlinked);
> -               xfs_trans_inode_buf(tp, last_ibp);
> -               xfs_trans_log_buf(tp, last_ibp, offset,
> -                                 (offset + sizeof(xfs_agino_t) - 1));
> -               xfs_inobp_check(mp, last_ibp);
> +               if (next_agino != NULLAGINO) {
> +                       /*
> +                       * If only find the inode being free then we modify
> +                       * the unlinked list.
> +                       * Point the previous inode on the list to the
> next inode.
> +                       */
> +                       last_dip->di_next_unlinked =
> cpu_to_be32(next_agino);
> +                       ASSERT(next_agino != 0);
> +                       offset = last_offset + offsetof(xfs_dinode_t,
> di_next_unlinked);
> +                       xfs_trans_inode_buf(tp, last_ibp);
> +                       xfs_trans_log_buf(tp, last_ibp, offset,
> +                                         (offset + sizeof(xfs_agino_t)
> - 1));
> +                       xfs_inobp_check(mp, last_ibp);
> +               }
>         }
>         return 0;
>  }
> 
> Thank you.
> 
> 
> 2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> 
>     Hi Brian and Eric,
>     If the problem is that the agino can't be found in the unlinked list, can
>     we just bypass it instead of passing ino=0xffffffff to xfs_inotobp?
>     Thank you.
> 
> 
>     2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> 
>         Hi Eric,
>         The shutdown issue has still not been reproduced, but I got the
>         following error today during testing.
> 
>         Apr 18 07:42:51 10 kernel: Call Trace:
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
>         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
>         schedule_timeout+0x215/0x2e0
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
>         kmem_zone_alloc+0x77/0xf0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
>         _xfs_buf_find+0x102/0x280 [xfs]
>         Apr 18 07:42:51 10 kernel: "echo 0 >
>         /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0    
>         0 14522      1 0x00000083
>         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082
>         0000000000000000 0000000000000000
>         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef
>         ffff882015a639d8 0000000000000246
>         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8
>         000000000000fb88 ffff880e70491af8
>         Apr 18 07:42:51 10 kernel: Call Trace:
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
>         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
>         schedule_timeout+0x215/0x2e0
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
>         kmem_zone_alloc+0x77/0xf0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
>         _xfs_buf_find+0x102/0x280 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]
>         xfs_buf_lock+0x53/0x110 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]
>         _xfs_buf_find+0x102/0x280 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]
>         xfs_buf_get+0x6b/0x1a0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]
>         xfs_buf_read+0x2c/0x100 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]
>         xfs_trans_read_buf+0x1f8/0x400 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]
>         xfs_read_agi+0x74/0x100 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]
>         xfs_iunlink+0x5b/0x180 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?
>         current_fs_time+0x27/0x30
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?
>         xfs_trans_ichgtime+0x27/0xa0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]
>         xfs_droplink+0x5b/0x70 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]
>         xfs_remove+0x27e/0x3a0 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?
>         generic_permission+0x23/0xb0
>         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]
>         xfs_vn_unlink+0x48/0x90 [xfs]
>         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>] vfs_unlink+0x9f/0xe0
>         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?
>         lookup_hash+0x3a/0x50
>         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]
>         do_unlinkat+0x183/0x1c0
>         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?
>         syscall_trace_enter+0x1d8/0x1e0
>         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>] sys_unlink+0x16/0x20
>         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>] tracesys+0xd9/0xde
> 
>         Thank you.
> 
> 
>         2013/4/17 Eric Sandeen <sandeen@sandeen.net
>         <mailto:sandeen@sandeen.net>>
> 
>             On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu@gmail.com
>             <mailto:yongtaofu@gmail.com>> wrote:
> 
>>             Hi Brian,
>>             Can I make the following change?
> 
>             ASSERTS are no-ops in a non-debug kernel, so this won't
>             change any behavior.  I hope we'll know more if we get new
>             traces from your patched kernel....
> 
>             Eric
> 
>>             --- a/xfs_inode.c
>>             +++ b/xfs_inode.c
>>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
>>                                     if (last_ibp != NULL) {
>>                                             xfs_trans_brelse(tp,
>>             last_ibp);
>>                                     }
>>             +                        ASSERT(next_agino != NULLAGINO);
>>             +                        ASSERT(next_agino != 0);
>>                                     next_ino = XFS_AGINO_TO_INO(mp,
>>             agno, next_agino);
>>                                     error = xfs_inotobp(mp, tp,
>>             next_ino, &last_dip,
>>                                                         &last_ibp,
>>             &last_offset, 0);
>>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
>>                                             return error;
>>                                     }
>>                                     next_agino =
>>             be32_to_cpu(last_dip->di_next_unlinked);
>>             -                       ASSERT(next_agino != NULLAGINO);
>>             -                       ASSERT(next_agino != 0);
>>                             }
>>                             /*
>>                              * Now last_ibp points to the buffer
>>             previous to us on
>>
>>             Thank you.
>>
>>
>>             2013/4/17 符永涛 <yongtaofu@gmail.com
>>             <mailto:yongtaofu@gmail.com>>
>>
>>                 Hi Brian,
>>                 If this is because NULLAGINO is passed in to xfs_inotobp(),
>>                 can I move the following two lines before xfs_inotobp()?
>>
>>                 For example:
>>
>>                 1767                 while (next_agino != agino) {
>>                 1768                         /*
>>                 1769                          * If the last inode
>>                 wasn't the one pointing to
>>                 1770                          * us, then release its
>>                 buffer since we're not
>>                 1771                          * going to do anything
>>                 with it.
>>                 1772                          */
>>                 1773                         if (last_ibp != NULL) {
>>                 1774                                
>>                 xfs_trans_brelse(tp, last_ibp);
>>                 1775                         }
>>                 1776                         next_ino =
>>                 XFS_AGINO_TO_INO(mp, agno, next_agino);
>>                 +                               ASSERT(next_agino !=
>>                 NULLAGINO);
>>                 +                               ASSERT(next_agino != 0);
>>                 1777                         error = xfs_inotobp(mp,
>>                 tp, next_ino, &last_dip,
>>                 1778                                            
>>                 &last_ibp, &last_offset, 0);
>>                 1779                         if (error) {
>>                 1780                                 xfs_warn(mp,
>>                 1781                                         "%s:
>>                 xfs_inotobp() returned error %d.",
>>                 1782                                         __func__,
>>                 error);
>>                 1783                                 return error;
>>                 1784                         }
>>                 1785                         next_agino =
>>                 be32_to_cpu(last_dip->di_next_unlinked);
>>                 -                               //ASSERT(next_agino !=
>>                 NULLAGINO);
>>                 -                               //ASSERT(next_agino != 0);
>>                 1788                 }
>>                 I don't understand xfs well and correct me if I'm
>>                 totally wrong.
>>                 Thank you very much.
>>
>>
>>                 2013/4/17 符永涛 <yongtaofu@gmail.com
>>                 <mailto:yongtaofu@gmail.com>>
>>
>>                     Hi Brian,
>>                     I have a question about the shutdown trace. The
>>                     ino in xfs_iunlink_remove is 0x113, so why did
>>                     xfs_imap get ino=0xffffffff?
>>
>>                     --- xfs_imap --
>>                     module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
>>                     -- return=0x16
>>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70
>>                     ino=0xffffffff
>>
>>                     --- xfs_iunlink_remove --
>>                     module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
>>                     -- return=0x16
>>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00
>>                     next_ino=? mp=? agi=? dip=?
>>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?
>>                     next_agino=? last_ibp=?
>>                     last_dip=0xffff882000000000 bucket_index=?
>>                     offset=? last_offset=0xffffffffffff8810 error=?
>>                     __func__=[...]
>>                     ip: i_ino = 0x113, i_flags = 0x0
>>
>>                     Thank you.
>>
>>
>>
>>                     2013/4/17 符永涛 <yongtaofu@gmail.com
>>                     <mailto:yongtaofu@gmail.com>>
>>
>>                         Hi Brian,
>>                         Thank you for your update. I have applied
>>                         your last kernel patch. However, it is not easy
>>                         to reproduce, especially in our test
>>                         environment. So far it has not happened again.
>>                         I'll update the kernel patch now. BTW, are there
>>                         any findings in the logs from the previous thread?
>>                         http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>                         I guess it tends to happen during glusterfs
>>                         rebalance because glusterfs moves a lot of
>>                         files from one server to another and then
>>                         unlinks them.
>>
>>                         Thank you.
>>
>>
>>                         2013/4/17 Brian Foster <bfoster@redhat.com
>>                         <mailto:bfoster@redhat.com>>
>>
>>                             On 04/16/2013 12:24 PM, Dave Chinner wrote:
>>                             > On Mon, Apr 15, 2013 at 07:14:39PM
>>                             -0400, Brian Foster wrote:
>>                             >> Hi,
>>                             >>
>>                             >> Thanks for the data in the previous thread:
>>                             >>
>>                             >>
>>                             http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>                             >>
>>                             ...
>>                             >>
>>                             >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>>                             >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>>                             >>      ... reproduce ...
>>                             >>      cat /sys/kernel/debug/tracing/trace > trace.output
>>                             >
>>                             > It's better to use trace-cmd for this.
>>                             > It will result in fewer dropped events, i.e.:
>>                             >
>>                             >       $ trace-cmd record -e xfs_iunlink\*
>>                             >       ... reproduce ...
>>                             >       ^C
>>                             >       $ trace-cmd report > trace.output
>>                             >
>>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
>>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>>                             >> @@ -581,6 +581,8 @@
>>                             DEFINE_INODE_EVENT(xfs_file_fsync);
>>                             ...
>>                             >
>>                             > I would suggest that the tracing should
>>                             > be at entry of the function, otherwise we
>>                             > won't get a tracepoint for the operation that
>>                             > triggers the shutdown. (That's the reason
>>                             > most tracepoints in XFS are at function
>>                             > entry...)
>>                             >
>>
>>                             Good points, thanks Dave. A v2 that pulls
>>                             up the tracepoints towards
>>                             function entry is appended.
>>
>>                             Brian
>>
>>                             From
>>                             280943e78ebe0b97a774cba51e7815c42f044b55
>>                             Mon Sep 17 00:00:00 2001
>>                             From: Brian Foster <bfoster@redhat.com
>>                             <mailto:bfoster@redhat.com>>
>>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
>>                             Subject: [PATCH v2] xfs: add tracepoints
>>                             for xfs_iunlink and
>>                             xfs_iunlink_remove
>>
>>                             ---
>>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++
>>                              fs/xfs/xfs_inode.c           |    4 ++++
>>                              2 files changed, 6 insertions(+), 0
>>                             deletions(-)
>>
>>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h
>>                             b/fs/xfs/linux-2.6/xfs_trace.h
>>                             index adc6ec4..338a0f9 100644
>>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
>>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
>>                             @@ -583,6 +583,8 @@
>>                             DEFINE_INODE_EVENT(xfs_file_fsync);
>>                              DEFINE_INODE_EVENT(xfs_destroy_inode);
>>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
>>                              DEFINE_INODE_EVENT(xfs_clear_inode);
>>                             +DEFINE_INODE_EVENT(xfs_iunlink);
>>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>>
>>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
>>                             diff --git a/fs/xfs/xfs_inode.c
>>                             b/fs/xfs/xfs_inode.c
>>                             index 19900f0..d705c77 100644
>>                             --- a/fs/xfs/xfs_inode.c
>>                             +++ b/fs/xfs/xfs_inode.c
>>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
>>
>>                                     mp = tp->t_mountp;
>>
>>                             +       trace_xfs_iunlink(ip);
>>                             +
>>                                     /*
>>                                      * Get the agi buffer first.  It
>>                             ensures lock ordering
>>                                      * on the list.
>>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
>>                                     mp = tp->t_mountp;
>>                                     agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
>>
>>                             +       trace_xfs_iunlink_remove(ip);
>>                             +
>>                                     /*
>>                                      * Get the agi buffer first.  It
>>                             ensures lock ordering
>>                                      * on the list.
>>                             --
>>                             1.7.7.6
>>
>>
>>
>>
>>                         -- 
>>                         符永涛
>>
>>
>>
>>
>>                     -- 
>>                     符永涛
>>
>>
>>
>>
>>                 -- 
>>                 符永涛
>>
>>
>>
>>
>>             -- 
>>             符永涛
>>             _______________________________________________
>>             xfs mailing list
>>             xfs@oss.sgi.com <mailto:xfs@oss.sgi.com>
>>             http://oss.sgi.com/mailman/listinfo/xfs
> 
> 
> 
> 
>         -- 
>         符永涛
> 
> 
> 
> 
>     -- 
>     符永涛
> 
> 
> 
> 
> -- 
> 符永涛

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18 11:41                     ` Brian Foster
@ 2013-04-18 15:23                       ` 符永涛
  2013-04-18 16:40                         ` 符永涛
                                           ` (3 more replies)
  0 siblings, 4 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-18 15:23 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs


[-- Attachment #1.1: Type: text/plain, Size: 23009 bytes --]

Hi Brian and Eric,
The shutdown is not easy to reproduce, but XFS has now shut down on 2 of the
servers in our test cluster.

The trace output is here:
https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing

Sorry, the systemtap script had been interrupted and I didn't notice, so I
did not get any systemtap logs.

/var/log/messages shows the same as before:
Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d44aa
Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.

The metadump file is large; I'll share it with you soon.


2013/4/18 Brian Foster <bfoster@redhat.com>

> On 04/18/2013 04:25 AM, 符永涛 wrote:
> > Hi Brian and Eric,
> > Can I change as following to bypass it?
>
> This is probably not a wise thing to do. The problem we're seeing here
> is indicative of a potentially larger problem than this particular error
> path. An inode is being unlinked and inactivated, but we aren't finding
> it on the list where we expect it to be. Killing the error return doesn't
> eliminate the larger problem.
>
> So while changes could end up being made in this area as part of a fix,
> I would not suggest making any changes beyond those designed to help
> debug until we have a better idea of root cause.
>
> Brian
>
> > --- a/xfs_inode.c
> > +++ b/xfs_inode.c
> > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
> >                  */
> >                 next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
> >                 last_ibp = NULL;
> > -               while (next_agino != agino) {
> > +               while (next_agino != agino && next_agino != NULLAGINO) {
> >                         /*
> >                          * If the last inode wasn't the one pointing to
> >                          * us, then release its buffer since we're not
> > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
> >                         ASSERT(next_agino != NULLAGINO);
> >                         ASSERT(next_agino != 0);
> >                 }
> > +               if (next_agino == NULLAGINO) {
> > +                       /*
> > +                        * After searching the list for the inode being
> > +                        * freed, we still can't find it.
> > +                        */
> > +                       xfs_err(mp, "%s ino %lld not found in unlinked list.",
> > +                                    __func__, (unsigned long long)ip->i_ino);
> > +               }
> >                 /*
> >                  * Now last_ibp points to the buffer previous to us on
> >                  * the unlinked list.  Pull us from the list.
> > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
> >                 } else {
> >                         xfs_trans_brelse(tp, ibp);
> >                 }
> > -               /*
> > -                * Point the previous inode on the list to the next inode.
> > -                */
> > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
> > -               ASSERT(next_agino != 0);
> > -               offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
> > -               xfs_trans_inode_buf(tp, last_ibp);
> > -               xfs_trans_log_buf(tp, last_ibp, offset,
> > -                                 (offset + sizeof(xfs_agino_t) - 1));
> > -               xfs_inobp_check(mp, last_ibp);
> > +               if (next_agino != NULLAGINO) {
> > +                       /*
> > +                        * Only modify the unlinked list if we found the
> > +                        * inode being freed. Point the previous inode on
> > +                        * the list to the next inode.
> > +                        */
> > +                       last_dip->di_next_unlinked = cpu_to_be32(next_agino);
> > +                       ASSERT(next_agino != 0);
> > +                       offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
> > +                       xfs_trans_inode_buf(tp, last_ibp);
> > +                       xfs_trans_log_buf(tp, last_ibp, offset,
> > +                                         (offset + sizeof(xfs_agino_t) - 1));
> > +                       xfs_inobp_check(mp, last_ibp);
> > +               }
> >         }
> >         return 0;
> >  }
> >
> > Thank you.
> >
> >
> > 2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> >
> >     Hi Brian and Eric,
> >     If the problem is that the agino can't be found in the unlinked list,
> >     can we just bypass it instead of passing ino=0xffffffff to xfs_inotobp?
> >     Thank you.
> >
> >
> >     2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> >
> >         Hi Eric,
> >         The shutdown issue has still not been reproduced, but I got the
> >         following error today during testing.
> >
> >         Apr 18 07:42:51 10 kernel: Call Trace:
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
> >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
> >         schedule_timeout+0x215/0x2e0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
> >         kmem_zone_alloc+0x77/0xf0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
> >         _xfs_buf_find+0x102/0x280 [xfs]
> >         Apr 18 07:42:51 10 kernel: "echo 0 >
> >         /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0
> >         0 14522      1 0x00000083
> >         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082
> >         0000000000000000 0000000000000000
> >         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef
> >         ffff882015a639d8 0000000000000246
> >         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8
> >         000000000000fb88 ffff880e70491af8
> >         Apr 18 07:42:51 10 kernel: Call Trace:
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
> >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
> >         schedule_timeout+0x215/0x2e0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
> >         kmem_zone_alloc+0x77/0xf0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
> >         _xfs_buf_find+0x102/0x280 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]
> >         xfs_buf_lock+0x53/0x110 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]
> >         _xfs_buf_find+0x102/0x280 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]
> >         xfs_buf_get+0x6b/0x1a0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]
> >         xfs_buf_read+0x2c/0x100 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]
> >         xfs_trans_read_buf+0x1f8/0x400 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]
> >         xfs_read_agi+0x74/0x100 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]
> >         xfs_iunlink+0x5b/0x180 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?
> >         current_fs_time+0x27/0x30
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?
> >         xfs_trans_ichgtime+0x27/0xa0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]
> >         xfs_droplink+0x5b/0x70 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]
> >         xfs_remove+0x27e/0x3a0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?
> >         generic_permission+0x23/0xb0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]
> >         xfs_vn_unlink+0x48/0x90 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>] vfs_unlink+0x9f/0xe0
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?
> >         lookup_hash+0x3a/0x50
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]
> >         do_unlinkat+0x183/0x1c0
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?
> >         syscall_trace_enter+0x1d8/0x1e0
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>] sys_unlink+0x16/0x20
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>] tracesys+0xd9/0xde
> >
> >         Thank you.
> >
> >
> >         2013/4/17 Eric Sandeen <sandeen@sandeen.net
> >         <mailto:sandeen@sandeen.net>>
> >
> >             On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu@gmail.com
> >             <mailto:yongtaofu@gmail.com>> wrote:
> >
> >>             Hi Brian,
> >>             Can I make the following change?
> >
> >             ASSERTS are no-ops in a non-debug kernel, so this won't
> >             change any behavior.  I hope we'll know more if we get new
> >             traces from your patched kernel....
> >
> >             Eric
> >
> >>             --- a/xfs_inode.c
> >>             +++ b/xfs_inode.c
> >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
> >>                                     if (last_ibp != NULL) {
> >>                                             xfs_trans_brelse(tp,
> >>             last_ibp);
> >>                                     }
> >>             +                        ASSERT(next_agino != NULLAGINO);
> >>             +                        ASSERT(next_agino != 0);
> >>                                     next_ino = XFS_AGINO_TO_INO(mp,
> >>             agno, next_agino);
> >>                                     error = xfs_inotobp(mp, tp,
> >>             next_ino, &last_dip,
> >>                                                         &last_ibp,
> >>             &last_offset, 0);
> >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
> >>                                             return error;
> >>                                     }
> >>                                     next_agino =
> >>             be32_to_cpu(last_dip->di_next_unlinked);
> >>             -                       ASSERT(next_agino != NULLAGINO);
> >>             -                       ASSERT(next_agino != 0);
> >>                             }
> >>                             /*
> >>                              * Now last_ibp points to the buffer
> >>             previous to us on
> >>
> >>             Thank you.
> >>
> >>
> >>             2013/4/17 符永涛 <yongtaofu@gmail.com
> >>             <mailto:yongtaofu@gmail.com>>
> >>
> >>                 Hi Brian,
> >>                 If it is because NULLAGINO is passed in to xfs_inotobp(),
> >>                 can I move the following two lines before xfs_inotobp?
> >>
> >>                 For example:
> >>
> >>                  while (next_agino != agino) {
> >>                          /*
> >>                           * If the last inode wasn't the one pointing to
> >>                           * us, then release its buffer since we're not
> >>                           * going to do anything with it.
> >>                           */
> >>                          if (last_ibp != NULL) {
> >>                                  xfs_trans_brelse(tp, last_ibp);
> >>                          }
> >>                          next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
> >>                 +        ASSERT(next_agino != NULLAGINO);
> >>                 +        ASSERT(next_agino != 0);
> >>                          error = xfs_inotobp(mp, tp, next_ino, &last_dip,
> >>                                              &last_ibp, &last_offset, 0);
> >>                          if (error) {
> >>                                  xfs_warn(mp,
> >>                                          "%s: xfs_inotobp() returned error %d.",
> >>                                          __func__, error);
> >>                                  return error;
> >>                          }
> >>                          next_agino = be32_to_cpu(last_dip->di_next_unlinked);
> >>                 -        //ASSERT(next_agino != NULLAGINO);
> >>                 -        //ASSERT(next_agino != 0);
> >>                  }
> >>                 I don't understand xfs well, so please correct me if
> >>                 I'm totally wrong.
> >>                 Thank you very much.
> >>
> >>
> >>                 2013/4/17 符永涛 <yongtaofu@gmail.com
> >>                 <mailto:yongtaofu@gmail.com>>
> >>
> >>                     Hi Brian,
> >>                     I want to ask a question about the shutdown
> >>                     trace. The ino in xfs_iunlink_remove is 0x113,
> >>                     so why did xfs_imap get ino=0xffffffff?
> >>
> >>                     --- xfs_imap --
> >>                     module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> >>                     -- return=0x16
> >>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70
> >>                     ino=0xffffffff
> >>
> >>                     --- xfs_iunlink_remove --
> >>                     module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> >>                     -- return=0x16
> >>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00
> >>                     next_ino=? mp=? agi=? dip=?
> >>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?
> >>                     next_agino=? last_ibp=?
> >>                     last_dip=0xffff882000000000 bucket_index=?
> >>                     offset=? last_offset=0xffffffffffff8810 error=?
> >>                     __func__=[...]
> >>                     ip: i_ino = 0x113, i_flags = 0x0
> >>
> >>                     Thank you.
> >>
> >>
> >>
> >>                     2013/4/17 符永涛 <yongtaofu@gmail.com
> >>                     <mailto:yongtaofu@gmail.com>>
> >>
> >>                         Hi Brian,
> >>                         Thank you for your update. I have applied
> >>                         your last kernel patch. However, it is not easy
> >>                         to reproduce, especially in our test
> >>                         environment, and so far it has not happened again.
> >>                         I'll update the kernel patch now. BTW, are there
> >>                         any findings in the logs from the previous thread?
> >>                         http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >>                         I guess it tends to happen during a glusterfs
> >>                         rebalance, because glusterfs moves a lot of
> >>                         files from one server to another and then
> >>                         unlinks them.
> >>
> >>                         Thank you.
> >>
> >>
> >>                         2013/4/17 Brian Foster <bfoster@redhat.com
> >>                         <mailto:bfoster@redhat.com>>
> >>
> >>                             On 04/16/2013 12:24 PM, Dave Chinner wrote:
> >>                             > On Mon, Apr 15, 2013 at 07:14:39PM
> >>                             -0400, Brian Foster wrote:
> >>                             >> Hi,
> >>                             >>
> >>                             >> Thanks for the data in the previous thread:
> >>                             >>
> >>                             >> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >>                             >>
> >>                             ...
> >>                             >>
> >>                             >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
> >>                             >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
> >>                             >>      ... reproduce ...
> >>                             >>      cat /sys/kernel/debug/tracing/trace > trace.output
> >>                             >
> >>                             > It's better to use trace-cmd for this.
> >>                             > It will result in fewer dropped events, i.e.:
> >>                             >
> >>                             >       $ trace-cmd record -e xfs_iunlink\*
> >>                             >       ... reproduce ...
> >>                             >       ^C
> >>                             >       $ trace-cmd report > trace.output
> >>                             >
> >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
> >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >>                             >> @@ -581,6 +581,8 @@
> >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
> >>                             ...
> >>                             >
> >>                             > I would suggest that the tracing should
> >>                             > be at entry of the function, otherwise we
> >>                             > won't get a tracepoint for the operation that
> >>                             > triggers the shutdown. (That's the reason
> >>                             > most tracepoints in XFS are at function
> >>                             > entry...)
> >>                             >
> >>
> >>                             Good points, thanks Dave. A v2 that pulls
> >>                             up the tracepoints towards
> >>                             function entry is appended.
> >>
> >>                             Brian
> >>
> >>                             From
> >>                             280943e78ebe0b97a774cba51e7815c42f044b55
> >>                             Mon Sep 17 00:00:00 2001
> >>                             From: Brian Foster <bfoster@redhat.com
> >>                             <mailto:bfoster@redhat.com>>
> >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
> >>                             Subject: [PATCH v2] xfs: add tracepoints
> >>                             for xfs_iunlink and
> >>                             xfs_iunlink_remove
> >>
> >>                             ---
> >>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++
> >>                              fs/xfs/xfs_inode.c           |    4 ++++
> >>                              2 files changed, 6 insertions(+), 0
> >>                             deletions(-)
> >>
> >>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h
> >>                             b/fs/xfs/linux-2.6/xfs_trace.h
> >>                             index adc6ec4..338a0f9 100644
> >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
> >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >>                             @@ -583,6 +583,8 @@
> >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
> >>                              DEFINE_INODE_EVENT(xfs_destroy_inode);
> >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
> >>                              DEFINE_INODE_EVENT(xfs_clear_inode);
> >>                             +DEFINE_INODE_EVENT(xfs_iunlink);
> >>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);
> >>
> >>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
> >>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
> >>                             diff --git a/fs/xfs/xfs_inode.c
> >>                             b/fs/xfs/xfs_inode.c
> >>                             index 19900f0..d705c77 100644
> >>                             --- a/fs/xfs/xfs_inode.c
> >>                             +++ b/fs/xfs/xfs_inode.c
> >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
> >>
> >>                                     mp = tp->t_mountp;
> >>
> >>                             +       trace_xfs_iunlink(ip);
> >>                             +
> >>                                     /*
> >>                                      * Get the agi buffer first.  It
> >>                             ensures lock ordering
> >>                                      * on the list.
> >>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
> >>                                     mp = tp->t_mountp;
> >>                                     agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
> >>
> >>                             +       trace_xfs_iunlink_remove(ip);
> >>                             +
> >>                                     /*
> >>                                      * Get the agi buffer first.  It
> >>                             ensures lock ordering
> >>                                      * on the list.
> >>                             --
> >>                             1.7.7.6
> >>
> >>
> >>
> >>
> >>                         --
> >>                         符永涛
> >>
> >>
> >>
> >>
> >>                     --
> >>                     符永涛
> >>
> >>
> >>
> >>
> >>                 --
> >>                 符永涛
> >>
> >>
> >>
> >>
> >>             --
> >>             符永涛
> >>             _______________________________________________
> >>             xfs mailing list
> >>             xfs@oss.sgi.com <mailto:xfs@oss.sgi.com>
> >>             http://oss.sgi.com/mailman/listinfo/xfs
> >
> >
> >
> >
> >         --
> >         符永涛
> >
> >
> >
> >
> >     --
> >     符永涛
> >
> >
> >
> >
> > --
> > 符永涛
>
>


-- 
符永涛

[-- Attachment #1.2: Type: text/html, Size: 48096 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18 15:23                       ` 符永涛
@ 2013-04-18 16:40                         ` 符永涛
  2013-04-18 17:03                         ` Eric Sandeen
                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-18 16:40 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs


[-- Attachment #1.1: Type: text/plain, Size: 25696 bytes --]

Hi Brian and Eric,
Here are the metadump file and the xfs_repair log from one of the servers.
Again, this happened exactly when one of the glusterfs servers finished its
rebalance.
https://docs.google.com/file/d/0B7n2C4T5tfNCdDFwN24zdkVmdHM/edit?usp=sharing

https://docs.google.com/file/d/0B7n2C4T5tfNCOGdpOGhIaVFRV28/edit?usp=sharing

The server that finished the rebalance and triggered the XFS shutdown has
the following in its glusterfs log:

[2013-04-18 22:42:52.063196] E [dht-rebalance.c:1194:gf_defrag_migrate_data] 0-test2-dht: migrate-data failed for /fytest/8/58037
[2013-04-18 22:42:52.067867] I [dht-rebalance.c:639:dht_migrate_file] 0-test2-dht: /fytest/8/58040: attempting to move from test2-replicate-1 to test2-replicate-3
[2013-04-18 22:42:52.070530] W [dht-rebalance.c:353:__dht_check_free_space] 0-test2-dht: data movement attempted from node (test2-replicate-1) with higher disk space to a node (test2-replicate-3) with lesser disk space (/fytest/8/58040)
[2013-04-18 22:42:52.070613] E [dht-rebalance.c:1194:gf_defrag_migrate_data] 0-test2-dht: migrate-data failed for /fytest/8/58040
[2013-04-18 22:43:11.679797] I [dht-common.c:2337:dht_setxattr] 0-test2-dht: fixing the layout of /fytest/2
[2013-04-18 22:43:11.682162] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-test2-dht: migrate data called on /fytest/2
[2013-04-18 22:43:14.246209] I [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
[2013-04-18 22:43:14.246278] I [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated: 7203, size: 745897761616, lookups: 79881, failures: 9002
[2013-04-18 22:43:14.247021] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3ec22e767d] (-->/lib64/libpthread.so.0() [0x3ec2607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-: received signum (15), shutting down
[2013-04-18 22:43:14.247202] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap

The shutdown happened at 2013-04-18 22:43:14.

Thank you.



2013/4/18 符永涛 <yongtaofu@gmail.com>

> Hi Brian and Eric,
> The shutdown is not easy to produce but finally right now 2 of our servers
> in our test cluster xfs was shutdown.
>
> the trace output as following
>
> https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
>
> Sorry but the systemtap is interrupt and I didn't noticed that so I didn't
> get systemtap logs.
>
> /var/log/message is same as before
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22.
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
> 0xffffffffa02d44aa
> Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>
> The metadump file is large I'll share it to you soon.
>
>
> 2013/4/18 Brian Foster <bfoster@redhat.com>
>
>> On 04/18/2013 04:25 AM, 符永涛 wrote:
>> > Hi Brian and Eric,
>> > Can I change as following to bypass it?
>>
>> This is probably not a wise thing to do. The problem we're seeing here
>> is indicative of a potentially larger problem than this particular error
>> path. An inode is being unlinked and inactivated, but we aren't finding
>> on the list where we expect it to be. Killing the error return doesn't
>> eliminate the larger problem.
>>
>> So while changes could end up being made in this area as part of a fix,
>> I would not suggest making any changes beyond those designed to help
>> debug until we have a better idea of root cause.
>>
>> Brian
>>
>> > --- a/xfs_inode.c
>> > +++ b/xfs_inode.c
>> > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
>> >                  */
>> >                 next_agino =
>> be32_to_cpu(agi->agi_unlinked[bucket_index]);
>> >                 last_ibp = NULL;
>> > -               while (next_agino != agino) {
>> > +               while (next_agino != agino && next_agino != NULLAGINO) {
>> >                         /*
>> >                          * If the last inode wasn't the one pointing to
>> >                          * us, then release its buffer since we're not
>> > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
>> >                         ASSERT(next_agino != NULLAGINO);
>> >                         ASSERT(next_agino != 0);
>> >                 }
>> > +               if (next_agino == NULLAGINO) {
>> > +                       /*
>> > +                        *After search the list for the inode being free
>> > +                        *we still can't find it.
>> > +                        */
>> > +                       xfs_err(mp, "%s ino %lld not found in unlinked
>> > list.",
>> > +                                    __func__, (unsigned long
>> > long)ip->i_ino);
>> > +               }
>> >                 /*
>> >                  * Now last_ibp points to the buffer previous to us on
>> >                  * the unlinked list.  Pull us from the list.
>> > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
>> >                 } else {
>> >                         xfs_trans_brelse(tp, ibp);
>> >                 }
>> > -               /*
>> > -                * Point the previous inode on the list to the next
>> inode.
>> > -                */
>> > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
>> > -               ASSERT(next_agino != 0);
>> > -               offset = last_offset + offsetof(xfs_dinode_t,
>> > di_next_unlinked);
>> > -               xfs_trans_inode_buf(tp, last_ibp);
>> > -               xfs_trans_log_buf(tp, last_ibp, offset,
>> > -                                 (offset + sizeof(xfs_agino_t) - 1));
>> > -               xfs_inobp_check(mp, last_ibp);
>> > +               if (next_agino != NULLAGINO) {
>> > +                       /*
>> > +                       * If only find the inode being free then we
>> modify
>> > +                       * the unlinked list.
>> > +                       * Point the previous inode on the list to the
>> > next inode.
>> > +                       */
>> > +                       last_dip->di_next_unlinked =
>> > cpu_to_be32(next_agino);
>> > +                       ASSERT(next_agino != 0);
>> > +                       offset = last_offset + offsetof(xfs_dinode_t,
>> > di_next_unlinked);
>> > +                       xfs_trans_inode_buf(tp, last_ibp);
>> > +                       xfs_trans_log_buf(tp, last_ibp, offset,
>> > +                                         (offset + sizeof(xfs_agino_t)
>> > - 1));
>> > +                       xfs_inobp_check(mp, last_ibp);
>> > +               }
>> >         }
>> >         return 0;
>> >  }
>> >
>> > Thank you.
>> >
>> >
>> > 2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>> >
>> >     Hi Brain and Eric,
>> >     If the problem is the agno can't be found in the unlinked list. Can
>> >     we just bypass it instead of passing ino=0xffffffff to xfs_inotobp?
>> >     Thank you.
>> >
>> >
>> >     2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>> >
>> >         Hi Eric,
>> >         The shutdown issue is still not reproduced yet. But I get the
>> >         following error today during test.
>> >
>> >         Apr 18 07:42:51 10 kernel: Call Trace:
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
>> >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
>> >         schedule_timeout+0x215/0x2e0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
>> >         kmem_zone_alloc+0x77/0xf0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
>> >         _xfs_buf_find+0x102/0x280 [xfs]
>> >         Apr 18 07:42:51 10 kernel: "echo 0 >
>> >         /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0
>> >         0 14522      1 0x00000083
>> >         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082
>> >         0000000000000000 0000000000000000
>> >         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef
>> >         ffff882015a639d8 0000000000000246
>> >         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8
>> >         000000000000fb88 ffff880e70491af8
>> >         Apr 18 07:42:51 10 kernel: Call Trace:
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
>> >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
>> >         schedule_timeout+0x215/0x2e0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
>> >         kmem_zone_alloc+0x77/0xf0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
>> >         _xfs_buf_find+0x102/0x280 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]
>> >         xfs_buf_lock+0x53/0x110 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]
>> >         _xfs_buf_find+0x102/0x280 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]
>> >         xfs_buf_get+0x6b/0x1a0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]
>> >         xfs_buf_read+0x2c/0x100 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]
>> >         xfs_trans_read_buf+0x1f8/0x400 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]
>> >         xfs_read_agi+0x74/0x100 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]
>> >         xfs_iunlink+0x5b/0x180 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?
>> >         current_fs_time+0x27/0x30
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?
>> >         xfs_trans_ichgtime+0x27/0xa0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]
>> >         xfs_droplink+0x5b/0x70 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]
>> >         xfs_remove+0x27e/0x3a0 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?
>> >         generic_permission+0x23/0xb0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]
>> >         xfs_vn_unlink+0x48/0x90 [xfs]
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>]
>> vfs_unlink+0x9f/0xe0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?
>> >         lookup_hash+0x3a/0x50
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]
>> >         do_unlinkat+0x183/0x1c0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?
>> >         syscall_trace_enter+0x1d8/0x1e0
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>]
>> sys_unlink+0x16/0x20
>> >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>]
>> tracesys+0xd9/0xde
>> >
>> >         Thank you.
>> >
>> >
>> >         2013/4/17 Eric Sandeen <sandeen@sandeen.net
>> >         <mailto:sandeen@sandeen.net>>
>> >
>> >             On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu@gmail.com
>> >             <mailto:yongtaofu@gmail.com>> wrote:
>> >
>> >>             Hi Brain,
>> >>             Can I change as following?
>> >
>> >             ASSERTS are no-ops in a non-debug kernel, so this won't
>> >             change any behavior.  I hope we'll know more if we get new
>> >             traces from your patched kernel....
>> >
>> >             Eric
>> >
>> >>             --- a/xfs_inode.c
>> >>             +++ b/xfs_inode.c
>> >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
>> >>                                     if (last_ibp != NULL) {
>> >>                                             xfs_trans_brelse(tp,
>> >>             last_ibp);
>> >>                                     }
>> >>             +                        ASSERT(next_agino != NULLAGINO);
>> >>             +                        ASSERT(next_agino != 0);
>> >>                                     next_ino = XFS_AGINO_TO_INO(mp,
>> >>             agno, next_agino);
>> >>                                     error = xfs_inotobp(mp, tp,
>> >>             next_ino, &last_dip,
>> >>                                                         &last_ibp,
>> >>             &last_offset, 0);
>> >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
>> >>                                             return error;
>> >>                                     }
>> >>                                     next_agino =
>> >>             be32_to_cpu(last_dip->di_next_unlinked);
>> >>             -                       ASSERT(next_agino != NULLAGINO);
>> >>             -                       ASSERT(next_agino != 0);
>> >>                             }
>> >>                             /*
>> >>                              * Now last_ibp points to the buffer
>> >>             previous to us on
>> >>
>> >>             Thank you.
>> >>
>> >>
>> >>             2013/4/17 符永涛 <yongtaofu@gmail.com
>> >>             <mailto:yongtaofu@gmail.com>>
>> >>
>> >>                 Hi Brain,
>> >>                 If it is because NULLAGINO is passed in  to
>> xfs_inotobp().
>> >>                 Can I move the following two lines before xfs_inotobp?
>> >>
>> >>                 For example:
>> >>
>> >>                 1767                 while (next_agino != agino) {
>> >>                 1768                         /*
>> >>                 1769                          * If the last inode
>> >>                 wasn't the one pointing to
>> >>                 1770                          * us, then release its
>> >>                 buffer since we're not
>> >>                 1771                          * going to do anything
>> >>                 with it.
>> >>                 1772                          */
>> >>                 1773                         if (last_ibp != NULL) {
>> >>                 1774
>> >>                 xfs_trans_brelse(tp, last_ibp);
>> >>                 1775                         }
>> >>                 1776                         next_ino =
>> >>                 XFS_AGINO_TO_INO(mp, agno, next_agino);
>> >>                 +                               ASSERT(next_agino !=
>> >>                 NULLAGINO);
>> >>                 +                               ASSERT(next_agino !=
>> 0);
>> >>                 1777                         error = xfs_inotobp(mp,
>> >>                 tp, next_ino, &last_dip,
>> >>                 1778
>> >>                 &last_ibp, &last_offset, 0);
>> >>                 1779                         if (error) {
>> >>                 1780                                 xfs_warn(mp,
>> >>                 1781                                         "%s:
>> >>                 xfs_inotobp() returned error %d.",
>> >>                 1782                                         __func__,
>> >>                 error);
>> >>                 1783                                 return error;
>> >>                 1784                         }
>> >>                 1785                         next_agino =
>> >>                 be32_to_cpu(last_dip->di_next_unlinked);
>> >>                 -                               //ASSERT(next_agino !=
>> >>                 NULLAGINO);
>> >>                 -                               //ASSERT(next_agino !=
>> 0);
>> >>                 1788                 }
>> >>                 I don't understand xfs well and correct me if I'm
>> >>                 totally wrong.
>> >>                 Thank you very much.
>> >>
>> >>
>> >>                 2013/4/17 符永涛 <yongtaofu@gmail.com
>> >>                 <mailto:yongtaofu@gmail.com>>
>> >>
>> >>                     Hi Brain,
>> >>                     I want to ask a question, according to the
>> >>                     shutdown trace. The ino in  xfs_iunlink_remove
>> >>                     is 0x113, why xfs_imap got ino=0xffffffff ?
>> >>
>> >>                     --- xfs_imap --
>> >>                     module("xfs").function("xfs_imap@fs
>> /xfs/xfs_ialloc.c:1257").return
>> >>                     -- return=0x16
>> >>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70
>> >>                     ino=0xffffffff
>> >>
>> >>                     --- xfs_iunlink_remove --
>> >>                     module("xfs").function("xfs_iunlink_remove@fs
>> /xfs/xfs_inode.c:1680").return
>> >>                     -- return=0x16
>> >>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00
>> >>                     next_ino=? mp=? agi=? dip=?
>> >>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?
>> >>                     next_agino=? last_ibp=?
>> >>                     last_dip=0xffff882000000000 bucket_index=?
>> >>                     offset=? last_offset=0xffffffffffff8810 error=?
>> >>                     __func__=[...]
>> >>                     ip: i_ino = 0x113, i_flags = 0x0
>> >>
>> >>                     Thank you.
>> >>
>> >>
>> >>
>> >>                     2013/4/17 符永涛 <yongtaofu@gmail.com
>> >>                     <mailto:yongtaofu@gmail.com>>
>> >>
>> >>                         Hi Brain,
>> >>                         Thank you for your update, and I have applied
>> >>                         your last kernel patch. However it is not easy
>> >>                         to reproduce especially in out test
>> >>                         environment. Till now is not happens again.
>> >>                         I'll update the kernel patch now. BTW is there
>> >>                         any findings in the logs of previous thread?
>> >>
>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>> >>                         I guess it tend to happen during glusterfs
>> >>                         rebalance because glusterfs moves a lot of
>> >>                         file from one server to another and then
>> >>                         unlink it.
>> >>
>> >>                         Thank you.
>> >>
>> >>
>> >>                         2013/4/17 Brian Foster <bfoster@redhat.com
>> >>                         <mailto:bfoster@redhat.com>>
>> >>
>> >>                             On 04/16/2013 12:24 PM, Dave Chinner wrote:
>> >>                             > On Mon, Apr 15, 2013 at 07:14:39PM
>> >>                             -0400, Brian Foster wrote:
>> >>                             >> Hi,
>> >>                             >>
>> >>                             >> Thanks for the data in the previous
>> thread:
>> >>                             >>
>> >>                             >>
>> >>
>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>> >>                             >>
>> >>                             ...
>> >>                             >>
>> >>                             >>      echo 1 >
>> >>
>> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>> >>                             >>      echo 1 >
>> >>
>> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>> >>                             >>      ... reproduce ...
>> >>                             >>      cat
>> >>                             /sys/kernel/debug/tracing/trace >
>> trace.output
>> >>                             >
>> >>                             > It's better to use trace-cmd for this.
>> >>                             it will result in less
>> >>                             > dropped events. i.e.:
>> >>                             >
>> >>                             >       $ trace-cmd record -e xfs_iunlink\*
>> >>                             >       ... reproduce ...
>> >>                             >       ^C
>> >>                             >       $ trace-cmd report > trace.output
>> >>                             >
>> >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
>> >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>> >>                             >> @@ -581,6 +581,8 @@
>> >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
>> >>                             ...
>> >>                             >
>> >>                             > I would suggest that the the tracing
>> >>                             shoul dbe at entry of the
>> >>                             > function, otherwise we won't get a
>> >>                             tracepoint for the operation that
>> >>                             > triggers the shutdown. (That's the
>> >>                             reason most tracepoints in XFS
>> >>                             > are at function entry...)
>> >>                             >
>> >>
>> >>                             Good points, thanks Dave. A v2 that pulls
>> >>                             up the tracepoints towards
>> >>                             function entry is appended.
>> >>
>> >>                             Brian
>> >>
>> >>                             From
>> >>                             280943e78ebe0b97a774cba51e7815c42f044b55
>> >>                             Mon Sep 17 00:00:00 2001
>> >>                             From: Brian Foster <bfoster@redhat.com
>> >>                             <mailto:bfoster@redhat.com>>
>> >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
>> >>                             Subject: [PATCH v2] xfs: add tracepoints
>> >>                             for xfs_iunlink and
>> >>                             xfs_iunlink_remove
>> >>
>> >>                             ---
>> >>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++
>> >>                              fs/xfs/xfs_inode.c           |    4 ++++
>> >>                              2 files changed, 6 insertions(+), 0
>> >>                             deletions(-)
>> >>
>> >>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h
>> >>                             b/fs/xfs/linux-2.6/xfs_trace.h
>> >>                             index adc6ec4..338a0f9 100644
>> >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
>> >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
>> >>                             @@ -583,6 +583,8 @@
>> >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
>> >>                              DEFINE_INODE_EVENT(xfs_destroy_inode);
>> >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
>> >>                              DEFINE_INODE_EVENT(xfs_clear_inode);
>> >>                             +DEFINE_INODE_EVENT(xfs_iunlink);
>> >>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>> >>
>> >>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>> >>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
>> >>                             diff --git a/fs/xfs/xfs_inode.c
>> >>                             b/fs/xfs/xfs_inode.c
>> >>                             index 19900f0..d705c77 100644
>> >>                             --- a/fs/xfs/xfs_inode.c
>> >>                             +++ b/fs/xfs/xfs_inode.c
>> >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
>> >>
>> >>                                     mp = tp->t_mountp;
>> >>
>> >>                             +       trace_xfs_iunlink(ip);
>> >>                             +
>> >>                                     /*
>> >>                                      * Get the agi buffer first.  It
>> >>                             ensures lock ordering
>> >>                                      * on the list.
>> >>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
>> >>                                     mp = tp->t_mountp;
>> >>                                     agno = XFS_INO_TO_AGNO(mp,
>> ip->i_ino);
>> >>
>> >>                             +       trace_xfs_iunlink_remove(ip);
>> >>                             +
>> >>                                     /*
>> >>                                      * Get the agi buffer first.  It
>> >>                             ensures lock ordering
>> >>                                      * on the list.
>> >>                             --
>> >>                             1.7.7.6
>> >>
>> >>
>> >>
>> >>
>> >>                         --
>> >>                         符永涛
>> >>
>> >>
>> >>
>> >>
>> >>                     --
>> >>                     符永涛
>> >>
>> >>
>> >>
>> >>
>> >>                 --
>> >>                 符永涛
>> >>
>> >>
>> >>
>> >>
>> >>             --
>> >>             符永涛
>> >>             _______________________________________________
>> >>             xfs mailing list
>> >>             xfs@oss.sgi.com <mailto:xfs@oss.sgi.com>
>> >>             http://oss.sgi.com/mailman/listinfo/xfs
>> >
>> >
>> >
>> >
>> >         --
>> >         符永涛
>> >
>> >
>> >
>> >
>> >     --
>> >     符永涛
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>>
>>
>
>
> --
> 符永涛
>



-- 
符永涛

[-- Attachment #1.2: Type: text/html, Size: 51177 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18 15:23                       ` 符永涛
  2013-04-18 16:40                         ` 符永涛
@ 2013-04-18 17:03                         ` Eric Sandeen
  2013-04-18 18:35                         ` Eric Sandeen
  2013-04-18 20:59                         ` Brian Foster
  3 siblings, 0 replies; 50+ messages in thread
From: Eric Sandeen @ 2013-04-18 17:03 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/18/13 8:23 AM, 符永涛 wrote:
> Hi Brian and Eric,
> The shutdown is not easy to produce but finally right now 2 of our servers in our test cluster xfs was shutdown.
> 
> the trace output as following
> https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
> 
> Sorry but the systemtap is interrupt and I didn't noticed that so I didn't get systemtap logs.
> 
> /var/log/message is same as before
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d44aa
> Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
> Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
> Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 
> The metadump file is large I'll share it to you soon.
> 

Thanks, we'll take a look.  Just to double check, in the kernel that ran the tracepoints, did you use Brian's 2nd version of the patch?  I want to make sure the tracepoints were at the top of the function.

Since you're patching xfs anyway, can you add something like this for next time:

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 796edce..cad0e8e 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1777,8 +1777,9 @@ xfs_iunlink_remove(
 					    &last_ibp, &last_offset, 0);
 			if (error) {
 				xfs_warn(mp,
-					"%s: xfs_inotobp() returned error %d.",
-					__func__, error);
+					"%s: xfs_inotobp() returned error %d "
+					"for inode 0x%llx ag %d agino %x\n",
+					__func__, error, ip->i_ino, agno, agino);
 				return error;
 			}
 			next_agino = be32_to_cpu(last_dip->di_next_unlinked);

so that when we encounter the error we're sure to have the problematic inode number.
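
For reference, the three values this message would print are tied together by the inode number encoding. The sketch below is a userspace approximation of the kernel's XFS_INO_TO_AGNO, XFS_INO_TO_AGINO and XFS_AGINO_TO_INO macros. The toy_* names and the agino_log value of 30 used in the note afterwards are assumptions for illustration only; the real bit count comes from the superblock geometry of the filesystem in question.

```c
#include <stdint.h>

#define NULLAGINO 0xffffffffu  /* "no inode" sentinel, same value as in the kernel */

/* AG number: the bits above the per-AG inode bits. */
static inline uint32_t toy_ino_to_agno(unsigned agino_log, uint64_t ino)
{
	return (uint32_t)(ino >> agino_log);
}

/* AG-relative inode number: the low agino_log bits. */
static inline uint32_t toy_ino_to_agino(unsigned agino_log, uint64_t ino)
{
	return (uint32_t)(ino & ((1ULL << agino_log) - 1));
}

/* Rebuild a global inode number from its AG number and AG-relative part. */
static inline uint64_t toy_agino_to_ino(unsigned agino_log, uint32_t agno,
					uint32_t agino)
{
	return ((uint64_t)agno << agino_log) | agino;
}
```

With the assumed agino_log of 30, inode 0x113 splits into ag 0 and agino 0x113, and toy_agino_to_ino(30, 0, NULLAGINO) gives 0xffffffff, which would be consistent with the ino=0xffffffff that xfs_imap received in the earlier stap output.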

Thanks,
-Eric


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18 15:23                       ` 符永涛
  2013-04-18 16:40                         ` 符永涛
  2013-04-18 17:03                         ` Eric Sandeen
@ 2013-04-18 18:35                         ` Eric Sandeen
  2013-04-18 20:59                         ` Brian Foster
  3 siblings, 0 replies; 50+ messages in thread
From: Eric Sandeen @ 2013-04-18 18:35 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/18/13 8:23 AM, 符永涛 wrote:
> Hi Brian and Eric,
> The shutdown is not easy to produce but finally right now 2 of our servers in our test cluster xfs was shutdown.
> 
> the trace output as following
> https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
> 

Here's something interesting. For 2 inodes we have double/racing calls to xfs_iunlink:

=== 0x5cc0b ===
           <...>-8336  [004]  6931.372924: xfs_iunlink: dev 8:16 ino 0x5cc0b
           <...>-8336  [004]  6931.372965: xfs_iunlink_remove: dev 8:16 ino 0x5cc0b
           <...>-27541 [001] 35061.349747: xfs_iunlink: dev 8:16 ino 0x5cc0b
           <...>-3356  [001] 36449.762504: xfs_iunlink_remove: dev 8:16 ino 0x5cc0b
           <...>-3300  [003] 41013.398566: xfs_iunlink: dev 8:16 ino 0x5cc0b
           <...>-26115 [012] 41013.399884: xfs_iunlink: dev 8:16 ino 0x5cc0b
           <...>-26115 [012] 41013.399935: xfs_iunlink_remove: dev 8:16 ino 0x5cc0b
           <...>-28961 [000] 68977.951208: xfs_iunlink: dev 8:16 ino 0x5cc0b
           <...>-3364  [021] 81616.210533: xfs_iunlink_remove: dev 8:16 ino 0x5cc0b

=== 0x7ef8c ===
           <...>-13169 [001] 118751.536025: xfs_iunlink: dev 8:16 ino 0x7ef8c
           <...>-13169 [001] 118751.536049: xfs_iunlink_remove: dev 8:16 ino 0x7ef8c
           <...>-3594  [015] 119027.006161: xfs_iunlink: dev 8:16 ino 0x7ef8c
           <...>-3594  [015] 119027.006186: xfs_iunlink_remove: dev 8:16 ino 0x7ef8c
           <...>-3591  [001] 121423.286004: xfs_iunlink: dev 8:16 ino 0x7ef8c
           <...>-4141  [019] 121423.288518: xfs_iunlink: dev 8:16 ino 0x7ef8c
           <...>-4141  [019] 121423.288541: xfs_iunlink_remove: dev 8:16 ino 0x7ef8c

2 threads on 2 different CPUs adding the same inode to the unlinked list in a race;
this will corrupt the list and lead to the failure to find the other inode we're
looking for.  So, progress!  We'll take a look at the iunlink paths.
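
To illustrate how a double insert could produce exactly this failure, here is a simplified userspace model of one unlinked-list bucket. It is a sketch only: the toy_* names and the flat array are assumptions, there are no buffers, transactions or locking, and it shows just one possible interleaving rather than what necessarily happened on the reporter's servers.

```c
#include <stdint.h>

#define NULLAGINO  0xffffffffu  /* end-of-list sentinel, same value as in the kernel */
#define TOY_INODES 8            /* toy table size, illustration only */

struct toy_ag {
	uint32_t head;              /* stands in for one agi_unlinked[] bucket */
	uint32_t next[TOY_INODES];  /* stands in for each inode's di_next_unlinked */
};

void toy_init(struct toy_ag *ag)
{
	ag->head = NULLAGINO;
	for (int i = 0; i < TOY_INODES; i++)
		ag->next[i] = NULLAGINO;
}

/* Head insertion, as xfs_iunlink() does, with no "already listed" check. */
void toy_iunlink(struct toy_ag *ag, uint32_t agino)
{
	ag->next[agino] = ag->head;
	ag->head = agino;
}

/*
 * Returns 0 on success, -1 if the walk hits NULLAGINO before finding agino
 * (the step counter is a toy-only guard against looping forever).
 */
int toy_iunlink_remove(struct toy_ag *ag, uint32_t agino)
{
	uint32_t last = NULLAGINO, next_agino = ag->head;
	int steps = 0;

	if (next_agino == agino) {
		/* We are the bucket head: point the head at our successor. */
		ag->head = ag->next[agino];
		ag->next[agino] = NULLAGINO;
		return 0;
	}
	while (next_agino != agino) {
		if (next_agino == NULLAGINO || ++steps > TOY_INODES)
			return -1;
		last = next_agino;
		next_agino = ag->next[last];
	}
	/* Point the previous inode on the list at our successor. */
	ag->next[last] = ag->next[agino];
	ag->next[agino] = NULLAGINO;
	return 0;
}
```

In this model, calling toy_iunlink() for inode 1, then twice for inode 2, makes inode 2 point at itself and drops inode 1 off the chain. A later toy_iunlink_remove() of inode 2 succeeds, but removing inode 1 then walks to NULLAGINO and returns -1, which corresponds to the point where the kernel hands NULLAGINO to xfs_inotobp() and gets error 22.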

-Eric




^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18 15:23                       ` 符永涛
                                           ` (2 preceding siblings ...)
  2013-04-18 18:35                         ` Eric Sandeen
@ 2013-04-18 20:59                         ` Brian Foster
  2013-04-19  6:40                           ` 符永涛
  3 siblings, 1 reply; 50+ messages in thread
From: Brian Foster @ 2013-04-18 20:59 UTC (permalink / raw)
  To: 符永涛; +Cc: Eric Sandeen, xfs

On 04/18/2013 11:23 AM, 符永涛 wrote:
> Hi Brian and Eric,
> The shutdown is not easy to produce but finally right now 2 of our
> servers in our test cluster xfs was shutdown.
> 

Understood. We've been trying very hard to reproduce this ourselves to
make it easier to debug, but haven't been able to reproduce it at all so
far. This process allows us to make _some_ progress on the issue, even
if it is slower going than we'd like... ;)

> the trace output as following
> https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
> 

Thanks again for the data. The racing behavior Eric called out (nice
catch!) in his last mail lit up some light bulbs internally with regard
to some old locking issues triggered by XFS in the 6.3 kernel. The
following bug serves as an example:

https://bugzilla.redhat.com/show_bug.cgi?id=852847

... the fix for which went into the 2.6.32-279.19.1 6.3.z release. Could
you move some or all of your servers to this kernel[1] and see how it
goes? The best case is it resolves the problem, worst case we carry on
debugging from there...

Brian

[1] -
http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm

> Sorry but the systemtap is interrupt and I didn't noticed that so I
> didn't get systemtap logs.
> 
> /var/log/message is same as before
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22.
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
> 0xffffffffa02d44aa
> Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 
> The metadump file is large I'll share it to you soon.
> 
> 
> 2013/4/18 Brian Foster <bfoster@redhat.com <mailto:bfoster@redhat.com>>
> 
>     On 04/18/2013 04:25 AM, 符永涛 wrote:
>     > Hi Brian and Eric,
>     > Can I change as following to bypass it?
> 
>     This is probably not a wise thing to do. The problem we're seeing here
>     is indicative of a potentially larger problem than this particular error
>     path. An inode is being unlinked and inactivated, but we aren't finding
>     on the list where we expect it to be. Killing the error return doesn't
>     eliminate the larger problem.
> 
>     So while changes could end up being made in this area as part of a fix,
>     I would not suggest making any changes beyond those designed to help
>     debug until we have a better idea of root cause.
> 
>     Brian
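
A minimal userspace sketch of the failure mode discussed above, assuming a
toy inode table in place of real inode buffers. The names model_bucket,
model_inotobp and model_iunlink_find_prev are hypothetical and are not the
kernel's code; only NULLAGINO (0xffffffff) and the error value 22 (EINVAL)
come from this thread. It illustrates why an inode that is missing from its
unlinked bucket makes the walk run to the list terminator, which then fails
the inode lookup:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model for illustration only; this is not the XFS code. */
#define NULLAGINO     ((uint32_t)-1)  /* list terminator, 0xffffffff */
#define MODEL_NINODES 8               /* hypothetical toy inode table size */

/* next_unlinked[i] plays the role of di_next_unlinked for inode i. */
struct model_bucket {
	uint32_t head;                          /* like agi_unlinked[bucket] */
	uint32_t next_unlinked[MODEL_NINODES];
};

/* Stand-in for xfs_inotobp(): rejects inode numbers outside the table. */
static int model_inotobp(uint32_t agino)
{
	return agino < MODEL_NINODES ? 0 : EINVAL;
}

/*
 * Walk the bucket until we reach 'agino'. On success, return 0 and store
 * the predecessor in *prev (NULLAGINO if 'agino' is the bucket head).
 * If 'agino' is not on the list, the walk reaches NULLAGINO, the lookup
 * of that value fails, and EINVAL is returned.
 */
static int model_iunlink_find_prev(const struct model_bucket *b,
				   uint32_t agino, uint32_t *prev)
{
	uint32_t next = b->head;

	*prev = NULLAGINO;
	while (next != agino) {
		int error = model_inotobp(next); /* NULLAGINO fails here */

		if (error)
			return error;
		*prev = next;
		next = b->next_unlinked[next];
	}
	return 0;
}
```

With a bucket holding 3 -> 5, asking for the predecessor of 5 yields 3,
while asking for an inode that is not on the list (say 2) returns EINVAL,
mirroring the "returned error 22" message in the log. Stopping the loop at
the terminator would silence the error but would not explain why the inode
is missing from the list.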
> 
>     > --- a/xfs_inode.c
>     > +++ b/xfs_inode.c
>     > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
>     >                  */
>     >                 next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
>     >                 last_ibp = NULL;
>     > -               while (next_agino != agino) {
>     > +               while (next_agino != agino && next_agino != NULLAGINO) {
>     >                         /*
>     >                          * If the last inode wasn't the one pointing to
>     >                          * us, then release its buffer since we're not
>     > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
>     >                         ASSERT(next_agino != NULLAGINO);
>     >                         ASSERT(next_agino != 0);
>     >                 }
>     > +               if (next_agino == NULLAGINO) {
>     > +                       /*
>     > +                        *After search the list for the inode being free
>     > +                        *we still can't find it.
>     > +                        */
>     > +                       xfs_err(mp, "%s ino %lld not found in unlinked list.",
>     > +                                    __func__, (unsigned long long)ip->i_ino);
>     > +               }
>     >                 /*
>     >                  * Now last_ibp points to the buffer previous to us on
>     >                  * the unlinked list.  Pull us from the list.
>     > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
>     >                 } else {
>     >                         xfs_trans_brelse(tp, ibp);
>     >                 }
>     > -               /*
>     > -                * Point the previous inode on the list to the next inode.
>     > -                */
>     > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
>     > -               ASSERT(next_agino != 0);
>     > -               offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
>     > -               xfs_trans_inode_buf(tp, last_ibp);
>     > -               xfs_trans_log_buf(tp, last_ibp, offset,
>     > -                                 (offset + sizeof(xfs_agino_t) - 1));
>     > -               xfs_inobp_check(mp, last_ibp);
>     > +               if (next_agino != NULLAGINO) {
>     > +                       /*
>     > +                       * If only find the inode being free then we modify
>     > +                       * the unlinked list.
>     > +                       * Point the previous inode on the list to the next inode.
>     > +                       */
>     > +                       last_dip->di_next_unlinked = cpu_to_be32(next_agino);
>     > +                       ASSERT(next_agino != 0);
>     > +                       offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
>     > +                       xfs_trans_inode_buf(tp, last_ibp);
>     > +                       xfs_trans_log_buf(tp, last_ibp, offset,
>     > +                                         (offset + sizeof(xfs_agino_t) - 1));
>     > +                       xfs_inobp_check(mp, last_ibp);
>     > +               }
>     >         }
>     >         return 0;
>     >  }
>     >
>     > Thank you.
>     >
>     >
>     > 2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
>     <mailto:yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>>
>     >
>     >     Hi Brian and Eric,
>     >     If the problem is that the agino can't be found in the unlinked
>     >     list, can we just bypass it instead of passing ino=0xffffffff to
>     >     xfs_inotobp?
>     >     Thank you.
>     >
>     >
>     >     2013/4/18 符永涛 <yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com> <mailto:yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>>>
>     >
>     >         Hi Eric,
>     >         The shutdown issue has not been reproduced yet, but I got the
>     >         following error today during testing.
>     >
>     >         Apr 18 07:42:51 10 kernel: Call Trace:
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
>     >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
>     >         schedule_timeout+0x215/0x2e0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
>     >         kmem_zone_alloc+0x77/0xf0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>]
>     __down+0x72/0xb0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
>     >         _xfs_buf_find+0x102/0x280 [xfs]
>     >         Apr 18 07:42:51 10 kernel: "echo 0 >
>     >         /proc/sys/kernel/hung_task_timeout_secs" disables this
>     message.
>     >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0
>     >         0 14522      1 0x00000083
>     >         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082
>     >         0000000000000000 0000000000000000
>     >         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef
>     >         ffff882015a639d8 0000000000000246
>     >         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8
>     >         000000000000fb88 ffff880e70491af8
>     >         Apr 18 07:42:51 10 kernel: Call Trace:
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
>     >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
>     >         schedule_timeout+0x215/0x2e0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
>     >         kmem_zone_alloc+0x77/0xf0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>]
>     __down+0x72/0xb0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
>     >         _xfs_buf_find+0x102/0x280 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]
>     >         xfs_buf_lock+0x53/0x110 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]
>     >         _xfs_buf_find+0x102/0x280 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]
>     >         xfs_buf_get+0x6b/0x1a0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]
>     >         xfs_buf_read+0x2c/0x100 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]
>     >         xfs_trans_read_buf+0x1f8/0x400 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]
>     >         xfs_read_agi+0x74/0x100 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]
>     >         xfs_iunlink+0x5b/0x180 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?
>     >         current_fs_time+0x27/0x30
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?
>     >         xfs_trans_ichgtime+0x27/0xa0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]
>     >         xfs_droplink+0x5b/0x70 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]
>     >         xfs_remove+0x27e/0x3a0 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?
>     >         generic_permission+0x23/0xb0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]
>     >         xfs_vn_unlink+0x48/0x90 [xfs]
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>]
>     vfs_unlink+0x9f/0xe0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?
>     >         lookup_hash+0x3a/0x50
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]
>     >         do_unlinkat+0x183/0x1c0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?
>     >         syscall_trace_enter+0x1d8/0x1e0
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>]
>     sys_unlink+0x16/0x20
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>]
>     tracesys+0xd9/0xde
>     >
>     >         Thank you.
>     >
>     >
>     >         2013/4/17 Eric Sandeen <sandeen@sandeen.net
>     <mailto:sandeen@sandeen.net>
>     >         <mailto:sandeen@sandeen.net <mailto:sandeen@sandeen.net>>>
>     >
>     >             On Apr 16, 2013, at 8:48 PM, 符永涛
>     <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
>     >             <mailto:yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>>> wrote:
>     >
>     >>             Hi Brian,
>     >>             Can I make the following change?
>     >
>     >             ASSERTS are no-ops in a non-debug kernel, so this won't
>     >             change any behavior.  I hope we'll know more if we get new
>     >             traces from your patched kernel....
>     >
>     >             Eric
>     >
>     >>             --- a/xfs_inode.c
>     >>             +++ b/xfs_inode.c
>     >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
>     >>                                     if (last_ibp != NULL) {
>     >>                                             xfs_trans_brelse(tp, last_ibp);
>     >>                                     }
>     >>             +                        ASSERT(next_agino != NULLAGINO);
>     >>             +                        ASSERT(next_agino != 0);
>     >>                                     next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>     >>                                     error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>     >>                                                         &last_ibp, &last_offset, 0);
>     >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
>     >>                                             return error;
>     >>                                     }
>     >>                                     next_agino = be32_to_cpu(last_dip->di_next_unlinked);
>     >>             -                       ASSERT(next_agino != NULLAGINO);
>     >>             -                       ASSERT(next_agino != 0);
>     >>                             }
>     >>                             /*
>     >>                              * Now last_ibp points to the buffer previous to us on
>     >>
>     >>             Thank you.
>     >>
>     >>
>     >>             2013/4/17 符永涛 <yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>
>     >>             <mailto:yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>>>
>     >>
>     >>                 Hi Brian,
>     >>                 If this is because NULLAGINO is passed in to xfs_inotobp(),
>     >>                 can I move the following two lines before xfs_inotobp()?
>     >>
>     >>                 For example:
>     >>
>     >>                   while (next_agino != agino) {
>     >>                           /*
>     >>                            * If the last inode wasn't the one pointing to
>     >>                            * us, then release its buffer since we're not
>     >>                            * going to do anything with it.
>     >>                            */
>     >>                           if (last_ibp != NULL) {
>     >>                                   xfs_trans_brelse(tp, last_ibp);
>     >>                           }
>     >>                           next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>     >>                 +         ASSERT(next_agino != NULLAGINO);
>     >>                 +         ASSERT(next_agino != 0);
>     >>                           error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>     >>                                               &last_ibp, &last_offset, 0);
>     >>                           if (error) {
>     >>                                   xfs_warn(mp,
>     >>                                           "%s: xfs_inotobp() returned error %d.",
>     >>                                           __func__, error);
>     >>                                   return error;
>     >>                           }
>     >>                           next_agino = be32_to_cpu(last_dip->di_next_unlinked);
>     >>                 -         //ASSERT(next_agino != NULLAGINO);
>     >>                 -         //ASSERT(next_agino != 0);
>     >>                   }
>     >>                 I don't understand XFS well, so please correct me if
>     >>                 I'm totally wrong.
>     >>                 Thank you very much.
>     >>
>     >>
>     >>                 2013/4/17 符永涛 <yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>
>     >>                 <mailto:yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>>>
>     >>
>     >>                     Hi Brian,
>     >>                     I want to ask a question about the shutdown trace.
>     >>                     The ino in xfs_iunlink_remove is 0x113, so why did
>     >>                     xfs_imap get ino=0xffffffff?
>     >>
>     >>                     --- xfs_imap --
>     >>                    
>     module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
>     >>                     -- return=0x16
>     >>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70
>     >>                     ino=0xffffffff
>     >>
>     >>                     --- xfs_iunlink_remove --
>     >>                    
>     module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
>     >>                     -- return=0x16
>     >>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00
>     >>                     next_ino=? mp=? agi=? dip=?
>     >>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?
>     >>                     next_agino=? last_ibp=?
>     >>                     last_dip=0xffff882000000000 bucket_index=?
>     >>                     offset=? last_offset=0xffffffffffff8810 error=?
>     >>                     __func__=[...]
>     >>                     ip: i_ino = 0x113, i_flags = 0x0
>     >>
>     >>                     Thank you.
>     >>
>     >>
>     >>
>     >>                     2013/4/17 符永涛 <yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>
>     >>                     <mailto:yongtaofu@gmail.com
>     <mailto:yongtaofu@gmail.com>>>
>     >>
>     >>                         Hi Brian,
>     >>                         Thank you for your update. I have applied
>     >>                         your last kernel patch. However, it is not
>     >>                         easy to reproduce, especially in our test
>     >>                         environment. So far it has not happened again.
>     >>                         I'll update the kernel patch now. BTW, are
>     >>                         there any findings in the logs of the previous
>     >>                         thread?
>     >>                         http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>     >>                         I guess it tends to happen during glusterfs
>     >>                         rebalance, because glusterfs moves a lot of
>     >>                         files from one server to another and then
>     >>                         unlinks them.
>     >>
>     >>                         Thank you.
>     >>
>     >>
>     >>                         2013/4/17 Brian Foster
>     <bfoster@redhat.com <mailto:bfoster@redhat.com>
>     >>                         <mailto:bfoster@redhat.com
>     <mailto:bfoster@redhat.com>>>
>     >>
>     >>                             On 04/16/2013 12:24 PM, Dave Chinner
>     wrote:
>     >>                             > On Mon, Apr 15, 2013 at 07:14:39PM
>     >>                             -0400, Brian Foster wrote:
>     >>                             >> Hi,
>     >>                             >>
>     >>                             >> Thanks for the data in the
>     previous thread:
>     >>                             >>
>     >>                             >>
>     >>                            
>     http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>     >>                             >>
>     >>                             ...
>     >>                             >>
>     >>                             >>      echo 1 >
>     >>                            
>     /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>     >>                             >>      echo 1 >
>     >>                            
>     /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>     >>                             >>      ... reproduce ...
>     >>                             >>      cat
>     >>                             /sys/kernel/debug/tracing/trace >
>     trace.output
>     >>                             >
>     >>                             > It's better to use trace-cmd for this.
>     >>                             > It will result in fewer dropped events,
>     >>                             > e.g.:
>     >>                             >
>     >>                             >       $ trace-cmd record -e
>     xfs_iunlink\*
>     >>                             >       ... reproduce ...
>     >>                             >       ^C
>     >>                             >       $ trace-cmd report > trace.output
>     >>                             >
>     >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
>     >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>     >>                             >> @@ -581,6 +581,8 @@
>     >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
>     >>                             ...
>     >>                             >
>     >>                             > I would suggest that the tracing should
>     >>                             > be at entry of the function, otherwise we
>     >>                             > won't get a tracepoint for the operation
>     >>                             > that triggers the shutdown. (That's the
>     >>                             > reason most tracepoints in XFS are at
>     >>                             > function entry...)
>     >>                             >
>     >>
>     >>                             Good points, thanks Dave. A v2 that pulls
>     >>                             up the tracepoints towards
>     >>                             function entry is appended.
>     >>
>     >>                             Brian
>     >>
>     >>                             From
>     >>                             280943e78ebe0b97a774cba51e7815c42f044b55
>     >>                             Mon Sep 17 00:00:00 2001
>     >>                             From: Brian Foster
>     <bfoster@redhat.com <mailto:bfoster@redhat.com>
>     >>                             <mailto:bfoster@redhat.com
>     <mailto:bfoster@redhat.com>>>
>     >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
>     >>                             Subject: [PATCH v2] xfs: add tracepoints
>     >>                             for xfs_iunlink and
>     >>                             xfs_iunlink_remove
>     >>
>     >>                             ---
>     >>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++
>     >>                              fs/xfs/xfs_inode.c           |    4 ++++
>     >>                              2 files changed, 6 insertions(+), 0
>     >>                             deletions(-)
>     >>
>     >>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h
>     >>                             b/fs/xfs/linux-2.6/xfs_trace.h
>     >>                             index adc6ec4..338a0f9 100644
>     >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
>     >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
>     >>                             @@ -583,6 +583,8 @@
>     >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
>     >>                              DEFINE_INODE_EVENT(xfs_destroy_inode);
>     >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
>     >>                              DEFINE_INODE_EVENT(xfs_clear_inode);
>     >>                             +DEFINE_INODE_EVENT(xfs_iunlink);
>     >>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>     >>
>     >>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>     >>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
>     >>                             diff --git a/fs/xfs/xfs_inode.c
>     >>                             b/fs/xfs/xfs_inode.c
>     >>                             index 19900f0..d705c77 100644
>     >>                             --- a/fs/xfs/xfs_inode.c
>     >>                             +++ b/fs/xfs/xfs_inode.c
>     >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
>     >>
>     >>                                     mp = tp->t_mountp;
>     >>
>     >>                             +       trace_xfs_iunlink(ip);
>     >>                             +
>     >>                                     /*
>     >>                                      * Get the agi buffer first.  It
>     >>                             ensures lock ordering
>     >>                                      * on the list.
>     >>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
>     >>                                     mp = tp->t_mountp;
>     >>                                     agno = XFS_INO_TO_AGNO(mp,
>     ip->i_ino);
>     >>
>     >>                             +       trace_xfs_iunlink_remove(ip);
>     >>                             +
>     >>                                     /*
>     >>                                      * Get the agi buffer first.  It
>     >>                             ensures lock ordering
>     >>                                      * on the list.
>     >>                             --
>     >>                             1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-18 20:59                         ` Brian Foster
@ 2013-04-19  6:40                           ` 符永涛
  2013-04-19 11:41                             ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19  6:40 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs



Hi Brian and Eric,

I have applied your kernel patch v2 (add unlink trace) to
kernel-2.6.32-279.19.1.el6.x86_64.rpm
<http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm>
in our test cluster and started testing again.
I will let you know of any progress. Thank you.


2013/4/19 Brian Foster <bfoster@redhat.com>

> On 04/18/2013 11:23 AM, 符永涛 wrote:
> > Hi Brian and Eric,
> > The shutdown is not easy to reproduce, but XFS has now shut down on 2
> > of the servers in our test cluster.
> >
>
> Understood. We've been trying very hard to reproduce ourselves to make
> it easier to debug, but haven't been able to reproduce at all so far.
> This process allows us to make _some_ progress on the issue, even if it
> is slower going than we'd like... ;)
>
> > The trace output is as follows:
> >
> > https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
> >
>
> Thanks again for the data. The racing behavior Eric called out (nice
> catch!) in his last mail lit up some light bulbs internally with regard
> to some old locking issues triggered by XFS in the 6.3 kernel. The
> following bug serves as an example:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=852847
>
> ... the fix for which went into the 2.6.32-279.19.1 6.3.z release. Could
> you move some or all of your servers to this kernel[1] and see how it
> goes? The best case is it resolves the problem, worst case we carry on
> debugging from there...
>
> Brian
>
> [1] -
>
> http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm
>
> > Sorry, but systemtap was interrupted and I didn't notice, so I didn't
> > get the systemtap logs.
> >
> > /var/log/messages is the same as before:
> > Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> > returned error 22.
> > Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> > error 22
> > Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> > from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
> > 0xffffffffa02d44aa
> > Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
> > filesystem
> > Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and
> > rectify the problem(s)
> > Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >
> > The metadump file is large; I'll share it with you soon.
> >
> >
> > 2013/4/18 Brian Foster <bfoster@redhat.com <mailto:bfoster@redhat.com>>
> >
> >     On 04/18/2013 04:25 AM, 符永涛 wrote:
> >     > Hi Brian and Eric,
> >     > Can I make the following change to bypass it?
> >
> >     This is probably not a wise thing to do. The problem we're seeing here
> >     is indicative of a potentially larger problem than this particular error
> >     path. An inode is being unlinked and inactivated, but we aren't finding
> >     it on the list where we expect it to be. Killing the error return doesn't
> >     eliminate the larger problem.
> >
> >     So while changes could end up being made in this area as part of a fix,
> >     I would not suggest making any changes beyond those designed to help
> >     debug until we have a better idea of root cause.
> >
> >     Brian
> >
> >     > --- a/xfs_inode.c
> >     > +++ b/xfs_inode.c
> >     > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
> >     >                  */
> >     >                 next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
> >     >                 last_ibp = NULL;
> >     > -               while (next_agino != agino) {
> >     > +               while (next_agino != agino && next_agino != NULLAGINO) {
> >     >                         /*
> >     >                          * If the last inode wasn't the one pointing to
> >     >                          * us, then release its buffer since we're not
> >     > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
> >     >                         ASSERT(next_agino != NULLAGINO);
> >     >                         ASSERT(next_agino != 0);
> >     >                 }
> >     > +               if (next_agino == NULLAGINO) {
> >     > +                       /*
> >     > +                        *After search the list for the inode being free
> >     > +                        *we still can't find it.
> >     > +                        */
> >     > +                       xfs_err(mp, "%s ino %lld not found in unlinked list.",
> >     > +                                    __func__, (unsigned long long)ip->i_ino);
> >     > +               }
> >     >                 /*
> >     >                  * Now last_ibp points to the buffer previous to us on
> >     >                  * the unlinked list.  Pull us from the list.
> >     > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
> >     >                 } else {
> >     >                         xfs_trans_brelse(tp, ibp);
> >     >                 }
> >     > -               /*
> >     > -                * Point the previous inode on the list to the next inode.
> >     > -                */
> >     > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
> >     > -               ASSERT(next_agino != 0);
> >     > -               offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
> >     > -               xfs_trans_inode_buf(tp, last_ibp);
> >     > -               xfs_trans_log_buf(tp, last_ibp, offset,
> >     > -                                 (offset + sizeof(xfs_agino_t) - 1));
> >     > -               xfs_inobp_check(mp, last_ibp);
> >     > +               if (next_agino != NULLAGINO) {
> >     > +                       /*
> >     > +                       * If only find the inode being free then we modify
> >     > +                       * the unlinked list.
> >     > +                       * Point the previous inode on the list to the next inode.
> >     > +                       */
> >     > +                       last_dip->di_next_unlinked = cpu_to_be32(next_agino);
> >     > +                       ASSERT(next_agino != 0);
> >     > +                       offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
> >     > +                       xfs_trans_inode_buf(tp, last_ibp);
> >     > +                       xfs_trans_log_buf(tp, last_ibp, offset,
> >     > +                                         (offset + sizeof(xfs_agino_t) - 1));
> >     > +                       xfs_inobp_check(mp, last_ibp);
> >     > +               }
> >     >         }
> >     >         return 0;
> >     >  }
> >     >
> >     > Thank you.
> >     >
> >     >
> >     > 2013/4/18 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
> >     <mailto:yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>>
> >     >
> >     >     Hi Brian and Eric,
> >     >     If the problem is that the agino can't be found in the unlinked
> >     >     list, can we just bypass it instead of passing ino=0xffffffff to
> >     >     xfs_inotobp?
> >     >     Thank you.
> >     >
> >     >
> >     >     2013/4/18 符永涛 <yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com> <mailto:yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>>>
> >     >
> >     >         Hi Eric,
> >     >         The shutdown issue has not been reproduced yet, but I got the
> >     >         following error today during testing.
> >     >
> >     >         Apr 18 07:42:51 10 kernel: Call Trace:
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
> >     >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
> >     >         schedule_timeout+0x215/0x2e0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
> >     >         kmem_zone_alloc+0x77/0xf0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>]
> >     __down+0x72/0xb0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
> >     >         _xfs_buf_find+0x102/0x280 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: "echo 0 >
> >     >         /proc/sys/kernel/hung_task_timeout_secs" disables this
> >     message.
> >     >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0
> >     >         0 14522      1 0x00000083
> >     >         Apr 18 07:42:51 10 kernel: ffff882015a63a28
> 0000000000000082
> >     >         0000000000000000 0000000000000000
> >     >         Apr 18 07:42:51 10 kernel: ffff882015a639b8
> ffffffffa02d91ef
> >     >         ffff882015a639d8 0000000000000246
> >     >         Apr 18 07:42:51 10 kernel: ffff880e70491af8
> ffff882015a63fd8
> >     >         000000000000fb88 ffff880e70491af8
> >     >         Apr 18 07:42:51 10 kernel: Call Trace:
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
> >     >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
> >     >         schedule_timeout+0x215/0x2e0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
> >     >         kmem_zone_alloc+0x77/0xf0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>]
> >     __down+0x72/0xb0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
> >     >         _xfs_buf_find+0x102/0x280 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>]
> down+0x41/0x50
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]
> >     >         xfs_buf_lock+0x53/0x110 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]
> >     >         _xfs_buf_find+0x102/0x280 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]
> >     >         xfs_buf_get+0x6b/0x1a0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]
> >     >         xfs_buf_read+0x2c/0x100 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]
> >     >         xfs_trans_read_buf+0x1f8/0x400 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]
> >     >         xfs_read_agi+0x74/0x100 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]
> >     >         xfs_iunlink+0x5b/0x180 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?
> >     >         current_fs_time+0x27/0x30
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?
> >     >         xfs_trans_ichgtime+0x27/0xa0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]
> >     >         xfs_droplink+0x5b/0x70 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]
> >     >         xfs_remove+0x27e/0x3a0 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?
> >     >         generic_permission+0x23/0xb0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]
> >     >         xfs_vn_unlink+0x48/0x90 [xfs]
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>]
> >     vfs_unlink+0x9f/0xe0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?
> >     >         lookup_hash+0x3a/0x50
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]
> >     >         do_unlinkat+0x183/0x1c0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?
> >     >         syscall_trace_enter+0x1d8/0x1e0
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>]
> >     sys_unlink+0x16/0x20
> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>]
> >     tracesys+0xd9/0xde
> >     >
> >     >         Thank you.
> >     >
> >     >
> >     >         2013/4/17 Eric Sandeen <sandeen@sandeen.net
> >     <mailto:sandeen@sandeen.net>
> >     >         <mailto:sandeen@sandeen.net <mailto:sandeen@sandeen.net>>>
> >     >
> >     >             On Apr 16, 2013, at 8:48 PM, 符永涛
> >     <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
> >     >             <mailto:yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>>> wrote:
> >     >
> >     >>             Hi Brain,
> >     >>             Can I change as following?
> >     >
> >     >             ASSERTS are no-ops in a non-debug kernel, so this won't
> >     >             change any behavior.  I hope we'll know more if we get
> new
> >     >             traces from your patched kernel....
> >     >
> >     >             Eric
> >     >
> >     >>             --- a/xfs_inode.c
> >     >>             +++ b/xfs_inode.c
> >     >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
> >     >>                                     if (last_ibp != NULL) {
> >     >>                                             xfs_trans_brelse(tp,
> >     >>             last_ibp);
> >     >>                                     }
> >     >>             +                        ASSERT(next_agino !=
> NULLAGINO);
> >     >>             +                        ASSERT(next_agino != 0);
> >     >>                                     next_ino =
> XFS_AGINO_TO_INO(mp,
> >     >>             agno, next_agino);
> >     >>                                     error = xfs_inotobp(mp, tp,
> >     >>             next_ino, &last_dip,
> >     >>                                                         &last_ibp,
> >     >>             &last_offset, 0);
> >     >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
> >     >>                                             return error;
> >     >>                                     }
> >     >>                                     next_agino =
> >     >>             be32_to_cpu(last_dip->di_next_unlinked);
> >     >>             -                       ASSERT(next_agino !=
> NULLAGINO);
> >     >>             -                       ASSERT(next_agino != 0);
> >     >>                             }
> >     >>                             /*
> >     >>                              * Now last_ibp points to the buffer
> >     >>             previous to us on
> >     >>
> >     >>             Thank you.
> >     >>
> >     >>
> >     >>             2013/4/17 符永涛 <yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>
> >     >>             <mailto:yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>>>
> >     >>
> >     >>                 Hi Brain,
> >     >>                 If it is because NULLAGINO is passed in  to
> >     xfs_inotobp().
> >     >>                 Can I move the following two lines before
> >     xfs_inotobp?
> >     >>
> >     >>                 For example:
> >     >>
> >     >>                 1767                 while (next_agino != agino) {
> >     >>                 1768                         /*
> >     >>                 1769                          * If the last inode
> >     >>                 wasn't the one pointing to
> >     >>                 1770                          * us, then release
> its
> >     >>                 buffer since we're not
> >     >>                 1771                          * going to do
> anything
> >     >>                 with it.
> >     >>                 1772                          */
> >     >>                 1773                         if (last_ibp !=
> NULL) {
> >     >>                 1774
> >     >>                 xfs_trans_brelse(tp, last_ibp);
> >     >>                 1775                         }
> >     >>                 1776                         next_ino =
> >     >>                 XFS_AGINO_TO_INO(mp, agno, next_agino);
> >     >>                 +                               ASSERT(next_agino
> !=
> >     >>                 NULLAGINO);
> >     >>                 +                               ASSERT(next_agino
> >     != 0);
> >     >>                 1777                         error =
> xfs_inotobp(mp,
> >     >>                 tp, next_ino, &last_dip,
> >     >>                 1778
> >     >>                 &last_ibp, &last_offset, 0);
> >     >>                 1779                         if (error) {
> >     >>                 1780                                 xfs_warn(mp,
> >     >>                 1781                                         "%s:
> >     >>                 xfs_inotobp() returned error %d.",
> >     >>                 1782
> >     __func__,
> >     >>                 error);
> >     >>                 1783                                 return error;
> >     >>                 1784                         }
> >     >>                 1785                         next_agino =
> >     >>                 be32_to_cpu(last_dip->di_next_unlinked);
> >     >>                 -
> >     //ASSERT(next_agino !=
> >     >>                 NULLAGINO);
> >     >>                 -
> >     //ASSERT(next_agino != 0);
> >     >>                 1788                 }
> >     >>                 I don't understand xfs well and correct me if I'm
> >     >>                 totally wrong.
> >     >>                 Thank you very much.
> >     >>
> >     >>
> >     >>                 2013/4/17 符永涛 <yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>
> >     >>                 <mailto:yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>>>
> >     >>
> >     >>                     Hi Brain,
> >     >>                     I want to ask a question, according to the
> >     >>                     shutdown trace. The ino in  xfs_iunlink_remove
> >     >>                     is 0x113, why xfs_imap got ino=0xffffffff ?
> >     >>
> >     >>                     --- xfs_imap --
> >     >>
> >     module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> >     >>                     -- return=0x16
> >     >>                     vars: mp=0xffff882017a50800
> tp=0xffff881c81797c70
> >     >>                     ino=0xffffffff
> >     >>
> >     >>                     --- xfs_iunlink_remove --
> >     >>
> >     module("xfs").function("xfs_iunlink_remove@fs
> /xfs/xfs_inode.c:1680").return
> >     >>                     -- return=0x16
> >     >>                     vars: tp=0xffff881c81797c70
> ip=0xffff881003c13c00
> >     >>                     next_ino=? mp=? agi=? dip=?
> >     >>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?
> >     >>                     next_agino=? last_ibp=?
> >     >>                     last_dip=0xffff882000000000 bucket_index=?
> >     >>                     offset=? last_offset=0xffffffffffff8810
> error=?
> >     >>                     __func__=[...]
> >     >>                     ip: i_ino = 0x113, i_flags = 0x0
> >     >>
> >     >>                     Thank you.
> >     >>
> >     >>
> >     >>
> >     >>                     2013/4/17 符永涛 <yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>
> >     >>                     <mailto:yongtaofu@gmail.com
> >     <mailto:yongtaofu@gmail.com>>>
> >     >>
> >     >>                         Hi Brain,
> >     >>                         Thank you for your update, and I have
> applied
> >     >>                         your last kernel patch. However it is not
> >     easy
> >     >>                         to reproduce especially in out test
> >     >>                         environment. Till now is not happens
> again.
> >     >>                         I'll update the kernel patch now. BTW is
> >     there
> >     >>                         any findings in the logs of previous
> thread?
> >     >>
> >     http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >     >>                         I guess it tend to happen during glusterfs
> >     >>                         rebalance because glusterfs moves a lot of
> >     >>                         file from one server to another and then
> >     >>                         unlink it.
> >     >>
> >     >>                         Thank you.
> >     >>
> >     >>
> >     >>                         2013/4/17 Brian Foster
> >     <bfoster@redhat.com <mailto:bfoster@redhat.com>
> >     >>                         <mailto:bfoster@redhat.com
> >     <mailto:bfoster@redhat.com>>>
> >     >>
> >     >>                             On 04/16/2013 12:24 PM, Dave Chinner
> >     wrote:
> >     >>                             > On Mon, Apr 15, 2013 at 07:14:39PM
> >     >>                             -0400, Brian Foster wrote:
> >     >>                             >> Hi,
> >     >>                             >>
> >     >>                             >> Thanks for the data in the
> >     previous thread:
> >     >>                             >>
> >     >>                             >>
> >     >>
> >     http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >     >>                             >>
> >     >>                             ...
> >     >>                             >>
> >     >>                             >>      echo 1 >
> >     >>
> >     /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
> >     >>                             >>      echo 1 >
> >     >>
> >     /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
> >     >>                             >>      ... reproduce ...
> >     >>                             >>      cat
> >     >>                             /sys/kernel/debug/tracing/trace >
> >     trace.output
> >     >>                             >
> >     >>                             > It's better to use trace-cmd for
> this.
> >     >>                             it will result in less
> >     >>                             > dropped events. i.e.:
> >     >>                             >
> >     >>                             >       $ trace-cmd record -e
> >     xfs_iunlink\*
> >     >>                             >       ... reproduce ...
> >     >>                             >       ^C
> >     >>                             >       $ trace-cmd report >
> trace.output
> >     >>                             >
> >     >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
> >     >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >     >>                             >> @@ -581,6 +581,8 @@
> >     >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
> >     >>                             ...
> >     >>                             >
> >     >>                             > I would suggest that the the tracing
> >     >>                             shoul dbe at entry of the
> >     >>                             > function, otherwise we won't get a
> >     >>                             tracepoint for the operation that
> >     >>                             > triggers the shutdown. (That's the
> >     >>                             reason most tracepoints in XFS
> >     >>                             > are at function entry...)
> >     >>                             >
> >     >>
> >     >>                             Good points, thanks Dave. A v2 that
> pulls
> >     >>                             up the tracepoints towards
> >     >>                             function entry is appended.
> >     >>
> >     >>                             Brian
> >     >>
> >     >>                             From
> >     >>
> 280943e78ebe0b97a774cba51e7815c42f044b55
> >     >>                             Mon Sep 17 00:00:00 2001
> >     >>                             From: Brian Foster
> >     <bfoster@redhat.com <mailto:bfoster@redhat.com>
> >     >>                             <mailto:bfoster@redhat.com
> >     <mailto:bfoster@redhat.com>>>
> >     >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
> >     >>                             Subject: [PATCH v2] xfs: add
> tracepoints
> >     >>                             for xfs_iunlink and
> >     >>                             xfs_iunlink_remove
> >     >>
> >     >>                             ---
> >     >>                              fs/xfs/linux-2.6/xfs_trace.h |    2
> ++
> >     >>                              fs/xfs/xfs_inode.c           |    4
> ++++
> >     >>                              2 files changed, 6 insertions(+), 0
> >     >>                             deletions(-)
> >     >>
> >     >>                             diff --git
> a/fs/xfs/linux-2.6/xfs_trace.h
> >     >>                             b/fs/xfs/linux-2.6/xfs_trace.h
> >     >>                             index adc6ec4..338a0f9 100644
> >     >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
> >     >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >     >>                             @@ -583,6 +583,8 @@
> >     >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
> >     >>
>  DEFINE_INODE_EVENT(xfs_destroy_inode);
> >     >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
> >     >>                              DEFINE_INODE_EVENT(xfs_clear_inode);
> >     >>                             +DEFINE_INODE_EVENT(xfs_iunlink);
> >     >>
> +DEFINE_INODE_EVENT(xfs_iunlink_remove);
> >     >>
> >     >>
>  DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
> >     >>
>  DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
> >     >>                             diff --git a/fs/xfs/xfs_inode.c
> >     >>                             b/fs/xfs/xfs_inode.c
> >     >>                             index 19900f0..d705c77 100644
> >     >>                             --- a/fs/xfs/xfs_inode.c
> >     >>                             +++ b/fs/xfs/xfs_inode.c
> >     >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
> >     >>
> >     >>                                     mp = tp->t_mountp;
> >     >>
> >     >>                             +       trace_xfs_iunlink(ip);
> >     >>                             +
> >     >>                                     /*
> >     >>                                      * Get the agi buffer first.
>  It
> >     >>                             ensures lock ordering
> >     >>                                      * on the list.
> >     >>                             @@ -1694,6 +1696,8 @@
> xfs_iunlink_remove(
> >     >>                                     mp = tp->t_mountp;
> >     >>                                     agno = XFS_INO_TO_AGNO(mp,
> >     ip->i_ino);
> >     >>
> >     >>                             +       trace_xfs_iunlink_remove(ip);
> >     >>                             +
> >     >>                                     /*
> >     >>                                      * Get the agi buffer first.
>  It
> >     >>                             ensures lock ordering
> >     >>                                      * on the list.
> >     >>                             --
> >     >>                             1.7.7.6
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>                         --
> >     >>                         符永涛
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>                     --
> >     >>                     符永涛
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>                 --
> >     >>                 符永涛
> >     >>
> >     >>
> >     >>
> >     >>
> >     >>             --
> >     >>             符永涛
> >     >>             _______________________________________________
> >     >>             xfs mailing list
> >     >>             xfs@oss.sgi.com <mailto:xfs@oss.sgi.com>
> >     <mailto:xfs@oss.sgi.com <mailto:xfs@oss.sgi.com>>
> >     >>             http://oss.sgi.com/mailman/listinfo/xfs
> >     >
> >     >
> >     >
> >     >
> >     >         --
> >     >         符永涛
> >     >
> >     >
> >     >
> >     >
> >     >     --
> >     >     符永涛
> >     >
> >     >
> >     >
> >     >
> >     > --
> >     > 符永涛
> >
> >
> >
> >
> > --
> > 符永涛
>
>


-- 
符永涛

[-- Attachment #1.2: Type: text/html, Size: 60965 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19  6:40                           ` 符永涛
@ 2013-04-19 11:41                             ` 符永涛
  2013-04-19 14:59                               ` Eric Sandeen
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 11:41 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs


[-- Attachment #1.1: Type: text/plain, Size: 34548 bytes --]

Dear Brian and Eric,

The kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm
<http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm>
still has this problem.
I built the kernel from this SRPM:
https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm

Today the shutdown happened again during testing.
See the logs below:

/var/log/messages:
Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d4bda
Apr 19 16:40:05 10 kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
Apr 19 16:40:05 10 kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
Apr 19 16:40:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 19 16:40:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
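
For readers following the thread, the first kernel message above matches the theory that the walk over the AGI unlinked bucket runs off the end of the list. The following is only a userspace sketch of that walk, not the kernel code: struct fake_inode, lookup(), find_prev() and bucket_of() are hypothetical names made up for illustration, and the lookup table stands in for reading inode buffers with xfs_inotobp(). The only facts assumed from XFS are that NULLAGINO (0xffffffff) terminates a list and that the bucket is chosen as agino modulo 64.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NULLAGINO 0xffffffffU	/* end-of-list marker */
#define NBUCKETS 64		/* unlinked buckets per AGI */

/* Hypothetical stand-in for one on-disk inode and its list pointer. */
struct fake_inode {
	uint32_t agino;
	uint32_t next_unlinked;
};

/* Bucket that an AG-relative inode number hashes to. */
static unsigned bucket_of(uint32_t agino)
{
	return agino % NBUCKETS;
}

/* Stand-in for xfs_inotobp(): NULL means the inode cannot be mapped. */
static const struct fake_inode *lookup(const struct fake_inode *tbl,
				       size_t n, uint32_t agino)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].agino == agino)
			return &tbl[i];
	return NULL;
}

/*
 * Walk one bucket starting at 'head' until the entry pointing at 'agino'
 * is found. On success, *prev holds the predecessor (NULLAGINO if 'agino'
 * is the bucket head). If 'agino' is not on the list, the walk reaches
 * NULLAGINO, the lookup fails, and EINVAL (22) is returned.
 */
static int find_prev(const struct fake_inode *tbl, size_t n,
		     uint32_t head, uint32_t agino, uint32_t *prev)
{
	uint32_t next = head;

	*prev = NULLAGINO;
	while (next != agino) {
		const struct fake_inode *ip = lookup(tbl, n, next);

		if (!ip)
			return EINVAL;
		*prev = next;
		next = ip->next_unlinked;
	}
	return 0;
}
```

With a two-entry list 5 -> 69 -> NULLAGINO, asking for inode 133 (same bucket, but never added to the list) walks past 69, tries to look up NULLAGINO and fails with EINVAL, which is the shape of the failure reported here.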

systemtap script output:
--- xfs_imap --
module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
-- return=0x16
vars: mp=0xffff88101801e800 tp=0xffff880ff143ac70 ino=0xffffffff
imap=0xffff88100e93bc08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
mp: m_agno_log = 0x5, m_agino_log = 0x20
mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
imap: im_blkno = 0x0, im_len = 0xe778, im_boffset = 0xd997
kernel backtrace:
Returning from:  0xffffffffa02b4260 : xfs_imap+0x0/0x280 [xfs]
Returning to  :  0xffffffffa02b9d59 : xfs_inotobp+0x49/0xc0 [xfs]
 0xffffffffa02b9ec1 : xfs_iunlink_remove+0xf1/0x360 [xfs]
 0xffffffff814ede89
 0x0 (inexact)
user backtrace:
 0x3ec260e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]

--- xfs_iunlink_remove --
module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1681").return
-- return=0x16
vars: tp=0xffff880ff143ac70 ip=0xffff8811ed111000 next_ino=? mp=? agi=?
dip=? agibp=? ibp=? agno=? agino=? next_agino=? last_ibp=?
last_dip=0xffff881000000001 bucket_index=? offset=?
last_offset=0xffffffffffff8811 error=? __func__=[...]
ip: i_ino = 0x1bd33, i_flags = 0x0
ip->i_d: di_nlink = 0x0, di_gen = 0x53068791
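
The numbers in the systemtap output above are consistent with NULLAGINO being turned into a global inode number and then rejected by the range check in xfs_imap(). The following is only a userspace sketch of that arithmetic, not the kernel code: struct geom and the helper names are hypothetical, the shift and mask logic follows the usual description of the XFS_INO_TO_AGNO, XFS_INO_TO_AGINO, XFS_AGINO_TO_INO and XFS_AGINO_TO_AGBNO macros, and the geometry values are copied from the mp and mp->m_sb lines of the output.

```c
#include <errno.h>
#include <stdint.h>

#define NULLAGINO 0xffffffffU	/* end-of-list marker */

/* Hypothetical container for the geometry printed by the stap script. */
struct geom {
	unsigned agino_log;	/* mp->m_agino_log */
	unsigned inopblog;	/* sb_inopblog */
	uint32_t agcount;	/* sb_agcount */
	uint32_t agblocks;	/* sb_agblocks */
};

/* Values taken from the "mp:" and "mp->m_sb:" lines of the stap output. */
static const struct geom stap_geom = { 0x20, 0x4, 0x1c, 0xffffff0 };

/* Combine an AG number and an AG-relative inode number. */
static uint64_t agino_to_ino(const struct geom *g, uint32_t agno,
			     uint32_t agino)
{
	return ((uint64_t)agno << g->agino_log) | agino;
}

static uint32_t ino_to_agno(const struct geom *g, uint64_t ino)
{
	return (uint32_t)(ino >> g->agino_log);
}

static uint32_t ino_to_agino(const struct geom *g, uint64_t ino)
{
	return (uint32_t)(ino & ((1ULL << g->agino_log) - 1));
}

/*
 * Model of the sanity check at the top of xfs_imap(): returns 0 if the
 * inode number maps inside the filesystem, EINVAL (22, shown as 0x16 in
 * the stap output) otherwise.
 */
static int check_ino(const struct geom *g, uint64_t ino)
{
	uint32_t agno = ino_to_agno(g, ino);
	uint32_t agino = ino_to_agino(g, ino);
	uint32_t agbno = agino >> g->inopblog;	/* AG block holding the inode */

	if (agno >= g->agcount || agbno >= g->agblocks ||
	    ino != agino_to_ino(g, agno, agino))
		return EINVAL;
	return 0;
}
```

Inode 0x1bd33 sits in AG 0 with this geometry, so agino_to_ino(&stap_geom, 0, NULLAGINO) yields 0xffffffff, the ino value seen in xfs_imap. check_ino() on that value returns EINVAL because AG block 0xfffffff lies beyond sb_agblocks = 0xffffff0, while check_ino() on 0x1bd33 itself returns 0.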

debugfs events trace:
https://docs.google.com/file/d/0B7n2C4T5tfNCREZtdC1yamc0RnM/edit?usp=sharing

xfs_metadump:
https://docs.google.com/file/d/0B7n2C4T5tfNCc2tiMjdhMTVfOWM/edit?usp=sharing

Again, this happened exactly when the glusterfs rebalance finished on one of
the bricks (this time it is the current host).
glusterfs log:
[2013-04-19 16:40:03.835675] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/kbd/consoletrans
[2013-04-19 16:40:03.842024] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/kbd/consoletrans
[2013-04-19 16:40:03.844120] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/kbd/consoletrans
[2013-04-19 16:40:03.852926] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/kbd/consoletrans
[2013-04-19 16:40:03.856602] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/kbd/consoletrans
[2013-04-19 16:40:03.860231] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/kbd/consoletrans
[2013-04-19 16:40:03.892069] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/alsa
[2013-04-19 16:40:03.897155] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/alsa
[2013-04-19 16:40:03.897582] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/alsa
[2013-04-19 16:40:03.901076] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/alsa
[2013-04-19 16:40:03.903689] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/alsa
[2013-04-19 16:40:03.906643] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/alsa
[2013-04-19 16:40:03.910744] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/alsa/init
[2013-04-19 16:40:03.913475] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/alsa/init
[2013-04-19 16:40:03.915424] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/alsa/init
[2013-04-19 16:40:03.918699] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/alsa/init
[2013-04-19 16:40:03.920459] I [dht-common.c:2337:dht_setxattr] 0-testbug-dht: fixing the layout of /lib/alsa/init
[2013-04-19 16:40:03.922857] I [dht-rebalance.c:1055:gf_defrag_migrate_data] 0-testbug-dht: migrate data called on /lib/alsa/init
[2013-04-19 16:40:05.107663] I [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
[2013-04-19 16:40:05.107770] I [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated: 993, size: 16161958687, lookups: 190891, failures: 8957
[2013-04-19 16:40:05.108628] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3ec22e767d] (-->/lib64/libpthread.so.0() [0x3ec2607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-: received signum (15), shutting down
[2013-04-19 16:40:05.109007] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap



2013/4/19 符永涛 <yongtaofu@gmail.com>

> Hi Brian and Eric,
>
> I have applied your kernel patch v2 (add unlink trace) to
> kernel-2.6.32-279.19.1.el6.x86_64.rpm
> <http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm>
> in our test cluster and started testing again.
> I will let you know of any progress. Thank you.
>
>
> 2013/4/19 Brian Foster <bfoster@redhat.com>
>
>> On 04/18/2013 11:23 AM, 符永涛 wrote:
>> > Hi Brian and Eric,
>> > The shutdown is not easy to reproduce, but just now XFS was shut down
>> > on 2 of the servers in our test cluster.
>> >
>>
>> Understood. We've been trying very hard to reproduce ourselves to make
>> it easier to debug, but haven't been able to reproduce at all so far.
>> This process allows us to make _some_ progress on the issue, even if it
>> is slower going than we'd like... ;)
>>
>> > The trace output is as follows:
>> >
>> https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing
>> >
>>
>> Thanks again for the data. The racing behavior Eric called out (nice
>> catch!) in his last mail lit up some light bulbs internally with regard
>> to some old locking issues triggered by XFS in the 6.3 kernel. The
>> following bug serves as an example:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=852847
>>
>> ... the fix for which went into the 2.6.32-279.19.1 6.3.z release. Could
>> you move some or all of your servers to this kernel[1] and see how it
>> goes? The best case is it resolves the problem, worst case we carry on
>> debugging from there...
>>
>> Brian
>>
>> [1] -
>>
>> http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm
>>
>> > Sorry, but systemtap was interrupted and I didn't notice, so I
>> > didn't get the systemtap logs.
>> >
>> > /var/log/messages is the same as before:
>> > Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
>> > returned error 22.
>> > Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
>> > error 22
>> > Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
>> > from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
>> > 0xffffffffa02d44aa
>> > Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
>> > filesystem
>> > Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and
>> > rectify the problem(s)
>> > Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >
>> > The metadump file is large; I'll share it with you soon.
>> >
>> >
>> > 2013/4/18 Brian Foster <bfoster@redhat.com <mailto:bfoster@redhat.com>>
>> >
>> >     On 04/18/2013 04:25 AM, 符永涛 wrote:
>> >     > Hi Brian and Eric,
>> >     > Can I change as following to bypass it?
>> >
>> >     This is probably not a wise thing to do. The problem we're seeing
>> >     here is indicative of a potentially larger problem than this
>> >     particular error path. An inode is being unlinked and inactivated,
>> >     but we aren't finding it on the list where we expect it to be.
>> >     Killing the error return doesn't eliminate the larger problem.
>> >
>> >     So while changes could end up being made in this area as part of a
>> >     fix, I would not suggest making any changes beyond those designed to
>> >     help debug until we have a better idea of root cause.
>> >
>> >     Brian
>> >
>> >     > --- a/xfs_inode.c
>> >     > +++ b/xfs_inode.c
>> >     > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
>> >     >                  */
>> >     >                 next_agino =
>> >     be32_to_cpu(agi->agi_unlinked[bucket_index]);
>> >     >                 last_ibp = NULL;
>> >     > -               while (next_agino != agino) {
>> >     > +               while (next_agino != agino && next_agino !=
>> >     NULLAGINO) {
>> >     >                         /*
>> >     >                          * If the last inode wasn't the one pointing to
>> >     >                          * us, then release its buffer since we're not
>> >     > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
>> >     >                         ASSERT(next_agino != NULLAGINO);
>> >     >                         ASSERT(next_agino != 0);
>> >     >                 }
>> >     > +               if (next_agino == NULLAGINO) {
>> >     > +                       /*
>> >     > +                        *After search the list for the inode being free
>> >     > +                        *we still can't find it.
>> >     > +                        */
>> >     > +                       xfs_err(mp, "%s ino %lld not found in unlinked list.",
>> >     > +                                    __func__, (unsigned long long)ip->i_ino);
>> >     > +               }
>> >     >                 /*
>> >     >                  * Now last_ibp points to the buffer previous to us on
>> >     >                  * the unlinked list.  Pull us from the list.
>> >     > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
>> >     >                 } else {
>> >     >                         xfs_trans_brelse(tp, ibp);
>> >     >                 }
>> >     > -               /*
>> >     > -                * Point the previous inode on the list to the next inode.
>> >     > -                */
>> >     > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
>> >     > -               ASSERT(next_agino != 0);
>> >     > -               offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
>> >     > -               xfs_trans_inode_buf(tp, last_ibp);
>> >     > -               xfs_trans_log_buf(tp, last_ibp, offset,
>> >     > -                                 (offset + sizeof(xfs_agino_t) - 1));
>> >     > -               xfs_inobp_check(mp, last_ibp);
>> >     > +               if (next_agino != NULLAGINO) {
>> >     > +                       /*
>> >     > +                       * If only find the inode being free then we modify
>> >     > +                       * the unlinked list.
>> >     > +                       * Point the previous inode on the list to the next inode.
>> >     > +                       */
>> >     > +                       last_dip->di_next_unlinked = cpu_to_be32(next_agino);
>> >     > +                       ASSERT(next_agino != 0);
>> >     > +                       offset = last_offset + offsetof(xfs_dinode_t, di_next_unlinked);
>> >     > +                       xfs_trans_inode_buf(tp, last_ibp);
>> >     > +                       xfs_trans_log_buf(tp, last_ibp, offset,
>> >     > +                                         (offset + sizeof(xfs_agino_t) - 1));
>> >     > +                       xfs_inobp_check(mp, last_ibp);
>> >     > +               }
>> >     >         }
>> >     >         return 0;
>> >     >  }
>> >     >
>> >     > Thank you.
>> >     >
>> >     >
>> >     > 2013/4/18 符永涛 <yongtaofu@gmail.com>
>> >     >
>> >     >     Hi Brian and Eric,
>> >     >     If the problem is that the inode can't be found on the unlinked
>> >     >     list, can we just bypass it instead of passing ino=0xffffffff to
>> >     >     xfs_inotobp()?
>> >     >     Thank you.
>> >     >
>> >     >
>> >     >     2013/4/18 符永涛 <yongtaofu@gmail.com>
>> >     >
>> >     >         Hi Eric,
>> >     >         The shutdown issue has not been reproduced yet, but I got the
>> >     >         following error today during testing.
>> >     >
>> >     >         Apr 18 07:42:51 10 kernel: Call Trace:
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ? _xfs_buf_find+0x102/0x280 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> >     >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0     0 14522      1 0x00000083
>> >     >         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082 0000000000000000 0000000000000000
>> >     >         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef ffff882015a639d8 0000000000000246
>> >     >         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8 000000000000fb88 ffff880e70491af8
>> >     >         Apr 18 07:42:51 10 kernel: Call Trace:
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_buf_cond_lock+0x2f/0xc0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ? kmem_zone_alloc+0x77/0xf0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ? _xfs_buf_find+0x102/0x280 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>] xfs_buf_lock+0x53/0x110 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] _xfs_buf_find+0x102/0x280 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>] xfs_buf_get+0x6b/0x1a0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>] xfs_buf_read+0x2c/0x100 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>] xfs_trans_read_buf+0x1f8/0x400 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>] xfs_read_agi+0x74/0x100 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>] xfs_iunlink+0x5b/0x180 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ? current_fs_time+0x27/0x30
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ? xfs_trans_ichgtime+0x27/0xa0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>] xfs_droplink+0x5b/0x70 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>] xfs_remove+0x27e/0x3a0 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ? generic_permission+0x23/0xb0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>] xfs_vn_unlink+0x48/0x90 [xfs]
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>] vfs_unlink+0x9f/0xe0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ? lookup_hash+0x3a/0x50
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>] do_unlinkat+0x183/0x1c0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ? syscall_trace_enter+0x1d8/0x1e0
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>] sys_unlink+0x16/0x20
>> >     >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>] tracesys+0xd9/0xde
>> >     >
>> >     >         Thank you.
>> >     >
>> >     >
>> >     >         2013/4/17 Eric Sandeen <sandeen@sandeen.net>
>> >     >
>> >     >             On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu@gmail.com> wrote:
>> >     >
>> >     >>             Hi Brian,
>> >     >>             Can I change it as follows?
>> >     >
>> >     >             ASSERTs are no-ops in a non-debug kernel, so this won't
>> >     >             change any behavior.  I hope we'll know more if we get new
>> >     >             traces from your patched kernel....
>> >     >
>> >     >             Eric
>> >     >
>> >     >>             --- a/xfs_inode.c
>> >     >>             +++ b/xfs_inode.c
>> >     >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
>> >     >>                                     if (last_ibp != NULL) {
>> >     >>                                             xfs_trans_brelse(tp, last_ibp);
>> >     >>                                     }
>> >     >>             +                        ASSERT(next_agino != NULLAGINO);
>> >     >>             +                        ASSERT(next_agino != 0);
>> >     >>                                     next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>> >     >>                                     error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>> >     >>                                                         &last_ibp, &last_offset, 0);
>> >     >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
>> >     >>                                             return error;
>> >     >>                                     }
>> >     >>                                     next_agino = be32_to_cpu(last_dip->di_next_unlinked);
>> >     >>             -                       ASSERT(next_agino != NULLAGINO);
>> >     >>             -                       ASSERT(next_agino != 0);
>> >     >>                             }
>> >     >>                             /*
>> >     >>                              * Now last_ibp points to the buffer previous to us on
>> >     >>
>> >     >>             Thank you.
>> >     >>
>> >     >>
>> >     >>             2013/4/17 符永涛 <yongtaofu@gmail.com>
>> >     >>
>> >     >>                 Hi Brian,
>> >     >>                 If the error is caused by NULLAGINO being passed to
>> >     >>                 xfs_inotobp(), can I move the following two ASSERT lines
>> >     >>                 in front of the xfs_inotobp() call?
>> >     >>
>> >     >>                 For example:
>> >     >>
>> >     >>                         while (next_agino != agino) {
>> >     >>                                 /*
>> >     >>                                  * If the last inode wasn't the one pointing to
>> >     >>                                  * us, then release its buffer since we're not
>> >     >>                                  * going to do anything with it.
>> >     >>                                  */
>> >     >>                                 if (last_ibp != NULL) {
>> >     >>                                         xfs_trans_brelse(tp, last_ibp);
>> >     >>                                 }
>> >     >>                                 next_ino = XFS_AGINO_TO_INO(mp, agno, next_agino);
>> >     >>                 +               ASSERT(next_agino != NULLAGINO);
>> >     >>                 +               ASSERT(next_agino != 0);
>> >     >>                                 error = xfs_inotobp(mp, tp, next_ino, &last_dip,
>> >     >>                                                     &last_ibp, &last_offset, 0);
>> >     >>                                 if (error) {
>> >     >>                                         xfs_warn(mp,
>> >     >>                                                 "%s: xfs_inotobp() returned error %d.",
>> >     >>                                                 __func__, error);
>> >     >>                                         return error;
>> >     >>                                 }
>> >     >>                                 next_agino = be32_to_cpu(last_dip->di_next_unlinked);
>> >     >>                 -               //ASSERT(next_agino != NULLAGINO);
>> >     >>                 -               //ASSERT(next_agino != 0);
>> >     >>                         }
>> >     >>                 I don't understand XFS well, so please correct me if I'm
>> >     >>                 totally wrong.
>> >     >>                 Thank you very much.
>> >     >>
>> >     >>
>> >     >>                 2013/4/17 符永涛 <yongtaofu@gmail.com>
>> >     >>
>> >     >>                     Hi Brian,
>> >     >>                     I have a question about the shutdown trace: the
>> >     >>                     inode in xfs_iunlink_remove is 0x113, so why did
>> >     >>                     xfs_imap get ino=0xffffffff?
>> >     >>
>> >     >>                     --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return -- return=0x16
>> >     >>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
>> >     >>
>> >     >>                     --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return -- return=0x16
>> >     >>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=? dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=? last_dip=0xffff882000000000 bucket_index=? offset=? last_offset=0xffffffffffff8810 error=? __func__=[...]
>> >     >>                     ip: i_ino = 0x113, i_flags = 0x0
>> >     >>
>> >     >>                     Thank you.
>> >     >>
>> >     >>
>> >     >>
>> >     >>                     2013/4/17 符永涛 <yongtaofu@gmail.com>
>> >     >>
>> >     >>                         Hi Brian,
>> >     >>                         Thank you for your update. I have applied your
>> >     >>                         last kernel patch. However, the problem is not
>> >     >>                         easy to reproduce, especially in our test
>> >     >>                         environment, and it has not happened again so far.
>> >     >>                         I'll update to the new kernel patch now. BTW, are
>> >     >>                         there any findings in the logs from the previous
>> >     >>                         thread?
>> >     >>                         http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>> >     >>                         I guess it tends to happen during a glusterfs
>> >     >>                         rebalance, because glusterfs moves a lot of files
>> >     >>                         from one server to another and then unlinks them.
>> >     >>
>> >     >>                         Thank you.
>> >     >>
>> >     >>
>> >     >>                         2013/4/17 Brian Foster <bfoster@redhat.com>
>> >     >>
>> >     >>                             On 04/16/2013 12:24 PM, Dave Chinner wrote:
>> >     >>                             > On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
>> >     >>                             >> Hi,
>> >     >>                             >>
>> >     >>                             >> Thanks for the data in the previous thread:
>> >     >>                             >>
>> >     >>                             >> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>> >     >>                             >>
>> >     >>                             ...
>> >     >>                             >>
>> >     >>                             >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>> >     >>                             >>      echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>> >     >>                             >>      ... reproduce ...
>> >     >>                             >>      cat /sys/kernel/debug/tracing/trace > trace.output
>> >     >>                             >
>> >     >>                             > It's better to use trace-cmd for this; it will result in fewer
>> >     >>                             > dropped events, i.e.:
>> >     >>                             >
>> >     >>                             >       $ trace-cmd record -e xfs_iunlink\*
>> >     >>                             >       ... reproduce ...
>> >     >>                             >       ^C
>> >     >>                             >       $ trace-cmd report > trace.output
>> >     >>                             >
>> >     >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
>> >     >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>> >     >>                             >> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>> >     >>                             ...
>> >     >>                             >
>> >     >>                             > I would suggest that the tracing should be at entry of the
>> >     >>                             > function, otherwise we won't get a tracepoint for the operation that
>> >     >>                             > triggers the shutdown. (That's the reason most tracepoints in XFS
>> >     >>                             > are at function entry...)
>> >     >>                             >
>> >     >>
>> >     >>                             Good points, thanks Dave. A v2 that pulls up the tracepoints towards
>> >     >>                             function entry is appended.
>> >     >>
>> >     >>                             Brian
>> >     >>
>> >     >>                             From 280943e78ebe0b97a774cba51e7815c42f044b55 Mon Sep 17 00:00:00 2001
>> >     >>                             From: Brian Foster <bfoster@redhat.com>
>> >     >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
>> >     >>                             Subject: [PATCH v2] xfs: add tracepoints for xfs_iunlink and xfs_iunlink_remove
>> >     >>
>> >     >>                             ---
>> >     >>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++
>> >     >>                              fs/xfs/xfs_inode.c           |    4 ++++
>> >     >>                              2 files changed, 6 insertions(+), 0 deletions(-)
>> >     >>
>> >     >>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
>> >     >>                             index adc6ec4..338a0f9 100644
>> >     >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
>> >     >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
>> >     >>                             @@ -583,6 +583,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>> >     >>                              DEFINE_INODE_EVENT(xfs_destroy_inode);
>> >     >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
>> >     >>                              DEFINE_INODE_EVENT(xfs_clear_inode);
>> >     >>                             +DEFINE_INODE_EVENT(xfs_iunlink);
>> >     >>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);
>> >     >>
>> >     >>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>> >     >>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
>> >     >>                             diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> >     >>                             index 19900f0..d705c77 100644
>> >     >>                             --- a/fs/xfs/xfs_inode.c
>> >     >>                             +++ b/fs/xfs/xfs_inode.c
>> >     >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
>> >     >>
>> >     >>                                     mp = tp->t_mountp;
>> >     >>
>> >     >>                             +       trace_xfs_iunlink(ip);
>> >     >>                             +
>> >     >>                                     /*
>> >     >>                                      * Get the agi buffer first.  It ensures lock ordering
>> >     >>                                      * on the list.
>> >     >>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
>> >     >>                                     mp = tp->t_mountp;
>> >     >>                                     agno = XFS_INO_TO_AGNO(mp, ip->i_ino);
>> >     >>
>> >     >>                             +       trace_xfs_iunlink_remove(ip);
>> >     >>                             +
>> >     >>                                     /*
>> >     >>                                      * Get the agi buffer first.  It ensures lock ordering
>> >     >>                                      * on the list.
>> >     >>                             --
>> >     >>                             1.7.7.6
>> >     >>



-- 
符永涛

[-- Attachment #1.2: Type: text/html, Size: 68310 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 11:41                             ` 符永涛
@ 2013-04-19 14:59                               ` Eric Sandeen
  2013-04-19 15:13                                 ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-19 14:59 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/19/13 4:41 AM, 符永涛 wrote:
> Dear Brian and Eric,
> 
> The kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm <http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm> still has this problem.
> I built the kernel from this SRPM:
> https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm
> 
> Today the shutdown happened again during testing.
> See the logs below:
> 
> /var/log/messages
> Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
> Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
> Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d4bda
> Apr 19 16:40:05 10 kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
> Apr 19 16:40:05 10 kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
> Apr 19 16:40:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 19 16:40:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 
> systemtap script output:
> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return -- return=0x16
> vars: mp=0xffff88101801e800 tp=0xffff880ff143ac70 ino=0xffffffff imap=0xffff88100e93bc08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=? chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
> mp: m_agno_log = 0x5, m_agino_log = 0x20
> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4, sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
> imap: im_blkno = 0x0, im_len = 0xe778, im_boffset = 0xd997
> kernel backtrace:
> Returning from:  0xffffffffa02b4260 : xfs_imap+0x0/0x280 [xfs]
> Returning to  :  0xffffffffa02b9d59 : xfs_inotobp+0x49/0xc0 [xfs]
>  0xffffffffa02b9ec1 : xfs_iunlink_remove+0xf1/0x360 [xfs]
>  0xffffffff814ede89
>  0x0 (inexact)
> user backtrace:
>  0x3ec260e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
> 
> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1681").return -- return=0x16
> vars: tp=0xffff880ff143ac70 ip=0xffff8811ed111000 next_ino=? mp=? agi=? dip=? agibp=? ibp=? agno=? agino=? next_agino=? last_ibp=? last_dip=0xffff881000000001 bucket_index=? offset=? last_offset=0xffffffffffff8811 error=? __func__=[...]
> ip: i_ino = 0x1bd33, i_flags = 0x0
> ip->i_d: di_nlink = 0x0, di_gen = 0x53068791
> 
> debugfs events trace:
> https://docs.google.com/file/d/0B7n2C4T5tfNCREZtdC1yamc0RnM/edit?usp=sharing

Same issue, one file was unlinked twice in a race:

=== ino 0x6b133 ===
           <...>-4477  [003]  2721.176790: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4477  [003]  2721.176839: xfs_iunlink_remove: dev 8:16 ino 0x6b133
           <...>-4477  [009]  3320.127227: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4477  [001]  3320.141126: xfs_iunlink_remove: dev 8:16 ino 0x6b133
           <...>-4477  [003]  7973.136368: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4479  [018]  7973.158457: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4479  [018]  7973.158497: xfs_iunlink_remove: dev 8:16 ino 0x6b133

-Eric
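
The pattern above can also be checked mechanically. What follows is a minimal Python sketch, not part of the original mail, that scans trace text for an inode which gets a second xfs_iunlink event before any xfs_iunlink_remove. The regular expression, the helper name find_double_unlinks and the SAMPLE_TRACE constant are illustrative assumptions based only on the line format shown above; other trace formats may need a different pattern.

```python
import re

# Matches lines such as:
#   <...>-4477  [003]  2721.176790: xfs_iunlink: dev 8:16 ino 0x6b133
# (assumed format, taken from the excerpt above)
EVENT_RE = re.compile(
    r"-(?P<pid>\d+)\s+\[\d+\]\s+(?P<ts>\d+\.\d+):\s+"
    r"(?P<event>xfs_iunlink(?:_remove)?):\s+dev\s+\S+\s+ino\s+(?P<ino>0x[0-9a-f]+)"
)

SAMPLE_TRACE = """\
           <...>-4477  [003]  2721.176790: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4477  [003]  2721.176839: xfs_iunlink_remove: dev 8:16 ino 0x6b133
           <...>-4477  [009]  3320.127227: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4477  [001]  3320.141126: xfs_iunlink_remove: dev 8:16 ino 0x6b133
           <...>-4477  [003]  7973.136368: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4479  [018]  7973.158457: xfs_iunlink: dev 8:16 ino 0x6b133
           <...>-4479  [018]  7973.158497: xfs_iunlink_remove: dev 8:16 ino 0x6b133
"""


def find_double_unlinks(lines):
    """Return (ino, first, second) tuples for inodes that see two
    xfs_iunlink events with no xfs_iunlink_remove in between.
    first and second are (timestamp, pid) pairs."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            events.append((float(m["ts"]), int(m["pid"]),
                           m["event"], int(m["ino"], 16)))
    # Per-CPU buffers may be merged out of order, so sort by timestamp.
    events.sort(key=lambda e: e[0])

    on_list = {}    # ino -> (ts, pid) of the xfs_iunlink still outstanding
    suspects = []
    for ts, pid, event, ino in events:
        if event == "xfs_iunlink":
            if ino in on_list:
                suspects.append((ino, on_list[ino], (ts, pid)))
            on_list[ino] = (ts, pid)
        else:
            on_list.pop(ino, None)
    return suspects


if __name__ == "__main__":
    for ino, first, second in find_double_unlinks(SAMPLE_TRACE.splitlines()):
        print(hex(ino), first, second)
```

Run against a full trace file, this would list every inode that shows the same back-to-back xfs_iunlink pattern, together with the two PIDs involved.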


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 14:59                               ` Eric Sandeen
@ 2013-04-19 15:13                                 ` 符永涛
  2013-04-19 15:18                                   ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 15:13 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 3664 bytes --]

Sure. The serious thing here is that it corrupts the unlinked list: the inode
that triggered the XFS shutdown is 0x1bd33, not 0x6b133.
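
To illustrate how a double insert of one inode can make the removal of a different inode fail, here is a toy Python model of a single AG's unlinked list. It is a sketch under stated assumptions, not the kernel code: the class UnlinkedList and the function replay are hypothetical names, the 64-bucket hash (agino modulo 64) is assumed, locking and logging are ignored, and the ordering in replay (0x1bd33 already on the list when 0x6b133 is inserted twice) is a guess about what happened. With m_agino_log = 0x20 from the probe output, both inode numbers fall in AG 0, and both hash to the same bucket in this model.

```python
import errno

NULLAGINO = 0xFFFFFFFF   # list terminator ("no inode")
BUCKETS = 64             # assumed number of unlinked buckets per AGI


class UnlinkedList:
    """Toy model of the per-AG unlinked lists (no locking, no logging)."""

    def __init__(self):
        self.bucket = [NULLAGINO] * BUCKETS   # models agi_unlinked[]
        self.next = {}                        # agino -> di_next_unlinked

    def iunlink(self, agino):
        """Insert at the head of the bucket without checking for duplicates."""
        b = agino % BUCKETS
        head = self.bucket[b]
        self.next.setdefault(agino, NULLAGINO)
        if head != NULLAGINO:
            self.next[agino] = head
        self.bucket[b] = agino

    def iunlink_remove(self, agino):
        """Remove agino; return 0, or EINVAL if the walk falls off the end."""
        b = agino % BUCKETS
        nxt = self.bucket[b]
        if nxt == agino:
            # Inode is at the head of the bucket.
            self.bucket[b] = self.next[agino]
            self.next[agino] = NULLAGINO
            return 0
        last, seen = None, set()
        while nxt != agino:
            if nxt == NULLAGINO:
                # In the kernel this is where NULLAGINO reaches
                # xfs_inotobp() and error 22 comes back.
                return errno.EINVAL
            if nxt in seen:
                raise RuntimeError("cycle in unlinked list")
            seen.add(nxt)
            last = nxt
            nxt = self.next[last]
        self.next[last] = self.next[agino]
        self.next[agino] = NULLAGINO
        return 0


def replay():
    """Replay the assumed sequence; return (rc for 0x6b133, rc for 0x1bd33)."""
    ul = UnlinkedList()
    victim, racer = 0x1bd33, 0x6b133
    ul.iunlink(victim)      # victim is on the list first (assumption)
    ul.iunlink(racer)       # first unlink of the raced inode
    ul.iunlink(racer)       # second unlink: racer now points at itself
    rc_racer = ul.iunlink_remove(racer)
    rc_victim = ul.iunlink_remove(victim)
    return rc_racer, rc_victim


if __name__ == "__main__":
    print(replay())
```

In this model the second insert overwrites the raced inode's next pointer, so 0x1bd33 is no longer reachable from the bucket head. Removing 0x6b133 succeeds, and the later removal of 0x1bd33 walks to NULLAGINO and fails with EINVAL.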




-- 
符永涛


* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 15:13                                 ` 符永涛
@ 2013-04-19 15:18                                   ` 符永涛
  2013-04-19 16:16                                     ` Eric Sandeen
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 15:18 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 3939 bytes --]

Dear Eric,
If this is a race, where is the locking that is supposed to serialize these
operations? I would like to study that code. Thank you.


2013/4/19 符永涛 <yongtaofu@gmail.com>

> Sure the serious thing here is that it corrupt the unlinked list. The
> inode 0x1bd33 which trigger xfs shutdown is not  0x6b133.
>
>
> 2013/4/19 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/19/13 4:41 AM, 符永涛 wrote:
>> > Dear Brian and Eric,
>> >
>> > kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm <
>> http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm>
>> still have this problem
>> > I build the kernel from this srpm
>> >
>> https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm
>> >
>> > today the shutdown happens again during test.
>> > Seelogs bellow:
>> >
>> > /var/log/message
>> > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
>> returned error 22.
>> > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
>> error 22
>> > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
>> from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
>> 0xffffffffa02d4bda
>> > Apr 19 16:40:05 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
>> filesystem
>> > Apr 19 16:40:05 10 kernel: XFS (sdb): Please umount the filesystem and
>> rectify the problem(s)
>> > Apr 19 16:40:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > Apr 19 16:40:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >
>> > systemtap script output:
>> > --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
>> -- return=0x16
>> > vars: mp=0xffff88101801e800 tp=0xffff880ff143ac70 ino=0xffffffff
>> imap=0xffff88100e93bc08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
>> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
>> > mp: m_agno_log = 0x5, m_agino_log = 0x20
>> > mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog =
>> 0x4, sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
>> > imap: im_blkno = 0x0, im_len = 0xe778, im_boffset = 0xd997
>> > kernel backtrace:
>> > Returning from:  0xffffffffa02b4260 : xfs_imap+0x0/0x280 [xfs]
>> > Returning to  :  0xffffffffa02b9d59 : xfs_inotobp+0x49/0xc0 [xfs]
>> >  0xffffffffa02b9ec1 : xfs_iunlink_remove+0xf1/0x360 [xfs]
>> >  0xffffffff814ede89
>> >  0x0 (inexact)
>> > user backtrace:
>> >  0x3ec260e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>> >
>> > --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1681").return
>> -- return=0x16
>> > vars: tp=0xffff880ff143ac70 ip=0xffff8811ed111000 next_ino=? mp=? agi=?
>> dip=? agibp=? ibp=? agno=? agino=? next_agino=? last_ibp=?
>> last_dip=0xffff881000000001 bucket_index=? offset=?
>> last_offset=0xffffffffffff8811 error=? __func__=[...]
>> > ip: i_ino = 0x1bd33, i_flags = 0x0
>> > ip->i_d: di_nlink = 0x0, di_gen = 0x53068791
>> >
>> > debugfs events trace:
>> >
>> https://docs.google.com/file/d/0B7n2C4T5tfNCREZtdC1yamc0RnM/edit?usp=sharing
>>
>> Same issue, one file was unlinked twice in a race:
>>
>> === ino 0x6b133 ===
>>            <...>-4477  [003]  2721.176790: xfs_iunlink: dev 8:16 ino
>> 0x6b133
>>            <...>-4477  [003]  2721.176839: xfs_iunlink_remove: dev 8:16
>> ino 0x6b133
>>            <...>-4477  [009]  3320.127227: xfs_iunlink: dev 8:16 ino
>> 0x6b133
>>            <...>-4477  [001]  3320.141126: xfs_iunlink_remove: dev 8:16
>> ino 0x6b133
>>            <...>-4477  [003]  7973.136368: xfs_iunlink: dev 8:16 ino
>> 0x6b133
>>            <...>-4479  [018]  7973.158457: xfs_iunlink: dev 8:16 ino
>> 0x6b133
>>            <...>-4479  [018]  7973.158497: xfs_iunlink_remove: dev 8:16
>> ino 0x6b133
>>
>> -Eric
>>
>>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 15:18                                   ` 符永涛
@ 2013-04-19 16:16                                     ` Eric Sandeen
  2013-04-19 16:47                                       ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-19 16:16 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/19/13 8:18 AM, 符永涛 wrote:
> Dear Eric,
> If it's racing issue where the lock is introduced? I want to study the code from you. Thank you.
> 

essentially:

xfs_remove()
{
...
        xfs_lock_two_inodes(dp, ip, XFS_ILOCK_EXCL);
...
	xfs_droplink()

You are 100% sure that you were running the 279.19.1 kernel?

(I'm not very familiar with Oracle's clone of RHEL - I assume that they have copied all of Red Hat's work verbatim, but I have not looked)

Can you verify that in:

__rwsem_do_wake()

the undo target looks like:

  out:
        return sem;


        /* undo the change to the active count, but check for a transition
         * 1->0 */
  undo:
        if (rwsem_atomic_update(-RWSEM_ACTIVE_BIAS, sem) & RWSEM_ACTIVE_MASK)
                goto out;
        goto try_again;


thanks,
-Eric
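
For reference, the EINVAL (return=0x16) in the xfs_imap probe quoted below
follows from the geometry that the probe printed. The sketch here is a
simplified user-space model of the range check, not the kernel code:
struct geom, reported_geom and model_imap_check() are names made up for
illustration, and only the field values come from the systemtap output.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Geometry values as printed by the systemtap probe (mp->m_sb). */
struct geom {
	uint32_t agcount;	/* sb_agcount */
	uint32_t agblocks;	/* sb_agblocks */
	unsigned inopblog;	/* sb_inopblog */
	unsigned agblklog;	/* sb_agblklog */
};

static const struct geom reported_geom = {
	.agcount  = 0x1c,
	.agblocks = 0xffffff0,
	.inopblog = 0x4,
	.agblklog = 0x1c,
};

/*
 * Hypothetical model: split an inode number into AG number and AG block,
 * then range-check both against the geometry. Returns 0 or EINVAL.
 */
static int model_imap_check(const struct geom *g, uint64_t ino)
{
	unsigned agino_log = g->inopblog + g->agblklog;	/* m_agino_log, 0x20 here */
	uint64_t agno, agino, agbno;

	assert(agino_log < 64);		/* keep the shifts well defined */
	agno  = ino >> agino_log;
	agino = ino & ((UINT64_C(1) << agino_log) - 1);
	agbno = agino >> g->inopblog;

	if (agno >= g->agcount || agbno >= g->agblocks)
		return EINVAL;
	return 0;
}
```

Calling model_imap_check(&reported_geom, 0xffffffff) models the NULLAGINO
case: the AG number comes out as 0, but the AG block number is 0xfffffff,
which is past sb_agblocks (0xffffff0), so the check fails. The same call
with i_ino 0x1bd33 passes.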

> 2013/4/19 符永涛 <yongtaofu@gmail.com>
> 
>     Sure the serious thing here is that it corrupt the unlinked list. The inode 0x1bd33 which trigger xfs shutdown is not  0x6b133.
> 
> 
>     2013/4/19 Eric Sandeen <sandeen@sandeen.net>
> 
>         On 4/19/13 4:41 AM, 符永涛 wrote:
>         > Dear Brian and Eric,
>         >
>         > kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm <http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm> still have this problem
>         > I build the kernel from this srpm
>         > https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm
>         >
>         > today the shutdown happens again during test.
>         > Seelogs bellow:
>         >
>         > /var/log/message
>         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
>         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
>         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d4bda
>         > Apr 19 16:40:05 10 kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
>         > Apr 19 16:40:05 10 kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
>         > Apr 19 16:40:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>         > Apr 19 16:40:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>         >
>         > systemtap script output:
>         > --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return -- return=0x16
>         > vars: mp=0xffff88101801e800 tp=0xffff880ff143ac70 ino=0xffffffff imap=0xffff88100e93bc08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=? chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
>         > mp: m_agno_log = 0x5, m_agino_log = 0x20
>         > mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4, sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
>         > imap: im_blkno = 0x0, im_len = 0xe778, im_boffset = 0xd997
>         > kernel backtrace:
>         > Returning from:  0xffffffffa02b4260 : xfs_imap+0x0/0x280 [xfs]
>         > Returning to  :  0xffffffffa02b9d59 : xfs_inotobp+0x49/0xc0 [xfs]
>         >  0xffffffffa02b9ec1 : xfs_iunlink_remove+0xf1/0x360 [xfs]
>         >  0xffffffff814ede89
>         >  0x0 (inexact)
>         > user backtrace:
>         >  0x3ec260e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>         >
>         > --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1681").return -- return=0x16
>         > vars: tp=0xffff880ff143ac70 ip=0xffff8811ed111000 next_ino=? mp=? agi=? dip=? agibp=? ibp=? agno=? agino=? next_agino=? last_ibp=? last_dip=0xffff881000000001 bucket_index=? offset=? last_offset=0xffffffffffff8811 error=? __func__=[...]
>         > ip: i_ino = 0x1bd33, i_flags = 0x0
>         > ip->i_d: di_nlink = 0x0, di_gen = 0x53068791
>         >
>         > debugfs events trace:
>         > https://docs.google.com/file/d/0B7n2C4T5tfNCREZtdC1yamc0RnM/edit?usp=sharing
> 
>         Same issue, one file was unlinked twice in a race:
> 
>         === ino 0x6b133 ===
>                    <...>-4477  [003]  2721.176790: xfs_iunlink: dev 8:16 ino 0x6b133
>                    <...>-4477  [003]  2721.176839: xfs_iunlink_remove: dev 8:16 ino 0x6b133
>                    <...>-4477  [009]  3320.127227: xfs_iunlink: dev 8:16 ino 0x6b133
>                    <...>-4477  [001]  3320.141126: xfs_iunlink_remove: dev 8:16 ino 0x6b133
>                    <...>-4477  [003]  7973.136368: xfs_iunlink: dev 8:16 ino 0x6b133
>                    <...>-4479  [018]  7973.158457: xfs_iunlink: dev 8:16 ino 0x6b133
>                    <...>-4479  [018]  7973.158497: xfs_iunlink_remove: dev 8:16 ino 0x6b133
> 
>         -Eric
> 
> 
> 
> 
>     -- 
>     符永涛
> 
> 
> 
> 
> -- 
> 符永涛
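
The double xfs_iunlink in the trace quoted above can also be picked out
mechanically. What follows is a minimal sketch, assuming the trace line
format shown in this thread; count_double_iunlink() and sample_trace are
hypothetical names, and the sample holds the last three events for ino
0x6b133 copied from the trace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Last three events for ino 0x6b133, copied from the trace in this thread. */
static const char sample_trace[] =
	"<...>-4477  [003]  7973.136368: xfs_iunlink: dev 8:16 ino 0x6b133\n"
	"<...>-4479  [018]  7973.158457: xfs_iunlink: dev 8:16 ino 0x6b133\n"
	"<...>-4479  [018]  7973.158497: xfs_iunlink_remove: dev 8:16 ino 0x6b133\n";

/*
 * Count xfs_iunlink events for `ino` that arrive while the inode is
 * already considered to be on the unlinked list, i.e. with no
 * xfs_iunlink_remove in between.
 */
static int count_double_iunlink(const char *trace, uint64_t ino)
{
	int on_list = 0;
	int doubles = 0;
	const char *line = trace;

	assert(trace != NULL);
	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		char buf[256];

		if (len < sizeof(buf)) {	/* overlong lines are skipped */
			const char *ev, *id;

			memcpy(buf, line, len);
			buf[len] = '\0';
			ev = strstr(buf, "xfs_iunlink");
			id = ev ? strstr(ev, " ino ") : NULL;
			/* strtoull with base 16 accepts the 0x prefix */
			if (id && strtoull(id + 5, NULL, 16) == ino) {
				if (strncmp(ev, "xfs_iunlink_remove:", 19) == 0) {
					on_list = 0;
				} else if (strncmp(ev, "xfs_iunlink:", 12) == 0) {
					doubles += on_list;
					on_list = 1;
				}
			}
		}
		line = eol ? eol + 1 : NULL;
	}
	return doubles;
}
```

Passing the contents of trace.output and the inode of interest gives the
number of back-to-back xfs_iunlink events; any non-zero result marks an
inode worth looking at more closely.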


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 16:16                                     ` Eric Sandeen
@ 2013-04-19 16:47                                       ` 符永涛
  2013-04-19 17:00                                         ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 16:47 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


Hi Eric,
Here's the server info:
[root@10.23.72.95 ~]# rpm -qa|grep kernel
kernel-debug-debuginfo-2.6.32-279.19.1.el6.x86_64
kernel-headers-2.6.32-279.19.1.el6.x86_64
abrt-addon-kerneloops-2.0.8-6.el6.x86_64
dracut-kernel-004-283.el6.noarch
kernel-debuginfo-common-x86_64-2.6.32-279.19.1.el6.x86_64
kernel-debuginfo-2.6.32-279.19.1.el6.x86_64
kernel-debug-2.6.32-279.19.1.el6.x86_64
kernel-devel-2.6.32-279.19.1.el6.x86_64
libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
kernel-firmware-2.6.32-279.19.1.el6.noarch
kernel-2.6.32-279.19.1.el6.x86_64
kernel-debug-devel-2.6.32-279.19.1.el6.x86_64
[root@10.23.72.95 ~]# uname -a
Linux 10.23.72.95 2.6.32-279.19.1.el6.x86_64 #1 SMP Fri Apr 19 10:44:52 CST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@10.23.72.95 ~]#

The kernel code looks like:
__rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
{
        struct rwsem_waiter *waiter;
        struct task_struct *tsk;
        int woken;

        waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);

        if (!wakewrite) {
                if (waiter->flags & RWSEM_WAITING_FOR_WRITE)
                        goto out;
                goto dont_wake_writers;
        }

        /* if we are allowed to wake writers try to grant a single write lock
         * if there's a writer at the front of the queue
         * - we leave the 'waiting count' incremented to signify potential
         *   contention
         */
        if (waiter->flags & RWSEM_WAITING_FOR_WRITE) {
                sem->activity = -1;
                list_del(&waiter->list);
                tsk = waiter->task;
                /* Don't touch waiter after ->task has been NULLed */
                smp_mb();
                waiter->task = NULL;
                wake_up_process(tsk);
                put_task_struct(tsk);
                goto out;
        }

        /* grant an infinite number of read locks to the front of the queue */
 dont_wake_writers:
        woken = 0;
        while (waiter->flags & RWSEM_WAITING_FOR_READ) {
                struct list_head *next = waiter->list.next;

                list_del(&waiter->list);
                tsk = waiter->task;
                smp_mb();
                waiter->task = NULL;
                wake_up_process(tsk);
                put_task_struct(tsk);
                woken++;
                if (list_empty(&sem->wait_list))
                        break;
                waiter = list_entry(next, struct rwsem_waiter, list);
        }

        sem->activity += woken;

 out:
        return sem;
}

I used the SRPM because I wanted to apply the trace patch. Could you provide
a link to the official 279.19.1 SRPM?
Thank you.


2013/4/20 Eric Sandeen <sandeen@sandeen.net>

> On 4/19/13 8:18 AM, 符永涛 wrote:
> > Dear Eric,
> > If it's racing issue where the lock is introduced? I want to study the
> code from you. Thank you.
> >
>
> essentially:
>
> xfs_remove()
> {
> ...
>         xfs_lock_two_inodes(dp, ip, XFS_ILOCK_EXCL);
> ...
>         xfs_droplink()
>
> You are 100% sure that you were running the 279.19.1 kernel?
>
> (I'm not very familiar with Oracle's clone of RHEL - I assume that they
> have copied all of Red Hat's work verbatim, but I have not looked)
>
> Can you verify that in:
>
> __rwsem_do_wake()
>
> the undo target looks like:
>
>   out:
>         return sem;
>
>
>         /* undo the change to the active count, but check for a transition
>          * 1->0 */
>   undo:
>         if (rwsem_atomic_update(-RWSEM_ACTIVE_BIAS, sem) &
> RWSEM_ACTIVE_MASK)
>                 goto out;
>         goto try_again;
>
>
> thanks,
> -Eric
>
> > 2013/4/19 符永涛 <yongtaofu@gmail.com>
> >
> >     Sure the serious thing here is that it corrupt the unlinked list.
> The inode 0x1bd33 which trigger xfs shutdown is not  0x6b133.
> >
> >
> >     2013/4/19 Eric Sandeen <sandeen@sandeen.net>
> >
> >         On 4/19/13 4:41 AM, 符永涛 wrote:
> >         > Dear Brian and Eric,
> >         >
> >         > kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm <
> http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm>
> still have this problem
> >         > I build the kernel from this srpm
> >         >
> https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm
> >         >
> >         > today the shutdown happens again during test.
> >         > Seelogs bellow:
> >         >
> >         > /var/log/message
> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_iunlink_remove:
> xfs_inotobp() returned error 22.
> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree
> returned error 22
> >         > Apr 19 16:40:05 10 kernel: XFS (sdb):
> xfs_do_force_shutdown(0x1) called from line 1184 of file
> fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d4bda
> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): I/O Error Detected.
> Shutting down filesystem
> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): Please umount the
> filesystem and rectify the problem(s)
> >         > Apr 19 16:40:07 10 kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> >         > Apr 19 16:40:37 10 kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> >         >
> >         > systemtap script output:
> >         > --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> -- return=0x16
> >         > vars: mp=0xffff88101801e800 tp=0xffff880ff143ac70
> ino=0xffffffff imap=0xffff88100e93bc08 flags=0x0 agbno=? agino=? agno=?
> blks_per_cluster=? chunk_agbno=? cluster_agbno=? error=? offset=?
> offset_agbno=? __func__=[...]
> >         > mp: m_agno_log = 0x5, m_agino_log = 0x20
> >         > mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0,
> sb_inopblog = 0x4, sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
> >         > imap: im_blkno = 0x0, im_len = 0xe778, im_boffset = 0xd997
> >         > kernel backtrace:
> >         > Returning from:  0xffffffffa02b4260 : xfs_imap+0x0/0x280 [xfs]
> >         > Returning to  :  0xffffffffa02b9d59 : xfs_inotobp+0x49/0xc0
> [xfs]
> >         >  0xffffffffa02b9ec1 : xfs_iunlink_remove+0xf1/0x360 [xfs]
> >         >  0xffffffff814ede89
> >         >  0x0 (inexact)
> >         > user backtrace:
> >         >  0x3ec260e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
> >         >
> >         > --- xfs_iunlink_remove --
> module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1681").return
> -- return=0x16
> >         > vars: tp=0xffff880ff143ac70 ip=0xffff8811ed111000 next_ino=?
> mp=? agi=? dip=? agibp=? ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff881000000001 bucket_index=? offset=?
> last_offset=0xffffffffffff8811 error=? __func__=[...]
> >         > ip: i_ino = 0x1bd33, i_flags = 0x0
> >         > ip->i_d: di_nlink = 0x0, di_gen = 0x53068791
> >         >
> >         > debugfs events trace:
> >         >
> https://docs.google.com/file/d/0B7n2C4T5tfNCREZtdC1yamc0RnM/edit?usp=sharing
> >
> >         Same issue, one file was unlinked twice in a race:
> >
> >         === ino 0x6b133 ===
> >                    <...>-4477  [003]  2721.176790: xfs_iunlink: dev 8:16
> ino 0x6b133
> >                    <...>-4477  [003]  2721.176839: xfs_iunlink_remove:
> dev 8:16 ino 0x6b133
> >                    <...>-4477  [009]  3320.127227: xfs_iunlink: dev 8:16
> ino 0x6b133
> >                    <...>-4477  [001]  3320.141126: xfs_iunlink_remove:
> dev 8:16 ino 0x6b133
> >                    <...>-4477  [003]  7973.136368: xfs_iunlink: dev 8:16
> ino 0x6b133
> >                    <...>-4479  [018]  7973.158457: xfs_iunlink: dev 8:16
> ino 0x6b133
> >                    <...>-4479  [018]  7973.158497: xfs_iunlink_remove:
> dev 8:16 ino 0x6b133
> >
> >         -Eric
> >
> >
> >
> >
> >     --
> >     符永涛
> >
> >
> >
> >
> > --
> > 符永涛
>
>


-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 16:47                                       ` 符永涛
@ 2013-04-19 17:00                                         ` 符永涛
  2013-04-19 17:04                                           ` Eric Sandeen
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 17:00 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


Dear Eric,

I checked the Red Hat SRPM:
https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
The code is the same, as follows:

__rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
{
    struct rwsem_waiter *waiter;
    struct task_struct *tsk;
    int woken;

    waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);

    if (!wakewrite) {
        if (waiter->flags & RWSEM_WAITING_FOR_WRITE)
            goto out;
        goto dont_wake_writers;
    }

    /* if we are allowed to wake writers try to grant a single write lock
     * if there's a writer at the front of the queue
     * - we leave the 'waiting count' incremented to signify potential
     *   contention
     */
    if (waiter->flags & RWSEM_WAITING_FOR_WRITE) {
        sem->activity = -1;
        list_del(&waiter->list);
        tsk = waiter->task;
        /* Don't touch waiter after ->task has been NULLed */
        smp_mb();
        waiter->task = NULL;
        wake_up_process(tsk);
        put_task_struct(tsk);
        goto out;
    }

    /* grant an infinite number of read locks to the front of the queue */
 dont_wake_writers:
    woken = 0;
    while (waiter->flags & RWSEM_WAITING_FOR_READ) {
        struct list_head *next = waiter->list.next;

        list_del(&waiter->list);
        tsk = waiter->task;
        smp_mb();
        waiter->task = NULL;
        wake_up_process(tsk);
        put_task_struct(tsk);
        woken++;
        if (list_empty(&sem->wait_list))
            break;
        waiter = list_entry(next, struct rwsem_waiter, list);
    }

    sem->activity += woken;

 out:
    return sem;
}


2013/4/20 符永涛 <yongtaofu@gmail.com>

> Hi Eric,
> Here's the server info:
> [root@10.23.72.95 ~]# rpm -qa|grep kernel
> kernel-debug-debuginfo-2.6.32-279.19.1.el6.x86_64
> kernel-headers-2.6.32-279.19.1.el6.x86_64
> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
> dracut-kernel-004-283.el6.noarch
> kernel-debuginfo-common-x86_64-2.6.32-279.19.1.el6.x86_64
> kernel-debuginfo-2.6.32-279.19.1.el6.x86_64
> kernel-debug-2.6.32-279.19.1.el6.x86_64
> kernel-devel-2.6.32-279.19.1.el6.x86_64
> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
> kernel-firmware-2.6.32-279.19.1.el6.noarch
> kernel-2.6.32-279.19.1.el6.x86_64
> kernel-debug-devel-2.6.32-279.19.1.el6.x86_64
> [root@10.23.72.95 ~]# uname -a
> Linux 10.23.72.95 2.6.32-279.19.1.el6.x86_64 #1 SMP Fri Apr 19 10:44:52 CST 2013 x86_64 x86_64 x86_64 GNU/Linux
> [root@10.23.72.95 ~]#
>
> The kernel code looks like:
> __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
> {
>         struct rwsem_waiter *waiter;
>         struct task_struct *tsk;
>         int woken;
>
>         waiter = list_entry(sem->wait_list.next, struct rwsem_waiter,
> list);
>
>         if (!wakewrite) {
>                 if (waiter->flags & RWSEM_WAITING_FOR_WRITE)
>                         goto out;
>                 goto dont_wake_writers;
>         }
>
>         /* if we are allowed to wake writers try to grant a single write lock
>          * if there's a writer at the front of the queue
>          * - we leave the 'waiting count' incremented to signify potential
>          *   contention
>          */
>         if (waiter->flags & RWSEM_WAITING_FOR_WRITE) {
>                 sem->activity = -1;
>                 list_del(&waiter->list);
>                 tsk = waiter->task;
>                 /* Don't touch waiter after ->task has been NULLed */
>                 smp_mb();
>                 waiter->task = NULL;
>                 wake_up_process(tsk);
>                 put_task_struct(tsk);
>                 goto out;
>         }
>
>         /* grant an infinite number of read locks to the front of the queue */
>  dont_wake_writers:
>         woken = 0;
>         while (waiter->flags & RWSEM_WAITING_FOR_READ) {
>                 struct list_head *next = waiter->list.next;
>
>                 list_del(&waiter->list);
>                 tsk = waiter->task;
>                 smp_mb();
>                 waiter->task = NULL;
>                 wake_up_process(tsk);
>                 put_task_struct(tsk);
>                 woken++;
>                 if (list_empty(&sem->wait_list))
>                         break;
>                 waiter = list_entry(next, struct rwsem_waiter, list);
>         }
>
>         sem->activity += woken;
>
>  out:
>         return sem;
> }
>
> I use srpm because I want to apply the trace path. Can you help to provide
> the official 279.19.1 srpm link.
> Thank you.
>
>
> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/19/13 8:18 AM, 符永涛 wrote:
>> > Dear Eric,
>> > If it's racing issue where the lock is introduced? I want to study the
>> code from you. Thank you.
>> >
>>
>> essentially:
>>
>> xfs_remove()
>> {
>> ...
>>         xfs_lock_two_inodes(dp, ip, XFS_ILOCK_EXCL);
>> ...
>>         xfs_droplink()
>>
>> You are 100% sure that you were running the 279.19.1 kernel?
>>
>> (I'm not very familiar with Oracle's clone of RHEL - I assume that they
>> have copied all of Red Hat's work verbatim, but I have not looked)
>>
>> Can you verify that in:
>>
>> __rwsem_do_wake()
>>
>> the undo target looks like:
>>
>>   out:
>>         return sem;
>>
>>
>>         /* undo the change to the active count, but check for a transition
>>          * 1->0 */
>>   undo:
>>         if (rwsem_atomic_update(-RWSEM_ACTIVE_BIAS, sem) &
>> RWSEM_ACTIVE_MASK)
>>                 goto out;
>>         goto try_again;
>>
>>
>> thanks,
>> -Eric
>>
>> > 2013/4/19 符永涛 <yongtaofu@gmail.com>
>> >
>> >     Sure the serious thing here is that it corrupt the unlinked list.
>> The inode 0x1bd33 which trigger xfs shutdown is not  0x6b133.
>> >
>> >
>> >     2013/4/19 Eric Sandeen <sandeen@sandeen.net>
>> >
>> >         On 4/19/13 4:41 AM, 符永涛 wrote:
>> >         > Dear Brian and Eric,
>> >         >
>> >         > kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm <
>> http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm>
>> still have this problem
>> >         > I build the kernel from this srpm
>> >         >
>> https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm
>> >         >
>> >         > today the shutdown happens again during test.
>> >         > Seelogs bellow:
>> >         >
>> >         > /var/log/message
>> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_iunlink_remove:
>> xfs_inotobp() returned error 22.
>> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> returned error 22
>> >         > Apr 19 16:40:05 10 kernel: XFS (sdb):
>> xfs_do_force_shutdown(0x1) called from line 1184 of file
>> fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02d4bda
>> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): I/O Error Detected.
>> Shutting down filesystem
>> >         > Apr 19 16:40:05 10 kernel: XFS (sdb): Please umount the
>> filesystem and rectify the problem(s)
>> >         > Apr 19 16:40:07 10 kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> >         > Apr 19 16:40:37 10 kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> >         >
>> >         > systemtap script output:
>> >         > --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
>> -- return=0x16
>> >         > vars: mp=0xffff88101801e800 tp=0xffff880ff143ac70
>> ino=0xffffffff imap=0xffff88100e93bc08 flags=0x0 agbno=? agino=? agno=?
>> blks_per_cluster=? chunk_agbno=? cluster_agbno=? error=? offset=?
>> offset_agbno=? __func__=[...]
>> >         > mp: m_agno_log = 0x5, m_agino_log = 0x20
>> >         > mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0,
>> sb_inopblog = 0x4, sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
>> >         > imap: im_blkno = 0x0, im_len = 0xe778, im_boffset = 0xd997
>> >         > kernel backtrace:
>> >         > Returning from:  0xffffffffa02b4260 : xfs_imap+0x0/0x280 [xfs]
>> >         > Returning to  :  0xffffffffa02b9d59 : xfs_inotobp+0x49/0xc0
>> [xfs]
>> >         >  0xffffffffa02b9ec1 : xfs_iunlink_remove+0xf1/0x360 [xfs]
>> >         >  0xffffffff814ede89
>> >         >  0x0 (inexact)
>> >         > user backtrace:
>> >         >  0x3ec260e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>> >         >
>> >         > --- xfs_iunlink_remove --
>> module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1681").return
>> -- return=0x16
>> >         > vars: tp=0xffff880ff143ac70 ip=0xffff8811ed111000 next_ino=?
>> mp=? agi=? dip=? agibp=? ibp=? agno=? agino=? next_agino=? last_ibp=?
>> last_dip=0xffff881000000001 bucket_index=? offset=?
>> last_offset=0xffffffffffff8811 error=? __func__=[...]
>> >         > ip: i_ino = 0x1bd33, i_flags = 0x0
>> >         > ip->i_d: di_nlink = 0x0, di_gen = 0x53068791
>> >         >
>> >         > debugfs events trace:
>> >         >
>> https://docs.google.com/file/d/0B7n2C4T5tfNCREZtdC1yamc0RnM/edit?usp=sharing
>> >
>> >         Same issue, one file was unlinked twice in a race:
>> >
>> >         === ino 0x6b133 ===
>> >                    <...>-4477  [003]  2721.176790: xfs_iunlink: dev
>> 8:16 ino 0x6b133
>> >                    <...>-4477  [003]  2721.176839: xfs_iunlink_remove:
>> dev 8:16 ino 0x6b133
>> >                    <...>-4477  [009]  3320.127227: xfs_iunlink: dev
>> 8:16 ino 0x6b133
>> >                    <...>-4477  [001]  3320.141126: xfs_iunlink_remove:
>> dev 8:16 ino 0x6b133
>> >                    <...>-4477  [003]  7973.136368: xfs_iunlink: dev
>> 8:16 ino 0x6b133
>> >                    <...>-4479  [018]  7973.158457: xfs_iunlink: dev
>> 8:16 ino 0x6b133
>> >                    <...>-4479  [018]  7973.158497: xfs_iunlink_remove:
>> dev 8:16 ino 0x6b133
>> >
>> >         -Eric
>> >
>> >
>> >
>> >
>> >     --
>> >     符永涛
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>>
>>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 17:00                                         ` 符永涛
@ 2013-04-19 17:04                                           ` Eric Sandeen
  2013-04-19 17:08                                             ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-19 17:04 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/19/13 10:00 AM, 符永涛 wrote:
> Dear Eric,
> 
> I checked rh srpm https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
> And the code is same, as following:
> 
> __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
> {

need to look in lib/rwsem.c not lib/rwsem-spinlock.c

Thanks,
-Eric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 17:04                                           ` Eric Sandeen
@ 2013-04-19 17:08                                             ` 符永涛
  2013-04-19 17:17                                               ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 17:08 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


It is there. I am using the latest 279.19.1 with only the xfs trace patch applied.


2013/4/20 Eric Sandeen <sandeen@sandeen.net>

> On 4/19/13 10:00 AM, 符永涛 wrote:
> > Dear Eric,
> >
> > I checked rh srpm
> https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
> > And the code is same, as following:
> >
> > __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
> > {
>
> need to look in lib/rwsem.c not lib/rwsem-spinlock.c
>
> Thanks,
> -Eric
>
>


-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 17:08                                             ` 符永涛
@ 2013-04-19 17:17                                               ` 符永涛
  2013-04-20  0:03                                                 ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-19 17:17 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


Dear Eric,
I noticed that some functions call xfs_lock_two_inodes(dp, ip,
XFS_ILOCK_EXCL) twice, but xfs_remove does not.


2013/4/20 符永涛 <yongtaofu@gmail.com>

> There it is, I use latest 279.19.1 and only apply xfs trace path.
>
>
> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/19/13 10:00 AM, 符永涛 wrote:
>> > Dear Eric,
>> >
>> > I checked rh srpm
>> https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
>> > And the code is same, as following:
>> >
>> > __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>> > {
>>
>> need to look in lib/rwsem.c not lib/rwsem-spinlock.c
>>
>> Thanks,
>> -Eric
>>
>>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-19 17:17                                               ` 符永涛
@ 2013-04-20  0:03                                                 ` 符永涛
  2013-04-20  1:15                                                   ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-20  0:03 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 1267 bytes --]

Dear Eric and xfs experts,
An update: after more than one day of glusterfs rebalance, xfs has shut
down on 3 of our servers (there are 8 servers in the test cluster). The
errors are identical. In fact, one of the most serious incidents for us was
when xfs shut down on 8 of our servers at the same time during a glusterfs
rebalance.
Thank you very much!


2013/4/20 符永涛 <yongtaofu@gmail.com>

> Dear Eric,
> I noticed some functions call the xfs_lock_two_inodes(dp, ip,
> XFS_ILOCK_EXCL); twince but not in xfs_remove.
>
>
> 2013/4/20 符永涛 <yongtaofu@gmail.com>
>
>> There it is, I use latest 279.19.1 and only apply xfs trace path.
>>
>>
>> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>>
>>> On 4/19/13 10:00 AM, 符永涛 wrote:
>>> > Dear Eric,
>>> >
>>> > I checked rh srpm
>>> https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
>>> > And the code is same, as following:
>>> >
>>> > __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>>> > {
>>>
>>> need to look in lib/rwsem.c not lib/rwsem-spinlock.c
>>>
>>> Thanks,
>>> -Eric
>>>
>>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  0:03                                                 ` 符永涛
@ 2013-04-20  1:15                                                   ` 符永涛
  2013-04-20  2:51                                                     ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-20  1:15 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 1476 bytes --]

Dear xfs experts,
Would mounting with the sync option help to isolate this problem?


2013/4/20 符永涛 <yongtaofu@gmail.com>

> Dear Eric and xfs experts,
> Updated progress is after more than one day glusterfs rebalance 3 of our
> servers xfs shutdown(8 servers in the test cluster). The errors are
> identical. Actually one of the most serious accident for us is 8 of our
> servers xfs shutdown at the same time during glusterfs rebalance.
> Thank you very much!
>
>
> 2013/4/20 符永涛 <yongtaofu@gmail.com>
>
>> Dear Eric,
>> I noticed some functions call the xfs_lock_two_inodes(dp, ip,
>> XFS_ILOCK_EXCL); twince but not in xfs_remove.
>>
>>
>> 2013/4/20 符永涛 <yongtaofu@gmail.com>
>>
>>> There it is, I use latest 279.19.1 and only apply xfs trace path.
>>>
>>>
>>> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>>>
>>>> On 4/19/13 10:00 AM, 符永涛 wrote:
>>>> > Dear Eric,
>>>> >
>>>> > I checked rh srpm
>>>> https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
>>>> > And the code is same, as following:
>>>> >
>>>> > __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>>>> > {
>>>>
>>>> need to look in lib/rwsem.c not lib/rwsem-spinlock.c
>>>>
>>>> Thanks,
>>>> -Eric
>>>>
>>>>
>>>
>>>
>>> --
>>> 符永涛
>>>
>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  1:15                                                   ` 符永涛
@ 2013-04-20  2:51                                                     ` 符永涛
  2013-04-20  3:40                                                       ` Eric Sandeen
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-20  2:51 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 1949 bytes --]

After changing the mount option to sync, the shutdown still happens. I got
another trace; inode 0x1c57d is the abnormal one.
https://docs.google.com/file/d/0B7n2C4T5tfNCYW1jNWhBbXBYakE/edit?usp=sharing
I have a question: if the problem is hard to reproduce, why have I hit it 8
times in a week on a test cluster with only 8 nodes?
What is the problem?


2013/4/20 符永涛 <yongtaofu@gmail.com>

> Dear xfs experts,
> Does mount with sync option helps to isolate this problem?
>
>
> 2013/4/20 符永涛 <yongtaofu@gmail.com>
>
>> Dear Eric and xfs experts,
>> Updated progress is after more than one day glusterfs rebalance 3 of our
>> servers xfs shutdown(8 servers in the test cluster). The errors are
>> identical. Actually one of the most serious accident for us is 8 of our
>> servers xfs shutdown at the same time during glusterfs rebalance.
>> Thank you very much!
>>
>>
>> 2013/4/20 符永涛 <yongtaofu@gmail.com>
>>
>>> Dear Eric,
>>> I noticed some functions call the xfs_lock_two_inodes(dp, ip,
>>> XFS_ILOCK_EXCL); twince but not in xfs_remove.
>>>
>>>
>>> 2013/4/20 符永涛 <yongtaofu@gmail.com>
>>>
>>>> There it is, I use latest 279.19.1 and only apply xfs trace path.
>>>>
>>>>
>>>> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>>>>
>>>>> On 4/19/13 10:00 AM, 符永涛 wrote:
>>>>> > Dear Eric,
>>>>> >
>>>>> > I checked rh srpm
>>>>> https://content-web.rhn.redhat.com/rhn/public/NULL/kernel/2.6.32-279.19.1.el6/SRPMS/kernel-2.6.32-279.19.1.el6.src.rpm?__gda__=1366390847_8550b8568c50ea46b3180266b476353d&ext=.rpm
>>>>> > And the code is same, as following:
>>>>> >
>>>>> > __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>>>>> > {
>>>>>
>>>>> need to look in lib/rwsem.c not lib/rwsem-spinlock.c
>>>>>
>>>>> Thanks,
>>>>> -Eric
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> 符永涛
>>>>
>>>
>>>
>>>
>>> --
>>> 符永涛
>>>
>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  2:51                                                     ` 符永涛
@ 2013-04-20  3:40                                                       ` Eric Sandeen
  2013-04-20  4:03                                                         ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-20  3:40 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/19/13 7:51 PM, 符永涛 wrote:
> After change mount option to sync shutdown still happens, and I got a trace again, the inode 0x1c57d is abnormal.

since this is a race on namespace operations, I wouldn't have expected sync to matter.

> https://docs.google.com/file/d/0B7n2C4T5tfNCYW1jNWhBbXBYakE/edit?usp=sharing
> I have a question if the problem is hard to reproduce why I got 8 times in a week only in a test cluster with 8 node?
> What's the problem?

you must have something unique in your environment, and we don't know what it is.

To gather more information, can you also turn on tracepoints for:

xfs_rename
xfs_create
xfs_link
xfs_remove

in addition to xfs_iunlink and xfs_iunlink_remove,
and we'll see what that tells us.

There are many paths that manipulate the di_nlink count, and something is racing, but we don't yet know which two call chains are involved.

The above are all the callers that manipulate the link count, so they will yield more information about who is manipulating the counts.
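
A minimal sketch of enabling the tracepoints listed above, following the
echo-to-enable pattern from the first message of this thread. It assumes
debugfs is mounted at /sys/kernel/debug and that xfs_iunlink and
xfs_iunlink_remove are only present once the debug patch is applied. The
function name enable_xfs_events and the TRACING_DIR override are made up
for illustration and are not part of any tool.

```shell
# Enable the XFS tracepoints named above by writing 1 to each event's
# "enable" file. TRACING_DIR is a hypothetical override (handy for a dry
# run); by default the usual debugfs tracing directory is used.
enable_xfs_events() {
    dir="${TRACING_DIR:-/sys/kernel/debug/tracing}"
    for ev in xfs_rename xfs_create xfs_link xfs_remove \
              xfs_iunlink xfs_iunlink_remove; do
        f="$dir/events/xfs/$ev/enable"
        if [ -w "$f" ]; then
            echo 1 > "$f"
        else
            # The event may be absent, e.g. if the debug patch is not applied.
            echo "cannot enable $ev: $f is not writable" >&2
        fi
    done
}
```

Run enable_xfs_events as root, reproduce the problem, then save the
contents of /sys/kernel/debug/tracing/trace as before.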

Thanks,
-Eric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  3:40                                                       ` Eric Sandeen
@ 2013-04-20  4:03                                                         ` 符永涛
  2013-04-20  4:11                                                           ` 符永涛
  2013-04-20  4:20                                                           ` Eric Sandeen
  0 siblings, 2 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-20  4:03 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 1685 bytes --]

Hi Eric,
I will enable them and run the test again. I can only reproduce it with
glusterfs rebalance. Glusterfs uses a mechanism it calls syncop to unlink
files. For rebalance it uses
syncop_unlink (glusterfs/libglusterfs/src/syncop.c). The glusterfs
sync_task framework (glusterfs/libglusterfs/src/syncop.c) uses
"makecontext/swapcontext"<http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
Could that lead to racing unlinks from different CPU cores?
Thank you.


2013/4/20 Eric Sandeen <sandeen@sandeen.net>

> On 4/19/13 7:51 PM, 符永涛 wrote:
> > After change mount option to sync shutdown still happens, and I got a
> trace again, the inode 0x1c57d is abnormal.
>
> since this is a race on namespace operations, I wouldn't have expected
> sync to matter.
>
> >
> https://docs.google.com/file/d/0B7n2C4T5tfNCYW1jNWhBbXBYakE/edit?usp=sharing
> > I have a question if the problem is hard to reproduce why I got 8 times
> in a week only in a test cluster with 8 node?
> > What's the problem?
>
> you must have something unique in your environment, and we don't know what
> it is.
>
> To gather more information, can you also turn on tracepoints for:
>
> xfs_rename
> xfs_create
> xfs_link
> xfs_remove
>
> in addition to xfs_iunlink and xfs_iunlink_remove,
> and we'll see what that tells us.
>
> There are many paths that manipulate the di_nlink count, and something is
> racing, but we don't yet know what two callchains they are.
>
> The above are all the callers that manipulate the link count, so they will
> yield more information about who is manipulating the counts.
>
> Thanks,
> -Eric
>
>


-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  4:03                                                         ` 符永涛
@ 2013-04-20  4:11                                                           ` 符永涛
  2013-04-20  4:20                                                           ` Eric Sandeen
  1 sibling, 0 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-20  4:11 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 2015 bytes --]

Glusterfs also always uses hard links for self-heal (each backend file has
a hard link under a hidden directory named .glusterfs). So, as you
mentioned, decrements of di_nlink may also conflict.


2013/4/20 符永涛 <yongtaofu@gmail.com>

> Hi Eric,
> I will enable them and run test again. I can only reproduce it with
> glusterfs rebalance. Glusterfs uses a mechanism it called syncop to unlink
> file. For rebalance it uses
> syncop_unlink(glusterfs/libglusterfs/src/syncop.c). In the glusterfs
> sync_task framework(glusterfs/libglusterfs/src/syncop.c) it uses
> "makecontext/swapcontext"<http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
> Does it leads to racing unlink from different CPU core?
> Thank you.
>
>
> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/19/13 7:51 PM, 符永涛 wrote:
>> > After change mount option to sync shutdown still happens, and I got a
>> trace again, the inode 0x1c57d is abnormal.
>>
>> since this is a race on namespace operations, I wouldn't have expected
>> sync to matter.
>>
>> >
>> https://docs.google.com/file/d/0B7n2C4T5tfNCYW1jNWhBbXBYakE/edit?usp=sharing
>> > I have a question if the problem is hard to reproduce why I got 8 times
>> in a week only in a test cluster with 8 node?
>> > What's the problem?
>>
>> you must have something unique in your environment, and we don't know
>> what it is.
>>
>> To gather more information, can you also turn on tracepoints for:
>>
>> xfs_rename
>> xfs_create
>> xfs_link
>> xfs_remove
>>
>> in addition to xfs_iunlink and xfs_iunlink_remove,
>> and we'll see what that tells us.
>>
>> There are many paths that manipulate the di_nlink count, and something is
>> racing, but we don't yet know what two callchains they are.
>>
>> The above are all the callers that manipulate the link count, so they
>> will yield more information about who is manipulating the counts.
>>
>> Thanks,
>> -Eric
>>
>>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  4:03                                                         ` 符永涛
  2013-04-20  4:11                                                           ` 符永涛
@ 2013-04-20  4:20                                                           ` Eric Sandeen
  2013-04-20  4:27                                                             ` 符永涛
  1 sibling, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-20  4:20 UTC (permalink / raw)
  To: 符永涛; +Cc: Brian Foster, xfs

On 4/19/13 9:03 PM, 符永涛 wrote:
> Hi Eric,
> I will enable them and run test again. I can only reproduce it with
> glusterfs rebalance. Glusterfs uses a mechanism it called syncop to
> unlink file. For rebalance it uses
> syncop_unlink(glusterfs/libglusterfs/src/syncop.c). In the glusterfs
> sync_task framework(glusterfs/libglusterfs/src/syncop.c) it uses
> "makecontext/swapcontext"
> <http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
> Does it leads to racing unlink from different CPU core?

Yep, I understand that it's rebalance.  It dies when rebalance finishes because an
open but unlinked file trips over the corrupted list from earlier, it seems.

I don't know why makecontext would matter...

Just to be sure, you are definitely loading the xfs module from the kernel you built, right, and you don't have a "priority" module getting loaded from elsewhere?  Seems unlikely, but just to be sure.
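
One way to check this, as a rough sketch: `modinfo -n xfs` prints the file
name of the xfs module that modprobe would pick for the running kernel, and
that path can be compared against the kernel's own module tree. The helper
name xfs_module_origin and its two labels are made up for illustration, and
the sketch assumes modules are installed under /lib/modules/<release>/.

```shell
# Classify a module path relative to the given kernel release.
#   $1: module path, e.g. the output of `modinfo -n xfs`
#   $2: kernel release, e.g. the output of `uname -r`
# Prints "in-tree" if the path lives under that kernel's own kernel/
# directory, otherwise "override" (some other directory took priority).
xfs_module_origin() {
    case "$1" in
        "/lib/modules/$2/kernel/"*) echo "in-tree" ;;
        *) echo "override" ;;
    esac
}

# On a live system (not run here):
#   xfs_module_origin "$(modinfo -n xfs)" "$(uname -r)"
```

Note that this only shows which file would be loaded; it does not prove
which binary is currently resident if the module was loaded earlier.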

> Thank you.

You could also add this patch to the xfs tracepoints to print more information about the inodes - the mode & flags.

-Eric


diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
index e8ce644..c314b87 100644
--- a/fs/xfs/linux-2.6/xfs_trace.h
+++ b/fs/xfs/linux-2.6/xfs_trace.h
@@ -544,14 +544,18 @@ DECLARE_EVENT_CLASS(xfs_inode_class,
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
 		__field(xfs_ino_t, ino)
+		__field(__u16, mode)
+		__field(unsigned long, flags)
 	),
 	TP_fast_assign(
 		__entry->dev = VFS_I(ip)->i_sb->s_dev;
 		__entry->ino = ip->i_ino;
+		__entry->mode = VFS_I(ip)->i_mode;
+		__entry->flags = ip->i_flags;
 	),
-	TP_printk("dev %d:%d ino 0x%llx",
+	TP_printk("dev %d:%d ino 0x%llx mode 0%o, flags 0x%lx",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
-		  __entry->ino)
+		  __entry->ino, __entry->mode, __entry->flags)
 )
 
 #define DEFINE_INODE_EVENT(name) \




^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  4:20                                                           ` Eric Sandeen
@ 2013-04-20  4:27                                                             ` 符永涛
  2013-04-20 10:10                                                               ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-20  4:27 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 2595 bytes --]

Hi Eric,
The xfs module is loaded from the system kernel. The problem happens on our
production servers too (which I have not touched so far), and if the xfs
module were messed up, systemtap would probably not work either, but it
does. As you mentioned, the strange thing is that the xfs shutdown always
happens when the glusterfs rebalance completes.


2013/4/20 Eric Sandeen <sandeen@sandeen.net>

> On 4/19/13 9:03 PM, 符永涛 wrote:
> > Hi Eric,
> > I will enable them and run test again. I can only reproduce it with
> > glusterfs rebalance. Glusterfs uses a mechanism it called syncop to
> > unlink file. For rebalance it uses
> > syncop_unlink(glusterfs/libglusterfs/src/syncop.c). In the glusterfs
> > sync_task framework(glusterfs/libglusterfs/src/syncop.c) it uses
> > "makecontext/swapcontext"
> > <
> http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
> > Does it leads to racing unlink from different CPU core?
>
> Yep, I understand that it's rebalance.  It dies when rebalance finishes
> because an
> open but unlinked file trips over the corrupted list from earlier, it
> seems.
>
> I don't know why makecontext would matter...
>
> Just to be sure, you are definitely loading the xfs module from the kernel
> you built, right, and you don't have a "priority" module getting loaded
> from elsewhere?  Seems unlikely, but just to be sure.
>
> > Thank you.
>
> You could also add this patch to the xfs tracepoints to print more
> information about the inodes - the mode & flags.
>
> -Eric
>
>
> diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
> index e8ce644..c314b87 100644
> --- a/fs/xfs/linux-2.6/xfs_trace.h
> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> @@ -544,14 +544,18 @@ DECLARE_EVENT_CLASS(xfs_inode_class,
>         TP_STRUCT__entry(
>                 __field(dev_t, dev)
>                 __field(xfs_ino_t, ino)
> +               __field(__u16, mode)
> +               __field(unsigned long, flags)
>         ),
>         TP_fast_assign(
>                 __entry->dev = VFS_I(ip)->i_sb->s_dev;
>                 __entry->ino = ip->i_ino;
> +               __entry->mode = VFS_I(ip)->i_mode;
> +               __entry->flags = ip->i_flags;
>         ),
> -       TP_printk("dev %d:%d ino 0x%llx",
> +       TP_printk("dev %d:%d ino 0x%llx mode 0%o, flags 0x%lx",
>                   MAJOR(__entry->dev), MINOR(__entry->dev),
> -                 __entry->ino)
> +                 __entry->ino, __entry->mode, __entry->flags)
>  )
>
>  #define DEFINE_INODE_EVENT(name) \
>
>
>
>


-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20  4:27                                                             ` 符永涛
@ 2013-04-20 10:10                                                               ` 符永涛
  2013-04-20 11:38                                                                 ` Brian Foster
       [not found]                                                                 ` <5172B73C.6000900@sandeen.net>
  0 siblings, 2 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-20 10:10 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 3823 bytes --]

Dear Eric,
I have applied your latest patch and collected the following log:

/var/log/messages:
Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
returned error 22 for inode 0x1b20b ag 0 agino 1b20b
Apr 20 17:28:23 10 kernel:
Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
error 22
Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
0xffffffffa02d4d0a
Apr 20 17:28:23 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
filesystem
Apr 20 17:28:23 10 kernel: XFS (sdb): Please umount the filesystem and
rectify the problem(s)
Apr 20 17:28:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 20 17:29:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 20 17:29:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 20 17:30:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.

debugfs trace:
https://docs.google.com/file/d/0B7n2C4T5tfNCTlZGUVpnZENrZ3M/edit?usp=sharing

Thank you.


2013/4/20 符永涛 <yongtaofu@gmail.com>

> Hi Eric,
> The xfs module is loaded from system kernel, it happens on our production
> server too (I did not touch that till now) and if the xfs module is mess up
> the systemstap may also not working but now it works. As you have
> mentioned, strange thing is xfs shutdown always happens when glusterfs
> rebalance completes.
>
>
> 2013/4/20 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/19/13 9:03 PM, 符永涛 wrote:
>> > Hi Eric,
>> > I will enable them and run test again. I can only reproduce it with
>> > glusterfs rebalance. Glusterfs uses a mechanism it called syncop to
>> > unlink file. For rebalance it uses
>> > syncop_unlink(glusterfs/libglusterfs/src/syncop.c). In the glusterfs
>> > sync_task framework(glusterfs/libglusterfs/src/syncop.c) it uses
>> > "makecontext/swapcontext"
>> > <
>> http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html
>> >.
>> > Does it leads to racing unlink from different CPU core?
>>
>> Yep, I understand that it's rebalance.  It dies when rebalance finishes
>> because an
>> open but unlinked file trips over the corrupted list from earlier, it
>> seems.
>>
>> I don't know why makecontext would matter...
>>
>> Just to be sure, you are definitely loading the xfs module from the
>> kernel you built, right, and you don't have a "priority" module getting
>> loaded from elsewhere?  Seems unlikely, but just to be sure.
>>
>> > Thank you.
>>
>> You could also add this patch to the xfs tracepoints to print more
>> information about the inodes - the mode & flags.
>>
>> -Eric
>>
>>
>> diff --git a/fs/xfs/linux-2.6/xfs_trace.h b/fs/xfs/linux-2.6/xfs_trace.h
>> index e8ce644..c314b87 100644
>> --- a/fs/xfs/linux-2.6/xfs_trace.h
>> +++ b/fs/xfs/linux-2.6/xfs_trace.h
>> @@ -544,14 +544,18 @@ DECLARE_EVENT_CLASS(xfs_inode_class,
>>         TP_STRUCT__entry(
>>                 __field(dev_t, dev)
>>                 __field(xfs_ino_t, ino)
>> +               __field(__u16, mode)
>> +               __field(unsigned long, flags)
>>         ),
>>         TP_fast_assign(
>>                 __entry->dev = VFS_I(ip)->i_sb->s_dev;
>>                 __entry->ino = ip->i_ino;
>> +               __entry->mode = VFS_I(ip)->i_mode;
>> +               __entry->flags = ip->i_flags;
>>         ),
>> -       TP_printk("dev %d:%d ino 0x%llx",
>> +       TP_printk("dev %d:%d ino 0x%llx mode 0%o, flags 0x%lx",
>>                   MAJOR(__entry->dev), MINOR(__entry->dev),
>> -                 __entry->ino)
>> +                 __entry->ino, __entry->mode, __entry->flags)
>>  )
>>
>>  #define DEFINE_INODE_EVENT(name) \
>>
>>
>>
>>
>
>
> --
> 符永涛
>



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20 10:10                                                               ` 符永涛
@ 2013-04-20 11:38                                                                 ` Brian Foster
  2013-04-20 11:52                                                                   ` 符永涛
  2013-04-20 15:36                                                                   ` Eric Sandeen
       [not found]                                                                 ` <5172B73C.6000900@sandeen.net>
  1 sibling, 2 replies; 50+ messages in thread
From: Brian Foster @ 2013-04-20 11:38 UTC (permalink / raw)
  To: 符永涛; +Cc: Eric Sandeen, xfs

On 04/20/2013 06:10 AM, 符永涛 wrote:
> Dear Eric,
> I have applied your latest patch and collected the following log:
> 
> /var/log/message
> Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22 for inode 0x1b20b ag 0 agino 1b20b
> Apr 20 17:28:23 10 kernel:
> Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
> 0xffffffffa02d4d0a
> Apr 20 17:28:23 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 20 17:28:23 10 kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 20 17:28:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 20 17:29:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 20 17:29:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 20 17:30:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 
> debugfs trace:
> https://docs.google.com/file/d/0B7n2C4T5tfNCTlZGUVpnZENrZ3M/edit?usp=sharing
> 

FWIW...

<...>-6908  [001]  8739.967623: xfs_iunlink: dev 8:16 ino 0x83a8b mode
0100000, flags 0x0
<...>-6909  [001]  8739.970252: xfs_iunlink: dev 8:16 ino 0x83a8b mode
0100000, flags 0x0

0x83a8b and 0x1b20b both hash to unlinked list bucket 11.
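
The bucket number can be checked with a little shell arithmetic. This is a
sketch, assuming the usual 64 unlinked-list buckets per AGI
(XFS_AGI_UNLINKED_BUCKETS) and a bucket index of agino modulo that count.
Because 64 is a power of two and the agino is the low-order bits of the
inode number, taking the modulus of the full inode number gives the same
result. The function name unlinked_bucket is made up for illustration.

```shell
# Number of unlinked-list hash buckets in each AGI
# (XFS_AGI_UNLINKED_BUCKETS in the kernel source).
XFS_AGI_UNLINKED_BUCKETS=64

# Print the unlinked-list bucket for an inode number (hex or decimal).
unlinked_bucket() {
    echo $(( $1 % XFS_AGI_UNLINKED_BUCKETS ))
}

unlinked_bucket 0x83a8b   # prints 11
unlinked_bucket 0x1b20b   # prints 11
```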

As to the rest of the trace, there appears to be a significant amount of
link activity on (directory) inode 0x83a8a (the immediately prior inode
to the inode involved in the race). The name data in the trace suggests
activity somewhere under .glusterfs. A couple questions:

1.) Any idea what entries point to this inode right now (e.g., how many
links on this inode) and where it resides in the fs (path)?

2.) Can you associate this kind of heavy remove/link pattern on a single
inode to a higher level activity? For example, if you were to watch the
trace data live, is this a normal pattern you observe? Does it only
occur when a rebalance is in progress? Or when a rebalance finishes? Any
detailed observations you can make in that regard could be helpful.

Brian

> Thank you.
> 
> 
> 2013/4/20 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> 
>     Hi Eric,
>     The xfs module is loaded from system kernel, it happens on our
>     production server too (I did not touch that till now) and if the xfs
>     module is mess up the systemstap may also not working but now it
>     works. As you have mentioned, strange thing is xfs shutdown always
>     happens when glusterfs rebalance completes.
> 
> 
>     2013/4/20 Eric Sandeen <sandeen@sandeen.net
>     <mailto:sandeen@sandeen.net>>
> 
>         On 4/19/13 9:03 PM, 符永涛 wrote:
>         > Hi Eric,
>         > I will enable them and run test again. I can only reproduce it
>         with
>         > glusterfs rebalance. Glusterfs uses a mechanism it called
>         syncop to
>         > unlink file. For rebalance it uses
>         > syncop_unlink(glusterfs/libglusterfs/src/syncop.c). In the
>         glusterfs
>         > sync_task framework(glusterfs/libglusterfs/src/syncop.c) it uses
>         > "makecontext/swapcontext"
>         >
>         <http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
>         > Does it leads to racing unlink from different CPU core?
> 
>         Yep, I understand that it's rebalance.  It dies when rebalance
>         finishes because an
>         open but unlinked file trips over the corrupted list from
>         earlier, it seems.
> 
>         I don't know why makecontext would matter...
> 
>         Just to be sure, you are definitely loading the xfs module from
>         the kernel you built, right, and you don't have a "priority"
>         module getting loaded from elsewhere?  Seems unlikely, but just
>         to be sure.
> 
>         > Thank you.
> 
>         You could also add this patch to the xfs tracepoints to print
>         more information about the inodes - the mode & flags.
> 
>         -Eric
> 
> 
>         diff --git a/fs/xfs/linux-2.6/xfs_trace.h
>         b/fs/xfs/linux-2.6/xfs_trace.h
>         index e8ce644..c314b87 100644
>         --- a/fs/xfs/linux-2.6/xfs_trace.h
>         +++ b/fs/xfs/linux-2.6/xfs_trace.h
>         @@ -544,14 +544,18 @@ DECLARE_EVENT_CLASS(xfs_inode_class,
>                 TP_STRUCT__entry(
>                         __field(dev_t, dev)
>                         __field(xfs_ino_t, ino)
>         +               __field(__u16, mode)
>         +               __field(unsigned long, flags)
>                 ),
>                 TP_fast_assign(
>                         __entry->dev = VFS_I(ip)->i_sb->s_dev;
>                         __entry->ino = ip->i_ino;
>         +               __entry->mode = VFS_I(ip)->i_mode;
>         +               __entry->flags = ip->i_flags;
>                 ),
>         -       TP_printk("dev %d:%d ino 0x%llx",
>         +       TP_printk("dev %d:%d ino 0x%llx mode 0%o, flags 0x%lx",
>                           MAJOR(__entry->dev), MINOR(__entry->dev),
>         -                 __entry->ino)
>         +                 __entry->ino, __entry->mode, __entry->flags)
>          )
> 
>          #define DEFINE_INODE_EVENT(name) \
> 
> 
> 
> 
> 
> 
>     -- 
>     符永涛
> 
> 
> 
> 
> -- 
> 符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20 11:38                                                                 ` Brian Foster
@ 2013-04-20 11:52                                                                   ` 符永涛
  2013-04-20 12:58                                                                     ` Brian Foster
  2013-04-20 15:36                                                                   ` Eric Sandeen
  1 sibling, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-20 11:52 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs


[-- Attachment #1.1: Type: text/plain, Size: 6462 bytes --]

Hi Brian,
Here is the file:
find . -inum 539274
./crashtest/.glusterfs/indices/xattrop
[root@10.23.72.94 xfsd]# ls -lih ./crashtest/.glusterfs/indices/xattrop
total 0
539275 ---------- 2 root root 0 Apr 20 17:17 132ef294-71d1-4435-8daa-aa002e67cb6e
539275 ---------- 2 root root 0 Apr 20 17:17 xattrop-f3ad589a-b8dc-4416-ab84-fc9ad4033540
find . -inum 539275
./crashtest/.glusterfs/indices/xattrop/xattrop-f3ad589a-b8dc-4416-ab84-fc9ad4033540
./crashtest/.glusterfs/indices/xattrop/132ef294-71d1-4435-8daa-aa002e67cb6e
I'm not sure whether this is normal or whether glusterfs has fallen into an
infinite loop. Is there a chance that the kernel has fallen into an
infinite loop?
I'll look into it.


2013/4/20 Brian Foster <bfoster@redhat.com>

> On 04/20/2013 06:10 AM, 符永涛 wrote:
> > Dear Eric,
> > I have applied your latest patch and collected the following log:
> >
> > /var/log/message
> > Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> > returned error 22 for inode 0x1b20b ag 0 agino 1b20b
> > Apr 20 17:28:23 10 kernel:
> > Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> > error 22
> > Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> > from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
> > 0xffffffffa02d4d0a
> > Apr 20 17:28:23 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
> > filesystem
> > Apr 20 17:28:23 10 kernel: XFS (sdb): Please umount the filesystem and
> > rectify the problem(s)
> > Apr 20 17:28:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 20 17:29:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 20 17:29:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 20 17:30:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >
> > debugfs trace:
> >
> https://docs.google.com/file/d/0B7n2C4T5tfNCTlZGUVpnZENrZ3M/edit?usp=sharing
> >
>
> FWIW...
>
> <...>-6908  [001]  8739.967623: xfs_iunlink: dev 8:16 ino 0x83a8b mode
> 0100000, flags 0x0
> <...>-6909  [001]  8739.970252: xfs_iunlink: dev 8:16 ino 0x83a8b mode
> 0100000, flags 0x0
>
> 0x83a8b and 0x1b20b both hash to unlinked list bucket 11.
>
> As to the rest of the trace, there appears to be a significant amount of
> link activity on (directory) inode 0x83a8a (the immediately prior inode
> to the inode involved in the race). The name data in the trace suggests
> activity somewhere under .glusterfs. A couple questions:
>
> 1.) Any idea what entries point to this inode right now (e.g., how many
> links on this inode) and where it resides in the fs (path)?
>
> 2.) Can you associate this kind of heavy remove/link pattern on a single
> inode to a higher level activity? For example, if you were to watch the
> trace data live, is this a normal pattern you observe? Does it only
> occur when a rebalance is in progress? Or when a rebalance finishes? Any
> detailed observations you can make in that regard could be helpful.
>
> Brian
>
> > Thank you.
> >
> >
> > 2013/4/20 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> >
> >     Hi Eric,
> >     The xfs module is loaded from system kernel, it happens on our
> >     production server too (I did not touch that till now) and if the xfs
> >     module is mess up the systemstap may also not working but now it
> >     works. As you have mentioned, strange thing is xfs shutdown always
> >     happens when glusterfs rebalance completes.
> >
> >
> >     2013/4/20 Eric Sandeen <sandeen@sandeen.net
> >     <mailto:sandeen@sandeen.net>>
> >
> >         On 4/19/13 9:03 PM, 符永涛 wrote:
> >         > Hi Eric,
> >         > I will enable them and run test again. I can only reproduce it
> >         with
> >         > glusterfs rebalance. Glusterfs uses a mechanism it called
> >         syncop to
> >         > unlink file. For rebalance it uses
> >         > syncop_unlink(glusterfs/libglusterfs/src/syncop.c). In the
> >         glusterfs
> >         > sync_task framework(glusterfs/libglusterfs/src/syncop.c) it
> uses
> >         > "makecontext/swapcontext"
> >         >
> >         <
> http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
> >         > Does it leads to racing unlink from different CPU core?
> >
> >         Yep, I understand that it's rebalance.  It dies when rebalance
> >         finishes because an
> >         open but unlinked file trips over the corrupted list from
> >         earlier, it seems.
> >
> >         I don't know why makecontext would matter...
> >
> >         Just to be sure, you are definitely loading the xfs module from
> >         the kernel you built, right, and you don't have a "priority"
> >         module getting loaded from elsewhere?  Seems unlikely, but just
> >         to be sure.
> >
> >         > Thank you.
> >
> >         You could also add this patch to the xfs tracepoints to print
> >         more information about the inodes - the mode & flags.
> >
> >         -Eric
> >
> >
> >         diff --git a/fs/xfs/linux-2.6/xfs_trace.h
> >         b/fs/xfs/linux-2.6/xfs_trace.h
> >         index e8ce644..c314b87 100644
> >         --- a/fs/xfs/linux-2.6/xfs_trace.h
> >         +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >         @@ -544,14 +544,18 @@ DECLARE_EVENT_CLASS(xfs_inode_class,
> >                 TP_STRUCT__entry(
> >                         __field(dev_t, dev)
> >                         __field(xfs_ino_t, ino)
> >         +               __field(__u16, mode)
> >         +               __field(unsigned long, flags)
> >                 ),
> >                 TP_fast_assign(
> >                         __entry->dev = VFS_I(ip)->i_sb->s_dev;
> >                         __entry->ino = ip->i_ino;
> >         +               __entry->mode = VFS_I(ip)->i_mode;
> >         +               __entry->flags = ip->i_flags;
> >                 ),
> >         -       TP_printk("dev %d:%d ino 0x%llx",
> >         +       TP_printk("dev %d:%d ino 0x%llx mode 0%o, flags 0x%lx",
> >                           MAJOR(__entry->dev), MINOR(__entry->dev),
> >         -                 __entry->ino)
> >         +                 __entry->ino, __entry->mode, __entry->flags)
> >          )
> >
> >          #define DEFINE_INODE_EVENT(name) \
> >
> >
> >
> >
> >
> >
> >     --
> >     符永涛
> >
> >
> >
> >
> > --
> > 符永涛
>
>


-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20 11:52                                                                   ` 符永涛
@ 2013-04-20 12:58                                                                     ` Brian Foster
  2013-04-20 13:12                                                                       ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: Brian Foster @ 2013-04-20 12:58 UTC (permalink / raw)
  To: 符永涛; +Cc: Eric Sandeen, xfs

On 04/20/2013 07:52 AM, 符永涛 wrote:
> Hi Brain,
> Here is the file:
> find . -inum 539274
> ./crashtest/.glusterfs/indices/xattrop
> [root@10.23.72.94 <mailto:root@10.23.72.94> xfsd]# ls -lih
> ./crashtest/.glusterfs/indices/xattrop
> total 0
> 539275 ---------- 2 root root 0 Apr 20 17:17
> 132ef294-71d1-4435-8daa-aa002e67cb6e
> 539275 ---------- 2 root root 0 Apr 20 17:17
> xattrop-f3ad589a-b8dc-4416-ab84-fc9ad4033540
> find . -inum 539275
> ./crashtest/.glusterfs/indices/xattrop/xattrop-f3ad589a-b8dc-4416-ab84-fc9ad4033540
> ./crashtest/.glusterfs/indices/xattrop/132ef294-71d1-4435-8daa-aa002e67cb6e
> I'm not sure if it is normal or glusterfs fall in infinite loop. Is
> there a change that the kernel fall into dead loop?
> I'll study it.
> 

Very interesting, thank you. I don't have the full context here yet, but
the short of it is that this particular indices/xattrop directory is
managed by a backend translator and driven by replication. It appears to
be doing some kind of transaction level tracking based on links. E.g.,
some quick behavioral observations:

- This directory is created and an xattrop-#### file created (no size).
- On starting a large sequential write from a client, I start observing
a continuous sequence of xfs_link/xfs_remove operations via tracing.
- The backend appears to create a link to the xattrop-#### file for every
replication transaction (handwave) with the name of the link referring
to the gfid name of the file under write (but note again the link is not
to the inode under write, but this special xattrop-#### file). It then
apparently removes this link on transaction completion and the process
repeats. I suspect this has to do with identifying what files were under
modification in the event of a crash before transaction completion, but
the underlying pattern/workload is what's more important to us here for
the time being...

So we're seeing a heavy link related workload on a directory inode
(0x83a8a) where the entries are all links to the same inode (0x83a8b).
These inodes happen to be close in proximity, which may or may not be a
factor. The translator that generates these link/unlink ops sits right
above a generic thread pool translator, so this is multi-threaded.

What isn't clear to me yet is where the xfs_iunlink() for this heavily
linked inode is induced. The primary dentry remains after my file copy
test, but then taking another look after a few minutes I see it removed.
The same thing occurs if I gracefully restart the volume. I'm going to
have to dig into that some more and also see if we can use this to
narrow in on a reproducer. I'm thinking something along the lines of:

- Create a directory/file. Ideally the directory and file inodes are in
the same cluster.
- Start a highly-threaded link-unlink workload against that file in the
same directory.
- Somewhere in the background unlink the main file.
- Check for multiple xfs_iunlink() ops, repeat.

... the assumption being that the xfs_iunlink() race could have led to
a possible unlinked list corruption on the associated list, such that a
later inactivation/xfs_iunlink_remove of some other inode in that bucket
could fail.

Brian


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20 12:58                                                                     ` Brian Foster
@ 2013-04-20 13:12                                                                       ` 符永涛
  0 siblings, 0 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-20 13:12 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs


Hi Brian,
Yes, this directory (./crashtest/.glusterfs/indices) is the glusterfs
index-base directory for the glusterfs features/index xlator. The
./crashtest/.glusterfs/indices/xattrop directory is the XATTROP_SUBDIR
directory for glusterfs. This path can be set in the glusterfs server-side
volume file, so until there is further progress I'll try to isolate this
issue by setting it to an ext4 path or somewhere else.
Thank you.



-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-20 11:38                                                                 ` Brian Foster
  2013-04-20 11:52                                                                   ` 符永涛
@ 2013-04-20 15:36                                                                   ` Eric Sandeen
  1 sibling, 0 replies; 50+ messages in thread
From: Eric Sandeen @ 2013-04-20 15:36 UTC (permalink / raw)
  To: Brian Foster; +Cc: 符永涛, xfs

On 4/20/13 4:38 AM, Brian Foster wrote:
> On 04/20/2013 06:10 AM, 符永涛 wrote:
>> Dear Eric,
>> I have applied your latest patch and collected the following log:
>>
>> /var/log/message
>> Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
>> returned error 22 for inode 0x1b20b ag 0 agino 1b20b
>> Apr 20 17:28:23 10 kernel:
>> Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
>> error 22
>> Apr 20 17:28:23 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
>> from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
>> 0xffffffffa02d4d0a
>> Apr 20 17:28:23 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
>> filesystem
>> Apr 20 17:28:23 10 kernel: XFS (sdb): Please umount the filesystem and
>> rectify the problem(s)
>> Apr 20 17:28:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> Apr 20 17:29:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> Apr 20 17:29:37 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> Apr 20 17:30:07 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>
>> debugfs trace:
>> https://docs.google.com/file/d/0B7n2C4T5tfNCTlZGUVpnZENrZ3M/edit?usp=sharing
>>
> 
> FWIW...
> 
> <...>-6908  [001]  8739.967623: xfs_iunlink: dev 8:16 ino 0x83a8b mode
> 0100000, flags 0x0
> <...>-6909  [001]  8739.970252: xfs_iunlink: dev 8:16 ino 0x83a8b mode
> 0100000, flags 0x0

Interesting, this time it was on the same CPU [001]

I'd hoped that we'd see this same inode in some of the new tracepoints
but we don't, I'm not sure why; perhaps it overflowed the trace buffer?

> 0x83a8b and 0x1b20b both hash to unlinked list bucket 11.

which is why the error triggered on 0x1b20b, since the list is now corrupt. 

-Eric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
       [not found]                                                                 ` <5172B73C.6000900@sandeen.net>
@ 2013-04-20 23:52                                                                   ` 符永涛
  0 siblings, 0 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-20 23:52 UTC (permalink / raw)
  To: Eric Sandeen, xfs, Brian Foster, Dave Chinner


Dear Brian, Eric, and xfs experts,
Thank you very much for helping to address the issue. With your help I am
now able to isolate this problem: I pointed the glusterfs volume indices
directory out of the xfs filesystem (to an ext4 path) and the shutdown no
longer happens. Since this directory is not where glusterfs stores data (it
is just an index directory containing some kind of flag files), it's OK to
set it to somewhere else. I'll run more tests with this configuration.

It seems that glusterfs aggressively calling link/remove on the directory and
the files under it leads to the race. With this directory moved to an ext4
path, the directory status after rebalance is as follows:

/data/testbug/.glusterfs:
total 12K
drwxr-xr-x 3 root root 4.0K Apr 20 22:00 .
drwxr-xr-x 3 root root 4.0K Apr 20 22:00 ..
drwxr-xr-x 3 root root 4.0K Apr 20 23:05 indices

/data/testbug/.glusterfs/indices:
total 12K
drwxr-xr-x 3 root root 4.0K Apr 20 23:05 .
drwxr-xr-x 3 root root 4.0K Apr 20 22:00 ..
drw------- 2 root root 4.0K Apr 20 23:44 xattrop

/data/testbug/.glusterfs/indices/xattrop:
total 8.0K
drw-------  2 root root 4.0K Apr 20 23:44 .
drwxr-xr-x  3 root root 4.0K Apr 20 23:05 ..
---------- 21 root root    0 Apr 20 23:05 2ea59fab-da86-4ccd-a6d3-20ca80d30e8c
---------- 21 root root    0 Apr 20 23:05 33e96373-7fe1-4c09-969e-f01c45ac445e
---------- 21 root root    0 Apr 20 23:05 35341482-c561-4fdd-b505-61c4e189f63c
---------- 21 root root    0 Apr 20 23:05 390e8676-b18a-4769-9e26-ebc47385d022
---------- 21 root root    0 Apr 20 23:05 4f153df5-0101-4375-bb3d-1c94a2ca5c69
---------- 21 root root    0 Apr 20 23:05 58867dfa-d4b1-47fb-9a54-f3cda0411297
---------- 21 root root    0 Apr 20 23:05 7608e535-6de4-40ba-bb63-aba986d77c6a
---------- 21 root root    0 Apr 20 23:05 7f71f1a0-4463-4fdd-b3b8-0ca73b282520
---------- 21 root root    0 Apr 20 23:05 82cfa2b8-0604-4f7c-bf18-5db47a4cc727
---------- 21 root root    0 Apr 20 23:05 8ea1dc4b-801a-49fb-bff2-7579826f5942
---------- 21 root root    0 Apr 20 23:05 9cc746a7-5b47-4b6a-990f-99f378b787bf
---------- 21 root root    0 Apr 20 23:05 aa1cbfb7-5661-4faf-a3e5-7607a2a2b884
---------- 21 root root    0 Apr 20 23:05 b23c5c1d-1076-43eb-a527-01970d5565ab
---------- 21 root root    0 Apr 20 23:05 b336fb93-82a5-45e8-bacc-66a270a12f3f
---------- 21 root root    0 Apr 20 23:05 bc0ada39-a479-4f29-949c-b8b110291699
---------- 21 root root    0 Apr 20 23:05 c7bd2930-3605-47bc-afc7-67254c8309b0
---------- 21 root root    0 Apr 20 23:05 ce517ec5-079a-41d6-b3f6-6b06cc88ace5
---------- 21 root root    0 Apr 20 23:05 d6d949e2-df27-4428-b4bf-5e589c224fec
---------- 21 root root    0 Apr 20 23:05 fc96f94f-4da0-4003-beca-91738e902c03
---------- 21 root root    0 Apr 20 23:05 fe7a2f27-a98a-4958-9024-85bf8671e612
---------- 21 root root    0 Apr 20 23:05 xattrop-bcb327ae-e265-4cd3-b6d4-9babff815ee7




2013/4/20 Eric Sandeen <sandeen@sandeen.net>

> On 4/20/13 3:10 AM, 符永涛 wrote:
> > Dear Eric,
> > I have applied your latest patch and collected the following log:
> >
>
> If you like, I think you could drop the below patch again; it didn't yield
> anything interesting and just makes for bigger trace logs.
>
> Every mode was 0100000 (S_ISREG / regular file) and flags were always 0.
>
> >         You could also add this patch to the xfs tracepoints to print
> more information about the inodes - the mode & flags.
> >
> >         -Eric
> >
> >
> >         diff --git a/fs/xfs/linux-2.6/xfs_trace.h
> b/fs/xfs/linux-2.6/xfs_trace.h
> >         index e8ce644..c314b87 100644
> >         --- a/fs/xfs/linux-2.6/xfs_trace.h
> >         +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >         @@ -544,14 +544,18 @@ DECLARE_EVENT_CLASS(xfs_inode_class,
> >                 TP_STRUCT__entry(
> >                         __field(dev_t, dev)
> >                         __field(xfs_ino_t, ino)
> >         +               __field(__u16, mode)
> >         +               __field(unsigned long, flags)
> >                 ),
> >                 TP_fast_assign(
> >                         __entry->dev = VFS_I(ip)->i_sb->s_dev;
> >                         __entry->ino = ip->i_ino;
> >         +               __entry->mode = VFS_I(ip)->i_mode;
> >         +               __entry->flags = ip->i_flags;
> >                 ),
> >         -       TP_printk("dev %d:%d ino 0x%llx",
> >         +       TP_printk("dev %d:%d ino 0x%llx mode 0%o, flags 0x%lx",
> >                           MAJOR(__entry->dev), MINOR(__entry->dev),
> >         -                 __entry->ino)
> >         +                 __entry->ino, __entry->mode, __entry->flags)
> >          )
> >
> >          #define DEFINE_INODE_EVENT(name) \
> >
> >
> >
> >
> >
> >
> >     --
> >     符永涛
> >
> >
> >
> >
> > --
> > 符永涛
>
>


-- 
符永涛

[-- Attachment #1.2: Type: text/html, Size: 7111 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-15 23:14 xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging Brian Foster
  2013-04-16 16:24 ` Dave Chinner
@ 2013-04-22 19:59 ` Eric Sandeen
  2013-04-23  0:08   ` Dave Chinner
  1 sibling, 1 reply; 50+ messages in thread
From: Eric Sandeen @ 2013-04-22 19:59 UTC (permalink / raw)
  To: Brian Foster; +Cc: yongtaofu, xfs

On 4/15/13 6:14 PM, Brian Foster wrote:
> Hi,
> 
> Thanks for the data in the previous thread:
> 
> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> 
> I'm spinning off a new thread specifically for this because the original
> thread is already too large and scattered to track. As Eric stated,
> please try to keep data contained in as few messages as possible.
> 

Well, it's always simple in the end.  It just took a lot of debugging
to figure out what was happening - we do appreciate your help with that!

We were able to create a local reproducer, and it looks like
this patch fixes things:

commit aae8a97d3ec30788790d1720b71d76fd8eb44b73
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Sat Jan 29 18:43:27 2011 +0530

    fs: Don't allow to create hardlink for deleted file
    
    Add inode->i_nlink == 0 check in VFS. Some of the file systems
    do this internally. A followup patch will remove those instance.
    This is needed to ensure that with link by handle we don't allow
    to create hardlink of an unlinked file. The check also prevent a race
    between unlink and link
    
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/fs/namei.c b/fs/namei.c
index 83e92ba..33be51a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2906,7 +2906,11 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 		return error;
 
 	mutex_lock(&inode->i_mutex);
-	error = dir->i_op->link(old_dentry, dir, new_dentry);
+	/* Make sure we don't allow creating hardlink to an unlinked file */
+	if (inode->i_nlink == 0)
+		error =  -ENOENT;
+	else
+		error = dir->i_op->link(old_dentry, dir, new_dentry);
 	mutex_unlock(&inode->i_mutex);
 	if (!error)
 		fsnotify_link(dir, inode, new_dentry);
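
The effect of the added check can be illustrated outside the kernel. The following is a minimal user-space sketch, not kernel code: `struct model_inode` and every `model_*` function are hypothetical stand-ins introduced only for illustration. It replays the losing side of the race, where link() has already looked up the inode but the final unlink completes before the link proceeds.

```c
#include <errno.h>

/* Hypothetical stand-in for struct inode; only the link count matters here. */
struct model_inode {
	unsigned int i_nlink;
};

/* Stand-in for dir->i_op->link(): the filesystem bumps the link count. */
static int model_fs_link(struct model_inode *inode)
{
	inode->i_nlink++;
	return 0;
}

/* Stand-in for an unlink that drops one link. */
static void model_unlink(struct model_inode *inode)
{
	if (inode->i_nlink > 0)
		inode->i_nlink--;
}

/* Mirrors the patched logic: refuse to link an inode with no links left. */
static int model_vfs_link(struct model_inode *inode)
{
	if (inode->i_nlink == 0)
		return -ENOENT;
	return model_fs_link(inode);
}

/* The inode was found by lookup, then the last link went away first. */
static int model_link_after_unlink(void)
{
	struct model_inode inode = { .i_nlink = 1 };

	model_unlink(&inode);
	return model_vfs_link(&inode);
}
```

In this model, `model_link_after_unlink()` returns `-ENOENT` and the filesystem link operation is never reached, so the link count cannot go from 0 back to 1.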



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-22 19:59 ` Eric Sandeen
@ 2013-04-23  0:08   ` Dave Chinner
  2013-04-23  0:52     ` Eric Sandeen
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2013-04-23  0:08 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, yongtaofu, xfs

On Mon, Apr 22, 2013 at 02:59:54PM -0500, Eric Sandeen wrote:
> On 4/15/13 6:14 PM, Brian Foster wrote:
> > Hi,
> > 
> > Thanks for the data in the previous thread:
> > 
> > http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> > 
> > I'm spinning off a new thread specifically for this because the original
> > thread is already too large and scattered to track. As Eric stated,
> > please try to keep data contained in as few messages as possible.
> > 
> 
> Well, it's always simple in the end.  It just took a lot of debugging
> to figure out what was happening - we do appreciate your help with that!
> 
> We were able to create a local reproducer, and it looks like
> this patch fixes things:
> 
> commit aae8a97d3ec30788790d1720b71d76fd8eb44b73
> Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Date:   Sat Jan 29 18:43:27 2011 +0530
> 
>     fs: Don't allow to create hardlink for deleted file

Good find Eric - great work on the reproducer script.

FWIW, can you confirm that the debug kernel assert on a non-zero link
count in xfs_bumplink() fails with your test case?

int
xfs_bumplink(
        xfs_trans_t *tp,
        xfs_inode_t *ip)
{
        xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);

>>>>>   ASSERT(ip->i_d.di_nlink > 0);
        ip->i_d.di_nlink++;
        inc_nlink(VFS_I(ip));

If it does, we should consider this an in-memory corruption case and
return and trigger a shutdown here....
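
As a rough illustration of that idea, here is a user-space sketch rather than an actual XFS patch. `struct model_xfs_inode`, `model_bumplink()` and `MODEL_EFSCORRUPTED` are hypothetical names, and the `fs_shut_down` flag merely stands in for forcing a filesystem shutdown. The point is only that a zero link count is reported as corruption instead of being incremented.

```c
#include <errno.h>

/* Assumption: model the corruption error as EUCLEAN, with a fallback value. */
#ifdef EUCLEAN
#define MODEL_EFSCORRUPTED EUCLEAN
#else
#define MODEL_EFSCORRUPTED 117
#endif

/* Hypothetical stand-in for the in-core inode plus a shutdown flag. */
struct model_xfs_inode {
	unsigned int di_nlink;
	int fs_shut_down;	/* set when the model "shuts down" the fs */
};

/* Bump the link count, but treat a zero count as in-memory corruption. */
static int model_bumplink(struct model_xfs_inode *ip)
{
	if (ip->di_nlink == 0) {
		ip->fs_shut_down = 1;
		return -MODEL_EFSCORRUPTED;
	}
	ip->di_nlink++;
	return 0;
}
```

With this shape the failure is reported at the point where the bad link count is first seen, rather than later when the inode is removed from the unlinked list.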

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-23  0:08   ` Dave Chinner
@ 2013-04-23  0:52     ` Eric Sandeen
  2013-04-23  1:31       ` 符永涛
  2013-04-24  9:02       ` Dave Chinner
  0 siblings, 2 replies; 50+ messages in thread
From: Eric Sandeen @ 2013-04-23  0:52 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Brian Foster, yongtaofu, xfs

On 4/22/13 7:08 PM, Dave Chinner wrote:
> On Mon, Apr 22, 2013 at 02:59:54PM -0500, Eric Sandeen wrote:
>> On 4/15/13 6:14 PM, Brian Foster wrote:
>>> Hi,
>>>
>>> Thanks for the data in the previous thread:
>>>
>>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
>>>
>>> I'm spinning off a new thread specifically for this because the original
>>> thread is already too large and scattered to track. As Eric stated,
>>> please try to keep data contained in as few messages as possible.
>>>
>>
>> Well, it's always simple in the end.  It just took a lot of debugging
>> to figure out what was happening - we do appreciate your help with that!
>>
>> We were able to create a local reproducer, and it looks like
>> this patch fixes things:
>>
>> commit aae8a97d3ec30788790d1720b71d76fd8eb44b73
>> Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> Date:   Sat Jan 29 18:43:27 2011 +0530
>>
>>     fs: Don't allow to create hardlink for deleted file
> 
> Good find Eric - great work on the reproducer script.
> 
> FWIW, can you confirm that the debug kernel assert on a non-zero link
> count in xfs_bumplink() fails with your test case?
> 
> int
> xfs_bumplink(
>         xfs_trans_t *tp,
>         xfs_inode_t *ip)
> {
>         xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
> 
>>>>>>   ASSERT(ip->i_d.di_nlink > 0);

Yep, it does, I put a printk in there when I was testing
and it fired.

Guess we should have tested a debug xfs right off the bat ;)

>         ip->i_d.di_nlink++;
>         inc_nlink(VFS_I(ip));
> 
> If it does, we should consider this an in-memory corruption case and
> return and trigger a shutdown here....

I suppose that makes sense, it'd be a much less cryptic failure for
something that will fail soon anyway.

-Eric

> Cheers,
> 
> Dave.
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-23  0:52     ` Eric Sandeen
@ 2013-04-23  1:31       ` 符永涛
  2013-04-24  9:02       ` Dave Chinner
  1 sibling, 0 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-23  1:31 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, xfs


[-- Attachment #1.1: Type: text/plain, Size: 2124 bytes --]

Terrific. Thank you Eric, Brian, and xfs experts. You have helped us a lot.
I'll test the patch.




-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-23  0:52     ` Eric Sandeen
  2013-04-23  1:31       ` 符永涛
@ 2013-04-24  9:02       ` Dave Chinner
  2013-04-24 10:21         ` 符永涛
  1 sibling, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2013-04-24  9:02 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, yongtaofu, xfs

On Mon, Apr 22, 2013 at 07:52:51PM -0500, Eric Sandeen wrote:
> On 4/22/13 7:08 PM, Dave Chinner wrote:
> > On Mon, Apr 22, 2013 at 02:59:54PM -0500, Eric Sandeen wrote:
> >> On 4/15/13 6:14 PM, Brian Foster wrote:
> >>> Hi,
> >>>
> >>> Thanks for the data in the previous thread:
> >>>
> >>> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >>>
> >>> I'm spinning off a new thread specifically for this because the original
> >>> thread is already too large and scattered to track. As Eric stated,
> >>> please try to keep data contained in as few messages as possible.
> >>>
> >>
> >> Well, it's always simple in the end.  It just took a lot of debugging
> >> to figure out what was happening - we do appreciate your help with that!
> >>
> >> We were able to create a local reproducer, and it looks like
> >> this patch fixes things:
> >>
> >> commit aae8a97d3ec30788790d1720b71d76fd8eb44b73
> >> Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> >> Date:   Sat Jan 29 18:43:27 2011 +0530
> >>
> >>     fs: Don't allow to create hardlink for deleted file
> > 
> > Good find Eric - great work on the reproducer script.
> > 
> > FWIW, can you confirm that the debug kernel assert on a non-zero link
> > count in xfs_bumplink() fails with your test case?
> > 
> > int
> > xfs_bumplink(
> >         xfs_trans_t *tp,
> >         xfs_inode_t *ip)
> > {
> >         xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
> > 
> >>>>>>   ASSERT(ip->i_d.di_nlink > 0);
> 
> Yep, it does, I put a printk in there when I was testing
> and it fired.
> 
> Guess we should have tested a debug xfs right off the bat ;)

Perhaps, but that may have changed the timing sufficiently to make
the race go away. What we really needed was a way to just turn the
assert into a WARN_ON() without all the other debug code like we've
previously talked about. So, rather than talk about it again, I
posted patches to do this....
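
To make the distinction concrete, here is a small user-space sketch of an assert that only warns. It is not the posted patches: `MODEL_ASSERT_WARN`, `model_warn_count` and `model_bumplink_warn()` are hypothetical names. A failed check is logged and counted, and execution continues, so timing is barely disturbed.

```c
#include <stdio.h>

/* Counts failed checks so a caller can see that a warning fired. */
static unsigned int model_warn_count;

/* Warn-only assert: report the failed expression, then keep running. */
#define MODEL_ASSERT_WARN(expr)						\
	do {								\
		if (!(expr)) {						\
			model_warn_count++;				\
			fprintf(stderr, "assertion failed: %s, %s:%d\n",	\
				#expr, __FILE__, __LINE__);		\
		}							\
	} while (0)

/* Model of a link-count bump guarded by the warn-only assert. */
static unsigned int model_bumplink_warn(unsigned int nlink)
{
	MODEL_ASSERT_WARN(nlink > 0);
	return nlink + 1;
}
```

Calling `model_bumplink_warn(0)` prints one message to stderr and still returns, which is the behavior wanted when the goal is to catch the bad state without halting the machine.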

> >         ip->i_d.di_nlink++;
> >         inc_nlink(VFS_I(ip));
> > 
> > If it does, we should consider this an in-memory corruption case and
> > return and trigger a shutdown here....
> 
> I suppose that makes sense, it'd be a much less cryptic failure for
> something that will fail soon anyway.

Exactly.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-24  9:02       ` Dave Chinner
@ 2013-04-24 10:21         ` 符永涛
  2013-04-25  0:48           ` 符永涛
  0 siblings, 1 reply; 50+ messages in thread
From: 符永涛 @ 2013-04-24 10:21 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Brian Foster, Eric Sandeen, xfs


[-- Attachment #1.1: Type: text/plain, Size: 5779 bytes --]

Dear Eric and Dave,
The xfs shutdown seems to have gone away; however, one of our servers
reported the following error, which made glusterfsd hang again. Is this just
related to high load, or is it the same issue showing different behavior
after the VFS change?
Apr 24 12:35:07 10 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Apr 24 12:37:07 10 kernel: INFO: task glusterfsd:5835 blocked for more than 120 seconds.
Apr 24 12:37:07 10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 24 12:37:07 10 kernel: glusterfsd    D 0000000000000003     0  5835      1 0x00000080
Apr 24 12:37:07 10 kernel: ffff88100ed77a28 0000000000000082 0000000000000000 ffff8818e843cdd8
Apr 24 12:37:07 10 kernel: ffff8810177c1bc0 ffff8818e8422ea0 0000000000004004 ffff882019453000
Apr 24 12:37:07 10 kernel: ffff88101609b098 ffff88100ed77fd8 000000000000fb88 ffff88101609b098
Apr 24 12:37:07 10 kernel: Call Trace:
Apr 24 12:37:07 10 kernel: [<ffffffff814eaad5>] schedule_timeout+0x215/0x2e0
Apr 24 12:37:07 10 kernel: [<ffffffffa02a4978>] ? xfs_da_do_buf+0x618/0x770 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffff814eb9f2>] __down+0x72/0xb0
Apr 24 12:37:07 10 kernel: [<ffffffffa02daae2>] ? _xfs_buf_find+0x102/0x280 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffff810967f1>] down+0x41/0x50
Apr 24 12:37:07 10 kernel: [<ffffffffa02da923>] xfs_buf_lock+0x53/0x110 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02daae2>] _xfs_buf_find+0x102/0x280 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02daccb>] xfs_buf_get+0x6b/0x1a0 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02db33c>] xfs_buf_read+0x2c/0x100 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02d0f88>] xfs_trans_read_buf+0x1f8/0x400 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02b3774>] xfs_read_agi+0x74/0x100 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02b999b>] xfs_iunlink+0x4b/0x170 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffff81070f97>] ? current_fs_time+0x27/0x30
Apr 24 12:37:07 10 kernel: [<ffffffffa02d1737>] ? xfs_trans_ichgtime+0x27/0xa0 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02d1a8b>] xfs_droplink+0x5b/0x70 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffffa02d342e>] xfs_remove+0x27e/0x3a0 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffff8118215c>] ? generic_permission+0x5c/0xb0
Apr 24 12:37:07 10 kernel: [<ffffffffa02e0da8>] xfs_vn_unlink+0x48/0x90 [xfs]
Apr 24 12:37:07 10 kernel: [<ffffffff81183d6f>] vfs_unlink+0x9f/0xe0
Apr 24 12:37:07 10 kernel: [<ffffffff81182aaa>] ? lookup_hash+0x3a/0x50
Apr 24 12:37:07 10 kernel: [<ffffffff811862a3>] do_unlinkat+0x183/0x1c0
Apr 24 12:37:07 10 kernel: [<ffffffff8117b876>] ? sys_newstat+0x36/0x50
Apr 24 12:37:07 10 kernel: [<ffffffff811862f6>] sys_unlink+0x16/0x20
Apr 24 12:37:07 10 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
.

BTW:
I use kernel 279.19.1
        mutex_lock(&inode->i_mutex);
        /* Make sure we don't allow creating hardlink to an unlinked file */
        if (inode->i_nlink == 0)
                error =  -ENOENT;
        else
                vfs_dq_init(dir);
                error = dir->i_op->link(old_dentry, dir, new_dentry);
        mutex_unlock(&inode->i_mutex);
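
One thing that stands out in the backported hunk above is that the else branch has two statements but no braces, so in C only vfs_dq_init(dir) belongs to the else and the link call runs unconditionally, overwriting the -ENOENT. Whether this is what caused the hang is an assumption; the thread does not say. The user-space sketch below, with hypothetical `model_*` stand-ins for the kernel calls, contrasts the two shapes.

```c
#include <errno.h>

/* Counts how often the stand-in filesystem link operation runs. */
static int model_link_calls;

/* Stand-in for dir->i_op->link(). */
static int model_fs_link(void)
{
	model_link_calls++;
	return 0;
}

/* Stand-in for vfs_dq_init(); does nothing in this model. */
static void model_dq_init(void)
{
}

/* Shape of the hunk as posted: only the quota init is under the else. */
static int model_link_as_posted(unsigned int nlink)
{
	int error;

	if (nlink == 0)
		error = -ENOENT;
	else
		model_dq_init();
	error = model_fs_link();	/* always runs, overwrites -ENOENT */
	return error;
}

/* Braced variant: both statements belong to the else branch. */
static int model_link_braced(unsigned int nlink)
{
	int error;

	if (nlink == 0) {
		error = -ENOENT;
	} else {
		model_dq_init();
		error = model_fs_link();
	}
	return error;
}
```

In the model, `model_link_as_posted(0)` still calls the link operation and returns 0, while `model_link_braced(0)` returns `-ENOENT` without calling it.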

Thank you.





-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
  2013-04-24 10:21         ` 符永涛
@ 2013-04-25  0:48           ` 符永涛
  0 siblings, 0 replies; 50+ messages in thread
From: 符永涛 @ 2013-04-25  0:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Brian Foster, Eric Sandeen, xfs


[-- Attachment #1.1: Type: text/plain, Size: 6161 bytes --]

Sorry, I got it wrong. I'll change it a little and test again. Thank
you.





-- 
符永涛


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2013-04-25  0:48 UTC | newest]

Thread overview: 50+ messages
2013-04-15 23:14 xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging Brian Foster
2013-04-16 16:24 ` Dave Chinner
2013-04-16 17:18   ` Brian Foster
2013-04-17  1:04     ` 符永涛
2013-04-17  1:35       ` 符永涛
2013-04-17  3:15         ` 符永涛
2013-04-17  3:48           ` 符永涛
2013-04-17  4:28             ` Eric Sandeen
2013-04-18  1:30               ` 符永涛
2013-04-18  6:45                 ` 符永涛
2013-04-18  8:25                   ` 符永涛
2013-04-18 11:41                     ` Brian Foster
2013-04-18 15:23                       ` 符永涛
2013-04-18 16:40                         ` 符永涛
2013-04-18 17:03                         ` Eric Sandeen
2013-04-18 18:35                         ` Eric Sandeen
2013-04-18 20:59                         ` Brian Foster
2013-04-19  6:40                           ` 符永涛
2013-04-19 11:41                             ` 符永涛
2013-04-19 14:59                               ` Eric Sandeen
2013-04-19 15:13                                 ` 符永涛
2013-04-19 15:18                                   ` 符永涛
2013-04-19 16:16                                     ` Eric Sandeen
2013-04-19 16:47                                       ` 符永涛
2013-04-19 17:00                                         ` 符永涛
2013-04-19 17:04                                           ` Eric Sandeen
2013-04-19 17:08                                             ` 符永涛
2013-04-19 17:17                                               ` 符永涛
2013-04-20  0:03                                                 ` 符永涛
2013-04-20  1:15                                                   ` 符永涛
2013-04-20  2:51                                                     ` 符永涛
2013-04-20  3:40                                                       ` Eric Sandeen
2013-04-20  4:03                                                         ` 符永涛
2013-04-20  4:11                                                           ` 符永涛
2013-04-20  4:20                                                           ` Eric Sandeen
2013-04-20  4:27                                                             ` 符永涛
2013-04-20 10:10                                                               ` 符永涛
2013-04-20 11:38                                                                 ` Brian Foster
2013-04-20 11:52                                                                   ` 符永涛
2013-04-20 12:58                                                                     ` Brian Foster
2013-04-20 13:12                                                                       ` 符永涛
2013-04-20 15:36                                                                   ` Eric Sandeen
     [not found]                                                                 ` <5172B73C.6000900@sandeen.net>
2013-04-20 23:52                                                                   ` 符永涛
2013-04-22 19:59 ` Eric Sandeen
2013-04-23  0:08   ` Dave Chinner
2013-04-23  0:52     ` Eric Sandeen
2013-04-23  1:31       ` 符永涛
2013-04-24  9:02       ` Dave Chinner
2013-04-24 10:21         ` 符永涛
2013-04-25  0:48           ` 符永涛
