* [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
@ 2017-10-06 20:07 Brian Foster
  2017-10-06 20:47 ` Darrick J. Wong
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Foster @ 2017-10-06 20:07 UTC (permalink / raw)
  To: linux-xfs

xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
associated blocks. xfs_attr3_node_inactive() recursively descends
from internal blocks to leaf blocks, caching block address values
along the way to revisit parent blocks, locate the next entry and
descend down that branch of the tree.

The code that attempts to reread the parent block is unsafe because
it assumes that the local xfs_da_node_entry pointer remains valid
after an xfs_trans_brelse() and re-read of the parent buffer. Under
heavy memory pressure, it is possible that the buffer has been
reclaimed and reallocated by the time the parent block is reread.
This means that 'btree' can point to an invalid memory address, lead
to a random/garbage value for child_fsb and cause the subsequent
read of the attr fork to go off the rails and return a NULL buffer
for an attr fork offset that is most likely not allocated.

Note that this problem can be manufactured by setting
XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
creating a file with a multi-level attr fork and removing it to
trigger inactivation.

To address this problem, reinit the node/btree pointers to the
parent buffer after it has been re-read. This ensures btree points
to a valid record and allows the walk to proceed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

I suspect this is the cause of the NULL buf problem down in
xfs_attr_inactive(). I can manufacture an instance of that problem as
noted above. We have a customer who's hitting that problem and will
attempt to validate this fix, but there is no confirmation as of yet.
I'm posting this for review in the meantime because this seems like a
legit fix regardless of whether they are hitting this or something else.
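
For anyone who wants to see the pattern outside the kernel, here is a
rough userspace sketch in C. It is not the XFS code: demo_buf,
demo_entry, demo_read(), demo_release() and demo_next_child() are
made-up stand-ins, and demo_read() hands back freshly allocated memory
on every call to mimic a buffer that was reclaimed between reads. The
only point is that the entry pointer has to be re-derived from b_addr
after each re-read.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real XFS types. */
struct demo_entry { uint32_t before; };
struct demo_buf { void *b_addr; };

#define DEMO_NENTRIES 4

/* Pretend on-disk contents of the parent node block. */
static const struct demo_entry demo_disk[DEMO_NENTRIES] = {
	{ 10 }, { 20 }, { 30 }, { 40 },
};

/* Each read lands in freshly allocated memory, as it might after reclaim. */
static int demo_read(struct demo_buf *bp)
{
	bp->b_addr = malloc(sizeof(demo_disk));
	if (!bp->b_addr)
		return -1;
	memcpy(bp->b_addr, demo_disk, sizeof(demo_disk));
	return 0;
}

/* Release frees the memory; pointers into it are dangling afterwards. */
static void demo_release(struct demo_buf *bp)
{
	assert(bp->b_addr != NULL);
	free(bp->b_addr);
	bp->b_addr = NULL;
}

/* Return the block number in entry i + 1 of the parent, or -1 on error. */
static long demo_next_child(int i)
{
	struct demo_buf bp;
	struct demo_entry *btree;
	uint32_t child_fsb;

	if (i < 0 || i + 1 >= DEMO_NENTRIES)
		return -1;

	if (demo_read(&bp))
		return -1;
	btree = bp.b_addr;
	child_fsb = btree[i].before;	/* fine: the buffer is still held */
	demo_release(&bp);		/* btree is stale from here on */

	/* ... the walk would descend into child_fsb here ... */

	if (demo_read(&bp))
		return -1;
	btree = bp.b_addr;		/* re-derive before touching btree[] */
	child_fsb = btree[i + 1].before;
	demo_release(&bp);

	return (long)child_fsb;
}
```

Dropping the second "btree = bp.b_addr" assignment in this sketch turns
the last read into a use-after-free, which is the shape of the bug
described above.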

Brian

 fs/xfs/xfs_attr_inactive.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
index ebd66b1..e3a950e 100644
--- a/fs/xfs/xfs_attr_inactive.c
+++ b/fs/xfs/xfs_attr_inactive.c
@@ -302,6 +302,8 @@ xfs_attr3_node_inactive(
 						 &bp, XFS_ATTR_FORK);
 			if (error)
 				return error;
+			node = bp->b_addr;
+			btree = dp->d_ops->node_tree_p(node);
 			child_fsb = be32_to_cpu(btree[i + 1].before);
 			xfs_trans_brelse(*trans, bp);
 		}
-- 
2.9.5



* Re: [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
  2017-10-06 20:07 [PATCH] xfs: reinit btree pointer on attr tree inactivation walk Brian Foster
@ 2017-10-06 20:47 ` Darrick J. Wong
  2017-10-07 12:14   ` Brian Foster
  2017-10-11 17:11   ` Marco Benatto
  0 siblings, 2 replies; 7+ messages in thread
From: Darrick J. Wong @ 2017-10-06 20:47 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, Oct 06, 2017 at 04:07:40PM -0400, Brian Foster wrote:
> xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> associated blocks. xfs_attr3_node_inactive() recursively descends
> from internal blocks to leaf blocks, caching block address values
> along the way to revisit parent blocks, locate the next entry and
> descend down that branch of the tree.
> 
> The code that attempts to reread the parent block is unsafe because
> it assumes that the local xfs_da_node_entry pointer remains valid
> after an xfs_trans_brelse() and re-read of the parent buffer. Under
> heavy memory pressure, it is possible that the buffer has been
> reclaimed and reallocated by the time the parent block is reread.
> This means that 'btree' can point to an invalid memory address, lead
> to a random/garbage value for child_fsb and cause the subsequent
> read of the attr fork to go off the rails and return a NULL buffer
> for an attr fork offset that is most likely not allocated.
> 
> Note that this problem can be manufactured by setting
> XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> creating a file with a multi-level attr fork and removing it to
> trigger inactivation.
> 
> To address this problem, reinit the node/btree pointers to the
> parent buffer after it has been re-read. This ensures btree points
> to a valid record and allows the walk to proceed.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

/me wonders if this is a good enough reason to introduce a new errortag
that turns xfs_buf_set_ref into a no-op and fills bp->b_addr with
garbage prior to releasing the memory to weed out any other dangling
pointers?

> ---
> 
> I suspect this is the cause of the NULL buf problem down in
> xfs_attr_inactive(). I can manufacture an instance of that problem as
> noted above. We have a customer who's hitting that problem and will
> attempt to validate this fix, but there is no confirmation as of yet.
> I'm posting this for review in the meantime because this seems like a
> legit fix regardless of whether they are hitting this or something else.

Let me know what they report back.

--D

> Brian
> 
>  fs/xfs/xfs_attr_inactive.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> index ebd66b1..e3a950e 100644
> --- a/fs/xfs/xfs_attr_inactive.c
> +++ b/fs/xfs/xfs_attr_inactive.c
> @@ -302,6 +302,8 @@ xfs_attr3_node_inactive(
>  						 &bp, XFS_ATTR_FORK);
>  			if (error)
>  				return error;
> +			node = bp->b_addr;
> +			btree = dp->d_ops->node_tree_p(node);
>  			child_fsb = be32_to_cpu(btree[i + 1].before);
>  			xfs_trans_brelse(*trans, bp);
>  		}
> -- 
> 2.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
  2017-10-06 20:47 ` Darrick J. Wong
@ 2017-10-07 12:14   ` Brian Foster
  2017-10-11 17:11   ` Marco Benatto
  1 sibling, 0 replies; 7+ messages in thread
From: Brian Foster @ 2017-10-07 12:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Oct 06, 2017 at 01:47:51PM -0700, Darrick J. Wong wrote:
> On Fri, Oct 06, 2017 at 04:07:40PM -0400, Brian Foster wrote:
> > xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> > associated blocks. xfs_attr3_node_inactive() recursively descends
> > from internal blocks to leaf blocks, caching block address values
> > along the way to revisit parent blocks, locate the next entry and
> > descend down that branch of the tree.
> > 
> > The code that attempts to reread the parent block is unsafe because
> > it assumes that the local xfs_da_node_entry pointer remains valid
> > after an xfs_trans_brelse() and re-read of the parent buffer. Under
> > heavy memory pressure, it is possible that the buffer has been
> > reclaimed and reallocated by the time the parent block is reread.
> > This means that 'btree' can point to an invalid memory address, lead
> > to a random/garbage value for child_fsb and cause the subsequent
> > read of the attr fork to go off the rails and return a NULL buffer
> > for an attr fork offset that is most likely not allocated.
> > 
> > Note that this problem can be manufactured by setting
> > XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> > creating a file with a multi-level attr fork and removing it to
> > trigger inactivation.
> > 
> > To address this problem, reinit the node/btree pointers to the
> > parent buffer after it has been re-read. This ensures btree points
> > to a valid record and allows the walk to proceed.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> 
> Looks ok,
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> /me wonders if this is a good enough reason to introduce a new errortag
> that turns xfs_buf_set_ref into a no-op and fills bp->b_addr with
> garbage prior to releasing the memory to weed out any other dangling
> pointers?
> 

I was starting to think about if/how we could test for this but hadn't
come up with anything concrete yet. An errortag that forces
_buf_set_ref() to set ->b_lru_ref to 0 for all buffer types might be
simple and effective enough to do the trick. I'll play around with that
next week.

Note that with my roughly equivalent hack in place, I didn't have to
overwrite the buffer content to observe the tree walk go off the
rails. I haven't confirmed why, but perhaps the memory just happens to
be reused fairly quickly for the next buffer (we have a zone for
buffers, after all). I don't have poisoning or anything like that
enabled that I know of.
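
To make the poisoning idea a bit more concrete, here is a userspace
sketch in C. The names (demo_pbuf, demo_pbuf_poison(), DEMO_POISON and
so on) are made up and have nothing to do with the real xfs_buf code or
any existing errortag; it only shows the general "fill with a known
pattern before freeing" approach, so that a stale reader sees an
obviously bogus value rather than plausible old contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_POISON 0x5a	/* arbitrary pattern, chosen for this sketch */

/* Hypothetical buffer type for illustration only. */
struct demo_pbuf {
	unsigned char *b_addr;
	size_t b_len;
};

static int demo_pbuf_alloc(struct demo_pbuf *bp, size_t len)
{
	bp->b_addr = calloc(1, len);
	if (!bp->b_addr)
		return -1;
	bp->b_len = len;
	return 0;
}

/* Fill the buffer contents with the poison pattern. */
static void demo_pbuf_poison(struct demo_pbuf *bp)
{
	assert(bp->b_addr != NULL);
	memset(bp->b_addr, DEMO_POISON, bp->b_len);
}

/* Return 1 if all len bytes at p carry the poison pattern, else 0. */
static int demo_is_poisoned(const unsigned char *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] != DEMO_POISON)
			return 0;
	return 1;
}

/*
 * Poison, then free. Note that an optimizing compiler is allowed to
 * drop a memset() that immediately precedes free().
 */
static void demo_pbuf_release(struct demo_pbuf *bp)
{
	demo_pbuf_poison(bp);
	free(bp->b_addr);
	bp->b_addr = NULL;
	bp->b_len = 0;
}

/* Self-check: 1 if poisoning is visible before the free, 0 otherwise. */
static int demo_pbuf_selftest(void)
{
	struct demo_pbuf bp;
	int ok;

	if (demo_pbuf_alloc(&bp, 64))
		return 0;
	ok = !demo_is_poisoned(bp.b_addr, bp.b_len);	/* zeroed, not poison */
	demo_pbuf_poison(&bp);
	ok = ok && demo_is_poisoned(bp.b_addr, bp.b_len);
	demo_pbuf_release(&bp);
	return ok && bp.b_addr == NULL;
}
```

With something along these lines, a stale child_fsb read would come back
as a recognizable 0x5a5a5a5a-style value, which might be easier to
assert on than a NULL buffer.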

We may need a better way to know whether the test failed, however, since
the kernel now handles the !bpp case without crashing (as it should).
Perhaps we can assert that we don't hit a hole in this case, but that
has me wondering why we pass -2 to _da3_node_read() and skip over the
!child_bp case in xfs_attr3_node_inactive() in the first place. Is there
some legitimate case where we can hit a hole here that we don't know
about? I suppose adding an assert might be a more cautious first step
than switching it to -1 and potentially subjecting real users to
corruption errors. Thoughts?

> > ---
> > 
> > I suspect this is the cause of the NULL buf problem down in
> > xfs_attr_inactive(). I can manufacture an instance of that problem as
> > noted above. We have a customer who's hitting that problem and will
> > attempt to validate this fix, but there is no confirmation as of yet.
> > I'm posting this for review in the meantime because this seems like a
> > legit fix regardless of whether they are hitting this or something else.
> 
> Let me know what they report back.
> 

Will do, thanks.

Brian

> --D
> 
> > Brian
> > 
> >  fs/xfs/xfs_attr_inactive.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > index ebd66b1..e3a950e 100644
> > --- a/fs/xfs/xfs_attr_inactive.c
> > +++ b/fs/xfs/xfs_attr_inactive.c
> > @@ -302,6 +302,8 @@ xfs_attr3_node_inactive(
> >  						 &bp, XFS_ATTR_FORK);
> >  			if (error)
> >  				return error;
> > +			node = bp->b_addr;
> > +			btree = dp->d_ops->node_tree_p(node);
> >  			child_fsb = be32_to_cpu(btree[i + 1].before);
> >  			xfs_trans_brelse(*trans, bp);
> >  		}
> > -- 
> > 2.9.5
> > 


* Re: [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
  2017-10-06 20:47 ` Darrick J. Wong
  2017-10-07 12:14   ` Brian Foster
@ 2017-10-11 17:11   ` Marco Benatto
  2017-10-11 17:30     ` Darrick J. Wong
  2017-11-06 22:26     ` Luis R. Rodriguez
  1 sibling, 2 replies; 7+ messages in thread
From: Marco Benatto @ 2017-10-11 17:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Brian Foster, linux-xfs

Hello all

On Fri, Oct 6, 2017 at 5:47 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Fri, Oct 06, 2017 at 04:07:40PM -0400, Brian Foster wrote:
> > xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> > associated blocks. xfs_attr3_node_inactive() recursively descends
> > from internal blocks to leaf blocks, caching block address values
> > along the way to revisit parent blocks, locate the next entry and
> > descend down that branch of the tree.
> >
> > The code that attempts to reread the parent block is unsafe because
> > it assumes that the local xfs_da_node_entry pointer remains valid
> > after an xfs_trans_brelse() and re-read of the parent buffer. Under
> > heavy memory pressure, it is possible that the buffer has been
> > reclaimed and reallocated by the time the parent block is reread.
> > This means that 'btree' can point to an invalid memory address, lead
> > to a random/garbage value for child_fsb and cause the subsequent
> > read of the attr fork to go off the rails and return a NULL buffer
> > for an attr fork offset that is most likely not allocated.
> >
> > Note that this problem can be manufactured by setting
> > XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> > creating a file with a multi-level attr fork and removing it to
> > trigger inactivation.
> >
> > To address this problem, reinit the node/btree pointers to the
> > parent buffer after it has been re-read. This ensures btree points
> > to a valid record and allows the walk to proceed.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
>
> Looks ok,
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
>
> /me wonders if this is a good enough reason to introduce a new errortag
> that turns xfs_buf_set_ref into a no-op and fills bp->b_addr with
> garbage prior to releasing the memory to weed out any other dangling
> pointers?
>
> > ---
> >
> > I suspect this is the cause of the NULL buf problem down in
> > xfs_attr_inactive(). I can manufacture an instance of that problem as
> > noted above. We have a customer who's hitting that problem and will
> > attempt to validate this fix, but there is no confirmation as of yet.
> > I'm posting this for review in the meantime because this seems like a
> > legit fix regardless of whether they are hitting this or something else.
>
> Let me know what they report back.

Just to let you know, we've got some news regarding this testing: the
patch seems to fix the issue they were hitting in
xfs_attr_inactive().

>
>
> --D
>
> > Brian
> >
> >  fs/xfs/xfs_attr_inactive.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > index ebd66b1..e3a950e 100644
> > --- a/fs/xfs/xfs_attr_inactive.c
> > +++ b/fs/xfs/xfs_attr_inactive.c
> > @@ -302,6 +302,8 @@ xfs_attr3_node_inactive(
> >                                                &bp, XFS_ATTR_FORK);
> >                       if (error)
> >                               return error;
> > +                     node = bp->b_addr;
> > +                     btree = dp->d_ops->node_tree_p(node);
> >                       child_fsb = be32_to_cpu(btree[i + 1].before);
> >                       xfs_trans_brelse(*trans, bp);
> >               }
> > --
> > 2.9.5
> >


Thanks,


-- 
Marco Benatto


* Re: [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
  2017-10-11 17:11   ` Marco Benatto
@ 2017-10-11 17:30     ` Darrick J. Wong
  2017-11-06 22:26     ` Luis R. Rodriguez
  1 sibling, 0 replies; 7+ messages in thread
From: Darrick J. Wong @ 2017-10-11 17:30 UTC (permalink / raw)
  To: Marco Benatto; +Cc: Brian Foster, linux-xfs

On Wed, Oct 11, 2017 at 02:11:44PM -0300, Marco Benatto wrote:
> Hello all
> 
> On Fri, Oct 6, 2017 at 5:47 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > On Fri, Oct 06, 2017 at 04:07:40PM -0400, Brian Foster wrote:
> > > xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> > > associated blocks. xfs_attr3_node_inactive() recursively descends
> > > from internal blocks to leaf blocks, caching block address values
> > > along the way to revisit parent blocks, locate the next entry and
> > > descend down that branch of the tree.
> > >
> > > The code that attempts to reread the parent block is unsafe because
> > > it assumes that the local xfs_da_node_entry pointer remains valid
> > > after an xfs_trans_brelse() and re-read of the parent buffer. Under
> > > heavy memory pressure, it is possible that the buffer has been
> > > reclaimed and reallocated by the time the parent block is reread.
> > > This means that 'btree' can point to an invalid memory address, lead
> > > to a random/garbage value for child_fsb and cause the subsequent
> > > read of the attr fork to go off the rails and return a NULL buffer
> > > for an attr fork offset that is most likely not allocated.
> > >
> > > Note that this problem can be manufactured by setting
> > > XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> > > creating a file with a multi-level attr fork and removing it to
> > > trigger inactivation.
> > >
> > > To address this problem, reinit the node/btree pointers to the
> > > parent buffer after it has been re-read. This ensures btree points
> > > to a valid record and allows the walk to proceed.
> > >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> >
> > Looks ok,
> > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > /me wonders if this is a good enough reason to introduce a new errortag
> > that turns xfs_buf_set_ref into a no-op and fills bp->b_addr with
> > garbage prior to releasing the memory to weed out any other dangling
> > pointers?
> >
> > > ---
> > >
> > > I suspect this is the cause of the NULL buf problem down in
> > > xfs_attr_inactive(). I can manufacture an instance of that problem as
> > > noted above. We have a customer who's hitting that problem and will
> > > attempt to validate this fix, but there is no confirmation as of yet.
> > > I'm posting this for review in the meantime because this seems like a
> > > legit fix regardless of whether they are hitting this or something else.
> >
> > Let me know what they report back.
> 
> Just to let you know, we've got some news regarding this testing: the
> patch seems to fix the issue they were hitting in
> xfs_attr_inactive().

Ok, I'll queue this up for the next update.  Thank you for confirming!

--D

> 
> >
> >
> > --D
> >
> > > Brian
> > >
> > >  fs/xfs/xfs_attr_inactive.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
> > > index ebd66b1..e3a950e 100644
> > > --- a/fs/xfs/xfs_attr_inactive.c
> > > +++ b/fs/xfs/xfs_attr_inactive.c
> > > @@ -302,6 +302,8 @@ xfs_attr3_node_inactive(
> > >                                                &bp, XFS_ATTR_FORK);
> > >                       if (error)
> > >                               return error;
> > > +                     node = bp->b_addr;
> > > +                     btree = dp->d_ops->node_tree_p(node);
> > >                       child_fsb = be32_to_cpu(btree[i + 1].before);
> > >                       xfs_trans_brelse(*trans, bp);
> > >               }
> > > --
> > > 2.9.5
> > >
> 
> 
> Thanks,
> 
> 
> -- 
> Marco Benatto


* Re: [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
  2017-10-11 17:11   ` Marco Benatto
  2017-10-11 17:30     ` Darrick J. Wong
@ 2017-11-06 22:26     ` Luis R. Rodriguez
  2017-11-07 11:02       ` Brian Foster
  1 sibling, 1 reply; 7+ messages in thread
From: Luis R. Rodriguez @ 2017-11-06 22:26 UTC (permalink / raw)
  To: Marco Benatto; +Cc: Darrick J. Wong, Brian Foster, linux-xfs

On Wed, Oct 11, 2017 at 02:11:44PM -0300, Marco Benatto wrote:
> Hello all
> 
> On Fri, Oct 6, 2017 at 5:47 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > On Fri, Oct 06, 2017 at 04:07:40PM -0400, Brian Foster wrote:
> > > xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> > > associated blocks. xfs_attr3_node_inactive() recursively descends
> > > from internal blocks to leaf blocks, caching block address values
> > > along the way to revisit parent blocks, locate the next entry and
> > > descend down that branch of the tree.
> > >
> > > The code that attempts to reread the parent block is unsafe because
> > > it assumes that the local xfs_da_node_entry pointer remains valid
> > > after an xfs_trans_brelse() and re-read of the parent buffer. Under
> > > heavy memory pressure, it is possible that the buffer has been
> > > reclaimed and reallocated by the time the parent block is reread.
> > > This means that 'btree' can point to an invalid memory address, lead
> > > to a random/garbage value for child_fsb and cause the subsequent
> > > read of the attr fork to go off the rails and return a NULL buffer
> > > for an attr fork offset that is most likely not allocated.
> > >
> > > Note that this problem can be manufactured by setting
> > > XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> > > creating a file with a multi-level attr fork and removing it to
> > > trigger inactivation.
> > >
> > > To address this problem, reinit the node/btree pointers to the
> > > parent buffer after it has been re-read. This ensures btree points
> > > to a valid record and allows the walk to proceed.
> > >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> >
> > Looks ok,
> > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > /me wonders if this is a good enough reason to introduce a new errortag
> > that turns xfs_buf_set_ref into a no-op and fills bp->b_addr with
> > garbage prior to releasing the memory to weed out any other dangling
> > pointers?
> >
> > > ---
> > >
> > > I suspect this is the cause of the NULL buf problem down in
> > > xfs_attr_inactive(). I can manufacture an instance of that problem as
> > > noted above. We have a customer who's hitting that problem and will
> > > attempt to validate this fix, but there is no confirmation as of yet.
> > > I'm posting this for review in the meantime because this seems like a
> > > legit fix regardless of whether they are hitting this or something else.
> >
> > Let me know what they report back.
> 
> Just to let you know, we've got some news regarding this testing: the
> patch seems to fix the issue they were hitting in
> xfs_attr_inactive().

Is there an actual oops trace that is reported somewhere? I didn't see it
provided.

  Luis


* Re: [PATCH] xfs: reinit btree pointer on attr tree inactivation walk
  2017-11-06 22:26     ` Luis R. Rodriguez
@ 2017-11-07 11:02       ` Brian Foster
  0 siblings, 0 replies; 7+ messages in thread
From: Brian Foster @ 2017-11-07 11:02 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: Marco Benatto, Darrick J. Wong, linux-xfs

On Mon, Nov 06, 2017 at 11:26:26PM +0100, Luis R. Rodriguez wrote:
> On Wed, Oct 11, 2017 at 02:11:44PM -0300, Marco Benatto wrote:
> > Hello all
> > 
> > On Fri, Oct 6, 2017 at 5:47 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > >
> > > On Fri, Oct 06, 2017 at 04:07:40PM -0400, Brian Foster wrote:
> > > > xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> > > > associated blocks. xfs_attr3_node_inactive() recursively descends
> > > > from internal blocks to leaf blocks, caching block address values
> > > > along the way to revisit parent blocks, locate the next entry and
> > > > descend down that branch of the tree.
> > > >
> > > > The code that attempts to reread the parent block is unsafe because
> > > > it assumes that the local xfs_da_node_entry pointer remains valid
> > > > after an xfs_trans_brelse() and re-read of the parent buffer. Under
> > > > heavy memory pressure, it is possible that the buffer has been
> > > > reclaimed and reallocated by the time the parent block is reread.
> > > > This means that 'btree' can point to an invalid memory address, lead
> > > > to a random/garbage value for child_fsb and cause the subsequent
> > > > read of the attr fork to go off the rails and return a NULL buffer
> > > > for an attr fork offset that is most likely not allocated.
> > > >
> > > > Note that this problem can be manufactured by setting
> > > > XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> > > > creating a file with a multi-level attr fork and removing it to
> > > > trigger inactivation.
> > > >
> > > > To address this problem, reinit the node/btree pointers to the
> > > > parent buffer after it has been re-read. This ensures btree points
> > > > to a valid record and allows the walk to proceed.
> > > >
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > >
> > > Looks ok,
> > > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > > /me wonders if this is a good enough reason to introduce a new errortag
> > > that turns xfs_buf_set_ref into a no-op and fills bp->b_addr with
> > > garbage prior to releasing the memory to weed out any other dangling
> > > pointers?
> > >
> > > > ---
> > > >
> > > > I suspect this is the cause of the NULL buf problem down in
> > > > xfs_attr_inactive(). I can manufacture an instance of that problem as
> > > > noted above. We have a customer who's hitting that problem and will
> > > > attempt to validate this fix, but there is no confirmation as of yet.
> > > > I'm posting this for review in the meantime because this seems like a
> > > > legit fix regardless of whether they are hitting this or something else.
> > >
> > > Let me know what they report back.
> > 
> > Just to let you know, we've got some news regarding this testing: the
> > patch seems to fix the issue they were hitting in
> > xfs_attr_inactive().
> 
> Is there an actual oops trace that is reported somewhere? I didn't see it
> provided.
> 

I believe it is this thread:

https://www.spinics.net/lists/linux-xfs/msg06695.html

Brian

>   Luis

