All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
@ 2016-04-12  8:56 Alex Lyakas
  2016-04-12 11:29 ` Brent Bice
  2016-04-12 23:29 ` Dave Chinner
  0 siblings, 2 replies; 8+ messages in thread
From: Alex Lyakas @ 2016-04-12  8:56 UTC (permalink / raw)
  To: xfs; +Cc: bbice, Shyam Kaushik


[-- Attachment #1.1: Type: text/plain, Size: 697 bytes --]

Hello Dave,

Looking at the patch, I see that now we call xfs_idestroy_fork() in RCU callback. This can do the following chain:

xfs_iext_destroy => xfs_iext_irec_remove => xfs_iext_realloc_indirect=> kmem_realloc => kmem_alloc => kmem_alloc => congestion_wait()

At least according to documentation, the RCU callback cannot block, since it may be called from softirq context. Is this fine?

Thanks,
Alex.

P.S.: I have submitted a patch called “[PATCH] xfs: optimize destruction of the indirection array”, which changes xfs_iext_destroy to call only kmem_free(). But this patch got stuck in the spam filter of the mailing list, and Brent is working to release it from there.


 

[-- Attachment #1.2: Type: text/html, Size: 1349 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
  2016-04-12  8:56 [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe Alex Lyakas
@ 2016-04-12 11:29 ` Brent Bice
  2016-04-12 21:11   ` Dave Chinner
  2016-04-12 23:29 ` Dave Chinner
  1 sibling, 1 reply; 8+ messages in thread
From: Brent Bice @ 2016-04-12 11:29 UTC (permalink / raw)
  To: Alex Lyakas, xfs; +Cc: Shyam Kaushik

    Could you resend it to be sure? I flagged it for redelivery the day 
I emailed you so if nobody else has seen it yet then something still 
didn't work.  I was pretty sure I'd seen it get delivered ok to oss, 
but... (shrug)

Brent

On 04/12/2016 02:56 AM, Alex Lyakas wrote:
> Hello Dave,
> Looking at the patch, I see that now we call xfs_idestroy_fork() in RCU
> callback. This can do the following chain:
> xfs_iext_destroy => xfs_iext_irec_remove => xfs_iext_realloc_indirect=>
> kmem_realloc => kmem_alloc => kmem_alloc => congestion_wait()
> At least according to documentation, the RCU callback cannot block,
> since it may be called from softirq context. Is this fine?
> Thanks,
> Alex.
> P.S.: I have submitted a patch called “[PATCH] xfs: optimize destruction
> of the indirection array”, which changes xfs_iext_destroy to call only
> kmem_free(). But this patch got stuck in the spam filter of the mailing
> list, and Brent is working to release it from there.
>
>
>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
  2016-04-12 11:29 ` Brent Bice
@ 2016-04-12 21:11   ` Dave Chinner
  2016-04-12 21:55     ` Brent Bice
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2016-04-12 21:11 UTC (permalink / raw)
  To: Brent Bice; +Cc: Shyam Kaushik, Alex Lyakas, xfs

On Tue, Apr 12, 2016 at 05:29:15AM -0600, Brent Bice wrote:
>    Could you resend it to be sure? I flagged it for redelivery the
> day I emailed you so if nobody else has seen it yet then something
> still didn't work.  I was pretty sure I'd seen it get delivered ok
> to oss, but... (shrug)

No, it didn't, and it's not in the archives, either.

It seems that sgi.com is rejecting quite a bit of legitimate mail
lately. I've seen a couple of people in the past couple of weeks
send stuff that has reached my private in boxes but not the list.
Now this one, too.

If someone is having to micro-manage the list to prevent rejections
of valid email then that is bad...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
  2016-04-12 21:11   ` Dave Chinner
@ 2016-04-12 21:55     ` Brent Bice
  0 siblings, 0 replies; 8+ messages in thread
From: Brent Bice @ 2016-04-12 21:55 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Shyam Kaushik, Alex Lyakas, xfs

On 04/12/2016 03:11 PM, Dave Chinner wrote:
> It seems that sgi.com is rejecting quite a bit of legitimate mail
> lately. I've seen a couple of people in the past couple of weeks
> send stuff that has reached my private in boxes but not the list.
> Now this one, too.

    Yeah, this one wasn't the anti-spam filters that I manage (the 
barracudas, our private RBL, etc) but spamassassin on oss.sgi.com. 
So...  Dunno.  I took a crack at whitelisting Alex's from address in 
spamassassin (though I haven't looked at it's configs or man pages in 
about 10 years).

> If someone is having to micro-manage the list to prevent rejections
> of valid email then that is bad...

So true.  I only happened to notice because I had someone (I forget who 
it was several years ago) tweak SA on oss so that rejections get 
forwarded to me.  The idea was I'd look at those (stuff that the cudas 
missed but spamassassin caught) and use that info to update our filters 
(the filters that benefit everyone at SGI, not just oss).

    Anyway, I don't always have the bandwidth to look at every email SA 
on oss rejects and update spam filters (if possible) and just happened 
to notice this one and that it wasn't spam.

    I am a lot more aggressive with the filters for oss because it's 
such a sore-thumb-target for spammers. For instance, there are large 
swaths of yahoo IP space being blocked now (for oss only) because over 
the years I only found one legit sender to oss from a yahoo IP - the 
rest was all spam/phish.  But I've only dropped in yahoo IP ranges that 
I was seeing spam from.  I've gotten aggressive with some Chinese IP 
space too for similar reasons. If any of the senders you know of having 
trouble might be in yahoo or chinese IP space let me know and I can look 
in the logs, but I'd bet a doughnut that if they didn't get a bounce 
then they're running afoul of spamassassin 'n not the filters I manage. :-(

    There was talk a while back of taking oss out of SGI and hosting it 
somewhere in "Da cloud!" (tm) and having someone else in the xfs 
community own it, wasn't there?  Anyone still looking into that?

Brent


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
  2016-04-12  8:56 [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe Alex Lyakas
  2016-04-12 11:29 ` Brent Bice
@ 2016-04-12 23:29 ` Dave Chinner
  1 sibling, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2016-04-12 23:29 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: bbice, Shyam Kaushik, xfs

On Tue, Apr 12, 2016 at 11:56:35AM +0300, Alex Lyakas wrote:
> Hello Dave,
> 
> Looking at the patch, I see that now we call xfs_idestroy_fork() in RCU callback. This can do the following chain:
> 
> xfs_iext_destroy => xfs_iext_irec_remove => xfs_iext_realloc_indirect=> kmem_realloc => kmem_alloc => kmem_alloc => congestion_wait()
> 
> At least according to documentation, the RCU callback cannot block, since it may be called from softirq context. Is this fine?

Right, I forgot about that. Too many forests. I'll reconstruct your
patch from the email you appended it to previously and add that to
the series to test against.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
@ 2016-04-12  8:55 Alex Lyakas
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Lyakas @ 2016-04-12  8:55 UTC (permalink / raw)
  To: xfs; +Cc: bbice


[-- Attachment #1.1: Type: text/plain, Size: 697 bytes --]

Hello Dave,

Looking at the patch, I see that now we call xfs_idestroy_fork() in RCU callback. This can do the following chain:

xfs_iext_destroy => xfs_iext_irec_remove => xfs_iext_realloc_indirect=> kmem_realloc => kmem_alloc => kmem_alloc => congestion_wait()

At least according to documentation, the RCU callback cannot block, since it may be called from softirq context. Is this fine?

Thanks,
Alex.

P.S.: I have submitted a patch called “[PATCH] xfs: optimize destruction of the indirection array”, which changes xfs_iext_destroy to call only kmem_free(). But this patch got stuck in the spam filter of the mailing list, and Brent is working to release it from there.


 

[-- Attachment #1.2: Type: text/html, Size: 1090 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
  2016-04-06  9:22 ` [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe Dave Chinner
@ 2016-04-07 15:50   ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2016-04-07 15:50 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe
  2016-04-06  9:22 [PATCH 0/5] xfs; xfs_iflush_cluster vs xfs_reclaim_inode Dave Chinner
@ 2016-04-06  9:22 ` Dave Chinner
  2016-04-07 15:50   ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2016-04-06  9:22 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

The xfs_inode freed in xfs_inode_free() has multiple allocated
structures attached to it. We free these in xfs_inode_free() before
we mark the inode as invalid, and before we run call_rcu() to queue
the structure for freeing.

Unfortunately, this freeing can race with other accesses that are in
the RCU current grace period that have found the inode in the radix
tree with a valid state.  This includes xfs_iflush_cluster(), which
calls xfs_inode_clean(), and that accesses the inode log item on the
xfs_inode.

The log item structure is freed in xfs_inode_free(), so there is the
possibility we can be accessing freed memory in xfs_iflush_cluster()
after validating the xfs_inode structure as being valid for this RCU
context. Hence we can get spuriously incorrect clean state returned
from such checks. This can lead to use thinking the inode is dirty
when it is, in fact, clean, and so incorrectly attaching it to the
buffer for IO and completion processing.

This then leads to use-after-free situations on the xfs_inode itself
if the IO completes after the current RCU grace period expires. The
buffer callbacks will access the xfs_inode and try to do all sorts
of things it shouldn't with freed memory.

IOWs, xfs_iflush_cluster() only works correctly when racing with
inode reclaim if the inode log item is present and correctly stating
the inode is clean. If the inode is being freed, then reclaim has
already made sure the inode is clean, and hence xfs_iflush_cluster
can skip it. However, we are accessing the inode inode under RCU
read lock protection and so also must ensure that all dynamically
allocated memory we reference in this context is not freed until the
RCU grace period expires.

To fix this, move all the potential memory freeing into
xfs_inode_free_callback() so that we are guarantee RCU protected
lookup code will always have the memory structures it needs
available during the RCU grace period that lookup races can occur
in.

Discovered-by: Brain Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_icache.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index bf2d607..0c94cde 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -94,13 +94,6 @@ xfs_inode_free_callback(
 	struct inode		*inode = container_of(head, struct inode, i_rcu);
 	struct xfs_inode	*ip = XFS_I(inode);
 
-	kmem_zone_free(xfs_inode_zone, ip);
-}
-
-void
-xfs_inode_free(
-	struct xfs_inode	*ip)
-{
 	switch (VFS_I(ip)->i_mode & S_IFMT) {
 	case S_IFREG:
 	case S_IFDIR:
@@ -118,6 +111,13 @@ xfs_inode_free(
 		ip->i_itemp = NULL;
 	}
 
+	kmem_zone_free(xfs_inode_zone, ip);
+}
+
+void
+xfs_inode_free(
+	struct xfs_inode	*ip)
+{
 	/*
 	 * Because we use RCU freeing we need to ensure the inode always
 	 * appears to be reclaimed with an invalid inode number when in the
-- 
2.7.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-04-12 23:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-12  8:56 [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe Alex Lyakas
2016-04-12 11:29 ` Brent Bice
2016-04-12 21:11   ` Dave Chinner
2016-04-12 21:55     ` Brent Bice
2016-04-12 23:29 ` Dave Chinner
  -- strict thread matches above, loose matches on Subject: below --
2016-04-12  8:55 Alex Lyakas
2016-04-06  9:22 [PATCH 0/5] xfs; xfs_iflush_cluster vs xfs_reclaim_inode Dave Chinner
2016-04-06  9:22 ` [PATCH 4/5] xfs: xfs_inode_free() isn't RCU safe Dave Chinner
2016-04-07 15:50   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.