All of lore.kernel.org
 help / color / mirror / Atom feed
From: Guo Xuenan <guoxuenan@huawei.com>
To: Dave Chinner <david@fromorbit.com>
Cc: <dchinner@redhat.com>, <djwong@kernel.org>, <fangwei1@huawei.com>,
	<houtao1@huawei.com>, <jack.qiu@huawei.com>,
	<leo.lilong@huawei.com>, <linux-xfs@vger.kernel.org>,
	<yi.zhang@huawei.com>, <zengheng4@huawei.com>,
	<zhengbin13@huawei.com>
Subject: Re: [PATCH v3 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL
Date: Thu, 17 Nov 2022 10:12:56 +0800	[thread overview]
Message-ID: <68d584fa-5486-3254-35c2-9ad628b688e5@huawei.com> (raw)
In-Reply-To: <20221116060229.GC3600936@dread.disaster.area>

On 2022/11/16 14:02, Dave Chinner wrote:
> On Tue, Nov 08, 2022 at 10:06:05PM +0800, Guo Xuenan wrote:
>> Fix uaf in xfs_trans_ail_delete during xlog force shutdown.
>> In commit cd6f79d1fb32 ("xfs: run callbacks before waking waiters in
>> xlog_state_shutdown_callbacks") changed the order of running callbacks
>> and wait for iclog completion to avoid unmount path untimely destroy AIL.
>> But which seems not enough to ensue this, adding mdelay in
>> `xfs_buf_item_unpin` can prove that.
>>
>> The reproduction is as follows. To ensure destroy AIL safely,
>> we should wait all xlog ioend workers done and sync the AIL.
> Like Darrick, I didn't see this either because it's in an old
> thread....
Get it and gladly accept :)
>>   fs/xfs/xfs_log.c | 33 +++++++++++++++++++++++++++------
>>   1 file changed, 27 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
>> index f02a0dd522b3..467bac00951c 100644
>> --- a/fs/xfs/xfs_log.c
>> +++ b/fs/xfs/xfs_log.c
>> @@ -82,6 +82,9 @@ STATIC int
>>   xlog_iclogs_empty(
>>   	struct xlog		*log);
>>   
>> +static void
>> +xlog_wait_iodone(struct xlog *log);
>> +
> Why do we need a forward prototype definition?
I guess you mean the function name is same as before,
if so, this is coincidence. :)
>>   static int
>>   xfs_log_cover(struct xfs_mount *);
>>   
>> @@ -886,6 +889,23 @@ xlog_force_iclog(
>>   	return xlog_state_release_iclog(iclog->ic_log, iclog, NULL);
>>   }
>>   
>> +/*
>> + * Cycle all the iclogbuf locks to make sure all log IO completion
>> + * is done before we tear down AIL/CIL.
>> + */
>> +static void
>> +xlog_wait_iodone(struct xlog *log)
>> +{
>> +	int		i;
>> +	xlog_in_core_t	*iclog = log->l_iclog;
>> +
>> +	for (i = 0; i < log->l_iclog_bufs; i++) {
>> +		down(&iclog->ic_sema);
>> +		up(&iclog->ic_sema);
>> +		iclog = iclog->ic_next;
>> +	}
>> +}
> This doesn't guarantee no iclog IO is in progress, just that it
> waited for each iclog to finish any IO it had in progress. If we are
> in a normal runtime condition, xfs_log_force(mp, XFS_LOG_SYNC)
> performs this "wait for journal IO to complete" integrity operation.
>
> Which, in normal runtime unmount operation, we get from
> xfs_log_cover().....
>
>> +
>>   /*
>>    * Wait for the iclog and all prior iclogs to be written disk as required by the
>>    * log force state machine. Waiting on ic_force_wait ensures iclog completions
>> @@ -1276,6 +1296,12 @@ xfs_log_cover(
>>   		xfs_ail_push_all_sync(mp->m_ail);
>>   	} while (xfs_log_need_covered(mp));
>>   
>> +	/*
>> +	 * Cycle all the iclogbuf locks to make sure all log IO completion
>> +	 * is done before we tear down AIL.
>> +	 */
>> +	xlog_wait_iodone(mp->m_log);
> .... and so this call is redundant for normal runtime log quiesce
> operations.  That's because the synchronous transaction run in this
> loop:

yes, it is redundant,through discussion here[1] with Darrick

[1] 
https://lore.kernel.org/all/8575089d-bf9b-0383-939b-922f1440abf7@huawei.com/

>         do {
>>>>>>>           error = xfs_sync_sb(mp, true);
>                  if (error)
>                          break;
>                  xfs_ail_push_all_sync(mp->m_ail);
>          } while (xfs_log_need_covered(mp));
>
> causes a xfs_log_force(mp, XFS_LOG_SYNC) to be issued to force the
> transaction to disk and it *waits for the journal IO to complete.
> Further, it is iclog IO completion that moves the log covered state
> forwards, so this loop absolutely relies on the journal IO being
> fully completed and both the CIL and AIL being empty before it will
> exit.
>
> Hence adding xlog_wait_iodone() here would only have an effect if
> the log had been shut down. however, we never get here if the log
> has been shut down because the xfs_log_writable(mp) check at the
> start of xfs_log_cover() will fail. Hence we return without trying
> to cover the log or wait for iclog completion.
>
> IOWs, I don't see how this code does anything to avoid the problem
> that has been described - if the iclog callback runs the shutdown,
> by the time it gets to running shutdown callbacks it's already
> marked both the log and the mount as shut down, so unmount will
> definitely not run this code and so will not wait for the iclog
> running shutdown callbacks during IO completion....
Thank you very much for your detailed explanation here.
> ----
>
> Really, the only place this iclog walk matters is in the unmount
> path prior to tearing down internal structures, and it only matters
> when ths filesystem has been shut down. That means it needs
> to be a part of xfs_log_unmount(), not part of the normal runtime
> freeze/remount readonly/unmount log quiescing.
>
> If we are shut down, then we have to guarantee that we've
> finished processing the iclogs before we tear down the things that
> use log items - the CIL, the AIL, the iclogs themselves, etc. It
> *must* also run in the shutdown case, so it can't be put in a
> function that is conditional on a normal running filesystem.
> Something like this:
>
>   void
>   xfs_log_unmount(
>           struct xfs_mount        *mp)
>   {
>          xfs_log_clean(mp);
>
> +	/*
> +	 * If shutdown has come from iclog IO context, the log
> +	 * cleaning will have been skipped and so we need to wait
> +	 * for ithe iclog to complete shutdown processing before we
> +	 * tear anything down.
> +	 */
> +	xfs_iclog_iodone_wait(mp->m_log);
>          xfs_buftarg_drain(mp->m_ddev_targp);
>
> And the iclog walk can be removed from xlog_dealloc_log() as we've
> already done the checks it needs before tearing down the iclogs....
OK, I will send v4 in a new thread according to your kindly and detailed 
suggestion.

Best regards
Xuenan
> Cheers,
>
> Dave.

-- 
Guo Xuenan [OS Kernel Lab]
-----------------------------
Email: guoxuenan@huawei.com


  reply	other threads:[~2022-11-17  2:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-03  8:36 [PATCH 0/2] xfs: shutdown UAF fixes Guo Xuenan
2022-11-03  8:36 ` [PATCH 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
2022-11-03 21:16   ` Dave Chinner
2022-11-04  7:50     ` Guo Xuenan
2022-11-04 15:46       ` Darrick J. Wong
2022-11-05  3:32         ` Guo Xuenan
2022-11-03  8:36 ` [PATCH 2/2] xfs: fix super block buf log item UAF during force shutdown Guo Xuenan
2022-11-03 21:44   ` Dave Chinner
2022-11-07 14:27 ` [PATCH v2 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
2022-11-07 14:27   ` [PATCH v2 2/2] xfs: fix super block buf log item UAF during force shutdown Guo Xuenan
2022-11-08 14:06     ` [PATCH v3 1/2] xfs: wait xlog ioend workqueue drained before tearing down AIL Guo Xuenan
2022-11-15 23:23       ` Darrick J. Wong
2022-11-17  1:18         ` Guo Xuenan
2022-11-16  6:02       ` Dave Chinner
2022-11-17  2:12         ` Guo Xuenan [this message]
2022-11-07 17:13   ` [PATCH v2 " Darrick J. Wong
2022-11-08  2:44     ` Guo Xuenan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68d584fa-5486-3254-35c2-9ad628b688e5@huawei.com \
    --to=guoxuenan@huawei.com \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=djwong@kernel.org \
    --cc=fangwei1@huawei.com \
    --cc=houtao1@huawei.com \
    --cc=jack.qiu@huawei.com \
    --cc=leo.lilong@huawei.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=yi.zhang@huawei.com \
    --cc=zengheng4@huawei.com \
    --cc=zhengbin13@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.