From: Joseph Qi <joseph.qi@linux.alibaba.com>
To: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Chenyuan Mi <cymi20@fudan.edu.cn>,
	akpm <akpm@linux-foundation.org>, Xin Tan <tanxin.ctf@gmail.com>,
	Xiyu Yang <xiyuyang19@fudan.edu.cn>,
	"yuanxzhang@fudan.edu.cn" <yuanxzhang@fudan.edu.cn>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"ocfs2-devel@oss.oracle.com" <ocfs2-devel@oss.oracle.com>
Subject: Re: [Ocfs2-devel] [PATCH v2] ocfs2: Fix handle refcount leak in two exception handling paths
Date: Fri, 10 Sep 2021 09:53:57 +0800
Message-ID: <ee86ea1a-0348-e975-3c67-8d574eaadbe3@linux.alibaba.com>
In-Reply-To: <CED0D2AD-7905-490E-8D36-50D192CD9BF1@oracle.com>



On 9/10/21 1:48 AM, Wengang Wang wrote:
> 
> 
> On Sep 9, 2021, at 4:07 AM, Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> 
> Hi Wengang,
> 
> On 9/9/21 1:12 AM, Wengang Wang wrote:
> Hi,
> 
> Sorry for getting involved late, but this doesn't look right to me.
> 
> On Sep 8, 2021, at 3:51 AM, Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> 
> 
> 
> On 9/8/21 6:20 PM, Chenyuan Mi wrote:
> The reference counting issue happens in two exception handling paths
> of ocfs2_replay_truncate_records(). When executing these two exception
> handling paths, the function forgets to decrease the refcount of handle
> increased by ocfs2_start_trans(), causing a refcount leak.
> 
> Fix this issue by using ocfs2_commit_trans() to decrease the refcount
> of handle in two handling paths.
> 
> Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn>
> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
> 
> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
> ---
> fs/ocfs2/alloc.c | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
> index f1cc8258d34a..b05fde7edc3a 100644
> --- a/fs/ocfs2/alloc.c
> +++ b/fs/ocfs2/alloc.c
> @@ -5940,6 +5940,7 @@ static int ocfs2_replay_truncate_records(struct ocfs2_super *osb,
> status = ocfs2_journal_access_di(handle, INODE_CACHE(tl_inode), tl_bh,
>  OCFS2_JOURNAL_ACCESS_WRITE);
> if (status < 0) {
> + ocfs2_commit_trans(osb, handle);
> mlog_errno(status);
> goto bail;
> }
> @@ -5964,6 +5965,7 @@ static int ocfs2_replay_truncate_records(struct ocfs2_super *osb,
>      data_alloc_bh, start_blk,
>      num_clusters);
> if (status < 0) {
> + ocfs2_commit_trans(osb, handle);
> 
> As a transaction, everything expected to be in the same handle should be treated as atomic.
> Here that includes tl_bh and the other metadata blocks that will be modified in ocfs2_free_clusters().
> When we get here, some of the related metadata blocks may be in the handle while others are not, because of the error.
> If you commit, a partial set of metadata blocks is committed to the log. That breaks atomicity and will cause FS inconsistency.
> So what is the reason you want to commit this handle's incomplete metadata changes to the journal log?
> 
> Did you actually hit this failure, or did you just detect the refcount leak by code review?
> 
> You may want to look at ocfs2_journal_dirty() for the error handling part.
> 
> 
> For the first error path, we haven't called ocfs2_journal_dirty()
> yet, so it won't be a problem.
> For the second error path, I don't think we have a better way. Look
> at the other callers of ocfs2_free_clusters(): we simply ignore the error
> code.
> Anyway, we should commit the transaction once it has started; otherwise the
> journal will be abnormal.
> 
> I don't think so. If an error happened, we should fail ocfs2 rather than do a partial commit.
> 

Umm... not exactly...
Take ocfs2_free_clusters() for example: when it fails with EIO or
ENOMEM, we can't just abort the journal, because the failure is not that
serious. Only a few blocks remain occupied, and they will be recovered
during the next mount.
That's why most filesystems have "errors=continue": we should always
consider business continuity first.
You can also look at ext4_free_blocks() for reference.
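
To make the point concrete, here is a minimal userspace sketch of the
pattern, not ocfs2 code. Every name in it (toy_handle, toy_start_trans,
toy_commit_trans, toy_free_clusters, toy_replay_record) is a hypothetical
stand-in, and the "handle" is just a counter rather than a real jbd2
handle. It only shows that the error path drops the reference taken at
start and reports the error, instead of leaking the handle or aborting.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Toy stand-in for a journal handle; NOT the real jbd2 handle_t. */
struct toy_handle {
	int refcount;	/* starts that have no matching commit yet */
};

/* Hypothetical analogue of ocfs2_start_trans(): takes a reference. */
static void toy_start_trans(struct toy_handle *h)
{
	h->refcount++;
}

/* Hypothetical analogue of ocfs2_commit_trans(): drops the reference. */
static void toy_commit_trans(struct toy_handle *h)
{
	assert(h->refcount > 0);
	h->refcount--;
}

/* Hypothetical analogue of ocfs2_free_clusters(); fails on request. */
static int toy_free_clusters(int fail_with)
{
	return fail_with ? -fail_with : 0;
}

/* Replay one record: the handle is released on success and on error. */
static int toy_replay_record(struct toy_handle *h, int fail_with)
{
	int status;

	toy_start_trans(h);

	status = toy_free_clusters(fail_with);
	if (status < 0) {
		/* Non-fatal: release the handle, log, and report the error. */
		toy_commit_trans(h);
		fprintf(stderr, "toy_free_clusters failed: %d\n", status);
		return status;
	}

	toy_commit_trans(h);
	return 0;
}
```

Calling toy_replay_record(&h, EIO) on a zero-initialized handle returns
-EIO and leaves h.refcount at 0, the same as the success case. Whether the
partial commit is acceptable is a separate question and depends on what the
real callee has already dirtied.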

Thanks,
Joseph

