All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wengang Wang <wen.gang.wang@oracle.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jan Kara <jack@suse.cz>, ocfs2-devel ML <ocfs2-devel@oss.oracle.com>
Subject: Re: [Ocfs2-devel] Possible fs corruption when hole punch races with	other ops
Date: Thu, 22 Apr 2021 16:36:23 +0000	[thread overview]
Message-ID: <7C502B8A-F978-4DB3-897B-9E726D39F08E@oracle.com> (raw)
In-Reply-To: <20210422160608.GB547183@magnolia>



> On Apr 22, 2021, at 9:06 AM, Darrick J. Wong <djwong@kernel.org> wrote:
> 
> On Thu, Apr 22, 2021 at 03:56:15PM +0000, Wengang Wang wrote:
>> Hi Honza,
>> 
>> By “hole punching operation”, you meant deallocate some data blocks
>> from a file and keep the file size unchanged, after that operation
>> some “holes" are made in that file, right?
>> I am just curious that do we really have end user programs that do
>> hole punching on files? Or hole punching is only used for development
>> purpose?
> 
> Yes, we do.  QEMU does it to image files, various $database products,

Hm, to run a hole punching, that means the data blocks to deallocate were useful but now are no longer useful. 
For QEMU, it does hole punching at guest run time for guest FS trims? Don’t know in what cases DBs make use of hole punching.
Well, good to know.

Thanks Darrick.

wengang

> etc...
> 
> --D
> 
>> 
>> thanks,
>> wengang
>> 
>> On Apr 22, 2021, at 3:44 AM, Jan Kara <jack@suse.cz<mailto:jack@suse.cz>> wrote:
>> 
>> Hello!
>> 
>> On Thu 22-04-21 11:22:07, Joseph Qi wrote:
>> Checked the code flow, it seems the race you worried truly exists.
>> We have to take ip_alloc_sem before calling into ocfs2_get_block().
>> 
>> OK, that's what I assumed but this won't be easy. Because ocfs2_writepage()
>> is called with a page locked but ocfs2_remove_inode_range() gets called
>> under ip_alloc_sem and locks pages in ocfs2_truncate_cluster_pages() ->
>> truncate_inode_pages_range(). Which creates ABBA deadlock. So you'll
>> probably need to come up with a similar dance like in ocfs2_readpage().
>> 
>> But after fixing this, ip_alloc_sem should provide you with enough
>> protection so that ocfs2 isn't prone to races between hole punch and
>> readahead / fault my work will be mostly irrelevant for OCFS2.
>> 
>> Thanks for confirmation!
>> 
>> Honza
>> 
>> On 4/22/21 12:29 AM, Jan Kara wrote:
>> Hello,
>> 
>> I'm unifying protection various filesystems use to protect hole punch
>> operations from racing with other operations (like readahead, page fault,
>> writepage etc.). I was looking into OCFS2 and I think it is prone to a
>> following race which can possibly lead to filesystem corruption. But maybe
>> I miss something so that's why I'm writing here. The scenario I'm concerned
>> about is:
>> 
>> CPU1 CPU2
>> ocfs2_remove_inode_range() ocfs2_writepage()
>> ...   block_write_full_page()
>> ocfs2_remove_btree_range()     ocfs2_extent_map_get_blocks()
>> 
>> Now ocfs2_extent_map_get_blocks() runs without protection of ip_alloc_sem
>> AFAICT and so both these operations can be modifying extent map at the same
>> time? What am I missing?
>> 
>> Honza
>> 
>> --
>> Jan Kara <jack@suse.com<mailto:jack@suse.com>>
>> SUSE Labs, CR
>> 
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com<mailto:Ocfs2-devel@oss.oracle.com>
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>> 
> 
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

  reply	other threads:[~2021-04-22 16:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-21 16:29 [Ocfs2-devel] Possible fs corruption when hole punch races with other ops Jan Kara
2021-04-22  3:22 ` Joseph Qi
2021-04-22 10:44   ` Jan Kara
2021-04-22 15:56     ` Wengang Wang
2021-04-22 16:06       ` Darrick J. Wong
2021-04-22 16:36         ` Wengang Wang [this message]
2021-04-22 16:46           ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7C502B8A-F978-4DB3-897B-9E726D39F08E@oracle.com \
    --to=wen.gang.wang@oracle.com \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=ocfs2-devel@oss.oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.