From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christoph Hellwig <hch@lst.de>, Christian Brauner <brauner@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Chandan Babu R <chandan.babu@oracle.com>,
	Zhang Yi <yi.zhang@huaweicloud.com>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 05/13] iomap: factor out a iomap_writepage_handle_eof helper
Date: Mon, 27 Nov 2023 12:27:22 +0530	[thread overview]
Message-ID: <8734wrsmy5.fsf@doe.com> (raw)
In-Reply-To: <20231126124720.1249310-6-hch@lst.de>

Christoph Hellwig <hch@lst.de> writes:

> Most of iomap_do_writepage is dedicated to handling a folio crossing or
> extending beyond i_size.  Split this into a separate helper and update the
> comments to deal with folios instead of pages and make them more readable.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/iomap/buffered-io.c | 128 ++++++++++++++++++++---------------------
>  1 file changed, 62 insertions(+), 66 deletions(-)

Just a small nit below.

But otherwise looks good to me. Please feel free to add - 

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 8148e4c9765dac..4a5a21809b0182 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1768,6 +1768,64 @@ iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
>  	wbc_account_cgroup_owner(wbc, &folio->page, len);
>  }
>  
> +/*
> + * Check interaction of the folio with the file end.
> + *
> + * If the folio is entirely beyond i_size, return false.  If it straddles
> + * i_size, adjust end_pos and zero all data beyond i_size.
> + */
> +static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
> +		u64 *end_pos)
> +{
> +	u64 isize = i_size_read(inode);

i_size_read(inode) returns loff_t. Can we make end_pos loff_t as well?
We already use loff_t for folio_pos(folio) + folio_size(folio) in many
places in fs/iomap, so it would be more consistent with that type.
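
i.e. something like this (completely untested, just to illustrate the
type change I have in mind; the rest of the helper stays as in your
patch):

-static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
-		u64 *end_pos)
+static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
+		loff_t *end_pos)
 {
-	u64 isize = i_size_read(inode);
+	loff_t isize = i_size_read(inode);

and correspondingly in iomap_do_writepage():

-	u64 end_pos = folio_pos(folio) + folio_size(folio);
+	loff_t end_pos = folio_pos(folio) + folio_size(folio);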

Thoughts?

-ritesh


> +
> +	if (*end_pos > isize) {
> +		size_t poff = offset_in_folio(folio, isize);
> +		pgoff_t end_index = isize >> PAGE_SHIFT;
> +
> +		/*
> +		 * If the folio is entirely outside of i_size, skip it.
> +		 *
> +		 * This can happen due to a truncate operation that is in
> +		 * progress and in that case truncate will finish it off once
> +		 * we've dropped the folio lock.
> +		 *
> +		 * Note that the pgoff_t used for end_index is an unsigned long.
> +		 * If the given offset is greater than 16TB on a 32-bit system,
> +		 * then if we checked if the folio is fully outside i_size with
> +		 * "if (folio->index >= end_index + 1)", "end_index + 1" would
> +		 * overflow and evaluate to 0.  Hence this folio would be
> +		 * redirtied and written out repeatedly, which would result in
> +		 * an infinite loop; the user program performing this operation
> +		 * would hang.  Instead, we can detect this situation by
> +		 * checking if the folio is totally beyond i_size or if its
> +		 * offset is just equal to the EOF.
> +		 */
> +		if (folio->index > end_index ||
> +		    (folio->index == end_index && poff == 0))
> +			return false;
> +
> +		/*
> +		 * The folio straddles i_size.
> +		 *
> +		 * It must be zeroed out on each and every writepage invocation
> +		 * because it may be mmapped:
> +		 *
> +		 *    A file is mapped in multiples of the page size.  For a
> +		 *    file that is not a multiple of the page size, the
> +		 *    remaining memory is zeroed when mapped, and writes to that
> +		 *    region are not written out to the file.
> +		 *
> +		 * Also adjust the writeback range to skip all blocks entirely
> +		 * beyond i_size.
> +		 */
> +		folio_zero_segment(folio, poff, folio_size(folio));
> +		*end_pos = isize;
> +	}
> +
> +	return true;
> +}
> +
>  /*
>   * We implement an immediate ioend submission policy here to avoid needing to
>   * chain multiple ioends and hence nest mempool allocations which can violate
> @@ -1906,78 +1964,16 @@ static int iomap_do_writepage(struct folio *folio,
>  {
>  	struct iomap_writepage_ctx *wpc = data;
>  	struct inode *inode = folio->mapping->host;
> -	u64 end_pos, isize;
> +	u64 end_pos = folio_pos(folio) + folio_size(folio);
>  
>  	trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
>  
> -	/*
> -	 * Is this folio beyond the end of the file?
> -	 *
> -	 * The folio index is less than the end_index, adjust the end_pos
> -	 * to the highest offset that this folio should represent.
> -	 * -----------------------------------------------------
> -	 * |			file mapping	       | <EOF> |
> -	 * -----------------------------------------------------
> -	 * | Page ... | Page N-2 | Page N-1 |  Page N  |       |
> -	 * ^--------------------------------^----------|--------
> -	 * |     desired writeback range    |      see else    |
> -	 * ---------------------------------^------------------|
> -	 */
> -	isize = i_size_read(inode);
> -	end_pos = folio_pos(folio) + folio_size(folio);
> -	if (end_pos > isize) {
> -		/*
> -		 * Check whether the page to write out is beyond or straddles
> -		 * i_size or not.
> -		 * -------------------------------------------------------
> -		 * |		file mapping		        | <EOF>  |
> -		 * -------------------------------------------------------
> -		 * | Page ... | Page N-2 | Page N-1 |  Page N   | Beyond |
> -		 * ^--------------------------------^-----------|---------
> -		 * |				    |      Straddles     |
> -		 * ---------------------------------^-----------|--------|
> -		 */
> -		size_t poff = offset_in_folio(folio, isize);
> -		pgoff_t end_index = isize >> PAGE_SHIFT;
> -
> -		/*
> -		 * Skip the page if it's fully outside i_size, e.g.
> -		 * due to a truncate operation that's in progress.  We've
> -		 * cleaned this page and truncate will finish things off for
> -		 * us.
> -		 *
> -		 * Note that the end_index is unsigned long.  If the given
> -		 * offset is greater than 16TB on a 32-bit system then if we
> -		 * checked if the page is fully outside i_size with
> -		 * "if (page->index >= end_index + 1)", "end_index + 1" would
> -		 * overflow and evaluate to 0.  Hence this page would be
> -		 * redirtied and written out repeatedly, which would result in
> -		 * an infinite loop; the user program performing this operation
> -		 * would hang.  Instead, we can detect this situation by
> -		 * checking if the page is totally beyond i_size or if its
> -		 * offset is just equal to the EOF.
> -		 */
> -		if (folio->index > end_index ||
> -		    (folio->index == end_index && poff == 0))
> -			goto unlock;
> -
> -		/*
> -		 * The page straddles i_size.  It must be zeroed out on each
> -		 * and every writepage invocation because it may be mmapped.
> -		 * "A file is mapped in multiples of the page size.  For a file
> -		 * that is not a multiple of the page size, the remaining
> -		 * memory is zeroed when mapped, and writes to that region are
> -		 * not written out to the file."
> -		 */
> -		folio_zero_segment(folio, poff, folio_size(folio));
> -		end_pos = isize;
> +	if (!iomap_writepage_handle_eof(folio, inode, &end_pos)) {
> +		folio_unlock(folio);
> +		return 0;
>  	}
>  
>  	return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
> -
> -unlock:
> -	folio_unlock(folio);
> -	return 0;
>  }
>  
>  int
> -- 
> 2.39.2

Thread overview: 49+ messages
2023-11-26 12:47 RFC: map multiple blocks per ->map_blocks in iomap writeback Christoph Hellwig
2023-11-26 12:47 ` [PATCH 01/13] iomap: clear the per-folio dirty bits on all writeback failures Christoph Hellwig
2023-11-27  3:47   ` Ritesh Harjani
2023-11-27  6:29     ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 02/13] iomap: treat inline data in iomap_writepage_map as an I/O error Christoph Hellwig
2023-11-27  5:01   ` Ritesh Harjani
2023-11-27  6:33     ` Christoph Hellwig
2023-11-29  4:40       ` Darrick J. Wong
2023-11-29  5:39         ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 03/13] iomap: move the io_folios field out of struct iomap_ioend Christoph Hellwig
2023-11-27  5:33   ` Ritesh Harjani
2023-11-29  4:41   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 04/13] iomap: drop the obsolete PF_MEMALLOC check in iomap_do_writepage Christoph Hellwig
2023-11-27  6:39   ` Ritesh Harjani
2023-11-27  6:41     ` Christoph Hellwig
2023-11-29  4:45       ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 05/13] iomap: factor out a iomap_writepage_handle_eof helper Christoph Hellwig
2023-11-27  6:57   ` Ritesh Harjani [this message]
2023-11-27  7:02     ` Ritesh Harjani
2023-11-27  7:12       ` Christoph Hellwig
2023-11-29  4:48         ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 06/13] iomap: move all remaining per-folio logic into xfs_writepage_map Christoph Hellwig
2023-11-27  7:36   ` Ritesh Harjani
2023-11-27 19:20   ` Josef Bacik
2023-11-29  4:50   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 07/13] iomap: clean up the iomap_new_ioend calling convention Christoph Hellwig
2023-11-27  7:43   ` Ritesh Harjani
2023-11-27  8:13     ` Christoph Hellwig
2023-11-29  4:51       ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 08/13] iomap: move the iomap_sector sector calculation out of iomap_add_to_ioend Christoph Hellwig
2023-11-27  9:54   ` Ritesh Harjani
2023-11-27 13:54     ` Christoph Hellwig
2023-11-29  4:53       ` Darrick J. Wong
2023-11-29  5:42         ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 09/13] iomap: don't chain bios Christoph Hellwig
2023-11-27 12:53   ` Zhang Yi
2023-11-27 13:51     ` Christoph Hellwig
2023-11-29  4:59   ` Darrick J. Wong
2023-11-29  5:40     ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 10/13] iomap: only call mapping_set_error once for each failed bio Christoph Hellwig
2023-11-29  5:03   ` Darrick J. Wong
2023-11-29  5:41     ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 11/13] iomap: factor out a iomap_writepage_map_block helper Christoph Hellwig
2023-11-29  5:06   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 12/13] iomap: submit ioends immediately Christoph Hellwig
2023-11-29  5:14   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 13/13] iomap: map multiple blocks at a time Christoph Hellwig
2023-11-29  5:25   ` Darrick J. Wong
2023-11-29  5:44     ` Christoph Hellwig
