From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
Date: Tue, 30 Jul 2019 17:57:18 +0300
Message-ID: <51ba7304-06bd-a50d-cb14-6dc41b92fab5@yandex-team.ru>
In-Reply-To: <20190730141457.GE28829@quack2.suse.cz>

On 30.07.2019 17:14, Jan Kara wrote:
> On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
>> On 23.07.2019 3:52, Andrew Morton wrote:
>>>
>>> (cc linux-fsdevel and Jan)
> 
> Thanks for CC Andrew.
> 
>>> On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
>>>
>>>> Functions like filemap_write_and_wait_range() should do nothing if inode
>>>> has no dirty pages or pages currently under writeback. But they anyway
>>>> construct struct writeback_control and this does some atomic operations
>>>> if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
>>>> updates state of writeback ownership, on slow path might be more work.
>>>> Current this path is safely avoided only when inode mapping has no pages.
>>>>
>>>> For example generic_file_read_iter() calls filemap_write_and_wait_range()
>>>> at each O_DIRECT read - pretty hot path.
> 
> Yes, but in common case mapping_needs_writeback() is false for files you do
> direct IO to (exactly the case with no pages in the mapping). So you
> shouldn't see the overhead at all. So which case you really care about?
> 
>>>> This patch skips starting new writeback if mapping has no dirty tags set.
>>>> If writeback is already in progress filemap_write_and_wait_range() will
>>>> wait for it.
>>>>
>>>> ...
>>>>
>>>> --- a/mm/filemap.c
>>>> +++ b/mm/filemap.c
>>>> @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>>>>    		.range_end = end,
>>>>    	};
>>>> -	if (!mapping_cap_writeback_dirty(mapping))
>>>> +	if (!mapping_cap_writeback_dirty(mapping) ||
>>>> +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>>>>    		return 0;
>>>>    	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
>>>
>>> How does this play with tagged_writepages?  We assume that no tagging
>>> has been performed by any __filemap_fdatawrite_range() caller?
>>>
>>
>> Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
>>
>> To-write tags are supposed to be a subset of dirty tags:
>> to-write is set only when dirty is set and cleared after starting writeback.
>>
>> Special case set_page_writeback_keepwrite() which does not clear to-write
>> should be for dirty page thus dirty tag is not going to be cleared either.
>> Ext4 calls it after redirty_page_for_writepage()
>> XFS even without clear_page_dirty_for_io()
>>
>> Anyway to-write tag without dirty tag or at clear page is confusing.
> 
> Yeah, TOWRITE tag is intended to be internal to writepages logic so your
> patch is fine in that regard. Overall the patch looks good to me so I'm
> just wondering a bit about the motivation...
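
For illustration, the tag reasoning quoted above can be modelled in plain
userspace C. This is only a rough sketch: the toy_* names and the struct
are made up here and are not kernel API. It assumes the invariant stated
above (a to-write tag never appears without a dirty tag) and mirrors the
early return added to __filemap_fdatawrite_range().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PAGECACHE_TAG_DIRTY / PAGECACHE_TAG_TOWRITE. */
enum toy_tag {
	TOY_TAG_DIRTY   = 1u << 0,
	TOY_TAG_TOWRITE = 1u << 1,
};

/* Hypothetical, heavily simplified model of an address_space. */
struct toy_mapping {
	unsigned int tags;      /* union of the tags set on any cached page */
	bool cap_writeback;     /* models mapping_cap_writeback_dirty() */
	unsigned long nrpages;  /* cached pages, clean or dirty */
};

/* Invariant from the discussion: to-write implies dirty. */
static inline bool toy_tags_consistent(const struct toy_mapping *m)
{
	return !(m->tags & TOY_TAG_TOWRITE) || (m->tags & TOY_TAG_DIRTY);
}

/*
 * Returns true when writeback would be started, i.e. when the real code
 * would go on to set up the writeback_control. Checking only the dirty
 * tag is enough as long as the invariant above holds.
 */
static inline bool toy_should_start_writeback(const struct toy_mapping *m)
{
	assert(toy_tags_consistent(m));
	if (!m->cap_writeback || !(m->tags & TOY_TAG_DIRTY))
		return false;
	return true;
}
```

A mapping with nrpages > 0 but no tags set returns false here, which is
the case the patch wants to make cheap.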

In our case the file mixes cached pages and O_DIRECT reads. It is a kind of
database where the index header is memory mapped while the rest of the data
is read via O_DIRECT. I suppose this is done to share the index between
multiple instances.
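
A minimal userspace sketch of that access pattern, as I understand it.
The helper names, HDR_SIZE and BLK_SIZE are assumptions made up for this
example; the real application is not shown here, and the alignment that
O_DIRECT actually requires depends on the filesystem and device.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* O_DIRECT is a GNU/Linux extension */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define HDR_SIZE 4096           /* hypothetical size of the index header */
#define BLK_SIZE 4096           /* assumed O_DIRECT alignment and block size */

/* Open the data file for reads that bypass the page cache. */
static inline int open_direct(const char *path)
{
	return open(path, O_RDONLY | O_DIRECT);
}

/*
 * Map the index header through the page cache. This leaves clean pages
 * in the mapping, so the mapping is not empty even though nothing is dirty.
 */
static inline void *map_index_header(int fd)
{
	void *p = mmap(NULL, HDR_SIZE, PROT_READ, MAP_SHARED, fd, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Read one block at an aligned offset into a freshly allocated aligned
 * buffer. The caller frees *out. With an fd from open_direct() each call
 * reaches filemap_write_and_wait_range() in the kernel.
 */
static inline ssize_t read_block(int fd, off_t off, void **out)
{
	void *buf;
	ssize_t n;

	assert(off % BLK_SIZE == 0);
	if (posix_memalign(&buf, BLK_SIZE, BLK_SIZE) != 0)
		return -1;
	n = pread(fd, buf, BLK_SIZE, off);
	if (n < 0) {
		free(buf);
		return -1;
	}
	*out = buf;
	return n;
}
```

The mapped header keeps pages in the mapping, so the existing
"mapping has no pages" shortcut never applies to the O_DIRECT reads.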

On this path we also hit this bug:
https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
so that's why I've started looking into this code.
