All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
@ 2019-07-22  9:36 Konstantin Khlebnikov
  2019-07-22  9:36 ` [PATCH 2/2] mm/filemap: rewrite mapping_needs_writeback in less fancy manner Konstantin Khlebnikov
  2019-07-23  0:52 ` [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Andrew Morton
  0 siblings, 2 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2019-07-22  9:36 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Tejun Heo, Jens Axboe, Johannes Weiner

Functions like filemap_write_and_wait_range() should do nothing if inode
has no dirty pages or pages currently under writeback. But they anyway
construct struct writeback_control and this does some atomic operations
if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
updates state of writeback ownership, on slow path might be more work.
Current this path is safely avoided only when inode mapping has no pages.

For example generic_file_read_iter() calls filemap_write_and_wait_range()
at each O_DIRECT read - pretty hot path.

This patch skips starting new writeback if mapping has no dirty tags set.
If writeback is already in progress filemap_write_and_wait_range() will
wait for it.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 mm/filemap.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d0cf700bf201..d9572593e5c7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
 		.range_end = end,
 	};
 
-	if (!mapping_cap_writeback_dirty(mapping))
+	if (!mapping_cap_writeback_dirty(mapping) ||
+	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
 		return 0;
 
 	wbc_attach_fdatawrite_inode(&wbc, mapping->host);


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] mm/filemap: rewrite mapping_needs_writeback in less fancy manner
  2019-07-22  9:36 [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Konstantin Khlebnikov
@ 2019-07-22  9:36 ` Konstantin Khlebnikov
  2019-07-23  0:52 ` [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Andrew Morton
  1 sibling, 0 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2019-07-22  9:36 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Tejun Heo, Jens Axboe, Johannes Weiner

This actually checks that writeback is needed or in progress.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 mm/filemap.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d9572593e5c7..29f503ffd70b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -618,10 +618,13 @@ int filemap_fdatawait_keep_errors(struct address_space *mapping)
 }
 EXPORT_SYMBOL(filemap_fdatawait_keep_errors);
 
+/* Returns true if writeback might be needed or already in progress. */
 static bool mapping_needs_writeback(struct address_space *mapping)
 {
-	return (!dax_mapping(mapping) && mapping->nrpages) ||
-	    (dax_mapping(mapping) && mapping->nrexceptional);
+	if (dax_mapping(mapping))
+		return mapping->nrexceptional;
+
+	return mapping->nrpages;
 }
 
 int filemap_write_and_wait(struct address_space *mapping)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
  2019-07-22  9:36 [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Konstantin Khlebnikov
  2019-07-22  9:36 ` [PATCH 2/2] mm/filemap: rewrite mapping_needs_writeback in less fancy manner Konstantin Khlebnikov
@ 2019-07-23  0:52 ` Andrew Morton
  2019-07-23  8:16   ` Konstantin Khlebnikov
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2019-07-23  0:52 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, linux-kernel, Tejun Heo, Jens Axboe, Johannes Weiner,
	linux-fsdevel, Jan Kara


(cc linux-fsdevel and Jan)

On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:

> Functions like filemap_write_and_wait_range() should do nothing if inode
> has no dirty pages or pages currently under writeback. But they anyway
> construct struct writeback_control and this does some atomic operations
> if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
> updates state of writeback ownership, on slow path might be more work.
> Current this path is safely avoided only when inode mapping has no pages.
> 
> For example generic_file_read_iter() calls filemap_write_and_wait_range()
> at each O_DIRECT read - pretty hot path.
> 
> This patch skips starting new writeback if mapping has no dirty tags set.
> If writeback is already in progress filemap_write_and_wait_range() will
> wait for it.
> 
> ...
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>  		.range_end = end,
>  	};
>  
> -	if (!mapping_cap_writeback_dirty(mapping))
> +	if (!mapping_cap_writeback_dirty(mapping) ||
> +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>  		return 0;
>  
>  	wbc_attach_fdatawrite_inode(&wbc, mapping->host);

How does this play with tagged_writepages?  We assume that no tagging
has been performed by any __filemap_fdatawrite_range() caller?


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
  2019-07-23  0:52 ` [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Andrew Morton
@ 2019-07-23  8:16   ` Konstantin Khlebnikov
  2019-07-30 14:14     ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: Konstantin Khlebnikov @ 2019-07-23  8:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Tejun Heo, Jens Axboe, Johannes Weiner,
	linux-fsdevel, Jan Kara



On 23.07.2019 3:52, Andrew Morton wrote:
> 
> (cc linux-fsdevel and Jan)
> 
> On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
> 
>> Functions like filemap_write_and_wait_range() should do nothing if inode
>> has no dirty pages or pages currently under writeback. But they anyway
>> construct struct writeback_control and this does some atomic operations
>> if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
>> updates state of writeback ownership, on slow path might be more work.
>> Current this path is safely avoided only when inode mapping has no pages.
>>
>> For example generic_file_read_iter() calls filemap_write_and_wait_range()
>> at each O_DIRECT read - pretty hot path.
>>
>> This patch skips starting new writeback if mapping has no dirty tags set.
>> If writeback is already in progress filemap_write_and_wait_range() will
>> wait for it.
>>
>> ...
>>
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>>   		.range_end = end,
>>   	};
>>   
>> -	if (!mapping_cap_writeback_dirty(mapping))
>> +	if (!mapping_cap_writeback_dirty(mapping) ||
>> +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>>   		return 0;
>>   
>>   	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
> 
> How does this play with tagged_writepages?  We assume that no tagging
> has been performed by any __filemap_fdatawrite_range() caller?
>

Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.

To-write tags are supposed to be a subset of dirty tags:
to-write is set only when dirty is set and cleared after starting writeback.

Special case set_page_writeback_keepwrite() which does not clear to-write
should be for dirty page thus dirty tag is not going to be cleared either.
Ext4 calls it after redirty_page_for_writepage()
XFS even without clear_page_dirty_for_io()

Anyway to-write tag without dirty tag or at clear page is confusing.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
  2019-07-23  8:16   ` Konstantin Khlebnikov
@ 2019-07-30 14:14     ` Jan Kara
  2019-07-30 14:57       ` Konstantin Khlebnikov
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2019-07-30 14:14 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Andrew Morton, linux-mm, linux-kernel, Tejun Heo, Jens Axboe,
	Johannes Weiner, linux-fsdevel, Jan Kara

On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
> On 23.07.2019 3:52, Andrew Morton wrote:
> > 
> > (cc linux-fsdevel and Jan)

Thanks for CC Andrew.

> > On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
> > 
> > > Functions like filemap_write_and_wait_range() should do nothing if inode
> > > has no dirty pages or pages currently under writeback. But they anyway
> > > construct struct writeback_control and this does some atomic operations
> > > if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
> > > updates state of writeback ownership, on slow path might be more work.
> > > Current this path is safely avoided only when inode mapping has no pages.
> > > 
> > > For example generic_file_read_iter() calls filemap_write_and_wait_range()
> > > at each O_DIRECT read - pretty hot path.

Yes, but in common case mapping_needs_writeback() is false for files you do
direct IO to (exactly the case with no pages in the mapping). So you
shouldn't see the overhead at all. So which case you really care about?

> > > This patch skips starting new writeback if mapping has no dirty tags set.
> > > If writeback is already in progress filemap_write_and_wait_range() will
> > > wait for it.
> > > 
> > > ...
> > > 
> > > --- a/mm/filemap.c
> > > +++ b/mm/filemap.c
> > > @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
> > >   		.range_end = end,
> > >   	};
> > > -	if (!mapping_cap_writeback_dirty(mapping))
> > > +	if (!mapping_cap_writeback_dirty(mapping) ||
> > > +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> > >   		return 0;
> > >   	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
> > 
> > How does this play with tagged_writepages?  We assume that no tagging
> > has been performed by any __filemap_fdatawrite_range() caller?
> > 
> 
> Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
> 
> To-write tags are supposed to be a subset of dirty tags:
> to-write is set only when dirty is set and cleared after starting writeback.
> 
> Special case set_page_writeback_keepwrite() which does not clear to-write
> should be for dirty page thus dirty tag is not going to be cleared either.
> Ext4 calls it after redirty_page_for_writepage()
> XFS even without clear_page_dirty_for_io()
> 
> Anyway to-write tag without dirty tag or at clear page is confusing.

Yeah, TOWRITE tag is intended to be internal to writepages logic so your
patch is fine in that regard. Overall the patch looks good to me so I'm
just wondering a bit about the motivation...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
  2019-07-30 14:14     ` Jan Kara
@ 2019-07-30 14:57       ` Konstantin Khlebnikov
  2019-07-30 15:48         ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: Konstantin Khlebnikov @ 2019-07-30 14:57 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, linux-mm, linux-kernel, Tejun Heo, Jens Axboe,
	Johannes Weiner, linux-fsdevel

On 30.07.2019 17:14, Jan Kara wrote:
> On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
>> On 23.07.2019 3:52, Andrew Morton wrote:
>>>
>>> (cc linux-fsdevel and Jan)
> 
> Thanks for CC Andrew.
> 
>>> On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
>>>
>>>> Functions like filemap_write_and_wait_range() should do nothing if inode
>>>> has no dirty pages or pages currently under writeback. But they anyway
>>>> construct struct writeback_control and this does some atomic operations
>>>> if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
>>>> updates state of writeback ownership, on slow path might be more work.
>>>> Current this path is safely avoided only when inode mapping has no pages.
>>>>
>>>> For example generic_file_read_iter() calls filemap_write_and_wait_range()
>>>> at each O_DIRECT read - pretty hot path.
> 
> Yes, but in common case mapping_needs_writeback() is false for files you do
> direct IO to (exactly the case with no pages in the mapping). So you
> shouldn't see the overhead at all. So which case you really care about?
> 
>>>> This patch skips starting new writeback if mapping has no dirty tags set.
>>>> If writeback is already in progress filemap_write_and_wait_range() will
>>>> wait for it.
>>>>
>>>> ...
>>>>
>>>> --- a/mm/filemap.c
>>>> +++ b/mm/filemap.c
>>>> @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>>>>    		.range_end = end,
>>>>    	};
>>>> -	if (!mapping_cap_writeback_dirty(mapping))
>>>> +	if (!mapping_cap_writeback_dirty(mapping) ||
>>>> +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>>>>    		return 0;
>>>>    	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
>>>
>>> How does this play with tagged_writepages?  We assume that no tagging
>>> has been performed by any __filemap_fdatawrite_range() caller?
>>>
>>
>> Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
>>
>> To-write tags are supposed to be a subset of dirty tags:
>> to-write is set only when dirty is set and cleared after starting writeback.
>>
>> Special case set_page_writeback_keepwrite() which does not clear to-write
>> should be for dirty page thus dirty tag is not going to be cleared either.
>> Ext4 calls it after redirty_page_for_writepage()
>> XFS even without clear_page_dirty_for_io()
>>
>> Anyway to-write tag without dirty tag or at clear page is confusing.
> 
> Yeah, TOWRITE tag is intended to be internal to writepages logic so your
> patch is fine in that regard. Overall the patch looks good to me so I'm
> just wondering a bit about the motivation...

In our case file mixes cached pages and O_DIRECT read. Kind of database
were index header is memory mapped while the rest data read via O_DIRECT.
I suppose for sharing index between multiple instances.

On this path we also hit this bug:
https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
so that's why I've started looking into this code.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
  2019-07-30 14:57       ` Konstantin Khlebnikov
@ 2019-07-30 15:48         ` Jan Kara
  2019-07-30 18:15           ` Konstantin Khlebnikov
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2019-07-30 15:48 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Jan Kara, Andrew Morton, linux-mm, linux-kernel, Tejun Heo,
	Jens Axboe, Johannes Weiner, linux-fsdevel

On Tue 30-07-19 17:57:18, Konstantin Khlebnikov wrote:
> On 30.07.2019 17:14, Jan Kara wrote:
> > On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
> > > On 23.07.2019 3:52, Andrew Morton wrote:
> > > > 
> > > > (cc linux-fsdevel and Jan)
> > 
> > Thanks for CC Andrew.
> > 
> > > > On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
> > > > 
> > > > > Functions like filemap_write_and_wait_range() should do nothing if inode
> > > > > has no dirty pages or pages currently under writeback. But they anyway
> > > > > construct struct writeback_control and this does some atomic operations
> > > > > if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
> > > > > updates state of writeback ownership, on slow path might be more work.
> > > > > Current this path is safely avoided only when inode mapping has no pages.
> > > > > 
> > > > > For example generic_file_read_iter() calls filemap_write_and_wait_range()
> > > > > at each O_DIRECT read - pretty hot path.
> > 
> > Yes, but in common case mapping_needs_writeback() is false for files you do
> > direct IO to (exactly the case with no pages in the mapping). So you
> > shouldn't see the overhead at all. So which case you really care about?
> > 
> > > > > This patch skips starting new writeback if mapping has no dirty tags set.
> > > > > If writeback is already in progress filemap_write_and_wait_range() will
> > > > > wait for it.
> > > > > 
> > > > > ...
> > > > > 
> > > > > --- a/mm/filemap.c
> > > > > +++ b/mm/filemap.c
> > > > > @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
> > > > >    		.range_end = end,
> > > > >    	};
> > > > > -	if (!mapping_cap_writeback_dirty(mapping))
> > > > > +	if (!mapping_cap_writeback_dirty(mapping) ||
> > > > > +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> > > > >    		return 0;
> > > > >    	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
> > > > 
> > > > How does this play with tagged_writepages?  We assume that no tagging
> > > > has been performed by any __filemap_fdatawrite_range() caller?
> > > > 
> > > 
> > > Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
> > > 
> > > To-write tags are supposed to be a subset of dirty tags:
> > > to-write is set only when dirty is set and cleared after starting writeback.
> > > 
> > > Special case set_page_writeback_keepwrite() which does not clear to-write
> > > should be for dirty page thus dirty tag is not going to be cleared either.
> > > Ext4 calls it after redirty_page_for_writepage()
> > > XFS even without clear_page_dirty_for_io()
> > > 
> > > Anyway to-write tag without dirty tag or at clear page is confusing.
> > 
> > Yeah, TOWRITE tag is intended to be internal to writepages logic so your
> > patch is fine in that regard. Overall the patch looks good to me so I'm
> > just wondering a bit about the motivation...
> 
> In our case file mixes cached pages and O_DIRECT read. Kind of database
> were index header is memory mapped while the rest data read via O_DIRECT.
> I suppose for sharing index between multiple instances.

OK, that has always been a bit problematic but you're not the first one to
have such design ;). So feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

to your patch.

> On this path we also hit this bug:
> https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
> so that's why I've started looking into this code.

I see. OK.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages
  2019-07-30 15:48         ` Jan Kara
@ 2019-07-30 18:15           ` Konstantin Khlebnikov
  0 siblings, 0 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2019-07-30 18:15 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, linux-mm, linux-kernel, Tejun Heo, Jens Axboe,
	Johannes Weiner, linux-fsdevel



On 30.07.2019 18:48, Jan Kara wrote:
> On Tue 30-07-19 17:57:18, Konstantin Khlebnikov wrote:
>> On 30.07.2019 17:14, Jan Kara wrote:
>>> On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
>>>> On 23.07.2019 3:52, Andrew Morton wrote:
>>>>>
>>>>> (cc linux-fsdevel and Jan)
>>>
>>> Thanks for CC Andrew.
>>>
>>>>> On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:
>>>>>
>>>>>> Functions like filemap_write_and_wait_range() should do nothing if inode
>>>>>> has no dirty pages or pages currently under writeback. But they anyway
>>>>>> construct struct writeback_control and this does some atomic operations
>>>>>> if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
>>>>>> updates state of writeback ownership, on slow path might be more work.
>>>>>> Current this path is safely avoided only when inode mapping has no pages.
>>>>>>
>>>>>> For example generic_file_read_iter() calls filemap_write_and_wait_range()
>>>>>> at each O_DIRECT read - pretty hot path.
>>>
>>> Yes, but in common case mapping_needs_writeback() is false for files you do
>>> direct IO to (exactly the case with no pages in the mapping). So you
>>> shouldn't see the overhead at all. So which case you really care about?
>>>
>>>>>> This patch skips starting new writeback if mapping has no dirty tags set.
>>>>>> If writeback is already in progress filemap_write_and_wait_range() will
>>>>>> wait for it.
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> --- a/mm/filemap.c
>>>>>> +++ b/mm/filemap.c
>>>>>> @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>>>>>>     		.range_end = end,
>>>>>>     	};
>>>>>> -	if (!mapping_cap_writeback_dirty(mapping))
>>>>>> +	if (!mapping_cap_writeback_dirty(mapping) ||
>>>>>> +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>>>>>>     		return 0;
>>>>>>     	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
>>>>>
>>>>> How does this play with tagged_writepages?  We assume that no tagging
>>>>> has been performed by any __filemap_fdatawrite_range() caller?
>>>>>
>>>>
>>>> Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
>>>>
>>>> To-write tags are supposed to be a subset of dirty tags:
>>>> to-write is set only when dirty is set and cleared after starting writeback.
>>>>
>>>> Special case set_page_writeback_keepwrite() which does not clear to-write
>>>> should be for dirty page thus dirty tag is not going to be cleared either.
>>>> Ext4 calls it after redirty_page_for_writepage()
>>>> XFS even without clear_page_dirty_for_io()
>>>>
>>>> Anyway to-write tag without dirty tag or at clear page is confusing.
>>>
>>> Yeah, TOWRITE tag is intended to be internal to writepages logic so your
>>> patch is fine in that regard. Overall the patch looks good to me so I'm
>>> just wondering a bit about the motivation...
>>
>> In our case file mixes cached pages and O_DIRECT read. Kind of database
>> were index header is memory mapped while the rest data read via O_DIRECT.
>> I suppose for sharing index between multiple instances.
> 
> OK, that has always been a bit problematic but you're not the first one to
> have such design ;). So feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> to your patch.

Thanks.

O_DIRECT has long history of misunderstandings =)
It looks some cases are still not documented.
My favourite: O_DIRECT write into hole goes into cache, at least for ext4.

> 
>> On this path we also hit this bug:
>> https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
>> so that's why I've started looking into this code.
> 
> I see. OK.
> 
> 								Honza
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-07-30 18:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-22  9:36 [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Konstantin Khlebnikov
2019-07-22  9:36 ` [PATCH 2/2] mm/filemap: rewrite mapping_needs_writeback in less fancy manner Konstantin Khlebnikov
2019-07-23  0:52 ` [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages Andrew Morton
2019-07-23  8:16   ` Konstantin Khlebnikov
2019-07-30 14:14     ` Jan Kara
2019-07-30 14:57       ` Konstantin Khlebnikov
2019-07-30 15:48         ` Jan Kara
2019-07-30 18:15           ` Konstantin Khlebnikov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.