[PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
@ 2018-12-04  2:08 Hou Tao
  2018-12-06  1:14 ` Hou Tao
  2018-12-15 14:38 ` Matthew Wilcox
  0 siblings, 2 replies; 15+ messages in thread
From: Hou Tao @ 2018-12-04  2:08 UTC (permalink / raw)
  To: phillip; +Cc: linux-fsdevel, linux-kernel, linux-mm

There is no need to disable __GFP_FS in ->readpage:
* It's a read-only fs, so there will be no dirty/writeback page and
  there will be no deadlock against the caller's locked page
* It just allocates one page, so compaction will not be invoked
* It doesn't take any inode lock, so the reclamation of inode will be fine

And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
squashfs page fault occurs in the context of a memory hogger, because
the hogger will not be killed due to the logic in __alloc_pages_may_oom().

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 fs/squashfs/file.c          |  3 ++-
 fs/squashfs/file_direct.c   |  4 +++-
 fs/squashfs/squashfs_fs_f.h | 25 +++++++++++++++++++++++++
 3 files changed, 30 insertions(+), 2 deletions(-)
 create mode 100644 fs/squashfs/squashfs_fs_f.h

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index f1c1430ae721..8603dda4a719 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -51,6 +51,7 @@
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
 #include "squashfs_fs_i.h"
+#include "squashfs_fs_f.h"
 #include "squashfs.h"
 
 /*
@@ -414,7 +415,7 @@ void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
 		TRACE("bytes %d, i %d, available_bytes %d\n", bytes, i, avail);
 
 		push_page = (i == page->index) ? page :
-			grab_cache_page_nowait(page->mapping, i);
+			squashfs_grab_cache_page_nowait(page->mapping, i);
 
 		if (!push_page)
 			continue;
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
index 80db1b86a27c..a0fdd6215348 100644
--- a/fs/squashfs/file_direct.c
+++ b/fs/squashfs/file_direct.c
@@ -17,6 +17,7 @@
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
 #include "squashfs_fs_i.h"
+#include "squashfs_fs_f.h"
 #include "squashfs.h"
 #include "page_actor.h"
 
@@ -60,7 +61,8 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
 	/* Try to grab all the pages covered by the Squashfs block */
 	for (missing_pages = 0, i = 0, n = start_index; i < pages; i++, n++) {
 		page[i] = (n == target_page->index) ? target_page :
-			grab_cache_page_nowait(target_page->mapping, n);
+			squashfs_grab_cache_page_nowait(
+					target_page->mapping, n);
 
 		if (page[i] == NULL) {
 			missing_pages++;
diff --git a/fs/squashfs/squashfs_fs_f.h b/fs/squashfs/squashfs_fs_f.h
new file mode 100644
index 000000000000..fc5fb7aeb27d
--- /dev/null
+++ b/fs/squashfs/squashfs_fs_f.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef SQUASHFS_FS_F
+#define SQUASHFS_FS_F
+
+/*
+ * No need to use FGP_NOFS here:
+ * 1. It's a read-only fs, so there will be no dirty/writeback page and
+ *    there will be no deadlock against the caller's locked page.
+ * 2. It just allocates one page, so compaction will not be invoked.
+ * 3. It doesn't take any inode lock, so the reclamation of inode
+ *    will be fine.
+ *
+ * And GFP_NOFS may lead to infinite loop in __alloc_pages_slowpath() if a
+ * squashfs page fault occurs in the context of a memory hogger, because
+ * the hogger will not be killed due to the logic in __alloc_pages_may_oom().
+ */
+static inline struct page *
+squashfs_grab_cache_page_nowait(struct address_space *mapping, pgoff_t index)
+{
+	return pagecache_get_page(mapping, index,
+			FGP_LOCK|FGP_CREAT|FGP_NOWAIT,
+			mapping_gfp_mask(mapping));
+}
+#endif
+
-- 
2.16.2.dirty


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-04  2:08 [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc Hou Tao
@ 2018-12-06  1:14 ` Hou Tao
  2018-12-13  2:18   ` Hou Tao
  2018-12-15 14:38 ` Matthew Wilcox
  1 sibling, 1 reply; 15+ messages in thread
From: Hou Tao @ 2018-12-06  1:14 UTC (permalink / raw)
  To: phillip; +Cc: linux-fsdevel, linux-kernel, linux-mm

ping ?

On 2018/12/4 10:08, Hou Tao wrote:
> There is no need to disable __GFP_FS in ->readpage:
> * It's a read-only fs, so there will be no dirty/writeback page and
>   there will be no deadlock against the caller's locked page
> * It just allocates one page, so compaction will not be invoked
> * It doesn't take any inode lock, so the reclamation of inode will be fine
> 
> And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
> squashfs page fault occurs in the context of a memory hogger, because
> the hogger will not be killed due to the logic in __alloc_pages_may_oom().
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  fs/squashfs/file.c          |  3 ++-
>  fs/squashfs/file_direct.c   |  4 +++-
>  fs/squashfs/squashfs_fs_f.h | 25 +++++++++++++++++++++++++
>  3 files changed, 30 insertions(+), 2 deletions(-)
>  create mode 100644 fs/squashfs/squashfs_fs_f.h
> 
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index f1c1430ae721..8603dda4a719 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -51,6 +51,7 @@
>  #include "squashfs_fs.h"
>  #include "squashfs_fs_sb.h"
>  #include "squashfs_fs_i.h"
> +#include "squashfs_fs_f.h"
>  #include "squashfs.h"
>  
>  /*
> @@ -414,7 +415,7 @@ void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
>  		TRACE("bytes %d, i %d, available_bytes %d\n", bytes, i, avail);
>  
>  		push_page = (i == page->index) ? page :
> -			grab_cache_page_nowait(page->mapping, i);
> +			squashfs_grab_cache_page_nowait(page->mapping, i);
>  
>  		if (!push_page)
>  			continue;
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> index 80db1b86a27c..a0fdd6215348 100644
> --- a/fs/squashfs/file_direct.c
> +++ b/fs/squashfs/file_direct.c
> @@ -17,6 +17,7 @@
>  #include "squashfs_fs.h"
>  #include "squashfs_fs_sb.h"
>  #include "squashfs_fs_i.h"
> +#include "squashfs_fs_f.h"
>  #include "squashfs.h"
>  #include "page_actor.h"
>  
> @@ -60,7 +61,8 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>  	/* Try to grab all the pages covered by the Squashfs block */
>  	for (missing_pages = 0, i = 0, n = start_index; i < pages; i++, n++) {
>  		page[i] = (n == target_page->index) ? target_page :
> -			grab_cache_page_nowait(target_page->mapping, n);
> +			squashfs_grab_cache_page_nowait(
> +					target_page->mapping, n);
>  
>  		if (page[i] == NULL) {
>  			missing_pages++;
> diff --git a/fs/squashfs/squashfs_fs_f.h b/fs/squashfs/squashfs_fs_f.h
> new file mode 100644
> index 000000000000..fc5fb7aeb27d
> --- /dev/null
> +++ b/fs/squashfs/squashfs_fs_f.h
> @@ -0,0 +1,25 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef SQUASHFS_FS_F
> +#define SQUASHFS_FS_F
> +
> +/*
> + * No need to use FGP_NOFS here:
> + * 1. It's a read-only fs, so there will be no dirty/writeback page and
> + *    there will be no deadlock against the caller's locked page.
> + * 2. It just allocates one page, so compaction will not be invoked.
> + * 3. It doesn't take any inode lock, so the reclamation of inode
> + *    will be fine.
> + *
> + * And GFP_NOFS may lead to infinite loop in __alloc_pages_slowpath() if a
> + * squashfs page fault occurs in the context of a memory hogger, because
> + * the hogger will not be killed due to the logic in __alloc_pages_may_oom().
> + */
> +static inline struct page *
> +squashfs_grab_cache_page_nowait(struct address_space *mapping, pgoff_t index)
> +{
> +	return pagecache_get_page(mapping, index,
> +			FGP_LOCK|FGP_CREAT|FGP_NOWAIT,
> +			mapping_gfp_mask(mapping));
> +}
> +#endif
> +
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-06  1:14 ` Hou Tao
@ 2018-12-13  2:18   ` Hou Tao
  2018-12-15 13:24     ` Hou Tao
  0 siblings, 1 reply; 15+ messages in thread
From: Hou Tao @ 2018-12-13  2:18 UTC (permalink / raw)
  To: phillip; +Cc: linux-fsdevel, linux-kernel, linux-mm

ping ?

On 2018/12/6 9:14, Hou Tao wrote:
> ping ?
> 
> On 2018/12/4 10:08, Hou Tao wrote:
>> There is no need to disable __GFP_FS in ->readpage:
>> * It's a read-only fs, so there will be no dirty/writeback page and
>>   there will be no deadlock against the caller's locked page
>> * It just allocates one page, so compaction will not be invoked
>> * It doesn't take any inode lock, so the reclamation of inode will be fine
>>
>> And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
>> squashfs page fault occurs in the context of a memory hogger, because
>> the hogger will not be killed due to the logic in __alloc_pages_may_oom().
>>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>> ---
>>  fs/squashfs/file.c          |  3 ++-
>>  fs/squashfs/file_direct.c   |  4 +++-
>>  fs/squashfs/squashfs_fs_f.h | 25 +++++++++++++++++++++++++
>>  3 files changed, 30 insertions(+), 2 deletions(-)
>>  create mode 100644 fs/squashfs/squashfs_fs_f.h
>>
>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
>> index f1c1430ae721..8603dda4a719 100644
>> --- a/fs/squashfs/file.c
>> +++ b/fs/squashfs/file.c
>> @@ -51,6 +51,7 @@
>>  #include "squashfs_fs.h"
>>  #include "squashfs_fs_sb.h"
>>  #include "squashfs_fs_i.h"
>> +#include "squashfs_fs_f.h"
>>  #include "squashfs.h"
>>  
>>  /*
>> @@ -414,7 +415,7 @@ void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
>>  		TRACE("bytes %d, i %d, available_bytes %d\n", bytes, i, avail);
>>  
>>  		push_page = (i == page->index) ? page :
>> -			grab_cache_page_nowait(page->mapping, i);
>> +			squashfs_grab_cache_page_nowait(page->mapping, i);
>>  
>>  		if (!push_page)
>>  			continue;
>> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
>> index 80db1b86a27c..a0fdd6215348 100644
>> --- a/fs/squashfs/file_direct.c
>> +++ b/fs/squashfs/file_direct.c
>> @@ -17,6 +17,7 @@
>>  #include "squashfs_fs.h"
>>  #include "squashfs_fs_sb.h"
>>  #include "squashfs_fs_i.h"
>> +#include "squashfs_fs_f.h"
>>  #include "squashfs.h"
>>  #include "page_actor.h"
>>  
>> @@ -60,7 +61,8 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>>  	/* Try to grab all the pages covered by the Squashfs block */
>>  	for (missing_pages = 0, i = 0, n = start_index; i < pages; i++, n++) {
>>  		page[i] = (n == target_page->index) ? target_page :
>> -			grab_cache_page_nowait(target_page->mapping, n);
>> +			squashfs_grab_cache_page_nowait(
>> +					target_page->mapping, n);
>>  
>>  		if (page[i] == NULL) {
>>  			missing_pages++;
>> diff --git a/fs/squashfs/squashfs_fs_f.h b/fs/squashfs/squashfs_fs_f.h
>> new file mode 100644
>> index 000000000000..fc5fb7aeb27d
>> --- /dev/null
>> +++ b/fs/squashfs/squashfs_fs_f.h
>> @@ -0,0 +1,25 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef SQUASHFS_FS_F
>> +#define SQUASHFS_FS_F
>> +
>> +/*
>> + * No need to use FGP_NOFS here:
>> + * 1. It's a read-only fs, so there will be no dirty/writeback page and
>> + *    there will be no deadlock against the caller's locked page.
>> + * 2. It just allocates one page, so compaction will not be invoked.
>> + * 3. It doesn't take any inode lock, so the reclamation of inode
>> + *    will be fine.
>> + *
>> + * And GFP_NOFS may lead to infinite loop in __alloc_pages_slowpath() if a
>> + * squashfs page fault occurs in the context of a memory hogger, because
>> + * the hogger will not be killed due to the logic in __alloc_pages_may_oom().
>> + */
>> +static inline struct page *
>> +squashfs_grab_cache_page_nowait(struct address_space *mapping, pgoff_t index)
>> +{
>> +	return pagecache_get_page(mapping, index,
>> +			FGP_LOCK|FGP_CREAT|FGP_NOWAIT,
>> +			mapping_gfp_mask(mapping));
>> +}
>> +#endif
>> +
>>
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-13  2:18   ` Hou Tao
@ 2018-12-15 13:24     ` Hou Tao
  0 siblings, 0 replies; 15+ messages in thread
From: Hou Tao @ 2018-12-15 13:24 UTC (permalink / raw)
  To: phillip; +Cc: linux-fsdevel, linux-kernel, linux-mm

ping ?

On 2018/12/13 10:18, Hou Tao wrote:
> ping ?
> 
> On 2018/12/6 9:14, Hou Tao wrote:
>> ping ?
>>
>> On 2018/12/4 10:08, Hou Tao wrote:
>>> There is no need to disable __GFP_FS in ->readpage:
>>> * It's a read-only fs, so there will be no dirty/writeback page and
>>>   there will be no deadlock against the caller's locked page
>>> * It just allocates one page, so compaction will not be invoked
>>> * It doesn't take any inode lock, so the reclamation of inode will be fine
>>>
>>> And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
>>> squashfs page fault occurs in the context of a memory hogger, because
>>> the hogger will not be killed due to the logic in __alloc_pages_may_oom().
>>>
>>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>> ---
>>>  fs/squashfs/file.c          |  3 ++-
>>>  fs/squashfs/file_direct.c   |  4 +++-
>>>  fs/squashfs/squashfs_fs_f.h | 25 +++++++++++++++++++++++++
>>>  3 files changed, 30 insertions(+), 2 deletions(-)
>>>  create mode 100644 fs/squashfs/squashfs_fs_f.h
>>>
>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
>>> index f1c1430ae721..8603dda4a719 100644
>>> --- a/fs/squashfs/file.c
>>> +++ b/fs/squashfs/file.c
>>> @@ -51,6 +51,7 @@
>>>  #include "squashfs_fs.h"
>>>  #include "squashfs_fs_sb.h"
>>>  #include "squashfs_fs_i.h"
>>> +#include "squashfs_fs_f.h"
>>>  #include "squashfs.h"
>>>  
>>>  /*
>>> @@ -414,7 +415,7 @@ void squashfs_copy_cache(struct page *page, struct squashfs_cache_entry *buffer,
>>>  		TRACE("bytes %d, i %d, available_bytes %d\n", bytes, i, avail);
>>>  
>>>  		push_page = (i == page->index) ? page :
>>> -			grab_cache_page_nowait(page->mapping, i);
>>> +			squashfs_grab_cache_page_nowait(page->mapping, i);
>>>  
>>>  		if (!push_page)
>>>  			continue;
>>> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
>>> index 80db1b86a27c..a0fdd6215348 100644
>>> --- a/fs/squashfs/file_direct.c
>>> +++ b/fs/squashfs/file_direct.c
>>> @@ -17,6 +17,7 @@
>>>  #include "squashfs_fs.h"
>>>  #include "squashfs_fs_sb.h"
>>>  #include "squashfs_fs_i.h"
>>> +#include "squashfs_fs_f.h"
>>>  #include "squashfs.h"
>>>  #include "page_actor.h"
>>>  
>>> @@ -60,7 +61,8 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>>>  	/* Try to grab all the pages covered by the Squashfs block */
>>>  	for (missing_pages = 0, i = 0, n = start_index; i < pages; i++, n++) {
>>>  		page[i] = (n == target_page->index) ? target_page :
>>> -			grab_cache_page_nowait(target_page->mapping, n);
>>> +			squashfs_grab_cache_page_nowait(
>>> +					target_page->mapping, n);
>>>  
>>>  		if (page[i] == NULL) {
>>>  			missing_pages++;
>>> diff --git a/fs/squashfs/squashfs_fs_f.h b/fs/squashfs/squashfs_fs_f.h
>>> new file mode 100644
>>> index 000000000000..fc5fb7aeb27d
>>> --- /dev/null
>>> +++ b/fs/squashfs/squashfs_fs_f.h
>>> @@ -0,0 +1,25 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef SQUASHFS_FS_F
>>> +#define SQUASHFS_FS_F
>>> +
>>> +/*
>>> + * No need to use FGP_NOFS here:
>>> + * 1. It's a read-only fs, so there will be no dirty/writeback page and
>>> + *    there will be no deadlock against the caller's locked page.
>>> + * 2. It just allocates one page, so compaction will not be invoked.
>>> + * 3. It doesn't take any inode lock, so the reclamation of inode
>>> + *    will be fine.
>>> + *
>>> + * And GFP_NOFS may lead to infinite loop in __alloc_pages_slowpath() if a
>>> + * squashfs page fault occurs in the context of a memory hogger, because
>>> + * the hogger will not be killed due to the logic in __alloc_pages_may_oom().
>>> + */
>>> +static inline struct page *
>>> +squashfs_grab_cache_page_nowait(struct address_space *mapping, pgoff_t index)
>>> +{
>>> +	return pagecache_get_page(mapping, index,
>>> +			FGP_LOCK|FGP_CREAT|FGP_NOWAIT,
>>> +			mapping_gfp_mask(mapping));
>>> +}
>>> +#endif
>>> +
>>>
>>
>>
>> .
>>
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-04  2:08 [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc Hou Tao
  2018-12-06  1:14 ` Hou Tao
@ 2018-12-15 14:38 ` Matthew Wilcox
  2018-12-16  9:38   ` Hou Tao
  1 sibling, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2018-12-15 14:38 UTC (permalink / raw)
  To: Hou Tao; +Cc: phillip, linux-fsdevel, linux-kernel, linux-mm

On Tue, Dec 04, 2018 at 10:08:40AM +0800, Hou Tao wrote:
> There is no need to disable __GFP_FS in ->readpage:
> * It's a read-only fs, so there will be no dirty/writeback page and
>   there will be no deadlock against the caller's locked page
> * It just allocates one page, so compaction will not be invoked
> * It doesn't take any inode lock, so the reclamation of inode will be fine
> 
> And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
> squashfs page fault occurs in the context of a memory hogger, because
> the hogger will not be killed due to the logic in __alloc_pages_may_oom().

I don't understand your argument here.  There's a comment in
__alloc_pages_may_oom() saying that we _should_ treat GFP_NOFS
specially, but we currently don't.

        /*
         * XXX: GFP_NOFS allocations should rather fail than rely on
         * other request to make a forward progress.
         * We are in an unfortunate situation where out_of_memory cannot
         * do much for this context but let's try it to at least get
         * access to memory reserved if the current task is killed (see
         * out_of_memory). Once filesystems are ready to handle allocation
         * failures more gracefully we should just bail out here.
         */

What problem are you actually seeing?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-15 14:38 ` Matthew Wilcox
@ 2018-12-16  9:38   ` Hou Tao
  2018-12-17  3:51     ` Matthew Wilcox
  0 siblings, 1 reply; 15+ messages in thread
From: Hou Tao @ 2018-12-16  9:38 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: phillip, linux-fsdevel, linux-kernel, linux-mm

Hi,

On 2018/12/15 22:38, Matthew Wilcox wrote:
> On Tue, Dec 04, 2018 at 10:08:40AM +0800, Hou Tao wrote:
>> There is no need to disable __GFP_FS in ->readpage:
>> * It's a read-only fs, so there will be no dirty/writeback page and
>>   there will be no deadlock against the caller's locked page
>> * It just allocates one page, so compaction will not be invoked
>> * It doesn't take any inode lock, so the reclamation of inode will be fine
>>
>> And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
>> squashfs page fault occurs in the context of a memory hogger, because
>> the hogger will not be killed due to the logic in __alloc_pages_may_oom().
> 
> I don't understand your argument here.  There's a comment in
> __alloc_pages_may_oom() saying that we _should_ treat GFP_NOFS
> specially, but we currently don't.
I am trying to say that if __GFP_FS is used in pagecache_get_page() when it tries
to allocate a new page for squashfs, that will be no possibility of dead-lock for
squashfs.

We do treat GFP_NOFS specially in out_of_memory():

    /*
     * The OOM killer does not compensate for IO-less reclaim.
     * pagefault_out_of_memory lost its gfp context so we have to
     * make sure exclude 0 mask - all other users should have at least
     * ___GFP_DIRECT_RECLAIM to get here.
     */
    if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
        return true;

So if GFP_FS is used, no task will be killed because we will return from
out_of_memory() prematurely. And that will lead to an infinite loop in
__alloc_pages_slowpath() as we have observed:

* a squashfs page fault occurred in the context of a memory hogger
* the page used for page fault allocated successfully
* in squashfs_readpage() squashfs will try to allocate other pages
  in the same 128KB block, and __GFP_NOFS is used (actually GFP_HIGHUSER_MOVABLE & ~__GFP_FS)
* in __alloc_pages_slowpath() we can not get any pages through reclamation
  (because most of memory is used by the current task) and we also can not kill
  the current task (due to __GFP_NOFS), and it will loop forever until it's killed.

> 
>         /*
>          * XXX: GFP_NOFS allocations should rather fail than rely on
>          * other request to make a forward progress.
>          * We are in an unfortunate situation where out_of_memory cannot
>          * do much for this context but let's try it to at least get
>          * access to memory reserved if the current task is killed (see
>          * out_of_memory). Once filesystems are ready to handle allocation
>          * failures more gracefully we should just bail out here.
>          */
> 
> What problem are you actually seeing?
> 
> .
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-16  9:38   ` Hou Tao
@ 2018-12-17  3:51     ` Matthew Wilcox
  2018-12-17  9:33       ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2018-12-17  3:51 UTC (permalink / raw)
  To: Hou Tao; +Cc: phillip, linux-fsdevel, linux-kernel, linux-mm, Michal Hocko

On Sun, Dec 16, 2018 at 05:38:13PM +0800, Hou Tao wrote:
> Hi,
> 
> On 2018/12/15 22:38, Matthew Wilcox wrote:
> > On Tue, Dec 04, 2018 at 10:08:40AM +0800, Hou Tao wrote:
> >> There is no need to disable __GFP_FS in ->readpage:
> >> * It's a read-only fs, so there will be no dirty/writeback page and
> >>   there will be no deadlock against the caller's locked page
> >> * It just allocates one page, so compaction will not be invoked
> >> * It doesn't take any inode lock, so the reclamation of inode will be fine
> >>
> >> And no __GFP_FS may lead to hang in __alloc_pages_slowpath() if a
> >> squashfs page fault occurs in the context of a memory hogger, because
> >> the hogger will not be killed due to the logic in __alloc_pages_may_oom().
> > 
> > I don't understand your argument here.  There's a comment in
> > __alloc_pages_may_oom() saying that we _should_ treat GFP_NOFS
> > specially, but we currently don't.
> I am trying to say that if __GFP_FS is used in pagecache_get_page() when it tries
> to allocate a new page for squashfs, that will be no possibility of dead-lock for
> squashfs.
> 
> We do treat GFP_NOFS specially in out_of_memory():
> 
>     /*
>      * The OOM killer does not compensate for IO-less reclaim.
>      * pagefault_out_of_memory lost its gfp context so we have to
>      * make sure exclude 0 mask - all other users should have at least
>      * ___GFP_DIRECT_RECLAIM to get here.
>      */
>     if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
>         return true;
> 
> So if GFP_FS is used, no task will be killed because we will return from
> out_of_memory() prematurely. And that will lead to an infinite loop in
> __alloc_pages_slowpath() as we have observed:
> 
> * a squashfs page fault occurred in the context of a memory hogger
> * the page used for page fault allocated successfully
> * in squashfs_readpage() squashfs will try to allocate other pages
>   in the same 128KB block, and __GFP_NOFS is used (actually GFP_HIGHUSER_MOVABLE & ~__GFP_FS)
> * in __alloc_pages_slowpath() we can not get any pages through reclamation
>   (because most of memory is used by the current task) and we also can not kill
>   the current task (due to __GFP_NOFS), and it will loop forever until it's killed.

Ah, yes, that makes perfect sense.  Thank you for the explanation.

I wonder if the correct fix, however, is not to move the check for
GFP_NOFS in out_of_memory() down to below the check whether to kill
the current task.  That would solve your problem, and I don't _think_
it would cause any new ones.  Michal, you touched this code last, what
do you think?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17  3:51     ` Matthew Wilcox
@ 2018-12-17  9:33       ` Michal Hocko
  2018-12-17 10:51         ` Tetsuo Handa
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2018-12-17  9:33 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Hou Tao, phillip, linux-fsdevel, linux-kernel, linux-mm

On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
[...]
> Ah, yes, that makes perfect sense.  Thank you for the explanation.
> 
> I wonder if the correct fix, however, is not to move the check for
> GFP_NOFS in out_of_memory() down to below the check whether to kill
> the current task.  That would solve your problem, and I don't _think_
> it would cause any new ones.  Michal, you touched this code last, what
> do you think?

What do you mean exactly? Whether we kill a current task or something
else doesn't change much on the fact that NOFS is a reclaim restricted
context and we might kill too early. If the fs can do GFP_FS then it is
obviously a better thing to do because FS metadata can be reclaimed as
well and therefore there is potentially less memory pressure on
application data.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17  9:33       ` Michal Hocko
@ 2018-12-17 10:51         ` Tetsuo Handa
  2018-12-17 12:25           ` Matthew Wilcox
  2018-12-18  6:06           ` Hou Tao
  0 siblings, 2 replies; 15+ messages in thread
From: Tetsuo Handa @ 2018-12-17 10:51 UTC (permalink / raw)
  To: Michal Hocko, Matthew Wilcox
  Cc: Hou Tao, phillip, linux-fsdevel, linux-kernel, linux-mm

On 2018/12/17 18:33, Michal Hocko wrote:
> On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
> [...]
>> Ah, yes, that makes perfect sense.  Thank you for the explanation.
>>
>> I wonder if the correct fix, however, is not to move the check for
>> GFP_NOFS in out_of_memory() down to below the check whether to kill
>> the current task.  That would solve your problem, and I don't _think_
>> it would cause any new ones.  Michal, you touched this code last, what
>> do you think?
> 
> What do you mean exactly? Whether we kill a current task or something
> else doesn't change much on the fact that NOFS is a reclaim restricted
> context and we might kill too early. If the fs can do GFP_FS then it is
> obviously a better thing to do because FS metadata can be reclaimed as
> well and therefore there is potentially less memory pressure on
> application data.
> 

I interpreted "to move the check for GFP_NOFS in out_of_memory() down to
below the check whether to kill the current task" as

@@ -1077,15 +1077,6 @@ bool out_of_memory(struct oom_control *oc)
 	}
 
 	/*
-	 * The OOM killer does not compensate for IO-less reclaim.
-	 * pagefault_out_of_memory lost its gfp context so we have to
-	 * make sure exclude 0 mask - all other users should have at least
-	 * ___GFP_DIRECT_RECLAIM to get here.
-	 */
-	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
-		return true;
-
-	/*
 	 * Check if there were limitations on the allocation (only relevant for
 	 * NUMA and memcg) that may require different handling.
 	 */
@@ -1104,6 +1095,19 @@ bool out_of_memory(struct oom_control *oc)
 	}
 
 	select_bad_process(oc);
+
+	/*
+	 * The OOM killer does not compensate for IO-less reclaim.
+	 * pagefault_out_of_memory lost its gfp context so we have to
+	 * make sure exclude 0 mask - all other users should have at least
+	 * ___GFP_DIRECT_RECLAIM to get here.
+	 */
+	if ((oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)) && oc->chosen &&
+	    oc->chosen != (void *)-1UL && oc->chosen != current) {
+		put_task_struct(oc->chosen);
+		return true;
+	}
+
 	/* Found nothing?!?! */
 	if (!oc->chosen) {
 		dump_header(oc, NULL);

which is prefixed by "the correct fix is not".

Behaving like sysctl_oom_kill_allocating_task == 1 if __GFP_FS is not used
will not be the correct fix. But ...

Hou Tao wrote:
> There is no need to disable __GFP_FS in ->readpage:
> * It's a read-only fs, so there will be no dirty/writeback page and
>   there will be no deadlock against the caller's locked page

is read-only filesystem sufficient for safe to use __GFP_FS?

Isn't "whether it is safe to use __GFP_FS" depends on "whether fs locks
are held or not" rather than "whether fs has dirty/writeback page or not" ?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17 10:51         ` Tetsuo Handa
@ 2018-12-17 12:25           ` Matthew Wilcox
  2018-12-17 14:10             ` Michal Hocko
  2018-12-18  6:06           ` Hou Tao
  1 sibling, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2018-12-17 12:25 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Michal Hocko, Hou Tao, phillip, linux-fsdevel, linux-kernel, linux-mm

On Mon, Dec 17, 2018 at 07:51:27PM +0900, Tetsuo Handa wrote:
> On 2018/12/17 18:33, Michal Hocko wrote:
> > On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
> > [...]
> >> Ah, yes, that makes perfect sense.  Thank you for the explanation.
> >>
> >> I wonder if the correct fix, however, is not to move the check for
> >> GFP_NOFS in out_of_memory() down to below the check whether to kill
> >> the current task.  That would solve your problem, and I don't _think_
> >> it would cause any new ones.  Michal, you touched this code last, what
> >> do you think?
> > 
> > What do you mean exactly? Whether we kill a current task or something
> > else doesn't change much on the fact that NOFS is a reclaim restricted
> > context and we might kill too early. If the fs can do GFP_FS then it is
> > obviously a better thing to do because FS metadata can be reclaimed as
> > well and therefore there is potentially less memory pressure on
> > application data.
> > 
> 
> I interpreted "to move the check for GFP_NOFS in out_of_memory() down to
> below the check whether to kill the current task" as

Too far; I meant one line earlier, before we try to select a different
process.

> @@ -1104,6 +1095,19 @@ bool out_of_memory(struct oom_control *oc)
>  	}
>  
>  	select_bad_process(oc);
> +
> +	/*
> +	 * The OOM killer does not compensate for IO-less reclaim.
> +	 * pagefault_out_of_memory lost its gfp context so we have to
> +	 * make sure exclude 0 mask - all other users should have at least
> +	 * ___GFP_DIRECT_RECLAIM to get here.
> +	 */
> +	if ((oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)) && oc->chosen &&
> +	    oc->chosen != (void *)-1UL && oc->chosen != current) {
> +		put_task_struct(oc->chosen);
> +		return true;
> +	}
> +
>  	/* Found nothing?!?! */
>  	if (!oc->chosen) {
>  		dump_header(oc, NULL);
> 
> which is prefixed by "the correct fix is not".
> 
> Behaving like sysctl_oom_kill_allocating_task == 1 if __GFP_FS is not used
> will not be the correct fix. But ...
> 
> Hou Tao wrote:
> > There is no need to disable __GFP_FS in ->readpage:
> > * It's a read-only fs, so there will be no dirty/writeback page and
> >   there will be no deadlock against the caller's locked page
> 
> is read-only filesystem sufficient for safe to use __GFP_FS?
> 
> Isn't "whether it is safe to use __GFP_FS" depends on "whether fs locks
> are held or not" rather than "whether fs has dirty/writeback page or not" ?

It's worth noticing that squashfs _is_ in fact holding a page locked in
squashfs_copy_cache() when it calls grab_cache_page_nowait().  I'm not
sure if this will lead to trouble or not because I'm insufficiently
familiar with the reclaim path.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17 12:25           ` Matthew Wilcox
@ 2018-12-17 14:10             ` Michal Hocko
  2018-12-17 14:41               ` Matthew Wilcox
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2018-12-17 14:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tetsuo Handa, Hou Tao, phillip, linux-fsdevel, linux-kernel, linux-mm

On Mon 17-12-18 04:25:46, Matthew Wilcox wrote:
> On Mon, Dec 17, 2018 at 07:51:27PM +0900, Tetsuo Handa wrote:
> > On 2018/12/17 18:33, Michal Hocko wrote:
> > > On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
> > > [...]
> > >> Ah, yes, that makes perfect sense.  Thank you for the explanation.
> > >>
> > >> I wonder if the correct fix, however, is not to move the check for
> > >> GFP_NOFS in out_of_memory() down to below the check whether to kill
> > >> the current task.  That would solve your problem, and I don't _think_
> > >> it would cause any new ones.  Michal, you touched this code last, what
> > >> do you think?
> > > 
> > > What do you mean exactly? Whether we kill a current task or something
> > > else doesn't change much on the fact that NOFS is a reclaim restricted
> > > context and we might kill too early. If the fs can do GFP_FS then it is
> > > obviously a better thing to do because FS metadata can be reclaimed as
> > > well and therefore there is potentially less memory pressure on
> > > application data.
> > > 
> > 
> > I interpreted "to move the check for GFP_NOFS in out_of_memory() down to
> > below the check whether to kill the current task" as
> 
> Too far; I meant one line earlier, before we try to select a different
> process.

We could still panic the system on pre-mature OOM. So it doesn't really
seem good.

> > @@ -1104,6 +1095,19 @@ bool out_of_memory(struct oom_control *oc)
> >  	}
> >  
> >  	select_bad_process(oc);
> > +
> > +	/*
> > +	 * The OOM killer does not compensate for IO-less reclaim.
> > +	 * pagefault_out_of_memory lost its gfp context so we have to
> > +	 * make sure exclude 0 mask - all other users should have at least
> > +	 * ___GFP_DIRECT_RECLAIM to get here.
> > +	 */
> > +	if ((oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)) && oc->chosen &&
> > +	    oc->chosen != (void *)-1UL && oc->chosen != current) {
> > +		put_task_struct(oc->chosen);
> > +		return true;
> > +	}
> > +
> >  	/* Found nothing?!?! */
> >  	if (!oc->chosen) {
> >  		dump_header(oc, NULL);
> > 
> > which is prefixed by "the correct fix is not".
> > 
> > Behaving like sysctl_oom_kill_allocating_task == 1 if __GFP_FS is not used
> > will not be the correct fix. But ...
> > 
> > Hou Tao wrote:
> > > There is no need to disable __GFP_FS in ->readpage:
> > > * It's a read-only fs, so there will be no dirty/writeback page and
> > >   there will be no deadlock against the caller's locked page
> > 
> > is read-only filesystem sufficient for safe to use __GFP_FS?
> > 
> > Isn't "whether it is safe to use __GFP_FS" depends on "whether fs locks
> > are held or not" rather than "whether fs has dirty/writeback page or not" ?
> 
> It's worth noticing that squashfs _is_ in fact holding a page locked in
> squashfs_copy_cache() when it calls grab_cache_page_nowait().  I'm not
> sure if this will lead to trouble or not because I'm insufficiently
> familiar with the reclaim path.

Hmm, this is more interesting then. If there is any memcg accounted
allocation down that path _and_ the squashfs writeout can lock more
pages and mark them writeback before they are really sent to the storage
then we have a problem. See [1]

[1] http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17 14:10             ` Michal Hocko
@ 2018-12-17 14:41               ` Matthew Wilcox
  2018-12-17 14:49                 ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2018-12-17 14:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tetsuo Handa, Hou Tao, phillip, linux-fsdevel, linux-kernel, linux-mm

On Mon, Dec 17, 2018 at 03:10:44PM +0100, Michal Hocko wrote:
> On Mon 17-12-18 04:25:46, Matthew Wilcox wrote:
> > It's worth noticing that squashfs _is_ in fact holding a page locked in
> > squashfs_copy_cache() when it calls grab_cache_page_nowait().  I'm not
> > sure if this will lead to trouble or not because I'm insufficiently
> > familiar with the reclaim path.
> 
> Hmm, this is more interesting then. If there is any memcg accounted
> allocation down that path _and_ the squashfs writeout can lock more
> pages and mark them writeback before they are really sent to the storage
> then we have a problem. See [1]
> 
> [1] http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org

Squashfs is read only, so it'll never have dirty pages and never do
writeout.

But ... maybe the GFP flags being used for grab_cache_page_nowait() are
wrong.  It does, after all, say "nowait".  Perhaps it shouldn't be trying
direct reclaim at all, but rather fail earlier.  Like this:

+++ b/mm/filemap.c
@@ -1550,6 +1550,8 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
                        gfp_mask |= __GFP_WRITE;
                if (fgp_flags & FGP_NOFS)
                        gfp_mask &= ~__GFP_FS;
+               if (fgp_flags & FGP_NOWAIT)
+                       gfp_mask &= ~__GFP_DIRECT_RECLAIM;
 
                page = __page_cache_alloc(gfp_mask);
                if (!page)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17 14:41               ` Matthew Wilcox
@ 2018-12-17 14:49                 ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2018-12-17 14:49 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tetsuo Handa, Hou Tao, phillip, linux-fsdevel, linux-kernel, linux-mm

On Mon 17-12-18 06:41:01, Matthew Wilcox wrote:
> On Mon, Dec 17, 2018 at 03:10:44PM +0100, Michal Hocko wrote:
> > On Mon 17-12-18 04:25:46, Matthew Wilcox wrote:
> > > It's worth noticing that squashfs _is_ in fact holding a page locked in
> > > squashfs_copy_cache() when it calls grab_cache_page_nowait().  I'm not
> > > sure if this will lead to trouble or not because I'm insufficiently
> > > familiar with the reclaim path.
> > 
> > Hmm, this is more interesting then. If there is any memcg accounted
> > allocation down that path _and_ the squashfs writeout can lock more
> > pages and mark them writeback before they are really sent to the storage
> > then we have a problem. See [1]
> > 
> > [1] http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org
> 
> Squashfs is read only, so it'll never have dirty pages and never do
> writeout.
> 
> But ... maybe the GFP flags being used for grab_cache_page_nowait() are
> wrong.  It does, after all, say "nowait".  Perhaps it shouldn't be trying
> direct reclaim at all, but rather fail earlier.  Like this:
> 
> +++ b/mm/filemap.c
> @@ -1550,6 +1550,8 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
>                         gfp_mask |= __GFP_WRITE;
>                 if (fgp_flags & FGP_NOFS)
>                         gfp_mask &= ~__GFP_FS;
> +               if (fgp_flags & FGP_NOWAIT)
> +                       gfp_mask &= ~__GFP_DIRECT_RECLAIM;
>  
>                 page = __page_cache_alloc(gfp_mask);
>                 if (!page)

Isn't FGP_NOWAIT about page lock rather than the allocation context?

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-17 10:51         ` Tetsuo Handa
  2018-12-17 12:25           ` Matthew Wilcox
@ 2018-12-18  6:06           ` Hou Tao
  2018-12-18 11:32             ` Michal Hocko
  1 sibling, 1 reply; 15+ messages in thread
From: Hou Tao @ 2018-12-18  6:06 UTC (permalink / raw)
  To: Tetsuo Handa, Michal Hocko, Matthew Wilcox
  Cc: phillip, linux-fsdevel, linux-kernel, linux-mm

Hi,

On 2018/12/17 18:51, Tetsuo Handa wrote:
> On 2018/12/17 18:33, Michal Hocko wrote:
>> On Sun 16-12-18 19:51:57, Matthew Wilcox wrote:
>> [...]
>>> Ah, yes, that makes perfect sense.  Thank you for the explanation.
>>>
>>> I wonder if the correct fix, however, is not to move the check for
>>> GFP_NOFS in out_of_memory() down to below the check whether to kill
>>> the current task.  That would solve your problem, and I don't _think_
>>> it would cause any new ones.  Michal, you touched this code last, what
>>> do you think?
>>
>> What do you mean exactly? Whether we kill a current task or something
>> else doesn't change much on the fact that NOFS is a reclaim restricted
>> context and we might kill too early. If the fs can do GFP_FS then it is
>> obviously a better thing to do because FS metadata can be reclaimed as
>> well and therefore there is potentially less memory pressure on
>> application data.
>>
> 
> I interpreted "to move the check for GFP_NOFS in out_of_memory() down to
> below the check whether to kill the current task" as
> 
> @@ -1077,15 +1077,6 @@ bool out_of_memory(struct oom_control *oc)
>  	}
>  
>  	/*
> -	 * The OOM killer does not compensate for IO-less reclaim.
> -	 * pagefault_out_of_memory lost its gfp context so we have to
> -	 * make sure exclude 0 mask - all other users should have at least
> -	 * ___GFP_DIRECT_RECLAIM to get here.
> -	 */
> -	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
> -		return true;
> -
> -	/*
>  	 * Check if there were limitations on the allocation (only relevant for
>  	 * NUMA and memcg) that may require different handling.
>  	 */
> @@ -1104,6 +1095,19 @@ bool out_of_memory(struct oom_control *oc)
>  	}
>  
>  	select_bad_process(oc);
> +
> +	/*
> +	 * The OOM killer does not compensate for IO-less reclaim.
> +	 * pagefault_out_of_memory lost its gfp context so we have to
> +	 * make sure exclude 0 mask - all other users should have at least
> +	 * ___GFP_DIRECT_RECLAIM to get here.
> +	 */
> +	if ((oc->gfp_mask && !(oc->gfp_mask & __GFP_FS)) && oc->chosen &&
> +	    oc->chosen != (void *)-1UL && oc->chosen != current) {
> +		put_task_struct(oc->chosen);
> +		return true;
> +	}
> +
>  	/* Found nothing?!?! */
>  	if (!oc->chosen) {
>  		dump_header(oc, NULL);
> 
> which is prefixed by "the correct fix is not".
> 
> Behaving like sysctl_oom_kill_allocating_task == 1 if __GFP_FS is not used
> will not be the correct fix. But ...
> 
> Hou Tao wrote:
>> There is no need to disable __GFP_FS in ->readpage:
>> * It's a read-only fs, so there will be no dirty/writeback page and
>>   there will be no deadlock against the caller's locked page
> 
> is read-only filesystem sufficient for safe to use __GFP_FS?
> 
> Isn't "whether it is safe to use __GFP_FS" depends on "whether fs locks
> are held or not" rather than "whether fs has dirty/writeback page or not" ?
> 
In my understanding (correct me if I am wrong), there are three ways through which
reclamation will invoked fs related code and may cause dead-lock:

(1) write-back dirty pages. Not possible for squashfs.
(2) the reclamation of inodes & dentries. The current file is in-use, so it will be not
    reclaimed, and for other reclaimable inodes, squashfs_destroy_inode() will
    be invoked and it doesn't take any locks.
(3) customized shrinker defined by fs. No customized shrinker in squashfs.

So my point is that even a page lock is already held by squashfs_readpage() and
reclamation invokes back to squashfs code, there will be no dead-lock, so it's
safe to use __GFP_FS.

Regards,
Tao

> .
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc
  2018-12-18  6:06           ` Hou Tao
@ 2018-12-18 11:32             ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2018-12-18 11:32 UTC (permalink / raw)
  To: Hou Tao
  Cc: Tetsuo Handa, Matthew Wilcox, phillip, linux-fsdevel,
	linux-kernel, linux-mm

On Tue 18-12-18 14:06:11, Hou Tao wrote:
[...]
> In my understanding (correct me if I am wrong), there are three ways through which
> reclamation will invoked fs related code and may cause dead-lock:
> 
> (1) write-back dirty pages. Not possible for squashfs.

only from kswapd context. So not relevant to OOM killer/

> (2) the reclamation of inodes & dentries. The current file is in-use, so it will be not
>     reclaimed, and for other reclaimable inodes, squashfs_destroy_inode() will
>     be invoked and it doesn't take any locks.

There are other inodes, not only those in use. Do you use any locks that
could be taken from an inode teardown?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-12-18 11:32 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-04  2:08 [PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc Hou Tao
2018-12-06  1:14 ` Hou Tao
2018-12-13  2:18   ` Hou Tao
2018-12-15 13:24     ` Hou Tao
2018-12-15 14:38 ` Matthew Wilcox
2018-12-16  9:38   ` Hou Tao
2018-12-17  3:51     ` Matthew Wilcox
2018-12-17  9:33       ` Michal Hocko
2018-12-17 10:51         ` Tetsuo Handa
2018-12-17 12:25           ` Matthew Wilcox
2018-12-17 14:10             ` Michal Hocko
2018-12-17 14:41               ` Matthew Wilcox
2018-12-17 14:49                 ` Michal Hocko
2018-12-18  6:06           ` Hou Tao
2018-12-18 11:32             ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).