linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
@ 2018-12-18  6:52 Yang Shi
  2018-12-18  6:52 ` [RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead Yang Shi
  2018-12-18 19:29 ` [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Tim Chen
  0 siblings, 2 replies; 11+ messages in thread
From: Yang Shi @ 2018-12-18  6:52 UTC (permalink / raw)
  To: ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: yang.shi, linux-mm, linux-kernel

Swap readahead would read in a few pages regardless if the underlying
device is busy or not.  It may incur long waiting time if the device is
congested, and it may also exacerbate the congestion.

Use inode_read_congested() to check if the underlying device is busy or
not like what file page readahead does.  Get inode from swap_info_struct.
Although we can add inode information in swap_address_space
(address_space->host), it may lead some unexpected side effect, i.e.
it may break mapping_cap_account_dirty().  Using inode from
swap_info_struct seems simple and good enough.

Just does the check in vma_cluster_readahead() since
swap_vma_readahead() is just used for non-rotational device which
much less likely has congestion than traditional HDD.

Although swap slots may be consecutive on swap partition, it still may be
fragmented on swap file. This check would help to reduce excessive stall
for such case.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 mm/swap_state.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index fd2f21e..7cc3c29 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	bool do_poll = true, page_allocated;
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address;
+	struct inode *inode = si->swap_file->f_mapping->host;
 
 	mask = swapin_nr_pages(offset) - 1;
 	if (!mask)
 		goto skip;
 
+	if (inode_read_congested(inode))
+		goto skip;
+
 	do_poll = false;
 	/* Read a page_cluster sized and aligned cluster around offset. */
 	start_offset = offset & ~mask;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead
  2018-12-18  6:52 [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Yang Shi
@ 2018-12-18  6:52 ` Yang Shi
  2018-12-18 19:29 ` [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Tim Chen
  1 sibling, 0 replies; 11+ messages in thread
From: Yang Shi @ 2018-12-18  6:52 UTC (permalink / raw)
  To: ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: yang.shi, linux-mm, linux-kernel

swap_vma_readahead()'s comment is missed, just add it.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 mm/swap_state.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 7cc3c29..c12aedf 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -695,6 +695,23 @@ static void swap_ra_info(struct vm_fault *vmf,
 	pte_unmap(orig_pte);
 }
 
+/**
+ * swap_vm_readahead - swap in pages in hope we need them soon
+ * @entry: swap entry of this memory
+ * @gfp_mask: memory allocation flags
+ * @vmf: fault information
+ *
+ * Returns the struct page for entry and addr, after queueing swapin.
+ *
+ * Primitive swap readahead code. We simply read in a few pages whoes
+ * virtual addresses are around the fault address in the same vma.
+ *
+ * This has been extended to use the NUMA policies from the mm triggering
+ * the readahead.
+ *
+ * Caller must hold down_read on the vma->vm_mm if vmf->vma is not NULL.
+ *
+ */
 static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 				       struct vm_fault *vmf)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-18  6:52 [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Yang Shi
  2018-12-18  6:52 ` [RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead Yang Shi
@ 2018-12-18 19:29 ` Tim Chen
  2018-12-18 23:43   ` Yang Shi
  1 sibling, 1 reply; 11+ messages in thread
From: Tim Chen @ 2018-12-18 19:29 UTC (permalink / raw)
  To: Yang Shi, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel

On 12/17/18 10:52 PM, Yang Shi wrote:

> 
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index fd2f21e..7cc3c29 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  	bool do_poll = true, page_allocated;
>  	struct vm_area_struct *vma = vmf->vma;
>  	unsigned long addr = vmf->address;
> +	struct inode *inode = si->swap_file->f_mapping->host;
>  
>  	mask = swapin_nr_pages(offset) - 1;
>  	if (!mask)
>  		goto skip;
>  

Shmem will also be using this function and I don't think the inode_read_congested
logic is relevant for that case.

So probably change the check to

	if (swp_type(entry) < nr_swapfiles &&
	    inode_read_congested(si->swap_file->f_mapping->host))
		goto skip;
		
> +	if (inode_read_congested(inode))
> +		goto skip;
> +
>  	do_poll = false;
>  	/* Read a page_cluster sized and aligned cluster around offset. */
>  	start_offset = offset & ~mask;
> 

Thanks.

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-18 19:29 ` [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Tim Chen
@ 2018-12-18 23:43   ` Yang Shi
  2018-12-19  0:16     ` Tim Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Yang Shi @ 2018-12-18 23:43 UTC (permalink / raw)
  To: Tim Chen, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel



On 12/18/18 11:29 AM, Tim Chen wrote:
> On 12/17/18 10:52 PM, Yang Shi wrote:
>
>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>> index fd2f21e..7cc3c29 100644
>> --- a/mm/swap_state.c
>> +++ b/mm/swap_state.c
>> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>>   	bool do_poll = true, page_allocated;
>>   	struct vm_area_struct *vma = vmf->vma;
>>   	unsigned long addr = vmf->address;
>> +	struct inode *inode = si->swap_file->f_mapping->host;
>>   
>>   	mask = swapin_nr_pages(offset) - 1;
>>   	if (!mask)
>>   		goto skip;
>>   
> Shmem will also be using this function and I don't think the inode_read_congested
> logic is relevant for that case.

IMHO, shmem is also relevant. As long as it is trying to readahead from 
swap, it should check if the underlying device is busy or not regardless 
of shmem or anon page.

Just like mem_cgroup_try_charge_delay(), which throttles swap rate for 
both swap page fault and shmem.

Thanks,
Yang

>
> So probably change the check to
>
> 	if (swp_type(entry) < nr_swapfiles &&
> 	    inode_read_congested(si->swap_file->f_mapping->host))
> 		goto skip;
> 		
>> +	if (inode_read_congested(inode))
>> +		goto skip;
>> +
>>   	do_poll = false;
>>   	/* Read a page_cluster sized and aligned cluster around offset. */
>>   	start_offset = offset & ~mask;
>>
> Thanks.
>
> Tim


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-18 23:43   ` Yang Shi
@ 2018-12-19  0:16     ` Tim Chen
  2018-12-19  5:56       ` Yang Shi
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Chen @ 2018-12-19  0:16 UTC (permalink / raw)
  To: Yang Shi, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel

On 12/18/18 3:43 PM, Yang Shi wrote:
> 
> 
> On 12/18/18 11:29 AM, Tim Chen wrote:
>> On 12/17/18 10:52 PM, Yang Shi wrote:
>>
>>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>>> index fd2f21e..7cc3c29 100644
>>> --- a/mm/swap_state.c
>>> +++ b/mm/swap_state.c
>>> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>>>       bool do_poll = true, page_allocated;
>>>       struct vm_area_struct *vma = vmf->vma;
>>>       unsigned long addr = vmf->address;
>>> +    struct inode *inode = si->swap_file->f_mapping->host;
>>>         mask = swapin_nr_pages(offset) - 1;
>>>       if (!mask)
>>>           goto skip;
>>>   
>> Shmem will also be using this function and I don't think the inode_read_congested
>> logic is relevant for that case.
> 
> IMHO, shmem is also relevant. As long as it is trying to readahead from swap, it should check if the underlying device is busy or not regardless of shmem or anon page.
> 

I don't think your dereference inode = si->swap_file->f_mapping->host
is always safe.  You should do it only when (si->flags & SWP_FS) is true.

Tim


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-19  0:16     ` Tim Chen
@ 2018-12-19  5:56       ` Yang Shi
  2018-12-19 17:28         ` Tim Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Yang Shi @ 2018-12-19  5:56 UTC (permalink / raw)
  To: Tim Chen, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel



On 12/18/18 4:16 PM, Tim Chen wrote:
> On 12/18/18 3:43 PM, Yang Shi wrote:
>>
>> On 12/18/18 11:29 AM, Tim Chen wrote:
>>> On 12/17/18 10:52 PM, Yang Shi wrote:
>>>
>>>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>>>> index fd2f21e..7cc3c29 100644
>>>> --- a/mm/swap_state.c
>>>> +++ b/mm/swap_state.c
>>>> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>>>>        bool do_poll = true, page_allocated;
>>>>        struct vm_area_struct *vma = vmf->vma;
>>>>        unsigned long addr = vmf->address;
>>>> +    struct inode *inode = si->swap_file->f_mapping->host;
>>>>          mask = swapin_nr_pages(offset) - 1;
>>>>        if (!mask)
>>>>            goto skip;
>>>>    
>>> Shmem will also be using this function and I don't think the inode_read_congested
>>> logic is relevant for that case.
>> IMHO, shmem is also relevant. As long as it is trying to readahead from swap, it should check if the underlying device is busy or not regardless of shmem or anon page.
>>
> I don't think your dereference inode = si->swap_file->f_mapping->host
> is always safe.  You should do it only when (si->flags & SWP_FS) is true.

Do you mean it is not safe for swap partition?

I tested with swap partition too. It looks fine. Opening block device 
also gets inode.

Filename                                Type            Size Used    
Priority
/dev/sdb1                               partition 20970492        850168  -2

Thanks,
Yang

>
> Tim


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-19  5:56       ` Yang Shi
@ 2018-12-19 17:28         ` Tim Chen
  2018-12-19 18:40           ` Yang Shi
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Chen @ 2018-12-19 17:28 UTC (permalink / raw)
  To: Yang Shi, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel

On 12/18/18 9:56 PM, Yang Shi wrote:
> 
> 
> On 12/18/18 4:16 PM, Tim Chen wrote:
>> On 12/18/18 3:43 PM, Yang Shi wrote:
>>>
>>> On 12/18/18 11:29 AM, Tim Chen wrote:
>>>> On 12/17/18 10:52 PM, Yang Shi wrote:
>>>>
>>>>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>>>>> index fd2f21e..7cc3c29 100644
>>>>> --- a/mm/swap_state.c
>>>>> +++ b/mm/swap_state.c
>>>>> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>>>>>        bool do_poll = true, page_allocated;
>>>>>        struct vm_area_struct *vma = vmf->vma;
>>>>>        unsigned long addr = vmf->address;
>>>>> +    struct inode *inode = si->swap_file->f_mapping->host;
>>>>>          mask = swapin_nr_pages(offset) - 1;
>>>>>        if (!mask)
>>>>>            goto skip;
>>>>>    
>>>> Shmem will also be using this function and I don't think the inode_read_congested
>>>> logic is relevant for that case.
>>> IMHO, shmem is also relevant. As long as it is trying to readahead from swap, it should check if the underlying device is busy or not regardless of shmem or anon page.
>>>
>> I don't think your dereference inode = si->swap_file->f_mapping->host
>> is always safe.  You should do it only when (si->flags & SWP_FS) is true.
> 
> Do you mean it is not safe for swap partition?

The f_mapping may not be instantiated.  It is only done for SWP_FS.

Tim



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-19 17:28         ` Tim Chen
@ 2018-12-19 18:40           ` Yang Shi
  2018-12-19 19:00             ` Tim Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Yang Shi @ 2018-12-19 18:40 UTC (permalink / raw)
  To: Tim Chen, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel



On 12/19/18 9:28 AM, Tim Chen wrote:
> On 12/18/18 9:56 PM, Yang Shi wrote:
>>
>> On 12/18/18 4:16 PM, Tim Chen wrote:
>>> On 12/18/18 3:43 PM, Yang Shi wrote:
>>>> On 12/18/18 11:29 AM, Tim Chen wrote:
>>>>> On 12/17/18 10:52 PM, Yang Shi wrote:
>>>>>
>>>>>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>>>>>> index fd2f21e..7cc3c29 100644
>>>>>> --- a/mm/swap_state.c
>>>>>> +++ b/mm/swap_state.c
>>>>>> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>>>>>>         bool do_poll = true, page_allocated;
>>>>>>         struct vm_area_struct *vma = vmf->vma;
>>>>>>         unsigned long addr = vmf->address;
>>>>>> +    struct inode *inode = si->swap_file->f_mapping->host;
>>>>>>           mask = swapin_nr_pages(offset) - 1;
>>>>>>         if (!mask)
>>>>>>             goto skip;
>>>>>>     
>>>>> Shmem will also be using this function and I don't think the inode_read_congested
>>>>> logic is relevant for that case.
>>>> IMHO, shmem is also relevant. As long as it is trying to readahead from swap, it should check if the underlying device is busy or not regardless of shmem or anon page.
>>>>
>>> I don't think your dereference inode = si->swap_file->f_mapping->host
>>> is always safe.  You should do it only when (si->flags & SWP_FS) is true.
>> Do you mean it is not safe for swap partition?
> The f_mapping may not be instantiated.  It is only done for SWP_FS.

Really? I saw the below calls in swapon:

swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
...
p->swap_file = swap_file;
mapping = swap_file->f_mapping;
inode = mapping->host;
...

Then the below code manipulates the inode.

And, trace shows file_open_name() does call blkdev_open if it is turning 
block device swap on. And, blkdev_open() would return instantiated 
address_space and inode.

Am I missing something?

Thanks,
Yang

>
> Tim
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-19 18:40           ` Yang Shi
@ 2018-12-19 19:00             ` Tim Chen
  2018-12-19 23:48               ` Yang Shi
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Chen @ 2018-12-19 19:00 UTC (permalink / raw)
  To: Yang Shi, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel

On 12/19/18 10:40 AM, Yang Shi wrote:
> 
> 

>>>> I don't think your dereference inode = si->swap_file->f_mapping->host
>>>> is always safe.  You should do it only when (si->flags & SWP_FS) is true.
>>> Do you mean it is not safe for swap partition?
>> The f_mapping may not be instantiated.  It is only done for SWP_FS.
> 
> Really? I saw the below calls in swapon:
> 
> swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
> ...
> p->swap_file = swap_file;
> mapping = swap_file->f_mapping;
> inode = mapping->host;
> ...
> 
> Then the below code manipulates the inode.
> 
> And, trace shows file_open_name() does call blkdev_open if it is turning block device swap on. And, blkdev_open() would return instantiated address_space and inode.
> 
> Am I missing something?
> 

I was trying to limit the congestion logic for block devices backed swap.
So the check I had in mind should really be "si->flags & SWP_BLKDEV"
instead of si->flags & SWP_FS.  I was concerned that there could
be other use cases where the inode dereference is invalid.

Looking at the code a bit more, looks like swap_cluster_readahead is not
used for other special case swap usage (like page migration).  So
you would a proper swapfile and inode here.  But I think it is still
a good idea to have a check for SWP_BLKDEV in si->flags.

Thanks.

Tim
 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-19 19:00             ` Tim Chen
@ 2018-12-19 23:48               ` Yang Shi
  2018-12-20  1:05                 ` Tim Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Yang Shi @ 2018-12-19 23:48 UTC (permalink / raw)
  To: Tim Chen, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel



On 12/19/18 11:00 AM, Tim Chen wrote:
> On 12/19/18 10:40 AM, Yang Shi wrote:
>>
>>>>> I don't think your dereference inode = si->swap_file->f_mapping->host
>>>>> is always safe.  You should do it only when (si->flags & SWP_FS) is true.
>>>> Do you mean it is not safe for swap partition?
>>> The f_mapping may not be instantiated.  It is only done for SWP_FS.
>> Really? I saw the below calls in swapon:
>>
>> swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
>> ...
>> p->swap_file = swap_file;
>> mapping = swap_file->f_mapping;
>> inode = mapping->host;
>> ...
>>
>> Then the below code manipulates the inode.
>>
>> And, trace shows file_open_name() does call blkdev_open if it is turning block device swap on. And, blkdev_open() would return instantiated address_space and inode.
>>
>> Am I missing something?
>>
> I was trying to limit the congestion logic for block devices backed swap.
> So the check I had in mind should really be "si->flags & SWP_BLKDEV"
> instead of si->flags & SWP_FS.  I was concerned that there could
> be other use cases where the inode dereference is invalid.
>
> Looking at the code a bit more, looks like swap_cluster_readahead is not
> used for other special case swap usage (like page migration).  So
> you would a proper swapfile and inode here.  But I think it is still

Yes, just swap page fault and shmem calls this function. Actually, your 
above concern is valid if the inode were added into swap_address_space 
(address_space->host). I did this in my first attempt, and found out it 
may break some assumption in clear_page_dirty_for_io() and migration.

So, I made the patch as it is.

> a good idea to have a check for SWP_BLKDEV in si->flags.

I don't get your point why it should be block dev swap only. IMHO, block 
dev swap should be less likely fragmented and congested than swap file, 
right? Block dev swap could be a dedicated physical device, but swap 
file has to be with filesystem.

It sounds reasonable to me to have this check for swap file only. 
However, to be honest, it sounds not hurt to have both.

Thanks,
Yang

>
> Thanks.
>
> Tim
>   
>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not
  2018-12-19 23:48               ` Yang Shi
@ 2018-12-20  1:05                 ` Tim Chen
  0 siblings, 0 replies; 11+ messages in thread
From: Tim Chen @ 2018-12-20  1:05 UTC (permalink / raw)
  To: Yang Shi, ying.huang, tim.c.chen, minchan, Andrew Morton
  Cc: linux-mm, linux-kernel

On 12/19/18 3:48 PM, Yang Shi wrote:
> 
> 
> On 12/19/18 11:00 AM, Tim Chen wrote:
>> On 12/19/18 10:40 AM, Yang Shi wrote:
>>>
>>>>>> I don't think your dereference inode = si->swap_file->f_mapping->host
>>>>>> is always safe.  You should do it only when (si->flags & SWP_FS) is true.
>>>>> Do you mean it is not safe for swap partition?
>>>> The f_mapping may not be instantiated.  It is only done for SWP_FS.
>>> Really? I saw the below calls in swapon:
>>>
>>> swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
>>> ...
>>> p->swap_file = swap_file;
>>> mapping = swap_file->f_mapping;
>>> inode = mapping->host;
>>> ...
>>>
>>> Then the below code manipulates the inode.
>>>
>>> And, trace shows file_open_name() does call blkdev_open if it is turning block device swap on. And, blkdev_open() would return instantiated address_space and inode.
>>>
>>> Am I missing something?
>>>
>> I was trying to limit the congestion logic for block devices backed swap.
>> So the check I had in mind should really be "si->flags & SWP_BLKDEV"
>> instead of si->flags & SWP_FS.  I was concerned that there could
>> be other use cases where the inode dereference is invalid.
>>
>> Looking at the code a bit more, looks like swap_cluster_readahead is not
>> used for other special case swap usage (like page migration).  So
>> you would a proper swapfile and inode here.  But I think it is still
> 
> Yes, just swap page fault and shmem calls this function. Actually, your above concern is valid if the inode were added into swap_address_space (address_space->host). I did this in my first attempt, and found out it may break some assumption in clear_page_dirty_for_io() and migration.
> 
> So, I made the patch as it is.
> 
>> a good idea to have a check for SWP_BLKDEV in si->flags.
> 
> I don't get your point why it should be block dev swap only. IMHO, block dev swap should be less likely fragmented and congested than swap file, right? Block dev swap could be a dedicated physical device, but swap file has to be with filesystem.
> 
> It sounds reasonable to me to have this check for swap file only. However, to be honest, it sounds not hurt to have both.
> 

Yes, I think we want to do it for both cases.

My original concern was that the backing store was not sitting on
a block device for some special case swap usage.  And si->swap_file->f_mapping->host
may fails to dereference to a host inode that's a valid block device.

It looks like on the call paths si->flags should either be SWP_BLKDEV or SWP_FS
on all the call paths. So si->swap_file->f_mapping->host should be valid
and your code is safe.

If we want to be paranoid we may do

	if (si->flags & (SWP_BLKDEV | SWP_FS)) {
		if (inode_read_congested(si->swap_file->f_mapping->host))
			goto skip;
	}

Thanks.

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-12-20  1:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-18  6:52 [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Yang Shi
2018-12-18  6:52 ` [RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead Yang Shi
2018-12-18 19:29 ` [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not Tim Chen
2018-12-18 23:43   ` Yang Shi
2018-12-19  0:16     ` Tim Chen
2018-12-19  5:56       ` Yang Shi
2018-12-19 17:28         ` Tim Chen
2018-12-19 18:40           ` Yang Shi
2018-12-19 19:00             ` Tim Chen
2018-12-19 23:48               ` Yang Shi
2018-12-20  1:05                 ` Tim Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).