linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm/damon: Make the sampling more accurate
@ 2022-03-18  9:23 Baolin Wang
  2022-03-18  9:40 ` sj
  0 siblings, 1 reply; 7+ messages in thread
From: Baolin Wang @ 2022-03-18  9:23 UTC (permalink / raw)
  To: sj, akpm; +Cc: baolin.wang, linux-mm, linux-kernel

When I tried to sample physical addresses with DAMON to migrate pages
on a tiered memory system, I found that it mistakenly demotes some regions
that it treats as cold. Currently we choose a physical address in the
region at random, but if its corresponding page is not an online LRU page,
we ignore the access status in this sampling cycle, so the region is
effectively treated as non-accessed. If a region includes some non-LRU
pages, it will be treated as a cold region with high probability and may be
merged with adjacent cold regions, even though some of its pages may have
been accessed without our noticing.

So instead of ignoring the access status of the region when we cannot find
a valid page at the current sampling address, we can fall back to the last
valid sampling address to make the sampling more accurate, so that we can
make a better decision.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 include/linux/damon.h |  2 ++
 mm/damon/core.c       |  2 ++
 mm/damon/paddr.c      | 15 ++++++++++++---
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index f23cbfa..3311e15 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -38,6 +38,7 @@ struct damon_addr_range {
  * struct damon_region - Represents a monitoring target region.
  * @ar:			The address range of the region.
  * @sampling_addr:	Address of the sample for the next access check.
+ * @last_sampling_addr:	Last valid address of the sampling.
  * @nr_accesses:	Access frequency of this region.
  * @list:		List head for siblings.
  * @age:		Age of this region.
@@ -50,6 +51,7 @@ struct damon_addr_range {
 struct damon_region {
 	struct damon_addr_range ar;
 	unsigned long sampling_addr;
+	unsigned long last_sampling_addr;
 	unsigned int nr_accesses;
 	struct list_head list;
 
diff --git a/mm/damon/core.c b/mm/damon/core.c
index c1e0fed..957704f 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -108,6 +108,7 @@ struct damon_region *damon_new_region(unsigned long start, unsigned long end)
 	region->ar.start = start;
 	region->ar.end = end;
 	region->nr_accesses = 0;
+	region->last_sampling_addr = 0;
 	INIT_LIST_HEAD(&region->list);
 
 	region->age = 0;
@@ -848,6 +849,7 @@ static void damon_split_region_at(struct damon_ctx *ctx,
 		return;
 
 	r->ar.end = new->ar.start;
+	r->last_sampling_addr = 0;
 
 	new->age = r->age;
 	new->last_nr_accesses = r->last_nr_accesses;
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 21474ae..5f15068 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -31,10 +31,9 @@ static bool __damon_pa_mkold(struct folio *folio, struct vm_area_struct *vma,
 	return true;
 }
 
-static void damon_pa_mkold(unsigned long paddr)
+static void damon_pa_mkold(struct page *page)
 {
 	struct folio *folio;
-	struct page *page = damon_get_page(PHYS_PFN(paddr));
 	struct rmap_walk_control rwc = {
 		.rmap_one = __damon_pa_mkold,
 		.anon_lock = folio_lock_anon_vma_read,
@@ -66,9 +65,19 @@ static void damon_pa_mkold(unsigned long paddr)
 static void __damon_pa_prepare_access_check(struct damon_ctx *ctx,
 					    struct damon_region *r)
 {
+	struct page *page;
+
 	r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
 
-	damon_pa_mkold(r->sampling_addr);
+	page = damon_get_page(PHYS_PFN(r->sampling_addr));
+	if (page) {
+		r->last_sampling_addr = r->sampling_addr;
+	} else if (r->last_sampling_addr) {
+		r->sampling_addr = r->last_sampling_addr;
+		page = damon_get_page(PHYS_PFN(r->last_sampling_addr));
+	}
+
+	damon_pa_mkold(page);
 }
 
 static void damon_pa_prepare_access_checks(struct damon_ctx *ctx)
-- 
1.8.3.1



* Re: [PATCH] mm/damon: Make the sampling more accurate
  2022-03-18  9:23 [PATCH] mm/damon: Make the sampling more accurate Baolin Wang
@ 2022-03-18  9:40 ` sj
  2022-03-18 10:01   ` Baolin Wang
  0 siblings, 1 reply; 7+ messages in thread
From: sj @ 2022-03-18  9:40 UTC (permalink / raw)
  To: Baolin Wang; +Cc: sj, akpm, linux-mm, linux-kernel

Hi Baolin,

On Fri, 18 Mar 2022 17:23:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> When I try to sample the physical address with DAMON to migrate pages
> on tiered memory system, I found it will demote some cold regions mistakenly.
> Now we will choose a physical address in the region randomly, but if
> its corresponding page is not an online LRU page, we will ignore the
> accessing status in this cycle of sampling, and actually will be treated
> as a non-accessed region. Suppose a region including some non-LRU pages,
> it will be treated as a cold region with a high probability, and may be
> merged with adjacent cold regions, but there are some pages may be
> accessed we missed.
> 
> So instead of ignoring the access status of this region if we did not find
> a valid page according to current sampling address, we can use last valid
> sampling address to help to make the sampling more accurate, then we can do
> a better decision.

Well...  Offlined pages are also a valid part of the memory region, so treating
those as not accessed, which makes the memory region containing the offlined
pages look colder, seems legitimate to me.  IOW, this approach could make
memory regions containing many non-online-LRU pages look hot.

If I'm missing some points, please let me know.


Thanks,
SJ

[...]


* Re: [PATCH] mm/damon: Make the sampling more accurate
  2022-03-18  9:40 ` sj
@ 2022-03-18 10:01   ` Baolin Wang
  2022-03-18 10:49     ` sj
  0 siblings, 1 reply; 7+ messages in thread
From: Baolin Wang @ 2022-03-18 10:01 UTC (permalink / raw)
  To: sj; +Cc: akpm, linux-mm, linux-kernel


On 3/18/2022 5:40 PM, sj@kernel.org wrote:
> Hi Baolin,
> 
> On Fri, 18 Mar 2022 17:23:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>> When I try to sample the physical address with DAMON to migrate pages
>> on tiered memory system, I found it will demote some cold regions mistakenly.
>> Now we will choose a physical address in the region randomly, but if
>> its corresponding page is not an online LRU page, we will ignore the
>> accessing status in this cycle of sampling, and actually will be treated
>> as a non-accessed region. Suppose a region including some non-LRU pages,
>> it will be treated as a cold region with a high probability, and may be
>> merged with adjacent cold regions, but there are some pages may be
>> accessed we missed.
>>
>> So instead of ignoring the access status of this region if we did not find
>> a valid page according to current sampling address, we can use last valid
>> sampling address to help to make the sampling more accurate, then we can do
>> a better decision.
> 
> Well...  Offlined pages are also a valid part of the memory region, so treating
> those as not accessed and making the memory region containing the offlined
> pages looks colder seems legal to me.  IOW, this approach could make memory
> regions containing many non-online-LRU pages as hot.

IMO this is not a problem: if a region containing many non-online-LRU
pages is treated as hot, that means there are some hot pages in it,
right? We can find them and promote them to fast memory (or apply other
schemes). Meanwhile, we can filter out the non-online-LRU pages and do
nothing for them, since we cannot get a valid page struct for them.

[...]


* Re: [PATCH] mm/damon: Make the sampling more accurate
  2022-03-18 10:01   ` Baolin Wang
@ 2022-03-18 10:49     ` sj
  2022-03-18 11:58       ` Baolin Wang
  0 siblings, 1 reply; 7+ messages in thread
From: sj @ 2022-03-18 10:49 UTC (permalink / raw)
  To: Baolin Wang; +Cc: sj, akpm, linux-mm, linux-kernel

On Fri, 18 Mar 2022 18:01:19 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> 
> On 3/18/2022 5:40 PM, sj@kernel.org wrote:
> > Hi Baolin,
> > 
> > On Fri, 18 Mar 2022 17:23:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> > 
> >> When I try to sample the physical address with DAMON to migrate pages
> >> on tiered memory system, I found it will demote some cold regions mistakenly.
> >> Now we will choose a physical address in the region randomly, but if
> >> its corresponding page is not an online LRU page, we will ignore the
> >> accessing status in this cycle of sampling, and actually will be treated
> >> as a non-accessed region. Suppose a region including some non-LRU pages,
> >> it will be treated as a cold region with a high probability, and may be
> >> merged with adjacent cold regions, but there are some pages may be
> >> accessed we missed.
> >>
> >> So instead of ignoring the access status of this region if we did not find
> >> a valid page according to current sampling address, we can use last valid
> >> sampling address to help to make the sampling more accurate, then we can do
> >> a better decision.
> > 
> > Well...  Offlined pages are also a valid part of the memory region, so treating
> > those as not accessed and making the memory region containing the offlined
> > pages looks colder seems legal to me.  IOW, this approach could make memory
> > regions containing many non-online-LRU pages as hot.
> 
> IMO I don't think this is a problem, since if this region containing 
> many non-online-LRU pages is treated as hot, which means there are some
> pages are hot, right? We can find them and promote them to fast memory 
> (or do other schemes). Meanwhile, for non-online-LRU pages, we can 
> filter them and do nothing for them, since we can not get a valid page 
> struct for them.

For some of the DAMOS actions that you mentioned, that could make sense.
However, it wouldn't make much sense for some other cases, especially for
manual DAMON-based access pattern profiling.

After all, we already have a mechanism for this case: adaptive regions
adjustment (or, regions split/merge).  That mechanism will eventually separate
out hot online-LRU pages in the memory regions.  Before the region is
adjusted, reporting the whole region as hot looks like the right result to me.
Of course, I admit that it could take too much time to converge to the optimal
regions, and there is a lot of room for improvement in the regions adjustment
mechanism.  I think we should pursue that direction (improving the regions
adjustment mechanism).
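
The merge half of the regions adjustment mentioned above can be modelled in
userspace roughly as follows.  This is only a simplified sketch under stated
assumptions: `struct model_region`, `model_region_sz()` and
`model_merge_regions()` are hypothetical names for illustration, not the
kernel's code, and the kernel implementation may apply further conditions
that are not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a monitoring region; not the kernel struct. */
struct model_region {
	unsigned long start;
	unsigned long end;		/* exclusive */
	unsigned int nr_accesses;
};

static unsigned long model_region_sz(const struct model_region *r)
{
	return r->end - r->start;
}

/*
 * Merge each region into its predecessor when the two are adjacent and
 * their access counts differ by at most thres.  The merged count is the
 * size-weighted average of the two counts (an assumption of this sketch).
 * Works in place and returns the new number of regions.
 */
static size_t model_merge_regions(struct model_region *regions, size_t nr,
				  unsigned int thres)
{
	size_t out = 0;
	size_t i;

	if (nr == 0)
		return 0;

	for (i = 1; i < nr; i++) {
		struct model_region *prev = &regions[out];
		const struct model_region *cur = &regions[i];
		unsigned int diff = prev->nr_accesses > cur->nr_accesses ?
			prev->nr_accesses - cur->nr_accesses :
			cur->nr_accesses - prev->nr_accesses;

		assert(cur->end > cur->start);
		if (prev->end == cur->start && diff <= thres) {
			unsigned long sz_p = model_region_sz(prev);
			unsigned long sz_c = model_region_sz(cur);

			prev->nr_accesses = (unsigned int)
				((prev->nr_accesses * sz_p +
				  cur->nr_accesses * sz_c) / (sz_p + sz_c));
			prev->end = cur->end;
		} else {
			/* Not mergeable: keep cur as the next output region. */
			regions[++out] = *cur;
		}
	}
	return out + 1;
}
```

In this model, a region whose access count was pulled down by samples that
landed on non-online-LRU pages can fall within the threshold of a cold
neighbour and be absorbed into it, which is the situation the patch
description worries about.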


FYI, I have some rough ideas for improving the mechanism, including partitioning
regions into more than 2 sub-regions if we believe it is not making good
progress.  Nevertheless, I'd like to first build a methodology for evaluating
the current accuracy.  For that, I am planning to implement page-granularity
access monitoring.


Thanks,
SJ

[...]


* Re: [PATCH] mm/damon: Make the sampling more accurate
  2022-03-18 10:49     ` sj
@ 2022-03-18 11:58       ` Baolin Wang
  2022-03-18 12:15         ` sj
  0 siblings, 1 reply; 7+ messages in thread
From: Baolin Wang @ 2022-03-18 11:58 UTC (permalink / raw)
  To: sj; +Cc: akpm, linux-mm, linux-kernel



On 3/18/2022 6:49 PM, sj@kernel.org wrote:
> On Fri, 18 Mar 2022 18:01:19 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>>
>> On 3/18/2022 5:40 PM, sj@kernel.org wrote:
>>> Hi Baolin,
>>>
>>> On Fri, 18 Mar 2022 17:23:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>>> When I try to sample the physical address with DAMON to migrate pages
>>>> on tiered memory system, I found it will demote some cold regions mistakenly.
>>>> Now we will choose a physical address in the region randomly, but if
>>>> its corresponding page is not an online LRU page, we will ignore the
>>>> accessing status in this cycle of sampling, and actually will be treated
>>>> as a non-accessed region. Suppose a region including some non-LRU pages,
>>>> it will be treated as a cold region with a high probability, and may be
>>>> merged with adjacent cold regions, but there are some pages may be
>>>> accessed we missed.
>>>>
>>>> So instead of ignoring the access status of this region if we did not find
>>>> a valid page according to current sampling address, we can use last valid
>>>> sampling address to help to make the sampling more accurate, then we can do
>>>> a better decision.
>>>
>>> Well...  Offlined pages are also a valid part of the memory region, so treating
>>> those as not accessed and making the memory region containing the offlined
>>> pages looks colder seems legal to me.  IOW, this approach could make memory
>>> regions containing many non-online-LRU pages as hot.
>>
>> IMO I don't think this is a problem, since if this region containing
>> many non-online-LRU pages is treated as hot, which means there are some
>> pages are hot, right? We can find them and promote them to fast memory
>> (or do other schemes). Meanwhile, for non-online-LRU pages, we can
>> filter them and do nothing for them, since we can not get a valid page
>> struct for them.
> 
> For some of DAMOS actions that you mentioned, that could make sense.  However,
> that wouldn't make much sense for some other cases, especially for manual
> DAMON-based access pattern profiling.

I am not sure about this case. Could you elaborate on how this can worsen
the case you mentioned?

As you said below, we can split the regions to separate the hot
pages out of the hot regions containing some offline or non-LRU pages,
which also helps improve the regions adjustment.

> After all, we already have a mechanism for this case: adaptive regions
> adjustment (or, regions split/merge).  That mechanism will eventually separate
> out hot online-LRU pages in the memory regions.  Before the region is
> adjusted, reporting the whole region as hot looks like a right result to me.
> Of course, I admit that it could take too much time to converge to the optimal
> regions, and there are many rooms for improvement of the regions adjustment
> mechanism.  I think we should pursue the direction (improving the regions
> adjustment mechanism).

Yes, agree.

> FYI, I have some rough ideas for improving the mechanism including partitioning
> regions into more than 2 sub-regions if we believe it is not making a good
> progress.  Nevertheless, I'd like to first make a methodology for evaluating
> current accuracy.  For that, I am planning to implement a page-granularity
> access monitoring.

Great, I think page-granularity monitoring will be more suitable for
tiered memory systems, which can reduce redundant demotion and promotion.
However, I am still concerned about the overhead of page-granularity
monitoring, especially for a large memory size. Anyway, I'd like
to help test or review the new page-granularity monitoring when
you're ready to send it out. Thanks.


* Re: [PATCH] mm/damon: Make the sampling more accurate
  2022-03-18 11:58       ` Baolin Wang
@ 2022-03-18 12:15         ` sj
  2022-03-18 14:12           ` Baolin Wang
  0 siblings, 1 reply; 7+ messages in thread
From: sj @ 2022-03-18 12:15 UTC (permalink / raw)
  To: Baolin Wang; +Cc: sj, akpm, linux-mm, linux-kernel

On Fri, 18 Mar 2022 19:58:07 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> 
> 
> On 3/18/2022 6:49 PM, sj@kernel.org wrote:
> > On Fri, 18 Mar 2022 18:01:19 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> > 
> >>
> >> On 3/18/2022 5:40 PM, sj@kernel.org wrote:
> >>> Hi Baolin,
> >>>
> >>> On Fri, 18 Mar 2022 17:23:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> >>>
> >>>> When I try to sample the physical address with DAMON to migrate pages
> >>>> on tiered memory system, I found it will demote some cold regions mistakenly.
> >>>> Now we will choose a physical address in the region randomly, but if
> >>>> its corresponding page is not an online LRU page, we will ignore the
> >>>> accessing status in this cycle of sampling, and actually will be treated
> >>>> as a non-accessed region. Suppose a region including some non-LRU pages,
> >>>> it will be treated as a cold region with a high probability, and may be
> >>>> merged with adjacent cold regions, but there are some pages may be
> >>>> accessed we missed.
> >>>>
> >>>> So instead of ignoring the access status of this region if we did not find
> >>>> a valid page according to current sampling address, we can use last valid
> >>>> sampling address to help to make the sampling more accurate, then we can do
> >>>> a better decision.
> >>>
> >>> Well...  Offlined pages are also a valid part of the memory region, so treating
> >>> those as not accessed and making the memory region containing the offlined
> >>> pages looks colder seems legal to me.  IOW, this approach could make memory
> >>> regions containing many non-online-LRU pages as hot.
> >>
> >> IMO I don't think this is a problem, since if this region containing
> >> many non-online-LRU pages is treated as hot, which means there are some
> >> pages are hot, right? We can find them and promote them to fast memory
> >> (or do other schemes). Meanwhile, for non-online-LRU pages, we can
> >> filter them and do nothing for them, since we can not get a valid page
> >> struct for them.
> > 
> > For some of DAMOS actions that you mentioned, that could make sense.  However,
> > that wouldn't make much sense for some other cases, especially for manual
> > DAMON-based access pattern profiling.
> 
> I am not sure about this case, could you elaborate on how this can worse 
> the case you mentioned?

For example, suppose a user is using DAMON to learn the working set size
of the system.  And further suppose there is a region containing many
offlined pages and one online hot page.  With this patch, once DAMON samples
the one hot page, the entire region will be reported as hot, though the other,
offlined pages have not been accessed.  As a result, the user will think the
working set size is bigger than it really is.
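
The working-set example above can be modelled in userspace.  What follows is
only a sketch under stated assumptions: `struct model_page`, `model_rand()`,
`model_nr_accesses()` and `model_reported_hot_bytes()` are hypothetical names
for illustration, not DAMON code, and a hot page is assumed to be accessed in
every sampling interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical userspace model of a page; not a kernel structure. */
struct model_page {
	bool online_lru;	/* a valid page would be found for this address */
	bool hot;		/* assumed accessed in every sampling interval */
};

/* Small deterministic LCG so that the sketch needs no external randomness. */
static size_t model_rand(uint64_t *state, size_t n)
{
	*state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
	return (size_t)((*state >> 33) % n);
}

/*
 * Count the sampling intervals in which the region is seen as accessed.
 * When use_last_valid is true, a sample that lands on a page that is not
 * an online LRU page falls back to the last valid sample, mirroring the
 * idea of the patch.
 */
static unsigned int model_nr_accesses(const struct model_page *pages,
				      size_t nr_pages, unsigned int nr_samples,
				      bool use_last_valid, uint64_t seed)
{
	size_t last_valid = nr_pages;	/* nr_pages means "no valid sample yet" */
	unsigned int nr_accesses = 0;
	unsigned int i;

	assert(nr_pages > 0);
	for (i = 0; i < nr_samples; i++) {
		size_t idx = model_rand(&seed, nr_pages);

		if (pages[idx].online_lru)
			last_valid = idx;
		else if (use_last_valid && last_valid != nr_pages)
			idx = last_valid;

		if (pages[idx].online_lru && pages[idx].hot)
			nr_accesses++;
	}
	return nr_accesses;
}

/* Region-granularity report: a region seen as accessed counts in full. */
static unsigned long model_reported_hot_bytes(unsigned int nr_accesses,
					      size_t nr_pages)
{
	return nr_accesses ? nr_pages * MODEL_PAGE_SIZE : 0;
}
```

For the same seed, the count with the fallback enabled can never be lower
than the count without it, because every miss after the first valid sample is
replaced by a repeat of that sample.  In a region with a single online hot
page, this raises the region's access count while the truly hot memory is
still one page, so a region-granularity report overstates the working set.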

> 
> Like you said as below, we can split the regions to separate the hot 
> pages out of the hot regions containing some offline or non-lru pages, 
> that is also a benefit to improve the regions adjustment.
> 
> > After all, we already have a mechanism for this case: adaptive regions
> > adjustment (or, regions split/merge).  That mechanism will eventually separate
> > out hot online-LRU pages in the memory regions.  Before the region is
> > adjusted, reporting the whole region as hot looks like a right result to me.
> > Of course, I admit that it could take too much time to converge to the optimal
> > regions, and there are many rooms for improvement of the regions adjustment
> > mechanism.  I think we should pursue the direction (improving the regions
> > adjustment mechanism).
> 
> Yes, agree.
> 
> > FYI, I have some rough ideas for improving the mechanism including partitioning
> > regions into more than 2 sub-regions if we believe it is not making a good
> > progress.  Nevertheless, I'd like to first make a methodology for evaluating
> > current accuracy.  For that, I am planning to implement a page-granularity
> > access monitoring.
> 
> Great, I think the page-granularity monitoring will be more suitable for 
> tiered memory system, which can reduce redundant demotion and promotion. 
> However, I still concern the overhead if the monitoring is a 
> page-granularity, especially for a large memory size.

Sure.  Its main purpose for now is only to be compared with DAMON for
evaluating DAMON's accuracy.  Someone who has a small enough memory size or
large enough CPU resources could use that for their product, of course.

> Anyway, I'd like to help to test or review the new page-granularity
> monitoring when you're ready to send out. Thanks.

So glad to hear that, and I always appreciate your help!


Thanks,
SJ


* Re: [PATCH] mm/damon: Make the sampling more accurate
  2022-03-18 12:15         ` sj
@ 2022-03-18 14:12           ` Baolin Wang
  0 siblings, 0 replies; 7+ messages in thread
From: Baolin Wang @ 2022-03-18 14:12 UTC (permalink / raw)
  To: sj; +Cc: akpm, linux-mm, linux-kernel



On 3/18/2022 8:15 PM, sj@kernel.org wrote:
> On Fri, 18 Mar 2022 19:58:07 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>>
>>
>> On 3/18/2022 6:49 PM, sj@kernel.org wrote:
>>> On Fri, 18 Mar 2022 18:01:19 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>>>
>>>> On 3/18/2022 5:40 PM, sj@kernel.org wrote:
>>>>> Hi Baolin,
>>>>>
>>>>> On Fri, 18 Mar 2022 17:23:13 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>>>
>>>>>> When I try to sample the physical address with DAMON to migrate pages
>>>>>> on tiered memory system, I found it will demote some cold regions mistakenly.
>>>>>> Now we will choose a physical address in the region randomly, but if
>>>>>> its corresponding page is not an online LRU page, we will ignore the
>>>>>> accessing status in this cycle of sampling, and actually will be treated
>>>>>> as a non-accessed region. Suppose a region including some non-LRU pages,
>>>>>> it will be treated as a cold region with a high probability, and may be
>>>>>> merged with adjacent cold regions, but there are some pages may be
>>>>>> accessed we missed.
>>>>>>
>>>>>> So instead of ignoring the access status of this region if we did not find
>>>>>> a valid page according to current sampling address, we can use last valid
>>>>>> sampling address to help to make the sampling more accurate, then we can do
>>>>>> a better decision.
>>>>>
>>>>> Well...  Offlined pages are also a valid part of the memory region, so treating
>>>>> those as not accessed and making the memory region containing the offlined
>>>>> pages looks colder seems legal to me.  IOW, this approach could make memory
>>>>> regions containing many non-online-LRU pages as hot.
>>>>
>>>> IMO I don't think this is a problem, since if this region containing
>>>> many non-online-LRU pages is treated as hot, which means there are some
>>>> pages are hot, right? We can find them and promote them to fast memory
>>>> (or do other schemes). Meanwhile, for non-online-LRU pages, we can
>>>> filter them and do nothing for them, since we can not get a valid page
>>>> struct for them.
>>>
>>> For some of DAMOS actions that you mentioned, that could make sense.  However,
>>> that wouldn't make much sense for some other cases, especially for manual
>>> DAMON-based access pattern profiling.
>>
>> I am not sure about this case, could you elaborate on how this can worse
>> the case you mentioned?
> 
> For an example, let's suppose a user using DAMON to know the working set size
> of the system.  And further suppose there is a region that containing many
> offlined pages and one online hot page.  With this patch, once DAMON sampled
> the one hot page, the entire region will be reported as hot, though the other
> offlined pages has not accessed.  As a result, the user will think the working
> set size is bigger than real.

OK, sounds reasonable. It seems I need to add a flag to indicate whether we
should ignore offline or non-LRU pages when monitoring for some schemes,
which can help make a good decision.

