linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] mm/page_reporting: Make page reporting work on arm64 with 64KB page size
@ 2021-06-22  7:49 Gavin Shan
  2021-06-22  7:49 ` [PATCH v2 1/3] mm/page_reporting: Fix code style in __page_reporting_request() Gavin Shan
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Gavin Shan @ 2021-06-22  7:49 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, alexander.duyck, david, mst, akpm,
	anshuman.khandual, catalin.marinas, will, shan.gavin

The page reporting threshold is currently equal to @pageblock_order, which
is 13 and 512MB on arm64 with 64KB base page size selected. The page
reporting won't be triggered if the freeing page can't come up with a free
area like that huge. The condition is hard to be met, especially when the
system memory becomes fragmented.

This series intends to solve the issue by having page reporting threshold
as 5 (2MB) on arm64 with 64KB base page size. The patches are organized as:

   PATCH[1/3] Fix some coding style in __page_reporting_request().
   PATCH[2/3] Allows the device driver (e.g. virtio_balloon) to specify
              the page reporting order when the device info is registered.
   PATCH[3/3] Specifies the page reporting order to 5, corresponding to
              2MB in size on ARM64 when 64KB base page size is used.

Changelog
=========
v2:
   * Rewrite the patches to let device driver specify the page reporting
     order.                                                    (Alex)

Gavin Shan (3):
  mm/page_reporting: Fix code style in __page_reporting_request()
  mm/page_reporting: Allow driver to specify threshold
  virtio_balloon: Specify page reporting order if needed

 drivers/virtio/virtio_balloon.c | 17 +++++++++++++++++
 include/linux/page_reporting.h  |  3 +++
 mm/page_reporting.c             | 18 ++++++++++++------
 mm/page_reporting.h             | 10 ++--------
 4 files changed, 34 insertions(+), 14 deletions(-)

-- 
2.23.0



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] mm/page_reporting: Fix code style in __page_reporting_request()
  2021-06-22  7:49 [PATCH v2 0/3] mm/page_reporting: Make page reporting work on arm64 with 64KB page size Gavin Shan
@ 2021-06-22  7:49 ` Gavin Shan
  2021-06-22  7:49 ` [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold Gavin Shan
  2021-06-22  7:49 ` [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed Gavin Shan
  2 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2021-06-22  7:49 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, alexander.duyck, david, mst, akpm,
	anshuman.khandual, catalin.marinas, will, shan.gavin

The lines of comments would be starting with one, instead two space.
This corrects the style.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 mm/page_reporting.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index c50d93ffa252..df9c5054e1b4 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -31,8 +31,8 @@ __page_reporting_request(struct page_reporting_dev_info *prdev)
 		return;
 
 	/*
-	 *  If reporting is already active there is nothing we need to do.
-	 *  Test against 0 as that represents PAGE_REPORTING_IDLE.
+	 * If reporting is already active there is nothing we need to do.
+	 * Test against 0 as that represents PAGE_REPORTING_IDLE.
 	 */
 	state = atomic_xchg(&prdev->state, PAGE_REPORTING_REQUESTED);
 	if (state != PAGE_REPORTING_IDLE)
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold
  2021-06-22  7:49 [PATCH v2 0/3] mm/page_reporting: Make page reporting work on arm64 with 64KB page size Gavin Shan
  2021-06-22  7:49 ` [PATCH v2 1/3] mm/page_reporting: Fix code style in __page_reporting_request() Gavin Shan
@ 2021-06-22  7:49 ` Gavin Shan
  2021-06-22 17:39   ` Alexander Duyck
  2021-06-22  7:49 ` [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed Gavin Shan
  2 siblings, 1 reply; 8+ messages in thread
From: Gavin Shan @ 2021-06-22  7:49 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, alexander.duyck, david, mst, akpm,
	anshuman.khandual, catalin.marinas, will, shan.gavin

The page reporting threshold is currently sticky to @pageblock_order.
The page reporting can never be triggered because the freeing page
can't come up with a free area like that huge. The situation becomes
worse when the system memory becomes heavily fragmented.

For example, the following configurations are used on ARM64 when 64KB
base page size is enabled. In this specific case, the page reporting
won't be triggered until the freeing page comes up with a 512MB free
area. That's hard to be met, especially when the system memory becomes
heavily fragmented.

   PAGE_SIZE:          64KB
   HPAGE_SIZE:         512MB
   pageblock_order:    13       (512MB)
   MAX_ORDER:          14

This allows the drivers to specify the threshold when the page
reporting device is registered. The threshold falls back to
@pageblock_order if it's not specified by the driver. The existing
users (hv_balloon and virtio_balloon) don't specify the threshold
and @pageblock_order is still taken as their page reporting order.
So this shouldn't introduce functional changes.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 include/linux/page_reporting.h |  3 +++
 mm/page_reporting.c            | 14 ++++++++++----
 mm/page_reporting.h            | 10 ++--------
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
index 3b99e0ec24f2..fe648dfa3a7c 100644
--- a/include/linux/page_reporting.h
+++ b/include/linux/page_reporting.h
@@ -18,6 +18,9 @@ struct page_reporting_dev_info {
 
 	/* Current state of page reporting */
 	atomic_t state;
+
+	/* Minimal order of page reporting */
+	unsigned int order;
 };
 
 /* Tear-down and bring-up for page reporting devices */
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index df9c5054e1b4..27670360bae6 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -47,7 +47,7 @@ __page_reporting_request(struct page_reporting_dev_info *prdev)
 }
 
 /* notify prdev of free page reporting request */
-void __page_reporting_notify(void)
+void __page_reporting_notify(unsigned int order)
 {
 	struct page_reporting_dev_info *prdev;
 
@@ -58,7 +58,7 @@ void __page_reporting_notify(void)
 	 */
 	rcu_read_lock();
 	prdev = rcu_dereference(pr_dev_info);
-	if (likely(prdev))
+	if (likely(prdev && order >= prdev->order))
 		__page_reporting_request(prdev);
 
 	rcu_read_unlock();
@@ -229,7 +229,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
 
 	/* Generate minimum watermark to be able to guarantee progress */
 	watermark = low_wmark_pages(zone) +
-		    (PAGE_REPORTING_CAPACITY << PAGE_REPORTING_MIN_ORDER);
+		    (PAGE_REPORTING_CAPACITY << prdev->order);
 
 	/*
 	 * Cancel request if insufficient free memory or if we failed
@@ -239,7 +239,7 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
 		return err;
 
 	/* Process each free list starting from lowest order/mt */
-	for (order = PAGE_REPORTING_MIN_ORDER; order < MAX_ORDER; order++) {
+	for (order = prdev->order; order < MAX_ORDER; order++) {
 		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
 			/* We do not pull pages from the isolate free list */
 			if (is_migrate_isolate(mt))
@@ -324,6 +324,12 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
 		goto err_out;
 	}
 
+	/*
+	 * We need to choose the minimal order of page reporting if it's
+	 * not specified by the driver.
+	 */
+	prdev->order = prdev->order ? prdev->order : pageblock_order;
+
 	/* initialize state and work structures */
 	atomic_set(&prdev->state, PAGE_REPORTING_IDLE);
 	INIT_DELAYED_WORK(&prdev->work, &page_reporting_process);
diff --git a/mm/page_reporting.h b/mm/page_reporting.h
index 2c385dd4ddbd..d9f972e72649 100644
--- a/mm/page_reporting.h
+++ b/mm/page_reporting.h
@@ -10,11 +10,9 @@
 #include <linux/pgtable.h>
 #include <linux/scatterlist.h>
 
-#define PAGE_REPORTING_MIN_ORDER	pageblock_order
-
 #ifdef CONFIG_PAGE_REPORTING
 DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
-void __page_reporting_notify(void);
+void __page_reporting_notify(unsigned int order);
 
 static inline bool page_reported(struct page *page)
 {
@@ -37,12 +35,8 @@ static inline void page_reporting_notify_free(unsigned int order)
 	if (!static_branch_unlikely(&page_reporting_enabled))
 		return;
 
-	/* Determine if we have crossed reporting threshold */
-	if (order < PAGE_REPORTING_MIN_ORDER)
-		return;
-
 	/* This will add a few cycles, but should be called infrequently */
-	__page_reporting_notify();
+	__page_reporting_notify(order);
 }
 #else /* CONFIG_PAGE_REPORTING */
 #define page_reported(_page)	false
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed
  2021-06-22  7:49 [PATCH v2 0/3] mm/page_reporting: Make page reporting work on arm64 with 64KB page size Gavin Shan
  2021-06-22  7:49 ` [PATCH v2 1/3] mm/page_reporting: Fix code style in __page_reporting_request() Gavin Shan
  2021-06-22  7:49 ` [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold Gavin Shan
@ 2021-06-22  7:49 ` Gavin Shan
  2021-06-22 17:44   ` Alexander Duyck
  2 siblings, 1 reply; 8+ messages in thread
From: Gavin Shan @ 2021-06-22  7:49 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, alexander.duyck, david, mst, akpm,
	anshuman.khandual, catalin.marinas, will, shan.gavin

The page reporting won't be triggered if the freeing page can't come
up with a free area, whose size is equal or bigger than the threshold
(page reporting order). The default page reporting order, equal to
@pageblock_order, is too huge on some architectures to trigger page
reporting. One example is ARM64 when 64KB base page size is used.

      PAGE_SIZE:          64KB
      pageblock_order:    13       (512MB)
      MAX_ORDER:          14

This specifies the page reporting order to 5 (2MB) for this specific
case so that page reporting can be triggered.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 drivers/virtio/virtio_balloon.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 510e9318854d..fd419780cc23 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -993,6 +993,23 @@ static int virtballoon_probe(struct virtio_device *vdev)
 			goto out_unregister_oom;
 		}
 
+		/*
+		 * The default page reporting order is @pageblock_order, which
+		 * corresponds to 512MB in size on ARM64 when 64KB base page
+		 * size is used. The page reporting won't be triggered if the
+		 * freeing page can't come up with a free area like that huge.
+		 * So we specify the page reporting order to 5, corresponding
+		 * to 2MB. It helps to avoid THP splitting if 4KB base page
+		 * size is used by host.
+		 *
+		 * Ideallh, the page reporting order is selected based on the
+		 * host's base page size. However, it needs more work to report
+		 * that value. The hardcoded order would be fine currently.
+		 */
+#if defined(CONFIG_ARM64) && defined(CONFIG_ARM64_64K_PAGES)
+		vb->pr_dev_info.order = 5;
+#endif
+
 		err = page_reporting_register(&vb->pr_dev_info);
 		if (err)
 			goto out_unregister_oom;
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold
  2021-06-22  7:49 ` [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold Gavin Shan
@ 2021-06-22 17:39   ` Alexander Duyck
  2021-06-23  0:43     ` Gavin Shan
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Duyck @ 2021-06-22 17:39 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linux-mm, LKML, David Hildenbrand, Michael S. Tsirkin,
	Andrew Morton, Anshuman Khandual, Catalin Marinas, Will Deacon,
	shan.gavin

On Mon, Jun 21, 2021 at 10:48 PM Gavin Shan <gshan@redhat.com> wrote:
>
> The page reporting threshold is currently sticky to @pageblock_order.
> The page reporting can never be triggered because the freeing page
> can't come up with a free area like that huge. The situation becomes
> worse when the system memory becomes heavily fragmented.
>
> For example, the following configurations are used on ARM64 when 64KB
> base page size is enabled. In this specific case, the page reporting
> won't be triggered until the freeing page comes up with a 512MB free
> area. That's hard to be met, especially when the system memory becomes
> heavily fragmented.
>
>    PAGE_SIZE:          64KB
>    HPAGE_SIZE:         512MB
>    pageblock_order:    13       (512MB)
>    MAX_ORDER:          14
>
> This allows the drivers to specify the threshold when the page
> reporting device is registered. The threshold falls back to
> @pageblock_order if it's not specified by the driver. The existing
> users (hv_balloon and virtio_balloon) don't specify the threshold
> and @pageblock_order is still taken as their page reporting order.
> So this shouldn't introduce functional changes.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  include/linux/page_reporting.h |  3 +++
>  mm/page_reporting.c            | 14 ++++++++++----
>  mm/page_reporting.h            | 10 ++--------
>  3 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
> index 3b99e0ec24f2..fe648dfa3a7c 100644
> --- a/include/linux/page_reporting.h
> +++ b/include/linux/page_reporting.h
> @@ -18,6 +18,9 @@ struct page_reporting_dev_info {
>
>         /* Current state of page reporting */
>         atomic_t state;
> +
> +       /* Minimal order of page reporting */
> +       unsigned int order;
>  };
>
>  /* Tear-down and bring-up for page reporting devices */
> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
> index df9c5054e1b4..27670360bae6 100644
> --- a/mm/page_reporting.c
> +++ b/mm/page_reporting.c

<snip>

> @@ -324,6 +324,12 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
>                 goto err_out;
>         }
>
> +       /*
> +        * We need to choose the minimal order of page reporting if it's
> +        * not specified by the driver.
> +        */
> +       prdev->order = prdev->order ? prdev->order : pageblock_order;
> +
>         /* initialize state and work structures */
>         atomic_set(&prdev->state, PAGE_REPORTING_IDLE);
>         INIT_DELAYED_WORK(&prdev->work, &page_reporting_process);

Rather than using prdev->order directly it might be better to have a
reporting_order value you could export for use by
page_reporting_notify_free. That way you avoid the overhead of having
to make a function call per page freed.

> diff --git a/mm/page_reporting.h b/mm/page_reporting.h
> index 2c385dd4ddbd..d9f972e72649 100644
> --- a/mm/page_reporting.h
> +++ b/mm/page_reporting.h
> @@ -10,11 +10,9 @@
>  #include <linux/pgtable.h>
>  #include <linux/scatterlist.h>
>
> -#define PAGE_REPORTING_MIN_ORDER       pageblock_order
> -
>  #ifdef CONFIG_PAGE_REPORTING
>  DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
> -void __page_reporting_notify(void);
> +void __page_reporting_notify(unsigned int order);
>
>  static inline bool page_reported(struct page *page)
>  {
> @@ -37,12 +35,8 @@ static inline void page_reporting_notify_free(unsigned int order)
>         if (!static_branch_unlikely(&page_reporting_enabled))
>                 return;
>
> -       /* Determine if we have crossed reporting threshold */
> -       if (order < PAGE_REPORTING_MIN_ORDER)
> -               return;
> -
>         /* This will add a few cycles, but should be called infrequently */
> -       __page_reporting_notify();
> +       __page_reporting_notify(order);
>  }
>  #else /* CONFIG_PAGE_REPORTING */
>  #define page_reported(_page)   false

With us making the function call per page freed we are likely to have
a much more significant impact on performance with page reporting
enabled. Ideally we want to limit this impact so that we only take the
cost for the conditional check on the lower order pages.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed
  2021-06-22  7:49 ` [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed Gavin Shan
@ 2021-06-22 17:44   ` Alexander Duyck
  2021-06-23  0:47     ` Gavin Shan
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Duyck @ 2021-06-22 17:44 UTC (permalink / raw)
  To: Gavin Shan
  Cc: linux-mm, LKML, David Hildenbrand, Michael S. Tsirkin,
	Andrew Morton, Anshuman Khandual, Catalin Marinas, Will Deacon,
	shan.gavin

On Mon, Jun 21, 2021 at 10:49 PM Gavin Shan <gshan@redhat.com> wrote:
>
> The page reporting won't be triggered if the freeing page can't come
> up with a free area, whose size is equal or bigger than the threshold
> (page reporting order). The default page reporting order, equal to
> @pageblock_order, is too huge on some architectures to trigger page
> reporting. One example is ARM64 when 64KB base page size is used.
>
>       PAGE_SIZE:          64KB
>       pageblock_order:    13       (512MB)
>       MAX_ORDER:          14
>
> This specifies the page reporting order to 5 (2MB) for this specific
> case so that page reporting can be triggered.
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 510e9318854d..fd419780cc23 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -993,6 +993,23 @@ static int virtballoon_probe(struct virtio_device *vdev)
>                         goto out_unregister_oom;
>                 }
>
> +               /*
> +                * The default page reporting order is @pageblock_order, which
> +                * corresponds to 512MB in size on ARM64 when 64KB base page
> +                * size is used. The page reporting won't be triggered if the
> +                * freeing page can't come up with a free area like that huge.
> +                * So we specify the page reporting order to 5, corresponding
> +                * to 2MB. It helps to avoid THP splitting if 4KB base page
> +                * size is used by host.
> +                *
> +                * Ideallh, the page reporting order is selected based on the

"Ideally"

> +                * host's base page size. However, it needs more work to report
> +                * that value. The hardcoded order would be fine currently.
> +                */
> +#if defined(CONFIG_ARM64) && defined(CONFIG_ARM64_64K_PAGES)
> +               vb->pr_dev_info.order = 5;
> +#endif
> +
>                 err = page_reporting_register(&vb->pr_dev_info);
>                 if (err)
>                         goto out_unregister_oom;

This works for now. However my preference would be to look into seeing
if we can add a value that the host can report that would override the
value you selected here. Then in situations where the host has a
smaller THP page size then the guest it can report the preferred
reporting order via the virtio_balloon interface and have greater
flexibility.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold
  2021-06-22 17:39   ` Alexander Duyck
@ 2021-06-23  0:43     ` Gavin Shan
  0 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2021-06-23  0:43 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: linux-mm, LKML, David Hildenbrand, Michael S. Tsirkin,
	Andrew Morton, Anshuman Khandual, Catalin Marinas, Will Deacon,
	shan.gavin

On 6/23/21 3:39 AM, Alexander Duyck wrote:
> On Mon, Jun 21, 2021 at 10:48 PM Gavin Shan <gshan@redhat.com> wrote:
>>
>> The page reporting threshold is currently sticky to @pageblock_order.
>> The page reporting can never be triggered because the freeing page
>> can't come up with a free area like that huge. The situation becomes
>> worse when the system memory becomes heavily fragmented.
>>
>> For example, the following configurations are used on ARM64 when 64KB
>> base page size is enabled. In this specific case, the page reporting
>> won't be triggered until the freeing page comes up with a 512MB free
>> area. That's hard to be met, especially when the system memory becomes
>> heavily fragmented.
>>
>>     PAGE_SIZE:          64KB
>>     HPAGE_SIZE:         512MB
>>     pageblock_order:    13       (512MB)
>>     MAX_ORDER:          14
>>
>> This allows the drivers to specify the threshold when the page
>> reporting device is registered. The threshold falls back to
>> @pageblock_order if it's not specified by the driver. The existing
>> users (hv_balloon and virtio_balloon) don't specify the threshold
>> and @pageblock_order is still taken as their page reporting order.
>> So this shouldn't introduce functional changes.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   include/linux/page_reporting.h |  3 +++
>>   mm/page_reporting.c            | 14 ++++++++++----
>>   mm/page_reporting.h            | 10 ++--------
>>   3 files changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
>> index 3b99e0ec24f2..fe648dfa3a7c 100644
>> --- a/include/linux/page_reporting.h
>> +++ b/include/linux/page_reporting.h
>> @@ -18,6 +18,9 @@ struct page_reporting_dev_info {
>>
>>          /* Current state of page reporting */
>>          atomic_t state;
>> +
>> +       /* Minimal order of page reporting */
>> +       unsigned int order;
>>   };
>>
>>   /* Tear-down and bring-up for page reporting devices */
>> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
>> index df9c5054e1b4..27670360bae6 100644
>> --- a/mm/page_reporting.c
>> +++ b/mm/page_reporting.c
> 
> <snip>
> 
>> @@ -324,6 +324,12 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
>>                  goto err_out;
>>          }
>>
>> +       /*
>> +        * We need to choose the minimal order of page reporting if it's
>> +        * not specified by the driver.
>> +        */
>> +       prdev->order = prdev->order ? prdev->order : pageblock_order;
>> +
>>          /* initialize state and work structures */
>>          atomic_set(&prdev->state, PAGE_REPORTING_IDLE);
>>          INIT_DELAYED_WORK(&prdev->work, &page_reporting_process);
> 
> Rather than using prdev->order directly it might be better to have a
> reporting_order value you could export for use by
> page_reporting_notify_free. That way you avoid the overhead of having
> to make a function call per page freed.
> 

Yes, I obviously missed the point to reduce the overhead because of
function call. In next revision, I will introduce @page_reporting_order
for this. Besides, it will be exported as a module parameter so that
it can be changed dynamically, as David suggested before.

>> diff --git a/mm/page_reporting.h b/mm/page_reporting.h
>> index 2c385dd4ddbd..d9f972e72649 100644
>> --- a/mm/page_reporting.h
>> +++ b/mm/page_reporting.h
>> @@ -10,11 +10,9 @@
>>   #include <linux/pgtable.h>
>>   #include <linux/scatterlist.h>
>>
>> -#define PAGE_REPORTING_MIN_ORDER       pageblock_order
>> -
>>   #ifdef CONFIG_PAGE_REPORTING
>>   DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
>> -void __page_reporting_notify(void);
>> +void __page_reporting_notify(unsigned int order);
>>
>>   static inline bool page_reported(struct page *page)
>>   {
>> @@ -37,12 +35,8 @@ static inline void page_reporting_notify_free(unsigned int order)
>>          if (!static_branch_unlikely(&page_reporting_enabled))
>>                  return;
>>
>> -       /* Determine if we have crossed reporting threshold */
>> -       if (order < PAGE_REPORTING_MIN_ORDER)
>> -               return;
>> -
>>          /* This will add a few cycles, but should be called infrequently */
>> -       __page_reporting_notify();
>> +       __page_reporting_notify(order);
>>   }
>>   #else /* CONFIG_PAGE_REPORTING */
>>   #define page_reported(_page)   false
> 
> With us making the function call per page freed we are likely to have
> a much more significant impact on performance with page reporting
> enabled. Ideally we want to limit this impact so that we only take the
> cost for the conditional check on the lower order pages.
> 

Yep, thanks for the explanation, Alex.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed
  2021-06-22 17:44   ` Alexander Duyck
@ 2021-06-23  0:47     ` Gavin Shan
  0 siblings, 0 replies; 8+ messages in thread
From: Gavin Shan @ 2021-06-23  0:47 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: linux-mm, LKML, David Hildenbrand, Michael S. Tsirkin,
	Andrew Morton, Anshuman Khandual, Catalin Marinas, Will Deacon,
	shan.gavin

On 6/23/21 3:44 AM, Alexander Duyck wrote:
> On Mon, Jun 21, 2021 at 10:49 PM Gavin Shan <gshan@redhat.com> wrote:
>>
>> The page reporting won't be triggered if the freeing page can't come
>> up with a free area, whose size is equal or bigger than the threshold
>> (page reporting order). The default page reporting order, equal to
>> @pageblock_order, is too huge on some architectures to trigger page
>> reporting. One example is ARM64 when 64KB base page size is used.
>>
>>        PAGE_SIZE:          64KB
>>        pageblock_order:    13       (512MB)
>>        MAX_ORDER:          14
>>
>> This specifies the page reporting order to 5 (2MB) for this specific
>> case so that page reporting can be triggered.
>>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: virtualization@lists.linux-foundation.org
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   drivers/virtio/virtio_balloon.c | 17 +++++++++++++++++
>>   1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
>> index 510e9318854d..fd419780cc23 100644
>> --- a/drivers/virtio/virtio_balloon.c
>> +++ b/drivers/virtio/virtio_balloon.c
>> @@ -993,6 +993,23 @@ static int virtballoon_probe(struct virtio_device *vdev)
>>                          goto out_unregister_oom;
>>                  }
>>
>> +               /*
>> +                * The default page reporting order is @pageblock_order, which
>> +                * corresponds to 512MB in size on ARM64 when 64KB base page
>> +                * size is used. The page reporting won't be triggered if the
>> +                * freeing page can't come up with a free area like that huge.
>> +                * So we specify the page reporting order to 5, corresponding
>> +                * to 2MB. It helps to avoid THP splitting if 4KB base page
>> +                * size is used by host.
>> +                *
>> +                * Ideallh, the page reporting order is selected based on the
> 
> "Ideally"
> 

Yeah, I noticed it right after the patch was posted. Will be fixed
in v3.

>> +                * host's base page size. However, it needs more work to report
>> +                * that value. The hardcoded order would be fine currently.
>> +                */
>> +#if defined(CONFIG_ARM64) && defined(CONFIG_ARM64_64K_PAGES)
>> +               vb->pr_dev_info.order = 5;
>> +#endif
>> +
>>                  err = page_reporting_register(&vb->pr_dev_info);
>>                  if (err)
>>                          goto out_unregister_oom;
> 
> This works for now. However my preference would be to look into seeing
> if we can add a value that the host can report that would override the
> value you selected here. Then in situations where the host has a
> smaller THP page size then the guest it can report the preferred
> reporting order via the virtio_balloon interface and have greater
> flexibility.
> 
> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
> 

Yes, It's something for later. Lets fix the particular case
(ARM64 and 64KB page size) for now.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-06-22 22:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-22  7:49 [PATCH v2 0/3] mm/page_reporting: Make page reporting work on arm64 with 64KB page size Gavin Shan
2021-06-22  7:49 ` [PATCH v2 1/3] mm/page_reporting: Fix code style in __page_reporting_request() Gavin Shan
2021-06-22  7:49 ` [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold Gavin Shan
2021-06-22 17:39   ` Alexander Duyck
2021-06-23  0:43     ` Gavin Shan
2021-06-22  7:49 ` [PATCH v2 3/3] virtio_balloon: Specify page reporting order if needed Gavin Shan
2021-06-22 17:44   ` Alexander Duyck
2021-06-23  0:47     ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).