* Hugepage migration
@ 2023-05-28 20:07 Baruch Even
  2023-05-30  1:35 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Baruch Even @ 2023-05-28 20:07 UTC (permalink / raw)
  To: dpdk-dev

Hi,

We found an issue with newer kernels (5.13+) that are found on newer OSes
(Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15) where a 2M page that was
allocated for DPDK was migrated (moved into another physical page) when a
1G page was allocated.

From our reading of the kernel commits this started with commit
ae37c7ff79f1f030e28ec76c46ee032f8fd07607
    mm: make alloc_contig_range handle in-use hugetlb pages

This caused what looked like memory corruptions to us and cases where the
rings were moved from their physical location and communication was no
longer possible.

I wanted to ask if anyone else hit this issue and what mitigations are
available?

We are currently looking at using a kernel driver to pin the pages but I
expect that this issue will affect others and that a more general approach
is needed.

Thanks,
Baruch
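
One way to confirm from user space that a page really moved is to sample its physical frame number (PFN) through /proc/self/pagemap before and after the 1G allocation. The following is a minimal sketch, not what we run in production: the helper name virt_to_pfn() is made up for illustration, and it assumes a Linux kernel where the caller has CAP_SYS_ADMIN (without it, recent kernels report the PFN as 0).

```c
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Layout of a 64-bit pagemap entry, per the kernel's pagemap documentation. */
#define PFN_MASK     ((1ULL << 55) - 1)  /* bits 0-54: page frame number */
#define PAGE_PRESENT (1ULL << 63)        /* bit 63: page is present in RAM */

/*
 * Hypothetical helper: look up the PFN that currently backs addr.
 * Returns 0 and stores the PFN in *pfn on success, -1 on any failure
 * (pagemap not readable, short read, or page not present).
 * *pfn is 0 when the kernel hides PFNs from an unprivileged caller.
 */
static int virt_to_pfn(const void *addr, uint64_t *pfn)
{
    long page_size = sysconf(_SC_PAGESIZE);
    uint64_t entry;
    off_t offset;
    ssize_t n;
    int fd;

    if (page_size <= 0)
        return -1;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    /* One 8-byte entry per base-size virtual page, indexed by page number. */
    offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
             (off_t)sizeof(entry);
    n = pread(fd, &entry, sizeof(entry), offset);
    close(fd);

    if (n != (ssize_t)sizeof(entry))
        return -1;
    if (!(entry & PAGE_PRESENT))
        return -1;

    *pfn = entry & PFN_MASK;
    return 0;
}
```

Calling this on an address inside a ring at startup, and again after the 1G page allocation, and comparing the two values would show whether the backing page changed.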

-- 
Baruch Even
Platform Technical Lead,  WEKA
E baruch@weka.io  W www.weka.io


* Re: Hugepage migration
  2023-05-28 20:07 Hugepage migration Baruch Even
@ 2023-05-30  1:35 ` Stephen Hemminger
  2023-05-30 13:51   ` Baruch Even
  2023-05-30  3:11 ` Stephen Hemminger
  2023-05-30  8:04 ` Bruce Richardson
  2 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2023-05-30  1:35 UTC (permalink / raw)
  To: Baruch Even; +Cc: dpdk-dev

Fix might be as simple as asking kernel to lock the mmap().

diff --git a/lib/eal/linux/eal_hugepage_info.c b/lib/eal/linux/eal_hugepage_info.c
index 581d9dfc91eb..989c69387233 100644
--- a/lib/eal/linux/eal_hugepage_info.c
+++ b/lib/eal/linux/eal_hugepage_info.c
@@ -48,7 +48,8 @@ map_shared_memory(const char *filename, const size_t mem_size, int flags)
 		return NULL;
 	}
 	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
-			MAP_SHARED, fd, 0);
+			MAP_SHARED_VALIDATE | MAP_LOCKED, fd, 0);
+
 	close(fd);
 	return retval == MAP_FAILED ? NULL : retval;
 }
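
For anyone who wants to try the flag outside of EAL, here is a small standalone sketch of a locked mapping. It is an illustration only: map_locked_anon() is a hypothetical helper, and it uses an anonymous private mapping rather than the hugetlbfs file that map_shared_memory() opens. Note that locking is documented as keeping pages resident in RAM; I am not aware of any documentation that promises the physical address stays fixed, so this may not be enough against migration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and ask the
 * kernel to lock it (MAP_LOCKED behaves much like mlock() on the range).
 * If the locked mapping is refused, for example because RLIMIT_MEMLOCK
 * is too low, fall back to an ordinary mapping.
 * *locked is set to 1 if MAP_LOCKED was accepted, 0 otherwise.
 * Returns NULL if both attempts fail.
 */
static void *map_locked_anon(size_t len, int *locked)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
    if (p != MAP_FAILED) {
        *locked = 1;
        return p;
    }

    /* Locked mapping refused: retry without the lock request. */
    *locked = 0;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The fallback is there only so the sketch runs under a restrictive memlock limit; a real caller would more likely treat a refused lock as an error.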

* Re: Hugepage migration
  2023-05-28 20:07 Hugepage migration Baruch Even
  2023-05-30  1:35 ` Stephen Hemminger
@ 2023-05-30  3:11 ` Stephen Hemminger
  2023-05-30  8:04 ` Bruce Richardson
  2 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2023-05-30  3:11 UTC (permalink / raw)
  To: Baruch Even; +Cc: dpdk-dev

Report this upstream as a kernel regression; they probably care about it.
A kernel driver hack is overkill, and it becomes a maintenance burden and long-term technical debt.

* Re: Hugepage migration
  2023-05-28 20:07 Hugepage migration Baruch Even
  2023-05-30  1:35 ` Stephen Hemminger
  2023-05-30  3:11 ` Stephen Hemminger
@ 2023-05-30  8:04 ` Bruce Richardson
  2023-05-30 13:53   ` Baruch Even
  2 siblings, 1 reply; 7+ messages in thread
From: Bruce Richardson @ 2023-05-30  8:04 UTC (permalink / raw)
  To: Baruch Even; +Cc: dpdk-dev

Hi,

What kernel driver was being used for the device I/O part? Was it a UIO
based driver or "vfio-pci"? When using vfio-pci and configuring IOMMU
mappings, the pages mapped should be pinned by the kernel, I would have
thought, since the kernel knows they are being used by devices.

/Bruce

* Re: Hugepage migration
  2023-05-30  1:35 ` Stephen Hemminger
@ 2023-05-30 13:51   ` Baruch Even
  0 siblings, 0 replies; 7+ messages in thread
From: Baruch Even @ 2023-05-30 13:51 UTC (permalink / raw)
  To: stephen; +Cc: dpdk-dev

I have tested MAP_LOCKED; it doesn't help in this case. I do intend to
report this to the kernel developers, but was wondering whether others
had hit this first.



* Re: Hugepage migration
  2023-05-30  8:04 ` Bruce Richardson
@ 2023-05-30 13:53   ` Baruch Even
  2023-05-30 15:33     ` Stephen Hemminger
  0 siblings, 1 reply; 7+ messages in thread
From: Baruch Even @ 2023-05-30 13:53 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dpdk-dev

On Tue, May 30, 2023 at 11:04 AM Bruce Richardson <bruce.richardson@intel.com> wrote:

> Hi,
>
> what kernel driver was being used for the device I/O part? Was it a UIO
> based driver or "vfio-pci"? When using vfio-pci and configuring IOMMU
> mappings, the pages mapped should be pinned by the kernel, I would have
> thought, since the kernel knows they are being used by devices.
>
> /Bruce
>

This was using igb_uio on an AWS instance with their ena driver.

Baruch

* Re: Hugepage migration
  2023-05-30 13:53   ` Baruch Even
@ 2023-05-30 15:33     ` Stephen Hemminger
  0 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2023-05-30 15:33 UTC (permalink / raw)
  To: Baruch Even; +Cc: Bruce Richardson, dpdk-dev

On Tue, 30 May 2023 16:53:14 +0300
Baruch Even <baruch@weka.io> wrote:

> > what kernel driver was being used for the device I/O part? Was it a UIO
> > based driver or "vfio-pci"? When using vfio-pci and configuring IOMMU
> > mappings, the pages mapped should be pinned by the kernel, I would have
> > thought, since the kernel knows they are being used by devices.
> >
> > /Bruce
> >  
> 
> This was using igb_uio on an AWS instance with their ena driver.
> 
> Baruch

Try VFIO. igb_uio is effectively an out-of-tree driver, and the kernel
maintainers are unlikely to give you much support.
