* [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
@ 2021-02-03  0:31 Suren Baghdasaryan
  2021-02-03  0:31 ` [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm Suren Baghdasaryan
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03  0:31 UTC (permalink / raw)
  To: sumit.semwal
  Cc: akpm, hch, lmark, labbott, Brian.Starkey, john.stultz,
	christian.koenig, cgoldswo, orjan.eide, robin.murphy, jajones,
	minchan, hridya, sspatil, linux-media, dri-devel, linaro-mm-sig,
	linux-mm, linux-kernel, kernel-team, surenb

Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
WARN_ON_ONCE and return an error. This ensures users of
vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
and get an indication of an error without panicking the kernel.
This will help identify drivers that need to clear VM_PFNMAP before
using the dmabuf system heap, which is moving to use vm_insert_page.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..e503c9801cd9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1827,7 +1827,8 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 		return -EINVAL;
 	if (!(vma->vm_flags & VM_MIXEDMAP)) {
 		BUG_ON(mmap_read_trylock(vma->vm_mm));
-		BUG_ON(vma->vm_flags & VM_PFNMAP);
+		if (WARN_ON_ONCE(vma->vm_flags & VM_PFNMAP))
+			return -EINVAL;
 		vma->vm_flags |= VM_MIXEDMAP;
 	}
 	return insert_page(vma, addr, page, vma->vm_page_prot);
-- 
2.30.0.365.g02bc693789-goog
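
For illustration, a hedged sketch of what a caller now sees (the
driver helper below is hypothetical, not part of the patch):

#include <linux/mm.h>

/*
 * Hypothetical driver helper that maps an array of driver-owned
 * pages into userspace. With the patch above, a stray VM_PFNMAP
 * on the VMA makes vm_insert_page() warn once and return -EINVAL
 * instead of panicking the kernel via BUG_ON.
 */
static int foo_map_pages(struct vm_area_struct *vma,
                         struct page **pages, unsigned long npages)
{
        unsigned long addr = vma->vm_start;
        unsigned long i;
        int ret;

        for (i = 0; i < npages; i++) {
                ret = vm_insert_page(vma, addr, pages[i]);
                if (ret)        /* now reachable for VM_PFNMAP VMAs */
                        return ret;
                addr += PAGE_SIZE;
        }
        return 0;
}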



* [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  0:31 [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Suren Baghdasaryan
@ 2021-02-03  0:31 ` Suren Baghdasaryan
  2021-02-03  1:39   ` Minchan Kim
  2021-02-03  2:07   ` John Stultz
  2021-02-03  1:27 ` [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Miaohe Lin
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03  0:31 UTC (permalink / raw)
  To: sumit.semwal
  Cc: akpm, hch, lmark, labbott, Brian.Starkey, john.stultz,
	christian.koenig, cgoldswo, orjan.eide, robin.murphy, jajones,
	minchan, hridya, sspatil, linux-media, dri-devel, linaro-mm-sig,
	linux-mm, linux-kernel, kernel-team, surenb

Currently the system heap maps its buffers with the VM_PFNMAP flag
using remap_pfn_range. This results in such buffers not being
accounted for in PSS calculations because the vm treats this memory
as having no page structs. Without page structs there are no counters
representing how many processes are mapping a page and therefore PSS
calculation is impossible.
Historically, the ION driver used to map its buffers as VM_PFNMAP
areas due to memory carveouts that did not have page structs [1].
That is not the case anymore and it seems there was a desire to move
away from remap_pfn_range [2].
The dmabuf system heap design inherits this ION behavior and maps its
pages using remap_pfn_range even though the allocated pages are
backed by page structs.
Replace remap_pfn_range with vm_insert_page, following Laura's
suggestion in [1]. This allows correct PSS calculation for dmabufs.

[1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
[2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
(sorry, could not find lore links for these discussions)

Suggested-by: Laura Abbott <labbott@kernel.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
v1 posted at: https://lore.kernel.org/patchwork/patch/1372409/

changes in v2:
- removed VM_PFNMAP clearing part of the patch, per Minchan and Christoph
- created prerequisite patch to replace BUG_ON with WARN_ON_ONCE, per Christoph

 drivers/dma-buf/heaps/system_heap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 17e0e9a68baf..4983f18cc2ce 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -203,8 +203,7 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
 	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
 		struct page *page = sg_page_iter_page(&piter);
 
-		ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
-				      vma->vm_page_prot);
+		ret = vm_insert_page(vma, addr, page);
 		if (ret)
 			return ret;
 		addr += PAGE_SIZE;
-- 
2.30.0.365.g02bc693789-goog
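
A hedged userspace sketch of the effect (nothing here is part of the
patch; buf_fd is assumed to be a dma-buf fd exported by the system
heap): once the pages are inserted with vm_insert_page, the mapping
is struct-page backed and contributes to Pss in /proc/<pid>/smaps.

#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map and touch a system heap dma-buf; after this patch the mapping
 * is accounted in the process's Pss like ordinary shared memory. */
static int touch_dmabuf(int buf_fd, size_t len)
{
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                       buf_fd, 0);

        if (p == MAP_FAILED)
                return -1;
        p[0] = 0;       /* fault in the first page */
        return munmap(p, len);
}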



* Re: [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  0:31 [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Suren Baghdasaryan
  2021-02-03  0:31 ` [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm Suren Baghdasaryan
@ 2021-02-03  1:27 ` Miaohe Lin
  2021-02-03  1:31 ` Minchan Kim
  2021-02-03  1:55 ` Matthew Wilcox
  3 siblings, 0 replies; 23+ messages in thread
From: Miaohe Lin @ 2021-02-03  1:27 UTC (permalink / raw)
  To: Suren Baghdasaryan, sumit.semwal
  Cc: akpm, hch, lmark, labbott, Brian.Starkey, john.stultz,
	christian.koenig, cgoldswo, orjan.eide, robin.murphy, jajones,
	minchan, hridya, sspatil, linux-media, dri-devel, linaro-mm-sig,
	linux-mm, linux-kernel, kernel-team

Hi:
On 2021/2/3 8:31, Suren Baghdasaryan wrote:
> Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> WARN_ON_ONCE and return an error. This ensures users of
> vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> and get an indication of an error without panicking the kernel.
> This will help identify drivers that need to clear VM_PFNMAP before
> using the dmabuf system heap, which is moving to use vm_insert_page.
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

Looks reasonable. Thanks.
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  mm/memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index feff48e1465a..e503c9801cd9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1827,7 +1827,8 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  		return -EINVAL;
>  	if (!(vma->vm_flags & VM_MIXEDMAP)) {
>  		BUG_ON(mmap_read_trylock(vma->vm_mm));
> -		BUG_ON(vma->vm_flags & VM_PFNMAP);
> +		if (WARN_ON_ONCE(vma->vm_flags & VM_PFNMAP))
> +			return -EINVAL;
>  		vma->vm_flags |= VM_MIXEDMAP;
>  	}
>  	return insert_page(vma, addr, page, vma->vm_page_prot);
> 



* Re: [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  0:31 [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Suren Baghdasaryan
  2021-02-03  0:31 ` [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm Suren Baghdasaryan
  2021-02-03  1:27 ` [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Miaohe Lin
@ 2021-02-03  1:31 ` Minchan Kim
  2021-02-03  1:55   ` Suren Baghdasaryan
  2021-02-03  1:55 ` Matthew Wilcox
  3 siblings, 1 reply; 23+ messages in thread
From: Minchan Kim @ 2021-02-03  1:31 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: sumit.semwal, akpm, hch, lmark, labbott, Brian.Starkey,
	john.stultz, christian.koenig, cgoldswo, orjan.eide,
	robin.murphy, jajones, hridya, sspatil, linux-media, dri-devel,
	linaro-mm-sig, linux-mm, linux-kernel, kernel-team

On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> WARN_ON_ONCE and return an error. This ensures users of
> vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> and get an indication of an error without panicking the kernel.
> This will help identify drivers that need to clear VM_PFNMAP before
> using the dmabuf system heap, which is moving to use vm_insert_page.
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  mm/memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index feff48e1465a..e503c9801cd9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1827,7 +1827,8 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  		return -EINVAL;
>  	if (!(vma->vm_flags & VM_MIXEDMAP)) {
>  		BUG_ON(mmap_read_trylock(vma->vm_mm));

Better to replace above BUG_ON with WARN_ON_ONCE, too?


* Re: [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  0:31 ` [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm Suren Baghdasaryan
@ 2021-02-03  1:39   ` Minchan Kim
  2021-02-03  2:02     ` Suren Baghdasaryan
  2021-02-03  2:07   ` John Stultz
  1 sibling, 1 reply; 23+ messages in thread
From: Minchan Kim @ 2021-02-03  1:39 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: sumit.semwal, akpm, hch, lmark, labbott, Brian.Starkey,
	john.stultz, christian.koenig, cgoldswo, orjan.eide,
	robin.murphy, jajones, hridya, sspatil, linux-media, dri-devel,
	linaro-mm-sig, linux-mm, linux-kernel, kernel-team

On Tue, Feb 02, 2021 at 04:31:34PM -0800, Suren Baghdasaryan wrote:
> Currently the system heap maps its buffers with the VM_PFNMAP flag
> using remap_pfn_range. This results in such buffers not being
> accounted for in PSS calculations because the vm treats this memory
> as having no page structs. Without page structs there are no counters
> representing how many processes are mapping a page and therefore PSS
> calculation is impossible.
> Historically, the ION driver used to map its buffers as VM_PFNMAP
> areas due to memory carveouts that did not have page structs [1].
> That is not the case anymore and it seems there was a desire to move
> away from remap_pfn_range [2].
> The dmabuf system heap design inherits this ION behavior and maps its
> pages using remap_pfn_range even though the allocated pages are
> backed by page structs.
> Replace remap_pfn_range with vm_insert_page, following Laura's
> suggestion in [1]. This allows correct PSS calculation for dmabufs.
> 
> [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
> [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
> (sorry, could not find lore links for these discussions)
> 
> Suggested-by: Laura Abbott <labbott@kernel.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>

A note: This patch makes dmabuf system heap memory accounted in PSS,
so if someone relies on the size, they will see the bloat.
IIRC, there was some debate whether PSS accounting for these
buffers is correct or not. If it turns out to be a problem, we need
to discuss how to solve it (maybe respect vma->vm_flags and
reintroduce remap_pfn_range for such mappings).
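
A hedged sketch of that fallback, purely illustrative and not part of
this series (it reuses the sgtable iteration from system_heap_mmap):

#include <linux/mm.h>
#include <linux/scatterlist.h>

/* Illustrative only: honor a preset VM_PFNMAP by falling back to
 * remap_pfn_range, otherwise insert struct-page-backed mappings. */
static int system_heap_mmap_sketch(struct sg_table *table,
                                   struct vm_area_struct *vma)
{
        unsigned long addr = vma->vm_start;
        struct sg_page_iter piter;
        int ret;

        for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
                struct page *page = sg_page_iter_page(&piter);

                if (vma->vm_flags & VM_PFNMAP)
                        ret = remap_pfn_range(vma, addr, page_to_pfn(page),
                                              PAGE_SIZE, vma->vm_page_prot);
                else
                        ret = vm_insert_page(vma, addr, page);
                if (ret)
                        return ret;
                addr += PAGE_SIZE;
                if (addr >= vma->vm_end)
                        return 0;
        }
        return 0;
}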



* Re: [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  1:31 ` Minchan Kim
@ 2021-02-03  1:55   ` Suren Baghdasaryan
  0 siblings, 0 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03  1:55 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sumit Semwal, Andrew Morton, Christoph Hellwig, Liam Mark,
	labbott, Brian Starkey, John Stultz, Christian König,
	Chris Goldsworthy, Ørjan Eide, Robin Murphy, James Jones,
	Hridya Valsaraju, Sandeep Patil, linux-media, DRI mailing list,
	moderated list:DMA BUFFER SHARING FRAMEWORK, linux-mm, LKML,
	kernel-team

On Tue, Feb 2, 2021 at 5:31 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > WARN_ON_ONCE and return an error. This ensures users of
> > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > and get an indication of an error without panicking the kernel.
> > This will help identify drivers that need to clear VM_PFNMAP before
> > using the dmabuf system heap, which is moving to use vm_insert_page.
> >
> > Suggested-by: Christoph Hellwig <hch@infradead.org>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  mm/memory.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index feff48e1465a..e503c9801cd9 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1827,7 +1827,8 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
> >               return -EINVAL;
> >       if (!(vma->vm_flags & VM_MIXEDMAP)) {
> >               BUG_ON(mmap_read_trylock(vma->vm_mm));
>
> Better to replace above BUG_ON with WARN_ON_ONCE, too?

If nobody objects I'll do that in the next respin. Thanks!



* Re: [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  0:31 [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Suren Baghdasaryan
                   ` (2 preceding siblings ...)
  2021-02-03  1:31 ` Minchan Kim
@ 2021-02-03  1:55 ` Matthew Wilcox
  2021-02-03  2:26   ` Suren Baghdasaryan
  2021-02-03  8:52   ` [Linaro-mm-sig] " Daniel Vetter
  3 siblings, 2 replies; 23+ messages in thread
From: Matthew Wilcox @ 2021-02-03  1:55 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: sumit.semwal, akpm, hch, lmark, labbott, Brian.Starkey,
	john.stultz, christian.koenig, cgoldswo, orjan.eide,
	robin.murphy, jajones, minchan, hridya, sspatil, linux-media,
	dri-devel, linaro-mm-sig, linux-mm, linux-kernel, kernel-team

On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> WARN_ON_ONCE and return an error. This ensures users of
> vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> and get an indication of an error without panicking the kernel.
> This will help identify drivers that need to clear VM_PFNMAP before
> using the dmabuf system heap, which is moving to use vm_insert_page.

NACK.

The system may not _panic_, but it is clearly now _broken_.  The device
doesn't work, and so the system is useless.  You haven't really improved
anything here.  Just bloated the kernel with yet another _ONCE variable
that in a normal system will never ever ever be triggered.


* Re: [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  1:39   ` Minchan Kim
@ 2021-02-03  2:02     ` Suren Baghdasaryan
  2021-02-03  8:05       ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03  2:02 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sumit Semwal, Andrew Morton, Christoph Hellwig, Liam Mark,
	labbott, Brian Starkey, John Stultz, Christian König,
	Chris Goldsworthy, Ørjan Eide, Robin Murphy, James Jones,
	Hridya Valsaraju, Sandeep Patil, linux-media, DRI mailing list,
	moderated list:DMA BUFFER SHARING FRAMEWORK, linux-mm, LKML,
	kernel-team

On Tue, Feb 2, 2021 at 5:39 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Tue, Feb 02, 2021 at 04:31:34PM -0800, Suren Baghdasaryan wrote:
> > Currently the system heap maps its buffers with the VM_PFNMAP flag
> > using remap_pfn_range. This results in such buffers not being
> > accounted for in PSS calculations because the vm treats this memory
> > as having no page structs. Without page structs there are no counters
> > representing how many processes are mapping a page and therefore PSS
> > calculation is impossible.
> > Historically, the ION driver used to map its buffers as VM_PFNMAP
> > areas due to memory carveouts that did not have page structs [1].
> > That is not the case anymore and it seems there was a desire to move
> > away from remap_pfn_range [2].
> > The dmabuf system heap design inherits this ION behavior and maps its
> > pages using remap_pfn_range even though the allocated pages are
> > backed by page structs.
> > Replace remap_pfn_range with vm_insert_page, following Laura's
> > suggestion in [1]. This allows correct PSS calculation for dmabufs.
> >
> > [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
> > [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
> > (sorry, could not find lore links for these discussions)
> >
> > Suggested-by: Laura Abbott <labbott@kernel.org>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Reviewed-by: Minchan Kim <minchan@kernel.org>
>
> A note: This patch makes dmabuf system heap memory accounted in PSS,
> so if someone relies on the size, they will see the bloat.
> IIRC, there was some debate whether PSS accounting for these
> buffers is correct or not. If it turns out to be a problem, we need
> to discuss how to solve it (maybe respect vma->vm_flags and
> reintroduce remap_pfn_range for such mappings).

I did not see debates about not including *mapped* dmabufs into PSS
calculation. I remember people were discussing how to account dmabufs
referred only by the FD but that is a different discussion. If the
buffer is mapped into the address space of a process then IMHO
including it into PSS of that process is not controversial.


* Re: [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  0:31 ` [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm Suren Baghdasaryan
  2021-02-03  1:39   ` Minchan Kim
@ 2021-02-03  2:07   ` John Stultz
  2021-02-03  2:13     ` Suren Baghdasaryan
  1 sibling, 1 reply; 23+ messages in thread
From: John Stultz @ 2021-02-03  2:07 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Sumit Semwal, Andrew Morton, Christoph Hellwig, Liam Mark,
	Laura Abbott, Brian Starkey, Christian Koenig, Chris Goldsworthy,
	Ørjan Eide, Robin Murphy, James Jones, Minchan Kim,
	Hridya Valsaraju, Sandeep Patil, linux-media, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, linux-mm, lkml,
	Android Kernel Team

On Tue, Feb 2, 2021 at 4:31 PM Suren Baghdasaryan <surenb@google.com> wrote:
> Currently the system heap maps its buffers with the VM_PFNMAP flag
> using remap_pfn_range. This results in such buffers not being
> accounted for in PSS calculations because the vm treats this memory
> as having no page structs. Without page structs there are no counters
> representing how many processes are mapping a page and therefore PSS
> calculation is impossible.
> Historically, the ION driver used to map its buffers as VM_PFNMAP
> areas due to memory carveouts that did not have page structs [1].
> That is not the case anymore and it seems there was a desire to move
> away from remap_pfn_range [2].
> The dmabuf system heap design inherits this ION behavior and maps its
> pages using remap_pfn_range even though the allocated pages are
> backed by page structs.
> Replace remap_pfn_range with vm_insert_page, following Laura's
> suggestion in [1]. This allows correct PSS calculation for dmabufs.
>
> [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
> [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
> (sorry, could not find lore links for these discussions)
>
> Suggested-by: Laura Abbott <labbott@kernel.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

For consistency, do we need something similar for the cma heap as well?

thanks
-john


* Re: [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  2:07   ` John Stultz
@ 2021-02-03  2:13     ` Suren Baghdasaryan
  0 siblings, 0 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03  2:13 UTC (permalink / raw)
  To: John Stultz
  Cc: Sumit Semwal, Andrew Morton, Christoph Hellwig, Liam Mark,
	Laura Abbott, Brian Starkey, Christian Koenig, Chris Goldsworthy,
	Ørjan Eide, Robin Murphy, James Jones, Minchan Kim,
	Hridya Valsaraju, Sandeep Patil, linux-media, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, linux-mm, lkml,
	Android Kernel Team

On Tue, Feb 2, 2021 at 6:07 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Tue, Feb 2, 2021 at 4:31 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > Currently the system heap maps its buffers with the VM_PFNMAP flag
> > using remap_pfn_range. This results in such buffers not being
> > accounted for in PSS calculations because the vm treats this memory
> > as having no page structs. Without page structs there are no counters
> > representing how many processes are mapping a page and therefore PSS
> > calculation is impossible.
> > Historically, the ION driver used to map its buffers as VM_PFNMAP
> > areas due to memory carveouts that did not have page structs [1].
> > That is not the case anymore and it seems there was a desire to move
> > away from remap_pfn_range [2].
> > The dmabuf system heap design inherits this ION behavior and maps its
> > pages using remap_pfn_range even though the allocated pages are
> > backed by page structs.
> > Replace remap_pfn_range with vm_insert_page, following Laura's
> > suggestion in [1]. This allows correct PSS calculation for dmabufs.
> >
> > [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
> > [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
> > (sorry, could not find lore links for these discussions)
> >
> > Suggested-by: Laura Abbott <labbott@kernel.org>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
>
> For consistency, do we need something similar for the cma heap as well?

Good question. Let me look closer into it.

>
> thanks
> -john


* Re: [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  1:55 ` Matthew Wilcox
@ 2021-02-03  2:26   ` Suren Baghdasaryan
  2021-02-03  8:52   ` [Linaro-mm-sig] " Daniel Vetter
  1 sibling, 0 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03  2:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Sumit Semwal, Andrew Morton, Christoph Hellwig, Liam Mark,
	Laura Abbott, Brian Starkey, John Stultz, Christian König,
	Chris Goldsworthy, Ørjan Eide, Robin Murphy, James Jones,
	Minchan Kim, Hridya Valsaraju, Sandeep Patil, linux-media,
	DRI mailing list, moderated list:DMA BUFFER SHARING FRAMEWORK,
	linux-mm, LKML, kernel-team

On Tue, Feb 2, 2021 at 5:55 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > WARN_ON_ONCE and return an error. This ensures users of
> > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > and get an indication of an error without panicking the kernel.
> > This will help identify drivers that need to clear VM_PFNMAP before
> > using the dmabuf system heap, which is moving to use vm_insert_page.
>
> NACK.
>
> The system may not _panic_, but it is clearly now _broken_.  The device
> doesn't work, and so the system is useless.  You haven't really improved
> anything here.  Just bloated the kernel with yet another _ONCE variable
> that in a normal system will never ever ever be triggered.

We had a discussion in https://lore.kernel.org/patchwork/patch/1372409
about how some DRM drivers set up their VMAs with VM_PFNMAP before
mapping them. We want to use vm_insert_page instead of remap_pfn_range
in the dmabuf heaps so that this memory is visible in PSS. However, if
a driver that sets VM_PFNMAP tries to use a dmabuf heap, it will step
into this BUG_ON. We wanted to catch and gradually fix such drivers
without causing a panic in the process. I hope this clarifies the
reasons why I'm making this change, and I'm open to other ideas if they
would address this issue in a better way.
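
For reference, the pattern in question looks roughly like this (a
hedged sketch; drm_gem_mmap_obj() is the generic helper that sets
VM_PFNMAP, and the function and object usage here are simplified):

#include <drm/drm_gem.h>
#include <linux/dma-buf.h>

/* Hedged sketch of the problematic flow: generic GEM mmap setup marks
 * the VMA VM_PFNMAP, then the mapping is forwarded to a dma-buf
 * exporter that now uses vm_insert_page(). */
static int foo_gem_mmap_sketch(struct drm_gem_object *obj,
                               struct vm_area_struct *vma)
{
        int ret;

        ret = drm_gem_mmap_obj(obj, obj->size, vma); /* sets VM_PFNMAP */
        if (ret)
                return ret;

        /* With the system heap using vm_insert_page(), this now trips
         * the (previously fatal) VM_PFNMAP check and fails cleanly. */
        return dma_buf_mmap(obj->dma_buf, vma, 0);
}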


* Re: [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  2:02     ` Suren Baghdasaryan
@ 2021-02-03  8:05       ` Christian König
  2021-02-03 19:53         ` Suren Baghdasaryan
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2021-02-03  8:05 UTC (permalink / raw)
  To: Suren Baghdasaryan, Minchan Kim
  Cc: Sumit Semwal, Andrew Morton, Christoph Hellwig, Liam Mark,
	labbott, Brian Starkey, John Stultz, Chris Goldsworthy,
	Ørjan Eide, Robin Murphy, James Jones, Hridya Valsaraju,
	Sandeep Patil, linux-media, DRI mailing list,
	moderated list:DMA BUFFER SHARING FRAMEWORK, linux-mm, LKML,
	kernel-team

On 03.02.21 at 03:02, Suren Baghdasaryan wrote:
> On Tue, Feb 2, 2021 at 5:39 PM Minchan Kim <minchan@kernel.org> wrote:
>> On Tue, Feb 02, 2021 at 04:31:34PM -0800, Suren Baghdasaryan wrote:
>>> Currently the system heap maps its buffers with the VM_PFNMAP flag
>>> using remap_pfn_range. This results in such buffers not being
>>> accounted for in PSS calculations because the vm treats this memory
>>> as having no page structs. Without page structs there are no counters
>>> representing how many processes are mapping a page and therefore PSS
>>> calculation is impossible.
>>> Historically, the ION driver used to map its buffers as VM_PFNMAP
>>> areas due to memory carveouts that did not have page structs [1].
>>> That is not the case anymore and it seems there was a desire to move
>>> away from remap_pfn_range [2].
>>> The dmabuf system heap design inherits this ION behavior and maps its
>>> pages using remap_pfn_range even though the allocated pages are
>>> backed by page structs.
>>> Replace remap_pfn_range with vm_insert_page, following Laura's
>>> suggestion in [1]. This allows correct PSS calculation for dmabufs.
>>>
>>> [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
>>> [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
>>> (sorry, could not find lore links for these discussions)
>>>
>>> Suggested-by: Laura Abbott <labbott@kernel.org>
>>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
>> Reviewed-by: Minchan Kim <minchan@kernel.org>
>>
>> A note: This patch makes dmabuf system heap memory accounted in PSS,
>> so if someone relies on the size, they will see the bloat.
>> IIRC, there was some debate whether PSS accounting for these
>> buffers is correct or not. If it turns out to be a problem, we need
>> to discuss how to solve it (maybe respect vma->vm_flags and
>> reintroduce remap_pfn_range for such mappings).
> I did not see debates about not including *mapped* dmabufs into PSS
> calculation. I remember people were discussing how to account dmabufs
> referred only by the FD but that is a different discussion. If the
> buffer is mapped into the address space of a process then IMHO
> including it into PSS of that process is not controversial.

Well, I think it is. And to be honest this doesn't look like a good
idea to me since it will eventually lead to double accounting of system
heap DMA-bufs.

As discussed multiple times, it is illegal to use the struct page of a
DMA-buf. This case is a bit special since it is the owner of the
pages doing that, but I'm not sure this won't cause problems
elsewhere as well.

A more appropriate solution would be to hold processes accountable for
resources they have allocated through device drivers.

Regards,
Christian.


* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  1:55 ` Matthew Wilcox
  2021-02-03  2:26   ` Suren Baghdasaryan
@ 2021-02-03  8:52   ` Daniel Vetter
  2021-02-03 20:20     ` Suren Baghdasaryan
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2021-02-03  8:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Suren Baghdasaryan, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Sandeep Patil, Christian König, Android Kernel Team,
	James Jones, Linux Kernel Mailing List, Liam Mark, Brian Starkey,
	Christoph Hellwig, Minchan Kim, Linux MM, John Stultz, dri-devel,
	Chris Goldsworthy, Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 3, 2021 at 2:57 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > WARN_ON_ONCE and return an error. This ensures users of
> > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > and get an indication of an error without panicking the kernel.
> > This will help identify drivers that need to clear VM_PFNMAP before
> > using the dmabuf system heap, which is moving to use vm_insert_page.
>
> NACK.
>
> The system may not _panic_, but it is clearly now _broken_.  The device
> doesn't work, and so the system is useless.  You haven't really improved
> anything here.  Just bloated the kernel with yet another _ONCE variable
> that in a normal system will never ever ever be triggered.

Also, what the heck are you doing with your drivers? dma-buf mmap must
call dma_buf_mmap(), even for forwarded/redirected mmaps from driver
char nodes. If that doesn't work we have some issues with the calling
contract for that function, not in vm_insert_page.
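
A minimal sketch of that calling contract (assuming the driver keeps
the dma_buf pointer in file->private_data; the names are
hypothetical):

#include <linux/dma-buf.h>
#include <linux/fs.h>

/* Forwarded mmap from a driver char node: let dma_buf_mmap() invoke
 * the exporter's mmap handler and its checks, instead of the driver
 * inserting pages or PFNs itself. */
static int foo_chardev_mmap(struct file *file, struct vm_area_struct *vma)
{
        struct dma_buf *dmabuf = file->private_data;

        return dma_buf_mmap(dmabuf, vma, vma->vm_pgoff);
}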

Finally why exactly do we need to make this switch for system heap?
I've recently looked at gup usage by random drivers, and found a lot
of worrying things there. gup on dma-buf is a really bad idea in
general.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm
  2021-02-03  8:05       ` Christian König
@ 2021-02-03 19:53         ` Suren Baghdasaryan
  0 siblings, 0 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03 19:53 UTC (permalink / raw)
  To: Christian König
  Cc: Minchan Kim, Sumit Semwal, Andrew Morton, Christoph Hellwig,
	Liam Mark, Laura Abbott, Brian Starkey, John Stultz,
	Chris Goldsworthy, Ørjan Eide, Robin Murphy, James Jones,
	Hridya Valsaraju, Sandeep Patil, linux-media, DRI mailing list,
	moderated list:DMA BUFFER SHARING FRAMEWORK, linux-mm, LKML,
	kernel-team

On Wed, Feb 3, 2021 at 12:06 AM Christian König
<christian.koenig@amd.com> wrote:
>
> On 03.02.21 at 03:02, Suren Baghdasaryan wrote:
> > On Tue, Feb 2, 2021 at 5:39 PM Minchan Kim <minchan@kernel.org> wrote:
> >> On Tue, Feb 02, 2021 at 04:31:34PM -0800, Suren Baghdasaryan wrote:
> >>> Currently the system heap maps its buffers with the VM_PFNMAP flag
> >>> using remap_pfn_range. This results in such buffers not being
> >>> accounted for in PSS calculations because the vm treats this memory
> >>> as having no page structs. Without page structs there are no counters
> >>> representing how many processes are mapping a page and therefore PSS
> >>> calculation is impossible.
> >>> Historically, the ION driver used to map its buffers as VM_PFNMAP
> >>> areas due to memory carveouts that did not have page structs [1].
> >>> That is not the case anymore and it seems there was a desire to move
> >>> away from remap_pfn_range [2].
> >>> The dmabuf system heap design inherits this ION behavior and maps its
> >>> pages using remap_pfn_range even though the allocated pages are
> >>> backed by page structs.
> >>> Replace remap_pfn_range with vm_insert_page, following Laura's
> >>> suggestion in [1]. This allows correct PSS calculation for dmabufs.
> >>>
> >>> [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
> >>> [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
> >>> (sorry, could not find lore links for these discussions)
> >>>
> >>> Suggested-by: Laura Abbott <labbott@kernel.org>
> >>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> >> Reviewed-by: Minchan Kim <minchan@kernel.org>
> >>
> >> A note: This patch makes dmabuf system heap memory accounted in PSS,
> >> so if someone relies on the size, they will see the bloat.
> >> IIRC, there was some debate whether PSS accounting for these
> >> buffers is correct or not. If it turns out to be a problem, we need
> >> to discuss how to solve it (maybe respect vma->vm_flags and
> >> reintroduce remap_pfn_range for such mappings).
> > I did not see debates about not including *mapped* dmabufs into PSS
> > calculation. I remember people were discussing how to account dmabufs
> > referred only by the FD but that is a different discussion. If the
> > buffer is mapped into the address space of a process then IMHO
> > including it into PSS of that process is not controversial.
>
> Well, I think it is. And to be honest this doesn't look like a good
> idea to me since it will eventually lead to double accounting of system
> heap DMA-bufs.

Thanks for the comment! Could you please expand on this double
accounting issue? Do you mean userspace could double account dmabufs
because it expects dmabufs not to be part of PSS, or is there some
in-kernel accounting mechanism that would be broken by this?

>
> As discussed multiple times, it is illegal to use the struct page of a
> DMA-buf. This case is a bit special since it is the owner of the
> pages doing that, but I'm not sure this won't cause problems
> elsewhere as well.

I would be happy to keep things as they are but calculating dmabuf
contribution to PSS without struct pages is extremely inefficient and
becomes a real pain when we consider the possibilities of partial
mappings, when not the entire dmabuf is being mapped.
Calculating this would require parsing /proc/pid/maps for the process,
finding dmabuf mappings and the size for each one, then parsing
/proc/pid/maps for ALL processes in the system to see if the same
dmabufs are used by other processes and only then calculating the PSS.
I hope that explains the desire to use already existing struct pages
to obtain PSS in a much more efficient way.

>
> A more appropriate solution would be to hold processes accountable for
> resources they have allocated through device drivers.

Are you suggesting some new kernel mechanism to account resources
allocated by a process via a driver? If so, any details?

>
> Regards,
> Christian.


* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03  8:52   ` [Linaro-mm-sig] " Daniel Vetter
@ 2021-02-03 20:20     ` Suren Baghdasaryan
  2021-02-03 20:29       ` Daniel Vetter
  2021-02-04  7:53       ` Christian König
  0 siblings, 2 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03 20:20 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Sandeep Patil, Christian König, Android Kernel Team,
	James Jones, Linux Kernel Mailing List, Liam Mark, Brian Starkey,
	Christoph Hellwig, Minchan Kim, Linux MM, John Stultz, dri-devel,
	Chris Goldsworthy, Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 3, 2021 at 12:52 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Wed, Feb 3, 2021 at 2:57 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > > WARN_ON_ONCE and return an error. This ensures users of
> > > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > > and get an indication of an error without panicking the kernel.
> > > This will help identify drivers that need to clear VM_PFNMAP before
> > > using the dmabuf system heap, which is moving to use vm_insert_page.
> >
> > NACK.
> >
> > The system may not _panic_, but it is clearly now _broken_.  The device
> > doesn't work, and so the system is useless.  You haven't really improved
> > anything here.  Just bloated the kernel with yet another _ONCE variable
> > that in a normal system will never ever ever be triggered.
>
> Also, what the heck are you doing with your drivers? dma-buf mmap must
> call dma_buf_mmap(), even for forwarded/redirected mmaps from driver
> char nodes. If that doesn't work we have some issues with the calling
> contract for that function, not in vm_insert_page.

The particular issue I observed (details were posted in
https://lore.kernel.org/patchwork/patch/1372409) is that DRM drivers
set VM_PFNMAP flag (via a call to drm_gem_mmap_obj) before calling
dma_buf_mmap. Some drivers clear that flag but some don't. I could not
find the answer to why VM_PFNMAP is required for dmabuf mappings and
maybe someone can explain that here?
If there is a reason to set this flag other than historical use of
carveout memory then we wanted to catch such cases and fix the drivers
that moved to using dmabuf heaps. However maybe there are other
reasons and if so I would be very grateful if someone could explain
them. That would help me to come up with a better solution.
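
If clearing the flag turns out to be the right per-driver fix, a
hedged sketch of what that would look like (illustrative only):

#include <linux/dma-buf.h>
#include <linux/mm.h>

/* Illustrative driver-side fix: drop the stale VM_PFNMAP left by
 * generic mmap setup before forwarding to an exporter that inserts
 * real, struct-page-backed pages. */
static int foo_forward_mmap(struct dma_buf *dmabuf,
                            struct vm_area_struct *vma)
{
        vma->vm_flags &= ~VM_PFNMAP;
        return dma_buf_mmap(dmabuf, vma, 0);
}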

> Finally why exactly do we need to make this switch for system heap?
> I've recently looked at gup usage by random drivers, and found a lot
> of worrying things there. gup on dma-buf is a really bad idea in
> general.

The reason for the switch is to be able to account dmabufs allocated
using dmabuf heaps to the processes that map them. The next patch in
this series https://lore.kernel.org/patchwork/patch/1374851
implementing the switch contains more details and there is an active
discussion there. Would you mind joining that discussion to keep it in
one place?
Thanks!

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03 20:20     ` Suren Baghdasaryan
@ 2021-02-03 20:29       ` Daniel Vetter
  2021-02-03 21:25         ` Daniel Vetter
  2021-02-04  7:53       ` Christian König
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2021-02-03 20:29 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Sandeep Patil, Christian König, Android Kernel Team,
	James Jones, Linux Kernel Mailing List, Liam Mark, Brian Starkey,
	Christoph Hellwig, Minchan Kim, Linux MM, John Stultz, dri-devel,
	Chris Goldsworthy, Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 3, 2021 at 9:20 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Wed, Feb 3, 2021 at 12:52 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > On Wed, Feb 3, 2021 at 2:57 AM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > > > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > > > WARN_ON_ONCE and return an error. This ensures users of
> > > > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > > > and get an indication of an error without panicking the kernel.
> > > > This will help identify drivers that need to clear VM_PFNMAP before
> > > > using the dmabuf system heap, which is moving to use vm_insert_page.
> > >
> > > NACK.
> > >
> > > The system may not _panic_, but it is clearly now _broken_.  The device
> > > doesn't work, and so the system is useless.  You haven't really improved
> > > anything here.  Just bloated the kernel with yet another _ONCE variable
> > > that in a normal system will never ever ever be triggered.
> >
> > Also, what the heck are you doing with your drivers? dma-buf mmap must
> > call dma_buf_mmap(), even for forwarded/redirected mmaps from driver
> > char nodes. If that doesn't work we have some issues with the calling
> > contract for that function, not in vm_insert_page.
>
> The particular issue I observed (details were posted in
> https://lore.kernel.org/patchwork/patch/1372409) is that DRM drivers
> set VM_PFNMAP flag (via a call to drm_gem_mmap_obj) before calling
> dma_buf_mmap. Some drivers clear that flag but some don't. I could not
> find the answer to why VM_PFNMAP is required for dmabuf mappings and
> maybe someone can explain that here?
> If there is a reason to set this flag other than historical use of
> carveout memory then we wanted to catch such cases and fix the drivers
> that moved to using dmabuf heaps. However maybe there are other
> reasons and if so I would be very grateful if someone could explain
> them. That would help me to come up with a better solution.
>
> > Finally why exactly do we need to make this switch for system heap?
> > I've recently looked at gup usage by random drivers, and found a lot
> > of worrying things there. gup on dma-buf is a really bad idea in
> > general.
>
> The reason for the switch is to be able to account dmabufs allocated
> using dmabuf heaps to the processes that map them. The next patch in
> this series https://lore.kernel.org/patchwork/patch/1374851
> implementing the switch contains more details and there is an active
> discussion there. Would you mind joining that discussion to keep it in
> one place?

How many semi-unrelated buffer accounting schemes does google come up with?

We're at three with this one.

And also we _cannot_ require that all dma-bufs are backed by struct
page, so requiring struct page to make this work is a no-go.

Second, we do not want to allow get_user_pages and friends to work on
dma-buf, it causes all kinds of pain. Yes on SoC where dma-buf are
exclusively in system memory you can maybe get away with this, but
dma-buf is supposed to work in more places than just Android SoCs.

If you want to account dma-bufs, and gpu memory in general, I'd say
the solid solution is cgroups. There's patches floating around. And
given that Google Android can't even agree internally on what exactly
you want I'd say we just need to cut over to that and make it happen.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03 20:29       ` Daniel Vetter
@ 2021-02-03 21:25         ` Daniel Vetter
  2021-02-03 21:41           ` Suren Baghdasaryan
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2021-02-03 21:25 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Sandeep Patil, Christian König, Android Kernel Team,
	James Jones, Linux Kernel Mailing List, Liam Mark, Brian Starkey,
	Christoph Hellwig, Minchan Kim, Linux MM, John Stultz, dri-devel,
	Chris Goldsworthy, Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 3, 2021 at 9:29 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Wed, Feb 3, 2021 at 9:20 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > On Wed, Feb 3, 2021 at 12:52 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > On Wed, Feb 3, 2021 at 2:57 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > >
> > > > On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > > > > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > > > > WARN_ON_ONCE and return an error. This ensures users of
> > > > > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > > > > and get an indication of an error without panicking the kernel.
> > > > > This will help identify drivers that need to clear VM_PFNMAP before
> > > > > using the dmabuf system heap, which is moving to use vm_insert_page.
> > > >
> > > > NACK.
> > > >
> > > > The system may not _panic_, but it is clearly now _broken_.  The device
> > > > doesn't work, and so the system is useless.  You haven't really improved
> > > > anything here.  Just bloated the kernel with yet another _ONCE variable
> > > > that in a normal system will never ever ever be triggered.
> > >
> > > Also, what the heck are you doing with your drivers? dma-buf mmap must
> > > call dma_buf_mmap(), even for forwarded/redirected mmaps from driver
> > > char nodes. If that doesn't work we have some issues with the calling
> > > contract for that function, not in vm_insert_page.
> >
> > The particular issue I observed (details were posted in
> > https://lore.kernel.org/patchwork/patch/1372409) is that DRM drivers
> > set VM_PFNMAP flag (via a call to drm_gem_mmap_obj) before calling
> > dma_buf_mmap. Some drivers clear that flag but some don't. I could not
> > find the answer to why VM_PFNMAP is required for dmabuf mappings and
> > maybe someone can explain that here?
> > If there is a reason to set this flag other than historical use of
> > carveout memory then we wanted to catch such cases and fix the drivers
> > that moved to using dmabuf heaps. However maybe there are other
> > reasons and if so I would be very grateful if someone could explain
> > them. That would help me to come up with a better solution.
> >
> > > Finally why exactly do we need to make this switch for system heap?
> > > I've recently looked at gup usage by random drivers, and found a lot
> > > of worrying things there. gup on dma-buf is a really bad idea in
> > > general.
> >
> > The reason for the switch is to be able to account dmabufs allocated
> > using dmabuf heaps to the processes that map them. The next patch in
> > this series https://lore.kernel.org/patchwork/patch/1374851
> > implementing the switch contains more details and there is an active
> > discussion there. Would you mind joining that discussion to keep it in
> > one place?
>
> How many semi-unrelated buffer accounting schemes does google come up with?
>
> We're at three with this one.
>
> > And also we _cannot_ require that all dma-bufs are backed by struct
> page, so requiring struct page to make this work is a no-go.
>
> > Second, we do not want to allow get_user_pages and friends to work on
> dma-buf, it causes all kinds of pain. Yes on SoC where dma-buf are
> exclusively in system memory you can maybe get away with this, but
> dma-buf is supposed to work in more places than just Android SoCs.

I just realized that vm_insert_page doesn't even work for CMA, it would
upset get_user_pages pretty badly - you're trying to pin a page in
ZONE_MOVABLE but you can't move it because it's rather special.
VM_SPECIAL is exactly meant to catch this stuff.
-Daniel

> If you want to account dma-bufs, and gpu memory in general, I'd say
> the solid solution is cgroups. There's patches floating around. And
> given that Google Android can't even agree internally on what exactly
> you want I'd say we just need to cut over to that and make it happen.
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03 21:25         ` Daniel Vetter
@ 2021-02-03 21:41           ` Suren Baghdasaryan
  2021-02-04  8:16             ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-03 21:41 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Sandeep Patil, Christian König, Android Kernel Team,
	James Jones, Linux Kernel Mailing List, Liam Mark, Brian Starkey,
	Christoph Hellwig, Minchan Kim, Linux MM, John Stultz, dri-devel,
	Chris Goldsworthy, Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 3, 2021 at 1:25 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Wed, Feb 3, 2021 at 9:29 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > On Wed, Feb 3, 2021 at 9:20 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > >
> > > On Wed, Feb 3, 2021 at 12:52 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > On Wed, Feb 3, 2021 at 2:57 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > >
> > > > > On Tue, Feb 02, 2021 at 04:31:33PM -0800, Suren Baghdasaryan wrote:
> > > > > > Replace BUG_ON(vma->vm_flags & VM_PFNMAP) in vm_insert_page with
> > > > > > WARN_ON_ONCE and returning an error. This is to ensure users of the
> > > > > > vm_insert_page that set VM_PFNMAP are notified of the wrong flag usage
> > > > > > and get an indication of an error without panicing the kernel.
> > > > > > This will help identifying drivers that need to clear VM_PFNMAP before
> > > > > > using dmabuf system heap which is moving to use vm_insert_page.
> > > > >
> > > > > NACK.
> > > > >
> > > > > The system may not _panic_, but it is clearly now _broken_.  The device
> > > > > doesn't work, and so the system is useless.  You haven't really improved
> > > > > anything here.  Just bloated the kernel with yet another _ONCE variable
> > > > > that in a normal system will never ever ever be triggered.
> > > >
> > > > Also, what the heck are you doing with your drivers? dma-buf mmap must
> > > > call dma_buf_mmap(), even for forwarded/redirected mmaps from driver
> > > > char nodes. If that doesn't work we have some issues with the calling
> > > > contract for that function, not in vm_insert_page.
> > >
> > > The particular issue I observed (details were posted in
> > > https://lore.kernel.org/patchwork/patch/1372409) is that DRM drivers
> > > set VM_PFNMAP flag (via a call to drm_gem_mmap_obj) before calling
> > > dma_buf_mmap. Some drivers clear that flag but some don't. I could not
> > > find the answer to why VM_PFNMAP is required for dmabuf mappings and
> > > maybe someone can explain that here?
> > > If there is a reason to set this flag other than historical use of
> > > carveout memory then we wanted to catch such cases and fix the drivers
> > > that moved to using dmabuf heaps. However maybe there are other
> > > reasons and if so I would be very grateful if someone could explain
> > > them. That would help me to come up with a better solution.
> > >
> > > > Finally why exactly do we need to make this switch for system heap?
> > > > I've recently looked at gup usage by random drivers, and found a lot
> > > > of worrying things there. gup on dma-buf is a really bad idea in
> > > > general.
> > >
> > > The reason for the switch is to be able to account dmabufs allocated
> > > using dmabuf heaps to the processes that map them. The next patch in
> > > this series https://lore.kernel.org/patchwork/patch/1374851
> > > implementing the switch contains more details and there is an active
> > > discussion there. Would you mind joining that discussion to keep it in
> > > one place?
> >
> > How many semi-unrelated buffer accounting schemes does google come up with?
> >
> > We're at three with this one.
> >
> > > And also we _cannot_ require that all dma-bufs are backed by struct
> > page, so requiring struct page to make this work is a no-go.
> >
> > > Second, we do not want to allow get_user_pages and friends to work on
> > dma-buf, it causes all kinds of pain. Yes on SoC where dma-buf are
> > exclusively in system memory you can maybe get away with this, but
> > dma-buf is supposed to work in more places than just Android SoCs.
>
> I just realized that vm_insert_page doesn't even work for CMA, it would
> upset get_user_pages pretty badly - you're trying to pin a page in
> ZONE_MOVABLE but you can't move it because it's rather special.
> VM_SPECIAL is exactly meant to catch this stuff.

Thanks for the input, Daniel! Let me think about the cases you pointed out.

IMHO, the issue with PSS is the difficulty of calculating this metric
without struct page usage. I don't think that problem becomes easier
if we use cgroups or any other API. I wanted to enable existing PSS
calculation mechanisms for the dmabufs known to be backed by struct
pages (since we know how the heap allocated that memory), but sounds
like this would lead to problems that I did not consider.
Thanks,
Suren.

> -Daniel
>
> > If you want to account dma-bufs, and gpu memory in general, I'd say
> > the solid solution is cgroups. There's patches floating around. And
> > given that Google Android can't even agree internally on what exactly
> > you want I'd say we just need to cut over to that and make it happen.
> >
> > Cheers, Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03 20:20     ` Suren Baghdasaryan
  2021-02-03 20:29       ` Daniel Vetter
@ 2021-02-04  7:53       ` Christian König
  1 sibling, 0 replies; 23+ messages in thread
From: Christian König @ 2021-02-04  7:53 UTC (permalink / raw)
  To: Suren Baghdasaryan, Daniel Vetter
  Cc: Christoph Hellwig, Sandeep Patil, dri-devel, Linux MM,
	Robin Murphy, James Jones, Linux Kernel Mailing List,
	Matthew Wilcox, Brian Starkey,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Minchan Kim,
	John Stultz, Liam Mark, Chris Goldsworthy, Hridya Valsaraju,
	Andrew Morton, Android Kernel Team, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK

On 03.02.21 at 21:20, Suren Baghdasaryan wrote:
> [SNIP]
> If there is a reason to set this flag other than historical use of
> carveout memory then we wanted to catch such cases and fix the drivers
> that moved to using dmabuf heaps. However maybe there are other
> reasons and if so I would be very grateful if someone could explain
> them. That would help me to come up with a better solution.

Well, one major reason for this is to prevent accounting of DMA-buf pages.

So you are going in circles here and trying to circumvent an intentional
behavior.

Daniel is right that this is the completely wrong approach and we need
to take a step back and think about it on a higher level.

Going to reply to his mail as well.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-03 21:41           ` Suren Baghdasaryan
@ 2021-02-04  8:16             ` Christian König
  2021-02-04 15:22               ` Daniel Vetter
  2021-02-04 15:54               ` Alex Deucher
  0 siblings, 2 replies; 23+ messages in thread
From: Christian König @ 2021-02-04  8:16 UTC (permalink / raw)
  To: Suren Baghdasaryan, Daniel Vetter
  Cc: Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Sandeep Patil, Android Kernel Team, James Jones,
	Linux Kernel Mailing List, Liam Mark, Brian Starkey,
	Christoph Hellwig, Minchan Kim, Linux MM, John Stultz, dri-devel,
	Chris Goldsworthy, Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On 03.02.21 at 22:41, Suren Baghdasaryan wrote:
> [SNIP]
>>> How many semi-unrelated buffer accounting schemes does Google come up with?
>>>
>>> We're at three with this one.
>>>
>>> And also we _cannot_ require that all dma-bufs are backed by struct
>>> page, so requiring struct page to make this work is a no-go.
>>>
>>> Second, we do not want to allow get_user_pages and friends to work on
>>> dma-buf, it causes all kinds of pain. Yes, on SoCs where dma-bufs are
>>> exclusively in system memory you can maybe get away with this, but
>>> dma-buf is supposed to work in more places than just Android SoCs.
>> I just realized that vm_insert_page doesn't even work for CMA, it would
>> upset get_user_pages pretty badly - you're trying to pin a page in
>> ZONE_MOVABLE but you can't move it because it's rather special.
>> VM_SPECIAL is exactly meant to catch this stuff.
> Thanks for the input, Daniel! Let me think about the cases you pointed out.
>
> IMHO, the issue with PSS is the difficulty of calculating this metric
> without struct page usage. I don't think that problem becomes easier
> if we use cgroups or any other API. I wanted to enable existing PSS
> calculation mechanisms for the dmabufs known to be backed by struct
> pages (since we know how the heap allocated that memory), but it sounds
> like this would lead to problems that I did not consider.

Yeah, using struct page indeed won't work. We discussed that multiple 
times now and Daniel even has a patch to mangle the struct page pointers 
inside the sg_table object to prevent abuse in that direction.
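
The shape of that mangling is roughly the following (an illustrative
sketch; the names and details here are made up, not taken from Daniel's
actual patch):

static unsigned long sg_page_cookie;	/* random at init; low two bits kept
					 * clear so SG_CHAIN/SG_END survive */

static void mangle_sg_table(struct sg_table *sgt)
{
	struct scatterlist *sg;
	int i;

	/* XOR every page pointer in the exported table so an importer that
	 * dereferences sg_page() instead of using the DMA addresses crashes
	 * loudly during testing. */
	for_each_sgtable_sg(sgt, sg, i)
		sg_set_page(sg, (struct page *)((unsigned long)sg_page(sg) ^
						sg_page_cookie),
			    sg->length, sg->offset);
}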

On the other hand I totally agree that we need to do something on this 
side which goes beyond what cgroups provide.

A few years ago I came up with patches to improve the OOM killer to 
include resources bound to the processes through file descriptors. I 
unfortunately can't find them off hand any more and I'm currently too busy
to dig them up.
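
The rough shape of such a walk, for the record (an illustrative sketch,
not those patches: is_dma_buf_file() is private to drivers/dma-buf
today, so treat it as an assumed helper, and the weighting is naive):

static int add_dmabuf_size(const void *arg, struct file *file, unsigned int fd)
{
	unsigned long *total = (unsigned long *)arg;

	if (is_dma_buf_file(file))	/* assumed export, see above */
		*total += ((struct dma_buf *)file->private_data)->size;
	return 0;			/* keep iterating */
}

/* Dma-buf footprint of a task, in pages, to fold into its badness score. */
static unsigned long dmabuf_badness(struct task_struct *task)
{
	unsigned long total = 0;

	task_lock(task);
	if (task->files)
		iterate_fd(task->files, 0, add_dmabuf_size, &total);
	task_unlock(task);

	return total >> PAGE_SHIFT;
}

Note that a naive walk like this charges a shared buffer to every
process holding it open, which is one of the things real patches in
this area have to solve.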

In general I think we need to make it possible for both the in-kernel
OOM killer and userspace processes and handlers to have access to that
kind of data.

The fdinfo approach as suggested in the other thread sounds like the 
easiest solution to me.
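
Concretely, that would be a show_fdinfo hook on the dma-buf file, so
both the kernel and userspace can attribute buffers through
/proc/<pid>/fdinfo without touching struct pages. dma-buf already wires
up something along these lines; the fields below are illustrative:

static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
{
	struct dma_buf *dmabuf = file->private_data;

	seq_printf(m, "size:\t%zu\n", dmabuf->size);
	seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
}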

Regards,
Christian.

> Thanks,
> Suren.
>
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-04  8:16             ` Christian König
@ 2021-02-04 15:22               ` Daniel Vetter
  2021-02-04 15:54               ` Alex Deucher
  1 sibling, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2021-02-04 15:22 UTC (permalink / raw)
  To: Christian König
  Cc: Suren Baghdasaryan, Daniel Vetter, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Sandeep Patil,
	Android Kernel Team, James Jones, Linux Kernel Mailing List,
	Liam Mark, Brian Starkey, Christoph Hellwig, Minchan Kim,
	Linux MM, John Stultz, dri-devel, Chris Goldsworthy,
	Hridya Valsaraju, Andrew Morton, Robin Murphy,
	open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 04, 2021 at 09:16:32AM +0100, Christian König wrote:
> On 03.02.21 at 22:41, Suren Baghdasaryan wrote:
> > [SNIP]
> > > > How many semi-unrelated buffer accounting schemes does Google come up with?
> > > >
> > > > We're at three with this one.
> > > >
> > > > And also we _cannot_ require that all dma-bufs are backed by struct
> > > > page, so requiring struct page to make this work is a no-go.
> > > >
> > > > Second, we do not want to allow get_user_pages and friends to work on
> > > > dma-buf, it causes all kinds of pain. Yes, on SoCs where dma-bufs are
> > > > exclusively in system memory you can maybe get away with this, but
> > > > dma-buf is supposed to work in more places than just Android SoCs.
> > > I just realized that vm_insert_page doesn't even work for CMA, it would
> > > upset get_user_pages pretty badly - you're trying to pin a page in
> > > ZONE_MOVABLE but you can't move it because it's rather special.
> > > VM_SPECIAL is exactly meant to catch this stuff.
> > Thanks for the input, Daniel! Let me think about the cases you pointed out.
> > 
> > IMHO, the issue with PSS is the difficulty of calculating this metric
> > without struct page usage. I don't think that problem becomes easier
> > if we use cgroups or any other API. I wanted to enable existing PSS
> > calculation mechanisms for the dmabufs known to be backed by struct
> > pages (since we know how the heap allocated that memory), but it sounds
> > like this would lead to problems that I did not consider.
> 
> Yeah, using struct page indeed won't work. We discussed that multiple times
> now and Daniel even has a patch to mangle the struct page pointers inside
> the sg_table object to prevent abuse in that direction.
> 
> On the other hand I totally agree that we need to do something on this side
> which goes beyond what cgroups provide.
> 
> A few years ago I came up with patches to improve the OOM killer to include
> resources bound to the processes through file descriptors. I unfortunately
> can't find them off hand any more and I'm currently too busy to dig them up.
> 
> In general I think we need to make it possible for both the in-kernel OOM
> killer and userspace processes and handlers to have access to that kind
> of data.
> 
> The fdinfo approach as suggested in the other thread sounds like the easiest
> solution to me.

Yeah, for OOM handling cgroups alone aren't enough as the interface - we
need to make sure that the OOM killer takes system memory usage into
account (ideally zone-aware, for CMA pools).

But to track that we still need that infrastructure first, I think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-04  8:16             ` Christian König
  2021-02-04 15:22               ` Daniel Vetter
@ 2021-02-04 15:54               ` Alex Deucher
  2021-02-05  3:39                 ` Suren Baghdasaryan
  1 sibling, 1 reply; 23+ messages in thread
From: Alex Deucher @ 2021-02-04 15:54 UTC (permalink / raw)
  To: Christian König
  Cc: Suren Baghdasaryan, Daniel Vetter, Christoph Hellwig,
	Sandeep Patil, dri-devel, Linux MM, Robin Murphy, James Jones,
	Linux Kernel Mailing List, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Minchan Kim,
	Liam Mark, Chris Goldsworthy, Hridya Valsaraju, Andrew Morton,
	Android Kernel Team, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 4, 2021 at 3:16 AM Christian König <christian.koenig@amd.com> wrote:
>
> On 03.02.21 at 22:41, Suren Baghdasaryan wrote:
> > [SNIP]
> >>> How many semi-unrelated buffer accounting schemes does Google come up with?
> >>>
> >>> We're at three with this one.
> >>>
> >>> And also we _cannot_ require that all dma-bufs are backed by struct
> >>> page, so requiring struct page to make this work is a no-go.
> >>>
> >>> Second, we do not want to allow get_user_pages and friends to work on
> >>> dma-buf, it causes all kinds of pain. Yes, on SoCs where dma-bufs are
> >>> exclusively in system memory you can maybe get away with this, but
> >>> dma-buf is supposed to work in more places than just Android SoCs.
> >> I just realized that vm_insert_page doesn't even work for CMA, it would
> >> upset get_user_pages pretty badly - you're trying to pin a page in
> >> ZONE_MOVABLE but you can't move it because it's rather special.
> >> VM_SPECIAL is exactly meant to catch this stuff.
> > Thanks for the input, Daniel! Let me think about the cases you pointed out.
> >
> > IMHO, the issue with PSS is the difficulty of calculating this metric
> > without struct page usage. I don't think that problem becomes easier
> > if we use cgroups or any other API. I wanted to enable existing PSS
> > calculation mechanisms for the dmabufs known to be backed by struct
> > pages (since we know how the heap allocated that memory), but it sounds
> > like this would lead to problems that I did not consider.
>
> Yeah, using struct page indeed won't work. We discussed that multiple
> times now and Daniel even has a patch to mangle the struct page pointers
> inside the sg_table object to prevent abuse in that direction.
>
> On the other hand I totally agree that we need to do something on this
> > side which goes beyond what cgroups provide.
>
> A few years ago I came up with patches to improve the OOM killer to
> include resources bound to the processes through file descriptors. I
> > unfortunately can't find them off hand any more and I'm currently too busy
> to dig them up.

https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
I think there was a more recent discussion, but I can't seem to find it.

Alex

>
> > In general I think we need to make it possible for both the in-kernel
> > OOM killer and userspace processes and handlers to have access to
> > that kind of data.
>
> The fdinfo approach as suggested in the other thread sounds like the
> easiest solution to me.
>
> Regards,
> Christian.
>
> > Thanks,
> > Suren.
> >
> >
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error
  2021-02-04 15:54               ` Alex Deucher
@ 2021-02-05  3:39                 ` Suren Baghdasaryan
  0 siblings, 0 replies; 23+ messages in thread
From: Suren Baghdasaryan @ 2021-02-05  3:39 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Christian König, Daniel Vetter, Christoph Hellwig,
	Sandeep Patil, dri-devel, Linux MM, Robin Murphy, James Jones,
	Linux Kernel Mailing List, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Minchan Kim,
	Liam Mark, Chris Goldsworthy, Hridya Valsaraju, Andrew Morton,
	Android Kernel Team, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 4, 2021 at 7:55 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Thu, Feb 4, 2021 at 3:16 AM Christian König <christian.koenig@amd.com> wrote:
> >
> > On 03.02.21 at 22:41, Suren Baghdasaryan wrote:
> > > [SNIP]
> > >>> How many semi-unrelated buffer accounting schemes does Google come up with?
> > >>>
> > >>> We're at three with this one.
> > >>>
> > >>> And also we _cannot_ require that all dma-bufs are backed by struct
> > >>> page, so requiring struct page to make this work is a no-go.
> > >>>
> > >>> Second, we do not want to allow get_user_pages and friends to work on
> > >>> dma-buf, it causes all kinds of pain. Yes, on SoCs where dma-bufs are
> > >>> exclusively in system memory you can maybe get away with this, but
> > >>> dma-buf is supposed to work in more places than just Android SoCs.
> > >> I just realized that vm_insert_page doesn't even work for CMA, it would
> > >> upset get_user_pages pretty badly - you're trying to pin a page in
> > >> ZONE_MOVABLE but you can't move it because it's rather special.
> > >> VM_SPECIAL is exactly meant to catch this stuff.
> > > Thanks for the input, Daniel! Let me think about the cases you pointed out.
> > >
> > > IMHO, the issue with PSS is the difficulty of calculating this metric
> > > without struct page usage. I don't think that problem becomes easier
> > > if we use cgroups or any other API. I wanted to enable existing PSS
> > > calculation mechanisms for the dmabufs known to be backed by struct
> > > pages (since we know how the heap allocated that memory), but it sounds
> > > like this would lead to problems that I did not consider.
> >
> > Yeah, using struct page indeed won't work. We discussed that multiple
> > times now and Daniel even has a patch to mangle the struct page pointers
> > inside the sg_table object to prevent abuse in that direction.
> >
> > On the other hand I totally agree that we need to do something on this
> > side which goes beyond what cgroups provide.
> >
> > A few years ago I came up with patches to improve the OOM killer to
> > include resources bound to the processes through file descriptors. I
> > unfortunately can't find them off hand any more and I'm currently too busy
> > to dig them up.
>
> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> I think there was a more recent discussion, but I can't seem to find it.

Thanks for the pointer!
Appreciate the time everyone took to explain the issues.
Thanks,
Suren.

>
> Alex
>
> >
> > In general I think we need to make it possible for both the in-kernel
> > OOM killer and userspace processes and handlers to have access to
> > that kind of data.
> >
> > The fdinfo approach as suggested in the other thread sounds like the
> > easiest solution to me.
> >
> > Regards,
> > Christian.
> >
> > > Thanks,
> > > Suren.
> > >
> > >
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-02-05  3:40 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-03  0:31 [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Suren Baghdasaryan
2021-02-03  0:31 ` [PATCH v2 2/2] dma-buf: heaps: Map system heap pages as managed by linux vm Suren Baghdasaryan
2021-02-03  1:39   ` Minchan Kim
2021-02-03  2:02     ` Suren Baghdasaryan
2021-02-03  8:05       ` Christian König
2021-02-03 19:53         ` Suren Baghdasaryan
2021-02-03  2:07   ` John Stultz
2021-02-03  2:13     ` Suren Baghdasaryan
2021-02-03  1:27 ` [PATCH 1/2] mm: replace BUG_ON in vm_insert_page with a return of an error Miaohe Lin
2021-02-03  1:31 ` Minchan Kim
2021-02-03  1:55   ` Suren Baghdasaryan
2021-02-03  1:55 ` Matthew Wilcox
2021-02-03  2:26   ` Suren Baghdasaryan
2021-02-03  8:52   ` [Linaro-mm-sig] " Daniel Vetter
2021-02-03 20:20     ` Suren Baghdasaryan
2021-02-03 20:29       ` Daniel Vetter
2021-02-03 21:25         ` Daniel Vetter
2021-02-03 21:41           ` Suren Baghdasaryan
2021-02-04  8:16             ` Christian König
2021-02-04 15:22               ` Daniel Vetter
2021-02-04 15:54               ` Alex Deucher
2021-02-05  3:39                 ` Suren Baghdasaryan
2021-02-04  7:53       ` Christian König
