* CMA enhancement - non-default areas in x86
@ 2020-05-13  6:13 Idgar, Or
  2020-05-13  6:47 ` gregkh
  0 siblings, 1 reply; 8+ messages in thread
From: Idgar, Or @ 2020-05-13  6:13 UTC (permalink / raw)
  To: linux-kernel, linux-mm, gregkh; +Cc: Ravich, Leonid

Hi,
I'm working with the Linux kernel on x86 and need a way to allocate a very large contiguous memory region (around 20GB) for DMA operations.
I've found that CMA is one of the main ways to do so, but our problem is that CMA's default behavior is to create one default area from which all devices can allocate memory.
During boot, some drivers allocated memory for DMA and used CMA memory when it existed. This takes memory that we need for our device, and we want to make sure this area is dedicated to our device.

As far as I can see, the only way to reserve a dedicated area is by enabling OF_RESERVED_MEM, which is available for several architectures but not x86 (and, as far as I understand, relies on device tree, which is not in use on x86, or at least cannot be configured with OF_RESERVED_MEM).

I really want to leverage this mechanism/API and thought about modifying the code (and hopefully merging it upstream) so that multiple non-default areas are available on x86, with a way to consume them by mapping a specific area to a specific device.

Is this something that would be open for merging if written properly?
I'd be glad to get some input and suggestions from you.
Thanks in advance,
Or Idgar
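
For illustration, the per-device selection proposed above can be modeled in userspace C. The kernel's existing lookup, dev_get_cma_area(), returns a device's own area when one is set and otherwise falls back to the global default area; the sketch below mimics only that decision. All toy_* names are hypothetical, nothing here is kernel API, and page accounting is reduced to a simple counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a CMA area: it only counts pages handed out. */
struct toy_cma {
    const char *name;
    size_t total_pages;
    size_t used_pages;
};

/* Toy stand-in for a device; area == NULL means "no dedicated area". */
struct toy_device {
    const char *name;
    struct toy_cma *area;
};

/* Use the device's dedicated area if it has one, else the shared default. */
struct toy_cma *toy_dev_get_area(const struct toy_device *dev,
                                 struct toy_cma *default_area)
{
    if (dev && dev->area)
        return dev->area;
    return default_area;
}

/* Returns 0 on success, -1 if the selected area has too few free pages. */
int toy_alloc(const struct toy_device *dev, struct toy_cma *default_area,
              size_t pages)
{
    struct toy_cma *area = toy_dev_get_area(dev, default_area);

    assert(area != NULL);
    if (pages > area->total_pages - area->used_pages)
        return -1;
    area->used_pages += pages;
    return 0;
}
```

In this model, a device with a dedicated area never competes with other drivers: allocations by devices without one are charged to the default area only, which is the isolation the request is after.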


* Re: CMA enhancement - non-default areas in x86
  2020-05-13  6:13 CMA enhancement - non-default areas in x86 Idgar, Or
@ 2020-05-13  6:47 ` gregkh
  2020-05-13  7:00   ` Idgar, Or
  0 siblings, 1 reply; 8+ messages in thread
From: gregkh @ 2020-05-13  6:47 UTC (permalink / raw)
  To: Idgar, Or; +Cc: linux-kernel, linux-mm, Ravich, Leonid

On Wed, May 13, 2020 at 06:13:55AM +0000, Idgar, Or wrote:
> Hi,
> I'm working with the Linux kernel on x86 and need a way to allocate a very large contiguous memory region (around 20GB) for DMA operations.

For what type of device?

> I've found that CMA is one of the main ways to do so, but our problem is that CMA's default behavior is to create one default area from which all devices can allocate memory.
> During boot, some drivers allocated memory for DMA and used CMA memory when it existed. This takes memory that we need for our device, and we want to make sure this area is dedicated to our device.
> 
> As far as I can see, the only way to reserve a dedicated area is by enabling OF_RESERVED_MEM, which is available for several architectures but not x86 (and, as far as I understand, relies on device tree, which is not in use on x86, or at least cannot be configured with OF_RESERVED_MEM).
> 
> I really want to leverage this mechanism/API and thought about modifying the code (and hopefully merging it upstream) so that multiple non-default areas are available on x86, with a way to consume them by mapping a specific area to a specific device.
> 
> Is this something that would be open for merging if written properly?

We always will be glad to review patches, no need to ask us about that.
Just post them!

good luck,

greg k-h


* RE: CMA enhancement - non-default areas in x86
  2020-05-13  6:47 ` gregkh
@ 2020-05-13  7:00   ` Idgar, Or
  2020-05-13  7:14     ` gregkh
  0 siblings, 1 reply; 8+ messages in thread
From: Idgar, Or @ 2020-05-13  7:00 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, linux-mm, Ravich, Leonid

> For what type of device?
NTB (Non-Transparent Bridge).

-----Original Message-----
From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org> 
Sent: Wednesday, May 13, 2020 09:48
To: Idgar, Or
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; Ravich, Leonid
Subject: Re: CMA enhancement - non-default areas in x86


On Wed, May 13, 2020 at 06:13:55AM +0000, Idgar, Or wrote:
> Hi,
> I'm working with the Linux kernel on x86 and need a way to allocate a very large contiguous memory region (around 20GB) for DMA operations.

For what type of device?

> I've found that CMA is one of the main ways to do so, but our problem is that CMA's default behavior is to create one default area from which all devices can allocate memory.
> During boot, some drivers allocated memory for DMA and used CMA memory when it existed. This takes memory that we need for our device, and we want to make sure this area is dedicated to our device.
> 
> As far as I can see, the only way to reserve a dedicated area is by enabling OF_RESERVED_MEM, which is available for several architectures but not x86 (and, as far as I understand, relies on device tree, which is not in use on x86, or at least cannot be configured with OF_RESERVED_MEM).
> 
> I really want to leverage this mechanism/API and thought about modifying the code (and hopefully merging it upstream) so that multiple non-default areas are available on x86, with a way to consume them by mapping a specific area to a specific device.
> 
> Is this something that would be open for merging if written properly?

We always will be glad to review patches, no need to ask us about that.
Just post them!

good luck,

greg k-h


* Re: CMA enhancement - non-default areas in x86
  2020-05-13  7:00   ` Idgar, Or
@ 2020-05-13  7:14     ` gregkh
  2020-05-13  8:29       ` Ravich, Leonid
  0 siblings, 1 reply; 8+ messages in thread
From: gregkh @ 2020-05-13  7:14 UTC (permalink / raw)
  To: Idgar, Or; +Cc: linux-kernel, linux-mm, Ravich, Leonid

On Wed, May 13, 2020 at 07:00:12AM +0000, Idgar, Or wrote:
> > For what type of device?
> NTB (Non-Transparent Bridge).


Very odd quoting style...

Anyway, what exactly is a non-transparent bridge, and why doesn't your
bios/uefi implementation properly reserve the memory for it so that the
OS does not use it?

thanks,

greg k-h


* RE: CMA enhancement - non-default areas in x86
  2020-05-13  7:14     ` gregkh
@ 2020-05-13  8:29       ` Ravich, Leonid
  2020-05-13  8:33         ` gregkh
  0 siblings, 1 reply; 8+ messages in thread
From: Ravich, Leonid @ 2020-05-13  8:29 UTC (permalink / raw)
  To: gregkh, Idgar, Or; +Cc: linux-kernel, linux-mm

PCIe NTB 
Documentation/driver-api/ntb.rst

1) Basically a PCI bridge between two root complexes / PCI switches.
2) Using memory outside the OS is one solution, but then this memory is
of limited use to other stacks; e.g. get_user_pages() on this memory will fail,
so attempting to use it for the block layer (with O_DIRECT) will fail.

Actually, any generic stack which attempts to "pin" this memory will fail.

Leonid Ravich 
> -----Original Message-----
> From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org>
> Sent: Wednesday, May 13, 2020 10:14 AM
> To: Idgar, Or
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; Ravich, Leonid
> Subject: Re: CMA enhancement - non-default areas in x86
> 
> On Wed, May 13, 2020 at 07:00:12AM +0000, Idgar, Or wrote:
> > > For what type of device?
> > NTB (Non-Transparent Bridge).
> 
> 
> Very odd quoting style...
> 
> Anyway, what exactly is a non-transparent bridge, and why doesn't your
> bios/uefi implementation properly reserve the memory for it so that the OS
> does not use it?
> 
> thanks,
> 
> greg k-h


* Re: CMA enhancement - non-default areas in x86
  2020-05-13  8:29       ` Ravich, Leonid
@ 2020-05-13  8:33         ` gregkh
  2020-05-13  9:43           ` Ravich, Leonid
  0 siblings, 1 reply; 8+ messages in thread
From: gregkh @ 2020-05-13  8:33 UTC (permalink / raw)
  To: Ravich, Leonid; +Cc: Idgar, Or, linux-kernel, linux-mm

A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, May 13, 2020 at 08:29:16AM +0000, Ravich, Leonid wrote:
> PCIe NTB 
> Documentation/driver-api/ntb.rst

> 1) Basically a PCI bridge between two root complexes / PCI switches.
> 2) Using memory outside the OS is one solution, but then this memory is
> of limited use to other stacks; e.g. get_user_pages() on this memory will fail,
> so attempting to use it for the block layer (with O_DIRECT) will fail.
>
> Actually, any generic stack which attempts to "pin" this memory will fail.

So why isn't the BIOS/UEFI properly reserving this from the general
operating system's pages so that the driver knows to use them instead?

Is UEFI wrong here about these being valid memory ranges for general
use?  If so, why not fix that?  If not, how in the world is the OS
supposed to know these memory ranges are _not_ for general use?  I feel
like there is something missing here...

thanks,

greg k-h


* RE: CMA enhancement - non-default areas in x86
  2020-05-13  8:33         ` gregkh
@ 2020-05-13  9:43           ` Ravich, Leonid
  2020-05-13 10:04             ` gregkh
  0 siblings, 1 reply; 8+ messages in thread
From: Ravich, Leonid @ 2020-05-13  9:43 UTC (permalink / raw)
  To: gregkh; +Cc: Idgar, Or, linux-kernel, linux-mm

> A: http://en.wikipedia.org/wiki/Top_post
> Q: Where do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top

Sorry, bad habit.
 
> On Wed, May 13, 2020 at 08:29:16AM +0000, Ravich, Leonid wrote:
> > PCIe NTB
> > Documentation/driver-api/ntb.rst
> 
> > 1) Basically a PCI bridge between two root complexes / PCI switches.
> > 2) Using memory outside the OS is one solution, but then this memory is
> > of limited use to other stacks; e.g. get_user_pages() on this memory will fail,
> > so attempting to use it for the block layer (with O_DIRECT) will fail.
> >
> > Actually, any generic stack which attempts to "pin" this memory will fail.
> 
> So why isn't the BIOS/UEFI properly reserving this from the general operating
> system's pages so that the driver knows to use them instead?
> 
> Is UEFI wrong here about these being valid memory ranges for general use?
> If so, why not fix that?  If not, how in the world is the OS supposed to know
> these memory ranges are _not_ for general use?  I feel like there is
> something missing here...
>
Maybe I am misunderstanding something here, but if BIOS/UEFI reserves these pages,
they will be "out of kernel", which will work for a proprietary driver, but this memory will not
be usable by a generic driver which attempts to pin it with get_user_pages().
So we could go and try to fix that (not sure this is the right way).

Another option is to use some kernel infrastructure which on the one hand reserves the memory from general use,
while on the other hand the kernel is aware of these pages, so get_user_pages() will work on this memory.

From what we saw, the CMA infrastructure can support such requirements.
Please let me know if you think I am missing something here.

Thanks, and sorry for the format mess.
> thanks,
> 
> greg k-h


* Re: CMA enhancement - non-default areas in x86
  2020-05-13  9:43           ` Ravich, Leonid
@ 2020-05-13 10:04             ` gregkh
  0 siblings, 0 replies; 8+ messages in thread
From: gregkh @ 2020-05-13 10:04 UTC (permalink / raw)
  To: Ravich, Leonid; +Cc: Idgar, Or, linux-kernel, linux-mm

On Wed, May 13, 2020 at 09:43:45AM +0000, Ravich, Leonid wrote:
> > On Wed, May 13, 2020 at 08:29:16AM +0000, Ravich, Leonid wrote:
> > > PCIe NTB
> > > Documentation/driver-api/ntb.rst
> > 
> > > 1) Basically a PCI bridge between two root complexes / PCI switches.
> > > 2) Using memory outside the OS is one solution, but then this memory is
> > > of limited use to other stacks; e.g. get_user_pages() on this memory will fail,
> > > so attempting to use it for the block layer (with O_DIRECT) will fail.
> > >
> > > Actually, any generic stack which attempts to "pin" this memory will fail.
> > 
> > So why isn't the BIOS/UEFI properly reserving this from the general operating
> > system's pages so that the driver knows to use them instead?
> > 
> > Is UEFI wrong here about these being valid memory ranges for general use?
> > If so, why not fix that?  If not, how in the world is the OS supposed to know
> > these memory ranges are _not_ for general use?  I feel like there is
> > something missing here...
> >
> Maybe I am misunderstanding something here, but if BIOS/UEFI reserves these pages,
> they will be "out of kernel", which will work for a proprietary driver, but this memory will not
> be usable by a generic driver which attempts to pin it with get_user_pages().
> So we could go and try to fix that (not sure this is the right way).

What do you mean by "proprietary" driver vs. "generic" driver?

Shouldn't there be some "generic" way that UEFI tells any driver where
these memory locations are that can not be used as general memory?  If
not, try fixing up UEFI for that.

> Another option is to use some kernel infrastructure which on the one hand reserves the memory from general use,
> while on the other hand the kernel is aware of these pages, so get_user_pages() will work on this memory.
> 
> From what we saw, the CMA infrastructure can support such requirements.

CMA needs to be told where to reserve the memory at boot time.  If you
want to use that, great, but something has to tell it, so perhaps just
get that info from UEFI as that is the "equivalent" to a device tree,
right?

Try it all out and see, all of this is pointless without real patches,
which is why we almost never have these kinds of discussions without
working code.

thanks,

greg k-h
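
As a rough illustration of "telling CMA at boot time": the default area can already be sized with the cma= kernel parameter, documented as cma=nn[MG]@[start[MG][-end[MG]]]. The userspace sketch below parses only the size portion of such a string. toy_parse_size() is a hypothetical name, it is a simplification of what the kernel's own parser accepts (decimal numbers with K/M/G suffixes only), and it ignores overflow.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse "<decimal>[K|M|G]" into a byte count.
 * Returns 0 for NULL, empty, or malformed input. Overflow is not checked.
 */
unsigned long long toy_parse_size(const char *s)
{
    char *end;
    unsigned long long val;

    if (!s || *s < '0' || *s > '9')
        return 0;

    val = strtoull(s, &end, 10);

    /* Each suffix is 1024 times the previous one, so the cases cascade. */
    switch (*end) {
    case 'G':
    case 'g':
        val <<= 10;
        /* fall through */
    case 'M':
    case 'm':
        val <<= 10;
        /* fall through */
    case 'K':
    case 'k':
        val <<= 10;
        end++;
        break;
    default:
        break;
    }

    /* Anything left after the optional suffix makes the input invalid. */
    return *end == '\0' ? val : 0;
}
```

For the size discussed in this thread, toy_parse_size("20G") yields 20 * 2^30 bytes, which is 5242880 pages of 4096 bytes.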

