* Creating kernel mappings for memory initially marked with bootmem NOMAP?
@ 2017-03-08 19:03 Florian Fainelli
  2017-03-08 19:14 ` Ard Biesheuvel
  2017-03-08 19:26 ` Russell King - ARM Linux
  0 siblings, 2 replies; 9+ messages in thread
From: Florian Fainelli @ 2017-03-08 19:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On our platforms (brcmstb) we have a use case where we boot with some
(a lot actually) memory carved out and marked initially with bootmem
NOMAP in order for this memory not to be mapped in the kernel's linear
mapping.

Now, we have some peripherals that want large chunks of physically and
virtually contiguous memory that belong to these memblock NOMAP ranges.
I have no problems using mmap() against this memory, because the kernel
will do what is necessary for a process to map it for me. The struggle
is for a kernel driver which specifies a range of physical memory and
size, and expects a virtually contiguous mapping in return (not using
DMA-API, because reasons).

Essentially the problem is that there are no PTEs created for these
memory regions (and pfn_valid() returns 0, since this is NOMAP memory),
so I have been playing with __add_pages() from the memory hotplug code
in an attempt to get proper page references to this memory, but I am
clearly missing something.

Yes I know it's a terrible idea, but what if I wanted to get that working?

Thanks in advance!
-- 
Florian


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 19:03 Creating kernel mappings for memory initially marked with bootmem NOMAP? Florian Fainelli
@ 2017-03-08 19:14 ` Ard Biesheuvel
  2017-03-08 19:52   ` Florian Fainelli
  2017-03-08 19:26 ` Russell King - ARM Linux
  1 sibling, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2017-03-08 19:14 UTC (permalink / raw)
  To: linux-arm-kernel


> On 8 Mar 2017, at 20:03, Florian Fainelli <f.fainelli@gmail.com> wrote:
> 
> Hi,
> 
> On our platforms (brcmstb) we have a use case where we boot with some
> (a lot actually) memory carved out and marked initially with bootmem
> NOMAP in order for this memory not to be mapped in the kernel's linear
> mapping.
> 
> Now, we have some peripherals that want large chunks of physically and
> virtually contiguous memory that belong to these memblock NOMAP ranges.
> I have no problems using mmap() against this memory, because the kernel
> will do what is necessary for a process to map it for me. The struggle
> is for a kernel driver which specifies a range of physical memory and
> size, and expects a virtually contiguous mapping in return (not using
> DMA-API, because reasons).
> 
> Essentially the problem is that there are no PTEs created for these
> memory regions (and pfn_valid() returns 0, since this is NOMAP memory),
> so I have been playing with __add_pages() from the memory hotplug code
> in an attempt to get proper page references to this memory, but I am
> clearly missing something.
> 
> Yes I know it's a terrible idea, but what if I wanted to get that working?
> 

Did you try memremap?


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 19:03 Creating kernel mappings for memory initially marked with bootmem NOMAP? Florian Fainelli
  2017-03-08 19:14 ` Ard Biesheuvel
@ 2017-03-08 19:26 ` Russell King - ARM Linux
  2017-03-08 19:29   ` Russell King - ARM Linux
  1 sibling, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2017-03-08 19:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 08, 2017 at 11:03:43AM -0800, Florian Fainelli wrote:
> Now, we have some peripherals that want large chunks of physically and
> virtually contiguous memory that belong to these memblock NOMAP ranges.
> I have no problems using mmap() against this memory, because the kernel
> will do what is necessary for a process to map it for me. The struggle
> is for a kernel driver which specifies a range of physical memory and
> size, and expects a virtually contiguous mapping in return (not using
> DMA-API, because reasons).

Will vm_iomap_memory() do the job?
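
i.e. something like this in the driver's mmap() handler (sketch only;
carveout_base/carveout_size are placeholders for however you describe
the NOMAP range):

#include <linux/fs.h>
#include <linux/mm.h>		/* vm_iomap_memory() */

/* Placeholder range; the real values would describe the NOMAP carve-out. */
static phys_addr_t carveout_base;
static size_t carveout_size;

static int carveout_mmap(struct file *file, struct vm_area_struct *vma)
{
	/* Validates the requested offset/length against the region and
	 * installs the user PTEs; no struct page is needed for this. */
	return vm_iomap_memory(vma, carveout_base, carveout_size);
}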

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 19:26 ` Russell King - ARM Linux
@ 2017-03-08 19:29   ` Russell King - ARM Linux
  0 siblings, 0 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2017-03-08 19:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 08, 2017 at 07:26:53PM +0000, Russell King - ARM Linux wrote:
> On Wed, Mar 08, 2017 at 11:03:43AM -0800, Florian Fainelli wrote:
> > Now, we have some peripherals that want large chunks of physically and
> > virtually contiguous memory that belong to these memblock NOMAP ranges.
> > I have no problems using mmap() against this memory, because the kernel
> > will do what is necessary for a process to map it for me. The struggle
> > is for a kernel driver which specifies a range of physical memory and
> > size, and expects a virtually contiguous mapping in return (not using
> > DMA-API, because reasons).
> 
> Will vm_iomap_memory() do the job?

Sorry, I thought you were asking about userspace.  The memremap()
family of functions is what you want for mapping it into the kernel.
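
Something along these lines should do it (just a sketch; base and size
are placeholders for wherever your carve-out is described):

#include <linux/errno.h>
#include <linux/io.h>		/* memremap(), memunmap(), MEMREMAP_WB */

static void *carveout_va;

static int carveout_map(phys_addr_t base, size_t size)
{
	/* Cacheable, virtually contiguous kernel mapping of memory that
	 * is outside the linear map and has no struct page. */
	carveout_va = memremap(base, size, MEMREMAP_WB);
	return carveout_va ? 0 : -ENOMEM;
}

static void carveout_unmap(void)
{
	memunmap(carveout_va);
}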

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 19:14 ` Ard Biesheuvel
@ 2017-03-08 19:52   ` Florian Fainelli
  2017-03-08 22:06     ` Ard Biesheuvel
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Fainelli @ 2017-03-08 19:52 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/08/2017 11:14 AM, Ard Biesheuvel wrote:
> 
>> On 8 Mar 2017, at 20:03, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>> Hi,
>>
>> On our platforms (brcmstb) we have a use case where we boot with some
>> (a lot actually) memory carved out and marked initially with bootmem
>> NOMAP in order for this memory not to be mapped in the kernel's linear
>> mapping.
>>
>> Now, we have some peripherals that want large chunks of physically and
>> virtually contiguous memory that belong to these memblock NOMAP ranges.
>> I have no problems using mmap() against this memory, because the kernel
>> will do what is necessary for a process to map it for me. The struggle
>> is for a kernel driver which specifies a range of physical memory and
>> size, and expects a virtually contiguous mapping in return (not using
>> DMA-API, because reasons).
>>
>> Essentially the problem is that there are no PTEs created for these
>> memory regions (and pfn_valid() returns 0, since this is NOMAP memory),
>> so I have been playing with __add_pages() from the memory hotplug code
>> in an attempt to get proper page references to this memory, but I am
>> clearly missing something.
>>
>> Yes I know it's a terrible idea, but what if I wanted to get that working?
>>
> 
> Did you try memremap?

Not yet, because this is done on 4.1 at the moment, but I will
definitely give this a try, thanks a lot!

Side note: on a kernel that does not have memremap() (such as 4.1),
would not an ioremap_cache() on the physical range marked as NOMAP
result in the same behavior anyway? ioremap() won't catch the fact that
we are mapping RAM, since this is NOMAP and pfn_valid() returns 0.

My understanding of the pfn_valid() check for ioremap() is to avoid
mapping the same DRAM location twice with potentially conflicting
attributes, but if it has not been mapped at all, as is the case with
NOMAP, doesn't that get me the same result?

Thanks!
-- 
Florian


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 19:52   ` Florian Fainelli
@ 2017-03-08 22:06     ` Ard Biesheuvel
  2017-03-08 22:10       ` Florian Fainelli
  0 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2017-03-08 22:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 March 2017 at 20:52, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 03/08/2017 11:14 AM, Ard Biesheuvel wrote:
>>
>>> On 8 Mar 2017, at 20:03, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> On our platforms (brcmstb) we have a use case where we boot with some
>>> (a lot actually) memory carved out and marked initially with bootmem
>>> NOMAP in order for this memory not to be mapped in the kernel's linear
>>> mapping.
>>>
>>> Now, we have some peripherals that want large chunks of physically and
>>> virtually contiguous memory that belong to these memblock NOMAP ranges.
>>> I have no problems using mmap() against this memory, because the kernel
>>> will do what is necessary for a process to map it for me. The struggle
>>> is for a kernel driver which specifies a range of physical memory and
>>> size, and expects a virtually contiguous mapping in return (not using
>>> DMA-API, because reasons).
>>>
>>> Essentially the problem is that there are no PTEs created for these
>>> memory regions (and pfn_valid() returns 0, since this is NOMAP memory),
>>> so I have been playing with __add_pages() from the memory hotplug code
>>> in an attempt to get proper page references to this memory, but I am
>>> clearly missing something.
>>>
>>> Yes I know it's a terrible idea, but what if I wanted to get that working?
>>>
>>
>> Did you try memremap?
>
> Not yet, because this is done on 4.1 at the moment, but I will
> definitely give this a try, thanks a lot!
>
> Side note: on a kernel that does not have memremap() (such as 4.1),
> would not an ioremap_cache() on the physical range marked as NOMAP
> result in the same behavior anyway? ioremap() won't catch the fact that
> we are mapping RAM, since this is NOMAP and pfn_valid() returns 0.
>
> My understanding of the pfn_valid() check for ioremap() is to avoid
> mapping the same DRAM location twice with potentially conflicting
> attributes, but if it has not been mapped at all, as is the case with
> NOMAP, doesn't that get me the same result?
>

Yes, it does. But ioremap_cache() is deprecated for mapping normal
memory. There remains a case for ioremap_cache() on ARM for mapping
NOR flash (which is arguably a device) with cacheable attributes, but
for the general case of mapping DRAM, you should not expect new code
using ioremap_cache() to be accepted upstream.
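
To be clear, the 4.1 stopgap you describe amounts to something like
this (sketch only; fine out of tree, but see above about not sending
new ioremap_cache() users for DRAM upstream):

#include <linux/errno.h>
#include <linux/io.h>

/* Cacheable mapping of the NOMAP carve-out via ioremap_cache().  This
 * works precisely because pfn_valid() is false for NOMAP memory, so
 * the "don't ioremap RAM" check never fires. */
static void __iomem *carveout_io;

static int carveout_map_legacy(phys_addr_t base, size_t size)
{
	carveout_io = ioremap_cache(base, size);
	return carveout_io ? 0 : -ENOMEM;
}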


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 22:06     ` Ard Biesheuvel
@ 2017-03-08 22:10       ` Florian Fainelli
  2017-03-16 19:04         ` Florian Fainelli
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Fainelli @ 2017-03-08 22:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/08/2017 02:06 PM, Ard Biesheuvel wrote:
> On 8 March 2017 at 20:52, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> On 03/08/2017 11:14 AM, Ard Biesheuvel wrote:
>>>
>>>> On 8 Mar 2017, at 20:03, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On our platforms (brcmstb) we have a use case where we boot with some
>>>> (a lot actually) memory carved out and marked initially with bootmem
>>>> NOMAP in order for this memory not to be mapped in the kernel's linear
>>>> mapping.
>>>>
>>>> Now, we have some peripherals that want large chunks of physically and
>>>> virtually contiguous memory that belong to these memblock NOMAP ranges.
>>>> I have no problems using mmap() against this memory, because the kernel
>>>> will do what is necessary for a process to map it for me. The struggle
>>>> is for a kernel driver which specifies a range of physical memory and
>>>> size, and expects a virtually contiguous mapping in return (not using
>>>> DMA-API, because reasons).
>>>>
>>>> Essentially the problem is that there are no PTEs created for these
>>>> memory regions (and pfn_valid() returns 0, since this is NOMAP memory),
>>>> so I have been playing with __add_pages() from the memory hotplug code
>>>> in an attempt to get proper page references to this memory, but I am
>>>> clearly missing something.
>>>>
>>>> Yes I know it's a terrible idea, but what if I wanted to get that working?
>>>>
>>>
>>> Did you try memremap?
>>
>> Not yet, because this is done on 4.1 at the moment, but I will
>> definitely give this a try, thanks a lot!
>>
>> Side note: on a kernel that does not have memremap() (such as 4.1),
>> would not an ioremap_cache() on the physical range marked as NOMAP
>> result in the same behavior anyway? ioremap() won't catch the fact that
>> we are mapping RAM, since this is NOMAP and pfn_valid() returns 0.
>>
>> My understanding of the pfn_valid() check for ioremap() is to avoid
>> mapping the same DRAM location twice with potentially conflicting
>> attributes, but if it has not been mapped at all, as is the case with
>> NOMAP, doesn't that get me the same result?
>>
> 
> Yes, it does. But ioremap_cache() is deprecated for mapping normal
> memory. There remains a case for ioremap_cache() on ARM for mapping
> NOR flash (which is arguably a device) with cacheable attributes, but
> for the general case of mapping DRAM, you should not expect new code
> using ioremap_cache() to be accepted upstream.

This is very likely going to remain out of tree, and I will keep an eye
on migrating this to memremap() when we update to a newer kernel. Thanks!
-- 
Florian


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-08 22:10       ` Florian Fainelli
@ 2017-03-16 19:04         ` Florian Fainelli
  2017-03-16 20:00           ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Fainelli @ 2017-03-16 19:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/08/2017 02:10 PM, Florian Fainelli wrote:
>> Yes, it does. But ioremap_cache() is deprecated for mapping normal
>> memory. There remains a case for ioremap_cache() on ARM for mapping
>> NOR flash (which is arguably a device) with cacheable attributes, but
>> for the general case of mapping DRAM, you should not expect new code
>> using ioremap_cache() to be accepted upstream.
> 
> This is very likely going to remain out of tree, and I will keep an eye
> on migrating this to memremap() when we update to a newer kernel. Thanks!

And now I have another interesting problem, self-inflicted of course. We
have this piece of code in mm/gup.c [1] which is meant to allow
doing O_DIRECT on pages that are now marked as NOMAP.
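
Paraphrasing (this is a sketch, not the exact code at [1]), the guard
in that fast-GUP path, together with the local change I describe below,
looks roughly like:

	unsigned long pfn = pte_pfn(pte);

-	if (!pfn_valid(pfn))		/* false for NOMAP pfns, so GUP bails */
+	if (!memblock_is_memory(__pfn_to_phys(pfn)))	/* NOMAP pfns now pass */
		goto pte_unmap;

	page = pte_page(pte);		/* only meaningful with a real struct page */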

Our middleware does an mmap() of some regions initially marked as NOMAP
so that it can access this memory and create a mapping "on demand" only
when these physical memory regions are actually used. The use case for
O_DIRECT is playing back a file directly from e.g. a local hard drive;
it provides a significant enough performance boost that we want to keep
bypassing the page cache.

After removing the check in the above-mentioned piece of code for
!pfn_valid() and making it a !memblock_is_memory(__pfn_to_phys(pfn)),
I can move on and everything seems to be fine, except that eventually
we get the following call trace:

ata_qc_issue -> arm_dma_map_sg -> arm_dma_map_page ->
__dma_page_cpu_to_dev -> dma_cache_maint_page

[  170.253148] [00000000] *pgd=07b0e003, *pmd=0bc31003, *pte=00000000
[  170.262157] Internal error: Oops: 207 [#1] SMP ARM
[  170.279088] CPU: 1 PID: 1688 Comm: nx_io_worker0 Tainted: P
O    4.1.20-1.8pre-01028-g970868a93bbc-dirty #6
[  170.289708] Hardware name: Broadcom STB (Flattened Device Tree)
[  170.295635] task: cd16d500 ti: c7340000 task.ti: c7340000
[  170.301048] PC is at dma_cache_maint_page+0x70/0x140
[  170.306019] LR is at __dma_page_cpu_to_dev+0x2c/0xa8
[  170.310989] pc : [<c001cbec>]    lr : [<c001cce8>]    psr: 60010093
[  170.310989] sp : c7341af8  ip : 00000000  fp : c0e3a300
[  170.322479] r10: 00000002  r9 : c00219a4  r8 : c0e6c740
[  170.327709] r7 : 00000000  r6 : 00010000  r5 : feb8cca0  r4 : fff5c665
[  170.334244] r3 : c0e0a4a8  r2 : 0000007f  r1 : 0000fff5  r0 : ce97aca0
[  170.340779] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
Segment user
[  170.348009] Control: 30c5387d  Table: 07c351c0  DAC: 55555555

and that's actually coming from the fact that we have SPARSEMEM
(actually SPARSEMEM && SPARSEMEM_MANUAL && SPARSEMEM_EXTREME) enabled
for this platform: __section_mem_map_addr() dereferences
section->section_mem_map, and section is NULL here as a result of the
__page_to_pfn() and __pfn_to_page() conversion.
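
For reference, my understanding is that the classic (non-vmemmap)
SPARSEMEM conversion boils down to roughly this (paraphrasing
asm-generic/memory_model.h):

#define __pfn_to_page(pfn)					\
({	unsigned long __pfn = (pfn);				\
	struct mem_section *__sec = __pfn_to_section(__pfn);	\
	__section_mem_map_addr(__sec) + __pfn;			\
})

/* With SPARSEMEM_EXTREME, __pfn_to_section() returns NULL for a pfn
 * whose section root was never allocated, and __section_mem_map_addr()
 * then reads NULL->section_mem_map, hence the fault above in
 * dma_cache_maint_page(). */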

So I guess my question is: if a process is mapping some physical memory
through /dev/mem, could sparsemem somehow populate the section
corresponding to this PFN? Everything I see seems to happen either at
boot time or when memory hotplug is used (maybe I should start using
memory hotplug).

Thanks!

[1]:
https://github.com/Broadcom/stblinux-4.1/blob/master/linux/mm/gup.c#L388
-- 
Florian


* Creating kernel mappings for memory initially marked with bootmem NOMAP?
  2017-03-16 19:04         ` Florian Fainelli
@ 2017-03-16 20:00           ` Russell King - ARM Linux
  0 siblings, 0 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2017-03-16 20:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 16, 2017 at 12:04:26PM -0700, Florian Fainelli wrote:
> On 03/08/2017 02:10 PM, Florian Fainelli wrote:
> >> Yes, it does. But ioremap_cache() is deprecated for mapping normal
> >> memory. There remains a case for ioremap_cache() on ARM for mapping
> >> NOR flash (which is arguably a device) with cacheable attributes, but
> >> for the general case of mapping DRAM, you should not expect new code
> >> using ioremap_cache() to be accepted upstream.
> > 
> > This is very likely going to remain out of tree, and I will keep an eye
> > on migrating this to memremap() when we update to a newer kernel. Thanks!
> 
> And now I have another interesting problem, self-inflicted of course. We
> have this piece of code in mm/gup.c [1] which is meant to allow
> doing O_DIRECT on pages that are now marked as NOMAP.

I think you're wrong.  get_user_pages() retrieves a list of "struct page"
pointers for the range of user addresses.  NOMAP regions do not have an
associated "struct page" (they're not declared into the Linux page
allocator.)

> Our middleware does an mmap() of some regions initially marked as NOMAP
> so that it can access this memory and create a mapping "on demand" only
> when these physical memory regions are actually used. The use case for
> O_DIRECT is playing back a file directly from e.g. a local hard drive;
> it provides a significant enough performance boost that we want to keep
> bypassing the page cache.
> 
> After removing the check in the above-mentioned piece of code for
> !pfn_valid() and making it a !memblock_is_memory(__pfn_to_phys(pfn)),
> I can move on and everything seems to be fine, except that eventually
> we get the following call trace:

pfn_valid()'s whole point of existing is to return true only for pfns
that correspond with pages managed by the Linux page allocator.  You've
bypassed that, making the test return true for other pfns.  This means
that:

                page = pte_page(pte);

is going to return rubbish for "page", which will lead to...

> ata_qc_issue -> arm_dma_map_sg -> arm_dma_map_page ->
> __dma_page_cpu_to_dev -> dma_cache_maint_page
> 
> [  170.253148] [00000000] *pgd=07b0e003, *pmd=0bc31003, *pte=00000000
> [  170.262157] Internal error: Oops: 207 [#1] SMP ARM
> [  170.279088] CPU: 1 PID: 1688 Comm: nx_io_worker0 Tainted: P
> O    4.1.20-1.8pre-01028-g970868a93bbc-dirty #6
> [  170.289708] Hardware name: Broadcom STB (Flattened Device Tree)
> [  170.295635] task: cd16d500 ti: c7340000 task.ti: c7340000
> [  170.301048] PC is at dma_cache_maint_page+0x70/0x140
> [  170.306019] LR is at __dma_page_cpu_to_dev+0x2c/0xa8

exactly this, because DMA cache maintenance relies upon having a
valid and dereferenceable struct page.

> So I guess my question is: if a process is mapping some physical memory
> through /dev/mem, could sparsemem somehow populate that section
> corresponding to this PFN? Everything I see seems to occur at boot time
> and when memory hotplug is used (maybe I should start using memory hotplug).

If you hotplug the memory into the Linux page allocator, then you will
need the memory to be mapped, and Linux will integrate it into the
page allocator, and it will be no different from any other memory.

At that point, you might as well have ignored the NOMAP.

Linux's block IO is just not designed to do device DMA to random bits
of memory that are not part of the page allocator.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

