linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* ioremap_page_range: remapping of physical RAM ranges
@ 2017-01-25 19:55 A. Samy
  2017-01-25 22:27 ` John Hubbard
  0 siblings, 1 reply; 11+ messages in thread
From: A. Samy @ 2017-01-25 19:55 UTC (permalink / raw)
  To: linux-mm; +Cc: zhongjiang

Hi,

Commit 3277953de2f31 un-exported ioremap_page_range(), what is an
alternative method of remapping a physical ram range...  This function
was very useful, examples here:
https://github.com/asamy/ksm/blob/master/mm.c#L38 and here:
https://github.com/asamy/ksm/blob/master/ksm.c#L410 etc...

So, you're forcing me to either reimplement it on my own (which is
merely copy-pasting the kernel function), unless you have a suggestion
on what else to use (which I could never find other)?

Thanks,

-- 
asamy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-25 19:55 ioremap_page_range: remapping of physical RAM ranges A. Samy
@ 2017-01-25 22:27 ` John Hubbard
  2017-01-25 23:15   ` Ahmed Samy
  0 siblings, 1 reply; 11+ messages in thread
From: John Hubbard @ 2017-01-25 22:27 UTC (permalink / raw)
  To: A. Samy, linux-mm; +Cc: zhongjiang

On 01/25/2017 11:55 AM, A. Samy wrote:
> Hi,
>
> Commit 3277953de2f31 un-exported ioremap_page_range(), what is an
> alternative method of remapping a physical ram range...  This function
> was very useful, examples here:
> https://github.com/asamy/ksm/blob/master/mm.c#L38 and here:
> https://github.com/asamy/ksm/blob/master/ksm.c#L410 etc...
>
> So, you're forcing me to either reimplement it on my own (which is
> merely copy-pasting the kernel function), unless you have a suggestion
> on what else to use (which I could never find other)?

Hi A. Samy,

I'm sorry this caught you by surprise, let's try get your use case covered.

My thinking on this was: the exported ioremap* family of functions was clearly intended to provide 
just what the name says: mapping of IO (non-RAM) memory. If normal RAM is to be re-mapped, then it 
should not be done "casually" in a driver, as a (possibly unintended) side effect of a function that 
implies otherwise. Either it should be done within the core mm code, or perhaps a new, better-named 
wrapper could be provided, for cases such as yours.

After a very quick peek at your github code, it seems that your mm_remap() routine already has some 
code in common with __ioremap_caller(), so I'm thinking that we could basically promote your 
mm_remap to the in-tree kernel and EXPORT it, and maybe factor out the common parts (or not--it's 
small, after all). Thoughts? If you like it, I'll put something together here.

thanks
john h

>
> Thanks,
>
> -- 
> asamy
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-25 22:27 ` John Hubbard
@ 2017-01-25 23:15   ` Ahmed Samy
  2017-01-26  8:33     ` John Hubbard
  0 siblings, 1 reply; 11+ messages in thread
From: Ahmed Samy @ 2017-01-25 23:15 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-mm, zhongjiang

On Wed, Jan 25, 2017 at 02:27:27PM -0800, John Hubbard wrote:
> 
> Hi A. Samy,
> 
> I'm sorry this caught you by surprise, let's try get your use case covered.
> 
> My thinking on this was: the exported ioremap* family of functions was
> clearly intended to provide just what the name says: mapping of IO (non-RAM)
> memory. If normal RAM is to be re-mapped, then it should not be done
> "casually" in a driver, as a (possibly unintended) side effect of a function
> that implies otherwise. Either it should be done within the core mm code, or
> perhaps a new, better-named wrapper could be provided, for cases such as
> yours.
Hi John,

I agree.  I assume whoever exported it was also doing it for the same
purpose as mine[?]
> 
> After a very quick peek at your github code, it seems that your mm_remap()
> routine already has some code in common with __ioremap_caller(), so I'm
> thinking that we could basically promote your mm_remap to the in-tree kernel
> and EXPORT it, and maybe factor out the common parts (or not--it's small,
> after all). Thoughts? If you like it, I'll put something together here.
That'd be a good solution, it's actually sometimes useful to remap physical
ram in general, specifically for memory imaging tools, etc.

How about also exporting walk_system_ram_range()?  It seems to be defined
conditionally, so I am not sure if that would be a good idea.
	[ See also mm_cache_ram_ranges() in mm.c in github a?? it's also a hacky
	  way to get RAM ranges.  ]

How about something like:

	/* vm_flags incase locking is required, in my case, I need it for VMX
	 * root where there is no interrupts.  */
	void *remap_ram_range(unsigned long phys, unsigned long size,
			      unsigned long vm_flags)
	{
		struct vm_struct *area;
		unsigned long psize;
		unsigned long vaddr;

		psize = (size >> PAGE_SHIFT) + (size & (PAGE_SIZE - 1)) != 0;
		area = get_vm_area_caller(size, VM_IOREMAP | vm_flags, 
					  __builtin_return_address(0));
		if (!area)
			return NULL;

		area->phys_addr = phys & ~(PAGE_SIZE - 1);
		vaddr = (unsigned long)area->addr;
		if (remap_page_range(vaddr, vaddr + size, phys, size))
			goto err_remap;

		return (void *)vaddr + phys & (PAGE_SIZE - 1);
err_remap:
		free_vm_area(area);
		return NULL;
	}

Of course you can add protection, etc.
> 
> thanks
> john h
> 
Thanks,
	asamy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-25 23:15   ` Ahmed Samy
@ 2017-01-26  8:33     ` John Hubbard
  2017-01-26 18:24       ` Ahmed Samy
  2017-01-28 21:11       ` Ahmed Samy
  0 siblings, 2 replies; 11+ messages in thread
From: John Hubbard @ 2017-01-26  8:33 UTC (permalink / raw)
  To: Ahmed Samy; +Cc: linux-mm, zhongjiang

On 01/25/2017 03:15 PM, Ahmed Samy wrote:
> On Wed, Jan 25, 2017 at 02:27:27PM -0800, John Hubbard wrote:
>>
>> Hi A. Samy,
>>
>> I'm sorry this caught you by surprise, let's try get your use case covered.
>>
>> My thinking on this was: the exported ioremap* family of functions was
>> clearly intended to provide just what the name says: mapping of IO (non-RAM)
>> memory. If normal RAM is to be re-mapped, then it should not be done
>> "casually" in a driver, as a (possibly unintended) side effect of a function
>> that implies otherwise. Either it should be done within the core mm code, or
>> perhaps a new, better-named wrapper could be provided, for cases such as
>> yours.
> Hi John,
>
> I agree.  I assume whoever exported it was also doing it for the same
> purpose as mine[?]
>>
>> After a very quick peek at your github code, it seems that your mm_remap()
>> routine already has some code in common with __ioremap_caller(), so I'm
>> thinking that we could basically promote your mm_remap to the in-tree kernel
>> and EXPORT it, and maybe factor out the common parts (or not--it's small,
>> after all). Thoughts? If you like it, I'll put something together here.
> That'd be a good solution, it's actually sometimes useful to remap physical
> ram in general, specifically for memory imaging tools, etc.
>
> How about also exporting walk_system_ram_range()?  It seems to be defined
> conditionally, so I am not sure if that would be a good idea.

That routine has an interesting history. At first glance, I think it used to be 
exported. And now it is not. And it's ifdef'd out only for powerpc. I'll look into 
the history and intentions of that some more...

> 	[ See also mm_cache_ram_ranges() in mm.c in github – it's also a hacky
> 	  way to get RAM ranges.  ]

Yes, I see.

>
> How about something like:
>
> 	/* vm_flags incase locking is required, in my case, I need it for VMX
> 	 * root where there is no interrupts.  */
> 	void *remap_ram_range(unsigned long phys, unsigned long size,
> 			      unsigned long vm_flags)
> 	{
> 		struct vm_struct *area;
> 		unsigned long psize;
> 		unsigned long vaddr;
>
> 		psize = (size >> PAGE_SHIFT) + (size & (PAGE_SIZE - 1)) != 0;
> 		area = get_vm_area_caller(size, VM_IOREMAP | vm_flags,
> 					  __builtin_return_address(0));
> 		if (!area)
> 			return NULL;
>
> 		area->phys_addr = phys & ~(PAGE_SIZE - 1);
> 		vaddr = (unsigned long)area->addr;
> 		if (remap_page_range(vaddr, vaddr + size, phys, size))

That's ioremap_page_range, I assume (rather than remap_page_range)?

Overall, the remap_ram_range approach looks reasonable to me so far. I'll look into 
the details tomorrow.

I'm sure that most people on this list already know this, but...could you say a few 
more words about how remapping system ram is used, why it's a good thing and not a 
bad thing? :)

thanks
john h

> 			goto err_remap;
>
> 		return (void *)vaddr + phys & (PAGE_SIZE - 1);
> err_remap:
> 		free_vm_area(area);
> 		return NULL;
> 	}
>
> Of course you can add protection, etc.
>>
>> thanks
>> john h
>>
> Thanks,
> 	asamy
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-26  8:33     ` John Hubbard
@ 2017-01-26 18:24       ` Ahmed Samy
  2017-01-28 21:11       ` Ahmed Samy
  1 sibling, 0 replies; 11+ messages in thread
From: Ahmed Samy @ 2017-01-26 18:24 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-mm, zhongjiang

On Thu, Jan 26, 2017 at 12:33:02AM -0800, John Hubbard wrote:
> 
> That's ioremap_page_range, I assume (rather than remap_page_range)?
Yes, I renamed it for your convience.

> 
> Overall, the remap_ram_range approach looks reasonable to me so far. I'll
> look into the details tomorrow.
> 
> I'm sure that most people on this list already know this, but...could you
> say a few more words about how remapping system ram is used, why it's a good
> thing and not a bad thing? :)
> 
It's useful for memory imaging tools, where you'd iterate through available ram
ranges, and dump it.  You could look at Google's Rekall, I am not sure if they
take the exact same approach.

Another use is mine, when I use EPT (GPA <-> HPA), let's say when I want to
write to a guest virtual address, then I first need to translate that into GPA,
then translate to HPA through EPT, and remap HPA to get a safe HVA, then I can write it to safely.
You can see a few use cases in the github link...  You could assume the same
for userspace to kernel space mapping, you mostly wouldn't trust the user
address passed, so you'd remap it to kernel space first (ptrace, whatever...).

	asamy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-26  8:33     ` John Hubbard
  2017-01-26 18:24       ` Ahmed Samy
@ 2017-01-28 21:11       ` Ahmed Samy
  2017-01-28 21:48         ` John Hubbard
  1 sibling, 1 reply; 11+ messages in thread
From: Ahmed Samy @ 2017-01-28 21:11 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-mm, zhongjiang

On Thu, Jan 26, 2017 at 12:33:02AM -0800, John Hubbard wrote:
> 
> That's ioremap_page_range, I assume (rather than remap_page_range)?
> 
> Overall, the remap_ram_range approach looks reasonable to me so far. I'll
> look into the details tomorrow.
> 
> I'm sure that most people on this list already know this, but...could you
> say a few more words about how remapping system ram is used, why it's a good
> thing and not a bad thing? :)
> 
> thanks
> john h
> 
Please let me know if you're going to actually make a commit that either
	1) reverts that commit
	2) implements a "separate" function...

Either way, I don't think the un-export is reasonable in the slightest, if that
function is too low-level, then why not also un-export pmd_offset(),
pgd_offset(), perhaps current task too?  These interact directly with low-level
stuff, not meant for drivers, the function is meant to be low-level, I don't know
what made you think that people use it wrong?  How about writing proper
documentation about it instead?  Besides, even if that function does not exist,
you can always iterate the PTEs and set the physical address, it is not hard,
but the safe way is via the kernel knowledge, which is what that function
when combined with others (from vmalloc) provide...

How about this, a function as part of vmalloc, that says something like
`void *vremap(unsigned long phys, unsigned long size, unsigned long flags);`?
Then that solved the problem and there is no need for "low level" functions,
anymore.

Other than, if you're not going to apply a proper workaround, then let me know,
and I'll handle it myself from here.  I don't want this to get past the next
-rc release, so please let's get this fixed...

Thanks,
	asamy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-28 21:11       ` Ahmed Samy
@ 2017-01-28 21:48         ` John Hubbard
  2017-01-28 21:55           ` Ahmed Samy
  0 siblings, 1 reply; 11+ messages in thread
From: John Hubbard @ 2017-01-28 21:48 UTC (permalink / raw)
  To: Ahmed Samy; +Cc: linux-mm, zhongjiang

On 01/28/2017 01:11 PM, Ahmed Samy wrote:
> On Thu, Jan 26, 2017 at 12:33:02AM -0800, John Hubbard wrote:
>>
>> That's ioremap_page_range, I assume (rather than remap_page_range)?
>>
>> Overall, the remap_ram_range approach looks reasonable to me so far. I'll
>> look into the details tomorrow.
>>
>> I'm sure that most people on this list already know this, but...could you
>> say a few more words about how remapping system ram is used, why it's a good
>> thing and not a bad thing? :)
>>
>> thanks
>> john h
>>
> Please let me know if you're going to actually make a commit that either
> 	1) reverts that commit
> 	2) implements a "separate" function...

This email caught me as I was just sitting down to this. A couple days later than I expected, sorry.

>
> Either way, I don't think the un-export is reasonable in the slightest, if that
> function is too low-level, then why not also un-export pmd_offset(),
> pgd_offset(), perhaps current task too?  These interact directly with low-level
> stuff, not meant for drivers, the function is meant to be low-level, I don't know
> what made you think that people use it wrong?  How about writing proper
> documentation about it instead?

heh, I'm sure we're in strong agreement there: good documentation is desirable, yet sometimes 
missing. :) As for "what made you think that people use it wrong?", I think Zhong spotted a 
potential problem, then (if I understand correctly) decided that it was actually OK, but by then, I 
had egged him on to remove the EXPORT, because it looked "off" to me, too. (It still does.)

So I'll take the heat for this one, and that's why I'm following up on it.

   Besides, even if that function does not exist,
> you can always iterate the PTEs and set the physical address, it is not hard,
> but the safe way is via the kernel knowledge, which is what that function
> when combined with others (from vmalloc) provide...
>
> How about this, a function as part of vmalloc, that says something like
> `void *vremap(unsigned long phys, unsigned long size, unsigned long flags);`?
> Then that solved the problem and there is no need for "low level" functions,
> anymore.

Quick question, what do you mean "a function as part of vmalloc"?

>
> Other than, if you're not going to apply a proper workaround, then let me know,
> and I'll handle it myself from here.  I don't want this to get past the next
> -rc release, so please let's get this fixed...

Agreed.

thanks,
john h

>
> Thanks,
> 	asamy
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-28 21:48         ` John Hubbard
@ 2017-01-28 21:55           ` Ahmed Samy
  2017-01-28 22:12             ` Ahmed Samy
  2017-01-28 22:13             ` John Hubbard
  0 siblings, 2 replies; 11+ messages in thread
From: Ahmed Samy @ 2017-01-28 21:55 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-mm, zhongjiang

On Sat, Jan 28, 2017 at 01:48:46PM -0800, John Hubbard wrote:
> Quick question, what do you mean "a function as part of vmalloc"?
Take a look at `vmap()', it should be like that API.  Dunno if vmap can be used
in place, haven't tried, maybe can get `struct page` and then use vmap?  I
don't know, what do you think?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-28 21:55           ` Ahmed Samy
@ 2017-01-28 22:12             ` Ahmed Samy
  2017-01-28 22:16               ` John Hubbard
  2017-01-28 22:13             ` John Hubbard
  1 sibling, 1 reply; 11+ messages in thread
From: Ahmed Samy @ 2017-01-28 22:12 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-mm, zhongjiang

On Sat, Jan 28, 2017 at 11:55:05PM +0200, Ahmed Samy wrote:
> Take a look at `vmap()', it should be like that API.  Dunno if vmap can be used
> in place, haven't tried, maybe can get `struct page` and then use vmap?  I
> don't know, what do you think?
OK, so, I just tried vmap() with a struct page retrieved by pfn_to_page() which
seems to work just fine, so I suppose this can be disregarded, it's my fault; I
should've done more research before posting this to the list.

Sorry.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-28 21:55           ` Ahmed Samy
  2017-01-28 22:12             ` Ahmed Samy
@ 2017-01-28 22:13             ` John Hubbard
  1 sibling, 0 replies; 11+ messages in thread
From: John Hubbard @ 2017-01-28 22:13 UTC (permalink / raw)
  To: Ahmed Samy; +Cc: linux-mm, zhongjiang

On 01/28/2017 01:55 PM, Ahmed Samy wrote:
> On Sat, Jan 28, 2017 at 01:48:46PM -0800, John Hubbard wrote:
>> Quick question, what do you mean "a function as part of vmalloc"?
> Take a look at `vmap()', it should be like that API.  Dunno if vmap can be used
> in place, haven't tried, maybe can get `struct page` and then use vmap?  I
> don't know, what do you think?

Right, vmap is probably what you're looking for here. The way this story usually goes in my 
experience is: if you're actually dealing with real RAM, you are also dealing in struct pages. So 
you lookup the struct pages, keep track of them, (maybe also pin them with get_user_pages), and then 
map them with vmap.

Beyond the out-of-tree driver that I'm supporting (which uses vmap in that way), there are also a 
*lot* of in-tree examples that also do it.

Are you buying this? :)

thanks
john h

>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ioremap_page_range: remapping of physical RAM ranges
  2017-01-28 22:12             ` Ahmed Samy
@ 2017-01-28 22:16               ` John Hubbard
  0 siblings, 0 replies; 11+ messages in thread
From: John Hubbard @ 2017-01-28 22:16 UTC (permalink / raw)
  To: Ahmed Samy; +Cc: linux-mm, zhongjiang

On 01/28/2017 02:12 PM, Ahmed Samy wrote:
> On Sat, Jan 28, 2017 at 11:55:05PM +0200, Ahmed Samy wrote:
>> Take a look at `vmap()', it should be like that API.  Dunno if vmap can be used
>> in place, haven't tried, maybe can get `struct page` and then use vmap?  I
>> don't know, what do you think?
> OK, so, I just tried vmap() with a struct page retrieved by pfn_to_page() which
> seems to work just fine, so I suppose this can be disregarded, it's my fault; I
> should've done more research before posting this to the list.
>
> Sorry.

No problem at all, delighted to hear that your code will be OK!

thanks
john h


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-01-28 22:16 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-25 19:55 ioremap_page_range: remapping of physical RAM ranges A. Samy
2017-01-25 22:27 ` John Hubbard
2017-01-25 23:15   ` Ahmed Samy
2017-01-26  8:33     ` John Hubbard
2017-01-26 18:24       ` Ahmed Samy
2017-01-28 21:11       ` Ahmed Samy
2017-01-28 21:48         ` John Hubbard
2017-01-28 21:55           ` Ahmed Samy
2017-01-28 22:12             ` Ahmed Samy
2017-01-28 22:16               ` John Hubbard
2017-01-28 22:13             ` John Hubbard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).