* mm/memblock: export memblock_{start/end}_of_DRAM
From: Sudarshan Rajagopalan @ 2020-10-29 21:29 UTC
  To: Anshuman Khandual, Mark Rutland, David Hildenbrand, Steven Price,
	Mike Rapoport, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Suren Baghdasaryan,
	Greg Kroah-Hartman, Pratik Patel

Hello all,

We have a use case where a module driver adds certain memory blocks using 
add_memory_driver_managed(), so that it can perform memory hotplug 
operations on these blocks. In general, these memory blocks aren’t 
something that gets physically added later; they are part of the actual 
RAM that the system booted up with. That is, we set the ‘mem=’ cmdline 
parameter to limit the memory and later add the remaining blocks using 
add_memory*() variants.

The basic idea is to have the driver own and manage certain memory 
blocks for hotplug operations.
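
For illustration, a minimal sketch of the hot-add side - not our actual 
driver; nid, start and size are placeholders, and the signature shown is 
the v5.9-era one, which requires a resource name of the form 
"System RAM (...)":

#include <linux/memory.h>
#include <linux/memory_hotplug.h>

static int example_add_range(int nid, u64 start, u64 size)
{
	/* hot-added ranges must be aligned to the memory block size */
	if (!IS_ALIGNED(start | size, memory_block_size_bytes()))
		return -EINVAL;

	return add_memory_driver_managed(nid, start, size,
					 "System RAM (example)");
}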

For the driver to be able to know how much memory was limited and how 
much is actually present, we take the delta of the ‘bootmem physical end 
address’ and ‘memblock_end_of_DRAM’. The 'bootmem physical end address' 
is obtained by scanning the reg values in the ‘memory’ DT node and 
determining the max {addr, size}. Since our driver is being modularized, 
we won’t have access to memblock_end_of_DRAM() (i.e. the end address of 
all memory blocks after ‘mem=’ is applied).
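
For reference, a rough sketch of how we derive the 'bootmem physical end 
address' from the DT (the helper name is ours and error handling is 
omitted):

#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/ioport.h>

static phys_addr_t bootmem_phys_end(void)
{
	struct device_node *np;
	struct resource res;
	phys_addr_t end = 0;
	int i;

	/* walk every reg entry of every 'memory' node, keep the max end */
	for_each_node_by_type(np, "memory") {
		for (i = 0; of_address_to_resource(np, i, &res) == 0; i++)
			end = max_t(phys_addr_t, end, res.end + 1);
	}

	return end;
}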

So we are checking whether the memblock_{start/end}_of_DRAM() symbols can 
be exported. Also, this information can be obtained from userspace by 
doing ‘cat /proc/iomem’ and grepping for ‘System RAM’. Since userspace 
can have access to such info, can we allow kernel module drivers access 
as well by exporting memblock_{start/end}_of_DRAM()?

Or are there any other ways for a module driver to get the end address 
of the system memory block?


Sudarshan

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project


* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: David Hildenbrand @ 2020-10-30  6:41 UTC
  To: Sudarshan Rajagopalan, Anshuman Khandual, Mark Rutland,
	Steven Price, Mike Rapoport, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Suren Baghdasaryan,
	Greg Kroah-Hartman, Pratik Patel

On 29.10.20 22:29, Sudarshan Rajagopalan wrote:
> Hello all,
> 

Hi!

> We have a use case where a module driver adds certain memory blocks using
> add_memory_driver_managed(), so that it can perform memory hotplug
> operations on these blocks. In general, these memory blocks aren’t
> something that gets physically added later; they are part of the actual RAM
> that the system booted up with. That is, we set the ‘mem=’ cmdline
> parameter to limit the memory and later add the remaining blocks using
> add_memory*() variants.
> 
> The basic idea is to have the driver own and manage certain
> memory blocks for hotplug operations.

So, in summary, you're still abusing the memory hot(un)plug 
infrastructure from your driver - just not in as severe a way as before. 
And I'll tell you why, so you might understand why exposing this API is 
not really a good idea and why your driver wouldn't - for example - be 
upstream material.

Don't get me wrong, what you are doing might be ok in your context, but 
it's simply not universally applicable in our current model.

Ordinary system RAM works differently than many other devices (like PCI 
devices) whereby *something* senses the device and exposes it to the 
system, and some available driver binds to it and owns the memory.

Memory is detected by a driver and added to the system via e.g., 
add_memory_driver_managed(). Memory devices are created and the memory 
is directly handed off to the system, to be used as system RAM as soon 
as memory devices are onlined. There is no driver that "binds" memory 
like other devices - it's rather the core (buddy) that uses/owns that 
memory immediately after device creation.

> 
> For the driver to be able to know how much memory was limited and how much
> is actually present, we take the delta of the ‘bootmem physical end address’
> and ‘memblock_end_of_DRAM’. The 'bootmem physical end address' is
> obtained by scanning the reg values in the ‘memory’ DT node and determining
> the max {addr, size}. Since our driver is being modularized, we won’t
> have access to memblock_end_of_DRAM() (i.e. the end address of all memory
> blocks after ‘mem=’ is applied).

What you do with "mem=" is force memory detection to ignore some of its 
detected memory.

> 
> So we are checking whether the memblock_{start/end}_of_DRAM() symbols can be exported.
> Also, this information can be obtained from userspace by doing ‘cat
> /proc/iomem’ and grepping for ‘System RAM’. Since userspace can

Not correct: with "mem=", cat /proc/iomem only shows *detected* + added 
system RAM, not the unmodified detection.

> have access to such info, can we allow kernel module drivers access
> as well by exporting memblock_{start/end}_of_DRAM()?
> 
> Or are there any other ways for a module driver to get the end
> address of the system memory block?

And here is our problem: You disabled *detection* of that memory by the 
responsible driver (here: core). Now your driver wants to know what 
would have been detected. Assume you have a memory hole in that region - 
it would not work by simply looking at start/end. Your driver is not 
the one doing the detection.

Another issue is: when using such memory for KVM guests, there is no 
mechanism that tracks ownership of that memory - imagine another driver 
wanting to use that memory. This really only works in special environments.

Yet another issue: you cannot assume that memblock data will stay around 
after boot. While we do it right now for arm64, that might change at 
some point. This is also one of the reasons why we don't export any real 
memblock data to drivers.


When using "mem=" you have to know the exact layout of your system RAM 
and communicate the right places how that layout looks like manually: 
here, to your driver.
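
E.g., a sketch of what I mean (the parameter name is made up) - whoever 
sets "mem=" also tells the driver where RAM really ends:

#include <linux/moduleparam.h>

/* physical end of RAM before "mem=" was applied, supplied by the admin */
static unsigned long real_ram_end;
module_param(real_ram_end, ulong, 0444);
MODULE_PARM_DESC(real_ram_end, "Physical end address of RAM before mem=");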

The clean way of doing things today is to allocate RAM and use it for 
guests - e.g., using hugetlb/gigantic pages. As I said, there are other 
techniques coming up to deal with minimizing struct page overhead - if 
that's what you're concerned with (I still don't know why you're 
removing the memory from the host when giving it to the guest).

-- 
Thanks,

David / dhildenb



* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: Mike Rapoport @ 2020-10-30  8:38 UTC
  To: Sudarshan Rajagopalan
  Cc: Anshuman Khandual, Mark Rutland, David Hildenbrand, Steven Price,
	linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Suren Baghdasaryan, Greg Kroah-Hartman, Pratik Patel

On Thu, Oct 29, 2020 at 02:29:27PM -0700, Sudarshan Rajagopalan wrote:
> Hello all,
> 
> We have a use case where a module driver adds certain memory blocks using
> add_memory_driver_managed(), so that it can perform memory hotplug
> operations on these blocks. In general, these memory blocks aren’t something
> that gets physically added later; they are part of the actual RAM that the
> system booted up with. That is, we set the ‘mem=’ cmdline parameter to limit
> the memory and later add the remaining blocks using add_memory*() variants.
> 
> The basic idea is to have the driver own and manage certain memory
> blocks for hotplug operations.
> 
> For the driver to be able to know how much memory was limited and how much
> is actually present, we take the delta of the ‘bootmem physical end address’ and
> ‘memblock_end_of_DRAM’. The 'bootmem physical end address' is obtained by
> scanning the reg values in the ‘memory’ DT node and determining the max
> {addr, size}. Since our driver is being modularized, we won’t have access
> to memblock_end_of_DRAM() (i.e. the end address of all memory blocks after ‘mem=’
> is applied).
> 
> So we are checking whether the memblock_{start/end}_of_DRAM() symbols can be exported.
> Also, this information can be obtained from userspace by doing ‘cat /proc/iomem’ and
> grepping for ‘System RAM’. Since userspace can have access to such
> info, can we allow kernel module drivers access as well by exporting
> memblock_{start/end}_of_DRAM()?

These functions cannot be exported not because we want to hide this
information from the modules but because it is unsafe to use them.
On most architectures these functions are __init so they are discarded
after boot anyway. Besides, the memory configuration known to memblock
might not be accurate in many cases, as David explained in his reply.

> Or are there any other ways for a module driver to get the end address of
> the system memory block?
 
What do you mean by "system memory block"? There could be a lot of
interpretations if you take into account memory hotplug, "mem=" option,
reserved and firmware memory.

I'd suggest you describe the entire use case in more detail. Having
the complete picture would help find a proper solution.

> Sudarshan
> 

--
Sincerely yours,
Mike.


* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: Christoph Hellwig @ 2020-10-31  9:18 UTC
  To: Mike Rapoport
  Cc: Sudarshan Rajagopalan, Mark Rutland, David Hildenbrand,
	Catalin Marinas, Anshuman Khandual, linux-kernel, Steven Price,
	Suren Baghdasaryan, Greg Kroah-Hartman, Will Deacon,
	linux-arm-kernel, Pratik Patel

On Fri, Oct 30, 2020 at 10:38:42AM +0200, Mike Rapoport wrote:
>  
> What do you mean by "system memory block"? There could be a lot of
> interpretations if you take into account memory hotplug, "mem=" option,
> reserved and firmware memory.
> 
> I'd suggest you describe the entire use case in more detail. Having
> the complete picture would help find a proper solution.

I think we need the code for the driver trying to do this as an RFC
submission.  Everything else is rather pointless.


* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: David Hildenbrand @ 2020-10-31 10:05 UTC
  To: Christoph Hellwig, Mike Rapoport
  Cc: Sudarshan Rajagopalan, Mark Rutland, Catalin Marinas,
	Anshuman Khandual, linux-kernel, Steven Price,
	Suren Baghdasaryan, Greg Kroah-Hartman, Will Deacon,
	linux-arm-kernel, Pratik Patel

On 31.10.20 10:18, Christoph Hellwig wrote:
> On Fri, Oct 30, 2020 at 10:38:42AM +0200, Mike Rapoport wrote:
>>   
>> What do you mean by "system memory block"? There could be a lot of
>> interpretations if you take into account memory hotplug, "mem=" option,
>> reserved and firmware memory.
>>
>> I'd suggest you describe the entire use case in more detail. Having
>> the complete picture would help find a proper solution.
> 
> I think we need the code for the driver trying to do this as an RFC
> submission.  Everything else is rather pointless.

Sharing RFCs is most probably not what people want when developing 
advanced hypervisor features :)

@Sudarshan, I recommend looking at the slides of the KVM Forum talk from 
yesterday

https://kvmforum2020.sched.com/event/eE40/towards-an-alternative-memory-architecture-joao-martins-oracle?iframe=no

It contains a nice summary of the state of the art, and how "mem=", devdax, 
and dax_hmat can be used to tackle the issue in a hypervisor.

-- 
Thanks,

David / dhildenb



* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: Sudarshan Rajagopalan @ 2020-11-03  2:15 UTC
  To: David Hildenbrand
  Cc: Anshuman Khandual, Mark Rutland, Steven Price, Mike Rapoport,
	linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Suren Baghdasaryan, Greg Kroah-Hartman, Pratik Patel

On 2020-10-29 23:41, David Hildenbrand wrote:
> On 29.10.20 22:29, Sudarshan Rajagopalan wrote:
>> Hello all,
>> 
> 
> Hi!
> 

Hi David.. thanks for the response as always.

>> We have a use case where a module driver adds certain memory blocks 
>> using
>> add_memory_driver_managed(), so that it can perform memory hotplug
>> operations on these blocks. In general, these memory blocks aren’t
>> something that gets physically added later; they are part of the actual
>> RAM that the system booted up with. That is, we set the ‘mem=’ cmdline
>> parameter to limit the memory and later add the remaining blocks using
>> add_memory*() variants.
>> 
>> The basic idea is to have the driver own and manage certain
>> memory blocks for hotplug operations.
> 
> So, in summary, you're still abusing the memory hot(un)plug
> infrastructure from your driver - just not in as severe a way as before.
> And I'll tell you why, so you might understand why exposing this API
> is not really a good idea and why your driver wouldn't - for example -
> be upstream material.
> 
> Don't get me wrong, what you are doing might be ok in your context,
> but it's simply not universally applicable in our current model.
> 
> Ordinary system RAM works differently than many other devices (like PCI
> devices) whereby *something* senses the device and exposes it to the
> system, and some available driver binds to it and owns the memory.
> 
> Memory is detected by a driver and added to the system via e.g.,
> add_memory_driver_managed(). Memory devices are created and the memory
> is directly handed off to the system, to be used as system RAM as soon
> as memory devices are onlined. There is no driver that "binds" memory
> like other devices - it's rather the core (buddy) that uses/owns that
> memory immediately after device creation.
> 

I see.. and I agree that drivers are meant to *sense* that something 
changed or was newly added, so that the driver can check if it's the one 
responsible or compatible for handling this entity and bind to it. So I 
guess what it boils down to is - a driver that uses memory hotplug 
_cannot_ add/remove or have ownership of memblock boot memory, only of 
RAM blocks newly added later on.

I was trying to mimic the detecting and adding of extra RAM by limiting 
the System RAM with "mem=XGB" as though the system booted with XGB of 
boot memory, and later add the remaining blocks (force detection and 
adding) using add_memory_driver_managed(). These remaining blocks are 
calculated as 'physical end addr of boot memory' - 'memblock_end_of_DRAM'. 
The "physical end addr of boot memory", i.e. the actual RAM that the 
bootloader reports to the kernel, can be obtained by scanning the 
'memory' DT node.

>> 
>> For the driver to be able to know how much memory was limited and how 
>> much
>> is actually present, we take the delta of the ‘bootmem physical end address’
>> and ‘memblock_end_of_DRAM’. The 'bootmem physical end address' is
>> obtained by scanning the reg values in the ‘memory’ DT node and 
>> determining
>> the max {addr, size}. Since our driver is being modularized, we won’t
>> have access to memblock_end_of_DRAM() (i.e. the end address of all memory
>> blocks after ‘mem=’ is applied).
> 
> What you do with "mem=" is force memory detection to ignore some of
> its detected memory.
> 
>> 
>> So we are checking whether the memblock_{start/end}_of_DRAM() symbols can be exported.
>> Also, this information can be obtained from userspace by doing ‘cat
>> /proc/iomem’ and grepping for ‘System RAM’. Since userspace 
>> can
> 
> Not correct: with "mem=", cat /proc/iomem only shows *detected* +
> added system RAM, not the unmodified detection.
> 

That's correct - I meant that 'memblock_end_of_DRAM' along with "mem=" 
can be calculated using 'cat /proc/iomem', which shows the "detected plus 
added" System RAM, and not the remaining undetected memory which got 
stripped off due to "mem=XGB". Basically, the 'memblock_end_of_DRAM' 
address with 'mem=XGB' is {end addr of boot RAM - XGB}.. which would be 
the same as the end address of "System RAM" shown in /proc/iomem.

The reasoning for this is - if userspace can have access to such info 
and calculate the memblock end address, why not let drivers have this 
info using memblock_end_of_DRAM()?
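
For what it's worth, a minimal userspace sketch of that calculation (a 
hypothetical helper, not our actual service; note that /proc/iomem shows 
zeroed addresses unless read as root):

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/iomem", "r");
	char line[256];
	unsigned long long start, end, ram_end = 0;

	if (!f)
		return 1;
	/* track the highest end address among "System RAM" ranges */
	while (fgets(line, sizeof(line), f))
		if (strstr(line, "System RAM") &&
		    sscanf(line, "%llx-%llx", &start, &end) == 2 &&
		    end > ram_end)
			ram_end = end;
	fclose(f);
	printf("System RAM ends at 0x%llx\n", ram_end);
	return 0;
}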

>> have access to such info, can we allow kernel module drivers 
>> access
>> as well by exporting memblock_{start/end}_of_DRAM()?
>> 
>> Or are there any other ways for a module driver to get the end
>> address of the system memory block?
> 
> And here is our problem: You disabled *detection* of that memory by
> the responsible driver (here: core). Now your driver wants to know
> what would have been detected. Assume you have a memory hole in that
> region - it would not work by simply looking at start/end. Your
> driver is not the one doing the detection.
> 

Regarding the memory hole - the driver can inspect the 'memory' DT node 
that the kernel gets from ABL (from the RAM partition table) to check 
whether any such holes exist. I agree that if such holes exist, hot 
adding will fail since memory must be added in block-size units.
The same issue would arise if a RAM slot were added and a driver sensed 
it while knowing only the start/end of this RAM slot (though such holes 
generally don't exist in RAM slots).

This is again something specific to our target, where we make sure there 
are no such holes in the topmost memory which is stripped off by "mem=" 
and later added by the driver. I agree this is not universal upstream 
material, but it's a method that drivers can utilize.

> Another issue is: when using such memory for KVM guests, there is no
> mechanism that tracks ownership of that memory - imagine another
> driver wanting to use that memory. This really only works in special
> environments.
> 
> Yet another issue: you cannot assume that memblock data will stay
> around after boot. While we do it right now for arm64, that might
> change at some point. This is also one of the reasons why we don't
> export any real memblock data to drivers.
> 
> 
> When using "mem=" you have to know the exact layout of your system RAM
> and communicate the right places how that layout looks like manually:
> here, to your driver.
> 

I agree the issues mentioned here with this approach are valid from an 
upstream POV, but we aren't trying to make a generic driver for this 
use case and upstream it, but rather have it tailor-made for our use case 
alone, where we know the layout of the System RAM (max boot memory, no 
holes etc.) and we utilize "mem=" and memory hotplug so that the driver 
can add and have ownership of the remaining memory for later hotplug 
operations.

> The clean way of doing things today is to allocate RAM and use it for
> guests - e.g., using hugetlb/gigantic pages. As I said, there are
> other techniques coming up to deal with minimizing struct page
> overhead - if that's what you're concerned with (I still don't know
> why you're removing the memory from the host when giving it to the
> guest).

The overhead of struct page with hugetlb is valid, but we have other 
use cases outside of inter-VM sharing where we rely on memory 
hotplugging. In general, we want a way to be able to add/remove and 
offline/online memory which is part of boot. With all the tools 
available - "mem=", "/proc/iomem", the "memory" DT node and the memory 
hotplug framework - a driver can still achieve this, and these tools 
that are present now do allow it.

Keeping the inter-VM memory sharing aside, would it be okay for 
memblock_end_of_DRAM() to be exported? Like I mentioned before, there can 
be a userspace service that calculates this using 'cat /proc/iomem' and 
delivers it to the driver via a sysfs node. So I don't see any harm in 
exposing this info to the driver. I agree other memblock info shouldn't 
be exposed to drivers, but I see no harm for 
memblock_end_of_DRAM().

I will be glad to share more info about the use case where we use this 
approach if that would help, and I can check and get back on how much we 
can share since this is a proprietary use case for Qualcomm.


Sudarshan

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project


* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: Sudarshan Rajagopalan @ 2020-11-03  2:51 UTC
  To: Mike Rapoport
  Cc: Anshuman Khandual, Mark Rutland, David Hildenbrand, Steven Price,
	linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Suren Baghdasaryan, Greg Kroah-Hartman, Pratik Patel

On 2020-10-30 01:38, Mike Rapoport wrote:
> On Thu, Oct 29, 2020 at 02:29:27PM -0700, Sudarshan Rajagopalan wrote:
>> Hello all,
>> 
>> We have a use case where a module driver adds certain memory blocks 
>> using
>> add_memory_driver_managed(), so that it can perform memory hotplug
>> operations on these blocks. In general, these memory blocks aren’t 
>> something
>> that gets physically added later; they are part of the actual RAM that 
>> the system
>> booted up with. That is, we set the ‘mem=’ cmdline parameter to limit 
>> the
>> memory and later add the remaining blocks using add_memory*() variants.
>> 
>> The basic idea is to have the driver own and manage certain 
>> memory
>> blocks for hotplug operations.
>> 
>> For the driver to be able to know how much memory was limited and how 
>> much
>> is actually present, we take the delta of the ‘bootmem physical end address’ 
>> and
>> ‘memblock_end_of_DRAM’. The 'bootmem physical end address' is obtained 
>> by
>> scanning the reg values in the ‘memory’ DT node and determining the max
>> {addr, size}. Since our driver is being modularized, we won’t have 
>> access
>> to memblock_end_of_DRAM() (i.e. the end address of all memory blocks after 
>> ‘mem=’
>> is applied).
>> 
>> So we are checking whether the memblock_{start/end}_of_DRAM() symbols can be exported. 
>> Also,
>> this information can be obtained from userspace by doing ‘cat 
>> /proc/iomem’ and
>> grepping for ‘System RAM’. Since userspace can have access to 
>> such
>> info, can we allow kernel module drivers access as well by exporting
>> memblock_{start/end}_of_DRAM()?
> 
> These functions cannot be exported not because we want to hide this
> information from the modules but because it is unsafe to use them.
> On most architectures these functions are __init so they are discarded
> after boot anyway. Besides, the memory configuration known to memblock
> might not be accurate in many cases, as David explained in his reply.
> 

I don't see how the information contained in memblock_{start/end}_of_DRAM() 
can be considered hidden if it can be obtained using 'cat 
/proc/iomem'. The memory resource manager adds these blocks under 
"System RAM", "reserved", "Kernel data/code" etc. Inspecting this, one 
could determine what the start and end of the memblocks are.

I agree on the part that it's __init-annotated and could be removed after 
boot. This is something that the driver can be wary of too.

>> Or are there any other ways for a module driver to get the end 
>> address of
>> the system memory block?
> 
> What do you mean by "system memory block"? There could be a lot of
> interpretations if you take into account memory hotplug, "mem=" option,
> reserved and firmware memory.

I meant the physical end address of memblock. The equivalent of 
memblock_end_of_DRAM.

> 
> I'd suggest you describe the entire use case in more detail. Having
> the complete picture would help find a proper solution.

The use case in general is to have a way to add/remove and online/offline 
certain memory blocks which are part of boot memory. We do this by 
limiting the memory using "mem=" and later adding the remaining blocks 
using add_memory_driver_managed().
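
For completeness, the remove side would be along these lines (a sketch 
with a made-up helper name; as of v5.9, offline_and_remove_memory() 
operates on one memory block at a time):

#include <linux/memory.h>
#include <linux/memory_hotplug.h>

static int example_remove_block(int nid, u64 addr)
{
	/* offline the block and remove its memory devices again */
	return offline_and_remove_memory(nid, addr,
					 memory_block_size_bytes());
}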

> 
>> Sudarshan
>> 
> 
> --
> Sincerely yours,
> Mike.


Sudarshan

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project


* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: Christoph Hellwig @ 2020-11-03  8:38 UTC
  To: David Hildenbrand
  Cc: Christoph Hellwig, Mike Rapoport, Sudarshan Rajagopalan,
	Mark Rutland, Catalin Marinas, Anshuman Khandual, linux-kernel,
	Steven Price, Suren Baghdasaryan, Greg Kroah-Hartman,
	Will Deacon, linux-arm-kernel, Pratik Patel

On Sat, Oct 31, 2020 at 11:05:45AM +0100, David Hildenbrand wrote:
> On 31.10.20 10:18, Christoph Hellwig wrote:
> > On Fri, Oct 30, 2020 at 10:38:42AM +0200, Mike Rapoport wrote:
> > > What do you mean by "system memory block"? There could be a lot of
> > > interpretations if you take into account memory hotplug, "mem=" option,
> > > reserved and firmware memory.
> > > 
> > > I'd suggest you describe the entire use case in more detail. Having
> > > the complete picture would help find a proper solution.
> > 
> > I think we need the code for the driver trying to do this as an RFC
> > submission.  Everything else is rather pointless.
> 
> Sharing RFCs is most probably not what people want when developing advanced
> hypervisor features :)

Well, if they can't even do that it really has no relevance for kernel
development.


* Re: mm/memblock: export memblock_{start/end}_of_DRAM
From: Mike Rapoport @ 2020-11-03 16:51 UTC
  To: Sudarshan Rajagopalan
  Cc: Anshuman Khandual, Mark Rutland, David Hildenbrand, Steven Price,
	linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Suren Baghdasaryan, Greg Kroah-Hartman, Pratik Patel

On Mon, Nov 02, 2020 at 06:51:25PM -0800, Sudarshan Rajagopalan wrote:
> On 2020-10-30 01:38, Mike Rapoport wrote:
> > On Thu, Oct 29, 2020 at 02:29:27PM -0700, Sudarshan Rajagopalan wrote:
> > > Hello all,
> > > 
> > > We have a use case where a module driver adds certain memory blocks
> > > using
> > > add_memory_driver_managed(), so that it can perform memory hotplug
> > > operations on these blocks. In general, these memory blocks aren’t
> > > something
> > > that gets physically added later; they are part of the actual RAM that
> > > the system
> > > booted up with. That is, we set the ‘mem=’ cmdline parameter to
> > > limit the
> > > memory and later add the remaining blocks using add_memory*() variants.
> > > 
> > > The basic idea is to have the driver own and manage certain
> > > memory
> > > blocks for hotplug operations.
> > > 
> > > For the driver to be able to know how much memory was limited and how
> > > much
> > > is actually present, we take the delta of the ‘bootmem physical end
> > > address’ and
> > > ‘memblock_end_of_DRAM’. The 'bootmem physical end address' is
> > > obtained by
> > > scanning the reg values in the ‘memory’ DT node and determining the max
> > > {addr, size}. Since our driver is being modularized, we won’t have
> > > access
> > > to memblock_end_of_DRAM() (i.e. the end address of all memory blocks after
> > > ‘mem=’
> > > is applied).
> > > 
> > > So we are checking whether the memblock_{start/end}_of_DRAM() symbols can be
> > > exported. Also,
> > > this information can be obtained from userspace by doing ‘cat
> > > /proc/iomem’ and
> > > grepping for ‘System RAM’. Since userspace can have access
> > > to such
> > > info, can we allow kernel module drivers access as well by exporting
> > > memblock_{start/end}_of_DRAM()?
> > 
> > These functions cannot be exported not because we want to hide this
> > information from the modules but because it is unsafe to use them.
> > On most architectures these functions are __init so they are discarded
> > after boot anyway. Besides, the memory configuration known to memblock
> > might not be accurate in many cases, as David explained in his reply.
> > 
> 
> I don't see how the information contained in memblock_{start/end}_of_DRAM() can be
> considered hidden if it can be obtained using 'cat
> /proc/iomem'. The memory resource manager adds these blocks under
> "System RAM", "reserved", "Kernel data/code" etc. Inspecting this, one could
> determine what the start and end of the memblocks are.

I'm not saying that the memblock data is considered hidden. On most
systems it is simply not present after boot. And even if it is not
discarded, it might not be accurate on any arch except arm64.

> I agree on the part that it's __init-annotated and could be removed after
> boot. This is something that the driver can be wary of too.
> 
> > > Or are there any other ways for a module driver to get the end
> > > address of
> > > the system memory block?
> > 
> > What do you mean by "system memory block"? There could be a lot of
> > interpretations if you take into account memory hotplug, "mem=" option,
> > reserved and firmware memory.
> 
> I meant the physical end address of memblock. The equivalent of
> memblock_end_of_DRAM.

> > I'd suggest you describe the entire use case in more detail. Having
> > the complete picture would help find a proper solution.
> 
> The use case in general is to have a way to add/remove and online/offline
> certain memory blocks which are part of boot memory. We do this by limiting the
> memory using "mem=" and later adding the remaining blocks using
> add_memory_driver_managed().

I think such infrastructure should be a part of core mm rather than
an external out-of-tree driver.

> Sudarshan
> 
-- 
Sincerely yours,
Mike.

