From: Dave Hansen <dave@sr71.net>
To: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Nathan Fontenot <nfont@linux.vnet.ibm.com>,
	Cody P Schafer <cody@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC][PATCH] drivers: base: dynamic memory block creation
Date: Wed, 14 Aug 2013 14:36:34 -0700	[thread overview]
Message-ID: <520BF862.6070008@sr71.net> (raw)
In-Reply-To: <20130814211454.GA17423@variantweb.net>

On 08/14/2013 02:14 PM, Seth Jennings wrote:
> On Wed, Aug 14, 2013 at 01:47:27PM -0700, Dave Hansen wrote:
>> On 08/14/2013 12:31 PM, Seth Jennings wrote:
>>> +static unsigned long *memblock_present;
>>> +static bool largememory_enable __read_mostly;
>>
>> How would you see this getting used in practice?  Are you just going to
>> set this by default on ppc?  Or, would you ask the distros to put it on
>> the command-line by default?  Would it only affect machines larger than
>> a certain size?
> 
> It would not be on by default, but for people running into the problem
> on their large memory machines, we could enable this after verifying
> that any tools that operate on the memory block configs are "dynamic
> memory block aware".

I don't have any idea how you would do this in practice.  You can
obviously fix the dlpar tools that you're shipping for a given distro.
But, what about the other applications?  I could imagine things like
databases wanting to know when memory comes and goes.

>> An existing tool would not work
>> with this patch (plus boot option) since it would not know how to
>> show/hide things.  It lets _part_ of those existing tools get reused
>> since they only have to be taught how to show/hide things.
>>
>> I'd find this really intriguing if you found a way to keep even the old
>> tools working.  Instead of having an explicit show/hide, why couldn't
>> you just create the entries on open(), for instance?
> 
> Nathan and I talked about this and I'm not sure if sysfs would support
> such a thing, i.e. memory block creation when someone tried to cd into
> the memory block device config.  I wouldn't know where to start on that.

It's not that fundamentally hard.  Think of how an on-disk filesystem
works today.  You do an open('foo') and the fs goes off and tries to
figure out whether there's something named 'foo' on the disk.  If there
is, it creates inodes and dentries to back it.  In your case, instead of
going to the disk, you go look at the memory configuration.

This might require a new filesystem instead of sysfs itself, but it
would potentially be a way to have good backward compatibility.

>>> +static ssize_t memory_present_show(struct device *dev,
>>> +				  struct device_attribute *attr, char *buf)
>>> +{
>>> +	int n_bits, ret;
>>> +
>>> +	n_bits = NR_MEM_SECTIONS / sections_per_block;
>>> +	ret = bitmap_scnlistprintf(buf, PAGE_SIZE - 2,
>>> +				memblock_present, n_bits);
>>> +	buf[ret++] = '\n';
>>> +	buf[ret] = '\0';
>>> +
>>> +	return ret;
>>> +}
>>
>> Doesn't this break the one-value-per-file rule?
> 
> I didn't know there was such a rule, but it might. Is there any
> acceptable way to express a range of values?  I would just do a
> "last_memblock_id" but the range can have holes.

The rules are written down very nicely:

	Documentation/filesystems/sysfs.txt

I'm wrong, btw....  It's acceptable to do 'arrays' of values too, not
just single ones.


