From: Jens Axboe <axboe@kernel.dk>
To: Mel Gorman <mgorman@suse.com>
Cc: Adam Manzanares <Adam.Manzanares@wdc.com>,
	Hannes Reinecke <hare@suse.de>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Hannes Reinecke <hare@suse.com>
Subject: Re: [PATCH] brd: Allow ramdisk to be allocated on selected NUMA node
Date: Fri, 15 Jun 2018 08:28:46 -0600
Message-ID: <094af896-1185-7658-a66d-ff7c67cd0b0d@kernel.dk>
In-Reply-To: <20180615092305.e3k3fvqhkspbrba3@novell.com>

On 6/15/18 3:23 AM, Mel Gorman wrote:
> On Thu, Jun 14, 2018 at 02:47:39PM -0600, Jens Axboe wrote:
>>>>> Will numactl ... modprobe brd ... solve this problem?
>>>>
>>>> It won't, pages are allocated as needed.
>>>>
>>>
>>> Then how about a numactl ... dd /dev/ram ... after the modprobe.
>>
>> Yes of course, or you could do that for every application that ends
>> up in the path of doing IO to it. The point of the option is to
>> just make it explicit, and not have to either NUMA pin each task,
>> or prefill all possible pages.
>>
> 
> It's certainly possible from userspace using dd and numactl setting the
> desired memory policy. mmtests has the following snippet when setting
> up a benchmark using brd to deal with both NUMA artifacts and variable
> performance due to first faults early in the lifetime of a benchmark.
> 
>                 modprobe brd rd_size=$((TESTDISK_RD_SIZE/1024))
>                 if [ "$TESTDISK_RD_PREALLOC" == "yes" ]; then
>                         if [ "$TESTDISK_RD_PREALLOC_NODE" != "" ]; then
>                                 tmp_prealloc_cmd="numactl -N $TESTDISK_RD_PREALLOC_NODE"
>                         else
>                                 tmp_prealloc_cmd="numactl -i all"
>                         fi
>                         $tmp_prealloc_cmd dd if=/dev/zero of=/dev/ram0 bs=1M &>/dev/null
>                 fi
> 
> (Haven't actually validated this in a long time but it worked at some point)

You'd want to make this oflag=direct as well (this goes for Adam, too), or
you could have pages being written that are NOT issued by dd.
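
As a rough sketch of what I mean (not validated against mmtests): with
buffered writes the data can be flushed later by writeback rather than by
dd itself, so the memory policy set through numactl need not apply to the
pages that brd ends up allocating. The helper name build_prealloc_cmd and
its arguments are made up for illustration; the explicit --membind is an
assumption on my part, added to make the memory placement explicit rather
than relying on local allocation after CPU binding.

```shell
#!/bin/bash
# Sketch only: build_prealloc_cmd is a hypothetical helper, not part of mmtests.
# It prints the prefill command instead of running it, since the real thing
# needs root and a loaded brd module.
build_prealloc_cmd() {
    node="$1"                 # NUMA node to allocate from; empty means interleave
    dev="${2:-/dev/ram0}"     # ramdisk device to prefill

    if [ -n "$node" ]; then
        # Assumption: pin both CPU and memory to the requested node.
        policy="numactl --cpunodebind=$node --membind=$node"
    else
        policy="numactl --interleave=all"
    fi

    # oflag=direct bypasses the page cache, so the writes (and therefore the
    # brd page allocations) are issued in dd's own context.
    echo "$policy dd if=/dev/zero of=$dev bs=1M oflag=direct"
}

# Show the command for node 0; run the printed line as root after modprobe brd.
build_prealloc_cmd 0
```

Printing the command rather than executing it keeps the sketch safe to try
on any machine; swap the echo for the command itself once it looks right.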

> First option allocates just from one node, the other interleaves between
> everything. Any combination of nodes or policies can be used and this was
> very simple, but it's what was needed at the time. The question is how
> far do you want to go with supporting policies within the module?

Not far, imho :-)

> One option would be to keep this very simple like the patch suggests so users
> get the hint that it's even worth considering and then point at a document
> on how to do more complex policies from userspace at device creation time.
> Another is simply to document the hazard that the locality of memory is
> controlled by the memory policy of the first task that touches it.

I like the simple option, especially since (as Christoph pointed out) if
we fail to allocate from the given node, we'll just go elsewhere.

-- 
Jens Axboe
