From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 14 Jun 2016 21:46:58 -0400
From: Mike Snitzer
To: Jeff Moyer
Cc: "Kani, Toshimitsu", "axboe@kernel.dk", "linux-nvdimm@lists.01.org",
	"linux-kernel@vger.kernel.org", "linux-raid@vger.kernel.org",
	"dm-devel@redhat.com", "viro@zeniv.linux.org.uk",
	"dan.j.williams@intel.com", "ross.zwisler@linux.intel.com",
	"agk@redhat.com"
Subject: Re: [PATCH 0/6] Support DAX for device-mapper dm-linear devices
Message-ID: <20160615014658.GA5443@redhat.com>
References: <1465856497-19698-1-git-send-email-toshi.kani@hpe.com>
	<1465861755.3504.185.camel@hpe.com>
	<20160614154131.GB25876@redhat.com>

On Tue, Jun 14 2016 at 4:19pm -0400,
Jeff Moyer wrote:

> Mike Snitzer writes:
>
> > On Tue, Jun 14 2016 at 9:50am -0400,
> > Jeff Moyer wrote:
> >
> >> "Kani, Toshimitsu" writes:
> >>
> >> >> I had dm-linear and md-raid0 support on my list of things to look at,
> >> >> did you have raid0 in your plans?
> >> >
> >> > Yes, I hope to extend further, and raid0 is a good candidate.
> >>
> >> dm-flakey would allow more xfstests test cases to run. I'd say that's
> >> more important than linear or raid0. ;-)
> >
> > Regardless of which target(s) grow DAX support, the most pressing initial
> > concern is getting the DM device stacking correct, and verifying that
> > IO that crosses pmem device boundaries is being properly split by DM
> > core (via drivers/md/dm.c:__split_and_process_non_flush()'s call to
> > max_io_len).
>
> That was a tongue-in-cheek comment. You're reading way too much into
> it.
>
> >> Also, the next step in this work is to then decide how to determine on
> >> what numa node an LBA resides. We had discussed this at a prior
> >> plumbers conference, and I think the consensus was to use xattrs.
> >> Toshi, do you also plan to do that work?
> >
> > How does the associated NUMA node relate to this? Does the
> > DM request_queue need to be set up to only allocate from the NUMA node
> > the pmem device is attached to? I recently added support for this to
> > DM, but there will likely be some code needed to propagate the NUMA
> > node id accordingly.
>
> I assume you mean allocate memory (the volatile kind). That should work
> the same between pmem and regular block devices, no?

This is the commit I made to make DM NUMA-node aware:
115485e83f497fdf9b4 ("dm: add 'dm_numa_node' module parameter")

As is, the DM code is focused on memory allocations. But I think blk-mq
may also use the NUMA node via tag_set->numa_node. That is moot given
pmem is bio-based, right?

Steps could be taken to make all the threads DM creates for a given
device get pinned to the specified NUMA node too.
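
To sketch what I mean (purely hypothetical, nothing like this exists in
DM today, and the helper name is made up):

/*
 * Hypothetical sketch only; no such helper exists in DM.  The idea is
 * to allocate the thread on the configured NUMA node and then restrict
 * it to CPUs local to that node.
 */
#include <linux/kthread.h>
#include <linux/topology.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *dm_example_spawn_pinned_thread(int (*fn)(void *),
							   void *data,
							   int numa_node,
							   const char *name)
{
	struct task_struct *task;

	/* kthread_create_on_node() places the task's memory on that node */
	task = kthread_create_on_node(fn, data, numa_node, "%s", name);
	if (IS_ERR(task))
		return task;

	/* bind the thread to the CPUs of that node before it starts running */
	set_cpus_allowed_ptr(task, cpumask_of_node(numa_node));
	wake_up_process(task);

	return task;
}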
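
And for completeness, this is roughly where tag_set->numa_node would
come into play for a request-based device; again just an illustration
with made-up names, and moot for bio-based pmem:

/*
 * Illustration only (made-up function, not DM code): the tag_set's
 * numa_node is a hint for where blk-mq allocates tag and request
 * memory for the device's hardware queues.
 */
#include <linux/blk-mq.h>
#include <linux/string.h>

static int example_init_tag_set(struct blk_mq_tag_set *set,
				struct blk_mq_ops *ops, int numa_node)
{
	memset(set, 0, sizeof(*set));
	set->ops = ops;
	set->nr_hw_queues = 1;
	set->queue_depth = 128;
	set->flags = BLK_MQ_F_SHOULD_MERGE;
	/* hint: allocate per-queue tag/request memory on this node */
	set->numa_node = numa_node;

	return blk_mq_alloc_tag_set(set);
}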
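
For reference, the boundary splitting mentioned earlier boils down to
something like this simplified sketch (not the literal drivers/md/dm.c
code; the real max_io_len() works on a struct dm_target):

/*
 * Simplified sketch of the splitting idea: an I/O must not be mapped
 * past the end of the target it lands in, so the length passed down is
 * clamped to the sectors remaining before the target boundary (and to
 * the target's own max_io_len, when one is set).
 */
#include <linux/types.h>

static sector_t example_max_io_len(sector_t sector, sector_t target_begin,
				   sector_t target_len,
				   sector_t target_max_io_len)
{
	sector_t offset = sector - target_begin;  /* offset into this target */
	sector_t len = target_len - offset;       /* sectors left before the boundary */

	if (target_max_io_len && target_max_io_len < len)
		len = target_max_io_len;          /* honor the target's own limit */

	return len;
}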