From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 14 Jun 2016 21:46:58 -0400
From: Mike Snitzer
To: Jeff Moyer
Cc: "Kani, Toshimitsu", "axboe@kernel.dk", "linux-nvdimm@lists.01.org",
	"linux-kernel@vger.kernel.org", "linux-raid@vger.kernel.org",
	"dm-devel@redhat.com", "viro@zeniv.linux.org.uk",
	"dan.j.williams@intel.com", "ross.zwisler@linux.intel.com",
	"agk@redhat.com"
Subject: Re: [PATCH 0/6] Support DAX for device-mapper dm-linear devices
Message-ID: <20160615014658.GA5443@redhat.com>
References: <1465856497-19698-1-git-send-email-toshi.kani@hpe.com>
	<1465861755.3504.185.camel@hpe.com>
	<20160614154131.GB25876@redhat.com>

On Tue, Jun 14 2016 at 4:19pm -0400,
Jeff Moyer wrote:

> Mike Snitzer writes:
>
> > On Tue, Jun 14 2016 at 9:50am -0400,
> > Jeff Moyer wrote:
> >
> >> "Kani, Toshimitsu" writes:
> >>
> >> >> I had dm-linear and md-raid0 support on my list of things to look at,
> >> >> did you have raid0 in your plans?
> >> >
> >> > Yes, I hope to extend further, and raid0 is a good candidate.
> >>
> >> dm-flakey would allow more xfstests test cases to run. I'd say that's
> >> more important than linear or raid0. ;-)
> >
> > Regardless of which target(s) grow DAX support, the most pressing initial
> > concern is getting the DM device stacking correct, and verifying that
> > IO that crosses pmem device boundaries is being properly split by DM
> > core (via drivers/md/dm.c:__split_and_process_non_flush()'s call to
> > max_io_len).
>
> That was a tongue-in-cheek comment. You're reading way too much into
> it.
>
> >> Also, the next step in this work is to then decide how to determine on
> >> what numa node an LBA resides. We had discussed this at a prior
> >> plumbers conference, and I think the consensus was to use xattrs.
> >> Toshi, do you also plan to do that work?
> >
> > How does the associated NUMA node relate to this? Does the
> > DM request_queue need to be set up to only allocate from the NUMA node
> > the pmem device is attached to? I recently added support for this to
> > DM, but there will likely be some code needed to propagate the NUMA
> > node id accordingly.
>
> I assume you mean allocate memory (the volatile kind). That should work
> the same between pmem and regular block devices, no?

This is the commit I made to make DM NUMA-node aware:
115485e83f497fdf9b4 ("dm: add 'dm_numa_node' module parameter")

As is, the DM code is focused on memory allocations. But I think blk-mq
may also use the NUMA node via tag_set->numa_node. That is moot given
pmem is bio-based, right?

Steps could be taken to make all the threads DM creates for a given
device get pinned to the specified NUMA node too.
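
To sketch what I mean (purely hypothetical, nothing like this exists in
DM today, and the helper name is made up):

/*
 * Hypothetical sketch only; no such helper exists in DM.  The idea is
 * to allocate the thread on the configured NUMA node and then restrict
 * it to CPUs local to that node.
 */
#include <linux/kthread.h>
#include <linux/topology.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *dm_example_spawn_pinned_thread(int (*fn)(void *),
							   void *data,
							   int numa_node,
							   const char *name)
{
	struct task_struct *task;

	/* kthread_create_on_node() places the task's memory on that node */
	task = kthread_create_on_node(fn, data, numa_node, "%s", name);
	if (IS_ERR(task))
		return task;

	/* bind the thread to the CPUs of that node before it starts running */
	set_cpus_allowed_ptr(task, cpumask_of_node(numa_node));
	wake_up_process(task);

	return task;
}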
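
And for completeness, this is roughly where tag_set->numa_node would
come into play for a request-based device; again just an illustration
with made-up names, and moot for bio-based pmem:

/*
 * Illustration only (made-up function, not DM code): the tag_set's
 * numa_node is a hint for where blk-mq allocates tag and request
 * memory for the device's hardware queues.
 */
#include <linux/blk-mq.h>
#include <linux/string.h>

static int example_init_tag_set(struct blk_mq_tag_set *set,
				struct blk_mq_ops *ops, int numa_node)
{
	memset(set, 0, sizeof(*set));
	set->ops = ops;
	set->nr_hw_queues = 1;
	set->queue_depth = 128;
	set->flags = BLK_MQ_F_SHOULD_MERGE;
	/* hint: allocate per-queue tag/request memory on this node */
	set->numa_node = numa_node;

	return blk_mq_alloc_tag_set(set);
}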
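
For reference, the boundary splitting mentioned earlier boils down to
something like this simplified sketch (not the literal drivers/md/dm.c
code; the real max_io_len() works on a struct dm_target):

/*
 * Simplified sketch of the splitting idea: an I/O must not be mapped
 * past the end of the target it lands in, so the length passed down is
 * clamped to the sectors remaining before the target boundary (and to
 * the target's own max_io_len, when one is set).
 */
#include <linux/types.h>

static sector_t example_max_io_len(sector_t sector, sector_t target_begin,
				   sector_t target_len,
				   sector_t target_max_io_len)
{
	sector_t offset = sector - target_begin;  /* offset into this target */
	sector_t len = target_len - offset;       /* sectors left before the boundary */

	if (target_max_io_len && target_max_io_len < len)
		len = target_max_io_len;          /* honor the target's own limit */

	return len;
}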