From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752010AbcFVUQW (ORCPT ); Wed, 22 Jun 2016 16:16:22 -0400
Received: from mail-bl2on0122.outbound.protection.outlook.com ([65.55.169.122]:59954
        "EHLO na01-bl2-obe.outbound.protection.outlook.com"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1751400AbcFVUQT (ORCPT ); Wed, 22 Jun 2016 16:16:19 -0400
From: "Kani, Toshimitsu"
To: "dan.j.williams@intel.com"
CC: "linux-kernel@vger.kernel.org", "sandeen@redhat.com",
        "linux-nvdimm@ml01.01.org", "agk@redhat.com",
        "linux-raid@vger.kernel.org", "snitzer@redhat.com",
        "viro@zeniv.linux.org.uk", "axboe@kernel.dk", "axboe@fb.com",
        "ross.zwisler@linux.intel.com", "dm-devel@redhat.com"
Subject: Re: [PATCH 0/6] Support DAX for device-mapper dm-linear devices
Date: Wed, 22 Jun 2016 20:16:13 +0000
Message-ID: <1466625958.3504.340.camel@hpe.com>
References: <20160613225756.GA18417@redhat.com>
        <20160620180043.GA21261@redhat.com>
        <1466446861.3504.243.camel@hpe.com>
        <20160620194026.GA21657@redhat.com>
        <20160620195217.GB21657@redhat.com>
        <1466452883.3504.244.camel@hpe.com>
        <1466457467.3504.249.camel@hpe.com>
        <20160620222236.GA22461@redhat.com>
        <20160621134147.GA26392@redhat.com>
        <1466523280.3504.262.camel@hpe.com>
        <20160621181728.GA27821@redhat.com>
        <1466616868.3504.320.camel@hpe.com>
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2016-06-22 at 12:15 -0700, Dan Williams wrote:
> On Wed, Jun 22, 2016 at 10:44 AM, Kani, Toshimitsu wrote:
> > On Tue, 2016-06-21 at 14:17 -0400, Mike Snitzer wrote:
> > > On Tue, Jun 21 2016 at 11:44am -0400,
> > > Kani, Toshimitsu wrote:
> > > > On Tue, 2016-06-21 at 09:41 -0400, Mike Snitzer wrote:
> > > > > On Mon, Jun 20 2016 at 6:22pm -0400,
> > > > > Mike Snitzer wrote:
> > > > > I'm now wondering if we'd be better off setting a new QUEUE_FLAG_DAX
> > > > > rather than establish GENHD_FL_DAX on the genhd?
> > > > >
> > > > > It'd be quite a bit easier to allow upper layers (e.g. XFS and ext4)
> > > > > to check for a queue flag.
> > > >
> > > > I think GENHD_FL_DAX is more appropriate since DAX does not use a
> > > > request queue, except for protecting the underlying device from being
> > > > disabled while direct_access() is called (b2e0d1625e19).
> > >
> > > The devices in question have a request_queue.  All bio-based devices
> > > have a request_queue.
> >
> > DAX-capable devices have two operation modes, bio-based and DAX.  I agree
> > that bio-based operation is associated with a request queue, and its
> > capabilities should be set to it.  DAX, on the other hand, is rather
> > independent from a request queue.
> >
> > > I don't have a big problem with GENHD_FL_DAX.  Just wanted to point out
> > > that such block device capabilities are generally advertised in terms of
> > > a QUEUE_FLAG.
> >
> > I do not have a strong opinion, but it feels a bit odd to associate DAX
> > with a request queue.
>
> Given that we do not support dax to a raw block device [1] it seems a
> gendisk flag is more misleading than request_queue flag that specifies
> what requests can be made of the device.
>
> [1]: acc93d30d7d4 Revert "block: enable dax for raw block devices"

Oh, I see.  I will change to use a request_queue flag.

> > > > About protecting direct_access, this patch assumes that the
> > > > underlying device cannot be disabled until dtr() is called.  Is this
> > > > correct?  If not, I will need to call dax_map_atomic().
> > >
> > > One of the big design considerations for DM is that a DM device can be
> > > suspended (with or without flush) and any new IO will be blocked until
> > > the DM device is resumed.
> > >
> > > So ideally DM should be able to have the same capability even if using
> > > DAX.
> >
> > Supporting suspend for DAX is challenging since it allows user
> > applications to access a device directly.
> > Once a device range is mmap'd, there is no kernel intervention to access
> > the range, unless we invalidate user mappings.  This isn't done today
> > even after a driver is unbound from a device.
> >
> > > But that is different than what commit b2e0d1625e19 is addressing.  For
> > > DM, I wouldn't think you'd need the extra protections that
> > > dax_map_atomic() is providing given that the underlying block device
> > > lifetime is managed via DM core's dm_get_device/dm_put_device (see also:
> > > dm.c:open_table_device/close_table_device).
> >
> > I thought so as well.  But I realized that there is (almost) nothing that
> > can prevent the unbind operation.  It cannot fail, either.  This unbind
> > proceeds even when a device is in use.  In case of a pmem device, it is
> > only protected by pmem_release_queue(), which is called when a pmem device
> > is being deleted and calls blk_cleanup_queue() to serialize a critical
> > section between blk_queue_enter() and blk_queue_exit() per b2e0d1625e19.
> > This prevents a kernel DTLB fault, but does not prevent a device from
> > disappearing while in use.
> >
> > Protecting DM's underlying device with blk_queue_enter() (or something
> > similar) requires more thought...  blk_queue_enter() to a DM device
> > cannot be redirected to its underlying device.  So, this is TBD for
> > now.  But I do not think this is a blocker issue since doing unbind to an
> > underlying device is quite harmful no matter what we do - even if it is
> > protected with blk_queue_enter().
>
> I still have the "block device removed" notification patches on my
> todo list.  It's not a blocker, but there are scenarios where we can
> keep accessing memory via dax of a disabled device leading to memory
> corruption.

Right, I noticed that user applications can access mmap'd ranges on a
disabled device.

> I'll bump that up in my queue now that we are looking at
> additional scenarios where letting DAX mappings leak past the
> reconfiguration of a block device could lead to trouble.

Great.  With DM, removing an underlying device while in use can lead to
trouble, esp. with RAID0.  Users need to remove a device from DM first...

Thanks,
-Toshi