From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754626AbcBBKeM (ORCPT ); Tue, 2 Feb 2016 05:34:12 -0500
Received: from mx2.suse.de ([195.135.220.15]:41428 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754387AbcBBKeK (ORCPT ); Tue, 2 Feb 2016 05:34:10 -0500
Date: Tue, 2 Feb 2016 11:34:21 +0100
From: Jan Kara
To: Ross Zwisler
Cc: Christoph Hellwig, Jan Kara, Dave Chinner, Dan Williams,
        linux-kernel@vger.kernel.org, Alexander Viro, Andrew Morton,
        Jan Kara, Matthew Wilcox, linux-fsdevel@vger.kernel.org,
        linux-nvdimm@ml01.01.org
Subject: Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences
Message-ID: <20160202103421.GA12574@quack.suse.cz>
References: <1454009704-25959-1-git-send-email-ross.zwisler@linux.intel.com>
        <1454009704-25959-2-git-send-email-ross.zwisler@linux.intel.com>
        <20160128213858.GA29114@infradead.org>
        <20160202000212.GA12005@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160202000212.GA12005@linux.intel.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 01-02-16 17:02:12, Ross Zwisler wrote:
> On Thu, Jan 28, 2016 at 01:38:58PM -0800, Christoph Hellwig wrote:
> > On Thu, Jan 28, 2016 at 12:35:04PM -0700, Ross Zwisler wrote:
> > > There are a number of places in dax.c that look up the struct block_device
> > > associated with an inode. Previously this was done by just using
> > > inode->i_sb->s_bdev. This is correct for inodes that exist within the
> > > filesystems supported by DAX (ext2, ext4 & XFS), but when running DAX
> > > against raw block devices this value is NULL. This causes NULL pointer
> > > dereferences when these block_device pointers are used.
> >
> > It's also wrong for an XFS file system with a RT device..
> >
> > > +#define DAX_BDEV(inode) (S_ISBLK(inode->i_mode) ? I_BDEV(inode) \
> > > +		: inode->i_sb->s_bdev)
> >
> > .. but this isn't going to fix it. You must use a bdev returned by
> > get_blocks or a similar file system method.
>
> Jan & Dave,
>
> Before I start in on a solution to this issue I just wanted to confirm that
> DAX can rely on the fact that the filesystem's get_block() call will reliably
> set bh->b_bdev for non-error returns. From this conversation between Jan &
> Dave:
>
> https://lkml.org/lkml/2016/1/7/723
>
> "
> > No. The real problem is a long-standing abuse of struct buffer_head to be
> > used for passing block mapping information (it's on my todo list to remove
> > that at least from DAX code and use cleaner block mapping interface but
> > first I want basic DAX functionality to settle down to avoid unnecessary
> > conflicts). Filesystem is not supposed to touch bh->b_bdev.
>
> That has not been true for a long, long time. e.g. XFS always
> rewrites bh->b_bdev in get_blocks because the file may not reside on
> the primary block device of the filesystem. i.e.:
>
>         /*
>          * If this is a realtime file, data may be on a different device.
>          * to that pointed to from the buffer_head b_bdev currently.
>          */
>         bh_result->b_bdev = xfs_find_bdev_for_inode(inode);
>
> > If you need
> > that filled in, set it yourself in before passing bh to the block mapping
> > function.
>
> That may be true, but we cannot assume that the bdev coming back
> out of get_block is the same one that was passed in.
> "
>
> It sounds like this is always true for XFS, and from looking at the ext4 code
> I think this is true there as well because bh->b_bdev is set in
> ext4_dax_mmap_get_block() via map_bh().
>
> Relying on the bh->b_bdev returned by get_block() is correct, yea?

Yeah, sorry, I was confused. If the result is a mapped block (i.e. return
value of get_block callback is > 0), ext4 also sets bh->b_bdev via map_bh()
as you correctly point out. If the result is a hole or error, ext4 doesn't
set bh->b_bdev at all. So you can rely on bh->b_bdev.

								Honza
-- 
Jan Kara
SUSE Labs, CR
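
[Illustration, not part of the original message.] The rule agreed on in this
thread (take the block device from the buffer_head that get_block filled in,
and only when the result is a mapped block) can be sketched in user space
roughly as follows. Everything here is a hypothetical stand-in rather than
kernel code: the struct definitions are simplified mocks, the `mapped` flag
plays the role of "the result is a mapped block", and mock_get_block(),
mock_data_dev and mapped_bdev() are invented names for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel definitions. */
struct block_device {
    const char *name;
};

struct buffer_head {
    struct block_device *b_bdev; /* set by the filesystem for mapped blocks */
    bool mapped;                 /* assumption: models "result is mapped" */
};

/* Assumed device that holds the file data (e.g. not the primary device). */
struct block_device mock_data_dev = { "data-dev" };

/*
 * Hypothetical get_block-style callback: even block numbers are mapped and
 * get b_bdev filled in; odd block numbers are holes and b_bdev is untouched.
 */
int mock_get_block(unsigned long iblock, struct buffer_head *bh)
{
    assert(bh != NULL);
    if (iblock % 2 == 0) {
        bh->b_bdev = &mock_data_dev;
        bh->mapped = true;
    }
    return 0;
}

/*
 * Return the device to use for a block: the one reported through the
 * buffer_head, and only if the block is mapped. Holes and errors give NULL.
 */
struct block_device *mapped_bdev(unsigned long iblock)
{
    struct buffer_head bh = { NULL, false };

    if (mock_get_block(iblock, &bh) != 0 || !bh.mapped)
        return NULL;
    return bh.b_bdev;
}
```

The point of the sketch is only that the caller never falls back to a
device derived from the inode or superblock: for a hole or an error it has
no device at all and must handle that case separately.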