From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756050AbcA3ASZ (ORCPT ); Fri, 29 Jan 2016 19:18:25 -0500
Received: from mail-yk0-f179.google.com ([209.85.160.179]:34089 "EHLO
	mail-yk0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753591AbcA3ASW (ORCPT );
	Fri, 29 Jan 2016 19:18:22 -0500
MIME-Version: 1.0
In-Reply-To: <20160129233430.GA20549@linux.intel.com>
References: <1454009704-25959-1-git-send-email-ross.zwisler@linux.intel.com>
	<1454009704-25959-2-git-send-email-ross.zwisler@linux.intel.com>
	<20160128213858.GA29114@infradead.org>
	<20160129182815.GB5224@linux.intel.com>
	<20160129233430.GA20549@linux.intel.com>
Date: Fri, 29 Jan 2016 16:18:22 -0800
Message-ID:
Subject: Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences
From: Dan Williams
To: Ross Zwisler, Christoph Hellwig,
	"linux-kernel@vger.kernel.org", Alexander Viro, Andrew Morton,
	Dan Williams, Dave Chinner, Jan Kara, Matthew Wilcox,
	linux-fsdevel, linux-nvdimm
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 29, 2016 at 3:34 PM, Ross Zwisler wrote:
> On Fri, Jan 29, 2016 at 11:28:15AM -0700, Ross Zwisler wrote:
>> On Thu, Jan 28, 2016 at 01:38:58PM -0800, Christoph Hellwig wrote:
>> > On Thu, Jan 28, 2016 at 12:35:04PM -0700, Ross Zwisler wrote:
>> > > There are a number of places in dax.c that look up the struct block_device
>> > > associated with an inode. Previously this was done by just using
>> > > inode->i_sb->s_bdev. This is correct for inodes that exist within the
>> > > filesystems supported by DAX (ext2, ext4 & XFS), but when running DAX
>> > > against raw block devices this value is NULL. This causes NULL pointer
>> > > dereferences when these block_device pointers are used.
>> >
>> > It's also wrong for an XFS file system with a RT device..
>> >
>> > > +#define DAX_BDEV(inode) (S_ISBLK(inode->i_mode) ? I_BDEV(inode) \
>> > > + : inode->i_sb->s_bdev)
>> >
>> > .. but this isn't going to fix it. You must use a bdev returned by
>> > get_blocks or a similar file system method.
>>
>> I guess I need to go off and understand if we can have DAX mappings on such a
>> device. If we can, we may have a problem - we can get the block_device from
>> get_block() in I/O path and the various fault paths, but we don't have access
>> to get_block() when flushing via dax_writeback_mapping_range(). We avoid
>> needing it the normal case by storing the sector results from get_block() in
>> the radix tree.
>>
>> /me is off to play with RT devices...
>
> Well, RT devices are completely broken as far as I can see. I've reported the
> breakage to the XFS list. Anything I do that triggers a RT block allocation
> in XFS causes a lockdep splat + a kernel BUG - I've tried regular pwrite(),
> xfs_rtcp and mmap() + write to address. Not a new bug either - happens just
> the same with v4.4. Happens with both PMEM and BRD, and has no relationship
> to whether I'm using DAX or not.
>
> Does it work for this patch to go in as-is since it fixes an immediate OOPS
> with raw block devices + DAX, and when RT devices are alive again I'll figure
> out how to make them work too?

Can we step back and be clear about which lookups should be coming
from get_blocks(), i.e. which ones are critical vs. ones we just
opportunistically look up for a debug print? Right now xfs and ext4
are basically disagreeing on whether get_blocks() reliably sets
->bh_bdev, and checking for a raw block-device inode in
dax_clear_blocks() does not make sense. So this all seems a bit
confused.
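
To make the three lookups under discussion concrete, here is a minimal userspace sketch. It is not kernel code: every struct is a mock, the rt_bdev and on_rt fields and the mock_get_block() helper are hypothetical names invented for illustration, and the real get_block callback takes more arguments than shown. It only models where each lookup gets its block_device from.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mocks for illustration only; these are NOT the kernel structs. */
struct block_device { const char *name; };

struct super_block {
	struct block_device *s_bdev;   /* main data device; NULL in the mock bdev pseudo-fs */
	struct block_device *rt_bdev;  /* hypothetical field: realtime device, if any */
};

struct inode {
	int is_blk;                    /* stands in for S_ISBLK(inode->i_mode) */
	int on_rt;                     /* hypothetical flag: file data lives on the RT device */
	struct super_block *i_sb;
	struct block_device *i_bdev;   /* stands in for I_BDEV(inode) */
};

struct buffer_head { struct block_device *b_bdev; };

/* Lookup 1: what dax.c did before the patch, per the quoted description. */
struct block_device *bdev_from_sb(const struct inode *inode)
{
	return inode->i_sb->s_bdev;
}

/* Lookup 2: the same choice the proposed DAX_BDEV() macro makes. */
struct block_device *bdev_from_macro(const struct inode *inode)
{
	return inode->is_blk ? inode->i_bdev : inode->i_sb->s_bdev;
}

/*
 * Mock of a filesystem get_block-style callback (assumed, simplified
 * signature): only the filesystem knows which device backs a file.
 */
int mock_get_block(const struct inode *inode, struct buffer_head *bh)
{
	if (inode->is_blk)
		bh->b_bdev = inode->i_bdev;
	else if (inode->on_rt)
		bh->b_bdev = inode->i_sb->rt_bdev;
	else
		bh->b_bdev = inode->i_sb->s_bdev;
	return 0;
}

/* Lookup 3: ask the filesystem, as suggested in the review. */
struct block_device *bdev_from_get_block(const struct inode *inode)
{
	struct buffer_head bh = { NULL };

	return mock_get_block(inode, &bh) == 0 ? bh.b_bdev : NULL;
}
```

In this model, lookup 1 yields NULL for a raw block-device inode, lookup 2 fixes that case but still returns the main data device for a file that lives on the RT device, and only lookup 3 returns the right device in all three cases, provided the callback reliably fills in the buffer head's device.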