Date: Sat, 11 Apr 2015 17:12:43 -0400
From: Brian Foster
Subject: Re: [PATCH 2/5] xfs: direct IO needs to use append ioends
Message-ID: <20150411211241.GA4039@bfoster.bfoster>
References: <1428673080-23052-1-git-send-email-david@fromorbit.com>
 <1428673080-23052-3-git-send-email-david@fromorbit.com>
 <20150410202147.GB2846@laptop.bfoster>
 <20150410223040.GN13731@dastard>
In-Reply-To: <20150410223040.GN13731@dastard>
To: Dave Chinner
Cc: xfs@oss.sgi.com
List-Id: XFS Filesystem from SGI

On Sat, Apr 11, 2015 at 08:30:40AM +1000, Dave Chinner wrote:
> On Fri, Apr 10, 2015 at 04:21:47PM -0400, Brian Foster wrote:
> > On Fri, Apr 10, 2015 at 11:37:57PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner
> > >
> > > Now that we have an ioend being passed unconditionally to the
> > > direct IO write completion context, we can pass a preallocated
> > > transaction handle for on-disk inode size updates that are run in
> > > completion.
> > >
> > > At this point we really need to be passing the correct block range
> > > that the IO spans through the ioend, so calculate the last block in
> > > the mapping before we map the allocated range and use that instead
> > > of the size desired by the direct IO.
> > >
> > > This enables us to keep track of multiple get-blocks calls in the
> > > same direct IO - the ioend will keep coming back to us, and we can
> > > keep extending its range as new allocations and mappings are done.
> > >
> > > There are some new trace points added for debugging, and a small
> > > hack to actually make the tracepoints work (enums in tracepoints
> > > that use __print_symbolic don't work correctly) that should be
> > > fixed in the 4.1 merge window. This hack can be removed when the
> > > tracepoint fix is upstream.
> > >
> > > There are lots of comments explaining the intricacies of passing
> > > the ioend and append transaction in the code; they are better
> > > placed in the code because we're going to need them to understand
> > > why this code does what it does in a few years' time....
> > >
> > > Signed-off-by: Dave Chinner
> > > ---
> >
> > I still need to look at this one (and grok the dio code more)... but
> > an initial question: is this multiple get_blocks() call aggregation a
> > requirement for the append ioend mechanism or an optimization? If the
> > latter, I think a separate patch is more appropriate...
>
> Requirement. Direct IO is a twisty maze of passages loaded with
> deadly traps, e.g. the
> non-AIO path:
>
>   ->direct_IO
>     alloc dio(off, len)
>     loop until all IO issued {
>       get_blocks
>         dio->private = bh_result->b_private
>       build bio
>       dio->ref++
>       submit bio
>     }
>
>     dio_await_completion(dio)
>     dio_complete(dio)
>       dio->ref-- => goes to zero
>       dio->end_io(filp, off, len, dio->private)
>         xfs_end_io_direct_write(... off, len, ioend)
>
> So, essentially, for as many bios as are mapped and submitted for
> the direct IO, there is only one end IO completion call for the
> entire IO. The multiple mappings we make need to aggregate the state
> of the entire IO, not just the single instance....
>

Ok, thanks for the breakdown. Essentially, we need to track the
highest-precedence I/O type of the overall DIO with respect to the
completion handler.

The patch itself is not hard to follow, but the dio path is a different
beast. What I didn't quite catch when first playing with this is the
mapping size optimization earlier in the get_blocks call, which
effectively defeats some of this by reducing the need for multiple
calls in many cases. A single DIO write over a range of alternating map
types (e.g., alternating preallocated blocks and holes) is a better way
to trigger the ioend aggregation (rough sketches of the aggregation and
of such a reproducer are appended at the end of this mail).

Additional comments to follow...

Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
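
To make the completion flow above concrete, here is a minimal sketch of
the aggregation idea: successive get_blocks calls for the same direct IO
hand back one ioend via bh_result->b_private, its range is extended to
cover each new mapping, and its IO type is promoted so that the single
end_io call sees the state of the whole IO. The names below (dio_ioend,
dio_ioend_extend, the dio_io_type enum) are made up for illustration and
are not the actual XFS structures or helpers:

    #include <stddef.h>
    #include <sys/types.h>

    /* Illustrative only -- these names do not exist in the XFS code. */
    enum dio_io_type {
        DIO_IO_OVERWRITE,
        DIO_IO_UNWRITTEN,
        DIO_IO_APPEND,
    };

    struct dio_ioend {
        off_t               io_offset;  /* start of aggregated range */
        size_t              io_size;    /* bytes covered so far */
        enum dio_io_type    io_type;    /* highest-precedence type seen */
    };

    /*
     * Called once per get_blocks mapping.  The same ioend keeps coming
     * back via bh_result->b_private, so each call only extends the
     * range and promotes the IO type; the single end_io completion
     * then sees the state of the entire direct IO.
     */
    static void dio_ioend_extend(struct dio_ioend *ioend, off_t offset,
                                 size_t size, enum dio_io_type type)
    {
        if (!ioend->io_size) {
            /* first mapping initialises the range */
            ioend->io_offset = offset;
            ioend->io_size = size;
        } else {
            /* later mappings extend the tail of the range */
            ioend->io_size = offset + size - ioend->io_offset;
        }

        if (type > ioend->io_type)
            ioend->io_type = type;
    }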
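
And a rough reproducer for the alternating-map-type case mentioned
above, assuming an XFS file at /mnt/testfile and skipping most error
handling: preallocate every other 1MB chunk, leave holes in between,
then issue one O_DIRECT write across the whole range so a single direct
IO spans several different mappings. The path, chunk size and chunk
count are arbitrary:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define CHUNK   (1024 * 1024)
    #define NCHUNKS 8

    int main(void)
    {
        int fd = open("/mnt/testfile", O_CREAT | O_RDWR | O_DIRECT, 0644);
        void *buf;
        int i;

        if (fd < 0)
            return 1;

        /* preallocate chunks 0, 2, 4, ... leaving holes in between */
        for (i = 0; i < NCHUNKS; i += 2)
            posix_fallocate(fd, (off_t)i * CHUNK, CHUNK);

        /* one direct IO write spanning all the chunks and holes */
        if (posix_memalign(&buf, 4096, (size_t)NCHUNKS * CHUNK))
            return 1;
        memset(buf, 0xab, (size_t)NCHUNKS * CHUNK);
        pwrite(fd, buf, (size_t)NCHUNKS * CHUNK, 0);

        close(fd);
        return 0;
    }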