From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932961AbcDNQVa (ORCPT ); Thu, 14 Apr 2016 12:21:30 -0400 Received: from mail-bl2on0132.outbound.protection.outlook.com ([65.55.169.132]:5348 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932566AbcDNQV1 (ORCPT ); Thu, 14 Apr 2016 12:21:27 -0400 Authentication-Results: fromorbit.com; dkim=none (message not signed) header.d=none;fromorbit.com; dmarc=none action=none header.from=hpe.com; Message-ID: <570FC379.7000107@hpe.com> Date: Thu, 14 Apr 2016 12:21:13 -0400 From: Waiman Long User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12 MIME-Version: 1.0 To: Dave Chinner CC: "Theodore Ts'o" , Andreas Dilger , , , Tejun Heo , Christoph Lameter , Scott J Norton , Douglas Hatch , Toshimitsu Kani Subject: Re: [PATCH v3 1/2] ext4: Pass in DIO_SKIP_DIO_COUNT flag if inode_dio_begin() called References: <1460484775-33359-1-git-send-email-Waiman.Long@hpe.com> <1460484775-33359-2-git-send-email-Waiman.Long@hpe.com> <20160414031634.GJ10643@dastard> In-Reply-To: <20160414031634.GJ10643@dastard> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [72.71.243.58] X-ClientProxiedBy: CO2PR06CA016.namprd06.prod.outlook.com (10.141.242.16) To CS1PR84MB0310.NAMPRD84.PROD.OUTLOOK.COM (10.162.190.28) X-MS-Office365-Filtering-Correlation-Id: 91ccf86b-4ee4-400f-5fd1-08d36480d216 X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0310;2:xNp88VdSvGGwB5I3leH20ZqRG7WU0Bn1UUcAJ+L08pfKF0jwpNCDay8bawx4OglHl0QsOd8TxWEBO0ObUy9xaxeEPpgR2J1l8PcU3eBhcNLqOD9KvztFakRAKWqeBmz7cG2o7BZeV184GxiieP6Gv2dhSDTgJ6jnrF+PwgiUdnE76zhxnYut1z7aO/b3T2XF;3:SBj0f97u6iPdjlKDp1sUN0Qtf9o0k9M0SgS7S9IB0iS/HZENFnxBW1+alFtxDRSaSZMwOGOooQUa0Te6l/q74dtAfzi/Q2HrsmlVA/0CLtvyuQwfaxyE6+W0T1h9qOzi X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:CS1PR84MB0310; X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0310;25:18QClPXDzWbM2yBknD60rtWGLxJCvZu42uTMxexnHlLo1wqkCBtRhrFHTs/6YdYAGDMCTYZoqNb3qX2Ml6BxyY5lom+0nkuJ7F97PgYHPSEaXPwtjPUNdJj2z3k0sFNz/k7mDg8t3Mr3jrJWY5G+ojvGfS0aXjvc5Ouqhn7ST/FVMJAQ7pypBywOGP6mH6ozLmTh3aYT+Y1OCV3yWasbq1ptarVaTqULo9wmU3XFkFOGNud9KHULUz/20OjdvALR/T/pXrXd/iuPcEslau7ycxXnfor/1rQaP3Xjxi4/i8/CQMMOmAvfXzd8L2muliY+Or+DAVEAfDwhuJlLouB/r+w7zrDFTb+ZyLd1I/U/mWD4UoKcq4lFuhxnNxYkadPjQWyJ3GID1XG5AIKT05GbRDcjy70BEp7GFyN7kLeMAvjGGa2xoB4An1JOx+dDp5v/+0DpEQDl7gZFGsKYiOeBTev6HfnTUgXJA1n00t2DymkJFjyEcBIIuy8VRMzLfoD3ZMh6Go/CXSBG0e8nEsJ9Ht+23IAC77ZrNGa+wrzXifmh3ucbdr5Jyvuqveu8T9kZXLXaGwrjR8LtTPxw6j5U0B34yr8xPNI998cGnC/aKiP74P5N1DXqyHT585kQe2kiyywwAVqbYpmmXJLTWNoWqA== X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0310;20:3blr8seSIR98UdEC03VjK2h3LhAb0TEn0PiPuaAoYy3Vv4fYlcI8vpugQbqL7Zu7j/hqtvfn8k0J3CpibQm1ejPneIPpBps6kOsZEFBg+TCsOgADvMJHIypakkw7ZZIvQ3ia7HFMggNDZm2LzlPMcj9WaZhv4Gkhbx/TnH4YTYmeNQfIEfiZe2ZGHl1iQnn6pzTKwE/P+d95kieQw1EzgaaW48In2iNUa/t2+Y1DWqOKOJsZDHm2/0yUR8e7QZEPs76YGZ8K3bkFKxhCADxi+SfL9Vv37MmcluZIzCGMDkvK0geZ5SX2xfP/XZIjMLqzRQKsGCscmeh6GbCZuqwg8w==;4:IMCM2gyTfCbU4ArMoALfsox/wAL/tq0ejAwEQY9Lj337/7OjXBwJ7ZFucBaKBXrguUmLVzmWQwlKJDiHota4WcnxXjV9FgXS2rrM+WDoDVnxxiveoZ7bDTBLQKthudEY9qVqXMXmbSzlqpdESsEf3jaI6V6+y6/olR5Lkyfovv/Qs0oFUhlxWi6XOnOX0JMcPP4wviYr2bl4rqApTmHciJ90vDecXRpJafU3Y/3FFAI9W93dxdVxeW9HjFypHBVIg9id6js2o2fKRDxk420N0fe7E7yh1U4q7Ia2h6FjxTa0mJDlA00vZn+41qjZ+zKd1U/ob3NuZHIPyaGx+UjQYF6ZXeiMd9fMm0kVYuDZMmiBwNe1rU1/CQ8Pjv9SR1ss X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(5005006)(8121501046)(10201501046)(3002001);SRVR:CS1PR84MB0310;BCL:0;PCL:0;RULEID:;SRVR:CS1PR84MB0310; X-Forefront-PRVS: 0912297777 X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(4630300001)(6009001)(6049001)(24454002)(377454003)(23756003)(2906002)(64126003)(2950100001)(65956001)(65806001)(1096002)(50466002)(586003)(33656002)(77096005)(6116002)(117156001)(3846002)(42186005)(81166005)(5008740100001)(66066001)(230700001)(59896002)(92566002)(80316001)(86362001)(4001350100001)(5004730100002)(47776003)(36756003)(50986999)(54356999)(87266999)(76176999)(65816999)(110136002)(189998001)(83506001)(4326007);DIR:OUT;SFP:1102;SCL:1;SRVR:CS1PR84MB0310;H:[192.168.142.157];FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Exchange-Diagnostics: =?iso-8859-1?Q?1;CS1PR84MB0310;23:vzD3q+wnDAWG7bjyN6sftiueH7E/Y3jaPfATap3?= =?iso-8859-1?Q?ngel7Az5a8Kx5IEIanRbUYP4siR3zqt89jBlXGLRT1hs57PBX0C8jVim83?= =?iso-8859-1?Q?OfwK86lpg07seiTfPcqLX1hkICY6neygSabYIC1TNgSgQvSr3gb4aZzC8T?= =?iso-8859-1?Q?7y3nbop5UmguoCaJ4RnMoz7jxoKyNb1xHmiqTjXSr+h8xsQNPRM3waCz1w?= =?iso-8859-1?Q?Pi1AT1Ua2CE0ZBMBYeVd0sIpeU4ANkwyHP8WOwsF1hZPcw5EywoV9qfqWf?= =?iso-8859-1?Q?ttYsgC+f2NWi5Gb0Jfp3Kzb17GbBaXrpEyvpMYuUvasw7GYJDqx11anPn+?= =?iso-8859-1?Q?GqYJlrFVDS3TXuZZ74SzBUJN+MiC9iV5bIdjSMNF8pKrhn9JyF4/0VYVnv?= =?iso-8859-1?Q?ae67v72nXxIElb88aFxSddT/1kE7il72TDXMT1XEbzuj1BuqDJIT++JsSw?= =?iso-8859-1?Q?9TZYF3i2SZ4vKwqJZOLY+Lcs0KgmYWLR9Z9JTcwEOwpGGnsq981lNt84co?= =?iso-8859-1?Q?5a3b/kAtgrUyPi/VsVZB1ZBjwBnGDwjBJm6B/3JN0tRYiLWeMo2XSP7I5n?= =?iso-8859-1?Q?MMUkQsNmOhbL4q4RLzAGfmOozAZ3EqPbsK+E63QLouh0nEIfZD6aWilVln?= =?iso-8859-1?Q?wFONNNs2hqbkp7ROH8Zy875DzLNLMdlkaBqPB+j0eK8ReSHKayvvMDrNqO?= =?iso-8859-1?Q?rhLuot75Pcxo53hPUYEZvzw9fe7fjC6YeloMycT5vYiavooLxJ/OMZzzLN?= =?iso-8859-1?Q?Eq1fxZvTxgKaj23VntWlw2Zynl8NRwHeQiMKZvJ27l99ahnh9hq9qDyeAu?= =?iso-8859-1?Q?B83eDHX/LrP4NGZk5lKk+PKtVnNqhGK3LeEF7RcjCc401Ck5/UpeYyeiqt?= =?iso-8859-1?Q?/4BypAlJQ8rBdWYrlkxy2JFpsWP1Z3tJa33OwYUMT3xQPV0SWRIvENnqco?= =?iso-8859-1?Q?S7ONOq4tFcq+5X5SFyMS3GMzRq9JHofTr8EAWhaD+io04YnSg2ESpSuaJq?= =?iso-8859-1?Q?JwkxLBXm9+IJdkdv63qEqodVN1h0OiNY4J1Ts6LMy3DTDEaWyHe/KApRGc?= =?iso-8859-1?Q?dAW8SaIjF3iVbyLjLhCSdxGt6la7eqhjlBInJoLGWh0UtpVSsTb5fGgC8l?= =?iso-8859-1?Q?8m4GZ?= X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0310;5:u83YdlWTzjz16woKtZqEtlvNxPG6rDCmc2bAM9QvCGSZ8004VxFdTh+DJ3F+eRXTp+aC2cKAgEis2mFN7AOJ43sB9DeTiknnYNXMAcWucI47/OyWpEKizRLrD0oto36YlsOGV065fe8BhDscIJejLw==;24:KaYbn+aZK423lOobfaiCe9bxiCJ/ySQQf6dnKaqG8XgFYzsG07+rEA+ERVJofxGLlzfzS//ypUcy92sR9IXBLC1KxAaBVKsi9YZKoEJ/qyY= SpamDiagnosticOutput: 1:23 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: hpe.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 14 Apr 2016 16:21:21.6863 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: CS1PR84MB0310 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/13/2016 11:16 PM, Dave Chinner wrote: > On Tue, Apr 12, 2016 at 02:12:54PM -0400, Waiman Long wrote: >> When performing direct I/O, the current ext4 code does >> not pass in the DIO_SKIP_DIO_COUNT flag to dax_do_io() or >> __blockdev_direct_IO() when inode_dio_begin() has, in fact, been >> called. This causes dax_do_io()/__blockdev_direct_IO() to invoke >> inode_dio_begin()/inode_dio_end() internally. This doubling of >> inode_dio_begin()/inode_dio_end() calls are wasteful. >> >> This patch removes the extra internal inode_dio_begin()/inode_dio_end() >> calls when those calls are being issued by the caller directly. For >> really fast storage systems like NVDIMM, the removal of the extra >> inode_dio_begin()/inode_dio_end() can give a meaningful boost to >> I/O performance. > Doesn't this break truncate IO serialisation? > > i.e. it appears to me that the ext4 use of inode_dio_begin()/ > inode_dio_end() does not cover AIO, where the IO is still in flight > when submission returns. i.e. the inode_dio_end() call > needs to be in IO completion, not in the submitter context. The only > reason it doesn't break right now is that the duplicate accounting > in the DIO code is correct w.r.t. AIO. Hence bypassing the DIO > accounting will cause AIO writes to race with truncate. > > Same AIO vs truncate problem occurs with the indirect read case you > modified to skip the direct IO layer accounting. I don't quite understand how the duplicate accounting is correct wrt AIO. Both the direct and indirect paths are something like: inode_dio_begin() ... inode_dio_begin() ... inode_dio_end() ... inode_dio_end() What the patch does is to eliminate the innermost inode_dio_begin/end pair. Unless there is a difference between a dio count of 2 vs. 1, I can't see how the code correctness differ with and without my patch. Cheers, Longman