From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [Nbd] [PATCH][V3] nbd: add multi-connection support
To: Wouter Verhelst
References: <1475092892-8230-1-git-send-email-jbacik@fb.com> <20160929095204.mexr6wpypo3bl6mx@grep.be>
From: Josef Bacik
Message-ID: <87908d95-0b7c-bc3f-f69d-94d006829daf@fb.com>
Date: Thu, 29 Sep 2016 10:03:50 -0400
In-Reply-To: <20160929095204.mexr6wpypo3bl6mx@grep.be>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/29/2016 05:52 AM, Wouter Verhelst wrote:
> Hi Josef,
>
> On Wed, Sep 28, 2016 at 04:01:32PM -0400, Josef Bacik wrote:
>> NBD can become contended on its single connection. We have to serialize all
>> writes and we can only process one read response at a time. Fix this by
>> allowing userspace to provide multiple connections to a single nbd device. This
>> coupled with block-mq drastically increases performance in multi-process cases.
>> Thanks,
>
> This reminds me: I've been pondering this for a while, and I think there
> is no way we can guarantee the correct ordering of FLUSH replies in the
> face of multiple connections, since a WRITE reply on one connection may
> arrive before a FLUSH reply on another which it does not cover, even if
> the server has no cache coherency issues otherwise.
>
> Having said that, there can certainly be cases where that is not a
> problem, and where performance considerations are more important than
> reliability guarantees; so once this patch lands in the kernel (and the
> necessary support patch lands in the userland utilities), I think I'll
> just update the documentation to mention the problems that might ensue,
> and be done with it.
>
> I can see only a few ways in which to potentially solve this problem:
> - Kernel-side nbd-client could send a FLUSH command over every channel,
>   and only report successful completion once all replies have been
>   received. This might negate some of the performance benefits, however.
> - Multiplexing commands over a single connection (perhaps an SCTP one,
>   rather than TCP); this would require some effort though, as you said,
>   and would probably complicate the protocol significantly.
>

So think of it like normal disks with multiple channels. We don't send flushes
down all the hwq's to make sure they are clear; we leave that decision up to the
application (usually a FS of course). So what we're doing here is no worse than
what every real disk on the planet does, our hw queues just have a lot longer
transfer times and are more error prone ;). I definitely think documenting the
behavior is important so that people don't expect magic to happen, and perhaps
we could add a flag later that says send all the flushes down all the
connections for the paranoid; it should be relatively straightforward to do.
Thanks,

Josef
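
To make the "send all the flushes down all the connections" idea concrete, here
is a rough sketch of the bookkeeping it would probably need. It is not code
from the patch: struct flush_fanout, fanout_start() and fanout_reply() are
hypothetical names made up for illustration, and the sketch assumes at least
one connection and a single thread handling replies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for one FLUSH that is fanned out to every connection. */
struct flush_fanout {
    size_t pending;   /* FLUSH replies still outstanding */
    int    error;     /* first non-zero reply status seen, 0 if none */
};

/* Start a fan-out; the caller is assumed to send one FLUSH per connection. */
static void fanout_start(struct flush_fanout *f, size_t nconns)
{
    f->pending = nconns;
    f->error = 0;
}

/*
 * Record one FLUSH reply. Returns true only when the last connection has
 * answered, which is the point where the flush could be completed to the
 * upper layer with f->error as its status.
 */
static bool fanout_reply(struct flush_fanout *f, int status)
{
    assert(f->pending > 0);
    if (status != 0 && f->error == 0)
        f->error = status;   /* remember the first failure */
    return --f->pending == 0;
}
```

With three connections, the first two calls to fanout_reply() return false and
the third returns true, so the flush is reported complete only after every
connection has replied. In a real driver the counter would have to be atomic or
lock-protected (for example via atomic_dec_and_test()), since replies arrive on
different sockets concurrently.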