From: Dexuan Cui
To: Vitaly Kuznetsov, "devel@linuxdriverproject.org"
CC: KY Srinivasan, Haiyang Zhang, "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device shutdown
Date: Tue, 14 Jul 2015 11:41:44 +0000
Message-ID: <19f503e369b04d01b79a1bde866a39f8@SIXPR30MB031.064d.mgd.msft.net>
References: <1436789934-11566-1-git-send-email-vkuznets@redhat.com>
In-Reply-To: <1436789934-11566-1-git-send-email-vkuznets@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Vitaly Kuznetsov
> Sent: Monday, July 13, 2015 20:19
> Subject: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device
> shutdown
>
> When a new subchannel offer from host comes during device shutdown (e.g.
> when a netvsc/storvsc module is unloaded shortly after it was loaded) a
> crash can happen as vmbus_process_offer() is not anyhow serialized with
> vmbus_remove().

How about vmbus_onoffer_rescind()? It's not serialized with vmbus_remove()
either, so I think there is an issue there too?
I remember that when we run 'rmmod hv_netvsc', we get a rescind-offer message
for each subchannel.

> As an example we can try calling subchannel create
> callback when the module is already unloaded.
> The following crash was observed while keeping loading/unloading hv_netvsc
> module on 64-CPU guest:
>
>   hv_netvsc vmbus_14: net device safe to remove
>   BUG: unable to handle kernel paging request at 000000000000a118
>   IP: [] netvsc_sc_open+0x31/0xb0 [hv_netvsc]
>   PGD 1f3946a067 PUD 1f38a5f067 PMD 0
>   Oops: 0000 [#1] SMP
>   ...
>   Call Trace:
>    [] vmbus_onoffer+0x477/0x530 [hv_vmbus]
>    [] ? move_linked_works+0x5f/0x80
>    [] vmbus_onmessage+0x33/0xa0 [hv_vmbus]
>    [] vmbus_onmessage_work+0x21/0x30 [hv_vmbus]
>    [] process_one_work+0x18e/0x4e0
>   ...
>
> The issue cannot be solved by just resetting sc_creation_callback on
> driver removal as while we search for the parent channel with channel_lock
> held we release it after the channel was found and it can disappear beneath
> our feet while we're still in vmbus_process_offer();
>
> Introduce new sc_create_lock mutex and take it in vmbus_remove() to ensure
> no new subchannels are created after we started the removal procedure.
> Check its state with mutex_trylock in vmbus_process_offer().

In my 8-CPU VM, I can very easily reproduce the panic by

1. running
   while ((1)); do modprobe -r hv_netvsc; modprobe hv_netvsc; sleep 10; done

and

2. in vmbus_process_offer(), sleeping 3s after a subchannel is added to the
   primary channel's sc_list (and before the sc_creation_callback is invoked):
   (I added line 275)

262         if (!fnew) {
263                 /*
264                  * Check to see if this is a sub-channel.
265                  */
266                 if (newchannel->offermsg.offer.sub_channel_index != 0) {
267                         /*
268                          * Process the sub-channel.
269                          */
270                         newchannel->primary_channel = channel;
271                         spin_lock_irqsave(&channel->lock, flags);
272                         list_add_tail(&newchannel->sc_list, &channel->sc_list);
273                         channel->num_sc++;
274                         spin_unlock_irqrestore(&channel->lock, flags);
275                         ssleep(3);
276                 } else
277                         goto err_free_chan;
278         }
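
For reference, the mutex_trylock idea from the patch description can be
modelled in user space. The sketch below is only an illustration, not the
kernel code: it assumes a pthread mutex standing in for the proposed
sc_create_lock, and struct model_device and the model_* functions are
hypothetical names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a vmbus device (not the kernel struct). */
struct model_device {
	pthread_mutex_t sc_create_lock;	/* models the proposed sc_create_lock */
	int num_sc;			/* subchannels accepted so far */
};

static void model_device_init(struct model_device *dev)
{
	int rc = pthread_mutex_init(&dev->sc_create_lock, NULL);

	assert(rc == 0);
	dev->num_sc = 0;
}

/* Models the start of removal: take the lock and hold it during teardown. */
static void model_remove_begin(struct model_device *dev)
{
	int rc = pthread_mutex_lock(&dev->sc_create_lock);

	assert(rc == 0);
}

/* Models the end of removal: release the lock again. */
static void model_remove_end(struct model_device *dev)
{
	int rc = pthread_mutex_unlock(&dev->sc_create_lock);

	assert(rc == 0);
}

/*
 * Models the subchannel branch of offer processing.
 * Returns true if the subchannel was accepted, false if removal is in
 * progress and the offer has to be dropped.
 */
static bool model_process_sc_offer(struct model_device *dev)
{
	if (pthread_mutex_trylock(&dev->sc_create_lock) != 0)
		return false;	/* lock is held: removal has started */

	dev->num_sc++;		/* the creation callback would run here */
	pthread_mutex_unlock(&dev->sc_create_lock);
	return true;
}
```

In this model an offer is accepted while the device is live and is refused
once model_remove_begin() has been called. It does not cover the rescind
path raised above.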