Subject: Re: [PATCH] drm/amdgpu: Fix multiple GPU resets in XGMI hive.
Date: Wed, 11 May 2022 11:35:24 -0400
Message-ID: <7330c50f-4623-c377-9486-c9a8fa884e15@amd.com>
From: Andrey Grodzovsky
To: "Lazar, Lijo", Christian König, Felix Kuehling, amd-gfx@lists.freedesktop.org
In-Reply-To: <27535877-a13f-75fc-368f-68105dffc7f8@amd.com>
Cc: Bai Zoy

On 2022-05-11 11:20, Lazar, Lijo wrote:
>
>
> On 5/11/2022 7:28 PM, Christian König wrote:
>> On 11.05.22 at 15:43, Andrey Grodzovsky wrote:
>>> On 2022-05-11 03:38, Christian König wrote:
>>>> On 10.05.22 at 20:53, Andrey Grodzovsky wrote:
>>>>> [SNIP]
>>>>>> E.g. in the reset code (either before or after the reset, that's
>>>>>> debatable) you do something like this:
>>>>>>
>>>>>> for (i = 0; i < num_ring; ++i)
>>>>>>     cancel_delayed_work(ring[i]->scheduler....)
>>>>>> cancel_work(adev->ras_work);
>>>>>> cancel_work(adev->iofault_work);
>>>>>> cancel_work(adev->debugfs_work);
>>>>>> ...
>>>>>>
>>>>>> You don't really need to track which reset source has fired and
>>>>>> which hasn't, because that would be racy again. Instead just
>>>>>> bluntly reset all possible sources.
>>>>>>
>>>>>> Christian.
>>>>>
>>>>>
>>>>> I don't say we care if it fired once or twice (I need to add a fix
>>>>> to only insert reset work to pending reset list if it's not
>>>>> already there), the point of using list (or array) to which you
>>>>> add and from which you remove is that the logic of this is
>>>>> encapsulated within reset domain. In your way we need to be aware
>>>>> who exactly schedules reset work and explicitly cancel them, this
>>>>> also means that for any new source added in the future you will
>>>>> need to remember to add him
>>>>
>>>> I don't think that this is a valid argument. Additionally to the
>>>> schedulers we probably just need less than a handful of reset
>>>> sources, most likely even just one or two is enough.
>>>>
>>>> The only justification I can see of having additional separate
>>>> reset sources would be if somebody wants to know if a specific
>>>> source has been handled or not (e.g. call flush_work() or
>>>> work_pending()). Like in the case of a reset triggered through
>>>> debugfs.
>>>
>>>
>>> This is indeed one reason, another is as we said before that if you
>>> share 'reset source' (meaning a delayed work) with another client
>>> (i.e. RAS and KFD) it means you make assumption that the other
>>> client always proceeds with the
>>> reset exactly the same way as you expect. So today we have this only
>>> in scheduler vs non scheduler reset happening - non scheduler reset
>>> clients assume the reset is always fully executed in HW while
>>> scheduler based reset makes shortcuts and not always does HW reset
>>> hence they cannot share 'reset source' (delayed work). Yes, we can
>>> always add this in the future if and when such problem will arise
>>> but no one will remember this then and a new bug will be introduced
>>> and will take time to find and resolve.
>>
>> Mhm, so your main concern is that we forget to correctly handle the
>> new reset sources?
>>
>> How about we do it like this then:
>>
>> struct amdgpu_reset_domain {
>>      ....
>>      union {
>>          struct {
>>              struct work_item debugfs;
>>              struct work_item ras;
>>              ....
>>          };
>>          struct work_item array[]
>>      } reset_sources;
>> }
>>
>
> If it's only about static array,
>
> enum amdgpu_reset_soruce {
>
> AMDGPU_RESET_SRC_RAS,
> AMDGPU_RESET_SRC_ABC,
> .....
> AMDGPU_RESET_SRC_XYZ,
> AMDGPU_RESET_SRC_MAX
>
> };
>
> struct work_struct reset_work[AMDGPU_RESET_SRC_MAX]; => An index for
> each work item
>
>
> Thanks,
> Lijo

It's possible, though it makes it harder to generalize reset_domain later
for other drivers.
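To make the trade-off concrete, here is a rough sketch of what the
enum-indexed static array suggested above, plus a blunt cancel-everything
helper, could look like (illustrative names only, not the actual patch):

/* Sketch only - assumes the reset domain owns one statically allocated
 * work item per reset source, indexed by an enum, and that each entry
 * is INIT_WORK()'ed with its handler when the domain is created.
 */
#include <linux/workqueue.h>

enum amdgpu_reset_source {
	AMDGPU_RESET_SRC_JOB_TIMEOUT,
	AMDGPU_RESET_SRC_RAS,
	AMDGPU_RESET_SRC_IOFAULT,
	AMDGPU_RESET_SRC_DEBUGFS,
	AMDGPU_RESET_SRC_MAX
};

struct amdgpu_reset_domain_sketch {
	struct workqueue_struct *wq;
	struct work_struct reset_work[AMDGPU_RESET_SRC_MAX];
};

static inline bool
amdgpu_reset_domain_schedule_src(struct amdgpu_reset_domain_sketch *domain,
				 enum amdgpu_reset_source src)
{
	/* Returns false if this source already has a reset pending. */
	return queue_work(domain->wq, &domain->reset_work[src]);
}

/* Bluntly drop every pending reset request; the reset we are doing now
 * covers all of them anyway.
 */
static inline void
amdgpu_reset_domain_cancel_all(struct amdgpu_reset_domain_sketch *domain)
{
	int i;

	for (i = 0; i < AMDGPU_RESET_SRC_MAX; ++i)
		cancel_work(&domain->reset_work[i]);
}

A plain work_struct in such an array cannot carry a return value back to
the caller, though, which is exactly the debugfs problem below.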
But there is still one caveat: look at amdgpu_recover_work_struct and its
usage in amdgpu_device_gpu_recover and in gpu_recover_get. At least for
debugfs I need to return the result of the GPU reset, so I cannot store
the actual work items in the array mentioned above but rather pointers to
them, because I need a way to get back the return value from gpu_recover
like I do now in amdgpu_device_gpu_recover.

Andrey

>
>> Not 100% sure if that works, but something like that should do the
>> trick.
>>
>> My main concern is that I don't want to allocate the work items on
>> the stack and dynamic allocation (e.g. kmalloc) is usually not
>> possible either.
>>
>> Additional to that putting/removing work items from a list, array or
>> other container is a very common source for race conditions.
>>
>> Regards,
>> Christian.
>>
>>>
>>>>> to the cancellation list which you showed above. In current way
>>>>> all this done automatically within reset_domain code and it's
>>>>> agnostic to specific driver and it's specific list of reset
>>>>> sources. Also in case we would want to generalize reset_domain to
>>>>> other GPU drivers (which was
>>>>> a plan as far as i remember) this explicit mention of each reset
>>>>> works for cancellation is again less suitable in my opinion.
>>>>
>>>> Well we could put the work item for the scheduler independent reset
>>>> source into the reset domain as well. But I'm not sure those
>>>> additional reset sources should be part of any common handling,
>>>> that is largely amdgpu specific.
>>>
>>>
>>> So it's for sure more then one source for the reasons described
>>> above, also note that for scheduler we already cancel delayed work
>>> in drm_sched_stop so calling them again in amdgpu code kind of
>>> superfluous.
>>>
>>> Andrey
>>>
>>>
>>>>
>>>> Christian.
>>>>
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Andrey
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> The only difference is I chose to do the canceling right
>>>>>>>>> BEFORE the HW reset and not AFTER. I did this because I see a
>>>>>>>>> possible race where a new reset request is being generated
>>>>>>>>> exactly after we finished the HW reset but before we canceled
>>>>>>>>> out all pending resets - in such case you wold not want to
>>>>>>>>> cancel this 'border line new' reset request.
>>>>>>>>
>>>>>>>> Why not? Any new reset request directly after a hardware reset
>>>>>>>> is most likely just falsely generated by the reset itself.
>>>>>>>>
>>>>>>>> Ideally I would cancel all sources after the reset, but before
>>>>>>>> starting any new work.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Andrey
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>>> You can see that if many different reset sources share same
>>>>>>>>>>>> work struct what can happen is that the first to obtain the
>>>>>>>>>>>> lock you describe bellow might opt out from full HW reset
>>>>>>>>>>>> because his bad job did signal for example or because his
>>>>>>>>>>>> hunged IP block was able to recover through SW reset but in
>>>>>>>>>>>> the meantime another reset source who needed an actual HW
>>>>>>>>>>>> reset just silently returned and we end up with unhandled
>>>>>>>>>>>> reset request.
>>>>>>>>>>>> True that today this happens only to job
>>>>>>>>>>>> timeout reset sources that are handled form within the
>>>>>>>>>>>> scheduler and won't use this single work struct but no one
>>>>>>>>>>>> prevents a future case for this to happen and also, if we
>>>>>>>>>>>> actually want to unify scheduler time out handlers within
>>>>>>>>>>>> reset domain (which seems to me the right design approach)
>>>>>>>>>>>> we won't be able to use just one work struct for this
>>>>>>>>>>>> reason anyway.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Just to add to this point - a reset domain is co-operative
>>>>>>>>>>> domain. In addition to sharing a set of clients from various
>>>>>>>>>>> reset sources for one device, it also will have a set of
>>>>>>>>>>> devices like in XGMI hive. The job timeout on one device may
>>>>>>>>>>> not eventually result in result, but a RAS error happening
>>>>>>>>>>> on another device at the same time would need a reset. The
>>>>>>>>>>> second device's RAS error cannot return seeing  that a reset
>>>>>>>>>>> work already started, or ignore the reset work given that
>>>>>>>>>>> another device has filled the reset data.
>>>>>>>>>>>
>>>>>>>>>>> When there is a reset domain, it should take care of the
>>>>>>>>>>> work scheduled and keeping it in device or any other level
>>>>>>>>>>> doesn't sound good.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Lijo
>>>>>>>>>>>
>>>>>>>>>>>> Andrey
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd put the reset work struct into the reset_domain
>>>>>>>>>>>>> struct. That way you'd have exactly one worker for the
>>>>>>>>>>>>> reset domain. You could implement a lock-less scheme to
>>>>>>>>>>>>> decide whether you need to schedule a reset, e.g. using an
>>>>>>>>>>>>> atomic counter in the shared work struct that gets
>>>>>>>>>>>>> incremented when a client wants to trigger a reset
>>>>>>>>>>>>> (atomic_add_return). If that counter is exactly 1 after
>>>>>>>>>>>>> incrementing, you need to fill in the rest of the work
>>>>>>>>>>>>> struct and schedule the work. Otherwise, it's already
>>>>>>>>>>>>> scheduled (or another client is in the process of
>>>>>>>>>>>>> scheduling it) and you just return. When the worker
>>>>>>>>>>>>> finishes (after confirming a successful reset), it resets
>>>>>>>>>>>>> the counter to 0, so the next client requesting a reset
>>>>>>>>>>>>> will schedule the worker again.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>   Felix
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Additional to that keep in mind that you can't allocate
>>>>>>>>>>>>>>> any memory before or during the GPU reset nor wait for
>>>>>>>>>>>>>>> the reset to complete (so you can't allocate anything on
>>>>>>>>>>>>>>> the stack either).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is no dynamic allocation here, regarding stack
>>>>>>>>>>>>>> allocations - we do it all the time when we call
>>>>>>>>>>>>>> functions, even during GPU resets, how on stack
>>>>>>>>>>>>>> allocation of work struct in amdgpu_device_gpu_recover is
>>>>>>>>>>>>>> different from any other local variable we allocate in
>>>>>>>>>>>>>> any function we call ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am also not sure why it's not allowed to wait for reset
>>>>>>>>>>>>>> to complete ? Also, see in amdgpu_ras_do_recovery and
>>>>>>>>>>>>>> gpu_recover_get (debugfs) - the caller expects the reset
>>>>>>>>>>>>>> to complete before he returns.
>>>>>>>>>>>>>> I can probably work around
>>>>>>>>>>>>>> it in RAS code by calling atomic_set(&ras->in_recovery,
>>>>>>>>>>>>>> 0) from some callback within actual reset function but
>>>>>>>>>>>>>> regarding sysfs it actually expects a result returned
>>>>>>>>>>>>>> indicating whether the call was successful or not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't think that concept you try here will work.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also in general seems to me it's cleaner approach where
>>>>>>>>>>>>>>>> this logic (the work items) are held and handled in
>>>>>>>>>>>>>>>> reset_domain and are not split in each adev or any
>>>>>>>>>>>>>>>> other entity. We might want in the future to even move
>>>>>>>>>>>>>>>> the scheduler handling into reset domain since reset
>>>>>>>>>>>>>>>> domain is supposed to be a generic things and not only
>>>>>>>>>>>>>>>> or AMD.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Andrey
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Signed-off-by: Andrey Grodzovsky
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Tested-by: Bai Zoy
>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 11 +---
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 17 +++-
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 3 +
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 73
>>>>>>>>>>>>>>>>>>>> +++++++++++++++++++++-
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 3 +-
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 7 ++-
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 7 ++-
>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 7 ++-
>>>>>>>>>>>>>>>>>>>>   8 files changed, 104 insertions(+), 24 deletions(-)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>>>>>>>>>>>>>> index 4264abc5604d..99efd8317547 100644
>>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>>>>>>>>>>>>>>> @@ -109,6 +109,7 @@
>>>>>>>>>>>>>>>>>>>>   #include "amdgpu_fdinfo.h"
>>>>>>>>>>>>>>>>>>>>   #include "amdgpu_mca.h"
>>>>>>>>>>>>>>>>>>>>   #include "amdgpu_ras.h"
>>>>>>>>>>>>>>>>>>>> +#include "amdgpu_reset.h"
>>>>>>>>>>>>>>>>>>>>     #define MAX_GPU_INSTANCE 16
>>>>>>>>>>>>>>>>>>>>   @@ -509,16 +510,6 @@ struct
>>>>>>>>>>>>>>>>>>>> amdgpu_allowed_register_entry {
>>>>>>>>>>>>>>>>>>>>       bool grbm_indexed;
>>>>>>>>>>>>>>>>>>>>   };
>>>>>>>>>>>>>>>>>>>>   -enum amd_reset_method {
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_NONE = -1,
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_LEGACY = 0,
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_MODE0,
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_MODE1,
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_MODE2,
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_BACO,
>>>>>>>>>>>>>>>>>>>> -    AMD_RESET_METHOD_PCI,
>>>>>>>>>>>>>>>>>>>> -};
>>>>>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>>>>>   struct amdgpu_video_codec_info {
>>>>>>>>>>>>>>>>>>>>       u32 codec_type;
>>>>>>>>>>>>>>>>>>>>       u32 max_width;
>>>>>>>>>>>>>>>>>>>> diff --git
>>>>>>>>>>>>>>>>>>>>
a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>>>>>>>>>>> index e582f1044c0f..7fa82269c30f 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>>>>>>>>>>> @@ -5201,6 +5201,12 @@ int >>>>>>>>>>>>>>>>>>>> amdgpu_device_gpu_recover_imp(struct amdgpu_device >>>>>>>>>>>>>>>>>>>> *adev, >>>>>>>>>>>>>>>>>>>>       } >>>>>>>>>>>>>>>>>>>> tmp_vram_lost_counter = >>>>>>>>>>>>>>>>>>>> atomic_read(&((adev)->vram_lost_counter)); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +    /* Drop all pending resets since we will reset >>>>>>>>>>>>>>>>>>>> now anyway */ >>>>>>>>>>>>>>>>>>>> +    tmp_adev = >>>>>>>>>>>>>>>>>>>> list_first_entry(device_list_handle, struct >>>>>>>>>>>>>>>>>>>> amdgpu_device, >>>>>>>>>>>>>>>>>>>> + reset_list); >>>>>>>>>>>>>>>>>>>> + amdgpu_reset_pending_list(tmp_adev->reset_domain); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>       /* Actual ASIC resets if needed.*/ >>>>>>>>>>>>>>>>>>>>       /* Host driver will handle XGMI hive reset >>>>>>>>>>>>>>>>>>>> for SRIOV */ >>>>>>>>>>>>>>>>>>>>       if (amdgpu_sriov_vf(adev)) { >>>>>>>>>>>>>>>>>>>> @@ -5296,7 +5302,7 @@ int >>>>>>>>>>>>>>>>>>>> amdgpu_device_gpu_recover_imp(struct amdgpu_device >>>>>>>>>>>>>>>>>>>> *adev, >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>     struct amdgpu_recover_work_struct { >>>>>>>>>>>>>>>>>>>> -    struct work_struct base; >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_reset_work_struct base; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_device *adev; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_job *job; >>>>>>>>>>>>>>>>>>>>       int ret; >>>>>>>>>>>>>>>>>>>> @@ -5304,7 +5310,7 @@ struct >>>>>>>>>>>>>>>>>>>> amdgpu_recover_work_struct { >>>>>>>>>>>>>>>>>>>>     static void >>>>>>>>>>>>>>>>>>>> amdgpu_device_queue_gpu_recover_work(struct >>>>>>>>>>>>>>>>>>>> work_struct *work) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>> -    struct amdgpu_recover_work_struct >>>>>>>>>>>>>>>>>>>> *recover_work = container_of(work, struct >>>>>>>>>>>>>>>>>>>> amdgpu_recover_work_struct, base); >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_recover_work_struct >>>>>>>>>>>>>>>>>>>> *recover_work = container_of(work, struct >>>>>>>>>>>>>>>>>>>> amdgpu_recover_work_struct, base.base.work); >>>>>>>>>>>>>>>>>>>>         recover_work->ret = >>>>>>>>>>>>>>>>>>>> amdgpu_device_gpu_recover_imp(recover_work->adev, >>>>>>>>>>>>>>>>>>>> recover_work->job); >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>> @@ -5316,12 +5322,15 @@ int >>>>>>>>>>>>>>>>>>>> amdgpu_device_gpu_recover(struct amdgpu_device *adev, >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>>       struct amdgpu_recover_work_struct work = >>>>>>>>>>>>>>>>>>>> {.adev = adev, .job = job}; >>>>>>>>>>>>>>>>>>>>   - INIT_WORK(&work.base, >>>>>>>>>>>>>>>>>>>> amdgpu_device_queue_gpu_recover_work); >>>>>>>>>>>>>>>>>>>> + INIT_DELAYED_WORK(&work.base.base, >>>>>>>>>>>>>>>>>>>> amdgpu_device_queue_gpu_recover_work); >>>>>>>>>>>>>>>>>>>> + INIT_LIST_HEAD(&work.base.node); >>>>>>>>>>>>>>>>>>>>         if >>>>>>>>>>>>>>>>>>>> (!amdgpu_reset_domain_schedule(adev->reset_domain, >>>>>>>>>>>>>>>>>>>> &work.base)) >>>>>>>>>>>>>>>>>>>>           return -EAGAIN; >>>>>>>>>>>>>>>>>>>>   - flush_work(&work.base); >>>>>>>>>>>>>>>>>>>> + flush_delayed_work(&work.base.base); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain_del_pendning_work(adev->reset_domain, >>>>>>>>>>>>>>>>>>>> &work.base); >>>>>>>>>>>>>>>>>>>>  
       return work.ret; >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>> diff --git >>>>>>>>>>>>>>>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c >>>>>>>>>>>>>>>>>>>> index c80af0889773..ffddd419c351 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c >>>>>>>>>>>>>>>>>>>> @@ -134,6 +134,9 @@ struct amdgpu_reset_domain >>>>>>>>>>>>>>>>>>>> *amdgpu_reset_create_reset_domain(enum amdgpu_reset_d >>>>>>>>>>>>>>>>>>>> atomic_set(&reset_domain->in_gpu_reset, 0); >>>>>>>>>>>>>>>>>>>> init_rwsem(&reset_domain->sem); >>>>>>>>>>>>>>>>>>>>   + INIT_LIST_HEAD(&reset_domain->pending_works); >>>>>>>>>>>>>>>>>>>> + mutex_init(&reset_domain->reset_lock); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>       return reset_domain; >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>   diff --git >>>>>>>>>>>>>>>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h >>>>>>>>>>>>>>>>>>>> index 1949dbe28a86..863ec5720fc1 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h >>>>>>>>>>>>>>>>>>>> @@ -24,7 +24,18 @@ >>>>>>>>>>>>>>>>>>>>   #ifndef __AMDGPU_RESET_H__ >>>>>>>>>>>>>>>>>>>>   #define __AMDGPU_RESET_H__ >>>>>>>>>>>>>>>>>>>>   -#include "amdgpu.h" >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +#include >>>>>>>>>>>>>>>>>>>> +#include >>>>>>>>>>>>>>>>>>>> +#include >>>>>>>>>>>>>>>>>>>> +#include >>>>>>>>>>>>>>>>>>>> +#include >>>>>>>>>>>>>>>>>>>> +#include >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +struct amdgpu_device; >>>>>>>>>>>>>>>>>>>> +struct amdgpu_job; >>>>>>>>>>>>>>>>>>>> +struct amdgpu_hive_info; >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>     enum AMDGPU_RESET_FLAGS { >>>>>>>>>>>>>>>>>>>>   @@ -32,6 +43,17 @@ enum AMDGPU_RESET_FLAGS { >>>>>>>>>>>>>>>>>>>>       AMDGPU_SKIP_HW_RESET = 1, >>>>>>>>>>>>>>>>>>>>   }; >>>>>>>>>>>>>>>>>>>>   + >>>>>>>>>>>>>>>>>>>> +enum amd_reset_method { >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_NONE = -1, >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_LEGACY = 0, >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_MODE0, >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_MODE1, >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_MODE2, >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_BACO, >>>>>>>>>>>>>>>>>>>> +    AMD_RESET_METHOD_PCI, >>>>>>>>>>>>>>>>>>>> +}; >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>   struct amdgpu_reset_context { >>>>>>>>>>>>>>>>>>>>       enum amd_reset_method method; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_device *reset_req_dev; >>>>>>>>>>>>>>>>>>>> @@ -40,6 +62,8 @@ struct amdgpu_reset_context { >>>>>>>>>>>>>>>>>>>>       unsigned long flags; >>>>>>>>>>>>>>>>>>>>   }; >>>>>>>>>>>>>>>>>>>>   +struct amdgpu_reset_control; >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>   struct amdgpu_reset_handler { >>>>>>>>>>>>>>>>>>>>       enum amd_reset_method reset_method; >>>>>>>>>>>>>>>>>>>>       struct list_head handler_list; >>>>>>>>>>>>>>>>>>>> @@ -76,12 +100,21 @@ enum amdgpu_reset_domain_type { >>>>>>>>>>>>>>>>>>>>       XGMI_HIVE >>>>>>>>>>>>>>>>>>>>   }; >>>>>>>>>>>>>>>>>>>>   + >>>>>>>>>>>>>>>>>>>> +struct amdgpu_reset_work_struct { >>>>>>>>>>>>>>>>>>>> +    struct delayed_work base; >>>>>>>>>>>>>>>>>>>> +    struct list_head node; >>>>>>>>>>>>>>>>>>>> +}; >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>   struct amdgpu_reset_domain { >>>>>>>>>>>>>>>>>>>>       struct kref refcount; 
>>>>>>>>>>>>>>>>>>>>       struct workqueue_struct *wq; >>>>>>>>>>>>>>>>>>>>       enum amdgpu_reset_domain_type type; >>>>>>>>>>>>>>>>>>>>       struct rw_semaphore sem; >>>>>>>>>>>>>>>>>>>>       atomic_t in_gpu_reset; >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +    struct list_head pending_works; >>>>>>>>>>>>>>>>>>>> +    struct mutex reset_lock; >>>>>>>>>>>>>>>>>>>>   }; >>>>>>>>>>>>>>>>>>>>     @@ -113,9 +146,43 @@ static inline void >>>>>>>>>>>>>>>>>>>> amdgpu_reset_put_reset_domain(struct >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain *dom >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>     static inline bool >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain_schedule(struct >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain *domain, >>>>>>>>>>>>>>>>>>>> - struct work_struct *work) >>>>>>>>>>>>>>>>>>>> + struct amdgpu_reset_work_struct *work) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>> -    return queue_work(domain->wq, work); >>>>>>>>>>>>>>>>>>>> + mutex_lock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +    if (!queue_delayed_work(domain->wq, >>>>>>>>>>>>>>>>>>>> &work->base, 0)) { >>>>>>>>>>>>>>>>>>>> + mutex_unlock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>> +        return false; >>>>>>>>>>>>>>>>>>>> +    } >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + list_add_tail(&work->node, &domain->pending_works); >>>>>>>>>>>>>>>>>>>> + mutex_unlock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +    return true; >>>>>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +static inline void >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain_del_pendning_work(struct >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain *domain, >>>>>>>>>>>>>>>>>>>> +                  struct amdgpu_reset_work_struct >>>>>>>>>>>>>>>>>>>> *work) >>>>>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>>>>> + mutex_lock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>> + list_del_init(&work->node); >>>>>>>>>>>>>>>>>>>> + mutex_unlock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +static inline void >>>>>>>>>>>>>>>>>>>> amdgpu_reset_pending_list(struct >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain *domain) >>>>>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_reset_work_struct *entry, *tmp; >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + mutex_lock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>> + list_for_each_entry_safe(entry, tmp, >>>>>>>>>>>>>>>>>>>> &domain->pending_works, node) { >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + list_del_init(&entry->node); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> +        /* Stop any other related pending resets */ >>>>>>>>>>>>>>>>>>>> + cancel_delayed_work(&entry->base); >>>>>>>>>>>>>>>>>>>> +    } >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + mutex_unlock(&domain->reset_lock); >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>     void amdgpu_device_lock_reset_domain(struct >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain *reset_domain); >>>>>>>>>>>>>>>>>>>> diff --git >>>>>>>>>>>>>>>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h >>>>>>>>>>>>>>>>>>>> index 239f232f9c02..574e870d3064 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h >>>>>>>>>>>>>>>>>>>> @@ -25,6 +25,7 @@ >>>>>>>>>>>>>>>>>>>>   #define AMDGPU_VIRT_H >>>>>>>>>>>>>>>>>>>>     #include "amdgv_sriovmsg.h" >>>>>>>>>>>>>>>>>>>> +#include "amdgpu_reset.h" >>>>>>>>>>>>>>>>>>>>     #define AMDGPU_SRIOV_CAPS_SRIOV_VBIOS (1 << 0) >>>>>>>>>>>>>>>>>>>> /* 
vBIOS is sr-iov ready */ >>>>>>>>>>>>>>>>>>>>   #define AMDGPU_SRIOV_CAPS_ENABLE_IOV (1 << 1) /* >>>>>>>>>>>>>>>>>>>> sr-iov is enabled on this GPU */ >>>>>>>>>>>>>>>>>>>> @@ -230,7 +231,7 @@ struct amdgpu_virt { >>>>>>>>>>>>>>>>>>>>       uint32_t reg_val_offs; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_irq_src ack_irq; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_irq_src rcv_irq; >>>>>>>>>>>>>>>>>>>> -    struct work_struct flr_work; >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_reset_work_struct flr_work; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_mm_table mm_table; >>>>>>>>>>>>>>>>>>>>       const struct amdgpu_virt_ops *ops; >>>>>>>>>>>>>>>>>>>>       struct amdgpu_vf_error_buffer vf_errors; >>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c >>>>>>>>>>>>>>>>>>>> index b81acf59870c..f3d1c2be9292 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c >>>>>>>>>>>>>>>>>>>> @@ -251,7 +251,7 @@ static int >>>>>>>>>>>>>>>>>>>> xgpu_ai_set_mailbox_ack_irq(struct amdgpu_device >>>>>>>>>>>>>>>>>>>> *adev, >>>>>>>>>>>>>>>>>>>>     static void xgpu_ai_mailbox_flr_work(struct >>>>>>>>>>>>>>>>>>>> work_struct *work) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>> -    struct amdgpu_virt *virt = container_of(work, >>>>>>>>>>>>>>>>>>>> struct amdgpu_virt, flr_work); >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_virt *virt = container_of(work, >>>>>>>>>>>>>>>>>>>> struct amdgpu_virt, flr_work.base.work); >>>>>>>>>>>>>>>>>>>>       struct amdgpu_device *adev = >>>>>>>>>>>>>>>>>>>> container_of(virt, struct amdgpu_device, virt); >>>>>>>>>>>>>>>>>>>>       int timeout = AI_MAILBOX_POLL_FLR_TIMEDOUT; >>>>>>>>>>>>>>>>>>>>   @@ -380,7 +380,8 @@ int >>>>>>>>>>>>>>>>>>>> xgpu_ai_mailbox_get_irq(struct amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>>           return r; >>>>>>>>>>>>>>>>>>>>       } >>>>>>>>>>>>>>>>>>>>   - INIT_WORK(&adev->virt.flr_work, >>>>>>>>>>>>>>>>>>>> xgpu_ai_mailbox_flr_work); >>>>>>>>>>>>>>>>>>>> + INIT_DELAYED_WORK(&adev->virt.flr_work.base, >>>>>>>>>>>>>>>>>>>> xgpu_ai_mailbox_flr_work); >>>>>>>>>>>>>>>>>>>> + INIT_LIST_HEAD(&adev->virt.flr_work.node); >>>>>>>>>>>>>>>>>>>>         return 0; >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>> @@ -389,6 +390,8 @@ void >>>>>>>>>>>>>>>>>>>> xgpu_ai_mailbox_put_irq(struct amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>>       amdgpu_irq_put(adev, &adev->virt.ack_irq, 0); >>>>>>>>>>>>>>>>>>>>       amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain_del_pendning_work(adev->reset_domain, >>>>>>>>>>>>>>>>>>>> &adev->virt.flr_work); >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>     static int xgpu_ai_request_init_data(struct >>>>>>>>>>>>>>>>>>>> amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c >>>>>>>>>>>>>>>>>>>> index 22c10b97ea81..927b3d5bb1d0 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c >>>>>>>>>>>>>>>>>>>> @@ -275,7 +275,7 @@ static int >>>>>>>>>>>>>>>>>>>> xgpu_nv_set_mailbox_ack_irq(struct amdgpu_device >>>>>>>>>>>>>>>>>>>> *adev, >>>>>>>>>>>>>>>>>>>>     static void xgpu_nv_mailbox_flr_work(struct >>>>>>>>>>>>>>>>>>>> work_struct *work) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>> -    struct amdgpu_virt *virt = 
container_of(work, >>>>>>>>>>>>>>>>>>>> struct amdgpu_virt, flr_work); >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_virt *virt = container_of(work, >>>>>>>>>>>>>>>>>>>> struct amdgpu_virt, flr_work.base.work); >>>>>>>>>>>>>>>>>>>>       struct amdgpu_device *adev = >>>>>>>>>>>>>>>>>>>> container_of(virt, struct amdgpu_device, virt); >>>>>>>>>>>>>>>>>>>>       int timeout = NV_MAILBOX_POLL_FLR_TIMEDOUT; >>>>>>>>>>>>>>>>>>>>   @@ -407,7 +407,8 @@ int >>>>>>>>>>>>>>>>>>>> xgpu_nv_mailbox_get_irq(struct amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>>           return r; >>>>>>>>>>>>>>>>>>>>       } >>>>>>>>>>>>>>>>>>>>   - INIT_WORK(&adev->virt.flr_work, >>>>>>>>>>>>>>>>>>>> xgpu_nv_mailbox_flr_work); >>>>>>>>>>>>>>>>>>>> + INIT_DELAYED_WORK(&adev->virt.flr_work.base, >>>>>>>>>>>>>>>>>>>> xgpu_nv_mailbox_flr_work); >>>>>>>>>>>>>>>>>>>> + INIT_LIST_HEAD(&adev->virt.flr_work.node); >>>>>>>>>>>>>>>>>>>>         return 0; >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>> @@ -416,6 +417,8 @@ void >>>>>>>>>>>>>>>>>>>> xgpu_nv_mailbox_put_irq(struct amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>>       amdgpu_irq_put(adev, &adev->virt.ack_irq, 0); >>>>>>>>>>>>>>>>>>>>       amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain_del_pendning_work(adev->reset_domain, >>>>>>>>>>>>>>>>>>>> &adev->virt.flr_work); >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>     const struct amdgpu_virt_ops xgpu_nv_virt_ops = { >>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c >>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c >>>>>>>>>>>>>>>>>>>> index 7b63d30b9b79..1d4ef5c70730 100644 >>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c >>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c >>>>>>>>>>>>>>>>>>>> @@ -512,7 +512,7 @@ static int >>>>>>>>>>>>>>>>>>>> xgpu_vi_set_mailbox_ack_irq(struct amdgpu_device >>>>>>>>>>>>>>>>>>>> *adev, >>>>>>>>>>>>>>>>>>>>     static void xgpu_vi_mailbox_flr_work(struct >>>>>>>>>>>>>>>>>>>> work_struct *work) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>> -    struct amdgpu_virt *virt = container_of(work, >>>>>>>>>>>>>>>>>>>> struct amdgpu_virt, flr_work); >>>>>>>>>>>>>>>>>>>> +    struct amdgpu_virt *virt = container_of(work, >>>>>>>>>>>>>>>>>>>> struct amdgpu_virt, flr_work.base.work); >>>>>>>>>>>>>>>>>>>>       struct amdgpu_device *adev = >>>>>>>>>>>>>>>>>>>> container_of(virt, struct amdgpu_device, virt); >>>>>>>>>>>>>>>>>>>>         /* wait until RCV_MSG become 3 */ >>>>>>>>>>>>>>>>>>>> @@ -610,7 +610,8 @@ int >>>>>>>>>>>>>>>>>>>> xgpu_vi_mailbox_get_irq(struct amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>>           return r; >>>>>>>>>>>>>>>>>>>>       } >>>>>>>>>>>>>>>>>>>>   - INIT_WORK(&adev->virt.flr_work, >>>>>>>>>>>>>>>>>>>> xgpu_vi_mailbox_flr_work); >>>>>>>>>>>>>>>>>>>> + INIT_DELAYED_WORK(&adev->virt.flr_work.base, >>>>>>>>>>>>>>>>>>>> xgpu_vi_mailbox_flr_work); >>>>>>>>>>>>>>>>>>>> + INIT_LIST_HEAD(&adev->virt.flr_work.node); >>>>>>>>>>>>>>>>>>>>         return 0; >>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>> @@ -619,6 +620,8 @@ void >>>>>>>>>>>>>>>>>>>> xgpu_vi_mailbox_put_irq(struct amdgpu_device *adev) >>>>>>>>>>>>>>>>>>>>   { >>>>>>>>>>>>>>>>>>>>       amdgpu_irq_put(adev, &adev->virt.ack_irq, 0); >>>>>>>>>>>>>>>>>>>>       amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0); >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>> amdgpu_reset_domain_del_pendning_work(adev->reset_domain, >>>>>>>>>>>>>>>>>>>> &adev->virt.flr_work); 
>>>>>>>>>>>>>>>>>>>>   } >>>>>>>>>>>>>>>>>>>>     const struct amdgpu_virt_ops xgpu_vi_virt_ops = { >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>> >>>> >>
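For reference, a minimal sketch of the lock-less scheduling scheme Felix
describes above (illustrative only; do_domain_reset() is a placeholder,
not an existing amdgpu function):

#include <linux/atomic.h>
#include <linux/workqueue.h>

/* Placeholder for the real reset routine. */
void do_domain_reset(void);

struct shared_reset_work {
	struct work_struct work;	/* INIT_WORK()'ed with shared_reset_worker */
	atomic_t requests;		/* 0 = idle, >= 1 = a reset is scheduled */
};

static void shared_reset_worker(struct work_struct *work)
{
	struct shared_reset_work *r =
		container_of(work, struct shared_reset_work, work);

	do_domain_reset();

	/* Reset confirmed done: let the next requester schedule again. */
	atomic_set(&r->requests, 0);
}

/* Called by any reset source (job timeout, RAS, debugfs, ...). */
static bool shared_reset_request(struct workqueue_struct *wq,
				 struct shared_reset_work *r)
{
	/* Only the first requester after idle fills in the work data and
	 * queues it; everyone else sees the counter above 1 and returns.
	 */
	if (atomic_add_return(1, &r->requests) != 1)
		return false;

	return queue_work(wq, &r->work);
}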