Date: Tue, 21 Jun 2022 13:09:34 -0400
From: Peter Xu
To: David Hildenbrand
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, "Dr .
 David Alan Gilbert", Linux MM Mailing List, Sean Christopherson,
 Paolo Bonzini, Andrea Arcangeli, Andrew Morton
Subject: Re: [PATCH RFC 1/4] mm/gup: Add FOLL_INTERRUPTIBLE
References: <20220617014147.7299-1-peterx@redhat.com>
 <20220617014147.7299-2-peterx@redhat.com>
 <212f8b31-e470-d62c-0090-537d0d60add9@redhat.com>
In-Reply-To: <212f8b31-e470-d62c-0090-537d0d60add9@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 21, 2022 at 10:23:32AM +0200, David Hildenbrand wrote:
> On 17.06.22 03:41, Peter Xu wrote:
> > We have had FAULT_FLAG_INTERRUPTIBLE but it was never applied to GUPs. One
> > issue with it is that not all GUP paths are able to handle signal delivery
> > besides SIGKILL.
> >
> > That's not ideal for the GUP users who are actually able to handle these
> > cases, like KVM.
> >
> > KVM uses GUP extensively on faulting guest pages, during which we've got
> > existing infrastructure to retry a page fault at a later time. Allowing
> > the GUP to be interrupted by generic signals can make KVM-related threads
> > more responsive. For example:
> >
> >   (1) SIGUSR1: which QEMU/KVM uses to deliver an inter-process IPI,
> >       e.g. when the admin issues a vm_stop QMP command, SIGUSR1 can be
> >       generated to kick the vcpus out of kernel context immediately,
> >
> >   (2) SIGINT: which can be used with interactive hypervisor users to stop a
> >       virtual machine with Ctrl-C without any delays/hangs,
> >
> >   (3) SIGTRAP: which grants GDB capability even during page faults that are
> >       stuck for a long time.
> >
> > Normally the hypervisor will be able to receive these signals properly, but
> > not if we're stuck in a GUP for a long time for whatever reason. It happens
> > easily with a stuck postcopy migration when e.g. a temporary network failure
> > happens; then some vcpu threads can hang to death waiting for the pages. With
> > the new FOLL_INTERRUPTIBLE, we can allow GUP users like KVM to selectively
> > enable the ability to trap these signals.
>
> This makes sense to me. I assume relevant callers will detect "GUP
> failed" but also "well, there is a signal to handle" and cleanly back
> off, correct?

Correct, via an -EINTR.

One thing to mention is that the gup user behavior will be the same as
before if the caller didn't explicitly pass in FOLL_INTERRUPTIBLE with the
gup call. So after the whole series is applied, only kvm will handle this
(and only some paths of kvm, not all GUP; I only touched up the x86 slow
page fault path), but that'll be enough to cover 99.99% of the use cases
that I wanted to take care of.

E.g., some kvm request to gup on some guest apic page may still not be able
to respond to a SIGUSR1, but that's very very rare, and we can always add
more users of FOLL_INTERRUPTIBLE when the code is ready to benefit from the
faster response.

Thanks,

-- 
Peter Xu
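
For readers who want the opt-in semantics spelled out, here is a rough
userspace model in C. It is only a sketch and not kernel code:
model_gup_wait(), struct model_task and MODEL_FOLL_INTERRUPTIBLE are
made-up names for illustration, and the loop merely stands in for
"waiting until the page shows up". The only things taken from the thread
are that SIGKILL is already handled by GUP, that other signals interrupt
the call only when the caller passes the new flag, and that the
interruption is reported as -EINTR.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag for this userspace model; not the kernel's definition. */
#define MODEL_FOLL_INTERRUPTIBLE 0x1u

/* Hypothetical stand-in for the signal state of the calling thread. */
struct model_task {
    bool fatal_pending;     /* models a pending SIGKILL */
    bool nonfatal_pending;  /* models a pending SIGUSR1, SIGINT, ... */
};

/*
 * Model of a GUP call that needs `steps_needed` wait rounds before the
 * page becomes available. Returns 0 once the wait completes, or -EINTR
 * if a signal the caller is able to handle cuts the wait short.
 */
int model_gup_wait(const struct model_task *task, unsigned int flags,
                   int steps_needed)
{
    for (int step = 0; step < steps_needed; step++) {
        /* Fatal signals end the wait with or without the flag. */
        if (task->fatal_pending)
            return -EINTR;

        /* Non-fatal signals end the wait only if the caller opted in. */
        if ((flags & MODEL_FOLL_INTERRUPTIBLE) && task->nonfatal_pending)
            return -EINTR;
    }
    return 0;
}
```

In this model a caller that does not pass the flag keeps today's
behavior (a pending SIGUSR1 is simply not looked at), while a caller that
does pass it gets -EINTR back and is expected to back off and retry later.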