Subject: Re: [Question] How to testing SDEI client driver
To: James Morse, Gavin Shan
References: <8cdef8ea-e550-ccff-2041-526d6f6fcda0@redhat.com> <3cc7e69c-9315-5138-85f7-7b36e17b67a1@redhat.com> <2b5ef5b2-cf6b-3f4e-4250-c1d0a34e9873@arm.com>
From: Paolo Bonzini
Message-ID: <69ecf58f-bf9d-6c2d-37ab-4397b30fff4b@redhat.com>
Date: Wed, 8 Jul 2020 18:49:50 +0200
In-Reply-To: <2b5ef5b2-cf6b-3f4e-4250-c1d0a34e9873@arm.com>
Cc: mark.rutland@arm.com, maz@kernel.org, linux-arm-kernel@lists.infradead.org

On 08/07/20 18:11, James Morse wrote:
> Hi Gavin,
>
> On 03/07/2020 01:26, Gavin Shan wrote:
>> On 7/1/20 9:57 PM, James Morse wrote:
>>> On 30/06/2020 06:17, Gavin Shan wrote:
>>>> I'm currently looking into SDEI client driver and reworking on it so that
>>>> it can provide capability/services to arm64/kvm to get it virtualized.
>>>
>>> What do you mean by virtualised? The expectation is the VMM would implement the 'firmware'
>>> side of this. 'events' are most likely to come from the VMM, and having to handshake with
>>> the kernel to work out if the event you want to inject is registered and enabled is
>>> over-complicated. Supporting it in the VMM means you can notify a different vCPU if that
>>> is appropriate, or take a different action if the event isn't registered.
>>>
>>> This was all blocked on finding a future-proof way for tools like Qemu to consume
>>> reference code from ATF.
>
>> Sorry that I didn't mention the story a bit last time. We plan to use SDEI to
>> deliver the notification (signal) from host to guest, needed by the asynchronous
>> page fault feature. The RFCv2 patchset was post a while ago [1].
>
> Thanks. So this is to hint to the guest that you'd swapped its memory to disk. Yuck.
>
> When would you do this?

These days, the main reason is on-demand paging with live migration.
Instead of waiting to have a consistent version of guest memory on the
destination, memory that the guest has dirtied can be copied on demand
from source to destination while the guest is running. Letting the guest
reschedule is surprisingly effective in this case, especially with
workloads that have a lot of threads.

> Isn't this roughly equivalent to SMT CPUs taking a cache-miss?
...
> If you pinned two vCPUs to one physical CPU, the host:scheduler would multiplex between
> them. If one couldn't due useful work because it was waiting for memory, the other gets
> all the slack time. (the TLB maintenance would hurt, but not as much as waiting for the disk)
> The good news is the guest:scheduler already knows how to deal with this!
> (and, it works for other OS too)

The order of magnitude of both the wait and the reschedule is too
different for SMT heuristics to be applicable here. In particular, two
SMT pCPUs compete equally for fetch resources, while two vCPUs pinned to
the same pCPU would only reschedule a few hundred times per second.
Latency would be in the milliseconds and jitter would be horrible.

> Wouldn't it be better to let the guest make the swapping decision?
> You could provide a fast virtio swap device to the guest that is
> backed by maybe-swapped host memory.

I think you are describing something similar to "transcendent memory",
which Xen implemented about 10 years ago
(https://lwn.net/Articles/454795/). Unfortunately you've probably never
heard about it for good reasons. :) The main showstopper is that you
cannot rely on guest cooperation (also because it works surprisingly
well without).

>> For the SDEI
>> events needed by the async page fault, it's originated from KVM (host). In order
>> to achieve the goal, KVM needs some code so that SDEI event can be injected and
>> delivered. Also, the SDEI related hypercalls needs to be handled either.
>
> I avoided doing this because it makes it massively complicated for the VMM.
> All that in-kernel state now has to be migrated. KVM has to expose APIs to let the VMM inject
> events, which gets nasty for shared events where some CPUs are masked, and others aren't.
>
> Having something like Qemu drive the reference code from TFA is the right thing to do for
> SDEI.

Are there use cases for injecting SDEIs from QEMU? If not, it can be
done much more easily in KVM, which wouldn't have more than a handful of
SDEI events (and it would be really, really slow if each page fault had
to be redirected through QEMU). The in-kernel state is 4 64-bit values
(EP address and argument, flags, affinity) per event.

>> Yes, The SDEI specification already mentioned
>> this: the client handler should have all required resources in place before
>> the handler is going to run. However, I don't see it's a problem so far.
>
> What if they are swapped out? This thing becomes re-entrant ... which the spec forbids.
> The host has no clue what is in guest memory.

On x86 we don't do the notification if interrupts are disabled. On ARM
I guess you'd do the same until SDEI_EVENT_COMPLETE (so yeah, that would
be some state that has to be migrated). In fact it would be nice if
SDEI_EVENT_COMPLETE meant "wait for synchronous page-in" while
SDEI_EVENT_COMPLETE_AND_RESUME meant "handle it asynchronously".

>> Lets wait and see if it's a real issue until I post the RFC patchset :)
>
> Its not really a try it and see thing!

On this we agree. ;)

Paolo
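
For illustration only, here is a rough C sketch of the two pieces of state discussed in this message: the four 64-bit values per event (EP address and argument, flags, affinity), and a per-vCPU marker that holds back further notifications until the guest handler signals completion. All struct, field, and function names are hypothetical. They are not taken from KVM, QEMU, or the SDEI client driver, and the gating rule is only the "I guess you'd do the same until SDEI_EVENT_COMPLETE" idea, not an implemented policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-event state: the four 64-bit values named in the text. */
struct sdei_event_state {
	uint64_t ep_address;   /* handler entry point registered by the guest */
	uint64_t ep_argument;  /* argument handed to the handler */
	uint64_t flags;        /* registration flags */
	uint64_t affinity;     /* target affinity for the event */
};

/* Check the "4 64-bit values per event" size at compile time. */
static_assert(sizeof(struct sdei_event_state) == 4 * sizeof(uint64_t),
	      "per-event state should be four 64-bit values");

/* Hypothetical per-vCPU bookkeeping; this would also need migrating. */
struct vcpu_sdei_state {
	bool handler_running;  /* true between delivery and completion */
};

/* Assumed rule: no new notification while a handler is still running. */
static bool sdei_can_notify(const struct vcpu_sdei_state *v)
{
	return !v->handler_running;
}

/* Mark the handler as running when an event is delivered to the guest. */
static void sdei_deliver(struct vcpu_sdei_state *v)
{
	v->handler_running = true;
}

/* Called when the guest issues SDEI_EVENT_COMPLETE (or _AND_RESUME). */
static void sdei_complete(struct vcpu_sdei_state *v)
{
	v->handler_running = false;
}
```

With this shape, a page fault that arrives while handler_running is set would have to be handled synchronously, which is the same situation as x86 with interrupts disabled.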