Date: Tue, 22 Feb 2022 16:53:00 -0700
From: Alex Williamson
To: Yishai Hadas
Subject: Re: [PATCH V8 mlx5-next 09/15] vfio: Define device migration protocol v2
Message-ID: <20220222165300.4a8dd044.alex.williamson@redhat.com>
In-Reply-To: <20220220095716.153757-10-yishaih@nvidia.com>
References: <20220220095716.153757-1-yishaih@nvidia.com> <20220220095716.153757-10-yishaih@nvidia.com>
Organization: Red Hat
X-Mailing-List: linux-pci@vger.kernel.org

On Sun, 20 Feb 2022 11:57:10 +0200 Yishai
Hadas wrote:

> From: Jason Gunthorpe
>
> Replace the existing region based migration protocol with an ioctl based
> protocol. The two protocols have the same general semantic behaviors, but
> the way the data is transported is changed.
>
> This is the STOP_COPY portion of the new protocol, it defines the 5 states
> for basic stop and copy migration and the protocol to move the migration
> data in/out of the kernel.
>
> Compared to the clarification of the v1 protocol Alex proposed:
>
> https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen
>
> This has a few deliberate functional differences:
>
>  - ERROR arcs allow the device function to remain unchanged.
>
>  - The protocol is not required to return to the original state on
>    transition failure. Instead userspace can execute an unwind back to
>    the original state, reset, or do something else without needing kernel
>    support. This simplifies the kernel design and should userspace choose
>    a policy like always reset, avoids doing useless work in the kernel
>    on error handling paths.
>
>  - PRE_COPY is made optional, userspace must discover it before using it.
>    This reflects the fact that the majority of drivers we are aware of
>    right now will not implement PRE_COPY.
>
>  - segmentation is not part of the data stream protocol, the receiver
>    does not have to reproduce the framing boundaries.

I'm not sure how to reconcile the statement above with:

  "The user must consider the migration data segments carried over the FD
   to be opaque and non-fungible. During RESUMING, the data segments must
   be written in the same order they came out of the saving side FD."

These subtly conflict: the commit log says the data is not segmented,
yet the comment says segments must be written in order. We'll naturally
have some segmentation due to buffering in kernel and userspace, but I
think referring to it as a stream suggests that the user can cut and
join segments arbitrarily so long as byte order is preserved, right?
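
To make the stream interpretation concrete, here is a minimal sketch in
Python (not kernel code), assuming the migration data is nothing more
than an opaque byte sequence. The helper rechunk(), the sample segments,
and the buffer size are all hypothetical and made up for illustration;
nothing here reflects the actual VFIO uAPI. It only shows that userspace
can re-cut the data at arbitrary boundaries while the byte order seen by
the resuming side stays the same.

```python
from typing import Iterable, Iterator


def rechunk(segments: Iterable[bytes], size: int) -> Iterator[bytes]:
    """Re-cut byte segments into pieces of at most `size` bytes.

    Byte order is preserved; the original segment boundaries are not.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    pending = bytearray()
    for seg in segments:
        pending += seg
        # Emit full-sized pieces as soon as enough bytes are buffered.
        while len(pending) >= size:
            yield bytes(pending[:size])
            del pending[:size]
    # Flush whatever is left as a final, possibly shorter, piece.
    if pending:
        yield bytes(pending)


# Hypothetical segments as they might be read from the saving side FD.
saved = [b"abc", b"defgh", b"", b"ij"]

# Userspace buffers with a different size before writing on the resuming side.
resumed = list(rechunk(saved, 4))

# The boundaries differ, but the concatenated byte stream is identical.
assert resumed == [b"abcd", b"efgh", b"ij"]
assert b"".join(resumed) == b"".join(saved)
```

If the data really is a stream, the two assertions at the end are the
whole contract: framing may change, the concatenation may not.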

I suspect the commit log comment is referring to the driver imposed
segmentation and framing relative to region offsets. Maybe something
like:

  "The user must consider the migration data stream carried over the FD
   to be opaque and must preserve the byte order of the stream. The user
   is not required to preserve buffer segmentation when writing the data
   stream during the RESUMING operation."

This statement also gives me pause relative to Jason's comments
regarding async support:

> + * The kernel migration driver must fully transition the device to the new state
> + * value before the operation returns to the user.

The above statement certainly doesn't preclude asynchronous availability
of data on the stream FD, but it does demand that the device state
transition itself is synchronous and cannot be shortcut. If the state
transition itself exceeds migration SLAs, we're in a pickle. Thanks,

Alex