From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keno Fischer
Date: Sun, 22 May 2022 14:22:12 -0400
Subject: Re: arm64 equivalents of PR_SET_TSC/ARCH_SET_CPUID
To: Marc Zyngier
Cc: Kyle Huey, open list, "moderated list:ARM PORT",
 "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", yyc1992@gmail.com,
 "Robert O'Callahan", Thomas Gleixner, Borislav Petkov,
 Suzuki K Poulose, Will Deacon
References: <87ilpxmvg3.wl-maz@kernel.org>
In-Reply-To: <87ilpxmvg3.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, May 22, 2022 at 11:35 AM Marc Zyngier wrote:
> From what I understand, you are relying on the TSC being disabled in
> the tracee and intercepting the signal that gets delivered when it
> accesses the counter. Is that correct?

Yes, this is correct. The way that these kernel APIs work is that
they turn any use of `rdtsc` (respectively `cpuid`) into SIGSEGV
signals that the ptracer intercepts and emulates. It's not
particularly pretty, but it works reasonably well in practice.

> Assuming I'm right, I think it'd make a lot more sense if there was a
> first class ptrace option, if only because this would mandate the
> kernel to start trapping things that are not trapped today.

I'm a bit nervous about a "first class ptrace option", if only because
ptrace is already a complicated mess. Having spent a significant
amount of time hunting down architecture-specific ptrace quirks, I'd
be quite hesitant to introduce another one without a very strong
justification. If the proposed mechanism is essentially
signal-equivalent (i.e. it causes a ptrace stop and lets the ptracer
emulate the instruction), then I'd strongly advocate for making it an
actual, proper signal, which has well-understood quirks (as
PR_SET_TSC/ARCH_SET_CPUID do on x86).

The other consideration here is that disabling these sorts of counters
may have non-ptrace applications. For example, sandboxes may want to
disable these capabilities to harden against timing attacks, which
suggests that ptrace may not be the right place for it.

If we're considering something fancier, that's a different story of
course. Naturally, causing a ptrace trap on these instructions has
significant overhead, and because the instructions are usually fast,
existing userspace is not particularly judicious in its use of them
(the same issue exists on x86 of course). One could imagine a
light-weight kernel-level record/replay capability where all accesses
to these registers are traced and dumped into a buffer (with the
corresponding capability to feed the values back from a buffer). That
kind of capability feels like a more natural fit for the perf
subsystem, which already has the means to shuffle trace buffers
around.

> It also begs the question of the fate of CNTFRQ_EL0, since you want to
> be able to replay traces from one system to another (and the counter
> is meaningless without the frequency).

Yes, it'd have to be interceptable also.

> Finally, what of the VDSO, which is by far the most common user of the
> counter? I can totally imagine the VDSO getting stuck if emulation is
> used and the sequence counter moves synchronously with the traps
> (which is why we disable the VDSO when trapping CNTVCT_EL0).

Could you elaborate on this concern? rr currently disables the VDSO,
so it wouldn't be a problem from that perspective, but I don't
understand what you mean by the VDSO getting "stuck".

> > 2. Likewise for ARCH_SET_CPUID
>
> We don't just emulate a single register, but a whole class of them. If
> you are to present a different view for any of those, you'll need to
> handle the lot (I really can't see why one would be more important
> than the others).
>
> So SET_CPUID really is the wrong tool. I'd rather there was (again) an
> API that described exactly that.

I'm assuming these register values are all fixed as long as the
process doesn't get migrated between CPU cores? In that case, it seems
quite doable to introduce another ptrace regset that just holds the
register values for everything that could potentially be emulated (and
is extensible for future additions). We'd still need to work out, in
due course, the exact semantics if one of the emulated registers does
change, but that seems like a solvable issue.

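To make the regset idea a bit more concrete, here is a rough sketch of
what such a table and its lookup might look like. Everything in it is
hypothetical and only for illustration: the struct names, the
`EMU_REG` key packing, the fixed capacity of 8 entries and the example
register value are assumptions, not an existing kernel or ptrace
interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: packs the (op0, op1, CRn, CRm, op2) fields that
 * name an arm64 system register into one integer. The packing is an
 * illustrative choice, not an existing ABI. */
#define EMU_REG(op0, op1, crn, crm, op2) \
	((uint32_t)(((op0) << 19) | ((op1) << 16) | ((crn) << 12) | \
		    ((crm) << 8) | ((op2) << 5)))

#define EMU_IDREG_MAX 8	/* arbitrary capacity for this sketch */

/* One overridden register: which one, and what the tracee should see. */
struct emu_idreg {
	uint32_t encoding;
	uint32_t reserved;
	uint64_t value;
};

/* Hypothetical regset payload. The explicit count is what would let
 * the set be extended with further registers later. */
struct emu_idreg_set {
	uint32_t nr_entries;
	uint32_t reserved;
	struct emu_idreg entries[EMU_IDREG_MAX];
};

/* Find the value to present for one register read.
 * Returns 0 and stores the value in *out if the set overrides the
 * register, or -1 if it does not (the caller would then fall back to
 * whatever the default behaviour is). */
static int emu_idreg_lookup(const struct emu_idreg_set *set,
			    uint32_t encoding, uint64_t *out)
{
	for (uint32_t i = 0; i < set->nr_entries && i < EMU_IDREG_MAX; i++) {
		if (set->entries[i].encoding == encoding) {
			*out = set->entries[i].value;
			return 0;
		}
	}
	return -1;
}
```

A tracer would fill one `struct emu_idreg_set` per tracee, e.g. with an
entry keyed by `EMU_REG(3, 0, 0, 0, 0)` (the encoding of MIDR_EL1), and
the trap path would call `emu_idreg_lookup()` for each emulated read.
What should happen on a miss is exactly the kind of semantics that
would still need to be decided.
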
Keno