From: Daniel Colascione
Date: Wed, 14 Nov 2018 07:40:46 -0800
Subject: Re: Official Linux system wrapper library?
In-Reply-To: <5853c297-9d84-86e5-dede-aa2957562c6b@arm.com>
References: <877ehjx447.fsf@oldenburg.str.redhat.com> <875zx2vhpd.fsf@oldenburg.str.redhat.com> <20181113193859.GJ3505@e103592.cambridge.arm.com> <5853c297-9d84-86e5-dede-aa2957562c6b@arm.com>
To: Szabolcs Nagy
Cc: Dave P Martin, nd, Florian Weimer, "Michael Kerrisk (man-pages)", linux-kernel, Joel Fernandes, Linux API, Willy Tarreau, Vlastimil Babka, "Carlos O'Donell", "libc-alpha@sourceware.org"
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 14, 2018 at 3:58 AM, Szabolcs Nagy wrote:
> On 13/11/18 19:39, Dave Martin wrote:
>> On Mon, Nov 12, 2018 at 05:19:14AM -0800, Daniel Colascione wrote:
>>> We should adopt a similar approach. Shipping a lower-level
>>> "liblinux.so" tightly bound to the kernel would not only let the
>>> kernel bypass glibc's "editorial discretion" in exposing new
>>> facilities to userspace, but would also allow for tighter user-kernel
>>> integration that one can achieve with a simplistic syscall(2)-style
>>> escape hatch. (For example, for a long time now, I've wanted to go
>>> beyond POSIX and improve the system's signal handling API, and this
>>> improvement requires userspace cooperation.) The vdso is probably too
>>> small and simplistic to serve in this role; I'd want a real library.
>>
>> Can you expand on your reasoning here?
>
> such lib creates a useless abi+api layer that
> somebody has to maintain and document (with or
> without vdso).

People already maintain the kernel man pages and are very good.

> it obviously cannot work together with a posix
> conform libc implementation for which it would
> require knowledge about

You're incorrect on this point. See programs cobbled together out of
syscall(2) invocations today: despite lack of libc integration, things
do mostly work in practice. Calling through a library can't possibly be
worse, and in many ways can be much better.
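
To make the syscall(2) style concrete, here is a minimal sketch of what
such programs do today. The choice of gettid and the helper name
raw_gettid are only illustrative assumptions on my part; syscall() and
SYS_gettid are the existing glibc and kernel-header names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (name invented for illustration): invoke a system
 * call through the generic syscall(2) escape hatch instead of through a
 * dedicated libc wrapper.
 */
static long raw_gettid(void)
{
    long tid = syscall(SYS_gettid);

    /* gettid cannot fail, so a non-positive value means a broken assumption. */
    assert(tid > 0);
    return tid;
}
```

Nothing here coordinates with libc beyond the generic syscall() entry
point, which is the "mostly works in practice" case described above.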
> thread cancellation internals,

As I mentioned upthread, the only thing a libc needs in order to
support cancellation properly (at least the way glibc does it) is a way
to ask the kernel-provided userspace library whether a particular
program counter address belongs to a certain code sequence immediately
before the system call instruction, whatever that is. Providing this
facility is doable without deep knowledge of libc's internals, and libc
can use it without a deep knowledge of the interface library.

> potentially TLS
> for errno

As someone else mentioned, errno is a libc construct. It's not *hard*
to support setting errno though: libc could just be required to supply
a well-defined libc_set_errno symbol that the kernel ABI library would
then use as needed.

> know libc types even ones that are
> based on compile time feature macros

This library would not have to do the things that libc does. Why would
it have to support libc's feature test macros at all?

> (and expose
> them in headers in a way that does not collide
> with libc headers)

The kernel should have a set of types and a symbol namespace completely
disjoint from libc's, with no compatibility hacks or macros needed.
(That might take some renaming kernel-side.) If libc wants to provide a
POSIX API, it can take on the responsibility for mapping the kernel's
structures to libc's, but within its namespace, the kernel should be
able to add types without fear of conflict.

> abi variants the libc supports
> (e.g. softfp, security hardened abi), libc
> internal signals (for anything that's changing
> signal masks), thread internals for syscalls that
> require coordination between all user created
> threads

Most proposed new system calls do not create threads, manipulate signal
masks, or muck with other internals, so these concerns just don't
apply. That's why syscall(2) mostly works in practice.
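
Going back to the errno and cancellation points above, the narrow
interface could look roughly like the sketch below. This is only an
illustration under stated assumptions: libc_set_errno is the symbol
name suggested above, while kabi_finish, struct kabi_pc_range, and
kabi_pc_before_syscall are hypothetical names, and the half-open
address range is my own choice. The -4095..-1 error window is the usual
Linux raw return convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Supplied by libc in this scheme. Defined here only so that the sketch
 * builds on its own.
 */
void libc_set_errno(int err)
{
    assert(err > 0);
    errno = err;
}

/*
 * Hypothetical interface-library helper: translate the kernel's raw
 * return value (-4095..-1 encodes -errno) into the C convention of
 * returning -1 and reporting the error through libc's hook.
 */
static long kabi_finish(long raw)
{
    if (raw < 0 && raw >= -4095) {
        libc_set_errno((int)-raw);
        return -1;
    }
    return raw;
}

/*
 * Hypothetical description of the code sequence immediately before the
 * system call instruction: [start, end), with end assumed exclusive.
 */
struct kabi_pc_range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * Hypothetical query libc could make when deciding whether a thread that
 * was interrupted at pc may still be cancelled before entering the kernel.
 */
static bool kabi_pc_before_syscall(const struct kabi_pc_range *r, uintptr_t pc)
{
    return pc >= r->start && pc < r->end;
}
```

Neither side needs to see the other's internals here: libc provides one
function, and the interface library answers one question about an
address.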
Even if a few new system calls *do* involve these internal details and
require closer libc coordination, the majority (e.g., the new mount
API, termios2) don't, and so can be exposed directly from the kernel
project without being blocked by glibc.

> (setxid),

A kernel-side fix here would be the cleanest approach.

> libc internal state for syscalls
> that create/destroy threads.
>
> and thus such lib does not solve the problems
> of users who actually requested wrappers for
> new syscalls (since they want to call into libc
> and create threads).
>
> there is a lot of bikesheding here by people who
> don't understand the constraints nor the use-cases.

Conversely, there's a lot of doubt-sowing from the other side that
makes shipping a kernel-provided interface library seem harder than it
is. Most new system calls do not bear on the integration concerns that
you and others are raising, and whatever problems remain can be solved
with a narrow interface between libc and a new interface library, one
that would let both evolve independently.

> an actual proposal in the thread that i think is
> worth considering is to make the linux syscall
> design process involve libc devs so the c api is
> designed together with the syscall abi.

After looking at the history of settid, signal multi-handler
registration, and other proposed improvements running into the brick
wall of glibc's API process, I think it's clear that requiring glibc
signoff on new kernel interfaces would simply lead to stagnation.

It's not as if we're approaching the problem from a position of
ignorance. The right answer is to move to an approximation of the BSD
model and bring the primary interface layer in-house. There's a lot of
evidence that this model works.