From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD616C64E8A for ; Mon, 30 Nov 2020 14:57:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5B38320725 for ; Mon, 30 Nov 2020 14:57:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="gXENwOv6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726931AbgK3O4u (ORCPT ); Mon, 30 Nov 2020 09:56:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726026AbgK3O4t (ORCPT ); Mon, 30 Nov 2020 09:56:49 -0500 Received: from mail-vs1-xe42.google.com (mail-vs1-xe42.google.com [IPv6:2607:f8b0:4864:20::e42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0B96C0613CF for ; Mon, 30 Nov 2020 06:56:09 -0800 (PST) Received: by mail-vs1-xe42.google.com with SMTP id m62so6448431vsd.3 for ; Mon, 30 Nov 2020 06:56:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=HXVfP4Y76rU0Aug30Y/qKy5GYl2DQt7dptkdehAzvQo=; b=gXENwOv6lUpcbpwOAHUOjhiwXU2mFlK+DNwWtIa8ADJpTwZ+Y6D6Q40ueMa6WYAp8r I1fxf86D5F+K0WSVwH1ugjHc2ywYpY7aE1huZaOyUgWYfvSPlHUL/8fUuoHkpCbVwL3w 5a5tE2Gb03bkc6iTZYh1+jEmRHlKWuhD256VQLDlHLplHBZECFmnbcH++WJF7Lx9Pqv5 QsiWcC/kgsKeENrMihoRxYT39nq7ESwmOwKtNSZaxJQkg3EP/3UPuXFm0FqsPyvJ7V3T OKq9kOk3V3hLNt3/vQ0uI2nUQ9vvGKYKQNNI5FlSq3Dtixs7DN0ylRCxXsgMiThhaMxl tKlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=HXVfP4Y76rU0Aug30Y/qKy5GYl2DQt7dptkdehAzvQo=; b=ehMesgHpSRkZegjqBhbFVpwrQYDB7PVuMlLc2kwkeDfXD/HimD8Pfgl2ywIy/pe5W2 u7j32mBkFIBWBo15/HIB0HA9CD/HnpLajialt0szfG0QeBd8JcvJ3LgEwrDlRgdmofrL VOgN8q5rZ/Xa5ywfmFL7AYWDZL6aCtlGgMWSU1foIX4MRytldiH1Ln13zLWQg8jXnr9D MpRV4zpMXkaYRF0WFzNNxwLmPaDzwbK1m6N/7myRT4tT6yVKeZQ4aQ2+LNuwCospPFtA uYNzds7AmbRDRK6F1d8s1OejtL2gIi611onTiPSuzP+EWyCoYFqOX1nHzAqNkfwZT6uw vT4w== X-Gm-Message-State: AOAM533q0olW+S8kvmQuNf0U2u9BZ+cRQYxg+n5k3uxAfo5Dy8S6PeDo dmkeWh9R7+jrOhCTaMgwV83u1Ezcd0U9hJVPmb+lalJX9Ak= X-Google-Smtp-Source: ABdhPJzHN5moboIauytrtaUrQtGhAOX7gFFzZG56lrSuXLgBg0t3NsV0tRhib3voYqE7KpHJ4WKd11X65ujoZAXTUA4= X-Received: by 2002:a05:6102:154:: with SMTP id a20mr15518114vsr.50.1606748168436; Mon, 30 Nov 2020 06:56:08 -0800 (PST) MIME-Version: 1.0 References: <5d1b946ab5473504e9599d90c0feda407a179a05.1601568663.git.gitgitgadget@gmail.com> <20201002035851.GB3252492@google.com> <20201012165730.GC3740546@google.com> In-Reply-To: <20201012165730.GC3740546@google.com> From: Han-Wen Nienhuys Date: Mon, 30 Nov 2020 15:55:57 +0100 Message-ID: Subject: Re: [PATCH v2 02/13] reftable: define the public API To: Jonathan Nieder Cc: Han-Wen Nienhuys via GitGitGadget , git , Jeff King , Han-Wen Nienhuys Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Mon, Oct 12, 2020 at 6:57 PM Jonathan Nieder wrote: > > We could, but it would considerably complicate work on this patch > > series, as the commit boundary then doesn't fall on file boundaries > > anymore. Would you be open to having multiple headers for the public > > interface? eg. reftable-record.h, reftable-reader.h etc. ? > > I'd be more than open to it: I think that would make for a clearer API. Done. > >> Should these call BUG()? Or is it useful for some callers to be able > >> to recover from these errors? > > > > Since this was written as a standalone library, it's up to the caller > > to decide what should be done. > > I'm not strongly opinionated about this, but just a quick note: this > implies that the library would need to make sure it is producing a > valid state in error cases. This is what it already does. I tried to clarify this in some places. > Does the API documentation describe what state a handle is in after > an error --- e.g., what operations are permitted after that? API_ERROR should not result in state changes. > >>> +int reftable_error_to_errno(int err); > >> > >> What is the intended use of this function? > > > > The read_raw_ref method in the ref backend API uses errno values as > > out-of-band communication mechanism. > > Could we change that? It sounds error-prone. I moved it into refs/reftable-backend.c; I'd rather not also have to clean up the files backend as part of this work, though. > >> Do I pass in the 'struct block_source *' as the source arg? If so, wh= y > >> are these declared as void *? > > > > you pass in block_source->arg. Should struct implementations of > > polymorphic types carry a pointer to their own vtable instead? > > I think a pointer to block_source would make it more self-explanatory, > yes. This has a number of downsides, though. It means I have to introduce separate structs strbuf_blocksource and file_blocksource, along with functions to create and deallocate them, so it makes the whole thing more unwieldy to use. > >> Is the reason this manages the buffer instead of requiring a > >> caller-supplied buffer to support zero-copy? > > > > Log blocks are compressed, so the caller doesn't know the correct size > > to supply. By letting the block source handle the management, we can > > swap out the block read from the file for a block managed by malloc on > > decompressing a log block. > > Hm, my naive assumption would have been that we'd use different > buffers for the compressed and uncompressed data. Maybe you can comment on how to solve this differently when we get to the implementation of block reading/writing in the commit series. > >>> + Generic tables > >>> + > >>> + A unified API for reading tables, either merged tables, or single r= eaders. > >> > >> Are there callers/helpers that don't know whether they want one or the > >> other? > > > > The setup with per-worktree refs means that there are two reftable > > stacks in a .git repo. In order to iterate over the entire ref space, > > you have to merge two merged tables, but the stack itslef merges a set > > of simple reftables. > > I see. That's subtle, so it seems worth documenting for the next > person reading this documentation and wondering why. Done. --=20 Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. -- Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Gesch=C3=A4ftsf=C3=BChrer: Paul Manicle, Halimah DeLaine Prado