From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7F84C2D0E2 for ; Thu, 24 Sep 2020 13:09:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 37EB62311A for ; Thu, 24 Sep 2020 13:09:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Nnwg/d5B" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727948AbgIXNJ4 (ORCPT ); Thu, 24 Sep 2020 09:09:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45602 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727704AbgIXNJz (ORCPT ); Thu, 24 Sep 2020 09:09:55 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 637D5C0613CE; Thu, 24 Sep 2020 06:09:55 -0700 (PDT) Received: by mail-wm1-x343.google.com with SMTP id e2so3549898wme.1; Thu, 24 Sep 2020 06:09:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=SaITC2xFCqSXNs4nD9he4WvhlKQrsR7E/aVI3TI6hoE=; b=Nnwg/d5BfKIwz6FzofVEv/rsQSA+r8ITrJkA/4vXploQm3T9VnIIM+L9R1TP5NXLS8 87lcPyza/TkztZNc066bt04bpn6lxk+RkJUkAGIVihvorQ1VMr45u+CRV173iek4E9BD dssk2KahZFHkzXSbqKDy35qAB/6fn7LFQDbveLmZJL2ESvMMHGQ/TKGYdcifjxp7T7np khN7k+LYTDf1rChp6Sh02rE2jNl5i99/QcYT+fxizhPDeV9LKUvY2o/RY+psh0dWcWzJ +Rcujg32/x0Yb2gvMhmRA5WDAM0wtB3Gnl6rWX2yxukAW5RPJ43SMWAEOjtr25+3FKCw blmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=SaITC2xFCqSXNs4nD9he4WvhlKQrsR7E/aVI3TI6hoE=; b=jSziscH6UXCIv1JpbW0azu6+rUZw9qBGGY99Qk9oYho1Im0s5BkzuP1gro05/LfeBm KbtEEw+6GeCqXXL79K0YDiLK3SNMKVKd2H1nW49rO6jTrzOKl9lLWZM/Lk8YrHFkAOgN l/Ssf+YvebW+VnxtWFaNEKgssNiz3RJ5zJhgz8iNBG82SuYjaIXLyWtyADqgnYYi8tj1 PORqi77i86735cY9tmnTKZu6+VSYDH9ZIIFV+zqFzbnpB1LHqdF6VVDn7pVJfB6yH53T avjX9gcxmsMISrZq9LEKgovBbpSpFxA8yIPHJqn/Qw06PcYLMH0OBKDjMBQrZrtctEQr 2/wg== X-Gm-Message-State: AOAM530sjBFCnV7dzR1//nKO45W+eKlcuGTkkIsPb4uwaUqGFGmHZE6X iYZ9URCLWmRGUu8HuGEFfTk= X-Google-Smtp-Source: ABdhPJze4r4mH5cmyqnhTRx33p9Tw4RS69UmbPIvW7HP9BJ9ztI659XOk00LCCDDMLhxMnqoBE7dKQ== X-Received: by 2002:a1c:a551:: with SMTP id o78mr4782923wme.4.1600952993912; Thu, 24 Sep 2020 06:09:53 -0700 (PDT) Received: from [192.168.1.143] ([170.253.60.68]) by smtp.gmail.com with ESMTPSA id l18sm3699246wrp.84.2020.09.24.06.09.52 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 24 Sep 2020 06:09:53 -0700 (PDT) Subject: Re: [PATCH 12/24] getgrent_r.3: Use sizeof() to get buffer size (instead of hardcoding macro name) To: "Michael Kerrisk (man-pages)" , Stefan Puiu Cc: lnx-man , linux-kernel@vger.kernel.org, Walter Harms References: <20200910211344.3562-1-colomar.6.4.3@gmail.com> <20200910211344.3562-13-colomar.6.4.3@gmail.com> <7dd2ab72-3ce7-1f50-229a-e663c3df2dcd@gmail.com> <40803904-63c6-9770-85df-fe39f430131a@gmail.com> From: Alejandro Colomar Message-ID: Date: Thu, 24 Sep 2020 15:09:52 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: <40803904-63c6-9770-85df-fe39f430131a@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Michael, On 2020-09-24 13:38, Michael Kerrisk (man-pages) wrote: > On 9/24/20 11:35 AM, Alejandro Colomar wrote: >> Hi, >> >> On 2020-09-23 22:35, Michael Kerrisk (man-pages) wrote: >>> On 9/15/20 12:03 PM, Stefan Puiu wrote: >>>> Hi, >>>> >>>> On Fri, Sep 11, 2020 at 6:28 PM Alejandro Colomar >>>> wrote: >>>>> >>>>> Hi Stefan, >>>>> >>>>> On 2020-09-11 16:35, Stefan Puiu wrote: >>>>> > Hi, >>>>> > >>>>> > On Fri, Sep 11, 2020 at 12:15 AM Alejandro Colomar >>>>> > wrote: >>>>> >> >>>>> >> Signed-off-by: Alejandro Colomar >>>>> >> --- >>>>> >> man3/getgrent_r.3 | 2 +- >>>>> >> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >> >>>>> >> diff --git a/man3/getgrent_r.3 b/man3/getgrent_r.3 >>>>> >> index 81d81a851..76deec370 100644 >>>>> >> --- a/man3/getgrent_r.3 >>>>> >> +++ b/man3/getgrent_r.3 >>>>> >> @@ -186,7 +186,7 @@ main(void) >>>>> >> >>>>> >> setgrent(); >>>>> >> while (1) { >>>>> >> - i = getgrent_r(&grp, buf, BUFLEN, &grpp); >>>>> >> + i = getgrent_r(&grp, buf, sizeof(buf), &grpp); >>>>> > >>>>> > I'm worried that less attentive people might copy/paste parts of this >>>>> > in their code, where maybe buf is just a pointer, and expect it to >>>>> > work. Maybe leaving BUFLEN here is useful as a reminder that they need >>>>> > to change something to adapt the code? >>>>> > >>>>> > Just my 2 cents, >>>>> > Stefan. >>>>> > >>>>> That's a very good point. >>>>> >>>>> So we have 3 options and I will propose now a 4th one. Let's see all >>>>> of them and see which one is better for the man pages. >>>>> >>>>> 1.- Use the macro everywhere. >>>>> >>>>> pros: >>>>> - It is still valid when the buffer is a pointer and not an array. >>>>> cons: >>>>> - Hardcodes the initializer. If the array is later initialized with a >>>>> different value, it may produce a silent bug, or a compilation break. >>>>> >>>>> 2.- Use sizeof() everywhere, and the macro for the initializer. >>>>> >>>>> pros: >>>>> - It is valid as long as the buffer is an array. >>>>> cons: >>>>> - If the code gets into a function, and the buffer is then a pointer, >>>>> it will definitively produce a silent bug. >>>>> >>>>> 3.- Use sizeof() everywhere, and a magic number for the initializer. >>>>> >>>>> The same as 2. >>>>> >>>>> 4.- Use ARRAY_BYTES() macro >>>>> >>>>> pros: >>>>> - It is always safe and when code changes, it may break compilation, but >>>>> never a silent bug. >>>>> cons: >>>>> - Add a few lines of code. Maybe too much complexity for an example. >>>>> But I'd say that it is the only safe option, and in real code it >>>>> should probably be used more, so maybe it's good to show a good practice. >>>> >>>> If you ask me, I think examples should be simple and easy to >>>> understand, and easy to copy/paste in your code. I'd settle for easy >>>> enough, not perfect or completely foolproof. If you need to look up >>>> obscure gcc features to understand an example, that's not very >>>> helpful. So I'd be more inclined to prefer version 1 above. But let's >>>> see Michael's opinion on this. >>>> >>>> Just my 2c, >>> >>> So, the fundamental problem is that C is nearly 50 years old. >>> It's a great high-level assembly language, but when it comes >>> to nuances like this it gets pretty painful. One can do macro >>> magic of the kind you suggest, but I agree with Stefan that it >>> gets confusing and distracting for the reader. I think I also >>> lean to solution 1. Yes, it's not perfect, but it's easy to >>> understand, and I don't think we can or should try and solve >>> the broken-ness of C in the manual pages. >>> >>> Thanks, >>> >>> Michael >>> >>> >> >> I was reverting the 3 patches I introduced (they changed from solution 1 >> to solution 2), and also was grepping for already existing solution 2 in >> the pages (it seems that solution 2 was a bit more extended than >> solution 1). >> >> While doing that, I've been thinking about it again... >> >> There's a good thing about sizeof (even though I admit it's very >> insecure; and I never use it for myself), especially for the man pages: >> >> I'll copy here a sample from getnameinfo.3 to ilustrate it: >> >> [[ >> .EX >> struct sockaddr *addr; /* input */ >> socklen_t addrlen; /* input */ >> char hbuf[NI_MAXHOST], sbuf[NI_MAXSERV]; >> >> if (getnameinfo(addr, addrlen, hbuf, sizeof(hbuf), sbuf, >> sizeof(sbuf), NI_NUMERICHOST | NI_NUMERICSERV) == 0) >> printf("host=%s, serv=%s\en", hbuf, sbuf); >> .EE >> ]] >> >> Here, it's clear to the reader that the 4th argument to 'getnameinfo()' >> is the size of the buffer passed as the 3rd argument. >> >> If the function call was changed to >> >> [[ >> getnameinfo(addr, addrlen, hbuf, NI_MAXHOST, sbuf, >> sizeof(sbuf), NI_NUMERICHOST | NI_NUMERICSERV) >> ]] >> >> then it would be less clear, and the reader should go back and forth to >> see where that comes from. In this short example it is relatively very >> clear, but in some examples it might be less clear. >> >> Would you maintain your preference for solution 1? >> >> >> Also... I am trying to patch glibc to provide a safe version of >> 'nitems()', and shortly after they accept that patch (if they do), I'll >> send another one to add a safe 'array_bytes()' based on 'nitems()'. >> >> Maybe the examples could use 'array_bytes()'; although is will be a >> glibc extension, and non-existent in any other POSIX systems, of course, >> which would make the examples non-portable, but still can be solved with >> a simple >> >> [[ >> #if !defined(array_bytes) >> #define array_bytes() sizeof() >> #endif >> ]] >> >> But again it complicates the examples... > > (And I'd rather not...) > >> I'm not sure at all about what should be done. Please comment. If you >> still prefer solution 1, I'll send you a patch with the revert + fixes, >> but I think it's very delicate. > > Okay -- I agree with your perspective of the getnameinfo example. > > So, I think maybe solution 1 is clearer sometimes, and other times > solution 2 is clearer. I don't feel too strongly about this Me neither. > (because we can't solve the bugger problem, which is C). > I'm fine with not reverting the three patches you > refer to. I'm going to leave this decision to you :-)! > > Thanks, > > Michael > > I'll keep it as is now. :) Thanks, Alex