* [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
@ 2022-08-26 21:07 Alejandro Colomar
  2022-08-27 11:10 ` Ingo Schwarze
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-08-26 21:07 UTC
  To: linux-man; +Cc: Alejandro Colomar, Ingo Schwarze, JeanHeyd Meneide

The WG14 charter for C23 added one principle to the ones in
previous standards:

[
15.  Application Programming Interfaces (APIs) should be
self-documenting when possible.  In particular, the order of
parameters in function declarations should be arranged such that
the size of an array appears before the array.  The purpose is to
allow Variable-Length Array (VLA) notation to be used. This not
only makes the code's purpose clearer to human readers, but also
makes static analysis easier.  Any new APIs added to the Standard
should take this into consideration.
]

ISO C doesn't allow using VLA syntax when the parameter used for
the size of the array is declared _after_ the parameter that is a
VLA.  That's a minor issue that could easily be changed in the
language without backwards-compatibility issues, and in fact it
seems to have been proposed, and not yet discarded, even if it's
not going to change in C23.
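
For comparison, here is a minimal sketch of the ordering that ISO C
already accepts, with the size declared before the array.
fill_buf() is a hypothetical helper made up for illustration; it is
not a libc function, and the trailing comment shows the ordering
that ISO C currently rejects:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * Valid ISO C since C99: 'n' is declared before it is used in the
 * array declarator.  The parameter still adjusts to 'char *'; the
 * bound documents how many bytes the callee may touch.
 */
static void fill_buf(size_t n, char buf[n], char c)
{
    assert(buf != NULL);

    if (n == 0)
        return;

    memset(buf, c, n - 1);   /* fill all but the last byte */
    buf[n - 1] = '\0';       /* always terminate within the bound */
}

/*
 * Not valid ISO C: 'n' is not yet in scope when 'buf' is declared.
 *
 *     static void fill_buf2(char buf[n], size_t n, char c);
 */
```

Calling fill_buf(sizeof(buf), buf, 'x') on a 'char buf[8]' leaves
seven 'x' characters followed by a NUL in the buffer.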

Since the SYNOPSIS sections of the manual pages are not bound by
strictly legal C syntax, and we already use some "tricks" to
convey the most information to the reader even when the result is
not valid C, we can also make a small compromise in this case,
using illegal syntax (or at least not yet legalized) to add
important information to the function prototypes.

If we're lucky, compiler authors, and maybe even WG14 members, may
be satisfied by the syntax used in these manual pages, and may
decide to add this feature to the language.

It seems to me a sound syntax that isn't ambiguous, even if it
deviates from the common pattern in C that declarations _always_
come before use.  It's a reasonable tradeoff.

This change will make the contract between the programmer and the
implementation clearer just by reading a prototype.  For example:

    size_t strlcpy(char *restrict dst, const char *restrict src,
                   size_t size);

      vs

    size_t strlcpy(char dst[restrict size], const char *restrict src,
                   size_t size);

The second prototype above makes it clear that the function will
write at most 'size' bytes to 'dst', so that buffer is safe from
overflow, whereas 'src' clearly needs to be NUL-terminated;
otherwise the call causes UB, since nothing tells the function how
long it is.
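
As a rough sketch of that contract in C that compiles today, the
helper below takes the size first so that the bound on 'dst' is
legal syntax.  copy_bounded() is a hypothetical name chosen for
illustration; it only mimics the truncating behavior described
above and is not the libc strlcpy() implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strlcpy-like helper (an assumption, not a libc API).
 * 'dst' is bounded by 'size', so at most 'size' bytes are written.
 * 'src' carries no bound: it must be NUL-terminated, because
 * strlen() is the only way to learn its length.
 * Returns the length of 'src', so callers can detect truncation.
 */
static size_t copy_bounded(size_t size, char dst[restrict size],
                           const char *restrict src)
{
    size_t len;

    assert(src != NULL);
    len = strlen(src);

    if (size != 0) {
        /* Copy what fits, leaving room for the terminator. */
        size_t n = (len < size - 1) ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

A return value greater than or equal to 'size' tells the caller
that the copy was truncated.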

Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2611.htm>
Cc: Ingo Schwarze <schwarze@openbsd.org>
Cc: JeanHeyd Meneide <wg14@soasis.org>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man3/confstr.3            |  2 +-
 man3/des_crypt.3          |  6 ++++--
 man3/fgetc.3              |  2 +-
 man3/getcwd.3             |  2 +-
 man3/getdirentries.3      |  3 ++-
 man3/getgrent_r.3         |  4 ++--
 man3/getgrnam.3           |  4 ++--
 man3/gethostbyname.3      |  8 ++++----
 man3/getlogin.3           |  2 +-
 man3/getmntent.3          |  2 +-
 man3/getnameinfo.3        |  6 +++---
 man3/getnetent_r.3        |  6 +++---
 man3/getprotoent_r.3      |  6 +++---
 man3/getpwent_r.3         |  4 ++--
 man3/getpwnam.3           |  8 ++++----
 man3/getrpcent_r.3        |  6 +++---
 man3/getservent_r.3       |  6 +++---
 man3/getspnam.3           |  8 ++++----
 man3/inet_net_pton.3      |  2 +-
 man3/inet_ntop.3          |  2 +-
 man3/mblen.3              |  2 +-
 man3/mbrlen.3             |  2 +-
 man3/mbrtowc.3            |  5 ++---
 man3/mbstowcs.3           |  3 ++-
 man3/mbtowc.3             |  4 ++--
 man3/mq_receive.3         |  5 +++--
 man3/mq_send.3            |  4 ++--
 man3/printf.3             |  4 ++--
 man3/pthread_setname_np.3 |  3 ++-
 man3/ptsname.3            |  4 ++--
 man3/random.3             |  2 +-
 man3/random_r.3           |  3 ++-
 man3/regex.3              |  7 ++++---
 man3/resolver.3           | 34 ++++++++++++++++++----------------
 man3/rpc.3                |  2 +-
 man3/setaliasent.3        |  6 ++++--
 man3/setbuf.3             |  4 ++--
 man3/setnetgrent.3        |  2 +-
 man3/stpncpy.3            |  5 +++--
 man3/strcasecmp.3         |  2 +-
 man3/strcat.3             |  5 +++--
 man3/strcmp.3             |  2 +-
 man3/strcpy.3             |  5 +++--
 man3/strdup.3             |  4 ++--
 man3/strerror.3           |  4 ++--
 man3/strfmon.3            |  4 ++--
 man3/strfromd.3           |  6 +++---
 man3/strftime.3           |  4 ++--
 man3/string.3             | 25 +++++++++++++++++--------
 man3/strnlen.3            |  2 +-
 man3/strxfrm.3            |  3 ++-
 man3/ttyname.3            |  2 +-
 man3/unlocked_stdio.3     |  4 ++--
 man3/wcsnrtombs.3         |  2 +-
 man3/wcsrtombs.3          |  2 +-
 man3/wcstombs.3           |  2 +-
 56 files changed, 146 insertions(+), 122 deletions(-)

diff --git a/man3/confstr.3 b/man3/confstr.3
index 5bc334c02..434ab9678 100644
--- a/man3/confstr.3
+++ b/man3/confstr.3
@@ -20,7 +20,7 @@ Standard C library
 .nf
 .B #include <unistd.h>
 .PP
-.BI "size_t confstr(int " "name" ", char *" buf ", size_t " size );
+.BI "size_t confstr(int " "name" ", char " buf [ size "], size_t " size );
 .fi
 .PP
 .RS -4
diff --git a/man3/des_crypt.3 b/man3/des_crypt.3
index 90ce308b9..f419ab026 100644
--- a/man3/des_crypt.3
+++ b/man3/des_crypt.3
@@ -22,9 +22,11 @@ Standard C library
 .\" .B #include <des_crypt.h>
 .B #include <rpc/des_crypt.h>
 .PP
-.BI "int ecb_crypt(char *" key ", char *" data ", unsigned int " datalen ,
+.BI "int ecb_crypt(char *" key ", char " data [ datalen "], \
+unsigned int " datalen ,
 .BI "              unsigned int " mode );
-.BI "int cbc_crypt(char *" key ", char *" data ", unsigned int " datalen ,
+.BI "int cbc_crypt(char *" key ", char " data [ datalen "], \
+unsigned int " datalen ,
 .BI "              unsigned int " mode ", char *" ivec );
 .PP
 .BI "void des_setparity(char *" key );
diff --git a/man3/fgetc.3 b/man3/fgetc.3
index 2cd14a5fb..690cbce80 100644
--- a/man3/fgetc.3
+++ b/man3/fgetc.3
@@ -18,7 +18,7 @@ Standard C library
 .BI "int getc(FILE *" stream );
 .B "int getchar(void);"
 .PP
-.BI "char *fgets(char *restrict " s ", int " size ", FILE *restrict " stream );
+.BI "char *fgets(char " s "[restrict " size "], int " size ", FILE *restrict " stream );
 .PP
 .BI "int ungetc(int " c ", FILE *" stream );
 .fi
diff --git a/man3/getcwd.3 b/man3/getcwd.3
index 382bade77..82f573115 100644
--- a/man3/getcwd.3
+++ b/man3/getcwd.3
@@ -19,7 +19,7 @@ Standard C library
 .nf
 .B #include <unistd.h>
 .PP
-.BI "char *getcwd(char *" buf ", size_t " size );
+.BI "char *getcwd(char " buf [ size "], size_t " size );
 .BI "char *getwd(char *" buf );
 .B "char *get_current_dir_name(void);"
 .fi
diff --git a/man3/getdirentries.3 b/man3/getdirentries.3
index ce8ee69a8..eadc3c86e 100644
--- a/man3/getdirentries.3
+++ b/man3/getdirentries.3
@@ -14,7 +14,8 @@ Standard C library
 .nf
 .B #include <dirent.h>
 .PP
-.BI "ssize_t getdirentries(int " fd ", char *restrict " buf ", size_t " nbytes ,
+.BI "ssize_t getdirentries(int " fd ", char " buf "[restrict " nbytes "], \
+size_t " nbytes ,
 .BI "                      off_t *restrict " basep );
 .fi
 .PP
diff --git a/man3/getgrent_r.3 b/man3/getgrent_r.3
index 8a47bf59e..e1eeb31c7 100644
--- a/man3/getgrent_r.3
+++ b/man3/getgrent_r.3
@@ -13,10 +13,10 @@ Standard C library
 .B #include <grp.h>
 .PP
 .BI "int getgrent_r(struct group *restrict " gbuf ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct group **restrict " gbufp );
 .BI "int fgetgrent_r(FILE *restrict " stream ", struct group *restrict " gbuf ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct group **restrict " gbufp );
 .fi
 .PP
diff --git a/man3/getgrnam.3 b/man3/getgrnam.3
index 7ef37819f..ab46a7570 100644
--- a/man3/getgrnam.3
+++ b/man3/getgrnam.3
@@ -26,10 +26,10 @@ Standard C library
 .PP
 .BI "int getgrnam_r(const char *restrict " name \
 ", struct group *restrict " grp ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct group **restrict " result );
 .BI "int getgrgid_r(gid_t " gid ", struct group *restrict " grp ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct group **restrict " result );
 .fi
 .PP
diff --git a/man3/gethostbyname.3 b/man3/gethostbyname.3
index 20ad562be..1c7182245 100644
--- a/man3/gethostbyname.3
+++ b/man3/gethostbyname.3
@@ -49,24 +49,24 @@ Standard C library
 .BI "struct hostent *gethostbyname2(const char *" name ", int " af );
 .PP
 .BI "int gethostent_r(struct hostent *restrict " ret ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct hostent **restrict " result ,
 .BI "                 int *restrict " h_errnop );
 .PP
 .BI "int gethostbyaddr_r(const void *restrict " addr ", socklen_t " len \
 ", int " type ,
 .BI "                 struct hostent *restrict " ret ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct hostent **restrict " result ,
 .BI "                 int *restrict " h_errnop );
 .BI "int gethostbyname_r(const char *restrict " name ,
 .BI "                 struct hostent *restrict " ret ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct hostent **restrict " result ,
 .BI "                 int *restrict " h_errnop );
 .BI "int gethostbyname2_r(const char *restrict " name ", int " af,
 .BI "                 struct hostent *restrict " ret ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct hostent **restrict " result ,
 .BI "                 int *restrict " h_errnop );
 .fi
diff --git a/man3/getlogin.3 b/man3/getlogin.3
index 50b8b008b..8604f2bcd 100644
--- a/man3/getlogin.3
+++ b/man3/getlogin.3
@@ -16,7 +16,7 @@ Standard C library
 .B #include <unistd.h>
 .PP
 .B "char *getlogin(void);"
-.BI "int getlogin_r(char *" buf ", size_t " bufsize );
+.BI "int getlogin_r(char " buf [ bufsize "], size_t " bufsize );
 .PP
 .B #include <stdio.h>
 .PP
diff --git a/man3/getmntent.3 b/man3/getmntent.3
index 3c704b1d8..41746d9eb 100644
--- a/man3/getmntent.3
+++ b/man3/getmntent.3
@@ -37,7 +37,7 @@ Standard C library
 .PP
 .BI "struct mntent *getmntent_r(FILE *restrict " streamp ,
 .BI "              struct mntent *restrict " mntbuf ,
-.BI "              char *restrict " buf ", int " buflen );
+.BI "              char " buf "[restrict " buflen "], int " buflen );
 .fi
 .PP
 .RS -4
diff --git a/man3/getnameinfo.3 b/man3/getnameinfo.3
index 5c42c09b6..b72c37117 100644
--- a/man3/getnameinfo.3
+++ b/man3/getnameinfo.3
@@ -20,9 +20,9 @@ Standard C library
 .PP
 .BI "int getnameinfo(const struct sockaddr *restrict " addr \
 ", socklen_t " addrlen ,
-.BI "                char *restrict " host ", socklen_t " hostlen ,
-.BI "                char *restrict " serv ", socklen_t " servlen \
-", int " flags );
+.BI "                char " host "[restrict " hostlen "], socklen_t " hostlen ,
+.BI "                char " serv "[restrict " servlen "], socklen_t " servlen ,
+.BI "                int " flags );
 .fi
 .PP
 .RS -4
diff --git a/man3/getnetent_r.3 b/man3/getnetent_r.3
index 36a2ff819..55322df27 100644
--- a/man3/getnetent_r.3
+++ b/man3/getnetent_r.3
@@ -15,17 +15,17 @@ Standard C library
 .B #include <netdb.h>
 .PP
 .BI "int getnetent_r(struct netent *restrict " result_buf ,
-.BI "                char *restrict " buf ", size_t " buflen ,
+.BI "                char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                struct netent **restrict " result ,
 .BI "                int *restrict " h_errnop );
 .BI "int getnetbyname_r(const char *restrict " name ,
 .BI "                struct netent *restrict " result_buf ,
-.BI "                char *restrict " buf ", size_t " buflen ,
+.BI "                char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                struct netent **restrict " result ,
 .BI "                int *restrict " h_errnop );
 .BI "int getnetbyaddr_r(uint32_t " net ", int " type ,
 .BI "                struct netent *restrict " result_buf ,
-.BI "                char *restrict " buf ", size_t " buflen ,
+.BI "                char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                struct netent **restrict " result ,
 .BI "                int *restrict " h_errnop );
 .PP
diff --git a/man3/getprotoent_r.3 b/man3/getprotoent_r.3
index 2e3815a30..34ae75634 100644
--- a/man3/getprotoent_r.3
+++ b/man3/getprotoent_r.3
@@ -15,15 +15,15 @@ Standard C library
 .B #include <netdb.h>
 .PP
 .BI "int getprotoent_r(struct protoent *restrict " result_buf ,
-.BI "                  char *restrict " buf ", size_t " buflen ,
+.BI "                  char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                  struct protoent **restrict " result );
 .BI "int getprotobyname_r(const char *restrict " name ,
 .BI "                  struct protoent *restrict " result_buf ,
-.BI "                  char *restrict " buf ", size_t " buflen ,
+.BI "                  char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                  struct protoent **restrict " result );
 .BI "int getprotobynumber_r(int " proto ,
 .BI "                  struct protoent *restrict " result_buf ,
-.BI "                  char *restrict " buf ", size_t " buflen ,
+.BI "                  char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                  struct protoent **restrict " result );
 .PP
 .fi
diff --git a/man3/getpwent_r.3 b/man3/getpwent_r.3
index bde13f399..03826578c 100644
--- a/man3/getpwent_r.3
+++ b/man3/getpwent_r.3
@@ -13,11 +13,11 @@ Standard C library
 .B #include <pwd.h>
 .PP
 .BI "int getpwent_r(struct passwd *restrict " pwbuf ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct passwd **restrict " pwbufp );
 .BI "int fgetpwent_r(FILE *restrict " stream \
 ", struct passwd *restrict " pwbuf ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct passwd **restrict " pwbufp );
 .fi
 .PP
diff --git a/man3/getpwnam.3 b/man3/getpwnam.3
index 219d37733..d711a4c4a 100644
--- a/man3/getpwnam.3
+++ b/man3/getpwnam.3
@@ -28,12 +28,12 @@ Standard C library
 .BI "struct passwd *getpwnam(const char *" name );
 .BI "struct passwd *getpwuid(uid_t " uid );
 .PP
-.BI "int getpwnam_r(const char *restrict " name \
-", struct passwd *restrict " pwd ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "int getpwnam_r(const char *restrict " name ", \
+struct passwd *restrict " pwd ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct passwd **restrict " result );
 .BI "int getpwuid_r(uid_t " uid ", struct passwd *restrict " pwd ,
-.BI "               char *restrict " buf ", size_t " buflen ,
+.BI "               char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "               struct passwd **restrict " result );
 .fi
 .PP
diff --git a/man3/getrpcent_r.3 b/man3/getrpcent_r.3
index 44d20b7ed..74182fd83 100644
--- a/man3/getrpcent_r.3
+++ b/man3/getrpcent_r.3
@@ -14,13 +14,13 @@ Standard C library
 .nf
 .B #include <netdb.h>
 .PP
-.BI "int getrpcent_r(struct rpcent *" result_buf ", char *" buf ,
+.BI "int getrpcent_r(struct rpcent *" result_buf ", char " buf [ buflen ],
 .BI "                size_t " buflen ", struct rpcent **" result );
 .BI "int getrpcbyname_r(const char *" name ,
-.BI "                struct rpcent *" result_buf ", char *" buf ,
+.BI "                struct rpcent *" result_buf ", char " buf [ buflen ],
 .BI "                size_t " buflen ", struct rpcent **" result );
 .BI "int getrpcbynumber_r(int " number ,
-.BI "                struct rpcent *" result_buf ", char *" buf ,
+.BI "                struct rpcent *" result_buf ", char " buf [ buflen ],
 .BI "                size_t " buflen ", struct rpcent **" result );
 .PP
 .fi
diff --git a/man3/getservent_r.3 b/man3/getservent_r.3
index 4e7b1f03d..6d9c578b4 100644
--- a/man3/getservent_r.3
+++ b/man3/getservent_r.3
@@ -15,17 +15,17 @@ Standard C library
 .B #include <netdb.h>
 .PP
 .BI "int getservent_r(struct servent *restrict " result_buf ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct servent **restrict " result );
 .BI "int getservbyname_r(const char *restrict " name ,
 .BI "                 const char *restrict " proto ,
 .BI "                 struct servent *restrict " result_buf ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct servent **restrict " result );
 .BI "int getservbyport_r(int " port ,
 .BI "                 const char *restrict " proto ,
 .BI "                 struct servent *restrict " result_buf ,
-.BI "                 char *restrict " buf ", size_t " buflen ,
+.BI "                 char " buf "[restrict " buflen "], size_t " buflen ,
 .BI "                 struct servent **restrict " result );
 .PP
 .fi
diff --git a/man3/getspnam.3 b/man3/getspnam.3
index 3389105ab..db5f8f5f8 100644
--- a/man3/getspnam.3
+++ b/man3/getspnam.3
@@ -34,14 +34,14 @@ Standard C library
 .B #include <shadow.h>
 .PP
 .BI "int getspent_r(struct spwd *" spbuf ,
-.BI "               char *" buf ", size_t " buflen ", struct spwd **" spbufp );
+.BI "               char " buf [ buflen "], size_t " buflen ", struct spwd **" spbufp );
 .BI "int getspnam_r(const char *" name ", struct spwd *" spbuf ,
-.BI "               char *" buf ", size_t " buflen ", struct spwd **" spbufp );
+.BI "               char " buf [ buflen "], size_t " buflen ", struct spwd **" spbufp );
 .PP
 .BI "int fgetspent_r(FILE *" stream ", struct spwd *" spbuf ,
-.BI "               char *" buf ", size_t " buflen ", struct spwd **" spbufp );
+.BI "               char " buf [ buflen "], size_t " buflen ", struct spwd **" spbufp );
 .BI "int sgetspent_r(const char *" s ", struct spwd *" spbuf ,
-.BI "               char *" buf ", size_t " buflen ", struct spwd **" spbufp );
+.BI "               char " buf [ buflen "], size_t " buflen ", struct spwd **" spbufp );
 .fi
 .PP
 .RS -4
diff --git a/man3/inet_net_pton.3 b/man3/inet_net_pton.3
index 8dce6b299..c7d477695 100644
--- a/man3/inet_net_pton.3
+++ b/man3/inet_net_pton.3
@@ -15,7 +15,7 @@ Resolver library
 .BI "int inet_net_pton(int " af ", const char *" pres ,
 .BI "                    void *" netp ", size_t " nsize );
 .BI "char *inet_net_ntop(int " af ", const void *" netp ", int " bits ,
-.BI "                    char *" pres ", size_t " psize );
+.BI "                    char " pres [ psize "], size_t " psize );
 .fi
 .PP
 .RS -4
diff --git a/man3/inet_ntop.3 b/man3/inet_ntop.3
index b06c268bd..6f73b33fa 100644
--- a/man3/inet_ntop.3
+++ b/man3/inet_ntop.3
@@ -14,7 +14,7 @@ Standard C library
 .B #include <arpa/inet.h>
 .PP
 .BI "const char *inet_ntop(int " af ", const void *restrict " src ,
-.BI "                      char *restrict " dst ", socklen_t " size );
+.BI "                      char " dst "[restrict " size "], socklen_t " size );
 .fi
 .SH DESCRIPTION
 This function converts the network address structure
diff --git a/man3/mblen.3 b/man3/mblen.3
index ae7b38f1b..de826f2b8 100644
--- a/man3/mblen.3
+++ b/man3/mblen.3
@@ -18,7 +18,7 @@ Standard C library
 .nf
 .B #include <stdlib.h>
 .PP
-.BI "int mblen(const char *" s ", size_t " n );
+.BI "int mblen(const char " s [ n "], size_t " n );
 .fi
 .SH DESCRIPTION
 If
diff --git a/man3/mbrlen.3 b/man3/mbrlen.3
index 35c2b8db5..4522d2cac 100644
--- a/man3/mbrlen.3
+++ b/man3/mbrlen.3
@@ -18,7 +18,7 @@ Standard C library
 .nf
 .B #include <wchar.h>
 .PP
-.BI "size_t mbrlen(const char *restrict " s ", size_t " n ,
+.BI "size_t mbrlen(const char " s "[restrict " n "], size_t " n ,
 .BI "              mbstate_t *restrict " ps );
 .fi
 .SH DESCRIPTION
diff --git a/man3/mbrtowc.3 b/man3/mbrtowc.3
index b91c0fbc2..1de0f1ba7 100644
--- a/man3/mbrtowc.3
+++ b/man3/mbrtowc.3
@@ -19,9 +19,8 @@ Standard C library
 .nf
 .B #include <wchar.h>
 .PP
-.BI "size_t mbrtowc(wchar_t *restrict " pwc ", const char *restrict " s \
-", size_t " n ,
-.BI "               mbstate_t *restrict " ps );
+.BI "size_t mbrtowc(wchar_t *restrict " pwc ", const char " s "[restrict " n ],
+.BI "               size_t " n ", mbstate_t *restrict " ps );
 .fi
 .SH DESCRIPTION
 The main case for this function is when
diff --git a/man3/mbstowcs.3 b/man3/mbstowcs.3
index 30a2a8679..67b8e569e 100644
--- a/man3/mbstowcs.3
+++ b/man3/mbstowcs.3
@@ -19,7 +19,8 @@ Standard C library
 .nf
 .B #include <stdlib.h>
 .PP
-.BI "size_t mbstowcs(wchar_t *restrict " dest ", const char *restrict " src ,
+.BI "size_t mbstowcs(wchar_t " dest "[restrict " n "], \
+const char *restrict " src ,
 .BI "                size_t " n );
 .fi
 .SH DESCRIPTION
diff --git a/man3/mbtowc.3 b/man3/mbtowc.3
index b0a25ae12..18dca1957 100644
--- a/man3/mbtowc.3
+++ b/man3/mbtowc.3
@@ -18,8 +18,8 @@ Standard C library
 .nf
 .B #include <stdlib.h>
 .PP
-.BI "int mbtowc(wchar_t *restrict " pwc ", const char *restrict " s \
-", size_t " n );
+.BI "int mbtowc(wchar_t *restrict " pwc ", const char " s "[restrict " n "], \
+size_t " n );
 .fi
 .SH DESCRIPTION
 The main case for this function is when
diff --git a/man3/mq_receive.3 b/man3/mq_receive.3
index 94f686d97..f43df785f 100644
--- a/man3/mq_receive.3
+++ b/man3/mq_receive.3
@@ -12,13 +12,14 @@ Real-time library
 .nf
 .B #include <mqueue.h>
 .PP
-.BI "ssize_t mq_receive(mqd_t " mqdes ", char *" msg_ptr ,
+.BI "ssize_t mq_receive(mqd_t " mqdes ", char " msg_ptr [ msg_len ],
 .BI "                   size_t " msg_len ", unsigned int *" msg_prio );
 .PP
 .B #include <time.h>
 .B #include <mqueue.h>
 .PP
-.BI "ssize_t mq_timedreceive(mqd_t " mqdes ", char *restrict " msg_ptr ,
+.BI "ssize_t mq_timedreceive(mqd_t " mqdes ", \
+char " msg_ptr "[restrict " msg_len ],
 .BI "                   size_t " msg_len ", unsigned int *restrict " msg_prio ,
 .BI "                   const struct timespec *restrict " abs_timeout );
 .fi
diff --git a/man3/mq_send.3 b/man3/mq_send.3
index 26947595a..6f147d4fb 100644
--- a/man3/mq_send.3
+++ b/man3/mq_send.3
@@ -12,13 +12,13 @@ Real-time library
 .nf
 .B #include <mqueue.h>
 .PP
-.BI "int mq_send(mqd_t " mqdes ", const char *" msg_ptr ,
+.BI "int mq_send(mqd_t " mqdes ", const char " msg_ptr [ msg_len ],
 .BI "              size_t " msg_len ", unsigned int " msg_prio );
 .PP
 .B #include <time.h>
 .B #include <mqueue.h>
 .PP
-.BI "int mq_timedsend(mqd_t " mqdes ", const char *" msg_ptr ,
+.BI "int mq_timedsend(mqd_t " mqdes ", const char " msg_ptr [ msg_len ],
 .BI "              size_t " msg_len ", unsigned int " msg_prio ,
 .BI "              const struct timespec *" abs_timeout );
 .fi
diff --git a/man3/printf.3 b/man3/printf.3
index 878f95791..5099b6f72 100644
--- a/man3/printf.3
+++ b/man3/printf.3
@@ -30,7 +30,7 @@ Standard C library
 .BI "            const char *restrict " format ", ...);"
 .BI "int sprintf(char *restrict " str ,
 .BI "            const char *restrict " format ", ...);"
-.BI "int snprintf(char *restrict " str ", size_t " size ,
+.BI "int snprintf(char " str "[restrict " size "], size_t " size ,
 .BI "            const char *restrict " format ", ...);"
 .PP
 .B #include <stdarg.h>
@@ -42,7 +42,7 @@ Standard C library
 .BI "            const char *restrict " format ", va_list " ap );
 .BI "int vsprintf(char *restrict " str ,
 .BI "            const char *restrict " format ", va_list " ap );
-.BI "int vsnprintf(char *restrict " str ", size_t " size ,
+.BI "int vsnprintf(char " str "[restrict " size "], size_t " size ,
 .BI "            const char *restrict " format ", va_list " ap );
 .fi
 .PP
diff --git a/man3/pthread_setname_np.3 b/man3/pthread_setname_np.3
index 115557787..2bab13e85 100644
--- a/man3/pthread_setname_np.3
+++ b/man3/pthread_setname_np.3
@@ -15,7 +15,8 @@ POSIX threads library
 .B #include <pthread.h>
 .PP
 .BI "int pthread_setname_np(pthread_t " thread ", const char *" name );
-.BI "int pthread_getname_np(pthread_t " thread ", char *" name ", size_t " size );
+.BI "int pthread_getname_np(pthread_t " thread ", char " name [ size "], \
+size_t " size );
 .fi
 .SH DESCRIPTION
 By default, all the threads created using
diff --git a/man3/ptsname.3 b/man3/ptsname.3
index e40005df6..135730752 100644
--- a/man3/ptsname.3
+++ b/man3/ptsname.3
@@ -14,8 +14,8 @@ Standard C library
 .nf
 .B #include <stdlib.h>
 .PP
-.BI "char *ptsname(int " fd ");"
-.BI "int ptsname_r(int " fd ", char *" buf ", size_t " buflen ");"
+.BI "char *ptsname(int " fd );
+.BI "int ptsname_r(int " fd ", char " buf [ buflen "], size_t " buflen );
 .fi
 .PP
 .RS -4
diff --git a/man3/random.3 b/man3/random.3
index fd2512626..3a7af437a 100644
--- a/man3/random.3
+++ b/man3/random.3
@@ -23,7 +23,7 @@ Standard C library
 .B long random(void);
 .BI "void srandom(unsigned int " seed );
 .PP
-.BI "char *initstate(unsigned int " seed ", char *" state ", size_t " n );
+.BI "char *initstate(unsigned int " seed ", char " state [ n "], size_t " n );
 .BI "char *setstate(char *" state );
 .fi
 .PP
diff --git a/man3/random_r.3 b/man3/random_r.3
index b2bf97b06..8564e1723 100644
--- a/man3/random_r.3
+++ b/man3/random_r.3
@@ -18,7 +18,8 @@ Standard C library
 .BI "             int32_t *restrict " result );
 .BI "int srandom_r(unsigned int " seed ", struct random_data *" buf );
 .PP
-.BI "int initstate_r(unsigned int " seed ", char *restrict " statebuf ,
+.BI "int initstate_r(unsigned int " seed ", \
+char " statebuf "[restrict " statelen ],
 .BI "             size_t " statelen ", struct random_data *restrict " buf );
 .BI "int setstate_r(char *restrict " statebuf ,
 .BI "             struct random_data *restrict " buf );
diff --git a/man3/regex.3 b/man3/regex.3
index e423e442d..ae66b7980 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -21,11 +21,12 @@ Standard C library
 .BI "            int " cflags );
 .BI "int regexec(const regex_t *restrict " preg \
 ", const char *restrict " string ,
-.BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict]\
-, int " eflags );
+.BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict " nmatch ],
+.BI "            int " eflags );
 .PP
 .BI "size_t regerror(int " errcode ", const regex_t *restrict " preg ,
-.BI "            char *restrict " errbuf ", size_t " errbuf_size );
+.BI "            char " errbuf "[restrict " errbuf_size "], \
+size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
 .fi
 .SH DESCRIPTION
diff --git a/man3/resolver.3 b/man3/resolver.3
index b565bb5e6..c701b4629 100644
--- a/man3/resolver.3
+++ b/man3/resolver.3
@@ -35,34 +35,35 @@ Resolver library
 .PP
 .BI "int res_nquery(res_state " statep ,
 .BI "           const char *" dname ", int " class ", int " type ,
-.BI "           unsigned char *" answer ", int " anslen );
+.BI "           unsigned char " answer [ anslen "], int " anslen );
 .PP
 .BI "int res_nsearch(res_state " statep ,
 .BI "           const char *" dname ", int " class ", int " type ,
-.BI "           unsigned char *" answer ", int " anslen );
+.BI "           unsigned char " answer [ anslen "], int " anslen );
 .PP
 .BI "int res_nquerydomain(res_state " statep ,
 .BI "           const char *" name ", const char *" domain ,
-.BI "           int " class ", int " type ", unsigned char *" answer ,
+.BI "           int " class ", int " type ", unsigned char " answer [ anslen ],
 .BI "           int " anslen );
 .PP
 .BI "int res_nmkquery(res_state " statep ,
 .BI "           int " op ", const char *" dname ", int " class ,
-.BI "           int " type ", const unsigned char *" data ", int " datalen ,
+.BI "           int " type ", const unsigned char " data [ datalen "], \
+int " datalen ,
 .BI "           const unsigned char *" newrr ,
-.BI "           unsigned char *" buf ", int " buflen );
+.BI "           unsigned char " buf [ buflen "], int " buflen );
 .PP
 .BI "int res_nsend(res_state " statep ,
-.BI "           const unsigned char *" msg ", int " msglen ,
-.BI "           unsigned char *" answer ", int " anslen );
+.BI "           const unsigned char " msg [ msglen "], int " msglen ,
+.BI "           unsigned char " answer [ anslen "], int " anslen );
 .PP
-.BI "int dn_comp(const char *" exp_dn ", unsigned char *" comp_dn ,
+.BI "int dn_comp(const char *" exp_dn ", unsigned char " comp_dn [ length ],
 .BI "           int " length ", unsigned char **" dnptrs ,
 .BI "           unsigned char **" lastdnptr );
 .PP
 .BI "int dn_expand(const unsigned char *" msg ,
 .BI "           const unsigned char *" eomorig ,
-.BI "           const unsigned char *" comp_dn ", char *" exp_dn ,
+.BI "           const unsigned char *" comp_dn ", char " exp_dn [ length ],
 .BI "           int " length );
 .fi
 .\"
@@ -73,22 +74,23 @@ Resolver library
 .B int res_init(void);
 .PP
 .BI "int res_query(const char *" dname ", int " class ", int " type ,
-.BI "           unsigned char *" answer ", int " anslen );
+.BI "           unsigned char " answer [ anslen "], int " anslen );
 .PP
 .BI "int res_search(const char *" dname ", int " class ", int " type ,
-.BI "           unsigned char *" answer ", int " anslen );
+.BI "           unsigned char " answer [ anslen "], int " anslen );
 .PP
 .BI "int res_querydomain(const char *" name ", const char *" domain ,
-.BI "           int " class ", int " type ", unsigned char *" answer ,
+.BI "           int " class ", int " type ", unsigned char " answer [ anslen ],
 .BI "           int " anslen );
 .PP
 .BI "int res_mkquery(int " op ", const char *" dname ", int " class ,
-.BI "           int " type ", const unsigned char *" data ", int " datalen ,
+.BI "           int " type ", const unsigned char " data [ datalen "], \
+int " datalen ,
 .BI "           const unsigned char *" newrr ,
-.BI "           unsigned char *" buf ", int " buflen );
+.BI "           unsigned char " buf [ buflen "], int " buflen );
 .PP
-.BI "int res_send(const unsigned char *" msg ", int " msglen ,
-.BI "           unsigned char *" answer ", int " anslen );
+.BI "int res_send(const unsigned char " msg [ msglen "], int " msglen ,
+.BI "           unsigned char " answer [ anslen "], int " anslen );
 .fi
 .SH DESCRIPTION
 .B Note:
diff --git a/man3/rpc.3 b/man3/rpc.3
index b0cfc52e7..80fdd5dc4 100644
--- a/man3/rpc.3
+++ b/man3/rpc.3
@@ -74,7 +74,7 @@ This is the default authentication used by RPC.
 .PP
 .nf
 .BI "AUTH *authunix_create(char *" host ", uid_t " uid ", gid_t " gid ,
-.BI "                      int " len ", gid_t *" aup_gids );
+.BI "                      int " len ", gid_t " aup_gids [ len ]);
 .fi
 .IP
 Create and return an RPC authentication handle that contains
diff --git a/man3/setaliasent.3 b/man3/setaliasent.3
index 9d3cfb968..f4401608b 100644
--- a/man3/setaliasent.3
+++ b/man3/setaliasent.3
@@ -20,13 +20,15 @@ Standard C library
 .PP
 .B "struct aliasent *getaliasent(void);"
 .BI "int getaliasent_r(struct aliasent *restrict " result ,
-.BI "                     char *restrict " buffer ", size_t " buflen ,
+.BI "                     char " buffer "[restrict " buflen "], \
+size_t " buflen ,
 .BI "                     struct aliasent **restrict " res );
 .PP
 .BI "struct aliasent *getaliasbyname(const char *" name );
 .BI "int getaliasbyname_r(const char *restrict " name ,
 .BI "                     struct aliasent *restrict " result ,
-.BI "                     char *restrict " buffer ", size_t " buflen ,
+.BI "                     char " buffer "[restrict " buflen "], \
+size_t " buflen ,
 .BI "                     struct aliasent **restrict " res );
 .fi
 .SH DESCRIPTION
diff --git a/man3/setbuf.3 b/man3/setbuf.3
index 4a62952d7..8c72b8e0a 100644
--- a/man3/setbuf.3
+++ b/man3/setbuf.3
@@ -27,11 +27,11 @@ Standard C library
 .nf
 .B #include <stdio.h>
 .PP
-.BI "int setvbuf(FILE *restrict " stream ", char *restrict " buf ,
+.BI "int setvbuf(FILE *restrict " stream ", char " buf "[restrict " size ],
 .BI "            int " mode ", size_t " size );
 .PP
 .BI "void setbuf(FILE *restrict " stream ", char *restrict " buf );
-.BI "void setbuffer(FILE *restrict " stream ", char *restrict " buf ,
+.BI "void setbuffer(FILE *restrict " stream ", char " buf "[restrict " size ],
 .BI "            size_t "  size );
 .BI "void setlinebuf(FILE *" stream );
 .fi
diff --git a/man3/setnetgrent.3 b/man3/setnetgrent.3
index 3625adf14..9cfda3c83 100644
--- a/man3/setnetgrent.3
+++ b/man3/setnetgrent.3
@@ -23,7 +23,7 @@ Standard C library
 .BI "            char **restrict " user ", char **restrict " domain );
 .BI "int getnetgrent_r(char **restrict " host ,
 .BI "            char **restrict " user ", char **restrict " domain ,
-.BI "            char *restrict " buf ", size_t " buflen );
+.BI "            char " buf "[restrict " buflen "], size_t " buflen );
 .PP
 .BI "int innetgr(const char *" netgroup ", const char *" host ,
 .BI "            const char *" user ", const char *" domain );
diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
index c057845ac..5dd7cc96d 100644
--- a/man3/stpncpy.3
+++ b/man3/stpncpy.3
@@ -16,8 +16,9 @@ Standard C library
 .nf
 .B #include <string.h>
 .PP
-.BI "char *stpncpy(char *restrict " dest ", const char *restrict " src \
-", size_t " n );
+.BI "char *stpncpy(char " dest "[restrict " n "], \
+const char " src "[restrict " n "],
+.BI "              size_t " n );
 .fi
 .PP
 .RS -4
diff --git a/man3/strcasecmp.3 b/man3/strcasecmp.3
index 58a22349e..e94c79966 100644
--- a/man3/strcasecmp.3
+++ b/man3/strcasecmp.3
@@ -18,7 +18,7 @@ Standard C library
 .B #include <strings.h>
 .PP
 .BI "int strcasecmp(const char *" s1 ", const char *" s2 );
-.BI "int strncasecmp(const char *" s1 ", const char *" s2 ", size_t " n );
+.BI "int strncasecmp(const char " s1 [ n "], const char " s2 [ n "], size_t " n );
 .fi
 .SH DESCRIPTION
 The
diff --git a/man3/strcat.3 b/man3/strcat.3
index 5738bb9be..ff4c91307 100644
--- a/man3/strcat.3
+++ b/man3/strcat.3
@@ -20,8 +20,9 @@ Standard C library
 .B #include <string.h>
 .PP
 .BI "char *strcat(char *restrict " dest ", const char *restrict " src );
-.BI "char *strncat(char *restrict " dest ", const char *restrict " src \
-", size_t " n );
+.BI "char *strncat(char " dest "[restrict " n "], \
+const char " src "[restrict " n ],
+.BI "              size_t " n );
 .fi
 .SH DESCRIPTION
 The
diff --git a/man3/strcmp.3 b/man3/strcmp.3
index 933011b9c..fc5bf1a70 100644
--- a/man3/strcmp.3
+++ b/man3/strcmp.3
@@ -21,7 +21,7 @@ Standard C library
 .B #include <string.h>
 .PP
 .BI "int strcmp(const char *" s1 ", const char *" s2 );
-.BI "int strncmp(const char *" s1 ", const char *" s2 ", size_t " n );
+.BI "int strncmp(const char " s1 [ n "], const char " s2 [ n "], size_t " n );
 .fi
 .SH DESCRIPTION
 The
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 461b811a5..50543cf7b 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -23,8 +23,9 @@ Standard C library
 .B #include <string.h>
 .PP
 .BI "char *strcpy(char *restrict " dest ", const char *" src );
-.BI "char *strncpy(char *restrict " dest ", const char *restrict " src \
-", size_t " n );
+.BI "char *strncpy(char " dest "[restrict " n "], \
+const char " src "[restrict " n ],
+.BI "              size_t " n );
 .fi
 .SH DESCRIPTION
 The
diff --git a/man3/strdup.3 b/man3/strdup.3
index 7d15245b4..ea24a61bd 100644
--- a/man3/strdup.3
+++ b/man3/strdup.3
@@ -20,9 +20,9 @@ Standard C library
 .PP
 .BI "char *strdup(const char *" s );
 .PP
-.BI "char *strndup(const char *" s ", size_t " n );
+.BI "char *strndup(const char " s [ n "], size_t " n );
 .BI "char *strdupa(const char *" s );
-.BI "char *strndupa(const char *" s ", size_t " n );
+.BI "char *strndupa(const char " s [ n "], size_t " n );
 .fi
 .PP
 .RS -4
diff --git a/man3/strerror.3 b/man3/strerror.3
index 8857ddb4e..c1621372b 100644
--- a/man3/strerror.3
+++ b/man3/strerror.3
@@ -31,10 +31,10 @@ Standard C library
 .BI "const char *strerrorname_np(int " errnum );
 .BI "const char *strerrordesc_np(int " errnum );
 .PP
-.BI "int strerror_r(int " errnum ", char *" buf ", size_t " buflen );
+.BI "int strerror_r(int " errnum ", char " buf [ buflen "], size_t " buflen );
                /* XSI-compliant */
 .PP
-.BI "char *strerror_r(int " errnum ", char *" buf ", size_t " buflen );
+.BI "char *strerror_r(int " errnum ", char " buf [ buflen "], size_t " buflen );
                /* GNU-specific */
 .PP
 .BI "char *strerror_l(int " errnum ", locale_t " locale );
diff --git a/man3/strfmon.3 b/man3/strfmon.3
index 40342a900..41b22b95e 100644
--- a/man3/strfmon.3
+++ b/man3/strfmon.3
@@ -12,9 +12,9 @@ Standard C library
 .nf
 .B #include <monetary.h>
 .PP
-.BI "ssize_t strfmon(char *restrict " s ", size_t " max ,
+.BI "ssize_t strfmon(char " s "[restrict " max "], size_t " max ,
 .BI "                const char *restrict " format ", ...);"
-.BI "ssize_t strfmon_l(char *restrict " s ", size_t " max ", locale_t " locale ,
+.BI "ssize_t strfmon_l(char " s "[restrict " max "], size_t " max ", locale_t " locale ,
 .BI "                const char *restrict " format ", ...);"
 .fi
 .SH DESCRIPTION
diff --git a/man3/strfromd.3 b/man3/strfromd.3
index a936489a1..6c4df845c 100644
--- a/man3/strfromd.3
+++ b/man3/strfromd.3
@@ -20,11 +20,11 @@ Standard C library
 .nf
 .B #include <stdlib.h>
 .PP
-.BI "int strfromd(char *restrict " str ", size_t " n ,
+.BI "int strfromd(char " str "[restrict " n "], size_t " n ,
 .BI "             const char *restrict " format ", double " fp ");"
-.BI "int strfromf(char *restrict " str ", size_t " n ,
+.BI "int strfromf(char " str "[restrict " n "], size_t " n ,
 .BI "             const char *restrict " format ", float "fp ");"
-.BI "int strfroml(char *restrict " str ", size_t " n ,
+.BI "int strfroml(char " str "[restrict " n "], size_t " n ,
 .BI "             const char *restrict " format ", long double " fp ");"
 .fi
 .PP
diff --git a/man3/strftime.3 b/man3/strftime.3
index 9a10275ca..0fb2c3123 100644
--- a/man3/strftime.3
+++ b/man3/strftime.3
@@ -24,11 +24,11 @@ Standard C library
 .nf
 .B #include <time.h>
 .PP
-.BI "size_t strftime(char *restrict " s ", size_t " max ,
+.BI "size_t strftime(char " s "[restrict " max "], size_t " max ,
 .BI "                const char *restrict " format ,
 .BI "                const struct tm *restrict " tm );
 .PP
-.BI "size_t strftime_l(char *restrict " s ", size_t " max ,
+.BI "size_t strftime_l(char " s "[restrict " max "], size_t " max ,
 .BI "                const char *restrict " format ,
 .BI "                const struct tm *restrict " tm ,
 .BI "                locale_t " locale );
diff --git a/man3/string.3 b/man3/string.3
index ec5ed0bd9..2db0f80eb 100644
--- a/man3/string.3
+++ b/man3/string.3
@@ -26,7 +26,7 @@ and
 .I s2
 ignoring case.
 .TP
-.BI "int strncasecmp(const char *" s1 ", const char *" s2 ", size_t " n );
+.BI "int strncasecmp(const char " s1 [ n "], const char " s2 [ n "], size_t " n );
 Compare the first
 .I n
 bytes of the strings
@@ -112,8 +112,11 @@ Randomly swap the characters in
 Return the length of the string
 .IR s .
 .TP
-.BI "char *strncat(char *restrict " dest ", const char *restrict " src \
-", size_t " n );
+.nf
+.BI "char *strncat(char " dest "[restrict " n "], \
+const char " src "[restrict " n ],
+.BI "       size_t " n );
+.fi
 Append at most
 .I n
 bytes from the string
@@ -123,7 +126,7 @@ to the string
 returning a pointer to
 .IR dest .
 .TP
-.BI "int strncmp(const char *" s1 ", const char *" s2 ", size_t " n );
+.BI "int strncmp(const char " s1 [ n "], const char " s2 [ n "], size_t " n );
 Compare at most
 .I n
 bytes of the strings
@@ -131,8 +134,11 @@ bytes of the strings
 and
 .IR s2 .
 .TP
-.BI "char *strncpy(char *restrict " dest ", const char *restrict " src \
-", size_t " n );
+.nf
+.BI "char *strncpy(char " dest "[restrict " n "], \
+const char " src "[restrict " n ],
+.BI "       size_t " n );
+.fi
 Copy at most
 .I n
 bytes from string
@@ -179,8 +185,11 @@ Extract tokens from the string
 that are delimited by one of the bytes in
 .IR delim .
 .TP
-.BI "size_t strxfrm(char *restrict " dst ", const char *restrict " src \
-", size_t " n );
+.nf
+.BI "size_t strxfrm(char " dst "[restrict " n "], \
+const char " src "[restrict " n ],
+.BI "        size_t " n );
+.fi
 Transforms
 .I src
 to the current locale and copies the first
diff --git a/man3/strnlen.3 b/man3/strnlen.3
index 3cf575735..6df8e0d03 100644
--- a/man3/strnlen.3
+++ b/man3/strnlen.3
@@ -15,7 +15,7 @@ Standard C library
 .nf
 .B #include <string.h>
 .PP
-.BI "size_t strnlen(const char *" s ", size_t " maxlen );
+.BI "size_t strnlen(const char " s [ maxlen "], size_t " maxlen );
 .fi
 .PP
 .RS -4
diff --git a/man3/strxfrm.3 b/man3/strxfrm.3
index 909aed1df..df623a186 100644
--- a/man3/strxfrm.3
+++ b/man3/strxfrm.3
@@ -17,7 +17,8 @@ Standard C library
 .nf
 .B #include <string.h>
 .PP
-.BI "size_t strxfrm(char *restrict " dest ", const char *restrict " src ,
+.BI "size_t strxfrm(char " dest "[restrict " n "], \
+const char " src "[restrict " n ],
 .BI "               size_t " n );
 .fi
 .SH DESCRIPTION
diff --git a/man3/ttyname.3 b/man3/ttyname.3
index 39d253356..4e37d6cbf 100644
--- a/man3/ttyname.3
+++ b/man3/ttyname.3
@@ -16,7 +16,7 @@ Standard C library
 .B #include <unistd.h>
 .PP
 .BI "char *ttyname(int " fd );
-.BI "int ttyname_r(int " fd ", char *" buf ", size_t " buflen );
+.BI "int ttyname_r(int " fd ", char " buf [ buflen "], size_t " buflen );
 .fi
 .SH DESCRIPTION
 The function
diff --git a/man3/unlocked_stdio.3 b/man3/unlocked_stdio.3
index f87b57779..cb9de40f6 100644
--- a/man3/unlocked_stdio.3
+++ b/man3/unlocked_stdio.3
@@ -33,7 +33,7 @@ Standard C library
 ", size_t " n ,
 .BI "                      FILE *restrict " stream );
 .PP
-.BI "char *fgets_unlocked(char *restrict " s ", int " n \
+.BI "char *fgets_unlocked(char " s "[restrict " n "], int " n \
 ", FILE *restrict " stream );
 .BI "int fputs_unlocked(const char *restrict " s ", FILE *restrict " stream );
 .PP
@@ -47,7 +47,7 @@ Standard C library
 .BI "wint_t putwc_unlocked(wchar_t " wc ", FILE *" stream );
 .BI "wint_t putwchar_unlocked(wchar_t " wc );
 .PP
-.BI "wchar_t *fgetws_unlocked(wchar_t *restrict " ws ", int " n ,
+.BI "wchar_t *fgetws_unlocked(wchar_t " ws "[restrict " n "], int " n ,
 .BI "                      FILE *restrict " stream );
 .BI "int fputws_unlocked(const wchar_t *restrict " ws ,
 .BI "                      FILE *restrict " stream );
diff --git a/man3/wcsnrtombs.3 b/man3/wcsnrtombs.3
index ef9aeba4c..bc0a9c64f 100644
--- a/man3/wcsnrtombs.3
+++ b/man3/wcsnrtombs.3
@@ -17,7 +17,7 @@ Standard C library
 .nf
 .B #include <wchar.h>
 .PP
-.BI "size_t wcsnrtombs(char *restrict " dest ", const wchar_t **restrict " src ,
+.BI "size_t wcsnrtombs(char " dest "[restrict " len "], const wchar_t **restrict " src ,
 .BI "                  size_t " nwc ", size_t " len \
 ", mbstate_t *restrict " ps );
 .fi
diff --git a/man3/wcsrtombs.3 b/man3/wcsrtombs.3
index aed7024b7..335498663 100644
--- a/man3/wcsrtombs.3
+++ b/man3/wcsrtombs.3
@@ -18,7 +18,7 @@ Standard C library
 .nf
 .B #include <wchar.h>
 .PP
-.BI "size_t wcsrtombs(char *restrict " dest ", const wchar_t **restrict " src ,
+.BI "size_t wcsrtombs(char " dest "[restrict " len "], const wchar_t **restrict " src ,
 .BI "                 size_t " len ", mbstate_t *restrict " ps );
 .fi
 .SH DESCRIPTION
diff --git a/man3/wcstombs.3 b/man3/wcstombs.3
index 547381f7e..7c2394b36 100644
--- a/man3/wcstombs.3
+++ b/man3/wcstombs.3
@@ -18,7 +18,7 @@ Standard C library
 .nf
 .B #include <stdlib.h>
 .PP
-.BI "size_t wcstombs(char *restrict " dest ", const wchar_t *restrict " src ,
+.BI "size_t wcstombs(char " dest "[restrict " n "], const wchar_t *restrict " src ,
 .BI "                size_t " n );
 .fi
 .SH DESCRIPTION
-- 
2.37.2



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-08-26 21:07 [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters Alejandro Colomar
@ 2022-08-27 11:10 ` Ingo Schwarze
  2022-08-27 12:15   ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Ingo Schwarze @ 2022-08-27 11:10 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, JeanHeyd Meneide

Hi Alejandro,

> -.BI "char *getcwd(char *" buf ", size_t " size );
> +.BI "char *getcwd(char " buf [ size "], size_t " size );

I dislike this.

Manual pages should show function prototypes as they really are in
the header file, or if the header file contains useless fluff like
"restrict", a shortened form showing the essence that actually matters
for using the API.  They should certainly not show something imaginary
that does not match reality, and even less so using invalid syntax.

Yours,
  Ingo


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-08-27 11:10 ` Ingo Schwarze
@ 2022-08-27 12:15   ` Alejandro Colomar
  2022-08-27 13:08     ` Ingo Schwarze
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-08-27 12:15 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: linux-man, JeanHeyd Meneide



Hi Ingo,

On 8/27/22 13:10, Ingo Schwarze wrote:
> Hi Alejandro,
> 
>> -.BI "char *getcwd(char *" buf ", size_t " size );
>> +.BI "char *getcwd(char " buf [ size "], size_t " size );
> 
> I dislike this.
> 
> Manual pages should show function prototypes as they really are in
> the header file, or if the header file contains useless fluff like
> "restrict", a shortened form showing the essence that actually matters
> for using the API.

Regarding restrict, it is essential to differentiate memcpy(3) and 
memmove(3), which are otherwise identical:

     void *memmove(void *dest, const void *src, size_t n);

     void *memcpy(void *restrict dest, const void *restrict src,
                  size_t n);

I guess you will argue that the description specified the difference, so 
it's not necessary in the synopsis.  That's true.  But reality is that 
programmers have historically not cared about those details; so much 
that glibc had to provide a compat symbol for old programs, which 
basically maps memcpy(3) to memmove(3) for code linked against old glibc 
versions.
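
To make the difference concrete, here is a minimal sketch.  The helper
name shift_right_by_one() is made up for illustration and appears in no
library; the only real call is memmove(3).  Source and destination
overlap here, which is exactly the case that the restrict qualifiers in
the memcpy(3) prototype rule out:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: shift the first n bytes of buf one position to
 * the right.  buf must have room for at least n + 1 bytes.
 *
 * The source [buf, buf + n) and the destination [buf + 1, buf + 1 + n)
 * overlap, so memmove(3) is required.  Calling memcpy(3) here would be
 * undefined behavior, because its parameters are restrict-qualified.
 */
static void shift_right_by_one(char *buf, size_t n)
{
    assert(buf != NULL);
    memmove(buf + 1, buf, n);
}
```

A caller with 'char buf[8] = "abcdef";' can call
shift_right_by_one(buf, 6) and will find "aabcdef" in the buffer.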

In some cases, like in memcpy(3), the use of restrict is important; in 
others, such as in printf(3), it is irrelevant.  But for consistency, I
decided to use restrict wherever either POSIX or glibc used it
(assuming that POSIX would never remove a restrict qualifier if ISO C
required it).  Where glibc and POSIX differed, I used the more
restrictive prototype.

I didn't make that restrict change without concerns about it being too
noisy.  I had them, and still have them.  But I think the value added is
greater than the value removed.  The prototypes are now more precise,
and overcoming the noise shouldn't be too much of a problem.

In the case of (abusing) VLA syntax, it's more or less the same thing, 
with a bit of added WTF moments about the "Why is this code using an 
identifier declared right after it?  Is it a typo?".  I guess the WTF 
moments will be more relevant the first few months, and less so when 
time passes and programmers get used to the syntax.
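
For comparison, this is a sketch of the form that is already valid: when
the size parameter comes first, VLA notation compiles today with
compilers that support C99 variably modified parameter types (GCC and
Clang do).  The function count_zeros() is a made-up example, not an
existing API.  With the order used in the patch, as in
'char *getcwd(char buf[size], size_t size);', the identifier is not yet
in scope where the array declarator uses it, so I'd expect the
declaration to be rejected:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example: count the null bytes in buf[0..n-1].
 *
 * n is declared before the array parameter that uses it, so the
 * declarator 'const char buf[n]' is accepted.  The parameter is still
 * adjusted to 'const char *'; the [n] documents the expected size.
 */
static size_t count_zeros(size_t n, const char buf[n])
{
    size_t zeros = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\0')
            zeros++;
    }
    return zeros;
}
```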

I used strlcpy(3) in the commit message on purpose, as it's a great
example, just as good as the memcpy(3) one.  Its competitor in the
Linux kernel (promoted as such) is strscpy(9), which is not available
to user space.  They seem to be the same thing, but they are not.
Let's show their prototypes:

     size_t strlcpy(char dst[size], const char *src, size_t size);

     ssize_t strscpy(char dst[size], const char src[size], size_t size);

From those prototypes, I can already see that strscpy(9) accepts a
possibly-not-terminated source string, while strlcpy(3) requires that
the source string is terminated.  I left out restrict here to show the
difference in VLA syntax more clearly (thereby admitting that restrict
does add a bit of noise).

Then again, there's no difference in the prototypes between
strscpy(9) and strncpy(3), apart from the return type:

     char *strncpy(char dest[n], const char src[n], size_t n);

And yet they are different functions: strscpy(9) guarantees that the
produced string is NUL-terminated, and strncpy(3) does not (it also
unnecessarily zero-fills the rest of the buffer, so strncpy(3) is just
broken).  But they are more or less in the same league, as both are
used for transforming untrusted strings into proper strings (strncpy(3)
only if you call it with sizeof(buf) - 1 and terminate the buffer
yourself), and that's shown by the prototypes.
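
A minimal sketch of that strncpy(3) usage, under the assumption that
the caller wants truncation rather than an error.  The wrapper name
copy_truncating() is invented for this example; only strncpy(3) itself
is a real function:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical wrapper: copy src into dst[size], truncating if needed,
 * and always leaving dst NUL-terminated.
 *
 * strncpy(3) alone does not terminate dst when src has size - 1 or
 * more bytes before its NUL, so the last byte is set explicitly.  When
 * src is shorter, strncpy(3) zero-fills the remainder of the range it
 * was given.
 */
static void copy_truncating(size_t size, char dst[size], const char *src)
{
    assert(size > 0);
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}
```

With 'char buf[4];', copy_truncating(sizeof(buf), buf, "abcdefgh")
leaves "abc" in buf.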

Do you regard the (abused) VLA syntax as something much worse than the 
use of restrict?  Or are they more or less equivalent to you?

>  They should certainly not show something imaginary
> that does not match reality, and even less so using invalid syntax.

Well, not that I haven't had those thoughts, but we already use illegal
syntax in some cases for good reasons.  See for example open(2):

        int open(const char *pathname, int flags);
        int open(const char *pathname, int flags, mode_t mode);

Of course, you can't declare two conflicting prototypes like that.  But
it shows that those are the only two ways you can use it.  I'll admit
that a long time ago I told Michael that we should fix those prototypes
to match reality, with legal syntax, because otherwise they are
confusing.  But with time, I got used to that weirdness, and it now
seems to me more informative than the bare '...' that FreeBSD and
OpenBSD document.

> 
> Yours,
>    Ingo

Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-08-27 12:15   ` Alejandro Colomar
@ 2022-08-27 13:08     ` Ingo Schwarze
  2022-08-27 18:38       ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Ingo Schwarze @ 2022-08-27 13:08 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, JeanHeyd Meneide

Hi Alejandro,

Alejandro Colomar wrote on Sat, Aug 27, 2022 at 02:15:32PM +0200:
> On 8/27/22 13:10, Ingo Schwarze wrote:
>> Alejandro Colomar wrote:

>>> -.BI "char *getcwd(char *" buf ", size_t " size );
>>> +.BI "char *getcwd(char " buf [ size "], size_t " size );

>> I dislike this.
>> 
>> Manual pages should show function prototypes as they really are in
>> the header file, or if the header file contains useless fluff like
>> "restrict", a shortened form showing the essence that actually matters
>> for using the API.

> Regarding restrict, it is essential to differentiate memcpy(3) and 
> memmove(3), which are otherwise identical:
> 
>      void *memmove(void *dest, const void *src, size_t n);
> 
>      void *memcpy(void *restrict dest, const void *restrict src,
>                   size_t n);

Actually, the syntax of both is identical, only the semantics differ.

That said, you are right that using memcpy(3) when memmove(3) is
required is a famous and widespread bug.  I doubt putting "restrict"
into the SYNOPSIS will discourage careless programmers from making that
mistake though.

To me, "restrict" feels like a specialized tool for people writing
compiler optimizers, not like something important enough to clutter
API documentation.

> Do you regard the (abused) VLA syntax as something much worse than the 
> use of restrict?  Or are they more or less equivalent to you?

If your implementation really contains "restrict" in the header
file and it's standardized, putting it into the SYNOPSIS seems
acceptable to me.  Not necessary though and maybe somewhat noisy
and distracting.

Putting something that is not in the implementation and/or not
in the standard into the SYNOPSIS seems much worse to me.

And invalid syntax in the SYNOPSIS is even worse than that.
For example, people may attempt to use SYNOPSIS as an example
when designing their own, private function for a similar but
not identical purpose and end up writing non-portable code,
or even code that does not compile anywhere.

They may be wrong if they blame you for that, but i doubt they
will thank you.

>> They should certainly not show something imaginary
>> that does not match reality, and even less so using invalid syntax.

> Well, not that I haven't had those thoughts, but we already use illegal
> syntax in some cases for good reasons.  See for example open(2):
> 
>         int open(const char *pathname, int flags);
>         int open(const char *pathname, int flags, mode_t mode);
> 
> Of course, you can't declare two conflicting prototypes like that.

This does not seem quite as horrifying as

  char *getcwd(char buf[size], size_t size);

because at least each of the prototypes is valid.

My main concern about it would be that it is likely to make some people
think (and C++ programmers in particular :-/) that there is type
checking for the third and subsequent arguments, in which case they
will be unpleasantly surprised when accidentally writing something like

  open(pathname, flags, &some_var, mode);

and finding out later that it compiled and ran just fine, but the
resulting file wasn't quite as confidential as they hoped.

Explicitly displaying the ... to indicate the variable number of
arguments, by contrast, makes it very clear that an API is almost
certainly unusually dangerous and needs to be used with especial
diligence.

Either way, certainly not quite as bad as invalid syntax inside
a prototype...

Yours,
  Ingo


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-08-27 13:08     ` Ingo Schwarze
@ 2022-08-27 18:38       ` Alejandro Colomar
  2022-08-28 11:24         ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-08-27 18:38 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: linux-man, JeanHeyd Meneide



Hi Ingo,

On 8/27/22 15:08, Ingo Schwarze wrote:
[...]

>>       void *memmove(void *dest, const void *src, size_t n);
>>
>>       void *memcpy(void *restrict dest, const void *restrict src,
>>                    size_t n);
> 
> Actually, the syntax of both is identical, only the semantics differ.
> 
> That said, you are right that using memcpy(3) when memmove(3) is
> required is a famous and widespread bug.  I doubt putting "restrict"
> into the SYNOPSIS will discourage careless programmers from making that
> mistake though.

There will always be completely careless programmers, and I won't 
attempt to target those.  But I hope to increase the percentage of 
population that receives the message with this change.

There are a lot of good programmers out there who are unaware of what
we'd probably consider basic stuff.  I've seen very successful
programmers using pointer types to store offsets, where ptrdiff_t should
be used; and of course the code is full of casts (including to
uintptr_t) to silence the hundreds of warnings from the compiler yelling
at that blasphemy (3 casts in a line of code that should be just
'q = p + offset;').

I don't think it's only carelessness (okay, there's a bit of it), but
more that some programmers still live in the last century, and don't
know that 'const' and, more recently, 'restrict' were added to the
language.  Did they live in a cave?  So it seems.

> 
> To me, "restrict" feels like a specialized tool for people writing
> compiler optimizers, not like something important enough to clutter
> API documentation.
> 
>> Do you regard the (abused) VLA syntax as something much worse than the
>> use of restrict?  Or are they more or less equivalent to you?
> 
> If your implementation really contains "restrict" in the header
> file and it's standardized, putting it into the SYNOPSIS seems
> acceptable to me.  Not necessary though and maybe somewhat noisy
> and distracting.

I would document restrict even if glibc omitted it from its
prototypes, because the manual pages document the behavior, and are not
required to be uninformative when the implementation is.  memcpy(3)
requires that its buffers don't overlap, which is what restrict
expresses, even if the implementation doesn't advertise it.  The
run-time behavior will produce bugs if the buffers overlap, and so the
documentation had better say so.

Should that only be limited to the DESCRIPTION?  Maybe.  I don't like 
that idea, when we have the language to express that more precisely and 
concisely.

What is the usefulness of a prototype that is as short as possible?
Okay, it's less noisy, but at the cost of giving less information to an 
interested reader.

     void *memcpy(void dest[restrict n], const void src[restrict n],
                  size_t n);

     void *memcpy(void *restrict dest, const void *restrict src,
                  size_t n);

     void *memcpy(void *dest, const void *src, size_t n);

     void *memcpy(void *dest, void *src, size_t n);

     void *memcpy(void *, void *, size_t);

Where do we stop?  Okay, I pushed it too far in the first one.  I
didn't dare to propose that, since VLA syntax with void is, right now, a
shooting offense.  GNU C allows pointer arithmetic on 'void *', but for
some reason, when those pointers are disguised as arrays, the code is
rejected.  Give it some decades, though.  I think it would be useful.

These prototypes give several levels of information, from most basic, to 
most precise:

- number of arguments.
- type of the arguments.
- small description of what they mean (through the names).
- are pointers only used for reading, or are the pointees also modified?
- can pointers point to the same storage?
- how many elements/bytes has the storage pointed to by pointers?

And, not yet in those prototypes, I'd also like to give information
about whether pointers are allowed to be null.  I'm still not sure how
to document that.

Why draw the line between what is useful and what is noise at one
specific point?  The description could document the const-ness of a
parameter just as well as it documents the restrict-ness or the
null-ness.  I think that having the prototypes be concise is good, but
the overall goal is to have the whole manual page concise.  If adding a
little bit to the prototypes makes the description much more concise
(since it doesn't need to document const-ness, and hopefully
restrict-ness, null-ness, or size; or at least it can be shorter about
them, since the prototype already tells you a big part), I'd go for it.

> 
> Putting something that is not in the implementation and/or not
> in the standard into the SYNOPSIS seems much worse to me.

I have mixed feelings about this.

As you probably know from older threads, I don't like the standard as a
driving force in C, so I don't like the idea that implementations and
documentation should trail behind the standard, using whatever the
standard provides them with.  It's not completely like that, as I
acknowledge that the standard has improved the language and the library
considerably, and I like to use some of its improvements, or even
things that it is still considering (I build my own personal code with
-std=gnu2x, since it's just for me, so I don't care about portability,
want the most useful extensions, and am prepared to deal with compiler
bugs); but I don't like the standard as the _only_ driving force.

That said, I think both GCC and glibc should not be intimidated by the
standard when developing new extensions.  Okay, that is not a license
for releasing crap; but if a feature is good, that's fine.  Now, are the
manual pages allowed to extend the language as well?  Of course not as
much as the compiler or libc, but a little bit wouldn't hurt.  So, I
wouldn't take your comment too strictly.

Still, I acknowledge this suggestion of mine is far more aggressive than 
most other trivial deviations from valid code.  Maybe I should keep the 
idea floating around, and suggest it again after the new standard is 
released, so that compiler writers are less stressed about it, and can 
consider such an extension.  I'll maybe talk to GCC maintainers about it 
and see what they think.

> 
> And invalid syntax in the SYNOPSIS is even worse than that.
> For example, people may attempt to use SYNOPSIS as an example
> when designing their own, private function for a similar but
> not identical purpose and end up writing non-portable code,
> or even code that does not compile anywhere.
> 
> They may be wrong if they blame you for that, but i doubt they
> will thank you.

Yeah, that's something important to consider.

> 
>>> They should certainly not show something imaginary
>>> that does not match reality, and even less so using invalid syntax.
> 
>> Well, not that I haven't had those thoughts, but we already use illegal
>> syntax in some cases for good reasons.  See for example open(2):
>>
>>          int open(const char *pathname, int flags);
>>          int open(const char *pathname, int flags, mode_t mode);
>>
>> Of course, you can't declare two conflicting prototypes like that.
> 
> This does not seem quite as horrifying as
> 
>    char *getcwd(char buf[size], size_t size);
> 
> because at least each of the prototypes is valid.
> 
> My main concern about it would be that it is likely to make some people
> think (and C++ programmers in particular :-/) that there is type
> checking for the third and subsequent arguments, in which case they
> will be unpleasantly surprised when accidentally writing something like
> 
>    open(pathname, flags, &some_var, mode);
> 
> and finding out later that it compiled and ran just fine, but the
> resulting file wasn't quite as confidential as they hoped.
> 
> Explicitly displaying the ... to indicate the variable number of
> arguments, by contrast, makes it very clear that an API is almost
> certainly unusually dangerous and needs to be used with especial
> diligence.

Yeah, I suggested to Michael using '...' and adding the parameter in a
comment:

int open(const char *pathname, int flags, ... /* mode_t mode */);

He agreed, but we were doing something else, and then I didn't ask
again, so this change didn't make it.  If you recommend it, I'll do
it.
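
For what it's worth, here is a rough sketch of why the '...' form
matches reality.  The function mode_from_varargs() is a made-up
stand-in, not how any libc implements open(2), and it assumes for
simplicity that O_CREAT is the only flag that makes the extra argument
meaningful.  The point is that the callee has to decide on its own
whether to read a third argument, and the compiler checks neither its
type nor its presence:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for the argument handling of a variadic
 * open(2)-like function.  Returns the mode that would be used, or 0
 * when the flags say that no mode argument is expected.
 */
static mode_t mode_from_varargs(int flags, ...)
{
    mode_t mode = 0;

    if (flags & O_CREAT) {
        va_list ap;
        int raw;

        va_start(ap, flags);
        /*
         * Read the argument as int: a mode_t narrower than int is
         * promoted to int when passed through '...', and callers
         * normally pass small non-negative constants such as 0600.
         */
        raw = va_arg(ap, int);
        va_end(ap);

        assert(raw >= 0);
        mode = (mode_t) raw;
    }
    return mode;
}
```

A call such as mode_from_varargs(O_WRONLY | O_CREAT, 0600) yields 0600,
while mode_from_varargs(O_RDONLY) never touches the variable arguments.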

> 
> Either way, certainly not quite as bad as invalid syntax inside
> a prototype...

Cheers,

Alex

> 
> Yours,
>    Ingo

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-08-27 18:38       ` Alejandro Colomar
@ 2022-08-28 11:24         ` Alejandro Colomar
       [not found]           ` <CACqA6+mfaj6Viw+LVOG=nE350gQhCwVKXRzycVru5Oi4EJzgTg@mail.gmail.com>
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-08-28 11:24 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: linux-man, JeanHeyd Meneide


[-- Attachment #1.1: Type: text/plain, Size: 827 bytes --]

On 8/27/22 20:38, Alejandro Colomar wrote:
> 
>      void *memcpy(void dest[restrict n], const void src[restrict n],
>                   size_t n);
> 
>      void *memcpy(void *restrict dest, const void *restrict src,
>                   size_t n);
> 
>      void *memcpy(void *dest, const void *src, size_t n);
> 
>      void *memcpy(void *dest, void *src, size_t n);
> 
>      void *memcpy(void *, void *, size_t);

BTW, I forgot about 'noreturn', probably because memcpy(3) doesn't use 
it, but it's another layer of information which also adds a bit of 
noise, but is also useful to know.  The Linux man-pages use it (see 
exit(3)); I added that more or less at the same time I added restrict.


-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
       [not found]           ` <CACqA6+mfaj6Viw+LVOG=nE350gQhCwVKXRzycVru5Oi4EJzgTg@mail.gmail.com>
@ 2022-09-02 21:02             ` Alejandro Colomar
  2022-09-02 21:57               ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-02 21:02 UTC (permalink / raw)
  To: JeanHeyd Meneide; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3856 bytes --]

Hi JeanHeyd!

I'm forwarding your email to the mailing list, from my post-1996 mail 
client ;)

I hope all of your content is kept (even if slightly degraded).

Cheers,

Alex



-------- Forwarded Message --------
Subject: 	Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in 
function parameters
Date: 	Fri, 2 Sep 2022 16:56:00 -0400
From: 	JeanHeyd Meneide <wg14@soasis.org>
To: 	Alejandro Colomar <alx.manpages@gmail.com>
CC: 	Ingo Schwarze <schwarze@usta.de>, linux-man@vger.kernel.org



Hi Alejandro and Ingo,

       Just chiming in from a Standards perspective, here. We discussed, 
briefly, a way to allow Variable-Length function parameter declarations 
like the ones shown in this thread (e.g., char *getcwd(char buf[size], 
size_t size );).

       In GCC, there is a GNU extension that allows explicitly 
forward-declaring the prototype. Using the above example, it would look 
like so:

char *getcwd(size_t size; char buf[size], size_t size);

(Live Example [1])

(Note the `;` after the first "size" declaration). This was brought 
before the Committee to vote on for C23 in the form of N2780 [2], around 
the January 2022 timeframe. The paper did not pass, and it was seen as a
"failed extension". After that vote failed, we discussed other
approaches, and whether there was some appetite to allow
"forward parsing" for this sort of case. That is, could we simply allow:

char *getcwd(char buf[size], size_t size);

to work as expected. The vote for this did not gain full consensus 
either, but there were a lot of abstentions [3]. While I personally 
voted in favor of allowing such for C, there was distinct worry that 
this would produce issues for weaker C implementations that did not want 
to commit to delayed parsing or forward parsing of the entirety of the 
argument list before resolving types. There are enough abstentions 
during voting that a working implementation with a writeup of complexity 
would sway the Committee one way or the other.

This is not to dissuade Alejandro's position, or to bolster Ingo's 
point; I'm mostly just reporting the Committee's response here. This is 
an unsolved problem for the Committee, and also a larger holdover from 
the removal of K&R declarations from C23, which COULD solve this problem:

// decl
char *getcwd();

// impl
char* getcwd(buf, size)
char buf[size];
       size_t size;
{
       /* impl here */
}

       There is room for innovation here, or perhaps bolstering of the 
GCC original extension. As it stands right now, compilers only very 
recently started taking Variably-Modified Type parameters and Static 
Extent parameters seriously after carefully separating them out of 
Variable-Length Arrays, warning where they can when static or other 
array parameters do not match buffer lengths and so-on.

       Not just to the folks in this thread, but to the broader 
community for anyone who is paying attention: WG14 would actively like 
to solve this problem. If someone can:
- prove out a way to do delayed parsing that is not implementation-costly,
- revive the considered-dead GCC extension, or
- provide a 3rd or 4th way to support the goals,

I am certain WG14 would eventually look favorably upon such a thing,
if brought before the Committee for inclusion in C2y/C3a.

       Whether or not you feel like the manpages are the best place to 
start that, I'll leave up to you!

Thanks,
JeanHeyd

[1]: https://godbolt.org/z/dv1G3qGa3
[2]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2780.pdf
[3]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2991.pdf (search for n2780)


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-02 21:02             ` Alejandro Colomar
@ 2022-09-02 21:57               ` Alejandro Colomar
  2022-09-03 12:47                 ` Martin Uecker
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-02 21:57 UTC (permalink / raw)
  To: JeanHeyd Meneide; +Cc: Ingo Schwarze, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 8301 bytes --]

Hi JeanHeyd,

> Subject:     Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in 
> function parameters
> Date:     Fri, 2 Sep 2022 16:56:00 -0400
> From:     JeanHeyd Meneide <wg14@soasis.org>
> To:     Alejandro Colomar <alx.manpages@gmail.com>
> CC:     Ingo Schwarze <schwarze@usta.de>, linux-man@vger.kernel.org
> 
> 
> 
> Hi Alejandro and Ingo,
> 
>        Just chiming in from a Standards perspective, here. We discussed, 
> briefly, a way to allow Variable-Length function parameter declarations 
> like the ones shown in this thread (e.g., char *getcwd(char buf[size], 
> size_t size );).
> 
>        In GCC, there is a GNU extension that allows explicitly 
> forward-declaring the prototype. Using the above example, it would look 
> like so:

I added the GCC list to the thread, so that they can intervene if they 
consider it necessary.

> 
> char *getcwd(size_t size; char buf[size], size_t size);

I read about that, although I don't like it very much, and never used it.

> 
> (Live Example [1])
> 
> (Note the `;` after the first "size" declaration). This was brought 
> before the Committee to vote on for C23 in the form of N2780 [2], around 
> the January 2022 timeframe. The paper did not pass, and it was seen as a
> "failed extension". After that vote failed, we discussed other
> approaches, and whether there was some appetite to allow
> "forward parsing" for this sort of case. That is, could we simply allow:
> 
> char *getcwd(char buf[size], size_t size);
> 
> to work as expected. The vote for this did not gain full consensus 
> either, but there were a lot of abstentions [3]. While I personally 
> voted in favor of allowing such for C, there was distinct worry that 
> this would produce issues for weaker C implementations that did not want 
> to commit to delayed parsing or forward parsing of the entirety of the 
> argument list before resolving types. There are enough abstentions 
> during voting that a working implementation with a writeup of complexity 
> would sway the Committee one way or the other.

I like that this got less hate than the GNU extension.  It's nicer to my 
eyes.

> 
> This is not to dissuade Alejandro's position, or to bolster Ingo's 
> point; I'm mostly just reporting the Committee's response here. This is 
> an unsolved problem for the Committee, and also a larger holdover from 
> the removal of K&R declarations from C23, which COULD solve this problem:
> 
> // decl
> char *getcwd();
> 
> // impl
> char* getcwd(buf, size)
> char buf[size];
>        size_t size;
> {
>        /* impl here */
> }

I won't miss them ;)

My regex-based parser[1] that finds declarations and definitions in C 
code bases goes nuts with K&R functions.  They are dead for good :)

[1]: <http://www.alejandro-colomar.es/src/alx/alx/grepc.git/>

> 
>        There is room for innovation here, or perhaps bolstering of the 
> GCC original extension. As it stands right now, compilers only very 
> recently started taking Variably-Modified Type parameters and Static 
> Extent parameters seriously after carefully separating them out of 
> Variable-Length Arrays, warning where they can when static or other 
> array parameters do not match buffer lengths and so-on.
> 
>        Not just to the folks in this thread, but to the broader 
> community for anyone who is paying attention: WG14 would actively like 
> to solve this problem. If someone can:
> - prove out a way to do delayed parsing that is not implementation-costly,
> - revive the considered-dead GCC extension, or
> - provide a 3rd or 4th way to support the goals,
> 
> I am certain WG14 would eventually look favorably upon such a thing,
> if brought before the Committee for inclusion in C2y/C3a.
> 
>        Whether or not you feel like the manpages are the best place to 
> start that, I'll leave up to you!

I'll try to defend the reasons to start this in the man-pages.

This feature is mostly for documentation purposes, not being meaningful 
for code at all (for some meaning of meaningful), since it won't change 
the function definition in any way, nor the calls to it.  At least not 
by itself; static analysis may get some benefits, though.

Also, new code can be designed from the beginning so that sizes go 
before their corresponding arrays, so that new code won't typically be 
affected by the lack of this feature in the language.
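
As a minimal sketch of that "size first" design, which is already
valid ISO C (C99 VLA parameter notation), consider the following.
fill_buf() is a hypothetical helper, not an existing libc function.

```c
#include <stddef.h>

/* The size parameter precedes the array, so buf[size] is legal today. */
char *fill_buf(size_t size, char buf[size], char c)
{
    for (size_t i = 0; i < size; i++)
        buf[i] = c;
    return buf;
}
```

A caller would write something like fill_buf(sizeof(b), b, 'z'); the
prototype itself documents which size belongs to which buffer.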

This leaves us with legacy code, especially libc, which just works, and 
doesn't have any urgent needs to change their prototypes in this regard 
(they could, to improve static analysis, but not what we'd call urgent).

And since most people don't go around reading libc headers searching for 
function declarations (especially since there are manual pages that show 
them nicely), it's not like the documentation of the code depends on how 
the function is _actually_ declared in code (that's why I also defended 
documenting restrict even if glibc wouldn't have cared to declare it), 
but it depends basically on what the manual pages say about the 
function.  If the manual pages say a function gets 'restrict' params, it 
means it gets 'restrict' params, no matter what the code says, and if it 
doesn't, the function accepts overlapping pointers, at least for most of 
the public (modulo manual page bugs, that is).

So this extension could very well be added by the manual pages, as a 
form of documentation, and then maybe picked up by compilers that have 
enough resources to implement it.


Considering that this feature is mostly about documentation (and a bit 
of static analysis too), the documentation should be something appealing 
to the reader.


Let's take an example:


        int getnameinfo(const struct sockaddr *restrict addr,
                        socklen_t addrlen,
                        char *restrict host, socklen_t hostlen,
                        char *restrict serv, socklen_t servlen,
                        int flags);

and some transformations:


        int getnameinfo(const struct sockaddr *restrict addr,
                        socklen_t addrlen,
                        char host[restrict hostlen], socklen_t hostlen,
                        char serv[restrict servlen], socklen_t servlen,
                        int flags);


        int getnameinfo(socklen_t hostlen;
                        socklen_t servlen;
                        const struct sockaddr *restrict addr,
                        socklen_t addrlen,
                        char host[restrict hostlen], socklen_t hostlen,
                        char serv[restrict servlen], socklen_t servlen,
                        int flags);

(I'm not sure if I used correct GNU syntax, since I never used that 
extension myself.)
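
For what it's worth, a reduced sketch of the same shape can be tried
out, assuming GCC with GNU extensions enabled (other compilers may
reject it).  copy_names() and the strings it returns are hypothetical
stand-ins for getnameinfo(3); only the parameter syntax matters here.

```c
#include <stddef.h>
#include <string.h>

/* GNU C extension (assumes GCC): each 'type name;' placed before the
   real parameter list forward-declares a parameter, so array bounds
   can name sizes that are declared after the arrays. */
int copy_names(size_t hostlen; size_t servlen;
               char host[hostlen], size_t hostlen,
               char serv[servlen], size_t servlen)
{
    static const char h[] = "localhost";   /* hypothetical results */
    static const char s[] = "http";

    if (hostlen < sizeof(h) || servlen < sizeof(s))
        return -1;                         /* a buffer is too small */
    memcpy(host, h, sizeof(h));
    memcpy(serv, s, sizeof(s));
    return 0;
}
```

Callers pass only the real parameters, e.g.
copy_names(host, sizeof(host), serv, sizeof(serv)).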

The first transformation above is non-ambiguous, as concise as possible, 
and its only issue is that it might complicate the implementation a bit 
too much.  I don't think forward-using a parameter's size would be too 
much of a parsing problem for human readers.

The second one is unnecessarily long and verbose, and semicolons are not 
very distinguishable from commas, for human readers, which may be very 
confusing.

        int foo(int a; int b[a], int a);
        int foo(int a, int b[a], int o);

Those two are very different to the compiler, and yet very similar to 
the human eye.  I don't like it.  The fact that it allows for simpler 
compilers isn't enough to overcome the readability issues.

I think I'd prefer having the forward-using syntax as a non-standard 
extension --or a standard but optional language feature-- to avoid 
forcing small compilers to implement it, rather than having the GNU 
extension standardized in all compilers.

Having this extension in any single compiler would even make it more 
appealing to manual pages, which could use the syntax more freely 
without fear of confusing readers.  Even if the standard wouldn't accept it.

Let's see if GCC likes the feature and helps me attempt to use it a 
little bit! :-)

Cheers,

Alex


-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-02 21:57               ` Alejandro Colomar
@ 2022-09-03 12:47                 ` Martin Uecker
  2022-09-03 13:29                   ` Ingo Schwarze
  2022-09-03 13:41                   ` Alejandro Colomar
  0 siblings, 2 replies; 85+ messages in thread
From: Martin Uecker @ 2022-09-03 12:47 UTC (permalink / raw)
  To: Alejandro Colomar, JeanHeyd Meneide; +Cc: Ingo Schwarze, linux-man, gcc

...
> > 
> >        Whether or not you feel like the manpages are the best place to 
> > start that, I'll leave up to you!
> 
> I'll try to defend the reasons to start this in the man-pages.
> 
> This feature is mostly for documentation purposes, not being meaningful 
> for code at all (for some meaning of meaningful), since it won't change 
> the function definition in any way, nor the calls to it.  At least not 
> by itself; static analysis may get some benefits, though.


GCC will warn if the bound is specified inconsistently between
declarations and also emit warnings if it can see that a buffer
which is passed is too small:

https://godbolt.org/z/PsjPG1nv7


BTW: If you declare pointers to arrays (not first elements) you
can get run-time bounds checking with UBSan:

https://godbolt.org/z/TvMo89WfP
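
A minimal sketch of the pointer-to-array form, independent of the
linked example: because the parameter points to the whole array, the
bound is part of its type and is available at run time through sizeof.
clear_buf() is a hypothetical name used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* buf points to an entire array of n chars, not to its first element. */
size_t clear_buf(size_t n, char (*buf)[n])
{
    memset(*buf, 0, sizeof(*buf));   /* sizeof(*buf) is n, computed at run time */
    return sizeof(*buf);
}
```

The caller passes the address of the array, e.g.
clear_buf(sizeof(a), &a), and a sanitizer has a bound to check
accesses such as (*buf)[i] against.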


> 
> Also, new code can be designed from the beginning so that sizes go 
> before their corresponding arrays, so that new code won't typically be 
> affected by the lack of this feature in the language.
> 
> This leaves us with legacy code, especially libc, which just works, and 
> doesn't have any urgent needs to change their prototypes in this regard 
> (they could, to improve static analysis, but not what we'd call urgent).

It would be a useful step towards finding out-of-bounds problems in
applications using libc.


> And since most people don't go around reading libc headers searching for 
> function declarations (especially since there are manual pages that show 
> them nicely), it's not like the documentation of the code depends on how 
> the function is _actually_ declared in code (that's why I also defended 
> documenting restrict even if glibc wouldn't have cared to declare it), 
> but it depends basically on what the manual pages say about the 
> function.  If the manual pages say a function gets 'restrict' params, it 
> means it gets 'restrict' params, no matter what the code says, and if it 
> doesn't, the function accepts overlapping pointers, at least for most of 
> the public (modulo manual page bugs, that is).
> 
> So this extension could very well be added by the manual pages, as a 
> form of documentation, and then maybe picked up by compilers that have 
> enough resources to implement it.
> 
> 
> Considering that this feature is mostly about documentation (and a bit 
> of static analysis too), the documentation should be something appealing 
> to the reader.
> 
> 
> Let's take an example:
> 
> 
>         int getnameinfo(const struct sockaddr *restrict addr,
>                         socklen_t addrlen,
>                         char *restrict host, socklen_t hostlen,
>                         char *restrict serv, socklen_t servlen,
>                         int flags);
> 
> and some transformations:
> 
> 
>         int getnameinfo(const struct sockaddr *restrict addr,
>                         socklen_t addrlen,
>                         char host[restrict hostlen], socklen_t hostlen,
>                         char serv[restrict servlen], socklen_t servlen,
>                         int flags);
> 
> 
>         int getnameinfo(socklen_t hostlen;
>                         socklen_t servlen;
>                         const struct sockaddr *restrict addr,
>                         socklen_t addrlen,
>                         char host[restrict hostlen], socklen_t hostlen,
>                         char serv[restrict servlen], socklen_t servlen,
>                         int flags);
> 
> (I'm not sure if I used correct GNU syntax, since I never used that 
> extension myself.)
> 
> The first transformation above is non-ambiguous, as concise as possible, 
> and its only issue is that it might complicate the implementation a bit 
> too much.  I don't think forward-using a parameter's size would be too 
> much of a parsing problem for human readers.


I personally find the second form not terrible.  Being
able to read code left-to-right, top-down is helpful in more
complicated examples.



> The second one is unnecessarily long and verbose, and semicolons are not 
> very distinguishable from commas, for human readers, which may be very 
> confusing.
> 
>         int foo(int a; int b[a], int a);
>         int foo(int a, int b[a], int o);
> 
> Those two are very different to the compiler, and yet very similar to 
> the human eye.  I don't like it.  The fact that it allows for simpler 
> compilers isn't enough to overcome the readability issues.

This is true, I would probably use it with a comma and/or
syntax highlighting.


> I think I'd prefer having the forward-using syntax as a non-standard 
> extension --or a standard but optional language feature-- to avoid 
> forcing small compilers to implement it, rather than having the GNU 
> extension standardized in all compilers.

The problems with the second form are:

- it is not 100% backwards compatible (which may be OK though) as
the semantics of the following code changes:

int n;
int foo(int a[n], int n); // refers to different n!

Code written for new compilers could then be misunderstood
by old compilers when a variable named 'n' is in scope.


- it would generally be fundamentally new to C to have
backwards references, and parsers might need to be changed
to allow this


- a compiler or tool then has to deal also with ugly
corner cases such as mutual references:

int foo(int (*a)[sizeof(*b)], int (*b)[sizeof(*a)]);
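
The backwards-compatibility point (the first item above) can be
observed with current compilers.  This is a sketch under one
assumption: an enum constant stands in for the file-scope variable 'n'
and a pointer to array is used, so that the bound becomes visible
through sizeof.  bound_of() is a hypothetical name.

```c
#include <stddef.h>

enum { n = 3 };   /* file-scope 'n', standing in for the global above */

/* Under current rules, [n] is resolved where it is parsed, so it names
   the file-scope n (3), not the parameter declared after it. */
size_t bound_of(char (*a)[n], int n)
{
    (void) n;            /* the parameter plays no part in a's type */
    return sizeof(*a);
}
```

Whatever value is passed as the second argument, the result is 3.  A
compiler implementing forward references would instead tie the bound
to the parameter.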



We could consider new syntax such as

int foo(char buf[.n], int n);


Personally, I would prefer the conceptual simplicity of forward
declarations and the fact that these exist already in GCC
over any alternative.  I would also not mind new syntax, but
then one has to define the rules more precisely to avoid the
aforementioned problems. 


Martin





^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 12:47                 ` Martin Uecker
@ 2022-09-03 13:29                   ` Ingo Schwarze
  2022-09-03 15:08                     ` Alejandro Colomar
  2022-09-03 13:41                   ` Alejandro Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Ingo Schwarze @ 2022-09-03 13:29 UTC (permalink / raw)
  To: alx.manpages
  Cc: Martin Uecker, Alejandro Colomar, JeanHeyd Meneide, linux-man, gcc

Hi,

the only point i strongly care about is this one:

Manual pages should not use
 * non-standard syntax
 * non-portable syntax
 * ambiguous syntax (i.e. syntax that might have different meanings
   with different compilers or in different contexts)
 * syntax that might be invalid or dangerous with some widely
   used compiler collections like GCC or LLVM

Regarding the discussions about standardization and extensions,
all proposals i have seen look seriously ugly and awkward to me,
and i'm not yet convinced such ugliness is sufficiently offset by
the relatively minor benefit that is apparent to me right now.

Yours,
  Ingo

-- 
Ingo Schwarze             <schwarze@usta.de>
http://www.openbsd.org/   <schwarze@openbsd.org>
http://mandoc.bsd.lv/     <schwarze@mandoc.bsd.lv>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 12:47                 ` Martin Uecker
  2022-09-03 13:29                   ` Ingo Schwarze
@ 2022-09-03 13:41                   ` Alejandro Colomar
  2022-09-03 14:35                     ` Martin Uecker
  1 sibling, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-03 13:41 UTC (permalink / raw)
  To: Martin Uecker; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 5699 bytes --]

Hi Martin,

On 9/3/22 14:47, Martin Uecker wrote:
[...]

> GCC will warn if the bound is specified inconsistently between
> declarations and also emit warnings if it can see that a buffer
> which is passed is too small:
> 
> https://godbolt.org/z/PsjPG1nv7

That's very good news!

BTW, it's nice to see that GCC doesn't need 'static' for array 
parameters.  I never understood what the static keyword adds there. 
There's no way one can specify an array size and mean anything other than 
requiring that, for a non-null pointer, the array should have at least 
that size.

> 
> 
> BTW: If you declare pointers to arrays (not first elements) you
> can get run-time bounds checking with UBSan:
> 
> https://godbolt.org/z/TvMo89WfP

Couldn't that be caught at compile time?  n is certainly out of bounds 
always for such an array, since the last element is n-1.

> 
> 
>>
>> Also, new code can be designed from the beginning so that sizes go
>> before their corresponding arrays, so that new code won't typically be
>> affected by the lack of this feature in the language.
>>
>> This leaves us with legacy code, especially libc, which just works, and
>> doesn't have any urgent needs to change their prototypes in this regard
>> (they could, to improve static analysis, but not what we'd call urgent).
> 
> It would be a useful step towards finding out-of-bounds problems in
> applications using libc.

Yep, it would be very useful for that.  Not urgent, but yes, very useful.


>> Let's take an example:
>>
>>
>>          int getnameinfo(const struct sockaddr *restrict addr,
>>                          socklen_t addrlen,
>>                          char *restrict host, socklen_t hostlen,
>>                          char *restrict serv, socklen_t servlen,
>>                          int flags);
>>
>> and some transformations:
>>
>>
>>          int getnameinfo(const struct sockaddr *restrict addr,
>>                          socklen_t addrlen,
>>                          char host[restrict hostlen], socklen_t hostlen,
>>                          char serv[restrict servlen], socklen_t servlen,
>>                          int flags);
>>
>>
>>          int getnameinfo(socklen_t hostlen;
>>                          socklen_t servlen;
>>                          const struct sockaddr *restrict addr,
>>                          socklen_t addrlen,
>>                          char host[restrict hostlen], socklen_t hostlen,
>>                          char serv[restrict servlen], socklen_t servlen,
>>                          int flags);
>>
>> (I'm not sure if I used correct GNU syntax, since I never used that
>> extension myself.)
>>
>> The first transformation above is non-ambiguous, as concise as possible,
>> and its only issue is that it might complicate the implementation a bit
>> too much.  I don't think forward-using a parameter's size would be too
>> much of a parsing problem for human readers.
> 
> 
> I personally find the second form not terrible.  Being
> able to read code left-to-right, top-down is helpful in more
> complicated examples.
> 
> 
> 
>> The second one is unnecessarily long and verbose, and semicolons are not
>> very distinguishable from commas, for human readers, which may be very
>> confusing.
>>
>>          int foo(int a; int b[a], int a);
>>          int foo(int a, int b[a], int o);
>>
>> Those two are very different to the compiler, and yet very similar to
>> the human eye.  I don't like it.  The fact that it allows for simpler
>> compilers isn't enough to overcome the readability issues.
> 
> This is true, I would probably use it with a comma and/or
> syntax highlighting.
> 
> 
>> I think I'd prefer having the forward-using syntax as a non-standard
>> extension --or a standard but optional language feature-- to avoid
>> forcing small compilers to implement it, rather than having the GNU
>> extension standardized in all compilers.
> 
> The problems with the second form are:
> 
> - it is not 100% backwards compatible (which may be OK though) as
> the semantics of the following code changes:
> 
> int n;
> int foo(int a[n], int n); // refers to different n!
> 
> Code written for new compilers could then be misunderstood
> by old compilers when a variable named 'n' is in scope.
> 
> 

Hmmm, this one is serious.  I can't seem to solve it with that syntax.

> - it would generally be fundamentally new to C to have
> backwards references, and parsers might need to be changed
> to allow this
> 
> 
> - a compiler or tool then has to deal also with ugly
> corner cases such as mutual references:
> 
> int foo(int (*a)[sizeof(*b)], int (*b)[sizeof(*a)]);
> 
> 
> 
> We could consider new syntax such as
> 
> int foo(char buf[.n], int n);
> 
> 
> Personally, I would prefer the conceptual simplicity of forward
> declarations and the fact that these exist already in GCC
> over any alternative.  I would also not mind new syntax, but
> then one has to define the rules more precisely to avoid the
> aforementioned problems.

What about taking something from K&R functions for this?:

int foo(q; w; int a[q], int q, int s[w], int w);

By not specifying the types, the syntax is again short.
This is left-to-right, so no problems with global variables, and no need 
for complex parsers.
Also, by not specifying types, now it's more obvious to the naked eye 
that there's a difference:


           int foo(a; int b[a], int a);
           int foo(int a, int b[a], int o);


What do you think about this syntax?


Thanks,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 13:41                   ` Alejandro Colomar
@ 2022-09-03 14:35                     ` Martin Uecker
  2022-09-03 14:59                       ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Martin Uecker @ 2022-09-03 14:35 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Samstag, den 03.09.2022, 15:41 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On 9/3/22 14:47, Martin Uecker wrote:
> [...]
> 
> > GCC will warn if the bound is specified inconsistently between
> > declarations and also emit warnings if it can see that a buffer
> > which is passed is too small:
> > 
> > https://godbolt.org/z/PsjPG1nv7
> 
> That's very good news!
> 
> BTW, it's nice to see that GCC doesn't need 'static' for array 
> parameters.  I never understood what the static keyword adds there. 
> There's no way one can specify an array size and mean anything other than 
> requiring that, for a non-null pointer, the array should have at least 
> that size.

From the C standard's point of view,

void foo(int n, char buf[n]);

is semantically equivalent to

void foo(int, char *buf);

and without 'static' the 'n' has no further meaning
(this is different for pointers to arrays).

The static keyword implies that the pointer must be valid and
non-zero and that there must be at least 'n' elements
accessible, so in some sense it is stronger (it implies
valid non-zero pointers), but at the same time it does not
imply a bound.

But I agree that 'n' without 'static' should simply imply
a bound and I think we should use it this way even when
the standard currently does not attach a meaning to it.
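
A small sketch of that equivalence: both definitions below, with and
without 'static', have the function type size_t (int, char *), so
either can initialize the same kind of function pointer.  The names
are hypothetical.

```c
#include <stddef.h>

/* Counts bytes before the first '\0', looking at no more than n bytes. */
size_t len_n(int n, char buf[n])
{
    size_t i = 0;

    while (i < (size_t) n && buf[i] != '\0')
        i++;
    return i;
}

/* Same body; 'static' additionally promises at least n valid elements. */
size_t len_static_n(int n, char buf[static n])
{
    size_t i = 0;

    while (i < (size_t) n && buf[i] != '\0')
        i++;
    return i;
}

/* Both array parameters are adjusted to 'char *'. */
size_t (*len_ptr)(int, char *) = len_n;
size_t (*len_static_ptr)(int, char *) = len_static_n;
```

The two initializations are accepted because the array notation in a
parameter is adjusted to a pointer; any extra checking based on 'n' is
up to the compiler's diagnostics.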

> > 
> > BTW: If you declare pointers to arrays (not first elements) you
> > can get run-time bounds checking with UBSan:
> > 
> > https://godbolt.org/z/TvMo89WfP
> 
> Couldn't that be caught at compile time?  n is certainly out of bounds 
> always for such an array, since the last element is n-1.

Yes, in this example it could (and ideally should) be
detected at compile time.

But this notation already allows passing a bound
across API boundaries today, and thus enables run-time detection of
out-of-bounds accesses even in scenarios where it could
not be found at compile time.

> > 
> > > Also, new code can be designed from the beginning so that sizes go
> > > before their corresponding arrays, so that new code won't typically be
> > > affected by the lack of this feature in the language.
> > > 
> > > This leaves us with legacy code, especially libc, which just works, and
> > > doesn't have any urgent needs to change their prototypes in this regard
> > > (they could, to improve static analysis, but not what we'd call urgent).
> > 
> > It would be a useful step towards finding out-of-bounds problems in
> > applications using libc.
> 
> Yep, it would be very useful for that.  Not urgent, but yes, very useful.
> 
> 
> > > Let's take an example:
> > > 
> > > 
> > >          int getnameinfo(const struct sockaddr *restrict addr,
> > >                          socklen_t addrlen,
> > >                          char *restrict host, socklen_t hostlen,
> > >                          char *restrict serv, socklen_t servlen,
> > >                          int flags);
> > > 
> > > and some transformations:
> > > 
> > > 
> > >          int getnameinfo(const struct sockaddr *restrict addr,
> > >                          socklen_t addrlen,
> > >                          char host[restrict hostlen], socklen_t hostlen,
> > >                          char serv[restrict servlen], socklen_t servlen,
> > >                          int flags);
> > > 
> > > 
> > >          int getnameinfo(socklen_t hostlen;
> > >                          socklen_t servlen;
> > >                          const struct sockaddr *restrict addr,
> > >                          socklen_t addrlen,
> > >                          char host[restrict hostlen], socklen_t hostlen,
> > >                          char serv[restrict servlen], socklen_t servlen,
> > >                          int flags);
> > > 
> > > (I'm not sure if I used correct GNU syntax, since I never used that
> > > extension myself.)
> > > 
> > > The first transformation above is non-ambiguous, as concise as possible,
> > > and its only issue is that it might complicate the implementation a bit
> > > too much.  I don't think forward-using a parameter's size would be too
> > > much of a parsing problem for human readers.
> > 
> > I personally find the second form not terrible.  Being
> > able to read code left-to-right, top-down is helpful in more
> > complicated examples.
> > 
> > 
> > 
> > > The second one is unnecessarily long and verbose, and semicolons are not
> > > very distinguishable from commas, for human readers, which may be very
> > > confusing.
> > > 
> > >          int foo(int a; int b[a], int a);
> > >          int foo(int a, int b[a], int o);
> > > 
> > > Those two are very different to the compiler, and yet very similar to
> > > the human eye.  I don't like it.  The fact that it allows for simpler
> > > compilers isn't enough to overcome the readability issues.
> > 
> > This is true, I would probably use it with a comma and/or
> > syntax highlighting.
> > 
> > 
> > > I think I'd prefer having the forward-using syntax as a non-standard
> > > extension --or a standard but optional language feature-- to avoid
> > > forcing small compilers to implement it, rather than having the GNU
> > > extension standardized in all compilers.
> > 
> > The problems with the second form are:
> > 
> > - it is not 100% backwards compatible (which maybe ok though) as
> > the semantics of the following code changes:
> > 
> > int n;
> > int foo(int a[n], int n); // refers to different n!
> > 
> > Code written for new compilers could then be misunderstood
> > by old compilers when a variable with 'n' is in scope.
> > 
> > 
> 
> Hmmm, this one is serious.  I can't seem to solve it with that syntax.
> 
> > - it would generally be fundamentally new to C to have
> > backwards references and parsers might need to be changed
> > to allow this
> > 
> > 
> > - a compiler or tool then has to deal also with ugly
> > corner cases such as mutual references:
> > 
> > int foo(int (*a)[sizeof(*b)], int (*b)[sizeof(*a)]);
> > 
> > 
> > 
> > We could consider new syntax such as
> > 
> > int foo(char buf[.n], int n);
> > 
> > 
> > Personally, I would prefer the conceptual simplicity of forward
> > declarations and the fact that these exist already in GCC
> > over any alternative.  I would also not mind new syntax, but
> > then one has to define the rules more precisely to avoid the
> > aforementioned problems.
> 
> What about taking something from K&R functions for this?:
> 
> int foo(q; w; int a[q], int q, int s[w], int w);
> 
> By not specifying the types, the syntax is again short.
> This is left-to-right, so no problems with global variables, and no need 
> for complex parsers.
> Also, by not specifying types, now it's more obvious to the naked eye 
> that there's a difference:

I am ok with the syntax, but I am not sure how this would
work. If the type is determined only later you would still
have to change parsers (some C compilers do type
checking and folding during parsing, so they need the types
to be known during parsing) and you also still have the
problem with the mutual dependencies.

We thought about using this syntax

int foo(char buf[.n], int n);

because it is new syntax which means we can restrict the
size to be the name of a parameter instead of allowing
arbitrary expressions, which then makes forward references
less problematic.  It is also consistent with designators in
initializers and could also be extended to annotate
flexible array members or for storing pointers to arrays
in structures:

struct {
  int n;
  char buf[.n];
};

struct {
  int n;
  char (*buf)[.n];
};
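
Since [.n] is not valid C today, here is a sketch of the closest
thing the language currently offers for the first case, using the
hypothetical names counted_buf and counted_buf_new().  The length of
the flexible array member is tied to n only by convention, which is
exactly the link the annotation would make explicit:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: n records how many bytes buf holds, but the
   compiler has no way to know that the two are related. */
struct counted_buf {
    size_t n;
    char   buf[];   /* flexible array member */
};

/* Allocate a zero-filled counted_buf with room for n bytes.
   Returns NULL if the allocation fails. */
struct counted_buf *counted_buf_new(size_t n)
{
    struct counted_buf *p = malloc(sizeof(*p) + n);

    if (p == NULL)
        return NULL;
    p->n = n;
    memset(p->buf, 0, n);
    return p;
}
```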


Martin


> 
>            int foo(a; int b[a], int a);
>            int foo(int a, int b[a], int o);
> 
> 
> What do you think about this syntax?





* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 14:35                     ` Martin Uecker
@ 2022-09-03 14:59                       ` Alejandro Colomar
  2022-09-03 15:31                         ` Martin Uecker
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-03 14:59 UTC (permalink / raw)
  To: Martin Uecker; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc



Hi Martin,

On 9/3/22 16:35, Martin Uecker wrote:
> Am Samstag, den 03.09.2022, 15:41 +0200 schrieb Alejandro Colomar:
>> Hi Martin,
>>
>> On 9/3/22 14:47, Martin Uecker wrote:
>> [...]
>>
>>> GCC will warn if the bound is specified inconsistently between
>>> declarations and also emit warnings if it can see that a buffer
>>> which is passed is too small:
>>>
>>> https://godbolt.org/z/PsjPG1nv7
>>
>> That's very good news!
>>
>> BTW, it's nice to see that GCC doesn't need 'static' for array
>> parameters.  I never understood what the static keyword adds there.
>> There's no way one can specify an array size and mean anything other than
>> requiring that, for a non-null pointer, the array should have at least
>> that size.
> 
>  From the C standard's point of view,
> 
> void foo(int n, char buf[n]);
> 
> is semantically equivalent to
> 
> void foo(int, char *buf);
> 
> and without 'static' the 'n' has no further meaning
> (this is different for pointers to arrays).

I know.  I just don't understand the rationale for that decision. :/

> 
> The static keyword implies that the pointer is valid and
> non-zero and that there must be at least 'n' elements
> accessible, so in some sense it is stronger (it implies
> valid non-zero pointers), but at the same time it does not
> imply a bound.

That stronger meaning, I think, is a mistake by the standard.
Basically, [static n] means the same as [n] combined with [[gnu::nonnull]].
What the standard should have done would be to keep those two things 
separate, since one may want to declare non-null non-array pointers, or 
possibly-null array ones.  So the standard should have standardized some 
form of nonnull for that.  But the recent discussion about presenting 
nonnull pointers as [static 1] is horrible.  But let's wait till the 
future hopefully fixes this.
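
To make the two meanings concrete, here is a small sketch with
hypothetical function names.  count_zero() uses [static n], so the
caller must pass a non-null pointer to at least n elements;
count_zero_or_null() uses plain [n], which the standard treats as an
ordinary pointer, so it has to cope with null itself:

```c
#include <stddef.h>

/* [static n]: the caller promises a non-null pointer to at least
   n accessible elements. */
size_t count_zero(size_t n, const unsigned char buf[static n])
{
    size_t zeros = 0;

    for (size_t i = 0; i < n; i++)
        zeros += (buf[i] == 0);
    return zeros;
}

/* Plain [n]: to the standard this is just 'const unsigned char *buf',
   so a null pointer is a legal argument and is handled here. */
size_t count_zero_or_null(size_t n, const unsigned char buf[n])
{
    return (buf == NULL) ? 0 : count_zero(n, buf);
}
```

Whether a compiler diagnoses a null argument to the second function
depends on the compiler and its warning options; the sketch only
shows what the declarations themselves promise.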

> 
> But I agree that 'n' without 'static' should simply imply
> a bound and I think we should use it this way even when
> the standard currently does not attach a meaning to it.

Yep.

[...]

>> What about taking something from K&R functions for this?:
>>
>> int foo(q; w; int a[q], int q, int s[w], int w);
>>
>> By not specifying the types, the syntax is again short.
>> This is left-to-right, so no problems with global variables, and no need
>> for complex parsers.
>> Also, by not specifying types, now it's more obvious to the naked eye
>> that there's a difference:
> 
> I am ok with the syntax, but I am not sure how this would
> work. If the type is determined only later you would still
> have to change parsers (some C compilers do type
> checking and folding during parsing, so they need the types
> to be known during parsing) and you also still have the
> problem with the mutual dependencies.

This syntax closely resembles K&R syntax.  Any C compiler that supports
K&R declarations (and I guess most compilers out there do) should be easily
convertible to support this syntax (at least more easily than other 
alternatives).  But this is just a guess.

> 
> We thought about using this syntax
> 
> int foo(char buf[.n], int n);
> 
> because it is new syntax which means we can restrict the
> size to be the name of a parameter instead of allowing
> arbitrary expressions, which then makes forward references
> less problematic.  It is also consistent with designators in
> initializers and could also be extended to annotate
> flexible array members or for storing pointers to arrays
> in structures:

It's not crazy.  I don't have much to argue against it.

> 
> struct {
>    int n;
>    char buf[.n];
> };
> 
> struct {
>    int n;
>    char (*buf)[.n];
> };

Perhaps some doubts about how this would work for nested structures, but 
not unreasonable.

Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 13:29                   ` Ingo Schwarze
@ 2022-09-03 15:08                     ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-03 15:08 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: Martin Uecker, JeanHeyd Meneide, linux-man, gcc



Hi Ingo,

On 9/3/22 15:29, Ingo Schwarze wrote:
> the only point i strongly care about is this one:
> 
> Manual pages should not use
>   * non-standard syntax
>   * non-portable syntax
>   * ambiguous syntax (i.e. syntax that might have different meanings
>     with different compilers or in different contexts)
>   * syntax that might be invalid or dangerous with some widely
>     used compiler collections like GCC or LLVM

The first two are good guidelines, but not strict IMHO if there's a good 
reason.

The third and fourth are strong requirements.

For now I won't be applying this patch.

> 
> Regarding the discussions about standardization and extensions,
> all proposals i have seen look seriously ugly and awkward to me,
> and i'm not yet convinced such ugliness is sufficiently offset by
> the relatively minor benefit that is apparent to me right now.

I hope we come up with something not ugly from that discussion.

The static analysis / compiler warning capabilities of using VLA syntax 
seem strong reasons to me.  They help avoid stupid bugs, even for 
careless programmers (well, only if those careless programmers care just 
enough to enable -Wall, and then to read the warnings).  Not something 
that will fix an incorrect algorithm, but can stop some typos, or other 
stupid mistakes that we all make from time to time.

Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 14:59                       ` Alejandro Colomar
@ 2022-09-03 15:31                         ` Martin Uecker
  2022-09-03 20:02                           ` Alejandro Colomar
  2022-11-10  0:06                           ` Alejandro Colomar
  0 siblings, 2 replies; 85+ messages in thread
From: Martin Uecker @ 2022-09-03 15:31 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Hi Alejandro,

Am Samstag, den 03.09.2022, 16:59 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On 9/3/22 16:35, Martin Uecker wrote:
> > Am Samstag, den 03.09.2022, 15:41 +0200 schrieb Alejandro Colomar:
> > > Hi Martin,
> > > 
> > > On 9/3/22 14:47, Martin Uecker wrote:
> > > [...]
> > > 
> > > > GCC will warn if the bound is specified inconsistently between
> > > > declarations and also emit warnings if it can see that a buffer
> > > > which is passed is too small:
> > > > 
> > > > https://godbolt.org/z/PsjPG1nv7
> > > 
> > > That's very good news!
> > > 
> > > BTW, it's nice to see that GCC doesn't need 'static' for array
> > > parameters.  I never understood what the static keyword adds there.
> > > There's no way one can specify an array size and mean anything other than
> > > requiring that, for a non-null pointer, the array should have at least
> > > that size.
> > 
> >  From the C standard's point of view,
> > 
> > void foo(int n, char buf[n]);
> > 
> > is semantically equivalent to
> > 
> > void foo(int, char *buf);
> > 
> > and without 'static' the 'n' has no further meaning
> > (this is different for pointers to arrays).
> 
> I know.  I just don't understand the rationale for that decision. :/

I guess it made sense in the past, but is simply not
what we need today.

> > The static keyword implies that the pointer is valid and
> > non-zero and that there must be at least 'n' elements
> > accessible, so in some sense it is stronger (it implies
> > valid non-zero pointers), but at the same time it does not
> > imply a bound.
> 
> That stronger meaning, I think is a mistake by the standard.
> Basically, [static n] means the same as [n] combined with [[gnu::nonnull]].
> What the standard should have done would be to keep those two things 
> separate, since one may want to declare non-null non-array pointers, or 
> possibly-null array ones.  So the standard should have standardized some 
> form of nonnull for that.  

I agree the situation is not good.  

> But the recent discussion about presenting 
> nonnull pointers as [static 1] is horrible.  But let's wait till the 
> future hopefully fixes this.

yes, [static 1] is problematic because then the number
can not be used as a bound anymore. 

My experience is that if one wants to see something fixed,
one has to push for it.  Standardization is meant
to standardize existing practice, so if we want to see
this improved, we can not wait for this.

> > But I agree that 'n' without 'static' should simply imply
> > a bound and I think we should use it this way even when
> > the standard currently does not attach a meaning to it.
> 
> Yep.
> 
> [...]
> 
> > > What about taking something from K&R functions for this?:
> > > 
> > > int foo(q; w; int a[q], int q, int s[w], int w);
> > > 
> > > By not specifying the types, the syntax is again short.
> > > This is left-to-right, so no problems with global variables, and no need
> > > for complex parsers.
> > > Also, by not specifying types, now it's more obvious to the naked eye
> > > that there's a difference:
> > 
> > I am ok with the syntax, but I am not sure how this would
> > work. If the type is determined only later you would still
> > have to change parsers (some C compilers do type
> > checking and folding during parsing, so they need the types
> > to be known during parsing) and you also still have the
> > problem with the mutual dependencies.
> 
> This syntax resembles a lot K&R syntax.  Any C compiler that supports 
> them (and I guess most compilers out there do) should be easily 
> convertible to support this syntax (at least more easily than other 
> alternatives).  But this is just a guess.

In K&R syntax this worked for definitions:

void foo(y, n)
 int n;
 int y[n];
{ ...

But this worked because you could reorder the
declarations so that later declarations could
refer to previous ones.

So one could do

int foo(int n, char buf[n];  buf, n);

where the second part defines the order of
the parameters, or

int foo(buf, n; int n, char buf[n]);

where the first part defines the order,
but the declarations need to have the size
first. But then you need to specify each
parameter twice...


> > We thought about using this syntax
> > 
> > int foo(char buf[.n], int n);
> > 
> > because it is new syntax which means we can restrict the
> > size to be the name of a parameter instead of allowing
> > arbitrary expressions, which then makes forward references
> > less problematic.  It is also consistent with designators in
> > initializers and could also be extended to annotate
> > flexible array members or for storing pointers to arrays
> > in structures:
> 
> It's not crazy.  I don't have much to argue against it.
> 
> > struct {
> >    int n;
> >    char buf[.n];
> > };
> > 
> > struct {
> >    int n;
> >    char (*buf)[.n];
> > };
> 
> Perhaps some doubts about how this would work for nested structures, but 
> not unreasonable.

It is not implemented though...

Martin


> Cheers,
> 
> Alex
> 
> -- 
> Alejandro Colomar
> <http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 15:31                         ` Martin Uecker
@ 2022-09-03 20:02                           ` Alejandro Colomar
  2022-09-05 14:31                             ` Alejandro Colomar
  2022-11-10  0:06                           ` Alejandro Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-03 20:02 UTC (permalink / raw)
  To: Martin Uecker; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc



Hi Martin,

On 9/3/22 17:31, Martin Uecker wrote:
[...]

>> But the recent discussion about presenting
>> nonnull pointers as [static 1] is horrible.  But let's wait till the
>> future hopefully fixes this.
> 
> yes, [static 1] is problematic because then the number
> can not be used as a bound anymore.
> 
> My experience is that if one wants to see something fixed,
> one has to push for it.  Standardization is meant
> to standardize existing practice, so if we want to see
> this improved, we can not wait for this.
> 

Yeah, I'm not just waiting to see if it gets fixed on its own.  I've been
discussing adding nonnull to the standard, or improving it in the
compilers, but so far no compiler has anything convincing.  GCC's
attribute is problematic due to UB issues, and Clang's _Nonnull keyword 
is useless as of now:

<https://github.com/llvm/llvm-project/issues/57546>

Maybe GCC could add Clang's _Nonnull (and maybe _Nullable and the 
pragmas, but definitely not _Null_unspecified), and add some good warnings.

Only then would it make sense to try to standardize the feature.

[...]

> In K&R syntax this worked for definition:
> 
> void foo(y, n)
>   int n;
>   int y[n];
> { ...
> 
> But this worked because you could reorder the
> declarations so that later declarations could
> refer to previous ones.
> 
> So one could do
> 
> int foo(int n, char buf[n];  buf, n);
> 
> where the second part defines the order of
> the parameter or
> 
> int foo(buf, n; int n, char buf[n]);
> 
> where the first part defines the order,
> but the declarations need to have the size
> first. But then you need to specify each
> parameter twice...

Hmm, yeah, maybe the [.n] notation makes more sense.

> 
> 
>>> We thought about using this syntax
>>>
>>> int foo(char buf[.n], int n);
>>>
>>> because it is new syntax which means we can restrict the
>>> size to be the name of a parameter instead of allowing
>>> arbitrary expressions, which then makes forward references
>>> less problematic.  It is also consistent with designators in
>>> initializers and could also be extended to annotate
>>> flexible array members or for storing pointers to arrays
>>> in structures:
>>
>> It's not crazy.  I don't have much to argue against it.
>>
>>> struct {
>>>     int n;
>>>     char buf[.n];
>>> };
>>>
>>> struct {
>>>     int n;
>>>     char (*buf)[.n];
>>> };
>>
>> Perhaps some doubts about how this would work for nested structures, but
>> not unreasonable.
> 
> It is not implemented though...

Well, are you planning to implement it?
If you do, I'm very interested in using it in the documentation ;)


Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 20:02                           ` Alejandro Colomar
@ 2022-09-05 14:31                             ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-09-05 14:31 UTC (permalink / raw)
  To: Martin Uecker; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc



Hi Martin,

On 9/3/22 22:02, Alejandro Colomar wrote:
>>>> We thought about using this syntax
>>>>
>>>> int foo(char buf[.n], int n);

BTW, it would be useful if this syntax was accepted for void * too, 
especially since GNU C allows pointer arithmetic on void *.

     void *memmove(void dest[.n], const void src[.n], size_t n);

I understand that a void array doesn't make sense, so defining a VLA of 
type void is an error elsewhere, but since array parameters are not 
really arrays, and instead pointers, this could be reasonable.

In the same way, these "arrays" can already have zero sizes, or even
negative ones in some weird cases.
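
A small sketch of the "not really arrays" point, using hypothetical
function names and C11 _Generic.  Taking the address of an array
parameter yields a pointer to pointer, while taking the address of a
real array object yields a pointer to array:

```c
/* 'buf' is declared with array syntax, but inside the function it is
   a pointer object, so &buf has type 'char **'. */
int param_is_pointer(char buf[16])
{
    return _Generic(&buf, char **: 1, default: 0);
}

/* 'arr' is a genuine array object, so &arr has type 'char (*)[16]'
   and the default branch is selected. */
int array_is_pointer(void)
{
    char arr[16];

    return _Generic(&arr, char **: 1, default: 0);
}
```

With these definitions param_is_pointer() yields 1 and
array_is_pointer() yields 0, so the element count written in the
parameter declaration never becomes part of the parameter's type.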

Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-09-03 15:31                         ` Martin Uecker
  2022-09-03 20:02                           ` Alejandro Colomar
@ 2022-11-10  0:06                           ` Alejandro Colomar
  2022-11-10  0:09                             ` Alejandro Colomar
                                               ` (2 more replies)
  1 sibling, 3 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10  0:06 UTC (permalink / raw)
  To: Martin Uecker; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc



Hi Martin,

On 9/3/22 17:31, Martin Uecker wrote:
> My experience is that if one wants to see something fixed,
> one has to push for it.  Standardization is meant
> to standardize existing practice, so if we want to see
> this improved, we can not wait for this.

I fully agree with you.  I've been ruminating on these patches for some time, to
have some more time to think about them.  Now, I like them enough to push.
So, after a few minor cosmetic issues detected by some linters, I've pushed the 
changes to document all of man2 and man3 with hypothetical VLA syntax.

Now, I've released man-pages-6.01 very recently (just a few weeks ago), and I 
don't plan to release again in a year or two, so there's time to do the 
implementation in GCC.  From my side, please consider this an ACK or even 
somewhat of a push to get things done in the compiler side of things :)

I'll show here an excerpt of what kind of syntax has been pushed.  Of course, 
there's room for improving/fixing, since it's not seen an official release, but 
for now, this is what's up there:


        int strncmp(const char s1[.n], const char s2[.n], size_t n);

        long mbind(void addr[.len], unsigned long len, int mode,
                   const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
                                                / ULONG_WIDTH],
                   unsigned long maxnode, unsigned int flags);

        int cacheflush(void addr[.nbytes], int nbytes, int cache);


I've shown the three kinds of prototypes that have been changed:

-  Normal VLA; nothing fancy except for the '.'.
-  Complex size expressions.
-  'void *' VLAs (assuming GNU conventions: sizeof(void *)==1).
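
For the second kind, the size expression in the mbind() prototype is
ordinary round-up division: the number of unsigned long words needed
to hold maxnode bits.  A sketch with a hypothetical helper named
nodemask_words(); ULONG_WIDTH comes from C23's <limits.h>, so a
fallback is assumed for older compilers:

```c
#include <limits.h>
#include <stddef.h>

/* ULONG_WIDTH is a C23 macro; this fallback is an assumption for
   compilers that do not provide it. */
#ifndef ULONG_WIDTH
#define ULONG_WIDTH (CHAR_BIT * sizeof(unsigned long))
#endif

/* Number of unsigned long words needed to store maxnode bits,
   rounding up.  Wraps around if maxnode is close to ULONG_MAX. */
size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + ULONG_WIDTH - 1) / ULONG_WIDTH;
}
```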


Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  0:06                           ` Alejandro Colomar
@ 2022-11-10  0:09                             ` Alejandro Colomar
  2022-11-10  1:33                             ` Joseph Myers
  2022-11-10  9:40                             ` G. Branden Robinson
  2 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10  0:09 UTC (permalink / raw)
  To: Martin Uecker; +Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc





On 11/10/22 01:06, Alejandro Colomar wrote:
> Hi Martin,
> 
> On 9/3/22 17:31, Martin Uecker wrote:
>> My experience is that if one wants to see something fixed,
>> one has to push for it.  Standardization is meant
>> to standardize existing practice, so if we want to see
>> this improved, we can not wait for this.
> 
> I fully agree with you.  I've been ruminating these patches for some time, for 
> having some more time to think about them.  Now, I like them enough to push. So, 
> after a few minor cosmetic issues detected by some linters, I've pushed the 
> changes to document all of man2 and man3 with hypothetical VLA syntax.
> 
> Now, I've released man-pages-6.01 very recently (just a few weeks ago), and I 
> don't plan to release again in a year or two, so there's time to do the 
> implementation in GCC.  From my side, please consider this an ACK or even 
> somewhat of a push to get things done in the compiler side of things :)
> 
> I'll show here an excerpt of what kind of syntax has been pushed.  Of course, 
> there's room for improving/fixing, since it's not seen an official release, but 
> for now, this is what's up there:
> 
> 
>         int strncmp(const char s1[.n], const char s2[.n], size_t n);
> 
>         long mbind(void addr[.len], unsigned long len, int mode,
>                    const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
>                                                 / ULONG_WIDTH],
>                    unsigned long maxnode, unsigned int flags);
> 
>         int cacheflush(void addr[.nbytes], int nbytes, int cache);
> 
> 
> I've shown the three kinds of prototypes that have been changed:
> 
> -  Normal VLA; nothing fancy except for the '.'.
> -  Complex size expressions.
> -  'void *' VLAs (assuming GNU conventions: sizeof(void *)==1).

Oops: sizeof(void)==1
> 
> 
> Cheers,
> 
> Alex
> 

-- 
<http://www.alejandro-colomar.es/>



* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  0:06                           ` Alejandro Colomar
  2022-11-10  0:09                             ` Alejandro Colomar
@ 2022-11-10  1:33                             ` Joseph Myers
  2022-11-10  1:39                               ` Joseph Myers
  2022-11-10  9:40                             ` G. Branden Robinson
  2 siblings, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-10  1:33 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Thu, 10 Nov 2022, Alejandro Colomar via Gcc wrote:

> I've shown the three kinds of prototypes that have been changed:
> 
> -  Normal VLA; nothing fancy except for the '.'.
> -  Complex size expressions.
> -  'void *' VLAs (assuming GNU conventions: sizeof(void *)==1).

That doesn't cover any of the tricky issues with such proposals, such as 
the choice of which entity is referred to by the parameter name when there 
are multiple nested parameter lists that use the same parameter name, or 
when the identifier is visible from an outer scope (including in 
particular the case where it's declared as a typedef name in an outer 
scope).
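
A sketch of the outer-scope case as it behaves in C today, with
hypothetical names.  The bound is resolved before the parameter n is
declared, so it names the file-scope object, and any new rule for
forward references has to say which of the two wins:

```c
#include <stddef.h>

int n = 3;  /* file-scope object sharing the parameter's name */

/* In today's C the bound in (*a)[n] refers to the file-scope n,
   because the parameter n has not been declared yet at that point. */
size_t row_bytes(char (*a)[n], int n)
{
    (void)n;            /* the parameter; it plays no part in the type of a */
    return sizeof(*a);  /* size taken from the file-scope n at function entry */
}
```

Calling row_bytes(&buf, 100) with a three-element char array buf
yields 3, not 100.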

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  1:33                             ` Joseph Myers
@ 2022-11-10  1:39                               ` Joseph Myers
  2022-11-10  6:21                                 ` Martin Uecker
  0 siblings, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-10  1:39 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Thu, 10 Nov 2022, Joseph Myers wrote:

> On Thu, 10 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
> > I've shown the three kinds of prototypes that have been changed:
> > 
> > -  Normal VLA; nothing fancy except for the '.'.
> > -  Complex size expressions.
> > -  'void *' VLAs (assuming GNU conventions: sizeof(void *)==1).
> 
> That doesn't cover any of the tricky issues with such proposals, such as 
> the choice of which entity is referred to by the parameter name when there 
> are multiple nested parameter lists that use the same parameter name, or 
> when the identifier is visible from an outer scope (including in 
> particular the case where it's declared as a typedef name in an outer 
> scope).

In fact I can't tell from these examples whether you mean for a '.' token 
after '[' to have special semantics, or whether you mean to have a special 
'. identifier' form of expression valid in certain context (each of which 
introduces its own complications; for the former, typedef names from outer 
scopes are problematic; for the latter, it's designated initializers where 
you get complications, for example).  Designing new syntax that doesn't 
cause ambiguity is generally tricky, and this sort of language extension 
is the kind of thing where you'd expect to go through at least five 
iterations of a WG14 paper before you have something like a sound 
specification.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  1:39                               ` Joseph Myers
@ 2022-11-10  6:21                                 ` Martin Uecker
  2022-11-10 10:09                                   ` Alejandro Colomar
  2022-11-10 23:19                                   ` Joseph Myers
  0 siblings, 2 replies; 85+ messages in thread
From: Martin Uecker @ 2022-11-10  6:21 UTC (permalink / raw)
  To: Joseph Myers, Alejandro Colomar
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Donnerstag, den 10.11.2022, 01:39 +0000 schrieb Joseph Myers:
> On Thu, 10 Nov 2022, Joseph Myers wrote:
> 
> > On Thu, 10 Nov 2022, Alejandro Colomar via Gcc wrote:
> > 
> > > I've shown the three kinds of prototypes that have been changed:
> > > 
> > > -  Normal VLA; nothing fancy except for the '.'.
> > > -  Complex size expressions.
> > > -  'void *' VLAs (assuming GNU conventions: sizeof(void *)==1).
> > 
> > That doesn't cover any of the tricky issues with such proposals, such as 
> > the choice of which entity is referred to by the parameter name when there 
> > are multiple nested parameter lists that use the same parameter name, or 
> > when the identifier is visible from an outer scope (including in 
> > particular the case where it's declared as a typedef name in an outer 
> > scope).
> 
> In fact I can't tell from these examples whether you mean for a '.' token 
> after '[' to have special semantics, or whether you mean to have a special 
> '. identifier' form of expression valid in certain context (each of which 
> introduces its own complications; for the former, typedef names from outer 
> scopes are problematic; for the latter, it's designated initializers where 
> you get complications, for example).  Designing new syntax that doesn't 
> cause ambiguity is generally tricky, and this sort of language extension 
> is the kind of thing where you'd expect to go through at least five 
> iterations of a WG14 paper before you have something like a sound 
> specification.

I am not sure what Alejandro has in mind exactly, but my idea of using
a new notation [.identifier] would be to limit it to accessing other
parameter names in the same parameter list only, so that there is 

1) no ambiguity about what is referred to, and
2) one can access parameters which come later 

If we want to specify something like this, I think we should also
restrict what kind of expressions one allows, e.g. it has to
be side-effect free.  But maybe we want to make this even more
restrictive (at least initially).

One problem with WG14 papers is that people put in too much,
because the overhead is so high and the standard is not updated
very often.  It would be better to build such a feature more
incrementally, which could be done more easily with a compiler
extension.  One could start supporting just [.x] but not more
complicated expressions.

Later WG14 can still accept or reject or modify this proposal
based on the experience we get.

(I would also be happy with using GNU forward declarations, and
I am not sure why people dislike them so much.) 


Martin





* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  0:06                           ` Alejandro Colomar
  2022-11-10  0:09                             ` Alejandro Colomar
  2022-11-10  1:33                             ` Joseph Myers
@ 2022-11-10  9:40                             ` G. Branden Robinson
  2022-11-10 10:59                               ` Alejandro Colomar
  2 siblings, 1 reply; 85+ messages in thread
From: G. Branden Robinson @ 2022-11-10  9:40 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

[-- Attachment #1: Type: text/plain, Size: 944 bytes --]

Hi Alex,

At 2022-11-10T01:06:31+0100, Alejandro Colomar wrote:
> Now, I've released man-pages-6.01 very recently (just a few weeks
> ago), and I don't plan to release again in a year or two, so there's
> time to do the implementation in GCC.  From my side, please consider
> this an ACK or even somewhat of a push to get things done in the
> compiler side of things :)

Do you mean you _don't_ plan to release again for a year or two?

You know what Moltke said about plans and contact with the enemy.  For
one thing, I think the Linux kernel will move too fast to permit such a
leisurely cadence.

Also, as soon as Bertrand and I can get groff 1.23 out[1], I am hoping
you will, shortly thereafter, migrate to the new `MR` macro.

<tents fingers, laughs villainously>

Regards,
Branden

[1] Only 6 RC bugs left!

    https://savannah.gnu.org/bugs/index.php?go_report=Apply&group=groff&set=custom&report_id=225&status_id=1&plan_release_id=103


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  6:21                                 ` Martin Uecker
@ 2022-11-10 10:09                                   ` Alejandro Colomar
  2022-11-10 23:19                                   ` Joseph Myers
  1 sibling, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 10:09 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 6277 bytes --]

Hi Joseph and Martin!

On 11/10/22 07:21, Martin Uecker wrote:
> On Thursday, 2022-11-10 at 01:39 +0000, Joseph Myers wrote:
>> On Thu, 10 Nov 2022, Joseph Myers wrote:
>>
>>> On Thu, 10 Nov 2022, Alejandro Colomar via Gcc wrote:
>>>
>>>> I've shown the three kinds of prototypes that have been changed:
>>>>
>>>> -  Normal VLA; nothing fancy except for the '.'.
>>>> -  Complex size expressions.
>>>> -  'void *' VLAs (assuming GNU conventions: sizeof(void)==1).
>>>
>>> That doesn't cover any of the tricky issues with such proposals, such as
>>> the choice of which entity is referred to by the parameter name when there
>>> are multiple nested parameter lists that use the same parameter name, or
>>> when the identifier is visible from an outer scope (including in
>>> particular the case where it's declared as a typedef name in an outer
>>> scope).
>>
>> In fact I can't tell from these examples whether you mean for a '.' token
>> after '[' to have special semantics, or whether you mean to have a special
>> '. identifier' form of expression valid in certain context (each of which
>> introduces its own complications; for the former, typedef names from outer
>> scopes are problematic; for the latter, it's designated initializers where
>> you get complications, for example).  Designing new syntax that doesn't
>> cause ambiguity is generally tricky, and this sort of language extension
>> is the kind of thing where you'd expect to go through at least five
>> iterations of a WG14 paper before you have something like a sound
>> specification.
> 
> I am not sure what Alejandro has in mind exactly, but my idea of using
> a new notation [.identifier] would be to limit it to accessing other
> parameter names in the same parameter list only, so that there is
> 
> 1) no ambiguity what is referred to  and
> 2) one can access parameters which come later

Yes, I implemented your idea.  As always, I thought I had linked to it in the 
commit message, but I didn't.  Quite a bad thing for the commit that implements 
a completely new feature to not point to the documentation/idea at all.

So, the documentation followed by these 3 patches is Martin's email:
<https://lore.kernel.org/linux-man/601680ae-30d7-1481-e152-034083f6dde1@gmail.com/T/#med2bdfcc31a3d0b3bc6c48b229c8d8dd5088935e>

It was sound in my head, and I couldn't see any inconsistencies.

-  I implemented it with '.' restricted to referring to parameters of the 
function being prototyped (commit 1).

-  I also allowed complex expressions in the prototypes (commit 2), since it's 
something that can be quite useful (that was already foreseen by Martin's idea, 
IIRC).  The most useful example that I have in my mind is a patch that I'm 
developing for shadow-utils:
 
<https://github.com/shadow-maint/shadow/pull/569/files#diff-12b560bab6b4fb8f7f3a16f01aaa994de539a8bed3058c976be0daebe16405c1>

    The gist of it is a function that gets a fixed-width non-NUL-terminated 
string, and copies it into a NUL-terminated string in a buffer that of course 
has to be one byte larger than the input string:

	void buf2str(char dst[restrict .n+1], const char src[restrict .n],
	             size_t n);

-  I extended the idea to apply to void[] (commit 3).  Something not yet allowed 
by GCC, but very useful IMO, especially for the mem...(3) functions.  Since GNU 
C consistently treats sizeof(void)==1, it makes sense to allow VLA syntax in 
that way.  This is not at all about allowing true VLAs of type void[]; that's 
forbidden, and should continue to be forbidden.  But since parameters are just 
pointers, I don't see any issue with allowing false void[] VLAs in parameters 
that really are void* in disguise.
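
For anyone who wants to try the idea with a current compiler: standard C 
already accepts the array notation when the size parameter comes first.  The 
sketch below is a hypothetical variant, buf2str_n(), with n moved to the 
front.  The name, the parameter order, and the memcpy-based body are 
assumptions for illustration only; they are not taken from the shadow-utils 
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical reordering of buf2str(): n comes first, so standard C
 * array-parameter notation can refer to it.  src holds n bytes and need
 * not be NUL-terminated; dst must have room for n + 1 bytes.
 */
void buf2str_n(size_t n, char dst[restrict n + 1], const char src[restrict n])
{
	assert(dst != NULL && src != NULL);

	memcpy(dst, src, n);	/* copy the fixed-width field verbatim */
	dst[n] = '\0';		/* terminate in the extra byte */
}
```

A caller with a 4-byte field would pass a 5-byte destination, e.g. 
buf2str_n(4, out, field) with char out[5].  With the '.n' notation the same 
bounds could be written while keeping n as the last parameter.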


The 3 commits are here (last 3 commits in that log):
<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?id=c64cd13e002561c6802c6a1a1a8a640f034fea70>


Martin, please check if I implemented your idea faithfully.  The 3 example 
prototypes I showed are good representatives of what I added, so if you don't 
understand man(7) source you could just read them and see if they make sense to 
you; the rest of the changes are of the same kind.  Or you could install the man 
pages from the repo :)



> 
> If we want to specify something like this, I think we should also
> restrict what kind of expressions one allows, e.g. it has to
> be side-effect free.

Well, yes, there should be no side effects; it would not make sense in a 
prototype.  I'd put it as simply as with _Generic(3) and similar stuff, where 
the controlling expression is not evaluated for side effects.  I never remember 
about sizeof() or typeof(): I always need to consult if they have side effects 
or not.  I'll be documenting that in the man-pages soon.
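
As a reminder of the current rules, which a side-effect-free '.identifier' 
expression could mirror: sizeof does not evaluate its operand unless the 
operand has a variable length array type, and the controlling expression of 
_Generic is never evaluated.  A small sketch follows; probe_evaluation() and 
struct eval_counts are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of how many increments actually happened. */
struct eval_counts {
	int plain;	/* increments of i (non-VLA operands) */
	int vla;	/* increments of n (VLA size expression) */
};

struct eval_counts probe_evaluation(void)
{
	int i = 0;
	size_t n = 3;

	/* Operand is not a VLA type: i++ is not evaluated. */
	size_t a = sizeof(i++);

	/* The controlling expression only selects by type: not evaluated. */
	int b = _Generic(i++, int: 1, default: 0);

	/* Operand is a VLA type: the size expression n++ is evaluated. */
	size_t c = sizeof(int[n++]);
	assert(c == 3 * sizeof(int));	/* post-increment yields the old value */

	(void) a;
	(void) b;

	return (struct eval_counts){ .plain = i, .vla = (int)(n - 3) };
}
```

On a conforming C11 compiler the result is plain == 0 and vla == 1, i.e. only 
the VLA case has a side effect.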

>  But maybe we want to make this even more
> restrictive (at least initially).

Yeah, you could go for an initial implementation that only supports my commit 1; 
that would be the simplest.  That would cover already the vast majority of 
cases.  But please consider commits 2 and 3 afterwards, since I believe they are 
also of great importance.

> 
> One problem with WG14 papers is that people put in too much,
> because the overhead is so high and the standard is not updated
> very often.  It would be better to build such a feature more
> incrementally, which could be done more easily with a compiler
> extension.  One could start supporting just [.x] but not more
> complicated expressions.
> 
> Later WG14 can still accept or reject or modify this proposal
> based on the experience we get.

Yeah, and I also think any WG14 papers with features as important as this one 
without prior experience in a real compiler should be rejected.  I don't think 
it makes sense to standardize something just from theoretical discussions, and 
force everyone to implement it afterwards.  No matter how good the reviewers are.

> 
> (I would also be happy with using GNU forward declarations, and
> I am not sure why people dislike them so much.)

For me, it's how easy it is to confuse a comma with a semicolon.  Also, 
unnecessarily long lines.
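
To make the comparison concrete, here is a minimal sketch of a GNU forward 
parameter declaration.  count_zeros() is a hypothetical function invented for 
this example; the preprocessor guard is an assumption that only GCC proper 
accepts the extension, with a plain fallback for other compilers.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && !defined(__clang__)
/* GNU C: "size_t n;" before the semicolon forward-declares the parameter,
 * so a[n] may refer to it although n is passed last. */
size_t count_zeros(size_t n; const int a[n], size_t n)
#else
/* Fallback without the extension: the bound is not expressed at all. */
size_t count_zeros(const int a[], size_t n)
#endif
{
	size_t zeros = 0;

	assert(a != NULL);
	for (size_t i = 0; i < n; i++)
		zeros += (a[i] == 0);

	return zeros;
}
```

Note that the only thing separating the forward declaration from the real 
parameter list is a single ';', and that n has to be spelled out twice, which 
is where the long lines come from.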

> 
> Martin
Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  9:40                             ` G. Branden Robinson
@ 2022-11-10 10:59                               ` Alejandro Colomar
  2022-11-10 17:47                                 ` Alejandro Colomar
  2022-11-10 22:25                                 ` [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters G. Branden Robinson
  0 siblings, 2 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 10:59 UTC (permalink / raw)
  To: G. Branden Robinson
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1807 bytes --]

Hi Branden!

On 11/10/22 10:40, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2022-11-10T01:06:31+0100, Alejandro Colomar wrote:
>> Now, I've released man-pages-6.01 very recently (just a few weeks
>> ago), and I don't plan to release again in a year or two, so there's
>> time to do the implementation in GCC.  From my side, please consider
>> this an ACK or even somewhat of a push to get things done in the
>> compiler side of things :)
> 
> Do you mean you _don't_ plan to release again for a year or two?
> 
> You know what Moltke said about plans and contact with the enemy.  For
> one thing, I think the Linux kernel will move too fast to permit such a
> leisurely cadence.

Heh, at this point, I've burnt my ships by using enhanced VLA syntax.  If I 
release that before GCC implements it, I'm expecting to see an avalanche of 
reports about it (and I also expect that GCC and forums will receive a similar 
amount).  So yes, I expect to wait some longish time.

> 
> Also, as soon as Bertrand and I can get groff 1.23 out[1], I am hoping
> you will, shortly thereafter, migrate to the new `MR` macro.

Not as soon as it gets released, because I expect (at least a decent number of) 
contributors to be able to read the pages to which they contribute, but as 
soon as it makes it into Debian stable, yes, that's in my plans.  So, if you 
make it before the freeze, that means around a couple of months from now.

> 
> <tents fingers, laughs villainously>

<also tents fingers, laughs villainously>

> 
> Regards,
> Branden
> 
> [1] Only 6 RC bugs left!

Looks good!

Cheers,

Alex

> 
>      https://savannah.gnu.org/bugs/index.php?go_report=Apply&group=groff&set=custom&report_id=225&status_id=1&plan_release_id=103

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10 10:59                               ` Alejandro Colomar
@ 2022-11-10 17:47                                 ` Alejandro Colomar
  2022-11-10 18:04                                   ` MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters) Alejandro Colomar
  2022-11-10 22:25                                 ` [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters G. Branden Robinson
  1 sibling, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 17:47 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2616 bytes --]

[removed gcc@ and other uninterested people; added groff@]

Hi Branden!

On 11/10/22 11:59, Alejandro Colomar wrote:
>> Also, as soon as Bertrand and I can get groff 1.23 out[1], I am hoping
>> you will, shortly thereafter, migrate to the new `MR` macro.
> 
> Not as soon as it gets released, because I expect (at least a decent amount of) 
> contributors to be able to read the pages to which they contribute to, but as 
> soon as it makes it into Debian stable, yes, that's in my plans.  So, if you 
> make it before the freeze, that means around a couple of months from now.

I won't be applying the patch now, to avoid contributors suddenly not seeing 
man page references while preparing patches.  But I'll start preparing the 
patch, to see where the most difficult parts are.  And maybe report some 
issues with the usability.

My first thing was to run:

$ grep -rn '^\.BR .* ([1-9]\w*)'

I'm pleasantly surprised that there seem to be no false positives.  I 
didn't expect that.  But since things like exit(1) are code, they are probably 
either not highlighted at all, or maybe are italicized (as code is).  So that's 
a good thing.

It showed a few lines that might be problematic, but that's actually bad code, 
which I need to fix:

man7/credentials.7:270:.BR setuid "(2) (" setgid (2))
man7/credentials.7:274:.BR seteuid "(2) (" setegid (2))
man7/credentials.7:277:.BR setfsuid "(2) (" setfsgid (2))
man7/credentials.7:280:.BR setreuid "(2) (" setregid (2))
man7/credentials.7:284:.BR setresuid "(2) (" setresgid (2))

Those are asking for a 2-line thing, where the second line is RB instead of BR. 
Which reminds me to check RB:

$ grep -rn '^\.RB .* ([1-9]\w*)'

There are far fewer cases, and they also seem fine to script, with a few minor 
ffixes too.

The big issue is that your MR doesn't support leading text:

        .MR page‐title manual‐section [trailing‐text]

I remember we had this discussion about what to do with it.  A 4th argument? 
There's also conflict with a hypothetical link that we might want to add later.

My opinion is that the 4th argument should be the leading text.  Asking to use 
the escape sequence (was it \c?) to work around that limitation is not very 
nice, especially for scripting the change.

If you want a 5th argument for a URI, you can specify the leading text as "", 
which is not much of an issue.  And you keep the trailing text and the leading 
one together.

What are your thoughts?  What should we do?

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 17:47                                 ` Alejandro Colomar
@ 2022-11-10 18:04                                   ` Alejandro Colomar
  2022-11-10 18:11                                     ` Alejandro Colomar
                                                       ` (2 more replies)
  0 siblings, 3 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 18:04 UTC (permalink / raw)
  To: G. Branden Robinson, groff; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3023 bytes --]

Of course I forgot to rename the title, and to add groff@.  Nice.

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 18:04                                   ` MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters) Alejandro Colomar
@ 2022-11-10 18:11                                     ` Alejandro Colomar
  2022-11-10 18:20                                       ` Alejandro Colomar
  2022-11-10 19:37                                     ` Alejandro Colomar
  2022-11-10 22:55                                     ` G. Branden Robinson
  2 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 18:11 UTC (permalink / raw)
  To: G. Branden Robinson, groff; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 315 bytes --]

Hi Branden,

Another interesting thing is what to do here:

$ sed -n 319,320p man2/timerfd_create.2
.TP
.BR poll "(2), " select "(2) (and similar)"


Can I have multiple input lines as the tag for a TP?  How to put 2 MR references 
in there?

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 18:11                                     ` Alejandro Colomar
@ 2022-11-10 18:20                                       ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 18:20 UTC (permalink / raw)
  To: G. Branden Robinson, groff; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 477 bytes --]



On 11/10/22 19:11, Alejandro Colomar wrote:
> Hi Branden,
> 
> Another interesting thing is what to do here:
> 
> $ sed -n 319,320p man2/timerfd_create.2
> .TP
> .BR poll "(2), " select "(2) (and similar)"
> 
> 
> Can I have multiple input lines as the tag for a TP?  How to put 2 MR references 
> in there?

Or maybe I should reorganize it and use TQ and multiple separate tags...

> 
> Cheers,
> 
> Alex
> 

-- 
<http://www.alejandro-colomar.es/>


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 18:04                                   ` MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters) Alejandro Colomar
  2022-11-10 18:11                                     ` Alejandro Colomar
@ 2022-11-10 19:37                                     ` Alejandro Colomar
  2022-11-10 20:41                                       ` Alejandro Colomar
  2022-11-10 22:55                                     ` G. Branden Robinson
  2 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 19:37 UTC (permalink / raw)
  To: G. Branden Robinson, groff; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 6737 bytes --]

Hi Branden,

On 11/10/22 19:04, Alejandro Colomar wrote:
> Of course I forgot to rename the title, and to add groff@.  Nice.
> 
> -------- Forwarded Message --------
> Subject: Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
> Date: Thu, 10 Nov 2022 18:47:38 +0100
> From: Alejandro Colomar <alx.manpages@gmail.com>
> To: G. Branden Robinson <g.branden.robinson@gmail.com>
> CC: Ingo Schwarze <schwarze@usta.de>, linux-man@vger.kernel.org
> 
> [removed gcc@ and other uninterested people; added groff@]
> 
> Hi Branden!
> 
> On 11/10/22 11:59, Alejandro Colomar wrote:
>  >> Also, as soon as Bertrand and I can get groff 1.23 out[1], I am hoping
>  >> you will, shortly thereafter, migrate to the new `MR` macro.
>  >
>  > Not as soon as it gets released, because I expect (at least a decent amount of)
>  > contributors to be able to read the pages to which they contribute to, but as
>  > soon as it makes it into Debian stable, yes, that's in my plans.  So, if you
>  > make it before the freeze, that means around a couple of months from now.
> 
> I won't be applying the patch now, to avoid contributors seeing people suddenly 
> not seeing man page references while preparing patches.  But I'll start 
> preparing the patch, to see where are the most difficult parts.  And maybe 
> report some issues with the usability.
> 
> My first thing was to run:
> 
> $ grep -rn '^\.BR .* ([1-9]\w*)'
> 
> I'm surprised for good that it seems that there are no false positives.  I 
> didn't expect that.  But since things like exit(1) are code, they are probably 
> either not highlighted at all, or maybe are italicized (as code is).  So that's 
> a good thing.
> 
> It showed a few lines that might be problematic, but that's actually bad code, 
> which I need to fix:
> 
> man7/credentials.7:270:.BR setuid "(2) (" setgid (2))
> man7/credentials.7:274:.BR seteuid "(2) (" setegid (2))
> man7/credentials.7:277:.BR setfsuid "(2) (" setfsgid (2))
> man7/credentials.7:280:.BR setreuid "(2) (" setregid (2))
> man7/credentials.7:284:.BR setresuid "(2) (" setresgid (2))
> 
> Those are asking for a 2-line thing, where the second line is RB instead of BR. 
> Which reminds me to check RB:
> 
> $ grep -rn '^\.RB .* ([1-9]\w*)'
> 
> There are much less cases, and also seem to be fine to script, with a few minor 
> ffixes too.
> 
> The big issue is that your MR doesn't support leading text:
> 
>          .MR page‐title manual‐section [trailing‐text]
> 
> I remember we had this discussion about what to do with it.  A 4th argument? 
> There's also conflict with a hypothetical link that we might want to add later.
> 
> My opinion is that the 4th argument should be the leading text.  Asking to use 
> the escape (was it \c?) sequence to workaround that limitation is not very nice. 
>    Especially for scripting the change.
> 
> If you want a 5th argument for a URI, you can specify the leading text as "", 
> which is not much of an issue.  And you keep the trailing text and the leading 
> one together.
> 
> What are your thoughts?  What should we do?

To document and discuss the way I'm migrating, I'll share here the scripts:

The simplest case: a single man page reference with no other stuff around it:

$ find man* -type f \
   | xargs sed -i 's/^\.BR \([^ ]*\) (\([1-9]\w*\))$/.MR \1 \2/'

Second simplest case: a single man page reference with only trailing stuff:

$ find man* -type f \
   | xargs sed -i 's/^\.BR \([^ ]*\) (\([1-9]\w*\))/.MR \1 \2 /'


And here I continue with hypothetical syntax not yet allowed by groff.

A single man page reference with only leading stuff:

$ find man* -type f \
   | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) (\([1-9]\w*\))$/.MR \2 \3 "" \1/'

A single man page reference with both leading and trailing stuff (thank $DEITY 
for not having comments in any of those, so I can just run the script):

$ find man* -type f \
   | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) (\([1-9]\w*\))\(.*\)/.MR \2 \3 \4 \1/'


After running those 4, and inspecting the changes to make sure they look good 
(and they do), I have quite a small number of references that my scripts didn't 
catch.  Some of them just need a ffix before running the scripts again; some 
others need a manual migration, but nothing too difficult.


alx@asus5775:~/src/linux/man-pages/man-pages/MR$ grep -rn '^\.RB .* .\?([1-9]\w*)'
man2/mremap.2:324:.RB ( mmap "(2) " MAP_PRIVATE ),
man2/perf_event_open.2:35:.RB ( read "(2), " mmap "(2), " prctl "(2), " fcntl "(2), etc.)."
man2/open.2:86:.RB ( read "(2), " write "(2), " lseek "(2), " fcntl (2),
man3type/div_t.3type:43:.RB [[ l ] l ] div (3)
man3/fts.3:189:.RB [ l ] stat (2)
man3/fts.3:200:.RB [ l ] stat (2)
man3/fts.3:331:.RB [ l ] stat (2)
man3/fts.3:745:.RB [ l ] stat (2).
man5/proc.5:3426:.RB ( flock "(2) and " fcntl (2))
man7/pty.7:125:.RB ( ssh "(1), " rlogin "(1), " telnet (1)),
alx@asus5775:~/src/linux/man-pages/man-pages/MR$ grep -rn '^\.BR .* .\?([1-9]\w*)'
man1/getent.1:346:.BR ahosts / getaddrinfo (3)
man2/ioprio_set.2:278:.BR IOPRIO_CLASS_RT " (1)"
man2/ioprio_set.2:293:.BR IOPRIO_CLASS_BE " (2)"
man2/ioprio_set.2:306:.BR IOPRIO_CLASS_IDLE " (3)"
man2/keyctl.2:985:.BR  execve (2).
man2/ioctl_iflags.2:63:.BR mount  (2)
man2/memfd_create.2:232:.BR  open (2)
man2/syslog.2:77:.BR SYSLOG_ACTION_OPEN " (1)"
man2/syslog.2:81:.BR SYSLOG_ACTION_READ " (2)"
man2/syslog.2:93:.BR SYSLOG_ACTION_READ_ALL " (3)"
man2/syslog.2:103:.BR SYSLOG_ACTION_READ_CLEAR " (4)"
man2/syslog.2:109:.BR SYSLOG_ACTION_CLEAR " (5)"
man2/syslog.2:128:.BR SYSLOG_ACTION_CONSOLE_OFF " (6)"
man2/syslog.2:152:.BR SYSLOG_ACTION_CONSOLE_ON " (7)"
man2/syslog.2:175:.BR SYSLOG_ACTION_CONSOLE_LEVEL " (8)"
man2/syslog.2:192:.BR SYSLOG_ACTION_SIZE_UNREAD " (9) (since Linux 2.4.10)"
man2/syslog.2:203:.BR SYSLOG_ACTION_SIZE_BUFFER " (10) (since Linux 2.6.6)"
man2/sigreturn.2:42:.BR sigaltstack "(2))\(emin"
man2/timerfd_create.2:320:.BR poll "(2), " select "(2) (and similar)"
man2/eventfd.2:144:.BR poll "(2), " select "(2) (and similar)"
man2/signalfd.2:134:.BR poll "(2), " select "(2) (and similar)"
man3/duplocale.3:99:.BR  freelocale (3).
man7/spufs.7:122:.BR read "(2), " pread "(2), " write "(2), " pwrite "(2), " lseek (2)
man7/credentials.7:270:.BR setuid "(2) (" setgid (2))
man7/credentials.7:274:.BR seteuid "(2) (" setegid (2))
man7/credentials.7:277:.BR setfsuid "(2) (" setfsgid (2))
man7/credentials.7:280:.BR setreuid "(2) (" setregid (2))
man7/credentials.7:284:.BR setresuid "(2) (" setresgid (2))


Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 19:37                                     ` Alejandro Colomar
@ 2022-11-10 20:41                                       ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 20:41 UTC (permalink / raw)
  To: G. Branden Robinson, groff; +Cc: Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 5503 bytes --]

On 11/10/22 20:37, Alejandro Colomar wrote:
> Hi Branden,
> 
> On 11/10/22 19:04, Alejandro Colomar wrote:
>> Of course I forgot to rename the title, and to add groff@.  Nice.
>>
>> -------- Forwarded Message --------
>> Subject: Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function 
>> parameters
>> Date: Thu, 10 Nov 2022 18:47:38 +0100
>> From: Alejandro Colomar <alx.manpages@gmail.com>
>> To: G. Branden Robinson <g.branden.robinson@gmail.com>
>> CC: Ingo Schwarze <schwarze@usta.de>, linux-man@vger.kernel.org
>>
>> [removed gcc@ and other uninterested people; added groff@]
>>
>> Hi Branden!
>>
>> On 11/10/22 11:59, Alejandro Colomar wrote:
>>  >> Also, as soon as Bertrand and I can get groff 1.23 out[1], I am hoping
>>  >> you will, shortly thereafter, migrate to the new `MR` macro.
>>  >
>>  > Not as soon as it gets released, because I expect (at least a decent amount 
>> of)
>>  > contributors to be able to read the pages to which they contribute to, but as
>>  > soon as it makes it into Debian stable, yes, that's in my plans.  So, if you
>>  > make it before the freeze, that means around a couple of months from now.
>>
>> I won't be applying the patch now, to avoid contributors seeing people 
>> suddenly not seeing man page references while preparing patches.  But I'll 
>> start preparing the patch, to see where are the most difficult parts.  And 
>> maybe report some issues with the usability.
>>
>> My first thing was to run:
>>
>> $ grep -rn '^\.BR .* ([1-9]\w*)'
>>
>> I'm surprised for good that it seems that there are no false positives.  I 
>> didn't expect that.  But since things like exit(1) are code, they are probably 
>> either not highlighted at all, or maybe are italicized (as code is).  So 
>> that's a good thing.
>>
>> It showed a few lines that might be problematic, but that's actually bad code, 
>> which I need to fix:
>>
>> man7/credentials.7:270:.BR setuid "(2) (" setgid (2))
>> man7/credentials.7:274:.BR seteuid "(2) (" setegid (2))
>> man7/credentials.7:277:.BR setfsuid "(2) (" setfsgid (2))
>> man7/credentials.7:280:.BR setreuid "(2) (" setregid (2))
>> man7/credentials.7:284:.BR setresuid "(2) (" setresgid (2))
>>
>> Those are asking for a 2-line thing, where the second line is RB instead of 
>> BR. Which reminds me to check RB:
>>
>> $ grep -rn '^\.RB .* ([1-9]\w*)'
>>
>> There are far fewer cases, and they also seem fine to script, with a few
>> minor fixes too.
>>
>> The big issue is that your MR doesn't support leading text:
>>
>>          .MR page‐title manual‐section [trailing‐text]
>>
>> I remember we had this discussion about what to do with it.  A 4th argument? 
>> There's also conflict with a hypothetical link that we might want to add later.
>>
>> My opinion is that the 4th argument should be the leading text.  Asking to use 
>> the escape sequence (was it \c?) to work around that limitation is not very 
>> nice, especially for scripting the change.
>>
>> If you want a 5th argument for a URI, you can specify the leading text as "", 
>> which is not much of an issue.  And you keep the trailing text and the leading 
>> one together.
>>
>> What are your thoughts?  What should we do?
> 
> To document and discuss the way I'm migrating, I'll share here the scripts:
> 
> The simplest case: a single man page reference with no other stuff around it:
> 
> $ find man* -type f \
>    | xargs sed -i 's/^\.BR \([^ ]*\) (\([1-9]\w*\))$/.MR \1 \2/'
> 
> Second simplest case: a single man page reference with only trailing stuff:
> 
> $ find man* -type f \
>    | xargs sed -i 's/^\.BR \([^ ]*\) (\([1-9]\w*\))/.MR \1 \2 /'
> 
> 
> And here I continue with hypothetical syntax not yet allowed by groff.
> 
> A single man page reference with only leading stuff:
> 
> $ find man* -type f \
>    | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) (\([1-9]\w*\))$/.MR \2 \3 "" \1/'
> 
> A single man page reference with both leading and trailing stuff (thank $DEITY 
> for not having comments in any of those, so I can just run the script):
> 
> $ find man* -type f \
>    | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) (\([1-9]\w*\))\(.*\)/.MR \2 \3 \4 \1/'

And a few more, to cover same-page references.  As Ingo recommended, I'm adding 
the section for consistency.  Redundancy is not a big issue here.

Man references in the same page, with no stuff around them:

$ find man2 -type f \
   | xargs sed -i 's/^\.BR \([^ ]*\) ()$/.MR \1 2/'

$ find man3 -type f \
   | xargs sed -i 's/^\.BR \([^ ]*\) ()$/.MR \1 3/'


Man references in the same page, with trailing stuff:

$ find man2 -type f \
   | xargs sed -i 's/^\.BR \([^ ]*\) ()/.MR \1 2 /'

$ find man3 -type f \
   | xargs sed -i 's/^\.BR \([^ ]*\) ()/.MR \1 3 /'


Man references in the same page, with only leading stuff:

$ find man2 -type f \
   | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) ()$/.MR \2 2 "" \1/'

$ find man3 -type f \
   | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) ()$/.MR \2 3 "" \1/'


And finally, man references in the same page, with both leading and trailing 
stuff (again, I was lucky, and there were no comments):

$ find man2 -type f \
   | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) ()\(.*\)/.MR \2 2 \3 \1/'

$ find man3 -type f \
   | xargs sed -i 's/^\.RB \([^ ]*\) \([^ ]*\) ()\(.*\)/.MR \2 3 \3 \1/'
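
To make the effect of these rules easier to review, here is a minimal dry-run
sketch.  It applies the four man3 expressions from this message to sample
lines on stdin instead of editing files in place.  The function name
migrate_same_page and the sample inputs are made up for illustration, and the
4-argument MR form (leading text as the 4th argument) is still the
hypothetical syntax discussed above, not something groff accepts.

```shell
#!/bin/sh
# Dry run of the four man3 same-page rules quoted above: reads roff source on
# stdin and writes the result to stdout instead of editing files with -i.
# The 4-argument MR form produced by the last two rules is hypothetical.
migrate_same_page()
{
	sed \
	    -e 's/^\.BR \([^ ]*\) ()$/.MR \1 3/' \
	    -e 's/^\.BR \([^ ]*\) ()/.MR \1 3 /' \
	    -e 's/^\.RB \([^ ]*\) \([^ ]*\) ()$/.MR \2 3 "" \1/' \
	    -e 's/^\.RB \([^ ]*\) \([^ ]*\) ()\(.*\)/.MR \2 3 \3 \1/'
}

# Sample lines: bare reference, trailing text, leading text, and both.
printf '%s\n' \
    '.BR fork ()' \
    '.BR fork (),' \
    '.RB pre- exec ()' \
    '.RB ( exec ()).' \
| migrate_same_page
```

The order of the expressions matters: each rule only fires on lines that still
start with .BR or .RB, so a line rewritten by an earlier rule is not touched
again by a later one.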

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10 10:59                               ` Alejandro Colomar
  2022-11-10 17:47                                 ` Alejandro Colomar
@ 2022-11-10 22:25                                 ` G. Branden Robinson
  1 sibling, 0 replies; 85+ messages in thread
From: G. Branden Robinson @ 2022-11-10 22:25 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

[-- Attachment #1: Type: text/plain, Size: 1533 bytes --]

Hi Alex,

At 2022-11-10T11:59:02+0100, Alejandro Colomar wrote:
> > You know what Moltke said about plans and contact with the enemy.
> > For one thing, I think the Linux kernel will move too fast to permit
> > such a leisurely cadence.
> 
> Heh, at this point, I burnt my ships, by using enhanced VLA syntax.
> If I release that before GCC, I'm expecting to see an avalanche of
> reports about it (and I also expect that GCC and forums will receive a
> similar amount).  So yes, I expect to wait some longish time.

Hah, you rebutted my Moltke with your namesake.  You understand that I'm
obligated to spring a reference to the Battle of Lepanto or something on
you at some point.

> > Also, as soon as Bertrand and I can get groff 1.23 out[1], I am
> > hoping you will, shortly thereafter, migrate to the new `MR` macro.
> 
> Not as soon as it gets released, because I expect (at least a decent
> amount of) contributors to be able to read the pages to which they
> contribute,

Laggardly adopters can always put this in man.local.

.if !d MR \{\
.  de MR
.    IR \\$1 (\\$2)\\$3
.  .
.\}

> but as soon as it makes it into Debian stable, yes, that's in my
> plans.  So, if you make it before the freeze, that means around a
> couple of months from now.

Yes.  It is a major personal goal to get groff 1.23 into Debian
bookworm.

> > <tents fingers, laughs villainously>
> 
> <also tents fingers, laughs villainously>

https://www.youtube.com/watch?v=VhH2egTLohM

Regards,
Branden


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 18:04                                   ` MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters) Alejandro Colomar
  2022-11-10 18:11                                     ` Alejandro Colomar
  2022-11-10 19:37                                     ` Alejandro Colomar
@ 2022-11-10 22:55                                     ` G. Branden Robinson
  2022-11-10 23:55                                       ` Alejandro Colomar
  2 siblings, 1 reply; 85+ messages in thread
From: G. Branden Robinson @ 2022-11-10 22:55 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: groff, Ingo Schwarze, linux-man

[-- Attachment #1: Type: text/plain, Size: 6493 bytes --]

Hi Alex,

At 2022-11-10T19:04:46+0100, Alejandro Colomar wrote:
> Of course I forgot to rename the title, and to add groff@.  Nice.

It gave me time to reply to this one.  :)

> On 11/10/22 11:59, Alejandro Colomar wrote:
> I won't be applying the patch now, so that contributors don't suddenly
> stop seeing man page references while preparing patches.  But I'll
> start preparing the patch, to see where the most difficult parts are.
> And maybe report some issues with the usability.
> 
> My first thing was to run:
> 
> $ grep -rn '^\.BR .* ([1-9]\w*)'
> 
> I'm pleasantly surprised that there seem to be no false
> positives.  I didn't expect that.  But since things like exit(1) are
> code, they are probably either not highlighted at all, or maybe are
> italicized (as code is).  So that's a good thing.
> 
> It showed a few lines that might be problematic, but that's actually
> bad code, which I need to fix:
> 
> man7/credentials.7:270:.BR setuid "(2) (" setgid (2))
> man7/credentials.7:274:.BR seteuid "(2) (" setegid (2))
> man7/credentials.7:277:.BR setfsuid "(2) (" setfsgid (2))
> man7/credentials.7:280:.BR setreuid "(2) (" setregid (2))
> man7/credentials.7:284:.BR setresuid "(2) (" setresgid (2))
> 
> Those are asking for a 2-line thing, where the second line is RB instead of
> BR. Which reminds me to check RB:
> 
> $ grep -rn '^\.RB .* ([1-9]\w*)'
> 
> There are far fewer cases, and they also seem fine to script, with a few
> minor fixes too.
> 
> The big issue is that your MR doesn't support leading text:
> 
>         .MR page‐title manual‐section [trailing‐text]
> 
> I remember we had this discussion about what to do with it.  A 4th
> argument?  There's also conflict with a hypothetical link that we
> might want to add later.
> 
> My opinion is that the 4th argument should be the leading text.
> Asking to use the escape sequence (was it \c?) to work around that
> limitation is not very nice, especially for scripting the change.

Here's what I did for groff.

commit 2ab0dacb95863a2e347d06cf970676c74c784ce2
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Fri Oct 8 00:46:41 2021 +1100

    [man pages]: Migrate man(7) cross refs to `MR`.

     # Handle simplest case: ".IR foo (1)".
    s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\(@MAN[157]EXT@\))$/.MR \2 \3/
    s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))$/.MR \2 \3/
     # Handle case: trailing puncutation, e.g., ".IR foo (1),".
    s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\(@MAN[157]EXT@\))\([^[:space:]]\+\)/.MR \2 \3 \4/
    s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))\([^[:space:]]\+\)/.MR \2 \3 \4/
     # Handle case: 3rd+ arguments or trailing comments.  This case is rare
     # and will require manual fixup if there are 4+ arguments to MR.  Use
     # groff -man -rCHECKSTYLE=1 to have them automatically reported.
    s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\(@MAN[157]EXT@\))\( .*\)/.MR \2 \3\4/
    s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))\( .*\)/.MR \2 \3\4/

You can ignore the 'MAN[157]EXT' lines; they are relevant only to
within-groff pages (because all of our man pages undergo sed-processing
to be prepared for installation).
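
As a rough illustration of what the three non-MANEXT rules do, the sketch below
runs them over a few sample lines on stdin.  The function name
migrate_cross_refs and the sample inputs are invented for this example, and
the `\+` in the expressions assumes GNU sed.

```shell
#!/bin/sh
# Apply the three generic rules from the commit message to stdin.
# Assumes GNU sed, because the expressions use the \+ extension.
migrate_cross_refs()
{
	sed \
	    -e 's/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))$/.MR \2 \3/' \
	    -e 's/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))\([^[:space:]]\+\)/.MR \2 \3 \4/' \
	    -e 's/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))\( .*\)/.MR \2 \3\4/'
}

# Sample lines: simplest case, trailing punctuation (with a leading \%
# hyphenation control, which the rules drop), and a separate 3rd argument.
printf '%s\n' \
    '.IR foo (1)' \
    '.BR \%groff (1),' \
    '.IR gs (1) )' \
| migrate_cross_refs
```

Note that none of these rules handles text that precedes the page title; such
lines are left as they are and need the `\c` treatment or manual rewording.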

> If you want a 5th argument for a URI, you can specify the leading text
> as "", which is not much of an issue.  And you keep the trailing text
> and the leading one together.
> 
> What are your thoughts?  What should we do?

I am reluctant to extend the interface of `MR` at this point because as
it is it has two nice properties: it aligns with mdoc(7)'s `Xr` macro
and with Plan 9 from User Space troff's `MR`, which did it first.

(Admittedly, P9US troff's `MR` macro doesn't supply the parentheses.  I
don't know if they intend to change that.  I'm willing to supply a patch
to change their implementation and their man pages to align with what I
did in groff.  As shown above, I believe my sed-fu is in order.)

I think man page authors should learn when the `\c` escape sequence is
appropriate and use it when warranted, and recast their sentences
otherwise.  That is why I provided an explicit example in the
groff_man_style(7) page.

    .MR page-title manual-section [trailing-text]
        (since groff 1.23) Set a man page cross reference as "page-
        title(manual-section)".  If trailing-text (typically
        punctuation) is specified, it follows the closing parenthesis
        without intervening space.  Hyphenation is disabled while the
        cross reference is set.  page-title is set in the font specified
        by the MF string.  The cross reference hyperlinks to a URI of
        the form "man:page-title(manual-section)".

            The output driver
            .MR grops 1
            produces PostScript from
            .I troff
            output.
            .
            The Ghostscript program (\c
            .MR gs 1 )
            interprets PostScript and PDF.

`\c` solves problems that are complicated to solve any other way.  As
far as I have seen, you don't ever need it in mdoc(7) pages, for
example...but you pay a price.  You must learn which of mdoc's several
dozen macros are "parsed" versus "callable" (and what the heck the
package even _means_ by those words); you must learn that `Pf` and `Ns`
exist and when to use them; you must learn that certain two-letter words
will not behave as you expect; and if you thought using mdoc(7) meant
you wouldn't have to type any groff escape sequences, think
again--you'll be putting `\&` all over the place.

People can use mdoc(7) if they want to (and now that I'm learning it
better, I will consult as I am able), but its reputation in some circles
as a superior solution to man(7) on all fronts that should have kicked
its predecessor into the grave long ago is due solely to irresponsible
hype from its exponents.

If you need help automating a change to adapt some Linux man-pages
documents to use `\c` before an `MR` call on the next line (where you
were using `RB` before, for instance), just let me know.  I am nearly
certain that a sed script utilizing its hold space feature can get the
job done.  (I've used the hold space profitably before, but occasions
for it come up seldom enough that I have to review my past solutions
before the knowledge comes back.  Or maybe it's creeping senescence.)

Regards,
Branden


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10  6:21                                 ` Martin Uecker
  2022-11-10 10:09                                   ` Alejandro Colomar
@ 2022-11-10 23:19                                   ` Joseph Myers
  2022-11-10 23:28                                     ` Alejandro Colomar
                                                       ` (2 more replies)
  1 sibling, 3 replies; 85+ messages in thread
From: Joseph Myers @ 2022-11-10 23:19 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Alejandro Colomar, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Thu, 10 Nov 2022, Martin Uecker via Gcc wrote:

> One problem with WG14 papers is that people put in too much,
> because the overhead is so high and the standard is not updated
> very often.  It would be better to build such a feature more
> incrementally, which could be done more easily with a compiler
> extension.  One could start supporting just [.x] but not more
> complicated expressions.

Even a compiler extension requires the level of detail of specification 
that you get with a WG14 paper (and the level of work on finding bugs in 
that specification), to avoid the problem we've had before with too many 
features added in GCC 2.x days where a poorly defined feature is "whatever 
the compiler accepts".

If you use .x as the notation but don't limit it to [.x], you have a 
completely new ambiguity between ordinary identifiers and member names

struct s { int a; };
void f(int a, int b[((struct s) { .a = 1 }).a]);

where it's newly ambiguous whether ".a = 1" is an assignment to the 
expression ".a" or a use of a designated initializer.

(I think that if you add any syntax for this, GNU VLA forward declarations 
are clearly to be preferred to inventing something new like [.x] which 
introduces its own problems.)

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10 23:19                                   ` Joseph Myers
@ 2022-11-10 23:28                                     ` Alejandro Colomar
  2022-11-11 19:52                                     ` Martin Uecker
  2022-11-12 12:34                                     ` Alejandro Colomar
  2 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 23:28 UTC (permalink / raw)
  To: Joseph Myers, Martin Uecker
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 2085 bytes --]

Hi Joseph,

On 11/11/22 00:19, Joseph Myers wrote:
> On Thu, 10 Nov 2022, Martin Uecker via Gcc wrote:
> 
>> One problem with WG14 papers is that people put in too much,
>> because the overhead is so high and the standard is not updated
>> very often.  It would be better to build such a feature more
>> incrementally, which could be done more easily with a compiler
>> extension.  One could start supporting just [.x] but not more
>> complicated expressions.
> 
> Even a compiler extension requires the level of detail of specification
> that you get with a WG14 paper (and the level of work on finding bugs in
> that specification), to avoid the problem we've had before with too many
> features added in GCC 2.x days where a poorly defined feature is "whatever
> the compiler accepts".
> 
> If you use .x as the notation but don't limit it to [.x], you have a
> completely new ambiguity between ordinary identifiers and member names
> 
> struct s { int a; };
> void f(int a, int b[((struct s) { .a = 1 }).a]);
> 
> where it's newly ambiguous whether ".a = 1" is an assignment to the
> expression ".a" or a use of a designated initializer.
> 
> (I think that if you add any syntax for this, GNU VLA forward declarations
> are clearly to be preferred to inventing something new like [.x] which
> introduces its own problems.)

Yeah, I think limiting it to [.n] initially, and only moving forward, step by 
step, if it's perfectly clear that it's doable, seems very reasonable.

Re: GNU VLA fwd decl:

This example is what I'm worried about:

         int foo(int a; int b[a], int a);
         int foo(int a, int b[a], int o);

Okay, parameters should have more readable names...  But still, it allows for a 
high chance of wtf moments.  However, I can think of a syntax very similar to 
GNU's, that would make it a bit better in terms of readability: not declaring 
the type in the fwd decl:


         int foo(a; int b[a], int a);
         int foo(int a, int b[a], int o);

Cheers,

Alex


-- 
<http://www.alejandro-colomar.es/>


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 22:55                                     ` G. Branden Robinson
@ 2022-11-10 23:55                                       ` Alejandro Colomar
  2022-11-11  4:44                                         ` G. Branden Robinson
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-10 23:55 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: groff, Ingo Schwarze, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 7194 bytes --]

Hi Branden,

On 11/10/22 23:55, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2022-11-10T19:04:46+0100, Alejandro Colomar wrote:
>> Of course I forgot to rename the title, and to add groff@.  Nice.
> 
> It gave me time to reply to this one.  :)

:)

[...]

>> The big issue is that your MR doesn't support leading text:
>>
>>          .MR page‐title manual‐section [trailing‐text]
>>
>> I remember we had this discussion about what to do with it.  A 4th
>> argument?  There's also conflict with a hypothetical link that we
>> might want to add later.
>>
>> My opinion is that the 4th argument should be the leading text.
>> Asking to use the escape sequence (was it \c?) to work around that
>> limitation is not very nice, especially for scripting the change.
> 
> Here's what I did for groff.
> 
> commit 2ab0dacb95863a2e347d06cf970676c74c784ce2
> Author: G. Branden Robinson <g.branden.robinson@gmail.com>
> Date:   Fri Oct 8 00:46:41 2021 +1100
> 
>      [man pages]: Migrate man(7) cross refs to `MR`.
> 
>       # Handle simplest case: ".IR foo (1)".
>      s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\(@MAN[157]EXT@\))$/.MR \2 \3/
>      s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))$/.MR \2 \3/
>       # Handle case: trailing puncutation, e.g., ".IR foo (1),".
>      s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\(@MAN[157]EXT@\))\([^[:space:]]\+\)/.MR \2 \3 \4/
>      s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))\([^[:space:]]\+\)/.MR \2 \3 \4/
>       # Handle case: 3rd+ arguments or trailing comments.  This case is rare
>       # and will require manual fixup if there are 4+ arguments to MR.  Use
>       # groff -man -rCHECKSTYLE=1 to have them automatically reported.
>      s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\(@MAN[157]EXT@\))\( .*\)/.MR \2 \3\4/
>      s/^.[BI]R \(\\%\)*\([@_[:alnum:]\\-]\+\) (\([1-8a-z]\+\))\( .*\)/.MR \2 \3\4/
> 
> You can ignore the 'MAN[157]EXT' lines; they are relevant only to
> within-groff pages (because all of our man pages undergo sed-processing
> to be prepared for installation).

Hmm, will need to parse that.  Anyway, I think now that I have the MR with 4 
arguments, moving the 4th to the previous line with sed and N should not be that 
difficult.
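
A possible sketch of that second pass, under some assumptions: it takes the
hypothetical 4-argument form (.MR title section trailing leading) and emits the
leading text on its own line ending in \c, followed by a plain 3-argument MR.
It does not merge the leading text into the preceding input line, which is
what the N approach would do, and the function name leading_to_c is made up.
Trailing or leading text that itself contains spaces is not handled.

```shell
#!/bin/sh
# Turn the hypothetical 4-argument MR call into "\c" plus a standard MR call.
# Rule 1: empty trailing text ("") and some leading text.
# Rule 2: both trailing and leading text, neither containing spaces.
# A backslash followed by a newline in the replacement inserts a newline.
leading_to_c()
{
	sed \
	    -e 's/^\.MR \([^ ]*\) \([^ ]*\) "" \(.*\)$/\3\\c\
.MR \1 \2/' \
	    -e 's/^\.MR \([^ ]*\) \([^ ]*\) \([^ ]*\) \([^ ]*\)$/\4\\c\
.MR \1 \2 \3/'
}

# Sample lines: leading text only, both leading and trailing, and a plain
# 3-argument call that must pass through unchanged.
printf '%s\n' \
    '.MR exec 3 "" pre-' \
    '.MR exec 3 ). (' \
    '.MR fork 3 ,' \
| leading_to_c
```

Because rule 1 leaves the pattern space starting with the leading text rather
than with .MR, rule 2 cannot fire on a line that rule 1 has already rewritten.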

> 
>> If you want a 5th argument for a URI, you can specify the leading text
>> as "", which is not much of an issue.  And you keep the trailing text
>> and the leading one together.
>>
>> What are your thoughts?  What should we do?
> 
> I am reluctant to extend the interface of `MR` at this point because as
> it is it has two nice properties: it aligns with mdoc(7)'s `Xr` macro
> and with Plan 9 from User Space troff's `MR`, which did it first.

Well, being a compatible extension to the others is not that bad.  How does 
mdoc(7) solve it with Xr?

> 
> (Admittedly, P9US troff's `MR` macro doesn't supply the parentheses.  I
> don't know if they intend to change that.  I'm willing to supply a patch
> to change their implementation and their man pages to align with what I
> did in groff.  As shown above, I believe my sed-fu is in order.)
> 
> I think man page authors should learn when the `\c` escape sequence is
> appropriate and use it when warranted, and recast their sentences
> otherwise.  That is why I provided an explicit example in the
> groff_man_style(7) page.
> 
>      .MR page-title manual-section [trailing-text]
>          (since groff 1.23) Set a man page cross reference as "page-
>          title(manual-section)".  If trailing-text (typically
>          punctuation) is specified, it follows the closing parenthesis
>          without intervening space.  Hyphenation is disabled while the
>          cross reference is set.  page-title is set in the font specified
>          by the MF string.  The cross reference hyperlinks to a URI of
>          the form "man:page-title(manual-section)".
> 
>              The output driver
>              .MR grops 1
>              produces PostScript from
>              .I troff
>              output.
>              .
>              The Ghostscript program (\c
>              .MR gs 1 )
>              interprets PostScript and PDF.

One of the biggest issues with this is that it breaks what would otherwise 
represent a single entity, into two lines, so it hurts readability.  See as an 
extreme example the following change I did with my scripts (from posix_spawn(3), 
if you're curious):

@@ -129,7 +129,7 @@ .SH DESCRIPTION
  Below, the functions are described in terms of a three-step process: the
  .MR fork 3
  step, the
-.RB pre- exec ()
+.MR exec 3 "" pre-
  step (executed in the child),
  and the
  .MR exec 3


Having 'pre-' as the last part of some random line, separates it from the other 
part of the word.  The \c alternative would be:

step, the pre-
.MR exec 3
step ...


Not terrible, but I'm not in love with it.


> 
> `\c` solves problems that are complicated to solve any other way.  As
> far as I have seen, you don't ever need it in mdoc(7) pages, for
> example...but you pay a price.  You must learn which of mdoc's several
> dozen macros are "parsed" versus "callable" (and what the heck the
> package even _means_ by those words); you must learn that `Pf` and `Ns`
> exist and when to use them; you must learn that certain two-letter words
> will not behave as you expect; and if you thought using mdoc(7) meant
> you wouldn't have to type any groff escape sequences, think
> again--you'll be putting `\&` all over the place.
> 
> People can use mdoc(7) if they want to (and now that I'm learning it
> better, I will consult as I am able), but its reputation in some circles
> as a superior solution to man(7) on all fronts that should have kicked
> its predecessor into the grave long ago is due solely to irresponsible
> hype from its exponents.
> 
> If you need help automating a change to adapt some Linux man-pages
> documents to use `\c` before an `MR` call on the next line (where you
> were using `RB` before, for instance), just let me know.  I am nearly
> certain that a sed script utilizing its hold space feature can get the
> job done.  (I've used the hold space profitably before, but occasions
> for it come up seldom enough that I have to review my past solutions
> before the knowledge comes back.  Or maybe it's creeping senescence.)

I hope I can come up with something, but yes, if not, I'll call you ;)

BTW, so far I've never found a case where I had to use the hold space.  I wonder 
if I may meet a case where I need it in my life.  This week I came up with some 
script for inserting an element into a JSON array at a specified position, but N 
is all that was needed:
<http://www.alejandro-colomar.es/src/alx/nginx/unitcli.git/tree/bin/setup-unit#n969>.
I've met a few more-complex cases, but not really that much.  I always come up 
with some combination of filters that allows me to avoid the hold space. 
Sometimes, two scripts run consecutively also helps keep it simple :)

Cheers,

Alex

> 
> Regards,
> Branden

-- 
<http://www.alejandro-colomar.es/>


* Re: MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters)
  2022-11-10 23:55                                       ` Alejandro Colomar
@ 2022-11-11  4:44                                         ` G. Branden Robinson
  0 siblings, 0 replies; 85+ messages in thread
From: G. Branden Robinson @ 2022-11-11  4:44 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: groff, Ingo Schwarze, linux-man

[-- Attachment #1: Type: text/plain, Size: 2901 bytes --]

At 2022-11-11T00:55:18+0100, Alejandro Colomar wrote:
> Hmm, will need to parse that.  Anyway, I think now that I have the MR
> with 4 arguments, moving the 4th to the previous line with sed and N
> should not be that difficult.

Okay.

> Well, being a compatible extension to the others is not that bad.  How
> does mdoc(7) solve it with Xr?

I alluded to it: the `Pf` ("prefix") macro.

man(7):
.TH foo 1 2022-11-10 "groff test suite"
.SH Description
pre-\c
.MR exec 3

mdoc(7):
.Dd 2022-11-10
.Dt foo 1
.Os "groff test suite"
.Sh Description
.Pf pre- Xr exec 3

> One of the biggest issues with this is that it breaks what would
> otherwise represent a single entity, into two lines, so it hurts
> readability.  See as an extreme example the following change I did
> with my scripts (from posix_spawn(3), if you're curious):
> 
> @@ -129,7 +129,7 @@ .SH DESCRIPTION
>  Below, the functions are described in terms of a three-step process: the
>  .MR fork 3
>  step, the
> -.RB pre- exec ()
> +.MR exec 3 "" pre-
>  step (executed in the child),
>  and the
>  .MR exec 3
> 
> Having 'pre-' as the last part of some random line, separates it from the
> other part of the word.  The \c alternative would be:
> 
> step, the pre-
> .MR exec 3
> step ...
> 
> Not terrible, but I'm not in love with it.

I personally find the derangement of word ordering more disruptive to my
reading than a mid-word line break...especially after a hyphen, where
years of experience have prepared me to expect a continued word on the
next line anyway.  ;-)

I would also note that I don't think it's necessary to hyperlink every
single occurrence of a cross-referenced man page topic, especially if the
same page topic comes up repeatedly in a section (or even paragraph).
IIRC Ingo doesn't agree, and you might too.

> I hope I can come up with something, but yes, if not, I'll call you ;)

My bat-shaped phone is plugged in.

> BTW, so far I've never found a case where I had to use the hold space.
> I wonder if I may meet a case where I need it in my life.  This week I
> came up with some script for inserting an element into a JSON array at
> a specified position, but N is all that was needed:
> <http://www.alejandro-colomar.es/src/alx/nginx/unitcli.git/tree/bin/setup-unit#n969>.

Multi-line patterns solve a lot of problems.  A person knows that they
are no longer a sed(1) beginner when they use those effectively.  :D

> I've met a few more-complex cases, but not really that much.  I always
> come up with some combination of filters that allows me to avoid the
> hold space.  Sometimes, two scripts run consecutively also helps keep
> it simple :)

I've resorted to this too.  It's just that sed is such a small language
(even in its GNU dialect) that it taunts me.  Surely mastering it should
be _easy_...

Regards,
Branden


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10 23:19                                   ` Joseph Myers
  2022-11-10 23:28                                     ` Alejandro Colomar
@ 2022-11-11 19:52                                     ` Martin Uecker
  2022-11-12  1:09                                       ` Joseph Myers
  2022-11-12 12:34                                     ` Alejandro Colomar
  2 siblings, 1 reply; 85+ messages in thread
From: Martin Uecker @ 2022-11-11 19:52 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Alejandro Colomar, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Donnerstag, den 10.11.2022, 23:19 +0000 schrieb Joseph Myers:
> On Thu, 10 Nov 2022, Martin Uecker via Gcc wrote:
> 
> > One problem with WG14 papers is that people put in too much,
> > because the overhead is so high and the standard is not updated
> > very often.  It would be better to build such a feature more
> > incrementally, which could be done more easily with a compiler
> > extension.  One could start supporting just [.x] but not more
> > complicated expressions.
> 
> Even a compiler extension requires the level of detail of specification 
> that you get with a WG14 paper (and the level of work on finding bugs in 
> that specification), to avoid the problem we've had before with too many 
> features added in GCC 2.x days where a poorly defined feature is "whatever 
> the compiler accepts".

I think the effort needed to specify the feature correctly
can be minimized by making the first version of the feature
as simple as possible.  

> If you use .x as the notation but don't limit it to [.x], you have a 
> completely new ambiguity between ordinary identifiers and member names
> 
> struct s { int a; };
> void f(int a, int b[((struct s) { .a = 1 }).a]);
> 
> where it's newly ambiguous whether ".a = 1" is an assignment to the 
> expression ".a" or a use of a designated initializer.

If we only allowed [ . a ] then this example would not be allowed.

If we need more flexibility, we could incrementally extend it.

> (I think that if you add any syntax for this, GNU VLA forward declarations 
> are clearly to be preferred to inventing something new like [.x] which 
> introduces its own problems.)

I also prefer this.

I proposed forward declarations but WG14 and also people in this
discussion did not like them.  If we would actually start using
them, we could propose them again for the next revision.

Martin





* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-11 19:52                                     ` Martin Uecker
@ 2022-11-12  1:09                                       ` Joseph Myers
  2022-11-12  7:24                                         ` Martin Uecker
  0 siblings, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-12  1:09 UTC (permalink / raw)
  To: Martin Uecker
  Cc: Alejandro Colomar, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Fri, 11 Nov 2022, Martin Uecker via Gcc wrote:

> > Even a compiler extension requires the level of detail of specification 
> > that you get with a WG14 paper (and the level of work on finding bugs in 
> > that specification), to avoid the problem we've had before with too many 
> > features added in GCC 2.x days where a poorly defined feature is "whatever 
> > the compiler accepts".
> 
> I think the effort needed to specify the feature correctly
> can be minimized by making the first version of the feature
> as simple as possible.  

The version of constexpr in the current C2x working draft is more or less 
as simple as possible.  It also went through lots of revisions to get 
there.  I'm currently testing an implementation of C2x constexpr for GCC 
13, and there are still several issues with the specification I found in 
the implementation process, beyond those raised in WG14 discussions, for 
which I'll need to raise NB comments to clarify things.

I think that illustrates that you need the several iterations on the 
specification process, *and* making it as simple as possible, *and* 
getting implementation experience, *and* the implementation experience 
being with a close eye to what it implies for all the details in the 
specification rather than just getting something vaguely functional but 
not clearly specified.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12  1:09                                       ` Joseph Myers
@ 2022-11-12  7:24                                         ` Martin Uecker
  0 siblings, 0 replies; 85+ messages in thread
From: Martin Uecker @ 2022-11-12  7:24 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Alejandro Colomar, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Samstag, den 12.11.2022, 01:09 +0000 schrieb Joseph Myers:
> On Fri, 11 Nov 2022, Martin Uecker via Gcc wrote:
> 
> > > Even a compiler extension requires the level of detail of specification 
> > > that you get with a WG14 paper (and the level of work on finding bugs in 
> > > that specification), to avoid the problem we've had before with too many 
> > > features added in GCC 2.x days where a poorly defined feature is "whatever 
> > > the compiler accepts".
> > 
> > I think the effort needed to specify the feature correctly
> > can be minimized by making the first version of the feature
> > as simple as possible.  
> 
> The version of constexpr in the current C2x working draft is more or less 
> as simple as possible.  It also went through lots of revisions to get 
> there.  I'm currently testing an implementation of C2x constexpr for GCC 
> 13, and there are still several issues with the specification I found in 
> the implementation process, beyond those raised in WG14 discussions, for 
> which I'll need to raise NB comments to clarify things.

constexpr had no implementation experience in C at all, and I
have always suspected that the assumption that C++ experience
should somehow count is not really justified.

> I think that illustrates that you need the several iterations on the 
> specification process, *and* making it as simple as possible, *and* 
> getting implementation experience, *and* the implementation experience 
> being with a close eye to what it implies for all the details in the 
> specification rather than just getting something vaguely functional but 
> not clearly specified.

I agree. We should work on specification and on prototyping
new features in parallel.

Martin



^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-10 23:19                                   ` Joseph Myers
  2022-11-10 23:28                                     ` Alejandro Colomar
  2022-11-11 19:52                                     ` Martin Uecker
@ 2022-11-12 12:34                                     ` Alejandro Colomar
  2022-11-12 12:46                                       ` Alejandro Colomar
  2022-11-12 13:03                                       ` Joseph Myers
  2 siblings, 2 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-12 12:34 UTC (permalink / raw)
  To: Joseph Myers, Martin Uecker
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 2166 bytes --]

Hi Joseph,

On 11/11/22 00:19, Joseph Myers wrote:
> On Thu, 10 Nov 2022, Martin Uecker via Gcc wrote:
> 
>> One problem with WG14 papers is that people put in too much,
>> because the overhead is so high and the standard is not updated
>> very often.  It would be better to build such feature more
>> incrementally, which could be done more easily with a compiler
>> extension.  One could start supporting just [.x] but not more
>> complicated expressions.
> 
> Even a compiler extension requires the level of detail of specification
> that you get with a WG14 paper (and the level of work on finding bugs in
> that specification), to avoid the problem we've had before with too many
> features added in GCC 2.x days where a poorly defined feature is "whatever
> the compiler accepts".
> 
> If you use .x as the notation but don't limit it to [.x], you have a
> completely new ambiguity between ordinary identifiers and member names
> 
> struct s { int a; };
> void f(int a, int b[((struct s) { .a = 1 }).a]);

Is it really ambiguous?  Let's show some currently-valid code:


struct s {
	int a;
};

struct t {
	struct s s;
	int a;
};

void f(void)
{
	struct t x = {
		.a = 1,
		.s = {
			.a = ((struct s) {.a = 1}).a,
		},
	};
}


It is ambiguous to a human reader, but that's a subjective thing, and of course 
shadowing should be avoided by programmers.  However, for a compiler, scoping 
and syntax rules should be unambiguous, I think.  In your code example, I 
believe it is unambiguous that both '.a' refer to the struct member.

But maybe we're not considering more complex situations that might really be 
ambiguous to the compiler, so a first round of supporting only [.a] would be a 
good first implementation.
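
As a runnable variant of the currently-valid code above (a sketch only;
make_t and check_make_t are hypothetical names, and the inner value is
changed to 2 so the two members can be told apart), each '.a' below
designates a different object and today's compilers resolve them
without trouble:

```c
#include <assert.h>

struct s {
	int a;
};

struct t {
	struct s s;
	int a;
};

/* Builds a struct t in which every ".a" names a different member. */
struct t make_t(void)
{
	struct t x = {
		.a = 1,                                /* designates x.a */
		.s = {
			/* left ".a" designates x.s.a; the right-hand side
			 * reads member a of a compound literal */
			.a = ((struct s) {.a = 2}).a,
		},
	};

	return x;
}

/* Hypothetical self-check, for illustration only. */
void check_make_t(void)
{
	struct t x = make_t();

	assert(x.a == 1);
	assert(x.s.a == 2);
}
```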

> 
> where it's newly ambiguous whether ".a = 1" is an assignment to the
> expression ".a" or a use of a designated initializer.
> 
> (I think that if you add any syntax for this, GNU VLA forward declarations
> are clearly to be preferred to inventing something new like [.x] which
> introduces its own problems.)
> 

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 12:34                                     ` Alejandro Colomar
@ 2022-11-12 12:46                                       ` Alejandro Colomar
  2022-11-12 13:03                                       ` Joseph Myers
  1 sibling, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-12 12:46 UTC (permalink / raw)
  To: Joseph Myers, Martin Uecker
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1004 bytes --]

On 11/12/22 13:34, Alejandro Colomar wrote:
> struct s {
>      int a;
> };
> 
> struct t {
>      struct s s;
>      int a;
> };
> 
> void f(void)
> {
>      struct t x = {
>          .a = 1,
>          .s = {
>              .a = ((struct s) {.a = 1}).a,
>          },
>      };
> }

Building on that, what I understood from Martin's email is that there's also 
an idea of allowing the following:


struct s {
     int a;
     int b;
};

struct t {
     struct s s;
     int a;
     int b;
};

void f(void)
{
     struct t x = {
         .a = 1,
         .s = {
             // In the following line, .b=.a is assigning 2
             .a = ((struct s) {.a = 2, .b = .a}).b,
             // The previous line assigned 2, since the compound had 2 in .b
         },
         // In the following line, .b=.a is assigning 1
         .b = .a,
     };
}

-- 
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 12:34                                     ` Alejandro Colomar
  2022-11-12 12:46                                       ` Alejandro Colomar
@ 2022-11-12 13:03                                       ` Joseph Myers
  2022-11-12 13:40                                         ` Alejandro Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-12 13:03 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:

> > struct s { int a; };
> > void f(int a, int b[((struct s) { .a = 1 }).a]);
> 
> Is it really ambiguous?  Let's show some currently-valid code:

Well, I still don't know what the syntax addition you propose is.  Is it

postfix-expression : . identifier

(with a special rule about how the identifier is interpreted, different 
from the normal scope rules)?  If so, then ".a = 1" could either match 
assignment-expression directly (assigning to the postfix-expression ".a").  
Or it could match designation[opt] initializer, where ".a" is a 
designator.  And as I've noted many times in discussions of C2x proposals 
on the WG14 reflector, if some sequence of tokens can match the syntax in 
more than one way, there always needs to be explicit normative text to 
disambiguate the intended parse - it's not enough that one parse might 
lead later to a violation of some other constraint (not that either parse 
leads to a constraint violation in this case).

Or is the syntax

array-declarator : direct-declarator [ . assignment-expression ]

(with appropriate variants with static and type-qualifier-list and for 
array-abstract-declarator as well, and with different identifier 
interpretation rules inside the assignment-expression)?  If so, then there 
are big problems parsing [ . ( a ) + ( b ) ], where 'a' is a typedef name 
in an outer scope, because the appropriate parse would depend on whether 
'a' is shadowed by a parameter - unless of course you add appropriate 
wording like that present in some places about not being able to use this 
syntax to shadow a typedef name.
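
The typedef dependence mentioned here already exists in ordinary
expressions, which may make the concern easier to see.  A minimal
sketch in valid C (the names as_cast, as_sum and check_parse are
hypothetical): the same tokens "(a) + (b)" are a cast when 'a' is a
typedef name and an addition when a parameter shadows it.

```c
#include <assert.h>

typedef unsigned char a;   /* 'a' names a type at file scope */

/* 'a' is the typedef here, so "(a) + (b)" is a cast of the unary
 * expression "+(b)" to unsigned char. */
int as_cast(int b)
{
	return (a) + (b);
}

/* The parameter 'a' shadows the typedef, so the same tokens are a
 * binary addition. */
int as_sum(int a, int b)
{
	return (a) + (b);
}

/* Hypothetical self-check, for illustration only. */
void check_parse(void)
{
	assert(as_cast(300) == 44);     /* 300 reduced modulo 256 */
	assert(as_sum(1, 300) == 301);  /* plain addition */
}
```

With a '.'-prefixed expression the parser would have to know whether
'a' is a later parameter before it has seen that parameter.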

Or is it just

array-declarator : direct-declarator [ . identifier ]

which might avoid some of these problems at the expense of being less 
expressive?

If you're proposing a C syntax addition, you always need to be clear about 
exactly what the new cases in the syntax would be, and how you resolve 
ambiguities with any other existing part of the syntax, how you interact 
with rules on scopes, namespaces and linkage of identifiers, etc.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 13:03                                       ` Joseph Myers
@ 2022-11-12 13:40                                         ` Alejandro Colomar
  2022-11-12 13:58                                           ` Alejandro Colomar
  2022-11-12 14:54                                           ` Joseph Myers
  0 siblings, 2 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-12 13:40 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 3876 bytes --]

Hi Joseph,

On 11/12/22 14:03, Joseph Myers wrote:
> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
>>> struct s { int a; };
>>> void f(int a, int b[((struct s) { .a = 1 }).a]);
>>
>> Is it really ambiguous?  Let's show some currently-valid code:
> 
> Well, I still don't know what the syntax addition you propose is.  Is it
> 
> postfix-expression : . identifier

I'll try to explain it in standardeese, but I'm not sure if I'll get it right, 
so I'll accompany it with plain English.

Maybe Martin can help.

Since it's to be used as an rvalue, not as a lvalue, I guess a 
postfix-expression wouldn't be the right one.

> 
> (with a special rule about how the identifier is interpreted, different
> from the normal scope rules)?  If so, then ".a = 1" could either match
> assignment-expression directly (assigning to the postfix-expression ".a").

No, assigning to a function parameter from within another parameter declaration 
wouldn't make sense.  They should be readonly.  Side effects should be 
forbidden, I think.

> Or it could match designation[opt] initializer, where ".a" is a
> designator.  And as I've noted many times in discussions of C2x proposals
> on the WG14 reflector, if some sequence of tokens can match the syntax in
> more than one way, there always needs to be explicit normative text to
> disambiguate the intended parse - it's not enough that one parse might
> lead later to a violation of some other constraint (not that either parse
> leads to a constraint violation in this case).
> 
> Or is the syntax
> 
> array-declarator : direct-declarator [ . assignment-expression ]

Not good either.  The '.' should prefix the identifier, not the expression.  So, 
for example, you would have:

        void *bsearch(const void key[.size], const void base[.size * .nmemb],
                      size_t nmemb, size_t size,
                      int (*compar)(const void [.size], const void [.size]));

That's taken from the current manual page from git HEAD.  See 'base', which gets 
its size from the multiplication of 'size' and 'nmemb'.
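
For contrast, here is a minimal sketch of what ISO C already accepts
when the size parameter comes first (sum_ints and check_sum_ints are
hypothetical names, and VLA support in the compiler is assumed).  The
'.size' notation above is about getting the same documentation value
without reordering existing prototypes such as bsearch().

```c
#include <assert.h>
#include <stddef.h>

/* Valid today: n is declared before it is used in a[n].  The parameter
 * is still adjusted to "const int *", so n is documentation for readers
 * and static analyzers rather than an enforced bound. */
long sum_ints(size_t n, const int a[n])
{
	long total = 0;

	for (size_t i = 0; i < n; i++)
		total += a[i];
	return total;
}

/* Hypothetical self-check, for illustration only. */
void check_sum_ints(void)
{
	const int v[3] = {1, 2, 3};

	assert(sum_ints(3, v) == 6);
}
```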

> 
> (with appropriate variants with static and type-qualifier-list and for
> array-abstract-declarator as well, and with different identifier
> interpretation rules inside the assignment-expression)?  If so, then there
> are big problems parsing [ . ( a ) + ( b ) ], where 'a' is a typedef name
> in an outer scope, because the appropriate parse would depend on whether
> 'a' is shadowed by a parameter - unless of course you add appropriate
> wording like that present in some places about not being able to use this
> syntax to shadow a typedef name.
> 
> Or is it just
> 
> array-declarator : direct-declarator [ . identifier ]

For the initial implementation, it would be, I think.

> 
> which might avoid some of these problems at the expense of being less
> expressive?

Yes.

> 
> If you're proposing a C syntax addition, you always need to be clear about
> exactly what the new cases in the syntax would be, and how you resolve
> ambiguities with any other existing part of the syntax, how you interact
> with rules on scopes, namespaces and linkage of identifiers, etc.

Yeah, I'll try.

I think that the complete feature would allow 'designator' to be used within 
unary-expression:

unary-expression: designator

Since sizeof(foo) is a unary-expression and you can't assign to it, I'm guessing 
that similar rules could be used for '.size'.


That would have the effect of allowing both features suggested by Martin: being 
able to use designators in both structures (as demonstrated in my last email) 
and function prototypes (as in the thing we're discussing).

I hope I got it right.  I'm not very used to working with the formal grammar.

Cheers,

Alex


-- 
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 13:40                                         ` Alejandro Colomar
@ 2022-11-12 13:58                                           ` Alejandro Colomar
  2022-11-12 14:54                                           ` Joseph Myers
  1 sibling, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-12 13:58 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 4271 bytes --]



On 11/12/22 14:40, Alejandro Colomar wrote:
> Hi Joseph,
> 
> On 11/12/22 14:03, Joseph Myers wrote:
>> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
>>
>>>> struct s { int a; };
>>>> void f(int a, int b[((struct s) { .a = 1 }).a]);
>>>
>>> Is it really ambiguous?  Let's show some currently-valid code:
>>
>> Well, I still don't know what the syntax addition you propose is.  Is it
>>
>> postfix-expression : . identifier
> 
> I'll try to explain it in standardeese, but I'm not sure if I'll get it right, 
> so I'll accompany it with plain English.
> 
> Maybe Martin can help.
> 
> Since it's to be used as an rvalue, not as a lvalue, I guess a 
> postfix-expression wouldn't be the right one.
> 
>>
>> (with a special rule about how the identifier is interpreted, different
>> from the normal scope rules)?  If so, then ".a = 1" could either match
>> assignment-expression directly (assigning to the postfix-expression ".a").
> 
> No, assigning to a function parameter from within another parameter declaration 
> wouldn't make sense.  They should be readonly.  Side effects should be 
> forbidden, I think.
> 
>> Or it could match designation[opt] initializer, where ".a" is a
>> designator.  And as I've noted many times in discussions of C2x proposals
>> on the WG14 reflector, if some sequence of tokens can match the syntax in
>> more than one way, there always needs to be explicit normative text to
>> disambiguate the intended parse - it's not enough that one parse might
>> lead later to a violation of some other constraint (not that either parse
>> leads to a constraint violation in this case).
>>
>> Or is the syntax
>>
>> array-declarator : direct-declarator [ . assignment-expression ]
> 
> Not good either.  The '.' should prefix the identifier, not the expression.  So, 
> for example, you would have:
> 
>         void *bsearch(const void key[.size], const void base[.size * .nmemb],
>                       size_t nmemb, size_t size,
>                       int (*compar)(const void [.size], const void [.size]));
> 
> That's taken from the current manual page from git HEAD.  See 'base', which gets 
> its size from the multiplication of 'size' and 'nmemb'.
> 
>>
>> (with appropriate variants with static and type-qualifier-list and for
>> array-abstract-declarator as well, and with different identifier
>> interpretation rules inside the assignment-expression)?  If so, then there
>> are big problems parsing [ . ( a ) + ( b ) ], where 'a' is a typedef name
>> in an outer scope, because the appropriate parse would depend on whether
>> 'a' is shadowed by a parameter - unless of course you add appropriate
>> wording like that present in some places about not being able to use this
>> syntax to shadow a typedef name.
>>
>> Or is it just
>>
>> array-declarator : direct-declarator [ . identifier ]
> 
> For the initial implementation, it would be, I think.
> 
>>
>> which might avoid some of these problems at the expense of being less
>> expressive?
> 
> Yes.
> 
>>
>> If you're proposing a C syntax addition, you always need to be clear about
>> exactly what the new cases in the syntax would be, and how you resolve
>> ambiguities with any other existing part of the syntax, how you interact
>> with rules on scopes, namespaces and linkage of identifiers, etc.
> 
> Yeah, I'll try.
> 
> I think that the complete feature would allow 'designator' to be used within 
> unary-expression:
> 
> unary-expression: designator

I made a mistake here:  since enum designators don't make sense in this feature, 
it should only be:

unary-expression: . identifier

> 
> Since sizeof(foo) is a unary-expression and you can't assign to it, I'm guessing 
> that similar rules could be used for '.size'.
> 
> 
> That would have the effect of allowing both features suggested by Martin: being 
> able to used designators in both structures (as demonstrated in my last email) 
> and function prototypes (as in the thing we're discussing).
> 
> I hope I got it right.  I'm not used to lexical grammar so much.
> 
> Cheers,
> 
> Alex
> 
> 

-- 
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 13:40                                         ` Alejandro Colomar
  2022-11-12 13:58                                           ` Alejandro Colomar
@ 2022-11-12 14:54                                           ` Joseph Myers
  2022-11-12 15:35                                             ` Alejandro Colomar
  2022-11-12 15:56                                             ` Martin Uecker
  1 sibling, 2 replies; 85+ messages in thread
From: Joseph Myers @ 2022-11-12 14:54 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:

> Since it's to be used as an rvalue, not as a lvalue, I guess a
> postfix-expression wouldn't be the right one.

Several forms of postfix-expression are only rvalues.

> > (with a special rule about how the identifier is interpreted, different
> > from the normal scope rules)?  If so, then ".a = 1" could either match
> > assignment-expression directly (assigning to the postfix-expression ".a").
> 
> No, assigning to a function parameter from within another parameter
> declaration wouldn't make sense.  They should be readonly.  Side effects
> should be forbidden, I think.

Such assignments are already allowed.  In a function definition, the side 
effects (including in size expressions for array parameters adjusted to 
pointers) take place before entry to the function body.
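
A minimal sketch of such an assignment in valid C today (the names
size_after_entry and check_size_after_entry are hypothetical).  It uses
a pointer-to-array parameter, which is an assumption made to keep the
example on the plainly variably modified case; the statement above also
covers array parameters adjusted to pointers.

```c
#include <assert.h>
#include <stddef.h>

/* The size expression "n = 3" assigns to the parameter n.  Because
 * "int (*row)[n = 3]" is a variably modified type, the expression is
 * evaluated on entry to the function, before the body runs. */
size_t size_after_entry(size_t n, int (*row)[n = 3])
{
	(void) row;
	return n;   /* already overwritten by the size expression */
}

/* Hypothetical self-check, for illustration only. */
void check_size_after_entry(void)
{
	int row[3] = {0};

	assert(size_after_entry(10, &row) == 3);
}
```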

And, in any case, if you did have a constraint disallowing such 
assignments, it wouldn't suffice for syntactic disambiguation (see the 
previous point I made about that; I have some rough notes towards a WG14 
paper on syntactic disambiguation, but haven't converted them into a 
coherent paper).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 14:54                                           ` Joseph Myers
@ 2022-11-12 15:35                                             ` Alejandro Colomar
  2022-11-12 17:02                                               ` Joseph Myers
  2022-11-12 15:56                                             ` Martin Uecker
  1 sibling, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-12 15:35 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1768 bytes --]

Hi Joseph,

On 11/12/22 15:54, Joseph Myers wrote:
> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
>> Since it's to be used as an rvalue, not as a lvalue, I guess a
>> postfix-expression wouldn't be the right one.
> 
> Several forms of postfix-expression are only rvalues.
> 
>>> (with a special rule about how the identifier is interpreted, different
>>> from the normal scope rules)?  If so, then ".a = 1" could either match
>>> assignment-expression directly (assigning to the postfix-expression ".a").
>>
>> No, assigning to a function parameter from within another parameter
>> declaration wouldn't make sense.  They should be readonly.  Side effects
>> should be forbidden, I think.
> 
> Such assignments are already allowed.  In a function definition, the side
> effects (including in size expressions for array parameters adjusted to
> pointers) take place before entry to the function body.

Then, I'm guessing that rules need to change in a way that '. identifier' cannot 
appear as the left operand of an assignment-expression.

That is, for the following current definition of the assignment-expression (as 
of N3054):

assignment-expression:
     conditional-expression
     unary-expression assignment-operator assignment-expression

The unary-expression cannot be formed by '. identifier'.

Would that be doable and sufficient?

Cheers,

Alex

> 
> And, in any case, if you did have a constraint disallowing such
> assignments, it wouldn't suffice for syntactic disambiguation (see the
> previous point I made about that; I have some rough notes towards a WG14
> paper on syntactic disambiguation, but haven't converted them into a
> coherent paper).
> 

-- 
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 14:54                                           ` Joseph Myers
  2022-11-12 15:35                                             ` Alejandro Colomar
@ 2022-11-12 15:56                                             ` Martin Uecker
  2022-11-13 13:19                                               ` Alejandro Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Martin Uecker @ 2022-11-12 15:56 UTC (permalink / raw)
  To: Joseph Myers, Alejandro Colomar
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Samstag, den 12.11.2022, 14:54 +0000 schrieb Joseph Myers:
> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
> > Since it's to be used as an rvalue, not as a lvalue, I guess a
> > postfix-expression wouldn't be the right one.
> 
> Several forms of postfix-expression are only rvalues.
> 
> > > (with a special rule about how the identifier is interpreted, different
> > > from the normal scope rules)?  If so, then ".a = 1" could either match
> > > assignment-expression directly (assigning to the postfix-expression ".a").
> > 
> > No, assigning to a function parameter from within another parameter
> > declaration wouldn't make sense.  They should be readonly.  Side effects
> > should be forbidden, I think.
> 
> Such assignments are already allowed.  In a function definition, the side 
> effects (including in size expressions for array parameters adjusted to 
> pointers) take place before entry to the function body.
> 
> And, in any case, if you did have a constraint disallowing such 
> assignments, it wouldn't suffice for syntactic disambiguation (see the 
> previous point I made about that; I have some rough notes towards a WG14 
> paper on syntactic disambiguation, but haven't converted them into a 
> coherent paper).

My idea was to only allow

array-declarator : direct-declarator [ . identifier ]

and only for parameter (not nested inside structs declared
in parameter list) as a first step because it seems this 
would exclude all difficult cases.

But if we need to allow more complicated expressions, then
it starts getting more complicated.

One could allow more generic expressions, and specify
that the .identifier refers to a parameter in the
nearest lexically enclosing parameter list or
struct/union.

Then

void foo(struct bar { int x; char c[.x]; } a, int x);

would not be allowed (which is good because then we
could later use the syntax also inside structs). If
we apply scoping rules, the following would work:

struct bar { int y; };
void foo(char p[((struct bar){ .y = .x }).y], int x);

But not:

struct bar { int y; };
void foo(char p[((struct bar){ .y = .y }).y], int y);


But there are not only syntactical problems, because
also the type of the parameter might become relevant
and then you can get circular dependencies:

void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);

I am not sure what would be the best way to fix it. One
could specify that parameters referred to by
the .identifier syntax must be of some integer type and
that the sub-expression .identifier is always
converted to 'size_t'.

Maybe one should also add a constraint that all new
type length expressions, i.e. those using the syntax,
cannot have side effects. Or even that they follow
all the rules of integer constant expressions with
the fictitious assumption that all . identifier
sub-expressions are integer constant expressions.
The rationale being that this would facilitate
compile-time reasoning about length expressions.
 

Martin






^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 15:35                                             ` Alejandro Colomar
@ 2022-11-12 17:02                                               ` Joseph Myers
  2022-11-12 17:08                                                 ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-12 17:02 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:

> > > No, assigning to a function parameter from within another parameter
> > > declaration wouldn't make sense.  They should be readonly.  Side effects
> > > should be forbidden, I think.
> > 
> > Such assignments are already allowed.  In a function definition, the side
> > effects (including in size expressions for array parameters adjusted to
> > pointers) take place before entry to the function body.
> 
> Then, I'm guessing that rules need to change in a way that .initializer cannot
> appear as the left operand of an assignment-expression.

I think needing such a very special case rule tends to indicate that some 
alternative syntax, not needing such a rule, would be better.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 17:02                                               ` Joseph Myers
@ 2022-11-12 17:08                                                 ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-12 17:08 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1244 bytes --]



On 11/12/22 18:02, Joseph Myers wrote:
> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
>>>> No, assigning to a function parameter from within another parameter
>>>> declaration wouldn't make sense.  They should be readonly.  Side effects
>>>> should be forbidden, I think.
>>>
>>> Such assignments are already allowed.  In a function definition, the side
>>> effects (including in size expressions for array parameters adjusted to
>>> pointers) take place before entry to the function body.
>>
>> Then, I'm guessing that rules need to change in a way that .initializer cannot
>> appear as the left operand of an assignment-expression.
> 
> I think needing such a very special case rule tends to indicate that some
> alternative syntax, not needing such a rule, would be better.

Well, by not being an lvalue, it can't be assigned to.  That would be somewhat 
like sizeof(identifier), which is also a unary-expression, so it's not so much 
of a special case, is it?

void f(size_t s, int a[sizeof(1) = 1]);  // constraint violation
void g(size_t s, int a[.s = 1]);         // Also constraint violation
void h(size_t s, int a[s = 1]);          // This is fine



-- 
<http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-12 15:56                                             ` Martin Uecker
@ 2022-11-13 13:19                                               ` Alejandro Colomar
  2022-11-13 13:33                                                 ` Alejandro Colomar
  2022-11-14 17:52                                                 ` Joseph Myers
  0 siblings, 2 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 13:19 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 6674 bytes --]

Hi Martin!

On 11/12/22 16:56, Martin Uecker wrote:
> Am Samstag, den 12.11.2022, 14:54 +0000 schrieb Joseph Myers:
>> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
>>
>>> Since it's to be used as an rvalue, not as a lvalue, I guess a
>>> postfix-expression wouldn't be the right one.
>>
>> Several forms of postfix-expression are only rvalues.
>>
>>>> (with a special rule about how the identifier is interpreted, different
>>>> from the normal scope rules)?  If so, then ".a = 1" could either match
>>>> assignment-expression directly (assigning to the postfix-expression ".a").
>>>
>>> No, assigning to a function parameter from within another parameter
>>> declaration wouldn't make sense.  They should be readonly.  Side effects
>>> should be forbidden, I think.
>>
>> Such assignments are already allowed.  In a function definition, the side
>> effects (including in size expressions for array parameters adjusted to
>> pointers) take place before entry to the function body.
>>
>> And, in any case, if you did have a constraint disallowing such
>> assignments, it wouldn't suffice for syntactic disambiguation (see the
>> previous point I made about that; I have some rough notes towards a WG14
>> paper on syntactic disambiguation, but haven't converted them into a
>> coherent paper).
> 
> My idea was to only allow
> 
> array-declarator : direct-declarator [ . identifier ]
> 
> and only for parameter (not nested inside structs declared
> in parameter list) as a first step because it seems this
> would exclude all difficult cases.
> 
> But if we need to allow more complicated expressions, then
> it starts getting more complicated.

Ahh, I guess my work in documenting the man-pages prototypes got me thinking of 
those extensions to the idea.  I don't remember all the details :)

> 
> One could allow more generic expressions, and
> specify that the .identifier refers to a
> parameter in
> the nearest lexically enclosing parameter list or
> struct/union.
> 
> Then
> 
> void foo(struct bar { int x; char c[.x] } a, int x);
> 
> would not be allowed (which is good because then we
> could later use the syntax also inside structs). If
> we apply scoping rules, the following would work:
> 
> struct bar { int y; };
> void foo(char p[((struct bar){ .y = .x }).y], int x);

Makes sense.

> 
> But not:
> 
> struct bar { int y; };
> void foo(char p[((struct bar){ .y = .y }).y], int y);

Although it clearly is nonsense, I'm not sure I'd make it a constraint 
violation, but rather Undefined Behavior.  How is it different from this?

$ cat foo.c
int main(void)
{
	int i = i;
	return i;
}


$ gcc --version | head -n1
gcc (Debian 12.2.0-9) 12.2.0
$ gcc -Wall -Wextra -Werror foo.c
$

$ clang --version | head -n1
Debian clang version 14.0.6
$ clang -Wall -Wextra -Werror foo.c
foo.c:3:10: error: variable 'i' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
         int i = i;
             ~   ^
1 error generated.


BTW, I just freaked out that GCC can't catch this trivial bug.  Should I open a 
bug report?

> 
> 
> But there are not only syntactical problems, because
> also the type of the parameter might become relevant
> and then you can get circular dependencies:
> 
> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);

This seems to be a difficult stone in the road.

> 
> I am not sure what would be the best way to fix it. One
> could specify that parameters referred to by
> the .identifier syntax must be of some integer type and
> that the sub-expression .identifier is always
> converted to a 'size_t'.

That makes sense, but then overnight some quite useful thing came to my mind 
that would not be possible with this limitation:


<https://software.codidact.com/posts/285946>

char *
stpecpy(char dst[.end - .dst], char *src, char end[1])
{
	for (/* void */; dst <= end; dst++) {
		*dst = *src++;
		if (*dst == '\0')
			return dst;
	}
	/* Truncation detected */
	*end = '\0';

#if !defined(NDEBUG)
	/* Consume the rest of the input string. */
	while (*src++) {};
#endif

	return end + 1;
}


stpecpy() is a function similar to strlcat(3) that gets a pointer to the end of 
the array instead of the size of the buffer.  This allows chaining without 
having performance issues[1].

[1]: <https://en.wikichip.org/wiki/schlemiel_the_painter%27s_algorithm>
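
To make the chaining concrete, here is a rough sketch in ISO C syntax (so without
the [.end - .dst] notation).  stpecpy_iso() restates the function above minus the
debug-only loop, and join3() is a hypothetical caller invented for this note;
'end' is assumed to point to the last element of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* stpecpy() from above, restated in ISO C syntax. */
static char *
stpecpy_iso(char *dst, const char *src, char *end)
{
	for (/* void */; dst <= end; dst++) {
		*dst = *src++;
		if (*dst == '\0')
			return dst;
	}
	/* Truncation detected (now, or in a previous call). */
	*end = '\0';

	return end + 1;
}

/* Hypothetical caller: concatenate three strings; nonzero on truncation. */
static int
join3(char *buf, size_t size, const char *a, const char *b, const char *c)
{
	char  *p, *end;

	assert(size > 0);
	end = buf + size - 1;

	p = buf;
	p = stpecpy_iso(p, a, end);	/* each call resumes where the    */
	p = stpecpy_iso(p, b, end);	/* previous one stopped, so the   */
	p = stpecpy_iso(p, c, end);	/* buffer is never rescanned      */

	return p > end;
}
```

A single check after the last call is enough, since a truncated call returns
end + 1 and every later call just passes that value through.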


Maybe allowing integral types and pointers would be enough.  However, foreseeing 
that the _Lengthof() proposal (BTW, which paper was it?) will succeed, and 
combining it with this one, _Lengthof(pointer) would ideally give the length of 
the array, so allowing pointers would conflict.

My solution is to disallow sizeof() and _Lengthof() on .identifier.  That could 
be done simply by saying that variably-modified types (VMT) are incomplete types 
until immediately after the comma that follows the parameter declaration. 
Therefore it would be allowed only in the same way as it is allowed right now 
with the normal syntax (i.e., after the parameter has been seen).
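
For comparison, this is what "the normal syntax" already permits today, as far as
I understand it: sizeof may be applied to a parameter once it has been seen.  A
small sketch, with bytes_of() being a name made up for this note:

```c
#include <assert.h>
#include <stddef.h>

/*
 * 'a' is declared before 'b', so sizeof *a is usable in b's declarator.
 * Swapping the two parameters would not compile in current C.
 */
static size_t
bytes_of(size_t n, int (*a)[n], char (*b)[sizeof *a])
{
	assert(a != NULL && b != NULL);
	(void) a;

	return sizeof *b;	/* n * sizeof(int), computed at run time */
}
```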

BTW, what was the number of the latest paper for _Lengthof() and what happened 
to it?  I guess it's likely to be added to C3x, isn't it?

And another BTW:  there's some kind of consistency in (some) projects for naming 
sizes, and I have pending a review of the Linux man-pages to make it consistent 
there too.

See the following table of usual conventions:

Operator/macro:                 variable names;    Description.
------------------------------|------------------|---------------------
strlen(3):                      length, len, l;    String length.
sizeof():                       size, sz, nbytes;  Identifier size in bytes.
nitems(), nelems():             n, nelem, nitems;  Array number of elements.
sizeof_array(), array_bytes():  size, sz, nbytes;  Array size in bytes.


Naming _Lengthof() the operator that gets the number of elements in an array 
would create naming confusion, since then length can mean two different things. 
I suggest _Nitemsof().
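
A quick sketch of the three meanings in the table, to show why I'd keep "length"
for strings only.  NITEMS() and the four helper functions are hypothetical names
written for this note; they are not from any particular library.

```c
#include <stddef.h>
#include <string.h>

/* Number of elements of an array (must not be used on pointers). */
#define NITEMS(arr)  (sizeof(arr) / sizeof((arr)[0]))

static const char  greeting[16] = "hello";
static const int   primes[]     = {2, 3, 5, 7};

/* length: characters before the terminating NUL */
static size_t greeting_len(void)   { return strlen(greeting); }

/* size: bytes in the whole array, used or not */
static size_t greeting_size(void)  { return sizeof(greeting); }

/* n: number of elements */
static size_t primes_n(void)       { return NITEMS(primes); }

/* size: bytes in the whole array */
static size_t primes_size(void)    { return sizeof(primes); }
```

For the same char array, the length is 5 while the size is 16; for the int
array, n is 4 while the size is 4 * sizeof(int).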


> 
> Maybe one should also add a constraint that all new
> type length expressions, i.e. using the syntax,
> can not have side effects. Or even that they follow
> all the rules of integer constant expressions with
> the fictitious assumption that all . identifier
> sub-expressions are integer constant expressions.
> The rationale being that this would facilitate
> compile time reasoning about length expressions.
>   
> 
> Martin
> 

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 13:19                                               ` Alejandro Colomar
@ 2022-11-13 13:33                                                 ` Alejandro Colomar
  2022-11-13 14:02                                                   ` Alejandro Colomar
  2022-11-14 17:52                                                 ` Joseph Myers
  1 sibling, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 13:33 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 2080 bytes --]

Hi Martin,

On 11/13/22 14:19, Alejandro Colomar wrote:
>> But there are not only syntactical problems, because
>> also the type of the parameter might become relevant
>> and then you can get circular dependencies:
>>
>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
> 
> This seems to be a difficult stone in the road.
> 
>>
>> I am not sure what would be the best way to fix it. One
>> could specify that parameters referred to by
>> the .identifier syntax must be of some integer type and
>> that the sub-expression .identifier is always
>> converted to a 'size_t'.
> 
> That makes sense, but then overnight some quite useful thing came to my mind 
> that would not be possible with this limitation:
> 
> 
> <https://software.codidact.com/posts/285946>
> 
> char *
> stpecpy(char dst[.end - .dst], char *src, char end[1])
> {
>      for (/* void */; dst <= end; dst++) {
>          *dst = *src++;
>          if (*dst == '\0')
>              return dst;
>      }
>      /* Truncation detected */
>      *end = '\0';
> 
> #if !defined(NDEBUG)
>      /* Consume the rest of the input string. */
>      while (*src++) {};
> #endif
> 
>      return end + 1;
> }

And I forgot to say it:  Default promotions rank high (probably the highest) in 
my list of most hated features^Wbugs in C.  I wouldn't convert it to size_t, but 
rather follow normal promotion rules.

Since you can use anything between INTMAX_MIN and UINTMAX_MAX for accessing an 
array (which took me some time to understand), I'd also allow the same here. 
So, the type of the expression between [] could perfectly be signed or unsigned.

So, you could use size_t for very high indices, or e.g. ptrdiff_t if you want to 
allow negative numbers.  In the function above, since dst can be a pointer to 
one-past-the-end (it represents a previous truncation; that's why the test 
dst<=end), forcing a size_t conversion would disallow that syntax.
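
A minimal sketch of the indexing point, assuming nothing beyond ISO C: the
subscript can be of a signed or an unsigned type, and a negative value is fine as
long as the resulting element is inside the array.  elem_at() and elem_at_u() are
names made up for this note.

```c
#include <assert.h>
#include <stddef.h>

/* p[i] is *(p + i); 'i' may be negative if p points into the middle. */
static int
elem_at(const int *p, ptrdiff_t i)
{
	assert(p != NULL);

	return p[i];
}

/* The same access with an unsigned subscript. */
static int
elem_at_u(const int *p, size_t i)
{
	assert(p != NULL);

	return p[i];
}
```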

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 13:33                                                 ` Alejandro Colomar
@ 2022-11-13 14:02                                                   ` Alejandro Colomar
  2022-11-13 14:58                                                     ` Martin Uecker
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 14:02 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 2405 bytes --]



On 11/13/22 14:33, Alejandro Colomar wrote:
> Hi Martin,
> 
> On 11/13/22 14:19, Alejandro Colomar wrote:
>>> But there are not only syntactical problems, because
>>> also the type of the parameter might become relevant
>>> and then you can get circular dependencies:
>>>
>>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>
>> This seems to be a difficult stone in the road.
>>
>>>
>>> I am not sure what would be the best way to fix it. One
>>> could specify that parameters referred to by
>>> the .identifier syntax must be of some integer type and
>>> that the sub-expression .identifier is always
>>> converted to a 'size_t'.
>>
>> That makes sense, but then overnight some quite useful thing came to my mind 
>> that would not be possible with this limitation:
>>
>>
>> <https://software.codidact.com/posts/285946>
>>
>> char *
>> stpecpy(char dst[.end - .dst], char *src, char end[1])

Heh, I got an off-by-one error.  It should be dst[.end - .dst + 1], of course, 
and then the result of the whole expression would be 0, which is fine as size_t.

So, never mind.
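
Spelled out as code, the arithmetic is as follows.  This is only a sketch;
dst_nitems() is a hypothetical helper, and 'end' is assumed to point to the last
element of the buffer, as in stpecpy().

```c
#include <assert.h>
#include <stddef.h>

/* Elements available from 'dst' up to and including 'end'. */
static ptrdiff_t
dst_nitems(const char *dst, const char *end)
{
	assert(dst <= end + 1);	/* at most one past the last element */

	return end - dst + 1;
}
```

With an 8-byte buffer this gives 8 at the start, 1 when dst == end, and 0 when
dst is one past the end (a previous truncation), so it never goes negative.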

>> {
>>      for (/* void */; dst <= end; dst++) {
>>          *dst = *src++;
>>          if (*dst == '\0')
>>              return dst;
>>      }
>>      /* Truncation detected */
>>      *end = '\0';
>>
>> #if !defined(NDEBUG)
>>      /* Consume the rest of the input string. */
>>      while (*src++) {};
>> #endif
>>
>>      return end + 1;
>> }

> 
> And I forgot to say it:  Default promotions rank high (probably the highest) in 
> my list of most hated features^Wbugs in C.  I wouldn't convert it to size_t, but 
> rather follow normal promotion rules.
> 
> Since you can use anything between INTMAX_MIN and UINTMAX_MAX for accessing an 
> array (which took me some time to understand), I'd also allow the same here. So, 
> the type of the expression between [] could perfectly be signed or unsigned.
> 
> So, you could use size_t for very high indices, or e.g. ptrdiff_t if you want to 
> allow negative numbers.  In the function above, since dst can be a pointer to 
> one-past-the-end (it represents a previous truncation; that's why the test 
> dst<=end), forcing a size_t conversion would disallow that syntax.
> 
> Cheers,
> 
> Alex
> 

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 14:02                                                   ` Alejandro Colomar
@ 2022-11-13 14:58                                                     ` Martin Uecker
  2022-11-13 15:15                                                       ` Alejandro Colomar
  2022-11-28 23:18                                                       ` Alex Colomar
  0 siblings, 2 replies; 85+ messages in thread
From: Martin Uecker @ 2022-11-13 14:58 UTC (permalink / raw)
  To: Alejandro Colomar, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Sonntag, den 13.11.2022, 15:02 +0100 schrieb Alejandro Colomar:
> 
> On 11/13/22 14:33, Alejandro Colomar wrote:
> > Hi Martin,
> > 
> > On 11/13/22 14:19, Alejandro Colomar wrote:
> > > > But there are not only syntactical problems, because
> > > > also the type of the parameter might become relevant
> > > > and then you can get circular dependencies:
> > > > 
> > > > void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
> > > 
> > > This seems to be a difficult stone in the road.

But note that GNU forward declarations solve this nicely.

> > > 
> > > > I am not sure what would be the best way to fix it. One
> > > > could specify that parameters referred to by
> > > > the .identifier syntax must be of some integer type and
> > > > that the sub-expression .identifier is always
> > > > converted to a 'size_t'.
> > > 
> > > That makes sense, but then overnight some quite useful thing came to my mind 
> > > that would not be possible with this limitation:
> > > 
> > > 
> > > <https://software.codidact.com/posts/285946>
> > > 
> > > char *
> > > stpecpy(char dst[.end - .dst], char *src, char end[1])
> 
> Heh, I got an off-by-one error.  It should be dst[.end - .dst + 1], of course, 
> and then the result of the whole expression would be 0, which is fine as size_t.
> 
> So, never mind.

.end and .dst would have pointer size though.

> > > {
> > >      for (/* void */; dst <= end; dst++) {
> > >          *dst = *src++;
> > >          if (*dst == '\0')
> > >              return dst;
> > >      }
> > >      /* Truncation detected */
> > >      *end = '\0';
> > > 
> > > #if !defined(NDEBUG)
> > >      /* Consume the rest of the input string. */
> > >      while (*src++) {};
> > > #endif
> > > 
> > >      return end + 1;
> > > }
> > And I forgot to say it:  Default promotions rank high (probably the highest) in 
> > my list of most hated features^Wbugs in C. 

If you replaced them with explicit conversions that you then have
to add by hand all the time, I am pretty sure most people
would hate this more. (and it could also hide bugs)

> > I wouldn't convert it to size_t, but 
> > rather follow normal promotion rules.

The point of making it size_t is that you then
do not need to know the type of the parameter to make
sense of the expression. If the type matters, then you get
mutual dependencies as in the example above. 

> > Since you can use anything between INTMAX_MIN and UINTMAX_MAX for accessing an 
> > array (which took me some time to understand), I'd also allow the same here. So, 
> > the type of the expression between [] could perfectly be signed or unsigned.
> > 
> > So, you could use size_t for very high indices, or e.g. ptrdiff_t if you want to 
> > allow negative numbers.  In the function above, since dst can be a pointer to 
> > one-past-the-end (it represents a previous truncation; that's why the test 
> > dst<=end), forcing a size_t conversion would disallow that syntax.

Yes, this then does not work.

Martin


> > Cheers,
> > 
> > Alex
> > 
> 
> -- 
> <http://www.alejandro-colomar.es/>


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 14:58                                                     ` Martin Uecker
@ 2022-11-13 15:15                                                       ` Alejandro Colomar
  2022-11-13 15:32                                                         ` Martin Uecker
  2022-11-13 16:28                                                         ` Alejandro Colomar
  2022-11-28 23:18                                                       ` Alex Colomar
  1 sibling, 2 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 15:15 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 3465 bytes --]

Hi Martin,

On 11/13/22 15:58, Martin Uecker wrote:
> Am Sonntag, den 13.11.2022, 15:02 +0100 schrieb Alejandro Colomar:
>>
>> On 11/13/22 14:33, Alejandro Colomar wrote:
>>>
>>> On 11/13/22 14:19, Alejandro Colomar wrote:
>>>>> But there are not only syntactical problems, because
>>>>> also the type of the parameter might become relevant
>>>>> and then you can get circular dependencies:
>>>>>
>>>>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>>
>>>> This seems to be a difficult stone in the road.
> 
> But note that GNU forward declarations solve this nicely.

How would that above be solved with GNU fwd decl?  I'm guessing that it can't. 
How do you forward declare incomplete VMTs?

> 
>>>>
>>>>> I am not sure what would be the best way to fix it. One
>>>>> could specify that parameters referred to by
>>>>> the .identifier syntax must be of some integer type and
>>>>> that the sub-expression .identifier is always
>>>>> converted to a 'size_t'.
>>>>
>>>> That makes sense, but then overnight some quite useful thing came to my mind
>>>> that would not be possible with this limitation:
>>>>
>>>>
>>>> <https://software.codidact.com/posts/285946>
>>>>
>>>> char *
>>>> stpecpy(char dst[.end - .dst], char *src, char end[1])
>>
>> Heh, I got an off-by-one error.  It should be dst[.end - .dst + 1], of course,
>> and then the result of the whole expression would be 0, which is fine as size_t.
>>
>> So, never mind.
> 
> .end and .dst would have pointer size though.
> 
>>>> {
>>>>       for (/* void */; dst <= end; dst++) {
>>>>           *dst = *src++;
>>>>           if (*dst == '\0')
>>>>               return dst;
>>>>       }
>>>>       /* Truncation detected */
>>>>       *end = '\0';
>>>>
>>>> #if !defined(NDEBUG)
>>>>       /* Consume the rest of the input string. */
>>>>       while (*src++) {};
>>>> #endif
>>>>
>>>>       return end + 1;
>>>> }
>>> And I forgot to say it:  Default promotions rank high (probably the highest) in
>>> my list of most hated features^Wbugs in C.
> 
> If you replaced them with explicit conversion you then have
> to add by hand all the time, I am pretty sure most people
> would hate this more. (and it could also hide bugs)

Yeah, casts are also in my top 3 list of things to avoid (although in this case 
there's no bug); maybe a bit over default promotions :)

I didn't mean that all promotions are bad.  Just the gratuitous ones, like 
promoting everything to int before even needing it.  That makes uint16_t a 
theoretical type, because whenever you try to use it, you end up with a signed 
32-bit type; fun heh? :P  _BitInt() solves that for me.

But sure, in (1u + 1l), promotions are fine to get a common type.
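
To show what I mean by uint16_t being a "theoretical type", here is a small
sketch.  It assumes an implementation where int is wider than 16 bits; IS_INT()
and the two functions are names made up for this note.

```c
#include <stdint.h>

/* 1 if the expression has type int, 0 otherwise (C11 _Generic). */
#define IS_INT(expr)  _Generic((expr), int: 1, default: 0)

/* Both operands are uint16_t, yet the sum has type int. */
static int
sum_is_int(void)
{
	uint16_t  a = 1, b = 2;

	return IS_INT(a + b);
}

/*
 * x is promoted to int before '~' is applied, so ~x is the int -1
 * rather than the 16-bit value 0xFFFF, and the comparison is false.
 */
static int
complement_equals_ffff(void)
{
	uint16_t  x = 0;

	return ~x == 0xFFFF;
}
```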

> 
>>> I wouldn't convert it to size_t, but
>>> rather follow normal promotion rules.
> 
> The point of making it size_t is that you then
> do not need to know the type of the parameter to make
> sense of the expression. If the type matters, then you get
> mutual dependencies as in the example above.

Except if you treat incomplete types as... incomplete types (for which sizeof() 
is disallowed by the standard).  And the issue we're having is that the types 
are not yet complete at the time we're using them, aren't they?

Kind of like the initialization order fiasco, but since we're in a limited 
scope, we can detect it.

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 15:15                                                       ` Alejandro Colomar
@ 2022-11-13 15:32                                                         ` Martin Uecker
  2022-11-13 16:25                                                           ` Alejandro Colomar
  2022-11-13 16:28                                                         ` Alejandro Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Martin Uecker @ 2022-11-13 15:32 UTC (permalink / raw)
  To: Alejandro Colomar, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Sonntag, den 13.11.2022, 16:15 +0100 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On 11/13/22 15:58, Martin Uecker wrote:
> > Am Sonntag, den 13.11.2022, 15:02 +0100 schrieb Alejandro Colomar:
> > > On 11/13/22 14:33, Alejandro Colomar wrote:
> > > > On 11/13/22 14:19, Alejandro Colomar wrote:
> > > > > > But there are not only syntactical problems, because
> > > > > > also the type of the parameter might become relevant
> > > > > > and then you can get circular dependencies:
> > > > > > 
> > > > > > void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
> > > > > 
> > > > > This seems to be a difficult stone in the road.
> > 
> > But note that GNU forward declarations solve this nicely.
> 
> How would that above be solved with GNU fwd decl?  I'm guessing that it can't. 
> How do you forward declare incomplete VMTs?.

You can't express it. This was my point: it is impossible
to create circular dependencies.
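
For readers who have not used the extension, a hedged sketch of a GNU forward
parameter declaration follows.  count_nul() is a made-up function, and the
preprocessor guard is only an assumption about which compilers accept the syntax;
the fallback branch is plain ISO C.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && !defined(__clang__)
/* GNU C: "size_t n;" forward-declares the parameter, so buf[n] can use it. */
static size_t
count_nul(size_t n; const char buf[n], size_t n)
#else
/* Fallback without the extension: the length is not part of the declarator. */
static size_t
count_nul(const char buf[], size_t n)
#endif
{
	size_t  cnt = 0;

	assert(buf != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		cnt += (buf[i] == '\0');

	return cnt;
}
```

Since a forward declaration has to be a complete declaration that comes first, a
parameter can only depend on parameters declared before it, which is why a cycle
cannot be written down.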

...

> > > > > {
> > > > >       for (/* void */; dst <= end; dst++) {
> > > > >           *dst = *src++;
> > > > >           if (*dst == '\0')
> > > > >               return dst;
> > > > >       }
> > > > >       /* Truncation detected */
> > > > >       *end = '\0';
> > > > > 
> > > > > #if !defined(NDEBUG)
> > > > >       /* Consume the rest of the input string. */
> > > > >       while (*src++) {};
> > > > > #endif
> > > > > 
> > > > >       return end + 1;
> > > > > }
> > > > And I forgot to say it:  Default promotions rank high (probably the highest) in
> > > > my list of most hated features^Wbugs in C.
> > 
> > If you replaced them with explicit conversion you then have
> > to add by hand all the time, I am pretty sure most people
> > would hate this more. (and it could also hide bugs)
> 
> Yeah, casts are also in my top 3 list of things to avoid (although in this case 
> there's no bug); maybe a bit over default promotions :)
> 
> I didn't mean that all promotions are bad.  Just the gratuitous ones, like 
> promoting everything to int before even needing it.  That makes uint16_t a 
> theoretical type, because whenever you try to use it, you end up with a signed 
> 32-bit type; fun heh? :P  _BitInt() solves that for me.

uint16_t is for storing data.  My expectation is that people
will find _BitInt() difficult and error-prone to use for
small sizes.  But maybe I am wrong...

> But sure, in (1u + 1l), promotions are fine to get a common type.
> 
> > > > I wouldn't convert it to size_t, but
> > > > rather follow normal promotion rules.
> > 
> > The point of making it size_t is that you then
> > do not need to know the type of the parameter to make
> > sense of the expression. If the type matters, then you get
> > mutual dependencies as in the example above.
> 
> Except if you treat incomplete types as... incomplete types (for which sizeof() 
> is disallowed by the standard).  And the issue we're having is that the types 
> are not yet complete at the time we're using them, aren't they?

It is not an incomplete type. When parsing, if we do not yet have
a declaration, we know nothing about it (not merely its size).
If we assume we know the type (by looking ahead) we get mutual
dependencies.

Also the capability to parse, fold, and do type checking
in one go is something worth preserving in my opinion. 

Martin


> Kind of like the initialization order fiasco, but since we're in a limited 
> scope, we can detect it.





^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 15:32                                                         ` Martin Uecker
@ 2022-11-13 16:25                                                           ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 16:25 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 4129 bytes --]

Hi Martin,

On 11/13/22 16:32, Martin Uecker wrote:
> Am Sonntag, den 13.11.2022, 16:15 +0100 schrieb Alejandro Colomar:
>> Hi Martin,
>>
>> On 11/13/22 15:58, Martin Uecker wrote:
>>> Am Sonntag, den 13.11.2022, 15:02 +0100 schrieb Alejandro Colomar:
>>>> On 11/13/22 14:33, Alejandro Colomar wrote:
>>>>> On 11/13/22 14:19, Alejandro Colomar wrote:
>>>>>>> But there are not only syntactical problems, because
>>>>>>> also the type of the parameter might become relevant
>>>>>>> and then you can get circular dependencies:
>>>>>>>
>>>>>>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>>>>
>>>>>> This seems to be a difficult stone in the road.
>>>
>>> But note that GNU forward declarations solve this nicely.
>>
>> How would that above be solved with GNU fwd decl?  I'm guessing that it can't.
>> How do you forward declare incomplete VMTs?.
> 
> You can't express it. This was my point: it is impossible
> to create circular dependencies.
> 
> ...
> 
>>>>>> {
>>>>>>        for (/* void */; dst <= end; dst++) {
>>>>>>            *dst = *src++;
>>>>>>            if (*dst == '\0')
>>>>>>                return dst;
>>>>>>        }
>>>>>>        /* Truncation detected */
>>>>>>        *end = '\0';
>>>>>>
>>>>>> #if !defined(NDEBUG)
>>>>>>        /* Consume the rest of the input string. */
>>>>>>        while (*src++) {};
>>>>>> #endif
>>>>>>
>>>>>>        return end + 1;
>>>>>> }
>>>>> And I forgot to say it:  Default promotions rank high (probably the highest) in
>>>>> my list of most hated features^Wbugs in C.
>>>
>>> If you replaced them with explicit conversion you then have
>>> to add by hand all the time, I am pretty sure most people
>>> would hate this more. (and it could also hide bugs)
>>
>> Yeah, casts are also in my top 3 list of things to avoid (although in this case
>> there's no bug); maybe a bit over default promotions :)
>>
>> I didn't mean that all promotions are bad.  Just the gratuitous ones, like
>> promoting everything to int before even needing it.  That makes uint16_t a
>> theoretical type, because whenever you try to use it, you end up with a signed
>> 32-bit type; fun heh? :P  _BitInt() solves that for me.
> 
> uint16_t is for storing data.  My expectation is that people
> will find _BitInt() difficult and error-prone to use for
> small sizes.  But maybe I am wrong...

I'm a bit concerned about the suffix to create literals.  I'd have preferred a 
suffix that allowed creating a specific size (instead of the minimum one), i.e., 
1u16 or something like that.  But otherwise I think it can be better.  I don't 
remember the details of a big issue I had a year ago with uint16_t, but it 
required 3 casts in a line.  With _BitInt() I think none (or maybe one, for 
giving 1 the appropriate size) would have been needed.  But we'll see how it 
works out.


> 
>> But sure, in (1u + 1l), promotions are fine to get a common type.
>>
>>>>> I wouldn't convert it to size_t, but
>>>>> rather follow normal promotion rules.
>>>
>>> The point of making it size_t is that you then
>>> do not need to know the type of the parameter to make
>>> sense of the expression. If the type matters, then you get
>>> mutual dependencies as in the example above.
>>
>> Except if you treat incomplete types as... incomplete types (for which sizeof()
>> is disallowed by the standard).  And the issue we're having is that the types
>> are not yet complete at the time we're using them, aren't they?
> 
> It is not an incomplete type. When doing parsing and do not have
> a declaration we know nothing about it (not just not the size).
> If we assume we know the type (by looking ahead) we get mutual
> dependencies.

Then I'd do the following:  .identifier always has an incomplete type.

I'm preparing a complete description of what I think of the feature.  I'll add that.

> 
> Also the capability to parse, fold, and do type checking
> in one go is something worth preserving in my opinion.

Makes sense.

Thanks for all the help, both!

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 15:15                                                       ` Alejandro Colomar
  2022-11-13 15:32                                                         ` Martin Uecker
@ 2022-11-13 16:28                                                         ` Alejandro Colomar
  2022-11-13 16:31                                                           ` Alejandro Colomar
  2022-11-14 18:13                                                           ` Joseph Myers
  1 sibling, 2 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 16:28 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 3542 bytes --]

SYNOPSIS:

unary-operator:  . identifier


DESCRIPTION:

-  It is not an lvalue.

    -  This means sizeof() and _Lengthof() cannot be applied to it.
    -  This prevents ambiguity with a designator in an initializer-list within a 
nested braced-initializer.

-  The type of a .identifier is always an incomplete type.

    -  This prevents circular dependencies involving sizeof() or _Lengthof().

-  Shadowing rules apply.

    -  This prevents ambiguity.


EXAMPLES:


-  Valid examples (libc):

        int
        strncmp(const char s1[.n],
                const char s2[.n],
                size_t n);

        int
        cacheflush(void addr[.nbytes],
                   int nbytes,
                   int cache);

        long
        mbind(void addr[.len],
              unsigned long len,
              int mode,
              const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
                                           / ULONG_WIDTH],
              unsigned long maxnode, unsigned int flags);

        void *
        bsearch(const void key[.size],
                const void base[.size * .nmemb],
                size_t nmemb,
                size_t size,
                int (*compar)(const void [.size], const void [.size]));

-  Valid examples (my own):

        void
        ustr2str(char dst[restrict .len + 1],
                 const char src[restrict .len],
                 size_t len);

        char *
        stpecpy(char dst[.end - .dst + 1],
                char *restrict src,
                char end[1]);

-  Valid examples (from this thread):

    -
        struct s { int a; };
        void f(int a, int b[((struct s) { .a = 1 }).a]);

        Explanation:
        -  Because of shadowing rules, .a=1 refers to the struct member.
           -  Also, if .a referred to the parameter, it would be an rvalue, so
              it wouldn't be valid to assign to it.
        -  (...).a refers to the struct member too, since otherwise an rvalue is
           not expected there.

    -
        void foo(struct bar { int x; char c[.x] } a, int x);

        Explanation:
        -  Because of shadowing rules, [.x] refers to the struct member.

    -
        struct bar { int y; };
        void foo(char p[((struct bar){ .y = .x }).y], int x);

        Explanation:
        -  .x unambiguously refers to the parameter.

-  Undefined behavior:

    -
        struct bar { int y; };
        void foo(char p[((struct bar){ .y = .y }).y], int y);

        Explanation:
        -  Because of shadowing rules, =.y refers to the struct member.
        -  .y=.y means initialize the member with itself (uninitialized use).
        -  (...).y refers to the struct member, since otherwise an rvalue is not
           expected there.

-  Constraint violations:

    -
        void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);

        Explanation:
        -  sizeof(*.b): Cannot get size of incomplete type.
        -  sizeof(*.a): Cannot get size of incomplete type.

    -
        void f(size_t s, int a[sizeof(1) = 1]);

        Explanation:
        -  Cannot assign to rvalue.

    -
        void f(size_t s, int a[.s = 1]);

        Explanation:
        -  Cannot assign to rvalue.

    -
        void f(size_t s, int a[sizeof(.s)]);

        Explanation:
        -  sizeof(.s): Cannot get size of incomplete type.
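
For comparison, here is a minimal sketch of what ISO C already accepts today,
assuming the size parameter is moved in front of the arrays.  The name
strncmp_sized() is hypothetical (it is not a libc function), and the sketch
assumes a compiler with VLA support.  It only illustrates standard VLA
notation, which the proposed [.n] syntax would allow without reordering the
parameters.

```c
#include <stddef.h>

/* Hypothetical function, not part of libc: same idea as strncmp(), but with
   the size first so that standard VLA notation can refer to it. */
int
strncmp_sized(size_t n, const char s1[n], const char s2[n])
{
        for (size_t i = 0; i < n; i++) {
                unsigned char c1 = (unsigned char) s1[i];
                unsigned char c2 = (unsigned char) s2[i];

                if (c1 != c2)
                        return (c1 > c2) - (c1 < c2);
                if (c1 == '\0')
                        break;  /* both strings ended at the same place */
        }
        return 0;
}
```

A call such as strncmp_sized(3, "abc", "abd") compares at most 3 bytes, and
the prototype alone tells the reader how n relates to s1 and s2.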


Does this idea make sense to you?


Cheers,
Alex
-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 16:28                                                         ` Alejandro Colomar
@ 2022-11-13 16:31                                                           ` Alejandro Colomar
  2022-11-13 16:34                                                             ` Alejandro Colomar
  2022-11-14 18:13                                                           ` Joseph Myers
  1 sibling, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 16:31 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc




On 11/13/22 17:28, Alejandro Colomar wrote:
> SYNOPSIS:
> 
> unary-operator:  . identifier
> 
> 
> DESCRIPTION:
> 
> -  It is not an lvalue.
> 
>     -  This means sizeof() and _Lengthof() cannot be applied to them.

Sorry, the above is a thinko.

I wanted to say that, like sizeof() and _Lengthof(), you can't assign to it.

>     -  This prevents ambiguity with a designator in an initializer-list within a 
> nested braced-initializer.
> 
> -  The type of a .identifier is always an incomplete type.
> 
>     -  This prevents circular dependencies involving sizeof() or _Lengthof().
> 
> -  Shadowing rules apply.
> 
>     -  This prevents ambiguity.
> 
> 
> EXAMPLES:
> 
> 
> -  Valid examples (libc):
> 
>         int
>         strncmp(const char s1[.n],
>                 const char s2[.n],
>                 size_t n);
> 
>         int
>         cacheflush(void addr[.nbytes],
>                    int nbytes,
>                    int cache);
> 
>         long
>         mbind(void addr[.len],
>               unsigned long len,
>               int mode,
>               const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
>                                            / ULONG_WIDTH],
>               unsigned long maxnode, unsigned int flags);
> 
>         void *
>         bsearch(const void key[.size],
>                 const void base[.size * .nmemb],
>                 size_t nmemb,
>                 size_t size,
>                 int (*compar)(const void [.size], const void [.size]));
> 
> -  Valid examples (my own):
> 
>         void
>         ustr2str(char dst[restrict .len + 1],
>                  const char src[restrict .len],
>                  size_t len);
> 
>         char *
>         stpecpy(char dst[.end - .dst + 1],
>                 char *restrict src,
>                 char end[1]);
> 
> -  Valid examples (from this thread):
> 
>     -
>         struct s { int a; };
>         void f(int a, int b[((struct s) { .a = 1 }).a]);
> 
>         Explanation:
>         -  Because of shadowing rules, .a=1 refers to the struct member.
>            -  Also, if .a referred to the parameter, it would be an rvalue, so 
> it wouldn't be valid to assign to it.
>         -  (...).a refers to the struct member too, since otherwise an rvalue is 
> not expected there.
> 
>     -
>         void foo(struct bar { int x; char c[.x] } a, int x);
> 
>         Explanation:
>         -  Because of shadowing rules, [.x] refers to the struct member.
> 
>     -
>         struct bar { int y; };
>         void foo(char p[((struct bar){ .y = .x }).y], int x);
> 
>         Explanation:
>         -  .x unambiguously refers to the parameter.
> 
> -  Undefined behavior:
> 
>     -
>         struct bar { int y; };
>         void foo(char p[((struct bar){ .y = .y }).y], int y);
> 
>         Explanation:
>         -  Because of shadowing rules, =.y refers to the struct member.
>         -  .y=.y means initialize the member with itself (uninitialized use).
>         -  (...).y refers to the struct member, since otherwise an rvalue is not 
> expected there.
> 
> -  Constraint violations:
> 
>     -
>         void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
> 
>         Explanation:
>         -  sizeof(*.b): Cannot get size of incomplete type.
>         -  sizeof(*.a): Cannot get size of incomplete type.
> 
>     -
>         void f(size_t s, int a[sizeof(1) = 1]);
> 
>         Explanation:
>         -  Cannot assign to rvalue.
> 
>     -
>         void f(size_t s, int a[.s = 1]);
> 
>         Explanation:
>         -  Cannot assign to rvalue.
> 
>     -
>         void f(size_t s, int a[sizeof(.s)]);
> 
>         Explanation:
>         -  sizeof(.s): Cannot get size of incomplete type.
> 
> 
> Does this idea make sense to you?
> 
> 
> Cheers,
> Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 16:31                                                           ` Alejandro Colomar
@ 2022-11-13 16:34                                                             ` Alejandro Colomar
  2022-11-13 16:56                                                               ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 16:34 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc




On 11/13/22 17:31, Alejandro Colomar wrote:
> 
> 
> On 11/13/22 17:28, Alejandro Colomar wrote:
>> SYNOPSIS:
>>
>> unary-operator:  . identifier
>>
>>
>> DESCRIPTION:
>>
>> -  It is not an lvalue.
>>
>>     -  This means sizeof() and _Lengthof() cannot be applied to them.
> 
> Sorry, the above is a thinko.
> 
> I wanted to say that, like sizeof() and _Lengthof(), you can't assign to it.
> 
>>     -  This prevents ambiguity with a designator in an initializer-list within 
>> a nested braced-initializer.
>>
>> -  The type of a .identifier is always an incomplete type.

Or rather, more simply: explicitly prohibit applying typeof(), sizeof(), and 
_Lengthof() to it.

>>
>>     -  This prevents circular dependencies involving sizeof() or _Lengthof().
>>
>> -  Shadowing rules apply.
>>
>>     -  This prevents ambiguity.
>>
>>
>> EXAMPLES:
>>
>>
>> -  Valid examples (libc):
>>
>>         int
>>         strncmp(const char s1[.n],
>>                 const char s2[.n],
>>                 size_t n);
>>
>>         int
>>         cacheflush(void addr[.nbytes],
>>                    int nbytes,
>>                    int cache);
>>
>>         long
>>         mbind(void addr[.len],
>>               unsigned long len,
>>               int mode,
>>               const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
>>                                            / ULONG_WIDTH],
>>               unsigned long maxnode, unsigned int flags);
>>
>>         void *
>>         bsearch(const void key[.size],
>>                 const void base[.size * .nmemb],
>>                 size_t nmemb,
>>                 size_t size,
>>                 int (*compar)(const void [.size], const void [.size]));
>>
>> -  Valid examples (my own):
>>
>>         void
>>         ustr2str(char dst[restrict .len + 1],
>>                  const char src[restrict .len],
>>                  size_t len);
>>
>>         char *
>>         stpecpy(char dst[.end - .dst + 1],
>>                 char *restrict src,
>>                 char end[1]);
>>
>> -  Valid examples (from this thread):
>>
>>     -
>>         struct s { int a; };
>>         void f(int a, int b[((struct s) { .a = 1 }).a]);
>>
>>         Explanation:
>>         -  Because of shadowing rules, .a=1 refers to the struct member.
>>            -  Also, if .a referred to the parameter, it would be an rvalue, so 
>> it wouldn't be valid to assign to it.
>>         -  (...).a refers to the struct member too, since otherwise an rvalue 
>> is not expected there.
>>
>>     -
>>         void foo(struct bar { int x; char c[.x] } a, int x);
>>
>>         Explanation:
>>         -  Because of shadowing rules, [.x] refers to the struct member.
>>
>>     -
>>         struct bar { int y; };
>>         void foo(char p[((struct bar){ .y = .x }).y], int x);
>>
>>         Explanation:
>>         -  .x unambiguously refers to the parameter.
>>
>> -  Undefined behavior:
>>
>>     -
>>         struct bar { int y; };
>>         void foo(char p[((struct bar){ .y = .y }).y], int y);
>>
>>         Explanation:
>>         -  Because of shadowing rules, =.y refers to the struct member.
>>         -  .y=.y means initialize the member with itself (uninitialized use).
>>         -  (...).y refers to the struct member, since otherwise an rvalue is 
>> not expected there.
>>
>> -  Constraint violations:
>>
>>     -
>>         void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>
>>         Explanation:
>>         -  sizeof(*.b): Cannot get size of incomplete type.
>>         -  sizeof(*.a): Cannot get size of incomplete type.
>>
>>     -
>>         void f(size_t s, int a[sizeof(1) = 1]);
>>
>>         Explanation:
>>         -  Cannot assign to rvalue.
>>
>>     -
>>         void f(size_t s, int a[.s = 1]);
>>
>>         Explanation:
>>         -  Cannot assign to rvalue.
>>
>>     -
>>         void f(size_t s, int a[sizeof(.s)]);
>>
>>         Explanation:
>>         -  sizeof(.s): Cannot get size of incomplete type.
>>
>>
>> Does this idea make sense to you?
>>
>>
>> Cheers,
>> Alex
> 

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 16:34                                                             ` Alejandro Colomar
@ 2022-11-13 16:56                                                               ` Alejandro Colomar
  2022-11-13 19:05                                                                 ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 16:56 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc




On 11/13/22 17:34, Alejandro Colomar wrote:
> 
> 
> On 11/13/22 17:31, Alejandro Colomar wrote:
>>
>>
>> On 11/13/22 17:28, Alejandro Colomar wrote:
>>> SYNOPSIS:
>>>
>>> unary-operator:  . identifier
>>>
>>>
>>> DESCRIPTION:
>>>
>>> -  It is not an lvalue.
>>>
>>>     -  This means sizeof() and _Lengthof() cannot be applied to them.
>>
>> Sorry, the above is a thinko.
>>
>> I wanted to say that, like sizeof() and _Lengthof(), you can't assign to it.
>>
>>>     -  This prevents ambiguity with a designator in an initializer-list 
>>> within a nested braced-initializer.
>>>
>>> -  The type of a .identifier is always an incomplete type.
> 
> Or rather, more easily prohibit explicitly using typeof(), sizeof(), and 
> _Lengthof() to it.

Hmm, this is not enough.  Pointer arithmetic is useful here, and for that the 
compiler needs to implicitly know sizeof(*.p).

How about allowing only integer types or pointers to integer types?
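
To illustrate why the pointed-to type has to be known, here is a small sketch
in today's C.  The helper name span_len() is made up for this example; it
computes the kind of bound used in dst[.end - .dst + 1], and the subtraction
only works because the compiler knows sizeof(*dst).

```c
#include <stddef.h>

/* Hypothetical helper: number of elements in the closed range [dst, end].
   The pointer subtraction is scaled by sizeof(*dst), so the pointed-to type
   must be complete where the expression is evaluated. */
ptrdiff_t
span_len(const int *dst, const int *end)
{
        return end - dst + 1;
}
```

With an array of 8 ints, span_len(buf, buf + 7) counts elements, not bytes,
which is exactly the information a bound such as [.end - .dst + 1] relies on.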

> 
>>>
>>>     -  This prevents circular dependencies involving sizeof() or _Lengthof().
>>>
>>> -  Shadowing rules apply.
>>>
>>>     -  This prevents ambiguity.
>>>
>>>
>>> EXAMPLES:
>>>
>>>
>>> -  Valid examples (libc):
>>>
>>>         int
>>>         strncmp(const char s1[.n],
>>>                 const char s2[.n],
>>>                 size_t n);
>>>
>>>         int
>>>         cacheflush(void addr[.nbytes],
>>>                    int nbytes,
>>>                    int cache);
>>>
>>>         long
>>>         mbind(void addr[.len],
>>>               unsigned long len,
>>>               int mode,
>>>               const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
>>>                                            / ULONG_WIDTH],
>>>               unsigned long maxnode, unsigned int flags);
>>>
>>>         void *
>>>         bsearch(const void key[.size],
>>>                 const void base[.size * .nmemb],
>>>                 size_t nmemb,
>>>                 size_t size,
>>>                 int (*compar)(const void [.size], const void [.size]));
>>>
>>> -  Valid examples (my own):
>>>
>>>         void
>>>         ustr2str(char dst[restrict .len + 1],
>>>                  const char src[restrict .len],
>>>                  size_t len);
>>>
>>>         char *
>>>         stpecpy(char dst[.end - .dst + 1],
>>>                 char *restrict src,
>>>                 char end[1]);
>>>
>>> -  Valid examples (from this thread):
>>>
>>>     -
>>>         struct s { int a; };
>>>         void f(int a, int b[((struct s) { .a = 1 }).a]);
>>>
>>>         Explanation:
>>>         -  Because of shadowing rules, .a=1 refers to the struct member.
>>>            -  Also, if .a referred to the parameter, it would be an rvalue, 
>>> so it wouldn't be valid to assign to it.
>>>         -  (...).a refers to the struct member too, since otherwise an rvalue 
>>> is not expected there.
>>>
>>>     -
>>>         void foo(struct bar { int x; char c[.x] } a, int x);
>>>
>>>         Explanation:
>>>         -  Because of shadowing rules, [.x] refers to the struct member.
>>>
>>>     -
>>>         struct bar { int y; };
>>>         void foo(char p[((struct bar){ .y = .x }).y], int x);
>>>
>>>         Explanation:
>>>         -  .x unambiguously refers to the parameter.
>>>
>>> -  Undefined behavior:
>>>
>>>     -
>>>         struct bar { int y; };
>>>         void foo(char p[((struct bar){ .y = .y }).y], int y);
>>>
>>>         Explanation:
>>>         -  Because of shadowing rules, =.y refers to the struct member.
>>>         -  .y=.y means initialize the member with itself (uninitialized use).
>>>         -  (...).y refers to the struct member, since otherwise an rvalue is 
>>> not expected there.
>>>
>>> -  Constraint violations:
>>>
>>>     -
>>>         void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>
>>>         Explanation:
>>>         -  sizeof(*.b): Cannot get size of incomplete type.
>>>         -  sizeof(*.a): Cannot get size of incomplete type.
>>>
>>>     -
>>>         void f(size_t s, int a[sizeof(1) = 1]);
>>>
>>>         Explanation:
>>>         -  Cannot assign to rvalue.
>>>
>>>     -
>>>         void f(size_t s, int a[.s = 1]);
>>>
>>>         Explanation:
>>>         -  Cannot assign to rvalue.
>>>
>>>     -
>>>         void f(size_t s, int a[sizeof(.s)]);
>>>
>>>         Explanation:
>>>         -  sizeof(.s): Cannot get size of incomplete type.
>>>
>>>
>>> Does this idea make sense to you?
>>>
>>>
>>> Cheers,
>>> Alex
>>
> 

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 16:56                                                               ` Alejandro Colomar
@ 2022-11-13 19:05                                                                 ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-13 19:05 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


On 11/13/22 17:56, Alejandro Colomar wrote:
>>> On 11/13/22 17:28, Alejandro Colomar wrote:
>>>> SYNOPSIS:
>>>>
>>>> unary-operator:  . identifier
>>>>
>>>>
>>>> DESCRIPTION:
>>>>
>>>> -  It is not an lvalue.
>>>>
>>>>     -  This means sizeof() and _Lengthof() cannot be applied to them.
>>>
>>> Sorry, the above is a thinko.
>>>
>>> I wanted to say that, like sizeof() and _Lengthof(), you can't assign to it.
>>>
>>>>     -  This prevents ambiguity with a designator in an initializer-list 
>>>> within a nested braced-initializer.
>>>>
>>>> -  The type of a .identifier is always an incomplete type.
>>
>> Or rather, more easily prohibit explicitly using typeof(), sizeof(), and 
>> _Lengthof() to it.
> 
> Hmm, this is not enough.  Pointer arithmetics are interesting, and for that, you 
> need to implicitly know the sizeof(*.p).
> 
> How about allowing only integral types or pointers to integral types?

I've been thinking about keeping the number of passes as low as possible, while 
allowing most useful expressions:

Maybe forcing some ordering can help:

-  The type of a .identifier is complete after the opening parenthesis of the 
function-declarator (if it refers to a parameter) or after the opening brace of 
a braced-initializer (if it refers to a struct/union member), except when the 
type is a variably-modified type, which will be complete after the closing 
parenthesis or brace, respectively.

I'm not sure I got the wording precisely right, or whether I covered all cases 
(like types that cannot be completed for other reasons, even after the closing 
')' or '}').

> 
>>
>>>>
>>>>     -  This prevents circular dependencies involving sizeof() or _Lengthof().
>>>>
>>>> -  Shadowing rules apply.
>>>>
>>>>     -  This prevents ambiguity.
>>>>
>>>>
>>>> EXAMPLES:
>>>>
>>>>
>>>> -  Valid examples (libc):
>>>>
>>>>         int
>>>>         strncmp(const char s1[.n],
>>>>                 const char s2[.n],
>>>>                 size_t n);
>>>>
>>>>         int
>>>>         cacheflush(void addr[.nbytes],
>>>>                    int nbytes,
>>>>                    int cache);
>>>>
>>>>         long
>>>>         mbind(void addr[.len],
>>>>               unsigned long len,
>>>>               int mode,
>>>>               const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1)
>>>>                                            / ULONG_WIDTH],
>>>>               unsigned long maxnode, unsigned int flags);
>>>>
>>>>         void *
>>>>         bsearch(const void key[.size],
>>>>                 const void base[.size * .nmemb],
>>>>                 size_t nmemb,
>>>>                 size_t size,
>>>>                 int (*compar)(const void [.size], const void [.size]));
>>>>
>>>> -  Valid examples (my own):
>>>>
>>>>         void
>>>>         ustr2str(char dst[restrict .len + 1],
>>>>                  const char src[restrict .len],
>>>>                  size_t len);
>>>>
>>>>         char *
>>>>         stpecpy(char dst[.end - .dst + 1],
>>>>                 char *restrict src,
>>>>                 char end[1]);
>>>>
>>>> -  Valid examples (from this thread):
>>>>
>>>>     -
>>>>         struct s { int a; };
>>>>         void f(int a, int b[((struct s) { .a = 1 }).a]);
>>>>
>>>>         Explanation:
>>>>         -  Because of shadowing rules, .a=1 refers to the struct member.
>>>>            -  Also, if .a referred to the parameter, it would be an rvalue, 
>>>> so it wouldn't be valid to assign to it.
>>>>         -  (...).a refers to the struct member too, since otherwise an 
>>>> rvalue is not expected there.
>>>>
>>>>     -
>>>>         void foo(struct bar { int x; char c[.x] } a, int x);
>>>>
>>>>         Explanation:
>>>>         -  Because of shadowing rules, [.x] refers to the struct member.
>>>>
>>>>     -
>>>>         struct bar { int y; };
>>>>         void foo(char p[((struct bar){ .y = .x }).y], int x);
>>>>
>>>>         Explanation:
>>>>         -  .x unambiguously refers to the parameter.
>>>>
>>>> -  Undefined behavior:
>>>>
>>>>     -
>>>>         struct bar { int y; };
>>>>         void foo(char p[((struct bar){ .y = .y }).y], int y);
>>>>
>>>>         Explanation:
>>>>         -  Because of shadowing rules, =.y refers to the struct member.
>>>>         -  .y=.y means initialize the member with itself (uninitialized use).
>>>>         -  (...).y refers to the struct member, since otherwise an rvalue is 
>>>> not expected there.
>>>>
>>>> -  Constraint violations:
>>>>
>>>>     -
>>>>         void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>>
>>>>         Explanation:
>>>>         -  sizeof(*.b): Cannot get size of incomplete type.
>>>>         -  sizeof(*.a): Cannot get size of incomplete type.
>>>>
>>>>     -
>>>>         void f(size_t s, int a[sizeof(1) = 1]);
>>>>
>>>>         Explanation:
>>>>         -  Cannot assign to rvalue.
>>>>
>>>>     -
>>>>         void f(size_t s, int a[.s = 1]);
>>>>
>>>>         Explanation:
>>>>         -  Cannot assign to rvalue.
>>>>
>>>>     -
>>>>         void f(size_t s, int a[sizeof(.s)]);

This should actually be valid.

>>>>
>>>>         Explanation:
>>>>         -  sizeof(.s): Cannot get size of incomplete type.
>>>>
>>>>
>>>> Does this idea make sense to you?
>>>>
>>>>
>>>> Cheers,
>>>> Alex
>>>
>>
> 

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 13:19                                               ` Alejandro Colomar
  2022-11-13 13:33                                                 ` Alejandro Colomar
@ 2022-11-14 17:52                                                 ` Joseph Myers
  2022-11-14 17:57                                                   ` Alejandro Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-14 17:52 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Sun, 13 Nov 2022, Alejandro Colomar via Gcc wrote:

> Maybe allowing integral types and pointers would be enough.  However,
> foreseeing that the _Lengthof() proposal (BTW, which paper was it?) will
> succeed, and combining it with this one, _Lengthof(pointer) would ideally give
> the length of the array, so allowing pointers would conflict.

Do you mean N2529 Romero, New pointer-proof keyword to determine array 
length?  To quote the convenor in WG14 reflector message 18575 (17 Nov 
2020) when I asked about its status, "The author asked me not to put those 
on the agenda.  He will supply updated versions later.".
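
For context, the idiom that such a keyword would replace can be sketched as
follows.  NITEMS is a hypothetical macro name, not something taken from
N2529.  It works on real arrays but silently misbehaves when given a pointer,
which is presumably the sense of "pointer-proof" in the paper's title.

```c
#include <stddef.h>

/* Hypothetical macro name; this is the usual sizeof division idiom. */
#define NITEMS(a)  (sizeof(a) / sizeof((a)[0]))

static const int primes[] = { 2, 3, 5, 7, 11 };

size_t
primes_count(void)
{
        return NITEMS(primes);  /* real array: yields 5 */
}

/* Inside a function taking 'const int *p', NITEMS(p) would still compile,
   but it would compute sizeof(int *) / sizeof(int), which is unrelated to
   the length of the array that p points into. */
```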

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-14 17:52                                                 ` Joseph Myers
@ 2022-11-14 17:57                                                   ` Alejandro Colomar
  2022-11-14 18:26                                                     ` Joseph Myers
  0 siblings, 1 reply; 85+ messages in thread
From: Alejandro Colomar @ 2022-11-14 17:57 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


Hi Joseph!

On 11/14/22 18:52, Joseph Myers wrote:
> On Sun, 13 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
>> Maybe allowing integral types and pointers would be enough.  However,
>> foreseeing that the _Lengthof() proposal (BTW, which paper was it?) will
>> succeed, and combining it with this one, _Lengthof(pointer) would ideally give
>> the length of the array, so allowing pointers would conflict.
> 
> Do you mean N2529 Romero, New pointer-proof keyword to determine array
> length?

Yes, that's it!  Thanks.

> To quote the convenor in WG14 reflector message 18575 (17 Nov
> 2020) when I asked about its status, "The author asked me not to put those
> on the agenda.  He will supply updated versions later.".

Since his email is not in the paper, would you mind forwarding him my 
suggestion to rename it, to avoid confusion with string lengths?  Or 
maybe point him to the mailing list discussion[1]?

[1]: 
<https://lore.kernel.org/linux-man/20221110222540.as3jrjdzxsnot3zm@illithid/T/#m794ad2a3173a19099625ee1dec7ea11ab754513d>

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 16:28                                                         ` Alejandro Colomar
  2022-11-13 16:31                                                           ` Alejandro Colomar
@ 2022-11-14 18:13                                                           ` Joseph Myers
  2022-11-28 22:59                                                             ` Alex Colomar
  1 sibling, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-14 18:13 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Sun, 13 Nov 2022, Alejandro Colomar via Gcc wrote:

> SYNOPSIS:
> 
> unary-operator:  . identifier

That's not what you mean.  See the standard syntax.

unary-expression:
  [other alternatives]
  unary-operator cast-expression

unary-operator: one of
  & * + - ~ !

> -  It is not an lvalue.
> 
>    -  This means sizeof() and _Lengthof() cannot be applied to them.

sizeof can be applied to non-lvalues.
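
A minimal sketch of that point, assuming a C11 compiler (for static_assert);
the function name size_of_sum() is made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* The operands below are not lvalues, yet sizeof accepts them. */
static_assert(sizeof(1 + 2) == sizeof(int), "sizeof of an arithmetic result");
static_assert(sizeof((char) 65) == 1, "sizeof of a cast result");

/* What a non-lvalue does rule out is assignment: (1 + 2) = 0 is rejected. */
size_t
size_of_sum(void)
{
        return sizeof(1 + 2);
}
```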

>    -  This prevents ambiguity with a designator in an initializer-list within
> a nested braced-initializer.

No, it doesn't.  See my previous points about syntactic disambiguation 
being a separate matter from "one parse would result in a constraint 
violation, so choose another parse that doesn't" (necessarily, because the 
constraint violation that results could in general be at an arbitrary 
distance from the point where a choice of parse has to be made).  Or see 
e.g. the disambiguation rule about enum type specifiers: there is an 
explicit rule "If an enum type specifier is present, then the longest 
possible sequence of tokens that can be interpreted as a specifier 
qualifier list is interpreted as part of the enum type specifier." that 
ensures that "enum e : long int;" interprets "long int" as the enum type 
specifier, rather than "long" as the enum type specifier and "int" as 
another type specifier in the sequence of declaration specifiers, even 
though the latter parse would result in a constraint violation later.

Also, requiring unbounded lookahead to determine what kind of construct is 
being parsed may be considered questionable for C.  (If you have an 
initializer starting .a.b.c.d.e, possibly with array element access as 
well, those could all be designators or .a might be a reference to a 
parameter of struct or union type and .b.c.d.e a sequence of references to 
members within it and disambiguation under your rule would depend on 
whether an '=' follows such an unbounded sequence.)
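
As a sketch of the existing construct involved (the struct and function names
are invented for illustration): in current C, a leading '.' inside a braced
initializer always begins a designator, however long the chain of members.

```c
struct inner { int c; };
struct mid   { struct inner b; };
struct outer { struct mid a; int z; };

/* .a.b.c is a designator chain naming nested members of struct outer.  The
   parser knows this as soon as it sees the first '.', without looking ahead
   for the '='. */
int
nested_designator(void)
{
        struct outer o = { .a.b.c = 42, .z = 1 };

        return o.a.b.c + o.z;
}
```

If .a could also be an expression naming a parameter, the same token sequence
would stay ambiguous until the token after the whole chain had been seen.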

> -  The type of a .identifier is always an incomplete type.
> 
>    -  This prevents circular dependencies involving sizeof() or _Lengthof().

We have typeof as well, which can be applied to expressions with 
incomplete type.

> -  Shadowing rules apply.
> 
>    -  This prevents ambiguity.

"Shadowing rules apply" isn't much of a specification.  You need detailed 
wording that would be added to 6.2.1 Scopes of identifiers (or equivalent 
elsewhere) to make it clear exactly what scopes apply for identifiers 
looked up using this construct.

>    -
>        void foo(struct bar { int x; char c[.x] } a, int x);
> 
>        Explanation:
>        -  Because of shadowing rules, [.x] refers to the struct member.

I really don't think standardizing VLAs-in-structures would be a good 
idea.  Certainly it would be a massive pain to specify meaningful 
semantics for them and this outline doesn't even attempt to work through 
the consequences of removing the rule that "If an identifier is declared 
as having a variably modified type, it shall be an ordinary identifier (as 
defined in 6.2.3), have no linkage, and have either block scope or 
function prototype scope.".
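
For reference, a small sketch of the two places where that rule does permit
variably modified types today, assuming a compiler with VLA support.  The
function name vla_bytes() is hypothetical.

```c
#include <stddef.h>

/* Function prototype scope: a variably modified parameter type is allowed. */
size_t vla_bytes(size_t n, const int v[n]);

size_t
vla_bytes(size_t n, const int v[n])
{
        (void) v;       /* only the declared type matters in this sketch */

        int tmp[n];     /* block scope, no linkage: also allowed (n > 0) */

        return sizeof(tmp);     /* evaluated at run time for a VLA */
}

/* A struct member, by contrast, is not an ordinary identifier, so under the
   quoted rule it cannot be declared with a variably modified type. */
```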

The idea that .x as an expression might refer to either a member or a 
parameter is also a massive change to the namespace rules, where at 
present those are in completely different namespaces and so in any given 
context a name only needs looking up as one or the other.
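
A sketch of the current situation, with invented names: a member and a
parameter may share a spelling precisely because each use is looked up in
only one name space.

```c
struct point { int x; };

/* 'x' the member and 'x' the parameter never compete: the token after '.'
   is looked up among the members of struct point, and the bare identifier
   is looked up among ordinary identifiers. */
int
add_x(struct point p, int x)
{
        return p.x + x;
}
```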

Again, proposals should be *minimal*.  And even when they are, many issues 
may well arise in practice (see the long list of constexpr issues in my 
commit message for that C2x feature, for example, which I expect to turn 
into multiple NB comments and at least two accompanying documents).

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-14 17:57                                                   ` Alejandro Colomar
@ 2022-11-14 18:26                                                     ` Joseph Myers
  2022-11-28 23:02                                                       ` Alex Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-14 18:26 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Mon, 14 Nov 2022, Alejandro Colomar via Gcc wrote:

> > To quote the convenor in WG14 reflector message 18575 (17 Nov
> > 2020) when I asked about its status, "The author asked me not to put those
> > on the agenda.  He will supply updated versions later.".
> 
> Since his email is not in the paper, would you mind forwarding him this
> suggestion of mine of renaming it to avoid confusion with string lengths?  Or
> maybe point him to the mailing list discussion[1]?
> 
> [1]:
> <https://lore.kernel.org/linux-man/20221110222540.as3jrjdzxsnot3zm@illithid/T/#m794ad2a3173a19099625ee1dec7ea11ab754513d>

I don't have his email address (I don't see any emails from him on the 
reflector since I joined it in 2001).

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-14 18:13                                                           ` Joseph Myers
@ 2022-11-28 22:59                                                             ` Alex Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alex Colomar @ 2022-11-28 22:59 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 5335 bytes --]

Hi Joseph,

On 11/14/22 19:13, Joseph Myers wrote:
> On Sun, 13 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
>> SYNOPSIS:
>>
>> unary-operator:  . identifier
> 
> That's not what you mean.  See the standard syntax.

Yup; typo there.

> 
> unary-expression:
>    [other alternatives]
>    unary-operator cast-expression
> 
> unary-operator: one of
>    & * + - ~ !
> 
>> -  It is not an lvalue.
>>
>>     -  This means sizeof() and _Lengthof() cannot be applied to them.
> 
> sizeof can be applied to non-lvalues.

thinko there.  I fixed it in a subsequent email.

> 
>>     -  This prevents ambiguity with a designator in an initializer-list within
>> a nested braced-initializer.
> 
> No, it doesn't.  See my previous points about syntactic disambiguation
> being a separate matter from "one parse would result in a constraint
> violation, so choose another parse that doesn't" (necessarily, because the
> constraint violation that results could in general be at an arbitrary
> distance from the point where a choice of parse has to be made).  Or see
> e.g. the disambiguation rule about enum type specifiers: there is an
> explicit rule "If an enum type specifier is present, then the longest
> possible sequence of tokens that can be interpreted as a specifier
> qualifier list is interpreted as part of the enum type specifier." that
> ensures that "enum e : long int;" interprets "long int" as the enum type
> specifier, rather than "long" as the enum type specifier and "int" as
> another type specifier in the sequence of declaration specifiers, even
> though the latter parse would result in a constraint violation later.

I get it.  It's only unambiguous if there's lookahead.

> 
> Also, requiring unbounded lookahead to determine what kind of construct is
> being parsed may be considered questionable for C.  (If you have an
> initializer starting .a.b.c.d.e, possibly with array element access as
> well, those could all be designators or .a might be a reference to a
> parameter of struct or union type and .b.c.d.e a sequence of references to
> members within it and disambiguation under your rule would depend on
> whether an '=' follows such an unbounded sequence.)

I'm thinking of an idea for this.

> 
>> -  The type of a .identifier is always an incomplete type.
>>
>>     -  This prevents circular dependencies involving sizeof() or _Lengthof().
> 
> We have typeof as well, which can be applied to expressions with
> incomplete type.

Yes, but it would not be problematic in the two-pass parsing I have in mind.

> 
>> -  Shadowing rules apply.
>>
>>     -  This prevents ambiguity.
> 
> "Shadowing rules apply" isn't much of a specification.  You need detailed
> wording that would be added to 6.2.1 Scopes of identifiers (or equivalent
> elsewhere) to make it clear exactly what scopes apply for identifiers
> looked up using this construct.

Yeah, I guess.  I'm being easy for this draft.  I'll try to be more 
precise for future revisions.

> 
>>     -
>>         void foo(struct bar { int x; char c[.x] } a, int x);
>>
>>         Explanation:
>>         -  Because of shadowing rules, [.x] refers to the struct member.
> 
> I really don't think standardizing VLAs-in-structures would be a good
> idea.  Certainly it would be a massive pain to specify meaningful
> semantics for them and this outline doesn't even attempt to work through
> the consequences of removing the rule that "If an identifier is declared
> as having a variably modified type, it shall be an ordinary identifier (as
> defined in 6.2.3), have no linkage, and have either block scope or
> function prototype scope.".

Maybe.  I didn't have them in mind until Martin mentioned them.  Now 
that he mentioned them, I'd like at least to be careful so that any new 
syntax doesn't do something that impedes adding them in the future, if 
it is ever considered desirable.

> 
> The idea that .x as an expression might refer to either a member or a
> parameter is also a massive change to the namespace rules, where at
> present those are in completely different namespaces and so in any given
> context a name only needs looking up as one or the other.
> 
> Again, proposals should be *minimal*.

Yes.  I only want to have a rough discussion about what the entire 
feature would look like in an ideal future where everything is added. 
Otherwise, adding a minimal feature without considering this future 
might do something that prevents some part of it from being implemented 
due to backwards compatibility.

So I'd like to discuss the whole idea before then going to a minimal 
proposal that will be *much* smaller than this idea that I'm discussing.

I'm happy with the Linux man-pages implementing the whole idea (even if 
it's impossible to implement it in C ever), and letting ISO C / GCC 
implement initially (and possibly ever) only the minimal stuff.


>  And even when they are, many issues
> may well arise in practice (see the long list of constexpr issues in my
> commit message for that C2x feature, for example, which I expect to turn
> into multiple NB comments and at least two accompanying documents).

Sure; I expect that.


Cheers,

Alex

> 

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-14 18:26                                                     ` Joseph Myers
@ 2022-11-28 23:02                                                       ` Alex Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alex Colomar @ 2022-11-28 23:02 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 986 bytes --]

Hi Joseph,

On 11/14/22 19:26, Joseph Myers wrote:
> On Mon, 14 Nov 2022, Alejandro Colomar via Gcc wrote:
> 
>>> To quote the convenor in WG14 reflector message 18575 (17 Nov
>>> 2020) when I asked about its status, "The author asked me not to put those
>>> on the agenda.  He will supply updated versions later.".
>>
>> Since his email is not in the paper, would you mind forwarding him this
>> suggestion of mine of renaming it to avoid confusion with string lengths?  Or
>> maybe point him to the mailing list discussion[1]?
>>
>> [1]:
>> <https://lore.kernel.org/linux-man/20221110222540.as3jrjdzxsnot3zm@illithid/T/#m794ad2a3173a19099625ee1dec7ea11ab754513d>
> 
> I don't have his email address (I don't see any emails from him on the
> reflector since I joined it in 2001).

Meh; thanks.  Would you mind mentioning this issue to whoever defends 
his document, whenever you talk about it?

Thanks,

Alex

> 

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-13 14:58                                                     ` Martin Uecker
  2022-11-13 15:15                                                       ` Alejandro Colomar
@ 2022-11-28 23:18                                                       ` Alex Colomar
  2022-11-29  0:05                                                         ` Joseph Myers
  2022-11-29 14:58                                                         ` Michael Matz
  1 sibling, 2 replies; 85+ messages in thread
From: Alex Colomar @ 2022-11-28 23:18 UTC (permalink / raw)
  To: Martin Uecker, Joseph Myers
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 4976 bytes --]

Hi Martin,

On 11/13/22 15:58, Martin Uecker wrote:
> Am Sonntag, den 13.11.2022, 15:02 +0100 schrieb Alejandro Colomar:
>>
>> On 11/13/22 14:33, Alejandro Colomar wrote:
>>> Hi Martin,
>>>
>>> On 11/13/22 14:19, Alejandro Colomar wrote:
>>>>> But there are not only syntactical problems, because
>>>>> also the type of the parameter might become relevant
>>>>> and then you can get circular dependencies:
>>>>>
>>>>> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
>>>>
>>>> This seems to be a difficult stone in the road.
> 
> But note that GNU forward declarations solve this nicely.

Okay, so GNU declarations basically work by duplicating (some of) the 
declarations.

How about the compiler parsing the parameter list twice?  The first pass 
gets the declarations and their types, but does not resolve any 
sizeof(), _Lengthof(), or typeof() whose operand contains .identifier 
(or an expression containing it); in those cases, it leaves the type 
incomplete, to be completed in the second pass.  It's as if the 
programmer had specified the forward declarations, but it's the compiler 
that gets them automatically.

I guess asking the compiler to do two passes on the param list isn't as 
bad as asking it to do unbounded lookahead.  In this case it's bounded:  
look ahead till the end of the param list; get as much info as possible, 
and then do it again to complete.  Anything not yet clear after two 
passes is not valid.

So, for

     void foo(char (*a)[sizeof(*.b)], char (*b)[sizeof(*.a)]);

in the first pass, the compiler would read:

     char (*a)[sizeof(*.b)];  // sizeof .identifier; incomplete type; continue parsing
     char (*b)[sizeof(*.a)];  // sizeof .identifier; incomplete type; continue parsing

At the end of the first pass, the compiler only knows:

     char (*a)[];
     char (*b)[];

At the second pass, when evaluating sizeof(), since the types of the 
operands are still incomplete, it can't be evaluated, and therefore 
there's an error at the first sizeof(*.b): *.b has incomplete type.

---

Let's show a distinct case:

     void foo(char (*a)[sizeof(*.b)], char (*b)[10]);

After the first pass, the compiler would know:

     char (*a)[];
     char (*b)[10];

At the second pass, sizeof(*.b) would be evaluated undoubtedly to 
sizeof(char[10]), and the parameter list would then be fine.
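
The two passes described above can be modelled with a small toy program.  This is only a sketch under stated assumptions: struct param and resolve_two_pass() are hypothetical names, each parameter is reduced to a pointer to a char array whose bound is either a constant or sizeof(*.other), and the second pass runs left to right.  It is not how any existing compiler represents parameters.

```c
#include <stddef.h>

/* Toy model (an assumption, not a real compiler data structure). */
struct param {
    int    dep;    /* index of the parameter named in sizeof(*.x), or -1 */
    size_t bound;  /* array bound; 0 stands for "incomplete type" */
};

/* Returns 0 if every bound is known after two passes, -1 otherwise. */
static int
resolve_two_pass(struct param *p, size_t n)
{
    /* Pass 1: anything that mentions .identifier stays incomplete. */
    for (size_t i = 0; i < n; i++) {
        if (p[i].dep >= 0)
            p[i].bound = 0;
    }

    /* Pass 2: complete those types from what is known so far. */
    for (size_t i = 0; i < n; i++) {
        if (p[i].dep < 0)
            continue;
        if (p[p[i].dep].bound == 0)
            return -1;                   /* *.x has incomplete type */
        p[i].bound = p[p[i].dep].bound;  /* sizeof(char[N]) == N */
    }
    return 0;
}
```

In this model the mutually dependent pair is rejected at the first parameter, while the second example resolves the bound of a to 10.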

Does this 2-pass parsing make sense to you?  Did I miss any details?


> 
>>>>
>>>>> I am not sure what would be the best way to fix it. One
>>>>> could specify that parameters referred to by
>>>>> the .identifier syntax must be of some integer type and
>>>>> that the sub-expression .identifier is always
>>>>> converted to a 'size_t'.
>>>>
>>>> That makes sense, but then overnight some quite useful thing came to my mind
>>>> that would not be possible with this limitation:
>>>>
>>>>
>>>> <https://software.codidact.com/posts/285946>
>>>>
>>>> char *
>>>> stpecpy(char dst[.end - .dst], char *src, char end[1])
>>
>> Heh, I got an off-by-one error.  It should be dst[.end - .dst + 1], of course,
>> and then the result of the whole expression would be 0, which is fine as size_t.
>>
>> So, never mind.
> 
> .end and .dst would have pointer size though.
> 
>>>> {
>>>>       for (/* void */; dst <= end; dst++) {
>>>>           *dst = *src++;
>>>>           if (*dst == '\0')
>>>>               return dst;
>>>>       }
>>>>       /* Truncation detected */
>>>>       *end = '\0';
>>>>
>>>> #if !defined(NDEBUG)
>>>>       /* Consume the rest of the input string. */
>>>>       while (*src++) {};
>>>> #endif
>>>>
>>>>       return end + 1;
>>>> }
>>> And I forgot to say it:  Default promotions rank high (probably the highest) in
>>> my list of most hated features^Wbugs in C.
> 
> If you replaced them with explicit conversion you then have
> to add by hand all the time, I am pretty sure most people
> would hate this more. (and it could also hide bugs)
> 
>>> I wouldn't convert it to size_t, but
>>> rather follow normal promotion rules.
> 
> The point of making it size_t is that you then
> do not need to know the type of the parameter to make
> sense of the expression. If the type matters, then you get
> mutual dependencies as in the example above.
> 
>>> Since you can use anything between INTMAX_MIN and UINTMAX_MAX for accessing an
>>> array (which took me some time to understand), I'd also allow the same here. So,
>>> the type of the expression between [] could perfectly be signed or unsigned.
>>>
>>> So, you could use size_t for very high indices, or e.g. ptrdiff_t if you want to
>>> allow negative numbers.  In the function above, since dst can be a pointer to
>>> one-past-the-end (it represents a previous truncation; that's why the test
>>> dst<=end), forcing a size_t conversion would disallow that syntax.
> 
> Yes, this then does not work.

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-28 23:18                                                       ` Alex Colomar
@ 2022-11-29  0:05                                                         ` Joseph Myers
  2022-11-29 14:58                                                         ` Michael Matz
  1 sibling, 0 replies; 85+ messages in thread
From: Joseph Myers @ 2022-11-29  0:05 UTC (permalink / raw)
  To: Alex Colomar
  Cc: Martin Uecker, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Tue, 29 Nov 2022, Alex Colomar via Gcc wrote:

> I guess asking the compiler to do two passes on the param list isn't as bad as
> asking it to do unbounded lookahead.  In this case it's bounded:  look ahead
> till the end of the param list; get as much info as possible, and then do it
> again to complete.  Anything not yet clear after two passes is not valid.

Unbounded here means an unbounded number of tokens, as opposed to e.g. 
looking one token ahead after seeing an identifier in statement context to 
determine if it's a label.
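
The bounded case mentioned above can be sketched as follows.  This is a minimal illustration, not code from any real parser: classify_after_identifier() and the enum are hypothetical names, and tokens are modelled as plain strings.  The point is that a single token after the identifier settles the question, however long the rest of the statement is.

```c
#include <string.h>

enum stmt_kind { STMT_LABEL, STMT_EXPRESSION };

/*
 * Called after an identifier has been read in statement context.
 * Exactly one token of lookahead is inspected: ':' means the
 * identifier was a label, anything else starts an expression.
 */
static enum stmt_kind
classify_after_identifier(const char *next_token)
{
    if (strcmp(next_token, ":") == 0)
        return STMT_LABEL;
    return STMT_EXPRESSION;
}
```

With a leading '.', by contrast, the deciding '=' may be arbitrarily many tokens away.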

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-28 23:18                                                       ` Alex Colomar
  2022-11-29  0:05                                                         ` Joseph Myers
@ 2022-11-29 14:58                                                         ` Michael Matz
  2022-11-29 15:17                                                           ` Uecker, Martin
  2022-11-29 16:49                                                           ` Joseph Myers
  1 sibling, 2 replies; 85+ messages in thread
From: Michael Matz @ 2022-11-29 14:58 UTC (permalink / raw)
  To: Alex Colomar
  Cc: Martin Uecker, Joseph Myers, Ingo Schwarze, JeanHeyd Meneide,
	linux-man, gcc

Hey,

On Tue, 29 Nov 2022, Alex Colomar via Gcc wrote:

> How about the compiler parsing the parameter list twice?

This _is_ unbounded look-ahead.  You could avoid this by not using "." for 
your new syntax.  Use something unambiguous that can't be confused with 
other syntactic elements, e.g. with a different punctuator like '@' or the 
like.  But I'm generally doubtful of this whole feature within C itself.  
It serves a purpose in documentation, so in man-pages it seems fine enough 
(but then still could use a different punctuator to not be confusable with 
C syntax).

But within C it still can only serve a documentation purpose as no 
checking could be performed without also changes in how e.g. arrays are 
represented (they always would need to come with a size).  It seems 
doubtful to introduce completely new and ambiguous syntax with all the 
problems Joseph lists just in order to be able to write documentation when 
there's a perfectly fine method to do so: comments.


Ciao,
Michael.

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 14:58                                                         ` Michael Matz
@ 2022-11-29 15:17                                                           ` Uecker, Martin
  2022-11-29 15:44                                                             ` Michael Matz
  2022-11-29 16:49                                                           ` Joseph Myers
  1 sibling, 1 reply; 85+ messages in thread
From: Uecker, Martin @ 2022-11-29 15:17 UTC (permalink / raw)
  To: alx.manpages, matz; +Cc: gcc, linux-man, joseph, schwarze, wg14

Am Dienstag, dem 29.11.2022 um 14:58 +0000 schrieb Michael Matz:
> Hey,
> 
> On Tue, 29 Nov 2022, Alex Colomar via Gcc wrote:
> 
> > How about the compiler parsing the parameter list twice?
> 
> This _is_ unbounded look-ahead.  You could avoid this by not using "."
> for your new syntax.  Use something unambiguous that can't be confused
> with other syntactic elements, e.g. with a different punctuator like '@'
> or the like.  But I'm generally doubtful of this whole feature within C
> itself.  It serves a purpose in documentation, so in man-pages it seems
> fine enough (but then still could use a different punctuator to not be
> confusable with C syntax).
> 
> But within C it still can only serve a documentation purpose as no 
> checking could be performed without also changes in how e.g. arrays
> are 
> represented (they always would need to come with a size).  


It does not require any changes on how arrays are represented.

As part of VM-types the size becomes part of the type and this
can be used for static or dynamic analysis, e.g. you can 
- today - get a run-time bounds violation with the sanitizer:

void foo(int n, char (*buf)[n])
{
  (*buf)[n] = 1;
}


int main()
{
    char buf[10];
    foo(10, &buf);
}

https://godbolt.org/z/WWEdeYchs

I personally find this already extremely useful.
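
As a complement to the out-of-bounds example above, here is a small sketch of a well-behaved function using the same pointer-to-array parameter.  The name fill() and its third parameter are made up for illustration; the only thing taken from the discussion is that the bound n is part of the type of *buf, so sizeof(*buf) yields n at run time.

```c
#include <stddef.h>
#include <string.h>

/*
 * buf points to an array of n chars.  Because the type of *buf is
 * variably modified, sizeof(*buf) is evaluated at run time and
 * equals n, so the function needs no separate length bookkeeping.
 */
static size_t
fill(int n, char (*buf)[n], char c)
{
    memset(*buf, c, sizeof(*buf));
    return sizeof(*buf);
}
```

A caller would write something like `char a[10]; fill(10, &a, 'x');`, passing the address of the whole array rather than a pointer to its first element.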


For

void foo(int n, char buf[n]);

it semantically has no meaning according to the C standard,
but a compiler could still warn. 

It could also warn for

void foo(int n, char buf[n]);

int main()
{
    char buf[9];
    foo(buf);
}

if the passed buffer is too short. And here, GCC and Clang
already do this! (although - so far - only for static
bounds I think)

https://godbolt.org/z/afPhnxfzx

With "static" 

void foo(int n, char buf[static n]);

this would also be UB according to C.


We miss some features in GCC to make this more useful (and
I filed bugs a while ago). For example, UB sanitizer should detect
additional cases which are UB.

But in general: This feature is useful not only for documentation
but also for analysis.  You can get bounds checking in C which
works today and with additional compiler features this would
be very useful!


Martin



> It seems 
> doubtful to introduce completely new and ambiguous syntax with all
> the 
> problems Joseph lists just in order to be able to write documentation
> when 
> there's a perfectly fine method to do so: comments.


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 15:17                                                           ` Uecker, Martin
@ 2022-11-29 15:44                                                             ` Michael Matz
  2022-11-29 16:58                                                               ` Uecker, Martin
  0 siblings, 1 reply; 85+ messages in thread
From: Michael Matz @ 2022-11-29 15:44 UTC (permalink / raw)
  To: Uecker, Martin; +Cc: alx.manpages, gcc, linux-man, joseph, schwarze, wg14

[-- Attachment #1: Type: text/plain, Size: 2427 bytes --]

Hey,

On Tue, 29 Nov 2022, Uecker, Martin wrote:

> It does not require any changes on how arrays are represented.
> 
> As part of VM-types the size becomes part of the type and this
> can be used for static or dynamic analysis, e.g. you can 
> - today - get a run-time bounds violation with the sanitizer:
> 
> void foo(int n, char (*buf)[n])
> {
>   (*buf)[n] = 1;
> }

This can already be statically analyzed as being wrong, no need for dynamic 
checking.  What I mean is the checking of the claimed contract.  Above you 
assure for the function body that buf has n elements.  This is also a 
pre-condition for calling this function and _that_ can't be checked in all 
cases because:

  void foo (int n, char (*buf)[n]) { (*buf)[n-1] = 1; }
  void callfoo(char * buf) { foo(10, buf); }

buf doesn't have a known size.  And a pre-condition that can't be checked 
is no pre-condition at all, as only then it can become a guarantee for the 
body.

The compiler has no choice but to trust the user that the pre-condition 
for calling foo is fulfilled.  I can see how being able to just check half 
of the contract might be useful, but if it doesn't give full checking then 
any proposal for syntax should be even more obviously orthogonal than the 
current one.

> For
> 
> void foo(int n, char buf[n]);
> 
> it semantically has no meaning according to the C standard,
> but a compiler could still warn. 

Hmm?  Warn about what in this decl?

> It could also warn for
> 
> void foo(int n, char buf[n]);
> 
> int main()
> {
>     char buf[9];
>     foo(buf);
> }

You mean if you write 'foo(10,buf)' (the above, as is, is simply a syntax 
error for non-matching number of args).  Or was it a mispaste and you mean 
the one from the godbolt link, i.e.:

void foo(char buf[10]){ buf[9] = 1; }
int main()
{
    char buf[9];
    foo(buf);
}

?  If so, yeah, we warn already.  I don't think this is an argument for 
(or against) introducing new syntax.

...

> But in general: This feature is useful not only for documentation
> but also for analysis.

Which feature are we talking about now?  The ones you used all work today, 
as you demonstrated.  I thought we would be talking about that ".whatever" 
syntax to refer to arbitrary parameters, even following ones?  I think a 
disruptive syntax change like that should have a higher bar than "in some 
cases, depending on circumstance, we might even be able to warn".


Ciao,
Michael.

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 14:58                                                         ` Michael Matz
  2022-11-29 15:17                                                           ` Uecker, Martin
@ 2022-11-29 16:49                                                           ` Joseph Myers
  2022-11-29 16:53                                                             ` Jonathan Wakely
  1 sibling, 1 reply; 85+ messages in thread
From: Joseph Myers @ 2022-11-29 16:49 UTC (permalink / raw)
  To: Michael Matz
  Cc: Alex Colomar, Martin Uecker, Ingo Schwarze, JeanHeyd Meneide,
	linux-man, gcc

On Tue, 29 Nov 2022, Michael Matz via Gcc wrote:

> like.  But I'm generally doubtful of this whole feature within C itself.  
> It serves a purpose in documentation, so in man-pages it seems fine enough 
> (but then still could use a different punctuator to not be confusable with 
> C syntax).

In man-pages you don't need to invent syntax at all.  You can write

int f(char buf[n], int n);

and in the context of a man page it will be clear to readers what is 
meant, though such a syntax would be problematic in actual C source files 
because of issues with circular dependencies between parameters and with n 
already being declared in an outer scope.
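
The outer-scope problem can be shown with a few lines of valid C.  This is a sketch made up for illustration (the enum and pointee_size() are hypothetical, and a pointer-to-array parameter is used so that the bound is observable through sizeof): the n inside the brackets is looked up before the parameter n is declared, so it silently binds to whatever n is already visible.

```c
#include <stddef.h>

/* An unrelated n that happens to be visible at file scope. */
enum { n = 4 };

/*
 * The [n] in the first parameter is resolved before the parameter n
 * is declared, so it names the file-scope n above.  buf is a pointer
 * to char[4] no matter what the caller passes as second argument.
 */
static size_t
pointee_size(char (*buf)[n], int n)
{
    (void) n;
    return sizeof(*buf);
}
```

The declaration compiles, but the bound has nothing to do with the second argument, which is the kind of surprise the man-page reader would not expect.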

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 16:49                                                           ` Joseph Myers
@ 2022-11-29 16:53                                                             ` Jonathan Wakely
  2022-11-29 17:00                                                               ` Martin Uecker
  0 siblings, 1 reply; 85+ messages in thread
From: Jonathan Wakely @ 2022-11-29 16:53 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Michael Matz, Alex Colomar, Martin Uecker, Ingo Schwarze,
	JeanHeyd Meneide, linux-man, gcc

On Tue, 29 Nov 2022 at 16:49, Joseph Myers wrote:
>
> On Tue, 29 Nov 2022, Michael Matz via Gcc wrote:
>
> > like.  But I'm generally doubtful of this whole feature within C itself.
> > It serves a purpose in documentation, so in man-pages it seems fine enough
> > (but then still could use a different punctuator to not be confusable with
> > C syntax).
>
> In man-pages you don't need to invent syntax at all.  You can write
>
> int f(char buf[n], int n);
>
> and in the context of a man page it will be clear to readers what is
> meant,

Considerably more clear than new invented syntax IMHO.

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 15:44                                                             ` Michael Matz
@ 2022-11-29 16:58                                                               ` Uecker, Martin
  2022-11-29 17:28                                                                 ` Alex Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Uecker, Martin @ 2022-11-29 16:58 UTC (permalink / raw)
  To: matz; +Cc: gcc, alx.manpages, linux-man, joseph, schwarze, wg14


Hi,

Am Dienstag, dem 29.11.2022 um 15:44 +0000 schrieb Michael Matz:
> Hey,
> 
> On Tue, 29 Nov 2022, Uecker, Martin wrote:
> 
> > It does not require any changes on how arrays are represented.
> > 
> > As part of VM-types the size becomes part of the type and this
> > can be used for static or dynamic analysis, e.g. you can 
> > - today - get a run-time bounds violation with the sanitizer:
> > 
> > void foo(int n, char (*buf)[n])
> > {
> >   (*buf)[n] = 1;
> > }
> 
> This can already be statically analyzed as being wrong, no need for
> dynamic checking.  

In this toy example, yes, but in general it can be checked
only at run time by using the information about the
dynamic bound.

> What I mean is the checking of the claimed contract. 
> Above you assure for the function body that buf has n elements.

Yes.

> This is also a pre-condition for calling this function and
> _that_ can't be checked in all  cases because:
> 
>   void foo (int n, char (*buf)[n]) { (*buf)[n-1] = 1; }
>   void callfoo(char * buf) { foo(10, buf); }
> 
> buf doesn't have a known size. 

This does not type check.

>  And a pre-condition that can't be checked 
> is no pre-condition at all, as only then it can become a guarantee
> for the body.

The example above should look like:

void foo(int n, char (*buf)[n]);

void callfoo(char (*buf)[12]) { foo(10, buf); }

This could be checked by an UB sanitizer as calling
the function with an argument of incompatible type 
is UB (but we currently do not do this)


If you think about

void foo(int n, char buf[n]);

void callfoo(char *buf) { foo(10, buf); }


Then you are right that this can not be checked at this
time. But this  does not mean it is useless because we
still can detect inconsistencies in other cases:

void callfoo(int n, char buf[n - 1]) { foo(n, buf); }

We could also - in the future - have a warning about all
situations where bound information is lost, making sure
that preconditions are always checked for people who
consistently use these annotations.


> The compiler has no choice but to trust the user that the
> pre-condition for calling foo is fulfilled.  I can see how
> being able to just check half of the contract might be
> useful, but if it doesn't give full checking then 
> any proposal for syntax should be even more obviously
> orthogonal than the current one.

Your argument is not clear to me.


> > For
> > 
> > void foo(int n, char buf[n]);
> > 
> > it semantically has no meaning according to the C standard,
> > but a compiler could still warn. 
> 
> Hmm?  Warn about what in this decl?

I meant, we could warn about something like this
because it is likely an error:

void foo(int n, char buf[n])
{
  buf[n] = 1;
}


> > It could also warn for
> > 
> > void foo(int n, char buf[n]);
> > 
> > int main()
> > {
> >     char buf[9];
> >     foo(buf);
> > }
> 
> You mean if you write 'foo(10,buf)' (the above, as is, is simply a
> syntax error for non-matching number of args).  Or was it a mispaste
> and you mean  the one from the godbolt link, i.e.:

I meant:

char buf[9];
foo(10, buf);

In fact, it turns out we warn already:

https://godbolt.org/z/qcvsv87Ev

> void foo(char buf[10]){ buf[9] = 1; }
> int main()
> {
>     char buf[9];
>     foo(buf);
> }
> 
> ?  If so, yeah, we warn already.  I don't think this is an argument
> for (or against) introducing new syntax.
> ...

It is an argument for having this syntax, because we could 
extend such warnings (those we already have and those we
could still add) to more common cases such as

void foo(char buf[.n], size_t n);

In my opinion, this would be a huge step forward for
the safety of C programs, as we already have a lot of
infrastructure for checking bounds.

Of course, the existing GNU extension would achieve
the same thing:

void foo(size_t n; char buf[n], size_t n);
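
For readers who have not seen the GNU extension in a definition, a minimal sketch follows.  zero_fill() is a hypothetical name; the parameter forward declaration is a GCC extension, so the sketch assumes GCC without -pedantic-errors and falls back to a plain pointer for other compilers, which may reject the syntax.

```c
#include <stddef.h>

#if defined(__GNUC__) && !defined(__clang__)
/* "size_t n;" forward-declares n so that buf[n] can refer to it. */
static size_t
zero_fill(size_t n; char buf[n], size_t n)
#else
/* Fallback without the extension: the bound is not expressed. */
static size_t
zero_fill(char *buf, size_t n)
#endif
{
    for (size_t i = 0; i < n; i++)
        buf[i] = 0;
    return n;
}
```

Either way the call is `zero_fill(buf, len)`: the forward declaration does not add an argument, it only makes n visible early.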



> > But in general: This feature is useful not only for documentation
> > but also for analysis.
> 
> Which feature are we talking about now?  The ones you used all work
> today, as you demonstrated.  I thought we would be talking about that
> ".whatever" syntax to refer to arbitrary parameters, even following
> ones?  I think a disruptive syntax change like that should have a
> higher bar than "in some cases, depending on circumstance, we might
> even be able to warn".

We can use our existing features and then apply them
to cases where the bound is specified after the pointer,
which is more common in practice.


Martin


* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 16:53                                                             ` Jonathan Wakely
@ 2022-11-29 17:00                                                               ` Martin Uecker
  2022-11-29 17:19                                                                 ` Alex Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Martin Uecker @ 2022-11-29 17:00 UTC (permalink / raw)
  To: Jonathan Wakely, Joseph Myers
  Cc: Michael Matz, Alex Colomar, Ingo Schwarze, JeanHeyd Meneide,
	linux-man, gcc

Am Dienstag, dem 29.11.2022 um 16:53 +0000 schrieb Jonathan Wakely:
> On Tue, 29 Nov 2022 at 16:49, Joseph Myers wrote:
> > 
> > On Tue, 29 Nov 2022, Michael Matz via Gcc wrote:
> > 
> > > like.  But I'm generally doubtful of this whole feature within C
> > > itself.
> > > It serves a purpose in documentation, so in man-pages it seems
> > > fine enough
> > > (but then still could use a different punctuator to not be
> > > confusable with
> > > C syntax).
> > 
> > In man-pages you don't need to invent syntax at all.  You can write
> > 
> > int f(char buf[n], int n);
> > 
> > and in the context of a man page it will be clear to readers what
> > is
> > meant,
> 
> Considerably more clear than new invented syntax IMHO.

True, but I think it would be a mistake to use code in
man pages which then does not work as expected (or even
is subtly wrong) in actual code.

Martin




* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 17:00                                                               ` Martin Uecker
@ 2022-11-29 17:19                                                                 ` Alex Colomar
  2022-11-29 17:29                                                                   ` Alex Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Alex Colomar @ 2022-11-29 17:19 UTC (permalink / raw)
  To: Martin Uecker, Jonathan Wakely, Joseph Myers
  Cc: Michael Matz, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1328 bytes --]

Hi Martin, Joseph,

On 11/29/22 18:00, Martin Uecker wrote:
> Am Dienstag, dem 29.11.2022 um 16:53 +0000 schrieb Jonathan Wakely:
>> On Tue, 29 Nov 2022 at 16:49, Joseph Myers wrote:
>>>
>>> On Tue, 29 Nov 2022, Michael Matz via Gcc wrote:
>>>
>>>> like.  But I'm generally doubtful of this whole feature within C
>>>> itself.
>>>> It serves a purpose in documentation, so in man-pages it seems
>>>> fine enough
>>>> (but then still could use a different punctuator to not be
>>>> confusable with
>>>> C syntax).
>>>
>>> In man-pages you don't need to invent syntax at all.  You can write
>>>
>>> int f(char buf[n], int n);
>>>
>>> and in the context of a man page it will be clear to readers what
>>> is
>>> meant,
>>
>> Considerably more clear than new invented syntax IMHO.
> 
> True, but I think it would be a mistake to use code in
> man pages which then does not work as expected (or even
> is subtly wrong) in actual code.

Exactly.  Using your proposed syntax (which was my first draft) would 
have probably been the source of hidden bugs, since it might work (read 
compile) in some cases, but with wrong results.

I prefer this hypothetical syntax, which at most will cause compile errors.
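
A minimal sketch of the kind of hidden bug meant here, assuming an unrelated
file-scope constant that happens to be named `n` (that constant and the
names `bound_seen` and `demo_bound` are made up for illustration, they are not
from this thread).  A pointer to an array is used so that the bound is
observable through sizeof:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: an unrelated file-scope constant that happens to be named n. */
enum { n = 4 };

/*
 * The bound in 'char (*buf)[n]' is resolved where it is written, so it
 * refers to the enum constant above, not to the parameter 'n' that is
 * only declared afterwards.
 */
static size_t bound_seen(char (*buf)[n], int n)
{
    (void) n;             /* the parameter plays no part in the type of buf */
    return sizeof(*buf);  /* size of char[4], whatever the caller passes as n */
}

/* Usage: the caller passes 10, but the declared bound is still 4. */
static void demo_bound(void)
{
    char a[4];

    assert(bound_seen(&a, 10) == 4);
}
```

This compiles without any error, which is the problem: the prototype reads
as if the bound were the parameter, but it isn't.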

Cheers,

Alex

> 
> Martin
> 
> 
> 

-- 
<http://www.alejandro-colomar.es/>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 16:58                                                               ` Uecker, Martin
@ 2022-11-29 17:28                                                                 ` Alex Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alex Colomar @ 2022-11-29 17:28 UTC (permalink / raw)
  To: Uecker, Martin, matz; +Cc: gcc, linux-man, joseph, schwarze, wg14


[-- Attachment #1.1: Type: text/plain, Size: 6055 bytes --]

Hi Martin and Michael,

On 11/29/22 17:58, Uecker, Martin wrote:
> 
> Hi,
> 
> Am Dienstag, dem 29.11.2022 um 15:44 +0000 schrieb Michael Matz:
>> Hey,
>>
>> On Tue, 29 Nov 2022, Uecker, Martin wrote:
>>
>>> It does not require any changes on how arrays are represented.
>>>
>>> As part of VM-types the size becomes part of the type and this
>>> can be used for static or dynamic analysis, e.g. you can
>>> - today - get a run-time bounds violation with the sanitizer:
>>>
>>> void foo(int n, char (*buf)[n])
>>> {
>>>    (*buf)[n] = 1;
>>> }
>>
>> This can already be statically analyzed as being wrong, no need for
>> dynamic checking.
> 
> In this toy example, but in general it can be checked
> only at run-time by using the information about the
> dynamic bound.
> 
>> What I mean is the checking of the claimed contract.
>> Above you assure for the function body that buf has n elements.
> 
> Yes.
> 
>> This is also a pre-condition for calling this function and
>> _that_ can't be checked in all  cases because:
>>
>>    void foo (int n, char (*buf)[n]) { (*buf)[n-1] = 1; }
>>    void callfoo(char * buf) { foo(10, buf); }
>>
>> buf doesn't have a known size.
> 
> This does not type check.
> 
>>   And a pre-condition that can't be checked
>> is no pre-condition at all, as only then it can become a guarantee
>> for the body.
> 
> The example above should look like:
> 
> void foo(int n, char (*buf)[n]);
> 
> void callfoo(char (*buf)[12]) { foo(10, buf); }
> 
> This could be checked by an UB sanitizer as calling
> the function with an argument of incompatible type
> is UB (but we currently do not do this)
> 
> 
> If you think about
> 
> void foo(int n, char buf[n]);
> 
> void callfoo(char *buf) { foo(10, buf); }
> 
> 
> Then you are right that this can not be checked at this
> time. But this  does not mean it is useless because we
> still can detect inconsistencies in other cases:
> 
> void callfoo(int n, char buf[n - 1]) { foo(n, buf); }
> 
> We could also - in the future - have a warning about all
> situations where bound information is lost, making sure
> that preconditions are always checked for people who
> consistently use these annotations.
> 
> 
>> The compiler has no choice than to trust the user that the pre-
>> condition  for calling foo is fulfilled.  I can see how
>> being able to just check half  of the contract might be
>> useful, but if it doesn't give full checking then
>> any proposal for syntax should be even more obviously
>> orthogonal than the current one.
> 
> Your argument is not clear to me.
> 
> 
>>> For
>>>
>>> void foo(int n, char buf[n]);
>>>
>>> it semantically has no meaning according to the C standard,
>>> but a compiler could still warn.
>>
>> Hmm?  Warn about what in this decl?
> 
> I meant, we could warn about something like this
> because it is likely an error:
> 
> void foo(int n, char buf[n])
> {
>    buf[n] = 1;
> }
> 
> 
>>> It could also warn for
>>>
>>> void foo(int n, char buf[n]);
>>>
>>> int main()
>>> {
>>>      char buf[9];
>>>      foo(buf);
>>> }
>>
>> You mean if you write 'foo(10,buf)' (the above, as is, is simply a
>> syntax error for non-matching number of args).  Or was it a mispaste
>> and you mean  the one from the godbolt link, i.e.:
> 
> I meant:
> 
> char buf[9];
> foo(10, buf);
> 
> In fact, it turns out we warn already:
> 
> https://godbolt.org/z/qcvsv87Ev
> 
>> void foo(char buf[10]){ buf[9] = 1; }
>> int main()
>> {
>>      char buf[9];
>>      foo(buf);
>> }
>>
>> ?  If so, yeah, we warn already.  I don't think this is an argument
>> for (or against) introducing new syntax.
>> ...
> 
> It is an argument for having this syntax, because we could
> extend such warning (those we already have and those we
> could still add) to more common cases such as
> 
> void foo(char buf[.n], size_t n);
> 
> In my opinion, this would be a huge step forward for
> safety of C programs as we already have a lot of
> infrastructure for checking bounds.
> 
> Of course, the existing GNU extension would achieve
> the same thing:
> 
> void foo(size_t n; char buf[n], size_t n);
> 
> 
> 
>>> But in general: This feature is useful not only for documentation
>>> but also for analysis.
>>
>> Which feature we're talking about now?  The ones you used all work
>> today,
>> as you demonstrated.  I thought we would be talking about that
>> ".whatever"
>> syntax to refer to arbitrary parameters, even following ones?  I
>> think a
>> disrupting syntax change like that should have a higher bar than "in
>> some
>> cases, depending on circumstance, we might even be able to warn".
> 
> We can use our existing features and then apply them
> to cases where the bound is specified after the pointer,
> which is more common in practice.

Yep; basically adding some (not perfect, but some) static analysis to 
many libc function calls.
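
For comparison, here is a sketch of what already works today when the size
comes first.  The wrapper `copy_bytes` and the helper `demo_copy` are
hypothetical names, not existing libc functions; the array notation is
adjusted to a pointer as usual, so behaviour is the same as with plain
pointers, and the bounds are only there for readers and for whatever
diagnostics a compiler chooses to emit:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size first, so dst[n] and src[n] can refer to it in standard C. */
static void copy_bytes(size_t n, char dst[n], const char src[n])
{
    memcpy(dst, src, n);
}

/* Usage: both arrays really have the 4 elements that the call claims. */
static void demo_copy(void)
{
    char src[4] = "abc";
    char dst[4];

    copy_bytes(sizeof(dst), dst, src);
    assert(strcmp(dst, "abc") == 0);
}
```

The point of the proposed syntax would be to get the same annotation on
functions that take the size after the pointer.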

Also, considering the issues with sizeof and arrays, and the lack of a 
_Nitems() [proposed as _Lengthof()] operator, there's a lot of manual 
work in array (read pointer) parameters.

However, a hypothetical _Nitems() operator could make use of this 
syntactic sugar, and be more useful than just providing static analysis. 
  Using _Nitems() on a VMT (including pointer parameters) could be 
specified to return the number of elements, so I foresee code like:


void foo(int arr[nmemb], size_t nmemb)
{
         // _Nitems() evaluates to nmemb
         for (size_t i = 0; i < _Nitems(arr); i++)
                 arr[i] = i;
}


void bar(int arr[])
{
         // Constraint violation
         for (size_t i = 0; i < _Nitems(arr); i++)
                 arr[i] = i;
}


This is probably the most useful part of this feature (though admittedly 
it's not tied to this feature, and could even be added without it).
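
Until something like that exists, a rough approximation is possible with a
pointer to an array and the size passed first.  This is only a sketch: the
macro `NITEMS` and the functions `fill_iota` and `demo_fill` are names
invented for this example, and the macro is the usual sizeof division, not
the hypothetical operator:

```c
#include <assert.h>
#include <stddef.h>

/* Element count of an array (not of a pointer); works for VLA types too. */
#define NITEMS(a)  (sizeof(a) / sizeof((a)[0]))

/* *arr has type int[nmemb], so sizeof(*arr) carries the run-time bound. */
static void fill_iota(size_t nmemb, int (*arr)[nmemb])
{
    for (size_t i = 0; i < NITEMS(*arr); i++)
        (*arr)[i] = (int) i;
}

/* Usage: pass the address of the array, not the decayed pointer. */
static void demo_fill(void)
{
    int a[5];

    fill_iota(NITEMS(a), &a);
    assert(a[0] == 0 && a[4] == 4);
}
```

With a plain `int arr[nmemb]` parameter the same macro would silently divide
the size of a pointer, which is exactly the manual work mentioned above.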

> 
> 
> Martin
> 

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 17:19                                                                 ` Alex Colomar
@ 2022-11-29 17:29                                                                   ` Alex Colomar
  2022-12-03 21:03                                                                     ` Alejandro Colomar
  0 siblings, 1 reply; 85+ messages in thread
From: Alex Colomar @ 2022-11-29 17:29 UTC (permalink / raw)
  To: Martin Uecker, Jonathan Wakely, Joseph Myers
  Cc: Michael Matz, Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1452 bytes --]

On 11/29/22 18:19, Alex Colomar wrote:
> Hi Martin, Joseph,
> 
> On 11/29/22 18:00, Martin Uecker wrote:
>> Am Dienstag, dem 29.11.2022 um 16:53 +0000 schrieb Jonathan Wakely:
>>> On Tue, 29 Nov 2022 at 16:49, Joseph Myers wrote:
>>>>
>>>> On Tue, 29 Nov 2022, Michael Matz via Gcc wrote:
>>>>
>>>>> like.  But I'm generally doubtful of this whole feature within C
>>>>> itself.
>>>>> It serves a purpose in documentation, so in man-pages it seems
>>>>> fine enough
>>>>> (but then still could use a different punctuator to not be
>>>>> confusable with
>>>>> C syntax).
>>>>
>>>> In man-pages you don't need to invent syntax at all.  You can write
>>>>
>>>> int f(char buf[n], int n);
>>>>
>>>> and in the context of a man page it will be clear to readers what
>>>> is
>>>> meant,
>>>
>>> Considerably more clear than new invented syntax IMHO.
>>
>> True, but I think it would be a mistake to use code in
>> man pages which then does not work as expected (or even
>> is subtly wrong) in actual code.
> 
> Exactly.  Using your

s/your/Joseph's/

> proposed syntax (which was my first draft) would 
> have probably been the source of hidden bugs, since it might work (read 
> compile) in some cases, but with wrong results.
> 
> I prefer this hypothetical syntax, which at most will cause compile errors.
> 
> Cheers,
> 
> Alex
> 
>>
>> Martin
>>
>>
>>
> 

-- 
<http://www.alejandro-colomar.es/>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-11-29 17:29                                                                   ` Alex Colomar
@ 2022-12-03 21:03                                                                     ` Alejandro Colomar
  2022-12-03 21:13                                                                       ` Andrew Pinski
                                                                                         ` (2 more replies)
  0 siblings, 3 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-12-03 21:03 UTC (permalink / raw)
  To: Martin Uecker, Jonathan Wakely, Joseph Myers, Michael Matz
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1641 bytes --]

Hi!

I'll probably have to release again before the Debian freeze of Bookworm. 
That's something I didn't want to do, but there's some important bug that 
affects downstream projects (translation pages), and I need to release.  It's a 
bit weird that the bug has been reported now, because it has always been there 
(it's not a regression), but still, I want to address it before the next Debian.

And I don't want to start with stable releases, so I won't be releasing 
man-pages-6.01.1.  That means that all changes that I have in the project that I 
didn't plan to release until 2024 will be released in a few weeks, notably 
including the VLA syntax.

This means that while this syntax is still an invention, not something real that 
can be used, I need to be careful about the future if I plan to make it public 
so soon.

Since we've seen that using a '.' prefix seems to be problematic because of 
lookahead, and recently Michael Matz proposed using a different punctuator (he 
proposed '@') for differentiating parameters from struct members, I think going 
in that direction may be a good idea.

How about '$'?

It's been used for function parameters since... forever? in sh(1).  And it's 
being added to the source character set in C23, so it seems to be a good choice. 
  It should also be intuitive what it means.

What do you think about it?  I'm not asking for your opinion about adding it to 
GCC, but rather for replacing the current '.' in the man-pages before I release 
later this month.  Do you think I should apply that change?

Cheers,

Alex


-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-12-03 21:03                                                                     ` Alejandro Colomar
@ 2022-12-03 21:13                                                                       ` Andrew Pinski
  2022-12-03 21:15                                                                       ` Martin Uecker
  2022-12-06  2:08                                                                       ` Joseph Myers
  2 siblings, 0 replies; 85+ messages in thread
From: Andrew Pinski @ 2022-12-03 21:13 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Jonathan Wakely, Joseph Myers, Michael Matz,
	Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

On Sat, Dec 3, 2022 at 1:05 PM Alejandro Colomar via Gcc
<gcc@gcc.gnu.org> wrote:
>
> Hi!
>
> I'll probably have to release again before the Debian freeze of Bookworm.
> That's something I didn't want to do, but there's some important bug that
> affects downstream projects (translation pages), and I need to release.  It's a
> bit weird that the bug has been reported now, because it has always been there
> (it's not a regression), but still, I want to address it before the next Debian.
>
> And I don't want to start with stable releases, so I won't be releasing
> man-pages-6.01.1.  That means that all changes that I have in the project that I
> didn't plan to release until 2024 will be released in a few weeks, notably
> including the VLA syntax.
>
> This means that while this syntax is still an invention, not something real that
> can be used, I need to be careful about the future if I plan to make it public
> so soon.
>
> Since we've seen that using a '.' prefix seems to be problematic because of
> lookahead, and recently Michael Matz proposed using a different punctuator (he
> proposed '@') for differentiating parameters from struct members, I think going
> in that direction may be a good idea.
>
> How about '$'?

$ is a GNU extension for identifiers already.
See https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Dollar-Signs.html#Dollar-Signs
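
A small sketch of the clash, assuming GCC with its default settings on a
target where dollar signs in identifiers are enabled (the names `$n`, `buf`
and `demo_dollar` are only for illustration):

```c
#include <assert.h>

/* GNU C: '$' may appear in identifiers, so '$n' is one ordinary name. */
enum { $n = 3 };

/* Here '[$n]' is not "the parameter n"; it is the constant named '$n'. */
static char buf[$n];

/* Usage: the array bound really came from the identifier '$n'. */
static void demo_dollar(void)
{
    assert(sizeof(buf) == 3);
}
```

So a bound written as `[$n]` already has a meaning for code that relies on
the extension.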

Thanks,
Andrew

>
> It's been used for function parameters since... forever? in sh(1).  And it's
> being added to the source character set in C23, so it seems to be a good choice.
>   It should also be intuitive what it means.
>
> What do you think about it?  I'm not asking for your opinion about adding it to
> GCC, but rather for replacing the current '.' in the man-pages before I release
> later this month.  Do you think I should apply that change?
>
> Cheers,
>
> Alex
>
>
> --
> <http://www.alejandro-colomar.es/>

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-12-03 21:03                                                                     ` Alejandro Colomar
  2022-12-03 21:13                                                                       ` Andrew Pinski
@ 2022-12-03 21:15                                                                       ` Martin Uecker
  2022-12-03 21:18                                                                         ` Alejandro Colomar
  2022-12-06  2:08                                                                       ` Joseph Myers
  2 siblings, 1 reply; 85+ messages in thread
From: Martin Uecker @ 2022-12-03 21:15 UTC (permalink / raw)
  To: Alejandro Colomar, Jonathan Wakely, Joseph Myers, Michael Matz
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc

Am Samstag, dem 03.12.2022 um 22:03 +0100 schrieb Alejandro Colomar:
...
> Since we've seen that using a '.' prefix seems to be problematic
> because of lookahead, and recently Michael Matz proposed using a
> different punctuator (he proposed '@') for differentiating parameters
> from struct members, I think going  in that direction may be a good
> idea.
> 
> How about '$'?

I don't see how the lookahead issue has anything to do with the choice
of the symbol. Here, too, the context would fully disambiguate
between other uses so I do not think there is any issue with using this
syntax.  '$' is much more problematic as people use it in identifiers,
'@' may cause confusion with objective C.

Martin






^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-12-03 21:15                                                                       ` Martin Uecker
@ 2022-12-03 21:18                                                                         ` Alejandro Colomar
  0 siblings, 0 replies; 85+ messages in thread
From: Alejandro Colomar @ 2022-12-03 21:18 UTC (permalink / raw)
  To: Martin Uecker, Jonathan Wakely, Joseph Myers, Michael Matz
  Cc: Ingo Schwarze, JeanHeyd Meneide, linux-man, gcc


[-- Attachment #1.1: Type: text/plain, Size: 1435 bytes --]

Hi Martin and Andrew!

On 12/3/22 22:15, Martin Uecker wrote:
> Am Samstag, dem 03.12.2022 um 22:03 +0100 schrieb Alejandro Colomar:
> ...
>> Since we've seen that using a '.' prefix seems to be problematic
>> because of lookahead, and recently Michael Matz proposed using a
>> different punctuator (he proposed '@') for differentiating parameters
>> from struct members, I think going  in that direction may be a good
>> idea.
>>
>> How about '$'?
> 
> I don't see how the lookahead issue has anything to do with the choice
> of the symbol.

In simple [.identifier] expressions it's not a problem.  I was foreseeing more 
complex expressions, as I suggested earlier.

> Here, too, the context would fully disambiguate
> between other uses so I do not think there is any issue with using this
> syntax.  '$' is much more problematic as people use it in identifiers,
> '@' may cause confusion with objective C.

On 12/3/22 22:13, Andrew Pinski wrote:
 > $ is a GNU extension for identifiers already.
 > See https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Dollar-Signs.html#Dollar-Signs
 >
 > Thanks,
 > Andrew
 >

Hmmm, I see.  '$' is too bad.  '@' is confusing.  I think I'll keep the '.' for 
now then, and assume that there's a high possibility that we'll never have 
complex expressions with it.

> 
> Martin
> 

Thank you!

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
  2022-12-03 21:03                                                                     ` Alejandro Colomar
  2022-12-03 21:13                                                                       ` Andrew Pinski
  2022-12-03 21:15                                                                       ` Martin Uecker
@ 2022-12-06  2:08                                                                       ` Joseph Myers
  2 siblings, 0 replies; 85+ messages in thread
From: Joseph Myers @ 2022-12-06  2:08 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Martin Uecker, Jonathan Wakely, Michael Matz, Ingo Schwarze,
	JeanHeyd Meneide, linux-man, gcc

On Sat, 3 Dec 2022, Alejandro Colomar via Gcc wrote:

> What do you think about it?  I'm not asking for your opinion about adding it
> to GCC, but rather for replacing the current '.' in the man-pages before I
> release later this month.  Do you think I should apply that change?

I think man pages should not use any novel syntax - even syntax newly 
added to the C standard or GCC, unless required to express the standard 
prototype for a function.  They should be written for maximal 
comprehensibility to C users in general, who are often behind on knowledge of 
standard features, let alone the more obscure extensions - and certainly 
don't know about random, highly speculative suggestions for possible 
features suggested in random mailing list threads.  So: don't use any 
invented syntax (even if you explain it somewhere in the man pages), don't 
use any syntax newly introduced in C23 unless strictly necessary and 
you're sure it's already extremely widely understood among C users, be 
wary of syntax introduced in C11.  If a new feature in this area were 
introduced in C29, waiting at least several years after that standard is 
released (*not* just after the feature gets added to a draft) to start 
using the new syntax in man pages would be a good idea.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 85+ messages in thread

end of thread, other threads:[~2022-12-06  2:08 UTC | newest]

Thread overview: 85+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-26 21:07 [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters Alejandro Colomar
2022-08-27 11:10 ` Ingo Schwarze
2022-08-27 12:15   ` Alejandro Colomar
2022-08-27 13:08     ` Ingo Schwarze
2022-08-27 18:38       ` Alejandro Colomar
2022-08-28 11:24         ` Alejandro Colomar
     [not found]           ` <CACqA6+mfaj6Viw+LVOG=nE350gQhCwVKXRzycVru5Oi4EJzgTg@mail.gmail.com>
2022-09-02 21:02             ` Alejandro Colomar
2022-09-02 21:57               ` Alejandro Colomar
2022-09-03 12:47                 ` Martin Uecker
2022-09-03 13:29                   ` Ingo Schwarze
2022-09-03 15:08                     ` Alejandro Colomar
2022-09-03 13:41                   ` Alejandro Colomar
2022-09-03 14:35                     ` Martin Uecker
2022-09-03 14:59                       ` Alejandro Colomar
2022-09-03 15:31                         ` Martin Uecker
2022-09-03 20:02                           ` Alejandro Colomar
2022-09-05 14:31                             ` Alejandro Colomar
2022-11-10  0:06                           ` Alejandro Colomar
2022-11-10  0:09                             ` Alejandro Colomar
2022-11-10  1:33                             ` Joseph Myers
2022-11-10  1:39                               ` Joseph Myers
2022-11-10  6:21                                 ` Martin Uecker
2022-11-10 10:09                                   ` Alejandro Colomar
2022-11-10 23:19                                   ` Joseph Myers
2022-11-10 23:28                                     ` Alejandro Colomar
2022-11-11 19:52                                     ` Martin Uecker
2022-11-12  1:09                                       ` Joseph Myers
2022-11-12  7:24                                         ` Martin Uecker
2022-11-12 12:34                                     ` Alejandro Colomar
2022-11-12 12:46                                       ` Alejandro Colomar
2022-11-12 13:03                                       ` Joseph Myers
2022-11-12 13:40                                         ` Alejandro Colomar
2022-11-12 13:58                                           ` Alejandro Colomar
2022-11-12 14:54                                           ` Joseph Myers
2022-11-12 15:35                                             ` Alejandro Colomar
2022-11-12 17:02                                               ` Joseph Myers
2022-11-12 17:08                                                 ` Alejandro Colomar
2022-11-12 15:56                                             ` Martin Uecker
2022-11-13 13:19                                               ` Alejandro Colomar
2022-11-13 13:33                                                 ` Alejandro Colomar
2022-11-13 14:02                                                   ` Alejandro Colomar
2022-11-13 14:58                                                     ` Martin Uecker
2022-11-13 15:15                                                       ` Alejandro Colomar
2022-11-13 15:32                                                         ` Martin Uecker
2022-11-13 16:25                                                           ` Alejandro Colomar
2022-11-13 16:28                                                         ` Alejandro Colomar
2022-11-13 16:31                                                           ` Alejandro Colomar
2022-11-13 16:34                                                             ` Alejandro Colomar
2022-11-13 16:56                                                               ` Alejandro Colomar
2022-11-13 19:05                                                                 ` Alejandro Colomar
2022-11-14 18:13                                                           ` Joseph Myers
2022-11-28 22:59                                                             ` Alex Colomar
2022-11-28 23:18                                                       ` Alex Colomar
2022-11-29  0:05                                                         ` Joseph Myers
2022-11-29 14:58                                                         ` Michael Matz
2022-11-29 15:17                                                           ` Uecker, Martin
2022-11-29 15:44                                                             ` Michael Matz
2022-11-29 16:58                                                               ` Uecker, Martin
2022-11-29 17:28                                                                 ` Alex Colomar
2022-11-29 16:49                                                           ` Joseph Myers
2022-11-29 16:53                                                             ` Jonathan Wakely
2022-11-29 17:00                                                               ` Martin Uecker
2022-11-29 17:19                                                                 ` Alex Colomar
2022-11-29 17:29                                                                   ` Alex Colomar
2022-12-03 21:03                                                                     ` Alejandro Colomar
2022-12-03 21:13                                                                       ` Andrew Pinski
2022-12-03 21:15                                                                       ` Martin Uecker
2022-12-03 21:18                                                                         ` Alejandro Colomar
2022-12-06  2:08                                                                       ` Joseph Myers
2022-11-14 17:52                                                 ` Joseph Myers
2022-11-14 17:57                                                   ` Alejandro Colomar
2022-11-14 18:26                                                     ` Joseph Myers
2022-11-28 23:02                                                       ` Alex Colomar
2022-11-10  9:40                             ` G. Branden Robinson
2022-11-10 10:59                               ` Alejandro Colomar
2022-11-10 17:47                                 ` Alejandro Colomar
2022-11-10 18:04                                   ` MR macro 4th argument (was: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters) Alejandro Colomar
2022-11-10 18:11                                     ` Alejandro Colomar
2022-11-10 18:20                                       ` Alejandro Colomar
2022-11-10 19:37                                     ` Alejandro Colomar
2022-11-10 20:41                                       ` Alejandro Colomar
2022-11-10 22:55                                     ` G. Branden Robinson
2022-11-10 23:55                                       ` Alejandro Colomar
2022-11-11  4:44                                         ` G. Branden Robinson
2022-11-10 22:25                                 ` [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters G. Branden Robinson
