linux-man.vger.kernel.org archive mirror
* [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND
@ 2023-04-19 17:47 наб
  2023-04-19 17:48 ` [PATCH 2/2] regex.3: improve REG_STARTEND наб
  2023-04-19 19:51 ` [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND Alejandro Colomar
  0 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-19 17:47 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Also note that in
       int regexec(const regex_t *restrict preg, const char *restrict string,
                   size_t nmatch, regmatch_t pmatch[restrict .nmatch],
                   int eflags);
pmatch is effectively [1] when nmatch is 0 and eflags & REG_STARTEND is set.
Or, more succinctly,
  regmatch_t pmatch[restrict !!(.eflags & REG_STARTEND) ?: .nmatch],

Doesn't really matter, and that's a much worse signature than what's
currently there, but.

 man3/regex.3 | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/man3/regex.3 b/man3/regex.3
index e8fed5147..d54d6024c 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -82,7 +82,9 @@ and
 .I pmatch
 arguments to
 .BR regexec ()
-are ignored if the pattern buffer supplied was compiled with this flag set.
+are only used for
+.B REG_STARTEND
+if the pattern buffer supplied was compiled with this flag set.
 .TP
 .B REG_NEWLINE
 Match-any-character operators don't match a newline.
-- 
2.30.2



* [PATCH 2/2] regex.3: improve REG_STARTEND
  2023-04-19 17:47 [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
@ 2023-04-19 17:48 ` наб
  2023-04-19 20:23   ` Alejandro Colomar
  2023-04-19 19:51 ` [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND Alejandro Colomar
  1 sibling, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 17:48 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

Explicitly spell out the ranges involved. The original wording always
confused me, but it's actually very sane.

Also change the [0]. to -> here to make it more obvious that
pmatch is used as a pointer-to-object, not an array, in this scenario.

Remove "this doesn't change REG_NOTBOL & REG_NEWLINE" ‒ so does it change
REG_NOTEOL? No. That's weird and confusing.

String largeness doesn't matter, known-lengthness does.

Explicitly spell out the influence on returned matches
(relative to string, not start of range).

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index d54d6024c..2c8b87aca 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -141,23 +141,20 @@ compilation flag
 above).
 .TP
 .B REG_STARTEND
-Use
-.I pmatch[0]
-on the input string, starting at byte
-.I pmatch[0].rm_so
-and ending before byte
-.IR pmatch[0].rm_eo .
+Match
+.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
+instead of
+.RI [ string ", " string " + \fBstrlen\fP(" string )).
 This allows matching embedded NUL bytes
 and avoids a
 .BR strlen (3)
-on large strings.
-It does not use
+on known-length strings.
 .I nmatch
-on input, and does not change
-.B REG_NOTBOL
-or
-.B REG_NEWLINE
-processing.
+is not consulted for this purpose.
+If any matches are returned, they're relative to
+.IR string ,
+not
+.IR string " + " pmatch->rm_so .
 This flag is a BSD extension, not present in POSIX.
 .SS Byte offsets
 Unless
-- 
2.30.2


* Re: [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND
  2023-04-19 17:47 [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
  2023-04-19 17:48 ` [PATCH 2/2] regex.3: improve REG_STARTEND наб
@ 2023-04-19 19:51 ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-19 19:51 UTC (permalink / raw)
  To: наб; +Cc: linux-man


Hi наб!

On 4/19/23 19:47, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
> Also note that in
>        int regexec(const regex_t *restrict preg, const char *restrict string,
>                    size_t nmatch, regmatch_t pmatch[restrict .nmatch],
>                    int eflags);
> pmatch is effectively [1] when nmatch is 0 and eflags & REG_STARTEND is set.
> Or, more succinctly,
>   regmatch_t pmatch[restrict !!(.eflags & REG_STARTEND) ?: .nmatch],
> 
> Doesn't really matter, and that's a much worse signature than what's
> currently there, but.

Please include this in the commit message :)

> 
>  man3/regex.3 | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index e8fed5147..d54d6024c 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -82,7 +82,9 @@ and
>  .I pmatch
>  arguments to
>  .BR regexec ()
> -are ignored if the pattern buffer supplied was compiled with this flag set.
> +are only used for
> +.B REG_STARTEND
> +if the pattern buffer supplied was compiled with this flag set.

I think it would be clearer with a wording like:

+are only used for
+.B REG_STARTEND
+and only if the pattern buffer supplied was compiled with this flag set.

I'm still not convinced by my wording either; please revise.
But with your wording, I think it's not clear what happens if
REG_STARTEND is not set.

Cheers,
Alex

>  .TP
>  .B REG_NEWLINE
>  Match-any-character operators don't match a newline.

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH 2/2] regex.3: improve REG_STARTEND
  2023-04-19 17:48 ` [PATCH 2/2] regex.3: improve REG_STARTEND наб
@ 2023-04-19 20:23   ` Alejandro Colomar
  2023-04-19 21:20     ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-19 20:23 UTC (permalink / raw)
  To: наб; +Cc: linux-man


Hi наб!

On 4/19/23 19:48, наб wrote:
> Explicitly spell out the ranges involved. The original wording always
> confused me, but it's actually very sane.
> 
> Also change the [0]. to -> here to make it more obvious that
> pmatch is used as a pointer-to-object, not an array, in this scenario.
> 
> Remove "this doesn't change REG_NOTBOL & REG_NEWLINE" ‒ so does it change
> REG_NOTEOL? No. That's weird and confusing.
> 
> String largeness doesn't matter, known-lengthness does.
> 
> Explicitly spell out the influence on returned matches
> (relative to string, not start of range).
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 23 ++++++++++-------------
>  1 file changed, 10 insertions(+), 13 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index d54d6024c..2c8b87aca 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -141,23 +141,20 @@ compilation flag
>  above).
>  .TP
>  .B REG_STARTEND
> -Use
> -.I pmatch[0]
> -on the input string, starting at byte
> -.I pmatch[0].rm_so
> -and ending before byte
> -.IR pmatch[0].rm_eo .
> +Match
> +.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
> +instead of
> +.RI [ string ", " string " + \fBstrlen\fP(" string )).

Hmmm, I like this!

Let's see if I understand it.  pmatch[] is normally
[[gnu::access(write_only, 4, 3)]]
but if ((.eflags & REG_STARTEND) != 0) it's [1] and
[[gnu::access(read_write, 4)]]?

>  This allows matching embedded NUL bytes
>  and avoids a
>  .BR strlen (3)
> -on large strings.
> -It does not use
> +on known-length strings.
>  .I nmatch
> -on input, and does not change
> -.B REG_NOTBOL
> -or
> -.B REG_NEWLINE
> -processing.
> +is not consulted for this purpose.
> +If any matches are returned, they're relative to
> +.IR string ,
> +not
> +.IR string " + " pmatch->rm_so .

How are such matches returned?  In pmatch[>0]?  Or how?

Cheers,
Alex

>  This flag is a BSD extension, not present in POSIX.
>  .SS Byte offsets
>  Unless

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH 2/2] regex.3: improve REG_STARTEND
  2023-04-19 20:23   ` Alejandro Colomar
@ 2023-04-19 21:20     ` наб
  2023-04-19 21:45       ` Alejandro Colomar
                         ` (9 more replies)
  0 siblings, 10 replies; 143+ messages in thread
From: наб @ 2023-04-19 21:20 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

Hi!

On Wed, Apr 19, 2023 at 10:23:29PM +0200, Alejandro Colomar wrote:
> On 4/19/23 19:48, наб wrote:
> > diff --git a/man3/regex.3 b/man3/regex.3
> > index d54d6024c..2c8b87aca 100644
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -141,23 +141,20 @@ compilation flag
> >  above).
> >  .TP
> >  .B REG_STARTEND
> > -Use
> > -.I pmatch[0]
> > -on the input string, starting at byte
> > -.I pmatch[0].rm_so
> > -and ending before byte
> > -.IR pmatch[0].rm_eo .
> > +Match
> > +.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
> > +instead of
> > +.RI [ string ", " string " + \fBstrlen\fP(" string )).
> Hmmm, I like this!
> 
> Let's see if I understand it.  pmatch[] is normally
> [[gnu::access(write_only, 4, 3)]]
> but if ((.eflags & REG_STARTEND) != 0) it's [1] and
> [[gnu::access(read_write, 4)]]?
I fucked the ternary in my previous mail I think, soz;
I don't know if it's gnu::anything, but you could model it as
{
	if(eflags & REG_STARTEND)
		read(pmatch, 1);

	if(!(preg->flags & REG_NOSUB))  // as "set" in regcomp()
		write(pmatch, nmatch);
}

I.e. pmatch[nmatch] must be a writable array, unless REG_NOSUB,
and also, additively, *pmatch must be readable if REG_STARTEND.
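
Seen from the caller's side, that model could look like the sketch
below. It assumes a libc that implements REG_STARTEND (glibc and the
BSDs do; POSIX does not require it), and range_contains() is a made-up
name for illustration. The pattern is compiled with REG_NOSUB and
nmatch is 0, yet *pmatch is still read to delimit the range:

```c
#include <regex.h>

/*
 * Hypothetical helper (not part of regex.h): return 1 if pattern
 * matches somewhere in [buf + from, buf + to), 0 if it does not,
 * and -1 on any error.
 */
static int range_contains(const char *pattern, const char *buf,
                          regoff_t from, regoff_t to)
{
	regex_t     re;
	regmatch_t  range;
	int         rc;

	/* REG_NOSUB: regexec() will not report match positions. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	/* REG_STARTEND: regexec() still reads this single object. */
	range.rm_so = from;
	range.rm_eo = to;
	rc = regexec(&re, buf, 0, &range, REG_STARTEND);

	regfree(&re);
	return rc == 0 ? 1 : rc == REG_NOMATCH ? 0 : -1;
}
```

With "foo bar foo" as buf, searching for "bar" in [4, 7) should
succeed, while the same search in [0, 3) should not.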

> >  This allows matching embedded NUL bytes
> >  and avoids a
> >  .BR strlen (3)
> > -on large strings.
> > -It does not use
> > +on known-length strings.
> >  .I nmatch
> > -on input, and does not change
> > -.B REG_NOTBOL
> > -or
> > -.B REG_NEWLINE
> > -processing.
> > +is not consulted for this purpose.
> > +If any matches are returned, they're relative to
> > +.IR string ,
> > +not
> > +.IR string " + " pmatch->rm_so .
> How are such matches returned?  In pmatch[>0]?  Or how?
In the usual way in pmatch[0..nmatch].

I guess the "nmatch isn't taken into account" thing is confusing,
because REG_STARTEND just adds a read. regexec() can be modelled as
{
	const char * start, * end;
	if(eflags & REG_STARTEND) {
		start = string + pmatch->rm_so;
		end   = string + pmatch->rm_eo;
	} else {
		start = string;
		end   = string + strlen(string);
	}
	
	// match stuff in [start, end)
}

And that's the /only/ effect REG_STARTEND has
(+ matches are returned relative to string, not to start,
   but that's consistent, and they just got decoupled;
   it bears noting it there since it's not what I expected to happen).
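
A minimal sketch of the "relative to string" part, again assuming a
libc with REG_STARTEND (glibc or a BSD); match_in_range() is a
hypothetical name, not an existing API. The same regmatch_t carries
the range on input and the match on output:

```c
#include <regex.h>

/*
 * Hypothetical helper: find the first match of pattern in
 * [buf + from, buf + to) and store it in *m.
 * Returns whatever regcomp()/regexec() returned (0 on a match).
 */
static int match_in_range(const char *pattern, const char *buf,
                          regoff_t from, regoff_t to, regmatch_t *m)
{
	regex_t  re;
	int      rc;

	rc = regcomp(&re, pattern, REG_EXTENDED);
	if (rc != 0)
		return rc;

	/* On input, *m delimits the range to search... */
	m->rm_so = from;
	m->rm_eo = to;
	rc = regexec(&re, buf, 1, m, REG_STARTEND);
	/* ...on output, it holds the match as offsets from buf. */

	regfree(&re);
	return rc;
}
```

Searching "foo bar foo" for "foo" with from = 4 and to = 11 reports
the match as [8, 11), i.e. offsets from buf, not [4, 7) as it would be
if they were counted from buf + from.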

I'll sleep on this and post something I hate less tomorrow.

Best,


* Re: [PATCH 2/2] regex.3: improve REG_STARTEND
  2023-04-19 21:20     ` наб
@ 2023-04-19 21:45       ` Alejandro Colomar
  2023-04-19 23:23       ` [PATCH v2 1/9] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
                         ` (8 subsequent siblings)
  9 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-19 21:45 UTC (permalink / raw)
  To: наб; +Cc: linux-man


Hi!

On 4/19/23 23:20, наб wrote:
> Hi!
> 
> On Wed, Apr 19, 2023 at 10:23:29PM +0200, Alejandro Colomar wrote:
>> On 4/19/23 19:48, наб wrote:
>>> diff --git a/man3/regex.3 b/man3/regex.3
>>> index d54d6024c..2c8b87aca 100644
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -141,23 +141,20 @@ compilation flag
>>>  above).
>>>  .TP
>>>  .B REG_STARTEND
>>> -Use
>>> -.I pmatch[0]
>>> -on the input string, starting at byte
>>> -.I pmatch[0].rm_so
>>> -and ending before byte
>>> -.IR pmatch[0].rm_eo .
>>> +Match
>>> +.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
>>> +instead of
>>> +.RI [ string ", " string " + \fBstrlen\fP(" string )).
>> Hmmm, I like this!
>>
>> Let's see if I understand it.  pmatch[] is normally
>> [[gnu::access(write_only, 4, 3)]]
>> but if ((.eflags & REG_STARTEND) != 0) it's [1] and
>> [[gnu::access(read_write, 4)]]?
> I fucked the ternary in my previous mail I think, soz;
> I don't know if it's gnu::anything, but you could model it as
> {
> 	if(eflags & REG_STARTEND)
> 		read(pmatch, 1);
> 
> 	if(!(preg->flags & REG_NOSUB))  // as "set" in regcomp()
> 		write(pmatch, nmatch);
> }
> 
> I.e. pmatch[nmatch] must be a writable array, unless REG_NOSUB,
> and also, additively, *pmatch must be readable if REG_STARTEND.

Ahh, now it's clear to me (I think).  :)

> 
>>>  This allows matching embedded NUL bytes
>>>  and avoids a
>>>  .BR strlen (3)
>>> -on large strings.
>>> -It does not use
>>> +on known-length strings.
>>>  .I nmatch
>>> -on input, and does not change
>>> -.B REG_NOTBOL
>>> -or
>>> -.B REG_NEWLINE
>>> -processing.
>>> +is not consulted for this purpose.
>>> +If any matches are returned, they're relative to
>>> +.IR string ,
>>> +not
>>> +.IR string " + " pmatch->rm_so .
>> How are such matches returned?  In pmatch[>0]?  Or how?
> In the usual way in pmatch[0..nmatch].
> 
> I guess the "nmatch isn't taken into account" thing is confusing,
> because REG_STARTEND just adds a read. regexec() can be modelled as
> {
> 	const char * start, * end;
> 	if(eflags & REG_STARTEND) {
> 		start = string + pmatch->rm_so;
> 		end   = string + pmatch->rm_eo;
> 	} else {
> 		start = string;
> 		end   = string + strlen(string);
> 	}
> 	
> 	// match stuff in [start, end)
> }
> 
> And that's the /only/ effect REG_STARTEND has
> (+ matches are returned relative to string, not to start,
>    but that's consistent, and they just got decoupled;
>    it bears noting it there since it's not what I expected to happen).
> 
> I'll sleep on this and post something I hate less tomorrow.

Sure; good night!

Best,
Alex

> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* [PATCH v2 1/9] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND
  2023-04-19 21:20     ` наб
  2023-04-19 21:45       ` Alejandro Colomar
@ 2023-04-19 23:23       ` наб
  2023-04-20 11:21         ` Alejandro Colomar
  2023-04-19 23:23       ` [PATCH v2 2/9] regex.3: improve REG_STARTEND наб
                         ` (7 subsequent siblings)
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:23 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

In the regexec() signature
  regmatch_t pmatch[restrict .nmatch],
is a simplification. It's actually
  regmatch_t pmatch[restrict
    ((.preg->flags & REG_NOSUB) ? 0 : .nmatch) ?:
     !!(.eflags & REG_STARTEND)],

But speccing that would be insane.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
By the end, I think I get to a regex(3) that I don't dread opening
(and that has all the info I'd want. who knew there was re_nsub?)!

The main issues here are (a) it's full of standardese, entire paragraphs
lifted from POSIX, or very close to that, and the POSIX dialect is
hostile to human life^W^Wbeing effectively used and (b) what reads like
30 years of people adding stuff without having read any other part of
the document. Almost everything repeats at least once.

Funny moments outlined as they come in the messages.

 man3/regex.3 | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index e8fed5147..d77aac2e7 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -80,9 +80,11 @@ The
 .I nmatch
 and
 .I pmatch
-arguments to
 .BR regexec ()
-are ignored if the pattern buffer supplied was compiled with this flag set.
+arguments will be ignored for this purpose (but
+.I pmatch
+may still be used for
+.BR REG_STARTEND ).
 .TP
 .B REG_NEWLINE
 Match-any-character operators don't match a newline.
-- 
2.30.2



* [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-19 21:20     ` наб
  2023-04-19 21:45       ` Alejandro Colomar
  2023-04-19 23:23       ` [PATCH v2 1/9] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
@ 2023-04-19 23:23       ` наб
  2023-04-20 10:00         ` G. Branden Robinson
  2023-06-02  0:12         ` Alejandro Colomar
  2023-04-19 23:23       ` [PATCH v2 3/9] regex.3: ffix наб
                         ` (6 subsequent siblings)
  9 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-19 23:23 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

Explicitly spell out the ranges involved. The original wording always
confused me, but it's actually very sane.

Also change the [0]. to -> here to make it more obvious that
pmatch is used as a pointer-to-object, not an array, in this scenario.

Remove "this doesn't change REG_NOTBOL & REG_NEWLINE" ‒ so does it change
REG_NOTEOL? No. That's weird and confusing.

String largeness doesn't matter, known-lengthness does.

Explicitly spell out the influence on returned matches
(relative to string, not start of range).

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index d77aac2e7..74f19945d 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -141,23 +141,30 @@ compilation flag
 above).
 .TP
 .B REG_STARTEND
-Use
-.I pmatch[0]
-on the input string, starting at byte
-.I pmatch[0].rm_so
-and ending before byte
-.IR pmatch[0].rm_eo .
+Match
+.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
+instead of
+.RI [ string ", " string " + \fBstrlen\fP(" string )).
 This allows matching embedded NUL bytes
 and avoids a
 .BR strlen (3)
-on large strings.
-It does not use
+on known-length strings.
+.I pmatch
+must point to a valid readable object.
+If any matches are returned
+.RB ( REG_NOSUB
+wasn't passed to
+.BR regcomp (),
+the match succeeded, and
 .I nmatch
-on input, and does not change
-.B REG_NOTBOL
-or
-.B REG_NEWLINE
-processing.
+> 0), they overwrite
+.I pmatch
+as usual, and the
+.B Byte offsets
+remain relative to
+.IR string
+(not
+.IR string " + " pmatch->rm_so ).
 This flag is a BSD extension, not present in POSIX.
 .SS Byte offsets
 Unless
-- 
2.30.2
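
The "embedded NUL bytes" and "known-length strings" points of the new
wording can be illustrated with a sketch like the following. It
assumes a libc that implements REG_STARTEND (glibc or a BSD), and
find_in_buffer() is a hypothetical name used only here. The buffer is
described by a pointer and a length, so no strlen() is involved and
NUL bytes inside it do not end the search:

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: search the len bytes at buf, which may
 * contain NUL bytes, for pattern.
 * Returns the offset of the first match, or -1 if there is none
 * (or if pattern does not compile).
 */
static regoff_t find_in_buffer(const char *pattern,
                               const char *buf, size_t len)
{
	regex_t     re;
	regmatch_t  m;
	regoff_t    pos = -1;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;

	/* Match [buf + 0, buf + len) instead of [buf, buf + strlen(buf)). */
	m.rm_so = 0;
	m.rm_eo = (regoff_t) len;
	if (regexec(&re, buf, 1, &m, REG_STARTEND) == 0)
		pos = m.rm_so;

	regfree(&re);
	return pos;
}
```

For the 5-byte buffer "ab\0cd", a search for "cd" over all 5 bytes
finds it at offset 3; limiting the length to 2 finds nothing.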



* [PATCH v2 3/9] regex.3: ffix
  2023-04-19 21:20     ` наб
                         ` (2 preceding siblings ...)
  2023-04-19 23:23       ` [PATCH v2 2/9] regex.3: improve REG_STARTEND наб
@ 2023-04-19 23:23       ` наб
  2023-04-20 11:23         ` Alejandro Colomar
  2023-04-19 23:23       ` [PATCH v2 4/9] regex.3: wfix наб
                         ` (5 subsequent siblings)
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:23 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

We never bold POSIX, not even anywhere else on this page.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 74f19945d..5aaf42caa 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -61,11 +61,11 @@ of zero or more of the following:
 .TP
 .B REG_EXTENDED
 Use
-.B POSIX
+POSIX
 Extended Regular Expression syntax when interpreting
 .IR regex .
 If not set,
-.B POSIX
+POSIX
 Basic Regular Expression syntax is used.
 .TP
 .B REG_ICASE
-- 
2.30.2



* [PATCH v2 4/9] regex.3: wfix
  2023-04-19 21:20     ` наб
                         ` (3 preceding siblings ...)
  2023-04-19 23:23       ` [PATCH v2 3/9] regex.3: ffix наб
@ 2023-04-19 23:23       ` наб
  2023-04-20 11:27         ` Alejandro Colomar
  2023-04-19 23:23       ` [PATCH v2 5/9] regex.3: ffix наб
                         ` (4 subsequent siblings)
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:23 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

"Not in POSIX.2", so is it in POSIX.1-2008? POSIX.1-2001?
(or any other combination of standards from this millennium
not mentioned on this page?) It's not: just say POSIX.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 5aaf42caa..b6e574b4d 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -289,7 +289,7 @@ Unknown character class name.
 .TP
 .B REG_EEND
 Nonspecific error.
-This is not defined by POSIX.2.
+This is not defined by POSIX.
 .TP
 .B REG_EESCAPE
 Trailing backslash.
@@ -303,7 +303,7 @@ occurs prior to the starting point.
 .TP
 .B REG_ESIZE
 Compiled regular expression requires a pattern buffer larger than 64\ kB.
-This is not defined by POSIX.2.
+This is not defined by POSIX.
 .TP
 .B REG_ESPACE
 The regex routines ran out of memory.
-- 
2.30.2



* [PATCH v2 5/9] regex.3: ffix
  2023-04-19 21:20     ` наб
                         ` (4 preceding siblings ...)
  2023-04-19 23:23       ` [PATCH v2 4/9] regex.3: wfix наб
@ 2023-04-19 23:23       ` наб
  2023-04-20 11:28         ` Alejandro Colomar
  2023-04-19 23:25       ` [PATCH v2 6/9] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: move in with regex.3 наб
                         ` (3 subsequent siblings)
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:23 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

Use "bitwise OR" instead of "bitwise-\fBor\fP". No other page spells it
like this. The other weirdo contenders are
  $ git grep bitwise | grep RI
  man2/adjtimex.2:.RI bitwise- or
  man2/open.2:.RI bitwise- or 'd

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index b6e574b4d..fa2669544 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -56,7 +56,7 @@ pattern buffer.
 .PP
 .I cflags
 is the
-.RB bitwise- or
+bitwise OR
 of zero or more of the following:
 .TP
 .B REG_EXTENDED
@@ -121,7 +121,7 @@ and
 are used to provide information regarding the location of any matches.
 .I eflags
 is the
-.RB bitwise- or
+bitwise OR
 of zero or more of the following flags:
 .TP
 .B REG_NOTBOL
-- 
2.30.2



* [PATCH v2 6/9] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: move in with regex.3
  2023-04-19 21:20     ` наб
                         ` (5 preceding siblings ...)
  2023-04-19 23:23       ` [PATCH v2 5/9] regex.3: ffix наб
@ 2023-04-19 23:25       ` наб
  2023-04-20 11:31         ` Alejandro Colomar
  2023-04-19 23:25       ` [PATCH v2 7/9] regex.3: destandardeseify Byte offsets наб
                         ` (2 subsequent siblings)
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:25 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

They're inextricably linked, not cross-referenced at all,
and not used anywhere else.

Now that they (realistically) exist to the reader, add a note
on how big nmatch can be; POSIX even says "The application developer
should note that there is probably no reason for using a value of
nmatch that is larger than preg->re_nsub+1.".

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3              | 66 ++++++++++++++++++++++++++++-----------
 man3type/regex_t.3type    | 64 +------------------------------------
 man3type/regmatch_t.3type |  2 +-
 man3type/regoff_t.3type   |  2 +-
 4 files changed, 51 insertions(+), 83 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index fa2669544..b95b3c3b0 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -15,7 +15,7 @@ regcomp, regexec, regerror, regfree \- POSIX regex functions
 Standard C library
 .RI ( libc ", " \-lc )
 .SH SYNOPSIS
-.nf
+.EX
 .B #include <regex.h>
 .PP
 .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
@@ -29,7 +29,21 @@ Standard C library
 .BI "            char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
-.fi
+.PP
+.B typedef struct {
+.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
+.B } regex_t;
+.PP
+.B typedef struct {
+.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
+                           to start of substring */
+.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
+                           the first character after the end of
+                           substring */
+.B } regmatch_t;
+.PP
+.BR typedef " /* ... */  " regoff_t;
+.EE
 .SH DESCRIPTION
 .SS POSIX regex compiling
 .BR regcomp ()
@@ -54,6 +68,21 @@ must always be supplied with the address of a
 .BR regcomp ()-initialized
 pattern buffer.
 .PP
+After
+.BR regcomp ()
+succeeds,
+.I preg->re_nsub
+holds the number of subexpressions in
+.IR regex .
+Thus, a value of
+.I preg->re_nsub
++ 1
+passed as
+.I nmatch
+to
+.BR regexec ()
+is sufficient to capture all matches.
+.PP
 .I cflags
 is the
 bitwise OR
@@ -192,22 +221,6 @@ must be at least
 .IR N+1 .)
 Any unused structure elements will contain the value \-1.
 .PP
-The
-.I regmatch_t
-structure which is the type of
-.I pmatch
-is defined in
-.IR <regex.h> .
-.PP
-.in +4n
-.EX
-typedef struct {
-    regoff_t rm_so;
-    regoff_t rm_eo;
-} regmatch_t;
-.EE
-.in
-.PP
 Each
 .I rm_so
 element that is not \-1 indicates the start offset of the next largest
@@ -216,6 +229,14 @@ The relative
 .I rm_eo
 element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
+.PP
+.I regoff_t
+is a signed integer type
+capable of storing the largest value that can be stored in either an
+.I ptrdiff_t
+type or a
+.I ssize_t
+type.
 .SS POSIX error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
@@ -338,6 +359,15 @@ T}	Thread safety	MT-Safe
 POSIX.1-2008.
 .SH HISTORY
 POSIX.1-2001.
+.PP
+Prior to POSIX.1-2008,
+.I regoff_t
+was required to be
+capable of storing the largest value that can be stored in either an
+.I off_t
+type or a
+.I ssize_t
+type.
 .SH EXAMPLES
 .EX
 #include <stdint.h>
diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
index 176d2c7a6..c0daaf0ff 100644
--- a/man3type/regex_t.3type
+++ b/man3type/regex_t.3type
@@ -1,63 +1 @@
-.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
-.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\"
-.TH regex_t 3type (date) "Linux man-pages (unreleased)"
-.SH NAME
-regex_t, regmatch_t, regoff_t
-\- regular expression matching
-.SH LIBRARY
-Standard C library
-.RI ( libc )
-.SH SYNOPSIS
-.EX
-.B #include <regex.h>
-.PP
-.B typedef struct {
-.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
-.B } regex_t;
-.PP
-.B typedef struct {
-.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
-                           to start of substring */
-.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
-                           the first character after the end of
-                           substring */
-.B } regmatch_t;
-.PP
-.BR typedef " /* ... */  " regoff_t;
-.EE
-.SH DESCRIPTION
-.TP
-.I regex_t
-This is a structure type used in regular expression matching.
-It holds a compiled regular expression,
-compiled with
-.BR regcomp (3).
-.TP
-.I regmatch_t
-This is a structure type used in regular expression matching.
-.TP
-.I regoff_t
-It is a signed integer type
-capable of storing the largest value that can be stored in either an
-.I ptrdiff_t
-type or a
-.I ssize_t
-type.
-.SH STANDARDS
-POSIX.1-2008.
-.SH HISTORY
-POSIX.1-2001.
-.PP
-Prior to POSIX.1-2008,
-the type was
-capable of storing the largest value that can be stored in either an
-.I off_t
-type or a
-.I ssize_t
-type.
-.SH SEE ALSO
-.BR regex (3)
+.so man3/regex.3
diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regmatch_t.3type
+++ b/man3type/regmatch_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regoff_t.3type
+++ b/man3type/regoff_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
-- 
2.30.2
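
The re_nsub note added to the page by this patch can be sketched as
follows. match_all_groups() is a hypothetical helper written for this
illustration; only regexec(), re_nsub and the standard allocation
functions are real. The array is sized as re_nsub + 1 so that element
0 (the whole match) and every subexpression fit:

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: match string against an already compiled re
 * and return a calloc()ed array of re->re_nsub + 1 elements holding
 * the whole match and every subexpression.
 * Returns NULL if there is no match or allocation fails; the caller
 * frees the result.
 */
static regmatch_t *match_all_groups(const regex_t *re, const char *string)
{
	size_t       nmatch = re->re_nsub + 1;
	regmatch_t  *pmatch;

	pmatch = calloc(nmatch, sizeof(*pmatch));
	if (pmatch == NULL)
		return NULL;

	if (regexec(re, string, nmatch, pmatch, 0) != 0) {
		free(pmatch);
		return NULL;
	}
	return pmatch;
}
```

For the pattern "(a)(b)?c", re_nsub is 2, so three elements are
allocated; matching "xac" leaves the element for the unmatched "(b)?"
set to -1.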



* [PATCH v2 7/9] regex.3: destandardeseify Byte offsets
  2023-04-19 21:20     ` наб
                         ` (6 preceding siblings ...)
  2023-04-19 23:25       ` [PATCH v2 6/9] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: move in with regex.3 наб
@ 2023-04-19 23:25       ` наб
  2023-04-19 23:26       ` [PATCH v2 8/9] regex.3: desoupify function descriptions наб
  2023-04-19 23:26       ` [PATCH v2 9/9] regex.3: fix subsection headings наб
  9 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-19 23:25 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

This section reads as if it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
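For illustration only, a minimal sketch of what the half-open range
[string + rm_so, string + rm_eo) means in practice. The helper name
match_group() and its parameters are hypothetical, not part of the patch
or of <regex.h>; it sizes pmatch as re_nsub + 1 and copies one group out.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: compile `pattern` as an ERE, match it against
 * `string`, and copy the text of group `idx` (0 = whole match) into `out`.
 * Returns the number of bytes copied, or -1 if there is no such match.
 */
static int
match_group(const char *pattern, const char *string, size_t idx,
            char *out, size_t outsz)
{
    regex_t re;
    int len = -1;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* re_nsub + 1 elements are enough to hold every subexpression. */
    size_t nmatch = re.re_nsub + 1;
    regmatch_t *pmatch = malloc(nmatch * sizeof *pmatch);

    if (pmatch != NULL && idx < nmatch
        && regexec(&re, string, nmatch, pmatch, 0) == 0
        && pmatch[idx].rm_so != -1) {
        /* The match is the range [string + rm_so, string + rm_eo). */
        size_t n = (size_t)(pmatch[idx].rm_eo - pmatch[idx].rm_so);

        if (n > outsz - 1)
            n = outsz - 1;          /* truncate to fit the caller's buffer */
        memcpy(out, string + pmatch[idx].rm_so, n);
        out[n] = '\0';
        len = (int)n;
    }

    free(pmatch);
    regfree(&re);
    return len;
}
```

Calling match_group("([a-z]+)=([0-9]+)", "  key=42;", 2, buf, sizeof buf)
would copy the second group; a group that did not take part in the match
has rm_so == -1 and is reported as -1 here.
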
 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index b95b3c3b0..9f262f985 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -198,37 +198,34 @@ This flag is a BSD extension, not present in POSIX.
 .SS Byte offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first expression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ string " + " rm_so ", " string " + " rm_eo ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v2 8/9] regex.3: desoupify function descriptions
  2023-04-19 21:20     ` наб
                         ` (7 preceding siblings ...)
  2023-04-19 23:25       ` [PATCH v2 7/9] regex.3: destandardeseify Byte offsets наб
@ 2023-04-19 23:26       ` наб
  2023-04-20 11:15         ` [PATCH v3 " наб
  2023-04-19 23:26       ` [PATCH v2 9/9] regex.3: fix subsection headings наб
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:26 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 3798 bytes --]

Behold:
  regerror() is passed the error code, errcode, the pattern buffer,
  preg, a pointer to a character string buffer, errbuf, and the size
  of the string buffer, errbuf_size.

Absolute soup. This reads to me like an ill-conceived copy from a very
early standard version. It looks fine in source form but is horrific to
read as running text.

Instead, replace all of these with just the descriptions of what they do
with their arguments. What the arguments are is very clearly noted in
big bold in the prototypes.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
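For illustration only, a minimal sketch of the sizing behaviour the new
regerror() text describes: a first call with errbuf_size 0 returns the
required size, and a second call fills the buffer. The helper name
regerror_dup() is hypothetical and not part of the patch or of <regex.h>.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the message for
 * `errcode`, or NULL if allocation fails.  The caller frees the result.
 */
static char *
regerror_dup(int errcode, const regex_t *preg)
{
    /* With errbuf_size == 0, errbuf is ignored and only the size
       (including the terminating null byte) is returned. */
    size_t need = regerror(errcode, preg, NULL, 0);
    char *msg = malloc(need);

    if (msg != NULL)
        regerror(errcode, preg, msg, need);
    return msg;
}
```

With a buffer smaller than the returned size, the message is truncated to
fit but is still null-terminated, and the return value remains the size
that would have been needed.
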
 man3/regex.3 | 80 +++++++++++++++++++++-------------------------------
 1 file changed, 32 insertions(+), 48 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 9f262f985..7d08d4042 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -25,8 +25,8 @@ Standard C library
 .BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict ." nmatch ],
 .BI "            int " eflags );
 .PP
-.BI "size_t regerror(int " errcode ", const regex_t *restrict " preg ,
-.BI "            char " errbuf "[restrict ." errbuf_size "], \
+.BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
+.BI "                char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
 .PP
@@ -52,21 +52,13 @@ for subsequent
 .BR regexec ()
 searches.
 .PP
-.BR regcomp ()
-is supplied with
-.IR preg ,
-a pointer to a pattern buffer storage area;
-.IR regex ,
-a pointer to the null-terminated string and
-.IR cflags ,
-flags used to determine the type of compilation.
-.PP
-All regular expression searching must be done via a compiled pattern
-buffer, thus
-.BR regexec ()
-must always be supplied with the address of a
-.BR regcomp ()-initialized
-pattern buffer.
+The pattern buffer at
+.I *preg
+is initialized.
+.I regex
+is a null-terminated string.
+The locale must be the same when running
+.BR regexec ().
 .PP
 After
 .BR regcomp ()
@@ -142,12 +134,10 @@ contains
 .SS POSIX regex matching
 .BR regexec ()
 is used to match a null-terminated string
-against the precompiled pattern buffer,
-.IR preg .
-.I nmatch
-and
-.I pmatch
-are used to provide information regarding the location of any matches.
+against the precompiled pattern buffer in
+.IR *preg ,
+which must have been initialized with
+.BR regcomp ().
 .I eflags
 is the
 bitwise OR
@@ -242,34 +232,28 @@ and
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+If
+.I preg
+is a null pointer\(emthe latest error.
+.PP
+If
+.I errbuf_size
+is
+.BR 0 ,
+the size of the required buffer is returned.
+Otherwise, up to
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS POSIX pattern buffer freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+invalidates the pattern buffer at
+.IR *preg ,
+which must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v2 9/9] regex.3: fix subsection headings
  2023-04-19 21:20     ` наб
                         ` (8 preceding siblings ...)
  2023-04-19 23:26       ` [PATCH v2 8/9] regex.3: desoupify function descriptions наб
@ 2023-04-19 23:26       ` наб
  2023-04-20 11:17         ` [PATCH v3 " наб
  9 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-19 23:26 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1512 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 7d08d4042..58eb81c8b 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -45,7 +45,7 @@ size_t " errbuf_size );
 .BR typedef " /* ... */  " regoff_t;
 .EE
 .SH DESCRIPTION
-.SS POSIX regex compiling
+.SS Compilation
 .BR regcomp ()
 is used to compile a regular expression into a form that is suitable
 for subsequent
@@ -131,7 +131,7 @@ whether
 .I eflags
 contains
 .BR REG_NOTEOL .
-.SS POSIX regex matching
+.SS Matching
 .BR regexec ()
 is used to match a null-terminated string
 against the precompiled pattern buffer in
@@ -185,7 +185,7 @@ remain relative to
 (not
 .IR string " + " pmatch->rm_so ).
 This flag is a BSD extension, not present in POSIX.
-.SS Byte offsets
+.SS Match offsets
 Unless
 .B REG_NOSUB
 was passed to
@@ -224,7 +224,7 @@ capable of storing the largest value that can be stored in either an
 type or a
 .I ssize_t
 type.
-.SS POSIX error reporting
+.SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
 .BR regcomp ()
@@ -249,7 +249,7 @@ Otherwise, up to
 bytes are copied to
 .IR errbuf ;
 the error string is always null-terminated, and truncated to fit.
-.SS POSIX pattern buffer freeing
+.SS Freeing
 .BR regfree ()
 invalidates the pattern buffer at
 .IR *preg ,
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-19 23:23       ` [PATCH v2 2/9] regex.3: improve REG_STARTEND наб
@ 2023-04-20 10:00         ` G. Branden Robinson
  2023-04-20 11:13           ` наб
  2023-06-02  0:12         ` Alejandro Colomar
  1 sibling, 1 reply; 143+ messages in thread
From: G. Branden Robinson @ 2023-04-20 10:00 UTC (permalink / raw)
  To: наб; +Cc: Alejandro Colomar (man-pages), linux-man

[-- Attachment #1: Type: text/plain, Size: 299 bytes --]

Hi наб,

At 2023-04-20T01:23:14+0200, наб wrote:
> +> 0), they overwrite
> +.I pmatch
> +as usual, and the
> +.B Byte offsets
> +remain relative to
> +.IR string
> +(not
> +.IR string " + " pmatch->rm_so ).

I don't think "byte" needs to be capitalized here.
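
For readers following the thread, a minimal sketch of the behaviour the
quoted paragraph documents: with REG_STARTEND, pmatch[0] delimits the
window to search on input, and the offsets written back stay relative to
string. The helper name search_window() is hypothetical, and the sketch
assumes a libc that provides the REG_STARTEND extension (glibc and the
BSDs do).

```c
#include <regex.h>

/*
 * Hypothetical helper: search only string[from, to) and return the
 * start offset of the match relative to `string`, or -1 if none.
 */
static regoff_t
search_window(const regex_t *preg, const char *string,
              regoff_t from, regoff_t to)
{
    regmatch_t pmatch[1];

    /* On input, pmatch[0] describes the window to search. */
    pmatch[0].rm_so = from;
    pmatch[0].rm_eo = to;

    if (regexec(preg, string, 1, pmatch, REG_STARTEND) != 0)
        return -1;

    /* On output, the offset is relative to string, not string + from. */
    return pmatch[0].rm_so;
}
```

Searching "foo bar foo" for "foo" with the window starting at offset 4
would therefore report the second occurrence by its offset from the start
of the whole string.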

Regards,
Branden

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-20 10:00         ` G. Branden Robinson
@ 2023-04-20 11:13           ` наб
  2023-04-20 18:33             ` G. Branden Robinson
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 11:13 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: Alejandro Colomar (man-pages), linux-man

[-- Attachment #1: Type: text/plain, Size: 404 bytes --]

Hi!

On Thu, Apr 20, 2023 at 05:00:59AM -0500, G. Branden Robinson wrote:
> At 2023-04-20T01:23:14+0200, наб wrote:
> > +> 0), they overwrite
> > +.I pmatch
> > +as usual, and the
> > +.B Byte offsets
> > +remain relative to
> > +.IR string
> I don't think "byte" needs to be capitalized here.
I'm using it as a Sx and the section is capitalised,
so I think this should also be?

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v3 8/9] regex.3: desoupify function descriptions
  2023-04-19 23:26       ` [PATCH v2 8/9] regex.3: desoupify function descriptions наб
@ 2023-04-20 11:15         ` наб
  2023-04-20 11:43           ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 11:15 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 3827 bytes --]

Behold:
  regerror() is passed the error code, errcode, the pattern buffer,
  preg, a pointer to a character string buffer, errbuf, and the size
  of the string buffer, errbuf_size.

Absolute soup. This reads to me like an ill-conceived copy from a very
early standard version. It looks fine in source form but is horrific to
read as running text.

Instead, replace all of these with just the descriptions of what they do
with their arguments. What the arguments are is very clearly noted in
big bold in the prototypes.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Left one "pre"compiled buffer.

 man3/regex.3 | 80 +++++++++++++++++++++-------------------------------
 1 file changed, 32 insertions(+), 48 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 9f262f985..9bb4a73ff 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -25,8 +25,8 @@ Standard C library
 .BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict ." nmatch ],
 .BI "            int " eflags );
 .PP
-.BI "size_t regerror(int " errcode ", const regex_t *restrict " preg ,
-.BI "            char " errbuf "[restrict ." errbuf_size "], \
+.BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
+.BI "                char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
 .PP
@@ -52,21 +52,13 @@ for subsequent
 .BR regexec ()
 searches.
 .PP
-.BR regcomp ()
-is supplied with
-.IR preg ,
-a pointer to a pattern buffer storage area;
-.IR regex ,
-a pointer to the null-terminated string and
-.IR cflags ,
-flags used to determine the type of compilation.
-.PP
-All regular expression searching must be done via a compiled pattern
-buffer, thus
-.BR regexec ()
-must always be supplied with the address of a
-.BR regcomp ()-initialized
-pattern buffer.
+The pattern buffer at
+.I *preg
+is initialized.
+.I regex
+is a null-terminated string.
+The locale must be the same when running
+.BR regexec ().
 .PP
 After
 .BR regcomp ()
@@ -142,12 +134,10 @@ contains
 .SS POSIX regex matching
 .BR regexec ()
 is used to match a null-terminated string
-against the precompiled pattern buffer,
-.IR preg .
-.I nmatch
-and
-.I pmatch
-are used to provide information regarding the location of any matches.
+against the compiled pattern buffer in
+.IR *preg ,
+which must have been initialized with
+.BR regcomp ().
 .I eflags
 is the
 bitwise OR
@@ -242,34 +232,28 @@ and
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+If
+.I preg
+is a null pointer\(emthe latest error.
+.PP
+If
+.I errbuf_size
+is
+.BR 0 ,
+the size of the required buffer is returned.
+Otherwise, up to
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS POSIX pattern buffer freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+invalidates the pattern buffer at
+.IR *preg ,
+which must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v3 9/9] regex.3: fix subsection headings
  2023-04-19 23:26       ` [PATCH v2 9/9] regex.3: fix subsection headings наб
@ 2023-04-20 11:17         ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 11:17 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1677 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Missed the .Sx Byte offsets.

 man3/regex.3 | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 9bb4a73ff..552763940 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -45,7 +45,7 @@ size_t " errbuf_size );
 .BR typedef " /* ... */  " regoff_t;
 .EE
 .SH DESCRIPTION
-.SS POSIX regex compiling
+.SS Compilation
 .BR regcomp ()
 is used to compile a regular expression into a form that is suitable
 for subsequent
@@ -131,7 +131,7 @@ whether
 .I eflags
 contains
 .BR REG_NOTEOL .
-.SS POSIX regex matching
+.SS Matching
 .BR regexec ()
 is used to match a null-terminated string
 against the compiled pattern buffer in
@@ -179,13 +179,13 @@ the match succeeded, and
 > 0), they overwrite
 .I pmatch
 as usual, and the
-.B Byte offsets
+.B Match offsets
 remain relative to
 .IR string
 (not
 .IR string " + " pmatch->rm_so ).
 This flag is a BSD extension, not present in POSIX.
-.SS Byte offsets
+.SS Match offsets
 Unless
 .B REG_NOSUB
 was passed to
@@ -224,7 +224,7 @@ capable of storing the largest value that can be stored in either an
 type or a
 .I ssize_t
 type.
-.SS POSIX error reporting
+.SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
 .BR regcomp ()
@@ -249,7 +249,7 @@ Otherwise, up to
 bytes are copied to
 .IR errbuf ;
 the error string is always null-terminated, and truncated to fit.
-.SS POSIX pattern buffer freeing
+.SS Freeing
 .BR regfree ()
 invalidates the pattern buffer at
 .IR *preg ,
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 1/9] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND
  2023-04-19 23:23       ` [PATCH v2 1/9] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
@ 2023-04-20 11:21         ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 11:21 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1922 bytes --]

Hi!

On 4/20/23 01:23, наб wrote:
> In the regexec() signature
>   regmatch_t pmatch[restrict .nmatch],
> is a simplification. It's actually
>   regmatch_t pmatch[restrict
>     ((.preg->flags & REG_NOSUB) ? 0 : .nmatch) ?:
>      !!(.eflags & REG_STARTEND)],
> 
> But speccing that would be insane.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.  Thanks!  BTW, I capitalized the subject, following the house
practice of using proper English sentences for the subject (after the
page prefix), except for omitting the trailing period
(which I know Branden disapproves of :p).

Cheers,
Alex

> ---
> By the end, I think I get to a regex(3) that I don't dread opening
> (and that has all the info I'd want. who knew there was re_nsub?)!
> 
> The main issues here are (a) it's full of standardese, entire paragraphs
> lifted from POSIX, or very close to that, and the POSIX dialect is
> hostile to human life^W^Wbeing effectively used and (b) what reads like
> 30 years of people adding stuff without having read any other part of
> the document. Almost everything repeats at least once.
> 
> Funny moments outlined as they come in the messages.
> 
>  man3/regex.3 | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index e8fed5147..d77aac2e7 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -80,9 +80,11 @@ The
>  .I nmatch
>  and
>  .I pmatch
> -arguments to
>  .BR regexec ()
> -are ignored if the pattern buffer supplied was compiled with this flag set.
> +arguments will be ignored for this purpose (but
> +.I pmatch
> +may still be used for
> +.BR REG_STARTEND ).
>  .TP
>  .B REG_NEWLINE
>  Match-any-character operators don't match a newline.

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 3/9] regex.3: ffix
  2023-04-19 23:23       ` [PATCH v2 3/9] regex.3: ffix наб
@ 2023-04-20 11:23         ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 11:23 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 835 bytes --]

Hi!

On 4/20/23 01:23, наб wrote:
> We never bold POSIX, not even anywhere else on this page.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.  Thanks,

Alex

> ---
>  man3/regex.3 | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 74f19945d..5aaf42caa 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -61,11 +61,11 @@ of zero or more of the following:
>  .TP
>  .B REG_EXTENDED
>  Use
> -.B POSIX
> +POSIX
>  Extended Regular Expression syntax when interpreting
>  .IR regex .
>  If not set,
> -.B POSIX
> +POSIX
>  Basic Regular Expression syntax is used.
>  .TP
>  .B REG_ICASE

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 4/9] regex.3: wfix
  2023-04-19 23:23       ` [PATCH v2 4/9] regex.3: wfix наб
@ 2023-04-20 11:27         ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 11:27 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1232 bytes --]

Hi!

On 4/20/23 01:23, наб wrote:
> "Not in POSIX.2", so is it in POSIX.1-2008? POSIX.1-2001?
> (or any other combination of standards from this millennium
> not mentioned on this page?) It's not: just say POSIX.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied (with some added double-spaces to the log).  Thanks!

Cheers,

Alex

> ---
>  man3/regex.3 | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 5aaf42caa..b6e574b4d 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -289,7 +289,7 @@ Unknown character class name.
>  .TP
>  .B REG_EEND
>  Nonspecific error.
> -This is not defined by POSIX.2.
> +This is not defined by POSIX.
>  .TP
>  .B REG_EESCAPE
>  Trailing backslash.
> @@ -303,7 +303,7 @@ occurs prior to the starting point.
>  .TP
>  .B REG_ESIZE
>  Compiled regular expression requires a pattern buffer larger than 64\ kB.
> -This is not defined by POSIX.2.
> +This is not defined by POSIX.
>  .TP
>  .B REG_ESPACE
>  The regex routines ran out of memory.

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 5/9] regex.3: ffix
  2023-04-19 23:23       ` [PATCH v2 5/9] regex.3: ffix наб
@ 2023-04-20 11:28         ` Alejandro Colomar
  2023-04-20 12:12           ` [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 11:28 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1157 bytes --]

Hi!

On 4/20/23 01:23, наб wrote:
> Use "bitwise OR" instead of "bitwise-\fBor\fP". No other page spells it
> like this. The other weirdo contenders are
>   $ git grep bitwise | grep RI
>   man2/adjtimex.2:.RI bitwise- or
>   man2/open.2:.RI bitwise- or 'd

Please also check those, and maybe fix them in the same patch :)

Cheers,
Alex

> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index b6e574b4d..fa2669544 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -56,7 +56,7 @@ pattern buffer.
>  .PP
>  .I cflags
>  is the
> -.RB bitwise- or
> +bitwise OR
>  of zero or more of the following:
>  .TP
>  .B REG_EXTENDED
> @@ -121,7 +121,7 @@ and
>  are used to provide information regarding the location of any matches.
>  .I eflags
>  is the
> -.RB bitwise- or
> +bitwise OR
>  of zero or more of the following flags:
>  .TP
>  .B REG_NOTBOL

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 6/9] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: move in with regex.3
  2023-04-19 23:25       ` [PATCH v2 6/9] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: move in with regex.3 наб
@ 2023-04-20 11:31         ` Alejandro Colomar
  2023-04-20 13:02           ` [PATCH v4 1/6] regex.3: Fix subsection headings наб
                             ` (5 more replies)
  0 siblings, 6 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 11:31 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 6544 bytes --]

Hi!

On 4/20/23 01:25, наб wrote:
> They're inextricably linked, not cross-referenced at all,
> and not used anywhere else.
> 
> Now that they (realistically) exist to the reader, add a note

I prefer that the text movement be done in a separate commit that does
the minimum, so that git(1) can follow the changes more easily.

Also, this is a big change.  Could you please move it closer to the
end of the patch set?

Thanks,

Alex

> on how big nmatch can be; POSIX even says "The application developer
> should note that there is probably no reason for using a value of
> nmatch that is larger than preg->re_nsub+1.".
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3              | 66 ++++++++++++++++++++++++++++-----------
>  man3type/regex_t.3type    | 64 +------------------------------------
>  man3type/regmatch_t.3type |  2 +-
>  man3type/regoff_t.3type   |  2 +-
>  4 files changed, 51 insertions(+), 83 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index fa2669544..b95b3c3b0 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -15,7 +15,7 @@ regcomp, regexec, regerror, regfree \- POSIX regex functions
>  Standard C library
>  .RI ( libc ", " \-lc )
>  .SH SYNOPSIS
> -.nf
> +.EX
>  .B #include <regex.h>
>  .PP
>  .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
> @@ -29,7 +29,21 @@ Standard C library
>  .BI "            char " errbuf "[restrict ." errbuf_size "], \
>  size_t " errbuf_size );
>  .BI "void regfree(regex_t *" preg );
> -.fi
> +.PP
> +.B typedef struct {
> +.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
> +.B } regex_t;
> +.PP
> +.B typedef struct {
> +.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
> +                           to start of substring */
> +.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
> +                           the first character after the end of
> +                           substring */
> +.B } regmatch_t;
> +.PP
> +.BR typedef " /* ... */  " regoff_t;
> +.EE
>  .SH DESCRIPTION
>  .SS POSIX regex compiling
>  .BR regcomp ()
> @@ -54,6 +68,21 @@ must always be supplied with the address of a
>  .BR regcomp ()-initialized
>  pattern buffer.
>  .PP
> +After
> +.BR regcomp ()
> +succeeds,
> +.I preg->re_nsub
> +holds the number of subexpressions in
> +.IR regex .
> +Thus, a value of
> +.I preg->re_nsub
> ++ 1
> +passed as
> +.I nmatch
> +to
> +.BR regexec ()
> +is sufficient to capture all matches.
> +.PP
>  .I cflags
>  is the
>  bitwise OR
> @@ -192,22 +221,6 @@ must be at least
>  .IR N+1 .)
>  Any unused structure elements will contain the value \-1.
>  .PP
> -The
> -.I regmatch_t
> -structure which is the type of
> -.I pmatch
> -is defined in
> -.IR <regex.h> .
> -.PP
> -.in +4n
> -.EX
> -typedef struct {
> -    regoff_t rm_so;
> -    regoff_t rm_eo;
> -} regmatch_t;
> -.EE
> -.in
> -.PP
>  Each
>  .I rm_so
>  element that is not \-1 indicates the start offset of the next largest
> @@ -216,6 +229,14 @@ The relative
>  .I rm_eo
>  element indicates the end offset of the match,
>  which is the offset of the first character after the matching text.
> +.PP
> +.I regoff_t
> +is a signed integer type
> +capable of storing the largest value that can be stored in either an
> +.I ptrdiff_t
> +type or a
> +.I ssize_t
> +type.
>  .SS POSIX error reporting
>  .BR regerror ()
>  is used to turn the error codes that can be returned by both
> @@ -338,6 +359,15 @@ T}	Thread safety	MT-Safe
>  POSIX.1-2008.
>  .SH HISTORY
>  POSIX.1-2001.
> +.PP
> +Prior to POSIX.1-2008,
> +.I regoff_t
> +was required to be
> +capable of storing the largest value that can be stored in either an
> +.I off_t
> +type or a
> +.I ssize_t
> +type.
>  .SH EXAMPLES
>  .EX
>  #include <stdint.h>
> diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
> index 176d2c7a6..c0daaf0ff 100644
> --- a/man3type/regex_t.3type
> +++ b/man3type/regex_t.3type
> @@ -1,63 +1 @@
> -.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
> -.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
> -.\"
> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> -.\"
> -.\"
> -.TH regex_t 3type (date) "Linux man-pages (unreleased)"
> -.SH NAME
> -regex_t, regmatch_t, regoff_t
> -\- regular expression matching
> -.SH LIBRARY
> -Standard C library
> -.RI ( libc )
> -.SH SYNOPSIS
> -.EX
> -.B #include <regex.h>
> -.PP
> -.B typedef struct {
> -.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
> -.B } regex_t;
> -.PP
> -.B typedef struct {
> -.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
> -                           to start of substring */
> -.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
> -                           the first character after the end of
> -                           substring */
> -.B } regmatch_t;
> -.PP
> -.BR typedef " /* ... */  " regoff_t;
> -.EE
> -.SH DESCRIPTION
> -.TP
> -.I regex_t
> -This is a structure type used in regular expression matching.
> -It holds a compiled regular expression,
> -compiled with
> -.BR regcomp (3).
> -.TP
> -.I regmatch_t
> -This is a structure type used in regular expression matching.
> -.TP
> -.I regoff_t
> -It is a signed integer type
> -capable of storing the largest value that can be stored in either an
> -.I ptrdiff_t
> -type or a
> -.I ssize_t
> -type.
> -.SH STANDARDS
> -POSIX.1-2008.
> -.SH HISTORY
> -POSIX.1-2001.
> -.PP
> -Prior to POSIX.1-2008,
> -the type was
> -capable of storing the largest value that can be stored in either an
> -.I off_t
> -type or a
> -.I ssize_t
> -type.
> -.SH SEE ALSO
> -.BR regex (3)
> +.so man3/regex.3
> diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
> index dc78f2cf2..c0daaf0ff 100644
> --- a/man3type/regmatch_t.3type
> +++ b/man3type/regmatch_t.3type
> @@ -1 +1 @@
> -.so man3type/regex_t.3type
> +.so man3/regex.3
> diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
> index dc78f2cf2..c0daaf0ff 100644
> --- a/man3type/regoff_t.3type
> +++ b/man3type/regoff_t.3type
> @@ -1 +1 @@
> -.so man3type/regex_t.3type
> +.so man3/regex.3

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v3 8/9] regex.3: desoupify function descriptions
  2023-04-20 11:15         ` [PATCH v3 " наб
@ 2023-04-20 11:43           ` Alejandro Colomar
  2023-04-20 11:50             ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 11:43 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 4682 bytes --]

Hi наб!

On 4/20/23 13:15, наб wrote:
> Behold:
>   regerror() is passed the error code, errcode, the pattern buffer,
>   preg, a pointer to a character string buffer, errbuf, and the size
>   of the string buffer, errbuf_size.
> 
> Absolute soup. This reads to me like an ill-conceived copy from a very
> early standard version. It looks fine in source form but is horrific to
> read as running text.
> 
> Instead, replace all of these with just the descriptions of what they do
> with their arguments. What the arguments are is very clearly noted in
> big bold in the prototypes.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

It would be nice to see the --range-diff[1], to easily review changes to
patches.  I have a hard time running vdiff[2] on the raw patches.

[1]:  <https://git-scm.com/docs/git-format-patch#Documentation/git-format-patch.txt---range-diffltpreviousgt>
      See also: <https://git-scm.com/docs/git-range-diff>

[2]:  <http://catb.org/jargon/html/V/vdiff.html>, not
      <https://www.unix.com/man-page/linux/1/vdiff/>

Cheers,
Alex

> ---
> Left one "pre"compiled buffer.
> 
>  man3/regex.3 | 80 +++++++++++++++++++++-------------------------------
>  1 file changed, 32 insertions(+), 48 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 9f262f985..9bb4a73ff 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -25,8 +25,8 @@ Standard C library
>  .BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict ." nmatch ],
>  .BI "            int " eflags );
>  .PP
> -.BI "size_t regerror(int " errcode ", const regex_t *restrict " preg ,
> -.BI "            char " errbuf "[restrict ." errbuf_size "], \
> +.BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
> +.BI "                char " errbuf "[restrict ." errbuf_size "], \
>  size_t " errbuf_size );
>  .BI "void regfree(regex_t *" preg );
>  .PP
> @@ -52,21 +52,13 @@ for subsequent
>  .BR regexec ()
>  searches.
>  .PP
> -.BR regcomp ()
> -is supplied with
> -.IR preg ,
> -a pointer to a pattern buffer storage area;
> -.IR regex ,
> -a pointer to the null-terminated string and
> -.IR cflags ,
> -flags used to determine the type of compilation.
> -.PP
> -All regular expression searching must be done via a compiled pattern
> -buffer, thus
> -.BR regexec ()
> -must always be supplied with the address of a
> -.BR regcomp ()-initialized
> -pattern buffer.
> +The pattern buffer at
> +.I *preg
> +is initialized.
> +.I regex
> +is a null-terminated string.
> +The locale must be the same when running
> +.BR regexec ().
>  .PP
>  After
>  .BR regcomp ()
> @@ -142,12 +134,10 @@ contains
>  .SS POSIX regex matching
>  .BR regexec ()
>  is used to match a null-terminated string
> -against the precompiled pattern buffer,
> -.IR preg .
> -.I nmatch
> -and
> -.I pmatch
> -are used to provide information regarding the location of any matches.
> +against the compiled pattern buffer in
> +.IR *preg ,
> +which must have been initialized with
> +.BR regcomp ().
>  .I eflags
>  is the
>  bitwise OR
> @@ -242,34 +232,28 @@ and
>  .BR regexec ()
>  into error message strings.
>  .PP
> -.BR regerror ()
> -is passed the error code,
> -.IR errcode ,
> -the pattern buffer,
> -.IR preg ,
> -a pointer to a character string buffer,
> -.IR errbuf ,
> -and the size of the string buffer,
> -.IR errbuf_size .
> -It returns the size of the
> -.I errbuf
> -required to contain the null-terminated error message string.
> -If both
> -.I errbuf
> -and
> +.I errcode
> +must be the latest error returned from an operation on
> +.IR preg .
> +If
> +.I preg
> +is a null pointer\(emthe latest error.
> +.PP
> +If
> +.I errbuf_size
> +is
> +.BR 0 ,
> +the size of the required buffer is returned.
> +Otherwise, up to
>  .I errbuf_size
> -are nonzero,
> -.I errbuf
> -is filled in with the first
> -.I "errbuf_size \- 1"
> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
> +bytes are copied to
> +.IR errbuf ;
> +the error string is always null-terminated, and truncated to fit.
>  .SS POSIX pattern buffer freeing
> -Supplying
>  .BR regfree ()
> -with a precompiled pattern buffer,
> -.IR preg ,
> -will free the memory allocated to the pattern buffer by the compiling
> -process,
> +invalidates the pattern buffer at
> +.IR *preg ,
> +which must have been initialized via
>  .BR regcomp ().
>  .SH RETURN VALUE
>  .BR regcomp ()

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v3 8/9] regex.3: desoupify function descriptions
  2023-04-20 11:43           ` Alejandro Colomar
@ 2023-04-20 11:50             ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 11:50 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1253 bytes --]

Hi!

On Thu, Apr 20, 2023 at 01:43:56PM +0200, Alejandro Colomar wrote:
> On 4/20/23 13:15, наб wrote:
> > Behold:
> >   regerror() is passed the error code, errcode, the pattern buffer,
> >   preg, a pointer to a character string buffer, errbuf, and the size
> >   of the string buffer, errbuf_size.
> > 
> > Absolute soup. This reads to me like an ill-conceived copy from a very
> > early standard version. It looks fine in source form but is horrific to
> > read as running text.
> > 
> > Instead, replace all of these with just the descriptions of what they do
> > with their arguments. What the arguments are is very clearly noted in
> > big bold in the prototypes.
> It would be nice to see the --range-diff[1], to easily review changes to
> patches.  I have a hard time running vdiff[2] on the raw patches.
v2:
> > -against the precompiled pattern buffer,
> > +against the precompiled pattern buffer in
v3:
> > -against the precompiled pattern buffer,
> > +against the compiled pattern buffer in

And 9/9 grew this hunk in v3:
@@ -179,13 +179,13 @@ the match succeeded, and
 > 0), they overwrite
 .I pmatch
 as usual, and the
-.B Byte offsets
+.B Match offsets
 remain relative to
 .IR string
 (not

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 11:28         ` Alejandro Colomar
@ 2023-04-20 12:12           ` наб
  2023-04-20 12:52             ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 12:12 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 3090 bytes --]

Use "bitwise OR" instead of "bitwise-or" (with fonts).
No other pages spell it like this.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Range-diff against v2:
1:  1ccffe37b < -:  --------- regex.3: ffix
-:  --------- > 1:  830173bb5 adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix

idk if this did anything

 man2/adjtimex.2 | 2 +-
 man2/clone.2    | 2 +-
 man2/mprotect.2 | 2 +-
 man2/open.2     | 2 +-
 man2/syscall.2  | 2 +-
 man3/regex.3    | 4 ++--
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/man2/adjtimex.2 b/man2/adjtimex.2
index 523347de2..40b05cb0e 100644
--- a/man2/adjtimex.2
+++ b/man2/adjtimex.2
@@ -90,7 +90,7 @@ the constants used for
 .BR ntp_adjtime ()
 are equivalent but differently named.)
 It is a bit mask containing a
-.RI bitwise- or
+bitwise OR
 combination of zero or more of the following bits:
 .TP
 .B ADJ_OFFSET
diff --git a/man2/clone.2 b/man2/clone.2
index 42ee3fee8..ec43841eb 100644
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -413,7 +413,7 @@ mask in the remainder of this page.
 .PP
 The
 .I flags
-mask is specified as a bitwise-OR of zero or more of
+mask is specified as a bitwise OR of zero or more of
 the constants listed below.
 Except as noted below, these flags are available
 (and have the same effect) in both
diff --git a/man2/mprotect.2 b/man2/mprotect.2
index 52c14da05..5a829dafe 100644
--- a/man2/mprotect.2
+++ b/man2/mprotect.2
@@ -43,7 +43,7 @@ signal for the process.
 .I prot
 is a combination of the following access flags:
 .B PROT_NONE
-or a bitwise-or of the other values in the following list:
+or a bitwise OR of the other values in the following list:
 .TP
 .B PROT_NONE
 The memory cannot be accessed at all.
diff --git a/man2/open.2 b/man2/open.2
index 77c06b55d..b5aff887c 100644
--- a/man2/open.2
+++ b/man2/open.2
@@ -123,7 +123,7 @@ respectively.
 .PP
 In addition, zero or more file creation flags and file status flags
 can be
-.RI bitwise- or 'd
+bitwise ORed
 in
 .IR flags .
 The
diff --git a/man2/syscall.2 b/man2/syscall.2
index 3eba62182..55233ac51 100644
--- a/man2/syscall.2
+++ b/man2/syscall.2
@@ -235,7 +235,7 @@ nuances:
 In order to indicate that a system call is called under the x32 ABI,
 an additional bit,
 .BR __X32_SYSCALL_BIT ,
-is bitwise-ORed with the system call number.
+is bitwise ORed with the system call number.
 The ABI used by a process affects some process behaviors,
 including signal handling or system call restarting.
 .IP \[bu]
diff --git a/man3/regex.3 b/man3/regex.3
index 3b504a4d5..3ee58f61d 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -56,7 +56,7 @@ pattern buffer.
 .PP
 .I cflags
 is the
-.RB bitwise- or
+bitwise OR
 of zero or more of the following:
 .TP
 .B REG_EXTENDED
@@ -121,7 +121,7 @@ and
 are used to provide information regarding the location of any matches.
 .I eflags
 is the
-.RB bitwise- or
+bitwise OR
 of zero or more of the following flags:
 .TP
 .B REG_NOTBOL
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 12:12           ` [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix наб
@ 2023-04-20 12:52             ` Alejandro Colomar
  2023-04-20 13:03               ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 12:52 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3739 bytes --]

On 4/20/23 14:12, наб wrote:
> Use "bitwise OR" instead of "bitwise-or" (with fonts).
> No other pages spell it like this.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.  Thanks.

> ---
> Range-diff against v2:
> 1:  1ccffe37b < -:  --------- regex.3: ffix
> -:  --------- > 1:  830173bb5 adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix

I rewrote the subject to:

man*/: ffix, wfix

> 
> idk if this did anything

Heh, it didn't do much.  The patches are so different that git
thinks you just removed one patch and wrote a different one from
scratch.  Anyway, I find it useful most of the time.

Cheers,
Alex

> 
>  man2/adjtimex.2 | 2 +-
>  man2/clone.2    | 2 +-
>  man2/mprotect.2 | 2 +-
>  man2/open.2     | 2 +-
>  man2/syscall.2  | 2 +-
>  man3/regex.3    | 4 ++--
>  6 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/man2/adjtimex.2 b/man2/adjtimex.2
> index 523347de2..40b05cb0e 100644
> --- a/man2/adjtimex.2
> +++ b/man2/adjtimex.2
> @@ -90,7 +90,7 @@ the constants used for
>  .BR ntp_adjtime ()
>  are equivalent but differently named.)
>  It is a bit mask containing a
> -.RI bitwise- or
> +bitwise OR
>  combination of zero or more of the following bits:
>  .TP
>  .B ADJ_OFFSET
> diff --git a/man2/clone.2 b/man2/clone.2
> index 42ee3fee8..ec43841eb 100644
> --- a/man2/clone.2
> +++ b/man2/clone.2
> @@ -413,7 +413,7 @@ mask in the remainder of this page.
>  .PP
>  The
>  .I flags
> -mask is specified as a bitwise-OR of zero or more of
> +mask is specified as a bitwise OR of zero or more of
>  the constants listed below.
>  Except as noted below, these flags are available
>  (and have the same effect) in both
> diff --git a/man2/mprotect.2 b/man2/mprotect.2
> index 52c14da05..5a829dafe 100644
> --- a/man2/mprotect.2
> +++ b/man2/mprotect.2
> @@ -43,7 +43,7 @@ signal for the process.
>  .I prot
>  is a combination of the following access flags:
>  .B PROT_NONE
> -or a bitwise-or of the other values in the following list:
> +or a bitwise OR of the other values in the following list:
>  .TP
>  .B PROT_NONE
>  The memory cannot be accessed at all.
> diff --git a/man2/open.2 b/man2/open.2
> index 77c06b55d..b5aff887c 100644
> --- a/man2/open.2
> +++ b/man2/open.2
> @@ -123,7 +123,7 @@ respectively.
>  .PP
>  In addition, zero or more file creation flags and file status flags
>  can be
> -.RI bitwise- or 'd
> +bitwise ORed
>  in
>  .IR flags .
>  The
> diff --git a/man2/syscall.2 b/man2/syscall.2
> index 3eba62182..55233ac51 100644
> --- a/man2/syscall.2
> +++ b/man2/syscall.2
> @@ -235,7 +235,7 @@ nuances:
>  In order to indicate that a system call is called under the x32 ABI,
>  an additional bit,
>  .BR __X32_SYSCALL_BIT ,
> -is bitwise-ORed with the system call number.
> +is bitwise ORed with the system call number.
>  The ABI used by a process affects some process behaviors,
>  including signal handling or system call restarting.
>  .IP \[bu]
> diff --git a/man3/regex.3 b/man3/regex.3
> index 3b504a4d5..3ee58f61d 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -56,7 +56,7 @@ pattern buffer.
>  .PP
>  .I cflags
>  is the
> -.RB bitwise- or
> +bitwise OR
>  of zero or more of the following:
>  .TP
>  .B REG_EXTENDED
> @@ -121,7 +121,7 @@ and
>  are used to provide information regarding the location of any matches.
>  .I eflags
>  is the
> -.RB bitwise- or
> +bitwise OR
>  of zero or more of the following flags:
>  .TP
>  .B REG_NOTBOL

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v4 1/6] regex.3: Fix subsection headings
  2023-04-20 11:31         ` Alejandro Colomar
@ 2023-04-20 13:02           ` наб
  2023-04-20 13:13             ` Alejandro Colomar
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
  2023-04-20 13:02           ` [PATCH v4 2/6] regex.3: Desoupify function descriptions наб
                             ` (4 subsequent siblings)
  5 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-20 13:02 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1927 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
$ git diff v3

But the patches are re-ordered (and a new move-only one added);
--range-diff, humorously, /only/ picks up that one, and doesn't
understand the rest, which is worse than if it failed entirely.

The 3type move is as far back as I could make it I think,
6/6 wants to come after regoff_t deduplication.

 man3/regex.3 | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 3ee58f61d..637cb2231 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -31,7 +31,7 @@ size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
 .fi
 .SH DESCRIPTION
-.SS POSIX regex compiling
+.SS Compilation
 .BR regcomp ()
 is used to compile a regular expression into a form that is suitable
 for subsequent
@@ -110,7 +110,7 @@ whether
 .I eflags
 contains
 .BR REG_NOTEOL .
-.SS POSIX regex matching
+.SS Matching
 .BR regexec ()
 is used to match a null-terminated string
 against the precompiled pattern buffer,
@@ -159,7 +159,7 @@ or
 .B REG_NEWLINE
 processing.
 This flag is a BSD extension, not present in POSIX.
-.SS Byte offsets
+.SS Match offsets
 Unless
 .B REG_NOSUB
 was set for the compilation of the pattern buffer, it is possible to
@@ -209,7 +209,7 @@ The relative
 .I rm_eo
 element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
-.SS POSIX error reporting
+.SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
 .BR regcomp ()
@@ -238,7 +238,7 @@ are nonzero,
 is filled in with the first
 .I "errbuf_size \- 1"
 characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
-.SS POSIX pattern buffer freeing
+.SS Freeing
 Supplying
 .BR regfree ()
 with a precompiled pattern buffer,
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v4 2/6] regex.3: Desoupify function descriptions
  2023-04-20 11:31         ` Alejandro Colomar
  2023-04-20 13:02           ` [PATCH v4 1/6] regex.3: Fix subsection headings наб
@ 2023-04-20 13:02           ` наб
  2023-04-20 14:00             ` Alejandro Colomar
  2023-04-20 13:02           ` [PATCH v4 3/6] regex.3: Improve REG_STARTEND наб
                             ` (3 subsequent siblings)
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 13:02 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 3758 bytes --]

Behold:
  regerror() is passed the error code, errcode, the pattern buffer,
  preg, a pointer to a character string buffer, errbuf, and the size
  of the string buffer, errbuf_size.

Absolute soup. This reads to me like an ill-conceived copy from a very
early standard version. It looks fine in source form but is horrific to
read as running text.

Instead, replace all of these with just the descriptions of what they do
with their arguments. What the arguments are is very clearly noted in
big bold in the prototypes.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 80 +++++++++++++++++++++-------------------------------
 1 file changed, 32 insertions(+), 48 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 637cb2231..b4feaba19 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -25,8 +25,8 @@ Standard C library
 .BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict ." nmatch ],
 .BI "            int " eflags );
 .PP
-.BI "size_t regerror(int " errcode ", const regex_t *restrict " preg ,
-.BI "            char " errbuf "[restrict ." errbuf_size "], \
+.BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
+.BI "                char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
 .fi
@@ -38,21 +38,13 @@ for subsequent
 .BR regexec ()
 searches.
 .PP
-.BR regcomp ()
-is supplied with
-.IR preg ,
-a pointer to a pattern buffer storage area;
-.IR regex ,
-a pointer to the null-terminated string and
-.IR cflags ,
-flags used to determine the type of compilation.
-.PP
-All regular expression searching must be done via a compiled pattern
-buffer, thus
-.BR regexec ()
-must always be supplied with the address of a
-.BR regcomp ()-initialized
-pattern buffer.
+The pattern buffer at
+.I *preg
+is initialized.
+.I regex
+is a null-terminated string.
+The locale must be the same when running
+.BR regexec ().
 .PP
 .I cflags
 is the
@@ -113,12 +105,10 @@ contains
 .SS Matching
 .BR regexec ()
 is used to match a null-terminated string
-against the precompiled pattern buffer,
-.IR preg .
-.I nmatch
-and
-.I pmatch
-are used to provide information regarding the location of any matches.
+against the compiled pattern buffer in
+.IR *preg ,
+which must have been initialized with
+.BR regcomp ().
 .I eflags
 is the
 bitwise OR
@@ -217,34 +207,28 @@ and
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+If
+.I preg
+is a null pointer\(emthe latest error.
+.PP
+If
+.I errbuf_size
+is
+.BR 0 ,
+the size of the required buffer is returned.
+Otherwise, up to
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS Freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+invalidates the pattern buffer at
+.IR *preg ,
+which must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread
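
The regerror() wording in the patch above (a zero errbuf_size returns the
required size, otherwise the message is copied and truncated to fit) can be
sketched as a two-call pattern. This is only an illustration: regerror_dup()
is a hypothetical helper name, not something from <regex.h> or from the patch.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper (not part of <regex.h>): return a malloc()ed copy
   of the message for errcode, or NULL if allocation fails.
   The caller frees the result. */
char *
regerror_dup(int errcode, const regex_t *preg)
{
    /* With errbuf_size == 0, regerror() ignores errbuf and only reports
       the size needed, including the terminating null byte. */
    size_t need = regerror(errcode, preg, NULL, 0);
    char *msg;

    assert(need > 0);   /* room for at least the null byte */

    msg = malloc(need);
    if (msg != NULL)
        regerror(errcode, preg, msg, need);   /* fits, so no truncation */
    return msg;
}
```

A caller would typically pass the nonzero value returned by regcomp() or
regexec() together with the same preg, print the string, and free() it.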

* [PATCH v4 3/6] regex.3: Improve REG_STARTEND
  2023-04-20 11:31         ` Alejandro Colomar
  2023-04-20 13:02           ` [PATCH v4 1/6] regex.3: Fix subsection headings наб
  2023-04-20 13:02           ` [PATCH v4 2/6] regex.3: Desoupify function descriptions наб
@ 2023-04-20 13:02           ` наб
  2023-04-20 14:04             ` Alejandro Colomar
  2023-04-20 13:02           ` [PATCH v4 4/6] regex.3, regex_t.3type: Move regex_t.3type into regex.3 наб
                             ` (2 subsequent siblings)
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 13:02 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1837 bytes --]

Explicitly spell out the ranges involved. The original wording always
confused me, but it's actually very sane.

Also change the [0]. to -> here to make it more obvious that pmatch
is used as a pointer to an object, not as an array, in this scenario.

Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
R_NOTEOL? No. That's weird and confusing.

String largeness doesn't matter, known-lengthness does.

Explicitly spell out the influence on returned matches
(relative to string, not start of range).

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index b4feaba19..00e7e2c6b 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -131,23 +131,30 @@ compilation flag
 above).
 .TP
 .B REG_STARTEND
-Use
-.I pmatch[0]
-on the input string, starting at byte
-.I pmatch[0].rm_so
-and ending before byte
-.IR pmatch[0].rm_eo .
+Match
+.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
+instead of
+.RI [ string ", " string " + \fBstrlen\fP(" string )).
 This allows matching embedded NUL bytes
 and avoids a
 .BR strlen (3)
-on large strings.
-It does not use
+on known-length strings.
+.I pmatch
+must point to a valid readable object.
+If any matches are returned
+.RB ( REG_NOSUB
+wasn't passed to
+.BR regcomp (),
+the match succeeded, and
 .I nmatch
-on input, and does not change
-.B REG_NOTBOL
-or
-.B REG_NEWLINE
-processing.
+> 0), they overwrite
+.I pmatch
+as usual, and the
+.B Match offsets
+remain relative to
+.IR string
+(not
+.IR string " + " pmatch->rm_so ).
 This flag is a BSD extension, not present in POSIX.
 .SS Match offsets
 Unless
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v4 4/6] regex.3, regex_t.3type: Move regex_t.3type into regex.3
  2023-04-20 11:31         ` Alejandro Colomar
                             ` (2 preceding siblings ...)
  2023-04-20 13:02           ` [PATCH v4 3/6] regex.3: Improve REG_STARTEND наб
@ 2023-04-20 13:02           ` наб
  2023-04-20 13:02           ` [PATCH v4 5/6] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move in with regex.3 наб
  2023-04-20 13:02           ` [PATCH v4 6/6] regex.3: Destandardeseify Match offsets наб
  5 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 13:02 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 3765 bytes --]

Move-only commit.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3           | 30 ++++++++++++++++++++
 man3type/regex_t.3type | 63 ------------------------------------------
 2 files changed, 30 insertions(+), 63 deletions(-)
 delete mode 100644 man3type/regex_t.3type

diff --git a/man3/regex.3 b/man3/regex.3
index 00e7e2c6b..615e065de 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -29,6 +29,20 @@ Standard C library
 .BI "                char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
+.PP
+.B typedef struct {
+.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
+.B } regex_t;
+.PP
+.B typedef struct {
+.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
+                           to start of substring */
+.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
+                           the first character after the end of
+                           substring */
+.B } regmatch_t;
+.PP
+.BR typedef " /* ... */  " regoff_t;
 .fi
 .SH DESCRIPTION
 .SS Compilation
@@ -206,6 +220,14 @@ The relative
 .I rm_eo
 element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
+.PP
+.I regoff_t
+It is a signed integer type
+capable of storing the largest value that can be stored in either an
+.I ptrdiff_t
+type or a
+.I ssize_t
+type.
 .SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
@@ -322,6 +344,14 @@ T}	Thread safety	MT-Safe
 POSIX.1-2008.
 .SH HISTORY
 POSIX.1-2001.
+.PP
+Prior to POSIX.1-2008,
+the type was
+capable of storing the largest value that can be stored in either an
+.I off_t
+type or a
+.I ssize_t
+type.
 .SH EXAMPLES
 .EX
 #include <stdint.h>
diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
deleted file mode 100644
index 176d2c7a6..000000000
--- a/man3type/regex_t.3type
+++ /dev/null
@@ -1,63 +0,0 @@
-.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
-.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\"
-.TH regex_t 3type (date) "Linux man-pages (unreleased)"
-.SH NAME
-regex_t, regmatch_t, regoff_t
-\- regular expression matching
-.SH LIBRARY
-Standard C library
-.RI ( libc )
-.SH SYNOPSIS
-.EX
-.B #include <regex.h>
-.PP
-.B typedef struct {
-.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
-.B } regex_t;
-.PP
-.B typedef struct {
-.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
-                           to start of substring */
-.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
-                           the first character after the end of
-                           substring */
-.B } regmatch_t;
-.PP
-.BR typedef " /* ... */  " regoff_t;
-.EE
-.SH DESCRIPTION
-.TP
-.I regex_t
-This is a structure type used in regular expression matching.
-It holds a compiled regular expression,
-compiled with
-.BR regcomp (3).
-.TP
-.I regmatch_t
-This is a structure type used in regular expression matching.
-.TP
-.I regoff_t
-It is a signed integer type
-capable of storing the largest value that can be stored in either an
-.I ptrdiff_t
-type or a
-.I ssize_t
-type.
-.SH STANDARDS
-POSIX.1-2008.
-.SH HISTORY
-POSIX.1-2001.
-.PP
-Prior to POSIX.1-2008,
-the type was
-capable of storing the largest value that can be stored in either an
-.I off_t
-type or a
-.I ssize_t
-type.
-.SH SEE ALSO
-.BR regex (3)
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v4 5/6] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move in with regex.3
  2023-04-20 11:31         ` Alejandro Colomar
                             ` (3 preceding siblings ...)
  2023-04-20 13:02           ` [PATCH v4 4/6] regex.3, regex_t.3type: Move regex_t.3type into regex.3 наб
@ 2023-04-20 13:02           ` наб
  2023-04-20 14:07             ` Alejandro Colomar
  2023-04-20 13:02           ` [PATCH v4 6/6] regex.3: Destandardeseify Match offsets наб
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 13:02 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 3327 bytes --]

They're inextricably linked, not cross-referenced at all,
and not used anywhere else.

Now that they (realistically) exist to the reader, add a note
on how big nmatch can be; POSIX even says "The application developer
should note that there is probably no reason for using a value of
nmatch that is larger than preg−>re_nsub+1.".

Also remove the now-duplicate regmatch_t declaration.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3              | 40 +++++++++++++++++++--------------------
 man3type/regex_t.3type    |  1 +
 man3type/regmatch_t.3type |  2 +-
 man3type/regoff_t.3type   |  2 +-
 4 files changed, 23 insertions(+), 22 deletions(-)
 create mode 100644 man3type/regex_t.3type

diff --git a/man3/regex.3 b/man3/regex.3
index 615e065de..6d203fa22 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -15,7 +15,7 @@ regcomp, regexec, regerror, regfree \- POSIX regex functions
 Standard C library
 .RI ( libc ", " \-lc )
 .SH SYNOPSIS
-.nf
+.EX
 .B #include <regex.h>
 .PP
 .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
@@ -43,7 +43,7 @@ size_t " errbuf_size );
 .B } regmatch_t;
 .PP
 .BR typedef " /* ... */  " regoff_t;
-.fi
+.EE
 .SH DESCRIPTION
 .SS Compilation
 .BR regcomp ()
@@ -60,6 +60,21 @@ is a null-terminated string.
 The locale must be the same when running
 .BR regexec ().
 .PP
+After
+.BR regcomp ()
+succeeds,
+.I preg->re_nsub
+holds the number of subexpressions in
+.IR regex .
+Thus, a value of
+.I preg->re_nsub
++ 1
+passed as
+.I nmatch
+to
+.BR regexec ()
+is sufficient to capture all matches.
+.PP
 .I cflags
 is the
 bitwise OR
@@ -196,22 +211,6 @@ must be at least
 .IR N+1 .)
 Any unused structure elements will contain the value \-1.
 .PP
-The
-.I regmatch_t
-structure which is the type of
-.I pmatch
-is defined in
-.IR <regex.h> .
-.PP
-.in +4n
-.EX
-typedef struct {
-    regoff_t rm_so;
-    regoff_t rm_eo;
-} regmatch_t;
-.EE
-.in
-.PP
 Each
 .I rm_so
 element that is not \-1 indicates the start offset of the next largest
@@ -222,7 +221,7 @@ element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
 .PP
 .I regoff_t
-It is a signed integer type
+is a signed integer type
 capable of storing the largest value that can be stored in either an
 .I ptrdiff_t
 type or a
@@ -346,7 +345,8 @@ POSIX.1-2008.
 POSIX.1-2001.
 .PP
 Prior to POSIX.1-2008,
-the type was
+.I regoff_t
+was required to be
 capable of storing the largest value that can be stored in either an
 .I off_t
 type or a
diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
new file mode 100644
index 000000000..c0daaf0ff
--- /dev/null
+++ b/man3type/regex_t.3type
@@ -0,0 +1 @@
+.so man3/regex.3
diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regmatch_t.3type
+++ b/man3type/regmatch_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regoff_t.3type
+++ b/man3type/regoff_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
-- 
2.30.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread
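
The note on nmatch in the patch above (preg->re_nsub + 1 is enough to
capture every match) can be illustrated with a short sketch. It assumes the
pattern was compiled without REG_NOSUB; match_all_groups() is a hypothetical
helper name, not an existing API.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: run regexec() with room for the whole match and
   every parenthesized subexpression.  On success, returns a malloc()ed
   array of re->re_nsub + 1 entries (the caller frees it); returns NULL
   if there is no match or allocation fails. */
regmatch_t *
match_all_groups(const regex_t *re, const char *string)
{
    size_t nmatch;
    regmatch_t *pmatch;

    assert(re != NULL && string != NULL);

    nmatch = re->re_nsub + 1;   /* pmatch[0] plus one per subexpression */
    pmatch = calloc(nmatch, sizeof(*pmatch));
    if (pmatch == NULL)
        return NULL;

    if (regexec(re, string, nmatch, pmatch, 0) != 0) {
        free(pmatch);
        return NULL;
    }
    return pmatch;
}
```

Entry 0 of the result describes the whole match and entry i describes the
i-th subexpression; entries for subexpressions that did not participate in
the match have rm_so and rm_eo set to -1.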

* [PATCH v4 6/6] regex.3: Destandardeseify Match offsets
  2023-04-20 11:31         ` Alejandro Colomar
                             ` (4 preceding siblings ...)
  2023-04-20 13:02           ` [PATCH v4 5/6] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move in with regex.3 наб
@ 2023-04-20 13:02           ` наб
  2023-04-20 14:10             ` Alejandro Colomar
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 13:02 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2231 bytes --]

This section reads like it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 6d203fa22..552763940 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -188,37 +188,34 @@ This flag is a BSD extension, not present in POSIX.
 .SS Match offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first expression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ string " + " rm_so ", " string " + " rm_eo ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2
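
For illustration only, a sketch of how the rewritten "Match offsets" text
reads in practice: pmatch[0] is the whole match, pmatch[1] onwards are the
subexpressions, unused elements hold -1, and each valid element denotes
[string + rm_so, string + rm_eo).  The helper names fill_matches() and
print_matches() are assumptions made for this example, not part of the
patch.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical helper: compile `pattern` as an ERE and fill `nmatch`
 * elements of `pmatch` for `string`.  Returns regexec()'s result,
 * or -1 if the pattern does not compile.
 */
int fill_matches(const char *pattern, const char *string,
                 size_t nmatch, regmatch_t pmatch[])
{
    regex_t re;
    int rc;

    assert(pattern != NULL && string != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, string, nmatch, pmatch, 0);
    regfree(&re);
    return rc;
}

/* Print every element; valid ones as the text in [rm_so, rm_eo). */
void print_matches(const char *string, size_t nmatch,
                   const regmatch_t pmatch[])
{
    for (size_t i = 0; i < nmatch; i++) {
        if (pmatch[i].rm_so == -1) {
            printf("pmatch[%zu]: unused\n", i);
            continue;
        }
        printf("pmatch[%zu]: \"%.*s\"\n", i,
               (int) (pmatch[i].rm_eo - pmatch[i].rm_so),
               string + pmatch[i].rm_so);
    }
}
```

With the pattern "(foo)(bar)?" and the string "xfoo", the optional second
group does not take part in the match, so its element is expected to be
filled with -1, as is any element beyond re_nsub.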


* Re: [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 12:52             ` Alejandro Colomar
@ 2023-04-20 13:03               ` Alejandro Colomar
  2023-04-20 14:13                 ` наб
  2023-04-20 18:42                 ` G. Branden Robinson
  0 siblings, 2 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 13:03 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 4553 bytes --]

On 4/20/23 14:52, Alejandro Colomar wrote:
> On 4/20/23 14:12, наб wrote:
>> Use "bitwise OR" instead of "bitwise-or" (with fonts).
>> No other pages spell it like this.
>>
>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> 
> Patch applied.  Thanks.
> 
>> ---
>> Range-diff against v2:
>> 1:  1ccffe37b < -:  --------- regex.3: ffix
>> -:  --------- > 1:  830173bb5 adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
> 
> I rewrote the subject to:
> 
> man*/: ffix, wfix
> 
>>
>> idk if this did anything
> 
> Heh, it didn't do much.  What happened is that the patches are so
> different, that git thinks you just removed one patch, and wrote
> a different one from scratch.  Anyway, I find it useful most of
> the time.
> 
> Cheers,
> Alex
> 
>>
>>  man2/adjtimex.2 | 2 +-
>>  man2/clone.2    | 2 +-
>>  man2/mprotect.2 | 2 +-
>>  man2/open.2     | 2 +-
>>  man2/syscall.2  | 2 +-
>>  man3/regex.3    | 4 ++--
>>  6 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/man2/adjtimex.2 b/man2/adjtimex.2
>> index 523347de2..40b05cb0e 100644
>> --- a/man2/adjtimex.2
>> +++ b/man2/adjtimex.2
>> @@ -90,7 +90,7 @@ the constants used for

BTW, another thing you might find useful is this:

$ cat ~/.config/git/attributes 
*.[1-8]* diff=man


And then in your .gitconfig:

[diff "man"]
	xfuncname = "^\\.S[SH] .*$"


You may want to use a regex that also works for mdoc(7).

This produces the following hunks:

@@ -90,7 +90,7 @@ .SH DESCRIPTION

>>  .BR ntp_adjtime ()
>>  are equivalent but differently named.)
>>  It is a bit mask containing a
>> -.RI bitwise- or
>> +bitwise OR
>>  combination of zero or more of the following bits:
>>  .TP
>>  .B ADJ_OFFSET
>> diff --git a/man2/clone.2 b/man2/clone.2
>> index 42ee3fee8..ec43841eb 100644
>> --- a/man2/clone.2
>> +++ b/man2/clone.2
>> @@ -413,7 +413,7 @@ mask in the remainder of this page.

@@ -413,7 +413,7 @@ .SS The flags mask

>>  .PP
>>  The
>>  .I flags
>> -mask is specified as a bitwise-OR of zero or more of
>> +mask is specified as a bitwise OR of zero or more of
>>  the constants listed below.
>>  Except as noted below, these flags are available
>>  (and have the same effect) in both
>> diff --git a/man2/mprotect.2 b/man2/mprotect.2
>> index 52c14da05..5a829dafe 100644
>> --- a/man2/mprotect.2
>> +++ b/man2/mprotect.2
>> @@ -43,7 +43,7 @@ signal for the process.

@@ -43,7 +43,7 @@ .SH DESCRIPTION

>>  .I prot
>>  is a combination of the following access flags:
>>  .B PROT_NONE
>> -or a bitwise-or of the other values in the following list:
>> +or a bitwise OR of the other values in the following list:
>>  .TP
>>  .B PROT_NONE
>>  The memory cannot be accessed at all.
>> diff --git a/man2/open.2 b/man2/open.2
>> index 77c06b55d..b5aff887c 100644
>> --- a/man2/open.2
>> +++ b/man2/open.2
>> @@ -123,7 +123,7 @@ respectively.

@@ -123,7 +123,7 @@ .SH DESCRIPTION

>>  .PP
>>  In addition, zero or more file creation flags and file status flags
>>  can be
>> -.RI bitwise- or 'd
>> +bitwise ORed
>>  in
>>  .IR flags .
>>  The
>> diff --git a/man2/syscall.2 b/man2/syscall.2
>> index 3eba62182..55233ac51 100644
>> --- a/man2/syscall.2
>> +++ b/man2/syscall.2
>> @@ -235,7 +235,7 @@ nuances:

@@ -235,7 +235,7 @@ .SS Architecture calling conventions

>>  In order to indicate that a system call is called under the x32 ABI,
>>  an additional bit,
>>  .BR __X32_SYSCALL_BIT ,
>> -is bitwise-ORed with the system call number.
>> +is bitwise ORed with the system call number.
>>  The ABI used by a process affects some process behaviors,
>>  including signal handling or system call restarting.
>>  .IP \[bu]
>> diff --git a/man3/regex.3 b/man3/regex.3
>> index 3b504a4d5..3ee58f61d 100644
>> --- a/man3/regex.3
>> +++ b/man3/regex.3
>> @@ -56,7 +56,7 @@ pattern buffer.

@@ -56,7 +56,7 @@ .SS POSIX regex compiling

>>  .PP
>>  .I cflags
>>  is the
>> -.RB bitwise- or
>> +bitwise OR
>>  of zero or more of the following:
>>  .TP
>>  .B REG_EXTENDED
>> @@ -121,7 +121,7 @@ and

@@ -121,7 +121,7 @@ .SS POSIX regex matching

>>  are used to provide information regarding the location of any matches.
>>  .I eflags
>>  is the
>> -.RB bitwise- or
>> +bitwise OR
>>  of zero or more of the following flags:
>>  .TP
>>  .B REG_NOTBOL
> 

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 1/6] regex.3: Fix subsection headings
  2023-04-20 13:02           ` [PATCH v4 1/6] regex.3: Fix subsection headings наб
@ 2023-04-20 13:13             ` Alejandro Colomar
  2023-04-20 13:24               ` наб
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 13:13 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2294 bytes --]

On 4/20/23 15:02, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
> $ git diff v3
> 
> But the patches are re-ordered (and a new move-only one added);
> --range-diff, humorously, /only/ picks up that one, and doesn't
> understand the rest, which is worse than if it failed entirely.
> 
> The 3type move is as far back as I could make it I think,
> 6/6 wants to come after regoff_t deduplication.
> 
>  man3/regex.3 | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 3ee58f61d..637cb2231 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -31,7 +31,7 @@ size_t " errbuf_size );
>  .BI "void regfree(regex_t *" preg );
>  .fi
>  .SH DESCRIPTION
> -.SS POSIX regex compiling
> +.SS Compilation
>  .BR regcomp ()
>  is used to compile a regular expression into a form that is suitable
>  for subsequent
> @@ -110,7 +110,7 @@ whether
>  .I eflags
>  contains
>  .BR REG_NOTEOL .
> -.SS POSIX regex matching
> +.SS Matching
>  .BR regexec ()
>  is used to match a null-terminated string
>  against the precompiled pattern buffer,
> @@ -159,7 +159,7 @@ or
>  .B REG_NEWLINE
>  processing.
>  This flag is a BSD extension, not present in POSIX.
> -.SS Byte offsets
> +.SS Match offsets

I think it might be a bit clearer as "Subexpression match offsets" or
something like that?  What do you think?

>  Unless
>  .B REG_NOSUB
>  was set for the compilation of the pattern buffer, it is possible to
> @@ -209,7 +209,7 @@ The relative
>  .I rm_eo
>  element indicates the end offset of the match,
>  which is the offset of the first character after the matching text.
> -.SS POSIX error reporting
> +.SS Error reporting
>  .BR regerror ()
>  is used to turn the error codes that can be returned by both
>  .BR regcomp ()
> @@ -238,7 +238,7 @@ are nonzero,
>  is filled in with the first
>  .I "errbuf_size \- 1"
>  characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
> -.SS POSIX pattern buffer freeing
> +.SS Freeing
>  Supplying
>  .BR regfree ()
>  with a precompiled pattern buffer,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 1/6] regex.3: Fix subsection headings
  2023-04-20 13:13             ` Alejandro Colomar
@ 2023-04-20 13:24               ` наб
  2023-04-20 13:35                 ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 13:24 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

Hi!

On Thu, Apr 20, 2023 at 03:13:54PM +0200, Alejandro Colomar wrote:
> On 4/20/23 15:02, наб wrote:
> > @@ -159,7 +159,7 @@ or
> >  .B REG_NEWLINE
> >  processing.
> >  This flag is a BSD extension, not present in POSIX.
> > -.SS Byte offsets
> > +.SS Match offsets
> I think it might be a bit clearer as "Subexpression match offsets" or
> something like that?  What do you think?
Nah; in a significant amount of scenarios you don't care about
subexpressions at all, and the one thing you're guaranteed to get is,
well, the non-subexpression match.
Saying "Subexpression match offsets" to mean "Match offsets, including
of subexpressions" is more confusing, and which offsets are returned is
explained in running text.

Best,


* Re: [PATCH v4 1/6] regex.3: Fix subsection headings
  2023-04-20 13:24               ` наб
@ 2023-04-20 13:35                 ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 13:35 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1392 bytes --]

Hi!

On 4/20/23 15:24, наб wrote:
> Hi!
> 
> On Thu, Apr 20, 2023 at 03:13:54PM +0200, Alejandro Colomar wrote:
>> On 4/20/23 15:02, наб wrote:
>>> @@ -159,7 +159,7 @@ or
>>>  .B REG_NEWLINE
>>>  processing.
>>>  This flag is a BSD extension, not present in POSIX.
>>> -.SS Byte offsets
>>> +.SS Match offsets
>> I think it might be a bit clearer as "Subexpression match offsets" or
>> something like that?  What do you think?
> Nah; in a significant amount of scenarios you don't care about
> subexpressions at all, and the one thing you're guaranteed to get is,
> well, the non-subexpression match.
> Saying "Subexpression match offsets" to mean "Match offsets, including
> of subexpressions" is more confusing, and which offsets are returned is
> explained in running text.

Ahh, sorry; I was confused myself.  I thought the section was only about
subexpressions, which is why I found it confusing that the title was not
more explicit.  Since it covers the main match + subexp, your title is better.

I'll apply this patch in a moment, after I push my SYNOPSIS patch, based
on your 2/6, since I found there are 2 places where _Nullable should go,
not one.

Best,
Alex

> 
> Best,

P.S.: That comma without continuation feels very awkward to me :)

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 2/6] regex.3: Desoupify function descriptions
  2023-04-20 13:02           ` [PATCH v4 2/6] regex.3: Desoupify function descriptions наб
@ 2023-04-20 14:00             ` Alejandro Colomar
  2023-04-20 14:37               ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 14:00 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 4321 bytes --]

Hi!

On 4/20/23 15:02, наб wrote:
> Behold:
>   regerror() is passed the error code, errcode, the pattern buffer,
>   preg, a pointer to a character string buffer, errbuf, and the size
>   of the string buffer, errbuf_size.
> 
> Absolute soup. This reads to me like an ill-conceived copy from a very
> early standard version. It looks fine in source form but is horrific to
> read as running text.
> 
> Instead, replace all of these with just the descriptions of what they do
> with their arguments. What the arguments are is very clearly noted in
> big bold in the prototypes.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Please break this patch into smaller ones.

> ---
>  man3/regex.3 | 80 +++++++++++++++++++++-------------------------------
>  1 file changed, 32 insertions(+), 48 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 637cb2231..b4feaba19 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -25,8 +25,8 @@ Standard C library
>  .BI "            size_t " nmatch ", regmatch_t " pmatch "[restrict ." nmatch ],
>  .BI "            int " eflags );
>  .PP
> -.BI "size_t regerror(int " errcode ", const regex_t *restrict " preg ,
> -.BI "            char " errbuf "[restrict ." errbuf_size "], \
> +.BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
> +.BI "                char " errbuf "[restrict ." errbuf_size "], \
>  size_t " errbuf_size );
>  .BI "void regfree(regex_t *" preg );
>  .fi
> @@ -38,21 +38,13 @@ for subsequent
>  .BR regexec ()
>  searches.
>  .PP
> -.BR regcomp ()
> -is supplied with
> -.IR preg ,
> -a pointer to a pattern buffer storage area;
> -.IR regex ,
> -a pointer to the null-terminated string and
> -.IR cflags ,
> -flags used to determine the type of compilation.
> -.PP
> -All regular expression searching must be done via a compiled pattern
> -buffer, thus
> -.BR regexec ()
> -must always be supplied with the address of a
> -.BR regcomp ()-initialized
> -pattern buffer.
> +The pattern buffer at
> +.I *preg
> +is initialized.

I think I prefer avoiding passive voice here.  No?
It initializes the pattern buffer at *preg?

Thanks,
Alex

> +.I regex
> +is a null-terminated string.
> +The locale must be the same when running
> +.BR regexec ().
>  .PP
>  .I cflags
>  is the
> @@ -113,12 +105,10 @@ contains
>  .SS Matching
>  .BR regexec ()
>  is used to match a null-terminated string
> -against the precompiled pattern buffer,
> -.IR preg .
> -.I nmatch
> -and
> -.I pmatch
> -are used to provide information regarding the location of any matches.
> +against the compiled pattern buffer in
> +.IR *preg ,
> +which must have been initialised with
> +.BR regexec ().
>  .I eflags
>  is the
>  bitwise OR
> @@ -217,34 +207,28 @@ and
>  .BR regexec ()
>  into error message strings.
>  .PP
> -.BR regerror ()
> -is passed the error code,
> -.IR errcode ,
> -the pattern buffer,
> -.IR preg ,
> -a pointer to a character string buffer,
> -.IR errbuf ,
> -and the size of the string buffer,
> -.IR errbuf_size .
> -It returns the size of the
> -.I errbuf
> -required to contain the null-terminated error message string.
> -If both
> -.I errbuf
> -and
> +.I errcode
> +must be the latest error returned from an operation on
> +.IR preg .
> +If
> +.I preg
> +is a null pointer\(emthe latest error.
> +.PP
> +If
> +.I errbuf_size
> +is
> +.BR 0 ,
> +the size of the required buffer is returned.
> +Otherwise, up to
>  .I errbuf_size
> -are nonzero,
> -.I errbuf
> -is filled in with the first
> -.I "errbuf_size \- 1"
> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
> +bytes are copied to
> +.IR errbuf ;
> +the error string is always null-terminated, and truncated to fit.
>  .SS Freeing
> -Supplying
>  .BR regfree ()
> -with a precompiled pattern buffer,
> -.IR preg ,
> -will free the memory allocated to the pattern buffer by the compiling
> -process,
> +invalidates the pattern buffer at
> +.IR *preg ,
> +which must have been initialized via
>  .BR regcomp ().
>  .SH RETURN VALUE
>  .BR regcomp ()
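
For illustration only, a sketch of the two-call regerror() pattern that the
rewritten paragraph above describes: ask for the required size with
errbuf_size 0, then fetch the message.  The helper name regex_strerror() is
an assumption made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed, null-terminated message for
 * `errcode`, which came from an operation on `preg`.  The caller frees it.
 * Returns NULL if the allocation fails.
 */
char *regex_strerror(int errcode, const regex_t *preg)
{
    /* With errbuf_size 0, errbuf is ignored and the needed size returned. */
    size_t size = regerror(errcode, preg, NULL, 0);
    char *msg;

    assert(size > 0);    /* the size includes the terminating null byte */

    msg = malloc(size);
    if (msg != NULL)
        regerror(errcode, preg, msg, size);
    return msg;
}
```

Querying the size first means the message is never truncated, at the cost
of a second call.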

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 3/6] regex.3: Improve REG_STARTEND
  2023-04-20 13:02           ` [PATCH v4 3/6] regex.3: Improve REG_STARTEND наб
@ 2023-04-20 14:04             ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 14:04 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2346 bytes --]

On 4/20/23 15:02, наб wrote:
> Explicitly spell out the ranges involved. The original wording always
> confused me, but it's actually very sane.

I like this change.

> 
> Also change the [0]. to -> here to make more obvious the point that
> pmatch is used as a pointer-to-object, not array in this scenario.

Since at the same time [>0] can be meaningful, I prefer using [0],
to note that the first entry is special in the array.  -> looks like
there's no array at all, but rather just one object.

> 
> Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
> R_NOTEOL? No. That's weird and confusing.
> 
> String largeness doesn't matter, known-lengthness does.

Good.

> 
> Explicitly spell out the influence on returned matches
> (relative to string, not start of range).
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Cheers,
Alex

> ---
>  man3/regex.3 | 33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index b4feaba19..00e7e2c6b 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -131,23 +131,30 @@ compilation flag
>  above).
>  .TP
>  .B REG_STARTEND
> -Use
> -.I pmatch[0]
> -on the input string, starting at byte
> -.I pmatch[0].rm_so
> -and ending before byte
> -.IR pmatch[0].rm_eo .
> +Match
> +.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
> +instead of
> +.RI [ string ", " string " + \fBstrlen\fP(" string )).
>  This allows matching embedded NUL bytes
>  and avoids a
>  .BR strlen (3)
> -on large strings.
> -It does not use
> +on known-length strings.
> +.I pmatch
> +must point to a valid readable object.
> +If any matches are returned
> +.RB ( REG_NOSUB
> +wasn't passed to
> +.BR regcomp (),
> +the match succeeded, and
>  .I nmatch
> -on input, and does not change
> -.B REG_NOTBOL
> -or
> -.B REG_NEWLINE
> -processing.
> +> 0), they overwrite
> +.I pmatch
> +as usual, and the
> +.B Match offsets
> +remain relative to
> +.IR string
> +(not
> +.IR string " + " pmatch->rm_so ).
>  This flag is a BSD extension, not present in POSIX.
>  .SS Match offsets
>  Unless

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 5/6] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move in with regex.3
  2023-04-20 13:02           ` [PATCH v4 5/6] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move in with regex.3 наб
@ 2023-04-20 14:07             ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 14:07 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3937 bytes --]

On 4/20/23 15:02, наб wrote:
> They're inextricably linked, not cross-referenced at all,
> and not used anywhere else.
> 
> Now that they (realistically) exist to the reader, add a note
> on how big nmatch can be; POSIX even says "The application developer
> should note that there is probably no reason for using a value of
> nmatch that is larger than preg−>re_nsub+1.".
> 
> Also remove the now-duplicate regmatch_t declaration.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3              | 40 +++++++++++++++++++--------------------
>  man3type/regex_t.3type    |  1 +
>  man3type/regmatch_t.3type |  2 +-
>  man3type/regoff_t.3type   |  2 +-
>  4 files changed, 23 insertions(+), 22 deletions(-)
>  create mode 100644 man3type/regex_t.3type
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 615e065de..6d203fa22 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -15,7 +15,7 @@ regcomp, regexec, regerror, regfree \- POSIX regex functions
>  Standard C library
>  .RI ( libc ", " \-lc )
>  .SH SYNOPSIS
> -.nf
> +.EX
>  .B #include <regex.h>
>  .PP
>  .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
> @@ -43,7 +43,7 @@ size_t " errbuf_size );
>  .B } regmatch_t;
>  .PP
>  .BR typedef " /* ... */  " regoff_t;
> -.fi
> +.EE
>  .SH DESCRIPTION
>  .SS Compilation
>  .BR regcomp ()
> @@ -60,6 +60,21 @@ is a null-terminated string.
>  The locale must be the same when running
>  .BR regexec ().
>  .PP
> +After
> +.BR regcomp ()
> +succeeds,
> +.I preg->re_nsub
> +holds the number of subexpressions in
> +.IR regex .
> +Thus, a value of
> +.I preg->re_nsub
> ++ 1
> +passed as
> +.I nmatch
> +to
> +.BR regexec ()
> +is sufficient to capture all matches.
> +.PP
>  .I cflags
>  is the
>  bitwise OR
> @@ -196,22 +211,6 @@ must be at least
>  .IR N+1 .)
>  Any unused structure elements will contain the value \-1.
>  .PP
> -The
> -.I regmatch_t
> -structure which is the type of
> -.I pmatch
> -is defined in
> -.IR <regex.h> .
> -.PP
> -.in +4n
> -.EX
> -typedef struct {
> -    regoff_t rm_so;
> -    regoff_t rm_eo;
> -} regmatch_t;
> -.EE
> -.in
> -.PP
>  Each
>  .I rm_so
>  element that is not \-1 indicates the start offset of the next largest
> @@ -222,7 +221,7 @@ element indicates the end offset of the match,
>  which is the offset of the first character after the matching text.
>  .PP
>  .I regoff_t
> -It is a signed integer type
> +is a signed integer type
>  capable of storing the largest value that can be stored in either an
>  .I ptrdiff_t
>  type or a
> @@ -346,7 +345,8 @@ POSIX.1-2008.
>  POSIX.1-2001.
>  .PP
>  Prior to POSIX.1-2008,
> -the type was
> +.I regoff_t
> +was required to be
>  capable of storing the largest value that can be stored in either an
>  .I off_t
>  type or a
> diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
> new file mode 100644
> index 000000000..c0daaf0ff
> --- /dev/null
> +++ b/man3type/regex_t.3type

The link changes in the same patch that does the move are fine.
git should be smart enough to follow that, and it will help
humans too.  This short removal of the file might be worse than
the previous approach, I fear.

> @@ -0,0 +1 @@
> +.so man3/regex.3
> diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
> index dc78f2cf2..c0daaf0ff 100644
> --- a/man3type/regmatch_t.3type
> +++ b/man3type/regmatch_t.3type
> @@ -1 +1 @@
> -.so man3type/regex_t.3type
> +.so man3/regex.3
> diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
> index dc78f2cf2..c0daaf0ff 100644
> --- a/man3type/regoff_t.3type
> +++ b/man3type/regoff_t.3type
> @@ -1 +1 @@
> -.so man3type/regex_t.3type
> +.so man3/regex.3


-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 6/6] regex.3: Destandardeseify Match offsets
  2023-04-20 13:02           ` [PATCH v4 6/6] regex.3: Destandardeseify Match offsets наб
@ 2023-04-20 14:10             ` Alejandro Colomar
  2023-04-20 15:05               ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 14:10 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2670 bytes --]



On 4/20/23 15:02, наб wrote:
> This section reads like it were (and pretty much is) lifted from POSIX.
> That's hard to read, because POSIX is horrendously verbose, as usual.
> 
> Instead, synopsise it into something less formal but more reasonable,
> and describe the resulting range with a range instead of a paragraph.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
>  1 file changed, 25 insertions(+), 28 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 6d203fa22..552763940 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -188,37 +188,34 @@ This flag is a BSD extension, not present in POSIX.
>  .SS Match offsets
>  Unless
>  .B REG_NOSUB
> -was set for the compilation of the pattern buffer, it is possible to
> -obtain match addressing information.
> -.I pmatch
> -must be dimensioned to have at least
> -.I nmatch
> -elements.
> -These are filled in by
> +was passed to
> +.BR regcomp (),
> +it is possible to
> +obtain the locations of matches within
> +.IR string :
>  .BR regexec ()
> -with substring match addresses.
> -The offsets of the subexpression starting at the
> -.IR i th
> -open parenthesis are stored in
> -.IR pmatch[i] .
> -The entire regular expression's match addresses are stored in
> -.IR pmatch[0] .
> -(Note that to return the offsets of
> -.I N
> -subexpression matches,
> +fills
>  .I nmatch
> -must be at least
> -.IR N+1 .)
> -Any unused structure elements will contain the value \-1.
> +elements of
> +.I pmatch
> +with results:
> +.I pmatch[0]
> +corresponds to the entire match,

I still don't understand this.  Does REG_NOSUB also affect pmatch[0]?
I would have expected that it would only affect *sub*matches, that is, [>0].

> +.I pmatch[1]
> +to the first expression, etc.
> +If there were more matches than
> +.IR nmatch ,
> +they are discarded;
> +if fewer,
> +unused elements of
> +.I pmatch
> +are filled with
> +.BR \-1 s.
>  .PP
> -Each
> -.I rm_so
> -element that is not \-1 indicates the start offset of the next largest
> -substring match within the string.
> -The relative
> -.I rm_eo
> -element indicates the end offset of the match,
> -which is the offset of the first character after the matching text.
> +Each returned valid
> +.RB (non- \-1 )
> +match corresponds to the range
> +.RI [ string " + " rm_so ", " string " + " rm_eo ).
>  .PP
>  .I regoff_t
>  is a signed integer type

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 13:03               ` Alejandro Colomar
@ 2023-04-20 14:13                 ` наб
  2023-04-20 14:19                   ` Alejandro Colomar
  2023-04-20 18:42                 ` G. Branden Robinson
  1 sibling, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 14:13 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 630 bytes --]

Hi!

On Thu, Apr 20, 2023 at 03:03:24PM +0200, Alejandro Colomar wrote:
> >> diff --git a/man2/adjtimex.2 b/man2/adjtimex.2
> >> index 523347de2..40b05cb0e 100644
> >> --- a/man2/adjtimex.2
> >> +++ b/man2/adjtimex.2
> >> @@ -90,7 +90,7 @@ the constants used for
> BTW, another thing you might find useful is this:
> 
> $ cat ~/.config/git/attributes 
> *.[1-8]* diff=man
> 
> And then in your .gitconfig:
> 
> [diff "man"]
> 	xfuncname = "^\\.S[SH] .*$"
That's great tech, thanks.

> You may want to use a regex that also works for mdoc(7).
mdoc uses .Sh and .Ss, so:
	xfuncname = "^\\.S[SHsh] .*"
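
For illustration only, a quick way to sanity-check that pattern with the
very functions this thread documents.  This assumes git treats xfuncname
as a POSIX extended regular expression and that the config value unescapes
to ^\.S[SHsh] .* before use; the helper name is_section_heading() is an
assumption made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical checker: returns 1 if `line` matches the hunk-header
 * pattern, 0 if it does not, and -1 if the pattern fails to compile.
 */
int is_section_heading(const char *line)
{
    /* The ERE that the config value "^\\.S[SHsh] .*" stands for. */
    static const char pattern[] = "^\\.S[SHsh] .*";
    regex_t re;
    int rc;

    assert(line != NULL);

    /* Only a yes/no answer is needed, so no match offsets are requested. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Both the man(7) spellings (.SH, .SS) and the mdoc(7) spellings (.Sh, .Ss)
should be accepted, while other requests such as .PP should not.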

Best,


* Re: [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 14:13                 ` наб
@ 2023-04-20 14:19                   ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 14:19 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 982 bytes --]

Hi!

On 4/20/23 16:13, наб wrote:
> Hi!
> 
> On Thu, Apr 20, 2023 at 03:03:24PM +0200, Alejandro Colomar wrote:
>>>> diff --git a/man2/adjtimex.2 b/man2/adjtimex.2
>>>> index 523347de2..40b05cb0e 100644
>>>> --- a/man2/adjtimex.2
>>>> +++ b/man2/adjtimex.2
>>>> @@ -90,7 +90,7 @@ the constants used for
>> BTW, another thing you might find useful is this:
>>
>> $ cat ~/.config/git/attributes 
>> *.[1-8]* diff=man
>>
>> And then in your .gitconfig:
>>
>> [diff "man"]
>> 	xfuncname = "^\\.S[SH] .*$"
> That's great tech, thanks.
> 
>> You may want to use a regex that also works for mdoc(7).
> mdoc uses .Sh and .Ss, so:
> 	xfuncname = "^\\.S[SHsh] .*"

Thanks!  I improved my config file :-) [1]

Best,
Alex

[1]:  <http://www.alejandro-colomar.es/src/alx/alx/config.git/commit/?id=4e772e3e3fe0785d773cf702b115dfc3d20d90d5>

> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v4 2/6] regex.3: Desoupify function descriptions
  2023-04-20 14:00             ` Alejandro Colomar
@ 2023-04-20 14:37               ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 14:37 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1392 bytes --]

Hi!

On Thu, Apr 20, 2023 at 04:00:40PM +0200, Alejandro Colomar wrote:
> On 4/20/23 15:02, наб wrote:
> > Instead, replace all of these with just the descriptions of what they do
> > with their arguments. What the arguments are is very clearly noted in
> > big bold in the prototypes.
> Please break this patch into smaller ones.
Cracked into one each for regcomp/regexec/regerror.

> > @@ -38,21 +38,13 @@ for subsequent
> >  .BR regexec ()
> >  searches.
> >  .PP
> > -.BR regcomp ()
> > -is supplied with
> > -.IR preg ,
> > -a pointer to a pattern buffer storage area;
> > -.IR regex ,
> > -a pointer to the null-terminated string and
> > -.IR cflags ,
> > -flags used to determine the type of compilation.
> > -.PP
> > -All regular expression searching must be done via a compiled pattern
> > -buffer, thus
> > -.BR regexec ()
> > -must always be supplied with the address of a
> > -.BR regcomp ()-initialized
> > -pattern buffer.
> > +The pattern buffer at
> > +.I *preg
> > +is initialized.
> I think I prefer avoiding passive voice here.  No?
> It initializes the pattern buffer at *preg?
I changed it to
  On success, the pattern buffer at *preg is initialized.
Which makes more sense as a post-condition,
and writing it the other way around would be weird
("If it succeeds, it initialises pattern buffer at *preg"?
 horrendous).

Best,


* Re: [PATCH v4 6/6] regex.3: Destandardeseify Match offsets
  2023-04-20 14:10             ` Alejandro Colomar
@ 2023-04-20 15:05               ` наб
  2023-04-20 18:51                 ` G. Branden Robinson
  2023-04-21 11:34                 ` Alejandro Colomar
  0 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:05 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 5704 bytes --]

Hi!

On Thu, Apr 20, 2023 at 04:10:04PM +0200, Alejandro Colomar wrote:
> On 4/20/23 15:02, наб wrote:
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -188,37 +188,34 @@ This flag is a BSD extension, not present in POSIX.
> >  .SS Match offsets
> >  Unless
> >  .B REG_NOSUB
> > -was set for the compilation of the pattern buffer, it is possible to
> > -obtain match addressing information.
> > -.I pmatch
> > -must be dimensioned to have at least
> > -.I nmatch
> > -elements.
> > -These are filled in by
> > +was passed to
> > +.BR regcomp (),
> > +it is possible to
> > +obtain the locations of matches within
> > +.IR string :
> >  .BR regexec ()
> > -with substring match addresses.
> > -The offsets of the subexpression starting at the
> > -.IR i th
> > -open parenthesis are stored in
> > -.IR pmatch[i] .
> > -The entire regular expression's match addresses are stored in
> > -.IR pmatch[0] .
> > -(Note that to return the offsets of
> > -.I N
> > -subexpression matches,
> > +fills
> >  .I nmatch
> > -must be at least
> > -.IR N+1 .)
> > -Any unused structure elements will contain the value \-1.
> > +elements of
> > +.I pmatch
> > +with results:
> > +.I pmatch[0]
> > +corresponds to the entire match,
> I still don't understand this.  Does REG_NOSUB also affect pmatch[0]?
> I would have expected that it would only affect *sub*matches, that is, [>0].

Let's consult the manual:
  REG_NOSUB  Do not report position of matches. [...]
  REG_NOSUB  Compile for matching that need only report success or
             failure, not what was matched.                    (4.4BSD)
and POSIX:
  REG_NOSUB  Report only success or fail in regexec().
  REG_NOSUB  Report only success/fail in regexec( ).
(yes; the two times it describes it, it's written differently).

POSIX says it better I think.

And, indeed:
	$ cat a.c
	#include <regex.h>
	#include <stdio.h>
	int main(int c, char ** v) {
		regex_t r;
		regcomp(&r, v[1], 0);
		regmatch_t dt = {0, 3};
		printf("%d\n", regexec(&r, v[2], 1, &dt, REG_STARTEND));
		printf("%d, %d\n", (int)dt.rm_so, (int)dt.rm_eo);
	}

	$ cc a.c -oac
	$ ./ac 'c$' 'abcdef'
	0
	2, 3

	$ sed 's/0)/REG_NOSUB)/' a.c | cc -xc - -oac
	$ ./ac 'c$' 'abcdef'
	0
	0, 3


...and I've just realised why you're asking ‒ I think you're reading too
much (and ahistorically) into the "SUB" bit;
heretofore I've assumed this is for "substitution", which I think is fair.

Actually, let's consult POSIX.2 (Draft 11.2):
  591     Table B-8  − regcomp() cflags Argument
  596  REG_NOSUB  Report only success/fail in regexec().
B.5 C Binding for Regular Expression Matching, B.5.2 Description:
  609  If the REG_NOSUB flag was not set in cflags, then regcomp() shall set re_nsub to
  610  the number of parenthesized subexpressions [delimited by \( \) in basic regular
  611  expressions or ( ) in extended regular expressions] found in pattern.
both as present-day.

B.5.5 Rationale., History of Decisions Made:
  791  The working group has rejected, at least for now, the inclusion of a regsub() func-
  792  tion that would be used to do substitutions for a matched regular expression.
  793  While such a routine would be useful to some applications, its utility would be
  794  much more limited than the matching function described here. Both regular
  795  expression parsing and substitution are possible to implement without support
  796  other than that required by the C Standard {7}, but matching is much more com-
  797  plex than substituting. The only ‘‘difficult’’ part of substitution, given the infor-
  798  mation supplied by regexec(), is finding the next character in a string when there
  799  can be multibyte characters. That is a much wider issue, and one that needs a
  800  more general solution.

  803  In Draft 9, the interface was modified so that the matched substrings rm_sp and
  804  rm_ep are in a separate regmatch_t structure instead of in regex_t. This allows a
  805  single compiled regular expression to be used simultaneously in several contexts;
  806  in main() and a signal handler, perhaps, or in multiple threads of lightweight
  807  processes. (The preg argument to regexec() is declared with type const, so the
  808  implementation is not permitted to use the structure to store intermediate
  809  results.) It also allows an application to request an arbitrary number of sub-
  810  strings from a regular expression. (Previous versions reported only ten sub-
  811  strings.) The number of subexpressions in the regular expression is reported in
  812  re_nsub in preg. With this change to regexec(), consideration was given to drop-
  813  ping the REG_NOSUB flag, since the user can now specify this with a zero nmatch
  814  argument to regexec(). However, keeping REG_NOSUB allows an implementation
  815  to use a different (perhaps more efficient) algorithm if it knows in regcomp() that
  816  no subexpressions need be reported. The implementation is only required to fill
  817  in pmatch if nmatch is not zero and if REG_NOSUB is not specified. Note that the
  818  size_t type, as defined in the C Standard {7}, is unsigned, so the description of
  819  regexec() does not need to address negative values of nmatch.

So: yes, there was a substitution interface that got cut.
The name is actually a hold-over from
"don't allocate for ten subexpressions in regex_t".

I think changing our description to
  REG_NOSUB  Only report overall success. regexec() will only use pmatch
             for REG_STARTEND, and ignore nmatch.
may make that more obvious.

Best,
наб


* [PATCH v5 0/8] regex.3 momento
  2023-04-20 13:02           ` [PATCH v4 1/6] regex.3: Fix subsection headings наб
  2023-04-20 13:13             ` Alejandro Colomar
@ 2023-04-20 15:35             ` наб
  2023-04-20 15:35               ` [PATCH v5 1/8] regex.3: Desoupify regcomp() description наб
                                 ` (8 more replies)
  1 sibling, 9 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:35 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2674 bytes --]

The range diff was again soup, I think there's something in the
interdiff tho.

8/8 may be clearer, may be not.

наб (8):
  regex.3: Desoupify regcomp() description
  regex.3: Desoupify regexec() description
  regex.3: Desoupify regerror() description
  regex.3: Improve REG_STARTEND
  regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link
    regex_t.3type into regex.3
  regex.3: Finalise move of reg*.3type
  regex.3: Destandardeseify Match offsets
  regex.3: Further clarify the sole purpose of REG_NOSUB

 man3/regex.3              | 250 +++++++++++++++++++++-----------------
 man3type/regex_t.3type    |  64 +---------
 man3type/regmatch_t.3type |   2 +-
 man3type/regoff_t.3type   |   2 +-
 4 files changed, 143 insertions(+), 175 deletions(-)

Interdiff against v4:
diff --git a/man3/regex.3 b/man3/regex.3
index 552763940..66d9c6596 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -52,7 +52,7 @@ .SS Compilation
 .BR regexec ()
 searches.
 .PP
-The pattern buffer at
+On success, the pattern buffer at
 .I *preg
 is initialized.
 .I regex
@@ -96,16 +96,14 @@ .SS Compilation
 searches using this pattern buffer will be case insensitive.
 .TP
 .B REG_NOSUB
-Do not report position of matches.
-The
-.I nmatch
-and
-.I pmatch
+Only report overall success:
 .BR regexec ()
-arguments will be ignored for this purpose (but
+will only use
 .I pmatch
-may still be used for
-.BR REG_STARTEND ).
+for
+.BR REG_STARTEND ,
+and ignore
+.IR nmatch .
 .TP
 .B REG_NEWLINE
 Match-any-character operators don't match a newline.
@@ -161,7 +159,7 @@ .SS Matching
 .TP
 .B REG_STARTEND
 Match
-.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
 instead of
 .RI [ string ", " string " + \fBstrlen\fP(" string )).
 This allows matching embedded NUL bytes
@@ -183,7 +181,7 @@ .SS Matching
 remain relative to
 .IR string
 (not
-.IR string " + " pmatch->rm_so ).
+.IR string " + " pmatch[0].rm_so ).
 This flag is a BSD extension, not present in POSIX.
 .SS Match offsets
 Unless
@@ -349,6 +347,20 @@ .SH HISTORY
 type or a
 .I ssize_t
 type.
+.SH NOTES
+.I re_nsub
+is only required to be initialized if
+.B REG_NOSUB
+wasn't specified, but all known implementations initialize it regardless.
+.\" glibc, musl, 4.4BSD, illumos
+.PP
+Both
+.I regex_t
+and
+.I regmatch_t
+may (and do) have more members, in any order.
+Always reference them by name.
+.\" illumos has two more start/end pairs and the first one is of pointers
 .SH EXAMPLES
 .EX
 #include <stdint.h>
-- 
2.30.2


* [PATCH v5 1/8] regex.3: Desoupify regcomp() description
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
@ 2023-04-20 15:35               ` наб
  2023-04-20 16:37                 ` Alejandro Colomar
  2023-04-20 15:35               ` [PATCH v5 2/8] regex.3: Desoupify regexec() description наб
                                 ` (7 subsequent siblings)
  8 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 15:35 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1532 bytes --]

Behold:
  regerror() is passed the error code, errcode, the pattern buffer,
  preg, a pointer to a character string buffer, errbuf, and the size
  of the string buffer, errbuf_size.

Absolute soup. This reads to me like an ill-conceived copy from a very
early standard version. It looks fine in source form but is horrific to
read as running text.

Instead, replace all of these with just the descriptions of what they do
with their arguments. What the arguments are is very clearly noted in
big bold in the prototypes.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 129c42412..2f6ee816f 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -38,21 +38,13 @@ .SS Compilation
 .BR regexec ()
 searches.
 .PP
-.BR regcomp ()
-is supplied with
-.IR preg ,
-a pointer to a pattern buffer storage area;
-.IR regex ,
-a pointer to the null-terminated string and
-.IR cflags ,
-flags used to determine the type of compilation.
-.PP
-All regular expression searching must be done via a compiled pattern
-buffer, thus
-.BR regexec ()
-must always be supplied with the address of a
-.BR regcomp ()-initialized
-pattern buffer.
+On success, the pattern buffer at
+.I *preg
+is initialized.
+.I regex
+is a null-terminated string.
+The locale must be the same when running
+.BR regexec ().
 .PP
 .I cflags
 is the
-- 
2.30.2



* [PATCH v5 2/8] regex.3: Desoupify regexec() description
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
  2023-04-20 15:35               ` [PATCH v5 1/8] regex.3: Desoupify regcomp() description наб
@ 2023-04-20 15:35               ` наб
  2023-04-20 15:35               ` [PATCH v5 3/8] regex.3: Desoupify regerror() description наб
                                 ` (6 subsequent siblings)
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:35 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 713 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 2f6ee816f..ae160c9b3 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -105,12 +105,10 @@ .SS Compilation
 .SS Matching
 .BR regexec ()
 is used to match a null-terminated string
-against the precompiled pattern buffer,
-.IR preg .
-.I nmatch
-and
-.I pmatch
-are used to provide information regarding the location of any matches.
+against the compiled pattern buffer in
+.IR *preg ,
+which must have been initialised with
+.BR regcomp ().
 .I eflags
 is the
 bitwise OR
-- 
2.30.2



* [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
  2023-04-20 15:35               ` [PATCH v5 1/8] regex.3: Desoupify regcomp() description наб
  2023-04-20 15:35               ` [PATCH v5 2/8] regex.3: Desoupify regexec() description наб
@ 2023-04-20 15:35               ` наб
  2023-04-20 16:42                 ` Alejandro Colomar
                                   ` (2 more replies)
  2023-04-20 15:35               ` [PATCH v5 4/8] regex.3: Improve REG_STARTEND наб
                                 ` (5 subsequent siblings)
  8 siblings, 3 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:35 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1981 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 46 ++++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 26 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index ae160c9b3..c5185549b 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -26,7 +26,7 @@ .SH SYNOPSIS
 .BI "            int " eflags );
 .PP
 .BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
-.BI "            char " errbuf "[restrict ." errbuf_size "], \
+.BI "                char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
 .fi
@@ -207,34 +207,28 @@ .SS Error reporting
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+If
+.I preg
+is a null pointer\(emthe latest error.
+.PP
+If
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+is
+.BR 0 ,
+the size of the required buffer is returned.
+Otherwise, up to
+.I errbuf_size
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS Freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+invalidates the pattern buffer at
+.IR *preg ,
+which must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2



* [PATCH v5 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
                                 ` (2 preceding siblings ...)
  2023-04-20 15:35               ` [PATCH v5 3/8] regex.3: Desoupify regerror() description наб
@ 2023-04-20 15:35               ` наб
  2023-04-20 17:29                 ` Alejandro Colomar
  2023-04-20 15:36               ` [PATCH v5 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
                                 ` (4 subsequent siblings)
  8 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 15:35 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1700 bytes --]

Explicitly spell out the ranges involved. The original wording always
confused me, but it's actually very sane.

Remove "this doesn't change REG_NOTBOL & REG_NEWLINE" ‒ so does it change
REG_NOTEOL? No. That's weird and confusing.

String largeness doesn't matter, known-lengthness does.

Explicitly spell out the influence on returned matches
(relative to string, not start of range).

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index c5185549b..1ce0a3b7e 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -131,23 +131,30 @@ .SS Matching
 above).
 .TP
 .B REG_STARTEND
-Use
-.I pmatch[0]
-on the input string, starting at byte
-.I pmatch[0].rm_so
-and ending before byte
-.IR pmatch[0].rm_eo .
+Match
+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
+instead of
+.RI [ string ", " string " + \fBstrlen\fP(" string )).
 This allows matching embedded NUL bytes
 and avoids a
 .BR strlen (3)
-on large strings.
-It does not use
+on known-length strings.
+.I pmatch
+must point to a valid readable object.
+If any matches are returned
+.RB ( REG_NOSUB
+wasn't passed to
+.BR regcomp (),
+the match succeeded, and
 .I nmatch
-on input, and does not change
-.B REG_NOTBOL
-or
-.B REG_NEWLINE
-processing.
+> 0), they overwrite
+.I pmatch
+as usual, and the
+.B Match offsets
+remain relative to
+.IR string
+(not
+.IR string " + " pmatch[0].rm_so ).
 This flag is a BSD extension, not present in POSIX.
 .SS Match offsets
 Unless
-- 
2.30.2



* [PATCH v5 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
                                 ` (3 preceding siblings ...)
  2023-04-20 15:35               ` [PATCH v5 4/8] regex.3: Improve REG_STARTEND наб
@ 2023-04-20 15:36               ` наб
  2023-04-20 15:36               ` [PATCH v5 6/8] regex.3: Finalise move of reg*.3type наб
                                 ` (3 subsequent siblings)
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 4247 bytes --]

Move-only commit.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3              | 30 ++++++++++++++++++
 man3type/regex_t.3type    | 64 +--------------------------------------
 man3type/regmatch_t.3type |  2 +-
 man3type/regoff_t.3type   |  2 +-
 4 files changed, 33 insertions(+), 65 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 1ce0a3b7e..897a622d4 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -29,6 +29,20 @@ .SH SYNOPSIS
 .BI "                char " errbuf "[restrict ." errbuf_size "], \
 size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
+.PP
+.B typedef struct {
+.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
+.B } regex_t;
+.PP
+.B typedef struct {
+.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
+                           to start of substring */
+.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
+                           the first character after the end of
+                           substring */
+.B } regmatch_t;
+.PP
+.BR typedef " /* ... */  " regoff_t;
 .fi
 .SH DESCRIPTION
 .SS Compilation
@@ -206,6 +220,14 @@ .SS Match offsets
 .I rm_eo
 element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
+.PP
+.I regoff_t
+It is a signed integer type
+capable of storing the largest value that can be stored in either an
+.I ptrdiff_t
+type or a
+.I ssize_t
+type.
 .SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
@@ -322,6 +344,14 @@ .SH STANDARDS
 POSIX.1-2008.
 .SH HISTORY
 POSIX.1-2001.
+.PP
+Prior to POSIX.1-2008,
+the type was
+capable of storing the largest value that can be stored in either an
+.I off_t
+type or a
+.I ssize_t
+type.
 .SH EXAMPLES
 .EX
 #include <stdint.h>
diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
index 176d2c7a6..c0daaf0ff 100644
--- a/man3type/regex_t.3type
+++ b/man3type/regex_t.3type
@@ -1,63 +1 @@
-.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
-.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\"
-.TH regex_t 3type (date) "Linux man-pages (unreleased)"
-.SH NAME
-regex_t, regmatch_t, regoff_t
-\- regular expression matching
-.SH LIBRARY
-Standard C library
-.RI ( libc )
-.SH SYNOPSIS
-.EX
-.B #include <regex.h>
-.PP
-.B typedef struct {
-.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
-.B } regex_t;
-.PP
-.B typedef struct {
-.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
-                           to start of substring */
-.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
-                           the first character after the end of
-                           substring */
-.B } regmatch_t;
-.PP
-.BR typedef " /* ... */  " regoff_t;
-.EE
-.SH DESCRIPTION
-.TP
-.I regex_t
-This is a structure type used in regular expression matching.
-It holds a compiled regular expression,
-compiled with
-.BR regcomp (3).
-.TP
-.I regmatch_t
-This is a structure type used in regular expression matching.
-.TP
-.I regoff_t
-It is a signed integer type
-capable of storing the largest value that can be stored in either an
-.I ptrdiff_t
-type or a
-.I ssize_t
-type.
-.SH STANDARDS
-POSIX.1-2008.
-.SH HISTORY
-POSIX.1-2001.
-.PP
-Prior to POSIX.1-2008,
-the type was
-capable of storing the largest value that can be stored in either an
-.I off_t
-type or a
-.I ssize_t
-type.
-.SH SEE ALSO
-.BR regex (3)
+.so man3/regex.3
diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regmatch_t.3type
+++ b/man3type/regmatch_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regoff_t.3type
+++ b/man3type/regoff_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
-- 
2.30.2



* [PATCH v5 6/8] regex.3: Finalise move of reg*.3type
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
                                 ` (4 preceding siblings ...)
  2023-04-20 15:36               ` [PATCH v5 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
@ 2023-04-20 15:36               ` наб
  2023-04-20 15:36               ` [PATCH v5 7/8] regex.3: Destandardeseify Match offsets наб
                                 ` (2 subsequent siblings)
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2891 bytes --]

They're inextricably linked, not cross-referenced at all,
and not used anywhere else.

Now that they (realistically) exist to the reader, add a note
on how big nmatch can be; POSIX even says "The application developer
should note that there is probably no reason for using a value of
nmatch that is larger than preg−>re_nsub+1.".

Also remove the now-duplicate regmatch_t declaration.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 54 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 897a622d4..75c810c41 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -15,7 +15,7 @@ .SH LIBRARY
 Standard C library
 .RI ( libc ", " \-lc )
 .SH SYNOPSIS
-.nf
+.EX
 .B #include <regex.h>
 .PP
 .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
@@ -43,7 +43,7 @@ .SH SYNOPSIS
 .B } regmatch_t;
 .PP
 .BR typedef " /* ... */  " regoff_t;
-.fi
+.EE
 .SH DESCRIPTION
 .SS Compilation
 .BR regcomp ()
@@ -60,6 +60,21 @@ .SS Compilation
 The locale must be the same when running
 .BR regexec ().
 .PP
+After
+.BR regcomp ()
+succeeds,
+.I preg->re_nsub
+holds the number of subexpressions in
+.IR regex .
+Thus, a value of
+.I preg->re_nsub
++ 1
+passed as
+.I nmatch
+to
+.BR regexec ()
+is sufficient to capture all matches.
+.PP
 .I cflags
 is the
 bitwise OR
@@ -196,22 +211,6 @@ .SS Match offsets
 .IR N+1 .)
 Any unused structure elements will contain the value \-1.
 .PP
-The
-.I regmatch_t
-structure which is the type of
-.I pmatch
-is defined in
-.IR <regex.h> .
-.PP
-.in +4n
-.EX
-typedef struct {
-    regoff_t rm_so;
-    regoff_t rm_eo;
-} regmatch_t;
-.EE
-.in
-.PP
 Each
 .I rm_so
 element that is not \-1 indicates the start offset of the next largest
@@ -222,7 +221,7 @@ .SS Match offsets
 which is the offset of the first character after the matching text.
 .PP
 .I regoff_t
-It is a signed integer type
+is a signed integer type
 capable of storing the largest value that can be stored in either an
 .I ptrdiff_t
 type or a
@@ -346,12 +345,27 @@ .SH HISTORY
 POSIX.1-2001.
 .PP
 Prior to POSIX.1-2008,
-the type was
+.I regoff_t
+was required to be
 capable of storing the largest value that can be stored in either an
 .I off_t
 type or a
 .I ssize_t
 type.
+.SH NOTES
+.I re_nsub
+is only required to be initialized if
+.B REG_NOSUB
+wasn't specified, but all known implementations initialize it regardless.
+.\" glibc, musl, 4.4BSD, illumos
+.PP
+Both
+.I regex_t
+and
+.I regmatch_t
+may (and do) have more members, in any order.
+Always reference them by name.
+.\" illumos has two more start/end pairs and the first one is of pointers
 .SH EXAMPLES
 .EX
 #include <stdint.h>
-- 
2.30.2



* [PATCH v5 7/8] regex.3: Destandardeseify Match offsets
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
                                 ` (5 preceding siblings ...)
  2023-04-20 15:36               ` [PATCH v5 6/8] regex.3: Finalise move of reg*.3type наб
@ 2023-04-20 15:36               ` наб
  2023-04-20 15:36               ` [PATCH v5 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB наб
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2194 bytes --]

This section reads like it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 75c810c41..ca0ab83df 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -188,37 +188,34 @@ .SS Matching
 .SS Match offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first expression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ string " + " rm_so ", " string " + " rm_eo ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2



* [PATCH v5 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
                                 ` (6 preceding siblings ...)
  2023-04-20 15:36               ` [PATCH v5 7/8] regex.3: Destandardeseify Match offsets наб
@ 2023-04-20 15:36               ` наб
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 15:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 794 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index ca0ab83df..66d9c6596 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -96,16 +96,14 @@ .SS Compilation
 searches using this pattern buffer will be case insensitive.
 .TP
 .B REG_NOSUB
-Do not report position of matches.
-The
-.I nmatch
-and
-.I pmatch
+Only report overall success:
 .BR regexec ()
-arguments will be ignored for this purpose (but
+will only use
 .I pmatch
-may still be used for
-.BR REG_STARTEND ).
+for
+.BR REG_STARTEND ,
+and ignore
+.IR nmatch .
 .TP
 .B REG_NEWLINE
 Match-any-character operators don't match a newline.
-- 
2.30.2


* Re: [PATCH v5 1/8] regex.3: Desoupify regcomp() description
  2023-04-20 15:35               ` [PATCH v5 1/8] regex.3: Desoupify regcomp() description наб
@ 2023-04-20 16:37                 ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 16:37 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2147 bytes --]

Hi наб!

On 4/20/23 17:35, наб wrote:
> Behold:
>   regerror() is passed the error code, errcode, the pattern buffer,
>   preg, a pointer to a character string buffer, errbuf, and the size
>   of the string buffer, errbuf_size.
> 
> Absolute soup. This reads to me like an ill-conceived copy from a very

Single space after period is evil.  I'd like to point you to this rant
o'mine where I give more details, to not repeat myself too much:
<https://lore.kernel.org/linux-man/9c5c5744-dde0-b333-09e0-ba9d92aa96b1@gmail.com/T/#mb4eb99c9bccb59c6df82c1f6945766c878d85f07>

I've cleaned up those crimes before applying, and then applied this
patch.  :)

Cheers,

Alex


> early standard version. It looks fine in source form but is horrific to
> read as running text.
> 
> Instead, replace all of these with just the descriptions of what they do
> with their arguments. What the arguments are is very clearly noted in
> big bold in the prototypes.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 22 +++++++---------------
>  1 file changed, 7 insertions(+), 15 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 129c42412..2f6ee816f 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -38,21 +38,13 @@ .SS Compilation
>  .BR regexec ()
>  searches.
>  .PP
> -.BR regcomp ()
> -is supplied with
> -.IR preg ,
> -a pointer to a pattern buffer storage area;
> -.IR regex ,
> -a pointer to the null-terminated string and
> -.IR cflags ,
> -flags used to determine the type of compilation.
> -.PP
> -All regular expression searching must be done via a compiled pattern
> -buffer, thus
> -.BR regexec ()
> -must always be supplied with the address of a
> -.BR regcomp ()-initialized
> -pattern buffer.
> +On success, the pattern buffer at
> +.I *preg
> +is initialized.
> +.I regex
> +is a null-terminated string.
> +The locale must be the same when running
> +.BR regexec ().
>  .PP
>  .I cflags
>  is the

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 15:35               ` [PATCH v5 3/8] regex.3: Desoupify regerror() description наб
@ 2023-04-20 16:42                 ` Alejandro Colomar
  2023-04-20 18:50                   ` наб
  2023-04-20 16:50                 ` Alejandro Colomar
  2023-04-20 17:23                 ` Alejandro Colomar
  2 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 16:42 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 4153 bytes --]

On 4/20/23 17:35, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 46 ++++++++++++++++++++--------------------------
>  1 file changed, 20 insertions(+), 26 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index ae160c9b3..c5185549b 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -26,7 +26,7 @@ .SH SYNOPSIS
>  .BI "            int " eflags );
>  .PP
>  .BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
> -.BI "            char " errbuf "[restrict ." errbuf_size "], \
> +.BI "                char " errbuf "[restrict ." errbuf_size "], \

See man-pages(7):

FORMATTING AND WORDING CONVENTIONS
       The  following  subsections  note some details for preferred formatting
       and wording conventions in various sections of the pages  in  the  man‐
       pages project.

   SYNOPSIS
       [...]

       In the SYNOPSIS, a long function prototype may  need  to  be  continued
       over  to the next line.  The continuation line is indented according to
       the following rules:

       (1)  If there is a single such prototype that needs  to  be  continued,
            then align the continuation line so that when the page is rendered
            on  a fixed‐width font device (e.g., on an xterm) the continuation
            line starts just below the start of the argument list in the  line
            above.   (Exception:  the indentation may be adjusted if necessary
            to prevent a very long continuation line or a further continuation
            line where the function prototype is very long.)  As an example:

                int tcsetattr(int fd, int optional_actions,
                              const struct termios *termios_p);

       (2)  But, where multiple functions in the SYNOPSIS require continuation
            lines, and the function names have different lengths,  then  align
            all continuation lines to start in the same column.  This provides
            a nicer rendering in PDF output (because the SYNOPSIS uses a vari‐
            able  width  font  where  spaces render narrower than most charac‐
            ters).  As an example:

                int getopt(int argc, char * const argv[],
                           const char *optstring);
                int getopt_long(int argc, char * const argv[],
                           const char *optstring,
                           const struct option *longopts, int *longindex);


>  size_t " errbuf_size );
>  .BI "void regfree(regex_t *" preg );
>  .fi
> @@ -207,34 +207,28 @@ .SS Error reporting
>  .BR regexec ()
>  into error message strings.
>  .PP
> -.BR regerror ()
> -is passed the error code,
> -.IR errcode ,
> -the pattern buffer,
> -.IR preg ,
> -a pointer to a character string buffer,
> -.IR errbuf ,
> -and the size of the string buffer,
> -.IR errbuf_size .
> -It returns the size of the
> -.I errbuf
> -required to contain the null-terminated error message string.
> -If both
> -.I errbuf
> -and
> +.I errcode
> +must be the latest error returned from an operation on
> +.IR preg .
> +If
> +.I preg
> +is a null pointer\(emthe latest error.
> +.PP
> +If
>  .I errbuf_size
> -are nonzero,
> -.I errbuf
> -is filled in with the first
> -.I "errbuf_size \- 1"
> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
> +is
> +.BR 0 ,
> +the size of the required buffer is returned.
> +Otherwise, up to
> +.I errbuf_size
> +bytes are copied to
> +.IR errbuf ;
> +the error string is always null-terminated, and truncated to fit.
>  .SS Freeing
> -Supplying
>  .BR regfree ()
> -with a precompiled pattern buffer,
> -.IR preg ,
> -will free the memory allocated to the pattern buffer by the compiling
> -process,
> +invalidates the pattern buffer at
> +.IR *preg ,
> +which must have been initialized via
>  .BR regcomp ().
>  .SH RETURN VALUE
>  .BR regcomp ()

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 15:35               ` [PATCH v5 3/8] regex.3: Desoupify regerror() description наб
  2023-04-20 16:42                 ` Alejandro Colomar
@ 2023-04-20 16:50                 ` Alejandro Colomar
  2023-04-20 17:23                 ` Alejandro Colomar
  2 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 16:50 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 562 bytes --]



On 4/20/23 17:35, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---

[...]

> -If both
> -.I errbuf
> -and

[...]

>  .I errbuf_size
> -are nonzero,

Now that I read this, it seems we should add _Nullable to errbuf too.
I'll do that.

> -.I errbuf
> -is filled in with the first
> -.I "errbuf_size \- 1"
> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).


-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 15:35               ` [PATCH v5 3/8] regex.3: Desoupify regerror() description наб
  2023-04-20 16:42                 ` Alejandro Colomar
  2023-04-20 16:50                 ` Alejandro Colomar
@ 2023-04-20 17:23                 ` Alejandro Colomar
  2023-04-20 18:46                   ` наб
  2 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 17:23 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 839 bytes --]



On 4/20/23 17:35, наб wrote:
> +.I errcode
> +must be the latest error returned from an operation on
> +.IR preg .
> +If
> +.I preg
> +is a null pointer\(emthe latest error.

I don't read that from the POSIX spec.  If preg is NULL, then I think any
error returned by a call to one of these APIs would be valid.  In fact,
since these functions are MT-Safe, they can't store any state, which leads
me to think that they can't really distinguish between the latest error,
and an error returned at a random point in the past, or even the result of
csrand_interval(x, y)[1] with appropriate x and y.

[1]:  <https://github.com/shadow-maint/shadow/blob/c80788a3ac092bc5abfa89ff48060d3f95cd5812/libmisc/csrand.c#L93>

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v5 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 15:35               ` [PATCH v5 4/8] regex.3: Improve REG_STARTEND наб
@ 2023-04-20 17:29                 ` Alejandro Colomar
  2023-04-20 19:30                   ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 17:29 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2263 bytes --]



On 4/20/23 17:35, наб wrote:
> Explicitly spell out the ranges involved. The original wording always
> confused me, but it's actually very sane.
> 
> Remove "this doesn't change REG_NOTBOL & REG_NEWLINE" ‒ so does it change
> REG_NOTEOL? No. That's weird and confusing.
> 
> String largeness doesn't matter, known-lengthness does.
> 
> Explicitly spell out the influence on returned matches
> (relative to string, not start of range).
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index c5185549b..1ce0a3b7e 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -131,23 +131,30 @@ .SS Matching
>  above).
>  .TP
>  .B REG_STARTEND
> -Use
> -.I pmatch[0]
> -on the input string, starting at byte
> -.I pmatch[0].rm_so
> -and ending before byte
> -.IR pmatch[0].rm_eo .
> +Match
> +.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
> +instead of
> +.RI [ string ", " string " + \fBstrlen\fP(" string )).
>  This allows matching embedded NUL bytes
>  and avoids a
>  .BR strlen (3)
> -on large strings.
> -It does not use
> +on known-length strings.
> +.I pmatch
> +must point to a valid readable object.

I think this is redundant, since we showed that [0] is accessed by
the function.

> +If any matches are returned
> +.RB ( REG_NOSUB
> +wasn't passed to
> +.BR regcomp (),
> +the match succeeded, and
>  .I nmatch
> -on input, and does not change
> -.B REG_NOTBOL
> -or
> -.B REG_NEWLINE
> -processing.
> +> 0), they overwrite

And of course, nmatch must be at least 1, since otherwise, [0] was
not valid, and the whole call would have been UB; right?  So that
third condition must be true to not invoke UB, so we can omit it too,
I think.

> +.I pmatch
> +as usual, and the
> +.B Match offsets
> +remain relative to
> +.IR string
> +(not
> +.IR string " + " pmatch[0].rm_so ).
>  This flag is a BSD extension, not present in POSIX.
>  .SS Match offsets
>  Unless

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-20 11:13           ` наб
@ 2023-04-20 18:33             ` G. Branden Robinson
  2023-04-20 22:29               ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: G. Branden Robinson @ 2023-04-20 18:33 UTC (permalink / raw)
  To: наб; +Cc: Alejandro Colomar (man-pages), linux-man

[-- Attachment #1: Type: text/plain, Size: 1561 bytes --]

At 2023-04-20T13:13:29+0200, наб wrote:
> On Thu, Apr 20, 2023 at 05:00:59AM -0500, G. Branden Robinson wrote:
> > At 2023-04-20T01:23:14+0200, наб wrote:
> > > +> 0), they overwrite
> > > +.I pmatch
> > > +as usual, and the
> > > +.B Byte offsets
> > > +remain relative to
> > > +.IR string
> > > I don't think "byte" needs to be capitalized here.
> I'm using it as a Sx and the section is capitalised,
> so I think this should also be?

[Note for non-mdoc(7) speakers: `Sx` is its macro for (sub)section
heading cross references.  man(7) doesn't have an equivalent, though if
there is demand, I'm happy to implement one.  :D]

Nothing I can see in man-pages(7) suggests that references to
(sub)section headings should be in an unusual typeface.  The norm in
English is usually to quote them.  It's also unusual to pun a
(sub)section heading name as an ordinary noun phrase this way.

So in this case I would neither capitalize _nor_ embolden the phrase.
After a piece of domain-specific jargon has been introduced in technical
writing (usually with italics), it is not thereafter specially marked.
In long-form works, it may get a cross reference after it in parentheses
or a footnote if it hasn't been mentioned for dozens of pages and the
reader requires a reminder.  I don't think regex(3) is large enough to
warrant that consideration, and "byte offset" seems to have the meaning
that a programmer already familiar with the individual terms would
infer.

Just the usual style coaching, not a NAK.

Regards,
Branden


* Re: [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 13:03               ` Alejandro Colomar
  2023-04-20 14:13                 ` наб
@ 2023-04-20 18:42                 ` G. Branden Robinson
  2023-04-20 22:40                   ` Alejandro Colomar
  1 sibling, 1 reply; 143+ messages in thread
From: G. Branden Robinson @ 2023-04-20 18:42 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: наб, linux-man

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

At 2023-04-20T15:03:24+0200, Alejandro Colomar wrote:
> BTW, another thing you might find useful is this:
> 
> $ cat ~/.config/git/attributes 
> *.[1-8]* diff=man
> 
> 
> And then in your .gitconfig:
> 
> [diff "man"]
> 	xfuncname = "^\\.S[SH] .*$"

Nice trick!  How on Earth have I been living without this?

> You may want to use a regex that also works for mdoc(7).

I reckon you could sweep up mdoc(7) pages as well with:

	xfuncname = "^\\.S[HShs] .*$"

> >>  .BR ntp_adjtime ()
> >>  are equivalent but differently named.)
> >>  It is a bit mask containing a
> >> -.RI bitwise- or
> >> +bitwise OR
> >>  combination of zero or more of the following bits:

Discussion of Boolean-algebraic operations is common enough among
programmers that it might be a good idea to settle on a specific style
recommendation for typesetting them.

I think either quotation (e.g., \[lq]or\[rq]) or shouting capitals (OR)
are tolerable, the latter only because the few operators commonly
mentioned have very short names (you don't see EQUIVALENCE much).

I would counsel against changing the type face for them (i.e., no bold,
no italics).

Regards,
Branden


* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 17:23                 ` Alejandro Colomar
@ 2023-04-20 18:46                   ` наб
  2023-04-20 22:45                     ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 18:46 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2622 bytes --]

Hi!

On Thu, Apr 20, 2023 at 07:23:39PM +0200, Alejandro Colomar wrote:
> On 4/20/23 17:35, наб wrote:
> > +.I errcode
> > +must be the latest error returned from an operation on
> > +.IR preg .
> > +If
> > +.I preg
> > +is a null pointer\(emthe latest error.
> I don't read that from the POSIX spec.
Whereas that's precisely where I got it from.

> If preg is NULL, then I think any
> error returned by a call to one of these APIs would be valid.
That's unspecified.

> In fact,
> since these functions are MT-Safe, they can't store any state,
Probably. OTOH, musl raw-dogs mbtowc() in regexec(), so.
(I'm pretty sure it's by accident since they do have a mbstate_t
 and juggle it a lot, but it's never actually used.)

> which leads
> me to think that they can't really distinguish between the latest error,
> and an error returned at a random point in the past, or even the result of
> csrand_interval(x, y)[1] with appropriate x and y.
Again, probably. But (line numbers from Issue 8 Draft 2.1):
57517  The regerror( ) function provides a mapping from error codes returned by regcomp( ) and
57518  regexec( ) to unspecified printable strings. It generates a string corresponding to the value of the
57519  errcode argument, which the application shall ensure is the last non-zero value returned by
57520  regcomp( ) or regexec( ) with the given value of preg. If errcode is not such a value, the content of
57521  the generated string is unspecified.

57522  If preg is a null pointer, but errcode is a value returned by a previous call to regexec( ) or regcomp( ),
57523  the regerror( ) still generates an error string corresponding to the value of errcode, but it might not
57524  be as detailed under some implementations.

57525  If the errbuf_size argument is not 0, regerror( ) shall place the generated string into the buffer of
57526  size errbuf_size bytes pointed to by errbuf. If the string (including the terminating null) cannot fit
57527  in the buffer, regerror( ) shall truncate the string and null-terminate the result.

57528  If errbuf_size is 0, regerror( ) shall ignore the errbuf argument, and return the size of the buffer
57529  needed to hold the generated string.
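
Illustration: the two errbuf_size cases quoted above can be sketched as the
usual two-call idiom.  This is a minimal sketch assuming a POSIX <regex.h>;
the helper name regerror_dup() is hypothetical.

```c
#include <regex.h>
#include <stdlib.h>

/* Return a malloc()ed copy of the message for `errcode`, or NULL if the
   allocation fails.  `preg` may be a null pointer, in which case the
   message might be less detailed on some implementations.
   The caller frees the result. */
char *regerror_dup(int errcode, const regex_t *preg)
{
    /* errbuf_size == 0: errbuf is ignored and the size of the buffer
       needed (including the terminating null byte) is returned. */
    size_t need = regerror(errcode, preg, NULL, 0);

    char *msg = malloc(need);
    if (msg != NULL) {
        /* errbuf_size != 0: the string is copied, truncated if it
           does not fit, and always null-terminated. */
        regerror(errcode, preg, msg, need);
    }
    return msg;
}
```

Passing a fixed-size buffer directly also works: the message is then truncated
to fit, and the return value still reports the size that would have been
needed.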

In these difficult times I tend to turn to what implementations do:
NetBSD, musl, illumos, and glibc, if you subtract REG_ATOI and REG_ITOA,
all essentially return lsearch(errors, errcode)->description
+ all sans NetBSD localise it.
None of them even use preg.

So yeah, I'll axe that.


And split out regfree() from this patch because I missed it.


Best,
наб


* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 16:42                 ` Alejandro Colomar
@ 2023-04-20 18:50                   ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 18:50 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

On Thu, Apr 20, 2023 at 06:42:55PM +0200, Alejandro Colomar wrote:
> On 4/20/23 17:35, наб wrote:
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -26,7 +26,7 @@ .SH SYNOPSIS
> >  .BI "            int " eflags );
> >  .PP
> >  .BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
> > -.BI "            char " errbuf "[restrict ." errbuf_size "], \
> > +.BI "                char " errbuf "[restrict ." errbuf_size "], \
> See man-pages(7):
I didn't even notice it was matching regexec()/regcomp() since they're
in a separate paragraph, it just looks like a formatting error
(and makes it so multiple functions aren't as well-delineated as they could be),
but sure.


* Re: [PATCH v4 6/6] regex.3: Destandardeseify Match offsets
  2023-04-20 15:05               ` наб
@ 2023-04-20 18:51                 ` G. Branden Robinson
  2023-04-21 11:34                 ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: G. Branden Robinson @ 2023-04-20 18:51 UTC (permalink / raw)
  To: наб; +Cc: Alejandro Colomar, linux-man

[-- Attachment #1: Type: text/plain, Size: 642 bytes --]

At 2023-04-20T17:05:53+0200, наб wrote:
> I think changing our description to
>   REG_NOSUB  Only report overall success. regexec() will only use pmatch
>              for REG_STARTEND, and ignore nmatch.
> may make that more obvious.

s/Only report/Report only/
s/only use/use only/

You might then further economize on space:

>   REG_NOSUB  Report only overall success. regexec() will use only pmatch
>              for REG_STARTEND, ignoring nmatch.

As a rule of thumb, get the adverb "only" as close to the word it
modifies as you can, because "only" can modify pretty much anything in
English.

Regards,
Branden


* Re: [PATCH v5 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 17:29                 ` Alejandro Colomar
@ 2023-04-20 19:30                   ` наб
  2023-04-20 19:33                     ` наб
  2023-04-20 23:01                     ` Alejandro Colomar
  0 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:30 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1925 bytes --]

On Thu, Apr 20, 2023 at 07:29:27PM +0200, Alejandro Colomar wrote:
> On 4/20/23 17:35, наб wrote:
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -131,23 +131,30 @@ .SS Matching
> >  above).
> >  .TP
> >  .B REG_STARTEND
> > -Use
> > -.I pmatch[0]
> > -on the input string, starting at byte
> > -.I pmatch[0].rm_so
> > -and ending before byte
> > -.IR pmatch[0].rm_eo .
> > +Match
> > +.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
> > +instead of
> > +.RI [ string ", " string " + \fBstrlen\fP(" string )).
> >  This allows matching embedded NUL bytes
> >  and avoids a
> >  .BR strlen (3)
> > -on large strings.
> > -It does not use
> > +on known-length strings.
> > +.I pmatch
> > +must point to a valid readable object.
> I think this is redundant, since we showed that [0] is accessed by
> the function.
Yeah.

> > +If any matches are returned
> > +.RB ( REG_NOSUB
> > +wasn't passed to
> > +.BR regcomp (),
> > +the match succeeded, and
> >  .I nmatch
> > -on input, and does not change
> > -.B REG_NOTBOL
> > -or
> > -.B REG_NEWLINE
> > -processing.
> > +> 0), they overwrite
> And of course, nmatch must be at least 1, since otherwise, [0] was
> not valid, and the whole call would have been UB; right?  So that
> third condition must be true to not invoke UB, so we can omit it too,
> I think.
What? idk where you got this from.
Per 0d120a3c76b4446b194a54387ce0e7a84b208bfd:
    In the regexec() signature
      regmatch_t pmatch[restrict .nmatch],
    is a simplification. It's actually
      regmatch_t pmatch[restrict
        ((.preg->flags & REG_NOSUB) ? 0 : .nmatch) ?:
         !!(.eflags & REG_STARTEND)],

If REG_STARTEND, pmatch must point to a valid readable object.
(Naturally, if you pass in uninitialised memory or a null pointer,
 then you get UB.)
nmatch is not consulted and has no bearing on this.

Best,


* Re: [PATCH v5 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 19:30                   ` наб
@ 2023-04-20 19:33                     ` наб
  2023-04-20 23:01                     ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:33 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 557 bytes --]

On Thu, Apr 20, 2023 at 09:30:06PM +0200, наб wrote:
> If REG_STARTEND, pmatch must point to a valid readable object.
> (Naturally, if you pass in uninitialised memory or a null pointer,
>  then you get UB.)
> nmatch is not consulted and has no bearing on this.
This is all to say:
  regexec(&reg, "str", 0, &rm, REG_STARTEND);
is valid, looks in ["str"+rm.rm_so, "str"+rm.rm_eo),
and doesn't change rm, whereas
  regexec(&reg, "str", 1, &rm, REG_STARTEND);
is valid, looks in ["str"+rm.rm_so, "str"+rm.rm_eo),
and will update rm with the match, if any.
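
Illustration: the two calls above, spelled out as a minimal sketch.  It
assumes an implementation that provides the REG_STARTEND extension (glibc
and the BSDs do; it is not in POSIX).  The helper name search_range() is
hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Search for the ERE `pattern` in [buf + rm->rm_so, buf + rm->rm_eo).
   With nmatch == 0, *rm is only read; with nmatch == 1, it is
   overwritten with the match, whose offsets are relative to buf.
   Returns 0 on a match, REG_NOMATCH if there is none, -1 if the
   pattern does not compile. */
int search_range(const char *pattern, const char *buf,
                 size_t nmatch, regmatch_t *rm)
{
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* REG_STARTEND: pmatch[0] delimits the input range, regardless
       of the value of nmatch. */
    int rc = regexec(&re, buf, nmatch, rm, REG_STARTEND);

    regfree(&re);
    return rc;
}
```

With buf = "foo bar foo" and rm = {4, 11}, searching for "foo" succeeds either
way; with nmatch == 0 rm stays {4, 11}, with nmatch == 1 it becomes {8, 11},
i.e. relative to buf rather than to buf + 4.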


* [PATCH v6 0/8] regex.3 momento
  2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
                                 ` (7 preceding siblings ...)
  2023-04-20 15:36               ` [PATCH v5 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB наб
@ 2023-04-20 19:36               ` наб
  2023-04-20 19:36                 ` [PATCH v6 1/8] regex.3: Desoupify regexec() description наб
                                   ` (8 more replies)
  8 siblings, 9 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 4863 bytes --]

Should include all comments; includes Branden's wording.

наб (8):
  regex.3: Desoupify regexec() description
  regex.3: Desoupify regerror() description
  regex.3: Desoupify regfree() description
  regex.3: Improve REG_STARTEND
  regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link
    regex_t.3type into regex.3
  regex.3: Finalise move of reg*.3type
  regex.3: Destandardeseify Match offsets
  regex.3: Further clarify the sole purpose of REG_NOSUB

 man3/regex.3              | 226 ++++++++++++++++++++++----------------
 man3type/regex_t.3type    |  64 +----------
 man3type/regmatch_t.3type |   2 +-
 man3type/regoff_t.3type   |   2 +-
 4 files changed, 133 insertions(+), 161 deletions(-)

Range-diff against v5:
1:  fcb8df21b < -:  --------- regex.3: Desoupify regcomp() description
2:  7240de5b7 = 1:  1ad1aa6e9 regex.3: Desoupify regexec() description
3:  108f30cd7 ! 2:  6c4d26f89 regex.3: Desoupify regerror() description
    @@ Commit message
         Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
     
      ## man3/regex.3 ##
    -@@ man3/regex.3: .SH SYNOPSIS
    - .BI "            int " eflags );
    - .PP
    - .BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
    --.BI "            char " errbuf "[restrict ." errbuf_size "], \
    -+.BI "                char " errbuf "[restrict ." errbuf_size "], \
    - size_t " errbuf_size );
    - .BI "void regfree(regex_t *" preg );
    - .fi
     @@ man3/regex.3: .SS Error reporting
      .BR regexec ()
      into error message strings.
    @@ man3/regex.3: .SS Error reporting
     -If both
     -.I errbuf
     -and
    ++If
    ++.I preg
    ++isn't a null pointer,
     +.I errcode
     +must be the latest error returned from an operation on
     +.IR preg .
    -+If
    -+.I preg
    -+is a null pointer\(emthe latest error.
     +.PP
     +If
    ++.I errbuf_size
    ++is
    ++.BR 0 ,
    ++the size of the required buffer is returned.
    ++Otherwise, up to
      .I errbuf_size
     -are nonzero,
     -.I errbuf
     -is filled in with the first
     -.I "errbuf_size \- 1"
     -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
    -+is
    -+.BR 0 ,
    -+the size of the required buffer is returned.
    -+Otherwise, up to
    -+.I errbuf_size
     +bytes are copied to
     +.IR errbuf ;
     +the error string is always null-terminated, and truncated to fit.
      .SS Freeing
    --Supplying
    + Supplying
      .BR regfree ()
    --with a precompiled pattern buffer,
    --.IR preg ,
    --will free the memory allocated to the pattern buffer by the compiling
    --process,
    -+invalidates the pattern buffer at
    -+.IR *preg ,
    -+which must have been initialized via
    - .BR regcomp ().
    - .SH RETURN VALUE
    - .BR regcomp ()
-:  --------- > 3:  4b7971a5e regex.3: Desoupify regfree() description
4:  fd1a104d6 ! 4:  5fb4cc16f regex.3: Improve REG_STARTEND
    @@ man3/regex.3: .SS Matching
     -on large strings.
     -It does not use
     +on known-length strings.
    -+.I pmatch
    -+must point to a valid readable object.
     +If any matches are returned
     +.RB ( REG_NOSUB
     +wasn't passed to
    @@ man3/regex.3: .SS Matching
     -processing.
     +> 0), they overwrite
     +.I pmatch
    -+as usual, and the
    -+.B Match offsets
    -+remain relative to
    ++as usual, and the match offsets remain relative to
     +.IR string
     +(not
     +.IR string " + " pmatch[0].rm_so ).
5:  198b7b4fa ! 5:  057a4a522 regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
    @@ Commit message
     
      ## man3/regex.3 ##
     @@ man3/regex.3: .SH SYNOPSIS
    - .BI "                char " errbuf "[restrict ." errbuf_size "], \
    - size_t " errbuf_size );
    + .BI "            char " errbuf "[_Nullable restrict ." errbuf_size ],
    + .BI "            size_t " errbuf_size );
      .BI "void regfree(regex_t *" preg );
     +.PP
     +.B typedef struct {
6:  c6bc9cfd0 = 6:  60ac1a4d1 regex.3: Finalise move of reg*.3type
7:  59b8294c8 = 7:  3313546db regex.3: Destandardeseify Match offsets
8:  2e199fc3c ! 8:  7fa669481 regex.3: Further clarify the sole purpose of REG_NOSUB
    @@ man3/regex.3: .SS Compilation
     -.I nmatch
     -and
     -.I pmatch
    -+Only report overall success:
    ++Report only overall success.
      .BR regexec ()
     -arguments will be ignored for this purpose (but
    -+will only use
    ++will use only
      .I pmatch
     -may still be used for
     -.BR REG_STARTEND ).
     +for
     +.BR REG_STARTEND ,
    -+and ignore
    ++ignoring
     +.IR nmatch .
      .TP
      .B REG_NEWLINE
-- 
2.30.2


* [PATCH v6 1/8] regex.3: Desoupify regexec() description
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
@ 2023-04-20 19:36                 ` наб
  2023-04-20 23:24                   ` Alejandro Colomar
  2023-04-20 19:36                 ` [PATCH v6 2/8] regex.3: Desoupify regerror() description наб
                                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 19:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 713 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index bedb97e87..47fe661d2 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -105,12 +105,10 @@ .SS Compilation
 .SS Matching
 .BR regexec ()
 is used to match a null-terminated string
-against the precompiled pattern buffer,
-.IR preg .
-.I nmatch
-and
-.I pmatch
-are used to provide information regarding the location of any matches.
+against the compiled pattern buffer in
+.IR *preg ,
+which must have been initialised with
+.BR regcomp ().
 .I eflags
 is the
 bitwise OR
-- 
2.30.2



* [PATCH v6 2/8] regex.3: Desoupify regerror() description
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
  2023-04-20 19:36                 ` [PATCH v6 1/8] regex.3: Desoupify regexec() description наб
@ 2023-04-20 19:36                 ` наб
  2023-04-20 19:37                 ` [PATCH v6 3/8] regex.3: Desoupify regfree() description наб
                                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:36 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1317 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 47fe661d2..3f1529583 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -207,27 +207,23 @@ .SS Error reporting
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+If
+.I preg
+isn't a null pointer,
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+.PP
+If
+.I errbuf_size
+is
+.BR 0 ,
+the size of the required buffer is returned.
+Otherwise, up to
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS Freeing
 Supplying
 .BR regfree ()
-- 
2.30.2
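
For illustration, the two-call pattern that the new regerror() wording
describes might look like the minimal sketch below: a first call with
errbuf_size 0 to learn the required size, then a second call to fill a
buffer of that size. regex_strerror() is a hypothetical helper name used
only here; it is not part of <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper (not part of <regex.h>): return a malloc()ed copy
 * of the message for errcode, or NULL if allocation fails.
 * The caller frees the result. */
char *regex_strerror(int errcode, const regex_t *preg)
{
    /* With errbuf_size == 0, errbuf is ignored and the size of the
     * required buffer (terminating null byte included) is returned. */
    size_t need = regerror(errcode, preg, NULL, 0);
    assert(need > 0);   /* the size always counts the null byte */

    char *buf = malloc(need);
    if (buf == NULL)
        return NULL;

    /* Second call: the buffer is large enough, so nothing is truncated. */
    regerror(errcode, preg, buf, need);
    return buf;
}
```

A caller that prefers a fixed buffer can skip the first call; the message
is then truncated to fit and is still null-terminated.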



* [PATCH v6 3/8] regex.3: Desoupify regfree() description
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
  2023-04-20 19:36                 ` [PATCH v6 1/8] regex.3: Desoupify regexec() description наб
  2023-04-20 19:36                 ` [PATCH v6 2/8] regex.3: Desoupify regerror() description наб
@ 2023-04-20 19:37                 ` наб
  2023-04-20 23:35                   ` Alejandro Colomar
  2023-04-20 19:37                 ` [PATCH v6 4/8] regex.3: Improve REG_STARTEND наб
                                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 19:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 735 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 3f1529583..e3dd72a74 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -225,12 +225,10 @@ .SS Error reporting
 .IR errbuf ;
 the error string is always null-terminated, and truncated to fit.
 .SS Freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+invalidates the pattern buffer at
+.IR *preg ,
+which must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2



* [PATCH v6 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
                                   ` (2 preceding siblings ...)
  2023-04-20 19:37                 ` [PATCH v6 3/8] regex.3: Desoupify regfree() description наб
@ 2023-04-20 19:37                 ` наб
  2023-04-20 23:15                   ` Alejandro Colomar
  2023-04-20 19:37                 ` [PATCH v6 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
                                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-20 19:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1636 bytes --]

Explicitly spell out the ranges involved. The original wording always
confused me, but it's actually very sane.

Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
R_NOTEOL? No. That's weird and confusing.

String largeness doesn't matter, known-lengthness does.

Explicitly spell out the influence on returned matches
(relative to string, not start of range).

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index e3dd72a74..a9bec59a9 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -131,23 +131,26 @@ .SS Matching
 above).
 .TP
 .B REG_STARTEND
-Use
-.I pmatch[0]
-on the input string, starting at byte
-.I pmatch[0].rm_so
-and ending before byte
-.IR pmatch[0].rm_eo .
+Match
+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
+instead of
+.RI [ string ", " string " + \fBstrlen\fP(" string )).
 This allows matching embedded NUL bytes
 and avoids a
 .BR strlen (3)
-on large strings.
-It does not use
+on known-length strings.
+If any matches are returned
+.RB ( REG_NOSUB
+wasn't passed to
+.BR regcomp (),
+the match succeeded, and
 .I nmatch
-on input, and does not change
-.B REG_NOTBOL
-or
-.B REG_NEWLINE
-processing.
+> 0), they overwrite
+.I pmatch
+as usual, and the match offsets remain relative to
+.IR string
+(not
+.IR string " + " pmatch[0].rm_so ).
 This flag is a BSD extension, not present in POSIX.
 .SS Match offsets
 Unless
-- 
2.30.2



* [PATCH v6 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
                                   ` (3 preceding siblings ...)
  2023-04-20 19:37                 ` [PATCH v6 4/8] regex.3: Improve REG_STARTEND наб
@ 2023-04-20 19:37                 ` наб
  2023-04-20 19:37                 ` [PATCH v6 6/8] regex.3: Finalise move of reg*.3type наб
                                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 4267 bytes --]

Move-only commit.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3              | 30 ++++++++++++++++++
 man3type/regex_t.3type    | 64 +--------------------------------------
 man3type/regmatch_t.3type |  2 +-
 man3type/regoff_t.3type   |  2 +-
 4 files changed, 33 insertions(+), 65 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index a9bec59a9..2b886eb77 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -29,6 +29,20 @@ .SH SYNOPSIS
 .BI "            char " errbuf "[_Nullable restrict ." errbuf_size ],
 .BI "            size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
+.PP
+.B typedef struct {
+.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
+.B } regex_t;
+.PP
+.B typedef struct {
+.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
+                           to start of substring */
+.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
+                           the first character after the end of
+                           substring */
+.B } regmatch_t;
+.PP
+.BR typedef " /* ... */  " regoff_t;
 .fi
 .SH DESCRIPTION
 .SS Compilation
@@ -202,6 +216,14 @@ .SS Match offsets
 .I rm_eo
 element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
+.PP
+.I regoff_t
+It is a signed integer type
+capable of storing the largest value that can be stored in either an
+.I ptrdiff_t
+type or a
+.I ssize_t
+type.
 .SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
@@ -318,6 +340,14 @@ .SH STANDARDS
 POSIX.1-2008.
 .SH HISTORY
 POSIX.1-2001.
+.PP
+Prior to POSIX.1-2008,
+the type was
+capable of storing the largest value that can be stored in either an
+.I off_t
+type or a
+.I ssize_t
+type.
 .SH EXAMPLES
 .EX
 #include <stdint.h>
diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
index 176d2c7a6..c0daaf0ff 100644
--- a/man3type/regex_t.3type
+++ b/man3type/regex_t.3type
@@ -1,63 +1 @@
-.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
-.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\"
-.TH regex_t 3type (date) "Linux man-pages (unreleased)"
-.SH NAME
-regex_t, regmatch_t, regoff_t
-\- regular expression matching
-.SH LIBRARY
-Standard C library
-.RI ( libc )
-.SH SYNOPSIS
-.EX
-.B #include <regex.h>
-.PP
-.B typedef struct {
-.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
-.B } regex_t;
-.PP
-.B typedef struct {
-.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
-                           to start of substring */
-.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
-                           the first character after the end of
-                           substring */
-.B } regmatch_t;
-.PP
-.BR typedef " /* ... */  " regoff_t;
-.EE
-.SH DESCRIPTION
-.TP
-.I regex_t
-This is a structure type used in regular expression matching.
-It holds a compiled regular expression,
-compiled with
-.BR regcomp (3).
-.TP
-.I regmatch_t
-This is a structure type used in regular expression matching.
-.TP
-.I regoff_t
-It is a signed integer type
-capable of storing the largest value that can be stored in either an
-.I ptrdiff_t
-type or a
-.I ssize_t
-type.
-.SH STANDARDS
-POSIX.1-2008.
-.SH HISTORY
-POSIX.1-2001.
-.PP
-Prior to POSIX.1-2008,
-the type was
-capable of storing the largest value that can be stored in either an
-.I off_t
-type or a
-.I ssize_t
-type.
-.SH SEE ALSO
-.BR regex (3)
+.so man3/regex.3
diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regmatch_t.3type
+++ b/man3type/regmatch_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regoff_t.3type
+++ b/man3type/regoff_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
-- 
2.30.2



* [PATCH v6 6/8] regex.3: Finalise move of reg*.3type
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
                                   ` (4 preceding siblings ...)
  2023-04-20 19:37                 ` [PATCH v6 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
@ 2023-04-20 19:37                 ` наб
  2023-04-20 19:37                 ` [PATCH v6 7/8] regex.3: Destandardeseify Match offsets наб
                                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2891 bytes --]

They're inextricably linked, not cross-referenced at all,
and not used anywhere else.

Now that they (realistically) exist to the reader, add a note
on how big nmatch can be; POSIX even says "The application developer
should note that there is probably no reason for using a value of
nmatch that is larger than preg->re_nsub+1.".

Also remove the now-duplicate regmatch_t declaration.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 54 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 2b886eb77..2e9bb13ff 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -15,7 +15,7 @@ .SH LIBRARY
 Standard C library
 .RI ( libc ", " \-lc )
 .SH SYNOPSIS
-.nf
+.EX
 .B #include <regex.h>
 .PP
 .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
@@ -43,7 +43,7 @@ .SH SYNOPSIS
 .B } regmatch_t;
 .PP
 .BR typedef " /* ... */  " regoff_t;
-.fi
+.EE
 .SH DESCRIPTION
 .SS Compilation
 .BR regcomp ()
@@ -60,6 +60,21 @@ .SS Compilation
 The locale must be the same when running
 .BR regexec ().
 .PP
+After
+.BR regcomp ()
+succeeds,
+.I preg->re_nsub
+holds the number of subexpressions in
+.IR regex .
+Thus, a value of
+.I preg->re_nsub
++ 1
+passed as
+.I nmatch
+to
+.BR regexec ()
+is sufficient to capture all matches.
+.PP
 .I cflags
 is the
 bitwise OR
@@ -192,22 +207,6 @@ .SS Match offsets
 .IR N+1 .)
 Any unused structure elements will contain the value \-1.
 .PP
-The
-.I regmatch_t
-structure which is the type of
-.I pmatch
-is defined in
-.IR <regex.h> .
-.PP
-.in +4n
-.EX
-typedef struct {
-    regoff_t rm_so;
-    regoff_t rm_eo;
-} regmatch_t;
-.EE
-.in
-.PP
 Each
 .I rm_so
 element that is not \-1 indicates the start offset of the next largest
@@ -218,7 +217,7 @@ .SS Match offsets
 which is the offset of the first character after the matching text.
 .PP
 .I regoff_t
-It is a signed integer type
+is a signed integer type
 capable of storing the largest value that can be stored in either an
 .I ptrdiff_t
 type or a
@@ -342,12 +341,27 @@ .SH HISTORY
 POSIX.1-2001.
 .PP
 Prior to POSIX.1-2008,
-the type was
+.I regoff_t
+was required to be
 capable of storing the largest value that can be stored in either an
 .I off_t
 type or a
 .I ssize_t
 type.
+.SH NOTES
+.I re_nsub
+is only required to be initialized if
+.B REG_NOSUB
+wasn't specified, but all known implementations initialize it regardless.
+.\" glibc, musl, 4.4BSD, illumos
+.PP
+Both
+.I regex_t
+and
+.I regmatch_t
+may (and do) have more members, in any order.
+Always reference them by name.
+.\" illumos has two more start/end pairs and the first one is of pointers
 .SH EXAMPLES
 .EX
 #include <stdint.h>
-- 
2.30.2
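
The re_nsub + 1 note added by this patch can be sketched as follows.
This is only an illustration: match_all_groups() is a hypothetical
helper name, and the choice of REG_EXTENDED is an assumption made for
the example.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: compile pattern as an ERE, match it against
 * text, and return a malloc()ed array of re_nsub + 1 elements
 * (overall match plus one slot per subexpression).
 * Returns NULL on compile error, no match, or allocation failure.
 * On success, *count is set to the number of elements. */
regmatch_t *match_all_groups(const char *pattern, const char *text,
                             size_t *count)
{
    assert(count != NULL);

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    /* re_nsub + 1 is sufficient to capture all matches. */
    size_t n = re.re_nsub + 1;
    regmatch_t *pm = malloc(n * sizeof *pm);

    if (pm != NULL && regexec(&re, text, n, pm, 0) != 0) {
        free(pm);
        pm = NULL;
    }

    regfree(&re);
    if (pm != NULL)
        *count = n;
    return pm;
}
```

The members are referenced by name (re_nsub, rm_so, rm_eo), in line with
the NOTES text above: the structures may contain more members, in any
order.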



* [PATCH v6 7/8] regex.3: Destandardeseify Match offsets
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
                                   ` (5 preceding siblings ...)
  2023-04-20 19:37                 ` [PATCH v6 6/8] regex.3: Finalise move of reg*.3type наб
@ 2023-04-20 19:37                 ` наб
  2023-04-20 19:37                 ` [PATCH v6 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB наб
  2023-04-21  2:01                 ` [PATCH v6 0/8] regex.3 momento Alejandro Colomar
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2194 bytes --]

This section reads like it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 2e9bb13ff..7b91f5b30 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -184,37 +184,34 @@ .SS Matching
 .SS Match offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first expression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ string " + " rm_so ", " string " + " rm_eo ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2
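
As a rough illustration of the rewritten Match offsets text, the sketch
below walks a filled-in pmatch array, skips the unused (-1) elements,
and prints each valid match as the range [string + rm_so, string + rm_eo).
print_matches() is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: print every valid element of pmatch as the
 * byte range [string + rm_so, string + rm_eo).
 * Returns the number of valid (non -1) elements. */
size_t print_matches(const char *string, const regmatch_t *pmatch,
                     size_t nmatch)
{
    assert(string != NULL && pmatch != NULL);

    size_t valid = 0;
    for (size_t i = 0; i < nmatch; i++) {
        /* Unused elements are filled with -1. */
        if (pmatch[i].rm_so == -1)
            continue;

        int len = (int) (pmatch[i].rm_eo - pmatch[i].rm_so);
        printf("pmatch[%zu] = \"%.*s\"\n",
               i, len, string + pmatch[i].rm_so);
        valid++;
    }
    return valid;
}
```

pmatch[0] is the entire match, pmatch[1] the first subexpression, and so
on; a subexpression that did not take part in the match is reported as -1
just like a surplus element.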


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v6 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
                                   ` (6 preceding siblings ...)
  2023-04-20 19:37                 ` [PATCH v6 7/8] regex.3: Destandardeseify Match offsets наб
@ 2023-04-20 19:37                 ` наб
  2023-04-21  2:01                 ` [PATCH v6 0/8] regex.3 momento Alejandro Colomar
  8 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 19:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 792 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 7b91f5b30..4c450bd7f 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -96,16 +96,14 @@ .SS Compilation
 searches using this pattern buffer will be case insensitive.
 .TP
 .B REG_NOSUB
-Do not report position of matches.
-The
-.I nmatch
-and
-.I pmatch
+Report only overall success.
 .BR regexec ()
-arguments will be ignored for this purpose (but
+will use only
 .I pmatch
-may still be used for
-.BR REG_STARTEND ).
+for
+.BR REG_STARTEND ,
+ignoring
+.IR nmatch .
 .TP
 .B REG_NEWLINE
 Match-any-character operators don't match a newline.
-- 
2.30.2
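
A minimal sketch of the "report only overall success" case, assuming no
REG_STARTEND is in play so that pmatch is not needed at all.
regex_contains() is a hypothetical helper name, and REG_EXTENDED is an
assumption made for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if text contains a match for the ERE
 * pattern, false on no match or on a compile error. */
bool regex_contains(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    /* With REG_NOSUB and without REG_STARTEND, nmatch and pmatch
     * are not used, so 0 and NULL are passed here. */
    int rc = regexec(&re, text, 0, NULL, 0);

    regfree(&re);
    return rc == 0;
}
```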


* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-20 18:33             ` G. Branden Robinson
@ 2023-04-20 22:29               ` Alejandro Colomar
  2023-04-21  5:00                 ` G. Branden Robinson
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 22:29 UTC (permalink / raw)
  To: G. Branden Robinson, наб; +Cc: linux-man, groff


[-- Attachment #1.1: Type: text/plain, Size: 739 bytes --]

Hi Branden,

On 4/20/23 20:33, G. Branden Robinson wrote:
> [Note for non-mdoc(7) speakers: `Sx` is its macro for (sub)section
> heading cross references.  man(7) doesn't have an equivalent, though if
> there is demand, I'm happy to implement one.  :D]

I've been delaying my global switch to non-shouting sexion headings, due
to not having a clear idea of how to refer to them.  Having a macro that
does that for me, and ensures that the appropriate formatting is applied
might be a good solution.  It would also please the info(1) people, so
that the few references we have to those would be linked.

Cheers,
Alex


-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix
  2023-04-20 18:42                 ` G. Branden Robinson
@ 2023-04-20 22:40                   ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 22:40 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: наб, linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1868 bytes --]

Hi Branden,

On 4/20/23 20:42, G. Branden Robinson wrote:
> At 2023-04-20T15:03:24+0200, Alejandro Colomar wrote:
>> BTW, another thing you might find useful is this:
>>
>> $ cat ~/.config/git/attributes 
>> *.[1-8]* diff=man
>>
>>
>> And then in your .gitconfig:
>>
>> [diff "man"]
>> 	xfuncname = "^\\.S[SH] .*$"
> 
> Nice trick!  How on Earth have I been living without this?

I don't remember how I found this obscure git(1) configuration.  I
think I was reviewing some patch at work and the hunk was complete
garbage, and we pulled some threads...  Itchy and Scratchy :)

> 
>> You may want to use a regex that also works for mdoc(7).
> 
> I reckon you could sweep up mdoc(7) pages as well with:
> 
> 	xfuncname = "^\\.S[HShs] .*$"

Already fixed(:

<http://www.alejandro-colomar.es/src/alx/alx/config.git/commit/?id=4e772e3e3fe0785d773cf702b115dfc3d20d90d5>

> 
>>>>  .BR ntp_adjtime ()
>>>>  are equivalent but differently named.)
>>>>  It is a bit mask containing a
>>>> -.RI bitwise- or
>>>> +bitwise OR
>>>>  combination of zero or more of the following bits:
> 
> Discussion of Boolean-algebraic operations is common enough among
> programmers that it might be a good idea to settle on a specific style
> recommendation for typesetting them.
> 
> I think either quotation (e.g., \[lq]or\[rq]) or shouting capitals (OR)
> are tolerable, the latter only because the few operators commonly
> mentioned have very short names (you don't see EQUIVALENCE much).

I'm used to seeing uppercase OR, AND, NAND, XOR, and similar names in
Electronics.  I'll vote for that.

> 
> I would counsel against changing the type face for them (i.e., no bold,
> no italics).
> 
> Regards,
> Branden

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 18:46                   ` наб
@ 2023-04-20 22:45                     ` Alejandro Colomar
  2023-04-20 23:05                       ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 22:45 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3461 bytes --]

Hi,

On 4/20/23 20:46, наб wrote:
> Hi!
> 
> On Thu, Apr 20, 2023 at 07:23:39PM +0200, Alejandro Colomar wrote:
>> On 4/20/23 17:35, наб wrote:
>>> +.I errcode
>>> +must be the latest error returned from an operation on
>>> +.IR preg .
>>> +If
>>> +.I preg
>>> +is a null pointer\(emthe latest error.
>> I don't read that from the POSIX spec.
> Whereas that's precisely where I got it from.

Here's the quote I think is the most relevant (you also quoted it
below):

       If preg is a null pointer, but errcode is a value returned by a
       previous call to regexec() or regcomp(), the regerror() still
       generates an error string corresponding to the value of errcode,
       but it might not be as detailed under some implementations.


> 
>> If preg is NULL, then I think any
>> error returned by a call to one of these APIs would be valid.
> That's unspecified.

I don't think so.  POSIX says a "previous call".  It doesn't say the
"latest" or "immediately preceding" or similar wording.  Don't you
understand the same from that paragraph?

> 
>> In fact,
>> since these functions are MT-Safe, they can't store any state,
> Probably. OTOH, musl raw-dogs mbtowc() in regexec(), so.
> (I'm pretty sure it's by accident since they do have a mbstate_t
>  and juggle it a lot, but it's never actually used.)
> 
>> which leads
>> me to think that they can't really distinguish between the latest error,
>> and an error returned at a random point in the past, or even the result of
>> csrand_interval(x, y)[1] with appropriate x and y.
> Again, probably. But (line numbers from Issue 8 Draft 2.1):
> 57517  The regerror( ) function provides a mapping from error codes returned by regcomp( ) and
> 57518  regexec( ) to unspecified printable strings. It generates a string corresponding to the value of the
> 57519  errcode argument, which the application shall ensure is the last non-zero value returned by
> 57520  regcomp( ) or regexec( ) with the given value of preg. If errcode is not such a value, the content of
> 57521  the generated string is unspecified.
> 
> 57522  If preg is a null pointer, but errcode is a value returned by a previous call to regexec( ) or regcomp( ),
> 57523  the regerror( ) still generates an error string corresponding to the value of errcode, but it might not
> 57524  be as detailed under some implementations.
> 
> 57525  If the errbuf_size argument is not 0, regerror( ) shall place the generated string into the buffer of
> 57526  size errbuf_size bytes pointed to by errbuf. If the string (including the terminating null) cannot fit
> 57527  in the buffer, regerror( ) shall truncate the string and null-terminate the result.
> 
> 57528  If errbuf_size is 0, regerror( ) shall ignore the errbuf argument, and return the size of the buffer
> 57529  needed to hold the generated string.
> 
> In these difficult times I tend to turn to what implementations do:
> NetBSD, musl, illumos, and glibc, if you subtract REG_ATOI and REG_ITOA,
> all essentially return lsearch(errors, errcode)->description
> + all sans NetBSD localise it.
> None of them even use preg.
> 
> So yeah, I'll axe that.
> 
> 
> And split out regfree() from this patch because I missed it.

Thanks,

Alex

> 
> 
> Best,
> наб

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v5 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 19:30                   ` наб
  2023-04-20 19:33                     ` наб
@ 2023-04-20 23:01                     ` Alejandro Colomar
  2023-04-21  0:13                       ` наб
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 23:01 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2801 bytes --]

Hi наб,

On 4/20/23 21:30, наб wrote:
> On Thu, Apr 20, 2023 at 07:29:27PM +0200, Alejandro Colomar wrote:
>> On 4/20/23 17:35, наб wrote:
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -131,23 +131,30 @@ .SS Matching
>>>  above).
>>>  .TP
>>>  .B REG_STARTEND
>>> -Use
>>> -.I pmatch[0]
>>> -on the input string, starting at byte
>>> -.I pmatch[0].rm_so
>>> -and ending before byte
>>> -.IR pmatch[0].rm_eo .
>>> +Match
>>> +.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
>>> +instead of
>>> +.RI [ string ", " string " + \fBstrlen\fP(" string )).
>>>  This allows matching embedded NUL bytes
>>>  and avoids a
>>>  .BR strlen (3)
>>> -on large strings.
>>> -It does not use
>>> +on known-length strings.
>>> +.I pmatch
>>> +must point to a valid readable object.
>> I think this is redundant, since we showed that [0] is accessed by
>> the function.
> Yeah.
> 
>>> +If any matches are returned
>>> +.RB ( REG_NOSUB
>>> +wasn't passed to
>>> +.BR regcomp (),
>>> +the match succeeded, and
>>>  .I nmatch
>>> -on input, and does not change
>>> -.B REG_NOTBOL
>>> -or
>>> -.B REG_NEWLINE
>>> -processing.
>>> +> 0), they overwrite
>> And of course, nmatch must be at least 1, since otherwise, [0] was
>> not valid, and the whole call would have been UB; right?  So that
>> third condition must be true to not invoke UB, so we can omit it too,
>> I think.
> What? idk where you got this from.
> Per 0d120a3c76b4446b194a54387ce0e7a84b208bfd:
>     In the regexec() signature
>       regmatch_t pmatch[restrict .nmatch],
>     is a simplification. It's actually
>       regmatch_t pmatch[restrict
>         ((.preg->flags & REG_NOSUB) ? 0 : .nmatch) ?:
>          !!(.eflags & REG_STARTEND)],

That is a model that was useful in a commit message to describe more
or less what happens.  It doesn't need to perfectly describe reality.
Since REG_STARTEND is not in POSIX, we can't read what POSIX says,
so it's all up to how much implementations want to guarantee.  I
don't think glibc would like to allow specifying .nmatch as 0 while
the function accesses [0].  The fact that the current implementation
doesn't open Hell's doors to nasal demons doesn't mean it can't do
so in the future.  I conceive that _FORTIFY_SOURCE could reasonably
check that pmatch[] has at least .nmatch elements, and I don't want
to preclude that in the documentation.
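
For what it's worth, a call that stays valid under either reading is
one that passes nmatch >= 1 together with REG_STARTEND.  A minimal
sketch, assuming a libc that defines REG_STARTEND (glibc and the BSDs
do); match_range() is a name made up for this illustration, not an
existing API:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match pattern against bytes [so, eo) of buf.
 * Returns 0 on a match and stores the match bounds, which are offsets
 * from buf (not from buf + so); returns nonzero otherwise. */
static int
match_range(const char *pattern, const char *buf, regoff_t so, regoff_t eo,
            regoff_t *match_so, regoff_t *match_eo)
{
	regex_t     re;
	regmatch_t  pm[1];
	int         ret;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;

	/* On input, pm[0] delimits the range to search. */
	pm[0].rm_so = so;
	pm[0].rm_eo = eo;

	/* nmatch is 1, so pm[0] is within bounds however the
	 * [.nmatch] in the prototype is interpreted. */
	ret = regexec(&re, buf, 1, pm, REG_STARTEND);
	if (ret == 0) {
		*match_so = pm[0].rm_so;
		*match_eo = pm[0].rm_eo;
	}

	regfree(&re);
	return ret;
}
```

With "aabbbcc" and the range [3, 7), matching "b+" should report a
match at offsets counted from the start of the string, not from byte 3.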

Cheers,
Alex

> 
> If REG_STARTEND, pmatch must point to a valid readable object.
> (Naturally, if you pass in uninitialised memory or a null pointer,
>  then you get UB.)
> nmatch is not consulted and has no bearing on this.
> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v5 3/8] regex.3: Desoupify regerror() description
  2023-04-20 22:45                     ` Alejandro Colomar
@ 2023-04-20 23:05                       ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-20 23:05 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 878 bytes --]

Hi!

On Fri, Apr 21, 2023 at 12:45:16AM +0200, Alejandro Colomar wrote:
> On 4/20/23 20:46, наб wrote:
> > On Thu, Apr 20, 2023 at 07:23:39PM +0200, Alejandro Colomar wrote:
> >> If preg is NULL, then I think any
> >> error returned by a call to one of these APIs would be valid.
> > That's unspecified.
> I don't think so.  POSIX says a "previous call".  It doesn't say the
> "latest" or "immediately preceding" or similar wording.  Don't you
> understand the same from that paragraph?
I read "a previous" as a shorthand for "the last non-zero value returned
by regcomp( ) or regexec( )" from above originally; but yeah, now that
you mention it, "just any returned error" is a valid read.

I think just "must be latest if preg passed" is what ended up in v6
on the grounds of realism; if it's also the precise letter of POSIX then
all the better.
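
For reference, the usage that's fine under both reads is handing
regerror() the code from the immediately preceding call together with
the preg it came from.  A sketch; regex_errmsg() is a made-up name,
not part of any API:

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: try to compile pattern.  On failure, return a
 * malloc()ed copy of the error message (the caller frees it); on
 * success, return NULL. */
static char *
regex_errmsg(const char *pattern, int cflags)
{
	regex_t  re;
	int      err;
	size_t   len;
	char     *msg;

	err = regcomp(&re, pattern, cflags);
	if (err == 0) {
		regfree(&re);
		return NULL;
	}

	/* With errbuf_size 0, regerror() only reports the buffer size
	 * needed, including the terminating null byte. */
	len = regerror(err, &re, NULL, 0);
	msg = malloc(len);
	if (msg != NULL)
		regerror(err, &re, msg, len);

	/* regcomp() failed, so there is no compiled pattern to
	 * regfree() on this path. */
	return msg;
}
```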

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 19:37                 ` [PATCH v6 4/8] regex.3: Improve REG_STARTEND наб
@ 2023-04-20 23:15                   ` Alejandro Colomar
  2023-04-21  0:39                     ` [PATCH v7 " наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 23:15 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2459 bytes --]



On 4/20/23 21:37, наб wrote:
> Explicitly spell out the ranges involved. The original wording always
> confused me, but it's actually very sane.
> 
> Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
> R_NOTEOL? No. That's weird and confusing.
> 
> String largeness doesn't matter, known-lengthness does.
> 
> Explicitly spell out the influence on returned matches
> (relative to string, not start of range).
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 29 ++++++++++++++++-------------
>  1 file changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index e3dd72a74..a9bec59a9 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -131,23 +131,26 @@ .SS Matching
>  above).
>  .TP
>  .B REG_STARTEND
> -Use
> -.I pmatch[0]
> -on the input string, starting at byte
> -.I pmatch[0].rm_so
> -and ending before byte
> -.IR pmatch[0].rm_eo .
> +Match
> +.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
> +instead of
> +.RI [ string ", " string " + \fBstrlen\fP(" string )).

See man-pages(7):

       Expressions, if not written on a separate indented line, should
       be  specified in italics.  Again, the use of nonbreaking spaces
       may be appropriate if the expression  is  inlined  with  normal
       text.

strlen(string) is an expression, not a man page reference, so it should
go in full italics.  The + is also part of the expression, so it should
also go in italics.  I suggest:

.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
.RI [ string , " string + strlen(string)" ).

>  This allows matching embedded NUL bytes
>  and avoids a
>  .BR strlen (3)
> -on large strings.
> -It does not use
> +on known-length strings.
> +If any matches are returned
> +.RB ( REG_NOSUB
> +wasn't passed to
> +.BR regcomp (),
> +the match succeeded, and
>  .I nmatch
> -on input, and does not change
> -.B REG_NOTBOL
> -or
> -.B REG_NEWLINE
> -processing.
> +> 0), they overwrite
> +.I pmatch
> +as usual, and the match offsets remain relative to
> +.IR string
> +(not
> +.IR string " + " pmatch[0].rm_so ).

Similar stuff here.

>  This flag is a BSD extension, not present in POSIX.
>  .SS Match offsets
>  Unless

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 1/8] regex.3: Desoupify regexec() description
  2023-04-20 19:36                 ` [PATCH v6 1/8] regex.3: Desoupify regexec() description наб
@ 2023-04-20 23:24                   ` Alejandro Colomar
  2023-04-21  0:33                     ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 23:24 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1122 bytes --]

Hi nab,

On 4/20/23 21:36, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index bedb97e87..47fe661d2 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -105,12 +105,10 @@ .SS Compilation
>  .SS Matching
>  .BR regexec ()
>  is used to match a null-terminated string
> -against the precompiled pattern buffer,
> -.IR preg .
> -.I nmatch
> -and
> -.I pmatch
> -are used to provide information regarding the location of any matches.
> +against the compiled pattern buffer in
> +.IR *preg ,
> +which must have been initialised with
> +.BR regexec ().

This patch removes the nmatch and pmatch info before presumably we add
it in a subsequent patch.  I prefer if the patch that documents that would
go either before this one, or right after this one.

Cheers,
Alex

>  .I eflags
>  is the
>  bitwise OR

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 3/8] regex.3: Desoupify regfree() description
  2023-04-20 19:37                 ` [PATCH v6 3/8] regex.3: Desoupify regfree() description наб
@ 2023-04-20 23:35                   ` Alejandro Colomar
  2023-04-21  0:27                     ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-20 23:35 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1512 bytes --]



On 4/20/23 21:37, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 3f1529583..e3dd72a74 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -225,12 +225,10 @@ .SS Error reporting
>  .IR errbuf ;
>  the error string is always null-terminated, and truncated to fit.
>  .SS Freeing
> -Supplying
>  .BR regfree ()
> -with a precompiled pattern buffer,
> -.IR preg ,
> -will free the memory allocated to the pattern buffer by the compiling
> -process,
> +invalidates the pattern buffer at

While this ("invalidates") is true, it omits the most important information:
it frees the object.  I think it's better to say that it frees (or
deallocates) the object and any memory allocated within it, since that
already implies invalidating it (due to
<https://port70.net/~nsz/c/c11/n1570.html#6.2.4p2> and
<https://port70.net/~nsz/c/c11/n1570.html#7.22.3p1>), and also tells why
it's necessary to call this function.  Otherwise, it's not clear why we
should call it.  Why would I want to invalidate a buffer?  We can call
memfrob(3) for that :p  Or for secure stuff, arc4random(3).

> +.IR *preg ,
> +which must have been initialized via
>  .BR regcomp ().
>  .SH RETURN VALUE
>  .BR regcomp ()

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v5 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 23:01                     ` Alejandro Colomar
@ 2023-04-21  0:13                       ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-21  0:13 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 13218 bytes --]

On Fri, Apr 21, 2023 at 01:01:11AM +0200, Alejandro Colomar wrote:
> On 4/20/23 21:30, наб wrote:
> > On Thu, Apr 20, 2023 at 07:29:27PM +0200, Alejandro Colomar wrote:
> >> On 4/20/23 17:35, наб wrote:
> >>> --- a/man3/regex.3
> >>> +++ b/man3/regex.3
> >>> @@ -131,23 +131,30 @@ .SS Matching
> >>>  above).
> >>>  .TP
> >>>  .B REG_STARTEND
> >>> -Use
> >>> -.I pmatch[0]
> >>> -on the input string, starting at byte
> >>> -.I pmatch[0].rm_so
> >>> -and ending before byte
> >>> -.IR pmatch[0].rm_eo .
> >>> +Match
> >>> +.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
> >>> +instead of
> >>> +.RI [ string ", " string " + \fBstrlen\fP(" string )).
> >>>  This allows matching embedded NUL bytes
> >>>  and avoids a
> >>>  .BR strlen (3)
> >>> -on large strings.
> >>> -It does not use
> >>> +on known-length strings.
> >>> +.I pmatch
> >>> +must point to a valid readable object.
> >> I think this is redundant, since we showed that [0] is accessed by
> >> the function.
> > Yeah.
> > 
> >>> +If any matches are returned
> >>> +.RB ( REG_NOSUB
> >>> +wasn't passed to
> >>> +.BR regcomp (),
> >>> +the match succeeded, and
> >>>  .I nmatch
> >>> -on input, and does not change
> >>> -.B REG_NOTBOL
> >>> -or
> >>> -.B REG_NEWLINE
> >>> -processing.
> >>> +> 0), they overwrite
> >> And of course, nmatch must be at least 1, since otherwise, [0] was
> >> not valid, and the whole call would have been UB; right?  So that
> >> third condition must be true to not invoke UB, so we can omit it too,
> >> I think.
> > What? idk where you got this from.
> > Per 0d120a3c76b4446b194a54387ce0e7a84b208bfd:
> >     In the regexec() signature
> >       regmatch_t pmatch[restrict .nmatch],
> >     is a simplification. It's actually
> >       regmatch_t pmatch[restrict
> >         ((.preg->flags & REG_NOSUB) ? 0 : .nmatch) ?:
> >          !!(.eflags & REG_STARTEND)],
> That is a model that was useful in a commit message to describe more
> or less what happens.  It doesn't need to perfectly describe reality.
> Since REG_STARTEND is not in POSIX, we can't read what POSIX says,
> so it's all up to how much implementations want to guarantee.  I
> don't think glibc would like to allow specifying .nmatch as 0 while
> the function accesses [0].  The fact that the current implementation
> doesn't open Hell's doors to nasal demons doesn't mean it can't do
> so in the future.  I conceive that _FORTIFY_SOURCE could reasonably
> check that pmatch[] has at least .nmatch elements, and I don't want
> to preclude that in the documentation.
What? I don't get this. Who cares what POSIX says about this 4.4BSD
extension?

This interface has been unchanged for over 30 years;
4.4BSD-Lite, /usr/src/lib/libc/regex/regexec.c:
    *      @(#)regexec.c   8.1 (Berkeley) 6/4/93

    int                             /* 0 success, REG_NOMATCH failure */
    regexec(preg, string, nmatch, pmatch, eflags)
    const regex_t *preg;
    const char *string;
    size_t nmatch;
    regmatch_t pmatch[];
    int eflags;
    {
            register struct re_guts *g = preg->re_g;
    #ifdef REDEBUG
    #       define  GOODFLAGS(f)    (f)
    #else
    #       define  GOODFLAGS(f)    ((f)&(REG_NOTBOL|REG_NOTEOL|REG_STARTEND))
    #endif
    
            if (preg->re_magic != MAGIC1 || g->magic != MAGIC2)
                    return(REG_BADPAT);
            assert(!(g->iflags&BAD));
            if (g->iflags&BAD)              /* backstop for no-debug case */
                    return(REG_BADPAT);
            if (eflags != GOODFLAGS(eflags))
                    return(REG_INVARG);
    
            if (g->nstates <= CHAR_BIT*sizeof(states1) && !(eflags&REG_LARGE))
                    return(smatcher(g, (char *)string, nmatch, pmatch, eflags));
            else
                    return(lmatcher(g, (char *)string, nmatch, pmatch, eflags));
    }

4.4BSD-Lite, /usr/src/lib/libc/regex/engine.c:
    *      @(#)engine.c    8.1 (Berkeley) 6/4/93

    /*
     * The matching engine and friends.  This file is #included by regexec.c
     * after suitable #defines of a variety of macros used herein, so that
     * different state representations can be used without duplicating masses
     * of code.
     */

    #ifdef SNAMES
    #define matcher smatcher
    
    #ifdef LNAMES
    #define matcher lmatcher
    
    /*
     - matcher - the actual matching engine
     == static int matcher(register struct re_guts *g, char *string, \
     ==     size_t nmatch, regmatch_t pmatch[], int eflags);
     */
    static int                      /* 0 success, REG_NOMATCH failure */
    matcher(g, string, nmatch, pmatch, eflags)
    register struct re_guts *g;
    char *string;
    size_t nmatch;
    regmatch_t pmatch[];
    int eflags;
    {
            register char *endp;
            register int i;
            struct match mv;
            register struct match *m = &mv;
            register char *dp;
            const register sopno gf = g->firststate+1;      /* +1 for OEND */
            const register sopno gl = g->laststate;
            char *start;
            char *stop;
    
            /* simplify the situation where possible */
            if (g->cflags&REG_NOSUB)
                    nmatch = 0;
            if (eflags&REG_STARTEND) {
                    start = string + pmatch[0].rm_so;
                    stop = string + pmatch[0].rm_eo;
            } else {
                    start = string;
                    stop = start + strlen(start);
            }
            if (stop < start)
                    return(REG_INVARG);
(rest of matcher)
            /* fill in the details if requested */
            if (nmatch > 0) {
                    pmatch[0].rm_so = m->coldp - m->offp;
                    pmatch[0].rm_eo = endp - m->offp;
            }
            if (nmatch > 1) {
                    assert(m->pmatch != NULL);
                    for (i = 1; i < nmatch; i++)
                            if (i <= m->g->nsub)
                                    pmatch[i] = m->pmatch[i];
                            else {
                                    pmatch[i].rm_so = -1;
                                    pmatch[i].rm_eo = -1;
                            }
            }

That's what the interface /is/ (also, I was guessing last time from
 behaviour and wrote the exact same pseudocode; fun).

And, tell you what, musl also does if(REG_NOSUB) nmatch = 0;
so does the illumos gate; glibc does
    int
    regexec (const regex_t *__restrict preg, const char *__restrict string,
             size_t nmatch, regmatch_t pmatch[_REGEX_NELTS (nmatch)], int eflags)
    {
      reg_errcode_t err;
      Idx start, length;
      re_dfa_t *dfa = preg->buffer;
    
      if (eflags & ~(REG_NOTBOL | REG_NOTEOL | REG_STARTEND))
        return REG_BADPAT;
    
      if (eflags & REG_STARTEND)
        {
          start = pmatch[0].rm_so;
          length = pmatch[0].rm_eo;
        }
      else
        {
          start = 0;
          length = strlen (string);
        }
    
      lock_lock (dfa->lock);
      if (preg->no_sub)
        err = re_search_internal (preg, string, length, start, length,
                                  length, 0, NULL, eflags);
      else
        err = re_search_internal (preg, string, length, start, length,
                                  length, nmatch, pmatch, eflags);
      lock_unlock (dfa->lock);
      return err != REG_NOERROR;
    }
i.e. it sets nmatch to 0 if REG_NOSUB, but later.
None of them do
  if (eflags & REG_STARTEND && !nmatch)
    ... what now? return an error?
for the sole purpose of... providing an interface that's broken?

nmatch is the amount of matches you care about getting back,
and nothing more.


If anything, the POSIX header is (Issue 8 Draft 2.1):
11030  The following shall be declared as functions and may also be defined as macros. Function
11031  prototypes shall be provided.
11032  int    regcomp(regex_t *restrict, const char *restrict, int);
11033  size_t regerror(int, const regex_t *restrict, char *restrict, size_t);
11034  int    regexec(const regex_t *restrict, const char *restrict, size_t,
11035            regmatch_t [restrict], int);
11036  void   regfree(regex_t *);


So you've overconstrained the interface for simplicity,
and now you're treating the simplification as a ground truth of..?

And glibc, if anything, would love for you to specify the start and end
bounds with REG_STARTEND while also passing nmatch = 0,
because it additionally optimises for that case (&& no backrefs).


/And also/, 6.7.6.2 Array declarators says:
  Constraints
  1 In addition to optional type qualifiers and the keyword static, the
  [ and ] may delimit an expression or *. If they delimit an expression
  (which specifies the size of an array), the expression shall have an
  integer type. If the expression is a constant expression, it shall
  have a value greater than zero. The element type shall not be an
  incomplete or function type. The optional type qualifiers and the
  keyword static shall appear only in a declaration of a function
  parameter with an array type, and then only in the outermost array
  type derivation.

  2 If an identifier is declared as having a variably modified type, it
  shall be an ordinary identifier (as defined in 6.2.3), have no
  linkage, and have either block scope or function prototype scope. If
  an identifier is declared to be an object with static or thread
  storage duration, it shall not have a variable length array type.

  Semantics
  3 If, in the declaration "T D1", D1 has one of the forms:
     D [        type-qualifier-list(opt)        assignment-expression(opt) ] attribute-specifier-sequence(opt)
     D [ static type-qualifier-list(opt)        assignment-expression      ] attribute-specifier-sequence(opt)
     D [        type-qualifier-list      static assignment-expression      ] attribute-specifier-sequence(opt)
     D [        type-qualifier-list(opt) *                                 ] attribute-specifier-sequence(opt)
  and the type specified for /ident/ in the declaration "T D" is
  "derived-declarator-type-list T", then the type specified for /ident/
  is "derived-declarator-type-list array of T".172)173) The optional
  attribute specifier sequence appertains to the array. (See 6.7.6.3 for
  the meaning of the optional type qualifiers and the keyword static.)

Where 6.7.6.3 Function declarators says:
  6 A declaration of a parameter as "array of /type/" shall be adjusted
  to "qualified pointer to /type/", where the /type/ qualifiers (if any)
  are those specified within the [ and ] of the array type derivation.
  If the keyword static also appears within the [ and ] of the array
  type derivation, then for each call to the function, the value of the
  corresponding actual argument shall provide access to the first
  element of an array with at least as many elements as specified by the
  size expression.

So even /if/ the declaration was
  int
  regexec(const regex_t *restrict, const char *restrict, size_t nmatch,
          regmatch_t pmatch[restrict static nmatch], int);
/which it isn't/, not even in glibc (#define _REGEX_NELTS(n) n, or to
empty depending on the environment, which means it's a regular
variably-modified array type, which means nothing when used in a
function prototype), it would /still/ be legal to do any of the below:
  regexec(regp, "", 0, NULL, 0);
  regmatch_t rm;
  regexec(regp, "", 0, &rm, 0);
  regexec(regp, "", 1, &rm, 0);
  regmatch_t rms[999];
  for(int i = 0; i < 999; ++i)
    regexec(regp, "", i, rms, 0);

More to the point, perhaps, 6.7.6.3 continues:
  20 EXAMPLE 5
  The following are all compatible function prototype declarators.
    double maximum(int n, int m, double a[n][m]);
    double maximum(int n, int m, double a[*][*]);
    double maximum(int n, int m, double a[ ][*]);
    double maximum(int n, int m, double a[ ][m]);
  as are:
    void f(double (*restrict a)[5]);
    void f(double a[restrict][5]);
    void f(double a[restrict 3][5]);
    void f(double a[restrict static 3][5]);
  (Note that the last declaration also specifies that the argument
  corresponding to a in any call to f can be expected to be a non-null
  pointer to the first of at least three arrays of 5 doubles,
  which the others do not.)

Which the others do not.

Well, it's not to the point since there's no static and there'll never
be static, but maybe it drives home that unless whatever's inside [] is
"restrict" or "static {expr}", it's purely decorative. And even with
static, you can always give it more objects. This is like saying that
       char *strncpy(char dst[restrict .sz], const char *restrict src,
                      size_t sz);
makes
    char dst[256 + 1];
    strncpy(dst, "whatever", 256);
illegal.
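
To make that concrete, a sketch with made-up functions (sum_first()
and demo() exist only for this mail): the parameter is declared with a
size expression but no static, and passing a larger array, an exact
fit, or n == 0 is all fine as far as the language is concerned:

```c
#include <stddef.h>

/* a[n] in a parameter list is adjusted to a plain pointer (6.7.6.3p6);
 * without static, the n in the brackets promises nothing about the
 * caller's array.  Only the loop below actually constrains it. */
static int
sum_first(size_t n, const int a[n])
{
	int  sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += a[i];
	return sum;
}

static int
demo(void)
{
	int  big[8] = {1, 2, 3, 4, 5, 6, 7, 8};

	/* More elements than n, exactly n, and n == 0. */
	return sum_first(3, big) + sum_first(8, big) + sum_first(0, big);
}
```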


(There's also a forward-reffed stanza at 6.9.1.10, but I'm pretty sure
 it only applies to multi-dimensional VLAs.)


Best,
наб

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 3/8] regex.3: Desoupify regfree() description
  2023-04-20 23:35                   ` Alejandro Colomar
@ 2023-04-21  0:27                     ` наб
  2023-04-21  0:37                       ` [PATCH v7 " наб
  2023-04-21  0:58                       ` [PATCH v6 " Alejandro Colomar
  0 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-21  0:27 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1709 bytes --]

On Fri, Apr 21, 2023 at 01:35:43AM +0200, Alejandro Colomar wrote:
> On 4/20/23 21:37, наб wrote:
> > diff --git a/man3/regex.3 b/man3/regex.3
> > index 3f1529583..e3dd72a74 100644
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -225,12 +225,10 @@ .SS Error reporting
> >  .IR errbuf ;
> >  the error string is always null-terminated, and truncated to fit.
> >  .SS Freeing
> > -Supplying
> >  .BR regfree ()
> > -with a precompiled pattern buffer,
> > -.IR preg ,
> > -will free the memory allocated to the pattern buffer by the compiling
> > -process,
> > +invalidates the pattern buffer at
> While this ("invalidates") is true, it omits the most important information:
> it frees the object.
It doesn't.

> I think it's better to say that it frees (or
> deallocates) the object and any memory allocated within it, since that
> already implies invalidating it (due to
> <https://port70.net/~nsz/c/c11/n1570.html#6.2.4p2> and
> <https://port70.net/~nsz/c/c11/n1570.html#7.22.3p1>),
For the precise reasons listed here:
the regex_t object continues to exist.
regcomp() doesn't allocate *preg, and regfree() doesn't deallocate it.

> and also tells why
> it's necessary to call this function.  Otherwise, it's not clear why we
> should call it.  Why would I want to invalidate a buffer?
Admittedly, it does also "free any memory allocated by regcomp( )
associated with preg." (Issue 8 Draft 2.1), yeah.
Maybe it's my neurosis that I consider "may no longer be passed to
regexec()" the primary effect here.

Updated to
  regfree() invalidates the pattern buffer at *preg, freeing any
  associated memory; *preg must have been initialized via regcomp().
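
i.e. the regex_t itself is the caller's object and outlives regfree();
only what regcomp() hung off it goes away.  A sketch that reuses one
regex_t for several patterns; count_matching() is a hypothetical
helper, not anything from the page:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count how many of the n patterns match subject.
 * One caller-owned regex_t is reused for every pattern. */
static size_t
count_matching(const char *const patterns[], size_t n, const char *subject)
{
	regex_t  re;	/* automatic storage; not allocated by regcomp() */
	size_t   hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (regcomp(&re, patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
			continue;	/* bad pattern: nothing to free */

		if (regexec(&re, subject, 0, NULL, 0) == 0)
			hits++;

		/* Releases the memory regcomp() attached to re and
		 * invalidates the compiled pattern; re itself still
		 * exists and may be handed to regcomp() again. */
		regfree(&re);
	}
	return hits;
}
```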

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 1/8] regex.3: Desoupify regexec() description
  2023-04-20 23:24                   ` Alejandro Colomar
@ 2023-04-21  0:33                     ` наб
  2023-04-21  0:49                       ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  0:33 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1091 bytes --]

Hi!

On Fri, Apr 21, 2023 at 01:24:16AM +0200, Alejandro Colomar wrote:
> On 4/20/23 21:36, наб wrote:
> > diff --git a/man3/regex.3 b/man3/regex.3
> > index bedb97e87..47fe661d2 100644
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -105,12 +105,10 @@ .SS Compilation
> >  .SS Matching
> >  .BR regexec ()
> >  is used to match a null-terminated string
> > -against the precompiled pattern buffer,
> > -.IR preg .
> > -.I nmatch
> > -and
> > -.I pmatch
> > -are used to provide information regarding the location of any matches.
> > +against the compiled pattern buffer in
> > +.IR *preg ,
> > +which must have been initialised with
> > +.BR regexec ().
> This patch removes the nmatch and pmatch info before presumably we add
> it in a subsequent patch.
It doesn't and we don't ‒
the documentation for nmatch and pmatch never leaves Match offsets.

This patch just kills an extraneous, glib, and inaccurate description
in Matching.

There's another glib description not ten lines above in REG_NOSUB.
You don't need to keep the third one.

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v7 3/8] regex.3: Desoupify regfree() description
  2023-04-21  0:27                     ` наб
@ 2023-04-21  0:37                       ` наб
  2023-04-21  0:58                       ` [PATCH v6 " Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: наб @ 2023-04-21  0:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1195 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Range-diff against v6:
1:  4b7971a5e ! 1:  2632fe5c8 regex.3: Desoupify regfree() description
    @@ man3/regex.3: .SS Error reporting
     -process,
     +invalidates the pattern buffer at
     +.IR *preg ,
    -+which must have been initialized via
    ++freeing any associated memory;
    ++.I *preg
    ++must have been initialized via
      .BR regcomp ().
      .SH RETURN VALUE
      .BR regcomp ()

 man3/regex.3 | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 3f1529583..46a4a12b9 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -225,12 +225,12 @@ .SS Error reporting
 .IR errbuf ;
 the error string is always null-terminated, and truncated to fit.
 .SS Freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+invalidates the pattern buffer at
+.IR *preg ,
+freeing any associated memory;
+.I *preg
+must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-20 23:15                   ` Alejandro Colomar
@ 2023-04-21  0:39                     ` наб
  2023-04-21  1:42                       ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  0:39 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2558 bytes --]

Explicitly spell out the ranges involved. The original wording always
confused me, but it's actually very sane.

Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
R_NOTEOL? No. That's weird and confusing.

String largeness doesn't matter, known-lengthness does.

Explicitly spell out the influence on returned matches
(relative to string, not start of range).

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Range-diff against v6:
1:  4b7971a5e < -:  --------- regex.3: Desoupify regfree() description
2:  5fb4cc16f ! 1:  ed050649b regex.3: Improve REG_STARTEND
    @@ man3/regex.3: .SS Matching
     -and ending before byte
     -.IR pmatch[0].rm_eo .
     +Match
    -+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
    ++.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
     +instead of
    -+.RI [ string ", " string " + \fBstrlen\fP(" string )).
    ++.RI [ string , " string + strlen(string)" ).
      This allows matching embedded NUL bytes
      and avoids a
      .BR strlen (3)
    @@ man3/regex.3: .SS Matching
     +as usual, and the match offsets remain relative to
     +.IR string
     +(not
    -+.IR string " + " pmatch[0].rm_so ).
    ++.IR "string + pmatch[0].rm_so" ).
      This flag is a BSD extension, not present in POSIX.
      .SS Match offsets
      Unless

 man3/regex.3 | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 46a4a12b9..099c2c17f 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -131,23 +131,26 @@ .SS Matching
 above).
 .TP
 .B REG_STARTEND
-Use
-.I pmatch[0]
-on the input string, starting at byte
-.I pmatch[0].rm_so
-and ending before byte
-.IR pmatch[0].rm_eo .
+Match
+.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
+instead of
+.RI [ string , " string + strlen(string)" ).
 This allows matching embedded NUL bytes
 and avoids a
 .BR strlen (3)
-on large strings.
-It does not use
+on known-length strings.
+If any matches are returned
+.RB ( REG_NOSUB
+wasn't passed to
+.BR regcomp (),
+the match succeeded, and
 .I nmatch
-on input, and does not change
-.B REG_NOTBOL
-or
-.B REG_NEWLINE
-processing.
+> 0), they overwrite
+.I pmatch
+as usual, and the match offsets remain relative to
+.IR string
+(not
+.IR "string + pmatch[0].rm_so" ).
 This flag is a BSD extension, not present in POSIX.
 .SS Match offsets
 Unless
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 1/8] regex.3: Desoupify regexec() description
  2023-04-21  0:33                     ` наб
@ 2023-04-21  0:49                       ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21  0:49 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1333 bytes --]

Hi!

On 4/21/23 02:33, наб wrote:
> Hi!
> 
> On Fri, Apr 21, 2023 at 01:24:16AM +0200, Alejandro Colomar wrote:
>> On 4/20/23 21:36, наб wrote:
>>> diff --git a/man3/regex.3 b/man3/regex.3
>>> index bedb97e87..47fe661d2 100644
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -105,12 +105,10 @@ .SS Compilation
>>>  .SS Matching
>>>  .BR regexec ()
>>>  is used to match a null-terminated string
>>> -against the precompiled pattern buffer,
>>> -.IR preg .
>>> -.I nmatch
>>> -and
>>> -.I pmatch
>>> -are used to provide information regarding the location of any matches.
>>> +against the compiled pattern buffer in
>>> +.IR *preg ,
>>> +which must have been initialised with
>>> +.BR regcomp ().
>> This patch removes the nmatch and pmatch info before presumably we add
>> it in a subsequent patch.
> It doesn't and we don't ‒
> the documentation for nmatch and pmatch never leaves Match offsets.
> 
> This patch just kills an extraneous, glib, and inaccurate description
> in Matching.
> 
> There's another glib description not ten lines above in REG_NOSUB.
> You don't need to keep the third one.

Ahhh, that's right.  Thanks!  Patch applied.

Cheers,
Alex

> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 3/8] regex.3: Desoupify regfree() description
  2023-04-21  0:27                     ` наб
  2023-04-21  0:37                       ` [PATCH v7 " наб
@ 2023-04-21  0:58                       ` Alejandro Colomar
  2023-04-21  1:24                         ` [PATCH v7a " наб
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21  0:58 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2318 bytes --]

On 4/21/23 02:27, наб wrote:
> On Fri, Apr 21, 2023 at 01:35:43AM +0200, Alejandro Colomar wrote:
>> On 4/20/23 21:37, наб wrote:
>>> diff --git a/man3/regex.3 b/man3/regex.3
>>> index 3f1529583..e3dd72a74 100644
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -225,12 +225,10 @@ .SS Error reporting
>>>  .IR errbuf ;
>>>  the error string is always null-terminated, and truncated to fit.
>>>  .SS Freeing
>>> -Supplying
>>>  .BR regfree ()
>>> -with a precompiled pattern buffer,
>>> -.IR preg ,
>>> -will free the memory allocated to the pattern buffer by the compiling
>>> -process,
>>> +invalidates the pattern buffer at
>> While this ("invalidates") is true, it omits the most important information:
>> it frees the object.
> It doesn't.

You're right.  It frees memory within the object.  :/

> 
>> I think it's better to say that it frees (or
>> deallocates) the object and any memory allocated within it, since that
>> already implies invalidating it (due to
>> <https://port70.net/~nsz/c/c11/n1570.html#6.2.4p2> and
>> <https://port70.net/~nsz/c/c11/n1570.html#7.22.3p1>),
> For the precise reasons listed here:
> the regex_t object continues to exist.
> regcomp() doesn't allocate *preg, and regfree() doesn't deallocate it.
> 
>> and also tells why
>> it's necessary to call this function.  Otherwise, it's not clear why we
>> should call it.  Why would I want to invalidate a buffer?
> Admittedly, it does also "free any memory allocated by regcomp( )
> associated with preg." (Issue 8 Draft 2.1), yeah.

Yep.

> Maybe it's my neurosis that I consider "may no longer be passed to
> regexec()" the primary effect here.

:)

I wish GCC had an attribute for ensuring that in the -fanalyzer.
But [[gnu::malloc()]] only works for returned pointers, and not for
pointers initialized via a parameter, nor for returned integers.

> 
> Updated to
>   regfree() invalidates the pattern buffer at *preg, freeing any
>   associated memory; *preg must have been initialized via regcomp().

How about deinitializes?  Since regcomp(3) "initializes" the pattern
buffer, it makes sense to use complementary wording.

Cheers,
Alex

> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v7a 3/8] regex.3: Desoupify regfree() description
  2023-04-21  0:58                       ` [PATCH v6 " Alejandro Colomar
@ 2023-04-21  1:24                         ` наб
  2023-04-21  1:55                           ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  1:24 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Range-diff against v6:
1:  1ad1aa6e9 < -:  --------- regex.3: Desoupify regexec() description
2:  6c4d26f89 < -:  --------- regex.3: Desoupify regerror() description
3:  4b7971a5e ! 1:  5706f1892 regex.3: Desoupify regfree() description
    @@ man3/regex.3: .SS Error reporting
     -.IR preg ,
     -will free the memory allocated to the pattern buffer by the compiling
     -process,
    -+invalidates the pattern buffer at
    ++deinitializes the pattern buffer at
     +.IR *preg ,
    -+which must have been initialized via
    ++freeing any associated memory;
    ++.I *preg
    ++must have been initialized via
      .BR regcomp ().
      .SH RETURN VALUE
      .BR regcomp ()

 man3/regex.3 | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 3f1529583..ffdd98376 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -225,12 +225,12 @@ .SS Error reporting
 .IR errbuf ;
 the error string is always null-terminated, and truncated to fit.
 .SS Freeing
-Supplying
 .BR regfree ()
-with a precompiled pattern buffer,
-.IR preg ,
-will free the memory allocated to the pattern buffer by the compiling
-process,
+deinitializes the pattern buffer at
+.IR *preg ,
+freeing any associated memory;
+.I *preg
+must have been initialized via
 .BR regcomp ().
 .SH RETURN VALUE
 .BR regcomp ()
-- 
2.30.2
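
A minimal sketch of the distinction discussed upthread, in case it helps: regcomp() does not allocate *preg and regfree() does not deallocate it, so the same regex_t object can be initialized again afterwards. count_matches() is a hypothetical helper made up for illustration.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Compile and run two patterns in turn, reusing one caller-owned regex_t.
 * Returns how many of the patterns matched text, or -1 if a pattern
 * fails to compile.  count_matches() is a hypothetical helper.
 */
static int count_matches(const char *text, const char *pat1, const char *pat2)
{
    const char *pats[2] = { pat1, pat2 };
    regex_t re;     /* automatic storage; regcomp() does not allocate it */
    int hits = 0;

    for (int i = 0; i < 2; i++) {
        if (regcomp(&re, pats[i], REG_EXTENDED | REG_NOSUB) != 0)
            return -1;  /* re was not initialized, so no regfree() */
        if (regexec(&re, text, 0, NULL, 0) == 0)
            hits++;
        regfree(&re);   /* frees memory owned by re; re itself remains */
    }
    return hits;
}
```

After regfree() the object must not be passed to regexec() until regcomp() has initialized it again, which is what the loop does on its second iteration.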

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21  0:39                     ` [PATCH v7 " наб
@ 2023-04-21  1:42                       ` Alejandro Colomar
  2023-04-21  2:16                         ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21  1:42 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 4566 bytes --]

Hi!

On 4/21/23 02:39, наб wrote:
> Explicitly spell out the ranges involved. The original wording always
> confused me, but it's actually very sane.
> 
> Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
> R_NOTEOL? No. That's weird and confusing.
> 
> String largeness doesn't matter, known-lengthness does.
> 
> Explicitly spell out the influence on returned matches
> (relative to string, not start of range).
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.

> ---
> Range-diff against v6:
> 1:  4b7971a5e < -:  --------- regex.3: Desoupify regfree() description
> 2:  5fb4cc16f ! 1:  ed050649b regex.3: Improve REG_STARTEND
>     @@ man3/regex.3: .SS Matching
>      -and ending before byte
>      -.IR pmatch[0].rm_eo .
>      +Match
>     -+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
>     ++.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
>      +instead of
>     -+.RI [ string ", " string " + \fBstrlen\fP(" string )).
>     ++.RI [ string , " string + strlen(string)" ).
>       This allows matching embedded NUL bytes
>       and avoids a
>       .BR strlen (3)
>     @@ man3/regex.3: .SS Matching
>      +as usual, and the match offsets remain relative to
>      +.IR string
>      +(not
>     -+.IR string " + " pmatch[0].rm_so ).
>     ++.IR "string + pmatch[0].rm_so" ).
>       This flag is a BSD extension, not present in POSIX.
>       .SS Match offsets
>       Unless
> 
>  man3/regex.3 | 29 ++++++++++++++++-------------
>  1 file changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 46a4a12b9..099c2c17f 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -131,23 +131,26 @@ .SS Matching
>  above).
>  .TP
>  .B REG_STARTEND
> -Use
> -.I pmatch[0]
> -on the input string, starting at byte
> -.I pmatch[0].rm_so
> -and ending before byte
> -.IR pmatch[0].rm_eo .
> +Match
> +.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
> +instead of
> +.RI [ string , " string + strlen(string)" ).
>  This allows matching embedded NUL bytes
>  and avoids a
>  .BR strlen (3)
> -on large strings.
> -It does not use
> +on known-length strings.
> +If any matches are returned
> +.RB ( REG_NOSUB
> +wasn't passed to
> +.BR regcomp (),
> +the match succeeded, and
>  .I nmatch
> -on input, and does not change
> -.B REG_NOTBOL
> -or
> -.B REG_NEWLINE
> -processing.
> +> 0), they overwrite
> +.I pmatch
> +as usual, and the match offsets remain relative to
> +.IR string

Minor glitch: s/IR/I/

I fixed it.  BTW, don't know if you knew, but you can run some linters
to check these accidents by yourself.


$ make lint check -t >/dev/null
$ echo .IR foo >> man3/regex.3
$ make lint check -k
LINT (mandoc)	.tmp/man/man3/regex.3.lint-man.mandoc.touch
LINT (tbl comment)	.tmp/man/man3/regex.3.lint-man.tbl.touch
PRECONV	.tmp/man/man3/regex.3.tbl
TBL	.tmp/man/man3/regex.3.eqn
EQN	.tmp/man/man3/regex.3.cat.troff
TROFF	.tmp/man/man3/regex.3.cat.grotty
an.tmac:man3/regex.3:376: style: .IR expects at least 2 arguments, got 1
found style problems; aborting
make: *** [share/mk/build/catman.mk:80: .tmp/man/man3/regex.3.cat.grotty] Error 1
make: *** Deleting file '.tmp/man/man3/regex.3.cat.grotty'
make: Target 'check' not remade because of errors.
$ git restore -p
diff --git a/man3/regex.3 b/man3/regex.3
index e91504986..4840edb83 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -373,3 +373,4 @@ .SH SEE ALSO
 .PP
 The glibc manual section,
 .I "Regular Expressions"
+.IR foo
(1/1) Discard this hunk from worktree [y,n,q,a,d,e,?]? y

alx@asus5775:~/src/linux/man-pages/man-pages/main$ make lint check -k
LINT (mandoc)	.tmp/man/man3/regex.3.lint-man.mandoc.touch
LINT (tbl comment)	.tmp/man/man3/regex.3.lint-man.tbl.touch
PRECONV	.tmp/man/man3/regex.3.tbl
TBL	.tmp/man/man3/regex.3.eqn
EQN	.tmp/man/man3/regex.3.cat.troff
TROFF	.tmp/man/man3/regex.3.cat.grotty
GROTTY	.tmp/man/man3/regex.3.cat
COL	.tmp/man/man3/regex.3.cat.grep
GREP	.tmp/man/man3/regex.3.check-catman.touch


If you want to read more about this, see the CONTRIBUTING file, or the
Makefile itself (or rather, themselves).


Cheers,
Alex


> +(not
> +.IR "string + pmatch[0].rm_so" ).
>  This flag is a BSD extension, not present in POSIX.
>  .SS Match offsets
>  Unless

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v7a 3/8] regex.3: Desoupify regfree() description
  2023-04-21  1:24                         ` [PATCH v7a " наб
@ 2023-04-21  1:55                           ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21  1:55 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1739 bytes --]



On 4/21/23 03:24, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.

Thanks!
Alex

> ---
> Range-diff against v6:
> 1:  1ad1aa6e9 < -:  --------- regex.3: Desoupify regexec() description
> 2:  6c4d26f89 < -:  --------- regex.3: Desoupify regerror() description
> 3:  4b7971a5e ! 1:  5706f1892 regex.3: Desoupify regfree() description
>     @@ man3/regex.3: .SS Error reporting
>      -.IR preg ,
>      -will free the memory allocated to the pattern buffer by the compiling
>      -process,
>     -+invalidates the pattern buffer at
>     ++deinitializes the pattern buffer at
>      +.IR *preg ,
>     -+which must have been initialized via
>     ++freeing any associated memory;
>     ++.I *preg
>     ++must have been initialized via
>       .BR regcomp ().
>       .SH RETURN VALUE
>       .BR regcomp ()
> 
>  man3/regex.3 | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 3f1529583..ffdd98376 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -225,12 +225,12 @@ .SS Error reporting
>  .IR errbuf ;
>  the error string is always null-terminated, and truncated to fit.
>  .SS Freeing
> -Supplying
>  .BR regfree ()
> -with a precompiled pattern buffer,
> -.IR preg ,
> -will free the memory allocated to the pattern buffer by the compiling
> -process,
> +deinitializes the pattern buffer at
> +.IR *preg ,
> +freeing any associated memory;
> +.I *preg
> +must have been initialized via
>  .BR regcomp ().
>  .SH RETURN VALUE
>  .BR regcomp ()

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v6 0/8] regex.3 momento
  2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
                                   ` (7 preceding siblings ...)
  2023-04-20 19:37                 ` [PATCH v6 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB наб
@ 2023-04-21  2:01                 ` Alejandro Colomar
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
  8 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21  2:01 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 5670 bytes --]

Hey наб!

On 4/20/23 21:36, наб wrote:
> Should include all comments; includes Branden's wording.


I'm going to sleep.  Would you please rebase and send tomorrow whatever
I didn't yet apply?  I've got a mess of a mailbox by now =)

Let's see what I find in the git-log(1)...

> 
> наб (8):
>   regex.3: Desoupify regexec() description

Applied.

>   regex.3: Desoupify regerror() description

Not yet it seems;  please resend.

>   regex.3: Desoupify regfree() description

Applied.

>   regex.3: Improve REG_STARTEND

Applied.

>   regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link
>     regex_t.3type into regex.3
>   regex.3: Finalise move of reg*.3type

Both not yet; please resend.

>   regex.3: Destandardeseify Match offsets

Not yet; please resend.

>   regex.3: Further clarify the sole purpose of REG_NOSUB

And not yet; please resend.

Cheers,
Alex

> 
>  man3/regex.3              | 226 ++++++++++++++++++++++----------------
>  man3type/regex_t.3type    |  64 +----------
>  man3type/regmatch_t.3type |   2 +-
>  man3type/regoff_t.3type   |   2 +-
>  4 files changed, 133 insertions(+), 161 deletions(-)
> 
> Range-diff against v5:
> 1:  fcb8df21b < -:  --------- regex.3: Desoupify regcomp() description
> 2:  7240de5b7 = 1:  1ad1aa6e9 regex.3: Desoupify regexec() description
> 3:  108f30cd7 ! 2:  6c4d26f89 regex.3: Desoupify regerror() description
>     @@ Commit message
>          Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
>      
>       ## man3/regex.3 ##
>     -@@ man3/regex.3: .SH SYNOPSIS
>     - .BI "            int " eflags );
>     - .PP
>     - .BI "size_t regerror(int " errcode ", const regex_t *_Nullable restrict " preg ,
>     --.BI "            char " errbuf "[restrict ." errbuf_size "], \
>     -+.BI "                char " errbuf "[restrict ." errbuf_size "], \
>     - size_t " errbuf_size );
>     - .BI "void regfree(regex_t *" preg );
>     - .fi
>      @@ man3/regex.3: .SS Error reporting
>       .BR regexec ()
>       into error message strings.
>     @@ man3/regex.3: .SS Error reporting
>      -If both
>      -.I errbuf
>      -and
>     ++If
>     ++.I preg
>     ++isn't a null pointer,
>      +.I errcode
>      +must be the latest error returned from an operation on
>      +.IR preg .
>     -+If
>     -+.I preg
>     -+is a null pointer\(emthe latest error.
>      +.PP
>      +If
>     ++.I errbuf_size
>     ++is
>     ++.BR 0 ,
>     ++the size of the required buffer is returned.
>     ++Otherwise, up to
>       .I errbuf_size
>      -are nonzero,
>      -.I errbuf
>      -is filled in with the first
>      -.I "errbuf_size \- 1"
>      -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
>     -+is
>     -+.BR 0 ,
>     -+the size of the required buffer is returned.
>     -+Otherwise, up to
>     -+.I errbuf_size
>      +bytes are copied to
>      +.IR errbuf ;
>      +the error string is always null-terminated, and truncated to fit.
>       .SS Freeing
>     --Supplying
>     + Supplying
>       .BR regfree ()
>     --with a precompiled pattern buffer,
>     --.IR preg ,
>     --will free the memory allocated to the pattern buffer by the compiling
>     --process,
>     -+invalidates the pattern buffer at
>     -+.IR *preg ,
>     -+which must have been initialized via
>     - .BR regcomp ().
>     - .SH RETURN VALUE
>     - .BR regcomp ()
> -:  --------- > 3:  4b7971a5e regex.3: Desoupify regfree() description
> 4:  fd1a104d6 ! 4:  5fb4cc16f regex.3: Improve REG_STARTEND
>     @@ man3/regex.3: .SS Matching
>      -on large strings.
>      -It does not use
>      +on known-length strings.
>     -+.I pmatch
>     -+must point to a valid readable object.
>      +If any matches are returned
>      +.RB ( REG_NOSUB
>      +wasn't passed to
>     @@ man3/regex.3: .SS Matching
>      -processing.
>      +> 0), they overwrite
>      +.I pmatch
>     -+as usual, and the
>     -+.B Match offsets
>     -+remain relative to
>     ++as usual, and the match offsets remain relative to
>      +.IR string
>      +(not
>      +.IR string " + " pmatch[0].rm_so ).
> 5:  198b7b4fa ! 5:  057a4a522 regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
>     @@ Commit message
>      
>       ## man3/regex.3 ##
>      @@ man3/regex.3: .SH SYNOPSIS
>     - .BI "                char " errbuf "[restrict ." errbuf_size "], \
>     - size_t " errbuf_size );
>     + .BI "            char " errbuf "[_Nullable restrict ." errbuf_size ],
>     + .BI "            size_t " errbuf_size );
>       .BI "void regfree(regex_t *" preg );
>      +.PP
>      +.B typedef struct {
> 6:  c6bc9cfd0 = 6:  60ac1a4d1 regex.3: Finalise move of reg*.3type
> 7:  59b8294c8 = 7:  3313546db regex.3: Destandardeseify Match offsets
> 8:  2e199fc3c ! 8:  7fa669481 regex.3: Further clarify the sole purpose of REG_NOSUB
>     @@ man3/regex.3: .SS Compilation
>      -.I nmatch
>      -and
>      -.I pmatch
>     -+Only report overall success:
>     ++Report only overall success.
>       .BR regexec ()
>      -arguments will be ignored for this purpose (but
>     -+will only use
>     ++will use only
>       .I pmatch
>      -may still be used for
>      -.BR REG_STARTEND ).
>      +for
>      +.BR REG_STARTEND ,
>     -+and ignore
>     ++ignoring
>      +.IR nmatch .
>       .TP
>       .B REG_NEWLINE

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21  1:42                       ` Alejandro Colomar
@ 2023-04-21  2:16                         ` наб
  2023-04-21  9:45                           ` Alejandro Colomar
  2023-04-21 10:19                           ` Jakub Wilk
  0 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-21  2:16 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 9433 bytes --]

Hi!

On Fri, Apr 21, 2023 at 03:42:48AM +0200, Alejandro Colomar wrote:
> On 4/21/23 02:39, наб wrote:
> > Explicitly spell out the ranges involved. The original wording always
> > confused me, but it's actually very sane.
> > 
> > Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
> > R_NOTEOL? No. That's weird and confusing.
> > 
> > String largeness doesn't matter, known-lengthness does.
> > 
> > Explicitly spell out the influence on returned matches
> > (relative to string, not start of range).
> > 
> > Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> 
> Patch applied.
> 
> > ---
> > Range-diff against v6:
> > 1:  4b7971a5e < -:  --------- regex.3: Desoupify regfree() description
> > 2:  5fb4cc16f ! 1:  ed050649b regex.3: Improve REG_STARTEND
> >     @@ man3/regex.3: .SS Matching
> >      -and ending before byte
> >      -.IR pmatch[0].rm_eo .
> >      +Match
> >     -+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
> >     ++.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
> >      +instead of
> >     -+.RI [ string ", " string " + \fBstrlen\fP(" string )).
> >     ++.RI [ string , " string + strlen(string)" ).
> >       This allows matching embedded NUL bytes
> >       and avoids a
> >       .BR strlen (3)
> >     @@ man3/regex.3: .SS Matching
> >      +as usual, and the match offsets remain relative to
> >      +.IR string
> >      +(not
> >     -+.IR string " + " pmatch[0].rm_so ).
> >     ++.IR "string + pmatch[0].rm_so" ).
> >       This flag is a BSD extension, not present in POSIX.
> >       .SS Match offsets
> >       Unless
> > 
> >  man3/regex.3 | 29 ++++++++++++++++-------------
> >  1 file changed, 16 insertions(+), 13 deletions(-)
> > 
> > diff --git a/man3/regex.3 b/man3/regex.3
> > index 46a4a12b9..099c2c17f 100644
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -131,23 +131,26 @@ .SS Matching
> >  above).
> >  .TP
> >  .B REG_STARTEND
> > -Use
> > -.I pmatch[0]
> > -on the input string, starting at byte
> > -.I pmatch[0].rm_so
> > -and ending before byte
> > -.IR pmatch[0].rm_eo .
> > +Match
> > +.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
> > +instead of
> > +.RI [ string , " string + strlen(string)" ).
> >  This allows matching embedded NUL bytes
> >  and avoids a
> >  .BR strlen (3)
> > -on large strings.
> > -It does not use
> > +on known-length strings.
> > +If any matches are returned
> > +.RB ( REG_NOSUB
> > +wasn't passed to
> > +.BR regcomp (),
> > +the match succeeded, and
> >  .I nmatch
> > -on input, and does not change
> > -.B REG_NOTBOL
> > -or
> > -.B REG_NEWLINE
> > -processing.
> > +> 0), they overwrite
> > +.I pmatch
> > +as usual, and the match offsets remain relative to
> > +.IR string
> 
> Minor glitch: s/IR/I/
> 
> I fixed it.  BTW, don't know if you knew, but you can run some linters
> to check these accidents by yourself.


$ make check
# ...
GREP    .tmp/man/man1/memusage.1.check-catman.touch
.tmp/man/man1/memusage.1.cat.grep:132:           Memory usage summary: heap total: 45200, heap peak: 6440, stack peak: 224
.tmp/man/man1/memusage.1.cat.grep:135:           realloc|        40         44800             0  (nomove:40, dec:19, free:0)
make: *** [share/mk/check/catman.mk:36: .tmp/man/man1/memusage.1.check-catman.touch] Error 1


$ make lint
SED     .tmp/man/man2/add_key.2.d/add_key.c
LINT (checkpatch)       .tmp/man/man2/add_key.2.d/add_key.lint-c.checkpatch.touch
bash: line 1: checkpatch: command not found
make: *** [share/mk/lint/c.mk:64: .tmp/man/man2/add_key.2.d/add_key.lint-c.checkpatch.touch] Error 127

git grep checkpatch first says I want checkpatch(1).
No such manual exists, at least in Debian.
Then it reveals I actually want checkpatch.pl from a linux checkout.
Probably call it [scripts/]checkpatch.pl then?

Then it reveals
  CHECKPATCH              := checkpatch
which means that just
  export CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl
doesn't work, and I need to pass it as an argument (should be ?=).
The same for all the other linters.

$ make -j25 CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl lint
# ...
LINT (mandoc)   .tmp/man/man1/pldd.1.lint-man.mandoc.touch
mandoc: man1/getent.1:6:14: WARNING: cannot parse date, using it verbatim: (date)
# (same on what feels like every page; bullseye mandoc 1.14.5-1)

If I pass MANDOC=~/code/voreutils/mandoc (recent(ish, it was recent last
year) CVS, + some patches I forgot that fixed some egregious formatting
errors):
LINT (mandoc)   .tmp/man/man5/ftpusers.5.lint-man.mandoc.touch
LINT (mandoc)   .tmp/man/man5/gai.conf.5.lint-man.mandoc.touch
LINT (mandoc)   .tmp/man/man5/group.5.lint-man.mandoc.touch
LINT (mandoc)   .tmp/man/man5/host.conf.5.lint-man.mandoc.touch
mandoc: man5/erofs.5:78:2: ERROR: skipping end of block that is not open: RE
mandoc: man5/erofs.5:79:2: WARNING: skipping paragraph macro: IP empty
mandoc: man5/erofs.5:78:2: WARNING: skipping paragraph macro: br at the end of SS

And it passes! Those are the only errors I saw, even on the version with
IR\ string$

When I ran with 2>&1 | less to make sure, I got 
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
/etc/bash.bashrc: line 7: PS1: unbound variable
SED     .tmp/man/man2/add_key.2.d/add_key.c
SED     .tmp/man/man2/bind.2.d/bind.c
SED     .tmp/man/man2/chown.2.d/chown.c
SED     .tmp/man/man2/clock_getres.2.d/clock_getres.c
SED     .tmp/man/man2/clone.2.d/clone.c
SED     .tmp/man/man2/close_range.2.d/close_range.c
SED     .tmp/man/man2/copy_file_range.2.d/copy_file_range.c
SED     .tmp/man/man2/eventfd.2.d/eventfd.c
and indeed
Makefile:SHELL := /usr/bin/env bash -Eeuo pipefail
and
$ sed -n 6,7p /etc/bash.bashrc
# If not running interactively, don't do anything
[ -z "$PS1" ] && return

(That should be ${PS1-}. What's even funnier is that
 $ sed -n 14p /etc/bash.bashrc
 if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then)


$ make -j25 CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl lint MANDOC=: CLANG-TIDY=:
LINT (checkpatch)       .tmp/man/man3/_Generic.3.d/_Generic.lint-c.checkpatch.touch
ERROR:ASSIGN_IN_IF: do not use assignment in if condition
#17: FILE: .tmp/man/man3const/EXIT_SUCCESS.3const.d/EXIT_SUCCESS.c:17:
+    if ((fp = fopen(argv[1], "r")) == NULL) {

Do not use assignments in if condition.
Example::

  if ((foo = bar(...)) < BAZ) {

should be written as::

  foo = bar(...);
  if (foo < BAZ) {

total: 1 errors, 0 warnings, 0 checks, 29 lines checked
make: *** [share/mk/lint/c.mk:64: .tmp/man/man3const/EXIT_SUCCESS.3const.d/EXIT_SUCCESS.lint-c.checkpatch.touch] Error 1
make: *** Waiting for unfinished jobs....
CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#17: FILE: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.c:17:
+    printf("Name: \"%s\" (%d segments)\n", info->dlpi_name,
+               info->dlpi_phnum);

CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#33: FILE: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.c:33:
+        printf("    %2zu: [%14p; memsz:%7jx] flags: %#jx; ", j,
+                (void *) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr),

total: 0 errors, 0 warnings, 2 checks, 54 lines checked
make: *** [share/mk/lint/c.mk:63: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.lint-c.checkpatch.touch] Error 1
WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'closeSocketPair', this function's name, in a string
#230: FILE: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.c:230:
+        err(EXIT_FAILURE, "closeSocketPair-close-0");

Embedded function names are less appropriate to use as
refactoring can cause function renaming.  Prefer the use of
"%s", __func__ to embedded function names.

Note that this does not work with -f (--file) checkpatch option
as it depends on patch context providing the function name.

WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'closeSocketPair', this function's name, in a string
#232: FILE: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.c:232:
+        err(EXIT_FAILURE, "closeSocketPair-close-1");

total: 0 errors, 2 warnings, 0 checks, 612 lines checked
make: *** [share/mk/lint/c.mk:63: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.lint-c.checkpatch.touch] Error 1

(more pages)


I'm not sure I agree with the ASSIGN_IN_IF case, but I'm assuming
there's a mechanism to kill the lints you don't care about;
linux cdc9718d5e590d6905361800b938b93f2b66818e.


This continues until I've disabled every linter.
I'm assuming you have specific versions that work for you,
but, well.


Best,
наб

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v8 0/5] regex.3 momento
  2023-04-21  2:01                 ` [PATCH v6 0/8] regex.3 momento Alejandro Colomar
@ 2023-04-21  2:48                   ` наб
  2023-04-21  2:48                     ` [PATCH v8 1/5] regex.3: Desoupify regerror() description наб
                                       ` (5 more replies)
  0 siblings, 6 replies; 143+ messages in thread
From: наб @ 2023-04-21  2:48 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]

As a pull.rebase = true enjoyer, it was very easy
(indeed, git pull and axe the single-line conflict + empty commit),
and it's what I've been doing the entire time; recommend it.

5/5 remains a toss-up for me. Apply it if you think it's better,
don't if you don't.

https://bugs.debian.org/1034658

наб (5):
  regex.3: Desoupify regerror() description
  regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link
    regex_t.3type into regex.3
  regex.3: Finalise move of reg*.3type
  regex.3: Destandardeseify Match offsets
  regex.3: Further clarify the sole purpose of REG_NOSUB

 man3/regex.3              | 179 +++++++++++++++++++++++---------------
 man3type/regex_t.3type    |  64 +-------------
 man3type/regmatch_t.3type |   2 +-
 man3type/regoff_t.3type   |   2 +-
 4 files changed, 110 insertions(+), 137 deletions(-)

No clue where it got this. The interdiff is just the .IR -> .I.

Range-diff against v7:
1:  783a16431 ! 1:  4479e1572 regex.3: Desoupify regerror() description
    @@ man3/regex.3: .SS Error reporting
     +.IR errbuf ;
     +the error string is always null-terminated, and truncated to fit.
      .SS Freeing
    - Supplying
      .BR regfree ()
    + deinitializes the pattern buffer at
2:  5706f1892 < -:  --------- regex.3: Desoupify regfree() description
3:  baacf086f < -:  --------- regex.3: Improve REG_STARTEND
4:  056c3ff04 = 2:  bad307847 regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
5:  44d7b775d = 3:  edefa8a5e regex.3: Finalise move of reg*.3type
6:  79641df02 = 4:  500070a5e regex.3: Destandardeseify Match offsets
7:  26d06c07f = 5:  b01685c7a regex.3: Further clarify the sole purpose of REG_NOSUB
-- 
2.30.2
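
For context on 5/5: its wording is that REG_NOSUB reports only overall success, with regexec() using pmatch only for REG_STARTEND and ignoring nmatch. A minimal sketch of that combination follows; contains() is a hypothetical helper made up for illustration, and REG_STARTEND is assumed to be available (it is a BSD extension that glibc implements).

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if the bytes [buf, buf + len) contain a match for pattern,
 * -1 if the pattern does not compile, and 0 otherwise.
 * contains() is a hypothetical helper.
 */
static int contains(const char *pattern, const char *buf, size_t len)
{
    regex_t re;
    regmatch_t range;
    int err;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    range.rm_so = 0;
    range.rm_eo = (regoff_t)len;
    /* nmatch is 0, yet pmatch must still point at one readable regmatch_t. */
    err = regexec(&re, buf, 0, &range, REG_STARTEND);
    regfree(&re);
    return err == 0;
}
```

Because the range is given explicitly, buf may contain NUL bytes and need not be null-terminated at buf + len.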

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v8 1/5] regex.3: Desoupify regerror() description
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
@ 2023-04-21  2:48                     ` наб
  2023-04-21 10:06                       ` Alejandro Colomar
  2023-04-21  2:48                     ` [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
                                       ` (4 subsequent siblings)
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  2:48 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 1343 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index d91acc19d..069cc6388 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -210,27 +210,23 @@ .SS Error reporting
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+If
+.I preg
+isn't a null pointer,
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+.PP
+If
+.I errbuf_size
+is
+.BR 0 ,
+the size of the required buffer is returned.
+Otherwise, up to
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS Freeing
 .BR regfree ()
 deinitializes the pattern buffer at
-- 
2.30.2
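
The two-step use that the reworded regerror() text describes can be
sketched as follows.  This is only an illustration, not part of the
patch; regerror_dup() is a hypothetical helper name.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Return a malloc()ed copy of the message for errcode,
 * or NULL on allocation failure.  The caller frees it.
 * (regerror_dup is a hypothetical helper, not part of <regex.h>.)
 */
char *
regerror_dup(int errcode, const regex_t *preg)
{
    size_t  size;
    char    *buf;

    /* With errbuf_size == 0, only the required size is returned. */
    size = regerror(errcode, preg, NULL, 0);

    buf = malloc(size);
    if (buf == NULL)
        return NULL;

    /* The size counts the terminating null byte, so nothing is cut. */
    regerror(errcode, preg, buf, size);
    return buf;
}
```

A caller would pass the nonzero return value of regcomp() or regexec()
together with the regex_t it came from, and free() the result.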



* [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
  2023-04-21  2:48                     ` [PATCH v8 1/5] regex.3: Desoupify regerror() description наб
@ 2023-04-21  2:48                     ` наб
  2023-04-21 11:55                       ` Alejandro Colomar
  2023-04-21  2:48                     ` [PATCH v8 3/5] regex.3: Finalise move of reg*.3type наб
                                       ` (3 subsequent siblings)
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  2:48 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 4267 bytes --]

Move-only commit.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3              | 30 ++++++++++++++++++
 man3type/regex_t.3type    | 64 +--------------------------------------
 man3type/regmatch_t.3type |  2 +-
 man3type/regoff_t.3type   |  2 +-
 4 files changed, 33 insertions(+), 65 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 069cc6388..f6465d484 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -29,6 +29,20 @@ .SH SYNOPSIS
 .BI "            char " errbuf "[_Nullable restrict ." errbuf_size ],
 .BI "            size_t " errbuf_size );
 .BI "void regfree(regex_t *" preg );
+.PP
+.B typedef struct {
+.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
+.B } regex_t;
+.PP
+.B typedef struct {
+.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
+                           to start of substring */
+.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
+                           the first character after the end of
+                           substring */
+.B } regmatch_t;
+.PP
+.BR typedef " /* ... */  " regoff_t;
 .fi
 .SH DESCRIPTION
 .SS Compilation
@@ -202,6 +216,14 @@ .SS Match offsets
 .I rm_eo
 element indicates the end offset of the match,
 which is the offset of the first character after the matching text.
+.PP
+.I regoff_t
+It is a signed integer type
+capable of storing the largest value that can be stored in either an
+.I ptrdiff_t
+type or a
+.I ssize_t
+type.
 .SS Error reporting
 .BR regerror ()
 is used to turn the error codes that can be returned by both
@@ -320,6 +342,14 @@ .SH STANDARDS
 POSIX.1-2008.
 .SH HISTORY
 POSIX.1-2001.
+.PP
+Prior to POSIX.1-2008,
+the type was
+capable of storing the largest value that can be stored in either an
+.I off_t
+type or a
+.I ssize_t
+type.
 .SH EXAMPLES
 .EX
 #include <stdint.h>
diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
index 176d2c7a6..c0daaf0ff 100644
--- a/man3type/regex_t.3type
+++ b/man3type/regex_t.3type
@@ -1,63 +1 @@
-.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
-.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\"
-.TH regex_t 3type (date) "Linux man-pages (unreleased)"
-.SH NAME
-regex_t, regmatch_t, regoff_t
-\- regular expression matching
-.SH LIBRARY
-Standard C library
-.RI ( libc )
-.SH SYNOPSIS
-.EX
-.B #include <regex.h>
-.PP
-.B typedef struct {
-.BR "    size_t    re_nsub;" "  /* Number of parenthesized subexpressions */"
-.B } regex_t;
-.PP
-.B typedef struct {
-.BR "    regoff_t  rm_so;" "    /* Byte offset from start of string"
-                           to start of substring */
-.BR "    regoff_t  rm_eo;" "    /* Byte offset from start of string to"
-                           the first character after the end of
-                           substring */
-.B } regmatch_t;
-.PP
-.BR typedef " /* ... */  " regoff_t;
-.EE
-.SH DESCRIPTION
-.TP
-.I regex_t
-This is a structure type used in regular expression matching.
-It holds a compiled regular expression,
-compiled with
-.BR regcomp (3).
-.TP
-.I regmatch_t
-This is a structure type used in regular expression matching.
-.TP
-.I regoff_t
-It is a signed integer type
-capable of storing the largest value that can be stored in either an
-.I ptrdiff_t
-type or a
-.I ssize_t
-type.
-.SH STANDARDS
-POSIX.1-2008.
-.SH HISTORY
-POSIX.1-2001.
-.PP
-Prior to POSIX.1-2008,
-the type was
-capable of storing the largest value that can be stored in either an
-.I off_t
-type or a
-.I ssize_t
-type.
-.SH SEE ALSO
-.BR regex (3)
+.so man3/regex.3
diff --git a/man3type/regmatch_t.3type b/man3type/regmatch_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regmatch_t.3type
+++ b/man3type/regmatch_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
diff --git a/man3type/regoff_t.3type b/man3type/regoff_t.3type
index dc78f2cf2..c0daaf0ff 100644
--- a/man3type/regoff_t.3type
+++ b/man3type/regoff_t.3type
@@ -1 +1 @@
-.so man3type/regex_t.3type
+.so man3/regex.3
-- 
2.30.2



* [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
  2023-04-21  2:48                     ` [PATCH v8 1/5] regex.3: Desoupify regerror() description наб
  2023-04-21  2:48                     ` [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
@ 2023-04-21  2:48                     ` наб
  2023-04-21 10:33                       ` Alejandro Colomar
  2023-04-21  2:49                     ` [PATCH v8 4/5] regex.3: Destandardeseify Match offsets наб
                                       ` (2 subsequent siblings)
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  2:48 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2891 bytes --]

They're inextricably linked, not cross-referenced at all,
and not used anywhere else.

Now that they (realistically) exist to the reader, add a note
on how big nmatch can be; POSIX even says "The application developer
should note that there is probably no reason for using a value of
nmatch that is larger than preg−>re_nsub+1.".

Also remove the now-duplicate regmatch_t declaration.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 54 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index f6465d484..46fd3adef 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -15,7 +15,7 @@ .SH LIBRARY
 Standard C library
 .RI ( libc ", " \-lc )
 .SH SYNOPSIS
-.nf
+.EX
 .B #include <regex.h>
 .PP
 .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
@@ -43,7 +43,7 @@ .SH SYNOPSIS
 .B } regmatch_t;
 .PP
 .BR typedef " /* ... */  " regoff_t;
-.fi
+.EE
 .SH DESCRIPTION
 .SS Compilation
 .BR regcomp ()
@@ -60,6 +60,21 @@ .SS Compilation
 The locale must be the same when running
 .BR regexec ().
 .PP
+After
+.BR regcomp ()
+succeeds,
+.I preg->re_nsub
+holds the number of subexpressions in
+.IR regex .
+Thus, a value of
+.I preg->re_nsub
++ 1
+passed as
+.I nmatch
+to
+.BR regexec ()
+is sufficient to capture all matches.
+.PP
 .I cflags
 is the
 bitwise OR
@@ -192,22 +207,6 @@ .SS Match offsets
 .IR N+1 .)
 Any unused structure elements will contain the value \-1.
 .PP
-The
-.I regmatch_t
-structure which is the type of
-.I pmatch
-is defined in
-.IR <regex.h> .
-.PP
-.in +4n
-.EX
-typedef struct {
-    regoff_t rm_so;
-    regoff_t rm_eo;
-} regmatch_t;
-.EE
-.in
-.PP
 Each
 .I rm_so
 element that is not \-1 indicates the start offset of the next largest
@@ -218,7 +217,7 @@ .SS Match offsets
 which is the offset of the first character after the matching text.
 .PP
 .I regoff_t
-It is a signed integer type
+is a signed integer type
 capable of storing the largest value that can be stored in either an
 .I ptrdiff_t
 type or a
@@ -344,12 +343,27 @@ .SH HISTORY
 POSIX.1-2001.
 .PP
 Prior to POSIX.1-2008,
-the type was
+.I regoff_t
+was required to be
 capable of storing the largest value that can be stored in either an
 .I off_t
 type or a
 .I ssize_t
 type.
+.SH NOTES
+.I re_nsub
+is only required to be initialized if
+.B REG_NOSUB
+wasn't specified, but all known implementations initialize it regardless.
+.\" glibc, musl, 4.4BSD, illumos
+.PP
+Both
+.I regex_t
+and
+.I regmatch_t
+may (and do) have more members, in any order.
+Always reference them by name.
+.\" illumos has two more start/end pairs and the first one is of pointers
 .SH EXAMPLES
 .EX
 #include <stdint.h>
-- 
2.30.2
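
The re_nsub + 1 sizing added to the Compilation subsection could be
used roughly like this.  It is a sketch under the assumption that preg
was compiled without REG_NOSUB; match_all_groups() is a hypothetical
helper name.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Match string against an already-compiled preg, capturing every group.
 * On success, *pmatchp points to a malloc()ed array of
 * preg->re_nsub + 1 elements (the caller frees it).
 * Returns the regexec() result, or REG_ESPACE if allocation fails.
 * (match_all_groups is a hypothetical helper, not part of <regex.h>.)
 */
int
match_all_groups(const regex_t *preg, const char *string,
                 regmatch_t **pmatchp)
{
    int         ret;
    size_t      nmatch;
    regmatch_t  *pmatch;

    nmatch = preg->re_nsub + 1;  /* whole match + each subexpression */
    pmatch = calloc(nmatch, sizeof(*pmatch));
    if (pmatch == NULL)
        return REG_ESPACE;

    ret = regexec(preg, string, nmatch, pmatch, 0);
    if (ret != 0) {
        free(pmatch);
        return ret;
    }

    *pmatchp = pmatch;
    return 0;
}
```

Sizing the array from re_nsub means the caller never has to count
parentheses in the pattern by hand.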



* [PATCH v8 4/5] regex.3: Destandardeseify Match offsets
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
                                       ` (2 preceding siblings ...)
  2023-04-21  2:48                     ` [PATCH v8 3/5] regex.3: Finalise move of reg*.3type наб
@ 2023-04-21  2:49                     ` наб
  2023-04-21 10:36                       ` Alejandro Colomar
  2023-04-21  2:49                     ` [PATCH v8 5/5] regex.3: Further clarify the sole purpose of REG_NOSUB наб
  2023-04-21 10:00                     ` [PATCH v8 0/5] regex.3 momento Alejandro Colomar
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  2:49 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2194 bytes --]

This section reads like it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 46fd3adef..55fddd88e 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -184,37 +184,34 @@ .SS Matching
 .SS Match offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first expression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ string " + " rm_so ", " string " + " rm_eo ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2
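
The half-open range [string + rm_so, string + rm_eo) from the new text
translates into code along these lines.  This is an illustrative
sketch only; match_dup() is a hypothetical helper name.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy the text of one match out of string.
 * Returns a malloc()ed string (the caller frees it), or NULL if the
 * element is unused (-1) or allocation fails.
 * (match_dup is a hypothetical helper, not part of <regex.h>.)
 */
char *
match_dup(const char *string, const regmatch_t *m)
{
    size_t  len;
    char    *s;

    if (m->rm_so == -1)
        return NULL;

    /* The match is the range [string + rm_so, string + rm_eo). */
    len = m->rm_eo - m->rm_so;
    s = malloc(len + 1);
    if (s == NULL)
        return NULL;

    memcpy(s, string + m->rm_so, len);
    s[len] = '\0';
    return s;
}
```

Because rm_eo points one past the last matched byte, the length is a
plain subtraction with no off-by-one adjustment.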



* [PATCH v8 5/5] regex.3: Further clarify the sole purpose of REG_NOSUB
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
                                       ` (3 preceding siblings ...)
  2023-04-21  2:49                     ` [PATCH v8 4/5] regex.3: Destandardeseify Match offsets наб
@ 2023-04-21  2:49                     ` наб
  2023-04-21 11:44                       ` Alejandro Colomar
  2023-04-21 10:00                     ` [PATCH v8 0/5] regex.3 momento Alejandro Colomar
  5 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21  2:49 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 792 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
 man3/regex.3 | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 55fddd88e..060e8a587 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -96,16 +96,14 @@ .SS Compilation
 searches using this pattern buffer will be case insensitive.
 .TP
 .B REG_NOSUB
-Do not report position of matches.
-The
-.I nmatch
-and
-.I pmatch
+Report only overall success.
 .BR regexec ()
-arguments will be ignored for this purpose (but
+will use only
 .I pmatch
-may still be used for
-.BR REG_STARTEND ).
+for
+.BR REG_STARTEND ,
+ignoring
+.IR nmatch .
 .TP
 .B REG_NEWLINE
 Match-any-character operators don't match a newline.
-- 
2.30.2
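
For the "report only overall success" case, a minimal sketch (without
REG_STARTEND, so neither nmatch nor pmatch is consulted) might look
like this; matches() is a hypothetical helper name.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if string matches the ERE pattern, 0 if it doesn't,
 * and -1 if pattern fails to compile.
 * (matches is a hypothetical helper, not part of <regex.h>.)
 */
int
matches(const char *pattern, const char *string)
{
    int      ret;
    regex_t  re;

    /* Only overall success is wanted, so ask for REG_NOSUB. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* Without REG_STARTEND, nmatch and pmatch are ignored here. */
    ret = regexec(&re, string, 0, NULL, 0);
    regfree(&re);

    return ret == 0;
}
```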


* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-20 22:29               ` Alejandro Colomar
@ 2023-04-21  5:00                 ` G. Branden Robinson
  2023-04-21  8:06                   ` a straw-man `SR` man(7) macro for (sub)section cross references (was: [PATCH v2 2/9] regex.3: improve REG_STARTEND) G. Branden Robinson
  2023-04-21 11:07                   ` [PATCH v2 2/9] regex.3: improve REG_STARTEND Alejandro Colomar
  0 siblings, 2 replies; 143+ messages in thread
From: G. Branden Robinson @ 2023-04-21  5:00 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: наб, linux-man, groff

[-- Attachment #1: Type: text/plain, Size: 2339 bytes --]

Hi Alex,

At 2023-04-21T00:29:12+0200, Alejandro Colomar wrote:
> On 4/20/23 20:33, G. Branden Robinson wrote:
> > [Note for non-mdoc(7) speakers: `Sx` is its macro for (sub)section
> > heading cross references.  man(7) doesn't have an equivalent, though
> > if there is demand, I'm happy to implement one.  :D]
> 
> I've been delaying my global switch to non-shouting sexion headings,
> due to not having a clear idea of how to refer to them.

Fair.

> Having a macro that does that for me, and ensures that the appropriate
> formatting is applied might be a good solution.

Well, I have three ideas.

1.  Mark them up the way the groff man pages do, in typographer's
    quotes.

See \[lq]Match offsets\[rq] in
.MR regex 3 .

2.  I could implement the `Q` quotation macro for man(7) that I've been
    making noise about for a while.[1]  Of course, you'd be waiting for
    the next release _after_ groff 1.23.0 for it...

See
.Q "Match offsets"
in
.MR regex 3 .

3.  I could implement a macro explicitly tuned to the problem of
    (sub)section cross references.  I didn't see anybody come up with a
    good way to shoehorn this functionality into `MR`, so I suggest the
    following.

.SR section-or-subsection-title [page-topic page-section [trailing-punct]

See
.SR "Match offsets" regex 3 .
.
Also see
.SR Bugs
below.

In this design, if argument 2 is present, argument 3 is mandatory.

The foregoing would render as

       See “Match offsets” (regex(3)).  Also see “Bugs” below.

On devices supporting hyperlinks, "Match offsets" would be a hyperlink
with a to-be-determined anchor reference.  "regex(3)" would be a
hyperlink as with the `MR` macro today.  "Bugs" would be a hyperlink
with a to-be-determined anchor reference within the current document.
(OSC 8 support for this may require some thought, or maybe we'd just
handle them like external page references.)

I trust the tradeoffs involved with each of the above solutions are
readily apparent.

> It would also please the info(1) people, so that the few references we
> have to those would be linked.

What's the URL format for hyperlinks into Info documents?  How is the
existing .UR/.UE inadequate?

Regards,
Branden

[1] https://mail.gnu.org/archive/html/groff/2022-12/msg00078.html


* a straw-man `SR` man(7) macro for (sub)section cross references (was: [PATCH v2 2/9] regex.3: improve REG_STARTEND)
  2023-04-21  5:00                 ` G. Branden Robinson
@ 2023-04-21  8:06                   ` G. Branden Robinson
  2023-04-21 11:07                   ` [PATCH v2 2/9] regex.3: improve REG_STARTEND Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: G. Branden Robinson @ 2023-04-21  8:06 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: наб, linux-man, groff

[-- Attachment #1: Type: text/plain, Size: 1383 bytes --]

[self-follow-up; updated subject]

At 2023-04-21T00:07:21-0500, G. Branden Robinson wrote:
> 3.  I could implement a macro explicitly tuned to the problem of
>     (sub)section cross references.  I didn't see anybody come up with a
>     good way to shoehorn this functionality into `MR`, so I suggest the
>     following.
> 
> .SR section-or-subsection-title [page-topic page-section [trailing-punct]

On second thought, I think it would be better to have matched brackets
here.  And more seriously, to permute the argument order to feel more
parallel to `MR` (as well as `ME` and `UE`).

.SR section-or-subsection-title [trailing-punct [page-topic page-section]]

Updating the example:

See
.SR "Match offsets" . regex 3
.
Also see
.SR Bugs
below.

In this design, if argument 3 is present, argument 4 is mandatory.  This
would need to be a pretty hard requirement.  Maybe the default section,
if unspecified, would be "UNKNOWN".  This is rude but doesn't penalize
the user any more than the document author does.  (There is also
precedent in mdoc(7)'s setup macros.)  We don't want to `ab`ort page
rendering for these errors because that will adversely affect innocent
users who are simply trying to read documentation.

The foregoing would render as

       See “Match offsets” (regex(3)).  Also see “Bugs” below.

Regards,
Branden


* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21  2:16                         ` наб
@ 2023-04-21  9:45                           ` Alejandro Colomar
  2023-04-21 12:13                             ` наб
  2023-04-21 10:19                           ` Jakub Wilk
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21  9:45 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 14390 bytes --]

Hi!

On 4/21/23 04:16, наб wrote:
> Hi!
> 
> On Fri, Apr 21, 2023 at 03:42:48AM +0200, Alejandro Colomar wrote:
>> On 4/21/23 02:39, наб wrote:
>>> Explicitly spell out the ranges involved. The original wording always
>>> confused me, but it's actually very sane.
>>>
>>> Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
>>> R_NOTEOL? No. That's weird and confusing.
>>>
>>> String largeness doesn't matter, known-lengthness does.
>>>
>>> Explicitly spell out the influence on returned matches
>>> (relative to string, not start of range).
>>>
>>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
>>
>> Patch applied.
>>
>>> ---
>>> Range-diff against v6:
>>> 1:  4b7971a5e < -:  --------- regex.3: Desoupify regfree() description
>>> 2:  5fb4cc16f ! 1:  ed050649b regex.3: Improve REG_STARTEND
>>>     @@ man3/regex.3: .SS Matching
>>>      -and ending before byte
>>>      -.IR pmatch[0].rm_eo .
>>>      +Match
>>>     -+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
>>>     ++.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
>>>      +instead of
>>>     -+.RI [ string ", " string " + \fBstrlen\fP(" string )).
>>>     ++.RI [ string , " string + strlen(string)" ).
>>>       This allows matching embedded NUL bytes
>>>       and avoids a
>>>       .BR strlen (3)
>>>     @@ man3/regex.3: .SS Matching
>>>      +as usual, and the match offsets remain relative to
>>>      +.IR string
>>>      +(not
>>>     -+.IR string " + " pmatch[0].rm_so ).
>>>     ++.IR "string + pmatch[0].rm_so" ).
>>>       This flag is a BSD extension, not present in POSIX.
>>>       .SS Match offsets
>>>       Unless
>>>
>>>  man3/regex.3 | 29 ++++++++++++++++-------------
>>>  1 file changed, 16 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/man3/regex.3 b/man3/regex.3
>>> index 46a4a12b9..099c2c17f 100644
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -131,23 +131,26 @@ .SS Matching
>>>  above).
>>>  .TP
>>>  .B REG_STARTEND
>>> -Use
>>> -.I pmatch[0]
>>> -on the input string, starting at byte
>>> -.I pmatch[0].rm_so
>>> -and ending before byte
>>> -.IR pmatch[0].rm_eo .
>>> +Match
>>> +.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
>>> +instead of
>>> +.RI [ string , " string + strlen(string)" ).
>>>  This allows matching embedded NUL bytes
>>>  and avoids a
>>>  .BR strlen (3)
>>> -on large strings.
>>> -It does not use
>>> +on known-length strings.
>>> +If any matches are returned
>>> +.RB ( REG_NOSUB
>>> +wasn't passed to
>>> +.BR regcomp (),
>>> +the match succeeded, and
>>>  .I nmatch
>>> -on input, and does not change
>>> -.B REG_NOTBOL
>>> -or
>>> -.B REG_NEWLINE
>>> -processing.
>>> +> 0), they overwrite
>>> +.I pmatch
>>> +as usual, and the match offsets remain relative to
>>> +.IR string
>>
>> Minor glitch: s/IR/I/
>>
>> I fixed it.  BTW, don't know if you knew, but you can run some linters
>> to check these accidents by yourself.
> 
> 
> $ make check
> # ...
> GREP    .tmp/man/man1/memusage.1.check-catman.touch
> .tmp/man/man1/memusage.1.cat.grep:132:           Memory usage summary: heap total: 45200, heap peak: 6440, stack peak: 224
> .tmp/man/man1/memusage.1.cat.grep:135:           realloc|        40         44800             0  (nomove:40, dec:19, free:0)
> make: *** [share/mk/check/catman.mk:36: .tmp/man/man1/memusage.1.check-catman.touch] Error 1

That means the line goes beyond the 80-column margin in rendered pages.
There are pages where code examples go beyond that limit, and I can
only live with it :(.  Ideally, that test should pass in every page,
but in some cases it's impossible.

I know the name of the test is horrible.  Feel free to suggest
alternatives.  Maybe something like 'CHECK (80-col)	$@' would do.

> 
> 
> $ make lint
> SED     .tmp/man/man2/add_key.2.d/add_key.c
> LINT (checkpatch)       .tmp/man/man2/add_key.2.d/add_key.lint-c.checkpatch.touch
> bash: line 1: checkpatch: command not found
> make: *** [share/mk/lint/c.mk:64: .tmp/man/man2/add_key.2.d/add_key.lint-c.checkpatch.touch] Error 127
> 
> git grep checkpatch first says I want checkpatch(1).
> No such manual exists, at least in Debian.

Nope; that manual page probably only exists in my servers :)

<http://www.alejandro-colomar.es/src/alx/linux/checkpatch.git/>

> Then it reveals I actually want checkpatch.pl from a linux checkout.
> Probably call it [scripts/]checkpatch.pl then?

The thing is, I suggested (privately; I hate that I can't
refer to some list archive) to the checkpatch.pl maintainers
that checkpatch.pl be split into a standalone project that can be
packaged separately and has a separate git history.  That
way it would be directly useful to many other projects that
follow coding styles similar to the kernel's.

I prepared a proof of concept in that repo, but we agreed
that it would be better to keep the files' entire history
from the Linux git repository, so I need to learn how to extract
a few files from a git repo with their history (I know how to
do that for a single file or directory, but cherry-picking
files is more complex, and I haven't looked deeply into it yet).

So I need to do that work before trying to host that repo in
<kernel.org>.

Feel free to check out that repo, but keep in mind that I
will rewrite the entire history when I learn how to do it.

> 
> Then it reveals
>   CHECKPATCH              := checkpatch

For me it's in

$ which checkpatch
/usr/local/bin/checkpatch

And it's a modified version to be nicer to non-kernel projects.

> which means that just
>   export CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl
> doesn't work, and I need to pass it as an argument (should be ?=).
> The same for all the other linters.

Yeah; feel free to send patches :)

> 
> $ make -j25 CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl lint
> # ...
> LINT (mandoc)   .tmp/man/man1/pldd.1.lint-man.mandoc.touch
> mandoc: man1/getent.1:6:14: WARNING: cannot parse date, using it verbatim: (date)
> # (same what feels like every page; bullseye mandoc 1.14.5-1)

If you only want to run $CHECKPATCH, you can run
`make lint-c-checkpatch`.  For a complete set of targets, see
`make help`.  (I know; I should have told you before, but that
way I learnt some stuff that might otherwise have gone unnoticed.)

> 
> If I pass MANDOC=~/code/voreutils/mandoc (recent(ish, it was recent last
> year) CVS, + some patches I forgot that fixed some egregious formatting
> errors):
> LINT (mandoc)   .tmp/man/man5/ftpusers.5.lint-man.mandoc.touch
> LINT (mandoc)   .tmp/man/man5/gai.conf.5.lint-man.mandoc.touch
> LINT (mandoc)   .tmp/man/man5/group.5.lint-man.mandoc.touch
> LINT (mandoc)   .tmp/man/man5/host.conf.5.lint-man.mandoc.touch
> mandoc: man5/erofs.5:78:2: ERROR: skipping end of block that is not open: RE
> mandoc: man5/erofs.5:79:2: WARNING: skipping paragraph macro: IP empty
> mandoc: man5/erofs.5:78:2: WARNING: skipping paragraph macro: br at the end of SS

I see the same errors; feel free to send patches :)

$ make lint check -t
$ touch man5/erofs.5 
$ make lint check -k
LINT (mandoc)	.tmp/man/man5/erofs.5.lint-man.mandoc.touch
mandoc: man5/erofs.5:78:2: ERROR: skipping end of block that is not open: RE
mandoc: man5/erofs.5:79:2: WARNING: skipping paragraph macro: IP empty
mandoc: man5/erofs.5:78:2: WARNING: skipping paragraph macro: br at the end of SS
make: *** [share/mk/lint/man.mk:33: .tmp/man/man5/erofs.5.lint-man.mandoc.touch] Error 1
LINT (tbl comment)	.tmp/man/man5/erofs.5.lint-man.tbl.touch
make: Target 'lint' not remade because of errors.
PRECONV	.tmp/man/man5/erofs.5.tbl
TBL	.tmp/man/man5/erofs.5.eqn
EQN	.tmp/man/man5/erofs.5.cat.troff
TROFF	.tmp/man/man5/erofs.5.cat.grotty
an.tmac:man5/erofs.5:18: style: use of deprecated macro: .PD
an.tmac:man5/erofs.5:24: style: use of deprecated macro: .PD
an.tmac:man5/erofs.5:50: style: .BR expects at least 2 arguments, got 1
an.tmac:man5/erofs.5:78: style: unbalanced .RE
found style problems; aborting
make: *** [share/mk/build/catman.mk:80: .tmp/man/man5/erofs.5.cat.grotty] Error 1
make: *** Deleting file '.tmp/man/man5/erofs.5.cat.grotty'
make: Target 'check' not remade because of errors.


> 
> And it passes!

Do you mean that make doesn't recognize the error?

> Those are the only errors I saw, even on the version with
> IR\ string$
> 
> When I ran with 2>&1 | less to make sure, I got 
> /etc/bash.bashrc: line 7: PS1: unbound variable

So it seems.

> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> SED     .tmp/man/man2/add_key.2.d/add_key.c
> SED     .tmp/man/man2/bind.2.d/bind.c
> SED     .tmp/man/man2/chown.2.d/chown.c
> SED     .tmp/man/man2/clock_getres.2.d/clock_getres.c
> SED     .tmp/man/man2/clone.2.d/clone.c
> SED     .tmp/man/man2/close_range.2.d/close_range.c
> SED     .tmp/man/man2/copy_file_range.2.d/copy_file_range.c
> SED     .tmp/man/man2/eventfd.2.d/eventfd.c
> and indeed
> Makefile:SHELL := /usr/bin/env bash -Eeuo pipefail
> and
> $ sed -n 6,7p /etc/bash.bashrc
> # If not running interactively, don't do anything
> [ -z "$PS1" ] && return

I have the same bashrc (Debian Sid here), and have this same
line.  Why is it failing only for you?  Maybe I modified
something in my startup scripts?  Maybe you did?

> 
> (That should be ${PS1-}. What's even funnier is that

Should we call debbugs?  :)

>  $ sed -n 14p /etc/bash.bashrc
>  if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then)

Huh!

> 
> 
> $ make -j25 CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl lint MANDOC=: CLANG-TIDY=:
> LINT (checkpatch)       .tmp/man/man3/_Generic.3.d/_Generic.lint-c.checkpatch.touch
> ERROR:ASSIGN_IN_IF: do not use assignment in if condition
> #17: FILE: .tmp/man/man3const/EXIT_SUCCESS.3const.d/EXIT_SUCCESS.c:17:
> +    if ((fp = fopen(argv[1], "r")) == NULL) {
> 
> Do not use assignments in if condition.
> Example::
> 
>   if ((foo = bar(...)) < BAZ) {
> 
> should be written as::
> 
>   foo = bar(...);
>   if (foo < BAZ) {
> 
> total: 1 errors, 0 warnings, 0 checks, 29 lines checked
> make: *** [share/mk/lint/c.mk:64: .tmp/man/man3const/EXIT_SUCCESS.3const.d/EXIT_SUCCESS.lint-c.checkpatch.touch] Error 1
> make: *** Waiting for unfinished jobs....

Hmmm, yes, I see that same error; this page is recent, so I
probably haven't run the linters on it yet.  :/

Thanks for the catch!  Fixed.
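
For reference, the shape checkpatch asks for here looks roughly like
the sketch below.  It is not the actual EXIT_SUCCESS(3const) example;
count_bytes() is a hypothetical function used only to show the
assignment moved out of the condition.

```c
#include <stdio.h>

/*
 * Count the bytes in the file at path; return -1 if it can't be opened.
 * (count_bytes is a hypothetical function for illustration only.)
 */
long
count_bytes(const char *path)
{
    long  n;
    FILE  *fp;

    /* Assign first, then test, instead of assigning inside the if. */
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    n = 0;
    while (getc(fp) != EOF)
        n++;

    fclose(fp);
    return n;
}
```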

> CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
> #17: FILE: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.c:17:
> +    printf("Name: \"%s\" (%d segments)\n", info->dlpi_name,
> +               info->dlpi_phnum);

This page has so many warnings that I probably missed these
valid ones.  The alignment looks like it was done by a schoolchild
who can't stay inside the lines while painting :p.  Fixed.

> 
> CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
> #33: FILE: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.c:33:
> +        printf("    %2zu: [%14p; memsz:%7jx] flags: %#jx; ", j,
> +                (void *) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr),
> 
> total: 0 errors, 0 warnings, 2 checks, 54 lines checked
> make: *** [share/mk/lint/c.mk:63: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.lint-c.checkpatch.touch] Error 1
> WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'closeSocketPair', this function's name, in a string
> #230: FILE: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.c:230:
> +        err(EXIT_FAILURE, "closeSocketPair-close-0");
> 
> Embedded function names are less appropriate to use as
> refactoring can cause function renaming.  Prefer the use of
> "%s", __func__ to embedded function names.
> 
> Note that this does not work with -f (--file) checkpatch option
> as it depends on patch context providing the function name.
> 
> WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'closeSocketPair', this function's name, in a string
> #232: FILE: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.c:232:
> +        err(EXIT_FAILURE, "closeSocketPair-close-1");

I've seen this one, and thought of fixing it, but I'm not
yet sure how to do it so that the page is consistent with
itself.  So far I've not done anything.
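
For what it's worth, a minimal sketch of what the checkpatch suggestion
amounts to.  It only formats the message into a buffer so that it can
be checked; in the page it would presumably be something like
err(EXIT_FAILURE, "%s-close-0", __func__), which is an assumption, not
a decided change.  The function below is a stand-in that merely reuses
the name from the warning.

```c
#include <stdio.h>

/* Stand-in for the page's function of the same name: build the
   diagnostic tag without spelling the function name in the string. */
static void
closeSocketPair(char *msg, size_t size)
{
	/* __func__ is "closeSocketPair" here, and follows any rename. */
	snprintf(msg, size, "%s-close-0", __func__);
}
```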

> 
> total: 0 errors, 2 warnings, 0 checks, 612 lines checked
> make: *** [share/mk/lint/c.mk:63: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.lint-c.checkpatch.touch] Error 1
> 
> (more pages)
> 
> 
> I'm not sure I agree with the ASSIGN_IN_IF case,

I do agree with it; it's just that I don't run these often.
I tend to ignore linters that report many warnings in current
pages, but they're still useful sometimes.

> but I'm assuming
> there's a mechanism to kill the lints you don't care about;
> linux cdc9718d5e590d6905361800b938b93f2b66818e.

I disable the unwanted lints in the Makefile, so whatever you see
is probably either a wanted warning or one that the linter added
recently.  However, fixing all pages would be impossible :(.

> 
> 
> This continues until I've disabled every linter.
> I'm assuming you have specific versions that work for you,
> but, well.

No; I do see a lot of noise too.  The thing is it's still
useful for linting specific pages:

    $ make lint check -t >/dev/null  # ignore everything
    $ make lint check -W man5/erofs.5  # lint only that page

Cheers,
Alex

> 
> 
> Best,
> наб

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v8 0/5] regex.3 momento
  2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
                                       ` (4 preceding siblings ...)
  2023-04-21  2:49                     ` [PATCH v8 5/5] regex.3: Further clarify the sole purpose of REG_NOSUB наб
@ 2023-04-21 10:00                     ` Alejandro Colomar
  5 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 10:00 UTC (permalink / raw)
  To: наб; +Cc: linux-man


Hi!

On 4/21/23 04:48, наб wrote:
> As a pull.rebase = true enjoyer, it was very easy
> (indeed, git pull and axe the single-line conflict + empty commit),
> and it's what I've been doing the entire time; recommend it.

Heh, I never run `git pull`.  It feels too dangerous.

I prefer `git fetch` and then doing manually whatever needs to be
done, so I know exactly what goes on.

> 
> 5/5 remains a toss-up for me. Apply it if you think it's better,
> don't if you don't.
> 
> https://bugs.debian.org/1034658

:)

> 
> наб (5):
>   regex.3: Desoupify regerror() description
>   regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link
>     regex_t.3type into regex.3
>   regex.3: Finalise move of reg*.3type
>   regex.3: Destandardeseify Match offsets
>   regex.3: Further clarify the sole purpose of REG_NOSUB
> 
>  man3/regex.3              | 179 +++++++++++++++++++++++---------------
>  man3type/regex_t.3type    |  64 +-------------
>  man3type/regmatch_t.3type |   2 +-
>  man3type/regoff_t.3type   |   2 +-
>  4 files changed, 110 insertions(+), 137 deletions(-)
> 
> No clue where it got this. The interdiff is just the .IR -> .I.
> 
> Range-diff against v7:
> 1:  783a16431 ! 1:  4479e1572 regex.3: Desoupify regerror() description
>     @@ man3/regex.3: .SS Error reporting
>      +.IR errbuf ;
>      +the error string is always null-terminated, and truncated to fit.
>       .SS Freeing
>     - Supplying
>       .BR regfree ()
>     + deinitializes the pattern buffer at

This means that the context of the patches changed (due to the rebase),
even if the +/- haven't changed themselves.  Basically what would be
"applying with fuzz" when refreshing a patch.

> 2:  5706f1892 < -:  --------- regex.3: Desoupify regfree() description
> 3:  baacf086f < -:  --------- regex.3: Improve REG_STARTEND

The cause is probably that I applied these before it.

Cheers,
Alex

> 4:  056c3ff04 = 2:  bad307847 regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
> 5:  44d7b775d = 3:  edefa8a5e regex.3: Finalise move of reg*.3type
> 6:  79641df02 = 4:  500070a5e regex.3: Destandardeseify Match offsets
> 7:  26d06c07f = 5:  b01685c7a regex.3: Further clarify the sole purpose of REG_NOSUB

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v8 1/5] regex.3: Desoupify regerror() description
  2023-04-21  2:48                     ` [PATCH v8 1/5] regex.3: Desoupify regerror() description наб
@ 2023-04-21 10:06                       ` Alejandro Colomar
  2023-04-21 12:03                         ` [PATCH v9] " наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 10:06 UTC (permalink / raw)
  To: наб; +Cc: linux-man


On 4/21/23 04:48, наб wrote:
> +If
> +.I preg
> +isn't a null pointer,
> +.I errcode
> +must be the latest error returned from an operation on
> +.IR preg .
> +.PP
> +If
> +.I errbuf_size
> +is
> +.BR 0 ,
> +the size of the required buffer is returned.

From that phrasing, I wonder what it returns otherwise.  Probably the
same, right?  Which is confusing.  Maybe state the return value
unconditionally, and only say that if errbuf_size is 0 the buffer is
ignored and no copy is performed?
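
To make the point concrete, a minimal sketch of the usual two-call
idiom, assuming (as POSIX specifies) that regerror() returns the
required size, including the terminating null byte, whatever
errbuf_size is.  The helper name regerror_dup() is made up.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc()ed copy of the error string
   for 'errcode', or NULL if allocation fails. */
static char *
regerror_dup(int errcode, const regex_t *preg)
{
	size_t  size;
	char    *buf;

	/* errbuf_size == 0: errbuf is ignored and nothing is copied;
	   only the required size is reported. */
	size = regerror(errcode, preg, NULL, 0);

	buf = malloc(size);
	if (buf == NULL)
		return NULL;

	/* Second call returns the same size and fills the buffer. */
	regerror(errcode, preg, buf, size);
	return buf;
}
```

A caller would pass the nonzero value returned by regcomp() or
regexec() together with the same preg, and free() the result.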

> +Otherwise, up to
>  .I errbuf_size
> -are nonzero,
> -.I errbuf
> -is filled in with the first
> -.I "errbuf_size \- 1"
> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
> +bytes are copied to
> +.IR errbuf ;
> +the error string is always null-terminated, and truncated to fit.
>  .SS Freeing
>  .BR regfree ()
>  deinitializes the pattern buffer at

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21  2:16                         ` наб
  2023-04-21  9:45                           ` Alejandro Colomar
@ 2023-04-21 10:19                           ` Jakub Wilk
  2023-04-21 10:22                             ` Alejandro Colomar
  2023-04-21 11:34                             ` наб
  1 sibling, 2 replies; 143+ messages in thread
From: Jakub Wilk @ 2023-04-21 10:19 UTC (permalink / raw)
  To: наб; +Cc: Alejandro Colomar, linux-man

* наб <nabijaczleweli@nabijaczleweli.xyz>, 2023-04-21 04:16:
>/etc/bash.bashrc: line 7: PS1: unbound variable

How come? bash is not supposed to read bashrc if the shell is 
non-interactive (unless you instruct it otherwise).

>Makefile:SHELL := /usr/bin/env bash -Eeuo pipefail

Unrelated, but what is /usr/bin/env for?

-- 
Jakub Wilk

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 10:19                           ` Jakub Wilk
@ 2023-04-21 10:22                             ` Alejandro Colomar
  2023-04-21 10:44                               ` Jakub Wilk
  2023-04-21 11:34                             ` наб
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 10:22 UTC (permalink / raw)
  To: Jakub Wilk; +Cc: linux-man, наб


Hi Jakub!

On 4/21/23 12:19, Jakub Wilk wrote:
> * наб <nabijaczleweli@nabijaczleweli.xyz>, 2023-04-21 04:16:
>> /etc/bash.bashrc: line 7: PS1: unbound variable
> 
> How come? bash is not supposed to read bashrc if the shell is 
> non-interactive (unless you instruct it otherwise).
> 
>> Makefile:SHELL := /usr/bin/env bash -Eeuo pipefail
> 
> Unrelated, but what is /usr/bin/env for?

$ git blame -- Makefile | grep bin/env
26061fbd33 (Alejandro Colomar 2022-06-19 19:55:58 +0200  31) SHELL := /usr/bin/env bash -Eeuo pipefail


$ git show 26061fbd33
commit 26061fbd337fbcfb6255def88ef4f0573c090702
Author: Alejandro Colomar <alx@kernel.org>
Date:   Sun Jun 19 19:55:58 2022 +0200

    Makefile: SHELL: Use a portable bash
    
    Reported-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
    Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>

diff --git a/Makefile b/Makefile
index 9beca11de..cb1466370 100644
--- a/Makefile
+++ b/Makefile
@@ -28,7 +28,7 @@
 #
 ########################################################################
 
-SHELL := /bin/bash -Eeuo pipefail
+SHELL := /usr/bin/env bash -Eeuo pipefail
 
 
 MAKEFLAGS += --no-print-directory


This helps on systems where bash(1) is not a system command (probably
macOS, and maybe others).

Cheers,
Alex


-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21  2:48                     ` [PATCH v8 3/5] regex.3: Finalise move of reg*.3type наб
@ 2023-04-21 10:33                       ` Alejandro Colomar
  2023-04-21 10:34                         ` Alejandro Colomar
       [not found]                         ` <1d2d0aa8-cb28-2d7f-c48b-7a02f907cb5b@gmail.com>
  0 siblings, 2 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 10:33 UTC (permalink / raw)
  To: наб; +Cc: linux-man


Hi!

On 4/21/23 04:48, наб wrote:
> They're inextricably linked, not cross-referenced at all,
> and not used anywhere else.
> 
> Now that they (realistically) exist to the reader, add a note
> on how big nmatch can be; POSIX even says "The application developer
> should note that there is probably no reason for using a value of
> nmatch that is larger than preg−>re_nsub+1.".
> 
> Also remove the now-duplicate regmatch_t declaration.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied, with minor tweaks; see below (I guess you approve them).

Cheers,
Alex

> ---
>  man3/regex.3 | 54 +++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 34 insertions(+), 20 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index f6465d484..46fd3adef 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -15,7 +15,7 @@ .SH LIBRARY
>  Standard C library
>  .RI ( libc ", " \-lc )
>  .SH SYNOPSIS
> -.nf
> +.EX

I've been thinking about this, but am not yet fully convinced.  I'll
propose two alternatives, and let you decide which looks best.

(a)  Use .nf/.fi for the function prototypes, and .EX/.EE for the
     types.

(b)  .EX/.EE for everything, as you did.

Please have a look at the PDF versions (you can run
`pdfman ./man3/regex.3` after you `source ./scripts/bash_aliases`).

If you're going to use it often, I suggest the following in
~/.bash_aliases:

if [ -f ~/src/linux/man-pages/man-pages/main/scripts/bash_aliases ]; then
	. ~/src/linux/man-pages/man-pages/main/scripts/bash_aliases;
fi;


I've removed these bits from this patch, since the rest seems
uncontroversial to me.


>  .B #include <regex.h>
>  .PP
>  .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
> @@ -43,7 +43,7 @@ .SH SYNOPSIS
>  .B } regmatch_t;
>  .PP
>  .BR typedef " /* ... */  " regoff_t;
> -.fi
> +.EE
>  .SH DESCRIPTION
>  .SS Compilation
>  .BR regcomp ()
> @@ -60,6 +60,21 @@ .SS Compilation
>  The locale must be the same when running
>  .BR regexec ().
>  .PP
> +After
> +.BR regcomp ()
> +succeeds,
> +.I preg->re_nsub
> +holds the number of subexpressions in
> +.IR regex .
> +Thus, a value of
> +.I preg->re_nsub
> ++ 1
> +passed as
> +.I nmatch
> +to
> +.BR regexec ()
> +is sufficient to capture all matches.
> +.PP
>  .I cflags
>  is the
>  bitwise OR
> @@ -192,22 +207,6 @@ .SS Match offsets
>  .IR N+1 .)
>  Any unused structure elements will contain the value \-1.
>  .PP
> -The
> -.I regmatch_t
> -structure which is the type of
> -.I pmatch
> -is defined in
> -.IR <regex.h> .
> -.PP
> -.in +4n
> -.EX
> -typedef struct {
> -    regoff_t rm_so;
> -    regoff_t rm_eo;
> -} regmatch_t;
> -.EE
> -.in
> -.PP
>  Each
>  .I rm_so
>  element that is not \-1 indicates the start offset of the next largest
> @@ -218,7 +217,7 @@ .SS Match offsets
>  which is the offset of the first character after the matching text.
>  .PP
>  .I regoff_t
> -It is a signed integer type
> +is a signed integer type
>  capable of storing the largest value that can be stored in either an
>  .I ptrdiff_t
>  type or a
> @@ -344,12 +343,27 @@ .SH HISTORY
>  POSIX.1-2001.
>  .PP
>  Prior to POSIX.1-2008,
> -the type was
> +.I regoff_t
> +was required to be
>  capable of storing the largest value that can be stored in either an
>  .I off_t
>  type or a
>  .I ssize_t
>  type.
> +.SH NOTES

NOTES is dreaded, and only used when no other section would work.
CAVEATS (recently added to the Linux man-pages) is more suitable;
I've edited your patch to use it.

> +.I re_nsub
> +is only required to be initialized if
> +.B REG_NOSUB
> +wasn't specified, but all known implementations initialize it regardless.
> +.\" glibc, musl, 4.4BSD, illumos
> +.PP
> +Both
> +.I regex_t
> +and
> +.I regmatch_t
> +may (and do) have more members, in any order.
> +Always reference them by name.
> +.\" illumos has two more start/end pairs and the first one is of pointers
>  .SH EXAMPLES
>  .EX
>  #include <stdint.h>

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 10:33                       ` Alejandro Colomar
@ 2023-04-21 10:34                         ` Alejandro Colomar
  2023-04-21 11:26                           ` наб
       [not found]                         ` <1d2d0aa8-cb28-2d7f-c48b-7a02f907cb5b@gmail.com>
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 10:34 UTC (permalink / raw)
  To: наб; +Cc: linux-man



On 4/21/23 12:33, Alejandro Colomar wrote:
> Hi!
> 
> On 4/21/23 04:48, наб wrote:
>> They're inextricably linked, not cross-referenced at all,
>> and not used anywhere else.
>>
>> Now that they (realistically) exist to the reader, add a note
>> on how big nmatch can be; POSIX even says "The application developer
>> should note that there is probably no reason for using a value of
>> nmatch that is larger than preg−>re_nsub+1.".
>>
>> Also remove the now-duplicate regmatch_t declaration.
>>
>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> 
> Patch applied, with minor tweaks; see below (I guess you approve them).
> 
> Cheers,
> Alex
> 
>> ---
>>  man3/regex.3 | 54 +++++++++++++++++++++++++++++++++-------------------
>>  1 file changed, 34 insertions(+), 20 deletions(-)
>>
>> diff --git a/man3/regex.3 b/man3/regex.3
>> index f6465d484..46fd3adef 100644
>> --- a/man3/regex.3
>> +++ b/man3/regex.3
>> @@ -15,7 +15,7 @@ .SH LIBRARY
>>  Standard C library
>>  .RI ( libc ", " \-lc )
>>  .SH SYNOPSIS
>> -.nf
>> +.EX
> 
> I've been thinking about this, but am not yet fully convinced.  I'll
> propose you the two alternatives, and let you decide what looks best.
> 
> (a)  Use .nf/.fi for the function prototypes, and .EX/.EE for the
>      types.
> 
> (b)  .EX/.EE for everything, as you did.
> 
> Please have a look at the PDF versions (you can run
> `pdfman ./man3/regex.3` after you `source ./scripts/bash_aliases`).
> 
> If you're going to use it often, I suggest the following in
> ~/.bash_aliases:
> 
> if [ -f ~/src/linux/man-pages/man-pages/main/scripts/bash_aliases ]; then
> 	. ~/src/linux/man-pages/man-pages/main/scripts/bash_aliases;
> fi;
> 
> 
> I've removed these bits from this patch, since the rest seems
> uncontroversial to me.

But I haven't pushed, so that we can still have it in the same
patch if you confirm.

> 
> 
>>  .B #include <regex.h>
>>  .PP
>>  .BI "int regcomp(regex_t *restrict " preg ", const char *restrict " regex ,
>> @@ -43,7 +43,7 @@ .SH SYNOPSIS
>>  .B } regmatch_t;
>>  .PP
>>  .BR typedef " /* ... */  " regoff_t;
>> -.fi
>> +.EE
>>  .SH DESCRIPTION
>>  .SS Compilation
>>  .BR regcomp ()
>> @@ -60,6 +60,21 @@ .SS Compilation
>>  The locale must be the same when running
>>  .BR regexec ().
>>  .PP
>> +After
>> +.BR regcomp ()
>> +succeeds,
>> +.I preg->re_nsub
>> +holds the number of subexpressions in
>> +.IR regex .
>> +Thus, a value of
>> +.I preg->re_nsub
>> ++ 1
>> +passed as
>> +.I nmatch
>> +to
>> +.BR regexec ()
>> +is sufficient to capture all matches.
>> +.PP
>>  .I cflags
>>  is the
>>  bitwise OR
>> @@ -192,22 +207,6 @@ .SS Match offsets
>>  .IR N+1 .)
>>  Any unused structure elements will contain the value \-1.
>>  .PP
>> -The
>> -.I regmatch_t
>> -structure which is the type of
>> -.I pmatch
>> -is defined in
>> -.IR <regex.h> .
>> -.PP
>> -.in +4n
>> -.EX
>> -typedef struct {
>> -    regoff_t rm_so;
>> -    regoff_t rm_eo;
>> -} regmatch_t;
>> -.EE
>> -.in
>> -.PP
>>  Each
>>  .I rm_so
>>  element that is not \-1 indicates the start offset of the next largest
>> @@ -218,7 +217,7 @@ .SS Match offsets
>>  which is the offset of the first character after the matching text.
>>  .PP
>>  .I regoff_t
>> -It is a signed integer type
>> +is a signed integer type
>>  capable of storing the largest value that can be stored in either an
>>  .I ptrdiff_t
>>  type or a
>> @@ -344,12 +343,27 @@ .SH HISTORY
>>  POSIX.1-2001.
>>  .PP
>>  Prior to POSIX.1-2008,
>> -the type was
>> +.I regoff_t
>> +was required to be
>>  capable of storing the largest value that can be stored in either an
>>  .I off_t
>>  type or a
>>  .I ssize_t
>>  type.
>> +.SH NOTES
> 
> NOTES is dreaded, and only used when no other section would work.
> CAVEATS (recently added to the Linux man-pages) is more suitable;
> I've edited your patch to use it.
> 
>> +.I re_nsub
>> +is only required to be initialized if
>> +.B REG_NOSUB
>> +wasn't specified, but all known implementations initialize it regardless.
>> +.\" glibc, musl, 4.4BSD, illumos
>> +.PP
>> +Both
>> +.I regex_t
>> +and
>> +.I regmatch_t
>> +may (and do) have more members, in any order.
>> +Always reference them by name.
>> +.\" illumos has two more start/end pairs and the first one is of pointers
>>  .SH EXAMPLES
>>  .EX
>>  #include <stdint.h>
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v8 4/5] regex.3: Destandardeseify Match offsets
  2023-04-21  2:49                     ` [PATCH v8 4/5] regex.3: Destandardeseify Match offsets наб
@ 2023-04-21 10:36                       ` Alejandro Colomar
  2023-04-21 12:55                         ` [PATCH v9] " наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 10:36 UTC (permalink / raw)
  To: наб; +Cc: linux-man


On 4/21/23 04:49, наб wrote:
> This section reads like it were (and pretty much is) lifted from POSIX.
> That's hard to read, because POSIX is horrendously verbose, as usual.
> 
> Instead, synopsise it into something less formal but more reasonable,
> and describe the resulting range with a range instead of a paragraph.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
>  1 file changed, 25 insertions(+), 28 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 46fd3adef..55fddd88e 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -184,37 +184,34 @@ .SS Matching

[...]

> +Each returned valid
> +.RB (non- \-1 )
> +match corresponds to the range
> +.RI [ string " + " rm_so ", " string " + " rm_eo ).

These be expressions :)
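
Spelled out in C, the half-open range reads roughly as follows.  This
is a minimal sketch; the helper name match_dup() is made up for the
example.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc()ed copy of the text that
   match 'm' covers in 'string', or NULL for an unused (-1) match
   or on allocation failure. */
static char *
match_dup(const char *string, const regmatch_t *m)
{
	size_t  len;
	char    *s;

	if (m->rm_so == -1)
		return NULL;

	/* The match is [string + rm_so, string + rm_eo). */
	len = (size_t) (m->rm_eo - m->rm_so);

	s = malloc(len + 1);
	if (s == NULL)
		return NULL;

	memcpy(s, string + m->rm_so, len);
	s[len] = '\0';
	return s;
}
```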

>  .PP
>  .I regoff_t
>  is a signed integer type

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 10:22                             ` Alejandro Colomar
@ 2023-04-21 10:44                               ` Jakub Wilk
  2023-04-21 11:16                                 ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: Jakub Wilk @ 2023-04-21 10:44 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, наб

* Alejandro Colomar <alx.manpages@gmail.com>, 2023-04-21 12:22:
>-SHELL := /bin/bash -Eeuo pipefail
>+SHELL := /usr/bin/env bash -Eeuo pipefail
>
>
> MAKEFLAGS += --no-print-directory
>
>
>This helps in systems where bash(1) is not a system command (probably 
>MacOS, and maybe others).

Yeah, but why not use simply

     SHELL = bash ...

?

-- 
Jakub Wilk

* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-21  5:00                 ` G. Branden Robinson
  2023-04-21  8:06                   ` a straw-man `SR` man(7) macro for (sub)section cross references (was: [PATCH v2 2/9] regex.3: improve REG_STARTEND) G. Branden Robinson
@ 2023-04-21 11:07                   ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:07 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: наб, linux-man, groff


Hi Branden!

On 4/21/23 07:00, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2023-04-21T00:29:12+0200, Alejandro Colomar wrote:
>> On 4/20/23 20:33, G. Branden Robinson wrote:
>>> [Note for non-mdoc(7) speakers: `Sx` is its macro for (sub)section
>>> heading cross references.  man(7) doesn't have an equivalent, though
>>> if there is demand, I'm happy to implement one.  :D]
>>
>> I've been delaying my global switch to non-shouting sexion headings,
>> due to not having a clear idea of how to refer to them.
> 
> Fair.
> 
>> Having a macro that does that for me, and ensures that the appropriate
>> formatting is applied might be a good solution.
> 
> Well, I have three ideas.
> 
> 1.  Mark them up the way the groff man pages do, in typographer's
>     quotes.
> 
> See \[lq]Match offsets\[rq] in
> .MR regex 3 .

Not bad, but if we can have some macro that hides these details, and
even lets users tune their favourite formatting, that may be nicer.
As a bonus, it adds hyperlinking abilities.  :-)

> 
> 2.  I could implement the `Q` quotation macro for man(7) that I've been
>     making noise about for a while.[1]  Of course, you'd be waiting for
>     the next release _after_ groff 1.23.0 for it...
> 
> See
> .Q "Match offsets"
> in
> .MR regex 3 .

I'm not yet convinced by a general need for .Q.  Since the single use
I've needed so far for it is in section references, I guess a .SR macro
is more appropriate.

> 
> 3.  I could implement a macro explicitly tuned to the problem of
>     (sub)section cross references.  I didn't see anybody come up with a
>     good way to shoehorn this functionality into `MR`, so I suggest the
>     following.

Agree; extending .MR for that seems not easy.

[... fixed in reply; below ...]

> 
> On devices supporting hyperlinks, "Match offsets" would be a hyperlink
> with a to-be-determined anchor reference.  "regex(3)" would be a
> hyperlink as with the `MR` macro today.  "Bugs" would be a hyperlink
> with a to-be-determined anchor reference within the current document.
> (OSC 8 support for this may require some thought, or maybe we'd just
> handle them like external page references.)
> 
> I trust the tradeoffs involved with each of the above solutions are
> readily apparent.



> 
>> It would also please the info(1) people, so that the few references we
>> have to those would be linked.
> 
> What's the URL format for hyperlinks into Info documents?

You ask me about how info(1) works?  :D

info(1) is to me as unknown as ed(1).  At least I can quit them both
with q.  There's not much more I know of either.

>  How is the
> existing .UR/.UE inadequate?

I meant more that man(7) would gain capabilities similar to those of
info documents.  The current implementation of man(1) is not powerful
enough to do what info(1) does, but I guess it would be conceivable
to implement an info-like system that takes man(7) source, similar to
what the lsp(1) program recently proposed on the list could do.

> 
> Regards,
> Branden
> 
> [1] https://mail.gnu.org/archive/html/groff/2022-12/msg00078.html




On 4/21/23 10:06, G. Branden Robinson wrote:
> [self-follow-up; updated subject]
> 
> At 2023-04-21T00:07:21-0500, G. Branden Robinson wrote:
>> 3.  I could implement a macro explicitly tuned to the problem of
>>     (sub)section cross references.  I didn't see anybody come up with a
>>     good way to shoehorn this functionality into `MR`, so I suggest the
>>     following.
>>
>> .SR section-or-subsection-title [page-topic page-section [trailing-punct]
> 
> On second thought, I think it would be better to have matched brackets
> here.  And more seriously, to permute the argument order to feel more
> parallel to `MR` (as well as `ME` and `UE`).
> 
> .SR section-or-subsection-title [trailing-punct [page-topic page-section]]

I like this one most, by far.

However, I wonder what happens when there are conflicting names for
subsections.  This doesn't happen often, but certainly happens.  Should
we disambiguate by specifying the section and subsection in separate
arguments?

.SR section-or-subsection-title [trailing-punct [page-topic page-section] [section-title]]

I guess that will be hard to implement, but should be doable, since it's
unambiguous.  It also answers what to do when the chapter is not specified:
it would be interpreted as a section instead, so author's fault.  Some
draft examples:

See
.SR Description

       See “Description”

See
.SR Description .

       See “Description”.

See
.SR Compilation . Description

       See “Compilation” (“Description”).

See
.SR Description . regex 3

       See “Description” (regex(3)).

See
.SR Compilation . regex 3 Description

       See “Compilation” (regex(3) “Description”).


Further arguments ignored.  The complex thing would be that the meaning
of the 3rd arg depends on having a 4th one, but it's not so bad.

Does it make sense to you?

Cheers,
Alex

> 
> Updating the example:
> 
> See
> .SR "Match offsets" . regex 3
> .
> Also see
> .SR Bugs
> below.
> 
> In this design, if argument 3 is present, argument 4 is mandatory.  This
> would need to be a pretty hard requirement.  Maybe the default section,
> if unspecified, would be "UNKNOWN".  This is rude but doesn't penalize
> the user any more than the document author does.  (There is also
> precedent in mdoc(7)'s setup macros.)  We don't want to `ab`ort page
> rendering for these errors because that will adversely affect innocent
> users who are simply trying to read documentation.
> 
> The foregoing would render as
> 
>        See “Match offsets” (regex(3)).  Also see “Bugs” below.
> 
> Regards,
> Branden

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 10:44                               ` Jakub Wilk
@ 2023-04-21 11:16                                 ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:16 UTC (permalink / raw)
  To: Jakub Wilk; +Cc: linux-man, наб, bug-make


Hi Jakub,

On 4/21/23 12:44, Jakub Wilk wrote:
> * Alejandro Colomar <alx.manpages@gmail.com>, 2023-04-21 12:22:
>> -SHELL := /bin/bash -Eeuo pipefail
>> +SHELL := /usr/bin/env bash -Eeuo pipefail
>>
>>
>> MAKEFLAGS += --no-print-directory
>>
>>
>> This helps in systems where bash(1) is not a system command (probably 
>> MacOS, and maybe others).
> 
> Yeah, but why not use simply
> 
>      SHELL = bash ...
> 
> ?

I couldn't find documentation that guarantees that that should work,
so we used shebang style, which will work for sure.

I CCd bug-make@, in case they can confirm what is safe and what is not.

Thanks,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5


* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 10:34                         ` Alejandro Colomar
@ 2023-04-21 11:26                           ` наб
  2023-04-21 11:36                             ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21 11:26 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

On Fri, Apr 21, 2023 at 12:34:39PM +0200, Alejandro Colomar wrote:
> But I haven't pushed, so that we can still have it in the same
> patch if you confirm.
Yeah, go on.


* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 10:19                           ` Jakub Wilk
  2023-04-21 10:22                             ` Alejandro Colomar
@ 2023-04-21 11:34                             ` наб
  2023-04-21 12:46                               ` Jakub Wilk
  1 sibling, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21 11:34 UTC (permalink / raw)
  To: Jakub Wilk; +Cc: Alejandro Colomar, linux-man

[-- Attachment #1: Type: text/plain, Size: 693 bytes --]

On Fri, Apr 21, 2023 at 12:19:57PM +0200, Jakub Wilk wrote:
> * наб <nabijaczleweli@nabijaczleweli.xyz>, 2023-04-21 04:16:
> > /etc/bash.bashrc: line 7: PS1: unbound variable
> How come? bash is not supposed to read bashrc if the shell is
> non-interactive (unless you instruct it otherwise).
No clue, surprised me as well, esp. since I didn't see any funny bash
flags to force interactivity. Should be protected against -u regardless.

> > Makefile:SHELL := /usr/bin/env bash -Eeuo pipefail
> Unrelated, but what is /usr/bin/env for?
Oddly, SHELL look-up appears to only be defined for DOS:
  https://www.gnu.org/software/make/manual/html_node/Choosing-the-Shell.html

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v4 6/6] regex.3: Destandardeseify Match offsets
  2023-04-20 15:05               ` наб
  2023-04-20 18:51                 ` G. Branden Robinson
@ 2023-04-21 11:34                 ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:34 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 4061 bytes --]

Hi,

On 4/20/23 17:05, наб wrote:
> Hi!
> 
> On Thu, Apr 20, 2023 at 04:10:04PM +0200, Alejandro Colomar wrote:
>> On 4/20/23 15:02, наб wrote:
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -188,37 +188,34 @@ This flag is a BSD extension, not present in POSIX.
>>>  .SS Match offsets
>>>  Unless
>>>  .B REG_NOSUB
>>> -was set for the compilation of the pattern buffer, it is possible to
>>> -obtain match addressing information.
>>> -.I pmatch
>>> -must be dimensioned to have at least
>>> -.I nmatch
>>> -elements.
>>> -These are filled in by
>>> +was passed to
>>> +.BR regcomp (),
>>> +it is possible to
>>> +obtain the locations of matches within
>>> +.IR string :
>>>  .BR regexec ()
>>> -with substring match addresses.
>>> -The offsets of the subexpression starting at the
>>> -.IR i th
>>> -open parenthesis are stored in
>>> -.IR pmatch[i] .
>>> -The entire regular expression's match addresses are stored in
>>> -.IR pmatch[0] .
>>> -(Note that to return the offsets of
>>> -.I N
>>> -subexpression matches,
>>> +fills
>>>  .I nmatch
>>> -must be at least
>>> -.IR N+1 .)
>>> -Any unused structure elements will contain the value \-1.
>>> +elements of
>>> +.I pmatch
>>> +with results:
>>> +.I pmatch[0]
>>> +corresponds to the entire match,
>> I still don't understand this.  Does REG_NOSUB also affect pmatch[0]?
>> I would have expected that it would only affect *sub*matches, that is, [>0].
> 
> Let's consult the manual:
>   REG_NOSUB  Do not report position of matches. [...]
>   REG_NOSUB  Compile for matching that need only report success or
>              failure, not what was matched.                    (4.4BSD)
> and POSIX:
>   REG_NOSUB  Report only success or fail in regexec().
>   REG_NOSUB  Report only success/fail in regexec( ).
> (yes; the two times it describes it, it's written differently).
> 
> POSIX says it better I think.
> 
> And, indeed:
> 	$ cat a.c
> 	#include <regex.h>
> 	#include <stdio.h>
> 	int main(int c, char ** v) {
> 		regex_t r;
> 		regcomp(&r, v[1], 0);
> 		regmatch_t dt = {0, 3};
> 		printf("%d\n", regexec(&r, v[2], 1, &dt, REG_STARTEND));
> 		printf("%d, %d\n", (int)dt.rm_so, (int)dt.rm_eo);
> 	}
> 
> 	$ cc a.c -oac
> 	$ ./ac 'c$' 'abcdef'
> 	0
> 	2, 3
> 
> 	$ sed 's/0)/REG_NOSUB)/' a.c | cc -xc - -oac
> 	$ ./ac 'c$' 'abcdef'
> 	0
> 	0, 3
> 

I like this example, and the quotes from POSIX.  I'll link to your
message in the commit log.

> 
> ...and I've just realised why you're asking ‒ I think you're reading too
> much (and ahistorically) into the "SUB" bit;

[...]

> Actually, let's consult POSIX.2 (Draft 11.2):

[...]

>   609  If the REG_NOSUB flag was not set in cflags, then regcomp() shall set re_nsub to
>   610  the number of parenthesized subexpressions [delimited by \( \) in basic regular
>   611  expressions or ( ) in extended regular expressions] found in pattern.
> both as present-day.

[...]

> It also allows an application to request an arbitrary number of sub-
>   810  strings from a regular expression. (Previous versions reported only ten sub-
>   811  strings.) The number of subexpressions in the regular expression is reported in
>   812  re_nsub in preg.

[...]

> 
> So: yes, there was a substitution interface that got cut.
> The name is actually a hold-over from
> "don't allocate for ten subexpressions in regex_t".

So, the name indeed seems to come from "subexpressions", which confirms
that it's just confusing as hell.

> 
> I think changing our description to
>   REG_NOSUB  Only report overall success. regexec() will only use pmatch
>              for REG_STARTEND, and ignore nmatch.
> may make that more obvious.

Yeah, this, and further the version in v8, makes the behavior clear, even
if the name is brain-damaged (but there's nothing we can do about it :/).

Cheers,
Alex

> 
> Best,
> наб

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread
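
The REG_NOSUB behaviour settled on in the message above (report only
overall success; regexec() reads pmatch solely as the REG_STARTEND input
range) can be sketched roughly as follows.  This is only an illustration,
assuming a libc that provides the REG_STARTEND extension (glibc or a BSD);
the helper name match_in_range() is hypothetical and not part of any patch
in this thread.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether `pattern` matches within
 * string[start, end), compiling with REG_NOSUB so that only
 * success or failure is reported.
 * On return, *range holds whatever regexec() left in pmatch[0].
 * Returns 0 on match, REG_NOMATCH on no match, or a regcomp() error code.
 */
int match_in_range(const char *pattern, const char *string,
                   regoff_t start, regoff_t end, regmatch_t *range)
{
	regex_t re;
	int ret;

	assert(range != NULL);

	ret = regcomp(&re, pattern, REG_NOSUB);
	if (ret != 0)
		return ret;

	/* With REG_STARTEND, pmatch[0] is an input: the range to search. */
	range->rm_so = start;
	range->rm_eo = end;
	ret = regexec(&re, string, 1, range, REG_STARTEND);

	regfree(&re);
	return ret;
}
```

Calling match_in_range("c$", "abcdef", 0, 3, &r) mirrors the second run of
a.c quoted above: on glibc the call succeeds and r is left at {0, 3},
because nothing is written back to pmatch under REG_NOSUB.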

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 11:26                           ` наб
@ 2023-04-21 11:36                             ` Alejandro Colomar
  2023-04-21 11:49                               ` наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:36 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 355 bytes --]



On 4/21/23 13:26, наб wrote:
> On Fri, Apr 21, 2023 at 12:34:39PM +0200, Alejandro Colomar wrote:
>> But I haven't pushed, so that we can still have it in the same
>> patch if you confirm.
> Yeah, go on.

But do you prefer (a) or (b)?

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 5/5] regex.3: Further clarify the sole purpose of REG_NOSUB
  2023-04-21  2:49                     ` [PATCH v8 5/5] regex.3: Further clarify the sole purpose of REG_NOSUB наб
@ 2023-04-21 11:44                       ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:44 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1030 bytes --]

Hi nab!

On 4/21/23 04:49, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.  Thanks,

Alex

> ---
>  man3/regex.3 | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 55fddd88e..060e8a587 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -96,16 +96,14 @@ .SS Compilation
>  searches using this pattern buffer will be case insensitive.
>  .TP
>  .B REG_NOSUB
> -Do not report position of matches.
> -The
> -.I nmatch
> -and
> -.I pmatch
> +Report only overall success.
>  .BR regexec ()
> -arguments will be ignored for this purpose (but
> +will use only
>  .I pmatch
> -may still be used for
> -.BR REG_STARTEND ).
> +for
> +.BR REG_STARTEND ,
> +ignoring
> +.IR nmatch .
>  .TP
>  .B REG_NEWLINE
>  Match-any-character operators don't match a newline.

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 11:36                             ` Alejandro Colomar
@ 2023-04-21 11:49                               ` наб
  0 siblings, 0 replies; 143+ messages in thread
From: наб @ 2023-04-21 11:49 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 396 bytes --]

On Fri, Apr 21, 2023 at 01:36:19PM +0200, Alejandro Colomar wrote:
> On 4/21/23 13:26, наб wrote:
> > On Fri, Apr 21, 2023 at 12:34:39PM +0200, Alejandro Colomar wrote:
> >> But I haven't pushed, so that we can still have it in the same
> >> patch if you confirm.
> > Yeah, go on.
> But do you prefer (a) or (b)?
(a); (b) looks better (imo as a mdoc enjoyer), but blows the A4 margin.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
  2023-04-21  2:48                     ` [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
@ 2023-04-21 11:55                       ` Alejandro Colomar
  2023-04-21 11:57                         ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:55 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1080 bytes --]



On 4/21/23 04:48, наб wrote:
> Move-only commit.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
>  man3/regex.3              | 30 ++++++++++++++++++
>  man3type/regex_t.3type    | 64 +--------------------------------------
>  man3type/regmatch_t.3type |  2 +-
>  man3type/regoff_t.3type   |  2 +-
>  4 files changed, 33 insertions(+), 65 deletions(-)
> 
[...]

> diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
> index 176d2c7a6..c0daaf0ff 100644
> --- a/man3type/regex_t.3type
> +++ b/man3type/regex_t.3type
> @@ -1,63 +1 @@
> -.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
> -.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
> -.\"
> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> -.\"
> -.\"
> -.TH regex_t 3type (date) "Linux man-pages (unreleased)"
> -.SH NAME
> -regex_t, regmatch_t, regoff_t

Should we keep the names in regex.3?

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
  2023-04-21 11:55                       ` Alejandro Colomar
@ 2023-04-21 11:57                         ` Alejandro Colomar
  2023-04-21 11:57                           ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:57 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1314 bytes --]



On 4/21/23 13:55, Alejandro Colomar wrote:
> 
> 
> On 4/21/23 04:48, наб wrote:
>> Move-only commit.
>>
>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
>> ---
>>  man3/regex.3              | 30 ++++++++++++++++++
>>  man3type/regex_t.3type    | 64 +--------------------------------------
>>  man3type/regmatch_t.3type |  2 +-
>>  man3type/regoff_t.3type   |  2 +-
>>  4 files changed, 33 insertions(+), 65 deletions(-)
>>
> [...]
> 
>> diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
>> index 176d2c7a6..c0daaf0ff 100644
>> --- a/man3type/regex_t.3type
>> +++ b/man3type/regex_t.3type
>> @@ -1,63 +1 @@
>> -.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
>> -.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
>> -.\"
>> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
>> -.\"
>> -.\"
>> -.TH regex_t 3type (date) "Linux man-pages (unreleased)"
>> -.SH NAME
>> -regex_t, regmatch_t, regoff_t
> 
> Should we keep the names in regex.3?

Although that probably confuses man(1), since it will believe those are
in main section 3, while they are in 3type.  Branden, any opinions?

> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
       [not found]                         ` <1d2d0aa8-cb28-2d7f-c48b-7a02f907cb5b@gmail.com>
@ 2023-04-21 11:57                           ` Ralph Corderoy
  2023-04-21 11:59                             ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: Ralph Corderoy @ 2023-04-21 11:57 UTC (permalink / raw)
  To: linux-man, groff

Hi Alejandro,

> > (a)  Use .nf/.fi for the function prototypes, and .EX/.EE for the
> >      types.
> > 
> > (b)  .EX/.EE for everything, as you did.
> > 
> > Please have a look at the PDF versions
...
> Which one looks better to you?  I've attached two PDF files

The Synopsis should not be in a fixed-width font.

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
  2023-04-21 11:57                         ` Alejandro Colomar
@ 2023-04-21 11:57                           ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:57 UTC (permalink / raw)
  To: наб, G. Branden Robinson; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1429 bytes --]

(forgot to TO Branden)

On 4/21/23 13:57, Alejandro Colomar wrote:
> 
> 
> On 4/21/23 13:55, Alejandro Colomar wrote:
>>
>>
>> On 4/21/23 04:48, наб wrote:
>>> Move-only commit.
>>>
>>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
>>> ---
>>>  man3/regex.3              | 30 ++++++++++++++++++
>>>  man3type/regex_t.3type    | 64 +--------------------------------------
>>>  man3type/regmatch_t.3type |  2 +-
>>>  man3type/regoff_t.3type   |  2 +-
>>>  4 files changed, 33 insertions(+), 65 deletions(-)
>>>
>> [...]
>>
>>> diff --git a/man3type/regex_t.3type b/man3type/regex_t.3type
>>> index 176d2c7a6..c0daaf0ff 100644
>>> --- a/man3type/regex_t.3type
>>> +++ b/man3type/regex_t.3type
>>> @@ -1,63 +1 @@
>>> -.\" Copyright (c) 2020-2022 by Alejandro Colomar <alx@kernel.org>
>>> -.\" and Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
>>> -.\"
>>> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
>>> -.\"
>>> -.\"
>>> -.TH regex_t 3type (date) "Linux man-pages (unreleased)"
>>> -.SH NAME
>>> -regex_t, regmatch_t, regoff_t
>>
>> Should we keep the names in regex.3?
> 
> Although that probably confuses man(1), since it will believe those are
> in main section 3, while they are in 3type.  Branden, any opinions?
> 
>>
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 11:57                           ` Ralph Corderoy
@ 2023-04-21 11:59                             ` Alejandro Colomar
  2023-04-21 12:03                               ` Alejandro Colomar
  2023-04-21 12:09                               ` Ralph Corderoy
  0 siblings, 2 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 11:59 UTC (permalink / raw)
  To: Ralph Corderoy, linux-man, groff


[-- Attachment #1.1: Type: text/plain, Size: 688 bytes --]

Hi Ralph,

On 4/21/23 13:57, Ralph Corderoy wrote:
> Hi Alejandro,
> 
>>> (a)  Use .nf/.fi for the function prototypes, and .EX/.EE for the
>>>      types.
>>>
>>> (b)  .EX/.EE for everything, as you did.
>>>
>>> Please have a look at the PDF versions
> ...
>> Which one looks better to you?  I've attached two PDF files
> 
> The Synopsis should not be in a fixed-width font.

I know and agree most of the time, but when it has structure types with
multi-line comments, you see what happens in the first PDFs I sent
(mis-aligned comments).

Cheers,
Alex

> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 11:59                             ` Alejandro Colomar
@ 2023-04-21 12:03                               ` Alejandro Colomar
  2023-04-21 12:09                               ` Ralph Corderoy
  1 sibling, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 12:03 UTC (permalink / raw)
  To: Ralph Corderoy, linux-man, groff


[-- Attachment #1.1: Type: text/plain, Size: 912 bytes --]



On 4/21/23 13:59, Alejandro Colomar wrote:
> Hi Ralph,
> 
> On 4/21/23 13:57, Ralph Corderoy wrote:
>> Hi Alejandro,
>>
>>>> (a)  Use .nf/.fi for the function prototypes, and .EX/.EE for the
>>>>      types.
>>>>
>>>> (b)  .EX/.EE for everything, as you did.
>>>>
>>>> Please have a look at the PDF versions
>> ...
>>> Which one looks better to you?  I've attached two PDF files
>>
>> The Synopsis should not be in a fixed-width font.
> 
> I know and agree most of the time, but when it has structure types with
> multi-line comments, you see what happens in the first PDFs I sent
> (mis-aligned comments).

On second thought, maybe the answer is to remove those comments, now
that the page better explains what these are in the DESCRIPTION.

> 
> Cheers,
> Alex
> 
>>
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v9] regex.3: Desoupify regerror() description
  2023-04-21 10:06                       ` Alejandro Colomar
@ 2023-04-21 12:03                         ` наб
  2023-04-21 12:26                           ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21 12:03 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2622 bytes --]

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Range-diff against v8:
1:  4479e1572 ! 1:  38109fcc6 regex.3: Desoupify regerror() description
    @@ man3/regex.3: .SS Error reporting
     +.IR preg .
     +.PP
     +If
    -+.I errbuf_size
    -+is
    -+.BR 0 ,
    -+the size of the required buffer is returned.
    -+Otherwise, up to
      .I errbuf_size
     -are nonzero,
     -.I errbuf
     -is filled in with the first
     -.I "errbuf_size \- 1"
     -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
    ++isn't 0, up to
    ++.I errbuf_size
     +bytes are copied to
     +.IR errbuf ;
     +the error string is always null-terminated, and truncated to fit.
      .SS Freeing
      .BR regfree ()
      deinitializes the pattern buffer at
    +@@ man3/regex.3: .SH RETURN VALUE
    + returns zero for a successful match or
    + .B REG_NOMATCH
    + for failure.
    ++.PP
    ++.BR regerror ()
    ++returns the size of the buffer required to hold the string.
    + .SH ERRORS
    + The following errors can be returned by
    + .BR regcomp ():

 man3/regex.3 | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index d91acc19d..efca582d7 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -210,27 +210,20 @@ .SS Error reporting
 .BR regexec ()
 into error message strings.
 .PP
-.BR regerror ()
-is passed the error code,
-.IR errcode ,
-the pattern buffer,
-.IR preg ,
-a pointer to a character string buffer,
-.IR errbuf ,
-and the size of the string buffer,
-.IR errbuf_size .
-It returns the size of the
-.I errbuf
-required to contain the null-terminated error message string.
-If both
-.I errbuf
-and
+If
+.I preg
+isn't a null pointer,
+.I errcode
+must be the latest error returned from an operation on
+.IR preg .
+.PP
+If
 .I errbuf_size
-are nonzero,
-.I errbuf
-is filled in with the first
-.I "errbuf_size \- 1"
-characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
+isn't 0, up to
+.I errbuf_size
+bytes are copied to
+.IR errbuf ;
+the error string is always null-terminated, and truncated to fit.
 .SS Freeing
 .BR regfree ()
 deinitializes the pattern buffer at
@@ -247,6 +240,9 @@ .SH RETURN VALUE
 returns zero for a successful match or
 .B REG_NOMATCH
 for failure.
+.PP
+.BR regerror ()
+returns the size of the buffer required to hold the string.
 .SH ERRORS
 The following errors can be returned by
 .BR regcomp ():
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 11:59                             ` Alejandro Colomar
  2023-04-21 12:03                               ` Alejandro Colomar
@ 2023-04-21 12:09                               ` Ralph Corderoy
  2023-04-21 12:14                                 ` Alejandro Colomar
  1 sibling, 1 reply; 143+ messages in thread
From: Ralph Corderoy @ 2023-04-21 12:09 UTC (permalink / raw)
  To: linux-man, groff

Hi Alejandro,

> when it has structure types with multi-line comments, you see what
> happens in the first PDFs I sent (mis-aligned comments).

Fix the formatting commands in the troff source so the comments are
aligned.  The man page is troff source for producing beautifully typeset
pages.

-- 
Cheers, Ralph.

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21  9:45                           ` Alejandro Colomar
@ 2023-04-21 12:13                             ` наб
  2023-04-21 12:21                               ` Alejandro Colomar
  2023-04-21 12:23                               ` Alejandro Colomar
  0 siblings, 2 replies; 143+ messages in thread
From: наб @ 2023-04-21 12:13 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 673 bytes --]

On Fri, Apr 21, 2023 at 11:45:07AM +0200, Alejandro Colomar wrote:
> On 4/21/23 04:16, наб wrote:
> > And it passes!
> Do you mean that make doesn't recognize the error?
I mean that
> > Those are the only errors I saw, even on the version with
> > IR\ string$
so, even if I'd run the linter pass, it wouldn't've found the line you
originally pointed out.

> I have the same bashrc (Debian Sid here), and have this same
> line.  Why is it failing only for you?  Maybe I modified
> something in my startup scripts?  Maybe you did?
Unlikely. What if you do make ... 2>&1 | less?

Or this is an unrelated bullseye bash bug that's fixed in bookworm.

Best,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v8 3/5] regex.3: Finalise move of reg*.3type
  2023-04-21 12:09                               ` Ralph Corderoy
@ 2023-04-21 12:14                                 ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 12:14 UTC (permalink / raw)
  To: Ralph Corderoy, linux-man, groff


[-- Attachment #1.1.1: Type: text/plain, Size: 769 bytes --]

Hi Ralph,

On 4/21/23 14:09, Ralph Corderoy wrote:
> Hi Alejandro,
> 
>> when it has structure types with multi-line comments, you see what
>> happens in the first PDFs I sent (mis-aligned comments).
> 
> Fix the formatting commands in the troff source so the comments are
> aligned.  The man page is troff source for producing beautifully typeset
> pages.

I guess that would involve raw troff commands, right?  Might be necessary
in some other case, but I dodged the bullet this time with
<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=b5c5fd34ac4537fc00089c977d8cb72d4de910e6>.

See attached PDF.

Cheers,
Alex

> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #1.1.2: regex.3.rRwFNb --]
[-- Type: application/pdf, Size: 39932 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 12:13                             ` наб
@ 2023-04-21 12:21                               ` Alejandro Colomar
  2023-04-21 12:23                               ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 12:21 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 1126 bytes --]

On 4/21/23 14:13, наб wrote:
> On Fri, Apr 21, 2023 at 11:45:07AM +0200, Alejandro Colomar wrote:
>> On 4/21/23 04:16, наб wrote:
>>> And it passes!
>> Do you mean that make doesn't recognize the error?
> I mean that
>>> Those are the only errors I saw, even on the version with
>>> IR\ string$
> so, even if I'd run the linter pass, it wouldn't've found the line you
> originally pointed out.
> 
>> I have the same bashrc (Debian Sid here), and have this same
>> line.  Why is it failing only for you?  Maybe I modified
>> something in my startup scripts?  Maybe you did?
> Unlikely.

$ grep PS1.*return /etc/bash.bashrc 
[ -z "$PS1" ] && return

> What if you do make ... 2>&1 | less?

Nothing bad.

I edited ~/.bash_aliases, but I don't think I have anything there
that would work around this issue.  I'm puzzled.

> 
> Or this is an unrelated bullseye bash bug that's fixed in bookworm.

No idea; it could be.  I don't have any bullseye to test it.

Cheers,
Alex

> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 12:13                             ` наб
  2023-04-21 12:21                               ` Alejandro Colomar
@ 2023-04-21 12:23                               ` Alejandro Colomar
  1 sibling, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 12:23 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 968 bytes --]

On 4/21/23 14:13, наб wrote:
> On Fri, Apr 21, 2023 at 11:45:07AM +0200, Alejandro Colomar wrote:
>> On 4/21/23 04:16, наб wrote:
>>> And it passes!
>> Do you mean that make doesn't recognize the error?
> I mean that
>>> Those are the only errors I saw, even on the version with
>>> IR\ string$
> so, even if I'd run the linter pass, it wouldn't've found the line you
> originally pointed out.

Yep; you probably need groff-1.23 for that (yet unreleased, but there's
an rc4 that you can build from source).  :)

Cheers

> 
>> I have the same bashrc (Debian Sid here), and have this same
>> line.  Why is it failing only for you?  Maybe I modified
>> something in my startup scripts?  Maybe you did?
> Unlikely. What if you do make ... 2>&1 | less?
> 
> Or this is an unrelated bullseye bash bug that's fixed in bookworm.
> 
> Best,

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v9] regex.3: Desoupify regerror() description
  2023-04-21 12:03                         ` [PATCH v9] " наб
@ 2023-04-21 12:26                           ` Alejandro Colomar
  2023-04-21 12:27                             ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 12:26 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3015 bytes --]

On 4/21/23 14:03, наб wrote:
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Aaand patch applied!  I believe I've got them all, right?

Cheers,
Alex

> ---
> Range-diff against v8:
> 1:  4479e1572 ! 1:  38109fcc6 regex.3: Desoupify regerror() description
>     @@ man3/regex.3: .SS Error reporting
>      +.IR preg .
>      +.PP
>      +If
>     -+.I errbuf_size
>     -+is
>     -+.BR 0 ,
>     -+the size of the required buffer is returned.
>     -+Otherwise, up to
>       .I errbuf_size
>      -are nonzero,
>      -.I errbuf
>      -is filled in with the first
>      -.I "errbuf_size \- 1"
>      -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
>     ++isn't 0, up to
>     ++.I errbuf_size
>      +bytes are copied to
>      +.IR errbuf ;
>      +the error string is always null-terminated, and truncated to fit.
>       .SS Freeing
>       .BR regfree ()
>       deinitializes the pattern buffer at
>     +@@ man3/regex.3: .SH RETURN VALUE
>     + returns zero for a successful match or
>     + .B REG_NOMATCH
>     + for failure.
>     ++.PP
>     ++.BR regerror ()
>     ++returns the size of the buffer required to hold the string.
>     + .SH ERRORS
>     + The following errors can be returned by
>     + .BR regcomp ():
> 
>  man3/regex.3 | 36 ++++++++++++++++--------------------
>  1 file changed, 16 insertions(+), 20 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index d91acc19d..efca582d7 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -210,27 +210,20 @@ .SS Error reporting
>  .BR regexec ()
>  into error message strings.
>  .PP
> -.BR regerror ()
> -is passed the error code,
> -.IR errcode ,
> -the pattern buffer,
> -.IR preg ,
> -a pointer to a character string buffer,
> -.IR errbuf ,
> -and the size of the string buffer,
> -.IR errbuf_size .
> -It returns the size of the
> -.I errbuf
> -required to contain the null-terminated error message string.
> -If both
> -.I errbuf
> -and
> +If
> +.I preg
> +isn't a null pointer,
> +.I errcode
> +must be the latest error returned from an operation on
> +.IR preg .
> +.PP
> +If
>  .I errbuf_size
> -are nonzero,
> -.I errbuf
> -is filled in with the first
> -.I "errbuf_size \- 1"
> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
> +isn't 0, up to
> +.I errbuf_size
> +bytes are copied to
> +.IR errbuf ;
> +the error string is always null-terminated, and truncated to fit.
>  .SS Freeing
>  .BR regfree ()
>  deinitializes the pattern buffer at
> @@ -247,6 +240,9 @@ .SH RETURN VALUE
>  returns zero for a successful match or
>  .B REG_NOMATCH
>  for failure.
> +.PP
> +.BR regerror ()
> +returns the size of the buffer required to hold the string.
>  .SH ERRORS
>  The following errors can be returned by
>  .BR regcomp ():

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v9] regex.3: Desoupify regerror() description
  2023-04-21 12:26                           ` Alejandro Colomar
@ 2023-04-21 12:27                             ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 12:27 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3251 bytes --]



On 4/21/23 14:26, Alejandro Colomar wrote:
> On 4/21/23 14:03, наб wrote:
>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> 
> Aaand patch applied!  I believe I've got all, right?

Feel free to add yourself to the copyright.  You clearly deserve it ;)

> 
> Cheers,
> Alex
> 
>> ---
>> Range-diff against v8:
>> 1:  4479e1572 ! 1:  38109fcc6 regex.3: Desoupify regerror() description
>>     @@ man3/regex.3: .SS Error reporting
>>      +.IR preg .
>>      +.PP
>>      +If
>>     -+.I errbuf_size
>>     -+is
>>     -+.BR 0 ,
>>     -+the size of the required buffer is returned.
>>     -+Otherwise, up to
>>       .I errbuf_size
>>      -are nonzero,
>>      -.I errbuf
>>      -is filled in with the first
>>      -.I "errbuf_size \- 1"
>>      -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
>>     ++isn't 0, up to
>>     ++.I errbuf_size
>>      +bytes are copied to
>>      +.IR errbuf ;
>>      +the error string is always null-terminated, and truncated to fit.
>>       .SS Freeing
>>       .BR regfree ()
>>       deinitializes the pattern buffer at
>>     +@@ man3/regex.3: .SH RETURN VALUE
>>     + returns zero for a successful match or
>>     + .B REG_NOMATCH
>>     + for failure.
>>     ++.PP
>>     ++.BR regerror ()
>>     ++returns the size of the buffer required to hold the string.
>>     + .SH ERRORS
>>     + The following errors can be returned by
>>     + .BR regcomp ():
>>
>>  man3/regex.3 | 36 ++++++++++++++++--------------------
>>  1 file changed, 16 insertions(+), 20 deletions(-)
>>
>> diff --git a/man3/regex.3 b/man3/regex.3
>> index d91acc19d..efca582d7 100644
>> --- a/man3/regex.3
>> +++ b/man3/regex.3
>> @@ -210,27 +210,20 @@ .SS Error reporting
>>  .BR regexec ()
>>  into error message strings.
>>  .PP
>> -.BR regerror ()
>> -is passed the error code,
>> -.IR errcode ,
>> -the pattern buffer,
>> -.IR preg ,
>> -a pointer to a character string buffer,
>> -.IR errbuf ,
>> -and the size of the string buffer,
>> -.IR errbuf_size .
>> -It returns the size of the
>> -.I errbuf
>> -required to contain the null-terminated error message string.
>> -If both
>> -.I errbuf
>> -and
>> +If
>> +.I preg
>> +isn't a null pointer,
>> +.I errcode
>> +must be the latest error returned from an operation on
>> +.IR preg .
>> +.PP
>> +If
>>  .I errbuf_size
>> -are nonzero,
>> -.I errbuf
>> -is filled in with the first
>> -.I "errbuf_size \- 1"
>> -characters of the error message and a terminating null byte (\[aq]\e0\[aq]).
>> +isn't 0, up to
>> +.I errbuf_size
>> +bytes are copied to
>> +.IR errbuf ;
>> +the error string is always null-terminated, and truncated to fit.
>>  .SS Freeing
>>  .BR regfree ()
>>  deinitializes the pattern buffer at
>> @@ -247,6 +240,9 @@ .SH RETURN VALUE
>>  returns zero for a successful match or
>>  .B REG_NOMATCH
>>  for failure.
>> +.PP
>> +.BR regerror ()
>> +returns the size of the buffer required to hold the string.
>>  .SH ERRORS
>>  The following errors can be returned by
>>  .BR regcomp ():
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread
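
The regerror() wording settled on above (the return value is always the size needed for the full string, and nothing is copied when errbuf_size is 0) suggests the usual two-call pattern. A minimal sketch follows; regerror_dup() is a hypothetical helper name, not something from the patch or from <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/*
 * regerror_dup() is a hypothetical helper, not part of <regex.h>.
 * It returns a malloc()ed copy of the message for errcode, or NULL if
 * allocation fails.  The caller frees the result.
 */
char *regerror_dup(int errcode, const regex_t *preg)
{
    /* errbuf_size == 0: nothing is copied, only the required size is returned. */
    size_t need = regerror(errcode, preg, NULL, 0);
    assert(need > 0);   /* the size counts the terminating null byte */

    char *msg = malloc(need);
    if (msg == NULL)
        return NULL;

    /* The buffer is exactly large enough, so no truncation happens here. */
    regerror(errcode, preg, msg, need);
    return msg;
}
```

Passing a smaller buffer instead is also fine per the text above: the string is truncated to fit and still null-terminated, while the return value keeps reporting the full size.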

* Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND
  2023-04-21 11:34                             ` наб
@ 2023-04-21 12:46                               ` Jakub Wilk
  0 siblings, 0 replies; 143+ messages in thread
From: Jakub Wilk @ 2023-04-21 12:46 UTC (permalink / raw)
  To: наб; +Cc: Alejandro Colomar, linux-man

* наб <nabijaczleweli@nabijaczleweli.xyz>, 2023-04-21 13:34:
>>>/etc/bash.bashrc: line 7: PS1: unbound variable
>>How come? bash is not supposed to read bashrc if the shell is 
>>non-interactive (unless you instruct it otherwise).
>No clue, surprised me as well, esp. since I didn't see any funny bash 
>flags to force interactivity.

I did some googling, which led me to this:
https://lists.debian.org/Ywohi2WEtK+TtquZ@wooledge.org

I can reproduce the bug in unstable:

    $ (SSH_CLIENT=moo bash -uc true)
    /etc/bash.bashrc: line 7: PS1: unbound variable

What is this I don't even.

-- 
Jakub Wilk

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v9] regex.3: Destandardeseify Match offsets
  2023-04-21 10:36                       ` Alejandro Colomar
@ 2023-04-21 12:55                         ` наб
  2023-04-21 13:15                           ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21 12:55 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2874 bytes --]

This section reads like it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
This is the last one.

Range-diff against v8:
1:  4479e1572 < -:  --------- regex.3: Desoupify regerror() description
2:  bad307847 < -:  --------- regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
3:  edefa8a5e < -:  --------- regex.3: Finalise move of reg*.3type
4:  500070a5e ! 1:  9af6c6b7f regex.3: Destandardeseify Match offsets
    @@ man3/regex.3: .SS Matching
     +Each returned valid
     +.RB (non- \-1 )
     +match corresponds to the range
    -+.RI [ string " + " rm_so ", " string " + " rm_eo ).
    ++.RI [ "string + rm_so" , " string + rm_eo" ).
      .PP
      .I regoff_t
      is a signed integer type

 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 30f2ef318..aae31c1e9 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -179,37 +179,34 @@ .SS Matching
 .SS Match offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first expression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ "string + rm_so" , " string + rm_eo" ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread
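
To make the range notation in the new "Match offsets" text concrete, here is a small sketch of reading pmatch after a successful regexec(). The function names and the key=value pattern are invented for illustration; only regcomp(), regexec(), regfree() and regmatch_t come from <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/*
 * match_key_value() and its pattern are made up for illustration.
 * On success it returns 0 and fills pmatch[0..2] with offsets into string.
 */
int match_key_value(const char *string, regmatch_t pmatch[3])
{
    regex_t re;

    /* No REG_NOSUB, so regexec() reports match locations. */
    if (regcomp(&re, "([a-z]+)=([0-9]+)", REG_EXTENDED) != 0)
        return -1;

    int rc = regexec(&re, string, 3, pmatch, 0);
    regfree(&re);
    return rc;
}

/* Print each valid element as the half-open range [string + rm_so, string + rm_eo). */
void print_matches(const char *string, const regmatch_t *pmatch, size_t nmatch)
{
    for (size_t i = 0; i < nmatch; i++) {
        if (pmatch[i].rm_so == -1)
            continue;   /* element not used by this match */
        assert(pmatch[i].rm_so <= pmatch[i].rm_eo);
        printf("pmatch[%zu]: [%d, %d) \"%.*s\"\n", i,
               (int)pmatch[i].rm_so, (int)pmatch[i].rm_eo,
               (int)(pmatch[i].rm_eo - pmatch[i].rm_so),
               string + pmatch[i].rm_so);
    }
}
```

For the input "set x=42;", pmatch[0] covers "x=42", pmatch[1] covers "x" and pmatch[2] covers "42". Because rm_eo is exclusive, rm_eo - rm_so is the match length.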

* Re: [PATCH v9] regex.3: Destandardeseify Match offsets
  2023-04-21 12:55                         ` [PATCH v9] " наб
@ 2023-04-21 13:15                           ` Alejandro Colomar
  2023-04-21 13:29                             ` [PATCH v9a] " наб
  0 siblings, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 13:15 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 3228 bytes --]



On 4/21/23 14:55, наб wrote:
> This section reads like it were (and pretty much is) lifted from POSIX.
> That's hard to read, because POSIX is horrendously verbose, as usual.
> 
> Instead, synopsise it into something less formal but more reasonable,
> and describe the resulting range with a range instead of a paragraph.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
> ---
> This is the last one.
> 
> Range-diff against v8:
> 1:  4479e1572 < -:  --------- regex.3: Desoupify regerror() description
> 2:  bad307847 < -:  --------- regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3
> 3:  edefa8a5e < -:  --------- regex.3: Finalise move of reg*.3type
> 4:  500070a5e ! 1:  9af6c6b7f regex.3: Destandardeseify Match offsets
>     @@ man3/regex.3: .SS Matching
>      +Each returned valid
>      +.RB (non- \-1 )
>      +match corresponds to the range
>     -+.RI [ string " + " rm_so ", " string " + " rm_eo ).
>     ++.RI [ "string + rm_so" , " string + rm_eo" ).
>       .PP
>       .I regoff_t
>       is a signed integer type
> 
>  man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
>  1 file changed, 25 insertions(+), 28 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 30f2ef318..aae31c1e9 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -179,37 +179,34 @@ .SS Matching
>  .SS Match offsets
>  Unless
>  .B REG_NOSUB
> -was set for the compilation of the pattern buffer, it is possible to
> -obtain match addressing information.
> -.I pmatch
> -must be dimensioned to have at least
> -.I nmatch
> -elements.
> -These are filled in by
> +was passed to
> +.BR regcomp (),
> +it is possible to
> +obtain the locations of matches within
> +.IR string :
>  .BR regexec ()
> -with substring match addresses.
> -The offsets of the subexpression starting at the
> -.IR i th
> -open parenthesis are stored in
> -.IR pmatch[i] .
> -The entire regular expression's match addresses are stored in
> -.IR pmatch[0] .
> -(Note that to return the offsets of
> -.I N
> -subexpression matches,
> +fills
>  .I nmatch
> -must be at least
> -.IR N+1 .)
> -Any unused structure elements will contain the value \-1.
> +elements of
> +.I pmatch
> +with results:
> +.I pmatch[0]
> +corresponds to the entire match,
> +.I pmatch[1]
> +to the first expression, etc.

s/expression/subexpression/?

> +If there were more matches than
> +.IR nmatch ,
> +they are discarded;
> +if fewer,
> +unused elements of
> +.I pmatch
> +are filled with
> +.BR \-1 s.
>  .PP
> -Each
> -.I rm_so
> -element that is not \-1 indicates the start offset of the next largest
> -substring match within the string.
> -The relative
> -.I rm_eo
> -element indicates the end offset of the match,
> -which is the offset of the first character after the matching text.
> +Each returned valid
> +.RB (non- \-1 )
> +match corresponds to the range
> +.RI [ "string + rm_so" , " string + rm_eo" ).
>  .PP
>  .I regoff_t
>  is a signed integer type

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* [PATCH v9a] regex.3: Destandardeseify Match offsets
  2023-04-21 13:15                           ` Alejandro Colomar
@ 2023-04-21 13:29                             ` наб
  2023-04-21 13:55                               ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-04-21 13:29 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: linux-man

[-- Attachment #1: Type: text/plain, Size: 2564 bytes --]

This section reads like it were (and pretty much is) lifted from POSIX.
That's hard to read, because POSIX is horrendously verbose, as usual.

Instead, synopsise it into something less formal but more reasonable,
and describe the resulting range with a range instead of a paragraph.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Range-diff against v9:
4:  80d247ebc ! 1:  c3e45d60e regex.3: Destandardeseify Match offsets
    @@ man3/regex.3: .SS Matching
     +.I pmatch[0]
     +corresponds to the entire match,
     +.I pmatch[1]
    -+to the first expression, etc.
    ++to the first subexpression, etc.
     +If there were more matches than
     +.IR nmatch ,
     +they are discarded;

 man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/man3/regex.3 b/man3/regex.3
index 30f2ef318..8efd21d72 100644
--- a/man3/regex.3
+++ b/man3/regex.3
@@ -179,37 +179,34 @@ .SS Matching
 .SS Match offsets
 Unless
 .B REG_NOSUB
-was set for the compilation of the pattern buffer, it is possible to
-obtain match addressing information.
-.I pmatch
-must be dimensioned to have at least
-.I nmatch
-elements.
-These are filled in by
+was passed to
+.BR regcomp (),
+it is possible to
+obtain the locations of matches within
+.IR string :
 .BR regexec ()
-with substring match addresses.
-The offsets of the subexpression starting at the
-.IR i th
-open parenthesis are stored in
-.IR pmatch[i] .
-The entire regular expression's match addresses are stored in
-.IR pmatch[0] .
-(Note that to return the offsets of
-.I N
-subexpression matches,
+fills
 .I nmatch
-must be at least
-.IR N+1 .)
-Any unused structure elements will contain the value \-1.
+elements of
+.I pmatch
+with results:
+.I pmatch[0]
+corresponds to the entire match,
+.I pmatch[1]
+to the first subexpression, etc.
+If there were more matches than
+.IR nmatch ,
+they are discarded;
+if fewer,
+unused elements of
+.I pmatch
+are filled with
+.BR \-1 s.
 .PP
-Each
-.I rm_so
-element that is not \-1 indicates the start offset of the next largest
-substring match within the string.
-The relative
-.I rm_eo
-element indicates the end offset of the match,
-which is the offset of the first character after the matching text.
+Each returned valid
+.RB (non- \-1 )
+match corresponds to the range
+.RI [ "string + rm_so" , " string + rm_eo" ).
 .PP
 .I regoff_t
 is a signed integer type
-- 
2.30.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 143+ messages in thread

* Re: [PATCH v9a] regex.3: Destandardeseify Match offsets
  2023-04-21 13:29                             ` [PATCH v9a] " наб
@ 2023-04-21 13:55                               ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-04-21 13:55 UTC (permalink / raw)
  To: наб; +Cc: linux-man


[-- Attachment #1.1: Type: text/plain, Size: 2905 bytes --]

On 4/21/23 15:29, наб wrote:
> This section reads like it were (and pretty much is) lifted from POSIX.
> That's hard to read, because POSIX is horrendously verbose, as usual.
> 
> Instead, synopsise it into something less formal but more reasonable,
> and describe the resulting range with a range instead of a paragraph.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

Patch applied.  Thanks,
Alex

> ---
> Range-diff against v9:
> 4:  80d247ebc ! 1:  c3e45d60e regex.3: Destandardeseify Match offsets
>     @@ man3/regex.3: .SS Matching
>      +.I pmatch[0]
>      +corresponds to the entire match,
>      +.I pmatch[1]
>     -+to the first expression, etc.
>     ++to the first subexpression, etc.
>      +If there were more matches than
>      +.IR nmatch ,
>      +they are discarded;
> 
>  man3/regex.3 | 53 +++++++++++++++++++++++++---------------------------
>  1 file changed, 25 insertions(+), 28 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index 30f2ef318..8efd21d72 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -179,37 +179,34 @@ .SS Matching
>  .SS Match offsets
>  Unless
>  .B REG_NOSUB
> -was set for the compilation of the pattern buffer, it is possible to
> -obtain match addressing information.
> -.I pmatch
> -must be dimensioned to have at least
> -.I nmatch
> -elements.
> -These are filled in by
> +was passed to
> +.BR regcomp (),
> +it is possible to
> +obtain the locations of matches within
> +.IR string :
>  .BR regexec ()
> -with substring match addresses.
> -The offsets of the subexpression starting at the
> -.IR i th
> -open parenthesis are stored in
> -.IR pmatch[i] .
> -The entire regular expression's match addresses are stored in
> -.IR pmatch[0] .
> -(Note that to return the offsets of
> -.I N
> -subexpression matches,
> +fills
>  .I nmatch
> -must be at least
> -.IR N+1 .)
> -Any unused structure elements will contain the value \-1.
> +elements of
> +.I pmatch
> +with results:
> +.I pmatch[0]
> +corresponds to the entire match,
> +.I pmatch[1]
> +to the first subexpression, etc.
> +If there were more matches than
> +.IR nmatch ,
> +they are discarded;
> +if fewer,
> +unused elements of
> +.I pmatch
> +are filled with
> +.BR \-1 s.
>  .PP
> -Each
> -.I rm_so
> -element that is not \-1 indicates the start offset of the next largest
> -substring match within the string.
> -The relative
> -.I rm_eo
> -element indicates the end offset of the match,
> -which is the offset of the first character after the matching text.
> +Each returned valid
> +.RB (non- \-1 )
> +match corresponds to the range
> +.RI [ "string + rm_so" , " string + rm_eo" ).
>  .PP
>  .I regoff_t
>  is a signed integer type

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-04-19 23:23       ` [PATCH v2 2/9] regex.3: improve REG_STARTEND наб
  2023-04-20 10:00         ` G. Branden Robinson
@ 2023-06-02  0:12         ` Alejandro Colomar
  2023-06-02  0:49           ` наб
  1 sibling, 1 reply; 143+ messages in thread
From: Alejandro Colomar @ 2023-06-02  0:12 UTC (permalink / raw)
  To: наб; +Cc: linux-man, G. Branden Robinson


[-- Attachment #1.1: Type: text/plain, Size: 2341 bytes --]

Hi!

On 4/20/23 01:23, наб wrote:
> Explicitly spell out the ranges involved. The original wording always
> confused me, but it's actually very sane.
> 
> Also change the [0]. to -> here to make more obvious the point that
> pmatch is used as a pointer-to-object, not array in this scenario.
> 
> Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
> R_NOTEOL? No. That's weird and confusing.
> 
> String largeness doesn't matter, known-lengthness does.
> 
> Explicitly spell out the influence on returned matches
> (relative to string, not start of range).
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

I forgot about this patch set.  Could you please resend anything that
is still pending?  Thanks!

> ---
>  man3/regex.3 | 33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/man3/regex.3 b/man3/regex.3
> index d77aac2e7..74f19945d 100644
> --- a/man3/regex.3
> +++ b/man3/regex.3
> @@ -141,23 +141,30 @@ compilation flag
>  above).
>  .TP
>  .B REG_STARTEND
> -Use
> -.I pmatch[0]
> -on the input string, starting at byte
> -.I pmatch[0].rm_so
> -and ending before byte
> -.IR pmatch[0].rm_eo .
> +Match
> +.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
> +instead of
> +.RI [ string ", " string " + \fBstrlen\fP(" string )).
>  This allows matching embedded NUL bytes
>  and avoids a
>  .BR strlen (3)
> -on large strings.
> -It does not use
> +on known-length strings.
> +.I pmatch
> +must point to a valid readable object.
> +If any matches are returned
> +.RB ( REG_NOSUB
> +wasn't passed to
> +.BR regcomp (),
> +the match succeeded, and
>  .I nmatch
> -on input, and does not change
> -.B REG_NOTBOL
> -or
> -.B REG_NEWLINE
> -processing.
> +> 0), they overwrite
> +.I pmatch
> +as usual, and the
> +.B Byte offsets

I'm still unsure about this.  Please do whatever you prefer, and let's
discuss again after you send the patch(es).

Cheers,
Alex

> +remain relative to
> +.IR string
> +(not
> +.IR string " + " pmatch->rm_so ).
>  This flag is a BSD extension, not present in POSIX.
>  .SS Byte offsets
>  Unless

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread
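
For readers who want to see the REG_STARTEND semantics from the quoted patch in use, a rough sketch is below. It assumes a libc that provides REG_STARTEND (glibc and the BSDs do; it is not in POSIX), and match_in_range() is a hypothetical wrapper, not an existing function.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: search only [string + start, string + end)
 * instead of [string, string + strlen(string)).
 * Returns regexec()'s result; on success *match is overwritten with
 * offsets that are relative to string, not to string + start.
 */
int match_in_range(const regex_t *preg, const char *string,
                   size_t start, size_t end, regmatch_t *match)
{
    assert(start <= end);

    /* With REG_STARTEND, pmatch[0] is read as input before being overwritten. */
    match->rm_so = (regoff_t)start;
    match->rm_eo = (regoff_t)end;
    return regexec(preg, string, 1, match, REG_STARTEND);
}
```

Since the end of the range is given explicitly, the data may contain embedded NUL bytes and no strlen() over the whole buffer is needed.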

* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-06-02  0:12         ` Alejandro Colomar
@ 2023-06-02  0:49           ` наб
  2023-06-03 17:30             ` Alejandro Colomar
  0 siblings, 1 reply; 143+ messages in thread
From: наб @ 2023-06-02  0:49 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, G. Branden Robinson

[-- Attachment #1: Type: text/plain, Size: 414 bytes --]

On Fri, Jun 02, 2023 at 02:12:27AM +0200, Alejandro Colomar wrote:
> I forgot about this patch set.  Could you please resend anything that
> is still pending?  Thanks!
I did too, but that's because you applied it. This particular patch is
164297a322b5dee6addff9ad4acb224302ab6e7d and the whole set is
0d120a3c76b4446b194a54387ce0e7a84b208bfd^..e894e84af353727082420c48b3cbea566a0f7692
from the looks of it?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

* Re: [PATCH v2 2/9] regex.3: improve REG_STARTEND
  2023-06-02  0:49           ` наб
@ 2023-06-03 17:30             ` Alejandro Colomar
  0 siblings, 0 replies; 143+ messages in thread
From: Alejandro Colomar @ 2023-06-03 17:30 UTC (permalink / raw)
  To: наб; +Cc: linux-man, G. Branden Robinson


[-- Attachment #1.1: Type: text/plain, Size: 733 bytes --]

On 6/2/23 02:49, наб wrote:
> On Fri, Jun 02, 2023 at 02:12:27AM +0200, Alejandro Colomar wrote:
>> I forgot about this patch set.  Could you please resend anything that
>> is still pending?  Thanks!
> I did too, but that's because you applied it. This particular patch is
> 164297a322b5dee6addff9ad4acb224302ab6e7d and the whole set is
> 0d120a3c76b4446b194a54387ce0e7a84b208bfd^..e894e84af353727082420c48b3cbea566a0f7692
> from the looks of it?

Yep.  For some reason I had marked it as unread to check it later; probably
I just forgot to change that after receiving the revision of the patch.

Thanks!
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 143+ messages in thread

end of thread, other threads:[~2023-06-03 17:30 UTC | newest]

Thread overview: 143+ messages
2023-04-19 17:47 [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
2023-04-19 17:48 ` [PATCH 2/2] regex.3: improve REG_STARTEND наб
2023-04-19 20:23   ` Alejandro Colomar
2023-04-19 21:20     ` наб
2023-04-19 21:45       ` Alejandro Colomar
2023-04-19 23:23       ` [PATCH v2 1/9] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND наб
2023-04-20 11:21         ` Alejandro Colomar
2023-04-19 23:23       ` [PATCH v2 2/9] regex.3: improve REG_STARTEND наб
2023-04-20 10:00         ` G. Branden Robinson
2023-04-20 11:13           ` наб
2023-04-20 18:33             ` G. Branden Robinson
2023-04-20 22:29               ` Alejandro Colomar
2023-04-21  5:00                 ` G. Branden Robinson
2023-04-21  8:06                   ` a straw-man `SR` man(7) macro for (sub)section cross references (was: [PATCH v2 2/9] regex.3: improve REG_STARTEND) G. Branden Robinson
2023-04-21 11:07                   ` [PATCH v2 2/9] regex.3: improve REG_STARTEND Alejandro Colomar
2023-06-02  0:12         ` Alejandro Colomar
2023-06-02  0:49           ` наб
2023-06-03 17:30             ` Alejandro Colomar
2023-04-19 23:23       ` [PATCH v2 3/9] regex.3: ffix наб
2023-04-20 11:23         ` Alejandro Colomar
2023-04-19 23:23       ` [PATCH v2 4/9] regex.3: wfix наб
2023-04-20 11:27         ` Alejandro Colomar
2023-04-19 23:23       ` [PATCH v2 5/9] regex.3: ffix наб
2023-04-20 11:28         ` Alejandro Colomar
2023-04-20 12:12           ` [PATCH v3 5/9] adjtimex.2, clone.2, mprotect.2, open.2, syscall.2, regex.3: ffix, wfix наб
2023-04-20 12:52             ` Alejandro Colomar
2023-04-20 13:03               ` Alejandro Colomar
2023-04-20 14:13                 ` наб
2023-04-20 14:19                   ` Alejandro Colomar
2023-04-20 18:42                 ` G. Branden Robinson
2023-04-20 22:40                   ` Alejandro Colomar
2023-04-19 23:25       ` [PATCH v2 6/9] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: move in with regex.3 наб
2023-04-20 11:31         ` Alejandro Colomar
2023-04-20 13:02           ` [PATCH v4 1/6] regex.3: Fix subsection headings наб
2023-04-20 13:13             ` Alejandro Colomar
2023-04-20 13:24               ` наб
2023-04-20 13:35                 ` Alejandro Colomar
2023-04-20 15:35             ` [PATCH v5 0/8] regex.3 momento наб
2023-04-20 15:35               ` [PATCH v5 1/8] regex.3: Desoupify regcomp() description наб
2023-04-20 16:37                 ` Alejandro Colomar
2023-04-20 15:35               ` [PATCH v5 2/8] regex.3: Desoupify regexec() description наб
2023-04-20 15:35               ` [PATCH v5 3/8] regex.3: Desoupify regerror() description наб
2023-04-20 16:42                 ` Alejandro Colomar
2023-04-20 18:50                   ` наб
2023-04-20 16:50                 ` Alejandro Colomar
2023-04-20 17:23                 ` Alejandro Colomar
2023-04-20 18:46                   ` наб
2023-04-20 22:45                     ` Alejandro Colomar
2023-04-20 23:05                       ` наб
2023-04-20 15:35               ` [PATCH v5 4/8] regex.3: Improve REG_STARTEND наб
2023-04-20 17:29                 ` Alejandro Colomar
2023-04-20 19:30                   ` наб
2023-04-20 19:33                     ` наб
2023-04-20 23:01                     ` Alejandro Colomar
2023-04-21  0:13                       ` наб
2023-04-20 15:36               ` [PATCH v5 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
2023-04-20 15:36               ` [PATCH v5 6/8] regex.3: Finalise move of reg*.3type наб
2023-04-20 15:36               ` [PATCH v5 7/8] regex.3: Destandardeseify Match offsets наб
2023-04-20 15:36               ` [PATCH v5 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB наб
2023-04-20 19:36               ` [PATCH v6 0/8] regex.3 momento наб
2023-04-20 19:36                 ` [PATCH v6 1/8] regex.3: Desoupify regexec() description наб
2023-04-20 23:24                   ` Alejandro Colomar
2023-04-21  0:33                     ` наб
2023-04-21  0:49                       ` Alejandro Colomar
2023-04-20 19:36                 ` [PATCH v6 2/8] regex.3: Desoupify regerror() description наб
2023-04-20 19:37                 ` [PATCH v6 3/8] regex.3: Desoupify regfree() description наб
2023-04-20 23:35                   ` Alejandro Colomar
2023-04-21  0:27                     ` наб
2023-04-21  0:37                       ` [PATCH v7 " наб
2023-04-21  0:58                       ` [PATCH v6 " Alejandro Colomar
2023-04-21  1:24                         ` [PATCH v7a " наб
2023-04-21  1:55                           ` Alejandro Colomar
2023-04-20 19:37                 ` [PATCH v6 4/8] regex.3: Improve REG_STARTEND наб
2023-04-20 23:15                   ` Alejandro Colomar
2023-04-21  0:39                     ` [PATCH v7 " наб
2023-04-21  1:42                       ` Alejandro Colomar
2023-04-21  2:16                         ` наб
2023-04-21  9:45                           ` Alejandro Colomar
2023-04-21 12:13                             ` наб
2023-04-21 12:21                               ` Alejandro Colomar
2023-04-21 12:23                               ` Alejandro Colomar
2023-04-21 10:19                           ` Jakub Wilk
2023-04-21 10:22                             ` Alejandro Colomar
2023-04-21 10:44                               ` Jakub Wilk
2023-04-21 11:16                                 ` Alejandro Colomar
2023-04-21 11:34                             ` наб
2023-04-21 12:46                               ` Jakub Wilk
2023-04-20 19:37                 ` [PATCH v6 5/8] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
2023-04-20 19:37                 ` [PATCH v6 6/8] regex.3: Finalise move of reg*.3type наб
2023-04-20 19:37                 ` [PATCH v6 7/8] regex.3: Destandardeseify Match offsets наб
2023-04-20 19:37                 ` [PATCH v6 8/8] regex.3: Further clarify the sole purpose of REG_NOSUB наб
2023-04-21  2:01                 ` [PATCH v6 0/8] regex.3 momento Alejandro Colomar
2023-04-21  2:48                   ` [PATCH v8 0/5] " наб
2023-04-21  2:48                     ` [PATCH v8 1/5] regex.3: Desoupify regerror() description наб
2023-04-21 10:06                       ` Alejandro Colomar
2023-04-21 12:03                         ` [PATCH v9] " наб
2023-04-21 12:26                           ` Alejandro Colomar
2023-04-21 12:27                             ` Alejandro Colomar
2023-04-21  2:48                     ` [PATCH v8 2/5] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move & link regex_t.3type into regex.3 наб
2023-04-21 11:55                       ` Alejandro Colomar
2023-04-21 11:57                         ` Alejandro Colomar
2023-04-21 11:57                           ` Alejandro Colomar
2023-04-21  2:48                     ` [PATCH v8 3/5] regex.3: Finalise move of reg*.3type наб
2023-04-21 10:33                       ` Alejandro Colomar
2023-04-21 10:34                         ` Alejandro Colomar
2023-04-21 11:26                           ` наб
2023-04-21 11:36                             ` Alejandro Colomar
2023-04-21 11:49                               ` наб
     [not found]                         ` <1d2d0aa8-cb28-2d7f-c48b-7a02f907cb5b@gmail.com>
2023-04-21 11:57                           ` Ralph Corderoy
2023-04-21 11:59                             ` Alejandro Colomar
2023-04-21 12:03                               ` Alejandro Colomar
2023-04-21 12:09                               ` Ralph Corderoy
2023-04-21 12:14                                 ` Alejandro Colomar
2023-04-21  2:49                     ` [PATCH v8 4/5] regex.3: Destandardeseify Match offsets наб
2023-04-21 10:36                       ` Alejandro Colomar
2023-04-21 12:55                         ` [PATCH v9] " наб
2023-04-21 13:15                           ` Alejandro Colomar
2023-04-21 13:29                             ` [PATCH v9a] " наб
2023-04-21 13:55                               ` Alejandro Colomar
2023-04-21  2:49                     ` [PATCH v8 5/5] regex.3: Further clarify the sole purpose of REG_NOSUB наб
2023-04-21 11:44                       ` Alejandro Colomar
2023-04-21 10:00                     ` [PATCH v8 0/5] regex.3 momento Alejandro Colomar
2023-04-20 13:02           ` [PATCH v4 2/6] regex.3: Desoupify function descriptions наб
2023-04-20 14:00             ` Alejandro Colomar
2023-04-20 14:37               ` наб
2023-04-20 13:02           ` [PATCH v4 3/6] regex.3: Improve REG_STARTEND наб
2023-04-20 14:04             ` Alejandro Colomar
2023-04-20 13:02           ` [PATCH v4 4/6] regex.3, regex_t.3type: Move regex_t.3type into regex.3 наб
2023-04-20 13:02           ` [PATCH v4 5/6] regex.3, regex_t.3type, regmatch_t.3type, regoff_t.3type: Move in with regex.3 наб
2023-04-20 14:07             ` Alejandro Colomar
2023-04-20 13:02           ` [PATCH v4 6/6] regex.3: Destandardeseify Match offsets наб
2023-04-20 14:10             ` Alejandro Colomar
2023-04-20 15:05               ` наб
2023-04-20 18:51                 ` G. Branden Robinson
2023-04-21 11:34                 ` Alejandro Colomar
2023-04-19 23:25       ` [PATCH v2 7/9] regex.3: destandardeseify Byte offsets наб
2023-04-19 23:26       ` [PATCH v2 8/9] regex.3: desoupify function descriptions наб
2023-04-20 11:15         ` [PATCH v3 " наб
2023-04-20 11:43           ` Alejandro Colomar
2023-04-20 11:50             ` наб
2023-04-19 23:26       ` [PATCH v2 9/9] regex.3: fix subsection headings наб
2023-04-20 11:17         ` [PATCH v3 " наб
2023-04-19 19:51 ` [PATCH 1/2] regex.3: note that pmatch is still used if REG_NOSUB if REG_STARTEND Alejandro Colomar
