* [PATCH v2 1/4] zic.8: Add public domain notice
@ 2022-11-23 13:48 Alejandro Colomar
  2022-11-23 13:48 ` [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency Alejandro Colomar
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 13:48 UTC (permalink / raw)
  To: tz, Paul Eggert; +Cc: linux-man, Alejandro Colomar, G. Branden Robinson

Signed-off-by: Alejandro Colomar <alx@kernel.org>
---

Hi Paul,

v2:

- This time I sent them to the tz mailing list too (and also linux-man@).
- Added ACK and review from Branden in 3/4 and 4/4.
- Added comment from Branden to commit message in 4/4.
- Added tweak suggested by Branden in 3/4.

 zic.8 | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/zic.8 b/zic.8
index f79148f4..c2c61739 100644
--- a/zic.8
+++ b/zic.8
@@ -1,3 +1,7 @@
+.\" %%%LICENSE_START(PUBLIC_DOMAIN)
+.\" This page is in the public domain
+.\" %%%LICENSE_END
+.\"
 .TH ZIC 8
 .SH NAME
 zic \- timezone compiler
-- 
2.38.1



* [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency
  2022-11-23 13:48 [PATCH v2 1/4] zic.8: Add public domain notice Alejandro Colomar
@ 2022-11-23 13:48 ` Alejandro Colomar
  2022-11-23 18:42   ` Paul Eggert
  2022-11-23 19:14   ` G. Branden Robinson
  2022-11-23 13:48 ` [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters Alejandro Colomar
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 13:48 UTC (permalink / raw)
  To: tz, Paul Eggert; +Cc: linux-man, Alejandro Colomar

This adds consistency across other manual pages, and with POSIX.1.

Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 zic.8 | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/zic.8 b/zic.8
index c2c61739..7fb51dde 100644
--- a/zic.8
+++ b/zic.8
@@ -212,7 +212,7 @@ .SH OPTIONS
 .B zic
 prohibit this.
 .PP
-A time zone abbreviation uses a
+A timezone abbreviation uses a
 .B %z
 format.
 Pre-2015 versions of
@@ -270,7 +270,7 @@ .SH OPTIONS
 pre-2014 versions of the reference client support at most 1200
 transitions.
 .PP
-A time zone abbreviation has fewer than 3 or more than 6 characters.
+A timezone abbreviation has fewer than 3 or more than 6 characters.
 POSIX requires at least 3, and requires implementations to support
 at least 6.
 .PP
@@ -297,7 +297,7 @@ .SH FILES
 \*<https://pubs\*:.opengroup\*:.org/\*:onlinepubs/\*:9699919799/\*:basedefs/\*:V1_chap06\*:.html\*>
 and the encoding's non-unibyte characters should consist entirely of
 non-PPCS bytes.  Non-PPCS characters typically occur only in comments:
-although output file names and time zone abbreviations can contain
+although output file names and timezone abbreviations can contain
 nearly any character, other software will work better if these are
 limited to the restricted syntax described under the
 .B \*-v
@@ -521,7 +521,7 @@ .SH FILES
 .q "EST"
 or
 .q "EDT" )
-of time zone abbreviations to be used when this rule is in effect.
+of timezone abbreviations to be used when this rule is in effect.
 If this field is
 .q \*- ,
 the variable part is null.
@@ -574,12 +574,12 @@ .SH FILES
 this amount matters.
 .TP
 .B FORMAT
-The format for time zone abbreviations.
+The format for timezone abbreviations.
 The pair of characters
 .B %s
 is used to show where the
 .q "variable part"
-of the time zone abbreviation goes.
+of the timezone abbreviation goes.
 Alternatively, a format can use the pair of characters
 .B %z
 to stand for the UT offset in the form
@@ -596,12 +596,12 @@ .SH FILES
 Alternatively,
 a slash (/)
 separates standard and daylight abbreviations.
-To conform to POSIX, a time zone abbreviation should contain only
+To conform to POSIX, a timezone abbreviation should contain only
 alphanumeric ASCII characters,
 .q "+"
 and
 .q "\*-".
-By convention, the time zone abbreviation
+By convention, the timezone abbreviation
 .q "\*-00"
 is a placeholder that means local time is unspecified.
 .TP
@@ -609,7 +609,7 @@ .SH FILES
 The time at which the UT offset or the rule(s) change for a location.
 It takes the form of one to four fields YEAR [MONTH [DAY [TIME]]].
 If this is specified,
-the time zone information is generated from the given UT offset
+the timezone information is generated from the given UT offset
 and rule change until the time specified, which is interpreted using
 the rules in effect just before the transition.
 The month, day, and time of day have the same format as the IN, ON, and AT
@@ -867,7 +867,7 @@ .SH "EXTENDED EXAMPLE"
 and
 .q "BMT"
 were initially used, respectively.  Since
-Swiss rules and later EU rules were applied, the time zone abbreviation
+Swiss rules and later EU rules were applied, the timezone abbreviation
 has been CET for standard time and CEST for daylight saving
 time.
 .SH FILES
-- 
2.38.1
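
The two constraints on abbreviations that appear in the context lines of this patch (a length of 3 to 6 characters, and only ASCII alphanumerics, "+" and "-") can be sketched as a small Python check. This is only an illustration of the rules as worded in zic.8; the function name `abbreviation_warnings` and the warning texts are made up for the example and are not taken from zic's source.

```python
import string

# Characters that zic.8 says a POSIX-conforming abbreviation may contain.
ALLOWED = set(string.ascii_letters + string.digits + "+-")


def abbreviation_warnings(abbr):
    """Return a list of human-readable problems with a time zone abbreviation.

    Hypothetical helper: it mirrors the wording of zic.8, not zic's code.
    """
    warnings = []
    # zic.8: "fewer than 3 or more than 6 characters" draws a warning.
    if len(abbr) < 3 or len(abbr) > 6:
        warnings.append("length outside 3..6 characters")
    # zic.8: only alphanumeric ASCII characters, "+" and "-" conform to POSIX.
    if any(ch not in ALLOWED for ch in abbr):
        warnings.append("characters other than ASCII alphanumerics, '+', '-'")
    return warnings


if __name__ == "__main__":
    for candidate in ("CET", "CEST", "-00", "PT", "Europe/Zurich"):
        print(candidate, abbreviation_warnings(candidate))
```

Under these assumptions "CET", "CEST" and the placeholder "-00" pass, while "PT" is too short and "Europe/Zurich" is both too long and contains a slash.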



* [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-23 13:48 [PATCH v2 1/4] zic.8: Add public domain notice Alejandro Colomar
  2022-11-23 13:48 ` [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency Alejandro Colomar
@ 2022-11-23 13:48 ` Alejandro Colomar
  2022-11-23 18:18   ` G. Branden Robinson
  2022-11-23 18:43   ` Paul Eggert
  2022-11-23 13:48 ` [PATCH v2 4/4] zic.8: Use correct letter case in page title (TH) Alejandro Colomar
  2022-11-23 18:32 ` [PATCH v2 1/4] zic.8: Add public domain notice Paul Eggert
  3 siblings, 2 replies; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 13:48 UTC (permalink / raw)
  To: tz, Paul Eggert
  Cc: linux-man, Alejandro Colomar, G. Branden Robinson, Geoff Clare, groff

See the following table from groff_char(7):

 ┌──────────────────────────────────────────────────────────────────┐
 │Keycap   Appearance and meaning   Special character and meaning   │
 ├──────────────────────────────────────────────────────────────────┤
 │"        " neutral double quote   \[dq] neutral double quote      │
 │'        ’ closing single quote   \[aq] neutral apostrophe        │
 │-        ‐ hyphen                 \- or \[-] minus sign/Unix dash │
 │\        (escape character)       \e or \[rs] reverse solidus     │
 │^        ˆ modifier circumflex    \(ha circumflex/caret/“hat”     │
 │`        ‘ opening single quote   \(ga grave accent               │
 │~        ˜ modifier tilde         \(ti tilde                      │
 └──────────────────────────────────────────────────────────────────┘

Reviewed-by: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Geoff Clare <gwc@opengroup.org>
Cc: <groff@gnu.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---

v2:
- Transform ' into \(aq [Branden].


Hi Branden,

I took the liberty of treating your message as a Reviewed-by.  Please confirm :)

Cheers,

Alex


 zic.8 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/zic.8 b/zic.8
index 7fb51dde..4ef2482c 100644
--- a/zic.8
+++ b/zic.8
@@ -351,7 +351,7 @@ .SH FILES
 .q + .
 To allow for future extensions,
 an unquoted name should not contain characters from the set
-.q !$%&'()*,/:;<=>?@[\e]^`{|}~ .
+.q !$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti .
 .TP
 .B FROM
 Gives the first year in which the rule applies.
-- 
2.38.1
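
As a rough illustration of the substitution this patch makes, the Python sketch below maps the ASCII characters that groff may render as typographic glyphs to the special-character escapes from the groff_char(7) table quoted in the commit message, using the two-character `\(xx` spelling that the patch uses. The helper name `escape_for_man` and the decision to leave `-` and `"` untouched (the patched line contains neither) are assumptions made for the example; this is not a tool used by tzdb or by the Linux man-pages.

```python
# Mapping taken from the groff_char(7) table quoted above.
# Only the characters that occur in the patched line are covered.
GROFF_ESCAPES = {
    "\\": r"\e",    # reverse solidus (already escaped in the old line)
    "'": r"\(aq",   # neutral apostrophe
    "^": r"\(ha",   # circumflex / caret
    "`": r"\(ga",   # grave accent
    "~": r"\(ti",   # tilde
}


def escape_for_man(text):
    """Replace each special ASCII character with its roff escape.

    Hypothetical helper for illustration; other characters pass through.
    """
    return "".join(GROFF_ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # The literal character set that zic.8 warns against in unquoted names.
    raw = "!$%&'()*,/:;<=>?@[\\]^`{|}~"
    print(escape_for_man(raw))
```

Run on the literal character set, the sketch should reproduce the argument of the new `.q` line in the hunk above.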



* [PATCH v2 4/4] zic.8: Use correct letter case in page title (TH)
  2022-11-23 13:48 [PATCH v2 1/4] zic.8: Add public domain notice Alejandro Colomar
  2022-11-23 13:48 ` [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency Alejandro Colomar
  2022-11-23 13:48 ` [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters Alejandro Colomar
@ 2022-11-23 13:48 ` Alejandro Colomar
  2022-11-23 18:45   ` Paul Eggert
  2022-11-23 18:32 ` [PATCH v2 1/4] zic.8: Add public domain notice Paul Eggert
  3 siblings, 1 reply; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 13:48 UTC (permalink / raw)
  To: tz, Paul Eggert
  Cc: linux-man, Alejandro Colomar, G. Branden Robinson, Ingo Schwarze

The Linux man-pages project has started using, for the page title (TH),
the same letter case that the program or identifier itself has.  This
change was agreed on with the groff(1) and mandoc(1) maintainers as an
improvement, since it gives the reader of a manual page more information.

On 11/22/22 23:45, G. Branden Robinson wrote:
> I add that anyone who wants their man page headers to shout at them
> again can get that with (the forthcoming) groff 1.23.
> 
> o The an (man) and doc (mdoc) macro packages support new `CS` and `CT`
>    registers to control rendering of man page section headings and topics
>    (seen in the page header), respectively, in full capitals.  These
>    default off (with no visible effect on pages that already fully
>    capitalize such text in man page sources).  The rationale is to
>    encourage man page authors to preserve case distinction information in
>    (or restore it to) their topics and section headings, while giving
>    users (including system administrators, distributors, integrators, and
>    maintainers of man(1) implementations) a way to view the rendered page
>    elements in full capitals if desired.
> 
>

Link: <https://git.savannah.gnu.org/cgit/groff.git/tree/NEWS?id=eac39afe3e7a86f3adbfb02ff5e33bfd69d4c224#n271>
Acked-by: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Ingo Schwarze <schwarze@openbsd.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 zic.8 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/zic.8 b/zic.8
index 4ef2482c..fd5dd315 100644
--- a/zic.8
+++ b/zic.8
@@ -2,7 +2,7 @@
 .\" This page is in the public domain
 .\" %%%LICENSE_END
 .\"
-.TH ZIC 8
+.TH zic 8
 .SH NAME
 zic \- timezone compiler
 .SH SYNOPSIS
-- 
2.38.1



* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-23 13:48 ` [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters Alejandro Colomar
@ 2022-11-23 18:18   ` G. Branden Robinson
  2022-11-23 18:43   ` Paul Eggert
  1 sibling, 0 replies; 19+ messages in thread
From: G. Branden Robinson @ 2022-11-23 18:18 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: tz, Paul Eggert, linux-man, Alejandro Colomar, Geoff Clare, groff

[-- Attachment #1: Type: text/plain, Size: 523 bytes --]

Hi Alex,

At 2022-11-23T14:48:29+0100, Alejandro Colomar wrote:
> Reviewed-by: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> Cc: Geoff Clare <gwc@opengroup.org>
> Cc: <groff@gnu.org>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> 
> v2:
> - Transform ' into \(aq [Branden].
> 
> 
> Hi Branden,
> 
> I took the liberty of treating your message as a Reviewed-by.  Please confirm :)

Aye, cap'n!  Confirmed.

Just don't ask Robert Elz what time it is in Singapore...

Regards,
Branden

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: [PATCH v2 1/4] zic.8: Add public domain notice
  2022-11-23 13:48 [PATCH v2 1/4] zic.8: Add public domain notice Alejandro Colomar
                   ` (2 preceding siblings ...)
  2022-11-23 13:48 ` [PATCH v2 4/4] zic.8: Use correct letter case in page title (TH) Alejandro Colomar
@ 2022-11-23 18:32 ` Paul Eggert
  2022-11-23 19:01   ` Alejandro Colomar
  3 siblings, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2022-11-23 18:32 UTC (permalink / raw)
  To: Alejandro Colomar, tz; +Cc: linux-man, Alejandro Colomar, G. Branden Robinson

On 2022-11-23 05:48, Alejandro Colomar wrote:

> diff --git a/zic.8 b/zic.8
> index f79148f4..c2c61739 100644
> --- a/zic.8
> +++ b/zic.8
> @@ -1,3 +1,7 @@
> +.\" %%%LICENSE_START(PUBLIC_DOMAIN)
> +.\" This page is in the public domain
> +.\" %%%LICENSE_END
> +.\"
>   .TH ZIC 8
>   .SH NAME
>   zic \- timezone compiler

Let's not do that upstream. The file already contains a public-domain 
notice at the bottom, in a human-readable format that is visible to 
anybody who looks at the printable version of the man page. Let's not 
put in comments for every downstream user with its own idiosyncratic 
machine-readable way of repeating what's already there.


* Re: [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency
  2022-11-23 13:48 ` [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency Alejandro Colomar
@ 2022-11-23 18:42   ` Paul Eggert
  2022-11-23 19:02     ` Alejandro Colomar
  2022-11-23 19:14   ` G. Branden Robinson
  1 sibling, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2022-11-23 18:42 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, Alejandro Colomar, Time zone mailing list

On 2022-11-23 05:48, Alejandro Colomar wrote:
> This adds consistency across other manual pages, and with POSIX.1.

The tzdb project documentation uses the phrase "time zone" for the 
ordinary English meaning that you'll see in time zone maps or in phrases 
like "time zone abbreviation", whereas it uses the single word 
"timezone" to mean the POSIX idea of a set of rules that map UTC to 
local time. So, for example, this proposed change:

> -A time zone abbreviation uses a
> +A timezone abbreviation uses a

would not be right, because a time zone abbreviation like "PDT" doesn't 
denote a set of rules like a TZ string would.

I suggest modifying other Linux manual pages to be consistent with this 
usage, rather than trying to use the single word "timezone" for both 
usages. Quite possibly most other Linux manual pages typically use 
"timezone" because they're typically talking about the POSIX meaning, 
which would mean they're already OK.


* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-23 13:48 ` [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters Alejandro Colomar
  2022-11-23 18:18   ` G. Branden Robinson
@ 2022-11-23 18:43   ` Paul Eggert
  2022-11-26  2:31     ` Paul Eggert
  1 sibling, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2022-11-23 18:43 UTC (permalink / raw)
  To: Alejandro Colomar, tz
  Cc: linux-man, Alejandro Colomar, G. Branden Robinson, Geoff Clare, groff

Thanks, I installed that, with a shorter commit message.


* Re: [PATCH v2 4/4] zic.8: Use correct letter case in page title (TH)
  2022-11-23 13:48 ` [PATCH v2 4/4] zic.8: Use correct letter case in page title (TH) Alejandro Colomar
@ 2022-11-23 18:45   ` Paul Eggert
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Eggert @ 2022-11-23 18:45 UTC (permalink / raw)
  To: Alejandro Colomar, tz
  Cc: linux-man, Alejandro Colomar, G. Branden Robinson, Ingo Schwarze

[-- Attachment #1: Type: text/plain, Size: 145 bytes --]

On 2022-11-23 05:48, Alejandro Colomar wrote:

> -.TH ZIC 8
> +.TH zic 8

Thanks, I installed the attached more-elaborate patch to the tzdb doc.

[-- Attachment #2: 0001-Use-lower-case-page-titles-for-commands.patch --]
[-- Type: text/x-patch, Size: 1064 bytes --]

From 02e7258917fe44c8801978493fd547422b544bd0 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 23 Nov 2022 10:28:55 -0800
Subject: [PROPOSED] Use lower case page titles for commands

From a suggestion by G. Branden Robinson via Alejandro Colomar in:
https://lore.kernel.org/linux-man/20221123134827.10420-4-alx@kernel.org/T/#u
* date.1, zdump.8, zic.8: Use lower case .TH arguments.
---
 date.1  | 2 +-
 zdump.8 | 2 +-
 zic.8   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/date.1 b/date.1
index 1ecd63a3..957b95d4 100644
--- a/date.1
+++ b/date.1
@@ -1,4 +1,4 @@
-.TH DATE 1
+.TH date 1
 .SH NAME
 date \- show and set date and time
 .SH SYNOPSIS
diff --git a/zdump.8 b/zdump.8
index 131a6cbd..ee7f9073 100644
--- a/zdump.8
+++ b/zdump.8
@@ -1,4 +1,4 @@
-.TH ZDUMP 8
+.TH zdump 8
 .SH NAME
 zdump \- timezone dumper
 .SH SYNOPSIS
diff --git a/zic.8 b/zic.8
index 24454113..5b785762 100644
--- a/zic.8
+++ b/zic.8
@@ -1,4 +1,4 @@
-.TH ZIC 8
+.TH zic 8
 .SH NAME
 zic \- timezone compiler
 .SH SYNOPSIS
-- 
2.37.2



* Re: [PATCH v2 1/4] zic.8: Add public domain notice
  2022-11-23 18:32 ` [PATCH v2 1/4] zic.8: Add public domain notice Paul Eggert
@ 2022-11-23 19:01   ` Alejandro Colomar
  2022-11-23 19:19     ` Paul Eggert
  0 siblings, 1 reply; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 19:01 UTC (permalink / raw)
  To: Paul Eggert, tz; +Cc: linux-man, Alejandro Colomar, G. Branden Robinson


[-- Attachment #1.1: Type: text/plain, Size: 1145 bytes --]

Hi Paul,

On 11/23/22 19:32, Paul Eggert wrote:
> On 2022-11-23 05:48, Alejandro Colomar wrote:
> 
>> diff --git a/zic.8 b/zic.8
>> index f79148f4..c2c61739 100644
>> --- a/zic.8
>> +++ b/zic.8
>> @@ -1,3 +1,7 @@
>> +.\" %%%LICENSE_START(PUBLIC_DOMAIN)
>> +.\" This page is in the public domain
>> +.\" %%%LICENSE_END
>> +.\"
>>   .TH ZIC 8
>>   .SH NAME
>>   zic \- timezone compiler
> 
> Let's not do that upstream. The file already contains a public-domain notice at 
> the bottom, in a human-readable format that is visible to anybody who looks at 
> the printable version of the man page. Let's not put in comments for every 
> downstream user with its own idiosyncratic machine-readable way of repeating 
> what's already there.

Ah, I didn't see that.

Would you mind moving it to the top of the file, as is common with these 
notices?  I'd remove the one in the Linux man-pages repo if you do that, which 
would mean less maintenance for me (I could also remove it and keep the one at 
the bottom, but it's likely to not be found as easily).

Thanks,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency
  2022-11-23 18:42   ` Paul Eggert
@ 2022-11-23 19:02     ` Alejandro Colomar
  0 siblings, 0 replies; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 19:02 UTC (permalink / raw)
  To: Paul Eggert; +Cc: linux-man, Alejandro Colomar, Time zone mailing list


[-- Attachment #1.1: Type: text/plain, Size: 1147 bytes --]

Hi Paul,

On 11/23/22 19:42, Paul Eggert wrote:
> On 2022-11-23 05:48, Alejandro Colomar wrote:
>> This adds consistency across other manual pages, and with POSIX.1.
> 
> The tzdb project documentation uses the phrase "time zone" for the ordinary 
> English meaning that you'll see in time zone maps or in phrases like "time zone 
> abbreviation", whereas it uses the single word "timezone" to mean the POSIX idea 
> of a set of rules that map UTC to local time. So, for example, this proposed 
> change:
> 
>> -A time zone abbreviation uses a
>> +A timezone abbreviation uses a
> 
> would not be right, because a time zone abbreviation like "PDT" doesn't denote 
> a set of rules like a TZ string would.
> 
> I suggest modifying other Linux manual pages to be consistent with this usage, 
> rather than trying to use the single word "timezone" for both usages. Quite 
> possibly most other Linux manual pages typically use "timezone" because they're 
> typically talking about the POSIX meaning, which would mean they're already OK.

It makes sense.  Thanks.

Cheers,

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency
  2022-11-23 13:48 ` [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency Alejandro Colomar
  2022-11-23 18:42   ` Paul Eggert
@ 2022-11-23 19:14   ` G. Branden Robinson
  1 sibling, 0 replies; 19+ messages in thread
From: G. Branden Robinson @ 2022-11-23 19:14 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: tz, Paul Eggert, linux-man, Alejandro Colomar

[-- Attachment #1: Type: text/plain, Size: 3916 bytes --]

Hi Alex,

At 2022-11-23T14:48:27+0100, Alejandro Colomar wrote:
> This adds consistency across other manual pages, and with POSIX.1.
> 
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
>  zic.8 | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
[...]
> -A time zone abbreviation uses a
> +A timezone abbreviation uses a
[and so on similarly]

I mildly object to this on English style grounds.  Like "filename" or
"filesystem", it seems like a corruption of an English compound noun by
programmers accustomed to writing identifiers in programming
languages.[1]  Such people might also write "icecream", which is widely
recognized as a solecism, if ice cream were to ever be discussed in the
context of an API.  The space is semantically significant: a
"greenhouse" is not the same thing as a "green house".

I believe there are a few advantages to preferring spaces in compounds
except where _general_ English usage (as opposed to that of Unix nerds)
has beaten a track the other way.

1.  They may be easier to parse for non-native English speakers.  I was
    going to say that this point would not apply to a German speaker,
    who can morphologically analyze a 100-letter compound without
    reducing the blood glucose level in their brain an iota, but on
    second thought that may not be true.  English is notorious for
    borrowing from any language in the world, and a triple compound
    combining Germanic, Latinate, and Mi'kmaq roots is conceivable.
    This process may challenge even a native German's world-class
    ability to detect morpheme boundaries.

2.  When divided, spaceful compounds don't need to be added to anyone's
    spell checking dictionary if their roots are already present (as
    they certainly will be here).

3.  When divided, spaceful compounds are at less risk of incorrect
    hyphenation when typeset.  (We speak of TeX and troff and other
    systems as having hyphenation "algorithms", and while this is
    literally true, they are algorithms with huge lists of rules and
    exceptions, and they are applied to large and ever-growing open
    classes of inputs.  They therefore _behave_ heuristically [albeit
    deterministically], I submit, and can produce incorrect hyphenation
    break points through no fault of the algorithm itself.)

4.  Compounds that retain their spaces will fill and break lines more
    evenly, reducing the risk of large gaps between words when
    adjustment is performed, especially on terminal devices that can
    only adjust lines coarsely (an entire character cell at a time).
    If, in context, one fears that a line break within a compound used
    in a man page will damage the comprehensibility of a sentence, then
    one should probably recast the sentence, but in a pinch one can use
    a non-breaking space to avoid the problem, as with "time\~zone".

I acknowledge that any of the above can be of little concern to some.

So if it were me I would start driving man pages toward consistency in
the other direction (spoiler alert: I've already done this for many
groff man pages) and not worry about consistency with POSIX.1 here.

Regards,
Branden

[1] _Most_ programming languages, that is.  Some are lexically analyzed
    such that spaces are permissible in identifiers.  In Fortran's old
    fixed-source form, this was the case because spaces were completely
    ignored in input (outside of string literals, maybe).  There is a
    famous, albeit semi-apocryphal, bug story arising from this.[2]
    Fortran implementations, like those of C and Perl after them, spent
    many years being quite liberal in what they accepted.

[2] The bug was real but legend often hyped it up into causing loss of a
    spacecraft (cf. "space craft" ;-) ).

    https://www-users.cs.york.ac.uk/susan/cyc/p/fbug.htm

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: [PATCH v2 1/4] zic.8: Add public domain notice
  2022-11-23 19:01   ` Alejandro Colomar
@ 2022-11-23 19:19     ` Paul Eggert
  2022-11-23 19:32       ` Alejandro Colomar
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2022-11-23 19:19 UTC (permalink / raw)
  To: Alejandro Colomar, tz; +Cc: linux-man, Alejandro Colomar, G. Branden Robinson

[-- Attachment #1: Type: text/plain, Size: 186 bytes --]

On 2022-11-23 11:01, Alejandro Colomar wrote:
> Would you mind moving it to the top of the file, as is common with these 
> notices?

Sure, that's easy. Done by installing the attached.

[-- Attachment #2: 0001-Put-public-domain-notices-at-man-page-starts.patch --]
[-- Type: text/x-patch, Size: 4398 bytes --]

From 7adffa0cded29ea97952c3839934b2811829b007 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 23 Nov 2022 11:17:57 -0800
Subject: [PROPOSED] Put public-domain notices at man page starts

Suggested by Alejandro Colomar.
---
 date.1       | 4 ++--
 newctime.3   | 4 ++--
 newtzset.3   | 4 ++--
 time2posix.3 | 4 ++--
 tzfile.5     | 4 ++--
 tzselect.8   | 4 ++--
 zdump.8      | 4 ++--
 zic.8        | 4 ++--
 8 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/date.1 b/date.1
index 957b95d4..9a725a20 100644
--- a/date.1
+++ b/date.1
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 2009-05-17 by Arthur David Olson.
 .TH date 1
 .SH NAME
 date \- show and set date and time
@@ -163,5 +165,3 @@ If
 is absent,
 UTC leap seconds are loaded from
 .BR /usr/share/zoneinfo/posixrules .
-.\" This file is in the public domain, so clarified as of
-.\" 2009-05-17 by Arthur David Olson.
diff --git a/newctime.3 b/newctime.3
index 86615498..2907f856 100644
--- a/newctime.3
+++ b/newctime.3
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 2009-05-17 by Arthur David Olson.
 .TH NEWCTIME 3
 .SH NAME
 asctime, ctime, difftime, gmtime, localtime, mktime \- convert date and time
@@ -340,5 +342,3 @@ restricted to years in the range 1900 through 2099.
 To avoid this portability mess, new programs should use
 .B strftime
 instead.
-.\" This file is in the public domain, so clarified as of
-.\" 2009-05-17 by Arthur David Olson.
diff --git a/newtzset.3 b/newtzset.3
index b6150513..c3742850 100644
--- a/newtzset.3
+++ b/newtzset.3
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 2009-05-17 by Arthur David Olson.
 .TH NEWTZSET 3
 .SH NAME
 tzset \- initialize time conversion information
@@ -346,5 +348,3 @@ newctime(3),
 newstrftime(3),
 time(2),
 tzfile(5)
-.\" This file is in the public domain, so clarified as of
-.\" 2009-05-17 by Arthur David Olson.
diff --git a/time2posix.3 b/time2posix.3
index c794032c..e7c69206 100644
--- a/time2posix.3
+++ b/time2posix.3
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 1996-06-05 by Arthur David Olson.
 .TH time2posix 3
 .SH NAME
 time2posix, posix2time \- convert seconds since the Epoch
@@ -129,5 +131,3 @@ difftime(3),
 localtime(3),
 mktime(3),
 time(2)
-.\" This file is in the public domain, so clarified as of
-.\" 1996-06-05 by Arthur David Olson.
diff --git a/tzfile.5 b/tzfile.5
index 280e8d8a..9d312255 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 1996-06-05 by Arthur David Olson.
 .TH TZFILE 5
 .SH NAME
 tzfile \- timezone information
@@ -492,5 +494,3 @@ Internet RFC 8536
 .UR https://\:doi.org/\:10.17487/\:RFC8536
 doi:10.17487/RFC8536
 .UE .
-.\" This file is in the public domain, so clarified as of
-.\" 1996-06-05 by Arthur David Olson.
diff --git a/tzselect.8 b/tzselect.8
index 1a5ce110..53a34cf6 100644
--- a/tzselect.8
+++ b/tzselect.8
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 2009-05-17 by Arthur David Olson.
 .TH TZSELECT 8
 .SH NAME
 tzselect \- select a timezone
@@ -121,5 +123,3 @@ newctime(3), tzfile(5), zdump(8), zic(8)
 Applications should not assume that
 .BR tzselect 's
 output matches the user's political preferences.
-.\" This file is in the public domain, so clarified as of
-.\" 2009-05-17 by Arthur David Olson.
diff --git a/zdump.8 b/zdump.8
index ee7f9073..1ff92639 100644
--- a/zdump.8
+++ b/zdump.8
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 2009-05-17 by Arthur David Olson.
 .TH zdump 8
 .SH NAME
 zdump \- timezone dumper
@@ -227,5 +229,3 @@ introduction of UTC is problematic.
 .SH SEE ALSO
 .BR tzfile (5),
 .BR zic (8)
-.\" This file is in the public domain, so clarified as of
-.\" 2009-05-17 by Arthur David Olson.
diff --git a/zic.8 b/zic.8
index 5b785762..8b77ea12 100644
--- a/zic.8
+++ b/zic.8
@@ -1,3 +1,5 @@
+.\" This file is in the public domain, so clarified as of
+.\" 2009-05-17 by Arthur David Olson.
 .TH zic 8
 .SH NAME
 zic \- timezone compiler
@@ -894,5 +896,3 @@ specifying transition instants using universal time.
 .SH SEE ALSO
 .BR tzfile (5),
 .BR zdump (8)
-.\" This file is in the public domain, so clarified as of
-.\" 2009-05-17 by Arthur David Olson.
-- 
2.37.2
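
The edit this patch applies to each of the eight pages is mechanical: the comment lines at the end of the file move to the start. A minimal Python sketch of that transformation follows, assuming the notice is the only run of `.\"` comment lines at the very end of the file. The function name `move_trailing_comments_to_top` is hypothetical, and the patch itself was not necessarily produced this way.

```python
def move_trailing_comments_to_top(lines):
    """Move the run of roff comment lines at the end of a page to its start.

    Hypothetical helper: 'lines' is a list of strings without newlines.
    Lines that begin with .\\" are treated as roff comments.
    """
    body = list(lines)
    trailing = []
    # Peel comment lines off the end, preserving their original order.
    while body and body[-1].startswith('.\\"'):
        trailing.insert(0, body.pop())
    return trailing + body


if __name__ == "__main__":
    page = [
        ".TH zic 8",
        ".SH NAME",
        "zic \\- timezone compiler",
        '.\\" This file is in the public domain, so clarified as of',
        '.\\" 2009-05-17 by Arthur David Olson.',
    ]
    for line in move_trailing_comments_to_top(page):
        print(line)
```

A page without trailing comment lines comes back unchanged, so in this sketch the operation is safe to repeat.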



* Re: [PATCH v2 1/4] zic.8: Add public domain notice
  2022-11-23 19:19     ` Paul Eggert
@ 2022-11-23 19:32       ` Alejandro Colomar
  0 siblings, 0 replies; 19+ messages in thread
From: Alejandro Colomar @ 2022-11-23 19:32 UTC (permalink / raw)
  To: Paul Eggert, tz; +Cc: linux-man, Alejandro Colomar, G. Branden Robinson


[-- Attachment #1.1: Type: text/plain, Size: 289 bytes --]

On 11/23/22 20:19, Paul Eggert wrote:
> On 2022-11-23 11:01, Alejandro Colomar wrote:
>> Would you mind moving it to the top of the file, as is common with these notices?
> 
> Sure, that's easy. Done by installing the attached.

Thanks :)

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-23 18:43   ` Paul Eggert
@ 2022-11-26  2:31     ` Paul Eggert
  2022-11-26  3:07       ` G. Branden Robinson
  2022-11-26 21:19       ` G. Branden Robinson
  0 siblings, 2 replies; 19+ messages in thread
From: Paul Eggert @ 2022-11-26  2:31 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: linux-man, Alejandro Colomar, G. Branden Robinson, Geoff Clare,
	groff, Time zone mailing list

[-- Attachment #1: Type: text/plain, Size: 280 bytes --]

On 2022-11-23 10:43, Paul Eggert wrote:
> I installed that
Further testing showed that the installed patch doesn't work with 
traditional troff, which doesn't support groff escape sequences like 
\(aq. To fix this I installed the equivalent of the attached further 
patch to TZDB.

[-- Attachment #2: specials.diff --]
[-- Type: text/x-patch, Size: 420 bytes --]

diff --git a/zic.8 b/zic.8
index f345f944..ccd012b3 100644
--- a/zic.8
+++ b/zic.8
@@ -349,7 +349,8 @@ nor
 .q + .
 To allow for future extensions,
 an unquoted name should not contain characters from the set
-.q !$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti .
+.ie \n(.g .q \f(CR!$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
+.el .q !$%&'()*,/:;<=>?@[\e]^`{|}~ .
 .TP
 .B FROM
 Gives the first year in which the rule applies.


* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-26  2:31     ` Paul Eggert
@ 2022-11-26  3:07       ` G. Branden Robinson
  2022-11-26 21:19       ` G. Branden Robinson
  1 sibling, 0 replies; 19+ messages in thread
From: G. Branden Robinson @ 2022-11-26  3:07 UTC (permalink / raw)
  To: Paul Eggert
  Cc: Alejandro Colomar, linux-man, Alejandro Colomar, Geoff Clare,
	groff, Time zone mailing list

[-- Attachment #1: Type: text/plain, Size: 2422 bytes --]

Hi Paul,

At 2022-11-25T18:31:02-0800, Paul Eggert wrote:
> On 2022-11-23 10:43, Paul Eggert wrote:
> > I installed that
> Further testing showed that the installed patch doesn't work with
> traditional troff, which doesn't support groff escape sequences like \(aq.
> To fix this I installed the equivalent of the attached further patch to
> TZDB.

My apologies.  I've gotten _really_ used to groff, Heirloom Doctools
troff, and even a bit to mandoc, all of which define special characters
for the various quotation marks that are available (aq, dq, oq, cq, lq,
rq).

There is a hazard here but I hasten to note that the \(aq and \(dq
special characters are not groffisms.  There is more than one
traditional troff out there.

My checkout of DWB (Documenter's Workbench) 3.3 troff defines 'aq' and
'dq' special characters for "devpcl", "devLatin1", "devpost",
"devnroff", and "devnroff-12" devices.[1]

And _any_ descendant of Kernighan's device-independent troff should be
able to define these by simply adding lines like

aq "

dq "

...after whichever "real" character provides the corresponding glyph (or
near approximation) in each font description file.
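
As a rough sketch of what that could look like, here is a hypothetical
excerpt from the charset section of a font description file.  The
widths, ascender/descender codes, and character codes are placeholders
chosen for illustration, not values from any real font; a line whose
second field is a double quote declares a synonym for the character on
the preceding line.

charset
'	33	2	39
aq	"
"	41	2	34
dq	"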

However, Solaris troff didn't do this, at least not in Solaris 10,[2]
and I confess to some doubt whether it ever will.

It may also be futile to expect any administrator of a proprietary Unix
system to undertake this effort themselves, even if it is a small one.

AT&T troff didn't have a way to directly test for the existence of a
special character, and two indirect approaches I tried to determine this
information failed.[3]

Maybe this is why James Clark added the '.if c' feature to groff over 30
years ago.  But a lot of people have decided they'll just be damned if
they borrow even good ideas...
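
A minimal sketch of how that test can be used, assuming a string name
"Aq" picked purely for illustration: default to the ASCII apostrophe,
then switch to the special character only when the formatter is groff
(register .g is nonzero) and reports that the glyph is available.
Other troffs read \n(.g as 0 and skip the rest of that line.

.\" Fall back to the plain ASCII apostrophe everywhere.
.ds Aq '
.\" On groff only, prefer \(aq when the glyph is available.
.if \n(.g .if c\(aq .ds Aq \(aq
The delimiter is \*(Aq here.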

Regretfully yours,
Branden

[1] https://github.com/n-t-roff/DWB3.3
[2] https://github.com/n-t-roff/Solaris10-ditroff.git
[3] Taking Heirloom Doctools as a proxy for DWB/Kernighan troff,
    measuring the width of a nonexistent special character fails; it
    returns the width of a space, which might coincidentally be the
    same.  Using the formatted output comparison operator [e.g., '.if
    "foo"bar"'] doesn't work either; a nonexistent glyph doesn't compare
    equal to a space nor to an empty string, and it doesn't produce a
    'c' command in device-independent output so I'm not much the wiser
    as to what its internal representation is.

* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-26  2:31     ` Paul Eggert
  2022-11-26  3:07       ` G. Branden Robinson
@ 2022-11-26 21:19       ` G. Branden Robinson
  2022-11-27  0:12         ` Paul Eggert
  1 sibling, 1 reply; 19+ messages in thread
From: G. Branden Robinson @ 2022-11-26 21:19 UTC (permalink / raw)
  To: Paul Eggert
  Cc: Alejandro Colomar, linux-man, Alejandro Colomar, Geoff Clare,
	groff, Time zone mailing list

[-- Attachment #1: Type: text/plain, Size: 7764 bytes --]

Hi Paul,

At 2022-11-25T18:31:02-0800, Paul Eggert wrote:
> On 2022-11-23 10:43, Paul Eggert wrote:
> > I installed that
> Further testing showed that the installed patch doesn't work with
> traditional troff, which doesn't support groff escape sequences like
> \(aq.

I think this patch goes too far in the retrograde direction.

\(xx, where xx is any two characters, is not a groff extension.  It
comes from Ossanna troff all the way back in the mid-1970s.

It is a special character escape sequence; a groff way of spelling it
is \[xxx] where xxx can be of any nonzero length (but cannot contain a
closing square bracket).
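
A minimal sketch of the two spellings (the sentence is made up for
illustration; the bracketed form assumes groff):

.\" Two-character name; this form dates back to Ossanna troff.
A grave accent: \(ga
.\" Same glyph with groff's bracket syntax.
A grave accent: \[ga]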

The repertoire of supported special character identifiers varies by
implementation and, after Kernighan's rewrite of troff circa 1980 for
device-independence, by output device.  Nevertheless, for
portability/backward compatibility, a set of them are very widely
supported.  These include three that your patch takes out, \(ha, \(ga,
and \(ti.  Replacing these with ASCII characters will _not_ produce
correct typography on typesetting output devices.

I would attach scans of Tables I and II from "NROFF/TROFF User's
Manual", the version dated 1976, published with Volume 2 of the Unix
Programmer's Manual (1979), and reprinted by Holt, Rinehart and Winston
in 1983, but the linux-man list rejects all attachments bigger than a
breadbox, so I will ask for your trust (or ask me for it privately).

Those tables illustrate the glyph repertoire of Ossanna troff and the
special character identifiers that were implemented.

groff_char(7) from groff 1.22.4 and earlier marks the special character
identifiers you can expect to be portable (with "***" in its listings),
and for 1.23 I have added a "History" section to the page which
addresses most of the thousand questions I've asked over the past few
years while trying to learn this stuff.  I'll put that in a footnote.[1]

> To fix this I installed the equivalent of the attached further patch to
> TZDB.

I therefore propose the following snippet instead, also taking into
account Solaris 10 troff's poor handling of unsupported font selections
in nroff.

.q + .
To allow for future extensions,
an unquoted name should not contain characters from the set
.ie \n(.g .q \f(CR!$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
.el .ie t .q \f(CW!$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
.    el   .q !$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti .
.TP
.B FROM
Gives the first year in which the rule applies.

What do you think?

Regards,
Branden

[1] (Much UTF-8 follows.)

History
    A consideration of the typefaces originally available to AT&T nroff
    and troff illuminates many conventions that one might regard as
    idiosyncratic fifty years afterward.  (See section “History” of
    roff(7) for more context.)  The face used by the Teletype Model 37
    terminals of the Murray Hill Unix Room was based on ASCII, but
    assigned multiple meanings to several code points, as suggested by
    that standard.  Decimal 34 (") served as a dieresis accent and
    neutral double quotation mark; decimal 39 (') as an acute accent,
    apostrophe, and closing (right) single quotation mark; decimal 45
    (-) as a hyphen and a minus sign; decimal 94 (^) as a circumflex
    accent and caret; decimal 96 (`) as a grave accent and opening
    (left) single quotation mark; and decimal 126 (~) as a tilde accent
    and (with a half‐line motion) swung dash.  The Model 37 bore an
    optional extended character set offering upright Greek letters and
    several mathematical symbols; these were documented as early as the
    kbd(VII) man page of the (First Edition) Unix Programmer’s Manual.

    At the time Graphic Systems delivered the C/A/T phototypesetter to
    AT&T, the ASCII character set was not considered a standard basis
    for a glyph repertoire by traditional typographers.  In the stock
    Times roman, italic, and bold styles available, several ASCII
    characters were not present at all, nor was most of the Teletype’s
    extended character set.  AT&T commissioned a “special” font to
    ensure no loss of repertoire.

    A representation of the coverage of the C/A/T’s text fonts follows.
    The glyph resembling an underscore is a baseline rule, and that
    resembling a vertical line is a box rule.  In italics, the box rule
    was not slanted.  We also observe that the hyphen and minus sign
    were already “de‐unified” by the fonts provided; a decision whither
    to map an input “-” therefore had to be taken.

           ┌────────────────────────────────────────────────────┐
           │A B C D E F G H I J K L M N O P Q R S T U V W X Y Z │
           │a b c d e f g h i j k l m n o p q r s t u v w x y z │
           │0 1 2 3 4 5 6 7 8 9 fi fl ffi ffl                   │
           │! $ % & ( ) ‘ ’ * + - . , / : ; = ? [ ] │           │
           │• □ — ‐ _ ¼ ½ ¾ ° † ′ ¢ ® ©                         │
           └────────────────────────────────────────────────────┘

    The special font supplied the missing ASCII and Teletype extended
    glyphs, among several others.  The plus, minus, and equals signs
    appeared in the special font despite availability in text fonts “to
    insulate the appearance of equations from the choice of standard
    [read: text] fonts”—a priority since troff was turned to the task of
    mathematical typesetting as soon as it was developed.

    We note that AT&T took the opportunity to de‐unify the
    apostrophe/right single quotation mark from the acute accent (a
    choice ISO later duplicated in its 8859 series of standards).  A
    slash intended to be mirror‐symmetric with the backslash was also
    included, as was the Bell System logo; we do not attempt to depict
    the latter.

        ┌──────────────────────────────────────────────────────────┐
        │α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ ς τ υ ϕ χ ψ ω         │
        │Γ Δ Θ Λ Ξ Π Σ Υ Φ Ψ Ω                                     │
        │" ´ \ ^ _ ` ~ / < > { } # @ + − = ∗                       │
        │≥ ≤ ≡ ≈ ∼ ≠ ↑ ↓ ← → × ÷ ± ∞ ∂ ∇ ¬ ∫ ∝ √ ‾ ∪ ∩ ⊂ ⊃ ⊆ ⊇ ∅ ∈ │
        │§ ‡ ☜ ☞ | ○ ⎧ ⎩ ⎫ ⎭ ⎨ ⎬ ⎪ ⌊ ⌋ ⌈ ⌉                         │
        └──────────────────────────────────────────────────────────┘

    One ASCII character as rendered by the Model 37 was apparently
    abandoned.  That device printed decimal 124 (|) as a broken vertical
    line, like Unicode U+00A6 (¦).  No equivalent was available on the
    C/A/T; the box rule \[br], brace vertical extension \[bv], and “or”
    operator \[or] were used as contextually appropriate.

    Devices supported by AT&T device‐independent troff exhibited some
    differences in glyph detail.  For example, on the Autologic APS‐5
    phototypesetter, the square \(sq became filled in the Times bold
    face.

[The lowercase Greek letters in the last boxed table above render in
italics where feasible; it is not when pasting into a plain text email.]

* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-26 21:19       ` G. Branden Robinson
@ 2022-11-27  0:12         ` Paul Eggert
  2022-12-13 23:24           ` G. Branden Robinson
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Eggert @ 2022-11-27  0:12 UTC (permalink / raw)
  To: G. Branden Robinson
  Cc: Alejandro Colomar, linux-man, Alejandro Colomar, Geoff Clare,
	groff, Time zone mailing list

[-- Attachment #1: Type: text/plain, Size: 2156 bytes --]

On 2022-11-26 13:19, G. Branden Robinson wrote:
> I would attach scans of Tables I and II from "NROFF/TROFF User's
> Manual", the version dated 1976, published with Volume 2 of the Unix
> Programmer's Manual (1979)

Thanks for looking into this. It took me on a trip down memory lane as I 
believe I was the first person to submit a computer-typeset PhD thesis 
to UCLA. I used 7th Edition Unix troff along with the C/A/T 
phototypesetter that was troff's main target in the 1970s. (As an aside, 
the C/A/T was why stderr was invented; see Diomidis Spinellis's "The 
Birth of Standard Error" 2013-12-11 
<https://www.spinellis.gr/blog/20131211/>.)

Solaris 10 /usr/bin/troff is largely unchanged from 1970s troff, and 
supports \(ga but none of the other escapes you mention, I expect 
because they were not present in the Bell Labs special font version 4 
and Commercial II that Unix assumed on the C/A/T. The source code of 7th 
Edition Unix troff agrees with Solaris 10 behavior here, and this also 
agrees with 7th Edition Unix /usr/doc/troff/table2 which documents \(ga 
but none of the other escapes you mentioned. I'm a bit surprised that 
the printed manuals you mention disagree with 7th Edition Unix, but 
anyway it doesn't matter all that much since Solaris 10 is what it is.

In other words, on Solaris 10 if I take this file 'foo':

	.nf
	default font
	aq |\(aq| |'|
	ga |\(ga| |`|
	ha |\(ha| |^|
	ti |\(ti| |~|
	.ft CW
	CW font
	aq |\(aq| |'|
	ga |\(ga| |`|
	ha |\(ha| |^|
	ti |\(ti| |~|

and run the shell command:

    /usr/bin/troff foo | /usr/lib/lp/postscript/dpost >foo.ps

I get the attached file foo.ps, and 'evince' says only \(ga works and 
even there it's barely usable in the default font, as shown in the 
attached screenshot foo.png of 'evince' displaying foo.ps.


> .ie \n(.g .q \f(CR!$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
> .el .ie t .q \f(CW!$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
> .    el   .q !$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti .

With Solaris 10 in mind, in the second line of your proposed code the
\f(CW...\fP and the \(ga are OK but the \(ha and \(ti are dubious, so
I installed the attached patch instead.

[-- Attachment #2: foo.ps --]
[-- Type: application/postscript, Size: 5691 bytes --]

[-- Attachment #3: foo.png --]
[-- Type: image/png, Size: 4509 bytes --]

[-- Attachment #4: 0001-zic.8-Work-better-with-Solaris-10-troff.patch --]
[-- Type: text/x-patch, Size: 647 bytes --]

From 897c2968f128b7854a486405bb68666265b38b24 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 26 Nov 2022 15:44:18 -0800
Subject: [PROPOSED] * zic.8: Work better with Solaris 10 troff.

---
 zic.8 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/zic.8 b/zic.8
index ccd012b3..019a289c 100644
--- a/zic.8
+++ b/zic.8
@@ -350,6 +350,7 @@ nor
 To allow for future extensions,
 an unquoted name should not contain characters from the set
 .ie \n(.g .q \f(CR!$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
+.el .ie t .q \f(CW!$%&'()*,/:;<=>?@[\e]^\(ga{|}~\fP .
 .el .q !$%&'()*,/:;<=>?@[\e]^`{|}~ .
 .TP
 .B FROM
-- 
2.37.2


* Re: [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters
  2022-11-27  0:12         ` Paul Eggert
@ 2022-12-13 23:24           ` G. Branden Robinson
  0 siblings, 0 replies; 19+ messages in thread
From: G. Branden Robinson @ 2022-12-13 23:24 UTC (permalink / raw)
  To: Paul Eggert; +Cc: groff, linux-man, tz

[-- Attachment #1: Type: text/plain, Size: 8140 bytes --]

[dropping Alex and Geoff Clare of TOG, but keeping mailing lists because
Paul corrected me on a significant point and I won't have anyone
claiming I don't own up to my mistakes; still, length warning: 189
lines]

Hi Paul,

...finally getting back to this, with belated thanks.

At 2022-11-26T16:12:25-0800, Paul Eggert wrote:
> On 2022-11-26 13:19, G. Branden Robinson wrote:
> > I would attach scans of Tables I and II from "NROFF/TROFF User's
> > Manual", the version dated 1976, published with Volume 2 of the Unix
> > Programmer's Manual (1979)
> 
> Thanks for looking into this. It took me on a trip down memory lane as I
> believe I was the first person to submit a computer-typeset PhD thesis
> to UCLA.

Cheers!

> I used 7th Edition Unix troff along with the C/A/T phototypesetter
> that was troff's main target in the 1970s. (As an aside, the C/A/T was
> why stderr was invented; see Diomidis Spinellis's "The Birth of
> Standard Error" 2013-12-11 <https://www.spinellis.gr/blog/20131211/>.)

I'll bet a lot of readers didn't know that one, but I did, and when I
found out about it via the TUHS list I was so tickled that I added a
link to groff's Texinfo manual.

  standard error stream.  The notation then serves to identify the
  output stream and does not necessarily mean that an error has
  occurred.@footnote{Unix and related operating systems distinguish
  standard output and standard error streams @emph{because} of
  @code{troff}:@:
  @uref{https://minnie.tuhs.org/pipermail/tuhs/2013-December/006113.html}.}

> Solaris 10 /usr/bin/troff is largely unchanged from 1970s troff, and
> supports \(ga but none of the other escapes you mention, I expect
> because they were not present in the Bell Labs special font version 4
> and Commercial II that Unix assumed on the C/A/T.

I admit to some shock here.  The 1976 version of Ossanna's nroff/troff
manual, CSTR #54, explicitly documents--

--wait, no it doesn't.

<blinks>

[Some UTF-8 follows, because it's essential to the discussion of
glyph/character repertoire.]

Apparently I outright hallucinated the presence of \(ha and \(ti in
"Table II: Input Naming Conventions for ’, ‘, and — and for Non-ASCII
Special Characters".  \(ga is there like you said but \(ha and \(ti are
not.  I managed to sustain this delusion despite acquiring a paper copy
of the HRW 1983 printing of both volumes of the Version 7 Unix
Programmer's Manual (typeset with the C/A/T itself), and reading it with
especially loving attention to the troff material.  By God, I told
myself, I'll figure this stuff out.

Hrm.  Vexing.

Lest some readers think this is a ridiculous thing to have gotten wrong,
permit me to quote one of the paragraphs interstitially present in
"Table II"'s 2 tables spread over 2 pages.  Times--and the Times
font--were very different in 1973, when the Bell Labs CSRC took delivery
of the C/A/T.

"The ASCII characters @, #, ", ’, ‘, <, >, \, {, }, ˜, ˆ, and _ exist
_only_ on the special font and are printed as a 1-em space if that font
is not mounted."

So why did I use so much non-Basic Latin Unicode to quote a list of
_ASCII_ characters from the CSTR #54 document?  Because that's what they
_look like_.  Some material in the groff_char(7) man page speaks to it.

History
    A consideration of the typefaces originally available to AT&T nroff
    and troff illuminates many conventions that one might regard as
    idiosyncratic fifty years afterward.  (See section “History” of
    roff(7) for more context.)  The face used by the Teletype Model 37
    terminals of the Murray Hill Unix Room was based on ASCII, but
    assigned multiple meanings to several code points, as suggested by
    that standard.  Decimal 34 (") served as a dieresis accent and
    neutral double quotation mark; decimal 39 (') as an acute accent,
    apostrophe, and closing (right) single quotation mark; decimal 45
    (-) as a hyphen and a minus sign; decimal 94 (^) as a circumflex
    accent and caret; decimal 96 (`) as a grave accent and opening
    (left) single quotation mark; and decimal 126 (~) as a tilde accent
    and (with a half‐line motion) swung dash.  The Model 37 bore an
    optional extended character set offering upright Greek letters and
    several mathematical symbols; these were documented as early as the
    kbd(VII) man page of the (First Edition) Unix Programmer’s Manual.

    At the time Graphic Systems delivered the C/A/T phototypesetter to
    AT&T, the ASCII character set was not considered a standard basis
    for a glyph repertoire by traditional typographers.  In the stock
    Times roman, italic, and bold styles available, several ASCII
    characters were not present at all, nor was most of the Teletype’s
    extended character set.  AT&T commissioned a “special” font to
    ensure no loss of repertoire.

(Nit: one character, the broken bar ¦, got lost anyway.  I guess no one
missed it.)

> The source code of 7th Edition Unix troff agrees with Solaris 10
> behavior here, and this also agrees with 7th Edition Unix
> /usr/doc/troff/table2 which documents \(ga but none of the other
> escapes you mentioned. I'm a bit surprised that the printed manuals
> you mention disagree with 7th Edition Unix,

Imagine how surprised I was when I found I had deceived myself!  Usually
my vision sucks this badly only when reviewing my _own_ work.

None of these three appear in the 1992 revision of CSTR #54 (revised by
Kernighan and documenting device-independent troff extensions).  I would
say they are GNU extensions, but two others that one might impugn with
such a descriptor, \(aq and \(dq, appear (along with \(ga) in
Documenter's Workbench (DWB) troff 3.3 font descriptions for its
PostScript driver,[1] which I have no reason to believe isn't about 10
years older than that version of CSTR #54.  Device-independent troff
made it easy to specify your own special character names; people did.

> but anyway it doesn't matter all that much since Solaris 10 is what it
> is.

Agreed.  And even though someone could have added special character
aliases of "ASCII" glyphs in Solaris's font description files 30+ years
ago, they didn't.  Perhaps the reason was a feeling that nothing good
ever came from GNU; a more likely explanation to me is a dedication of
religious intensity to the principle of inertia, similarly to why
Solaris kept the World's Worst Bourne Shell implementation, compliant
with no published standard ever, as /bin/sh for something like 30 years.

(Think I'm kidding?  https://www.in-ulm.de/~mascheck/bourne/segv.html )

> In other words, on Solaris 10 if I take this file 'foo':
> 
> 	.nf
> 	default font
> 	aq |\(aq| |'|
> 	ga |\(ga| |`|
> 	ha |\(ha| |^|
> 	ti |\(ti| |~|
> 	.ft CW
> 	CW font
> 	aq |\(aq| |'|
> 	ga |\(ga| |`|
> 	ha |\(ha| |^|
> 	ti |\(ti| |~|
> 
> and run the shell command:
> 
>    /usr/bin/troff foo | /usr/lib/lp/postscript/dpost >foo.ps
> 
> I get the attached file foo.ps, and 'evince' says only \(ga works and
> even there it's barely usable in the default font, as shown in the
> attached screenshot foo.png of 'evince' displaying foo.ps.

Right.  With the undefinedness of \(ha and \(ti as well as \(aq now
clear to me, nothing about your output surprises me.

> > .ie \n(.g .q \f(CR!$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
> > .el .ie t .q \f(CW!$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
> > .    el   .q !$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti .
> 
> With Solaris 10 in mind, in the second line of your proposed code the
> \f(CW...\fP and the \(ga are OK but the \(ha and \(ti are dubious,
> so I installed the attached patch instead.

Quite sensible.  As we discussed elsewhere, Solaris troff is scheduled
for retirement in January 2024, and groff 1.22.3 succeeded it.  While
old, it certainly supports \(aq, \(ha, and \(ti.

Thank you again for knocking the scales off my eyes here.

Regards,
Branden

[1] https://github.com/n-t-roff/DWB3.3/blob/master/postscript/devopost/R

end of thread

Thread overview: 19+ messages
2022-11-23 13:48 [PATCH v2 1/4] zic.8: Add public domain notice Alejandro Colomar
2022-11-23 13:48 ` [PATCH v2 2/4] zic.8: s/time zone/timezone/ for consistency Alejandro Colomar
2022-11-23 18:42   ` Paul Eggert
2022-11-23 19:02     ` Alejandro Colomar
2022-11-23 19:14   ` G. Branden Robinson
2022-11-23 13:48 ` [PATCH v2 3/4] zic.8: Use correct escape sequences instead of special characters Alejandro Colomar
2022-11-23 18:18   ` G. Branden Robinson
2022-11-23 18:43   ` Paul Eggert
2022-11-26  2:31     ` Paul Eggert
2022-11-26  3:07       ` G. Branden Robinson
2022-11-26 21:19       ` G. Branden Robinson
2022-11-27  0:12         ` Paul Eggert
2022-12-13 23:24           ` G. Branden Robinson
2022-11-23 13:48 ` [PATCH v2 4/4] zic.8: Use correct letter case in page title (TH) Alejandro Colomar
2022-11-23 18:45   ` Paul Eggert
2022-11-23 18:32 ` [PATCH v2 1/4] zic.8: Add public domain notice Paul Eggert
2022-11-23 19:01   ` Alejandro Colomar
2022-11-23 19:19     ` Paul Eggert
2022-11-23 19:32       ` Alejandro Colomar
