* [mlmmj] [patch] man page fixes
@ 2012-01-22  9:08 Thomas Goirand
  2012-01-22 13:56 ` Ben Schmidt
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Thomas Goirand @ 2012-01-22  9:08 UTC (permalink / raw)
  To: mlmmj

[-- Attachment #1: Type: text/plain, Size: 355 bytes --]

Hi,

Please also apply these man page fixes. I'm currently adding this patch
to the Debian packaging to reduce lintian warnings, which I find quite
annoying when working on MLMMJ: with too many warnings, I won't see
anything... By the way, hyphen-as-minus uses are breaking groff
indentation, so it's a good thing to fix them.

Cheers,

Thomas Goirand (zigo)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 10_fix_manpages_syntax.diff --]
[-- Type: text/x-diff; name="10_fix_manpages_syntax.diff", Size: 5953 bytes --]

From: Thomas Goirand <zigo@debian.org>
Subject: Fixes some of the manpages syntax
Description: Switched some UTF-8 chars and fixes some hyphen-used-as-minus-sign
Forwarded: yes
diff -u -r -N a/man/mlmmj-bounce.1 b/man/mlmmj-bounce.1
--- a/man/mlmmj-bounce.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-bounce.1	2012-01-22 16:59:32.000000000 +0800
@@ -3,7 +3,7 @@
 mlmmj-bounce \- bounce handling utility for mlmmj
 .SH SYNOPSIS
 .B mlmmj-bounce
-\fI-L /path/to/list \fR[\fI-a john=doe.org | -d\fR]\fI \fR[\fI-n num | -p\fR]
+\fI\-L /path/to/list \fR[\fI\-a john=doe.org | \-d\fR]\fI \fR[\fI\-n num | \-p\fR]
 .HP
 \fB\-a\fR: Address string that bounces
 .HP
@@ -37,6 +37,6 @@
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>
diff -u -r -N a/man/mlmmj-maintd.1 b/man/mlmmj-maintd.1
--- a/man/mlmmj-maintd.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-maintd.1	2012-01-22 16:59:54.000000000 +0800
@@ -3,7 +3,7 @@
 mlmmj-maintd \- maintenance for mlmmj maintained lists
 .SH SYNOPSIS
 .B mlmmj-maintd
-[\fI-F\fR] \fI[-d\fR | \fI-L\fR] /path/to/dir
+[\fI\-F\fR] \fI[\-d\fR | \fI\-L\fR] /path/to/dir
 .HP
 \fB\-d\fR: Full path to directory with lists
 .HP
@@ -30,10 +30,10 @@
 crontab entry:
 
 .LP
-0 */2 * * * /usr/bin/mlmmj-maintd -F -L /path/to/list
+0 */2 * * * /usr/bin/mlmmj\-maintd \-F \-L /path/to/list
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>
diff -u -r -N a/man/mlmmj-process.1 b/man/mlmmj-process.1
--- a/man/mlmmj-process.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-process.1	2012-01-22 16:58:57.000000000 +0800
@@ -3,7 +3,7 @@
 mlmmj-process \- process mail for an mlmmj managed mailinglist
 .SH SYNOPSIS
 .B mlmmj-process
-\fI-L /path/to/list -m /path/to/mail \fR[\fI-h\fR] [\fI-P\fR] [\fI-V\fR]
+\fI\-L /path/to/list \-m /path/to/mail \fR[\fI\-h\fR] [\fI\-P\fR] [\fI\-V\fR]
 .HP
 \fB\-h\fR: This help
 .HP
@@ -61,6 +61,6 @@
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>
diff -u -r -N a/man/mlmmj-recieve.1 b/man/mlmmj-recieve.1
--- a/man/mlmmj-recieve.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-recieve.1	2012-01-22 16:58:57.000000000 +0800
@@ -21,7 +21,7 @@
 using mailservers supporting the \fB/etc/aliases\fR file, a line to activate
 an mlmmj managed mailinglist would look like this:
 .LP
-list: "|/usr/bin/mlmmj-recieve -L /var/spool/mlmmj/list/"
+list: "|/usr/bin/mlmmj\-recieve \-L /var/spool/mlmmj/list/"
 
 It's very important to specify the full path to the binary, or the mailinglist
 will not function.
@@ -36,6 +36,6 @@
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>
diff -u -r -N a/man/mlmmj-send.1 b/man/mlmmj-send.1
--- a/man/mlmmj-send.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-send.1	2012-01-22 16:58:57.000000000 +0800
@@ -62,6 +62,6 @@
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>
diff -u -r -N a/man/mlmmj-sub.1 b/man/mlmmj-sub.1
--- a/man/mlmmj-sub.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-sub.1	2012-01-22 16:58:57.000000000 +0800
@@ -3,8 +3,8 @@
 mlmmj-sub \- subscribe address to a mailinglist run by mlmmj
 .SH SYNOPSIS
 .B mlmmj-sub
-\fI-L /path/to/list -a john@doe.org \fR[\fI-c\fR | \fI-C\fR] \fR[\fI-d\fR | \fI-n\fR]
-[\fI-h\fR] [\fI-U\fR] [\fI-V\fR]
+\fI\-L /path/to/list \-a john@doe.org \fR[\fI\-c\fR | \fI\-C\fR] \fR[\fI\-d\fR | \fI\-n\fR]
+[\fI\-h\fR] [\fI\-U\fR] [\fI\-V\fR]
 .HP
 \fB\-a\fR: Email address to subscribe
 .HP
@@ -51,6 +51,6 @@
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>
diff -u -r -N a/man/mlmmj-unsub.1 b/man/mlmmj-unsub.1
--- a/man/mlmmj-unsub.1	2012-01-22 16:58:26.000000000 +0800
+++ b/man/mlmmj-unsub.1	2012-01-22 16:58:57.000000000 +0800
@@ -1,10 +1,10 @@
 .TH mlmmj-unsub "1" "September 2004" mlmmj-unsub
 .SH NAME
-mlmmj-unsub \- manual page for mlmmj-unsub
+mlmmj-unsub \- unsubscribe someone from a list
 .SH SYNOPSIS
 .B mlmmj-sub
-\fI-L /path/to/list -a john@doe.org \fR[\fI-c\fR | \fI-C\fR] [\fI-h\fR]
-\fR[\fI-d\fR | \fI-n\fR] [\fI-V\fR]
+\fI\-L /path/to/list \-a john@doe.org \fR[\fI\-c\fR | \fI\-C\fR] [\fI\-h\fR]
+\fR[\fI\-d\fR | \fI\-n\fR] [\fI\-V\fR]
 .HP
 \fB\-a\fR: Email address to unsubscribe
 .HP
@@ -45,6 +45,6 @@
 .SH AUTHORS
 This manual page was written by the following persons:
 .HP
-Søren Boll Overgaard <boll@debian.org> (based on html2man output)
+Søren Boll Overgaard <boll@debian.org> (based on html2man output)
 .HP
-Mads Martin Jørgensen <mmj@mmj.dk>
+Mads Martin Jørgensen <mmj@mmj.dk>


* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
@ 2012-01-22 13:56 ` Ben Schmidt
  2012-01-22 19:13 ` Thomas Goirand
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Ben Schmidt @ 2012-01-22 13:56 UTC (permalink / raw)
  To: mlmmj

Hi, Thomas,

Thanks for this. I have a few issues/questions.

1. This doesn't apply cleanly to current sources in version control.
Would you be able to provide a patch that does? I can probably resolve
the clashes OK, but I know little about groff, so I'm not sure if other
man-page changes/additions might also require fixing, so it'd be better
if someone who knows more what they're doing could look at it.

2. My system (Mac OS X) doesn't like the UTF-8 encoding. The existing
Latin-1 encoding works for me (in fact, the ø is replaced by just an o
for me automagically somewhere). I guess this is locale-related. This
means we need to figure out how to do an encoding conversion appropriate
to the host system as part of the build/install process, or find a groff
directive that makes it interpret the file as a particular encoding, or
something, rather than just change the encoding. I'm happy to change the
encoding to UTF-8 if we can figure out how to make all systems interpret
the files properly. Any ideas?

3. Could we keep the separate issues in separate patches? If the
encoding change is in one patch, the hyphen issue in another, and the
content changes in another, that'd be nice (and I can then easily apply
any that have no issues while continuing to discuss any that do).

Cheers, and thanks again,

Ben.



On 22/01/12 8:08 PM, Thomas Goirand wrote:
> Hi,
>
> Please also apply these man page fixes. I'm currently adding this patch
> to the Debian packaging to reduce lintian warnings, which I find quite
> annoying when working on MLMMJ: with too many warnings, I won't see
> anything... By the way, hyphen-as-minus uses are breaking groff
> indentation, so it's a good thing to fix them.
>
> Cheers,
>
> Thomas Goirand (zigo)



* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
  2012-01-22 13:56 ` Ben Schmidt
@ 2012-01-22 19:13 ` Thomas Goirand
  2012-01-23  0:37 ` Ben Schmidt
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Thomas Goirand @ 2012-01-22 19:13 UTC (permalink / raw)
  To: mlmmj

On 01/22/2012 09:56 PM, Ben Schmidt wrote:
> Hi, Thomas,
> 
> Thanks for this. I have a few issues/questions.
> 
> 1. This doesn't apply cleanly to current sources in version control.
> Would you be able to provide a patch that does?

Sorry, my patch is against MLMMJ 1.2.17, as I haven't upgraded the Debian
package yet (I'm waiting for you to release something).

> I can probably resolve
> the clashes OK, but I know little about groff, so I'm not sure if other
> man-page changes/additions might also require fixing, so it'd be better
> if someone who knows more what they're doing could look at it.

The issue arises when you have something with a dash "like-this". Groff will
then try to wrap it, and you might end up with something displayed like-
this (i.e. with a break to the next line, when you really don't want
one). Adding a \ in front of the - makes it so that groff won't do the
word break.

If you check with lintian (which is a Debian package checking tool), it
will emit a warning like "hyphen-instead-of-minus". The
extended description in lintian is as follows:

    This manual page seems to contain a hyphen where a minus sign was
intended. By default, "-" chars are interpreted as hyphens (U+2010) by
groff, not as minus signs (U+002D). Since options to programs use minus
signs (U+002D), this means for example in UTF-8 locales that you cannot
cut and paste options, nor search for them easily. The Debian groff
package currently forces "-" to be interpreted as a minus sign due to
the number of manual pages with this problem, but this is a
Debian-specific modification and hopefully eventually can be removed.

    "-" must be escaped ("\-") to be interpreted as minus. If you really
intend a hyphen (normally you don't), write it as "\(hy" to emphasise
that fact. See groff(7) and especially groff_char(7) for details, and
also the thread starting with
http://lists.debian.org/debian-devel/2003/debian-devel-200303/msg01481.html

    If you use some tool that converts your documentation to groff
format, this tag may indicate a bug in the tool. Some tools convert
dashes of any kind to hyphens. The safe way of converting dashes is to
convert them to "\-".

    Because this error can occur very often, Lintian shows only the
first 10 occurrences for each man page and give the number of suppressed
occurrences. If you want to see all warnings, run Lintian with the
-d/--debug option.

    Refer to /usr/share/doc/groff-base/README.Debian and the
groff_char(7) manual page for details.

    Severity: wishlist, Certainty: possible

    Check: manpages, Type: binary
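
To make the copy-and-paste problem in that description concrete, here is a
toy sketch in Python. It is not groff and not lintian: toy_render is a
hypothetical helper that simply follows the description quoted above (a bare
"-" becomes the hyphen U+2010, an escaped "\-" becomes the minus sign
U+002D). Real output depends on the groff version and on distribution
settings.

```python
import re

HYPHEN = "\u2010"  # what a bare "-" becomes, per the description quoted above

def toy_render(roff_text: str) -> str:
    """Toy model only: '\\-' becomes ASCII minus, a bare '-' becomes U+2010."""
    return re.sub(
        r"\\-|-",
        lambda m: "-" if m.group() == r"\-" else HYPHEN,
        roff_text,
    )

unescaped = toy_render(r"mlmmj-sub -L /path/to/list")
escaped = toy_render(r"mlmmj\-sub \-L /path/to/list")

# Searching the rendered text for the option as a user would type it:
print("-L" in unescaped)  # the option cannot be found
print("-L" in escaped)    # the option can be found
```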

> 2. My system (Mac OS X) doesn't like the UTF-8 encoding. The existing
> Latin-1 encoding works for me (in fact, the ø is replaced by just an o
> for me automagically somewhere).

All man pages should be using UTF-8 in Debian, and I believe that you
should set your Mac to use UTF-8 if possible. If not, do we care? Is
MLMMJ used on the Apple platform?

Also, what type of encoding do you use? Why is your encoding more valid
than UTF-8? What if the user is, let's say, Chinese, Russian, or who knows?

It really doesn't make sense to use any type of specific encoding,
everyone should be using UTF-8, IMO.

> I guess this is locale-related. This
> means we need to figure out how to do an encoding conversion appropriate
> to the host system as part of the build/install process, or find a groff
> directive that makes it interpret the file as a particular encoding, or
> something, rather than just change the encoding. I'm happy to change the
> encoding to UTF-8 if we can figure out how to make all systems interpret
> the files properly. Any ideas?

Just fix your system, it should be using UTF-8 anyway.

> 3. Could we keep the separate issues in separate patches?

Feel free to apply what you think is OK; I'll fix it again in Debian anyway
if it's still not correct.

Thomas



* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
  2012-01-22 13:56 ` Ben Schmidt
  2012-01-22 19:13 ` Thomas Goirand
@ 2012-01-23  0:37 ` Ben Schmidt
  2012-01-23  2:06 ` Ben Schmidt
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Ben Schmidt @ 2012-01-23  0:37 UTC (permalink / raw)
  To: mlmmj

On 23/01/12 6:13 AM, Thomas Goirand wrote:
> On 01/22/2012 09:56 PM, Ben Schmidt wrote:
>> Hi, Thomas,
>>
>> Thanks for this. I have a few issues/questions.
>>
>> 1. This doesn't apply cleanly to current sources in version control.
>> Would you be able to provide a patch that does?
>
> Sorry, my patch is against MLMMJ 1.2.17, as I haven't upgraded the Debian
> package yet (I'm waiting for you to release something).

It's a wise move to wait until a release, of course. Just a little
tricky for me to apply old patches.

>> I can probably resolve the clashes OK, but I know little about groff,
>> so I'm not sure if other man-page changes/additions might also
>> require fixing, so it'd be better if someone who knows more what
>> they're doing could look at it.
>
> The issue arises when you have something with a dash "like-this". Groff will
> then try to wrap it, and you might end up with something displayed like-
> this (i.e. with a break to the next line, when you really don't want
> one). Adding a \ in front of the - makes it so that groff won't do the
> word break.

Thanks a lot for that detailed clarification. I'll do a semi-automated
find-replace on the current man pages and escape all the dashes.
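
A minimal sketch of what such a find-replace could look like, in Python.
The helper name escape_hyphens is hypothetical, and the rule is deliberately
naive: it escapes every "-" that is not already preceded by a backslash.

```python
import re

def escape_hyphens(roff_line: str) -> str:
    """Escape every '-' that is not already preceded by a backslash."""
    return re.sub(r"(?<!\\)-", r"\\-", roff_line)

# SYNOPSIS line taken from the mlmmj-bounce.1 hunk of the patch.
before = r"\fI-L /path/to/list \fR[\fI-a john=doe.org | -d\fR]"
after = escape_hyphens(before)
print(after)
```

This is why the replace is only semi-automated: the rule also touches
hyphens in ordinary words and in lines such as ".B mlmmj-bounce", which the
patch leaves unescaped, and it would misjudge a "-" that follows an escaped
backslash. The resulting diff still needs to be reviewed by hand.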

>> 2. My system (Mac OS X) doesn't like the UTF-8 encoding. The existing
>> Latin-1 encoding works for me (in fact, the ø is replaced by just an o
>> for me automagically somewhere).
>
> All man pages should be using UTF-8 in Debian, and I believe that you
> should set your Mac to use UTF-8 if possible. If not, do we care? Is
> MLMMJ used on the Apple platform?
>
> Also, what type of encoding do you use? Why is your encoding more valid
> than UTF-8? What if the user is, let's say, Chinese, Russian, or who knows?
>
> It really doesn't make sense to use any type of specific encoding,
> everyone should be using UTF-8, IMO.

I agree, it makes sense for everyone to use UTF-8 these days. However,
I'd prefer not to expect or assume that. I didn't mention my system
because I think it is particularly important, but simply to point out
that there is at least one system out there that this change will break.
There may be others. I would like to find a way to make this change that
won't break any system. Does anyone know how to do this, or another
project that has solved this problem whose work we can copy or imitate?

Cheers,

Ben.






* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
                   ` (2 preceding siblings ...)
  2012-01-23  0:37 ` Ben Schmidt
@ 2012-01-23  2:06 ` Ben Schmidt
  2012-01-23  7:11 ` Thomas Goirand
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Ben Schmidt @ 2012-01-23  2:06 UTC (permalink / raw)
  To: mlmmj

>> All man pages should be using UTF-8 in Debian, and I believe that you
>> should set your Mac to use UTF-8 if possible. If not, do we care? Is
>> MLMMJ used on the Apple platform?
>>
>> Also, what type of encoding do you use? Why is your encoding more valid
>> than UTF-8? What if the user is, let's say, Chinese, Russian, or who knows?
>>
>> It really doesn't make sense to use any type of specific encoding,
>> everyone should be using UTF-8, IMO.
>
> I agree, it makes sense for everyone to use UTF-8 these days. However,
> I'd prefer not to expect or assume that. I didn't mention my system
> because I think it is particularly important, but simply to point out
> that there is at least one system out there that this change will break.
> There may be others. I would like to find a way to make this change that
> won't break any system. Does anyone know how to do this, or another
> project that has solved this problem whose work we can copy or imitate?

I think I've solved this.

It seems Debian is non-standard in requiring UTF-8 man pages, as Groff
does not support UTF-8 input:
http://www.gnu.org/software/groff/manual/html_node/Input-Encodings.html

However, Groff supports character escapes which can be used compatibly:
http://manpages.ubuntu.com/manpages/gutsy/man7/groff_char.7.html
http://manpages.debian.net/cgi-bin/man.cgi?query=groff_char&apropos=0&sektion=0&manpath=Debian+6.0+squeeze&format=html&locale=en

So I'll replace ø with \[/o] and everything should be good, though a
little ugly.

Cheers,

Ben.






* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
                   ` (3 preceding siblings ...)
  2012-01-23  2:06 ` Ben Schmidt
@ 2012-01-23  7:11 ` Thomas Goirand
  2012-01-23 16:39 ` Ben Schmidt
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Thomas Goirand @ 2012-01-23  7:11 UTC (permalink / raw)
  To: mlmmj

On 01/23/2012 08:37 AM, Ben Schmidt wrote:
> Just a little
> tricky for me to apply old patches.

For these man pages, my intention was to point at the issues, and make
sure they don't re-occur, because really, this has been recurrent with
MLMMJ.

> It seems Debian is non-standard in requiring UTF-8 man pages, as Groff
> does not support UTF-8 input:
> http://www.gnu.org/software/groff/manual/html_node/Input-Encodings.html

From the same page:
"By its very nature, -Tutf8 supports all input encodings"

So it's absolutely standard (and recommended).

> So I'll replace ø with \[/o] and everything should be good, though a
> little ugly.

I'm happy if you've found a solution, however, I still think UTF-8 is
the only choice.

Thomas

P.S: Please do *not* Cc: me, I'm registered to the list.



* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
                   ` (4 preceding siblings ...)
  2012-01-23  7:11 ` Thomas Goirand
@ 2012-01-23 16:39 ` Ben Schmidt
  2012-01-24  6:17 ` Mads Martin Jørgensen
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Ben Schmidt @ 2012-01-23 16:39 UTC (permalink / raw)
  To: mlmmj

Hi, Thomas,

>> It seems Debian is non-standard in requiring UTF-8 man pages, as Groff
>> does not support UTF-8 input:
>> http://www.gnu.org/software/groff/manual/html_node/Input-Encodings.html
>
> From the same page:
> "By its very nature, -Tutf8 supports all input encodings"
>
> So it's absolutely standard (and recommended).

My interpretation of this is, "When the output/terminal encoding is
UTF-8, naturally all supported input encodings can be accommodated,
since Unicode is a superset of them all." (The paragraph then explains
how other output encodings have restrictions on which input encodings
they can accommodate.)

That doesn't by any means mean that UTF-8 is a supported input encoding.
On the contrary, since it's not on the list of supported input
encodings, and there is no documentation regarding how to instruct groff
that its input is UTF-8, I believe it isn't. If Debian supports it, they
must have patched groff, or just be happily sweeping the issue under the
carpet (if groff thinks everything is Latin-1 I presume it will just
handle text transparently, so it might not matter if it is actually fed
and outputs UTF-8 rather than Latin-1--until complicated wrapping or
collation gets involved).
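
The "transparent" handling guessed at here can be illustrated with a few
lines of Python. This only models a byte-per-character program being fed
UTF-8; it says nothing about what groff actually does internally.

```python
name = "Søren"

# UTF-8 input as seen by a program that assumes one byte per character.
utf8_bytes = name.encode("utf-8")
as_latin1 = utf8_bytes.decode("latin-1")
print(as_latin1)

# Written back out byte for byte, the original UTF-8 is intact ...
round_trip = as_latin1.encode("latin-1")
print(round_trip == utf8_bytes)

# ... but the program counted one character too many, which is where
# line filling and wrapping could go wrong.
print(len(name), len(as_latin1))
```

The intermediate string is the same kind of two-character garbling that
shows up in the author names of the attached diff.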

>> So I'll replace ø with \[/o] and everything should be good, though a
>> little ugly.
>
> I'm happy if you've found a solution, however, I still think UTF-8 is
> the only choice.

I would prefer UTF-8, too, but I can't find any solid evidence that it
is officially supported upstream.

> P.S: Please do *not* Cc: me, I'm registered to the list.

Yeah, sorry about that. I keep trying to remember but only get it right
occasionally.

Ben.






* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
                   ` (5 preceding siblings ...)
  2012-01-23 16:39 ` Ben Schmidt
@ 2012-01-24  6:17 ` Mads Martin Jørgensen
  2012-01-27  4:47 ` Thomas Goirand
  2012-01-27  5:37 ` Ben Schmidt
  8 siblings, 0 replies; 10+ messages in thread
From: Mads Martin Jørgensen @ 2012-01-24  6:17 UTC (permalink / raw)
  To: mlmmj

Feel free to replace ø with oe as well, if it's the one from my name
that's the culprit, and an easier fix.

-- 
Mads Martin Jørgensen

On 23/01/2012, at 03.07, Ben Schmidt <mail_ben_schmidt@yahoo.com.au> wrote:

> So I'll replace ø with \[/o] and everything should be good, though a
> little ugly.



* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
                   ` (6 preceding siblings ...)
  2012-01-24  6:17 ` Mads Martin Jørgensen
@ 2012-01-27  4:47 ` Thomas Goirand
  2012-01-27  5:37 ` Ben Schmidt
  8 siblings, 0 replies; 10+ messages in thread
From: Thomas Goirand @ 2012-01-27  4:47 UTC (permalink / raw)
  To: mlmmj

On 01/24/2012 12:39 AM, Ben Schmidt wrote:
>>> It seems Debian is non-standard in requiring UTF-8 man pages, as Groff
>>> does not support UTF-8 input:
>>> http://www.gnu.org/software/groff/manual/html_node/Input-Encodings.html
>>
>> From the same page:
>> "By its very nature, -Tutf8 supports all input encodings"
>>
>> So it's absolutely standard (and recommended).
> 
> My interpretation of this is, "When the output/terminal encoding is
> UTF-8, naturally all supported input encodings can be accommodated,
> since Unicode is a superset of them all." (The paragraph then explains
> how other output encodings have restrictions on which input encodings
> they can accommodate.)
> 
> That doesn't by any means mean that UTF-8 is a supported input encoding.
> On the contrary, since it's not on the list of supported input
> encodings, and there is no documentation regarding how to instruct groff
> that its input is UTF-8, I believe it isn't. If Debian supports it, they
> must have patched groff, or just be happily sweeping the issue under the
> carpet (if groff thinks everything is Latin-1 I presume it will just
> handle text transparently, so it might not matter if it is actually fed
> and outputs UTF-8 rather than Latin-1--until complicated wrapping or
> collation gets involved).

This doesn't make sense at all. If there's a parameter to use UTF-8, how
could it be not supported?

Thomas



* Re: [mlmmj] [patch] man page fixes
  2012-01-22  9:08 [mlmmj] [patch] man page fixes Thomas Goirand
                   ` (7 preceding siblings ...)
  2012-01-27  4:47 ` Thomas Goirand
@ 2012-01-27  5:37 ` Ben Schmidt
  8 siblings, 0 replies; 10+ messages in thread
From: Ben Schmidt @ 2012-01-27  5:37 UTC (permalink / raw)
  To: mlmmj

On 27/01/12 3:47 PM, Thomas Goirand wrote:
> On 01/24/2012 12:39 AM, Ben Schmidt wrote:
>>>> It seems Debian is non-standard in requiring UTF-8 man pages, as Groff
>>>> does not support UTF-8 input:
>>>> http://www.gnu.org/software/groff/manual/html_node/Input-Encodings.html
>>>
>>>  From the same page:
>>> "By its very nature, -Tutf8 supports all input encodings"
>>>
>>> So it's absolutely standard (and recommended).
>>
>> My interpretation of this is, "When the output/terminal encoding is
>> UTF-8, naturally all supported input encodings can be accommodated,
>> since Unicode is a superset of them all." (The paragraph then explains
>> how other output encodings have restrictions on which input encodings
>> they can accommodate.)
>>
>> That doesn't by any means mean that UTF-8 is a supported input encoding.
>> On the contrary, since it's not on the list of supported input
>> encodings, and there is no documentation regarding how to instruct groff
>> that its input is UTF-8, I believe it isn't. If Debian supports it, they
>> must have patched groff, or just be happily sweeping the issue under the
>> carpet (if groff thinks everything is Latin-1 I presume it will just
>> handle text transparently, so it might not matter if it is actually fed
>> and outputs UTF-8 rather than Latin-1--until complicated wrapping or
>> collation gets involved).
>
> This doesn't make sense at all. If there's a parameter to use UTF-8, how
> could it be not supported?

The parameter is to *output* UTF-8 not *input* UTF-8.

http://www.gnu.org/software/groff/manual/html_node/Groff-Options.html

‘-Tdev’
     Prepare output for device dev. The default device is ‘ps’, unless
     changed when groff was configured and built. The following are the
     output devices currently available:
...
     utf8
	For typewriter-like devices which use the Unicode (ISO 10646)
	character set with UTF-8 encoding.

Input encodings are supported via a hack abusing the more generic macro
functionality which powers a lot of groff, I believe:

‘-mname’ [e.g. -mlatin2]
     Read in the file name.tmac. Normally groff searches for this in its
     macro directories. If it isn't found, it tries tmac.name (searching
     in the same directories).

Output is much easier to implement than input (you just change what
bytes you stuff into the stream to represent a given character, rather
than needing to implement some kind of parser or state machine that can
recognise multi-byte character sequences, normalise text, etc.). It's
also a much higher priority as man pages are viewed much more frequently
than they are written or edited. So it's no surprise to me that groff
only supports UTF-8 output, not input.
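
To show how little the output side needs, here is a sketch of a UTF-8
encoder in Python. encode_utf8 is a hypothetical stand-alone function, not
groff code; it takes one code point and stuffs its bits into one to four
bytes, with no parsing state at all. Surrogate code points are not handled.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (surrogates not checked)."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])

# ASCII minus, o with stroke, and the hyphen U+2010.
for cp in (0x2D, 0xF8, 0x2010):
    print(f"U+{cp:04X}", encode_utf8(cp).hex())
```

A decoder, by contrast, has to track how many continuation bytes are still
expected and reject malformed sequences, which is the extra work described
above.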

Cheers,

Ben.




