* Correctly formatting URIs: slash
@ 2021-01-22 13:00 Alejandro Colomar (man-pages)
  2021-01-22 14:28 ` Michael Kerrisk (man-pages)
  2021-01-22 15:12 ` G. Branden Robinson
  0 siblings, 2 replies; 4+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-01-22 13:00 UTC
  To: G. Branden Robinson, Michael Kerrisk (man-pages), Jakub Wilk; +Cc: linux-man

Hi all,

Why do some pages use \:/ for the slash in the path part of a URL, but
some others don't, and just use /?

Moreover, why do the former use \:/ only for the path, but not for the
protocol?

$ grep -n '^\.UR' man7/uri.7;
173:.UR http://www.w3.org\:/CGI
243:.UR http://www.ietf.org\:/rfc\:/rfc1036.txt
383:.UR http://www.ietf.org\:/rfc\:/rfc2255.txt
396:.UR http://www.ietf.org\:/rfc\:/rfc2253.txt
414:.UR http://www.ietf.org\:/rfc\:/rfc2254.txt
456:.UR http://www.ietf.org\:/rfc\:/rfc1625.txt
555:.UR http://www.fwi.uva.nl\:/\(times\:/jargon\:/h\:/HackerWritingStyle.html
583:.UR http://www.ietf.org\:/rfc\:/rfc2396.txt
586:.UR http://www.w3.org\:/TR\:/REC\-html40
707:.UR http://www.ietf.org\:/rfc\:/rfc2255.txt
$

$ grep -Inr '^\.UR' man? \
  |grep -c '\\:/';
56
$

$ grep -Inr '^\.UR' man? \
  |grep -c -v '\\:/';
41
$

$ grep -Inr '^\.UR' man? \
  |grep '\\:/' \
  |head -n1;
man2/futex.2:1910:.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002\-pages\-479\-495.pdf
$

$ grep -Inr '^\.UR' man? \
  |grep -v '\\:/' \
  |head -n1;
man1/memusage.1:206:.UR http://www.gnu.org/software/libc/bugs.html
$

What is the correct form?

Thanks,

Alex

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


* Re: Correctly formatting URIs: slash
  2021-01-22 13:00 Correctly formatting URIs: slash Alejandro Colomar (man-pages)
@ 2021-01-22 14:28 ` Michael Kerrisk (man-pages)
  2021-01-22 15:12 ` G. Branden Robinson
  1 sibling, 0 replies; 4+ messages in thread
From: Michael Kerrisk (man-pages) @ 2021-01-22 14:28 UTC
  To: Alejandro Colomar (man-pages); +Cc: G. Branden Robinson, Jakub Wilk, linux-man

Hi Alex,

On Fri, 22 Jan 2021 at 14:00, Alejandro Colomar (man-pages)
<alx.manpages@gmail.com> wrote:
>
> Hi all,
>
> Why do some pages use \:/ for the slash in the path part of a URL, but
> some others don't, and just use /?
>
> Moreover, why do the former use \:/ only for the path, but not for the
> protocol?
>
> $ grep -n '^\.UR' man7/uri.7;
> 173:.UR http://www.w3.org\:/CGI
> 243:.UR http://www.ietf.org\:/rfc\:/rfc1036.txt
> 383:.UR http://www.ietf.org\:/rfc\:/rfc2255.txt
> 396:.UR http://www.ietf.org\:/rfc\:/rfc2253.txt
> 414:.UR http://www.ietf.org\:/rfc\:/rfc2254.txt
> 456:.UR http://www.ietf.org\:/rfc\:/rfc1625.txt
> 555:.UR http://www.fwi.uva.nl\:/\(times\:/jargon\:/h\:/HackerWritingStyle.html
> 583:.UR http://www.ietf.org\:/rfc\:/rfc2396.txt
> 586:.UR http://www.w3.org\:/TR\:/REC\-html40
> 707:.UR http://www.ietf.org\:/rfc\:/rfc2255.txt
> $
>
> $ grep -Inr '^\.UR' man? \
>   |grep -c '\\:/';
> 56
> $
>
> $ grep -Inr '^\.UR' man? \
>   |grep -c -v '\\:/';
> 41
> $
>
> $ grep -Inr '^\.UR' man? \
>   |grep '\\:/' \
>   |head -n1;
> man2/futex.2:1910:.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002\-pages\-479\-495.pdf
> $
>
> $ grep -Inr '^\.UR' man? \
>   |grep -v '\\:/' \
>   |head -n1;
> man1/memusage.1:206:.UR http://www.gnu.org/software/libc/bugs.html
> $
>
> What is the correct form?

The "\:" is a hint to groff that it may break the line here if
necessary; i.e., it marks a better break point than, say, the middle
of a word in the URL. This is especially useful for long URLs.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: Correctly formatting URIs: slash
  2021-01-22 13:00 Correctly formatting URIs: slash Alejandro Colomar (man-pages)
  2021-01-22 14:28 ` Michael Kerrisk (man-pages)
@ 2021-01-22 15:12 ` G. Branden Robinson
  2021-01-22 17:32   ` Alejandro Colomar (man-pages)
  1 sibling, 1 reply; 4+ messages in thread
From: G. Branden Robinson @ 2021-01-22 15:12 UTC
  To: Alejandro Colomar (man-pages)
  Cc: Michael Kerrisk (man-pages), Jakub Wilk, linux-man


Hi Alex!

At 2021-01-22T14:00:33+0100, Alejandro Colomar (man-pages) wrote:
> Why do some pages use \:/ for the slash in the path part of a URL, but
> some others don't, and just use /?

Laziness or ignorance of how URLs get typeset and what the \: escape is
for.

URLs are typeset with hyphenation disabled.  That means that the line
preceding a URL can
be broken early in a very ugly way, somewhat like this sentence.

Slashes in URLs turn out to be pretty good places to break a line if it
must be.  If you wanted a hyphen to appear at the break point, you'd use
the "hyphenation character", an escape that goes way back to 1970s AT&T
troff: \%.  However, as with URLs, sometimes you want a hyphenless break
point, and that's what groff's \: is.  Heirloom Doctools troff supports
\: as well.  mandoc 1.14.1 does not (it refuses to break URLs at all, at
least for man(7) documents; I didn't check its mdoc(7) support).

> Moreover, why do the former use \:/ only for the path, but not for the
> protocol?

I think it is because people feel that postponing a break by 7 more
characters, to keep the scheme next to the first part after it, is not
too high a price to pay.

There's no deep reason why you couldn't do:

.UR http\:://www\:.w3\:.org\:/CGI
Common Gateway Interface
.UE

for instance.

House style for the groff man pages is to place hyphenless break points
_before_ periods and _after_ slashes in pathnames and URLs.  The former
point is one I'd recommend firmly to others, because it helps keep the
reader from confusing a line-broken pathname or URL as ending a
sentence (prematurely).  The latter convention is more arbitrary; plenty
of perfectly valid URLs (and pathnames) exist with or without trailing
slashes, so one can't infer the end of such an object from the presence
or absence of a slash at the end of a line of text.
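
A minimal sketch of that house style, for what it's worth.  It assumes
the .UR lines contain no \: escapes yet and that the "scheme://" prefix
should be left alone; add_breaks is a hypothetical helper, not part of
groff or of the man-pages tooling.

```sh
#!/bin/sh
# add_breaks (hypothetical helper): on .UR lines, insert groff's
# hyphenless break point (\:) before each period and after each slash
# of the URL.  Assumptions: the URL starts with a lowercase
# "scheme://" prefix, which is left untouched, and existing \: escapes
# are not detected.  A trailing slash gets no break point after it.
add_breaks() {
    sed -E '/^\.UR [a-z]+:\/\//{
        h
        s#^(\.UR [a-z]+://).*#\1#
        x
        s#^\.UR [a-z]+://##
        s#\.#\\:.#g
        s#/(.)#/\\:\1#g
        x
        G
        s/\n//
    }'
}

# Demo: only the .UR line is rewritten; other lines pass through.
printf '%s\n' '.UR http://www.w3.org/CGI' 'Common Gateway Interface' '.UE' |
    add_breaks
```

For the demo input, the first line should come out as
".UR http://www\:.w3\:.org/\:CGI".  The sed script splits each .UR line
into the "scheme://" prefix (kept in the hold space) and the rest, edits
only the rest, and joins the two halves again.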

Regards,
Branden


* Re: Correctly formatting URIs: slash
  2021-01-22 15:12 ` G. Branden Robinson
@ 2021-01-22 17:32   ` Alejandro Colomar (man-pages)
  0 siblings, 0 replies; 4+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-01-22 17:32 UTC
  To: G. Branden Robinson; +Cc: Michael Kerrisk (man-pages), Jakub Wilk, linux-man

Hi Branden and Michael,

On 1/22/21 4:12 PM, G. Branden Robinson wrote:
> Hi Alex!
> 
> At 2021-01-22T14:00:33+0100, Alejandro Colomar (man-pages) wrote:
>> Why do some pages use \:/ for the slash in the path part of a URL, but
>> some others don't, and just use /?
> 
> Laziness or ignorance of how URLs get typeset and what the \: escape is
> for.
> 
> URLs are typeset with hyphenation disabled.  That means that the line
> preceding a URL can
> be broken early in a very ugly way, somewhat like this sentence.
> 
> Slashes in URLs turn out to be pretty good places to break a line if it
> must be.  If you wanted a hyphen to appear at the break point, you'd use
> the "hyphenation character", an escape that goes way back to 1970s AT&T
> troff: \%.  However, as with URLs, sometimes you want a hyphenless break
> point, and that's what groff's \: is.  Heirloom Doctools troff supports
> \: as well.  mandoc 1.14.1 does not (it refuses to break URLs at all, at
> least for man(7) documents; I didn't check its mdoc(7) support).
> 
>> Moreover, why do the former use \:/ only for the path, but not for the
>> protocol?
> 
> I think it is because people feel that postponing a break by 7 more
> characters, to keep the scheme next to the first part after it, is not
> too high a price to pay.
> 
> There's no deep reason why you couldn't do:
> 
> .UR http\:://www\:.w3\:.org\:/CGI
> Common Gateway Interface
> .UE
> 
> for instance.
> 
> House style for the groff man pages is to place hyphenless break points
> _before_ periods and _after_ slashes in pathnames and URLs.  The former
> point is one I'd recommend firmly to others, because it helps keep the
> reader from confusing a line-broken pathname or URL as ending a
> sentence (prematurely).  The latter convention is more arbitrary; plenty
> of perfectly valid URLs (and pathnames) exist with or without trailing
> slashes, so one can't infer the end of such an object from the presence
> or absence of a slash at the end of a line of text.

Fair enough!  I'll patch URLs to follow those conventions.

Thanks,

Alex

> 
> Regards,
> Branden
> 


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/
