linux-man.vger.kernel.org archive mirror
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: cjwatson@debian.org, dirk@gouders.net, linux-man@vger.kernel.org,
	help-texinfo@gnu.org, nabijaczleweli@nabijaczleweli.xyz,
	g.branden.robinson@gmail.com, groff@gnu.org
Subject: Re: Accessibility of man pages (was: Playground pager lsp(1))
Date: Sat, 8 Apr 2023 18:06:15 +0200
Message-ID: <4303b096-698a-ff7d-1585-464c9aaadc40@gmail.com>
In-Reply-To: <83mt3imqwx.fsf@gnu.org>



Hi Eli,

On 4/8/23 15:42, Eli Zaretskii wrote:
>> Date: Sat, 8 Apr 2023 15:02:59 +0200
>> Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org,
>>  nabijaczleweli@nabijaczleweli.xyz, g.branden.robinson@gmail.com,
>>  groff@gnu.org
>> From: Alejandro Colomar <alx.manpages@gmail.com>
>>
>> If you want how symlinks are dereferenced by find(1):
>>
>> $ man find | grep sym.*link | head -n1
>>        The  -H,  -L  and  -P  options control the treatment of symbolic links.
> 
> That's because the text appears verbatim in the man page.  Suppose the
> person in question doesn't think about "symbolic links", but has
> something else in mind, for example, "dereference".  (Why? because
> he/she just happened to see that term in some article, and wanted to
> know what does Find do with that.  Or for some other reason.)  Then
> they will not find the description of symlink behavior of Find by
> searching for "dereference".

That's why using consistent language is important.  Searching just for
"dereference" will of course give slightly lower-quality results, but
that is to be expected.  Once you have a loosely related match, you
can pick up terms that will help refine your search.

$ man find | grep dereference -C1
       When  the  -H  or  -L options are in effect, any symbolic links
       listed as the argument of -newer will be dereferenced, and  the
       timestamp  will  be  taken  from the file to which the symbolic
--
       used but -follow is, any symbolic links appearing after -follow
       on  the  command line will be dereferenced, and those before it
       will not).
--
              haviour of the -newer predicate; any files listed as the
              argument of -newer will be dereferenced if they are sym‐
              bolic  links.   The  same consideration applies to -new‐
--
       -newer Supported.  If the file specified is a symbolic link, it
              is always dereferenced.  This is a change from  previous
              behaviour, which used to take the relevant time from the


This already shows "symbolic link" several times, so you probably want
to search for that.
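
A rough sketch of that refinement step, in case it helps.  The function
name related_terms is made up for this example; it reads formatted page
text on stdin, keeps one line of context around a first, loosely related
term, and reports how many of those context lines mention each candidate
term.  Words hyphenated across lines in the formatted output will be
missed.

```sh
# Hypothetical helper (not part of man-db or any other package).
# Usage: related_terms FIRST_TERM CANDIDATE...
related_terms() {
    first=$1
    shift
    # Lines matching the first term, plus one line of context each side.
    ctx=$(grep -i -C1 -e "$first")
    for term in "$@"; do
        # grep -c counts matching lines, not individual occurrences.
        n=$(printf '%s\n' "$ctx" | grep -i -c -e "$term")
        printf '%s %s\n' "$n" "$term"
    done | sort -rn
}

# Example, assuming find(1)'s page is installed:
#   man find | related_terms dereference 'symbolic link' 'follow'
```

The candidate with the highest count is the one to grep for next.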

If you want something that processes natural language, you can always
ask some AI engine to process man pages for you ;).

> 
> Do you see the crucial issue here?  Indexing can tag some text with
> topics which do not appear verbatim in the text, but instead
> anticipate what people could have in mind when they are searching for
> that text without knowing what it says, exactly.

I don't remember having had such issues myself so far.  I'd like to
see real reports of readers who struggle to find a certain search
term in a certain page.  There are some, but few (the only one I
remember is the recent one about proc(5)).  If you ever have such a
real case with man pages, please report it, and I will try to make
the page more accessible.  The intention is that a combination of
man(1), apropos(1), whatis(1), and then some grep(1) and sed(1) should
be enough 99% of the time, and we should fix the outliers.
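
As an illustration of the sed(1) part, here is a minimal sketch that
prints a single section of a page.  The name man_section is made up,
and it assumes that piped man(1) output is plain text with section
headings starting in the first column and body text indented.

```sh
# Hypothetical helper: print one section of a formatted manual page
# read on stdin, from its heading up to (not including) the next heading.
# Usage: man_section HEADING
man_section() {
    sed -n "/^$1\$/,/^[A-Z]/{/^$1\$/p;/^[A-Z]/!p;}"
}

# Example, assuming the page is installed:
#   man 3 printf | man_section ATTRIBUTES
```

Within the selected range, the first command prints the requested
heading and the second prints every line that is not a heading, so the
heading of the following section is dropped.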

> 
>>>> After this patch, if you apropos "system" or "sysctl", you'll see
>>>> proc(5) pop up in your list.
>>>
>>> This literally adds the text to what the reader will see.  It makes
>>> the text longer and thus more difficult to read and parse, and there's
>>> a limit to how many key phrases you can add like this.
>>
>> If a page has too many topics, consider splitting the page (I agree
>> that proc(5) is asking for that job).
> 
> Indexing can tag any paragraph of text, not just the entire page.  A
> page cannot usefully have too many keywords in its title, but it _can_
> benefit from different keywords for different paragraphs.

We can add source code comments, which would appear in `man -K`
searches, but so far I haven't seen the need in any specific page.
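
To make the idea concrete, a small sketch with an entirely made-up
page: the function example_page and the page contents are hypothetical,
and a plain grep(1) on the source stands in for the source-level search
that `man -K` does.

```sh
# Emit the source of a made-up man(7) page whose comment line carries
# search terms that never appear in the rendered text.
example_page() {
    cat <<'EOF'
.TH example 7 2023-04-08 "hypothetical page"
.SH NAME
example \- made-up page used only for this sketch
.\" Search terms: sysctl, kernel tunables
.SH DESCRIPTION
Text that never mentions those words.
EOF
}

# A search of the source sees the comment line.
example_page | grep -n 'sysctl'
```

Lines starting with `.\"` are comments in the source, so the formatter
does not render them, and readers of the page would not see the extra
terms.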


[...]

> 
>>>  So when you see them in
>>> TOC or any similar navigation aid, you _know_, at least approximately,
>>> what each section is about.
>>
>> I know a priori that if I'm reading sscanf(3)'s SYNOPSIS, I'll find
>> the function prototype for it.  Or if I read printf(3)'s ATTRIBUTES
>> I'll find the thread-safety of the function.
> 
> SYNOPSIS is at least approximately self-describing (although some
> non-native English speakers might stumble on it).  But how would a
> random reader know that ATTRIBUTES will describe thread-safety, for
> example?  I wouldn't.  Isn't it better to have a section named "Thread
> Safety" instead?

I don't know the origin of the name of ATTRIBUTES.  There's
attributes(7), which documents what you can find there.

> 
>> text search has false positives, like anything else.  But having good
>> tools for handling text is the key to solving the problem.  grep(1)
>> and sed(1) are your friends when reading man pages.
> 
> Modern documentation is not plain text (even if we ignore
> compression), so tools which just search the text have limitations,
> sometimes serious ones.

In some cases you need to search the man(7) source code to get
extra information that is difficult to find in the formatted text,
but those cases are rare.  So far, I find almost everything I
need with just text tools.

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
