* Playground pager lsp(1) @ 2023-03-25 20:37 Dirk Gouders 2023-03-25 20:47 ` Dirk Gouders 0 siblings, 1 reply; 73+ messages in thread From: Dirk Gouders @ 2023-03-25 20:37 UTC To: Alejandro Colomar, linux-man Hi Alejandro, first of all, chances are that you consider this post spam, because this list is about Linux manual pages and not pagers. In that case please accept my apologies and ignore this post. My reasoning was that readers here have some interest in manual pages and therefore probably also in pagers that claim to "understand" manual pages. My hope is that even if you consider this post inappropriate you will perhaps suggest some more appropriate place for such discussion. Not long ago, I noticed a discussion [1] about what pagers can and cannot do. That was interesting to me, because I am currently playing with a pager that claims to have a focus on manual pages. I will try not to waste your time and attach the manual page and a link to a short (3:50) demo video [2]. To me it is absolutely OK should you just ignore this spam post, but perhaps you find lsp(1) interesting enough for further discussion. Best regards, Dirk [1] https://www.spinics.net/lists/linux-man/index.html#24494 [2] https://youtu.be/syGT4POgTAw LSP(1) User commands LSP(1) NAME lsp - list pages (or least significant pager) SYNOPSIS lsp [options] [file_name]... lsp -h lsp -v DESCRIPTION lsp is a terminal pager that assists in paging through data, usually text - no more(1), no less(1). The given files are opened if file names are given as arguments. Otherwise lsp assumes input from stdin and tries to read from there. In addition to its ability to aid in paging through text files, lsp has limited knowledge about manual pages and offers some help in viewing them: • Manual pages usually refer to other manual pages, and lsp allows navigating those references and visiting them as new files, with the ability to also navigate through all opened manual pages or other files.
Here, lsp tries to minimize frustration caused by unavailable references and verifies their existence before offering them as references that can be visited. • In windowing environments lsp does complete resizes when windows get resized. This means it also reloads the manual page to fit the new window size. • Search for manual pages using apropos(1); in the current most basic form it lists all known manual pages ready for text search and visiting referenced manual pages. • lsp has an experimental TOC mode. This is a three-level folding mode trying to list only section and sub-section names for quick navigation in manual pages. The TOC is created using naive heuristics which work well to some extent, but it might be incomplete. Users should keep that in mind. OPTIONS All options can be given on the command line or via the environment variable LSP_OPTIONS. The short version of toggles can also be used as commands, e.g. you can input -i while paging through a file to toggle case sensitivity for searches. -a, --load-apropos Create an apropos pseudo-file. -c, --chop-lines Toggle chopping of lines that do not fit the current screen width. -h, --help Output help and exit. -i, --no-case Toggle case sensitivity in searches. -I, --man-case Turn on case sensitivity for names of manual pages. This is used for example to verify references to other manual pages. -l, --log-file Specify a path where debugging output is written. -n, --line-numbers Toggle visible line numbers. -s, --search-string Specify an initial search string. -v, --version Output version information of lsp and exit. --no-color Disable colored output. --reload-command Specify command to load manual pages. Default is man. --verify-command Specify command to verify the existence of references. Default is man -w. --verify-with-apropos Use the entries of the apropos pseudo-file for validation of references. COMMANDS Pg-Down / Pg-Up Forward/backward one page, respectively.
Key-Down / Key-Up / Mouse-Wheel down/up Forward/backward one line, respectively. CTRL-l In search mode: bring current match to top of the page. ESC Turn off current highlighting of matches. TAB / S-TAB Navigate to next/previous reference, respectively. ENTER • If previous command was TAB or S-TAB: Open reference at point, i.e. call `man <reference>'. • In TOC-mode: Go to currently selected position in file. / Start a forward search for regular expression. ? Start a backward search for regular expression. B Change buffer; choose from list. a Create a pseudo-file with the output of `apropos .'. That pseudo-file contains short descriptions for all manual pages known to the system; those manual pages can also be opened with TAB / S-TAB and ENTER commands. b Backward one page. c Close file currently paged. Exits lsp if it was the only/last file being paged. f Forward one page. h Show online help with command summary. m Open another manual page. n Find next match in search. p Find previous match in search. q • Exit lsp. • In TOC-mode: switch back to normal view. • In help-mode: close help file. ENVIRONMENT LSP_OPTIONS All command line options can also be specified using this variable. LSP_OPEN / LESSOPEN Analogous to less(1), lsp supports an input preprocessor, but currently just the two basic forms: one that provides the path to a replacement file and one that writes the content to be paged to a pipe. SEE ALSO apropos(1), less(1), man(1), more(1), pg(1) BUGS Report bugs at https://github.com/dgouders/lsp alpha-1.0e-42 03/25/2023 LSP(1)
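The reference handling described in the manual page (find references, then verify them before offering them) can be illustrated with a minimal sketch. This is not lsp's actual implementation. The regular expression, the function names, and the way the section is passed to the verify command are assumptions made for illustration; only the default verify command `man -w` comes from the manual page above.

```python
import re
import subprocess

# Assumed pattern: a page name followed by a section in parentheses,
# e.g. "less(1)". This is a heuristic, so code such as "exit(1)" also matches.
REFERENCE = re.compile(r"\b([A-Za-z0-9_.:+-]+)\((\d[a-z0-9]*)\)")


def find_references(text):
    """Return (name, section) pairs that look like manual page references."""
    return REFERENCE.findall(text)


def reference_exists(name, section, verify_command=("man", "-w")):
    """Run the verify command; exit status 0 is taken to mean the page exists."""
    try:
        result = subprocess.run(
            [*verify_command, section, name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # The verify command itself is not installed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    see_also = "apropos(1), less(1), man(1), more(1), pg(1)"
    for name, section in find_references(see_also):
        status = "ok" if reference_exists(name, section) else "unavailable"
        print(f"{name}({section}): {status}")
```

Separating detection from verification mirrors the --verify-command option: the second step can be swapped out without touching the first.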
* Re: Playground pager lsp(1) 2023-03-25 20:47 ` Dirk Gouders @ 2023-04-04 23:45 ` Alejandro Colomar 2023-04-05 5:35 ` Eli Zaretskii 2023-04-05 10:02 ` Playground pager lsp(1) Dirk Gouders 0 siblings, 2 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-04 23:45 UTC (permalink / raw) To: Dirk Gouders, linux-man; +Cc: help-texinfo [-- Attachment #1.1: Type: text/plain, Size: 9290 bytes --] Hi Dirk. On 3/25/23 21:47, Dirk Gouders wrote: > Hi Alejandro, > > first of all, chances are that you consider this post as spam, because > this list is about linux manual pages and not pagers. No, I don't. > In that case > please accept my apologies and ignore this post. > > My reasoning was that readers here have some interest in manual pages > and therefore probably also in pagers that claim to "understand" manual > pages. My hope is that even if you consider this post inappropriate you > will perhaps suggest some more appropriate place for such discussion. > > Not long ago, I noticed a discussion [1] about what pagers can and > cannot do. That was interesting to me, because I am currently playing > with a pager that claims to have a focus on manual pages. > > I will try to not waste your time and attach the manual page and a link > to a short (3:50) demo video. To me it is absolutely OK should you just > ignore this spam post, but perhaps you find lsp(1) interesting enough > for further discussion. If you had a Debian package, I might try it :) Or maybe a Makefile to build from source... What is this meson.build? > > Best regards, > > Dirk > > [1] https://www.spinics.net/lists/linux-man/index.html#24494 > [2] https://youtu.be/syGT4POgTAw > > LSP(1) User commands LSP(1) > > NAME > lsp - list pages (or least significant pager) > > SYNOPSIS > lsp [options] [file_name]... > > lsp -h > > lsp -v > > DESCRIPTION > lsp is a terminal pager that assists in paging through data, usually > text — no more(1), no less(1). I'd say it does quite a lot more than paging... 
We could say this is some info(1) equivalent for manual pages. With the benefit that you don't need to implement such a system from scratch, but just reusing the existing tools (apropos, man, whatis, ...). It seems something like what I would have written if I had to implement info(1) from scratch. I wish GNU guys had thought of this instead of developing their own incompatible system. > > The given files are opened if file names are given as options. > Otherwise lsp assumes input from stdin and tries to read from there. > > In addition to it’s ability to aid in paging through text files lsp has > limited knowledge about manual pages and offers some help in viewing > them: > > • Manual pages usually refer to other manual pages and lsp allows to > navigate those references and to visit them as new files with the > ability to also navigate through all opened manual pages or other > files. Out of curiosity, is this implemented with heuristics? Or do you rely on semantic mdoc(7) macros? If it's the first, how do you handle exit(1)? Is it a reference, or is it just code (with the meaning exit(EXIT_FAILURE))? If it's the second, I guess it doesn't support that in man(7), right? At least until MR is released. > > Here, lsp tries to minimize frustration caused by unavailable > references and verifies their existance before offering them as > references that can be visited. Do you mark these as broken references? It is interesting to know that there's a reference which you don't have installed. It may prompt you to install it and read it. When I see a broken reference, I usually find it with `apt-file find man3/page.3`, and then install the relevant package. > > • In windowing environments lsp does complete resizes when windows > get resized. This means it also reloads the manual page to fit the > new window size. Good. This I miss it in less(1) often. Not sure if they had any strong reason to not support that. 
> > • Search for manual pages using apropos(1); in the current most basic > form it lists all known manual pages ready for text search and > visiting referenced manual pages. What does it bring that `apropos * | less` can't do? If you're going the way of info(1) with a full-blown system, it seems reasonable, but I never really liked all that if it's just a new terminal and a command away from me. > > • lsp has an experimental TOC mode. > > This is a three-level folding mode trying to list only section and > sub-section names for quick navigation in manual pages. Nice, and this is an important feature missing in info(1), as I reported recently. :) Maybe they are interested in something similar. > > The TOC is created using naive heuristics which work well to some > extent, but it might be incomplete. Users should keep that in mind. I guess the heuristics are just `^[^ ]` for SH and `^ [^ ]` for SS, right? I typically use something similar for searching for command flags, and as you say, these just work. Cheers, Alex > > OPTIONS > All options can be given on the command line or via the environment > variable LSP_OPTIONS. The short version of toggles can also be used as > commands, e.g. you can input -i while paging through a file to toggle > case sensitivity for searches. > > -a, --load-apropos > Create an apropos pseudo-file. > > -c, --chop-lines > Toggle chopping of lines that do not fit the current screen width. > > -h, --help > Output help and exit. > > -i, --no-case > Toggle case sensitivity in searches. > > -I, --man-case > Turn on case sensitivity for names of manual pages. > > This is used for example to verify references to other manual > pages. > > -l, --log-file > Specify a path where debugging output is written. > > -n, --line-numbers > Toggle visible line numbers. > > -s, --search-string > Specify an initial search string. > > -v, --version > Output version information of lsp and exit. > > --no-color > Disable colored output. 
> > --reload-command > Specify command to load manual pages. Default is man. > > --verify-command > Specify command to verify the existance of references. Default is > man -w. > > --verify-with-apropos > Use the entries of the apropos pseudo-file for validation of > references. > > COMMANDS > Pg-Down / Pg-Up > Forward/backward one page, respectively. > > Key-Down / Key-Up / Mouse-Wheel down/up > Forward/backward one line, respectively. > > CTRL-l > In search mode: bring current match to top of the page. > > ESC > Turn off current highlighting of matches. > > TAB / S-TAB > Navigate to next/previous reference respectively. > > ENTER > > • If previous command was TAB or S-TAB: > > Open reference at point, i.e. call `man <reference>'. > > • In TOC-mode: > > Go to currently selected position in file. > > / > Start a forward search for regular expression. > > ? > Start a backward search for regular expression. > > B > Change buffer; choose from list. > > a > Create a pseudo-file with the output of `apropos .'. > > That pseudo-file contains short descriptions for all manual pages > known to the system; those manual pages can also be opened with TAB > / S-TAB and ENTER commands. > > b > Backward one page > > c > Close file currently paged. > > Exits lsp if it was the only/last file being paged. > > f > Forward one page > > h > Show online help with command summary. > > m > Open another manual page. > > n > Find next match in search. > > p > Find previous match in search. > > q > > • Exit lsp. > > • In TOC-mode: switch back to normal view. > > • In help-mode: close help file. > > ENVIRONMENT > LSP_OPTIONS > All command line options can also be specified using this variable. > > LSP_OPEN / LESSOPEN > Analogical to less(1), lsp supports an input preprocessor but > currently just the two basic forms: > > One that provides the path to a replacement file and the one that > writes the content to be paged to a pipe. 
> > SEE ALSO > apropos(1), less(1), man(1), more(1), pg(1) > > BUGS > Report bugs at https://github.com/dgouders/lsp > > alpha-1.0e-42 03/25/2023 LSP(1) -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
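The TOC heuristic guessed at in the reply above (headings recognised by their indentation in the formatted page) can be sketched as follows. This is an illustration, not lsp's code. It assumes section headings start in column 0 and sub-section headings are indented by exactly three spaces, which is the usual man(1) rendering, and it covers only two of the three levels the manual page mentions.

```python
import re

# Assumption: .SH headings are rendered at column 0.
SECTION = re.compile(r"^[^ ]")
# Assumption: .SS headings are rendered with a three-space indent.
SUBSECTION = re.compile(r"^ {3}[^ ]")


def build_toc(page_text):
    """Return (level, line_number, title) entries for a formatted page."""
    toc = []
    for number, line in enumerate(page_text.splitlines(), start=1):
        if SECTION.match(line):
            toc.append((1, number, line.strip()))
        elif SUBSECTION.match(line):
            toc.append((2, number, line.strip()))
    return toc


if __name__ == "__main__":
    sample = (
        "NAME\n"
        "       lsp - list pages\n"
        "\n"
        "OPTIONS\n"
        "   Toggles\n"
        "       -i  Toggle case sensitivity.\n"
    )
    for level, number, title in build_toc(sample):
        print("  " * (level - 1) + f"{title} (line {number})")
```

A heuristic like this also picks up the page header and footer lines, since they start in column 0 as well, which is one way such a TOC ends up inexact.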
* Re: Playground pager lsp(1) 2023-04-04 23:45 ` Alejandro Colomar @ 2023-04-05 5:35 ` Eli Zaretskii 2023-04-06 1:10 ` Alejandro Colomar 2023-04-05 10:02 ` Playground pager lsp(1) Dirk Gouders 1 sibling, 1 reply; 73+ messages in thread From: Eli Zaretskii @ 2023-04-05 5:35 UTC (permalink / raw) To: Alejandro Colomar; +Cc: dirk, linux-man, help-texinfo > Date: Wed, 5 Apr 2023 01:45:46 +0200 > Cc: help-texinfo@gnu.org > From: Alejandro Colomar <alx.manpages@gmail.com> > > With the benefit that you don't need to implement such a system from scratch, > but just reusing the existing tools (apropos, man, whatis, ...). It seems > something like what I would have written if I had to implement info(1) from > scratch. I wish GNU guys had thought of this instead of developing their > own incompatible system. This last sentence is a misunderstanding. The goal of Texinfo is not to improve the man pages. Texinfo is a completely different approach to software documentation, which allows to write large books and then produce various on-line and off-line formats to read and efficiently search those books. Man pages have no means of specifying structure and hyper-links except by loosely-coupling pages via "SEE ALSO" cross-references at the end; they have no means of quickly and efficiently finding some specific subject except by text search (which usually produces a lot of false positives). By contrast, Texinfo documents have sectioning structure, have cross-references that can appear where you need them and point anywhere else in the document (or into another document). They also have indexing and commands that allow the reader to use the index in order to find the subject he/she is interested in very quickly and accurately, even if the text of the index entry doesn't appear anywhere in the manual. How can you document a large and flexible software package, such as GDB or Texinfo or Emacs, in man pages? 
It is a mistake to even compare these two documentation systems, certainly in this way. > > • In windowing environments lsp does complete resizes when windows > > get resized. This means it also reloads the manual page to fit the > > new window size. > > Good. This I miss it in less(1) often. Not sure if they had any strong > reason to not support that. ??? Why do you say 'less' doesn't support window resizing? It does here. > > • lsp has an experimental TOC mode. > > > > This is a three-level folding mode trying to list only section and > > sub-section names for quick navigation in manual pages. > > Nice, and this an important feature missing feature in info(1), as I > reported recently. :) It isn't missing. The TOC is presented as top-level menu in each manual, and large manuals have also the "detailed menu" with all the sub-nodes spelled out. In addition, the Emacs Info reader has the Info-toc command, which presents a structured menu with all the sectioning levels of a manual even if the manual didn't produce it. There are also more focused commands which present TOC-like lists across all the manuals, which you can then navigate to read what you deem appropriate. See the description of "--all" command-line option of the stand-alone Info reader. For example, try this command: $ info --all e --index-search "init file" There's also the index-apropos command from inside the stand-alone reader (and the matching info-apropos in the Emacs Info reader). ^ permalink raw reply [flat|nested] 73+ messages in thread
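The difference between an index lookup and a plain text search, as described in the message above, can be shown with a small sketch. The node names, node text, and index entries below are hypothetical and are not taken from any real Info manual; the functions only illustrate the idea that an author-written index entry can lead to a node whose text never contains the entry itself.

```python
# Hypothetical manual: node name -> node text.
NODES = {
    "Startup Settings": "Commands placed in the startup script run when the tool starts.",
    "Key Bindings": "Keys can be rebound in the startup script.",
}

# Hypothetical author-written index: entry -> node name.
INDEX = {
    "init file": "Startup Settings",
    "startup customization": "Startup Settings",
    "rebinding keys": "Key Bindings",
}


def text_search(nodes, query):
    """Return the names of nodes whose text contains the query."""
    needle = query.casefold()
    return sorted(name for name, body in nodes.items() if needle in body.casefold())


def index_search(index, query):
    """Return the names of nodes reachable from index entries matching the query."""
    needle = query.casefold()
    return sorted({node for entry, node in index.items() if needle in entry.casefold()})


if __name__ == "__main__":
    print(text_search(NODES, "init file"))   # no node text contains the phrase
    print(index_search(INDEX, "init file"))  # the index entry points at the node
```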
* Re: Playground pager lsp(1) 2023-04-05 5:35 ` Eli Zaretskii @ 2023-04-06 1:10 ` Alejandro Colomar 2023-04-06 8:11 ` Eli Zaretskii 2023-04-07 2:18 ` Playground pager lsp(1) G. Branden Robinson 0 siblings, 2 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-06 1:10 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dirk, linux-man, help-texinfo [-- Attachment #1.1: Type: text/plain, Size: 7586 bytes --] Hi Eli! On 4/5/23 07:35, Eli Zaretskii wrote: >> Date: Wed, 5 Apr 2023 01:45:46 +0200 >> Cc: help-texinfo@gnu.org >> From: Alejandro Colomar <alx.manpages@gmail.com> >> >> With the benefit that you don't need to implement such a system from scratch, >> but just reusing the existing tools (apropos, man, whatis, ...). It seems >> something like what I would have written if I had to implement info(1) from >> scratch. I wish GNU guys had thought of this instead of developing their >> own incompatible system. > > This last sentence is a misunderstanding. The goal of Texinfo is not > to improve the man pages. Texinfo is a completely different approach > to software documentation, which allows to write large books and then > produce various on-line and off-line formats to read and efficiently > search those books. "The manual was intended to be typeset; some detail is sacrificed on terminals." (man(1), _Unix Time-Sharing System Programmer's Manual_, Eighth Edition, Volume 1, February 1985) You mean books like this one? Courtesy of groff(1)'s Deri James =) <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf> Or maybe you prefer HTML? <https://man7.org/linux/man-pages/man1/intro.1.html> As to efficiency, I'm not going to open that melon, because we're both very biased to be efficient on the formats we each maintain. I'll just say that I don't see an objective winner in those terms. About variety of output formats, anything that can be produced by groff(1), man(7) can be translated. And groff(1) can do many formats. 
> > Man pages have no means of specifying structure .SH, .SS, .TP, .TQ, and very soon (hopefully weeks not months) .MR Those can be used to produce very precise links such as this one (one of my favourite references when reviewing man-pages patches): <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf#pdf%3Abm11886> And there's still room for improvement over what you'll see in that PDF, or what you can see in <man7.org>. > and hyper-links except > by loosely-coupling pages via "SEE ALSO" cross-references at the end; > they have no means of quickly and efficiently finding some specific > subject except by text search (which usually produces a lot of false > positives). I guess you mean searching from the command line by the name of the parameter to a function, or what? I would be interested in a more detailed description of what you want to be able to search in current pages (hopefully ones that I maintain, so I can speak of them) that you can't find easily? Maybe I can help making something more accessible. lsp(1) helps a little bit making the structure of man pages navigable, and it's currently implemented using heuristics, but if it worked together with groff(1) to get the real source of truth, it could get precise data without needing heuristics. > > By contrast, Texinfo documents have sectioning structure, have > cross-references that can appear where you need them and point > anywhere else in the document (or into another document). This was discussed as a possible extension to '.MR'. We're just not sure that there's a real need for that in manual pages (although there's not a consensus on that regard, and Branden, which I'm sure is reading this, may jump in at any moment :). > They also > have indexing and commands that allow the reader to use the index in > order to find the subject he/she is interested in very quickly and You mean whatis(1) and apropos(1)? 
lsp(1) makes use of those to be able to navigate all pages in the system (I guess this is more or less what info(1) does; with the obvious differences due to how nodes are organized). > accurately, even if the text of the index entry doesn't appear > anywhere in the manual. man pages have several ways: - Including keywords in the NAME section. - Link pages. - TH line. Of course, this is for the terminal. For PDF or HTML, you can get hyperlinks to any subsection (and in the future maybe even tagged paragraphs) within a page. > > How can you document a large and flexible software package, such as > GDB or Texinfo or Emacs, in man pages? git is a huge program, yet its man pages are quite useful. Just split your documentation at the right boundary, which usually requires a good design for your software that allows such division. $ apt-file show git-man | wc -l 190 > > It is a mistake to even compare these two documentation systems, > certainly in this way. The fact that current man(1) implementations don't exploit the whole power of man(7) doesn't mean you can't design a software that does. I'm sure you could build something similar to info(1) that got man(7) pages as its input. That PDF linked above is just a starter of what we want to do in the not far future. Hopefully we can also get some time to work on HTML. > >>> • In windowing environments lsp does complete resizes when windows >>> get resized. This means it also reloads the manual page to fit the >>> new window size. >> >> Good. This I miss it in less(1) often. Not sure if they had any strong >> reason to not support that. > > ??? Why do you say 'less' doesn't support window resizing? It does > here. Hmm, now that I think, it's probably an issue of coordinating man(1) and less(1). I sometimes wish that when I resize a window where I'm reading a man page, it would reformat the page from source. Of course, that might be a problem for keeping track of where I was, since lines moved around. 
Not sure how good lsp(1) is at that. > >>> • lsp has an experimental TOC mode. >>> >>> This is a three-level folding mode trying to list only section and >>> sub-section names for quick navigation in manual pages. >> >> Nice, and this an important feature missing feature in info(1), as I >> reported recently. :) > > It isn't missing. The TOC is presented as top-level menu in each > manual, and large manuals have also the "detailed menu" with all the > sub-nodes spelled out. In addition, the Emacs Info reader has the > Info-toc command, which presents a structured menu with all the > sectioning levels of a manual even if the manual didn't produce it. Ahh, yes, this is true. What I found missing is a kind of a map for knowing what I have available for navigating (also the fact that I don't usually run info(1) makes me be a bit fuzzy on detailing what is it that I miss from it). So, info(1) has a map of the sections available in a page, and does it also have a map of all the pages in the system (or whatever you call your pages, I don't yet really understand the organization of info manuals). > > There are also more focused commands which present TOC-like lists > across all the manuals, which you can then navigate to read what you > deem appropriate. See the description of "--all" command-line option > of the stand-alone Info reader. For example, try this command: > > $ info --all e --index-search "init file" > > There's also the index-apropos command from inside the stand-alone > reader (and the matching info-apropos in the Emacs Info reader). It's nice to talk to you, even if we usually disagree in how we find documentation more accessible :) Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
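Reformatting a page from source after a resize, as wished for in the message above, could look roughly like the sketch below. It assumes a man(1) implementation that honours the MANWIDTH environment variable. The function names are made up for this example, and the default command `man` is the documented default of lsp's --reload-command; how lsp itself reloads pages is not shown in this thread.

```python
import os
import shutil
import subprocess


def reload_environment(columns):
    """Copy the current environment and pin the man(1) output width."""
    env = dict(os.environ)
    env["MANWIDTH"] = str(columns)
    return env


def reload_page(name, columns=None, command=("man",)):
    """Re-render a manual page for the given width and return its text."""
    if columns is None:
        # Fall back to the current terminal width.
        columns = shutil.get_terminal_size().columns
    result = subprocess.run(
        [*command, name],
        env=reload_environment(columns),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A pager would call something like `reload_page("less", columns=new_width)` from its resize handler. Restoring the reading position afterwards, the concern raised above, is a separate problem that this sketch does not address.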
* Re: Playground pager lsp(1) 2023-04-06 1:10 ` Alejandro Colomar @ 2023-04-06 8:11 ` Eli Zaretskii 2023-04-06 8:48 ` Gavin Smith 2023-04-07 22:01 ` Alejandro Colomar 2023-04-07 2:18 ` Playground pager lsp(1) G. Branden Robinson 1 sibling, 2 replies; 73+ messages in thread From: Eli Zaretskii @ 2023-04-06 8:11 UTC To: Alejandro Colomar; +Cc: dirk, linux-man, help-texinfo > Date: Thu, 6 Apr 2023 03:10:59 +0200 > Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org > From: Alejandro Colomar <alx.manpages@gmail.com> > > > This last sentence is a misunderstanding. The goal of Texinfo is not > > to improve the man pages. Texinfo is a completely different approach > > to software documentation, which allows to write large books and then > > produce various on-line and off-line formats to read and efficiently > > search those books. > > "The manual was intended to be typeset; some detail is sacrificed on > terminals." (man(1), _Unix Time-Sharing System Programmer's Manual_, > Eighth Edition, Volume 1, February 1985) > > You mean books like this one? Courtesy of groff(1)'s Deri James =) > <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf> > > Or maybe you prefer HTML? > <https://man7.org/linux/man-pages/man1/intro.1.html> No, I mean books like "GNU Emacs Manual" or "Debugging with GDB" (https://shop.fsf.org/collection/books-docs). Or "War and Peace", for that matter. > As to efficiency, I'm not going to open that melon, because we're > both very biased to be efficient on the formats we each maintain. > I'll just say that I don't see an objective winner in those terms. How do you find the description of, say, "dereference symbolic link" (to take just a random example from the Emacs manual) when the actual text of the manual includes neither this string nor matches for any related regular expressions, like "dereference.*link"? 
The way Info does it is to use the index (which should be present in any respectable reference document) to find the description of the corresponding subject. Good indexing, which is done by the author of the document, should include index entries that specify subjects the reader could have in mind when he/she is looking for this kind of information. The corresponding index-searching commands of Info readers are a primary means for finding information quickly and efficiently, avoiding too many false positives and also avoiding frustrating misses, i.e., searches that fail to find anything pertinent. So this is not about objectivity, this is about features that either are present in the documentation system or are absent. I prefer the Info format to the HTML format of the same manual for this single reason: HTML browsers don't have the index searching capabilities (this will hopefully change soon; see the JS support in the latest Texinfo), and that issue alone was enough to turn me away from HTML, because I cannot afford to waste time on looking up information I cannot find instantly. > About variety of output formats, anything that can be produced by > groff(1), man(7) can be translated. And groff(1) can do many formats. Groff (and any other typesetting program) can be used for writing any kind of document. I'm not talking about the processors, I'm talking about the design of the documentation system as a whole and about what the products actually look like. IOW, I'm talking about the man pages produced by the typesetter, not about what can be done with the typesetter. > > Man pages have no means of specifying structure > > .SH, .SS, .TP, .TQ, and very soon (hopefully weeks not months) .MR These provide just one level. And how frequently are they used in actual man pages out there, even when available?
> Those can be used to produce very precise links such as this one > (one of my favourite references when reviewing man-pages patches): > <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf#pdf%3Abm11886> It's full of mojibake when I try reading it here. But anyway: what structure do you have there? It looks like just a long sequence of separate man pages. > > and hyper-links except > > by loosely-coupling pages via "SEE ALSO" cross-references at the end; > > they have no means of quickly and efficiently finding some specific > > subject except by text search (which usually produces a lot of false > > positives). > > I guess you mean searching from the command line by the name of the > parameter to a function, or what? No, I mean looking up a specific subject of interest without having to search/read through the entire document. > I would be interested in a more detailed description of what you > want to be able to search in current pages (hopefully ones that I > maintain, so I can speak of them) that you can't find easily? Maybe > I can help making something more accessible. See above, the example of using index-searching commands. > > By contrast, Texinfo documents have sectioning structure, have > > cross-references that can appear where you need them and point > > anywhere else in the document (or into another document). > > This was discussed as a possible extension to '.MR'. We're just not > sure that there's a real need for that in manual pages (although > there's not a consensus on that regard, and Branden, which I'm sure > is reading this, may jump in at any moment :). I cannot speak for man pages, but in serious documentation of any computer software you always need cross-references, because you cannot make any description self-contained without repeating the same stuff over and over and over again.
Here's a short example from a random place in the Emacs Lisp Reference manual: When an editing command returns to the editor command loop, Emacs automatically calls ‘set-buffer’ on the buffer shown in the selected window (*note Selecting Windows::). This is to prevent confusion: it ensures that the buffer that the cursor is in, when Emacs reads a command, is the buffer to which that command applies (*note Command Loop::). Thus, you should not use ‘set-buffer’ to switch visibly to a different buffer; for that, use the functions described in *note Switching Buffers::. The three places which say "see SOMETHING" are cross-references to other parts of the manual. Without being able to cross-reference there, the text would have to explain what it means by "selected window", what it means by "commands" and "command loop", and mention explicitly the functions to switch to a buffer which are already described in detail elsewhere. This allows readers who already know about those subjects to read the text without having to skip large amounts of unnecessary information, while also allowing readers who are unsure about those subjects to follow the link, read there, and then come back to the same place to continue reading. > > They also > > have indexing and commands that allow the reader to use the index in > > order to find the subject he/she is interested in very quickly and > > You mean whatis(1) and apropos(1)? No. These perform text searches on the titles of the man pages, and are therefore limited to what is in the title. Indexing is much more powerful, and works on the topics in the index (which, as explained above, could contain text not present anywhere else in the document). And every respectable Info manual has an index (some have several indices). See above about the commands which use the index. > > accurately, even if the text of the index entry doesn't appear > > anywhere in the manual.
> > man pages have several ways: > > - Including keywords in the NAME section. > - Link pages. > - TH line. This is not enough, IME. You need a way of "tagging" a chunk of text as describing, or being pertinent to, a particular subject, even if that subject does not appear literally in the text the reader will see. That's because when readers are after some specific material, they don't always have in mind the exact words used in the manual for describing that material; they could have some alternative phrases in their minds. Good indexing anticipates this, and provides index entries for those alternative phrases, allowing readers to find stuff quickly. > Of course, this is for the terminal. For PDF or HTML, you can > get hyperlinks to any subsection (and in the future maybe even > tagged paragraphs) within a page. In Info, references to any paragraph have long been available. They are invaluable in some situations, especially when some section is very long and you want to point to a very specific part thereof. > > How can you document a large and flexible software package, such as > > GDB or Texinfo or Emacs, in man pages? > > git is a huge program, yet its man pages are quite useful. Git is a huge heap of separate commands, with very little to glue them together in terms of documented functionalities. Still, even in Git, there is material that belongs to no command in particular, and thus is documented in man pages with invented names like "gitrevisions", which is impossible to guess in advance for a newbie who needs this information. Moreover, the introductory material and the explanation of basic concepts are not in man pages, but in a separate HTML document ("The Git User's Manual"), and likewise the API documentation, which in itself is a telltale sign.
While something like a huge heap of man pages is perhaps borderline reasonable for Git, it isn't reasonable for programs which are not easily broken into separate independent "pages", like GDB and Emacs. The more complex is the system of objects and concepts manipulated by the software, the less appropriate is the man-page format for describing it. > Just split your documentation at the right boundary, which > usually requires a good design for your software that allows > such division. Whether the manual is split or not is immaterial. Info manuals can also be split. The relevant issue is what the viewer allows the reader to do to read these chunks in a reasonable way, using efficient commands and features to find related information quickly. > The fact that current man(1) implementations don't exploit > the whole power of man(7) doesn't mean you can't design a > software that does. Indeed, it doesn't mean that. But we are discussing what is there, not what could be there in some distant future. > I'm sure you could build something similar to info(1) that > got man(7) pages as its input. No! The information about subsections, cross-references, and indices is missing. That information must be there to begin with, otherwise it cannot be recreated, because it's inserted by humans, not by programs. > > It isn't missing. The TOC is presented as top-level menu in each > > manual, and large manuals have also the "detailed menu" with all the > > sub-nodes spelled out. In addition, the Emacs Info reader has the > > Info-toc command, which presents a structured menu with all the > > sectioning levels of a manual even if the manual didn't produce it. > > Ahh, yes, this is true. What I found missing is a kind of a map for > knowing what I have available for navigating (also the fact that I > don't usually run info(1) makes me be a bit fuzzy on detailing what > is it that I miss from it). 
So, info(1) has a map of the sections > available in a page, and does it also have a map of all the pages > in the system (or whatever you call your pages, I don't yet really > understand the organization of info manuals). Yes, it does. If you invoke 'info' with no arguments, it will show the "directory" of all the installed manuals -- a large menu where each manual has at least one line explaining what the manual describes. Some manuals have much more than one line; examples include Coreutils and Binutils (which have a line for each individual command) and glibc (which has a line for every _function_).
* Re: Playground pager lsp(1) 2023-04-06 8:11 ` Eli Zaretskii @ 2023-04-06 8:48 ` Gavin Smith 2023-04-07 22:01 ` Alejandro Colomar 1 sibling, 0 replies; 73+ messages in thread From: Gavin Smith @ 2023-04-06 8:48 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Alejandro Colomar, dirk, linux-man, help-texinfo On Thu, Apr 06, 2023 at 11:11:40AM +0300, Eli Zaretskii wrote: > How do you find the description of, say, "dereference symbolic link" > (to take just a random example from the Emacs manual) when the actual > text of the manual include neither this string nor matches for any > related regular expressions, like "dereference.*link"? > > The way Info does it is to use the index (which should be present in > any respectable reference document) to find description of the > corresponding subject. The indexing, which is done by the author of > the document, if it's a good indexing, should include index entries > that specify subjects the reader could have in mind when he/she is > looking for this kind of information. > > The corresponding index-searching commands of Info readers are a > primary means for finding information quickly and efficiently, > avoiding too many false positives and also avoiding frustrating > misses, i.e., searches that fail to find anything pertinent. In the future, there should be a local documentation search driven by AI algorithms which handles synonyms and rewordings, so that if the user searches for "dereference", they also find text about "following a reference" even if the word "dereference" isn't used. Think of it like a version of G**gle running on your own machine. Implementing such a thing is beyond me, though.
* Re: Playground pager lsp(1) 2023-04-06 8:11 ` Eli Zaretskii 2023-04-06 8:48 ` Gavin Smith @ 2023-04-07 22:01 ` Alejandro Colomar 2023-04-08 7:05 ` Eli Zaretskii 1 sibling, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-07 22:01 UTC (permalink / raw) To: Eli Zaretskii Cc: dirk, linux-man, help-texinfo, наб, G. Branden Robinson, groff, Colin Watson [-- Attachment #1.1: Type: text/plain, Size: 15285 bytes --] Hi Eli, On 4/6/23 10:11, Eli Zaretskii wrote: >> Date: Thu, 6 Apr 2023 03:10:59 +0200 >> Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org >> From: Alejandro Colomar <alx.manpages@gmail.com> >> >>> This last sentence is a misunderstanding. The goal of Texinfo is not >>> to improve the man pages. Texinfo is a completely different approach >>> to software documentation, which allows to write large books and then >>> produce various on-line and off-line formats to read and efficiently >>> search those books. >> >> "The manual was intended to be typeset; some detail is sacrificed on >> terminals." (man(1), _Unix Time-Sharing System Programmer's Manual_, >> Eighth Edition, Volume 1, February 1985) >> >> You mean books like this one? Courtesy of groff(1)'s Deri James =) >> <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf> >> >> Or maybe you prefer HTML? >> <https://man7.org/linux/man-pages/man1/intro.1.html> > > No, I mean books like "GNU Emacs Manual" or "Debugging with GDB" > (https://shop.fsf.org/collection/books-docs). Or "War and Peace", for > that matter. > >> As to efficiency, I'm not going to open that melon, because we're >> both very biased to be efficient on the formats we each maintain. >> I'll just say that I don't see an objective winner in those terms. 
> > How do you find the description of, say, "dereference symbolic link" > (to take just a random example from the Emacs manual) when the actual > text of the manual include neither this string nor matches for any > related regular expressions, like "dereference.*link"? $ apropos link | grep sym | head -n5 readlink (2) - read value of a symbolic link readlinkat (2) - read value of a symbolic link sln (8) - create symbolic links symlink (2) - make a new name for a file symlink (7) - symbolic link handling I bet you're looking for readlink(2) and symlink(7), aren't you? > > The way Info does it is to use the index (which should be present in > any respectable reference document) to find description of the > corresponding subject. The indexing, which is done by the author of > the document, if it's a good indexing, should include index entries > that specify subjects the reader could have in mind when he/she is > looking for this kind of information. We do that too in man(7). For example, we improved the "index" for proc(5) recently, after наб lost some time without finding proc(5) in the list of pages that were interesting for the topic at hand: commit 2e1c1a57f138eedd35b7b2a825002fddb12d240f Author: наб <nabijaczleweli@nabijaczleweli.xyz> Date: Sat Apr 1 00:04:52 2023 +0200 proc.5: NAME: Add "system information, and sysctl" procfs hosts a whole host of information about the system, as well as sysctls; proc(5) hosts a description of a lot of sysctls, and at present there's no way to find that out. 
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Alejandro Colomar <alx@kernel.org> diff --git a/man5/proc.5 b/man5/proc.5 index 521402fe8..233cc1c9d 100644 --- a/man5/proc.5 +++ b/man5/proc.5 @@ -36,7 +36,7 @@ .\" .TH proc 5 (date) "Linux man-pages (unreleased)" .SH NAME -proc \- process information pseudo-filesystem +proc \- process information, system information, and sysctl pseudo-filesystem .SH DESCRIPTION The .B proc After this patch, if you apropos "system" or "sysctl", you'll see proc(5) pop up in your list. > > The corresponding index-searching commands of Info readers are a > primary means for finding information quickly and efficiently, > avoiding too many false positives and also avoiding frustrating > misses, i.e., searches that fail to find anything pertinent. That's no different than apropos(1). The only problem is when a man page feels like a one-page book. But if you split the book into several pages, then the index is useful to know which page you want. > > So this is not about objectivity, this is about features that either > are present in the documentation system or are absent. I prefer the > Info format to the HTML format of the same manual for this single > reason: HTML browsers don't have the index searching capabilities > (this is hopefully about to change, I hope, see the JS support in > latest Texinfo), and that issue alone was enough to avert me from > HTML, because I cannot afford wasting time on looking up information I > cannot find instantly. Yep, I also prefer man(1) over HTML man pages for similar reasons :). I can do whatis(1) and apropos(1) (although some man-pages websites have this capability too, but then I can't grep those results in the browser). > >> About variety of output formats, anything that can be produced by >> groff(1), man(7) can be translated. And groff(1) can do many formats. 
> > Groff (and any other typesetting program) can be used for writing any > kind of documents. I'm not talking about the processors, I'm talking > about the design of the documentation system as a whole and about what > the products actually look like. IOW, I'm talking about the man pages > produced by the typesetter, not about what can be done with the > typesetter. > >>> Man pages have no means of specifying structure >> >> .SH, .SS, .TP, .TQ, and very soon (hopefully weeks not months) .MR > > These provide just one level. We have many levels: book: /opt/local/foobar/man/ volume: man2/, man3/, ... chapter: man3/, man3type/, ... page: sscanf(3) section: sscanf(3)/DESCRIPTION subsection: sscanf(3)/DESCRIPTION/Conversions tags: sscanf(3)/DESCRIPTION/Conversions/n Branden, I now remember your wondering about MR and linking to specific locations in a page... Maybe we could use such a URI-like syntax for that. I guess it's not yet taken by any software, so we should be free to define paths in the 'man:' schema to mean this? > > And how frequently are they used in actual man pages out there, even > when available? Used in source man(7)? Always. > >> Those can be used to produce very precise links such as this one >> (one of my favourite references when reviewing man-pages patches): >> <https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/book/man-pages-6.04.01.pdf#pdf%3Abm11886> > > It's full of mojibake when I try reading it here. But anyway: what > structure do you have there? It looks just a long sequence of > separate man pages. There's a navigation panel in the left in most (all?) PDF readers. You can use that to navigate to the page you want, and get hyperlinks to pages or their contents. > >>> and hyper-links except >>> by loosely-coupling pages via "SEE ALSO" cross-references at the end; >>> they have no means of quickly and efficiently finding some specific >>> subject except by text search (which usually produces a lot of false >>> positives). 
>> >> I guess you mean searching from the command line by the name of the >> parameter to a function, or what? > > No, I mean looking a specific subject of interest without having to > search/read through the entire document. See symlink above. > >> I would be interested in a more detailed description of what you >> want to be able to search in current pages (hopefully ones that I >> maintain, so I can speak of them) that you can't find easily? Maybe >> I can help making something more accessible. > > See above, the example of using index-searching commands. Yep. I hope my answer about symlinks satisfied you. Cheers, Alex > >>> By contrast, Texinfo documents have sectioning structure, have >>> cross-references that can appear where you need them and point >>> anywhere else in the document (or into another document). >> >> This was discussed as a possible extension to '.MR'. We're just not >> sure that there's a real need for that in manual pages (although >> there's not a consensus on that regard, and Branden, which I'm sure >> is reading this, may jump in at any moment :). > > Cannot say about man pages, but in a serious documentation of any > computer software you always need cross-references, because you cannot > make any description self-contained without repeating the same stuff > over and over and over again. > > Here's a short examples from a random place in the Emacs Lisp > Reference manual: > > When an editing command returns to the editor command loop, Emacs > automatically calls ‘set-buffer’ on the buffer shown in the selected > window (*note Selecting Windows::). This is to prevent confusion: it > ensures that the buffer that the cursor is in, when Emacs reads a > command, is the buffer to which that command applies (*note Command > Loop::). Thus, you should not use ‘set-buffer’ to switch visibly to a > different buffer; for that, use the functions described in *note > Switching Buffers::. 
> > The three places which say with "see SOMETHING" are cross-references > to other parts of the manual. Without being able to cross-reference > there, the text would have to explain what it means by "selected > window", what it means by "commands" and "command loop", and mention > explicitly the functions to switch to a buffer which are already > described in detail elsewhere. This allows readers who already know > about those subjects to read the text without having to skip large > amounts of unnecessary information, while also allowing readers who > are not sure they know about that to be able to follow the link, read > there, and then come back to the same place to continue reading. > >>> They also >>> have indexing and commands that allow the reader to use the index in >>> order to find the subject he/she is interested in very quickly and >> >> You mean whatis(1) and apropos(1)? > > No. These perform text searches on the titles of the man pages, and > are therefore limited to what is in the title. Indexing is much more > powerful, and works on the topics in the index (which, as explained > above, could contain text not present anywhere else in the document). > And every respectful Info manual has an index (some have several > indices). See above about the commands which use the index. > >>> accurately, even if the text of the index entry doesn't appear >>> anywhere in the manual. >> >> man pages have several ways: >> >> - Including keywords in the NAME section. >> - Link pages. >> - TH line. > > This is not enough, IME. You need a way of "tagging" a chunk of text > as describing, or being pertinent to, a particular subject, even if > that subject does not appear literally in the text the reader will > see. That's because when readers are after some specific material, > they don't always have in mind the exact words used in the manual for > describing that material, they could have some alternative phrases in > their minds. 
Good indexing anticipates this in advance, and provides > index entries for those alternative phrases, allowing readers to find > stuff quickly. > >> Of course, this is for the terminal. For PDF or HTML, you can >> get hyperlinks to any subsection (and in the future maybe even >> tagged paragraphs) within a page. > > In Info, references to any paragraph are available since long ago. > They are invaluable in some situations, especially when some section > is very long and you want to point to a very specific part thereof. > >>> How can you document a large and flexible software package, such as >>> GDB or Texinfo or Emacs, in man pages? >> >> git is a huge program, yet its man pages are quite useful. > > Git is a huge heap of separate commands, with very little to glue them > together in terms of documented functionalities. Still, even in Git, > there's the stuff that belongs to neither command in particular, and > thus is documented in man pages with invented names like > "gitrevisions", which is impossible to guess in advance for a newbie > who needs this information. > > Moreover, the introduction material and the explanation of basic > concepts is not in man pages, but in a separate HTML document ("The > Git User's Manual"), and likewise the API documentation, which in > itself is a telltale sign. > > While something like a huge heap of man pages is perhaps borderline > reasonable for Git, it isn't reasonable for programs which are not > easily broken into separate independent "pages", like GDB and Emacs. > The more complex is the system of objects and concepts manipulated by > the software, the less appropriate is the man-page format for > describing it. > >> Just split your documentation at the right boundary, which >> usually requires a good design for your software that allows >> such division. > > Whether the manual is split or not is immaterial. Info manuals can > also be split. 
The relevant issue is what the viewer allows the > reader to do to read these chunks in a reasonable way, using efficient > commands and features to find related information quickly. > >> The fact that current man(1) implementations don't exploit >> the whole power of man(7) doesn't mean you can't design a >> software that does. > > Indeed, it doesn't mean that. But we are discussing what is there, > not what could be there in some distant future. > >> I'm sure you could build something similar to info(1) that >> got man(7) pages as its input. > > No! The information about subsections, cross-references, and indices > is missing. That information must be there to begin with, otherwise > it cannot be recreated, because it's inserted by humans, not by > programs. > >>> It isn't missing. The TOC is presented as top-level menu in each >>> manual, and large manuals have also the "detailed menu" with all the >>> sub-nodes spelled out. In addition, the Emacs Info reader has the >>> Info-toc command, which presents a structured menu with all the >>> sectioning levels of a manual even if the manual didn't produce it. >> >> Ahh, yes, this is true. What I found missing is a kind of a map for >> knowing what I have available for navigating (also the fact that I >> don't usually run info(1) makes me be a bit fuzzy on detailing what >> is it that I miss from it). So, info(1) has a map of the sections >> available in a page, and does it also have a map of all the pages >> in the system (or whatever you call your pages, I don't yet really >> understand the organization of info manuals). > > Yes, it does. If you invoke 'info' with no arguments, it will show > the "directory" of all the installed manuals -- a large menu where > each manual has at least one line explaining what the manual > describes. Some manuals have much more than one line; examples > include Coreutils and Binutils (which have a line for each individual > command) and glibc (which has a line for every _function_). 
-- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
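An illustrative sketch of the NAME-line matching discussed in the message above, assuming a much simplified model of apropos(1): a case-insensitive substring match against each page's name and one-line description. The function names `parse_name_line` and `apropos` are hypothetical helpers for this sketch, not part of man-db, and the real tool works from a prebuilt database rather than raw NAME lines. The two sample lines are the before and after text from the proc.5 patch quoted above.

```python
def parse_name_line(line):
    """Split a NAME-section line such as r'proc \- text' into (name, description)."""
    name, _, description = line.partition(r" \- ")
    return name.strip(), description.strip()


def apropos(keyword, name_lines):
    """Return page names whose name or description contains keyword.

    Simplification (assumption): plain case-insensitive substring match.
    """
    keyword = keyword.lower()
    hits = []
    for line in name_lines:
        name, description = parse_name_line(line)
        if keyword in name.lower() or keyword in description.lower():
            hits.append(name)
    return hits


# NAME lines taken from the proc.5 diff quoted in the message above.
BEFORE = [r"proc \- process information pseudo-filesystem"]
AFTER = [r"proc \- process information, system information, and sysctl pseudo-filesystem"]

print(apropos("sysctl", BEFORE))  # []
print(apropos("sysctl", AFTER))   # ['proc']
```

Under this model a page can only be found through words that literally appear in its NAME line, which is the property both sides of the thread are arguing about.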
* Re: Playground pager lsp(1) 2023-04-07 22:01 ` Alejandro Colomar @ 2023-04-08 7:05 ` Eli Zaretskii 2023-04-08 13:02 ` Accessibility of man pages (was: Playground pager lsp(1)) Alejandro Colomar 0 siblings, 1 reply; 73+ messages in thread From: Eli Zaretskii @ 2023-04-08 7:05 UTC (permalink / raw) To: Alejandro Colomar Cc: dirk, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff, cjwatson > Date: Sat, 8 Apr 2023 00:01:08 +0200 > Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org, > наб <nabijaczleweli@nabijaczleweli.xyz>, > "G. Branden Robinson" <g.branden.robinson@gmail.com>, groff <groff@gnu.org>, > Colin Watson <cjwatson@debian.org> > From: Alejandro Colomar <alx.manpages@gmail.com> > > > How do you find the description of, say, "dereference symbolic link" > > (to take just a random example from the Emacs manual) when the actual > > text of the manual include neither this string nor matches for any > > related regular expressions, like "dereference.*link"? > > $ apropos link | grep sym | head -n5 > readlink (2) - read value of a symbolic link > readlinkat (2) - read value of a symbolic link > sln (8) - create symbolic links > symlink (2) - make a new name for a file > symlink (7) - symbolic link handling > > I bet you're looking for readlink(2) and symlink(7), aren't you? I said "in the Emacs manual", and I said "when the actual text of the manual doesn't include the phrase you are looking for". So your example is not really up to its job: it uses text that is not the Emacs manual, and it finds only hits that literally appear in the title text of the man pages. For example, the above doesn't find the man page of Find, nor the man pages of cp and ls (and quite a few of others), all of which discuss what these utilities do with symbolic links. 
By contrast, the Info manual of Coreutils has almost 40 index entries starting with "symbolic link", and they are all shown when the user types "i symbolic link TAB" ('i' being the letter that invokes index-searching command). > diff --git a/man5/proc.5 b/man5/proc.5 > index 521402fe8..233cc1c9d 100644 > --- a/man5/proc.5 > +++ b/man5/proc.5 > @@ -36,7 +36,7 @@ > .\" > .TH proc 5 (date) "Linux man-pages (unreleased)" > .SH NAME > -proc \- process information pseudo-filesystem > +proc \- process information, system information, and sysctl pseudo-filesystem > .SH DESCRIPTION > The > .B proc > > > After this patch, if you apropos "system" or "sysctl", you'll see > proc(5) pop up in your list. This literally adds the text to what the reader will see. It makes the text longer and thus more difficult to read and parse, and there's a limit to how many key phrases you can add like this. By contrast, Texinfo lets you add any number of index entries that point to the same text. A random example from the Emacs manual: @cindex arrow keys @cindex moving point @cindex movement @cindex cursor motion @cindex moving the cursor To do more than insert characters, you have to know how to move point (@pxref{Point}). The keyboard commands @kbd{C-f}, @kbd{C-b}, @kbd{C-n}, and @kbd{C-p} move point to the right, left, down, and up, respectively. You can also move point using the @dfn{arrow keys} present on most keyboards: @key{RIGHT}, @key{LEFT}, @key{DOWN}, and @key{UP}; however, many Emacs users find that it is slower to use the arrow keys than the control keys, because you need to move your hand to the area of the keyboard where those keys are located. This paragraph has 5 index entries with different key phrases, all pointing to it. Different people will have different phrases in their minds when they think about "cursor movement", thus the need for several entries. 
One of the phrases appears in the text literally, the others don't; moreover, one of them, "movement", is a very frequent word, so searching for it with Grep is likely to produce a lot of false hits, whereas index-searching commands will not. > > The corresponding index-searching commands of Info readers are a > > primary means for finding information quickly and efficiently, > > avoiding too many false positives and also avoiding frustrating > > misses, i.e., searches that fail to find anything pertinent. > > That's no different than apropos(1). See above: it is very different. > >>> Man pages have no means of specifying structure > >> > >> .SH, .SS, .TP, .TQ, and very soon (hopefully weeks not months) .MR > > > > These provide just one level. > > We have many levels: > > book: /opt/local/foobar/man/ > volume: man2/, man3/, ... > chapter: man3/, man3type/, ... > page: sscanf(3) > section: sscanf(3)/DESCRIPTION > subsection: sscanf(3)/DESCRIPTION/Conversions > tags: sscanf(3)/DESCRIPTION/Conversions/n Texinfo has: - chapters - sections - subsections - subsubsections - unnumbered variants of the above (unnumberedsubsec etc.) - appendices (appendix, appendixsubsec etc.) - headings (majorheading, chapheading, subheading, etc.) More importantly, all those have meaningful names, not just standard labels like "DESCRIPTION" or "Conversions". So when you see them in the TOC or any similar navigation aid, you _know_, at least approximately, what each section is about. > >>> and hyper-links except > >>> by loosely-coupling pages via "SEE ALSO" cross-references at the end; > >>> they have no means of quickly and efficiently finding some specific > >>> subject except by text search (which usually produces a lot of false > >>> positives). > >> > >> I guess you mean searching from the command line by the name of the > >> parameter to a function, or what? > > > > No, I mean looking a specific subject of interest without having to > > search/read through the entire document.
>
> See symlink above.

Not relevant.

> >> I would be interested in a more detailed description of what you
> >> want to be able to search in current pages (hopefully ones that I
> >> maintain, so I can speak of them) that you can't find easily?  Maybe
> >> I can help making something more accessible.
> >
> > See above, the example of using index-searching commands.
>
> Yep.  I hope my answer about symlinks satisfied you.

No, it didn't.
* Accessibility of man pages (was: Playground pager lsp(1)) 2023-04-08 7:05 ` Eli Zaretskii @ 2023-04-08 13:02 ` Alejandro Colomar 2023-04-08 13:42 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-08 13:02 UTC (permalink / raw) To: Eli Zaretskii, cjwatson Cc: dirk, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff [-- Attachment #1.1: Type: text/plain, Size: 13257 bytes --] Hi Eli, Colin, On 4/8/23 09:05, Eli Zaretskii wrote: >> Date: Sat, 8 Apr 2023 00:01:08 +0200 >> Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org, >> наб <nabijaczleweli@nabijaczleweli.xyz>, >> "G. Branden Robinson" <g.branden.robinson@gmail.com>, groff <groff@gnu.org>, >> Colin Watson <cjwatson@debian.org> >> From: Alejandro Colomar <alx.manpages@gmail.com> >> >>> How do you find the description of, say, "dereference symbolic link" >>> (to take just a random example from the Emacs manual) when the actual >>> text of the manual include neither this string nor matches for any >>> related regular expressions, like "dereference.*link"? >> >> $ apropos link | grep sym | head -n5 >> readlink (2) - read value of a symbolic link >> readlinkat (2) - read value of a symbolic link >> sln (8) - create symbolic links >> symlink (2) - make a new name for a file >> symlink (7) - symbolic link handling >> >> I bet you're looking for readlink(2) and symlink(7), aren't you? > > I said "in the Emacs manual", I wanted to show the man-pages equivalent. Of course I know nothing about the Emacs manual :) > and I said "when the actual text of the > manual doesn't include the phrase you are looking for". So your > example is not really up to its job: it uses text that is not the > Emacs manual, and it finds only hits that literally appear in the > title text of the man pages. I thought you wanted to know about how dereferencing symlinks works in general. 
> For example, the above doesn't find the > man page of Find, If you want how symlinks are dereferenced by find(1): $ man find | grep sym.*link | head -n1 The -H, -L and -P options control the treatment of symbolic links. $ man find | sed -n '/^ -L/,/^$/p;' | sed '/^$/,$d' -L Follow symbolic links. When find examines or prints information about files, the information used shall be taken from the prop‐ erties of the file to which the link points, not from the link itself (unless it is a broken symbolic link or find is unable to examine the file to which the link points). Use of this option implies -noleaf. If you later use the -P option, -noleaf will still be in effect. If -L is in effect and find discovers a symbolic link to a subdirectory during its search, the subdirec‐ tory pointed to by the symbolic link will be searched. $ man find | sed -n '/^ -H/,/^$/p;' | sed '/^$/,$d' -H Do not follow symbolic links, except while processing the com‐ mand line arguments. When find examines or prints information about files, the information used shall be taken from the prop‐ erties of the symbolic link itself. The only exception to this behaviour is when a file specified on the command line is a sym‐ bolic link, and the link can be resolved. For that situation, the information used is taken from whatever the link points to (that is, the link is followed). The information about the link itself is used as a fallback if the file pointed to by the sym‐ bolic link cannot be examined. If -H is in effect and one of the paths specified on the command line is a symbolic link to a directory, the contents of that directory will be examined (though of course -maxdepth 0 would prevent this). $ man find | sed -n '/^ -P/,/^$/p;' | sed '/^$/,$d' -P Never follow symbolic links. This is the default behaviour. When find examines or prints information about files, and the file is a symbolic link, the information used shall be taken from the properties of the symbolic link itself. 
> nor the man pages of cp If you want to know how symlinks are handled by cp(1), then: $ man cp | grep sym.*link -B1 -H follow command-line symbolic links in SOURCE -- -L, --dereference always follow symbolic links in SOURCE -- -P, --no-dereference never follow symbolic links in SOURCE -- -s, --symbolic-link make symbolic links instead of copying > and ls (and quite a few of And similarly for ls(1): $ man ls | grep sym.*link -C2 -H, --dereference-command-line follow symbolic links listed on the command line --dereference-command-line-symlink-to-dir follow each command line symbolic link that points to a direc‐ tory -- -L, --dereference when showing file information for a symbolic link, show informa‐ tion for the file the link references rather than for the link itself > others), all of which discuss what these utilities do with symbolic > links. If you want to know how other command handles symlinks, look at that command's page, and try a few things with grep and sed. > By contrast, the Info manual of Coreutils has almost 40 index > entries starting with "symbolic link", and they are all shown when the > user types "i symbolic link TAB" ('i' being the letter that invokes > index-searching command). > >> diff --git a/man5/proc.5 b/man5/proc.5 >> index 521402fe8..233cc1c9d 100644 >> --- a/man5/proc.5 >> +++ b/man5/proc.5 >> @@ -36,7 +36,7 @@ >> .\" >> .TH proc 5 (date) "Linux man-pages (unreleased)" >> .SH NAME >> -proc \- process information pseudo-filesystem >> +proc \- process information, system information, and sysctl pseudo-filesystem >> .SH DESCRIPTION >> The >> .B proc >> >> >> After this patch, if you apropos "system" or "sysctl", you'll see >> proc(5) pop up in your list. > > This literally adds the text to what the reader will see. It makes > the text longer and thus more difficult to read and parse, and there's > a limit to how many key phrases you can add like this. 
If a page has too many topics, consider splitting the page (I agree that proc(5) is asking for that job). > By contrast, > Texinfo lets you add any number of index entries that point to the > same text. A random example from the Emacs manual: > > @cindex arrow keys > @cindex moving point > @cindex movement > @cindex cursor motion > @cindex moving the cursor Using consistent language across pages helps for these things. > To do more than insert characters, you have to know how to move > point (@pxref{Point}). The keyboard commands @kbd{C-f}, @kbd{C-b}, > @kbd{C-n}, and @kbd{C-p} move point to the right, left, down, and up, > respectively. You can also move point using the @dfn{arrow keys} > present on most keyboards: @key{RIGHT}, @key{LEFT}, > @key{DOWN}, and @key{UP}; however, many Emacs users find > that it is slower to use the arrow keys than the control keys, because > you need to move your hand to the area of the keyboard where those > keys are located. > > This paragraph has 5 index entries with different key phrases, all > pointing to it. Different people will have different phrases in their > minds when they think about "cursor movement", thus the need for > several entries. One of the phrases appears in the text literally, > the other don't; moreover, one of them, "movement" is a very frequent > word, so searching for it with Grep is likely to bring a lot of false > hits, whereas index-searching commands will not. > >>> The corresponding index-searching commands of Info readers are a >>> primary means for finding information quickly and efficiently, >>> avoiding too many false positives and also avoiding frustrating >>> misses, i.e., searches that fail to find anything pertinent. >> >> That's no different than apropos(1). > > See above: it is very different. > >>>>> Man pages have no means of specifying structure >>>> >>>> .SH, .SS, .TP, .TQ, and very soon (hopefully weeks not months) .MR >>> >>> These provide just one level. 
>> >> We have many levels: >> >> book: /opt/local/foobar/man/ >> volume: man2/, man3/, ... >> chapter: man3/, man3type/, ... >> page: sscanf(3) >> section: sscanf(3)/DESCRIPTION >> subsection: sscanf(3)/DESCRIPTION/Conversions >> tags: sscanf(3)/DESCRIPTION/Conversions/n > > Texinfo has: > > - chapters > - sections > - subsections > - subsubsections > - unnumbered variants of the above (unnumberedsubsec etc.) > - appendices (appendix, appendixsubsec etc.) > - headings (majorheading, chapheading, subheading, etc.) > > More importantly, all those have meaningful names, not just standard > labels like "DESCRIPTION" or "Conversions". "Conversions" is not a standard subsection. It's about conversion specifiers; something exclusive of sscanf(3). However, sections and above do be standardized, and I believe that's good, so that you can have some a-priori expectations of the organization of a page. > So when you see them in > TOC or any similar navigation aid, you _know_, at least approximately, > what each section is about. I know a priori that if I'm reading sscanf(3)'s SYNOPSIS, I'll find the function prototype for it. Or if I read printf(3)'s ATTRIBUTES I'll find the thread-safety of the function. So much, that I have functions for reading a specific section of a certain page: $ man_section man3/sscanf.3 SYNOPSIS sscanf(3) Library Functions Manual sscanf(3) SYNOPSIS #include <stdio.h> int sscanf(const char *restrict str, const char *restrict format, ...); #include <stdarg.h> int vsscanf(const char *restrict str, const char *restrict format, va_list ap); Feature Test Macro Requirements for glibc (see fea‐ ture_test_macros(7)): vsscanf(): _ISOC99_SOURCE || _POSIX_C_SOURCE >= 200112L Linux man‐pages (unreleased) (date) sscanf(3) $ man_section man3/printf.3 ATTRIBUTES printf(3) Library Functions Manual printf(3) ATTRIBUTES For an explanation of the terms used in this section, see attrib‐ utes(7). 
┌──────────────────────────────┬───────────────┬────────────────┐ │ Interface │ Attribute │ Value │ ├──────────────────────────────┼───────────────┼────────────────┤ │ printf(), fprintf(), │ Thread safety │ MT‐Safe locale │ │ sprintf(), snprintf(), │ │ │ │ vprintf(), vfprintf(), │ │ │ │ vsprintf(), vsnprintf() │ │ │ └──────────────────────────────┴───────────────┴────────────────┘ Linux man‐pages (unreleased) (date) printf(3) > >>>>> and hyper-links except >>>>> by loosely-coupling pages via "SEE ALSO" cross-references at the end; >>>>> they have no means of quickly and efficiently finding some specific >>>>> subject except by text search (which usually produces a lot of false >>>>> positives). text search has false positives, like anything else. But having good tools for handling text is the key to solving the problem. grep(1) and sed(1) are your friends when reading man pages. Colin, I've had a feeling for a long time that compressed pages are not very useful. These days, storage is cheap. How would you feel about having the man pages installed uncompressed in Debian? That would allow running text tools directly in /usr/share/man/. I've had to do that several times, and lucky me that I have the source code of the Linux man-pages checked out in my computers, but other users don't and they might have trouble finding for example which pages talk about RLIMIT_NOFILE. The only way I know of is: $ grep -rl RLIMIT_NOFILE man* man2/dup.2 man2/pidfd_getfd.2 man2/open.2 man2/fcntl.2 man2/poll.2 man2/pidfd_open.2 man2/getrlimit.2 man2/select.2 man2/seccomp_unotify.2 man3/getdtablesize.3 man3/mq_open.3 man3/errno.3 man3/sysconf.3 man5/proc.5 man7/unix.7 man7/fanotify.7 man7/capabilities.7 I'd like to enable this ability for everyone by not compressing system man pages. I guess we should talk to the Debian policy mailing list? 
Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
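Editor's sketch: the implementation of the man_section helper used above is not shown in this thread, so the following is not Alex's function. It is a minimal illustration, assuming a man(7) source file, of how a single section can be cut out by printing from the matching .SH line up to the next .SH line. The name man_section_sketch and the page demo.3 are hypothetical, and the output is page source rather than rendered text.

```shell
#!/bin/sh
# Hypothetical helper (not the man_section from the thread):
# print one section of a man(7) source file.
# Usage: man_section_sketch FILE SECTION
man_section_sketch() {
    awk -v want="$2" '
        /^\.SH / {
            # A section heading: extract its title and compare it.
            title = $0
            sub(/^\.SH +/, "", title)
            gsub(/"/, "", title)
            inside = (title == want)
        }
        inside { print }
    ' "$1"
}

# Self-contained demo on a throwaway page (made up for this example).
tmp=$(mktemp -d)
cat > "$tmp/demo.3" <<'EOF'
.TH demo 3 (date) "example"
.SH NAME
demo \- example page
.SH SYNOPSIS
.B int demo(void);
.SH DESCRIPTION
Does nothing.
EOF

synopsis=$(man_section_sketch "$tmp/demo.3" SYNOPSIS)
printf '%s\n' "$synopsis"
rm -rf "$tmp"
```

Because section names above the subsection level are standardized, the same call works across pages; a subsection (.SS) variant would need a second pattern.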
* Re: Accessibility of man pages (was: Playground pager lsp(1)) 2023-04-08 13:02 ` Accessibility of man pages (was: Playground pager lsp(1)) Alejandro Colomar @ 2023-04-08 13:42 ` Eli Zaretskii 2023-04-08 16:06 ` Alejandro Colomar 2023-04-08 13:47 ` Colin Watson [not found] ` <87a5zhwntt.fsf@ada> 2 siblings, 1 reply; 73+ messages in thread From: Eli Zaretskii @ 2023-04-08 13:42 UTC (permalink / raw) To: Alejandro Colomar Cc: cjwatson, dirk, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff > Date: Sat, 8 Apr 2023 15:02:59 +0200 > Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org, > nabijaczleweli@nabijaczleweli.xyz, g.branden.robinson@gmail.com, > groff@gnu.org > From: Alejandro Colomar <alx.manpages@gmail.com> > > If you want how symlinks are dereferenced by find(1): > > $ man find | grep sym.*link | head -n1 > The -H, -L and -P options control the treatment of symbolic links. That's because the text appears verbatim in the man page. Suppose the person in question doesn't think about "symbolic links", but has something else in mind, for example, "dereference". (Why? because he/she just happened to see that term in some article, and wanted to know what does Find do with that. Or for some other reason.) Then they will not find the description of symlink behavior of Find by searching for "dereference". Do you see the crucial issue here? Indexing can tag some text with topics which do not appear verbatim in the text, but instead anticipate what people could have in mind when they are searching for that text without knowing what it says, exactly. > >> After this patch, if you apropos "system" or "sysctl", you'll see > >> proc(5) pop up in your list. > > > > This literally adds the text to what the reader will see. It makes > > the text longer and thus more difficult to read and parse, and there's > > a limit to how many key phrases you can add like this. 
> > If a page has too many topics, consider splitting the page (I agree > that proc(5) is asking for that job). Indexing can tag any paragraph of text, not just the entire page. A page cannot usefully have too many keywords in its title, but it _can_ benefit from different keywords for different paragraphs. > > By contrast, > > Texinfo lets you add any number of index entries that point to the > > same text. A random example from the Emacs manual: > > > > @cindex arrow keys > > @cindex moving point > > @cindex movement > > @cindex cursor motion > > @cindex moving the cursor > > Using consistent language across pages helps for these things. There's no consistency when we want to be friendly to different people with vastly different backgrounds and cultural preferences. Good indexing will anticipate any "inconsistent" habits. And, once again, since the index entries don't appear in the text presented to the reader, the text remains consistent even if the index entries draw from different inconsistent sources. > > Texinfo has: > > > > - chapters > > - sections > > - subsections > > - subsubsections > > - unnumbered variants of the above (unnumberedsubsec etc.) > > - appendices (appendix, appendixsubsec etc.) > > - headings (majorheading, chapheading, subheading, etc.) > > > > More importantly, all those have meaningful names, not just standard > > labels like "DESCRIPTION" or "Conversions". > > "Conversions" is not a standard subsection. It's about conversion > specifiers; something exclusive of sscanf(3). However, sections and > above do be standardized, and I believe that's good, so that you can > have some a-priori expectations of the organization of a page. But it then makes it impossible to add sections with meaningful names, if those names aren't standardized. > > So when you see them in > > TOC or any similar navigation aid, you _know_, at least approximately, > > what each section is about. 
> > I know a priori that if I'm reading sscanf(3)'s SYNOPSIS, I'll find > the function prototype for it. Or if I read printf(3)'s ATTRIBUTES > I'll find the thread-safety of the function. SYNOPSIS is at least approximately self-describing (although some non-native English speakers might stumble on it). But how would a random reader know that ATTRIBUTES will describe thread-safety, for example? I wouldn't. Isn't it better to have a section named "Thread Safety" instead? > text search has false positives, like anything else. But having good > tools for handling text is the key to solving the problem. grep(1) > and sed(1) are your friends when reading man pages. Modern documentation is not plain text (even if we ignore compression), so tools which just search the text have limitations, sometimes serious ones. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Accessibility of man pages (was: Playground pager lsp(1)) 2023-04-08 13:42 ` Eli Zaretskii @ 2023-04-08 16:06 ` Alejandro Colomar 0 siblings, 0 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-08 16:06 UTC (permalink / raw) To: Eli Zaretskii Cc: cjwatson, dirk, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff [-- Attachment #1.1: Type: text/plain, Size: 5401 bytes --] Hi Eli, On 4/8/23 15:42, Eli Zaretskii wrote: >> Date: Sat, 8 Apr 2023 15:02:59 +0200 >> Cc: dirk@gouders.net, linux-man@vger.kernel.org, help-texinfo@gnu.org, >> nabijaczleweli@nabijaczleweli.xyz, g.branden.robinson@gmail.com, >> groff@gnu.org >> From: Alejandro Colomar <alx.manpages@gmail.com> >> >> If you want how symlinks are dereferenced by find(1): >> >> $ man find | grep sym.*link | head -n1 >> The -H, -L and -P options control the treatment of symbolic links. > > That's because the text appears verbatim in the man page. Suppose the > person in question doesn't think about "symbolic links", but has > something else in mind, for example, "dereference". (Why? because > he/she just happened to see that term in some article, and wanted to > know what does Find do with that. Or for some other reason.) Then > they will not find the description of symlink behavior of Find by > searching for "dereference". That's why using consistent language is important. Searching just for "dereference" will of course have slightly less quality, but that should be expected. Once you have a slightly related match, you can find terms that will help refine your search. $ man find | grep dereference -C1 When the -H or -L options are in effect, any symbolic links listed as the argument of -newer will be dereferenced, and the timestamp will be taken from the file to which the symbolic -- used but -follow is, any symbolic links appearing after -follow on the command line will be dereferenced, and those before it will not). 
-- haviour of the -newer predicate; any files listed as the argument of -newer will be dereferenced if they are sym‐ bolic links. The same consideration applies to -new‐ -- -newer Supported. If the file specified is a symbolic link, it is always dereferenced. This is a change from previous behaviour, which used to take the relevant time from the This already shows "symbolic link" several times, so you probably want to search for that. If you want something that processes natural language, you can always ask some AI engine to process man pages for you ;). > > Do you see the crucial issue here? Indexing can tag some text with > topics which do not appear verbatim in the text, but instead > anticipate what people could have in mind when they are searching for > that text without knowing what it says, exactly. I don't remember myself having had such issues so far. I'd like to see real reports of readers that struggle to find a certain search term in a certain page. There are, but few (the only one I remember is this one we had recently about proc(5)). If you ever have such a real case with man pages, please report it, and I will try to make it more accessible. The intention is that a combination of man(1), apropos(1), whatis(1), and then some grep(1) and sed(1) should be enough 99% of the time, and we should fix the outliers. > >>>> After this patch, if you apropos "system" or "sysctl", you'll see >>>> proc(5) pop up in your list. >>> >>> This literally adds the text to what the reader will see. It makes >>> the text longer and thus more difficult to read and parse, and there's >>> a limit to how many key phrases you can add like this. >> >> If a page has too many topics, consider splitting the page (I agree >> that proc(5) is asking for that job). > > Indexing can tag any paragraph of text, not just the entire page. A > page cannot usefully have too many keywords in its title, but it _can_ > benefit from different keywords for different paragraphs. 
We can add source code comments, which would appear in `man -K` searches, but so far I haven't seen the need in any specific page. [...] > >>> So when you see them in >>> TOC or any similar navigation aid, you _know_, at least approximately, >>> what each section is about. >> >> I know a priori that if I'm reading sscanf(3)'s SYNOPSIS, I'll find >> the function prototype for it. Or if I read printf(3)'s ATTRIBUTES >> I'll find the thread-safety of the function. > > SYNOPSIS is at least approximately self-describing (although some > non-native English speakers might stumble on it). But how would a > random reader know that ATTRIBUTES will describe thread-safety, for > example? I wouldn't. Isn't it better to have a section named "Thread > Safety" instead? I don't know the origin of the name of ATTRIBUTES. There's attributes(7), which documents what you can find there. > >> text search has false positives, like anything else. But having good >> tools for handling text is the key to solving the problem. grep(1) >> and sed(1) are your friends when reading man pages. > > Modern documentation is not plain text (even if we ignore > compression), so tools which just search the text have limitations, > sometimes serious ones. In some cases you need to search the man(7) source code to get extra information that is difficult to search in formatted text, but that's for rare cases. So far, I find mostly everything I need just with text tools. Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
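Editor's sketch: the idea of keyword comments in page source can be illustrated without man(1) itself. The "index:" comment convention and the page demo.1 below are assumptions made up for this example; the thread does not define any format. A plain grep over the source stands in for a source-level search, and filtering out roff comment lines approximates what a reader would see.

```shell
#!/bin/sh
tmp=$(mktemp -d)

# Throwaway page: the keywords live only in a roff comment (.\" ...).
cat > "$tmp/demo.1" <<'EOF'
.TH demo 1 (date) "example"
.\" index: dereference, follow links
.SH NAME
demo \- handle symbolic links
EOF

# Searching the source finds the keyword inside the comment.
src_hits=$(grep -c 'dereference' "$tmp/demo.1" || true)

# With comment lines removed, the keyword no longer matches,
# i.e. it does not lengthen the text the reader sees.
visible_hits=$(grep -v '^\.\\"' "$tmp/demo.1" | grep -c 'dereference' || true)

echo "source matches: $src_hits, visible matches: $visible_hits"
rm -rf "$tmp"
```

This only shows that such keywords are reachable by source searches; unlike a Texinfo index entry, a comment does not point at a specific paragraph in the rendered page.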
* Re: Accessibility of man pages (was: Playground pager lsp(1)) 2023-04-08 13:02 ` Accessibility of man pages (was: Playground pager lsp(1)) Alejandro Colomar 2023-04-08 13:42 ` Eli Zaretskii @ 2023-04-08 13:47 ` Colin Watson 2023-04-08 15:42 ` Alejandro Colomar 2023-04-08 19:48 ` Accessibility of man pages Dirk Gouders [not found] ` <87a5zhwntt.fsf@ada> 2 siblings, 2 replies; 73+ messages in thread From: Colin Watson @ 2023-04-08 13:47 UTC (permalink / raw) To: Alejandro Colomar Cc: Eli Zaretskii, dirk, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff On Sat, Apr 08, 2023 at 03:02:59PM +0200, Alejandro Colomar wrote: > Colin, I've had a feeling for a long time that compressed pages are > not very useful. These days, storage is cheap. How would you feel > about having the man pages installed uncompressed in Debian? That > would allow running text tools directly in /usr/share/man/. I'm not personally all that bothered either way, but it's a distribution-wide policy decision rather than something I'd decide on. I suspect there are still some people who would push back against the space cost. > I've had to do that several times, and lucky me that I have the source > code of the Linux man-pages checked out in my computers, but other > users don't and they might have trouble finding for example which > pages talk about RLIMIT_NOFILE. The only way I know of is: man -Kaw RLIMIT_NOFILE (This looks at the page source rather than the rendered output, so sometimes it over-reports if your search term matches a groff macro, etc. But that's true of your approach too.) -- Colin Watson (he/him) [cjwatson@debian.org] ^ permalink raw reply [flat|nested] 73+ messages in thread
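Editor's sketch: for systems where pages are installed partly compressed and partly not, a source search can be approximated with a small loop. The function name grep_pages and the demo tree are hypothetical; only gzip-compressed and plain files are handled here, and, as noted above for man -K, a source search can over-report when the term matches markup rather than text.

```shell
#!/bin/sh
# Hypothetical helper: list files under DIR whose contents match TERM,
# whether they are gzip-compressed (*.gz) or plain.
# Usage: grep_pages TERM DIR
grep_pages() {
    find "$2" -type f | while IFS= read -r f; do
        case $f in
            *.gz)
                if gzip -dc -- "$f" | grep -q -- "$1"; then
                    printf '%s\n' "$f"
                fi ;;
            *)
                if grep -q -- "$1" "$f"; then
                    printf '%s\n' "$f"
                fi ;;
        esac
    done
}

# Self-contained demo tree (made up for this example).
tmp=$(mktemp -d)
mkdir "$tmp/man2" "$tmp/man3" "$tmp/man7"
printf '.SH NAME\nRLIMIT_NOFILE is mentioned here\n' > "$tmp/man2/a.2"
gzip "$tmp/man2/a.2"                      # becomes a.2.gz
printf '.SH NAME\nRLIMIT_NOFILE again\n' > "$tmp/man3/b.3"
printf '.SH NAME\nunrelated page\n'      > "$tmp/man7/c.7"

hits=$(grep_pages RLIMIT_NOFILE "$tmp" | sort)
printf '%s\n' "$hits"
rm -rf "$tmp"
```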
* Re: Accessibility of man pages (was: Playground pager lsp(1)) 2023-04-08 13:47 ` Colin Watson @ 2023-04-08 15:42 ` Alejandro Colomar 2023-04-08 19:48 ` Accessibility of man pages Dirk Gouders 1 sibling, 0 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-08 15:42 UTC (permalink / raw) To: Colin Watson Cc: Eli Zaretskii, dirk, linux-man, nabijaczleweli, g.branden.robinson, groff, help-texinfo [-- Attachment #1.1: Type: text/plain, Size: 1387 bytes --] Hi Colin, On 4/8/23 15:47, Colin Watson wrote: > On Sat, Apr 08, 2023 at 03:02:59PM +0200, Alejandro Colomar wrote: >> Colin, I've had a feeling for a long time that compressed pages are >> not very useful. These days, storage is cheap. How would you feel >> about having the man pages installed uncompressed in Debian? That >> would allow running text tools directly in /usr/share/man/. > > I'm not personally all that bothered either way, but it's a > distribution-wide policy decision rather than something I'd decide on. > I suspect there are still some people who would push back against the > space cost. > >> I've had to do that several times, and lucky me that I have the source >> code of the Linux man-pages checked out in my computers, but other >> users don't and they might have trouble finding for example which >> pages talk about RLIMIT_NOFILE. The only way I know of is: > > man -Kaw RLIMIT_NOFILE Hmm, interesting; I didn't know about -K. > > (This looks at the page source rather than the rendered output, so > sometimes it over-reports if your search term matches a groff macro, > etc. But that's true of your approach too.) Yeah, this should be good for most purposes. Consider my itch scratched. :) Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Accessibility of man pages 2023-04-08 13:47 ` Colin Watson 2023-04-08 15:42 ` Alejandro Colomar @ 2023-04-08 19:48 ` Dirk Gouders 2023-04-08 20:02 ` Eli Zaretskii 2023-04-08 20:31 ` Ingo Schwarze 1 sibling, 2 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-08 19:48 UTC (permalink / raw) To: Alejandro Colomar Cc: Colin Watson, Eli Zaretskii, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Hi Alex, Colin Watson <cjwatson@debian.org> writes: > On Sat, Apr 08, 2023 at 03:02:59PM +0200, Alejandro Colomar wrote: >> Colin, I've had a feeling for a long time that compressed pages are >> not very useful. These days, storage is cheap. How would you feel >> about having the man pages installed uncompressed in Debian? That >> would allow running text tools directly in /usr/share/man/. > > I'm not personally all that bothered either way, but it's a > distribution-wide policy decision rather than something I'd decide on. > I suspect there are still some people who would push back against the > space cost. > >> I've had to do that several times, and lucky me that I have the source >> code of the Linux man-pages checked out in my computers, but other >> users don't and they might have trouble finding for example which >> pages talk about RLIMIT_NOFILE. The only way I know of is: >> >> $ grep -rl RLIMIT_NOFILE man* >> man2/dup.2 >> man2/pidfd_getfd.2 >> man2/open.2 >> man2/fcntl.2 >> man2/poll.2 >> man2/pidfd_open.2 >> man2/getrlimit.2 >> man2/select.2 >> man2/seccomp_unotify.2 >> man3/getdtablesize.3 >> man3/mq_open.3 >> man3/errno.3 >> man3/sysconf.3 >> man5/proc.5 >> man7/unix.7 >> man7/fanotify.7 >> man7/capabilities.7 > > man -Kaw RLIMIT_NOFILE Sometimes it is good to have options and one would be bzgrep(1). 
As far as I know it doesn't understand "-r" but:

$ find /usr/share/man -type f -exec bzgrep -l RLIMIT_NOFILE {} \;
/usr/share/man/man1/runuser.1.bz2
/usr/share/man/man1/su.1.bz2
/usr/share/man/man1/nghttpx.1.bz2
/usr/share/man/man3/getdtablesize.3.bz2
/usr/share/man/man3/mq_open.3.bz2
/usr/share/man/man3/errno.3.bz2
/usr/share/man/man3/sysconf.3.bz2
/usr/share/man/man3p/getrlimit.3p.bz2
/usr/share/man/man3p/sysconf.3p.bz2
/usr/share/man/man3p/posix_spawn_file_actions_addclose.3p.bz2
/usr/share/man/man0p/sys_resource.h.0p.bz2
/usr/share/man/man2/pidfd_open.2.bz2
/usr/share/man/man2/poll.2.bz2
/usr/share/man/man2/getrlimit.2.bz2
/usr/share/man/man2/open.2.bz2
/usr/share/man/man2/select.2.bz2
/usr/share/man/man2/fcntl.2.bz2
/usr/share/man/man2/seccomp_unotify.2.bz2
/usr/share/man/man2/dup.2.bz2
/usr/share/man/man2/pidfd_getfd.2.bz2
/usr/share/man/man7/fanotify.7.bz2
/usr/share/man/man7/capabilities.7.bz2
/usr/share/man/man7/unix.7.bz2
/usr/share/man/man5/proc.5.bz2

Yes, it's very slow but close to `man -K`:

find...         man -K...

real 107.45     real 96.34
user 117.06     user 70.11
sys 14.43       sys 26.86

[a thought later]

Oh, I found something much faster:

$ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE
[snip]

real 24.30
user 32.34
sys 6.84

Hmm, perhaps, someone has an explanation for this?

Cheers,

Dirk
* Re: Accessibility of man pages 2023-04-08 19:48 ` Accessibility of man pages Dirk Gouders @ 2023-04-08 20:02 ` Eli Zaretskii 2023-04-08 20:46 ` Dirk Gouders 2023-04-09 10:28 ` Ralph Corderoy 2023-04-08 20:31 ` Ingo Schwarze 1 sibling, 2 replies; 73+ messages in thread From: Eli Zaretskii @ 2023-04-08 20:02 UTC (permalink / raw) To: Dirk Gouders Cc: alx.manpages, cjwatson, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff > From: Dirk Gouders <dirk@gouders.net> > Cc: Colin Watson <cjwatson@debian.org>, Eli Zaretskii <eliz@gnu.org>, > linux-man@vger.kernel.org, help-texinfo@gnu.org, > nabijaczleweli@nabijaczleweli.xyz, g.branden.robinson@gmail.com, > groff@gnu.org > Date: Sat, 08 Apr 2023 21:48:13 +0200 > > $ find /usr/share/man -type f -exec bzgrep -l RLIMIT_NOFILE {} \; > /usr/share/man/man1/runuser.1.bz2 > /usr/share/man/man1/su.1.bz2 > /usr/share/man/man1/nghttpx.1.bz2 > /usr/share/man/man3/getdtablesize.3.bz2 > /usr/share/man/man3/mq_open.3.bz2 > /usr/share/man/man3/errno.3.bz2 > /usr/share/man/man3/sysconf.3.bz2 > /usr/share/man/man3p/getrlimit.3p.bz2 > /usr/share/man/man3p/sysconf.3p.bz2 > /usr/share/man/man3p/posix_spawn_file_actions_addclose.3p.bz2 > /usr/share/man/man0p/sys_resource.h.0p.bz2 > /usr/share/man/man2/pidfd_open.2.bz2 > /usr/share/man/man2/poll.2.bz2 > /usr/share/man/man2/getrlimit.2.bz2 > /usr/share/man/man2/open.2.bz2 > /usr/share/man/man2/select.2.bz2 > /usr/share/man/man2/fcntl.2.bz2 > /usr/share/man/man2/seccomp_unotify.2.bz2 > /usr/share/man/man2/dup.2.bz2 > /usr/share/man/man2/pidfd_getfd.2.bz2 > /usr/share/man/man7/fanotify.7.bz2 > /usr/share/man/man7/capabilities.7.bz2 > /usr/share/man/man7/unix.7.bz2 > /usr/share/man/man5/proc.5.bz2 > > Yes, it's very slow but close to `man -K`: > > find... man -K... 
>
> real 107.45     real 96.34
> user 117.06     user 70.11
> sys 14.43       sys 26.86
>
> [a thought later]
>
> Oh, I found something much faster:
>
> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE
> [snip]
>
> real 24.30
> user 32.34
> sys 6.84
>
> Hmm, perhaps, someone has an explanation for this?

Multiprocessing, obviously.  Your CPU has more than one execution
unit, so the pipe via xargs runs 'find' and 'bzgrep' in parallel on
two different execution units.  By contrast, "find -exec" runs them
sequentially, in a single thread.
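Editor's sketch: besides find and bzgrep running concurrently, a second factor is probably at work. `find -exec CMD {} \;` starts CMD once per file, while xargs passes many file names to each invocation, so far fewer processes are started. The drop in user time from 117.06 to 32.34 in the numbers above is consistent with that, although nothing in the thread isolates the two effects. The script count_grep below is a hypothetical stand-in for bzgrep that records each invocation.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir "$tmp/pages"
for i in 1 2 3 4 5; do
    echo "page $i" > "$tmp/pages/p$i.txt"
done

# Hypothetical stand-in for bzgrep: logs one line per invocation,
# then greps all the files it was given.
cat > "$tmp/count_grep" <<EOF
echo run >> "$tmp/log"
exec grep -l page "\$@"
EOF

# One invocation per file.
: > "$tmp/log"
find "$tmp/pages" -type f -exec sh "$tmp/count_grep" {} \; > /dev/null
per_file=$(wc -l < "$tmp/log" | tr -d ' ')

# xargs batches the file names into as few invocations as possible.
: > "$tmp/log"
find "$tmp/pages" -type f | xargs sh "$tmp/count_grep" > /dev/null
batched=$(wc -l < "$tmp/log" | tr -d ' ')

echo "-exec ... \\;: $per_file invocations; xargs: $batched invocation(s)"
rm -rf "$tmp"
```

`find ... -exec CMD {} +` batches in the same way as xargs and avoids the pipe altogether.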
* Re: Accessibility of man pages 2023-04-08 20:02 ` Eli Zaretskii @ 2023-04-08 20:46 ` Dirk Gouders 2023-04-08 21:53 ` Alejandro Colomar 2023-04-09 10:28 ` Ralph Corderoy 1 sibling, 1 reply; 73+ messages in thread From: Dirk Gouders @ 2023-04-08 20:46 UTC (permalink / raw) To: Eli Zaretskii Cc: alx.manpages, cjwatson, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Eli Zaretskii <eliz@gnu.org> writes: >> From: Dirk Gouders <dirk@gouders.net> >> Cc: Colin Watson <cjwatson@debian.org>, Eli Zaretskii <eliz@gnu.org>, >> linux-man@vger.kernel.org, help-texinfo@gnu.org, >> nabijaczleweli@nabijaczleweli.xyz, g.branden.robinson@gmail.com, >> groff@gnu.org >> Date: Sat, 08 Apr 2023 21:48:13 +0200 >> >> $ find /usr/share/man -type f -exec bzgrep -l RLIMIT_NOFILE {} \; >> /usr/share/man/man1/runuser.1.bz2 >> /usr/share/man/man1/su.1.bz2 >> /usr/share/man/man1/nghttpx.1.bz2 >> /usr/share/man/man3/getdtablesize.3.bz2 >> /usr/share/man/man3/mq_open.3.bz2 >> /usr/share/man/man3/errno.3.bz2 >> /usr/share/man/man3/sysconf.3.bz2 >> /usr/share/man/man3p/getrlimit.3p.bz2 >> /usr/share/man/man3p/sysconf.3p.bz2 >> /usr/share/man/man3p/posix_spawn_file_actions_addclose.3p.bz2 >> /usr/share/man/man0p/sys_resource.h.0p.bz2 >> /usr/share/man/man2/pidfd_open.2.bz2 >> /usr/share/man/man2/poll.2.bz2 >> /usr/share/man/man2/getrlimit.2.bz2 >> /usr/share/man/man2/open.2.bz2 >> /usr/share/man/man2/select.2.bz2 >> /usr/share/man/man2/fcntl.2.bz2 >> /usr/share/man/man2/seccomp_unotify.2.bz2 >> /usr/share/man/man2/dup.2.bz2 >> /usr/share/man/man2/pidfd_getfd.2.bz2 >> /usr/share/man/man7/fanotify.7.bz2 >> /usr/share/man/man7/capabilities.7.bz2 >> /usr/share/man/man7/unix.7.bz2 >> /usr/share/man/man5/proc.5.bz2 >> >> Yes, it's very slow but close to `man -K`: >> >> find... man -K... 
>> >> real 107.45 real 96.34 >> user 117.06 user 70.11 >> sys 14.43 sys 26.86 >> >> [a thought later] >> >> Oh, I found something much faster: >> >> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE >> [snip] >> >> real 24.30 >> user 32.34 >> sys 6.84 >> >> Hmm, perhaps, someone has an explanation for this? > > Multiprocessing, obviously. Your CPU has more than one execution > unit, so the pipe via xargs runs 'find' and 'bzgrep' in parallel on > two different execution units. By contrast, "find -exec" runs them > sequentially, in a single thread. Yes, that must be it, thanks. I noticed `man -K...` uses up to four CPUs in parallel and therefore was unsure. With your explanation, we can get even faster: $ time -p find /usr/share/man -type f | xargs -P 6 bzgrep -l RLIMIT_NOFILE [snip] real 7.56 user 32.97 sys 7.02 Dirk PS: Colin, too late, I noticed a Mail-Followup-To Header in your mail. For the future: Is it correct that in such a case one should use that recipient list (without your address) -- even if he replies to something you wrote? In that case: I'm sorry I did that wrong. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Accessibility of man pages 2023-04-08 20:46 ` Dirk Gouders @ 2023-04-08 21:53 ` Alejandro Colomar 2023-04-08 22:33 ` Alejandro Colomar 0 siblings, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-08 21:53 UTC (permalink / raw) To: Dirk Gouders, Eli Zaretskii, cjwatson, Ingo Schwarze Cc: linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff [-- Attachment #1.1: Type: text/plain, Size: 3777 bytes --] Hi Dirk, Ingo, Eli, Colin, I prepared some (hopefully) fair comparison: $ sudo make install-man prefix=/opt/local/man/compressed -j LINK_PAGES=symlink Z=.gz >/dev/null $ sudo make install-man prefix=/opt/local/man/expanded__ -j LINK_PAGES=symlink >/dev/null I don't know what kind of magic man(1) does to be so fast reading compressed pages: $ export MANPATH=/opt/local/man/compressed/share/man $ time man -Kaw RLIMIT_NOFILE | wc -l 17 real 0m0.330s user 0m0.261s sys 0m0.074s $ time find $MANPATH -type f | xargs zgrep -l RLIMIT_NOFILE | wc -l 17 real 0m3.732s user 0m4.776s sys 0m0.703s $ time find $MANPATH -type f | xargs -P0 zgrep -l RLIMIT_NOFILE | wc -l 17 real 0m3.403s user 0m4.706s sys 0m0.699s $ time find $MANPATH -type f | while read f; do zcat $f | grep -l RLIMIT_NOFILE >/dev/null && echo "$f"; done | wc -l 17 real 0m3.730s user 0m4.769s sys 0m1.973s man(1) seems to be faster than reading uncompressed pages! See: $ export MANPATH=/opt/local/man/expanded__/share/man $ time man -Kaw RLIMIT_NOFILE | wc -l 35 real 0m1.138s user 0m0.669s sys 0m0.470s $ time find $MANPATH -type f | xargs grep -l RLIMIT_NOFILE | wc -l 17 real 0m0.018s user 0m0.007s sys 0m0.015s Having the pages uncompressed seems to be an important advantage for searching through the sources. 0.018 (with the manual search) is more than 10x faster than what man(1) can get from compressed pages. And it allows using more complex tools, like pcre2grep(1), or sed(1) for more complex searches. 
Colin, did I do anything wrong to have this slowness in man(1) with uncompressed pages? Also, it's finding some repeated lines; did we find a bug? $ man -Kaw RLIMIT_NOFILE /opt/local/man/expanded__/share/man/man3/errno.3 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man3/getdtablesize.3 /opt/local/man/expanded__/share/man/man3/mq_open.3 /opt/local/man/expanded__/share/man/man3/sysconf.3 /opt/local/man/expanded__/share/man/man2/fcntl.2 /opt/local/man/expanded__/share/man/man2/fcntl.2 /opt/local/man/expanded__/share/man/man2/open.2 /opt/local/man/expanded__/share/man/man2/open.2 /opt/local/man/expanded__/share/man/man2/open.2 /opt/local/man/expanded__/share/man/man2/poll.2 /opt/local/man/expanded__/share/man/man2/poll.2 /opt/local/man/expanded__/share/man/man2/seccomp_unotify.2 /opt/local/man/expanded__/share/man/man2/pidfd_getfd.2 /opt/local/man/expanded__/share/man/man2/dup.2 /opt/local/man/expanded__/share/man/man2/dup.2 /opt/local/man/expanded__/share/man/man2/dup.2 /opt/local/man/expanded__/share/man/man2/getrlimit.2 /opt/local/man/expanded__/share/man/man2/getrlimit.2 /opt/local/man/expanded__/share/man/man2/getrlimit.2 /opt/local/man/expanded__/share/man/man2/getrlimit.2 /opt/local/man/expanded__/share/man/man2/getrlimit.2 /opt/local/man/expanded__/share/man/man2/pidfd_open.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man2/select.2 /opt/local/man/expanded__/share/man/man5/proc.5 /opt/local/man/expanded__/share/man/man5/proc.5 /opt/local/man/expanded__/share/man/man7/capabilities.7 /opt/local/man/expanded__/share/man/man7/fanotify.7 /opt/local/man/expanded__/share/man/man7/unix.7 $ grep -n RLIMIT_NOFILE 
/opt/local/man/expanded__/share/man/man2/select.2 412:.B RLIMIT_NOFILE Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Accessibility of man pages 2023-04-08 21:53 ` Alejandro Colomar @ 2023-04-08 22:33 ` Alejandro Colomar 0 siblings, 0 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-08 22:33 UTC (permalink / raw) To: cjwatson Cc: linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff, Dirk Gouders, Eli Zaretskii, Ingo Schwarze [-- Attachment #1.1: Type: text/plain, Size: 2533 bytes --] On 4/8/23 23:53, Alejandro Colomar wrote: > Colin, did I do anything wrong to have this slowness in man(1) with > uncompressed pages? Also, it's finding some repeated lines; did we > find a bug? > > > $ man -Kaw RLIMIT_NOFILE > /opt/local/man/expanded__/share/man/man3/errno.3 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man3/getdtablesize.3 > /opt/local/man/expanded__/share/man/man3/mq_open.3 > /opt/local/man/expanded__/share/man/man3/sysconf.3 > /opt/local/man/expanded__/share/man/man2/fcntl.2 > /opt/local/man/expanded__/share/man/man2/fcntl.2 > /opt/local/man/expanded__/share/man/man2/open.2 > /opt/local/man/expanded__/share/man/man2/open.2 > /opt/local/man/expanded__/share/man/man2/open.2 > /opt/local/man/expanded__/share/man/man2/poll.2 > /opt/local/man/expanded__/share/man/man2/poll.2 > /opt/local/man/expanded__/share/man/man2/seccomp_unotify.2 > /opt/local/man/expanded__/share/man/man2/pidfd_getfd.2 > /opt/local/man/expanded__/share/man/man2/dup.2 > /opt/local/man/expanded__/share/man/man2/dup.2 > /opt/local/man/expanded__/share/man/man2/dup.2 > /opt/local/man/expanded__/share/man/man2/getrlimit.2 > /opt/local/man/expanded__/share/man/man2/getrlimit.2 > /opt/local/man/expanded__/share/man/man2/getrlimit.2 > /opt/local/man/expanded__/share/man/man2/getrlimit.2 > /opt/local/man/expanded__/share/man/man2/getrlimit.2 > 
/opt/local/man/expanded__/share/man/man2/pidfd_open.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man2/select.2 > /opt/local/man/expanded__/share/man/man5/proc.5 > /opt/local/man/expanded__/share/man/man5/proc.5 > /opt/local/man/expanded__/share/man/man7/capabilities.7 > /opt/local/man/expanded__/share/man/man7/fanotify.7 > /opt/local/man/expanded__/share/man/man7/unix.7 > > $ grep -n RLIMIT_NOFILE /opt/local/man/expanded__/share/man/man2/select.2 > 412:.B RLIMIT_NOFILE Ahh, it seems to be following symlinks as if they were actual pages. But for some reason this only happens for uncompressed pages, and not for .gz pages. Bug here :) > > > Cheers, > Alex > -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 ^ permalink raw reply [flat|nested] 73+ messages in thread
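One way to collapse the repeated hits shown above, whatever the cause inside man(1), is to resolve every hit to its real file and drop duplicates. The following is only a sketch under stated assumptions, not man-db's behaviour: the page names are invented, find(1) stands in for the hit list, and it relies on readlink -f, which GNU coreutils and most current systems provide but POSIX does not require.

```shell
#!/bin/sh
# Sketch only: the directory and the alias names are invented for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/man2"
printf '.B RLIMIT_NOFILE\n' > "$tmp/man2/select.2"
# Link pages, similar in spirit to an install with LINK_PAGES=symlink.
ln -s select.2 "$tmp/man2/alias-one.2"
ln -s select.2 "$tmp/man2/alias-two.2"

# Stand-in for a hit list that reports link pages as if they were real pages.
hits=$(find "$tmp" \( -type f -o -type l \) -name '*.2')

# Resolve each hit to the file it points to, then drop duplicates.
unique=$(printf '%s\n' "$hits" |
    while IFS= read -r f; do readlink -f "$f"; done | sort -u)

hit_count=$(printf '%s\n' "$hits" | wc -l | tr -d ' ')
unique_count=$(printf '%s\n' "$unique" | wc -l | tr -d ' ')
echo "hits: $hit_count, unique pages: $unique_count"
rm -rf "$tmp"
```

On real output, the same loop could presumably be fed from `man -Kaw RLIMIT_NOFILE` instead of find(1).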
* Re: Accessibility of man pages 2023-04-08 20:02 ` Eli Zaretskii 2023-04-08 20:46 ` Dirk Gouders @ 2023-04-09 10:28 ` Ralph Corderoy 1 sibling, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-09 10:28 UTC (permalink / raw) To: linux-man, groff Hi, (Colin, something for you near the end; search ‘interesting’.) Eli wrote: > Dirk wrote: > > $ find /usr/share/man -type f -exec bzgrep -l RLIMIT_NOFILE {} \; ... > > find... man -K... > > > > real 107.45 real 96.34 > > user 117.06 user 70.11 > > sys 14.43 sys 26.86 ... > > $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE ... > > real 24.30 > > user 32.34 > > sys 6.84 > > Multiprocessing, obviously. Your CPU has more than one execution > unit, so the pipe via xargs runs 'find' and 'bzgrep' in parallel on > two different execution units. By contrast, "find -exec" runs them > sequentially, in a single thread. No, I don't think it's that. With the first, find(1) does stop whilst waiting for bzgrep to grep a single file. bzgrep may or may not run on the same core. The important thing is the one bzgrep per file and its fork() and exec() overhead. The second has find fill a pipe's buffer with paths and when that's full, xargs's read can return. This continues until xargs either reads end-of-file or reaches the argv[] limits. It then runs a single bzgrep with many filenames. The fork+exec overhead is much reduced. bzgrep is a shell script and has overhead before it gets to the argument-processing loop. That overhead is suffered many times if bzgrep is run once per file. The *zgrep scripts are a poor option in general due to this one-grep-per-file overhead. Better than nothing, but a grep which can internally decompress all the different compression formats avoids this shell overhead. Here is an example. 260 files causes eight times as many clone(2)s, i.e. forks. I've added an extra ‘×...’ column. The ls and xargs will complete their work nearly instantly. 
All the wall-clock time is the single run of zgrep.

$ pwd
/usr/share/man/man7
$ ls *.gz | wc -l
260
$
$ ls *.gz | LC_ALL=C strace -fc xargs -rd\\n zgrep -H not-to-be-found
% time     seconds  usecs/call     calls         errors syscall
------ ----------- ----------- ---------         ------ ----------------
 93.70   27.510039        7555      3641  ×14      1560 wait4
  0.85    0.248763          26      9389                mmap
  0.68    0.198674          11     17166                rt_sigprocmask
  0.56    0.165702          35      4691                mprotect
  0.52    0.153146           6     22637                rt_sigaction
  0.50    0.146029          21      6780                read
  0.43    0.125451          10     12235           1040 close
  0.31    0.091715          29      3132                openat
  0.25    0.073542          11      6513            522 fcntl
  0.24    0.070822          12      5728           2080 stat
  0.23    0.068825          16      4171                fstat
  0.20    0.057703          24      2348                brk
  0.19    0.054838          69       786              4 execve
  0.18    0.052849          25      2081   ×8           clone
  0.17    0.051089          17      2862            782 access
  0.15    0.043284          55       782                munmap
  0.14    0.040012          48       819                write
  0.11    0.031992          11      2870            260 lseek
  0.11    0.031393           8      3902                dup2
  0.08    0.023038          22      1041                pipe
  0.07    0.021190          13      1560                rt_sigreturn
  0.06    0.018363          11      1564            782 arch_prctl
  0.05    0.016013           7      2081   ×8           getgid
  0.05    0.015314           7      2081   ×8           getegid
  0.05    0.015251           7      2081   ×8           getuid
  0.05    0.014780           7      2081   ×8           geteuid
  0.02    0.004703          18       260   ×1           sigaltstack
  0.01    0.003685          13       264   ×1           prlimit64
  0.01    0.003523           1      2084   ×8           getpid
  0.01    0.003443          13       260   ×1           set_tid_address
  0.01    0.003425          13       260   ×1           set_robust_list
  0.00    0.000094          23         4                getdents64
  0.00    0.000053          17         3              2 ioctl
  0.00    0.000033          16         2                poll
  0.00    0.000018          18         1                sysinfo
  0.00    0.000014          14         1                getppid
  0.00    0.000013          13         1                uname
  0.00    0.000013          13         1                getpgrp
------ ----------- ----------- ---------      --------- ----------------
100.00   29.358834                128163           7032 total

Compare with running sh(1) to run zcat and grep on each bunch of xargs's files.
$ ls *.gz | LC_ALL=C strace -fc sh -c 'xargs -rd\\n zcat | grep not-to-be-found'
% time     seconds  usecs/call     calls         errors syscall
------ ----------- ----------- ---------         ------ ----------------
 82.18    0.150049       37512         4              1 wait4
  5.92    0.010814          12       881                read
  2.25    0.004116          14       286   ×1           openat
  2.17    0.003966          13       299   ×1           write
  1.33    0.002432           8       301   ×1         2 close
  1.24    0.002263          30        74                mmap
  1.19    0.002166           7       285   ×1           fstat
  0.58    0.001060          17        62             16 stat
  0.52    0.000954          34        28                mprotect
  0.42    0.000766          12        63                rt_sigaction
  0.35    0.000642          27        23              6 access
  0.30    0.000543          24        22                rt_sigprocmask
  0.19    0.000346          17        20              1 lseek
  0.17    0.000310          62         5                munmap
  0.14    0.000250          13        18                getuid
  0.13    0.000242          18        13              4 ioctl
  0.13    0.000238          13        18                getegid
  0.13    0.000236          13        18                geteuid
  0.11    0.000204          11        18                brk
  0.11    0.000199          11        18                getgid
  0.06    0.000116          11        10              5 arch_prctl
  0.06    0.000106           9        11              1 fcntl
  0.05    0.000083          10         8                getpid
  0.04    0.000082          13         6                prlimit64
  0.04    0.000076          38         2                pipe
  0.03    0.000061          20         3                clone
  0.03    0.000047          23         2                getpgrp
  0.02    0.000044          22         2                sysinfo
  0.02    0.000033           3         9              4 execve
  0.02    0.000031          10         3                dup2
  0.02    0.000030          30         1                set_tid_address
  0.01    0.000023          11         2                uname
  0.01    0.000022          22         1                set_robust_list
  0.01    0.000014           7         2                poll
  0.01    0.000014          14         1                rt_sigreturn
  0.01    0.000010           5         2                getppid
  0.00    0.000007           1         4                getdents64
  0.00    0.000000           0         1                sigaltstack
------ ----------- ----------- ---------      --------- ----------------
100.00    0.182595                  2526             40 total
$

Fifty times fewer system calls. Especially expensive ones.

Now, here's the interesting bit. man here also forks once per file. Presumably so the child can decompress the file and write to a pipe with the existing search code in the parent reading from the other end without caring the file is compressed. Removing the pipe and fork could speed things up a bit. A function pointer would be one way.
$ LC_ALL=C strace -fc man -Ks7 not-to-be-found
% time     seconds  usecs/call     calls         errors syscall
------ ----------- ----------- ---------         ------ ----------------
 61.90    1.890948          14    132630                read
 32.05    0.979069        1882       520   ×2       260 wait4
  1.68    0.051233          98       520   ×2       260 seccomp
  0.90    0.027443          13      2096   ×8           close
  0.61    0.018720          18      1024                write
  0.59    0.017941          17      1040   ×4           rt_sigprocmask
  0.53    0.016137          27       585             37 stat
  0.49    0.015120          27       551             15 openat
  0.26    0.007975          30       260   ×1           pipe
  0.15    0.004705          18       260   ×1           clone
  0.14    0.004345          16       260   ×1           rt_sigreturn
  0.14    0.004254          16       263   ×1           ioctl
  0.13    0.003824          14       262   ×1           getpid
  0.10    0.003067           1      1573                rt_sigaction
  0.08    0.002581           9       273                fstat
  0.07    0.002052           3       520   ×2           prctl
  0.07    0.002040           7       263   ×1           lseek
  0.06    0.001959           7       260   ×1           dup
  0.04    0.001115           2       520   ×2           dup2
  0.01    0.000204          17        12                brk
  0.00    0.000000           0       145                lstat
  0.00    0.000000           0        31                mmap
  0.00    0.000000           0        12                mprotect
  0.00    0.000000           0         1                munmap
  0.00    0.000000           0         1              1 access
  0.00    0.000000           0         1                execve
  0.00    0.000000           0         3                fcntl
  0.00    0.000000           0         6                readlink
  0.00    0.000000           0         1                umask
  0.00    0.000000           0         1                sysinfo
  0.00    0.000000           0         1                getuid
  0.00    0.000000           0         1                getgid
  0.00    0.000000           0         1                geteuid
  0.00    0.000000           0         1                getegid
  0.00    0.000000           0         3                fstatfs
  0.00    0.000000           0         2              1 arch_prctl
  0.00    0.000000           0         8                getdents64
------ ----------- ----------- ---------      --------- ----------------
100.00    3.054732                143911            574 total
$

-- 
Cheers, Ralph.

^ permalink raw reply [flat|nested] 73+ messages in thread
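The launch-count part of the argument above can be made visible without strace(1). The sketch below counts how often zgrep is started when find(1) runs it once per file (`\;`) versus once per batch (`+`, comparable to piping into xargs). Everything in it is an assumption for illustration: the scratch directory, the file names, the wrapper script counted-zgrep.sh and the COUNT_FILE variable are invented, and gzip's zgrep is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: directory layout, wrapper name and COUNT_FILE are invented.
work=$(mktemp -d)
mkdir "$work/pages"
i=1
while [ "$i" -le 20 ]; do
    printf 'filler text %s\n' "$i" | gzip > "$work/pages/filler$i.7.gz"
    i=$((i + 1))
done
printf '.B RLIMIT_NOFILE\n' | gzip > "$work/pages/hit.2.gz"

# Wrapper that records each launch before handing over to zgrep.
cat > "$work/counted-zgrep.sh" <<'EOF'
echo launch >> "$COUNT_FILE"
exec zgrep "$@"
EOF
COUNT_FILE="$work/launches"
export COUNT_FILE

# Variant 1: "-exec ... \;" starts the wrapper once per file.
: > "$COUNT_FILE"
per_file_hits=$(find "$work/pages" -type f \
    -exec sh "$work/counted-zgrep.sh" -l RLIMIT_NOFILE {} \; | sort)
per_file_launches=$(wc -l < "$COUNT_FILE" | tr -d ' ')

# Variant 2: "-exec ... +" passes as many files as fit into one launch.
: > "$COUNT_FILE"
batched_hits=$(find "$work/pages" -type f \
    -exec sh "$work/counted-zgrep.sh" -l RLIMIT_NOFILE {} + | sort)
batched_launches=$(wc -l < "$COUNT_FILE" | tr -d ' ')

echo "per file: $per_file_launches launches, batched: $batched_launches"
rm -rf "$work"
```

Both variants should report the same matching file; only the number of launches differs. As noted above, a batched zgrep still forks internally for every file, so batching removes only the per-launch startup cost of the script, not the per-file forks.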
* Re: Accessibility of man pages 2023-04-08 19:48 ` Accessibility of man pages Dirk Gouders 2023-04-08 20:02 ` Eli Zaretskii @ 2023-04-08 20:31 ` Ingo Schwarze 2023-04-08 20:59 ` Dirk Gouders 1 sibling, 1 reply; 73+ messages in thread From: Ingo Schwarze @ 2023-04-08 20:31 UTC (permalink / raw) To: Dirk Gouders Cc: Alejandro Colomar, Colin Watson, Eli Zaretskii, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Hi Dirk, Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200: > Yes, it's very slow but close to `man -K`: > > find... man -K... > > real 107.45 real 96.34 > user 117.06 user 70.11 > sys 14.43 sys 26.86 > > [a thought later] > > Oh, I found something much faster: > > $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE > [snip] > > real 24.30 > user 32.34 > sys 6.84 > > Hmm, perhaps, someone has an explanation for this? These are all terribly slow IMHO. For comparison, this happens on my OpenBSD notebook, with more than five hundred optional software packages installed in addition to the complete default installation: $ time man -k any=RLIMIT_NOFILE dup, dup2, dup3(2) - duplicate an existing file descriptor getrlimit, setrlimit(2) - control maximum system resource consumption sudoers(5) - default sudo security policy plugin 0m00.21s real 0m00.00s user 0m00.03s system $ time man -k 'any=rlimit' ps(1) - display process status brk, sbrk(2) - change data segment size dup, dup2, dup3(2) - duplicate an existing file descriptor execve(2) - execute a file fork(2) - create a new process getdtablecount(2) - get descriptor table count getrlimit, setrlimit(2) - control maximum system resource consumption mlock, munlock(2) - lock (unlock) physical pages in memory mlockall, munlockall(2) - lock (unlock) the address space of a process pledge(2) - restrict system operations poll, ppoll(2) - synchronous I/O multiplexing quotactl(2) - manipulate filesystem quotas sigaction(2) - software signal facilities getdtablesize(3) - get 
descriptor table size login_cap, login_getclass, login_close, login_getcapbool, login_getcapnum, login_getcapsize, login_getcapstr, login_getcaptime, login_getstyle, setclasscontext, setusercontext(3) - query login.conf database about a user class signal, bsd_signal(3) - simplified software signal facilities sigvec(3) - software signal facilities core(5) - memory image file format login.conf(5) - login class capability database sudoers(5) - default sudo security policy plugin fork1(9) - create a new process mi_switch, cpu_switchto(9) - switch to another process context 0m00.05s real 0m00.01s user 0m00.00s system $ time man -k any=RLIMIT_NOFILE dup, dup2, dup3(2) - duplicate an existing file descriptor getrlimit, setrlimit(2) - control maximum system resource consumption sudoers(5) - default sudo security policy plugin 0m00.01s real 0m00.01s user 0m00.01s system The effect that the time goes down from 210 milliseconds to 10 milliseconds when doing the search a second time is due to the fact that the kernel now has the required information in the buffer cache and no longer needs to read from the rotating disk. The machine in question has i5 2.3 GHz processors and 8 GB of RAM, so it's hardly a high-end machine. Yours, Ingo ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Accessibility of man pages 2023-04-08 20:31 ` Ingo Schwarze @ 2023-04-08 20:59 ` Dirk Gouders 2023-04-08 22:39 ` Ingo Schwarze 0 siblings, 1 reply; 73+ messages in thread From: Dirk Gouders @ 2023-04-08 20:59 UTC (permalink / raw) To: Ingo Schwarze Cc: Alejandro Colomar, Colin Watson, Eli Zaretskii, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Hi Ingo, Ingo Schwarze <schwarze@usta.de> writes: > Hi Dirk, > > Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200: > >> Yes, it's very slow but close to `man -K`: >> >> find... man -K... >> >> real 107.45 real 96.34 >> user 117.06 user 70.11 >> sys 14.43 sys 26.86 >> >> [a thought later] >> >> Oh, I found something much faster: >> >> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE >> [snip] >> >> real 24.30 >> user 32.34 >> sys 6.84 >> >> Hmm, perhaps, someone has an explanation for this? > > These are all terribly slow IMHO. > > For comparison, this happens on my OpenBSD notebook, with more than > five hundred optional software packages installed in addition to the > complete default installation: > > $ time man -k any=RLIMIT_NOFILE > dup, dup2, dup3(2) - duplicate an existing file descriptor > getrlimit, setrlimit(2) - control maximum system resource consumption > sudoers(5) - default sudo security policy plugin > 0m00.21s real 0m00.00s user 0m00.03s system Yes, this is really fast and would allow for quite interesting ways to work with manual pages. But, OpenBSD's `man -k` operates on a makewhatis(8) database and not on every single manual page or am I wrong? Regards, Dirk ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Accessibility of man pages 2023-04-08 20:59 ` Dirk Gouders @ 2023-04-08 22:39 ` Ingo Schwarze 2023-04-09 9:50 ` Dirk Gouders 0 siblings, 1 reply; 73+ messages in thread From: Ingo Schwarze @ 2023-04-08 22:39 UTC (permalink / raw) To: Dirk Gouders Cc: Alejandro Colomar, Colin Watson, Eli Zaretskii, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Hi Dirk, Dirk Gouders wrote on Sat, Apr 08, 2023 at 10:59:32PM +0200: > Ingo Schwarze <schwarze@usta.de> writes: >> Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200: >>> Yes, it's very slow but close to `man -K`: >>> >>> find... man -K... >>> >>> real 107.45 real 96.34 >>> user 117.06 user 70.11 >>> sys 14.43 sys 26.86 >>> >>> [a thought later] >>> >>> Oh, I found something much faster: >>> >>> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE >>> [snip] >>> >>> real 24.30 >>> user 32.34 >>> sys 6.84 >>> >>> Hmm, perhaps, someone has an explanation for this? >> These are all terribly slow IMHO. >> >> For comparison, this happens on my OpenBSD notebook, with more than >> five hundred optional software packages installed in addition to the >> complete default installation: >> >> $ time man -k any=RLIMIT_NOFILE >> dup, dup2, dup3(2) - duplicate an existing file descriptor >> getrlimit, setrlimit(2) - control maximum system resource consumption >> sudoers(5) - default sudo security policy plugin >> 0m00.21s real 0m00.00s user 0m00.03s system > Yes, this is really fast and would allow for quite interesting ways to > work with manual pages. > > But, OpenBSD's `man -k` operates on a makewhatis(8) database and not > on every single manual page or am I wrong? Yes, you are completely correct about that. 
The database format is documented here: https://man.openbsd.org/mandoc.db.5 And the search syntax here: https://man.openbsd.org/apropos.1 The concept works very well because in contrast to man(7), mdoc(7) provides substantial semantic markup (without being harder to write or maintain). The comparison seemed relevant to me because as far as I understood the intention of the thread, participants were looking for ideas to make searching for content in manual pages more powerful and more efficient. The combination of semantic markup and indexing of marked up content is one way to make progress in that direction, and the combination of mdoc(7) with mandoc(1) is an example of a system demonstrating the concept. I understand people familiar with GNU info(1) pointed out that providing index entries that do not correspond to marked up content is also occasionally useful. I do not completely disagree with that, and the mdoc(7) language as implemented by mandoc(1) provides a dedicated macro to do just that: https://man.openbsd.org/mdoc.7#Tg Then again, practical experience shows that manual tagging is needed only in extremely rare cases and completely automatic tagging produces completely satisfactory index entries for the vast majority of cases. Yours, Ingo ^ permalink raw reply [flat|nested] 73+ messages in thread
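The idea of indexing marked-up content once and querying the index afterwards can be illustrated with a toy example. This is only a sketch and not the mandoc.db format linked above: the mdoc(7) fragments are invented, the index is a plain tab-separated file, only line-initial .Dv and .Er requests are indexed, and the lookup helper is hypothetical.

```shell
#!/bin/sh
# Sketch only: invented mdoc(7) fragments and a plain-text index, not mandoc.db.
src=$(mktemp -d)
printf '.Sh DESCRIPTION\n.Dv RLIMIT_NOFILE\nlimits open files.\n' > "$src/getrlimit.2"
printf '.Sh ERRORS\n.Er EMFILE\n' > "$src/dup.2"
printf '.Sh DESCRIPTION\n.Dv RLIMIT_NOFILE\n.Er EMFILE\n' > "$src/sudoers.5"

# Index step (done once): one "macro<TAB>first argument<TAB>page" line for
# each line-initial .Dv or .Er request.
index="$src/index.tsv"
for page in "$src"/*.[0-9]; do
    awk -v page="${page##*/}" '
        $1 == ".Dv" || $1 == ".Er" {
            printf "%s\t%s\t%s\n", substr($1, 2), $2, page
        }' "$page"
done > "$index"

# Query step (done often): consult the index only, never the pages.
lookup() {
    awk -F '\t' -v key="$1" -v val="$2" \
        '$1 == key && $2 == val { print $3 }' "$index" | sort -u
}

dv_hits=$(lookup Dv RLIMIT_NOFILE)
er_hits=$(lookup Er EMFILE)
printf 'Dv=RLIMIT_NOFILE:\n%s\n' "$dv_hits"
printf 'Er=EMFILE:\n%s\n' "$er_hits"
rm -rf "$src"
```

The point of the design is that the query cost depends on the size of the index rather than on the number and compression of the pages, which is consistent with the timings reported for the indexed search.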
* Re: Accessibility of man pages 2023-04-08 22:39 ` Ingo Schwarze @ 2023-04-09 9:50 ` Dirk Gouders 2023-04-09 10:35 ` Dirk Gouders 0 siblings, 1 reply; 73+ messages in thread From: Dirk Gouders @ 2023-04-09 9:50 UTC (permalink / raw) To: Ingo Schwarze Cc: Alejandro Colomar, Colin Watson, Eli Zaretskii, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Hi Ingo, Ingo Schwarze <schwarze@usta.de> writes: > Dirk Gouders wrote on Sat, Apr 08, 2023 at 10:59:32PM +0200: >> Ingo Schwarze <schwarze@usta.de> writes: >>> Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200: > >>>> Yes, it's very slow but close to `man -K`: >>>> >>>> find... man -K... >>>> >>>> real 107.45 real 96.34 >>>> user 117.06 user 70.11 >>>> sys 14.43 sys 26.86 >>>> >>>> [a thought later] >>>> >>>> Oh, I found something much faster: >>>> >>>> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE >>>> [snip] >>>> >>>> real 24.30 >>>> user 32.34 >>>> sys 6.84 >>>> >>>> Hmm, perhaps, someone has an explanation for this? > >>> These are all terribly slow IMHO. >>> >>> For comparison, this happens on my OpenBSD notebook, with more than >>> five hundred optional software packages installed in addition to the >>> complete default installation: >>> >>> $ time man -k any=RLIMIT_NOFILE >>> dup, dup2, dup3(2) - duplicate an existing file descriptor >>> getrlimit, setrlimit(2) - control maximum system resource consumption >>> sudoers(5) - default sudo security policy plugin >>> 0m00.21s real 0m00.00s user 0m00.03s system > >> Yes, this is really fast and would allow for quite interesting ways to >> work with manual pages. >> >> But, OpenBSD's `man -k` operates on a makewhatis(8) database and not >> on every single manual page or am I wrong? > > Yes, you are completely correct about that. 
> The database format is documented here: > > https://man.openbsd.org/mandoc.db.5 > > And the search syntax here: > > https://man.openbsd.org/apropos.1 > > The concept works very well because in contrast to man(7), mdoc(7) > provides substatial semantic markup (without being harder to write > or maintain). > > The comparison seemed relevant to me because as far as i understood the > intention of the thread, participants were looking for ideas to make > searching for content in manual pages more powerful and more efficient. > The combination of semantic markup and indexing of marked up content > is one way to make progress in that direction, and the combination > of mdoc(7) with mandoc(1) is an example of a system demonstrating > the concept. Very interesting. I gues that makewhatis(8) then has to cope both formats (man(7) and mdoc(7)) and from between the lines I read that it is not really a problem. Are there any outstanding queries mdoc(7) enables that man(7) cannot? From what I read so far with mdoc(7) it should be very easy (by querying .Xr), for example to get an answer to the question "Which manual pages are referencing me?" (From inside a pager, for example). > I understand people familiar with GNU info(1) pointed out that > providing index entries that do not correspond to marked up > content is also occasionally useful. I do not completely disagree > with that, and the mdoc(7) language as implemented by mandoc(1) > provides a dedicated macro to do just that: > > https://man.openbsd.org/mdoc.7#Tg My role in this thread is not an experts one but the one of a naive guy who plays with an experimental pager (lsp(1)) that tries to offer some additional features for handling manual pages. I read that with .Tg tags are passed to the PAGER and with less(1) one could use :t to navigate to them. I tried to see how this works and wonder how the user knows which tags are available -- maybe man-db's man(1) doesn't support this... 
If your time allows and it's not too off-topic, perhaps you could provide more detail, e.g. if I can make use of the .Tg tags on a non-OpenBSD system. Regards, Dirk ^ permalink raw reply [flat|nested] 73+ messages in thread
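The question "Which manual pages are referencing me?" can be approximated directly on mdoc(7) sources, without any database. The sketch below is an illustration under assumptions: the three page fragments are invented, the referrers helper is hypothetical, and only line-initial `.Xr NAME SECTION` requests are matched, so an Xr called from within another macro line would be missed. mandoc's apropos(1) appears to document an Xr search key for the indexed equivalent, but that is not verified here.

```shell
#!/bin/sh
# Sketch only: the three mdoc(7) fragments below are invented test data.
src=$(mktemp -d)
printf '.Sh SEE ALSO\n.Xr getrlimit 2 ,\n.Xr open 2\n' > "$src/dup.2"
printf '.Sh SEE ALSO\n.Xr sysctl 2\n' > "$src/getrlimit.2"
printf '.Sh SEE ALSO\n.Xr dup 2 ,\n.Xr getrlimit 2\n' > "$src/getdtablesize.3"

# List the pages that contain a line-initial ".Xr NAME SECTION" request.
# NAME is used as a regular expression fragment, so names containing
# metacharacters would need escaping first.
referrers() {
    name=$1
    section=$2
    grep -lE "^\.Xr[[:space:]]+$name[[:space:]]+$section([[:space:]]|\$)" "$src"/* |
        sed 's|.*/||' | sort
}

refs=$(referrers getrlimit 2)
printf '%s\n' "$refs"
rm -rf "$src"
```

A pager could run such a query for the page it currently displays and offer the result as a list of back-references.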
* Re: Accessibility of man pages 2023-04-09 9:50 ` Dirk Gouders @ 2023-04-09 10:35 ` Dirk Gouders 0 siblings, 0 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-09 10:35 UTC (permalink / raw) To: Ingo Schwarze Cc: Alejandro Colomar, Colin Watson, Eli Zaretskii, linux-man, help-texinfo, nabijaczleweli, g.branden.robinson, groff Dirk Gouders <dirk@gouders.net> writes: > Hi Ingo, > > Ingo Schwarze <schwarze@usta.de> writes: >> Dirk Gouders wrote on Sat, Apr 08, 2023 at 10:59:32PM +0200: >>> Ingo Schwarze <schwarze@usta.de> writes: >>>> Dirk Gouders wrote on Sat, Apr 08, 2023 at 09:48:13PM +0200: >> >>>>> Yes, it's very slow but close to `man -K`: >>>>> >>>>> find... man -K... >>>>> >>>>> real 107.45 real 96.34 >>>>> user 117.06 user 70.11 >>>>> sys 14.43 sys 26.86 >>>>> >>>>> [a thought later] >>>>> >>>>> Oh, I found something much faster: >>>>> >>>>> $ time -p find /usr/share/man -type f | xargs bzgrep -l RLIMIT_NOFILE >>>>> [snip] >>>>> >>>>> real 24.30 >>>>> user 32.34 >>>>> sys 6.84 >>>>> >>>>> Hmm, perhaps, someone has an explanation for this? >> >>>> These are all terribly slow IMHO. >>>> >>>> For comparison, this happens on my OpenBSD notebook, with more than >>>> five hundred optional software packages installed in addition to the >>>> complete default installation: >>>> >>>> $ time man -k any=RLIMIT_NOFILE >>>> dup, dup2, dup3(2) - duplicate an existing file descriptor >>>> getrlimit, setrlimit(2) - control maximum system resource consumption >>>> sudoers(5) - default sudo security policy plugin >>>> 0m00.21s real 0m00.00s user 0m00.03s system >> >>> Yes, this is really fast and would allow for quite interesting ways to >>> work with manual pages. >>> >>> But, OpenBSD's `man -k` operates on a makewhatis(8) database and not >>> on every single manual page or am I wrong? >> >> Yes, you are completely correct about that. 
>> The database format is documented here: >> >> https://man.openbsd.org/mandoc.db.5 >> >> And the search syntax here: >> >> https://man.openbsd.org/apropos.1 >> >> The concept works very well because in contrast to man(7), mdoc(7) >> provides substatial semantic markup (without being harder to write >> or maintain). >> >> The comparison seemed relevant to me because as far as i understood the >> intention of the thread, participants were looking for ideas to make >> searching for content in manual pages more powerful and more efficient. >> The combination of semantic markup and indexing of marked up content >> is one way to make progress in that direction, and the combination >> of mdoc(7) with mandoc(1) is an example of a system demonstrating >> the concept. > > Very interesting. I gues that makewhatis(8) then has to cope both > formats (man(7) and mdoc(7)) and from between the lines I read that it > is not really a problem. > > Are there any outstanding queries mdoc(7) enables that man(7) cannot? > From what I read so far with mdoc(7) it should be very easy (by querying > .Xr), for example to get an answer to the question "Which manual pages > are referencing me?" (From inside a pager, for example). > >> I understand people familiar with GNU info(1) pointed out that >> providing index entries that do not correspond to marked up >> content is also occasionally useful. I do not completely disagree >> with that, and the mdoc(7) language as implemented by mandoc(1) >> provides a dedicated macro to do just that: >> >> https://man.openbsd.org/mdoc.7#Tg > > My role in this thread is not an experts one but the one of a naive guy > who plays with an experimental pager (lsp(1)) that tries to offer some > additional features for handling manual pages. > > I read that with .Tg tags are passed to the PAGER and with less(1) one > could use :t to navigate to them. 
I tried to see how this works and
> wonder how the user knows which tags are available -- maybe man-db's
> man(1) doesn't support this...
>
> If your time allows and it's not too off-topic, perhaps you could
> provide more detail, e.g. if I can make use of the .Tg tags on a
> non-OpenBSD system.

Hmm, I have since learned that I have all those commands available with an 'm' prefix, i.e. mapropos, mman, mmakewhatis...

So, I built the makewhatis databases:

# find / -name mandoc.db -ls
659416 4 -rw-r--r-- 1 root root 3984 Apr 9 12:25 /usr/lib/rust/1.66.1/share/man/mandoc.db
659419 8 -rw-r--r-- 1 root root 4456 Apr 9 12:25 /usr/lib/llvm/15/share/man/mandoc.db
954004 1812 -rw-r--r-- 1 root root 1848712 Apr 9 12:25 /usr/share/man/mandoc.db
954003 4 -rw-r--r-- 1 root root 1864 Apr 9 12:24 /usr/share/binutils-data/x86_64-pc-linux-gnu/2.39/man/mandoc.db
787032 4 -rw-r--r-- 1 root root 1164 Apr 9 12:24 /usr/share/gcc-data/x86_64-pc-linux-gnu/12/man/mandoc.db
732 12 -rw-r--r-- 1 root root 8444 Apr 9 12:24 /usr/lib64/icedtea8/man/mandoc.db

But your example query gives no matches:

$ mman -k any=RLIMIT_NOFILE
mman: nothing appropriate

It's very fast, though:

real 0.00
user 0.00
sys 0.00

;-)

Dirk

^ permalink raw reply [flat|nested] 73+ messages in thread
[parent not found: <87a5zhwntt.fsf@ada>]
* Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) [not found] ` <87a5zhwntt.fsf@ada> @ 2023-04-09 12:05 ` Alejandro Colomar 2023-04-09 12:17 ` Alejandro Colomar ` (2 more replies) 0 siblings, 3 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-09 12:05 UTC (permalink / raw) To: Alexis, groff, linux-man Cc: Ingo Schwarze, Dirk Gouders, Colin Watson, Sam James, Ralph Corderoy [-- Attachment #1.1: Type: text/plain, Size: 5185 bytes --] [Added back linux-man@, and people that commented on this (sub)topic] [Added Sam, I've got a question for you] Hi Alexis, Please keep (at least) linux-man@ in the loop. On 4/9/23 08:44, Alexis wrote: > > As a related data point, i'd like to mention Gentoo's position on > this, i.e. that man pages will continue to be bzip2-compressed by > default: > > "app-text/mandoc bzip2 support" > https://bugs.gentoo.org/854267 > > "Remove /usr/share/man from default inclusion list for docompress" > https://bugs.gentoo.org/836367 As Ingo said[1] 3 years ago, I don't think in this year it makes any sense to compress pages anymore. However, since it's simple for me to add support for that, and it can be interesting for testing purposes, I added support for installing the Linux man-pages compressed with bzip2 using the Makefile[2]. While I was at it, I also added support for generating .tar.bz2 release tarballs[3]. 
With this, I was able to test a bit more than what I did yesterday: $ sudo rm -rf /opt/local/man/ $ sudo make install-man prefix=/opt/local/man/gz_ -j LINK_PAGES=symlink Z=.gz | wc -l 2570 $ sudo make install-man prefix=/opt/local/man/bz2 -j LINK_PAGES=symlink Z=.bz2 | wc -l 2570 $ sudo make install-man prefix=/opt/local/man/man -j LINK_PAGES=symlink Z= | wc -l 2570 $ du -sh /opt/local/man/* 5.4M /opt/local/man/bz2 5.5M /opt/local/man/gz_ 9.4M /opt/local/man/man $ export MANPATH=/opt/local/man/gz_/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" 37 0.31 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs zgrep -l RLIMIT_NOFILE | wc -l" 17 1.56 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 zgrep -l RLIMIT_NOFILE | wc -l" 17 1.56 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do zcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" 17 1.24 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" 17 1.14 $ export MANPATH=/opt/local/man/bz2/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" 37 10.90 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs bzgrep -l RLIMIT_NOFILE | wc -l" 17 1.33 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 bzgrep -l RLIMIT_NOFILE | wc -l" 17 1.31 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" 17 1.21 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" 17 1.22 $ export MANPATH=/opt/local/man/man/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" 37 0.56 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs grep -l RLIMIT_NOFILE | wc -l" 17 0.01 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 grep -l RLIMIT_NOFILE | wc -l" 17 
0.01 Weird thing: today, the symlink bug in man(1) was reproducible in all kinds of pages, while yesterday it only reproduced in uncompressed ones. Another weird thing: times today changed considerably for the find(1) pipelines (half of yesterday's). It's not a thing of using dash(1), because I get similar times with bash(1) and its builtin time(1). Important note: Sam, are you sure you want your pages compressed with bz2? Have you seen the 10 seconds it takes man-db's man(1) to find a word in the pages? I suggest that at least you try to reproduce these tests in your machine, and see if it's just me or man-db's man(1) is pretty bad at non-gz pages. Test results: - man-db's man(1) is slower with plain man(7) source than with .gz pages for some mysterious reason. - man-db's man(1) is turtle slow with .bz2 pages. - xargs -P0 doesn't affect the times significantly. As Ralph said, this is probably because the main issue with find(1) was having the bottleneck in clone/fork+exec, and xargs(1) already solves that. Expanding the pipeline to use zcat(1) instead of zgrep(1) improves a little bit more, because the zgrep(1) script is probably quite inefficient, while zcat(1) is just a simple wrapper around gzip(1). We see that zgrep(1) is more inefficient than running ourselves a few programs per file in a pipeline! Calling gzip(1) directly is even faster, since we avoid invoking a shell for such a small script. Expanding the bzgrep(1) pipeline into one using bzcat(1) has similar improvements. However, since bzcat(1) is a binary, we don't get further improvement from calling bzip2(1) directly. Cheers, Alex > > > Alexis. 
> [1]: <https://marc.info/?l=mandoc-discuss&m=160668087317110&w=2> [2]: <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=6a828d5b6879ef19c3f59034fe1d0850d25d0056> [3]: <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e5b23b9c5b318d69ee78af0906e3bf0c665f9ae5> -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
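The per-file decompress-and-grep loops timed in this message could be wrapped into one small helper that picks the decompressor from the file suffix. This is only a sketch under assumptions: decompress_page and search_pages are hypothetical names, not part of man-db or the man-pages Makefile, and the suffix handling covers just the three layouts tested above (.gz, .bz2, uncompressed).

```shell
#!/bin/sh
# Hypothetical helpers; not part of man-db or the man-pages build system.

# Write the uncompressed contents of one page to stdout,
# choosing the decompressor from the file suffix.
decompress_page() {
    case $1 in
        *.gz)  gzip -dc <"$1" ;;
        *.bz2) bzip2 -dc <"$1" ;;
        *)     cat <"$1" ;;
    esac
}

# List every regular file under directory $1 whose uncompressed
# text contains the fixed string $2.
search_pages() {
    find "$1" -type f | while IFS= read -r f; do
        if decompress_page "$f" | grep -F -- "$2" >/dev/null; then
            printf '%s\n' "$f"
        fi
    done
}
```

Usage would look like `search_pages "$MANPATH" RLIMIT_NOFILE | wc -l`, which can be timed with /bin/time in the same way as the pipelines above. It still starts a couple of processes per page, so it should not be expected to beat the plain `xargs grep -l` run on uncompressed pages.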
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-09 12:05 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Alejandro Colomar @ 2023-04-09 12:17 ` Alejandro Colomar 2023-04-09 18:55 ` G. Branden Robinson 2023-04-09 12:29 ` Colin Watson 2023-04-12 8:13 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Sam James 2 siblings, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-09 12:17 UTC (permalink / raw) To: groff, linux-man Cc: Ingo Schwarze, Dirk Gouders, Colin Watson, Sam James, Ralph Corderoy, Alexis [-- Attachment #1.1: Type: text/plain, Size: 5931 bytes --] On 4/9/23 14:05, Alejandro Colomar wrote: > [Added back linux-man@, and people that commented on this (sub)topic] > [Added Sam, I've got a question for you] > > Hi Alexis, > > Please keep (at least) linux-man@ in the loop. > > On 4/9/23 08:44, Alexis wrote: >> >> As a related data point, i'd like to mention Gentoo's position on >> this, i.e. that man pages will continue to be bzip2-compressed by >> default: >> >> "app-text/mandoc bzip2 support" >> https://bugs.gentoo.org/854267 >> >> "Remove /usr/share/man from default inclusion list for docompress" >> https://bugs.gentoo.org/836367 > > As Ingo said[1] 3 years ago, I don't think in this year it makes any > sense to compress pages anymore. However, since it's simple for me > to add support for that, and it can be interesting for testing > purposes, I added support for installing the Linux man-pages > compressed with bzip2 using the Makefile[2]. While I was at it, I > also added support for generating .tar.bz2 release tarballs[3]. 
> > With this, I was able to test a bit more than what I did yesterday: > > > $ sudo rm -rf /opt/local/man/ > $ sudo make install-man prefix=/opt/local/man/gz_ -j LINK_PAGES=symlink Z=.gz | wc -l > 2570 > $ sudo make install-man prefix=/opt/local/man/bz2 -j LINK_PAGES=symlink Z=.bz2 | wc -l > 2570 > $ sudo make install-man prefix=/opt/local/man/man -j LINK_PAGES=symlink Z= | wc -l > 2570 > $ du -sh /opt/local/man/* > 5.4M /opt/local/man/bz2 > 5.5M /opt/local/man/gz_ > 9.4M /opt/local/man/man > > > $ export MANPATH=/opt/local/man/gz_/share/man > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > 37 > 0.31 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs zgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.56 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 zgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.56 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do zcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.24 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.14 > > > $ export MANPATH=/opt/local/man/bz2/share/man > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > 37 > 10.90 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs bzgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.33 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 bzgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.31 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.21 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.22 > > > $ export MANPATH=/opt/local/man/man/share/man > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > 37 > 0.56 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs grep -l 
RLIMIT_NOFILE | wc -l" > 17 > 0.01 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 grep -l RLIMIT_NOFILE | wc -l" > 17 > 0.01 > > Weird thing: today, the symlink bug in man(1) was reproducible in > all kinds of pages, while yesterday it only reproduced in > uncompressed ones. > > Another weird thing: times today changed considerably for the > find(1) pipelines (half of yesterday's). It's not a thing of > using dash(1), because I get similar times with bash(1) and its > builtin time(1). > > Important note: Sam, are you sure you want your pages compressed > with bz2? Have you seen the 10 seconds it takes man-db's man(1) to > find a word in the pages? I suggest that at least you try to > reproduce these tests in your machine, and see if it's just me or > man-db's man(1) is pretty bad at non-gz pages. > > Test results: > > - man-db's man(1) is slower with plain man(7) source than with .gz > pages for some mysterious reason. > > - man-db's man(1) is turtle slow with .bz2 pages. > > - xargs -P0 doesn't affect the times significantly. As Ralph said, this is > probably because the main issue with find(1) was having the > bottleneck in clone/fork+exec, and xargs(1) already solves that. > > Expanding the pipeline to use zcat(1) instead of zgrep(1) > improves a little bit more, because the zgrep(1) script is > probably quite inefficient, while zcat(1) is just a simple > wrapper around gzip(1). We see that zgrep(1) is more > inefficient than running ourselves a few programs per file in a > pipeline! > > Calling gzip(1) directly is even faster, since we avoid invoking > a shell for such a small script. > > Expanding the bzgrep(1) pipeline into one using bzcat(1) has > similar improvements. However, since bzcat(1) is a binary, we > don't get further improvement from calling bzip2(1) directly. And I forgot the obvious one: - Using plain man(7) source is blazingly fast. So much that I don't miss mdoc(7)'s indexability so much. 
However, I must admit that I do miss mdoc(7)'s power sometimes. The man_lsfunc() and man_lsvar() functions for finding function prototypes and variable declarations in man(7) source would be much simpler using mdoc(7), and I could even use mandoc(1) to find such things. > > > Cheers, > Alex > >> >> >> Alexis. >> > > > [1]: <https://marc.info/?l=mandoc-discuss&m=160668087317110&w=2> > > [2]: <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=6a828d5b6879ef19c3f59034fe1d0850d25d0056> > > [3]: <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e5b23b9c5b318d69ee78af0906e3bf0c665f9ae5> > -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-09 12:17 ` Alejandro Colomar @ 2023-04-09 18:55 ` G. Branden Robinson 0 siblings, 0 replies; 73+ messages in thread From: G. Branden Robinson @ 2023-04-09 18:55 UTC (permalink / raw) To: Alejandro Colomar; +Cc: groff, linux-man, Dirk Gouders, Sam James, Alexis [-- Attachment #1: Type: text/plain, Size: 1398 bytes --] [dropping some people I recognize from the groff list from CC] At 2023-04-09T14:17:57+0200, Alejandro Colomar wrote: > - Using plain man(7) source is blazingly fast. So much that I > don't miss mdoc(7)'s indexability so much. > > However, I must admit that I do miss mdoc(7)'s power sometimes. > The man_lsfunc() and man_lsvar() functions for finding function > prototypes and variable declarations in man(7) source would be > much simpler using mdoc(7), and I could even use mandoc(1) to > find such things. I must point out that I have sketched a solution to the problem of semantic tagging in man(7). https://lists.gnu.org/archive/html/groff/2022-12/msg00075.html ...though perhaps I should add some detail to that sketch. My ideas are firming up, so I may mail a proposal to groff@ and linux-man@ in the near future. I'm happy to report that all the man(7) extension macros I have in mind, except for `Q` for quotation[1], will be trivially ignorable; i.e., an implementation (like mandoc(1)) that doesn't recognize them can ignore them (treating them as comment lines) without doing damage to the rendered text of a page. Regards, Branden [1] https://lists.gnu.org/archive/html/groff/2022-12/msg00078.html ...and even that admits a one-line fallback definition. I suspect you could even get away with defining it as a string. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-09 12:05 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Alejandro Colomar 2023-04-09 12:17 ` Alejandro Colomar @ 2023-04-09 12:29 ` Colin Watson 2023-04-09 13:36 ` Alejandro Colomar 2023-04-12 8:13 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Sam James 2 siblings, 1 reply; 73+ messages in thread From: Colin Watson @ 2023-04-09 12:29 UTC (permalink / raw) To: Alejandro Colomar Cc: Alexis, groff, linux-man, Ingo Schwarze, Dirk Gouders, Sam James, Ralph Corderoy On Sun, Apr 09, 2023 at 02:05:08PM +0200, Alejandro Colomar wrote: > Important note: Sam, are you sure you want your pages compressed > with bz2? Have you seen the 10 seconds it takes man-db's man(1) to > find a word in the pages? I suggest that at least you try to > reproduce these tests in your machine, and see if it's just me or > man-db's man(1) is pretty bad at non-gz pages. man-db is significantly slower with bzip2 than gzip these days, because much of the performance work I did in 2.10.0 only applies to gzip: there's in-process support for decompressing gzip, but we use subprocesses for bzip2. IMO the relatively small difference in compressed size doesn't justify the effort of building in-process support for multiple compression algorithms. I recommend that distributions just use gzip; but if distributions _really_ want to use something else for whatever reason, then perhaps they should contribute code to man-db to ensure similar performance to gzip. I'm happy to give pointers if there's a sufficiently compelling reason to make it worth the effort. > - man-db's man(1) is slower with plain man(7) source than with .gz > pages for some mysterious reason. Maybe CPU is sufficiently cheaper than I/O that the fact of reading less data from disk dominates. 
(Can I request that any concrete actions that need to be taken based on this thread be split out to separate bug reports or something, please? This thread is long and I don't really want to have lots of meandering discourse in my inbox going back over the tired old man vs. info debate or whatever, but if there are actual things I need to fix in man-db then I'd rather not miss them.) -- Colin Watson (he/him) [cjwatson@debian.org] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-09 12:29 ` Colin Watson @ 2023-04-09 13:36 ` Alejandro Colomar 2023-04-09 13:47 ` Compressed man pages Ralph Corderoy 0 siblings, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-09 13:36 UTC (permalink / raw) To: Colin Watson, Sam James Cc: Alexis, groff, linux-man, Ingo Schwarze, Ralph Corderoy, Dirk Gouders [-- Attachment #1.1: Type: text/plain, Size: 5842 bytes --] On 4/9/23 14:29, Colin Watson wrote: > On Sun, Apr 09, 2023 at 02:05:08PM +0200, Alejandro Colomar wrote: >> Important note: Sam, are you sure you want your pages compressed >> with bz2? Have you seen the 10 seconds it takes man-db's man(1) to >> find a word in the pages? I suggest that at least you try to >> reproduce these tests in your machine, and see if it's just me or >> man-db's man(1) is pretty bad at non-gz pages. > > man-db is significantly slower with bzip2 than gzip these days, because > much of the performance work I did in 2.10.0 only applies to gzip: > there's in-process support for decompressing gzip, but we use > subprocesses for bzip2. IMO the relatively small difference in > compressed size doesn't justify the effort of building in-process > support for multiple compression algorithms. Agree. > I recommend that > distributions just use gzip; I don't agree here. gzip vs man source is 5M vs 9M. However, a simple pipeline searching for a word in gzip pages takes ~114x the time it takes to perform the same search on man(7) source. I don't think that small benefit in size justifies the slowness. Of course, this is only about theoretical maximum performance. Current man(1) has other issues so it doesn't benefit from this performance advantage. > but if distributions _really_ want to use > something else for whatever reason, then perhaps they should contribute > code to man-db to ensure similar performance to gzip. 
I'm happy to give > pointers if there's a sufficiently compelling reason to make it worth > the effort. > >> - man-db's man(1) is slower with plain man(7) source than with .gz >> pages for some mysterious reason. > > Maybe CPU is sufficiently cheaper than I/O that the fact of reading less > data from disk dominates. My CPU is powerful, but so is my SSD. I wouldn't expect decompressing to be faster than I/O. I have a Samsung 960 PRO, which is quite fast[1]. $ lscpu [...] Model name: Intel(R) Core(TM) i7-5775C CPU @ 3.30GHz CPU family: 6 Model: 71 Thread(s) per core: 1 Core(s) per socket: 4 Socket(s): 1 Stepping: 1 CPU(s) scaling MHz: 44% CPU max MHz: 3700.0000 CPU min MHz: 800.0000 [...] Caches (sum of all): L1d: 128 KiB (4 instances) L1i: 128 KiB (4 instances) L2: 1 MiB (4 instances) L3: 6 MiB (1 instance) L4: 128 MiB (1 instance) [...] $ lspci | grep -i samsung 01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963 $ lsblk -o NAME,FSTYPE,MOUNTPOINT,SIZE,MODEL NAME FSTYPE MOUNTPOINT SIZE MODEL [...] nvme0n1 953.9G Samsung SSD 960 PRO ├─nvme0n1p1 vfat /boot/efi 1023M ├─nvme0n1p2 ext4 /boot 4G └─nvme0n1p3 crypto_L 948G └─nvme0n1p3_crypt ext4 / 948G Also, a manual loop should have similar problems, but it doesn't have them; if I loop manually over the files and grep them, it takes 0.01 s, which is the lowest that /bin/time can measure on my system. I repeated the tests on a tmpfs just to check. The times are almost the same (except that bzip2 goes down from 10 s to 9 s :). 
$ mount | grep /tmp tmpfs on /tmp type tmpfs (rw,noatime,inode64) $ sudo rm -r /tmp/man $ sudo make install-man prefix=/tmp/man/gz_ -j LINK_PAGES=symlink Z=.gz | wc -l 2570 $ sudo make install-man prefix=/tmp/man/bz2 -j LINK_PAGES=symlink Z=.bz2 | wc -l 2570 $ sudo make install-man prefix=/tmp/man/man -j LINK_PAGES=symlink Z= | wc -l 2570 $ du -sh /tmp/man/* 5.3M /tmp/man/bz2 5.4M /tmp/man/gz_ 9.3M /tmp/man/man $ export MANPATH=/tmp/man/gz_/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" 37 0.30 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" 17 1.14 This is quite optimized. I can't beat man(1) with a shell pipeline for .gz pages. :) $ export MANPATH=/tmp/man/bz2/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" 37 9.22 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" 17 1.22 Sam, really consider not using .bz2 for Gentoo's pages. :) $ export MANPATH=/tmp/man/man/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" 37 0.52 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs grep -l RLIMIT_NOFILE | wc -l" 17 0.01 man(1) is ~52x slower than my loop. Similar results from RAM and NVMe, so I/O is not the issue here. > > > (Can I request that any concrete actions that need to be taken based on > this thread be split out to separate bug reports or something, please? > This thread is long and I don't really want to have lots of meandering > discourse in my inbox going back over the tired old man vs. info debate > or whatever, but if there are actual things I need to fix in man-db then > I'd rather not miss them.) Sure; do you have a mailing list, or should I send them to you and CC linux-man@? I have at least one bug report for you. 
Cheers, Alex [1]: <https://www.anandtech.com/show/10754/samsung-960-pro-ssd-review> -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
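The ~114x and ~52x figures in this message are simply quotients of the reported wall-clock times. A minimal sketch of that arithmetic follows, assuming awk(1) is available; ratio is a hypothetical helper name, not something from the thread. Since 0.01 s is the smallest value /bin/time could report here, the results are probably best read as rough figures rather than precise measurements.

```shell
#!/bin/sh
# ratio is a hypothetical helper, not part of any tool in this thread.
# It prints how many times larger the first timing is than the second,
# rounded to the nearest integer.
ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", slow / fast }'
}

ratio 1.14 0.01   # gzip -d | grep loop vs. xargs grep on man(7) source: 114
ratio 0.52 0.01   # man -Kaw vs. xargs grep, both on man(7) source: 52
```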
* Re: Compressed man pages 2023-04-09 13:36 ` Alejandro Colomar @ 2023-04-09 13:47 ` Ralph Corderoy 0 siblings, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-09 13:47 UTC (permalink / raw) To: Alejandro Colomar; +Cc: Colin Watson, groff, linux-man Hi Alejandro, > Sure; do you have a mailing list, or should I send them to you and > CC linux-man@? I have at least one bug report for you. Start from https://man-db.gitlab.io/man-db/, which is the home page according to Arch Linux's package, and you'll end up in all the typical places: mailing list, issue tracker, etc. -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-09 12:05 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Alejandro Colomar 2023-04-09 12:17 ` Alejandro Colomar 2023-04-09 12:29 ` Colin Watson @ 2023-04-12 8:13 ` Sam James 2023-04-12 8:32 ` Compressed man pages Ralph Corderoy 2023-04-12 13:04 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Kerin Millar 2 siblings, 2 replies; 73+ messages in thread From: Sam James @ 2023-04-12 8:13 UTC (permalink / raw) To: Alejandro Colomar Cc: Alexis, groff, linux-man, Ingo Schwarze, Dirk Gouders, Colin Watson, Ralph Corderoy, Kerin Millar [-- Attachment #1: Type: text/plain, Size: 4996 bytes --] Alejandro Colomar <alx.manpages@gmail.com> writes: > [[PGP Signed Part:Undecided]] > [Added back linux-man@, and people that commented on this (sub)topic] > [Added Sam, I've got a question for you] > > Hi Alexis, > > Please keep (at least) linux-man@ in the loop. > > On 4/9/23 08:44, Alexis wrote: >> >> As a related data point, i'd like to mention Gentoo's position on >> this, i.e. that man pages will continue to be bzip2-compressed by >> default: >> >> "app-text/mandoc bzip2 support" >> https://bugs.gentoo.org/854267 >> >> "Remove /usr/share/man from default inclusion list for docompress" >> https://bugs.gentoo.org/836367 > > As Ingo said[1] 3 years ago, I don't think in this year it makes any > sense to compress pages anymore. However, since it's simple for me > to add support for that, and it can be interesting for testing > purposes, I added support for installing the Linux man-pages > compressed with bzip2 using the Makefile[2]. While I was at it, I > also added support for generating .tar.bz2 release tarballs[3]. 
> > With this, I was able to test a bit more than what I did yesterday: > > > $ sudo rm -rf /opt/local/man/ > $ sudo make install-man prefix=/opt/local/man/gz_ -j LINK_PAGES=symlink Z=.gz | wc -l > 2570 > $ sudo make install-man prefix=/opt/local/man/bz2 -j LINK_PAGES=symlink Z=.bz2 | wc -l > 2570 > $ sudo make install-man prefix=/opt/local/man/man -j LINK_PAGES=symlink Z= | wc -l > 2570 > $ du -sh /opt/local/man/* > 5.4M /opt/local/man/bz2 > 5.5M /opt/local/man/gz_ > 9.4M /opt/local/man/man > > > $ export MANPATH=/opt/local/man/gz_/share/man > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > 37 > 0.31 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs zgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.56 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 zgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.56 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do zcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.24 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.14 > > > $ export MANPATH=/opt/local/man/bz2/share/man > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > 37 > 10.90 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs bzgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.33 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 bzgrep -l RLIMIT_NOFILE | wc -l" > 17 > 1.31 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.21 > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > 17 > 1.22 > > > $ export MANPATH=/opt/local/man/man/share/man > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > 37 > 0.56 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs grep -l 
RLIMIT_NOFILE | wc -l" > 17 > 0.01 > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 grep -l RLIMIT_NOFILE | wc -l" > 17 > 0.01 > > Weird thing: today, the symlink bug in man(1) was reproducible in > all kinds of pages, while yesterday it only reproduced in > uncompressed ones. > > Another weird thing: times today changed considerably for the > find(1) pipelines (half of yesterday's). It's not a thing of > using dash(1), because I get similar times with bash(1) and its > builtin time(1). > > Important note: Sam, are you sure you want your pages compressed > with bz2? Have you seen the 10 seconds it takes man-db's man(1) to > find a word in the pages? I suggest that at least you try to > reproduce these tests in your machine, and see if it's just me or > man-db's man(1) is pretty bad at non-gz pages. > > Test results: > > - man-db's man(1) is slower with plain man(7) source than with .gz > pages for some mysterious reason. > > - man-db's man(1) is turtle slow with .bz2 pages. I started looking into changing to xz (or just.. not bz2, anyway), partially motivated by https://gitlab.com/man-db/man-db/-/issues/4 / just interest locally (without having done measurements to see if it would be worth a global change) and the xz maintainer ended up recommending a different implementation to how man-db currently handles external utilities entirely (which I have a draft of). The xz author had some suggestions on the best parameters to use for man pages too which I need to look into and dig up... https://bugs.gentoo.org/169260 was an interesting discussion about our choice of bz2 (it came up a bit in https://bugs.gentoo.org/372653 too). (I'll get back and read the rest of the thread later, but wanted to add this tidbit.) Definitely surprised to learn bz2 is *that* bad though! best, sam [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 377 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages 2023-04-12 8:13 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Sam James @ 2023-04-12 8:32 ` Ralph Corderoy 2023-04-12 10:35 ` Mingye Wang 2023-04-12 13:04 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Kerin Millar 1 sibling, 1 reply; 73+ messages in thread From: Ralph Corderoy @ 2023-04-12 8:32 UTC (permalink / raw) To: Sam James; +Cc: groff, linux-man Hi Sam, > I started looking into changing to xz (or just.. not bz2, anyway) If you're putting effort into researching another compressor then consider lzip(1). https://www.nongnu.org/lzip/lzip.html Its author compares it against xz in particular. https://www.nongnu.org/lzip/xz_inadequate.html -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages 2023-04-12 8:32 ` Compressed man pages Ralph Corderoy @ 2023-04-12 10:35 ` Mingye Wang 2023-04-12 10:55 ` Ralph Corderoy 0 siblings, 1 reply; 73+ messages in thread From: Mingye Wang @ 2023-04-12 10:35 UTC (permalink / raw) To: Ralph Corderoy; +Cc: Sam James, groff, linux-man On Wed, Apr 12, 2023 at 4:36 PM Ralph Corderoy <ralph@inputplus.co.uk> wrote: > If you're putting effort into researching another compressor then > consider lzip(1). https://www.nongnu.org/lzip/lzip.html > > Its author compares it against xz in particular. > https://www.nongnu.org/lzip/xz_inadequate.html lzip is cool and all, but the thing is we are talking about storage for distribution on every single person's computer in single-file form, not archiving into a tarball. We are looking at a world where almost every system has xz installed because of some past decisions, unfortunate or not. Regards, Mingye ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages 2023-04-12 10:35 ` Mingye Wang @ 2023-04-12 10:55 ` Ralph Corderoy 0 siblings, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-12 10:55 UTC (permalink / raw) To: groff, linux-man; +Cc: Sam James Hi Mingye, > the thing is we are talking about storage for distribution on every > single person's computer No, I was talking to sam@gentoo.org so I assumed Gentoo as the target. > We are looking at a world where almost every system has xz installed > because of some past decisions, unfortunate or not. That's not the kind of thing I expect to bother Gentoo. :-) -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-12 8:13 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Sam James 2023-04-12 8:32 ` Compressed man pages Ralph Corderoy @ 2023-04-12 13:04 ` Kerin Millar 2023-04-12 14:24 ` Alejandro Colomar 1 sibling, 1 reply; 73+ messages in thread From: Kerin Millar @ 2023-04-12 13:04 UTC (permalink / raw) To: Sam James Cc: Alejandro Colomar, Alexis, groff, linux-man, Ingo Schwarze, Dirk Gouders, Colin Watson, Ralph Corderoy On Wed, 12 Apr 2023 09:13:13 +0100 Sam James <sam@gentoo.org> wrote: > > Alejandro Colomar <alx.manpages@gmail.com> writes: > > > [[PGP Signed Part:Undecided]] > > [Added back linux-man@, and people that commented on this (sub)topic] > > [Added Sam, I've got a question for you] > > > > Hi Alexis, > > > > Please keep (at least) linux-man@ in the loop. > > > > On 4/9/23 08:44, Alexis wrote: > >> > >> As a related data point, i'd like to mention Gentoo's position on > >> this, i.e. that man pages will continue to be bzip2-compressed by > >> default: > >> > >> "app-text/mandoc bzip2 support" > >> https://bugs.gentoo.org/854267 > >> > >> "Remove /usr/share/man from default inclusion list for docompress" > >> https://bugs.gentoo.org/836367 > > > > As Ingo said[1] 3 years ago, I don't think in this year it makes any > > sense to compress pages anymore. However, since it's simple for me > > to add support for that, and it can be interesting for testing > > purposes, I added support for installing the Linux man-pages > > compressed with bzip2 using the Makefile[2]. While I was at it, I > > also added support for generating .tar.bz2 release tarballs[3]. 
> > > > With this, I was able to test a bit more than what I did yesterday: > > > > > > $ sudo rm -rf /opt/local/man/ > > $ sudo make install-man prefix=/opt/local/man/gz_ -j LINK_PAGES=symlink Z=.gz | wc -l > > 2570 > > $ sudo make install-man prefix=/opt/local/man/bz2 -j LINK_PAGES=symlink Z=.bz2 | wc -l > > 2570 > > $ sudo make install-man prefix=/opt/local/man/man -j LINK_PAGES=symlink Z= | wc -l > > 2570 > > $ du -sh /opt/local/man/* > > 5.4M /opt/local/man/bz2 > > 5.5M /opt/local/man/gz_ > > 9.4M /opt/local/man/man > > > > > > $ export MANPATH=/opt/local/man/gz_/share/man > > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > > 37 > > 0.31 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs zgrep -l RLIMIT_NOFILE | wc -l" > > 17 > > 1.56 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 zgrep -l RLIMIT_NOFILE | wc -l" > > 17 > > 1.56 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do zcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > > 17 > > 1.24 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > > 17 > > 1.14 > > > > > > $ export MANPATH=/opt/local/man/bz2/share/man > > $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l" > > 37 > > 10.90 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs bzgrep -l RLIMIT_NOFILE | wc -l" > > 17 > > 1.33 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 bzgrep -l RLIMIT_NOFILE | wc -l" > > 17 > > 1.31 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzcat \$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > > 17 > > 1.21 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l" > > 17 > > 1.22 > > > > > > $ export MANPATH=/opt/local/man/man/share/man > > $ /bin/time -f %e dash -c "man -Kaw 
RLIMIT_NOFILE | wc -l" > > 37 > > 0.56 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs grep -l RLIMIT_NOFILE | wc -l" > > 17 > > 0.01 > > $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 grep -l RLIMIT_NOFILE | wc -l" > > 17 > > 0.01 > > > > Weird thing: today, the symlink bug in man(1) was reproducible in > > all kinds of pages, while yesterday it only reproduced in > > uncompressed ones. > > > > Another weird thing: times today changed considerably for the > > find(1) pipelines (half of yesterday's). It's not a thing of > > using dash(1), because I get similar times with bash(1) and its > > builtin time(1). > > > > Important note: Sam, are you sure you want your pages compressed > > with bz2? Have you seen the 10 seconds it takes man-db's man(1) to > > find a word in the pages? I suggest that at least you try to > > reproduce these tests on your machine, and see if it's just me or > > man-db's man(1) is pretty bad at non-gz pages. > > > > Test results: > > > > - man-db's man(1) is slower with plain man(7) source than with .gz > > pages for some mysterious reason. > > > > - man-db's man(1) is turtle slow with .bz2 pages. > > I started looking into changing to xz (or just.. not bz2, anyway), > partially motivated by https://gitlab.com/man-db/man-db/-/issues/4 / > just interest locally (without having done measurements to see if it > would be worth a global change) and the xz maintainer ended up > recommending a different implementation to how man-db currently handles > external utilities entirely (which I have a draft of). > > The xz author had some suggestions on the best parameters to use > for man pages too which I need to look into and dig up... > > https://bugs.gentoo.org/169260 was an interesting discussion > about our choice of bz2 (it came up a bit in > https://bugs.gentoo.org/372653 too). Oh, I remember this.
Soon after #372653 was closed, I experimented further and found xz --lzma2=preset=6e,pb=0 to be more effective than bzip2 -9, both in terms of compression ratio and subsequent decompression performance, so I used those settings for a time. Nowadays, I would be more concerned with the time taken to render a man page than with reducing the footprint of the installed documentation. > > (I'll get back and read the rest of the thread later, but wanted > to add this tidbit.) > > Definitely surprised to learn bz2 is *that* bad though! > > best, > sam -- Kerin Millar ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-12 13:04 ` Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) Kerin Millar @ 2023-04-12 14:24 ` Alejandro Colomar 2023-04-12 18:52 ` Mingye Wang 0 siblings, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-12 14:24 UTC (permalink / raw) To: linux-man Cc: Alexis, groff, Ingo Schwarze, Dirk Gouders, Colin Watson, Ralph Corderoy, Mingye Wang, Kerin Millar, Sam James [-- Attachment #1.1: Type: text/plain, Size: 7495 bytes --] Hi all, After Ralph's suggestion to try .lz, Sam's comment about .xz, and Kerin's comment about tuning the compression parameters, I decided to try out everything at once, so we can see the effects of the alternatives. TL;DR: For manual pages, use uncompressed source, or gzip(1). Everything else is unreasonably slow. Here are the numbers; the conclusions I draw from them follow below. The following tests have been produced with man-db's man(1) built from source, since Colin fixed a relevant bug a few days ago[1]. This improves performance considerably compared to the latest release.
$ sudo make install-man prefix=/opt/local/man/bz2_1 -j LINK_PAGES=symlink Z=.bz2 BZIP2FLAGS=-1 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/bz2_9 -j LINK_PAGES=symlink Z=.bz2 BZIP2FLAGS=-9 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/bz2__ -j LINK_PAGES=symlink Z=.bz2 BZIP2FLAGS= | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/gz__1 -j LINK_PAGES=symlink Z=.gz GZIPFLAGS=-1 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/gz__9 -j LINK_PAGES=symlink Z=.gz GZIPFLAGS=-9 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/gz___ -j LINK_PAGES=symlink Z=.gz GZIPFLAGS= | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/lz__1 -j LINK_PAGES=symlink Z=.lz LZIPFLAGS=-1 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/lz__9 -j LINK_PAGES=symlink Z=.lz LZIPFLAGS=-9 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/lz___ -j LINK_PAGES=symlink Z=.lz LZIPFLAGS= | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/xz__1 -j LINK_PAGES=symlink Z=.xz XZFLAGS=-1 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/xz__9 -j LINK_PAGES=symlink Z=.xz XZFLAGS=-9 | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/xz___ -j LINK_PAGES=symlink Z=.xz XZFLAGS= | wc -l 2571 $ sudo make install-man prefix=/opt/local/man/man__ -j LINK_PAGES=symlink Z= | wc -l 2571 $ du -sh /opt/local/man/* 5.4M /opt/local/man/bz2_1 5.4M /opt/local/man/bz2_9 5.4M /opt/local/man/bz2__ 5.7M /opt/local/man/gz__1 5.5M /opt/local/man/gz__9 5.5M /opt/local/man/gz___ 5.5M /opt/local/man/lz__1 5.4M /opt/local/man/lz__9 5.4M /opt/local/man/lz___ 9.4M /opt/local/man/man__ 5.5M /opt/local/man/xz__1 5.4M /opt/local/man/xz__9 5.4M /opt/local/man/xz___ $ export MANPATH=/opt/local/man/bz2_1/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.15 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc 
-l | xargs printf '%s; '" 17; 1.22 $ export MANPATH=/opt/local/man/bz2_9/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.15 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.23 $ export MANPATH=/opt/local/man/bz2__/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.19 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do bzip2 -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.23 $ export MANPATH=/opt/local/man/gz__1/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 0.21 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.16 $ export MANPATH=/opt/local/man/gz__9/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 0.20 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.17 $ export MANPATH=/opt/local/man/gz___/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 0.20 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do gzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.15 $ export MANPATH=/opt/local/man/lz__1/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.95 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do lzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.40 $ export MANPATH=/opt/local/man/lz__9/share/man $ /bin/time -f %e dash -c "man 
-Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.93 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do lzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.40 $ export MANPATH=/opt/local/man/lz___/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.94 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do lzip -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.40 $ export MANPATH=/opt/local/man/xz__1/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 3.43 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do xz -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.24 $ export MANPATH=/opt/local/man/xz__9/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 4.21 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do xz -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.55 $ export MANPATH=/opt/local/man/xz___/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 4.17 $ /bin/time -f %e dash -c "find $MANPATH -type f | while read f; do xz -d - <\$f | grep -l RLIMIT_NOFILE >/dev/null && echo \$f; done | wc -l | xargs printf '%s; '" 17; 1.55 $ export MANPATH=/opt/local/man/man__/share/man $ /bin/time -f %e dash -c "man -Kaw RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 0.55 $ /bin/time -f %e dash -c "find $MANPATH -type f | xargs -P0 grep -l RLIMIT_NOFILE | wc -l | xargs printf '%s; '" 17; 0.01 Conclusions: Any compression format other than .gz is unreasonably slow. I'd say either use .gz, or plain text, or prepare to contribute code yourself to man-db to optimize for your favourite compression format.
.bz2, .lz, and .xz have similar times, and tuning the compression doesn't produce important changes in speed (except slightly for .xz, but I don't see any advantage of using .xz). Similarly, tuning the compression of .gz doesn't produce important changes in speed. Plain text has the advantage that you can use all the power of Unix tools to search through the source code of the pages instantaneously, without being restricted to what man(1) allows. I hope this was useful. Cheers, Alex [1]: <https://lists.nongnu.org/archive/html/man-db-devel/2023-04/msg00000.html> -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
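Editorial sketch: the decompress-and-grep pipelines timed in the message above all have the same shape, so they can be wrapped in one helper that takes the decompressor as a parameter. This is only an illustration under stated assumptions: the function name `count_matches` and the three-page temporary corpus are hypothetical, and only gzip(1) is exercised here so the sketch runs without bzip2, lzip, or xz installed.

```shell
#!/bin/sh
# Count the files under directory $1 whose decompressed contents contain
# pattern $3. $2 is a decompressor that reads stdin and writes stdout,
# e.g. "gzip -dc", "bzip2 -dc", "xz -dc", or plain "cat".
# (count_matches is a hypothetical helper, not part of man-db.)
count_matches() {
    dir=$1 decomp=$2 pattern=$3
    find "$dir" -type f | while read -r f; do
        # $decomp is left unquoted on purpose so "gzip -dc" splits into words.
        $decomp <"$f" | grep -q "$pattern" && echo "$f"
    done | wc -l | tr -d ' '
}

# Hypothetical three-page corpus, created only for this demonstration.
corpus=$(mktemp -d)
printf '.TH A 2\nRLIMIT_NOFILE is mentioned here.\n' >"$corpus/a.2"
printf '.TH B 2\nNothing relevant.\n' >"$corpus/b.2"
printf '.TH C 3\nSee RLIMIT_NOFILE.\n' >"$corpus/c.3"

plain_hits=$(count_matches "$corpus" cat RLIMIT_NOFILE)

gzip "$corpus"/*
gz_hits=$(count_matches "$corpus" "gzip -dc" RLIMIT_NOFILE)

echo "plain: $plain_hits; gz: $gz_hits"
rm -rf "$corpus"
```

To compare formats the way the message does, the same call could be wrapped in time(1) once per MANPATH, passing the matching decompressor each time.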
* Re: Compressed man pages (was: Accessibility of man pages (was: Playground pager lsp(1))) 2023-04-12 14:24 ` Alejandro Colomar @ 2023-04-12 18:52 ` Mingye Wang 2023-04-12 20:23 ` Compressed man pages Alejandro Colomar 2023-04-13 10:09 ` Ralph Corderoy 0 siblings, 2 replies; 73+ messages in thread From: Mingye Wang @ 2023-04-12 18:52 UTC (permalink / raw) To: Alejandro Colomar Cc: linux-man, Alexis, groff, Ingo Schwarze, Dirk Gouders, Colin Watson, Ralph Corderoy, Kerin Millar, Sam James On Wed, Apr 12, 2023 at 10:24 PM Alejandro Colomar <alx.manpages@gmail.com> wrote: > $ sudo make install-man prefix=/opt/local/man/xz___ -j LINK_PAGES=symlink Z=.xz XZFLAGS= | wc -l Small nitpick: Kerin's recommended pb=0 isn't actually used here. https://bugs.gentoo.org/169260#c19 (from Kerin) suggests that we might get one-third more. I'm having trouble getting the Makefile to behave on MSYS2, but it does shrink a manual copy of man*/ totalling 7.2 M (probably because `exit` and `nan` didn't get checked out by git -- case-insensitivity issues) down to 2.8 M (both `du --apparent-size -sh`). > .bz2, .lz, and .xz have similar times, and tuning the compression > doesn't produce important changes in speed Or size. This is to be expected, since man pages are really tiny files, to the point that compressors can't see much context. [Zstd and brotli each have a "dictionary mode" to deal with this, but (a) Zstd dict-file requires an extra flag on decompress (b) nobody has brotli, which has a default dictionary, installed.] > .xz, but I don't see any advantage of using .xz). Going for `xz -9` only unnecessarily inflates the dictionary size beyond the file size and therefore the memory requirement. The dictionary size at -0 is 256 KiB, already enough for almost every man page in existence. (gz -9 is 32 KiB, if I recall correctly.) > Conclusions: > > Any compression format other than .gz is unreasonably slow.
> I'd say either use .gz, or plain text, or prepare to > contribute code yourself to man-db to optimize for your favourite > compression format. For every compression format someone adds, there's going to be one more optional dependency... Cheers, Mingye ^ permalink raw reply [flat|nested] 73+ messages in thread
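Editorial sketch: the point in the message above, that per-file compression of tiny pages leaves the compressor with little context, can be illustrated by compressing a set of small, similar files one by one and then as a single stream. Everything here is an assumption made for illustration: the 20 generated "pages" are hypothetical stand-ins for real man pages, and gzip(1) is used only because it is the most likely to be installed; no claim is made about the sizes real pages would give.

```shell
#!/bin/sh
# Hypothetical corpus: 20 tiny, similar "pages" in a temporary directory.
work=$(mktemp -d)
i=1
while [ "$i" -le 20 ]; do
    printf '.TH PAGE%s 3\n.SH NAME\npage%s \\- an example page\n.SH DESCRIPTION\nThis page describes function number %s.\n' \
        "$i" "$i" "$i" >"$work/page$i.3"
    i=$((i + 1))
done

# Total size when every page is compressed on its own, as installed
# man pages are: each file pays for its own header and starts with
# an empty history window.
separate=0
for f in "$work"/*.3; do
    n=$(gzip -9c "$f" | wc -c)
    separate=$((separate + n))
done

# Size when the compressor sees all pages as one stream and can reuse
# what it learned from earlier pages.
together=$(cat "$work"/*.3 | gzip -9c | wc -c)
together=$((together + 0))   # normalise any padding from wc(1)

echo "separate: $separate bytes; together: $together bytes"
rm -rf "$work"
```

A shared dictionary, as mentioned for Zstd and brotli, aims at the same effect without giving up per-file access.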
* Re: Compressed man pages 2023-04-12 18:52 ` Mingye Wang @ 2023-04-12 20:23 ` Alejandro Colomar 2023-04-13 10:09 ` Ralph Corderoy 1 sibling, 0 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-12 20:23 UTC (permalink / raw) To: Mingye Wang Cc: linux-man, Alexis, groff, Ingo Schwarze, Dirk Gouders, Colin Watson, Ralph Corderoy, Kerin Millar, Sam James [-- Attachment #1.1: Type: text/plain, Size: 2467 bytes --] Hi Mingye, On 4/12/23 20:52, Mingye Wang wrote: > On Wed, Apr 12, 2023 at 10:24 PM Alejandro Colomar > <alx.manpages@gmail.com> wrote: >> $ sudo make install-man prefix=/opt/local/man/xz___ -j LINK_PAGES=symlink Z=.xz XZFLAGS= | wc -l > > Small nitpick here as Kerin's recommended pb=0 isn't actually used. > https://bugs.gentoo.org/169260#c19 (from Kerin) suggests that we might > get one-third more. Hmm, might be interesting to try at some point, but since man(1) is very unoptimized for non-gz, as we saw, I don't think it's worth trying for now. > > I'm having trouble getting the Makefile to behave on MSYS2, but it > does shrink a manual copy of man*/ totalling 7.2 M (probably because > `exit` and `nan` didn't get checked out by git -- case-insensitivity > issues) down to 2.8 M (both `du --apparent-size -sh`). I didn't push the changes needed to use .lz and .xz. Maybe that was the issue?
* bc15c1d7b - Wed, 12 Apr 2023 16:54:01 +0200 (5 hours ago) (tar) | Makefile: tfix - Alejandro Colomar * db5795531 - Wed, 12 Apr 2023 16:53:32 +0200 (5 hours ago) | *.mk: $Z: Support installing xz(1) compressed pages - Alejandro Colomar * c2fffefba - Wed, 12 Apr 2023 16:46:16 +0200 (6 hours ago) | *.mk: Add *FLAGS variables for compression commands - Alejandro Colomar * b220bc5b0 - Wed, 12 Apr 2023 14:43:00 +0200 (8 hours ago) | *.mk: $Z: Support installing lzip(1) compressed pages - Alejandro Colomar * 69ad95988 - Wed, 12 Apr 2023 14:37:08 +0200 (8 hours ago) | *.mk: dist, dist-lz: Create tarballs compressed with lzip(1) - Alejandro Colomar * 254fe38b2 - Tue, 11 Apr 2023 22:33:44 +0200 (24 hours ago) (tag: man-pages-6.05-a1) | dist.mk, version.mk: Create reproducible tarballs - Alejandro Colomar | * c7e9f0ffe - Tue, 11 Apr 2023 22:13:00 +0200 (24 hours ago) (set) |/ build-catman.mk: Use .set suffix for troff(1) output - Alejandro Colomar * 121c8de01 - Tue, 11 Apr 2023 16:55:17 +0200 (29 hours ago) (HEAD -> master, korg/master) | fts.3: SYNOPSIS: Fix nullability - Alejandro Colomar I'll push in a moment so you can try that (already done at the time of sending this email). Or did you see different issues about the Makefile? Please report anything uncomfortable about it. Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Compressed man pages 2023-04-12 18:52 ` Mingye Wang 2023-04-12 20:23 ` Compressed man pages Alejandro Colomar @ 2023-04-13 10:09 ` Ralph Corderoy 1 sibling, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-13 10:09 UTC (permalink / raw) To: linux-man, groff Hi Mingye, > [Zstd and brotli each have a "dictionary mode" to deal with this, but > (a) Zstd dict-file requires an extra flag on decompress (b) nobody has > brotli, which has a default dictionary, installed.] I found brotli was already installed here. So here's some numbers, just for the lists' info. $ ls | grep '\.gz$' | shuf -n10 | > while read -r f; do > printf '%32s %5d %5d\n' "$f" `wc -c <"$f"` \ > `zcat "$f" | brotli | wc -c` > done | > awk '{print $0 " " $3/$2}' postmap.1.gz 4125 3333 0.808 gnutls-cli-debug.1.gz 2627 2108 0.802436 cwebp.1.gz 5074 4106 0.809223 findsmb.1.gz 1810 1474 0.814365 ppmntsc.1.gz 1282 973 0.75897 libuv.1.gz 76363 62274 0.8155 xmlwf.1.gz 3486 2760 0.791738 users.1.gz 763 572 0.749672 gpgparsemail.1.gz 294 231 0.785714 perl561delta.1perl.gz 51764 42957 0.829862 $ -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Playground pager lsp(1) 2023-04-06 1:10 ` Alejandro Colomar 2023-04-06 8:11 ` Eli Zaretskii @ 2023-04-07 2:18 ` G. Branden Robinson 2023-04-07 6:36 ` Eli Zaretskii 2023-04-07 21:26 ` reformatting man pages at SIGWINCH " Alejandro Colomar 1 sibling, 2 replies; 73+ messages in thread From: G. Branden Robinson @ 2023-04-07 2:18 UTC (permalink / raw) To: Alejandro Colomar; +Cc: Eli Zaretskii, dirk, linux-man, help-texinfo [-- Attachment #1: Type: text/plain, Size: 3547 bytes --] At 2023-04-06T03:10:59+0200, Alejandro Colomar wrote: > Hmm, now that I think, it's probably an issue of coordinating man(1) > and less(1). I sometimes wish that when I resize a window where I'm > reading a man page, it would reformat the page from source. Seems like it shouldn't be impossible to me, but what I imagine would require a little reëngineering of man(1), perhaps to spawn a little custom program to manage zcat/nroff pipeline it constructs. This little program's sole job could be to be aware of this pipeline and listen for SIGWINCH; if it happens, kill the rest of the pipeline and reëxecute it. Maybe I thought of it this way because (I suspect) it aligns with my vision I've expressed elsewhere of man(1) having unfortunately aggregated two separate functions: librarian vs. renderer. Historically, of course the latter function was almost vestigial, since early Unix systems had no pager program and their man pages required little to no preprocessing; man(1) slowly accreted into a larger thing. > Of course, that might be a problem for keeping track of where I was, > since lines moved around. That seems like a harder problem to me. You'd need a way for the pager to communicate position information back to the mini-man renderer program I envision. Two challenges here: (1) what part of the screen was the reader actually looking at? 
(2) how is the pager supposed to know how to map any given location on the screen back to a place in the unrendered source document so it can be accurately found when the document is rerendered? These feel nearly intractable to me. But maybe I have a poor imagination. > Ahh, yes, this is true. What I found missing is a kind of a map for > knowing what I have available for navigating (also the fact that I > don't usually run info(1) makes me be a bit fuzzy on detailing what > is it that I miss from it). So, info(1) has a map of the sections > available in a page, and does it also have a map of all the pages > in the system (or whatever you call your pages, I don't yet really > understand the organization of info manuals). The "install-info" program is run by packages that install info documents to the system. This creates or updates a file called "dir". For instance, when I "make install" an everyday groff build, the following shows up. /home/branden/groff/share/info/dir /home/branden/groff/share/info/groff.info /home/branden/groff/share/info/groff.info-1 /home/branden/groff/share/info/groff.info-2 /home/branden/groff/share/info/groff.info-3 Since help-texinfo is on the distribution list of this mail, I'll take this opportunity to note something from groff's INSTALL.extra file, explaining how to uninstall the package. ... Run the command 'sudo make uninstall'. (If you successfully used 'make install', simply run 'make uninstall'.) At a minimum, some directories not particular to groff, like 'bin' and (depending on configuration) an X11 'app-defaults' directory will remain, as will one plain file called 'dir', created by GNU Texinfo's 'install-info' command. (As of this writing, 'install-info' offers no provision for removing an effectively empty 'dir' file, and groff does not attempt to parse this file to determine whether it can be safely removed.) All other groff artifacts will be deleted from the installation hierarchy. 
Any chance 'install-info' could get savvy as noted above? (Maybe it already has--I'm running 6.7.0.) Regards, Branden [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
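Editorial sketch: the small supervisor described in the message above (watch the rendering pipeline, and on SIGWINCH kill it and run it again) might look roughly like this in shell. It is a sketch under stated assumptions, not how man-db works: `render` is a hypothetical stand-in for the real decompress/format/page pipeline, the page name is made up, and the resize is simulated by the script signalling itself so that it runs without a terminal. It does not attempt the harder problem of restoring the reader's position.

```shell
#!/bin/sh
# Stand-in for the real "decompress | nroff | pager" pipeline; it only
# reports what it would format and at which width.
render() {
    printf 'formatting %s for %s columns\n' "$1" "${COLUMNS:-80}"
}

page=groff.7          # hypothetical page name
restarts=0

# Start the pipeline in the background and remember its PID.
start_pipeline() {
    render "$page" &
    pipeline_pid=$!
}

# On a resize: stop the running pipeline, reap it, then run it again.
on_winch() {
    kill "$pipeline_pid" 2>/dev/null
    wait "$pipeline_pid" 2>/dev/null
    restarts=$((restarts + 1))
    start_pipeline
}
trap on_winch WINCH

start_pipeline
kill -WINCH $$        # simulate one terminal resize
wait "$pipeline_pid"
echo "pipeline restarted $restarts time(s)"
```

A real supervisor would also have to query the new width after each signal and pass it to the formatter before restarting.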
* Re: Playground pager lsp(1) 2023-04-07 2:18 ` Playground pager lsp(1) G. Branden Robinson @ 2023-04-07 6:36 ` Eli Zaretskii 2023-04-07 11:03 ` Gavin Smith 2023-04-07 14:43 ` man page rendering speed (was: Playground pager lsp(1)) G. Branden Robinson 2023-04-07 21:26 ` reformatting man pages at SIGWINCH " Alejandro Colomar 1 sibling, 2 replies; 73+ messages in thread From: Eli Zaretskii @ 2023-04-07 6:36 UTC (permalink / raw) To: G. Branden Robinson; +Cc: alx.manpages, dirk, linux-man, help-texinfo > Date: Thu, 6 Apr 2023 21:18:22 -0500 > From: "G. Branden Robinson" <g.branden.robinson@gmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, dirk@gouders.net, > linux-man@vger.kernel.org, help-texinfo@gnu.org > > > Hmm, now that I think, it's probably an issue of coordinating man(1) > > and less(1). I sometimes wish that when I resize a window where I'm > > reading a man page, it would reformat the page from source. > > Seems like it shouldn't be impossible to me, but what I imagine would > require a little reëngineering of man(1), perhaps to spawn a little > custom program to manage zcat/nroff pipeline it constructs. This little > program's sole job could be to be aware of this pipeline and listen for > SIGWINCH; if it happens, kill the rest of the pipeline and reëxecute it. This should be possible, but it flies in the face of the feature whereby formatted man pages are kept for future perusal, which is therefore faster: if the formatted pages reflect the particular size of the pager's window, it is meaningless to cache them. > ... Run the command 'sudo make uninstall'. (If you successfully used > 'make install', simply run 'make uninstall'.) At a minimum, some > directories not particular to groff, like 'bin' and (depending on > configuration) an X11 'app-defaults' directory will remain, as will > one plain file called 'dir', created by GNU Texinfo's 'install-info' > command. 
(As of this writing, 'install-info' offers no provision for > removing an effectively empty 'dir' file, and groff does not attempt > to parse this file to determine whether it can be safely removed.) > All other groff artifacts will be deleted from the installation > hierarchy. > > Any chance 'install-info' could get savvy as noted above? (Maybe it > already has--I'm running 6.7.0.) Why does it make sense to do that? An "empty" DIR file is not really empty: it has instructions at its beginning, which are important for newbies. Also, on a well-maintained system, DIR will rarely become empty, and if it does, it will soon enough become non-empty again, since all the Info manuals installed on the system should be mentioned there. And why would we want to imagine a system which has no Info manuals at all, not even an Info manual that describes how to use Info (which comes with the Texinfo distribution)? So I think Groff should remove that paragraph from its instructions, because (IMO) it is misleading and unnecessary. Of course, mine is not the authoritative opinion about how the Texinfo project should develop its programs; it is just one opinion. So wait for Gavin to chime in. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Playground pager lsp(1) 2023-04-07 6:36 ` Eli Zaretskii @ 2023-04-07 11:03 ` Gavin Smith 2023-04-07 14:43 ` man page rendering speed (was: Playground pager lsp(1)) G. Branden Robinson 1 sibling, 0 replies; 73+ messages in thread From: Gavin Smith @ 2023-04-07 11:03 UTC (permalink / raw) To: Eli Zaretskii Cc: G. Branden Robinson, alx.manpages, dirk, linux-man, help-texinfo On Fri, Apr 07, 2023 at 09:36:10AM +0300, Eli Zaretskii wrote: > This should be possible, but it flies in the face of the feature > whereby formatted man pages are kept for future perusal, which is > therefore faster: if the formatted pages reflect the particular size > of the pager's window, it is meaningless to cache them. > > > ... Run the command 'sudo make uninstall'. (If you successfully used > > 'make install', simply run 'make uninstall'.) At a minimum, some > > directories not particular to groff, like 'bin' and (depending on > > configuration) an X11 'app-defaults' directory will remain, as will > > one plain file called 'dir', created by GNU Texinfo's 'install-info' > > command. (As of this writing, 'install-info' offers no provision for > > removing an effectively empty 'dir' file, and groff does not attempt > > to parse this file to determine whether it can be safely removed.) > > All other groff artifacts will be deleted from the installation > > hierarchy. > > > > Any chance 'install-info' could get savvy as noted above? (Maybe it > > already has--I'm running 6.7.0.) > > Why does it make sense to do that? An "empty" DIR file is not really > empty: it has instructions at its beginning, which are important for > newbies. 
Also, on well-maintained system, DIR will rarely become > empty, and if it does, it will soon enough become non-empty again, > since all the Info manuals installed on the system should be mentioned > there, and why would we want to imagine a system which has no Info > manuals at all, not even an Info manual that describes how to use Info > (which comes with the Texinfo distribution)? It falls under the same category as the "directories not particular to groff" mentioned in the instructions. You want install-info (or Automake rules) to remove an empty dir file; you could equally claim that install-info should remove the empty 'info' directory that contains that dir file. What are the benefits of removing the file? ^ permalink raw reply [flat|nested] 73+ messages in thread
* man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 6:36 ` Eli Zaretskii 2023-04-07 11:03 ` Gavin Smith @ 2023-04-07 14:43 ` G. Branden Robinson 2023-04-07 15:06 ` Eli Zaretskii ` (2 more replies) 1 sibling, 3 replies; 73+ messages in thread From: G. Branden Robinson @ 2023-04-07 14:43 UTC (permalink / raw) To: Eli Zaretskii Cc: alx.manpages, dirk, cjwatson, linux-man, help-texinfo, groff [-- Attachment #1.1: Type: text/plain, Size: 4511 bytes --] [adding Colin Watson to CC to solicit his man(1) implementation knowledge; adding the groff list as this sub-discussion is relevant to its interests] At 2023-04-07T09:36:10+0300, Eli Zaretskii wrote: > > From: "G. Branden Robinson" <g.branden.robinson@gmail.com> [re-running *roff when viewing a man page and resizing the terminal] > > Seems like it shouldn't be impossible to me, but what I imagine > > would require a little reëngineering of man(1), perhaps to spawn a > > little custom program to manage zcat/nroff pipeline it constructs. > > This little program's sole job could be to be aware of this pipeline > > and listen for SIGWINCH; if it happens, kill the rest of the > > pipeline and reëxecute it. > > This should be possible, but it flies in the face of the feature > whereby formatted man pages are kept for future perusal, which is > therefore faster: You're referring to cat pages. As far as I know, these are on their way out if not already gone. Colin Watson, who has maintained the man-db implementation of man(1)[1] for something like 20 years, can speak more authoritatively to this, but as I understand it, the advent of resizable xterm windows started to kill the utility of cat pages decades ago and the increasing importance of desktop environments accelerated their demise. If a cat page wasn't pre-rendered at the width of your terminal, or for your terminal type[1], man pages were formatted from scratch for you anyway.
You could of course cache pages at a variety of widths (and for multiple terminal types), but doing so for any width encountered was a space concern--or even a DoS vector if some undergraduate rapscallion decided to try rendering every page on the system at every terminal width from 1 to INT_MAX--in the years when man page rendering time was also noticeable. ...which brings me to the other factor, of which I'm more confident: man page rendering times are much lower than they were in Unix's early days. On my system, all groff man pages but one render in between a tenth and a fortieth of a second. The really huge pages like groff(7), groff_char(7), and groff_diff(7) are toward the upper end of this range, because they are long, at ~20-25 U.S. letter pages when formatted for PostScript or PDF, or have many large tables so the tbl(1) preprocessor produces a lot of output. The outlier is groff_mdoc(7) at just over one-third of a second. It is written in its own macro language, not man(7), and also a lengthy document (31 U.S. letter pages). mdoc has always been slow; its original implementers warned of this. (I believe this is mainly due to an aspect of its design: the typical mdoc(7) document has a large number of recursive macro calls arising from a decision to help the document author avoid having to start new control lines to call them.) While not statistically rigorous, mainly because I didn't undertake a large number of trials under various system loads, I attempted fair measurements by (A) always running the 3 preprocessors pic(1), eqn(1), and tbl(1) on _all_ input documents even though this is pointless most of the time (only tbl(1) sees use more than rarely), and (B) formatting both with and without operation of the output driver grotty(1) in the pipeline, in case "cheating" by having groff(1) discard its standard output stream artificially deflated the time consumption. It appears not to have. 
The bottom line is that, even on BSD systems (where mdoc(7) is preferred to man(7)), a user can expect a man page to render from *roff source in less than, say, half a second in the worst case, and the median GNU/Linux user can expect to start reading a man page "instantaneously": Human subjects need a minimum of about 0.1 second of visual experience or about .01 to .02 second of auditory experience to perceive duration; any shorter experiences are called instantaneous. -- Encyclopædia Britannica[2] My findings are attached. I'll respond to the "uninstall-info" topic in a separate subthread. Regards, Branden [1] Once upon a time, Unix time-sharing systems had to support shell sessions originating from a wide variety of terminals; at Purdue, I never saw a real DEC VT in use (to my regret), but plenty of Zenith Z29s, Wyse 50s, Sun SPARC IPCs in console mode, and the occasional really retro Lear Siegler ADM-5. [2] https://www.britannica.com/science/time-perception/Perceived-duration

[-- Attachment #1.2: TIMING --]
[-- Type: text/plain, Size: 4186 bytes --]

for m in $(find -name "*.[157]" | sort); do echo; echo $m; time ./test-groff -Ez -pet -mandoc $m; done

./contrib/chem/chem.1 real 0m0.039s user 0m0.043s sys 0m0.000s
./contrib/eqn2graph/eqn2graph.1 real 0m0.025s user 0m0.028s sys 0m0.000s
./contrib/gdiffmk/gdiffmk.1 real 0m0.026s user 0m0.023s sys 0m0.007s
./contrib/glilypond/glilypond.1 real 0m0.032s user 0m0.036s sys 0m0.000s
./contrib/gperl/gperl.1 real 0m0.028s user 0m0.031s sys 0m0.000s
./contrib/gpinyin/gpinyin.1 real 0m0.027s user 0m0.028s sys 0m0.002s
./contrib/grap2graph/grap2graph.1 real 0m0.025s user 0m0.026s sys 0m0.002s
./contrib/hdtbl/groff_hdtbl.7 real 0m0.035s user 0m0.032s sys 0m0.006s
./contrib/mm/groff_mm.7 real 0m0.087s user 0m0.092s sys 0m0.009s
./contrib/mm/groff_mmse.7 real 0m0.025s user 0m0.028s sys 0m0.000s
./contrib/mm/mmroff.1 real 0m0.024s user 0m0.018s sys 0m0.010s
./contrib/mom/groff_mom.7 real 0m0.058s user 0m0.053s sys 0m0.010s
./contrib/pdfmark/pdfroff.1 real 0m0.033s user 0m0.036s sys 0m0.000s
./contrib/pic2graph/pic2graph.1 real 0m0.026s user 0m0.029s sys 0m0.000s
./contrib/rfc1345/groff_rfc1345.7 real 0m0.026s user 0m0.026s sys 0m0.004s
./man/groff.7 real 0m0.099s user 0m0.110s sys 0m0.000s
./man/groff_char.7 real 0m0.086s user 0m0.109s sys 0m0.000s
./man/groff_diff.7 real 0m0.082s user 0m0.081s sys 0m0.010s
./man/groff_font.5 real 0m0.033s user 0m0.037s sys 0m0.000s
./man/groff_out.5 real 0m0.042s user 0m0.041s sys 0m0.005s
./man/groff_tmac.5 real 0m0.037s user 0m0.035s sys 0m0.006s
./man/roff.7 real 0m0.047s user 0m0.052s sys 0m0.000s
./src/devices/grodvi/grodvi.1 real 0m0.029s user 0m0.030s sys 0m0.002s
./src/devices/grohtml/grohtml.1 real 0m0.030s user 0m0.029s sys 0m0.004s
./src/devices/grolbp/grolbp.1 real 0m0.029s user 0m0.027s sys 0m0.006s
./src/devices/grolj4/grolj4.1 real 0m0.033s user 0m0.036s sys 0m0.000s
./src/devices/gropdf/gropdf.1 real 0m0.041s user 0m0.045s sys 0m0.000s
./src/devices/gropdf/pdfmom.1 real 0m0.025s user 0m0.028s sys 0m0.000s
./src/devices/grops/grops.1 real 0m0.045s user 0m0.049s sys 0m0.000s
./src/devices/grotty/grotty.1 real 0m0.031s user 0m0.032s sys 0m0.002s
./src/devices/xditview/gxditview.1 real 0m0.035s user 0m0.036s sys 0m0.002s
./src/preproc/eqn/eqn.1 real 0m0.047s user 0m0.052s sys 0m0.000s
./src/preproc/eqn/neqn.1 real 0m0.024s user 0m0.025s sys 0m0.002s
./src/preproc/grn/grn.1 real 0m0.036s user 0m0.030s sys 0m0.010s
./src/preproc/pic/pic.1 real 0m0.036s user 0m0.040s sys 0m0.000s
./src/preproc/preconv/preconv.1 real 0m0.028s user 0m0.031s sys 0m0.000s
./src/preproc/refer/refer.1 real 0m0.051s user 0m0.047s sys 0m0.008s
./src/preproc/soelim/soelim.1 real 0m0.026s user 0m0.030s sys 0m0.000s
./src/preproc/tbl/tbl.1 real 0m0.043s user 0m0.047s sys 0m0.002s
./src/roff/groff/groff.1 real 0m0.050s user 0m0.053s sys 0m0.002s
./src/roff/nroff/nroff.1 real 0m0.026s user 0m0.025s sys 0m0.004s
./src/roff/troff/troff.1 real 0m0.035s user 0m0.037s sys 0m0.002s
./src/utils/addftinfo/addftinfo.1 real 0m0.025s user 0m0.028s sys 0m0.000s
./src/utils/afmtodit/afmtodit.1 real 0m0.029s user 0m0.030s sys 0m0.002s
./src/utils/grog/grog.1 real 0m0.028s user 0m0.028s sys 0m0.004s
./src/utils/hpftodit/hpftodit.1 real 0m0.030s user 0m0.033s sys 0m0.000s
./src/utils/indxbib/indxbib.1 real 0m0.029s user 0m0.026s sys 0m0.006s
./src/utils/lkbib/lkbib.1 real 0m0.027s user 0m0.030s sys 0m0.000s
./src/utils/lookbib/lookbib.1 real 0m0.026s user 0m0.028s sys 0m0.002s
./src/utils/pfbtops/pfbtops.1 real 0m0.025s user 0m0.017s sys 0m0.011s
./src/utils/tfmtodit/tfmtodit.1 real 0m0.027s user 0m0.026s sys 0m0.004s
./src/utils/xtotroff/xtotroff.1 real 0m0.025s user 0m0.022s sys 0m0.006s
./tmac/groff_man.7 real 0m0.049s user 0m0.043s sys 0m0.012s
./tmac/groff_man_style.7 real 0m0.066s user 0m0.070s sys 0m0.004s
./tmac/groff_mdoc.7 real 0m0.379s user 0m0.379s sys 0m0.010s
./tmac/groff_me.7 real 0m0.044s user 0m0.039s sys 0m0.013s
./tmac/groff_ms.7 real 0m0.065s user 0m0.060s sys 0m0.013s
./tmac/groff_trace.7 real 0m0.027s user 0m0.026s sys 0m0.004s
./tmac/groff_www.7 real 0m0.035s user 0m0.030s sys 0m0.009s

[-- Attachment #1.3: TIMING2 --]
[-- Type: text/plain, Size: 4203 bytes --]

for m in $(find -name "*.[157]" | sort); do echo; echo $m; time ./test-groff -E -pet -mandoc -Tutf8 $m >/dev/null; done

./contrib/chem/chem.1 real 0m0.051s user 0m0.062s sys 0m0.008s
./contrib/eqn2graph/eqn2graph.1 real 0m0.018s user 0m0.019s sys 0m0.006s
./contrib/gdiffmk/gdiffmk.1 real 0m0.018s user 0m0.026s sys 0m0.000s
./contrib/glilypond/glilypond.1 real 0m0.027s user 0m0.038s sys 0m0.000s
./contrib/gperl/gperl.1 real 0m0.021s user 0m0.021s sys 0m0.009s
./contrib/gpinyin/gpinyin.1 real 0m0.020s user 0m0.026s sys 0m0.002s
./contrib/grap2graph/grap2graph.1 real 0m0.018s user 0m0.025s sys 0m0.000s
./contrib/hdtbl/groff_hdtbl.7 real 0m0.031s user 0m0.044s sys 0m0.002s
./contrib/mm/groff_mm.7 real 0m0.089s user 0m0.129s sys 0m0.004s
./contrib/mm/groff_mmse.7 real 0m0.018s user 0m0.026s sys 0m0.000s
./contrib/mm/mmroff.1 real 0m0.023s user 0m0.029s sys 0m0.002s
./contrib/mom/groff_mom.7 real 0m0.067s user 0m0.093s sys 0m0.000s
./contrib/pdfmark/pdfroff.1 real 0m0.033s user 0m0.040s sys 0m0.006s
./contrib/pic2graph/pic2graph.1 real 0m0.020s user 0m0.022s sys 0m0.007s
./contrib/rfc1345/groff_rfc1345.7 real 0m0.021s user 0m0.028s sys 0m0.001s
./man/groff.7 real 0m0.116s user 0m0.169s sys 0m0.000s
./man/groff_char.7 real 0m0.069s user 0m0.111s sys 0m0.000s
./man/groff_diff.7 real 0m0.093s user 0m0.134s sys 0m0.002s
./man/groff_font.5 real 0m0.029s user 0m0.031s sys 0m0.011s
./man/groff_out.5 real 0m0.042s user 0m0.058s sys 0m0.002s
./man/groff_tmac.5 real 0m0.034s user 0m0.044s sys 0m0.005s
./man/roff.7 real 0m0.049s user 0m0.075s sys 0m0.000s
./src/devices/grodvi/grodvi.1 real 0m0.023s user 0m0.031s sys 0m0.002s
./src/devices/grohtml/grohtml.1 real 0m0.025s user 0m0.025s sys 0m0.011s
./src/devices/grolbp/grolbp.1 real 0m0.022s user 0m0.030s sys 0m0.002s
./src/devices/grolj4/grolj4.1 real 0m0.027s user 0m0.033s sys 0m0.006s
./src/devices/gropdf/gropdf.1 real 0m0.046s user 0m0.063s sys 0m0.002s
./src/devices/gropdf/pdfmom.1 real 0m0.018s user 0m0.021s sys 0m0.004s
./src/devices/grops/grops.1 real 0m0.038s user 0m0.055s sys 0m0.000s
./src/devices/grotty/grotty.1 real 0m0.025s user 0m0.033s sys 0m0.004s
./src/devices/xditview/gxditview.1 real 0m0.026s user 0m0.028s sys 0m0.009s
./src/preproc/eqn/eqn.1 real 0m0.039s user 0m0.055s sys 0m0.000s
./src/preproc/eqn/neqn.1 real 0m0.018s user 0m0.021s sys 0m0.002s
./src/preproc/grn/grn.1 real 0m0.032s user 0m0.042s sys 0m0.004s
./src/preproc/pic/pic.1 real 0m0.032s user 0m0.046s sys 0m0.000s
./src/preproc/preconv/preconv.1 real 0m0.028s user 0m0.039s sys 0m0.001s
./src/preproc/refer/refer.1 real 0m0.050s user 0m0.067s sys 0m0.004s
./src/preproc/soelim/soelim.1 real 0m0.019s user 0m0.028s sys 0m0.000s
./src/preproc/tbl/tbl.1 real 0m0.040s user 0m0.061s sys 0m0.000s
./src/roff/groff/groff.1 real 0m0.048s user 0m0.067s sys 0m0.002s
./src/roff/nroff/nroff.1 real 0m0.020s user 0m0.028s sys 0m0.000s
./src/roff/troff/troff.1 real 0m0.032s user 0m0.044s sys 0m0.000s
./src/utils/addftinfo/addftinfo.1 real 0m0.019s user 0m0.024s sys 0m0.002s
./src/utils/afmtodit/afmtodit.1 real 0m0.023s user 0m0.021s sys 0m0.012s
./src/utils/grog/grog.1 real 0m0.026s user 0m0.031s sys 0m0.005s
./src/utils/hpftodit/hpftodit.1 real 0m0.026s user 0m0.036s sys 0m0.000s
./src/utils/indxbib/indxbib.1 real 0m0.021s user 0m0.029s sys 0m0.000s
./src/utils/lkbib/lkbib.1 real 0m0.019s user 0m0.020s sys 0m0.006s
./src/utils/lookbib/lookbib.1 real 0m0.019s user 0m0.025s sys 0m0.000s
./src/utils/pfbtops/pfbtops.1 real 0m0.019s user 0m0.021s sys 0m0.004s
./src/utils/tfmtodit/tfmtodit.1 real 0m0.023s user 0m0.028s sys 0m0.004s
./src/utils/xtotroff/xtotroff.1 real 0m0.020s user 0m0.021s sys 0m0.007s
./tmac/groff_man.7 real 0m0.044s user 0m0.061s sys 0m0.002s
./tmac/groff_man_style.7 real 0m0.068s user 0m0.098s sys 0m0.004s
./tmac/groff_mdoc.7 real 0m0.383s user 0m0.418s sys 0m0.006s
./tmac/groff_me.7 real 0m0.031s user 0m0.033s sys 0m0.013s
./tmac/groff_ms.7 real 0m0.059s user 0m0.082s sys 0m0.005s
./tmac/groff_trace.7 real 0m0.019s user 0m0.023s sys 0m0.005s
./tmac/groff_www.7 real 0m0.026s user 0m0.036s sys 0m0.002s

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 73+ messages in thread
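Editorial sketch: the two timing attachments are easier to compare once the slowest page is picked out mechanically. The function below is a minimal illustration, not something posted in the thread; the name slowest_page is hypothetical, and it assumes the input is the output of a loop like the one in the attachments (a path beginning with "./", later followed by the word "real" and a value such as 0m0.379s).

```sh
#!/bin/sh
# slowest_page: read "time" output for many pages on stdin and print the
# page with the largest "real" time, in seconds.
# Hypothetical helper; the input layout is assumed to match the attachments.
slowest_page() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            if (want) {
                # Field after "real": minutes and seconds, e.g. 0m0.379s
                split($i, t, "m")
                sub(/s$/, "", t[2])
                secs = t[1] * 60 + t[2]
                if (secs > max) { max = secs; slow = page }
                want = 0
            } else if ($i == "real") {
                want = 1
            } else if ($i ~ /^\.\//) {
                page = $i          # remember the page being timed
            }
        }
    }
    END { printf "%s %.3f\n", slow, max }'
}
```

Because the parser walks fields rather than lines, it accepts the raw multi-line output of time as well as text where each record has been joined onto one line.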
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 14:43 ` man page rendering speed (was: Playground pager lsp(1)) G. Branden Robinson @ 2023-04-07 15:06 ` Eli Zaretskii 2023-04-07 15:08 ` Larry McVoy ` (2 more replies) 2023-04-07 16:08 ` Colin Watson 2023-04-08 11:24 ` Ralph Corderoy 2 siblings, 3 replies; 73+ messages in thread From: Eli Zaretskii @ 2023-04-07 15:06 UTC (permalink / raw) To: G. Branden Robinson Cc: alx.manpages, dirk, cjwatson, linux-man, help-texinfo, groff > Date: Fri, 7 Apr 2023 09:43:19 -0500 > From: "G. Branden Robinson" <g.branden.robinson@gmail.com> > Cc: alx.manpages@gmail.com, dirk@gouders.net, cjwatson@debian.org, > linux-man@vger.kernel.org, help-texinfo@gnu.org, groff@gnu.org > > ...which brings me to the other factor, of which I'm more confident: man > page rendering times are much lower than they were in Unix's early days. > > On my system, all groff man pages but one render in between a tenth and > a fortieth of a second. The really huge pages like groff(7), > groff_char(7), and groff_diff(7) are toward the upper end of this range, > because they are long, at ~20-25 U.S. letter pages when formatted for > PostScript or PDF, or have many large tables so the tbl(1) preprocessor > produces a lot of output. > > The outlier is groff_mdoc(7) at just over one-third of a second. Some people consider 0.1 sec, let alone 0.3 sec, to be long enough to be annoying. Also, did you try with libpng.3 or gcc.1? > Human subjects need a minimum of about 0.1 second of visual experience > or about .01 to .02 second of auditory experience to perceive > duration; any shorter experiences are called instantaneous. > -- Encyclopædia Britannica[2] IME, 0.05 sec of visual experiences is closer to reality. Anyway, I won't argue. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 15:06 ` Eli Zaretskii @ 2023-04-07 15:08 ` Larry McVoy 2023-04-07 17:07 ` man page rendering speed Ingo Schwarze 2023-04-07 19:04 ` man page rendering speed (was: Playground pager lsp(1)) Alejandro Colomar 2 siblings, 0 replies; 73+ messages in thread From: Larry McVoy @ 2023-04-07 15:08 UTC (permalink / raw) To: Eli Zaretskii Cc: G. Branden Robinson, alx.manpages, dirk, cjwatson, linux-man, help-texinfo, groff On Fri, Apr 07, 2023 at 06:06:39PM +0300, Eli Zaretskii wrote: > > Date: Fri, 7 Apr 2023 09:43:19 -0500 > > From: "G. Branden Robinson" <g.branden.robinson@gmail.com> > > Cc: alx.manpages@gmail.com, dirk@gouders.net, cjwatson@debian.org, > > linux-man@vger.kernel.org, help-texinfo@gnu.org, groff@gnu.org > > > > ...which brings me to the other factor, of which I'm more confident: man > > page rendering times are much lower than they were in Unix's early days. > > > > On my system, all groff man pages but one render in between a tenth and > > a fortieth of a second. The really huge pages like groff(7), > > groff_char(7), and groff_diff(7) are toward the upper end of this range, > > because they are long, at ~20-25 U.S. letter pages when formatted for > > PostScript or PDF, or have many large tables so the tbl(1) preprocessor > > produces a lot of output. > > > > The outlier is groff_mdoc(7) at just over one-third of a second. > > Some people consider 0.1 sec, let alone 0.3 sec, to be long enough to > be annoying. True but try and balance that with what they are trying to do, clean things up. I'm retired so my opinion doesn't count but I think they are on the right path. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: man page rendering speed 2023-04-07 15:06 ` Eli Zaretskii 2023-04-07 15:08 ` Larry McVoy @ 2023-04-07 17:07 ` Ingo Schwarze 2023-04-07 19:04 ` man page rendering speed (was: Playground pager lsp(1)) Alejandro Colomar 2 siblings, 0 replies; 73+ messages in thread From: Ingo Schwarze @ 2023-04-07 17:07 UTC (permalink / raw) To: Eli Zaretskii Cc: g.branden.robinson, alx.manpages, dirk, cjwatson, linux-man, help-texinfo, groff Hi Eli, Eli Zaretskii wrote on Fri, Apr 07, 2023 at 06:06:39PM +0300: > G. Branden Robinson wrote on Date: Fri, 7 Apr 2023 09:43:19 -0500 >> ...which brings me to the other factor, of which I'm more confident: man >> page rendering times are much lower than they were in Unix's early days. >> >> On my system, all groff man pages but one render in between a tenth and >> a fortieth of a second. The really huge pages like groff(7), >> groff_char(7), and groff_diff(7) are toward the upper end of this range, >> because they are long, at ~20-25 U.S. letter pages when formatted for >> PostScript or PDF, or have many large tables so the tbl(1) preprocessor >> produces a lot of output. >> >> The outlier is groff_mdoc(7) at just over one-third of a second. > Some people consider 0.1 sec, let alone 0.3 sec, to be long enough to > be annoying. > > Also, did you try with libpng.3 or gcc.1? For what it's worth, on my notebook the largest page is ffmpeg-all(1) at about 1.6 Megabyte man(1) source code, 42k lines, 182k words, 1.65 Megabyte rendered to UTF-8 terminal output. Rendering that beast takes three and a half seconds on my notebook with groff and two thirds of a second with mandoc(1), i.e. mandoc is more than five times faster on this page than groff. The largest mdoc(7) page here happens to be openssl(1) at 193 Kilobyte of mdoc(7) source code, 5k lines, 27k words, 265 Kilobyte of UTF-8 terminal output in rendered form. 
It takes 1.3 seconds with groff and one tenth of a second with mandoc, so mandoc is faster by a factor of thirteen in this case.
In general, the speed gain of mandoc is much larger for mdoc(7) than for man(7) input because mandoc refrains from using recursion in the implementation of the mdoc(7) language. Relative speed gains also tend to be larger for large pages than for small ones, so these factors of five and thirteen are on the upper end of the spectrum. Then again, who cares about rendering speeds for small pages, apart from Michael Stapelberg when he pre-renders stuff he is planning to serve on manpages.debian.org? In fact, speed was among the design goals of mandoc when development started about 15 years ago (though the goal was secondary to the goals of BSD licensing, ease of use, and security, and in the meantime, the goal of high-quality HTML output has also become more important). Consequently, people who highly value speed in manual page display might consider mandoc as an option for a manual page searching, formatting and display system. Several Linux distributions nowadays offer the configuration option of using it out of the box (including Fedora, openSUSE and Arch), and some even use it by default (including Alpine, Void, illumos and, of course, almost all BSD systems). Of course, it is *not* a replacement for groff. Mandoc only provides rather poor PDF output and it can only format manual pages, not general-purpose roff(7) documents. Yours, Ingo ^ permalink raw reply [flat|nested] 73+ messages in thread
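Editorial sketch: the two speed factors quoted in this message follow directly from the stated timings. The helper below just does the division; the name speedup is hypothetical, and "two thirds of a second" is approximated as 0.667 s.

```sh
#!/bin/sh
# speedup SLOW FAST: print how many times faster FAST is than SLOW.
# Hypothetical helper for checking the ratios quoted in the message.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speedup 3.5 0.667   # ffmpeg-all(1): groff vs. mandoc, a bit over five
speedup 1.3 0.1     # openssl(1): groff vs. mandoc, thirteen
```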
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 15:06 ` Eli Zaretskii 2023-04-07 15:08 ` Larry McVoy 2023-04-07 17:07 ` man page rendering speed Ingo Schwarze @ 2023-04-07 19:04 ` Alejandro Colomar 2023-04-07 19:28 ` Gavin Smith 2 siblings, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-07 19:04 UTC (permalink / raw) To: Eli Zaretskii, G. Branden Robinson Cc: dirk, cjwatson, linux-man, help-texinfo, groff [-- Attachment #1.1: Type: text/plain, Size: 4143 bytes --] Hi! On 4/7/23 17:06, Eli Zaretskii wrote: >> Date: Fri, 7 Apr 2023 09:43:19 -0500 >> From: "G. Branden Robinson" <g.branden.robinson@gmail.com> >> Cc: alx.manpages@gmail.com, dirk@gouders.net, cjwatson@debian.org, >> linux-man@vger.kernel.org, help-texinfo@gnu.org, groff@gnu.org >> >> ...which brings me to the other factor, of which I'm more confident: man >> page rendering times are much lower than they were in Unix's early days. >> >> On my system, all groff man pages but one render in between a tenth and >> a fortieth of a second. The really huge pages like groff(7), >> groff_char(7), and groff_diff(7) are toward the upper end of this range, >> because they are long, at ~20-25 U.S. letter pages when formatted for >> PostScript or PDF, or have many large tables so the tbl(1) preprocessor >> produces a lot of output. >> >> The outlier is groff_mdoc(7) at just over one-third of a second. > > Some people consider 0.1 sec, let alone 0.3 sec, to be long enough to > be annoying. > > Also, did you try with libpng.3 or gcc.1? $ time man -w gcc | xargs zcat | groff -man -Tutf8 2>/dev/null >/dev/null real 0m0.406s user 0m0.534s sys 0m0.042s But as others said, I don't really care about the time it takes to format the entire document, but rather the first 24 lines, which is more like instantaneous (per your own definition of ~0.5 s). 
$ time man -w gcc | xargs zcat | groff -man -Tutf8 2>/dev/null | head -n24 >/dev/null xargs: zcat: terminated by signal 13 real 0m0.064s user 0m0.051s sys 0m0.030s As a curiosity, mandoc(1) seems to be faster for rendering the entire document, but slower to "start reading". $ time man -w gcc | xargs zcat | mandoc >/dev/null real 0m0.270s user 0m0.218s sys 0m0.057s $ time man -w gcc | xargs zcat | mandoc | head -n24 >/dev/null real 0m0.136s user 0m0.119s sys 0m0.023s As a disclaimer, I do sometimes care about reading entire documents, but even in that case, it's not so bad. I can read the few thousand man pages in the Linux man-pages in about a few seconds, or a minute. [1] > >> Human subjects need a minimum of about 0.1 second of visual experience >> or about .01 to .02 second of auditory experience to perceive >> duration; any shorter experiences are called instantaneous. >> -- Encyclopædia Britannica[2] > > IME, 0.05 sec of visual experiences is closer to reality. This is the time to load the first 24 lines of almost any page. gcc(1), which is one of the longest I have, takes 0.6 s. MAX(3), which is one of the shortest I have, takes 0.4 s. > > Anyway, I won't argue. Cheers, Alex [1]: Here's why I do care about time to load entire pages. I know I can optimize this pipeline by calling groff(1) directly, or even better, mandoc(1), now that I know it's faster for entire docs, but since I haven't used this function for a long time, I didn't spend time optimizing it.
man_lsfunc()
{
    if [ $# -lt 1 ]; then
        >&2 echo "Usage: ${FUNCNAME[0]} <manpage|manNdir>...";
        return $EX_USAGE;
    fi

    for arg in "$@"; do
        man_section "$arg" 'SYNOPSIS';
    done \
    |sed_rm_ccomments \
    |pcregrep -Mn '(?s)^ [\w ]+ \**\w+\([\w\s(,)[\]*]*?(...)?\s*\); *$' \
    |grep '^[0-9]' \
    |sed -E 's/syscall\(SYS_(\w*),?/\1(/' \
    |sed -E 's/^[^(]+ \**(\w+)\(.*/\1/' \
    |uniq;
}

man_section()
{
    if [ $# -lt 2 ]; then
        >&2 echo "Usage: ${FUNCNAME[0]} <dir> <section>...";
        return $EX_USAGE;
    fi

    local page="$1";
    shift;
    local sect="$*";

    find "$page" -type f \
    |xargs wc -l \
    |grep -v -e '\b1 ' -e '\btotal\b' \
    |awk '{ print $2 }' \
    |sort \
    |while read -r manpage; do
        (sed -n '/^\.TH/,/^\.SH/{/^\.SH/!p}' <"$manpage";
         for s in $sect; do
            <"$manpage" \
            sed -n \
                -e "/^\.SH $s/p" \
                -e "/^\.SH $s/,/^\.SH/{/^\.SH/!p}";
         done;) \
        |man -P cat -l - 2>/dev/null;
    done;
}

man_lsfunc() is quite slow, but it's acceptable to me, since I only run it sporadically.

-- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 19:04 ` man page rendering speed (was: Playground pager lsp(1)) Alejandro Colomar @ 2023-04-07 19:28 ` Gavin Smith 2023-04-07 20:43 ` Alejandro Colomar 0 siblings, 1 reply; 73+ messages in thread From: Gavin Smith @ 2023-04-07 19:28 UTC (permalink / raw) To: Alejandro Colomar Cc: Eli Zaretskii, G. Branden Robinson, dirk, cjwatson, linux-man, help-texinfo, groff On Fri, Apr 07, 2023 at 09:04:03PM +0200, Alejandro Colomar wrote: > $ time man -w gcc | xargs zcat | groff -man -Tutf8 2>/dev/null >/dev/null > > real 0m0.406s > user 0m0.534s > sys 0m0.042s > > But as others said, I don't really care about the time it takes to format > the entire document, but rather the first 24 lines, which is more like > instantaneous (per your own definition of ~0.5 s). Here's a sample comparison of "man" versus "info" on my system (relevant as help-texinfo@gnu.org is being copied into this discussion): $ time info gcc > temp real 0m0.112s user 0m0.085s sys 0m0.017s $ ls -l temp -rw-rw-r-- 1 g g 3.0M Apr 7 20:14 temp $ time man gcc > temp troff: <standard input>:11612: warning [p 111, 6.0i]: can't break line troff: <standard input>:11660: warning [p 111, 13.8i]: can't break line real 0m0.620s user 0m1.004s sys 0m0.114s $ ls -l temp -rw-rw-r-- 1 g g 1.2M Apr 7 20:16 temp I find the startup of "info" to be instantaneous, whereas man pages often have a noticeable delay. Doubtless man would have more comparable runtimes were cat pages being used. Being able to reformat the text for arbitrary widths is of limited use, in my opinion, as text becomes more unreadable at long line lengths. I suppose cat pages could be provided in a series of sensible widths. (The same is true in theory for Info, but I've never heard of anybody using widths for Info output other than the default 72 columns.) ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 19:28 ` Gavin Smith @ 2023-04-07 20:43 ` Alejandro Colomar 0 siblings, 0 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-07 20:43 UTC (permalink / raw) To: Gavin Smith Cc: Eli Zaretskii, G. Branden Robinson, dirk, cjwatson, linux-man, help-texinfo, groff, Ingo Schwarze [-- Attachment #1.1: Type: text/plain, Size: 2868 bytes --] Hi Gavin, On 4/7/23 21:28, Gavin Smith wrote: > On Fri, Apr 07, 2023 at 09:04:03PM +0200, Alejandro Colomar wrote: >> $ time man -w gcc | xargs zcat | groff -man -Tutf8 2>/dev/null >/dev/null >> >> real 0m0.406s >> user 0m0.534s >> sys 0m0.042s >> >> But as others said, I don't really care about the time it takes to format >> the entire document, but rather the first 24 lines, which is more like >> instantaneous (per your own definition of ~0.5 s). > > Here's a sample comparison of "man" versus "info" on my system > (relevant as help-texinfo@gnu.org is being copied into this > discussion): > > $ time info gcc > temp > > real 0m0.112s > user 0m0.085s > sys 0m0.017s > $ ls -l temp > -rw-rw-r-- 1 g g 3.0M Apr 7 20:14 temp > $ time man gcc > temp > troff: <standard input>:11612: warning [p 111, 6.0i]: can't break line > troff: <standard input>:11660: warning [p 111, 13.8i]: can't break line > > real 0m0.620s > user 0m1.004s > sys 0m0.114s > $ ls -l temp > -rw-rw-r-- 1 g g 1.2M Apr 7 20:16 temp > > I find the startup of "info" to be instantaneous, whereas man pages often > have a noticeable delay. The times you showed are not _startup_ times, but rather the time for formatting the _entire_ documents. Remember that less(1) already shows you the first lines when they are ready, without waiting for the rest of the pipe. 
A moment ago I optimized the functions I had for listing all the functions that appear in the Linux man-pages' SYNOPSIS sections, and got it down from 55 s (calling man(1)) to just 14 s (calling groff(1)) and further to 4 s (calling mandoc(1)). That's parsing around a thousand pages, extracting the SYNOPSIS with sed(1), formatting it, and parsing that to find function prototypes. I guess that's one of the worst cases of when one would care about the time it takes to format a man page, and it's a very reasonable one. > > Doubtless man would have more comparable runtimes were cat pages being used. The startup times don't really change. It's around 0.5 s. However, the time to show the entire page is the same (i.e., virtually all the time is spent in finding and opening the page). > > Being able to reformat the text for arbitrary widths is of limited use, > in my opinion, as text becomes more unreadable at long line lengths. I often want it for the opposite reason: I want to make the terminal narrower (e.g., for pasting contents into an email, at 72 or 66 columns). > I > suppose cat pages could be provided in a series of sensible widths. (The > same is true in theory for Info, but I've never heard of anybody using > widths for Info output other than the default 72 columns.) Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 14:43 ` man page rendering speed (was: Playground pager lsp(1)) G. Branden Robinson 2023-04-07 15:06 ` Eli Zaretskii @ 2023-04-07 16:08 ` Colin Watson 2023-04-08 11:24 ` Ralph Corderoy 2 siblings, 0 replies; 73+ messages in thread From: Colin Watson @ 2023-04-07 16:08 UTC (permalink / raw) To: G. Branden Robinson Cc: Eli Zaretskii, alx.manpages, dirk, linux-man, help-texinfo, groff On Fri, Apr 07, 2023 at 09:43:19AM -0500, G. Branden Robinson wrote: > At 2023-04-07T09:36:10+0300, Eli Zaretskii wrote: > > > From: "G. Branden Robinson" <g.branden.robinson@gmail.com> > [re-running *roff when a viewing a man page and resizing the terminal] > > > Seems like it shouldn't be impossible to me, but what I imagine > > > would require a little reëngineering of man(1), perhaps to spawn a > > > little custom program to manage zcat/nroff pipeline it constructs. > > > This little program's sole job could be to be aware of this pipeline > > > and listen for SIGWINCH; if it happens, kill the rest of the > > > pipeline and reëxecute it. I didn't see the rest of the thread, but one significant complexity here would be interacting with the pager to arrange for the viewing position to be returned to where it was pre-SIGWINCH; bear in mind that the pager is user-configurable (less(1) is common but not universal) and isn't directly part of man(1). > > This should be possible, but it flies in the face of the feature > > whereby formatted man pages are kept for future perusal, which is > > therefore faster: > > You're referring to cat pages. As far as I know, these are on their way > out if not already gone. 
Colin Watson, who has maintained the man-db > implementation of man(1)[1] for something like 20 years, can speak more > authoritatively to this, but as I understand it, the advent of resizable > xterm windows started to kill the utility of cat pages decades ago and > the increasing importance of desktop environments accelerated their > demise. Another major change in that period was the general though gradual move to UTF-8, making it somewhat unclear for some time which encoding should be preferred when rendering cat pages. (Since 2010, man-db always saves cat pages in UTF-8 and converts to the proper encoding at display time, but it took a while to settle on this approach and in the meantime there were perhaps four or five years when cat pages were commonly unavailable in practice. Even then, very few people cared enough to complain.) Furthermore, the traditional approach to saving system-wide cat pages involved having man(1) be set-id. From a modern standpoint, this was obviously problematic, and it caused both security vulnerabilities and more ordinary bugs. There are ways in which this might have been rearranged to be less of a serious problem, but if you can avoid bothering with set-id at all then that's clearly safer. My general approach to cat pages for at least the last ten years has been to put as little effort into them as possible. This has so far included not outright removing support for them (since dealing with the resulting support load, even if small, would itself be effort), but if an improvement to man(1) has some kind of degradation of cat pages as a side-effect then I usually won't hesitate to make it anyway. > ...which brings me to the other factor, of which I'm more confident: man > page rendering times are much lower than they were in Unix's early days. Indeed, and it's been the case for at least a decade that rendering times have been short enough that they can largely be considered negligible. 
(For most of that time my own equipment has not been particularly on the bleeding edge of performance.) > The bottom line is that, even on BSD systems (where mdoc(7) is preferred > to man(7)), a user can expect a man page to render from *roff source in > less than, say, half a second in the worst case, and the median > GNU/Linux user can expect to start reading a man page "instantaneously": The other thing to note explicitly here is that what normally matters most is the time to _start_ reading, not the time to render the whole page. My usual example for where this makes a difference is zshall(1), which is a concatenation of several other pages and comes to about 30000 lines of 80-column rendered output; on my system this takes about 0.6 seconds to render in its entirety, but typing "man zshall" nevertheless shows the first page subjectively instantaneously. -- Colin Watson (he/him) [cjwatson@debian.org] ^ permalink raw reply [flat|nested] 73+ messages in thread
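Editorial sketch: the supervisor idea quoted at the top of this message (listen for SIGWINCH, kill the pipeline, re-execute it) can be illustrated in a few lines of shell. This is a rough sketch under stated assumptions, not how man(1) or any pager works: render_page, term_width, on_winch and supervise are hypothetical names, fold(1) stands in for the real zcat/nroff pipeline, and "stty size" (common on GNU and BSD systems, but not required by POSIX) is assumed for the terminal width.

```sh
#!/bin/sh
# Hypothetical resize-aware render loop; all function names are made up.

resized=0
on_winch() { resized=1; }          # only record that a resize happened
trap on_winch WINCH

# Format page $1 for width $2. fold(1) is a stand-in for the formatter
# so that the sketch runs without groff installed.
render_page() {
    fold -w "$2" "$1"
}

# Print the terminal width; fall back to 80 when stdin is not a terminal.
term_width() {
    set -- $(stty size 2>/dev/null)    # "rows cols" on a terminal
    echo "${2:-80}"
}

# Run renderer and pager; on SIGWINCH, kill them and start over.
# The reading position is lost on each restart, which is exactly the
# complication raised above.
supervise() {
    page=$1
    while :; do
        resized=0
        render_page "$page" "$(term_width)" | ${PAGER:-cat} &
        job=$!
        wait "$job"                    # a trapped signal interrupts wait
        [ "$resized" -eq 1 ] || break  # pager exited on its own: done
        kill "$job" 2>/dev/null        # resize: stop the old pipeline
    done
}
```

A caller would run something like supervise ./page.txt with PAGER set to an interactive pager. Note that the pager receives SIGWINCH as well, so a real implementation would have to decide which process is responsible for redrawing.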
* Re: man page rendering speed (was: Playground pager lsp(1)) 2023-04-07 14:43 ` man page rendering speed (was: Playground pager lsp(1)) G. Branden Robinson 2023-04-07 15:06 ` Eli Zaretskii 2023-04-07 16:08 ` Colin Watson @ 2023-04-08 11:24 ` Ralph Corderoy 2 siblings, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-08 11:24 UTC (permalink / raw) To: linux-man, groff; +Cc: Eli Zaretskii, alx.manpages, dirk, Colin Watson Hi Branden, > You're referring to cat pages. As far as I know, these are on their > way out if not already gone. catman must die. It was never a good solution to the problem. As well as ignoring different TERMs, it also didn't handle a user's variations to a terminal's definition. I'm glad to see Colin is open to the idea, though accept it's initial and on-going work for him. > On my system, all groff man pages but one render in between a tenth and > a fortieth of a second. Colin made the point I was going to make: how long must my eyeballs wait to be pleasured? $ strace -ttt -fe read,write -o /tmp/st man ffmpeg-all $ cat /tmp/st → 19788 1680952657.119429 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \0\0\0\0\0\0"..., 832) = 832 ... 19801 1680952658.350823 write(1, "FFMPEG-ALL(1) "..., 1023 <unfinished ...> 19801 1680952658.352054 <... write resumed>) = 1023 19801 1680952658.353074 write(1, "ified by a plain output url.\33[m\n"..., 1023 <unfinished ...> 19801 1680952658.353357 <... write resumed>) = 1023 19801 1680952658.354272 write(1, "e command line multiple times. E"..., 1023) = 1023 19801 1680952658.357171 write(1, "aw input files.\33[m\n\33[m\n\33[1mDETAI"..., 1009) = 1009 19801 1680952658.357478 read(0, "--- | encoded data | <----+\n "..., 4096) = 4096 19801 1680952658.358752 write(1, " | output | <-----"..., 1023) = 1023 19801 1680952658.359556 write(1, "peg\33[0m can process raw audio an"..., 574) = 574 → 19801 1680952658.359735 read(3, <unfinished ...> ... 19801 1680952662.323859 <... read resumed>"q", 1) = 1 ... 
$ 1680952658.359735 - 1680952657.119429 = 1.240306 strace adds a bit of overhead. $ PAGER=true time -p man ffmpeg-all real 0.99 user 1.07 sys 0.15 $ Hard to find a slower CPU. $ grep name /proc/cpuinfo | uniq -c 4 model name : Intel(R) Atom(TM) CPU D525 @ 1.80GHz -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
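The subtraction in the message above can be reproduced from the two marked trace lines. This is a minimal sketch, assuming the `<pid> <seconds.microseconds> <syscall>` line shape that the quoted `strace -f -ttt -o` output shows; the helper names `timestamp` and `elapsed` are made up for illustration. `Decimal` is used so the microsecond digits are not lost to floating-point rounding.

```python
from decimal import Decimal
import re

# Assumed line shape: "<pid> <seconds.microseconds> <syscall...>"
STAMP = re.compile(r"\b\d+\s+(\d+\.\d{6})\s")


def timestamp(line):
    """Extract the wall-clock timestamp from one trace line."""
    match = STAMP.search(line)
    if match is None:
        raise ValueError(f"no timestamp in {line!r}")
    return Decimal(match.group(1))


def elapsed(first_line, last_line):
    """Seconds between two trace lines, e.g. first read and pager idle."""
    return timestamp(last_line) - timestamp(first_line)
```

Feeding it the first `read(3, ...)` line and the pager's final `read(3, <unfinished ...>` line gives the time until the first screenful was complete.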
* reformatting man pages at SIGWINCH (was: Playground pager lsp(1)) 2023-04-07 2:18 ` Playground pager lsp(1) G. Branden Robinson 2023-04-07 6:36 ` Eli Zaretskii @ 2023-04-07 21:26 ` Alejandro Colomar 2023-04-07 22:09 ` reformatting man pages at SIGWINCH Dirk Gouders 1 sibling, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-07 21:26 UTC (permalink / raw) To: G. Branden Robinson; +Cc: Eli Zaretskii, dirk, linux-man, help-texinfo, groff [-- Attachment #1.1: Type: text/plain, Size: 2143 bytes --] Hi Branden, On 4/7/23 04:18, G. Branden Robinson wrote: > At 2023-04-06T03:10:59+0200, Alejandro Colomar wrote: >> Hmm, now that I think, it's probably an issue of coordinating man(1) >> and less(1). I sometimes wish that when I resize a window where I'm >> reading a man page, it would reformat the page from source. > > Seems like it shouldn't be impossible to me, but what I imagine would > require a little reëngineering of man(1), perhaps to spawn a little > custom program to manage zcat/nroff pipeline it constructs. This little > program's sole job could be to be aware of this pipeline and listen for > SIGWINCH; if it happens, kill the rest of the pipeline and reëxecute it. > > Maybe I thought of it this way because (I suspect) it aligns with my > vision I've expressed elsewhere of man(1) having unfortunately > aggregated two separate functions: librarian vs. renderer. > Historically, of course the latter function was almost vestigial, since > early Unix systems had no pager program and their man pages required > little to no preprocessing; man(1) slowly accreted into a larger thing. > >> Of course, that might be a problem for keeping track of where I was, >> since lines moved around. > > That seems like a harder problem to me. You'd need a way for the pager > to communicate position information back to the mini-man renderer > program I envision. Two challenges here: (1) what part of the screen > was the reader actually looking at? 
(2) how is the pager supposed to > know how to map any given location on the screen back to a place in the > unrendered source document so it can be accurately found when the > document is rerendered? These feel nearly intractable to me. But maybe > I have a poor imagination. Maybe it could be done with .SH and .SS. The heuristics to find these are simple. It wouldn't be very precise, but it could try to find the closest (only upwards) (sub)section heading. With some luck, .TP would also be helpful. Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
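A minimal sketch of the heading-based idea from the message above, working on rendered text rather than on the roff source. It assumes that .SH headings start in column 0 and that .SS headings are indented by three spaces in the rendered page; both are assumptions about typical terminal output, and the function names are hypothetical.

```python
import re

SECTION = re.compile(r"^\S")         # assumption: .SH headings start in column 0
SUBSECTION = re.compile(r"^ {3}\S")  # assumption: .SS headings are indented 3 spaces


def is_heading(line):
    return bool(SECTION.match(line) or SUBSECTION.match(line))


def nearest_heading(lines, top):
    """Return (index, text) of the closest heading at or above line `top`."""
    for i in range(min(top, len(lines) - 1), -1, -1):
        if is_heading(lines[i]):
            return i, lines[i].strip()
    return None


def reposition(old_lines, new_lines, top):
    """Map the old top-of-screen line to a heading line in the new rendering."""
    found = nearest_heading(old_lines, top)
    if found is None:
        return 0
    title = found[1]
    for i, line in enumerate(new_lines):
        if is_heading(line) and line.strip() == title:
            return i  # first heading with the same text; duplicates are not handled
    return 0
```

As noted in the message, this is imprecise: the reader lands on the heading, not on the line that was being read.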
* Re: reformatting man pages at SIGWINCH 2023-04-07 21:26 ` reformatting man pages at SIGWINCH " Alejandro Colomar @ 2023-04-07 22:09 ` Dirk Gouders 2023-04-07 22:16 ` Alejandro Colomar 2023-04-08 11:40 ` Ralph Corderoy 0 siblings, 2 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-07 22:09 UTC (permalink / raw) To: Alejandro Colomar Cc: G. Branden Robinson, Eli Zaretskii, linux-man, help-texinfo, groff Alejandro Colomar <alx.manpages@gmail.com> writes: > Hi Branden, > > On 4/7/23 04:18, G. Branden Robinson wrote: >> At 2023-04-06T03:10:59+0200, Alejandro Colomar wrote: >>> Hmm, now that I think, it's probably an issue of coordinating man(1) >>> and less(1). I sometimes wish that when I resize a window where I'm >>> reading a man page, it would reformat the page from source. >> >> Seems like it shouldn't be impossible to me, but what I imagine would >> require a little reëngineering of man(1), perhaps to spawn a little >> custom program to manage zcat/nroff pipeline it constructs. This little >> program's sole job could be to be aware of this pipeline and listen for >> SIGWINCH; if it happens, kill the rest of the pipeline and reëxecute it. >> >> Maybe I thought of it this way because (I suspect) it aligns with my >> vision I've expressed elsewhere of man(1) having unfortunately >> aggregated two separate functions: librarian vs. renderer. >> Historically, of course the latter function was almost vestigial, since >> early Unix systems had no pager program and their man pages required >> little to no preprocessing; man(1) slowly accreted into a larger thing. >> >>> Of course, that might be a problem for keeping track of where I was, >>> since lines moved around. >> >> That seems like a harder problem to me. You'd need a way for the pager >> to communicate position information back to the mini-man renderer >> program I envision. Two challenges here: (1) what part of the screen >> was the reader actually looking at? 
(2) how is the pager supposed to >> know how to map any given location on the screen back to a place in the >> unrendered source document so it can be accurately found when the >> document is rerendered? These feel nearly intractable to me. But maybe >> I have a poor imagination. > > Maybe it could be done with .SH and .SS. The heuristics to find these > are simple. It wouldn't be very precise, but it could try to find the > closest (only upwards) (sub)section heading. With some luck, .TP would > also be helpful. Yes, that should give nice results. But for manual pages like git(1) with large areas between those this becomes difficult again. Today, I experimented with one more heuristic, adjusting the current position according to the proportional change of the average line size and also the change of the (horizontal) window dimension, but none of those attempts gave better results than what I currently have implemented (staying at the position). Out of curiosity, I checked how Firefox behaves on horizontal resizes and, compared to some of those results, lsp is not the worst on earth ;-) If time allows, I want to see if working with Levenshtein distances could get exact results. Perhaps this will turn out to be too expensive, but maybe the fact that the area to be checked is limited helps... Regards, Dirk ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: reformatting man pages at SIGWINCH 2023-04-07 22:09 ` reformatting man pages at SIGWINCH Dirk Gouders @ 2023-04-07 22:16 ` Alejandro Colomar 2023-04-10 19:05 ` Dirk Gouders 2023-04-08 11:40 ` Ralph Corderoy 1 sibling, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-07 22:16 UTC (permalink / raw) To: Dirk Gouders Cc: G. Branden Robinson, Eli Zaretskii, linux-man, help-texinfo, groff [-- Attachment #1.1: Type: text/plain, Size: 1541 bytes --] Hi Dirk, On 4/8/23 00:09, Dirk Gouders wrote: >> Maybe it could be done with .SH and .SS. The heuristics to find these >> are simple. It wouldn't be very precise, but it could try to find the >> closest (only upwards) (sub)section heading. With some luck, .TP would >> also be helpful. > > Yes, that should give nice results. But for manual pages like git(1) > with large areas between those this becomes difficult, again. > > Today, I experimented with one more heuristics, adjusting the current > position according to the proportional change of avg. line size and also > change of window dimension (horizontal) but all of those didn't get better > results than what I currently implemented (stay at the position). > > Out of curiosity, I checked how firefox behaves on horizontal resizes > and comparing to some of those results, lsp is not the worst on earth ;-) > > If time allows, I want to see if working with Levenshtein distances > could get exact results. Perhaps this will turn out to be too expensive > but maybe the fact that the area to be checked is limited helps... For something simpler, you could just count words since the start of the section divided by total words in the section. That should be fast, and I expect, also quite precise. Hyphenating might work against you on this, but on average it shouldn't move you too much. 
Cheers, Alex > > Regards, > > Dirk -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
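The word-count suggestion can be sketched as follows. This is only an illustration under stated assumptions: a section is taken to start at a non-empty line whose first character is not whitespace, sections are matched between renderings by heading text, and all function names are hypothetical. The comparison is done with integers so that no rounding can move the result by a line.

```python
def is_heading(line):
    # Assumption: section headings are the lines that start in column 0.
    return bool(line) and not line[0].isspace()


def section_bounds(lines, pos):
    """Return (start, end) indices of the section that contains line `pos`."""
    start = pos
    while start > 0 and not is_heading(lines[start]):
        start -= 1
    end = pos + 1
    while end < len(lines) and not is_heading(lines[end]):
        end += 1
    return start, end


def count_words(lines):
    return sum(len(line.split()) for line in lines)


def reposition_by_words(old, new, pos):
    """Find the line in `new` holding the same share of the section's words."""
    o_start, o_end = section_bounds(old, pos)
    before = count_words(old[o_start:pos])
    o_total = count_words(old[o_start:o_end])
    title = old[o_start].strip()
    n_start = next((i for i, line in enumerate(new)
                    if is_heading(line) and line.strip() == title), None)
    if n_start is None:
        return max(0, min(pos, len(new) - 1))
    n_end = section_bounds(new, n_start)[1]
    n_total = count_words(new[n_start:n_end])
    seen = 0
    for i in range(n_start, n_end):
        seen += len(new[i].split())
        # Same test as seen / n_total > before / o_total, without division.
        if seen * o_total > before * n_total:
            return i
    return n_end - 1
```

Hyphenation changes the word totals slightly between renderings, which is why the ratio is compared rather than the raw count.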
* Re: reformatting man pages at SIGWINCH 2023-04-07 22:16 ` Alejandro Colomar @ 2023-04-10 19:05 ` Dirk Gouders 2023-04-10 19:57 ` Alejandro Colomar 2023-04-10 20:24 ` G. Branden Robinson 0 siblings, 2 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-10 19:05 UTC (permalink / raw) To: Alejandro Colomar Cc: G. Branden Robinson, Eli Zaretskii, linux-man, help-texinfo, groff Hi Alex, Alejandro Colomar <alx.manpages@gmail.com> writes: > On 4/8/23 00:09, Dirk Gouders wrote: >>> Maybe it could be done with .SH and .SS. The heuristics to find these >>> are simple. It wouldn't be very precise, but it could try to find the >>> closest (only upwards) (sub)section heading. With some luck, .TP would >>> also be helpful. >> >> Yes, that should give nice results. But for manual pages like git(1) >> with large areas between those this becomes difficult, again. >> >> Today, I experimented with one more heuristics, adjusting the current >> position according to the proportional change of avg. line size and also >> change of window dimension (horizontal) but all of those didn't get better >> results than what I currently implemented (stay at the position). >> >> Out of curiosity, I checked how firefox behaves on horizontal resizes >> and comparing to some of those results, lsp is not the worst on earth ;-) >> >> If time allows, I want to see if working with Levenshtein distances >> could get exact results. Perhaps this will turn out to be too expensive >> but maybe the fact that the area to be checked is limited helps... > > For something simpler, you could just count words since the start of the > section divided by total words in the section. That should be fast, and > I expect, also quite precise. Hyphenating might work against you on > this, but on average it shouldn't move you too much. very pragmatic -- very effective, thanks for that suggestion. 
I started by implementing a simpler version of that (without counting all the words in the section):

- Count words backwards until we reach an empty line, the section header or the beginning of the document.
- Stop if it was the section header or the beginning of the document.
- Otherwise continue, counting only empty lines, until we reach the section header or the beginning of the document.

This relies on the assumption that horizontal resizes don't create or delete empty lines, and it still has the weakness that manual pages (e.g. bash(1)) contain large areas without empty lines, but it's definitely better than just staying at the position as it was before. If it turns out to still be too weak, I could count all words between two empty lines and set that in relation to the words from the preceding empty line. But perhaps I will now learn that empty lines are by no means the constant value that I assume... Dirk ^ permalink raw reply [flat|nested] 73+ messages in thread
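The backward count described in the message above could look roughly like this. It is a sketch and not lsp's actual code: the anchor tuple, the function names, and the rule that a heading is any line starting in column 0 are all assumptions made for illustration.

```python
def is_heading(line):
    # Assumption: section headers are the lines that start in column 0.
    return bool(line) and not line[0].isspace()


def make_anchor(lines, pos):
    """Describe line `pos` as (heading, empty lines since it, words since last empty line)."""
    words = blanks = 0
    i = pos - 1
    # Phase 1: count words back to an empty line, the heading, or the start.
    while i >= 0 and lines[i].strip() and not is_heading(lines[i]):
        words += len(lines[i].split())
        i -= 1
    # Phase 2: from there on, count only empty lines.
    while i >= 0 and not is_heading(lines[i]):
        if not lines[i].strip():
            blanks += 1
        i -= 1
    heading = lines[i].strip() if i >= 0 else None
    return heading, blanks, words


def find_anchor(lines, anchor):
    """Locate the anchored position in a re-rendered page."""
    heading, blanks, words = anchor
    i = 0
    if heading is not None:
        i = next((k for k, line in enumerate(lines)
                  if is_heading(line) and line.strip() == heading), -1) + 1
    while blanks and i < len(lines):  # skip the same number of empty lines
        if not lines[i].strip():
            blanks -= 1
        i += 1
    seen = 0
    while i < len(lines) - 1:         # then walk forward by the word count
        seen += len(lines[i].split())
        if seen > words:
            break
        i += 1
    return max(0, min(i, len(lines) - 1))
```

The second phase is cheap because it never splits lines into words, which matches the motivation for not counting the whole section.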
* Re: reformatting man pages at SIGWINCH 2023-04-10 19:05 ` Dirk Gouders @ 2023-04-10 19:57 ` Alejandro Colomar 2023-04-10 20:24 ` G. Branden Robinson 1 sibling, 0 replies; 73+ messages in thread From: Alejandro Colomar @ 2023-04-10 19:57 UTC (permalink / raw) To: Dirk Gouders Cc: G. Branden Robinson, Eli Zaretskii, linux-man, help-texinfo, groff [-- Attachment #1.1: Type: text/plain, Size: 2282 bytes --] Hi Dirk, On 4/10/23 21:05, Dirk Gouders wrote: >> For something simpler, you could just count words since the start of the >> section divided by total words in the section. That should be fast, and >> I expect, also quite precise. Hyphenating might work against you on >> this, but on average it shouldn't move you too much. > > very pragmatic -- very effective, thanks for that suggestion. I > started with implementing a simpler version of that (no counting of all > words in the section): > > - Backwards count words until we reach an empty line, the section > header or the beginning of the document > > Stop if it was the section header or beginning of the document > > Continue and just count empty lines until we reach the > section header or the beginning of the document Hmmmm, good idea. $ man gcc 2>/dev/null | grep "^$" | wc -l 5462 $ man gcc 2>/dev/null | grep "^$" | wc -l 5462 $ man gcc 2>/dev/null | grep "^$" | wc -l 5464 $ man tzset 2>/dev/null | grep "^$" | wc -l 41 $ man tzset 2>/dev/null | grep "^$" | wc -l 41 $ man tzset 2>/dev/null | grep "^$" | wc -l 41 $ man bash 2>/dev/null | grep "^$" | wc -l 657 $ man bash 2>/dev/null | grep "^$" | wc -l 657 $ man bash 2>/dev/null | grep "^$" | wc -l 658 Of course there were important resizes between those invocations. > > This relies on the assumption that horizontal resizes don't create or > delete emty lines and it still has the weakness that manual pages > (e.g. bash(1)) contain large areas without empty lines but it's > definitely better than just staying at the position as it was before. 
That should give you a quite precise idea of where you were. > > If it turns out to still be too weak, I could count all words between > two empty lines and set that in relation to the words from the > preceding empty line. > > But perhaps, I now learn that empty lines are by no means that constant > value that I assume... They seem to be constant. Only with the smallest terminal size I can get does that number change, and then only by one or two over the entire page. > > Dirk Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: reformatting man pages at SIGWINCH 2023-04-10 19:05 ` Dirk Gouders 2023-04-10 19:57 ` Alejandro Colomar @ 2023-04-10 20:24 ` G. Branden Robinson 2023-04-11 9:20 ` Ralph Corderoy 2023-04-11 9:39 ` Dirk Gouders 1 sibling, 2 replies; 73+ messages in thread From: G. Branden Robinson @ 2023-04-10 20:24 UTC (permalink / raw) To: Dirk Gouders Cc: Alejandro Colomar, Eli Zaretskii, linux-man, help-texinfo, groff [-- Attachment #1: Type: text/plain, Size: 4103 bytes --] Hi Dirk, At 2023-04-10T21:05:24+0200, Dirk Gouders wrote: > This relies on the assumption that horizontal resizes don't create or > delete emty lines and it still has the weakness that manual pages > (e.g. bash(1)) contain large areas without empty lines but it's > definitely better than just staying at the position as it was before. I think this assumption should hold for man and mdoc documents rendered by a *roff--I'm not sure about mandoc(1), but it probably will for reasons I'll elaborate below. Vertical space in *roff documents might get reduced at page breaks, but not to zero, except at page breaks. There are a few reasons that I think reinforce the assumption holding: 1. man(7) and mdoc(7) don't offer macros for just sticking an arbitrary amount of vertical space into a document. If you want that, you'll need to go down to formatter requests, which is seldom done by human man page authors, but a bit more frequently by automated generators of man(7) or mdoc(7) from other formats. 2. Even in traditional *roff, if you issued an ".sp 6" request (demanding 6 blank lines), then if you were within 6 lines of a "trap" (usually a page footer trap or the actual bottom of the page), the result would be that you'd get blank lines until the trap sprung, and any excess would be thrown away. So if there were only 4 lines of distance to the page footer, the leftover two would be discarded and _not_ appear after the header of the next page.[1] 3. 
mandoc(1) and groff's man(7) and mdoc(7) macro packages both implement "continuous rendering" for terminal output. This means that they contrive to arrange for an effectively infinite page length, so there are no page breaks. (Except when you render multiple man pages at a time, a use case groff 1.23.0 will support.) Since pager programs are applicable only to terminal output in the first place, this should address your use case. (You _can_ turn off continuous rendering in groff, and see man pages as they would have formatted for Western Electric Teletype machines, which printed to long spools of paper with 66 lines to the nominal page.) 4. A habit has grown up among man(1) programs and pagers to call for and support, respectively, a "blank line squeezing" feature: any runs of more than one blank line are condensed to 1 blank line each. In groff 1.23.0, this will no longer be necessary when continuously rendering. (Historically, this squeezing feature was used to "tighten up" vertical space after the page header, prior to the "NAME" section heading of the document.) In my opinion, pager programs should perform as few transformations as possible on the output of grotty(1), the groff output driver that supports terminal devices. The long-time author and maintainer of less(1) does not agree, so you have to call that program with its "-R" flag to view grotty(1) output as groff intends it. (To see what those intentions are, format the document without paging it.) > If it turns out to still be too weak, I could count all words between > two empty lines and set that in relation to the words from the > preceeding empty line. You might do this only as a fallback, if there were no blank lines on the screen at the old window size when the resizing event happened. > But perhaps, I now learn that empty lines are by no means that > constant value that I assume... In my opinion, the presence or absence of a single blank line in formatted output is important. 
groff 1.23.0 will feature some bug fixes with respect to their handling within and adjacent to tbl(1) input.[2] Since I flogged groff 1.23.0 three times in this email, I suppose I should point people to where they can get the 1.23.0.rc3 release candidate source archive. Feedback would be appreciated. https://alpha.gnu.org/gnu/groff/ Regards, Branden

[1] For example, give the following input to "nroff | cat -n".

--snip--
.pl 10v
.nf
The page length is set to 10 vees.
2
3
4
5
Asking for 6 vees of space now.
.sp 6
How many appeared?
--end snip--

[2] https://savannah.gnu.org/bugs/?57665 https://savannah.gnu.org/bugs/?49390 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
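The "blank line squeezing" mentioned in point 4 amounts to very little code. A minimal sketch, with a hypothetical function name, of collapsing every run of blank lines to a single one:

```python
def squeeze_blank_lines(lines):
    """Collapse each run of blank lines to one blank line."""
    out = []
    previous_blank = False
    for line in lines:
        blank = not line.strip()
        if blank and previous_blank:
            continue  # drop the second and later blank lines of a run
        out.append(line)
        previous_blank = blank
    return out
```

A pager that does this changes the number of empty lines it displays, so a heuristic that counts empty lines has to count them either always before or always after squeezing.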
* Re: reformatting man pages at SIGWINCH 2023-04-10 20:24 ` G. Branden Robinson @ 2023-04-11 9:20 ` Ralph Corderoy 2023-04-11 9:39 ` Dirk Gouders 1 sibling, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-11 9:20 UTC (permalink / raw) To: linux-man, groff Hi Branden, > see man pages as they would have formatted for Western Electric > Teletype machines, which printed to long spools of paper with 66 lines > to the nominal page. In case it isn't obvious, it was normal for teletypes and line printers to print six lines per inch onto letter-height fan-fold paper perforated every eleven inches giving 66 lines per real page, not nominal. As long as the paper was positioned so it started printing just after a perforation, the page breaks occurred over a perforation. To allow for a bit of leeway, the page often started and ended with blank lines. -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: reformatting man pages at SIGWINCH 2023-04-10 20:24 ` G. Branden Robinson 2023-04-11 9:20 ` Ralph Corderoy @ 2023-04-11 9:39 ` Dirk Gouders 2023-04-17 6:23 ` G. Branden Robinson 1 sibling, 1 reply; 73+ messages in thread From: Dirk Gouders @ 2023-04-11 9:39 UTC (permalink / raw) To: G. Branden Robinson Cc: Alejandro Colomar, Eli Zaretskii, linux-man, help-texinfo, groff Hi Branden, "G. Branden Robinson" <g.branden.robinson@gmail.com> writes: > At 2023-04-10T21:05:24+0200, Dirk Gouders wrote: >> This relies on the assumption that horizontal resizes don't create or >> delete emty lines and it still has the weakness that manual pages >> (e.g. bash(1)) contain large areas without empty lines but it's >> definitely better than just staying at the position as it was before. > > I think this assumption should hold for man and mdoc documents rendered > by a *roff--I'm not sure about mandoc(1), but it probably will for > reasons I'll elaborate below. > > Vertical space in *roff documents might get reduced at page breaks, but > not to zero, except at page breaks. > > There are a few reasons that I think reinforce the assumption holding: > > 1. man(7) and mdoc(7) don't offer macros for just sticking an arbitrary > amount of vertical space into a document. If you want that, you'll need > to go down to formatter requests, which is seldom done by human man page > authors, but a bit more frequently by automated generators of man(7) or > mdoc(7) from other formats. > > 2. Even in traditional *roff, if you issued an ".sp 6" request > (demanding 6 blank lines), then if you were within 6 lines of a "trap" > (usually a page footer trap or the actual bottom of the page), the > result would be that you'd get blank lines until the trap sprung, and > any excess would be thrown away. So if there were only 4 lines of > distance to the page footer, the leftover two would be discarded and > _not_ appear after the header of the next page.[1] > > 3. 
mandoc(1) and groff's man(7) and mdoc(7) macro packages both > implement "continuous rendering" for terminal output. This means that > they contrive to arrange for an effectively infinite page length, so > there are no page breaks. (Except when you render multiple man pages at > a time, a use case groff 1.23.0 will support.) Since pager programs are > applicable only to terminal output in the first place, this should > address your use case. (You _can_ turn off continuous rendering in > groff, and see man pages as they would have formatted for Western > Electric Teletype machines, which printed to long spools of paper with > 66 lines to the nominal page.) > > 4. A habit has grown up among man(1) programs and pagers to call for > and support, respectively, a "blank line squeezing" feature: any runs of > more than one blank line are condensed to 1 blank line each. In groff > 1.23.0, this will no longer be necessary when continuously rendering. > (Historically, this squeezing feature was used to "tighten up" vertical > space after the page header, prior to the "NAME" section heading of the > document.) In my opinion, pager programs should perform as few > transformations as possible on the output of grotty(1), the groff output > driver that supports terminal devices. The long-time author and > maintainer of less(1) does not agree, so you have to call that program > with its "-R" flag to view grotty(1) output as groff intends it. (To > see what those intentions are, format the document without paging it.) Thank you for the detailed assessment. Perhaps my misunderstanding is because I'm not a native speaker, but which document should I format to see what those intentions are? >> If it turns out to still be too weak, I could count all words between >> two empty lines and set that in relation to the words from the >> preceding empty line.
> > You might do this only as a fallback, if there were no blank lines on > the screen at the old window size when the resizing event happened. Yes, such a fallback would be good to have. I am again about to implement a suggestion with some modifications: I would make the use of the section word count (which is expensive) dependent on _how many_ words I found while searching for an empty line or the section header. My motivation for this is that an increasing number of continuous words also increases the likelihood of hyphenation working against the heuristic. That said, I probably also need to consider the number of lines that contain those words. I have to think more about this. >> But perhaps, I now learn that empty lines are by no means that >> constant value that I assume... > > In my opinion, the presence or absence of a single blank line in > formatted output is important. groff 1.23.0 will feature some bug fixes > with respect to their handling within and adjacent to tbl(1) input.[2] > > Since I flogged groff 1.23.0 three times in this email, I suppose I > should point people to where they can get the 1.23.0.rc3 release > candidate source archive. Feedback would be appreciated. Oh well, I didn't measure it, but I spent quite some time working on doc/lsp-help.1, trying to find a solution for that "nasty empty line" that appeared in one of the tables that I use for the online help -- I was convinced it was my fault. Gentoo already has an ebuild for groff-1.23.0-rc3 and simply using this fixes that problem in the table. So, from now on all my testing happens with groff-1.23.0-rc3 and I will report any problems I notice. Regards, Dirk ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: reformatting man pages at SIGWINCH 2023-04-11 9:39 ` Dirk Gouders @ 2023-04-17 6:23 ` G. Branden Robinson 0 siblings, 0 replies; 73+ messages in thread From: G. Branden Robinson @ 2023-04-17 6:23 UTC (permalink / raw) To: Dirk Gouders; +Cc: Alejandro Colomar, linux-man, groff [-- Attachment #1: Type: text/plain, Size: 3234 bytes --] [CC list trimmed of Texinfo people/lists] At 2023-04-11T11:39:11+0200, Dirk Gouders wrote: [I wrote:] > > 4. A habit has grown up among man(1) programs and pagers to call > > for and support, respectively, a "blank line squeezing" feature: any > > runs of more than one blank line are condensed to 1 blank line each. > > In groff 1.23.0, this will no longer be necessary when continuously > > rendering. (Historically, this squeezing feature was used to > > "tighten up" vertical space after the page header, prior to the > > "NAME" section heading of the document.) In my opinion, pager > > programs should perform as few transformations as possible on the > > output of grotty(1), the groff output driver that supports terminal > > devices. The long-time author and maintainer of less(1) does not > > agree, so you have to call that program with its "-R" flag to view > > grotty(1) output as groff intends it. (To see what those intentions > > are, format the document without paging it.) > > Thank you for the detailled assessment. Perhaps my misunderstanding > is because I'm not a native speaker but which document should I format > to see what those intentions are? Just about any man page will do. By "intentions" I mean things like typeface changes and, in the forthcoming groff 1.23.0,[1] OSC 8 escape sequences to encode hyperlinks. For instance, if I want to look at groff_man(7)'s man page without the intermediation of man(1) or a pager, I can do this. $ man -w groff_man # to tell me where the document is installed /usr/share/man/man7/groff_man.7.gz $ zcat $(!!) 
| nroff -t -mandoc I recommend the above as an early troubleshooting step with rendering problems, though your terminal emulator may need a lot of scrollback buffer, depending on the document. (On rare occasions, a document may require a preprocessor other than tbl(1), but the parts that use them generally won't produce good (eqn) or any (pic) results on terminal devices. "-t -mandoc" should suffice for well over 95% of man pages.) > > Since I flogged groff 1.23.0 three times in this email, I suppose I > > should point people to where they can get the 1.23.0.rc3 release > > candidate source archive. Feedback would be appreciated. > > Oh well, I didn't measure it but I spent quite some time to work on > doc/lsp-help.1 and try to find a solution for that "nasty empty line" > that appeared in of the tables that I use for the online help -- I was > convinced it was my fault. I am sure a lot of people thought that. I was quite pleased to track down and stomp that bug. > Gentoo already has an ebuild for groff-1.23.0-rc3 and simply using > this fixes that problem in the table. So, from now on all my testing > happens with groff-1.23.0-rc3 and I will report should I recognize > problems. Please do. Bruno Haible has found a passel of portability problems to non-GNU/Linux systems, and helped us to resolve several of them; I am hopeful that 1.23.0 will be the most easily deployed groff in quite some time. Regards, Branden [1] We just tagged and put out 1.23.0.rc4 this past weekend. https://lists.gnu.org/archive/html/groff/2023-04/msg00135.html [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: reformatting man pages at SIGWINCH 2023-04-07 22:09 ` reformatting man pages at SIGWINCH Dirk Gouders 2023-04-07 22:16 ` Alejandro Colomar @ 2023-04-08 11:40 ` Ralph Corderoy 1 sibling, 0 replies; 73+ messages in thread From: Ralph Corderoy @ 2023-04-08 11:40 UTC (permalink / raw) To: linux-man, groff; +Cc: Eli Zaretskii, Dirk Gouders Hi, > > > (1) what part of the screen was the reader actually looking at? less(1) has -j; that would be a good start. > > > (2) how is the pager supposed to know how to map any given > > > location on the screen back to a place in the unrendered source > > > document so it can be accurately found when the document is > > > rerendered? I would assume the pager looks for the same place in its input, not in the man-page source. It keeps seeking forward to the best matching run of words, jumping to the best so far. Problems I can think of: - the formatter's input may be ephemeral and so need buffering, - the originator may not have intended that and limited its size, - seeking the best match after being WINCH'd must also buffer and may never reach EOF, - the input formatter may alter its output based on the terminal's size, e.g. a pic(1) diagram disappears, and - a solution which re-starts the pager loses the pager's ephemeral settings. I expect more would be found in practice. -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 73+ messages in thread
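The "best matching run of words" idea from the message above, sketched for the simple case where both renderings are fully buffered. The streaming variant described there (jumping to the best match so far while input is still arriving) is not covered; the function names and the default run length of 8 words are arbitrary choices for illustration.

```python
def words_with_lines(lines):
    """Flatten a page into (word, line index) pairs."""
    return [(word, i) for i, line in enumerate(lines) for word in line.split()]


def best_match_line(old_lines, new_lines, top, run=8):
    """Return the new line where the words shown at old line `top` match best."""
    probe = [word for word, i in words_with_lines(old_lines) if i >= top][:run]
    new_words = words_with_lines(new_lines)
    best_score, best_line = -1, 0
    # Slide the probe over the new word sequence and keep the first best score.
    for start in range(len(new_words) - len(probe) + 1):
        score = sum(1 for k, word in enumerate(probe)
                    if new_words[start + k][0] == word)
        if score > best_score:
            best_score, best_line = score, new_words[start][1]
    return best_line
```

This brute-force scan is quadratic in the worst case; a real pager would likely restrict the search to a window around the old position.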
* Re: Playground pager lsp(1) 2023-04-04 23:45 ` Alejandro Colomar 2023-04-05 5:35 ` Eli Zaretskii @ 2023-04-05 10:02 ` Dirk Gouders 2023-04-05 14:19 ` Arsen Arsenović 2023-04-06 1:31 ` Playground pager lsp(1) Alejandro Colomar 1 sibling, 2 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-05 10:02 UTC (permalink / raw) To: Alejandro Colomar; +Cc: linux-man, help-texinfo [-- Attachment #1: Type: text/plain, Size: 5054 bytes --] Hi Alex, >> first of all, chances are that you consider this post as spam, because >> this list is about linux manual pages and not pagers. > > No, I don't. That's fine, thank you for taking the time to give me feedback. >> I will try to not waste your time and attach the manual page and a link >> to a short (3:50) demo video. To me it is absolutely OK should you just >> ignore this spam post, but perhaps you find lsp(1) interesting enough >> for further discussion. > > If you had a Debian package, I might try it :) > > Or maybe a Makefile to build from source... What is this meson.build? If you want to take a look at it: there is a branch "next" which you might prefer as it more closely resembles my current work. There is a new toggle "-V" that can be used to completely turn off validation. I tried to assemble a Makefile that might work without a configure script and attached it at the end. A prefix of /usr is the default value; if your system prefers /usr/local you can use `make prefix=/usr/local install`. I hope I prepared a reasonable Makefile... Concerning meson.build: I decided to have a look at meson as the autobuild tool for lsp.
I am just gathering experience with it and if you have meson(1) installed you could use these steps to (un)install lsp: $ # cd to lsp directory $ meson setup --prefix=/usr builddir ; cd builddir $ ninja install # or uninstall >> • Manual pages usually refer to other manual pages and lsp allows to >> navigate those references and to visit them as new files with the >> ability to also navigate through all opened manual pages or other >> files. > > Out of curiosity, is this implemented with heuristics? Or do you rely on > semantic mdoc(7) macros? This is purely based on heuristics (regex), which is one reason for the validation of the references found. > If it's the first, how do you handle exit(1)? Is it a reference, or is it > just code (with the meaning exit(EXIT_FAILURE))? exit(1) gets recognized as a possible reference but validation will fail. > If it's the second, I guess it doesn't support that in man(7), right? At > least until MR is released. >> >> Here, lsp tries to minimize frustration caused by unavailable >> references and verifies their existence before offering them as >> references that can be visited. > > Do you mark these as broken references? It is interesting to know that > there's a reference which you don't have installed. It may prompt you to > install it and read it. When I see a broken reference, I usually find it > with `apt-file find man3/page.3`, and then install the relevant package. No, broken references aren't marked. Usually those unavailable references make sense, e.g. if a manual page references some program that not everyone uses. One example that I couldn't resolve so far is a reference to getconf(1), for example in fpathconf(3). Up to now I have not been able to find out which package contains getconf(1)... >> >> • In windowing environments lsp does complete resizes when windows >> get resized. This means it also reloads the manual page to fit the >> new window size. > > Good. I often miss this in less(1).
Not sure if they had any strong > reason to not support that. Unfortunately, info(1) also doesn't do full resizes (on my system). >> >> • Search for manual pages using apropos(1); in the current most basic >> form it lists all known manual pages ready for text search and >> visiting referenced manual pages. > > What does it bring that `apropos * | less` can't do? If you're going the > of info(1) with full-blown system, it seems reasonable, but I never really > liked all that if it's just a new terminal and a command away from me. You get a pseudo-file from where you can reach any manual page on the system. Originally I thought this to help novice users but since lsp is my system's PAGER I use it more often than expected. I'm missing the ability to give keywords to apropos but that's just a matter of time to get fixed. >> >> • lsp has an experimental TOC mode. >> >> This is a three-level folding mode trying to list only section and >> sub-section names for quick navigation in manual pages. > > Nice, and this an important feature missing feature in info(1), as I > reported recently. :) Maybe they are interested in something similar. > >> >> The TOC is created using naive heuristics which works well to some >> extend, but it might be incomplete. Users should keep that in mind. > > I guess the heuristics are just `^[^ ]` for SH and `^ [^ ]` for SS, right? > I tipically use something similar for searching for command flags, and as > you say, these just work. Yes, that is correct. Only level 2 (0-based) does some additional look-ahead. 
Cheers,

Dirk

[-- Attachment #2: Makefile --]
[-- Type: application/octet-stream, Size: 576 bytes --]

version=\"$(shell cat .version)\"

CFLAGS := $(shell pkg-config --cflags ncursesw)
CFLAGS += -DLSP_VERSION=$(version)
LDFLAGS := $(shell pkg-config --libs ncursesw)

ifeq ($(prefix),)
    prefix := /usr
endif

lsp: lsp.c
	gcc $(CFLAGS) $(LDFLAGS) -o $@ $<

doc/lsp.1: doc/lsp.adoc
	a2x --doctype manpage --format manpage -a lsp-version=$(version) $<

.PHONY: uninstall install

install: lsp doc/lsp.1 doc/lsp-help.1
	install lsp $(prefix)/bin
	install doc/lsp.1 doc/lsp-help.1 $(prefix)/share/man/man1/

uninstall:
	rm $(prefix)/bin/lsp
	rm $(prefix)/share/man/man1/lsp{,-help}.1
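The regex-plus-validation approach described in this message can be sketched in shell roughly as follows. This is not lsp's actual code: the pattern, the function names find_candidates and is_valid_ref, and the VERIFY variable are assumptions made for illustration. Only the default verify command, man -w, is taken from the lsp(1) manual page. Note that grep -o is a widespread extension rather than POSIX.

```sh
#!/bin/sh
# Sketch only: pattern and names are illustrative assumptions, not lsp's code.

# Print every token on stdin that looks like name(section), one per line.
find_candidates() {
    grep -oE '[A-Za-z0-9_.:+-]+\([0-9][A-Za-z0-9]*\)'
}

# Succeed if the verify command can locate a page for "name(section)".
# VERIFY is a hypothetical override; it defaults to "man -w".
is_valid_ref() {
    name=${1%%"("*}      # text before the opening parenthesis
    sect=${1##*"("}      # text after the opening parenthesis ...
    sect=${sect%")"}     # ... without the closing parenthesis
    ${VERIFY:-man -w} "$sect" "$name" >/dev/null 2>&1
}

# Example: list only the candidates that resolve on this system.
# man 3 fpathconf | col -b | find_candidates | sort -u |
# while read -r ref; do is_valid_ref "$ref" && echo "$ref"; done
```

With this split, exit(1) is always found as a candidate by the pattern; whether it is offered as a reference depends only on whether the verify command finds a page for it on the system at hand.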
* Re: Playground pager lsp(1) 2023-04-05 10:02 ` Playground pager lsp(1) Dirk Gouders @ 2023-04-05 14:19 ` Arsen Arsenović 2023-04-05 18:01 ` Dirk Gouders 2023-04-06 1:31 ` Playground pager lsp(1) Alejandro Colomar 1 sibling, 1 reply; 73+ messages in thread From: Arsen Arsenović @ 2023-04-05 14:19 UTC (permalink / raw) To: Dirk Gouders; +Cc: Alejandro Colomar, linux-man, help-texinfo [-- Attachment #1: Type: text/plain, Size: 5475 bytes --] Dirk Gouders <dirk@gouders.net> writes: > Hi Alex, > >>> first of all, chances are that you consider this post as spam, because >>> this list is about linux manual pages and not pagers. >> >> No, I don't. > > that's fine, thank you for taking the time give me feedback. > >>> I will try to not waste your time and attach the manual page and a link >>> to a short (3:50) demo video. To me it is absolutely OK should you just >>> ignore this spam post, but perhaps you find lsp(1) interesting enough >>> for further discussion. >> >> If you had a Debian package, I might try it :) >> >> Or maybe a Makefile to build from source... What is this meson.build? > > If you want to take a look at it: there is a branch "next" which you > might prefer as it closer resembles my current work. There is a new > toggle "-V" that can be used to completely turn off validation. > > I tried to assemble a Makefile that might work without a configure > script and attach it to the end. A prefix /usr is the default value, if > your system prefers /usr/local you can use `make prefix=/usr/local > install`. I hope I prepared some reasonable Makefile... > > Concerning meson.build: I decided to have a look at meson as the > autobuild tool for lsp. 
I am just gathering experiences with it and if > you have meson(1) installed you could use thes steps to (un)install lsp: > > $ # cd to lsp directory > $ meson setup --prefix=/usr builddir ; cd builddir > $ ninja install # or uninstall > >>> • Manual pages usually refer to other manual pages and lsp allows to >>> navigate those references and to visit them as new files with the >>> ability to also navigate through all opened manual pages or other >>> files. >> >> Out of curiosity, is this implemented with heuristics? Or do you rely on >> semantic mdoc(7) macros? > > This is purely based on heuristics (regex) which is one reason for > validation of the found references. > >> If it's the first, how do you handle exit(1)? Is it a reference, or is it >> just code (with the meaning exit(EXIT_FAILURE))? > > exit(1) gets recognized as a possible reference but validation will fail. > >> If it's the second, I guess it doesn't support that in man(7), right? At >> least until MR is released. > >>> >>> Here, lsp tries to minimize frustration caused by unavailable >>> references and verifies their existance before offering them as >>> references that can be visited. >> >> Do you mark these as broken references? It is interesting to know that >> there's a reference which you don't have installed. It may prompt you to >> install it and read it. When I see a broken reference, I usually find it >> with `apt-file find man3/page.3`, and then install the relevant package. > > No, broken references aren't marked. Usually those unavailable > references make sense, e.g. if a manual page references some program > that not everyone uses. > > One example that I couldn't resolve so far is a reference to > getconf(1) for example in fpatchconf(3). Up to now I was not able to > find out which package contains getconf(1)... > >>> >>> • In windowing environments lsp does complete resizes when windows >>> get resized. This means it also reloads the manual page to fit the >>> new window size. 
>> >> Good. This I miss it in less(1) often. Not sure if they had any strong >> reason to not support that. > > Unfortunately, info(1) also doesn't do full resizes (on my system). Do you mean the info pages' column limit or that the viewer itself doesn't resize to fit the frame? The latter would be a bug. >>> >>> • Search for manual pages using apropos(1); in the current most basic >>> form it lists all known manual pages ready for text search and >>> visiting referenced manual pages. >> >> What does it bring that `apropos * | less` can't do? If you're going the >> of info(1) with full-blown system, it seems reasonable, but I never really >> liked all that if it's just a new terminal and a command away from me. > > You get a pseudo-file from where you can reach any manual page on the > system. Originally I thought this to help novice users but since lsp is > my system's PAGER I use it more often than expected. I'm missing the > ability to give keywords to apropos but that's just a matter of time to > get fixed. > >>> >>> • lsp has an experimental TOC mode. >>> >>> This is a three-level folding mode trying to list only section and >>> sub-section names for quick navigation in manual pages. >> >> Nice, and this an important feature missing feature in info(1), as I >> reported recently. :) Maybe they are interested in something similar. >> >>> >>> The TOC is created using naive heuristics which works well to some >>> extend, but it might be incomplete. Users should keep that in mind. >> >> I guess the heuristics are just `^[^ ]` for SH and `^ [^ ]` for SS, right? >> I tipically use something similar for searching for command flags, and as >> you say, these just work. > > Yes, that is correct. Only level 2 (0-based) does some additional > look-ahead. > > Cheers, > > Dirk > > [2. Makefile --- application/octet-stream; Makefile.new]... 
-- Arsen Arsenović [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 251 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
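The TOC heuristic quoted in this reply (a non-blank character in column one for SH headings, a slightly indented one for SS headings) can be tried on any rendered page with grep. The sketch below assumes that sub-section headings are indented by exactly three spaces, which is what groff's man output normally produces; the archive has collapsed the spaces in the quoted pattern. The function name toc is made up, and the additional look-ahead that lsp does for its third level is not modelled.

```sh
#!/bin/sh
# Sketch only: a two-level table of contents for an already formatted page.
# Level 0: non-blank character in column 1 (section headings, .SH).
# Level 1: exactly three spaces, then a non-blank (sub-sections, .SS);
#          the three-space indent is an assumption about the renderer.
toc() {
    grep -nE '^([^ ]|   [^ ])'
}

# Example: strip overstrike sequences first, then list the headings.
# man 5 proc | col -b | toc
```

Each output line carries the line number of the heading, so a pager could jump straight to it.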
* Re: Playground pager lsp(1) 2023-04-05 14:19 ` Arsen Arsenović @ 2023-04-05 18:01 ` Dirk Gouders 2023-04-05 19:07 ` Eli Zaretskii 0 siblings, 1 reply; 73+ messages in thread From: Dirk Gouders @ 2023-04-05 18:01 UTC (permalink / raw) To: Arsen Arsenović; +Cc: Alejandro Colomar, linux-man, help-texinfo

Arsen Arsenović <arsen@aarsen.me> writes:

>>>> • In windowing environments lsp does complete resizes when windows
>>>> get resized. This means it also reloads the manual page to fit the
>>>> new window size.
>>>
>>> Good. This I miss it in less(1) often. Not sure if they had any strong
>>> reason to not support that.
>>
>> Unfortunately, info(1) also doesn't do full resizes (on my system).
>
> Do you mean the info pages' column limit or that the viewer itself
> doesn't resize to fit the frame? The latter would be a bug.

Yes, I meant the column limit. Sorry for not having expressed this very
clearly.

Dirk
* Re: Playground pager lsp(1) 2023-04-05 18:01 ` Dirk Gouders @ 2023-04-05 19:07 ` Eli Zaretskii 2023-04-05 19:56 ` Dirk Gouders 2023-04-05 20:38 ` A less presumptive .info? (was: Re: Playground pager lsp(1)) Arsen Arsenović 0 siblings, 2 replies; 73+ messages in thread From: Eli Zaretskii @ 2023-04-05 19:07 UTC (permalink / raw) To: Dirk Gouders; +Cc: arsen, alx.manpages, linux-man, help-texinfo > From: Dirk Gouders <dirk@gouders.net> > Cc: Alejandro Colomar <alx.manpages@gmail.com>, linux-man@vger.kernel.org, > help-texinfo@gnu.org > Date: Wed, 05 Apr 2023 20:01:56 +0200 > > Arsen Arsenović <arsen@aarsen.me> writes: > > >>>> • In windowing environments lsp does complete resizes when windows > >>>> get resized. This means it also reloads the manual page to fit the > >>>> new window size. > >>> > >>> Good. This I miss it in less(1) often. Not sure if they had any strong > >>> reason to not support that. > >> > >> Unfortunately, info(1) also doesn't do full resizes (on my system). > > > > Do you mean the info pages' column limit or that the viewer itself > > doesn't resize to fit the frame? The latter would be a bug. > > Yes, I meant the column limit. Sorry for not having expressed this very > clear. Info files are formatted already, you cannot ask the reader to reformat them for a different line length. With man pages this is only possible if you never keep the formatted pages and reuse them once they were produced. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Playground pager lsp(1) 2023-04-05 19:07 ` Eli Zaretskii @ 2023-04-05 19:56 ` Dirk Gouders 2023-04-05 20:38 ` A less presumptive .info? (was: Re: Playground pager lsp(1)) Arsen Arsenović 1 sibling, 0 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-05 19:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: arsen, alx.manpages, linux-man, help-texinfo Eli Zaretskii <eliz@gnu.org> writes: >> From: Dirk Gouders <dirk@gouders.net> >> Cc: Alejandro Colomar <alx.manpages@gmail.com>, linux-man@vger.kernel.org, >> help-texinfo@gnu.org >> Date: Wed, 05 Apr 2023 20:01:56 +0200 >> >> Arsen Arsenović <arsen@aarsen.me> writes: >> >> >>>> • In windowing environments lsp does complete resizes when windows >> >>>> get resized. This means it also reloads the manual page to fit the >> >>>> new window size. >> >>> >> >>> Good. This I miss it in less(1) often. Not sure if they had any strong >> >>> reason to not support that. >> >> >> >> Unfortunately, info(1) also doesn't do full resizes (on my system). >> > >> > Do you mean the info pages' column limit or that the viewer itself >> > doesn't resize to fit the frame? The latter would be a bug. >> >> Yes, I meant the column limit. Sorry for not having expressed this very >> clear. > > Info files are formatted already, you cannot ask the reader to > reformat them for a different line length. Thank you for that explanation; I didn't know that and now understand info(1)'s behavior. Dirk > With man pages this is only possible if you never keep the formatted > pages and reuse them once they were produced. ^ permalink raw reply [flat|nested] 73+ messages in thread
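Eli's point is that reflowing needs the source, not the formatted output. For manual pages, a pager can get that effect by asking man(1) to format the page again at the new width. A minimal sketch, assuming man-db's man(1), which honours the MANWIDTH environment variable. The function name render_page and the RELOAD variable are hypothetical (lsp(1) only documents that its reload command defaults to man), and this thread does not say whether lsp itself relies on MANWIDTH.

```sh
#!/bin/sh
# Sketch only: format a manual page for a given line length.
# Usage: render_page WIDTH [SECTION] NAME
# RELOAD is a hypothetical override for the formatting command.
render_page() {
    width=$1
    shift
    # env(1) sets MANWIDTH for this one invocation only.
    env MANWIDTH="$width" ${RELOAD:-man} "$@"
}

# Example: the same page formatted for two window sizes.
# render_page 80 5 proc | head
# render_page 120 5 proc | head
```

A pager that reloads on resize would call something like this with the new terminal width and then restore the reader's position in the fresh output.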
* A less presumptive .info? (was: Re: Playground pager lsp(1)) 2023-04-05 19:07 ` Eli Zaretskii 2023-04-05 19:56 ` Dirk Gouders @ 2023-04-05 20:38 ` Arsen Arsenović 2023-04-06 8:14 ` Eli Zaretskii 1 sibling, 1 reply; 73+ messages in thread From: Arsen Arsenović @ 2023-04-05 20:38 UTC (permalink / raw) To: Eli Zaretskii, Gavin Smith Cc: Dirk Gouders, alx.manpages, linux-man, help-texinfo [-- Attachment #1: Type: text/plain, Size: 2296 bytes --] Eli Zaretskii <eliz@gnu.org> writes: > Info files are formatted already, you cannot ask the reader to > reformat them for a different line length. > > With man pages this is only possible if you never keep the formatted > pages and reuse them once they were produced. I've been casually wondering if creating a new format that can host more formatting options and uses more precise syntax than 'plaintext with some binary tags' would be a decent thing to work on. My thoughts were brief and undeveloped as this was thought of on the commute, but something that retains the binary offsets for indices and tags, but stores formatted data (perhaps as s-exprs, those would be easy to parse). It is always easier to remove information than to reintroduce it. Such a structure should resemble the input language, but with far less complexity (e.g. something at the level of abstraction that HTML5 sits at, so, macros would be expanded, and we'd be dealing with lists of paragraphs and formatted blocks, etc.). This would allow for the reflowing that was talked about in this thread, and provide more readable output in graphical contexts, as it wouldn't be data generated with the assumption of a monospace font (rather, the format could store whether your context wants monospace or proportional fonts at a given point), or data generated for a given screen size, or with a given indentation size, or with the assumption of a lack of features like italics, etc. 
For instance, info2html used by the KDE info viewer currently produces
quite terrible results, because it fails to implement the heuristics the
Info viewers have properly. This problem would be hard to have with a
better "at-rest" format for Info pages.

The alternative is, of course, bringing HTML up to par feature-wise
(wrt. indices etc), but that'd be on the other end of the extreme, where
instead of being too easy to parse and lacking important information,
it'd be overly verbose and difficult to parse (not that such a thing
should not be done too, so that folks using ordinary browsers can enjoy
documentation, and so that projects can provide more accessible
documentation by the merit of more people having HTML than Info
viewers).

WDYT folks?
-- 
Arsen Arsenović

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 381 bytes --]
* Re: A less presumptive .info? (was: Re: Playground pager lsp(1)) 2023-04-05 20:38 ` A less presumptive .info? (was: Re: Playground pager lsp(1)) Arsen Arsenović @ 2023-04-06 8:14 ` Eli Zaretskii 2023-04-06 8:56 ` Gavin Smith 2023-04-07 13:14 ` Arsen Arsenović 0 siblings, 2 replies; 73+ messages in thread From: Eli Zaretskii @ 2023-04-06 8:14 UTC (permalink / raw) To: Arsen Arsenović Cc: GavinSmith0123, dirk, alx.manpages, linux-man, help-texinfo > From: Arsen Arsenović <arsen@aarsen.me> > Cc: Dirk Gouders <dirk@gouders.net>, alx.manpages@gmail.com, > linux-man@vger.kernel.org, help-texinfo@gnu.org > Date: Wed, 05 Apr 2023 22:38:12 +0200 > > I've been casually wondering if creating a new format that can host more > formatting options and uses more precise syntax than 'plaintext with > some binary tags' would be a decent thing to work on. > > My thoughts were brief and undeveloped as this was thought of on the > commute, but something that retains the binary offsets for indices and > tags, but stores formatted data (perhaps as s-exprs, those would be easy > to parse). It is always easier to remove information than to > reintroduce it. > > Such a structure should resemble the input language, but with far less > complexity (e.g. something at the level of abstraction that HTML5 sits > at, so, macros would be expanded, and we'd be dealing with lists of > paragraphs and formatted blocks, etc.). > > This would allow for the reflowing that was talked about in this thread, > and provide more readable output in graphical contexts, as it wouldn't > be data generated with the assumption of a monospace font (rather, the > format could store whether your context wants monospace or proportional > fonts at a given point), or data generated for a given screen size, or > with a given indentation size, or with the assumption of a lack of > features like italics, etc. 
> > For instance, info2html used by the KDE info viewer currently produces > quite terrible results, because it fails to implement the heuristics the > Info viewers have properly. This problem would be hard to have with a > better "at-rest" format for Info pages. > > The alternative is, of course, bringing HTML up to par feature-wise > (wrt. indices etc), but that'd be on the other end of the extreme, where > instead of being too easy to parse and lacking important information, > it'd be oververbose with and difficult to parse (not that such a thing > should not be done too, so that folks using ordinary browsers can enjoy > documentation, and so that projects can provide more accessible > documentation by the merit of more people having HTML than Info > viewers). > > WDYT folks? Gavin will tell, but AFAIU our plan is to develop js as the means towards the goals you mentioned. That will allow using HTML browsers to read Texinfo documentation without losing the functionalities of the Info readers we value. HTML rendering reflows as integral part of its workings, so that problem is not an issue if this plan succeeds. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: A less presumptive .info? (was: Re: Playground pager lsp(1)) 2023-04-06 8:14 ` Eli Zaretskii @ 2023-04-06 8:56 ` Gavin Smith 2023-04-07 13:14 ` Arsen Arsenović 1 sibling, 0 replies; 73+ messages in thread From: Gavin Smith @ 2023-04-06 8:56 UTC (permalink / raw) To: Eli Zaretskii Cc: Arsen Arsenović, dirk, alx.manpages, linux-man, help-texinfo On Thu, Apr 06, 2023 at 11:14:01AM +0300, Eli Zaretskii wrote: > > The alternative is, of course, bringing HTML up to par feature-wise > > (wrt. indices etc), but that'd be on the other end of the extreme, where > > instead of being too easy to parse and lacking important information, > > it'd be oververbose with and difficult to parse (not that such a thing > > should not be done too, so that folks using ordinary browsers can enjoy > > documentation, and so that projects can provide more accessible > > documentation by the merit of more people having HTML than Info > > viewers). > > > > WDYT folks? > > Gavin will tell, but AFAIU our plan is to develop js as the means > towards the goals you mentioned. That will allow using HTML browsers > to read Texinfo documentation without losing the functionalities of > the Info readers we value. HTML rendering reflows as integral part of > its workings, so that problem is not an issue if this plan succeeds. Progress on this issue is described in the TODO.HTML file in the Texinfo repository. https://git.savannah.gnu.org/cgit/texinfo.git/tree/TODO.HTML In short, the main avenue of progress appears to be the documentation browser using the embedded WebkitGTK browser. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: A less presumptive .info? (was: Re: Playground pager lsp(1)) 2023-04-06 8:14 ` Eli Zaretskii 2023-04-06 8:56 ` Gavin Smith @ 2023-04-07 13:14 ` Arsen Arsenović 1 sibling, 0 replies; 73+ messages in thread From: Arsen Arsenović @ 2023-04-07 13:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: GavinSmith0123, dirk, alx.manpages, linux-man, help-texinfo [-- Attachment #1: Type: text/plain, Size: 1685 bytes --] Eli Zaretskii <eliz@gnu.org> writes: > Gavin will tell, but AFAIU our plan is to develop js as the means > towards the goals you mentioned. That will allow using HTML browsers > to read Texinfo documentation without losing the functionalities of > the Info readers we value. HTML rendering reflows as integral part of > its workings, so that problem is not an issue if this plan succeeds. Sure, but how will this work with the standalone and/or Emacs viewers? In Emacs, doing so places a strain on the HTML generator to work around eww, and presuming we choose to do that, it requires the user to have Emacs. In the non-Emacs case, it requires that the implementor implement at least a subset of HTML, or places a demand on the user to have a web browser (in which, there are two extremes: either the 'underimplemented and insufficient' ones for which JS as glue won't work, or full browsers which aren't accessible in many scenarios). On the other hand, having a more advanced format based on s-exprs for info at rest storage could let us have complete information about the intended markup of the text to be displayed with only two syntactic elements (lists and strings). That should be rather easy to parse. I don't see it as very viable to replace an implementable info storage format with only HTML for that reason. I have TODO.HTML open on my workstation to take a look through some of those when I get back home. 
I do believe that it's a high priority target, as it is very important
to newcomers to GNU who are viewing GNU documentation from remote
servers, but I doubt it can replace a native Info format.
-- 
Arsen Arsenović

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 251 bytes --]
* Re: Playground pager lsp(1) 2023-04-05 10:02 ` Playground pager lsp(1) Dirk Gouders 2023-04-05 14:19 ` Arsen Arsenović @ 2023-04-06 1:31 ` Alejandro Colomar 2023-04-06 6:01 ` Dirk Gouders 1 sibling, 1 reply; 73+ messages in thread From: Alejandro Colomar @ 2023-04-06 1:31 UTC (permalink / raw) To: Dirk Gouders; +Cc: linux-man, help-texinfo [-- Attachment #1.1: Type: text/plain, Size: 4614 bytes --] Hi Dirk, On 4/5/23 12:02, Dirk Gouders wrote: > Hi Alex, > >>> first of all, chances are that you consider this post as spam, because >>> this list is about linux manual pages and not pagers. >> >> No, I don't. > > that's fine, thank you for taking the time give me feedback. > :) > If you want to take a look at it: there is a branch "next" which you > might prefer as it closer resembles my current work. There is a new > toggle "-V" that can be used to completely turn off validation. > > I tried to assemble a Makefile that might work without a configure > script and attach it to the end. A prefix /usr is the default value, if > your system prefers /usr/local you can use `make prefix=/usr/local The default prefix in GNU should be /usr/local <https://www.gnu.org/prep/standards/html_node/Directory-Variables.html> > install`. I hope I prepared some reasonable Makefile... I'll have a look. > > Concerning meson.build: I decided to have a look at meson as the > autobuild tool for lsp. I am just gathering experiences with it and if > you have meson(1) installed you could use thes steps to (un)install lsp: > > $ # cd to lsp directory > $ meson setup --prefix=/usr builddir ; cd builddir > $ ninja install # or uninstall >> If it's the first, how do you handle exit(1)? Is it a reference, or is it >> just code (with the meaning exit(EXIT_FAILURE))? > > exit(1) gets recognized as a possible reference but validation will fail. `man 'exit(1)'` works for me. It brings the exit(1posix) page, from manpages-posix. > No, broken references aren't marked. 
Usually those unavailable > references make sense, e.g. if a manual page references some program > that not everyone uses. > > One example that I couldn't resolve so far is a reference to > getconf(1) for example in fpatchconf(3). Up to now I was not able to > find out which package contains getconf(1)... $ apt-file find /getconf.1 glibc-source: /usr/src/glibc/debian/local/manpages/getconf.1 libc-bin: /usr/share/man/man1/getconf.1.gz manpages-fr: /usr/share/man/fr/man1/getconf.1.gz It's in libc-bin. BTW, did you mean fpathconf(3)? > >>> >>> • In windowing environments lsp does complete resizes when windows >>> get resized. This means it also reloads the manual page to fit the >>> new window size. >> >> Good. This I miss it in less(1) often. Not sure if they had any strong >> reason to not support that. > > Unfortunately, info(1) also doesn't do full resizes (on my system). > >>> >>> • Search for manual pages using apropos(1); in the current most basic >>> form it lists all known manual pages ready for text search and >>> visiting referenced manual pages. >> >> What does it bring that `apropos * | less` can't do? If you're going the >> of info(1) with full-blown system, it seems reasonable, but I never really >> liked all that if it's just a new terminal and a command away from me. > > You get a pseudo-file from where you can reach any manual page on the > system. Originally I thought this to help novice users but since lsp is > my system's PAGER I use it more often than expected. I'm missing the > ability to give keywords to apropos but that's just a matter of time to > get fixed. I guess that's a matter of preferring navigation in some interactive program (to me, info(1) style), vs standalone simple commands where you first find what you want, then run it. I don't find that magic much more comfortable than $ apropos sysctl ... oh, I find many freebsd pages, let's grep them out ... $ apropos sysctl | grep -v freebsd ... hmm, let's see system ... 
$ apropos system | grep -v freebsd ... okay, now this shows a lot of stuff, let's remove man1 ... $ apropos system | grep -v -e freebsd -e '(1' ... I don't want systemd either ... $ apropos system | grep -v -e freebsd -e '(1' -e systemd ... let's sort by section and navigate through that list ... $ apropos system | grep -v -e freebsd -e '(1' -e systemd | sort -k2 | less Find some pages that may be interesting, note them down, and open them one by one, in different tabs, until I find I wanted to read proc(5), and close everything else. Which brings us to a valid point Eli raised. Some pages are an unreadable mess, and I think proc(5) is one of those that needs a big split into smaller pages such as proc_pid_attr(5). Cheers, Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Playground pager lsp(1) 2023-04-06 1:31 ` Playground pager lsp(1) Alejandro Colomar @ 2023-04-06 6:01 ` Dirk Gouders 0 siblings, 0 replies; 73+ messages in thread From: Dirk Gouders @ 2023-04-06 6:01 UTC (permalink / raw) To: Alejandro Colomar; +Cc: linux-man, help-texinfo [-- Attachment #1: Type: text/plain, Size: 5064 bytes --] Hi Alex, Alejandro Colomar <alx.manpages@gmail.com> writes: >> If you want to take a look at it: there is a branch "next" which you >> might prefer as it closer resembles my current work. There is a new >> toggle "-V" that can be used to completely turn off validation. >> >> I tried to assemble a Makefile that might work without a configure >> script and attach it to the end. A prefix /usr is the default value, if >> your system prefers /usr/local you can use `make prefix=/usr/local > > The default prefix in GNU should be /usr/local > <https://www.gnu.org/prep/standards/html_node/Directory-Variables.html> > >> install`. I hope I prepared some reasonable Makefile... > > I'll have a look. Perhaps, I messed up the Makefile. Some time ago, I test-compiled lsp on Rasbpian and CentOS and -lutil was missing. That got fixed in meson.build but not in the Makefile I sent you. I'll attach a new one -- this time as plain text ;-) >>> If it's the first, how do you handle exit(1)? Is it a reference, or is it >>> just code (with the meaning exit(EXIT_FAILURE))? >> >> exit(1) gets recognized as a possible reference but validation will fail. > > `man 'exit(1)'` works for me. It brings the exit(1posix) page, from > manpages-posix. Oh yes, I didn't have the POSIX manual pages installed -- now, exit(1) gets recognized as a reference. Thank you. >> No, broken references aren't marked. Usually those unavailable >> references make sense, e.g. if a manual page references some program >> that not everyone uses. >> >> One example that I couldn't resolve so far is a reference to >> getconf(1) for example in fpatchconf(3). 
Up to now I was not able to >> find out which package contains getconf(1)... > > $ apt-file find /getconf.1 > glibc-source: /usr/src/glibc/debian/local/manpages/getconf.1 > libc-bin: /usr/share/man/man1/getconf.1.gz > manpages-fr: /usr/share/man/fr/man1/getconf.1.gz > > It's in libc-bin. > > BTW, did you mean fpathconf(3)? Yes, that was a typo. I'm on Gentoo and there is no libc-bin. But now I have a direction to search. Thank you, again. > >> >>>> >>>> • In windowing environments lsp does complete resizes when windows >>>> get resized. This means it also reloads the manual page to fit the >>>> new window size. >>> >>> Good. This I miss it in less(1) often. Not sure if they had any strong >>> reason to not support that. >> >> Unfortunately, info(1) also doesn't do full resizes (on my system). >> >>>> >>>> • Search for manual pages using apropos(1); in the current most basic >>>> form it lists all known manual pages ready for text search and >>>> visiting referenced manual pages. >>> >>> What does it bring that `apropos * | less` can't do? If you're going the >>> of info(1) with full-blown system, it seems reasonable, but I never really >>> liked all that if it's just a new terminal and a command away from me. >> >> You get a pseudo-file from where you can reach any manual page on the >> system. Originally I thought this to help novice users but since lsp is >> my system's PAGER I use it more often than expected. I'm missing the >> ability to give keywords to apropos but that's just a matter of time to >> get fixed. > > I guess that's a matter of preferring navigation in some interactive > program (to me, info(1) style), vs standalone simple commands where you > first find what you want, then run it. > > I don't find that magic much more comfortable than > > $ apropos sysctl > ... oh, I find many freebsd pages, let's grep them out ... > $ apropos sysctl | grep -v freebsd > ... hmm, let's see system ... > $ apropos system | grep -v freebsd > ... 
okay, now this shows a lot of stuff, let's remove man1 ...
> $ apropos system | grep -v -e freebsd -e '(1'
> ... I don't want systemd either ...
> $ apropos system | grep -v -e freebsd -e '(1' -e systemd
> ... let's sort by section and navigate through that list ...
> $ apropos system | grep -v -e freebsd -e '(1' -e systemd | sort -k2 | less
>
> Find some pages that may be interesting, note them down, and open
> them one by one, in different tabs, until I find I wanted to read
> proc(5), and close everything else.

As I wrote: I (also) had novice users in mind when I implemented the
Apropos pseudo-file (it can also be used for verification, that's
another use of it).

I often watch novice users getting frustrated about all the typing that
is needed to get useful results. I know, first of all, they need to
train their "keyboard abilities" but some help here and there could
perhaps help to keep them on board or minimize frustration...

> Which brings us to a valid point Eli raised. Some pages are an
> unreadable mess, and I think proc(5) is one of those that needs
> a big split into smaller pages such as proc_pid_attr(5).

Yes, one of the points where I thought pagers with additional features
could help...

Cheers,

Dirk

[-- Attachment #2: Makefile for lsp(1) --]
[-- Type: text/plain, Size: 600 bytes --]

version=\"$(shell cat .version)\"

CFLAGS := $(shell pkg-config --cflags ncursesw)
CFLAGS += -DLSP_VERSION=$(version)
LDFLAGS := $(shell pkg-config --libs ncursesw)
LDFLAGS += -lutil

ifeq ($(prefix),)
    prefix := /usr/local
endif

lsp: lsp.c
	gcc $(CFLAGS) $(LDFLAGS) -o $@ $<

doc/lsp.1: doc/lsp.adoc
	a2x --doctype manpage --format manpage -a lsp-version=$(version) $<

.PHONY: uninstall install

install: lsp doc/lsp.1 doc/lsp-help.1
	install lsp $(prefix)/bin
	install doc/lsp.1 doc/lsp-help.1 $(prefix)/share/man/man1/

uninstall:
	rm $(prefix)/bin/lsp
	rm $(prefix)/share/man/man1/lsp{,-help}.1