* Exctracting source code from EXAMPLES
@ 2022-03-20 20:34 Alejandro Colomar (man-pages)
  2022-03-20 21:26 ` Ingo Schwarze
  2022-03-20 22:27 ` Exctracting source code from EXAMPLES Stephen Kitt
  0 siblings, 2 replies; 7+ messages in thread
From: Alejandro Colomar (man-pages) @ 2022-03-20 20:34 UTC (permalink / raw)
  To: Stephen Kitt, G. Branden Robinson, Michael Kerrisk (man-pages),
	linux-man, Ingo Schwarze

Gidday!

I have some code ready to extract source code from EXAMPLES in man-pages.
For that, I set up a convention:

Enclose the code (including the enclosing .EX/.EE) in a pair of comments
with very precise formatting:

[[
...
.\" SRC BEGIN (program_name.c)
.EX
#include <stdio.h>

int main(void)
{
	printf("Hello, world!");
}
.EE
.\" SRC END
...
]]

There can be multiple programs in a single page, with the only
restriction that each of them has to have a different program_name
(names may collide across different manual pages, but not within
the same manual page).

The Makefile will create a directory for each manual page, where the
different programs will be created with the names specified in the
comments (that's why a name only has to be unique within its own
page).
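
A rough Python sketch of the convention follows (it is not the sed
pipeline from the patch below; the function name extract_examples is
made up for illustration).  It copies the roff source lines as they
are, whereas the Makefile rule renders them through man(1), so roff
escapes such as \e are not resolved here:

```python
import re

# Marker lines, exactly as described in the convention above.
BEGIN_RE = re.compile(r'^\.\\" SRC BEGIN \((.+\.c)\)$')
END_LINE = '.\\" SRC END'


def extract_examples(page_text):
    """Map each program name to the lines between its SRC BEGIN/END markers."""
    programs = {}
    name = None  # name of the block currently being collected, if any
    for line in page_text.splitlines():
        m = BEGIN_RE.match(line)
        if m:
            name = m.group(1)
            if name in programs:
                # Names must be unique within one manual page.
                raise ValueError(f"duplicate program name in page: {name}")
            programs[name] = []
        elif line == END_LINE:
            name = None
        elif name is not None and line not in (".EX", ".EE"):
            programs[name].append(line)
    return {n: "\n".join(body) + "\n" for n, body in programs.items()}


page = "\n".join([
    ".SH EXAMPLES",
    '.\\" SRC BEGIN (hello.c)',
    ".EX",
    "#include <stdio.h>",
    'int main(void) { puts("hi"); }',
    ".EE",
    '.\\" SRC END',
])
print(extract_examples(page))
```

Writing each entry to <page directory>/<name> would then give the layout
described above.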

Please, check that you like what you see, and comment if not (or if yes
too :).

I tested it with membarrier.2, and it produced a correct .c file.
The next step will be to add targets to lint and compile the produced
files, to check their correctness.

I hope this will make our lives much easier maintaining manual pages :-)


Cheers,

Alex



diff --git a/Makefile b/Makefile
index 03ebde18c..05a1b5950 100644
--- a/Makefile
+++ b/Makefile
@@ -30,19 +30,20 @@ MAKEFLAGS += --no-print-directory
 MAKEFLAGS += --warn-undefined-variables


-srcdir := .
+srcdir   := .
 builddir := tmp
-LINTDIR := $(builddir)/lint
-HTMLDIR := $(builddir)/html
+LINTDIR  := $(builddir)/lint
+HTMLDIR  := $(builddir)/html
+SRCDIR   := $(builddir)/src

 DESTDIR :=
-prefix := /usr/local
-SYSCONFDIR := $(srcdir)/etc
-TMACDIR := $(SYSCONFDIR)/groff/tmac
+prefix  := /usr/local
+SYSCONFDIR  := $(srcdir)/etc
+TMACDIR     := $(SYSCONFDIR)/groff/tmac
 datarootdir := $(prefix)/share
-docdir := $(datarootdir)/doc
-MANDIR := $(srcdir)
-mandir := $(datarootdir)/man
+docdir  := $(datarootdir)/doc
+MANDIR  := $(srcdir)
+mandir  := $(datarootdir)/man
 MAN0DIR := $(MANDIR)/man0
 MAN1DIR := $(MANDIR)/man1
 MAN2DIR := $(MANDIR)/man2
@@ -61,7 +62,7 @@ man5dir := $(mandir)/man5
 man6dir := $(mandir)/man6
 man7dir := $(mandir)/man7
 man8dir := $(mandir)/man8
-manext := \.[0-9]
+manext  := \.[0-9]
 man0ext := .0
 man1ext := .1
 man2ext := .2
@@ -71,9 +72,9 @@ man5ext := .5
 man6ext := .6
 man7ext := .7
 man8ext := .8
-htmldir := $(docdir)
+htmldir  := $(docdir)
 htmldir_ := $(htmldir)/man
-htmlext := .html
+htmlext  := .html

 TMACFILES            := $(sort $(shell find $(TMACDIR) -not -type d))
 TMACNAMES            := $(basename $(notdir $(TMACFILES)))
@@ -99,9 +100,11 @@ MAN2HTMLFLAGS         := $(DEFAULT_MAN2HTMLFLAGS) $(EXTRA_MAN2HTMLFLAGS)
 INSTALL      := install
 INSTALL_DATA := $(INSTALL) -m 644
 INSTALL_DIR  := $(INSTALL) -m 755 -d
+MKDIR        := mkdir -p
 RM           := rm
 RMDIR        := rmdir --ignore-fail-on-non-empty
 GROFF        := groff
+MAN          := man
 MANDOC       := mandoc
 MAN2HTML     := man2html

@@ -161,12 +164,14 @@ _man5pages := $(filter %$(man5ext),$(_manpages))
 _man6pages := $(filter %$(man6ext),$(_manpages))
 _man7pages := $(filter %$(man7ext),$(_manpages))
 _man8pages := $(filter %$(man8ext),$(_manpages))
-LINT_groff := $(patsubst $(MANDIR)/%,$(LINTDIR)/%.lint.groff.touch,$(LINTPAGES))
-LINT_mandoc:= $(patsubst $(MANDIR)/%,$(LINTDIR)/%.lint.mandoc.touch,$(LINTPAGES))
+LINT_groff :=$(patsubst $(MANDIR)/%,$(LINTDIR)/%.lint.groff.touch,$(LINTPAGES))
+LINT_mandoc:=$(patsubst $(MANDIR)/%,$(LINTDIR)/%.lint.mandoc.touch,$(LINTPAGES))
+SRCPAGEDIRS:=$(patsubst $(MANDIR)/%,$(SRCDIR)/%,$(LINTPAGES))

 MANDIRS   := $(sort $(shell find $(MANDIR)/man? -type d))
 HTMLDIRS  := $(patsubst $(MANDIR)/%,$(HTMLDIR)/%/.,$(MANDIRS))
 LINTDIRS  := $(patsubst $(MANDIR)/%,$(LINTDIR)/%/.,$(MANDIRS))
+SRCDIRS   := $(patsubst $(MANDIR)/%,$(SRCDIR)/%/.,$(MANDIRS))
 _htmldirs := $(patsubst $(HTMLDIR)/%,$(DESTDIR)$(htmldir_)/%,$(HTMLDIRS))
 _mandirs  := $(patsubst $(MANDIR)/%,$(DESTDIR)$(mandir)/%/.,$(MANDIRS))
 _man0dir  := $(filter %man0/.,$(_mandirs))
@@ -248,6 +253,37 @@ uninstall-man: $(_mandir_rmdir) $(uninstall_manX)
        @:


+########################################################################
+# src
+
+$(SRCPAGEDIRS): $(SRCDIR)/%: $(MANDIR)/% | $$(@D)/.
+       $(info MKDIR    $@ $<)
+       $(RM) -rf $@
+       $(MKDIR) $@.tmp
+       <$< \
+       sed -n 's/\.\\" SRC BEGIN (\(.*.c\))/\1/p' \
+       | while read f; do \
+               <$< \
+               sed -n \
+                       -e '/^\.TH/,/^\.SH/{/^\.SH/!p}' \
+                       -e '/^\.SH EXAMPLES/p' \
+                       -e "/^\... SRC BEGIN ($$f)$$/,/^\... SRC END$$/p" \
+               | $(MAN) -P cat -l - \
+               | sed '/^[^ ]/d' \
+               >$@.tmp/$$f; \
+       done \
+       || exit $$?
+       mv -T $@.tmp $@
+
+.PHONY: build-src src
+build-src src: $(SRCPAGEDIRS) | builddirs-src
+       @:
+
+.PHONY: builddirs-src
+builddirs-src: $(SRCDIRS)
+       @:
+
+
 ########################################################################
 # lint

diff --git a/man2/membarrier.2 b/man2/membarrier.2
index b2e3e035e..a46283dd7 100644
--- a/man2/membarrier.2
+++ b/man2/membarrier.2
@@ -319,6 +319,7 @@ following code (x86) can be transformed using
 .BR membarrier ():
 .PP
 .in +4n
+.\" SRC BEGIN (membarrier.c)
 .EX
 #include <stdlib.h>

@@ -365,6 +366,7 @@ main(int argc, char *argv[])
     exit(EXIT_SUCCESS);
 }
 .EE
+.\" SRC END
 .in
 .PP
 The code above transformed to use


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


* Re: Exctracting source code from EXAMPLES
  2022-03-20 20:34 Exctracting source code from EXAMPLES Alejandro Colomar (man-pages)
@ 2022-03-20 21:26 ` Ingo Schwarze
  2022-03-20 21:55   ` Alejandro Colomar (man-pages)
  2022-03-20 22:27 ` Exctracting source code from EXAMPLES Stephen Kitt
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2022-03-20 21:26 UTC (permalink / raw)
  To: alx.manpages; +Cc: steve, g.branden.robinson, mtk.manpages, linux-man

Hi Alex,

Alejandro Colomar (man-pages) wrote on Sun, Mar 20, 2022 at 09:34:47PM +0100:

> I have ready some code to extract source code from EXAMPLES in man-pages.

Frankly, i don't see the point at all.

Manual Pages are not HOWTO documents that mindless users are supposed
to copy from verbatim without understanding what they see.  Instead,
they are supposed to be read with your brain switched on and the reader
is supposed to *apply* what they learnt, not copy it.

> .\" SRC BEGIN (program_name.c)

Ugly as hell.  I would very strongly object to having anything
like that added to any manual pages i maintain.  When people add
comments in order to convey syntax and semantics to a machine,
that is a sure sign that the design of whatever it is intended
to achieve was totally botched.

> The next step will be to add targets to lint and compile the produced
> files, to check their correctness.

If any code snippet from an EXAMPLES section does compile, i would
argue that it is severely ill-designed as it obviously contains lots
of needless fluff that distracts from the point the example is
actually trying to demonstrate.  It ought to be stripped down to
what really matters, to become shorter, more readable, and more
to the point.

Here are a few EXAMPLES sections (in formatted form for readability)
that demonstrate what EXAMPLES sections should look like:

  EXAMPLES  /* from chroot(2) */
     The following example changes the root directory to newroot,
     sets the current directory to the new root, and drops some
     setuid privileges.  There may be other privileges which need to
     be dropped as well.

           #include <err.h>
           #include <unistd.h>

           if (chroot(newroot) != 0 || chdir("/") != 0)
                   err(1, "%s", newroot);
           setresuid(getuid(), getuid(), getuid());

  EXAMPLES  /* from write(2) */
     A typical loop allowing partial writes looks like this:

     const char *buf;
     size_t bsz, off;
     ssize_t nw;
     int d;

     for (off = 0; off < bsz; off += nw)
             if ((nw = write(d, buf + off, bsz - off)) == 0 || nw == -1)
                     err(1, "write");

  EXAMPLES  /* from BIO_s_fd(3) */
     This is a file descriptor BIO version of "Hello World":

           BIO *out;
           out = BIO_new_fd(fileno(stdout), BIO_NOCLOSE);
           BIO_printf(out, "Hello World\n");
           BIO_free(out);

  EXAMPLES  /* from MB_CUR_MAX(3) */
     Size a buffer in a portable way to hold one single multibyte character:

           char     buf[MB_LEN_MAX];
           wchar_t  wchar;  /* input value */

           if (wctomb(buf, wchar) == -1)
                   /* error */

     Switch between code handling the ascii(7) and UTF-8 character
     encodings in an OpenBSD-specific way (not portable):

           if (MB_CUR_MAX == 1) {
                   /* Code to handle ASCII-encoded single-byte strings. */
           } else {
                   /* Code to handle UTF-8-encoded multibyte strings. */
           }

  EXAMPLES  /* from malloc(3) */
     If malloc() must be used with multiplication, be sure to test for
     overflow:

           size_t num, size;
           ...

           /* Check for size_t overflow */
           if (size && num > SIZE_MAX / size)
                   errc(1, EOVERFLOW, "overflow");

           if ((p = malloc(num * size)) == NULL)
                   err(1, NULL);

     The above test is not sufficient in all cases.  For example,
     multiplying ints requires a different set of checks:

           int num, size;
           ...

           /* Avoid invalid requests */
           if (size < 0 || num < 0)
                   errc(1, EOVERFLOW, "overflow");

           /* Check for signed int overflow */
           if (size && num > INT_MAX / size)
                   errc(1, EOVERFLOW, "overflow");

           if ((p = malloc(num * size)) == NULL)
                   err(1, NULL);

     Assuming the implementation checks for integer overflow as
     OpenBSD does, it is much easier to use calloc(), reallocarray(),
     or recallocarray().

     The above examples could be simplified to:

           if ((p = reallocarray(NULL, num, size)) == NULL)
                   err(1, NULL);

     or at the cost of initialization:

           if ((p = calloc(num, size)) == NULL)
                   err(1, NULL);

Yours,
  Ingo


* Re: Exctracting source code from EXAMPLES
  2022-03-20 21:26 ` Ingo Schwarze
@ 2022-03-20 21:55   ` Alejandro Colomar (man-pages)
  2022-03-25  4:14     ` automated example verification in the groff Texinfo manual G. Branden Robinson
  0 siblings, 1 reply; 7+ messages in thread
From: Alejandro Colomar (man-pages) @ 2022-03-20 21:55 UTC (permalink / raw)
  To: Ingo Schwarze; +Cc: steve, g.branden.robinson, mtk.manpages, linux-man

Hi Ingo,

On 3/20/22 22:26, Ingo Schwarze wrote:
> Hi Alex,
> 
> Alejandro Colomar (man-pages) wrote on Sun, Mar 20, 2022 at 09:34:47PM +0100:
> 
>> I have ready some code to extract source code from EXAMPLES in man-pages.
> 
> Frankly, i don't see the point at all.
> 
> Manual Pages are not HOWTO documents that mindless users are supposed
> to copy from verbatim without understanding what they see.  Instead,
> they are supposed to be read with your brain switched on and the reader
> is supposed to *apply* what they learnt, not copy it.

This feature is not supposed to be used by manual page readers (users),
but by manual page editors.

I wanted to do this a long time ago for a few reasons:

- I know some manual pages are incorrect in some minor ways, which I'd
like to fix.  For example, the includes in some examples are incorrect,
typically listing more headers than needed.  Reading hundreds of pages
carefully to fix them would be impossible.  And having slightly incorrect
examples in the manual pages is something I would really like to avoid.
We have received reports of programs that added headers because the
manual page said they had to, when they were really not needed.

- Sometimes, incorrect patches accidentally break the manual page
formatting (for example, "\n" instead of "\en").  If the patch looks
simple enough that I don't bother rendering the manual page to check it,
and I don't notice the small detail, I may apply a patch that breaks an
example program very badly.  Having a magic button that checks that the
code at least compiles would greatly reduce those bugs.

- When reviewing an incoming patch, instead of me having to read through
it and after some time then replying "hey, please render your manual
pages before sending patches to see that your code doesn't produce what
you thought", I can just run `make build-src && make lint-src`, and if
it fails, I can tell the contributor: "run `make build-src && make
lint-src`, and you'll notice that your program contains some serious
problems.  Rerun until you are happy with it.  If you can't figure out
how to fix something, you can ask.".
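
To illustrate the second point, here is a heuristic Python sketch (an
assumption about what such a check could look like, not part of the
lint branch) that flags C escapes such as \n, \t, or \0 inside .EX/.EE
blocks when they were not written with \e.  The function name and the
set of escapes checked are illustrative only:

```python
import re

# A backslash followed by n, t, or 0 that is not itself preceded by a
# backslash; in roff source these should have been written \en, \et, \e0.
SUSPECT_RE = re.compile(r'(?<!\\)\\[nt0]')


def find_unescaped_c_escapes(page_text):
    """Return (line_number, line) pairs for suspicious lines inside .EX/.EE."""
    hits = []
    in_example = False
    for number, line in enumerate(page_text.splitlines(), start=1):
        if line == ".EX":
            in_example = True
        elif line == ".EE":
            in_example = False
        elif in_example and SUSPECT_RE.search(line):
            hits.append((number, line))
    return hits


page = "\n".join([
    ".EX",
    r'printf("broken\n");',
    r'printf("fine\en");',
    ".EE",
])
for number, line in find_unescaped_c_escapes(page):
    print(f"{number}: {line}")
```

Being a plain pattern match, it cannot tell an intentional roff escape
from a mistake, so it can only point at lines worth a second look.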

It's kind of a man-pages static analyzer.  Having it passing is not a
measure of how good a manual page is, but having it break is an
indicator that the manual page may actually have some problems.
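
A minimal Python sketch of such a "does it at least compile" check,
assuming a gcc- or clang-compatible cc on the PATH (both accept
-fsyntax-only and -x c); the helper name syntax_check is hypothetical
and this is not the planned Makefile target itself:

```python
import shutil
import subprocess


def syntax_check(source, compiler="cc"):
    """Return (ok, diagnostics), or None if the compiler is not installed.

    Only parsing and semantic checks are run (-fsyntax-only); nothing is
    linked or executed.
    """
    if shutil.which(compiler) is None:
        return None
    result = subprocess.run(
        [compiler, "-Wall", "-fsyntax-only", "-x", "c", "-"],
        input=source,          # the extracted program is fed on stdin
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr


good = "int main(void) { return 0; }\n"
bad = "int main(void) { return 0 }\n"  # missing semicolon
for label, src in (("good", good), ("bad", bad)):
    print(label, syntax_check(src))
```

As said above, a pass proves little, but a failure is a cheap hint that
the page needs another look.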

> 
>> .\" SRC BEGIN (program_name.c)
> 
> Ugly as hell.  I would very strongly object to have anything
> like that added to any manual pages i maintain.  When people add
> comments in order to convey syntax and semantics to a machine,
> that is a sure sign that the design of whatever it is intended
> to achieve was totally botched.

I first thought of some way to achieve this without markers, but the
regex would be very unreliable with current manual pages.  If we
standardize them a bit more, these markers might be made unnecessary,
but I'm not sure about that.

> 
>> The next step will be to add targets to lint and compile the produced
>> files, to check their correctness.
> 
> If any code snippet from an EXAMPLES section does compile, i would
> argue that it is severely ill-designed as it obviously contains lots
> of needless fluff that distracts from the point the example is
> actually trying to demonstrate.  It ought to be stripped down to
> what really matters, to become shorter, more readable, and more
> to the point.

In this project, there are examples like the ones you point to below, but
they are usually embedded in the text.  In the EXAMPLES section we
usually have full programs, which are normally minimal working programs
that demonstrate the functions described.  I don't think they have much
noise.  I've sometimes used them myself, and I like them, because I have
something working from which I can test stuff.  Sometimes it's quite
hard to translate man-pages text into a running program, and having a
working example program helps disambiguate the text (at least in my
brain it works like that).

> 
> Here are a few EXAMPLES sections (in formatted form for readability)
> that demonstrate what EXAMPLES sections should look like:
> 
>   EXAMPLES  /* from chroot(2) */
>      The following example changes the root directory to newroot,
>      sets the current directory to the new root, and drops some
>      setuid privileges.  There may be other privileges which need to
>      be dropped as well.
> 
>            #include <err.h>
>            #include <unistd.h>
> 
>            if (chroot(newroot) != 0 || chdir("/") != 0)
>                    err(1, "%s", newroot);
>            setresuid(getuid(), getuid(), getuid());
> 
[...]
> 
> Yours,
>   Ingo

Cheers,

Alex


* Re: Exctracting source code from EXAMPLES
  2022-03-20 20:34 Exctracting source code from EXAMPLES Alejandro Colomar (man-pages)
  2022-03-20 21:26 ` Ingo Schwarze
@ 2022-03-20 22:27 ` Stephen Kitt
  2022-03-21  0:02   ` Alejandro Colomar (man-pages)
  1 sibling, 1 reply; 7+ messages in thread
From: Stephen Kitt @ 2022-03-20 22:27 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages)
  Cc: G. Branden Robinson, Michael Kerrisk (man-pages),
	linux-man, Ingo Schwarze



Hi Alex,

On Sun, 20 Mar 2022 21:34:47 +0100, "Alejandro Colomar (man-pages)"
<alx.manpages@gmail.com> wrote:
> I have ready some code to extract source code from EXAMPLES in man-pages.
> For that, I set up some convention:
> 
> Enclose the code (including the enclosing .EX/.EE) in a pair of comments
> with a very precise formatting:
> 
> [[
> ...
> .\" SRC BEGIN (program_name.c)
> .EX
> #include <stdio.h>
> 
> int main(void)
> {
> 	printf("Hello, world!");
> }
> .EE
> .\" SRC END
> ...
> ]]
> 
> There can be multiple programs in a single page, with the only
> restriction that each of them has to have a different program_name
> (there can be collisions within different manual pages, but not within
> the same manual page)
> 
> The Makefile will create a directory for each manual page, where the
> different programs will be created with the name specified in the
> comment (that's why it has to be different from others in the same page
> only).
> 
> Please, check that you like what you see, and comment if not (or if yes
> too :).

I’ve been working on something similar, slightly further along (the linting
targets work). The extraction scripts could do with some improvement, but the
Makefile changes are small:

# Check that example programs included in man pages really build
Makefile.examples: $(MANPAGES)
	scripts/list-example-files $^ > $@

include Makefile.examples

# Sources are listed as well as objects to ensure we update all source files
# CPPFLAGS and TARGET_ARCH are defined to avoid warnings
.PHONY: check-example-programs
check-example-programs: CFLAGS = -Wall
check-example-programs: CPPFLAGS =
check-example-programs: TARGET_ARCH =
check-example-programs: $(EXAMPLE_SRCS) $(EXAMPLE_OBJS)

.PHONY: clean-example-programs
clean-example-programs:
	rm -f $(EXAMPLE_SRCS) $(EXAMPLE_OBJS)


scripts/list-example-files builds a separate Makefile to extract all the
programs, which ends up looking like

/home/steve/man-pages/man1/memusage.c: /home/steve/man-pages/man1/memusage.1
	scripts/extract-example-files $<

/home/steve/man-pages/man1/prog.c /home/steve/man-pages/man1/libdemo.c: /home/steve/man-pages/man1/sprof.1
	scripts/extract-example-files $<
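
The shape of that generated Makefile can be sketched in Python (the
real generator is the attached awk script; render_makefile and the
page man2/read.2 are illustrative names, and relative paths are used
for brevity):

```python
def render_makefile(outputs_by_page, extractor="scripts/extract-example-files"):
    """Render one rule per man page plus the EXAMPLE_SRCS/EXAMPLE_OBJS lists."""
    chunks = []
    sources = []
    for page, outputs in outputs_by_page.items():
        if not outputs:
            continue  # pages without example programs get no rule
        chunks.append(f"{' '.join(outputs)}: {page}\n\t{extractor} $<\n")
        sources.extend(outputs)
    # Every .c source gets a matching .o entry.
    objects = [s[:-2] + ".o" for s in sources if s.endswith(".c")]
    chunks.append("EXAMPLE_SRCS += " + " ".join(sources))
    chunks.append("EXAMPLE_OBJS += " + " ".join(objects))
    return "\n".join(chunks) + "\n"


print(render_makefile({
    "man1/memusage.1": ["man1/memusage.c"],
    "man1/sprof.1": ["man1/prog.c", "man1/libdemo.c"],
    "man2/read.2": [],
}))
```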


The scripts are attached. The patterns used to identify source code are close
to those already present: .EX/.EE introduced by “Program source” (in which
case the source code is extracted to a C file named after the man page) or by
a

.\" Example file

comment which can optionally name the file.
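
A simplified Python sketch of those naming rules follows; it is only an
approximation of the attached awk scripts (for instance, it ignores a
naming comment placed inside an already open .EX block), and
example_outputs is a hypothetical name:

```python
import posixpath
import re

# "Program source" or a '.\" Example file' comment, optionally followed
# by ": name" giving the output file.
INTRO_RE = re.compile(r'(?:Program source|^\.\\" Example file)(?:: (\S+))?$')


def example_outputs(page_path, page_text):
    """List the files that the .EX blocks of one page would be written to."""
    directory = posixpath.dirname(page_path) or "."
    default = posixpath.splitext(page_path)[0] + ".c"  # foo.1 -> foo.c
    outputs = []
    pending = None  # output chosen by the most recent introducer
    for line in page_text.splitlines():
        m = INTRO_RE.search(line)
        if m:
            name = m.group(1)
            pending = posixpath.join(directory, name) if name else default
        elif line == ".EX" and pending is not None:
            outputs.append(pending)
        elif line == ".EE":
            pending = None  # a later .EX needs its own introducer
    return outputs


page = "\n".join([
    ".SS Program source: prog.c",
    ".EX",
    "int main(void) { return 0; }",
    ".EE",
    '.\\" Example file: libdemo.c',
    ".EX",
    "void f(void) {}",
    ".EE",
    ".EX",
    "$ cc prog.c",
    ".EE",
])
print(example_outputs("man1/sprof.1", page))
```

The last .EX block (a shell session) has no introducer, so it is not
treated as program source.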

This has identified some more man pages which need fixes to their example
code, I’ll send patches tomorrow.

Regards,

Stephen

[-- Attachment #1.2: extract-example-files --]
[-- Type: application/octet-stream, Size: 1418 bytes --]

#!/usr/bin/awk -f

function dirname(filename) {
    if (!sub(/\/[^\/]*\/?$/, "", filename)) {
	return "."
    } else if (filename != "") {
	return filename
    }
    return "/"
}

function start_output(output) {
    currout = output
    outputting = 1
    $0 = ".TH \"//\" 0\n.EX"
}

function end_output() {
    outputting = 0
}

BEGINFILE {
    currout = ""
}

# We look for .EX/.EE blocks introduced by "Program source" and/or by
# a qualifier. This qualifier can be added before .EX and inside the block
# to name files using a comment of the form
# .\" Example file: foobar.c
# or it can introduce example files without a "Program source" subsection:
# .\" Example file
# When names are used, we want the first comment before .EX so that we avoid
# potentially detecting a duplicate when there would be no content

# Unnamed example file (decision delayed until .EX)
/Program source$/ || /^\.\\" Example file$/ {
    inps = 1; output = substr(FILENAME, 1, length(FILENAME) - 2) ".c"
}

# Example named in the text
/Program source: / || /^\.\\" Example file: / {
    inps = 1; output = dirname(FILENAME) "/" $NF
    if (inex) {
	# We've already started a block, we're changing files
	start_output(output)
    }
}

inps && !inex && /^\.EX$/ {
    inex = 1
    start_output(output)
}

/^\.EE$/ { inps = 0; inex = 0 }

inps && inex && outputting {
#    print > currout
    print | ("groff -man -Tutf8 - > " currout)
}

[-- Attachment #1.3: list-example-files --]
[-- Type: application/octet-stream, Size: 1996 bytes --]

#!/usr/bin/awk -f

# Output file names are tracked to ensure we don't have duplicates

BEGINFILE {
    files = ""
}

function dirname(filename){
    if (!sub(/\/[^\/]*\/?$/, "", filename)) {
	return "."
    } else if (filename != "") {
	return filename
    }
    return "/"
}

function start_output(output) {
    if (output in outputs) {
	printf "Duplicate detected, %s is produced by %s and %s.\n", output, outputs[output], FILENAME
    }
    outputs[output] = FILENAME
    if (files) {
	files = files " " output
    } else {
	files = output
    }
}    

# We look for .EX/.EE blocks introduced by "Program source" and/or by
# a qualifier. This qualifier can be added before .EX and inside the block
# to name files using a comment of the form
# .\" Example file: foobar.c
# or it can introduce example files without a "Program source" subsection:
# .\" Example file
# When names are used, we want the first comment before .EX so that we avoid
# potentially detecting a duplicate when there would be no content

# Unnamed example file (decision delayed until .EX)
/Program source$/ || /^\.\\" Example file$/ {
    inps = 1; output = substr(FILENAME, 1, length(FILENAME) - 2) ".c"
}

# Example named in the text
/Program source: / { inps = 1; output = dirname(FILENAME) "/" $NF }

# Named example file
/^\.\\" Example file: / {
    inps = 1
    output = dirname(FILENAME) "/" $NF
    if (inex) {
	# We've already started a block, we're changing files
	start_output(output)
    }
}

# Start of the .EX block
inps && !inex && /^\.EX$/ {
    inex = 1
    start_output(output)
}

/^\.EE$/ { inps = 0; inex = 0 }

ENDFILE {
    inps = 0; inex = 0
    if (files) {
	printf "%s: %s\n", files, FILENAME
	printf "\tscripts/extract-example-files $<\n\n"
    }
}

END {
    printf "EXAMPLE_SRCS +="
    for (output in outputs) {
	printf " %s", output
    }
    printf "\nEXAMPLE_OBJS +="
    for (output in outputs) {
	if (sub(/\.c$/, ".o", output)) {
	    printf " %s", output
	}
    }
    printf "\n\n"
}


* Re: Exctracting source code from EXAMPLES
  2022-03-20 22:27 ` Exctracting source code from EXAMPLES Stephen Kitt
@ 2022-03-21  0:02   ` Alejandro Colomar (man-pages)
  2022-03-21  1:07     ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 7+ messages in thread
From: Alejandro Colomar (man-pages) @ 2022-03-21  0:02 UTC (permalink / raw)
  To: Stephen Kitt
  Cc: G. Branden Robinson, Michael Kerrisk (man-pages),
	linux-man, Ingo Schwarze

Hi Stephen!

On 3/20/22 23:27, Stephen Kitt wrote:
> Hi Alex,
> 
> On Sun, 20 Mar 2022 21:34:47 +0100, "Alejandro Colomar (man-pages)" 
> I’ve been working on something similar, slightly further along (the linting
> targets work).

I've been adding code for linting this evening too.  Currently I'm
compiling and linking the programs extracted.  I only did it with
membarrier.2 for now, and I've already fixed a line.

I also plan to add some static analyzers, such as iwyu(1),
clang-tidy(1), checkpatch.pl, and some others.

You could check my 'lint' branch here:
<http://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/log/?h=lint>

Compare that with what you have, and we can develop some mix of both.

With what I have, you need to run:

make build-src && make build-ld

> The extraction scripts could do with some improvement, but the
> Makefile changes are small:



> 
> # Check that example programs included in man pages really build
> Makefile.examples: $(MANPAGES)
> 	scripts/list-example-files $^ > $@
> 
> include Makefile.examples
> 
> # Sources are listed as well as objects to ensure we update all source files
> # CPPFLAGS and TARGET_ARCH are defined to avoid warnings
> .PHONY: check-example-programs
> check-example-programs: CFLAGS = -Wall
> check-example-programs: CPPFLAGS =
> check-example-programs: TARGET_ARCH =
> check-example-programs: $(EXAMPLE_SRCS) $(EXAMPLE_OBJS)
> 
> .PHONY: clean-example-programs
> clean-example-programs:
> 	rm -f $(EXAMPLE_SRCS) $(EXAMPLE_OBJS)
> 
> 
> scripts/list-example-files builds a separate Makefile to extract all the
> programs, which ends up looking like
> 
> /home/steve/man-pages/man1/memusage.c: /home/steve/man-pages/man1/memusage.1
> 	scripts/extract-example-files $<
> 
> /home/steve/man-pages/man1/prog.c /home/steve/man-pages/man1/libdemo.c: /home/steve/man-pages/man1/sprof.1
> 	scripts/extract-example-files $<

I like the idea of autogenerating and including a makefile, which allows
listing the example programs in the process.  In my current makefile, I
need to run make build-src before any further actions on those files,
since I don't know them at the time of setting variables.

I may take some bits from here.

> 
> 
> The scripts are attached. The patterns used to identify source code are close
> to those already present: .EX/.EE introduced by “Program source” (in which
> case the source code is extracted to a C file named after the man page) or by
> a
> 
> .\" Example file
> 
> comment which can optionally name the file.

Those scripts seem a bit messy, due to the problem of not having a
standardized comment.  Are they reliable?  Or are there false negatives
or positives?

> 
> This has identified some more man pages which need fixes to their example
> code, I’ll send patches tomorrow.

Okay.

Cheers,

Alex


* Re: Exctracting source code from EXAMPLES
  2022-03-21  0:02   ` Alejandro Colomar (man-pages)
@ 2022-03-21  1:07     ` Alejandro Colomar (man-pages)
  0 siblings, 0 replies; 7+ messages in thread
From: Alejandro Colomar (man-pages) @ 2022-03-21  1:07 UTC (permalink / raw)
  To: Stephen Kitt
  Cc: G. Branden Robinson, Michael Kerrisk (man-pages),
	linux-man, Ingo Schwarze

Hi Stephen,

On 3/21/22 01:02, Alejandro Colomar (man-pages) wrote:
> I like the idea of autogenerating and including a makefile, which allows
> listing the example programs in the process.  In my current makefile, I
> need to run make build-src before any further actions on those files,
> since I don't know them at the time of setting variables.
> 
> I may take some bits from here.

Well, I managed to get everything within the Makefile, with some complex
variable definition; no autogenerated makefiles, and no included
makefiles.  It makes the makefile a bit slow, however, by adding a fixed
initialization overhead:  `make clean` goes from 0.11 s to 1.96 s on my
computer.  2 s is not too much for complex operations, but it's a bit
nasty when you just want to clean, or you just changed one page.  But I
guess any solution will have a similar fixed overhead (or it will be
fast, but will require a separate step such as `make build-src`).

But the result is quite neat and simple, compared to other options:


+SRCPAGEDIRS:=$(patsubst $(MANDIR)/%,$(SRCDIR)/%.d,$(LINTPAGES))
+UNITS_c    := $(patsubst $(MANDIR)/%,$(SRCDIR)/%,$(shell \
+               find $(MANDIR)/man?/ -type f \
+               | grep '$(manext)$$' \
+               | while read m; do \
+                       <$$m \
+                       sed -n "s,^\.\\"'"'" SRC BEGIN (\(.*.c\))$$,$$m.d\1,p";\
+               done))




+########################################################################
+# src
+
+$(SRCPAGEDIRS): $(SRCDIR)/%.d: $(MANDIR)/% | $$(@D)/.
+       $(info MKDIR    $@)
+       $(MKDIR) $@
+
+$(UNITS_c): $$(@D)
+       $(info SED      $@)
+       <$(patsubst $(SRCDIR)/%.d,$(MANDIR)/%,$<) \
+       sed -n \
+               -e '/^\.TH/,/^\.SH/{/^\.SH/!p}' \
+               -e '/^\.SH EXAMPLES/p' \
+               -e "/^\... SRC BEGIN ($(@F))$$/,/^\... SRC END$$/p" \
+       | $(MAN) -P cat -l - \
+       | sed '/^[^ ]/d' \
+       | sed 's/^       //' \
+       >$@ \
+       || exit $$?
+
+$(SRCDIRS): %/.: | $$(dir %). $(SRCDIR)/.
+
+.PHONY: build-src src
+build-src src: $(UNITS_c) | builddirs-src
+       @:
+
+.PHONY: builddirs-src
+builddirs-src: $(SRCDIRS)
+       @:
+
+


Those two big snippets embedded into the Makefile
are similar in essence to the 2 helper scripts you use.
The benefit of embedding them in the Makefile is that I have full
control of them, and can use variables directly from the Makefile.
Also fewer files :).
And even though it's slower,
I prefer it over having to run `make build-src` manually.
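
For readers who find the UNITS_c definition hard to parse, here is the
same idea in Python.  It is a sketch under assumptions: pages are passed
in as a mapping of relative path to text instead of being found with
find(1), list_units is a made-up name, and the
<srcdir>/<page>.d/<name> layout is one reading of the rules above:

```python
import re

BEGIN_RE = re.compile(r'^\.\\" SRC BEGIN \((.+\.c)\)$', re.MULTILINE)


def list_units(pages, srcdir="tmp/src"):
    """Return the .c paths announced by the pages' SRC BEGIN markers.

    `pages` maps a page path relative to the man directory (for example
    "man2/membarrier.2") to its roff source text.
    """
    units = []
    for page, text in sorted(pages.items()):
        for name in BEGIN_RE.findall(text):
            # One directory per page, one file per marked program.
            units.append(f"{srcdir}/{page}.d/{name}")
    return units


pages = {
    "man2/membarrier.2": '.\\" SRC BEGIN (membarrier.c)\n.EX\n.EE\n.\\" SRC END\n',
    "man2/read.2": ".SH NAME\n",  # hypothetical page without markers
}
print(list_units(pages))
```

Every page has to be scanned before any target is known, which is where
the fixed start-up cost mentioned above comes from.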


I've updated my 'lint' branch.

Cheers,

Alex


* automated example verification in the groff Texinfo manual
  2022-03-20 21:55   ` Alejandro Colomar (man-pages)
@ 2022-03-25  4:14     ` G. Branden Robinson
  0 siblings, 0 replies; 7+ messages in thread
From: G. Branden Robinson @ 2022-03-25  4:14 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: Ingo Schwarze, mtk.manpages, linux-man


[subject changed; steve@sk2.org dropped from CC list since he's not
quoted]

At 2022-03-20T22:55:48+0100, Alejandro Colomar (man-pages) wrote:
> On 3/20/22 22:26, Ingo Schwarze wrote:
> > Alejandro Colomar (man-pages) wrote on Sun, Mar 20, 2022 at 09:34:47PM +0100:
> > 
> >> I have ready some code to extract source code from EXAMPLES in
> >> man-pages.
> > 
> > Frankly, i don't see the point at all.
> > 
> > Manual Pages are not HOWTO documents that mindless users are
> > supposed to copy from verbatim without understanding what they see.
> > Instead, they are supposed to be read with your brain switched on and
> > the reader is supposed to *apply* what they learnt, not copy it.
> 
> This feature is not supposed to be used by manual page readers
> (users), but by manual page editors.

Indeed.  For quite a while I've had medium-term plans to do something
similar for the many examples in the groff Texinfo manual.  I want to be
sure they remain accurate.

Here's my sketch for whenever I get back around to this idea.

1. Extract examples with, e.g., "sed -n '/^@Example/,/^@endExample/p'".[1]
2. Use Texinfo comment lines within this example to:

   2a. identify the example (an invisible figure caption, if you will);

   and

   2b. specify additional transformations that should take place on the
       input and/or output--these are not necessarily just more 's' sed
       commands but possibly 'i' and 'd' commands as well.

As an example of an 'i' use case, the line length _has_ to be shortened
for many examples to fit within DVI/PDF page margins.  We note this in
the front matter of the current version of the manual[2], but I'm
uncertain of the utility of having it literally present in every case
(particularly in examples related to issues conceptually distant from
line length and/or breaking).

In some cases, either input or output is wholly elided when it sheds no
further light.

I don't envision changing the manual generation process to _populate_
the examples--that should be done carefully and by hand by someone with
a pedagogue's hat on.  The idea is to _validate_ the examples: read
them in, perform a few invariant sed substitutions (the ones that
distinguish input, output, and error streams in the example text),
perform any further specified sed operations encoded as Texinfo
comments, and then run groff and verify that its output matches.
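
Steps 1 and 2a could be sketched in Python as follows.  The comment
syntax "@c example-id: NAME" is purely hypothetical (no such convention
exists in the manual), the function name is made up, and the sed-style
transformations of step 2b and the groff run itself are left out:

```python
import re

# Hypothetical identifying comment inside an example block.
ID_RE = re.compile(r"@c example-id: (\S+)$")


def extract_texinfo_examples(text):
    """Collect @Example ... @endExample blocks and their optional ids."""
    examples = []
    current = None  # the block being collected, if any
    for line in text.splitlines():
        if line.startswith("@Example"):
            current = {"id": None, "lines": []}
        elif line.startswith("@endExample") and current is not None:
            examples.append(current)
            current = None
        elif current is not None:
            m = ID_RE.match(line)
            if m:
                current["id"] = m.group(1)
            else:
                current["lines"].append(line)
    return examples


sample = "\n".join([
    "@Example",
    "@c example-id: hello",
    ".nf",
    "Hello",
    "@endExample",
    "Some prose.",
    "@Example",
    "foo",
    "@endExample",
])
for example in extract_texinfo_examples(sample):
    print(example["id"], example["lines"])
```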

Since it would run this way, I reckon it can become just another test in
our suite.  (For those who don't follow groff development, groff 1.22.4
shipped with 3 tests.  groff 1.23 can be expected to have at least 111.)

Regards,
Branden

[1] These are custom Texinfo macros defined by the groff manual.
[2] https://git.savannah.gnu.org/cgit/groff.git/tree/doc/groff.texi#n901

