ksummit.lists.linux.dev archive mirror
* [TECH TOPIC] Kernel documentation
@ 2023-06-16 17:48 Jonathan Corbet
  2023-06-20 16:02 ` Jani Nikula
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Jonathan Corbet @ 2023-06-16 17:48 UTC (permalink / raw)
  To: ksummit

The documentation discussion at past kernel summits has been lively, so
I think we should do it again.  Some topics I would bring to a session
this year would include:

- The ongoing restructuring of the Documentation/ directory.  I've been
  slowly moving the architecture docs into Documentation/arch/, but
  would like to do more to reduce the clutter of the top-level directory
  and make our documentation tree more closely resemble the organization
  of the source.

- Structure.  We continue to collect documents, but do little to tie
  them together into a coherent whole.  Do we want to change that and,
  if so, how?

- Support for documentation work.  There is nobody in the community who
  is paid to put any significant time into documentation, and it shows.
  How can we fix that?

- Infrastructure.  Sphinx brings a lot but is far from perfect; what can
  we do to improve it?

Other topics will certainly arise as well.

jon

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-06-16 17:48 [TECH TOPIC] Kernel documentation Jonathan Corbet
@ 2023-06-20 16:02 ` Jani Nikula
  2023-06-20 19:30   ` Jonathan Corbet
  2023-06-29 21:34   ` Intersphinx ([TECH TOPIC] Kernel documentation) Jonathan Corbet
  2023-06-21 11:04 ` [TECH TOPIC] Kernel documentation Thorsten Leemhuis
  2023-11-11 12:42 ` Vegard Nossum
  2 siblings, 2 replies; 23+ messages in thread
From: Jani Nikula @ 2023-06-20 16:02 UTC (permalink / raw)
  To: Jonathan Corbet, ksummit

On Fri, 16 Jun 2023, Jonathan Corbet <corbet@lwn.net> wrote:
> The documentation discussion at past kernel summits has been lively, so
> I think we should do it again.  Some topics I would bring to a session
> this year would include:
>
> - The ongoing restructuring of the Documentation/ directory.  I've been
>   slowly moving the architecture docs into Documentation/arch/, but
>   would like to do more to reduce the clutter of the top-level directory
>   and make our documentation tree more closely resemble the organization
>   of the source.
>
> - Structure.  We continue to collect documents, but do little to tie
>   them together into a coherent whole.  Do we want to change that and,
>   if so, how?
>
> - Support for documentation work.  There is nobody in the community who
>   is paid to put any significant time into documentation, and it shows.
>   How can we fix that?
>
> - Infrastructure.  Sphinx brings a lot but is far from perfect; what can
>   we do to improve it?

It should be more feasible to build the documentation. Make it faster,
reduce the warnings.

Some ideas to make it faster:

- Bump the minimum Sphinx version requirement if it helps the speed. I
  don't think it needs to be as conservative as the compiler.

- Cache kernel-doc results per document. A bunch of .rst files use
  multiple kernel-doc directives for the same source file to better
  control the documentation order [1]. Each directive causes the same
  source to be parsed. (I'm not sure how bad the effect is though.)

- Simplify the rst output kernel-doc produces. For example, use rst
  native field lists for parameter and member descriptions instead of
  hand-crafting them. See [2]. Drop the "definition" part from
  structures, as nobody relies on it anyway. If necessary, add links to
  source instead.

- Default to Sphinx parallel build.

- Consider splitting the whole documentation to multiple smaller
  projects, and linking between them using intersphinx. (This may be a
  tall order.)

Some ideas to reduce warnings:

- W=1 already includes kernel-doc warnings for .c. In i915 we've added
  that to the regular build as well as a separate target to test
  headers, and use kernel-doc -Werror for development. Try to get more
  folks on board.

- Add more warning levels to kernel-doc similar to compilers, and reduce
  the default warnings. For example, I'm not sure it's necessary to warn
  about each undocumented parameter/member by default. That could be a
  verbose option. Bump up the warnings after we've fixed the more
  glaring issues.

- For more verbose checking without Sphinx, it should be possible to
  lint the rst produced by kernel-doc (originating from source), and
  check that as part of the build. But that's clearly W=2 stuff or on a
  subsystem/driver basis.

- Making the Sphinx build faster would also get more people on board
  fixing the warnings too.


BR,
Jani.



[1] git grep "^\.\. kernel-doc::" -- Documentation | sort | uniq -c | sort -rn | grep -v " 1 "
[2] https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#info-field-lists




>
> Other topics will certainly arise as well.
>
> jon
>

-- 
Jani Nikula, Intel Open Source Graphics Center

* Re: [TECH TOPIC] Kernel documentation
  2023-06-20 16:02 ` Jani Nikula
@ 2023-06-20 19:30   ` Jonathan Corbet
  2023-11-20 12:06     ` Vegard Nossum
  2023-06-29 21:34   ` Intersphinx ([TECH TOPIC] Kernel documentation) Jonathan Corbet
  1 sibling, 1 reply; 23+ messages in thread
From: Jonathan Corbet @ 2023-06-20 19:30 UTC (permalink / raw)
  To: Jani Nikula, ksummit

Jani Nikula <jani.nikula@intel.com> writes:

> It should be more feasible to build the documentation. Make it faster,
> reduce the warnings.
>
> Some ideas to make it faster:
>
> - Bump the minimum Sphinx version requirement if it helps the speed. I
>   don't think it needs to be as conservative as the compiler.

Alas, newer versions of Sphinx are slower, not faster; 2.4 takes about
half the time to build the docs that 5.x does.

A while back, I went into Sphinx with a hatchet and managed to take
about 20% off the build time.  The C domain stuff builds a data
structure of incredible complexity, then just tosses much of it away.
I've never had the time to figure out why they do that or to try to get
my hack job into a condition where I'd be willing to show it to my dog,
much less the Sphinx developers.

I wish we had an active presence in the Sphinx community, but I've never
been able to make that happen myself.

> - Cache kernel-doc results per document. A bunch of .rst files use
>   multiple kernel-doc directives for the same source file to better
>   control the documentation order [1]. Each directive causes the same
>   source to be parsed. (I'm not sure how bad the effect is though.)

That would help, but I don't think this is our biggest problem.
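
As a rough illustration of the caching idea (a sketch only: parse_source()
is a stand-in for running kernel-doc over one file, and the symbol names
are hypothetical), memoizing the parse by path means a second directive
for the same source reuses the first result:

```python
import functools

parse_calls = 0  # counts how often the expensive parse actually runs


@functools.lru_cache(maxsize=None)
def parse_source(path):
    """Stand-in for running kernel-doc over one source file (assumption)."""
    global parse_calls
    parse_calls += 1
    # Hypothetical result: symbol name -> generated rst text.
    return {
        "foo_init": f"rst for foo_init from {path}",
        "foo_exit": f"rst for foo_exit from {path}",
    }


def render_directive(path, symbols):
    """Emit rst for one kernel-doc directive, reusing the cached parse."""
    docs = parse_source(path)
    return "\n".join(docs[name] for name in symbols)


# Two directives in one .rst file that point at the same source file.
first = render_directive("drivers/foo/foo.c", ["foo_init"])
second = render_directive("drivers/foo/foo.c", ["foo_exit"])
```

With a parallel Sphinx build each worker process would hold its own cache,
so the saving applies per worker rather than across the whole build.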

> - Simplify the rst output kernel-doc produces. For example, use rst
>   native field lists for parameter and member descriptions instead of
>   hand-crafting them. See [2]. Drop the "definition" part from
>   structures, as nobody relies on it anyway. If necessary, add links to
>   source instead.

Seems worth a try.

> - Default to Sphinx parallel build.

We *do* default to that, or so I thought.  Much of the build doesn't
actually parallelize, though.

> - Consider splitting the whole documentation to multiple smaller
>   projects, and linking between them using intersphinx. (This may be a
>   tall order.)

This would be nice.  I looked into it a little while back and ran into
some roadblocks; I'll need to go back to my notes to remind myself of
where the problems were.

> Some ideas to reduce warnings:
>
> - W=1 already includes kernel-doc warnings for .c. In i915 we've added
>   that to the regular build as well as a separate target to test
>   headers, and use kernel-doc -Werror for development. Try to get more
>   folks on board.
>
> - Add more warning levels to kernel-doc similar to compilers, and reduce
>   the default warnings. For example, I'm not sure it's necessary to warn
>   about each undocumented parameter/member by default. That could be a
>   verbose option. Bump up the warnings after we've fixed the more
>   glaring issues.

This seems like a good idea.
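
A minimal sketch of what such levels could look like, assuming two levels
and two made-up warning categories (none of these names exist in
kernel-doc today):

```python
from enum import IntEnum


class Level(IntEnum):
    DEFAULT = 0  # problems worth reporting on every build
    VERBOSE = 1  # e.g. each undocumented parameter or member


# Hypothetical categories and the level at which each is reported.
LEVELS = {
    "malformed-comment": Level.DEFAULT,
    "undocumented-param": Level.VERBOSE,
}


def select_warnings(warnings, max_level=Level.DEFAULT):
    """Keep only (category, message) pairs at or below max_level."""
    return [w for w in warnings if LEVELS[w[0]] <= max_level]


found = [
    ("malformed-comment", "foo.c:10: bad kernel-doc comment"),
    ("undocumented-param", "foo.c:42: parameter 'flags' not described"),
]
default_report = select_warnings(found)
verbose_report = select_warnings(found, Level.VERBOSE)
```

Raising the default later would then be a one-line change to the table
rather than a change to every check.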

> - For more verbose checking without Sphinx, it should be possible to
>   lint the rst produced by kernel-doc (originating from source), and
>   check that as part of the build. But that's clearly W=2 stuff or on a
>   subsystem/driver basis.
>
> - Making the Sphinx build faster would also get more people on board
>   fixing the warnings too.

This, I think, is the key point.  The build speed is a real pain point
that impedes contribution.

jon

* Re: [TECH TOPIC] Kernel documentation
  2023-06-16 17:48 [TECH TOPIC] Kernel documentation Jonathan Corbet
  2023-06-20 16:02 ` Jani Nikula
@ 2023-06-21 11:04 ` Thorsten Leemhuis
  2023-06-26 14:34   ` Jan Kara
  2023-11-11 12:42 ` Vegard Nossum
  2 siblings, 1 reply; 23+ messages in thread
From: Thorsten Leemhuis @ 2023-06-21 11:04 UTC (permalink / raw)
  To: Jonathan Corbet, ksummit

On 16.06.23 19:48, Jonathan Corbet wrote:
> The documentation discussion at past kernel summits has been lively, so
> I think we should do it again.  Some topics I would bring to a session
> this year would include:
> 
> - The ongoing restructuring of the Documentation/ directory.  I've been
>   slowly moving the architecture docs into Documentation/arch/, but
>   would like to do more to reduce the clutter of the top-level directory
>   and make our documentation tree more closely resemble the organization
>   of the source.
> 
> - Structure.  We continue to collect documents, but do little to tie
>   them together into a coherent whole.  Do we want to change that and,
>   if so, how?

I wonder if we should try to get some external input for these points
from people with (a) experience in the field and (b) an untainted
viewpoint. And no, I'm not talking about bringing in McKinsey or
PricewaterhouseCoopers. ;) I mean people that are not regularly
contributing to Linux, but have experience with writing docs for
(ideally large) Open Source projects and/or reorganizing large chunks of
docs that accumulated in a project over many years.

Does maybe anyone reading this have ties to someone from groups like
Write the Docs (https://www.writethedocs.org/)? Maybe someone there
might have the right experience and at the same time be willing to
provide us with some input or guidance.

Or do Linux distributors like Red Hat and SUSE maybe have an interest in
improving upstream kernel docs, because it might make their work easier?
If they have at least a little interest, they might be willing to ask
their docs teams to provide a few ideas for us. And if they care a lot,
it might even be quite relevant...
> - Support for documentation work.  There is nobody in the community who
>   is paid to put any significant time into documentation, and it shows.
>   How can we fix that?

...for this point. Or was this tried already without success?

Regarding contacting external people for input or help: I met someone at
two or three conferences who was involved in "Write the Docs", but that
was years ago and I don't know if that person is still active in that
space. I also know somebody who at least used to work on docs for SUSE,
but afaik not in the kernel space.

I could ask those two if that's wanted.

But I wonder if somebody here has better connections that would be a
better angle of approach (especially to the docs teams for RH and SUSE).

Side note: during last year's session there was someone with many good
ideas in the chat, which Willy read to the audience. It was a partner of
a kernel developer iirc. Maybe that person might also be a good fit to
ask for advice, too.

> [...]

Ciao, Thorsten

* Re: [TECH TOPIC] Kernel documentation
  2023-06-21 11:04 ` [TECH TOPIC] Kernel documentation Thorsten Leemhuis
@ 2023-06-26 14:34   ` Jan Kara
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Kara @ 2023-06-26 14:34 UTC (permalink / raw)
  To: Thorsten Leemhuis; +Cc: Jonathan Corbet, ksummit

On Wed 21-06-23 13:04:56, Thorsten Leemhuis wrote:
> On 16.06.23 19:48, Jonathan Corbet wrote:
> > The documentation discussion at past kernel summits has been lively, so
> > I think we should do it again.  Some topics I would bring to a session
> > this year would include:
> > 
> > - The ongoing restructuring of the Documentation/ directory.  I've been
> >   slowly moving the architecture docs into Documentation/arch/, but
> >   would like to do more to reduce the clutter of the top-level directory
> >   and make our documentation tree more closely resemble the organization
> >   of the source.
> > 
> > - Structure.  We continue to collect documents, but do little to tie
> >   them together into a coherent whole.  Do we want to change that and,
> >   if so, how?
> 
> I wonder if we should try to get some external input for these points
> from people with (a) experience in the field and (b) an untainted
> viewpoint. And no, I'm not talking about bringing in McKinsey or
> PricewaterhouseCoopers. ;) I mean people that are not regularly
> contributing to Linux, but have experience with writing docs for
> (ideally large) Open Source projects and/or reorganizing large chunks of
> docs that accumulated in a project over many years.
> 
> Does maybe anyone reading this have ties to someone from groups like
> Write the Docs (https://www.writethedocs.org/)? Maybe someone there
> might have the right experience and at the same time be willing to
> provide us with some input or guidance.
> 
> Or do Linux distributors like Red Hat and SUSE maybe have an interest in
> improving upstream kernel docs, because it might make their work easier?
> If they have at least a little interest, they might be willing to ask
> their docs teams to provide a few ideas for us. And if they care a lot,
> it might even be quite relevant...

I've forwarded this email to our documentation team at SUSE if it sparks
some interest :)

> > - Support for documentation work.  There is nobody in the community who
> >   is paid to put any significant time into documentation, and it shows.
> >   How can we fix that?
> 
> ...for this point. Or was this tried already without success?
> 
> Regarding contacting external people for input or help: I met someone at
> two or three conferences who was involved in "Write the Docs", but that
> was years ago and I don't know if that person is still active in that
> space. I also know somebody who at least used to work on docs for SUSE,
> but afaik not in the kernel space.
> 
> I could ask those two if that's wanted.
> 
> But I wonder if somebody here has better connections that would be a
> better angle of approach (especially to the docs teams for RH and SUSE).

Well, my feeling is that our doc team is rather overloaded with internal
work (documenting various products, doing translations to various languages
and what not) so they don't have many cycles for upstream work. Also for
the really technical stuff (like kernel APIs), it is often us, the
developers, who initially write, say, the tuning docs, and the doc team then
takes that as a source of information and cleans it up to a customer-consumable
state ;) So the direct consumer of the kernel docs is not even the doc team,
but us developers trying to prepare something for the doc team. Just my
2c about how it works at SUSE...

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-06-20 16:02 ` Jani Nikula
  2023-06-20 19:30   ` Jonathan Corbet
@ 2023-06-29 21:34   ` Jonathan Corbet
  2023-06-30 13:17     ` Jani Nikula
                       ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Jonathan Corbet @ 2023-06-29 21:34 UTC (permalink / raw)
  To: Jani Nikula, ksummit

Jani Nikula <jani.nikula@intel.com> writes:

> - Consider splitting the whole documentation to multiple smaller
>   projects, and linking between them using intersphinx. (This may be a
>   tall order.)

So for anybody who is interested, I went and revisited this.  Actually
splitting the docs into separate books would not be that hard, and
intersphinx will indeed manage the cross-references between them without
a lot of extra effort on our part.

There is a catch, though: In order to be able to create the cross
references, intersphinx has to be able to read the "objects.inv" file
for every other document it refers to.  That file, of course, is created
by building the docs.  In practice this means that, to generate a
complete set of manuals from a clean repository, it would be necessary
to do *two* complete builds - one to create the inventory files, and one
to use them.

That is not exactly the path to a faster build experience.

My conclusion is that intersphinx is aimed at enabling easy linking
between entirely independent sets of manuals, where you can't build
everything together in any case, and not really at our use case.  I'm
not convinced it buys us much over "make SPHINXDIRS=".
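
For reference, a sketch of what a per-book conf.py fragment might look
like.  The book names and the output layout are assumptions made up for
illustration; only the extension name and the intersphinx_mapping setting
come from Sphinx itself:

```python
# conf.py fragment for one hypothetical "book".
extensions = ["sphinx.ext.intersphinx"]


def sibling_mapping(this_book, all_books, output_root="../output"):
    """Map every other book to (target URI, local inventory path)."""
    return {
        book: (f"../{book}/", f"{output_root}/{book}/objects.inv")
        for book in all_books
        if book != this_book
    }


# Each listed objects.inv must already exist when this book is built,
# which is exactly the two-pass problem described above.
intersphinx_mapping = sibling_mapping(
    "core-api", ["core-api", "mm", "filesystems"]
)
```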

jon

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-06-29 21:34   ` Intersphinx ([TECH TOPIC] Kernel documentation) Jonathan Corbet
@ 2023-06-30 13:17     ` Jani Nikula
  2023-06-30 16:54     ` Theodore Ts'o
  2023-07-02  1:46     ` Steven Rostedt
  2 siblings, 0 replies; 23+ messages in thread
From: Jani Nikula @ 2023-06-30 13:17 UTC (permalink / raw)
  To: Jonathan Corbet, ksummit

On Thu, 29 Jun 2023, Jonathan Corbet <corbet@lwn.net> wrote:
> There is a catch, though: In order to be able to create the cross
> references, intersphinx has to be able to read the "objects.inv" file
> for every other document it refers to.  That file, of course, is created
> by building the docs.  In practice this means that, to generate a
> complete set of manuals from a clean repository, it would be necessary
> to do *two* complete builds - one to create the inventory files, and one
> to use them.
>
> That is not exactly the path to a faster build experience.

Right. I was thinking the inventory would be more stable than that, and
you'd somewhat limit the cross-referencing across the boundaries.

It still could be faster, assuming a) non-linear build time increase by
project size, and b) not everything needs to be rebuilt from scratch the
second round. That said, it's a bunch of work to even try. Bummer.
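
To put rough numbers on assumption a), here is a toy model.  The
power-law cost is purely an assumption for illustration (nobody has
measured the exponent), and it ignores point b) by charging both passes
in full:

```python
def build_cost(pages, exponent):
    """Toy cost model: build time grows as pages ** exponent."""
    return pages ** exponent


def split_two_pass_ratio(pages, parts, exponent):
    """Cost of two full passes over `parts` equal books, relative to
    one monolithic build of the same number of pages."""
    monolithic = build_cost(pages, exponent)
    split = 2 * parts * build_cost(pages / parts, exponent)
    return split / monolithic


linear = split_two_pass_ratio(100, 4, 1)     # linear scaling
quadratic = split_two_pass_ratio(100, 4, 2)  # quadratic scaling
```

Under this model the ratio is 2 * parts ** (1 - exponent): with linear
scaling the split build costs twice as much, while with quadratic scaling
and four books it costs half as much.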

Thanks for looking into it, though.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-06-29 21:34   ` Intersphinx ([TECH TOPIC] Kernel documentation) Jonathan Corbet
  2023-06-30 13:17     ` Jani Nikula
@ 2023-06-30 16:54     ` Theodore Ts'o
  2023-06-30 17:11       ` Jonathan Corbet
  2023-07-02  1:46     ` Steven Rostedt
  2 siblings, 1 reply; 23+ messages in thread
From: Theodore Ts'o @ 2023-06-30 16:54 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Jani Nikula, ksummit

On Thu, Jun 29, 2023 at 03:34:41PM -0600, Jonathan Corbet wrote:
> So for anybody who is interested, I went and revisited this.  Actually
> splitting the docs into separate books would not be that hard, and
> intersphinx will indeed manage the cross-references between them without
> a lot of extra effort on our part.
> 
> There is a catch, though: In order to be able to create the cross
> references, intersphinx has to be able to read the "objects.inv" file
> for every other document it refers to.  That file, of course, is created
> by building the docs.  In practice this means that, to generate a
> complete set of manuals from a clean repository, it would be necessary
> to do *two* complete builds - one to create the inventory files, and one
> to use them.

Yeah, that's a bit of a bummer.  It sounds a bit like TeX/LaTeX's
various *.aux files that are used to generate the numbers for
footnotes, et al.  But I'll note that while I would do two passes of
running LaTeX before sending out the final version of my paper,
most of the time, I'd only run LaTeX once, and live with the fact that
some section numbers or footnotes would be something like [???]
instead of containing the proper reference.

From the perspective of someone who is editing the docs, how
usable/unusable is the sphinx output without these inventory files?  Or
if the inventory files are out of date?  And am I right they only
change when someone adds a new section, or a new anchor point for a
cross reference, etc.?

If the goal is for someone to check and see whether the output of a
particular part of the docs looks OK after doing a quick edit (e.g.,
did I mess up a table), it would seem that doing a single pass of a
single "book" would be faster, right?  And would it be good enough for
them to make sure that their edits to a particular .rst file looked
OK?

I also wonder if there's a way people could download inventory files
from some web site so their first pass run of sphinx would look
prettier?  Assuming that intersphinx can deal with slightly
out-of-sync inventory files, of course....

						- Ted

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-06-30 16:54     ` Theodore Ts'o
@ 2023-06-30 17:11       ` Jonathan Corbet
  0 siblings, 0 replies; 23+ messages in thread
From: Jonathan Corbet @ 2023-06-30 17:11 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Jani Nikula, ksummit

"Theodore Ts'o" <tytso@mit.edu> writes:

> On Thu, Jun 29, 2023 at 03:34:41PM -0600, Jonathan Corbet wrote:
>> There is a catch, though: In order to be able to create the cross
>> references, intersphinx has to be able to read the "objects.inv" file
>> for every other document it refers to.  That file, of course, is created
>> by building the docs.  In practice this means that, to generate a
>> complete set of manuals from a clean repository, it would be necessary
>> to do *two* complete builds - one to create the inventory files, and one
>> to use them.
>
> Yeah, that's a bit of a bummer.  It sounds a bit like TeX/LaTeX's
> various *.aux files that are used to generate the numbers for
> footnotes, et al.  But I'll note that while I would do two passes of
> running LaTeX before sending out the final version of my paper,
> most of the time, I'd only run LaTeX once, and live with the fact that
> some section numbers or footnotes would be something like [???]
> instead of containing the proper reference.
>
> From the perspective of someone who is editing the docs, how
> usable/unusable is the sphinx output without these inventory files?

There will be a lot of broken cross-references; the explicit ones (as
opposed to those created by the automarkup code) will generate warnings. 

> Or
> if the inventory files are out of date?  And am I right they only
> change when someone adds a new section, or a new anchor point for a
> cross reference, etc.?

Yes, in general, but code changes that affect kerneldoc comments could
also bring about a change.
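
One way to see when an inventory has gone stale is to compare the target
names in two versions of it.  The sketch below assumes the version-2
objects.inv layout (four plain-text header lines followed by a
zlib-compressed list of entries) and uses a simplified parser that
assumes target names contain no spaces; the sample entries are made up:

```python
import zlib

HEADER_LINES = 4  # plain-text header lines before the compressed payload


def make_inventory(entries, project="demo", version="0"):
    """Build a small version-2 inventory in memory (for the demo only)."""
    header = (
        "# Sphinx inventory version 2\n"
        f"# Project: {project}\n"
        f"# Version: {version}\n"
        "# The remainder of this file is compressed using zlib.\n"
    ).encode()
    body = "".join(f"{name} {role} 1 {uri} -\n" for name, role, uri in entries)
    return header + zlib.compress(body.encode())


def inventory_names(data):
    """Return the set of (name, role) pairs found in an inventory."""
    parts = data.split(b"\n", HEADER_LINES)
    payload = zlib.decompress(parts[HEADER_LINES]).decode()
    names = set()
    for line in payload.splitlines():
        # Simplification: assumes the name itself contains no spaces.
        name, role, _priority, _uri, _display = line.split(None, 4)
        names.add((name, role))
    return names


old = make_inventory([("kmalloc", "c:function", "mm-api.html#c.kmalloc")])
new = make_inventory([
    ("kmalloc", "c:function", "mm-api.html#c.kmalloc"),
    ("kfree", "c:function", "mm-api.html#c.kfree"),
])
added = inventory_names(new) - inventory_names(old)
```

An empty difference in both directions would mean the older inventory is
still good enough for resolving cross-references.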

> If the goal is for someone to check and see whether the output of a
> particular part of the docs looks OK after doing a quick edit (e.g.,
> did I mess up a table), it would seem that doing a single pass of a
> single "book" would be faster, right?  And would it be good enough for
> them to make sure that their edits to a particular .rst file looked
> OK?

Yes, it would.  But that can be done now with

  make SPHINXDIRS=whatever htmldocs

...with pretty much the same effect.

> I also wonder if there's a way people could download inventory files
> from some web site so their first pass run of sphinx would look
> prettier?  Assuming that intersphinx can deal with slightly
> out-of-sync inventory files, of course....

Well, we *could* set up intersphinx to fetch those files from kernel.org
automatically, but I suspect I'm not the only one who would be reluctant
to see the build start reaching out onto the net.  Alternatively, we
could add a script that would have to be run explicitly to do that
fetch.
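
Such an explicitly-run script could be quite small.  This is only a
sketch: the base URL is a placeholder rather than a real location, and
the per-book directory layout is an assumption:

```python
import pathlib
import urllib.request

# Placeholder location; where the inventories would actually be
# published is an open question.
BASE_URL = "https://example.org/kernel-docs"


def fetch_url(url):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_inventories(books, dest, fetch=fetch_url):
    """Save <book>/objects.inv under dest for each book; return the paths."""
    dest = pathlib.Path(dest)
    saved = []
    for book in books:
        data = fetch(f"{BASE_URL}/{book}/objects.inv")
        target = dest / book / "objects.inv"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        saved.append(target)
    return saved
```

The fetch parameter is there so the network access can be swapped out for
testing; the normal build would never call this at all.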

Thanks,

jon

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-06-29 21:34   ` Intersphinx ([TECH TOPIC] Kernel documentation) Jonathan Corbet
  2023-06-30 13:17     ` Jani Nikula
  2023-06-30 16:54     ` Theodore Ts'o
@ 2023-07-02  1:46     ` Steven Rostedt
  2023-07-02  4:56       ` Linus Torvalds
  2 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2023-07-02  1:46 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Jani Nikula, ksummit

On Thu, 29 Jun 2023 15:34:41 -0600
Jonathan Corbet <corbet@lwn.net> wrote:

> There is a catch, though: In order to be able to create the cross
> references, intersphinx has to be able to read the "objects.inv" file
> for every other document it refers to.  That file, of course, is created
> by building the docs.  In practice this means that, to generate a
> complete set of manuals from a clean repository, it would be necessary
> to do *two* complete builds - one to create the inventory files, and one
> to use them.

Could it be possible to check in these files into git, and have the
Documentation maintainer update them whenever there's a need? (yes I
like to volunteer people for jobs I don't do)

This is similar to using flex and bison, where I have the files they
generate prebuilt and checked in so that the user doesn't need to do it
when they build the repository.

-- Steve

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-07-02  1:46     ` Steven Rostedt
@ 2023-07-02  4:56       ` Linus Torvalds
  2023-07-02 13:18         ` James Bottomley
  2023-07-02 18:32         ` Steven Rostedt
  0 siblings, 2 replies; 23+ messages in thread
From: Linus Torvalds @ 2023-07-02  4:56 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Jonathan Corbet, Jani Nikula, ksummit

On Sat, 1 Jul 2023 at 18:46, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> This is similar to using flex and bison, where I have the files they
> generate prebuilt and checked in so that the user doesn't need to do it
> when they build the repository.

We really strive not to do that in the kernel because it's too painful.

Yes, we used to. It was a disaster.  It's versioning hell with
different people having different tooling versions, so the "shipped"
binaries then end up constantly depending on who generated them if
anybody does any changes. And maintenance gets to be just more
painful.

So I think for lex/yacc we simply always build things now. No shipped copies.

We still have things like the aic7xxx scsi assembler thing that we do
*not* make people build, and so we have shipped pre-built versions of
things like that.

But no. We do *not* want to "solve" some documentation thing by
including pre-built data as shipped files. It's a disaster. It always
has been.

It's just really tedious and error-prone and ugly to re-generate them
at random points.

We've been walking away from that broken model, not adding to it. See
commits like 12dd461ebd19, 7c0303ff7e67 for the crypto layer, but
perhaps more relevantly, commit 833e62245943 or 29c833061c1d where we
got rid of some old traditional flex/bison generated files for
Kconfig.

Or commit e039139be8c2 doing the same for dtc files.

IOW, we've been down the 'shipped' file path. It's a mistake. Let's
learn from our past mistakes, and not repeat them.

          Linus

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-07-02  4:56       ` Linus Torvalds
@ 2023-07-02 13:18         ` James Bottomley
  2023-07-02 18:32         ` Steven Rostedt
  1 sibling, 0 replies; 23+ messages in thread
From: James Bottomley @ 2023-07-02 13:18 UTC (permalink / raw)
  To: Linus Torvalds, Steven Rostedt; +Cc: Jonathan Corbet, Jani Nikula, ksummit

On Sat, 2023-07-01 at 21:56 -0700, Linus Torvalds wrote:
> We still have things like the aic7xxx scsi assembler thing that we do
> *not* make people build, and so we have shipped pre-built versions of
> things like that.

The important point there is that the tool is in the tree, so everyone
uses the same version and gets the same output, so the output we check
into the tree is effectively universal.  Plus the tool has been
bitrotting quietly over the years, so what we have is more for
historical completeness now.

James


* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-07-02  4:56       ` Linus Torvalds
  2023-07-02 13:18         ` James Bottomley
@ 2023-07-02 18:32         ` Steven Rostedt
  2023-07-02 18:44           ` Linus Torvalds
  1 sibling, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2023-07-02 18:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jonathan Corbet, Jani Nikula, ksummit

On Sat, 1 Jul 2023 21:56:42 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Yes, we used to. It was a disaster.  It's versioning hell with
> different people having different tooling versions, so the "shipped"
> binaries then end up constantly depending on who generated them if
> anybody does any changes. And maintenance gets to be just more
> painful.
> 
> So I think for lex/yacc we simply always build things now. No shipped copies.

Interesting. For the tracing user space code, I had to start committing the
C file output for flex/bison because people were complaining that their
versions of flex and bison wouldn't make working C files. I had a newer
version that had some new features that I found I was using. Ever since I
just committed the C generated files into my repo, I haven't had any more
issues with people complaining about them.

-- Steve

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-07-02 18:32         ` Steven Rostedt
@ 2023-07-02 18:44           ` Linus Torvalds
  2023-07-03  2:46             ` Theodore Ts'o
  0 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2023-07-02 18:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Jonathan Corbet, Jani Nikula, ksummit

On Sun, 2 Jul 2023 at 11:32, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Interesting. For the tracing user space code, I had to start committing the
> C file output for flex/bison because people were complaining that their
> versions of flex and bison wouldn't make working C files.

You are presumably the only person who changes the lexer. So you find
it easy to serialize.

In the kernel, we've always found it much more painful with shipped
files. We still do it, but only for some very very special stuff.

For example, we have this "mkutf8data" program.  It can generate our
utf8data.c file. Allegedly. Nobody ever does. You need the character
database files to do it.

For something like flex and bison? Just write better code.

             Linus

* Re: Intersphinx ([TECH TOPIC] Kernel documentation)
  2023-07-02 18:44           ` Linus Torvalds
@ 2023-07-03  2:46             ` Theodore Ts'o
  0 siblings, 0 replies; 23+ messages in thread
From: Theodore Ts'o @ 2023-07-03  2:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Steven Rostedt, Jonathan Corbet, Jani Nikula, ksummit

On Sun, Jul 02, 2023 at 11:44:36AM -0700, Linus Torvalds wrote:
> For example, we have this "mkutf8data" program.  It can generate our
> utf8data.c file. Allegedly. Nobody ever does. You need the character
> database files to do it.

Well, Gabriel and I have both run it in the past.  The main issue is
that the character database files are (a) very large, so we didn't
want to check them into kernel tree, and (b) they get updated on
unicode.org once or twice a year, and most of the time there's no
*point* to update it.  Most of the time the Unicode changes are adding
some random emoji, or some script that either doesn't need case
folding, or would only be of interest to some ancient archeologist who
cares about ancient Sumerian (for example), or both.

Most of the time, the only thing we care about is the case-folding tables.
That's because most installations don't use the Unicode "strict" mode,
since (a) this would annoy Trekkies who want to use the unofficial
Klingon glyphs, which are not recognized by Unicode since they aren't
used by human languages, and (b) in strict mode we would need to take
every single Unicode update when someone wants to use some new emoji
or some new ancient script in filenames.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-06-16 17:48 [TECH TOPIC] Kernel documentation Jonathan Corbet
  2023-06-20 16:02 ` Jani Nikula
  2023-06-21 11:04 ` [TECH TOPIC] Kernel documentation Thorsten Leemhuis
@ 2023-11-11 12:42 ` Vegard Nossum
  2023-11-11 15:14   ` Jonathan Corbet
  2 siblings, 1 reply; 23+ messages in thread
From: Vegard Nossum @ 2023-11-11 12:42 UTC (permalink / raw)
  To: Jonathan Corbet, ksummit; +Cc: linux-doc

(Added linux-doc to Cc and a few people to Bcc)

On 16/06/2023 19:48, Jonathan Corbet wrote:
> The documentation discussion at past kernel summits has been lively, so
> I think we should do it again.  Some topics I would bring to a session
> this year would include:
> 
> - The ongoing restructuring of the Documentation/ directory.  I've been
>    slowly moving the architecture docs into Documentation/arch/, but
>    would like to do more to reduce the clutter of the top-level directory
>    and make our documentation tree more closely resemble the organization
>    of the source.
> 
> - Structure.  We continue to collect documents, but do little to tie
>    them together into a coherent whole.  Do we want to change that and,
>    if so, how?
> 
> - Support for documentation work.  There is nobody in the community who
>    is paid to put any significant time into documentation, and it shows.
>    How can we fix that?
> 
> - Infrastructure.  Sphinx brings a lot but is far from perfect; what can
>    we do to improve it?
> 
> Other topics will certainly arise as well.

Hi,

This is coming a bit late, but I saw that there is going to be a session
on kernel documentation on the 15th [1] and I wanted to contribute a few
thoughts before that.

First off, regarding the structure, what is the best way to contribute
such changes? Large structural changes would presumably be a patch
series potentially touching a lot of documents from different subsystems
and the individual patches won't necessarily make sense in isolation.
How do we gather consensus for big changes like that? Is it better to
collect acks from subsystem maintainers and then let the documentation
maintainer merge it all at once? Should all the maintainers be Cc'ed on
the cover letter and their individual patches or do they want to be
Cc'ed on everything? What if one or two maintainers don't agree with the
overall approach, does that block the whole series? Does the
documentation maintainer have a veto?  Or do we prefer a trickle of small,
incremental patches, going through the individual maintainers? Ideally,
I'd like to see these questions answered in the documentation
subsystem's maintainer entry -- it has a paragraph about the boundaries
of documentation being "fuzzier than normal", but it doesn't offer much
practical or actionable advice IMHO.

Speaking of maintainer entry profiles, for those who aren't aware, here
is the description from [2]:

"""
The Maintainer Entry Profile supplements the top-level process documents
(submitting-patches, submitting drivers...) with
subsystem/device-driver-local customs as well as details about the patch
submission life-cycle. A contributor uses this document to level set
their expectations and avoid common mistakes; maintainers may use these
profiles to look across subsystems for opportunities to converge on
common practices.
"""

We currently only have 7 of these and I think it would be great to
spread awareness of their existence so that we can have more. Please
mention this if there is a room full of subsystem maintainers ;-)

I also think it would be great if we could amend these with
subsystem-specific review checklists. I'm thinking of very hands-on
code-technical things that maintainers will be checking in all their
incoming patches, things that aren't obvious and don't necessarily show
up easily in testing -- things like: for new /proc entries, is extra
permission checking done at ->open() or ->write() time? (This is a
non-obvious potential security issue.) The idea here is for maintainers
to document how they review patches to _their_ subsystems and thus also
make it easier for others (outsiders, newcomers) to review for those
same things. I know it would give me more confidence, actually both when
submitting my own patches and potentially also when reviewing others'
patches. One potential issue here is deciding whether certain things fit
better with the Core API and Driver API sections of the documentation --
for example, should subsystem-specific lock nesting orders be part of a
review checklist or does that belong in the source files themselves? How
do we avoid duplication and things getting stale? Can we add a new
kerneldoc directive that gets collected from the C sources and
automatically put into a subsystem-specific review checklist? (I'd be
happy to try implementing this, if people like the idea.)

On the topic of the overall structure of the documentation: [4]
describes the idea that the kernel documentation is a set of "books" --
user and admin guide, core API, drivers API, userspace API. I think this
needs to be emphasized more, as that _is_ the (philosophy of the)
current high-level organization of the documentation and it feels a bit
hidden where it currently is; maybe it should be placed prominently at
the top of that file and called "Organization and philosophy" or
something. At least I was very confused when I came across a passage
that read something like "This book covers ..." and I had no idea why a
kernel document was talking about books.

Finally, I'd like to suggest a number of specific structural changes:

1. the HTML sidebar is a bit of an unreadable mess, at least with the
current alabaster theme (the sphinx_rtd_theme is better in this respect,
IMHO, but that's a separate topic). I think the top-level front page
sidebar should _only_ contain the "books", and then you can click
through/expand to the section that you need. As an example, the
front-page sidebar is currently showing firmware-related documentation,
which seems quite out-of-place to me. In a way, we should think of the
documentation tree as a data structure that is optimized for lookups,
and it should be balanced accordingly: each level of the tree needs to
have an appropriate number of nodes (fanout) and firmware belongs
somewhere deeper down. The "books" are a good guide here, since the
division essentially asks: are you a user, a userspace developer, or a
kernel developer? and would allow you to traverse one "level" of the
tree without having to scan through a dozen different sections that
conceptually belong _after_ that first question of who you are.

2. some documents currently exist at multiple places in the toctree. As
an example, "Core API Documentation" is available from both "Internal
API manuals" and "Internal API manuals -> Kernel subsystem documentation
-> Core subsystems" (i.e. both Documentation/index.rst and
Documentation/subsystem-apis.rst). This is both weird and confusing from
a navigational point of view; it's as if a real book had 20 chapters at
the beginning but also the exact same chapters nested deeply inside
another chapter somewhere else in the book. We should be using
cross-referencing instead. Moreover, do we have a way to detect these
multiple inclusions (e.g. a sphinx-build warning)?

3. I'm wondering if it wouldn't be appropriate to have a top-level
"Community" book (maybe even the very first one) that would detail
things like the CoC, mailing lists and etiquette (but not
process-oriented details like how to submit a patch; we should link to
those, though!), references to IRC channels and social.kernel.org,
kernelnewbies.org, maybe eventually other things like governance
structure, etc. The main idea here is to put the community in focus, as
I think that's something we're lacking slightly -- the kernel community
is large and diverse and in many ways highly fractured. Many things are
not written anywhere at all and other things that are written somewhere
are maybe scattered all over the place. By having a dedicated place to
put community-related documentation it would show that this is something
we actually care about and make the kernel more welcoming to newcomers
and outsiders.

4. "translations" also doesn't need to be a top-level document that
appears in the top-level sidebar; in [5] I submitted a Sphinx extension
that would add a language selection bar to the top of the rendered HTML,
which would allow you to change the language of _any_ document that has
translations, including the front page. I'll still need to submit my v2
of this.

5. I think architecture-specific information should be split up along
the user+admin/userspace-dev/kernel-dev lines and moved into their
respective books instead of being a top-level document. This goes
counter to the idea that Documentation/ should mirror the structure of
the kernel sources, but I think it makes sense to make an exception in
this case.


Vegard

[1] <https://lpc.events/event/17/contributions/1622/>
[2] <https://docs.kernel.org/maintainer/maintainer-entry-profile.html>
[3] <https://github.com/sphinx-doc/sphinx/issues/10966>
[4] <https://docs.kernel.org/doc-guide/contributing.html#documentation-coherency>
[5] <https://lore.kernel.org/linux-doc/20231028162931.261843-1-vegard.nossum@oracle.com/>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-11-11 12:42 ` Vegard Nossum
@ 2023-11-11 15:14   ` Jonathan Corbet
  2023-11-20 12:20     ` Vegard Nossum
  0 siblings, 1 reply; 23+ messages in thread
From: Jonathan Corbet @ 2023-11-11 15:14 UTC (permalink / raw)
  To: Vegard Nossum, ksummit; +Cc: linux-doc

Vegard Nossum <vegard.nossum@oracle.com> writes:

> This is coming a bit late, but I saw that there is going to be a session
> on kernel documentation on the 15th [1] and I wanted to contribute a few
> thoughts before that.

Quite a few it seems :)  Yes, this would have been more helpful a bit
sooner; I'll try to respond quickly to some of this.

> First off, regarding the structure, what is the best way to contribute
> such changes? Large structural changes would presumably be a patch
> series potentially touching a lot of documents from different subsystems
> and the individual patches won't necessarily make sense in isolation.

Obviously depends on the specific changes.  You can look at the move of
all the architecture docs as one example of how to do it.  It took
months and a modest number of merge conflicts, but (with 6.7) we got it
done. 

> How do we gather consensus for big changes like that? Is it better to
> collect acks from subsystem maintainers and then let the documentation
> maintainer merge it all at once? Should all the maintainers be Cc'ed on
> the cover letter and their individual patches or do they want to be
> Cc'ed on everything? What if one or two maintainers don't agree with the
> overall approach, does that block the whole series? Does the
> documentation maintainer have a veto?  Or do we prefer a trickle of small,
> incremental patches, going through the individual maintainers? Ideally,
> I'd like to see these questions answered in the documentation
> subsystem's maintainer entry -- it has a paragraph about the boundaries
> of documentation being "fuzzier than normal", but it doesn't offer much
> practical or actionable advice IMHO.

The maintainer entry has remarkably little power to dictate how other
maintainers must respond to docs changes.  The answer is that we handle
them like all other cross-subsystem changes - on a case-by-case basis
and with a certain tolerance for pain.

> The Maintainer Entry Profile supplements the top-level process documents
> (submitting-patches, submitting drivers...) with
> subsystem/device-driver-local customs as well as details about the patch
> submission life-cycle. A contributor uses this document to level set
> their expectations and avoid common mistakes; maintainers may use these
> profiles to look across subsystems for opportunities to converge on
> common practices.
> """
>
> We currently only have 7 of these and I think it would be great to
> spread awareness of their existence so that we can have more. Please
> mention this if there is a room full of subsystem maintainers ;-)

I routinely mention it at the maintainers summit...progress is slow. 

> I also think it would be great if we could amend these with
> subsystem-specific review checklists. I'm thinking of very hands-on
> code-technical things that maintainers will be checking in all their
> incoming patches, things that aren't obvious and don't necessarily show
> up easily in testing -- things like: for new /proc entries, is extra
> permission checking done at ->open() or ->write() time? (This is a
> non-obvious potential security issue.) The idea here is for maintainers
> to document how they review patches to _their_ subsystems and thus also
> make it easier for others (outsiders, newcomers) to review for those
> same things. I know it would give me more confidence, actually both when
> submitting my own patches and potentially also when reviewing others'
> patches.

Documentation aimed at helping reviewers would be a great thing.  I do
feel that as little of it as possible should be subsystem-specific,
though.  We need fewer local quirks (IMO) rather than more.

> One potential issue here is deciding whether certain things fit
> better with the Core API and Driver API sections of the documentation --
> for example, should subsystem-specific lock nesting orders be part of a
> review checklist or does that belong in the source files themselves? How
> do we avoid duplication and things getting stale? Can we add a new
> kerneldoc directive that gets collected from the C sources and
> automatically put into a subsystem-specific review checklist? (I'd be
> happy to try implementing this, if people like the idea.)

You can certainly put that material in DOC blocks now.
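
As a rough illustration of how DOC blocks could feed a review checklist,
here is a minimal Python sketch that pulls kernel-doc "DOC:" sections out
of a C source string.  The "Review checklist" title and the foo_lock and
bar_lock names are hypothetical, made up for this example; nothing like
this exists in the tree today, and the real kernel-doc tooling parses
these comments far more carefully than one regular expression can.

```python
import re

# Matches a kernel-doc comment whose first line is "DOC: <title>".
DOC_BLOCK = re.compile(
    r"/\*\*\s*\n\s*\*\s*DOC:\s*(?P<title>[^\n]*)\n(?P<body>.*?)\*/",
    re.DOTALL,
)


def extract_doc_blocks(source):
    """Return a {title: body} mapping for every DOC: block in source."""
    blocks = {}
    for match in DOC_BLOCK.finditer(source):
        lines = []
        for line in match.group("body").splitlines():
            # Drop the leading " * " comment decoration, if present.
            lines.append(re.sub(r"^\s*\*\s?", "", line).rstrip())
        blocks[match.group("title").strip()] = "\n".join(lines).strip()
    return blocks


# Hypothetical input: a checklist item kept next to the code it describes.
SOURCE = """\
/**
 * DOC: Review checklist
 *
 * - Take foo_lock before bar_lock, never the reverse.
 */
static int example;
"""

checklist = extract_doc_blocks(SOURCE)
for title, body in checklist.items():
    print(title)
    print(body)
```

A collector like this could run over a subsystem's files and concatenate
every block carrying the agreed title, so the checklist text lives in one
place only, next to the code.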

> Finally, I'd like to suggest a number of specific structural changes:
>
> 1. the HTML sidebar is a bit of an unreadable mess, at least with the
> current alabaster theme (the sphinx_rtd_theme is better in this respect,
> IMHO, but that's a separate topic).

The sidebar is on my list to raise at the session; people like to
complain about it, but I'm not sure we have a consensus on what should
be there.  I dug into the theming code a while back; reproducing the RTD
sidebar is relatively easily done if we want that, but I'm not convinced
it's better.

> I think the top-level front page
> sidebar should _only_ contain the "books", and then you can click
> through/expand to the section that you need.

I think that might be an improvement, yes.

> 2. some documents currently exist at multiple places in the toctree. As
> an example, "Core API Documentation" is available from both "Internal
> API manuals" and "Internal API manuals -> Kernel subsystem documentation
> -> Core subsystems" (i.e. both Documentation/index.rst and
> Documentation/subsystem-apis.rst). This is both weird and confusing from
> a navigational point of view; it's as if a real book had 20 chapters at
> the beginning but also the exact same chapters nested deeply inside
> another chapter somewhere else in the book. We should be using
> cross-referencing instead. Moreover, do we have a way to detect these
> multiple inclusions (e.g. a sphinx-build warning)?

Nope, nothing automated.  Not a huge problem, IMO, and easy enough to
fix when it comes up.
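
For anyone who wants to look for these by hand, a small standalone
checker is one possible approach.  The Python sketch below is only an
assumption about how that could work, not an existing Sphinx feature: it
scans RST text for toctree directives and reports any document listed
from more than one parent.  It ignores :glob: patterns, and the two
sample documents are abbreviated stand-ins for Documentation/index.rst
and Documentation/subsystem-apis.rst.

```python
import posixpath
import re
from collections import defaultdict


def toctree_entries(text):
    """Yield the document names listed in each '.. toctree::' block."""
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        match = re.match(r"^(\s*)\.\.\s+toctree::\s*$", lines[i])
        i += 1
        if not match:
            continue
        base_indent = len(match.group(1))
        while i < len(lines):
            line = lines[i]
            stripped = line.strip()
            indent = len(line) - len(line.lstrip())
            if stripped and indent <= base_indent:
                break  # dedent: the directive body is over
            i += 1
            if not stripped or stripped.startswith(":"):
                continue  # blank line or an option such as :maxdepth:
            titled = re.match(r"^.*<(.+)>$", stripped)  # "Title <target>"
            yield titled.group(1) if titled else stripped


def find_duplicates(docs):
    """Map each document to its parents when it has more than one.

    docs maps a document name (path relative to the docs root, without
    the .rst suffix) to that document's RST text.
    """
    parents = defaultdict(list)
    for docname, text in sorted(docs.items()):
        base_dir = posixpath.dirname(docname)
        for entry in toctree_entries(text):
            if entry.startswith("/"):
                target = entry.lstrip("/")
            else:
                target = posixpath.normpath(posixpath.join(base_dir, entry))
            if target.endswith(".rst"):
                target = target[:-4]
            parents[target].append(docname)
    return {t: p for t, p in parents.items() if len(p) > 1}


# Abbreviated stand-ins for the two files mentioned above.
docs = {
    "index": ".. toctree::\n   :maxdepth: 1\n\n   core-api/index\n   subsystem-apis\n",
    "subsystem-apis": ".. toctree::\n   :maxdepth: 1\n\n   core-api/index\n   driver-api/index\n",
}
duplicates = find_duplicates(docs)
for target, found_in in duplicates.items():
    print(f"{target} is listed in: {', '.join(found_in)}")
```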

> 3. I'm wondering if it wouldn't be appropriate to have a top-level
> "Community" book (maybe even the very first one) that would detail
> things like the CoC, mailing lists and etiquette (but not
> process-oriented details like how to submit a patch; we should link to
> those, though!), references to IRC channels and social.kernel.org,
> kernelnewbies.org, maybe eventually other things like governance
> structure, etc. The main idea here is to put the community in focus, as
> I think that's something we're lacking slightly -- the kernel community
> is large and diverse and in many ways highly fractured. Many things are
> not written anywhere at all and other things that are written somewhere
> are maybe scattered all over the place. By having a dedicated place to
> put community-related documentation it would show that this is something
> we actually care about and make the kernel more welcoming to newcomers
> and outsiders.

That is part of what the process book is meant to be - how to work with
our community.  Reworking the process-book top page is another thing I
want to mention in the session; we can do better, but I'm not convinced
that splitting that information out entirely is an improvement.

> 4. "translations" also doesn't need to be a top-level document that
> appears in the top-level sidebar; in [5] I submitted a Sphinx extension
> that would add a language selection bar to the top of the rendered HTML,
> which would allow you to change the language of _any_ document that has
> translations, including the front page. I'll still need to submit my v2
> of this.

...and I still need to look at it; it's been waiting for the merge
window to pass (though LPC and holidays are going to slow me down too).

> 5. I think architecture-specific information should be split up along
> the user+admin/userspace-dev/kernel-dev lines and moved into their
> respective books instead of being a top-level document. This goes
> counter to the idea that Documentation/ should mirror the structure of
> the kernel sources, but I think it makes sense to make an exception in
> this case.

Now that we've just finished moving all the arch docs around, I'm not
sure I want to do that to the arch maintainers again soon :)  Longer
term, this could be considered if we think it makes things better.

Thanks,

jon

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-06-20 19:30   ` Jonathan Corbet
@ 2023-11-20 12:06     ` Vegard Nossum
  2023-11-20 13:50       ` Jonathan Corbet
  0 siblings, 1 reply; 23+ messages in thread
From: Vegard Nossum @ 2023-11-20 12:06 UTC (permalink / raw)
  To: Jonathan Corbet, Jani Nikula, ksummit


(We already exchanged on this topic, but repeating it for the list:)

On 20/06/2023 21:30, Jonathan Corbet wrote:
> Jani Nikula <jani.nikula@intel.com> writes:
> 
>> It should be more feasible to build the documentation. Make it
>> faster,

When using PyPy instead of CPython to run Sphinx, I see a 22%
performance improvement on the kernel documentation, which is not
insignificant.

> A while back, I went into Sphinx with a hatchet and managed to take 
> about 20% off the build time.  The C domain stuff builds a data 
> structure of incredible complexity, then just tosses much of it
> away. I've never had the time to figure out why they do that or to
> try to get my hack job into a condition where I'd be willing to show
> it to my dog, much less the Sphinx developers.

I also profiled the documentation build some weeks ago and came to the
same conclusion: around 40% of the time is spent inside resolve_xref(),
the exact same C domain stuff you mentioned.
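
A measurement of this kind can be reproduced with Python's standard
cProfile and pstats modules.  The sketch below is not the profiling run
described above; the two stand-in functions are hypothetical placeholders
for a real build, and only the share_of_runtime() helper shows the
technique: add up the cumulative time of every profiled function with a
given name and divide by the total profiled time.

```python
import cProfile
import pstats


def resolve_xref_stand_in(n):
    # Hypothetical stand-in for the expensive cross-reference lookup.
    return sum(i * i for i in range(n))


def build_stand_in():
    # Hypothetical stand-in for a whole documentation build.
    total = 0
    for _ in range(50):
        total += resolve_xref_stand_in(2000)
    return total


def share_of_runtime(profiler, func_name):
    """Fraction of profiled time spent in func_name and its callees."""
    stats = pstats.Stats(profiler)
    if not stats.total_tt:
        return 0.0
    cumulative = sum(
        ct
        for (_file, _line, name), (_cc, _nc, _tt, ct, _callers)
        in stats.stats.items()
        if name == func_name
    )
    return cumulative / stats.total_tt


profiler = cProfile.Profile()
profiler.enable()
build_stand_in()
profiler.disable()

share = share_of_runtime(profiler, "resolve_xref_stand_in")
print(f"share of time in resolve_xref_stand_in: {share:.0%}")
```

pstats.Stats also accepts the file name of a saved profile, so the same
helper could be pointed at a profile recorded from an actual build.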

The gcc project/documentation has the same problem, albeit in the C++
domain code, there is an open ticket for it:

   https://github.com/sphinx-doc/sphinx/issues/10966

If we're really not using the functionality provided by the C domain
code, maybe instead of ripping it out we could provide something like a
conf.py toggle to disable it? (The idea being that the patch would be
smaller and more acceptable upstream...)


Vegard

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-11-11 15:14   ` Jonathan Corbet
@ 2023-11-20 12:20     ` Vegard Nossum
  0 siblings, 0 replies; 23+ messages in thread
From: Vegard Nossum @ 2023-11-20 12:20 UTC (permalink / raw)
  To: Jonathan Corbet, ksummit; +Cc: linux-doc, Theodore Ts'o, workflows

On 11/11/2023 16:14, Jonathan Corbet wrote:
> Vegard Nossum <vegard.nossum@oracle.com> writes:
>> The Maintainer Entry Profile supplements the top-level process
>> documents
[...]
>> We currently only have 7 of these and I think it would be great to 
>> spread awareness of their existence so that we can have more.
>> Please mention this if there is a room full of subsystem
>> maintainers ;-)
> 
> I routinely mention it at the maintainers summit...progress is slow.

I saw that a maintainer entry profile is coming soon for ext4 (yay!):

   https://lore.kernel.org/workflows/20231119225437.GA292450@mit.edu/

I also recall the question/discussion from your talk about whether to move
some of the documentation closer to their subsystems. Maybe the
maintainer entry profiles would be an ideal candidate to try this out as
an experiment? The advantages would be:

1) there are very few existing documents, so moving these into their
respective directories should cause relatively little churn,

2) probably no document is more tied to a specific subsystem than the
maintainer entry profiles,

3) it would hopefully yield far more visibility for these documents,
which would aid patch submitters as well as potentially inspire more
subsystems to add them.

I tried it out quickly, but Sphinx doesn't really like having documents
outside of the root -- how about just using symlinks? e.g.:

     ln -rs Documentation/filesystems/xfs-maintainer-entry-profile.rst \
         fs/xfs/MAINTAINER-ENTRY-PROFILE.rst

I don't really see any downsides... thoughts?


Vegard

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-11-20 12:06     ` Vegard Nossum
@ 2023-11-20 13:50       ` Jonathan Corbet
  2023-11-20 14:42         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 23+ messages in thread
From: Jonathan Corbet @ 2023-11-20 13:50 UTC (permalink / raw)
  To: Vegard Nossum, Jani Nikula, ksummit

Vegard Nossum <vegard.nossum@oracle.com> writes:

> (We already exchanged on this topic, but repeating it for the list:)
>
> On 20/06/2023 21:30, Jonathan Corbet wrote:
>> Jani Nikula <jani.nikula@intel.com> writes:
>> 
>>> It should be more feasible to build the documentation. Make it
>>> faster,
>
> When using PyPy instead of CPython to run Sphinx, I see a 22%
> performance improvement on the kernel documentation, which is not
> insignificant.

That is nice, but we can't really assume that everybody building the
docs has pypy around.

>> A while back, I went into Sphinx with a hatchet and managed to take 
>> about 20% off the build time.  The C domain stuff builds a data 
>> structure of incredible complexity, then just tosses much of it
>> away. I've never had the time to figure out why they do that or to
>> try to get my hack job into a condition where I'd be willing to show
>> it to my dog, much less the Sphinx developers.
>
> I also profiled the documentation build some weeks ago and came to the
> same conclusion: around 40% of the time is spent inside resolve_xref(),
> the exact same C domain stuff you mentioned.
>
> The gcc project/documentation has the same problem, albeit in the C++
> domain code, there is an open ticket for it:
>
>    https://github.com/sphinx-doc/sphinx/issues/10966
>
> If we're really not using the functionality provided by the C domain
> code, maybe instead of ripping it out we could provide something like a
> conf.py toggle to disable it? (The idea being that the patch would be
> smaller and more acceptable upstream...)

Ah but we are - it's how we generate all of the cross-references in the
built docs.  My sense, from a couple of years ago, though, was that parts
of that code aren't used by *anybody*.  But I didn't feel that I'd
understood it well enough to make a proper patch.  I'd really like to
get back to that.

Thanks,

jon

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-11-20 13:50       ` Jonathan Corbet
@ 2023-11-20 14:42         ` Mauro Carvalho Chehab
  2023-11-20 14:49           ` Johannes Berg
  2023-11-20 20:54           ` Jonathan Corbet
  0 siblings, 2 replies; 23+ messages in thread
From: Mauro Carvalho Chehab @ 2023-11-20 14:42 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Vegard Nossum, Jani Nikula, ksummit

Em Mon, 20 Nov 2023 06:50:34 -0700
Jonathan Corbet <corbet@lwn.net> escreveu:

> Vegard Nossum <vegard.nossum@oracle.com> writes:
> 
> > (We already exchanged on this topic, but repeating it for the list:)
> >
> > On 20/06/2023 21:30, Jonathan Corbet wrote:  
> >> Jani Nikula <jani.nikula@intel.com> writes:
> >>   
> >>> It should be more feasible to build the documentation. Make it
> >>> faster,  
> >
> > When using PyPy instead of CPython to run Sphinx, I see a 22%
> > performance improvement on the kernel documentation, which is not
> > insignificant.  
> 
> That is nice, but we can't really assume that everybody building the
> docs has pypy around.
> 
> >> A while back, I went into Sphinx with a hatchet and managed to take 
> >> about 20% off the build time.  The C domain stuff builds a data 
> >> structure of incredible complexity, then just tosses much of it
> >> away. I've never had the time to figure out why they do that or to
> >> try to get my hack job into a condition where I'd be willing to show
> >> it to my dog, much less the Sphinx developers.  
> >
> > I also profiled the documentation build some weeks ago and came to the
> > same conclusion: around 40% of the time is spent inside resolve_xref(),
> > the exact same C domain stuff you mentioned.
> >
> > The gcc project/documentation has the same problem, albeit in the C++
> > domain code, there is an open ticket for it:
> >
> >    https://github.com/sphinx-doc/sphinx/issues/10966
> >
> > If we're really not using the functionality provided by the C domain
> > code, maybe instead of ripping it out we could provide something like a
> > conf.py toggle to disable it? (The idea being that the patch would be
> > smaller and more acceptable upstream...)  
> 
> Ah but we are - it's how we generate all of the cross-references in the
> built docs.  My sense, from a couple of years ago though was that parts
> of that code aren't used by *anybody*.  But I didn't feel that I'd
> understood it well enough to make a proper patch.  I'd really like to
> get back to that.

Cross references are quite useful for media docs. Having a way to
optionally disable them to speed up builds may make some sense, but
the default should be to have them enabled and producing warnings.

There is still a long-standing bug in the Sphinx C domain logic: it still can't
have symbols with the same name for different types. So, we have a
dozen warnings due to that when building with Sphinx version 3.1 and 
above:

	https://github.com/sphinx-doc/sphinx/pull/8313

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-11-20 14:42         ` Mauro Carvalho Chehab
@ 2023-11-20 14:49           ` Johannes Berg
  2023-11-20 20:54           ` Jonathan Corbet
  1 sibling, 0 replies; 23+ messages in thread
From: Johannes Berg @ 2023-11-20 14:49 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Jonathan Corbet
  Cc: Vegard Nossum, Jani Nikula, ksummit

On Mon, 2023-11-20 at 15:42 +0100, Mauro Carvalho Chehab wrote:
> 
> There is still a long-standing bug in the Sphinx C domain logic: it still can't
> have symbols with the same name for different types.

We finally gave up waiting and renamed the symbols in wifi ...

johannes

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [TECH TOPIC] Kernel documentation
  2023-11-20 14:42         ` Mauro Carvalho Chehab
  2023-11-20 14:49           ` Johannes Berg
@ 2023-11-20 20:54           ` Jonathan Corbet
  1 sibling, 0 replies; 23+ messages in thread
From: Jonathan Corbet @ 2023-11-20 20:54 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Vegard Nossum, Jani Nikula, ksummit

Mauro Carvalho Chehab <mchehab@kernel.org> writes:

> Cross references are quite useful for media docs. Having a way to
> optionally disable them to speed up builds may make some sense, but
> the default should be to have them enabled and producing warnings.

FWIW, disabling the automarkup extension (which generates the bulk of
cross references) takes about 25% off the build time.  We could
certainly consider adding a compile-time or build-time option for that.
The warning situation should not change at all; automarkup only adds
cross-references to targets that actually exist.
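
One way such a build-time option might look is sketched below as a
conf.py-style fragment.  This is an assumption, not the current kernel
configuration: the KDOC_SKIP_AUTOMARKUP environment variable is a
hypothetical name invented for the example, and the extension list is
shortened to three entries for illustration.

```python
import os


def select_extensions(environ=os.environ):
    """Return the Sphinx extension list, optionally without automarkup."""
    # Shortened, illustrative list; the real conf.py loads more.
    extensions = ["kerneldoc", "automarkup", "sphinx.ext.autosectionlabel"]
    # Hypothetical opt-out switch; cross-references stay on by default.
    if environ.get("KDOC_SKIP_AUTOMARKUP") == "1":
        extensions.remove("automarkup")
    return extensions


# Sphinx reads this module-level name from conf.py.
extensions = select_extensions()
```

With the switch as an explicit opt-out, an ordinary build keeps
cross-references enabled, and only someone who sets the variable gets the
faster build without them.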

jon

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-11-20 20:54 UTC | newest]

Thread overview: 23+ messages
2023-06-16 17:48 [TECH TOPIC] Kernel documentation Jonathan Corbet
2023-06-20 16:02 ` Jani Nikula
2023-06-20 19:30   ` Jonathan Corbet
2023-11-20 12:06     ` Vegard Nossum
2023-11-20 13:50       ` Jonathan Corbet
2023-11-20 14:42         ` Mauro Carvalho Chehab
2023-11-20 14:49           ` Johannes Berg
2023-11-20 20:54           ` Jonathan Corbet
2023-06-29 21:34   ` Intersphinx ([TECH TOPIC] Kernel documentation) Jonathan Corbet
2023-06-30 13:17     ` Jani Nikula
2023-06-30 16:54     ` Theodore Ts'o
2023-06-30 17:11       ` Jonathan Corbet
2023-07-02  1:46     ` Steven Rostedt
2023-07-02  4:56       ` Linus Torvalds
2023-07-02 13:18         ` James Bottomley
2023-07-02 18:32         ` Steven Rostedt
2023-07-02 18:44           ` Linus Torvalds
2023-07-03  2:46             ` Theodore Ts'o
2023-06-21 11:04 ` [TECH TOPIC] Kernel documentation Thorsten Leemhuis
2023-06-26 14:34   ` Jan Kara
2023-11-11 12:42 ` Vegard Nossum
2023-11-11 15:14   ` Jonathan Corbet
2023-11-20 12:20     ` Vegard Nossum
