linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] docs: drop the version constraints for sphinx and dependencies
@ 2024-03-01 14:18 Lukas Bulwahn
  2024-03-03 15:17 ` Jonathan Corbet
  2024-03-18 16:44 ` Donald Hunter
  0 siblings, 2 replies; 7+ messages in thread
From: Lukas Bulwahn @ 2024-03-01 14:18 UTC (permalink / raw)
  To: Jonathan Corbet, Mauro Carvalho Chehab, Vegard Nossum,
	Akira Yokosawa, Jani Nikula, Randy Dunlap, linux-doc
  Cc: linux-kernel, Lukas Bulwahn

As discussed (see Links), there is some inertia to move to the recent
Sphinx versions for the doc build environment.

As first step, drop the version constraints and the related comments. As
sphinx depends on jinja2, jinja2 is pulled in automatically. So drop that.
Then, the sphinx-pre-install script will fail though with:

  Can't get default sphinx version from ./Documentation/sphinx/requirements.txt at ./scripts/sphinx-pre-install line 305.

The script simply expects to parse a version constraint with Sphinx in the
requirements.txt. That version is used in the script for suggesting the
virtualenv directory name.

To suggest a virtualenv directory name, when there is no version given in
the requirements.txt, one could try to guess the version that would be
downloaded with 'pip install -r Documentation/sphinx/requirements.txt'.
However, there seems no simple way to get that version without actually
setting up the venv and running pip. So, instead, name the directory with
the fixed name 'sphinx_latest'.

Finally update the Sphinx build documentation to reflect this directory
name change.

Link: https://lore.kernel.org/linux-doc/874jf4m384.fsf@meer.lwn.net/
Link: https://lore.kernel.org/linux-doc/20240226093854.47830-1-lukas.bulwahn@gmail.com/
Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
---
v1 -> v2:
  drop jinja2 as suggested by Vegard.
  add tags from v1 review

 Documentation/doc-guide/sphinx.rst    | 11 ++++++-----
 Documentation/sphinx/requirements.txt |  7 ++-----
 scripts/sphinx-pre-install            | 19 +++----------------
 3 files changed, 11 insertions(+), 26 deletions(-)

diff --git a/Documentation/doc-guide/sphinx.rst b/Documentation/doc-guide/sphinx.rst
index 709e19821a16..8081ebfe48bc 100644
--- a/Documentation/doc-guide/sphinx.rst
+++ b/Documentation/doc-guide/sphinx.rst
@@ -48,13 +48,14 @@ or ``virtualenv``, depending on how your distribution packaged Python 3.
       on the Sphinx version, it should be installed separately,
       with ``pip install sphinx_rtd_theme``.
 
-In summary, if you want to install Sphinx version 2.4.4, you should do::
+In summary, if you want to install the latest version of Sphinx, you
+should do::
 
-       $ virtualenv sphinx_2.4.4
-       $ . sphinx_2.4.4/bin/activate
-       (sphinx_2.4.4) $ pip install -r Documentation/sphinx/requirements.txt
+       $ virtualenv sphinx_latest
+       $ . sphinx_latest/bin/activate
+       (sphinx_latest) $ pip install -r Documentation/sphinx/requirements.txt
 
-After running ``. sphinx_2.4.4/bin/activate``, the prompt will change,
+After running ``. sphinx_latest/bin/activate``, the prompt will change,
 in order to indicate that you're using the new environment. If you
 open a new shell, you need to rerun this command to enter again at
 the virtual environment before building the documentation.
diff --git a/Documentation/sphinx/requirements.txt b/Documentation/sphinx/requirements.txt
index 5d47ed443949..5017f307c8a4 100644
--- a/Documentation/sphinx/requirements.txt
+++ b/Documentation/sphinx/requirements.txt
@@ -1,6 +1,3 @@
-# jinja2>=3.1 is not compatible with Sphinx<4.0
-jinja2<3.1
-# alabaster>=0.7.14 is not compatible with Sphinx<=3.3
-alabaster<0.7.14
-Sphinx==2.4.4
+alabaster
+Sphinx
 pyyaml
diff --git a/scripts/sphinx-pre-install b/scripts/sphinx-pre-install
index de0de5dd676e..4c781617ffe6 100755
--- a/scripts/sphinx-pre-install
+++ b/scripts/sphinx-pre-install
@@ -280,8 +280,6 @@ sub get_sphinx_version($)
 
 sub check_sphinx()
 {
-	my $default_version;
-
 	open IN, $conf or die "Can't open $conf";
 	while (<IN>) {
 		if (m/^\s*needs_sphinx\s*=\s*[\'\"]([\d\.]+)[\'\"]/) {
@@ -293,18 +291,7 @@ sub check_sphinx()
 
 	die "Can't get needs_sphinx version from $conf" if (!$min_version);
 
-	open IN, $requirement_file or die "Can't open $requirement_file";
-	while (<IN>) {
-		if (m/^\s*Sphinx\s*==\s*([\d\.]+)$/) {
-			$default_version=$1;
-			last;
-		}
-	}
-	close IN;
-
-	die "Can't get default sphinx version from $requirement_file" if (!$default_version);
-
-	$virtenv_dir = $virtenv_prefix . $default_version;
+	$virtenv_dir = $virtenv_prefix . "latest";
 
 	my $sphinx = get_sphinx_fname();
 	if ($sphinx eq "") {
@@ -318,8 +305,8 @@ sub check_sphinx()
 	die "$sphinx didn't return its version" if (!$cur_version);
 
 	if ($cur_version lt $min_version) {
-		printf "ERROR: Sphinx version is %s. It should be >= %s (recommended >= %s)\n",
-		       $cur_version, $min_version, $default_version;
+		printf "ERROR: Sphinx version is %s. It should be >= %s\n",
+		       $cur_version, $min_version;
 		$need_sphinx = 1;
 		return;
 	}
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
  2024-03-01 14:18 [PATCH v2] docs: drop the version constraints for sphinx and dependencies Lukas Bulwahn
@ 2024-03-03 15:17 ` Jonathan Corbet
  2024-03-18 16:44 ` Donald Hunter
  1 sibling, 0 replies; 7+ messages in thread
From: Jonathan Corbet @ 2024-03-03 15:17 UTC (permalink / raw)
  To: Lukas Bulwahn, Mauro Carvalho Chehab, Vegard Nossum,
	Akira Yokosawa, Jani Nikula, Randy Dunlap, linux-doc
  Cc: linux-kernel, Lukas Bulwahn

Lukas Bulwahn <lukas.bulwahn@gmail.com> writes:

> As discussed (see Links), there is some inertia to move to the recent
> Sphinx versions for the doc build environment.
>
> As first step, drop the version constraints and the related comments. As
> sphinx depends on jinja2, jinja2 is pulled in automatically. So drop that.
> Then, the sphinx-pre-install script will fail though with:
>
>   Can't get default sphinx version from ./Documentation/sphinx/requirements.txt at ./scripts/sphinx-pre-install line 305.
>
> The script simply expects to parse a version constraint with Sphinx in the
> requirements.txt. That version is used in the script for suggesting the
> virtualenv directory name.
>
> To suggest a virtualenv directory name, when there is no version given in
> the requirements.txt, one could try to guess the version that would be
> downloaded with 'pip install -r Documentation/sphinx/requirements.txt'.
> However, there seems no simple way to get that version without actually
> setting up the venv and running pip. So, instead, name the directory with
> the fixed name 'sphinx_latest'.
>
> Finally update the Sphinx build documentation to reflect this directory
> name change.
>
> Link: https://lore.kernel.org/linux-doc/874jf4m384.fsf@meer.lwn.net/
> Link: https://lore.kernel.org/linux-doc/20240226093854.47830-1-lukas.bulwahn@gmail.com/
> Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
> Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
> ---
> v1 -> v2:
>   drop jinja2 as suggested by Vegard.
>   add tags from v1 review
>
>  Documentation/doc-guide/sphinx.rst    | 11 ++++++-----
>  Documentation/sphinx/requirements.txt |  7 ++-----
>  scripts/sphinx-pre-install            | 19 +++----------------
>  3 files changed, 11 insertions(+), 26 deletions(-)

I've applied this - thanks.

jon

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
  2024-03-01 14:18 [PATCH v2] docs: drop the version constraints for sphinx and dependencies Lukas Bulwahn
  2024-03-03 15:17 ` Jonathan Corbet
@ 2024-03-18 16:44 ` Donald Hunter
  2024-03-18 16:54   ` Vegard Nossum
  1 sibling, 1 reply; 7+ messages in thread
From: Donald Hunter @ 2024-03-18 16:44 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Jonathan Corbet, Mauro Carvalho Chehab, Vegard Nossum,
	Akira Yokosawa, Jani Nikula, Randy Dunlap, linux-doc,
	linux-kernel

Lukas Bulwahn <lukas.bulwahn@gmail.com> writes:

> As discussed (see Links), there is some inertia to move to the recent
> Sphinx versions for the doc build environment.
>
> [...]
>
> Link: https://lore.kernel.org/linux-doc/874jf4m384.fsf@meer.lwn.net/
> Link: https://lore.kernel.org/linux-doc/20240226093854.47830-1-lukas.bulwahn@gmail.com/
> Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
> Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
> ---
> v1 -> v2:
>   drop jinja2 as suggested by Vegard.
>   add tags from v1 review
>
>  Documentation/doc-guide/sphinx.rst    | 11 ++++++-----
>  Documentation/sphinx/requirements.txt |  7 ++-----
>  scripts/sphinx-pre-install            | 19 +++----------------
>  3 files changed, 11 insertions(+), 26 deletions(-)

Apologies if I am a little late to the party here - I am just catching
up with the changes on docs-next.

I went to install Sphinx 2.4.4 using requirements.txt for some doc work
and hit the upstream Sphinx dependency breakage. So I pulled docs-next
with the intention of sending a patch to requirements.txt with pinned
dependences. When I noticed that things have already moved on in
docs-next, I decided to spend some time investigating the performance
regression that has been present in Sphinx from 3.0.0 until now.

With Sphinx 2.4.4 I always get timings in this ballpark:

% time make htmldocs
...
real	4m5.417s
user	17m0.379s
sys	1m11.889s

With Sphinx 7.2.6 it's typically over 9 minutes:

% time make htmldocs
...
real	9m0.533s
user	15m38.397s
sys	1m0.907s

I collected profiling data using cProfile:

export srctree=`pwd`
export BUILDDIR=`pwd`/Documentation/output
python3 -m cProfile -o profile.dat ./sphinx_latest/bin/sphinx-build \
    -b html \
    -c ./Documentation \
    -d ./Documentation/output/.doctrees \
    -D version=6.8.0 -D release= \
    -D kerneldoc_srctree=. -D kerneldoc_bin=./scripts/kernel-doc \
    ./Documentation \
    ./Documentation/output

Here's some of the profiling output:

$ python3 -m pstats profile.dat
Welcome to the profile statistics browser.
profile.dat% sort tottime
profile.dat% stats 10
Fri Mar 15 17:09:39 2024    profile.dat

         3960680702 function calls (3696376639 primitive calls) in 1394.384 seconds

   Ordered by: internal time
   List reduced from 6733 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
770364892  165.102    0.000  165.102    0.000 sphinx/domains/c.py:153(__eq__)
   104124  163.968    0.002  544.788    0.005 sphinx/domains/c.py:1731(_find_named_symbols)
543888397  123.767    0.000  176.685    0.000 sphinx/domains/c.py:1679(children_recurse_anon)
     4292   74.081    0.017   74.081    0.017 {method 'poll' of 'select.poll' objects}
631233096   69.389    0.000  246.017    0.000 sphinx/domains/c.py:1746(candidates)
121406721/3359598   65.689    0.000   76.762    0.000 docutils/nodes.py:202(_fast_findall)
  3477076   64.387    0.000   65.758    0.000 sphinx/util/nodes.py:633(_copy_except__document)
544032973   52.950    0.000   52.950    0.000 sphinx/domains/c.py:156(is_anon)
79012597/3430   36.395    0.000   36.395    0.011 sphinx/domains/c.py:1656(clear_doc)
286882978   31.271    0.000   31.279    0.000 {built-in method builtins.isinstance}

profile.dat% callers c.py:153
   Ordered by: internal time
   List reduced from 6733 to 4 due to restriction <'c.py:153'>

Function                            was called by...
                                       ncalls  tottime  cumtime
sphinx/domains/c.py:153(__eq__)  <- 631153346  134.803  134.803  sphinx/domains/c.py:1731(_find_named_symbols)
                                       154878    0.041    0.041  sphinx/domains/c.py:2085(find_identifier)
                                    139056533   30.259   30.259  sphinx/domains/c.py:2116(direct_lookup)
                                          135    0.000    0.000  sphinx/util/cfamily.py:89(__eq__)

From that you can see there is a significant call amplification from
_find_named_symbols (100k calls) to __eq__ (630 million calls), plus
several other expensive functions. Looking at the code [1], you can see
why. It's doing a list walk to find matching symbols. When adding new
symbols it does an exhaustive walk to check for duplicates, so you get
worst-case performance, with ~13k symbols in a list during the doc
build.

I have an experimental fix that uses a dict for lookups. With the fix, I
consistently get times in the sub 5 minute range:

% time make htmldocs
...
real	4m27.085s
user	10m56.985s
sys	0m56.385s

I expect there are other speedups to be found. I will clean up my Sphinx
changes and share them on a GitHub branch (as well as push them
upstream) so that others can try them out.

For some reason, if I run sphinx-build manually with -j 12 (I have a 12
core machine) I get better performance than make htmldocs:

% sphinx-build -j 12 ...
...
real	3m56.074s
user	9m52.775s
sys	0m52.905s

I haven't had a chance to look at what makes the difference here, but
will investigate when I have time.

Cheers,
Donald.

[1] https://github.com/sphinx-doc/sphinx/blob/ff252861a7b295e8dd8085ea9f6ed85e085273fc/sphinx/domains/c/_symbol.py#L235-L283

> diff --git a/Documentation/sphinx/requirements.txt b/Documentation/sphinx/requirements.txt
> index 5d47ed443949..5017f307c8a4 100644
> --- a/Documentation/sphinx/requirements.txt
> +++ b/Documentation/sphinx/requirements.txt
> @@ -1,6 +1,3 @@
> -# jinja2>=3.1 is not compatible with Sphinx<4.0
> -jinja2<3.1
> -# alabaster>=0.7.14 is not compatible with Sphinx<=3.3
> -alabaster<0.7.14
> -Sphinx==2.4.4
> +alabaster
> +Sphinx
>  pyyaml

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
  2024-03-18 16:44 ` Donald Hunter
@ 2024-03-18 16:54   ` Vegard Nossum
  2024-03-18 17:10     ` Donald Hunter
  0 siblings, 1 reply; 7+ messages in thread
From: Vegard Nossum @ 2024-03-18 16:54 UTC (permalink / raw)
  To: Donald Hunter, Lukas Bulwahn
  Cc: Jonathan Corbet, Mauro Carvalho Chehab, Akira Yokosawa,
	Jani Nikula, Randy Dunlap, linux-doc, linux-kernel


On 18/03/2024 17:44, Donald Hunter wrote:
> With Sphinx 2.4.4 I always get timings in this ballpark:
> 
> % time make htmldocs
> ...
> real	4m5.417s
> user	17m0.379s
> sys	1m11.889s
> 
> With Sphinx 7.2.6 it's typically over 9 minutes:
> 
> % time make htmldocs
> ...
> real	9m0.533s
> user	15m38.397s
> sys	1m0.907s

Was this running 'make cleandocs' (or otherwise removing the output
directory) in between? Sphinx is known to be slower if you already have
an output directory with existing-but-obsolete data, I believe this is
the case even when switching from one Sphinx version to another. Akira
also wrote about the 7.x performance:

https://lore.kernel.org/linux-doc/6e4b66fe-dbb3-4149-ac7e-8ae333d6fc9d@gmail.com/

> I have an experimental fix that uses a dict for lookups. With the fix, I
> consistently get times in the sub 5 minute range:

Fantastic!

There is a github issue for the C++ domain but I believe it's the same
issue for both C and C++ domains:

https://github.com/sphinx-doc/sphinx/issues/10966


Vegard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
  2024-03-18 16:54   ` Vegard Nossum
@ 2024-03-18 17:10     ` Donald Hunter
  2024-03-19 17:59       ` Donald Hunter
  0 siblings, 1 reply; 7+ messages in thread
From: Donald Hunter @ 2024-03-18 17:10 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Lukas Bulwahn, Jonathan Corbet, Mauro Carvalho Chehab,
	Akira Yokosawa, Jani Nikula, Randy Dunlap, linux-doc,
	linux-kernel

On Mon, 18 Mar 2024 at 16:54, Vegard Nossum <vegard.nossum@oracle.com> wrote:
>
> > % time make htmldocs
> > ...
> > real  9m0.533s
> > user  15m38.397s
> > sys   1m0.907s
>
> Was this running 'make cleandocs' (or otherwise removing the output
> directory) in between? Sphinx is known to be slower if you already have

Yes, times were after 'make cleandocs'.

> an output directory with existing-but-obsolete data, I believe this is
> the case even when switching from one Sphinx version to another. Akira
> also wrote about the 7.x performance:
>
> https://lore.kernel.org/linux-doc/6e4b66fe-dbb3-4149-ac7e-8ae333d6fc9d@gmail.com/

Having looked at the Sphinx code, it doesn't surprise me that
incremental builds can have worse performance. There's probably going
to be some speedups to be found when we go looking for them.

> > I have an experimental fix that uses a dict for lookups. With the fix, I
> > consistently get times in the sub 5 minute range:
>
> Fantastic!
>
> There is a github issue for the C++ domain but I believe it's the same
> issue for both C and C++ domains:
>
> https://github.com/sphinx-doc/sphinx/issues/10966

Ahh, I looked for an issue for the C domain but did not see one. I
didn't think to check for issues with the C++ domain, even though the
code for the C domain has been copied from there.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
  2024-03-18 17:10     ` Donald Hunter
@ 2024-03-19 17:59       ` Donald Hunter
  2024-03-21 16:56         ` Donald Hunter
  0 siblings, 1 reply; 7+ messages in thread
From: Donald Hunter @ 2024-03-19 17:59 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Lukas Bulwahn, Jonathan Corbet, Mauro Carvalho Chehab,
	Akira Yokosawa, Jani Nikula, Randy Dunlap, linux-doc,
	linux-kernel

On Mon, 18 Mar 2024 at 17:10, Donald Hunter <donald.hunter@gmail.com> wrote:
>
> On Mon, 18 Mar 2024 at 16:54, Vegard Nossum <vegard.nossum@oracle.com> wrote:
> >
> > > % time make htmldocs
> > > ...
> > > real  9m0.533s
> > > user  15m38.397s
> > > sys   1m0.907s
> >
> > Was this running 'make cleandocs' (or otherwise removing the output
> > directory) in between? Sphinx is known to be slower if you already have
>
> Yes, times were after 'make cleandocs'.
>
> > an output directory with existing-but-obsolete data, I believe this is
> > the case even when switching from one Sphinx version to another. Akira
> > also wrote about the 7.x performance:
> >
> > https://lore.kernel.org/linux-doc/6e4b66fe-dbb3-4149-ac7e-8ae333d6fc9d@gmail.com/
>
> Having looked at the Sphinx code, it doesn't surprise me that
> incremental builds can have worse performance. There's probably going
> to be some speedups to be found when we go looking for them.

Following up on this, Symbol.clear_doc(docname) does a linear walk of
symbols which impacts incremental builds. The implementation of
clear_doc() looks broken in other ways which I think would further
worsen the incremental build performance.

Incremental builds also seem to do far more work than I'd expect. A
single modified .rst file is quick to build but a handful of modified
.rst files seems to trigger a far larger rebuild. That would be worth
investigating too.

> > > I have an experimental fix that uses a dict for lookups. With the fix, I
> > > consistently get times in the sub 5 minute range:
> >
> > Fantastic!

I pushed my performance changes to GitHub if you want to try them out:

https://github.com/donaldh/sphinx/tree/c-domain-speedup

I noticed that write performance (the second phase of sphinx-build) is
quite slow and doesn't really benefit from multi processing with -j
nn. It turns out that the bulk of the write work is done in the main
process and only the eventual writing is farmed out to forked
processes. I experimented with pushing more work out to the forked
processes (diff below) and it gives a significant speedup at the cost
of breaking index generation. It might be a  viable enhancement if
indexing can be fixed thru persisting the indices from the
sub-processes and merging them in the main process.

With the below patch, this is the build time I get:

% time make htmldocs SPHINXOPTS=-j12
...
real 1m58.988s
user 9m57.817s
sys 0m49.411s

Note that I get better performance with -j12 than -jauto which auto
detects 24 cores.

diff --git a/sphinx/builders/__init__.py b/sphinx/builders/__init__.py
index 6afb5d4cc44d..6b203799390e 100644
--- a/sphinx/builders/__init__.py
+++ b/sphinx/builders/__init__.py
@@ -581,9 +581,11 @@ class Builder:
                 self.write_doc(docname, doctree)

     def _write_parallel(self, docnames: Sequence[str], nproc: int) -> None:
-        def write_process(docs: list[tuple[str, nodes.document]]) -> None:
+        def write_process(docs: list[str]) -> None:
             self.app.phase = BuildPhase.WRITING
-            for docname, doctree in docs:
+            for docname in docs:
+                doctree = self.env.get_and_resolve_doctree(docname, self)
+                self.write_doc_serialized(docname, doctree)
                 self.write_doc(docname, doctree)

         # warm up caches/compile templates using the first document
@@ -596,6 +598,7 @@ class Builder:

         tasks = ParallelTasks(nproc)
         chunks = make_chunks(docnames, nproc)
+        logger.info(f"_write_parallel: {len(chunks)} chunks")

         # create a status_iterator to step progressbar after writing a document
         # (see: ``on_chunk_done()`` function)
@@ -607,12 +610,7 @@ class Builder:

         self.app.phase = BuildPhase.RESOLVING
         for chunk in chunks:
-            arg = []
-            for docname in chunk:
-                doctree = self.env.get_and_resolve_doctree(docname, self)
-                self.write_doc_serialized(docname, doctree)
-                arg.append((docname, doctree))
-            tasks.add_task(write_process, arg, on_chunk_done)
+            tasks.add_task(write_process, chunk, on_chunk_done)

         # make sure all threads have finished
         tasks.join()

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
  2024-03-19 17:59       ` Donald Hunter
@ 2024-03-21 16:56         ` Donald Hunter
  0 siblings, 0 replies; 7+ messages in thread
From: Donald Hunter @ 2024-03-21 16:56 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Lukas Bulwahn, Jonathan Corbet, Mauro Carvalho Chehab,
	Akira Yokosawa, Jani Nikula, Randy Dunlap, linux-doc,
	linux-kernel

On Tue, 19 Mar 2024 at 17:59, Donald Hunter <donald.hunter@gmail.com> wrote:
>
> > > > I have an experimental fix that uses a dict for lookups. With the fix, I
> > > > consistently get times in the sub 5 minute range:
> > >
> > > Fantastic!
>
> I pushed my performance changes to GitHub if you want to try them out:
>
> https://github.com/donaldh/sphinx/tree/c-domain-speedup

Now a PR: https://github.com/sphinx-doc/sphinx/pull/12162

Following up on the incremental build performance, I have been
experimenting with different batch sizes when building the Linux
kernel docs. My results suggest that the best performance is achieved
by using a minimum batch size of 200 for reads because batches smaller
than that have too high a merge overhead back into the main process. I
also experimented with a minimum threshold of 500 before even
splitting into batches, i.e. if there are less than 500 changed docs
then just process them serially.

With the existing make_chunks behaviour, a small number of changed
docs gives worst case behaviour of 1 doc per chunk. Merging single
docs back into the main process destroys any benefit from the parallel
processing. E.g. running make htmldocs SPHINXOPTS=-j12

Running Sphinx v7.2.6
[...]
building [html]: targets for 3445 source files that are out of date
updating environment: [new config] 3445 added, 0 changed, 0 removed
[...]
real 7m46.198s
user 14m18.597s
sys 0m54.925s

for a full build of 3445 files vs an incremental build of just 114 files:

Running Sphinx v7.2.6
[...]
building [html]: targets for 114 source files that are out of date
updating environment: 0 added, 114 changed, 0 removed
real 5m50.746s
user 6m33.199s
sys 0m13.034s

When I run the incremental build serially with make htmldocs
SPHINXOPTS=-j1 then it is much faster:

building [html]: targets for 114 source files that are out of date
updating environment: 0 added, 114 changed, 0 removed
real 1m5.034s
user 1m3.183s
sys 0m1.616s

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-21 16:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-01 14:18 [PATCH v2] docs: drop the version constraints for sphinx and dependencies Lukas Bulwahn
2024-03-03 15:17 ` Jonathan Corbet
2024-03-18 16:44 ` Donald Hunter
2024-03-18 16:54   ` Vegard Nossum
2024-03-18 17:10     ` Donald Hunter
2024-03-19 17:59       ` Donald Hunter
2024-03-21 16:56         ` Donald Hunter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).