linux-doc.vger.kernel.org archive mirror
From: Donald Hunter <donald.hunter@gmail.com>
To: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	 Mauro Carvalho Chehab <mchehab@kernel.org>,
	Akira Yokosawa <akiyks@gmail.com>,
	 Jani Nikula <jani.nikula@linux.intel.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies
Date: Thu, 21 Mar 2024 16:56:41 +0000
Message-ID: <CAD4GDZz32w9MXf1msX0otVwZqtqFWOajG+00jQimAKUp627xcA@mail.gmail.com>
In-Reply-To: <CAD4GDZyABi3wjKY4SUV804OyBDBaC=Ckz5b0GZ34JmCX8S6V_g@mail.gmail.com>

On Tue, 19 Mar 2024 at 17:59, Donald Hunter <donald.hunter@gmail.com> wrote:
>
> > > > I have an experimental fix that uses a dict for lookups. With the fix, I
> > > > consistently get times in the sub 5 minute range:
> > >
> > > Fantastic!
>
> I pushed my performance changes to GitHub if you want to try them out:
>
> https://github.com/donaldh/sphinx/tree/c-domain-speedup

Now a PR: https://github.com/sphinx-doc/sphinx/pull/12162

Following up on the incremental build performance, I have been
experimenting with different batch sizes when building the Linux
kernel docs. My results suggest that reads perform best with a minimum
batch size of 200, because merging smaller batches back into the main
process costs too much. I also experimented with a minimum threshold
of 500 before splitting into batches at all, i.e. if fewer than 500
docs have changed then just process them serially.
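
To make the two thresholds concrete, here is a minimal sketch of the
batching policy described above. It is not the code from the Sphinx PR:
plan_read_batches and its parameters are hypothetical names chosen for
illustration, and only the figures 200 and 500 come from my experiments.
Note that in this sketch the final batch can still come out smaller
than the minimum.

```python
from math import ceil

MIN_BATCH = 200         # minimum docs per read batch (figure from the text)
SERIAL_THRESHOLD = 500  # below this many changed docs, do not split at all


def plan_read_batches(docs, nproc, min_batch=MIN_BATCH,
                      serial_threshold=SERIAL_THRESHOLD):
    """Hypothetical helper: decide how to batch docs for parallel reads."""
    ndocs = len(docs)
    if ndocs == 0:
        return []
    if ndocs < serial_threshold or nproc <= 1:
        # Too few docs to justify the merge overhead: one serial batch.
        return [list(docs)]
    # Spread docs evenly over the workers, but never aim below min_batch.
    size = max(min_batch, ceil(ndocs / nproc))
    return [list(docs[i:i + size]) for i in range(0, ndocs, size)]


if __name__ == "__main__":
    for ndocs in (114, 600, 3445):
        batches = plan_read_batches(list(range(ndocs)), nproc=12)
        print(ndocs, "docs ->", [len(b) for b in batches])
```

With this policy an incremental build of 114 docs stays in a single
serial batch, while a full build is still split across the workers.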

With the existing make_chunks behaviour, a small number of changed
docs gives the worst-case behaviour of 1 doc per chunk. Merging single
docs back into the main process destroys any benefit from parallel
processing. For example, running make htmldocs SPHINXOPTS=-j12:

Running Sphinx v7.2.6
[...]
building [html]: targets for 3445 source files that are out of date
updating environment: [new config] 3445 added, 0 changed, 0 removed
[...]
real 7m46.198s
user 14m18.597s
sys 0m54.925s

That is for a full build of 3445 files, versus an incremental build of
just 114 files:

Running Sphinx v7.2.6
[...]
building [html]: targets for 114 source files that are out of date
updating environment: 0 added, 114 changed, 0 removed
real 5m50.746s
user 6m33.199s
sys 0m13.034s

When I run the incremental build serially with make htmldocs
SPHINXOPTS=-j1 then it is much faster:

building [html]: targets for 114 source files that are out of date
updating environment: 0 added, 114 changed, 0 removed
real 1m5.034s
user 1m3.183s
sys 0m1.616s
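
For anyone who wants to compare the wall-clock numbers directly, the
short script below converts the "real" times quoted above into seconds
and computes the ratio between the -j12 and -j1 incremental builds. The
helper name parse_time and the dictionary labels are my own; the
timings are copied from the logs in this message.

```python
import re


def parse_time(value):
    """Convert a time(1)-style 'XmY.Zs' string into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time format: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times from the three runs quoted in this message.
REAL_TIMES = {
    "full, -j12 (3445 docs)": "7m46.198s",
    "incremental, -j12 (114 docs)": "5m50.746s",
    "incremental, -j1 (114 docs)": "1m5.034s",
}

parallel = parse_time(REAL_TIMES["incremental, -j12 (114 docs)"])
serial = parse_time(REAL_TIMES["incremental, -j1 (114 docs)"])
slowdown = parallel / serial

if __name__ == "__main__":
    for label, raw in REAL_TIMES.items():
        print(f"{label}: {parse_time(raw):.3f}s")
    print(f"-j12 incremental takes {slowdown:.1f}x as long as -j1")
```

On these numbers the parallel incremental build takes roughly five
times as long as the serial one.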
