From: Jonathan Corbet
To: Mauro Carvalho Chehab, Jani Nikula
Cc: ksummit-discuss@lists.linuxfoundation.org, ksummit@lists.linux.dev
Subject: Re: [Ksummit-discuss] [TECH TOPIC] What kernel documentation could be
In-Reply-To: <20220623105747.079ac92b@sal.lan>
References: <87sfo3nh3t.fsf@meer.lwn.net> <20220618092447.5ebed314@sal.lan> <875ykrrb45.fsf@intel.com> <20220623105747.079ac92b@sal.lan>
Date: Thu, 23 Jun 2022 07:40:45 -0600
Message-ID: <87a6a38plu.fsf@meer.lwn.net>

Mauro Carvalho Chehab writes:

>> The solution would be to finally convert the
>> script to a proper python Sphinx extension that can do caching. (This is
>> how it works in Hawkmoth, FWIW.)

I've been pondering on this for a bit, and would like to do it, but I
don't know when I might find the time for it.

> That's one solution, but see: there is already a python extension
> that currently calls kernel-doc everytime. It could, instead,
> cache the rst returned by its first run (or a parsed version of it)
> and use the cached results the other 3 times.
>
> Porting kernel-doc to python could be doable, but not trivial, due to several
> reasons:
>
> - it should keep running standalone, as otherwise debugging parsing issues
>   on kernel-doc would be a lot harder. In particular, kernel-doc --none is
>   really helpful to report kernel-doc tag errors;

Yes, of course.  As Jani noted, that's just how you would do it, not a
problem.

> - regressions will likely be introduced on a change like that;

The nice thing is ... we already have a really nice regression test in
the form of the current docs build and diff.

> - regular contributors to kernel-doc will need to ramp up with the newer
>   version;

We have those?  That script is a nightmare and nobody goes near it if
they can possibly avoid it.  I would expect to have more contributors
with a decent Python version that doesn't include 25 years of regex
accretion.

> - a port like that could increase the script run time, as the
>   optimizations and regular expressions there could behave different on
>   python.

It could also decrease it by improving caching opportunities, getting
rid of a lot of fork()/exec() pairs and Perl interpreter startups, etc.

I've actually, in a spare moment or two, been doing some profiling of
the kernel docs build and trying to track down the sources of the
slowness.  I am thinking that nearly 700 *million* calls to the iterator
for the C-domain symbol list might have something to do with it...
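
[Editorial sketch, not part of the original message.] The caching idea
discussed above (run kernel-doc once per source file and reuse the
result) could look roughly like the following.  This is a minimal
illustration under stated assumptions, not how the in-tree Sphinx
extension works: the names KernelDocCache, get_rst and run_command are
hypothetical, and the cache only reuses *identical* invocations.  A real
extension that calls the script with different options per directive
would more likely cache a parsed form of the output instead.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, Sequence, Tuple


def run_command(argv: Sequence[str]) -> str:
    """Hypothetical helper: run an external command, return its stdout."""
    result = subprocess.run(
        list(argv), check=True, capture_output=True, text=True
    )
    return result.stdout


class KernelDocCache:
    """Hypothetical in-memory cache of generated RST.

    Entries are keyed on the full argument vector plus the source
    file's modification time, so editing the file invalidates them.
    """

    def __init__(
        self, runner: Callable[[Sequence[str]], str] = run_command
    ) -> None:
        self._runner = runner
        self._cache: Dict[Tuple[Tuple[str, ...], int], str] = {}
        self.misses = 0  # number of times the runner was really invoked

    def get_rst(self, script: str, options: Sequence[str], source: str) -> str:
        argv = (script, *options, source)
        key = (argv, Path(source).stat().st_mtime_ns)
        if key not in self._cache:
            # Only here do we pay for the fork()/exec() and the
            # interpreter startup of the external script.
            self.misses += 1
            self._cache[key] = self._runner(argv)
        return self._cache[key]
```

With something like this, repeated requests for the same file and
options would cost one script run rather than several; the runner is
injectable so the cache logic can be tested without the real script.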

> True, but independently if the script would be rewritten in python or not,
> one way would be to enrich the 'DOCS:' kernel-doc tag in order to mention
> there the symbols that belong to each part of the document, e. g. something
> like:
>
>     /**
>      * DOC: foo
>      *
>      * Some comments...
>      *
>      * symbols:
>      *   foo
>      *   bar
>      */
>
> One advantage is that all documentation will be on a single place,
> so hopefully it would be easier to maintain.

I'm not quite sure I get this...you want to put the TOC tree in the
source comments?  This looks like the kind of thing that nobody ever
remembers to update, but maybe I'm missing something.

Thanks,

jon