linux-kernel.vger.kernel.org archive mirror
* [PATCH] kernel source spellchecker
@ 2003-02-27  6:59 Dan Kegel
       [not found] ` <1046330232.15763.97.camel@localhost.localdomain>
  0 siblings, 1 reply; 46+ messages in thread
From: Dan Kegel @ 2003-02-27  6:59 UTC
  To: linux-kernel

Since the main remaining feature before release of the 2.6
kernel is fixing all the remaining spelling errors,
this patch seems appropriate.  This is against 2.4 but
should apply to other versions as well.
It's not very smart, but should help get us to our
all-important goal of 100% correctly spellt kernel source.
Todo: make it ignore names from the MAINTAINERS file,
the list of signals and syscalls, and other well-known
English words seen mostly in Webster's Posix edition;
rewrite in Perl rather than C, or add real Makefile entry.
Enjoy!
- Dan

--- /dev/null	2002-08-30 16:31:37.000000000 -0700
+++ linux/scripts/spellcheck-kernel	2003-02-26 22:51:46.000000000 -0800
@@ -0,0 +1,12 @@
+#!/bin/sh
+# Script to spellcheck kernel.
+# usage: spellcheck-kernel [ sourcedir ]
+#     The source directory defaults to /usr/src/linux.
+# e.g.
+#   scripts/spellcheck-kernel .
+#      Check spelling of the kernel tree in the current directory
+
+sourcedir=${1-/usr/src/linux}
+
+make -C .. scripts/lspell
+find "$sourcedir" -name '*.[ch]' | xargs ./lspell
--- /dev/null	2002-08-30 16:31:37.000000000 -0700
+++ linux/scripts/lspell.c	2003-02-26 22:51:14.000000000 -0800
@@ -0,0 +1,75 @@
+/*
+ * C comment spell checker
+ * For each given source file, print the filename, then
+ * extract all comments from the file, send them through the system
+ * spellchecker, sort the list of words flagged as misspellings,
+ * and word-wrap the sorted list.
+ * Copyright 2003, Dan Kegel.  Licensed under GPL.  See the file ../COPYING for details.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+int
+main(int argc, char **argv)
+{
+	int argi;
+
+	for (argi = 1; argi < argc; argi++) {
+		int c;
+		enum state_t { NONCOMMENT, SLASH, COMMENT, STAR, EOLCOMMENT };
+		enum state_t state = NONCOMMENT;
+		FILE *fp = fopen(argv[argi], "rt");
+		if (!fp) {
+			perror(argv[argi]);
+			continue;
+		}
+		FILE *pout = popen("/usr/bin/spell | sort -f | fmt", "w");
+		if (!pout) {
+			perror("/usr/bin/spell | sort -f | fmt");
+			exit(1);
+		}
+		printf("\n%s:\n", argv[argi]);
+		fflush(stdout);
+		while ((c = getc(fp)) != EOF) {
+			switch (state) {
+			case NONCOMMENT:
+				if (c == '/')
+					state = SLASH;
+				break;
+			case SLASH:
+				if (c == '*')
+					state = COMMENT;
+				else if (c == '/')
+					state = EOLCOMMENT;
+				else {
+					state = NONCOMMENT;
+				}
+				break;
+			case COMMENT:
+				if (c == '*')
+					state = STAR;
+				else
+					fputc(c, pout);
+				break;
+			case STAR:
+				if (c == '/')
+					state = NONCOMMENT;
+				else {
+					if (c != '*') {
+						fputc('\n', pout);
+						state = COMMENT;
+					}
+				}
+				break;
+			case EOLCOMMENT:
+				if (c == '\n')
+					state = NONCOMMENT;
+				else
+					fputc(c, pout);
+				break;
+			}
+		}
+		pclose(pout);
+		fclose(fp);
+	}
+	exit(0);
+}

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045
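
The comment-extracting state machine in lspell.c above can be tried on an
in-memory buffer, without spawning /usr/bin/spell. What follows is a minimal
sketch, not part of the posted patch: the names extract_comments and put are
hypothetical, and it deviates from the patch in two places that are labelled
in the comments (it keeps the character after a lone '*', and it emits a
newline when a comment ends so that adjacent comments do not run together).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum state { NONCOMMENT, SLASH, COMMENT, STAR, EOLCOMMENT };

/* Append one byte to out if room remains; one byte is reserved for the NUL. */
static void put(char *out, size_t outsz, size_t *len, char c)
{
	if (*len + 1 < outsz)
		out[(*len)++] = c;
}

/*
 * Hypothetical helper: copy the comment text found in src into out.
 * Returns the number of bytes written, not counting the trailing NUL.
 */
size_t extract_comments(const char *src, char *out, size_t outsz)
{
	enum state state = NONCOMMENT;
	size_t len = 0;

	assert(src != NULL && out != NULL && outsz > 0);

	for (; *src; src++) {
		char c = *src;

		switch (state) {
		case NONCOMMENT:
			if (c == '/')
				state = SLASH;
			break;
		case SLASH:
			if (c == '*')
				state = COMMENT;
			else if (c == '/')
				state = EOLCOMMENT;
			else
				state = NONCOMMENT;
			break;
		case COMMENT:
			if (c == '*')
				state = STAR;
			else
				put(out, outsz, &len, c);
			break;
		case STAR:
			if (c == '/') {
				state = NONCOMMENT;
				/* deviation: separate this comment from the next */
				put(out, outsz, &len, '\n');
			} else if (c != '*') {
				put(out, outsz, &len, '\n');
				/* deviation: keep c instead of dropping it */
				put(out, outsz, &len, c);
				state = COMMENT;
			}
			break;
		case EOLCOMMENT:
			if (c == '\n') {
				state = NONCOMMENT;
				/* deviation: separate this comment from the next */
				put(out, outsz, &len, '\n');
			} else {
				put(out, outsz, &len, c);
			}
			break;
		}
	}
	out[len] = '\0';
	return len;
}
```

Like the patch, this sketch knows nothing about string literals, so a "/*"
inside a quoted string is still treated as the start of a comment.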


* Re: [PATCH] kernel source spellchecker
@ 2003-03-01 15:57 shaheed
  2003-03-01 16:35 ` Jörn Engel
  0 siblings, 1 reply; 46+ messages in thread
From: shaheed @ 2003-03-01 15:57 UTC
  To: ms; +Cc: linux-kernel


Matthias,

Here is a list of corrections...I have omitted those that seem OK to me, 
apostrophes, proper names, some that seem to be hyphenation-related, American 
vs. British differences and a few others.

In the case of broken American spelling, I have provided American fixes 
(against my better judgement :-)). Enjoy...

accommodate=accomodate
adapter=adaptor
address=adddress
additional=additionnal
alignment=alignement
always=allways
appropriate=apropriate
around=arround
associated=assosciated,assosiated
asynchronous=asyncronous
Auxiliary=Auxillary
available=availible,avaliable
basically=basicly
being=beeing
broken=borken
boundary=boundry
brain-damaged=dain-bramaged,dain bramaged
calling=callin
capabilities=capabilites
chosen=choosen
command=comamnd
coming=comming
committed=commited
comparison=comparision
Compatibility=Compatability
compatibility=compatibilty,compatiblity
completely=completly
concurrent=concurent
Continuous=Continous
continuous=continous
controller=controler,controllen
corresponding=coresponding
decrementor=decrementer
descriptor=decriptor,desciptor
deferred=defered
definitions=defintions
dependent=dependend
divide=devide
differentiate=differenciate
entries=entrys
every time=everytime
explicitly=explicitely
forward=foward
function=fuction,funtion
guaranteed=guarenteed
handling=handeling
hardware=harware
physical=hysical
immediately=immediatly
implementation=implemantation,implmentation
Incoming=Incomming
incoming=incomming
index=indice
information=infomation
Infinity=Inifity
initial=inital
initialization=initalization,initilization,intialization
Initialize=Initalize,Intialize
initialize=initalize,intialize
interface=inteface
Interrupt=Interupt
interrupt=interrrupt
interrupts=interrups
interval=intervall
invocation=invokation
Length=Lenght
management=managment
necessary=neccessary
negotiated=negociated
No-one=Noone
occurred=occured
occurrence=occurence
occurring=occuring
output=ouput
outputting=outputing
overridden=overriden
parameter=paramter
parameters=paramters
performance=performace
promiscuous=promiscous
receiving=receving
Receive=Recieve
receive=recieve
received=recieved
registered=registred
Register=Regsiter
relevant=relevent
resources=ressources
scatter=scather
specific=specifc
specified=specifed,speficied
successful=succesful,successfull
superfluous=superflous
threshold=threshhold
through=throught
timing=timming
transceiver=tranceiver
transferring=transfering
transmitting=transmiting
transferred=trasfered
truly=truely
ugliness=uglyness
usable=useable
useful=usefull
vertices=verticies
warranty=waranty
wasteful=watseful
writing=writting
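
Each entry in the list above has the form correct=wrong1,wrong2. A minimal
sketch of how one such entry might be consulted is given here; the function
name fix_for is hypothetical, the match is case-sensitive (the list carries
separate capitalized entries such as Receive=Recieve), and the entry is
assumed to have no trailing newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: entry is one line of the list, "right=wrong1,wrong2".
 * If word equals one of the misspellings, copy the correct spelling into
 * fix and return 1.  Return 0 if there is no match, the entry has no '=',
 * or fix is too small.
 */
int fix_for(const char *entry, const char *word, char *fix, size_t fixsz)
{
	const char *eq, *p;
	size_t wlen, rlen;

	assert(entry != NULL && word != NULL && fix != NULL);

	eq = strchr(entry, '=');
	if (!eq)
		return 0;
	rlen = (size_t)(eq - entry);	/* length of the correct spelling */
	wlen = strlen(word);
	if (wlen == 0 || rlen + 1 > fixsz)
		return 0;

	p = eq + 1;
	while (*p) {
		size_t n = strcspn(p, ",");	/* length of this alternative */

		if (n == wlen && strncmp(p, word, n) == 0) {
			memcpy(fix, entry, rlen);
			fix[rlen] = '\0';
			return 1;
		}
		p += n;
		if (*p == ',')
			p++;	/* step over the separator */
	}
	return 0;
}
```

With the entry "specified=specifed,speficied", looking up "speficied" yields
"specified", while looking up the already correct "specified" reports no
match.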

[parent not found: <20030301160017$56fc@gated-at.bofh.it>]
* Re: [PATCH] kernel source spellchecker
@ 2003-03-02 18:56 Jared Daniel J. Smith
  2003-03-02 17:22 ` Bernd Petrovitsch
  2003-03-02 18:46 ` Steven Cole
  0 siblings, 2 replies; 46+ messages in thread
From: Jared Daniel J. Smith @ 2003-03-02 18:56 UTC
  To: linux-kernel

Regarding these two cautious comments:

==========================================================================
I wouldn't go that far. Better give a list of speling mistakes (file/line)
and fix them by hand. It won't need to be done more than occasionally, so
the overhead is not too bad. --Dr. Horst H. von Brand 

It might also be worth adding a list of 'suspect' spellings -- which
require human intervention. Such items might include 'indices=indexes'
and 'erratum=errata' although you can't do it automatically because
sometimes the right-hand side is actually correct. --David Woodhouse
==========================================================================

I fully agree.

I have tried to automatically spell-check long, complex texts for years,
with numerous algorithms; all of them fail for one reason or another,
and I find that the only proper way to do it is the tedious work by hand.

Even a single lost pun because of overenthusiastic spellchecking is
not worth the cleanup. I would prefer to see typos than lose a single
intentional 'misspelling'. It would be best if you posted all changes 
somewhere so that they could be verified manually.

Consider the following:

alignment=alignement
alignement is French; is this intentional?

constants=konstants
konstants is German; is this intentional?

consumer=comsumer
comsumer is a neologism: http://www.firstmonday.dk/issues/issue5_5/henshall/

Converted=Coverted
is it a pun on something 'hidden' or is it something transformed?

descriptor=decriptor,desciptor
is it descriptor or decrypter?

invocation=invokation
invokation is German; is this intentional?

negative=negativ
negativ is a legitimate non-English word; is this intentional?

signaled=signalled
signaling=Signalling
signaling=signalling
signalled is a legitimate alternate spelling of signaled.

succeeded=succeded
succeded could also be a typo for 'succeed'

through=throught,throuth
throught could also be a typo for 'thought'

writable=writeable
writeable is a legitimate alternate spelling of writable

Thank you,

-Jared
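
The approach favoured in the message above (list suspect spellings by file and
line, then fix them by hand) could be sketched roughly as follows. This is
only an illustration under stated assumptions: the name report_suspects, its
parameters, and the output format are hypothetical, a "word" is taken to be a
run of ASCII letters, and nothing in the scanned text is rewritten.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the n bytes at p equal the NUL-terminated word w. */
static int same_word(const char *p, size_t n, const char *w)
{
	return strlen(w) == n && strncmp(p, w, n) == 0;
}

/*
 * Hypothetical helper: scan text word by word and print
 * "file:line: suspect spelling 'word'" to out for every word that
 * appears in suspects[].  Returns the number of hits.
 */
int report_suspects(const char *file, const char *text,
		    const char *const *suspects, size_t nsuspects, FILE *out)
{
	int line = 1, hits = 0;
	const char *p = text;

	assert(file != NULL && text != NULL && out != NULL);
	assert(suspects != NULL || nsuspects == 0);

	while (*p) {
		size_t n = 0, i;

		if (*p == '\n')
			line++;
		if (!isalpha((unsigned char)*p)) {
			p++;
			continue;
		}
		/* measure the run of letters starting at p */
		while (isalpha((unsigned char)p[n]))
			n++;
		for (i = 0; i < nsuspects; i++) {
			if (same_word(p, n, suspects[i])) {
				fprintf(out, "%s:%d: suspect spelling '%.*s'\n",
					file, line, (int)n, p);
				hits++;
				break;
			}
		}
		p += n;
	}
	return hits;
}
```

Because only whole words are compared, a suspect such as "indexes" is not
reported inside a longer word like "reindexes"; whether a reported hit is
really a mistake is left to the person reading the list.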




Thread overview: 46+ messages
2003-02-27  6:59 [PATCH] kernel source spellchecker Dan Kegel
     [not found] ` <1046330232.15763.97.camel@localhost.localdomain>
2003-03-01  5:38   ` Dan Kegel
2003-03-01 14:11     ` Matthias Schniedermeyer
2003-03-01 17:13       ` Matthias Schniedermeyer
2003-03-01 18:54       ` Dan Kegel
2003-03-01 19:18         ` Steven Cole
2003-03-01 21:20           ` Dan Kegel
2003-03-02  3:45             ` jw schultz
2003-03-02  2:08           ` Dan Kegel
2003-03-02  3:02             ` Dan Kegel
2003-03-02  3:54               ` Steven Cole
2003-03-02  8:04                 ` Dan Kegel
2003-03-02  4:16               ` Steven Cole
2003-03-02  8:21                 ` Dan Kegel
2003-03-02  8:40                   ` jw schultz
2003-03-02 11:21                 ` David Woodhouse
2003-03-02 13:49                   ` Steven Cole
2003-03-02 14:55                     ` David Woodhouse
2003-03-02 22:44                     ` Alan Cox
2003-03-02 22:59                       ` John Bradford
2003-03-03  2:29                       ` Dan Kegel
     [not found]                     ` <3E62C0FF.1090700@kegel.com>
     [not found]                       ` <1046661777.7527.518.camel@spc1.mesatop.com>
2003-03-03  5:36                         ` Dan Kegel
     [not found]                         ` <3E62E4C0.9070103@kegel.com>
     [not found]                           ` <1046668274.7527.533.camel@spc1.mesatop.com>
2003-03-03  5:48                             ` Dan Kegel
2003-03-02 15:35                   ` Dan Kegel
2003-03-02  8:09               ` Matthias Schniedermeyer
2003-03-02  8:13                 ` Matthias Schniedermeyer
2003-03-02  3:29             ` Steven Cole
2003-03-01 19:30         ` Matthias Schniedermeyer
2003-03-01 20:33           ` Matthias Schniedermeyer
2003-03-01 21:25           ` Dan Kegel
2003-03-01 21:25             ` Matthias Schniedermeyer
2003-03-02  9:15               ` John Bradford
2003-03-02  9:31                 ` Matthias Schniedermeyer
2003-03-02  3:16         ` Horst von Brand
2003-03-01 15:57 shaheed
2003-03-01 16:35 ` Jörn Engel
2003-03-01 18:01   ` shaheed
2003-03-01 18:31     ` Jörn Engel
2003-03-05 18:10   ` Pavel Machek
     [not found] <20030301160017$56fc@gated-at.bofh.it>
2003-03-01 18:39 ` Pascal Schmidt
2003-03-02 18:56 Jared Daniel J. Smith
2003-03-02 17:22 ` Bernd Petrovitsch
2003-03-02 17:47   ` Werner Almesberger
2003-03-02 18:28     ` Bernd Petrovitsch
2003-03-02 18:46 ` Steven Cole
2003-03-02 22:32   ` Alan Cox
