Tardis:What SpellBot actually corrects

Because even the most conscientious of editors will occasionally make spelling errors, there is a need to have bot enforcement of the spelling policy. A comprehensive list of the differences between British and American spellings has been compiled, and is being coded for bot use as of the second week of June, 2011. This page will see heavy updating throughout that week as the list is fully coded.

Following is the raw code of the bot routine (known as a "user-fix"), so that all users may see what exactly the bot is checking for.

Words impossible for a bot
Some words are beyond the capability of the bot, because they are valid spellings (even if of different words) in British English. This list includes:
 * Check. Americans use this word to mean not only the verb to inquire after or to investigate, but also the noun, which is a financial instrument.  Because  BrEng spells the verb that way, too, the bot can't be programmed to correct the other usage.  We'd end up with sentences like:
 * The Doctor chequed on Sarah Jane in her hospital room before going to the pathology lab.


 * Tire. Both sides of the Atlantic use tire as a verb.  It's again the noun that's problematic.  Americans view tire as the correct spelling for what the British would call a tyre.  The bot can't figure this one out, so it doesn't even try.
 * Draft. Americans use this spelling for all senses; the British use both draft and draught, for different senses.  All words beginning with drafts- will be converted to draughts-, and the word drafty will be converted to draughty, but the word draft itself won't be touched by the bot, as that is a valid British spelling of the word.  Clear as mud?  Cool.  Onwards, then...
 * Disc. Way, way, way too screwed up a word for a simple bot to handle.  Disc jockey is fine on both sides of the divide, but so is floppy disk and hard disk. This one simply depends on context.
 * Practise. The -ise version of this word is the correct spelling for the verb in BrEng; the British noun ends in -ice.  Americans use -ice for everything.  Thus, the bot can't be of much use, except for participles derived from the verb.  So, the bot will make no attempt to change the spelling of practice, but it will change practicing and practiced to practising and practised.
 * Jail. Yes, gaol is still correct, especially historically, but jail has largely supplanted it.  So we won't look to correct jail to gaol, but neither will we try to correct gaol to jail.  Because of the known presence of both spellings in DWU fiction, this one can't be decided by forum debate.  We just have to live with gaol occasionally popping up.
 * Licence. License is correct in BrEng as a verb, so the bot can't correct for Americans using license as a noun, where BrEng wants licence.  Other -ence words don't necessarily work this way.  Offence and defence are unambiguously correct, but then again their verb forms are different (offend and defend), which means their gerunds and verbal nouns are different, too.
 * Storey is the proper British spelling for a floor in a building. Americans just spell this story, as in a 15-story building.  Obviously, the bot can't make this correction, because the word for a tale is spelled story on both sides of the Atlantic.
 * Chimaera/chimera. Standard British spelling in the Cambridge Online Dictionary is chimera, but there are other sources which state that the British prefer chimaera.  Chimaera, however, is the universal spelling for a certain kind of fish, and the mountain from which the legend of the chimera got its name.  The bot's going to steer well clear of all this. It's probably best to spell however the particular source spells this word.
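
To see why the bot leaves a word like check alone, here is a minimal sketch in Python of what a blind rule would do. The rule and the helper name apply_rule are hypothetical and are not part of the bot's code; they only illustrate the problem described above.

```python
import re

# Hypothetical rule, NOT in the bot's code: blindly turn "check" into "cheque".
naive_rule = (r'([Cc])heck', r'\1heque')


def apply_rule(rule, text):
    """Apply one (pattern, replacement) pair to the text."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, text)


# The financial noun comes out as intended...
noun = apply_rule(naive_rule, "He paid by check.")
# ...but the verb, which is spelt "check" in BrEng too, gets mangled.
verb = apply_rule(naive_rule, "The Doctor checked on Sarah Jane.")

print(noun)
print(verb)
```

A regex sees only letters, not parts of speech, so there is no pattern that separates the two senses.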

Words requiring forum decision
Other words are possible for a bot to correct, but the presence of two valid British spellings means that we will require a forum discussion. Precedent for such forum debates over particular words can be found in the threads in the spelling debates category.
 * Judgment. There's no agreement on either side of the Atlantic whether this word should be judgment or judgement.  Oddly, most British spell-checkers will red-flag judgment, even though that's the official spelling in Commonwealth courts.  We have at least one story title preferencing the version with two es — Judgement of the Judoon.  But still, this word will require a special forum discussion to decide which way we want to spell it.
 * Connexion. This British spelling of connection is not universally used in Britain. Connection is correct in Britain, too, so the bot won't try to force connection into a connexion-shaped hole.
 * Smidgen, smidgeon, smidgin. They're all valid spellings for the same word, on both sides of the Atlantic.  The only way the bot could be useful is if we had a forum discussion to settle on one of the three spellings.
 * Yogurt passes most British spell-checkers today, but so does yoghurt. We'll leave both well alone until a forum discussion decides the matter.
 * Almanac is universally the way it's spelt in American English, and increasingly the way Britons spell it, too. Still, some old-timers will go for almanack. Until a forum discussion decides otherwise, the bot won't enforce either spelling.
 * Gasses/gases. Both spellings are correct on both sides of the Atlantic.  The bot won't correct for either until a forum discussion settles on a particular spelling.
 * Programme/program. Both spellings pass British spell-checkers, even though there's a perception that "programme" is British and "program" is American.  Until there's a community decision on spelling, the bot won't touch either spelling.
 * Griffin/Gryphon. Both pass British spell-checkers, so it'll take a forum discussion to decide which way we want to go.
 * Inflexion is the way the British have historically spelt inflection, but modern British spell-checkers pass both spellings. So, the bot won't enforce either without a forum decision to the contrary.
 * Instal/install. Modern British spell-checkers are cool with both, so the bot is, too. But the proper British spelling is instalment, not installment.
 * Mediaeval. Ironically this spelling is now considered archaic and is most often seen in British academic writing.  Medieval, the American spelling, passes British spell checks, too.  So, in the absence of a forum decision to the contrary, the bot won't try to correct either spelling.
 * Praesidium/presidium/presidiums/presidia. Praesidium is the archaic British spelling of presidium. It no longer passes default settings on British spell-checkers, so the bot will correct to presidium, which is also the American spelling. Note that the plural of the word is more confused.  Both presidiums and presidia pass British spell-checkers.  So the case here is complicated.  The singular form of the noun will be corrected to presidium.  The plural will be corrected from praesidiums to presidiums.  But presidia will go uncorrected, unless a forum discussion decides on one or the other plural form.
 * Pizzaz/pizzazz. Both spellings pass modern British spell checkers, although historically the three-z version was British and the four-z version was American. A forum discussion will be required before the bot corrects for either.
 * Siphon/syphon. Both pass most modern British spell-checkers (and American spell-checkers, for that matter).
 * Ton/tonne. Both pass British spell-checkers, so the bot won't touch either until forum decision to the contrary.
 * Tranquility/tranquillity. Both pass British spell-checkers, though it's unclear whether this is because the simple noun is actively spelled both ways in Britain, or if it's because of Tranquility Base (see below).
 * Notwithstanding the general fact that the bot will not correct for either chimera or chimaera (see above), there probably should be a forum discussion on the proper spelling of Space Station Chimera, since we're technically inventing that spelling. The novelisation doesn't actually offer us a spelling, so it could equally be Space Station Chimaera.

Valid British spellings actively corrected
Thanks to the ubiquity of American spellings in pop culture, there are a few cases where valid if archaic British spellings can be corrected to standard American spellings without the need for a forum decision. In such cases, modern British usage hews closely to the American, and clearly argues against more archaic forms.
 * Primaeval. Due to the presence of the modern ITV/BBCA television series with which many Doctor Who fans will be familiar, as well as BFA: Primeval, the unambiguously American spelling "wins" the contest.  Primaeval will be actively corrected to primeval.  This shouldn't ruffle too many feathers, since modern British spell-checkers fail primaeval.
 * Tranquillize/tranquillise/tranquilise/tranquilize. There are four different ways to spell this one damned word (and all words deriving from it).  Ridiculous.  The -ll versions are both okay in BrEng; the -l versions are both okay in AmEng.  However, only one spelling passes modern British spell-checkers.  Therefore tranquillise shall be deemed correct, and the bot will correct the other three spellings.
 * Tranquility Base. The bot will actively correct Tranquillity Base to the IAU-standard Tranquility Base.

Confusing words
Some words are homonyms, and look like they're archaic forms of other words, but in fact are totally different words. Other words behave one way as a root, but are spelt differently once suffixes are added. The bot will therefore correct in an unexpected way, which is why those ways are explained here.
 * Philtre is not the British spelling of filter, but of the American philter. It's a noun meaning "love potion", not a verb meaning "to remove impurities".
 * Pouffe is not an archaic British spelling of poof, but a current spelling for what Americans would call a pouf — that is, a nice, thick cushion you can sit on. Thus, the bot will correct pouf to pouffe.  Not that we actually expect pouf to ever be used in a sentence on this wiki — except on this very page, which the bot doesn't patrol.
 * Groyne is a British spelling of groin only when it's not. It doesn't refer to that region of the human anatomy between the legs.  That's groin on both sides of the Atlantic.  Groyne is, instead, a civil engineering term, referring to a construction that controls erosion.  Technically, it's known as a groin in the US, but almost no one calls it that.  It's more commonly known as a breakwater, bulwark or seawall.  Frankly, groynes are so often called by these more specific names, even in Britain, that groyne doesn't pass modern British spell-checkers.  It's probably never been used at any time in the DWU, but still it's correct to say something like: "All seawalls are groynes, but not all groynes are seawalls."  The bot won't correct away from it, but because groin is BrEng correct in its anatomical sense, it won't correct groin to groyne, either.
 * Vapour/vaporise. Yeah, this one's a beauty.  As a plain noun, Americans always spell this one vapor, while Britons always go for vapour.  That's pretty much the definitional British/American spelling difference.  What's weird is what happens when you turn the noun into a verb by adding -ize, or, as the British would have it, -ise.  Suddenly the cosmetic u is gone.  So, the bot will correct vapor to vapour, but vaporize only to vaporise.  Vapourise is incorrect on both sides of the Atlantic.
 * Odour/Deodorise. This pair works the same way as vapour/vaporise. Deodourise is wrong; deodorise is right.
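
The vapour/vaporise split can be tried out with a short Python sketch. The three pairs below are taken from the vapor entries in the code section of this page; the function name correct and the sample sentence are made up for illustration.

```python
import re

# Three of the vapor rules from the code section of this page.
vapour_rules = [
    (r'([Vv])apor( +)', r'\1apour\2'),      # plain noun followed by a space
    (r'([Vv])apors', r'\1apours'),          # plural noun
    (r'([Vv])aporiz(.?)', r'\1aporis\2'),   # verb forms keep "vapor", no "u"
]


def correct(text):
    """Run every rule over the text, in list order."""
    for pattern, replacement in vapour_rules:
        text = re.sub(pattern, replacement, text)
    return text


sample = "The vapor rose as the Daleks vaporized the wall."
corrected = correct(sample)
print(corrected)
```

The noun rule insists on a following space, which is what keeps it from touching vaporized before the verb rule gets to it.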

How to read the code
The code works by telling the bot to look for the word described before the comma. Then it replaces it with the word after the comma. A most basic expression would be:
 * (u'color',u'colour')

This looks for the American "color", then replaces it with the British "colour".
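
As a rough illustration, the same pair can be fed to Python's re.sub, which is a reasonable stand-in for what the bot does with each pair. The sample sentence and variable names are invented for this sketch.

```python
import re

# The basic pair from above: (what to look for, what to put in its place).
pair = (u'color', u'colour')

text = "The color of the TARDIS is blue."
corrected = re.sub(pair[0], pair[1], text)
print(corrected)

# Running the pair again changes nothing, because "colour" does not
# contain the letters c-o-l-o-r in a row.
second_pass = re.sub(pair[0], pair[1], corrected)
```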

Because typing every permutation of a word, including all words that share the same root and capitalised variants, would be very time-consuming, most of the code won't work in such a simplistic way. Most of it uses a "regular expression" — or regex — to find a lot of hits with just one line. Here's an explanation of the regex used in this code:
 * The expression ([Cc]) means "look for either capitalised or lowercase versions of the letter C"
 * (.?) means, "You, Mr. Fancy Computer bot thing, might find one more character to the right of this point. Grab it if it's there." Any letters beyond that one character aren't matched at all, so they stay exactly where they are.
 * \1 means, "take whatever is in the first parentheses and put it here"
 * \2 means, "take whatever is in the second parentheses and put it here"
 * \3 means, "take whatever is in the third parentheses and put it here"

Thus, if we have the expression,
 * (r'([Cc])apitaliz(.?)', r'\1apitalis\2')

It means, roughly,
 * Look for all words, beginning with either a capital or lowercase C, which are followed by the letters "apitaliz" + the next character, if there is one. Then, keep the form of the letter c that you find, stick on "apitalis", and add back in the character you originally found after the "z". Whatever follows that character is left untouched.

In other words, find Capitaliz-, keep the C capitalised, switch the z to an s, then stick on "-e", "-ing", "-ed", or "-ation", as appropriate.
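
Here is a small Python sketch of that rule in action, using re.sub as a stand-in for the bot. The word list is invented for illustration; the rule is the capitalis pair as it appears in the code section of this page.

```python
import re

# The rule from the code section: keep the C/c, swap "z" for "s".
rule = (r'([Cc])apitaliz(.?)', r'\1apitalis\2')

words = ["Capitalize", "capitalized", "capitalizing", "Capitalization"]
corrected = [re.sub(rule[0], rule[1], word) for word in words]
print(corrected)

# What the two sets of parentheses actually capture for "capitalized":
# the first letter, and the single character after the "z".
captured = re.search(rule[0], "capitalized").groups()
print(captured)
```

The trailing "d" of capitalized is never captured. It survives simply because the replacement only overwrites the part of the word that was matched.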

Correcting all related words at once
Now let's take a look at arguably the most complicated coding here. What if I wanted to change every word that had favor as a root? How could I take care of words that had both a prefix and a suffix, like disfavorable? Putting together everything we've learned so far, it would be:
 * (r'(.?)([Ff])avor(.?)',r'\1\2avour\3')

The leading (.?) means check to see if there's a prefix. The ([Ff]) switch checks for capitalisation of the root letter f. The (.?) at the end checks for suffixes. Now we have three parentheses instead of just two. So \1 means the prefix, \2 puts the letter f in with proper capitalisation, and \3 adds any suffixes.

This one statement will therefore switch over: favor, favors, favored, disfavor, disfavored, unfavorable, favoring, disfavoring, favorable, and almost certainly a few more.
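
The favor rule can be checked the same way. This is a sketch using Python's re.sub, with a word list chosen for illustration.

```python
import re

# The three-group rule from above: prefix character, F/f, suffix character.
rule = (r'(.?)([Ff])avor(.?)', r'\1\2avour\3')

words = ["favor", "Favored", "disfavor", "unfavorable", "favoring"]
corrected = [re.sub(rule[0], rule[1], word) for word in words]
print(corrected)

# For "disfavorable" the groups hold one character on each side of the root.
captured = re.search(rule[0], "disfavorable").groups()
print(captured)
```

Only the "s" of "dis" and the "a" of "able" are captured, but the untouched letters on either side stay in place, so the whole word still comes out right.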

When correcting to British leaves an American spelling around
A few words, mostly those which have -log in them, retain the American spelling even after changing to the British. For instance:


 * AmEng: dialog → BrEng: dialogue, but dialogue still contains dialog

This means that the next time the bot is run, it will find dialog again and attempt to replace it. After several passes, you'll end up with something like dialogueueueueue, which is obviously not desirable. Thus, we must find a way to limit the search to only the cases of dialog+space or dialog+punctuation mark. Here's how we do it:
 * r'dialog(\.|\;|\:| |\!|\,|\?+)'

The pipes (|) act as a switch. They say, look for this character | that character | or the other character. The plus sign at the very end says "at least one time", although as written it applies only to the question mark immediately before it. And the backslashes (\) escape the punctuation marks from their usual special meanings.

Altogether then, what this statement says is, "Look for the word dialog followed by either a period, a semi-colon, a colon, a space, an exclamation mark, a comma, or at least one question mark." It will therefore find only:
 * the speed of his dialog was rapid
 * how could he have forgotten his dialog?
 * dialog: the bane of the actor
 * he had far too many lines of dialog!
 * They had a fruitful dialog; however, the humans would soon kill the Silurians.
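
The difference between the two approaches shows up after a few passes. In this Python sketch the search pattern is the one given above; the replacement strings and the helper run_passes are assumptions made for illustration, since only the search half is shown on this page.

```python
import re

# Unguarded rule: matches "dialog" anywhere, including inside "dialogue".
naive = (r'dialog', r'dialogue')

# Guarded rule: the search pattern from above. The replacement is an
# assumption: it puts back whatever space or punctuation was captured.
guarded = (r'dialog(\.|\;|\:| |\!|\,|\?+)', r'dialogue\1')


def run_passes(rule, text, passes):
    """Apply the same rule repeatedly, as successive bot runs would."""
    for _ in range(passes):
        text = re.sub(rule[0], rule[1], text)
    return text


line = "how could he have forgotten his dialog?"
naive_result = run_passes(naive, line, 3)
guarded_result = run_passes(guarded, line, 3)

print(naive_result)
print(guarded_result)
```

After the first pass the guarded pattern no longer matches, because dialog is now followed by a "u" rather than a space or punctuation mark.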

Cases where regex fails
Not every word on our list has been switched using regular expressions. Sometimes it's easier just to type up a switch of literal characters, as when a word serves as the root of no other words.

The code
The following code will change over time, as more words are added. The final word in the English language that has a British/American difference is yodelling. Once you see that word on this list, you'll know the bot is fully programmed.

fixes['spelling'] = {
    'regex': True,
    'recursive': True,
    'msg': {
        'en': u'Enforcing spelling policy.'
    },
    'replacements': [
        #AAAA#
        (r'([Aa])ccessoriz(.?)', r'\1ccessoris\2'),
        (r'([Aa])cclimatiz(.?)', r'\1cclimatis\2'),
        (r'([Aa])ccouterments', r'\1ccoutrements'),
        (r'( +)eon( +)', r'\1aeon\2'),
        (r'( +)eons( +)', r'\1aeons\2'),
        (r'([Aa])erogram( +)', r'\1erogramme\2'),
        (r'([Aa])erograms', r'\1erogrammes'),
        (r'( +)esthete(.?)( +)', r'\1aesthete\2\3'),
        (r'( +)esthetic(.?)( +)', r'\1aesthetic\2\3'),
        (r'( +)etiology', r'\1aetiology'),
        (r'( +)aging', r'\1ageing'),
        (r'([Dd])e(.?)aging', r'\1e\2ageing'),
        (r'([Aa])ggrandizement', r'\1ggrandisement'),
        (r'([Aa])goniz(.?)', r'\1gonis\2'),
        (r'([Aa])luminum', r'\1luminium'),
        (r'([Aa])mortize( +)', r'\1mortise\2'),
        (r'([Aa])mortiz(.?)', r'\1mortis\2'),
        (r'(.?)([Tt])heater(.?)', r'\1\2heatre\3'),
        (r'([Aa])nemi(.?)', r'\1naemi\2'),
        (r'([Aa])nesthesia', r'\1naesthesia'),
        (r'([Aa])nestheti(.?)', r'\1naestheti\2'),
        (r'([Aa])nalog( +)', r'\1nalogue\2'),
        (r'([Aa])nalogs', r'\1nalogues'),
        (r'(.?)([Aa])nalyze( +)', r'\1\2nalyse\3'),
        (r'(.?)([Aa])nalyz(.?)', r'\1\2nalys\3'),
        (r'([Aa])ngliciz(.?)', r'\1nglicis\2'),
        (r'([Aa])nnualized', r'\1nnualised'),
        (r'([Aa])ntagoniz(.?)', r'\1ntagonis\2'),
        (r'([Aa])pologiz(.?)', r'\1pologis\2'),
        (u'appall', u'appal'),
        (u'appalls', u'appals'),
        (r'([Aa])ppetiz(.?)', r'\1ppetis\2'),
        (r'([Aa])rbor(.?)', r'\1rbour\2'),
        (r'([Aa])rcheolog(.?)', r'\1rchaeolog\2'),
        (u'ardor', u'ardour'),
        (r'([Aa])rmor(.?)', r'\1rmour\2'),
        (r'([Aa])rtifact(.?)', r'\1rtefact\2'),
        (r'(.?)([Aa])uthoriz(.?)', r'\1\2uthoris\3'),
        (r'([Aa])x( +)', r'\1xe\2'),
        #BBBB#
        (r'(.?)([Pp])edaled', r'\1\2edalled'),
        (r'(.?)([Pp])edaling', r'\1\2edalling'),
        (r'([Bb])anister(.?)', r'\1annister\2'),
        (r'([Bb])aptiz(.?)', r'\1aptis\2'),
        (r'([Bb])astardiz(.?)', r'\1astardis\2'),
        (r'([Bb])attleax( +)', r'\1attleaxe\2'),
        (r'([Bb])alk(.?)', r'\1aulk\2'),
        (r'([Bb])edeviled', r'\1edevilled'),
        (r'([Bb])edeviling', r'\1edevilling'),
        (r'(.?)([Bb])ehavior(.?)', r'\1\2ehaviour\3'),
        (r'([Bb])ehoove(.?)', r'\1ehove\2'),
        (r'([Bb])ejeweled', r'\1ejewelled'),
        (r'(.?)([Ll])abor(.?)', r'\1\2abour\3'),
        (r'([Bb])eveled', r'\1evelled'),
        (r'([Bb])evies', r'\1evvies'),
        (r'([Bb])evy', r'\1evvy'),
        (r'([Bb])iased', r'\1iassed'),
        (r'([Bb])iasing', r'\1iassing'),
        (r'([Bb])inging', r'\1ingeing'),
        (r'([Bb])ougainvillea(.?)', r'\1ougainvillaea\2'),
        (r'([Bb])owdleriz(.?)', r'\1owdleris\2'),
        (r'([Bb])reathalyz(.?)', r'\1reathalys\2'),
        (r'([Bb])rutaliz(.?)', r'\1rutalis\2'),
        (r'(.?)([Bb])usses', r'\1\2uses'),
        (r'([Bb])ussing', r'\1using'),
        #CCCC#
        (r'([Cc])esarean(.?)', r'\1aesarean\2'),
        (r'([Cc])aliber(.?)', r'\1alibre\2'),
        (r'([Cc])aliper(.?)', r'\1alliper\2'),
        (r'([Cc])alisthenics', r'\1allisthenics'),
        (r'([Cc])analiz(.?)', r'\1analis\2'),
        (r'([Cc])ancelation', r'\1ancellation'),
        (r'([Cc])ancelations', r'\1ancellations'),
        (r'([Cc])anceled', r'\1ancelled'),
        (r'([Cc])anceling', r'\1ancelling'),
        (r'([Cc])andor', r'\1andour'),
        (r'([Cc])annibaliz(.?)', r'\1annibalis\2'),
        (r'([Cc])anibaliz(.?)', r'\1annibalis\2'),
        (r'([Cc])anibalis(.?)', r'\1annibalis\2'),
        (r'([Cc])anoniz(.?)', r'\1anonis\2'),
        (r'([Cc])apitaliz(.?)', r'\1apitalis\2'),
        (r'([Cc])arameliz(.?)', r'\1aramelis\2'),
        (r'([Cc])arboniz(.?)', r'\1arbonis\2'),
        (r'([Cc])aroled', r'\1arolled'),
        (r'([Cc])aroling', r'\1arolling'),
        (r'([Cc])atalog( +)', r'\1atalogue\2'),
        (r'([Cc])atalogs( +)', r'\1atalogues\2'),
        (r'([Cc])ataloged', r'\1atalogued'),
        (r'([Cc])ataloging', r'\1ataloguing'),
        (r'([Cc])atalyz(.?)', r'\1atalys\2'),
        (r'([Cc])ategoriz(.?)', r'\1ategoris\2'),
        (r'([Cc])auteriz(.?)', r'\1auteris\2'),
        (r'([Cc])aviled', r'\1avilled'),
        (r'([Cc])aviling', r'\1avilling'),
        (r'(.?)([Gg])ram( +)', r'\1\2ramme\3'),
        (r'(.?)([Gg])rams', r'\1\2rammes'),
        (r'(.?)([Ll])iter(.?)', r'\1\2itre\3'),
        (r'(.?)([Mm])eter(.?)', r'\1\2etre\3'),
        (r'([Cc])entraliz(.?)', r'\1entralis\2'),
        (r'(.?)([Cc])enter(.?)', r'\1\2entre\3'),
        (r'([Cc])hanneled', r'\1hannelled'),
        (r'([Cc])hanneling', r'\1hannelling'),
        (r'([Cc])haracteriz(.?)', r'\1haracteris\2'),
        (r'([Cc])heckbook(.?)', r'\1hequebook\2'),
        (r'([Cc])hili', r'\1hilli'),
        (r'([Cc])himera(.?)', r'\1himaera\2'),
        (r'([Cc])hiseled', r'\1hiselled'),
        (r'([Cc])hiseling', r'\1hiselling'),
        (r'([Cc])irculariz(.?)', r'\1ircularis\2'),
        (r'(.?)([Cc])iviliz(.?)', r'\1\2ivilis\3'),
        (r'([Cc])lamor(.?)', r'\1lamour\2'),
        (r'([Cc])langor', r'\1langour'),
        (r'([Cc])larinetist', r'\1larinettist'),
        (r'([Cc])ollectiviz(.?)', r'\1ollectivis\2'),
        (r'([Cc])oloniz(.?)', r'\1olonis\2'),
        (r'(.?)([Cc])olor(.?)', r'\1\2olour\3'),
        (r'([Cc])ommercializ(.?)', r'\1ommercialis\2'),
        (r'([Cc])ompartmentaliz(.?)', r'\1ompartmentalis\2'),
        (r'([Cc])omputeriz(.?)', r'\1omputeris\2'),
        (r'([Cc])onceptualiz(.?)', r'\1onceptualis\2'),
        (r'([Cc])ontextualiz(.?)', r'\1ontextualis\2'),
        (r'([Cc])oz(.?)', r'\1os\2'),
        (r'([Cc])ouncilor(.?)', r'\1ouncillor\2'),
        (r'([Cc])ounselor(.?)', r'\1ounsellor\2'),
        (r'([Cc])ounseling', r'\1ounselling'),
        (r'([Cc])ounseled', r'\1ounselled'),
        (r'([Cc])renelated', r'\1renellated'),
        (r'([Cc])riminaliz(.?)', r'\1riminalis\2'),
        (r'([Cc])riticiz(.?)', r'\1riticis\2'),
        (r'([Cc])rueler', r'\1rueller'),
        (r'([Cc])ruelest', r'\1ruellest'),
        (r'([Cc])rystalliz(.?)', r'\1rystallis\2'),
        (r'([Cc])udgeled', r'\1udgelled'),
        (r'([Cc])udgeling', r'\1udgelling'),
        (r'([Cc])ustomiz(.?)', r'\1ustomis\2'),
        (r'([Cc])ipher(.?)', r'\1ypher\2'),
        #DDDD#
        (r'([Dd])ecentraliz(.?)', r'\1ecentralis\2'),
        (r'([Dd])ecriminaliz(.?)', r'\1ecriminalis\2'),
        (r'([Dd])efense(.?)', r'\1efence\2'),
        (r'(.?)([Hh])umaniz(.?)', r'\1\2umanis\3'),
        (r'(.?)([Dd])emeanor', r'\1\2emeanour'),
        (r'(.?)([Mm])ilitariz(.?)', r'\1\2ilitaris\3'),
        (r'(.?)([Mm])obiliz(.?)', r'\1\2obilis\3'),
        (r'([Dd])emocratiz(.?)', r'\1emocratis\2'),
        (r'([Dd])emoniz(.?)', r'\1emonis\2'),
        (r'(.?)([Mm])oraliz(.?)', r'\1\2oralis\3'),
        (r'(.?)([Nn])ationaliz(.?)', r'\1\2ationalis\3'),
        (r'([Dd])eodoriz(.?)', r'\1eodoris\2'),
        (r'(.?)([Pp])ersonaliz(.?)', r'\1\2ersonalis\3'),
        (r'([Dd])eputiz(.?)', r'\1eputis\2'),
        (r'(.?)([Ss])ensitiz(.?)', r'\1\2ensitis\3'),
        (r'(.?)([Ss])tabiliz(.?)', r'\1\2tabilis\3'),
        (r'([Dd])ialed', r'\1ialled'),
        (r'([Dd])ialing', r'\1ialling'),
        (r'([Dd])ialog( +)', r'\1ialogue\2'),
        (r'([Dd])ialogs( +)', r'\1ialogues\2'),
        (r'([Dd])iarrhea', r'\1iarrhoea'),
        (r'([Dd])igitiz(.?)', r'\1igitis\2'),
        (r'([Dd])isemboweled', r'\1isembowelled'),
        (r'([Dd])isemboweling', r'\1isembowelling'),
        (r'(.?)([Ff])avor(.?)', r'\1\2avour\3'),
        (r'([Dd])isheveled', r'\1ishevelled'),
        (r'(.?)([Hh])onor(.?)', r'\1\2onour\3'),
        (r'(.?)([Oo])rganization(.?)', r'\1\2rganisation\3'),
        (r'([Dd])istill( +)', r'\1istil\2'), #BrEng is distil, AmEng is distill#
        (r'([Dd])istills', r'\1istils'),
        (r'([Dd])ramatiz(.?)', r'\1ramatis\2'),
        (r'([Dd])rafts(.?)', r'\1raughts\2'),
        (r'([Dd])rafty', r'\1raughty'),
        (r'([Dd])rafti(.?)', r'\1raughti\2'),
        (r'([Dd])riveled', r'\1rivelled'),
        (r'([Dd])riveling', r'\1rivelling'),
        (r'([Dd])ueled', r'\1uelled'),
        (r'([Dd])ueling', r'\1uelling'),
        #EEEE#
        (r'([Ee])conomiz(.?)', r'\1conomis\2'),
        (r'( +)edema', r'\1oedema'),
        (r'( +)Edema', r'\1Oedema'),
        (r'([Ee])ditorializ(.?)', r'\1ditorialis\2'),
        (r'([Ee])mpathiz(.?)', r'\1mpathis\2'),
        (r'(.?)([Ee])mphasiz(.?)', r'\1\2mphasis\3'),
        (r'([Ee])nameled', r'\1namelled'),
        (r'([Ee])nameling', r'\1namelling'),
        (r'([Ee])namor(.?)', r'\1namour\2'),
        (r'([Ee])ncyclopedi(.?)', r'\1ncyclopaedi\2'),
        (r'([Ee])ndeavor(.?)', r'\1ndeavour\2'),
        (r'(.?)([Ee])nergiz(.?)', r'\1\2nergis\3'),
        (r'([Ee])nroll(.?)', r'\1nrol\2'),
        (r'([Ee])nthrall(.?)', r'\1nthral\2'),
        (r'([Ee])paulet( +)', r'\1paulette\2'),
        (r'([Ee])paulets', r'\1paulettes'),
        (r'([Ee])pilog( +)', r'\1pilogue\2'),
        (r'([Ee])pilogs', r'\1pilogues'),
        (r'([Ee])pitomiz(.?)', r'\1pitomis\2'),
        (r'([Ee])qualiz(.?)', r'\1qualis\2'),
        (r'([Ee])ulogiz(.?)', r'\1ulogis\2'),
        (r'([Ee])vangeliz(.?)', r'\1vangelis\2'),
        (r'([Ee])xorciz(.?)', r'\1xorcis\2'),
        (r'(.?)([Tt])emporiz(.?)', r'\1\2emporis\3'),
        (r'([Ee])xternaliz(.?)', r'\1xternalis\2'),
        #FFFF#
        (r'([Ff])actoriz(.?)', r'\1actoris\2'),
        (r'([Ff])eces', r'\1aeces'),
        (r'([Ff])ecal', r'\1aecal'),
        (r'([Ff])amiliariz(.?)', r'\1amiliaris\2'),
        (r'([Ff])antasiz(.?)', r'\1antasis\2'),
        (r'([Ff])eminiz(.?)', r'\1eminis\2'),
        (r'([Ff])ertiliz(.?)', r'\1ertilis\2'),
        (r'([Ff])ervor', r'\1ervour'),
        (r'([Ff])iber(.?)', r'\1ibre\2'),
        (r'([Ff])ictionaliz(.?)', r'\1ictionalis\2'),
        (r'([Ff])ilet(.?)', r'\1illet\2'),
        (r'([Ff])inaliz(.?)', r'\1inalis\2'),
        (r'(.?)([Ff])lavor(.?)', r'\1\2lavour\3'),
        (r'([Ff])etal', r'\1oetal'),
        (r'([Ff])etus(.?)', r'\1oetus\2'),
        (r'([Ff])etid', r'\1oetid'),
        (r'([Ff])ormaliz(.?)', r'\1ormalis\2'),
        (r'([Ff])ossiliz(.?)', r'\1ossilis\2'),
        (r'([Ff])raterniz(.?)', r'\1raternis\2'),
        (r'([Ff])ulfill( +)', r'\1ulfil\2'),
        (r'([Ff])ulfillment', r'\1ulfilment'),
        (r'([Ff])unneled', r'\1unnelled'),
        (r'([Ff])unneling', r'\1unnelling'),
        #GGGG#
        (r'([Gg])alvaniz(.?)', r'\1alvanis\2'),
        (r'([Gg])amboled', r'\1ambolled'),
        (r'([Gg])amboling', r'\1ambolling'),
        (r'([Gg])eneraliz(.?)', r'\1eneralis\2'),
        (r'([Gg])hettoiz(.?)', r'\1hettois\2'),
        (r'([Gg])lamoriz(.?)', r'\1lamoris\2'),
        (r'([Gg])lamor( +)', r'\1lamour\2'),
        (r'([Gg])lobaliz(.?)', r'\1lobalis\2'),
        (r'([Gg])luing', r'\1lueing'),
        (r'([Gg])oiter(.?)', r'\1oitre\2'),
        (r'([Gg])onorrhea', r'\1onorrhoea'),
        (r'([Gg])raveled', r'\1ravelled'),
        (r'([Gg])ray( +)', r'\1rey\2'),
        (r'([Gg])ray(.?)', r'\1rey\2'),
        (r'([Gg])roveled', r'\1rovelled'),
        (r'([Gg])roveling', r'\1rovelling'),
        (r'([Gg])rueling(.?)', r'\1ruelling\2'),
        (r'([Gg])ynecol(.?)', r'\1ynaecol\2'),
        #HHHH#
        (r'([Hh])ematolog(.?)', r'\1aematolog\2'),
        (r'([Hh])emo(.?)', r'\1aemo\2'),
        (r'([Hh])arbor(.?)', r'\1arbour\2'),
        (r'([Hh])armoniz(.?)', r'\1armonis\2'),
        (r'([Hh])omeopath(.?)', r'\1omoeopath\2'),
        (r'([Hh])omogeniz(.?)', r'\1omogenis\2'),
        (r'([Hh])ospitaliz(.?)', r'\1ospitalis\2'),
        (r'([Hh])umor(.?)', r'\1umour\2'),
        (r'([Hh])ybridiz(.?)', r'\1ybridis\2'),
        (r'([Hh])ypnotiz(.?)', r'\1ypnotis\2'),
        (r'([Hh])ypothesiz(.?)', r'\1ypothesis\2'),
        #IIII#
        (r'([Ii])dealiz(.?)', r'\1dealis\2'),
        (r'([Ii])doliz(.?)', r'\1dolis\2'),
        (r'(.?)([Mm])obiliz(.?)', r'\1\2obilis\3'),
        (r'([Ii])mmortaliz(.?)', r'\1mmortalis\2'),
        (r'([Ii])mmuniz(.?)', r'\1mmunis\2'),
        (r'(.?)([Pp])aneled', r'\1\2anelled'),
        (r'(.?)([Pp])aneling', r'\1\2anelling'),
        (r'([Ii])mperiled', r'\1mperilled'),
        (r'([Ii])mperiling', r'\1mperilling'),
        (r'([Ii])ndividualiz(.?)', r'\1ndividualis\2'),
        (r'([Ii])ndustrializ(.?)', r'\1ndustrialis\2'),
        (r'([Ii])nstill(.?)', r'\1nstil\2'),
        (r'([Ii])nitialed', r'\1nitialled'),
        (r'([Ii])nitialing', r'\1nitialling'),
        (r'([Ii])nstallment(.?)', r'\1nstalment\2'),
        (r'([Ii])nstitutionaliz(.?)', r'\1nstitutionalis\2'),
        (r'([Ii])ntellectualiz(.?)', r'\1ntellectualis\2'),
        (r'(.?)([Nn])ationaliz(.?)', r'\1\2ationalis\3'),
        (r'([Ii])nternaliz(.?)', r'\1nternalis\2'),
        (r'([Ii])oniz(.?)', r'\1onis\2'),
        (r'([Ii])taliciz(.?)', r'\1talicis\2'),
        (r'([Ii])temiz(.?)', r'\1temis\2'),
        #JJJJ#
        (r'([Jj])eopardiz(.?)', r'\1eopardis\2'),
        (r'([Jj])eweler(.?)', r'\1eweller\2'),
        #KKKK#
        #None known#
        #LLLL#
        (r'([Ll])abeled', r'\1abelled'),
        (r'([Ll])abeling', r'\1abelling'),
        (r'([Ll])ackluster', r'\1acklustre'),
        (r'(.?)([Ll])egaliz(.?)', r'\1\2egalis\3'),
        (r'(.?)([Ll])egitimiz(.?)', r'\1\2egitimis\3'),
        (r'([Ll])eukemia', r'\1eukaemia'),
        (r'(.?)([Ll])evele(.?)', r'\1\2evelle\3'),
        (r'(.?)([Ll])eveling', r'\1\2evelling'),
        (r'([Ll])ibeled', r'\1ibelled'),
        (r'([Ll])ibelous', r'\1ibellous'),
        (r'([Ll])ibeling', r'\1ibelling'),
        (r'([Ll])iberaliz(.?)', r'\1iberalis\2'),
        (r'([Ll])ioniz(.?)', r'\1ionis\2'),
        (r'([Ll])iquidiz(.?)', r'\1iquidis\2'),
        (r'([Ll])ocaliz(.?)', r'\1ocalis\2'),
        (r'([Ll])ouver(.?)', r'\1ouvre\2'),
        (r'([Ll])uster', r'\1ustre'),
        #MMMM#
        (r'(.?)([Mm])agnetiz(.?)', r'\1\2agnetis\3'),
        (r'(.?)([Mm])aneuver(.?)', r'\1\2anoeuvr\3'),
        (r'([Mm])arginaliz(.?)', r'\1arginalis\2'),
        (r'([Mm])arshaled', r'\1arshalled'),
        (r'([Mm])arshaling', r'\1arshalling'),
        (r'([Mm])arveled', r'\1arvelled'),
        (r'([Mm])arveling', r'\1arvelling'),
        (r'([Mm])arvelo(.?)', r'\1arvello\2'),
        (r'(.?)([Mm])aterializ(.?)', r'\1\2aterialis\3'),
        (r'([Mm])aximiz(.?)', r'\1aximis\2'),
        (r'([Mm])eager', r'\1eagre'),
        (r'([Mm])echaniz(.?)', r'\1echanis\2'),
        (r'([Mm])emorializ(.?)', r'\1emorialis\2'),
        (r'([Mm])emoriz(.?)', r'\1emoris\2'),
        (r'([Mm])esmeriz(.?)', r'\1esmeris\2'),
        (r'([Mm])etaboliz(.?)', r'\1etabolis\2'),
        (r'([Mm])iniaturiz(.?)', r'\1iniaturis\2'),
        (r'([Mm])inimiz(.?)', r'\1inimis\2'),
        (r'([Mm])iter(.?)', r'\1itre\2'),
        (r'(.?)([Mm])odele(.?)', r'\1\2odelle\3'),
        (r'(.?)([Mm])odeling', r'\1\2odelling'),
        (r'([Mm])oderniz(.?)', r'\1odernis\2'),
        (r'([Mm])oisturiz(.?)', r'\1oisturis\2'),
        (r'([Mm])onolog( +)', r'\1onologue\2'),
        (r'([Mm])onologs', r'\1onologues'),
        (r'([Mm])onopoliz(.?)', r'\1onopolis\2'),
        (r'(.?)([Mm])old(.?)', r'\1\2ould\3'),
        (r'([Mm])olt(.?)', r'\1oult\2'),
        (r'([Mm])ustache(.?)', r'\1oustache\2'),
        #NNNN#
        (r'([Nn])aturaliz(.?)', r'\1aturalis\2'),
        (r'([Nn])eighbor(.?)', r'\1eighbour\2'),
        (r'([Nn])eutraliz(.?)', r'\1eutralis\2'),
        (r'([Nn])ormaliz(.?)', r'\1ormalis\2'),
        #OOOO#
        (r'([Oo])dor( +)', r'\1dour\2'),
        (r'([Oo])dors', r'\1dours'),
        (r'( +)esophagus(.?)', r'\1oesophagus\2'),
        (r'( +)Esophagus(.?)', r'\1Oesophagus\2'),
        (r'( +)estrogen', r'\1oestrogen'),
        (r'( +)Estrogen', r'\1Oestrogen'),
        (r'([Oo])ffense(.?)', r'\1ffence\2'),
        (r'([Oo])melet( +)', r'\1melette\2'),
        (r'([Oo])melets', r'\1melettes'),
        (r'(.?)([Oo])ptimiz(.?)', r'\1\2ptimis\3'),
        (r'(.?)([Oo])rganiz(.?)', r'\1\2rganis\3'),
        (r'([Oo])rthopedic(.?)', r'\1rthopaedic\2'),
        (r'([Oo])straciz(.?)', r'\1stracis\2'),
        (r'([Oo])xidiz(.?)', r'\1xidis\2'),
        #PPPP#
        (r'([Pp])ederast(.?)', r'\1aederast\2'),
        (r'([Pp])ediatric(.?)', r'\1aediatric\2'),
        (r'([Pp])edo( +)', r'\1aedo\2'),
        (r'([Pp])edophil(.?)', r'\1aedophil\2'),
        (r'([Pp])aleo(.?)', r'\1alaeo\2'),
        (r'([Pp])anelist(.?)', r'\1anellist\2'),
        (r'([Pp])aralyz(.?)', r'\1aralys\2'),
        (r'([Pp])arceled', r'\1arcelled'),
        (r'([Pp])arceling', r'\1arcelling'),
        (r'([Pp])arlor(.?)', r'\1arlour\2'),
        (r'([Pp])articulariz(.?)', r'\1articularis\2'),
        (r'([Pp])assiviz(.?)', r'\1assivis\2'),
        (r'([Pp])asteuriz(.?)', r'\1asteuris\2'),
        (r'([Pp])atroniz(.?)', r'\1atronis\2'),
        (r'([Pp])edestrianiz(.?)', r'\1edestrianis\2'),
        (r'([Pp])enaliz(.?)', r'\1enalis\2'),
        (r'([Pp])enciled', r'\1encilled'),
        (r'([Pp])enciling', r'\1encilling'),
        (r'([Pp])harmacopeia(.?)', r'\1harmacopoeia\2'),
        (r'([Pp])hilosophiz(.?)', r'\1hilosophis\2'),
        (r'([Pp])hilter(.?)', r'\1hiltre\2'),
        (r'([Pp])lagiariz(.?)', r'\1lagiaris\2'),
        (r'([Pp])low( +)', r'\1lough\2'),
        (r'([Pp])low(.?)', r'\1lough\2'),
        (r'(.?)([Pp])olariz(.?)', r'\1\2olaris\3'),
        (r'(.?)([Pp])oliticiz(.?)', r'\1\2oliticis\3'),
        (r'([Pp])opulariz(.?)', r'\1opularis\2'),
        (r'([Pp])ouf( +)', r'\1ouffe\2'),
        (r'([Pp])oufs', r'\1ouffes'),
        (r'([Pp])racticed', r'\1ractised'),
        (r'([Pp])racticing', r'\1ractising'),
        (r'([Pp])raesidium(.?)', r'\1residium\2'),
        (r'(.?)([Pp])ressuriz(.?)', r'\1\2ressuris\3'),
        (r'([Pp])retens(.?)', r'\1retenc\2'),
        (r'([Pp])rimaeval', r'\1rimeval'), #Correcting in favour of American spelling#
        (r'(.?)([Pp])rioritiz(.?)', r'\1\2rioritis\3'),
        (r'(.?)([Pp])rivatiz(.?)', r'\1\2rivatis\3'),
        (r'([Pp])rofessionaliz(.?)', r'\1rofessionalis\2'),
        (r'([Pp])rolog( +)', r'\1rologue\2'),
        (r'([Pp])rologs', r'\1rologues'),
        (r'([Pp])ropagandiz(.?)', r'\1ropagandis\2'),
        (r'([Pp])roselytiz(.?)', r'\1roselytis\2'),
        (r'([Pp])ubliciz(.?)', r'\1ublicis\2'),
        (r'([Pp])ulveriz(.?)', r'\1ulveris\2'),
        (r'([Pp])ummeled', r'\1ummelled'),
        (r'([Pp])ummeling', r'\1ummelling'),
        (r'([Pp])ajama(.?)', r'\1yjama\2'),
        #QQQQ#
        (r'([Qq])uarreled', r'\1uarrelled'),
        (r'([Qq])uarreling', r'\1uarrelling'),
        #RRRR#
        (r'([Rr])adicaliz(.?)', r'\1adicalis\2'),
        (r'([Rr])ancor(.?)', r'\1ancour\2'),
        (r'([Rr])andomiz(.?)', r'\1andomis\2'),
        (r'([Rr])ationaliz(.?)', r'\1ationalis\2'),
        (r'(.?)([Rr])aveled', r'\1\2avelled'),
        (r'(.?)([Rr])aveling', r'\1\2avelling'),
        (r'(.?)([Rr])ealiz(.?)', r'\1\2ealis\3'),
        (r'(.?)([Rr])ecogniz(.?)', r'\1\2ecognis\3'),
        (r'([Rr])econnoiter(.?)', r'\1econnoitre\2'),
        (r'([Rr])efueled', r'\1efuelled'),
        (r'([Rr])efueling', r'\1efuelling'),
        (r'(.?)([Rr])egulariz(.?)', r'\1\2egularis\3'),
        (r'([Rr])evele(.?)', r'\1evelle\2'),
        (r'([Rr])eveling', r'\1evelling'),
        (r'(.?)([Vv])italiz(.?)', r'\1\2italis\3'),
        (r'([Rr])evolutioniz(.?)', r'\1evolutionis\2'),
        (r'([Rr])hapsodiz(.?)', r'\1hapsodis\2'),
        (r'([Rr])igor(.?)', r'\1igour\2'),
        (r'([Rr])itualiz(.?)', r'\1itualis\2'),
        (r'(.?)([Rr])ivaled', r'\1\2ivalled'),
        (r'([Rr])ivaling', r'\1ivalling'),
        (r'([Rr])omanticiz(.?)', r'\1omanticis\2'),
        (r'([Rr])umor(.?)', r'\1umour\2'),
        #SSSS#
        (r'([Ss])aber(.?)', r'\1abre\2'),
        (r'([Ss])altpeter', r'\1altpetre'),
        (r'(.?)([Ss])anitiz(.?)', r'\1\2anitis\3'),
        (r'([Ss])atiriz(.?)', r'\1atiris\2'),
        (r'([Ss])avior(.?)', r'\1aviour\2'),
        (r'(.?)([Ss])avor(.?)', r'\1\2avour\3'),
        (r'([Ss])candaliz(.?)', r'\1candalis\2'),
        (r'([Ss])keptic(.?)', r'\1ceptic\2'),
        (r'([Ss])cepter(.?)', r'\1ceptre\2'),
        (r'([Ss])crutiniz(.?)', r'\1crutinis\2'),
        (r'([Ss])eculariz(.?)', r'\1ecularis\2'),
        (r'([Ss])ensationaliz(.?)', r'\1ensationalis\2'),
        (r'([Ss])entimentaliz(.?)', r'\1entimentalis\2'),
        (r'([Ss])epulcher(.?)', r'\1epulchre\2'),
        (r'([Ss])erializ(.?)', r'\1erialis\2'),
        (r'([Ss])ermoniz(.?)', r'\1ermonis\2'),
        (r'([Ss])hoveled', r'\1hovelled'),
        (r'([Ss])hoveling', r'\1hovelling'),
        (r'([Ss])hriveled', r'\1hrivelled'),
        (r'([Ss])hriveling', r'\1hrivelling'),
        (r'([Ss])ignaliz(.?)', r'\1ignalis\2'),
        (r'([Ss])ignaled', r'\1ignalled'),
        (r'([Ss])ignaling', r'\1ignalling'),
        (r'([Ss])molder(.?)', r'\1moulder\2'),
        (r'([Ss])niveled', r'\1nivelled'),
        (r'([Ss])niveling', r'\1nivelling'),
        (r'([Ss])norkeled', r'\1norkelled'),
        (r'([Ss])norkeling', r'\1norkelling'),
        (r'(.?)([Ss])ocializ(.?)', r'\1\2ocialis\3'),
        (r'([Ss])odomiz(.?)', r'\1odomis\2'),
        (r'(.?)([Ss])olemniz(.?)', r'\1\2olemnis\3'),
        (r'([Ss])omber', r'\1ombre'),
        (r'([Ss])pecializ(.?)', r'\1pecialis\2'),
        (r'([Ss])pecter(.?)', r'\1pectre\2'),
        (r'([Ss])piraled', r'\1piralled'),
        (r'([Ss])piraling', r'\1piralling'),
        (r'([Ss])plendor(.?)', r'\1plendour\2'),
        (r'([Ss])quirreled', r'\1quirrelled'),
        (r'([Ss])quirreling', r'\1quirrelling'),
        (r'(.?)([Ss])tabiliz(.?)', r'\1\2tabilis\3'),
        (r'(.?)([Ss])tandardiz(.?)', r'\1\2tandardis\3'),
        (r'([Ss])tenciled', r'\1tencilled'),
        (r'([Ss])tenciling', r'\1tencilling'),
(r'(.?)([Ss])teriliz(.?)',r'\1\2terilis\3'),
(r'(.?)([Ss])tigmatiz(.?)',r'\1\2tigmatis\3'),
(r'(.?)([Ss])ubsidiz(.?)',r'\1\2ubsidis\3'),
(r'([Ss])uccor(.?)',r'\1uccour\2'),
(r'([Ss])ulfa(.?)',r'\1ulpha\2'),
(r'([Ss])ulfi(.?)',r'\1ulphi\2'),
(r'([Ss])ulfu(.?)',r'\1ulphu\2'),
(r'([Ss])ummariz(.?)',r'\1ummaris\2'),
(r'([Ss])wiveled',r'\1wivelled'),
(r'([Ss])wiveling',r'\1wivelling'),
(r'([Ss])ymboliz(.?)',r'\1ymbolis\2'),
(r'([Ss])ympathiz(.?)',r'\1ympathis\2'),
(r'(.?)([Ss])ynchroniz(.?)',r'\1\2ynchronis\3'),
(r'(.?)([Ss])ynthesiz(.?)',r'\1\2ynthesis\3'),
(r'(.?)([Ss])ystematiz(.?)',r'\1\2ystematis\3'),
#TTTT#
(r'([Tt])antaliz(.?)',r'\1antalis\2'),
(r'([Tt])asseled',r'\1asselled'),
(r'([Tt])enderiz(.?)',r'\1enderis\2'),
(r'([Tt])erroriz(.?)',r'\1erroris\2'),
(r'([Tt])heoriz(.?)',r'\1heoris\2'),
(r'([Tt])oweled',r'\1owelled'),
(r'([Tt])oweling',r'\1owelling'),
(r'([Tt])oxemia',r'\1oxaemia'),
(r'([Tt])ranquiliz(.?)',r'\1ranquillis\2'),
(r'([Tt])ranquilis(.?)',r'\1ranquillis\2'),
(r'([Tt])ranquilliz(.?)',r'\1ranquillis\2'), #correcting archaic BrEng form to modern BrEng#
(r'([Tt])ranquillity ([Bb])ase',r'Tranquility Base'), #correcting to IAU standard#
(r'([Tt])ransistoriz(.?)',r'\1ransistoris\2'),
(r'([Tt])raumatiz(.?)',r'\1raumatis\2'),
(r'([Tt])ravelers',r'\1ravellers'), #other forms under "ravelled" above#
(r'([Tt])ravelog( +)',r'\1ravelogue\2'),
(r'([Tt])ravelogs',r'\1ravelogues'),
(r'([Tt])rivializ(.?)',r'\1rivialis\2'),
(r'([Tt])umor(.?)',r'\1umour\2'),
(r'([Tt])unneled',r'\1unnelled'),
(r'([Tt])unneling',r'\1unnelling'),
(r'([Tt])yranniz(.?)',r'\1yrannis\2'),
#UUUU#
(r'([Uu])nioniz(.?)',r'\1nionis\2'),
(r'([Uu])ntrammeled',r'\1ntrammelled'),
(r'(.?)([Uu])rbaniz(.?)',r'\1\2rbanis\3'),
(r'(.?)([Uu])tiliz(.?)',r'\1\2tilis\3'),
#VVVV#
(r'([Vv])alor',r'\1alour'),
(r'([Vv])andaliz(.?)',r'\1andalis\2'),
(r'(.?)([Vv])aporiz(.?)',r'\1\2aporis\3'),
(r'([Vv])apor( +)',r'\1apour\2'),
(r'([Vv])apors',r'\1apours'),
(r'([Vv])aporiz(.?)',r'\1aporis\2'), #Weirdly, words that have vapour as a root lose the cosmetic 'u'#
(r'([Vv])erbaliz(.?)',r'\1erbalis\2'),
(r'([Vv])ictimiz(.?)',r'\1ictimis\2'),
(r'([Vv])igor',r'\1igour'),
(r'([Vv])isualiz(.?)',r'\1isualis\2'),
(r'([Vv])ocaliz(.?)',r'\1ocalis\2'),
(r'([Vv])ulcaniz(.?)',r'\1ulcanis\2'),
(r'([Vv])ulgariz(.?)',r'\1ulgaris\2'),
#WWWW#
(r'([Ww])easeled',r'\1easelled'),
(r'([Ww])easeling',r'\1easelling'),
(r'([Ww])esterniz(.?)',r'\1esternis\2'),
(r'([Ww])omaniz(.?)',r'\1omanis\2'),
(r'([Ww])oolen(.?)',r'\1oollen\2'),
(r'([Ww])oolies',r'\1oollies'),
(r'([Ww])ooly',r'\1oolly'),
#XXXX#
#None known#
#YYYY#
(r'([Yy])odeled',r'\1odelled'),
(r'([Yy])odeling',r'\1odelling'),
#ZZZZ#
#None known#
],
'exceptions': {
    'inside-tags': [
        'pre',
        'code',
        'nowiki',
        'hyperlink',
        'link',
        'comment',
    ],
    'category': [
        'spelling',
    ],
}
}
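For anyone wanting to try a pair before it goes into the list, the following is a minimal sketch, not the bot's own code. It assumes each (pattern, replacement) pair is applied in list order as an ordinary Python regular-expression substitution. The helper names `apply_fixes` and `check_backreferences` are hypothetical and exist only for this illustration. The three sample pairs are copied from the list above.

```python
import re

# Three (pattern, replacement) pairs copied from the user-fix list.
SAMPLE_FIXES = [
    (r'([Rr])umor(.?)', r'\1umour\2'),
    (r'(.?)([Rr])ealiz(.?)', r'\1\2ealis\3'),
    (r'([Tt])unneled', r'\1unnelled'),
]


def apply_fixes(text, fixes):
    """Apply each pair in order with re.sub (hypothetical helper, not part of the bot)."""
    for pattern, replacement in fixes:
        text = re.sub(pattern, replacement, text)
    return text


def check_backreferences(fixes):
    """Return the pairs whose replacement names a group the pattern does not capture."""
    bad = []
    for pattern, replacement in fixes:
        group_count = re.compile(pattern).groups
        used = [int(n) for n in re.findall(r'\\(\d)', replacement)]
        if any(n > group_count for n in used):
            bad.append((pattern, replacement))
    return bad


if __name__ == '__main__':
    print(apply_fixes('Rumors that the Doctor realized he had tunneled out', SAMPLE_FIXES))
    print(check_backreferences(SAMPLE_FIXES))
```

Running a new pair through a check like `check_backreferences` catches one common class of typo, a replacement that refers to a capture group the pattern never defines, before the bot is let loose on real pages.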