
[aspell-devel] more hints


From: Dennis R. Crosby
Subject: [aspell-devel] more hints
Date: Tue, 18 May 2010 23:34:20 +0200

I’ve done some more link jumping from your site and read other people’s comments as well.

I’m afraid everyone is using some kind of word-based approach. Words aren’t basic enough.

Everything is based on morphemes, units of meaning, that might correspond to words, but also correspond to inflectional affixes (and sound shifts).

English is a bad starting point. Practical applications are possible in the short term, but they will be too unwieldy for other languages (some languages will break the infrastructure immediately).

Given time, even English-based systems will cease working, or will require increasingly complex additional layers of ‘correction’.

 

All languages use morphemes and their allomorphs, even polysynthetic ones, where a “word” in isolation is a meaningless concept and the utterance (sentence) is as small as you’re going to get. They too are composed of strings of morphemes in functional-contextual allomorphic variants.

 

Also: besides starting with your database of morphemes and conditional rules, you need to identify morphemes by context frequencies: how often does this morpheme indicate plural, how often possession, how often the contracted allomorph of third person singular “to be”? That is the case with “s” (and its allomorphic variations “es”, “ren”, and null, as well as “z”, which is pronounced, though not written).
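As a rough illustration of the context-frequency idea above (a sketch only, not anything aspell does today), the snippet below counts how often each surface form serves each function. The observation list, the function labels, and the name `context_frequencies` are all hypothetical; real data would come from a tagged corpus.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged observations: (surface form, function it served).
# The empty string stands for the null allomorph.
OBSERVATIONS = [
    ("s", "plural"),          # cats
    ("s", "plural"),          # dogs
    ("s", "possessive"),      # John's
    ("s", "is-contraction"),  # it's
    ("es", "plural"),         # boxes
    ("ren", "plural"),        # children
    ("", "plural"),           # sheep
]


def context_frequencies(observations):
    """Return, per surface form, the relative frequency of each function."""
    counts = defaultdict(Counter)
    for surface, function in observations:
        counts[surface][function] += 1
    return {
        surface: {fn: n / sum(c.values()) for fn, n in c.items()}
        for surface, c in counts.items()
    }


freqs = context_frequencies(OBSERVATIONS)
```

With this toy data, `freqs["s"]` gives plural half the weight and splits the rest between possession and the contracted “is”, which is the kind of number a checker could consult when a bare “s” is ambiguous.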

 

Context indexes need to include etymological identifiers as well as functional identifiers. Why? Because where a word originally comes from, as well as when it was adopted into English and sometimes even the route it took (e.g. a Latin root imported via French or Spanish, or a technological revolution born necessity [or should I say technological revolution *borne* necessity? Neither is wrong, depending on what I want to emphasize. Either choice is wrong if I mean to emphasize the other meaning]), actually DETERMINE the set of rules that apply to spelling variation as well as usage in English. They are the ‘why’ that we scratch our heads about but go on applying consistently, knowing something is wrong when we try to do otherwise. Morphemes are packages: units of meaning with variants that are just as bound to their lineage as they are to the meanings they carry and the phonemes of which they are composed (which in turn make them subject to another set of rules governing contextual sound variations). That form/meaning package of variants has a set of rules that govern it, a “citizenship” with rights and obligations if you will. A twin morpheme (homophone) that is a “citizen” of another country has other duties and other rights-based expectations.

And the rules governing contextual sound shifts MAY BE DIFFERENT. Lineage is the reason. Sometimes we didn’t just borrow words. Sometimes we borrowed the manual that went with the word. Sometimes we didn’t read the manual, or read it well, or it’s been so long – how did that go? Maybe the manual got lost.
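To make the “citizenship” idea concrete, here is a minimal sketch, assuming each morpheme record carries an etymological tag that selects which rule set applies. The `Morpheme` class, the lineage labels, and the three plural rules are illustrative assumptions, deliberately oversimplified, and not a description of any existing spell checker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str     # surface spelling
    meaning: str  # functional/semantic identifier
    lineage: str  # etymological identifier (hypothetical labels)


def _latin_plural(form):
    # Latin second-declension nouns keep their borrowed rule: -us -> -i.
    return form[:-2] + "i" if form.endswith("us") else form + "s"


def _greek_plural(form):
    # Greek neuter nouns keep theirs: -on -> -a.
    return form[:-2] + "a" if form.endswith("on") else form + "s"


def _default_plural(form):
    # Default English rule: -es after sibilant spellings, otherwise -s.
    if form.endswith(("s", "x", "z", "ch", "sh")):
        return form + "es"
    return form + "s"


# The lineage picks the rule set; unknown lineages get the default rule.
PLURAL_RULES = {"latin": _latin_plural, "greek": _greek_plural}


def pluralize(morpheme):
    rule = PLURAL_RULES.get(morpheme.lineage, _default_plural)
    return rule(morpheme.form)


alumni = pluralize(Morpheme("alumnus", "graduate", "latin"))
walruses = pluralize(Morpheme("walrus", "animal", "germanic"))
criteria = pluralize(Morpheme("criterion", "standard", "greek"))
```

Both “alumnus” and “walrus” end in “us”, yet only the one tagged as Latin takes “-i”; the spelling alone cannot tell you which rule applies, the lineage tag does.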

 

This stupid analogy illustrates how many things can be involved in “proper” spelling. Periodically, the culture gets tired of learning to apply all these rules, whose reasons nobody can remember and which people usually get wrong, and a wave of simplification sweeps through the language.

 

English has suffered from domination by French-speaking Danish descendants for 200 years, followed by pretensions of loyalty and proclamations of convictions, all of which were enormous factors in word choice, spelling and grammar. English doesn’t look at all like Icelandic, but it used to.

 

We didn’t bother to educate the slaves or their descendants, so they never quite got around to imitating our usage exactly, and sometimes never quite abandoned grammatical transformations that (quite conveniently) expressed day-to-day sameness in a way ‘standard’ usage ignored completely. Lo and behold, 200 years later every good white Anglo-Saxon descendant in the US knows exactly “what they be talkin’ ‘bout”. Those lineage-based forms (in the latter case, concepts bound in rules that apply to a category, namely verbs) are INTERNALIZED IN OUR WHOLE CULTURE. How many people do you know who can explain why?

 

Sorry, this was meant to be a short letter. I can partly blame the medicine, but it’s rooted in my basic nature, fed by much thinking, and bottled up due to a worldwide lack of interest in the subject.

 

I really hope you can glean some useful points out of my rantings. My points are important, but I’m afraid I’m burying them too deeply. Sorry.


