Algilez

International Language


Algilez Vocabulary Notes

Contents on this page

1   Vocabulary - where does it come from: The various sources of the Algilez vocabulary
2   Statistics: Some statistics relating to the vocabulary
3   Vocabulary used for standard exams: The vocabulary required for the AQA Exam Board GCSE French and German exams
4   How to use the vocabulary: Understanding the numbering system and the alternative grammatical word forms provided
5   Roget's Thesaurus: The importance of a reliable language classification system
6   Algilez classification example: Examples of classification in Algilez
7   Vocabulary: How the vocabulary is classified
8   Why use a classification system?: What are the problems if you don't classify?
9   Choice of Root Words: Explanation of how Root Words are chosen
10  Creating a compound word: An example of how a compound word is formed
11  Creating new Algilez words: This is on a new page

1    Vocabulary - where does it come from?

The Algilez vocabulary is compiled from a number of different sources.  These include:

 

Roget's Thesaurus,  headwords and keywords

These cover all word meanings in the English language, although some sub-classifications may not be covered.

 

Voice of America Wordlist

Wordlist used in Voice of America radio broadcasts.  This list contains a useful number of words relating to politics and present day life.

 

Longman Defining Vocabulary

The wordlist used by the Longman English Dictionary to define its meanings.  This list should therefore include most of the commonly used and understood English words.

 

Assessment and Qualifications Alliance (AQA) French and German GCSE wordlists in the UK

The wordlist used by the AQA board for the French and German GCSE exams.  This gives the French & German words or phrases that a student would be expected to know in order to take the AQA French or German GCSE exams.  The exam is normally taken by British school children aged 16 after about 5 years of learning.  A pass at A, B or C level is equivalent to Common European Framework (CEF) level B1.  CEF levels range from A1 (beginner) through A2, B1, B2 (A Level) and C1 (University level) to C2 (professional level).

 

Additional Sources

These include common household items, animals (domestic, farm and zoo), children's games and toys, colours and shapes etc.

2    Statistics

 

The Algilez Vocabulary continues to be developed.  In July 2006 there were 4,662 Roget entries; by January 2012 this had increased to 5,851.  This was the result of development of the Algilez documentation, which includes the Grammar, Phrase Book, the English-Algilez Translation Guide, additional words from the AQA Examination Board and the draft GCSE Lesson Book.  There are estimated to be about a million words in the English language, so we still have some way to go yet.  Please remember that the vocabulary still really only contains general 'day to day' words but is being added to almost every day.  Note also that the word list will not contain all the possible grammatical combinations of tenses or adjectival affixes that are possible in Algilez; just a few examples (usually of the verb infinitive and adjective) are provided.

There were about 1725 root words used as of January 2012, excluding plants, animals and proper names. Algilez does not yet contain a full set of individual words for plants, animals and technical items that would be encountered in a natural language.  However, the core Algilez vocabulary appears to be fairly robust for day to day use – anything you can say in English you can say in Algilez!

Note that 'compound words' as defined below are new words with a new semantic meaning.  Grammatical variations derived from any existing root or compound word (e.g. conjugated tenses, adjectives & adverbs etc) are sometimes included, in order to make finding the words easier.  However since there are no irregular words in Algilez, all grammatical variations follow set patterns (e.g. for tenses, verbs, professions, adverbs and adjectives based on the root words) and are very easy to apply.

Statistics as of 27 January 2012 (note that these are changing regularly)

Total vocabulary entries, i.e. Algilez words: 5,851
Total individual Root words: 1,913
Main root words 'r' (i.e. excluding plants, animals and proper names): 1,725
Repeated root words, r2 & r3 (note: not duplicates, just repeated for ease of reference): 29
Animals: 88
Plants: 45
Proper names (countries etc): 55
Compound words (derived from the main root words): 3,902
Dictionary words (root words plus various affixes), the total number of words used so far: 14,570


Comparison with English

Algilez Phrase Book

The Algilez Phrase Book (as of 23 June 2011) contains the following numbers of words and characters:-

English words  7434 (equivalent to 9 or 10 sides of typed A4 in normal text)

Algilez words  6419 (86%), i.e. a direct translation from English to Algilez requires only 86% of the words.

Even if the words 'the' (280) and 'a' (191) are excluded from the English text (since they are not included in Algilez) this removes 471 words, leaving 6963 English words.  Algilez has still only 92% of the words, showing that the grammar and format of Algilez is more compact than English.

English characters  31244

Algilez  characters  22298 (71% compared with English)

Eliminating the characters from the words 'the' (840) and 'a' (191) will remove 1031 characters leaving 30,213 characters in the English text.  Algilez has 22,298 characters which is 74% of the English.

 

Algilez GCSE Lesson Book

Translations of English and Algilez sentences in the Algilez GCSE Lesson Book (Jan 2012) give similar results:-

English words 13,730,  English characters 56,805.  Without 'the' & 'a' 12,768 words & 54,629 characters.

Algilez words  11,052,  Algilez characters  40,803.

Even with 'the' & 'a' removed from the figures, Algilez has only 87% of the words and 75% of the characters compared with English.

Total figures for both documents give 89% of the words and 74% of the characters.

This demonstrates that, over a fairly large amount of standard text, Algilez contains over 10% fewer words and about 25% fewer characters.  On average, Algilez words are shorter than English words (3.6 characters compared with 4.2 characters).  All of this helps to reduce the burden of learning.
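The percentages quoted for the two documents can be rechecked with a few lines of arithmetic.  The following is only a sketch in Python (chosen for illustration; it is not part of any Algilez tooling), and the names `counts` and `ratio_percent` are my own.  The English figures used are those with 'the' and 'a' already removed.

```python
# Counts quoted in the text above; English figures exclude 'the' and 'a'.
counts = {
    "Phrase Book": {"en_words": 6963, "al_words": 6419,
                    "en_chars": 30213, "al_chars": 22298},
    "GCSE Lesson Book": {"en_words": 12768, "al_words": 11052,
                         "en_chars": 54629, "al_chars": 40803},
}


def ratio_percent(algilez: int, english: int) -> int:
    """Return the Algilez count as a whole-number percentage of the English count."""
    return round(100 * algilez / english)


# Per-document ratios (words, characters).
for name, c in counts.items():
    print(name,
          ratio_percent(c["al_words"], c["en_words"]),
          ratio_percent(c["al_chars"], c["en_chars"]))

# Combined totals for both documents.
total = {key: sum(c[key] for c in counts.values())
         for key in ("en_words", "al_words", "en_chars", "al_chars")}
words_pct = ratio_percent(total["al_words"], total["en_words"])
chars_pct = ratio_percent(total["al_chars"], total["en_chars"])

# Average Algilez word length over both documents.
avg_algilez_word = total["al_chars"] / total["al_words"]
print(words_pct, chars_pct, round(avg_algilez_word, 1))
```

Run as written, the combined figures come out at 89% of the words and 74% of the characters, with an average Algilez word length of about 3.6 characters, in line with the values quoted above.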

3    Vocabulary required for standard exams

What do you need to know?

The AQA exam board word list provides a useful benchmark for estimating the words that need to be learned for Common European Framework (CEF) level B1.  I have combined the recommended wordlists for the French and German exams (marked with 'F' in the second column of the Algilez word list) which total 2843 words.   Most words are identical for both exams but a few hundred are specific to either French or German.  Also, many words are grammatical variations of the same root word.  The list contains many words appropriate to normal day to day life for European teenagers and is therefore a reasonable list for an intermediate standard of conversation and reading.

There are 1276 Algilez root words (including plants and food) included in that list.  The other 1567 words (2843-1276) are compound words or grammatical variations derived from the root words.  Hence, as well as the grammar rules, approximately 68% of the 1693 Algilez root words (and the compound words derived from them) would need to be learnt to achieve CEF level B1 (UK GCSE level).

4    How to use the vocabulary

The Algilez vocabulary format is based on the Roget Thesaurus numbering system (see Roget's Thesaurus below).

The word list columns are:

Roget: Roget number plus additional sub-classification letters
English: Roget headword or keyword (this gives the basic meaning of the word), plus additional common words which express the same meaning
Algilez: Algilez word
Root: Root Word (r), animal (a), plant (p) or Proper Name (n)
Additional Algilez: Additional grammatical versions of the Algilez word, normally the adjective and verb
English: English grammatical versions of the Algilez words

Roget   English               Algilez   Root   Additional Algilez   English
054b    fullness, plenitude   fu        r      fua, fuiz            full, to fill
136bb   defer, postpone       delgã            delgãiz              to defer
209e    up, rise              up        r      suupiz, upiz, upa    to rise: to raise, upper

5    Roget's Thesaurus

5.1    Ogden's Basic English

My initial starting point for a simple wordlist was Ogden's 'Basic English'.  This appeared to be a well thought out and compact list of 850 common English words, the theory being that you can say anything that you need to say using just those words.  Unfortunately it soon became clear that 'Basic English' was fatally flawed because it allows multiple meanings of words, since this is the only way the list can be kept down to 850.  Multiple meanings are confusing in any language and English suffers particularly badly from this problem.  Any artificial language must be designed to avoid this.

5.2    Roget's 1000 categories of meaning

The reference book which I used to establish the various meanings of the words used in Basic English was Roget's Thesaurus.  In using the book in a detailed and methodical way (rather than just looking for synonyms for essay writing or crosswords), I began to realise the excellent work which lay behind it.  In compiling his Thesaurus, Peter Roget had first categorised the whole of the English Language.  This was a lifetime's work and the final product is not just a detailed list of English synonyms but, most importantly, a comprehensive analysis of the English language into a logical list of just under 1000 categories of meaning.

5.3    Other languages

It is a great compliment to Roget's intellectual ability and language skill that his categories are, with only very minor exceptions, still valid today after 150 years.  Every single word in the English language can be placed in one of the categories.  Since the categories are based on meaning, any other language can be similarly categorised using the same 1000 headings.  Roget himself was certainly familiar with, and possibly even fluent in, French, German and Latin and did hope that a common world language might benefit from his work.  In fact, shortly after Roget's Thesaurus was first published, versions also appeared in French and German.  It would be interesting to know if they have been published in any other languages.  Since crossword puzzles cannot be a purely English language hobby, I'm sure there must be more versions around somewhere!

5.4    Different types of Thesaurus

Modern versions of Roget’s Thesaurus sometimes use a different numbering system (e.g. starting with '001: Birth' instead of '001: Existence').  Other versions may not use numbers at all but simply be a list of synonyms in alphabetical order.  It is the classification system of Roget’s original Thesaurus, as much as the grouping of synonyms that makes Roget’s work so useful in language analysis (although a few of his categories might seem a little questionable today).  The construction of a new logical vocabulary would be impossible without first deciding what meanings it was necessary to express and what associated words stem from those meanings.  The classification process was a lifetime’s work in itself.  Fortunately we are able to use the excellent work of Roget to continue with the construction of a new language, today.

The Historical Thesaurus of English, recently published by Glasgow University (after 45 years of work!) also uses a different numbering system to Roget.  However it is still possible to compare similar meanings from one book to another.  In view of this, I have decided to retain the Roget classification system for the time being, since it is better suited to language development work.  Given the enormous workload in producing a new thesaurus, it appears unlikely that a better version of the Roget system will appear, but we shall see.

5.5    Language by numbers - a new way of looking at language

A very interesting implication for language development has become apparent from working with the Roget Classification system.  Since every word in Algilez has (or will eventually have) a unique classification number, then it would be possible to write sentences in Algilez using just the classification numbers alone.  This is made much easier by the regular syntax of Algilez.

There are a number of implications of this.  Firstly, the actual words used could be changed very easily.  This has already proved very useful in this development stage of the language, when words have sometimes needed to be changed to something more suitable.  For translation purposes, the substitution of foreign language words might also be possible (although syntax differences would still require further work to make a good translation).

The second implication is that machine reading of Algilez should be considerably easier, since any sentence could theoretically be reduced to the numerical components from the classified vocabulary list e.g.

English                 He      went        with    his son                 to      the park
Algilez                 il      goz         vek     cuil      ila           u       pãk
Algilez Roget numbers   371cf   265c/125a   089a    011dea    371cf/564b    289b    837fa

Whether this will eventually prove to be worthwhile remains to be seen, but I suspect there is considerable potential for further development.     
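As a rough illustration of the idea, the example sentence can be turned into its numbers and back again with a simple lookup table.  This is only a sketch in Python: the word and number pairs are copied from the example above, while the function names `encode` and `decode` and the replacement spelling 'park' are hypothetical and used purely for demonstration.

```python
# Word-to-number pairs copied from the example sentence above.
ALGILEZ_TO_ROGET = {
    "il": "371cf",
    "goz": "265c/125a",
    "vek": "089a",
    "cuil": "011dea",
    "ila": "371cf/564b",
    "u": "289b",
    "pãk": "837fa",
}
# Reverse lookup: classification number back to the current Algilez word.
ROGET_TO_ALGILEZ = {number: word for word, number in ALGILEZ_TO_ROGET.items()}


def encode(sentence: str) -> list:
    """Replace each Algilez word in the sentence by its classification number."""
    return [ALGILEZ_TO_ROGET[word] for word in sentence.split()]


def decode(numbers: list, lexicon: dict) -> str:
    """Rebuild a sentence from classification numbers using the given lexicon."""
    return " ".join(lexicon[number] for number in numbers)


numbers = encode("il goz vek cuil ila u pãk")
print(numbers)
print(decode(numbers, ROGET_TO_ALGILEZ))

# Changing a word only means changing one lexicon entry; the numbers stay the same.
revised = dict(ROGET_TO_ALGILEZ)
revised["837fa"] = "park"  # hypothetical new spelling, for illustration only
print(decode(numbers, revised))
```

The stored numbers are untouched when a word is changed; only the lexicon entry differs, which is the property described above.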

5.6    Issues with Roget's Thesaurus

Occasionally I have come across classification examples that I find questionable.  Generally I have accepted Roget's expertise in the matter and gone along with his classification.  However, there are a few cases where I cannot agree and have put words into a different classification group.  Whether the original classifications were down to Roget or to later editors I cannot say.  Similarly, I wonder whether these arguable classifications have been noticed and debated previously.  I list below the two examples found so far.  In any case it certainly does not detract from the magnificent achievement of the creation of the thesaurus in the first place.

English: Route, road etc
Algilez: rut etc
Original Number: 624
Revised Number: 305
Reason for change: Roget appears to have mixed two meanings of 'way': in 624 it is used to mean 'method/how', whereas in 305 it is used as a passage/physical route.  I have therefore moved all words relating to routes, roads, paths etc to 305.

English: harm
Algilez: bocid etc
Original Number: 645
Revised Number: 655
Reason for change: Roget uses 645 'Badness'.  I think that 655 'Deterioration' fits better for harm.

6    Algilez classification example

6.1    Lateness

The word lists used to build the Algilez vocabulary contained three similar words: delay, defer and postpone.  All three words come under the same category of Roget 136, for which the Headword is 'Lateness'.  Lateness is a general term and is sub-divided into 'Lateness' and 'Delay', which I have given the numbers Roget 136a (Lateness) and Roget 136b (delay).  'Delay', 'defer' and 'postpone' all come under sub-section 136b (delay).

6.2    Delay, defer and postpone

In other words 'delay, defer and postpone' are all considered by Roget to have a similar enough meaning to be grouped under the same category and to be a form of 'delay'.  As an initial assumption, we could say that all three words have the same meaning.

Note that this does not imply that they would always be interchangeable whenever they were used in English.  It may be that under different circumstances either 'delay' or 'defer' or 'postpone' might be used, due to 'custom and practice' of normal English usage.  However, the point is that we start by assuming that the semantic meaning of the three words is identical.  For that reason any one of them could be used, and the Algilez word that represents that meaning (del) would apply to any one of the three.

(Note that when I compare 'a delay' I am comparing it with 'a deferral' or 'a postponement', similarly we are comparing 'to delay' with 'to defer' or 'to postpone'.  It is the semantic meaning that we are looking at here, not the grammatical usage).

6.3    Differences between Defer/Postpone and Delay

However in looking at the words more closely, we may consider for example, that although 'defer' and 'postpone' have identical semantic meaning, that meaning is slightly different to that of 'delay'.  In such a case we need to modify the Roget numbering slightly e.g.

Roget Number   English           Algilez
136ba          delay             del
136bb          defer, postpone   delgã

In this case we have chosen to define 'defer' & 'postpone' as a delay to starting something and have therefore formed the compound word 'delgã' from the roots 'del' (delay) and 'gã' (begin).  An alternative might have been to say that 'defer' & 'postpone' are defined by 'del' (delay) and 'hãp' (happening/event), hence making 'delhãp'.

This numbering then enables others to quickly compare the semantic meaning of any word with its equivalents in English and in other languages, both natural and artificial.
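The way the two candidate compounds are built from their roots can be sketched in a few lines of Python.  The roots and their meanings are those given above; the `compound` function and the `ROOTS` table are hypothetical names used only for this illustration.

```python
# Roots and meanings taken from the text above.
ROOTS = {"del": "delay", "gã": "begin", "hãp": "happening/event"}


def compound(*roots: str) -> tuple:
    """Join root words into one compound and return it with a gloss of its parts."""
    unknown = [root for root in roots if root not in ROOTS]
    if unknown:
        raise KeyError(f"unknown root(s): {unknown}")
    word = "".join(roots)
    gloss = " + ".join(ROOTS[root] for root in roots)
    return word, gloss


print(compound("del", "gã"))   # the compound chosen for 'defer, postpone'
print(compound("del", "hãp"))  # the alternative that was considered
```

Keeping the gloss alongside the compound makes it easy to see at a glance why one candidate was preferred over another.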

7    Vocabulary

7.1    Classification

Algilez is classified into approximately 1000 main classes of meaning.  The classification is based on those used by Peter Roget in his Thesaurus.  Each main heading of meaning is numbered.  Sub headings and individual words are shown by additional letters to a maximum of 3 digits and 3 letters e.g.:

011c     family         fam        r
011ca    mother         pãrel
011caa   mum (mother)              r
011cab   grandmother    pãrpãrel
011cb    father         pãril
011cba   dad                       r
011cbb   grandfather    pãrpãril
011cc    child          cu         r2
011cca   son            cuil
011ccb   daughter       cuel
011ce    sibling        sib        r
011cea   brother        sibil
011ceb   sister         sibel
011d     race, people   peg        r
011da    tribe          fampeg
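The numbering format (3 digits followed by up to 3 letters) is regular enough to be checked and navigated mechanically.  The Python sketch below is an illustration only: the function names are hypothetical, and the rule that a sub-heading's parent is found by dropping its last letter is my reading of the family example above (011c, 011cc, 011cca), not a stated rule.

```python
import re
from typing import Optional

# Three digits for the main heading, then zero to three lower-case letters.
CODE_PATTERN = re.compile(r"(\d{3})([a-z]{0,3})")


def parse_code(code: str) -> tuple:
    """Split a classification code into (main heading number, sub-classification letters)."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a valid classification code: {code!r}")
    return int(match.group(1)), match.group(2)


def parent(code: str) -> Optional[str]:
    """Return the next heading up, or None for a main heading (assumed rule: drop the last letter)."""
    _, letters = parse_code(code)
    return code[:-1] if letters else None


def ancestors(code: str) -> list:
    """List every heading above the given code, nearest first."""
    chain = []
    current = parent(code)
    while current is not None:
        chain.append(current)
        current = parent(current)
    return chain


print(parse_code("011cca"))  # 'son' in the table above
print(ancestors("011cca"))   # child, family, then the main heading
```

A check of this kind would also catch mistyped codes, such as one with four letters, before they reach the word list.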

7.2    Root Words

A Root Word is an Algilez word in its most basic, simple form.  It is generally (but not always) a noun and can have tense and verbal affixes etc added.  An example of a root noun is 'bel', meaning beauty (an abstract noun).  To this root we add a verbal suffix to create the verb 'beliz', meaning 'to beautify'.  We can also add an adjective suffix 'a' to make 'bela' e.g. 'peel bela' (a beautiful woman).  The same affix 'a' can also make an adverb e.g. 'pintoz bela' (beautifully painted).  Note that a 'qualifying' word following a noun will always be an adjective and one following a verb will always be an adverb.

A number of frequently used words consist of single letter roots e.g.:

journey, travel, move place   g
hear                          h
listen                        l
see                           s

However, the above roots are never used alone.  They will always have an additional letter or letters to make them into a noun, adjective, verb or adverb e.g.:

to travel          giz
I went yesterday   me goz ozde
a journey          go
Come here!         gez he

7.3    Algilez Vocabulary

The Algilez Vocabulary is on a separate web page and is based on MS Excel.  It consists of an Algilez word list categorised by Roget reference number.  The wordlist can be copied from the web page onto any spreadsheet and then re-sorted into alphabetical order of English or Algilez words as required.
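The same re-sorting can be done outside a spreadsheet.  The following Python sketch assumes the rows have already been copied into a list; the four sample rows are taken from the tables on this page, and the tuple layout (Roget number, English, Algilez) is my own choice for the example.  Plain string comparison is used, which is a simplification for words containing letters such as 'ã'.

```python
# A few rows from the word list: (Roget number, English, Algilez).
rows = [
    ("136bb", "defer, postpone", "delgã"),
    ("011c", "family", "fam"),
    ("209e", "up, rise", "up"),
    ("054b", "fullness, plenitude", "fu"),
]

# Original order of the list: by Roget reference number.
by_roget = sorted(rows, key=lambda row: row[0])

# Alphabetical order of the English headwords.
by_english = sorted(rows, key=lambda row: row[1])

# Alphabetical order of the Algilez words.
by_algilez = sorted(rows, key=lambda row: row[2])

for row in by_algilez:
    print(row)
```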

8    Why use a classification system?

8.1    Starting with a word list

The starting point for a new vocabulary is going to be one's own language.  In my case, English.  The initial need is for a basic wordlist/vocabulary of the more commonly used words, which can then be expanded to include the remainder of the language.  Given the tens of thousands of regularly used words (including the variations of tense etc) and the hundreds of thousands of lesser used words (including specific animal, plant and technical terms), the difficulty is in knowing even where to begin.

8.2    The need for classification

A second difficulty is that basic word lists, no matter how common the words, do nothing to help with the classification of the vocabulary which is essential for a new language.  Without classification and the sensible ordering of words of similar meanings or those derived from the same roots, then, no matter how much better the grammar is, the new language itself is going to be little better than any natural language with all of its inconsistencies and difficulties for the learner.

8.3    Previous approaches to choosing vocabularies

Previous artificial languages have generally succeeded in providing a simplified grammar but have still tried to use lengthy and sometimes illogical European language words as the basis for their vocabulary.  They often use just an alphabetical word list with little or no attempt at a classification of word meanings.  This may have eased word recognition by European language speakers but would be meaningless to native speakers of Chinese, Hindi, Arabic etc.

8.4    The Algilez classification method

Perhaps the simplest way to demonstrate the advantages of a classified list is to look again at the words relating to family:

011a     consanguinity, kinship   ken
011b     kinsman                  kenpe
011ba    uncle/aunt               onk
011bb    uncle                    onkil
011bc    aunt                     onkel
011bd    cousin                   kos
011c     family                   fam
011ca    mother                   pãrel
011caa   mum (mother)
011cab   grandmother              pãrpãrel
011cb    father                   pãril
011cba   dad
011cbb   grandfather              pãrpãril
011cc    child                    cu
011cca   son                      cuil
011ccb   daughter                 cuel
011ce    sibling                  sib
011cea   brother                  sibil
011ceb   sister                   sibel
011d     race, people             peg
011da    tribe                    fampeg

English words such as son, daughter, brother & sister have no common roots to denote male or female or to denote child.  Seeing the words together, in the same Algilez classification group above, makes it much easier to see which words ought to use a common root.  The Algilez words follow a logical pattern, making understanding and learning much easier and quicker.
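The pattern in the table can be made explicit with a tiny sketch.  In the Python below, the endings 'il' (male) and 'el' (female) are read off the word pairs in the table (cuil/cuel, sibil/sibel, onkil/onkel); the function name `gendered` is hypothetical and the code is an illustration only, not a statement of the full grammar.

```python
# Endings observed in the table above: 'il' for male, 'el' for female.
GENDER_SUFFIX = {"male": "il", "female": "el"}

# Neutral roots from the table, with their meanings.
NEUTRAL_ROOTS = {"cu": "child", "sib": "sibling", "onk": "uncle/aunt"}


def gendered(root: str, gender: str) -> str:
    """Form the male or female word from a neutral root."""
    return root + GENDER_SUFFIX[gender]


for root, meaning in NEUTRAL_ROOTS.items():
    print(meaning, gendered(root, "male"), gendered(root, "female"))
```

The English equivalents (son/daughter, brother/sister, uncle/aunt) each have to be learnt separately, whereas here one root and two endings cover every pair.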

Algilez uses many words of English origin, and I have chosen to use new root words if the word is 1) frequently used and 2) would otherwise require a long compound word of three or more root words.

9    Choice of root words

Root words are generally based upon the abstract noun.  In some cases there are a large number of choices, any of which would work and none of them obviously right or wrong.  In these circumstances the tangible noun is often the one chosen due to being the more common word.  An example is friend 'fren'.

Grammatical use       English        Algilez
Root Noun             friend         fren
ex - quality          friendliness   frenex
øk - result/outcome   friendship     frenøk
iz - verb             to befriend    freniz
a - adjective         friendly       frena
a - adverb            friendlily     frena

However, we could have used 'friendliness' as the main root word and modified the other meanings accordingly e.g.

Root Noun             friendliness   fren
tangible noun         friend         frenpe
øk - result/outcome   friendship     frenøk

Alternatively we could have taken 'friendship' as the main root word :

Root Noun             friendship     fren
tangible noun         friend         frenpe
ex - quality          friendliness   frenex

In most cases the root chosen, in order to maintain the shorter word (without affixes), has been that which is most commonly used.  In this case I have judged that 'friend' is likely to be a more commonly used word than 'friendship' or 'friendliness' and therefore defined friend as 'fren' instead of 'frenpe'.  (In fact fren and frenpe have slightly different meanings anyway but it illustrates the point).  See below for information about compound words.
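Because the affixes are completely regular, the derived forms of any root can be generated rather than listed.  The Python sketch below uses only the affixes shown in the first table of this section (ex, øk, iz, a); the `derive` function and the `AFFIXES` table are hypothetical names for this illustration.

```python
# Affixes from the 'fren' table in this section, keyed by grammatical use.
AFFIXES = {
    "quality": "ex",
    "result/outcome": "øk",
    "verb": "iz",
    "adjective": "a",
    "adverb": "a",
}


def derive(root: str) -> dict:
    """Return every derived form of a root, keyed by grammatical use."""
    return {use: root + affix for use, affix in AFFIXES.items()}


forms = derive("fren")
for use, word in forms.items():
    print(use, word)

# The same table works for another root mentioned earlier, 'bel' (beauty).
print(derive("bel")["verb"], derive("bel")["adjective"])
```

Whichever meaning is chosen for the bare root, the other forms follow from the same table, so the choice mainly affects which meaning gets the shortest word.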

10    Creating a compound word

Compound words are composed of two or more root words.  The section above gives examples based on the root word 'fren' (friend).  In these examples, we have used 'ex' and 'øk', which are two commonly used modifiers.

Grammatical use       English        Algilez
Root Noun             friend         fren
ex - quality          friendliness   frenex
øk - result/outcome   friendship     frenøk

However, not all word creation is quite so obvious.  Let us take the example of the word 'Passport'.  This is a two-part English word, in common use and well understood.  However the word itself was probably created several hundred years ago and would have been used to describe a letter of permission allowing an English traveller to cross by sea into France.  Nowadays the two parts of the word do not accurately describe the function of a passport and it would be confusing to just apply a literal translation from English to Algilez i.e. pass-port = pãs-goas.

We really need to think about what exactly the function of the document is and then find the best words to describe it.  A dictionary definition gives 'passport: official document for use by a person travelling abroad', i.e. a passport is a travel document, a means of identification, a permit to enter countries etc.  However, we do not want to produce an unnecessarily complicated, multi-syllable word.  Some of the choices available are words such as:-

059b   foreign country                         bosnax
265d   journey                                 go
494b   authenticity, genuineness               truøk
547b   identification, naming, point out       den
548a   document, record, documentation         rek
733a   authority                               fur
756a   let, permission, allowing, allow, may   le
756c   permit, licence                         lepap

Some of the above words are already two-part compound words.  In the end, the choice was made to use 'goden', which combines the meanings of 'Journey' (go) and 'Identity' (den) and seemed most appropriate to the present use of the word 'Passport'.

11    Creating new words

This is on a new page: Creating new words

 

Last revised: 27 January 2012


© Copyright Alan Giles 1999

If you would like to know more, please contact me at:- admin@algilez.com
