Summary
coi. rodo .i mi'e jexOm.
While learning the Lojban language1,
I found myself drawing some Unified Modeling Language (UML) class
diagrams to help me better grasp the Lojban grammar concepts. I
was doing domain modeling in the linguistics field! Was it a UML
Whorfian effect?
This article is about a constructed language called Lojban,
the UML, and the Sapir-Whorf Hypothesis (SWH).
Table of Contents
Constructing Languages
Learning the Lojban Grammar Concepts
Lojban Basic Predications
Lojban Word Categories
More Lojban Word Categories
Lojban Tanru
Description Sumti
Would You Like More of This?
Why Did I Create this Domain Model?
Back to the Sapir-Whorf Hypothesis
Experiencing UML Whorfian Effects?
Be Liberated from Whorfian Mind-locks
ki'e .i co'o
Acknowledgements
References
Endnotes
Constructing Languages
Designing new languages or studying artificial languages is a
most fascinating activity, in my opinion. When I say new languages,
I mean new tongues, not new programming languages (although
it may be fascinating also to invent or learn new programming languages).
Esperanto2 is the most successful
artificial language today. Clearly, it has its roots in European
natural languages, but has been built with a regular grammar and
a regular way for building families of words. Esperantists promote
Esperanto as a universal secondary language for culturally neutral
international communications.
There are thousands of other artificial languages (also known
as constructed languages, or "conlangs"), some of them being "art
languages," practiced by their inventor only, or by a whole community,3
like Tolkien's Elvish languages4 or
Klingon,5 spoken by the aliens of Star
Trek.
Lojban has evolved from its inception to become an original, usable
and interesting language. Why would anybody like to learn and practice
Lojban? Basically because it's a mind-expanding experience and because
it's fun!
At the very beginning, in 1955, Lojban was called Loglan (both
names mean "Logical Language"). It was an experiment to test a linguistic
research concept known as the Sapir-Whorf Hypothesis [9,
10] (see later section "Back to the Sapir-Whorf
Hypothesis"). The SWH has several formulations, depending on
whether you take the "strong" or "weak" variant:
- SWH strong formulation:
Language shapes the way we think, and determines what we can
think about.
- SWH weak formulation:
The language spoken by a linguistic community has an influence
on its culture (what this society does and thinks).
- SWH negative (and rather strong) formulation:
The limits of the language one speaks are the limits of the
world one inhabits.
One of the issues with the SWH is that nobody really agrees on what
exactly it claims, but we will come back to the SWH later.
Today, the Logical Language Group has departed from its original
objective to test the SWH (which is very difficult to demonstrate,
by the way) and has developed Lojban as an instance of the "engineered
languages," a subcategory of the constructed languages.
As of today, Lojban is a beautifully designed language. And it
is fascinating partly because it is different. The grammar,
the vocabulary, everything is very carefully built, in a logical
way, but doesn't look like any spoken or written natural language.
That doesn't mean that it is ugly or difficult to learn, and "logical
language" doesn't at all mean that you can't be fuzzy in what you
say or that you can't write poetry!
For more about Lojban, see The Complete Lojban Language book
[1]; Chapter 2, "A Quick Tour of Lojban
Grammar, with Diagrams," is a good place to start and get a feel
for what Lojban is. Actually, the book doesn't contain many diagrams...
Learning the Lojban Grammar Concepts
This section is about the Lojban language, for which I describe
some grammar concepts with UML class diagrams. If you're lost, feel
free at any time to skip to the next section, "Why Did I Create
this Domain Model?"
Lojban Basic Predications
Since the Lojban grammar is rather unusual, it is explained from
the beginning using Lojban terms. (Note: Lojban terms are always
written as invariable nouns, since in Lojban plurals are not denoted
with a suffix such as an "s".)
A standard Lojban sentence expresses a predication, which is called
a bridi. In English, all the following sentences, although
built from different grammatical entities, also express predications,
which can be paraphrased with relationships:
- I am your father (to be + noun) = to-be-father-of (father =>
I, child => you)
- You are big (to be + adjective) = to-be-big (who-is-big =>
you)
- I go to Paris (active verb) = to-go (goer => I, destination
=> Paris)
- I give you this (active verb) = to-give (donor => I,
gift => this,
beneficiary => you)
- That is green (to-be + adjective) = to-be-green (what/who =>
that)
- You are a cat (to-be + noun) = to-be-a-cat (what/who => you)
Lojban Pronunciation

| Letter | Sounds like |
|---|---|
| u | /oo/, like in "look" |
| o | /o/, like in "show" |
| c | /sh/, like in "show" |
| g | /g/, like in "god" |
| s | /ss/, never /z/ |
| j | /j/, like in French "bonjour", or /s/, like in "pleasure" |
| ' | /h/, like in "hello" |
| x | /kh/, like in the Arabic "Khaled", or /ch/ like in the Scottish "loch" or the German "Bach" |
The translations into Lojban give the following bridi:
- mi patfu do
- do barda
- mi klama la paris.
- mi dunda ti do
- ta crino
- do mlatu
Note: the Lojban words, like patfu, barda, klama, etc.,
have been built algorithmically using today's six most widely spoken
languages: Chinese, Hindi, English, Russian, Spanish, and Arabic.
For each relationship, a default place structure
(programmers would say signature) has been defined. The place number
in the bridi tells the role played by its occupant. A place in the
bridi is called a sumti. The centerpiece of the bridi,
called the selbri, expresses the relationship itself. So, typically,
a bridi will have the form shown in Figure 1.
sumti selbri sumti sumti ...

Figure 1. Lojban bridi Structure
Lojban Grammar Glossary

| Word | Definition |
|---|---|
| bridi | predicate |
| sumti | argument |
| selbri | predicate relation |
| cmavo | structure word |
| gadri | article |
| cmene | proper name |
| brivla | predicate word |
| gismu | root word |
| valsi | word |
| lujvo | compound predicate word |
| tanru | phrase compound |
Lojban Word Categories
If we look back to the examples in Lojban, we see different kinds
of words:
- mi, do, la, ti, ta belong to the category of small grammatical
words called cmavo.
- among these, la is an article (gadri) announcing
the name paris. (a name is called cmene in Lojban).
- mi, do, ti, ta are sumti cmavo, a bit like pronouns.
- patfu, barda, klama, dunda, crino, mlatu are all brivla
i.e., words that express a relationship, words that carry the
meaning; these brivla are gismu actually, i.e.,
root words.
STOP! You may say. Don't you feel the need to draw some diagrams
to help yourself at this point? Well, I do.

Figure 2. Categories of Lojban Words
More Lojban Word Categories
In other words, Lojban has no such category as noun, verb, adjective,
or adverb. It has relationships, called bridi, with one or
more words that constitute the selbri at the center.
In
- do mamta mi ("you are-a-mother-of me" i.e., "you are
my mother")
or in
- do patfu mi ("you are-a-father-of me" i.e., "you are
my father")
mamta and patfu play the role of the selbri.
They are different brivla. A brivla is a content word;
it can be:
- a gismu, built into the language
- a lujvo, derived from a combination of gismu
- a fu'ivla, borrowed from other languages, and adapted to Lojban

Figure 3. Kinds of Lojban brivla
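The word categories met so far can be jotted down as a small class hierarchy. The Python sketch below is only one possible reading of the text, not a reproduction of the figures. The class names (with Fuhivla standing in for fu'ivla, since an identifier cannot contain an apostrophe), the toy LEXICON table, and the rule that a trailing period marks a cmene (taken from the way "paris." is written here) are all illustrative assumptions.

```python
class Valsi:
    """Any Lojban word."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return f"{type(self).__name__}({self.text!r})"


class Cmavo(Valsi):
    """Small grammatical (structure) word."""


class Gadri(Cmavo):
    """Article, such as la."""


class SumtiCmavo(Cmavo):
    """Pronoun-like word, such as mi or do."""


class Cmene(Valsi):
    """Proper name, such as paris."""


class Brivla(Valsi):
    """Content word that expresses a relationship."""


class Gismu(Brivla):
    """Root word built into the language."""


class Lujvo(Brivla):
    """Compound derived from a combination of gismu."""


class Fuhivla(Brivla):
    """Word borrowed from another language (fu'ivla)."""


# Toy lexicon limited to the words used in the examples (an assumption).
LEXICON = {"la": Gadri}
LEXICON.update({w: SumtiCmavo for w in ("mi", "do", "ti", "ta")})
LEXICON.update({w: Gismu for w in
                ("patfu", "barda", "klama", "dunda", "crino", "mlatu")})


def classify(word):
    """Return a word object of the matching category."""
    if word.endswith("."):  # assumption: names are written with a final period
        return Cmene(word)
    return LEXICON[word](word)


print([classify(w) for w in "mi klama la paris.".split()])
```

Because Gismu derives from Brivla, which derives from Valsi, a question such as "is patfu a content word?" becomes a plain isinstance check.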
We have already used some gismu. These gismu are formally defined
like this:
- patfu: x1 is a father of x2
- barda: x1 is big/large in property/dimension(s) x2 as
compared with standard/norm x3
- klama: x1 comes/goes to destination x2 from origin x3
via route x4 using means/vehicle x5
- dunda: x1 [donor] gives/donates gift/present x2 to recipient/beneficiary
x3 [without payment/exchange]
- crino: x1 is green/verdant [color adjective]
- mlatu: x1 is a cat/[puss/pussy/kitten] [feline animal]
of species/breed x2
Where x1, x2, … represent the arguments (the sumti)
that are accepted in the predicate (the bridi) when these
gismu play the role of a selbri. The arguments are
optional. If they are present, it is their order in the bridi
that counts to understand the sentence. (There are means to change
this order and still understand the same thing, but it's beyond
the scope of this presentation.)
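Since a place structure behaves much like a function signature, the positional rule can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions: the Gismu class, the fill_places helper, and the role labels (paraphrased from the definitions above) are hypothetical names, and the means of reordering places are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gismu:
    """A root word with its ordered place structure (x1, x2, ...)."""
    word: str
    places: tuple  # role labels, in place order


# Role labels paraphrase the definitions quoted in the text.
KLAMA = Gismu("klama", ("goer", "destination", "origin", "route", "vehicle"))
DUNDA = Gismu("dunda", ("donor", "gift", "beneficiary"))


def fill_places(selbri, before, after=()):
    """Map sumti to roles by position.

    Sumti written before the selbri are counted first, then those after it.
    Places left unfilled are simply absent, since arguments are optional.
    """
    sumti = list(before) + list(after)
    if len(sumti) > len(selbri.places):
        raise ValueError("more sumti than places")
    return dict(zip(selbri.places, sumti))


# mi klama la paris.
print(fill_places(KLAMA, ["mi"], ["la paris."]))
# mi dunda ti do
print(fill_places(DUNDA, ["mi"], ["ti", "do"]))
```

The origin, route, and vehicle of klama stay unfilled in the first call, which mirrors how "mi klama la paris." leaves them unspecified.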
Lojban Tanru
A selbri can also be a tanru, which is a metaphor
built from a sequence of brivla. For example:
- mi sutra bajra (I am a quick runner / I run quickly /
I quickly run)
- do barda nanla (you are a big boy)
- mi dunda patfu (I am the father-who-gives)
Where:
- sutra: x1 is fast/swift/quick/hasty/rapid at doing/being/bringing
about x2 (event/state)
- bajra: x1 runs on surface x2 using limbs x3 with gait
x4
- nanla: x1 is a boy/lad [young male person] of age x2
immature by standard x3
Note that the meaning of a tanru may be fuzzy.
In a tanru, the left part is called the seltau;
it is a modifier for the rightmost brivla in the tanru,
which is called the tertau. A tanru has the place
structure of its tertau.
A tanru may be more complex, with more than two brivla.
Complex tanru have a "left-grouping rule" semantics that
can be overridden using the cmavo bo, which acts as a top-priority
operator. For example, with the following additional vocabulary:
- cmalu: x1 is small in property/dimension(s) x2 (ka)
as compared with standard/norm x3
- nixli: x1 is a girl [young female person] of age x2 immature
by standard x3
- ckule: x1 is school/institute/academy at x2 teaching
subject(s) x3 to audience x4 operated by x5
you can build the following complex tanru. Both can be rendered
in English as "this is a small girl school," but Lojban resolves the
ambiguity of the English phrase:
- ta cmalu nixli ckule ("left-grouping rule" semantics)
ta cmalu bo nixli ckule (same meaning as above)
This is a small-girl school (a school for small girls)
- ta cmalu nixli bo ckule
This is a small girl-school (a small school for girls)
A tanru may be modeled with a variant of the Composite Pattern
as shown in Figure 4.

Figure 4. Lojban tanru Basic Structure
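The grouping rules can be tried out with a small Composite-style sketch in Python. It is a rough illustration, not a real Lojban parser: the Brivla and Tanru classes and the parse_tanru function are hypothetical names, only the selbri words are parsed (without the leading ta), and a chain of several bo is grouped to the right, which is my understanding of the reference grammar rather than something stated in this article.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Brivla:
    """Leaf of the composite: a single content word."""
    word: str

    def head(self):
        return self

    def show(self):
        return self.word


@dataclass(frozen=True)
class Tanru:
    """Composite: a seltau (modifier) followed by a tertau (head)."""
    seltau: object  # a Brivla or a nested Tanru
    tertau: object  # a Brivla or a nested Tanru

    def head(self):
        # A tanru takes its place structure from its tertau.
        return self.tertau.head()

    def show(self):
        return f"({self.seltau.show()} {self.tertau.show()})"


def parse_tanru(text):
    """Group brivla left to right, letting bo bind its neighbours first."""
    units, glue = [], False
    for word in text.split():
        if word == "bo":
            if glue or not units:
                raise ValueError("bo must sit between two brivla")
            glue = True
        elif glue:
            units[-1].append(word)  # joined to the previous word by bo
            glue = False
        else:
            units.append([word])
    if glue or not units:
        raise ValueError("incomplete tanru")

    def group_right(words):
        # Assumption: a chain of bo groups to the right.
        node = Brivla(words[-1])
        for word in reversed(words[:-1]):
            node = Tanru(Brivla(word), node)
        return node

    # Default rule: left-grouping over the bo-joined units.
    return reduce(Tanru, [group_right(u) for u in units])


for selbri in ("cmalu nixli ckule", "cmalu bo nixli ckule", "cmalu nixli bo ckule"):
    tree = parse_tanru(selbri)
    print(selbri, "=>", tree.show(), "| tertau:", tree.head().word)
```

The first two inputs produce the same tree, a school for small girls, while the third groups nixli with ckule first, a small school for girls. In every case the rightmost brivla, ckule, supplies the place structure.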
Do you remember the lujvo, which is a kind of brivla?
I said a lujvo is derived from a combination of gismu.
The Lojban vocabulary is founded on a list of 1350 gismu
and building lujvo is the only way to extend this vocabulary.
A lujvo is built by contracting a tanru, and fixing
its meaning (a tanru may have an ambiguous meaning, that
will be disambiguated by its usage context).
Let's consider:
- gerku: x1 is a dog/canine of species/breed x2
- zdani: x1 is a nest/house/lair/den/[home] of/for x2
The following tanru
- gerku zdani
means "a house that has something to do with some dog or dogs."
It may mean any of the following:
- houses occupied by dogs
- houses shaped by dogs
- dogs which are also houses (e.g., houses for fleas)
- houses named after dogs
If you want the meaning "doghouse," fix it into a lujvo.
For that, you just have to combine (the exact rules won't be described
here) two of the rafsi (affix) associated with the gismu
in the basic dictionary.
- gerku has ger as rafsi (and also ge'u)
- zdani has zda as rafsi
For "doghouse," we can now build a new word from gerku zdani,
and set its meaning and place structure:
- gerzda
for which:
x1 = x1 of zdani = nest
x2 = x2 of zdani = inhabitant = x1 of gerku = dog
gerku zdani is said to be the veljvo of gerzda.
So, there's a relationship between a lujvo and a tanru
that has something to do with the rafsi of the participant
gismu. See Figure 5.

Figure 5. A More Complete tanru Model
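The link between a lujvo, its veljvo, and the rafsi can be mocked up in Python. The sketch below is deliberately naive: it glues together the first listed rafsi of each gismu and skips the real formation rules, which are not described in this article. The Lujvo class, the naive_lujvo function, and the RAFSI table (holding only the rafsi quoted above) are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Only the rafsi quoted in the text; a real dictionary lists many more.
RAFSI = {"gerku": ("ger", "ge'u"), "zdani": ("zda",)}


@dataclass(frozen=True)
class Lujvo:
    word: str
    veljvo: str    # the tanru the word was contracted from
    places: tuple  # place structure fixed when the word is coined


def naive_lujvo(veljvo, places):
    """Glue together the first listed rafsi of each gismu in the tanru.

    Assumption: the actual lujvo-making rules are more involved and are
    ignored here, so this only works for simple cases such as gerzda.
    """
    word = "".join(RAFSI[gismu][0] for gismu in veljvo.split())
    return Lujvo(word, veljvo, tuple(places))


gerzda = naive_lujvo(
    "gerku zdani",
    ("nest (x1 of zdani)", "inhabitant (x2 of zdani, x1 of gerku)"),
)
print(gerzda.word, "<-", gerzda.veljvo)
```

Keeping the veljvo on the Lujvo object records where the word came from, while the places field holds the meaning that was fixed once and for all.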
Description Sumti
A description turns a selbri place into a "description
sumti." All the x1, x2, … in the previous examples
were filled by pronouns (sumti cmavo), except in one
example, "la paris.", where an article (or gadri:
la) turns the cmene "paris." into a description
sumti. There are other gadri to use with a gismu.
Suppose I would like to say "My mother gives the green cat to the
big girl." You need something to fill the places of "give": x1 (the
donor), x2 (the gift) and x3 (the beneficiary). The cmavo "le"
directly extracts the first place of the bridi built with
a unique brivla or tanru. Combined with "se"
it extracts the second place, and with "te" the third place,
and so on. For example:
- le dunda (the donor)
- le se dunda (the gift)
- le te dunda (the beneficiary)
- le mlatu (the cat)
- le se mlatu (the species of the cat)
- le crino mlatu (the cat that has something to do with
green-ness)
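The way le, se, and te pick out a place can be sketched in Python. This is a minimal illustration that only handles a single brivla after the article, so a tanru such as "crino mlatu" is rejected. The described_place function, the PLACES table, and its role labels (paraphrased from the definitions given earlier) are assumptions made for the example.

```python
# Role labels paraphrase the place structures quoted in the text.
PLACES = {
    "dunda": ("donor", "gift", "beneficiary"),
    "mlatu": ("cat", "species"),
}

# Index of the place selected when the word follows le.
SELECTORS = {"se": 1, "te": 2}


def described_place(description):
    """Return the role that a 'le [se|te] brivla' description refers to."""
    words = description.split()
    if not words or words[0] != "le":
        raise ValueError("expected a description starting with le")
    rest = words[1:]
    index = 0  # plain le extracts the first place
    if rest and rest[0] in SELECTORS:
        index = SELECTORS[rest[0]]
        rest = rest[1:]
    if len(rest) != 1:
        raise ValueError("this sketch handles a single brivla only")
    return PLACES[rest[0]][index]


for phrase in ("le dunda", "le se dunda", "le te dunda", "le se mlatu"):
    print(phrase, "=>", described_place(phrase))
```

A lookup by index is enough here because each description refers to exactly one place of the underlying relationship.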
So:
- le mi mamta cu dunda le crino mlatu le barda nixli
My mother gives the green cat to the big girl
- le crino mlatu cu se dunda
The green cat is given (to someone by somebody)
The green cat is a gift
- le barda nixli cu te dunda le crino mlatu
The big girl is given the green cat
Somebody gives the green cat to the big girl
Note: "cu" is a cmavo used to introduce the selbri.
If not present in the first example above, "mamta dunda"
would have to be interpreted as a tanru, meaning something
like "a giver which has something to do with a mother," or a "motherly
giver." So, you need something to separate the end of the first
sumti from the beginning of the selbri: "cu"
plays this role. It is optional when the first sumti is simple,
like a sumti cmavo, but is mandatory when the first
sumti is more complex.
If you think about it, descriptors are used to turn a selbri
into a sumti. If you study Lojban, you'll see how
"events" are used to turn a whole bridi into a selbri.
These sentences are in fact object representations (instances)
of the following class diagram, which is an enhancement of Figure
1, where the selbri and sumti classes have now been
turned into interfaces.

Figure 6. Lojban Grammatical Concepts
Would You Like More of This?
This may look complex, because the explanations were very quick
and not very progressive.
Of course, a text or a dialog would use many grammatical features
that won't be described in this short article (the events, the Lojban
time/space tense system, etc., or even much simpler constructs).
If you feel like you could be interested in Lojban, check out the
documents and lessons at the www.lojban.org
Web site. The Lojban community is really friendly to beginners;
feel free to ask questions on the mailing lists.
Why Did I Create this Domain Model?
Why did I do all that diagramming? Confronted with new concepts,
I felt the need to represent them and their relationships. What
I have now is only a map of the concepts, and a lot of white spaces.
Working on a domain model is exactly this: building an enhanced
glossary of the concepts. They are mostly class diagrams, but, of
course, you can model beyond that. Modeling aids understanding,
but it has its limitations: you still have your original task at
hand. Suppose you want to build an application, for example, a dedicated
structured editor for writing Lojban texts and helping to fix them
automatically, or a translator, or a computer-aided tutorial. You may have
to build entirely different models for that, maybe reusing only
small parts of the domain model for the application design. It depends
on the application itself, and the way you analyze its use cases.
For Lojban, like in any other field, a domain model is a valuable
and essential artifact in a project, but, by definition, the domain
model doesn't depend on the project itself.
Back to the Sapir-Whorf Hypothesis
The SWH is named after two linguists, Edward Sapir
(1884-1939) and Benjamin Whorf (1897-1941). It states that the way
people think is strongly affected by their native languages. The
subject is controversial and has drawn attacks from, for example, Noam Chomsky,
the father of generative grammar. Today, the SWH [8, 9, 10, 11]
is well accepted in its weak sense.
I am not a linguist, and won't go into deep linguistic debates,
but I like this question: "Are all languages equivalent, a means
of simple communication?" Or is the SWH true: "Do languages shape
(or limit, or extend) the way we think?" If language is like a tool
to cut reality into slices, a tool to describe reality and think
about it, maybe different languages end up with different slices—more
precise in some domains and less precise in others.
Let's contemplate what a (so-called) Whorfian effect could be
like. People fluently speaking several languages all experience,
depending on the situation, that the ideas they want to express
are easier to formulate using one of their languages rather than
the others. It may depend not on the ideas themselves, but rather
on a complex interaction between the idea, the person, and the way he/she
has learnt these different languages. Let's take another example:
in the previous sentence, I have used "he/she." Some languages have
a third-person pronoun that doesn't depend on gender. The point
is that sometimes a given language, which reflects some culture,
which is an historical result of some elaboration process (which
never ends), can limit how something is communicated. Note: in Lojban,
sumti cmavo carry no indication of gender or number.
Much has been written about the SWH, and lots of flame wars have taken
place on the Internet. One conclusion is that the SWH is
almost impossible to prove (proving it was the goal at the inception of
Lojban: to observe how people would start to think differently while
learning a new and logical language). People talk more about Whorfian
effects; and about Whorfian mind-locks [5,
6], a special case when Whorfian effects
are negative.
Experiencing UML Whorfian Effects?
The SWH is very difficult to demonstrate for natural languages.
It is very difficult to invent a new culturally neutral language,
teach it to people from different cultures, and wait for Whorfian
effects to manifest.
Let's consider our software engineering "culture": we have our
own languages; we share common knowledge, problems, and solutions.
I can talk about some things with any software engineer anywhere
in the world, and feel more commonality than discussing with my
neighbor next door. But our engineering field is far less complex
than a real culture. Anyway, it already has its own history, and
has the specificity of being very focused on inventing the best
languages. Our goal is always to enhance the way we can solve our
engineering problems, by inventing programming languages, like LISP,
Prolog, Smalltalk, Ada, Erlang, and so many others; and design and
analysis languages like the UML.
I have had some system engineering discussions with people who build
very different software than I do. To find some common ground, I
suggested modeling a simple fire alarm system for a house. One person,
who was used to building software for controlling an aircraft engine
unit, considered everything as a control loop, with outputs giving
feedback for modifying what to do with the inputs. Another considered
everything as working like function chains with filters. So, even
in our software (or software intensive systems) engineering culture,
many different subcultures can be found. Each of these uses a different
representation and codifies knowledge using special languages.
From this example, and many other examples, we can infer that
there is a strong relationship between a language and an "engineering
culture" (what engineers do and think). We are very close
to the SWH here, in its weak formulation. Other questions about
such a possible Engineering SWH: Does the engineering language we
practice limit the solution space we can explore? (This is a strong
negative formulation.) Does the engineering language we practice
shape the way we engineer and determine the solutions we can imagine?
In software engineering, the UML is today's language of choice
for analysis and design. And since the UML is a language, could
there be something like a UML Whorfian effect? Faced with a complex
problem to solve, what do you visualize in your mind? If your reflex,
as an engineer, is to create (mentally or physically) UML models
and diagrams, then I think you are directly experiencing a (positive)
Whorfian effect.
Students are now taught the UML at school. The rest of us,
who have been in the field for some time, discovered OO programming,
then OO design, and then OO analysis. We read and participated in
the elaboration of methods and visual representations of our design
results and analysis results. We experienced a paradigm shift in
how we implement and think about our engineering practices. A language
is not only something we learn; it is first and foremost something we practice.
Practicing the UML gives new reflexes for solving problems.
Now, wouldn't the UML, by its very nature, eliminate solution
paths one could take to solve a problem? If this is the case, you're
experiencing a Whorfian effect, too, but a negative one. Such negative
Whorfian effects, or Whorfian constraints, should drive UML enhancements.
There is another difference between an "engineering culture" and
a "real culture": as engineers, we have much more freedom to change
our communication languages. Doing so is an engineering activity
on its own.
Be Liberated from Whorfian Mind-locks
We have seen that since the UML is a language, there could be
UML Whorfian effects (positive or negative). Positive ones would
be that learning and practicing the UML might enable us (by giving
us new mental structures) to view the (software/systems) world differently.
Or, more simply put, UML might enable us to practice engineering
differently.
Eric Steven Raymond, as a theorist of the free software movement,
is well known for "The Cathedral and the Bazaar," [12]
and less known for "Tolkien's Tengwar: A romantic orthography for
Lojban" [7]. In his Jargon File [5],
Raymond defines Whorfian mind-locks (Jeff Prothero's term [6]):
"Software designs are sometimes restricted in avoidable ways
by mental habits a developer has picked up from a particular language
or environment (perhaps a now-obsolete one) and never discarded."
An example of that is the well-known joke:
"Good FORTRAN programmers can program in FORTRAN with any programming
language."
Maybe the UML could liberate us from some Whorfian mind-locks.
What would a UML Whorfian effect feel like? Actually, nobody really
knows. I think that it almost happened to me when I started to learn
Lojban. Be warned that it could happen to you, dear super-modelers.
Then share it when it does! Maybe it simply would instantiate itself
as a release of some old Whorfian mind-lock.
Be warned, too, that the UML is not the "final word" in software
engineering. Don't get caught in UML mind-locks when it comes to
imagining new solutions for new problems.
ki'e .i co'o
To finish, just in case you're lost in Lojbanistan during your
next holidays, here is a Lojbanic survival kit:
- coi (hello)
- mi na jimpe (I don't understand)
- mi xagji (I am hungry)
- ma do cmene (what's your name?)
- mi prami do (I love you)
- ki'e (thank you)
- co'o (bye)
- ko ko kurji 6 (take care of yourself)

Acknowledgements
Catherine Southwood really helped improve the English in this
article, and made many useful suggestions. Many thanks to her! .i
ki'e doi. katrin.
References
Books
[1] John Cowan. The Complete Lojban Language.
A Logical Language Group Publication, 1997. (Partially available
online.)
[2] Nick Nicholas and John Cowan. What is Lojban?
.i la lojban. mo. A Logical Language Group Publication, 2003.
[3] Robin Turner and Nick Nicholas. Lojban for
beginners. http://www.opoudjis.net/lojbanbrochure/lessons/book1.html
Web Sites
[4] The Lojban official website: http://www.lojban.org
Online Articles
[5] Eric Steven Raymond's Jargon file extract:
http://catb.org/~esr/jargon/html/W/Whorfian-mind-lock.html
[6] Jeff Prothero's original thought: http://www.lojban.org/files/papers/4thtense
[7] Eric Steven Raymond's article: Tolkien's Tengwar:
A romantic orthography for Lojban http://catb.org/~esr/tengwar/lojban-tengwar.html
[8] What is Lojban? (and the SWH): http://www.lojban.org/files/draft-textbook/lesson01
[9] Lojban and the SWH, discussions: http://www.lojban.org/files/why-lojban/swh.txt
[10] Presentation of the SWH and compilation
of links: http://www.usingenglish.com/speaking-out/linguistic-whorfare.html
[11] And of course, Lojban, UML, and the SWH can
be found in the Wikipedia: http://www.wikipedia.org
Other Web Sites and Articles
[12] http://www.catb.org/~esr/writings/cathedral-bazaar/
Eric Steven Raymond's seminal essay about the open-source hacker
culture.
[13] http://www.uea.org
The World Esperanto Association.
[14] http://www.langmaker.com
about Model Languages & The Art of Language Making (Conlang).
[15] http://www.elvish.org
The Elvish Linguistic Fellowship.
[16] http://www.kli.org
The Klingon Language Institute.
[17] Wanted: A World Language, by Edward Sapir,
1931: http://www.langmaker.com/sapir.htm
Endnotes
1. See www.lojban.org [4]
2. See www.uea.org [13]
3. See www.langmaker.com [14]
4. See www.elvish.org [15]
5. See www.kli.org [16]
6. ko ko kurji is the same as ko kurji ko (only the sumti order counts in a bridi, not their absolute place). ko is the imperative do. From the Lojban FAQ: "ko kurji do" commands "Take care of you(rself)" but "ko kurji ko" commands both that "You take care of yourself," and "Allow yourself to be taken care of by you," with a resulting double emphasis that indicates an especial priority or responsibility for self-focus.