Necessary universals of language that can be deduced from its role in communication

1. Introduction

The belief in the innateness of language rests on several arguments. One of them is that all human languages obey certain rules, which are called universals. It is claimed that the fact that all human languages obey these universals shows that they are built into human genetic information, because otherwise languages would diverge. However, languages are a means of communication between humans, and this puts some limits on their divergence. Therefore, any universal that may be a result of these restrictions does not require any innate information to explain it, and only universals that cannot be explained in this way can be used for the argument. Hence, it is important to determine which universals are a result of the fact that languages are a means of communication among humans. However, this question is rarely, if ever, discussed. In the rest of this text I discuss this question and suggest which universals are required for communication, and whose existence in all languages therefore cannot be used as evidence that they are innate.

2. The questions to answer, basic assumptions and methodological problems

I am going to discuss two questions.

Question 1. What are the rules that must be obeyed by all languages used for communication by systems that are as intelligent as humans{1}?
Question 2. What are the rules that must be obeyed by all languages used for communication by systems that are as intelligent as humans and that have the known physiological and perceptual characteristics of humans?

In the 'known physiological and perceptual characteristics' I include anything that is (almost) universally accepted as such. For example, the limit on the amount of phonological information that can be handled at the same time is one such characteristic.

The first question is more general, and would be important in preparing to meet other intelligent races. For understanding human cognition, the second question is the important one.

The most serious danger in this kind of discussion is that of making ad-hoc assumptions based on the only intelligent systems that we are familiar with (humans). To try to avoid this danger, I explicitly discuss every non-trivial assumption that I introduce.

The basic assumptions that I make are:

We have a group of intelligent systems.
The group continually develops new areas of mental activity, and increases and changes the scope of existing areas of mental activity.
The individuals of this group cooperate well enough that they reach far better results by cooperation, and cooperation is important in the success of the group.
Cooperation requires efficient communication.
The individuals do not have a 'magical' way of communication, and hence require physical means of communication. This means that communication requires some effort and time.
The medium that is used for communication is not the medium of the primary sensory input.
There is a limit on the amount of medium-specific information that an individual can handle at the same time. For Question 1, this limit may be of any value. For Question 2, I assume it is quite small. This assumption includes the assumption that there is a limit on the time resolution at which individuals can identify signals reliably.
There is a limit on the amount of any information (non-medium-specific) that an individual can handle and mentally operate on at the same time.

3. Basic units of communication

For communication to take place, a communication event in which individual A (the sender) communicates a message X to B (the receiver) has to lead to a consistent correlation between what A is thinking about and what B is thinking about. (Note that the way this 'thinking about' is implemented is irrelevant for communication.) For this to happen, X must have a consistent correlation with what A thinks about, and B must be able to interpret X consistently. Thus X must be used in a consistent way, and the first requirement for communication is a set of messages that are used in a consistent way. I will refer to the way a message is used as the meaning of the message.

In general, messages can be sequential strings of other messages, but there must be some subset of messages which cannot be decomposed this way. I will call this subset the basic units of the language. The meaning of a basic unit must be known by each individual directly, so each individual must have a way of finding the meaning of a basic unit directly from its sensory input.

The minimum length of the basic units is the resolution of the perception of a signal in the medium. To be reliably understood, they have to be significantly longer than this minimum. The maximum length of the basic units has to be smaller than the amount of medium-specific information that an individual can handle at the same time.

The number of basic units may be restricted by several factors:

The time and other resources that it takes to learn the association between a meaning and a basic unit.
The number of such associations that an individual can efficiently cope with.
The number of reliably differentiated units within the limits of their length.

The first two factors are unlikely to strongly limit the number of basic units. The association of a basic unit with a meaning is a simple association, and an intelligent system must be able to learn and deal efficiently with a large number of these. The third factor may be significant if the limit is very small, but seems to be irrelevant in the case of humans.
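
The third factor can be illustrated with a rough count. The sketch below is only an illustration, under assumptions that are not part of the argument above: the medium offers a fixed number of reliably distinguishable elementary signals (n_signals), and a basic unit is any sequence of between min_len and max_len of them. The function name and all the numbers are hypothetical.

```python
def count_distinct_units(n_signals: int, min_len: int, max_len: int) -> int:
    """Count sequences of elementary signals with length in [min_len, max_len].

    Assumption (not from the text): every such sequence is a usable,
    reliably distinguishable basic unit.
    """
    if n_signals < 1 or min_len < 1 or max_len < min_len:
        raise ValueError("expected n_signals >= 1 and 1 <= min_len <= max_len")
    return sum(n_signals ** length for length in range(min_len, max_len + 1))


# Hypothetical numbers: 30 distinguishable signals, units of 2 to 6 signals.
print(count_distinct_units(30, 2, 6))
# A very restricted medium: 2 signals, units of 1 to 3 signals.
print(count_distinct_units(2, 1, 3))
```

With the first set of made-up numbers the count runs to hundreds of millions, far more than any vocabulary needs. The two-signal medium allows only 14 units, which is the kind of case in which the third factor would become significant.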

Below the limits above, where finding the meaning of a basic unit is fast, processing a basic unit is much more efficient than processing a message that is made of several basic units, because it does not require cognitive operations to combine the meanings of the basic units. Thus we would expect the number of basic units to grow to these limits, rather than stay small.

Combinations of basic units

The number of different ideas that intelligent systems can reasonably think about is huge, effectively infinite, because these systems combine concepts together in an unrestricted way. It is not possible to match this variety by learning basic units, so to be able to serve as a communication tool effectively, a language must be able to form combinations, too. Hence, most of the messages in the language must be combinations of basic units.

In general, a group of ideas, each of which is expressed by a basic unit, can be combined to give several meanings. Therefore, to understand a combined message unambiguously, the receiver must have a way to decide how to combine the meanings of the basic units into the meaning of the complete message. The information for this decision can be conveyed by any combination of several methods:

The meaning of the basic units already gives some information about the possible combination. This information is always available, so it is cheap, and hence we should expect it to be used in almost all languages. It is limited to the kind of information that can be deduced from the meaning of the basic units in each specific message, so its contents are very diverse.
The order of the basic units in the message. This is also cheap, but the amount of information in the order of the basic units is relatively small. Hence we should expect the order to be used in almost all languages, and to be used to determine features that need to be determined very frequently.
Modifying basic units in an arbitrary fashion. This requires users of the language to learn the meaning of each individual modification, so it is like adding new basic units. This would be expected to happen relatively rarely.
Modifying basic units in a rule-like fashion. This requires less specific learning, but requires the users to recognize some input as a basic unit with a modification. This is restricted to modifications that still allow the identification of the basic unit, which may restrict the number of possible modifications.
Adding specialized basic units, which define the proper combination. This method is unlimited, but it has a cost, because it requires more basic units per message. Therefore we should expect these specialized basic units to be easy to communicate (normally that would mean they are short).
A hybrid of the last two methods, i.e. modifying basic units by adding something.
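
How the first two sources of information narrow down the possible combinations can be sketched as follows. The example is a toy under stated assumptions: a two-role structure (actor and target), a hypothetical lexicon that marks which basic units denote things that can act, and a hypothetical order rule 'the actor precedes the target'. None of these names or rules come from the text above.

```python
from itertools import permutations

# Hypothetical lexicon: which basic units denote things that can act.
CAN_ACT = {"dog": True, "man": True, "bone": False}


def readings(objects, use_meaning=False, use_order=False):
    """Return the (actor, target) readings of a message that survive the filters.

    objects: the object-denoting basic units, in the order they were sent.
    use_meaning: discard readings whose actor cannot act (lexical information).
    use_order: discard readings whose actor does not precede its target.
    """
    survivors = []
    for actor, target in permutations(objects, 2):
        if use_meaning and not CAN_ACT[actor]:
            continue
        if use_order and objects.index(actor) > objects.index(target):
            continue
        survivors.append((actor, target))
    return survivors


# "dog bites bone": the meaning of the units alone leaves a single reading.
print(readings(["dog", "bone"], use_meaning=True))
# "dog bites man": meaning leaves two readings, and the order rule picks one.
print(readings(["dog", "man"], use_meaning=True))
print(readings(["dog", "man"], use_meaning=True, use_order=True))
```

In this toy the meaning filter costs nothing extra to transmit but only helps when the units happen to differ in the relevant property, while the order rule always applies but settles only one binary choice.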

To understand the message correctly the receiver must use the information in the way the sender intended, so they must share the knowledge of how to do this. Thus there must be some set of rules of combinations, which the users of the language must know.

The kinds of combinations of ideas that intelligent systems can think of, and therefore may need to express, are not limited, so the rules of the language must not limit the kinds of combinations that can be expressed. The easiest way to achieve this, and maybe the only general way, is to make it always possible to modify a message by adding more information to it. In other words, the rules should allow combining any message with further information. A possible restriction on this flexibility is the limit on the amount of information that an individual can handle mentally at the same time. However, there is no need for rules to exclude combinations that exceed this limit, because such combinations are difficult to produce, and therefore will not be part of the language anyway. Hence, we should not expect any explicit limits on combinations.
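
The open-ended rule 'any message can be combined with further information' can be sketched as a recursive data structure. This is an illustrative sketch only: the Message class, its methods, and the example words are hypothetical and are not part of the argument.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    """A basic unit plus any number of attached sub-messages (hypothetical model)."""
    unit: str
    additions: List["Message"] = field(default_factory=list)

    def extend(self, extra: "Message") -> "Message":
        """Return a new message that carries everything in self plus `extra`."""
        return Message(self.unit, self.additions + [extra])

    def depth(self) -> int:
        """Nesting depth; the structure itself places no upper bound on it."""
        return 1 + max((m.depth() for m in self.additions), default=0)

    def render(self) -> str:
        """Show the nesting with parentheses."""
        if not self.additions:
            return self.unit
        inner = " ".join(m.render() for m in self.additions)
        return f"{self.unit} ({inner})"


msg = Message("dog")
msg = msg.extend(Message("big"))
msg = msg.extend(Message("chased").extend(Message("cat").extend(Message("black"))))
print(msg.render())
print(msg.depth())
```

Nothing in the structure forbids further extension. In practice, any limit on depth would come from the user's capacity rather than from a rule.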

Meaning of basic units, modifications and messages

Since language is mostly used to describe the real world, for efficient communication the meanings of the basic units should correspond to the typical entities of the real world. What are these?

At the macro level, where quantum mechanical effects are negligible, the world seems to be made of objects, which have attributes, act in some way, and affect the attributes of some other objects (I use here the word 'attribute' in its widest meaning). Thus the basic message mostly associates an object with some attribute(s), action or effect. Therefore we would expect the bulk of the basic units to correspond to one of these, i.e. to be nouns, adjectives or verbs (and adverbs), and the rules of the language to be about combining these kinds of basic units.

In addition, as mentioned above, we would expect a language to contain specialized basic units that are used to determine how to combine the meaning of the basic units into the meaning of the full message.

References to objects in the real world are the most demanding part of the language, because the identity of the object is unrestricted, and the object is external to the language. On the other hand, attributes and effects are restricted by the kind of objects they are associated with. Thus the language must have tools to make it easier to identify objects, which may be special basic units or special modifications. The most important problem is multiple references to the same object, which are not only expensive, but also add to the effort of the receiver the task of figuring out whether they really refer to the same object or not. Thus the language must contain means of easy identification of repeated references to the same object, which should also make such references cheap in terms of time and cognitive effort.

The structure of a typical message

The typical length of a meaningful message is ultimately restricted by the cognitive abilities of the individual, i.e. the amount of information that she/he can handle at the same time. If this limit is large, then the length of meaningful messages would be very variable, restricted by the amount of information the sender wants to deliver, the amount of time she/he has to do it, and how long she/he can expect the receiver to keep receiving. If the limit is small, it will determine the typical length of a complete message. In humans, the limit seems to be quite small, corresponding to a medium-length sentence. Much longer sentences can be understood only if they are easily decomposable into shorter sentences.

The limit on the amount of medium-specific information that an individual can handle makes it easier to handle messages that fit within this limit. If this limit is smaller than the cognitive limit, it would be easier for individuals to understand messages that can be decomposed into sub-messages, each of which fits within the medium-specific limit. If this limit is larger than the cognitive limit, it probably will not have any effect.

Completeness and perfectness of language

While the rules of the language should allow expression of any idea that the systems that use it would like to express, there is no reason why they should allow interpretation of every possible signal. Thus many possible signals will have no interpretation in the language. That includes both signals that cannot be interpreted as basic units and sequences of basic units that cannot be combined using the rules of the language.

Because intelligent systems continually develop new areas of mental activity, the language must continually evolve to deal efficiently with the new areas. The evolution is close to being 'Darwinian', because the changes, while not completely random, are not based on an understanding of the language or on an effort to conserve its global efficiency. As a result, languages continually diverge from being an optimal communication tool, and are pushed back towards the communication optimum only when their efficiency starts to fall significantly.

In addition, many other forces act on languages. These forces are not consistent, yet may have quite large effects. In humans, at least the following forces are significant:

Using language to define the identity of a person, e.g. nationality, class etc. This is normally done by introducing features that don't have any communication value.
Using language for artistic expression. This is normally done by messages that leave a lot of room for interpretation, i.e. they are not optimal communication.
Using language for dis-communication. In many cases, the sender of a message does not really want to communicate, and the language evolves tools to allow this.
Explicit efforts to improve language. These are sometimes positive, but not always.

The result of the continuous evolution of the language and the additional forces is that languages can never be at a communication optimum, and will always be full of quirks and oddities.

Summary

To summarize, the following rules (at least) apply to all languages that are used for communication:

A language must have rules.
A language must have basic units, which have their own meaning. For spoken language, these are the words (phrases with a meaning that cannot be deduced from the basic units that make them up can also be regarded as basic units).
Most of the messages are combinations of several words, according to some rules (grammar).
The basic structure of most messages is the association of some object with an attribute or an effect.
The rules of combination are expressed by any combination of: the meaning of the words, word order, word modifications, special words, and modification of words by adding something.
Semantic information is normally used, whenever it is available, to decide on the appropriate combination.
Word order is used for simple and frequently used features of the grammar.
The grammar has to allow combining any message with more information. (This includes the so-called 'recursion').
The main bulk of words mean either an object, an attribute, an action or an effect.
The language must have tools to make identification of objects efficient.
The language must have a way to signify repeated reference to the same object, and make it cheap.
Languages are far from optimum.

Any language that is used for communication by humans must have these features, independently of any innate rules. Therefore, finding these rules in all languages does not support their innateness.

Other, more specific rules are more difficult to predict directly from the communicative role of language, because our understanding of language and of human cognitive performance is not good enough, but they seem to be plausible. These include:

The fixed location of the head in a phrase. This is an example of using word order for combining the meanings of words into the meaning of the complete phrase. A fixed location across different types of phrases is presumably less confusing than a different word order for each kind of phrase.
Fixed order of subject, verb, object. Again, a use of word order.

---------------------- Notes ----------------------

{1} For the definition of "intelligent as humans", I am using my own 'Simplified Turing Test': "A system is intelligent as humans if, after extensive examination, of the people who believe that such systems can exist, more than 96.12% intuitively believe that it is intelligent as humans."

Yehouda Harpaz
yh@maldoo.com
2Nov96
http://human-brain.org/