Human Sentence Processing
Some Assumptions

No one knows how the human brain processes language, but certain reasonable assumptions can be made about human sentence processing. 

These assumptions, whether valid or not, provided a useful basis for modeling the Logos System.  

Among these assumptions are:

First Assumption

Human sentence processing is opportunistic rather than algorithmic.

There is no master control circuitry--no homunculus sitting somewhere in the brain--that "manages" human sentence processing in the manner of an algorithm (i.e., a logic-driven sequence of steps).

In the Logos System, sentence processing is also non-algorithmic, i.e. it is based on declarative information about language rather than upon sequences of procedural steps.
In the Logos System, the question of how such declarative information is to be applied to the input stream is solved in a novel, symbolic/neural net paradigm.  (See Logos Model/Neural Net Architecture.)
In both the brain and the Logos System, the only thing that might be said to control sentence processing is the sentence itself. 
Effectively, the sentence is the algorithm.
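The following sketch (Python, purely illustrative; the rule set, category names, and driver are our own invention, not the actual Logos code) shows one way to read "the sentence is the algorithm": the rules are declarative data, the driver contains no sentence-specific logic, and which rules fire, and in what order, is determined entirely by where the input happens to match.

```python
# A minimal sketch, assuming invented categories and rules.  The driver has
# no sentence-specific control flow; the input alone decides what happens.

RULES = [
    # (pattern of categories, resulting abstraction) -- illustrative only
    (("DET", "NOUN", "NOUN"), "NP"),
    (("PREP", "NP"), "PP"),
]

def apply_declarative_rules(stream):
    """Repeatedly rewrite the stream wherever a stored pattern matches."""
    changed = True
    while changed:
        changed = False
        for pattern, label in RULES:
            n = len(pattern)
            for i in range(len(stream) - n + 1):
                if tuple(stream[i:i + n]) == pattern:
                    stream = stream[:i] + [label] + stream[i + n:]
                    changed = True
                    break
            if changed:
                break
    return stream

print(apply_declarative_rules(
    ["VERB", "DET", "NOUN", "NOUN", "PREP", "DET", "NOUN", "NOUN"]))
# -> ['VERB', 'NP', 'PP']
```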

 

Second Assumption

Human sentence processing is effected by memory associations stored in human memory--i.e. by the content and organization of human memory and all its associations--as memory reacts to the input signal.

The process has nothing to do with fixed sequences of logical operations.

Thus, human sentence processing will vary from sentence to sentence as the memory associations vary.  For one linguistic situation, the brain may rely more upon semantic associations; for another, upon syntactic associations; for still another, upon stored chunks of hybrid linguistic data.

In this latter case, associations involving the chunks have become so familiar that short cuts to them have sprung up (to satisfy the law of least effort).

In the Logos System, sentence processing is also effected solely by the content and organization of associative memory.
In the Logos System, associative memory means the words and structures of a given language (and their associations) that have been learned through exposure to that language and assimilated into memory.  This assimilation, of course, is effected by human developers, who work inductively with representative text samples of the given language.
The rules the Logos System uses to process sentences consist entirely of semantico-syntactic patterns found in actual text.  These patterns capture the semantico-syntactic associations learned about the given language from this sampling. 
Here too, the parts and types of memory involved in processing a sentence vary from one linguistic situation to another, in true opportunistic fashion, now veering more towards syntax, now towards semantics.   
In sum, in both the brain and the Logos System, associative memory itself is the processor, as this memory reacts opportunistically to the input signals.  See Mental Model.
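A minimal sketch of this assumption follows (all entries invented): the lookup itself is the processing. A familiar chunk takes its stored shortcut; anything else falls back to word-by-word associations, so the route taken is decided by what memory holds rather than by a fixed procedure.

```python
# Illustrative only; MEMORY_CHUNKS and MEMORY_WORDS are invented entries.

MEMORY_CHUNKS = {("kick", "the", "bucket"): "VP(die)"}      # familiar chunk
MEMORY_WORDS = {"kick": "VERB", "the": "DET",
                "bucket": "NOUN", "ball": "NOUN"}

def process(words):
    key = tuple(words)
    if key in MEMORY_CHUNKS:                                  # shortcut route
        return [MEMORY_CHUNKS[key]]
    return [MEMORY_WORDS.get(w, "UNKNOWN") for w in words]    # word-by-word route

print(process(["kick", "the", "bucket"]))   # ['VP(die)']
print(process(["kick", "the", "ball"]))     # ['VERB', 'DET', 'NOUN']
```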

 

Third Assumption

The brain abhors complexity and seeks wherever possible to simplify its processes, driven in this by the law of least effort. 

The brain achieves simplification largely through the use of abstraction.

The brain cell (neuron) by its very architecture would appear to be an abstraction device, with many dendrites for input and a single axon for output.

The brain clearly employs abstraction in dealing with the complexity of natural language.  No one knows how the brain represents language internally, but there are hints from language acquisition.

The Logos System relies upon abstraction as its chief device for coping with complexity.
Abstraction is the principle upon which the SAL Representation Language of the Logos System is based.
The semantico-syntactic patterns of the Logos System knowledge base are patterns at several levels of abstraction and thus enjoy the power of generality.
In both the brain and the Logos System, the problem of complexity of natural language is handled by dealing with natural language at the level of semantico-syntactic abstractions.
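The sketch below illustrates the point with a tiny, made-up taxonomy (the actual SAL class inventory is far larger and richer): a single pattern written against an abstract class covers every concrete word filed under that class, which is where the "power of generality" comes from.

```python
# Illustrative only: a toy, SAL-like taxonomy with invented classes.

TAXONOMY = {
    "coffee": "POTABLE", "tea": "POTABLE", "water": "POTABLE",
    "letter": "INFORMATION", "report": "INFORMATION",
}

def abstract(word):
    return TAXONOMY.get(word, "THING")

PATTERN = ("drink", "POTABLE")            # one abstract pattern, many sentences

def matches(verb, noun):
    return (verb, abstract(noun)) == PATTERN

print(matches("drink", "coffee"))   # True
print(matches("drink", "tea"))      # True
print(matches("drink", "report"))   # False
```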

 

 

Fourth Assumption

Humans acquire language by means of the general mechanism for learning, viz. the assimilation of what is unknown to what is already known.

The language learner assimilates new, unknown patterns by associating them with already known patterns.  This assimilation occurs when some resemblance (or distinction) between the unknown and the already known is recognized. 

Language learning advances dramatically when the brain discerns commonality in differing, already known linguistic patterns and thus, through the power of abstraction, begins to generalize these patterns. 

When this recognition/assimilation process is effected at a sufficiently abstract level, the learning process becomes very efficient.   

This abstraction mechanism, in part at least, might well explain how virtually infinite language capacity may result from merely finite language exposure.

The assimilation of the unknown to the already known perfectly describes the Logos approach to sentence analysis.
In effect, an unknown input sentence is "learned" (decoded) by associating the various parts of the unknown sentence with already known (stored) semantico-syntactic patterns which, at some level of abstraction, they resemble.
These already known, stored patterns are the product of prior, finite exposure to language by Logos System developers who have generalized what they have seen in language (simulating presumed brain processes).
The capacity to discern analogy among diverse phenomena is fundamental to this generalization process.
Seeing that x (the unknown) is like y (the already known) entails seeing them under a common (more general or abstract) classification.
The SAL Representation Language, being an abstraction language, supplies the classifications by which an unknown  x can be linked to an already known y in an efficient manner.
The stored, semantico-syntactic (SAL) patterns of the system's memory, being abstract, encompass virtually all actual and potential sentences of a given source language.
Thus, after generalizing its language exposure as a finite set of abstract, semantico-syntactic patterns, the system's memory is prepared to deal with (decode) an infinite variety of new, previously unseen sentences.
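A minimal sketch of this assimilation, with invented classes and patterns: an input never seen before matches no stored string literally, but once its words are lifted to their classes it is recognized via a known abstract pattern, which is how finite exposure can cover an unbounded variety of new sentences.

```python
# Invented classes and patterns, for illustration only.

CLASS_OF = {"dog": "ANIMAL", "cat": "ANIMAL", "boy": "HUMAN", "girl": "HUMAN"}

STORED_PATTERNS = {
    ("the", "HUMAN", "fed", "the", "ANIMAL"): "feeding-event",
}

def decode(words):
    literal = tuple(words)
    if literal in STORED_PATTERNS:                     # already known as-is
        return STORED_PATTERNS[literal]
    lifted = tuple(CLASS_OF.get(w, w) for w in words)  # assimilate via classes
    return STORED_PATTERNS.get(lifted, "no analogy found")

# A sentence never seen before is still recognized via the abstract pattern.
print(decode(["the", "girl", "fed", "the", "cat"]))    # feeding-event
```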

 

Fifth Assumption

In the brain, syntactic processing and semantic processing are integrated rather than separate, sequential processes.

Evidence for this is seen in the fact that we often understand half-uttered sentences, and can often complete the sentences begun by others.

This further suggests that human sentence processing must be deterministic, producing a single parse rather than a syntactic parse forest that must then be semantically pruned.

In the Logos System, the SAL Representation Language integrates semantics and syntax, seeing them as the two extremes of a continuum.
The integration of syntax and semantics in SAL allows syntactic processing and semantic processing to be conducted simultaneously.
By applying semantics to the syntactic parse, the Logos System is able to produce a single, deterministic parse rather than a parse forest which must then be pruned by means of semantics.  
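The sketch below illustrates the difference, using the classic prepositional-phrase attachment ambiguity; the association table is invented. Because the semantic association is consulted at the moment of attachment, only one analysis is ever built, so there is no forest to prune afterwards.

```python
# Illustrative only; INSTRUMENT_OF is an invented association table.

INSTRUMENT_OF = {"saw": {"telescope"}, "cut": {"knife"}}

def attach_pp(verb, obj_noun, pp_noun):
    """Decide PP attachment once, consulting semantics during parsing."""
    if pp_noun in INSTRUMENT_OF.get(verb, set()):
        return f"({verb} ({obj_noun}) (with {pp_noun}))"    # PP modifies the verb
    return f"({verb} ({obj_noun} (with {pp_noun})))"        # PP modifies the noun

print(attach_pp("saw", "man", "telescope"))   # one parse, instrument reading
print(attach_pp("saw", "man", "dog"))         # one parse, modifier reading
```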

 

 

Regarding Translation Proper

So far we have not discussed how translation per se occurs.  We hypothesize, however, that the same faculty of associative memory applies.  In effect, in learning a foreign language, the brain associates target language patterns with the semantico-syntactic patterns of the source language to which they are equivalent.  The Logos Model follows this methodology, using contrastive linguistics (at the semantico-syntactic level) as the basis of its transfer modules.

(As target language fluency increases, source language mediation no doubt begins to drop out.  A very distant approximation of this may be said to occur in the Logos Model as well, as the source language sentence is re-expressed in the SAL metalanguage.  Target generation takes place, then, not from source strings but from SAL strings, which loosely approximate a conceptual metalanguage.  This realization is quite imperfect, to be sure, but it is indicative of the orientation of the Logos Model.)
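A very rough sketch of that orientation follows; the intermediate structures and the French transfer entries are invented for illustration and are not the actual Logos transfer rules. The point is only that generation reads from a SAL-like intermediate stream rather than from the source string itself.

```python
# Illustrative only; the SAL-like stream and French transfers are invented.

def analyse(_sentence):
    # Stand-in for analysis: in the real system this would be a
    # semantico-syntactic (SAL) stream derived from the input sentence.
    return [("VERB", "remove"), ("NP", "dishes"), ("PP", ("off-of", "table"))]

FRENCH_TRANSFERS = {
    ("VERB", "remove"): "enlever",
    ("NP", "dishes"): "la vaisselle",
    ("PP", ("off-of", "table")): "de la table",
}

def generate(sal_stream, transfers):
    # Word order, agreement, etc. are ignored in this sketch.
    return " ".join(transfers[item] for item in sal_stream)

print(generate(analyse("take the dishes from the kitchen table"),
               FRENCH_TRANSFERS))
# -> enlever la vaisselle de la table
```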

 

 

Mental Model as a Basis for the Logos Model

The graphic of the Mental Model, shown below, illustrates conjectures regarding the role of associative memory in human sentence processing.

Key Characteristics of the Mental/Logos Model

1.  The brain reacts opportunistically to language input, using whatever resources it needs to derive intelligence from the input signals.  Resources are drawn into this process by memory associations.  Because these associations differ from person to person, the same sentence will be processed differently by different people.

2.  To derive intelligence from the input signal, the brain must first deal with signal ambiguity and complexity.  The prevalence of fan-out and fan-in circuitry in the anatomy of the brain suggests that analysis (fan-out) and synthesis (fan-in) are fundamental to its operations.  (See graphic below for illustration).

3.  The processor for accomplishing this work is associative memory itself and its organization (interconnections).  There is no circuitry devoted specifically to managing sentence processing, although there is an architecture that supports this work.

4.  Sentence processing is done in incremental stages along a pipeline architecture, vaguely akin to the incremental processes of the visual cortex.

5.  Syntax and semantics are not handled in separate operations but are rather different aspects of an integrated process. Syntax and semantics form a semantico-syntactic continuum. 

6.  Memory is content-addressable. This explains why mental processes do not slow down as the knowledge store increases. It is clear that of the billions of cells in the brain, only those which should become involved with a given input signal do in fact become involved.  (A minimal sketch of content-addressable lookup appears below.)

The above are the properties of the Mental Model which we sought to emulate in the Logos Model.
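The sketch below illustrates the content-addressability point noted in characteristic 6; a Python dictionary is only a stand-in for whatever the brain actually does. The input itself is the address, and retrieval cost stays essentially flat as the store grows.

```python
# Illustrative only: a hash lookup as a stand-in for content-addressable memory.

import time

def build_store(n):
    return {f"pattern-{i}": f"response-{i}" for i in range(n)}

for size in (1_000, 1_000_000):
    store = build_store(size)
    t0 = time.perf_counter()
    for _ in range(100_000):
        _ = store["pattern-42"]        # the content addresses the memory
    elapsed = time.perf_counter() - t0
    print(f"{size} entries: {elapsed:.4f} s for 100,000 lookups")
```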

_______________________________________________________________________

Mental Model

[Diagram: the Mental Model of human sentence processing.  The numbered steps discussed below correspond to points in this diagram.]

For a comprehensive discussion of human sentence processing and the Logos System, request the Technical Report, Linguistic Overview of the Logos System, from Logos Customer Support, or directly from its author (bscott@logos-usa.com).

              

Step One - Decoding the noun phrase the kitchen table

Two distinct, familiar operations are involved here:  (a) the string the kitchen table is captured and reduced to a more manageable semantico-syntactic abstraction, much in the manner of a bottom-up parse (except that the resultant entity here is semantico-syntactic); (b) the word table is disambiguated on the basis of its association with the word kitchen.

Operation (a) employs a fan-in circuit which in effect causes the string to be rewritten as a semantically labeled NP.  Operation (b) employs a fan-out circuit which links table to its various meanings ("information", "geological formation", "support surface").  The surmise here is that the association of kitchen with table favors synapse with "support surface".
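A minimal sketch of Step One follows; the sense inventory and association weights are invented. The candidate senses of table fan out, the modifier kitchen selects one of them (fan-in), and the string is rewritten as a single, semantically labeled NP.

```python
# Illustrative only; senses and weights are invented.

SENSES = {"table": ["information", "geological formation", "support surface"]}
ASSOCIATION = {
    ("kitchen", "support surface"): 1.0,
    ("kitchen", "information"): 0.1,
}

def decode_np(det, modifier, noun):
    # fan-out: all candidate senses; fan-in: the modifier picks the winner
    winner = max(SENSES[noun], key=lambda s: ASSOCIATION.get((modifier, s), 0.0))
    return f"NP({winner})"

print(decode_np("the", "kitchen", "table"))   # NP(support surface)
```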

Step Two - Decoding the preposition from

The preposition from has a number of different meanings (taking quite different lexical transfers in Russian, for example).  The surmise here is that a strong association exists between one of these meanings, namely "off of", and the NP("support surface") object of the preposition, thus effecting resolution. 

Note the abstract nature of the foregoing operation.  Had the object of the preposition from differed somewhat here, e.g. from the bathroom floor or from the closet shelf, the very same circuitry (all involving "support surface") would nevertheless have applied.  We assume that semantico-syntactic generalizations of this sort typify human sentence processing.

As in Step One, fan-in circuitry here concatenates from and NP("support surface") into a single semantico-syntactic prepositional-phrase entity, further simplifying the sentence in the manner of a bottom-up parse.
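Continuing the sketch from Step One (senses and weights again invented): the candidate senses of from fan out, the semantic label of its NP object selects "off of", and fan-in rewrites the pair as a single PP entity.

```python
# Illustrative only; senses and weights are invented.

FROM_SENSES = ["source", "off of", "because of"]
PREP_ASSOCIATION = {
    ("support surface", "off of"): 1.0,
    ("support surface", "source"): 0.3,
}

def decode_pp(prep, np_label):
    winner = max(FROM_SENSES,
                 key=lambda s: PREP_ASSOCIATION.get((np_label, s), 0.0))
    return f"PP({winner}, NP({np_label}))"

print(decode_pp("from", "support surface"))   # PP(off of, NP(support surface))
```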

Step Three - Decoding the verb take

The verb take has several different meanings, only one of which has a strong association with from ("off of").  This association enables the brain to resolve take to the sense of remove.
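The same mechanism, sketched one link further along the chain (sense list and weights invented): the resolved "off of" sense of from is what disambiguates take to remove.

```python
# Illustrative only; senses and weights are invented.

TAKE_SENSES = ["grasp", "remove", "ingest", "accept"]
VERB_ASSOCIATION = {("off of", "remove"): 1.0, ("off of", "grasp"): 0.2}

def decode_verb(verb, pp_sense):
    return max(TAKE_SENSES,
               key=lambda s: VERB_ASSOCIATION.get((pp_sense, s), 0.0))

print(decode_verb("take", "off of"))   # remove
```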

As a result of the three foregoing operations, the brain has achieved both clarity and simplification regarding the first clause of our input sentence.  The presumption is that when no further ambiguity remains and no further simplification is necessary, the brain can be said to understand the input.

 

The assumptions and conjectures regarding this Mental Model for human sentence processing significantly influenced the design and implementation of the Logos Model.  See Logos Model as a Symbolic Neural Net in the section Introducing the Logos Model.