1 Sep 2016

Kennedy, “Metaphor in Pictures,” summary


by Corry Shores







[The following is summary. All boldface and bracketed commentary is my own. Please forgive my distracting typos. Proofreading is incomplete.]




Summary of


John Kennedy


“Metaphor in Pictures”





Brief summary:

Images can operate like figures of speech, and as such they are called “pictorial metaphors.” There are certain established ways, either natural or cultural, of depicting things. When a drawing adheres to these standards or conventions for visual representation, it is literal. (Another way to understand literal depiction is as an accurate depiction of the objective reality of a situation or of the reality of the subjective experience of a situation.) Metaphorical pictorial devices are ones that intentionally break those conventions or standards in a way that makes some point about the depicted thing or circumstance. Like metaphors in language, pictorial metaphors can be understood as having a structure where there is a tenor and a vehicle. The tenor is the thing being treated (commented on), and the vehicle is the treatment (the comment, evaluation, point, etc. about the treated thing). Pictorial metaphors may take every form of figure of speech, including hyperbole, metonymy, allusion, and all the rest. They may also take forms that do not have equivalents in language. These are called pictorial runes. They are modifications of literal pictorial depictions that allow some aspect of the depicted thing to be more easily rendered, especially when the literal depiction is too difficult or impossible. For example, it is not possible to draw smells, so one pictorial runic solution is to draw wavy “waft” lines above, for instance, a stinky trash can. However, were we to depict some actual feature of the subjective perceptual experience of something when that feature would not objectively be visible, this would be a literal depiction. For the person whose perspective is being depicted “literally” really had such features in their experience.
For example, were we drawing the visual experience of someone looking at a bright light through half-open eyes, then we could draw lines emanating from the light source, as the viewer would probably be seeing such streaks in their field of vision. Or to depict the “pins and needles” sensation in a limb, we could draw thin straight lines radiating from the limb, because the person really is feeling jab-like sensations. It would be metaphorical, however, if some figurative device were used instead of an accurate visual depiction of the experiential features being presented.








Abstract [quoting]:

Pictures can be literal or metaphoric. Metaphoric pictures involve intended violations of standard modes of depiction that are universally recognizable. The types of metaphoric pictures correspond to major groups of verbal metaphors, with the addition of a class of pictorial runes. Often the correspondence between verbal and pictorial metaphors depends on individual features of objects and such physical parameters as change of scale. A more sophisticated analysis is required for some pictorial metaphors, involving juxtapositions of well-known objects and indirect reference.




Kennedy [henceforth written as ‘K’] begins with a clear example of a metaphor depicted in a picture. It is a “metaphoric picture of a man who is bold outwardly and timid inwardly.” It specifically is a man wearing a business suit, but his head is shaped like a helmet, “drawn in a cutaway style so that we can see there is a rabbit peering out through the visor”. K then wonders the following things: “Just what kinds of metaphors can be presented in pictures? Would a metaphor involving time be impossible in a static picture? Are some common pictorial devices actually metaphors? Are there pictorial equivalents for each kind of verbal metaphor? Are there pictorial metaphors with no verbal equivalent?” (589)


K then notes some related work by other authors, including Gombrich. He then assesses: “formal attempt to assess the relation of depiction to metaphor has never been undertaken. Nor have the basic conceptions of object and abstraction been systematically related to metaphor in depiction. As will become evident, only some types of metaphors lend themselves readily to depiction, and the concrete/abstract dimension, to which Arnheim and Gombrich appeal, makes its own problems for depiction” (589).


K notes some standardized sorts of metaphors with conventional significance, like “scythes and hour-glasses for Death” (589). But he is concerned with a more general sort of metaphorical depiction, “using features of common objects and shapes of the normal environment be achieved?” (589).


K next notes Richards’ distinction between two major features of metaphor, namely, tenor and vehicle: “The tenor is the thing treated, and the vehicle is the treatment — for example ‘his face is stony’ has ‘face’ as tenor and ‘stony’ as vehicle” (589d). Or suppose we have a picture of a president looking like a gun. Here the person is the tenor and the gun is the vehicle. K then directs our attention to figure 1.


[Image: Kennedy, “Metaphor in Pictures,” figure 1]

[K does not further discuss this figure. His point might be something like the following. On the left side we have the metaphor of “the tree was a woman”. Here the tree is the tenor and the figure of a woman is the vehicle. On the right side we have the metaphor of “the woman was a tree”. Here the woman is the tenor and the figure of a tree is the vehicle.]


K then observes that were metaphorical language taken literally, it would create a contradiction. He then says that when we depict something in a way that follows “standard canons,” then that depiction is literal, and otherwise it is metaphorical. [I am not sure but I think by canon he simply means convention or rules of usage.] He then wonders what these canons are for depiction.

The person who says “John has a heart of stone” is using language metaphorically. If we ask for the meaning of the word stone, or the word John, out of the context of the sentence, then we will obtain standard meanings which entail a contradiction in “John has a heart of stone”. Depictions that follow some standard canons might be called literal, and ones that are metaphoric would be those that deliberately violate the standard canons while being intended to make a valid point that can be determined by examining the depiction and its referents. The deliberate contradiction of the standard canons is to make a point but not to revise or reject the standard canons. What then are standard canons for depiction?



K then turns our attention to the “spontaneous untrained understanding of at least some kinds of depiction” (590). [Here we are not talking about recognizing metaphors without reference to convention but rather just about recognizing images without training. But as we will see, these inborn ways of recognizing will become the canons.] K gives a number of studies that {1} have demonstrated that people without training can recognize pictures, that {2} have shown outline drawing to be universal in cave drawing, and that {3} have shown that even animals and congenitally blind people can recognize pictures “without training in outline codes” (590). K suggests that these sorts of inborn ways of recognizing images can be canons that metaphorical depictions might violate. He also recognizes that within a culture there can be certain canons of image recognition that can also be violated.

These can be taken to provide a widespread set of standard canons one can violate intentionally, fully expecting the recipient of the picture to notice a violation. More restricted canons would be possible by using referents that are relatively familiar in a particular culture, eg telephones in our own.



So when a canon is deliberately misused to make a point, it is a metaphorical depiction.

These canons can be followed in a picture in an anomalous way, and the anomaly may be taken to be an error, or it may be taken to make a point. Where the anomaly is considered to be appropriate to make a point, without revising the standard canon (eg our ideas about telephones), the picture is taken to be using anomaly deliberately in a metaphoric manner.



K has us compare three figures in his article. In the first figure, there is a mountain, and at its base are people collecting reeds to make thatching for houses [it seems].

[Image: Kennedy, “Metaphor in Pictures,” figure 2]

Here K’s point is that there is a contradiction in line use. In the mountains, there are lines used as cross-hatching to depict the rocky, solid facets of the mountain. When cross-hatching is used, “the density of lines is relevant, but the length, direction, and shape of a particular line are not relevant” (591a). [Here it is used like shading, and the hatch marks could have been drawn various ways while still having the same effect of depicting the substantiality of the mountain’s composing material.] The other way that lines are used is [at the bottom] where the reeds are depicted. Here the lines are used as outlines, with each line depicting one individual reed. When lines are used in this way for outlining, “the length, direction, and shape of individual lines is relevant,” (591) [for each of these factors tells us exactly how the object is shaped and oriented.] Here “the anomaly is for effect” (591) [but I am not sure what the effect is. I guess it is a little like optical illusions of a sort, where paradoxes are presented. We will examine this sort of image later when summarizing an article by Graham Priest called “Perceiving Contradictions.” Later he refers to this as a sort of pun, so perhaps the effect is something like humor.]


In figure 3, line is used only for outline [and not also for cross-hatching, for example]. However, the object here, like in figure 2, is also anomalous, but for a different reason. Here, the familiar object, that is, the woman, is “depicted as having more limbs than is usual”. Here the anomaly is to make a point by means of metaphor, namely, “The busy-ness of the hard-working person is indicated by the extra limbs” (591).

[Image: Kennedy, “Metaphor in Pictures,” figures 3 and 4]


K says that figures 2 and 3 “are both anomalous in easily-argued ways.” Yet, he says there are other examples of visual anomalies where it is open to debate what makes them anomalous (591).


He then directs our attention to figure 4 (above). It “uses line to indicate edges of surfaces, and introduces other lines as well to indicate a path of movement or ‘speed-lines’ (Barrand and Toleno 1972) or ‘action lines’ (Brooks 1977).” K says we need to examine this case more closely.


K cites studies he had done that did not find movement lines in cave art, but he did find cases of them that predate camera effects, namely, in “Japanese picture of the eleventh or twelfth century” (591). [I suppose the idea is that the camera can show a blur when there is movement. One might then argue that the movement lines in drawings are a replication of that effect. But K shows them to predate these camera techniques.] He then adds, “They are generally uncommon in depictions before the last century, although they are now frequent in comics etc.” (591). He also notes that many people still do not recognize these movement lines, giving statistics from the studies showing which groups did not recognize them [for details, see pp. 591-592].


On this basis K concludes, “The evidence on speed lines suggests they are not as universal as outline style, and not understood by several groups who understand the outline style” (592a). K explains that simply looking at the percentages of groups that misrecognized the lines misses the equally interesting information about what the lines were taken to be instead of motion lines (592).


K will now offer explanations for why speed lines are not universally recognized. The first explanation he addresses is that “static pictures are not taken as possible depictions of movement by naive subjects” (592). But K does not think this is the right answer, because there are studies showing other sorts of indications of movement that were properly recognized by larger proportions of test subjects (592). K thinks instead that we need the following explanation: “a standard use of line is inadequate, and a new and more sophisticated possibility has to be sought by the viewer before success can be attained” (592).


To continue the investigation, K begins by noting that motion lines are often like physical tracks. He calls this their being “ecological” [perhaps because they are found existing physically in the environment of the motion, but I am not sure. Perhaps it is also a reference somehow to ecological theories of perception. See this entry on Gibson’s ecological psychology.]

Notice that speed lines are entirely ‘ecological’ in certain cases. A cart leaves tracks and makes ruts. A bird leaves a wake across a water surface. A brush leaves a cleared trail. A duster wipes a space across a dusty board. Thus a path of movement is left in certain circumstances.


K then explains that since in the example the motion line is in the air and is thus invisible, it is an application “in inappropriate circumstances” and hence is metaphorical.

The particular drawings in the various studies cited have the path-of-movement lines in empty air, in space, rather than on an actual surface. Hence speed lines and the like can be related to an actual ecological case, but are being applied in inappropriate circumstances, although they are still pertinent to the depiction. In that sense such lines are being used metaphorically when they appear in empty space, though they are ecological on surfaces.



The next observation K makes results from a study he did on deaf children. [See p.593 for details.] The study concluded that “the prediction based on the idea that these pictorial devices are metaphorical, and hence are less readily understood by deaf children, was upheld” (593).


Figures 2 and 3 show pictorial metaphors, as they involve inappropriate uses of line depictions to make a point or to have an effect, and thus we need to broaden our notion of metaphor to include images and not just language. So he crafts a broader category called “figures of representation” [rather than simply figures of speech] to include both such types of metaphors (593).


Tropes generally speaking, and metaphors more specifically, are a broad category of figure of speech. K will now use psychological studies to classify different sorts of pictorial metaphors [or tropes more generally] (593).


For this we turn first to the field in English studies known as Rhetoric, because in this field the different sorts of figures of speech are classified (593).


The figures that K finds relevant to this analysis are: “allegory, anticlimax, catechresis [or ‘catachresis’?], cliché, euphemism, hendiadys, hyperbole, litotes, meiosis, metonymy, oxymoron, paranomasia [or ‘paronomasia’?], persiflage, personification, prolepsis, allusion, and synecdoche” (593). He will now consider each term on its own.




“An allegory is a story actually about one set of events presented as though occurring elsewhere to other people.” K gives an example of when an allegory would be evident. Consider “if a well-known event were presented in modern dress; eg the crucifixion on Golgotha could be set in Central Park and portrayed with a Puerto Rican cast” (594). K says that there is a contradiction involved here. [I am not sure what the contradiction is exactly, but it might be that the historical crucifixion happened in a different place and time and involved a different culture of people. So to mix the two incompatible sets of elements creates a conceptual contradiction between what is depicted (along with all its particular features) and how it is depicted (along with all its particular features that are incompatible with the other set).] The tenor is the crucifixion [because this is what is being treated or ‘portrayed’ let us say] and the modern setting is the vehicle [because it is a treatment or ‘portrayal’ of the tenor.] [K’s next point seems to be that the purpose of making such an allegorical presentation is to make the point that this ancient foreign event is still relevant here and now.] “The resolution or point of the picture [its ‘ground’ in Richards’s (1936) terminology] would be the supposed relevance of the event to today” (594, bracketed insertion is Kennedy’s).




“Allusion is like allegory in involving something from one place being shown in another” (594). It is often found when “a phrase is quoted without the source being explicitly mentioned” (594). Thus in a pictorial depiction, “a classic figure (eg the Mona Lisa, or the Statue of Liberty) can be added to a scene” (594). He then gives an example where this presents a contradiction:

Botticelli’s female Primavera added to a picture of autumn would be an allusion, to sophisticated viewers. The contradiction given by the presence of Primavera in a picture of the fall would be understood to be deliberate, and to be invoking the special relationship between Spring and Fall.





“Anticlimax is present when an event taken to be of major significance leads to one of inferior importance” (594). K gives the example of an impressive scene centering around an unstately figure. “In a depiction a large mural of a Greek temple with a crowd of priests and worshippers would be an impressive context, one offering an anticlimax if the small central figure were Alfred E Neumann” (594). Here there is a contradiction taking the form of an anachronism. And it also has the metaphor structure: “the tenor is the vast temple, the treatment is the annoying grinning kid” (594). The point of such a depiction would be that there is something wrong in having such a pompous display. “The contradiction deflates the setting, and it can be said to have appropriate ‘grounds’ if the idea is to indicate something is at fault in overdone, pompous ritual” (594).




“In catechresis a metaphoric term is used to fill in a gap in the lexicon” (594). K gives a standard example: “He fell in love.” As he explains, “There is no standard literal verb for the event, eg ‘he became in love’ is poor English. Hence, a metaphor of ‘fall’ is in service in everyday English” (594). Although a catechresis may begin as a metaphorical construction, it can eventually become a standard usage such that the original metaphorical status becomes forgotten (594). K offers speed lines as an example of visual catechresis. He says that in order to be such a case of visual catechresis, there would need to be {a} some limitation in the normal modes of depiction that {b} is remedied using a visual metaphor that {c} becomes so commonly used that its status as metaphor is forgotten. Motion lines, as many studies suggest, fulfill these three criteria (594-595).




“Clichés are phrases that, while avoiding the obvious way of stating something and over embellishing the message, have become popular and commonplace: ‘make the supreme sacrifice’ is the cliché for die in battle, and ‘stands to reason’ is the cliché for being obvious” (595). K says that the criterion for being a visual cliché would be that they are “devices that have become popular although they are not the straightforward realistic way of portraying something” (595). He gives two examples: {1} “the giant head that children are given in cartoons” [but I am not sure what “point” the large heads are making], and {2} “the light bulb over someone’s head when they have an idea” (595). K then explains the metaphor structure of these tropes: “The tenor is ‘child’ or ‘person with idea’ and the treatments are the enlargement and the light bulb” (595).




“A euphemism, in language, is used when some decorous term, indirectly related to an object with socially sensitive connotations, is used instead of the socially cruder standard term. The ‘dear departed’ and ‘the restroom’ are euphemisms for the dead and the toilet respectively” (595). For an example of visual euphemism, K gives the example of fig leaves: “fig leaves appear in the most unecological places. Fig leaves are the treatment and sex organs the tenor, in Richards’s terms” (595). [Here we see the notion of ecological again, and I still am not entirely sure I understand the usage. But here it seems to mean something like natural, or maybe, how things are normally found in the world.]




“Hendiadys, in language, is ‘one by means of two’, ie one object being indicated as though it were two, as in saying there is a boy and a Robert in that room, when in fact Robert is the boy.” [I find this explanation to be a bit strange, because I would not know why someone would say “there is a boy and a Robert in that room”. Let me quote from some dictionaries of literary terms:

hendiadys [hen-dy-a-dis], a FIGURE OF SPEECH described in traditional RHETORIC as the expression of a single idea by means of two nouns joined by the conjunction ‘and’ (e.g. house and home or law and order), rather than by a noun qualified by an adjective. The commonest English examples, though, combine two adjectives (nice and juicy) or verbs (come and get it). Shakespeare uses this figure quite often in his later works, as in the first part of this line from Hamlet:

The flash and outbreak of a fiery mind.

(Baldick 111, boldface in the original)


HENDIADYS. Greek hén dià duoîn, one thing by means of two. A figure of speech by which a single complex idea (such as that normally contained in a noun and an adjective) is expressed by two words joined by ‘and’, as ‘pour libation from bowls and from gold’ for ‘from golden bowls’. Fowler gives as another example of hendiadys, ‘try and do better’ for ‘try to do better’. This is a passage from the Psalms:

Such as sit in darkness and in the shadow of death, being fast bound in misery and iron.

The idea of the last three words, ‘misery and iron’,  is not two but one: the iron shackles are the misery.

There is a single complex idea in the words ‘show and gaze’ in this line from Macbeth:

Live to be the show and gaze o’ the time.

(Scott 125)

] K says that the criterion for a visual hendiadys would be if “there are two items if we read literally, but only one if we read metaphorically” (595). Thus it is not enough to simply draw a figure and their mirror image, for “In mirror images there is literally another image.” However, what would qualify is “a pictorial device used to indicate drunkeness [sic?] or poor vision: multiple images are given, eg a head with several pairs of eyes one above the other, and several grinning mouths one above the other. The device has the viewer reject the literal implication that there are multiple eyes and mouths in the referent in favour of the notion that there is one pair of eyes and one mouth but unusual conditions of observation. Hendiadys is used humourously in figure 5” (K 595).

[Image: Kennedy, “Metaphor in Pictures,” figure 5]

[The example above is a bit odd, because it seems not to be entirely consistent. In the first panel, the screen image is multiplied because of an unwanted distortion effect. Presumably in the second panel the person is shaking themselves so as to bring the image into focus. So the image is shown on the screen in focus in this second panel, but the viewer’s body is now in many places like the image once was. But what is inconsistent is that if we are seeing what the character sees, as is indicated by the screen’s content, then everything else should be moving too (like the television set itself and the structure it is placed upon). However, if we are seeing what a non-shaking observer sees, as is indicated by our seeing the shaking body, then the screen should still have “ghosts”. Yet, perhaps if it were somehow made consistent, then it would no longer work as a joke.]




“Hyperbole is the use of exaggerated terms for emphasis rather than deception, eg ‘a thousand apologies’, or ‘a man as big as a mountain’” (595). K says that in pictures, the distance measures can be extended [I suppose, for example, by making a desired thing seem very far away when in fact it is not, to emphasize the longing]. Also, curvatures can be exaggerated by making them more curvilinear or rectilinear. Other features can be presented as more extreme than they actually are, like “more pronounced jowls, a wider brow, a curvier mouth” and so on (595), as with caricatures (596). Here, “the viewer takes the tenor to be the object and the vehicle to be the exaggeration” (596).



Litotes and Meiosis:

“Litotes and meiosis are related to hyperbole. Litotes is the use of opposites to express a particular meaning, eg ‘not bad!’ means ‘quite good!’ Meiosis is the use of understatement, eg ‘now that’s something like a baby’ means it is close to an ideal baby. Hyperbole exaggerates, meiosis understates, and litotes reverses the polarity of an epithet” (596).


[Meiosis, as understatement, could be like an exaggeration in the sense that it is an excessive underestimation.] “Meiosis can be used in place of hyperbole, in language, and to the same effect; eg ‘a thousand apologies’ could have ‘may I offer you a bit of an apology’ as a substitute, the former seeming to be effusive in addition to its basic intention, and the latter seeming rather stiff” (596). However, unlike hyperbole in pictorial caricature, an understatement of meiosis would probably be less effective at communicating an idea. K notes studies where a visual diminishment of prominent facial features led to people having more difficulty identifying faces. But if what is prominent about a face is already the characteristic smallness of a feature, then to portray it even smaller would be hyperbole and probably enhance recognizability (596).


[So, since visually diminishing some feature would reduce its emphasis in a picture:] “In sum, it may be difficult to devise a depiction which is unambiguously meiotic, involving both understatement and emphasis, while being distinguishable from hyperbole” (596). But diminishment is not the only way to construct a meiosis. One can also use indirect reference:

Perhaps a more indirect use of a meaningful object may work; ‘where have all the flowers gone?’ in words is meiotic in its indirect reference to the tragedy of war.


In this respect consider a set of pictures on war. Many could directly display the aftermath of war by showing battered buildings and maimed people. In such a setting a picture that showed, lying on the ground between the tank-tracks, a child’s light shirt with a rust-coloured stain and a bullet hole, would be an understatement. To many people it might be a chilling meiosis.


Thus, to Fraser’s (1979) distinction between metaphors based on anomalous objects and those based on anomalous contexts it may be appropriate to add a distinction between direct and indirect references. To bring out the indirect reference, setting a depiction alongside others in a similar vein may be useful.



“Litotes,” K continues, “involves metaphoric opposites, eg ‘real bad!’” (597). But it is hard to depict opposites and non-presence. [I am not sure, however, why opposites are hard to depict, unless it is self-opposition of some sort or self-contradiction. Perhaps K is referring especially to non-presence.]


K will consider some possible visual examples involving non-presence and opposites (597).


To be a visual litotes involving absence, “the stated absence would have to be taken as an actual presence” (597). Thus it is not enough simply to portray a scene where someone would be expected to be present and to intentionally leave that person out of the picture (597).


[The next example is very interesting, but I may not present it accurately. K seems to be saying that one can use visual litotes to encode the opposite meaning of something, so as to give distant audiences the impression that things are wrong in a way that passes censorship.]

A case can be made for an important kind of picture. Consider a victim of oppression trying to depict a state of affairs to a distant audience, while knowing his picture will be censored. (Such cases arose in World War II when concentration camp artists, photographers, and film makers were ordered to make propaganda pictures of the camp. Similar dilemmas faced recent POWs and hostages.) The artist could include in his picture several well-known people whom he can use to indicate opposites are at issue. The familiar person who is bald can be depicted with hair. Someone known to have lost a limb can be depicted whole. The person who is irreligious can be depicted wearing a religious emblem. One with no sense of smell can be depicted enjoying a flower's scent. The fat as thin, the tall as short ... . The picture involves a set of opposites. Perhaps then their apparent friendliness to a central figure in the picture — eg the regime’s dictator — would be taken as suspicious. If so much is to be reversed by the aware audience, the apparent friendliness could be taken for litotes, the opposite being taken to be the intended message. Thus, if many well-known details are the reverse of the truth, and are in some way obviously intended, the major focus of the picture may be taken to be false too by the viewer.




Metonymy and Synecdoche:

“Metonymy is the representation of one thing by the presence of some alternative which suggests the basic referent. One refers to a textbook by mentioning its author, eg ‘please pass me Arnheim’” (597). [Synecdoche is similar, in that it also has one thing representing another thing, but here the representing thing is a part of the represented thing.] “Metonymy is to be distinguished from synecdoche, in which the representation is accomplished by presenting a part of the basic referent, though, as Paivio and Begg (1981) note, in some cases a phrase may be either metonymy or synecdoche depending on how it is being processed by the hearer” (K 597). K then notes a point of clarification for a complicated situation. Suppose we have the name of a country, like England, and a member of that nation, like England’s King. This situation sets us up for both metonymy and synecdoche, depending on which is used to refer to what. If we start with the country (the whole) and use it to refer to the part (the person), then we are using metonymy. So if we ask the king of England, “What do you, proud England, think?” and we want to know what the king is thinking (but indirectly perhaps what the official sentiment is of the nation), then we are using metonymy. For here one thing (the king) is represented by some alternative (the nation). But suppose instead we want to know what the national sentiment of England is. If, instead of asking what England thinks, we ask, “What does the Crown think?”, then we are using synecdoche, because here we are using a part of something (the king of England) to reference the whole (England itself).


However, by the definition of the terms it is logically possible to classify a phrase where the general term (eg the name of the country) refers to the part (eg the ruler) as metonymy. Where the part (the ruler) refers to the greater whole (the name of the country) then the phrase is a synecdoche.


Thus, ‘What of you, proud Norway’ is metonymic, since Norway includes the Norwegian lord being addressed.


‘What does the Crown think?’ and ‘What does the White House think?’ are synecdoches when they refer to Britain and the United States.



K then notes that pictorial synecdoche is in fact quite common, as often pictures can depict a part to indicate a whole. “A drawing of a skyscraper commonly will include only a few representative windows. These are not to be seen as being the only windows, as though the rest of the building were blank walls. A drawing of a crowd scene may only have a few heads drawn, both the remainder of the crowd and the bodies left unstated, ie not drawn. | (Kennedy 1974b calls this device an etcetera principle.) Gombrich (1961, 1963) describes how politicians can be indicated by items like their grins, cigars, favourite hats, etc. However, in some cases these cartoons are not synecdoches — they may be personifications if, for instance, the grin is on a bull moose. In pictures, if a synecdochal Cheshire cat is shown by his smile, it means that the whole cat is present, he has not faded. In synecdoche the remnant means the whole is present” (K 598).


“Pictorial metonymy” instead “would involve drawing a whole to indicate a part, the more general to indicate the more specific” (598). We might even think of metonymy as a matter of “the more abstract being used to depict the more concrete” (598). K then cites research showing that “viewers can examine a pictured object for its category” (598, see this page for details of that research). To be a metonymy, that depicted category then needs to stand for some particular thing. K gives examples where certain visual clues indicate that a generically drawn object is to be understood as a more specific one: 

One possibility would be to sketch, say, a car, in a casual style, where the lines do not fully connect, fade in places, and involve no precision or grid-like look to suggest detailed care. Let us consider, too, that a particular person is drawn in the car. Let the person be someone who is known to drive a particular car (Rolls-Royce, Maserati, Cadillac, or Jeep). For example, the characters in M*A*S*H drive Jeeps. If the person who is drawn is a M*A*S*H character in costume then the generalized car will be understood to stand for a Jeep.


Similarly the vague ship around a swashbuckling pirate is a sailing ship. The generalized boat under the fur-trapper is a canoe. The craft with Luke Skywalker in it is an X-wing fighter.



Recall from above the complications with classifying metonymy and synecdoche in language. K does not think that we have as much trouble making such classifications when we are dealing with these devices in pictures [he cites research, see pp. 598-599]:

In sum, while it is true that the notions of concrete and abstract are often contentious, it seems clear that, because a picture can be seen as casually drawn, an object may be depicted more or less casually or precisely, as distinct from more or less detailed. Precision is one dimension, and detail is another. One can have very detailed pictures with each detail casually drawn, or pictures that have few details each precisely drawn. A metonymic picture, then, would be one where the picture does not depict the object with precision, and where one cannot see more than the general nature of the object, yet a particular object is indicated. In Richards’s terms, the particular object is the tenor and the vague treatment is the vehicle.





“Oxymoron is the attribution of incompatible traits to an object, eg the ‘sharp dullness’ of a professor” (599). A pictorial oxymoron could have “incompatible features, eg a man drawn with a halo and horns, or the robes of a saint and the six-guns of a cowboy. The tenor is the person and the vehicle is the pair of contrasting addenda” (599).




“Paronomasia is a synonym for a pun, a play on words making use of similarities for effect, eg ‘What’s a weasel? Oh, its weasily distinguished.’ ‘What’s a stoat? Oh, that’s stoatally different’” (599). We already saw a pictorial pun in figure 2, where “the line element is used in two incompatible ways. The tenor is the rocks and reeds of the landscape; the vehicle is their treatment by one pictorial element, thus allowing the solid rocks to be transformed into reeds” (599).




“Persiflage is ‘irresponsibility’ in the sense of treating the serious with levity and the trivial with grandeur” (599). A pictorial example could be an event that some would think is holy yet is depicted cartoonishly. “A Disney drawing of the crucifixion would be persiflage to many people” (599). Or a pictorial persiflage could arise if a mundane, trivial event “such as washing a car or brushing one’s teeth were drawn in the style of a Renaissance grand master’s woodcut” (599). In this case, “The tenor would be the event and the vehicle would be the (light or grand) style” (599).




“Personification (or prosopopoeia) is the treating of an object as a person” (599). K notes that personifications are often used in drawings; for example, “Machines, animals, plants, and the world are often given faces and other human features in depiction” (599). Here, “The tenor is the object and the vehicle is the addition of human characteristics,” and K refers to figure 1 [above, where a tree is portrayed as a woman] (599).




“Prolepsis (anticipation) is a metaphor involving time in a particular way. ‘The two men were walking with their victim’ is a metaphor if the ‘victim’ only becomes a victim several days later. The absence of an epithet indicating that the crucial event is set at some future time marks the phrase as prolepsis” (599).


But in drawings, which are often “static depictions,” time is frozen. So is it possible for them to depict prolepsis, which requires a temporal difference? K says that in current studies, there is debate over “the relationship between time and depiction” (599). But they do not necessarily address the relevant question here, namely, “Can a picture display a particular moment and, say, its aftermath, with the temporal relation not being shown expressly, while making a metaphorical point?” (599).


To answer this question, K has us consider the following picture, which is a legitimate instance of prolepsis.

A squad of keen young recruits is embarking on a troopship. In the line marching to the gang plank is a walking cadaver, a dead soldier. The scene involves a contradiction — the dead soldier. The depiction indicates the present moment and the future death to which the recruits innocently go. | The contradiction prevents a literal reading of the scene, and requires that the picture be taken as a metaphor for it to be sensible. Notice that if there were a dead body lying to one side of the march, that would enable a literal reading of the scene to pass. The contradiction between the cadaver and its participation in the march requires a metaphoric reading.



But there is also another common way to depict the passage of time which would not qualify as prolepsis. K has us consider if Pinocchio were “drawn with a solid-colour nose and a series of shadowy noses of greater and greater length” (600). This would not qualify as prolepsis, because “it would be ambiguous whether the noses existed in the past or the future, and in any case the picture portrays an event actually taking place and is thus like speed lines” (600).


[So we have covered figures of speech and we saw in each case how they could be found operating in pictorial form.] There are also “pictorial devices which are metaphoric but which have no clear equivalent in language” (600). K names these cases pictorial runes (600). He defines rune as “a graphic character which is a modification of some prior item to facilitate the displaying of the item” (600). K first notes alphabetic runes. These were letters “of Teutonic, Scandinavian, or Anglo-Saxon writing, developed in the second or third century by modifying Greek or Roman letters to make carving the letter easier” (600). [I am not familiar with runes, so I am not sure exactly what the symbols looked like before being converted into runes and what they looked like after. I also do not know why they would need to be converted. What was wrong with their original form? Apparently the original forms were hard to carve, but I wonder why. Perhaps they were too ornate. Or perhaps simply the people doing the carving were better trained for the Greek or Roman letters, and so they converted the original forms so that their techniques applied to them.] In keeping with this sense of rune, “A pictorial rune is a graphic device used in a picture which is a modification of the literal depiction of an object, making some aspect of the object become easy to depict, that aspect of the object often being difficult for the literal depiction to convey” (600).


K says that certain inner states, like anxiety and pain, are difficult to depict visually. So cartoonists often use pictorial runes to depict these states, for example “spirals in Linus’s woebegone eyes or lines radiating from a swollen thumb. Smelly substances are supplemented with ‘waft’ lines (probably borrowed from drifting smoke curling up from the offending substance). People shouting and hammers banging have lines radiating from the centre of noise (see figure 6)” (K 600). [In the figure below, we see on the left side the eye spirals for an anxious person, ‘waft lines’ for the stink of a trash can in the middle panel, and noise lines from a shouting mouth in the right panel.]

[Figure 6 from Kennedy, “Metaphor in Pictures.”]


[The next point is important. There are certain sorts of subjective perceptual experiences that cannot be depicted in an objective sort of way. So if one is suddenly facing a bright light with half-closed eyes, they might actually see streaks or other sorts of distortions. The drawing of this experience might show lines radiating from the light source “to capture the apparent lines” that the person saw. This is to be considered literal. I think K’s reasoning might be that the perceptual impression really had the depicted features, so the drawing shows the features of that experience impressionistically rather than merely associatively. His next example is a depicted person experiencing ‘pins and needles’ in a limb from constricted blood flow. We suppose the artist here depicts that experience with “lines radiating” from the limb. This example is less obvious to me. In the case of seeing the streaks, the distortion is apparent. But we do not tactilely feel lines. It is more like an experience as if straight, sharp, hard things are jabbing our limb. So why is this literal? I suppose it is because the visual articulation of this perceptual impression would take the form of straight lines radiating from the limb. Perhaps the idea would be like the following. Suppose that you are having this pins and needles experience. If someone then says, “Close your eyes, and visualize what you see happening to your arm. Now draw what you see,” you might perhaps draw the radiating lines. However, suppose instead you have this experience and someone says, “Find a clever way to tell someone that you are having this experience.” Then perhaps you might draw something that is not a feature of the impression you had. 
I have no idea what one might draw, but perhaps for example one might draw a barrel full of pins and needles (to remind people of the verbal metaphor) and show the person plunging their arm into it (thus using exaggeration as well). This is perhaps figurative because when we have this experience, we do feel sharp pains that could be caused by pins and needles, but to specify them as pins and needles is to go beyond the actual features of the experience. We do not for example feel other features of a pin and needle that would distinguish the pains as coming from these objects rather than from other sharp pointed objects, like for example the tips of sharp pointed knives. K clarifies the criteria in the following way: “Where a device is an attempt to capture an impression, it is literal. Where it is a modification intentionally introduced, despite the possibility of | a false impression, to overcome a limitation in the literal depiction of the object, it is a pictorial rune” (600-601). I did not understand in the second sentence when he writes, “despite the possibility of a false impression”. It is not clear to me if he is saying that the literal presentations of perceptual experiences, like the streaks of light, are false or not. Also, if we are talking about a runic device, I do not understand how the truth or falsity of the impression comes into play. The examples of light streaks and radiating lines around the limb are not runic, he seems to be saying. So here the issue of truth and falsity of the impression is not what he is talking about. But the other examples he gave of pictorial runes in figure 6 also seem to me not to have anything to do with the truth or falsity of the impression. At any rate, this discussion will be important when we evaluate the way K’s ideas have been used in comics studies. Later we will examine a wonderful article on the issue of pictorial metaphors for the depiction of motion, namely, “Analysis of Motions in Comic Book Cover Art: Using Pictorial Metaphors,” by Igor Juricevic and Alicia Joleen Horvath. The authors use the literal/metaphorical distinction, citing a different work by Kennedy (which we will also examine). 
In a supplementary summary of their arguments, Juricevic defends their notions of literal and metaphorical devices in the following way:

We acknowledge that there are many differing opinions on what is or is not metaphorical in pictures. For our research, we needed to adopt a rigorous definition of what is considered to be a literal pictorial device, and what is metaphorical. For this, we used the definition that literal pictorial devices are those that represent features that are present in the real world, while metaphorical devices represent features that are not present in the real world (Kennedy, Green, & Vervaeke 1993). We chose to investigate our specific literal devices (posture, orientation, and ground plane) and metaphorical devices (action/speed lines and multiple images) based on the work of Carello, Rosenblum, and Grosofsky (1986).

We realize that our definition may be contentious to some while agreeable to others. In the tradition of James J. Gibson and Ecological Psychology, our definition tacitly assumes that pictures are effective because they provide crucial information that is present in the real world. As such, violations in a picture (i.e., information that is not present in the real world) can be readily identified. Further, if that information can be understood, then it can provide metaphorical information.

(Juricevic, “Literal or Metaphorical”, citing:

Kennedy, J M, Green, C D and Vervaeke, J 1993 Metaphoric thought and devices in pictures. Metaphoric and Symbolic Activity, 8(3): 243–255.

Carello, C, Rosenblum, L and Grosofsky, A 1986 Static depiction of movement. Perception, 15(1): 41–58.)

We will also examine this other article by Kennedy et al., and we will see if there the authors explicitly consider depictions of subjective impressions as literal or metaphorical. My other question is if Juricevic and Horvath would consider as metaphorical certain impressions of motion like blurry imagery or other ways of depicting the visual experience of a trailing afterimage. These things are not in the real world, but they are what one can “literally” experience when viewing quickly moving objects.]

It should be noted that some lines or devices around an object can be literal in the sense that they attempt to convey perceptual impressions. For example, a bright light drawn with some lines radiating from it may be an attempt to capture the apparent lines that result as the viewer half-closes his eyes while facing a bright light. Similarly lines radiating from a limb may be an attempt to depict the impression of ‘pins and needles’ that result from ischemia (blood-flow blockage) and the return of the blood flow. Thus, it is important to try to determine what intent governs the devices being used. Where a device is an attempt to capture an impression, it is literal. Where it is a modification intentionally introduced, despite the possibility of | a false impression, to overcome a limitation in the literal depiction of the object, it is a pictorial rune.



Kennedy furthers this discussion of pictorial runes by examining blind people’s depictions of spinning wheels. [I am not entirely sure why it is important for the argument here that the people be blind, but the idea might be the following. A blind person probably does not know explicitly what the visual experience of a spinning wheel is like. But they need to depict its motion. In order to do so, they might add other devices that are not part of the visual experience. These would qualify as runes, and again a pictorial rune is defined as “a graphic device used in a picture which is a modification of the literal depiction of an object, making some aspect of the object become easy to depict, that aspect of the object often being difficult for the literal depiction to convey” (K 600). So K shows drawings where there is a bend in the wheel’s perimeter to represent the way that forces acting on the wheel would seem to make (any) part of it rush forward. In a similar way, another depiction has the spokes shown as bent, perhaps to indicate the way the forces are acting on the wheel’s parts. K’s point here is that the blind people do not think that these features are parts of the experience of the spinning wheel. But K also shows a depiction where the spokes are drawn as mixed up, because they become blurred when in motion. This strikes me as a literal depiction, since that is very similar to what one would actually see. And K even places this instance in its own figure. But he does not say whether this is literal or runic, and my impression is that he thinks it is runic, as he discusses it in the context of the others without noting a difference. But were that so, I do not understand why it would be runic, unless the idea is that the blur could not be drawn, so instead the person drew the spokes as jumbled up. K also makes the point that certain kinds of shape changes like in figure 7 cannot be rendered into figures of speech.]

[Figures 7 and 8 from Kennedy, “Metaphor in Pictures.”]

Consider one possible runic device, a change between straight and curved. Figure 7 contains drawings of a type obtained when blind adults unfamiliar with depiction have been asked to draw a wheel spinning (Kennedy 1982). Since in any ‘frozen moment’ a spinning wheel is identical to a static wheel, it is not possible to present a wheel in a posture which distinguishes the moving one from the still one. Some devices must be introduced to make the distinction; for example, blind people introduced context (a person spinning the wheel) or ancillary speed lines (arcs around the wheel). In addition to context and ancillary lines, blind people also drew the spokes of the wheel with a change of shape between the static and moving wheels. Such devices included mixing-up the spokes, or making some long and some short and — in drawings from four blind adults out of seven tested — the spokes were drawn as straight in the static case and curved in the moving case. Similar devices were obtained from blind children tested in Tucson, Phoenix, and Haiti (Kennedy 1982).


Inspection of figure 7 cannot reveal how the curvature is intended. It might have been that the blind people intended the drawings to copy what wheels feel like as they spin. Certainly, the blind person who drew figure 8 with spokes mixed up reported that the mixing of the spokes was an attempt to draw a blur: “a circle with a terrific blur in the centre”, he said. Similarly figure 7a, with a peak at the top, | is said by its author to capture the ‘looming up’ that occurs because, she said, the “top part is moving so quickly towards the front — it’s looming up towards you”. In contrast, figure 7b uses a device said by the artist to “get across the idea” of a wheel in motion without referring to an impression from the wheel. The artist, in the interview protocol which accompanied the drawing task, said explicitly that spokes do not really curve. The curves showed movement, she said, while denying that the curves were real. “I’ve got the spokes all facing one way to show the wheel spinning toward this way ... I don’t know how else to depict it. I know the spokes don’t bend in the wheel but ... .” Hence the curvature may be interpreted as a feature that is literally wrong being used to suggest something else which lies beyond the standard range of the medium.


Shape change as a metaphor is a feature with special relevance to depiction rather than to language. There is no direct equivalent in language for figure 7b: ‘a wheel with its spokes curved’ is not a standard metaphor for movement in language. Shape being a sine qua non for depiction, it is important to consider how shape might be deployed in metaphoric ways that may not have immediate parallels in language, and where the study of rhetoric may not provide, ready made, an adequate set of distinctions. (A similar examination of colour and perspective should be made too cf Arnheim 1974; Gombrich 1972a, 1972b.)

(K 601-602)



K also notes ways that shape changes can be used runically to depict things other than motion. “Applied to pillars, it could suggest an overbearing weight. Applied to make a man’s legs wiggly, it could suggest an emotion such as fear. If the edges of a building were drawn wiggly then it could suggest an upset — psychic or physical. The reverse change from curved to straight can also be used for many purposes, eg a person becoming rigid and inflexible, or a car braking fiercely” (602).


K then discusses the “indicating power” of runes, and he identifies three sources for curved and straight runes: {1} natural phenomena, like how things bend under weight, {2} psychological phenomena, like how one’s legs shake and thus could be drawn as wiggly lines, and {3} other graphic sources [I do not know what he is referring to here, but the examples seem to come from other papers, so let me quote,] “[...] eg ‘Maluma’, a nonsense word, can be represented by a mass of curved lines and similarly ‘Takete’ can be represented by a pattern of straight lines. This is probably due to borrowing from ‘M’ being written with rounded characters and ‘TK’ being written with straight lines (R H Kennedy 1977, personal communication). The pairing of nonsense words and the curved/straight lines works in North America but does not work in a nontechnological tribe (Rogers and Ross 1975). Also, observation suggests that exchanging vowels and consonants  (Takuta, Maleme) leaves the straight and curved graphics still paired with the consonants” (K 602).


K then summarizes his ideas for runic pictures: “In sum, a class of pictorial runes can be defined to include those devices of depiction which are metaphoric but for which there are no immediate parallels in language, and these devices include shape changes which are in common use today in pictures” (K 603). K next discusses other studies of visual metaphor by Aldrich and Rothenberg [see p. 603 for details of their findings. K explains how their notion of metaphor is too broad to fit the one we used here, and instead we should add the notion of simile for the other cases.]




K notes the different ways that “figures of depiction” may arise: {1} by varying individual features of an object, {2} by juxtaposing objects, {3} by adding graphic devices such as speed lines (K 603). K next says that certain cases of figures of depiction can be classified without much controversy, but other cases “may be more dubious” (603). K also notes that the controversial cases can be found in all three ways of generating pictorial metaphors (603).


K’s next point is that mentalist concepts can play a role in studies of depiction [but I am not sure what is meant by ‘mentalist’. Because he discusses figure and ground, it might have something to do with Gestalt, but I am not sure. His overall point here might be that more is needed to explain pictorial metaphors than simply what is depicted, because there are other important determining mental operations involved. Let me quote, as this is a guess.]

Also controversial is the important role played here by mentalist concepts, which are both unusual in current psychology and particularly outlandish in studies of depiction. However, the mentalist approach has borne fruit here, and it can be pointed out that other mentalist concepts are crucial to the study of depiction.


Picture perception involves the pictorial functions of elements, such as lines and contours, and their deployment in configurations, as well as the intentional use of elements and patterns in literal or metaphoric ways. The fact is that a mentalist approach is as vital in studying pictorial elements and configurations as it is in studying metaphor. Notice, for example, that lines and contours have their pictorial powers whether or not the configurations they make are specific to a given object. That is, a brief line form can be taken by a viewer to depict various things — a solid object, a flat | form, a cavemouth, a wire frame, etc — in figure-ground fashion. The figure-ground experience, which is the taking of a line or contour to depict foreground and background (Kennedy 1974a) is not dependent on the depiction offering unequivocal information for what is to the fore and what is to the rear. Rather, the depiction operates because the various pictorial effects are selected by the viewer, not because the external display is specific and unambiguous. Hence in dealing with lines and contours in simple displays, and the related figure-ground experiences a mentalist explanation is required, not an explanation in terms of specification by physical details of the display.


So far as the pictorial configuration is concerned, a mentalist approach still has a key role to play. Where accounts of metaphors depend on intention, and accounts of perception of elements in simple displays depend on the figure-ground experience, accounts of configurations being perceived depend on the notion of relevance. Many pictures violate physical perspective, or omit physical features such as colour and texture. It is important for the viewer to know what is relevant. Is a space white through omission, or because it depicts snow? And what principle governs the configurations? Is the order from left to right relevant? Is a difference in size relevant or not? Is a curve a casual drawing of a straight line? The viewer has to sort out the relevant from the irrelevant, and determine the governing principles, rather than accept all features equally. To that extent a mentalist explanation of depiction is important in accounting for how configurations are being viewed.


In general, then, the use of a mentalist approach in examining pictorial metaphor is far from idiosyncratic and isolated. A mentalist approach to elements and configurations is also necessary.












Kennedy, John. “Metaphor in Pictures.” Perception, 1982, vol.11, pp.589–605.


Or if otherwise noted:


Baldick, Chris. The Concise Oxford Dictionary of Literary Terms. 2nd edn. Oxford: Oxford University, 2001 [First published 1990. Second edition published 2001].


Scott, A. F. Current Literary Terms: A Concise Dictionary of their Origin and Use. London: MacMillan, 1965 [revised reprint 1985. This is a softcover reprint of the hardcover 1st edition.]


Juricevic, Igor. “Literal or Metaphorical? An Analysis of Motions in Comic Book Covers.” Blog and Archive of the Comics Grid: Journal of Comics Scholarship. Website post. 20-April-2016. Accessed 29-Aug-2016.





Works cited by Kennedy:


Barrand A, Toleno T, 1972 “Pictural events” presented at the American Association for the Advancement of Science Meeting.


Brooks P H, 1977 “The role of action lines in children's memory for pictures” Journal of Experimental Child Psychology 23 98-107.


Gombrich E H, 1961 Art and Illusion (London: Phaidon Press).


Gombrich E H, 1963 Meditations on a Hobby Horse and Other Essays on the Theory of Art (London: Phaidon Press).


Kennedy J M, 1974b “Perception, pictures and the etcetera principle” in Perception: Essays in Honor of J J Gibson Eds R B MacLeod, H Pick (Ithaca, NY: Cornell Press).


Rogers K, Ross A S, 1975 “A cross-cultural test of the Maluma-Takete phenomenon” Perception 4 105-106.




29 Aug 2016

Bickhard and Richie (1.2) On the Nature of Representation, Ch.1.2, “A Historical Summary of [James] Gibson’s Theory,” summary


by Corry Shores




[The following is summary. All boldface and bracketed commentary are my own. I apologize in advance for my typos. Proofreading is incomplete.]




Summary of


Mark Bickhard and D. Richie


On the Nature of Representation:

A Case Study of James Gibson’s Theory of Perception


Ch.1 Foundations


1.2 A Historical Summary of Gibson’s Theory




Very brief summary:

The important information about the world we perceive is not something our minds place into our perceptions by organizing and processing them one way or another. Rather, that information is built into the world (the “ecology”) itself and especially in the patterns, structures, and ways we perceive it when we actively interact with it. And we directly, without further processing, discern these significances in our perceived world when our perceptual interactions tell us the possible uses of (or potential further interactions with) the perceived things, which are called their “affordances”.



Brief summary:

Gibson rejected two predominant views of perception of his time and proposed a new theory. The first predominant view comprises the sensation-based theories (of, for example, Berkeley, Müller, and Helmholtz), which say that our eyes directly receive and encode fragmented visual data and that we secondarily construct full perceptions on their basis by applying to them processes of memory, inference, and judgment. The other view is the Gestalt one, which sees the process of perception as a relatively spontaneous sensory organization that also involves some reconstructive work on the part of the perceiver. But Gibson’s own scientific studies showed that humans and animals react to their environment in a way that is too accurate and immediate for there to be additional acts of processing of the data, as in these two theories. Instead of the important information being encoded into the perceptual data by means of perceptual and mental processes, Gibson came to hold that the important information about the environment was fully given in the sensory data already and directly discerned without further processing of it. His work with motion parallax illustrates his thinking. (Motion parallax is the visual experience that we have when we are moving, and the things further away from us pass through our field of vision more slowly than nearer things.) The two existing models would say that on the basis of a static impression of what is given to our vision while we remain passive observers, we process that data to obtain knowledge of the properties of depth in the scene before us. However, motion parallax shows us that we are not passive observers, because we actively make decisions about how we position ourselves in the world, and in that way we in fact interact with our surroundings. And also, what we perceive is not a static image, because motion parallax requires that we see a flow of information and recognize certain patterns in it. 
But it was not so clear in his early theories that he could avoid what he called the homunculus problem, which plagued the other theories. The idea is that because an interpretative sort of process was needed to “encode” the raw data with other important significances, like depth relations, there needed to be an internal agency that receives that data and interprets it, like a miniature human living in our brain. This leads to an infinite regress, because that homunculus would need some internal interpretative agent of its own as well, and so on. Gibson thus needs to explain the nature of the discerned significant information and the way it is obtained in a manner that shows how it does not involve an interpretative component. This danger arises on account of his notion that schematic perception is based on literal perception. Our literal perceptions give us data about the physical spatial properties of things in our perceived world, while our schematic perceptions tell us about the perceived things’ potential uses and significances. Schematic perception is in some sense obtained from literal perceptions secondarily, and this leaves open the possibility that Gibson’s theory falls prey to the homunculus problem (for it could be that there needs to be an additional process that “reads” the literal perceptions to interpret their schematic significances). What he says instead is that humans and animals, by perceiving the physical features of things, in that same act perceive those things’ potential uses or interactive possibilities, called their affordances. Thus there is no ‘encoding’ or processing to obtain knowledge of the perceived thing’s significance; it is perceived and recognized directly when seeing its physical features. 
Given that this interactive element with the environment directly contains the important information about it, Gibson moved the locus of perception away from the passive perceiver’s internal workings and relocated it in the surrounding “ecology” of interwoven environmental elements that are also interactively interwoven with the observer. [For this reason, it is called “ecological psychology,” as Gibson argued that] perception is a “process that can only be understood in terms of its natural ecology” (Bickhard & Richie 10).







Bickhard and Richie (henceforth written as BR) will discuss James Gibson’s theory of perception by first examining its historic context and then its conceptual development (BR 8).


BR begin by noting the basic context of Gibson’s original studies:

Gibson (1950) points out that the study of perception had long been dominated by the problem of how the mind can generate our full experienced perceptual knowledge from the inadequate data provided by the senses, with vision and the eyes always the primary focus. The major approaches to this problem were based on the works of Berkeley (1709/1922), Müller (1838/1948), and Helmholtz (1896/1952), who proposed that the eyes directly receive and encode certain basic sensations, such as patches of color, lines, points, and so on, and that full visual perceptions are then constructed on the basis of such sensations through various processes of comparisons with memory, inferences based on cues within the sensations, and, ultimately, judgments concerning the nature of the external stimulus.

(BR 8)

Although there were disagreements on the nature of the sensations and of their processing, “all such models, including a slight variant in which the retinal image served in the role of sensations, assume that perceptions must be generated out of primitive sensations or retinal images. They assume that the senses receive fragmented or incomplete information about the world that must be enriched by mental processing (Gibson & Gibson, 1955)” (BR 9a).


Gestaltists “objected to this approach,” because they thought that “the sensory elements seemed impossible to specify” and also that this approach at best tells how we make judgements about the world but does not explain how we actually see the world [in a more encompassing sense] (9). Rather, “Gestaltists argued that ‘experience is not reducible to elements or additive units’ and proposed instead that the process of perception ‘was one of a relatively spontaneous sensory organization’ (Gibson, 1950, p. 22)” (9). This notion of sensory organization applied well to the perception of form, but not so well to the perception of space. And in both cases it was difficult to specify [what the nature of the sensory organization is]. What interested Gibson with regard to the Gestaltists was that they “formulated genuinely relevant problems for space perception, problems concerning the characteristics of the actual experienced visual world rather than the flat geometric visual field (Gibson, 1950, p. 23)” (9).


With these two theories in mind, Gibson conducted his own experiments in depth perception during the Second World War (9). What he found was that “depth perception was more accurate than could be explained by any model based on depth cues” (9). This meant specifically that the sensation models failed, but the Gestalt theories proved inadequate [for some reason] as well (9).


In 1950, Gibson writes The Perception of the Visual World, and here he goes beyond both theories. BR assess the role of these alternative theories in this way:

From the Gestaltists, he accepted and adapted the idea that the most basic problems of visual perception were those regarding the experienced three-dimensional visual world, not the flat geometric visual field, but he rejected the proposed process of sensory organization. From the sensation-based approaches, he accepted very little, neither their basic problems nor their basic solutions.

(BR 9)


[I am not certain, but perhaps we can say the following about Gibson’s critique. He will say that humans and animals react to the spatial environment in a way that shows they have a very precise understanding of its spatial features. This means that the two theories fail. The sensation-based approaches perhaps do not explain this, because this theory might say that people’s and animals’ understanding of the spatial environment is based on just what they sense. But too little is sensed to support such a detailed knowledge of the environment. The Gestalt approach does not work, because this would say that the mind constructs a lot of its knowledge of the environment artificially. But were that so, there would be more errors in that construction than are actually there. Let me quote, as that was just a guess.]

Gibson argued that people and animals “appear to react to the spatial environment with an accuracy and precision too great for any known theory of space perception to be able to explain. ... If the solid visual world is a contribution of the mind, if the mind constructs the world for itself, where do the data for this construction come from, and why does it agree so well with the environment in which we actually move and get about” (p. 14). This basic rejection of mental constructivism, of mental processing was one of the most fundamental moves in the development of Gibson’s own theory. Consistent with this rejection, Gibson also rejected the premise that made such processing necessary and the particular distinctions and processes by which it was presumed to occur.

(BR 9-10, citing Gibson 1950)


Gibson especially “rejected the basic premise that the data available to the senses were inadequate to perception” (10). [I am not sure, but this notion might work against both theories. The Gestaltists seem to be saying that there are additional constructions and organizations on the basis of what is given, and the sensation-based approaches discuss certain unconscious cognitive processes that formulate inferences and judgements on the basis of what is given. But if what we perceive is already adequate, none of this additional work is needed to develop a sufficient understanding of the world around us.]

In particular, and most fundamentally, Gibson rejected the basic premise that the data available to the senses were inadequate to perception: “Even complex perceptual qualities must have stimuli” (p. 8); “If the total stimulation contains all that is needed to account for visual perception, the hypothesis of sensory organization is unnecessary” (p. 25). Clearly, if the total stimulation contains all that is necessary to account for visual perception, then the (unconscious) inferences, comparisons with memory, and judgments – the mental processing – of the sensation-based models are also unnecessary. If we ask the right question, Gibson suggests, if we ask about the experienced visual world based on surfaces and edges, rather than about the flat geometric visual field, then we find that the information available to the visual senses is sufficient to perception, and information enhancement via mental processing is a superfluous and flawed postulate.

(BR 10, citing Gibson 1950)


Because Gibson rejected this sense of mental processing as involving enhancements, he thereby also rejected “the classical distinction between sensations and perceptions;” for, “that distinction is based on the assumptions that sensations are informationally impoverished and that mental processing enriches them into perceptions” (10). [In other words, according to this view that Gibson rejects, we have raw sensations that are inadequate possibly because they are fragmented and/or disorganized, and thus perception is the process by which these sensations are made adequate perhaps by organizing them and/or by completing them where they leave informational gaps.]


Sensation-based models often see the perceiver as passively and statically receiving sensations on the basis of which she forms perceptions. Gibson rejects this idea, because human perceivers are very active while perceiving, making spontaneous decisions like changing where to look and how to orient themselves in their environment in order to perceive it better. Gibson also argued that perception is a “process that can only be understood in terms of its natural ecology” (BR 10). [Perhaps by this is meant that perception is always bound up with the conditions of the environment and one’s interactions with it, but I am not sure.] One way that Gibson supported this notion was by noting ways that changes the perceiver makes in their motion within their setting can enhance their perception, which can in fact even “modify the retinal images in a quite specific way”, as when for example certain physical movements provide “powerful information for depth perception in the form of motion parallax” (10).


[I am not very familiar with this notion of motion parallax. I found this helpful diagram.

(Image: motion parallax diagram.)

Motion parallax seems to be that very familiar phenomenon where, when we are moving, objects in the distance seem to pass through our field of vision more slowly than things nearer to us. We might normally notice this when looking out of a train or automobile window. Until I read the source text by Gibson, I will not know what to say about this. But perhaps we might note at least that here depth perception is attained by moving around interactively in an environment, rather than simply taking in some sensory information passively, processing it, and then discerning the depth afterward.]
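
[A minimal numeric sketch of motion parallax follows. It is my own illustration and is not taken from BR or Gibson. It assumes an observer moving in a straight line at constant speed and a stationary object that is, at that moment, directly to the observer’s side; under that assumption, the object’s angular speed across the field of vision is the observer’s speed divided by the object’s distance. The function name and the scene values are hypothetical.]

```python
def angular_speed(observer_speed, distance):
    """Angular speed (radians per second) at which a stationary object
    appears to sweep past, at the moment it is directly to the side of an
    observer moving in a straight line.

    observer_speed: speed of the observer (e.g. metres per second)
    distance: perpendicular distance to the object (same length unit)
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return observer_speed / distance


# Hypothetical train-window scene: the numbers are illustrative only.
train_speed = 30.0  # metres per second
scene = {"fence post": 10.0, "farmhouse": 300.0, "hill": 3000.0}

for name, metres in scene.items():
    print(f"{name}: {angular_speed(train_speed, metres):.3f} rad/s")
```

[The nearer the object, the faster it sweeps across the field of vision, so the pattern of relative speeds in the flow already carries the ordering of distances. This is at least compatible with the idea that depth information is available in the moving observer’s visual flow itself.]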


Gibson had yet another strong argument against the sensation-based models of perception, namely, the “homunculus problem” (10). To understand this issue, we should first take note of “retinal-image theories,” which say that it is necessary for people to process the stimulation on the eye’s retina. Gibson argues against this. He first observes that this view thinks that an image forms on the retina like an image projected onto a screen, and thus “the retinal image is something to be seen” (BR 11, quoting from Gibson, 1979, p. 60). [The idea here is that there is a sort of second act of seeing, that is, seeing the image on the “screen” of the eye.] Gibson considers this then a matter of a homunculus problem, because it is, as he calls it, “the little man in the brain theory” of the retinal image (BR 11, citing Gibson 1966, p. 226), “which conceives the eye as a camera at the end of a nerve cable that transmits the image to the brain. Then there has to be a little man, a homunculus, seated in the brain who looks at the physiological image. The little man would have to have an eye to see it with, of course, a little eye with a little retinal image connected to a little brain, and so we have explained nothing by this theory” (11 again from quotation). In fact, this theory only makes the matter worse, because it entails an endless series of little perceivers inside little perceivers (11). As BR explain, there are other versions of this argument, but they all have in common “that something, or someone, must ultimately do the perceiving, and that is what was to be accounted for in the first place” (BR 20). They say that this homunculus problem is found in “any form of inputs-followed-by-processing-followed-by-perception model” (11).


Sensation-based models posit constructions by means of perceptual processes in order to account for “the problem of how full perceptions are derived from impoverished sense data” (11). But Gibson’s “assertion that the total stimulation is informationally adequate to perception” rejects the assumptions underlying that problem (11). “Gibson continued to develop his arguments against sensation-processing and other input-processing models” (11).


Gibson, seeing the shortcomings of sensation-based and Gestalt models of perception, offered his own model that BR describe as “an ecological direct-encoding model” (11). [The notion of an ecological model of perception is not very well defined for me yet, but perhaps that becomes clearer as we continue. It might mean a model in which the perceiver takes an active role in interacting with the perceived world while perceiving it. In this sense it might be something like Merleau-Ponty’s notion of how the perceiver is integrated with the world they perceive, forming one flesh. The fact that it is encoded seems to mean that the information is still discerned by placing it into what is perceived, but this is somehow done directly. But I am not sure. From what is said later, the notion of direct encoding might be related to the idea of resonance. So maybe our mind encodes the information into the sense data by resonating with the information that is already in a sense encoded in the way the data is given to us. But that is wild guessing on my part.] BR continue,

Gibson rejected the sensation-based conception of the perceiver as a passive individual confronting a flat visual field in favor of an active perceiver confronting an ecologically structured visual world – thus, | an ecological model. He also rejected both the mental constructivism of the sensation-based models and the sensory organization of the Gestaltists in favor of a direct correspondence between stimulation and perception – thus, a direct encoding model.



BR then note how “The direct encoding aspect of Gibson’s 1950 model was both a methodological move and a theoretical move” (12). The methodological component is his locating the basic problem of perception as being the problem of “establishing an empirical correspondence between the stimulus and its conscious resultant,” and this means that he proposes a sort of ecological psychophysics (BR 12, quoting Gibson 1950, p. 52). The theoretical component lies in the fact that he “rejected any intermediate processing of encoded sensations between the stimulation and the perception and in his corresponding rejection of the sensation-perception distinction” (12).


BR acknowledge that “It is not entirely clear that Gibson would have agreed with the ‘encoding’ part of our designation of his model as an ‘ecological direct-encoding’, especially in his late career” (BR 12). [Our summaries so far have skipped over BR’s own way of using Gibson’s thinking.] But, his 1950 model “seems committed to some form of a direct encoding model” (12).


If we adopt an encoding model, we then have to explain how these encodings occur (12). As we will see, his “conceptualization of an ecologically active perceiver contains the germ of his later answers to that question and, we argue, the germ of interactive insights that allowed him to largely transcend the encoding approach altogether” (12). BR will now discuss the later development of his model.


The later model develops the “internal implications” of the 1950 model (12).


The 1950 model was a sort of retinal-image-based model of perception, and “He described his psychophysics program as involving a ‘jump from the retinal image directly to the perceptual experience’ (p. 51)” (BR p. 12, qtg. Gibson 1950).


But as Gibson came more to an “ecological emphasis on the importance of the active perceiver,” this retinal-image focus proved inadequate (12d). For, “The retinal image of an active perceiver changes too much, too fast, and too continuously, in contrast with relatively stable perception, to be the primary locus of perception” (BR 13a). Gibson writes, “The active observer [however] gets invariant perception despite varying sensations” (BR p. 13, qtg. Gibson 1966, p. 3, bracketed insertion is BR’s).


So Gibson needed to find a “different locus of perceptual information” (13). It would need on the one hand to remain stable as our perceptions are, while at the same time being “adequate to those perceptions” (13). [I am not sure what is meant by them being adequate to the perceptions. Perhaps the idea is that his model would need to juggle both the fact that sensory information is highly variable and thus not similar to perceptions, which are stable, while at the same time needing to somehow have this stability that perceptions have. So if we say that the sensory information is directly perceived but is variable, then it is not adequate to the perceptions, which are less variable. I am not certain however.] [The notion of a “perceptual locus” is important here, but I am not entirely sure I grasp what is meant by it. It seems to me that BR are saying that for Gibson, the locus of perception is actually somehow in the environment. But I am not exactly sure what is meant by that yet. They will refer again to motion parallax, and they write, “Motion parallax is a phenomenon of the structure of the ambient light through which the eye moves. The clear suggestion is that the broader spatial and temporal patterns in the ambient light might well be the actual locus of visual perception.” So the idea here might be the following: the structures by which depth is discerned are found not in the way the visual information is processed in the perceiver’s mind but rather within the structures of the visual data itself as it is already in its raw givenness. So contained within the visual data hitting our eyes when we pass through the fields of light beams found in the space along our train ride, there is already the far away items “moving” slowly relative to the faster moving things nearer and nearer to us. 
Gibson might then be saying we thus perceive the depth directly, because that depth is already built into the way the light beams are structured in their patterns of givenness. Thus, we have a direct perception of depth on the basis of whatever visual data we get in whatever way it is given, without needing to process it.]

A different locus of perceptual information was required, one that maintained a stability comparable to that of perceptions and one that was adequate to those perceptions. A new perceptual locus was required by Gibson’s recognition of the importance of the active perceiver; such a locus was suggested by that same recognition. Gibson’s original emphasis on the active perceiver stemmed in part from the motion parallax information concerning depth that was thereby derived. Motion parallax is a phenomenon of the structure of the ambient light through which the eye moves. The clear suggestion is that the broader spatial and temporal patterns in the ambient light might well be the actual locus of visual perception.1 Certainly, on the one hand, there is no information available in the retinal image that is not available in the ambient light, and, on the other hand, it is difficult to conceive what alternative external locus for visual perception might be possible. Furthermore, very encouraging success was obtained in investigating the information that was in fact available in the ambient light. Correspondingly, “In my book, The Perception of the Visual World (1950), I took the retinal image to be the stimulus for an eye. In this book I will assume that it is only the stimulus for a retina and that ambient light is the stimulus for the visual system”  (1966, p. 155).

(BR 13)

[Endnote 1 on page 85 (quoting):

1. Such a shift to patterns in the ambient light as the locus of perception is clearly prefigured by his 1950 point that patterns of stimulation could themselves be stimuli (p. 9), even though at that time he was referring to retinal patterns. The shift is also consistent with his general ecological emphasis, but neither of these points is sufficient to force that shift – the active observer is sufficient.]


BR note two revisions that are called for in light of Gibson’s discovery. The first revision shifts the “postulated locus of visual perception from the retinal image to the ambient light” (13). [I might not follow the second revision. It seems to be that as a result of the first revision, we now need to change our view of the perceiver as someone who simply interprets the visual data given on the retina to instead be an active participant in the perceptual process by making spatial modifications in order to find patterns in the resulting changes in the visual data. Let me quote.]

Thus, consideration of the fact and necessity of the active perceiver forced a shift in the postulated locus of visual perception from the retinal image to the ambient light. Consideration of the ambient light as the locus of perception forced, in its turn, a reciprocal revision of the conception of the perceiver. The logic of the second revision derives from the fact that such broader spatial and temporal patterns in the ambient light cannot simply be sought by the visual system, then, when found, statically, retinally perceived. They are, by definition, too big for that. They must be scanned, sampled, or otherwise interacted with in such a way as to detect and identify – to pick up – an encounter with a discriminable pattern.

The detection and differentiation of such a broader pattern, a variant or invariant in the ambient light – the pickup of such information – is intrinsically interactive. The active perceiver of 1950 had to become a truly interactive perceiver: |

There is a loop from response to stimulus to response again (1966, p. 31).

An explanation of constant perception ... should be sought in the neural loop of an active perceptual system that includes the adjustments of the perceptual organ. Instead of supposing that the brain constructs or computes the objective information from a kaleidoscopic inflow of sensations, we may suppose that the orienting of the organs of perception is governed by the brain so that the whole system of input and output resonates to the external information (1966, p. 5).

The process of pick up is postulated to depend on the input-output loop of a perceptual system (1979, p. 250)

The process is circular, not a one way transmission (1979, p. 61)

The course of the whole interaction can be critical. It is the course of the interaction by the visual system, for example, the scanning, both input and output and the relationships between them, that differentiates the pattern interacted with; it is not the ‘final’, static, retinal image that ‘completes’ the interaction that picks up such a pattern, nor even the ‘succession of images’ or, better, the flow of retinal stimulation that accompanies the interaction. Retinal stimulation is relegated to the input side of an overall interactive visual system that engages in such interactions and discriminates such patterns. It is the pattern of the interaction that differentiates and, thus, identifies the pattern interacted with; it is not any piece or component of the interaction.

(BR pp. 13-14, block qtg. Gibson)


[The next point reminds me very strongly of what Merleau-Ponty writes in section 1.2.1 of The Structure of Behavior. He describes the dynamic process of responding to a stimulus. Humans and animals do not know at the very start of a stimulus the correct response to it, because what needs to be recognized in the stimulus is a pattern that unfolds over time. And, while that pattern is unfolding, the responding creature modifies its receptivity in real time so as to better sense the stimulus and respond appropriately. One example is how the ear of a cat responds to different sorts of touches:

Five different reflex responses can be obtained by stimulating the ear of a cat depending on the structure of the excitant employed. The pinna of the ear flattens out when it is bent, but responds to tickling with a few rapid twitches. The character of the response is completely modified depending on the form of electrical excitation (faradic or galvanic) or its strength; for example, weak strengths evoke rhythmic responses; strong ones evoke tonic reflexes. [...] (Sherrington and Miller). [qtd in Merleau-Ponty English translation p.11 /  French p.9]

He also seems to have a view of the perceiver as not being passive, since it interacts with the stimuli in order to perceive them in such a way as to respond to them properly.

The organism cannot properly be compared to a keyboard on which the external stimuli would play and in which their proper form would be delineated for the simple reason that the organism contributes to the constitution of that form. (Merleau-Ponty 13 / 11)

And he has the example of holding an animal in an instrument and adjusting one’s hold in response to the creature’s movements. The idea here seems to be that if the animal changed its body shape so as to escape the instrument, then we would lose the ability to feel its movements. However, if we sense its body changing its shape and immediately respond by changing our hold on it, then we can continue sensing it. In other words, perception involves self-modification in immediate interactive response to what we are perceiving.

When my hand follows each effort of a struggling animal while holding an instrument for capturing it, it is clear that each of my movements responds to an external stimulation; but it is also clear that these stimulations could not be received without the movements by which I expose my receptors to their influence. “... The properties of the object and the intentions of the subject ... are not only intermingled; they also constitute a new whole.” (Merleau-Ponty 13 / 11; the quotation is cited as “Weizsäcker, Reflexgesetze, p.45. “L’organisme est, dit Weizsäcker, Reizgestalter.” [Note: Reizgestalter is misspelled as Reizgestaller in the English translation.])

And he illustrates the real-time modifications to receptivity with how telephones seemed to work at that time. Apparently you dialed the receiving person’s name. After dialing the first letter, the connecting station then becomes sensitive to only those sets of letters that it knows can come after that first letter, and so on. (But I am not sure exactly how these phones worked.)

The model of the automatic telephone appears more satisfactory. Here indeed we find an apparatus which itself elaborates the stimuli. | In virtue of the devices installed in the automatic central, the same external action will have a variable effect according to the context of the preceding and following actions. An "O" marked on the automatic dial will have a different value depending on whether it comes at the beginning, as when I dial the exchange "Oberkampf," for example, or second, as in dialing "Botzaris." Here, as in the organism, it can be said that the excitant — that which puts the apparatus in operation and determines the nature of its responses — is not a sum of partial stimuli, because a sum is indifferent to the order of its factors; rather it is a constellation, an order, a whole, which gives its momentary meaning to each of the local excitations. The manipulation “B” always has the same immediate effect, but it exercises different functions at the automatic central depending on whether it precedes or follows the manipulation “O,” just as the same painted panel takes on two qualitatively distinct aspects depending on whether I see a blue disc on a rose-colored ground or, on the contrary, a rose-colored ring in the middle of which would appear a blue ground. In the simple case of an automatic telephone constructed for a limited number of manipulations, or in that of an elementary reflex, the central organization of the excitations can itself be conceived as a functioning of pre-established devices: the first manipulation would have the effect of making accessible to subsequent ones only a certain keyboard where the latter would be registered.

(Merleau-Ponty 13|14 / 12)

Gibson’s point of course is not identical to Merleau-Ponty’s, but let us note what seem to be two important similarities. In both cases, what is being perceived is something dynamic, and its unity is to be understood in terms of a pattern of variation. The second similarity is the interactive nature of the perception. The perceiver cannot simply remain in the same mode of receptivity. Rather, they need to adjust or modify themselves (in relation to their environment in general or to the external stimulus specifically) in one way or another in order to properly perceive the important patterns of the stimulus’ dynamic variations.]


BR next discuss the parallax example in terms of information. The light patterns are the information itself, found in the field of ambient light. BR also characterize the perceptive act by which the depth is discerned as involving information-extracting interactions as a means of picking up information. [But I would suppose this is not a matter of processing the information but rather of picking out the information already given immediately.] But Gibson does not mean for “information” to have its normal sense of “knowledge communicated to a receiver,” because Gibson does not want to imply necessarily that the information needs to be encoded and communicated to the perceiver [rather than being completely apparent from the beginning and immediately available as such to the perceiver (without further ‘encoding’ or ‘decoding’)] (BR 14).


Gibson does still think that retinal stimulation plays a role in visual perception. His emphasis however is on the nature of the role it plays. [I am not entirely certain, but it seems Gibson thinks the role is the following (and then I will quote so you can check). The role of the retinal stimulation is to provide the information in its physiological form, and perhaps we are to think of it as nervous signals. But Gibson emphasizes that the information about depth for example does not need to be acquired by further processing that visual information; for, it is already built into the structure of that information and it can be directly (without mediation) discerned by the perceiver.]

Gibson was also well aware that retinal stimulation does occur, that it plays a central role in visual perception, and that it is involved in (interactive) processes. The issue is the nature of that involvement: “The inputs of the receptors have to be processed, of course, because they in themselves do not specify anything more than the anatomical units that are triggered” (1979, p. 251). Information, however, “is not something that has to be processed” (1979, p. 251). “Information is conceivable as available in the ambient energy flux, not as signals in a bundle of nerve fibers” (1979, p. 263). Information is extracted by the interactions of sensory systems, not | encoded and transmitted by sensory organs. The eye and its stimulations participate in information-extracting patterns of interactions; they do not encode that information.

(BR 14-15)


Gibson’s interactive theories of perception involve the criticism of the notion of encoding and decoding in perception (BR 15).


The information in ambient light [with regard to parallax motion] does not need to be encoded, but there still needs to be a “process of pickup,” which he explained using the metaphor of resonance:

The perceiver interactively resonates with the available information (for example, 1966, p. 5; 1979, p. 246). Consistent with this suggestive metaphor, he also referred to the process of becoming able to extract information, of learning to resonate to available information, with a metaphor of “tuning.”

(BR 15)


But BR find problems with these metaphors. One is that resonance is not the only way “energy patterns can be picked up without intermediate enhancement of encoded information” (15). Another problem with these metaphors is that “resonance requires periodicities in patterns to resonate to, and those are not necessarily available in information to be perceived” (15). A third problem is that “even if such periodicities were available, it is neither at all clear what it is about the interactive loop that would resonate to them nor how it would do so” (16). And a final problem with these metaphors, and also the most important one, is that “that which resonates generally resonates at the same (or a directly related) frequency as that which is resonated to. The resonant frequency is a copy, a duplicate, of the original frequency. Such vestiges of picture, of image, of encoding conceptualizations are regretfully distortive of Gibson’s basic interactive insight in his concept of information extraction. The pattern of an interaction need not have any particular structural correspondence whatsoever with the pattern of ambient light that it differentiates” (15).


[So this metaphor of resonance is not entirely helpful for understanding the process of picking up information by means of interactive perceptions of the patterns in the environment’s dynamics. This also means that, without other explanations, we might have difficulty conceiving how this pickup process works.] But despite these problems with the metaphors, “the basic direction of the evolution of Gibson’s theory seems clear,” BR explain (15). And in fact, that development continued even after his discovery of “interactive information extraction” (15). But Gibson’s model could involve the homunculus problem [because the nature of the extraction has not been specified], so he needed to take a further step in the theory’s development, and that step “involved the problem of meaningful perception” (15).


Gibson already in his 1950 book The Perception of the Visual World worked with a notion of meaningful perception. There he “made a distinction between ‘The perception of the substantial or spatial world and ... the perception of the world of useful and significant things to which we ordinarily attend’ (p. 10 italics omitted)” (BR 15). Gibson’s term for the perception of the substantial or spatial world is literal perception, and the perception of the world of useful and significant things to which we ordinarily attend is called schematic perception. BR explain that “Schematic perception was presumed to be based on literal perception because literal perception ‘provides the fundamental repertory of impressions for an experience’ (p. 10), and the two forms of perception were presumed to have importantly different properties. Meanings were presumed to be attached to, and detachable from, the spatial impressions of literal perception” (BR 16).


BR then explain how this relates to the homunculus problem. Meaningful perception would seem to require a homunculus to receive the literal spatial impressions and interpret them as having their appropriate meanings (16). And thus “Literal spatial impressions must be enhanced, presumably via some kind of processing with meanings” (16).


BR trace Gibson’s solution to this problem to early steps in his work of 1950, where he tied the usefulness of objects to their spatial features [and perhaps thus also our literal perceptions were tied to our schematic ones.] Gibson calls this squeezability: “He recognized [...] that ‘squeezableness is something which seems to be located in the object, not in the hand.... Visual objects appear to have soaked up such qualities and to be fairly saturated with them, the use of the object and the shape of the object being almost indistinguishable’ (pp. 203, 204).” The idea that still needs to be avoided is that “the perception of the functional nature is dependent on the perception of the spatial nature” (16).


BR then have us look at Gibson’s model in a light that does not necessarily lead to this idea. We know already that when [for example in the case of parallax motion] we directly perceive the information [about depth], what we perceive are patterns that result from interactions. And furthermore, the information that we obtain indicates “potentialities for further actions and interactions” (BR 16). In other words, “what are most directly perceived are functional potentialities, potential usefulnesses” (16). [So when for example we interact with the environment by moving around it in order to obtain data that directly tells us of its spatial feature of depth, what that tells us are the different sorts of spatial ways that we may further interact with that space by moving through it in all its available dimensions.] Thus these “patterns of interactions [...] are simply functional indicators” (BR 16). [BR continue in the endnote to this passage: “From this perspective, in fact, the spatial is subsidiary to the functional. Surfaces, objects, and the like are constructed as patterns of potential interactions, including further perceptual interactions, that may be indicated by particular perceptual interactions, that is that may be perceived. Such construction of the physical and spatial out of the functional is in the general spirit of Piaget” (BR, endnote 2, p. 85).] [I am not exactly sure what is meant here by “functional indicators”. Perhaps they are indications of ways that certain interactions with the environment can produce certain types of possible results. So for example, knowledge of how far away a mountain is from us, along with its relative height in comparison with its surroundings, can indicate the sorts of views we might have were we to climb it and the approximate amounts of time and effort it would take to accomplish that.]


What he previously called squeezability he later refines into his notion of affordance. [An affordance seems to be the directly apparent use of a thing we perceive. So in the same act by which we observe the physical features of something, we thereby perceive its usability for certain purposes. But this usability is observed directly, because it is directly evident in the thing’s physical features.]

Such an imbuing of perception with direct, functional, ecological meaning, already hinted at in his 1950s discussion of squeezability, yields Gibson’s concept of affordance. “The affordance of anything is a specific combination of the properties of its substance and its surfaces taken with reference to an animal” (1977, p. 67, italics omitted). Affordances are those things the environment “offers the animal, what it provides or furnishes either for good or ill” (1979, p. 127).3 And such affordances are intrinsic to perception:4

The composition and layout of surfaces constitute what they afford ... to perceive them is to perceive what they [surfaces] afford ... it implies that the “values” and “meanings” of things in the environment can be directly perceived (1979, p. 127).

The perceiving of an affordance is not a process of perceiving a value-free physical object to which meaning is somehow added in a way that no | one has yet been able to agree upon; it is a process of perceiving a value-rich ecological object (1979, p. 140).

(BR 16-17)

[Endnotes 3 and 4 from page 85 (qtg.):

3. Affordances, of course, are therefore “relative to the animal. They are unique for that animal. They are not just abstract physical properties” (Gibson, 1979, p. 127). “Knee-high [therefore affording the potentiality of sitting on] for a child is not the same as knee-high for an adult” (Gibson, 1979, p. 128). Horizontal support for water bugs is different than for heavy terrestrial animals (Gibson, 1979, p. 127).

4. Gibson’s discussion, however, still suggests too much independence of the spatial from the functional; there is an incomplete recognition of the construction of physical and spatial representation out of functional representation. (Such construction would be a part of Gibson’s tuning, not his information extraction.) Gibson still wants to go “from surfaces to affordances,” he does so by having “the composition and layout of surfaces constitute what they afford” (1979, p. 127), but such constitution still leaves the question of what representation of a surface is as logically prior, though no longer temporally prior, to a representation of an affordance. Yet infants can perceive affordances without necessarily perceiving the surfaces, edges, and full objects that provide, or constitute, those affordances.]



BR’s final point seems to be that originally Gibson had a notion of ecological direct encoding, where we directly perceive the information because our minds somehow resonate with it [and in that way “encode” it in the sense of endowing its internal form with an informational value that was already there in its external form, but I am guessing]. But on account of there needing still to be some process encoding that information, it was abandoned later for this notion of affordance and interactive information extraction, where the meaning or information to be discerned in something perceived is given by interacting with it, which gives us its important physical features that thereby directly inform us of its possible uses. In other words, the significance of a perceived thing is its potential uses, and that is directly perceived by interaction with it and its environment rather than by some cognitive process whereby that information about its significance is decoded from our static perceptions of it. [Note, I might be missing the idea, because BR are using the term “encoding” rather than “decoding”. I am a little confused about how it all works. Apparently according to some models, somehow at the site of sensations there is an encoding of information, and perhaps, but I am not sure, these models would have to say there is later a process that decodes this information (as if by a homunculus). The model is not very obvious to me yet. I wonder by the way if the encoding is anything like Lotze’s “local signs”.]


Interactive information extraction and affordances were the culminations of Gibson’s major moves away from his early ecological direct encoding. Although we later argue that those moves were nontrivially incomplete, nevertheless they transcended that early encoding model by constructing an intrinsically interactive mode of perception. Essentially, Gibson started with ecological direct encoding, then filled in the detection-differentiation-identification process, the process of ‘transducing’ the encodings, with so much interactive activity – extraction, resonance, pickup, affordance – so as to make it clear that whatever ultimate perceptual encoding, if any, occurred it was not primary nor necessary nor independent, but, rather, subsidiary to interactive extraction. Gibson’s basic insight was that it is possible to derive information about an environment from interactions with that environment without encoding anything from that environment.

(BR 17)






Bickhard, Mark, & D. Richie. On the Nature of Representation: A Case Study of James Gibson’s Theory of Perception. New York [and other cities]: Praeger, 1983.



Bickhard and Richie cited a number of Gibson sources:


Gibson, J. J. The ecological approach to visual perception. Boston: Houghton Mifflin, 1979.


Gibson, J. J. The perception of the visual world. Boston: Houghton Mifflin, 1950.


Gibson, J. J., & Gibson, E. J. Perceptual learning: Differentiation or enrichment? Psychological Review, 1955, 62, 32–41.


Gibson, J. J. The senses considered as perceptual systems. Boston: Houghton Mifflin, 1966.



Or if otherwise noted:

Merleau-Ponty, Maurice. The Structure of Behavior. Transl. Alden L. Fisher. Boston: Beacon Press, 1963.

Merleau-Ponty, Maurice. La structure du comportement. Paris: Presses universitaires de France, 1942 / 1967.



Image credits:

Automobile parallax motion diagram:

Travis Schirner, “Mission Possible?”