Infants' reliance on shape to generalize novel labels to animate and inanimate objects

Two experiments were conducted to examine infants' reliance on object shape versus colour for word generalization to animate and inanimate objects. A total of seventy-three infants aged 1;4 to 1;10 were taught labels for either novel vehicles or novel animals using a preferential looking procedure (Experiment 1) or an interactive procedure (Experiment 2). The results of both experiments indicated that infants limited their word generalization to those exemplars that shared shape similarity with the original referent for both animate and inanimate objects. These findings indicate that a strong reliance on shape is present earlier than previously shown. In Experiment 2, reliance on shape to generalize novel words did not vary as a function of vocabulary size. Thus reliance on shape versus colour for word generalization does not appear to increase in strength as a function of word learning during late infancy.

  - learning. Given the multitude of possible referents for a new word in any given situation, many researchers have argued that word learners are guided by biases or lexical principles (e.g. Markman,  ; Golinkoff, Mervis & Hirsh-Pasek, ). These biases are proposed to simplify the process of word learning by limiting the number of possible referents that a child must consider when confronted with a new word. In the present studies, we examined the role that object shape plays in restricting infants' hypotheses about the meaning of novel labels. Specifically, we investigated whether infants emphasize shape over colour to extend novel words and whether reliance on shape differed for animate and inanimate objects.
A large body of empirical research suggests that preschool-age children assume that objects with the same shape share the same linguistic label (e.g. Landau, Smith & Jones,  ; Smith, Jones & Landau, ). Baldwin () found that both two- and three-year-old children emphasized object form (which included similarity of shape, size, and material) rather than colour in their judgements about object label reference. In a series of studies with preschoolers and adults, Landau, Smith & Jones () examined whether shape, size, and texture were weighted differentially in judgements of word extensions. Across four experiments, they found that reliance on shape increased in strength and generality from two to three years and more markedly from childhood to adulthood. Interestingly, reliance on shape was stronger in classifications of labelled objects than in classifications of unlabelled objects for the children but not for the adults. Thus, Landau et al. concluded that, with development, an emphasis on shape comes to be more broadly applied to categorization tasks outside the lexical domain.
This research clearly indicates that preschool-age children attend to object shape vs. other perceptual properties in the presence of a novel count noun. Similarly, naturalistic studies of early production indicate that overextension errors occur mainly on the basis of shape (Clark,  ; Bowerman, ). Recent empirical studies have found that young children's reliance on shape similarity for word extension will even lead them to override strict adherence to ontological category. In tasks contrasting taxonomic kind and shape similarity, researchers have found that children extend a novel word to a shape match, even when this choice involves crossing ontological boundaries (e.g. matching a butterfly and a hairbow; Imai, Gentner & Uchida, ). It is perhaps not surprising that young children would rely on object shape to extend novel words in light of the information that shape affords. Object shape is a good cue to object identity in that it tends to vary much less than other perceptual properties such as size and colour. It is also an excellent predictor of basic-level category membership (Rosch, Mervis, Gray, Johnson & Boyes-Braem, ). Furthermore, shape is easily detected and children require no special experience to perceive shape, unlike other attributes such as function (Landau, ). Thus, children relying on object shape to extend novel words would have an easily accessible rule, as they could quickly determine whether objects shared the same shape. Furthermore, they would enjoy a relatively high rate of success, as objects with the same label frequently have similar shapes.
Given the value of a same-shape rule for lexical extension, it follows that such a strategy would be highly beneficial for infants at the initial stages of lexical acquisition. There is evidence that infants do emphasize shape similarity over taxonomic membership in judgements about object word reference (Poulin-Dubois, Frank, Graham & Elkin, in press). However, the issue of reliance on shape information vs. other perceptual information such as colour, texture, or size has not yet been tested during infancy. As such, one of the goals of the present studies was to examine infants' use of object shape versus colour to generalize novel words. We chose to contrast shape and colour for two reasons. First, shape and colour are among the first perceptual properties of objects processed by young infants (e.g. Bornstein, Kessen & Weiskopf,  ; Bomba & Siqueland, ). Shape, however, is highly predictive of object category membership whereas colour is not. Thus, we would expect that infants would weigh shape over colour in an object label extension task.[1] Secondly, Baldwin () found that three-year-olds did make some use of colour in extending novel words. She speculated that children under three may possess a strong form-over-colour bias which they may begin to relax by three years of age. The parameters of her experiments, however, did not allow her to test the strength of two-year-olds' form-over-colour bias. In the present study, we tested the shape-over-colour bias hypothesis in infants aged 1;4 to 1;10 who are in the midst of rapid vocabulary acquisition.
A second goal of the present research was to examine whether reliance on shape increases in strength with lexical development during late infancy. As noted earlier, Landau et al. () found that reliance on shape (vs. size and texture) increased with age (see also Smith, ). Smith, Jones & Landau () recently argued that attention to shape becomes linked to language as a consequence of learning count nouns. To date, however, this proposal has only been tested with preschoolers and has only been tested indirectly (i.e. using age rather than vocabulary size). In the present research, we pursued the question of whether infants' reliance on shape versus colour varies with vocabulary development. More specifically, we examined whether infants' reliance on shape varied with vocabulary size and vocabulary composition. If attention to shape is indeed a consequence of learning count nouns, we would expect those infants with more names in their vocabulary to exhibit a greater reliance on shape for word extensions than those infants who have fewer names in their vocabulary.
[1] Research has found that children have difficulty learning colour terms (e.g. Dockrell & Campbell,  ; Au & Laframboise, ). Thus, it is important to note that our goal is to examine the predictive weight the infants would assign to colour (in comparison to shape) for word generalization. We did not expect the infants to treat the novel words as colour terms.
The final goal of these studies was to examine the generality of infants' reliance on shape. Researchers have suggested that shape is more integral to certain categories of objects (e.g. artifacts) than to others (e.g. natural kinds) (Becker & Ward,  ; Jones, Smith & Landau, ). Using a word extension procedure and nonword classification task, Jones et al. () presented two- and three-year-old children with objects that varied in shape, size, and texture. Half of the objects had eyes and half did not. The results indicated that both eyes and colourful stickers led to a decrease in the number of same-shape choices while at the same time heightening the importance of texture for novel word extension. Interestingly, the decrease was more marked for three-year-olds than two-year-olds. Children did exhibit the shape bias in classifying objects without eyes in the word extension condition. Jones et al. suggest that the presence of eyes seemed to indicate animacy and that children understand that novel words emphasize the importance of different types of features for different types of objects.
This research suggests that indications of animacy can lead preschool-aged children to moderate their reliance on shape for lexical extension. It remains to be seen, however, whether infants follow the same pattern. It is conceivable that infants begin with a more rigid reliance on shape, using it to extend novel words to both animate and inanimate objects. Through experience with categories of different types, infants may come to understand that different word extension criteria apply to different ontological types. On the other hand, given that the inanimate-animate distinction is recognized early in life (e.g. Poulin-Dubois, Lepage & Ferland, ), animacy cues may lead infants to decrease their reliance on shape to extend novel words.
The experiments reported here were designed to examine three issues that have not yet been investigated during the infancy period: whether infants emphasize shape over colour to extend novel words (Experiments 1 and 2); whether infants' reliance on shape differs for animate and inanimate objects (Experiments 1 and 2); and whether reliance on shape varies with vocabulary size or vocabulary composition (Experiment 2). We chose to test infants ranging in age from 1;4 to 1;10 in these studies as there is a great deal of variability in vocabulary size during this period (Fenson, Dale, Reznick, Bates, Thal & Pethick, ).
In Experiments 1 and 2, infants were taught novel labels for two novel animal- or vehicle-like categories. The animals represented animate creatures, possessing facial features such as eyes and mouths, appendages, and a curved body shape, and are similar to those used by Becker & Ward (). The vehicles represented inanimate objects, possessing wheels and more straight line contours. At the end of the lexical training trials, infants were presented with three types of word generalization tasks: (1) an identity match; (2) a same shape exemplar varying only in colour with object parts remaining intact; and (3) a different shape exemplar which was created by changing the shape of the stimuli with the colour and object parts remaining intact. These word generalization tasks allowed for the assessment of reliance on shape versus colour in that one exemplar shared shape but not colour with the original exemplar whereas the other shared colour but not shape. It is important to note that object parts were held constant across all three objects.
We expected that infants would restrict their word generalizations to objects that shared shape similarity with the original referent. That is, infants should generalize the novel labels to the identical and same shape exemplars but should not generalize to the different shape exemplars. However, if infants, like preschoolers, moderate their reliance on shape similarity for animate categories, we expected those infants who learn the labels for the novel animals to be more accepting of a shape change than those infants who learn the labels for the novel vehicles. In Experiment 1, infants were taught the novel labels using pictures and the preferential looking procedure. In Experiment 2, infants were taught the novel labels using objects and a more interactive word learning situation.

EXPERIMENT 1
The preferential looking paradigm, as adapted by Golinkoff, Hirsh-Pasek, Cauley & Gordon, , was used to teach the infants the novel words in this experiment. A number of elements were included to approximate the naturalistic word learning circumstances encountered by an infant. First, the novel labels were embedded in natural sentence frames (e.g. 'Look! This is a dax.'). Secondly, these sentences were recorded with a female voice using the intonational contours of the infant-directed speech register with the novel word stressed and in the final position. Studies have shown that parents introduce novel labels by placing them in the final position on a pitch peak with an elongated vowel (e.g. Fernald & Mazzie, ). Finally, the infant's parent was asked to participate in the teaching phase by pointing to the screen and repeating the novel label for their infant, following a printed script.

Participants
Thirty-four infants,  boys and  girls, ranging in age from  ; . to  ; . (M =  ; ., S.D. = .) participated in the study. In order to be included in the final sample, infants had to be from families in which English was the predominant language spoken at home.[2] An additional  infants were tested but excluded from the sample for the following reasons: technical failure during testing (n = ), seated on parent's lap during testing (thus, infants may have received inadvertent cueing from their parents) (n = ), parent directed/reinforced choices during testing (n = ), did not meet trial criteria (n = ; see Data reduction section).
[2] Twenty-one additional infants were tested but excluded from the study because they were learning more than one language and English was not the predominant language spoken at home.

Materials and apparatus
Apparatus. A three-sided testing chamber was constructed for this task. The front panel display contained two  cm Macintosh colour monitors separated by  cm. Two audio speakers placed midway between the two monitors transmitted the verbal prompt. A  watt blue light situated above the speakers was used to direct the infant's attention to the centre of the apparatus. The infant's state and parent's behaviour were monitored via a small TV monitor. Trial length and synchronization of the auditory and visual stimuli were controlled by a Macintosh Centris microcomputer. All sessions were videotaped with an  cm camcorder, the lens of which was focused on the infant's face.
Stimuli. The initial pool of stimuli consisted of brightly coloured computer-generated line drawings representing eight novel categories. Two of the four animal categories were similar to those used by Becker & Ward (). For each category, three sets of exemplars were generated: (1) a category standard; (2) same shape exemplars varying only in colour with parts remaining intact; and (3) different shape exemplars which were created by changing the shape of the stimuli with the colour and parts remaining intact. A group of  adult raters (mean age =  years) was asked to identify the category membership of each object and then to rate the perceptual similarity of each exemplar to the category standard using a seven-point rating scale, with a score of one indicating that the exemplar shared very little similarity with the category standard and seven indicating that the exemplar was highly similar to the category standard.
Category exemplars were assessed according to their suitability for the word learning task. Four novel categories, two animal-like and two vehicle-like, selected from the eight initially rated were chosen for inclusion in the task. Each category included a category standard, a different colour, same shape exemplar, and a same colour, different shape exemplar. t-tests were used to ensure that the ratings of the category standards and different shape exemplars were significantly different from one another. In all cases, the category standards and the same shape exemplars were rated significantly more similar than the different shape exemplars (see Appendix). Furthermore, to ensure that the different shape exemplar was distinct enough from the standard to be classified as a member of a new category, only those different shape exemplars that were considered category members by fewer than  of the  adult raters (as per the binomial distribution) were chosen for inclusion in the word learning task.
Thus, there were three types of generalization stimuli for each target word: an identical match (the category standard), a same shape exemplar that differed only in colour from the identical match (SS), and a different shape exemplar (DS) (see Figs  and ). The identical match was used for the learning phase of each session. These pictures were scanned using the Abaton Scan /Colour Scanner for presentation using the Adobe Photoshop . software program. The word referents were labelled with one-syllable nonsense words with recorded voice prompts.

Design and procedure
Infants were randomly assigned to learn the two novel labels for the stimuli in the animals set (n = ) or the vehicles set (n = ). Infants were brought into the laboratory and seated in a baby chair attached to a table. The parent was seated behind and to the left of the child. Both the infant and the parent faced the front panel of the testing chamber from a distance of . m. In all phases of the experiment, each trial began with blank screens and the light blinking. The length of the trials was  sec with a  sec intertrial interval. The testing session consisted of four phases: (a) lexical training trials (four trials); (b) familiar word trials (six trials); (c) lexical training trials (four trials); and (d) novel word generalization trials (twelve trials). During the lexical training phase, infants were taught two novel labels for either the two novel vehicles (fep and rif) or the two novel animals (dax and sud), depending on the condition to which they had been assigned. The identical match (the category standard) for one novel animal or vehicle was first presented on both screens and the infant heard the novel word embedded in recorded natural sentence frames as follows: 'Look! This is a -. See the -'. The infant's parent was provided with a written script and was asked to participate in the teaching phase by pointing to the screen and repeating the novel label using natural sentence frames. After this trial, the identical match (category standard) for the second animal or vehicle was presented on both screens and labelled in the same manner. Each referent was presented twice in alternating order in each block of four word learning trials.
The familiar words trials were included to ensure that the present instantiation of the preferential looking paradigm was a sensitive measure of word comprehension for infants in this experiment. The infant was presented with three pairs of pictures of familiar objects (dog, cat, bird) without any specific instructions. The pictures appeared on screen and the infant heard 'Look! See the pictures! Look at the pictures!'. On subsequent trials, the infant was asked to find the referent of each of the three words (dog, cat, bird). The pictures appeared on screen and the infant heard 'Can you find the -? Where's the -? Find the -'. These words were chosen as they have been found to be in the receptive vocabulary of most -month-old infants (Fenson et al., ).
The familiar words trials were followed by another series of four lexical training trials, identical to the first set.[3] This was followed by the novel word generalization trials during which three different types of word generalization trials for each novel word were presented: (1) an identical match; (2) a different colour, same shape exemplar (SS); and (3) a different shape, same colour exemplar (DS). On each word generalization trial, a word referent was paired with the equivalent exemplar of the other word category taught in the session. That is, when the SS referent from one category was the target, it was paired with the SS item from the other category. For example, on a SS animal generalization trial, the infant would be presented with the SS sud and the SS dax and asked for the sud. These pairings were chosen to allow infants the opportunity to extend the words based on shape or colour. That is, we wanted to determine whether children would use colour to extend the novel words if shape similarity was not available as a cue. The specific instructions presented during the word generalization trials were: 'Can you find the -? Where's the -? Find the -?'. The side of presentation of the targets was counterbalanced. Thus, each version of the word referent appeared twice as the target of a word and twice as the non-target within the twelve generalization trials. Any given target appeared once on the right screen and once on the left screen. The order of presentation of the three types of word generalization trials was randomly determined with the constraint that no two instances of the same type of generalization trial be presented successively. This order was then fixed across infants.
[3] The familiar words trials were placed between the two lexical training blocks after pilot work revealed that infants' interest in the stimuli decreased after approximately four trials.
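The ordering constraint on the generalization trials can be illustrated with a short sketch. It assumes four trials of each type (three types across twelve trials) and uses simple rejection sampling; the function name, the seed, and the sampling method are illustrative assumptions, not the procedure actually used to build the trial list, and side counterbalancing is not modelled.

```python
import random

def constrained_order(trial_types, repeats, rng, max_tries=10_000):
    """Shuffle trials until no two successive trials share a type."""
    pool = [t for t in trial_types for _ in range(repeats)]
    for _ in range(max_tries):
        rng.shuffle(pool)
        # Accept the shuffle only if every adjacent pair differs in type.
        if all(a != b for a, b in zip(pool, pool[1:])):
            return list(pool)
    raise RuntimeError("no valid order found within max_tries")

# A fixed seed (hypothetical) gives one order that can be reused for all infants.
rng = random.Random(0)
order = constrained_order(["identical", "SS", "DS"], repeats=4, rng=rng)
print(order)
```

Generating the order once and reusing it mirrors the statement that the order was fixed across infants.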

Interobserver reliability and scoring
The amount of time (in seconds) that the infant looked at each stimulus was coded from the videotapes. Two observers, who were unaware of the order and position of the pictures, coded a random selection of  % of the sessions. Pearson product-moment correlations were computed between observers' ratings of the total time the infant looked at each side. Mean interobserver reliability was r = . (S.D. = .). Three visual fixation variables were coded for each trial: looking time on the right screen, looking time on the left screen, and looking time off-screen. Data were screened to assess for outliers, normality, and skewness. No eliminations were made on these bases. In addition, trials were omitted if: (a) fixation to one of the stimuli was  % of total fixation to the pair; and/or (b) total fixation time was less than . sec ( / of trial duration). These criteria, which are similar to those used in previous studies (e.g. Reznick & Goldfield, ), were used to ensure that only data from infants who were exposed to both pictures for a sufficient period of time were included. Seven percent of all trials were eliminated for these reasons. Participants were excluded from the analyses if they exhibited a side preference across all trials (i.e. looking at one side more than  % of the time across all trials) or if their total fixation time to both stimuli was less than . sec ( / of trial length) for more than  % of the trials. As described in the Participants section, six infants were eliminated for the latter reason.
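The two trial-level omission criteria can be sketched as a simple filter. The thresholds are left as required parameters, and the values in the usage lines (1.0 sec and 0.95) are placeholders chosen for illustration only, not the study's criteria. The function name and the choice to omit a trial when one stimulus reaches or exceeds the share threshold are likewise assumptions.

```python
def keep_trial(left_sec, right_sec, min_total_sec, max_share):
    """Return True if a trial passes both omission criteria.

    (a) neither stimulus takes up max_share or more of total fixation;
    (b) total fixation to the pair is at least min_total_sec.
    """
    total = left_sec + right_sec
    if total <= 0 or total < min_total_sec:
        return False  # criterion (b): too little looking overall
    return max(left_sec, right_sec) / total < max_share  # criterion (a)

# Made-up (left, right) looking times in seconds; thresholds are placeholders.
trials = [(2.0, 2.0), (0.2, 0.3), (4.0, 0.0)]
kept = [t for t in trials if keep_trial(*t, min_total_sec=1.0, max_share=0.95)]
print(kept)
```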

Familiar word trials
The data from the familiar word trials were used to assess whether the present version of the preferential looking paradigm was a sensitive measure of word comprehension for infants in this age group. First, the amounts of time infants fixated the familiar items when no label was heard versus when these items were the referents of target words were compared using a paired t-test. This analysis enabled us to rule out simple salience or preference effects by comparing infants' baseline fixation of the target animals when they were simply directed to look at the pictures with their fixation of the animals when they were referents of target words (i.e. the amount of time infants looked at the dog during the no label trials versus the amount of time infants looked at the dog when instructed to find the dog). This analysis indicated that infants looked significantly longer at the items when they were the referents of target words (M = . sec, S.D. = .) than when they were not labelled (M = . sec, S.D. = .), t() = ., p ..

'   
Secondly, infants' fixation of the referent of the target word was compared to their fixation of the non-target stimulus presented within the same trial (e.g. fixation of the dog when referent of the target word versus fixation of the cat which was presented on the same trial). A paired t-test indicated that infants fixated the referents of the target words (M = . sec, S.D. = .) significantly longer than they fixated the non-target stimuli (M = . sec, S.D. = .), t() = ., p .. Taken together, these analyses indicate that infants understood the instructions presented and that this version of the preferential looking paradigm was a sensitive measure of word comprehension for this group of infants.
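For readers less familiar with the test used in these comparisons, the paired t statistic is the mean of the per-infant differences divided by its standard error, with n - 1 degrees of freedom. The sketch below uses the Python standard library; the function name and the looking times are made up for illustration and are not data from this study.

```python
import math
from statistics import mean, stdev

def paired_t(labelled, baseline):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(labelled) != len(baseline) or len(labelled) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(labelled, baseline)]
    sd = stdev(diffs)  # sample SD of the per-infant differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical looking times (sec) for three infants: labelled vs. no-label trials.
t, df = paired_t([3.0, 4.0, 5.0], [2.0, 2.0, 2.0])
print(round(t, 3), df)
```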

Novel word generalization trials
Infants' generalizations of the novel words were examined in three sets of analyses. First, preference scores were calculated for each of the three target types (i.e. identical, SS, DS). These preference scores reflected the amount of time infants fixated the target stimulus while controlling for their fixation of the non-target stimulus presented on the same trial. Thus, these scores represent a highly sensitive measure of infants' generalization of the novel words to the various target types. These scores were calculated by subtracting the amount of fixation to the non-target stimulus from the fixation to the target stimulus (e.g. fixation to the identical dax when referent of the target word minus fixation to the non-target sud presented on the same trial). Preference scores were calculated for each of the trial types for each of the two novel words infants were taught and then collapsed across words for a mean score. Thus, each child had three preference scores (i.e. identical, SS, DS). A 2 (ontological category: animals vs. vehicles) × 3 (target type: identical, SS, DS) mixed factor ANOVA was conducted to examine these preference scores. Ontological category was the between-subjects factor and target type was the within-subjects factor. This analysis yielded only a significant main effect of target type, F(, ) = ., p ..
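The preference-score computation described above can be sketched as follows. The sketch assumes that repeated trials for the same word and target type are averaged before collapsing across the two words; the text does not specify how repeats were combined, so that step, the function name, and the looking times are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean

def preference_scores(trials):
    """Compute one preference score per target type for a single infant.

    trials: iterable of (word, target_type, target_sec, nontarget_sec).
    Each trial score is target fixation minus non-target fixation.
    """
    per_word = defaultdict(list)
    for word, target_type, target_sec, nontarget_sec in trials:
        per_word[(target_type, word)].append(target_sec - nontarget_sec)
    # Average within each word first, then collapse across words.
    per_type = defaultdict(list)
    for (target_type, _word), diffs in per_word.items():
        per_type[target_type].append(mean(diffs))
    return {target_type: mean(ms) for target_type, ms in per_type.items()}

# Made-up looking times (sec) for one hypothetical infant in the animals set.
trials = [
    ("dax", "identical", 3.0, 1.0), ("sud", "identical", 2.5, 1.5),
    ("dax", "SS", 2.0, 1.0), ("sud", "SS", 3.0, 2.0),
    ("dax", "DS", 1.5, 2.0), ("sud", "DS", 2.0, 1.5),
]
scores = preference_scores(trials)
print(scores)
```

A positive score indicates longer looking at the target than at the non-target on the same trial; a score near zero indicates no preference.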
We predicted that infants would generalize the novel labels to the identical and SS exemplars but not the DS exemplars. Planned comparisons indicated, as expected, that identical preference scores (M = . sec, S.D. = .) did not differ significantly from SS preference scores (M = . sec, S.D. = .), t() = −., p .. In contrast, identical preference scores were significantly larger than DS preference scores (M = −. sec, S.D. = .), t() = ., p .. Similarly, SS preference scores were significantly larger than DS preference scores, t() = ., p .. These comparisons indicate that infants looked longer at the target word referents than at the non-targets presented on the same trial, but only for the identical and SS exemplars (as indicated by positive preference scores).
In the second set of analyses, infants' fixations of the identical, SS and DS objects when they were the targets of the words were compared to fixation of the same objects when they were the non-targets (on another trial), using planned comparisons. That is, we compared how long infants looked at the identical dax, for example, when they were asked to find the dax versus how long they looked at this exemplar when they were asked to find the sud (and the dax was the non-target). The mean fixation times for each exemplar type are presented in Table . Infants looked significantly longer at the identical and SS exemplars when they were targets than when they were non-targets, t() = ., p ., and t() = ., p ., respectively. In contrast, infants did not look significantly longer at the DS exemplars when target or non-target, t() = −., p .. These comparisons provide critical evidence indicating that infants were actually generalizing the novel words to the target referents, and not simply fixating them because these exemplars were interesting to them. Thus, consistent with the results of the first set of analyses, this analysis indicated that infants generalized the novel words to the identical and SS exemplars, but not to the DS exemplars.
Finally, in the third set of analyses, we explored the effect of ontological category on infants' generalization of the novel words to the different exemplar types. The results of the ANOVA reported earlier indicated no main effect of, nor interaction with, ontological category. However, in light of research with preschoolers indicating greater acceptance of shape changes for animates (e.g. Jones et al., ), we chose to further explore this possibility. We conducted planned comparisons using difference scores representing the amount of time infants fixated a referent target minus the amount of time they fixated the same referent when non-target (e.g. difference score = identical dax when target minus identical dax when non-target). We chose difference scores as they allowed us to examine generalization to the different exemplars when the target of the label while controlling for simple preferences for these same exemplars when the non-target on another trial. Paired t-tests indicated no significant difference between the identical animal (M = . sec, S.D. = .) and identical vehicle (M = . sec, S.D. = .) difference scores, t() = −., p .. Similarly, the SS animal (M = . sec, S.D. = .) and SS vehicle (M = . sec, S.D. = .) difference scores did not differ significantly, t() = −., p .. Most importantly, the DS animal (M = −. sec, S.D. = .) and DS vehicle (M = −. sec, S.D. = .) difference scores did not significantly differ, t() = −., p .. We repeated these analyses using raw looking times to the referents when target as well as the identical, SS, and DS preference scores described earlier. In each case, we obtained the same results: infants were not more accepting of a shape change for the animals than for the vehicles.


As expected, infants generalized the novel labels to the identical and same shape referents but not to the different shape referents. It is possible that infants' lack of preference for either of the objects on the DS trials could reflect overgeneralization of the labels to both DS objects. That is, given that both items on a DS trial shared either animate or inanimate features, infants may have generalized the labels based on those features and thus considered both objects likely referents of the novel labels. However, if infants were indeed overgeneralizing the novel labels on the basis of animate or inanimate features, one would expect them to do so on the identical and SS trials as well. Instead, on these trials, infants only generalized the label to the same shape items. Furthermore, although the DS exemplars differed in shape from their respective target exemplars, they did share colour similarity as well as object parts with the original referents. Thus, if infants were to overgeneralize the label, one would expect them to choose the appropriate DS exemplar as it still shared more features with the original referent than the foil did. Given the pattern of generalization across all trials, infants' lack of preference for a particular object on the DS trials does appear to indicate a lack of generalization of the novel labels due to the low shape similarity of the DS exemplars.
Infants' reliance on shape to generalize words occurred regardless of ontological category type. That is, infants restricted the novel label to same-shaped objects for both animate and inanimate objects. It is possible, however, that the use of pictures, rather than objects, prevented participants from perceiving the ontological kind of the animals and vehicles, thereby accounting for the lack of effect. Some authors have argued that two-dimensional stimuli are impoverished and provide participants with only limited information about the objects (e.g. Deák & Bauer, ). To address this possibility, we conducted a second experiment using a more interactive procedure and three-dimensional objects.
In Experiment , we also examined the role of vocabulary size in moderating infants' reliance on shape for word generalization. As noted earlier, recent research suggests that a strong reliance on shape for word extension may be present early in word learning (e.g. Poulin-Dubois et al., in press). Other researchers, however, argue that reliance on shape actually increases in strength with lexical development (Landau et al.,  ; Smith,  ; Smith et al., ). To assess vocabulary size, we had parents complete the MacArthur Communicative Development Inventory (Fenson et al., ). If infants do learn about the importance of shape through word learning experience, we expected that those infants with larger vocabularies should be more likely to restrict their word extension to same-shape objects. In contrast, those infants with smaller vocabularies who have supposedly not narrowed their word extensions to shape should generalize to both the same shape and different shape match as they can rely on similarity of shape and similarity of colour, as well as the object parts that were held constant across exemplars.

Participants
Thirty-nine infants,  boys and  girls, ranging in age from  ; n to  ; n (M l  ; n, .. l n) participated in the study. An additional  infants were tested but excluded from the sample for the following reasons: did not complete the testing (n l ), experimenter error (n l ), and parental reinforcement/interference (n l ). Participants were from families for whom English was the predominant language spoken at home.

Materials and stimuli
Vocabulary checklist. The MacArthur Communicative Development Inventory: Words and Sentences (MCDI; Fenson et al., ) was used to assess the infants' expressive vocabulary. These vocabulary data were used to divide the infants into vocabulary size groups, with infants with fewer than  words in their productive vocabularies classified as falling in the low vocabulary group and infants with more than  words falling in the high vocabulary group. We should note that this criterion has also been used as a marker of the vocabulary spurt (Lucariello, ).
Stimuli. The pictures of the novel animals and vehicles used in Experiment 1 were transformed into small objects made of a dough-like substance and baked until hard. These objects were then painted so that they were identical to the stimuli used in Experiment 1. The objects were approximately n cm in height. A small toy slide attached to a board was also used to keep infants interested in the procedure.

Design and procedure
Twenty infants were tested with the animals set and  infants were tested with the vehicles set by one of two female experimenters. Infants were brought into the laboratory and seated either in a baby chair attached to a table with their parent seated beside them or on their parent's lap. The experimenter sat directly across the table from the infant. Parents were asked to complete the MCDI while the infant was being tested. One infant was tested in her daycare using an identical seating arrangement. The experiment began with a  min familiarization phase during which the infant was allowed to play with the training objects from all four categories. The experimenter placed all four objects on the table and allowed the child to play with them. This was followed by the lexical training trials during which the child was taught two new words for either the animals or the vehicles. During these trials, the experimenter handed an object to the infant and labelled it three times using natural sentence frames (e.g. 'Look. This is a dax. See the dax. Look at the dax.') and intonation patterns characteristic of infant-directed speech. The experimenter then removed the first object and handed the second object to the child. This alternating sequence was repeated three times for each object for a total of six trials. This phase lasted - min.
The lexical training trials were followed by a series of familiar words trials. This series of trials served to orient infants to the task of picking objects. Before the session, the experimenter asked the parent for the names of four common objects that their infant understood (e.g. dog, keys, bottle, cup). The experimenter arranged three objects on a tray and presented them to the child saying 'Can you find the ___? Put the ___ down the slide'. Once the child had either pointed to or picked up an object, the experimenter said 'Okay, put it down the slide' in a neutral tone of voice. Each object in a given infant's set served as a target on one trial and a non-target on two trials to control for salience effects.
The familiar words trials were followed by another series of six lexical training trials, identical to the first. Three different types of word generalization trials for each novel word were then presented: (1) an identical match; (2) a different colour, same shape exemplar (SS); (3) a different shape exemplar (DS). The order of presentation of the three types of word generalization trials was randomly determined with the constraint that no two instances of the same type of generalization trial be presented successively. This order was then fixed across infants. As in the previous experiment, each word referent was paired with the equivalent exemplar of the other word category taught in the session. One difference in this experiment, however, was that a third distracter object from the set not taught to the infant was included in the generalization trials. For example, when the SS dax object (animals set) was the target, it was paired with the SS sud object (animals set) and the SS fep object or SS rif object (vehicles set). Each version of the word referent appeared twice as a target and twice as a non-target within the  generalization trials. The position of the target in the array was randomized and counterbalanced such that the target appeared in each position an equal number of times. The experimenter placed the three objects on the tray and asked the infant for the target using natural sentence frames (e.g. 'Show me the dax. Put the dax down the slide.'). Once the child had either pointed to or picked up an object, the experimenter said 'Okay' in a neutral tone of voice.
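The ordering constraint described above (no two successive trials of the same generalization type) can be illustrated with a minimal sketch. It assumes simple rejection sampling, which is not necessarily how the order was actually generated; the function name, the seed, and the number of repeats per trial type are hypothetical and chosen only for illustration.

```python
import random


def constrained_order(trial_types, repeats, seed=0, max_tries=10000):
    """Return a shuffled trial list in which no two adjacent trials share a type.

    Hypothetical helper: reshuffles until the constraint holds (rejection
    sampling). `repeats` is an assumed number of trials per type.
    """
    rng = random.Random(seed)  # fixed seed so the order can be reused across infants
    trials = [t for t in trial_types for _ in range(repeats)]
    for _ in range(max_tries):
        rng.shuffle(trials)
        # Accept the shuffle only if every neighbouring pair differs in type.
        if all(a != b for a, b in zip(trials, trials[1:])):
            return list(trials)
    raise RuntimeError("no valid order found within max_tries")


order = constrained_order(["identical", "SS", "DS"], repeats=4, seed=1)
print(order)
```

Generating the order once with a fixed seed and reusing it mirrors the statement that the order was fixed across infants.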
The experimenter scored each infant's choice on the familiar words and novel word generalization trials. A random selection of  % of the sessions was coded twice, once by each experimenter. Cohen's kappa was then calculated for each infant as an index of interobserver agreement (Cohen, ). Kappas ranged from n to n with a mean of n (.. l n). Any disagreements were subsequently discussed and resolved.
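For readers unfamiliar with the agreement index, the following is a minimal sketch of Cohen's kappa for two coders who each assign one category per trial. The function name and the example codes are hypothetical; the sketch is not the authors' scoring script.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length lists of categorical codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must score the same, non-empty set of trials")
    n = len(coder_a)
    # Observed proportion of trials on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four trials scored by two experimenters.
a = ["target", "target", "foil", "foil"]
b = ["target", "target", "foil", "target"]
print(cohens_kappa(a, b))
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters when one response category is chosen much more often than the others.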

  
The data from the MCDI were used to divide the infants into low and high vocabulary groups. Eleven infants had productive vocabularies of less than  words and thus fell within the low vocabulary group (seven males and four females, mean age l  ; n months, .. l n). Infants in this group had vocabularies that ranged from three to  words (M l n, .. l n). There were  infants in the high vocabulary group ( males and  females, mean age l  ; n months, .. l n). The average vocabulary size of these infants was n words, ranging from  to  words (.. l n).

Familiar words trials
On the familiar words trials, infants chose the correct referent an average of n % of the time (.. l n), which was significantly greater than would be expected by chance alone ( %), t() l n, p n, indicating that they understood the demands of the task.

Novel word generalization data
Infants' generalizations of the novel words were examined in three sets of analyses. First, the percentage of target choices was computed for each of the three types of word generalization trials. The means are presented in Table  . These data were then analysed in a 2 (vocabulary group: low vs. high) × 2 (ontological category: animals vs. vehicles) × 3 (target type: identical, SS, DS) mixed factor ANOVA. Vocabulary group and ontological category were between-subjects factors and target type was the within-subjects factor. This analysis yielded only a significant main effect of target type, F(, ) l n, p n.
As in Experiment , to assess our prediction that infants would generalize the novel labels to the identical and SS referents but not to the DS referents, we conducted planned paired t-tests. As expected, infants chose the identical referent (M l n %, .. l n) significantly more often than the DS referent (M l n %, .. l n), t() l n, p n, but not more often than the SS referent (M l n %, .. l n), t() l n, p n. Infants also chose the SS referent significantly more often than the DS referent, t() l n, p n. Comparison of the percentage of target responses to chance performance ( %) for each exemplar type indicated that the target referent was chosen above chance-levels for the identical and SS exemplars (t() l n, p n, one-tailed test and t() l n, p n, one-tailed test, respectively) but not for the DS exemplar (t() lkn, p n, one-tailed test). (We report one-tailed p values for all the chance-level comparisons as they are directional tests, Rosenthal & Rosnow, ).
Because the effect observed might have been due to a general bias to select salient objects, planned comparisons were conducted between the percentage of choices when objects were the target word referents versus when they were one of the two non-targets on another trial. Infants selected the identical and SS objects significantly more often when they were the target referent versus when they were not, t() l n, p n and t() l n, p n, respectively (see Table ). Selection of the DS exemplar did not differ when the objects were targets or non-targets, t() l n, p n. These data suggest that infants chose the identical and SS exemplars when they were targets because they had learned the novel words and not simply because these objects were interesting to them.
In the second set of analyses, we conducted planned comparisons to rule out any possible influence of ontological category on infants' generalization of the novel words in this task. Planned paired t-tests indicated that infants' percentage target choices of the identical animals (M l n %, .. l n) did not differ significantly from their target choices of the identical vehicles (M l n %, .. l n), t() l n, p n. Similarly, target choices of the SS animals (M l n %, .. l n) and SS vehicles (M l n %, .. l n) did not differ significantly, t() lkn, p n. Most importantly, infants were not more likely to choose the target DS animals (M l n %, .. l n) than the target DS vehicles (M l n %, .. l n), t() l n, p n. We repeated these analyses using difference scores reflecting the number of choices of the objects when they were targets of the novel words minus the number of choices of the objects when they were non-targets (as in Experiment 1) and obtained the same results. We also compared the percentage target-choice scores for each exemplar to chance-level responding. The identical animal, identical vehicle, SS animal, and SS vehicle scores were all above chance (all t-tests p n) whereas neither the DS animal nor the DS vehicle scores differed significantly from chance (all t-tests p n). Thus, consistent with the results of Experiment 1, infants were not more accepting of a shape change for the animals than for the vehicles.
In the third set of analyses, we explored in depth the potential influence of vocabulary on infants' performance on this task. As reported above, the vocabulary group by target type interaction was not significant. However, we conducted three planned paired t-tests to compare these groups on percentage of responses for each target type (see Table  for each vocabulary group). Infants with less than  words in their vocabularies did not differ significantly from those with more than  words in percentage of target choices of the identical (t() l n, p n), SS (t() lkn, p n), and DS exemplars (t() lkn, p n). Comparisons to chance-level responding conducted separately for each vocabulary group indicated that both the low and high vocabulary group infants chose the target referents significantly more often than would be expected by chance alone for the identical (low vocabulary group : t() l n, p n ; high vocabulary group : t() l n, p n) and SS exemplars (low vocabulary group : t() l n, p n ; high vocabulary group : t() l n, p n) but not for the DS exemplars (low vocabulary group : t() lkn, p n ; high vocabulary group : t() l n, p n).
[Table: percentage of target choices for each vocabulary criterion (cell values not recoverable). a Greater than chance-level responding, one-tailed. b Standard deviations in parentheses.]
Although the -word criterion is commonly used (e.g. Lucariello, ), it is possible that the  word marker may not have been sensitive enough to differences in vocabulary size. The data were therefore reanalysed using two additional criteria : the achievement of  productive words (see Smith, ) and the achievement of  object names in productive vocabulary (see Gopnik & Meltzoff, ). Like the -word criterion, these two criteria have also been used in the literature as markers of the vocabulary spurt.
First, the sample was divided into vocabulary groups using the criterion of  productive words in vocabulary. Eighteen infants had less than  words in their vocabulary (M l n words, .. l n, range : - words) with  having more than  words (M l n words, .. l n, range : - words). A 2 (vocabulary group: less vs. more than  words) × 2 (ontological category: animals vs. vehicles) × 3 (target type: identical, SS, DS) mixed factor ANOVA yielded only a significant main effect of target type, F(, ) l n, p n, consistent with the ANOVA reported above. To further explore the potential role of vocabulary, three paired t-tests were conducted to compare these groups on percentage of responses for each target type. The means are presented in Table . Infants with less than  words in their vocabularies did not differ significantly from those with more than  words in percentage of target choices of the identical (t() lkn, p n), SS (t() lkn, p n), and DS exemplars (t() l n, p n). Comparisons to chance-level responding conducted separately for each vocabulary group indicated that both the low and high vocabulary group infants chose the target referents significantly more often than would be expected by chance alone for the identical (low vocabulary group : t() l n, p n ; high vocabulary group : t() l n, p n) and SS exemplars (low vocabulary group : t() l n, p n ; high vocabulary group : t() l n, p n) but not for the DS exemplars (low vocabulary group : t() lkn, p n ; high vocabulary group : t() lkn, p n).
A second ANOVA was then conducted, using the achievement of  names to divide the vocabulary groups. Twenty infants had less than  names in their productive vocabularies (M l n names, .. l n, range : - names), with the remaining  having more than  names (M l n names, . l n, range : - names). This analysis yielded only a significant effect of target type, F(, ) l n, p n. Three planned comparisons were conducted to compare infants with less or more than  names in vocabulary on percentage of responses for each target type. The means are presented in Table . As expected, infants with less than  names in their vocabularies did not differ significantly from those with more than  names in percentage of target choices of the identical (t() lkn, p n), SS (t() lkn, p n), and DS exemplars (t() lkn, p n). Comparisons to chance-level responding conducted separately for each vocabulary group indicated that both the low and high vocabulary group infants chose the target referents significantly more often than would be expected by chance alone for the identical (low vocabulary group : t() l n, p n ; high vocabulary group : t() l n, p n) and SS exemplars (low vocabulary group : t() l n, p n ; high vocabulary group : t() l n, p n) but not for the DS exemplars (low vocabulary group : t() lkn, p n ; high vocabulary group : t() l n, p n).
Finally, we computed correlation coefficients between infants' age, vocabulary size and proportion of names in their vocabulary (as an index of vocabulary composition), and percentage of choices of the identical, SS, and DS exemplars. There were no significant correlations, suggesting that infants' use of shape to generalize novel words in this task does not vary with vocabulary size or age.
The overall word learning performance of participants in this study was consistent with that reported by other researchers studying children of the same age using similar paradigms (e.g. Ross, Nelson, Wetstone & Tanouye,  ; Poulin-Dubois, Graham & Riddle, ). In accord with the previous experiment, infants in this study generalized the novel words to the same shape exemplars but not to the different shape exemplars. There were no effects of vocabulary size or ontological category type (see General Discussion). Thus, the lack of effect of ontological category in Experiment 1 cannot be attributed to the use of two-dimensional stimuli.

General discussion
The two studies presented here were designed to examine the relative importance of shape similarity in infants' lexical generalizations. These studies specifically examined infants' reliance on shape versus colour to generalize novel words. The results of both experiments indicated that infants limited their word generalization to those exemplars that had the same shape as the original referent. Thus, infants clearly were not relying on just any type of perceptual similarity for word generalization, as the different shape exemplar also shared some perceptual features (e.g. colour, object parts) with the original referent. Infants who relied on any type of perceptual similarity as their primary basis for word generalization would likely have extended the novel label on the basis of colour as well as object parts (which all three exemplars shared) in a forced-choice paradigm. The current findings extend the results of previous work indicating that preschool-aged children emphasize shape over colour for novel word extension (e.g. Baldwin, ) to younger children.

'   
In Experiment , infants extended the novel words based on shape regardless of age, vocabulary size or vocabulary composition. This finding is noteworthy given that infants ranging in age from  ;  to  ;  years with productive vocabularies ranging from three to over  words were included in the study and three vocabulary size criteria were used. Those infants with larger productive vocabularies were not more likely to rely on shape than those infants with smaller vocabularies. Thus, reliance on shape over colour for word generalization appears to be present at the beginning of lexical acquisition and does not appear to increase in strength as a function of word learning experience over the range of ages examined in this study. This finding is consistent with that of Poulin-Dubois et al. (in press) who found that children of  ;  and  ;  extended novel words based on shape similarity, rather than taxonomic kind, regardless of vocabulary size or composition.
This lack of relation between vocabulary size and reliance on shape, however, does contrast with Landau et al.'s () proposal that vocabulary acquisition contributes to the development of the shape bias. This proposal is based on their finding of an increase in reliance on shape for word extension between two and three years of age. There are important differences between our studies and those of Landau et al. that may account for this inconsistency. First, Landau et al. contrasted shape similarity with textural and size similarity using objects (e.g. U-shaped objects) that do not resemble real world categories. In contrast, we examined shape versus colour using categories that bear more resemblance to real-world categories. Secondly, and most importantly, Landau et al. tested preschoolers whereas we tested infants. In our studies, we do not address the issue of whether greater reliance on shape over size and texture does indeed increase with vocabulary or age during the preschool years. Our data speak only to the lack of relation between vocabulary size and reliance on shape during the infancy years. We should note, however, that our data are in accord with studies which find that preschoolers initially possess a strong shape bias that eventually declines with age. For example, studies indicate that a preference for shape similarity over taxonomic similarity for word extension actually declines with age (e.g. Imai et al., ).
Infants' demonstration of an appreciation of the linkage between words and object shape regardless of vocabulary size points to the primacy or 'special' status of shape information. There are many reasons why the use of shape as a cue to word reference would not require extensive experience. The perception of shape is integral to the object recognition system, with shape perception fully mature by the end of infancy (Landau, ). Furthermore, shape can be easily perceived and does not require experience with the object, unlike function or other non-perceptual attributes. Finally, shape is highly predictive of both object identity and category membership, particularly at the basic level. Thus, shape potentially yields important information about other 'deeper' characteristics such as function and taxonomic kind. When one considers the ease with which shape is perceived and its informativeness, it is not surprising that even young infants would rely on shape for word generalization.
The finding that reliance on shape is present at the beginnings of lexical development suggests that this strategy may evolve out of a general categorization strategy. Well before infants are beginning to produce language, they are experiencing objects and object categories in the world. Studies of non-verbal categorization by young infants have found a strong reliance on shape similarity for categorization. For example, infants aged three and four months have been shown to form categories of common geometric forms (e.g. circles ; Bomba & Siqueland, ). Leslie, Hall & Tremoulet () presented one-year-old infants with two objects (e.g. a triangle and a circle), passing in and out from behind an opaque screen. When the screen was lowered, infants increased their looking times to the stimuli following changes in object shape but not in object colour. Thus, in a nonverbal task, infants appear to individuate objects by shape but not by colour.
A further goal of the present research was to examine whether infants' reliance on shape to extend novel words varied as a function of ontological category type. We found that infants did not extend the novel words to the different shape exemplar for either the animals or vehicles. Thus, unlike the case for preschool-age children, the presence of animacy features did not lead infants to moderate their reliance on shape. Of course, one explanation for the lack of difference between the animate and inanimate categories is that infants did not perceive the difference between these two category types. This possibility is unlikely for a number of reasons. First, the categories were designed specifically to indicate animacy or inanimacy. The animals all possessed facial features such as eyes and mouths, appendages, and a curved body shape whereas the vehicles represented inanimate-like objects, possessing wheels and more straight line contours. These same characteristics have been used to indicate animacy in previous word learning studies (e.g. Becker & Ward,  ; Jones et al., ). Secondly, informal observations indicated that the infants treated the animals and vehicles differently. In Experiment 2, when infants were able to manipulate the objects, they tended to move the animals in more animate-like ways (e.g. making them walk). In contrast, infants tended to treat the novel vehicles as they do other vehicles: by rolling them along and sometimes making 'vroom vroom' noises. Finally, other studies have found that infants as young as nine months are sensitive to the distinction between animals and vehicles using features similar to those used in the present studies (e.g. Mandler & McDonough,  ; Rakison & Butterworth,  ; Van de Walle & Hoerger, ). Thus, it is unlikely that infants did not perceive the novel animals as animate-like creatures, although that explanation cannot be completely eliminated.

'   
This finding, albeit preliminary, that animate features did not lead infants to be more accepting of shape changes may indicate that their reliance on shape for word generalization is somewhat more rigid than that of older children. This notion that younger children have a more rigid shape bias than older children is supported to some extent by Jones et al.'s () finding that the addition of eyes to stimuli decreased choices of same-shaped objects for three-year-olds more so than for two-year-olds. This proposal is highly speculative for two reasons. First, the present studies are the only studies to date that examine the effect of animacy on infants' reliance on shape for word generalization. Secondly, we presented infants with only one type of extreme shape change. Future studies which present infants with a number of different shape changes are needed to fully examine the possible effect of animacy on infants' reliance on shape. One particularly interesting avenue for future research would be to present infants with postural shape variations that are unrelated to an animate object's identity such as those used by Becker & Ward (). These researchers presented three- and five-year-old children and adults with a novel label for a flat snake-like creature and asked them to judge whether other flat snake-like, curled snake-like (postural-change), and snail-like creatures (identity-change) could also receive the same label. They found that participants of all ages accepted, to some extent, the postural-change creatures, but were less likely to accept the identity-change creatures. Becker & Ward conclude that children as young as  ;  years distinguish shape differences that are related to the identity of an object from those that indicate temporary changes.
The results of the present studies, when considered in conjunction with other recent empirical work, provide a developmental perspective on the importance of shape for word extension. We propose that infants rely on shape, versus colour, for word extension, even at the beginning of lexical development. This reliance on shape may be due to the ease of perception of shape similarity as well as previous non-verbal experience. Reliance on shape affords infants an easily accessible and reliable means for extending novel words. This strong reliance on shape can account for the preponderance of basic-level names in infants' early vocabularies; however, a continued rigid adherence to shape would be an inefficient strategy. By grouping objects together on the basis of shape with a high degree of success, infants are likely to discover other commonalities amongst these objects, such as functional similarity. Research with preschool-age children suggests that perceptual similarity may indeed 'bootstrap' other more abstract knowledge about objects (e.g. Gentner & Imai, ). We propose that infants gradually moderate their reliance on shape and incorporate other word extension bases, such as function and taxonomic kind, into their word extension repertoire. Colour may also serve as a basis for word extension in preschoolers, particularly for kinds of food (e.g. Macario, ). Thus, infants develop the ability to consider a multitude of properties, both perceptual and non-perceptual, when extending words. Future research focusing on the conditions under which children rely on different types of information for word extensions will offer valuable insights into the nature of early word learning.
In summary, lexical extensions to both animate and inanimate categories are restricted by shape similarity by the middle of the second year of life, indicating that a strong reliance on shape is present earlier than previously shown. This reliance on shape to extend novel words did not vary as a function of vocabulary size or vocabulary composition. This suggests that a same-shape strategy (versus a same-colour strategy) for word generalization does not increase in strength with vocabulary acquisition during late infancy.