

Polymorphisms, shifting balance,
and speciation

James Mallet1


Mathieu Joron2

1Galton Laboratory
4 Stephenson Way
tel: +44-171-380-7412/7411
fax: +44-171-383-2048 


2Génétique et Environnement
Université de Montpellier 2
Place Bataillon
F-34095 MONTPELLIER cedex 5

KEYWORDS: Aposematism, Batesian mimicry, Müllerian mimicry, defensive coloration, predator behaviour


*Preprint of article published Nov. 1999, Annual Review of Ecology and Systematics 30:201-233. The version below is probably sufficient for most online purposes for those who do not have access to the journal, although some of the explanations are clearer in the final version, which can be downloaded from Annual Reviews Inc.  On the other hand, a considerable number of misprints have been added to the published version in spite of careful proof checking. For corrections to the printed version, see Errata.

JM 16 Feb 2000.


Mimicry and warning color are highly paradoxical adaptations. Color patterns in both Müllerian and Batesian mimicry are often determined by relatively few pattern-regulating loci with major effects. Many of these loci are "supergenes", consisting of multiple tightly-linked epistatic elements. On the one hand, strong purifying selection on these genes must explain accurate resemblance (a reduction of morphological diversity between species), as well as monomorphic color patterns within species. On the other hand, mimicry has diversified at every taxonomic level: warning color has evolved from cryptic patterns, and there are mimetic polymorphisms within species, multiple color patterns in different geographic races of the same species, mimetic differences between sister species, and multiple mimicry rings within local communities. These contrasting patterns can be explained, in part, by the shape of a "number-dependent" selection function first modeled by Fritz Müller in 1879: purifying selection against any warningly-colored morph is very strong when that morph is rare, but becomes weak in a broad basin of intermediate frequencies, allowing opportunities for polymorphisms and genetic drift. This Müllerian explanation, however, makes unstated assumptions about predator learning and forgetting which have recently been challenged. Today's "receiver psychology" models predict that classical Müllerian mimicry could be much rarer than previously believed, and that "quasi-Batesian mimicry", a new type of mimicry intermediate between Müllerian and Batesian, could be common. However, this receiver psychology theory is untested, and indeed seems unlikely; alternative assumptions could easily lead to a more traditional Müllerian/Batesian mimicry divide.


Since their discovery, anti-predator mimicry and warning colors have been used as simple and visually appealing examples of natural selection in action. This simplicity is beguiling, and controversy has often raged behind the textbook examples. Warning color and mimicry have been discussed from three different points of view: a traditional insect natural history angle, which makes simplistic assumptions about both predator behavior and prey evolution (6, 103, 122, 175); an evolutionary dynamics angle, which virtually ignores predator behavior and individual prey/predator interactions (37, 47, 54, 91, 162); and a predator behavior (or "receiver psychology") angle (59-61, 71-72, 115, 148-150), which tends to be simplistic about evolutionary dynamics. Assumptions are necessary to analyse any mathematical problem, but the sensitivity of mimicry to these different simplifications remains untested.

We believe it will be necessary to combine these disparate views (for example, 111, 133, 181) in order to resolve controversy and explain paradoxical empirical observations about the evolution of mimicry. Mimicry should progressively reduce numbers of color patterns, but the actual situation is in stark contrast: there is a diversity of "mimicry rings" (a mimicry ring is a group of species with a common mimetic pattern) within any single region; closely related species and even adjacent geographic races often differ in mimetic or warning color pattern; and there are locally stable polymorphisms. The current controversies and problems are not simply niggles with the theory of mimicry, designed to renew flagging interest in a largely solved area of evolutionary enquiry. Instead, they probe the heart of the textbook explanations we have all learnt. In our review, we will often conclude that traditional interpretations are correct. However, recent challenges and critiques are worth examining carefully because they cast justifiable doubt on previously unstated assumptions. A further reason for re-examining the evolution of mimicry is that its frequency-dependent selective landscapes are rugged, as in mate choice and hybrid inviability (55), so that mimicry provides a model system for the shifting balance theory (41, 178-180); mimicry may also act as a barrier to isolate species (96). While a number of interesting peculiarities of mimicry arguably have little general importance (57), mimicry excels in providing an intuitively understandable example of multiple stable equilibria and transitions between them (41, 87, 97-98, 161, 167). Mimicry and warning color are highly variable both geographically within species, and also between sister species. In this, they are similar to other visual traits involved in signalling and speciation, such as sexually selected plumage morphology and color in birds (2). Sexual and mimetic coloration may therefore share some explanations.

This article aims to cover only the evolution of diversity in mimicry systems, and we will skate quickly over many issues reviewed elsewhere (19, 20, 32, 40, 45, 49, 123, 125, 129, 134, 170, 176; see also a list of over 600 references in ref. 90). Our discussion mainly concerns anti-predator visual mimicry, though it may apply to other kinds of mimicry and aposematism, for example warning smells (62, 130), and mimicry of behavioral pattern (18, 152-154). In addition, most of our examples are shamelessly taken from among the insect mimics and their models, usually butterflies, that we know best; careful studies on the genetics or ecology of other systems have rarely been done.


Bates (6) noticed two curious features among a large complex of butterflies of the Amazons. First, color patterns of unrelated species were often closely similar locally; second, these "mimetic" patterns changed radically every few hundred miles, "as if by the touch of an enchanter's wand" (8). Bates argued that very abundant slow-flying Ithomiinae (related to monarch butterflies) were distasteful to predators, and that palatable species, particularly dismorphiine pierids (related to cabbage whites) "mimicked" them; that is, natural selection had caused the pattern of the "mimic" to converge on that of the "model" species. This form of mimicry became known as Batesian (122). The term "mimicry" had already been used somewhat vaguely by pre-Darwinian natural philosophers for a variety of analogical resemblances (13), but Bates's discovery was undoubtedly a triumph of evolutionary thinking.

Bates also noticed that rare unpalatable species such as Heliconius (Heliconiinae) and Napeogenes (Ithomiinae) often mimicked the same common ithomiine models (such as Melinaea, Oleria and Ithomia) copied by dismorphiines. He assumed that this "mimetic resemblance was intended" (6: p. 554) because, regardless of palatability, a rare species should benefit from similarity to a model. However, where the apparently mimetic species was common, as in the similarity of unpalatable Lycorea (Danainae) to ithomiine models, he felt this was "a curious result of [adaptation to] local [environmental] conditions" (6: p. 517); in other words convergent evolution unrelated to predation. It was left to Fritz Müller (103) to explain clearly the benefits of mimicry in pairs of unpalatable species. If a constant number of unpalatable individuals per unit time must be sacrificed to teach local predators a given color pattern, the fraction dying in each species will be reduced if they share a color pattern, leading to an advantage to mimicry. Thus, mimicry between unpalatable species became known as Müllerian mimicry.

Many mimetic species are also warningly colored, but some are not: for example the larvae of a notodontid moth (6) and of Dynastor darius (Brassolidae) (1), which both mimic highly poisonous but cryptic pit vipers (Viperidae). The former mimics even the keeled scales of its model; the latter has eyes which mimic the snake's own eyes, even down to the slit-shaped pupils. Many small clearwing ithomiines Bates studied in the tropical rainforest understory are also very inconspicuous, but are clearly mimetic. Mimicry doesn't require a warningly colored model, only that potential predators develop aversions to the model's appearance. Warning color, or "aposematism" (122), was first developed as an evolutionary hypothesis by Wallace in response to a query from Darwin, four years after Bates's publication on mimicry. Darwin's sexual selection theory (42) explained much bright coloration in animals, but could not explain conspicuous black, yellow, and red sphingid caterpillars found by Bates in Brazil, because the adult sexual stage does not choose mates on the basis of larval colors. Wallace in 1866 (see 42, 175) suggested that bright colors advertised the unpalatability of the larvae, in the same way that yellow and black banding advertised the defensive sting of a hornet (Vespidae). Warning color in effect must increase the efficiency with which predators learn to avoid unpalatable prey (see also refs. 59, 61 for excellent discussion of possible advantages of warning color).


It was apparently not clear to the natural history viewpoint of early Darwinians (6, 8, 42, 103, 122, 175) that explaining aposematism and mimicry as adaptations often requires an additional assumption: that long-term group fitness will always be maximized. In fact, there are multiple stability peaks in the selective "landscape" of mimetic evolution, and there should often be initial barriers to the spread of ultimately beneficial unpalatability, warning color, and some mimicry. To understand why this is so, we must examine the evolutionary dynamics of mimicry.

Müller (103) was the first to formulate the benefits of mimicry explicitly, using mathematical intuition from a natural history perspective (reprinted in 78). He assumed that, while learning to avoid the color pattern of unpalatable species, a predator complex killed a fixed number of individuals per unit time. Müllerian mimicry is favored, therefore, because the per capita mortality rate decreases when another unpalatable species shares the same pattern. If this traditional naturalist's "number-dependent" (162) view of mimicry is correct, it leads to two interesting predictions, only one of which Müller himself apparently appreciated. The first is that, although Müllerian mimicry of this kind should always be mutualistic, a rare species ultimately gains far more from mimicry than a common one, in proportion to the square of the ratio of abundances (103). The second prediction is that a novel mimetic variant in the rarer species resembling the commoner is always favored because the common species generates greater numerical protection, while a mimetic variant of the commoner species is always disfavored because it loses the strong protection of its own kind and gains only weak protection from the rarer pattern (161). Both these effects will tend to cause rarer unpalatable species to mimic commoner models, rather than the other way around, in spite of the fact that the mimetic state is mutualistic once reached.
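Müller's first prediction can be checked with a few lines of arithmetic. The sketch below is only an illustration under the assumption stated in the text, that predators kill a fixed number of individuals per pattern while learning; the function names and the numerical values are hypothetical and do not come from Müller's paper.

```python
def per_capita_mortality(n_killed, abundance):
    """Fraction of a prey population lost while predators learn its pattern."""
    return n_killed / abundance


def mimicry_benefit(n_killed, own_abundance, other_abundance):
    """Drop in per capita mortality when two species share one pattern."""
    alone = per_capita_mortality(n_killed, own_abundance)
    shared = per_capita_mortality(n_killed, own_abundance + other_abundance)
    return alone - shared


# Hypothetical values: 50 individuals sacrificed per pattern,
# a rare species of 1,000 and a common species of 10,000.
n_killed = 50
rare, common = 1_000, 10_000

gain_rare = mimicry_benefit(n_killed, rare, common)
gain_common = mimicry_benefit(n_killed, common, rare)

# Both gains are positive (mutualism), but their ratio equals
# the square of the abundance ratio, (common / rare) ** 2.
benefit_ratio = gain_rare / gain_common
```

Under these assumptions both species gain, yet the rare species gains about (10,000 / 1,000)^2 = 100 times more per capita than the common one.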

Müller's number-dependent selection applies similarly to morphs within a single species (Fig. 1). A warningly colored variant within a cryptic but unpalatable prey will suffer a two-fold disadvantage: first, it is more conspicuous to predators; second, it does not gain from warning color because predators, not having learnt to avoid the pattern, may attack it. This creates a barrier to initial spread, even though, once evolved, warning color is beneficial because by definition it reduces the number of prey eaten during predator learning (67). In exactly the same way, a novel warning pattern is disfavored within an already warningly colored species, essentially because of intraspecific mimicry (89, 97; also Fig. 1). This selection against rarity makes it easy to understand why warningly colored races are normally fixed, and sharply separated by narrow overlap zones from other races (27, 87, 93), but in turn makes it hard to understand how geographic races diversified in the first place (6, 8, 87, 89, 98, 137, 167). Similarly, if energy is required to synthesize or sequester distasteful compounds, unpalatability itself may be disfavored (49, 63, 67, 160) because unpalatable individuals may sacrifice their lives in teaching predators to avoid other members of their species. The tussle to explain the difficulties with this new, more sophisticated evolutionary dynamic view of aposematism is detailed in the following section.


The Evolution of Unpalatability

Unpalatability itself is hard to define (20, 151; see also below under Müllerian Mimicry), but here we use the term loosely to mean any trait which acts on predators as a punishment, and which causes learning leading to reduced attack rate. Unpalatability has long been recognized as a potentially altruistic trait which may require kin selection for its evolution. The unpalatable individual may incur costs in synthesis or processing of distasteful chemistry and is often likely to suffer damage during predator sampling, while only other members of the population benefit from predator learning. The frequent aposematism of gregarious larvae, often siblings from the same brood, suggests that benefits are shared among kin, and that kin selection could have been responsible for the evolution of unpalatability (49, 63, 67, 160). These authors assumed that altruistic unpalatability was unlikely to evolve unless kin-groups already existed, so explaining the association between gregarious larvae and unpalatability.

However, unpalatability may not be very costly. Firstly, although it may be expensive to process distasteful secondary compounds, in some cases the same biochemical machinery is required to exploit available food; for instance, Zygaena and Heliconius, which feed on cyanogenic host plants, can also synthesize their own cyanogens (70, 76, 104). Secondly, because most toxic compounds also taste nasty (arguably the sense of taste has evolved to protect eaters from toxic chemistry), and because predators often taste-test their prey before devouring them, and, finally, because unpalatable insects are often tough and resilient, an unpalatable insect should often gain an individual advantage by sequestering distasteful chemicals. A good example of predator behavior showing this is possible is seen in birds feeding on monarch (Danaus plexippus) aggregations at their overwintering sites in Mexico: birds taste-reject some butterflies more or less unharmed, until they find a palatable individual which is then killed and eaten (29).

Another problem with empirical evidence for kin selection is that gregariousness, which reduces per capita detectability of the prey, is expected to evolve when there is any tendency towards predator satiation (14, 64, 157, 160); and one of the best ways of satiating predators is to be distasteful. Thus the association between gregarious larvae and unpalatability can easily be explained because gregariousness will evolve much more readily after unpalatability, rather than previously as in the kin selection hypothesis. This expected pattern of unpalatability first, gregariousness thereafter is now well supported in Lepidoptera by phylogenetic analysis (139, 141). In conclusion, the supposed necessity for kin selection in the evolution of unpalatability is now generally disbelieved (59, 97, 139), although kin selection could, of course, help.

Evolution of Novel Warning Colors in Cryptic and Aposematic Defended Prey

Although the realization that aposematic insects may be altruistic came 70 years ago (49), it was finally some 50 years later that the evolution of warning color was explicitly disentangled from the evolution of unpalatability (66-67). Under Müller's number-dependent theory, intraspecific Müllerian mimicry acting on a novel warningly colored variant A within a population would strongly favor the commonest, wild-type morph a; the frequency-dependent selection is thus purifying, tending to prevent polymorphism (Fig. 1A). If unpalatable prey often survive attacks, it might be argued that the problem will be surmounted (47, 73-74, 177). However, this is not true provided that attacks are even potentially damaging; nk, the "effective number killed", may take fractional or probabilistic values, but the frequency-dependent logic applies in exactly the same way (68, 97).

An interesting feature of number-dependence is the great non-linearity of frequency-dependent selection. Many authors from the evolutionary dynamics tradition have assumed a simpler linear frequency-dependence (dotted line in Fig. 1) (47, 54, 87, 91, 121). In fact the relationship between selection and frequency becomes more sigmoidal as nk/N decreases. When nk << N, there are strong spikes of selection against A and a when each is rare, but much of the frequency range forms a nearly neutral polymorphic basin (N=100, Fig. 1). Another interesting feature of this model is that the mean fitness surface is flat: assuming most predators learn the pattern and then avoid it, the mean fitness throughout the frequency spectrum becomes approximately constant at 1 - [nk(A)+nk(a)]/N (Fig. 1B). This is an extreme example of how mean fitness cannot be guaranteed to improve when the fitness function is frequency-dependent (65). (Mimetic and warning color patterns may, of course, vary continuously, and not as discrete patterns (82); this may also contribute to evolution of warning color and mimicry - see under Pattern Enhancement and Peak-shift and Mimetic Polymorphism and Genetic Architecture, below.)
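The shape of this selection function, and the flatness of mean fitness, can be reproduced with a small numerical sketch. The per-morph fitness used here, 1 - nk/(morph count), is our own reconstruction chosen so that mean fitness matches the expression 1 - [nk(A)+nk(a)]/N given above; it is an assumption, not necessarily the exact function plotted in Fig. 1, and all function names are hypothetical.

```python
def morph_fitness(freq, pop_size, n_killed):
    """Survival of a morph at frequency `freq` that loses `n_killed` individuals.

    Assumed form: 1 - nk / (freq * N), floored at zero when the morph is so
    rare that predators cannot finish learning before it is eliminated.
    """
    count = freq * pop_size
    if count <= 0:
        raise ValueError("morph must be present in the population")
    return max(0.0, 1.0 - n_killed / count)


def selection_on_A(q, pop_size, nk_A, nk_a):
    """Fitness difference w_A - w_a; negative values mean A is selected against."""
    w_A = morph_fitness(q, pop_size, nk_A)
    w_a = morph_fitness(1.0 - q, pop_size, nk_a)
    return w_A - w_a


def mean_fitness(q, pop_size, nk_A, nk_a):
    """Population mean fitness at frequency q of morph A."""
    w_A = morph_fitness(q, pop_size, nk_A)
    w_a = morph_fitness(1.0 - q, pop_size, nk_a)
    return q * w_A + (1.0 - q) * w_a


# Hypothetical parameters with nk << N.
N, nk = 100, 2
for q in (0.05, 0.3, 0.5, 0.7, 0.95):
    print(q, round(selection_on_A(q, N, nk, nk), 3),
          round(mean_fitness(q, N, nk, nk), 3))
```

With these values, selection against A is strong at q = 0.05, weak across the intermediate frequencies, and reverses sign above q = 0.5, while mean fitness stays at 1 - 4/100 throughout the range in which both morphs are numerous enough to teach predators.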

In the real world, not only do warning colors exist, but also novel warning patterns are forever being multiplied in already warningly colored species (see under Genetic drift and the shifting balance) in spite of the difficulties biologists foresee in initial evolution. Various ideas have been proposed:

1) Novelty and Recognizability. It has been suggested that warning colors can evolve because they induce predator neophobia and because they are easier to learn (140, 177). Neophobia has some experimental evidence (45, 140), while increased memorability is part of the definition of warning color (see above). These factors, coupled with a high survival rate of attacked prey, might seem to allow warning color to increase from low frequency in spite of increased conspicuousness (140, 177). Unfortunately, the problem with fear of novelty is that this survival advantage evaporates after a time, and enhanced learning is useful only if there are enough individuals available to do the teaching. This behavior viewpoint is rarely coupled with much thought about evolutionary dynamics. Thus, even the existence of an unpalatable sea slug that survives 100% of attacks by fish (158) seems likely to have some risk, or loss of fitness due to fish biting; any slight loss of fitness will be progressively diluted as the numbers increase, leading, again, to frequency-dependent selection against the rare form. In fact, Müllerian mimicry or warning color would be unnecessary if this selection against rarity were not present. An increase of conspicuousness will almost always lead to an initially greater level of attack on the first few individuals with the new pattern, even if the pattern is ultimately advantageous once fixed within the population (59, 81, 97). Essentially, N × nk(A) ≤ nk(a) is required for warning color A to spread: the learning advantage must outweigh the population size disadvantage for the first A mutant. With reasonably large prey population sizes, say N>10, for a reasonably unpalatable species, this seems almost impossible; given that A is more conspicuous, the possibility seems even more remote (89, 97). In any case, high rates of beak-marks on the wings of brightly colored unpalatable butterflies attest to a high frequency of potentially lethal attacks (11, 30-31, 91, 111). Nonetheless, there are various possibilities that allow warning colors to cheat against this apparent selective disadvantage. These are reviewed below.
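The invasion condition for a single novel mutant is easy to explore numerically. The sketch below simply encodes the inequality that N times nk(A) must not exceed nk(a), which follows from comparing the per capita loss of one mutant, roughly nk(A)/1, with that of the resident morph, roughly nk(a)/N. The function names and example values are hypothetical.

```python
def rare_mutant_can_spread(pop_size, nk_A, nk_a):
    """Return True if a single A mutant satisfies N * nk(A) <= nk(a)."""
    return pop_size * nk_A <= nk_a


def required_learning_advantage(pop_size):
    """Smallest ratio nk(a) / nk(A) that meets the condition for a given N."""
    return float(pop_size)


# Hypothetical case: predators need 5 encounters to learn the old pattern
# but only 1 to learn the new, more memorable one.
print(rare_mutant_can_spread(pop_size=4, nk_A=1, nk_a=5))    # small group
print(rare_mutant_can_spread(pop_size=100, nk_A=1, nk_a=5))  # large population
```

Even a five-fold learning advantage only helps in this sketch when the local population is tiny, which is why the condition looks so restrictive for N > 10. Note that the sketch ignores the extra conspicuousness of A, which would make invasion harder still.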

2) Preadaptation. This idea is motivated by the fact that many palatable insects, particularly butterflies, are already brightly colored. Cryptic resting postures, and rapid, jinking flight allow these insects to expose conspicuous patterns in flight which may be important for intraspecific signalling in mate choice and sexual selection (42) or in territoriality and male-male interactions (138, 169), as deflection markings (45, 128, 176), or in Batesian mimicry. If these species become unpalatable, perhaps as a result of a need to process toxic secondary compounds in food, their conspicuous patterns, already adapted for signalling, could simply be re-used in predator education.

3) Pattern Enhancement and Peak-shift. The representation of a pattern in a predator's memory is likely to be a caricature of the actual pattern. Thus an exaggerated pattern may be avoided by a predator more strongly than the normal pattern on which the predator originally trained, and exaggerated warning patterns will evolve to exploit this predator bias. Training an artificial neural network model can also recreate this kind of perceptual bias for supernormal stimuli (3, 48). Whether perceptual bias is produced in computer models is strongly assumption-dependent (79), but there is good evidence for exaggerated responses to supernormal stimuli in vertebrate perception (156), which seem likely to have been a cause of exaggerated male traits in sexual selection (110, 131). These perceptual biases in vertebrates may contribute to the evolution of warning colors.

A related idea is "peak-shift" whereby, if zones of negative and positive reinforcement are located close together along a perceptual dimension, they may each cause the perceiver to bias their responses further apart (Fig. 2). Peak-shift is not dissimilar to the old idea that warning colors function by appearing as different as possible from the color patterns of edible prey (49, 59-61, 164). Theory shows that peak-shift can produce gradual evolution of warning colors (133, 181), and recent experiments with birds have demonstrated relevant perceptual bias (52, 84).
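Peak-shift can be illustrated with a toy model of overlapping generalization gradients. The sketch below assumes Gaussian gradients of equal width around an edible stimulus (positive reinforcement) and an unpalatable stimulus (negative reinforcement) on a single arbitrary perceptual axis; these shapes, positions, and names are our assumptions for illustration and are not taken from Fig. 2 or from the cited models.

```python
import math


def gradient(x, center, width):
    """Gaussian generalization gradient around a trained stimulus."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def net_avoidance(x, edible_at=0.0, unpalatable_at=1.0, width=1.0):
    """Learned avoidance minus learned attraction at appearance x."""
    return gradient(x, unpalatable_at, width) - gradient(x, edible_at, width)


def most_avoided_appearance(lo=-2.0, hi=5.0, steps=7000):
    """Grid search for the appearance that a trained predator avoids most."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(xs, key=net_avoidance)


# The unpalatable prey was trained at x = 1.0, yet the strongest avoidance
# lies beyond it, on the side away from the edible prey at x = 0.0.
peak = most_avoided_appearance()
print(peak, net_avoidance(peak), net_avoidance(1.0))
```

In this toy model a prey variant slightly more exaggerated than the trained pattern is avoided more strongly than the trained pattern itself, which is the bias that could drive gradual enhancement of a warning signal.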

It seems likely that at least some warning colors evolved by this route. For example, the patterns of the conspicuous morphine butterflies Taenaris and Hyantis are clearly related to those of cryptic morphines and satyrines such as Morphopsis, whose deflective eyespot patterns are similar to those of many other edible members of the satyrid lineages to which they belong. Taenaris and Hyantis have apparently evolved unpalatability, perhaps as a result of feeding on toxic Cycadaceae, both as larvae on leaves, and as adults on sap and fruits. Compared with the drab Morphopsis, color and brightness have been enhanced, eyespot size has been increased, and eyespot number has been reduced. A variety of Batesian mimics from palatable genera such as Elymnias agondas (Satyrinae), and females of Papilio aegeus (Papilioninae), apparently mimic these Taenaris/Hyantis patterns (116-117), which attests to the unpalatability of the latter.

Although enhancement is likely to explain some warning color evolution, it is hard to imagine that all novel warning patterns evolved in this way. The color patterns of related species, or even races of Heliconius, for example, seem so radically divergent as to preclude one being an enhancement of the other. Of course, this is a dubious anthropocentric argument, but the major gene switches in Heliconius suggest that radical shifts, rather than gradual enhancement of existing patterns, are responsible for much of the pattern diversity within already warningly colored lineages. If this is the case for switches between warning patterns, then the need for enhancement and predator bias, even for the initial switch, seems less pressing.

4) Müllerian Mimicry is another way that a newly unpalatable species might become warningly colored. While the constraints to the evolution of Müllerian mimicry discussed above apply, the widespread existence of Müllerian mimicry suggests that the idea should work both in the initial evolution of warning color and in its diversification within already unpalatable lineages. Because many species typically join in Müllerian mimicry rings (9-10, 22-25, 116), it seems likely that, in butterflies, most warning color switches are due to Müllerian mimicry. Only the initial divergence of mimicry rings needs to be explained in some other way (24, 89, 98, 167).

5) Density-dependent Warning Color. Our formulation so far of number-dependent warning color and Müllerian mimicry (see Fig. 1) assumes all individuals are seen by predators, but in fact apparency, as well as density per se, is important for the ultimate benefits of warning colors. If more prey are killed during predator learning of a warning pattern than would be detected and killed for a cryptic population of the same size, it may pay the prey to remain cryptic. This may explain why many stationary pupae of unpalatable insects, such as Heliconius, are brown and resemble dead leaves, while their more apparent and mobile larvae and adults are brightly colored and classically aposematic. Density-dependent color pattern development in Schistocerca (desert locusts and their relatives) shows a switch from crypsis at low density to advertisement of food-induced unpalatability at high density (155), and predation experiments with Anolis lizards support the idea (155). If so, enhancement of characteristics used by predators for recognition may provide a way in which this kind of warning color evolves (133, 155, 181); this could represent an easy route to warning color in some organisms. Nonetheless, density-dependent facultative warning colors are unlikely in most animals, such as adult butterflies, in which color patterns are fixed.
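The trade-off between crypsis and advertisement can be expressed as a simple comparison of expected deaths. In the sketch below we assume that a warningly colored population loses a fixed number of individuals to predator learning, while a cryptic population loses a fixed fraction of individuals that happen to be detected; the parameter names and values are hypothetical and are not estimates from the locust or Anolis studies.

```python
def deaths_if_warning(pop_size, n_learning):
    """Deaths while predators learn the pattern (cannot exceed the population)."""
    return min(pop_size, n_learning)


def deaths_if_cryptic(pop_size, detection_rate):
    """Deaths when each cryptic individual is found and killed at a fixed rate."""
    return pop_size * detection_rate


def warning_color_pays(pop_size, detection_rate, n_learning):
    """True when advertising costs fewer lives than staying cryptic."""
    return deaths_if_warning(pop_size, n_learning) < deaths_if_cryptic(
        pop_size, detection_rate
    )


def threshold_population(detection_rate, n_learning):
    """Population size above which warning color is favored in this sketch."""
    return n_learning / detection_rate


# Hypothetical values: 5% of cryptic prey are found; learning costs 10 prey.
print(warning_color_pays(100, 0.05, 10))    # low density
print(warning_color_pays(1000, 0.05, 10))   # high density
print(threshold_population(0.05, 10))
```

Under these assumptions crypsis is the better strategy at low density and advertisement at high density, with the switch point near nk divided by the detection rate.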

6) Kin Selection, Kin-founding, and "Green-beard" Selection. Predators attacking kin groups can kill or damage some individuals, but, after doing so, avoid others who are relatives carrying the same pattern. A superior warning pattern may therefore increase locally under a kind of kin group selection (67). This is somewhat different to classical kin selection, because benefits are transferred between individuals of like phenotype, rather than according to degree of relationship (58): the effect has therefore been called "family selection" (66), or "kin-founding" (97). Guilford (58, 59) pointed out that warning color is a concrete and uncheatable "green-beard" trait, a hypothetical type of altruism invented by Dawkins (43) whereby altruists carrying a badge (such as a green beard) recognize other altruists because they also carry the badge. More recently, the general term "synergistic selection" (61, 81-82, 99, 139) has been applied to such traits. The synergism can be viewed as a behavioral explanation of the warning color trait, once evolved, but knowledge of the nature of the synergism does not explain its initial evolution because both fixed absence and fixed presence of the trait are evolutionarily stable strategies (99). The population genetic problem of frequency-dependence shown in Fig. 1 still arises, although it seems clear that kin founding could aid the initial increase of novel warning colors (59, 97).

Whether kin founding is important for the initial or subsequent evolution of novel warning colors seems hard to decide (see also under Genetic Drift and the Shifting Balance). However, the evidence for kin-grouping and larval gregariousness in many unpalatable insects does not seem such good evidence now as formerly for kin-founding, for reasons already discussed above under the evolution of unpalatability by kin selection: in most cases, gregariousness seems to have evolved after the evolution of unpalatability and aposematism (139, 141).

7) Genetic Drift and the Shifting Balance. Although kin-founding can be looked upon as a purely deterministic model similar to kin selection (66), it is clear that, like Sewall Wright's "shifting balance" model of evolution (178-180) it requires a small local population size: the phenotypes of a small group of related individuals must dominate the learning and recognition systems of local predators, which is only possible if the total local population is low. The evolution of warning color via kin-founding is in fact a special case of phases I & II of the shifting balance (89, 97-98, 167). In phase I, genetic drift allows a local population to explore a new adaptive peak; in phase II, local selection causes the population to adapt fully to the new adaptive peak. Although not usually treated in kin-founding models (but see 66), phase III of the shifting balance, i.e. spread of the new adaptive peak to other populations, would clearly be an important final phase in the kin-founding process. This would be equivalent to having local populations with different warning colors competing across narrow bands of polymorphism, as is actually the case in many hybrid zones between geographic races of warningly colored species today; movement of these clines for warning color would be the equivalent of Phase III (87, 98). In warning color, stable and unstable equilibria are peaks and troughs of relative fitness, but not necessarily of mean fitness: under purely number-dependent selection (Fig. 1), mean fitness is a constant independent of frequency, 1 - [nk(A)+nk(a)]/N, and under the linear frequency-dependent selection model, the minimum of mean fitness is not at the unstable equilibrium (0.4 in Fig. 1). If A is more memorable than a, then nk(A) < nk(a), but this does not increase the mean fitness when qA is high, except very close to fixation of A when hardly any a are available to be tasted by predators.

A recent critique of the shifting balance model concluded that chromosomal evolution, warning color evolution, and more general patterns of phenotypic adaptation were almost always better explained by ordinary individual selection (41). For warning color and mimicry, the key problems are that natural selection seems too intense so that drift is unlikely, and, in common with other examples of rugged adaptive surfaces, phase III of the shifting balance seems an inefficient means of spreading better warning patterns. While these problems seem serious, key features of warning color considerably increase the chances of shifting balance occurring. Firstly, although selection for warning color can often be extremely strong, it would be surprising if predator attacks were not sometimes reduced or suspended locally, due to temporary absence of key predators such as flycatchers or jacamars (34-35, 119-120). If so, populations can occasionally drift to become polymorphic because of a relaxation of selection. Provided that the prey are abundant compared with their predators (nk << N), the populations will quickly enter the central basin where selection is weak (e.g. for N = 100 in Fig. 1). Here drift, or mild forms of selection other than that due to warning function, may cause a new pattern to rise in frequency above the unstable equilibrium (phase I), whereupon selection can fix and refine the new pattern (phase II). An interface between new and old patterns will form, resulting in a cline similar to hybrid zones between races observed today. If one pattern is superior at warning away predators, asymmetries of selection will drive it into the range of the other behind a narrow moving cline (phase III). Cline movement seems likely; with strong selection in the clines observed in nature (93), fairly rapid movement is predicted (91). The shifting balance proposal is speculative because we know little about the frequency, timing and depth of episodes of selection relaxation required for phase I, the relative advantages of different warning colors across clinal boundaries required for phase III, and whether population structural constraints will prevent cline movement (4, 69, 89, 98).

However, empirical evidence for all phases suggests the shifting balance is likely:
(a) polymorphism seems to exist regularly among Müllerian mimics (see below under Müllerian Mimicry, Polymorphism and the Palatability Spectrum), showing that, although selection on mimicry is sometimes strong (11, 75, 80, 92), at other times reduced selection, genetic drift, and non-mimetic selection do seem to allow polymorphism in the central basin; (b) the strong purifying selection that is the problem for phase I promotes phase II; and (c) the existence of today's narrow clines, together with biogeographic evidence for past cline movement and for movement in historical times, suggests that phase III is also likely. The current disjunct distribution of genetically homologous "postman" patterns of Heliconius erato and its Müllerian co-mimic Heliconius melpomene in peripheral populations strongly suggests that competitive cline movement of this nature, in favor of central Amazonian "dennis-ray" patterns, has occurred, even if the color patterns have been sometimes restricted to Pleistocene refuges in the past (24, 27, 89, 98, 137, 164). There is some empirical evidence for movement of Heliconius clines this century - although slow on a historical scale, the movement of warning color clines could be very fast relative to an evolutionary time scale. (d) The shifting balance does seem to have a strong potential in explaining geographic divergence within species, the strong differences in warning color and mimicry between sister species, and also the extraordinary diversity and novelty of these patterns (98). If the shifting balance is important for current diversification, there is little reason to doubt that it could also have been important in the murky initial stages of the origins of warning color, though evidence has long since been erased by more recent evolution within already aposematic lineages.


Sex-limited Mimicry

In a minority of Batesian mimetic butterflies, females are mimetic, while males, although brightly colored, are not. Such cases can be explained if males are constrained to be non-mimetic by sexual selection, either via female choice (162-163), or by the requirements of combat or other male-male signalling (138, 169). This topic has been reviewed excellently elsewhere (164; see also 78), so we do not treat it in detail here.

Sexual selection may explain sexually dimorphic mimicry, but there are some peculiarities of female-limited mimicry for which the answers are not known. Firstly, female-limitation seems restricted to putative Batesian mimicry. As far as is known, Müllerian mimics lack strong sexual dimorphism. Presumably, this is explained because Müllerian mimicry is under purifying density-dependent selection: as a mimetic pattern becomes more common, its advantage increases (Fig. 1). In contrast, Batesian mimicry becomes less successful as it becomes commoner; thus sexual selection is more likely to outweigh this weakening mimetic advantage in Batesian mimics (162). Female-limited mimicry also seems virtually confined to butterflies (46, 168), whereas the sexual selection theory should apply to all examples of Batesian mimicry. Here, the explanation may be ecological. Territorial or fighting males of many butterflies fly purposefully, fast, and can escape predators easily. Female butterflies searching for oviposition sites can be particularly vulnerable to predator attacks (111) because they must fly slowly, like potential models; thus ecological considerations may explain why butterfly females, but not males, often mimic slow-flying models (111, 169). Ecological constraints on sexually dimorphic mimicry are well demonstrated by cases in which only the male is mimetic (168), for example, in saturniid moths with nocturnal females but diurnal males (172).

Mimetic Polymorphism and Genetic Architecture

Batesian mimetic butterflies may be polymorphic as well as sexually dimorphic. This phenomenon is best known and studied genetically among female mimetic forms of Papilionidae, particularly Papilio dardanus and P. memnon, where each female form mimics a different unpalatable model. The maintenance of this polymorphism is easily explained in common Batesian mimics because frequency-dependent selection favors rare mimics. Polymorphisms in Batesian mimics are also well-known in non-butterfly groups: good examples exist in hoverflies (170-171). However, the rarity of accurate polymorphic mimicry of the kind displayed in Papilio suggests that special circumstances must be involved. Mimetic polymorphisms in these cases are usually determined at relatively few genomic regions with large effect ("supergenes"), often with almost complete dominance (38-39, 134-135). The maintenance of mimetic polymorphisms probably depends rather strongly on supergene inheritance. Without it, non-adaptive intermediates would be produced.

While it is easy to understand the maintenance of polymorphisms at mimetic supergenes, it is far from clear how these supergenes initially evolved. Early Mendelians used these genetic switches as evidence that mutations of major effect were prime movers of adaptation (123). Fisher (49) argued forcefully that most adaptive evolution could be explained via multiple genetic changes of individually small effect being sorted by natural selection. Essentially, Fisher proposed that selection rather than mutation was the creative process in adaptation. Goldschmidt (56) then revived mutationist theory in more sophisticated form, and proposed that mimics could exploit major ("systemic") mutations which re-used the same developmental machinery originally exploited by the model. He felt it unlikely that the same genes were re-used by mimics and models, proposing instead that different genes had access to the same developmental pathways. Gradualists were quick to point out cases in which development of mimicry was clearly analogous rather than homologous, such as colored spots on the head and body of models being mimicked by basal wing patches on mimetic Papilio memnon (135). Single gene switches in P. memnon were demonstrated to consist of tightly-linked multiple genetic elements that could be broken apart by recombination or mutation, and it was suggested by gradualists that these "supergenes" had been gradually constructed by a process of linkage tightening to reduce the breakup of adaptive combinations by recombination (38-39).

More recently, opinion has swung back (but only part way) towards a more middle-ground mutationist viewpoint. It has been realized that it would be hard to construct supergenes by means of natural selection alone. Separate elements of a supergene must have been tightly linked initially in order that a sufficiently high correlation between favorable traits was available for selection for tighter linkage. Thus the genetic architecture of mimicry must to some extent consist of pre-existing gene clusters. If so, this could explain why Müllerian mimics and models such as Heliconius often themselves show major gene inheritance. Müllerian mimics are not expected to have polymorphisms, and usually they do not (but see below under Müllerian mimicry), thus they are not expected to require supergene inheritance of their color patterns under the gradualist hypothesis. Heliconius patterns are inherited at multiple loci: this was interpreted as confirming a gradualist expectation for polygenic inheritance of mimicry (113, 137, 164). However, a closer look at Heliconius shows that many of the pattern switches are indeed major, have major fitness effects, and can also in some cases be broken down into tightly linked component parts by recombination or mutation (87a, 93, 137), again suggesting mimetic "supergenes". For example, in both Müllerian mimics H. erato and H. melpomene, a large forewing orange patch known as "dennis", and orange hindwing "ray" patterns are very tightly linked, but are separable via rare recombination or mutations which show up only in rare individuals from hybrid zones (87a). Probably, mutations with major effect are required even in Müllerian mimicry because, during adaptation, a Müllerian mimic loses its current warning pattern while approaching that of a model. There is thus a phenotypic fitness trough between the old pattern and the new pattern. 
Only if a mutation produces instant protection by the new pattern can the gene be favored, unless the two patterns are already extremely close. After approximate mimicry has been achieved by mutation, multilocus "modifiers" can improve the resemblance in the normal way (37, 105, 161, 164). This hybrid view of Müllerian mimicry, known as the Nicholson "two-step" theory, combines what is arguably a mutationist argument with a gradualist hypothesis to explain the perfection of resemblances.

This explanation fits major gene adaptations in Müllerian mimicry, especially as it is now realized that Fisher's argument for adaptation via small mutations has serious flaws (112), even without the frequency-dependent stability peaks of mimicry (Fig. 1). However, two-step theory cannot explain why genes for forewing and hindwing patterns should be tightly linked in both model and mimic in Heliconius. Why should H. erato and H. melpomene (the former is almost certainly the model driving the divergence -- see 55, 89) both diverge geographically using probable supergenes of major genetic effect? One possibility is that genetic architecture for color pattern change in Heliconius simply has limited flexibility (87a). We now know that there is widespread re-use of homeotic gene family machinery throughout the animal kingdom, including some involvement in color pattern development in butterflies (16, 33). It would not be surprising if mimicry gene families were also re-used similarly (106-108, 164) in the lineages leading to H. erato and H. melpomene. This argument is similar to Goldschmidt's (56), but in one sense more extreme, since Goldschmidt thought it likely only that the same patterning control would be reused, rather than the very same genes. Others argue from similar data that the evidence is in favor of analogous rather than homologous developmental pathways and gene action (88), but a true test will be possible only when mimicry genes are characterized at the molecular level in both lineages (51, 95).

In conclusion, current opinion based on nearly a century of genetic studies and mathematical population genetic theory shows how mimetic as well as other adaptations may often require mutations of major effect, at least initially, both because of the ruggedness of the selective landscape, and probably also because of constraints imposed by pattern genetics. Perfection of these adaptations then involves effects generated at multiple genes of increasingly small effect. The types of genetic architecture required, especially for polymorphic mimicry, may be rare. This may explain why some groups involved in mimicry, such as the Papilionidae, are able to colonize multiple mimicry rings (164), while others are rarely mimetic. Disruptive mimetic selection is perhaps as likely to be an agent causing an alternative, speciation, as it is to be a common cause of polymorphism (see Mimicry and Speciation below).

Müllerian Mimicry, Polymorphism and the Palatability Spectrum

Müllerian mimicry and warning color are standard textbook examples of frequency-dependent selection within species (e.g. 99, 126) as well as leading to Müllerian mimicry between species (103). Polymorphisms should be rare due to high rates of attack on rare variants (Fig. 1; 27, 47, 67, 87, 89). In general, workers in the field of mimicry assert that this is so (89, 97-98, 161, 164, 167), but there are some very embarrassing exceptions to the rule among even the best known Müllerian mimics. The most famous case is Danaus chrysippus and its Müllerian mimics Acraea encedon, A. encedana, together with their Batesian mimic Hypolimnas misippus. While distinct color patterns are virtually fixed in the peripheries of their respective ranges, these species are highly polymorphic over an area of Central and Eastern Africa larger than Europe (57, 146). Similarly embarrassing widespread polymorphisms are found in two-spot ladybirds (15, 85), and in Laparus doris (Heliconiinae) (151, 159). Arguably, mimicry in many of these cases is weak: non- or poorly mimetic morphs are common (114, 85, 143-144, 159). However, there are equally problematic examples in which mimicry is very accurate. For instance, Heliconius cydno is mostly monomorphic in Central America (94, 142), but becomes polymorphic throughout much of the Andes of Colombia and W. Ecuador (80, 83); each morph can be clearly identified as an accurate mimic of other Heliconius, particularly H. sapho and H. eleuchia. The pinnacle of Müllerian mimetic polymorphism is found in Heliconius numata. This species is polymorphic throughout virtually its whole range, and some populations of the Amazon basin near the slopes of the Eastern Andes may have up to 7 different morphs, each of which is an accurate mimic of a separate species of ithomiine in the genera Melinaea or Mechanitis (23, 26). Three general explanations have been proposed.

1) Batesian Overload and Coevolutionary Chase. If an unpalatable species has many Batesian mimics, it may suffer from "Batesian overload". According to this hypothesis, the deleterious effects of mimics may force the model to diverge from its normal pattern to escape mimicry, leading to a coevolutionary chase of model by mimic. This idea has generated some controversy (54, 71, 72, 109, 165), but has been well reviewed recently (164, 165), and we merely summarize: it does not seem likely that coevolutionary chase or Batesian overload can explain polymorphisms in unpalatable models. Frequency-dependent purifying selection on the models is almost always stronger than diversifying selection imposed by mimetic load (57, 78, 109, 165).

2) The Palatability Spectrum. Unpalatability cannot be absolute; there must be variation in unpalatability, which could lead to some interesting evolutionary effects. Müllerian and Batesian mimicry are differentiated by means of palatabilities: models and Müllerian mimics are negatively reinforcing, while Batesian mimics positively reinforce predator attacks. Hence we obtain the straightforward view that Batesian mimics are parasitic - they hurt their models, while Müllerian mimics are mutualistic and benefit their models (103). However, a second equally straightforward idea apparently conflicts with this view: if two Müllerian mimics are not equally unpalatable, the presence of the more palatable could increase the rate of attack on the less palatable, so that unpalatable mimics may harm their models or co-mimics, leading to a parasitic form of Müllerian mimicry. A series of behavioral modellers since the 1960s have suggested that parasitic Müllerian mimicry may explain some of the embarrassing examples of polymorphism in aposematic species. Because benefits and costs become decoupled from the Müllerian/Batesian palatability divide in this latter prediction, a new terminology must be developed. An appropriate name for the new parasitic form of Müllerian mimicry is "quasi-Batesian" (148). [There is also a category of palatability-defined Batesian mimicry which is beneficial to the model as well as the mimic; "quasi-Müllerian" mimicry. This is possible if seeing a palatable mimic "jogs" the memory, reminding predators of unpleasant experiences with the model, and leading to greater avoidance of the model than if there were no mimic. Quasi-Müllerian mimicry seems unlikely (151); anyway, it should not lead to polymorphism and will not be discussed further.] 
In quasi-Batesian mimicry, the more palatable mimic may suffer increasing attacks as its numbers increase relative to the model's, even though its effect while alone would be to reduce its predation progressively as density increases (Fig. 3B,C) (71, 72, 115, 148-151). This has been suggested to lead to the evolution of polymorphism in Müllerian mimicry systems (71-72, 148-151).

The behavioral assumptions that lead to quasi-Batesian mimicry pose a severe threat to traditional natural history and evolutionary dynamics views of mimicry, possibly "the end of traditional Müllerian mimicry" (148). This problem never arose until behaviorists attempted to model memory realistically. It is apparent that Müller and subsequent naturalists and evolutionists made an unstated assumption: that the sum of learning and forgetting over all predators would cause an approximately constant number (nk) of unpalatable individuals of each phenotype to be killed (or damaged) per unit time (Fig. 1). Purifying frequency-dependent selection results from Müller's assumption because the average attack fraction nk/N decreases as the total number of individuals, N increases. The existence of quasi-Batesian mimicry, in contrast, requires that the attack fraction on a Müllerian mimic increases as N increases, implying that nk can be a rising function of N rather than a constant. We will here follow the development of these ideas, and discuss why we feel the assumptions that lead to quasi-Batesian mimicry may not be met in most real situations.
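
The difference between the two assumptions can be made concrete with a small sketch. It compares a fixed number killed per unit time (Müller's assumption, as described above) with a fixed asymptotic fraction attacked, which is the boundary case beyond which the attack fraction would rise with N. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def attack_fraction_fixed_number(n_total, nk):
    """Attack fraction when a constant number nk is killed per unit time."""
    return min(1.0, nk / n_total)


def number_killed_fixed_fraction(n_total, fraction):
    """Number killed when a constant fraction of prey is attacked."""
    return fraction * n_total


if __name__ == "__main__":
    nk = 10          # assumed number killed under Müller's assumption
    fraction = 0.1   # assumed asymptotic attack fraction
    for n_total in (100, 1000, 10000):
        frac = attack_fraction_fixed_number(n_total, nk)
        killed = number_killed_fixed_fraction(n_total, fraction)
        print(f"N={n_total:6d}  fixed-number fraction={frac:.4f}  "
              f"fixed-fraction kills={killed:.0f}")
```

With a fixed number killed, the attack fraction falls as 1/N; with a fixed fraction attacked, the number of unpalatable prey consumed must rise in proportion to N.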

The original idea for what is now called quasi-Batesian mimicry was proposed by Huheey (71 and earlier). After a single trial experience with an unpalatable individual, the predator was imagined to learn to avoid totally; thereafter the predator would forget after seeing, but not attacking, a fixed number of individuals with the same pattern. In this formulation, unpalatability affected only the rate of memory loss, rather than its acquisition: very unpalatable species caused slower forgetting than mildly unpalatable species. Increasing the density of less nasty mimics caused a rise in the average forgetting rate and led to an increasing fraction of models attacked (Fig. 3B). Thus, if two unpalatable species differed in palatability, only one benefited, while the other suffered, though the more palatable species on its own was still unpalatable in the sense that predators are negatively reinforced. Even mimicry at the point of equal palatability was neutral, in that increases in density of either caused a faster rate of both acquisition and loss of memory, rather than a reduction in fraction attacked. The predicted absence of mutualistic mimicry in Huheey's theory was strongly attacked (12, 114, 136, 151). The problem appeared to be the event-triggered forgetting model, in which avoidance lapsed after a certain number of prey were avoided. This meant that the total number of prey in the population had no effect on the evolution; selection was assumed to depend only on frequency.

To avoid this pathology of Huheey's formulation, it was proposed that forgetting should be time-dependent (12, 114, 136, 166), rather than depending on the number of avoidances; forgetting should cause the attack fractions to decline or rise exponentially with rates Mi and Mo (for mimic and model respectively) towards the "naive attack rate", i.e. naive attack fraction (115, 149). At the same time, a more flexible system of learning was proposed, in which unpalatability was represented as an asymptotic fraction of prey attacked, Mi and Mo; these asymptotes were again approached exponentially, with learning rates forming another set of parameters (115, 151). These theories could reproduce the full spectrum of mimicry from Batesian mimicry (Fig. 3A) to Müllerian mimicry (Fig. 3D), including quasi-Batesian mimicry (Fig. 3B), and also a curious form of bimodal mimicry which is quasi-Batesian at low mimic density, but traditionally Müllerian at higher mimic densities (Fig. 3C).

The behavior of these models is easy to explain. Memory gain or loss both result in exponential approach to an asymptote of attack frequency, so the combination of the two processes will itself lead to a stationary resultant attack fraction independent of density for either model or mimic on their own. The joint attack fraction on model and mimic together (assuming models and mimics are visually indistinguishable) is simply an average between the curves for model and mimic asymptotic attack fractions. When mimic density is very low, the joint response is very like that of the model; when mimic density is high, the effect of the mimic dominates, and the joint response increasingly obeys the mimic's asymptote. Because the averaging process is of the form of a harmonic mean (115) rather than an arithmetic mean, curious peaks in the density response can occur, the "Owen & Owen effect" (149) (Fig. 3C), implying a quasi-Batesian/Müllerian transition across a density threshold.

Speed and Turner (151, 167a) recently examined the behavior of a number of different formulations and combinations of these basic memory assumptions. They concluded (a) that many of the assumptions produce quasi-Batesian responses like that of Fig. 3B-C; and (b) that behavioral experiments on mimicry and warning color are not usually set up to test for density responses, and therefore cannot easily be used to test whether mimicry falls into quasi-Batesian categories (167a). Well-known polymorphic Müllerian mimics often have intermediate levels of acceptance in tests both with caged and wild birds (20, 34-35, 77, 119-120, 132), showing that many supposedly unpalatable species may often be attacked. Therefore, the known biology of predation on unpalatable species as well as theory mesh with the possibility of a palatability spectrum that could lead to quasi-Batesian mimicry.

However, if theories like those in Fig. 3 are correct, the assumptions for traditional number-dependent and frequency-dependent mimicry of Fig. 1 must be wrong. Our own belief is that new and incorrect assumptions lurking in the behavioral models are to blame for the conflict. Our criticisms are as follows:

a) It seems unlikely that attack rates will reach an asymptotic fraction independent of density, unless that fraction is zero. To understand this, imagine that forgetting is switched off, so that all learning is perfect. Under the new theories (115, 148), learning should asymptote at a constant frequency; number-dependence enters into memory dynamics only through time-based forgetting. With no forgetting, there is then no number-dependent selection, and mutualistic Müllerian mimicry becomes impossible if memory is perfect (149-151). Intuitively, it seems odd that perfect memory does not lead to extremely successful Müllerian mimicry, and this intuition is, we believe, correct. The absence of Müllerian mimicry when there is no forgetting is due to rapid attainment of asymptotic attack fraction. In other words, as the density of an unpalatable mimic in Fig. 3 rises, the predator is supposed to stuff itself with more and more unpalatable prey in order to maintain a constant asymptotic fraction of prey attacked. Note that this argument does not depend on "hunger levels", because unpalatable prey are unlikely to form a large component of the diet (166). The argument is about learning to avoid prey, which is more likely to depend on dose received by the predator per unit time, rather than dose per individual prey. The new theories in effect have the same problem for learning (i.e. not being time-based) as the forgetting model for which Huheey was criticized (12, 115, 136). It seems much more likely to us that for "unpalatable" prey, an asymptotic number of prey attacked per unit time would be required for learning, leading to strongly number-dependent and frequency-dependent selection like that of Fig. 1.

b) It is hard to justify the term "unpalatability" unless the effect is density independent; predators should reject and increasingly avoid unpalatable prey whenever they encounter them. However, the new theories see a species as unpalatable if it has a learning asymptote lower than the "naive attack fraction", and as palatable if it has a learning asymptote higher than the naive attack fraction (149, 151). But only when the asymptotic attack fraction is zero do we produce avoidance whatever the attack fraction prior to each individual experience; this was the case, for example, in an original simulation model designed to disprove Huheey's assertions, and which recovered only Batesian and Müllerian mimicry, with a sharp transition between them (166). If our argument is correct, the whole of the palatability spectrum above an asymptotic attack fraction of 0 is then "palatable", and "quasi-Batesian mimicry" simply becomes Batesian, parasitic mimicry. The "palatability spectrum" represented by 0 < asymptotic attack fraction <= 1 is just that, a spectrum of palatability rather than of unpalatability.

c) Another problem is that, strictly speaking, "attack fraction" is not "palatability" at all, but a transformation of palatability onto a behavioral axis. Unpalatability is likely related linearly, or perhaps logarithmically, to dosage of particular noxious compounds. The effect of these compounds may be to produce an asymptotic attack fraction of 0%, 100%, or somewhere in between (Fig. 3). However, the dosage response curve will certainly be sigmoidal, with most of the dosages exhibiting approximately 100% (palatable) or 0% (unpalatable) asymptotic attack, and a narrow intervening band of dosages giving rise to intermediate levels of palatability. Thus, the "palatability spectrum" as modelled by attack fraction is a highly distorted view of the underlying palatability, or dosage, of noxious chemistry; in fact, most of the dosage spectrum is not considered by these attack rate spectrum models (115, 148-151). In reality, these intermediate asymptotic attack fractions, if they do exist at all, are likely to be rare.
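
Point (c) can be illustrated with a sketch of a sigmoidal dose response. The logistic form, the midpoint, the steepness, and the 0-10 dosage scale (which could equally stand for log dosage) are all assumptions made for illustration; the argument above requires only that the curve be sigmoidal.

```python
import math


def asymptotic_attack_fraction(dose, midpoint=5.0, steepness=10.0):
    """Assumed logistic mapping from dosage to asymptotic attack fraction.

    Low doses give a fraction near 1 (palatable); high doses give a
    fraction near 0 (unpalatable).
    """
    return 1.0 / (1.0 + math.exp(steepness * (dose - midpoint)))


def intermediate_share(doses, low=0.05, high=0.95):
    """Share of the dosages whose attack fraction lies strictly between low and high."""
    hits = [d for d in doses if low < asymptotic_attack_fraction(d) < high]
    return len(hits) / len(doses)


if __name__ == "__main__":
    doses = [i / 100 for i in range(1001)]  # evenly spaced doses from 0 to 10
    print(f"share of dosages with intermediate attack: {intermediate_share(doses):.3f}")
```

With a steep curve of this kind, only a narrow band of dosages yields intermediate attack fractions; a shallower curve (smaller steepness) would widen that band.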

Empirical data from caged and wild birds showing intermediate levels of attack on models are of great interest, but they do not necessarily conflict with the points made above: attack fractions in the laboratory or in nature rarely tell us how they vary with prey density (167a). The behavioral, "receiver psychology" view which leads to possibly novel forms of mimicry suggests that the density response will asymptote; the number-dependent (natural history) view predicts that attack fractions on unpalatable insects will always decline with increasing density. Unfortunately, experiments have not clearly distinguished between these alternatives, because they were rarely designed to check density responses (167a). It does not seem impossible to design such experiments, however.

In conclusion, theories of the palatability spectrum from a receiver psychology angle have led to a potentially major upset in traditional views of mimicry. To decide which view is correct, we need to understand memory dynamics of actual predators, and, given that many of the controversial theories are supposedly based on standard Pavlovian learning theory (124, 148), understanding the evolutionary results of memory on mimicry could lead to advances in memory theory in general. Even if quasi-Batesian mimicry turns out, as we believe, to be unlikely, the threat posed by these new theories demonstrates the naiveté of the natural history assumption that memory is a black box producing number dependence.

3) Spatial and Temporal Variation in Mimetic Selection

Geographic variation in mimetic color patterns within a mimic can obviously be maintained by geographic divergence of models. If mimicry is Müllerian, then divergence becomes self-reinforcing. Patches of habitat with different Müllerian mimetic patterns will be separated by zones of polymorphism: the width of the polymorphic region will be proportional to average dispersal distance and inversely proportional to the square root of the strength of selection (92), as for clines in general (5). Thus, if selection is weak and dispersal extensive, bands of polymorphism may be wide compared to areas of monomorphism. This situation undoubtedly pertains in many species; for example, it is often not realized that it is a common situation within Heliconius. Heliconius erato and melpomene are renowned for the narrowness of their hybrid zones between strongly differentiated forms (e.g. 87, 93); however, zones of polymorphism between weakly differentiated races, for instance in the Amazon basin, are much broader, so that polymorphism is almost the norm (see maps in 24, 27; many other maps of Heliconius races oversimplify the actual distributions).
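
The scaling of cline width with dispersal and selection can be sketched as follows. Only the proportionality stated above is taken from the text; the constant of proportionality depends on the mode of selection and on how width is defined, so the parameter `constant` is a hypothetical placeholder, as are the function name and example values.

```python
import math


def cline_width(dispersal, selection, constant=1.0):
    """Width proportional to dispersal / sqrt(selection).

    `constant` is a placeholder; its true value depends on the model.
    """
    if dispersal < 0 or selection <= 0:
        raise ValueError("dispersal must be >= 0 and selection must be > 0")
    return constant * dispersal / math.sqrt(selection)


if __name__ == "__main__":
    strong = cline_width(dispersal=1.0, selection=0.16)
    weak = cline_width(dispersal=1.0, selection=0.04)
    print(f"width ratio (weak / strong selection): {weak / strong:.2f}")
```

Because width depends on the square root of selection, a fourfold reduction in selection only doubles the width of the polymorphic zone, whereas doubling dispersal doubles it directly.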

A similar situation may exist for the unpalatable species Acraea encedon, A. encedana, Danaus chrysippus, and their Batesian mimic Hypolimnas misippus in Central and Eastern Africa: peripheral populations of all these species are nearly monomorphic (146). Similarly, spatially varying mimetic and other selection pressures, rather than quasi-Batesian mimicry (151), may explain the polymorphisms of ladybirds such as Adalia bipunctata (15, 85) and butterflies such as Laparus doris (159).

A more complicated version of this idea has temporal as well as spatial variation in mimetic selection. The diverse polymorphism of Heliconius numata may be selected because the models (ithomiines in the genus Melinaea) vary greatly in abundance over time and space (26). However, it would be hard to explain how polymorphism is maintained via temporal variation unless the color pattern loci have, on average, a net heterozygote advantage. Given that the supergenes affecting mimicry in H. numata are visually dominant (26), any heterozygous advantage must usually be non-visual. Another example of polymorphism in a Müllerian mimic with multiple models is Heliconius cydno. There are strong differences across W. Ecuador in the frequency of models Heliconius sapho and H. eleuchia, causing divergent patterns of natural selection (80). In conclusion, the observed polymorphisms of many Müllerian mimics can be explained without quasi-Batesian mimicry, via spatial and possibly temporal variation in model abundance.

4) The Shape of Frequency-Dependence. The maintenance of polymorphism in unpalatable species will be considerably aided by the shape of frequency-dependence, given number-dependent selection (Fig. 1). When population sizes of prey (N) are large relative to the numbers sacrificed during predator learning, the fraction nk/N will be small, say 1/100 or less, and there will be little selection in the central polymorphic basin.
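
The weakness of selection in the central basin can be shown with a short sketch. It assumes two morphs with the same nk, each losing a fixed number of individuals, so that the fitness difference between morph A (at frequency q) and the alternative morph is (nk/N)[1/(1-q) - 1/q]. This expression and the function name are assumptions derived from that reading of number-dependent selection, not formulas given in the text.

```python
def fitness_difference(q_a, nk_over_n):
    """W_A - W_a for two morphs with equal nk under number-dependent selection.

    Negative values mean that morph A is selected against.
    """
    return nk_over_n * (1.0 / (1.0 - q_a) - 1.0 / q_a)


if __name__ == "__main__":
    for ratio in (0.1, 0.01):
        row = "  ".join(
            f"q={q:.1f}: {fitness_difference(q, ratio):+.4f}"
            for q in (0.2, 0.4, 0.5, 0.6, 0.8)
        )
        print(f"nk/N={ratio}:  {row}")
```

With nk/N = 1/100, the fitness difference stays below about one percent across frequencies from 0.4 to 0.6, leaving room for drift and for weak selection of other kinds; with nk/N = 1/10 it is ten times larger at every frequency.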

Although we do not know the values of nk/N typical in the wild, a variety of experiments (11, 75, 80, 91) indicate that selection can be strong, i.e. nk/N >= 1/10. On the other hand, it seems likely that many predators will require few learning trials to avoid an aposematic insect. Models and common Müllerian mimics will often outnumber their predators considerably, and, furthermore, predators live much longer and will often be able to generalize between prey generations. Experiments by Kapan on H. cydno in W. Ecuador showed that selection against polymorphism was much weaker where H. cydno was abundant than where it was rare (80). Thus it seems not unlikely that nk/N<=1/100, at least some of the time.

Drift can explain the origin, but not the maintenance of polymorphism in the central basin. However, polymorphisms, once attained, should be only slowly removed via mimetic selection. Second order selective forces, such as non-visual selection (for instance thermal selection in ladybirds), arbitrary mate choice or other factors (15, 85, 144-147) may become important, and contribute to non-adaptedness of mimetic polymorphisms. Strong selection at some times and places (nk/N = 1/10 or greater) is clearly required to produce near-perfect resemblance and narrow hybrid zones between races. But if a Müllerian mimic or model, perhaps by an ecological fluke, becomes abundant relative to its predators (nk/N <= 1/100), it could then be relatively free to experiment with non-adaptive and polymorphic color patterns. In short, the shape of frequency-dependence, together with varying selection for mimicry and mild selection of other types can explain Müllerian polymorphism without the need for quasi-Batesian mimicry.


Bates, Wallace, and Darwin were all of the opinion that strong natural selection, which must occur sometimes to explain mimicry, could lead to speciation. The continuum between forms, races and species of diversely-patterned tropical butterflies led to this idea in the first place (6-7, 173-174). This view has since faded into the background, probably because of a post-war concentration on reproductive characteristics ("reproductive isolating mechanisms") thought important in speciation under the "biological species concept" (100). However, mimicry can cause strong selection against non-mimetic hybrids or intermediates, and should therefore contribute strongly to speciation and species maintenance, by acting as a form of ecologically mediated post-mating isolation, as in H. himera and H. erato (96, 101).

If mimicry contributes to speciation, mimetic shifts should often be associated with speciation within phylogenies. Mimicry-related speciation would explain the curious pattern of "adaptive radiation" in Heliconius: Müllerian co-mimics are usually unrelated, while closely related species almost always belong to different mimicry rings (164). Mimetic pattern has been switched between eight of nine pairs of terminal sister taxa in a mtDNA phylogeny of Heliconius (17, 95). Many sister taxa that have switched mimicry are known from other groups as well. For example, among butterflies, the viceroy (Limenitis archippus) mimics queen and monarch butterflies (Danaus spp.), while its close relative, the red-spotted purple (Limenitis arthemis astyanax), mimics an unpalatable papilionid, Battus philenor. The two Limenitis are very closely related, and hybridize occasionally in the wild (127). Similar examples exist in the Papilionidae. According to Dr. F. Sperling (in litt.), mimetic lineages in the genus Papilio do not seem to speciate more rapidly than non-mimetic lineages; however, a number of closely related species do differ in their mimicry ring.

Nonetheless, while we believe mimicry undoubtedly contributes to speciation, this section must remain somewhat speculative. We cannot point to any convincing case in which mimicry has been the major or only cause of speciation. But then perhaps speciation is almost always caused by multiple, rather than single, episodes of disruptive selection.


A naive view of Müllerian mimicry would suggest that all similarly sized species should converge locally onto a single color pattern. In fact, there are often ten or more mimicry rings among ithomiine and heliconiine butterflies of the Amazon basin (6, 9, 102, 137). The reason for the lack of a single uniform mimicry ring among similarly sized butterflies is currently disputed, and parallels, at an interspecific level, the debate on Müllerian polymorphisms.

Papageorgis (116) provided data from Peru showing that different heliconiine and ithomiine mimicry rings fly at different heights in the forest canopy. She suggested that dual selection for camouflage and mimicry might explain these patterns. In other words, particular mimicry rings are better camouflaged in the lighting conditions pertaining at their favored flight heights. However, heliconiine flight heights are now well-documented to overlap far more extensively than appeared from Papageorgis' data, although weak mimicry associations do exist for habitat and nocturnal roosting height (25, 28, 94). It is unclear how dual selection would work, and it is anyway hard to imagine that the garish reds, yellows, blacks and iridescent blues of heliconiines are ever very cryptic against subdued forest backdrops.

In contrast, recent studies of ithomiines do demonstrate some patterning of mimicry rings in flight height as well as in horizontal (habitat-related) distribution (10, 25, 44, 102). A possible explanation for these community patterns is that different guilds of predators are found preferentially in the different habitats or micro-habitats, so that, within each habitat, mimicry is tuned to local predator knowledge (10, 94). There must be some selection pressure of this sort to explain the multiple patterns; however, it would be hard to imagine birds ignoring butterflies a metre or two higher or lower than their normal flight height in the forest understory, and it seems highly unlikely that the proposed sub-communities of predators and particular mimicry rings are very discrete. The overlap between mimicry rings is more noticeable to the naturalist than the merely statistical differences in average heights or habitats of mimicry rings (10, 25, 44, 94, 102). Instead, some segregation may exist because newly invading unpalatable species are most likely to join the mimicry rings already most prevalent in their habitats. Major mimicry rings that overlap substantially may be unlikely to join together as species accumulate in each ring, for the same reason that intraspecific polymorphisms have a nearly neutral central basin (Fig. 1): the selection for convergence of two abundant mimicry rings will simply not be that strong.


We have shown that the shape of number-dependent selection on the color patterns of unpalatable species can help to explain many puzzling and mutually conflicting data on mimicry and mimetic diversity. When the attack fraction is high because of a high predator/prey ratio, selection on mimicry can clearly be extremely strong and has been measured to be so in a handful of field studies. But when predator/prey ratios are low (nk/N ~ 1/100), there is a wide central basin of near neutrality where only weak purifying selection acts on polymorphisms. Therefore, once an unpalatable butterfly becomes abundant relative to predators, nk/N decreases hyperbolically, and its morphology becomes less constrained by selection. Polymorphisms may then arise that are relatively impervious to further bouts of selection. The weakness of purifying selection in polymorphic populations can help explain why puzzling polymorphisms in some Müllerian mimics are not removed by selection. Such polymorphisms enable populations to explore the selective landscape, which can increase the chances of shifting balance, one of the few ways to explain why utterly novel color patterns continually evolve in mimetic butterflies. Finally, partial responsibility for a puzzling diversity of local mimicry rings can also be placed at the door of weak selection against multiple rings when each ring is abundant.

But these arguments will fail if predator memory and perception do not produce number-dependent selection. If predators behave according to some current theories of "receiver psychology", our conclusions based on more traditional Müllerian theory may be invalid. We do not think that this is the case; however, appropriate experimental studies are urgently required to test between these models of memory and forgetting.


We are very grateful to George Beccaloni, Chris Jiggins, Gerardo Lamas, Russ Naisbit, Mike Speed, Felix Sperling, Maria Servedio, Greg Sword, John Turner, Dick Vane-Wright, and Dave Williams for helpful conversations about mimicry and comments on the current paper, and to NERC, BBSRC, the British Council and the Ministries of Higher Education and Research and of Foreign Affairs, France, for financial support.

