
Morphometrics^{1} is a field concerned with studying variation and change in the form (size and shape) of organisms^{[1]} or objects. There are several methods for extracting data from shapes, each with its own strengths and weaknesses. These include measurement of lengths and angles, landmark analysis and outline analysis.
Morphometric analyses are commonly performed on organisms, and are particularly useful in analysing the fossil record. In this use, it is assumed that morphometrics can quantify a trait of evolutionary significance, and by detecting changes in the shape of organisms, deduce something of their ontogeny or evolutionary relationships.
"Morphometrics", in the broader sense of the term, is also used to precisely locate certain areas of featureless organs such as the brain, and is used in describing the shape of other things.
Morphometrics adds a quantitative element to descriptions, allowing more rigorous comparisons. It enables one to describe complex shapes in a rigorous fashion, and permits numerical comparison between different forms. By reducing shape to a series of numbers, it allows objective comparison that does not rely on individuals' interpretation of descriptive words. Further, statistical analysis can highlight areas where change is concentrated, removing the need to explicitly declare an area for investigation before study.^{[1]}
Morphometric study aims to describe the shape of an object in the simplest possible fashion, removing extraneous information and thereby facilitating comparison between different objects.
An object's shape can be described in many ways – one may take a sequence of defined measurements, record the position of certain important landmarks, or define the outline of the object. Each of these emphasises a certain aspect of the object. Morphometric analysis begins by obtaining and (usually) digitising one of these suites of descriptors. Since morphometrics is concerned solely with shape, analysis proceeds by removing confounding factors – size, rotation and location must all be corrected for.
Typically, analysis begins with principal component analysis, which highlights any trends and makes it easy to spot any correlation with other features.
The traditional, and most rudimentary, method of morphometrics involves measuring distances, angles and areas.^{[2]} Commonly, the measurements taken are of little significance in terms of the organism. The method has the drawback that many measurements covary, thwarting statistical analysis – for instance, tibia length will vary with arm length, and the interdependence of these two variables will bias the data set.
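The covariation problem can be illustrated with a short numerical sketch; the measurements below are invented for illustration:

```python
import numpy as np

# Hypothetical data: two limb measurements (mm) from ten specimens.
# Traditional morphometrics collects such distances directly; the problem
# is that they tend to covary with overall body size and with each other.
tibia = np.array([50.1, 52.3, 48.7, 55.0, 60.2, 58.1, 49.5, 53.8, 57.4, 61.0])
femur = np.array([70.3, 73.0, 68.5, 77.1, 84.0, 81.2, 69.8, 75.5, 80.0, 85.6])

# Pearson correlation quantifies the interdependence that would bias a
# naive multivariate analysis of the raw measurements.
r = np.corrcoef(tibia, femur)[0, 1]
print(f"correlation between measurements: {r:.3f}")
```

With a correlation this strong, the two variables carry largely the same information, so treating them as independent inputs to a statistical analysis would double-count one underlying size factor.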
The methodology is useful in cases where linear and angular data are available, and is of great utility in study of growth. However, it can only distinguish changes in length, and cannot be used to map how these changes are accomplished.
This technique assesses the distribution of "landmarks": points described by a tightly defined set of rules, for example the suture between three named bones in a skull.
The technique only generates data that are as good as the landmarks that are input, and many studies are called into question as a result of suspect landmark selection.
Well-chosen landmarks reflect homologous points – i.e. points with evolutionary significance. In order for a landmark to be of utility, it must be present on all specimens studied.
The number of landmarks which can provide meaningful data is approximately equal to the number of specimens sampled: if there are more landmarks than specimens, some landmarks are redundant, and results produced may be unsubstantiated.^{[1]}
There are three categories of landmarks, listed here in order of decreasing utility. True landmarks represent a genuine homologous structure. Pseudolandmarks are marks defined by relative locations; for example, "the point of highest curvature of this bone." Semilandmarks are defined relative to other landmarks, for example "midway between landmarks X and Y." The latter are of less value and often weighted against in analyses.
In order to compare shapes, they need to be fitted into a frame of reference that places them in the same virtual space. With landmarks, this can be done by lining up one specific landmark; this has the disadvantage that it removes all the data from this point. A preferable (but not error-free) method is to use the "centroid" – that is, the "centre of mass" of the landmarks, assigning an equal weight to each landmark. This centroid is calculated for each specimen and translated to the origin.
Objects must be scaled to the same relative size. Ideally this would be done by adjusting everything to a fixed measure of size, for example based on mass. However, such reliable measures of body size are usually lacking, so the spread of the landmarks must serve as a proxy for size. A simple approach is to scale the landmarks such that the distance between two named landmarks is constant in all specimens; however, this removes data from those two landmarks, and implicitly assumes that the two are an equivalent distance apart in all specimens. A better method involves calculating the distance of each landmark from the centroid, taking the square root of the sum of the squared distances (the "centroid size"), and setting this to 1 for each specimen.
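The translation and scaling steps can be sketched in a few lines of Python using NumPy; the triangle of landmarks is hypothetical:

```python
import numpy as np

def center_and_scale(landmarks):
    """Translate a (k, 2) landmark configuration so its centroid sits at the
    origin, then scale it to unit centroid size.

    Centroid size = square root of the sum of squared distances of each
    landmark from the centroid."""
    centered = landmarks - landmarks.mean(axis=0)        # remove location
    centroid_size = np.sqrt((centered ** 2).sum())       # remove scale
    return centered / centroid_size

# Hypothetical configuration of three landmarks.
shape = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
normalised = center_and_scale(shape)
print(normalised.mean(axis=0))             # centroid now at the origin
print(np.sqrt((normalised ** 2).sum()))    # centroid size now 1.0
```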
Rotation is the final non-shape attribute that must be removed from datasets prior to their interpretation. This is performed by minimising the sum of squared distances between corresponding landmarks on subsequent specimens.
Various techniques have been developed to perform these three steps in one operation. Bookstein registration is simpler on many levels and was traditionally used, but it strips the data from two landmarks: it combines the previous steps by setting two landmarks to the fixed coordinates (0,0) and (0,1).^{[3]} A better approach is Procrustes superimposition, which combines the least-squares approaches discussed above into one algorithm.
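A minimal sketch of Procrustes superimposition for 2D landmark configurations, using the singular value decomposition to find the least-squares rotation; the example triangle is hypothetical:

```python
import numpy as np

def procrustes_align(reference, target):
    """Ordinary Procrustes alignment of one (k, 2) landmark set onto another:
    translate both to the origin, scale to unit centroid size, then find the
    rotation minimising the sum of squared inter-landmark distances."""
    def norm(x):
        x = x - x.mean(axis=0)
        return x / np.sqrt((x ** 2).sum())
    a, b = norm(reference), norm(target)
    # The optimal rotation comes from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(b.T @ a)
    rotation = u @ vt
    return a, b @ rotation

# Hypothetical example: the same triangle, rotated 90 degrees and shifted.
ref = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
theta = np.pi / 2
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
tgt = ref @ rot.T + np.array([10.0, -5.0])

a, b_aligned = procrustes_align(ref, tgt)
print(np.allclose(a, b_aligned))   # True: shapes coincide after superimposition
```

Because the two configurations differ only in location, scale and rotation, the superimposed landmarks coincide exactly; with real specimens, the residual differences that remain after alignment are the shape variation of interest.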
The technique has some deficiencies; most notably, all data about the shape is discarded save those points defined by a landmark. These landmarks may be difficult to identify on all specimens, and if 2D data is used, the orientation of a specimen has a marked effect on the relative positions of landmarks.^{[1]}
Outline analysis is often considered an alternative to be used only when landmarks are too difficult to define or observe. This is in part due to the difficulty in collecting 3D outlines, which has restricted the data to a relatively information-poor 2D line drawn around the edge of specimens. However, the increasing availability of 3D mapping techniques, such as laser rangers, is making 3D outline analysis (using semilandmarks on "semilandmark lines") an increasingly attractive alternative. As this discipline is a rapidly developing field of research, pioneered by Norm MacLeod and others,^{2} its techniques have not yet stabilised and will not be discussed here at present.
The use of outline data is in some ways inferior to geometric analysis, but for many shapes it can be more informative. The two techniques pick up different forms of variation, and ideally both should be considered in tandem. The perceived failings of outline morphometrics are that it does not compare points of homologous origin, and that it oversimplifies complex shapes by restricting itself to the outline and ignoring internal changes. Also, since it works by approximating the outline by a series of ellipses, it deals poorly with pointed shapes.^{[3]}
Detractors of the technique claim that the loss of information means that unrelated organisms can be mistakenly grouped together – a famous example being that outline analysis struggles to differentiate a scapula from a fortuitously shaped potato chip.^{[4]} However, this stems partly from a misapplication of the technique; and the same criticism can be levelled at geometric morphometrics, which groups (for example) a shark more closely to an ichthyosaur than a swordfish.^{[4]}
In practice, there are a number of ways of quantifying an outline. Older techniques^{[5]} have been superseded by the two main modern approaches: eigenshape analysis^{[6]} and elliptical Fourier analysis (EFA),^{[7]} using hand- or computer-traced outlines. The former involves fitting a preset number of semilandmarks at equal intervals around the outline of a shape, recording the deviation of each step from semilandmark to semilandmark from what the angle of that step would be were the object a simple circle.^{[8]} The latter defines the outline as the sum of the minimum number of ellipses required to mimic the shape.^{[9]}
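As a rough illustration of the Fourier approach, the sketch below describes a closed outline by the leading harmonics of its complex coordinates. This is a simplification of full elliptical Fourier analysis (each retained harmonic traces an ellipse, and their sum approximates the outline), and the lumpy-circle outline is invented:

```python
import numpy as np

def fourier_outline(x, y, n_harmonics):
    """Approximate a closed outline by the leading Fourier harmonics of its
    complex coordinates z = x + iy, discarding higher-frequency detail."""
    z = x + 1j * y
    coeffs = np.fft.fft(z) / len(z)
    # Keep the constant term plus the requested harmonics (and their matching
    # negative frequencies); zero out the rest, then rebuild the outline.
    kept = np.zeros_like(coeffs)
    kept[: n_harmonics + 1] = coeffs[: n_harmonics + 1]
    kept[-n_harmonics:] = coeffs[-n_harmonics:]
    return np.fft.ifft(kept * len(z))

# Hypothetical outline: a slightly lumpy circle sampled at 64 points.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
x = np.cos(t) * (1 + 0.1 * np.cos(3 * t))
y = np.sin(t) * (1 + 0.1 * np.cos(3 * t))

approx = fourier_outline(x, y, n_harmonics=5)
err = np.abs(approx - (x + 1j * y)).max()
print(f"max reconstruction error with 5 harmonics: {err:.4f}")
```

Truncating the harmonic series is also what makes the method sensitive to pointed shapes: sharp corners need many harmonics to reproduce, so a low-order fit smooths them away.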
Both methods have their weaknesses; the most dangerous (and most easily overcome) is their susceptibility to noise in the outline.^{[10]} Likewise, neither compares homologous points, and global change is always given more weight than local variation (which may have large biological consequences). Eigenshape analysis requires an equivalent starting point to be set for each specimen, which can be a source of error. EFA also suffers from redundancy, in that not all variables are independent.^{[10]} On the other hand, both methods can be applied to complex curves without having to define a centroid; this makes removing the effect of location, size and rotation much simpler.^{[10]} An alternative to eigenshape analysis is to treat the semilandmarks as landmarks – which is of limited value.
To quantify and display the differences in shape, the variability needs to be reduced to a comprehensible (low-dimensional) form. Principal component analysis (PCA) is the most commonly employed tool to do this. Simply put, the technique projects as much of the overall variation as possible into a few dimensions. Each axis on a PCA plot is an eigenvector of the covariance matrix of shape variables. The first axis accounts for maximum variation in the sample, with further axes representing further ways in which the samples vary. The pattern of clustering of samples in this morphospace often reflects a phylogenetic, or more commonly a phenetic, relationship.
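A sketch of PCA applied to a matrix of aligned shape variables (the data here are randomly generated stand-ins for real flattened landmark coordinates), via eigendecomposition of the covariance matrix:

```python
import numpy as np

# Hypothetical data matrix: each row is one specimen's aligned landmark
# coordinates, flattened to (x1, y1, x2, y2, ...).
rng = np.random.default_rng(0)
shapes = rng.normal(size=(20, 8))

# PCA by eigendecomposition of the covariance matrix of shape variables.
centered = shapes - shapes.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # largest variance first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Project specimens onto the first two axes of the morphospace; plotting
# these scores gives the familiar two-dimensional PCA scatter.
scores = centered @ eigenvectors[:, :2]
explained = eigenvalues[:2].sum() / eigenvalues.sum()
print(f"variance captured by first two axes: {explained:.1%}")
```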
Landmark data allow the deviation of an individual specimen from the mean to be visualised via thin plate splines. These visualisations are formed by calculating the mean location of landmarks and drawing a rectilinear grid over them. This grid is then deformed ("stretched") so as to keep the landmarks in the same grid square, while moving the points themselves to their location in each specimen.
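The underlying deformation can be sketched as follows, assuming 2D landmarks; the mean shape and specimen below are hypothetical, and for brevity the spline is evaluated only at the landmarks themselves to show that it interpolates them exactly (in practice it would be evaluated at every grid node):

```python
import numpy as np

def tps_warp(source, target, points):
    """Thin plate spline: the smoothest 2-D deformation carrying the source
    landmarks exactly onto the target landmarks, applied to query points."""
    def U(r2):
        # Radial basis kernel proportional to r^2 log r, with U(0) = 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            return np.where(r2 == 0, 0.0, r2 * np.log(r2))

    n = len(source)
    # Pairwise squared distances between the source landmarks.
    d2 = ((source[:, None, :] - source[None, :, :]) ** 2).sum(-1)
    K = U(d2)
    P = np.hstack([np.ones((n, 1)), source])
    # Standard TPS linear system [[K, P], [P^T, 0]].
    L = np.zeros((n + 3, n + 3))
    L[:n, :n], L[:n, n:], L[n:, :n] = K, P, P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = target
    params = np.linalg.solve(L, rhs)
    w, affine = params[:n], params[n:]
    # Evaluate the spline at the query points.
    q2 = ((points[:, None, :] - source[None, :, :]) ** 2).sum(-1)
    return np.hstack([np.ones((len(points), 1)), points]) @ affine + U(q2) @ w

# Hypothetical landmarks: a mean configuration and one displaced specimen.
mean_shape = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
specimen = np.array([[0.0, 0.0], [1.0, 0.1], [1.2, 1.0], [0.0, 0.9]])

warped = tps_warp(mean_shape, specimen, mean_shape)
print(np.allclose(warped, specimen))   # True: landmarks interpolated exactly
```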
In neuroimaging, the most common variants are voxel-based morphometry, deformation-based morphometry and surface-based morphometry of the brain.
The application of morphometrics is not restricted to biological uses. It can also be applied to terrain in the form of geomorphometrics, and it has a host of other applications.
Note 1: from Greek "morph", meaning shape or form, and "metron", measurement.
Note 2: of the NHM.
