Parallel coordinates is a common way of visualizing high-dimensional geometry and analyzing multivariate data.
To show a set of points in an n-dimensional space, a backdrop is drawn consisting of n parallel lines, typically vertical and equally spaced. A point in n-dimensional space is represented as a polyline with vertices on the parallel axes; the position of the vertex on the ith axis corresponds to the ith coordinate of the point.
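As a minimal sketch of this mapping, the Python function below turns one n-dimensional point into the vertex list of its polyline. The function name `polyline_vertices` and the `spacing` parameter are illustrative assumptions; the axes are assumed to be vertical and placed at horizontal positions 0, spacing, 2*spacing, and so on.

```python
def polyline_vertices(point, spacing=1.0):
    """Return the (x, y) vertices of the polyline representing `point`.

    Assumption: axis i is the vertical line x = i * spacing, and the
    vertex on axis i sits at height point[i].
    """
    return [(i * spacing, value) for i, value in enumerate(point)]


# A 4-dimensional point becomes a polyline with 4 vertices, one per axis.
vertices = polyline_vertices([3.0, 1.5, 4.0, 2.0])
for x, y in vertices:
    print(f"axis at x={x}: height {y}")
```

Drawing a data set then amounts to drawing one such polyline per observation over the same set of axes.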
Parallel coordinates were invented by Maurice d'Ocagne in 1885 ^{[1]}, independently rediscovered and popularised by Al Inselberg ^{[2]} in 1959, and systematically developed as a coordinate system starting from 1977. Important applications include collision avoidance algorithms for air traffic control (1987, 3 USA patents), data mining (USA patent), computer vision (USA patent), optimization, process control and, more recently, intrusion detection, among others. Most of these applications of parallel coordinates and their success are due to the landmark paper entitled "Hyperdimensional Data Analysis Using Parallel Coordinates" (Wegman 1990). A generalized parallel coordinates system was proposed by Moustafa and Wegman (2002, 2006), in which the Cartesian coordinate system is transformed into a parameter space (parallel coordinates) using basis functions. The same authors also explore the relationships between generalized parallel coordinates, the Andrews plot and the grand tour.
Adding more dimensions in parallel coordinates (often abbreviated coords or PCs) involves adding more axes. The value of parallel coordinates is that certain geometrical properties in high dimensions transform into easily seen 2D patterns. For example, a set of points on a line in n-space transforms to a set of polylines (or curves) in parallel coordinates, all intersecting at n - 1 points. For n = 2 this yields a point ↔ line duality, which is why the mathematical foundations of parallel coordinates are developed in projective rather than Euclidean space. Also known are the patterns corresponding to (hyper)planes, curves, several smooth (hyper)surfaces, proximities, convexity and, recently, non-orientability ^{[3]}. Since the process maps k-dimensional data onto a lower-dimensional 2D space, some loss of information is expected. The loss of information can be measured using Parseval's identity (or energy norm).
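The intersection property can be checked numerically. The sketch below is illustrative only: it assumes unit spacing between axes, and the helper names `point_on_line`, `crossing` and `height_at` are made up for this example. It takes three points on a line in 3-space and confirms that their polylines share one crossing point per adjacent pair of axes, giving n - 1 = 2 points in total.

```python
def point_on_line(p0, v, t):
    """Point p0 + t*v on a line in n-space."""
    return [p + t * d for p, d in zip(p0, v)]


def crossing(a, b, i):
    """Crossing of the polylines of points a and b between axes i and i+1.

    Axes are assumed to sit at x = 0, 1, 2, ... The segments are extended
    to full lines, so the crossing may lie outside the strip between the
    two axes. Returns None when the segments are parallel.
    """
    da = a[i + 1] - a[i]
    db = b[i + 1] - b[i]
    if da == db:
        return None
    s = (b[i] - a[i]) / (da - db)  # offset from axis i
    return i + s, a[i] + s * da


def height_at(p, i, x):
    """Height of the (extended) segment of p between axes i and i+1 at x."""
    return p[i] + (x - i) * (p[i + 1] - p[i])


p0, v = [1.0, 0.0, 2.0], [1.0, -1.0, 2.0]
a, b, c = (point_on_line(p0, v, t) for t in (0.0, 1.0, 2.0))

for i in range(len(p0) - 1):
    x, y = crossing(a, b, i)
    # The third polyline passes through the same point.
    print(i, (x, y), abs(height_at(c, i, x) - y) < 1e-9)
```

Here the first crossing lies halfway between the first two axes, and the second lies one third of the way between the second and third axes.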
When used for statistical data visualisation there are three important considerations: the order, the rotation, and the scaling of the axes.
The order of the axes is critical for finding features, and in typical data analysis many reorderings will need to be tried. Some authors have come up with ordering heuristics which may create illuminating orderings ^{[4]}.
Rotating the axes corresponds to a translation in parallel coordinates: if the lines intersect outside the parallel axes, the intersection can be moved between them by a rotation. The simplest example of this is rotating an axis by 180 degrees. More details can be found in ^{[5]}.
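A small worked example may help here. It assumes two axes at horizontal positions 0 and d, and it interprets the 180-degree rotation as negating the second variable, which is an assumption of this sketch rather than something stated above. For points on the line x2 = m*x1 + b, the segment from (0, x1) to (d, x2) has height x1*(1 + t*(m - 1)) + t*b at position t*d, which is independent of x1 when t = 1/(1 - m), so all segments cross at horizontal position d/(1 - m). The function names are hypothetical.

```python
def crossing_x(m, d=1.0):
    """Horizontal position where the polylines of all points on
    x2 = m*x1 + b cross, for axes at x = 0 and x = d (requires m != 1)."""
    if m == 1:
        raise ValueError("slope 1 gives parallel segments (no finite crossing)")
    return d / (1.0 - m)


def between_axes(x, d=1.0):
    """True when the crossing lies strictly between the two axes."""
    return 0.0 < x < d


m = 3.0
before = crossing_x(m)    # positive slope: crossing lies outside the axes
after = crossing_x(-m)    # second axis flipped: slope becomes -m
print(before, between_axes(before))
print(after, between_axes(after))
```

Under these assumptions the crossing falls between the axes exactly when the slope is negative, so flipping one axis brings the crossing of a positively correlated pair into view.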
Scaling is necessary because the plot is based on interpolation (linear combination) of consecutive pairs of variables ^{[5]}. The variables must therefore be on a common scale, and there are many scaling methods to consider as part of the data preparation process that can reveal more informative views.
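One common choice, shown here only as an example of the many possible scaling methods, is min-max scaling of each variable to the interval [0, 1]. The function name `min_max_scale` and the convention of mapping a constant column to 0.5 are assumptions of this sketch.

```python
def min_max_scale(rows):
    """Rescale every column of `rows` (a list of equal-length lists) to [0, 1].

    Assumption: a column whose values are all equal is mapped to 0.5,
    since it has no range to scale by.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.5
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Two variables on very different scales end up on the same [0, 1] axis range.
data = [[1, 100], [2, 300], [3, 200]]
print(min_max_scale(data))
```

Other methods, such as standardizing each variable to zero mean and unit variance, change which features of the data stand out, so it can be worth comparing several.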
The generalized parallel coordinate plot (GPCP) was proposed by Moustafa and Wegman (2002) ^{[6]} as a generalization of parallel coordinate plots, based on parameter transformation. In this design the data are transformed before plotting, rather than plotted raw. If the interpolation function is piecewise Lagrange, this corresponds to the traditional parallel coordinate plot (PCP). If splines are used as the interpolation function, the smooth parallel coordinate plot (SPCP) is obtained. In the smooth plot, every observation is mapped into a parametric line (or curve), which is smooth, continuous on the axes, and orthogonal to each parallel axis ^{[5]}.
This SPCP design gives a clear quantization level of each data attribute that can best describe its distribution in complex situations, even with large data sets. Finally, if one uses Fourier interpolation of degree equal to the data dimensionality, the Andrews plot (Andrews 1972) ^{[7]} is obtained. The GPCP design gives researchers the opportunity to explore alternative interpolation functions that are best suited to a particular application, and statistical dualities between the data space and GPC space that are important for visual pattern recognition using GPCP ^{[8]}.
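For the Fourier case, the sketch below evaluates the usual form of the Andrews curve, f(t) = x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + x5 cos 2t + ..., for t in [-pi, pi]. The function name `andrews_curve` is an illustrative choice, and the code shows only this one special case, not the general GPCP construction.

```python
import math


def andrews_curve(point, t):
    """Value at parameter t of the Andrews curve of a data point.

    The first coordinate weights the constant term 1/sqrt(2); the
    remaining coordinates alternate between sine and cosine terms of
    increasing frequency.
    """
    total = point[0] / math.sqrt(2)
    for k, value in enumerate(point[1:], start=1):
        freq = (k + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        basis = math.sin(freq * t) if k % 2 == 1 else math.cos(freq * t)
        total += value * basis
    return total


# Sample one observation's curve at a few parameter values.
observation = [1.0, 2.0, 3.0]
for t in (-math.pi, 0.0, math.pi / 2):
    print(t, andrews_curve(observation, t))
```

Plotting this function over a fine grid of t values, one curve per observation, gives the Andrews plot.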
