In information theory, the Shannon–Hartley theorem is an application of the noisy channel coding theorem to the archetypal case of a continuoustime analog communications channel subject to Gaussian noise. The theorem establishes Shannon's channel capacity for such a communication link, a bound on the maximum amount of errorfree digital data (that is, information) that can be transmitted with a specified bandwidth in the presence of the noise interference, under the assumption that the signal power is bounded and the Gaussian noise process is characterized by a known power or power spectral density. The law is named after Claude Shannon and Ralph Hartley.
Contents 
Considering all possible multilevel and multiphase encoding techniques, the Shannon–Hartley theorem states that the channel capacity C, meaning the theoretical tightest upper bound on the information rate (excluding error correcting codes) of clean (or arbitrarily low bit error rate) data that can be sent with a given average signal power S through an analog communication channel subject to additive white Gaussian noise of power N, is:
where
During the late 1920s, Harry Nyquist and Ralph Hartley developed a handful of fundamental ideas related to the transmission of information, particularly in the context of the telegraph as a communications system. At the time, these concepts were powerful breakthroughs individually, but they were not part of a comprehensive theory. In the 1940s, Claude Shannon developed the concept of channel capacity, based in part on the ideas of Nyquist and Hartley, and then formulated a complete theory of information and its transmission.
In 1927, Nyquist determined that the number of independent pulses that could be put through a telegraph channel per unit time is limited to twice the bandwidth of the channel. In symbols,
where f_{p} is the pulse frequency (in pulses per second) and B is the bandwidth (in hertz). The quantity 2B later came to be called the Nyquist rate, and transmitting at the limiting pulse rate of 2B pulses per second as signalling at the Nyquist rate. Nyquist published his results in 1928 as part of his paper "Certain topics in Telegraph Transmission Theory."
During that same year, Hartley formulated a way to quantify information and its line rate (also known as data signalling rate or gross bitrate inclusive of errorcorrecting code 'R' across a communications channel).^{[1]} This method, later known as Hartley's law, became an important precursor for Shannon's more sophisticated notion of channel capacity.
Hartley argued that the maximum number of distinct pulses that can be transmitted and received reliably over a communications channel is limited by the dynamic range of the signal amplitude and the precision with which the receiver can distinguish amplitude levels. Specifically, if the amplitude of the transmitted signal is restricted to the range of [ –A ... +A ] volts, and the precision of the receiver is ±ΔV volts, then the maximum number of distinct pulses M is given by
By taking information per pulse in bit/pulse to be the base2logarithm of the number of distinct messages M that could be sent, Hartley^{[2]} constructed a measure of the line rate R as:
where f_{p} is the pulse rate, also known as the symbol rate, in symbols/second or baud.
Hartley then combined the above quantification with Nyquist's observation that the number of independent pulses that could be put through a channel of bandwidth B hertz was 2B pulses per second, to arrive at his quantitative measure for achievable line rate.
Hartley's law is sometimes quoted as just a proportionality between the analog bandwidth, B, in Hertz and what today is called the digital bandwidth, R, in bit/s.^{[3]} Other times it is quoted in this more quantitative form, as an achievable line rate of R bits per second:^{[4]}
Hartley did not work out exactly how the number M should depend on the noise statistics of the channel, or how the communication could be made reliable even when individual symbol pulses could not be reliably distinguished to M levels; with Gaussian noise statistics, system designers had to choose a very conservative value of M to achieve a low error rate.
The concept of an errorfree capacity awaited Claude Shannon, who built on Hartley's observations about a logarithmic measure of information and Nyquist's observations about the effect of bandwidth limitations.
Hartley's rate result can be viewed as the capacity of an errorless Mary channel of 2B symbols per second. Some authors refer to it as a capacity. But such an errorless channel is an idealization, and the result is necessarily less than the Shannon capacity of the noisy channel of bandwidth B, which is the Hartley–Shannon result that followed later.
Claude Shannon's development of information theory during World War II provided the next big step in understanding how much information could be reliably communicated through noisy channels. Building on Hartley's foundation, Shannon's noisy channel coding theorem (1948) describes the maximum possible efficiency of errorcorrecting methods versus levels of noise interference and data corruption.^{[5]}^{[6]} The proof of the theorem shows that a randomly constructed error correcting code is essentially as good as the best possible code; the theorem is proved through the statistics of such random codes.
Shannon's theorem shows how to compute a channel capacity from a statistical description of a channel, and establishes that given a noisy channel with capacity C and information transmitted at a line rate R, then if
there exists a coding technique which allows the probability of error at the receiver to be made arbitrarily small. This means that theoretically, it is possible to transmit information nearly without error up to nearly a limit of C bits per second.
The converse is also important. If
the probability of error at the receiver increases without bound as the rate is increased. So no useful information can be transmitted beyond the channel capacity. The theorem does not address the rare situation in which rate and capacity are equal.
The Shannon–Hartley theorem establishes what that channel capacity is for a finitebandwidth continuoustime channel subject to Gaussian noise. It connects Hartley's result with Shannon's channel capacity theorem in a form that is equivalent to specifying the M in Hartley's line rate formula in terms of a signaltonoise ratio, but achieving reliability through errorcorrection coding rather than through reliably distinguishable pulse levels.
If there were such a thing as an infinitebandwidth, noisefree analog channel, one could transmit unlimited amounts of errorfree data over it per unit of time. Real channels, however, are subject to limitations imposed by both finite bandwidth and nonzero noise.
So how do bandwidth and noise affect the rate at which information can be transmitted over an analog channel?
Surprisingly, bandwidth limitations alone do not impose a cap on maximum information rate. This is because it is still possible for the signal to take on an indefinitely large number of different voltage levels on each symbol pulse, with each slightly different level being assigned a different meaning or bit sequence. If we combine both noise and bandwidth limitations, however, we do find there is a limit to the amount of information that can be transferred by a signal of a bounded power, even when clever multilevel encoding techniques are used.
In the channel considered by the ShannonHartley theorem, noise and signal are combined by addition. That is, the receiver measures a signal that is equal to the sum of the signal encoding the desired information and a continuous random variable that represents the noise. This addition creates uncertainty as to the original signal's value. If the receiver has some information about the random process that generates the noise, one can in principle recover the information in the original signal by considering all possible states of the noise process. In the case of the ShannonHartley theorem, the noise is assumed to be generated by a Gaussian process with a known variance. Since the variance of a Gaussian process is equivalent to its power, it is conventional to call this variance the noise power.
Such a channel is called the Additive White Gaussian Noise channel, because Gaussian noise is added to the signal; "white" means equal amounts of noise at all frequencies within the channel bandwidth. Such noise can arise both from random sources of energy and also from coding and measurement error at the sender and receiver respectively. Since sums of independent Gaussian random variables are themselves Gaussian random variables, this conveniently simplifies analysis, if one assumes that such error sources are also Gaussian and independent.
Comparing the channel capacity to the information rate from Hartley's law, we can find the effective number of distinguishable levels M:^{[7]}
The square root effectively converts the power ratio back to a voltage ratio, so the number of levels is approximately proportional to the ratio of rms signal amplitude to noise standard deviation.
This similarity in form between Shannon's capacity and Hartley's law should not be interpreted to mean that M pulse levels can be literally sent without any confusion; more levels are needed, to allow for redundant coding and error correction, but the net data rate that can be approached with coding is equivalent to using that M in Hartley's law.
In the simple version above, the signal and noise are fully uncorrelated, and in that case S + N is the total power of the received signal and noise together. A generalization of the above equation for the case where the additive noise is not white (or that the S/N is not constant with frequency over the bandwidth) is obtained by treating the channel as many narrow, independent Gaussian channels in parallel:
where
Note: the theorem only applies to noises that are Gaussian stationary processes. This formula's way of introducing frequencydependent noise cannot describe all continuoustime noise processes. For example, consider a noise process that consists of adding a random wave whose amplitude is 1 or 1 at any point in time, and a channel that adds such a wave to the source signal. Such a wave's frequency components are highly dependent. Though such a noise may have a high power, it is fairly easy to transmit a continuous signal with much less power than one would need if the underlying noise was a sum of independent noises in each frequency band.
For large or small and constant signaltonoise ratios, the capacity formula can be approximated:
In a guest lecturer presentation delivered in December 2009 at Columbia University in New York City, OnLine(tm) executive Steven Perlman described his efforts to break the current limitation. In his words: "People talk about Shannon's Law, give me a break, it's not a law  it is a limit". He described how his company's current compression technology delivers 100X Shannon in one wireless channel. He argued that Shannon was not wrong, but that "he was asking the wrong questions".
A video clip of the presentation was featured on CrunchGear and is located at OnLive Presentation (the relevant section is discussed around time frame 39:05)
