Comparison of analog and digital recording
This article compares the two ways in which sound is recorded and stored. Actual sound waves consist of continuous variations in air pressure. Representations of these signals can be recorded using either digital or analog techniques.
An analog recording is one where a property or characteristic of a physical recording medium is made to vary in a manner analogous to the variations in air pressure of the original sound. Generally, the air pressure variations are first converted (by a transducer such as a microphone) into an electrical analog signal in which either the instantaneous voltage or current is directly proportional to the instantaneous air pressure (or is a function of the pressure). The variations of the electrical signal in turn are converted to variations in the recording medium by a recording machine such as a tape recorder or cutting lathe. Examples of properties that are modified are the magnetization of magnetic tape or the deviation (or displacement) of the groove of a gramophone disc from a smooth, flat spiral track.
A digital recording is produced by converting the physical properties of the original sound into a sequence of numbers, which can then be stored and read back for reproduction. Normally, the sound is converted into an electrical analog signal in the same way as for analog recording, and then the analog signal is converted to a digital signal, through an analog-to-digital converter and then recorded onto a digital storage medium such as a compact disc or hard disk.
Two prominent differences in functionality between the two methods are the bandwidth and the signal-to-noise ratio (S/N). The bandwidth of the digital system is determined, according to the Nyquist frequency, by the sample rate used. The bandwidth of an analog system is dependent on the physical capabilities of the analog circuits. The S/N of a digital system may be limited by the bit depth of the digitization process, but the electronic implementation of conversion circuits introduces additional noise. In an analog system, other natural analog noise sources exist, such as flicker noise and imperfections in the recording medium. Other performance differences are specific to one kind of system or the other, such as the more transparent filtering algorithms available in digital systems and the harmonic saturation and speed variations of analog systems.
Overview of differences
It is a subject of debate whether analog audio is superior to digital audio or vice versa. The question is highly dependent on the quality of the systems (analog or digital) under review, and other factors which are not necessarily related to sound quality. Arguments for analog systems include the absence of fundamental error mechanisms which are present in digital audio systems, including aliasing, quantization noise, and the absolute limitation of dynamic range. Advocates of digital point to the high levels of performance possible with digital audio, including excellent linearity in the audible band and low levels of noise and distortion.
Accurate, high quality sound reproduction is possible with both analog and digital systems but in general it tends to be less expensive to achieve any given standard of technical signal quality with a digital system. One of the most limiting aspects of analog technology is the sensitivity of analog media to minor physical degradation; however, when the degradation is more pronounced, analog systems usually perform better, often still producing recognizable sound, while digital systems will usually fail completely, unable to play back anything from the medium (see digital cliff). The principal advantages that digital systems have are a very uniform source fidelity, inexpensive media duplication, and direct use of the digital 'signal' in today's popular portable storage and playback devices. Analog recordings by comparison require comparatively bulky, high-quality playback equipment to capture the signal from the media as accurately as digital.
Early in the development of the Compact Disc, engineers realized that the perfection of the spiral of bits was critical to playback fidelity. A scratch the width of a human hair (100 micrometres) could corrupt several dozen bits, resulting in at best a pop, and far worse, a loss of synchronization of the clock and data, giving a long segment of noise until resynchronized. This was addressed by encoding the digital stream with a multi-tiered error-correction coding scheme which reduces CD capacity by about 20%, but makes it tolerant to hundreds of surface imperfections across the disk without loss of signal. In essence, "error correction" can be thought of as "using the mathematically encoded backup copies of the data that was corrupted." Not only does the CD use redundant data, but it also mixes up the bits in a predetermined way (see CIRC) so that a small flaw on the disc will affect fewer consecutive bits of the decoded signal and allow for more effective error correction using the available backup information.
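The effect of interleaving on a burst error can be illustrated with a toy block interleaver. This is a minimal sketch and not CIRC itself, which cross-interleaves Reed-Solomon code words using delay lines; the function names, the 4 x 6 block size and the position of the simulated scratch are all assumptions chosen for illustration.

```python
def interleave(symbols, rows, cols):
    """Write symbols row by row into a rows x cols block, read them column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave(): restore the original row-by-row order."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = symbols[i]
            i += 1
    return out


ROWS, COLS = 4, 6                      # hypothetical block size
data = list(range(ROWS * COLS))        # stand-in for 24 data symbols
on_disc = interleave(data, ROWS, COLS)

# Simulate a scratch that wipes out 4 consecutive symbols on the disc.
damaged = list(on_disc)
for i in range(8, 12):
    damaged[i] = None

recovered = deinterleave(damaged, ROWS, COLS)
lost_positions = [i for i, s in enumerate(recovered) if s is None]
print(lost_positions)                  # [2, 8, 14, 20]
```

After de-interleaving, each row of six symbols has lost one symbol instead of one row losing four in a run, which is the situation a per-row error-correcting code is best placed to repair.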
Error correction allows digital formats to tolerate quite a bit more media deterioration than analog formats. That is not to say poorly produced digital media are immune to data loss. Laser rot was most troublesome to the Laserdisc format, but also occurs to some pressed commercial CDs, and was caused in both cases by inadequate disc manufacture. There can occasionally be difficulties related to the use of consumer recordable/rewritable compact discs. This may be due to poor-quality CD recorder drives, low-quality discs, or incorrect storage, as the information-bearing dye layer of most CD-recordable discs is at least slightly sensitive to UV light and will be slowly bleached out if exposed to any amount of it. Most digital recordings rely at least to some extent on computational encoding and decoding and so may become completely unplayable if not enough consecutive good data is available for the decoder to synchronize to the digital data stream, whereas any intact segment of any size of an analog recording is playable.
Unlike analog duplication, digital copies are exact replicas, which can be duplicated indefinitely without degradation. This made digital rights management more of an issue in digital media than in analog media. Digital systems often allow the same medium to be used with arbitrarily high or low quality encoding methods and with varying numbers of channels or other content, unlike practically all analog systems, which have mechanically fixed speeds and channels. Most higher-end analog recording systems offer a few recording speeds, but digital systems tend to offer much finer variation in the rate of media usage.
There are also several non-sound related advantages of digital systems that are practical. Digital systems that are computer-based make editing much easier through rapid random access, seeking, and scanning for non-linear editing. Most digital systems also allow non-audio data to be encoded into the digital stream, such as information about the artist, track titles, etc., which is often convenient.
Noise and distortion
In the process of recording, storing and playing back the original analog sound wave (in the form of an electronic signal), it is unavoidable that some signal degradation will occur. This degradation is in the form of distortion and noise. Noise is unrelated in time to the original signal content, while distortion is in some way related in time to the original signal content.
For electronic audio signals, sources of noise include mechanical, electrical and thermal noise in the recording and playback cycle. The actual process of digital conversion will always add some noise, however small in intensity. In a high-quality system the bulk of this is quantization noise, which cannot be avoided even in theory, but some will also be electrical, thermal and other noise from the analog-to-digital converter.
The amount of noise that a piece of audio equipment adds to the original signal can be quantified. Mathematically, this can be expressed by means of the signal to noise ratio (SNR or S/N). Sometimes the maximum possible dynamic range of the system is quoted instead. In a digital system, the number of quantization levels, in binary systems determined by and typically stated in terms of the number of bits, will have a bearing on the level of noise and distortion added to that signal. Each additional quantization bit adds 6 dB in possible SNR, e.g. 24 x 6 = 144 dB for 24 bit quantization, 126 dB for 21-bit, and 120 dB for 20-bit.
The 16-bit digital system of Red Book audio CD has 2^16 = 65,536 possible signal amplitudes, theoretically allowing for an SNR of 98 dB if undithered. However, the perceived dynamic range of 16-bit audio can be 120 dB or more with noise-shaped dither, taking advantage of the frequency response of the human ear.
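The figures of roughly 6 dB per bit and 98 dB for 16 bits follow from two standard textbook expressions, sketched below. The function names are illustrative; the 1.76 dB term applies to an undithered full-scale sine wave and is the usual idealized assumption rather than a measurement of any real converter.

```python
import math


def levels(bits):
    """Number of distinct amplitude values an N-bit sample can take."""
    return 2 ** bits


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step: 20*log10(2**N), about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)


def sine_snr_db(bits):
    """Theoretical SNR of an undithered full-scale sine wave: about 6.02*N + 1.76 dB."""
    return dynamic_range_db(bits) + 10 * math.log10(1.5)


for n in (16, 20, 24):
    print(n, levels(n), round(dynamic_range_db(n), 1), round(sine_snr_db(n), 1))
```

For 16 bits this gives about 96.3 dB for the step-to-full-scale ratio and about 98.1 dB for the sine-wave SNR, which is why both 96 dB and 98 dB are quoted for CD audio.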
With digital systems, the quality of reproduction depends on the analog-to-digital and digital-to-analog conversion steps, and does not depend on the quality of the recording medium, provided it is adequate to retain the digital values without error.
Consumer analog cassette tapes may have a dynamic range of 60 to 70 dB. Analog FM broadcasts rarely have a dynamic range exceeding 50 dB, though under excellent reception conditions the basic FM transmission system can achieve just over 80 dB. The dynamic range of a direct-cut vinyl record may surpass 70 dB. Analog studio master tapes using Dolby-A noise reduction can have a dynamic range of around 80 dB.
"Rumble" is a form of noise caused by imperfections in the bearings of turntables. The platter tends to have a slight amount of motion besides the desired rotation: the turntable surface also moves up-and-down and side-to-side slightly. This additional motion is added to the desired signal as noise, usually of very low frequencies, creating a "rumbling" sound during quiet passages. Very inexpensive turntables sometimes used ball bearings, which are very likely to generate audible amounts of rumble. More expensive turntables tend to use massive sleeve bearings, which are much less likely to generate offensive amounts of rumble. Increased turntable mass also tends to lead to reduced rumble. A good turntable should have rumble at least 60 dB below the specified output level from the pick-up.
Wow and flutter
Wow and flutter are a change in frequency of an analog device and are the result of mechanical imperfections, with wow being a slower rate form of flutter. Wow and flutter are most noticeable on signals which contain pure tones. For LP records, the quality of the turntable will have a large effect on the level of wow and flutter. A good turntable will have wow and flutter values of less than 0.05%, which is the speed variation from the mean value. Wow and flutter can also be present in the recording, as a result of the imperfect operation of the recorder.
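To give a sense of scale for the 0.05% figure, the sketch below converts a speed error into the frequency actually heard and into a pitch error in cents. The function names and the 1 kHz test tone are assumptions for illustration; the only relationship used is that played-back frequency scales in direct proportion to playback speed.

```python
import math


def played_frequency(nominal_hz, speed_error_percent):
    """Frequency heard when the medium runs fast or slow by the given percentage."""
    return nominal_hz * (1 + speed_error_percent / 100)


def pitch_error_cents(speed_error_percent):
    """Pitch deviation in cents (1200 cents per octave) for a given speed error."""
    return 1200 * math.log2(1 + speed_error_percent / 100)


print(played_frequency(1000, 0.05))        # a 1 kHz tone at +0.05% speed
print(round(pitch_error_cents(0.05), 2))   # pitch shift in cents
```

A 0.05% speed deviation moves a 1 kHz tone by about half a hertz, a little under one cent of pitch.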
The frequency response of the standard for audio CDs is sufficiently wide to cover the entire normal audible range, which roughly extends from 20 Hz to 20 kHz. Commercial and industrial digital recorders record higher frequencies, while consumer systems inferior to the CD record a more restricted frequency range. The frequency response of analog audio is generally less flat than that of digital audio, and it varies with the electronics used.
For digital systems, the upper limit of the frequency response is determined by the sampling frequency. The choice of sample rate used in a digital system is based on the Nyquist-Shannon sampling theorem. This states that a sampled signal can be reproduced exactly as long as it is sampled at a frequency greater than twice the bandwidth of the signal. Therefore, a sampling rate of 40 kHz would be theoretically enough to capture all the information contained in a signal having frequency bandwidth up to 20 kHz. The sampling theorem assumes ideal filters, however, which cannot exist in reality, so practical sampling uses "guard bands" (higher than necessary sample rates) to reduce aliasing.
High quality open-reel machines can extend from 10 Hz to above 20 kHz. The linearity of the response may be indicated by providing information on the level of the response relative to a reference frequency. For example, a system component may have a response given as 20 Hz to 20 kHz +/- 3 dB relative to 1 kHz. Some analog tape manufacturers specify frequency responses up to 20 kHz, but these measurements may have been made at lower signal levels. Compact cassettes may have a response extending up to 15 kHz at full (0 dB) recording level (Stark 1989). At lower levels (usually -10 dB), cassettes typically roll off at around 20 kHz on most machines, owing to self-erasure in the tape medium (which worsens the linearity of the response).
The frequency response for a conventional LP player might be 20 Hz - 20 kHz +/- 3 dB. Unlike the audio CD, vinyl records and cassettes do not require a cut-off in response above 20 kHz. The low frequency response of vinyl records is restricted by rumble noise (described above). The high frequency response of vinyl depends on the cartridge. CD4 records contained frequencies up to 50 kHz, while some high-end turntable cartridges have frequency responses of 120 kHz while having flat frequency response over the audible band (e.g. 20 Hz to 15 kHz +/-0.3 dB). In addition, frequencies of up to 122 kHz have been experimentally cut on LP records.
With vinyl records, there will be some loss in fidelity on each playing of the disc. This is due to the wear of the stylus in contact with the record surface. A good quality stylus, matched with a correctly set up pick-up arm, should cause minimal surface wear. Magnetic tapes, both analog and digital, wear from friction between the tape and the heads, guides, and other parts of the tape transport as the tape slides over them. The brown residue deposited on swabs during cleaning of a tape machine's tape path is actually particles of magnetic coating shed from tapes. Tapes can also suffer creasing, stretching, and frilling of the edges of the plastic tape base, particularly from low-quality or out-of-alignment tape decks. When a CD is played, there is no physical contact involved, and the data is read optically using a laser beam. Therefore, no such media deterioration takes place, and the CD will, with proper care, sound exactly the same every time it is played (discounting aging of the player and CD itself); however, this is a benefit of the optical system, not of digital recording, and the Laserdisc format enjoys the same non-contact benefit with analog optical signals. Recordable CDs slowly degrade with time, a process called disc rot, even if they are not played and are stored properly. A recordable disc called M-DISC, which is said to last 1,000 years, has also been released. These discs have a layer developed from stone and come in CD, DVD and Blu-ray formats in various storage sizes. They can be used for music, movies, computer data and so on.
Technical difficulty arises with digital sampling in that all high frequency signal content above the Nyquist frequency must be removed prior to sampling, which, if not done, will result in these ultrasonic frequencies "folding over" into frequencies which are in the audible range, producing a kind of distortion called aliasing. The difficulty is that designing a brick-wall anti-aliasing filter, a filter which would precisely remove all frequency content exactly above or below a certain cutoff frequency, is impractical. Instead, a sample rate is usually chosen which is above the theoretical requirement. This solution is called oversampling, and allows a less aggressive and lower-cost anti-aliasing filter to be used.
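The "folding over" of frequencies above the Nyquist frequency can be worked out directly. The following is a minimal sketch for a single pure tone sampled with no anti-aliasing filter at all; the function name and the test frequencies are illustrative assumptions.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency of a tone at f_hz after sampling without an anti-aliasing filter."""
    f = f_hz % sample_rate_hz            # sampling cannot tell f from f + k * sample rate
    nyquist = sample_rate_hz / 2
    return f if f <= nyquist else sample_rate_hz - f


FS = 44_100                              # CD sample rate in Hz
for tone in (10_000, 25_000, 30_000, 43_100):
    print(tone, "->", alias_frequency(tone, FS))
# 10000 -> 10000
# 25000 -> 19100
# 30000 -> 14100
# 43100 -> 1000
```

A 10 kHz tone is below the 22.05 kHz Nyquist frequency and is unaffected, while the three ultrasonic tones all land inside the audible band, which is why they must be filtered out before sampling.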
Unlike digital audio systems, analog systems do not require filters for bandlimiting. These filters act to prevent aliasing distortions in digital equipment. Early digital systems may have suffered from a number of signal degradations related to the use of analog anti-aliasing filters, e.g., time dispersion, nonlinear distortion, temperature dependence of filters etc. (Hawksford 1991:8). Even with sophisticated anti-aliasing filters used in the recorder, it is still demanding for the player not to introduce more distortion.
Hawksford (1991:18) highlighted the advantages of digital converters that oversample. Using an oversampling design and a modulation scheme called sigma-delta modulation (SDM), analog anti-aliasing filters can effectively be replaced by a digital filter. This approach has several advantages. The digital filter can be made to have a near-ideal transfer function, with low in-band ripple, and no aging or thermal drift.
Higher sampling rates
CD quality audio is sampled at 44.1 kHz (Nyquist frequency = 22.05 kHz) and at 16 bits. Sampling the waveform at higher frequencies and allowing for a greater number of bits per sample allows noise and distortion to be reduced further. DAT can sample audio at up to 48 kHz, while DVD-Audio can be 96 or 192 kHz and up to 24 bits resolution. With any of these sampling rates, signal information is captured above what is generally considered to be the human hearing range.
Work done in 1981 by Muraoka et al. showed that music signals with frequency components above 20 kHz were only distinguished from those without by a few of the 176 test subjects (Kaoru & Shogo 2001). Later papers, however, by a number of different authors, have led to a greater discussion of the value of recording frequencies above 20 kHz. Such research led some to the belief that capturing these ultrasonic sounds could have some audible benefit. Audible differences were reported between recordings with and without ultrasonic responses. Dunn (1998) examined the performance of digital converters to see if these differences in performance could be explained. He did this by examining the band-limiting filters used in converters and looking for the artifacts they introduce.
A perceptual study by Nishiguchi et al. (2004) concluded that "no significant difference was found between sounds with and without very high frequency components among the sound stimuli and the subjects... however, [Nishiguchi et al] can still neither confirm nor deny the possibility that some subjects could discriminate between musical sounds with and without very high frequency components."
Additionally, in blind tests conducted by Bob Katz, recounted in his book Mastering Audio: The Art and the Science, he found that listening subjects could not discern any audible difference between sample rates with optimum A/D conversion and filter performance. He posits that the primary reason for any aural variation between sample rates is due largely to poor performance of low-pass filtering prior to conversion, and not variance in ultrasonic bandwidth. These results suggest that the main benefit to using higher sample rates is that it pushes consequential phase distortion out of the audible range and that, under ideal conditions, higher sample rates may not be necessary.
A signal is recorded digitally by an analog-to-digital converter, which measures the amplitude of an analog signal at regular intervals, which are specified by the sample rate, and then stores these sampled numbers in computer hardware. The fundamental problem with numbers on computers is that the range of values that can be represented is finite, which means that during sampling, the amplitude of the audio signal must be rounded. This process is called quantization, and these small errors in the measurements are manifested aurally as a form of low level distortion.
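The rounding step can be simulated to see how the error shrinks as bits are added. This is a sketch under simplifying assumptions: an ideal uniform quantizer, a 997 Hz sine at 99% of full scale, a 44.1 kHz sample rate and no dither. The function names and parameters are illustrative and do not model any particular converter.

```python
import math


def quantize(x, bits):
    """Round x (in the range -1.0..1.0) to the nearest of the 2**bits levels."""
    step = 2.0 / (2 ** bits)               # size of one quantization step
    q = round(x / step) * step
    return max(-1.0, min(1.0 - step, q))   # clamp to the representable range


def measured_snr_db(bits, n=20_000, freq=997.0, rate=44_100.0):
    """SNR in dB of a near-full-scale sine after quantization to the given bit depth."""
    signal_power = 0.0
    error_power = 0.0
    for i in range(n):
        x = 0.99 * math.sin(2 * math.pi * freq * i / rate)
        e = quantize(x, bits) - x          # the rounding error for this sample
        signal_power += x * x
        error_power += e * e
    return 10 * math.log10(signal_power / error_power)


for bits in (8, 12, 16):
    print(bits, round(measured_snr_db(bits), 1))
```

The measured values should come out close to 6.02 dB per bit plus 1.76 dB, that is, roughly 50, 74 and 98 dB for 8, 12 and 16 bits.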
Analog systems do not have discrete digital levels in which the signal is encoded. Consequently, the original signal can be preserved to an accuracy limited only by the intrinsic noise-floor and maximum signal level of the media and the playback equipment, i.e., the dynamic range of the system. This form of distortion, sometimes called granular or quantization distortion, has been pointed to as a fault of some digital systems and recordings (Knee & Hawksford 1995, Stuart n.d.:6). Knee & Hawksford (1995:3) drew attention to the deficiencies in some early digital recordings, where the digital release was said to be inferior to the analog version.
The range of possible values that can be represented numerically by a sample is defined by the number of binary digits used. This is called the resolution, and is usually referred to as the bit depth in the context of PCM audio. The quantization noise level is directly determined by this number, decreasing exponentially as the resolution increases (or linearly in dB units), and with an adequate number of true bits of quantization, random noise from other sources will dominate and completely mask the quantization noise. The Red Book CD standard uses 16 bits, which keeps the quantization noise 96 dB below maximum amplitude, far below a discernible level with almost any source material.
Dither as a solution
It is possible to make quantization noise more audibly benign by applying dither. To do this, a noise-like signal is added to the original signal before quantization. Dither makes the digital system behave as if it has an analog noise-floor. Optimal use of dither (triangular probability density function dither in PCM systems) has the effect of making the rms quantization error independent of signal level (Dunn 2003:143), and allows signal information to be retained below the least significant bit of the digital system (Stuart n.d.:3).
Dither algorithms also commonly have an option to employ some kind of noise shaping, which pushes the frequency response of the dither noise to areas that are less audible to human ears. This has no statistical benefit, but rather it raises the S/N of the audio that is apparent to the listener.
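The claim that dither allows information to be retained below the least significant bit can be checked with a small simulation. The sketch below works in units of one LSB and uses a constant input of 0.3 LSB; the TPDF dither is formed as the sum of two uniform random values, and the seed, sample count and function names are assumptions made for illustration. Noise shaping is not modelled.

```python
import math
import random


def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/-1 LSB: the sum of two uniform +/-0.5 LSB values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)


def quantize_lsb(x):
    """Quantize a value expressed in LSB units to the nearest integer level."""
    return math.floor(x + 0.5)


rng = random.Random(0)     # fixed seed so the run is repeatable
level = 0.3                # a constant signal smaller than one quantization step
n = 100_000

plain = [quantize_lsb(level) for _ in range(n)]
dithered = [quantize_lsb(level + tpdf_dither(rng)) for _ in range(n)]

print(sum(plain) / n)      # 0.0: the signal vanishes without dither
print(sum(dithered) / n)   # close to 0.3: the average recovers the signal
```

Without dither every sample rounds to zero and the signal is lost entirely. With dither each individual sample is noisier, but the signal survives in the average, which is the sense in which the system behaves as if it had an analog noise floor.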
One aspect that may degrade the performance of a digital system is jitter. This is the phenomenon of variations in time from what should be the correct spacing of discrete samples according to the sample rate. This can be due to timing inaccuracies of the digital clock. Ideally a digital clock should produce a timing pulse at exactly regular intervals. Other sources of jitter within digital electronic circuits are data-induced jitter, where one part of the digital stream affects a subsequent part as it flows through the system, and power supply induced jitter, where DC ripple on the power supply output rails causes irregularities in the timing of signals in circuits powered from those rails.
The accuracy of a digital system is dependent on the sampled amplitude values, but it is also dependent on the temporal regularity of these values. This temporal dependency is inherent to digital recording and playback and has no analog equivalent, though analog systems have their own temporal distortion effects (pitch error and wow-and-flutter).
Periodic jitter produces modulation noise and can be thought of as being the equivalent of analog flutter (Rumsey & Watkinson 1995). Random jitter alters the noise floor of the digital system. The sensitivity of the converter to jitter depends on the design of the converter. It has been shown that a random jitter of 5 ns (nanoseconds) may be significant for 16 bit digital systems (Rumsey & Watkinson 1995). For a more detailed description of jitter theory, refer to Dunn (2003).
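A rough feel for why a few nanoseconds can matter comes from a commonly used approximation for the SNR ceiling that random sampling-clock jitter imposes on a full-scale sine wave, SNR = -20 log10(2 pi f t), where f is the signal frequency and t the rms jitter. This formula is a standard small-jitter approximation and is not taken from the sources cited above; the function names and the 20 kHz test frequency are assumptions.

```python
import math


def jitter_limited_snr_db(signal_hz, jitter_rms_s):
    """Approximate best-case SNR for a full-scale sine sampled with the given rms clock jitter."""
    return -20 * math.log10(2 * math.pi * signal_hz * jitter_rms_s)


def max_jitter_for_bits(signal_hz, bits):
    """RMS jitter at which jitter noise equals the ideal quantization noise of an N-bit converter."""
    target_snr_db = 6.02 * bits + 1.76
    return 10 ** (-target_snr_db / 20) / (2 * math.pi * signal_hz)


print(round(jitter_limited_snr_db(20_000, 5e-9), 1))   # 5 ns rms at 20 kHz
print(max_jitter_for_bits(20_000, 16))                 # jitter budget for 16 bits at 20 kHz
```

Under these assumptions, 5 ns of rms jitter limits a full-scale 20 kHz tone to about 64 dB, well short of the 98 dB available from 16-bit quantization, and matching 16-bit performance at that frequency would need jitter of roughly 100 picoseconds. The requirement relaxes in proportion at lower signal frequencies.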
Jitter can degrade sound quality in digital audio systems. In 1998, Benjamin and Gannon researched the audibility of jitter using listening tests (Dunn 2003:34). They found that the lowest level of jitter to be audible was around 10 ns (rms). This was on a 17 kHz sine wave test signal. With music, no listeners found jitter audible at levels lower than 20 ns. A paper by Ashihara et al. (2005) attempted to determine the detection thresholds for random jitter in music signals. Their method involved ABX listening tests. When discussing their results, the authors of the paper commented that:
'So far, actual jitter in consumer products seems to be too small to be detected at least for reproduction of music signals. It is not clear, however, if detection thresholds obtained in the present study would really represent the limit of auditory resolution or it would be limited by resolution of equipment. Distortions due to very small jitter may be smaller than distortions due to non-linear characteristics of loudspeakers. Ashihara and Kiryu evaluated linearity of loudspeaker and headphones. According to their observation, headphones seem to be more preferable to produce sufficient sound pressure at the ear drums with smaller distortions than loudspeakers.'
On the Internet-based hi-fi website, TNT Audio, Pozzoli (2005) describes some audible effects of jitter. His assessment appears to run contrary to the earlier papers mentioned:
'In my personal experience, and I would dare say in common understanding, there is a huge difference between the sound of low and high jitter systems. When the jitter amount is very high, as in very low cost CD players (2ns), the result is somewhat similar to wow and flutter, the well known problem that affected typically compact cassettes (and in a far less evident way turntables) and was caused by the non perfectly constant speed of the tape: the effect is similar, but here the variations have a far higher frequency and for this reasons are less easy to perceive but equally annoying. Very often in these cases the rhythmic message, the pace of the most complicated musical plots is partially or completely lost, music is dull, scarcely involving and apparently meaningless, it does not make any sense. Apart for harshness, the typical "digital" sound, in a word... In lower amounts, the effect above is difficult to perceive, but jitter is still able to cause problems: reduction of the soundstage width and/or depth, lack of focus, sometimes a veil on the music. These effects are however far more difficult to trace back to jitter, as can be caused by many other factors.'
The dynamic range of an audio system is a measure of the difference between the smallest and largest amplitude values that can be represented in a medium. Digital and analog differ in both the methods of transfer and storage, as well as the behavior exhibited by the systems due to these methods.
There are some differences in the behaviour of analog and digital systems when high level signals are present, where there is the possibility that such signals could push the system into overload. With high level signals, analog magnetic tape approaches saturation, and high frequency response drops in proportion to low frequency response. While undesirable, the audible effect of this can be reasonably unobjectionable (Elsea 1996). In contrast, digital PCM recorders show non-benign behaviour in overload (Dunn 2003:65); samples that exceed the peak quantization level are simply truncated, clipping the waveform squarely, which introduces distortion in the form of large quantities of higher-frequency harmonics. The 'softness' of analog tape clipping allows a usable dynamic range that can exceed that of some PCM digital recorders. (PCM, or pulse code modulation, is the coding scheme used in Compact Disc, DAT, PC sound cards, and many studio recording systems.)
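The contrast between the two overload behaviours can be sketched with two transfer curves. The hard clip follows the description of PCM overload given above. The tanh curve is only an assumed stand-in for a gradual, tape-like saturation and is not a model of any real tape machine; both function names are illustrative.

```python
import math


def hard_clip(x, limit=1.0):
    """Digital-style overload: anything beyond full scale is flattened at the limit."""
    return max(-limit, min(limit, x))


def soft_clip(x, limit=1.0):
    """Smooth saturation curve (tanh), used here only as a stand-in for tape-like compression."""
    return limit * math.tanh(x / limit)


for x in (0.25, 0.5, 1.0, 1.5, 2.0):
    print(x, hard_clip(x), round(soft_clip(x), 3))
```

The hard clipper is exactly linear up to full scale and then abruptly flat, which produces the sharp corners responsible for strong high-order harmonics. The soft curve bends progressively, so its output keeps rising slightly as the input grows and the waveform is rounded rather than squared off.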
In principle, PCM digital systems have the lowest level of nonlinear distortion at full signal amplitude. The opposite is usually true of analog systems, where distortion tends to increase at high signal levels. A study by Manson (1980) considered the requirements of a digital audio system for high quality broadcasting. It concluded that a 16 bit system would be sufficient, but noted the small reserve the system provided in ordinary operating conditions. For this reason, it was suggested that a fast-acting signal limiter or 'soft clipper' be used to prevent the system from becoming overloaded (Manson 1980:8).
With many recordings, high level distortions at signal peaks may be audibly masked by the original signal, thus large amounts of distortion may be acceptable at peak signal levels. The difference between analog and digital systems is the form of high-level signal error. Some early analog-to-digital converters displayed non-benign behaviour when in overload, where the overloading signals were 'wrapped' from positive to negative full-scale. Modern converter designs based on sigma-delta modulation may become unstable in overload conditions. It is usually a design goal of digital systems to limit high-level signals to prevent overload (Dunn 2003:65). To prevent overload, a modern digital system may compress input signals so that digital full-scale cannot be reached (Jones et al. 2003:4).
The dynamic range of digital audio systems can exceed that of analog audio systems. Typically, a 16 bit analog-to-digital converter may have a dynamic range of between 90 and 95 dB (Metzler 2005:132), whereas the signal-to-noise ratio (roughly the equivalent of dynamic range, noting the absence of quantization noise but presence of tape hiss) of a professional reel-to-reel 1/4 inch tape recorder would be between 60 and 70 dB at the recorder's rated output (Metzler 2005:111).
The benefits of using digital recorders with greater than 16 bit accuracy can be applied to the 16 bits of audio CD. Stuart (n.d.:3) stresses that with the correct dither, the resolution of a digital system is theoretically infinite, and that it is possible, for example, to resolve sounds at -110 dB (below digital full-scale) in a well-designed 16 bit channel.
Despite the lower dynamic range and signal-to-noise ratios a vinyl or tape record can achieve in theory (60-80 dB versus 90-96 dB for CD recordings), vinyl records may still be preferred for their greater dynamic range in practice because of aggressive dynamic range compression used for CD audio material (see Loudness war), a practice relatively uncommon for vinyl mastering.
After initial recording, it is common for the audio signal to be altered in some way, such as with the use of compression, equalization, delays and reverb. With analog, this comes in the form of outboard hardware components, and with digital, the same is accomplished with plug-ins that are utilized in the user's DAW.
A comparison of analog and digital filtering shows technical advantages to both methods, and there are several points that are relevant to the recording process.
Many analog units possess unique characteristics that are desirable. Common elements are band shapes and phase response of equalizers and response times of compressors. These traits can be difficult to reproduce digitally because they are due to electrical components which function differently from the algorithmic calculations used on a computer.
When altering a signal with a filter, the outputted signal may differ in time from the signal at the input, which is called a change in phase. Many equalizers exhibit this behavior, with the amount of phase shift differing in some pattern, and centered around the band that is being adjusted. This phase distortion can create the perception of a "ringing" sound around the filter band, or other coloration. Although this effect alters the signal in a way other than a strict change in frequency response, this coloration can sometimes have a positive effect on the perception of the sound of the audio signal.
Digital filters can be made to objectively perform better than analog components, because the variables involved can be precisely specified in the calculations.
One prime example is the invention of the linear phase equalizer, which has inherent phase shift that is homogeneous across the frequency spectrum. Digital delays can also be perfectly exact, provided the delay time is some multiple of the time between samples, and so can the summing of a multitrack recording, as the sample values are merely added together.
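The exactness of whole-sample delays and of summing can be seen in a few lines. This is a minimal sketch using lists of integer sample values; the function names and the sample data are assumptions, and real systems must also handle overflow when the sum exceeds full scale.

```python
def delay(samples, n):
    """Delay a track by n whole samples by prepending n samples of silence."""
    return [0] * n + list(samples)


def mix(*tracks):
    """Sum tracks sample by sample, padding shorter tracks with silence."""
    length = max(len(t) for t in tracks)
    return [sum(t[i] if i < len(t) else 0 for t in tracks) for i in range(length)]


dry = [100, -200, 300, 0]      # integer PCM sample values
echo = delay(dry, 2)           # at 48 kHz, 2 samples is about 41.7 microseconds
print(mix(dry, echo))          # [100, -200, 400, -200, 300, 0]
```

Because the delay only shifts sample positions and the mix only adds integers, neither operation introduces any error of its own, in contrast to an analog delay line or summing amplifier.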
A practical advantage of digital processing is the more convenient recall of settings. Plug-in parameters can be stored on the computer hard disk, whereas parameter details on an analog unit must be written down or otherwise recorded if the unit needs to be reused. This can be cumbersome when entire mixes must be recalled manually using an analog console and outboard gear. When working digitally, all parameters can simply be stored in a DAW project file and recalled instantly. Most modern professional DAWs also process plug-ins in real time, which means that processing can be largely non-destructive until final mix-down.
Many plug-ins now incorporate some kind of analog modeling. Some engineers endorse them and feel that they compare equally in sound to the analog processes they imitate. Digital models also carry some benefits over their analog counterparts, such as the ability to remove noise from the algorithms and to add modifications that make the parameters more flexible. Other engineers feel that the modeling is still inferior to the genuine outboard components and prefer to mix "outside the box".
Subjective evaluation attempts to measure how well an audio component performs according to the human ear. The most common form of subjective test is a listening test, where the audio component is simply used in the context for which it was designed. This test is popular with hi-fi reviewers, where the component is used for a length of time by the reviewer who then will describe the performance in subjective terms. Common descriptions include whether the component has a 'bright' or 'dull' sound, or how well the component manages to present a 'spatial image'.
Another type of subjective test is done under more controlled conditions and attempts to remove possible bias from listening tests. These sorts of tests are done with the component hidden from the listener, and are called blind tests. To prevent possible bias from the person running the test, the blind test may be done so that this person is also unaware of the component under test. This type of test is called a double-blind test. This sort of test is often used to evaluate the performance of digital audio codecs.
Critics of double-blind tests argue that they do not allow the listener to feel fully relaxed when evaluating the system component, so that the listener cannot judge differences between components as well as in sighted (non-blind) tests. Those who employ the double-blind testing method may try to reduce listener stress by allowing a certain amount of time for listener training (Borwick et al. 1994:481-488).
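The text does not prescribe how blind-test results are scored. One common approach, assumed here, is a one-sided binomial test: if the listener could hear no difference, each trial is a coin flip, and the question is how likely the observed score would be by guessing alone. The function name and the 12-of-16 score are illustrative.

```python
from math import comb

def p_value_guessing(correct, trials):
    """Chance of at least `correct` right answers in `trials` by pure guessing.

    Assumes two equally likely choices per trial.
    """
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Hypothetical session: 12 correct identifications out of 16 trials.
print(p_value_guessing(12, 16))   # about 0.038
```

A small value suggests the listener was probably not guessing. A large value does not prove the components sound identical, only that this test did not show a difference.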
Early digital recordings
Early digital audio machines had disappointing results, with digital converters introducing errors that the ear could detect (Watkinson 1994). Record companies released their first LPs based on digital audio masters in the late 1970s. CDs became available in the early 1980s. At this time analog sound reproduction was a mature technology.
There was a mixed critical response to early digital recordings released on CD. Compared to the vinyl record, the CD was noticed to be far more revealing of the acoustics and ambient background noise of the recording environment (Greenfield et al. 1986). For this reason, recording techniques developed for analog disc, e.g., microphone placement, needed to be adapted to suit the new digital format (Greenfield et al. 1986).
Some analog recordings were remastered for digital formats. Analog recordings made in natural concert hall acoustics tended to benefit from remastering (Greenfield et al. 1990). The remastering process was occasionally criticised for being poorly handled. When the original analog recording was fairly bright, remastering sometimes resulted in an unnatural treble emphasis (Greenfield et al. 1990).
Super Audio CD and DVD-Audio
The Super Audio CD (SACD) format was created by Sony and Philips, who were also the developers of the earlier standard audio CD format. SACD uses Direct Stream Digital (DSD), which works quite differently from the PCM format discussed in this article. Instead of using a greater number of bits and attempting to record a signal's precise amplitude for every sample cycle, a DSD recorder uses a technique called sigma-delta modulation. Using this technique, the audio data is stored as a sequence of fixed-amplitude (i.e. 1-bit) values at a sample rate of 2.8224 MHz, which is 64 times the 44.1 kHz sample rate used by CD. At any point in time, the amplitude of the original analog signal is represented by the relative preponderance of 1s over 0s in the data stream. This digital data stream can therefore be converted to analog by the simple expedient of passing it through a relatively benign analog low-pass filter. The competing DVD-Audio format uses standard linear PCM at variable sampling rates and bit depths, which at the very least match and usually greatly surpass those of standard CD audio (16 bits, 44.1 kHz).
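The idea that the density of 1s carries the amplitude can be sketched with a first-order sigma-delta modulator. This is a deliberately simplified model: practical DSD encoders are assumed to use more elaborate, higher-order designs. The function names are hypothetical, and a moving average stands in for the analog low-pass filter.

```python
DSD_RATE_HZ = 64 * 44_100   # 2,822,400 samples per second, i.e. 2.8224 MHz

def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: samples in [-1, 1] -> list of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0                      # previous output as -1.0 or +1.0
    bits = []
    for x in samples:
        integrator += x - feedback      # accumulate the error made so far
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits

def moving_average_decode(bits, window):
    """Crude low-pass filter: average the +/-1 pulses over a sliding window."""
    levels = [1.0 if b else -1.0 for b in bits]
    return [sum(levels[i:i + window]) / window
            for i in range(len(levels) - window + 1)]

# A steady input of +0.5 should give roughly three 1s for every 0.
bits = sigma_delta_1bit([0.5] * 4000)
print(sum(bits) / len(bits))                   # fraction of 1s, close to 0.75
print(moving_average_decode(bits, 400)[-1])    # recovered level, close to 0.5
```

The decoder never reconstructs individual multi-bit samples. It only smooths the pulse train, which is why a simple low-pass filter is enough for conversion back to analog.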
In the popular Hi-Fi press, it had been suggested that linear PCM "creates [a] stress reaction in people", and that DSD "is the only digital recording system that does not [...] have these effects" (Hawksford 2001). This claim appears to originate from a 1980 article by Dr John Diamond entitled Human Stress Provoked by Digitalized Recordings. The core of the claim that PCM (the only digital recording technique available at the time) recordings created a stress reaction rested on "tests" carried out using the pseudoscientific technique of applied kinesiology, for example by Dr Diamond at an AES 66th Convention (1980) presentation with the same title. Diamond had previously used a similar technique to demonstrate that rock music (as opposed to classical) was bad for your health due to the presence of the "stopped anapestic beat". Dr Diamond's claims regarding digital audio were taken up by Mark Levinson, who asserted that while PCM recordings resulted in a stress reaction, DSD recordings did not. A double-blind subjective test between high resolution linear PCM (DVD-Audio) and DSD did not reveal a statistically significant difference. Listeners involved in this test noted their great difficulty in hearing any difference between the two formats.
Some audio enthusiasts prefer the sound of vinyl records over that of a CD. Founder and editor Harry Pearson of The Absolute Sound journal says that "LPs are decisively more musical. CDs drain the soul from music. The emotional involvement disappears". Dub producer Adrian Sherwood has similar feelings about the analog cassette tape, which he prefers because of its warm sound.
Those who favour the digital format point to the results of blind tests, which demonstrate the high performance possible with digital recorders. The assertion is that the 'analog sound' is more a product of analog format inaccuracies than anything else. One of the first and largest supporters of digital audio was the classical conductor Herbert von Karajan, who said that digital recording was "definitely superior to any other form of recording we know". He also pioneered the unsuccessful Digital Compact Cassette and conducted the first recording ever to be commercially released on CD: Richard Strauss's Eine Alpensinfonie.
Was it ever entirely analog or digital?
Complicating the discussion is that recording professionals often mix and match analog and digital techniques in the process of producing a recording. Analog signals can be subjected to digital signal processing or effects, and inversely digital signals are converted back to analog in equipment that can include analog steps such as vacuum tube amplification.
For modern recordings, the controversy between analog recording and digital recording is becoming moot. No matter what format the user uses, the recording probably was digital at several stages in its life. In the case of video recordings it is moot for one other reason: whether the format is analog or digital, digital signal processing is likely to have been used at some stage in its life, such as digital timebase correction on playback.
An additional complication arises when discussing human perception of analog and digital audio, in that the human ear itself can appear to be an analog-digital hybrid. The human hearing mechanism begins with the tympanic membrane transferring vibrational motion through the middle ear's mechanical system of three bones (malleus, incus and stapes) into the cochlea, where hair-like nerve cells convert the vibrational stimulus into nerve impulses. Auditory nerve impulses are discrete signalling events which cause synapses to release neurotransmitters to communicate to other neurons. The all-or-none quality of the impulse can lead to a misconception that neural signalling is somehow 'digital' in nature, but in fact the timing and rate of these signalling events is not clocked or quantised in any way. Thus the transformation of the acoustic wave is not a process of sampling, in the sense of the word as it applies to digital audio. Instead it is a transformation from one analog domain to another, and this transformation is further processed by the neurons to which the signalling is connected. The brain then processes the incoming information and perceptually reconstructs the original analog input to the ear canal.
Two issues that affect the perception of sound playback are also worth noting. The first is the dynamic range of the human ear, which for practical and hearing-safety reasons might be regarded as 120 decibels, from a barely audible sound received by the ear in an otherwise silent environment to the threshold of pain or the onset of damage to the ear's delicate mechanism. The other issue is more complex: the presence and nature of background noise in any listening environment. Background noise reduces the useful dynamic range of hearing in ways that depend on the nature of the noise: its spectral content, its coherence or periodicity, angular aspects such as the localization of noise sources relative to the playback system's sources, and so on.
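To put these figures in perspective, the sketch below converts a decibel difference to a sound-pressure ratio and makes a deliberately crude estimate of the range left above a room's noise floor. The function names and the 35 dB SPL room are hypothetical, and the simple subtraction ignores the spectral and spatial factors described above.

```python
def db_to_pressure_ratio(db):
    """Convert a level difference in decibels to a sound-pressure ratio."""
    return 10 ** (db / 20)

def usable_range_db(loudest_db_spl, noise_floor_db_spl):
    """Crude estimate: the span between the loudest level and the room noise."""
    return max(0.0, loudest_db_spl - noise_floor_db_spl)

print(db_to_pressure_ratio(120))     # 120 dB is a million-to-one pressure ratio
print(usable_range_db(120, 35))      # hypothetical quiet room at 35 dB SPL
```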
While the words analog audio usually imply that the sound is described using a continuous time/continuous amplitudes approach in both the media and the reproduction/recording systems, and the words digital audio imply a discrete time/discrete amplitudes approach, there are methods of encoding audio that fall somewhere between the two, e.g. continuous time/discrete levels and discrete time/continuous levels.
While not as common as "pure analog" or "pure digital" methods, these situations do occur in practice. Indeed, all analog systems show discrete (quantized) behaviour at the microscopic scale, and asynchronously operated class-D amplifiers even consciously incorporate continuous time, discrete amplitude designs. Continuous amplitude, discrete time systems have also been used in many early analog-to-digital converters, in the form of sample-and-hold circuits. The boundary is further blurred by digital systems which statistically aim at analog-like behavior, most often by utilizing stochastic dithering and noise shaping techniques. While vinyl records and common compact cassettes are analog media and use quasi-linear physical encoding methods (e.g. spiral groove depth, tape magnetic field strength) without noticeable quantization or aliasing, there are analog non-linear systems that exhibit effects similar to those encountered on digital ones, such as aliasing and "hard" dynamic floors (e.g. frequency modulated hi-fi audio on videotapes, PWM encoded signals).
Although these "hybrid" techniques are usually more common in telecommunications systems than in consumer audio, their existence alone blurs the line between certain digital and analog systems, at least with regard to some of their alleged advantages or disadvantages.
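The way dither lets a digital system behave statistically like an analog one can be sketched as follows. A constant level of 0.3 of a quantization step is lost entirely when rounded without dither, but survives in the average once triangular (TPDF) dither is added before rounding. The function names, the 0.3 level and the fixed random seed are assumptions made for the example, and noise shaping is not modelled.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)

def mean_quantized(value, count, dither, seed=1):
    """Average `count` quantized copies of `value`, optionally with TPDF dither."""
    rng = random.Random(seed)           # seeded so that runs are repeatable
    total = 0
    for _ in range(count):
        # The difference of two uniform numbers is triangularly distributed
        # between -1 and +1 LSB.
        noise = rng.random() - rng.random() if dither else 0.0
        total += quantize(value + noise)
    return total / count

print(mean_quantized(0.3, 20000, dither=False))   # every sample rounds to 0
print(mean_quantized(0.3, 20000, dither=True))    # average lands near 0.3
```

The dithered output is noisier sample by sample, but its average tracks the input, so the quantizer's hard floor is replaced by a steady noise background.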
There are many benefits to using digital recording over analog recording because “numbers are more easily manipulated than are grooves on a record or magnetized particles on a tape” (Rudolph & Leonard, 2001, p. 3). Because the stored numbers can be read back and copied exactly, playback and duplication add no noise from the storage medium itself, although any noise picked up before or during conversion remains part of the recording.
- Note that Laserdisc, despite using a laser optical system that has become commonly associated with digital disc formats, is an old analog format, except for its optional digital audio tracks; the video image portion of the content is always analog.
- Unless imposed DRM restrictions apply.
- It is technically possible to implement analog systems with integrated digital metadata channels.
- "Chapter 21: Filter Comparison". dspguide.com. Retrieved 2012-09-13.
- Sony Europe (2001). Digital Audio Technology, 4th edn, edited by J. Maes & M. Vercammen. Focal Press.
- Montgomery, Chris (March 25, 2012). "24/192 Music Downloads ...and why they make no sense". xiph.org. Retrieved 26 May 2013.
With use of shaped dither, which moves quantization noise energy into frequencies where it's harder to hear, the effective dynamic range of 16 bit audio reaches 120dB in practice, more than fifteen times deeper than the 96dB claim. 120dB is greater than the difference between a mosquito somewhere in the same room and a jackhammer a foot away.... or the difference between a deserted 'soundproof' room and a sound loud enough to cause hearing damage in seconds. 16 bits is enough to store all we can hear, and will be enough forever.
- Stuart, J. Robert (1997). "Coding High Quality Digital Audio" (PDF). Meridian Audio Ltd. Retrieved 2016-02-25.
One of the great discoveries in PCM was that, by adding a small random noise (that we call dither) the truncation effect can disappear. Even more important was the realisation that there is a right sort of random noise to add, and that when the right dither is used, the resolution of the digital system becomes infinite.
- Driscoll, R. (1980). Practical Hi-Fi Sound, 'Analogue and digital', pages 61–64; 'The pick-up, arm and turntable', pages 79–82. Hamlyn. ISBN 0-600-34627-7.
- Technics EPC-100CMK4
- "mastering". Positive-feedback.com. Retrieved 2012-08-15.
- Byers, Fred R (October 2003). "Care and Handling of CDs and DVDs" (PDF). Council on Library and Information Resources. Retrieved 27 July 2014.
- Thompson, Dan. Understanding Audio. Berklee Press, 2005, ch. 14.
- Muraoka, Teruo; Iwahara, Makoto; Yamada, Yasuhiro (1981). "Examination of Audio-Bandwidth Requirements for Optimum Sound Signal Transmission". Journal of the Audio Engineering Society. 29 (1/2): 2–9.
- Dunn, Julian (1998). "Anti-alias and anti-image filtering: The benefits of 96kHz sampling rate formats for those who cannot hear above 20kHz" (PDF). Nanophon Limited. Retrieved 27 July 2014.
- Nishiguchi, Toshiyuki; Iwaki, Masakazu; Ando, Akio (2004). Perceptual Discrimination between Musical Sounds with and without Very High Frequency Components. NHK Laboratories Note No. 486 (Report). NHK. Retrieved August 15, 2012.
- Katz, Bob (October 3, 2007). Mastering Audio: The Art and the Science (2nd ed.). Focal Press. ISBN 978-0240808376.
- Archived January 31, 2014, at the Wayback Machine.
- "Jitter explained - Part 1.4 [English]". Tnt-audio.com. Retrieved 2012-08-15.
- John Eargle, Chris Foreman. Audio Engineering for Sound Reinforcement, The Advantages of Digital Transmission and Signal Processing. Retrieved 2012-09-14.
- "Secrets Of The Mix Engineers: Chris Lord-Alge". Retrieved 2012-09-13.
- "Digital stress". The Diamond Center. 2003. Archived from the original on 2004-08-12. Retrieved 17 July 2013.
- "AES E-Library » More on -Human Stress Provoked by Digitalized Recordings- and Reply". Aes.org. Retrieved 2013-08-16.
- Are the Kids All Right?: The Rock Generation and Its Hidden Death Wish, John Grant Fuller, ISBN 0812909704, pp130-135
- http://www.acoust.rise.waseda.ac.jp/1bitcons/1bitforum2002/Mark.pdf Archived March 23, 2014, at the Wayback Machine.
- "Red Rose Music SACDs". Redrosemusic.com. Retrieved 2013-08-16.
- "Stereophile eNewsletter". Stereophile.com. 2005-07-05. Retrieved 2013-08-16.
- Blech, Dominik; Yang, Min-Chi (8–11 May 2004). "DVD-Audio versus SACD Perceptual Discrimination of Digital Audio Coding Formats" (PDF). Audio Engineering Society. Archived from the original (PDF) on 27 September 2007. Retrieved 27 July 2014.
- James Paul (2003-09-26). "Last night a mix tape saved my life | Music | The Guardian". London: Arts.guardian.co.uk. Retrieved 2012-08-15.
- "ABX Testing article". Boston Audio Society. 1984-02-23. Retrieved 2012-08-15.
- "Analog or Digital?". St-andrews.ac.uk. Retrieved 2012-08-15.
- Ashihara, K. et al. (2005). "Detection threshold for distortions due to jitter on digital audio", Acoustical Science and Technology, Vol. 26 (2005), No. 1 pp. 50–54.
- Blech, D. & Yang, M. (2004). "Perceptual Discrimination of Digital Coding Formats", Audio Engineering Society Convention Paper 6086, May 2004.
- Croll, M. (1970). "Pulse Code Modulation for High Quality Sound Distribution: Quantizing Distortion at Very Low Signal Levels", Research Department Report No. 1970/18, BBC.
- Dunn, J. (1998). "The benefits of 96 kHz sampling rate formats for those who cannot hear above 20 kHz", Preprint 4734, presented at the 104th AES Convention, May 1998.
- Dunn, J. (2003). "Measurement Techniques for Digital Audio", Audio Precision Application Note #5, Audio Precision, Inc. USA. Retrieved March 9, 2008.
- Elsea, P. (1996). "Analog Recording of Sound". Electronic Music Studios at the University of California, Santa Cruz. Retrieved 9 March 2008.
- Ely, S. (1978). "Idle-channel noise in p.c.m. sound-signal systems". BBC Research Department, Engineering Division.
- Greenfield, E. et al. (1986). The Penguin Guide to Compact Discs, Cassettes and LPs. Edited by Ivan March. Penguin Books, England.
- Greenfield, E. et al. (1990). The Penguin Guide to Compact Discs. Edited by Ivan March. Preface, viii-ix. Penguin Books, England. ISBN 0-14-046887-0.
- Hawksford, M. (1991). "Introduction to Digital Audio", Images of Audio, Proceedings of the 10th International AES Conference, London, September 1991. Retrieved March 9, 2008.
- Hawksford, M. (1995). "Bitstream versus PCM debate for high-density compact disc", ARA/Meridian web page, November 1995.
- Hawksford, M. (2001). "SDM versus LPCM: The Debate Continues", 110th AES Convention, paper 5397.
- Hicks, C. (1995). "The Application of Dither and Noise-Shaping to Nyquist-Rate Digital Audio: an Introduction", Communications and Signal Processing Group, Cambridge University Engineering Department, United Kingdom.
- Jones, W. et al. (2003). "Testing Challenges in Personal Computer Audio Devices". Paper presented at the 114th AES Convention. Audio Precision, Inc., USA. Retrieved March 9, 2008.
- Kaoru, A. & Shogo, K. (2001). "Detection threshold for tones above 22 kHz", Audio Engineering Society Convention Paper 5401. Presented at the 110th Convention, 2001.
- Knee, A. & Hawksford, M. (1995). "Evaluation of Digital Systems and Digital Recording Using Real Time Audio Data". Paper for the 98th AES Convention, February 1995, preprint 4003 (M-2).
- Lesurf, J. "Analog or Digital?", The Scots Guide to Electronics. Retrieved October 2007.
- Libbey, T. "Digital versus analog: digital music on CD reigns as the industry standard", Omni, February 1995.
- Lipshitz, S. "The Digital Challenge: A Report", The BAS Speaker, Aug-Sept 1984.
- Lipshitz, S. (2005). "The Rise of Digital Audio: The Good, the Bad, and the Ugly". Abstract of Heyser Memorial Lecture given by Prof. Stanley Lipshitz at the 118th AES Convention.
- Liversidge, A. "Analog versus digital: has vinyl been wrongly dethroned by the music industry?", Omni, February 1995.
- Manson, W. (1980). "Digital Sound: studio signal coding resolution for broadcasting". BBC Research Department, Engineering Division.
- Nishiguchi, T. et al. (2004). "Perceptual Discrimination between Musical Sounds with and without Very High Frequency Components", NHK Laboratories Note No. 486, NHK (Japan Broadcasting Corporation).
- Pozzoli, G. "DIGITabilis: crash course on digital audio interfaces. Part 1.4 - Enemy Interception. Effects of Jitter in Audio", "TNT-Audio - online HiFi review", 2005.
- Pohlmann, K. (2005). Principles of Digital Audio 5th edn, McGraw-Hill Comp.
- Rathmell, J. et al. (1997). "TDFD-based Measurement of Analog-to-Digital Converter Nonlinearity", Journal of the Audio Engineering Society, Volume 45, Number 10, pp. 832–840; October 1997.
- Rumsey, F. & Watkinson, J. (1995). The Digital Interface Handbook, 2nd edition. Sections 2.5 and 6. Pages 37 and 154-160. Focal Press.
- Stark, C. (1989). Encyclopædia Britannica, 15th edition, Volume 27, Macropaedia article 'Sound', section: 'High-fidelity concepts and systems', page 625.
- Stuart, J. (n.d.). "Coding High Quality Digital Audio". Meridian Audio Ltd, UK. Retrieved 9 March 2008. This article is substantially the same as Stuart's 2004 JAES article "Coding for High-Resolution Audio Systems", Journal of the Audio Engineering Society, Volume 52 Issue 3 pp. 117–144; March 2004.
- Watkinson J. (1994). An Introduction to Digital Audio. Section 1.2 'What is digital audio?', page 3; Section 2.1 'What can we hear?', page 26. Focal Press. ISBN 0-240-51378-9.