Sacred cows will get you nowhere
Let's try to shoot some holes in a few favorite topics of hi-fi reviewing. One of my pet hates is soundstaging. For some people, this seems to be very important. For me, it isn't. When asked if the hardware he sells images well, Colin Hammerton---an expatriate Brit working as British amp manufacturer Exposure's importer in Germany---says, "I don't want to hear where the musicians are on stage. I want to hear why they are on stage." I couldn't agree more. Please don't get the impression that I'm against soundstaging---it's nice to have. It just doesn't matter for my emotional reaction to music.
Out in the real world, however, soundstaging is very important. If a review were to state that a component makes wonderful music but can't image, sales would be practically nil, at least among the very large part of the clientele whose buying decisions are influenced by what's said by magazines and dealers (who rely on magazines as the most important sales aid).
The expression "sonic fireworks" is a recurring theme in hi-fi journalism. It seems also to describe the listening expectations of a certain type of hi-fi customer. "Ooh, look there . . . and there, to the right, outside the right-side speaker . . . ooh, and there, six yards behind the speakers . . . and there, over the speakers---isn't that beautiful!"
This listening style could be called visual-oriented listening, because it tries to describe sound in terms of visual experience. Visual-oriented listening is attractive because it allows a quantitative analysis ("The soundstage reproduced by the device under test was this broad, this deep, and this high."), which must be a big help in developing, and describing the sonic performance of, audio components.
It is also a defensible listening experience: We all know that the so-called objectivists try to knock the so-called subjective listeners. The latter have responded by turning into observational listeners (another visual term), relating an experience that other listeners can duplicate if the test conditions are identical---a prerequisite for gaining recognition as a scientific, and thus reputable, branch of engineering. (It must be hard to live your working life without the recognition of your peers.) Everyone with intact hearing will agree to reasonably identical dimensions of the soundstage, and the location of instruments in that soundstage.
However, there is no way yet to objectify the musical pleasure a component gives. A different listener will approach the same sonic demonstration with a different mood, different reactions to a musical stimulus, and so on. The emotional experience is not as easily transferable as the observational one.
A bonus of visual-oriented listening is that it is economically attractive. It allows listening irrespective of psychological and physical condition, and thus opens up a much larger part of the day to the accomplishment of meaningful work than if you could listen only when you were really in the mood for some music. For people who make their living from sonic judgments---designers, dealers, and journalists alike---I can see that it may be imperative. Problem is, this is in direct opposition to the listening experience of the paying customer, who wants to unwind from a day's work with a little musical entertainment.
Since visual-oriented listening is something at which a reviewer tends to get very good, it usually makes up a large part of a review's content. Magazine-reading audiophiles will be influenced in their listening habits by those reviews (the "learning" part that Jürgen Ackermann was talking about). They choose their systems and set them up so that the visual-oriented listening experience is emphasized. Many such systems, to me, sound boring. There's no meat on the bone.
An experiment: Disconnect one speaker from your setup and listen to the sound of just the remaining speaker, preferably with a mono source. I'm sure that few so-called high-end speakers (and systems) will survive this test. Many will sound bland and anemic. Two such speakers sound just the same, but probably a little fuller, because with the usual practice of mixing bass sounds straight down the middle, doubling the radiating surface of the bass drivers and doubling the available amplifier power gives a perceived 3dB rise in relative bass level. But the speakers don't become more interesting.
Another experiment: Listen to recorded voice. My favorite material for this kind of test is comedy records (Eddie Murphy, Bette Midler, and Bill Cosby spring to mind). Good comedy works on much the same principles as music. Timing is crucial, as are small inflections of the voice, speed of delivery, and so on. You'll be surprised by how few systems preserve intelligibility, an essential prerequisite for this kind of stuff. Dynamics and low-level resolution are much more important than timbral fidelity here.
While I'm at it, I'd also like to diss timbral fidelity. Of course, timbral fidelity is the essential prerequisite for the accurate reproduction of music in the home. It is also nigh on impossible to achieve, for sound scientific reasons.
Modern multimiked recordings tend to employ a microphone for every instrument or, at most, small group of instruments. The sound put down on (digital) tape is that of instruments at close proximity.
In the concert hall, one tends to listen from a much greater distance. Even if one were to sit at the conductor's feet, only a few instruments would be this close, the rest farther away. Thus, what is recorded on tape can never be heard in a real-world situation.
There is also the question of radiation patterns. In the concert hall, the sound one hears from a solo violin is a mixture of sound waves radiated by the strings and the top and bottom plates of the violin body, the latter two usually much lower in frequency than the strings to which they resonate and thus radiate with a broader radiation pattern. The close-proximity mike picks up a greater proportion of the string sound than would be heard live. The sound the microphone "hears" is only a fraction of the instrument's total sound that would be perceived by the typical listener, who sits much farther away than the typical microphone. This fraction will be prejudiced toward the higher frequencies. If the microphone's output is faithfully reproduced by all subsequent elements in the chain, the resulting sound will be unnatural. (In contrast, rock, pop, and blues music is usually amplified even when performed live. The sound you get from a recording is, on a certain level, faithful to the original.)
THX has drawn our attention to the fact that room size influences perceived tonal balance. Listening rooms tend to be much smaller than the halls or studios in which music is recorded (this is even more true in Europe than in the US). Thus, if a recording is true to the original event but is reproduced in a smaller room, it will sound too bright---which, again, seems to indicate that a truly flat system will sound bright (footnote 10).
These factors have long been known to audio designers. Having spoken to a number of manufacturers of drive-units, I know that it's relatively easy to make a tweeter with a flat on-axis amplitude response. But the loudspeaker designer knows that flat is not necessarily right (a point I'll return to later). Celestion's SL series, particularly the SL 600, was an international sales success and very well reviewed in all leading audio magazines, including this one. Its tweeter was shelved down 2dB vis-à-vis the woofer, but it sounded pleasingly natural in typical living rooms.
Conversely, many speakers have an on-axis rise in the tweeter's output to compensate for radiation patterns and give a flat room-averaged response, and to heighten the apparent level of detail a speaker can reproduce. To my ears, such speakers have always sounded way too bright. Summing up, I think it nigh on impossible to design components, especially loudspeakers, that will sound anything like their input in a variety of settings (footnote 11).
The third sacred cow waiting to be slaughtered is measurements. This magazine is working very hard to correlate the listening experience with measurements. I remain to be convinced that conventional measurements tell us much about whether a hi-fi component reaches the heart or not. In loudspeakers, there seems to be a fairly good correlation between a reasonably flat amplitude response and fidelity of timbre. In my own experience, low loudspeaker distortion and a reasonably flat phase response make for ease of listening, in the sense that I can listen for long periods of time without listening fatigue. Power bandwidth, perhaps more so for loudspeakers than amplifiers, will tell you if a component is apt to change its sound when the listening level goes up.
I think that good measurements are often an excuse for the designer: It measures well, so I haven't done anything wrong. Not doing anything wrong, however, does not automatically mean that the component under test will do enough right. To put it another way, I have yet to find a measurement that tells me if I'll want to listen to a component.
A final pet hate is detail. A proposal for the international language of hi-fi reviewing: There should be a distinction between detail and nuance. Just as a fact is mere data without an interpretable context, which only meaning can transform into information, a detail is meaningless without its context of musical direction, which transforms it into a nuance of interpretation (footnote 12). Dwelling on details like the audibility of a microphone falling down, the direction taken by a London underground line below the recording venue, or the chirping of a bird somewhere outside the recording venue, seems counterproductive: Such aspects take my attention away from the music and its meaning; they don't lead me to the music itself.
Magazines
A reviewer who relates his listening experience in terms of the emotional impact a component made on his enjoyment of music has a hell of a time getting his point across. As is evident from this magazine's "Letters," a lot of readers out there don't have a clue what he is on about. I can understand why: If the writer uses a type of music the reader can't relate to, it's hard to translate the review into a context relevant to his own preferred music. "Yeah, but how would it sound on my kind of music?" is a question often heard when discussing such reviews with readers. The prevailing impression seems to be that different music styles depend on different aspects of reproduced sound to carry their musical meanings.
A typical observation seems to be that for classical music, timbral fidelity, low-level dynamics, and, yes, soundstaging are considered important. (The soundstaging part I have never really understood; yes, I know, in the concert hall, the violins are seated on the left and the double basses on the right, but hey, they have to sit somewhere, and I have yet to read that a composer---Stockhausen excepted, and you never know if he's joking---specifies a certain seating arrangement for artistic reasons.) For rock music, essential aspects seem to be loudness, speed, rhythm and pace, and a tonal balance that conveys power in music.
These prejudices are so widely held that there must be something to them (although I submit that if you listen to Ansermet conducting, pace and rhythm are very important for his readings). And the dichotomy is so deeply anchored in the minds of music lovers that it seems almost insurmountable.
Yet it seems to me that for the reviewer, the way out need not lie in falling back on a sonic description of the audio experience. He should instead try to incorporate as many different styles of music into the review as possible, and describe the emotional impact these different styles have made. That means that the reviewer must educate himself in the appreciation of these different music styles.
Footnote 10: Which brings to mind J. Gordon Holt's famous "Down With Flat!" essay.
In my estimation, the writing style that prevails in current hi-fi journalism is an attempt to describe the sonic presentation as an abstraction from the listening experience, in an attempt to produce results that do not depend on a certain kind of music, but can be related to the perceived requirements of a given style of music. If a review states that a speaker has abrasive highs, it matters little if this observation was made while listening to massed violins or to a rock guitar. The assumption is that the reader then translates this observation to his own listening experience and decides if this particular aspect of music reproduction is important to his enjoyment of music or not.
But how do we know this assumption is true?
The way forward?
This article would be pretty pointless if it didn't at least try to find a way out of the dilemma we have brought upon ourselves. A magazine, after all, has to be useful (and entertaining) to its readers if it wants to survive. Here's the question we have to answer: Are there aspects of sound that are more important for emotional appreciation than others, and if so, which? It's clear that, however much I have derided sound per se up to now, somehow emotional response must be related to the waveform of the sound reaching our ears. There is no secret medium other than sound emanating from our speakers.
Let me introduce you to the thinking of another searcher for a new direction: Jean-Marie Piel, a 45-year-old journalist living in Paris, France. At the age of 15 Piel built his first hi-fi chain, consisting of a tube amp and Supravox speakers with a single chassis per channel. After earning his baccalaureat, the French equivalent of a high school diploma, he began to study literature and philosophy. He taught himself how to play the flute, which he taught for 13 years---beginning at age 23---at the Conservatoire de Fontainebleau. From the age of 20 he also worked as a journalist for hi-fi and music magazines, among other achievements writing and editing the "Arts Sonores" section of L'audiophile, the influential French underground magazine. Since 1985 he has been responsible for the sound section of Diapason, the largest music and sound magazine in France, as a joint editor-in-chief. He also writes regularly for Paris-Match. Jean-Marie Piel receives and listens to practically every CD that is offered in the French market, selecting two to four of these each month as musically and sonically outstanding. His knowledge of music, instruments, and musicians is encyclopaedic. He can eloquently explain the differences between instruments of different periods and why they evolved in a specific way. He writes (footnote 13):
"With his familiar ironic humor, Paul Valéry once said that 'the vice begins when one gives up the whole for the part'---a sentence that could be applied to a lot of hi-fi enthusiasts and that well [describes] the perversity that grips us when we take our pleasure by listening to music with those devices called loudspeakers. Modern miking techniques, which have a tendency to run amok on technology and to favor the detail at the expense of the ensemble, further push us in this direction: that of fragmented listening, even if one has to guard against overgeneralizations. But the fact is, if 20 microphones are used for recording an orchestra, there is little chance for the cohesion of the ensemble to survive. We are then reduced to hearing details, to take interest in nothing else. But the music escapes the detail; if the detail takes precedence, it is nothing but sound, a piece of sound. The music passes through it---if you stop to examine the detail, the music has already moved on. Of course, sound is the necessary medium for music. It's the sound that makes the music, not the notes. Still, by a mysterious paradox, fidelity to sound does not always coincide with fidelity to the emotion, which is the soul itself of music.
"Therein lies the rub: If one wants to judge a hi-fi system, one tends to erroneously concentrate on purely sonic details---are the lower mids good and are the extreme highs easy on the ear? As if one would ask such questions in a concert. In a concert, there is no woofer, no tweeter, there are only musicians playing. When listening to a hi-fi system, it is they and only they one should be listening to. It is true that a lot of components, in all price brackets, do not invite us to do this, and direct attention to the sound. We then have every occasion to think that the invisible link between notes that gives them musical meaning is not being reproduced. There's no necklace, just pearls . . . they may be beautiful, just as sounds made by certain sophisticated systems, which reproduce sounds superbly and with a certain implacable coldness, yet miss the soul of music, can be beautiful.
"All the difficulty lies in analyzing what is missing in the sound when living music is not happening. For the beginning of an answer we may turn our attention to certain chains, sometimes somewhat colored, missing the bottom or the top octave, which nevertheless reproduce the life and magic of musical movement. A certain timbral fidelity may be missing, but in a broad midrange where the essence of musical energy is concentrated (between about 200Hz and 4kHz), they are capable of perfectly reproducing nuances: ie, the intensity interrelations between sounds; or, to be more precise, the fluctuations of intensity within a single sound. This is where the life is. It's enough to analyze a note held by a musician to gain consciousness of this fact. You know that this held note comes from a musician and not a machine because there are infinitely small instabilities. The sound does not have a constant intensity. Sure, the variations are very small, but they exist. In the ability to reproduce these infinitely small nuances is the answer to the question of whether a chain will let through the life, without which, evidently, the music is just dead notes.
"Another example: Listen to the way a violinist like Salvatore Accardo lets sounds develop in the Beethoven Concerto. He attacks certain notes hard, with a broad vibrato (variations in pitch and volume), then progressively reduces the intensity, tying the whole down (which in itself creates several levels of nuance) until he flirts with silence. The way he lets the note finish, or die, is so subtly progressive that one doesn't quite know where the note ends and the silence begins. From this uncertainty, which makes us listen hard to save this fascinating passage from nothingness, arises a strong emotion. If superficiality enters into the reproduction---a kind of oversimplification in the rendition of nuances that gives the impression that the note, instead of dying away imperceptibly, is brutally cut off---the interpretation's magic is immediately destroyed. The artist leaves us indifferent because he doesn't force us to train our ears to the outer limits of audibility.
"The essence of an interpretation lies in working on the infinitely small---be it an attack on a note held back for a fraction of a second (perceptible if the preceding note is reproduced neither too short nor too long), or be it a note that develops in itself; or, on a larger level, a crescendo or diminuendo encompassing several notes---all of which gives music a sense of direction, its palpable dynamics, its quivering life, and all of which, in the end, lies in the nuances.
"Which explains, by the way, why certain old loudspeakers with a very high sensitivity and thus a very high precision in the rendition of dynamics, especially of very small signals---just like certain tube amplifiers with very simple circuits---and despite more or less obvious colorations and the omission of an octave or two, manage to reproduce with disturbing fidelity all the emotional intensity of an interpretation. Which should give our designers something to think about, and convince them that the musically more important kind of dynamics is that which loses itself in silence (footnote 14), not the kind that turns into noise."
Learning from our ancestors
I think it is no coincidence that Jean-Marie Piel would turn to "old" technology for inspiration. Some old gear can still hold up surprisingly well today. The American press, with the occasional exception from Sound Practices, has concentrated so far on triode amplifiers as "the new thing." Loudspeakers receive a lot less attention. I have given my opinion on triode amps and their qualities in this magazine (footnote 15), and of late there have been a number of articles on single-ended triodes. Instead of further amplifying this addiction to triode amps (which, contrary to what you may have been led to believe, are no panacea; if we have to talk about amps, I'd prefer to emphasize the role of the preamp), let me concentrate first on another piece of the hi-fi chain in need of a reevaluation: the loudspeaker.
Let's start with an unexpected item of old technology: vintage tube radios. Those of the 1940s to 1960s often have an astonishingly good sound quality. The frequency range of their single driver is severely restricted, but they have a magical coherence that more than compensates. All the really good ones seem to have a single-ended tube, not necessarily a triode; an EL86 pentode can sound wonderful in a single-ended topology. (By the way, Jean-Constant Verdier, designer of the best turntable I have ever had the pleasure to hear, has a huge collection of old tube radios.)
One of the more intriguing facts about old tube radios is the way they make use of their enclosures. These are not designed to be as acoustically inert as possible, as are most modern speakers, but are allowed to resonate with the music, a character trait shared with many old loudspeakers. The wood panels' size and density are judged so that those inevitable resonances are consonant with the music. Music seems to pass through them unscathed. If you listen to the output of modern speaker cabinets (using an ear pressed to the box; or, for a more dignified approach, a stethoscope), most sound horrible. The sounds emitted by an Altec Voice of the Theatre's cabinet can be much less objectionable (footnote 16).
Another facet of this phenomenon is the way the room is energized by a loudspeaker using a noninert enclosure. Sound, especially the lower frequencies, is radiated from the entire surface of the box, not just the chassis. This seems to accomplish much the same thing as using multiple drivers or dipoles. One of the most convincing loudspeakers I have ever heard is built according to principles having more to do with the making of musical instruments than with orthodox hi-fi loudspeakers.
Another aspect of old loudspeakers is that they tend to have dimension ratios diametrically opposed to those of modern speakers. Modern speakers typically have very narrow fronts, the enclosed space needed for a reasonable bass-driver alignment being found by making speakers tall and deep. By comparison, old loudspeakers tended to be wide but shallow. This has profound consequences for sound dispersion. Once the baffle is narrower than the wavelength of a tone emitted by one of its chassis, the emitted sound is no longer reflected by the baffle and projected by the speaker toward the listener, assuming the listener sits in front of the speakers; instead it will travel around the speaker and radiate to all sides.
Typically, low and middle frequencies are dispersed quite evenly in the room, while high frequencies are projected in a narrow angle. Thus the energy concentration at the listener's point in the room is tipped toward the high frequencies. Many designers compensate for this by introducing a slight clockwise tilt in the speaker's frequency response, a gentle fall from low to high frequencies. The indirect sound, which in nondead listening rooms makes up an important part of the overall gestalt of the sound, the perceived tonal balance, will then be perceived as lacking in high-frequency energy. The speaker sounds dull. To prevent this, there will often be an on-axis rise in the tweeter's top octave. Unfortunately, two wrongs don't make a right. Old loudspeakers, which have wider baffles, project more energy at lower frequencies toward the listener and have a more natural balance between mid and high frequencies without that tilt in the frequency response.
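The baffle-width argument above can be put into rough numbers. The sketch below is only an illustration: it assumes a speed of sound of 343 m/s and takes the rule of thumb stated in the text (sound wraps around the cabinet once its wavelength exceeds the baffle width) at face value. The function names and the example widths are hypothetical and do not describe any particular speaker.

```python
# Rough illustration of the baffle-width rule of thumb described in the text.
# Assumption: speed of sound in room-temperature air is about 343 m/s.
SPEED_OF_SOUND_M_S = 343.0


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in metres of a tone at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def transition_frequency_hz(baffle_width_m: float) -> float:
    """Frequency at which the wavelength equals the baffle width.

    Below this frequency the wavelength is longer than the baffle is wide,
    so, following the text's rule of thumb, the sound spreads around the
    cabinet instead of being projected forward.
    """
    return SPEED_OF_SOUND_M_S / baffle_width_m


def radiates_around_baffle(frequency_hz: float, baffle_width_m: float) -> bool:
    """True if the tone's wavelength exceeds the baffle width."""
    return wavelength_m(frequency_hz) > baffle_width_m


if __name__ == "__main__":
    # Hypothetical widths: a narrow modern front and a wide vintage one.
    for label, width in (("narrow, 0.20 m", 0.20), ("wide, 0.60 m", 0.60)):
        print(f"{label}: forward projection begins near "
              f"{transition_frequency_hz(width):.0f} Hz")
```

On this simplified view, tripling the baffle width moves the transition down by a factor of three, which is consistent with the claim that wide-fronted speakers project more of the midrange toward the listener.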
I think that this factor, tonal balance, is another key aspect in which old gear has an advantage over much modern equipment, and is as important as the low-level dynamics Jean-Marie Piel was talking about. Jean Hiraga, a French journalist whose writings appear mostly in the Nouvelle Revue du Son, has often cited the "Law of 400,000": The product of a loudspeaker's -3dB points should always be 400,000. If a speaker is down 3dB at 20Hz, it should be down 3dB at 20,000Hz; if a speaker is down 3dB at 40Hz, it should be down 3dB at 10,000Hz; and so on. This law is simplistic, because it is applied only to the on-axis response. Ideally, it should be applied to the room-averaged response. Many modern speakers are flat or even tilted up in the final octave, as we have seen above, without an adequate bass fundamental to counterbalance this top-end extension.
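The arithmetic behind the "Law of 400,000" is simple enough to sketch. The snippet below is a minimal illustration of the rule exactly as stated above, nothing more; the function names and the 50Hz example speaker are hypothetical, and the rule itself remains the rough guide the text says it is.

```python
# The "Law of 400,000" as stated in the text: the product of a speaker's
# lower and upper -3dB frequencies (both in Hz) should equal 400,000.
LAW_PRODUCT = 400_000.0


def matching_upper_limit_hz(lower_limit_hz: float) -> float:
    """Upper -3dB point that balances a given lower -3dB point."""
    return LAW_PRODUCT / lower_limit_hz


def balance_ratio(lower_limit_hz: float, upper_limit_hz: float) -> float:
    """Ratio of the actual product to 400,000.

    A value of 1.0 satisfies the rule. A value above 1.0 means more
    top-end extension than the bass extension would call for.
    """
    return (lower_limit_hz * upper_limit_hz) / LAW_PRODUCT


if __name__ == "__main__":
    # The two examples given in the text.
    print(matching_upper_limit_hz(20.0))
    print(matching_upper_limit_hz(40.0))
    # Hypothetical small speaker: -3dB at 50Hz and at 20,000Hz.
    print(balance_ratio(50.0, 20_000.0))
```

A speaker that reaches only 50Hz in the bass yet extends to 20,000Hz gives a ratio of 2.5, which is the kind of treble-heavy imbalance the passage complains about.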
Another aspect of old loudspeakers that seems important to me is the drivers they employ. Old loudspeakers are all about pneumatic coupling. When a loudspeaker chassis' membrane is propelled forward by voltage and/or current applied to the voice-coil, the air in front is pushed away. Depending on membrane size and the length and speed of the excursion, the air in front of the loudspeaker will react more or less willingly to the input (the technical term is acoustic impedance). There is a fairly precise point when the air will more or less fail to be impressed by the driver's stimulus, with an inverse ratio between frequency and loudness on one hand and membrane size on the other hand. (Loudness is a function of the air you move; to achieve a greater loudness level, you have to increase either the surface or the excursion of the membrane.)
Put simply, to reproduce a bass tone loudly, you need a fairly large membrane; for a treble tone, a much smaller surface will suffice (in case you wondered why your tweeter is smaller than your woofer). Above a certain frequency, the air will effectively follow the membrane's movements, vibrating forward and backward. Below that point, the air's inertia is too great to be influenced by the driver---compare the effect of waving your hand with waving a ping-pong bat. There is also a point where excursion cannot be substituted for membrane size, because the air will no longer couple efficiently to the driver.
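The trade-off between membrane surface and excursion can be illustrated with a deliberately simplified model. The sketch below treats the membrane as a flat circular piston and takes "the air you move" to be surface area times excursion; both are assumptions made for illustration, as are the function names and the driver diameters. It ignores the coupling limit mentioned above, beyond which extra excursion no longer substitutes for membrane size.

```python
import math


def piston_area_m2(diameter_m: float) -> float:
    """Surface area of an idealized flat circular membrane."""
    return math.pi * (diameter_m / 2.0) ** 2


def volume_displacement_m3(diameter_m: float, excursion_m: float) -> float:
    """Volume of air moved per stroke: surface area times excursion."""
    return piston_area_m2(diameter_m) * excursion_m


def required_excursion_m(target_displacement_m3: float, diameter_m: float) -> float:
    """Excursion a membrane of this diameter needs to move the target volume."""
    return target_displacement_m3 / piston_area_m2(diameter_m)


if __name__ == "__main__":
    # Hypothetical comparison: a 30 cm driver moving 1 mm versus a 10 cm driver.
    target = volume_displacement_m3(0.30, 0.001)
    small_driver_excursion = required_excursion_m(target, 0.10)
    print(f"10 cm driver needs {small_driver_excursion * 1000:.1f} mm of excursion")
```

Because area grows with the square of the diameter, a driver one third as wide needs nine times the excursion to move the same volume of air, which is one way of seeing why small woofers have to work so much harder.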
This acoustic impedance stuff is one of the reasons why horns were once so popular. A horn can be seen as an acoustic impedance transformer: The air in front of the driver cannot escape to the sides when stimulated by the membrane, but will faithfully follow the stimulus. By gently broadening the canal through which the sound waves travel, these air movements will be imposed on an ever greater amount of air, until you come to the end of the horn. In a certain sense, the air that is present at the horn's outlet can be seen as the effective driving surface of the horn driver, because it is this air that couples to the rest of the room. The larger the surface, the less excursion is needed to play at a certain loudness level; and in speakers, the less excursion, the better.
A large bass driver needs a large cabinet behind it, which makes it impractical for many people. I think it's no coincidence that the small infinite-baffle speaker was invented when stereo became available. One big enclosure, for mono, can be tolerable enough, but two such behemoths are beyond what most people will tolerate in their living rooms. Fine, I say. Just be aware that there is a sonic price you pay for the small woofer.
There's one other component of the hi-fi chain I want to comment on: the phono cartridge, for those of us who still listen to vinyl. Some time ago I reviewed (for a German magazine) the latest iteration of the EMT cartridge, a design that started out in the early '60s. Listening to this cartridge after a spate of newer designs made me realize anew that certain classic designs (whose number includes the Denon DL 103 and the Ortofon SPU series) have an emotional rightness that speaks powerfully to the heart and soul of the listener, even if his head can discern some not-very-subtle deviations from linearity. The EMT has a much more colored sound than many modern cartridges do. Yet it is a heck of a lot more fun to listen to than those modern, oh-so-flat, tread-carefully designs. When was the last time you read that a cartridge could really get down and boogie?
Yes, I'll listen to the future
Please don't think that I'm anti-progress, anti-technology, anti-digital, or whatever. Far from it. I hate the expense and complication I have to go to to obtain good sound---which to me means satisfying sound: the rigors of speaker placement (a surprisingly accurate first approximation for speaker placement is to put them where they do the most visual damage to a room; that's probably where they'll sound their best), cables that positively invite you to trip over them, the seemingly unstoppable proliferation of small or not-so-small electronics boxes, and so on. My ideal hi-fi rig consists of a small and preferably inexpensive appliance that sits quietly and unobtrusively in some corner of the room, but fills the room with sweet music. Now that's what I'd call progress.
I'm also not saying that triodes are the only way to go. I remain unattached to any specific technology. I would like to see more single-ended transistor amplifiers. These should provide quite respectable specs, a low output impedance, a flat amplitude and phase response, and so on. Judging from my experiences with tube designs, I would caution against the use of parallel transistors in the quest for higher power outputs. Anyway, the compromises inherent in this technology tend to show up much more clearly in single-ended topologies than in circuits that split the signal.
Single-ended designs are necessarily class-A, so they'll never be as energy efficient as I'd like my hi-fi to be. It could be argued that it doesn't matter much on a global scale. I don't yet see a Japanese electronics giant bringing out inexpensive single-ended integrateds, so for the foreseeable future this exciting technology will remain the expensive preserve of the dedicated few. But I have to say that I'd be happier if all of humanity could follow my path to audio truth without vaporizing the polar ice caps. This aspect truly troubles me.
I also have great hopes for the Super Audio CD and DVD-Audio formats. The present CD format, after all, was laid down in the late 1970s and relied on technology that was then cost-efficient to manufacture. If you compare a present-day computer to its late-'70s counterpart, the latter appears to be a relic of the Neolithic. The CD standard seems just as antediluvian when compared with the new digital technologies.
A change in direction?
I'm sure that I've raised more questions in readers' minds with this article than I have provided answers. However, I hope to ignite a discussion that may lead to a better understanding of how sound influences emotion, and how equipment that doesn't get in the way of the emotion can be designed. The High End has become too technocratic, too sure of itself, maybe even a little arrogant. In my estimation, we have only scratched the surface of this whole matter of music reproduction in the home. Some humility would give a more accurate perception of our achievements in this worthwhile field. Personally, I'm usually very unhappy when someone tells me what I should and shouldn't enjoy.
In lieu of a conclusion, I offer this observation: There is a paradigm shift underway in the world of music reproduction. For the last 40 years or so, the High End's aim could be summed up in Quad's famous motto: "the closest approach to the original sound." But there is a growing movement afoot that refuses to adhere to this motto, creating its own instead: the closest approach to the original emotion.
Down With Flat!
A tradition is anything we do, think, or believe for no better reason than that we have always done it, thought it, or believed it. Most traditions are followed in this mindless and automatic way, and, if questioned, are defended with the argument that, well, it seems to work. It's time-tested, true-blue and, because so familiar, as comfy as an old slipper. So why rock the boat, throw a wrench in the works, or fix it if it ain't broke?
Although we like to think of audio as high-tech and up-to-the-minute, it, like virtually everything else, is hidebound by rituals, mental sets, and assumptions that have no better basis than simple tradition. One of these is flat frequency response.
The reason flat frequency response has become an audio tradition is that it seems to make so much sense. The obverse of GIGO (footnote 1), the FFR view is that, if you present to the human ear the same frequency response as existed in the concert hall, the result will be a spectral balance that the ears perceive as identical to the original sound. What could be more self-evident? Nothing, except that this traditional approach to audio component design doesn't always work very well.
There are, however, many cases where it does. Microphones, phono cartridges, and audio electronics almost invariably sound pretty much the way their measured frequency responses suggest they would. (At least, they do if that response curve doesn't come packed in the box with a Japanese phono cartridge.)
But loudspeakers? Well, Sir, something is very much amiss in the world of loudspeakers.
Just about every manufacturer of high-end speaker systems brags about how flat their frequency response is because, of course, flatness is considered one of the prime objectives of any audio-component design. Unfortunately, in a loudspeaker, the flat frequency response doesn't seem to work.
Many times in past years I have been impressed by the incredible flatness of the measured high-end response of some speakers: almost like the proverbial straight edge out to 15kHz, and sometimes beyond. In every such case, I have been equally amazed at how positively awful those loudspeakers sounded: so tipped-up at the high end that I could not enjoy listening to them. (They aroused a deep nostalgia for the days when preamps all had tone controls.)
Nor am I the first to have observed that an objectively flat high end sounds tipped up. Ever since acoustical engineers started using equalizers to "voice" recording studios and monitor systems, they have observed this marked disparity between what measures flat at the top and what sounds flat. They were all ultimately reduced to pulling down the whole high end and (Heaven forbid!) adjusting it by ear. No one seems to know why this is so, but the important thing right now is that it is.
That flat/rising high end has become an even greater liability for loudspeakers since Compact Disc came along, because CD players and discs are not subject to the HF and detailing losses we all grew so accustomed to and comfortable with (tradition) from analog discs. If a "flat" speaker needs a 2dB pull-down at 10kHz with analog sources, it usually needs about 4dB with CDs.
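For readers who prefer linear terms, the pull-down figures above can be translated into amplitude ratios. What follows is only a sketch using the standard decibel relation for amplitude (ratio = 10^(dB/20)); the function name is made up for illustration, and the 2dB and 4dB values are the rough figures quoted above, not measurements.

```python
def db_to_amplitude_ratio(db_change: float) -> float:
    """Convert a level change in decibels to a linear amplitude ratio.

    Uses the standard amplitude (voltage or sound-pressure) relation:
    ratio = 10 ** (dB / 20). The function name is hypothetical.
    """
    return 10 ** (db_change / 20)


# Rough pull-downs at 10kHz quoted in the text (illustrative values only).
analog_cut = db_to_amplitude_ratio(-2.0)  # analog sources
cd_cut = db_to_amplitude_ratio(-4.0)      # CD sources

print(f"2dB pull-down: amplitude x {analog_cut:.3f}")
print(f"4dB pull-down: amplitude x {cd_cut:.3f}")
```

Because decibels are logarithmic, doubling the cut from 2dB to 4dB squares the amplitude ratio rather than doubling it: roughly 0.79 of the original amplitude becomes roughly 0.63.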
Similarly, audiophile loudspeakers that measure flat through the lower middle range seem to have a penchant for sounding sucked-out and gutless through that region.
The paragon for lower-middle-range reproduction is the large horn-loaded system of the type used in recording studios and movie theaters, which more often than not sounds exaggerated through this range. But audiophile systems sound deficient here, even though designers' measurements show otherwise. And this region has a most profound effect on the ability of loudspeakers to reproduce the real timbres of real musical instruments. I have complained bitterly about this in countless speaker reports, yet my own response measurements have consistently failed to turn up any pattern of objective suckout through the lower middle range. And equalization to correct the sound has, with equal consistency, introduced a measured rise through the 300Hz to 1kHz range.
Perfectionist loudspeaker design has made tremendous strides in the past ten years toward improved detailing and imaging, and extension of the highest and lowest octaves of frequency range, yet that all-important issue of tonal accuracy has been consistently overlooked. If anything, the middle range, the whole foundation of sonic accuracy, is less felicitously reproduced today than it was 30 years ago. Those old horn-loaded squawkers had an awfully strident and dirty high end, but they reproduced the range from 100Hz to 2kHz with a degree of subjective accuracy not even approached by many of today's most highly esteemed audiophile speakers. It should not have been necessary to exchange an abominable high end for an equally abominable middle range, but that's what we've done. I don't give a damn what the measurements say—most modern speakers just do not reproduce that part of the range properly. If you doubt this, just pay close attention to the sound of the next live, in-the-flesh (brass) trombone you hear.
At the low end, things seem to be reversed from the high end. (That makes an insane kind of sense, somehow!) Invariably, loudspeakers that measure flat in my own listening room sound thin at the low end, while those sounding flat at the bottom measure as having a low-end rise. (The same correlation exists in another room of different size and shape in my house, so it isn't just the main listening room.)
So, what about the sanctity of flat response?
Sometimes the problem is not with the measurement, but the measuring technique. Many loudspeaker manufacturers measure frequency response in an anechoic chamber, which is senseless. Loudspeakers are never listened to in an anechoic chamber. Loudspeakers are listened to in real rooms, and in real rooms their measurements are quite different. Other manufacturers bury loudspeakers flush with the ground outdoors, and measure response that way. Again, this bears little relationship to a real listening situation, as it does not show the influence of the listening room, and neatly suppresses all those little edge-diffraction effects which roughen the response of a free-standing speaker. But not even real-room response measurements assure that speaker systems sound the way they measure, because no two rooms influence speaker response in the same way (unless their dimensions, construction, and furnishings are identical).
What all this means is that there is no justification for viewing flat measured frequency response of loudspeakers as The Word of God. This sacrosanct measurement serves as little more than a crutch for designers who, for whatever reasons, are unwilling to apply critical, subjective judgments to the sound of their own designs.
I'm not advocating, of course, that we entirely abandon the criterion of loudspeaker response flatness. Peaks and dips are still peaks and dips, and they do adversely affect the sound. But when subjective accuracy and objective perfection are as clearly at odds with one another as they are in loudspeaker design, we should reassess our approach to the latter in terms of the former, rather than merely shrugging off our contradictory subjective observations as "irrelevant."
I realize how much more difficult this is going to make the design process, as it is much harder to judge what's right than to measure it. But since we cannot, as of now, measure the ultimate rightness of a loudspeaker's response anyway, subjective evaluation is the only available alternative. Actually, this isn't asking all that much of loudspeaker designers—our equipment reviewers do it all the time. And they will be the ones critiquing that designer's products in Stereophile, and doing it subjectively. It's the only way that makes sense.—J. Gordon Holt