12.4 Time Alignment — Theory and Practice
🔰 BEGINNER LEVEL: Why Time Alignment Matters
The Precedence Effect
Human hearing localizes sound sources largely by detecting which ear receives a sound first (interaural time difference, or ITD). Arrival time is a primary cue for direction, alongside the level difference between the ears.
This is the precedence effect: if two identical sounds arrive at slightly different times, the brain attributes the source to the first arrival. The second sound is perceived as coming from the same direction as the first, as long as it arrives within about 30ms (after 30ms it becomes a separate echo).
In a car: You sit off-center, so the speakers are at very different distances from you. Your left tweeter might be 24 inches from your head, while your right tweeter might be 62 inches away.
Without time alignment, sound from the left tweeter always arrives first, so everything images left. With time alignment, we delay the closer speaker so both tweeters arrive simultaneously.
Illustration note: Top-down view showing speaker-to-ear distances and delay correction concept
Measuring Distances
Measure from the dust cap of each speaker driver to your ear position (the center of the driver's-seat headrest):
- Use a flexible tape measure or string
- Hold one end at the driver's dust cap
- Route the tape to your ear position (approximate center of where your ear would be while seated normally)
- Record the measurement in inches
Illustration note: Distance measurement diagram for full 3-way system with delay calculation table
Reference driver: The driver with the longest distance from your ear becomes the reference — it receives 0ms delay. All closer drivers receive delays to match.
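Delay_n (ms) = (D_ref − D_n) / 13,500 in/s × 1000 ms/s
Where D_ref and D_n are distances in inches, and 13,500 in/s is the speed of sound (343 m/s, rounded).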
Example:
| Driver | Distance (in) | Delay (ms) |
|---|---|---|
| Right tweeter | 62 in (reference) | 0.00 ms |
| Left tweeter | 24 in | (62−24)/13,500 × 1000 = 2.81 ms |
| Right midbass | 56 in | (62−56)/13,500 × 1000 = 0.44 ms |
| Left midbass | 28 in | (62−28)/13,500 × 1000 = 2.52 ms |
| Subwoofer | 58 in | (62−58)/13,500 × 1000 = 0.30 ms |
These calculated values are the starting point. Fine-tune by ear or with acoustic measurement.
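The delay formula and table above can be reproduced with a few lines of Python. This is a minimal sketch, not a tool from any DSP vendor: the function name `delays_from_distances` and the dictionary keys are illustrative assumptions, and the speed of sound is the rounded 13,500 in/s used in the formula.

```python
# Rounded speed of sound in inches per second, as used in the formula above.
SPEED_OF_SOUND_IN_PER_S = 13_500


def delays_from_distances(distances_in):
    """Return the delay in ms for each driver.

    The farthest driver is the reference and gets 0 ms; every closer
    driver is delayed by the travel time of its missing path length.
    """
    reference = max(distances_in.values())
    return {
        name: (reference - dist) / SPEED_OF_SOUND_IN_PER_S * 1000
        for name, dist in distances_in.items()
    }


# Distances in inches, copied from the example table (names are illustrative).
distances = {
    "right_tweeter": 62,
    "left_tweeter": 24,
    "right_midbass": 56,
    "left_midbass": 28,
    "subwoofer": 58,
}

delays = delays_from_distances(distances)
for name, delay_ms in delays.items():
    print(f"{name}: {delay_ms:.2f} ms")
```

Because the reference is chosen with `max`, the function never produces a negative delay, whichever driver turns out to be farthest.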
🔧 INSTALLER LEVEL: Measurement-Based Alignment
Using Impulse Response Measurements
Calculated delays from tape measurements are a first approximation. Acoustic path differences (reflections, diffraction around seats) mean the actual arrival times may differ from physical distances alone.
REW impulse response procedure:
- Position microphone at listener's ear location (clamped to headrest)
- Enable one driver at a time in the DSP
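- Run a REW sweep measurement with a timing reference enabled (loopback or acoustic), so arrival times are comparable between drivers; REW derives the impulse response from the sweep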
- Identify the time of the first arrival peak in the impulse response
- Record the arrival time for each driver
- Set DSP delays so all drivers have the same arrival time
Reading the impulse response:
The impulse response shows amplitude vs time. The first significant peak (before any reflections) is the direct arrival. Its position on the time axis, relative to time zero, gives the travel time.
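If the impulse response is exported as a list of samples, the first-arrival reading described above can be approximated in code. The sketch below is one simple approach under stated assumptions, not REW's own algorithm: it takes the first local peak that reaches a chosen fraction of the largest magnitude. The function name, the 0.5 threshold, and the synthetic sample data are all hypothetical.

```python
def first_arrival_ms(impulse, sample_rate_hz, threshold=0.5):
    """Return the time in ms of the first significant peak.

    A sample counts as significant once its magnitude reaches
    `threshold` times the largest magnitude in the response.
    From there we climb to the top of that peak.
    """
    magnitudes = [abs(x) for x in impulse]
    largest = max(magnitudes)
    if largest == 0:
        raise ValueError("impulse response contains no signal")

    limit = threshold * largest
    # First sample that crosses the threshold (rising edge of the direct sound).
    index = next(i for i, m in enumerate(magnitudes) if m >= limit)
    # Climb while the next sample is larger, to land on the peak itself.
    while index + 1 < len(magnitudes) and magnitudes[index + 1] > magnitudes[index]:
        index += 1
    return index / sample_rate_hz * 1000


# Synthetic 10 ms response at 48 kHz (made-up data for illustration only).
sample_rate = 48_000
ir = [0.0] * 480
ir[100] = 0.6    # rising edge of the direct arrival
ir[101] = 1.0    # direct arrival peak
ir[102] = -0.4   # decay
ir[180] = 0.6    # later reflection, ignored because the direct peak comes first

arrival = first_arrival_ms(ir, sample_rate)
print(f"first arrival: {arrival:.3f} ms")
```

The threshold is a judgment call: set too low, noise ahead of the direct sound can trigger it; set too high, a strong reflection can be mistaken for the direct arrival when the direct sound is weak. Checking the result against the plotted response is still advisable.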
At 343 m/s (20°C):
Distance = time × 343 m/s
If left tweeter peaks at 2.1ms and right tweeter at 4.3ms:
Distance_L = 0.0021 × 343 = 0.72m = 28.3 in
Distance_R = 0.0043 × 343 = 1.47m = 58.1 in
Required delay for left tweeter = 4.3ms − 2.1ms = 2.2ms
This measured 2.2ms may differ from your tape-measured estimate due to acoustic path effects — use the measured value.
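The same arithmetic extends to any number of drivers: the latest arrival becomes the 0 ms reference and every other driver is delayed by its gap to that reference. A minimal sketch, assuming arrival times have already been read off the impulse responses; the function names and dictionary keys are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # at 20 °C, as stated above


def delays_from_arrivals(arrivals_ms):
    """Delay each driver so it lines up with the latest-arriving driver."""
    latest = max(arrivals_ms.values())
    return {name: latest - t for name, t in arrivals_ms.items()}


def path_length_m(arrival_ms):
    """Convert a travel time in ms to an acoustic path length in metres."""
    return arrival_ms / 1000 * SPEED_OF_SOUND_M_PER_S


# Measured arrival times from the example above.
arrivals = {"left_tweeter": 2.1, "right_tweeter": 4.3}

dsp_delays = delays_from_arrivals(arrivals)
for name, t in arrivals.items():
    print(f"{name}: path {path_length_m(t):.2f} m, delay {dsp_delays[name]:.2f} ms")
```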
Fine-Tuning by Ear
After measurement-based alignment, fine-tune by listening:
Test track: Solo voice recording, well-centered. Diana Krall, Johnny Cash, or any recording where the vocalist is clearly intended to be centered.
Method: Play and listen. If the image pulls left, the left channel is arriving early: the left has too little delay (or the right too much). Add 0.05 ms of delay to the left channel and listen again. Repeat until the vocalist appears exactly centered.
Note: This fine-tuning is for the listening position only. Passengers may experience different imaging — the car is fundamentally asymmetric and perfect imaging at all seats is not achievable with standard stereo time alignment.
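The listening loop in the method above amounts to a small bookkeeping rule, sketched here so the direction of each adjustment is unambiguous. The function name `nudge` and the `"left"`/`"right"` keys are hypothetical, and treating a rightward pull symmetrically (adding delay to the right channel) is an assumption; the text only spells out the leftward case.

```python
def nudge(delays_ms, image_pull, step_ms=0.05):
    """Return updated delays after one listening pass.

    image_pull is "left", "right", or "center". The side the image pulls
    toward is arriving early, so that side receives more delay.
    The rightward case mirrors the leftward one (an assumption).
    """
    if image_pull not in ("left", "right", "center"):
        raise ValueError("image_pull must be 'left', 'right', or 'center'")

    adjusted = dict(delays_ms)  # leave the caller's settings untouched
    if image_pull != "center":
        adjusted[image_pull] = round(adjusted[image_pull] + step_ms, 3)
    return adjusted


# Start from the measured alignment, then apply two listening passes.
settings = {"left": 2.2, "right": 0.0}
settings = nudge(settings, "left")    # vocalist still left of center
settings = nudge(settings, "center")  # vocalist centered, no change
print(settings)
```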
⚙️ ENGINEER LEVEL: Psychoacoustic Basis and Binaural Processing
Interaural Time Difference (ITD) and Level Difference (ILD)
Spatial perception uses two complementary cues:
ITD (Interaural Time Difference):
ITD_max = d / c = 0.215m / 343 m/s ≈ 0.63 ms
Where d = head diameter (approx. 0.215 m). This is the maximum ITD, which occurs for a sound arriving directly from one side.
Just-noticeable difference for ITD: 10–20 μs (microseconds).
This means time alignment should be accurate to about 0.01 ms (10 μs) for the finest spatial discrimination. A DSP with 0.01 ms delay steps (common in quality units) can land within 0.005 ms of any target value, which provides adequate precision.
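To see how delay resolution compares with the ITD just-noticeable difference, the rounding a DSP applies can be modeled directly. This is a sketch under assumptions: it presumes the DSP rounds to the nearest step, which not every unit does, and the function names are illustrative. The threshold uses the lower end of the 10–20 μs range quoted above.

```python
ITD_JND_MS = 0.010  # 10 microseconds, lower end of the range quoted above


def quantize_delay(target_ms, step_ms=0.01):
    """Round a target delay to the nearest step the DSP can store (assumed)."""
    return round(target_ms / step_ms) * step_ms


def residual_error_ms(target_ms, step_ms=0.01):
    """Timing error left over after rounding to the DSP step."""
    return abs(quantize_delay(target_ms, step_ms) - target_ms)


def worst_case_error_ms(step_ms):
    """Largest possible rounding error: half of one step."""
    return step_ms / 2


target = 2.8148  # ms, an unrounded calculated delay used as an example
print(f"stored: {quantize_delay(target):.2f} ms")
print(f"residual: {residual_error_ms(target) * 1000:.1f} us")
print(f"worst case at 0.01 ms steps: {worst_case_error_ms(0.01) * 1000:.0f} us")
```

With 0.01 ms steps the worst-case rounding error is 5 μs, below the just-noticeable difference. The same function shows that a coarser 0.02 ms step has a worst case of 10 μs, right at the lower end of the range.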
ILD (Interaural Level Difference):
At high frequencies (> 1.5 kHz), the head creates an acoustic shadow: the ear facing away from the source receives less energy. This level difference provides the directional cue above 1.5 kHz.
Below 1.5 kHz: ITD dominant. Above 1.5 kHz: ILD dominant. Around the 1.5 kHz crossover: both contribute.
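Implication: The critical frequency region for imaging precision is the midrange and tweeter range. Below about 1.5 kHz, the midrange arrival time sets the ITD cue directly; above it, the tweeters' level balance and the onset timing of transients carry the directional cue. Subwoofer time alignment affects integration with mid-bass but not phantom imaging. Tweeter and midrange time alignment is what produces the phantom center image.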
Head-Related Transfer Function (HRTF):
The ear's pinna (outer ear shape) causes frequency-dependent filtering that varies with sound source angle and elevation. This HRTF provides elevation cues and front/rear discrimination.
In car audio, HRTFs are modified by the reflective interior. Seat backs, headrests, and glass create reflections that partially simulate HRTF elevation cues, which is why careful positioning of tweeters (angled, elevated) can create a convincingly elevated soundstage without explicit elevation EQ.