The new jidctred.c code
The old jidctred.c code
    / cos(pi/4)  cos(pi/4) \     / 1/sqrt(2)  1/sqrt(2) \                / 1  1 \
    |                      |  =  |                      |  =  1/sqrt(2)  |      |
    \ cos(pi/4) -cos(pi/4) /     \ 1/sqrt(2) -1/sqrt(2) /                \ 1 -1 /

This leads to a simple add/sub (sum/diff) calculation scheme. The scalar multiplication is deferred to the descaling operation at the end of the calculation. After column and row application the factor is 1/sqrt(2) * 1/sqrt(2) = 1/2. We also have to account for the different size DCT with a factor of 1/4, so the final descaling factor is 1/2 * 1/4 = 1/8 (right shift 3).
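The sum/diff scheme with the deferred 1/8 descale can be sketched in C as follows. This is a minimal illustration, not the code from jidctred.c: the function name `idct_2x2_sketch` is hypothetical, the inputs are assumed to be already dequantized, the added 4 before the shift is an assumed round-to-nearest term, and the level shift and range limiting that a real decoder needs are left out.

```c
/* Hypothetical helper: 2x2 reduced-size inverse DCT on the four
 * lowest-frequency (already dequantized) coefficients of an 8x8 block.
 * c[row][col] holds the coefficients, out[row][col] the result.
 * Assumption: >> on negative ints is an arithmetic shift, which is
 * implementation-defined in C but true on common compilers. */
static void idct_2x2_sketch(int c[2][2], int out[2][2])
{
    /* Column pass: plain sum/difference, no multiplication. */
    int s0 = c[0][0] + c[1][0];
    int d0 = c[0][0] - c[1][0];
    int s1 = c[0][1] + c[1][1];
    int d1 = c[0][1] - c[1][1];

    /* Row pass plus the deferred 1/8 descale (right shift 3).
     * Adding 4 first rounds to nearest; this is an assumed choice. */
    out[0][0] = (s0 + s1 + 4) >> 3;
    out[0][1] = (s0 - s1 + 4) >> 3;
    out[1][0] = (d0 + d1 + 4) >> 3;
    out[1][1] = (d0 - d1 + 4) >> 3;
}
```

With only the DC coefficient set to 800, all four outputs come out as 100, which matches the 1/8 factor derived above.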
              / C4  C4  C4  C4 \         /  1   1   1   1 \
              |                |         |                |
              | C2  C6 -C6 -C2 |         | c2  c6 -c6 -c2 |
    1/sqrt(2) |                |  =  1/2 |                |
              | C4 -C4 -C4  C4 |         |  1  -1  -1   1 |
              |                |         |                |
              \ C6 -C2  C2 -C6 /         \ c6 -c2  c2 -c6 /

    where Ck = cos(k*pi/16), ck = sqrt(2) * Ck (note C4 = 1/sqrt(2))

We use the indexing from the 8x8 point DCT for comparison. The IDCT is the transpose of the DCT, hence
        / 1  c2  1  c6 \
        |              |
        | 1  c6 -1 -c2 |
    1/2 |              |
        | 1 -c6 -1  c2 |
        |              |
        \ 1 -c2  1 -c6 /

The scalar multiplication is deferred to the descaling operation at the end of the calculation. After column and row application the factor is 1/2 * 1/2 = 1/4. We also have to account for the different size DCT with a factor of 1/2, so the final descaling factor is 1/4 * 1/2 = 1/8 (right shift 3).
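To make the 1/8 descale concrete, here is a small floating-point reference that applies the unscaled kernel to columns and then rows and divides by 8 at the end. It is a sketch for checking the factors only: the name `idct_4x4_reference` is hypothetical, the straightforward matrix products are not how an optimized routine would be written, and c2 is derived from the constants 1.847759065 - 0.541196100.

```c
/* Hypothetical reference routine (not taken from jidctred.c):
 * out = (K * coef * K^T) / 8, where K is the unscaled 4-point IDCT
 * kernel and the per-pass factor 1/2 is deferred to the end. */
static void idct_4x4_reference(double coef[4][4], double out[4][4])
{
    const double c2 = 1.306562965;   /* sqrt(2) * cos(2*pi/16) */
    const double c6 = 0.541196100;   /* sqrt(2) * cos(6*pi/16) */
    const double K[4][4] = {
        { 1.0,  c2,  1.0,  c6 },
        { 1.0,  c6, -1.0, -c2 },
        { 1.0, -c6, -1.0,  c2 },
        { 1.0, -c2,  1.0, -c6 },
    };
    double tmp[4][4];
    int i, j, k;

    /* Column pass: tmp = K * coef, each column transformed on its own. */
    for (i = 0; i < 4; i++) {
        for (j = 0; j < 4; j++) {
            tmp[i][j] = 0.0;
            for (k = 0; k < 4; k++)
                tmp[i][j] += K[i][k] * coef[k][j];
        }
    }

    /* Row pass: out = tmp * K^T, followed by the deferred descale
     * 1/2 * 1/2 (two passes) * 1/2 (size change) = 1/8. */
    for (i = 0; i < 4; i++) {
        for (j = 0; j < 4; j++) {
            double sum = 0.0;
            for (k = 0; k < 4; k++)
                sum += tmp[i][k] * K[j][k];
            out[i][j] = sum / 8.0;
        }
    }
}
```

Under the usual JPEG normalization a flat 8x8 block of value v has the DC coefficient 8 * v (ignoring the level shift), so feeding a DC-only block with 800 into this routine should give 100 at every output position.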
Now let (x0 x1 x2 x3) be our input vector, (y0 y1 y2 y3) our output. Then we have
    / y0 \     / 1  c2  1  c6 \   / x0 \     / x0 + c2 * x1 + x2 + c6 * x3 \
    |    |     |              |   |    |     |                             |
    | y1 |     | 1  c6 -1 -c2 |   | x1 |     | x0 + c6 * x1 - x2 - c2 * x3 |
    |    |  =  |              | * |    |  =  |                             |
    | y2 |     | 1 -c6 -1  c2 |   | x2 |     | x0 - c6 * x1 - x2 + c2 * x3 |
    |    |     |              |   |    |     |                             |
    \ y3 /     \ 1 -c2  1 -c6 /   \ x3 /     \ x0 - c2 * x1 + x2 - c6 * x3 /

You can see that the even part (x0 x2) does not contain any multiplication.
    t10 = x0 + x2                 t12 = x0 - x2                 (even part)
    t0  = c2 * x1 + c6 * x3       t2  = c6 * x1 - c2 * x3       (odd part)

Then we have
    y0 = t10 + t0       y3 = t10 - t0
    y1 = t12 + t2       y2 = t12 - t2

This would require 4 multiplications. We can save 1 multiplication by simple algebraic transformation:
    t0 = (c2 - c6) * x1 + c6 * (x1 + x3)
    t2 = c6 * (x1 + x3) - (c2 + c6) * x3

This requires 3 multiplications with
    c6      = sqrt(2) * cos(6*pi/16) = 0.541196100    [c2 = sqrt(2) * cos(2*pi/16) not needed]
    c2 + c6 = 1.847759065
    c2 - c6 = 0.765366865

This is the same 'rotation' as in the even part of the 8x8 point LL&M IDCT algorithm.
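The three-multiplication form of the 1-D pass could look roughly like this in integer arithmetic. Everything beyond the formulas above is an assumption made for illustration: the function name `idct4_1d_3mul` is hypothetical, and the 13-bit fixed-point scaling with a `FIX` macro is one plausible way to represent the constants, not necessarily what a given jidctred.c version does.

```c
#include <stdint.h>

/* Assumed fixed-point format: constants are scaled by 2^13. */
#define CONST_BITS 13
#define FIX(x) ((int32_t)((x) * (1 << CONST_BITS) + 0.5))

/* Hypothetical 1-D 4-point IDCT pass using 3 multiplications.
 * x[] holds dequantized coefficients; y[] receives the results,
 * still scaled by 2^CONST_BITS (the final descale is left out). */
static void idct4_1d_3mul(const int32_t x[4], int32_t y[4])
{
    /* Even part: no multiplication by a cosine term; the products
     * below only bring the values into the same fixed-point scale
     * as the odd part. */
    int32_t t10 = (x[0] + x[2]) * (1 << CONST_BITS);
    int32_t t12 = (x[0] - x[2]) * (1 << CONST_BITS);

    /* Odd part: the shared product c6 * (x1 + x3) is computed once. */
    int32_t z  = (x[1] + x[3]) * FIX(0.541196100);   /* c6 * (x1 + x3) */
    int32_t t0 = z + x[1] * FIX(0.765366865);        /* + (c2 - c6) * x1 */
    int32_t t2 = z - x[3] * FIX(1.847759065);        /* - (c2 + c6) * x3 */

    y[0] = t10 + t0;
    y[3] = t10 - t0;
    y[1] = t12 + t2;
    y[2] = t12 - t2;
}
```

The outputs remain scaled by 2^CONST_BITS; a complete 2-D routine would presumably fold that scale into its final descale together with the 1/8 factor. Inputs are assumed small enough that the 32-bit products do not overflow.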