Monday, May 27, 2013

Transform and Roll Out

I've spent most of the last week deep in the Fast Fourier Transform and WebGL Shaders, two things I have not had that much experience with before. Oh sure, I knew the theory, but there's a big difference between that and being able to write actual code.


This paper was extremely useful. My implementation really only added one line.


It didn't help that I set myself a pretty hard challenge on this one. Implement a famously brain-bending algorithm that I didn't entirely understand in a language/environment that I'd never used before. And I have to admit, this time yesterday morning I was staring at frustration at corrupted transforms with no idea how to fix them.

But, when in doubt, work it out by hand. I knew I was close. Exactly half a pixel off, in fact. That's pretty close.

Turns out the gl_FragCoord  coordinates passed to the Fragment Shader are not integers. They are half-integers, which align with the pixel's center. So they run [0.5, 1.5, 2.5...]. (It seems DirectX, which the original paper targeted, doesn't do this.)

Because I like to show the code, here's the WebGl fragment shader which performs a radix-2 Stockham FFT pass on a texture.


<script id="fft-rows" type="x-fragment-shader">
precision mediump int;
precision highp float;
precision highp sampler2D;

uniform sampler2D image; // source texture
uniform float N; // FFT Size (eg. 256)
uniform float Ns; // FFT Pass Size (eg. 128)

// perform a dual-fft (operate on the texture XY as one complex number, and ZW independantly)
// pass for a muti-pass transform of a size declared in the uniforms
vec4 dualfft_rows(vec2 p) {
p.x = floor(p.x);
float base = floor(p.x / Ns) * (Ns/2.0);
float offset = mod(p.x, Ns/2.0);
float iy  = p.y / N;
float ix0 = (base + offset) / N;
float ix1 = ix0 + 0.5;
vec4 v0 = texture2D(image, vec2(ix0, iy) );
vec4 v1 = texture2D(image, vec2(ix1, iy) );
float angle = radians(-360.0*p.x/Ns);
vec2 t = vec2( cos(angle), sin(angle) );
// transform two complex number planes at once
return v0 + vec4(t.x*v1.x - t.y*v1.y,
t.y*v1.x + t.x*v1.y,
t.x*v1.z - t.y*v1.w,
t.y*v1.z + t.x*v1.w);
}

void main() {
gl_FragColor = dualfft_rows( gl_FragCoord.xy );
}
</script>

It computes the twiddle factors explicitly instead of looking them up in a texture (as other variants do, since modern cards probably perform trig faster than texture accesses) and performs two independent FFTs on the xy and zw pairs per pass. Floating Point textures are essential for this.

Of course there's another 1500 lines of code wrapped around this little fragment shader, but a lot of that is 'test' scaffolding that can come down now. However there seems to be a good 700 lines of code that I've packaged into a 'GPU' class that acts as a friendly wrapper around WebGL, and lets me 'compute' with textures like I'm doing simple algebra.

The main point is this: two weeks ago, I had favorable initial impressions of WebGL. There were some pretty demos. It worked great as a toy. But I had some reservations about how far I could take it. Since then, I've implemented one of the most important and powerful scientific analysis tools there is - the Fourier transform. And it flies.

Today's challenge will be to combine the working FFT code with some other experimental code, and jump from images to transforming video. I've already got the code to access the webcam from inside the browser. HTML5 makes all this integration not only possible, but almost trivial.

Lastly, this is a huge load off my mind. I'd been proceeding on the assumption this was possible, (that I can actually write a working FFT in GLSL) so I'm gratified that has proved correct. I spend last weekend standing in a very, very cold paddock capturing astronomical imagery on that assumption. Now I can process that data.

Beyond that, I've now got new tools in my bag of tricks. And a new respect for browsers.

No comments:

Post a Comment