One of my final projects at the Guildhall@SMU was writing a ray tracer for the PlayStation 3. It was one of the most rewarding projects I did at school, as it gave me the chance to explore the Cell architecture and build a system that let me massively parallelize a simple ray tracer.
The Cell processor in the PlayStation 3 contains a main processor (PPU) and several sub-processors (SPUs) that are specialized for vector operations. The processors are connected by a ring data bus that has 16 channels for sending and receiving data via DMA. To load a program onto an SPU, the executable code is sent from the PPU or another SPU to an available processor. Data is then sent to the SPU and the work begins. Communication between processors is performed via a mailbox system that allows simple message exchanges.
For the ray tracer, the PPU initially sets up each available SPU with a basic job system and the ray trace code. Once the SPU starts, it enters a blocking wait that uses no CPU time until a message arrives from the PPU. For each scanline on the screen, the PPU sends a message to any available SPU, which then pulls the list of spheres and lights over via DMA. The SPU cranks through that single scanline of the image, generating one line of pixels for the display. When done, the SPU sends the resulting display data back to the PPU, then follows up with another message saying that it’s waiting for another task.
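The dispatch loop described above can be sketched on an ordinary desktop machine. This is only an illustration under stated assumptions, not the original PS3 code: `std::thread` workers stand in for SPUs, a hypothetical `Mailbox` class (a blocking queue) stands in for the hardware mailboxes, shared memory replaces the DMA transfers, and `traceScanline` is a placeholder that writes a test pattern instead of tracing rays.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for a hardware mailbox: a blocking queue of small integer messages.
class Mailbox {
public:
    void send(int msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(msg);
        }
        ready_.notify_one();
    }

    // Sleeps (no busy-waiting) until a message is available, then returns it.
    int receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        int msg = queue_.front();
        queue_.pop();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> queue_;
};

constexpr int kQuit = -1;  // assumed "shut down" message, not part of the original protocol

// Placeholder for the real per-scanline ray trace: writes each pixel's linear index.
void traceScanline(std::vector<std::uint32_t>& fb, int width, int y) {
    for (int x = 0; x < width; ++x)
        fb[static_cast<std::size_t>(y) * width + x] = static_cast<std::uint32_t>(y * width + x);
}

// Dispatcher (the PPU's role): hands one scanline at a time to whichever worker is idle.
std::vector<std::uint32_t> renderFrame(int width, int height, int numWorkers) {
    assert(width > 0 && height > 0 && numWorkers > 0);
    std::vector<std::uint32_t> framebuffer(static_cast<std::size_t>(width) * height, 0);
    Mailbox idle;  // workers -> dispatcher: "worker N is waiting for a task"
    std::vector<std::unique_ptr<Mailbox>> inbox;  // dispatcher -> worker N: scanline index
    std::vector<std::thread> workers;

    for (int w = 0; w < numWorkers; ++w)
        inbox.push_back(std::make_unique<Mailbox>());

    for (int w = 0; w < numWorkers; ++w) {
        workers.emplace_back([&framebuffer, &idle, &inbox, width, w] {
            idle.send(w);  // announce availability
            for (;;) {
                int y = inbox[w]->receive();  // blocking wait for a task
                if (y == kQuit) break;
                traceScanline(framebuffer, width, y);
                idle.send(w);  // report completion and ask for more work
            }
        });
    }

    for (int y = 0; y < height; ++y)
        inbox[idle.receive()]->send(y);  // next scanline goes to any idle worker
    for (int w = 0; w < numWorkers; ++w)
        idle.receive();  // wait until every worker is idle again
    for (int w = 0; w < numWorkers; ++w)
        inbox[w]->send(kQuit);
    for (auto& t : workers) t.join();
    return framebuffer;
}
```

Calling `renderFrame(1152, 768, 6)` would issue 768 scanline jobs across six workers. Because each job writes a distinct row, the workers need no locking on the framebuffer itself; the mailboxes are the only synchronization points.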
I went through several iterations of how to queue work on the SPUs, from traditional threading to a simple messaging protocol that enqueues and dequeues tasks on a waiting SPU.
The end result is a ray tracer that can render a scene at 1152×768 with 10 spheres, 10 levels of reflection, and 3 dynamic lights with occlusion shadows, at about 30 frames per second.
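For readers unfamiliar with the building blocks named here, the following is a minimal sketch of the textbook versions of a ray-sphere intersection, an occlusion shadow test, and a mirror reflection. It is not the project's code: the `Vec3` and `Sphere` types, the function names, and the `kEpsilon` offset are all assumptions made for illustration, and it uses scalar math rather than the SPU's vector instructions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec3 center; float radius; };

// Assumed small offset that keeps a ray from re-hitting the surface it starts on.
constexpr float kEpsilon = 1e-4f;

// Finds the nearest hit in front of the ray origin. `dir` must be unit length.
// Returns true and writes the distance to `tOut` on a hit, false on a miss.
bool intersect(Vec3 origin, Vec3 dir, const Sphere& s, float& tOut) {
    Vec3 oc = origin - s.center;
    float b = dot(oc, dir);
    float c = dot(oc, oc) - s.radius * s.radius;
    float disc = b * b - c;  // discriminant of t^2 + 2bt + c = 0
    if (disc < 0.0f) return false;
    float root = std::sqrt(disc);
    float t = -b - root;               // nearer root first
    if (t < kEpsilon) t = -b + root;   // origin is inside the sphere
    if (t < kEpsilon) return false;    // sphere is entirely behind the ray
    tOut = t;
    return true;
}

// Occlusion shadow test: true when any sphere lies between `point` and `light`.
bool inShadow(Vec3 point, Vec3 light, const std::vector<Sphere>& spheres) {
    Vec3 toLight = light - point;
    float dist = std::sqrt(dot(toLight, toLight));
    assert(dist > 0.0f);
    Vec3 dir = toLight * (1.0f / dist);
    for (const Sphere& s : spheres) {
        float t = 0.0f;
        if (intersect(point, dir, s, t) && t < dist) return true;
    }
    return false;
}

// Mirror reflection of incoming direction `d` about unit normal `n`.
Vec3 reflect(Vec3 d, Vec3 n) { return d - n * (2.0f * dot(d, n)); }
```

In a tracer of this shape, each reflection level would call `reflect` at the hit point and trace again until the depth limit is reached, and `inShadow` would run once per light per hit.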
I used the knowledge that I gained on this task to create a much more efficient task management system that I still use today.