In theory, a program running on a 64-core machine should execute 64 times as quickly as it would on a single-core machine. But, according to a team from MIT, it rarely works out that way. Most computer programs are sequential, and splitting them up so that chunks can run in parallel can introduce complications.
Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory have created a chip called Swarm, said to make parallel programs more efficient and easier to write.
In simulations, the researchers compared Swarm versions of six common algorithms with the best existing parallel versions, each of which had been individually engineered. The Swarm versions were found to be up to 18 times faster, yet generally required only 10% of the code. Swarm also achieved a 75-fold speed-up on one program that computer scientists had previously failed to parallelise.
“Multicore systems are really hard to program,” says Daniel Sanchez, an MIT assistant professor who led the project. “You have to explicitly divide the work that you’re doing into tasks, then you need to enforce some synchronisation between tasks accessing shared data. What this architecture does, essentially, is to remove all sorts of explicit synchronisation to make parallel programming easier.
“There’s an especially hard set of applications that have resisted parallelisation for many years and those are the kinds of applications we’ve focused on.”
Swarm is said to differ from other multicore chips in that it has extra circuitry for handling prioritisation. It timestamps tasks according to their priorities and begins working on the highest-priority tasks in parallel. Higher-priority tasks may create lower-priority tasks of their own, and Swarm slots those into its task queue automatically. It also ensures that higher-priority tasks run first and maintains synchronisation between cores accessing the same data.
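The queueing behaviour described above can be imitated in software. What follows is a minimal, single-threaded sketch, not Swarm's hardware design: the names `TaskQueue`, `visit` and `shortest_distances` are hypothetical, and it assumes that a lower timestamp means a higher priority. Each task may enqueue child tasks with later timestamps, and the queue always runs the earliest one next. A shortest-path search is used purely as an illustration of an ordered workload.

```python
import heapq
from itertools import count


class TaskQueue:
    """Software stand-in for a timestamp-ordered task queue (hypothetical)."""

    def __init__(self):
        self._heap = []
        self._tie = count()  # tie-breaker: equal timestamps run in insertion order

    def enqueue(self, timestamp, fn, *args):
        # Assumption: a lower timestamp means a higher priority.
        heapq.heappush(self._heap, (timestamp, next(self._tie), fn, args))

    def run(self):
        # Always run the earliest-timestamped task; tasks may enqueue children.
        while self._heap:
            timestamp, _, fn, args = heapq.heappop(self._heap)
            fn(self, timestamp, *args)


def visit(queue, timestamp, node, graph, dist):
    """Task body: record the distance to `node`, then spawn child tasks."""
    if node in dist:  # an earlier task already settled this node
        return
    dist[node] = timestamp
    for neighbour, cost in graph[node]:
        # Children carry later timestamps, so they are slotted in behind
        # any earlier work still waiting in the queue.
        queue.enqueue(timestamp + cost, visit, neighbour, graph, dist)


def shortest_distances(graph, source):
    queue = TaskQueue()
    dist = {}
    queue.enqueue(0, visit, source, graph, dist)
    queue.run()
    return dist


graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(shortest_distances(graph, "a"))
```

The sketch runs one task at a time; according to the article, Swarm's contribution is to run many such tasks in parallel while preserving the result that timestamp order would give.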
The hard work is said to be done by the chip itself, which Sanchez designed in collaboration with MIT graduates Mark Jeffrey and Suvinay Subramanian, University of Washington PhD student Cong Yan and MIT Professor Joel Emer, a senior distinguished research scientist with Nvidia.
Apart from extra circuitry to store and manage the task queue, Swarm uses a Bloom filter to record the memory addresses of all the data its cores are currently working on. This helps Swarm to identify memory access conflicts, and the team showed that timestamping makes synchronisation between cores easier to enforce.
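For readers unfamiliar with Bloom filters, the sketch below shows the general idea in software. It is an illustration only: the class name, the bit-array size, the number of hash functions and the use of SHA-256 are all assumptions, and the article does not describe how Swarm's hardware filter is built. A Bloom filter can report that an address was possibly recorded or definitely not recorded, which makes it a compact way to flag potential conflicts.

```python
import hashlib


class BloomFilter:
    """Compact set of addresses with possible false positives (illustrative)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # assumed size, chosen for the example
        self.num_hashes = num_hashes  # assumed number of hash functions
        self.bits = 0                 # a Python int used as a bit array

    def _positions(self, address):
        # Derive several bit positions per address by salting one hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, address):
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def might_contain(self, address):
        # False means "definitely not recorded"; True means "possibly recorded".
        return all((self.bits >> pos) & 1 for pos in self._positions(address))


in_use = BloomFilter()
in_use.add(0x1000)  # a core starts working on the data at this address
print(in_use.might_contain(0x1000))  # a second access here may conflict
```

Because a filter can return a false positive but never a false negative, a check of this kind may flag a conflict that is not real, but it will not miss one that is.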
Finally, all the cores occasionally report the timestamps of the highest-priority tasks they are still executing. If a core has finished tasks that have earlier timestamps than those reported by its fellows, it knows it can write its results to memory without creating conflicts.
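The commit rule in the last paragraph can be expressed as a short check. This is a sketch under stated assumptions rather than Swarm's actual protocol: the function name `committable` and the data layout are hypothetical, a lower timestamp is taken to mean a higher priority, and every core's report (including the core's own) is folded into one minimum.

```python
def committable(finished, in_flight_by_core):
    """Return the finished-task timestamps that are safe to write to memory.

    finished: timestamps of tasks a core has completed but not yet committed.
    in_flight_by_core: maps each core to the timestamps it is still executing.
    """
    # Each core reports the earliest timestamp it is still working on.
    reported = [min(stamps) for stamps in in_flight_by_core.values() if stamps]
    if not reported:
        # Nothing is still executing, so nothing earlier can conflict.
        return sorted(finished)
    frontier = min(reported)
    # Only tasks earlier than every reported timestamp can safely commit.
    return sorted(t for t in finished if t < frontier)


in_flight = {"core0": [7, 12], "core1": [8], "core2": []}
print(committable([3, 5, 9], in_flight))  # prints [3, 5]
```

In this example the task with timestamp 9 has to wait, because tasks with timestamps 7 and 8 are still running and might yet touch the same data.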
Author
Graham Pitcher
Source: www.newelectronics.co.uk