The research led by the Sun HPCS group looked to programming languages for a way to address the software engineering problems of HPC software.
According to that research, HPC codes shrank considerably (tenfold or so) and became more readable, verifiable and maintainable when they were rewritten with no restrictions on compiler, architecture or even programming methodology.
What is interesting, however, is that the software engineering issues in HPC software ultimately arise from three causes that are not software engineering causes at all.
The first problem is performance. Techniques such as loop unrolling, cache blocking and vectorization make the code more complex. Replacing them with intrinsic constructs such as array operations, or with external libraries, can help. The compiler can also take over low-level optimizations such as loop unrolling and vectorization.
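To make the point concrete, here is a small sketch in Python (my own illustration, not code from the Sun research). The function names `dot_unrolled` and `dot_plain` are made up for this example. Both compute the same dot product; the first is unrolled by hand, the second just states what is wanted and leaves the low-level work to the language or library underneath.

```python
def dot_unrolled(x, y):
    """Dot product, manually unrolled by a factor of 4 with a remainder loop."""
    n = len(x)
    total = 0.0
    i = 0
    # Main body: four multiply-adds per iteration.
    while i + 4 <= n:
        total += (x[i] * y[i] + x[i + 1] * y[i + 1]
                  + x[i + 2] * y[i + 2] + x[i + 3] * y[i + 3])
        i += 4
    # Remainder: the elements left over when n is not a multiple of 4.
    while i < n:
        total += x[i] * y[i]
        i += 1
    return total


def dot_plain(x, y):
    """The same dot product written as a single whole-array expression."""
    return sum(a * b for a, b in zip(x, y))
```

The unrolled version needs an index, two loops and a remainder case, and each of them is a place where a bug can hide. The plain version has none of that, which is roughly the kind of size reduction the research describes.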
The second problem is parallelization. Finding concurrency and handling it properly always adds complexity. Distributing data across a parallel system (shared-memory or distributed-memory) makes the problem even worse. Hardware improvements would make things easier. On the software side, I think it is possible to look into the original physical/mathematical problem to identify the available concurrency and the hierarchy of parallelization more easily.
The third reason has to do with complex algorithms, usually introduced by sometimes irrational performance requirements. Physicists, mathematicians and engineers sometimes write a lot of complex code only to squeeze one or two drops of performance out of an unimportant section of the algorithm. Even more interesting, when the algorithms evolve, very complex terms in the formulas sometimes cancel out. For this kind of problem, languages that can express mathematical structure more easily and clearly may help.
In conventional software engineering, it is believed that code lives much longer than you would have thought. In HPC software, however, code sometimes lives much shorter than you would expect. I have actually seen a handful of these codes. They are often small projects by Ph.D. students, written to validate or demonstrate an algorithm. When someone else comes to continue the project, the code is so difficult to read and understand that the poor guy has to rewrite it from scratch. If these codes were written with programmability in mind, they would live much longer and save a lot of human effort.
Saturday, July 10, 2010
Writing HPC code with programmability in mind
Friday, June 25, 2010
Future of parallel programming
John Shalf says that to help the migration from traditional serial CPUs to multi/many-core architectures, we need more support in programming languages. Traditional languages add more restrictions, and additive approaches such as OpenMP help but cannot solve the problem fundamentally.
The most important thing in a parallel programming language is that it offers a good way to express 'locality of effect'. From this point of view, functional programming languages can help, so the future of parallel programming may well lie with some of the functional languages.
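Here is a minimal sketch of what I understand by locality of effect, written in Python rather than in a real functional language. The names `flux`, `serial_map` and `parallel_map` are hypothetical and chosen only for this example. Because `flux` reads nothing but its argument and writes nothing but its return value, the calls can be evaluated in any order, or at the same time, without changing the result.

```python
from concurrent.futures import ThreadPoolExecutor


def flux(cell):
    """Pure function: the result depends only on the argument, with no side effects."""
    left, right = cell
    return 0.5 * (right - left)


def serial_map(cells):
    """Evaluate flux for every cell, one after another."""
    return [flux(c) for c in cells]


def parallel_map(cells, workers=4):
    """Same computation spread over a thread pool.

    This is safe only because flux has no effect outside its own call.
    pool.map returns the results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(flux, cells))
```

Threads in standard CPython will not make CPU-bound work like this faster; the sketch only shows that the parallel version needs no locks and no reasoning about shared state. That guarantee is what a functional style gives you for free.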
"Implicit parallelism and constructs derived from functional languages are likely to see a resurgence", John says.
I agree with the idea that parallelization depends on the internal structure of the data flows in the solution process. A good way to help parallelization is to expose as much as possible of the internal data flow of an algorithm. Of course, this requires thinking differently about the algorithm itself, since the algorithm operates on the data of the original problem.
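One way to picture this, as a rough sketch under my own assumptions: write the data flow down explicitly as a table saying which step needs the results of which other steps, and let a scheduler find the steps that can run side by side. The function `parallel_stages` and the step names are invented for this example; `graphlib.TopologicalSorter` is part of the Python standard library (3.9 and later).

```python
from graphlib import TopologicalSorter


def parallel_stages(deps):
    """Group steps into stages; all steps within one stage are independent.

    deps maps each step to the set of steps whose output it consumes.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        # Every step whose inputs are all available can run concurrently.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical data flow: c needs a and b, d needs c, e needs only a.
example = {"c": {"a", "b"}, "d": {"c"}, "e": {"a"}}
stages = parallel_stages(example)
```

For the example above, `stages` comes out as three groups: `a` and `b` first, then `c` and `e`, then `d`. Nothing about the parallelism was written by hand; it falls out of the data flow once that data flow is visible.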