Science Data Processor for the Square Kilometre Array Radio Telescope
I have been involved in the SKA Science Data Processor (SDP) project since 2013. I worked in the following areas:
Explore data-driven programming models
An early memo I co-authored with Bojan Nikolic explains some of the approaches. My company was contracted to explore multiple domain-specific language (DSL) and data flow approaches. We first created several prototypes in Haskell, then explored other domain-specific languages that had been successful (in particular Halide and Legion).
We reviewed Legion in detail, and while Legion’s architecture appeared a very good match, its implementation wasn’t far enough along for it to get the traction we were hoping for.
The contract was awarded, and the project is described in the following milestone reports:
- Milestone 1: Evaluating Data Flow DSL’s for the SKA SDP
- Milestone 2: SKA SDP DSL project, DSL Project Milestone Design
- Milestone 3: SDP DSL project report, DNA Performance Report, DNA API Manual
- Milestone 4: Synthesis as Data Flow, Comparison of HPC data flow languages for SKA SDP
- Milestone 5: Pipeline DSL, MS5 DSL Design
- Milestone 6: DSL Continuum Pipeline
- Milestone 7: Data Flow Prototyping Report
Deep study of Legion - part of Milestone 7
There is an accompanying Radio Cortex GitHub repository.
Working with the Haskell community to get a solid grasp of the programming principles, and with teams around the world to understand their approaches to HPC, was extremely interesting.
Create industrial relationships
I established high-level executive relationships between SKA and major chip and memory vendors. The resulting discussions were very influential, particularly with regard to understanding memory technologies.
Software Engineering Process for SKA
I wrote a memo explaining the methodology from Carnegie Mellon’s Software Engineering Institute, with which I’m very familiar, having used it in startups and big corporations. It was adopted by the entire SKA organization.
Review data storage software
This is of course an ongoing discussion, driven as always as much by opinions and politics as by facts. A memo summarizes my thoughts.
Strategic use of variable precision number formats and arithmetic
There is much interest in new number formats; for example, Google’s TPU processors implement a format called bfloat16. A much more general approach, posits, has been designed by John Gustafson; see Posithub.org.
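To make the reduced-precision idea concrete, here is a minimal Python sketch of bfloat16. The format keeps the sign bit and all 8 exponent bits of an IEEE 754 single-precision float but only 7 of its 23 fraction bits, so a bfloat16 value is simply the upper half of the 32-bit pattern. The function names are hypothetical and chosen for illustration. The sketch truncates the low bits, which is a simplifying assumption; real hardware and libraries typically round instead.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def to_bfloat16_bits(x: float) -> int:
    """Keep the upper 16 bits of the binary32 pattern.

    Assumption: plain truncation (round toward zero), not the rounding
    that production implementations usually apply.
    """
    return float32_bits(x) >> 16


def from_bfloat16_bits(b: int) -> float:
    """Widen a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def truncate_to_bfloat16(x: float) -> float:
    """Round-trip x through bfloat16 to see what precision survives."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


if __name__ == "__main__":
    for value in (1.0, 3.14159, 1.0e10):
        print(value, "->", truncate_to_bfloat16(value))
```

Because the exponent field is unchanged, the dynamic range matches single precision, while the relative error grows to at most about 2^-7. That trade is why the format is attractive when memory bandwidth matters more than the last few digits.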
Watch my blog for my keynote presentation regarding this topic at CONGA and for our report.