An automatic approach to implementing DSP algorithms on parallel processors
Goddard, A.J. (1990). An automatic approach to implementing DSP algorithms on parallel processors. (Unpublished Doctoral thesis, City, University of London)
Abstract
Recently, there has been an increase in demand for low cost, high throughput parallel processors on which to implement real-time DSP applications. Numerous solutions have been proposed, though often these are application dependent. This is true for SIMD and systolic architectures, which require a high degree of regularity in an application’s structure. A more general purpose solution is offered using MIMD architectures, which come in a variety of forms. Here we concentrate on loosely-coupled, homogeneous architectures, because they offer infinite expandability and a low cost/processor ratio.
On the route to successful parallel implementation, there are three fundamental problems to be solved, these are parallelism detection, partitioning and scheduling. An automatic approach to solving these problems is presented in the thesis. The approach is based on a compile-time implementation strategy which extracts parallelism, partitions and schedules during compile-time.
Applications are written in a single-assignment language called DFDL (Digital Filter Description Language); a language designed specifically for deterministic, sampled systems. By using a single-assignment language, programs are easily translated into a graphical form (task graph) which conserves and displays parallelism. The task graph is used as an input to the partitioning and scheduling stages.
Prior to scheduling, the user is prompted for details of the target architecture. This includes the number and type of processors, input, output and inter-processor connections. These parameters are used to form a separate graph, called the processor graph. Execution profile information is added to the task graph, this enables analysis to be performed which aids scheduling.
The compile-time scheduling problem is expressed as an optimisation problem, which is shown to be NP-complete. The thesis presents an efficient approximation algorithm for the scheduling problem, which is based on a lateness heuristic. The resulting schedules are translated into parallel Occam for execution on an array of Transputers. The successes and failures of the implementation strategy are examined and commented upon. Performance results are given for several ex-ample applications written in DFDL. These examples are implemented on a range of architectures and the effects of communication, scheduling and topology are discussed.
Publication Type: | Thesis (Doctoral) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Departments: | Doctoral Theses School of Science & Technology > School of Science & Technology Doctoral Theses School of Science & Technology > Engineering |
Download (12MB) | Preview
Export
Downloads
Downloads per month over past year