Repository landing page
COMPROF and COMPLACE: shared-memory communication profiling and automated thread placement via dynamic binary instrumentation
Abstract
This paper presents COMPROF and COMPLACE, a novel profiling tool and thread placement technique for shared-memory architectures that require no recompilation or user intervention. We use dynamic binary instrumentation to intercept memory operations and estimate inter-thread communication overhead, deriving (and optionally visualising) a communication graph of data sharing between threads. We then use this graph to map threads to cores so as to optimise traffic through the memory system. Different paths through a system's memory hierarchy have different latency, throughput and energy properties; COMPLACE exploits this heterogeneity to provide automatic performance and energy improvements for multi-threaded programs. We demonstrate COMPLACE on the NAS Parallel Benchmark (NPB) suite, where our technique achieves improvements of up to 12% in execution time and up to 10% in energy consumption (compared to default Linux scheduling) without requiring any modification or recompilation of the application code.
- contributionToPeriodical
- NUMA
- Thread Placement
- Data Placement
- Cache Optimisation
- Energy Optimization
- Refactoring
- QA75 Electronic computers
- Computer science
- NDAS
- Energy Optimisation
- Information Systems and Management (ASJC 1802)
- Artificial Intelligence (ASJC 1702)
- Control and Optimization (ASJC 2606)
- Information Systems (ASJC 1710)
- Hardware and Architecture (ASJC 1708)
- Computer Science Applications (ASJC 1706)
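
Illustrative sketch
The pipeline described in the abstract (observe memory operations, derive a communication graph, then map threads to cores) can be illustrated with a minimal Python sketch. This is not the COMPROF or COMPLACE implementation. It assumes a pre-recorded trace of (thread id, address) pairs in place of dynamic binary instrumentation, counts a communication event whenever a thread touches a cache line last touched by a different thread, and uses a simple greedy heuristic for placement. The names `communication_graph`, `place_threads`, `CACHE_LINE` and the 64-byte line size are all hypothetical choices made for illustration.

```python
from collections import defaultdict

CACHE_LINE = 64  # assumed cache-line size in bytes (hypothetical)


def communication_graph(trace):
    """Build an undirected, weighted communication graph from a memory trace.

    trace: iterable of (thread_id, address) pairs.
    Returns {(a, b): weight} with a < b, where weight counts how often one
    thread accessed a cache line last touched by the other.
    """
    last_toucher = {}
    graph = defaultdict(int)
    for tid, addr in trace:
        line = addr // CACHE_LINE
        prev = last_toucher.get(line)
        if prev is not None and prev != tid:
            graph[(min(prev, tid), max(prev, tid))] += 1
        last_toucher[line] = tid
    return dict(graph)


def place_threads(threads, graph, n_groups):
    """Greedily assign threads to n_groups equally sized core groups
    (for example, sockets or cores sharing a cache), favouring the group
    whose current members communicate most with the thread being placed.
    """
    capacity = -(-len(threads) // n_groups)  # ceiling division

    def weight(a, b):
        return graph.get((min(a, b), max(a, b)), 0)

    # Place the heaviest communicators first; break ties by thread id.
    volume = {t: sum(weight(t, u) for u in threads if u != t) for t in threads}
    groups = [[] for _ in range(n_groups)]
    for t in sorted(threads, key=lambda t: (-volume[t], t)):
        candidates = [g for g in groups if len(g) < capacity]
        best = max(candidates, key=lambda g: sum(weight(t, u) for u in g))
        best.append(t)
    return {t: i for i, g in enumerate(groups) for t in g}


# Toy trace: threads 0 and 2 share one cache line, threads 1 and 3 another.
trace = [(0, 0x1000), (2, 0x1008), (0, 0x1010),
         (1, 0x2000), (3, 0x2004), (2, 0x1000), (1, 0x3000)]
graph = communication_graph(trace)
placement = place_threads([0, 1, 2, 3], graph, n_groups=2)
```

In this toy trace, the sketch co-locates threads 0 and 2 in one group and threads 1 and 3 in the other. Counting at cache-line granularity also registers false sharing as communication; whether that is desirable depends on the memory system being modelled.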