NTRS - NASA Technical Reports Server

Efficacy of Code Optimization on Cache-Based Processors

In this paper, a number of techniques for improving the cache performance of a representative piece of numerical software are presented. The target machines are popular processors from several vendors: MIPS R5000 (SGI Indy), MIPS R8000 (SGI PowerChallenge), MIPS R10000 (SGI Origin), DEC Alpha EV4 and EV5 (Cray T3D and T3E), IBM RS6000 (SP Wide-node), Intel PentiumPro (Ames' Whitney), and Sun UltraSparc (NERSC's NOW). The optimizations all attempt to increase the locality of memory accesses, but they meet with rather varied and often counterintuitive success on the different computing platforms. We conclude that it may be genuinely impossible to obtain portable performance on the current generation of cache-based machines. At the least, it appears that the performance of modern commodity processors cannot be described with parameters defining the cache alone.
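As a rough illustration of what "locality of memory accesses" means, the sketch below counts misses in a toy direct-mapped cache for two traversal orders of the same N x N array. It is not taken from the paper: the function names, the cache size (64 lines of 8 words), and the array size are all hypothetical choices made for this example, and the model ignores everything except the cache itself.

```python
def count_misses(addresses, num_lines=64, line_words=8):
    """Count misses in a toy direct-mapped cache.

    num_lines and line_words are assumed values for illustration,
    not parameters of any processor named in the abstract.
    """
    tags = [None] * num_lines          # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_words     # memory block containing this word
        index = block % num_lines      # cache line that block maps to
        if tags[index] != block:       # block not resident: miss, then load it
            tags[index] = block
            misses += 1
    return misses


def row_major_walk(n):
    """Visit a[i][j] (stored at i*n + j) with the inner loop at unit stride."""
    return (i * n + j for i in range(n) for j in range(n))


def column_major_walk(n):
    """Visit the same elements with the inner loop striding by n words."""
    return (i * n + j for j in range(n) for i in range(n))


N = 256  # hypothetical array dimension
unit_stride_misses = count_misses(row_major_walk(N))
strided_misses = count_misses(column_major_walk(N))
print("unit-stride misses:", unit_stride_misses)
print("strided misses:    ", strided_misses)
```

In this model the unit-stride walk misses once per cache line, while the strided walk misses on every access, so reordering loops looks like a clear win. The abstract's conclusion is precisely that real processors do not behave this predictably: a model built from cache parameters alone, like this one, should not be expected to forecast measured performance across machines.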
Document ID
Document Type
Preprint (Draft being sent to journal)
Authors
VanderWijngaart, Rob F.
(MRJ Technology Solutions, Inc. United States)
Saphir, William C.
(National Energy Research Supercomputer Center United States)
Chancellor, Marisa K.
Date Acquired
August 20, 2013
Publication Date
January 1, 1997
Subject Category
Computer Programming and Software
Meeting Information
SC97: High Performance Networking and Computing (San Jose, CA)
Funding Number(s)
PROJECT: RTOP 519-40-12
Distribution Limits
Work of the US Gov. Public Use Permitted.