In simultaneous multithreaded (SMT) architectures, many separate threads run concurrently and share processor resources, which yields a high utilization rate of the available hardware. However, this also means that threads compete for resources, and in many cases this competition degrades overall performance. There are two major causes: first, instructions that suffer a long-latency data cache miss keep their dependent instructions from proceeding for many cycles, wasting space in the instruction queues; second, instructions that belong to a mispredicted path are executed. Both harm throughput, and the second also wastes energy. In this paper we propose a fetch policy that avoids issuing instructions to the pipeline when we are not confident that they belong to the correct execution path. In this way, we avoid using resources for instructions that will not contribute to performance. This fetch policy, called agstall, is based on a dynamic branch classification mechanism that classifies branch instances as either strongly biased or not strongly biased. We consider all strongly biased branches easy to predict, and we stall the thread on branches that are not strongly biased to avoid mispredicting them. Our results show that agstall achieves performance similar to or better than icount and reduces the number of wrong-path instructions executed by up to 86%.
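
The classify-then-stall idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism from the paper: the abstract does not say how bias is measured, so the sketch assumes a branch counts as strongly biased once it has gone the same direction for a fixed number of consecutive executions. The names `BiasClassifier`, `threshold`, and `should_stall_fetch` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BiasClassifier:
    """Hypothetical per-branch bias tracker (assumed design, not the paper's hardware)."""

    # Assumption: this many consecutive same-direction outcomes mark a branch
    # as strongly biased. The abstract gives no such number.
    threshold: int = 8
    # Maps branch address -> (last outcome, length of the current same-direction run).
    table: dict = field(default_factory=dict)

    def update(self, pc: int, taken: bool) -> None:
        """Record a resolved branch outcome for the branch at address pc."""
        last, run = self.table.get(pc, (taken, 0))
        # Extend the run if the direction repeats, otherwise start a new run.
        self.table[pc] = (taken, run + 1 if taken == last else 1)

    def is_strongly_biased(self, pc: int) -> bool:
        """Unknown branches are conservatively treated as not strongly biased."""
        _, run = self.table.get(pc, (False, 0))
        return run >= self.threshold


def should_stall_fetch(classifier: BiasClassifier, pc: int) -> bool:
    """Stall the thread's fetch on any branch that is not strongly biased."""
    return not classifier.is_strongly_biased(pc)


classifier = BiasClassifier(threshold=3)
for _ in range(3):
    classifier.update(0x40, taken=True)

stall_on_biased = should_stall_fetch(classifier, 0x40)   # run of 3 reaches the threshold
stall_on_unknown = should_stall_fetch(classifier, 0x80)  # never seen before

classifier.update(0x40, taken=False)                     # direction flips, run resets
stall_after_flip = should_stall_fetch(classifier, 0x40)
```

In this sketch a single change of direction sends a branch back to the "not strongly biased" class, which is a deliberately strict choice; a saturating counter or a bias ratio over a window would be equally plausible readings of the abstract.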

Computer architecture, Concurrent computing, Contracts, Degradation, Delay, Hardware, Round robin, Surface-mount technology, Throughput, Yarn
International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems, IWIA 2002
Sprott School of Business

Knijnenburg, P. M. W., Ramirez, A., Latorre, F., Larriba, J., & Valero, M. (2002). Branch classification to control instruction fetch in simultaneous multithreaded architectures. In Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems (pp. 67–76). doi:10.1109/IWIA.2002.1035020