Architecture: Fall '04
From SubfireWiki
Please feel free to edit this page.
Problem 1
- Consider the following proposal to deal with the problem of branching in a superscalar processor:
- (1) Additional hardware resources are devoted to executing instructions on both paths (fall-through and target) of a conditional branch, and the number of unresolved branches at any time is limited.
- (2) As a branch instruction is resolved, appropriate actions are taken to remove the results of instructions along the mispredicted path.
Identify the major limitations, inefficiencies and complications in the proposed solution. Make sure that you address the complications/inefficiencies in dispatching/issuing instructions, in pipeline flushing and in maintenance of a precise state. Also identify and address the additional hardware requirements.
- Key Points:
- (1) Very hardware-intensive: more ports to the memory system are needed.
- (2) In a modern superscalar pipeline, a single cycle can encompass 2-6 instructions. If 2-3 cycles are needed to resolve a branch, a large issue queue is needed.
- (3) Removing the results of instructions along the mispredicted path inhibits potential gains.
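Key points (1) to (3) can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only an illustration, not part of the answer key. It assumes that every unresolved branch doubles the number of live paths, that each path keeps issue width × resolution latency instructions in flight, and that resources are split evenly across paths. All function names are hypothetical.

```python
def live_paths(unresolved_branches: int) -> int:
    """Paths being executed, assuming each unresolved branch forks into
    a fall-through path and a target path (so the count doubles)."""
    return 2 ** unresolved_branches


def in_flight_per_path(issue_width: int, resolve_cycles: int) -> int:
    """Instructions that enter one path before its branch resolves."""
    return issue_width * resolve_cycles


def wasted_fraction(unresolved_branches: int) -> float:
    """Share of the work that is discarded, assuming an even split of
    resources across paths and exactly one correct path."""
    paths = live_paths(unresolved_branches)
    return (paths - 1) / paths


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, live_paths(k), in_flight_per_path(6, 3), wasted_fraction(k))
```

Under these assumptions, limiting the number of unresolved branches is what keeps the path count, and with it the issue queue size and the number of memory ports, from growing exponentially.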
Problem 2
- L1 I-cache, L1 D-cache, branch predictor, physical register file, rename table, architectural register file (if any), issue queue, reorder buffer, load-store queue and the function units.
- (1) Which ones are the least likely to be a bottleneck?
- (2) Outline a solution to work around these multicycle access times for two of the most critical components listed above.
- (3) How would the increased L1 cache latencies degrade the IPC? What would be a solution for avoiding the IPC loss due to the longer access time of the L1 D-cache?
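For question (3), a first-order stall model gives a feel for how a longer L1 D-cache access time can reduce IPC. This is a minimal sketch under stated assumptions, not the expected exam answer: it assumes the added load latency shows up as extra stall cycles only for the fraction of loads whose latency is not hidden. The function and parameter names are hypothetical.

```python
def ipc_with_load_latency(base_ipc: float,
                          load_fraction: float,
                          extra_cycles: float,
                          exposed_fraction: float) -> float:
    """Estimate IPC after adding extra_cycles to the L1 D-cache access time.

    base_ipc         -- IPC with the original cache latency
    load_fraction    -- fraction of instructions that are loads
    extra_cycles     -- added cycles per load access
    exposed_fraction -- share of loads whose extra latency is NOT hidden
                        (assumption: hidden latency costs nothing)
    """
    base_cpi = 1.0 / base_ipc
    stall_cpi = load_fraction * extra_cycles * exposed_fraction
    return 1.0 / (base_cpi + stall_cpi)


if __name__ == "__main__":
    # Fully exposed vs. fully hidden extra latency.
    print(ipc_with_load_latency(2.0, 0.25, 2, 1.0))
    print(ipc_with_load_latency(2.0, 0.25, 2, 0.0))
```

In this model, any technique that hides the added latency works by lowering `exposed_fraction`, which is why the IPC loss can in principle be avoided without shortening the cache access itself.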
Problem 3
Identify two techniques of EPIC that can be adapted for use in superscalar CPUs.
- Keys:
- (1) Speculative load instructions, decoupling the memory operation from exception recognition, and speculative bypassing of earlier stores by loads.
- (2) Instructions for moving data explicitly across the memory hierarchy.
Problem 4
- Compare and contrast these two organizations of a clustered superscalar processor, considering such factors as load balancing, instruction dispatching/steering policy (which decides to which cluster an instruction is dispatched), instruction retirement and hardware complexity.
- (a) Clustered organization with replicated registers.
- (b) Clustered organization with non-replicated registers.
- Keys:
