Architecture: Fall '04

Please feel free to edit this page.

Problem 1

  • Consider the following proposal for dealing with the problem of branching in a superscalar processor:
  1. (1) Additional hardware resources are devoted to executing instructions on both paths (fall-through and target) of a conditional branch, and the number of unresolved branches at any time is limited.
  2. (2) As a branch instruction is resolved, appropriate actions are taken to remove the results of instructions along the mispredicted path.

Identify the major limitations, inefficiencies, and complications in the proposed solution. Make sure that you address the complications/inefficiencies in dispatching/issuing instructions, in pipeline flushing, and in the maintenance of a precise state. Also identify and address the additional hardware requirements.

  • Key Points:
    1. (1) Very hardware intensive: for example, more ports to the memory system are needed.
    2. (2) In a modern superscalar pipeline, a single cycle can encompass 2-6 instructions. If 2-3 cycles are needed to resolve a branch, instructions from both paths accumulate in the meantime, so a large issue queue is needed.
    3. (3) Removing the results of instructions along the mispredicted path inhibits the potential gains.
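
The sizing argument in key point (2) can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only a rough model, not part of the original answer key: the function name `inflight_instructions` is hypothetical, and it assumes that each unresolved branch doubles the number of live paths and that every path is fetched at the full issue width.

```python
def inflight_instructions(issue_width, resolve_cycles, unresolved_branches=1):
    """Rough count of instructions that must be buffered while branches resolve.

    Assumptions (illustrative only): each unresolved branch doubles the number
    of paths being executed, and every path proceeds at the full issue width.
    """
    paths = 2 ** unresolved_branches
    return paths * issue_width * resolve_cycles


# Issue widths of 2-6 and resolution latencies of 2-3 cycles, as in key point (2).
for width, cycles in [(2, 2), (4, 3), (6, 3)]:
    print(width, cycles, inflight_instructions(width, cycles))
```

Under these assumptions, even a single unresolved branch on a 6-wide machine with a 3-cycle resolution latency implies 36 buffered instructions, and allowing two unresolved branches doubles that to 72, which is why the proposal has to cap the number of unresolved branches.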

Problem 2

  • L1 I-cache, L1 D-cache, branch predictor, physical register file, rename table, architectural register file (if any), issue queue, reorder buffer, load-store queue, and the function units.
    1. (1) Which ones are the least likely to be a bottleneck?
    2. (2) Outline a solution to work around these multicycle access times for two of the most critical components listed above.
    3. (3) How would the increased L1 cache latencies degrade the IPC? What would be a solution for avoiding the IPC loss due to the longer access time of the L1 D-cache?


Problem 3

Identify two techniques of EPIC that can be adapted for use in superscalar CPUs.

  • Keys:
    1. (1) Speculative load instructions: decoupling the memory operation from exception recognition, and speculative bypassing of an earlier store by a load.
    2. (2) Instructions for moving data explicitly across the memory hierarchy.
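
The idea in key (1) of decoupling a memory operation from exception recognition can be illustrated with a toy model. This is a minimal sketch, loosely in the spirit of a speculative load paired with a later check (as with IA-64's `ld.s`/`chk.s` and its NaT bit). All names here (`Reg`, `PageFault`, `MEMORY`, `speculative_load`, `check`) are hypothetical, and the dictionary stands in for memory, with any missing address treated as a faulting access.

```python
class PageFault(Exception):
    """Stands in for any exception a load could raise."""


class Reg:
    """Register value plus a deferred-exception flag."""
    def __init__(self, value=0, poisoned=False):
        self.value = value
        self.poisoned = poisoned


MEMORY = {0x100: 42}  # toy memory; unlisted addresses fault


def load(addr):
    """Ordinary load: raises immediately on a fault."""
    if addr not in MEMORY:
        raise PageFault(hex(addr))
    return MEMORY[addr]


def speculative_load(addr):
    """Hoisted load: never raises, it records the fault in the register instead."""
    try:
        return Reg(load(addr))
    except PageFault:
        return Reg(poisoned=True)


def check(reg, addr):
    """Placed at the load's original position; the fault can only surface here."""
    if reg.poisoned:
        return Reg(load(addr))  # redo non-speculatively, raising the real exception
    return reg


r = speculative_load(0x100)      # executed early, e.g. above a branch
print(check(r, 0x100).value)     # value is used only once the check passes

bad = speculative_load(0x999)    # would fault, but nothing is raised yet
print(bad.poisoned)
```

If the guarded path is never taken, `check(bad, 0x999)` is never executed and the fault is silently dropped, which is what makes hoisting the load safe in this model.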


Problem 4

  • Compare and contrast these two organizations of a clustered superscalar processor, considering such factors as load balancing, the instruction dispatching/steering policy (which decides the cluster to which an instruction is dispatched), instruction retirement, and hardware complexity.
    • (a) Clustered organization with replicated registers.
    • (b) Clustered organization with non-replicated registers.
  • Keys: