Summary of Fundamentals of Computer Architecture: Lecture 1: Modern Microprocessor Design (Spring 2025)
This lecture provides an introductory overview of modern computer architecture, focusing on microprocessor design principles, instruction processing models, and the evolution from single-cycle to pipelined and out-of-order execution architectures. It situates the course in the broader context of computer engineering education and current research trends, emphasizing the importance of hardware-software co-design, heterogeneity in computing, and the balance between performance, energy efficiency, and reliability.
Main Ideas, Concepts, and Lessons
1. Course Introduction and Context
- The course bridges the gap between foundational computer engineering and advanced computer architecture.
- It targets master’s level students from Electrical Engineering (EE), Computer Science (CS), and related fields.
- The teaching team includes the main professor and co-instructor Muhammad Sadati, with guest lectures planned.
- The course aims to provide both fundamental principles and exposure to cutting-edge research topics.
- Emphasis on learning, critical thinking, trade-off analysis, and practical labs (mostly simulation-based) rather than exams.
2. Research and Modern Computing Landscape
- Modern computing is heterogeneous: CPUs, GPUs, specialized accelerators (e.g., systolic arrays), and processing-in-memory architectures.
- Shift from CPU dominance to diverse processing units driven by workloads like machine learning, genomics, media processing, etc.
- Energy efficiency is a critical design metric, often treated as being on par with performance.
- Machine learning techniques are increasingly used to design better architectures.
- The course will touch on hardware-software co-design, where algorithms and hardware are optimized together.
3. Fundamental Concepts of Computer Architecture
- Computer Architecture Definition: Science and art of designing, selecting, and interconnecting hardware components and designing hardware-software interfaces to meet goals like performance, energy, cost, and functionality.
- Instruction Set Architecture (ISA): The contract/interface between software and hardware, specifying how instructions behave.
- Microarchitecture: Implementation of the ISA in hardware; can vary widely while maintaining the same ISA.
- ISA changes slowly, microarchitecture evolves faster.
- Analogy: the gas pedal is the interface for acceleration (the ISA); the engine internals that implement that acceleration are the microarchitecture.
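The ISA/microarchitecture split above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 32-bit wrapping `ADD` contract and both function names are hypothetical and do not come from the lecture. Two different implementations satisfy the same software-visible contract.

```python
# Hypothetical contract (the "ISA"): ADD returns (a + b) modulo 2**32
# for operands in the range 0 .. 2**32 - 1.
MASK = 0xFFFFFFFF


def add_direct(a: int, b: int) -> int:
    """Implementation 1: one wide addition, then wrap to 32 bits."""
    return (a + b) & MASK


def add_ripple_carry(a: int, b: int) -> int:
    """Implementation 2: a bit-by-bit ripple-carry adder."""
    result, carry = 0, 0
    for i in range(32):
        x = (a >> i) & 1
        y = (b >> i) & 1
        sum_bit = x ^ y ^ carry
        # Carry out of this bit position, computed from the old carry.
        carry = (x & y) | (carry & (x ^ y))
        result |= sum_bit << i
    return result  # the final carry is dropped, which gives the wrap-around


if __name__ == "__main__":
    # Software sees the same result regardless of the implementation.
    print(add_direct(7, 8), add_ripple_carry(7, 8))
```

Software written against the contract cannot tell which implementation ran; only speed, cost, and energy would differ in real hardware.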
4. Instruction Processing Models
- Von Neumann Model: Stored program, sequential instruction execution, one instruction completes before the next starts.
- Architectural State: Programmer-visible state (registers, memory, program counter).
- Microarchitectural State: Invisible to programmer, used internally to optimize execution.
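As a rough sketch of the sequential model described above, the toy interpreter below updates only architectural state (registers, memory, program counter) and finishes each instruction before starting the next. The three-instruction mini-ISA (`addi`, `store`, `bne`) and the `run` function are hypothetical, invented for this illustration.

```python
def run(program, max_steps=1000):
    """Execute a toy program one instruction at a time."""
    # Architectural (programmer-visible) state.
    state = {"pc": 0, "regs": [0, 0, 0, 0], "mem": {}}
    steps = 0
    while state["pc"] < len(program) and steps < max_steps:
        op, *args = program[state["pc"]]
        if op == "addi":      # regs[dst] = regs[src] + imm
            dst, src, imm = args
            state["regs"][dst] = state["regs"][src] + imm
            state["pc"] += 1
        elif op == "store":   # mem[addr] = regs[src]
            src, addr = args
            state["mem"][addr] = state["regs"][src]
            state["pc"] += 1
        elif op == "bne":     # jump to target if regs[a] != regs[b]
            a, b, target = args
            taken = state["regs"][a] != state["regs"][b]
            state["pc"] = target if taken else state["pc"] + 1
        else:
            raise ValueError(f"unknown opcode: {op}")
        steps += 1
    return state


# Count r2 up to the value in r1, then store it at address 100.
PROGRAM = [
    ("addi", 1, 0, 3),    # r1 = r0 + 3
    ("addi", 2, 2, 1),    # r2 = r2 + 1
    ("bne", 2, 1, 1),     # loop back to index 1 while r2 != r1
    ("store", 2, 100),    # mem[100] = r2
]

if __name__ == "__main__":
    print(run(PROGRAM))
```

Anything a real processor adds on top of this (pipeline registers, caches, predictors) would be microarchitectural state: it may change how fast the program runs, but not the final contents of `regs`, `mem`, or `pc`.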
5. Single-Cycle vs. Multi-Cycle Microarchitecture
- Single-Cycle Machines:
  - Each instruction executes in one clock cycle.
  - Clock cycle time determined by the slowest instruction (worst-case).
  - Simple but inefficient due to long clock cycles and hardware duplication.
- Multi-Cycle Machines:
  - Instruction execution broken into multiple clock cycles/stages.
  - Clock cycle time determined by the slowest stage, not the slowest instruction.
  - Reuse hardware resources across cycles, reducing cost.
  - Introduces overhead for storing intermediate results and sequencing.
  - More flexible, can handle variable memory latency (e.g., waiting for memory readiness).
  - Implemented as a finite state machine controlling the instruction processing cycle.
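The difference in how the two designs set their clock can be made concrete with a small calculation. All stage names, latencies (in picoseconds), and the instruction-to-stage mapping below are assumed values chosen for illustration, not figures from the lecture, and register and sequencing overheads are ignored.

```python
# Hypothetical per-stage latencies in picoseconds.
STAGE_LATENCY_PS = {
    "fetch": 200,
    "decode": 100,
    "execute": 200,
    "memory": 200,
    "writeback": 100,
}

# Hypothetical stages each instruction type actually needs.
STAGES_USED = {
    "load": ["fetch", "decode", "execute", "memory", "writeback"],
    "store": ["fetch", "decode", "execute", "memory"],
    "add": ["fetch", "decode", "execute", "writeback"],
    "branch": ["fetch", "decode", "execute"],
}


def single_cycle_clock_ps():
    """Clock period is set by the slowest whole instruction."""
    return max(
        sum(STAGE_LATENCY_PS[s] for s in stages)
        for stages in STAGES_USED.values()
    )


def multi_cycle_clock_ps():
    """Clock period is set by the slowest single stage."""
    return max(STAGE_LATENCY_PS.values())


def multi_cycle_latency_ps(instr):
    """Each instruction takes one clock per stage it needs."""
    return len(STAGES_USED[instr]) * multi_cycle_clock_ps()


if __name__ == "__main__":
    print("single-cycle clock:", single_cycle_clock_ps(), "ps")
    print("multi-cycle clock:", multi_cycle_clock_ps(), "ps")
    for name in STAGES_USED:
        print(name, "multi-cycle latency:", multi_cycle_latency_ps(name), "ps")
```

With these assumed numbers, every instruction on the single-cycle machine pays the 800 ps load latency, while a branch on the multi-cycle machine finishes in 600 ps. The load, however, gets slower (1000 ps), because short stages such as decode still occupy a full 200 ps cycle.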
6. Design Principles for Microarchitecture
- Critical Path Design: Minimize longest combinational logic delay to maximize clock frequency.
- Bread and Butter Design: Optimize for the common case workload, not rare instructions.
- Balanced Design: Balance instruction and data flow to avoid bottlenecks.
- Single-Cycle Machines violate all three principles.
- Modern machines still struggle with balanced design due to memory bottlenecks.
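One standard way to quantify the "optimize for the common case" principle is Amdahl's law. The summary does not name it, so it is brought in here as an outside illustration, and the fractions and speedups used below are hypothetical.

```python
def overall_speedup(fraction_improved: float, local_speedup: float) -> float:
    """Amdahl's law: overall speedup when only part of the work gets faster.

    fraction_improved: share of original execution time that benefits (0..1).
    local_speedup: how much faster that share becomes (> 0).
    """
    remaining = 1.0 - fraction_improved
    return 1.0 / (remaining + fraction_improved / local_speedup)


if __name__ == "__main__":
    # Modest 2x gain on work that accounts for 90% of the time...
    print(overall_speedup(0.90, 2.0))
    # ...versus a large 10x gain on work that accounts for only 5%.
    print(overall_speedup(0.05, 10.0))
```

Under these assumed numbers, the modest improvement to the common case wins clearly over the large improvement to the rare case.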
7. Pipelining
- Pipeline divides instruction execution into stages; multiple instructions processed concurrently in different stages.
- Increases instruction throughput (more instructions completed per unit time).
- Does not reduce latency of individual instructions; latency may increase due to overhead.
- Challenges include:
  - Handling control hazards (branches).
  - Handling data hazards (instruction dependencies).
  - Managing pipeline stalls and bubbles.
- Pipeline efficiency depends on:
  - Uniform partitioning of stages.
  - Independent and repetitive operations.
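- Real pipelines suffer from external fragmentation (different instruction types do not need every stage, so some stages sit idle) and internal fragmentation (stages have different latencies, yet every stage is allotted the same clock cycle).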
- Pipeline registers add overhead (sequencing delay).
- Example: A 5-stage pipeline ideally improves throughput by 5x but practical factors reduce this.
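The gap between the ideal 5x and what a pipeline delivers can be sketched with a simple cycle count. The model below is a simplification under stated assumptions: every instruction passes through all stages, the baseline is an unpipelined machine that spends one cycle per stage per instruction, and hazards are represented only as a lump sum of stall cycles. The function names and example numbers are hypothetical.

```python
def pipelined_cycles(n_instr: int, n_stages: int, stall_cycles: int = 0) -> int:
    """Cycles to finish n_instr instructions on an n_stages pipeline.

    The first instruction needs n_stages cycles to fill the pipeline;
    each later instruction completes one cycle after the previous one,
    plus any stall (bubble) cycles caused by hazards.
    """
    return n_stages + (n_instr - 1) + stall_cycles


def speedup_over_unpipelined(n_instr: int, n_stages: int, stall_cycles: int = 0) -> float:
    """Speedup versus running each instruction start to finish, one at a time."""
    unpipelined = n_instr * n_stages
    return unpipelined / pipelined_cycles(n_instr, n_stages, stall_cycles)


if __name__ == "__main__":
    print(speedup_over_unpipelined(1000, 5))                    # close to 5
    print(speedup_over_unpipelined(1000, 5, stall_cycles=200))  # noticeably lower
```

Even with no stalls the speedup stays just below 5 because of the fill time, and stall cycles pull it down further. Pipeline-register delay, which lengthens the clock cycle itself, is not modeled here.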
8. Performance Metrics and Trade-offs
- Execution time = Number of instructions × Average cycles per instruction (CPI) × Clock cycle time.
- Single-cycle: CPI = 1 but long clock cycle.
- Multi-cycle: CPI > 1 but shorter clock cycle.
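- Pipelining: overlaps instructions so that average CPI approaches 1 (compared with a multi-cycle design) while keeping a short clock cycle; hazards and stalls push CPI back above 1.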
- Trade-offs between hardware cost, clock frequency, CPI, and complexity.
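The execution-time equation above can be applied directly to compare the three design styles. The CPI and clock-period values below are assumed for illustration only; real values depend on the instruction mix and the implementation.

```python
def execution_time_ps(n_instructions: int, cpi: float, cycle_time_ps: float) -> float:
    """Execution time = instruction count x average CPI x clock cycle time."""
    return n_instructions * cpi * cycle_time_ps


# Hypothetical design points (not measurements from the lecture).
DESIGNS = {
    "single-cycle": {"cpi": 1.0, "cycle_time_ps": 800},  # CPI = 1, long cycle
    "multi-cycle": {"cpi": 3.8, "cycle_time_ps": 200},   # CPI > 1, short cycle
    "pipelined": {"cpi": 1.2, "cycle_time_ps": 220},     # CPI near 1, short cycle plus register overhead
}


def compare(n_instructions: int = 1_000_000) -> dict:
    """Return the execution time in picoseconds for each design point."""
    return {
        name: execution_time_ps(n_instructions, d["cpi"], d["cycle_time_ps"])
        for name, d in DESIGNS.items()
    }


if __name__ == "__main__":
    for name, time_ps in compare().items():
        print(f"{name}: {time_ps / 1e6:.0f} microseconds")
```

No single term decides the outcome: the multi-cycle design has the worst CPI of the three and the single-cycle design the worst clock period, and only the product of the terms says which is faster for a given workload.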
9. Control vs. Data Path
Instruction processing engine consists of
Notable Quotes
— 36:54 — « The gas pedal is the interface for acceleration, but internally at the microarchitecture level, there may be many ways of implementing that acceleration. »
— 75:23 — « There's a huge imbalance between compute and memory today, which is why we think about processing in memory architectures. »
— 76:58 — « In multicycle microarchitectures, each instruction takes as many clock cycles as it needs, allowing you to determine clock cycle time independently of instruction processing time. »
— 95:56 — « When an instruction is using some resources in its processing, process other instructions on idle resources not needed by an instruction. »
— 103:15 — « Ideal pipelining requires identical, independent, and uniformly partitionable operations, but unfortunately, instruction processing does not have all these properties. »