Spring 2010
Lecture 1: Introduction
Intro. to Computer Language Engineering
Course Administration Info
Outline
• Course Administration Information
• Introduction to computer language engineering
– Why do we need a compiler?
– What are compilers?
– Anatomy of a compiler
Saman Amarasinghe
6.035
©MIT Fall 1998
Course Administration
• Staff
• Optional Text
• Course Outline
• The Project
• Project Groups
• Grading
Reference Textbooks
• Modern Compiler Implementation in Java (Tiger book)
A.W. Appel. Cambridge University Press, 1998. ISBN 0-52158-388-8
A textbook tutorial on compiler implementation, including techniques for many language features.
• Advanced Compiler Design and Implementation (Whale book)
Steven Muchnick. Morgan Kaufmann Publishers, 1997. ISBN 1-55860-320-4
Essentially a recipe book of optimizations; very complete and suited for industrial practitioners and researchers.
• Compilers: Principles, Techniques and Tools (Dragon book)
Aho, Lam, Sethi and Ullman. Addison-Wesley, 2006. ISBN 0321486811
The classic compilers textbook, although its front-end emphasis reflects its age. New edition has more optimization material.
• Engineering a Compiler (Ark book)
Keith D. Cooper, Linda Torczon. Morgan Kaufmann Publishers, 2003. ISBN 1-55860-698-X
A modern classroom textbook, with increased emphasis on the back-end and implementation techniques.
• Optimizing Compilers for Modern Architectures
Randy Allen and Ken Kennedy. Morgan Kaufmann Publishers, 2001. ISBN 1-55860-286-0
A modern textbook that focuses on optimizations, including parallelization and memory hierarchy optimization.
The Project: The Five Segments
• Lexical and Syntax Analysis
• Semantic Analysis
• Code Generation
• Data-flow Analysis
• Optimizations
Each Segment...
• Segment Start
– Project Description
• Lectures
– 2 to 5 lectures
• Project Time
– (Design Document)
– (Project Checkpoint)
• Project Due
Project Groups
• 1st project is an individual project
• Projects 2 to 5 are group projects, with groups of 3 to 4 students
• Grading
– All group members (mostly) get the same grade
Grades
• Compiler project: 70%
• In-class Quizzes: 30% (10% each)
• In-class mini-quizzes: 10% (0.5% each)
Grades for the Project
– Scanner/Parser: 5%
– Semantic Checking: 7.5%
– Code Generation: 10%
– Data-flow Analysis: 7.5%
– Optimizations: 30%
Total: 60%
Optimization Segment
• Making programs run fast
– We provide a test set of applications
– Figure out what will make them run fast
– Prioritize and implement the optimizations
– Compiler derby at the end
• A “similar” application to the test set is provided the day before
• The compiler that produces the fastest code is the winner
• Do any optimizations you choose
– Including parallelization for multicores
• Grade is divided into:
– Documentation: 6%
• Justify your optimizations and the selection process
– Optimization Implementation: 12%
– Derby performance: 12%
• Producing correct code
Total: 30%
The Quiz
• Three Quizzes
• In-Class Quiz
– 50 Minutes (be on time!)
– Open book, open notes
Mini Quizzes
• You already got one.
• Given at the beginning of the class; collected at the end
• Collaboration is OK
• This is in lieu of time-consuming problem sets
Outline
• Course Administration Information
• Introduction to computer language engineering
– What are compilers?
– Why should we learn about them?
– Anatomy of a compiler
Why Study Compilers?
• Compilers enable programming in a high-level language instead of machine instructions.
– Malleability, Portability, Modularity, Simplicity, Programmer Productivity
– Also Efficiency and Performance
Compiler Construction touches many topics in Computer Science
• Theory
– Finite State Automata, Grammars and Parsing, data-flow
• Algorithms
– Graph manipulation, dynamic programming
• Data structures
– Symbol tables, abstract syntax trees
• Systems
– Allocation and naming, multi-pass systems, compiler construction
• Computer Architecture
– Memory hierarchy, instruction selection, interlocks and latencies, parallelism
• Security
– Detection of and protection against vulnerabilities
• Software Engineering
– Software development environments, debugging
• Artificial Intelligence
– Heuristic-based search for best optimizations
Power of a Language
• Can be used to describe any action
– Not tied to a “context”
• Many ways to describe the same action
– Flexible
How to instruct a computer
• How about natural languages?
– English??
– “Open the pod bay doors, Hal.”
– “I am sorry Dave, I am afraid I cannot do that”
– We are not there yet!!
• Natural Languages:
– Powerful, but…
– Ambiguous
• Same expression describes many possible actions
Programming Languages
• Properties
– need to be precise
– need to be concise
– need to be expressive
– need to be at a high-level (lot of abstractions)
High-level Abstract Description to Low-level Implementation Details
• President: “My poll ratings are low, let's invade a small nation”
• General: “Cross the river and take defensive positions”
• Sergeant: “Forward march, turn left, Stop!, Shoot”
• Foot Soldier
Figure by MIT OpenCourseWare.
1. How to instruct the computer
• Write a program using a programming language
– High-level Abstract Description
• Microprocessors talk in assembly language
– Low-level Implementation Details
Program written in a Programming Language → Compiler → Assembly Language Translation
1. How to instruct the computer
• Input: High-level programming language
• Output: Low-level assembly instructions
• Compiler does the translation:
– Read and understand the program
– Precisely determine what actions it requires
– Figure out how to faithfully carry out those actions
– Instruct the computer to carry out those actions
Input to the Compiler
• Standard imperative language (Java, C, C++)
– State
• Variables
• Structures
• Arrays
– Computation
• Expressions (arithmetic, logical, etc.)
• Assignment statements
• Control flow (conditionals, loops)
• Procedures
Output of the Compiler
• State
– Registers
– Memory with Flat Address Space
• Machine code – load/store architecture
– Load, store instructions
– Arithmetic, logical operations on registers
– Branch instructions
Example (input program)
int sumcalc(int a, int b, int N)
{
    int i, x, y;
    x = 0;
    y = 0;
    for(i = 0; i …

… j) return j; }
Errors flagged: Missing increment; Not an expression; Not a keyword
Anatomy of a Compiler
Program (character stream) → Lexical Analyzer (Scanner) → Token Stream → Syntax Analyzer (Parser) → Parse Tree → Semantic Analyzer → Intermediate Representation
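To make the first stage concrete, here is a minimal sketch of a scanner that turns a character stream into a token stream. It is not the course's scanner: the token kinds, the `Token` struct, the 32-character lexeme limit, and the function names `next_token` and `count_tokens` are all assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Token kinds for this sketch only; a real scanner distinguishes many more. */
typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* lexeme, truncated to fit (hypothetical limit) */
} Token;

/*
 * Read one token starting at *src and advance *src past it.
 * Whitespace is skipped; identifiers, unsigned integers, and
 * single-character punctuation are recognized.
 */
Token next_token(const char **src) {
    Token tok = { TOK_END, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p)) p++;

    if (*p == '\0') {            /* end of input: return TOK_END */
        *src = p;
        return tok;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n < sizeof tok.text - 1) tok.text[n++] = *p;
            p++;
        }
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof tok.text - 1) tok.text[n++] = *p;
            p++;
        }
    } else {
        tok.kind = TOK_PUNCT;    /* any other single character */
        tok.text[n++] = *p++;
    }
    tok.text[n] = '\0';
    *src = p;
    return tok;
}

/* Count the tokens in a string, as a simple driver for next_token. */
int count_tokens(const char *src) {
    int count = 0;
    while (next_token(&src).kind != TOK_END) count++;
    return count;
}
```

Calling `next_token` repeatedly on a statement such as `x = x + 42;` yields the identifier, punctuation, and number tokens in order; the parser would then consume that stream instead of raw characters.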
Semantic Analyzer
int * foo(i, j, k)
int i;
int j;
{
    int x;
    x = x + j + N;
    return j;
}
Errors flagged:
• Type not declared
• Mismatched return type
• Uninitialized variable used
• Undeclared variable
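One of these checks, detecting an undeclared variable, can be sketched with a flat symbol table. This is only an illustration under simplifying assumptions: there is a single scope, the names used in the function body are already collected into an array, and `SymbolTable`, `declare`, `is_declared`, and `count_undeclared` are hypothetical names, not part of the course project.

```c
#include <string.h>

#define MAX_SYMBOLS 16   /* hypothetical capacity for this sketch */

/* A flat symbol table: names declared in the current function. */
typedef struct {
    const char *names[MAX_SYMBOLS];
    int count;
} SymbolTable;

/* Record a declaration; returns 0 if the table is full, 1 otherwise. */
int declare(SymbolTable *table, const char *name) {
    if (table->count >= MAX_SYMBOLS) return 0;
    table->names[table->count++] = name;
    return 1;
}

/* Return 1 if the name has been declared, 0 otherwise. */
int is_declared(const SymbolTable *table, const char *name) {
    for (int i = 0; i < table->count; i++)
        if (strcmp(table->names[i], name) == 0) return 1;
    return 0;
}

/* Count how many of the used names were never declared. */
int count_undeclared(const SymbolTable *table,
                     const char *const used[], int n) {
    int errors = 0;
    for (int i = 0; i < n; i++)
        if (!is_declared(table, used[i])) errors++;
    return errors;
}
```

For the statement `x = x + j + N` with `i`, `j`, and `x` declared, the check would report one undeclared name, `N`. A real semantic analyzer would also track types and nested scopes, which this sketch leaves out.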
Anatomy of a Compiler
Program (character stream) → Lexical Analyzer (Scanner) → Token Stream → Syntax Analyzer (Parser) → Parse Tree → Semantic Analyzer → Intermediate Representation → Code Optimizer → Optimized Intermediate Representation
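As a small illustration of what a code optimizer does to an intermediate representation, the sketch below performs constant folding on an expression tree. Constant folding is chosen here only as a simple example of an IR-to-IR rewrite; the slides do not say which optimizations the course compiler performs at this point, and the `Node` layout and `fold_constants` name are assumptions.

```c
#include <stddef.h>

typedef enum { NODE_CONST, NODE_VAR, NODE_ADD, NODE_MUL } NodeKind;

/* A tiny expression-tree IR (hypothetical layout for this sketch). */
typedef struct Node {
    NodeKind kind;
    int value;                  /* used by NODE_CONST */
    const char *name;           /* used by NODE_VAR */
    struct Node *left, *right;  /* used by NODE_ADD and NODE_MUL */
} Node;

/*
 * Fold constant subexpressions in place, bottom-up.
 * An ADD or MUL node whose operands are both constants is rewritten
 * into a single CONST node. ADD and MUL nodes are assumed to have two
 * non-NULL children; integer overflow is ignored in this sketch.
 */
void fold_constants(Node *n) {
    if (n == NULL || n->kind == NODE_CONST || n->kind == NODE_VAR) return;
    fold_constants(n->left);
    fold_constants(n->right);
    if (n->left->kind == NODE_CONST && n->right->kind == NODE_CONST) {
        int l = n->left->value, r = n->right->value;
        n->value = (n->kind == NODE_ADD) ? l + r : l * r;
        n->kind = NODE_CONST;
        n->left = n->right = NULL;
    }
}
```

Applied to the tree for `a + 2 * 3`, the multiplication subtree becomes the constant 6 while the addition stays, since `a` is not known at compile time.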
Optimizer
int sumcalc(int a, int b, int N) { int i; int x, t, u, v; x = 0; u = ((a