Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Wish me Luck [My Presentation]
#1
Tomorrow is my first presentation on my project Compiler Construction.

I have been writing these stuff which I have to present tomorrow.

feeling nervous , hopefully all goes well ;)

now I have to make PPT slides from this text.


Code:
[ Compiler Construction ]


First of all , what is a Comipler ??

Compiler is a computer program which converts your source code into an assembly code.


Source code is a code written in programming languages like C++ , C , Java , which are usually termed as High level Language.
Compiler takes your source code and checks it for errors and if not found any , converts it into an assembly code .
assembly code is a code written in a assembly language which is a low level language.


    


Construction

Basically it is a two pass Compiler,

Front End and back End.


Front End is composed of two things

Scanner
Parser


Back End is composed of many parts

few of them are

Code Optimization
Instruction Selection
Registry Allocation


Scanner is called as a lexical analyzer also known as tokenizer.
lexical analysis is the process of converting a sequence of characters into a sequence of tokens.

Token is composed of two things
<token type , value>

example

Consider this expression in C++ programming language.
    sum=3+2;

sum = identifier
=     Assignment operator
3     =Number
+     =Addition operator
2     =Number
;     =End of statement


Token = <number , 3>


how it works ?

Scanner has some basic definations of the token type for the language we are working on.

firstly we deifne a regular language for the Scanner and then derive regular expressions  from it.

then we convert it into a Non Deterministic finite automata then to Deterministic finite automata  and then minimizes it.

Regular language can be

[0-9] means 0 to 9

[a-z]*  * means in a loop                                            b  
a|b  a or b


thse are regular expressions defined for our language .
we make a finite automata through this expression.
Lex is a tool which is used to generate this Scanner.


Coding in C++ for Scanner


%
{#include "tokendef.h"
}
%


D   [0-9]

L  [a-zA-Z_]

id {L} ({L} {D})*

%%

void {return(TOK_VOID); }

int {return (TOK_INT}; }

(D)+

while   {return (TOK_WHILE);}

if      {return (TOK_IF);}
else    {return (TOK_ELSE);}


%%




#define void 1
#define int 2
#define wihle 3
#define if 4
#define else 5



.cpp file

#include
int main(){

FlexLexer lex;

int tc = lex. [method name  which basically contains all the defination statement ]

while(tc !=0){

cout <<tc <<" , " <<lex.YYText() <<endl;

tc =lex.[same method name]

}
}
}





Parser

Parser is used to create an Intermediate Representation of this tokens.

one form of this IR is Abstract Syntax Tree [AST]

which is derived from Parse Tree.

Parser deals with the Syntax and Semantics.

it basically checks wheather your expression belongs to your programming language or not

lets take an example to further explain this


expression

fi(a=20;a<78;a++){



as per the syntax it should be "if" , instead of "fi"

also , there is no semicolon at the end.
this is what you call syntax.

Scanner will pass this as an identifier.


what is semantics?

Linguistic semantics is the study of meaning that is used to understand human expression through language.
Other forms of semantics include the semantics of programming languages.

lets take an example to explain this

suppose you initialize a variable "a"

and do not declare it,
our Comipler should throw a semantic error.
The whole concept of Parser is based on CFG
which is Context free Grammer.

It is a formal grammar in which every production rule is of the form

    V --> w

where V is a single nonterminal symbol, and w is a string of nonterminals.

There are certain rules defined in CFG which helps  us to make Parser.

There are variuos types of parsing technique

top down parsing

bottom up parsing

preductive parsing

recursive descend parsing


Backup Normal Form(BNF) notation was used to derive certain rules for making syntax for programming languages.


These rules are based on
[Context free Grammer]

the way of doing this is

take your expression and start deriving it

either form the left hand side or right hand side

example

goal --> expr

these expr is further derived .

lets take an example

expression x-2*y

lets do a leftmost derivation

goal ---> expr

expr--?> expr , op, term

expr --->expr , op , term

result --> (x-2)*y


when you do this same derivation by using right most derivation

result would be x-(2*y)

now in the case of programming language,

our expressions should be based on BODMAS rule.

for this we define levels.
Top up down parsing

in this method , for the construction of our parse tree, we follow our path from root to last leaf

let us take an example

same expression used before


x-2*y


we follow the construction from root .

this is our rule for defining grammer

goal -> expr

expr -> expr + term   [here we are taking + just to try out our rule]

later we will see that we have not applied the correct rule , then we will change our operator.

how we will do this ? by backtrack

backtrack is a method which tells you to go back and change your rule.

now we will change our operator to - wherever it was +.



our last term in this rule is  always factor
replace your actual value by factor.


this technique only works for right recursive grammer
it fails when left recursive grammer occurs.

eg for left recursive grammer

term --> term * factor



now we have to convert this left recursive grammer to right one


expr --> term  expr'

expr' ----> + term expr'

expr'------> - term expr'


expr'----->e [epsolong sign]

this is done by using epsolong sign.

Preductive Parsing


this technique is known as Preductive Parsing because we predict for the next token.

it is based on the LL(k) property

L--> left to right scan

L-->  left side parsing


k is the look ahead token.

very complex technique to use.

LL(k) property is true

when A->a amd A->b
then first(a) U first(b) = null;

this property was basically used in Recursive descend Parsing.

in this we have to make functions of each non terminal , i.e for expr(), term(), Eprime().

We use OOP language C++ , so that we can use objects and concept of Inheritance to make it simpler and easy.


first thing we have to do is make a SuperClass NonTerminal and then all the non terminals will be the sub class of this superclass.


class NonTerminal{

public:NonTerminal(Scanner* sc){

s= sc; tree =null;
}

virtual = ~NonTerminal();

virtual bool isPresent() =0;



TreeNode* AST(){return tree;}

protected:
scanner* s;
TreeNode* tree;

}


our Grammer rule

goal --->expr

expr---> term expr*  

expr* --> + term expr'

    | - term expr*




we have to make subclasses of each Non Terminal

class expr:public NonTerminal{
bool expr::isPresent(){

Term* op1 = new Term(s);

if(op1->ispresent())

return false;

tree = op1->AST();

Eprime* op2 =new Eprime(s,tree);

if(op2->ispresent())
tree = op2->AST();

return true;



Subclass for Eprime


bool Eprime::ispresent(){


int op = s->nextToken();

if(op==PLUS || op==MINUS){


s->advance();

Term* op2 = new Term(s);

if(!op2->isPresent()

{syntaxError(s);
TreeNode* t2 = op2->AST();

tree = new TreeNode(op, exprsofar, t2);


Eprime op3 = new Eprime(s, tree);

if(op3->isPresent())
tree = op3->AST();

return true;
}

else return false;




More to Come [] . .


There are many corrections to be done , please let me know if someone can , thanks Smile




Reply
#2
Ummm, that looks pretty good bro! It's a little over my head, but I think you got dis!
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)