Saturday, January 14, 2012

A basic Forth language interpreter (non-interactive)

Last time we showed a simple tokenizer for a Forth interpreter. It is easy to combine the ideas in that post into a simple Forth engine that processes a single string.

We will expound further on this until we successfully build a compiler for a non-standard Forth language.

Have fun working on this. The current version only works on a simple string that is hard-coded in the source code itself. We will introduce interactivity and expand the capabilities of this simple Forth interpreter.



/*
file     simple-forth.c
author   Dr. Ernesto P. Adorio
         UPDEPP (University of the Philippines,
         Extension Program in Pampanga)
         Clarkfield, Pampanga
email    ernesto.adorio@gmail.com
version  0.0.1 January 14, 2012
*/

#include <stdio.h>   /* printf */
#include <stdlib.h>  /* atoi */
#include <string.h>  /* strcmp */
#include <ctype.h>   /* isspace */


#define MAXSTKLEN 32
#define MAXTOKENLEN 128

/* Opcodes; the first LENSTDWORDS entries must match the order of stdwords[]. */
enum OPCODES {L_EMIT, L_ADD, L_SUB, L_MUL, L_DIV, L_POP, L_INT32, L_CR, L_ERROR};
const char *stdwords[] = { 
".", 
"+",
"-",
"*",
"/",
"pop",
"int32",
"cr",
};


int LENSTDWORDS = 8;

int stack[MAXSTKLEN];
int SP = 0;   /* stack pointer index */

int main(){
  char tokens[] = "   123 34 + . cr 567 -3456 * . cr";
  char *tokstart = tokens;
  char *tokend = tokstart;
  int  opcode;

  /* @@@ printf("tokstart: %s\n", tokstart);*/

  tokend = tokens;

  while (1) {
    tokstart = tokend;
    /* ignore leading white spaces */
    while (isspace((unsigned char)*tokstart)) tokstart++;

    /* find terminating space or end of string */
    tokend = tokstart;
    while (!isspace((unsigned char)*tokend) && *tokend != '\0') tokend++;
    
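    /* nothing left to process */
    if (*tokstart == '\0') break;

    /* terminate the token; a token that ends at the end of the
       string is already terminated and must still be evaluated */
    if (*tokend != '\0') {
      *tokend = '\0';
      tokend++;
    }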

    /* get opcode */
    opcode = -1; 
    /*@@@ printf("token [%s]\n", tokstart); */
    for (int i =0; i < LENSTDWORDS; i++) {
      if (strcmp(stdwords[i], tokstart) == 0){
        opcode = i;
        break;
      }
    }
    if (opcode == -1){
      opcode = L_INT32;
    }


    /* Evaluate opcode */
    switch (opcode){
    case L_EMIT: 
      //printf("opcode L_EMIT %s \n", tokstart);
      printf ("%d", stack[SP--] );
      break;
    case L_ADD:
      //printf("opcode L_ADD %s\n", tokstart);
      stack[SP-1] += stack[SP];
      SP--;
      break;
    case L_SUB:
      //printf("opcode L_SUB %s\n", tokstart);
      stack[SP-1] -= stack[SP];
      SP--;
      break;
    case L_MUL:
      //printf("opcode L_MUL %s\n", tokstart);
      stack[SP-1] *= stack[SP];
      SP--;
      break;
    case L_DIV:
      //printf("opcode L_DIV %s\n", tokstart);
      stack[SP-1] /= stack[SP];
      SP--;
      break;
    case L_INT32:
      //printf("opcode L_INT32 %s\n",tokstart);
      stack[++SP]= atoi(tokstart);
      break;
    case L_POP:
      //printf("opcode L_POP %s \n", tokstart);
      SP--;
      break;
    case L_CR:
      // printf("opcode L_CR %s \n", tokstart);
      printf("\n");
      break;
    }  
  }
  printf("\n");
  return 0;
}

Save the code to a file simple-forth.c, then compile it with the following command in the directory where the source file is stored.

gcc simple-forth.c -std=c99 -o forth

It should compile cleanly. Then run the executable by issuing

./forth

When the program runs, it prints out
toto@toto-Aspire-4520:~/Blogs/my-other-life-as-programmer/forth$ ./forth
157
-1959552

