When you hit the compile button of a script window, Frontier first creates a string version of the contents of that window, adding semicolons and curly braces as necessary, similar to what you get when you coerce a script object to a string. [see opverbgetlangtext in opverbs.c]
The UserTalk compiler builds a code tree from that string. As far as I can tell, the compiler isn't very special: the implementation appears to follow the basic computer science recipes for building compilers. I have never had to take a closer look at the internals. Treating the compiler as a black box that takes a string containing some UserTalk code and returns the corresponding code tree has always worked for me.
The resulting code tree is linked to the script object for reuse.
[see langbuildtree in lang.c; all the low-level stuff is in langparser.c, langscan.c, and langtree.c]
A node in the code tree is represented by a struct of type tytreenode. [see lang.h]
Every node represents a certain operation, which is indicated by a struct member named “nodetype” of type tytreetype. For example, this operation could be a multiplication, an assignment, a function call, a kernel call, or a return statement. The definition of tytreetype [in lang.h] has a list of all possible node types (tytreetype is an enum).
In order to store the result of the operation (if needed), the struct has a member named “nodeval” of type tyvaluerecord (which is the universal kernel representation of a UserTalk value, see below).
A struct member named “link” can be used to store a handle to the next node in a list of statements or parameters. So, in fact, we aren't usually dealing with just a single code tree, but rather any node can be the head of a linked list of nodes with each of the nodes being the root of another code tree.
If the operation represented by the node has operands, they are represented by up to four handles to subnodes. For example, for a multiplication, there will be two subnodes representing the two values to be multiplied, and for a function call, there will be one subnode representing the function name/address and another representing the list of parameters.
The struct also has members named “lnum” and “charnum” for remembering from what part of the source code the node was generated. In case of an untrapped scripterror, this information can be used to position the cursor if the user clicks the Go To button in the script error window.
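To make the node layout described above more concrete, here is a minimal sketch in C. It is not the actual tytreenode definition from lang.h: the type names demonode and demonodetype, the three enum members, the params array, and the helper functions are all assumptions made for illustration. Only the member names nodetype, nodeval, link, lnum, and charnum come from the description above, and plain pointers stand in for handles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of node types; the real list is tytreetype in lang.h. */
typedef enum { demo_constop, demo_multiplyop, demo_assignop } demonodetype;

typedef struct demonode {
    demonodetype nodetype;      /* which operation this node represents */
    long nodeval;               /* stand-in for a tyvaluerecord */
    struct demonode *link;      /* next statement or parameter in a list */
    struct demonode *params[4]; /* up to four operand subtrees */
    long lnum;                  /* source line the node was generated from */
    int charnum;                /* source column the node was generated from */
} demonode;

/* Attach child as operand number i (0..3) of parent. */
void demo_setparam(demonode *parent, int i, demonode *child) {
    assert(i >= 0 && i < 4);
    parent->params[i] = child;
}

/* Count the nodes chained through link, starting at head. */
size_t demo_countlist(const demonode *head) {
    size_t n = 0;
    for (; head != NULL; head = head->link)
        n++;
    return n;
}

/* Count every node reachable from root: linked siblings plus their operands. */
size_t demo_countnodes(const demonode *root) {
    size_t n = 0;
    for (; root != NULL; root = root->link) {
        n++;
        for (int i = 0; i < 4; i++)
            n += demo_countnodes(root->params[i]);
    }
    return n;
}
```

The second counting function illustrates the point made above: each node in a linked list is itself the root of a tree, so a full traversal has to follow both link and the operand subnodes.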
Running a UserTalk script means evaluating code trees. Before you can start evaluating a list of statements, some global variables and callbacks need to be set up. A universal wrapper function that handles these tasks is langruncode [in lang.c].
At the next-lower level, evaluatelist [in langevaluate.c] is something like the main event loop of the UserTalk interpreter. It chains through a linked list of nodes until it reaches the end of the list. It calls evaluatetree (see below) for each node to evaluate the code tree of which that node is the root. Before chaining through the list, it sets up a new table for local variables if needed. These are the very same tables that show up in system.compiler.stack if you run a script in the debugger. It also sets up the local “this” and “tryerror” variables in the table for the local variables if needed. If the linked list of nodes represents the body of a function, it will use a magic table that already contains the parameters of the function call instead of creating a new table for local variables. This magic table is passed into evaluatelist via the global hmagictable variable, which is set by langfunctioncall [in langvalue.c]. When the end of the list of nodes is reached, the table for local variables is automatically disposed, which also takes care of disposing all local variables at that level of scope.
At the next-lower level, evaluatetree is just a wrapper for evaltree that checks available stack space [in langevaluate.c].
evaltree first (recursively) evaluates the child nodes if it is required for the operation associated with the node. Then there's a big switch statement with a handler for every known operation. The handlers often just call another function (usually living in langvalue.c) that implements the operation. The functions in turn might call evaluatetree or even evaluatelist, e.g. to repeatedly evaluate the body of a loop. If you want to find out how the kernel implements a certain UserTalk operator, this big switch statement is the ideal jumping-off point.
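The division of labor between evaluatelist, evaluatetree, and evaltree can be sketched with a toy interpreter. This is a simplified illustration, not the kernel code: all names with the toy_ prefix are hypothetical, values are plain long integers instead of tyvaluerecords, local tables are left out entirely, and a recursion depth counter stands in for the real check of available stack space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { toy_constop, toy_addop, toy_multiplyop } toyop;

typedef struct toynode {
    toyop nodetype;
    long nodeval;              /* constant value, used by toy_constop only */
    struct toynode *link;      /* next statement in the list */
    struct toynode *params[2]; /* operand subtrees */
} toynode;

enum { toy_maxdepth = 64 };    /* arbitrary limit standing in for the stack check */

bool toy_evaluatetree(const toynode *node, int depth, long *result);

/* Evaluate the operands first if the operation needs them, then dispatch. */
bool toy_evaltree(const toynode *node, int depth, long *result) {
    long v1 = 0, v2 = 0;
    if (node->nodetype != toy_constop) {
        if (!toy_evaluatetree(node->params[0], depth + 1, &v1)) return false;
        if (!toy_evaluatetree(node->params[1], depth + 1, &v2)) return false;
    }
    switch (node->nodetype) {
        case toy_constop:    *result = node->nodeval; return true;
        case toy_addop:      *result = v1 + v2;       return true;
        case toy_multiplyop: *result = v1 * v2;       return true;
    }
    return false; /* unknown operation */
}

/* Wrapper that refuses to recurse too deeply before calling toy_evaltree. */
bool toy_evaluatetree(const toynode *node, int depth, long *result) {
    if (node == NULL || depth > toy_maxdepth) return false;
    return toy_evaltree(node, depth, result);
}

/* Chain through the statement list; result holds the last statement's value. */
bool toy_evaluatelist(const toynode *head, long *result) {
    assert(result != NULL);
    for (; head != NULL; head = head->link)
        if (!toy_evaluatetree(head, 0, result)) return false;
    return true;
}
```

Note how every level returns a boolean and stops as soon as a lower level reports failure, which is the same convention the kernel functions follow.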
The universal kernel representation of a UserTalk value is a tyvaluerecord struct. [see lang.h]
The struct has a “valuetype” field of type “tyvaluetype” to indicate the type of the value, e.g. whether it's a string, a number, or a date. The definition of “tyvaluetype” [in lang.h] lists all known types (it's an enum).
The actual value data is contained in the “data” field of type “tyvaluedata” (it's a union). If the value can be represented as a four-byte integer, it will usually be contained in the “data” field itself. Otherwise, the “data” field will be a handle to the actual data. Whether the “data” field contains a handle or not is determined by the “valuetype” field.
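The combination of a type tag and a union can be sketched as follows. This is a simplified illustration rather than the definition from lang.h: the sk_ names, the three value types, and the union members are assumptions, and a malloc'ed pointer stands in for a handle. Only the field names valuetype and data are taken from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { sk_longtype, sk_booleantype, sk_stringtype } skvaluetype;

typedef union {
    long longvalue;    /* small values live in the field itself */
    int flvalue;
    char *stringvalue; /* larger values live in separately allocated memory */
} skvaluedata;

typedef struct {
    skvaluetype valuetype; /* decides how data must be interpreted */
    skvaluedata data;
} skvaluerecord;

/* Only the type tag tells us whether data refers to allocated memory. */
int sk_ownsmemory(const skvaluerecord *v) {
    return v->valuetype == sk_stringtype;
}

skvaluerecord sk_newlong(long x) {
    skvaluerecord v;
    v.valuetype = sk_longtype;
    v.data.longvalue = x;
    return v;
}

/* Build a record holding a private copy of s (NULL data if allocation fails). */
skvaluerecord sk_newstring(const char *s) {
    skvaluerecord v;
    size_t len;
    assert(s != NULL);
    len = strlen(s) + 1;
    v.valuetype = sk_stringtype;
    v.data.stringvalue = malloc(len);
    if (v.data.stringvalue != NULL)
        memcpy(v.data.stringvalue, s, len);
    return v;
}

/* Release any allocated data and reset the record to a harmless number. */
void sk_disposevalue(skvaluerecord *v) {
    if (sk_ownsmemory(v))
        free(v->data.stringvalue);
    v->valuetype = sk_longtype;
    v->data.longvalue = 0;
}
```

Code that reads the union without consulting the tag first would misinterpret the bits, which is why the type field has to be checked before the data field is touched.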
There are also a couple of status flags that are only used during runtime, i.e. they are never saved to the database file on disk.
The fltmpdata flag gets set when you call copyvaluerecord (in langvalue.c) to create a copy of a window-based value (also known as an external value in kernel-speak). It indicates that the handle in the data field is still the handle to the original value. Later, when you actually want to assign the value to another variable or ODB cell, you first need to call copyvaluedata to duplicate the contents of the handle. The idea is to defer the actual copying until we are certain that the copying will be needed. This is what made passing external values as direct parameters or return values practical rather than always having to pass the value's address. (See comments at the top of copyvaluerecord and copyvaluedata for details.)
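The deferred-copy idea behind fltmpdata can be illustrated with a small sketch. It only mirrors the two-step pattern described above and makes no claim about how copyvaluerecord and copyvaluedata are actually written: the lazyvalue type and the lazy_ functions are hypothetical, and a plain pointer plus a size stand in for a handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *data;     /* pointer standing in for a handle */
    size_t size;    /* number of bytes behind data */
    bool fltmpdata; /* true while data still belongs to the original value */
} lazyvalue;

/* Cheap copy: share the data and remember that it is only borrowed. */
lazyvalue lazy_copyrecord(const lazyvalue *orig) {
    lazyvalue copy = *orig;
    copy.fltmpdata = true;
    return copy;
}

/* Real copy, made only once the value is about to be stored somewhere. */
bool lazy_copydata(lazyvalue *v) {
    char *fresh;
    assert(v->data != NULL);
    if (!v->fltmpdata)
        return true; /* already owns its data, nothing to do */
    fresh = malloc(v->size ? v->size : 1);
    if (fresh == NULL)
        return false;
    memcpy(fresh, v->data, v->size);
    v->data = fresh;
    v->fltmpdata = false;
    return true;
}
```

If the copy is only ever read and then discarded, lazy_copydata is never called and the potentially large data is never duplicated.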
The fltmpstack flag gets set when you push a value onto the temp stack. (See next section for details.)
The flformalval flag has something to do with the initialization of optional function parameters, but I've never needed to figure out how this works exactly. (See langaddlocalsymbols and langaddfuncparams in langvalue.c for details.)
The fldiskval flag is set when the “data” field points to the address of the block in the database file containing the actual value. (See “Database Structure” and “Database File Format” for more on this.)
Finally, a value gets a name either by assigning it to a table that is part of the ODB hierarchy, or if it is a local variable, by assigning it to the local table set up by evaluatelist (see above). The variable name of course will be the name of the table element.
Local variables are linked into the local table set up by evaluatelist and as such they will be automatically disposed when evaluatelist disposes the local table after it has reached the end of a statement list.
However, besides local variables, lots of other values will be created while evaluating code trees. For example, when you are adding several values, e.g. strings, there will be intermediate result values that are not local variables. To simplify memory management for these temporary values, they can be pushed onto the so-called temp stack. [see langtmpstack.c]
The temp stack is linked to the local table which was set up by evaluatelist. When evaluatelist disposes of the local table, all the values on the temp stack will also be disposed automatically. Actually, evaluatelist will dispose everything on the temp stack once per iteration over the list of statements. So, once you have pushed a value onto the temp stack, you no longer need to call disposevaluerecord on it.
One thing to always keep in mind is that if you assign a temp value to a table (either local or global), it's your responsibility to first exempt it from the temp stack. Otherwise, the value will be disposed automatically at some point and you end up with a stale handle in the table (and possibly a spectacular crash later on).
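The push, exempt, and release cycle can be sketched like this. It is a toy model, not the code in langtmpstack.c: the tmpstack type, the tmp_ functions, and the fixed capacity are assumptions, and the stack holds raw malloc'ed blocks instead of value records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { tmp_capacity = 32 }; /* arbitrary size for this sketch */

typedef struct {
    void *items[tmp_capacity];
    int count;
} tmpstack;

/* Register a heap block so that tmp_releaseall will free it later. */
bool tmp_push(tmpstack *ts, void *p) {
    assert(p != NULL);
    if (ts->count >= tmp_capacity)
        return false;
    ts->items[ts->count++] = p;
    return true;
}

/* Remove p from the stack; from now on the caller owns it. */
bool tmp_exempt(tmpstack *ts, void *p) {
    for (int i = 0; i < ts->count; i++) {
        if (ts->items[i] == p) {
            ts->items[i] = ts->items[--ts->count];
            return true;
        }
    }
    return false;
}

/* Free everything still on the stack, e.g. after a statement has run. */
void tmp_releaseall(tmpstack *ts) {
    while (ts->count > 0)
        free(ts->items[--ts->count]);
}
```

A value that is stored in a table without calling the exempt step first would be freed by the next release pass, leaving exactly the kind of stale reference warned about above.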
There's lots of existing code that deals with these issues. A reasonable starting point for learning more about this might be stringverbs.c or langhtml.c.
Pretty much any function that is involved in evaluating UserTalk code returns a boolean. Conventionally, a return value of true means success. False either means that a script error occurred or that the current thread has been killed, e.g. because the user pressed cmd-period or the escape key.
If the return value is false, the kernel looks at the global fllangerror variable to differentiate between script errors and killed scripts.
If you are writing new kernel code and you encounter an error condition that isn't already handled at a lower level, it's critically important to call langerrormessage [in langerror.c] to set the fllangerror flag and to prepare an error message. If you just returned false and the error was not handled at a lower level, you would soon find scripts relying on that code dying silently without any explanation.
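The rule "report first, then return false" looks roughly like this in practice. The sketch is self-contained and uses stand-ins throughout: demo_fllangerror, demo_errortext, demo_errormessage, and demo_divide are hypothetical names, and the real langerrormessage may well take different parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's error flag and prepared message. */
bool demo_fllangerror = false;
char demo_errortext[128] = "";

/* Plays the role of langerrormessage: store the message and set the flag. */
void demo_errormessage(const char *msg) {
    snprintf(demo_errortext, sizeof demo_errortext, "%s", msg);
    demo_fllangerror = true;
}

/* A made-up verb: returns false on failure, but only after reporting why. */
bool demo_divide(long a, long b, long *result) {
    assert(result != NULL);
    if (b == 0) {
        demo_errormessage("Can't divide by zero.");
        return false;
    }
    *result = a / b;
    return true;
}
```

A caller that sees false can then inspect the flag: if it is set, a script error was reported; if it is not, the script was killed for some other reason.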
The default behavior of langerrormessage is to prepare a scripterror window to be displayed to the user. This behavior can be overridden through some callbacks; take a look at evaluatetry and langtryerror in langevaluate.c (used for try statements) or langruntraperror and langtraperror in lang.c (used from langhtml.c for trapping and reporting macro errors).
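Overriding error reporting through a callback can be sketched with a function pointer that is swapped for the duration of a call and restored afterwards. This only illustrates the general pattern and is an assumption about the approach, not a rendering of evaluatetry or langruntraperror: every cb_ name is hypothetical, and two character buffers stand in for the script error window and the tryerror variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef void (*errorcallback)(const char *msg);

char cb_windowtext[128] = ""; /* stands in for the script error window */
char cb_tryerror[128] = "";   /* stands in for the tryerror variable */

/* Default handler: "display" the message to the user. */
void cb_showwindow(const char *msg) {
    snprintf(cb_windowtext, sizeof cb_windowtext, "%s", msg);
}

/* Trapping handler: quietly store the message for the script to inspect. */
void cb_capture(const char *msg) {
    snprintf(cb_tryerror, sizeof cb_tryerror, "%s", msg);
}

errorcallback cb_current = cb_showwindow;

/* All error reports go through whichever callback is currently installed. */
void cb_errormessage(const char *msg) {
    cb_current(msg);
}

/* Run body with errors redirected to cb_capture, then restore the old callback. */
bool cb_try(bool (*body)(void)) {
    errorcallback saved = cb_current;
    bool ok;
    assert(body != NULL);
    cb_current = cb_capture;
    ok = body();
    cb_current = saved;
    return ok;
}

/* A made-up body that always fails after reporting an error. */
bool cb_failingbody(void) {
    cb_errormessage("Something went wrong.");
    return false;
}
```

Saving and restoring the previous callback, rather than resetting to the default, keeps nested trapping scopes working.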