Friday, October 23, 2009

TNT macros: Variables (2)


In a previous entry I wrote about variable declaration in the TNT macro language. In this post I show the most important aspect of variables: how to access them.

Variables can be accessed in two ways: for reading, in which the value stored in the variable is retrieved, and for writing, in which a value is assigned to the variable.

General variables

In TNT, general variables (variables declared with the var keyword) are read by writing their name inside single quotes (').
quote The number stored on variable is 'value' ;
hold 'number' ;
In this example, the number stored in value is printed, and the number stored in number is used as the maximum number of trees to hold. This is an important feature of the language: any command parameter can be replaced by the value of a variable, which gives TNT macros extraordinary flexibility.

Sometimes, especially in the Linux version, the variable must be enclosed in parentheses to be read successfully:
hold ('number') ;
To access arrays, the same format is used, with an index inside square brackets ([]).
quote the value of the fourth element is  'vector [4]' ;
A more complex (and useful) form of this example is
quote the value of the 'i' element is 'vector ['i']' ;
Note that the variable i is also inside quotes. If you are familiar with a computer programming language, you will recognize the quotes as a way to "dereference" a variable.

This is an example for a multidimensional array
quote the value of the element 'i' , 'j' is 'mat ['i' , 'j']' ;
Using parentheses (()) you can use math operations as index values. Here is an example with a two-dimensional array, but it works exactly the same with any kind of array:
quote the value of the element 4, 'j' is 'mat [ (2 + 2 ) , 'j' ]' ;
Another option is the use of series
quote the value of the elements 3 to 8 is 'list [ 3 - 8 ]' ;
In this case, note that there are no parentheses around the series! If you add parentheses, TNT interprets it as a mathematical operation.

To write values to general variables, the keyword set is used, followed by the name of the variable to assign, without quotes, and then the assigned value
set val 4 ;           /* Assigns 4 to val */
set res (3 + 4) ; /* Assigns a math operation */
set num 'val' ; /* Assigns the content of val to num */
Note that because it is the value of val that is assigned to num, val must be inside quotes. Array elements can be assigned with the same format, one element at a time.
set vec [4] 5 ;                 /* Assigns 5 to element 4 of the array */
set arreglo [ (2 + 3) ] 8 ; /* Assigns 8 to element 5 of the array */
set arreglo ['i'] 10 ; /* Assigns 10 to element i of the array */
set arreglo [ ('j' + 'k') ] 7 ; /* Assigns 7 to element j + k of the array */
Note that variables used as indexes must be dereferenced!

Often, an operation on a variable just modifies its current value, for example increasing it by one
set val 'val' + 1 ;
It is clearer and more intuitive to modify the variable directly, without dereferencing it. This can be done using a C-like syntax
set val ++ ;              /* increases val by 1 */
set arr [3] -- ; /* decreases element 3 of arr by 1 */
set mat [2, 4] += 'val' ; /* adds the value of val to element 2, 4 of mat */
set num *= (3 + 'j') ; /* multiplies num by (3 + j) */
set arr [2] /= 3 ; /* divides element 2 of arr by 3 */
set val -= 4 ; /* decreases val by 4 */
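As a minimal sketch built only from statements already shown in this post (and assuming macros are already enabled in your session), the explicit form and the compound form below each add the value of num to val. The starting values 10 and 3 are arbitrary choices for illustration.

```
var:
val
num
;
set val 10 ;
set num 3 ;
set val ('val' + 'num') ;   /* explicit form: read val, add num, store the result */
set val += 'num' ;          /* compound form: same kind of update, val is not dereferenced */
quote val is now 'val' ;
```

Note that in both forms num is dereferenced, because it is its value that is used; only the variable being modified is written without quotes.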
In some cases, you want to store all the elements of an array at the same time. This can be done with the keyword setarray. It is important that the array dimensions coincide with the dimensions used in the declaration. The format of setarray is the dimensions of the array, followed by the name of the array and the elements.
var:
lista [5, 4]
;
setarray 5, 4 lista 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0 2 2 2 2 ;
In this example, the following matrix is assigned to a 5x4 array
0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0
2 2 2 2
The order of the elements follows the dimensions from left to right, so the first four elements assigned to the array are the array elements [0, 0], [0, 1], [0, 2] and [0, 3].
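One way to check this ordering yourself is to read back a couple of rows after the assignment. This is only a sketch that reuses the lista array from the example above together with the reading syntax shown earlier; the text after quote is arbitrary.

```
var:
lista [5, 4]
;
setarray 5, 4 lista 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0 2 2 2 2 ;
/* first row: the first four values of the list */
quote row 0 is 'lista [0, 0]' 'lista [0, 1]' 'lista [0, 2]' 'lista [0, 3]' ;
/* last row: the last four values of the list */
quote row 4 is 'lista [4, 0]' 'lista [4, 1]' 'lista [4, 2]' 'lista [4, 3]' ;
```

If the elements are filled in the order described, the first line should report 0 1 1 1 and the second 2 2 2 2.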

Variables in loops

Loop variables are usually control variables, so the best option is not to modify them: in principle, write the code as if loop variables were read-only. TNT can modify a loop variable if needed, but needing to do so usually reveals a design problem, and such modifications are not good practice.

Loop variables can be accessed using names or numbers. It is good practice to use names to identify loop variables.

To read a loop variable, the number or name of the loop must be preceded by the number sign (#). For example, to save the results of a simple search under several k values using implied weights [1]
loop =kval 1 10
keep 0 ;
piwe = #kval ;
mult ;
tsave* resu#kval.tre ; save; tsave /;
stop
As kval is the name of the first loop, it is also possible to use #1, and the command to save the tree would be
tsave* resu#1..tre ;
Note the double dot after #1: TNT interprets a dot that follows the number as a decimal point, so it takes the first dot as part of the loop reference rather than of the file name. In any case, it is better to use names ;). Loops are numbered according to their nesting level, starting with #1.
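To illustrate the numbering with nested loops, here is a small sketch. The inner loop name rep and the limits 1 to 3 are hypothetical, and it assumes that quote substitutes loop values the same way the other commands above do.

```
loop =kval 1 10
   loop =rep 1 3
      /* kval is the outer loop (#1) and rep the inner one (#2) */
      quote k value #kval replicate #rep ;
   stop
stop
```

With names, the quote line stays correct even if another loop is later wrapped around these two, whereas #1 and #2 would then point to different loops.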

To modify a loop variable, the keyword setloop is used; it changes the innermost loop (i.e. the loop in which the instruction is used). Because changing the value of a loop variable distorts the loop's sequence, you must watch out for infinite loops. This is what makes changing loop variables unsafe and not recommended. Sometimes, however, you need a loop that runs indefinitely and finishes only when a specified condition is met. In that case, using endloop coupled with setloop can be very useful
var:
i
;

set i 0 ;
loop =non 0 1
/* several instructions */
if ('i' == 1)
endloop ;
else
setloop 0 ;
end ;
stop
It is assumed that somewhere inside the loop, the value of i is changed to 1.

Command line variables

Command line variables are read-only, so we can only access their value. To access a command line variable, the percent sign (%) is used.
set j %1 ;
This assigns the first parameter from the command line to j (by the way, %0 is the name of the macro). Trying to read more parameters than were actually provided is an error and stops the execution. When you write a macro, keep in mind that command line parameters are the only way to communicate with the user, so it is necessary to choose the parameter order and default values carefully.
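Putting this together with the earlier hold example, a macro could take the maximum number of trees as its first parameter. This is only a sketch: treating the first parameter as a tree limit is an assumption made for illustration.

```
var:
j
;
set j %1 ;     /* first command line parameter, assumed here to be a number of trees */
hold 'j' ;     /* use the stored value as the maximum number of trees to hold */
```

Because the macro stops if %1 was not provided, it is worth documenting the expected parameter at the top of the macro file.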

As always, do not forget to keep your TNT copy updated. And check the TNT wiki or join the TNT Google group if you have any questions!

References
[1] Goloboff, P.A. 1993. Estimating character weights during tree search. Cladistics 9: 83-91. DOI: 10.1111/j.1096-0031.1993.tb00209.x

Previous post on TNT's macros
