
parsing precedence, in detail

A newline ends a statement if it can; otherwise it acts like any other white space.
i1 : 2+
     3+
     4

o1 = 9

Parsing is determined by a triple of numbers attached to each token: its parsing precedence, its binary binding strength, and its unary binding strength. The following table, produced by the command seeParsing, displays these numbers for each token.

parsing     binary    unary                                                
precedence  binding   binding                    operators
            strength  strength

     0                                            )  ]  }                  

     1          0                                    ;                     

     2          2         2                          ,                     

     3                    3                 do  else  list  then           

     4          3                          ->  :=  <-  =  =>  >>           

     5                    5                from  in  of  to  when          

     6          6         6                          <<                    

     7          6                                   ===>                   

     8          7         8                          |-                    

     9          8                                   <==>                   

    10          9                                   ==>                    

    11         10                                    or                    

    12         11                                   and                    

    13                   13                         not                    

    14         13                             !=  =!=  ==  ===             

    14         13        14                   <  <=  >  >=  ?              

    15         15                                    ||                    

    16         15                                    :                     

    17         17                                    |                     

    18         18                                    ^^                    

    19         19                                    &                     

    20         20                                    ..                    

    21         21                                    ++                    

    21         21        21                         +  -                   

    22         22                                    **                    

    23                    0                          [                     

    24         23                                  \  \\                   

    24         24                                 %  /  //                 

    24         24        24                          *                     

    25         24                                    @                     

    26                                              (*)                    

    26                    0                         (  {                   

    26                    3     catch  if  shield  time  timing  try  while

    26                    5       break  continue  for  new  return  throw 

    26                   31                global  local  symbol           

    26         25                                  SPACE                   

    26         26        26                      <SYMBOLS>                 

    27         27                                    @@                    

    28                                               ~                     

    29         29                           #?  .  .?  ^  ^**  _           

    29         29        25                          #                     

    30                                               !                     
Here is the way these numbers work. The parser maintains a number called the current parsing level, or simply the level. The parser builds up an expression until it encounters an input token whose parsing precedence is less than or equal to the current level. The tokens preceding the offending token are then bundled into an expression and incorporated into the containing expression.

When an operator or token is encountered, its binding strength serves as the level for parsing the subsequent expression, unless the current level is higher, in which case the current level is kept.

Consider a binary operator such as *. The relationship between its binary binding strength and its parsing precedence determines whether a*b*c is parsed as (a*b)*c or as a*(b*c). When the parser encounters the second *, the current parsing level is equal to the binding strength of the first *. If the binding strength is less than the precedence, then the second * becomes part of the right-hand operand of the first *, and the expression is parsed as a*(b*c). Otherwise, the expression is parsed as (a*b)*c. In the table, * has binding strength 24 and precedence 24, so a*b*c is parsed as (a*b)*c.
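
The rule just described can be tried out in a small model. The following Python sketch is not the parser used by Macaulay2; it is a minimal illustration that assumes every expression starts with an identifier and contains only the binary operators listed. The names OPERATORS, parse, and show are invented for this sketch, and the numbers are copied from the table above.

```python
# Toy model of the level-based parsing rule (not Macaulay2's actual parser).
# Each entry is (parsing precedence, binary binding strength) from the table.
OPERATORS = {
    "=": (4, 3),
    "+": (21, 21),
    "*": (24, 24),
}

def parse(tokens, level=0):
    """Parse a list of tokens, consuming it from the front.

    Returns a bare identifier string or a nested tuple (op, left, right).
    """
    left = tokens.pop(0)  # assumption: an identifier starts every expression
    # Keep extending the expression until a token's precedence is <= level.
    while tokens and OPERATORS[tokens[0]][0] > level:
        op = tokens.pop(0)
        strength = OPERATORS[op][1]
        # The binding strength becomes the level for the right-hand operand,
        # unless the current level is already higher.
        right = parse(tokens, max(strength, level))
        left = (op, left, right)
    return left

def show(tree):
    """Render a parse tree with explicit parentheses."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({show(left)}{op}{show(right)})"

print(show(parse(list("a*b*c"))))  # ((a*b)*c)  strength 24 = precedence 24
print(show(parse(list("a=b=c"))))  # (a=(b=c))  strength 3 < precedence 4
print(show(parse(list("a+b*c"))))  # (a+(b*c))
```

In this model, an operator whose binding strength equals its precedence groups to the left, and one whose binding strength is smaller groups to the right, which matches the description above.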

For unary operators, the unary binding strength is used instead of the binary binding strength to reset the current level. The reason for having both numbers is that some operators can be either unary or binary, depending on the context. A good example is #, which binds as tightly as . when used as an infix operator, and binds as loosely as adjacency or function application when used as a prefix operator.
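
The difference between the two binding strengths of # can be seen in a similar toy model. The Python sketch below is an illustration only, assuming well-formed input; TABLE, parse, and show are names invented here, and the numbers are taken from the rows for *, . and # in the table above.

```python
# Toy model only. Entries are (precedence, binary strength, unary strength),
# with None where the table lists no unary binding strength.
TABLE = {
    "*": (24, 24, 24),
    ".": (29, 29, None),
    "#": (29, 29, 25),
}

def parse(tokens, level=0):
    """Parse a list of tokens, consuming it from the front."""
    token = tokens.pop(0)
    if token in TABLE:
        # Prefix use: the unary binding strength resets the level.
        unary = TABLE[token][2]
        left = (token, parse(tokens, max(unary, level)))
    else:
        left = token  # an identifier
    while tokens and TABLE[tokens[0]][0] > level:
        op = tokens.pop(0)
        # Infix use: the binary binding strength resets the level.
        binary = TABLE[op][1]
        left = (op, left, parse(tokens, max(binary, level)))
    return left

def show(tree):
    """Render a parse tree with explicit parentheses."""
    if isinstance(tree, str):
        return tree
    if len(tree) == 2:  # prefix operator
        return f"({tree[0]}{show(tree[1])})"
    op, left, right = tree
    return f"({show(left)}{op}{show(right)})"

print(show(parse(list("a#b.c"))))  # ((a#b).c)  infix #, strength 29
print(show(parse(list("#a.b"))))   # (#(a.b))   prefix #, strength 25
print(show(parse(list("#a*b"))))   # ((#a)*b)   * has precedence 24 < 25
```

In this model, prefix # absorbs a following . (precedence 29) into its operand but stops before * (precedence 24), while infix # groups exactly like the . operator.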

In an expression like b c d, no token is present that can serve as the binary operator. To handle this, after parsing b the parser sets the level to 1 less than the precedence of an identifier, so that b c d is parsed as b (c d).

The comma and semicolon get special treatment: the empty expression can occur to the right of the comma or semicolon or to the left of the comma.

One of the most unusual aspects of the parsing precedence table above is that [ is assigned a precedence several steps lower than the precedence of symbols and adjacency, and also lower than the precedence of /. This was done so expressions like R/I[x] would be parsed according to mathematical custom, but it implies that expressions like f g [x] will be parsed in a surprising way, with f g being evaluated first, even if f and g are both functions. Suitably placed parentheses can help, as illustrated in the next example.

i2 : f = x -> (print x; print)

o2 = f

o2 : FunctionClosure
i3 : f f [1,2,3]
f
[1, 2, 3]
i4 : f f ([1,2,3])
[1, 2, 3]
print

o4 = print

o4 : FunctionClosure
i5 : f (f [1,2,3])
[1, 2, 3]
print

o5 = print

o5 : FunctionClosure
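
The grouping of adjacency and of [ seen in this session can be reproduced in a toy model. The Python sketch below is not Macaulay2's parser: the names are invented for illustration, tokens must be separated by spaces, and the handling of [ (parsing its contents at level 0 up to the matching ]) is an assumption based on its unary binding strength of 0 in the table.

```python
# Toy model only. Numbers are copied from the table above.
BINARY = {"/": (24, 24), "*": (24, 24)}  # (precedence, binding strength)
SYMBOL_PRECEDENCE = 26   # <SYMBOLS> row
ADJACENCY_STRENGTH = 25  # SPACE row: 1 less than the precedence of a symbol
BRACKET_PRECEDENCE = 23  # [ row

def precedence(token):
    """Look up the parsing precedence of the next token."""
    if token == "]":
        return 0  # a closing bracket always ends the current expression
    if token == "[":
        return BRACKET_PRECEDENCE
    if token in BINARY:
        return BINARY[token][0]
    return SYMBOL_PRECEDENCE  # anything else is treated as an identifier

def parse(tokens, level=0):
    """Parse a list of tokens, consuming it from the front."""
    left = tokens.pop(0)  # assumption: an identifier starts every expression
    while tokens and precedence(tokens[0]) > level:
        token = tokens[0]
        if token == "[":
            tokens.pop(0)
            inside = parse(tokens, 0)  # assumption: [ resets the level to 0
            tokens.pop(0)              # discard the matching ]
            left = ("[", left, inside)
        elif token in BINARY:
            tokens.pop(0)
            right = parse(tokens, max(BINARY[token][1], level))
            left = (token, left, right)
        else:
            # Adjacency: no operator token is consumed.
            right = parse(tokens, max(ADJACENCY_STRENGTH, level))
            left = (" ", left, right)
    return left

def show(tree):
    """Render a parse tree with explicit parentheses."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    if op == "[":
        return f"{show(left)}[{show(right)}]"
    return f"({show(left)}{op}{show(right)})"

print(show(parse("b c d".split())))        # (b (c d))
print(show(parse("R / I [ x ]".split())))  # (R/I)[x]
print(show(parse("f g [ x ]".split())))    # (f g)[x]
```

Because [ has precedence 23, below both / (24) and adjacency (26), the model finishes R/I and f g before attaching the bracket, which is consistent with the output of i3 above.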