string_expression

SUBSTITUTION SHORTCUTS IN STRING EXPRESSION
===========================================

Shortcuts have the following forms:

    \<control character>, or
    $<variable name><variable delimiter> (explained in variables_help.txt), or
    "" (a doubled double quote, which stands for one literal double quote).

    Examples:

       \n
    is a shortcut for line feed, with
    <control character> = "n".

    "xyz$my_variable xyz"
       $           - variable escape character
       my_variable - <variable name>
       " "         - <variable delimiter>, a space
    "xyz""xyz"     - this is the string:
                     xyz"xyz


Shortcuts can be substituted at one of three times:

   immediate - when the schema is parsed;
   early     - at parent event time;
   late      - for a string-expression in an action, when the action happens;
               for a string-expression in an event-action, when the event-action is fired;
               for a string-expression in an event-pattern, when the event-pattern
                   is about to be searched by the Tokenizer.

Only event-actions preceded by the escape character "+"
have early substitution, which is described in
early_and_late_event_actions_examples_help.txt.


In all other cases, shortcuts are substituted late, except for the following:

 control characters with immediate binding:

   web aware:
   _     &nbsp;
   >     &gt;
   <     &lt;
   "     &quot;
   &     &amp;
   |     <br />
   /     <br /> CR LF
     
   pure text:
   l     CR LF
   r     CR 
   n     LF 
   t     TAB 
   z     chr(0)
   N     LF LF

   \code
     code consists of three decimal digits (the character code);
     example: \066 stands for "B".

End of immediate substitution characters.
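
The immediate substitutions listed above can be sketched in Python as a lookup table plus a three-digit code rule. This is only an illustration, not the parser's actual code: the names IMMEDIATE and substitute_immediate are hypothetical, and the "/" entry assumes that "<br /> CR LF" means the tag followed by CR LF.

```python
# Hypothetical table built from the lists above; not the engine's own code.
IMMEDIATE = {
    "_": "&nbsp;", ">": "&gt;", "<": "&lt;", '"': "&quot;", "&": "&amp;",
    "|": "<br />",
    "/": "<br />\r\n",   # assumption: tag followed by CR LF
    "l": "\r\n", "r": "\r", "n": "\n", "t": "\t", "z": "\0", "N": "\n\n",
}


def substitute_immediate(text):
    """Replace \\X and \\ddd shortcuts with immediate binding; keep the rest."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            code = text[i + 1:i + 4]
            # \ddd: exactly three decimal digits give the character code
            if len(code) == 3 and all(c in "0123456789" for c in code):
                out.append(chr(int(code)))
                i += 4
                continue
            if text[i + 1] in IMMEDIATE:
                out.append(IMMEDIATE[text[i + 1]])
                i += 2
                continue
        # anything else (for example \L or \R) is left for late substitution
        out.append(text[i])
        i += 1
    return "".join(out)
```

Shortcuts without immediate binding, such as \L, pass through unchanged in this sketch.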


                
Control characters (non-immediate binding)

      \c      - comma ",";
      \p      - period ".";
      \s      - space, ASCII 32



    These shortcuts obey the general order rule
    unformat-unescape-process (split the argument into its fields,
    unescape each field, then apply the operation):

      \L      - lower case of the following characters;
      \U      - upper case of the following characters;
      \H      - reverses the string ("horizontally");
      \h      - minimum HTML encoding;
      \D      - "decoding";
      \d      - reverse of the \D operation;
      \Z      - replaces chr(0) with \z;
      \S      - size (length) of the string;
      
      \f \F   - "face" trimmers;
      \g \G   - "ground" trimmers;
      \w \W   - both-side trimmers;

      requires one parameter:
      \R      - replacement

      requires one parameter:
      \P      - position of substring in the string;
      \I      - inverse position of substring in the string;

      requires one parameter:
      \B      - beginning of string;
      \E      - end of string;

      requires one, two, or more parameters:
      \M      - middle of string;
      \e      - extend string;

      \T      - described in "disk_variables_help.txt" in Variables folder.

      \?      - conditional substitution
      \#      - double unescape 

  With more details:

     \L<string>  - <string> is unescaped first, then lower-cased.
     \U<string>  - <string> is unescaped first, then upper-cased.

     \S  syntax: \S<string>
         1. unescape <string>;
         2. return the length of <string>.

     \f \F \g \G \w \W  syntax: \X<string>

         - a lower-case letter selects soft trimming;
         - an upper-case letter selects strong trimming (CR and LF are trimmed too);
         - f,F - beginning of string, "face";
         - g,G - end of string, "ground";
         - w,W - both ends, "wipe" or "white characters".
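
The six trimmers can be sketched in Python as follows. The function name trim is hypothetical, and the sketch assumes that soft trimming removes spaces and TABs only. The text above states only that strong trimming also covers CR and LF.

```python
SOFT = " \t"        # assumption: soft trimming removes spaces and TABs
STRONG = " \t\r\n"  # strong trimming removes CR and LF as well


def trim(op, s):
    """Apply one of the trimmers f, F, g, G, w, W to an unescaped string."""
    chars = SOFT if op.islower() else STRONG
    side = op.lower()
    if side == "f":               # "face": beginning of string
        return s.lstrip(chars)
    if side == "g":               # "ground": end of string
        return s.rstrip(chars)
    if side == "w":               # both ends
        return s.strip(chars)
    raise ValueError("unknown trimmer: " + op)
```

For example, trim("g", " a \n") leaves the string unchanged because the soft trimmer stops at the LF, while trim("G", " a \n") returns " a".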

     \R syntax:
        \R<searchee>.<replacement>,<replacee>
    
        unformat:
         1. first, the period "." that ends <searchee> is searched,
         2. then, the comma that ends <replacement> is searched,
        unescape:
         3. strings <searchee>,<replacement>, and
            <replacee> are unescaped,
        process:
         4. occurrences of <searchee> in <replacee> are replaced
            with <replacement>.
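
A minimal Python sketch of the unformat-unescape-process order for \R, following the syntax line \R<searchee>.<replacement>,<replacee>. The names unescape and replace_shortcut are hypothetical, and unescape is a stand-in that handles only \c, \p and \s.

```python
def unescape(s):
    """Stand-in for the real unescape step: handles \\c, \\p and \\s only."""
    return s.replace("\\c", ",").replace("\\p", ".").replace("\\s", " ")


def replace_shortcut(arg):
    """Evaluate the argument that follows \\R."""
    # unformat: first period, then first comma after it
    searchee, rest = arg.split(".", 1)
    replacement, replacee = rest.split(",", 1)
    # unescape each field separately
    searchee = unescape(searchee)
    replacement = unescape(replacement)
    replacee = unescape(replacee)
    # process
    return replacee.replace(searchee, replacement)
```

Because unformatting happens before unescaping, a comma or period inside a field has to be written as \c or \p. For instance, the argument \c.\p,1,2,3 replaces every comma in "1,2,3" with a period.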

     \P \I syntax: "\X<searchee>,<where_to_search>"; X is P or I;

           1. unformat: the string <searchee>,<where_to_search> is parsed to find
                        the first comma from the left;
           2. unescape: strings <searchee> and <where_to_search> are unescaped;
           3. process:  search
                          for P, from left to right;
                          for I, from right to left;
                        and return a number, the position in <where_to_search>;
                        the first position has number 1;
                        0 indicates that <searchee> is absent from
                            <where_to_search>.
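
The search step can be sketched in Python as below. The function name position is hypothetical and the unescape step is omitted. The sketch also assumes that \I reports the position of the rightmost match counted from the left, as VB's InStrRev does. The text above does not state how the inverse position is counted.

```python
def position(op, arg):
    """Evaluate the argument that follows \\P or \\I (unescaping omitted)."""
    searchee, where = arg.split(",", 1)   # unformat at the first comma
    if op == "P":
        found = where.find(searchee)      # left to right
    elif op == "I":
        found = where.rfind(searchee)     # right to left (assumed counting)
    else:
        raise ValueError("unknown operator: " + op)
    return found + 1                      # 1-based; 0 means "absent"
```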

     \B \E \M  syntax: 

           "\Blength,string"; 
           "\Elength,string";
           "\Mstart[.length],string";  

           For \X<substring>, where X is M, B, or E:
             1. unformat: parse <substring> into the tokens
                length (or start[.length] for M) and string;
             2. unescape: the tokens are unescaped;
             3. process:  operator M, B, or E is applied to string.

       By analogy, B, E, and M correspond to VB's Left, Right, and Mid.
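
The three substring operators can be sketched in Python as follows. The function name substring is hypothetical and the unescape step is omitted. The sketch assumes that start is 1-based, by analogy with VB and with the positions returned by \P.

```python
def substring(op, arg):
    """Evaluate the argument that follows \\B, \\E or \\M (unescaping omitted)."""
    head, string = arg.split(",", 1)       # unformat at the first comma
    if op == "B":                          # beginning, like VB Left
        return string[:max(int(head), 0)]
    if op == "E":                          # end, like VB Right
        n = max(int(head), 0)
        return string[-n:] if n > 0 else ""
    if op == "M":                          # middle, like VB Mid
        start, _, length = head.partition(".")
        tail = string[max(int(start), 1) - 1:]   # assumption: start is 1-based
        return tail[:max(int(length), 0)] if length else tail
    raise ValueError("unknown operator: " + op)
```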

    \e<number>[.seed][,tail_string]
          Unformat first.
          Unescape number, seed, and tail_string.
          The result is the string seed seed ... seed tail_string,
          where seed is repeated <number> times.
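
A Python sketch of \e under stated assumptions. The function name extend is hypothetical and unescaping is omitted. The default seed of one space is only a guess, since the value used when seed is omitted is not given here.

```python
def extend(arg, default_seed=" "):
    """Evaluate the argument that follows \\e (unescaping omitted)."""
    head, _, tail = arg.partition(",")     # unformat: optional ,tail_string
    number, dot, seed = head.partition(".")  # optional .seed
    if not dot:
        seed = default_seed                # assumption: not stated in this file
    return seed * int(number) + tail
```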

    \H<string> - reverses <string> ("horizontally").

    \h<string> - minimum HTML encoding:
            & - &amp;
            > - &gt;
            < - &lt;
            " - &#34;

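The \h table can be sketched in Python as four replacements. The function name html_min_encode is hypothetical. The ampersand is replaced first so that the entities produced by the later steps are not encoded a second time.

```python
def html_min_encode(s):
    """Minimum HTML encoding as listed for \\h."""
    return (s.replace("&", "&amp;")    # must come first
             .replace(">", "&gt;")
             .replace("<", "&lt;")
             .replace('"', "&#34;"))
```
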
    \D<string> - "decoding", which replaces:
                 \       --> \\
                 CRLF    --> \l
                 LF      --> \n
                 CR      --> \r
                 chr(0)  --> \z

    \d<string> - "undecoding": the reverse of the \D operation.
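
The \D and \d pair can be sketched in Python as below. The names to_escaped and from_escaped are hypothetical. In the \D direction the backslash is doubled first and CRLF is handled before the single LF and CR, so that the pair becomes \l and not \r\n.

```python
def to_escaped(s):
    """\\D: make control characters visible as two-character shortcuts."""
    return (s.replace("\\", "\\\\")
             .replace("\r\n", "\\l")
             .replace("\n", "\\n")
             .replace("\r", "\\r")
             .replace("\0", "\\z"))


_REVERSE = {"\\": "\\", "l": "\r\n", "n": "\n", "r": "\r", "z": "\0"}


def from_escaped(s):
    """\\d: reverse of to_escaped, scanning left to right."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s) and s[i + 1] in _REVERSE:
            out.append(_REVERSE[s[i + 1]])
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)
```

The reverse direction scans the string once instead of chaining replacements, so a doubled backslash followed by "n" is read as a backslash and the letter n, not as LF.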


    \?        \?string1,string2,string3,string4[,string5]
              Unformat first.  
              Unescape strings.
              If string1 = string2, the result is
                 string3 & string5
              otherwise the result is
                 string4 & string5
              Any of the strings can be empty.
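
A Python sketch of \?, with unescaping omitted. The function name conditional is hypothetical. The sketch assumes that fields are split at the first four commas, so that string5 keeps any later commas, and that a missing field counts as empty.

```python
def conditional(arg):
    """Evaluate the argument that follows \\? (unescaping omitted)."""
    parts = arg.split(",", 4)            # unformat: at most five fields
    parts += [""] * (5 - len(parts))     # assumption: missing fields are empty
    s1, s2, s3, s4, s5 = parts
    return (s3 if s1 == s2 else s4) + s5
```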

    \#        \#string1[,string2]
              Unformat first.
              Unescape string1 and string2.
              Unescape string1 a second time.
              Concatenate to obtain result:
                 string1 & string2
              
              Note: The purpose of \# is to evaluate variables that
              contain dynamically generated string expressions.
              To use this method, the programmer should be aware of the internal
              format in which variables are kept in string expressions:
                 \$<variable name><space>
              (not $<variable name><space>).

                    



Examples:

    token.type.scope
      "\r".t

      "\013 $variable $second variable".t
      "\065 (065 stands for ""A"")"

      "first ""fragment"" contains double quotes".
 
      "second fragment is a multiline fragment"."fragment start
      fragment continued
      third line".fragment_three


      To include a variable in a string, a dollar sign "$" must prefix the variable name,
      and a space, CR, LF, or TAB must follow the variable name.
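
The rule above can be sketched in Python as a regular expression that finds variable references. The names VAR and variable_names are hypothetical, and the sketch assumes a variable name may contain any character except the four delimiters. It only lists the names, because this file does not say whether the delimiter is kept or consumed on substitution.

```python
import re

# "$", then a name, then a delimiter (space, TAB, CR or LF) that must follow
VAR = re.compile(r"\$([^ \t\r\n]+)(?=[ \t\r\n])")


def variable_names(text):
    """Return the names of all variable references found in text."""
    return VAR.findall(text)
```

A "$name" at the very end of a string, with no delimiter after it, is not treated as a variable reference in this sketch.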