SUBSTITUTION SHORTCUTS IN STRING EXPRESSION
===========================================
Shortcuts have three forms:
\<control character>, or
$<variable name><variable limiter> (explained in varialbles_help.txt), or
"" (a doubled quote, which stands for one literal quote).
Examples:
\n
is a shortcut for line feed, with
<control character> = "n".
"xyz$my_variable xyz"
$ - variable escape character
my_variable - <variable name>
" " - <variable limiter>, a space
"xyz""xyz" - this is the string:
xyz"xyz
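The doubled-quote rule can be illustrated with a minimal Python sketch. This is not the schema language itself; the function name unquote_literal is hypothetical, and the sketch assumes the literal is wrapped in one pair of double quotes and handles nothing but the "" shortcut.

```python
def unquote_literal(literal: str) -> str:
    """Strip the outer quotes and turn each doubled quote into a single one.

    Assumption: only the "" shortcut is handled here; \\ and $ shortcuts
    are left untouched.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("expected a string wrapped in double quotes")
    return literal[1:-1].replace('""', '"')


if __name__ == "__main__":
    print(unquote_literal('"xyz""xyz"'))
```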
There are three times when shortcuts can be substituted:
immediate - when the schema is parsed;
early - at parent event time;
late - for a string-expression in an action, when the action happens;
for a string-expression in an event-action, when the event-action is fired;
for a string-expression in an event-pattern, when the event-pattern
is to be searched by Tokenizer.
Only event-actions preceded by the escape character "+"
have early substitution, which is described in
early_and_late_event_actions_examples_help.txt.
In all other cases, shortcuts are substituted late, except for the following:
control characters with immediate binding:
web aware:
_   &nbsp;
>   &gt;
<   &lt;
"   &quot;
&   &amp;
|   <br />
/   <br /> CR LF
pure text:
l CR LF
r CR
n LF
t TAB
z chr(0)
N LF LF
\code
code has three decimal digits:
example: \066 stands for "B";
End of immediate substitution characters.
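As an illustration only, the pure-text table and the \code rule above can be sketched in Python. The function name substitute_immediate is hypothetical. The sketch assumes that a backslash followed by exactly three decimal digits is a character code, and it ignores the web-aware set and any escaping of the backslash itself.

```python
import re

# Pure-text control characters with immediate binding, taken from the table above.
PURE_TEXT = {
    "l": "\r\n",  # CR LF
    "r": "\r",    # CR
    "n": "\n",    # LF
    "t": "\t",    # TAB
    "z": "\0",    # chr(0)
    "N": "\n\n",  # LF LF
}


def substitute_immediate(text: str) -> str:
    """Replace \\ddd codes and pure-text control characters in one pass."""
    def repl(match: re.Match) -> str:
        token = match.group(1)
        if token.isdigit():
            # Three decimal digits: the character with that code.
            return chr(int(token))
        return PURE_TEXT[token]

    return re.sub(r"\\([0-9]{3}|[lrntzN])", repl, text)


if __name__ == "__main__":
    print(repr(substitute_immediate(r"\066 and\ttab\l")))
```

Other shortcuts, such as \c or \U, are left untouched by this sketch because they are not immediate.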
Control characters (non-immediate binding)
\c - comma ",";
\p - period ".";
\s - space, ASCII 32
These shortcuts obey the general order rule:
unformat-unescape-process:
\L - lower case of the following characters;
\U - upper case of the following characters;
\H - reverses the string ("horizontally");
\h - minimum HTML encoding;
\D - "decoding";
\d - reverse of the \D operation;
\Z - replacing: chr(0) --> \z
\S - size of string;
\f \F - "face" trimmers
\g \G - "ground" trimmers
\w \W - both side trimmers
requires one parameter:
\R - replacement
requires one parameter:
\P - position of substring in the string;
\I - inverse position of substring in the string;
requires one parameter:
\B - beginning of string;
\E - end of string;
requires one, two, or more parameters:
\M - middle of string;
\e - extend string;
\T - described in "disk_variables_help.txt" in Variables folder.
\? - conditional substitution
\# - double unescape
With more details:
\L<string> - <string> is unescaped first, then lower-cased.
\U<string> - <string> is unescaped first, then upper-cased.
\S syntax: \S<string>
1. unescape <string>;
2. return the length of <string>.
\f \F \g \G \w \W syntax: \X<string>
- lower case corresponds to soft trimming;
- upper case corresponds to strong trimming (including CR and LF);
- f, F - beginning of string, "face";
- g, G - end of string, "ground";
- w, W - both ends, "wipe" or "white characters";
\R syntax:
\R<searchee>.<replacement>,<replacee>
unformat:
1. first, <space> (which is " ") is searched,
2. then, comma is searched,
unescape:
3. strings <searchee>,<replacement>, and
<replacee> are unescaped,
process:
4. occurrences of <searchee> are replaced with
<replacement> in <replacee>.
\P \I syntax: "\X<searchee>,<where_to_search>"; X is P or I;
1. unformat: the string <searchee>,<where_to_search> is parsed to find
the first comma from the left;
2. unescape: strings <searchee> and <where_to_search> are unescaped;
3. process: search
for P, from left to right;
for I, from right to left;
and return a number, which is the position in <where_to_search>;
the first position has number 1;
returning 0 indicates the absence of <searchee> in
<where_to_search>.
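The search steps can be sketched in Python as follows. This is a hedged illustration, not the actual implementation. The function name position is hypothetical and the unescape step is omitted. For \I the sketch assumes that the returned position is still counted from the left (similar to VB's InStrRev), which the text above does not state explicitly.

```python
def position(expr: str, inverse: bool = False) -> int:
    """Sketch of \\P (inverse=False) and \\I (inverse=True).

    expr has the form "<searchee>,<where_to_search>".
    """
    # Unformat: split at the first comma from the left.
    searchee, _, where = expr.partition(",")
    # Process: search left to right for P, right to left for I.
    index = where.rfind(searchee) if inverse else where.find(searchee)
    # str.find returns -1 when absent, so +1 gives 0 for "not found"
    # and a 1-based position otherwise.
    return index + 1


if __name__ == "__main__":
    print(position("b,abcabc"), position("b,abcabc", inverse=True))
```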
\B \E \M syntax:
"\Blength,string";
"\Elength,string";
"\Mstart[.length],string";
For \X<substring>, where X is M, B, or E:
1. unformat: parse <substring> for the
<start[.length],string> tokens;
2. unescape: start[.length],string are unescaped;
3. process: operators M, B, E are applied to "string".
There is an analogy: B, E, M correspond to VB's Left, Right, and Mid.
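The analogy with Left, Right, and Mid can be made concrete with a small Python sketch. The function names are hypothetical and the unescape step is omitted. The sketch assumes that start is 1-based, as in VB's Mid, and that omitting length means "to the end of the string"; neither detail is confirmed by the text above.

```python
def beginning(expr: str) -> str:
    """Sketch of \\B: expr is "length,string"."""
    length, _, string = expr.partition(",")
    return string[:int(length)]


def end(expr: str) -> str:
    """Sketch of \\E: expr is "length,string"."""
    length, _, string = expr.partition(",")
    n = int(length)
    # string[-0:] would return the whole string, so handle 0 explicitly.
    return string[-n:] if n > 0 else ""


def middle(expr: str) -> str:
    """Sketch of \\M: expr is "start[.length],string" (start assumed 1-based)."""
    spec, _, string = expr.partition(",")
    start, dot, length = spec.partition(".")
    first = int(start) - 1
    if dot:
        return string[first:first + int(length)]
    return string[first:]


if __name__ == "__main__":
    print(beginning("3,abcdef"), end("2,abcdef"), middle("2.3,abcdef"))
```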
\e<number>[.seed][,tail_string]
Unformat first.
Unescape number, seed, and tail_string.
Produces string = seed seed .... seed tail_string,
where the string seed is repeated <number> times.
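A minimal Python sketch of the same rule follows. The function name extend is hypothetical and the unescape step is omitted. The text does not say what is repeated when seed is omitted; the sketch uses a single space purely as a placeholder assumption.

```python
def extend(expr: str) -> str:
    """Sketch of \\e: expr is "<number>[.seed][,tail_string]"."""
    # Unformat: the tail starts after the first comma,
    # the seed after the first period of the head.
    head, _, tail = expr.partition(",")
    number, dot, seed = head.partition(".")
    if not dot:
        seed = " "  # assumption: default seed is not documented
    # Process: repeat the seed <number> times, then append the tail.
    return seed * int(number) + tail


if __name__ == "__main__":
    print(extend("3.ab,!"))
```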
\H<string> - reverses the string ("horizontally").
\h<string> - minimum HTML encoding:
& - &amp;
> - &gt;
< - &lt;
" - &quot;
\D<string> - "decoding": replacing:
\ --> \\
CRLF --> \l
LF --> \n
CR --> \r
chr(0) --> \z
\d<string> - "undecoding": the reverse of the \D operation;
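The \D and \d pair can be sketched in Python as a round trip. The function names decode and undecode are hypothetical, and the order in which the replacements are applied is an assumption chosen so that the two operations invert each other: the backslash is doubled first, and CRLF is handled before a lone CR or LF.

```python
import re


def decode(text: str) -> str:
    """Sketch of \\D: make control characters visible as escapes."""
    return (
        text.replace("\\", "\\\\")   # \    --> \\  (must come first)
            .replace("\r\n", "\\l")  # CRLF --> \l  (before lone CR / LF)
            .replace("\n", "\\n")    # LF   --> \n
            .replace("\r", "\\r")    # CR   --> \r
            .replace("\0", "\\z")    # chr(0) --> \z
    )


_UNDECODE = {"\\": "\\", "l": "\r\n", "n": "\n", "r": "\r", "z": "\0"}


def undecode(text: str) -> str:
    """Sketch of \\d: a single left-to-right pass that reverses decode()."""
    return re.sub(r"\\([\\lnrz])", lambda m: _UNDECODE[m.group(1)], text)


if __name__ == "__main__":
    sample = "a\\b\r\nc\nd\re\0"
    print(decode(sample))
    print(undecode(decode(sample)) == sample)
```

A single regular-expression pass is used for undecode so that a doubled backslash followed by "n" is restored as a backslash and the letter n, not as a line feed.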
\? \?string1,string2,string3,string4[,string5]
Unformat first.
Unescape strings.
If string1 = string2, the result will be
string3 & string5;
otherwise, the result will be
string4 & string5.
Any of the strings can be empty.
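The rule can be sketched in Python as follows. The function name conditional is hypothetical and the unescape step is omitted. The sketch assumes that everything after the fourth comma belongs to string5, which the text above does not specify.

```python
def conditional(expr: str) -> str:
    """Sketch of \\?: expr is "string1,string2,string3,string4[,string5]"."""
    # Unformat: split at the first four commas only.
    parts = expr.split(",", 4)
    if len(parts) < 4:
        raise ValueError("expected at least four comma-separated strings")
    string1, string2, string3, string4 = parts[:4]
    string5 = parts[4] if len(parts) == 5 else ""
    # Process: choose string3 or string4, then append string5.
    return (string3 if string1 == string2 else string4) + string5


if __name__ == "__main__":
    print(conditional("a,a,yes,no,!"))
```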
\# \#string1[,string2]
Unformat first.
Unescape string1 and string2.
Unescape string1 second time.
Concatenate to obtain result:
string1 & string2
Note: The purpose of \# is to evaluate variables that
contain dynamically generated string expressions.
To use this method, the programmer should be aware of the internal
format used to keep variables in string expressions:
\$<variable name><space>
(not $<variable name><space>)
Examples:
token.type.scope
"\r".t
"\013 $variable $second variable".t
"\065 (065 stands for ""A"")"
"first ""fragment"" contains double quotes".
"second fragment is a multiline fragment"."fragment start
fragment continued
third line".fragment_three
To include a variable in a string, a dollar sign "$" must prefix the variable name,
and a space, CR, LF, or TAB must follow the variable name.