The spaces problem, part 1: the art of preserving spaces

Author: Saturn, 2006/10/24 20:28

It is well known that mIRC's scripting language handles spaces in a somewhat 'special' way: in many contexts, leading and trailing spaces are removed, and multiple consecutive spaces are reduced to a single space. This is part of what makes the language relatively easy for beginners to use. More advanced scripters, however, often find it a nuisance: things like displaying text exactly as it comes in are virtually impossible to do entirely correctly because of it.

What is less well known is that it is still possible to preserve spaces to a certain extent. This article describes the boundaries of space preservation within mIRC's scripting language. A follow-up article discusses workarounds and their disadvantages.

Where spaces are stripped

In general, there are only a few places in mIRC's script parsing system where spaces are actually lost:

  1. During tokenization of parameters for commands
  2. When tokenizing input passed to functions
  3. While parsing original, literal script lines


The first point is the major problem. Aside from the space-preserving commands listed in the following section, mIRC evaluates and then space-tokenizes the text following every script command, stripping the spaces in this process. This makes it impossible to call any command (again, bar the exceptions listed below) or alias with spaces preserved.
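
As a minimal sketch of the first point, the following compares a variable holding two consecutive spaces when it is measured directly and when it is passed to a custom alias called as a command. The alias names showlen and spacetest are made up for this illustration.

; hypothetical helper: reports the length of everything it was given
alias showlen { echo -ag $len($1-) }

alias spacetest {
  ; %x holds "a", two spaces, "b"
  var %x = $+(a,$chr(32),$chr(32),b)
  ; measured directly inside an identifier
  echo -ag $len(%x)
  ; passed through a command call, so the parameters are space-tokenized
  showlen %x
}

Going by the behaviour described above, /spacetest should print 4 for the direct measurement and 3 for the showlen call, as the two spaces collapse into one when the command line is tokenized.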

The second point only affects certain events and identifiers; for example, in the "on INPUT" event, the typed text is contained in $1-, but as $1- has been space-tokenized already, the leading/trailing/consecutive spaces are gone. Many events where text is involved fill $rawmsg with the original space-preserved string, but this is unfortunately not the case for "on INPUT". The $1- parameter in the 'command' parameter of the $findfile and $finddir identifiers also suffers from this problem.

The third point roughly means that it is impossible to put leading, trailing or multiple consecutive spaces literally in a script and have them preserved during the execution of the script. For example, the following command:

echo -ag $len(a      b)

will always result in "3", no matter how many spaces you put between the "a" and the "b". This extends to places that are less obvious: for example, $read(file.txt,n,1) will return the first line of file.txt with its spaces preserved (apart from leading spaces, as noted in the identifier table further on), whereas $read(file.txt,1) will consider the line to be script code and therefore strip spaces. Another less obvious case is evaluation brackets inside identifiers: whereas

echo -ag $len( $+($chr(32),$chr(32)) )

gives the result "2",

echo -ag $len( [ $+($chr(32),$chr(32)) ] )

gives the result "0". This happens because of the "pre-evaluating" behaviour of square brackets (explained in the relevant article by jaytea). In this case, the spaces inside the brackets are evaluated before mIRC attempts evaluation of $len itself. Thus when $len is then considered, the parser "sees"

$len(    )

so the third point applies.

Space-preserving commands

The following is an exhaustive list of built-in commands that do preserve spaces:

  • /set
  • /var
  • /returnex

The fact that /set and /var preserve spaces implies that it is possible to store and use text in variables without losing spaces. The reason for this behaviour is most likely that /set is evaluated in a different way from other commands, as the variable name in its line must not be evaluated to its contents. As all /var commands are internally rewritten as /set commands, it is no surprise that /var behaves exactly as /set does when it comes to spaces. Like /set and /var, the '=' assignment ("%x = foo") preserves spaces as well.

There is just one problem with all these assignment commands: if the given text contains exactly one single trailing space, this trailing space will be stripped off:

var %x = foo | echo -ag $len(%x) equals 3
var %x = foo $+ $chr(32) | echo -ag $len(%x) equals 3
var %x = foo $+ $chr(32) $+ $chr(32) | echo -ag $len(%x) equals 5
var %x = foo $+ $chr(32) $+ $chr(32) $+ $chr(32) | echo -ag $len(%x) equals 6

One method to get around this, as suggested by qwerty, is to add one or more characters after the actual variable contents when using /set or /var, and then to use $left(%var,..) instead of just %var to access the contents of the variable throughout the rest of the code. As the space is then no longer a trailing space, it will not be removed during the /set or /var call. More details about this method, as well as alternatives, will be described in an upcoming article.
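
A rough sketch of the padding method described above could look as follows. The alias name padtest and the choice of a single period as the padding character are only assumptions for the illustration.

alias padtest {
  ; "foo", one space, then a period as padding so the space is not trailing
  var %x = $+(foo,$chr(32),.)
  ; $left with a negative count drops the padding character again
  echo -ag $len($left(%x,-1))
}

If the quirk behaves as described, this should print 4 rather than 3, since the single space survives the /var call and the padding is only removed inside the identifier.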

Finally, the third command - /returnex - is relatively new; it was introduced in mIRC 6.2 as an undocumented tool to preserve spaces during internal evaluation of the subtext part of $regsubex calls. It is useful beyond this context as well, as it is simply a space-preserving version of /return, and can therefore be used to construct custom identifiers that preserve spaces.

Besides these three commands, it is possible to construct custom aliases for which the input has all spaces preserved, but these aliases must then be called as identifiers instead of as commands; typically one can use the /noop command to prefix them. For example, instead of:

myalias $rawmsg

One can use:

noop $myalias($rawmsg)

Obviously this affects the way the parameters are tokenized; all input will be in $1 instead of being space-tokenized over $1, $2 etcetera. The individual parameters will have all spaces preserved though, i.e. in this case, $1 in the myalias alias will contain exactly the same as $rawmsg does. More about such custom identifiers in the next section.
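
To illustrate the difference in tokenization, here is a small sketch that calls the same alias both ways. The alias names showparams and paramtest are hypothetical.

; hypothetical alias: reports the lengths of its first two parameters
alias showparams { echo -ag $len($1) and $len($2) }

alias paramtest {
  ; %x holds "a", two spaces, "b"
  var %x = $+(a,$chr(32),$chr(32),b)
  ; command call: %x is split over $1 and $2
  showparams %x
  ; identifier call: %x arrives whole in $1, and $2 is empty
  noop $showparams(%x)
}

Under the rules above, the command call should report "1 and 1", while the identifier call should report "4 and 0".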

Space-preserving identifiers

As far as built-in identifiers are concerned, in general spaces are fully preserved. As exceptions, out of the built-in string manipulation identifiers, at least the following are known not to fully preserve spaces:

Identifier   Problem
$noqt        Removes leading spaces after the first quote character
$strip       Removes leading spaces
$read        Removes leading spaces, even with the 'n' switch

It may be useful to know that the token functions do preserve all spaces as long as a token separator other than a space is used.
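
As a small sketch of the token functions with a non-space separator, the following uses a period (ASCII 46) to separate the tokens. The alias name toktest is made up for the example.

alias toktest {
  ; %x holds "a", two spaces, "b", a period, "c"
  var %x = $+(a,$chr(32),$chr(32),b,.,c)
  ; first period-separated token, which contains the two spaces
  echo -ag $len($gettok(%x,1,46))
}

If the token is indeed returned with its spaces intact, this should print 4.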

Constructing custom identifiers that preserve spaces is, although a little tricky, entirely possible: as indicated in the previous section, parameters passed to custom identifiers are not space-stripped, and with /returnex, space-preserved text can be returned to the caller. Using variables to store and use (part of) space-preserved text within the alias code is straightforward, as long as one is aware of the /set and /var quirk mentioned above.

As a simple example, the following custom identifier $align(text,N) will add spaces at the end of "text" so that the total string length is (at least) N characters:

alias align { returnex $1 $+ $str($chr(32),$calc($2 - $len($1))) }

This identifier could help align text for displaying it in a custom format.
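
A quick way to check the identifier above is to measure its result with $len rather than to display it. The alias name aligntest is hypothetical.

alias aligntest {
  ; "foo" padded with spaces to a total of 10 characters
  echo -ag $len($align(foo,10))
}

This should print 10. Passing the result of $align straight to /echo would lose the padding again, as the parameters of the command are space-tokenized.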

Space-preserving DLLs

However, to actually display text within mIRC, one needs the /echo or /aline commands, which obviously do not preserve spaces. This is slightly problematic, as in general, one very common reason for wanting to preserve all spaces is to display arbitrary incoming text in a custom way. Aside from the extremely inefficient way of converting the text to a binvar, writing the binvar to a file and then using /loadbuf to load the file into a window, there is no way of displaying text into a window while preserving all spaces.

On the upside, one can call DLLs using the $dll identifier, and as this built-in identifier preserves the spaces in its parameters when called, it allows for DLLs to provide some of the functionality that mIRC does not offer: for instance, there is spaces.dll for basic text displaying and sending, although this DLL hooks deeply into mIRC and therefore needs to be updated with every new mIRC version.

For dialogs, it is possible to use DLLs like MDX and DCX to add text to the dialogs in a space preserving way, although one would typically have to change the script call to the DLL to use only identifier calls (i.e. only custom identifiers instead of aliases, and $dll instead of /dll).

Conclusion

To recap: the main problem of mIRC's space stripping lies with the execution of commands, while the evaluation of identifiers (including by far most built-in identifiers) and the manipulation of variables can be done in a space-preserving way. Space-preserving custom identifiers can be constructed as well, although real work will have to be done through DLL calls.

Even though mIRC's way of dealing with spaces is restrictive enough that many things cannot be accomplished without outside help from DLLs, it is still fully possible to manipulate text in any way desired before passing it to such a DLL.

spaces.txt · Last modified: 2013/11/24 12:28 by saturn
 