libgfx: Simple File Format Scripting

The libgfx library provides a very simple scripting facility which is primarily intended to support parsing of input files and control scripts. To use this package, you must include the header file:

    #include <gfx/script.h>

For more information about this package, you may want to examine the sample scripted application.

File Structure

Scripts processed by this package are assumed to be composed of a sequence of lines. Each line is processed separately, in sequence, and falls into one of the following categories:

Lines containing only whitespace are ignored.
Lines whose first non-whitespace character is a '#' (hash mark) are treated as comments and are ignored.
All other lines are assumed to be a sequence of whitespace-separated tokens where the first token is a command name and subsequent tokens are arguments to this command.
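
The three rules above can be sketched in standard C++ as follows. This only illustrates the described behavior and is not the libgfx implementation; LineKind, classify_line and split_tokens are hypothetical names that do not appear in the library.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical names for illustration; not part of libgfx.
enum class LineKind { Blank, Comment, Command };

// Classify a line using the three categories listed above.
LineKind classify_line(const std::string &line)
{
    for (unsigned char c : line) {
        if (std::isspace(c)) continue;      // skip leading whitespace
        return c == '#' ? LineKind::Comment : LineKind::Command;
    }
    return LineKind::Blank;                 // empty or whitespace only
}

// Split a command line into whitespace-separated tokens.
// tokens[0] is the command name; the rest are its arguments.
std::vector<std::string> split_tokens(const std::string &line)
{
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string tok;
    while (in >> tok) tokens.push_back(tok);
    return tokens;
}
```

With these helpers, a line such as "  v 1.0 2.0" is classified as a command and splits into the command name "v" followed by two argument tokens.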

Processing Scripts

In order to process scripts, you must (at minimum) perform the following steps:

Instantiate a scripting environment of type CmdEnv.
Register your command(s) with CmdEnv::register_command() or CmdEnv::register_method().
Feed the text of the script to the parsing system.

The specifics of these steps are detailed in the following sections.

The Scripting Environment

The primary task of the scripting system is to map command name tokens into command procedures. These procedures, also referred to as "handlers", are responsible for actually performing the computation associated with a particular command. Handlers should conform to the following type definition:

    typedef int (*CmdHandler)(const CmdLine &cmd);

The CmdLine type manages the text of a particular command line, and provides various methods for parsing that information (see details below).

The CmdEnv class manages the mapping of command names to handlers. In particular, command names are mapped to pointers to objects derived from the base class CmdObject. All classes derived from CmdObject are required to override the virtual operator() invocation method. Because invocation goes through this virtual function, handler objects can encapsulate arbitrary data in their definition (i.e., they can act as closures). The standard scripting framework defines two kinds of CmdObject handlers: (1) the CmdFunction class to encapsulate normal functions and static methods and (2) the CmdMethod template class to encapsulate member functions.

New handler procedures (non-member functions) can be associated with names using the method:

    void register_command(const std::string& name, CmdHandler proc);

Member functions can be bound to names using the templated member function:

    template<class T>
    void register_method(const std::string& name,
                         T *obj,
                         int (T::*fn)(const CmdLine&));

Note that in both cases, any existing binding of name to another handler will be overwritten. Existing handlers can be located by name using the method:

    CmdObject *lookup_command(const std::string& name);

which returns NULL if no handler is bound to the given name.
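
The handler-object design described in this section can be modeled in a few lines of standard C++. The sketch below is a simplified stand-in, not the libgfx source: Line, Handler, FunctionHandler, MethodHandler and Env are hypothetical names playing the roles of CmdLine, CmdObject, CmdFunction, CmdMethod and CmdEnv, and the real classes in <gfx/script.h> may differ in detail.

```cpp
#include <map>
#include <memory>
#include <string>

// Stand-ins for illustration only; not the libgfx declarations.
struct Line { std::string text; };

struct Handler {                         // plays the role of CmdObject
    virtual ~Handler() {}
    virtual int operator()(const Line &cmd) = 0;
};

struct FunctionHandler : Handler {       // plays the role of CmdFunction
    int (*fn)(const Line &);
    explicit FunctionHandler(int (*f)(const Line &)) : fn(f) {}
    int operator()(const Line &cmd) override { return fn(cmd); }
};

template<class T>
struct MethodHandler : Handler {         // plays the role of CmdMethod
    T *self;                             // the object captured by the handler
    int (T::*fn)(const Line &);
    MethodHandler(T *obj, int (T::*f)(const Line &)) : self(obj), fn(f) {}
    int operator()(const Line &cmd) override { return (self->*fn)(cmd); }
};

class Env {                              // plays the role of CmdEnv
    std::map<std::string, std::unique_ptr<Handler>> table;
public:
    void register_command(const std::string &name, int (*proc)(const Line &))
    {
        table[name].reset(new FunctionHandler(proc));   // replaces any old binding
    }

    template<class T>
    void register_method(const std::string &name, T *obj,
                         int (T::*fn)(const Line &))
    {
        table[name].reset(new MethodHandler<T>(obj, fn));
    }

    // Returns a null pointer when no handler is bound to name.
    Handler *lookup_command(const std::string &name)
    {
        auto it = table.find(name);
        return it == table.end() ? nullptr : it->second.get();
    }
};

// Example handlers used to exercise the table.
int say_ok(const Line &) { return 0; }

struct Counter {
    int calls = 0;
    int bump(const Line &) { return ++calls; }
};
```

Registering say_ok under a name and then registering Counter::bump under the same name leaves only the second binding in place, and looking up an unregistered name yields a null pointer, mirroring the behavior described above.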

Submitting Text for Execution

Several procedures are available for submitting script text to the parsing system. All of them require a CmdEnv argument that will determine the mapping of command names to handlers.

The underlying method for parsing text is:

    int script_do_line(std::string &line, CmdEnv &env);
    int script_do_line(const char *line, CmdEnv &env);

It assumes that its input is a string consisting of a single line; any embedded newlines will be treated like any other whitespace. It will split this line into a series of whitespace-separated tokens, interpreting the first such token as a command name. If env provides a binding for this name, the appropriate handler will be called.

For convenience, the scripting package also provides the following methods:

    int script_do_stream(std::istream &in, CmdEnv &env);
    int script_do_file(const char *name, CmdEnv &env);
    int script_do_string(const char *str, CmdEnv &env);

They operate by extracting a single line from the input source and processing that line with script_do_line(). They repeat this line-by-line process until the file/stream/string has been exhausted. The first time the processing of a line fails with an error code, these procedures return this error code immediately without completing the processing of the rest of the input.
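
The line-by-line loop with early exit can be sketched as below. This is a model of the described behavior rather than the library code: do_stream, do_string and LineProc are hypothetical names, and the sketch assumes that a return value of 0 means success, which this document does not state explicitly.

```cpp
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for script_do_line(): any callable that takes one
// line of text and returns 0 on success or a nonzero error code (assumed).
typedef std::function<int(const std::string &)> LineProc;

// Feed a stream to do_line one line at a time, stopping at the first error.
int do_stream(std::istream &in, const LineProc &do_line)
{
    std::string line;
    while (std::getline(in, line)) {
        int rc = do_line(line);
        if (rc != 0) return rc;      // remaining lines are never processed
    }
    return 0;
}

// Convenience wrapper in the spirit of script_do_string().
int do_string(const std::string &text, const LineProc &do_line)
{
    std::istringstream in(text);
    return do_stream(in, do_line);
}
```

One consequence of this design is that a failing line in the middle of a script leaves every later line unexecuted, so callers that need to continue past errors would have to process lines individually.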

Writing Command Handlers

A command procedure is declared as follows:

    int proc(const CmdLine& cmd);

The CmdLine structure contains all the necessary data about the line being processed. It provides the following fundamental accessors:

    class CmdLine
    {
    public:
        const std::string &line;        // Raw text of the (complete) line
        std::string opname() const;     // Name of the command being invoked
        std::string argline() const;    // Argument string
        int argcount() const;           // Number of argument tokens
    };

The argument string returned by argline() is unparsed except that whitespace following the command name and trailing whitespace at the end of the line have been removed.
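
To make the trimming rule concrete, here is a minimal sketch of how such an argument string could be derived from the raw line. The helper arg_string is hypothetical and is not part of libgfx; it only reproduces the behavior described for argline().

```cpp
#include <string>

// Hypothetical helper: returns the text that follows the command name,
// without the whitespace after the name or at the end of the line.
std::string arg_string(const std::string &line)
{
    const char *ws = " \t\r\n";
    std::string::size_type start = line.find_first_not_of(ws);
    if (start == std::string::npos) return "";       // blank line
    std::string::size_type name_end = line.find_first_of(ws, start);
    if (name_end == std::string::npos) return "";    // command with no arguments
    std::string::size_type arg_start = line.find_first_not_of(ws, name_end);
    if (arg_start == std::string::npos) return "";   // only trailing whitespace
    std::string::size_type arg_end = line.find_last_not_of(ws);
    return line.substr(arg_start, arg_end - arg_start + 1);
}
```

Note that whitespace between arguments is preserved, so a handler that wants free-form text (for example a title or a file name containing spaces) can use the argument string directly instead of the pre-split tokens.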

It is up to the handler to parse the command line in whatever way it likes. However, the scripting system assumes that commands will be given whitespace-separated token lists. Therefore, it pre-computes the indices of these tokens in the command line text before invoking the handler. To access an individual token, you can use the CmdLine methods:

    std::string token_to_string(int i) const;
    double token_to_double(int i) const;
    float token_to_float(int i) const;
    int token_to_int(int i) const;

Tokens are numbered from 0 to argcount()-1. Note that, for efficiency, these methods do not perform range checking. It is up to the caller to verify that the given indices are valid.
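
Because the token accessors do not check their index, a handler should validate the argument count before reading any token. The sketch below shows that pattern with a stand-in type: Args is a hypothetical substitute for the parts of CmdLine used here, the "scale" command is invented for illustration, and returning nonzero to signal an error is an assumption.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for the parts of CmdLine used in this example.
struct Args {
    std::vector<std::string> tokens;                 // argument tokens only
    int argcount() const { return static_cast<int>(tokens.size()); }
    double token_to_double(int i) const              // no range check here either
    { return std::strtod(tokens[i].c_str(), nullptr); }
};

// State modified by the hypothetical "scale <factor>" command.
double g_scale = 1.0;

// Handler that rejects a wrong argument count before touching any token.
int cmd_scale(const Args &cmd)
{
    if (cmd.argcount() != 1) return 1;   // assumed error code
    g_scale = cmd.token_to_double(0);
    return 0;
}
```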

In addition to accessing single argument tokens, you can collect all argument tokens into lists with the following CmdLine methods:

    int collect_as_strings(std::vector<std::string> &v) const;
    int collect_as_numbers(std::vector<double> &v) const;
    int collect_as_numbers(std::vector<int> &v) const;

    int collect_as_numbers(double *v, int size) const;
    int collect_as_numbers(float *v, int size) const;
    int collect_as_numbers(int *v, int size) const;

These methods always return the number of tokens collected. The vector-based methods will always collect all available tokens. In contrast, those which accept raw arrays will either collect size or argcount() tokens, whichever is smaller.
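
The difference between the two collection styles can be illustrated with free functions that operate on a list of argument tokens that has already been split. These are hypothetical stand-ins, not the CmdLine members themselves; in particular, whether the real vector-based methods append to or replace the contents of v is not stated here, and this sketch appends.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>
#include <vector>

// Vector style: collects every token and returns how many were collected.
// Assumption: values are appended to v.
int collect_as_numbers(const std::vector<std::string> &tokens,
                       std::vector<double> &v)
{
    for (const std::string &t : tokens)
        v.push_back(std::strtod(t.c_str(), nullptr));
    return static_cast<int>(tokens.size());
}

// Raw-array style: collects min(size, number of tokens) values, so it can
// never write past the end of the caller's buffer.
int collect_as_numbers(const std::vector<std::string> &tokens,
                       double *v, int size)
{
    int n = std::min(size, static_cast<int>(tokens.size()));
    for (int i = 0; i < n; ++i)
        v[i] = std::strtod(tokens[i].c_str(), nullptr);
    return n;
}
```

Since the raw-array form may collect fewer values than the buffer holds, a handler that requires exactly size values should compare the return value against size before using the array.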

See the accompanying scripting example for more details on how to write command handlers.