Codegeneration

From Official Kodi Wiki
Revision as of 19:14, 12 August 2012 by >Jcarroll (→‎Typemaps - Return types)


An upcoming release of XBMC will replace the existing hand-coded Python interface with one whose code is generated as part of building XBMC. This approach has several advantages:

  1. More flexible and straightforward means of managing the API. The API will be defined in terms of a C++ library independent of the fact that it can be called from a scripting language.
  2. Provide a means of merging different APIs into a single API accessed through different means. It would be nice to define the API independent of the means of access.
  3. Tackle some longstanding stability issues resulting from the mixing of Python mechanisms with XBMC mechanisms. These are typically deadlocks.
  4. Smaller source codebase to manage.
  5. Provide a basis for supporting other scripting languages.

There are some distinct disadvantages.

  1. Object code size will be larger.
  2. As with any large and new module, there will be issues that need to be worked out.
  3. More work will be required to parse out the documentation and include it in the scripting language interface. This could be done with either a future SWIG release (a current GSoC project for SWIG aims to include documentation) or a separate (optional) doxygen pass.

How it works

[Figure: Xbmc-codegenerator.png]

In general, code generation runs in two phases. In the first phase, SWIG parses the C++ header files that define the API and outputs an XML file containing a complete description of the API's structure. In the second phase, a Groovy program ingests the XML file and creates the C++ code that forms the bridge to the scripting language (Python).

Referring to the figure above, the following steps contain a little more detail.

  1. A SWIG interface file is created per API module. The SWIG interface file is documented in the SWIG documentation. It is mainly a SWIG configuration file that includes each of the header files that is meant to be included in the given module. It is also used to insert code snippets into the generated code so that, for example, the appropriate dependencies can be '#include'-ed.
  2. SWIG is run (currently from the Makefile) and XML is output for the given module. The XML file contains a full description of all of the code included via the SWIG interface file.
  3. The XML file is ingested by a Groovy program in order to generate the final .cpp. Groovy has native XML parsing support as well as a built-in template processor, which makes it very powerful and flexible. When invoking the Groovy program, a template is supplied. The template drives the code-generation for a particular scripting language. It is essentially the rule set for constructing the language interface from the structured, parsed API representation in the XML. The result is a C++ file that contains the scripting language interface code.
  4. Finally the resulting C++ file is compiled into XBMC.

An Example

This is a walkthrough of a simple example to show how this works and how it can be extended in the future. Let's say we have the following simple API function in XBMC that we want to make available to Python. Since this is actually a piece of the current API's module 'XBMC,' let's say this is the complete contents of the file "ModuleXbmc.h":

    void log(const char* msg, int level = LOGNOTICE);

SWIG takes SWIG interface files as input. Interface files allow the SWIG user to direct the code generation at many different levels. The interface file has a C++-like syntax and must at least specify the module name and identify the include files that are to be parsed as part of the API. For the purposes of our simplified example, an AddonModuleXbmc.i interface file that only includes our "ModuleXbmc.h" would look like:

%module xbmc

%include "native/ModuleXbmc.h"

The SWIG interface file syntax is very rich and gives very fine-grained control over the code-generation and the parse tree that SWIG creates. For details, see the interface file tutorial and the complete SWIG documentation.

The first pass, the call to SWIG, generates a really ugly, but very detailed XML file. This file contains a lot of extraneous information that SWIG has attached to the parse tree, but the section we care about looks like this (it is shown only for completeness; even if you write your own templates, you don't need to know it):

    <include id="208" addr="82de5b50" >
        <attributelist id="209" addr="82de5b50" >
            <attribute name="name" value="AddonModuleXbmc.i" id="210" addr="82de68f0" />
            <attribute name="module" value="" id="211" addr="82de5bf0" />
            <attribute name="options" value="82de5ad0" id="212" addr="82de5ad0" />
        </attributelist >
        <module id="213" addr="82de5b90" >
            <attributelist id="214" addr="82de5b90" >
                <attribute name="name" value="xbmc" id="215" addr="82de68f0" />
            </attributelist >
        </module >
        <include id="216" addr="82de5c50" >
            <attributelist id="217" addr="82de5c50" >
                <attribute name="name" value="[absolute path to header]/ModuleXbmc.h" id="218" addr="82de68f0" />
            </attributelist >
            <cdecl id="219" addr="82de5fb0" >
                <attributelist id="220" addr="82de5fb0" >
                    <attribute name="name" value="log" id="221" addr="82de68f0" />
                    <attribute name="sym_symtab" value="82dd27f0" id="222" addr="82dd27f0" />
                    <attribute name="sym_nextSibling" value="82de61f0" id="223" addr="82de61f0" />
                    <attribute name="csym_nextSibling" value="82de61f0" id="224" addr="82de61f0" />
                    <attribute name="kind" value="function" id="225" addr="82de68f0" />
                    <attribute name="sym_name" value="log" id="226" addr="82de68f0" />
                    <attribute name="decl" value="f(p.q(const).char,int)." id="227" addr="82de68f0" />
                    <attribute name="sym_overloaded" value="82de5fb0" id="228" addr="82de5fb0" />
                    <parmlist id="229" addr="82de5db0" >
                        <parm id="230">
                            <attributelist id="231" addr="82de5db0" >
                                <attribute name="name" value="msg" id="232" addr="82de68f0" />
                                <attribute name="type" value="p.q(const).char" id="233" addr="82de68f0" />
                            </attributelist >
                        </parm >
                        <parm id="234">
                            <attributelist id="235" addr="82de5f10" >
                                <attribute name="name" value="level" id="236" addr="82de68f0" />
                                <attribute name="value" value="LOGNOTICE" id="237" addr="82de68f0" />
                                <attribute name="type" value="int" id="238" addr="82de68f0" />
                            </attributelist >
                        </parm >
                    </parmlist >
                    <attribute name="type" value="void" id="239" addr="82de68f0" />
                    <attribute name="sym_overname" value="__SWIG_0" id="240" addr="82de68f0" />
                </attributelist >
            </cdecl >
            <cdecl id="241" addr="82de61f0" >
                <attributelist id="242" addr="82de61f0" >
                    <attribute name="sym_name" value="log" id="243" addr="82de68f0" />
                    <attribute name="name" value="log" id="244" addr="82de68f0" />
                    <attribute name="decl" value="f(p.q(const).char)." id="245" addr="82de68f0" />
                    <parmlist id="246" addr="82de6150" >
                        <parm id="247">
                            <attributelist id="248" addr="82de6150" >
                                <attribute name="name" value="msg" id="249" addr="82de68f0" />
                                <attribute name="type" value="p.q(const).char" id="250" addr="82de68f0" />
                            </attributelist >
                        </parm >
                    </parmlist >
                    <attribute name="kind" value="function" id="251" addr="82de68f0" />
                    <attribute name="sym_overloaded" value="82de5fb0" id="252" addr="82de5fb0" />
                    <attribute name="defaultargs" value="82de5fb0" id="253" addr="82de5fb0" />
                    <attribute name="type" value="void" id="254" addr="82de68f0" />
                    <attribute name="sym_previousSibling" value="82de5fb0" id="255" addr="82de5fb0" />
                    <attribute name="sym_symtab" value="82dd27f0" id="256" addr="82dd27f0" />
                    <attribute name="sym_overname" value="__SWIG_1" id="257" addr="82de68f0" />
                </attributelist >
             
            </cdecl >
        </include >
    </include >

Ugly, eh? Well, it's too complicated and verbose to provide directly to a Groovy template so the generator transforms the XML prior to providing it to the template. The transformed XML can be obtained by running the Groovy generator script and passing the '-verbose' flag. For this example the transformed XML would look like:


<?xml version="1.0" encoding="UTF-8"?>
<module name="xbmc">
  <function name="log" sym_name="log" decl="f(p.q(const).char,int)." sym_overloaded="82de5fb0" type="void" id="82de5fb0">
    <parm name="msg" type="p.q(const).char"/>
    <parm name="level" value="LOGNOTICE" type="int"/>
  </function>
  <typetab/>
</module>

This is much simpler. Notice how this XML reflects the original C++ structure. It forms the input 'bindings' (in Groovy template parlance) provided to the template. A full tutorial on Groovy templates is out of scope here because they are, in and of themselves, Groovy scripts. They are closely analogous to JSPs in Java, but not specific to generating web pages.
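The flattening of SWIG's verbose XML into the compact form can be sketched in Python as follows. This is only an illustration of the idea: the real generator is a Groovy program whose implementation is not shown on this page, the helper names `attributes_of` and `flatten_function` are hypothetical, and the input is a trimmed fragment in the shape of the SWIG output above.

```python
import xml.etree.ElementTree as ET

# A trimmed fragment in the shape of SWIG's verbose XML (most attributes omitted).
VERBOSE = """
<cdecl id="219">
  <attributelist id="220">
    <attribute name="name" value="log"/>
    <attribute name="kind" value="function"/>
    <attribute name="type" value="void"/>
    <parmlist id="229">
      <parm id="230">
        <attributelist id="231">
          <attribute name="name" value="msg"/>
          <attribute name="type" value="p.q(const).char"/>
        </attributelist>
      </parm>
      <parm id="234">
        <attributelist id="235">
          <attribute name="name" value="level"/>
          <attribute name="value" value="LOGNOTICE"/>
          <attribute name="type" value="int"/>
        </attributelist>
      </parm>
    </parmlist>
  </attributelist>
</cdecl>
"""


def attributes_of(node):
    """Collect the <attribute name=... value=...> children of a node's attributelist."""
    attrlist = node.find("attributelist")
    if attrlist is None:
        return {}
    return {a.get("name"): a.get("value") for a in attrlist.findall("attribute")}


def flatten_function(cdecl):
    """Turn one verbose <cdecl> into a compact <function> with <parm> children."""
    attrs = attributes_of(cdecl)
    attrs.pop("kind", None)  # 'kind' is implied by the new element name
    function = ET.Element("function", attrs)
    for parm in cdecl.findall("attributelist/parmlist/parm"):
        ET.SubElement(function, "parm", attributes_of(parm))
    return function


function = flatten_function(ET.fromstring(VERBOSE))
print(ET.tostring(function, encoding="unicode"))
```

The point of the sketch is the shape change: each `attributelist` collapses into ordinary XML attributes on its parent, which is what lets a template address values as `@name` or `@type`.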

The Template

Trivial Example

As an example, let's suppose, rather than generate C++, our template simply provides information about the API. If our template looks like this:

Module Name: ${module.@name}

Then the resulting output file would look like:

Module Name: xbmc

Notice that we directly reference the XML element by name in the template. Use the '@' sign to reference an attribute of that element. The template is interpreted as a Groovy GString and therefore anything within the ${ ... } has its value, as a string, substituted.

We can access children of the element as a list (since child elements can be repeated). This combined with Groovy's support for closures makes the templating very powerful. Let's modify the template to print out all of the function names in the module.

Module Name: ${module.@name}
<%
module.function.each { 
%>
  function: ${it.@name}
<%
}
%>

Note that anything within <% ... %> can be straight Groovy code with direct access to the parse tree. In this case we apply a closure to 'each' function in the module. Within that closure we exit the <% ... %> brackets and here, whatever we type ends up in the output file. The above template results in:

Module Name: xbmc

  function: log

Let's extend the template to provide parameter information.

Module Name: ${module.@name}
<%
module.function.each { functionNode ->
%>
  function: ${functionNode.@name}
<%
  functionNode.parm.eachWithIndex { param, index ->
%>
    parameter ${index}= name:${param.@name}, type:${param.@type}<% if (param.@value) { %>, default value: ${param.@value} <% } %>
<%
   }
}
%>

Now, within the closure for each function, we're providing a closure that's applied over each parameter. The resulting output file now contains:

Module Name: xbmc

  function: log

    parameter 0= name:msg, type:p.q(const).char

    parameter 1= name:level, type:int, default value: LOGNOTICE 
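For readers less familiar with Groovy, the same traversal can be written as a small Python sketch. It is not part of the generator; it simply walks a copy of the transformed XML (with a few attributes omitted) and builds the same kind of report, and the function name `describe` is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

# A reduced copy of the transformed XML shown earlier.
SIMPLIFIED = """<module name="xbmc">
  <function name="log" type="void">
    <parm name="msg" type="p.q(const).char"/>
    <parm name="level" value="LOGNOTICE" type="int"/>
  </function>
</module>"""


def describe(module):
    """Return report lines for every function and parameter in the module."""
    lines = [f"Module Name: {module.get('name')}"]
    for function in module.findall("function"):
        lines.append(f"  function: {function.get('name')}")
        for index, parm in enumerate(function.findall("parm")):
            line = (f"    parameter {index}= name:{parm.get('name')}, "
                    f"type:{parm.get('type')}")
            # Only parameters with a default carry a 'value' attribute.
            if parm.get("value") is not None:
                line += f", default value: {parm.get('value')}"
            lines.append(line)
    return lines


report = describe(ET.fromstring(SIMPLIFIED))
print("\n".join(report))
```

The nested loops correspond to the nested closures in the template: the outer one visits each function node and the inner one visits each of its parameters.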

Type information

You may have noticed the type information for the parameters above. The type information is generated from the SWIG type system. This type system is very powerful and most of it has been 'transliterated' into Groovy and is accessible to the template. Reading the type information is fairly straightforward and documented in the SWIG documentation on the Type System.

The type system includes the means of representing types that can be used on the left hand side of the assignment operator. These types are called 'lvalues'. We can modify the template to show the 'type' information both as an "lvalue" and straight as follows:

<%
import SwigTypeParser
%>
Module Name: ${module.@name}
<%
module.function.each { functionNode ->
%>
  function: ${functionNode.@name}
<%
  functionNode.parm.eachWithIndex { param, index ->
%>
    parameter ${index}= name:${param.@name}, type:${param.@type}<% if (param.@value) { %>, default value: ${param.@value} <% } %>
                 type:${SwigTypeParser.SwigType_str(param.@type)}
                 lvalue-type:${SwigTypeParser.SwigType_lstr(param.@type)}
<%
   }
}
%>

Access to the Type System is through the class SwigTypeParser. Therefore the "SwigTypeParser" needs to be imported. SwigType_str and SwigType_lstr convert a type string to a representative C++ type in either a straightforward manner, or as its corresponding 'lvalue' type, respectively.

Module Name: xbmc

  function: log

    parameter 0= name:msg, type:p.q(const).char
                 type:char const *
                 lvalue-type:char *

    parameter 1= name:level, type:int, default value: LOGNOTICE 
                 type:int 
                 lvalue-type:int

Notice, an instance of the 'lvalue-type' can be assigned an instance of the 'type.' This is the correct way to handle 'const'-ness in parameters.
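To show how a SWIG type string is read, here is a deliberately simplified Python sketch. It handles only the pointer (`p.`) and qualifier (`q(...)`) constructors that appear in this example, the names `swig_type_str` and `swig_type_lstr` are hypothetical stand-ins rather than the Groovy SwigTypeParser methods, and the exact spacing of the real output may differ.

```python
def swig_type_str(swig_type, keep_qualifiers=True):
    """Render a SWIG type string such as 'p.q(const).char' as C++ text.

    The string reads left to right as "pointer to const char", so the
    constructors are applied to the base type from right to left.
    """
    *constructors, base = swig_type.split(".")
    text = base
    for constructor in reversed(constructors):
        if constructor == "p":
            text += " *"
        elif constructor.startswith("q(") and constructor.endswith(")"):
            if keep_qualifiers:
                text += " " + constructor[2:-1]
        else:
            # References, arrays, functions and so on are out of scope here.
            raise ValueError(f"unsupported constructor: {constructor!r}")
    return text


def swig_type_lstr(swig_type):
    """Render the assignable ('lvalue') form by dropping qualifiers."""
    return swig_type_str(swig_type, keep_qualifiers=False)


print(swig_type_str("p.q(const).char"))   # char const *
print(swig_type_lstr("p.q(const).char"))  # char *
print(swig_type_str("int"))               # int
```

Dropping the qualifier is what makes the lvalue form usable for a local variable declaration that is assigned after it is declared.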

Typemaps - Handling Parameters

The Groovy template has access to "typemaps" for the conversion of input parameters and return values. These are completely analogous to SWIG Typemaps but are actually mini Groovy templates. They contain snippets that generalize how to convert various types to and from the scripting language. An "in typemap" maps parameter values from the scripting language to the API types. An "out typemap" maps types returned from the API to the scripting language.

In this example an "in typemap" is used to map the 'log' input parameters from a hypothetical scripting language to the API values.

Typemap handling needs to be initialized by calling the 'setup' method on the "Helper" Groovy class (which, of course, needs to be imported).

import Helper

Helper.setup(null,null,null,
        [ 'p.q(const).char':'${api} = convertSlString(${slarg});',
          'int':'${api} = convertSlInt(${slarg});'],
        null)

A Map of type to "in typemap" is the fourth parameter to the "Helper.setup()" method. The key is the 'type' string ('ltype' also works). In the example above the typemap has only two entries (given that our example only has two parameter types).

The "in typemap" has a set of predefined 'bindings' (or variable values) that it has access to. Each of the typemaps above uses:

  1. api - is a string containing the name of the variable that will be passed to the API class.
  2. slarg - is the variable name holding the value that was passed in from the scripting language that needs to be converted and stored in 'api'.

If the scripting language is Python then the 'slarg' variable's type will be a "PyObject*". In our example this value is just passed to the function 'convertSl*()'.
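The way a typemap's bindings are filled in can be mimicked with Python's `string.Template`, which happens to use the same `${name}` placeholder syntax. This is a minimal sketch of the substitution only: the real typemaps are Groovy templates and can contain logic, and `in_conversion` is a hypothetical stand-in for the Helper method, without its function-node argument.

```python
from string import Template

# The same two "in typemaps" as in the Helper.setup() call above.
IN_TYPEMAPS = {
    "p.q(const).char": "${api} = convertSlString(${slarg});",
    "int": "${api} = convertSlInt(${slarg});",
}


def in_conversion(swig_type, api, slarg):
    """Look up the typemap for a type and substitute its two bindings."""
    try:
        snippet = IN_TYPEMAPS[swig_type]
    except KeyError:
        raise KeyError(f"no in typemap for type {swig_type!r}") from None
    return Template(snippet).substitute(api=api, slarg=slarg)


print(in_conversion("p.q(const).char", "p_msg", "sl_msg"))
print(in_conversion("int", "p_level", "sl_level"))
```

Keying the map by type string means one snippet serves every parameter of that type, whichever function it belongs to.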

The Template needs to then use the typemap functionality through the "Helper" class. The method "Helper.getInConversion()" is used from the template to execute the typemap for a given parameter. Augmenting the script so far gives:

<%
import SwigTypeParser
import Helper

Helper.setup(null,null,null,
        [ 'p.q(const).char':'${api} = convertSlString(${slarg});',
          'int':'${api} = convertSlInt(${slarg});'],
        null)
%>
Module Name: ${module.@name}
<%
module.function.each { functionNode ->
%>
  function: ${functionNode.@name}
<%
  functionNode.parm.eachWithIndex { param, index ->
%>
    parameter ${index}= name:${param.@name}, type:${param.@type}<% if (param.@value) { %>, default value: ${param.@value} <% } %>
                 type:${SwigTypeParser.SwigType_str(param.@type)}
                 lvalue-type:${SwigTypeParser.SwigType_lstr(param.@type)}
    code to handle parameter ${index} {
      // declare and set the value that came in from the scripting language
      ScriptingLanguageType sl_${param.@name} = /* set the value from the scripting language */;
      // declare and set the variable that will contain the api parameter
      ${SwigTypeParser.SwigType_lstr(param.@type)} p_${param.@name};
      ${Helper.getInConversion(param.@type,'p_' + param.@name,'sl_' + param.@name, functionNode)}
    }
<%
   }
}
%>

Now, for each parameter (in each function - though there is only one in our example), a section is printed out that shows how that parameter would be converted from the scripting language to the api type given the provided "in typemaps." The results are:

Module Name: xbmc

  function: log

    parameter 0= name:msg, type:p.q(const).char
                 type:char const *
                 lvalue-type:char *
    code to handle parameter 0 {
      // declare and set the value that came in from the scripting language
      ScriptingLanguageType sl_msg = /* set the value from the scripting language */;
      // declare and set the variable that will contain the api parameter
      char * p_msg;
      p_msg = convertSlString(sl_msg);
    }

    parameter 1= name:level, type:int, default value: LOGNOTICE 
                 type:int 
                 lvalue-type:int 
    code to handle parameter 1 {
      // declare and set the value that came in from the scripting language
      ScriptingLanguageType sl_level = /* set the value from the scripting language */;
      // declare and set the variable that will contain the api parameter
      int  p_level;
      p_level = convertSlInt(sl_level);
    }

Typemaps - Return types

As you might have suspected by now, the "out typemap" is for converting return values. The Map of "out typemaps" is the second parameter of the "Helper.setup()" method. In this case, the "return" type (as such) is 'void.' Languages like Python always have a return so an API call that's 'void' still needs to return a Py_None instance to the Python caller. In a case like this the following typemap makes sense:

        [ 'void': 'Py_INCREF(Py_None); ${result} = Py_None;' ],

Here, we can see one of the bindings used in the "out typemap." The value of 'result' holds the name of the variable that will be returned to the scripting language. Using the "out typemap" via the "Helper.getOutConversion()":

    code to handle return value {
      // This is an example of how Python handles return values
      PyObject* result;
      ${Helper.getOutConversion(Helper.getReturnSwigType(functionNode),'result',functionNode)}
      return result;
    }

Notice, the "Helper" class contains a convenience method (Helper.getReturnSwigType()) for extracting the return type from the functionNode of the parsed XML. Adding these two snippets to the correct places in the template gives a generated output file that contains:

    code to handle return value {
      // This is an example of how Python handles return values
      PyObject* result;
      Py_INCREF(Py_None); result = Py_None;
      return result;
    }
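Seen from the Python side, the 'void' out typemap gives a generated binding the same behaviour as any Python function that has no return statement. The short sketch below uses a plain Python function named `log` as a stand-in; it is not the real xbmc module.

```python
def log(msg, level=0):
    """Stand-in for a binding whose C++ counterpart returns void."""
    # Nothing is returned explicitly, as with a 'void' API call.


result = log("hello")
print(result is None)  # True: the caller still receives an object, None
```

This is why the typemap has to produce Py_None, with its reference count incremented, rather than leaving the result unset.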

Calling the API method

Of course, the API method invocation needs to be output from the template logic. There is a convenience method on the "Helper" class for this. Adding the following to the template will output the method invocation:

    code to invoke the api method {
      ${Helper.callingName(functionNode)}( <% 
          functionNode.parm.eachWithIndex { param, i -> 
            %> p_${param.@name}${i < functionNode.parm.size() - 1 ? "," : ""} <% } %> );
    }

which would put the following into the output file:

    code to invoke the api method {
      log(  p_msg,  p_level  );
    }
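The argument list assembly can also be expressed as a join over the parameter names, which avoids the explicit last-element check. The following Python sketch assumes the `p_` naming convention used above; `call_expression` is a hypothetical name, not a Helper method.

```python
def call_expression(function_name, parm_names):
    """Build the C++ call statement from the converted parameter variables."""
    args = ", ".join(f"p_{name}" for name in parm_names)
    return f"{function_name}({args});"


print(call_expression("log", ["msg", "level"]))  # log(p_msg, p_level);
```

A function with no parameters falls out naturally, since joining an empty sequence gives an empty argument list.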

The final Template and Output

For reference then, this Template:

<%
import SwigTypeParser
import Helper

Helper.setup(null,
        [ 'void': 'Py_INCREF(Py_None); ${result} = Py_None;' ],
        null,
        [ 'p.q(const).char':'${api} = convertSlString(${slarg});',
          'int':'${api} = convertSlInt(${slarg});'],
        null)
%>
Module Name: ${module.@name}
<%
module.function.each { functionNode ->
%>
  function: ${functionNode.@name}

<%
  functionNode.parm.eachWithIndex { param, index ->
%>
    parameter ${index}= name:${param.@name}, type:${param.@type}<% if (param.@value) { %>, default value: ${param.@value} <% } %>
                 lvalue:${SwigTypeParser.SwigType_lstr(param.@type)}
                 rvalue:${SwigTypeParser.SwigType_str(param.@type)}
    code to handle parameter ${index} {
      // declare and set the value that came in from the scripting language
      ScriptingLanguageType sl_${param.@name} = /* set the value from the scripting language */;
      // declare and set the variable that will contain the api parameter
      ${SwigTypeParser.SwigType_lstr(param.@type)} p_${param.@name};
      ${Helper.getInConversion(param.@type,'p_' + param.@name,'sl_' + param.@name, functionNode)}
    }
<%
   }
%>
    code to invoke the api method {
      ${Helper.callingName(functionNode)}( <% 
          functionNode.parm.eachWithIndex { param, i -> 
            %> p_${param.@name}${i < functionNode.parm.size() - 1 ? "," : ""} <% } %> );
    }

    code to handle return value {
      // This is an example of how Python handles return values
      PyObject* result;
      ${Helper.getOutConversion(Helper.getReturnSwigType(functionNode),'result',functionNode)}
      return result;
    }
<%
}
%>

Gives this output with the simple ModuleXbmc.h file and AddonModuleXbmc.i file we provided earlier:

Module Name: xbmc

  function: log


    parameter 0= name:msg, type:p.q(const).char
                 lvalue:char *
                 rvalue:char const *
    code to handle parameter 0 {
      // declare and set the value that came in from the scripting language
      ScriptingLanguageType sl_msg = /* set the value from the scripting language */;
      // declare and set the variable that will contain the api parameter
      char * p_msg;
      p_msg = convertSlString(sl_msg);
    }

    parameter 1= name:level, type:int, default value: LOGNOTICE 
                 lvalue:int 
                 rvalue:int 
    code to handle parameter 1 {
      // declare and set the value that came in from the scripting language
      ScriptingLanguageType sl_level = /* set the value from the scripting language */;
      // declare and set the variable that will contain the api parameter
      int  p_level;
      p_level = convertSlInt(sl_level);
    }

    code to invoke the api method {
      log(  p_msg,  p_level  );
    }

    code to handle return value {
      // This is an example of how Python handles return values
      PyObject* result;
      Py_INCREF(Py_None); result = Py_None;
      return result;
    }