eed2682001-06-11David Norlin

+--------------------------------------+
| Pike autodoc markup - the XML format |
+--------------------------------------+

======================================================================
a) Introduction
----------------------------------------------------------------------

When a piece of documentation is viewed in human-readable format, it
has gone through the following states:

  1. Doc written in comments in source code (C or Pike).

  2. A lot of smaller XML files, one for each source code file.

  3. A big chunk of XML, describing the whole hierarchy.

  4. A repository of smaller and more manageable XML files.

  5. An HTML page rendered from one such file.
     (Or a PDF file, or whatever).

The transition from state 1 to state 2 is the extraction of
documentation from source files. There are several (well, at least
two) markup formats, and there are occasions where it is handy to
generate documentation automatically &c.

This document describes how a file in state 2 should be structured in
order to be handled correctly by subsequent passes and presented in a
consistent manner.

======================================================================
b) Overall structure
----------------------------------------------------------------------

Each source file adds some number of entities to the whole
hierarchy. It can contain a class or a module. It can contain an empty
module, that has its methods and members defined in some other source
file, and so on. Suppose we have a file containing documentation for
the class Class in the module Module. The XML skeleton of the file
would then be:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

The <module name=""> refers to the top module.
That element, and its child <module name="Module">, exist only to put
the <class name="Class"> in its correct position in the hierarchy. So
we can divide the elements in the XML file into two groups: skeletal
elements and content elements.

Each actual module/class/whatever in the Pike hierarchy maps to at
most one content element; however, it can map to any number of
skeletal elements. For example, the top module is mapped to a skeletal
element in each XML file extracted from a single source file. To get
from state 2 to state 3 in the list above, all XML files are merged
into one big file. All the elements that a module or class maps to are
merged into one, and if one of those
elements contains documentation (=is a content element), then that
documentation becomes a child of the merger of the elements.

======================================================================
c) Grouping
----------------------------------------------------------------------

Classes and modules always appear as <module> and <class>
elements. Methods, variables, constants &c, however, can be grouped in
the source code:

  //! Two variables:
  int a;
  int b;

Even a single variable is considered as a group with one
member. Continuing the example in the previous section, suppose that
Module.Class has two member variables, a and b, that are documented as
a group:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <docgroup homogen-type="variable">
          <variable name="a"><type><int/></type></variable>
          <variable name="b"><type><int/></type></variable>
          <doc>
            ... documentation for Module.Class.a and Module.Class.b ...
          </doc>
        </docgroup>
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

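As an illustration only, the grouped skeleton can be inspected with a
few lines of Python and the standard xml.etree.ElementTree module.
Python is not part of the Pike autodoc tool chain; the helper name
group_members is hypothetical, and the sample doc text (wrapped in
<text><p>, as in the inherit example in section d) is made up to
replace the "..." placeholders:

```python
import xml.etree.ElementTree as ET

# The grouping example, with the "..." placeholders replaced by
# made-up sample text so that the document parses.
SKELETON = """
<module name="">
  <module name="Module">
    <class name="Class">
      <docgroup homogen-type="variable">
        <variable name="a"><type><int/></type></variable>
        <variable name="b"><type><int/></type></variable>
        <doc><text><p>Two variables:</p></text></doc>
      </docgroup>
      <doc><text><p>The class Module.Class.</p></text></doc>
    </class>
  </module>
</module>
"""


def group_members(docgroup):
    """Return (tag, name) for every child of a <docgroup> except <doc>."""
    return [(child.tag, child.get("name"))
            for child in docgroup if child.tag != "doc"]


root = ET.fromstring(SKELETON)            # the top module
group = root.find("./module/class/docgroup")
members = group_members(group)            # the grouped entities
shared_doc = group.findtext("doc/text/p") # doc shared by the group
```

Here members holds the two variables of the group and shared_doc the
single piece of documentation that applies to both of them.
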
If all the children of a <docgroup> are of the same type, e.g. all are
<variable> elements, then the <docgroup> has the attribute homogen-type
(="variable" in the example). If all the children have identical
name="..." attributes, then the <docgroup> gets a homogen-name="..."
attribute as well.

The <docgroup> has a <doc> child containing the documentation for the
other children of the <docgroup>. An entity that cannot be grouped
(class, module, enum) has a <doc> child of its own instead.

======================================================================
d) Pike entities
----------------------------------------------------------------------

Pike entities - classes, modules, methods, variables, constants, &c,
have some things in common, and many parts of the XML format are the
same for all of these entities. All entities are represented with an
XML element, namely one of:

  <class>
  <constant>
  <enum>
  <inherit>
  <method>
  <modifier>
  <module>
  <typedef>
  <variable>

The names speak for themselves, except: <modifier> which is used for
modifier ranges:

  //! Some variables:
  static nomask {
    int x, y;
    string n;
  }

A Pike entity may also have the following properties:

Name - Given as a name="..." attribute:
  <variable name="i"> ... </variable>

Modifiers - Given as a child element <modifiers>:
  <variable name="i">
    <modifiers>
      <optional/><static/><private/>
    </modifiers>
    ...
  </variable>
  If there are no modifiers before the declaration of the entity, the
  <modifiers> element can be omitted.

Source position - Given as a child element <source-position>:
  <variable name="i">
    <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
    <modifiers>
      <optional/><static/><private/>
    </modifiers>
    ...
  </variable>
  The source position is the place in the code tree where the entity is
  declared or defined. For a method, the attribute last-line="..." can
  be added to <source-position> to give the range of lines that the
  method body spans in the source code.

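The three common properties above can be read back from an entity
element along the following lines. This is a minimal sketch in Python
(standard xml.etree.ElementTree), not code from the Pike tool chain;
the function names entity_properties and _int_attr and the keys of the
returned dictionary are assumptions made for the example:

```python
import xml.etree.ElementTree as ET

# The <variable> example from the text, without the "..." placeholder.
ENTITY = """
<variable name="i">
  <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
  <modifiers><optional/><static/><private/></modifiers>
</variable>
"""


def _int_attr(elem, name):
    """Return attribute `name` of `elem` as an int, or None if missing."""
    value = elem.get(name) if elem is not None else None
    return int(value) if value is not None else None


def entity_properties(elem):
    """Collect name, modifiers and source position of an entity element."""
    mods = elem.find("modifiers")
    pos = elem.find("source-position")
    return {
        "kind": elem.tag,
        "name": elem.get("name"),
        # <modifiers> may be omitted when the declaration has none.
        "modifiers": [m.tag for m in mods] if mods is not None else [],
        "file": pos.get("file") if pos is not None else None,
        "first_line": _int_attr(pos, "first-line"),
        # last-line is described for methods only, so it is often absent.
        "last_line": _int_attr(pos, "last-line"),
    }


props = entity_properties(ET.fromstring(ENTITY))
```

Missing <modifiers> or <source-position> children are treated as
"no modifiers" and "position unknown" rather than as errors, since the
text allows <modifiers> to be omitted.
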
And then there are some things that are specific to each of the types
of entities:

<class>
  All inherits of the class are given as child elements <inherit>. If
  there is doc for the inherits, the <inherit> is repeated inside the
  appropriate <docgroup>:

  class Bosse {
    inherit "arne.pike" : Arne;
    inherit Benny;

    //! Documented inherit
    inherit Sven;
  }

  <class name="Bosse">
    <inherit name="Arne"><source-position ... />
      <classname>"arne.pike"</classname></inherit>
    <inherit><source-position ... />
      <classname>Benny</classname></inherit>
    <inherit><source-position ... />
      <classname>Sven</classname></inherit>
    <docgroup homogen-type="inherit">
      <doc>
        <text><p>Documented inherit</p></text>
      </doc>
      <inherit><source-position ... />
        <classname>Sven</classname></inherit>
    </docgroup>
    ...
  </class>

<constant>
  Only has a name. The element is empty (or has a <source-position>
  child).

<enum>
  Works as a container ... ???

<inherit>
  The name="..." attribute gives the name after the colon, if any. The
  name of the inherited class is given in a <classname> child. If a
  file name is used, the class name is the file name surrounded by
  quotes (see <class>).

<method>
  The arguments are given inside an <arguments> child. Each argument is
  given as an <argument name="..."> element. Each <argument> has a
  <type> child, with the type of the argument. The return type of the
  method is given inside a <returntype> container:

  int a(int x, int y);

  <method name="a">
    <arguments>
      <argument name="x"><type><int/></type></argument>
      <argument name="y"><type><int/></type></argument>
    </arguments>
    <returntype><int/></returntype>
  </method>

<modifier>
  Works as a container ... ???

<module>
  Works just like <class>.

<typedef>
  The type is given in a <type> child:

  typedef float Boat;

  <typedef name="Boat"><type><float/></type></typedef>

<variable>
  The type of the variable is given in a <type> child:

  int x;

  <variable name="x"><type><int/></type></variable>

======================================================================
e) Pike types
----------------------------------------------------------------------

Above we have seen the types int and float represented as <int/> and
<float/>. Some of the types are complex, some are simple. The simpler
types are simply of the form <foo/>:

  <float/>
  <mixed/>
  <program/>
  <string/>
  <void/>

The same goes for mapping, array, function, object, multiset, &c that
have no narrowing type qualification: <mapping/>, <array/>,
<function/> ...

The complex types are represented as follows:

array
  If the type of the elements of the array is specified, it is given in
  a <valuetype> child element:

  array(int)

  <array><valuetype><int/></valuetype></array>

function
  The types of the arguments and the return type are given (the order
  of the <argtype> elements is significant, of course):

  function(int, string: mixed)

  <function>
    <argtype><int/></argtype>
    <argtype><string/></argtype>
    <returntype><mixed/></returntype>
  </function>

int
  An int type can have a min and/or max value. The values can be
  numbers or identifiers:

  int(0..MAX)

  <int><min>0</min><max>MAX</max></int>

mapping
  The types of the indices and values are given:

  mapping(int:int)

  <mapping>
    <indextype><int/></indextype>
    <valuetype><int/></valuetype>
  </mapping>

multiset
  The type of the indices is given:

  multiset(string)

  <multiset>
    <indextype><string/></indextype>
  </multiset>

object
  If the program/class is specified, it is given as the text child of
  the <object> element:

  object(Foo.Bar.Ippa)

  <object>Foo.Bar.Ippa</object>

Then there are two special type constructions.
A disjunct type is written with the <or> element:

  string|int

  <or><string/><int/></or>

An argument to a method can be of the varargs type:

  function(string, mixed ... : void)

  <function>
    <argtype><string/></argtype>
    <argtype><varargs><mixed/></varargs></argtype>
    <returntype><void/></returntype>
  </function>

======================================================================
f) XML generated from the doc markup
----------------------------------------------------------------------