+--------------------------------------+
| Pike autodoc markup - the XML format |
+--------------------------------------+

======================================================================
a) Introduction
----------------------------------------------------------------------

When a piece of documentation is viewed in human-readable format, it has
gone through the following states:

  1. Doc written in comments in source code (C or Pike).

  2. A lot of smaller XML files, one for each source code file.

  3. A big chunk of XML, describing the whole hierarchy.

  4. A repository of smaller and more manageable XML files.

  5. An HTML page rendered from one such file.
     (Or a PDF file, or whatever).

The transition from state 1 to state 2 is the extraction of documentation
from source files. There are several (well, at least two) markup formats,
and there are occasions where it is handy to generate documentation
automatically &c. This document describes how a file in state 2 should be
structured in order to be handled correctly by subsequent passes and
presented in a consistent manner.

======================================================================
b) Overall structure
----------------------------------------------------------------------

Each source file adds some number of entities to the whole hierarchy. It can
contain a class or a module. It can contain an empty module that has its
methods and members defined in some other source file, and so on. Suppose we
have a file containing documentation for the class Class in the module
Module. The XML skeleton of the file would then be:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

The <module name=""> refers to the top module. That element, and its child
<module name="Module">, exist only to put the <class name="Class"> in its
correct position in the hierarchy. So we can divide the elements in the XML
file into two groups: skeletal elements and content elements.

Each actual module/class/whatever in the Pike hierarchy maps to at most one
content element, but it can map to any number of skeletal elements. For
example, the top module is mapped to a skeletal element in each XML file
extracted from a single source file. To get from state 2 to state 3 in the
list above, all XML files are merged into one big file. All the elements
that a module or class maps to are merged into one, and if one of those
elements contains documentation (i.e. is a content element), then that
documentation becomes a child of the merger of the elements.

======================================================================
c) Grouping
----------------------------------------------------------------------

Classes and modules always appear as <module> and <class> elements. Methods,
variables, constants &c, however, can be grouped in the source code:

  //! Two variables:
  int a;
  int b;

Even a single variable is considered as a group with one member.
Continuing the example in the previous section, suppose that Module.Class
has two member variables, a and b, that are documented as a group:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <docgroup homogen-type="variable">
          <variable name="a"><type><int/></type></variable>
          <variable name="b"><type><int/></type></variable>
          <doc>
            ... documentation for Module.Class.a and Module.Class.b ...
          </doc>
        </docgroup>
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

If all the children of a <docgroup> are of the same type, e.g. all are
<method> elements, then the <docgroup> has the attribute homogen-type
(="method" in that case). If all the children have identical name="..."
attributes, then the <docgroup> gets a homogen-name="..." attribute as well.
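
The homogen-type / homogen-name rule can be sketched in a few lines of
Python. This is only an illustration of the rule as stated above, not the
extractor's actual code; the function name homogen_attributes is made up for
the example, and it assumes that every child of the <docgroup> other than
<doc> is an entity element.

```python
import xml.etree.ElementTree as ET

def homogen_attributes(docgroup):
    """Return the homogen-* attributes implied by a <docgroup>'s children."""
    # Assumption: every child except <doc> represents a documented entity.
    entities = [child for child in docgroup if child.tag != "doc"]
    attrs = {}
    types = {entity.tag for entity in entities}
    if len(types) == 1:
        # All children are the same kind of element, e.g. all <variable>.
        attrs["homogen-type"] = types.pop()
    names = {entity.get("name") for entity in entities}
    if len(names) == 1 and None not in names:
        # All children carry the same name="..." attribute.
        attrs["homogen-name"] = names.pop()
    return attrs

group = ET.fromstring(
    '<docgroup>'
    '<variable name="a"><type><int/></type></variable>'
    '<variable name="b"><type><int/></type></variable>'
    '<doc/>'
    '</docgroup>'
)
single = ET.fromstring('<docgroup><constant name="d"/><doc/></docgroup>')

print(homogen_attributes(group))
print(homogen_attributes(single))
```

For the two-variable group only homogen-type applies, since the names a and
b differ; for the single-member group both attributes apply.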

The <docgroup> has a <doc> child containing the documentation for the other
children of the <docgroup>. An entity that cannot be grouped (class, module,
enum) has a <doc> child of its own instead.

======================================================================
d) Pike entities
----------------------------------------------------------------------

Pike entities - classes, modules, methods, variables, constants, &c - have
some things in common, and many parts of the XML format are the same for all
of these entities. All entities are represented with an XML element, namely
one of:

  <class>
  <constant>
  <enum>
  <inherit>
  <method>
  <modifier>
  <module>
  <typedef>
  <variable>

The names speak for themselves, except for <modifier>, which is used for
modifier ranges:

  //! Some variables:
  static nomask {
    int x, y;
    string n;
  }

A Pike entity may also have the following properties:

Name - Given as a name="..." attribute:

  <variable name="i"> ... </variable>

Modifiers - Given as a child element <modifiers>:

  <variable name="i">
    <modifiers>
      <optional/><static/><private/>
    </modifiers>
    ...
  </variable>

  If there are no modifiers before the declaration of the entity, the
  <modifiers> element can be omitted.

Source position - Given as a child element <source-position>:

  <variable name="i">
    <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
    <modifiers>
      <optional/><static/><private/>
    </modifiers>
    ...
  </variable>

  The source position is the place in the code tree where the entity is
  declared or defined. For a method, the attribute last-line="..." can be
  added to <source-position> to give the range of lines that the method body
  spans in the source code.

And then there are some things that are specific to each of the types of
entities:

<class>
  All inherits of the class are given as child elements <inherit>. If there
  is doc for the inherits, the <inherit> is repeated inside the appropriate
  <docgroup>:

    class Bosse {
      inherit "arne.pike" : Arne;
      inherit Benny;

      //! Documented inherit
      inherit Sven;
    }

    <class name="Bosse">
      <inherit name="Arne"><source-position ... />
        <classname>"arne.pike"</classname></inherit>
      <inherit><source-position ... />
        <classname>Benny</classname></inherit>
      <inherit><source-position ... />
        <classname>Sven</classname></inherit>
      <docgroup homogen-type="inherit">
        <doc>
          <text><p>Documented inherit</p></text>
        </doc>
        <inherit><source-position ... />
          <classname>Sven</classname></inherit>
      </docgroup>
      ...
    </class>

<constant>
  Only has a name. The element is empty (or has a <source-position> child).

<enum>
  Works as a container. Has a <doc> child element with the documentation of
  the enum itself, and <docgroup> elements with a <constant> for each enum
  constant. So:

    enum E
    //! enum E
    {
      //! Three constants:
      a, b, c,

      //! One more:
      d
    }

  becomes:

    <enum name="E">
      <doc><text><p>enum E</p></text></doc>
      <docgroup homogen-type="constant">
        <doc><text><p>Three constants:</p></text></doc>
        <constant name="a"/>
        <constant name="b"/>
        <constant name="c"/>
      </docgroup>
      <docgroup homogen-name="d" homogen-type="constant">
        <doc><text><p>One more:</p></text></doc>
        <constant name="d"/>
      </docgroup>
    </enum>

  Both the <enum> element and the <constant> elements could have
  <source-position> children, of course.
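
To show how a consumer might read the <enum> layout above, here is a small
Python sketch that maps each enum constant to the documentation text of the
<docgroup> it sits in. It is an illustration only: the helper name
enum_constant_docs is invented for the example, and the doc text is
flattened to plain text, which a real renderer would not do.

```python
import xml.etree.ElementTree as ET

ENUM_XML = """
<enum name="E">
  <doc><text><p>enum E</p></text></doc>
  <docgroup homogen-type="constant">
    <doc><text><p>Three constants:</p></text></doc>
    <constant name="a"/>
    <constant name="b"/>
    <constant name="c"/>
  </docgroup>
  <docgroup homogen-name="d" homogen-type="constant">
    <doc><text><p>One more:</p></text></doc>
    <constant name="d"/>
  </docgroup>
</enum>
"""

def enum_constant_docs(enum):
    """Map each constant name to the plain text of its group's <doc>."""
    docs = {}
    for group in enum.findall("docgroup"):
        doc = group.find("doc")
        # Flatten the <doc> subtree to text; "" if the group has no <doc>.
        text = "".join(doc.itertext()).strip() if doc is not None else ""
        for constant in group.findall("constant"):
            docs[constant.get("name")] = text
    return docs

enum_docs = enum_constant_docs(ET.fromstring(ENUM_XML))
print(enum_docs)
```

The constants a, b and c share one documentation text because they were
documented as a group, while d gets its own.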

<inherit>
  The name="..." attribute gives the name after the colon, if any. The name
  of the inherited class is given in a <classname> child. If a file name is
  used, the class name is the file name surrounded by quotes (see <class>).

<method>
  The arguments are given inside an <arguments> child. Each argument is
  given as an <argument name="..."> element. Each <argument> has a <type>
  child, with the type of the argument. The return type of the method is
  given inside a <returntype> container:

    int a(int x, int y);

    <method name="a">
      <arguments>
        <argument name="x"><type><int/></type></argument>
        <argument name="y"><type><int/></type></argument>
      </arguments>
      <returntype><int/></returntype>
    </method>

<modifier>
  Works as a container ... ???

<module>
  Works just like <class>.

<typedef>
  The type is given in a <type> child:

    typedef float Boat;

    <typedef name="Boat"><type><float/></type></typedef>

<variable>
  The type of the variable is given in a <type> child:

    int x;

    <variable name="x"><type><int/></type></variable>

======================================================================
e) Pike types
----------------------------------------------------------------------

Above we have seen the types int and float represented as <int/> and
<float/>. Some of the types are complex, some are simple. The simpler types
are just on the form <foo/>:

  <float/>
  <mixed/>
  <program/>
  <string/>
  <void/>

The same goes for mapping, array, function, object, multiset, &c that have
no narrowing type qualification: <mapping/>, <array/>, <function/> ...
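
Using only the simple <foo/> type elements listed so far, together with the
<method> layout from section d, a Pike-style declaration can be rebuilt from
the XML. The Python sketch below is illustrative only; the helper names
simple_type_name and method_signature are invented here, and anything other
than an unqualified simple type is rejected rather than handled.

```python
import xml.etree.ElementTree as ET

METHOD_XML = """
<method name="a">
  <arguments>
    <argument name="x"><type><int/></type></argument>
    <argument name="y"><type><int/></type></argument>
  </arguments>
  <returntype><int/></returntype>
</method>
"""

def simple_type_name(container):
    """Name of the single simple type inside a <type> or <returntype>."""
    children = list(container)
    if (len(children) != 1 or len(children[0]) > 0
            or (children[0].text or "").strip()):
        raise ValueError("only unqualified simple types are handled here")
    return children[0].tag

def method_signature(method):
    """Rebuild 'returntype name(type arg, ...)' from a <method> element."""
    args = []
    for argument in method.findall("arguments/argument"):
        arg_type = simple_type_name(argument.find("type"))
        args.append(f"{arg_type} {argument.get('name')}")
    return_type = simple_type_name(method.find("returntype"))
    return f"{return_type} {method.get('name')}({', '.join(args)})"

signature = method_signature(ET.fromstring(METHOD_XML))
print(signature)
```

Run on the <method name="a"> example from section d, this reproduces the
declaration int a(int x, int y) without the trailing semicolon.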

The complex types are represented as follows:

array
  If the type of the elements of the array is specified it is given in a
  <valuetype> child element:

    array(int)

    <array><valuetype><int/></valuetype></array>

function
  The types of the arguments and the return type are given (the order of
  the <argtype> elements is significant, of course):

    function(int, string: mixed)

    <function>
      <argtype><int/></argtype>
      <argtype><string/></argtype>
      <returntype><mixed/></returntype>
    </function>

int
  An int type can have a min and/or max value. The values can be numbers or
  identifiers:

    int(0..MAX)

    <int><min>0</min><max>MAX</max></int>

mapping
  The types of the indices and values are given:

    mapping(int:int)

    <mapping>
      <indextype><int/></indextype>
      <valuetype><int/></valuetype>
    </mapping>

multiset
  The type of the indices is given:

    multiset(string)

    <multiset>
      <indextype><string/></indextype>
    </multiset>

object
  If the program/class is specified, it is given as the text child of the
  <object> element:

    object(Foo.Bar.Ippa)

    <object>Foo.Bar.Ippa</object>

Then there are two special type constructions. A disjunct type is written
with the <or> element:

  string|int

  <or><string/><int/></or>

An argument to a method can be of the varargs type:

  function(string, mixed ... : void)

  <function>
    <argtype><string/></argtype>
    <argtype><varargs><mixed/></varargs></argtype>
    <returntype><void/></returntype>
  </function>

======================================================================
f) XML generated from the doc markup
----------------------------------------------------------------------

The documentation for an entity is put in a <doc> element. The <doc> element
is either a child of the element representing the entity (in the case of
<class>, <module>, <enum>, or <modifiers>) or a child of the <docgroup> that
contains the element representing the entity.

The doc markup has two main types of keywords: those that create a
container, and those that create a new subsection within a container,
implicitly closing the previous subsection. Consider e.g.:

  //! @mapping
  //!   @member int "ip"
  //!     The IP# of the host.
  //!   @member string "address"
  //!     The name of the host.
  //!   @member float "latitude"
  //!   @member float "longitude"
  //!     The coordinates of its physical location.
  //! @endmapping

Here @mapping and @endmapping create a container, and each @member starts a
new subsection. The last two @member keywords are grouped together and thus
form ONE new subsection. Each subsection is a <group>, and the <group> has
one or more <member> children, and a <text> child that contains the text
that describes the <member>s:

  <mapping>
    <group>
      <member><type><int/></type><index>"ip"</index></member>
      <text>
        <p>The IP# of the host.</p>
      </text>
    </group>
    <group>
      <member><type><string/></type><index>"address"</index></member>
      <text>
        <p>The name of the host.</p>
      </text>
    </group>
    <group>
      <member><type><float/></type><index>"latitude"</index></member>
      <member><type><float/></type><index>"longitude"</index></member>
      <text>
        <p>The coordinates of its physical location.</p>
      </text>
    </group>
  </mapping>

A <text> element can contain not only text but also a nested container, say
@mapping - @endmapping. In that case, the <mapping> element is placed, in
document order, as a sibling of the <p> elements that contain the text:

  //! @mapping
  //!   @member mapping "nested-mapping"
  //!     A mapping inside the mapping:
  //!     @mapping
  //!       @member string "zip-code"
  //!         The zip code.
  //!     @endmapping
  //!     And some more text ...
  //! @endmapping

becomes:

  <mapping>
    <group>
      <member><type><mapping/></type><index>"nested-mapping"</index></member>
      <text>
        <p>A mapping inside the mapping:</p>
        <mapping>
          <group>
            <member><type><string/></type><index>"zip-code"</index></member>
            <text>
              <p>The zip code.</p>
            </text>
          </group>
        </mapping>
        <p>And some more text ...</p>
      </text>
    </group>
  </mapping>

Inside the <p> elements, there may also be some more "layout-ish" tags like
<b>, <code>, <tt>, <i>, needed to make the text more readable. Those tags
are expressed as @i{ ... @} in the doc markup. However, there is no <br>. A
paragraph break is made by ending the <p> and beginning a new one. A </p><p>
is inserted for each sequence of blank lines in the doc markup:

  //! First paragraph.
  //!
  //! Second paragraph.
  //!
  //!

becomes:

  <p>First paragraph.</p><p>Second paragraph.</p>

Note that the text is trimmed of leading and trailing whitespace, and there
are never any empty <p> elements.

In the example above the keyword `@mapping' translated into <mapping>,
whereas the keyword `@member string "zip-code"' translated into:

  <member><type><string/></type><index>"zip-code"</index></member>

The translation of keyword->XML is done differently for each keyword. How it
is done can be seen in lib/modules/Tools.pmod/AutoDoc.pmod/DocParser.pmod.
Most keywords just interpret the arguments as a space-separated list, and
put their values in attributes to the element. In some cases (such as
@member) though, some more intricate parsing must be done, and the arguments
may be complex (like Pike types) and are represented as child elements of
the element.

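
The paragraph rule described in this section (one </p><p> per run of blank
lines, text trimmed, no empty <p>) can be sketched in Python as follows.
This is not the code in DocParser.pmod; the function name paragraphs_to_xml
is invented for the illustration, and the sketch ignores keywords and inline
markup such as @i{ ... @}, treating every line as plain text.

```python
import re
from xml.sax.saxutils import escape

def paragraphs_to_xml(comment_lines):
    """Turn plain-text '//!' comment lines into a run of <p> elements."""
    stripped = []
    for line in comment_lines:
        line = line.strip()
        if line.startswith("//!"):
            line = line[3:].strip()  # drop the doc comment marker
        stripped.append(line)
    text = "\n".join(stripped)
    # Any run of one or more blank lines separates two paragraphs.
    chunks = re.split(r"\n{2,}", text)
    # Collapse internal whitespace, which also trims each paragraph.
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    # Skip empty chunks so that no empty <p> elements are produced.
    return "".join(f"<p>{escape(p)}</p>" for p in paragraphs if p)

example = [
    "//! First paragraph.",
    "//!",
    "//! Second paragraph.",
    "//!",
    "//!",
]
rendered = paragraphs_to_xml(example)
print(rendered)
```

The trailing blank comment lines in the example produce no extra element,
matching the rule that empty <p> elements never occur.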
======================================================================
g) Top level sections of different Pike entities.
----------------------------------------------------------------------

In every doc comment there is an implicit "top container", and subsections
can be opened in it. E.g.:

  //! A method.
  //! @param x
  //!   The horizontal coordinate.
  //! @param y
  //!   The vertical coordinate.
  //! @returns
  //!   Nothing :)
  void foo(int x, int y)

becomes:

  <docgroup homogen-name="foo" homogen-type="method">
    <doc>
      <text><p>A method.</p></text>
      <group>
        <param name="x"/>
        <text><p>The horizontal coordinate.</p></text>
      </group>
      <group>
        <param name="y"/>
        <text><p>The vertical coordinate.</p></text>
      </group>
      <group>
        <returns/>
        <text><p>Nothing :)</p></text>
      </group>
    </doc>
    <method name="foo">
      ......
    </method>
  </docgroup>

Which "top container" subsections are allowed depends on what type of entity
is documented:

  ALL      - <bugs/>
             <deprecated> ... </deprecated>
             <example/>
             <note/>
             <seealso/>

  <method> - <param name="..."/>
             <returns/>
             <throws/>
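
The table of allowed subsections can be turned into a small check. The
Python sketch below is an illustration under two assumptions that the text
above does not state: that the two lists are exhaustive, and that each
top-level <group> in a <doc> holds its keyword elements next to a <text>
child. The function name disallowed_sections is invented for the example.

```python
import xml.etree.ElementTree as ET

# Subsections allowed for every entity, and the extra ones for <method>.
COMMON_SECTIONS = {"bugs", "deprecated", "example", "note", "seealso"}
METHOD_SECTIONS = {"param", "returns", "throws"}

def disallowed_sections(doc, entity_type):
    """List top-level subsection tags in <doc> not allowed for entity_type."""
    allowed = set(COMMON_SECTIONS)
    if entity_type == "method":
        allowed |= METHOD_SECTIONS
    found = []
    # Only direct <group> children are top-level subsections; groups nested
    # inside containers such as <mapping> are not visited.
    for group in doc.findall("group"):
        for child in group:
            if child.tag != "text" and child.tag not in allowed:
                found.append(child.tag)
    return found

DOC_XML = """
<doc>
  <text><p>A method.</p></text>
  <group><param name="x"/><text><p>The horizontal coordinate.</p></text></group>
  <group><param name="y"/><text><p>The vertical coordinate.</p></text></group>
  <group><returns/><text><p>Nothing :)</p></text></group>
</doc>
"""
doc = ET.fromstring(DOC_XML)
print(disallowed_sections(doc, "method"))    # the doc of foo() above
print(disallowed_sections(doc, "variable"))  # same doc on a variable
```

For the method the list is empty; attached to a variable, the same doc would
be flagged for its <param> and <returns> subsections.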