+--------------------------------------+
| Pike autodoc markup - the XML format |
+--------------------------------------+

======================================================================
a) Introduction
----------------------------------------------------------------------

When a piece of documentation is viewed in human-readable format, it
has gone through the following states:

  1. Doc written in comments in source code (C or Pike).

  2. A lot of smaller XML files, one for each source code file.

  3. A big chunk of XML, describing the whole hierarchy.

  4. A repository of smaller and more manageable XML files.

  5. An HTML page rendered from one such file. (Or a PDF file, or
     whatever).

The transition from state 1 to state 2 is the extraction of
documentation from source files. There are several (well, at least
two) markup formats, and there are occasions where it is handy to
generate documentation automatically &c.

This document describes how a file in state 2 should be structured in
order to be handled correctly by subsequent passes and presented in a
consistent manner.

======================================================================
b) Overall structure
----------------------------------------------------------------------

Each source file adds some number of entities to the whole hierarchy.
It can contain a class or a module. It can contain an empty module,
that has its methods and members defined in some other source file,
and so on. Suppose we have a file containing documentation for the
class Class in the module Module. The XML skeleton of the file would
then be:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

The <module name=""> refers to the top module. That element, and its
child <module name="Module">, exist only to put the
<class name="Class"> in its correct position in the hierarchy. So we
can divide the elements in the XML file into two groups: skeletal
elements and content elements.

Each actual module/class/whatever in the Pike hierarchy maps to at
most one content element; however, it can map to any number of
skeletal elements. For example, the top module is mapped to a skeletal
element in each XML file extracted from a single source file. To get
from state 2 to state 3 in the list above, all XML files are merged
into one big file. All the elements that a module or class map to are
merged into one, and if one of those
elements contains documentation (i.e. is a content element), then that
documentation becomes a child of the merger of the elements.

======================================================================
c) Grouping
----------------------------------------------------------------------

Classes and modules always appear as <module> and <class> elements.
Methods, variables, constants &c, however, can be grouped in the
source code:

  //! Two variables:
  int a;
  int b;

Even a single variable is considered as a group with one member.

Continuing the example in the previous section, suppose that
Module.Class has two member variables, a and b, that are documented as
a group:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <docgroup homogen-type="variable">
          <variable name="a"><type><int/></type></variable>
          <variable name="b"><type><int/></type></variable>
          <doc>
            ... documentation for Module.Class.a and Module.Class.b ...
          </doc>
        </docgroup>
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

If all the children of a <docgroup> are of the same type, e.g. all are
<method> elements, then the <docgroup> has the attribute homogen-type
(="method" in that case, ="variable" in the example above). If all the
children have identical name="..." attributes, then the <docgroup>
gets a homogen-name="..." attribute as well.
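
The homogen-type and homogen-name rules can be sketched in a few lines
of Python. This is only an illustration, not the code the extractor
uses: the function name annotate_docgroup and the decision to skip
<doc> children are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Children of a <docgroup> that are not Pike entities
# (an assumption of this sketch).
NON_ENTITY_TAGS = {"doc"}


def annotate_docgroup(docgroup):
    """Set homogen-type / homogen-name on a <docgroup> element in place."""
    entities = [child for child in docgroup if child.tag not in NON_ENTITY_TAGS]
    types = {child.tag for child in entities}
    names = {child.get("name") for child in entities}
    # All entity children share one element type.
    if len(types) == 1:
        docgroup.set("homogen-type", types.pop())
    # All entity children share one name; unnamed entities never qualify.
    if len(names) == 1 and None not in names:
        docgroup.set("homogen-name", names.pop())
    return docgroup


variables = annotate_docgroup(ET.fromstring(
    '<docgroup>'
    '<variable name="a"><type><int/></type></variable>'
    '<variable name="b"><type><int/></type></variable>'
    '<doc/>'
    '</docgroup>'
))

single_method = annotate_docgroup(ET.fromstring(
    '<docgroup><doc/><method name="foo"/></docgroup>'
))
```

For the two-variable group only homogen-type is set, since the names
differ; for the single method both attributes are set.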

The <docgroup> has a <doc> child containing the documentation for the
other children of the <docgroup>.

An entity that cannot be grouped (class, module, enum) has a <doc>
child of its own instead.

======================================================================
d) Pike entities
----------------------------------------------------------------------

Pike entities - classes, modules, methods, variables, constants, &c,
have some things in common, and many parts of the XML format are the
same for all of these entities. All entities are represented with an
XML element, namely one of:

  <class>
  <constant>
  <enum>
  <inherit>
  <method>
  <modifier>
  <module>
  <typedef>
  <variable>

The names speak for themselves, except: <modifier> which is used for
modifier ranges:

  //! Some variables:
  static nomask {
    int x, y;
    string n;
  }

A Pike entity may also have the following properties:

Name - Given as a name="..." attribute:
  <variable name="i"> ... </variable>

Modifiers - Given as a child element <modifiers>:
  <variable name="i">
    <modifiers>
      <optional/><static/><private/>
    </modifiers>
    ...
  </variable>
  If there are no modifiers before the declaration of the entity, the
  <modifiers> element can be omitted.

Source position - Given as a child element <source-position>:
  <variable name="i">
    <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
    <modifiers>
      <optional/><static/><private/>
    </modifiers>
    ...
  </variable>
  The source position is the place in the code tree where the entity
  is declared or defined. For a method, the attribute last-line="..."
  can be added to <source-position> to give the range of lines that
  the method body spans in the source code.

And then there are some things that are specific to each of the types
of entities:

<class>
  All inherits of the class are given as child elements <inherit>. If
  there is doc for the inherits, the <inherit> is repeated inside the
  appropriate <docgroup>:

    class Bosse {
      inherit "arne.pike" : Arne;
      inherit Benny;

      //! Documented inherit
      inherit Sven;
    }

    <class name="Bosse">
      <inherit name="Arne"><source-position ... />
               <classname>"arne.pike"</classname></inherit>
      <inherit><source-position ... />
               <classname>Benny</classname></inherit>
      <inherit><source-position ... />
               <classname>Sven</classname></inherit>
      <docgroup homogen-type="inherit">
        <doc>
          <text><p>Documented inherit</p></text>
        </doc>
        <inherit><source-position ... />
                 <classname>Sven</classname></inherit>
      </docgroup>
      ...
    </class>

<constant>
  Only has a name. The element is empty (or has a <source-position>
  child).

<enum>
  Works as a container. Has a <doc> child element with the
  documentation of the enum itself, and <docgroup> elements with a
  <constant> for each enum constant. So:

    enum E
    //! enum E
    {
      //! Three constants:
      a, b, c,

      //! One more:
      d
    }

  becomes:

    <enum name="E">
      <doc><text><p>enum E</p></text></doc>
      <docgroup homogen-type="constant">
        <doc><text><p>Three constants:</p></text></doc>
        <constant name="a"/>
        <constant name="b"/>
        <constant name="c"/>
      </docgroup>
      <docgroup homogen-name="d" homogen-type="constant">
        <doc><text><p>One more:</p></text></doc>
        <constant name="d"/>
      </docgroup>
    </enum>

  Both the <enum> element and the <constant> elements could have
  <source-position> children, of course.

<inherit>
  The name="..." attribute gives the name after the colon, if any. The
  name of the inherited class is given in a <classname> child. If a
  file name is used, the class name is the file name surrounded by
  quotes (see <class>).

<method>
  The arguments are given inside an <arguments> child. Each argument
  is given as an <argument name="..."> element. Each <argument> has a
  <type> child, with the type of the argument. The return type of the
  method is given inside a <returntype> container:

    int a(int x, int y);

    <method name="a">
      <arguments>
        <argument name="x"><type><int/></type></argument>
        <argument name="y"><type><int/></type></argument>
      </arguments>
      <returntype><int/></returntype>
    </method>

<modifier>
  Works as a container ... ???

<module>
  Works just like <class>.

<typedef>
  The type is given in a <type> child:

    typedef float Boat;

    <typedef name="Boat"><type><float/></type></typedef>

<variable>
  The type of the variable is given in a <type> child:

    int x;

    <variable name="x"><type><int/></type></variable>

======================================================================
e) Pike types
----------------------------------------------------------------------

Above we have seen the types int and float represented as <int/> and
<float/>. Some of the types are complex, some are simple. The simpler
types are just of the form <foo/>:

  <float/>
  <mixed/>
  <program/>
  <void/>

The same goes for mapping, array, function, object, multiset, &c that
have no narrowing type qualification: <mapping/>, <array/>,
<function/> ...

The complex types are represented as follows:

array
  If the type of the elements of the array is specified it is given in
  a <valuetype> child element:

    array(int)

    <array><valuetype><int/></valuetype></array>

function
  The types of the arguments and the return type are given (the order
  of the <argtype> elements is significant, of course):

    function(int, string: mixed)

    <function>
      <argtype><int/></argtype>
      <argtype><string/></argtype>
      <returntype><mixed/></returntype>
    </function>

int
  An int type can have a min and/or max value. The values can be
  numbers or identifiers:

    int(0..MAX)

    <int><min>0</min><max>MAX</max></int>

string
  A string type can have a numerical width value.

    string(8)

    <string><width>8</width></string>

mapping
  The types of the indices and values are given:

    mapping(int:int)

    <mapping>
      <indextype><int/></indextype>
      <valuetype><int/></valuetype>
    </mapping>

multiset
  The type of the indices is given:

    multiset(string)

    <multiset>
      <indextype><string/></indextype>
    </multiset>

object
  If the program/class is specified, it is given as the text child of
  the <object> element:

    object(Foo.Bar.Ippa)

    <object>Foo.Bar.Ippa</object>

Then there are two special type constructions. A disjunct type is
written with the <or> element:

  string|int

  <or><string/><int/></or>

An argument to a method can be of the varargs type:

  function(string, mixed ... : void)

  <function>
    <argtype><string/></argtype>
    <argtype><varargs><mixed/></varargs></argtype>
    <returntype><void/></returntype>
  </function>

======================================================================
f) Other XML tags
----------------------------------------------------------------------

p
  Paragraph.
i
  Italic.
b
  Bold.
tt
  Terminal Type.
pre
  Preformatted text.
code
  Program code.
image
  An image object. Contains the original file path to the image. Has
  the optional attributes width, height and file, where file is the
  path to the normalized-filename file.

======================================================================
g) XML generated from the doc markup
----------------------------------------------------------------------

The documentation for an entity is put in a <doc> element. The <doc>
element is either a child of the element representing the entity (in
the case of <class>, <module>, <enum>, or <modifiers>) or a child of
the <docgroup> that contains the element representing the entity.

The doc markup has two main types of keywords: those that create a
container, and those that create a new subsection within a container,
implicitly closing the previous subsection. Consider e.g.:

  //! @mapping
  //!   @member int "ip"
  //!     The IP# of the host.
  //!   @member string "address"
  //!     The name of the host.
  //!   @member float "latitude"
  //!   @member float "longitude"
  //!     The coordinates of its physical location.
  //! @endmapping

Here @mapping and @endmapping create a container, and each @member
starts a new subsection. The two latter @member are grouped together
and thus they form ONE new subsection together. Each subsection is a
<group>, and the <group> has one or more <member> children, and a
<text> child that contains the text that describes the <member>s:

  <mapping>
    <group>
      <member><type><int/></type><index>"ip"</index></member>
      <text>
        <p>The IP# of the host.</p>
      </text>
    </group>
    <group>
      <member><type><string/></type><index>"address"</index></member>
      <text>
        <p>The name of the host.</p>
      </text>
    </group>
    <group>
      <member><type><float/></type><index>"latitude"</index></member>
      <member><type><float/></type><index>"longitude"</index></member>
      <text>
        <p>The coordinates of its physical location.</p>
      </text>
    </group>
  </mapping>

Inside a <text> element, there can be not only text, but also a nested
level of, say, @mapping - @endmapping. In that case, the <mapping>
element is put in its document order place as a sibling of the <p>
elements that contain the text:

  //! @mapping
  //!   @member mapping "nested-mapping"
  //!     A mapping inside the mapping:
  //!     @mapping
  //!       @member string "zip-code"
  //!         The zip code.
  //!     @endmapping
  //!     And some more text ...
  //! @endmapping

becomes:

  <mapping>
    <group>
      <member><type><mapping/></type><index>"nested-mapping"</index></member>
      <text>
        <p>A mapping inside the mapping:</p>
        <mapping>
          <group>
            <member><type><string/></type><index>"zip-code"</index></member>
            <text>
              <p>The zip code.</p>
            </text>
          </group>
        </mapping>
        <p>And some more text ...</p>
      </text>
    </group>
  </mapping>

Inside the <p> elements, there may also be some more "layout-ish" tags
like <b>, <code>, <tt>, <i>, needed to make the text more readable.
Those tags are
expressed as @i{ ... @} in the doc markup. However, there is no <br>.
A paragraph break is made by ending the <p> and beginning a new one. A
</p><p> is inserted for each sequence of blank lines in the doc
markup:

  //! First paragraph.
  //!
  //! Second paragraph.
  //!
  //!

becomes:

  <p>First paragraph.</p><p>Second paragraph.</p>

Note that leading and trailing whitespace is trimmed from the text,
and there are never any empty <p> elements.

In the example above the keyword `@mapping' translated into <mapping>,
whereas the keyword `@member string "zip-code"' translated into:

  <member><type><string/></type><index>"zip-code"</index></member>

The translation of keyword->XML is done differently for each keyword.
How it is done can be seen in
lib/modules/Tools.pmod/AutoDoc.pmod/DocParser.pmod. Most keywords just
interpret the arguments as a space-separated list, and put their
values in attributes to the element. In some cases (such as @member)
though, some more intricate parsing must be done, and the arguments
may be complex (like Pike types) and are represented as child elements
of the element.
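
The paragraph rule described above (one </p><p> per run of blank
lines, trimmed text, no empty <p>) can be sketched in Python. This is
a simplified illustration and not the DocParser.pmod implementation:
the function name paragraphs_to_xml is hypothetical, keywords such as
@mapping and inline tags such as @i{ ... @} are ignored, and joining
the lines of a paragraph with a single space is an assumption.

```python
import re
from xml.sax.saxutils import escape


def paragraphs_to_xml(comment):
    """Render the plain text of a //! doc comment as <p> elements."""
    lines = []
    for raw in comment.splitlines():
        text = raw.strip()
        # Drop the doc comment prefix if it is present.
        if text.startswith("//!"):
            text = text[3:]
        lines.append(text.strip())
    # Every run of one or more blank lines is a single paragraph break.
    chunks = re.split(r"\n{2,}", "\n".join(lines))
    # Collapse whitespace inside each paragraph (assumption of this sketch).
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    # Empty paragraphs are dropped, so no empty <p> is ever produced.
    return "".join("<p>%s</p>" % escape(p) for p in paragraphs if p)


example = paragraphs_to_xml(
    "//! First paragraph.\n"
    "//!\n"
    "//! Second paragraph.\n"
    "//!\n"
    "//!"
)
```

With the comment from the example above, the result is
<p>First paragraph.</p><p>Second paragraph.</p>, and the trailing
blank lines produce nothing.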

======================================================================
h) Top level sections of different Pike entities.
----------------------------------------------------------------------

In every doc comment there is an implicit "top container", and
subsections can be opened in it. E.g.:

  //! A method.
  //! @param x
  //!   The horizontal coordinate.
  //! @param y
  //!   The vertical coordinate.
  //! @returns
  //!   Nothing :)
  void foo(int x, int y)

becomes:

  <docgroup homogen-name="foo" homogen-type="method">
    <doc>
      <text><p>A method.</p></text>
      <group>
        <param name="x"/>
        <text><p>The horizontal coordinate.</p></text>
      </group>
      <group>
        <param name="y"/>
        <text><p>The vertical coordinate.</p></text>
      </group>
      <group>
        <returns/>
        <text><p>Nothing :)</p></text>
      </group>
    </doc>
    <method name="foo">
      ......
    </method>
  </docgroup>

Which "top container" subsections are allowed depends on what type of
entity is documented:

  ALL      - <bugs/>
             <deprecated> ... </deprecated>
             <example/>
             <note/>
             <seealso/>

  <method> - <param name="..."/>
             <returns/>
             <throws/>
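
A later pass could use the table above to check a <docgroup>. The
following Python sketch is one possible way to do that, under stated
assumptions: the function name invalid_subsections is hypothetical,
the entity type is taken from the homogen-type attribute, and only the
subsections listed above are known to the checker.

```python
import xml.etree.ElementTree as ET

# Subsections allowed for every entity type, as listed above.
ALLOWED_EVERYWHERE = {"bugs", "deprecated", "example", "note", "seealso"}
# Extra subsections allowed per entity type, as listed above.
ALLOWED_BY_TYPE = {"method": {"param", "returns", "throws"}}


def invalid_subsections(docgroup):
    """Return the tags of top level subsections not allowed in this group."""
    entity_type = docgroup.get("homogen-type")
    allowed = ALLOWED_EVERYWHERE | ALLOWED_BY_TYPE.get(entity_type, set())
    doc = docgroup.find("doc")
    if doc is None:
        return []
    bad = []
    for group in doc.findall("group"):
        for child in group:
            # The <text> child holds the description, not a keyword.
            if child.tag != "text" and child.tag not in allowed:
                bad.append(child.tag)
    return bad


method_group = ET.fromstring(
    '<docgroup homogen-name="foo" homogen-type="method"><doc>'
    '<text><p>A method.</p></text>'
    '<group><param name="x"/><text><p>The horizontal coordinate.</p></text></group>'
    '<group><returns/><text><p>Nothing :)</p></text></group>'
    '</doc><method name="foo"/></docgroup>'
)

variable_group = ET.fromstring(
    '<docgroup homogen-name="i" homogen-type="variable"><doc>'
    '<group><param name="x"/><text><p>Misplaced.</p></text></group>'
    '<group><note/><text><p>Fine.</p></text></group>'
    '</doc><variable name="i"/></docgroup>'
)
```

Here invalid_subsections(method_group) finds nothing to report, while
for variable_group it reports the misplaced <param>.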