  +--------------------------------------+
  | Pike autodoc markup - the XML format |
  +--------------------------------------+

======================================================================
a) Introduction
----------------------------------------------------------------------

When a piece of documentation is viewed in human-readable format, it
has gone through the following states:

  1. Doc written in comments in source code (C or Pike).

  2. A lot of smaller XML files, one for each source code file.

  3. A big chunk of XML, describing the whole hierarchy.

  4. A repository of smaller and more manageable XML files.

  5. An HTML page rendered from one such file.
     (Or a PDF file, or whatever).

The transition from state 1 to state 2 is the extraction of
documentation from source files. There are several (well, at
least two) markup formats, and there are occasions where it is
handy to generate documentation automatically &c. This document
describes how a file in state 2 should be structured in order to
be handled correctly by subsequent passes and presented in a
consistent manner.

======================================================================
b) Overall structure
----------------------------------------------------------------------

Each source file adds some number of entities to the whole hierarchy.
It can contain a class or a module. It can contain an empty module
that has its methods and members defined in some other source file,
and so on. Suppose we have a file containing documentation for the
class Class in the module Module. The XML skeleton of the file
would then be:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...
        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

The <module name=""> refers to the top module. That element, and its
child <module name="Module">, exist only to put the <class name="Class">
in its correct position in the hierarchy. So we can divide the elements
in the XML file into two groups: skeletal elements and content elements.

Each actual module/class/whatever in the Pike hierarchy maps to at most
one content element; however, it can map to any number of skeletal elements.
For example, the top module is mapped to a skeletal element in each XML
file extracted from a single source file. To get from state 2 to state 3
in the list above, all XML files are merged into one big file. All the
elements that a module or class maps to are merged into one, and if one of
those elements contains documentation (= is a content element), then that
documentation becomes a child of the merged element.

======================================================================
c) Grouping
----------------------------------------------------------------------

Classes and modules always appear as <module> and <class> elements.
Methods, variables, constants &c, however, can be grouped in the
source code:

  //! Two variables:
  int a;
  int b;

Even a single variable is considered a group with one member.
Continuing the example in the previous section, suppose that Module.Class
has two member variables, a and b, that are documented as a group:

  <module name="">
    <module name="Module">
      <class name="Class">
        ... perhaps some info on inherits, members &c ...

        <docgroup homogen-type="variable">
          <variable name="a"><type><int/></type></variable>
          <variable name="b"><type><int/></type></variable>
          <doc>
            ... documentation for Module.Class.a and Module.Class.b ...
          </doc>
        </docgroup>

        <doc>
          ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

If all the children of a <docgroup> are of the same type, e.g. all are
<method> elements, then the <docgroup> has the attribute homogen-type
(="method" in that case; ="variable" in the example above). If all the
children have identical name="..." attributes, then the <docgroup> gets a
homogen-name="..." attribute as well.

The <docgroup> has a <doc> child containing the documentation for the other
children of the <docgroup>. An entity that cannot be grouped (class, module,
enum) has a <doc> child of its own instead.

======================================================================
d) Pike entities
----------------------------------------------------------------------

Pike entities - classes, modules, methods, variables, constants, &c - have
some things in common, and many parts of the XML format are the same for all
of these entities. Every entity is represented by an XML element, namely one
of:

  <class>
  <constant>
  <enum>
  <inherit>
  <method>
  <modifier>
  <module>
  <typedef>
  <variable>

The names speak for themselves, except <modifier>, which is used for modifier
ranges:

  //! Some variables:
  static nomask {
    int x, y;

    string n;
  }

A Pike entity may also have the following properties:

  Name - Given as a name="..." attribute:
      <variable name="i"> ... </variable>

  Modifiers - Given as a child element <modifiers>:
      <variable name="i">
        <modifiers>
          <optional/><static/><private/>
        </modifiers>
        ...
      </variable>
    If there are no modifiers before the declaration of the entity, the
    <modifiers> element can be omitted.

  Source position - Given as a child element <source-position>:
      <variable name="i">
        <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
        <modifiers>
          <optional/><static/><private/>
        </modifiers>
        ...
      </variable>
    The source position is the place in the code tree where the entity is
    declared or defined. For a method, the attribute last-line="..." can be
    added to <source-position> to give the range of lines that the method
    body spans in the source code.

And then there are some things that are specific to each type of entity:

<class>
  All inherits of the class are given as child elements <inherit>. If there
  is doc for the inherits, the <inherit> is repeated inside the appropriate
  <docgroup>:

    class Bosse {
      inherit "arne.pike" : Arne;
      inherit Benny;

      //! Documented inherit
      inherit Sven;
    }

    <class name="Bosse">
      <inherit name="Arne"><source-position ... />
        <classname>"arne.pike"</classname></inherit>
      <inherit><source-position ... />
        <classname>Benny</classname></inherit>
      <inherit><source-position ... />
        <classname>Sven</classname></inherit>
      <docgroup homogen-type="inherit">
        <doc>
          <text><p>Documented inherit</p></text>
        </doc>
        <inherit><source-position ... />
          <classname>Sven</classname></inherit>
      </docgroup>
      ...
    </class>

<constant>
  Only has a name. The element is empty (or has a <source-position> child).

<enum>
  Works as a container. Has a <doc> child element with the documentation of
  the enum itself, and <docgroup> elements with a <constant> for each enum
  constant. So:

    enum E
    //! enum E
    {
      //! Three constants:
      a, b, c,

      //! One more:
      d
    }

  becomes:

    <enum name="E">
      <doc><text><p>enum E</p></text></doc>
      <docgroup homogen-type="constant">
        <doc><text><p>Three constants:</p></text></doc>
        <constant name="a"/>
        <constant name="b"/>
        <constant name="c"/>
      </docgroup>
      <docgroup homogen-name="d" homogen-type="constant">
        <doc><text><p>One more:</p></text></doc>
        <constant name="d"/>
      </docgroup>
    </enum>

  Both the <enum> element and the <constant> elements could have
  <source-position> children, of course.

<inherit>
  The name="..." attribute gives the name after the colon, if any. The name
  of the inherited class is given in a <classname> child. If a file name is
  used, the class name is the file name surrounded by quotes (see <class>).

<method>
  The arguments are given inside an <arguments> child. Each argument is
  given as an <argument name="..."> element. Each <argument> has a <type>
  child, with the type of the argument. The return type of the method is
  given inside a <returntype> container:

    int a(int x, int y);

    <method name="a">
      <arguments>
        <argument name="x"><type><int/></type></argument>
        <argument name="y"><type><int/></type></argument>
      </arguments>
      <returntype><int/></returntype>
    </method>

<modifier>
  Works as a container ... ???

<module>
  Works just like <class>.

<typedef>
  The type is given in a <type> child:

    typedef float Boat;

    <typedef name="Boat"><type><float/></type></typedef>

<variable>
  The type of the variable is given in a <type> child:

    int x;

    <variable name="x"><type><int/></type></variable>

======================================================================
e) Pike types
----------------------------------------------------------------------

Above we have seen the types int and float represented as <int/> and <float/>.
Some of the types are complex, some are simple.
The simpler types are just of the form <foo/>:

  <float/>
  <mixed/>
  <program/>
  <string/>
  <void/>

The same goes for mapping, array, function, object, multiset, &c that have
no narrowing type qualification: <mapping/>, <array/>, <function/> ...

The complex types are represented as follows:

array
  If the type of the elements of the array is specified, it is given in a
  <valuetype> child element:

    array(int)

    <array><valuetype><int/></valuetype></array>

function
  The types of the arguments and the return type are given (the order
  of the <argtype> elements is significant, of course):

    function(int, string: mixed)

    <function>
      <argtype><int/></argtype>
      <argtype><string/></argtype>
      <returntype><mixed/></returntype>
    </function>

int
  An int type can have a min and/or max value. The values can be numbers or
  identifiers:

    int(0..MAX)

    <int><min>0</min><max>MAX</max></int>

mapping
  The types of the indices and values are given:

    mapping(int:int)

    <mapping>
      <indextype><int/></indextype>
      <valuetype><int/></valuetype>
    </mapping>

multiset
  The type of the indices is given:

    multiset(string)

    <multiset>
      <indextype><string/></indextype>
    </multiset>

object
  If the program/class is specified, it is given as the text child of
  the <object> element:

    object(Foo.Bar.Ippa)

    <object>Foo.Bar.Ippa</object>

Then there are two special type constructions. A disjunct type is written
with the <or> element:

  string|int

  <or><string/><int/></or>

An argument to a method can be of the varargs type:

  function(string, mixed ... : void)

  <function>
    <argtype><string/></argtype>
    <argtype><varargs><mixed/></varargs></argtype>
    <returntype><void/></returntype>
  </function>

======================================================================
f) XML generated from the doc markup
----------------------------------------------------------------------

The documentation for an entity is put in a <doc> element. The <doc> element
is either a child of the element representing the entity (in the case of
<class>, <module>, <enum>, or <modifiers>) or a child of the <docgroup> that
contains the element representing the entity.

The doc markup has two main types of keywords: those that create a container,
and those that create a new subsection within a container, implicitly closing
the previous subsection. Consider e.g.:

  //! @mapping
  //!   @member int "ip"
  //!     The IP# of the host.
  //!   @member string "address"
  //!     The name of the host.
  //!   @member float "latitude"
  //!   @member float "longitude"
  //!     The coordinates of its physical location.
  //! @endmapping

Here @mapping and @endmapping create a container, and each @member starts a
new subsection. The last two @member keywords are grouped together and thus
form ONE new subsection together.
Each subsection is a <group>, and the <group> has one or more <member>
children, and a <text> child that contains the text that describes the
<member>s:

  <mapping>
    <group>
      <member><type><int/></type><index>"ip"</index></member>
      <text>
        <p>The IP# of the host.</p>
      </text>
    </group>
    <group>
      <member><type><string/></type><index>"address"</index></member>
      <text>
        <p>The name of the host.</p>
      </text>
    </group>
    <group>
      <member><type><float/></type><index>"latitude"</index></member>
      <member><type><float/></type><index>"longitude"</index></member>
      <text>
        <p>The coordinates of its physical location.</p>
      </text>
    </group>
  </mapping>

A <text> element can contain not only text but also a nested container, say
@mapping - @endmapping. In that case, the <mapping> element is placed, in
document order, as a sibling of the <p> elements that contain the text:

  //! @mapping
  //!   @member mapping "nested-mapping"
  //!     A mapping inside the mapping:
  //!     @mapping
  //!       @member string "zip-code"
  //!         The zip code.
  //!     @endmapping
  //!     And some more text ...
  //! @endmapping

  becomes:

  <mapping>
    <group>
      <member><type><mapping/></type><index>"nested-mapping"</index></member>
      <text>
        <p>A mapping inside the mapping:</p>
        <mapping>
          <group>
            <member><type><string/></type><index>"zip-code"</index></member>
            <text>
              <p>The zip code.</p>
            </text>
          </group>
        </mapping>
        <p>And some more text ...</p>
      </text>
    </group>
  </mapping>

Inside the <p> elements, there may also be some more "layout-ish" tags like
<b>, <code>, <tt>, <i>, needed to make the text more readable. Those tags are
expressed as @i{ ... @} in the doc markup. However, there is no <br>. A
paragraph break is made by ending the <p> and beginning a new one. A </p><p>
is inserted for each sequence of blank lines in the doc markup:

  //! First paragraph.
  //!
  //! Second paragraph.
  //!
  //!

  becomes:

  <p>First paragraph.</p><p>Second paragraph.</p>

Note that leading and trailing whitespace is trimmed from the text, and there
are never any empty <p> elements.

In the example above the keyword `@mapping' translated into <mapping>, whereas
the keyword `@member string "zip-code"' translated into:

  <member><type><string/></type><index>"zip-code"</index></member>

The translation of keyword->XML is done differently for each keyword. How it
is done can be seen in lib/modules/Tools.pmod/AutoDoc.pmod/DocParser.pmod. Most
keywords just interpret the arguments as a space-separated list, and put their
values in attributes of the element. In some cases (such as @member), though,
some more intricate parsing must be done, and the arguments may be complex
(like Pike types) and are represented as child elements of the element.

======================================================================
g) Top level sections of different Pike entities.
----------------------------------------------------------------------

In every doc comment there is an implicit "top container", and subsections can
be opened in it. E.g.:

  //! A method.
  //! @param x
  //!   The horizontal coordinate.
  //! @param y
  //!   The vertical coordinate.
  //! @returns
  //!   Nothing :)
  void foo(int x, int y)

becomes:

  <docgroup homogen-name="foo" homogen-type="method">
    <doc>
      <text><p>A method.</p></text>
      <group>
        <param name="x"/>
        <text><p>The horizontal coordinate.</p></text>
      </group>
      <group>
        <param name="y"/>
        <text><p>The vertical coordinate.</p></text>
      </group>
      <group>
        <returns/>
        <text><p>Nothing :)</p></text>
      </group>
    </doc>
    <method name="foo">
      ......
    </method>
  </docgroup>

Which "top container" subsections are allowed depends on what type of entity is
documented:

  ALL      - <bugs/>
             <deprecated> ... </deprecated>
             <example/>
             <note/>
             <seealso/>

  <method> - <param name="..."/>
             <returns/>
             <throws/>
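
As a rough illustration of the table above, the following Python sketch walks
the <group> children of a <doc> and reports subsection elements that the table
does not allow for the entity type of the enclosing <docgroup>. It is not part
of the Pike tool chain: the function name disallowed_subsections, the two
lookup tables, and the sample XML strings are made up for this example. It
also assumes that the entity type can be read from the homogen-type attribute,
and that every child of a <group> other than <text> is a subsection keyword.

```python
import xml.etree.ElementTree as ET

# Subsections allowed for every entity type (the "ALL" row of the table).
ALLOWED_FOR_ALL = {"bugs", "deprecated", "example", "note", "seealso"}
# Extra subsections per entity type (the table only lists <method>).
ALLOWED_PER_TYPE = {"method": {"param", "returns", "throws"}}


def disallowed_subsections(docgroup):
    """Return tags of top-level subsections not allowed in this <docgroup>.

    Assumption: a <docgroup> without homogen-type is checked against the
    common set only.
    """
    entity_type = docgroup.get("homogen-type")
    allowed = ALLOWED_FOR_ALL | ALLOWED_PER_TYPE.get(entity_type, set())
    doc = docgroup.find("doc")
    if doc is None:
        return []
    bad = []
    for group in doc.findall("group"):
        for child in group:
            # <text> holds the description; any other child is a keyword.
            if child.tag != "text" and child.tag not in allowed:
                bad.append(child.tag)
    return bad


# The foo example from this section, slightly shortened.
METHOD_XML = """
<docgroup homogen-name="foo" homogen-type="method">
  <doc>
    <text><p>A method.</p></text>
    <group><param name="x"/><text><p>The horizontal coordinate.</p></text></group>
    <group><returns/><text><p>Nothing :)</p></text></group>
  </doc>
  <method name="foo"/>
</docgroup>
"""

# A hypothetical variable that wrongly uses @returns.
VARIABLE_XML = """
<docgroup homogen-name="x" homogen-type="variable">
  <doc>
    <group><returns/><text><p>Misplaced.</p></text></group>
  </doc>
  <variable name="x"><type><int/></type></variable>
</docgroup>
"""

print(disallowed_subsections(ET.fromstring(METHOD_XML)))    # []
print(disallowed_subsections(ET.fromstring(VARIABLE_XML)))  # ['returns']
```

Keeping the per-type table separate from the common set mirrors the layout of
the list above, so a further entity type would only need one more entry.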