DRAFT: Kueea Module Declaration Language
This document is a live working draft. Do not implement.
KMDL documents declare one Kueea Module per document. Their syntax is designed to be human-readable and easy to write using even the simplest text editors.
KMDL processors take KMDL documents as input. The primary output is source files in a given programming language. Other outputs include module documentation in HTML or other formats. KMDL processors are part of tool chains that build Kueea Module implementations.
Keywords
The key words ‘MUST,’ ‘MUST NOT,’ ‘REQUIRED,’ ‘SHALL,’ ‘SHALL NOT,’ ‘SHOULD,’ ‘SHOULD NOT,’ ‘RECOMMENDED,’ ‘NOT RECOMMENDED,’ ‘MAY,’ and ‘OPTIONAL’ in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Overview
Processing a KMDL document produces a tree of items.
The root of the tree is the module item. Items declared in the document are its children: classes, data objects and functions.
An item is a set of variables. Variables of an item are referenced in prose like this: item.variable (using a dot separator).
Each item is associated with the module level at which it is declared. Higher module levels include all items declared at lower levels.
Items have an associated human-readable textual description. This text is treated as opaque data by a KMDL processor. It is only considered when generating output for human users.
Syntax
A KMDL document is a sequence of Unicode characters. The encoding of the document MUST be UTF-8.
Lines are sequences of characters separated by a sequence of two characters: U+000D CARRIAGE RETURN and U+000A LINE FEED.
Whitespace is either U+0020 SPACE or U+0009 HORIZONTAL TAB.
The documents are read and processed line by line. Maximum length of a line is 1024 code units, including the line separator.
There are three kinds of lines: processor instructions, item descriptions, and comments.
Comments
Comments are lines that are read and discarded. There are two kinds of comments.
One-line comments are lines beginning with optional whitespace followed by exactly one U+0023 NUMBER SIGN character, that is, one that is not immediately followed by another.
# This is a one-line comment.
   # This is a one-line comment.
THIS IS NOT A COMMENT.
# # # This is a one-line comment.
Multi-line comments begin and end with a line beginning with optional whitespace followed by 2 consecutive U+0023 NUMBER SIGN characters.
## This is the first line of a multi-line comment.
This is a comment.
.This is a comment.
# # This is inside a multi-line comment.
## This is the last line of a multi-line comment.
THIS IS NOT A COMMENT.
Instructions
Instructions declare items and change processor state. These lines begin with optional whitespace followed by one U+002E FULL STOP character and the instruction name. Instruction arguments, each preceded by whitespace, follow the name.
Instruction names are case-sensitive. They MUST be written with small letters and are four characters long.
.inst arg1 arg2
.inst arg1
Item descriptions
Any other line is text - a human-readable textual description of the currently active item.
These lines are passed as input to another program when generating module documentation or are ignored if the information is unneeded.
Their syntax is out of scope of this document.
.item example1
Description of the example1 item.
Description of the example1 item.
.item example2
Description of the example2 item.
Line indentation
The leading whitespace on an instruction line sets the amount of leading whitespace that is ignored on the text lines that come after it.
Each whitespace character, whether a space or a tab, counts as one.
Consider the following example:
   .inst first
   Line 1-1.
     Line 1-2.
  Line 1-3.
.inst second
   Line 2-1.
     Line 2-2.
Line 2-3.
The first line is an instruction indented by 3 whitespace characters. The ignored indentation becomes 3.
The second line is text indented by 3 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has no leading whitespace characters.
The third line is text indented by 5 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has 2 leading whitespace characters.
The fourth line is text indented by 2 whitespace characters. The parser removes up to 3 whitespace characters. In this case, the line has fewer whitespace characters than that, so all of them are removed. The resulting line has 0 leading whitespace characters.
The item description for the first item is thus:
Line 1-1.
  Line 1-2.
Line 1-3.
The fifth line is an instruction indented by 0 whitespace characters. The ignored indentation becomes 0.
The sixth line is text indented by 3 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 3 leading whitespace characters.
The seventh line is text indented by 5 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 5 leading whitespace characters.
The eighth line is text indented by 0 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 0 leading whitespace characters.
The item description for the second item is thus:
   Line 2-1.
     Line 2-2.
Line 2-3.
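The indentation rule walked through above can be sketched in Python. This is only an illustration under stated assumptions: the names leading_whitespace and describe are hypothetical, and comments and the \ escape are left out for brevity.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB


def leading_whitespace(line: str) -> int:
    """Count the whitespace characters at the start of a line."""
    return len(line) - len(line.lstrip(WHITESPACE))


def describe(lines):
    """Map each instruction line to its text lines, indentation removed."""
    skip = 0          # ignored indentation, set by the last instruction
    current = None    # instruction whose description is being collected
    result = {}
    for line in lines:
        wslv = leading_whitespace(line)
        if line[wslv:wslv + 1] == ".":
            skip = wslv
            current = line.strip(WHITESPACE)
            result[current] = []
        elif current is not None:
            # Remove at most `skip` characters, never more than are present.
            result[current].append(line[min(skip, wslv):])
    return result


example = [
    "   .inst first", "   Line 1-1.", "     Line 1-2.", "  Line 1-3.",
    ".inst second", "   Line 2-1.", "     Line 2-2.", "Line 2-3.",
]
descriptions = describe(example)
```

Here descriptions maps ".inst first" and ".inst second" to the two item descriptions derived in the walkthrough.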
ABNF (formal syntax)
The following KMDL rule expresses the syntax in ABNF.
KMDL = docv CRLF *( line CRLF )
docv = %s".kmdl" SP 1*4DIGIT
line = comm / inst / text
comm = cmul / cone
cmul = *WSP "##" *OCTET CRLF *WSP "##" *OCTET
cone = *WSP "#" *OCTET
inst = *WSP "." *( WSP / VCHAR )
text = [ *WSP "\" ] *OCTET
ABNF rules are referenced in prose like this: <rule>.
The first non-whitespace character of <text> MAY be a U+005C REVERSE SOLIDUS. If so, the \ character is removed before further processing of the line. This is an escape mechanism in case the line begins with a # or a . character.
Non-items
This section defines objects that are not items.
Containers
Some objects are stored in a container object.
A list is an ordered container, which means that position of elements in a list is significant.
A set is an unordered container.
Class identifier
Classes are identified by 128-bit values. These values are globally unique and are treated as opaque data.
It is expected that the value is a Universally Unique Identifier. The value SHOULD be a valid UUID, but this is not required.
The nil UUID value is reserved and means no value.
Modules and interfaces are special kinds of classes.
Class level
Classes have levels.
A level is a 16-bit unsigned integer.
A class at a given level includes every member declared at all of its lower levels.
Symbol identifier
Some items have an associated symbol identifier.
A symbol identifier is a 64-bit unsigned integer.
There MUST NOT be two symbol identifiers of the same value within a module.
By default, the value is the result of passing a character string to the 64-bit FNV-1a (Fowler-Noll-Vo) hash function:
hash = 0xCBF29CE484222325
for each octet_of_data to be hashed
    hash = hash XOR octet_of_data
    hash = (hash * 0x100000001B3) modulo 2^64
return hash
Symbol identifiers should only be explicitly declared in case of a collision. The encoding of the string is the same as for the document - UTF-8.
The hashed string consists of names of items separated by a U+0024 DOLLAR SIGN character.
Examples:
Input string | Output value |
---|---|
global_object | 0x8EAE284DFC37BA58 |
iface$function | 0x59976F93E8833DA3 |
class$0000$function | 0x713BC2D41B847A6F |
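The hash pseudocode above translates directly into Python. The sketch below is illustrative only; the helper symbol_id is a hypothetical name, and the explicit 64-bit mask is needed because Python integers are unbounded.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a over a sequence of octets."""
    h = FNV64_OFFSET_BASIS
    for octet in data:
        h ^= octet
        h = (h * FNV64_PRIME) & MASK64  # keep the result within 64 bits
    return h


def symbol_id(*names: str) -> int:
    """Hash item names joined by U+0024 DOLLAR SIGN, encoded as UTF-8."""
    return fnv1a_64("$".join(names).encode("utf-8"))
```

For instance, symbol_id("iface", "function") hashes the string iface$function from the table above.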
Tag
A tag is a sequence of characters.
The minimum length of the sequence is 1.
The maximum length of the sequence is 16.
Tags are stored in sets.
Name
Some items have an associated name.
A name is a sequence of characters.
The minimum length of the sequence is 1.
The maximum length of the sequence is 64.
Item reference
An item reference is a list of names.
The first name is special in that it might be a module (class) identifier instead.
Register types and classes
A class can be assigned a register type. Such a class becomes a register class.
These classes are special in that they have a pre-defined interface for transferring values from a CPU register to an object and vice-versa.
These functions are needed because the order of octets in memory might be different from the byte order the CPU expects.
For example, consider a class called u32 occupying 4 octets in memory. We then declare that there is a function for loading a value from an instance of this class into a 32-bit-wide CPU register, which then can be used to perform integer arithmetic on the value.
Otherwise, there is no mapping between CPU registers and object data. The data is seen as nothing more than an array of opaque octets.
The set of possible register types and their values is predefined.
Memory alignment
Objects are aligned in memory to a specific number of octets.
An alignment is a 32-bit unsigned integer.
Object type
An object type is a tuple of:
- item reference to a class,
- level of the referenced class,
- type of handle.
There are five predefined classes: octet, boolean value, result of comparison, object size and handle. Predefined classes are always at level 0. All other types are classes declared by modules.
An octet is 8 bits with no associated meaning. Its register type is an 8-bit unsigned integer. This is the fundamental type - all objects are arrays of octets.
A boolean occupies 1 octet. Its register type is an unsigned integer of at least 1 bit. It is mapped to an unsigned 8-bit integer in practice. Possible values are: true (non-zero) and false (zero).
A cmprval occupies 1 octet. Its register type is a signed integer of at least 2 bits. It is mapped to a signed 8-bit integer in practice. Possible values are: 0 means equal, 1 or more means more than, -1 means a comparison error and -2 or less means less than. Simply put, one first tests for -1 (error) and then compares with 0. Functions that return this object also set the CPU flags accordingly, so that a conditional jump may immediately follow the function call.
An objsize occupies 4 octets aligned to 4 octets. Its register type is a 32-bit unsigned integer.
A handle occupies 24 octets aligned to 8 octets: a 64-bit address and a 128-bit node identifier. Its register type is a (CPU-defined) memory reference.
The type of handle specifies access rights to the referenced object:
- NaH: Immediate value, not a handle. Defined for consistency.
- none: No access rights. Used when rights are already obtained.
- read: Read rights.
- rdex: Read and execute rights.
- rdwr: Read and write rights.
- rwex: Read, write and execute rights.
Array length
Representation of an array length MUST consist of at least:
- var: reference to the member storing the number of elements,
- min: minimum number of elements,
- max: maximum number of elements.
The number of elements is a 32-bit unsigned integer.
The member referenced by var MUST be an instance of a register class with an unsigned integer register type.
Items
All items contain a desc variable, which is a list of objects which MUST consist of at least:
- data: opaque buffer; textual data,
- format: name; format of the data.
When appending a new object n to desc, where o is the last object in desc: if n.format is equal to o.format, the data in n.data MAY be appended to o.data, preceded by a line separator, instead of appending a new object.
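The optional merging of description objects could look like the following Python sketch. The Description class and append_description function are hypothetical names, and the sketch assumes the line separator is the CR LF pair used by KMDL documents.

```python
from dataclasses import dataclass

LINE_SEPARATOR = "\r\n"  # assumed: the same separator as in KMDL documents


@dataclass
class Description:
    data: str    # opaque textual data
    format: str  # name of the text format


def append_description(desc: list, new: Description) -> None:
    """Append to desc, merging with the last object when formats match."""
    if desc and desc[-1].format == new.format:
        desc[-1].data += LINE_SEPARATOR + new.data
    else:
        desc.append(new)
```

Merging keeps consecutive lines of one format together, which is convenient when the text is later handed to a single formatter.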
Module
Representation of a module MUST consist of at least:
- mid: the module's (class) identifier,
- mlv: the module's (class) level,
- data: list of data members,
- func: set of function members,
- vals: set of named values,
- types: set of classes,
- mrefs: set of external module references, tuples of: name (alias), module identifier and target module level.
Class
Representation of a class MUST consist of at least:
- cid: class identifier,
- clv: current class level,
- creg: register type,
- data: list of data members,
- func: set of function members,
- vals: set of named values,
- tags: set of tags,
- name: name of the class.
Data member
Representation of a data member MUST consist of at least:
- mlv: associated module level,
- clv: associated class level,
- sid: symbol identifier,
- tags: set of tags,
- type: type reference,
- alen: array length,
- align: memory alignment,
- value: default value (if type is a register class),
- name: name of the object.
Function member
Representation of a function member MUST consist of at least:
- mlv: module level,
- clv: class level,
- sid: symbol identifier,
- tags: set of tags,
- params: list of function parameters,
- rval: function return value,
- name: name of the function.
Function parameter
Representation of a function parameter MUST consist of at least:
- in: input type reference,
- out: output type reference,
- name: parameter name.
Function return value
Representation of a function return value MUST consist of at least:
- type: type reference.
Named value
Representation of a named value MUST consist of at least:
- mlv: associated module level,
- clv: associated class level,
- type: type reference to a register class,
- value: register value,
- name: name of the value.
Parser
A KMDL parser consists of a line parser, an instruction parser and the functions the latter invokes.
Both the line parser and the functions have access to a shared state in addition to their own, private state.
Shared state
Shared state of the parser consists of the following variables:
- mset: set of modules,
- mbeg: reference to the current module,
- item: reference to the current item,
- desc: reference to the current item description,
- text: name of the current text format.
An implementation also needs to be able to determine the type of the current item in item.
When a reference is nothing, it means that the reference is set to a value that does not reference any objects.
Loading algorithm
The processor loads a module using the following algorithm.
The function takes 3 parameters: a module identifier, a module level and an external function for fetching KMDL documents. These are mentioned in steps 1-3.
- Let mid be the identifier of the loaded module.
- Let mlv be the target level of the loaded module.
- Let fetch be the external fetcher.
- Search mset for a module m with m.mid equal to mid.
- If found and m.mlv is no less than mlv, return success.
- If found and m.mlv is less than mlv, remove m from mset.
- Obtain an octet stream doc by calling fetch, passing mid and mlv as arguments.
- If fetch has failed, return failure.
- Set mbeg to a newly created module representation.
- Set mbeg.mid to mid.
- Set item to nothing.
- Set text to markdown.
- Invoke the line parser on doc.
- If the parser failed, delete mbeg and return failure.
- If mbeg.mlv is less than mlv, delete mbeg and return failure.
- Insert mbeg into mset.
- Iterate mbeg.mrefs and recursively call this algorithm with the arguments from the tuple; if any of the referenced modules failed to load, return failure.
- Return success.
After the loading algorithm successfully finishes, the processor resolves all references in the loaded module and generates the requested output.
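The control flow of the loading algorithm can be sketched in Python as below. Several things are assumptions made for illustration: mset is modelled as a dictionary keyed by module identifier, fetch returns None on failure, and parse is a hypothetical stand-in for the line parser that fills in the module and returns a boolean. Resetting item and text is left to that stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    mid: str
    mlv: int = 0
    mrefs: list = field(default_factory=list)  # (name, mid, mlv) tuples


def load_module(mset, mid, mlv, fetch, parse) -> bool:
    """Load module mid at level mlv (or higher) and its references."""
    found = mset.get(mid)
    if found is not None:
        if found.mlv >= mlv:
            return True          # already loaded at a sufficient level
        del mset[mid]            # loaded level is too low; reload
    doc = fetch(mid, mlv)
    if doc is None:
        return False
    mbeg = Module(mid)
    if not parse(doc, mbeg):
        return False
    if mbeg.mlv < mlv:
        return False             # document does not reach the target level
    mset[mid] = mbeg
    for _name, ref_mid, ref_mlv in mbeg.mrefs:
        if not load_module(mset, ref_mid, ref_mlv, fetch, parse):
            return False
    return True
```

Because the module is inserted into mset before its references are followed, two modules that reference each other do not recurse forever in this sketch.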
Line parser
State of the line parser consists of the following variables:
- line: current line (Unicode string),
- wslv: current line indentation (integer),
- skip: ignored line indentation (integer).
The line parser takes a stream of octets as input. It loads lines and processes them until the end of the stream.
The parser loads a line as follows:
- Let cr be a boolean.
- Let input be the input stream of octets.
- Set line to an empty string.
- Set cr to false.
- While input is not empty:
- Decode the next character c from input.
- If line is longer than 1024 characters, return failure.
- Append c to line.
- If cr is true and c is U+000A LINE FEED, remove the last two characters in line (the line separator) and return success.
- If c is U+000D CARRIAGE RETURN, set cr to true; otherwise, set cr to false.
- Return success.
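A minimal Python sketch of the line-loading steps follows. It assumes the input has already been decoded from UTF-8 into an iterator of characters, and it strips both characters of the line separator from the returned line, which is an interpretation of the intent of the steps. The name load_line is hypothetical.

```python
MAX_LINE = 1024


def load_line(chars):
    """Load one line from an iterator of characters.

    Returns (True, line) on success and (False, "") on failure.
    """
    line = []
    cr = False
    for c in chars:
        if len(line) > MAX_LINE:
            return False, ""
        line.append(c)
        if cr and c == "\n":
            del line[-2:]  # assumption: drop CR and LF from the result
            return True, "".join(line)
        cr = (c == "\r")
    return True, "".join(line)
```

Calling load_line repeatedly on the same iterator, for example iter(data.decode("utf-8")), yields the lines of a document one by one.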
After each line is loaded, it is processed. The first line has special processing.
The first line
docv = %s".kmdl" SP 1*4DIGIT
If line does not exactly match the <docv> rule above, the line parser MUST return failure and load no further lines.
The DIGITs encode the version of a KMDL processor as a decimal integer. If the version is higher (more) than the implemented one, the line parser MUST return failure and load no further lines.
This document defines version 1.
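The first-line check can be illustrated with a short Python sketch. The names DOCV, IMPLEMENTED_VERSION and check_first_line are hypothetical; the regular expression is a direct transcription of the <docv> rule.

```python
import re

# docv = %s".kmdl" SP 1*4DIGIT (case-sensitive, ASCII digits only)
DOCV = re.compile(r"\.kmdl ([0-9]{1,4})")
IMPLEMENTED_VERSION = 1


def check_first_line(line: str) -> bool:
    """Return True if the line matches <docv> and the version is supported."""
    m = DOCV.fullmatch(line)
    return m is not None and int(m.group(1)) <= IMPLEMENTED_VERSION
```

The line passed in is assumed to have had its line separator removed already.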
Other lines
The parser counts the amount of whitespace at the beginning of line and stores the resulting value in wslv.
Further processing depends on the first character after the whitespace.
If the line is a one-line comment, the line is ignored.
If the line is a multi-line comment, the parser loads subsequent lines until another multi-line comment is encountered. All of these lines are ignored.
If the line is an instruction, skip is set to wslv. Whitespace at the beginning and end of line is removed. The U+002E FULL STOP at the beginning is removed. The line is then fed to the instruction parser. In case of an error, the parser MUST fail and load no further lines.
Otherwise, the line is text. Up to skip whitespace characters are removed from the beginning of line. If the first character after the removal is \, that character is removed.
If desc references an object, create an object o with o.data equal to line and o.format equal to text, and append o to desc.
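The dispatch on the first character after the whitespace can be sketched as follows. This Python fragment is an illustration only: the names classify and significant_lines are hypothetical, and any line starting with two number signs is treated as a multi-line comment delimiter, following the <cmul> rule.

```python
WHITESPACE = " \t"


def classify(line: str) -> str:
    """Classify a line by its first non-whitespace characters."""
    body = line.lstrip(WHITESPACE)
    if body.startswith("##"):
        return "delimiter"    # begins or ends a multi-line comment
    if body.startswith("#"):
        return "comment"      # one-line comment
    if body.startswith("."):
        return "instruction"
    return "text"


def significant_lines(lines):
    """Yield (kind, line) for instructions and text, skipping comments."""
    in_comment = False
    for line in lines:
        kind = classify(line)
        if kind == "delimiter":
            in_comment = not in_comment
        elif not in_comment and kind != "comment":
            yield kind, line
```

A line beginning with \ is classified as text here; removing the escape character is a separate step.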
Instruction parser
cmd1 = fun1 *( 1*WSP arg1 )
fun1 = 4LOALPHA
arg1 = TAGS / SID / SIDN / VREG / ALEN
arg1 =/ TYPE / IREF / CID / NAME / UINT
Instructions MUST match the <cmd1> rule.
The instruction parser converts the instruction line into a list of typed arguments and calls the function named by the instruction.
Instructions begin with a four-letter function name, followed by argument tokens, each preceded by whitespace.
The parser MUST fail on any unrecognized function or argument, or when an argument is unexpected.
Argument tokens are defined such that a parser can determine the type of an argument from the input stream itself. The order of alternatives in <arg1> is the recommended order of tests.
Instruction parameters
This section defines syntax of instruction parameters.
Parameter rules use capital letters by convention.
Class identifier
CID = 2hexu 15( [ "-" ] 2hexu ) / %s"NOID"
hexu = %x30-39 / %x41-46
The keyword NOID is equivalent to 00000000-0000-0000-0000-000000000000.
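As an illustration, the <CID> rule can be checked and converted with a few lines of Python. The function name parse_cid is hypothetical, and returning a uuid.UUID object is a convenience of this sketch rather than a requirement.

```python
import re
import uuid

# CID = 2hexu 15( [ "-" ] 2hexu ), with hexu limited to 0-9 and A-F
CID = re.compile(r"[0-9A-F]{2}(?:-?[0-9A-F]{2}){15}")


def parse_cid(token: str):
    """Return the identifier as a uuid.UUID, or None if the token is invalid."""
    if token == "NOID":
        return uuid.UUID(int=0)  # the nil identifier
    if CID.fullmatch(token) is None:
        return None
    return uuid.UUID(hex=token.replace("-", ""))
```

Note that the rule only admits capital hexadecimal digits, so a lowercase identifier is rejected by this sketch.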
Unsigned integer
UINT = udec / uhex
udec = 1*20DIGIT
uhex = "0x" 1*16HEXDIGIT
Integer arguments are unsigned and may be at most 64 bits long. They are written in decimal or hexadecimal notation.
Symbol identifier
SID = "#" UINT
SIDN = "#" NAME "#" UINT
Because there are instructions with more than one symbol identifier, there are two variants: one with a name and one without.
Tags
TAGS = tag *( 1*WSP tag )
tag = "+" 1*16LOALPHA
Tags are character strings of up to 16 characters, each preceded by a U+002B PLUS SIGN character.
They are parsed into a list of strings.
Functions test for the presence of a tag in the list.
Name
NAME = LOALPHA *63( LOALPHA / DIGIT / "_" )
Items are given a human-readable name for reference. All letters in names MUST be small.
It is RECOMMENDED that names of functions be composed in subject-object-verb order, for example object_units_replace. This name requirement is for consistency and grouping of members.
Note that one can generate aliases in camelCase, too, if needed. Source code could be converted back and forth.
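One possible way to validate a name and derive a camelCase alias is sketched below in Python. The conversion rule (capitalize the first letter of every part after the first underscore) is an assumption; the document does not define one. The names is_name and to_camel_case are hypothetical.

```python
import re

# NAME = LOALPHA *63( LOALPHA / DIGIT / "_" )
NAME = re.compile(r"[a-z][a-z0-9_]{0,63}")


def is_name(token: str) -> bool:
    """Check a token against the <NAME> rule."""
    return NAME.fullmatch(token) is not None


def to_camel_case(name: str) -> str:
    """Derive a camelCase alias from an underscore-separated name."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)
```

With this rule, object_units_replace becomes objectUnitsReplace.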
Item reference
IREF = [ mref ] 1*( "." NAME )
mref = NAME / CID
An item reference begins with a module reference, followed by a sequence of item names, each referring to an item declared as part of the preceding item.
The module reference may be omitted as a shorthand for referencing the module declared by the currently parsed document.
References are resolved after all modules are loaded. A referenced item MAY be declared later in a document.
Register value
VREG = "=" ( valf / vali / valu / valb )
sign = "+" / "-"
vhex = "0x" 1*HEXDIGIT
vdec = 1*DIGIT
fdec = vdec [ "." 1*DIGIT ] [ "e" [ sign ] vdec ]
fbin = vhex [ "." 1*HEXDIGIT ] [ "p" [ sign ] vdec ]
valu = vhex / vdec
vali = sign valu
valf = [ sign ] ( "NaN" / "INF" / fdec / fbin )
valb = %s"true" / %s"false"
Register values are only used to assign a default value to a data member which is an instance of a register class.
These values are required for class preconstructors.
Object type
TYPE = hndt / cref
hndt = hndr "<" ( cref / %s"handle" / %s"?" ) ">"
hndr = %s"none" / %s"read" / %s"rdex" / %s"rdwr" / %s"rwex"
cref = %s"octet" / %s"bool" [ %s"ean" ] / %s"objsize" / %s"cmprval"
cref =/ IREF ":" UINT
The syntax of a handle begins with the access rights associated with the handle, followed by the type of the referenced object in angle brackets. Handles may also reference objects of undefined (?) type or reference another handle in memory.
The type of object begins with an item reference to a class, followed by its level after a colon, or a predefined class name.
For example, read<stream.buffer:0> means a read-only handle to an instance of class buffer at level 0 from module stream.
Array length
ALEN = "[" [ mlen ":" ] arrl [ ":" arrl ] "]"
mlen = NAME *( "." NAME )
arrl = UINT / %s"MAX"
The special keyword MAX is an alias for 2^32 - 1.
Parsing examples:
[10] => min = 10, max = 10, var = ()
[1:20] => min = 1, max = 20, var = ()
[1:MAX] => min = 1, max = MAX, var = ()
[len:0:255] => min = 0, max = 255, var = ("len")
[obj.len:MAX] => min = 0, max = MAX, var = ("obj", "len")
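The parsing examples above can be reproduced with the following Python sketch. It is an illustration under assumptions: the name parse_alen and the dictionary it returns are hypothetical, and the defaults (min = max when no member is given, min = 0 when a member is given with a single bound) are inferred from the examples.

```python
import re

MAX = 2**32 - 1
NAME = r"[a-z][a-z0-9_]{0,63}"
UINT = r"(?:0[xX][0-9A-Fa-f]{1,16}|[0-9]{1,20})"
ARRL = rf"(?:{UINT}|MAX)"
ALEN = re.compile(
    rf"\[(?:(?P<var>{NAME}(?:\.{NAME})*):)?(?P<a>{ARRL})(?::(?P<b>{ARRL}))?\]"
)


def arrl_value(token: str) -> int:
    """Convert an <arrl> token to an integer."""
    if token == "MAX":
        return MAX
    base = 16 if token[:2].lower() == "0x" else 10
    return int(token, base)


def parse_alen(token: str):
    """Parse an <ALEN> token into min, max and var, or return None."""
    m = ALEN.fullmatch(token)
    if m is None:
        return None
    var = tuple(m["var"].split(".")) if m["var"] else ()
    a = arrl_value(m["a"])
    if m["b"] is not None:
        lo, hi = a, arrl_value(m["b"])
    elif var:
        lo, hi = 0, a   # inferred from the [obj.len:MAX] example
    else:
        lo, hi = a, a   # inferred from the [10] example
    return {"min": lo, "max": hi, "var": var}
```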
Processor functions
Functions are defined by listing their parameters in ascending order, describing the function and formally specifying its outcome.
Processor state consists of the following variables:
- mfin: level finalization state (boolean),
- clvl: current class level (integer),
- cbeg: current container (reference),
- func: current function (reference).
The text function
Updates the format of item descriptions.
- <NAME> format: Name of the new format.
The syntax of item descriptions is Markdown by default and may be changed at any point with this instruction.
Format names SHOULD be subtypes of text/ Internet Media Types.
An example:
This will be interpreted as _Markdown_ text.
.text html
<p>This will be interpreted as <b>HTML</b> text.</p>
Implementations MUST modify the state as follows:
- Set text to format.
The load function
References an external module.
- <CID> mid: Module identifier.
- <UINT> mlv: Required minimum level of the module.
- <NAME> name (optional): Alias for the module.
Modules reference items defined in other modules. These modules must be loaded, or else the dereference phase will fail.
The instruction may appear at any point in the document. It does not have to appear before any reference to the module. It is RECOMMENDED that these instructions appear at the beginning of a document, though.
Implementations MUST return failure if any of the following is true:
- mid is nil.
- mlv is equal to or more than 2^16.
- name was given and mbeg.mrefs has a tuple t with t.name equal to name.
Implementations MUST modify the state as follows:
- Find a tuple t in mbeg.mrefs with t.mid equal to mid.
- If found and t.mlv is less than mlv, set t.mlv to mlv.
- Otherwise, insert a new tuple (mid, mlv, name) into mbeg.mrefs.
The mbeg function
Begins declaration of the module.
- <CID> mid: Module identifier.
The instruction SHOULD NOT appear more than once in a KMDL document.
A module is a special kind of class. Kueea Nodes have at most one instance of these classes. All loaded implementations share the same instance. In other words, modules are singleton objects.
The instance is temporary, stored in volatile memory.
The following functions are implicitly declared:
_create
- Creates a new instance of the module.
- Returns: Read-write handle to an instance.
_upgrade
- Upgrades a module instance from a lower level.
- Parameter: Read-write handle to the old instance.
- Returns: Read-write handle to the new instance.
Both of these functions are called by the kernel only.
The _create function is called when there is no instance available.
The _upgrade function is called when the lowest level of all loaded implementations is higher than the level of the current instance.
This also means that it is not possible to load an implementation of a module at a level lower than the level of the current module instance.
Implementations MUST return failure if any of the following is true:
- mid is nil.
- mbeg.mid is not mid.
- mbeg.data is not empty.
- mbeg.func is not empty.
- mbeg.vals is not empty.
- mbeg.types is not empty.
Implementations MUST modify the state as follows:
- Set mbeg.mlv to 0.
- Set mfin to true.
- Set cbeg to nothing.
- Set func to nothing.
- Set item to mbeg.
- Set desc to item.desc.
The mlvl function
Increases the current level of the module.
- <UINT> mlv: Module level.
- <TAGS> tags: Module level finalization flag.
The new level applies to all items declared after the instruction.
tags contains either +final or +draft.
If the level is final, no changes to it and lower ones will ever be made.
Implementations MUST return failure if any of the following is true:
- mbeg is nothing.
- mlv is equal to 0, mbeg.mlv is equal to 0 and at least one of mbeg.data, mbeg.vals, mbeg.func or mbeg.types is not empty.
- mlv is equal to or more than 2^16.
- mlv is less than mbeg.mlv.
- Both +final and +draft are in tags.
- Neither +final nor +draft is in tags.
- +draft is not in tags and mfin is false.
Implementations MUST modify the state as follows:
- Set cbeg to nothing.
- Set func to nothing.
- Set item to mbeg.
- Set desc to item.desc.
- Set mfin to false if +draft is in tags.
- Set mbeg.mlv to mlv.
The cbeg function
Begins a class declaration.
- <NAME> name: Name of the class.
- <TAGS> tags: Type of the class.
- <CID> id (optional): Identifier of the class.
If omitted, id becomes a version 5 UUID (namespace, SHA-1). The namespace UUID is the UUID of the module. The name is the name of the class (encoded in UTF-8).
If id is the nil UUID, the class will not have a descriptor. It will not have any creation functions. This is used to declare complex members for other classes.
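Deriving the default class identifier can be illustrated with Python's standard uuid module, which implements version 5 (SHA-1, namespace-based) UUIDs. The function name default_class_id is hypothetical.

```python
import uuid


def default_class_id(module_id: uuid.UUID, class_name: str) -> uuid.UUID:
    """Version 5 UUID with the module UUID as namespace.

    uuid.uuid5 encodes a str name as UTF-8 before hashing.
    """
    return uuid.uuid5(module_id, class_name)
```

The result is deterministic, so every processor derives the same identifier for the same module and class name.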
tags can contain either +alias or +iface.
The presence of +alias in tags changes the class declaration into a class alias declaration. The aliased class is referenced by its ID.
The presence of +iface in tags changes the class declaration into an interface declaration. Interfaces must have an ID, which cannot be nil.
Implementations MUST return failure if any of the following is true:
- mbeg is nothing.
- Both +alias and +iface are in tags.
- +alias is in tags and id is nil.
- +iface is in tags and id is nil.
- There is an item i in mbeg.data, mbeg.vals or mbeg.func, of which i.name is equal to name.
- There is a class c in mbeg.types, of which c.name is equal to name and c.cid is not equal to id.
- +alias is not in tags, id is not nil and there is a class c in mbeg.types, of which c.cid is equal to id and c.name is not equal to name.
Implementations MUST modify the state as follows:
- Find the class c in mbeg.types, of which c.name is equal to name.
- If not found, set c to a new class and insert c into mbeg.types.
- If found, increase c.clv by one; otherwise set c.cid to id, c.clv to 0, c.name to name and c.tags to tags.
- Set cbeg to c.
- Set func to nothing.
- Set item to c.
- Set desc to item.desc.
The cend function
Explicitly ends a class declaration.
No parameters.
Implementations MUST modify the state as follows:
- Set cbeg to nothing.
- Set func to nothing.
- Set item to mbeg.
- Set desc to item.desc.
The clvl function
Increases the level of the current class.
- <UINT> level: New level.
Implementations MUST return failure if any of the following is true:
- cbeg is nothing.
- cbeg.tags contains +alias.
- level is equal to or more than 2^16.
Implementations MUST modify the state as follows:
- Set cbeg.clv to level.
- Set func to nothing.
- Set item to cbeg.
- Set desc to item.desc.
The creg function
Assigns a register type to a class.
- <NAME> type: Register type.
This instruction declares that the class is a register class. Register classes are those for which there exists a conversion between an object stored in memory and a register type stored in CPU registers.
reg1 = regf / regi / regu
regu = %s"u" ( "8" / "16" / "32" / "64" )
regi = %s"i" ( "8" / "16" / "32" / "64" )
regf = %s"f" ( "16" / "32" / "64" / "128" )
The first character of <reg1> is the class of a register. Remaining characters specify N - the width of the register.
Register classes are:
- u: integer in the range [0, 2^N - 1]
- i: integer in the range [-2^(N-1), 2^(N-1) - 1]
- f: real number representable by the IEEE 754 binaryN format
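The integer ranges implied by a register type can be computed as in the Python sketch below. The name integer_range is hypothetical; floating-point types and unrecognized strings yield None in this sketch.

```python
import re

INTEGER_REG = re.compile(r"([ui])(8|16|32|64)")


def integer_range(reg: str):
    """Return (lowest, highest) for an integer register type, else None."""
    m = INTEGER_REG.fullmatch(reg)
    if m is None:
        return None
    n = int(m.group(2))  # N, the width of the register in bits
    if m.group(1) == "u":
        return 0, 2**n - 1
    return -(2**(n - 1)), 2**(n - 1) - 1
```

For example, integer_range("i8") gives the familiar signed 8-bit bounds.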
Such classes implicitly declare the following read-write function members:
_save
- Saves a value to memory.
- Parameter: Register type; the value.
- Returns: Nothing.
_load
- Loads a value from memory.
- Returns: Register type; the value.
Additionally, for integers:
_add
- Loads, adds a value, saves.
- Parameter: Register type; addend.
- Returns: Boolean; value of the carry flag.
_sub
- Loads, subtracts a value, saves.
- Parameter: Register type; subtrahend.
- Returns: Boolean; value of the carry flag.
_mul
- Loads, multiplies by a value, saves.
- Parameter: Register type; multiplier.
- Returns: Register type; high bits of the result.
_div
- Loads, divides by a value, saves.
- Parameter: Register type; divisor.
- Returns: Register type; remainder.
_neg
- Only for signed integers.
- Loads, negates the value, saves.
- Returns: Register type; the saved value.
_and
- Loads, does a bitwise AND, saves.
- Parameter: Register type; bit mask.
- Returns: Register type; the saved value.
_xor
- Loads, does a bitwise XOR, saves.
- Parameter: Register type; bit mask.
- Returns: Register type; the saved value.
_inv
- Loads, inverts all bits, saves.
- Returns: Register type; the saved value.
_set
- Loads, sets specified bits (bitwise OR), saves.
- Parameter: Register type; bit mask.
- Returns: Register type; the saved value.
_clr
- Loads, clears specified bits, saves.
- Parameter: Register type; bit mask.
- Returns: Register type; the saved value.
Implementations MUST return failure if any of the following is true:
- type does not match <reg1>.
- cbeg is nothing.
- cbeg.tags contains +alias.
- cbeg.creg is non-empty.
Implementations MUST modify the state as follows:
- Set cbeg.creg to type.
The data function
Declares the next data member in memory order.
- <TYPE> type: Type of the object.
- <NAME> name: Name of the object.
- <ALEN> array (optional): Length of the array.
- <VREG> value (optional): Default value of elements.
- <UINT> align (optional): Memory alignment.
- <TAGS> tags (optional): Tags.
- <SID> sid (optional): Symbol identifier.
In the case of handles, if there are two handles to the same object, only one of them needs to have rights to the object.
The following tags are recognized in tags:
+ro
- Member is (theoretically) read-only.
+alt
- Member is an alternative of the previous one.
+iface
- Member is an object of an implemented interface.
+nodesc
- Member has no associated description.
An alen is invalid if any of the following is true:
- alen.max is equal to or more than 2^32,
- alen.min is equal to or more than 2^32,
- alen.min is more than alen.max.
The range in alen is used in calculation of the minimum and possible maximum lengths of a class instance. Offsets of all subsequent members are calculated at runtime.
alen.var MUST reference a previously declared member. It holds the current length of the array. The test is done after all documents are loaded.
An align is invalid if any of the following is true:
- align is equal to or more than 2^32,
- align is not a power of 2.
align is an override of the default memory address alignment requirement of the member. Padding is inserted before the member if it is not properly aligned. Authors must consider alignment and padding when designing classes.
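The range checks on alen and align can be sketched in Python. This sketch reads the listed conditions as the ones that make a value invalid, which is how the failure checks that follow use them; the function names are hypothetical.

```python
LIMIT = 2**32


def alen_is_invalid(min_len: int, max_len: int) -> bool:
    """True if an array length range violates any listed condition."""
    return max_len >= LIMIT or min_len >= LIMIT or min_len > max_len


def align_is_invalid(align: int) -> bool:
    """True if an alignment is too large or not a power of 2."""
    is_power_of_two = align > 0 and (align & (align - 1)) == 0
    return align >= LIMIT or not is_power_of_two
```

The bit trick align & (align - 1) clears the lowest set bit, so the result is zero only when exactly one bit is set.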
If sid is omitted and cbeg is not nothing, its value is computed over the concatenation of: cbeg.name, one U+0024 DOLLAR SIGN and name. Otherwise, sid must be omitted as there is no symbol.
Implementations MUST return failure if any of the following is true:
- alen is invalid.
- align is invalid.
- tags contains +alt and item is not a data member.
Additionally, if cbeg is nothing:
- For each data member d in mbeg.data: d.sid is equal to sid or d.name is equal to name.
- For each function f in mbeg.func: f.name is equal to name or f.sid is equal to sid.
- For each named value v in mbeg.vals: v.name is equal to name.
Otherwise, if cbeg is not nothing:
- cbeg.tags contains +alias.
- sid is not omitted.
- For each data member d in cbeg.data: d.name is equal to name.
- For each function f in cbeg.func: f.name is equal to name.
- For each named value v in cbeg.enum: v.name is equal to name.
- cbeg.data is non-empty and its last element d is such that d.clv is more than or equal to cbeg.clv and d.mlv is less than mbeg.mlv.
These tests MUST be done once in the dereference phase:
- tags contains +iface and type is not an interface.
- alen.var is set and does not reference a member of an unsigned integer register class.
- value was given and type is not a register class.
- value was given and it is out of range or has incorrect syntax.
- More than one of the alternatives have a defined value.
Implementations MUST modify the state as follows:
- Create a new data member d.
- Set d.type to type, d.tags to tags, d.array to array, d.align to align, d.value to value, d.mlv to mbeg.mlv, d.sid to sid, d.name to name.
- If cbeg is not nothing, set d.clv to cbeg.clv.
- Append d to mbeg.data if cbeg is nothing; otherwise, append d to cbeg.data.
- Set func to nothing.
- Set item to d.
- Set desc to item.desc if tags does not contain +nodesc.
The nval function
Declares a named value.
- <NAME> name
- Name of the value.
- <VREG> value
- Named value.
The func function
Begins declaration of a normal function member.
- <NAME> name
- Name of the function.
- <TAGS> tags (optional)
- Tags.
- <SID> sid (optional)
- Symbol identifier.
If sid is omitted: if cbeg is not nothing, sid is computed over the concatenation of: cbeg.name, one U+0024 DOLLAR SIGN, name; otherwise, it is computed over just name.
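The string over which a default sid is computed can be sketched as follows. Only the concatenation is shown; the computation applied to that string is not described in this section and is not reproduced here. The helper name func_sid_input is hypothetical, and class_name stands for cbeg.name (None when cbeg is nothing).

```python
from typing import Optional

DOLLAR = "\u0024"  # U+0024 DOLLAR SIGN


def func_sid_input(name: str, class_name: Optional[str] = None) -> str:
    """Build the input string for the default sid of a func declaration."""
    if class_name is None:
        # cbeg is nothing: the sid is computed over just the name
        return name
    return class_name + DOLLAR + name
```

A function named open declared inside a class named file would therefore use file$open as input, and just open at module scope.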
The following tags are recognized in tags:
+ro
- Function does not write to the class instance.
+mod
- Function requires access to the module instance.
+krn
- Function requires access to the kernel instance.
+var
- Function expects more parameters than declared.
Implementations MUST return failure if any of the following is true:
If cbeg is nothing:
- For each data member d in mbeg.data: d.name is equal to name or d.sid is equal to sid.
- For each function f in mbeg.func: f.name is equal to name or f.sid is equal to sid.
- For each named value v in mbeg.vals: v.name is equal to name.
Otherwise, if cbeg is not nothing:
- cbeg.tags contains +alias.
- For each data member d in cbeg.data: d.name is equal to name.
- For each function f in cbeg.func: f.name is equal to name or f.sid is equal to sid.
- For each named value v in cbeg.enum: v.name is equal to name.
Implementations MUST modify the state as follows:
- Create a new function member f.
- Set f.tags to tags, f.mlv to mbeg.mlv, f.sid to sid, f.name to name.
- If cbeg is not nothing, set f.clv to cbeg.clv.
- Insert f into mbeg.func if cbeg is nothing; otherwise, insert f into cbeg.func.
- Set func to f.
- Set item to f.
- Set desc to item.desc.
The fret function
Declares the type of the return value of the current function.
- <TYPE> type
- Type of the value.
If the function does not return a value, then the document simply does not declare any return value.
Implementations MUST return failure if any of the following is true:
- func is nothing.
- func.rval is not nothing.
- func.tags contains +init.
Implementations MUST modify the state as follows:
- Create a new data member r.
- Set r.type to type.
- Set func.rval to r.
- Set item to r.
- Set desc to item.desc.
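The failure tests and state changes of fret can be sketched as follows. This is a simplified model under stated assumptions: the State, Func and Member classes are hypothetical stand-ins for the processor state, tags are modelled as a set of strings, and the desc variable is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Member:
    """Hypothetical data member; only the type is modelled."""
    type: str


@dataclass
class Func:
    """Hypothetical function member with its tags and return value."""
    name: str
    tags: set = field(default_factory=set)
    rval: Optional[Member] = None


@dataclass
class State:
    """Hypothetical processor state: current function and current item."""
    func: Optional[Func] = None
    item: object = None


def fret(state: State, type_ref: str) -> bool:
    """Return False on failure; otherwise record the return value."""
    f = state.func
    if f is None or f.rval is not None or "+init" in f.tags:
        return False
    r = Member(type=type_ref)  # create a new data member r
    f.rval = r                 # set func.rval to r
    state.item = r             # set item to r
    return True
```

A second fret for the same function fails because func.rval is already set, which matches the rule that a function declares at most one return value.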
The farg function
Declares the next in-order parameter to the current function.
- <TYPE> type_in
- Input type of the parameter.
- <NAME> name
- Name of the parameter.
- <TYPE> type_out (optional)
- Output type of the parameter.
The type of a function parameter is one of:
- an object passed by value of length up to 128 octets,
- an object passed by handle (by memory reference),
- an object passed by reference to a handle (bidirectional handle),
- a reference to an instance of a register class that is guaranteed to be modified only by the function.
Objects passed by value are specified by simply writing a class reference.
.farg module.class:0 by_value
Objects passed by handle are written by specifying the handle.
.farg read<module.class:0> by_handle
Objects passed by a bidirectional handle (one that gives access to the function and back to the caller) are written as two handles.
.farg none<module.class:0> bidi_handle1 rdwr<module.class:0>
.farg rdwr<module.class:0> bidi_handle2 rdwr<module.class:1>
The bidi_handle1 parameter in the example is a reference to a handle. The function is not given any access rights to the referenced memory. Upon return, the handle contains an address to an instance of module.class at level 0 and the caller receives read-write access to this object.
The bidi_handle2 parameter in the example is a reference to a handle. The function is given read-write access rights to an instance of module.class at level 0. Upon return, the handle contains an address to an instance of module.class at level 1 and the caller receives read-write access to this object.
The last category is written as two class references. The two references MUST reference a register class.
.farg int.u8 u8_ref int.u8
Their primary use is returning a small object in a situation when passing a buffer by handle would expose too much data to the function. This is also faster than passing by handle (no handle processing).
The referenced object may be safely copied before the call is made and then copied back to the original buffer upon return.
Implementations MUST return failure if any of the following is true:
- func is nothing.
- type_in is a handle, type_out was not omitted and is not a handle.
- type_in is a class reference, type_out was not omitted and is not a class reference.
Additionally, these tests MUST be done once in the dereference phase:
- type_in and type_out are class references and the referenced types are not register classes.
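The four parameter categories and the two type-mismatch failures can be sketched as follows. This is an illustration under a loud assumption: a handle is recognised purely by its shape in the examples (an access word followed by a class reference in angle brackets), which is not a full type parser. The helper names and the category labels are hypothetical. The register class test is not modelled because it belongs to the dereference phase.

```python
from typing import Optional


def is_handle(type_ref: str) -> bool:
    """Assumption: handles look like rights<class>, e.g. read<module.class:0>."""
    return "<" in type_ref and type_ref.endswith(">")


def farg_kind(type_in: str, type_out: Optional[str] = None) -> Optional[str]:
    """Classify a parameter; return None where the text requires failure."""
    if type_out is None:
        return "by_handle" if is_handle(type_in) else "by_value"
    if is_handle(type_in) != is_handle(type_out):
        # one side is a handle and the other a class reference
        return None
    return "bidi_handle" if is_handle(type_in) else "register_ref"
```

With the examples from this section, by_value, by_handle, bidi_handle1 and u8_ref fall into the four categories in that order.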
Implementations MUST modify the state as follows:
- Create a new function parameter p.
- Set p.name to name, p.type_in to type_in, p.type_out to type_out.
- Append p to func.params.
- Set item to p.
- Set desc to item.desc.
The init function
Begins declaration of a class constructor.
- <NAME> name
- Name of the constructor.
- <TAGS> tags (optional)
- Tags.
- <SIDN> sid_init (optional)
- Symbol identifier for the init symbol.
- <SIDN> sid_create (optional)
- Symbol identifier for the create symbol.
This function declares two function members: name_init and name_create.
The name_init function is a normal member function which takes the defined parameters as arguments and returns a status boolean.
The name_create function is a static member function which takes kernel-defined parameters first and then the defined parameters as arguments. The function returns a read-write handle to an instance of the class. It always requires access to the module context (to the class descriptor).
Compute name_init as follows:
- Let n be an empty string.
- Append init to n.
- If name is equal to default, return n.
- Append one U+0024 DOLLAR SIGN to n.
- Append name to n.
- Return n.
Compute name_create as follows:
- Let n be an empty string.
- Append create to n.
- If name is equal to default, return n.
- Append one U+0024 DOLLAR SIGN to n.
- Append name to n.
- Return n.
If omitted, sid_init is computed over the concatenation of: cbeg.name, one U+0024 DOLLAR SIGN, name_init.
If omitted, sid_create is computed over the concatenation of: cbeg.name, one U+0024 DOLLAR SIGN, name_create.
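The two name computations and the input strings for the default symbol identifiers can be sketched as follows. The helper names are hypothetical, class_name stands for cbeg.name, and the computation applied to the concatenated string is not described in this section and is not shown.

```python
DOLLAR = "\u0024"  # U+0024 DOLLAR SIGN


def _ctor_name(prefix: str, name: str) -> str:
    """Follow the listed steps: prefix, then '$' and name unless name is default."""
    n = ""
    n += prefix
    if name == "default":
        return n
    n += DOLLAR
    n += name
    return n


def name_init(name: str) -> str:
    return _ctor_name("init", name)


def name_create(name: str) -> str:
    return _ctor_name("create", name)


def sid_input(class_name: str, member_name: str) -> str:
    """Input string for a default sid_init or sid_create."""
    return class_name + DOLLAR + member_name
```

A constructor named copy in a class named file yields the members init$copy and create$copy, with file$init$copy and file$create$copy as default symbol identifier inputs.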
The following tags are recognized in tags:
+mod
- Function requires access to the module instance.
+krn
- Function requires access to the kernel instance.
By convention, destructors SHOULD be called fini. Their prototype is defined by the kernel's object interface.
Implementations MUST return failure if any of the following is true:
- cbeg is nothing.
- cbeg.tags contains +alias.
- For each function f in cbeg.func: f.sid is equal to sid_init or f.sid is equal to sid_create.
Implementations MUST modify the state as follows:
- Create a new function member create.
- Set create.tags to tags, create.mlv to mbeg.mlv, create.clv to cbeg.clv, create.sid to sid_create, create.name to name_create.
- Add +create to create.tags.
- Insert create into cbeg.func.
- Create a new function member init.
- Set init.tags to tags, init.mlv to mbeg.mlv, init.clv to cbeg.clv, init.sid to sid_init, init.name to name_init.
- Add +init to init.tags.
- Insert init into cbeg.func.
- Set func to init.
- Set item to init.
- Set desc to item.desc.
The evnt function
Begins declaration of an event function.
- <NAME> name
- Name of the event.
- <SIDN> sid_install (optional)
- Symbol identifier for the install function.
- <SIDN> sid_uninstall (optional)
- Symbol identifier for the uninstall function.
Compute name_func as follows:
- Let n be an empty string.
- Append name to n.
- Append one U+0024 DOLLAR SIGN to n.
- Append func to n.
- Return n.
Compute name_install as follows:
- Let n be an empty string.
- Append name to n.
- Append one U+0024 DOLLAR SIGN to n.
- Append install to n.
- Return n.
Compute name_uninstall as follows:
- Let n be an empty string.
- Append name to n.
- Append one U+0024 DOLLAR SIGN to n.
- Append uninstall to n.
- Return n.
If omitted, sid_install is computed over: if cbeg is nothing, name_install; otherwise the concatenation of: cbeg.name, one U+0024 DOLLAR SIGN, name_install.
If omitted, sid_uninstall is computed over: if cbeg is nothing, name_uninstall; otherwise the concatenation of: cbeg.name, one U+0024 DOLLAR SIGN, name_uninstall.
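The three event names and the input strings for the default symbol identifiers can be sketched as follows. The helper names are hypothetical, class_name stands for cbeg.name (None when cbeg is nothing), and the computation applied to the resulting string is not shown.

```python
from typing import Optional

DOLLAR = "\u0024"  # U+0024 DOLLAR SIGN


def event_names(name: str) -> dict:
    """Compute name_func, name_install and name_uninstall for an event."""
    return {
        "name_func": name + DOLLAR + "func",
        "name_install": name + DOLLAR + "install",
        "name_uninstall": name + DOLLAR + "uninstall",
    }


def event_sid_input(member_name: str, class_name: Optional[str] = None) -> str:
    """Input string for a default sid_install or sid_uninstall."""
    if class_name is None:
        return member_name
    return class_name + DOLLAR + member_name
```

An event named close gives close$func, close$install and close$uninstall; inside a class named file, the default sid_install input becomes file$close$install.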
The mesg function
Begins declaration of a message function.
- <NAME> name
- Name of the message.
- <SID> sid (optional)
- Symbol identifier.
Message functions always return a human-readable message.
The impl function
Declares an implementation of a function prototype.
- <NAME> name
- Name of the function.
- <IREF> type
- Reference to the function prototype.
- <TAGS> tags (optional)
- Tags.
- <SID> sid (optional)
- Symbol identifier.
File Format URI
File Format URI of KMDL documents is rd://74RDM3TULLIOIPQ6V2GC3EZ3/2020/KMDL#Document.
Internet Media Type
Media type of KMDL documents is text/prs.kueea.kmdl.
The charset parameter MUST be included with the value UTF-8.
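A receiver could check an incoming Content-Type value along these lines. This is a sketch using the Python standard library's email.message.Message to parse the header; the helper name is hypothetical, and comparing the charset value without regard to case is an assumption, not something this document states.

```python
from email.message import Message

KMDL_MEDIA_TYPE = "text/prs.kueea.kmdl"


def is_kmdl_content_type(header_value: str) -> bool:
    """True if the value names the KMDL media type with charset UTF-8."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case, or None
    charset = msg.get_content_charset()
    return msg.get_content_type() == KMDL_MEDIA_TYPE and charset == "utf-8"
```

A value without the charset parameter is rejected, in line with the requirement that the parameter be included.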