Guide: Creating a Custom Model

1 Introduction

A model defines the vocabulary and behavior of your specification documents. It declares what types of spec objects exist (requirements, design items, test cases), what floats are available (diagrams, tables, code listings), how cross-references resolve, and what validation rules apply.

SpecCompiler ships with a default model that provides base types (SECTION, FIGURE, TABLE, PLANTUML, etc.). You create a custom model when your domain needs additional types, specialized validation, or custom rendering.

1.1 Overlay

Models work as overlays on top of the default model. When you set template: mymodel in project.yaml, the engine loads types in this order:

  1. models/default/types/ – Always loaded first.
  2. models/mymodel/types/ – Loaded second; types with the same id override the default.

This means your custom model only needs to define the types it adds or overrides. Everything else inherits from default.
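For instance, a hypothetical override of the default FIGURE type might look like this (a sketch; because an override with the same id replaces the default definition entirely, it must restate every field it still needs):

```lua
-- models/mymodel/types/floats/figure.lua (hypothetical override sketch)
local M = {}

-- Same id as the default FIGURE type, so this definition replaces it.
M.float = {
    id = "FIGURE",
    long_name = "Figure",
    caption_format = "Fig.",    -- changed: captions read "Fig. 1" instead of "Figure 1"
    counter_group = "FIGURE",   -- restated: overrides replace, they do not merge
}

return M
```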

2 Model Directory Layout

Listing : Model directory structure
models/{name}/
  types/
    objects/          -- Spec object types (e.g., hlr.lua, vc.lua)
    specifications/   -- Specification types (e.g., srs.lua)
    floats/           -- Float types (e.g., figure.lua, chart.lua)
    views/            -- View types (e.g., abbrev.lua, math_inline.lua)
    relations/        -- Relation types (e.g., xref_decomposition.lua)
  proofs/             -- Validation proof queries (e.g., sd_601_*.lua)
  postprocessors/     -- Format post-processing (docx.lua, html5.lua)
  filters/            -- Pandoc Lua filters per output format
  styles/             -- Style presets (preset.lua, docx.lua, html.lua)
  data_views/         -- Chart data generators
  handlers/           -- Custom pipeline handlers

Only types/ is required. All other directories are optional.

3 Type Definition Pattern

Every type module is a Lua file that returns a table with up to two keys:

  • A schema key (M.object, M.float, M.relation, M.view, or M.specification) that declares the type’s metadata and is registered into the database.
  • An optional M.handler table that hooks into the pipeline lifecycle.

3.1 Schema Keys by Category

Table : Schema keys by type category
Category       | Schema Key      | Example File
Spec Objects   | M.object        | types/objects/section.lua
Floats         | M.float         | types/floats/figure.lua
Relations      | M.relation      | types/relations/xref_citation.lua
Views          | M.view          | types/views/abbrev.lua
Specifications | M.specification | types/specifications/srs.lua

3.2 Handler Lifecycle

Handlers hook into pipeline phases via callback functions:

Table : Handler callback functions
Callback             | Phase      | Purpose
on_initialize        | INITIALIZE | Parse content from Pandoc AST, store in database
on_analyze           | ANALYZE    | Validate, resolve references, generate PIDs
on_transform         | TRANSFORM  | Render content, resolve external resources
on_render_SpecObject | EMIT       | Convert spec object to Pandoc blocks for output
on_render_Code       | EMIT       | Convert inline code to Pandoc inlines (views)
on_render_CodeBlock  | EMIT       | Convert code block to Pandoc blocks (floats)

The prerequisites field controls execution order: a handler with prerequisites = {"spec_views"} runs after the spec_views handler.
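For example (a schematic fragment, not a complete module), a handler that must see the view records stored by the built-in spec_views handler would declare it as a prerequisite:

```lua
-- Schematic: this handler runs after the built-in spec_views handler.
M.handler = {
    name = "my_handler",
    prerequisites = { "spec_views" },  -- spec_views completes first in each phase

    on_analyze = function(data, contexts, diagnostics)
        -- By this point, spec_views has already populated its tables.
    end,
}
```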

4 Walkthrough: Custom Object Type

This example creates a High-Level Requirement (HLR) type with required attributes.

4.1 Step 1: Create the Type File

Create models/mymodel/types/objects/hlr.lua:

Listing : Custom object type: hlr.lua
local M = {}

M.object = {
    id = "HLR",
    long_name = "High-Level Requirement",
    description = "A top-level system requirement",
    pid_prefix = "HLR",           -- Auto-PID prefix
    pid_format = "%s-%03d",       -- Produces HLR-001, HLR-002, etc.
    attributes = {
        {
            name = "priority",
            type = "ENUM",
            values = { "High", "Medium", "Low" },
            min_occurs = 1,       -- Required
            max_occurs = 1,
        },
        {
            name = "status",
            type = "ENUM",
            values = { "Draft", "Approved", "Implemented" },
            min_occurs = 1,
            max_occurs = 1,
        },
        {
            name = "rationale",
            type = "XHTML",       -- Rich text
            min_occurs = 0,       -- Optional
        },
    },
}

return M

4.2 Step 2: Use in Markdown

Listing : Using the custom object type
## hlr: User Authentication @HLR-001

> priority: High

> status: Draft

> rationale: Required by security policy section 4.2

The system shall authenticate users via username and password.

4.3 Step 3: Add a Handler (Optional)

If the type needs custom behavior during pipeline phases, add M.handler:

Listing : Object type with handler
-- Added to hlr.lua at module level, alongside M.object
local Queries = require("db.queries")

M.handler = {
    name = "hlr_handler",
    prerequisites = {},

    on_analyze = function(data, contexts, diagnostics)
        for _, ctx in ipairs(contexts) do
            local spec_id = ctx.spec_id or "default"
            local objects = data:query_all(
                Queries.content.objects_by_spec_type,
                { spec_id = spec_id, type_ref = "HLR" }
            )
            for _, obj in ipairs(objects or {}) do
                -- Custom validation logic here
            end
        end
    end,
}

4.4 Object Schema Fields Reference

Table : Object schema fields
Field        | Type    | Default    | Description
id           | string  | required   | Unique identifier (uppercase convention)
long_name    | string  | same as id | Human-readable name
description  | string  | ""         | Description text
extends      | string  | nil        | Base type for inheritance
is_default   | boolean | false      | If true, headers without explicit type match this
is_composite | boolean | false      | Composite object flag
pid_prefix   | string  | nil        | Prefix for auto-generated PIDs
pid_format   | string  | nil        | Printf format string for PIDs
aliases      | list    | nil        | Alternative identifiers for syntax matching
attributes   | list    | nil        | Attribute definitions (see Attribute Schema)

4.5 Attribute Schema

Table : Attribute definition fields
Field        | Type    | Default  | Description
name         | string  | required | Attribute identifier
type         | string  | "STRING" | Datatype: STRING, INTEGER, REAL, BOOLEAN, DATE, ENUM, XHTML
min_occurs   | integer | 0        | Minimum values (0 = optional, 1 = required)
max_occurs   | integer | 1        | Maximum values
min_value    | number  | nil      | Lower bound for numeric types
max_value    | number  | nil      | Upper bound for numeric types
values       | list    | nil      | Valid enum values (required when type = "ENUM")
datatype_ref | string  | nil      | Explicit datatype ID (overrides auto-generated)
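As an illustrative fragment (the attribute names here are hypothetical), a bounded numeric attribute combines type with min_value/max_value, while a max_occurs above 1 permits multiple values:

```lua
attributes = {
    {
        name = "timeout_ms",        -- hypothetical attribute
        type = "INTEGER",
        min_occurs = 1,             -- required
        min_value = 0,              -- reject negative values
        max_value = 60000,          -- reject values above one minute
    },
    {
        name = "verified_by",       -- hypothetical attribute
        type = "STRING",
        min_occurs = 0,             -- optional
        max_occurs = 5,             -- up to five values allowed
    },
}
```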

5 Walkthrough: Custom Float Type

Floats are numbered elements declared in fenced code blocks. This example creates a custom float type for diagrams.

5.1 Step 1: Create the Type File

Create models/mymodel/types/floats/sequence_diagram.lua:

Listing : Custom float type: sequence_diagram.lua
local M = {}

M.float = {
    id = "SEQUENCE",
    long_name = "Sequence Diagram",
    description = "UML Sequence Diagram rendered via PlantUML",
    caption_format = "Figure",        -- Caption prefix in output
    counter_group = "FIGURE",         -- Shares counter with FIGURE, PLANTUML
    aliases = { "seq", "sequence" },  -- Syntax: ```seq:label or ```sequence:label
    needs_external_render = true,     -- Requires external tool
}

return M

5.2 Float Schema Fields Reference

Table : Float schema fields
Field                 | Type    | Default    | Description
id                    | string  | required   | Unique identifier (uppercase)
caption_format        | string  | same as id | Prefix used in output captions
counter_group         | string  | same as id | Counter sharing group (e.g., FIGURE, TABLE)
aliases               | list    | nil        | Alternative syntax identifiers
needs_external_render | boolean | false      | Whether rendering requires an external tool
style_id              | string  | nil        | Custom style identifier for output formatting

5.3 Counter Groups

Multiple float types can share a numbering sequence by using the same counter_group. For example, FIGURE, PLANTUML, and CHART all use counter_group = "FIGURE", so they are numbered sequentially as Figure 1, Figure 2, Figure 3 regardless of which specific type each is.

6 Walkthrough: Float with External Rendering

When a float type needs an external tool to produce its output (PlantUML for diagrams, Deno for charts, etc.), it uses the external render handler. This handler collects all items that need rendering, spawns external processes in parallel, and dispatches results back to type-specific callbacks.

6.1 How External Rendering Works

The pipeline flow for external renders:

  1. INITIALIZE – The float is parsed from the Markdown code block and stored in spec_floats with raw_content.
  2. TRANSFORM – The external render handler (src/pipeline/transform/external_render_handler.lua) queries all floats where needs_external_render = 1 and resolved_ast IS NULL.
  3. Prepare – For each float, the handler calls the registered prepare_task callback, which writes input files and builds a command descriptor.
  4. Cache check – If output_path exists on disk (from a previous build), the task is skipped and handle_result is called immediately with the cached path.
  5. Batch spawn – All non-cached tasks are spawned in parallel via task_runner.spawn_batch.
  6. Dispatch – Results (stdout, stderr, exit code) are dispatched to each type’s handle_result callback, which updates resolved_ast in the database.

6.2 Registering a Renderer

External renderers are registered at module load time by calling external_render.register_renderer(type_ref, callbacks). The callbacks table must provide two functions:

Table : External render callback functions
Callback      | Signature and Purpose
prepare_task  | function(float, build_dir, log, data, model_name) -> task|nil – Writes input files, builds command descriptor. Returns nil to skip rendering.
handle_result | function(task, success, stdout, stderr, data, log) – Processes output. Updates resolved_ast in the database via float_base.update_resolved_ast.

6.3 Task Descriptor

The prepare_task callback returns a task descriptor table:

Table : Task descriptor fields
Field       | Type   | Description
cmd         | string | Command to execute (e.g., "plantuml", "deno")
args        | list   | Command arguments
opts        | table  | Options: cwd (working directory), timeout (milliseconds)
output_path | string | Expected output file path; if it exists, the task is skipped (cache hit)
context     | table  | Arbitrary data passed through to handle_result (float record, hash, paths, etc.)

6.4 Example: PlantUML Renderer

The built-in PlantUML renderer demonstrates the full pattern:

Listing : PlantUML external renderer (simplified)
local float_base = require("pipeline.shared.float_base")
local task_runner = require("infra.process.task_runner")
local external_render = require("pipeline.transform.external_render_handler")

local M = {}

M.float = {
    id = "PLANTUML",
    long_name = "PlantUML Diagram",
    caption_format = "Figure",
    counter_group = "FIGURE",
    aliases = { "puml", "plantuml", "uml" },
    needs_external_render = true,   -- Enables external render pipeline
}

external_render.register_renderer("PLANTUML", {
    prepare_task = function(float, build_dir, log)
        local content = float.raw_content or ''
        -- Ensure @startuml/@enduml wrapper
        if not content:match('@startuml') then
            content = '@startuml\n' .. content .. '\n@enduml'
        end

        local hash = pandoc.sha1(content)
        local diagrams_path = build_dir .. "/diagrams"
        local puml_file = diagrams_path .. "/" .. hash .. ".puml"
        local png_file = diagrams_path .. "/" .. hash .. ".png"

        task_runner.ensure_dir(diagrams_path)
        task_runner.write_file(puml_file, content)

        return {
            cmd = "plantuml",
            args = { "-tpng", puml_file },
            opts = { timeout = 30000 },
            output_path = png_file,       -- Cache key: skip if PNG exists
            context = {
                hash = hash,
                float = float,
                relative_path = "diagrams/" .. hash .. ".png",
            }
        }
    end,

    handle_result = function(task, success, stdout, stderr, data, log)
        local ctx = task.context
        if not success then
            log.warn("PlantUML failed for %s: %s",
                ctx.float.identifier:sub(1,12), stderr)
            return
        end

        -- Store resolved path as JSON in resolved_ast
        local json = string.format(
            '{"png_paths":["%s"]}',
            ctx.relative_path
        )
        float_base.update_resolved_ast(data, ctx.float.identifier, json)
    end
})

return M

6.5 Example: Chart Renderer with Data Injection

The chart renderer adds a data injection step before rendering, loading data views from models/{model}/data_views/:

Listing : Chart external renderer (simplified)
local float_base = require("pipeline.shared.float_base")
local task_runner = require("infra.process.task_runner")
local data_loader = require("core.data_loader")
local external_render = require("pipeline.transform.external_render_handler")

local M = {}

M.float = {
    id = "CHART",
    long_name = "Chart",
    caption_format = "Figure",
    counter_group = "FIGURE",
    aliases = { "echarts", "echart" },
    needs_external_render = true,
}

external_render.register_renderer("CHART", {
    prepare_task = function(float, build_dir, log, data, model_name)
        local attrs = float_base.decode_attributes(float)
        local json_content = float.raw_content or '{}'

        -- Data injection: load view module and merge data into ECharts config
        local view_name = attrs.view
        if view_name and data then
            local inject_attrs = { view = view_name, model = model_name }
            local config = pandoc.json.decode(json_content)
            local injected = data_loader.inject_chart_data(
                config, inject_attrs, data, log)
            if injected then
                json_content = pandoc.json.encode(injected)
            end
        end

        local hash = pandoc.sha1(json_content)
        local charts_path = build_dir .. "/charts"
        local json_file = charts_path .. "/" .. hash .. ".json"
        local png_file = charts_path .. "/" .. hash .. ".png"

        task_runner.ensure_dir(charts_path)
        task_runner.write_file(json_file, json_content)

        return {
            cmd = "deno",
            args = {
                "run", "--allow-read", "--allow-write", "--allow-env",
                "echarts-render.ts", json_file, png_file,
                tostring(attrs.width or 600),
                tostring(attrs.height or 400)
            },
            opts = { timeout = 60000 },
            output_path = png_file,
            context = {
                hash = hash,
                float = float,
                relative_path = "charts/" .. hash .. ".png",
            }
        }
    end,

    handle_result = function(task, success, stdout, stderr, data, log)
        local ctx = task.context
        if not success then
            log.warn("Chart render failed: %s", stderr)
            return
        end

        local json = string.format('{"png_path":"%s"}', ctx.relative_path)
        float_base.update_resolved_ast(data, ctx.float.identifier, json)
    end
})

return M

6.6 Creating Your Own External Renderer

To create a float type that uses an external tool:

  1. Set needs_external_render = true in the float schema.
  2. Register callbacks with external_render.register_renderer("YOUR_TYPE", { ... }) at module load time (top-level code, not inside a function).
  3. In prepare_task: Write input content to a temporary file, build the command and arguments, and return a task descriptor with output_path for file-based caching.
  4. In handle_result: Parse the output (stdout, generated files), serialize the result as JSON, and call float_base.update_resolved_ast(data, identifier, json) to store it.
  5. Do not define M.handler.on_transform – the external render handler orchestrates the TRANSFORM phase for all registered types. Defining your own on_transform would bypass the parallel batch execution.

Key utilities available:

Table : Utility functions for external renderers
Function                                       | Purpose
task_runner.ensure_dir(path)                   | Create directory if it does not exist
task_runner.write_file(path, content)          | Write content to a file; returns ok, err
task_runner.file_exists(path)                  | Check if a file exists on disk
task_runner.command_exists(cmd)                | Check if a command is available in PATH
float_base.decode_attributes(float)            | Parse float's pandoc_attributes JSON into a Lua table
float_base.update_resolved_ast(data, id, json) | Store the rendering result in the database

6.7 File-Based Caching

The external render handler provides automatic file-based caching via the output_path field in the task descriptor. If the output file already exists on disk when prepare_task returns, the handler skips spawning the external process and immediately calls handle_result with empty stdout/stderr. This means:

  • The content hash should be part of the output filename (e.g., diagrams/{sha1}.png) so that content changes produce a new filename and trigger re-rendering.
  • The handle_result callback should work correctly whether called after a fresh render or a cache hit (it receives the same task.context).
  • Deleting the output files forces re-rendering on the next build (the database resolved_ast is also cleared during INITIALIZE).

6.8 External Rendering for Views

The same external_render.register_renderer mechanism works for views that need external tools. For example, math_inline.lua registers a renderer for the MATH_INLINE view type to convert AsciiMath to MathML/OMML via an external script. The handler queries spec_views (instead of spec_floats) with needs_external_render = 1 and dispatches to the same callback interface.

7 Walkthrough: Custom Relation with Inference Rules

Relations connect spec objects via link syntax. The relation resolver uses specificity scoring to infer the relation type.

7.1 Step 1: Create the Type File

Create models/mymodel/types/relations/traces_to.lua:

Listing : Custom relation type: traces_to.lua
local M = {}

M.relation = {
    id = "TRACES_TO",
    long_name = "Traces To",
    description = "Traceability link from LLR to HLR",
    link_selector = "@",             -- Uses [PID](@) syntax
    source_type_ref = "LLR",        -- Only from LLR objects
    target_type_ref = "HLR",        -- Only to HLR objects
    aliases = nil,                   -- No alias prefix
    is_default = false,
}

return M

7.2 Step 2: Use in Markdown

Listing : Using the relation type
### llr: Password Length Check @LLR-001

Passwords must be at least 8 characters. Traces to [HLR-001](@).

7.3 Inference Scoring

When multiple relation types could match a link, the resolver scores each candidate:

Table : Inference scoring dimensions
Dimension         | Match | Constraint mismatch | No constraint (NULL)
Selector (@ or #) | +1    | Eliminated          | +0
Source attribute  | +1    | Eliminated          | +0
Source type       | +1    | Eliminated          | +0
Target type       | +1    | Eliminated          | +0

The highest-scoring candidate wins. If two candidates tie, the relation is flagged as ambiguous (relation_ambiguous). Constraints set to nil act as wildcards (+0) rather than eliminating the candidate.
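The scoring rule can be sketched in plain Lua (an illustration of the algorithm described above, not the engine's actual code; the field names mirror the relation schema):

```lua
-- Illustration only: score one candidate relation type against a link.
-- Returns nil if any declared constraint mismatches (candidate eliminated),
-- otherwise the specificity score (+1 per matching constraint, +0 per nil).
local function score_candidate(candidate, link)
    local score = 0
    local checks = {
        { candidate.link_selector,    link.selector },
        { candidate.source_attribute, link.source_attribute },
        { candidate.source_type_ref,  link.source_type },
        { candidate.target_type_ref,  link.target_type },
    }
    for _, pair in ipairs(checks) do
        local constraint, actual = pair[1], pair[2]
        if constraint ~= nil then
            if constraint ~= actual then
                return nil                  -- hard mismatch: eliminated
            end
            score = score + 1               -- matching constraint
        end
        -- nil constraint: wildcard, contributes +0
    end
    return score
end

-- A fully constrained candidate (score 2) beats a selector-only one (score 1):
-- score_candidate({ link_selector = "@", target_type_ref = "HLR" },
--                 { selector = "@", target_type = "HLR" })  --> 2
```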

7.4 Relation Schema Fields Reference

Table : Relation schema fields
Field            | Type    | Default  | Description
id               | string  | required | Unique identifier (uppercase)
link_selector    | string  | nil      | Required selector: "@" for PID refs, "#" for label refs
source_type_ref  | string  | nil      | Constrain source to this object type (nil = any)
target_type_ref  | string  | nil      | Constrain target to this object type (nil = any)
source_attribute | string  | nil      | Constrain to links within this attribute context
aliases          | list    | nil      | Prefix aliases for [alias:key](#) syntax
is_default       | boolean | false    | Default relation for its selector when no better match

7.5 Relations with Handlers

A relation type can include a handler for custom transform behavior. For example, xref_citation.lua rewrites citation links to Pandoc Cite elements during the TRANSFORM phase:

Listing : Relation type with handler
M.handler = {
    name = "my_relation_handler",
    prerequisites = {"spec_relations"},  -- Run after relations are stored

    on_transform = function(data, contexts, diagnostics)
        for _, ctx in ipairs(contexts) do
            -- Custom transform logic
        end
    end
}

7.6 Base Types and Inheritance

Relation types support inheritance via base types. Instead of repeating link_selector and resolution logic in every type, you extend a base type:

  • traceable (models/default/types/relations/traceable.lua) — base for @ (PID) selector
  • xref (models/default/types/relations/xref.lua) — base for # (label) selector

Use extend() to create a concrete type:

Listing : Extending the traceable base type
local traceable = require("models.default.types.relations.traceable")
local M = {}

M.relation = traceable.extend({
    id = "TRACES_TO",
    long_name = "Traces To",
    description = "Traceability link from one object to another",
})

return M

The extend() call inherits link_selector = "@" from the base and merges your overrides. For # selector types, use xref.extend() instead.
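Correspondingly, a label-based relation would extend the xref base (a sketch; the id here is hypothetical):

```lua
local xref = require("models.default.types.relations.xref")
local M = {}

-- Inherits link_selector = "#" from the xref base type.
M.relation = xref.extend({
    id = "XREF_TERM",                 -- hypothetical id
    long_name = "Term Reference",
    description = "Cross-reference to a term by label",
})

return M
```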

By default, object references display the target’s PID and float references display the caption format with the float number (e.g., “Figure 3”). To customize display text, add a standard M.handler with an on_transform hook using the shared link_rewrite_utils utility:

Listing : Custom display text via on_transform
local traceable = require("models.default.types.relations.traceable")
local link_rewrite = require("pipeline.shared.link_rewrite_utils")
local M = {}

M.relation = traceable.extend({
    id = "XREF_DIC",
    long_name = "Dictionary Reference",
    description = "Cross-reference to a dictionary entry",
    target_type_ref = "DIC",
})

M.handler = {
    name = "xref_dic_handler",
    prerequisites = {"spec_relations"},
    on_transform = function(data, contexts, _diagnostics)
        link_rewrite.rewrite_display_for_type(data, contexts, "XREF_DIC", function(target)
            if target.title_text and target.title_text ~= "" then
                return target.title_text
            end
        end)
    end
}

return M

A link [DIC-AUTH-001](@) would display as “Authentication” instead of “DIC-AUTH-001”.

The display callback (the function passed to rewrite_display_for_type) receives a target table with fields pid, type_ref, and title_text. Return a string for custom display text, or nil to keep the default.

8 Walkthrough: Custom View

Views are inline elements declared with backtick syntax (`prefix: content`).

8.1 Step 1: Create the Type File

Create models/mymodel/types/views/symbol.lua:

Listing : Custom view type: symbol.lua
local M = {}
local Queries = require("db.queries")

M.view = {
    id = "SYMBOL",
    long_name = "Symbol",
    description = "Engineering symbol with unit definition",
    aliases = { "sym" },
    inline_prefix = "symbol",         -- Enables `symbol: content` syntax
    needs_external_render = false,
}

M.handler = {
    name = "symbol_handler",
    prerequisites = {"spec_views"},

    on_initialize = function(data, contexts, diagnostics)
        for _, ctx in ipairs(contexts) do
            local doc = ctx.doc
            if not doc or not doc.blocks then goto continue end

            local spec_id = ctx.spec_id or "default"
            local file_seq = 0

            local visitor = {
                Code = function(c)
                    local content = (c.text or ""):match("^symbol:%s*(.+)$")
                        or (c.text or ""):match("^sym:%s*(.+)$")
                    if not content then return nil end

                    file_seq = file_seq + 1
                    local identifier = pandoc.sha1(spec_id .. ":" .. file_seq .. ":" .. content)

                    data:execute(Queries.content.insert_view, {
                        identifier = identifier,
                        specification_ref = spec_id,
                        view_type_ref = "SYMBOL",
                        from_file = ctx.source_path or "unknown",
                        file_seq = file_seq,
                        raw_ast = content
                    })
                end
            }

            for _, block in ipairs(doc.blocks) do
                pandoc.walk_block(block, visitor)
            end
            ::continue::
        end
    end,

    on_render_Code = function(code, ctx)
        local content = (code.text or ""):match("^symbol:%s*(.+)$")
            or (code.text or ""):match("^sym:%s*(.+)$")
        if not content then return nil end

        -- Render as emphasized text
        return { pandoc.Emph({ pandoc.Str(content) }) }
    end,
}

return M

8.2 Step 2: Use in Markdown

Listing : Using the custom view type
The force is defined as `symbol: F = ma` where `symbol: F` is force in Newtons.

8.3 View Schema Fields Reference

Table : View schema fields
Field                 | Type    | Default  | Description
id                    | string  | required | Unique identifier (uppercase)
inline_prefix         | string  | nil      | Prefix for inline code dispatch (e.g., "math" enables math: syntax)
aliases               | list    | nil      | Alternative prefixes for the same view type
needs_external_render | boolean | false    | Whether rendering requires an external tool (batch processing)
materializer_type     | string  | nil      | Materializer strategy (e.g., "toc", "lof", "custom")
counter_group         | string  | nil      | Counter group for numbered views

9 Validation Proofs

Proofs are SQL-based validation rules that run during the VERIFY phase. Each proof creates a SQL view; if the view returns any rows, those rows represent violations.

9.1 Proof File Pattern

Create models/mymodel/proofs/vc_missing_hlr_traceability.lua:

Listing : Validation proof module
local M = {}

M.proof = {
    view = "view_traceability_vc_missing_hlr",
    policy_key = "traceability_vc_to_hlr", -- Key in project.yaml validation section
    sql = [[
CREATE VIEW IF NOT EXISTS view_traceability_vc_missing_hlr AS
SELECT
  vc.identifier AS object_id,
  vc.pid AS object_pid,
  vc.title_text AS object_title,
  vc.from_file,
  vc.start_line
FROM spec_objects vc
WHERE vc.type_ref = 'VC'
  AND NOT EXISTS (
    SELECT 1
    FROM spec_relations r
    JOIN spec_objects target ON target.identifier = r.target_ref
    WHERE r.source_ref = vc.identifier
      AND target.type_ref = 'HLR'
  );
]],
    message = function(row)
        local label = row.object_pid or row.object_title or row.object_id
        return string.format(
            "Verification case '%s' has no traceability link to an HLR",
            label
        )
    end
}

return M

9.2 Proof Schema

Table : Proof module fields
Field      | Type     | Description
view       | string   | SQL view name (must match the CREATE VIEW name)
policy_key | string   | Key for suppression in project.yaml validation section
sql        | string   | SQL CREATE VIEW statement; rows returned = violations
message    | function | Takes a row table, returns a diagnostic message string

9.3 Suppressing Proofs

Users suppress proofs in project.yaml using the policy_key:

Listing : Suppressing a validation proof
validation:
  traceability_vc_to_hlr: ignore   # Suppress this proof

10 Model Overlay/Extension Pattern

10.1 How Overlays Work

The type loader (src/core/type_loader.lua) loads models in two passes:

  1. Default model: Scans models/default/types/{category}/ and registers all types.
  2. Custom model: Scans models/{template}/types/{category}/ and registers all types.

Since type registration uses INSERT OR REPLACE, a custom model type with the same id as a default type replaces it entirely. Types with new IDs are added alongside the defaults.

10.2 Path Resolution

The loader resolves model paths in order:

  1. $SPECCOMPILER_HOME/models/{name}/types/ (Docker/production)
  2. ./models/{name}/types/ (local development)

10.3 Partial Customization Example

A model that only adds an HLR type and a custom proof:

Listing : Minimal overlay model
models/mymodel/
  types/
    objects/
      hlr.lua         -- Adds HLR type (default has no HLR)
    relations/
      traces_to.lua   -- Adds traceability relation
  proofs/
    sd_601_vc_missing_hlr.lua  -- Domain-specific validation

All other types (SECTION, FIGURE, TABLE, etc.) are inherited from default.

11 project.yaml Integration

Set the template field to use your custom model:

Listing : Using a custom model in project.yaml
project:
  code: MYPROJ
  name: My Project

template: mymodel   # Loads models/default/ then models/mymodel/

doc_files:
  - srs.md

Guide: DOCX Customization

1 Introduction

SpecCompiler generates DOCX output through a multi-stage pipeline:

  1. SpecIR – Structured data in SQLite (objects, relations, floats, attributes).
  2. Pandoc AST – The emitter assembles a Pandoc document from the SpecIR.
  3. Pandoc DOCX Writer – Pandoc converts the AST to DOCX using a reference.docx for styles.
  4. Lua Filter – A format-specific filter converts SpecCompiler markers to OOXML (captions, bookmarks, math).
  5. Postprocessor – Manipulates the generated DOCX ZIP archive (positioned floats, caption orphan prevention, template-specific OOXML).

Customization is available at three levels: style presets (fonts, spacing, page layout), filters (AST-to-OOXML conversion), and postprocessors (raw OOXML manipulation).

2 How Pandoc reference.docx Works

Pandoc uses a reference document as a style template for DOCX output. The reference document defines paragraph styles (Normal, Heading 1, Caption, etc.), page dimensions, margins, and default formatting. Pandoc does not copy content from the reference document – only styles and settings.

SpecCompiler manages the reference document in two ways:

  1. Auto-generated from presets (default): SpecCompiler builds a reference.docx from Lua style presets, storing it at {output_dir}/reference.docx.
  2. User-provided: Set docx.reference_doc in project.yaml to use your own Word template.
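A sketch of the second option (the file path here is hypothetical):

```yaml
# project.yaml: use your own Word template instead of the preset-generated one
docx:
  reference_doc: templates/corporate-reference.docx   # hypothetical path
```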

3 Style Presets

Style presets are Lua files that declaratively define DOCX styles. They are located at:

models/{template}/styles/{preset}/preset.lua

3.1 Preset Table Structure

A preset file returns a Lua table with the following top-level keys:

Listing : Preset table structure
return {
    name = "My Preset",
    description = "Custom document styles",

    -- Page configuration
    page = {
        size = "A4",              -- "Letter" or "A4"
        orientation = "portrait", -- "portrait" or "landscape"
        margins = {
            top = "2.5cm",
            bottom = "2.5cm",
            left = "3cm",
            right = "2cm",
        },
    },

    -- Paragraph styles (array of style definitions)
    paragraph_styles = { ... },

    -- Table styles (array of table style definitions)
    table_styles = { ... },

    -- Caption formats per float type
    captions = { ... },

    -- Document settings
    settings = {
        default_tab_stop = 720,   -- In twips (720 = 0.5 inch)
        language = "en-US",
    },

    -- Optional: inherit from another preset
    extends = {
        template = "default",     -- Base template
        preset = "base",          -- Base preset name
    },
}

3.2 Paragraph Style Fields

Table : Paragraph style fields
Field          | Type    | Default  | Description
id             | string  | required | Internal style ID (e.g., "Heading1")
name           | string  | required | Display name in Word (e.g., "Heading 1")
based_on       | string  | nil      | Parent style ID for inheritance
next           | string  | nil      | Style to apply to the next paragraph
font.name      | string  | nil      | Font family name
font.size      | number  | nil      | Font size in points
font.color     | string  | nil      | Hex color without # (e.g., "2F5496")
font.bold      | boolean | nil      | Bold text
font.italic    | boolean | nil      | Italic text
spacing.line   | number  | nil      | Line spacing multiplier (1.0 = single, 1.15, 2.0, etc.)
spacing.before | number  | nil      | Space before paragraph in points
spacing.after  | number  | nil      | Space after paragraph in points
alignment      | string  | nil      | Text alignment: "left", "center", "right", "both" (justified)
indent.left    | string  | nil      | Left indent (e.g., "0.5in", "1cm")
indent.right   | string  | nil      | Right indent
keep_next      | boolean | nil      | Keep with next paragraph (prevent orphaning)
outline_level  | integer | nil      | Outline level for TOC (0 = Heading 1, 1 = Heading 2, etc.)

3.3 Paragraph Style Example

Listing : Paragraph style definitions
paragraph_styles = {
    {
        id = "Normal",
        name = "Normal",
        font = { name = "Calibri", size = 11 },
        spacing = { line = 1.15, after = 8 },
        alignment = "left",
    },
    {
        id = "Heading1",
        name = "Heading 1",
        based_on = "Normal",
        next = "Normal",
        font = { name = "Calibri Light", size = 16, color = "2F5496" },
        spacing = { before = 12, after = 0, line = 1.15 },
        keep_next = true,
        outline_level = 0,
    },
    {
        id = "Caption",
        name = "Caption",
        based_on = "Normal",
        font = { name = "Calibri", size = 9, italic = true },
        spacing = { before = 0, after = 10, line = 1.15 },
    },
}

3.4 Table Styles

Listing : Table style definition
table_styles = {
    {
        id = "TableGrid",
        name = "Table Grid",
        borders = {
            top    = { style = "single", width = 0.5, color = "000000" },
            bottom = { style = "single", width = 0.5, color = "000000" },
            left   = { style = "single", width = 0.5, color = "000000" },
            right  = { style = "single", width = 0.5, color = "000000" },
            inside_h = { style = "single", width = 0.5, color = "000000" },
            inside_v = { style = "single", width = 0.5, color = "000000" },
        },
        cell_margins = {
            top = "0.05in",
            bottom = "0.05in",
            left = "0.08in",
            right = "0.08in",
        },
        autofit = true,
    },
}

3.5 Caption Configuration

Listing : Caption configuration per float type
captions = {
    figure = {
        template = "{prefix} {number}: {title}",
        prefix = "Figure",
        separator = ": ",
        style = "Caption",
    },
    table = {
        template = "{prefix} {number}: {title}",
        prefix = "Table",
        separator = ": ",
        style = "Caption",
    },
    listing = {
        template = "{prefix} {number}: {title}",
        prefix = "Listing",
        separator = ": ",
        style = "Caption",
    },
}

3.6 Preset Inheritance

Presets can extend other presets using the extends field. The child preset is deep-merged with the base, with child values taking precedence:

Listing : Preset inheritance
-- models/mymodel/styles/academic/preset.lua
return {
    name = "Academic",
    description = "Academic paper styles",

    extends = {
        template = "default",    -- Base template
        preset = "default",      -- Base preset name
    },

    -- Override only what changes
    page = {
        size = "A4",
        margins = { top = "2.5cm", bottom = "2.5cm", left = "3cm", right = "2cm" },
    },

    paragraph_styles = {
        {
            id = "Normal",
            name = "Normal",
            font = { name = "Times New Roman", size = 12 },
            spacing = { line = 1.5, after = 0 },
            alignment = "both",   -- Justified
        },
    },
}

The loader detects circular dependencies and reports them as errors.
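
The merge semantics can be pictured with a short sketch. This is an illustrative deep-merge, not the loader's actual implementation, and the helper name is hypothetical; note in particular that how array-valued fields such as paragraph_styles are combined (by position or by style id) is up to the loader.

```lua
-- Illustrative sketch of preset deep-merging (hypothetical helper).
-- Child values win; nested tables merge recursively.
local function deep_merge(base, child)
    local result = {}
    for k, v in pairs(base) do
        result[k] = v
    end
    for k, v in pairs(child) do
        if type(v) == "table" and type(result[k]) == "table" then
            result[k] = deep_merge(result[k], v)  -- recurse into subtables
        else
            result[k] = v                         -- child value takes precedence
        end
    end
    return result
end
```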

3.7 Format-Specific Style Overrides

Beyond the main preset.lua, you can provide format-specific style files:

  • models/{template}/styles/{preset}/docx.lua – DOCX-specific overrides
  • models/{template}/styles/{preset}/html.lua – HTML-specific overrides

These files return tables with keys like float_styles and object_styles that are merged with the base preset at emit time.
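
An HTML override file might look like the following sketch. The float_styles and object_styles keys come from the description above, but the entries inside them are hypothetical illustrations, not a documented schema:

```lua
-- models/mymodel/styles/academic/html.lua (illustrative sketch)
return {
    float_styles = {
        -- hypothetical entry: style hints for FIGURE floats in HTML output
        figure = { css_class = "figure-academic" },
    },
    object_styles = {
        -- hypothetical entry: style hints for requirement objects
        hlr = { css_class = "requirement" },
    },
}
```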

4 Postprocessors

Postprocessors manipulate the generated DOCX file after Pandoc produces it. They operate on raw OOXML inside the ZIP archive.

4.1 Loading

The base postprocessor (models/default/postprocessors/docx.lua) is always loaded. It handles:

  • Positioned floats – Converts inline images to anchored format with margin-relative positioning.
  • Caption orphan prevention – Adds keepNext to Caption-styled paragraphs.

Template-specific postprocessors are loaded from models/{template}/postprocessors/docx.lua.

4.2 Hook Interface

A template postprocessor exports functions that are called in sequence:

Table : Postprocessor hook functions
Hook Input Purpose
process_document(content, config, log) document.xml content Modify main document body
process_styles(content, log, config) styles.xml content Modify or inject style definitions
process_numbering(content, log) numbering.xml content Modify list numbering definitions
process_content_types(content, log) [Content_Types].xml content Add content type declarations
process_settings(content, log) settings.xml content Modify document settings
process_rels(content, log) document.xml.rels content Add/modify relationship entries
create_additional_parts(temp_dir, log, config) Temp directory path Create new parts (headers, footers)

All hooks are optional. Each XML hook receives the current file content as a string and returns the modified content; create_additional_parts instead receives the path of the temporary extraction directory and creates new files there.

4.3 Writing a Custom Postprocessor

Create models/mymodel/postprocessors/docx.lua:

Listing : Custom DOCX postprocessor
local M = {}

function M.process_document(content, config, log)
    local modified = content

    -- Example: Add custom watermark text to every paragraph
    -- (Real implementations would use proper OOXML patterns)

    log.debug("[MYMODEL-POST] Processing document.xml")
    return modified
end

function M.process_styles(content, log, config)
    local modified = content

    -- Example: Inject a custom paragraph style
    log.debug("[MYMODEL-POST] Processing styles.xml")
    return modified
end

return M

4.4 The config Parameter

The config table passed to hooks contains:

  • template – The template name
  • docx – DOCX configuration from project.yaml
  • spec_metadata – Specification-level attributes (in create_additional_parts)
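
As a concrete, if simplified, illustration of what a hook body can do, the following sketch injects w:keepNext into Caption-styled paragraphs via string substitution, similar in spirit to the base postprocessor's caption orphan prevention. The gsub pattern is an assumption: real OOXML paragraph properties vary more than this single pattern captures.

```lua
function M.process_document(content, config, log)
    -- Simplified sketch: insert <w:keepNext/> into the paragraph properties
    -- of every Caption-styled paragraph. Treat the pattern as illustrative;
    -- production code must handle additional pPr variations.
    local modified, count = content:gsub(
        '(<w:pPr><w:pStyle w:val="Caption"/>)',
        '%1<w:keepNext/>')
    log.debug("[MYMODEL-POST] Added keepNext to " .. count .. " captions")
    return modified
end
```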

5 Filters

Pandoc Lua filters run during the DOCX write phase and convert SpecCompiler format markers to OOXML. The default filter (models/default/filters/docx.lua) handles:

Table : Default DOCX filter conversions
Input Marker Output
RawBlock("speccompiler", "page-break") OOXML page break
RawBlock("speccompiler", "vertical-space:NNNN") OOXML spacing (in twips)
RawBlock("speccompiler", "bookmark-start:ID:NAME") OOXML bookmark start
RawBlock("speccompiler", "math-omml:OMML") OOXML math element
Div.speccompiler-caption OOXML caption with SEQ field
Div.speccompiler-numbered-equation OOXML numbered equation with tab layout
Div.speccompiler-positioned-float Position markers for postprocessor
Link with .ext target Rewritten to .docx target

5.1 When to Use Filters vs Postprocessors

  • Filters operate on the Pandoc AST before DOCX generation. Use them when you need to convert SpecCompiler markers to OOXML elements that Pandoc will then place in the document.
  • Postprocessors operate on the raw OOXML after DOCX generation. Use them when you need to manipulate the final XML directly (style injection, image positioning, headers/footers).
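
For orientation, a minimal filter handling the page-break marker from the table above might look like this sketch; it uses the standard Pandoc Lua filter interface, and the OOXML snippet is the conventional page-break run:

```lua
-- Minimal sketch of a Pandoc Lua filter converting a SpecCompiler
-- marker into raw OOXML. The default filter handles many more markers.
function RawBlock(el)
    if el.format == "speccompiler" and el.text == "page-break" then
        return pandoc.RawBlock("openxml",
            '<w:p><w:r><w:br w:type="page"/></w:r></w:p>')
    end
    -- other markers fall through unchanged
end
```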

6 project.yaml Configuration

6.1 DOCX-Specific Settings

Listing : DOCX configuration in project.yaml
# Output format entry
outputs:
  - format: docx
    path: build/docx/{spec_id}.docx

# DOCX-specific configuration
docx:
  preset: default              # Style preset name
  # reference_doc: assets/reference.docx  # Custom reference (overrides preset)

6.2 Configuration Precedence

  1. If docx.reference_doc is set, that file is used directly as the Pandoc reference document.
  2. If docx.preset is set (or defaults to the model’s styles), SpecCompiler generates {output_dir}/reference.docx from the preset.
  3. If neither is set, Pandoc uses its built-in default styles.

7 Reference Document Cache

When using presets, SpecCompiler caches the generated reference.docx to avoid regenerating it on every build.

The cache works as follows:

  1. Compute SHA-1 hash of the preset file content.
  2. Compare against the stored hash in the build_meta table (key-value store in specir.db).
  3. If the hashes match and reference.docx exists on disk, skip generation.
  4. If the preset changed or reference.docx is missing, regenerate and update the cache.
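
In Lua-flavored pseudocode, the check amounts to the following sketch; all function and key names here are hypothetical stand-ins for the real implementation:

```lua
-- Illustrative sketch of the reference.docx cache check.
-- sha1(), read_file(), file_exists(), build_meta_get/set(), and
-- generate_reference_docx() are hypothetical helpers.
local preset_hash = sha1(read_file(preset_path))
local cached_hash = build_meta_get("reference_docx_hash")

if preset_hash == cached_hash and file_exists(reference_docx_path) then
    -- Hashes match and the file exists: skip generation.
else
    generate_reference_docx(preset, reference_docx_path)
    build_meta_set("reference_docx_hash", preset_hash)
end
```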

To force regeneration of the reference document, delete it:

Listing : Force reference document regeneration
rm -f build/reference.docx
./bin/speccompiler-core

SpecCompiler Core User Manual

1 Introduction

1.1 What is SpecCompiler?

SpecCompiler is a document processing pipeline that transforms structured Markdown specifications into multiple output formats (DOCX, HTML5). It provides:

  • Structured authoring: Define requirements, designs, and verification cases using a consistent syntax.
  • Traceability: Link objects together with Project Identifier (PID) and #label references.
  • Validation: Automatically verify data integrity through proof views backed by Structured Query Language (SQL) queries against the Specification Intermediate Representation (SpecIR).
  • Multi-format output: Generate Word documents and web content from a single source.

SpecCompiler processes documents through a five-phase pipeline: INITIALIZE, ANALYZE, TRANSFORM, VERIFY, and EMIT, as illustrated in Figure 1.

1.2 Scope

This manual covers:

  • Installation and verification of the SpecCompiler-Core Docker image (see Introduction).
  • Configuration of project files (project.yaml) as described in Project Configuration.
  • Authoring specification documents using the SpecCompiler Markdown syntax (Project Configuration).
  • Invocation of the tool and interpretation of its outputs (Invocation).
  • Verification diagnostics and error code reference.
  • Incremental build behavior and cache management.
  • Type system configuration and custom model creation; for a detailed walkthrough, see creating-a-model: Model Directory Layout in the companion model guide.
  • Troubleshooting common problems.

1.3 Pipeline Summary

The processing pipeline consists of five phases:

  1. INITIALIZE – Parse Markdown input via the Pandoc Abstract Syntax Tree (AST), extract specifications, spec objects, attributes, floats, relations, and views into the SpecIR, stored in an SQLite database.
  2. ANALYZE – Resolve relation types and cross-references using specificity-based inference rules (see Equation 1).
  3. TRANSFORM – Resolve floats (render PlantUML, charts, tables), materialize views, rewrite links, and render spec objects using type-specific handlers.
  4. VERIFY – Execute proof views (SQL queries) against the SpecIR database to detect constraint violations; see creating-a-model: Validation Proofs in the model guide.
  5. EMIT – Assemble Pandoc documents from the SpecIR and generate output files in configured formats via parallel Pandoc subprocess invocations.
Figure Processing Pipeline

Figure Operations per Pipeline Phase

The handler counts shown in Figure 2 reflect the default model. Custom models may add handlers in any phase.

2 Installation

2.1 Prerequisites

The following are required to run SpecCompiler-Core:

Table : Runtime prerequisites
Prerequisite Minimum Version Notes
Docker runtime 20.10+ Docker Desktop or Docker Engine (daemon must be running)
Disk space 2 GB For the Docker image and build artifacts
Host OS Linux, macOS, or Windows (with WSL2) The container runs Debian Bookworm (slim)

All dependency versions are pinned in scripts/versions.env.

2.2 Building the Image

Run the Docker installer from the repository root:

Listing : Build and install via Docker
bash scripts/install.sh

This performs three steps:

  1. Docker build – Executes a multi-stage Docker build:
    • Toolchain – Compiles Lua, Pandoc (with GHC), and native Lua extensions (luv, lsqlite3, zip) from source. Builds Deno TypeScript utilities. Downloads and wraps PlantUML with a minimal JRE. This stage is cached and only rebuilt when scripts/versions.env, scripts/build.sh, or src/tools/ change.
    • Runtime-base – Copies only runtime artifacts into a lean Debian Bookworm image without build tools. Installs runtime dependencies (Python/reqif, graphviz, lcov). This is the stable base for code-only updates.
    • Runtime – Overlays src/ and models/ onto runtime-base to produce the final image.
  2. Wrapper generation – Creates a specc Command-Line Interface (CLI) command at ~/.local/bin/specc.
  3. Config – Writes the image reference to ~/.config/speccompiler/env.

The installer supports three modes:

  • Default (bash scripts/install.sh) – Builds the full image if not present.
  • Force (bash scripts/install.sh --force) – Rebuilds everything from scratch, including the toolchain.
  • Code-only (bash scripts/install.sh --code-only) – Updates only src/ and models/ layers without recompiling the toolchain. Always builds from the stable runtime-base image, preventing Docker layer accumulation. Dangling images from previous builds are automatically pruned.

2.3 Verifying Installation

After building, verify the image is available:

Listing : Verify Docker image availability
docker images speccompiler-core

To verify the tool runs correctly, navigate to a directory containing a project.yaml file and run:

Listing : Run wrapper command
specc build

2.4 The specc Wrapper

The specc command is a Docker wrapper generated by the installer. It supports three subcommands:

Table : Wrapper subcommands
Command Description
specc build [project.yaml] Build the project (default file: project.yaml)
specc clean Remove build/ directory and specir.db
specc shell Open an interactive Bash shell inside the container

The build subcommand runs docker run --rm with:

  • --user "$(id -u):$(id -g)" – Preserves host UID/GID.
  • -v "$(pwd):/workspace" – Mounts current directory.
  • -e "SPECCOMPILER_HOME=/opt/speccompiler" – Sets installation root.
  • -e "SPECCOMPILER_DIST=/opt/speccompiler" – Sets distribution root.
  • -e "SPECCOMPILER_LOG_LEVEL=${SPECCOMPILER_LOG_LEVEL:-INFO}" – Passes log level.
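
Put together, the generated wrapper invokes roughly the following. This is a simplified sketch assembled from the flags above; the trailing container command and image-reference handling (read from ~/.config/speccompiler/env) are assumptions about the wrapper's internals:

```shell
docker run --rm \
  --user "$(id -u):$(id -g)" \
  -v "$(pwd):/workspace" \
  -e "SPECCOMPILER_HOME=/opt/speccompiler" \
  -e "SPECCOMPILER_DIST=/opt/speccompiler" \
  -e "SPECCOMPILER_LOG_LEVEL=${SPECCOMPILER_LOG_LEVEL:-INFO}" \
  speccompiler-core:latest build project.yaml
```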

Inside the container, the speccompiler-core entry point invokes Pandoc with the SpecCompiler Lua filter. The -o /dev/null Pandoc flag is intentional – actual output files are generated by the EMIT phase.

3 Project Configuration

All project configuration is specified in a project.yaml file located in the project root directory.

3.1 Complete Configuration Reference

Listing : project.yaml reference
# ============================================================================
# Project Identification (REQUIRED)
# ============================================================================
project:
  code: MYPROJ          # Project code identifier (string, required)
  name: My Project SRS  # Human-readable project name (string, required)

# ============================================================================
# Type Model (REQUIRED)
# ============================================================================
template: default        # Type model name (string, default: "default")
                         # Must match a directory under models/

# ============================================================================
# Logging Configuration (OPTIONAL)
# ============================================================================
logging:
  level: info            # DEBUG | INFO | WARN | ERROR (default: "INFO")
  format: auto           # auto | json | text (default: "auto")
  color: true            # ANSI color codes (default: true)

# ============================================================================
# Validation Policy (OPTIONAL)
# ============================================================================
validation:
  missing_required: ignore
  cardinality_over: ignore
  invalid_cast: ignore
  invalid_enum: ignore
  invalid_date: ignore
  bounds_violation: ignore
  dangling_relation: ignore
  unresolved_relation: ignore

# ============================================================================
# Input Files (REQUIRED)
# ============================================================================
output_dir: build/       # Base output directory (default: "build")

doc_files:               # Markdown files to process, in order
  - srs.md
  - sdd.md

# ============================================================================
# Output Format Configurations (OPTIONAL)
# ============================================================================
outputs:
  - format: docx
    path: build/docx/{spec_id}.docx
  - format: html5
    path: build/www/{spec_id}.html

# ============================================================================
# DOCX Configuration (OPTIONAL)
# ============================================================================
docx:
  preset: null           # Style preset name (models/{template}/presets/)
  # reference_doc: assets/reference.docx  # Custom Word reference

# ============================================================================
# HTML5 Configuration (OPTIONAL)
# ============================================================================
html5:
  number_sections: true
  table_of_contents: true
  toc_depth: 3
  standalone: true
  embed_resources: true
  resource_path: build

# ============================================================================
# Bibliography and Citations (OPTIONAL)
# ============================================================================
bibliography: refs.bib
csl: ieee.csl

3.2 Required Fields

Table : Required project fields
Field Type Description
project.code string Project code identifier
project.name string Human-readable project name
doc_files list One or more Markdown file paths to process

3.3 Default Values

Table : Default configuration values
Field Default Notes
template default Built-in base model is always loaded
output_dir build Also stores specir.db
logging.level INFO Overridden by SPECCOMPILER_LOG_LEVEL env var

4 Document Authoring

SpecCompiler extends standard Markdown with a structured overlay for specification documents. The syntax uses existing Markdown constructs (headers, blockquotes, code blocks, links) with specific patterns that the pipeline recognizes.

4.1 Specifications

Level 1 headers declare the top-level document container.

Pattern: # type: Title @PID

Listing : Specification declaration
# srs: Software Requirements Specification @SRS-001

4.2 Spec Objects

Level 2-6 headers declare requirements, design elements, sections, or any typed element.

Pattern: ## type: Title @PID

Listing : Spec object declaration
## hlr: User Authentication @HLR-001
### llr: Password Validation @LLR-001
#### section: Implementation Notes

If @PID is omitted, a PID is auto-generated using the type’s pid_prefix and pid_format.
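
For example, assuming an hlr type whose pid_prefix is HLR (the exact numbering depends on the type's pid_format), a header without an explicit PID:

```markdown
## hlr: Session Timeout
```

would receive an auto-generated PID such as HLR-002.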

4.3 Attributes

Blockquotes declare attributes using the key: value pattern. They belong to the most recently opened Specification or SpecObject header and do not need to appear immediately after it:

Listing : Attribute declaration
## hlr: User Authentication @HLR-001

> priority: High

> status: Draft

> rationale: Required by security policy

Rules:

  • Each attribute blockquote must be separated by a blank line.
  • The first line must match key: value (where key is [A-Za-z0-9_]+). If not, the blockquote is treated as prose. The key does not need to be a registered attribute type; unregistered keys default to STRING datatype.
  • Multi-line values are supported: continuation lines append to the preceding attribute.
  • Supported datatypes: STRING, INTEGER, REAL, BOOLEAN, DATE (YYYY-MM-DD), ENUM, XHTML.
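
Building on the rules above, a multi-line value is written as one blockquote whose continuation lines append to the attribute; the wording here is purely illustrative:

```markdown
> rationale: Required by security policy.
> The mechanism must also satisfy the audit requirements
> defined by the applicable security standard.
```

All three lines form the value of the single rationale attribute.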

4.4 Floats

Fenced code blocks with a typed first class declare numbered elements.

Pattern: ```type.lang:label{key="val"}

4.4.1 PlantUML Diagram

Listing : PlantUML float syntax
```plantuml:diag-state{caption="State Machine"}
@startuml
[*] --> Active
Active --> Inactive
@enduml
```

4.4.2 Table

Listing : Table float syntax
```list-table:tbl-interfaces{caption="External Interfaces"}
> header-rows: 1
> aligns: l,l,l

* - Interface
  - Protocol
  - Direction
* - GPS
  - ARINC-429
  - Input
```

4.4.3 CSV Table

The Comma-Separated Values (CSV) float alias provides a compact syntax for tabular data:

Listing : CSV float syntax
```csv:tbl-data{caption="Sample Data"}
Name,Value,Unit
Temperature,72.5,F
Pressure,1013.25,hPa
```

Both csv and list-table produce TABLE floats. Use csv for simple tabular data and list-table for tables with rich Markdown content in cells. See Floats in Practice for live examples of each.

4.4.4 Listing (Code)

Listing : Listing float syntax
```listing.c:lst-init{caption="Initialization Routine"}
void init(void) {
    setup_hardware();
}
```

4.4.5 Chart (ECharts)

Listing : Chart float syntax
```chart:chart-coverage{caption="Test Coverage"}
{
  "xAxis": { "data": ["Module A", "Module B"] },
  "series": [{ "type": "bar", "data": [95, 87] }]
}
```

Charts support data injection via view modules. Add view="gauss" and params="mean=0,sigma=1" to the code fence attributes to inject generated data into the ECharts configuration at render time. See Figure 5 for a working example.
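
Combining these attributes, a fence that injects Gaussian data might look like the following sketch. The ECharts skeleton and the empty placeholder series are illustrative; the gauss view replaces the placeholder data at render time, and the space-separated attribute syntax is assumed to follow the fence-attribute pattern shown earlier:

````markdown
```chart:chart-gauss{caption="Normal Distribution" view="gauss" params="mean=0,sigma=1"}
{
  "xAxis": { "type": "value" },
  "series": [{ "type": "line", "data": [] }]
}
```
````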

4.4.6 Math

Listing : Math float syntax
```math:eq-force{caption="Newton's Second Law"}
F = ma
```

Math floats use AsciiMath notation and are rendered to MathML for HTML5 output and OMML for DOCX. See Equation 1 and Equation 2 for live examples in this manual.

4.4.7 Float Syntax Summary

Table : Float syntax components
Component Description
type Float type identifier (for example figure, plantuml, csv, list-table, listing, chart, math)
.lang Optional language hint for syntax highlighting
:label Float label for cross-referencing; must be unique within the specification
{key="val"} Key-value attributes; common attribute: caption

4.5 Relations (Links)

Links use the pattern [content](selector). Selectors are not hardcoded – they are registered by relation types in the model’s type system. Each relation type declares a link_selector field, and the pipeline uses it for resolution and type inference. The default model registers the following selectors:

Table : Default model selectors
Selector Registered by Resolution
@ traceable base (XREF_SEC, and model-specific types) PID lookup: same-spec first, then cross-document fallback
# xref base (XREF_FIGURE, XREF_TABLE, XREF_LISTING, XREF_MATH, XREF_SECP) Scoped label resolution: local scope, then same-spec, then global
@cite XREF_CITATION Rewritten to pandoc Cite element (parenthetical)
@citep XREF_CITATION Rewritten to pandoc Cite element (in-text)

Custom models can register additional selectors by defining relation types with new link_selector values.

Table : Relation syntax patterns
Syntax Example Description
[PID](@) [HLR-001](@) Reference by PID
[type:label](#) [fig:diagram](#) Typed float reference
[scope:type:label](#) [REQ-001:fig:detail](#) Scoped float reference
[key](@cite) [smith2024](@cite) Parenthetical citation
[key](@citep) [smith2024](@citep) In-text citation

4.5.1 Type Inference

After a link is resolved, the inference algorithm scores it against all registered relation types across four unweighted dimensions. Each matching dimension adds +1 to the specificity score, and a constraint mismatch eliminates the candidate entirely. The total score for a candidate is computed as:

S = sum_(i=1)^(4) d_i,   d_i in {0, 1}
(1)

The four dimensions (d_1 through d_4) correspond to the selector, source attribute, source type, and target type, as shown in Table 8:

Table : Type inference scoring dimensions
Dimension Match Constraint mismatch No constraint (NULL)
Selector (@, #, @cite, etc.) +1 Eliminated +0
Source attribute +1 Eliminated +0
Source type +1 Eliminated +0
Target type +1 Eliminated +0

The highest-scoring candidate wins. If two candidates tie, the relation is marked ambiguous. For example, [fig:diagram](#) resolving to a FIGURE float will match XREF_FIGURE (selector # + target type FIGURE = specificity S=2) over the generic xref base (selector # only = specificity S=1).
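
The scoring rule can be sketched as a small function; the field names on the candidate and link records are hypothetical, but the logic mirrors the table above:

```lua
-- Illustrative sketch of specificity scoring. Returns the score, or nil
-- if any constrained dimension mismatches (candidate eliminated).
local function specificity(candidate, link)
    local dims = {
        { candidate.link_selector,    link.selector },
        { candidate.source_attribute, link.source_attribute },
        { candidate.source_type,      link.source_type },
        { candidate.target_type,      link.target_type },
    }
    local score = 0
    for _, d in ipairs(dims) do
        local constraint, actual = d[1], d[2]
        if constraint ~= nil then
            if constraint == actual then
                score = score + 1   -- match: +1
            else
                return nil          -- constraint mismatch: eliminated
            end
        end                         -- no constraint (NULL): +0
    end
    return score
end
```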

4.6 Views

Inline code with a specific prefix declares view placeholders:

Listing : Inline view placeholder
`toc:`

Default model view types:

Table : Default view types
Type Aliases Description
toc - Table of Contents (TOC) from spec object headings
lof lot List of floats (figures, tables, etc.)
abbrev sigla, acronym Define an abbreviation inline: Full Meaning (ABBR)
abbrev_list sigla_list, acronym_list Render a sorted table of all abbreviations defined via abbrev:
math_inline eq, formula Inline math expression rendered to MathML/OMML
gauss gaussian, normal Generate Gaussian distribution data for chart floats

4.7 Body Content

Prose paragraphs, lists, and tables between headers accumulate to the most recently opened Specification or Spec Object.

4.8 File Includes

Split large documents into multiple files using fenced code blocks with the include class:

Listing : Include directive syntax
```include
path/to/chapter1.md
path/to/chapter2.md
```

Each line is a file path relative to the including document’s directory. Absolute paths are also supported. Lines starting with # are treated as comments and ignored.

Include blocks are expanded recursively before the pipeline runs. Circular includes are detected and produce an error. The maximum nesting depth is 100 levels.

Included files are tracked in the build graph for incremental builds – a change to any included file triggers a rebuild.

5 Using the Default Model

5.1 Rationale

The default model ships a complete document authoring toolkit so that authors can write structured technical documents without defining custom types. It provides:

  • Numbered floats – figures, tables, code listings, math equations, PlantUML diagrams, and ECharts charts, each with automatic numbering and captions.
  • Typed cross-references – relation types that resolve @ and # links to specific float and object categories, enabling the pipeline to render appropriate display text (for example, “Figure 3” or “Table 1”).
  • Bibliography citations – integration with Pandoc’s citeproc for parenthetical and in-text citation rendering from BibTeX files.
  • Content views – generated content blocks such as TOC, list of figures, abbreviation tables, and inline math.

The following subsections demonstrate these features with live floats and cross-references. Every float, view, and link shown below is processed by SpecCompiler when this manual is built.

5.2 Floats in Practice

A SpecCompiler document can use all default float types. Each float has a type prefix, a label for cross-referencing, and a caption. The examples below are live – they are rendered when this manual is processed.

5.2.1 Architecture Diagram (PlantUML)

Figure Layered Architecture

5.2.2 Component Table (list-table)

Table : System Components
Component Layer Technology
Web UI Presentation React
Auth Service Business Logic Node.js
Data Service Business Logic Python
Database Persistence PostgreSQL

5.2.3 Performance Metrics (CSV)

Table : Performance Metrics
Metric Target Actual Status
Response time (ms) 200 185 Pass
Throughput (req/s) 1000 1120 Pass
Error rate (%) 1.0 0.3 Pass
Memory usage (MB) 512 487 Pass

5.2.4 Initialization Code (Listing)

Listing : Service Initialization
def initialize(config):
    db = connect(config.db_url)
    auth = AuthService(db)
    return Application(auth, db)

5.2.5 Latency Model (Math)

L = T_network + T_processing + T_db
(2)

5.2.6 Throughput Chart (ECharts)

Figure Throughput by Module

5.3 Cross-References

Every float and object defined above can be referenced from prose. The following paragraph demonstrates cross-reference resolution using the # selector.

The system architecture is depicted in Figure 3. Component details, including the technology stack for each layer, are listed in Table 10. Performance targets and actuals are compared in Table 11 – all four metrics pass their thresholds. The initialization logic is shown in Listing 16, and the latency model driving performance requirements is defined by Equation 2. Finally, throughput measurements by module are visualized in Figure 4.

The @ selector resolves by PID and works across documents. For example, this sentence references the introduction of this manual: INDEX. Cross-document references to the companion guides also work; see creating-a-model: Model Directory Layout for the model directory layout and docx-customization: Style Presets for DOCX style presets.

Table : Cross-reference selector comparison
Selector Syntax Resolution
@ (PID) [PID](@) Exact PID lookup. Same-spec first, then cross-document fallback. Never ambiguous.
# (Label) [type:label](#) Scoped resolution: local scope, then same specification, then global. May be ambiguous if multiple matches at the same scope level.

5.4 Section References

Headers without an explicit TYPE: prefix default to the SECTION type. Sections receive auto-generated PIDs and labels that can be used for cross-referencing:

  • PID format: {spec_pid}-sec{depth.numbers} – for example, SRS-sec1, SRS-sec1.2, SRS-sec2.3.1. Use the @ selector: [SRS-sec1.2](@).
  • Label format: section:{title-slug} – for example, ## Introduction produces the label section:introduction. Use the # selector: [section:introduction](#).

The @ selector performs an exact PID lookup and is never ambiguous. The # selector uses scoped resolution (closest scope wins), which is useful when multiple specifications have sections with similar names.

For cross-document section references with the # selector, use the explicit scope syntax: [SPEC-A:section:design](#) to target a section labeled “design” within the specification whose PID is SPEC-A.

This manual references its own sections using both selectors; the example links resolve within this document at build time.

Cross-document references work identically: because the companion guides are listed in the same project.yaml, links to them also resolve at build time.

5.5 Citations and Bibliography

SpecCompiler integrates with Pandoc’s citeproc processor for scholarly citations.

Step 1. Add bibliography configuration to project.yaml:

Listing : Bibliography configuration in project.yaml
bibliography: refs.bib
csl: ieee.csl

Step 2. Create a BibTeX file (refs.bib):

Listing : Example BibTeX file
@article{smith2024,
  author  = {Smith, John},
  title   = {Advances in Systems Engineering},
  journal = {IEEE Transactions},
  year    = {2024}
}
@book{jones2023,
  author    = {Jones, Alice},
  title     = {Software Architecture Patterns},
  publisher = {O'Reilly},
  year      = {2023}
}

Step 3. Use citation syntax in your document:

Listing : Citation syntax examples
Recent work [smith2024](@cite) demonstrates the approach.

As Smith [smith2024](@citep) argues, the method is effective.

Multiple sources support this [smith2024;jones2023](@cite).
  • [key](@cite) produces a parenthetical citation – for example, “(Smith, 2024)” in author-date styles or “[1]” in numeric styles.
  • [key](@citep) produces an in-text citation – for example, “Smith (2024)” or “Smith [1]”.
  • Multiple keys separated by ; produce a grouped citation.

Processing pipeline: During the TRANSFORM phase, citation links are rewritten to Pandoc Cite elements. During EMIT, Pandoc’s citeproc processor formats citations and appends a bibliography list to the document according to the configured CSL style.

5.6 Views in Practice

Views generate content blocks from the SpecIR.

5.6.1 Abbreviations

The abbrev: view defines abbreviations inline. On first use, the full meaning is displayed alongside the abbreviation. All definitions are collected for the abbrev_list view shown in the List of Abbreviations appendix.

This manual defines abbreviations on first use throughout the text. For example, Entity-Attribute-Value (EAV) is the database pattern used for flexible attributes, and Newline-Delimited JSON (NDJSON) is the format used for diagnostic output.

The syntax is: `abbrev: Full Meaning Text (ABBREVIATION)`. The abbreviation goes in parentheses at the end.

5.6.2 Inline Math

The eq: prefix renders inline math expressions using AsciiMath notation. For example, the quadratic formula is x = (-b ± sqrt(b^2 - 4ac)) / (2a), and Euler's identity is e^(i pi) + 1 = 0.

Inline math is useful for formulas within prose paragraphs, while block math: floats (like Equation 1 and Equation 2) provide numbered equations with captions.

5.6.3 Chart with Data View Injection (Gauss)

Charts can load data dynamically from view modules using the view attribute. The gauss view generates a Gaussian probability density function and injects it into the ECharts dataset. The chart below demonstrates this – the view="gauss" attribute triggers the data injection pipeline:

Figure Standard Normal Distribution

The params attribute passes mean, sigma, xmin, xmax, and points to the Gauss view’s generate() function. The function returns an ECharts dataset that replaces the chart’s placeholder data at render time. This same mechanism supports custom data views that query the SpecIR database; see creating-a-model: Walkthrough: Custom View in the model guide for details on creating view modules.
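A data view's generate() function can be sketched as follows. This assumes only what the text above states – the function receives the params attributes (mean, sigma, xmin, xmax, points) and returns an ECharts dataset; the defaults and field names beyond that are invented for the example:

```lua
-- Sketch of a Gaussian data view's generate(). Receives the params
-- attributes described above and returns an ECharts-style dataset.
-- Defaults and the exact dataset shape are illustrative assumptions.
function generate(params)
    local mean   = params.mean   or 0
    local sigma  = params.sigma  or 1
    local xmin   = params.xmin   or -4
    local xmax   = params.xmax   or 4
    local points = params.points or 101

    local rows = { { "x", "pdf" } }   -- header row for the dataset
    local step = (xmax - xmin) / (points - 1)
    for i = 0, points - 1 do
        local x = xmin + i * step
        -- Gaussian probability density function
        local y = math.exp(-((x - mean) ^ 2) / (2 * sigma ^ 2))
                  / (sigma * math.sqrt(2 * math.pi))
        rows[#rows + 1] = { x, y }
    end
    return { source = rows }          -- replaces the chart's placeholder data
end
```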

5.6.4 Generated Lists

The [LOF] and [LOT] views produce navigable lists of figures and tables. These are rendered in the List of Figures and List of Tables appendices of this manual.

6 Invocation

6.1 Basic Usage

Listing : Basic invocation
specc build

Processes all files from doc_files in the current directory’s project.yaml. An alternative project file can be specified: specc build my-project.yaml.

6.2 Environment Variables

Table : Environment variables
Variable Default Description
SPECCOMPILER_LOG_LEVEL INFO Override log level: DEBUG, INFO, WARN, ERROR
SPECCOMPILER_HOME /opt/speccompiler SpecCompiler installation root (model and binary lookup)
SPECCOMPILER_DIST /opt/speccompiler Distribution root (used internally for external renderers)
SPECCOMPILER_IMAGE speccompiler-core:latest Docker image reference (overrides default in wrapper)
NO_COLOR (unset) Disable ANSI color codes in output

6.3 Exit Codes

Table : Exit Codes
Code Meaning
0 Success: all documents processed and outputs generated
1 Failure: Docker not running, missing configuration, or pipeline error

7 Output Formats

Four output formats are supported. Multiple formats can be generated in a single run.

7.1 DOCX (Microsoft Word)

  • Style presets via docx.preset or custom docx.reference_doc.
  • Model-specific postprocessors for format transformations.

For a complete guide on customizing DOCX output – including paragraph styles, table styles, caption configuration, and postprocessors – see docx-customization: Style Presets and docx-customization: Postprocessors in the companion DOCX Customization guide.

7.2 HTML5

Table : HTML5 output options
Option Type Default Description
number_sections boolean false Add section numbering
table_of_contents boolean false Generate table of contents
toc_depth integer 3 Heading depth for TOC
standalone boolean false Produce complete HTML document
embed_resources boolean false Embed CSS and images inline

7.3 Markdown (GitHub-Flavored Markdown (GFM))

GitHub-Flavored Markdown output, useful for review platforms and static site generators.

7.4 JSON (Pandoc AST)

Full Pandoc AST for programmatic integration with other tools.

8 Verification and Diagnostics

8.1 Diagnostic Output

Diagnostics are emitted in NDJSON format to stderr:

Listing : Diagnostic NDJSON example
{"level":"error","message":"[object_missing_required] Object missing required attribute 'priority' on HLR-001","file":"srs.md","line":42}

8.2 Diagnostic Reference

Table : Validation diagnostics
Policy Key Description
spec_missing_required Specification missing required attribute
spec_invalid_type Invalid specification type reference
object_missing_required Spec object missing required attribute
object_cardinality_over Attribute cardinality exceeded
object_cast_failures Attribute type cast failure
object_invalid_enum Invalid enum value
object_invalid_date Invalid date format (expected YYYY-MM-DD)
object_bounds_violation Value outside declared bounds
object_duplicate_pid Duplicate PID across spec objects
float_orphan Float has no parent object (orphan)
float_duplicate_label Duplicate float label in specification
float_render_failure External render failure
float_invalid_type Invalid float type reference
relation_unresolved Unresolved link (PIDs are case-sensitive)
relation_dangling Dangling relation (target not found)
relation_ambiguous Ambiguous float reference
view_materialization_failure View materialization failure

8.3 Suppressing Validation Rules

Every diagnostic listed in Table 16 can be suppressed or downgraded in project.yaml using its policy key:

Listing : Suppressing a validation rule
validation:
  float_orphan: ignore              # suppress entirely
  relation_unresolved: warn         # downgrade to warning

All proofs default to error (halt the build). Set a key to warn to emit a warning without halting, or ignore to suppress the diagnostic entirely. Custom proofs can define their own policy keys; see creating-a-model: Validation Proofs in the model guide.

9 Incremental Builds

9.1 Build Cache Mechanism

  1. File hashing – SHA-1 hash of each input file.
  2. Include dependency tracking – Tracked in build_graph table.
  3. Cache comparison – Current hashes vs build_cache table.
  4. Skip decision – Unchanged documents reuse cached SpecIR data.
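The skip decision in step 4 amounts to a hash comparison, sketched below with illustrative names (needs_rebuild and the flat cache table are not the engine's actual build_cache schema):

```lua
-- Minimal sketch of the cache comparison: a document is rebuilt when
-- its current hash differs from the cached one, or when no cache
-- entry exists. Names are illustrative, not the engine's API.
function needs_rebuild(path, current_hash, build_cache)
    return build_cache[path] ~= current_hash
end
```

An unchanged document (same hash) reuses its cached SpecIR data; a changed or new document is reprocessed.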

9.2 Forcing a Full Rebuild

Listing : Force full rebuild
specc clean
specc build

10 Type System and Models

10.1 Custom Models

Set template: mymodel in project.yaml. Types load in order:

  1. models/default/types/ – Always loaded first.
  2. models/mymodel/types/ – Loaded as overlay.
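The override semantics of this load order can be sketched as a table merge keyed by type id (overlay() is a hypothetical helper, not the engine's API):

```lua
-- Sketch of the overlay load order: default types are loaded first,
-- then the custom model's types, with same-id entries overriding the
-- default. overlay() is a hypothetical helper, not the engine's API.
function overlay(default_types, custom_types)
    local merged = {}
    for id, t in pairs(default_types) do merged[id] = t end
    for id, t in pairs(custom_types)  do merged[id] = t end  -- same id wins
    return merged
end
```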

For a complete walkthrough on creating custom types – including object types, float types, relation types with inference rules, and view types – see the companion Creating a Custom Model guide.

10.2 Built-in Models

SpecCompiler ships with default and sw_docs. The default model provides general-purpose types (specifications, sections, floats, cross-references, views). The sw_docs model overlays default with types for requirements engineering and traceability:

  • Object types: HLR, LLR, NFR, VC, TR, FD, CSC, CSU, DIC, DD, SF (all extend a common TRACEABLE base with status attribute and PID auto-generation)
  • Specification types: SRS, SDD, SVC, SUM, TRR (document templates with version, status, date)
  • Relation types: TRACES_TO, BELONGS, REALIZES, XREF_DECOMPOSITION, XREF_DIC (traceability links with specificity-based inference)
  • View types: TRACEABILITY_MATRIX, TEST_RESULTS_MATRIX, TEST_EXECUTION_MATRIX, COVERAGE_SUMMARY, REQUIREMENTS_SUMMARY (query-based tables materialized from the SpecIR)
  • Proofs: Traceability chain validation (VC-HLR, TR-VC, FD-CSC/CSU coverage)
  • Postprocessor: Interactive single-file HTML5 web application

The docs/engineering_docs/ directory in this repository uses sw_docs and serves as a living example of the model in practice.

10.3 Type Directory Structure

Listing : Type model directory structure
models/{template}/
  types/
    objects/          # Spec object types
    floats/           # Float types
    relations/        # Relation types
    views/            # View types
    specifications/   # Specification types
  postprocessors/     # Format-specific post-processing
  styles/             # DOCX style presets
  filters/            # Pandoc Lua filters per output format

10.4 Type Module Structure

Object type:

Listing : Object type module example
local M = {}
M.object = {
    id = "HLR",
    long_name = "High-Level Requirement",
    pid_prefix = "HLR",
    pid_format = "%s-%03d",
    attributes = {
        { name = "priority", datatype_ref = "PRIORITY_ENUM",
          min_occurs = 1, max_occurs = 1,
          values = {"High", "Medium", "Low"} },
    }
}
return M

Relation type:

Listing : Relation type module example
local M = {}
M.relation = {
    id = "TRACES_TO",
    link_selector = "@",
    source_type_ref = "LLR",
    target_type_ref = "HLR",
}
return M

10.5 Attribute Schema Fields

Table : Attribute schema fields
Field Type Default Description
name string required Attribute identifier
datatype_ref string STRING One of STRING, INTEGER, REAL, BOOLEAN, DATE, ENUM, XHTML
min_occurs integer 0 Minimum values (0 optional, 1 required)
max_occurs integer 1 Maximum values
min_value number nil Lower bound for numeric values
max_value number nil Upper bound for numeric values
values list nil Valid enum values
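As an illustration of how the bounds and enum fields combine, an attribute list might look like this (the attribute names are invented for the example, not shipped types):

```lua
-- Illustrative attribute entries exercising the schema fields above.
-- Attribute names are examples, not part of any shipped model.
attributes = {
    { name = "risk_weight", datatype_ref = "REAL",
      min_occurs = 0, max_occurs = 1,
      min_value = 0.0, max_value = 1.0 },              -- optional, bounded numeric
    { name = "review_date", datatype_ref = "DATE",
      min_occurs = 1, max_occurs = 1 },                -- required date (YYYY-MM-DD)
    { name = "status", datatype_ref = "ENUM",
      min_occurs = 1, max_occurs = 1,
      values = { "Draft", "Approved", "Retired" } },   -- required enum
}
```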

11 Troubleshooting

11.1 Docker Not Running

Error: Docker is not running – Start Docker daemon, verify with docker info.

11.2 No project.yaml Found

Run specc build from the directory containing project.yaml.

11.3 PlantUML Render Failure

Verify PlantUML syntax, ensure Docker image has Java JRE, check @startuml/@enduml markers.

11.4 Unresolved Relations

PIDs are case-sensitive. Verify target PID exists in doc_files. For cross-document references, ensure both documents are listed in the same project.yaml.

11.5 Build Seems Stale

Listing : Clean stale build cache
specc clean
specc build

11.6 Debugging

Listing : Enable debug logging
SPECCOMPILER_LOG_LEVEL=DEBUG specc build

12 Known Limitations

  • No interactive validation – Batch mode only, no LSP or watch mode.
  • Docker-only distribution – Native install requires replicating the full build environment (scripts/build.sh --install is provided but requires all system dependencies).
  • Single-writer SQLite – Concurrent builds cause locking errors; use separate output directories.
  • Float labels per-specification – Same label can exist across specs; use scoped syntax for cross-spec references.
  • PID case sensitivity – [hlr-001](@) will not match @HLR-001.

13 List of Figures

14 List of Tables

15 List of Listings

16 List of Abbreviations

AST Pandoc Abstract Syntax Tree
CLI Command-Line Interface
CSV Comma-Separated Values
EAV Entity-Attribute-Value
GFM GitHub-Flavored Markdown
NDJSON Newline-Delimited JSON
PID Project Identifier
SpecIR Specification Intermediate Representation
SQL Structured Query Language
SQLite SQLite Database