Guide: Creating a Custom Model
1 Introduction
A model defines the vocabulary and behavior of your specification documents. It declares what types of spec objects exist (requirements, design items, test cases), what floats are available (diagrams, tables, code listings), how cross-references resolve, and what validation rules apply.
SpecCompiler ships with a default
model that provides base types (SECTION, FIGURE, TABLE, PLANTUML, etc.).
You create a custom model when your domain needs additional types,
specialized validation, or custom rendering.
1.1 Overlay
Models work as overlays on top of default. When you set template: mymodel in project.yaml, the engine loads types in this order:
- models/default/types/ – Always loaded first.
- models/mymodel/types/ – Loaded second; types with the same id override the default.
This means your custom model only needs to define the types it adds
or overrides. Everything else inherits from default.
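The overlay rule can be illustrated with a minimal sketch. It uses a plain Lua table as a stand-in for the type registry (the real loader registers types in a database), and load_types is a hypothetical helper written only for this illustration.

```lua
-- Hypothetical in-memory registry keyed by type id.
local function load_types(registry, types)
  for _, t in ipairs(types) do
    registry[t.id] = t  -- same id: the later definition replaces the earlier one
  end
  return registry
end

local default_types = {
  { id = "SECTION", long_name = "Section" },
  { id = "FIGURE",  long_name = "Figure" },
}
local mymodel_types = {
  { id = "FIGURE", long_name = "Illustration" },           -- overrides the default
  { id = "HLR",    long_name = "High-Level Requirement" }, -- new type
}

local registry = load_types({}, default_types)  -- pass 1: default model
load_types(registry, mymodel_types)             -- pass 2: custom model
```

After both passes, SECTION still comes from default, FIGURE is the custom definition, and HLR exists alongside the defaults.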
2 Model Directory Layout
models/{name}/
  types/
    objects/          -- Spec object types (e.g., hlr.lua, vc.lua)
    specifications/   -- Specification types (e.g., srs.lua)
    floats/           -- Float types (e.g., figure.lua, chart.lua)
    views/            -- View types (e.g., abbrev.lua, math_inline.lua)
    relations/        -- Relation types (e.g., xref_decomposition.lua)
  proofs/             -- Validation proof queries (e.g., sd_601_*.lua)
  postprocessors/     -- Format post-processing (docx.lua, html5.lua)
  filters/            -- Pandoc Lua filters per output format
  styles/             -- Style presets (preset.lua, docx.lua, html.lua)
  data_views/         -- Chart data generators
  handlers/           -- Custom pipeline handlers
Only types/ is required. All other
directories are optional.
3 Type Definition Pattern
Every type module is a Lua file that returns a table with up to two keys:
- A schema key (M.object, M.float, M.relation, M.view, or M.specification) that declares the type’s metadata and gets registered into the database.
- An optional M.handler table that hooks into the pipeline lifecycle.
3.1 Schema Keys by Category
| Category | Schema Key | Example File |
|---|---|---|
| Spec Objects | M.object | types/objects/section.lua |
| Floats | M.float | types/floats/figure.lua |
| Relations | M.relation | types/relations/xref_citation.lua |
| Views | M.view | types/views/abbrev.lua |
| Specifications | M.specification | types/specifications/srs.lua |
3.2 Handler Lifecycle
Handlers hook into pipeline phases via callback functions:
| Callback | Phase | Purpose |
|---|---|---|
| on_initialize | INITIALIZE | Parse content from Pandoc AST, store in database |
| on_analyze | ANALYZE | Validate, resolve references, generate PIDs |
| on_transform | TRANSFORM | Render content, resolve external resources |
| on_render_SpecObject | EMIT | Convert spec object to Pandoc blocks for output |
| on_render_Code | EMIT | Convert inline code to Pandoc inlines (views) |
| on_render_CodeBlock | EMIT | Convert code block to Pandoc blocks (floats) |
The prerequisites field controls
execution order: a handler with prerequisites = {"spec_views"} runs
after the spec_views
handler.
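One way to picture the ordering rule is a depth-first topological sort over handler names. This is a sketch under assumptions, not the engine's scheduler: order_handlers is hypothetical, and it simply ignores prerequisite names that are not registered.

```lua
-- Order handlers so that each one runs after its prerequisites.
local function order_handlers(handlers)
  local by_name, state, ordered = {}, {}, {}
  for _, h in ipairs(handlers) do by_name[h.name] = h end

  local function visit(h)
    if state[h.name] == "done" then return end
    if state[h.name] == "visiting" then
      error("circular prerequisite involving " .. h.name)
    end
    state[h.name] = "visiting"
    for _, dep in ipairs(h.prerequisites or {}) do
      local dep_handler = by_name[dep]
      if dep_handler then visit(dep_handler) end  -- unknown names are skipped in this sketch
    end
    state[h.name] = "done"
    ordered[#ordered + 1] = h.name
  end

  for _, h in ipairs(handlers) do visit(h) end
  return ordered
end

local order = order_handlers({
  { name = "symbol_handler", prerequisites = { "spec_views" } },
  { name = "spec_views",     prerequisites = {} },
})
```

Here spec_views is placed before symbol_handler even though it was listed second.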
4 Walkthrough: Custom Object Type
This example creates a High-Level Requirement (HLR) type with required attributes.
4.1 Step 1: Create the Type File
Create models/mymodel/types/objects/hlr.lua:
local M = {}
M.object = {
id = "HLR",
long_name = "High-Level Requirement",
description = "A top-level system requirement",
pid_prefix = "HLR", -- Auto-PID prefix
pid_format = "%s-%03d", -- Produces HLR-001, HLR-002, etc.
attributes = {
{
name = "priority",
type = "ENUM",
values = { "High", "Medium", "Low" },
min_occurs = 1, -- Required
max_occurs = 1,
},
{
name = "status",
type = "ENUM",
values = { "Draft", "Approved", "Implemented" },
min_occurs = 1,
max_occurs = 1,
},
{
name = "rationale",
type = "XHTML", -- Rich text
min_occurs = 0, -- Optional
},
},
}
return M
4.2 Step 2: Use in Markdown
## hlr: User Authentication @HLR-001
> priority: High
> status: Draft
> rationale: Required by security policy section 4.2
The system shall authenticate users via username and password.
4.3 Step 3: Add a Handler (Optional)
If the type needs custom behavior during pipeline phases, add M.handler:
local Queries = require("db.queries")
M.handler = {
name = "hlr_handler",
prerequisites = {},
on_analyze = function(data, contexts, diagnostics)
for _, ctx in ipairs(contexts) do
local spec_id = ctx.spec_id or "default"
local objects = data:query_all(
Queries.content.objects_by_spec_type,
{ spec_id = spec_id, type_ref = "HLR" }
)
for _, obj in ipairs(objects or {}) do
-- Custom validation logic here
end
end
end,
}
4.4 Object Schema Fields Reference
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier (uppercase convention) |
| long_name | string | same as id | Human-readable name |
| description | string | "" | Description text |
| extends | string | nil | Base type for inheritance |
| is_default | boolean | false | If true, headers without explicit type match this |
| is_composite | boolean | false | Composite object flag |
| pid_prefix | string | nil | Prefix for auto-generated PIDs |
| pid_format | string | nil | Printf format string for PIDs |
| aliases | list | nil | Alternative identifiers for syntax matching |
| attributes | list | nil | Attribute definitions (see Attribute Schema) |
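To make pid_prefix and pid_format concrete, here is a minimal sketch of how an auto-generated PID could be formatted. It assumes the prefix and a running sequence number are passed to the format string in that order; next_pid is a hypothetical helper, not a SpecCompiler function.

```lua
-- Format a PID from the type's schema fields and a sequence number.
local function next_pid(object_type, sequence)
  if not (object_type.pid_prefix and object_type.pid_format) then
    return nil  -- no auto-PID configured for this type
  end
  return string.format(object_type.pid_format, object_type.pid_prefix, sequence)
end

local hlr = { id = "HLR", pid_prefix = "HLR", pid_format = "%s-%03d" }
local first   = next_pid(hlr, 1)   -- "HLR-001"
local twelfth = next_pid(hlr, 12)  -- "HLR-012"
```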
4.5 Attribute Schema
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | required | Attribute identifier |
| type | string | "STRING" | Datatype: STRING, INTEGER, REAL, BOOLEAN, DATE, ENUM, XHTML |
| min_occurs | integer | 0 | Minimum values (0 = optional, 1 = required) |
| max_occurs | integer | 1 | Maximum values |
| min_value | number | nil | Lower bound for numeric types |
| max_value | number | nil | Upper bound for numeric types |
| values | list | nil | Valid enum values (required when type = "ENUM") |
| datatype_ref | string | nil | Explicit datatype ID (overrides auto-generated) |
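The occurrence and enum rules in this table can be sketched as a small checker. This is illustrative only: check_attribute is a hypothetical function, the defaults (0 and 1) are taken from the table above, and the engine's actual validation may report problems differently.

```lua
-- Check the values supplied for one attribute against its schema entry.
local function check_attribute(def, values)
  local errors = {}
  local min = def.min_occurs or 0
  local max = def.max_occurs or 1
  if #values < min then
    errors[#errors + 1] = string.format(
      "%s: expected at least %d value(s), got %d", def.name, min, #values)
  end
  if #values > max then
    errors[#errors + 1] = string.format(
      "%s: expected at most %d value(s), got %d", def.name, max, #values)
  end
  if def.type == "ENUM" then
    local allowed = {}
    for _, v in ipairs(def.values or {}) do allowed[v] = true end
    for _, v in ipairs(values) do
      if not allowed[v] then
        errors[#errors + 1] = string.format(
          "%s: '%s' is not a valid enum value", def.name, v)
      end
    end
  end
  return errors
end

local priority = {
  name = "priority", type = "ENUM",
  values = { "High", "Medium", "Low" },
  min_occurs = 1, max_occurs = 1,
}
local ok_errors      = check_attribute(priority, { "High" })
local missing_errors = check_attribute(priority, {})
local bad_errors     = check_attribute(priority, { "Urgent" })
```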
5 Walkthrough: Custom Float Type
Floats are numbered elements declared in fenced code blocks. This example creates a custom float type for diagrams.
5.1 Step 1: Create the Type File
Create models/mymodel/types/floats/sequence_diagram.lua:
local M = {}
M.float = {
id = "SEQUENCE",
long_name = "Sequence Diagram",
description = "UML Sequence Diagram rendered via PlantUML",
caption_format = "Figure", -- Caption prefix in output
counter_group = "FIGURE", -- Shares counter with FIGURE, PLANTUML
aliases = { "seq", "sequence" }, -- Syntax: ```seq:label or ```sequence:label
needs_external_render = true, -- Requires external tool
}
return M
5.2 Float Schema Fields Reference
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier (uppercase) |
| caption_format | string | same as id | Prefix used in output captions |
| counter_group | string | same as id | Counter sharing group (e.g., FIGURE, TABLE) |
| aliases | list | nil | Alternative syntax identifiers |
| needs_external_render | boolean | false | Whether rendering requires an external tool |
| style_id | string | nil | Custom style identifier for output formatting |
5.3 Counter Groups
Multiple float types can share a numbering sequence by using the same
counter_group. For example,
FIGURE, PLANTUML, and CHART all use counter_group = "FIGURE", so they are
numbered sequentially as Figure 1, Figure 2, Figure 3 regardless of
which specific type each is.
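A short sketch of shared numbering, assuming floats are numbered in document order and that caption_format and counter_group fall back to id as the schema table states. number_floats is a hypothetical helper used only for illustration.

```lua
-- Assign caption labels, sharing one counter per counter_group.
local function number_floats(floats, types)
  local counters, labels = {}, {}
  for _, f in ipairs(floats) do
    local t = types[f.type_ref]
    local group = t.counter_group or t.id    -- default: same as id
    counters[group] = (counters[group] or 0) + 1
    local prefix = t.caption_format or t.id  -- default: same as id
    labels[#labels + 1] = prefix .. " " .. counters[group]
  end
  return labels
end

local types = {
  FIGURE   = { id = "FIGURE",   caption_format = "Figure", counter_group = "FIGURE" },
  PLANTUML = { id = "PLANTUML", caption_format = "Figure", counter_group = "FIGURE" },
  TABLE    = { id = "TABLE",    caption_format = "Table" },
}
local labels = number_floats({
  { type_ref = "FIGURE" },
  { type_ref = "TABLE" },
  { type_ref = "PLANTUML" },
}, types)
```

The PLANTUML float continues the FIGURE sequence, while the table keeps its own counter.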
6 Walkthrough: Float with External Rendering
When a float type needs an external tool to produce its output (PlantUML for diagrams, Deno for charts, etc.), it uses the external render handler. This handler collects all items that need rendering, spawns external processes in parallel, and dispatches results back to type-specific callbacks.
6.1 How External Rendering Works
The pipeline flow for external renders:
- INITIALIZE – The float is parsed from the Markdown code block and stored in spec_floats with raw_content.
- TRANSFORM – The external render handler (src/pipeline/transform/external_render_handler.lua) queries all floats where needs_external_render = 1 and resolved_ast IS NULL.
- Prepare – For each float, the handler calls the registered prepare_task callback, which writes input files and builds a command descriptor.
- Cache check – If output_path exists on disk (from a previous build), the task is skipped and handle_result is called immediately with the cached path.
- Batch spawn – All non-cached tasks are spawned in parallel via task_runner.spawn_batch.
- Dispatch – Results (stdout, stderr, exit code) are dispatched to each type’s handle_result callback, which updates resolved_ast in the database.
6.2 Registering a Renderer
External renderers are registered at module load time by calling
external_render.register_renderer(type_ref, callbacks).
The callbacks table must provide two functions:
| Callback | Signature and Purpose |
|---|---|
| prepare_task | function(float, build_dir, log, data, model_name) -> task or nil – Writes input files, builds command descriptor. Returns nil to skip rendering. |
| handle_result | function(task, success, stdout, stderr, data, log) – Processes output. Updates resolved_ast in the database via float_base.update_resolved_ast. |
6.3 Task Descriptor
The prepare_task callback
returns a task descriptor table:
| Field | Type | Description |
|---|---|---|
| cmd | string | Command to execute (e.g., "plantuml", "deno") |
| args | list | Command arguments |
| opts | table | Options: cwd (working directory), timeout (milliseconds) |
| output_path | string | Expected output file path; if it exists, the task is skipped (cache hit) |
| context | table | Arbitrary data passed through to handle_result (float record, hash, paths, etc.) |
6.4 Example: PlantUML Renderer
The built-in PlantUML renderer demonstrates the full pattern:
local float_base = require("pipeline.shared.float_base")
local task_runner = require("infra.process.task_runner")
local external_render = require("pipeline.transform.external_render_handler")
local M = {}
M.float = {
id = "PLANTUML",
long_name = "PlantUML Diagram",
caption_format = "Figure",
counter_group = "FIGURE",
aliases = { "puml", "plantuml", "uml" },
needs_external_render = true, -- Enables external render pipeline
}
external_render.register_renderer("PLANTUML", {
prepare_task = function(float, build_dir, log)
local content = float.raw_content or ''
-- Ensure @startuml/@enduml wrapper
if not content:match('@startuml') then
content = '@startuml\n' .. content .. '\n@enduml'
end
local hash = pandoc.sha1(content)
local diagrams_path = build_dir .. "/diagrams"
local puml_file = diagrams_path .. "/" .. hash .. ".puml"
local png_file = diagrams_path .. "/" .. hash .. ".png"
task_runner.ensure_dir(diagrams_path)
task_runner.write_file(puml_file, content)
return {
cmd = "plantuml",
args = { "-tpng", puml_file },
opts = { timeout = 30000 },
output_path = png_file, -- Cache key: skip if PNG exists
context = {
hash = hash,
float = float,
relative_path = "diagrams/" .. hash .. ".png",
}
}
end,
handle_result = function(task, success, stdout, stderr, data, log)
local ctx = task.context
if not success then
log.warn("PlantUML failed for %s: %s",
ctx.float.identifier:sub(1,12), stderr)
return
end
-- Store resolved path as JSON in resolved_ast
local json = string.format(
'{"png_paths":["%s"]}',
ctx.relative_path
)
float_base.update_resolved_ast(data, ctx.float.identifier, json)
end
})
return M
6.5 Example: Chart Renderer with Data Injection
The chart renderer adds a data injection step before rendering,
loading data views from models/{model}/data_views/:
local float_base = require("pipeline.shared.float_base")
local task_runner = require("infra.process.task_runner")
local data_loader = require("core.data_loader")
local external_render = require("pipeline.transform.external_render_handler")
local M = {}
M.float = {
id = "CHART",
long_name = "Chart",
caption_format = "Figure",
counter_group = "FIGURE",
aliases = { "echarts", "echart" },
needs_external_render = true,
}
external_render.register_renderer("CHART", {
prepare_task = function(float, build_dir, log, data, model_name)
local attrs = float_base.decode_attributes(float)
local json_content = float.raw_content or '{}'
-- Data injection: load view module and merge data into ECharts config
local view_name = attrs.view
if view_name and data then
local inject_attrs = { view = view_name, model = model_name }
local config = pandoc.json.decode(json_content)
local injected = data_loader.inject_chart_data(
config, inject_attrs, data, log)
if injected then
json_content = pandoc.json.encode(injected)
end
end
local hash = pandoc.sha1(json_content)
local charts_path = build_dir .. "/charts"
local json_file = charts_path .. "/" .. hash .. ".json"
local png_file = charts_path .. "/" .. hash .. ".png"
task_runner.ensure_dir(charts_path)
task_runner.write_file(json_file, json_content)
return {
cmd = "deno",
args = {
"run", "--allow-read", "--allow-write", "--allow-env",
"echarts-render.ts", json_file, png_file,
tostring(attrs.width or 600),
tostring(attrs.height or 400)
},
opts = { timeout = 60000 },
output_path = png_file,
context = {
hash = hash,
float = float,
relative_path = "charts/" .. hash .. ".png",
}
}
end,
handle_result = function(task, success, stdout, stderr, data, log)
local ctx = task.context
if not success then
log.warn("Chart render failed: %s", stderr)
return
end
local json = string.format('{"png_path":"%s"}', ctx.relative_path)
float_base.update_resolved_ast(data, ctx.float.identifier, json)
end
})
return M
6.6 Creating Your Own External Renderer
To create a float type that uses an external tool:
- Set needs_external_render = true in the float schema.
- Register callbacks with external_render.register_renderer("YOUR_TYPE", { ... }) at module load time (top-level code, not inside a function).
- In prepare_task: Write input content to a temporary file, build the command and arguments, and return a task descriptor with output_path for file-based caching.
- In handle_result: Parse the output (stdout, generated files), serialize the result as JSON, and call float_base.update_resolved_ast(data, identifier, json) to store it.
- Do not define M.handler.on_transform – the external render handler orchestrates the TRANSFORM phase for all registered types. Defining your own on_transform would bypass the parallel batch execution.
Key utilities available:
| Function | Purpose |
|---|---|
| task_runner.ensure_dir(path) | Create directory if it does not exist |
| task_runner.write_file(path, content) | Write content to a file; returns ok, err |
| task_runner.file_exists(path) | Check if a file exists on disk |
| task_runner.command_exists(cmd) | Check if a command is available in PATH |
| float_base.decode_attributes(float) | Parse float’s pandoc_attributes JSON into a Lua table |
| float_base.update_resolved_ast(data, id, json) | Store the rendering result in the database |
6.7 File-Based Caching
The external render handler provides automatic file-based caching via
the output_path field in the task
descriptor. If the output file already exists on disk when prepare_task returns, the handler
skips spawning the external process and immediately calls handle_result with empty
stdout/stderr. This means:
- The content hash should be part of the output filename (e.g., diagrams/{sha1}.png) so that content changes produce a new filename and trigger re-rendering.
- The handle_result callback should work correctly whether called after a fresh render or a cache hit (it receives the same task.context).
- Deleting the output files forces re-rendering on the next build (the database resolved_ast is also cleared during INITIALIZE).
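The cache check can be sketched as a partition of prepared tasks into cache hits and tasks that still need a process. This is a simplified illustration, not the handler's code: partition_tasks and the local file_exists are hypothetical stand-ins, and the existence check is injectable so the example runs without touching the build directory.

```lua
-- Stand-in for an on-disk existence check.
local function file_exists(path)
  local f = io.open(path, "r")
  if f then f:close() return true end
  return false
end

-- Split tasks: cached ones go straight to handle_result, pending ones get spawned.
local function partition_tasks(tasks, exists)
  exists = exists or file_exists
  local cached, pending = {}, {}
  for _, task in ipairs(tasks) do
    if task.output_path and exists(task.output_path) then
      cached[#cached + 1] = task
    else
      pending[#pending + 1] = task
    end
  end
  return cached, pending
end

-- Simulated disk contents for the example.
local on_disk = { ["build/diagrams/aaa.png"] = true }
local cached, pending = partition_tasks({
  { cmd = "plantuml", output_path = "build/diagrams/aaa.png" },
  { cmd = "plantuml", output_path = "build/diagrams/bbb.png" },
}, function(path) return on_disk[path] == true end)
```

Because the output filename embeds the content hash, an edited diagram produces a path that is not on disk yet and therefore lands in pending.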
6.8 External Rendering for Views
The same external_render.register_renderer
mechanism works for views that need external tools. For example, math_inline.lua registers a renderer
for the MATH_INLINE view type to
convert AsciiMath to MathML/OMML via an external script. The handler
queries spec_views (instead of
spec_floats) with needs_external_render = 1 and
dispatches to the same callback interface.
7 Walkthrough: Custom Relation with Inference Rules
Relations connect spec objects via link syntax. The relation resolver uses specificity scoring to infer the relation type.
7.1 Step 1: Create the Type File
Create models/mymodel/types/relations/traces_to.lua:
local M = {}
M.relation = {
id = "TRACES_TO",
long_name = "Traces To",
description = "Traceability link from LLR to HLR",
link_selector = "@", -- Uses [PID](@) syntax
source_type_ref = "LLR", -- Only from LLR objects
target_type_ref = "HLR", -- Only to HLR objects
aliases = nil, -- No alias prefix
is_default = false,
}
return M
7.2 Step 2: Use in Markdown
### llr: Password Length Check @LLR-001
Passwords must be at least 8 characters. Traces to [HLR-001](@).
7.3 Inference Scoring
When multiple relation types could match a link, the resolver scores each candidate:
| Dimension | Match | Constraint mismatch | No constraint (NULL) |
|---|---|---|---|
| Selector (@ or #) | +1 | Eliminated | +0 |
| Source attribute | +1 | Eliminated | +0 |
| Source type | +1 | Eliminated | +0 |
| Target type | +1 | Eliminated | +0 |
The highest-scoring candidate wins. If two candidates tie, the
relation is flagged as ambiguous (relation_ambiguous). Constraints set to
nil act as wildcards (+0) rather
than eliminating the candidate.
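The scoring table can be expressed as a short sketch. It is an approximation under assumptions: score and resolve are hypothetical functions, the link fields (selector, source_type, target_type, source_attribute) are illustrative names, XREF_GENERIC is a made-up relation, and is_default handling is left out.

```lua
-- Returns a score, or nil when a constraint mismatch eliminates the candidate.
local function score(rel, link)
  local total = 0
  local dims = {
    { rel.link_selector,    link.selector },
    { rel.source_attribute, link.source_attribute },
    { rel.source_type_ref,  link.source_type },
    { rel.target_type_ref,  link.target_type },
  }
  for _, d in ipairs(dims) do
    local constraint, actual = d[1], d[2]
    if constraint ~= nil then                      -- nil constraint: wildcard, +0
      if constraint ~= actual then return nil end  -- mismatch: eliminated
      total = total + 1                            -- match: +1
    end
  end
  return total
end

-- Pick the highest-scoring candidate; a tie at the top is ambiguous.
local function resolve(candidates, link)
  local best, best_score, tie = nil, -1, false
  for _, rel in ipairs(candidates) do
    local s = score(rel, link)
    if s then
      if s > best_score then
        best, best_score, tie = rel, s, false
      elseif s == best_score then
        tie = true
      end
    end
  end
  if tie then return nil, "relation_ambiguous" end
  return best
end

local candidates = {
  { id = "TRACES_TO", link_selector = "@", source_type_ref = "LLR", target_type_ref = "HLR" },
  { id = "XREF_GENERIC", link_selector = "@" },  -- hypothetical catch-all relation
}
local winner = resolve(candidates, { selector = "@", source_type = "LLR", target_type = "HLR" })
```

TRACES_TO scores 3 against 1 for the catch-all, so it wins; for a link from a different source type, TRACES_TO is eliminated and the catch-all is selected instead.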
7.4 Relation Schema Fields Reference
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier (uppercase) |
| link_selector | string | nil | Required selector: "@" for PID refs, "#" for label refs |
| source_type_ref | string | nil | Constrain source to this object type (nil = any) |
| target_type_ref | string | nil | Constrain target to this object type (nil = any) |
| source_attribute | string | nil | Constrain to links within this attribute context |
| aliases | list | nil | Prefix aliases for [alias:key](#) syntax |
| is_default | boolean | false | Default relation for its selector when no better match |
7.5 Relations with Handlers
A relation type can include a handler for custom transform behavior.
For example, xref_citation.lua
rewrites citation links to Pandoc Cite elements during the TRANSFORM
phase:
M.handler = {
name = "my_relation_handler",
prerequisites = {"spec_relations"}, -- Run after relations are stored
on_transform = function(data, contexts, diagnostics)
for _, ctx in ipairs(contexts) do
-- Custom transform logic
end
end
}
7.6 Base Types and Inheritance
Relation types support inheritance via base types. Instead of
repeating link_selector and
resolution logic in every type, you extend a base type:
- traceable (models/default/types/relations/traceable.lua) – base for @ (PID) selector
- xref (models/default/types/relations/xref.lua) – base for # (label) selector
Use extend() to create a
concrete type:
local traceable = require("models.default.types.relations.traceable")
local M = {}
M.relation = traceable.extend({
id = "TRACES_TO",
long_name = "Traces To",
description = "Traceability link from one object to another",
})
return M
The extend() call inherits link_selector = "@" from the base and merges your overrides. For # selector types, use xref.extend() instead.
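As a rough mental model, extend() behaves like a shallow merge of base defaults and overrides. The sketch below is a hypothetical stand-in for the base module, written only to show that behavior; the real traceable.lua may carry additional fields and logic.

```lua
-- Hypothetical stand-in for the traceable base type.
local traceable = { defaults = { link_selector = "@" } }

function traceable.extend(overrides)
  local merged = {}
  for k, v in pairs(traceable.defaults) do merged[k] = v end
  for k, v in pairs(overrides or {}) do merged[k] = v end  -- overrides win
  return merged
end

local relation = traceable.extend({
  id = "TRACES_TO",
  long_name = "Traces To",
})
```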
7.7 Custom Link Display Text
By default, object references display the target’s PID and float
references display the caption format with the float number (e.g.,
“Figure 3”). To customize display text, add a standard M.handler with an on_transform hook using the shared
link_rewrite_utils utility:
local traceable = require("models.default.types.relations.traceable")
local link_rewrite = require("pipeline.shared.link_rewrite_utils")
local M = {}
M.relation = traceable.extend({
id = "XREF_DIC",
long_name = "Dictionary Reference",
description = "Cross-reference to a dictionary entry",
target_type_ref = "DIC",
})
M.handler = {
name = "xref_dic_handler",
prerequisites = {"spec_relations"},
on_transform = function(data, contexts, _diagnostics)
link_rewrite.rewrite_display_for_type(data, contexts, "XREF_DIC", function(target)
if target.title_text and target.title_text ~= "" then
return target.title_text
end
end)
end
}
return M
A link [DIC-AUTH-001](@) would display as “Authentication” instead of “DIC-AUTH-001”.
The display_fn (the callback passed to rewrite_display_for_type) receives a target table with fields pid, type_ref, and title_text. Return a string for custom display text, or nil to keep the default.
8 Walkthrough: Custom View
Views are inline elements declared with backtick syntax (`prefix: content`).
8.1 Step 1: Create the Type File
Create models/mymodel/types/views/symbol.lua:
local M = {}
local Queries = require("db.queries")
M.view = {
id = "SYMBOL",
long_name = "Symbol",
description = "Engineering symbol with unit definition",
aliases = { "sym" },
inline_prefix = "symbol", -- Enables `symbol: content` syntax
needs_external_render = false,
}
M.handler = {
name = "symbol_handler",
prerequisites = {"spec_views"},
on_initialize = function(data, contexts, diagnostics)
for _, ctx in ipairs(contexts) do
local doc = ctx.doc
if not doc or not doc.blocks then goto continue end
local spec_id = ctx.spec_id or "default"
local file_seq = 0
local visitor = {
Code = function(c)
local content = (c.text or ""):match("^symbol:%s*(.+)$")
or (c.text or ""):match("^sym:%s*(.+)$")
if not content then return nil end
file_seq = file_seq + 1
local identifier = pandoc.sha1(spec_id .. ":" .. file_seq .. ":" .. content)
data:execute(Queries.content.insert_view, {
identifier = identifier,
specification_ref = spec_id,
view_type_ref = "SYMBOL",
from_file = ctx.source_path or "unknown",
file_seq = file_seq,
raw_ast = content
})
end
}
for _, block in ipairs(doc.blocks) do
pandoc.walk_block(block, visitor)
end
::continue::
end
end,
on_render_Code = function(code, ctx)
local content = (code.text or ""):match("^symbol:%s*(.+)$")
or (code.text or ""):match("^sym:%s*(.+)$")
if not content then return nil end
-- Render as emphasized text
return { pandoc.Emph({ pandoc.Str(content) }) }
end,
}
return M
8.2 Step 2: Use in Markdown
The force is defined as `symbol: F = ma` where `symbol: F` is force in Newtons.
8.3 View Schema Fields Reference
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier (uppercase) |
| inline_prefix | string | nil | Prefix for inline code dispatch (e.g., "math" enables math: syntax) |
| aliases | list | nil | Alternative prefixes for the same view type |
| needs_external_render | boolean | false | Whether rendering requires an external tool (batch processing) |
| materializer_type | string | nil | Materializer strategy (e.g., ‘toc’, ‘lof’, ‘custom’) |
| counter_group | string | nil | Counter group for numbered views |
9 Validation Proofs
Proofs are SQL-based validation rules that run during the VERIFY phase. Each proof creates a SQL view; if the view returns any rows, those rows represent violations.
9.1 Proof File Pattern
Create models/mymodel/proofs/vc_missing_hlr_traceability.lua:
local M = {}
M.proof = {
view = "view_traceability_vc_missing_hlr",
policy_key = "traceability_vc_to_hlr", -- Key in project.yaml validation section
sql = [[
CREATE VIEW IF NOT EXISTS view_traceability_vc_missing_hlr AS
SELECT
vc.identifier AS object_id,
vc.pid AS object_pid,
vc.title_text AS object_title,
vc.from_file,
vc.start_line
FROM spec_objects vc
WHERE vc.type_ref = 'VC'
AND NOT EXISTS (
SELECT 1
FROM spec_relations r
JOIN spec_objects target ON target.identifier = r.target_ref
WHERE r.source_ref = vc.identifier
AND target.type_ref = 'HLR'
);
]],
message = function(row)
local label = row.object_pid or row.object_title or row.object_id
return string.format(
"Verification case '%s' has no traceability link to an HLR",
label
)
end
}
return M
9.2 Proof Schema
| Field | Type | Description |
|---|---|---|
| view | string | SQL view name (must match the CREATE VIEW name) |
| policy_key | string | Key for suppression in project.yaml validation section |
| sql | string | SQL CREATE VIEW statement; rows returned = violations |
| message | function | Takes a row table, returns a diagnostic message string |
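How these fields fit together can be sketched without a database. The example below assumes the violation rows have already been fetched from the proof's view and that a policy value of ignore suppresses the proof; run_proof is a hypothetical helper, not the engine's VERIFY implementation.

```lua
-- Turn violation rows into diagnostic messages unless the proof is suppressed.
local function run_proof(proof, rows, validation)
  if (validation or {})[proof.policy_key] == "ignore" then
    return {}
  end
  local diagnostics = {}
  for _, row in ipairs(rows) do
    diagnostics[#diagnostics + 1] = proof.message(row)
  end
  return diagnostics
end

local proof = {
  view = "view_traceability_vc_missing_hlr",
  policy_key = "traceability_vc_to_hlr",
  message = function(row)
    local label = row.object_pid or row.object_title or row.object_id
    return string.format(
      "Verification case '%s' has no traceability link to an HLR", label)
  end,
}

-- Rows as they might come back from the view (sample data for illustration).
local rows = { { object_id = "abc123", object_pid = "VC-001" } }
local reported   = run_proof(proof, rows, {})
local suppressed = run_proof(proof, rows, { traceability_vc_to_hlr = "ignore" })
```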
9.3 Suppressing Proofs
Users suppress proofs in project.yaml using the policy_key:
validation:
  traceability_vc_to_hlr: ignore  # Suppress this proof
10 Model Overlay/Extension Pattern
10.1 How Overlays Work
The type loader (src/core/type_loader.lua) loads models
in two passes:
- Default model: Scans models/default/types/{category}/ and registers all types.
- Custom model: Scans models/{template}/types/{category}/ and registers all types.
Since type registration uses INSERT OR REPLACE, a custom model type
with the same id as a default type
replaces it entirely. Types with new IDs are added alongside the
defaults.
10.2 Path Resolution
The loader resolves model paths in order:
- $SPECCOMPILER_HOME/models/{name}/types/ (Docker/production)
- ./models/{name}/types/ (local development)
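The lookup order can be sketched as a first-match search over candidate directories. resolve_model_path is a hypothetical function; the directory check is passed in as a parameter because plain Lua has no portable directory test, and the /opt/speccompiler path is sample data.

```lua
-- Return the first existing types directory for a model, or nil.
local function resolve_model_path(name, home, dir_exists)
  local candidates = {}
  if home and home ~= "" then
    candidates[#candidates + 1] = home .. "/models/" .. name .. "/types/"
  end
  candidates[#candidates + 1] = "./models/" .. name .. "/types/"
  for _, path in ipairs(candidates) do
    if dir_exists(path) then return path end
  end
  return nil
end

-- Simulated file system: only the local development directory exists.
local present = { ["./models/mymodel/types/"] = true }
local function dir_exists(path) return present[path] == true end

local found = resolve_model_path("mymodel", "/opt/speccompiler", dir_exists)
```

In a real call the second argument would come from os.getenv("SPECCOMPILER_HOME").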
10.3 Partial Customization Example
A model that only adds an HLR type and a custom proof:
models/mymodel/
  types/
    objects/
      hlr.lua                  -- Adds HLR type (default has no HLR)
    relations/
      traces_to.lua            -- Adds traceability relation
  proofs/
    sd_601_vc_missing_hlr.lua  -- Domain-specific validation
All other types (SECTION, FIGURE, TABLE, etc.) are inherited from
default.
11 project.yaml Integration
Set the template field to use
your custom model:
project:
code: MYPROJ
name: My Project
template: mymodel # Loads models/default/ then models/mymodel/
doc_files:
- srs.md
Guide: DOCX Customization
1 Introduction
SpecCompiler generates DOCX output through a multi-stage pipeline:
- SpecIR – Structured data in SQLite (objects, relations, floats, attributes).
- Pandoc AST – The emitter assembles a Pandoc document from the SpecIR.
- Pandoc DOCX Writer – Pandoc converts the AST to DOCX using a reference.docx for styles.
- Lua Filter – A format-specific filter converts SpecCompiler markers to OOXML (captions, bookmarks, math).
- Postprocessor – Manipulates the generated DOCX ZIP archive (positioned floats, caption orphan prevention, template-specific OOXML).
Customization is available at three levels: style presets (fonts, spacing, page layout), filters (AST-to-OOXML conversion), and postprocessors (raw OOXML manipulation).
2 How Pandoc reference.docx Works
Pandoc uses a reference document as a style template for DOCX output. The reference document defines paragraph styles (Normal, Heading 1, Caption, etc.), page dimensions, margins, and default formatting. Pandoc does not copy content from the reference document – only styles and settings.
SpecCompiler manages the reference document in two ways:
- Auto-generated from presets (default): SpecCompiler builds a reference.docx from Lua style presets, storing it at {output_dir}/reference.docx.
- User-provided: Set docx.reference_doc in project.yaml to use your own Word template.
3 Style Presets
Style presets are Lua files that declaratively define DOCX styles. They are located at:
models/{template}/styles/{preset}/preset.lua
3.1 Preset Table Structure
A preset file returns a Lua table with the following top-level keys:
return {
name = "My Preset",
description = "Custom document styles",
-- Page configuration
page = {
size = "A4", -- "Letter" or "A4"
orientation = "portrait", -- "portrait" or "landscape"
margins = {
top = "2.5cm",
bottom = "2.5cm",
left = "3cm",
right = "2cm",
},
},
-- Paragraph styles (array of style definitions)
paragraph_styles = { ... },
-- Table styles (array of table style definitions)
table_styles = { ... },
-- Caption formats per float type
captions = { ... },
-- Document settings
settings = {
default_tab_stop = 720, -- In twips (720 = 0.5 inch)
language = "en-US",
},
-- Optional: inherit from another preset
extends = {
template = "default", -- Base template
preset = "base", -- Base preset name
},
}
3.2 Paragraph Style Fields
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Internal style ID (e.g., “Heading1”) |
| name | string | required | Display name in Word (e.g., “Heading 1”) |
| based_on | string | nil | Parent style ID for inheritance |
| next | string | nil | Style to apply to the next paragraph |
| font.name | string | nil | Font family name |
| font.size | number | nil | Font size in points |
| font.color | string | nil | Hex color without # (e.g., “2F5496”) |
| font.bold | boolean | nil | Bold text |
| font.italic | boolean | nil | Italic text |
| spacing.line | number | nil | Line spacing multiplier (1.0 = single, 1.15, 2.0, etc.) |
| spacing.before | number | nil | Space before paragraph in points |
| spacing.after | number | nil | Space after paragraph in points |
| alignment | string | nil | Text alignment: “left”, “center”, “right”, “both” (justified) |
| indent.left | string | nil | Left indent (e.g., “0.5in”, “1cm”) |
| indent.right | string | nil | Right indent |
| keep_next | boolean | nil | Keep with next paragraph (prevent orphaning) |
| outline_level | integer | nil | Outline level for TOC (0 = Heading 1, 1 = Heading 2, etc.) |
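The fields above mix points and length strings, while OOXML measures spacing and indents in twips (1 inch = 1440 twips, 1 point = 20 twips). The sketch below shows one way such values could be normalized; to_twips is a hypothetical helper, and how SpecCompiler converts units internally is not confirmed here.

```lua
-- Convert a number (points) or a string such as "0.5in", "1cm", "12pt" to twips.
local function to_twips(value)
  if type(value) == "number" then
    return math.floor(value * 20 + 0.5)          -- bare numbers are points
  end
  local num, unit = tostring(value):match("^([%d%.]+)%s*(%a+)$")
  if not num then return nil end
  num = tonumber(num)
  if not num then return nil end
  if unit == "in" then
    return math.floor(num * 1440 + 0.5)
  elseif unit == "cm" then
    return math.floor(num * 1440 / 2.54 + 0.5)
  elseif unit == "pt" then
    return math.floor(num * 20 + 0.5)
  end
  return nil                                     -- unknown unit
end

local half_inch   = to_twips("0.5in")  -- 720, matching the default_tab_stop example
local twelve_pt   = to_twips(12)       -- 240
local one_cm      = to_twips("1cm")    -- 567 (rounded)
```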
3.3 Paragraph Style Example
paragraph_styles = {
{
id = "Normal",
name = "Normal",
font = { name = "Calibri", size = 11 },
spacing = { line = 1.15, after = 8 },
alignment = "left",
},
{
id = "Heading1",
name = "Heading 1",
based_on = "Normal",
next = "Normal",
font = { name = "Calibri Light", size = 16, color = "2F5496" },
spacing = { before = 12, after = 0, line = 1.15 },
keep_next = true,
outline_level = 0,
},
{
id = "Caption",
name = "Caption",
based_on = "Normal",
font = { name = "Calibri", size = 9, italic = true },
spacing = { before = 0, after = 10, line = 1.15 },
},
}
3.4 Table Styles
table_styles = {
{
id = "TableGrid",
name = "Table Grid",
borders = {
top = { style = "single", width = 0.5, color = "000000" },
bottom = { style = "single", width = 0.5, color = "000000" },
left = { style = "single", width = 0.5, color = "000000" },
right = { style = "single", width = 0.5, color = "000000" },
inside_h = { style = "single", width = 0.5, color = "000000" },
inside_v = { style = "single", width = 0.5, color = "000000" },
},
cell_margins = {
top = "0.05in",
bottom = "0.05in",
left = "0.08in",
right = "0.08in",
},
autofit = true,
},
}
3.5 Caption Configuration
captions = {
figure = {
template = "{prefix} {number}: {title}",
prefix = "Figure",
separator = ": ",
style = "Caption",
},
table = {
template = "{prefix} {number}: {title}",
prefix = "Table",
separator = ": ",
style = "Caption",
},
listing = {
template = "{prefix} {number}: {title}",
prefix = "Listing",
separator = ": ",
style = "Caption",
},
}
3.6 Preset Inheritance
Presets can extend other presets using the extends field. The child preset deeply
merges with the base, with child values taking precedence:
-- models/mymodel/styles/academic/preset.lua
return {
name = "Academic",
description = "Academic paper styles",
extends = {
template = "default", -- Base template
preset = "default", -- Base preset name
},
-- Override only what changes
page = {
size = "A4",
margins = { top = "2.5cm", bottom = "2.5cm", left = "3cm", right = "2cm" },
},
paragraph_styles = {
{
id = "Normal",
name = "Normal",
font = { name = "Times New Roman", size = 12 },
spacing = { line = 1.5, after = 0 },
alignment = "both", -- Justified
},
},
}
The loader detects circular dependencies and reports them as errors.
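Deep merging and cycle detection can be sketched as follows. This is an illustration under assumptions: deep_merge and resolve_preset are hypothetical, presets are looked up by preset name only (the template field is ignored), and list-like tables such as paragraph_styles are merged by index here, which may differ from how the real loader combines styles.

```lua
-- Recursively merge child into base; child values take precedence.
local function deep_merge(base, child)
  local out = {}
  for k, v in pairs(base) do out[k] = v end
  for k, v in pairs(child) do
    if type(v) == "table" and type(out[k]) == "table" then
      out[k] = deep_merge(out[k], v)
    else
      out[k] = v
    end
  end
  return out
end

-- Resolve a preset's inheritance chain, raising an error on a cycle.
local function resolve_preset(presets, name, seen)
  seen = seen or {}
  if seen[name] then
    error("circular preset inheritance at " .. name)
  end
  seen[name] = true
  local preset = presets[name]
  if not preset then error("unknown preset: " .. name) end
  if not preset.extends then return preset end
  local base = resolve_preset(presets, preset.extends.preset, seen)
  return deep_merge(base, preset)
end

local presets = {
  default = {
    page = { size = "Letter", margins = { top = "1in", left = "1in" } },
  },
  academic = {
    extends = { template = "default", preset = "default" },
    page = { size = "A4", margins = { top = "2.5cm" } },
  },
}
local resolved = resolve_preset(presets, "academic")
```

The child overrides page.size and margins.top, while margins.left is inherited from the base.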
3.7 Format-Specific Style Overrides
Beyond the main preset.lua, you
can provide format-specific style files:
- models/{template}/styles/{preset}/docx.lua – DOCX-specific overrides
- models/{template}/styles/{preset}/html.lua – HTML-specific overrides
These files return tables with keys like float_styles and object_styles that are merged with the
base preset at emit time.
4 Postprocessors
Postprocessors manipulate the generated DOCX file after Pandoc produces it. They operate on raw OOXML inside the ZIP archive.
4.1 Loading
The base postprocessor (models/default/postprocessors/docx.lua)
is always loaded. It handles:
- Positioned floats – Converts inline images to anchored format with margin-relative positioning.
- Caption orphan prevention – Adds keepNext to Caption-styled paragraphs.
Template-specific postprocessors are loaded from models/{template}/postprocessors/docx.lua.
4.2 Hook Interface
A template postprocessor exports functions that are called in sequence:
| Hook | Input | Purpose |
|---|---|---|
| process_document(content, config, log) | document.xml content | Modify main document body |
| process_styles(content, log, config) | styles.xml content | Modify or inject style definitions |
| process_numbering(content, log) | numbering.xml content | Modify list numbering definitions |
| process_content_types(content, log) | [Content_Types].xml content | Add content type declarations |
| process_settings(content, log) | settings.xml content | Modify document settings |
| process_rels(content, log) | document.xml.rels content | Add/modify relationship entries |
| create_additional_parts(temp_dir, log, config) | Temp directory path | Create new parts (headers, footers) |
All hooks are optional. Each process_* hook receives the current XML content as a string and returns the modified content; create_additional_parts instead receives the path of the temporary directory.
4.3 Writing a Custom Postprocessor
Create models/mymodel/postprocessors/docx.lua:
local M = {}

function M.process_document(content, config, log)
  local modified = content
  -- Example: Add custom watermark text to every paragraph
  -- (Real implementations would use proper OOXML patterns)
  log.debug("[MYMODEL-POST] Processing document.xml")
  return modified
end

function M.process_styles(content, log, config)
  local modified = content
  -- Example: Inject a custom paragraph style
  log.debug("[MYMODEL-POST] Processing styles.xml")
  return modified
end

return M
4.4 The config Parameter
The config table passed to hooks contains:
- template – The template name
- docx – DOCX configuration from project.yaml
- spec_metadata – Specification-level attributes (in create_additional_parts)
5 Filters
Pandoc Lua filters run during the DOCX write phase and convert
SpecCompiler format markers to OOXML. The default filter (models/default/filters/docx.lua)
handles:
| Input Marker | Output |
|---|---|
| RawBlock("speccompiler", "page-break") | OOXML page break |
| RawBlock("speccompiler", "vertical-space:NNNN") | OOXML spacing (in twips) |
| RawBlock("speccompiler", "bookmark-start:ID:NAME") | OOXML bookmark start |
| RawBlock("speccompiler", "math-omml:OMML") | OOXML math element |
| Div.speccompiler-caption | OOXML caption with SEQ field |
| Div.speccompiler-numbered-equation | OOXML numbered equation with tab layout |
| Div.speccompiler-positioned-float | Position markers for postprocessor |
| Link with .ext target | Rewritten to .docx target |
5.1 When to Use Filters vs Postprocessors
- Filters operate on the Pandoc AST before DOCX generation. Use them when you need to convert SpecCompiler markers to OOXML elements that Pandoc will then place in the document.
- Postprocessors operate on the raw OOXML after DOCX generation. Use them when you need to manipulate the final XML directly (style injection, image positioning, headers/footers).
6 project.yaml Configuration
6.1 DOCX-Specific Settings
# Output format entry
outputs:
  - format: docx
    path: build/docx/{spec_id}.docx
# DOCX-specific configuration
docx:
  preset: default # Style preset name
  # reference_doc: assets/reference.docx # Custom reference (overrides preset)
6.2 Configuration Precedence
- If docx.reference_doc is set, that file is used directly as the Pandoc reference document.
- If docx.preset is set (or defaults to the model’s styles), SpecCompiler generates {output_dir}/reference.docx from the preset.
- If neither is set, Pandoc uses its built-in default styles.
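The precedence above amounts to a three-way branch. The following Python sketch restates it; the function name `reference_doc_source` and the returned tuple shape are illustrative assumptions, and the case where the preset is defaulted from the model's styles is not modeled.

```python
import posixpath

def reference_doc_source(docx_config, output_dir="build"):
    """Return (kind, path) for the Pandoc reference document, per the stated precedence."""
    docx_config = docx_config or {}
    if docx_config.get("reference_doc"):
        # 1. An explicit reference document is used directly.
        return ("custom", docx_config["reference_doc"])
    if docx_config.get("preset"):
        # 2. A preset means reference.docx is generated under the output directory.
        return ("preset", posixpath.join(output_dir, "reference.docx"))
    # 3. Otherwise Pandoc falls back to its built-in styles.
    return ("pandoc-default", None)
```

For example, a configuration that sets both keys resolves to the custom file, since reference_doc overrides preset.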
7 Reference Document Cache
When using presets, SpecCompiler caches the generated reference.docx to avoid regenerating it
on every build.
The cache works as follows:
- Compute SHA-1 hash of the preset file content.
- Compare against the stored hash in the build_meta table (key-value store in specir.db).
- If the hashes match and reference.docx exists on disk, skip generation.
- If the preset changed or reference.docx is missing, regenerate and update the cache.
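As a rough illustration of these steps, the check can be written in a few lines of Python. A plain dict stands in for the build_meta table here, and the function name and the key `reference_preset_hash` are hypothetical.

```python
import hashlib
from pathlib import Path

def needs_regeneration(preset_path, reference_path, build_meta, key="reference_preset_hash"):
    """Return True when reference.docx must be regenerated.

    build_meta is a dict standing in for the key-value table (assumption).
    """
    digest = hashlib.sha1(Path(preset_path).read_bytes()).hexdigest()
    up_to_date = build_meta.get(key) == digest and Path(reference_path).exists()
    if not up_to_date:
        # Record the new hash; the caller is expected to regenerate the file.
        build_meta[key] = digest
    return not up_to_date
```

Deleting the reference document makes the existence check fail, which is why removing the file is enough to force regeneration.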
To force regeneration of the reference document, delete it:
rm -f build/reference.docx
./bin/speccompiler-core
SpecCompiler Core User Manual
1 Introduction
1.1 What is SpecCompiler?
SpecCompiler is a document processing pipeline that transforms structured Markdown specifications into multiple output formats (DOCX, HTML5). It provides:
- Structured authoring: Define requirements, designs, and verification cases using a consistent syntax.
- Traceability: Link objects together with Project Identifier (PID) and #label references.
- Validation: Automatically verify data integrity through proof views backed by Structured Query Language (SQL) queries against the Specification Intermediate Representation (SpecIR).
- Multi-format output: Generate Word documents and web content from a single source.
SpecCompiler processes documents through a five-phase pipeline: INITIALIZE, ANALYZE, TRANSFORM, VERIFY, and EMIT, as illustrated in Figure 1.
1.2 Scope
This manual covers:
- Installation and verification of the SpecCompiler-Core Docker image (see Introduction).
- Configuration of project files (project.yaml) as described in Project Configuration.
- Authoring specification documents using the SpecCompiler Markdown syntax (Project Configuration).
- Invocation of the tool and interpretation of its outputs (Invocation).
- Verification diagnostics and error code reference.
- Incremental build behavior and cache management.
- Type system configuration and custom model creation; for a detailed walkthrough, see creating-a-model: Model Directory Layout in the companion model guide.
- Troubleshooting common problems.
1.3 Pipeline Summary
The processing pipeline consists of five phases:
- INITIALIZE – Parse Markdown input via Pandoc Abstract Syntax Tree (AST), extract specifications, spec objects, attributes, floats, relations, and views into the SpecIR stored in SQLite Database (SQLite).
- ANALYZE – Resolve relation types and cross-references using specificity-based inference rules (see Equation 1).
- TRANSFORM – Resolve floats (render PlantUML, charts, tables), materialize views, rewrite links, and render spec objects using type-specific handlers.
- VERIFY – Execute proof views (SQL queries) against the SpecIR database to detect constraint violations; see creating-a-model: Validation Proofs in the model guide.
- EMIT – Assemble Pandoc documents from the SpecIR and generate output files in configured formats via parallel Pandoc subprocess invocations.
The handler counts shown in Figure 2 reflect the default model. Custom models may add handlers in any phase.
2 Installation
2.1 Prerequisites
The following are required to run SpecCompiler-Core:
| Prerequisite | Minimum Version | Notes |
|---|---|---|
| Docker runtime | 20.10+ | Docker Desktop or Docker Engine (daemon must be running) |
| Disk space | 2 GB | For the Docker image and build artifacts |
| Host OS | Linux, macOS, or Windows (with WSL2) | The container runs Debian Bookworm (slim) |
All dependency versions are pinned in scripts/versions.env.
2.2 Building the Image
Run the Docker installer from the repository root:
bash scripts/install.sh
This performs three steps:
- Docker build – Executes a multi-stage Docker build:
  - Toolchain – Compiles Lua, Pandoc (with GHC), and native Lua extensions (luv, lsqlite3, zip) from source. Builds Deno TypeScript utilities. Downloads and wraps PlantUML with a minimal JRE. This stage is cached and only rebuilt when scripts/versions.env, scripts/build.sh, or src/tools/ change.
  - Runtime-base – Copies only runtime artifacts into a lean Debian Bookworm image without build tools. Installs runtime dependencies (Python/reqif, graphviz, lcov). This is the stable base for code-only updates.
  - Runtime – Overlays src/ and models/ onto runtime-base to produce the final image.
- Wrapper generation – Creates a specc Command-Line Interface (CLI) command at ~/.local/bin/specc.
- Config – Writes the image reference to ~/.config/speccompiler/env.
The installer supports three modes:
- Default (bash scripts/install.sh) – Builds the full image if not present.
- Force (bash scripts/install.sh --force) – Rebuilds everything from scratch, including the toolchain.
- Code-only (bash scripts/install.sh --code-only) – Updates only src/ and models/ layers without recompiling the toolchain. Always builds from the stable runtime-base image, preventing Docker layer accumulation. Dangling images from previous builds are automatically pruned.
2.3 Verifying Installation
After building, verify the image is available:
docker images speccompiler-core
To verify the tool runs correctly, navigate to a directory containing a project.yaml file and run:
specc build
2.4 The specc Wrapper
The specc command is a Docker
wrapper generated by the installer. It supports three subcommands:
| Command | Description |
|---|---|
| specc build [project.yaml] | Build the project (default file: project.yaml) |
| specc clean | Remove build/ directory and specir.db |
| specc shell | Open an interactive Bash shell inside the container |
The build subcommand runs docker run --rm with:
- --user "$(id -u):$(id -g)" – Preserves host UID/GID.
- -v "$(pwd):/workspace" – Mounts current directory.
- -e "SPECCOMPILER_HOME=/opt/speccompiler" – Sets installation root.
- -e "SPECCOMPILER_DIST=/opt/speccompiler" – Sets distribution root.
- -e "SPECCOMPILER_LOG_LEVEL=${SPECCOMPILER_LOG_LEVEL:-INFO}" – Passes log level.
Inside the container, the speccompiler-core entry point invokes
Pandoc with the SpecCompiler Lua filter. The -o /dev/null Pandoc flag is
intentional – actual output files are generated by the EMIT phase.
3 Project Configuration
All project configuration is specified in a project.yaml file located in the project
root directory.
3.1 Complete Configuration Reference
# ============================================================================
# Project Identification (REQUIRED)
# ============================================================================
project:
  code: MYPROJ # Project code identifier (string, required)
  name: My Project SRS # Human-readable project name (string, required)
# ============================================================================
# Type Model (OPTIONAL)
# ============================================================================
template: default # Type model name (string, default: "default")
# Must match a directory under models/
# ============================================================================
# Logging Configuration (OPTIONAL)
# ============================================================================
logging:
  level: info # DEBUG | INFO | WARN | ERROR (default: "INFO")
  format: auto # auto | json | text (default: "auto")
  color: true # ANSI color codes (default: true)
# ============================================================================
# Validation Policy (OPTIONAL)
# ============================================================================
validation:
  missing_required: ignore
  cardinality_over: ignore
  invalid_cast: ignore
  invalid_enum: ignore
  invalid_date: ignore
  bounds_violation: ignore
  dangling_relation: ignore
  unresolved_relation: ignore
# ============================================================================
# Input Files (REQUIRED)
# ============================================================================
output_dir: build/ # Base output directory (default: "build")
doc_files: # Markdown files to process, in order
  - srs.md
  - sdd.md
# ============================================================================
# Output Format Configurations (OPTIONAL)
# ============================================================================
outputs:
  - format: docx
    path: build/docx/{spec_id}.docx
  - format: html5
    path: build/www/{spec_id}.html
# ============================================================================
# DOCX Configuration (OPTIONAL)
# ============================================================================
docx:
  preset: null # Style preset name (models/{template}/styles/)
  # reference_doc: assets/reference.docx # Custom Word reference
# ============================================================================
# HTML5 Configuration (OPTIONAL)
# ============================================================================
html5:
  number_sections: true
  table_of_contents: true
  toc_depth: 3
  standalone: true
  embed_resources: true
  resource_path: build
# ============================================================================
# Bibliography and Citations (OPTIONAL)
# ============================================================================
bibliography: refs.bib
csl: ieee.csl
3.2 Required Fields
| Field | Type | Description |
|---|---|---|
| project.code | string | Project code identifier |
| project.name | string | Human-readable project name |
| doc_files | list | One or more Markdown file paths to process |
3.3 Default Values
| Field | Default | Notes |
|---|---|---|
| template | default | Built-in base model is always loaded |
| output_dir | build | Also stores specir.db |
| logging.level | INFO | Overridden by SPECCOMPILER_LOG_LEVEL env var |
4 Document Authoring
SpecCompiler extends standard Markdown with a structured overlay for specification documents. The syntax uses existing Markdown constructs (headers, blockquotes, code blocks, links) with specific patterns that the pipeline recognizes.
4.1 Specifications
Level 1 headers declare the top-level document container.
Pattern: # type: Title @PID
# srs: Software Requirements Specification @SRS-001
4.2 Spec Objects
Level 2-6 headers declare requirements, design elements, sections, or any typed element.
Pattern: ## type: Title @PID
## hlr: User Authentication @HLR-001
### llr: Password Validation @LLR-001
#### section: Implementation Notes
If @PID is omitted, a PID is auto-generated using the type’s pid_prefix and pid_format.
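To make the header pattern concrete, here is a small Python sketch that splits a header line into level, type, title, and PID. It is an approximation for illustration only: the regular expression and the name `parse_header` are assumptions, and the fallback to a section type for untyped headers follows the behavior described later in Section References.

```python
import re

# Assumed approximation of the "## type: Title @PID" pattern.
HEADER_RE = re.compile(
    r"^(?P<hashes>#{1,6})\s+"
    r"(?:(?P<type>[A-Za-z_][A-Za-z0-9_]*):\s+)?"
    r"(?P<title>.*?)"
    r"(?:\s+@(?P<pid>\S+))?\s*$"
)

def parse_header(line):
    """Return a dict describing a header line, or None if it is not a header."""
    m = HEADER_RE.match(line)
    if m is None:
        return None
    return {
        "level": len(m.group("hashes")),
        "type": m.group("type") or "section",  # untyped headers default to section
        "title": m.group("title"),
        "pid": m.group("pid"),  # None means the PID would be auto-generated
    }
```

With this sketch, `parse_header("## hlr: User Authentication @HLR-001")` yields level 2, type `hlr`, title `User Authentication`, and PID `HLR-001`.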
4.3 Attributes
Blockquotes declare attributes using the key: value pattern. They belong to the
most recently opened Specification or SpecObject header and do not need
to appear immediately after it:
## hlr: User Authentication @HLR-001
> priority: High

> status: Draft

> rationale: Required by security policy
Rules:
- Each attribute blockquote must be separated by a blank line.
- The first line must match key: value (where key is [A-Za-z0-9_]+). If not, the blockquote is treated as prose. The key does not need to be a registered attribute type; unregistered keys default to STRING datatype.
- Multi-line values are supported: continuation lines append to the preceding attribute.
- Supported datatypes: STRING, INTEGER, REAL, BOOLEAN, DATE (YYYY-MM-DD), ENUM, XHTML.
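The first-line rule and the continuation rule can be sketched as follows. This is a simplified illustration: the function name is hypothetical, the input is assumed to be one blockquote's lines with the leading "> " already removed, and joining continuation lines with a newline is an assumption.

```python
import re

ATTR_RE = re.compile(r"^([A-Za-z0-9_]+):\s*(.*)$")

def parse_attribute_blockquote(lines):
    """Return (key, value) for an attribute blockquote, or None if it is prose."""
    if not lines:
        return None
    first = ATTR_RE.match(lines[0])
    if first is None:
        return None  # first line is not "key: value", so treat as prose
    key, value = first.group(1), first.group(2)
    for continuation in lines[1:]:
        value += "\n" + continuation  # continuation lines extend the value
    return key, value
```

A blockquote whose first line is ordinary prose returns `None`, which mirrors the rule that such blockquotes are left as body content.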
4.4 Floats
Fenced code blocks with a typed first class declare numbered elements.
Pattern: ```type.lang:label{key="val"}
4.4.1 PlantUML Diagram
```plantuml:diag-state{caption="State Machine"}
@startuml
[*] --> Active
Active --> Inactive
@enduml
```
4.4.2 Table
```list-table:tbl-interfaces{caption="External Interfaces"}
> header-rows: 1
> aligns: l,l,l
* - Interface
  - Protocol
  - Direction
* - GPS
  - ARINC-429
  - Input
```
4.4.3 CSV Table
The Comma-Separated Values (CSV) float alias provides a compact syntax for tabular data:
```csv:tbl-data{caption="Sample Data"}
Name,Value,Unit
Temperature,72.5,F
Pressure,1013.25,hPa
```
Both csv and list-table produce TABLE floats. Use csv for simple tabular data and list-table for tables with rich Markdown content in cells. See Floats in Practice for live examples of each.
4.4.4 Listing (Code)
```listing.c:lst-init{caption="Initialization Routine"}
void init(void) {
    setup_hardware();
}
```
4.4.5 Chart (ECharts)
```chart:chart-coverage{caption="Test Coverage"}
{
"xAxis": { "data": ["Module A", "Module B"] },
"series": [{ "type": "bar", "data": [95, 87] }]
}
```
Charts support data injection via view modules. Add view="gauss" and params="mean=0,sigma=1" to the code fence attributes to inject generated data into the ECharts configuration at render time. See Figure 5 for a working example.
4.4.6 Math
```math:eq-force{caption="Newton's Second Law"}
F = ma
```
Math floats use AsciiMath notation and are rendered to MathML for HTML5 output and OMML for DOCX. See Equation 1 and Equation 2 for live examples in this manual.
4.4.7 Float Syntax Summary
| Component | Description |
|---|---|
| type | Float type identifier (for example figure, plantuml, csv, list-table, listing, chart, math) |
| .lang | Optional language hint for syntax highlighting |
| :label | Float label for cross-referencing; must be unique within the specification |
| {key="val"} | Key-value attributes; common attribute: caption |
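The four components in the table can be pulled apart with a regular expression. The sketch below is an approximation written for this manual, not the parser SpecCompiler uses; the character classes and the assumption that attributes are space-separated `key="val"` pairs are guesses based on the examples above.

```python
import re

# Assumed approximation of type.lang:label{key="val"}.
FLOAT_RE = re.compile(
    r"^(?P<type>[A-Za-z][\w-]*)"
    r"(?:\.(?P<lang>[\w+-]+))?"
    r":(?P<label>[\w-]+)"
    r"(?:\{(?P<attrs>.*)\})?\s*$"
)
KV_RE = re.compile(r'([\w-]+)="([^"]*)"')

def parse_float_info(info):
    """Split a code-fence info string into type, lang, label, and attributes."""
    m = FLOAT_RE.match(info.strip())
    if m is None:
        return None  # an ordinary code fence, not a float
    return {
        "type": m.group("type"),
        "lang": m.group("lang"),
        "label": m.group("label"),
        "attrs": dict(KV_RE.findall(m.group("attrs") or "")),
    }
```

An info string without a `:label` part, such as a plain language name, does not match and is treated here as an ordinary code block.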
4.5 Relations (Links)
Links use the pattern [content](selector). Selectors are
not hardcoded – they are registered by relation types
in the model’s type system. Each relation type declares a link_selector field, and the pipeline
uses it for resolution and type inference. The default model registers the following
selectors:
| Selector | Registered by | Resolution |
|---|---|---|
| @ | traceable base (XREF_SEC, and model-specific types) | PID lookup: same-spec first, then cross-document fallback |
| # | xref base (XREF_FIGURE, XREF_TABLE, XREF_LISTING, XREF_MATH, XREF_SECP) | Scoped label resolution: local scope, then same-spec, then global |
| @cite | XREF_CITATION | Rewritten to Pandoc Cite element (parenthetical) |
| @citep | XREF_CITATION | Rewritten to Pandoc Cite element (in-text) |
Custom models can register additional selectors by defining relation
types with new link_selector
values.
| Syntax | Example | Description |
|---|---|---|
| [PID](@) | [HLR-001](@) | Reference by PID |
| [type:label](#) | [fig:diagram](#) | Typed float reference |
| [scope:type:label](#) | [REQ-001:fig:detail](#) | Scoped float reference |
| [key](@cite) | [smith2024](@cite) | Parenthetical citation |
| [key](@citep) | [smith2024](@citep) | In-text citation |
4.5.1 Type Inference
After a link is resolved, the inference algorithm scores it against all registered relation types using four unweighted dimensions. Each matching dimension adds +1 to the specificity score. A constraint mismatch eliminates the candidate entirely. The total score for a candidate is computed as:
$$S = \sum_{i=1}^{4} d_i, \qquad d_i \in \{0, 1\}$$
The four dimensions ($d_1$ through $d_4$) correspond to selector, source attribute, source type, and target type as shown in Table 8:
| Dimension | Match | Constraint mismatch | No constraint (NULL) |
|---|---|---|---|
| Selector (@, #, @cite, etc.) | +1 | Eliminated | +0 |
| Source attribute | +1 | Eliminated | +0 |
| Source type | +1 | Eliminated | +0 |
| Target type | +1 | Eliminated | +0 |
The highest-scoring candidate wins. If two candidates tie, the relation is marked ambiguous. For example, [fig:diagram](#) resolving to a FIGURE float will match XREF_FIGURE (selector # + target type FIGURE = specificity 2) over the generic xref base (selector # only = specificity 1).
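The scoring rules can be expressed compactly in Python. This is a sketch of the rules as stated in this section, not the pipeline's implementation: relation types are modeled as plain dicts, a missing key or `None` stands for "no constraint", and the status strings are illustrative.

```python
DIMENSIONS = ("selector", "source_attribute", "source_type", "target_type")

def specificity(link, relation_type):
    """Score one candidate; return None when any constraint mismatches."""
    score = 0
    for dim in DIMENSIONS:
        constraint = relation_type.get(dim)
        if constraint is None:
            continue  # no constraint: +0
        if constraint != link.get(dim):
            return None  # constraint mismatch: candidate eliminated
        score += 1  # match: +1
    return score

def infer_relation_type(link, relation_types):
    """Return (relation id, status) for the highest-scoring candidate."""
    scored = []
    for rt in relation_types:
        s = specificity(link, rt)
        if s is not None:
            scored.append((s, rt["id"]))
    if not scored:
        return None, "no_match"
    best = max(s for s, _ in scored)
    winners = [rid for s, rid in scored if s == best]
    if len(winners) > 1:
        return None, "ambiguous"  # a tie at the top score
    return winners[0], "ok"

# Example mirroring the text: a "#" link that resolved to a FIGURE float.
link = {"selector": "#", "source_attribute": None,
        "source_type": "SECTION", "target_type": "FIGURE"}
relation_types = [
    {"id": "XREF", "selector": "#"},
    {"id": "XREF_FIGURE", "selector": "#", "target_type": "FIGURE"},
    {"id": "XREF_TABLE", "selector": "#", "target_type": "TABLE"},
]
result = infer_relation_type(link, relation_types)
```

Here XREF_FIGURE scores 2, the generic XREF scores 1, and XREF_TABLE is eliminated by its target type constraint, so `result` selects XREF_FIGURE.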
4.6 Views
Inline code with a specific prefix declares view placeholders:
`toc:`
Default model view types:
| Type | Aliases | Description |
|---|---|---|
| toc | (none) | Table of Contents (TOC) from spec object headings |
| lof | lot | List of floats (figures, tables, etc.) |
| abbrev | sigla, acronym | Define an abbreviation inline: Full Meaning (ABBR) |
| abbrev_list | sigla_list, acronym_list | Render a sorted table of all abbreviations defined via abbrev: |
| math_inline | eq, formula | Inline math expression rendered to MathML/OMML |
| gauss | gaussian, normal | Generate Gaussian distribution data for chart floats |
4.7 Body Content
Prose paragraphs, lists, and tables between headers accumulate to the most recently opened Specification or Spec Object.
4.8 File Includes
Split large documents into multiple files using fenced code blocks
with the include class:
```include
path/to/chapter1.md
path/to/chapter2.md
```
Each line is a file path relative to the including document’s directory. Absolute paths are also supported. Lines starting with # are treated as comments and ignored.
Include blocks are expanded recursively before the pipeline runs. Circular includes are detected and produce an error. The maximum nesting depth is 100 levels.
Included files are tracked in the build graph for incremental builds – a change to any included file triggers a rebuild.
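The include rules above (relative paths, comment lines, recursion, cycle detection, and the depth limit) can be sketched as a small recursive function. This is an illustration under simplifying assumptions: the function name is hypothetical, fences are matched by exact text, and the error messages are invented.

```python
from pathlib import Path

MAX_DEPTH = 100  # maximum nesting depth stated above

def expand_includes(path, _stack=()):
    """Return the lines of a document with include blocks expanded recursively."""
    path = Path(path).resolve()
    if path in _stack:
        raise ValueError(f"circular include: {path}")
    if len(_stack) > MAX_DEPTH:
        raise ValueError(f"include nesting exceeds {MAX_DEPTH} levels")
    out, in_block = [], False
    for line in path.read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if not in_block and stripped == "```include":
            in_block = True
        elif in_block and stripped == "```":
            in_block = False
        elif in_block:
            if stripped and not stripped.startswith("#"):
                # Relative paths resolve against the including file's directory;
                # joining with an absolute path yields that absolute path.
                out.extend(expand_includes(path.parent / stripped, _stack + (path,)))
        else:
            out.append(line)
    return out
```

The `_stack` tuple holds the chain of files currently being expanded, so a file that appears twice in the chain is reported as a cycle, while the same file included from two unrelated places is allowed.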
5 Using the Default Model
5.1 Rationale
The default model ships a
complete document authoring toolkit so that authors can write structured
technical documents without defining custom types. It provides:
- Numbered floats – figures, tables, code listings, math equations, PlantUML diagrams, and ECharts charts, each with automatic numbering and captions.
- Typed cross-references – relation types that resolve @ and # links to specific float and object categories, enabling the pipeline to render appropriate display text (for example, “Figure 3” or “Table 1”).
- Bibliography citations – integration with Pandoc’s citeproc for parenthetical and in-text citation rendering from BibTeX files.
- Content views – generated content blocks such as TOC, list of figures, abbreviation tables, and inline math.
The following subsections demonstrate these features with live floats and cross-references. Every float, view, and link shown below is processed by SpecCompiler when this manual is built.
5.2 Floats in Practice
A SpecCompiler document can use all default float types. Each float has a type prefix, a label for cross-referencing, and a caption. The examples below are live – they are rendered when this manual is processed.
5.2.1 Architecture Diagram (PlantUML)
5.2.2 Component Table (list-table)
| Component | Layer | Technology |
|---|---|---|
| Web UI | Presentation | React |
| Auth Service | Business Logic | Node.js |
| Data Service | Business Logic | Python |
| Database | Persistence | PostgreSQL |
5.2.3 Performance Metrics (CSV)
| Metric | Target | Actual | Status |
|---|---|---|---|
| Response time (ms) | 200 | 185 | Pass |
| Throughput (req/s) | 1000 | 1120 | Pass |
| Error rate (%) | 1.0 | 0.3 | Pass |
| Memory usage (MB) | 512 | 487 | Pass |
5.2.4 Initialization Code (Listing)
def initialize(config):
    db = connect(config.db_url)
    auth = AuthService(db)
    return Application(auth, db)
5.2.5 Latency Model (Math)
5.2.6 Throughput Chart (ECharts)
5.3 Cross-References
Every float and object defined above can be referenced from prose.
The following paragraph demonstrates cross-reference resolution using
the # selector.
The system architecture is depicted in Figure 3. Component details, including the technology stack for each layer, are listed in Table 10. Performance targets and actuals are compared in Table 11 – all four metrics pass their thresholds. The initialization logic is shown in Listing 16, and the latency model driving performance requirements is defined by Equation 2. Finally, throughput measurements by module are visualized in Figure 4.
The @ selector resolves by PID
and works across documents. For example, this sentence references the
introduction of this manual: INDEX. Cross-document references to the
companion guides also work; see creating-a-model: Model Directory Layout
for the model directory layout and docx-customization: Style Presets for
DOCX style presets.
| Selector | Syntax | Resolution |
|---|---|---|
| @ (PID) | [PID](@) | Exact PID lookup. Same-spec first, then cross-document fallback. Never ambiguous. |
| # (Label) | [type:label](#) | Scoped resolution: local scope, then same specification, then global. May be ambiguous if multiple matches at the same scope level. |
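The scoped resolution order for the # selector can be illustrated with a short Python sketch. The data model here is an assumption made for the example: each indexed float is a dict with `id`, `label`, `scope` (the PID of the owning object), and `spec`, and the status strings are illustrative.

```python
def resolve_label(label, local_scope, spec, index):
    """Resolve a label by trying local scope, then same spec, then global."""
    matches = [f for f in index if f["label"] == label]
    tiers = (
        [f for f in matches if f["spec"] == spec and f["scope"] == local_scope],
        [f for f in matches if f["spec"] == spec],
        matches,
    )
    for tier in tiers:
        if len(tier) == 1:
            return tier[0]["id"], "ok"
        if len(tier) > 1:
            return None, "ambiguous"  # several matches at the same scope level
    return None, "unresolved"
```

The first tier that contains any match decides the outcome, which is why a unique local match wins even when other specifications define the same label.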
5.4 Section References
Headers without an explicit TYPE: prefix default to the SECTION
type. Sections receive auto-generated PIDs and labels that can be used
for cross-referencing:
- PID format: {spec_pid}-sec{depth.numbers} – for example, SRS-sec1, SRS-sec1.2, SRS-sec2.3.1. Use the @ selector: [SRS-sec1.2](@).
- Label format: section:{title-slug} – for example, ## Introduction produces the label section:introduction. Use the # selector: [section:introduction](#).
The @ selector performs an exact
PID lookup and is never ambiguous. The # selector uses scoped resolution
(closest scope wins), which is useful when multiple specifications have
sections with similar names.
For cross-document section references with the # selector, use the explicit scope
syntax: [SPEC-A:section:design](#) to target a
section labeled “design” within the specification whose PID is SPEC-A.
This manual references its own sections using both selectors. Here are examples that resolve within this document:
- By PID: Introduction links to Installation, Project Configuration links to Document Authoring.
- By label: Pipeline Summary links to Pipeline Summary, Troubleshooting links to Troubleshooting.
Cross-document references work identically. Because the companion
guides are listed in the same project.yaml, these links resolve at
build time:
- creating-a-model: Walkthrough: Custom Float Type links to the float walkthrough in the model guide.
- docx-customization: Postprocessors links to the DOCX customization guide.
- docx-customization: Preset Inheritance links to preset inheritance in the DOCX guide.
5.5 Citations and Bibliography
SpecCompiler integrates with Pandoc’s citeproc processor for scholarly citations.
Step 1. Add bibliography configuration to project.yaml:
bibliography: refs.bib
csl: ieee.csl
Step 2. Create a BibTeX file (refs.bib):
@article{smith2024,
author = {Smith, John},
title = {Advances in Systems Engineering},
journal = {IEEE Transactions},
year = {2024}
}
@book{jones2023,
author = {Jones, Alice},
title = {Software Architecture Patterns},
publisher = {O'Reilly},
year = {2023}
}
Step 3. Use citation syntax in your document:
Recent work [smith2024](@cite) demonstrates the approach.
As Smith [smith2024](@citep) argues, the method is effective.
Multiple sources support this [smith2024;jones2023](@cite).
- [key](@cite) produces a parenthetical citation – for example, “(Smith, 2024)” in author-date styles or “[1]” in numeric styles.
- [key](@citep) produces an in-text citation – for example, “Smith (2024)” or “Smith [1]”.
- Multiple keys separated by ; produce a grouped citation.
Processing pipeline: During the TRANSFORM phase,
citation links are rewritten to Pandoc Cite elements. During EMIT, Pandoc’s
citeproc processor formats citations and appends a bibliography list to
the document according to the configured CSL style.
5.6 Views in Practice
Views generate content blocks from the SpecIR.
5.6.1 Abbreviations
The abbrev: view defines
abbreviations inline. On first use, the full meaning is displayed
alongside the abbreviation. All definitions are collected for the abbrev_list view shown in the List of
Abbreviations appendix.
This manual defines abbreviations on first use throughout the text. For example, Entity-Attribute-Value (EAV) is the database pattern used for flexible attributes, and Newline-Delimited JSON (NDJSON) is the format used for diagnostic output.
The syntax is: `abbrev: Full Meaning Text (ABBREVIATION)`.
The abbreviation goes in parentheses at the end.
5.6.2 Inline Math
The eq: prefix renders inline math expressions using AsciiMath notation. For example, the quadratic formula is $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$, and Euler’s identity is $e^{i\pi} + 1 = 0$.
Inline math is useful for formulas within prose paragraphs, while
block math: floats (like Equation 1 and
Equation 2)
provide numbered equations with captions.
5.6.3 Chart with Data View Injection (Gauss)
Charts can load data dynamically from view modules using the view attribute. The gauss view generates a Gaussian
probability density function and injects it into the ECharts dataset.
The chart below demonstrates this – the view="gauss" attribute triggers the
data injection pipeline:
The params attribute passes
mean, sigma, xmin, xmax, and points to the Gauss view’s generate() function. The function
returns an ECharts dataset that replaces the chart’s placeholder data at
render time. This same mechanism supports custom data views that query
the SpecIR database; see creating-a-model: Walkthrough: Custom
View in the model guide for details on creating view modules.
5.6.4 Generated Lists
The [LOF] and [LOT] views produce navigable lists of figures and tables. These are rendered in the appendices of this manual:
- List of Figures – generated by [LOF]
- List of Tables – generated by [LOT]
- List of Abbreviations – generated by abbrev_list
6 Invocation
6.1 Basic Usage
specc build
Processes all files from doc_files in the current directory’s project.yaml. An alternative project file can be specified: specc build my-project.yaml.
6.2 Environment Variables
| Variable | Default | Description |
|---|---|---|
| SPECCOMPILER_LOG_LEVEL | INFO | Override log level: DEBUG, INFO, WARN, ERROR |
| SPECCOMPILER_HOME | /opt/speccompiler | SpecCompiler installation root (model and binary lookup) |
| SPECCOMPILER_DIST | /opt/speccompiler | Distribution root (used internally for external renderers) |
| SPECCOMPILER_IMAGE | speccompiler-core:latest | Docker image reference (overrides default in wrapper) |
| NO_COLOR | (unset) | Disable ANSI color codes in output |
6.3 Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success: all documents processed and outputs generated |
| 1 | Failure: Docker is not running, the configuration is missing, or the pipeline reported an error |
7 Output Formats
Four output formats are supported. Multiple formats can be generated in a single run.
7.1 DOCX (Microsoft Word)
- Style presets via docx.preset or custom docx.reference_doc.
- Model-specific postprocessors for format transformations.
For a complete guide on customizing DOCX output – including paragraph styles, table styles, caption configuration, and postprocessors – see docx-customization: Style Presets and docx-customization: Postprocessors in the companion DOCX Customization guide.
7.2 HTML5
| Option | Type | Default | Description |
|---|---|---|---|
| number_sections | boolean | false | Add section numbering |
| table_of_contents | boolean | false | Generate table of contents |
| toc_depth | integer | 3 | Heading depth for TOC |
| standalone | boolean | false | Produce complete HTML document |
| embed_resources | boolean | false | Embed CSS and images inline |
7.3 Markdown (GFM)
Produces GitHub-Flavored Markdown (GFM), which is useful for review platforms and static site generators.
7.4 JSON (Pandoc AST)
Full Pandoc AST for programmatic integration with other tools.
8 Verification and Diagnostics
8.1 Diagnostic Output
Diagnostics are emitted in NDJSON format to stderr:
{"level":"error","message":"[object_missing_required] Object missing required attribute 'priority' on HLR-001","file":"srs.md","line":42}
8.2 Diagnostic Reference
| Policy Key | Description |
|---|---|
| spec_missing_required | Specification missing required attribute |
| spec_invalid_type | Invalid specification type reference |
| object_missing_required | Spec object missing required attribute |
| object_cardinality_over | Attribute cardinality exceeded |
| object_cast_failures | Attribute type cast failure |
| object_invalid_enum | Invalid enum value |
| object_invalid_date | Invalid date format (expected YYYY-MM-DD) |
| object_bounds_violation | Value outside declared bounds |
| object_duplicate_pid | Duplicate PID across spec objects |
| float_orphan | Float has no parent object (orphan) |
| float_duplicate_label | Duplicate float label in specification |
| float_render_failure | External render failure |
| float_invalid_type | Invalid float type reference |
| relation_unresolved | Unresolved link (PIDs are case-sensitive) |
| relation_dangling | Dangling relation (target not found) |
| relation_ambiguous | Ambiguous float reference |
| view_materialization_failure | View materialization failure |
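Because diagnostics are emitted one JSON object per line, they are easy to post-process in a CI script. The sketch below tallies diagnostics by level and policy key; it assumes the policy key appears in square brackets at the start of the message, as in the sample line in Diagnostic Output, and the function name is hypothetical.

```python
import json
import re
from collections import Counter

KEY_RE = re.compile(r"^\[([A-Za-z0-9_]+)\]")

def summarize_diagnostics(ndjson_text):
    """Count diagnostics by (level, policy key) from captured stderr text."""
    counts = Counter()
    for raw in ndjson_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        try:
            diag = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip any non-JSON line interleaved on stderr
        if not isinstance(diag, dict):
            continue
        m = KEY_RE.match(diag.get("message", ""))
        key = m.group(1) if m else "(no key)"
        counts[(diag.get("level", "unknown"), key)] += 1
    return counts
```

A script could, for instance, fail a build only when the returned counter contains entries whose level is `error`.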
8.3 Suppressing Validation Rules
Every diagnostic listed in Table 16 can be suppressed or downgraded
in project.yaml using its
policy key:
validation:
  float_orphan: ignore # suppress entirely
  relation_unresolved: warn # downgrade to warning
All proofs default to error (halt the build). Set a key to warn to emit a warning without halting, or ignore to suppress the diagnostic entirely. Custom proofs can define their own policy keys; see creating-a-model: Validation Proofs in the model guide.
9 Incremental Builds
9.1 Build Cache Mechanism
- File hashing – SHA-1 hash of each input file.
- Include dependency tracking – Tracked in the build_graph table.
- Cache comparison – Current hashes vs the build_cache table.
- Skip decision – Unchanged documents reuse cached SpecIR data.
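The skip decision can be pictured as follows. In this Python sketch, two dicts stand in for the build_graph and build_cache tables; their shapes and the function names are assumptions made for illustration, not the actual schema.

```python
import hashlib
from pathlib import Path

def sha1_of(path):
    """Return the SHA-1 hex digest of a file's content."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()

def is_unchanged(doc, build_graph, build_cache):
    """Return True when a document and all of its includes match the cached hashes.

    build_graph maps a document path to its included file paths;
    build_cache maps a file path to its stored SHA-1 (both are assumed shapes).
    """
    for path in [doc, *build_graph.get(doc, [])]:
        if not Path(path).exists() or build_cache.get(path) != sha1_of(path):
            return False
    return True
```

A document is skipped only if every file it depends on is unchanged, which matches the earlier statement that editing any included file triggers a rebuild.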
9.2 Forcing a Full Rebuild
specc clean
specc build
10 Type System and Models
10.1 Custom Models
Set template: mymodel in project.yaml. Types load in order:
1. models/default/types/ – Always loaded first.
2. models/mymodel/types/ – Loaded as overlay.
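The overlay behavior can be pictured as a dictionary merge keyed by type id, where the later layer wins. The Python sketch below is an illustration under that assumption. The `load_types` function and the `source` field are hypothetical and exist only to make the override visible.

```python
def load_types(*layers):
    """Merge type layers in load order; a later type with the same id replaces the earlier one."""
    registry = {}
    for layer in layers:
        for type_def in layer:
            registry[type_def["id"]] = type_def
    return registry


default_types = [
    {"id": "SECTION", "source": "default"},
    {"id": "FIGURE", "source": "default"},
]
mymodel_types = [
    {"id": "FIGURE", "source": "mymodel"},  # same id: overrides the default FIGURE
    {"id": "HLR", "source": "mymodel"},     # new id: added alongside the defaults
]

registry = load_types(default_types, mymodel_types)
for type_id in sorted(registry):
    print(type_id, registry[type_id]["source"])
```

SECTION is inherited untouched, which is why a custom model only needs to define the types it adds or overrides.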
For a complete walkthrough on creating custom types, including object types, float types, relation types with inference rules, and view types, see the companion Creating a Custom Model guide. Key sections include:
- creating-a-model: Model Directory Layout – Directory structure for models.
- creating-a-model: Type Definition Pattern – Schema keys by category and handler lifecycle.
- creating-a-model: Walkthrough: Custom Object Type – Custom object type with attributes.
- creating-a-model: Walkthrough: Custom Relation with Inference Rules – Relation inference scoring.
- creating-a-model: Validation Proofs – SQL-based proof views for the VERIFY phase.
10.2 Built-in Models
SpecCompiler ships with default and sw_docs. The default model provides general-purpose
types (specifications, sections, floats, cross-references, views). The
sw_docs model overlays default with types for requirements
engineering and traceability:
- Object types: HLR, LLR, NFR, VC, TR, FD, CSC, CSU, DIC, DD, SF (all extend a common TRACEABLE base with status attribute and PID auto-generation)
- Specification types: SRS, SDD, SVC, SUM, TRR (document templates with version, status, date)
- Relation types: TRACES_TO, BELONGS, REALIZES, XREF_DECOMPOSITION, XREF_DIC (traceability links with specificity-based inference)
- View types: TRACEABILITY_MATRIX, TEST_RESULTS_MATRIX, TEST_EXECUTION_MATRIX, COVERAGE_SUMMARY, REQUIREMENTS_SUMMARY (query-based tables materialized from the SpecIR)
- Proofs: Traceability chain validation (VC-HLR, TR-VC, FD-CSC/CSU coverage)
- Postprocessor: Interactive single-file HTML5 web application
The docs/engineering_docs/
directory in this repository uses sw_docs and serves as a living example
of the model in practice.
10.3 Type Directory Structure
models/{template}/
  types/
    objects/          # Spec object types
    floats/           # Float types
    relations/        # Relation types
    views/            # View types
    specifications/   # Specification types
  postprocessors/     # Format-specific post-processing
  styles/             # DOCX style presets
  filters/            # Pandoc Lua filters per output format
10.4 Type Module Structure
Object type:
local M = {}

M.object = {
  id = "HLR",
  long_name = "High-Level Requirement",
  pid_prefix = "HLR",
  pid_format = "%s-%03d",  -- prefix plus zero-padded number, e.g. HLR-001
  attributes = {
    { name = "priority", datatype_ref = "PRIORITY_ENUM",
      min_occurs = 1, max_occurs = 1,  -- exactly one value required
      values = {"High", "Medium", "Low"} },
  }
}

return M
Relation type:
local M = {}

M.relation = {
  id = "TRACES_TO",
  link_selector = "@",
  source_type_ref = "LLR",  -- links from LLR objects
  target_type_ref = "HLR",  -- to HLR objects
}

return M
10.5 Attribute Schema Fields
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | required | Attribute identifier |
| datatype_ref | string | STRING | STRING, INTEGER, REAL, BOOLEAN, DATE, ENUM, XHTML |
| min_occurs | integer | 0 | Minimum values (0 optional, 1 required) |
| max_occurs | integer | 1 | Maximum values |
| min_value | number | nil | Lower bound for numeric values |
| max_value | number | nil | Upper bound for numeric values |
| values | list | nil | Valid enum values |
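To show how these fields interact, the Python sketch below checks a list of values against one attribute schema and reports the policy keys that would plausibly apply. The mapping from each field to a policy key is an assumption inferred from the key names in the diagnostics table, and `check_attribute` is a hypothetical helper, not a SpecCompiler function.

```python
def check_attribute(schema, values):
    """Return the policy keys violated by `values` for one attribute schema."""
    problems = []
    min_occurs = schema.get("min_occurs", 0)  # default from the table: 0
    max_occurs = schema.get("max_occurs", 1)  # default from the table: 1

    if len(values) < min_occurs:
        problems.append("object_missing_required")
    if len(values) > max_occurs:
        problems.append("object_cardinality_over")

    allowed = schema.get("values")  # nil/None means no enum restriction
    if allowed is not None and any(v not in allowed for v in values):
        problems.append("object_invalid_enum")

    low, high = schema.get("min_value"), schema.get("max_value")
    for v in values:
        if not isinstance(v, (int, float)):
            continue  # bounds apply to numeric values only
        if (low is not None and v < low) or (high is not None and v > high):
            problems.append("object_bounds_violation")
            break
    return problems


priority = {"name": "priority", "min_occurs": 1, "max_occurs": 1,
            "values": ["High", "Medium", "Low"]}
weight = {"name": "weight", "min_value": 0, "max_value": 10}  # hypothetical attribute

print(check_attribute(priority, []))
print(check_attribute(priority, ["Urgent"]))
print(check_attribute(priority, ["High", "Low"]))
print(check_attribute(weight, [11]))
```

An attribute with min_occurs = 1 and no value is the case reported by the object_missing_required sample diagnostic earlier in this guide.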
11 Troubleshooting
11.1 Docker Not Running
If the build fails with Error: Docker is not running, start the Docker daemon and verify it with docker info.
11.2 No project.yaml Found
Run specc build from the
directory containing project.yaml.
11.3 PlantUML Render Failure
Verify PlantUML syntax, ensure Docker image has Java JRE, check @startuml/@enduml markers.
11.4 Unresolved Relations
PIDs are case-sensitive. Verify target PID exists in
doc_files. For cross-document
references, ensure both documents are listed in the same project.yaml.
11.5 Build Seems Stale
specc clean
specc build
11.6 Debugging
SPECCOMPILER_LOG_LEVEL=DEBUG specc build
12 Known Limitations
- No interactive validation – Batch mode only, no LSP or watch mode.
- Docker-only distribution – Native install requires replicating the full build environment (scripts/build.sh --install is provided but requires all system dependencies).
- Single-writer SQLite – Concurrent builds cause locking errors; use separate output directories.
- Float labels per-specification – Same label can exist across specs; use scoped syntax for cross-spec references.
- PID case sensitivity – [hlr-001](@) will not match @HLR-001.
13 List of Figures
14 List of Tables
- Table 1 - Runtime prerequisites
- Table 2 - Wrapper subcommands
- Table 3 - Required project fields
- Table 4 - Default configuration values
- Table 5 - Float syntax components
- Table 6 - Default model selectors
- Table 7 - Relation syntax patterns
- Table 8 - Type inference scoring dimensions
- Table 9 - Default view types
- Table 10 - System Components
- Table 11 - Performance Metrics
- Table 12 - Cross-reference selector comparison
- Table 13 - Environment variables
- Table 14 - Exit Codes
- Table 15 - HTML5 output options
- Table 16 - Validation diagnostics
- Table 17 - Attribute schema fields
15 List of Listings
16 List of Abbreviations
| AST | Pandoc Abstract Syntax Tree |
| CLI | Command-Line Interface |
| CSV | Comma-Separated Values |
| EAV | Entity-Attribute-Value |
| GFM | GitHub-Flavored Markdown |
| NDJSON | Newline-Delimited JSON |
| PID | Project Identifier |
| SpecIR | Specification Intermediate Representation |
| SQL | Structured Query Language |
| SQLite | SQLite Database |