HDF

Human Data Forms (HDF)

Specification & Grammar

This document defines the Human Data Forms (HDF) file format: a Lisp-inspired, human-oriented data language intended as a replacement for JSON, YAML, and TOML for configuration, inventory, and infrastructure-style data.

1. Design Goals


2. Lexical Structure

2.1 Encoding


2.2 Whitespace

whitespace ::= " " | "\t" | "\n" | "\r"
ws          ::= whitespace*
ws1         ::= whitespace+
Whitespace separates tokens and has no semantic meaning.

2.3 Line Comments (Optional Sugar)

comment ::= ";" any_char_except_newline*
Line comments are treated as whitespace and have no semantic representation. The canonical comment mechanism is the rem form (see §9).
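
As an illustration, the whitespace and line-comment rules can be sketched as a single "skip trivia" step that a tokenizer runs between tokens. This is a minimal sketch in Python, not a normative algorithm; the name skip_trivia is hypothetical, and the function assumes it is only called between tokens (it does not know about ";" inside strings).

```python
WHITESPACE = " \t\n\r"  # the four characters listed in the whitespace rule


def skip_trivia(text: str, pos: int = 0) -> int:
    """Return the index of the first character at or after pos that is
    neither whitespace nor part of a ';' line comment."""
    n = len(text)
    while pos < n:
        ch = text[pos]
        if ch in WHITESPACE:
            pos += 1
        elif ch == ";":
            # A line comment runs up to, but does not include, the newline.
            while pos < n and text[pos] != "\n":
                pos += 1
        else:
            break
    return pos
```

Treating comments inside the same loop as whitespace mirrors the statement that line comments have no semantic representation: nothing is emitted for either.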

3. Top-Level Grammar

document ::= ws element* ws

4. Structural Forms

4.1 Lists (Forms)

list ::= "(" ws element* ws ")"
Lists are ordered and may contain values or other lists.

4.2 Elements

element ::= value | list
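
The list and element rules describe a simple recursive structure, which the following Python sketch reads into nested Python lists. It is deliberately incomplete and rests on stated assumptions: every non-list token is kept as an uninterpreted string, quoted strings, raw strings, and line comments are not handled, and the name read_document is hypothetical.

```python
def read_document(text: str) -> list:
    """Parse nested lists; every other token is kept as a plain string."""
    pos = 0
    n = len(text)

    def skip_ws() -> None:
        nonlocal pos
        while pos < n and text[pos] in " \t\n\r":
            pos += 1

    def read_element():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            items = []
            while True:
                skip_ws()
                if pos >= n:
                    raise ValueError("unterminated list")
                if text[pos] == ")":
                    pos += 1  # consume ")"
                    return items
                items.append(read_element())
        if text[pos] == ")":
            raise ValueError(f"unexpected ')' at offset {pos}")
        # Bare token: runs until whitespace or a parenthesis.
        start = pos
        while pos < n and text[pos] not in " \t\n\r()":
            pos += 1
        return text[start:pos]

    elements = []
    while True:
        skip_ws()
        if pos >= n:
            return elements
        elements.append(read_element())
```

For example, read_document("(cluster (node a b) (rem))") yields one top-level list whose items are the string "cluster" and two nested lists.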

5. Values

value ::= raw_string
        | quoted_string
        | number
        | boolean
        | null
        | keyword
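Parsing order is significant: alternatives are tried in the order listed. In particular, boolean and null precede keyword, so true, false, and null are read as literals even though they also match the keyword grammar (see §6).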
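
A minimal Python sketch of why the order matters, restricted to bare tokens that are not numbers or strings. The function name classify and the tagged-tuple return shape are illustrative assumptions, not part of HDF; the regular expression transcribes the keyword grammar from §6.

```python
import re

# Keyword grammar from §6: letter or "_" first, then letters, digits,
# or any of "_", ".", "/", ":", "-".
KEYWORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_./:-]*")


def classify(token: str):
    """Classify a bare token, trying the literals before the keyword rule."""
    if token == "true":
        return ("boolean", True)
    if token == "false":
        return ("boolean", False)
    if token == "null":
        return ("null", None)
    if KEYWORD_RE.fullmatch(token):
        return ("keyword", token)
    raise ValueError(f"not a literal or keyword: {token!r}")
```

If the keyword check ran first, "true" would be returned as a keyword, because KEYWORD_RE also matches it.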

6. Keywords (Unquoted Strings / Atoms)

Keywords are unquoted strings and serve as the default textual value.
keyword ::= keyword_initial keyword_subsequent*
keyword_initial ::= letter | "_"
keyword_subsequent ::= letter
                     | digit
                     | "_"
                     | "."
                     | "/"
                     | ":"
                     | "-"
letter ::= "A"…"Z" | "a"…"z"
digit  ::= "0"…"9"

Rules


7. Quoted Strings

Quoted strings support escape processing.
quoted_string ::= '"' quoted_char* '"'
quoted_char ::= escape_sequence
              | any_char_except_quote_backslash_newline

Escapes

escape_sequence ::= "\" escape_code
escape_code ::= '"'
              | "\"
              | "n"
              | "r"
              | "t"
              | "0"
              | "x" hex hex
              | "u" hex hex hex hex
              | "U" hex hex hex hex hex hex hex hex
hex ::= digit | "A"…"F" | "a"…"f"
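
The escape table can be sketched as a small decoder. This Python example is illustrative only: the name decode_quoted is hypothetical, it takes the text between the quotes, and it assumes that \x, \u, and \U each denote a Unicode code point, which the grammar above does not state.

```python
SIMPLE_ESCAPES = {'"': '"', "\\": "\\", "n": "\n", "r": "\r", "t": "\t", "0": "\0"}
HEX_ESCAPES = {"x": 2, "u": 4, "U": 8}  # escape code -> number of hex digits
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_quoted(body: str) -> str:
    """Decode the text between the quotes of a quoted_string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            # Quotes and newlines may not appear unescaped.
            if ch in '"\n':
                raise ValueError(f"unescaped {ch!r} at offset {i}")
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        code = body[i + 1]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code in HEX_ESCAPES:
            width = HEX_ESCAPES[code]
            digits = body[i + 2 : i + 2 + width]
            if len(digits) != width or not set(digits) <= HEX_DIGITS:
                raise ValueError(f"bad \\{code} escape at offset {i}")
            # Assumption: the hex value is a Unicode code point.
            out.append(chr(int(digits, 16)))
            i += 2 + width
        else:
            raise ValueError(f"unknown escape \\{code} at offset {i}")
    return "".join(out)
```

Unknown escape codes are rejected rather than passed through, on the assumption that an escape not listed in the grammar is an error.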

8. Raw Strings (Unified, Multiline)

Raw strings are the only multiline-capable string form. They support two variants: double-bracket (Lua-style) and single-bracket (Lisp-style).

Syntax

raw_string ::= "[" raw_equals "[" raw_content "]" raw_equals "]"
             | "[" raw_equals raw_content raw_equals "]"
raw_equals ::= "="*

Semantics

Examples

(rem "Double bracket")
[[line one
line two]]

(rem "Single bracket")
[line one
line two]

(rem "With equal signs to allow nested brackets")
[=[ This [ ] is allowed because of the equal sign ]=]
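
A minimal Python sketch of reading the double-bracket variant. It rests on assumptions the grammar above does not spell out: the closing delimiter must carry the same number of equal signs as the opening one (as in Lua long brackets), and the content is taken verbatim with no escape processing or newline trimming. The single-bracket variant is not covered, and the name read_raw_string is hypothetical.

```python
import re

OPEN_RE = re.compile(r"\[(=*)\[")  # "[" raw_equals "["


def read_raw_string(text: str, pos: int = 0):
    """Read a double-bracket raw string starting at pos.

    Returns (content, end), where end is the index just past the closer.
    """
    m = OPEN_RE.match(text, pos)
    if m is None:
        raise ValueError("not a double-bracket raw string")
    # Assumption: the closer repeats the opener's run of "=" exactly.
    closer = "]" + m.group(1) + "]"
    end = text.find(closer, m.end())
    if end < 0:
        raise ValueError(f"missing closing {closer}")
    return text[m.end():end], end + len(closer)
```

Under that assumption, the content of a [=[ ... ]=] string may contain "]]" freely, since only "]=]" terminates it.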

9. Comment Forms (rem)

9.1 Principle

Comments are represented as syntactically valid data forms, not lexical trivia. A comment is any list whose first element is the keyword:
rem

9.2 Syntax

comment_form ::= "(" "rem" element* ")"
The rem form accepts zero or more elements. Valid examples:
(rem)
(rem note)
(rem "single-line comment")

9.3 Semantics (Normative)

Interpretation of the contents is entirely delegated to tools.
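
Because rem forms are ordinary lists, a tool that wants to ignore them can do so after parsing. The Python sketch below shows one such choice and is not required behavior. It assumes a hypothetical in-memory representation in which lists are Python lists and keywords are plain strings; a real implementation would need a distinct type for quoted strings so that (rem ...) and ("rem" ...) are not confused.

```python
def is_rem(node) -> bool:
    """True if node is a list whose first element is the keyword rem."""
    return isinstance(node, list) and len(node) > 0 and node[0] == "rem"


def strip_rem(node):
    """Return a copy of node with every rem form removed, at any depth."""
    if not isinstance(node, list):
        return node
    return [strip_rem(child) for child in node if not is_rem(child)]
```

Usage, again under the assumed representation:

```python
tree = ["cluster", ["rem", "Primary node"], ["node", "pve-01"], ["rem"]]
print(strip_rem(tree))  # ['cluster', ['node', 'pve-01']]
```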

9.4 Multiline Comments

Multiline comments are expressed using raw strings inside rem:
(rem [=[
This is a multiline comment.
It may contain arbitrary text.
]=])
No separate multiline comment syntax exists.

9.5 Structured Comments

Because rem accepts arbitrary elements, it naturally supports structured metadata:
(rem
  author alice
  since 2026-01-01
  note "autogenerated"
)
The HDF specification assigns no meaning to such structure.

9.6 Documentation Comments (Convention)

A common convention is that a rem form immediately preceding another form documents that form:
(rem "Main nginx vhost")
(vhost
  name example.com
  root /var/www/example
)
This association is conventional and not enforced by the grammar.
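
A tool that follows this convention could pair forms with their preceding rem form as sketched below in Python. The function name attach_docs and the list-of-strings representation are illustrative assumptions; when several rem forms appear in a row, this sketch keeps only the one immediately before the documented form.

```python
def attach_docs(forms: list) -> list:
    """Pair each non-rem form with the rem form immediately before it.

    Returns a list of (doc, form) tuples; doc is None when the form
    has no rem form directly in front of it.
    """
    result = []
    pending = None
    for form in forms:
        if isinstance(form, list) and form and form[0] == "rem":
            pending = form  # remember it for the next non-rem form
            continue
        result.append((pending, form))
        pending = None
    return result
```

A trailing rem form with nothing after it is simply dropped from the result, since there is no form for it to document.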

10. Literals

10.1 Boolean

boolean ::= "true" | "false"

10.2 Null

null ::= "null"

11. Numbers

number ::= integer | float
integer ::= "-"? digit+
float ::= "-"? digit+ "." digit+ exponent?
        | "-"? digit+ exponent
exponent ::= ("e" | "E") ("+" | "-")? digit+
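
The number grammar translates directly into two regular expressions. The Python sketch below is illustrative: the name parse_number is hypothetical, and mapping integers to Python int and floats to Python float is an assumption, since the grammar itself defines no range or precision.

```python
import re

# integer ::= "-"? digit+
INTEGER_RE = re.compile(r"-?[0-9]+")
# float ::= "-"? digit+ "." digit+ exponent?  |  "-"? digit+ exponent
FLOAT_RE = re.compile(r"-?[0-9]+(?:\.[0-9]+(?:[eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)")


def parse_number(token: str):
    """Convert a token to int or float according to the number grammar."""
    if INTEGER_RE.fullmatch(token):
        return int(token)
    if FLOAT_RE.fullmatch(token):
        return float(token)
    raise ValueError(f"not a number: {token!r}")
```

The explicit [0-9] class is used instead of \d so that only ASCII digits match, as the digit rule requires. Forms such as "1.", ".5", and "+1" are rejected because the grammar requires digits on both sides of the point and allows only a leading minus sign.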

12. Errors

Invalid constructs include:

13. Minimal Examples

(rem "Cluster definition")

(cluster
  (rem "Primary node")
  (node pve-01 10.0.0.1)

  (rem)
  (node pve-02 10.0.0.2)
)

14. Intentional Omissions