Cookpit v3.2 — Canonical ID Derivation Profile

The canonical generation profile referenced by rules.md G3 and M1 as cookpit-ai-canonical-v3.2. This document publishes the exact entityType, canonicalContent and canonicalPosition strings per entity type, and the SHA-256 hashing rule that turns those strings into the file's deterministic ids.

Implementations MUST follow this profile exactly. Two independent implementations using this profile against the same source recipe MUST produce byte-identical ids. The validator's V-IDS-DETERMINISTIC hard criterion checks each id by recomputing it under this profile and comparing.


1. The derivation formula

Every entity in a v3.2 cooking file carries a deterministic id of the shape <typePrefix><10 hex> (rule G1). The 10 hex digits are the first 10 characters of the lowercase SHA-256 hex digest of the canonical input string:

<typePrefix> + first 10 hex of SHA-256(<canonical input string>)

The canonical input string has a fixed shape:

v3.2|<entityType>|<canonicalContent>|<canonicalPosition>

The four components are joined with the ASCII pipe character | (U+007C). The version literal v3.2 is always the first component; this scopes ids to the v3.2 schema, so a v3.3 derivation hashes a different input string than v3.2 for the same entity.

Each entity type defines its own entityType literal, canonicalContent extraction and canonicalPosition rule, in §3 below.
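As a minimal sketch of the string shape described above, the helper below joins the four components with the pipe character. The names SCHEMA_VERSION and canonical_input are illustrative, not part of the profile.

```python
SCHEMA_VERSION = "v3.2"  # the version literal is always the first component


def canonical_input(entity_type: str, content: str, position: str) -> str:
    """Join the four components with the ASCII pipe character (U+007C)."""
    return "|".join([SCHEMA_VERSION, entity_type, content, position])


# An empty canonicalPosition still contributes its separator,
# so the file-level input ends with a trailing pipe.
file_input = canonical_input("file", "Ultimate spaghetti carbonara", "")
ingredient_input = canonical_input("ingredient", "Pancetta", "0")

print(file_input)
print(ingredient_input)
```

Both strings match the inputs shown in the worked examples of §3.1 and §3.2.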


2. Hashing rules

  1. Encoding. The canonical input string is encoded as UTF-8 bytes before hashing. No BOM. No trailing newline. Exact byte-equality matters.

  2. Hash function. SHA-256 (FIPS 180-4). Produces a 32-byte digest rendered as 64 lowercase hexadecimal characters.

  3. Truncation. The first 10 characters of the lowercase hex rendering are taken as the id suffix. This is a 40-bit hash space (10 hex digits × 4 bits); the V-IDS-UNIQUE criterion guards against collisions within a single file. Across the corpus, 40 bits is sufficient given typical file sizes (≤ 200 entities per file): for 200 entities the birthday-bound probability of any collision within one file is about 1.8 × 10^-8.

  4. Type prefix. Prepended to the truncated hash per rule G2:

    Entity                          Prefix
    cooking file / liveCook         f
    ingredient                      i
    equipment                       e
    utensil                         u
    sundry                          s
    prerequisite item (any group)   q
    hotspot                         h
    process                         p
    task                            t
    prepCook phase                  y
    preCook phase                   z

    Result: ^[a-z][0-9a-f]{10}$ (rule G1).
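The four hashing rules can be sketched step by step as follows. The function names, the ID_PATTERN constant and the collision_probability helper are illustrative assumptions; the collision estimate uses the standard birthday approximation and is not part of the profile itself.

```python
import hashlib
import math
import re

ID_PATTERN = re.compile(r"^[a-z][0-9a-f]{10}$")  # id shape from rule G1


def hash_suffix(canonical_input: str) -> str:
    data = canonical_input.encode("utf-8")     # rule 1: UTF-8 bytes, no BOM, no newline
    digest = hashlib.sha256(data).hexdigest()  # rule 2: 64 lowercase hex characters
    return digest[:10]                         # rule 3: first 10 hex chars (40 bits)


def make_id(prefix: str, canonical_input: str) -> str:
    return prefix + hash_suffix(canonical_input)  # rule 4: prepend the type prefix


def collision_probability(n_entities: int, bits: int = 40) -> float:
    """Birthday approximation: chance that any two of n ids share a suffix."""
    pairs = n_entities * (n_entities - 1) / 2
    return -math.expm1(-pairs / 2 ** bits)


example_id = make_id("i", "v3.2|ingredient|Pancetta|0")
p_200 = collision_probability(200)

print(bool(ID_PATTERN.match(example_id)))  # the id has the rule G1 shape
print(p_200)                               # on the order of 1e-8
```

The estimate treats all ids in a file as sharing one 40-bit space, which is pessimistic because ids with different type prefixes cannot collide.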


3. canonicalContent and canonicalPosition per entity type

The canonicalContent is a deliberate, stable representation of the entity's identity. The canonicalPosition disambiguates entities with identical canonical content (same-noun-different-purpose ingredients, repeated process labels, etc.).

3.1 cooking file (f…)

Component          Value
entityType         file
canonicalContent   The recipe's name field, verbatim (Unicode-NFC)
canonicalPosition  empty string

Worked example:

input : "v3.2|file|Ultimate spaghetti carbonara|"
sha256 : 3cdfe50d066a3a7d2cb8b8b1f2... (truncate to 10 hex)
output : f3cdfe50d06
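The sketch below illustrates why the NFC requirement matters for the file-level input. It assumes an implementation normalises the name to NFC and changes nothing else; the recipe name and the helper file_canonical_input are hypothetical.

```python
import unicodedata


def file_canonical_input(name: str) -> str:
    # Assumption: normalise to NFC only; whitespace and case are left untouched.
    return "v3.2|file|" + unicodedata.normalize("NFC", name) + "|"


# Two spellings of the same hypothetical recipe name:
decomposed = "Gruye\u0300re souffle\u0301"  # base letters + combining accents
composed = "Gruy\u00e8re souffl\u00e9"      # precomposed code points

# The raw strings differ, but their canonical inputs are identical after NFC.
same_input = file_canonical_input(decomposed) == file_canonical_input(composed)
print(decomposed == composed)  # False
print(same_input)              # True
```

Without the normalisation step, the two spellings would encode to different UTF-8 bytes and therefore hash to different ids.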

3.2 ingredient (i…)

Component          Value
entityType         ingredient
canonicalContent   The cookpit.ingredients[].text field, verbatim (NFC)
canonicalPosition  The 0-based index of the ingredient in cookpit.ingredients[], as a decimal integer string

Worked example (carbonara ingredient at index 0):

input : "v3.2|ingredient|Pancetta|0"
output : i01fb4e921a

The position component disambiguates same-noun-different-purpose ingredients (e.g. pork-fillet-braised-cheeks-and-pork-belly's two parsley entries — each carries a distinct text and a distinct position).
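A short sketch of how the index disambiguates entries follows. The ingredient list is hypothetical and deliberately repeats one text verbatim, which is a harsher case than the parsley example above; only the text field and the index rule come from the profile.

```python
import hashlib


def derive_id(prefix, entity_type, content, position):
    canonical_input = f"v3.2|{entity_type}|{content}|{position}"
    return prefix + hashlib.sha256(canonical_input.encode("utf-8")).hexdigest()[:10]


# Hypothetical ingredient list with the same text at index 0 and index 2.
ingredients = [
    {"text": "Flat-leaf parsley, chopped"},
    {"text": "Pork belly"},
    {"text": "Flat-leaf parsley, chopped"},
]

# The 0-based index is rendered as a decimal string before hashing.
ingredient_ids = [
    derive_id("i", "ingredient", item["text"], str(index))
    for index, item in enumerate(ingredients)
]

for item, entity_id in zip(ingredients, ingredient_ids):
    print(entity_id, item["text"])
```

The two parsley entries share their canonicalContent, but their canonical inputs end in 0 and 2 respectively, so they hash to different ids.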

3.3 equipment (e…)

Component          Value
entityType         equipment
canonicalContent   The cookpit.equipment[].text field, verbatim (NFC)
canonicalPosition  The 0-based index in cookpit.equipment[]

3.4 utensil (u…)

Component          Value
entityType         utensil
canonicalContent   The cookpit.utensils[].text field, verbatim (NFC)
canonicalPosition  The 0-based index in cookpit.utensils[]

3.5 sundry (s…)

Component          Value
entityType         sundry
canonicalContent   The cookpit.sundries[].text field, verbatim (NFC)
canonicalPosition  The 0-based index in cookpit.sundries[]

3.6 prerequisite item (q…)

Component          Value
entityType         One of prereq-ingredient, prereq-equipment, prereq-utensil, prereq-sundry, prereq-skill, prereq-note
canonicalContent   The prereq item's text field, verbatim (NFC)
canonicalPosition  The 0-based index of the item within its containing prereq group, as a decimal integer string

The entityType distinguishes ingredient / equipment / utensil / sundry / skill / note prereq groups so a prereq item with identical text in different groups receives distinct ids.
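The group-to-entityType mapping might be sketched as a lookup table. The dictionary keys (the group names) are assumptions made for illustration; only the six entityType literals on the right-hand side come from this profile.

```python
# Hypothetical group names mapped to the entityType literals of this section.
PREREQ_ENTITY_TYPES = {
    "ingredients": "prereq-ingredient",
    "equipment": "prereq-equipment",
    "utensils": "prereq-utensil",
    "sundries": "prereq-sundry",
    "skills": "prereq-skill",
    "notes": "prereq-note",
}


def prereq_canonical_input(group: str, text: str, index: int) -> str:
    return f"v3.2|{PREREQ_ENTITY_TYPES[group]}|{text}|{index}"


# The same text at the same index in two different groups:
text = "Pancetta finely chopped, rind off."
as_ingredient = prereq_canonical_input("ingredients", text, 0)
as_note = prereq_canonical_input("notes", text, 0)

print(as_ingredient)
print(as_note)
```

Because the two canonical inputs differ in their entityType component, the resulting q… ids differ even though text and index are identical.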

3.7 hotspot (h…)

Component          Value
entityType         hotspot
canonicalContent   The hotspot's text field, verbatim (NFC)
canonicalPosition  The 0-based index in cookpit.prerequisites.hotspots[]

3.8 process (p…)

Component          Value
entityType         One of process (liveCook), prepCook-process (prepCook), preCook-process (preCook)
canonicalContent   The phase's processes[].label field, verbatim (NFC)
canonicalPosition  The 0-based index of the process within its phase's processes[] array

The phase-prefixed entityType ensures that two phases with identically-labelled processes (e.g. a Resting the meat process in both preCook and liveCook of a single file) receive distinct ids.
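One way to express the phase-dependent entityType is a small helper that leaves liveCook unprefixed and prefixes the other two phases. The helper name phase_entity_type is an assumption; the same pattern also yields the task literals listed in §3.9.

```python
def phase_entity_type(kind: str, phase: str) -> str:
    """Return the entityType for a 'process' or 'task' in the given phase."""
    if phase == "liveCook":
        return kind               # unprefixed: process / task
    if phase in ("prepCook", "preCook"):
        return f"{phase}-{kind}"  # e.g. preCook-process
    raise ValueError(f"unknown phase: {phase!r}")


def process_canonical_input(phase: str, label: str, index: int) -> str:
    return f"v3.2|{phase_entity_type('process', phase)}|{label}|{index}"


# The same label at index 0 in two phases of one file:
pre = process_canonical_input("preCook", "Resting the meat", 0)
live = process_canonical_input("liveCook", "Resting the meat", 0)

print(pre)
print(live)
```

The two inputs differ only in the entityType component, which is enough to give the two processes distinct ids.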

3.9 task (t…)

Component          Value
entityType         One of task (liveCook), prepCook-task (prepCook), preCook-task (preCook)
canonicalContent   The phase's tasks[].action field, verbatim (NFC)
canonicalPosition  The tasks[].time field (e.g. 00:00:30.M1)

Tasks use the lane-time as position rather than the array index, because the array index can shift when alarms are added or re-ordered. The lane-time is the canonical identity moment within the phase. The phase-prefixed entityType ensures cross-phase distinctness — two phases may legitimately schedule a task with identical action and identical lane-time, and they will still receive distinct ids.
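The stability argument can be illustrated by comparing time-keyed and index-keyed canonical inputs before and after an alarm is inserted. The task actions (other than Start.) and the lane-time 00:00:15.A1 are hypothetical; the index-keyed variant is shown only for contrast and is not what the profile specifies.

```python
def time_keyed(tasks):
    """Canonical inputs as this profile defines them: position = tasks[].time."""
    return {f"v3.2|task|{t['action']}|{t['time']}" for t in tasks}


def index_keyed(tasks):
    """Contrast case: position = array index (NOT used by the profile)."""
    return {f"v3.2|task|{t['action']}|{i}" for i, t in enumerate(tasks)}


original = [
    {"action": "Start.", "time": "00:00:00.A0"},
    {"action": "Add the pancetta.", "time": "00:00:30.M1"},
]

# A hypothetical alarm is inserted between the two existing tasks.
edited = [
    original[0],
    {"action": "Check the water.", "time": "00:00:15.A1"},
    original[1],
]

time_stable = time_keyed(original) <= time_keyed(edited)
index_stable = index_keyed(original) <= index_keyed(edited)

print(time_stable)   # True: every original input survives the edit
print(index_stable)  # False: the second task moved from index 1 to 2
```

Since the canonical input of each original task is unchanged under time keying, its id is unchanged as well.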

3.10 prepCook phase (y…)

Component          Value
entityType         prepCook
canonicalContent   The cookpit.prepCook.label field, verbatim (NFC)
canonicalPosition  The literal string prepCook

3.11 preCook phase (z…)

Component          Value
entityType         preCook
canonicalContent   The cookpit.preCook.label field, verbatim (NFC)
canonicalPosition  The literal string preCook

3.12 liveCook phase (no own id)

cookpit.liveCook does NOT carry its own id field. Its identity is the file's cookpit.id (f…). Any task or process inside cookpit.liveCook uses the unprefixed entityType (task, process) — preserving stability for files that have only a liveCook phase (the most common case in the corpus).


4. Worked walk-through: the canonical-id self-test

Implementations MUST pass this self-test:

import hashlib

def derive_id(prefix, entity_type, canonical_content, canonical_position):
    # Build the canonical input string (§1), hash its UTF-8 bytes (§2),
    # and prepend the type prefix to the first 10 hex characters.
    canonical_input = f"v3.2|{entity_type}|{canonical_content}|{canonical_position}"
    h = hashlib.sha256(canonical_input.encode("utf-8")).hexdigest()[:10]
    return f"{prefix}{h}"

assert derive_id("f", "file", "Ultimate spaghetti carbonara", "") == "f3cdfe50d06"
assert derive_id("i", "ingredient", "Pancetta", "0") == "i01fb4e921a"
assert derive_id("t", "task", "Start.", "00:00:00.A0") == "tafb65451f2"
assert derive_id("p", "process", "Boiling the spaghetti", "0") == "p1413fec000"
assert derive_id("u", "utensil", "Chef's knife", "0") == "u4a7126d800"
assert derive_id("q", "prereq-ingredient",
                 "Pancetta finely chopped, rind off.", "0") == "q63e37eefe6"

These vectors are taken from the published spaghetti_carbonara.v3.2.cpt.A.jsonld example at https://cookchow.com/recipes/3.2/; each matches the file's actual id. Implementations that disagree on any vector are non-conformant.


5. Stability requirements

  1. Whitespace and case in canonicalContent are significant. A leading space, trailing newline, or alternative casing produces a different id. Implementations MUST NOT normalise the canonical content beyond Unicode NFC.

  2. Diacritics in canonicalContent are preserved. Gruyère produces a different id from Gruyere. Files that use one form must consistently use that form.

  3. Position indices are stable across the file's lifetime. Once a file is published, its ingredient at index 5 must remain at index 5; inserting a new ingredient at index 3 shifts indices 3..N and would change the ids of every shifted entity. Schema-evolution tooling SHOULD warn before such shifts.

  4. Task lane-times are stable across the file's lifetime. Editing a task's time field changes its id, and the chef-app's runtime plan references rely on the id staying anchored to the action's source-derived moment.
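The stability requirements above can be checked with a short experiment. The ingredient list is hypothetical; derive_id repeats the formula from §1 so that the sketch is self-contained.

```python
import hashlib


def derive_id(prefix, entity_type, content, position):
    canonical_input = f"v3.2|{entity_type}|{content}|{position}"
    return prefix + hashlib.sha256(canonical_input.encode("utf-8")).hexdigest()[:10]


# Requirements 1 and 2: any change to the content bytes yields a different id.
baseline = derive_id("i", "ingredient", "Gruy\u00e8re", "0")
variants = {
    "leading space": derive_id("i", "ingredient", " Gruy\u00e8re", "0"),
    "trailing newline": derive_id("i", "ingredient", "Gruy\u00e8re\n", "0"),
    "lower case": derive_id("i", "ingredient", "gruy\u00e8re", "0"),
    "no diacritic": derive_id("i", "ingredient", "Gruyere", "0"),
}

# Requirement 3: inserting at index 3 shifts every later ingredient.
before = ["Spaghetti", "Pancetta", "Eggs", "Pecorino", "Black pepper"]
after = before[:3] + ["Garlic"] + before[3:]


def ids_by_text(texts):
    return {t: derive_id("i", "ingredient", t, str(i)) for i, t in enumerate(texts)}


ids_before, ids_after = ids_by_text(before), ids_by_text(after)
changed = [t for t in before if ids_before[t] != ids_after[t]]

print(changed)  # the ingredients whose index moved
```

Entries at indices 0 to 2 keep their ids, while the two entries after the insertion point are re-derived with new indices and so receive new ids.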


6. Conformance

A v3.2 file conforms to cookpit-ai-canonical-v3.2 if and only if every id in the file matches the value computed by the formula in §1 applied to the per-entity-type rules in §3. The validator's V-IDS-DETERMINISTIC criterion runs this check for every id and reports the first mismatch.
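A conformance check along these lines might look like the sketch below. It is not the validator's actual implementation: it covers only the file id, ingredients and liveCook tasks, and it assumes a dictionary layout in which the recipe name lives at cookpit["name"] and every entity stores its id under an "id" key.

```python
import hashlib


def derive_id(prefix, entity_type, content, position):
    canonical_input = f"v3.2|{entity_type}|{content}|{position}"
    return prefix + hashlib.sha256(canonical_input.encode("utf-8")).hexdigest()[:10]


def expected_ids(cookpit):
    """Yield (path, actual_id, expected_id) for a subset of entity types."""
    yield "cookpit.id", cookpit["id"], derive_id("f", "file", cookpit["name"], "")
    for i, item in enumerate(cookpit.get("ingredients", [])):
        yield (f"cookpit.ingredients[{i}].id", item["id"],
               derive_id("i", "ingredient", item["text"], str(i)))
    for task in cookpit.get("liveCook", {}).get("tasks", []):
        yield (f"cookpit.liveCook.tasks[{task['time']}].id", task["id"],
               derive_id("t", "task", task["action"], task["time"]))


def first_mismatch(cookpit):
    """Return the first (path, actual, expected) that disagrees, else None."""
    for path, actual, expected in expected_ids(cookpit):
        if actual != expected:
            return path, actual, expected
    return None


# Hypothetical in-memory file whose ids are derived under this profile.
sample = {
    "name": "Ultimate spaghetti carbonara",
    "id": derive_id("f", "file", "Ultimate spaghetti carbonara", ""),
    "ingredients": [
        {"text": "Pancetta", "id": derive_id("i", "ingredient", "Pancetta", "0")},
    ],
    "liveCook": {"tasks": [
        {"action": "Start.", "time": "00:00:00.A0",
         "id": derive_id("t", "task", "Start.", "00:00:00.A0")},
    ]},
}
clean_result = first_mismatch(sample)

sample["ingredients"][0]["id"] = "i0000000000"  # simulate a non-conformant id
tampered_result = first_mismatch(sample)
print(clean_result, tampered_result)
```

A full check would extend expected_ids to the remaining entity types in §3, using the phase-prefixed entityType for prepCook and preCook.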

Files generated before this profile was published may have used defensible-but-different conventions; those files MUST be re-derived against this profile to claim v3.2 conformance.