Overview

The system has three layers: a Claude Code skill that orchestrates the query pipeline, an MCP server that exposes three tools, and a local Oxigraph RDF store.

```mermaid
flowchart TD
    A["Claude Code Skill (/dprr)<br/>analyse → generate → validate → execute → synthesise"]
    B["MCP Server (dprr-server)<br/>get_schema · validate_sparql · execute_sparql"]
    C["Local Oxigraph Store<br/>RDF triples from DPRR dataset"]
    A -- "MCP protocol (streamable-http)" --> B
    B --> C
```

Server startup and data initialisation

On first startup the server automatically downloads the DPRR RDF dataset from the latest GitHub release and bulk-loads it into a local Oxigraph store. On subsequent startups it reopens the existing store read-only, so the server holds no write lock on the store files.

```mermaid
flowchart TD
    start(["dprr-server starts"]) --> resolve["Resolve data directory<br/>DPRR_DATA_DIR →<br/>$XDG_DATA_HOME/dprr-mcp →<br/>~/.local/share/dprr-mcp"]
    resolve --> check{"store/ exists<br/>and non-empty?"}
    check -- Yes --> readonly["Open store read-only<br/>Store.read_only()"]
    check -- No --> ttl{"dprr.ttl exists<br/>in data dir?"}
    ttl -- Yes --> load
    ttl -- No --> fetch["Download tarball<br/>DPRR_DATA_URL or<br/>latest GitHub release"]
    fetch --> extract["Extract dprr.ttl<br/>to data directory"]
    extract --> load["Create mutable store<br/>Store.bulk_load(dprr.ttl)"]
    load --> close["Close write handle"]
    close --> readonly
    readonly --> context["Load YAML context files<br/>prefixes · schemas · examples · tips"]
    context --> schema_dict["Build schema_dict<br/>class → predicate → range mappings"]
    schema_dict --> ready(["Server ready<br/>listening on /mcp"])
```

Tool execution

get_schema

Returns the DPRR ontology overview — PREFIX declarations, a one-line summary of each class, and cross-cutting query tips. No query execution or store access.

```mermaid
flowchart LR
    call(["get_schema()"]) --> prefixes["Format PREFIX<br/>declarations"]
    prefixes --> classes["Render class<br/>summary"]
    classes --> tips["Collect<br/>cross-cutting tips"]
    tips --> result(["Return prefixes<br/>+ classes + tips"])
```
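Because `get_schema` is pure formatting over the preloaded YAML context, it reduces to a string builder. The argument shapes below are illustrative; the real server reads these values from its context files:

```python
def get_schema(prefixes: dict[str, str], classes: dict[str, str], tips: list[str]) -> str:
    """Assemble the ontology overview: PREFIX declarations, a one-line
    summary per class, and cross-cutting query tips. No store access."""
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    lines.append("")
    lines.extend(f"{cls}: {summary}" for cls, summary in classes.items())
    lines.append("")
    lines.extend(f"Tip: {tip}" for tip in tips)
    return "\n".join(lines)
```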

validate_sparql

Checks a query without executing it. Two-tier validation: first parse the query and auto-repair missing PREFIX declarations, then validate classes and predicates against the schema. Returns contextual tips and examples relevant to the query's classes.

```mermaid
flowchart TD
    call(["validate_sparql(sparql)"]) --> parse["Tier 1 — Parse query<br/>rdflib.prepareQuery()"]
    parse -- parse error --> detect{"Missing namespace<br/>prefix?"}
    detect -- Yes --> inject["Auto-inject PREFIX<br/>declarations from<br/>prefixes.yaml"]
    inject --> reparse["Re-parse query"]
    detect -- No --> fail1(["Return parse error<br/>+ relevant context"])
    parse -- ok --> sem
    reparse -- ok --> sem
    reparse -- error --> fail1
    sem["Tier 2 — Semantic validation<br/>Extract BGP triples from<br/>SPARQL algebra"] --> types["Map ?vars to classes<br/>via rdf:type triples"]
    types --> preds["Check each predicate<br/>against schema_dict"]
    preds -- unknown --> suggest["Suggest closest match<br/>(fuzzy matching)"]
    preds -- all valid --> ctx
    suggest --> ctx["Extract referenced classes<br/>Attach relevant tips + examples"]
    ctx --> result(["Return VALID / INVALID<br/>+ relevant tips and examples"])
```

execute_sparql

Runs the same two-tier validation pipeline, then executes the query against the read-only Oxigraph store with timeout protection. Results are returned in toons table format.

```mermaid
flowchart TD
    call(["execute_sparql(sparql, timeout)"]) --> v["Tier 1 + 2<br/>Same validation pipeline<br/>as validate_sparql"]
    v -- errors --> err(["Return errors<br/>+ relevant context"])
    v -- valid --> exec["Tier 3 — Execute query<br/>store.query(sparql)<br/>against read-only Oxigraph"]
    exec --> rows["Extract variable bindings<br/>into row dicts"]
    rows --> timeout{"Completed<br/>within timeout?"}
    timeout -- Yes --> format(["Format results<br/>as toons table"])
    timeout -- No --> terr(["Return timeout error"])
```
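The timeout guard in Tier 3 can be sketched with `concurrent.futures`. The helper name and return shape are illustrative, and note that a thread-based timeout cannot cancel a running query; it only stops waiting for the result:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as QueryTimeout

def execute_with_timeout(run_query, sparql: str, timeout: float) -> dict:
    """Run an already-validated query in a worker thread and stop waiting
    after `timeout` seconds. The worker itself is not killed; we simply
    report a timeout error back to the caller."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(run_query, sparql)
        rows = future.result(timeout=timeout)
        return {"rows": rows}
    except QueryTimeout:
        return {"error": f"query exceeded {timeout}s timeout"}
    finally:
        # Return immediately rather than blocking on a still-running query.
        pool.shutdown(wait=False, cancel_futures=True)
```

In the server, `run_query` corresponds to the `store.query(sparql)` call against the read-only Oxigraph store shown in the diagram.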