|
|
Investigation notes - current OwlUnit + RdfParserUnit baseline vs RDF 1.2
EXPORT SIDE (OwlUnit, ~1916 lines, single ExportOWL entry point)
- Serialisation: RDF/XML only, string-stitched into a TStringList and written via SaveToUtf8File. An IXMLDocument is created and then unused.
- Namespaces declared up front: rdf, rdfs, owl, xsd, dc, dcterms, skos, swrl plus four OWL-S namespaces (profile, grounding, process, service) and daml that are NEVER referenced in the body - dead declarations from an abandoned experiment, worth removing.
- Datatypes: DataTypeMap routes through dbdmd.RetrieveDataType("XML", ...) to xsd:* (string/date/time/integer/boolean/decimal/dateTimeStamp/hexBinary, plus owl:rational/owl:real). xsd: is rewritten to the full XMLSchema# IRI at output.
- Literals are language-tagged via xml:lang. BUG: the region subtag is stripped (en-US -> en) at lines 511-512, 575-576 and elsewhere. RDF 1.2 supports full BCP 47 - the strip is lossy.
- Multi-locale support is gated on Features.MultiLocale; each translation walked from Translations.Sections.
- OWL 2 idioms already present: owl:Restriction with owl:minCardinality/owl:maxCardinality (xsd:nonNegativeInteger), owl:hasKey rdf:parseType="Collection", owl:oneOf as rdf:first/rdf:rest cons-lists, owl:unionOf rdf:parseType="Collection".
- CaseTalk-specific data attached as <local:...> annotation properties (AddCustomAttribute) plus a DeclareCustomAnnotations pass to register them. Role readings forced into synthetic <local:Incoming rdf:ID="in..."/> + <local:Outgoing rdf:ID="out..."/> resources - a hand-rolled reification because RDF 1.1 has no clean way to attach metadata to a triple.
- Optional SKOS mixin (mixSKOS): skos:inScheme, skos:definition, skos:note, skos:example, skos:prefLabel.
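Why the region strip noted above is lossy can be shown with a small sketch. This is Python rather than the Delphi in OwlUnit, and strip_region is a hypothetical stand-in for whatever the code at the cited lines does, not a copy of it; the labels are made-up sample data.

```python
def strip_region(tag: str) -> str:
    """Hypothetical stand-in for the current behaviour: keep only the primary subtag."""
    return tag.split("-", 1)[0]


# Two translations that differ only by region (sample data, not from the repository).
labels = {"en-US": "color", "en-GB": "colour"}

# Keyed by the stripped tag, the second translation silently overwrites the first.
stripped = {}
for tag, text in labels.items():
    stripped[strip_region(tag)] = text

# Keyed by the full BCP 47 tag, both translations survive.
preserved = dict(labels)

print(stripped)   # {'en': 'colour'}
print(preserved)  # {'en-US': 'color', 'en-GB': 'colour'}
```

Even where no overwrite happens, an importer reading the stripped output has no way to recover which regional variant was exported.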
PARSER SIDE (RdfParserUnit, 93 lines, TRDFParser)
- Single regex: ^<...>\s+<...>\s+(<...>|"...")\s*\.\s*$ - strict N-Triples ONLY.
- No datatyped literals ("5"^^xsd:integer), no language tags ("hi"@en), no blank nodes (_:b), no comments, no Turtle prefixes, no escape handling, no multi-line. Triple is (Subject, Predicate, Object) as bare strings.
- Effectively a toy - will not round-trip its own export (export is RDF/XML, parser only reads NT).
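For scale, the gaps listed above that still fit on one N-Triples line (datatyped literals, language tags, blank nodes, comments) can be covered by a somewhat larger pattern. The sketch below is Python, not a proposal for the Delphi unit; NT_LINE, Triple and parse_line are illustrative names, escape sequences are matched but not decoded, and it makes no claim to full N-Triples conformance.

```python
import re
from typing import NamedTuple, Optional

# One N-Triples statement per line; verbose mode keeps the alternatives readable.
NT_LINE = re.compile(r'''
    ^\s*
    (?P<subject> <[^<>\s]*> | _:\w+ )
    \s+
    (?P<predicate> <[^<>\s]*> )
    \s+
    (?P<object>
        <[^<>\s]*>
      | _:\w+
      | "(?P<lexical> (?:[^"\\] | \\.)* )"
        (?: @(?P<lang> [A-Za-z]+(?:-[A-Za-z0-9]+)* ) (?:--(?P<dir> ltr|rtl ))?
          | \^\^<(?P<datatype> [^<>\s]*)>
        )?
    )
    \s*\.\s*
    (?:\#.*)?$
''', re.VERBOSE)


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str                    # lexical form for literals, raw term otherwise
    lang: Optional[str] = None
    datatype: Optional[str] = None


def parse_line(line: str) -> Optional[Triple]:
    """Return a Triple, or None for blank lines and comment lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    m = NT_LINE.match(stripped)
    if m is None:
        raise ValueError(f"not a recognised N-Triples line: {line!r}")
    lexical = m.group("lexical")
    obj = lexical if lexical is not None else m.group("object")
    return Triple(m.group("subject"), m.group("predicate"), obj,
                  m.group("lang"), m.group("datatype"))


print(parse_line('_:b0 <http://example.org/p> "hi"@en-US .'))
```

Turtle prefixes and multi-line statements are a different matter: they need a real tokenizer rather than a line regex, which is the cost to weigh when deciding what the unit is for.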
RDF 1.2 SHIFTS RELEVANT TO CASETALK
- Triple terms (<<:s :p :o>>) become first-class objects. Direct replacement for our synthetic local:Incoming/local:Outgoing rdf:ID reification - attach role-reading metadata to the triple itself.
- Base direction on language strings ("text"@en--ltr / --rtl) and rdf:dirLangString datatype.
- Quoted vs asserted triples distinction (model-level, not syntactic).
- BCP 47 language tags fully supported - vindicates removing the en-US -> en strip.
- Triple terms have no idiomatic RDF/XML syntax - they are Turtle/TriG/N-Quads-native. A Turtle export is the natural vehicle for adopting RDF 1.2.
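The language-tag and base-direction points above combine into one small formatting rule for a Turtle serialiser. A minimal sketch in Python, assuming the "@lang--dir" suffix form quoted in the notes; turtle_literal and RTL_LANGS are hypothetical names, and the RTL list is illustrative rather than exhaustive.

```python
# Primary language subtags treated as right-to-left (illustrative, not exhaustive).
RTL_LANGS = {"ar", "he", "fa", "ur"}


def turtle_literal(text: str, locale: str) -> str:
    """Format a language-tagged Turtle literal, keeping the full BCP 47 tag."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    primary = locale.split("-", 1)[0].lower()
    # Only RTL locales get an explicit base direction; LTR is left implicit here.
    suffix = "--rtl" if primary in RTL_LANGS else ""
    return f'"{escaped}"@{locale}{suffix}'


print(turtle_literal("Customer", "en-US"))   # "Customer"@en-US
print(turtle_literal("Customer", "ar-EG"))   # "Customer"@ar-EG--rtl
```

Leaving left-to-right implicit keeps the output readable by RDF 1.1 consumers for the common case; whether that trade-off is wanted is a design decision, not something the notes settle.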
PROPOSED FOLLOW-UPS (split into separate tickets once scope is agreed)
- Add a Turtle (.ttl) export path next to the RDF/XML path. Reuse the structural logic in OwlUnit; only change the serialiser.
- Replace the local:Incoming/Outgoing reification with RDF 1.2 triple terms in the Turtle path - keep the RDF/XML path unchanged for tooling that still consumes 1.1.
- Stop stripping the language-tag region (full BCP 47).
- Drop the dead OWL-S / daml namespace declarations.
- Decide what RdfParserUnit is for. If it is meant to import RDF, upgrade it to a real Turtle/RDF 1.2 parser (Jena is already shipped under Tools/jena and can preprocess to NT via riot if a Delphi rewrite is too costly). If it is dead code, remove it.
- Add a few RDF 1.2 datatypes to DataTypeMap (rdf:dirLangString and friends).
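The Jena preprocessing option in the RdfParserUnit bullet above could look roughly like this. It is a Python sketch of the shell-out only: the bin/riot location under Tools/jena and the model.ttl file name are assumptions, and the --output format name should be checked against riot --help for the shipped Jena version.

```python
import subprocess
from pathlib import Path
from typing import List


def riot_to_ntriples_cmd(riot: Path, source: Path) -> List[str]:
    """Build the riot command line that re-serialises `source` as N-Triples."""
    return [str(riot), "--output=N-Triples", str(source)]


def convert_to_ntriples(riot: Path, source: Path) -> str:
    """Run riot and return its N-Triples output; raises if riot exits non-zero."""
    result = subprocess.run(riot_to_ntriples_cmd(riot, source),
                            check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical paths, for illustration only.
cmd = riot_to_ntriples_cmd(Path("Tools/jena/bin/riot"), Path("model.ttl"))
print(cmd)
```

The existing NT-only parser would then consume the converted output unchanged, at the price of a JVM start per import.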
Out of scope here: SHACL/ShEx export, named-graph framing - those are RDF-stack improvements but not what the email asked about. |
|
|
Refinement: existing Turtle templates - consolidate, do not reinvent
The codebase already ships two Turtle/RDF export templates under develop/modules/templates/:
- ttl.ini ("Turtle (TTL)", 442 lines) - most polished. Emits @prefix rdf/rdfs/owl/xsd/skos/dcterms/foaf/vann/cc/schema/dbo. Sections cover ontology metadata, classes from Tables, subtypes from FKs, datatype properties from Columns, object properties from FKs, SKOS concept schemes from value-constrained Domains, OWL restrictions (cardinality, allValuesFrom + oneOf), annotation-property declarations, Schema.org alignment heuristics, VoID dataset, PROV-O provenance. Includes table.Verbalization -> :factExpressions / :businessRules triple-string annotations. Visible bugs: unbalanced parens in Schema.org alignment block, @(schema.Tables).Count emitted mid-Turtle as a literal.
- ig_turtle_export.ini ("Turtle Export", 352 lines) - less polished but multi-artefact: ontology + SHACL shapes + per-table instance file + SPARQL query examples. Uses a dedicated tmg: namespace (http://casetalk.com/tmg/schema#) for CaseTalk-specific predicates - much cleaner namespace design than ttl.ini's bare :hasArtificialKey style.
Scope vs OwlUnit: both templates walk schema.Tables/Columns/ForeignKeys/Domains - the derived relational schema, not the conceptual TMG. They cannot see OTFTs/ExpRels/Roles natively; they only get FCOIM content via table.Verbalization. OwlUnit goes straight at TMG and exports OTFTs, expressions and role readings.
Revised plan
- Keep OwlUnit as the OTFT-level RDF/XML export (the conceptual-model surface).
- Consolidate ttl.ini and ig_turtle_export.ini into a single canonical Turtle template: keep ig_turtle_export.ini's tmg: namespace design, fold in ttl.ini's PROV-O/VoID/annotation-property declarations, fix the parens and the @(...).Count escape, retire the duplicate.
- Adopt RDF 1.2 inside the Turtle template (not in OwlUnit): replace synthetic :factExpressions string annotations with triple terms (<<:Customer :hasReading "Customer places Order">> :assertedBy :CaseTalk), stop hard-coding @en (use database.Locale or table.Locale), and emit rdf:dirLangString for RTL locales. Triple terms have no idiomatic RDF/XML, so this is naturally Turtle-only.
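The triple-term replacement for the string annotations can be sketched as a single formatting function. This is Python standing in for the template logic, not the .ini syntax; annotate_reading is a hypothetical name, the prefixed names are taken from the example in the bullet above, and the "<< ... >>" form is used exactly as that example writes it.

```python
def annotate_reading(subject: str, predicate: str, reading: str,
                     annotation: str, value: str) -> str:
    """Emit one Turtle statement that annotates the triple carrying a reading."""
    literal = '"' + reading.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return f"<< {subject} {predicate} {literal} >> {annotation} {value} ."


line = annotate_reading(":Customer", ":hasReading", "Customer places Order",
                        ":assertedBy", ":CaseTalk")
print(line)
# << :Customer :hasReading "Customer places Order" >> :assertedBy :CaseTalk .
```

Compared with a :factExpressions string, the reading stays a queryable triple and the provenance hangs off that triple instead of being packed into the literal.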
This is filed as a follow-up ticket (template consolidation + RDF 1.2 upgrade) so 0005612 can stay as the broader research/audit issue. |
|
|
Resolved: the audit and first pass are complete. Concrete RDF 1.2 / OwlUnit work shipped in 15.0:
- OwlUnit cleanup (commit 6198eb560): removed the unused IXMLDocument scaffolding and the four Xml.* units it pulled in; pruned seven dead xmlns declarations (profile, grounding, dcterms, swrl, process, service, daml) from the rdf:RDF root. Output narrowed to what is actually emitted: rdf, rdfs, owl, xsd, dc, skos, casetalk, local.
- Turtle export consolidated and brought to RDF 1.2 in 0005614 (commit d93f7a439): ttl.ini retired, ig_turtle_export.ini becomes the canonical Turtle path with PROV-O / VoID / annotation declarations, verbalisations attached via triple terms (<< :C a owl:Class >> tmg:hasFactExpression "..."), SHACL + SPARQL companions.
- Investigation notes for OwlUnit and RdfParserUnit captured on this ticket - the inventory of what the export currently emits, what is broken, and where RDF 1.2 unlocks cleaner idioms.
Deferred (not blocking 15.0; open new tickets when scheduling):
- Stop stripping the BCP 47 region from the language tag in OwlUnit (en-US -> en).
- Replace OwlUnit's synthetic local:Incoming / local:Outgoing role-reading reification with triple terms in a future Turtle path inside OwlUnit (the .ini template covers the schema-level Turtle output; OwlUnit still emits OTFT-level RDF/XML only).
- RdfParserUnit is still NT-only. Decide whether to upgrade to a real Turtle/RDF 1.2 parser (Jena under Tools/jena can be shelled out as a preprocessor) or remove as dead code.
- A few RDF 1.2 datatypes (rdf:dirLangString and friends) for DataTypeMap.
|