OSW Schema

Revision as of 02:06, 29 April 2024 by Admin (talk | contribs) (Update package: OSW Docs - Core)
OSW Schema [OSWab674d663a5b472f838d8e1eb43e6784]
ID OSWab674d663a5b472f838d8e1eb43e6784
UUID ab674d66-3a5b-472f-838d-8e1eb43e6784
Label OSW Schema
Machine compatible name OswSchema

Description

Documentation about the OSW data schema

Item
Type(s)/Category(s) Article
CreativeWork
Article

Schema for the dynamic composition of nested class structures


Introduction

Transforming Semantic MediaWiki into a knowledge base with structured data used to require PageForms and multiple templates and forms per class:

  • Template to store the data and render the page
  • Form to edit the page
  • Form to query for instances of the class
  • Template to render the query results
  • Subtemplate to handle complex data

Reusing structures with this approach is difficult, and storing data in wikicode causes significant lock-in, because the data cannot be edited with existing standards and tools.

The new concept strictly stores data as JSON and uses wikicode only for structured text and render templates. Instead of distributing content among multiple pages, all relevant content is stored within different slots of the same page. Additionally, inheritance is fully supported, which enables broad reuse of any structure.

Overview

General

Dual hierarchy example

Why are Categories (Classes) different from Items (Instances)?

  • Pro
    • Reflects the RDF(S) and OWL standards
    • Reflects OOP in Python and other programming languages; only (?) JavaScript supports a similar concept with prototypes
    • Compatible with SMW features like rdf-export and Inferencing
  • Contra
    • The user needs to decide before creating a term, or has to move the term to a different namespace later (renaming/redirect)

Slots

Key | Model | Description
main | wikitext | default content slot, rendered between the page header and footer
jsonschema | json | stored within a category (=class) page, defining the schema for the jsondata slot of any category member (instance)
jsondata | json | structured data
schema_template | text | stored within a metacategory/-class, contains a handlebars template to build the jsonschema of a class from its jsondata slot
data_template | wikitext | stored within a category (=class) page, defining how the jsondata attributes of any category member (instance) are mapped to semantic properties
header_template | wikitext | stored within a category (=class) page, renders the page header of any category member (instance)
footer_template | wikitext | stored within a category (=class) page, renders the page footer of any category member (instance)
header | wikitext | renders the page header
footer | wikitext | renders the page footer
template/internal | wikitext | hidden content, not rendered. Can be used to call parser functions or lua modules

Meta-Schemas

CategoryClass is the default metacategory / -class for all categories / classes. Its slot schema_template contains a handlebars template that sets schema attributes like title, allOf, description, etc. from the user-generated jsondata. Additional metacategories can be created as subclasses of CategoryClass to simplify the creation of complex schemas, e. g. Category:OSWecff4345b4b049218f8d6628dc2f2f21. This feature is comparable to Python metaclasses.

Metacategories / -classes contain a handlebars template within the schema_template slot. The template is evaluated with the jsondata slot content to create / update the jsonschema slot content of any derived class on every edit.
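
The schema generation step can be pictured with a small Python sketch. This is not the actual handlebars template of CategoryClass: plain Python stands in for the template engine, and the jsondata keys used here (name, description, subclass_of) are assumptions made for illustration.

```python
def build_jsonschema(jsondata: dict) -> dict:
    """Derive a class schema from the jsondata of a category page.

    Hypothetical stand-in for a schema_template: the real system
    evaluates a handlebars template instead of Python code.
    """
    schema = {"title": jsondata["name"], "type": "object"}
    if "description" in jsondata:
        schema["description"] = jsondata["description"]
    # Every superclass becomes an allOf reference to its jsonschema slot.
    schema["allOf"] = [
        {"$ref": f"../Category/{parent.split(':', 1)[1]}.slot_jsonschema.json"}
        for parent in jsondata.get("subclass_of", [])
    ]
    return schema


example = build_jsonschema(
    {"name": "MyEntitySubclass", "subclass_of": ["Category:Entity"]}
)
print(example)
```

Re-running such a function on every edit keeps the jsonschema slot in sync with the user-edited jsondata slot, which is the role the text assigns to the schema_template.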


Json-Schema

Base

https://json-schema.org/ (Draft 4)

A JSON schema can reference other schemas. This is equivalent to [[Subcategory of::Entity]] (Semantic MediaWiki) and rdfs:subClassOf (RDFS/OWL).

{
    "title": "MyEntitySubclass",
    "type": "object",
    "allOf": [{"$ref": "../Category/Entity.slot_jsonschema.json"}]
}

Note: refs were previously written as "/wiki/Category:Entity?action=raw&slot=jsonschema", but this notation only works if /wiki is actually the root path of the system. Path-relative notations such as (./)Category:Entity and URL parameters are problematic, so the initial absolute and subsequent relative resolving is done via Special:SlotResolver, e. g. Special:SlotResolver/JsonSchema/Label.slot_main.json or Special:SlotResolver/Category/Entity.slot_jsonschema.json. To reuse a schema in an external tool you can use e. g. "$schema": "https://opensemantic.world/wiki/Special:SlotResolver/Category/Entity.slot_jsonschema.json"
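
The URL pattern in the examples above can be captured in a short Python helper. This is a minimal sketch: the function name slot_resolver_url is hypothetical, and it assumes page titles of the form Namespace:Name without special characters.

```python
def slot_resolver_url(base_url: str, page_title: str, slot: str) -> str:
    """Build a Special:SlotResolver URL for a page slot (illustrative helper)."""
    # "Category:Entity" -> namespace "Category", name "Entity"
    namespace, _, name = page_title.partition(":")
    return (
        f"{base_url.rstrip('/')}/Special:SlotResolver/"
        f"{namespace}/{name}.slot_{slot}.json"
    )


schema_url = slot_resolver_url(
    "https://opensemantic.world/wiki", "Category:Entity", "jsonschema"
)
print(schema_url)
```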

Json-Editor

https://github.com/json-editor/json-editor, which is used to render edit & query forms based on provided jsonschema, adds additional keywords and options.

Autocompletion

Enables autocompletion in input fields. For configuration, see Additional options.

Custom Extensions

Embedded i18n support:

The keywords title and description can be extended with additional keywords title* and description*, which hold an object with language keys (de, en, etc.) pointing to the translated strings.

{
    "title": "Default Title",
    "title*": {"en": "Title (en)", "de": "Titel (de)"}
}
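
How a consumer might pick the right string from such a schema is sketched below in Python. The helper name localized is hypothetical, and the fallback to the plain keyword when no translation exists is an assumption, not documented behavior.

```python
def localized(schema: dict, keyword: str, lang: str) -> str:
    """Return the translation of `keyword` for `lang`, if one is present."""
    translations = schema.get(keyword + "*", {})
    # Assumed fallback: use the untranslated keyword value.
    return translations.get(lang, schema.get(keyword, ""))


schema = {
    "title": "Default Title",
    "title*": {"en": "Title (en)", "de": "Titel (de)"},
}
print(localized(schema, "title", "de"))
print(localized(schema, "title", "fr"))
```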
Handlebars Template Helper

Whenever templates are supported, the following custom handlebars helpers are supported as well:

  • when
    • Syntax: {{#when <operand1> <operator> <operand2>}}...{{/when}}
    • Example: {{#when 2 'eq' 1}}equal{{else when var1 'gt' var2}}gt{{else}}lt{{/when}}
    • Example result: gt
    • Description: compare operator
    • Comment: Supported operators: see implementation
  • replace
    • Syntax: {{#replace <find> <replace>}}{{string}}{{/replace}}
    • Example: {{#replace "test" "test2"}}_test_{{/replace}}
    • Example result: _test2_
    • Description: string replace operator
  • split
    • Syntax: {{#split <find> <index>}}<string>{{/split}}
    • Example: {{#split "/" -1}}https://test.com/target{{/split}}
    • Example result: target
    • Description: string split operator
  • each_split
    • Syntax: {{#each_split <string> <find>}}...{{/each_split}}
    • Example: {{#each_split "https://test.com/target" "/"}}{{.}},{{/each_split}}
    • Example result: https:,,test.com,target,
    • Description: split result iterator
  • substring
    • Syntax: {{#substring start end}}<string>{{/substring}}
    • Example: {{#substring 0 -2}}My-test-string{{/substring}}
    • Example result: My-test-stri
    • Description: substring operator
    • Comment: negative indices are supported (counted from end-of-string)
  • calc
    • Syntax: {{calc <operand1> <operator> <operand2>}} or {{#calc <operand1> <operator>}}<operand2>{{/calc}}
    • Example: a={{calc (calc 1 '+' 1) '*' 10}} b={{#calc 3 '*'}}2{{/calc}}
    • Example result: a=20 b=6
    • Description: math callback
  • patternformat
    • Syntax: {{#patternformat <pattern>}}<string>{{/patternformat}} or {{patternformat <pattern> <value>}}
    • Example: {{patternformat '00.0000' '1.1' }}
    • Example result: 01.1000
    • Description: pattern formatter for both numbers and strings
  • dateformat
    • Syntax: {{dateformat <format> <date>}}
    • Example: {{dateformat 'Y' (_now_)}}
    • Example result: 2024
    • Description: formats a datetime value
    • Comment: supported formats: https://flatpickr.js.org/formatting/
  • _uuid_
    • Syntax: {{_uuid_}}
    • Example: {{_uuid_}}
    • Example result: ad56b31f-9fe5-466a-8be7-89bce58045f1
    • Description: uuidv4
  • _now_
    • Syntax: {{_now_}}
    • Example: {{_now_}}
    • Example result: 2024-02-04T04:31:08.050Z
    • Description: current datetime iso-format
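
The behavior of some of the string and math helpers listed above can be mirrored in plain Python, which is handy for checking expected results outside the wiki. This is only an emulation of the documented examples, not the helpers' implementation; in particular, the operator set of calc shown here is an assumption.

```python
import operator

# Assumed operator set; the real helper may support more or fewer.
OPERATORS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.truediv}


def replace(text: str, find: str, new: str) -> str:
    return text.replace(find, new)


def split(text: str, find: str, index: int) -> str:
    # Negative indices count from the end, as in the helper example.
    return text.split(find)[index]


def each_split(text: str, find: str, render=lambda part: part + ",") -> str:
    # `render` plays the role of the block body, here "{{.}},".
    return "".join(render(part) for part in text.split(find))


def substring(text: str, start: int, end: int) -> str:
    return text[start:end]


def calc(left, op: str, right):
    return OPERATORS[op](left, right)


print(split("https://test.com/target", "/", -1))
print(each_split("https://test.com/target", "/"))
print(substring("My-test-string", 0, -2))
print(calc(calc(1, "+", 1), "*", 10), calc(3, "*", 2))
```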

Note on helper and param naming collision: When a helper has the same name as a key in the json params, the helper takes priority. However, you can use this.<param> to enforce the param over the helper.

Example: helper 'test' returns 'helper', json data is {"test": "param"}.

Template "{{test}} {{#test}}{{/test}} {{this.test}}" will be evaluated to helper helper param

Special Template Variables

Available in format: dynamic_template

  • _current_user_
    • Description: The identity of the current active user
    • Example: {{{_current_user_}}}
    • Example result: User:MyUserName
  • _current_subject_
    • Description: The title / OSW-ID of the current page / entry
    • Example: {{{_current_subject_}}}
    • Example result: Item:OSWab674d663a5b472f838d8e1eb43e6784
  • _array_index_
    • Description: The index of an array item within its parent array
    • Example: {{{_array_index_}}}
    • Example result: 1
  • _global_index_
    • Description: Retrieves the smallest non-existing prefixed index for values of the specified property (default: Property:HasId). See also Additional options / global_index
    • Example: ID-{{{_global_index_}}}
    • Example result: with the existing entries "HasId::ID-0001", "HasId::ID-0002", "HasId::ID-0003" the template resolves to "ID-0004"
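
The index lookup behind _global_index_ could work roughly as in the following Python sketch. The function name and signature are hypothetical, and two details are assumptions: counting starts at the increment (1), and gaps in the existing sequence are filled before a new highest index is issued.

```python
import re


def next_global_index(existing_values, prefix, number_pattern="0000", increment=1):
    """Return the smallest unused prefixed index, e.g. 'ID-0004' (sketch)."""
    width = len(number_pattern)
    used = set()
    for value in existing_values:
        match = re.fullmatch(re.escape(prefix) + r"(\d+)", value)
        if match:
            used.add(int(match.group(1)))
    index = increment  # assumption: the first index equals the increment
    while index in used:
        index += increment
    return f"{prefix}{index:0{width}d}"


print(next_global_index(["ID-0001", "ID-0002", "ID-0003"], "ID-"))
```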

Additional keywords
Special jsonschema attributes ($properties.<property>.*)
  • range (alias: -; subkeys: -; value: <category>): range of a property in the sense of OWL, restricting the class(es) the referenced item can be an instance of. Currently supports only a single string. Multiple categories connected with AND or OR: tbd. Note: also used to generate a suitable inline editor to create or edit these items (see also $properties.<property>.options.autocomplete)
  • template (value: <string>): handlebars template string. Available variables: watched values. Note: built-in
  • dynamic_template: handlebars template string. Available variables: watched values. Note: extended template feature
  • eval_template: evaluation template for the current json object (while 'template' is used by jsoneditor in the UI). Note: eval_templates are expanded before the json data is passed to render templates and property mapping
    • type
      • mediawiki: uses the wiki template parser. Cannot handle objects and arrays => non-literals get stripped
      • mustache: uses the lua mustache template parser https://github.com/OpenSemanticLab/lustache/tree/scribunto-module-pages. Can handle objects and arrays. Note: https://mustache.github.io/, https://stackblitz.com/edit/mustache-tester?file=index.js
      • mustache-wikitext: applies mustache first, then wikitext. Note: wikitext parts containing {{ need to be wrapped inside {{=<% %>=}} and <%={{ }}=%>
    • mode
      • <none>: the given template will be used to render the json object and store its semantic data
      • render: the given template will be used to render the json object
      • store: the given template will be used to store semantic data
    • value (value: <string>): the template string
  • data_source_maps: see section "Remote / external data import"
Note on default rendering in infoboxes:

Special jsondata attributes ($.*)
  • type (alias: -; value: an array of category pages): if defined, the given category will be used to render the json object and store its semantic data
Additional options
Additional options ($properties.<property>.options.*)
  • conditional_visible
    • modes (value: <array>; example: ["default", "query"]): Display this field only in the selected modes of the editor
  • conditional_hide: tbd
  • range (value: string; example: "Category:Item"): Shortcut for a static query [[Category:Item]][[Display_title_of_normalized::~*{{_user_input_normalized}}*]]|?... . Creates an inline editor for the given category
  • subclassof_range (value: string; example: "Category:Device"): Indicates that the targets are subclasses of the given category. The inline editor will use the meta class of the given category, e. g. "Category:MetaDeviceCategory", or will use the range if given
  • autocomplete: built-in option
    • mode (value: smw): query mode. Note: for now only supports Semantic MediaWiki
    • query (example: [[HasLabel::~*{{_user_input}}*]]|?HasLabel=label): handlebars query template. Available are all keys of the current schema and _user_input, _user_input_lowercase, _user_input_normalized, _user_lang. Note: |?Display_title_of=label, |?HasImage=image, |?HasDescription=description and |limit=100 are appended automatically if not specified
    • category (example: Category:Item): creates a static query [[Category:Item]][[Display_title_of_normalized::~*{{_user_input_normalized}}*]]|?.... Creates an inline editor for the given category. Note: see also $properties.<property>.range
    • property (example: Property:HasLabel): query entities with an existing value of the specified property: [[HasLabel::+]][[Display_title_of_normalized::~*{{_user_input_normalized}}*]]?...
    • render_template: how to display query results in the suggestion list
      • type (value: <array>; example: ["handlebars", "wikitext"]): template engines are applied in the specified order. wikitext will result in parse-API calls, which is not recommended for large result sets
      • value (value: template string; example: [[{{result.fulltext}}]]): the actual template string. Pure handlebars templates can contain html tags like links (<a>) and images (<img>), wikitext templates need to use the wiki syntax [[ ]]. Note: wiki-links to categories need a : prefix: [[:{{result.fulltext}}]]
    • label_template: how to display the item after getting selected by the user
      • type (value: <array>; example: ["handlebars"]): only handlebars supported
      • value (value: template string; example: result.printouts.label.[0])
    • field_maps: Auto-set editor fields based on the SMW ask-API query result. Example: JsonSchema:QuantityStatement
      • source_path (value: <jsonpath>; example: "$"): jsonpath to apply on the query result
      • template (value: template string; example: "{{{result.printouts.label.[0]}}}"): handlebars template applied on the json object retrieved from the source path of the query result
      • target_path (value: <jsonpath>; example: "$(unit_symbol)"): jsonpath of the target field / editor. You can use jsoneditor's watch variables (recommended) to auto-generate the expression
  • dynamic_template: options for dynamic_template
    • override (value: <enum value> unstored|empty|always; example: always): Allow the template to override the current value of the field if the current value is undefined or an empty string (empty), was not yet stored (unstored), or always. dynamic_templates with _global_index_ default to unstored (=> do not update the value after it was initially stored), user-editable fields (not hidden and not readonly => only set an initial default) default to empty, otherwise always is used.
  • global_index
    • property (value: <property name>; example: Property:HasId): The property used to determine the next lowest index of a prefixed value within the dynamic_template option. Currently hardcoded to Property:HasId
    • number_pattern (value: <string>; example: 0000): The pattern of the index. Index = 2, pattern = 0000 => 0002. Currently hardcoded to 0000
    • increment (value: <number>; example: 1): The increment of the index. Currently hardcoded to 1
  • role
    • query (value: {"filter": "min|max|eq"}; example: {"filter": "min"}): Creates a Semantic MediaWiki query for a numerical property, e. g. if the property maps to "HasNumber", the filter is "min" and the user-provided value is 3, this results in the query [[HasNumber::>3]]
Autocomplete

Default setting (planned, see also Object Properties and Data / Annotation Properties)

  • for data / annotation properties: existing values of the property: {{#ask:[[<Property>::+]]|?<Property>=value}}
  • for object properties: existing instances of the range category: {{#ask:[[Category:<range>]]}}
Shortcuts
  • category: Populates the field with instances of the given category (and its subcategories)
{
    "type": "string",
    "format": "autocomplete",
    "options":{	
        "autocomplete": {
            "category":"Category:X"
        }
    }
}
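
The category shortcut expands to a static query as described under Additional options. The following Python sketch shows one way such an expansion could look; the function names are hypothetical, and the check for whether a printout or limit was "already specified" is a simple assumption based on string matching.

```python
DEFAULT_PRINTOUTS = {
    "label": "|?Display_title_of=label",
    "image": "|?HasImage=image",
    "description": "|?HasDescription=description",
}
DEFAULT_LIMIT = "|limit=100"


def category_query(category: str, user_input_normalized: str) -> str:
    """Static query for the 'category' shortcut (user input already normalized)."""
    return (
        f"[[{category}]]"
        f"[[Display_title_of_normalized::~*{user_input_normalized}*]]"
    )


def with_defaults(query: str) -> str:
    """Append default printouts and limit unless the query names them itself."""
    for name, printout in DEFAULT_PRINTOUTS.items():
        if f"={name}" not in query:  # assumption: detect by result alias
            query += printout
    if "|limit=" not in query:
        query += DEFAULT_LIMIT
    return query


print(with_defaults(category_query("Category:X", "abc")))
```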
Remote / external data import
  • data_source_maps (alias: -): To fetch data from an external source (by user request) and store it in the edit form
    • id (value: <string>; example: pub.orcid.org)
    • label (value: <string>; example: ORCID): Label for the data source, displayed on the request button
    • required (value: <array>; example: ["orcid"]): Properties of the schema that are required to send the request (typically those that occur as template params in source). Note: the request button is disabled if required properties are missing
    • source (value: <url>; example: https://pub.orcid.org/v3.0/{{#split '/' -1}}{{orcid}}{{/split}}): API endpoint to fetch data. Note: supports templates. The API must allow requests from a foreign domain
    • format (value: <enum>; example: jsonld): The format of the requested resource. One of json (default), jsonld, xml, html. Note: used to set the correct contentType header field and parse the result
    • mode (value: <enum>; example: jsonpath): The format of the path expression in the object_map. One of jsonpath (default for json, jsonld), xpath (default for xml, html)
    • object_map (value: <dict>; example: {"first_name": "$.givenName"}): Stores the result of the path expression value (right-hand, evaluated on the API result) in the given key (left-hand). Note: the result of the path expression can be an array or object.
    • template_map (value: <dict>): Constructs a value from a handlebars template (left-hand) and assigns it to the given key (right-hand). Note: tbd
    • request_map: tbd
    • map: tbd
{
    "...": {},
    "data_source_maps": [
        {
            "id": "pubchem.ncbi.nlm.nih.gov",
            "source": "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{{pubchem_cid}}/JSON",
            "label": "PubChem",
            "required": [
                "pubchem_cid"
            ],
            "object_map": {
                "cas_numbers": "$..[?(@.TOCHeading==='CAS')].Information..Value.StringWithMarkup..String"
            }
        }
    ],
    "properties": {
        "pubchem_cid": {
            "title": "PubChem CID",
            "type": "string"
        },
        "cas_numbers": {
            "type": "array",
            "...": {}
        },
        "...": {}
    }
}
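
The request flow implied by a data_source_maps entry (check required properties, expand the source template, map the result) is sketched below in Python. All function names are hypothetical. The sketch only expands plain {{name}} placeholders, not block helpers such as split, and it resolves only simple "$.a.b" paths instead of full jsonpath; no network request is made.

```python
import re


def can_request(source_map: dict, form_data: dict) -> bool:
    """The request button is enabled only if all required properties are set."""
    return all(form_data.get(key) not in (None, "")
               for key in source_map.get("required", []))


def expand_source(source: str, form_data: dict) -> str:
    """Replace {{name}} placeholders with form values (no helper support)."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda match: str(form_data[match.group(1)]), source)


def resolve_path(document, path: str):
    """Resolve a simple '$.a.b' path; real jsonpath is far more expressive."""
    current = document
    for key in path[2:].split("."):
        current = current[key]
    return current


def apply_object_map(object_map: dict, api_result: dict) -> dict:
    return {target: resolve_path(api_result, path)
            for target, path in object_map.items()}


source_map = {
    "source": "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{{pubchem_cid}}/JSON",
    "required": ["pubchem_cid"],
}
form = {"pubchem_cid": "2244"}  # illustrative value
if can_request(source_map, form):
    print(expand_source(source_map["source"], form))
# Illustrative API result, not real data:
print(apply_object_map({"first_name": "$.givenName"}, {"givenName": "Ada"}))
```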

Tests could be run on various playgrounds:

  • Handlebars template: [1]
Search

Properties can be marked as inputs for the category's query form by adding the conditional_visible option. For this, a context mapping to an SMW property is mandatory (see #JSON-LD). With "hidden": true the property is only visible in the query form. With "hidden": false it is visible in both the edit form and the query form.

{
    "@context": [
        "...",
        {
            "query_label": "Property:HasLabel"
        }
    ],
    "...": {},
    "properties": {
        "query_label": {
            "type": "string",
            "options": {
                "hidden": true,
                "conditional_visible": {
                    "modes": [
                        "query"
                    ]
                }
            }
        }
    }
}

The default comparator for text properties is ~ (like), for all other properties = (equal). The comparator can be defined with the role filter option, e. g. "min" for >.

{
    "@context": [
        {
            "start_date_min": "Property:HasStartDateAndTime"
        }
    ],
    "...": "...",
    "properties": {
        "start_date_min": {
            "type": "string",
            "format": "datetime-local",
            "options": {
                "flatpickr": {},
                "conditional_visible": {
                    "modes": ["query"]
                },
                "role": {
                    "query": {
                        "filter": "min"
                    }
                }
            }
        }
    }
}
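
How a filter role might translate into a Semantic MediaWiki query condition is sketched below in Python. The function name is hypothetical. Only the mapping of "min" to > and the text default ~ come from the description above; mapping "max" to < and expressing equality by omitting the comparator are assumptions.

```python
# "min" -> ">" is documented; "max" -> "<" and "eq" -> "" are assumptions.
COMPARATORS = {"min": ">", "max": "<", "eq": ""}


def query_condition(smw_property, value, filter_role=None, is_text=False):
    """Build a single SMW query condition for a query form field (sketch)."""
    if filter_role is not None:
        comparator = COMPARATORS[filter_role]
    else:
        # Default: ~ (like) for text properties, equality otherwise.
        comparator = "~" if is_text else ""
    return f"[[{smw_property}::{comparator}{value}]]"


print(query_condition("HasNumber", 3, filter_role="min"))
print(query_condition("HasLabel", "pump", is_text=True))
```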


JSON-LD

json-ld jsonschema: https://github.com/json-ld/json-ld.org/blob/main/schemas/jsonld-schema.json

Local playground: https://wiki-dev.open-semantic-lab.org/w/extensions/MwJson/json-ld/playground/index.html

References

json-ld should be embedded into jsonschema, but has its own referencing mechanism:

{
    "@context": [
        "../Category/Entity.slot_jsonschema.json",
        {
            "Property": {"@id": "https://wiki-dev.open-semantic-lab.org/id/Property-3A", "@prefix": true},
            "manufacturer": "Property:HasManufacturer"
        }
    ],
    "title": "MyEntitySubclass",
    "type": "object",
    "allOf": [{"$ref": "../Category/Entity.slot_jsonschema.json"}],
    "properties": {
        "manufacturer": {
            "type": "string"
        }
    }
}

Example: [4]

For a remote context the same mechanism is used as for JSON schema $refs. To reuse a context in external tools you can use e. g. "@context": "https://opensemantic.world/wiki/Special:SlotResolver/Category/Entity.slot_jsonschema.json".

Property mapping

Properties with a local definition (SMW property) are mapped automatically. The jsondata of an instance of the category can then be provided with a JSON-LD context:

{
    "@context": [
        {
            "@version": 1.1,
            "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
            "Property": {"@id": "wiki:Property-3A", "@prefix": true},
            "@vocab": "https://wiki-dev.open-semantic-lab.org/id/Property-3A",
            "myproperty": "Property:MyProperty",
            "myproperty2": "wiki:Property-3AMyProperty",
            "myproperty3": "MyProperty",
            "Item": {"@id": "wiki:Item-3A", "@prefix": true},
            "myObjectProperty": {"@id": "Property:MyObjectProperty", "@type": "@id"}
        }
    ],
    "myproperty": "Works by using '@prefix': true (preferred)",
    "myproperty2": "Works by using 'wiki' prefix with terminating '/'",
    "myproperty3": "Works by using '@vocab'",
    "myObjectProperty": "Item:123456"
}
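
All three notations above lead to the same property IRI. The Python sketch below makes this visible with a deliberately simplified expansion; it is not the JSON-LD expansion algorithm (a conforming JSON-LD processor should be used in practice), and the function names are illustrative.

```python
def expand_iri(iri: str, context: dict) -> str:
    """Expand a compact IRI or vocab-relative term (simplified)."""
    if iri.startswith(("http://", "https://")):
        return iri
    prefix, separator, suffix = iri.partition(":")
    if separator and prefix in context:
        return expand_term(prefix, context) + suffix
    return context.get("@vocab", "") + iri


def expand_term(term: str, context: dict) -> str:
    """Expand a term defined in the context to an absolute IRI."""
    definition = context[term]
    iri = definition["@id"] if isinstance(definition, dict) else definition
    return expand_iri(iri, context)


context = {
    "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
    "Property": {"@id": "wiki:Property-3A", "@prefix": True},
    "@vocab": "https://wiki-dev.open-semantic-lab.org/id/Property-3A",
    "myproperty": "Property:MyProperty",
    "myproperty2": "wiki:Property-3AMyProperty",
    "myproperty3": "MyProperty",
}
for term in ("myproperty", "myproperty2", "myproperty3"):
    print(term, "->", expand_term(term, context))
```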

Currently there seems to be no way to express that a property has two ids (e. g. with "label": {"@id": ["property:HasLabel", "skos:prefLabel"]}), see https://github.com/json-ld/json-ld.org/issues/160. As a workaround, an additional context notation is provided: <property>* pointing to a list of additional "@id" mappings:

{
    "@context": [
        {
            "@version": 1.1,
            "skos": "https://www.w3.org/TR/skos-reference/",
            "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
            "Property": {"@id": "wiki:Property-3A", "@prefix": true},
            "label": "skos:prefLabel",
            "label*": "Property:HasLabel",
            "label**": "Property:Display_title_of"
        }
    ],
    "label": "Maps externally to skos:prefLabel and internally to Property:HasLabel"
}
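
Collecting all mappings of a term under the * notation could look like the following Python sketch. The helper name term_ids is hypothetical; it assumes that additional mappings are found by appending one more * per entry until no further key exists.

```python
def term_ids(term: str, context: dict) -> list:
    """Return the primary and all additional ('*', '**', ...) ids of a term."""
    ids = []
    key = term
    while key in context:
        definition = context[key]
        ids.append(definition["@id"] if isinstance(definition, dict) else definition)
        key += "*"
    return ids


context = {
    "label": "skos:prefLabel",
    "label*": "Property:HasLabel",
    "label**": "Property:Display_title_of",
}
print(term_ids("label", context))
```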

OSW allows mapping to external vocabs (<any_prefix>:<property>) and internal vocabs (Property:<property>). Please note that properties mapped to an external vocab are currently not available in Semantic MediaWiki and the related query interfaces. Using the * notation it is possible to map to both an external vocab ("property": "<any_prefix>:<property>") and internal ones ("property*": "Property:<property>", "property**": "Property:<another_property>").

Object Properties and Data / Annotation Properties

Properties default to data / annotation properties (value is a literal). Object properties (value is an identifier/reference to another object) can be defined by adding "@type": "@id".

Subobjects

If the value of a mapped property is an object (after expanding all eval_templates), it is stored as an SMW subobject with an id derived from the field uuid, a display title from label and a category from type (if provided). Subobjects also support the JSON-LD @reverse notation, which allows storing properties that point from the subobject to the superordinate or root object.

Example: the following subobject can be selected with [[MyObjectProperty.MyProperty::myvalue]]

{
    "@context": [
        {
            "@version": 1.1,
            "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
            "Property": {"@id": "wiki:Property-3A", "@prefix": true},
            "myproperty": "Property:MyProperty",
            "Item": {"@id": "wiki:Item-3A", "@prefix": true},
            "myObjectProperty": {"@id": "Property:MyObjectProperty", "@type": "@id"}
        }
    ],
    "myObjectProperty": {
        "uuid": "2ea5b605-c91f-4e5a-9559-3dff79fdd4a5",
        "label": "MySubobject",
        "myproperty": "myvalue"
    }
}

Labels and i18n

i18n language keys can be embedded into a label object to create a language-tagged string:

{
    "@context": [
        {
            "@version": 1.1,
            "skos": "https://www.w3.org/TR/skos-reference/",
            "label": {"@id": "skos:prefLabel", "@container": "@language"},
            "label2": {"@id": "skos:prefLabel"},
            "label_text": {"@id": "@value"},
            "label_lang_key": {"@id": "@language"}
        }
    ],
    "label": {"en": "'Text' gets transformed to 'Text@en' by applying @container"},
    "label2": {"label_text": "'Text' gets transformed to 'Text@de' by subkeys @id's", "label_lang_key": "de"}
}

Ontology term import/export

Existing ontology terms can be imported/exported via json-ld directly or ttl by defining the corresponding context, e. g. for EMMO-Terms: [5]

Recursive Parsing

Module:Category

Called from Category:<UUID>@template

  1. Synchronize Category:<UUID>@jsondata.subclass_of with Category:<UUID>@jsonschema.allOf
  2. Expand Category:Category@header_template with jsondata parameters
  3. Render Category:<UUID>@main
  4. Expand Category:Category@footer_template with jsondata parameters

Module:Entity

Called from item@template, item = Item:<UUID>

Recursion
  1. For each Item:<UUID>@jsondata.osl_category as category:
    1. For each category@jsondata.osl_category or category@jsonschema.allOf as supercategory:
      1. For each supercategory@jsondata.osl_category or supercategory@jsonschema.allOf as supersupercategory:
        1. ...
      2. Expand supercategory@header_template with item@jsondata parameters. Fallback: Render infobox
      3. Expand supercategory@data_template with item@jsondata parameters. Fallback: Use Json-LD mapping within category@jsonschema
    2. If category@header_template: Expand category@header_template with item@jsondata parameters
    3. Else: Render infobox with all attribute-value pairs
    4. Expand category@data_template with item@jsondata parameters. Fallback: Use Json-LD mapping within category@jsonschema
  2. Render item@main
  3. footer...
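
The traversal order of the recursion above (supercategories first, then the category itself) is illustrated by the following Python sketch. It is not the Lua module: the in-memory dictionary, the key names type, subclass_of and header_template, the use of str.format as a stand-in for template expansion, and the cycle guard are all assumptions made for illustration.

```python
def render_order(category, categories, seen=None):
    """Return categories in rendering order: supercategories first."""
    if seen is None:
        seen = set()
    if category in seen:  # guard against cycles and repeated ancestors
        return []
    seen.add(category)
    order = []
    for supercategory in categories.get(category, {}).get("subclass_of", []):
        order.extend(render_order(supercategory, categories, seen))
    order.append(category)
    return order


def render_headers(item, categories):
    """Expand each header template with the item data; fall back to an infobox."""
    parts = []
    for category in item["type"]:
        for name in render_order(category, categories):
            template = categories.get(name, {}).get("header_template")
            parts.append(template.format(**item) if template else f"infobox({name})")
    return parts


categories = {
    "Category:Entity": {"header_template": "Entity header for {label}"},
    "Category:Device": {"subclass_of": ["Category:Entity"]},
}
item = {"type": ["Category:Device"], "label": "Pump"}
print(render_headers(item, categories))
```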
Data Storing
  1. template specified: use template
  2. category specified:
    1. category@data_template specified: use data_template
    2. Use Json-LD mapping
      1. mapping specified: Store semantic property
        1. Literal value: store value
        2. Object value
          1. Property has type text/code: Store json string
          2. osl_category / osl_template specified: see below
      2. mapping not specified: Don't store semantic property

Nested objects within item@jsondata are handled as follows:

  • osl_category: same handling as the root object, but:
    • Rendering with category@header_template. Fallback: Nested info box
    • Data storing with category@data_template. Fallback: Creating a subobject with Json-LD mapping + (inverse) semantic relation to the root object
  • osl_template: expand the template; the return value is assigned to the property

Python Code Generation

see https://github.com/OpenSemanticLab/osw-python

Statement

PCB contains 10% +/- 1% Lead and Gold

s | p/s | o/p/s | o/p | o
PCB | contains | Lead | |
 | contains | HasMassConcentration | 10% |
 | | HasMassConcentration | HasPrecision | 1%
PCB | contains | Gold | |
File Handling

Copy-Policy

drop: do not copy the file

copy: copy the file and store the reference to it

copy-ref: store the reference to the original file

ask-on-edit: store the reference, but ask the user to copy the original file when they try to edit it (current_page != creation_page)

Meta-Data

HasProject=project: inherit permissions from project

HasCreationPage=creation_page: wiki page within which this file was created

HasEditPage=edit_page: wiki page within which this file was edited

HasCreator=creator: initial creator of the file

HasEditor=editor: editors of the file

Links

jsondata
type
"Category:OSW92cc6b1a2e6b4bb7bad470dfdcfdaf26"
uuid"ab674d66-3a5b-472f-838d-8e1eb43e6784"
label
text"OSW Schema"
lang"en"
description
text"Documentation about to OSW data schema"
lang"en"
name"OswSchema"
attachments
"File:OSW2f275e3441c84f63a6cbee2861c488f2.drawio.svg"
"File:OSW49d68bb7a5de413ba1077bc5f459a766.drawio.svg"
"File:OSW61f1999ee6d145c9b76fb55d02578ce5.drawio.svg"
"File:OSW95a74be1e22d4b6e9e4f836127d5915a.drawio.svg"