OSW Schema

Revision as of 10:03, 1 September 2023 by Admin (Update package: OSW Docs - Core)
OSW Schema [OSWab674d663a5b472f838d8e1eb43e6784]
ID OSWab674d663a5b472f838d8e1eb43e6784
UUID ab674d66-3a5b-472f-838d-8e1eb43e6784
Label OSW Schema
Machine compatible name OswSchema

Description

Documentation about the OSW data schema

Item
Type(s)/Category(s) Article
CreativeWork
Article

Schema for the dynamic composition of nested class structures


Introduction

Transforming Semantic MediaWiki into a knowledge base with structured data used to require PageForms and multiple templates per class:

  • Template to store the data and render the page
  • Form to edit the page
  • Form to query for instances of the class
  • Template to render the query results
  • Subtemplate to handle complex data

Reusing structures with this approach is difficult, and storing data in wikicode leads to significant lock-in, because the data cannot be edited with existing standards and tools.

The new concept is based on strictly storing data in JSON and using wikicode only for structured text and render templates. Instead of distributing content among multiple pages, all relevant content is stored within different slots of the same page. Additionally, inheritance is well supported and enables broad reuse of any structure.

Overview

General

Dual hierarchy example

Why are Categories (Classes) different from Items (Instances)?

  • Pro
    • Reflects rdf(s) and owl standard
    • Reflects OOP in Python and other programming languages; only (?) JavaScript supports a similar concept with prototypes
    • Compatible with SMW features like rdf-export and Inferencing
  • Contra
    • User needs to decide before creating a term or move the term later to a different namespace (renaming/redirect)

Slots

| Key | Model | Description |
| --- | --- | --- |
| main | wikitext | default content slot, rendered between the page header and footer |
| jsonschema | json | stored within a category (=class) page, defining the schema for the jsondata slot of any category member (instance) |
| jsondata | json | structured data |
| schema_template | text | stored within a metacategory/-class, contains a handlebars template to build the jsonschema of a class from its jsondata slot |
| data_template | wikitext | stored within a category (=class) page, defining how the jsondata attributes of any category member (instance) are mapped to semantic properties |
| header_template | wikitext | stored within a category (=class) page, renders the page header of any category member (instance) |
| footer_template | wikitext | stored within a category (=class) page, renders the page footer of any category member (instance) |
| header | wikitext | renders the page header |
| footer | wikitext | renders the page footer |
| template/internal | wikitext | hidden content, not rendered. Can be used to call parser functions or lua modules |

Meta-Schemas

CategoryClass is the default metacategory/-class for all categories/classes. Its schema_template slot contains a handlebars template that sets schema attributes like title, allOf, description, etc. from the user-generated jsondata. Additional metacategories can be created as subclasses of CategoryClass to simplify the creation of complex schemas, e.g. Category:OSWecff4345b4b049218f8d6628dc2f2f21. This feature is comparable to Python metaclasses.

Metacategories/-classes contain a handlebars template within the schema_template slot. The template is evaluated with the jsondata slot content to create or update the jsonschema slot content of any derived class on every edit.
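
The evaluation step can be illustrated with a minimal Python sketch. It does not use handlebars at all; the helper name build_jsonschema and the flat jsondata keys label, description and subclass_of are assumptions made for illustration, chosen to mirror the schema attributes named above.

```python
def build_jsonschema(jsondata: dict) -> dict:
    """Derive a class schema from the jsondata of a category page (hypothetical helper)."""
    schema = {"type": "object", "title": jsondata["label"]}
    if "description" in jsondata:
        schema["description"] = jsondata["description"]
    # Each superclass becomes a $ref to the jsonschema slot of that category.
    schema["allOf"] = [
        {"$ref": f"/wiki/{parent}?action=raw&slot=jsonschema"}
        for parent in jsondata.get("subclass_of", [])
    ]
    return schema


example = build_jsonschema(
    {"label": "MyEntitySubclass", "subclass_of": ["Category:Entity"]}
)
```

In the wiki, this mapping is expressed by the handlebars template of the metacategory and re-run on every edit; the sketch only shows the direction of the data flow from jsondata to jsonschema.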


Json-Schema

Base

https://json-schema.org/ (Draft 4)

A JSON schema can reference other schemas. This is equivalent to [[Subcategory of::Entity]] (Semantic MediaWiki) and rdfs:subClassOf (RDFS/OWL).

{
    "title": "MyEntitySubclass",
    "type": "object",
    "allOf": [{"$ref": "/wiki/Category:Entity?action=raw&slot=jsonschema"}]
}
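
To make the inheritance effect of allOf concrete, the following Python sketch flattens the properties of a schema and all schemas it references. It is a simplification under stated assumptions: real JSON Schema treats allOf as a conjunction of validations rather than a merge, and the in-memory registry dictionary stands in for fetching the jsonschema slot of each category. The function name collect_properties is hypothetical.

```python
def collect_properties(schema: dict, registry: dict) -> dict:
    """Merge the properties of a schema with those of all schemas referenced via allOf."""
    merged = {}
    for entry in schema.get("allOf", []):
        ref = entry.get("$ref")
        if ref is not None:
            # Inherited properties first, so that the subclass can override them.
            merged.update(collect_properties(registry[ref], registry))
    merged.update(schema.get("properties", {}))
    return merged


ENTITY_REF = "/wiki/Category:Entity?action=raw&slot=jsonschema"
# Assumed content of the referenced schema, for illustration only.
registry = {
    ENTITY_REF: {
        "title": "Entity",
        "type": "object",
        "properties": {"label": {"type": "string"}},
    },
}
subclass = {
    "title": "MyEntitySubclass",
    "type": "object",
    "allOf": [{"$ref": ENTITY_REF}],
    "properties": {"manufacturer": {"type": "string"}},
}
properties = collect_properties(subclass, registry)
```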

Json-Editor

https://github.com/json-editor/json-editor, which is used to render edit & query forms based on provided jsonschema, adds additional keywords and options.

Autocompletion

Enables autocompletion in input fields. For configuration, see Additional options.

Custom Extensions

Embedded i18n support:

The keywords title and description can be extended with additional keywords title* and description*, which hold an object with language keys (de, en, etc.) pointing to the translated strings.

{
    "title": "Default Title",
    "title*": {"en": "Title (en)", "de": "Titel (de)"}
}
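
A consumer of such a schema could pick the display string as in the following Python sketch. The function name localized and the fallback to the plain keyword when no translation exists are assumptions, not confirmed behavior of the editor.

```python
def localized(schema: dict, keyword: str, lang: str) -> str:
    """Return the translation of keyword for lang, falling back to the plain keyword."""
    translations = schema.get(keyword + "*", {})
    return translations.get(lang, schema.get(keyword, ""))


schema = {
    "title": "Default Title",
    "title*": {"en": "Title (en)", "de": "Titel (de)"},
}
title_de = localized(schema, "title", "de")
title_fr = localized(schema, "title", "fr")  # no translation, uses the default
```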
Additional keywords
Special jsonschema attributes ($properties.<property>.*)
  • eval_template: evaluation template for the current json object (while 'template' is used by jsoneditor in the UI). eval_templates are expanded before the json data is passed to render templates and property mapping
    • type
      • mediawiki: uses the wiki template parser. Cannot handle objects and arrays => non-literals get stripped
      • mustache: uses the lua mustache template parser https://github.com/OpenSemanticLab/lustache/tree/scribunto-module-pages. Can handle objects and arrays. See https://mustache.github.io/, https://stackblitz.com/edit/mustache-tester?file=index.js
      • mustache-wikitext: applies mustache first, then wikitext. Wikitext parts containing {{ need to be wrapped inside {{=<% %>=}} and <%={{ }}=%>
    • mode
      • <none>: the given template will be used to render the json object and store its semantic data
      • render: the given template will be used to render the json object
      • store: the given template will be used to store semantic data
    • value: <string>
    • page: <wiki page>
    • slot
    • url
Special jsondata attributes ($.*)
  • type (alias: -): an array of category pages. If defined, the given category will be used to render the json object and store its semantic data
Additional options
Additional options ($properties.<property>.options.*)
  • conditional_visible
    • modes: <array>, e.g. ["default", "query"]. Display this field only in the selected modes of the editor
  • conditional_hide: tbd
  • autocomplete: built-in option
    • mode: smw. Query mode, for now only supports Semantic MediaWiki
    • query: handlebars query template, e.g. [[HasLabel::~*{{input}}*]]|?HasLabel=label
    • range: e.g. Category:Item. Creates a static query [[Category:Item]][[HasLabel::~*{{input}}*]]|?HasLabel=label
    • property: e.g. Property:HasLabel. Queries existing values of the property: [[HasLabel::+]][[HasLabel::~*{{input}}*]]|?HasLabel=value
    • render_template: how to display query results in the suggestion list
      • type: <array>, e.g. ["handlebars", "wikitext"]. Template engines are applied in the specified order. wikitext will result in parse-API calls, which is not recommended for large result sets
      • value: template string, e.g. [[{{result.fulltext}}]]. The actual template string. Pure handlebars templates can contain html tags like links (<a>) and images (<img>); wikitext templates need to use the wiki syntax [[ ]]. Wiki links to categories need a : prefix: [[:{{result.fulltext}}]]
    • label_template: how to display the item after it has been selected by the user
      • type: <array>, e.g. ["handlebars"]. Only handlebars is supported
      • value: template string, e.g. result.printouts.label.[0]
  • role
    • query: {"filter": "min|max|eq"}, e.g. {"filter": "min"}. Creates a Semantic MediaWiki query for a numerical property, e.g. if the property maps to "HasNumber", the filter is "min" and the user-provided value is 3, this results in the query [[HasNumber::>3]]
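
How the autocomplete options query and range combine into a Semantic MediaWiki query can be sketched in Python as follows. This is an illustration only: the {{input}} placeholder is filled by plain string replacement instead of handlebars, and the function name build_autocomplete_query as well as the use of the listed query as the default are assumptions.

```python
DEFAULT_QUERY = "[[HasLabel::~*{{input}}*]]|?HasLabel=label"


def build_autocomplete_query(options: dict, user_input: str) -> str:
    """Build the ask query for the current user input (hypothetical helper)."""
    query = options.get("query", DEFAULT_QUERY)
    if "range" in options:
        # A range restricts the results to instances of the given category.
        query = f"[[{options['range']}]]" + query
    return query.replace("{{input}}", user_input)


ask = build_autocomplete_query({"range": "Category:Item"}, "pump")
```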
Autocomplete

Default setting (planned, see also Object Properties and Data / Annotation Properties)

  • for data / annotation properties: existing values of the property: {{#ask:[[<Property>::+]]|?<Property>=value}}
  • for object properties: existing instances of the range category: {{#ask:[[Category:<range>]]}}
Shortcuts
  • category: Populates the field with instances of the given category (and its subcategories)
{
    "type": "string",
    "format": "autocomplete",
    "options": {
        "autocomplete": {
            "category":"Category:X"
        }
    }
}
Remote / external data import
{
    "...": {},
    "data_source_maps": [
        {
            "id": "pubchem.ncbi.nlm.nih.gov",
            "source": "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{{pubchem_cid}}/JSON",
            "label": "PubChem",
            "required": [
                "pubchem_cid"
            ],
            "object_map": {
                "cas_numbers": "$..[?(@.TOCHeading==='CAS')].Information..Value.StringWithMarkup..String"
            }
        }
    ],
    "properties": {
        "pubchem_cid": {
            "title": "PubChem CID",
            "type": "string"
        },
        "cas_numbers": {
            "type": "array",
            "...": {}
        },
        "...": {}
    }
}
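
How the source URL of such a data_source_maps entry could be resolved is sketched below in Python. The function name render_source_url is hypothetical, the placeholder substitution uses a regular expression instead of a template engine, and the CID value is only an example input. Evaluating the JSONPath expressions in object_map is left out because it would require a third-party library.

```python
import re


def render_source_url(source_map: dict, jsondata: dict) -> str:
    """Fill the {{...}} placeholders of the source URL from jsondata."""
    missing = [key for key in source_map.get("required", []) if key not in jsondata]
    if missing:
        raise KeyError(f"missing required fields: {missing}")
    # Replace every {{name}} placeholder with the matching jsondata value.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda match: str(jsondata[match.group(1)]),
        source_map["source"],
    )


pubchem = {
    "id": "pubchem.ncbi.nlm.nih.gov",
    "source": "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/{{pubchem_cid}}/JSON",
    "required": ["pubchem_cid"],
}
url = render_source_url(pubchem, {"pubchem_cid": "2244"})  # example CID
```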

Tests can be run on various playgrounds:

  • Handlebars template: [1]
Search

Properties can be marked as inputs for the category's query form by adding the conditional_visible option. For this, a context mapping to an SMW property is mandatory (see #JSON-LD). With "hidden": true the property is only visible in the query form. With "hidden": false it is visible in both the edit and the query form.

{
    "@context": [
        "...",
        {
            "query_label": "Property:HasLabel"
        }
    ],
    "...": {},
    "properties": {
        "query_label": {
            "type": "string",
            "options": {
                "hidden": true,
                "conditional_visible": {
                    "modes": [
                        "query"
                    ]
                }
            }
        }
    }
}

The default comparator for text properties is ~ (like); for all other properties it is = (equal). The comparator can be set with the role filter option, e.g. "min" for >.

{
    "@context": [
        {
            "start_date_min": "Property:HasStartDateAndTime"
        }
    ],
    "...": "...",
    "properties": {
        "start_date_min": {
            "type": "string",
            "format": "datetime-local",
            "options": {
                "flatpickr": {},
                "conditional_visible": {
                    "modes": ["query"]
                },
                "role": {
                    "query": {
                        "filter": "min"
                    }
                }
            }
        }
    }
}
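
The comparator rules described above can be sketched in Python as follows. The function name build_condition is hypothetical, and the mapping of "max" to < is an assumption by analogy with "min"; only "min" for > and the defaults ~ and = are stated in this document.

```python
# Filter name -> SMW comparator prefix; "max" -> "<" is an assumption.
COMPARATORS = {"min": ">", "max": "<", "eq": ""}


def build_condition(prop: str, value: str, schema_type: str, role_filter=None) -> str:
    """Build one SMW query condition for a query form field."""
    if role_filter is not None:
        comparator = COMPARATORS[role_filter]
    elif schema_type == "string":
        comparator = "~"  # default for text properties: like
    else:
        comparator = ""  # default for all other properties: equal
    return f"[[{prop}::{comparator}{value}]]"


condition = build_condition("HasNumber", "3", "number", "min")
```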


JSON-LD

json-ld jsonschema: https://github.com/json-ld/json-ld.org/blob/main/schemas/jsonld-schema.json

Local playground: https://wiki-dev.open-semantic-lab.org/w/extensions/MwJson/json-ld/playground/index.html

References

json-ld should be embedded into jsonschema, but has its own referencing mechanism:

{
    "@context": [
        "/wiki/Category:Entity?action=raw&slot=jsonschema",
        {
            "Property": {"@id": "https://wiki-dev.open-semantic-lab.org/id/Property-3A", "@prefix": true},
            "manufacturer": "Property:HasManufacturer"
        }
    ],
    "title": "MyEntitySubclass",
    "type": "object",
    "allOf": [{"$ref": "/wiki/Category:Entity?action=raw&slot=jsonschema"}],
    "properties": {
        "manufacturer": {
            "type": "string"
        }
    }
}

Example: [4]

Property mapping

Properties with a local definition (SMW Property) are automatically mapped. Jsondata of an instance of the category could then be provided with a json-ld context:

{
    "@context": [
        {
            "@version": 1.1,
            "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
            "Property": {"@id": "wiki:Property-3A", "@prefix": true},
            "@vocab": "https://wiki-dev.open-semantic-lab.org/id/Property-3A",
            "myproperty": "Property:MyProperty",
            "myproperty2": "wiki:Property-3AMyProperty",
            "myproperty3": "MyProperty",
            "Item": {"@id": "wiki:Item-3A", "@prefix": true},
            "myObjectProperty": {"@id": "Property:MyObjectProperty", "@type": "@id"}
        }
    ],
    "myproperty": "Works by using '@prefix': true (preferred)",
    "myproperty2": "Works by using 'wiki' prefix with terminating '/'",
    "myproperty3": "Works by using '@vocab'",
    "myObjectProperty": "Item:123456"
}
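
That the three notations resolve to the same property IRI can be checked with a small Python sketch. It is a deliberately simplified expansion, not a JSON-LD processor: the function name expand is hypothetical, and only prefix lookup and the @vocab fallback are modeled.

```python
def expand(term: str, context: dict) -> str:
    """Expand a term or compact IRI to an absolute IRI (simplified)."""
    if term.startswith(("http://", "https://")):
        return term
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:
        definition = context[prefix]
        base = definition["@id"] if isinstance(definition, dict) else definition
        return expand(base, context) + suffix
    # Terms without a known prefix are resolved against @vocab.
    return context["@vocab"] + term


context = {
    "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
    "Property": {"@id": "wiki:Property-3A", "@prefix": True},
    "@vocab": "https://wiki-dev.open-semantic-lab.org/id/Property-3A",
}
expanded = [
    expand("Property:MyProperty", context),
    expand("wiki:Property-3AMyProperty", context),
    expand("MyProperty", context),
]
```

All three entries of expanded are the same IRI, which is why the three context notations above are interchangeable for a local property.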

Currently there seems to be no way to express that a property has two ids (e.g. with "label": {"@id": ["property:HasLabel", "skos:prefLabel"]}), see https://github.com/json-ld/json-ld.org/issues/160. As a workaround, an additional context notation is provided: <property>* pointing to an additional "@id" mapping, with one more asterisk for each further mapping:

{
    "@context": [
        {
            "@version": 1.1,
            "skos": "http://www.w3.org/2004/02/skos/core#",
            "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
            "Property": {"@id": "wiki:Property-3A", "@prefix": true},
            "label": "skos:prefLabel",
            "label*": "Property:HasLabel",
            "label**": "Property:Display_title_of"
        }
    ],
    "label": "Maps externally to skos:prefLabel and internally to Property:HasLabel"
}
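
Collecting all mappings of a term from such a context could look like the following Python sketch. The function name mapped_ids is hypothetical, and the assumption is that the asterisk keys are numbered consecutively without gaps, as in the example above.

```python
def mapped_ids(term: str, context: dict) -> list:
    """Collect the primary @id of a term plus all ids given via term*, term**, ..."""
    ids = []
    key = term
    while key in context:
        ids.append(context[key])
        key += "*"
    return ids


context = {
    "label": "skos:prefLabel",
    "label*": "Property:HasLabel",
    "label**": "Property:Display_title_of",
}
ids = mapped_ids("label", context)
```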
Object Properties and Data / Annotation Properties

Properties default to data / annotation properties (value is a literal). Object properties (value is an identifier/reference to another object) can be defined by adding "@type": "@id".

Subobjects

If the value of a mapped property is an object (after expanding all eval_templates), it is stored as an SMW subobject with an id derived from the field uuid, a display title from label, and a category from type (if provided).

Example: the subobject below can be selected with [[MyObjectProperty.MyProperty::myvalue]]

{
    "@context": [
        {
            "@version": 1.1,
            "wiki": "https://wiki-dev.open-semantic-lab.org/id/",
            "Property": {"@id": "wiki:Property-3A", "@prefix": true},
            "myproperty": "Property:MyProperty",
            "Item": {"@id": "wiki:Item-3A", "@prefix": true},
            "myObjectProperty": {"@id": "Property:MyObjectProperty", "@type": "@id"}
        }
    ],
    "myObjectProperty": {
        "uuid": "2ea5b605-c91f-4e5a-9559-3dff79fdd4a5",
        "label": "MySubobject",
        "myproperty": "myvalue"
    }
}

Labels and i18n

i18n language keys can be embedded into a label object to create a language-tagged string:

{
    "@context": [
        {
            "@version": 1.1,
            "skos": "http://www.w3.org/2004/02/skos/core#",
            "label": {"@id": "skos:prefLabel", "@container": "@language"},
            "label2": {"@id": "skos:prefLabel"},
            "label_text": {"@id": "@value"},
            "label_lang_key": {"@id": "@language"}
        }
    ],
    "label": {"en": "'Text' gets transformed to 'Text@en' by applying @container"},
    "label2": {"label_text": "'Text' gets transformed to 'Text@de' by subkeys @id's", "label_lang_key": "de"}
}
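
The two label notations above lead to the same kind of language-tagged string. The following Python sketch mimics that result with plain string formatting; it is not a JSON-LD processor, and both function names are hypothetical.

```python
def from_language_map(label: dict) -> list:
    """Language map ("@container": "@language"): one tagged string per language key."""
    return [f"{text}@{lang}" for lang, text in label.items()]


def from_subkeys(label: dict) -> str:
    """Subkey notation: label_text maps to @value, label_lang_key to @language."""
    return f"{label['label_text']}@{label['label_lang_key']}"


tagged = from_language_map({"en": "Text"})
tagged2 = from_subkeys({"label_text": "Text", "label_lang_key": "de"})
```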

Ontology term import/export

Existing ontology terms can be imported/exported via json-ld directly or ttl by defining the corresponding context, e. g. for EMMO-Terms: [5]

Recursive Parsing

Module:Category

Called from Category:<UUID>@template

  1. Synchronize Category:<UUID>@jsondata.subclass_of with Category:<UUID>@jsonschema.allOf
  2. Expand Category:Category@header_template with jsondata parameters
  3. Render Category:<UUID>@main
  4. Expand Category:Category@footer_template with jsondata parameters

Module:Entity

Called from item@template, item = Item:<UUID>

Recursion
  1. For each Item:<UUID>@jsondata.osl_category as category:
    1. For each category@jsondata.osl_category or category@jsonschema.allOf as supercategory:
      1. For each supercategory@jsondata.osl_category or supercategory@jsonschema.allOf as supersupercategory:
        1. ...
      2. Expand supercategory@header_template with item@jsondata parameters. Fallback: Render infobox
      3. Expand supercategory@data_template with item@jsondata parameters. Fallback: Use Json-LD mapping within category@jsonschema
    2. If category@header_template: Expand category@header_template with item@jsondata parameters
    3. Else: Render infobox with all attribute-value pairs
    4. Expand category@data_template with item@jsondata parameters. Fallback: Use Json-LD mapping within category@jsonschema
  2. Render item@main
  3. footer...
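
The header part of the recursion above can be sketched in Python as a depth-first walk. The pages dictionary, its keys supercategories and header_template, and the function name render_headers are assumptions made for illustration; the actual implementation is a Lua module and is not reproduced here. The sketch has no protection against cyclic category hierarchies.

```python
def render_headers(category: str, pages: dict, out: list) -> None:
    """Render the headers of all supercategories before the category itself."""
    page = pages[category]
    for supercategory in page.get("supercategories", []):
        render_headers(supercategory, pages, out)
    if "header_template" in page:
        out.append(f"expand {category}@header_template")
    else:
        # Fallback when the category defines no header template.
        out.append(f"infobox for {category}")


pages = {
    "Category:Entity": {"header_template": "..."},
    "Category:Item": {"supercategories": ["Category:Entity"]},
}
steps = []
render_headers("Category:Item", pages, steps)
```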
Data Storing
  1. template specified: use template
  2. category specified:
    1. category@data_template specified: use data_template
    2. Use Json-LD mapping
      1. mapping specified: Store semantic property
        1. Literal value: store value
        2. Object value
          1. Property has type text/code: Store json string
          2. osl_category / osl_template specified: see below
      2. Don't store semantic property

Nested objects within item@jsondata are handled as follows:

  • osl_category: same handling as the root object, but:
    • Rendering with category@header_template. Fallback: Nested info box
    • Data storing with category@data_template. Fallback: Creating a subobject with Json-LD mapping + (inverse) semantic relation to the root object
  • osl_template: expand the template; the return value is assigned to the property
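
The Data Storing decision list can be condensed into a single function, shown here as a Python sketch. The function name storage_action, its boolean parameters and the returned action strings are hypothetical; the sketch only mirrors the order of the checks and does not store anything.

```python
def storage_action(value, has_template=False, has_data_template=False,
                   has_mapping=False, property_type=None) -> str:
    """Return which storing rule applies to a single jsondata value."""
    if has_template:
        return "use template"
    if has_data_template:
        return "use data_template"
    if not has_mapping:
        return "skip"  # no Json-LD mapping: no semantic property is stored
    if not isinstance(value, dict):
        return "store value"
    if property_type in ("text", "code"):
        return "store json string"
    return "handle nested object"  # osl_category / osl_template handling


action = storage_action({"a": 1}, has_mapping=True, property_type="text")
```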

Python Code Generation

see https://github.com/OpenSemanticLab/osw-python

Statement

PCB contains 10% +/- 1% Lead and Gold

| s | p/s | o/p/s | o/p | o... |
| --- | --- | --- | --- | --- |
| PCB | contains | Lead | | |
| | contains | HasMassConcentration | 10% | |
| | | HasMassConcentration | HasPrecision | 1% |
| PCB | contains | Gold | | |

File Handling

Copy-Policy

drop: do not copy the file

copy: copy the file and store the reference to it

copy-ref: store the reference to the original file

ask-on-edit: store the reference but ask the user to copy the original file when they try to edit it (current_page != creation_page)
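
The four policies and the ask-on-edit condition could be modeled as in the following Python sketch. The class name CopyPolicy and the function should_prompt_for_copy are hypothetical; only the policy names and the comparison current_page != creation_page come from the list above.

```python
from enum import Enum


class CopyPolicy(Enum):
    DROP = "drop"
    COPY = "copy"
    COPY_REF = "copy-ref"
    ASK_ON_EDIT = "ask-on-edit"


def should_prompt_for_copy(policy: CopyPolicy, current_page: str, creation_page: str) -> bool:
    """Ask the user to copy the file only when it is edited outside its creation page."""
    return policy is CopyPolicy.ASK_ON_EDIT and current_page != creation_page


prompt = should_prompt_for_copy(CopyPolicy.ASK_ON_EDIT, "Item:A", "Item:B")
```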

Meta-Data

HasProject=project: inherit permissions from project

HasCreationPage=creation_page: wiki page in which this file was created

HasEditPage=edit_page: wiki page in which this file was edited

HasCreator=creator: initial creator of the file

HasEditor=editor: editors of the file

Links
