Semantics: Connect Business with Data Products
A schema tells you how data is stored. A semantic layer tells you what it means. This page explains what a semantic layer is, how Entropy Data's Semantics feature models one, and how the resulting ontology connects to data products and data contracts.
The problem
Most data platforms do a good job of telling you which table, which column, and which type. They do a worse job of telling you what any of it means. The questions that slow teams down tend to be of the second kind:
- Three teams have a customer_id column. Do they refer to the same customers? Are guest checkouts included?
- Finance and Product both report Gross Merchandise Value. The numbers do not match. Which formula is authoritative?
- A new analyst needs shipment tracking data. Searching for "shipment" across a dozen schemas returns forty columns. Which ones are canonical?
These are semantic problems, not schema problems, and they scale poorly as the number of data products, teams, and AI tools grows.
What a semantic layer is
A semantic layer is a named, governed definition of your domain, independent of any specific table or system. It is built from two main elements:
- Concepts: the things in your business, such as entities, properties, and metrics.
- Relationships: how those concepts connect to each other and to data contract fields.
Concepts
Each concept has a stable ID, a human-readable name, a description, and an IRI so it can be referenced from external tools. For example, the Editorial Object concept from EBU Core Plus has the IRI:
http://www.ebu.ch/metadata/ontologies/ebucoreplus#EditorialObject
Concepts live in namespaces (the default is main) and come in four kinds:
- Entity: the nouns of your domain. Customer, Order, Article, Shipment.
- Shared Property: a reusable attribute with a primitive type that attaches to one or more entities. Customer Email, SKU, Order ID. Defined once, referenced from multiple places.
- Metric: a measurable quantity with a unit, a "better when" direction, and an optional formula. Gross Merchandise Value, Conversion Rate, Order Fulfillment Time.
- Group: a container to organize concepts by domain or subject area. Sales, Fulfillment, Catalog, Controlling.
Concepts carry the metadata that governance and AI tooling depend on: data type, classification (for example PII, sensitive, restricted), required and unique flags, examples, enums, regex patterns, multi-language annotations, and tags.
Concepts as YAML
The full ontology can be edited as YAML, which is practical for bulk edits, code review, and keeping the ontology in Git. Individual concepts also have a per-concept YAML editor. An excerpt from the demo retail ontology:
concepts:
  - id: order
    name: Order
    kind: entity
    group: Sales
    description: A confirmed purchase placed by a customer in the online shop.
    properties:
      - ref: Order ID
      - ref: Customer ID
      - name: placed_at
        kind: property
        data_type: timestamp
      - name: order_status
        kind: property
        data_type: string
        enum: [pending, confirmed, shipped, delivered, cancelled, returned]
      - name: total_amount
        kind: property
        data_type: number
      - name: currency
        kind: property
        data_type: string
        pattern: ^[A-Z]{3}$
        examples: [EUR, USD]
    annotations:
      - name: owl:equivalentClass
        value: http://schema.org/Order
Two details are worth pointing out. The ref entries pull in shared properties (Order ID and Customer ID), which are defined once and reused across entities. The owl:equivalentClass annotation aligns the concept with an external ontology (schema.org in this case), which keeps the internal model interoperable with standards such as GoodRelations or FIBO.
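The enum and pattern metadata can also be used to check incoming values. The following is a minimal sketch, not Entropy Data's own validation logic: it assumes the two property definitions from the excerpt have been copied into a plain Python dictionary, and the names ORDER_PROPERTIES and violations are hypothetical.

```python
import re

# Hypothetical in-memory copy of two properties from the Order excerpt.
ORDER_PROPERTIES = {
    "order_status": {
        "data_type": "string",
        "enum": ["pending", "confirmed", "shipped", "delivered", "cancelled", "returned"],
    },
    "currency": {"data_type": "string", "pattern": r"^[A-Z]{3}$"},
}


def violations(record: dict) -> list[str]:
    """Return readable problems for every field that has a definition."""
    problems = []
    for field, value in record.items():
        spec = ORDER_PROPERTIES.get(field)
        if spec is None:
            continue  # no semantic definition for this field, nothing to check
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{field}: {value!r} is not an allowed value")
        if "pattern" in spec and not re.fullmatch(spec["pattern"], str(value)):
            problems.append(f"{field}: {value!r} does not match {spec['pattern']}")
    return problems


print(violations({"order_status": "shipped", "currency": "EUR"}))
print(violations({"order_status": "lost", "currency": "eur"}))
```

The first record passes both checks; the second fails the enum check on order_status and the pattern check on currency.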
Metrics with formulas
A metric records its unit, the direction in which it improves, and how it is computed. That gives Finance, Product, and AI tools a single definition to agree on.
concepts:
  - id: gmv
    name: Gross Merchandise Value
    kind: metric
    group: Controlling
    description: Total value of all placed orders before refunds, returns, and discounts.
    unit: EUR
    better_when: higher
    formula: "SUM(order.total_amount)"
  - id: average_order_value
    name: Average Order Value
    kind: metric
    group: Sales
    unit: EUR
    better_when: higher
    formula: "SUM(order.total_amount) / COUNT(DISTINCT order.id)"
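To make the two formulas concrete, here is a small worked example in Python. It is only an illustration of the arithmetic: the sample rows are invented, and returning None for an empty order set is an assumption, since the metric definition does not say what should happen in that case.

```python
from decimal import Decimal

# Invented sample rows; the field names mirror the Order concept.
orders = [
    {"id": "o-1", "total_amount": Decimal("120.00")},
    {"id": "o-2", "total_amount": Decimal("80.00")},
    {"id": "o-3", "total_amount": Decimal("40.00")},
]


def gmv(rows):
    """SUM(order.total_amount)"""
    return sum((row["total_amount"] for row in rows), Decimal("0"))


def average_order_value(rows):
    """SUM(order.total_amount) / COUNT(DISTINCT order.id)"""
    distinct_orders = {row["id"] for row in rows}
    if not distinct_orders:
        return None  # assumption: the metric is undefined without orders
    return gmv(rows) / len(distinct_orders)


print(gmv(orders))                  # 240.00
print(average_order_value(orders))  # 80.00
```

Decimal is used instead of float so that currency amounts are not subject to binary rounding.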
Relationships
Concepts connect to each other through directed, typed relationships. The common types are hasProperty, isA, memberOf, measures, derived_from, and relatedTo.
relationships:
  - id: gmv_measures_order_total
    type: measures
    relates:
      - concept: gmv
      - concept: order.total_amount
    verbalizes: "{Gross Merchandise Value} measures {Order.total_amount}"
  - id: aov_derived_from_gmv
    type: derived_from
    relates:
      - concept: average_order_value
      - concept: gmv
    verbalizes: "{Average Order Value} derived from {Gross Merchandise Value}"
Each concept page shows incoming and outgoing edges so you can navigate the ontology in either direction. The Diagram view shown at the top of this page renders the same namespace as an interactive graph, with pan, zoom, and a click-through from any node to its concept page.
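Navigating in either direction amounts to keeping two indexes over the same directed edges. The sketch below illustrates that idea with the two relationships from the excerpt. It assumes that the first entry under relates is the source and the second is the target, which matches the verbalizations but is not stated explicitly; the function name upstream is hypothetical.

```python
from collections import defaultdict

# (source, type, target) triples taken from the relationships excerpt.
EDGES = [
    ("gmv", "measures", "order.total_amount"),
    ("average_order_value", "derived_from", "gmv"),
]

outgoing = defaultdict(list)  # concept -> edges that start at it
incoming = defaultdict(list)  # concept -> edges that end at it
for source, rel_type, target in EDGES:
    outgoing[source].append((rel_type, target))
    incoming[target].append((rel_type, source))


def upstream(concept):
    """Follow outgoing edges transitively and list everything reached."""
    seen, stack = [], [concept]
    while stack:
        for _, target in outgoing.get(stack.pop(), []):
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen


print(incoming["gmv"])                  # [('derived_from', 'average_order_value')]
print(upstream("average_order_value"))  # ['gmv', 'order.total_amount']
```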
Translations in multiple languages
Each concept, property, and relationship can carry translations of its name and description in any number of languages. Translations are stored as annotation entries tagged with lang, exposed through the SPARQL endpoint, and shown in the UI based on the user's language preference. A single ontology can serve English-speaking analysts, French-speaking product managers, and German-speaking auditors from the same source.
concepts:
  - id: customer
    name: Customer
    kind: entity
    description: A natural person who places orders in the online shop.
    annotations:
      - name: name
        value: Kunde
        lang: de
      - name: name
        value: Client
        lang: fr
      - name: description
        value: Eine natürliche Person, die Bestellungen im Online-Shop aufgibt.
        lang: de
      - name: description
        value: Une personne physique qui passe des commandes dans la boutique en ligne.
        lang: fr
The industry ontologies that ship with Entropy Data are translated out of the box. EBU Core Plus, for example, comes with English, German, and French labels and descriptions for every class and property.
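Choosing a label by language preference can be pictured as a lookup with a fallback to the untranslated value. This is a sketch of that idea only, not a description of how the UI resolves languages; the dictionary layout mirrors the Customer excerpt, and the function name localized is hypothetical.

```python
# Hypothetical in-memory form of the Customer concept from the excerpt.
customer = {
    "name": "Customer",
    "annotations": [
        {"name": "name", "value": "Kunde", "lang": "de"},
        {"name": "name", "value": "Client", "lang": "fr"},
    ],
}


def localized(concept, field, preferred_langs):
    """Return the first translation in preference order, else the default value."""
    by_lang = {
        a["lang"]: a["value"]
        for a in concept.get("annotations", [])
        if a.get("name") == field and "lang" in a
    }
    for lang in preferred_langs:
        if lang in by_lang:
            return by_lang[lang]
    return concept[field]  # no translation found, use the untranslated field


print(localized(customer, "name", ["de"]))        # Kunde
print(localized(customer, "name", ["it", "fr"]))  # Client
print(localized(customer, "name", ["it"]))        # Customer
```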
Linking data products and data contracts
A semantic layer is only useful if the data that implements it points back to it. Entropy Data uses the authoritativeDefinitions mechanism from the Open Data Contract Standard to connect concepts to the implementing data at three levels:
- On a data product (ODPS), representing the data product as a whole.
- On a data contract or one of its schema objects.
- On a specific field inside a contract schema.
The same link can be written directly into ODCS YAML. From a demo shipments contract:
properties:
  - name: shipment_id
    businessName: Shipment ID
    logicalType: string
    primaryKey: true
    authoritativeDefinitions:
      - type: "semantics"
        url: "https://demo.entropy-data.com/my-organization/semantics/main/shipment_id"
  - name: order_id
    authoritativeDefinitions:
      - type: "semantics"
        url: "https://demo.entropy-data.com/my-organization/semantics/main/order_id"
The type: "semantics" marker tells Entropy Data the URL resolves to a semantic concept, and the link is rendered as a clickable reference on the contract page. On the concept page, Entropy Data lists every data product and data contract that references it, so questions like "which datasets contain customer email addresses?" can be answered directly.
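The reverse lookup from a concept to the fields that reference it can be approximated outside the product as well. The sketch below builds such an index from already-parsed contract fields. It assumes the URL path ends in the namespace followed by the concept ID, as in the two URLs above; the second contract (orders) and the names concept_key and usage_index are invented for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

BASE = "https://demo.entropy-data.com/my-organization/semantics/main/"

# Parsed contract fields; "shipments" follows the excerpt, "orders" is invented.
contracts = {
    "shipments": [
        {"name": "shipment_id",
         "authoritativeDefinitions": [{"type": "semantics", "url": BASE + "shipment_id"}]},
        {"name": "order_id",
         "authoritativeDefinitions": [{"type": "semantics", "url": BASE + "order_id"}]},
    ],
    "orders": [
        {"name": "id",
         "authoritativeDefinitions": [{"type": "semantics", "url": BASE + "order_id"}]},
        {"name": "note"},  # field without a semantic link
    ],
}


def concept_key(url):
    """Split a concept URL into (namespace, concept id), assuming the layout above."""
    parts = urlparse(url).path.strip("/").split("/")
    return parts[-2], parts[-1]


def usage_index(all_contracts):
    """Map each referenced concept to the contract fields that point at it."""
    index = defaultdict(list)
    for contract, fields in all_contracts.items():
        for field in fields:
            for definition in field.get("authoritativeDefinitions", []):
                if definition.get("type") == "semantics":
                    key = concept_key(definition["url"])
                    index[key].append(f"{contract}.{field['name']}")
    return dict(index)


print(usage_index(contracts)[("main", "order_id")])  # ['shipments.order_id', 'orders.id']
```

Fields with different column names (order_id and id) end up under the same concept, which is the point of linking by meaning instead of by name.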
What Semantics is useful for
Discovery by meaning
Consumers search for Shipment and find every data product that implements that concept, regardless of how the underlying columns are named.
Consistent definitions across teams
One Customer Email definition is referenced by multiple data contracts. Changing its classification to sensitive updates every reference.
A single source of truth for metrics
GMV, Conversion Rate, and the contribution margins have canonical definitions with formulas, which removes the ambiguity that tends to accumulate in spreadsheets.
Context for AI agents
LLMs do not know what your business means by Active Customer or Contribution Margin 2. When agents access data through our MCP server, Semantics supplies that context, so they join on the right keys and use the right metric definitions.
Start with an industry ontology
You do not need to model your domain from scratch. Entropy Data ships with ready-to-import ontologies for several industries. Open Studio, go to Semantics, and use Add → Import Industry Standard:
- GoodRelations (E-Commerce)
- FIBO (Finance)
- EBU Core Plus (Media)
- CGMES (Energy)
- IDMP (Pharma)
- EPCIS (Supply Chain)
- IATA ONE Record (Air Cargo)
- TM Forum SID (Telco)
You can also upload your own RDF/OWL files (Turtle, OWL, RDF/XML, N-Triples, N3, JSON-LD), or query the ontology programmatically via the SPARQL endpoint at /api/semantics/sparql.
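As a rough illustration of programmatic access, the sketch below builds a query request in the style of the SPARQL 1.1 Protocol (GET with a query parameter) without sending it. Only the /api/semantics/sparql path comes from this page. The base URL is a placeholder, and whether the endpoint accepts GET requests, which authentication headers it needs, and whether labels are exposed as rdfs:label with language tags are all assumptions to verify against your installation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder base URL; only the endpoint path is taken from this page.
BASE_URL = "https://entropy-data.example.com"

# Assumes labels are exposed as rdfs:label with language tags.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?concept ?label WHERE {
  ?concept rdfs:label ?label .
  FILTER (lang(?label) = "de")
}
LIMIT 20
"""


def build_sparql_request(base_url, query, extra_headers=None):
    """Build, but do not send, a SPARQL 1.1 Protocol style GET request."""
    params = urlencode({"query": query.strip()})
    url = f"{base_url.rstrip('/')}/api/semantics/sparql?{params}"
    headers = {"Accept": "application/sparql-results+json"}
    headers.update(extra_headers or {})  # e.g. authentication, if required
    return Request(url, headers=headers)


request = build_sparql_request(BASE_URL, QUERY)
print(request.get_method(), request.full_url[:60])
```

Passing the request to urllib.request.urlopen would send it; a JSON result set could then be read with json.load on the response.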
Open standards: OSI
Entropy Data is committed to open standards for the semantic layer and has joined the Open Semantic Interchange (OSI) initiative. OSI is a vendor-agnostic, open-source effort to standardize how semantic models are exchanged between BI platforms, AI agents, and analytics tools, so that a single definition of an entity or metric stays consistent across your ecosystem. We will support the upcoming OSI Ontology standard natively in Semantics, next to our support for Bitol's Open Data Contract Standard and Open Data Product Standard on the data contract and data product side.
How Semantics fits the rest of Entropy Data
- Marketplace uses semantic concepts for discovery.
- Studio is where domain experts and data product owners curate concepts and link them to contracts.
- Governance reuses semantic classifications (PII, sensitive, and so on) in cross-cutting policies.
- The MCP server uses Semantics to ground LLM tool calls in your business vocabulary.
Getting started
Semantics is available behind a feature flag:
- Entropy Data Cloud: contact us to enable it for your organization.
- Self-hosted: set APPLICATION_SEMANTICS_ENABLED=true and restart the application.
To try Semantics without installing anything, open the Demo and go to Studio → Semantics. The demo ships with the major industry standards. Go to Studio → Semantics → Add → Import Ontology to explore them or to upload your own ontology.