IFC-LD

Note	As of fall 2022, this specification is under active community review. Prose and data samples may have inconsistencies. Please report all issues or proposals through Github.

1. Rationale

IFC-LD is a data model and family of serializations designed to support the exchange of information about the built environment. It provides a standard set of predefined "Containers" - Packages, Entities, and Components - in which stakeholders can store and exchange information. It is interoperable with all existing versions of IFC, while embracing new directions. Practically, IFC-LD aims to replace ISO 10303-21 ("P21" files) as the leading exchange format for IFC.

IFC-LD attempts to solve several problems with IFC-based data exchanges through use of convention:

To avoid use of custom parsers for ISO 10303-21 ("P21") files, IFC-LD provides a standard set of JSON-based serializations.
To support integration and linking of data, IFC-LD uses JSON-LD to make the JSON serializations double as linked data.
To discourage the use of monolithic model files, IFC-LD makes all assets (down to individual properties or "Components") uniformly addressable, and provides well-defined ways to share them with stakeholders.

IFC-LD standardizes the ways IFC and IFC-adjacent data is packaged, but delegates to other specifications what those packages should contain for a particular use case. This separation of structure of the data envelope (IFC-LD) from the structure of its contents is what allows IFC-LD to integrate all existing IFC schemas while supporting new directions.

2. Requirements

2.1. Data Addressability

All exchanged data must be unformly addressable, because linking or referencing data requires it have a name. Given the distributed, global nature of the built environment, the most natural namespace for built assets is the web, where shared assets are addressable by URIs.

2.2. Interoperability (with STEP)

The vast majority of the world’s existing IFC data is encoded as ISO 10303-21 files. It is also the primary serialization existing commercial systems support. To encourage broad adoption of IFC-LD, and to ensure we can migrate existing asset data, interoperability with the Part 21 exchange format is required.

2.3. Data Distribution

The data describing built assets is inherently distributed, because stakeholders describe assets from different points of view at different points in time. Any forward-looking exchange strategy, including IFC-LD, must take widespread data distribution and composition as prerequisites.

2.4. Extensibility

The IFC-LD specification cannot anticipate every need. Instead, it must provide a small set of capabilities that could later be extended. Rather than inventing these extension capabilities from scratch, it can leverage those it builds upon (i.e., RDFS).

3. Design

3.1. Data Model

IFC-LD models a standard set of data "containers". Each container type and its relations are specified within RDFS, a data-modeling vocaulary for linked data. Through the use of JSON-LD, users of IFC-LD can produce and consume valid IFC-LD without any knowledge of linked data, treating it only as JSON following a standard JSON Schema.

IFC-LD defines the following container types for organizing information.

3.1.1. Container

"Container" is the root element within the IFC-LD type hierarchy. All assets exchanged via IFC-LD are derived from Container. Containers defines several key properties:

@id: Required. Defines the URI address of the IFC-LD Container
@type: Required. Defines one or more sets (also identified by URIs) with which this Container shares a common trait (e.g., "ifc4:person" is the @type of all IFC4 Person instances)
group: Optional. A set of strings tagging the Container under a common functional category (e.g., "Energy Performance", "Design Phase", "Unverified")

3.1.2. Components

A Component is a Container that holds "contents", captured at a specific time, declared by a specific stakeholder. A Component’s "contents" can represent a property value, a reference to an Entity, or some other capability defined by other specifications.

Here are example Components in IFC-LD:

[
    {
        "description": "I am a Component representing the fourth value of an IFC4 Person Entity.",
        "@id": "2NrzylOdL18BgYXZEmG1XH",
        "@type": [
            "Component",
            "ifc4:person/4"
        ],
        "componentof": "2TqpgZlI94j9CZf2Mh_w89",
        "contents": [
            {
                "@type": "xsd:string",
                "value": "Marjorie"
            }
        ],
        "meta": {
            "who": "General Rotors",
            "when": "2019-11-10T17:00:30"
        }
    },
    {
        "description": "I am a Component representing an embedded link from an IFC5 Building to another Entity",
        "@id": "2TqpgZlI94j9CZf2Mh_w89/1",
        "@type": [
            "Component",
            "ifc5:building/3"
        ],
        "componentof": "2TqpgZlI94j9CZf2Mh_w89",
        "contents": [
            {
                "@id": "2UZ7fqDD9EV9WoEvbxmdRf"
            }
        ],
        "meta": {
            "who": "Acme Corp",
            "when": "2020-11-30T17:44:46"
        }
    }
]

Components should be related to the Entities they qualify through the "componentof" property. A Component may be the componentof more than one Entity. When seen from the point of view of an Entity, Components describe qualifying aspects of that Entity.

Component data is stored in its "contents". The structure of contents depends on the Component. For Entity references, contents provides the @id of the targeted Entity. For property values, "contents" is made up of a "@type" and "value", the latter of which may be a (possibly nested) list.

Note	During conversion with P21 data, a Component’s "contents" @type maps to a known semantic type, like IfcRatioMeasure. Otherwise, a default XSD type is used.

Components may also declare a "shortcut" property, whose value is a URI, specifying the relationship the Component defines between its linked Entities (via componentof) and its contents. For example, a Component may declare its linked Entity is related to its Entity reference contents via Building Topology Ontology's "hasElement" relation:

{
    "@id": "2TqpgZlI94j9CZf2Mh_w89/1",
    "@type": [
        "Component",
        "ifc5:building/3"
    ],
    "componentof": "2TqpgZlI94j9CZf2Mh_w89",
    "shortcut": "http://w3id.org/bot#hasElement",
    "contents": [
        {
            "@id": "2UZ7fqDD9EV9WoEvbxmdRf"
        }
    ],
    "meta": {
        "who": "Acme Corp",
        "when": "2020-11-30T17:44:46"
    }
}

Linked data-aware systems may then use this shortcut property to "unpack" the underlying RDF triple:

SELECT DISTINCT ?Entity ?shortcut ?target WHERE {
  ?Component <http://ifc-ld.org#Componentof> ?Entity .
  ?Component <http://ifc-ld.org#shortcut> ?shortcut .
  ?Component <http://ifc-ld.org#contents> ?target .
}

Use of Component shortcuts is optional.

3.1.3. Entities

"Entities" are a Container for Components, with each Component encoding some qualifying aspect of the Entity. Like Components, Entities are addressed by URI. Entities typically capture any asset, real or imagined, stakeholders want to track across time and revisions. Because their state is the superposition of many Components (each with its own provenance information), Entities do not not themselves declare provenance information like timestamp or ownership.

Entities should be grouped into one or more Packages, each of which captures the circumstances that changed the state of the Entity (through the introduction of new Components). An Entity may be the entityof more than one Package.

Here is an example Entity:

{   "@id":"2TqpgZlI94j9CZf2Mh_w89",
    "@type":["Entity", "ifc4:person"],
    "entityof":"2EgCkpGWD308FjMK8nRzS0",
    "components":[]
}

Note	For interoperability with IFC-STP, Entities correspond to P21 instances.

3.1.4. Packages

Packages are Containers for Entities. Packages define some event - a project deliverable, point-to-point exchange, etc - that justified the modification or transaction of Entities and Components. Because Packages describe events, they contain provenance information similar to Components.

Here is an example Package:

{   "@id":"2EgCkpGWD308FjMK8nRzS0",
    "@type":["Package"],
    "meta":{
        "who":"Acme Corp",
        "when":"2020-11-30T17:44:46"
    },
    "entities":[]
}

3.2. Serializations

The IFC-LD data model is independent of its serialization. It it possible, for example, to use standard RDF encodings to serialize IFC-LD as XML, HTML, JSON, Turle, or N3. IFC-LD recommends JSON as its primary serialization format, because it allows the data to be parsed without external tools, and can be treated as JSON for systems without linked data support.

JSON’s flexibility permits more than one JSON representation of the same IFC-LD data. For example, a collection of Packages, Entities, and Components could be organized as a JSON array (suitable for streaming), or as a hierarchical JSON Entity (suitable for browsing or lookup by ID). IFC-LD therefore supports several standard JSON serializations, each identified by its own media type. Where possible, these serializations strive to be losslessly interoperable.

3.2.1. application/ifc.json+stream

The IFC-LD "Stream" format, identified by media type application/ifc.json+stream, captures all IFC-LD assets in a flat array. Here is a sample IFC-LD stream:

[
    {
        "@id": "2EgCkpGWD308FjMK8nRzS0",
        "@type": [
            "Package"
        ],
        "meta": {
            "who": "Acme Corp",
            "when": "2020-11-30T17:44:46"
        }
    },
    {
        "@id": "2TqpgZlI94j9CZf2Mh_w89",
        "@type": [
            "Entity",
            "ifc4:person"
        ],
        "entityof": "2EgCkpGWD308FjMK8nRzS0"
    },
    {
        "@id": "2NrzylOdL18BgYXZEmG1XH",
        "@type": [
            "Component",
            "ifc4:person/4"
        ],
        "componentof": "2TqpgZlI94j9CZf2Mh_w89",
        "contents": [
            {
                "@type": "xsd:string",
                "value": "Marjorie"
            }
        ],
        "meta": {
            "who": "General Rotors",
            "when": "2019-11-10T17:00:30"
        }
    }
]

IFC-LD streams are useful for event-driven systems and one-at-a-time processing. They encourage the the exchange of "delta updates", wherein individual Components, Entities, and Packages can be exchanged, instead of monolithic model files.

3.2.2. application/ifc.json+pkg

The IFC-LD "Package" format, identified by media type application/ifc.json+pkg, captures IFC-LD assets as a nested hierarchy of Containers, where a root Package contains Entities, and Entities contain Components.

Here is a sample IFC-LD Package file:

{
    "@id": "2EgCkpGWD308FjMK8nRzS0",
    "@type": [
        "Package"
    ],
    "meta": {
        "who": "Acme Corp",
        "when": "2020-11-30T17:44:46"
    },
    "entities": {
        "2TqpgZlI94j9CZf2Mh_w89": {
            "@id": "2TqpgZlI94j9CZf2Mh_w89",
            "@type": [
                "Entity",
                "ifc4:person"
            ],
            "entityof": "2EgCkpGWD308FjMK8nRzS0",
            "components": {
                "2NrzylOdL18BgYXZEmG1XH": {
                    "@id": "2NrzylOdL18BgYXZEmG1XH",
                    "@type": [
                        "Component",
                        "ifc4:person/4"
                    ],
                    "componentof": "2TqpgZlI94j9CZf2Mh_w89",
                    "contents": [
                        {
                            "@type": "xsd:string",
                            "value": "Marjorie"
                        }
                    ],
                    "meta": {
                        "who": "General Rotors",
                        "when": "2019-11-10T17:00:30"
                    }
                }
            }
        }
    }
}

The IFC-LD Package format is most useful for browsing the Entities of a single Package. Because Entities and Components in this format are indexed by JSON keys, systems without linked data support can still traverse the network of linked Entities by dereferencing the corresponding JSON keys.

Note	For interoperability with IFC-STP, the IFC-LD Package format corresponds to a P21 instance population.

3.2.3. Serialization Interoperability

application/ifc.json+stream and application/ifc.json+pkg are losslessly interoperable. A IFC-LD stream can be encoded in the package format and back again with no loss of information. The Proof-of-Concept Transformation Pipeline outlines a reference architecture for this transformation flow, with notes for future work.

Future versions of the IFC-LD specification may introduce at least one JSON serialization that is derived from, but not losslessly convertible to, the stream or package formats. These serializations would be tailored to specific use cases (e.g., visualization) where small file size is the key Entityive.

3.3. Address Structure

Being linked data, all IFC-LD Container addresses are URIs. The structure of these URIs is unconstrained in principle, in that they are effective if they accurately address the assets they describe. However, IFC-LD does recommend specific URI patterns for particular use cases.

As general rules, IFC-LD recommends:

All Packages, Entities, and Components use http://ifc-ld.org# as their default base path.

For IFC P21 interoperability, IFC-LD recommends:

Packages, Entities, Components form their URIs by concatenting the base path with a valid 22-character IfcGUID, either extracted from the relevant P21 instance, or randomly generated otherwise. For example, http://ifc-ld.org#2TqpgZlI94j9CZf2Mh_w89/`

For IFC P21 interoperability, IFC-LD requires:

All Entity @types be defined by concatenating the IFC schema version’s base path (defined for each schema version part of the JSON-LD Specification) with the name of the IFC type after trimming the "Ifc" prefix, and converting to lowercase. For example, http://buildingsmart.org/standards/ifc4/person to address all IFC4 IfcPersons. The schema version is taken from the P21 "file_schema" header and converted to lowercase.
All Component @types be defined by concatenating the parent P21 Entity’s URI with the offset of the Component within the Entity. For example, http://buildingsmart.org/standards/ifc4/person/4 to address all Components that are the fourth entry of an IFC4 Person instance.

3.4. Provenance

IFC-LD provides a user-extensible field, "meta", for storing provenance data about its parent Container. It currently contains only two values:

who which declares the actor (identified by any string, including a URI) that created the Container. It corresponds to the wasGeneratedBy property of the PROV-O ontology.
when which declares the RFC 3339 timestamp when the Container was created. It corresponds to the generatedAtTime property of the PROV-O ontology.

Integrators may introduce other provenance fields to "meta" as needed.

4. References

4.1. Looking Forward: ECS, Modules, and Compositional Modeling

There’s a growing recognition within the industry to support "modeling by composition" for built asset data. Property Sets within the existing IFC specification support one flavor of this, in that multiple parties can attach different pieces of data to the same asset without prior coordination. How far can this model be taken and how can IFC-LD help?

IFC-LD supports compositional modeling at its core. Entities, which represent assets that persist across time and stakeholder revision, are primarily identified by a URI. Components, describing qualifying aspects (e.g., properties, relations) of Entities, can be linked to any number of Entities via their componentof values. As a result, multiple stakeholders, at multiple points in time, can attach multiple pieces of data (Component contents) to the same Entity, without prior coordination. By labeling Components as belonging to one or more Component groups, Components can be bundled into functional categories, making them behave equivalently to current IFC Property Sets. Together, Entities and Components form the foundation of an industry standard entity component model.

Note	Compositional modeling in IFC-LD can be applied regardless of whether P21 integration is applied. The same Entity may contain Components mapped to existing IFC schemas and those developed independently without conflict.

For example, consider the following set of Component definitions:

[
    {
        "description": "I am a Component representing area of utilization expressed according to local building codes.",
        "group":"Site Planning",
        "@id": "3boaO5Pe9D4fu1Ktf1qAcZ",
        "@type": [
            "Component"
        ],
        "componentof": "2TqpgZlI94j9CZf2Mh_w89",
        "contents": [
            {
                "@type": "ifc4:areameasure",
                "value": 3252.6
            }
        ],
        "meta": {
            "who": "Quadri Lateral",
            "when": "2020-05-10T00:00:30"
        }
    },
    {
        "description": "I am a Component representing the calculated maximum height of buildings on this site according to local building codes.",
        "group":"Site Planning",
        "@id": "3brTazpcT2DQSCkG2kuGIm",
        "@type": [
            "Component"
        ],
        "componentof": "2TqpgZlI94j9CZf2Mh_w89",
        "contents": [
            {
                "@type": "ifc4:positivelengthmeasure",
                "value": 32
            }
        ],
        "meta": {
            "who": "Acme Corp",
            "when": "2020-11-30T17:44:46"
        }
    },

    {
        "description": "I am a Component indicating whether the Site permits public access.",
        "group":"Owner Reqiurements",
        "@id": "3bvTIG_hLCjwhMCFTZy15R",
        "@type": [
            "Component"
        ],
        "componentof": ["2TqpgZlI94j9CZf2Mh_w89","07rDKiqTT59BNsVJj21ptZ"],
        "contents": [
            {
                "@type": "http://schema.org/publicAccess",
                "value": false
            }
        ],
        "meta": {
            "who": "Owner Us",
            "when": "2015-11-10T00:10:00"
        }
    }
]

All three Components are the componentof the same Entity 2TqpgZlI94j9CZf2Mh_w89 - in this case, some kind of building site. The third Component is also the componentof a second Entity, 07rDKiqTT59BNsVJj21ptZ, which represents a second site under the owner’s control. The use of two Entity references in the third componentof allows the same declaration of public access to be shared across Entities. Change the Component value to true, and all linked Entities will now record public access as allowed.

Next, notice the provenance data attached to each Component. Each Component has been contributed by a different stakeholder at a different point in time. All can be composed, because they use global Entity addresses to qualify what they’re describing.

Finally, note that by specifying "group" labels, each of the Components can be organized into functional categories akin to traditional IFC Property Sets. In this case, the categories are "Site Planning" and "Owner Requirements". Stakeholders may use these group labels to filter for Components of interest.

These key features of Components - the ability to link to multiple Entities and group by functional category - is what makes IFC-LD a compositional system at its core.

4.1.1. Modules

To help take IFC in new directions, we recommend the compositional modeling support of IFC-LD be used to develop collections related of Component definitions, called "Modules", that fulfill specific use cases, design processes, or functional categories. We imagine modules tailored to energy performance, massing design, construction administration, and more. Industry working groups for each module would take responsibility for the module’s design and release cycle. Many modules would be publicly shared to increase interoperability, but some may be tailored to particular delivery processes or firms. All Modules should supply a standard set of Component schema definitions (see Schema Validation Strategy).

4.2. Schema Validation Strategy

Being built on linked data, and requiring closed-world semantics for project delivery, IFC-LD recommends schema definitions be written in SHACL, a linked data constraint definition language. SHACL has a moderate learning curve, but provides a rich schema language that can capture complex, graph-based constraints and rules not possible in most alternatives. Because SHACL operates on the underlying linked data, not on a particular syntax, the same schema definitions can be applied to IFC-ID entities regardless of the serialization used (including those yet to be developed).

To help keep the overhead of working with SHACL low, we recommend the following guidelines be established for all IFC-LD schema definition:

Where possible, target SHACL schema definitions, called "shapes", against IFC-LD Components. This allows every shape to remain small and independently testable. It also allows legacy constraints from the existing IFC schema to be developed on an as-needed basis.
As part of an IFC-LD community effort, publish a SHACL "Component Schema Starter Kit" with standard shapes for common use cases (matching a property value exactly, ensuring a number is within a range, etc). Many common constraint checks are variations on a theme, so providing templates allows new ideas to be tested with minimal overhead.

Because Components should declare at least one custom @type, Component-based SHACL shapes can be easily written by targetting the Component’s @type using sh:targetClass. For example, here’s a SHACL shape constraining the size of IFC4 IfcGUID Components (defined as the first Component of IfcRoot):

ex:IfcGUIDShape
	a sh:NodeShape ;
	sh:targetClass ifc4:ifcroot/0 ;
    sh:severity sh:Violation ;
    sh:property [
		sh:path ifc-ld:contents ;
        sh:minLength 22 ;
		sh:maxLength 22 ;
		sh:message "IfcGUIDs must be exactly 22 characters in length."@en ;
	] .

Caution

Component contents structure has been under revision and may need to be brought into alignment with SHACL sh:path targets.

4.3. Industry Adoption Strategy

For IFC-LD to be successful, it is important it be easily integrated into industry practice. By supporting losslessly interoperability with P21, a IFC-LD reference implementation could immediately be put into practice. Once a suitable reference implementation has been developed, we recommend:

It be containerized and made available for both public deployment (with rate and size limits) and private use.
Vendor import/export dialog boxes provide hooks to pass P21 data to or from a deployment of the reference implementation, circumventing the need for vendor tools to build IFC-LD support natively.
Over time, the community work to stablize the reference implementation so it can be easily leveraged across vendor products and tools.

4.4. Proof-of-Concept Transformation Pipeline

As of fall 2022, a proof-of-concept transformation pipeline demonstrating lossless interoperability between the P21, application/ifc.json+stream, application/ifc.json+pkg formats has been developed. The pipeline’s design is unusual in that, outside of a P21 parser, all transformations are standard JSON-LD operations (compactions, flattenings, or frames); all data transformation is data-driven from JSON-LD definitions.

A diagram of the POC transformation pipeline

The POC pipeline spends the majority of its time performing JSON-LD framing between the stream and package formats. For a production system, we recommend all compactions on incoming IFC-LD assets be done one-at-a-time (instead of on the whole stream, as is currently done), and that the JSON-LD reframing operation be replaced with an imperative equivalent that could operate on persistent storage.

The intent is to release the POC transformation pipeline, along with several optimizations, before the end of 2022.

4.5. JSON-LD Specification

The IFC-LD Package format assumes the following JSON-LD @context:

{
    "@vocab": "http://ifc-ld.org/",
    "@base": "http://ifc-ld.org/",
    "meta": "@nest",
    "who": {
        "@id": "http://ifc-ld.org#who",
        "@nest": "meta"
    },
    "when": {
        "@id": "http://ifc-ld.org#when",
        "@nest": "meta"
    },
    "entityof": {
        "@type": "@id",
        "@id": "http://ifc-ld.org#entityof",
        "@nest": "meta"
    },
    "componentof": {
        "@type": "@id",
        "@id": "http://ifc-ld.org#componentof",
        "@nest": "meta"
    },
    "shortcut": {
        "@type": "@id",
        "@id": "http://ifc-ld.org#shortcut"
    },
    "contents": {
        "@id": "http://ifc-ld.org#contents",
        "@container": "@list"
    },
    "values": {
        "@id": "http:/ifc-ld.org#values",
        "@container": "@list"
    },
    "group": {
        "@id": "http://ifc-ld.org#group",
        "@type": "http://www.w3.org/2001/XMLSchema#string"
    },
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ifc2": "http://buildingsmart.org/standards/ifc2/",
    "ifc2x2": "http://buildingsmart.org/standards/ifc2x2/",
    "ifc2x3": "http://buildingsmart.org/standards/ifc2x3/",
    "ifc4": "http://buildingsmart.org/standards/ifc4/",
    "ifc4x1": "http://buildingsmart.org/standards/ifc4x1/",
    "ifc4x2": "http://buildingsmart.org/standards/ifc4x2/",
    "ifc4x3": "http://buildingsmart.org/standards/ifc4x3/",
    "ifc5": "http://buildingsmart.org/standards/ifc5/"
}

The Proof-of-Concept Transformation Pipeline manipulates this standard context in various ways to convert between intermediate formats.

Of note are the standard prefixes for published IFC versions and use of ifc-ld.org as the default vocabulary and base path for all data.

4.6. Meta-Model Specification

@base <http://ifc-ld.org#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . 
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .


:Container
    a rdfs:Class .

:Package 
    a rdfs:Class ;
    rdfs:subclassOf :Container .

:Entity
    a rdfs:Class ;
    rdfs:subclassOf :Container .

:Component
    a rdfs:Class ;
    rdfs:subclassOf :Container .

:componentof a rdf:Property ;
    rdfs:domain :Component ;
    rdfs:range :Entity .

:components a rdf:Property ;
    rdfs:domain :Entity ;
    rdfs:range :Component .

:shortcut a rdf:Property ;
    rdfs:domain :Component .

:entityof a rdf:Property ;
    rdfs:domain :Entity ;
    rdfs:range :Package .

:entities a rdf:Property ;
    rdfs:domain :Package ;
    rdfs:range :Entity .

:who a rdf:Property ;
    rdfs:subPropertyOf prov:wasAttributedTo .

:when a rdf:Property ;
    rdfs:subPropertyOf prov:generatedAtTime .

:contents a rdf:Property .

:group a rdf:Property ;
    rdfs:subPropertyOf xsd:string .

4.7. OpenAPI Specification

Coming soon. Please see in-progress IFC-LD Package Format (JSON Schema Draft 7) definition in the meantime.

4.8. Playground

Coming soon. Once the pipeline reference implementation is complete, the intent is to deploy it with a small web frontend here so users can play with the various IFC-LD formats directly in their browser.

4.9. FAQs

4.9.1. The format seems "busy". Aren’t there a smaller, simpler JSON formats for IFC?

Yes. Consider the following P21 instance:

#47= IFCMEASUREWITHUNIT(IFCRATIOMEASURE(0.0174532925199433),#45);

You could encode this in JSON as

[
    ["IFCMEASUREWITHUNIT", 0.017453292519943, 45]
]

It captures most of what’s in the original instance:

The P21 instance ID is replaced with a dedicated offset within a JSON parent array
The instance type is stored as the first element of the array.
All attributes are stored at their respective offset. Entity references are encoded as integers.
It’s compact and amenable to streaming.

It’s a nice encoding. It also has multiple problems:

The instance’s ID is always relative to other instances within the parent array; it has no absolute address.
There is no place to stash metadata, including which IFC version is being encoded.
IFC type parameters (e.g., IFCRATIOMEASURE) are lost in translation.
There are no schema annotations, including property names or type hints (i.e., is this element a literal or an Entity reference)

What changes would need to be made to the encoding to fix this?

We’d want to give instances, and their parts, distinct addresses so they could be referenced from anywhere.
We’d want to be able to add metadata to any Entity, especially to qualify the values we’re encoded (e.g. against a particular schema or asset version).
We’d want type parameters to be preserved for IFC compatibility and annotation of values.
We’d want to reintroduce property names from the schema, along with conventions for identifying the type of an attribute.

Fixing these problems resolves to something that looks very much like IFC-LD. We end up with a format that is largely self-describing. This inherently increases the data’s size, but in trade reduces the burden on consuming systems, because the data itself will describe how it should be consumed. We suggest the value of self-describing, integrable data outweighs the storage overhead, especially given the need and complexity of common coordination models within the design and construction industries.

4.9.2. Why does IFC-ID recommend schema definitions be defined SHACL? Why not JSON Schema?

JSON Schema is primarily useful for validating the syntactic structure of JSON documents. It can ensure a JSON document has a specific set of field names, arrays of a certain size, or numeric values within a particular range. It is challenged, however, to define constraint rules that are deeply relational (graph-like), or to handle constraint checking when the input data can come in different "shapes". Unfortunately, both of these qualities are core to the structure of IFC-LD.

SHACL avoids JSON Schema’s limitations here because it defines a graph-based constraint language that’s independent of the serialization. It allows complex constraint definitions that traverse multiple Entities, or members of a specific class of Entities, and works no matter how the data is encoded - XML, JSON-LD, Turtle, etc. This does make the schema definition language more complex, but many industry rulesets inherently require this complexity for validation.

SHACL is the default recommended schema language for IFC-LD simply because it provides the greatest flexibility to the widest number of stakeholders. The heuristic to apply SHACL shape definitions to IFC-LD Components intends to simplify and standardize the schema extension workflow.

The use of SHACL does not preclude the use of JSON Schema for validating specific parts of IFC-LD. In fact, the OpenAPI Specification depends on JSON Schema to check the structure of Packages, Entitys, and Components. Users are encouraged to expand these validation definitions for their use cases as needed.

4.9.3. Why are legacy IFC Components @types defined by offset, instead of by name?

Legacy IFC Components have URIs that are based on an attribute’s offset within a P21 instance, rather than its name in the corresponding EXPRESS schema. For example, the type of the fourth attribute of an IFC4 Person is identified by the curie ifc4:person/4, rather than ifc4:person/middlename. Tying the URI to the attribute name requires any translation pipeline somehow be aware of the schema the instances model, because P21 files do not include attribute name information. The Proof-of-Concept Transformation Pipeline does this through a data enrichment step. Should the schema version not be known to the pipeline, or no enrichment data be available, a translation pipeline would have no way to know how to encode the attribute type URI. Identifying the attribute URI by offset, which is guaranteed to be known, ensures graceful degradation in the event an unknown schema is encountered. As a side effect, it means IFC-LD technically works with all P21 files, even those unrelated to IFC.

4.9.4. How does IFC-LD compare to ifcOWL?

IFC-LD is Like ifcOWL in that it is:

based on linked data
interoeprable with the existing IFC schema

IFC-ID differs from ifcOWL in that it does not:

make linked data support or comprehension a prerequisite for use
does not model EXPRESS schema concepts, but instead provides its own simple Package-Entity-Component metamodel
delegates most schema validation and inferencing to other specifications, especially SHACL
encourages use of "closed-world" schema validation over "open-world" inferencing, at least for common use cases
encourages the design of new types and definitions as needed through Components

IFC-LD tries to package the power of linked data in a framework that’s approachable, intuitive, and open-ended.

4.9.5. How does IFC-LD compare to other industry linked data ontologies?

There’s been exciting work on purpose-built industry linked data models over the last few years. Each of these model can be seen as a "module" describing one aspect of industry processes or structures. Being based on linked data, these modules are composable, but resolution of the resulting data is low-level: everything is a collection of (subject, predicate, Entity) triples. We believe this resolution is too granular for most industry use cases, especially those without linked data support. IFC-LD provides a more Entity-oriented definition of what a module is and allows any of those purpose-built data models to be plugged in as needed (e.g., via a Component’s shortcut). By analogy with electricity, IFC-LD defines the shape of a standard data socket, so anything fitting that socket can plug in. IFC-LD is effectively an higher-level abstraction and composition layer on top of these other models.

4.10. Acknowledgements

Created by Devon Sparks (devonsparks.com) with input from members of the BuildingSmart community. IFC-LD is not currently an officially sponsored BuildingSmart International project.

Thanks to the stepcode team for collaboration on supporting pull requests for the STEP P21 parser and answering questions about the design of the STEP EXPRESS parser.