3D Tiles 1.1 Metadata for CesiumJS

Regarding the 3D Tiles 1.1 metadata specification, I wonder how I should prepare a tileset in terms of assigning metadata to its elements.

A good example is the “3D Tiles 1.1 Photogrammetry Classification” demo at Cesium Sandcastle.

There is a tileset service whose IonAssetId = 2333904. When setting
infoBox: true
I can identify that the tileset has “color” and “component” properties. Based on these, classification and styling are achieved in the Sandcastle’s CesiumJS app.
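As a sketch of what such property-driven styling looks like in a CesiumJS app: the style below colors features by their "component" metadata property. The specific component values ("Wall", "Roof") and colors here are illustrative assumptions, not the actual Sandcastle code.

```javascript
// Minimal sketch (not the actual Sandcastle code): a 3D Tiles style that
// colors features by a "component" metadata property. The component values
// "Wall" and "Roof" are illustrative assumptions.
const styleJson = {
  color: {
    conditions: [
      ["${component} === 'Wall'", "color('lightgray')"],
      ["${component} === 'Roof'", "color('firebrick')"],
      ["true", "color('white')"] // fallback for all other components
    ]
  }
};

// In a CesiumJS app, this would be applied as:
//   tileset.style = new Cesium.Cesium3DTileStyle(styleJson);
console.log(styleJson.color.conditions.length); // 3
```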

Here, I’m eager to know how I can prepare such a tileset as IonAssetId = 2333904 for use in a CesiumJS app, as in the Sandcastle - shown in the attached picture.

I would appreciate any comments.

There are many different forms of input data for 3D Tiles tilesets. For example, point clouds, imagery data, general meshes (created from scans, e.g. for buildings or terrain), or CAD data. And in many cases, these input data sets already do contain metadata, in many different forms. (There’s that joke: “What is ‘metadata’?” - “It is a word with eight letters!” :slightly_smiling_face: ).

The point is that there can hardly be a general rule about which metadata should be represented in which form in the resulting tileset. This is why the 3D Metadata Specification that is used in 3D Tiles allows different representations of metadata, on different levels of granularity. This is illustrated in this image:


The latter might be a technical level of detail that is not relevant for end-users. In many cases, Cesium ion is already preserving metadata from input data, and making it available in a form that is suitable for the respective type of data. For example:


So when you ask about how to “prepare” a tileset for assigning metadata, then it may be necessary to talk about some of the details:

  • What is the source data type? (Point clouds, CAD data…?)
  • Does the source data already contain metadata?
  • Which parts of the metadata should be preserved?
  • Are there any use-case specific requirements for the granularity of the metadata?

For certain complex, user-specific use cases, it may be necessary to assign metadata in a way that is not completely automated, and there again, many details depend on the type and granularity of the data. But even if certain steps may not be fully automated today, any feedback about actual real-world use cases can help us to prioritize the developments in this area.

Thank you Marco13 for your reply note.

I’m so glad that you are handling this request as a member of the Cesium team. I also appreciate your reminding me of the overall metadata concept in general.

Nevertheless, let me state my inquiry in a straightforward manner. As I previously mentioned, let’s focus on the Cesium ion AssetId = 2333904. (Unless necessary, let’s set aside anything about point clouds, CAD data, etc.) I’m not sure of the source data of the asset, but I assume it comes from the same source as AssetId = 1415196, Aerometrex San Francisco High Resolution 3D Model. (I trust you may track the source of it with the aid of Cesium colleagues. That’s why I’m glad you moderate this topic as a member of the Cesium team.)

My question, in a nutshell, is “How was the source data of AssetId = 2333904 prepared or arranged?”
• Is it required to use the glTF format?
• In order to keep the texel level of granularity, such as “component = Wall”, what work does the data operator have to do?

These are my concerns.

The asset 2333904 that is used in the “3D Tiles 1.1 Photogrammetry Classification” Sandcastle was created with a custom processing step, in order to apply the classification information provided by the data provider (Aerometrex).

In this case, the input data consisted of OBJ files. The OBJ format does not directly support any form of metadata, so the classification information was encoded in textures (referenced via MTL files). This classification information was then applied to the generated 3D Tiles data using a custom processing step.

Trying to address the questions more specifically:

My question, in a nutshell, is “How was the source data of AssetId = 2333904 prepared or arranged?”

The source data consisted of OBJ files. The classification information was given via textures, provided by the data provider. But it required special processing steps to bring this metadata into 3D Tiles.

Is it required to use the glTF format?

From what you said, you seem to refer to photogrammetry data. And there are different supported formats for that. But not all of them have an agreed-upon, standardized representation of metadata. So there currently is no generic way to “transport” metadata from the source data into the 3D Tiles data set.

In order to keep the texel level of granularity, such as “component = Wall”, what work does the data operator have to do?

Given that there is no universal representation of that information for all types of input data, this question cannot be answered generically. Depending on the type and structure of the input data, and the type and structure and granularity of the desired metadata, we can think through different approaches for how this metadata could be brought into a 3D Tiles data set. (It may involve manual work or custom processing steps, but maybe some of them can be implemented with reasonable effort, and maybe some of this could also help to prioritize future development efforts).

Your answers per topic are very helpful.

  1. Regarding metadata representation, I understand your explanation as follows;
    To build metadata from OBJ to 3D Tiles 1.1, the following steps are required.
    • Encoding class info in textures via MTL from OBJ – done by data provider
    • Applying the class info to 3D Tiles data – using a custom process

Q1. Is there any publicly open encoding and/or applying class info tools? (As a matter of fact, I ultimately would like to get to this.)
Q2. Does Cesium have development plan/roadmap to provide such tools whether publicly open or commercially available?

  2. Representing different formats of photogrammetry data, it is agreeable that there is currently no generic way to transport metadata from the source data into the 3D Tiles dataset.
    By the way, we’re talking about this in the context of 3D Tiles 1.1. When we look at the 3D Metadata Specification of 3D Tiles, EXT_structural_metadata specifies glTF 2.0. And the MetadataGranularities sample (3d-tiles-samples/1.1/MetadataGranularities at main · CesiumGS/3d-tiles-samples · GitHub) refers to glTF (*.glb) only. More straightforwardly, on the last page of the 3D Tiles 1.1 Reference Card (3d-tiles/3d-tiles-reference-card-1.1.pdf at main · CesiumGS/3d-tiles · GitHub), the title is ‘glTF Extensions for 3D Tiles 1.1’. These led me to think that the glTF format was required to use the 3D Tiles 1.1 metadata specification.

Q3. Could you please clarify the metadata representation boundary depending on photogrammetry data formats, especially for glTF?

  3. I understand that keeping the texel level of granularity and its operational effort involves manual work or custom processing steps. If Q1, Q2, and Q3 are answered appropriately, I trust this issue would be resolved as well. At the same time, I’ll look into Metadata Granularities a little more.

Looking forward to your follow-ups.

The summary - and your understanding of that process - is correct. But I may have to emphasize that this approach was chosen for that specific data set, because the data provider gave us the classification information as textures.

There could be other ways to transport this information - even for basic file formats like OBJ. For example, the OBJ format contains the concept of “groups”. Some details are described at Object Files (.obj), but the summary of the relevant point here is: there could be an OBJ file that contains information about a cube, consisting of 8 vertices (v) and several faces f (which, in turn, refer to the vertices).

v 0.000000 2.000000 2.000000
...
v 2.000000 2.000000 0.000000

g frontFaceOfCube
f 1 2 3 4
g backFaceOfCube
f 8 7 6 5
....

And these lines starting with g define these “groups” - in that example, one group for the front- and one for the back face of the cube.

And in theory, one could use something like this to perform the classification. In a more complex model, say the model of a building, there could be group definitions like g frontDoor and g roof that define the groups of faces that receive the same “classification”.
(One step further, one could even store an external CSV file that associates additional information with these groups…)
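Such group labels are easy to read back out of an OBJ file. As a sketch (this is not part of any Cesium tool, just an illustration of the idea), a few lines of JavaScript can associate each face with its group, which is the information a classification step would need:

```javascript
// Sketch: collect the faces of an OBJ file per "g" group, so that each
// face can later be associated with a classification. This is an
// illustration, not part of any Cesium toolchain.
function groupFacesByClass(objText) {
  const groups = {};
  let current = "default"; // faces before the first "g" line
  for (const line of objText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.startsWith("g ")) {
      current = trimmed.slice(2).trim();
    } else if (trimmed.startsWith("f ")) {
      if (!groups[current]) groups[current] = [];
      groups[current].push(trimmed.slice(2).trim());
    }
  }
  return groups;
}

// The toy cube from the example above:
const obj = `
g frontFaceOfCube
f 1 2 3 4
g backFaceOfCube
f 8 7 6 5
`;
console.log(groupFacesByClass(obj));
// { frontFaceOfCube: [ '1 2 3 4' ], backFaceOfCube: [ '8 7 6 5' ] }
```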

But … there is no standardized way of doing all that. So to answer your question:

Q1. Is there any publicly open encoding and/or applying class info tools? (As a matter of fact, I ultimately would like to get to this.)

I’m not aware of any specific editor for this. While I think that certain authoring tools will certainly allow defining “groups” that may then be exported to, say, OBJ files like this, that information cannot reliably be used when converting these OBJ files into 3D Tiles, due to a lack of standardization.

(Custom pre- or post-processing steps are often possible, but may come with considerable effort…)

Q2. Does Cesium have development plan/roadmap to provide such tools whether publicly open or commercially available?

There is no specific, public roadmap for this that I could point to right now. The general support for metadata is constantly extended, in all parts of the toolchain. The first part here was to support this metadata e.g. for visualization in clients like CesiumJS or cesium-native. Further steps have been to actually preserve certain kinds of metadata in the tiling process (as described in the links above, for point clouds or IFC/CAD data). But I’m not aware of a specific tool that would allow doing this for photogrammetry data.

The 3D Metadata Specification is a generic description of how to encode metadata in 3D applications. There are many common and very generic parts for that. For example, the concepts of a schema and classes with properties and data types (like UINT16 or “3D vectors of float values”).

And there are two “instances” of 3D Metadata in the context of 3D Tiles:

  • 3D Tiles Metadata, which can be inserted in the tileset.json file of a 3D Tiles data set
  • The EXT_structural_metadata glTF extension, which can be contained in the glTF/.glb tile content that is used in a 3D Tiles data set

(An aside: This has led to something that looks like an undesirable duplication. And it probably is :slightly_smiling_face: For example, the definition of a “class” in 3D Tiles Metadata and the definition of a “class” in glTF are essentially equal. But this is intentional. The same JSON-based description of a “class” can be used in both contexts).
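To make the shared-“class” point concrete, here is a sketch of one class definition reused in both places. The property names (`component`, `height`) are illustrative assumptions; the `type`/`componentType` values follow the 3D Metadata Specification:

```javascript
// Sketch: one "class" definition (per the 3D Metadata Specification),
// reused verbatim in both contexts. Property names are illustrative.
const buildingClass = {
  properties: {
    component: { type: "STRING" },
    height: { type: "SCALAR", componentType: "FLOAT32" }
  }
};

// In 3D Tiles Metadata, it sits under "schema" in the tileset.json:
const tilesetSchema = { classes: { building: buildingClass } };

// In glTF, the same JSON sits in the EXT_structural_metadata extension:
const gltfStructuralMetadata = {
  schema: { classes: { building: buildingClass } }
};

// Both contexts reference the identical class description:
console.log(
  tilesetSchema.classes.building === gltfStructuralMetadata.schema.classes.building
); // true
```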

But specifically addressing this point

These led me to think that the glTF format was required to use the 3D Tiles 1.1 metadata specification.

This is not the case. You can have a 3D Tiles 1.1 tileset with metadata, even when you are not using glTF.

This is related to the “granularity levels” that I mentioned. For example, if you have a tileset.json where each tile.content represents a complete building, then you can assign metadata to each tile (and therefore, the whole building) like this:

tileset.json:

{
  // The schema definition
  "schema": {
    "classes": {
      "building": {
        "properties": {
          ...
        }
      }
    }
  },

  "root": {
    "children": [ {
      // The content (geometry data) for "Building A"
      "content": { "uri": "buildingA.b3dm" },

      // The metadata for "Building A"
      "metadata": {
        "class": "building",
        "properties": {
          "height": 123.456
        }
      }
    }, {
      // The content (geometry data) for "Building B"
      "content": { "uri": "buildingB.b3dm" },

      // The metadata for "Building B"
      "metadata": {
        "class": "building",
        "properties": {
          "height": 234.567
        }
      }
    } ]
  }
}

In this example, I intentionally suggested that the geometry is stored in .b3dm files (and not .glb files). But the metadata is stored in the tileset.json, and refers to the whole geometry (i.e. the whole building).

In contrast to that, imagine that you have a single “building”, stored as a glTF/.glb file, and now you want to assign metadata to the “door” or “roof” of that building. (Or essentially do the classification that you referred to). In this case, there is no place in the tileset.json where you could say

“This part of the geometry is the ‘roof’”

Instead, you have to encode this information inside of the glTF file itself. And for that, you can use the EXT_mesh_features and EXT_structural_metadata extensions. The definition of the schema will be exactly the same as it is in the tileset.json. (It uses the same JSON format, because it is adhering to the “3D Metadata Specification” in both cases). But how the actual values are encoded in the glTF is different. (Within glTF, the metadata is often encoded in binary form - omitting some technical details here…)
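As a simplified sketch of what that looks like inside the glTF (all indices and the feature count here are hypothetical, and the binary encoding details are omitted): EXT_mesh_features assigns a feature ID to each texel via a feature ID texture, and a property table from EXT_structural_metadata maps each ID to values like component = "Wall".

```javascript
// Simplified, illustrative sketch of per-texel classification in a glTF
// primitive. All indices and the featureCount are hypothetical; the real
// extension also requires the referenced texture and property table to
// exist elsewhere in the glTF asset.
const primitiveExtensions = {
  EXT_mesh_features: {
    featureIds: [{
      featureCount: 4,                                   // e.g. Wall, Roof, Door, Other
      texture: { index: 0, texCoord: 0, channels: [0] }, // red channel holds the feature ID
      propertyTable: 0                                   // row i = metadata for feature ID i
    }]
  }
};
console.log(primitiveExtensions.EXT_mesh_features.featureIds[0].featureCount); // 4
```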

Q3. Could you please clarifiy the metadata representation boundary depending on photogrammetry data formats, especially for glTF?

I’m not entirely sure whether I fully understand what you mean by “representation boundary”. Hoping that my understanding is roughly correct:

The source data of “photogrammetry” data is often just a huge triangle mesh. This may be a whole city block that was scanned with a drone and stored as a 10GB OBJ file. In the worst (but common) case, this file has no inherent “structure”. It just contains a few hundred million triangles. It is not clear what a single “building” is in this data set, and even less what the “front door of a building” is.

When this data is converted to 3D Tiles, then it is optimized for streaming and efficient visualization. This involves a spatial subdivision of the data, and the computation of different LODs (levels of detail). The geometry data may eventually be stored in GLB files. But there are different levels of detail for that. So you might end up with

  • Level 0: One GLB file with 10MB, representing the whole city block with a low level of detail
  • Level 1: 10 GLB files of 10 MB each, representing “single buildings” with a higher level of detail
  • Level 2: 100 GLB files of 10 MB each, representing all the detailed components of these 10 buildings

Now, the challenges are:

  • The information about “classification” has to be present in all these levels of detail.
  • The data is usually just subdivided spatially

Even if you assigned a “classification value” to each building in the source data, then you’d still have to encode this classification in the single GLB file for “Level 0”. And given that the processing/subdivision happens only spatially, it is by no means clear that there eventually will be “a single GLB file for each building”…

This means that there must be some way to “transport” this information in a way that can be taken into account during the conversion of OBJ to 3D Tiles. And there currently is no universally applicable process for that…

(I know, all this will likely not give you an immediate solution to the task that you are working on. But maybe it helps with finding approaches for such a solution…)


I’m very grateful for your kind explanation. I don’t know how to thank you for the effort. I may not be the only one who appreciates this valuable information.

Your answers to my Q1 and Q2 fully satisfied my original request. Regarding Q3, I admit my inquiry lacked clarity. Nevertheless, you’ve already addressed my question in the answers to Q1 & Q2. (The term “boundary” wasn’t intended to be physical but conceptual. Anyway, my fault.)

Even though there are further questions, I regard this as the appropriate point to close this thread. Many thanks,