glTF scenes to 3D Tiles

I have about 50 glTF models (GLB) that I use to create a scene covering roughly 2 square miles, rendered in CesiumJS. Some of these models are trees and are loaded multiple times with different rotations/scales for variety. Overall, a single scene may display well over 5,000 models. My models have already been optimized as much as possible (e.g. compressed, WebP textures, meshes kept to the bare minimum), and each model is less than 100 KB in size. Some of my models share the same image textures. The user experience in the app will be at ground level (as if someone were standing and looking around) rather than an overhead view, so I expect tiling this data makes sense, since a lot of the area will not be visible from any single location.

I'm starting with a single scene but expect upwards of 50,000, so I'm trying to create an automated pipeline for this. I already have lat/lon values for the location of every model in the scene in a database. I've been trying to figure out the best approach to ensure the best performance possible, but I've seen a lot of conflicting advice, mainly because things have changed over time. Here are the different approaches I'm looking at. Please let me know if my understanding is correct and which approach you would recommend.

  1. Load models directly into CesiumJS, no tiling
  • Simple to achieve, and since my models are small, the network loading time would be low.
  • Likely not taking advantage of instancing, since the same model is loaded several times. It's already taking 30+ seconds to load the scene.
  • Loads the whole scene, not just the area the user is close to, which means more memory usage.
  • This works, but it is likely not the most optimal solution (a rough sketch of this approach follows the list below).
  2. Create a KMZ for the scene and import it into Cesium ion to create tiles.
  • I would need to convert my GLBs to Collada files and capture all the texture files, keep this all organized, and create the KMZ file accordingly. Cesium ion would then convert my models back to glTF, then to 3D Tiles.
  • Since many of these models are close together (e.g. a densely treed area), several of the same models, or models that use the same textures, will end up in the same 3D tile. The models should be instanced and thus more performant than loading them individually like I am currently.
  • With a bit of work I should be able to automate this scene-creation process.
  3. Create one giant glTF and then convert to tiles
  • I also have terrain data and base map imagery in my app. I could potentially combine this with the models and create the full scene in Blender, then export it as a single glTF. This would be similar to the Google Maps 3D Tiles, where there is only one model in each tile that contains a cutout of a larger model. I would have to use the terrain data in Blender anyway with this approach, since I would need to set the elevation of each model relative to the others. Then upload that to Cesium ion and create 3D Tiles (I believe this model would be chopped up into chunks per tile). This would give me the benefit of reducing the number of network requests from 3 per tile area (terrain, base map imagery, 3D model tile) down to one.
  • Maybe a bit more work to get things to align with the surrounding area (e.g. I fall back to low-res terrain and imagery in areas outside of the main scene), and I would need to ensure things look fine elevation-wise.
  • I suspect this approach will be a lot more manual, and harder to automate.
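
For reference, option 1 is roughly what I'm doing today: a loop over my database rows that loads each placement as its own primitive. A simplified sketch (assuming a recent CesiumJS build with Model.fromGltfAsync; the placement fields are placeholders for my actual schema):

```js
// `viewer` is the existing Cesium.Viewer instance; `placements` comes from my
// database: { url, lon, lat, height, headingDeg, scale } per model placement.
async function loadPlacements(viewer, placements) {
  for (const p of placements) {
    const position = Cesium.Cartesian3.fromDegrees(p.lon, p.lat, p.height);
    const heading = Cesium.Math.toRadians(p.headingDeg);
    const modelMatrix = Cesium.Transforms.headingPitchRollToFixedFrame(
      position,
      new Cesium.HeadingPitchRoll(heading, 0.0, 0.0)
    );
    // The same tree GLB is loaded over and over; no instancing happens here.
    const model = await Cesium.Model.fromGltfAsync({
      url: p.url,
      modelMatrix: modelMatrix,
      scale: p.scale,
    });
    viewer.scene.primitives.add(model);
  }
}
```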

I suspect #2 is the path that would be ideal here, please correct me if I’m wrong.

Also, in my app I have some other assets that would be in addition to this:

  • I have some geometry objects. I'm currently using primitives for these in CesiumJS. I found that KMZ in Cesium ion doesn't support geometries. I did see some experiments on GitHub where people were creating 3D Tiles from geometries, so I might be able to do that, then combine the two 3D tile sets together (see the sketch after this list).
  • I have a few models that are time-dependent (animations, or event-driven) that I would still load directly. I would likely only have 4 or 5 of those loaded at any one time. I also expect to change these assets from time to time (position, model being used…), so it wouldn't make sense for them to be in the main 3D tile set for a scene.
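
Putting those together, my rough mental model of how these pieces would coexist in one viewer is something like this (a sketch only; URLs, positions, and file names are placeholders):

```js
// `viewer` is the existing Cesium.Viewer instance.

// Static scene as 3D Tiles (trees, buildings, cars, ...).
const sceneTiles = await Cesium.Cesium3DTileset.fromUrl('tiles/scene/tileset.json');
viewer.scene.primitives.add(sceneTiles);

// A second tileset generated from my geometry objects, if that experiment works out.
const geometryTiles = await Cesium.Cesium3DTileset.fromUrl('tiles/geometry/tileset.json');
viewer.scene.primitives.add(geometryTiles);

// The handful of time-dependent models stay as ordinary entities so I can
// move or swap them at any time.
viewer.entities.add({
  position: Cesium.Cartesian3.fromDegrees(-75.0, 40.0, 0.0),
  model: { uri: 'models/animated-thing.glb' },
});
```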

I have also encountered similar difficulties before, and I believe that #1 is definitely the worst. My ultimate solution was to split a huge model into multiple models and load them using 3D Tiles, although the performance was still slightly worse than expected.

Another reason for seemingly conflicting advice could be that the details of the solution often depend on the details of the application goals and requirements (sometimes in subtle ways). For example: when someone (only) says "I'd like to render 5000 models", the advice would be vastly different from the advice for someone who says "I have 50 models, and want to render 1000 instances of each of them".

You already gave more specific information, and mentioned that some of the models in your case are supposed to be instances (of trees). And for this, using real instancing is usually by far the most efficient solution, both in terms of memory requirements and load times, as well as rendering performance. You can look at some of the samples at GitHub - bertt/cesium_3dtiles_samples (and maybe related threads like From I3dm to EXT_mesh_gpu_instancing? ) to get an idea of how this could be accomplished.
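
Just to illustrate the direction (this is only a rough, untested sketch based on the documented glTF-Transform EXTMeshGPUInstancing API; the file names are placeholders):

```js
import { NodeIO, Accessor } from '@gltf-transform/core';
import { EXTMeshGPUInstancing } from '@gltf-transform/extensions';

// Read the single tree GLB and attach per-instance translations to its node.
const io = new NodeIO().registerExtensions([EXTMeshGPUInstancing]);
const document = await io.read('tree.glb');           // placeholder file name
const node = document.getRoot().listNodes()[0];       // assumes one node carrying the mesh
const buffer = document.getRoot().listBuffers()[0];

const instancingExt = document.createExtension(EXTMeshGPUInstancing).setRequired(true);

// One XYZ offset per instance, in the local coordinates of the tile/model.
const translations = document.createAccessor()
  .setType(Accessor.Type.VEC3)
  .setArray(new Float32Array([0, 0, 0, 12, 0, 3, -7, 0, 15]))
  .setBuffer(buffer);

const instancedMesh = instancingExt.createInstancedMesh()
  .setAttribute('TRANSLATION', translations);         // ROTATION/SCALE work the same way

node.setExtension('EXT_mesh_gpu_instancing', instancedMesh);

await io.write('trees-instanced.glb', document);
```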

(You also mentioned that you have the lat/lon information of models in a database, and this sounds like it could be related to Consider convenience functions to create instanced models · Issue #84 · CesiumGS/3d-tiles-tools · GitHub . There is currently no specific timeline for addressing this issue, but some of the necessary building blocks are already there…)
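
(Conceptually, one of those building blocks is turning the stored lon/lat values into per-instance translations relative to a common center, which could then become the tile transform. A rough, untested sketch with CesiumJS math types:)

```js
// Convert stored lon/lat placements into offsets relative to a local
// east-north-up frame at a chosen center. These offsets could then be used as
// EXT_mesh_gpu_instancing TRANSLATION values, with the east-north-up matrix of
// the center becoming the tile (or tileset root) transform.
// (Depending on the toolchain, an additional Y-up/Z-up conversion may be needed.)
function toLocalTranslations(centerLon, centerLat, placements) {
  const center = Cesium.Cartesian3.fromDegrees(centerLon, centerLat, 0.0);
  const enuToEcef = Cesium.Transforms.eastNorthUpToFixedFrame(center);
  const ecefToEnu = Cesium.Matrix4.inverseTransformation(enuToEcef, new Cesium.Matrix4());

  return placements.map((p) => {
    const ecef = Cesium.Cartesian3.fromDegrees(p.lon, p.lat, p.height);
    return Cesium.Matrix4.multiplyByPoint(ecefToEnu, ecef, new Cesium.Cartesian3());
  });
}
```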

Options 2 and 3 sound like they could lose a lot of the information that you have thanks to your knowledge of the structure of the scene. For example: it could be difficult to perform these processes in a way that preserves the instancing information (in the worst case, you might end up with different GLB files that contain "the same" geometry data, but that is something that would have to be investigated).

You also said that you have other (non-instanced?) models, geometry objects, and time-dependent data. It’s not clear what exactly this is. Related questions could be: Are these models “large”, so that they could benefit from a conversion into 3D Tiles and Level Of Detail? Are the time-dependent models just animated glTF files, or is the time-dependency supposed to be controlled in a fine-grained manner by the application?

It looks like when loading models directly into CesiumJS, the instancing functionality that existed previously (e.g. ModelInstanceCollection) was marked private around version 0.96.

It sounds like i3dm may be the way to go for these models that I will be instancing. I've seen some of the stuff bertt has put out there; it's great. I'll also try the KMZ approach, as I have read in a few places that Cesium ion will notice the same model being used and use instancing as well. Worth a try, as having my pipeline spit out a KMZ file after an edit is pretty straightforward.

As for the other things in my scene, here is a summary of my app. I'm making a golf simulation app that connects to a launch monitor (a sensor that measures golf balls when you hit them). I've created a physics engine that models the flight path and have tied in a method for handling collisions in 3D space (nearly got all of that working in Cesium, just a few more things to tweak). From there I have started building an AI/ML data pipeline that can recreate 3D models of golf courses around the world using various types of information (e.g. for terrain it will look for lidar data, then a DEM, and if the DEM is too low-res, it will look at the course layout, or generate a layout from imagery and AI, and then, based on previous courses that have been created, estimate what the terrain looks like). From there I have a collection of a few hundred different models for trees, buildings, vehicles, and a few other things you would expect to see near a golf course. A single course would only use about 50 of these models on average (e.g. different trees in different areas). In the 3D Tiles I would include trees, buildings, and things like cars and other static assets. In addition to that I have the following:

  • A single 3D model of a golf flag that will move around throughout the game, including moving out of the way when needed. It has an animation to make it flap in the wind. This model is about 150 KB.
  • Several polygon primitives that will overlap bodies of water and streams. I managed to get the water material and animation working on any polygon primitive. I am also experimenting with a second polygon underneath that has an image of a reflection, to try and fake it a bit more since we don't have true reflections in CesiumJS. I'm still experimenting with this. I've also been trying to make a water glTF model on a flat plane that I could reuse, but getting water animations from Blender into glTF is a pain, which is likely why the water material is animating a normal image for this.
  • A model of a golf ball that will, of course, be doing a lot of moving around. This is less than a KB in size.
  • I also have vector tiles that break up all the areas of the course (fairway, greens, out of bounds…) that I can easily use to identify what type of ground the ball impacted, so I can calculate the correct type of bounce or take the ball out of play (see the sketch after this list). I will also need to do collision detection with models and have seen some threads on how to achieve this.
  • I'm also playing around with some other ideas, like using flat images on wall polygons (or as a glTF plane) as a simple model for certain things, like laying a retaining-wall image on the side of terrain, since a raster tile layer wouldn't stretch nicely over that.

This is primarily a fun side project (when I’m not tearing my hair out, lol). I’m purposely trying to get this to work in a web app as I want the app to be able to run on low end devices across multiple platforms. Since I already have a physics engine that is optimized for this scenario, all I really need is a place to render everything. I could possibly do all of this in babylon.js or three.js, or possibly even maplibre (I have a ton of experience with that) but wanted to have a crack at getting this working in Cesium first. I’ve used Cesium many times over the years but not as regularly as other platforms, mainly due to 2D map scenarios being easier to achieve on other maps. I have been following Cesium since its AGI days and had many meetings with Patrick back then.

That sounds like a cool and interesting project.

But I have to weave in another disclaimer here: When a question is very broad, like ~"How to create an app with many (possibly animated) 3D objects and physics and user interaction", then there cannot be a sensible answer in the context of a forum thread. I still hope that others will chime in here and provide some helpful hints. For now, all that I can say is very narrowly focused on specific aspects, and I can try to mention pros and cons of possible solutions. But eventually, there are many engineering questions involved. Usually, generic answers will be wrong in one case or another, and the answers that are not wrong nearly always start with "It depends…".

That said, for the specific case of rendering many instances of the same model:

It sounds like i3dm may be the way to go for these models that I will be instancing.

There are basically two approaches: One is I3DM, and the other is glTF/GLB with the EXT_mesh_gpu_instancing extension. The I3DM format is considered a “legacy” format, but it is still supported and will be supported in the future (in CesiumJS - support in cesium-native is coming soon). The solution with glTF (+extension) may be easier to handle in some ways: There’s broader support in terms of tools for creating such assets, and rendering engines that support it.

Some details: There is a (short) Migration Guide that explains how the features of I3DM may be “emulated” with glTF. But there are certain things that can not (generically) be represented with glTF+EXT_mesh_gpu_instancing. The latter refers, for example, to certain forms of animations that are possible in I3DM, but not with a single GLB. (It sounds like your instanced models are not animated, so there will hardly be a reason to use I3DM, but if you prefer, you can still use it).
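
For completeness: wiring such a GLB into 3D Tiles does not require much, because in 3D Tiles 1.1 a tile content can be a GLB directly. A minimal sketch (the coordinates and bounding volume are placeholders):

```js
// The contents of tileset.json, shown here as a JavaScript object for brevity.
// `viewer` is the existing Cesium.Viewer; coordinates and sizes are placeholders.
const center = Cesium.Cartesian3.fromDegrees(-75.0, 40.0, 0.0);
const tilesetJson = {
  asset: { version: '1.1' },
  geometricError: 500,
  root: {
    // Column-major east-north-up matrix at the scene center.
    transform: Cesium.Matrix4.toArray(Cesium.Transforms.eastNorthUpToFixedFrame(center)),
    // Oriented bounding box: center plus three half-axes, in metres.
    boundingVolume: { box: [0, 0, 0, 1000, 0, 0, 0, 1000, 0, 0, 0, 50] },
    geometricError: 0,
    refine: 'ADD',
    content: { uri: 'trees-instanced.glb' },
  },
};

// Once tileset.json and the GLB are hosted somewhere:
const tileset = await Cesium.Cesium3DTileset.fromUrl('tiles/tileset.json');
viewer.scene.primitives.add(tileset);
```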

Worth a try, as having my pipeline spit out a KMZ file after an edit is pretty straightforward.

This should indeed be worth a try. It might turn out to be the easiest solution for your application case. Otherwise, you could still create the glTF+EXT_mesh_gpu_instancing models with reasonable effort.

a collection of a few hundred different models for trees, buildings, vehicles, and a few other things

In addition to that I have the following:

These (few) additional model elements will certainly not be the problem. It sounds like there are some other challenging engineering questions involved, about generating the actual course and letting all the models (and their “meaning” and behavior) play together correctly in the app. Maybe you can carve out more specific questions for that, e.g. about the water rendering, reflections, physics, vector tiles, or the handling of the terrain/LiDAR data.

I think that from the perspective of the rendering and integration, CesiumJS should offer quite a few solutions for various aspects here (and when you already have e.g. a physics engine for the things that CesiumJS does not offer, it’s really more a matter of integrating it).

Therefore, a slightly higher-level question: How exactly do you intend to create the actual 3D scene here? Are you planning to let "one golf course" = "one 3D Tiles tileset" (with the 'moving' things like the flag or the ball still being independent models)? I'm wondering about the best way to combine the terrain, the information about the type of the terrain (water, sand, grass…), and the auxiliary elements (like instanced models) here.

I’m just focusing on the large set of static models that I will reuse a lot in this thread, the other info was just for background (I have most of the other stuff working already, or have experience doing that stuff). I wasn’t aware of the newer instancing method within GLTF, so I will look into that.

As for how I plan on creating the 3D scene, I'm hoping to have a single set of 3D Tiles for the static elements of the courses that would likely never change (e.g. trees). With each new course added I can merge in the new tiles with the existing ones (I already know how to do that). I will also likely package subsets of tiles for offline purposes so users can download a single course if desired. In terms of packaging, I plan to use SQLite, similar to how MBTiles files are done (I believe I'll need to extend it a bit to handle the JSON files that go with 3D Tiles), and if that doesn't work, simply zip them up. For the purposes of this thread I'm just focusing on the base case of having a single set of tiles hosted; no need to complicate things at this point.

From here there will be a set of common 3D models that I will reuse between courses regularly and will host with the app: golf flag, ball, cars, houses. Then there will be a set of course-specific models, like the main building, which will likely vary from course to course (I might get away with reusing these occasionally). There are two types of base map imagery I use: one is low-resolution imagery that I make available in the surrounding areas, both for background and so I can animate from one course to another if I want and have something in between. Each course will have high-resolution imagery for the ground. This imagery is a combination of open high-resolution imagery, put through an AI process to remove objects like people and shadows using AI inpainting, and AI upscaling to increase resolution. I then overlay textures for certain areas where I want the cleanest look (e.g. a grass texture on the greens, tees, and fairways). This imagery will be both hosted and have a downloadable version in MBTiles format.

I have a GeoJSON file with the areas of the course I need to know about for when a ball impacts the ground (how to bounce, or whether it's out of bounds/in water). I could just do a point-in-polygon search on this, but I'm also looking at using vector tiles as a way to quickly filter data down so I don't have to run point-in-polygon on all polygons every time (there are various optimizations I'm trying). I have a lot of experience with vector tiles (I helped write the NetTopologySuite library for them), so I'll compare that performance and size to simply doing a quick bounding-box filter on the GeoJSON data. An old collision-detection method used in gaming is to have an image where the pixel value indicates whether a collision occurs, so you only have to do a simple pixel lookup. However, I don't want to host a second set of detailed raster imagery, and vector-based calculations should be fast enough on today's devices.

I have detailed terrain data for each course, and a proxy service that decides where to pull terrain data from (detailed data for the course, or lower-resolution world terrain from Cesium). For offline mode I would only capture the detailed terrain data I created and make that available offline.

Each course will have a config file that lists all the resources needed and other required information, like the par for each hole. To do all this I have to be fairly organized, of course. For each 3D model I create, I capture a set of properties like model height, name, and tags related to what the model is, in a simple database. I have a little tool for manually editing a course where I can easily search this database and pull in models as needed. This database is also used in the data pipeline when auto-generating a course (if a tree in the data has a height and species, the appropriate model will be selected and scaled accordingly). Some of the data I have does include detailed tree data. In other areas I'm using machine learning to detect trees in imagery, estimate their species, then use the bounding box of the crown together with the species info to estimate height (unless I have lidar or DSM data I can cross-reference). When all else fails, or when I'm not too concerned with accuracy (like in the middle of a dense tree area), I simply look at what the native vegetation is, check imagery to see whether the trees are deciduous, coniferous, or a mix, then use a Poisson disk algorithm to procedurally place trees. I have a bunch of these things as small individual apps built on top of Cesium at the moment for testing purposes and am working to move the individual algorithms into a single data pipeline. At some point I'll likely take these helper apps and make them open source for others to play around with.
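
The procedural placement step itself is nothing fancy; a simplified (dart-throwing) version of the Poisson disk sampling looks roughly like this, before the points get mapped back to lat/lon and assigned a species and scale:

```js
// Simple rejection-based ("dart throwing") Poisson disk sampling over a
// rectangle, in metres. Bridson's algorithm would be faster for large areas,
// but this keeps the idea clear.
function poissonDiskPlacement(width, height, minDistance, maxFailures = 2000) {
  const points = [];
  let failures = 0;
  while (failures < maxFailures) {
    const candidate = { x: Math.random() * width, y: Math.random() * height };
    const tooClose = points.some(
      (p) => (p.x - candidate.x) ** 2 + (p.y - candidate.y) ** 2 < minDistance ** 2
    );
    if (tooClose) {
      failures++;
    } else {
      points.push(candidate);
      failures = 0;
    }
  }
  return points;
}

// e.g. trees at least 4 m apart in a 200 m x 150 m patch:
const treePositions = poissonDiskPlacement(200, 150, 4);
```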

For the animations I'm leveraging the built-in clocks of Cesium and have experience with getting those working smoothly. As part of the physics engine I calculate the ball path before animating it, rather than calculating on the fly (literally), and animate that path using a SampledPositionProperty. That animation is working well. At any given time there likely won't be more than a dozen models animating at the same time, and most of the time there will likely only be 3 or 4. I plan to have the odd plane or hot air balloon in the sky for effect and am also playing around with some cloud models. I know there is the cloud feature in CesiumJS, but I found a way to create even higher-resolution-looking clouds using a very low-poly model (take an image of a cloud, add it to a plane, split the plane into a few pieces, and twist it a bit to give it volume, positioned so the user sees the side profile).
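
The ball flight animation boils down to this pattern (simplified; the samples come from my physics engine):

```js
// `pathSamples` comes from the physics engine: an array of
// { time: secondsFromLaunch, position: Cesium.Cartesian3 }.
function animateBallPath(viewer, ballEntity, pathSamples) {
  const start = Cesium.JulianDate.now();
  const property = new Cesium.SampledPositionProperty();

  for (const sample of pathSamples) {
    const time = Cesium.JulianDate.addSeconds(start, sample.time, new Cesium.JulianDate());
    property.addSample(time, sample.position);
  }

  ballEntity.position = property;

  // Drive it with the viewer clock so replay/slow motion stay easy.
  const lastSample = pathSamples[pathSamples.length - 1];
  viewer.clock.startTime = start;
  viewer.clock.currentTime = start;
  viewer.clock.stopTime = Cesium.JulianDate.addSeconds(start, lastSample.time, new Cesium.JulianDate());
  viewer.clock.shouldAnimate = true;
}
```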

I know this is a big undertaking, and a bit crazy for a fun side project. I've been developing map platforms and working on hard geospatial problems for most of my career, and I'm passionate about this space. I'm using this project to push myself to combine all my experience, while also learning and integrating the latest stuff, especially related to AI.

As for how I plan on creating the 3D scene, I'm hoping to have a single set of 3D Tiles for the static elements of the courses that would likely never change (e.g. trees). With each new course added I can merge in the new tiles with the existing ones (I already know how to do that). I will also likely package subsets of tiles for offline purposes so users can download a single course if desired.

When you talk about "merging", I wonder whether this is about creating a (really) single tileset with everything. If this is supposed to cover dozens or even hundreds of courses eventually, one should keep in mind whether that scales well. (The 3D Tiles streaming itself will scale, due to frustum checks and such, but having a 100 GB tileset "allGolfCoursesOfTheWorld.json" sounds cumbersome.)

In terms of packaging, I plan to use SQLite, similar to how MBTiles files are done

A proposal for a 3D Tiles packaging format specification · Issue #727 · CesiumGS/3d-tiles · GitHub might be relevant here. That issue talks about this being a “proposal”, but to be honest: On a conceptual level, this already exists and will likely not change. (The specification goes a bit beyond what has originally been proposed in Package tileset in a single file · Issue #89 · CesiumGS/3d-tiles · GitHub , but these are details for now). There’s also the 3D Tiles Tools convert command that can convert between “standard tilesets” and tilesets that are stored in a .3dtiles (SQLite) file.
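
Roughly, and from memory (so double-check against the 3D Tiles Tools documentation): the conversion is a one-liner, and the resulting .3dtiles file is an SQLite database with a simple key/content table, so reading tiles from it directly is also straightforward:

```js
// Conversion (from the 3D Tiles Tools CLI; check the current docs for exact options):
//   npx 3d-tiles-tools convert -i ./course-tileset/tileset.json -o ./course.3dtiles
//
// Reading from the package yourself, assuming the simple key/content table
// ("media") used by the .3dtiles layout, with better-sqlite3:
import Database from 'better-sqlite3';

const db = new Database('course.3dtiles', { readonly: true });
const row = db.prepare('SELECT content FROM media WHERE key = ?').get('tileset.json');
if (row) {
  const tilesetJson = JSON.parse(row.content.toString('utf8'));
  console.log('root geometric error:', tilesetJson.geometricError);
}
```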

The details about how you are generating the data and the AI approaches are beyond something that I can sensibly respond to right now.

For the animations I’m leveraging the built-in clocks of Cesium and have experience with getting those working smoothly.

That certainly makes sense. You probably don’t want to use glTF animations in most cases, because there currently is not much (API-level) control in CesiumJS about how these animations will be played. Something like a “replay” or “slow-motion” will likely be easier to accomplish with the clock API.
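
For example, a replay or slow motion is then just a matter of resetting and scaling the clock that also drives the SampledPositionProperty (a small sketch, assuming the clock range was set up when the shot was animated):

```js
// Replay the last shot in slow motion, driven purely by the viewer clock.
viewer.clock.currentTime = Cesium.JulianDate.clone(viewer.clock.startTime);
viewer.clock.multiplier = 0.25; // quarter speed
viewer.clock.shouldAnimate = true;
```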

I know this is a big undertaking, and a bit crazy for a fun side project.

It sounds like many side projects combined, but that matches the earlier statement:

At some point I’ll likely take these helper apps and make them open source for others to play around with.

If these projects/parts can become something that is "standalone" and performs a useful task, then such "building blocks" can be carved out from such a ("big"/"crazy") project incrementally, with the added value of knowing that they work (at the very least, in the context of the larger project).

For others who come across this thread, here are a few additional links I found insightful: