Terrain Rendering

Hey folks,

In the next few months, I’ll be spending most of my time working toward bringing high-quality terrain rendering to Cesium. What does that mean? I’m still trying to work that out! That’s the purpose of this post: to lay out my thoughts on where we’re heading with terrain in Cesium and to solicit your input. Please don’t be shy. Even if you’ve never run Cesium before but have used terrain in other products, I want to hear what you have to say.

Ok, so Cesium runs in a web browser. It’s impractical to expect a bunch of terrain data to be present on the end-user’s machine. Even if it were there, browser sandboxing would make it hard for us to get access to it. Similarly, it’s impractical to download an entire terrain dataset to the client before we start rendering, since a high-resolution, worldwide terrain dataset can measure in the terabytes. What Cesium needs is a streaming terrain engine, where individual chunks (tiles?) of terrain are downloaded to the client from a server as they are needed. So far this is probably pretty obvious, but I wanted to lay it out.

Initially, I envision us rendering a single worldwide terrain data set, rather than compositing multiple data sets together on the client. In other words, if we want to visualize terrain data from multiple sources, we’ll serve them via a server that combines them into a single seamless terrain. We may relax this restriction in the future, when user needs justify the much greater complexity of doing so.

Where do we get the terrain data to render? I wrote up some initial thoughts on that here: https://github.com/AnalyticalGraphicsInc/cesium/wiki/Streaming-Terrain-Details. In order to render planetary-scale terrain at anything approaching a high resolution, the data must (at a minimum) be served in discrete chunks, because the entire data set is too big, and at multiple resolutions, because rendering a zoomed-out view still requires low-resolution data for the entire world. If you know of any publicly-accessible sources of terrain data that meet these requirements, other than the ones described on the wiki page, I want to hear about them.

None of the sources I listed in the link above are ideal, but my current plan is to start with the ESRI World Elevation Services - though that’s definitely subject to debate. Long term, I envision us standing up our own terrain servers to serve terrain data tailored to the needs of Cesium. Perhaps sell a rack-mounted server that can be installed on private networks as a means to offset the hosting costs of the public server?

While terrain will (initially?) be rendered from a single worldwide source, we’d like to be able to import multiple sources of imagery and overlay them layer-style on the terrain. Work on imagery layers is already happening in the imagery_layers branch on GitHub. There is significant overlap between that effort and this terrain effort.

The terrain work, which is still in its infancy, is taking place in the terrain branch on GitHub.

This is pretty early and high-level, but please let me know what you think about this direction. Better yet, if you have ideas about how we can incorporate terrain into Cesium, let’s talk about it.

Thanks,

Kevin

Hi Kevin,
I still don’t have much insight into the project, but generally speaking, most desktop terrain rendering techniques can probably be used in this environment too. As you may remember, I still think clipmaps are superior for the purpose. You could download tiles starting from the coarsest one and refine the scene as new, more detailed tiles arrive.

As for the servers, I haven’t used any of them. A dedicated server is probably the best solution if you are developing a professional Web application; otherwise, paying for a server is probably not something you can afford. TIFF (either GeoTIFF or plain TIFF) is not an ideal format for the terrain tiles. You should probably have some kind of proxy if you are using public servers. In fact, the best compromise is probably a proxy server that gathers data from the public ones and converts (and caches) it locally.

If the clipmap approach is accepted, I see no reason to treat the DEM any differently from other raster data (i.e., textures). Combining different sources on the server side would be faster since you can cache previous results, but you’ll prevent cache pollution if the combining is done on the client side. Generally, it depends on the variety of parameters and data sources clients can combine.

Regards,
Aleksandar

Hey Aleksandar,

Can you elaborate a bit on how you see a clipmap-based algorithm working? If I recall, your preferred approach is quite a bit different from the pure “geometry clipmapping” approach described by Hoppe and company, and in our virtual globe book.

My gut feel is to use a pretty conventional hierarchical LOD system, along the lines of Chunked LOD. Mostly because of its minimal hardware requirements (it doesn’t even need vertex texture fetch), good control of rendering quality, and ease of wrapping around an ellipsoid. But this is far from a final decision at this point.

You’re right that TIFF is not ideal for terrain tiles. That’s true in general, but it’s especially true in WebGL because browsers do not have built-in support for reading TIFFs. In the case of the ESRI World Elevation Services, it appears that the only way to get actual height values is in TIFF format. So we definitely need a proxy when using that terrain source. In my experiments so far, I have been converting the 32-bit floating point TIFFs served by ESRI into 24-bit PNGs, where the three bytes form an integer specifying the terrain height in millimeters. It’s messy, but it works. I’d like to look into instead encoding terrain tiles using something like the format the Google Body folks used for their models. It’s described nicely in an article in Cozzi’s upcoming OpenGL Insights book, and in this presentation. They achieved a very impressive 6 bytes per vertex. If we do use an algorithm like Chunked LOD, the server might serve an entire mesh rather than just a height map.
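
For the curious, here’s roughly what decoding one of those 24-bit PNG tiles looks like on the client. This is just a sketch of the idea - the byte order (red low, green middle, blue high) and the way the tile arrives as an Image are assumptions from my experiments, nothing final:

    // Sketch: decode a 24-bit "height in millimeters" PNG tile via Canvas.
    // Assumes the proxy packs the low/middle/high bytes of each height into R/G/B.
    function decodeHeightTile(image) {
        var canvas = document.createElement('canvas');
        canvas.width = image.width;
        canvas.height = image.height;

        var context = canvas.getContext('2d');
        context.drawImage(image, 0, 0);

        // getImageData always hands back 8-bit RGBA, so reassemble the integer here.
        var pixels = context.getImageData(0, 0, image.width, image.height).data;
        var heights = new Float32Array(image.width * image.height);
        for (var i = 0; i < heights.length; ++i) {
            var millimeters = pixels[i * 4] |            // red: low byte
                              (pixels[i * 4 + 1] << 8) | // green: middle byte
                              (pixels[i * 4 + 2] << 16); // blue: high byte
            heights[i] = millimeters / 1000.0;           // back to meters
        }
        return heights;
    }

Note that getImageData refuses to read images from another origin, which is yet another reason the proxy needs to live on our own domain (or send the right CORS headers).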

Interesting point about a clipmap-like approach allowing terrain data from multiple sources to be combined in much the same way as imagery. In OpenGlobe, I mapped the terrain to the ellipsoid by transforming a geodetic Lon/Lat/Height position to a Cartesian X, Y, Z position in the vertex shader. With that approach, I can see how you could read multiple height map textures in the VS, figure out which one applies to a given vertex, and then transform that to Cartesian space. The trouble with that approach, though, is that it’s not really possible (as far as I know) to do that transformation in 32-bit float precision with sufficient accuracy. So I’ve been assuming that transformation would have to take place on the CPU - perhaps even on a server. What do you think - is there another way?
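
For reference, the transformation I’m talking about is the standard geodetic-to-Cartesian conversion on the WGS84 ellipsoid. A quick JavaScript sketch (the constants and structure are just illustrative, not Cesium’s actual API):

    // Sketch: convert geodetic longitude/latitude (radians) and height (meters)
    // to Earth-centered Cartesian coordinates on the WGS84 ellipsoid.
    var WGS84_A = 6378137.0;             // semi-major axis, in meters
    var WGS84_E2 = 0.00669437999014;     // first eccentricity squared

    function geodeticToCartesian(longitude, latitude, height) {
        var cosLatitude = Math.cos(latitude);
        var sinLatitude = Math.sin(latitude);

        // Radius of curvature in the prime vertical.
        var N = WGS84_A / Math.sqrt(1.0 - WGS84_E2 * sinLatitude * sinLatitude);

        return {
            x : (N + height) * cosLatitude * Math.cos(longitude),
            y : (N + height) * cosLatitude * Math.sin(longitude),
            z : (N * (1.0 - WGS84_E2) + height) * sinLatitude
        };
    }

The outputs are on the order of 6.4 million meters, and a 32-bit float only resolves about half a meter at that magnitude, which is why doing this per-vertex in the shader produces jitter up close.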

Thanks for your thoughts; I really appreciate it.

Kevin

Hi Kevin,

Yes, my approach is quite different compared to Hoppe’s. What you saw almost two years ago I have completely abandoned. Now the geometry is generated in the VS or TS (the default is the VS, since the TS requires SM5 hardware and the performance is almost the same). Even with single-precision FP calculations I have achieved six digits of precision on a perfect sphere with Earth’s radius. Only the least significant decimal digit differs from the counterpart calculated with DP on the CPU. In other words, the viewer can be 1 micron above the Earth’s surface and still see no artifacts, even though the whole calculation is done with a 6378 km radius and a lot of trigonometric transformations. The projection is gnomonic. Disadvantage: VS texture fetch and attributeless rendering are required (since I’m not transferring any attributes to the VS).

If you are targeting a wider range of GPUs, then your approach is probably better. I’ll have to take a look at your chapter on Chunked LOD to give more precise comments.

Having 6 B/vertex is great for an arbitrary mesh, but for terrain it is a huge overhead if you don’t have caves and overhangs. 2 B/sample enables a precision of 13.5 cm, which is extremely accurate for the whole Earth.

Regards,
Aleksandar


For terrain, initially we would like to use something like SRTM_1km.tif (relatively small, but still 1 GB) loaded into ArcGIS Server or GeoServer and served via WMS or WCS (which can serve PNG, TIFF, or a variety of other formats).

Here are a couple data sources we had come across that might not be in your wiki:

http://glcf.umiacs.umd.edu/data/srtm/

ftp://topex.ucsd.edu/pub/srtm30_plus/

Thanks!

Ashley

Thanks, Ashley. I actually have a forwarded copy of your email to Cozzi from a while back describing how to set up GeoServer to host DTED data, but I haven’t tried it out yet. Do you happen to know offhand how GeoServer encodes heights in PNG files? We may need to insert a proxy in front of GeoServer as well in order to present the data in a browser-friendly format, for the same reasons we need to do so with the ESRI services.

If I understand how WMS works, GeoServer will be able to serve that 1 gig TIFF as a single low-resolution image for use in the most zoomed-out views. Do you know what kind of performance we can expect from that, though? I suspect it would need to pre-process the image into multiple versions with different resolutions in order to be able to answer such a request quickly.
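
For concreteness, the kind of request I have in mind is just a whole-world, low-resolution WMS GetMap, something like this (server and layer names made up):

    // Sketch: a whole-world, low-resolution WMS GetMap request for the most
    // zoomed-out views. The server and layer names are hypothetical.
    var url = 'http://example.com/geoserver/wms' +
        '?service=WMS&version=1.1.1&request=GetMap' +
        '&layers=srtm&styles=&srs=EPSG:4326' +
        '&bbox=-180,-90,180,90' +
        '&width=1024&height=512&format=image/png';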

Kevin

Hi Aleksandar,

That precision sounds pretty reasonable. Was there a trick? Also, do you know if your approach could be extended to an oblate spheroid like WGS84 instead of a sphere? Since your source data is gnomonic, I’m assuming we’d need to reproject the source data (which is more likely to be geographic or mercator) somewhere in the pipeline.

I actually misspoke before - the Google Body guys achieved 6 bytes per triangle, not per vertex. And that included the normal and index data I believe. But your statement is certainly still true - a regular height map is more efficient.

Kevin

Yes, there are plenty of “tricks”. :wink:

I had to implement my own trigonometric functions, since GLSL’s built-in ones have very poor precision.

Also, there is a specific way of calculating coordinates. The gnomonic projection is used just for locality, but all samples are in WGS84, so no reprojection is required. Or, to be more precise, I have my own coordinate system that resembles the gnomonic projection, but the grid is in WGS84 for most areas of the globe (-45 to +45 latitude). The polar caps are reprojected in order to get a better spatial distribution of samples and to allow the same approach as for the equatorial region. The algorithm is not trivial. I’ll try to describe it in a whitepaper soon.

My document was simply instructions on how to load DTED into a GeoServer layer. I don’t know what height encoding they use.

GeoServer could serve the data as one big image or break it into multiple mosaics of images at many ‘zoom levels’. The preprocessing is usually done beforehand like you said and the layer is published using GeoServer’s ImagePyramid plugin. That being said, it’s probably going to convert the images to 8-bit by default if you go that route.

The same concept of pyramiding/tiling would apply to using ArcGIS Server as well I would imagine.

How would Cesium get the data from ESRI World Elevation Services? Is that interfaced via the ArcGIS REST API? That would work for us too as long as we could load/serve our own data on an ArcGIS Server.

To be honest, the only way I have played with actually getting elevation into a WebGL app is to use a script to preprocess/convert the data to be served as JSON.

Sounds interesting. I look forward to reading more about it.

Ah, I wasn’t aware of the GeoServer ImagePyramid plugin. That doesn’t sound like exactly what we need.

Yes, ESRI’s World Elevation Services are accessed via their ImageServer REST API at http://elevation.arcgisonline.com. You’ll need to sign up for an ArcGIS account and possibly for the elevation services beta to be able to access them. There are two serious limitations, though:

  1. The servers are not accessible via Cross-Origin Resource Sharing (CORS). That means an image loaded from their servers will not be usable in WebGL. I’m told this will be addressed once they upgrade to ArcGIS Server 10.1 in the next few months.

  2. As far as I can tell, the only image format that represents the actual height data (rather than heights turned into a grayscale pixel in the range 0-255) is TIFF. But most browsers do not have support for decoding TIFFs.

It is for these two reasons that Cesium will need to access these services via a proxy rather than connecting to them directly. I suspect we might be in the same boat when talking to a GeoServer (at least on the second point), but I haven’t looked into it in depth, yet.
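
To make the first point concrete: WebGL only accepts a cross-origin image as a texture if it was fetched with CORS approval. Roughly (with a hypothetical proxy URL, and assuming gl and texture already exist):

    // Sketch: loading a height-map image for use as a WebGL texture.
    // Unless the server sends an Access-Control-Allow-Origin header, a
    // cross-origin image simply cannot be used with texImage2D.
    var image = new Image();
    image.crossOrigin = 'anonymous'; // request a CORS-enabled fetch
    image.onload = function () {
        gl.bindTexture(gl.TEXTURE_2D, texture);
        gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, image);
    };
    image.src = 'http://elevation-proxy.example.com/tiles/0/0/0.png';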

Some of the other WebGL globes out there do encode heights in JSON as you described, but I’m confident we can do better.

Kevin

Oops, meant to say it does sound like exactly what we need!

I would think it’s possible to get usable PNGs out of some WMS server that you could hit over CORS. The PNG would theoretically have the pixel values just be the actual elevation values. I’m not sure if GeoServer has this out of the box. An extension may need to be written.

Maybe we need some capability like this:

https://platform.terrainondemand.com/analyst/Doc/pages/WMSext.aspx

Also, I tried ESRI’s elevation services but it’s asking for some authentication when I add the service to ArcMap. I’m signed up for the beta but it seems the pages for the elevation beta are gone (since today or yesterday)?

Hi Ashley,

That WMSext looks promising. The page you linked says, "The full elevation values can be retrieved by requesting the image/png or image/tiff value for the format argument. The result of such a request will be an image where the signed short integer values or floating point real numbers contained in the image file for each pixel are the elevation of the respective point on the map in meters." Presumably, the PNG format uses the integers only, since PNG doesn’t support floating-point pixels. This still presents a problem, though, because the browser doesn’t give us any way to get at the 16-bit integer height values! The only way to get raw pixel values in a browser (to my knowledge) is via Canvas and getImageData(). But getImageData always returns the image in RGBA format where each of the components is 8 bits. It’s very frustrating.

Still, I suspect you’re right that we could write a plugin for GeoServer that serves the height data in a format we can work with in Cesium, rather than inserting an additional proxy.

Regarding the ESRI elevation services, they appear to still be working: https://elevation.arcgisonline.com/. When you click the first link at the top of that page, you’ll need to sign in with your ArcGIS Online account. Though I haven’t tried it myself, I’m guessing you need to use the same credentials when adding it via ArcMap.

Kevin

So there is no way to map an RGB value to some elevation value?

Either way, I wonder if WMS will have adequate accuracy for these elevation values or if a WCS will be needed to return the uncompressed/“raw” images.

What server/service are you currently testing with? I added the elevation.arcgisonline.com to my ArcMap as an ArcGIS Server but I couldn’t get it to work as a WMS or WCS (authentication token error for those).

Earlier in this thread you said “In the case of the ESRI World Elevation Services, it appears that the only way to get actual height values is in TIFF format. So we definitely need a proxy when using that terrain source. In my experiments so far, I have been converting the 32-bit floating point TIFFs served by ESRI into 24-bit PNGs, where the three bytes form an integer specifying the terrain height in millimeters.”

What if you just tried to request a png32 from the ESRI World Elevation WMS server (image/png32)? Is that any different from converting the TIFF to PNG in your custom proxy servlet?

http://elevation.arcgisonline.com/arcgis/services/WorldElevation/DSM/ImageServer/WMSServer?request=GetCapabilities&service=WMS

Hi Kevin,

In the next few months, I'll be spending most of my time working toward
bringing high-quality terrain rendering to Cesium. What does that mean?
I'm still trying to work that out! That's the purpose of this post: to lay
out my thoughts on where we're heading with terrain in Cesium and to solicit
your input. Please don't be shy. Even if you've never run Cesium before but
have used terrain in other products, I want to hear what you have to say.

I'm really excited to hear this! For us earth-bound geospatial types,
terrain is the killer feature.

I've contributed a bit to the OpenWebGlobe (
http://swiss3d.openwebglobe.org/ ) and WebGL Earth (
http://www.webglearth.com/ ) projects, here's some feedback/experience
from those.

WebGL Earth encodes height maps in image files, and then displaces
vertices using texture lookup in the vertex shader. This is fast, but
the terrain quality isn't great.

OpenWebGlobe uses a classic Chunked LOD approach. Terrain tiles are
triangular meshes with associated metadata (e.g. maximum error) for
the Chunked LOD algorithm to decide whether to split or not. The
meshes are computed server-side, computing all the meshes for
Switzerland from 25m resolution dataset takes a few hours. The tiles
are converted into JSON format, gzip-compressed, and served statically
from Amazon's S3 (actually we proxy S3 to do referrer checks, but in
theory they could be served directly from S3).

Practically, the system works well, and it has the nice feature that
terrain meshes are simply 3D objects. The same code can then be used
to draw other models, e.g. buildings, and in theory the same Chunked
LOD/terrain tiles as models approach can be used to render Chunked LOD
cities à la Nokia Maps 3D WebGL ( http://maps3d.svc.nokia.com/webgl/
). This sort of aerially-captured city data is becoming increasingly
common, and it would be great if Cesium had the capability to display
it.

A few things that I'd look at doing differently next time include:

1. Investigate reading the array buffer directly using XHR2 (see
responseType = "arraybuffer" on MDN's XMLHttpRequest page) to avoid the
need to parse the mesh in Javascript; a rough sketch follows after this
list.

2. Storing Chunked LOD metadata separately from the terrain tiles, and
grouping the metadata for multiple terrain tiles in a single object.
This way, the Chunked LOD algorithm can walk more of the LOD tree
without requiring the terrain tiles to be loaded.

3. Performance-wise (related to 1), I'd definitely profile gzip'ed raw
arraybuffer data against more sophisticated formats like Google Body
uses. I suspect that there will be not much difference in the amount
of data transferred across the wire, and that the raw arraybuffers
will be much faster to handle in the client. See also how Nokia do
it: http://idiocode.com/2012/02/01/nokia-3d-map-tiles/ .

4. Performance-wise (related to 2), generally I'd look at ways of
reducing the number of round-trips to the server. I suspect that it'll
be better to use a few, relatively large, tiles than many small ones.
Asking gl.draw to draw four times as many objects is generally likely
to be quicker than making four times as many requests to the terrain
server.
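
Here's a rough sketch of what I mean in point 1 (the tile URL and the raw
little-endian float layout are made up, and gl and vertexBuffer are assumed
to exist already):

    // Sketch: fetch a binary terrain tile straight into a typed array with XHR2.
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'http://tiles.example.com/terrain/5/12/9.bin', true);
    xhr.responseType = 'arraybuffer';
    xhr.onload = function () {
        // No JSON parsing step: view the bytes directly as vertex data.
        var positions = new Float32Array(xhr.response);
        gl.bindBuffer(gl.ARRAY_BUFFER, vertexBuffer);
        gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW);
    };
    xhr.send();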

Looking forward, I think it will be a common case that terrain data
will come from multiple sources and need to be combined somewhere. As
Cesium is a globe, a global data set is required, e.g. SRTM as you
mention on the Wiki. However, our clients (effectively states and
counties) typically have much more detailed data for their local
areas. Therefore, we need to be able to use more detailed data when it
is available, and fall back to less detailed data when it is not. As
combining data sets, especially terrain tiles, is a computationally
intensive process, I'd look first at preparing the data on the server
side and serving effectively static, optimized-for-the-client files
rather than trying to combine data sets in real time either on the
server or on the client.

3D building data is also becoming increasingly common. This will also
need to be streamed to the client, and I'm sure that the code will
share a lot in common with the terrain rendering.

Finally, subjectively, I think it will be very hard to make gridded
terrain data work well in a web context, basically due to the greater
bandwidth needed to provide a similar quality to triangular mesh
terrain. I might be wrong though!

Regards,
Tom

Yes, it’s a lot different, unfortunately. If you request a PNG32, the ESRI server puts each height value in the range 0-255 (using some sort of logarithmic scale, maybe… I’m not sure exactly what algorithm they use to do it) and then puts that same one-byte height value into each of the R, G, and B components of the color at each pixel. The alpha is always 255.

My experimental approach is to instead multiply the floating-point height in meters by 1000 to get the height in millimeters, cast it to an integer, and then set the red component to be the low byte, the green component to be the middle byte, and the blue component to be the high byte. I don’t use alpha because of some quirks with the HTML Canvas element that I won’t bore you with unless you’re interested. This produces some funny-looking images. I attached one so you can see what I’m talking about.
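
In code, the per-pixel conversion amounts to something like this (a JavaScript sketch for illustration only - negative heights would need a bias that I’m glossing over here):

    // Sketch: pack one floating-point height (in meters) into three bytes.
    function heightToRGB(heightInMeters) {
        var millimeters = Math.round(heightInMeters * 1000.0);
        return {
            red : millimeters & 0xFF,          // low byte
            green : (millimeters >> 8) & 0xFF, // middle byte
            blue : (millimeters >> 16) & 0xFF  // high byte
        };
    }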

To answer the question in your other message about what elevation service I’m currently using… I’ve switched gears a little bit to focus on improving the Tile/TileProvider system currently in Cesium for imagery, and I’m doing that in the imagery_layers branch. I’ll be back focused on “core terrain” soon, though. At that point I think I’m still going to use the ESRI terrain servers initially because their massive terrain database will make for a nice proof-of-concept, though it should be really easy, once I have that working, to switch to a WMS server like GeoServer instead. I may even do both at about the same time.

I’ve successfully used ESRI’s servers via their REST interfaces from a browser (proxied and converted as described above), but I haven’t tried accessing it via WMS or loading it in ArcMap, so I’m not sure what issues might exist there. Did you include a token generated from the “Get Token” link at the top right of the elevation service pages? I don’t know offhand how that’s done in WMS, but it probably needs to be included somehow.

Kevin

world.png

Thanks for all the useful insights, Tom! I think you and I are very much on the same page as to how this all should work. Just a few questions and comments below…

WebGL Earth encodes height maps in image files, and then displaces vertices using texture lookup in the vertex shader. This is fast, but the terrain quality isn’t great.

Were the terrain quality problems mostly due to precision problems introduced by transforming geographic coordinates to X,Y,Z in the vertex shader?

OpenWebGlobe uses a classic Chunked LOD approach. Terrain tiles are triangular meshes with associated metadata (e.g. maximum error) for the Chunked LOD algorithm to decide whether to split or not. The meshes are computed server-side, computing all the meshes for Switzerland from 25m resolution dataset takes a few hours. The tiles are converted into JSON format, gzip-compressed, and served statically from Amazon’s S3 (actually we proxy S3 to do referrer checks, but in theory they could be served directly from S3).

I should take a closer look at the terrain rendering and processing code in OpenWebGlobe. It sounds like a good match algorithmically, and the license is compatible.

Practically, the system works well, and it has the nice feature that terrain meshes are simply 3D objects. The same code can then be used to draw other models, e.g. buildings, and in theory the same Chunked LOD/terrain tiles as models approach can be used to render Chunked LOD cities à la Nokia Maps 3D WebGL ( http://maps3d.svc.nokia.com/webgl/ ). This sort of aerially-captured city data is becoming increasingly common, and it would be great if Cesium had the capability to display it.

Agreed. Do you know of any open sources of city data, at least for testing?

  2. Storing Chunked LOD metadata separately from the terrain tiles, and grouping the metadata for multiple terrain tiles in a single object. This way, the Chunked LOD algorithm can walk more of the LOD tree without requiring the terrain tiles to be loaded.

Excellent idea.

  3. Performance-wise (related to 1), I’d definitely profile gzip’ed raw arraybuffer data against more sophisticated formats like Google Body uses. I suspect that there will be not much difference in the amount of data transferred across the wire, and that the raw arraybuffers will be much faster to handle in the client. See also how Nokia do it: http://idiocode.com/2012/02/01/nokia-3d-map-tiles/ .

You may very well be right. The Google Body guys claimed a big improvement over a simple gzipped mesh, but their data could be different from ours. I believe there’s an open source library for encoding data in their format, so it may not be much work to try it out and see how much smaller (if any) the data can get. My gut is that a 20% reduction in data size is worth a 20% increase in client-side CPU time (just to make up numbers), especially if some of that processing can be done in a web worker. But in any case, developing our terrain engine will be an iterative process, and fancy encodings won’t be in the earliest iterations.
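
As an aside, moving that decoding into a worker would look roughly like this - decodeTile.js, createTileGeometry, and tileArrayBuffer are all hypothetical names, just to show the shape of it:

    // Sketch: decode a downloaded tile off the main thread in a Web Worker.
    var worker = new Worker('decodeTile.js'); // hypothetical decoder script
    worker.onmessage = function (event) {
        // Back on the main thread: hand the decoded vertex data to WebGL.
        createTileGeometry(event.data);
    };
    worker.postMessage(tileArrayBuffer); // ArrayBuffer from XHR2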

Looking forward, I think it will be a common case that terrain data will come from multiple sources and need to be combined somewhere. As Cesium is a globe, a global data set is required, e.g. SRTM as you mention on the Wiki. However, our clients (effectively states and counties) typically have much more detailed data for their local areas. Therefore, we need to be able to use more detailed data when it is available, and fall back to less detailed data when it is not. As combining data sets, especially terrain tiles, is a computationally intensive process, I’d look first at preparing the data on the server side and serving effectively static, optimized-for-the-client files rather than trying to combine data sets in real time either on the server or on the client.

Agreed, especially if serving optimized meshes. When working with height maps (despite their limitations), on-the-fly processing is a bit more practical.

I’m curious: do you agree with my assessment that combining imagery sources on the client is reasonable, even if combining terrain sources is not?

Finally, subjectively, I think it will be very hard to make gridded terrain data work well in a web context, basically due to the greater bandwidth needed to provide a similar quality to triangular mesh terrain. I might be wrong though!

I tend to agree, despite the fact that, on the surface, gridded terrain appears more compact because only the heights are explicitly represented. I still think there might be a place for gridded terrain, though. Some folks will be willing to accept popping artifacts and the like in order to avoid a lengthy pre-processing step.

Thanks again!

Kevin

WebGL Earth encodes height maps in image files, and then displaces vertices using texture lookup in the vertex shader. This is fast, but the terrain quality isn’t great.

Were the terrain quality problems mostly due to precision problems introduced by transforming geographic coordinates to X,Y,Z in the vertex shader?

I just realized the answer to my own question. The biggest terrain quality problems are, I assume, due to the lack of a geometric error metric for each tile, and the inflexibility of the LODs.