Precision of different methods for getting 3D coordinates

Hi everyone, I am writing to ask which method for getting 3D coordinates has the best precision. As far as I know, there are at least two functions for this, scene.pickPosition and scene.globe.pick, which according to the discussion in here should perform differently. According to the suggestion from one of the Cesium staff,
“I would recommend using either if you don’t have terrain or scene.globe.pick if you do have terrain. pickPosition works great for getting the position on a 3D tileset or a glTF model, but we haven’t had a chance to fix the globe picking issue.”
So probably scene.globe.pick and pickPosition both work well in my case, because I do have a 3D terrain model.

The problem is that, since the ground-truth coordinates are unknown, it’s pretty difficult to evaluate the precision of the results from these two methods. So far, I have tried using a PnP solution as a sanity check, but the results for both functions are very erroneous and just don’t make sense at all.

In addition to this, I am faced with a second problem. On my machine, Cesium.defined(ppCartesian) always returns false for certain rows and columns in the image, so I can’t get their coordinates. This is strange to me, and I was wondering whether it is likely to be caused by a coding issue?

Our code for getting 3D coordinates is like this:

var stringCart = "";
for (var ii = 0; ii < viewer.canvas.width; ii++) {
  for (var jj = 0; jj < viewer.canvas.height; jj++) {
    var cur_pos = new Cesium.Cartesian2(ii, jj);
    // use scene.pickPosition
    var ppCartesian = viewer.scene.pickPosition(cur_pos);
    if (Cesium.defined(ppCartesian)) {
      var ppCartographic = Cesium.Cartographic.fromCartesian(ppCartesian);
      // written out as (latitude, longitude, height), then the pixel (column, row)
      var s = Cesium.Math.toDegrees(ppCartographic.latitude).toFixed(7);
      stringCart += "[(" + s.toString() + ",";
      s = Cesium.Math.toDegrees(ppCartographic.longitude).toFixed(7);
      stringCart += s.toString() + ",";
      s = ppCartographic.height.toFixed(0);
      stringCart += s.toString() + "), ";
      stringCart += "(" + ii.toString() + "," + jj.toString() + ")], ";
    } else {
      // no position could be picked for this pixel: write a placeholder
      stringCart += "[(-1,-1,-1), ";
      stringCart += "(" + ii.toString() + "," + jj.toString() + ")], ";
    }
  }
}

Cesium 1.63 and Firefox were used. Looking forward to your opinions.

Can you share a Sandcastle example reproducing this issue that we can run? See instructions here: How to share custom Sandcastle examples. I’m curious how you’re evaluating the accuracy here.

Thanks for your message.

We actually use the cv2 package in Python, and I am not sure whether a Sandcastle example could be created for this purpose. The PnP solution could be found here, and the code is pretty straightforward. In theory, one inputs the 2D pixel positions and the corresponding 3D coordinates, in addition to the camera intrinsic parameters (which are known in Cesium), and the PnP algorithm outputs the 6-DoF camera pose in world coordinates.
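As a rough illustration of the intrinsics mentioned above, here is a minimal sketch of how a pinhole camera matrix could be derived from a perspective frustum. It assumes square pixels, a principal point at the canvas centre, and a field of view measured along the larger canvas dimension (which is how Cesium documents PerspectiveFrustum.fov). The helper names intrinsicsFromFov and projectToPixel are hypothetical, and the projection uses the OpenCV camera-frame convention (x right, y down, z forward).

```javascript
// Hypothetical helper: pinhole intrinsics from a field of view and a canvas size.
// Assumes square pixels and a principal point at the canvas centre.
function intrinsicsFromFov(fovRadians, width, height) {
  // The field of view is taken along the larger canvas dimension.
  const focal = (Math.max(width, height) / 2) / Math.tan(fovRadians / 2);
  return { fx: focal, fy: focal, cx: width / 2, cy: height / 2 };
}

// Project a point given in the camera frame (x right, y down, z forward)
// to pixel coordinates with the origin at the top-left corner.
function projectToPixel(K, point) {
  if (point.z <= 0) {
    return undefined; // the point is behind the camera
  }
  return {
    u: (K.fx * point.x) / point.z + K.cx,
    v: (K.fy * point.y) / point.z + K.cy,
  };
}

// Example: a 60 degree field of view on an 800x600 canvas.
const K = intrinsicsFromFov(Math.PI / 3, 800, 600);
const pixel = projectToPixel(K, { x: 0, y: 0, z: 10 });
console.log(K, pixel);
```

With the camera pose known from the scene, comparing projectToPixel against the pixel that was actually picked gives a reprojection error per point, which may be an easier sanity check to interpret than a full PnP solve.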

Ah, I see. So you’re passing in 2D images, the position of the camera in 3D, and the algorithm is supposed to estimate the camera pose?

I’m curious what you mean by this: since the input data is coming from a 3D engine simulation, and not from real-world footage, the “ground truth” here should all be known, right? Have you tried this on inputs covering a smaller range in a Cesium scene (for example, within a city instead of a globe view, so that the near/far distances are a lot smaller)? That may be an easier example to start with.

Using pickPosition will return undefined when you’re picking somewhere in the sky, since that’s empty space. That’s another reason why working in a smaller scene/an interior scene may be easier to start with.

Yes exactly! That’s what the PnP algorithm does. Since it’s complex to obtain the ground-truth coordinates, we use this as a sanity check. The coordinates from pickPosition and globe.pick lead to quite different PnP results. In this case, do you have any suggestions about how to tell which method is more accurate? We just find it hard to evaluate the coordinate accuracy.

I think the most accurate result is what you get from sampleTerrainMostDetailed. This takes a longitude/latitude and returns the height at that position.

globe.pick will do a ray cast using whatever geometry is currently loaded (so if the terrain you’re raycasting against is very far away the result may be inaccurate, compared to when you zoom in more). And scene.pickPosition will reconstruct the position from the depth buffer.
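To see how far the two approaches disagree at a given pixel, a minimal sketch along these lines may help. It assumes a viewer created elsewhere; the function name comparePicks is hypothetical, while camera.getPickRay, globe.pick, scene.pickPositionSupported and scene.pickPosition are existing Cesium calls.

```javascript
// Hypothetical helper: pick one window position both ways and report the gap in metres.
function comparePicks(viewer, windowPosition) {
  const scene = viewer.scene;

  // Ray cast against whatever terrain geometry is currently loaded.
  const ray = viewer.camera.getPickRay(windowPosition);
  const fromGlobe = ray ? scene.globe.pick(ray, scene) : undefined;

  // Position reconstructed from the depth buffer.
  const fromDepth = scene.pickPositionSupported
    ? scene.pickPosition(windowPosition)
    : undefined;

  // Euclidean distance between the two Cartesian results, when both exist.
  let distance;
  if (fromGlobe && fromDepth) {
    distance = Math.hypot(
      fromGlobe.x - fromDepth.x,
      fromGlobe.y - fromDepth.y,
      fromGlobe.z - fromDepth.z
    );
  }
  return { fromGlobe, fromDepth, distance };
}
```

A distance that grows for far-away pixels would be consistent with the two effects described above (coarser loaded tiles for globe.pick, depth reconstruction for pickPosition) rather than with a coding error.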

Thanks for your helpful message. In our design, there are quite a lot of distant pixels (e.g. more than 2000 m away from the camera position). Does globe.pick perform worse in such cases? We have a good-quality terrain model stored locally, and it covers a very large area (including those ‘distant pixels’).

Generally, scene.pickPosition should be the first option to try, but as this link says, the depth buffer is tied to the multi-frustum rendering and is affected by zooming in and out (see the tractor wheel example in the middle).

So, for our case of working with far-away pixels, maybe neither globe.pick nor scene.pickPosition provides a reliably accurate result?

If you have terrain, the most accurate thing is going to be what is returned from functions like “sampleTerrainMostDetailed”, since that queries the data itself and not what’s rendered on screen or reconstructed from the depth buffer.

For 3D Tiles the equivalent is “clampToHeightMostDetailed” as shown here:

Both of these approaches require computing roughly that point of intersection first though.
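Putting the two steps together for the terrain case, a minimal sketch might look like the following. It assumes a viewer with a terrain provider already set; Cesium is passed in as a parameter only to keep the snippet self-contained, and the function name refineWithTerrain is hypothetical. Note that it returns the terrain height at the longitude/latitude of the rough pick, not a new ray intersection.

```javascript
// Hypothetical helper: rough pick first, then query the terrain data for the height.
async function refineWithTerrain(Cesium, viewer, windowPosition) {
  const scene = viewer.scene;

  // Step 1: rough intersection using the terrain that is currently loaded.
  const ray = viewer.camera.getPickRay(windowPosition);
  const rough = Cesium.defined(ray) ? scene.globe.pick(ray, scene) : undefined;
  if (!Cesium.defined(rough)) {
    return undefined; // nothing was hit, e.g. the pixel shows sky
  }

  // Step 2: sample the most detailed terrain at that longitude/latitude.
  // sampleTerrainMostDetailed updates the heights of the positions it is given.
  const cartographic = Cesium.Cartographic.fromCartesian(rough);
  const updated = await Cesium.sampleTerrainMostDetailed(
    viewer.terrainProvider,
    [cartographic]
  );
  return updated[0];
}
```

Because each call triggers terrain requests, running this for every pixel of a canvas could be slow; batching many positions into one sampleTerrainMostDetailed call is likely the better fit for a full image.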

Thanks for your helpful message!