Trying to write CZML file from a Python dataframe

So I’m trying to convert some data from a pandas dataframe into a CZML file. For anyone unfamiliar, a dataframe is essentially a table of structured data. My dataframe looks like:

Row index  Source and Type  Epoch                      x              y              z             t_delta  e
0          SLR - Reference  2023-07-01 01:39:00+00:00  2.538802e+06   -3.614576e+06  5.507524e+06  0.0      2023-07-01T01:39:00Z
1          SLR - Reference  2023-07-01 01:42:00+00:00  1.652305e+06   -2.889726e+06  6.222747e+06  180.0    2023-07-01T01:42:00Z
2          SLR - Reference  2023-07-01 01:45:00+00:00  7.055082e+05   -2.059419e+06  6.710274e+06  360.0    2023-07-01T01:45:00Z
3          SLR - Reference  2023-07-01 01:48:00+00:00  -2.670395e+05  -1.153940e+06  6.952200e+06  540.0    2023-07-01T01:48:00Z
4          SLR - Reference  2023-07-01 01:51:00+00:00  -1.229828e+06  -2.063241e+05  6.939552e+06  720.0    2023-07-01T01:51:00Z

Row index is just the row index, Source and Type is the source of the data, Epoch is a timestamp object, x, y, z are my position, t_delta is the elapsed time in seconds since the first row, and e is a string representation of the epoch. I can select a column of the table and operate on it with a command like df['Epoch'].

The coordinates are in kilometers and I can convert them to meters by running df_tst[['x', 'y', 'z']] = df_tst[['x', 'y', 'z']] * 1000.
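For readers who want to reproduce the table, here is a minimal sketch of how the t_delta and e columns could be derived from Epoch. The two-row sample and the strftime format string are assumptions based on the values shown above, not the original poster's code.

```python
import pandas as pd

# Two sample rows copied from the table above (positions as shown there).
df = pd.DataFrame({
    "Epoch": pd.to_datetime(["2023-07-01 01:39:00+00:00",
                             "2023-07-01 01:42:00+00:00"]),
    "x": [2.538802e+06, 1.652305e+06],
    "y": [-3.614576e+06, -2.889726e+06],
    "z": [5.507524e+06, 6.222747e+06],
})

# Seconds elapsed since the first sample.
df["t_delta"] = (df["Epoch"] - df["Epoch"].iloc[0]).dt.total_seconds()

# ISO 8601 string with a trailing Z, the form used in the CZML output.
df["e"] = df["Epoch"].dt.strftime("%Y-%m-%dT%H:%M:%SZ")
```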

To show my code, I am basically creating a list of dictionaries from the elements of the dataframe:

import json

d_czml = [
  {
    "id":"document",
    "name":"simple",
    "version":"1.0",
    "clock":{
      "interval":f"{df_tst['e'].iloc[0]}/{df_tst['e'].iloc[-1]}",
      "currentTime":df_tst['e'].iloc[0],
      "multiplier":60,
      "range":"LOOP_STOP",
      "step":"SYSTEM_CLOCK_MULTIPLIER"
    }
  }, {
    "id":"Satellites",
    "name":"Satellites",
    "position": {
      "interpolationAlgorithm": "LAGRANGE",
      "interpolationDegree": 5,
      "referenceFrame": "INERTIAL",
      "cartesian": df_tst[['e', 'x', 'y', 'z']].to_numpy().ravel().tolist()
    }
  }
]

with open(data_dir / 'test_czml.czml', 'w') as f:
  f.write(json.dumps(d_czml))

The f"{df_tst['e'].iloc[0]}/{df_tst['e'].iloc[-1]}" expression creates a formatted string from the first ([0]) and last ([-1]) elements of the epoch string column e, separated by a slash.

The df_tst[['e', 'x', 'y', 'z']].to_numpy().ravel().tolist() expression creates a numpy array from the string epoch, x, y, and z columns. ravel flattens it to a 1D array, and tolist converts that to a Python list.
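As a small illustration of that flattening step, the sketch below builds a two-row frame with the same column names and shows the resulting list layout. The data values are taken from the table above; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "e": ["2023-07-01T01:39:00Z", "2023-07-01T01:42:00Z"],
    "x": [2.538802e+06, 1.652305e+06],
    "y": [-3.614576e+06, -2.889726e+06],
    "z": [5.507524e+06, 6.222747e+06],
})

# Mixed string/float columns give a 2D object array, one row per sample.
rows = df[["e", "x", "y", "z"]].to_numpy()

# ravel() walks the array row by row, so the flat list is
# [time, x, y, z, time, x, y, z, ...], the layout used for "cartesian".
cartesian = rows.ravel().tolist()
```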

Lastly, the data is written as JSON to a file called ‘test_czml.czml’ in data_dir (a Python pathlib.Path object).

The resulting raw file looks like:

[{"id": "document", "name": "simple", "version": "1.0", "clock": {"interval": "2023-07-01T01:39:00Z/2023-07-01T01:51:00Z", "currentTime": "2023-07-01T01:39:00Z", "multiplier": 60, "range": "LOOP_STOP", "step": "SYSTEM_CLOCK_MULTIPLIER"}}, {"id": "Satellites", "name": "Satellites", "position": {"interpolationAlgorithm": "LAGRANGE", "interpolationDegree": 5, "referenceFrame": "INERTIAL", "cartesian": ["2023-07-01T01:39:00Z", 2538.80175795354, -3614.57594511061, 5507.52360176926, "2023-07-01T01:42:00Z", 1652.30508359447, -2889.72556218045, 6222.74718975222, "2023-07-01T01:45:00Z", 705.508225186085, -2059.4191635614, 6710.27405051915, "2023-07-01T01:48:00Z", -267.039490128933, -1153.94044132787, 6952.20021290686, "2023-07-01T01:51:00Z", -1229.82808332692, -206.324144329006, 6939.55202392579]}}]

Prettified:

[{
   "id": "document", 
   "name": "simple", 
   "version": "1.0", 
   "clock": {
      "interval": "2023-07-01T01:39:00Z/2023-07-01T01:51:00Z", 
      "currentTime": "2023-07-01T01:39:00Z", 
      "multiplier": 60, 
      "range": "LOOP_STOP", 
      "step": "SYSTEM_CLOCK_MULTIPLIER"
   }
}, {
   "id": "Satellites", 
   "name": "Satellites", 
   "position": {
      "interpolationAlgorithm": "LAGRANGE", 
      "interpolationDegree": 5, 
      "referenceFrame": "INERTIAL", 
      "cartesian": [
         "2023-07-01T01:39:00Z", 2538.80175795354, -3614.57594511061, 5507.52360176926, 
         "2023-07-01T01:42:00Z", 1652.30508359447, -2889.72556218045, 6222.74718975222, 
         "2023-07-01T01:45:00Z", 705.508225186085, -2059.4191635614, 6710.27405051915, 
         "2023-07-01T01:48:00Z", -267.039490128933, -1153.94044132787, 6952.20021290686, 
         "2023-07-01T01:51:00Z", -1229.82808332692, -206.324144329006, 6939.55202392579
      ]
   }
}] 
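An indented file like the one above can also be written directly by passing indent to json.dump. A minimal sketch, using a temporary directory in place of data_dir and a stand-in document packet (both are assumptions for illustration):

```python
import json
import tempfile
from pathlib import Path

# Stand-in for d_czml; only the document packet is included here.
packets = [{"id": "document", "name": "simple", "version": "1.0"}]

# A temporary directory stands in for data_dir.
out_dir = Path(tempfile.mkdtemp())
out_file = out_dir / "test_czml.czml"

with open(out_file, "w") as f:
    json.dump(packets, f, indent=2)  # indent=2 pretty-prints the JSON

# Reading it back confirms the file holds the same structure.
loaded = json.loads(out_file.read_text())
```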

However nothing is rendering. CZML files do work as I have managed to get the satellite example working locally. I have been trying to work through the wiki files but I am still having trouble getting my head around the correct structure.

Right now I’d just like to get something rendered, can anyone spot something I’ve potentially missed?

Your CZML document does not define any graphical elements (billboard, point, model, etc.). So you are defining an entity whose position can be evaluated, but there is no visual representation of that entity.
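As a rough illustration of this answer, the sketch below adds a point graphic to an entity packet so that there is something to draw. The pixel size and colour are arbitrary choices, and the position list is shortened to two samples from the question.

```python
import json

packet = {
    "id": "Satellites",
    "name": "Satellites",
    # A graphical element: without one, the entity has a position
    # but nothing is drawn.
    "point": {
        "pixelSize": 10,                        # arbitrary size
        "color": {"rgba": [255, 0, 255, 255]},  # arbitrary colour
    },
    "position": {
        "referenceFrame": "INERTIAL",
        "cartesian": [
            "2023-07-01T01:39:00Z", 2.538802e+06, -3.614576e+06, 5.507524e+06,
            "2023-07-01T01:42:00Z", 1.652305e+06, -2.889726e+06, 6.222747e+06,
        ],
    },
}

# The document packet comes first, followed by the entity packet.
czml_text = json.dumps([{"id": "document", "version": "1.0"}, packet])
```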

Ah, modifying my code (butchering the sample.czml file and then adding it to my own) I’ve managed to put together “something” that plots!

My code:

true = True  # json.dumps writes Python True as the JSON boolean true
sources = ['SLR']
data_type = ['Reference', 'Prop']
d_czml = [
  {
    "id":"document",
    "name":"simple",
    "version":"1.0",
    "clock":{
      "interval":f"{df_tst['e'].iloc[0]}/{df_tst['e'].iloc[-1]}",
      "currentTime":df_tst['e'].iloc[0],
      "multiplier":60,
      "range":"LOOP_STOP",
      "step":"SYSTEM_CLOCK_MULTIPLIER"
    }
  }
]
for source in sources:
  for data_t in data_type: 
    df_tmp = df_tst[(df_tst['Source and Type'].str.contains(data_t))]
      # (df_tst['Source and Type'].str.contains(source)) & 
    d_czml.append(
      {
        "id":df_tmp['Source and Type'].iloc[0],
        "name":df_tmp['Source and Type'].iloc[0],
        "availability":[f"{df_tmp['e'].iloc[0]}/{df_tmp['e'].iloc[-1]}"],
        "billboard":{
          "eyeOffset":{
            "cartesian":[
              0,0,0
            ]
          },
          "horizontalOrigin":"CENTER",
          "image":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAADJSURBVDhPnZHRDcMgEEMZjVEYpaNklIzSEfLfD4qNnXAJSFWfhO7w2Zc0Tf9QG2rXrEzSUeZLOGm47WoH95x3Hl3jEgilvDgsOQUTqsNl68ezEwn1vae6lceSEEYvvWNT/Rxc4CXQNGadho1NXoJ+9iaqc2xi2xbt23PJCDIB6TQjOC6Bho/sDy3fBQT8PrVhibU7yBFcEPaRxOoeTwbwByCOYf9VGp1BYI1BA+EeHhmfzKbBoJEQwn1yzUZtyspIQUha85MpkNIXB7GizqDEECsAAAAASUVORK5CYII=",
          "pixelOffset":{
            "cartesian2":[
              0,0
            ]
          },
          "scale":1.5,
          "show":true,
          "verticalOrigin":"CENTER"
        },
        "label":{
          "fillColor":{
            "rgba":[
              255,0,255,255
            ]
          },
          "font":"11pt Lucida Console",
          "horizontalOrigin":"LEFT",
          "outlineColor":{
            "rgba":[
              0,0,0,255
            ]
          },
          "outlineWidth":2,
          "pixelOffset":{
            "cartesian2":[
              12,0
            ]
          },
          "show":true,
          "style":"FILL_AND_OUTLINE",
          "text":df_tmp['Source and Type'].iloc[0],
          "verticalOrigin":"CENTER"
        },
        "path":{
          "show":[
            {
              "interval":f"{df_tmp['e'].iloc[0]}/{df_tmp['e'].iloc[-1]}",
              "boolean":true
            }
          ],
          "width":1,
          "material":{
            "solidColor":{
              "color":{
                "rgba":[
                  255,0,255,255
                ]
              }
            }
          },
          "resolution":120,
          "leadTime":[
            {
              "interval":f"{df_tmp['e'].iloc[0]}/{df_tmp['e'].iloc[-1]}",
              "epoch":f"{df_tmp['e'].iloc[0]}",
              "number":[
                0,5537.546684141998,
                5537.546684141998,0
              ]
            },
          ],
          "trailTime":[
            {
              "interval":f"{df_tmp['e'].iloc[0]}/{df_tmp['e'].iloc[-1]}",
              "epoch":f"{df_tmp['e'].iloc[0]}",
              "number":[
                0,0,
                5537.546684141998,5537.546684141998
              ]
            }
          ]
        },
        "position": {
          "interpolationAlgorithm": "LAGRANGE",
          "interpolationDegree": 5,
          "referenceFrame": "INERTIAL",
          "cartesian": df_tmp[['e', 'x', 'y', 'z']].to_numpy().ravel().tolist()
        }
      }
    )

with open(data_dir / 'test_czml.czml', 'w') as f:
  f.write(json.dumps(d_czml))

Producing a result of:

Now that I have something plotted, I’ll work on getting a “trail” plotted. I’m unsure about the “path”, or whether I need an “ellipsoid” value in my dictionaries.

In the “path” value of CZML, in the “leadTime” and “trailTime”, what does the “number” value stand for?

[{
...
}, {
   "id": ...,
   ...
   "path": {
      ...
      "leadTime": [{
         "interval": ...,
         "epoch": ...,
         "number": [
            0,5537.546684141998,
            5537.546684141998,0
           ]
      }, {
         ...
      }],
      "trailTime": [{
         "interval": ...,
         "epoch": ...,
         "number": [
            0, 0,
            5537.546684141998, 5537.546684141998
           ]
      }, {
         ...
      }],
   ...
}]

What does the number value define? It looks like start/end time, quaternion, or epoch + Cartesian3 type data. What does that data define? The satellite position at the epoch?

The Path properties leadTime and trailTime are in seconds; see the Path page of the AnalyticalGraphicsInc/czml-writer wiki on GitHub.

These are also 1-for-1 with the Entity properties in CesiumJS, though the CZML doc may be more complete here; see PathGraphics in the Cesium documentation.

So in breaking down the number in leadTime and trailTime:

leadTime

"number": [
  0,5537.546684141998,
  5537.546684141998,0
]

can be broken down like:

"number": [
  t_1_delta_start, t_1_delta_end,
  t_2_delta_start, t_2_delta_end
]

where the numbers are times in seconds. Is t_1_delta_start the offset in time from the starting epoch, t_1_delta_end the projection forward in time towards the end epoch, t_2_delta_start the projection backward in time from the end epoch, and t_2_delta_end the end point?

To get t_1_delta_end and t_2_delta_start you can run something like epoch_max - epoch_min and convert to seconds?
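For reference, that subtraction could look like the following in pandas. This only shows the arithmetic on three of the epochs from the table; whether the full data span is the right value to use for leadTime is a separate question.

```python
import pandas as pd

epochs = pd.Series(pd.to_datetime([
    "2023-07-01 01:39:00+00:00",
    "2023-07-01 01:45:00+00:00",
    "2023-07-01 01:51:00+00:00",
]))

# Timestamp - Timestamp gives a Timedelta; total_seconds() converts it.
span_seconds = (epochs.max() - epochs.min()).total_seconds()
```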

trailTime

"number": [
  0, 0,
  5537.546684141998, 5537.546684141998
]

and

"number": [
  t_1_delta_start, t_1_delta_end,
  t_2_delta_start, t_2_delta_end
]

And trailTime is similar, but instead of projecting forward, do t_1_delta_start and t_1_delta_end reference the starting epoch in seconds, and t_2_delta_start and t_2_delta_end the distance in seconds to the end epoch?

So, I’ve slapped something together in Python to create a list of dictionaries of orbit times based on the orbit circumference:

import pandas as pd
import numpy as np

def calculate_periods(df:pd.DataFrame) -> pd.DataFrame:
  df['orbits'] = np.power((df[['x', 'y', 'z']] - df[['x', 'y', 'z']].shift(1)).prod(axis=1).abs(), 1/3).cumsum() // (2.25 * np.pi * np.power(df[['x', 'y', 'z']].prod(axis=1).abs(), 1/3).mean()) # surface circumference of the earth is about 40075 km, not necessarily orbit circumference
  
  df['t_delta'] = (df['Epoch'] - df['Epoch'].shift(1)).dt.total_seconds()
  
  df_r = df[['t_delta', 'orbits']].groupby(['orbits']).sum()

  df_r['e2'] = df[['e', 'orbits']].groupby(['orbits']).last()
  df_r['e1'] = df_r['e2'].shift(1)
  df_r.loc[df_r.index[0], 'e1'] = df['e'].iloc[0]  # .loc avoids chained assignment
  # df_r['e1'] = df[['e', 'orbits']].groupby(['orbits']).first()
  
  return df_r[['t_delta', 'e1', 'e2']]

def calculate_lead_times(df:pd.DataFrame) -> list[dict]:
  df_r = calculate_periods(df)

  return df_r.apply(lambda x:{
    "interval":f"{x['e1']}/{x['e2']}",
    "epoch":x['e1'],
    "number":[
      0, x['t_delta'],
      x['t_delta'], 0
    ]
  }, axis=1).tolist()

def calculate_trail_times(df:pd.DataFrame) -> list[dict]:
  df_r = calculate_periods(df)
  
  return df_r.apply(lambda x:{
    "interval":f"{x['e1']}/{x['e2']}",
    "epoch":x['e1'],
    "number":[
      0, 0,
      x['t_delta'], x['t_delta']
    ]
  }, axis=1).tolist()

It starts by (side note, ‘Epoch’ is a datetime object and ‘e’ is a string epoch in the format Cesium likes):

  • calculating the orbit length by cumulatively summing the absolute length of vectors between points
  • integer dividing it by the “mean orbit circumference”, which is calculated from the mean satellite radius vector. I know it should be 2 * pi * r, but a factor of 2 tends to fall a bit short because it is calculated from a mean value, whereas 2.25 gives a bit of overlap.
  • That gives a count of what orbit each point is in.
  • Calculate the change in time ‘t_delta’ in seconds
  • Group the data by orbits and:
    • Sum the ‘t_delta’ values by orbit so you get how long each orbit is in time
    • Get the last string epoch in the orbit.
    • Set the first string epoch by shifting the last string epoch.
    • Set the first element of the string epoch (previously a NaN object) to the first element in the string epochs.
  • Then I return the newly created dataframe in the order I prefer it.
  • The functions calculate_lead_times and calculate_trail_times just reformat the data into dictionaries and return the dictionaries as a list of dictionaries.
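The first three steps can also be sketched with the Euclidean norm instead of the cube root of the component product used above. This swaps in a different distance measure, so it is not the method from the post; it assumes a roughly circular orbit centred on the origin, and the function name count_orbits and the synthetic test orbit are hypothetical.

```python
import numpy as np
import pandas as pd

def count_orbits(df: pd.DataFrame) -> pd.Series:
    """Return a 0-based orbit index for each row (hypothetical helper)."""
    pos = df[["x", "y", "z"]].to_numpy(dtype=float)

    # Straight-line distance between consecutive points; 0 for the first row.
    steps = np.linalg.norm(np.diff(pos, axis=0), axis=1)
    path_length = np.concatenate([[0.0], np.cumsum(steps)])

    # Approximate one lap as the circumference at the mean orbital radius.
    circumference = 2 * np.pi * np.linalg.norm(pos, axis=1).mean()

    return pd.Series((path_length // circumference).astype(int), index=df.index)

# Synthetic circular orbit: 2.5 laps sampled every 3.6 degrees.
theta = np.deg2rad(np.arange(0, 900.1, 3.6))
demo = pd.DataFrame({
    "x": 7000.0 * np.cos(theta),
    "y": 7000.0 * np.sin(theta),
    "z": np.zeros_like(theta),
})
orbit_index = count_orbits(demo)
```

Because the chords between samples are slightly shorter than the arcs they span, the summed path length slightly undershoots the true arc length, so a lap boundary may land a sample late with coarse sampling.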

The result looks like this:

Edit:
The data from calculate_periods looks like:

orbits  t_delta  e1                    e2
0       5940     2023-07-01T01:39:00Z  2023-07-01T03:18:00Z
1       5940     2023-07-01T03:18:00Z  2023-07-01T04:57:00Z
2       5940     2023-07-01T04:57:00Z  2023-07-01T06:36:00Z

Perhaps it would help to review the section on Sampled Property Values on the CZML Structure page of the AnalyticalGraphicsInc/czml-writer wiki on GitHub.

Here’s the actual section from simple.czml.

      "leadTime":[
        {
          "interval":"2012-03-15T10:00:00Z/2012-03-15T10:44:56.1031157730031Z",
          "epoch":"2012-03-15T10:00:00Z",
          "number":[
            0,5537.546684141998,
            5537.546684141998,0
          ]
        },

Again, leadTime defines a property of type “number”. This means it effectively must define a function from “simulation time” to “number”. The function can be defined in many different ways, but in simple.czml it happens to be defined as a piece-wise function where the value is interpolated within each “piece”. In Cesium code terms, this is a CompositeProperty containing SampledProperty objects.

So, in the snippet above we are defining the value during the simulation time interval “2012-03-15T10:00:00Z/2012-03-15T10:44:56.1031157730031Z”. During that interval, the value of leadTime is computed by linearly interpolating [1] between exactly two samples (control points). These samples happen to be defined using epoch seconds, meaning that the times are seconds relative to the epoch value. Samples can be defined in other ways. The sample value at time 0 is 5537.546684141998 and the sample value at time 5537.546684141998 is 0.

Why are these two numbers the same? This is entirely by choice! Specifically, since this is displaying a satellite, we use the path property to draw the current orbit pass. The “amount” of the line in “front” of the satellite starts large and goes to zero as the satellite moves, and the “amount” “behind” starts at zero and becomes large. Then at the pass break, we switch to the next interval, and so on. The numeric values are the orbital period of the satellite, but that is specifically because we chose to display a single pass (the current pass) in this example.

Again, this choice is specific to visualizing satellite orbits. For ground/sea/etc. vehicles a constant trailTime value is more conventional.
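To make the [time, value, time, value] reading concrete, here is a minimal sketch that evaluates the leadTime and trailTime samples quoted in this thread with plain linear interpolation. The helper sample_number is hypothetical and only mimics the behaviour described above; it is not Cesium code.

```python
import numpy as np

def sample_number(number: list[float], t: float) -> float:
    """Linearly interpolate a CZML-style [time, value, time, value, ...] list.

    Hypothetical helper for illustration; t is seconds since the epoch.
    """
    times = number[0::2]   # even positions: seconds after the epoch
    values = number[1::2]  # odd positions: the value at that time
    return float(np.interp(t, times, values))

period = 5537.546684141998
lead_time = [0, period, period, 0]   # leadTime samples quoted above
trail_time = [0, 0, period, period]  # trailTime samples quoted above

halfway = period / 2
lead_at_half = sample_number(lead_time, halfway)
trail_at_half = sample_number(trail_time, halfway)
```

At any time within the interval, the lead and trail values add up to the period, which is why the drawn path covers one pass.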

[1] Linear is the default interpolation algorithm; see the InterpolatableProperty page of the AnalyticalGraphicsInc/czml-writer wiki on GitHub.