Hello, we have several successful nDisplay projects with properly working nDisplay configurations. (5 machines, 5 displays).
Upon adding a Cesium tileset to a working nDisplay project - in this case the Google Photorealistic 3D Tiles - the nDisplay portion of the project no longer works. The project runs fine in the editor and as an isolated single build, but the nDisplay remote displays crash upon creation (or shortly thereafter).
I am looking for some guidance on integration of these two elements (Rendering Cesium tiles + remote machine nDisplay configurations).
nDisplay loads an independent version of the same project and syncs frames with a Primary node, so each Unreal project is its own instance of the same Cesium-based render project.
We have 5 machines driving 5 4K displays simultaneously.
Questions:
Is there a simple fix I am overlooking that would allow a headless build to run remotely?
Has anyone built an nDisplay example with Cesium that may have some insight or issues they came across and resolved?
Do I need to do anything specific on each remote machine (logged into Cesium ION with the same account) to allow for the simultaneous instances to run and not be in conflict?
Any help would be greatly appreciated! Happy to share logs, images, etc. to further it along.
Our setup:
A head machine that runs the nDisplay Switchboard and launches the other 5 remote Unreal projects
5 machines that are each displaying a viewport of the same scene within the same project (N, E, S, W and down)
Tested #1: Each machine is referencing the same networked folder, logged in with the same Cesium Ion account and same Cesium Ion token
Tested #2: Each machine is referencing a local version of the same project (not shared), logged in with the same Cesium Ion account and same Cesium Ion token
Tested #3: Each machine is referencing a local version of the same project (not shared), logged in with the same Cesium Ion account and using different Cesium Ion tokens for the same project
Tested with all available tilesets and Unreal 5.2 & 5.3
All sample projects run smoothly as independent projects on each machine, no tile rendering issues
We have gone through all available logs on the nDisplay and Unreal side, but could use some more debugging help from the Cesium side.
Is there an issue pulling multiple tiles from multiple machines simultaneously with the same app and token?
Is there some tile rendering issue that is triggered on nDisplay start?
We have pretty much exactly the same setup, i.e. one primary node and five render nodes in our nDisplay cluster.
Our Unreal projects are on the same network share and all 6 nodes are accessing this one project.
Unfortunately we have not run into the problems you encountered.
Do all other nDisplay projects work well for you?
What I would check is:
Make sure that every machine running Unreal has all plugins used in the project(s) installed, in exactly the same versions - not just Cesium but the others as well
Try with locally hosted Cesium tiles. The easiest way would probably be to create or download Cesium tilesets, put them in a folder, and serve them with a simple Python server. Then you can add the tileset from URL with http://X.X.X.X:8000/tileset.json or similar
This way you could check if the problem is with Cesium Ion (i.e. the account or tokens) or with something else.
Be sure to use an IP address of the main node that all cluster nodes can access
I’m using this python script to do so:
import http.server
import socketserver
import os

CWD = os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, "CesiumData")
PORT = 8000
ADDRESS = "X.X.X.X"  # an IP address of your primary node that all cluster nodes can access

class Handler(http.server.SimpleHTTPRequestHandler):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, directory=CWD, **kwargs)

with socketserver.TCPServer((ADDRESS, PORT), Handler) as httpd:
    print(f"Server started at {ADDRESS}:{PORT}")
    httpd.serve_forever()
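Before testing in Unreal, it can be worth confirming that a node can actually fetch the tileset over HTTP. Here is a self-contained smoke test of the same serving approach - it spins up the handler on localhost with a dummy tileset.json (the file contents and temp directory are just placeholders for this check); swap in your primary node's address and real tileset folder when checking across the network:

```python
import http.server
import json
import os
import socketserver
import tempfile
import threading
import urllib.request

# Create a dummy tileset.json to serve (placeholder content for the check).
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "tileset.json"), "w") as f:
    json.dump({"asset": {"version": "1.1"}}, f)

class Handler(http.server.SimpleHTTPRequestHandler):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, directory=tmpdir, **kwargs)

# Port 0 lets the OS pick a free port, so this never collides with a
# server that is already running.
httpd = socketserver.TCPServer(("127.0.0.1", 0), Handler)
port = httpd.server_address[1]
threading.Thread(target=httpd.serve_forever, daemon=True).start()

# Fetch the tileset exactly as a cluster node would.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/tileset.json") as resp:
    tileset = json.load(resp)
print(tileset["asset"]["version"])  # prints "1.1"

httpd.shutdown()
```

If this works from every node against the primary node's address but the tiles still fail inside nDisplay, the problem is likely not network reachability.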
Hello! Really appreciate the reply - added some more to our testing process and good to know that someone else is running from the same project location.
All other nDisplay projects work well in our environment, so this is specific to the Cesium project.
In fact, we can add and remove the Cesium Tileset actor (any of the provided tilesets): when it is removed, the project runs just fine in nDisplay; add it back in and it crashes. Other Cesium actors work (sun, pawn, etc.). Very strange.
For sanity check, we duplicated everything again to make sure all versions of plugins and Unreal matched specifically.
Loaded a blank tileset scene without a specific API call and it still crashed
Removed the blank tileset and it worked again
We are working on the local tileset through local URL option now and will report back for future knowledge. Thanks for the code snippet.
Is there an issue pulling multiple tiles from multiple machines simultaneously with the same app and token?
Generally, no. However, the Google Photorealistic 3D Tiles have a fairly tight rate limit for the initial request. If all 6 machines simultaneously tried to make their initial request to the Google tile server, some of them might fail.
Is there some tile rendering issue that is triggered on nDisplay start?
Not to my knowledge.
Since you’re crashing, the next step is to get the call stack of the crash. We’re unlikely to be able to help you without that. That means building a Debug (or at least Development) configuration.
@jmcjedi Interestingly, I can now confirm that this happens for us as well. This was the first time that we added a Cesium tileset to a project that uses the Cesium for Unreal 2.X plugin! Previously with Cesium for Unreal 1.3 it worked. That’s why I told you in my previous answer that I couldn’t confirm the problem, I’m sorry.
So there probably have been some changes in tile loading from 1.3 to 2.0 that nDisplay doesn’t like.
Have you worked out a solution in the meantime?
It seems to have something to do with the nDisplay rsync? At least we are getting these messages in the Switchboard debug log when the crash happens.
RsyncServer.monitor: [log] 2024/03/08 12:32:02 [1166] 10.0.10.22 is not a known address for "node2": spoofed address?
[12:32:02][I]: RsyncServer.monitor: [log] 2024/03/08 12:32:02 [1166] connect from UNKNOWN (10.0.10.22)
[12:32:02][I]: RsyncServer.monitor: [log] 2024/03/08 12:32:02 [1166] rsync allowed access on module device_logs from UNKNOWN (10.0.10.22)
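For what it's worth, rsync daemons typically print that "spoofed address?" warning when the reverse DNS lookup of the connecting IP does not resolve back to the hostname the daemon expects for that client. A quick way to check whether forward and reverse DNS agree for a node (NODE_NAME and NODE_IP here are assumptions taken from the log above; replace with your own):

```python
import socket

NODE_NAME = "node2"     # assumed hostname from the log message above
NODE_IP = "10.0.10.22"  # assumed IP from the log message above

def names_match(hostname, ip):
    """Return True if forward and reverse DNS agree for this node."""
    try:
        forward = socket.gethostbyname(hostname)  # hostname -> IP
        reverse = socket.gethostbyaddr(ip)[0]     # IP -> hostname
    except OSError:
        # Either lookup failed entirely, which would also explain the warning.
        return False
    return forward == ip and reverse.split(".")[0] == hostname

# Run on the primary node, e.g.: names_match(NODE_NAME, NODE_IP)
```

If this returns False for a node, fixing the DNS entries (or the hosts file) so both directions agree might silence the warning; whether that is actually the cause of the crash is a separate question.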
@arbertrary That is the exact same log message we get before the crash as well - from the individual node we have a handful of other calls that seem to error out, but the rsync call is always the one that causes the crash - so I can confirm. For our initial prototype, we went away from nDisplay and did our own syncing through other network means and a main timeline. However, we could get away with this because our experience did not have blended projectors or side-by-side displays. We could hide some millisecond de-sync issues.
We are back on the nDisplay use case though and are working through it. Our listener has these errors and the final rsync call that crashes:
warning: PresentMon requires elevated privilege in order to query processes started
on another account. Without it, those processes will be listed as '<error>'
and they can't be targeted by -process_name nor trigger -terminate_on_proc_exit.
error: failed to start trace session (access denied).
PresentMon requires either administrative privileges or to be run by a user in the
"Performance Log Users" user group. View the readme for more details.
LogSwitchboard: Display: Process 12192 (unreal) exited with returncode: 0
LogSwitchboard: Display: Received start command
LogSwitchboard: Display: Started process 8548: C:\Program Files\Epic Games\UE_5.3\Engine\Extras\ThirdPartyNotUE\cwrsync\bin\rsync.exe "/cygdrive/Z/Development/SpaceElevator_nDisplay/Saved/Logs/Node_0.log" "rsync://192.168.1.133:8730/device_logs/"
LogSwitchboard: Display: Process 8548 (retrieve) exited with returncode: 0
Have been looking through log files on all ends and will update with more information as we work through it - we have to solve this on our end and will share anything we come across.
@arbertrary Update: we removed and uninstalled all old versions of the Cesium plugin and unchecked it from our projects, then reinstalled specifically for 5.3 only (we were previously using 5.2 and are updating to 5.3). Upon making sure that both the primary node (where the editor/switchboard are running) and our test node (node_0) were fully updated and computers restarted, we were able to get the project to run. We are using Unreal version 5.3.2 and Cesium plugin version 2.4.0
That seems to have fixed it for us as well, thanks for the tip. I just updated the Cesium plugin on all our machines and it seems to work fine now. I haven't tested it extensively yet, but it looks good.
Updating is sometimes scary in scenarios like our two setups because of the fear of breaking an otherwise running system, so it's not always the first thing that comes to mind when debugging.