Remote Data Visualization with ParaView

This article describes how to use ParaView for remote data visualization. Two examples are provided. The first shows the simplest option: using ParaView without going through a batch system, on SciberQuest's cluster nashi. The second shows how ParaView can be run interactively through a batch system, such as the one on NASA's Pleiades cluster.

Introduction: ParaView in client/server mode.
ParaView has three major components: the client, the data server, and the render server. For remote interactive visualization you will run the client, which is where the user interface resides, on your workstation, and the data server on the cluster. The client connects only to the first server process, which relays data and commands between the client and the other server processes. ParaView's components connect to each other over sockets at an arbitrary port number that you choose. However, there is very likely a firewall between your workstation and the cluster that blocks most ports. Therefore you will have to establish an ssh tunnel that pipes the traffic through the ssh port, which is left open on most firewalls.

Example 1: Without the batch system
For this example we are going to ignore the batch system.

Here is the overview of the process:
 * 1) Create an ssh tunnel to a specific compute node on the cluster using arbitrary ports.
 * 2) Launch MPI process managers on the cluster.
 * 3) Launch a ParaView data server using mpiexec on the cluster.
 * 4) Launch a ParaView client on the workstation.
 * 5) Connect to the data server.
 * 6) Visualize!
 * 7) Clean up MPI process managers running on the cluster.

The ssh tunnel
Establishing the ssh tunnel between the workstation and the cluster is the first step in remote data visualization using ParaView. We will use PuTTY, a freely available ssh implementation for Windows; however, any ssh implementation should work fine.

Go to the Start menu and launch PuTTY. The following figure shows the Session configuration panel. Here you will enter username@nashi-submaster.ucsd.edu, and later save the configuration for use next time.

To configure the ssh tunnel, navigate to the Connection->SSH->Tunnels configuration panel, as shown in the following figure. You will set the Source port and Destination fields to values of your choosing. The destination also includes a hostname that can be used to tunnel into the cluster's private network. We use this to avoid running on nashi's head node, or to connect to an assigned node from a batch job. For example, set the source port to 11111 and the destination to n001:44444, then click the Add button. The source port is used by the ParaView client running on your workstation. The destination hostname:port pair specifies which host, relative to nashi, the ParaView data server will run on, and the port on which it will wait for the connection from the client. The port numbers can be anything, and the hostname should be one of the compute nodes. Make a note of your choices, as they will be passed to ParaView when it is run. Navigate back to the Session configuration panel and save the configuration so you don't have to do this each time you wish to connect. In the future you may Load the named configuration. Finally, click the Open button in the lower portion of the window and log into nashi.
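For readers who use a command-line OpenSSH client rather than PuTTY, the same tunnel can be sketched with the -L (local port forwarding) option. This is a minimal sketch, not the configuration used to prepare this article: the variable names are illustrative, and the values are simply the example values from the text above. The script only prints the command so it can be reviewed before use.

```shell
#!/bin/sh
# Example values from the text above; replace them with your own choices.
SOURCE_PORT=11111                          # port the ParaView client uses on the workstation
DEST_HOST=n001                             # compute node where the data server will run
DEST_PORT=44444                            # port the data server will use
LOGIN=username@nashi-submaster.ucsd.edu    # login host, as entered in the Session panel

# -L forwards SOURCE_PORT on the workstation to DEST_HOST:DEST_PORT,
# with DEST_HOST resolved relative to the login host.
TUNNEL_CMD="ssh -L ${SOURCE_PORT}:${DEST_HOST}:${DEST_PORT} ${LOGIN}"

# Print the command for review; run it yourself to open the tunnel and log in.
echo "$TUNNEL_CMD"
```

Running the printed command opens a login session on the cluster and keeps the tunnel up for as long as that session stays open, which mirrors what the saved PuTTY configuration does.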

Start the ParaView data server.
At this point you should have a PuTTY session open and connected to nashi, with an ssh tunnel established. Now you will start the ParaView data server that has been configured for the InfiniBand network. Execute the following sequence of commands at the bash prompt, replacing n001 and 44444 with the values you used in the previous step.
 * 1) ssh n001
 * 2) module load PV3-3.7-IB
 * 3) mpdtrace
 * 4) mpdboot --totalnum=10 --file=/opt/mpich2/osu_ch3_mrail_vapi-gnu/share/nashi.hosts --mpd=/opt/mpich2/osu_ch3_mrail_vapi-gnu/bin/mpd --chkup
 * 5) mpiexec -np 8 pvserver --use-offscreen-rendering --server-port=44444
 * 6) mpdallexit

If all is well, ParaView should have written a message to the terminal indicating that it is waiting for a connection from the client on the given port. It will wait for the connection, execute the visualization pipelines you create once connected, and finally shut itself down when you disconnect. Now you are ready to start your client.

A few notes about the preceding sequence of commands. We first ssh to n001 because that is where the far end of the tunnel we created is. The module command configures your environment for this ParaView build, including all dependencies such as MPI and HDF5. The first mpdtrace command should yield an error about no mpds running; if it does not, either skip the following mpdboot command or first issue an mpdallexit. The mpdboot option --totalnum controls how many nodes are available to the following mpiexec commands. The -np argument of the mpiexec command determines how many processes will actually run. When -np is greater than --totalnum, processes are allocated in a round-robin manner to the available MPDs. For example, if for a given run --totalnum=2 and -np=4, then the two nodes will each have two processes. The --server-port option tells ParaView what port to use when establishing the socket connection to the client. It should be set to the port number you used in the Destination field of your ssh tunnel configuration above. IMPORTANT: when you are all done, make sure to issue an mpdallexit to bring the MPD ring down.
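The round-robin allocation described above can be illustrated with a few lines of shell arithmetic. This is only a sketch of the idea, assuming plain round-robin placement: the function name and the node indexes are illustrative, and the actual order in which MPD assigns ranks to hosts may differ.

```shell
#!/bin/sh
# Print which node each MPI rank would land on under plain round-robin placement.
round_robin() {
    totalnum=$1   # nodes in the MPD ring (mpdboot --totalnum)
    np=$2         # processes requested (mpiexec -np)
    rank=0
    while [ "$rank" -lt "$np" ]; do
        # Node indexes are illustrative; they are not real hostnames.
        echo "rank $rank -> node $(( rank % totalnum ))"
        rank=$(( rank + 1 ))
    done
}

# The example from the text: two nodes, four processes.
round_robin 2 4
```

With --totalnum=2 and -np=4, ranks 0 and 2 fall on one node and ranks 1 and 3 on the other, giving two processes per node as stated above.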

Start and connect the ParaView client.
Now that we have the ssh tunnel established and a ParaView data server started on nashi, it's time to start the client on our local workstation and connect to the server. Go to the Start menu and launch ParaView-3.7. The BOV reader plugin that is used to read SciberQuest datasets has been built against this version and should load automatically when ParaView launches.

To connect the ParaView client to nashi using the ssh tunnel, click the connect button near the upper left corner of ParaView's main window, as shown in the following figure, or use the menu option File->Connect.

The Choose Server dialog will open, from which you can select an existing configuration or create a new connection. Each connection stores specific information about the ports and hostname the client should use when creating the socket connection to the data server. You will need one entry per tunnel configuration, in other words per combination of Source and Destination port used in the previous step. If you are running ParaView for the first time the list will be empty. You will need to create a connection. Do so by clicking the Add button.



This will open the Configure Server dialog, shown in the following figure. The ssh tunnel we established above connects port 11111 on our workstation to n001:44444 relative to nashi-submaster, in other words port 44444 on compute node n001. Therefore we will set the Host field to localhost and the Port field to 11111 in the dialog, and name the connection port-11111. Keep in mind that this information pertains to the workstation end of the tunnel.



Once the workstation port has been set, click the Configure button. This will open the Startup type dialog. Select Manual from the drop-down menu, as shown in the following figure.



Click the Save button, which will bring you back to the Choose Server dialog. Highlight the port-11111 entry and click the Connect button. If all goes well, the data server will print a message in the ssh terminal indicating that the client connected successfully, and the client will list a connection named cs://localhost:11111 in the Pipeline Browser.

All operations, such as pipeline execution and browsing for datasets, will now execute on nashi.

Opening SciberQuest Datasets
The plugin that understands SciberQuest datasets should be loaded automatically on both the ParaView client and server. To verify this, open the Tools->Manage Plugins/Extensions dialog, as shown in the following figure. If all is well you will see an entry for vtkBOVReader in both the Remote Plugin and Local Plugin lists.



If the plugin has been loaded on both the client and the server, ParaView will automatically use it to read SciberQuest data. SciberQuest datasets are bricks of values, with one brick for each scalar array and one brick for each vector component per time step. The step number is encoded in the file name of each brick. The reader plugin expects the bricks and a metadata file to be located in the same directory. Further, ParaView uses the file extension of the metadata file to choose the appropriate reader or plugin. This means that to load your dataset, you must first create the metadata file in the directory containing the bricks. The metadata file is very simple: it must contain the dimensions of the bricks and be named with the ".bov" extension. An example metadata file named run.bov is as follows.
 * 1) BOV reader metadata file

nx=514, ny=514, nz=514
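Creating the metadata file from the shell on the cluster might look like the following sketch. The function name write_bov_metadata is hypothetical. The sketch writes only the dimensions line shown above and assumes that line is all the reader needs, as the description of the file suggests.

```shell
#!/bin/sh
# Write a minimal run.bov metadata file into the directory that holds the bricks.
# write_bov_metadata is an illustrative helper, not part of ParaView or the plugin.
write_bov_metadata() {
    dir=$1    # directory containing the bricks
    nx=$2     # brick dimensions, as in the example above
    ny=$3
    nz=$4
    printf 'nx=%s, ny=%s, nz=%s\n' "$nx" "$ny" "$nz" > "$dir/run.bov"
}
```

For the example above, you would call write_bov_metadata with the brick directory followed by 514 514 514.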

To load this dataset, use the File->Open menu item, navigate to the folder containing run.bov, and select it. Only the metadata is read at this point, not the bricks; however, the available arrays and timesteps will be identified.

The user interface for the BOV reader is shown in the following figure. The arrays that have been found are listed in the Arrays list box; simply check the arrays to be read. Below the Arrays list box is a sub-setting interface where cell indexes can be specified. By default the entire extent is used; however, interactivity can be improved if the amount of data read in is reduced using the sub-setting interface.
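To get a feel for how much sub-setting can reduce the amount of data read, the cell counts can be compared with simple shell arithmetic. This is a rough sketch under stated assumptions: the function name is illustrative, both end indexes of each range are assumed to be inclusive, the full extent is assumed to run from 0 to 513 in each direction for the run.bov example, and the 128-cell subset is a made-up example.

```shell
#!/bin/sh
# Number of cells in an index range, assuming both end indexes are inclusive.
subset_cells() {
    # arguments: i0 i1 j0 j1 k0 k1
    echo $(( ($2 - $1 + 1) * ($4 - $3 + 1) * ($6 - $5 + 1) ))
}

full=$(subset_cells 0 513 0 513 0 513)   # entire extent of the 514x514x514 example
part=$(subset_cells 0 127 0 127 0 127)   # hypothetical 128x128x128 corner

echo "full extent: $full cells per brick"
echo "subset:      $part cells per brick"
```

The count applies to each brick that is read, so the saving is multiplied by the number of arrays checked in the Arrays list box.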



Specific timesteps in the dataset should be selected using the Animation buttons and the Time entry box near the upper right of the ParaView client window, as shown in the following figure. You may save time by selecting the desired timestep before clicking Apply, since the timesteps are available as soon as the metadata file is processed.



TCP
ParaView has also been compiled for the TCP network. Keep in mind that this build will not scale as well as the InfiniBand build.


 * 1) ssh n001
 * 2) module load PV3-base
 * 3) mpdtrace
 * 4) mpdboot --totalnum=10 --file=/opt/mpich2-1.0.8p1/share/nashi.hosts --mpd=/opt/mpich2-1.0.8p1/bin/mpd --chkup
 * 5) mpiexec -np 8 pvserver --use-offscreen-rendering --server-port=44444
 * 6) mpdallexit
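The InfiniBand and TCP command sequences differ only in the module name and the MPICH2 install prefix, so they can be generated from one small script. This is a sketch for illustration: the function and variable names are hypothetical, and the script only prints the commands (it does not run them), so that they can be checked before being typed on the compute node.

```shell
#!/bin/sh
# Select the module name and MPICH2 prefix for a build: "ib" (InfiniBand) or "tcp".
# Values are taken from the two command sequences in this article.
pv_build_settings() {
    case "$1" in
        ib)  PV_MODULE=PV3-3.7-IB; MPICH_PREFIX=/opt/mpich2/osu_ch3_mrail_vapi-gnu ;;
        tcp) PV_MODULE=PV3-base;   MPICH_PREFIX=/opt/mpich2-1.0.8p1 ;;
        *)   echo "unknown build: $1" >&2; return 1 ;;
    esac
}

# Print the command sequence for a build; nothing is executed.
pv_commands() {
    pv_build_settings "$1" || return 1
    nodes=$2     # value for mpdboot --totalnum
    nprocs=$3    # value for mpiexec -np
    port=$4      # value for pvserver --server-port
    echo "module load ${PV_MODULE}"
    echo "mpdtrace"
    echo "mpdboot --totalnum=${nodes} --file=${MPICH_PREFIX}/share/nashi.hosts --mpd=${MPICH_PREFIX}/bin/mpd --chkup"
    echo "mpiexec -np ${nprocs} pvserver --use-offscreen-rendering --server-port=${port}"
    echo "mpdallexit"
}

# Print the TCP sequence with the example values used above.
pv_commands tcp 10 8 44444
```

Calling pv_commands with ib instead of tcp prints the InfiniBand sequence with the same node count, process count and port.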

Performance

 * 1) Configure to use PVFS
 * 2) When reading one brick on 4 nodes, I have noticed that of the 5-second ParaView round trip, only 1 second is spent in my reader. What is going on during the other 4 seconds?

Example 2: An interactive run through the batch system.
To run ParaView interactively via a cluster's batch system, first log in without creating a tunnel, then start the batch job. When your job runs, open a second PuTTY session with ports forwarded to the assigned compute node, just as above. The main difference is that we can't create the tunnel until the job actually runs, since we won't know which nodes we will be assigned until then.
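As a rough sketch of the second step, the following helper prints an OpenSSH tunnel command for the first node assigned to a job. It rests on assumptions not stated in this article: that the batch system is PBS (where an interactive job is typically requested with qsub -I and the assigned hosts are listed, one per line, in the file named by $PBS_NODEFILE), and that the first server process runs on the first listed node. The function name and the login placeholder are hypothetical.

```shell
#!/bin/sh
# Print the tunnel command for the first node listed in a batch node file.
# Assumes a PBS-style node file with one hostname per line.
tunnel_cmd_for_job() {
    nodefile=$1      # inside a PBS job this would be "$PBS_NODEFILE"
    local_port=$2    # workstation port for the ParaView client
    server_port=$3   # port passed to pvserver --server-port
    login=$4         # user@login-host for the cluster (placeholder)
    first_node=$(head -n 1 "$nodefile")
    echo "ssh -L ${local_port}:${first_node}:${server_port} ${login}"
}

# Inside a running PBS job, print the command to use from the workstation.
if [ -n "${PBS_NODEFILE:-}" ]; then
    tunnel_cmd_for_job "$PBS_NODEFILE" 11111 44444 username@login-host
fi
```

With PuTTY, the same information goes into the Tunnels panel instead: the printed node name and server port form the Destination, and the local port is the Source port.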