SciVis SBIR Phase II


**This is a DRAFT and is likely to evolve.**

=Phase II Proposal Assessment=

Documentation
A combination of Doxygen and Wiki pages will be used to document the developed classes. All submitted classes will have Doxygen-style comments in their header files. Doxygen HTML pages should be generated on a regular basis and posted on a web site hosted by SciberQuest. Big-picture documentation will be provided on a SciberQuest-hosted Wiki.

Versioning
We will use a Subversion server hosted by SciberQuest for source code version control.

Build
We will use CMake so that our project can be seamlessly and automatically configured, built, and tested on all platforms of interest.

Testing
We will use the CTest portion of the CMake build tool for software quality control, validation and testing. We will configure a nightly test run on all platforms of interest. We will establish a nightly dashboard for reporting of test results, through which we can monitor performance and correctness over time, track code testing coverage, and identify memory leaks. Each contributed class will be submitted to the subversion repository with a test.

Software Process Summary
The labor breakdown is described in the following table. The units are man-months and man-hours. Note: The cost of developing specific CMake configuration files, validation and quality control tests, and Doxygen and Wiki documentation is included in the estimated cost of developing the specific source code components.

Memory Management
The proposal specs out a paging scheme that relies on intercepting memory read/write/malloc/realloc/free operations on subclasses of vtkDataArray. The main point of contact would be in vtkDataArrayTemplate. We would insert the OOC paging logic into the vtkDataArrayTemplate::Get/SetValue, InsertNextValue, Get/SetTuple, and InsertNextTuple methods. We would insert the OOC memory management API calls in vtkDataArrayTemplate::Allocate, ResizeAndExtend, and DeepCopy. We need to configure the paging/swap mechanism in the vtkDataArrayTemplate constructor (or in subclasses) so that VTK algorithms which create data arrays on the heap during the course of their normal functionality don't end up with all of the data in memory. Containers for geometry and topology information will also need modifications similar to those already described for vtkDataArray and its subclasses. These containers include vtkPoints, vtkCellArray, vtkCellType, and vtkCellLinks.
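
To make the idea of intercepting element access concrete, the following is a minimal sketch in Python rather than VTK's C++. The class `PagedArray`, its page size, and its LRU eviction policy are all hypothetical choices made for illustration; nothing here is VTK API. Every read or write goes through a page lookup, which is the per-access cost that intercepting Get/SetValue would introduce.

```python
import array
import os
import tempfile
from collections import OrderedDict


class PagedArray:
    """Hypothetical file-backed array of doubles with a small LRU page cache."""

    ITEM = 8  # bytes per double

    def __init__(self, path, length, page_size=4, max_pages=2):
        self.length = length
        self.page_size = page_size
        self.max_pages = max_pages
        self.pages = OrderedDict()  # page index -> [array('d'), dirty flag]
        self.page_faults = 0
        self.file = open(path, "w+b")
        self.file.truncate(length * self.ITEM)  # zero-filled backing store

    def _page(self, index):
        """Return the cached page holding `index`, loading it from disk if needed."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        p = index // self.page_size
        if p in self.pages:
            self.pages.move_to_end(p)  # mark as most recently used
        else:
            self.page_faults += 1
            if len(self.pages) >= self.max_pages:
                self._evict()
            count = min(self.page_size, self.length - p * self.page_size)
            self.file.seek(p * self.page_size * self.ITEM)
            data = array.array("d")
            data.frombytes(self.file.read(count * self.ITEM))
            self.pages[p] = [data, False]
        return self.pages[p]

    def _evict(self):
        """Drop the least recently used page, writing it back if modified."""
        p, (data, dirty) = self.pages.popitem(last=False)
        if dirty:
            self.file.seek(p * self.page_size * self.ITEM)
            self.file.write(data.tobytes())

    def get_value(self, index):
        return self._page(index)[0][index % self.page_size]

    def set_value(self, index, value):
        page = self._page(index)
        page[0][index % self.page_size] = value
        page[1] = True

    def close(self):
        while self.pages:
            self._evict()
        self.file.close()


path = os.path.join(tempfile.mkdtemp(), "array.bin")
a = PagedArray(path, length=10)
for i in range(10):
    a.set_value(i, float(i))
total = sum(a.get_value(i) for i in range(10))
faults = a.page_faults
a.close()
```

With ten elements, four elements per page, and room for two pages, at most eight doubles are resident at any time; the rest live in the backing file.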

Algorithms and Filters
Each VTK object and algorithm which we plan to make use of, directly or indirectly, will need to be examined and potentially modified. This is because of the alternate Get/Set Pointer/Data API, which allows developers to manipulate pointers to the underlying data directly, bypassing the Get/Set Value API. It is common practice to use the pointer API for efficiency. To get a feel for the scope of what is involved, a grep of the VTK sources shows that there are 451 files with at least one match to the following regexp: [SG]et.*Pointer. Of those 451, the most important are going to be those in the directories Common (62 matches), Filtering (36 matches), Graphics (63 matches), Rendering (48 matches), and IO (62 matches). We would likely not attempt to modify all of the IO classes; however, the majority of the classes in Common, Filtering, Graphics, and Rendering would likely need modification. It is likely that somewhere in the neighborhood of 50 to 100 classes would need modification to get a fairly minimal subset of VTK functionality ported.
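
The per-directory survey described above could be reproduced with a short script along the following lines. This is a sketch, not the command actually used for the counts quoted here: the function name `count_matching_files` is hypothetical, and the demonstration runs against a tiny made-up source tree rather than a VTK checkout.

```python
import os
import re
import tempfile
from collections import Counter

POINTER_API = re.compile(r"[SG]et.*Pointer")


def count_matching_files(root):
    """Count, per top-level directory, the files with at least one matching line."""
    counts = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as f:
                if any(POINTER_API.search(line) for line in f):
                    # Files directly under root are keyed by their own name.
                    top = os.path.relpath(path, root).split(os.sep)[0]
                    counts[top] += 1
    return counts


# Build a small stand-in source tree (hypothetical file names and contents).
root = tempfile.mkdtemp()
samples = {
    ("Common", "a.cxx"): "float* p = array->GetPointer(0);\n",
    ("Common", "b.cxx"): "double v = array->GetValue(0);\n",
    ("IO", "c.cxx"): "array->SetVoidPointer(buffer, n, 1);\n",
}
for (directory, name), text in samples.items():
    os.makedirs(os.path.join(root, directory), exist_ok=True)
    with open(os.path.join(root, directory, name), "w") as f:
        f.write(text)

counts = count_matching_files(root)
```

Pointing `count_matching_files` at the root of a VTK source tree would give the per-directory file counts for that version of VTK.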

Readers
The behavior of VTK readers would need to be altered so that data is not read into memory during the pipeline update. Readers would instead construct OOC objects that manage the paging of the data stored on disk and insert them into the appropriate VTK data, geometry, and topology arrays. Various metadata, such as extents, bounds, and ranges, will also need to be provided by readers. We will initially support only one or two key readers.

General Approach to Integration
Even assuming that the entire VTK library were ported successfully, the modifications described above are not likely to be acceptable for submission into the VTK trunk, due to performance concerns. However, we need not fork VTK to make this work. Instead we can make use of VTK's object factory mechanism. This will allow us to seamlessly swap our OOC-modified classes in or out at run time. We will need an ongoing effort, perhaps once a year, to ensure compatibility with the latest stable release of VTK.
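
The object factory idea can be illustrated with a small, language-neutral sketch. The classes below (`ObjectFactory`, `DataArray`, `OOCDataArray`) are hypothetical stand-ins written in Python; they mimic the general pattern of a creation function consulting a registry of overrides, and are not VTK's actual factory interface.

```python
class ObjectFactory:
    """Hypothetical registry mapping a class name to the constructor to use."""

    _overrides = {}

    @classmethod
    def register_override(cls, name, constructor):
        cls._overrides[name] = constructor

    @classmethod
    def unregister_override(cls, name):
        cls._overrides.pop(name, None)

    @classmethod
    def create(cls, name, default):
        # Use the registered override if there is one, else the default class.
        return cls._overrides.get(name, default)()


class DataArray:
    @staticmethod
    def new():
        # Client code always asks for "DataArray"; the factory decides what it gets.
        return ObjectFactory.create("DataArray", DataArray)

    def describe(self):
        return "in-core"


class OOCDataArray(DataArray):
    def describe(self):
        return "out-of-core"


before = DataArray.new().describe()
ObjectFactory.register_override("DataArray", OOCDataArray)
during = DataArray.new().describe()
ObjectFactory.unregister_override("DataArray")
after = DataArray.new().describe()
```

The point of the pattern is that code calling `DataArray.new()` does not change when the override is registered or removed.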

Performance
There is a performance concern with this approach, namely that inserting logic into the Set/Get Value API has the potential to significantly degrade the overall performance of VTK. VTK data containers are performance-critical sections of code. Each element of data stored in the data array, points, and cell array types mentioned above will potentially be accessed multiple times by each filter of a given visualization pipeline, with array lengths typically on the order of tens of thousands to hundreds of thousands of elements. Small changes have a big impact here. One property of the current implementation is that the Set/Get Value API is equivalent to pointer access if/when the calls are inlined. This can often provide performance comparable to that of direct pointer manipulation.

VTK OOC Integration Summary
The labor breakdown is described in the following table. The units are man-months and man-hours.

Note: The VTK Integration line item accounts for a reasonable port of algorithms and filters. This is not a full port, but should provide a reasonable subset of functionality to build a visualization application on top of.

Application Architecture and User Interface
The application will be implemented with a client-server architecture, where the client side consists of a Qt graphical user interface written in C++, and the server side consists of a visualization task management application written in C++. The primary function of the user interface is to provide users with a convenient, easy-to-use means of remotely constructing, manipulating, and interactively controlling the execution of visualization pipelines. A plugin mechanism for extending the application will be implemented. All visualization algorithms will be exposed via the plugin mechanism.

User Interface Client
The client-side user interface is a Qt application whose purpose is to communicate with, interact with, and control the server-side application. The client consists of the following subcomponents:

Pipeline browser and editor
The pipeline browser and editor is a Qt dialog or panel that provides a means of building visualization pipelines end to end, graphically or otherwise, without the pipeline ever having to execute. The render window component would allow for graphical configuration of the constructed pipeline, for example the ability to position and orient a cut plane using the mouse. Configuration would also involve file selection, which would be done through the file browser dialog. An important aspect of the development of the pipeline browser will be to create a streamable representation of a pipeline so that it can be transmitted via socket to the visualization management server for execution, and/or saved to disk for later recall.
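
One possible shape for a streamable pipeline representation is sketched below. The `Stage` and `Pipeline` classes, the stage names, and the choice of JSON as the wire format are all assumptions made for illustration; the actual representation has not been designed.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Stage:
    kind: str                                   # e.g. a reader or filter name
    params: dict = field(default_factory=dict)  # JSON-compatible settings only


@dataclass
class Pipeline:
    stages: list = field(default_factory=list)

    def to_bytes(self):
        """Serialize to a byte stream suitable for a socket or a file."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, data):
        """Rebuild a pipeline description without executing anything."""
        raw = json.loads(data.decode("utf-8"))
        return cls([Stage(**s) for s in raw["stages"]])


pipeline = Pipeline([
    Stage("bov_reader", {"file": "data.bov"}),
    Stage("cut_plane", {"origin": [0.0, 0.0, 0.5], "normal": [0.0, 0.0, 1.0]}),
])
payload = pipeline.to_bytes()
restored = Pipeline.from_bytes(payload)
```

Because the description is plain data, the same bytes can be sent to the server for execution or written to disk for later recall.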

Task Queue browser and editor
Once constructed, a visualization pipeline would be submitted to the application's task queue. The task queue browser and editor provides the means for users to configure, start, pause, stop, and remove visualization tasks that will run remotely. The configuration step would deal with details such as assigning a task to one of perhaps many servers and assigning the number of threads or processes to allocate for that particular task. Reasonable defaults could be applied so that the user would not have to bother with configuring the task unless desired. The task queue browser would also provide per-task feedback about the status of running tasks, such as the percentage of work completed. An important aspect of creating the task management component will be to create a simple task manipulation language. The user interface would likely be exposed inside the pipeline browser.

Render Window
The render window component will accept and render polygonal data generated by threads or processes running under the visualization management server. Data will be transmitted via socket in binary format for efficiency. The received data will be converted into OpenGL primitives either via VTK or via a scene graph API.
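
As an illustration of what a binary transfer format for polygonal data might look like, the sketch below packs points and triangles into a single byte string and unpacks them again. The layout (two counts, then 32-bit floats, then 32-bit indices, all in network byte order) and the function names are assumptions for illustration only; no wire format has been specified.

```python
import struct


def pack_mesh(points, triangles):
    """Pack points (x, y, z) and triangles (i, j, k) into one binary payload."""
    header = struct.pack("!II", len(points), len(triangles))
    coords = [c for p in points for c in p]
    indices = [i for t in triangles for i in t]
    body = struct.pack(f"!{len(coords)}f", *coords)
    body += struct.pack(f"!{len(indices)}I", *indices)
    return header + body


def unpack_mesh(payload):
    """Inverse of pack_mesh; returns lists of 3-tuples."""
    n_points, n_triangles = struct.unpack_from("!II", payload, 0)
    offset = 8
    coords = struct.unpack_from(f"!{3 * n_points}f", payload, offset)
    offset += 12 * n_points  # three 4-byte floats per point
    indices = struct.unpack_from(f"!{3 * n_triangles}I", payload, offset)
    points = [coords[i:i + 3] for i in range(0, len(coords), 3)]
    triangles = [indices[i:i + 3] for i in range(0, len(indices), 3)]
    return points, triangles


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
triangles = [(0, 1, 2)]
payload = pack_mesh(points, triangles)
received_points, received_triangles = unpack_mesh(payload)
```

A fixed-width layout like this costs 12 bytes per point and 12 bytes per triangle, which is considerably more compact than a text encoding of the same data.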

Visualization Management Server
The visualization management server is an application whose purpose is to listen and respond to client-side requests, constructing, tracking, and managing visualization tasks (VTK pipelines running in worker threads or processes) as requested. The server application consists of the following components:

Visualization Task Queue
As serialized pipeline representations are received from the user interface, they are queued for execution in the visualization task queue (VTQ). A secondary object called a visualization task will be responsible for deserializing, constructing, initializing, and executing the pipeline in a separate thread. The VTQ will track and manage the task objects, which should persist until explicitly deleted by a command from the user interface. Task persistence will allow for the usual form of interactivity, where the user manipulates the pipeline and re-executes it. While a task executes, the VTQ will process requests for progress from the client, and may execute task management commands such as start, pause, stop, and terminate. Once a pipeline finishes, the task will enter a command into the queue that will initiate a data transfer from the server to the client.
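
The task life cycle described above can be sketched with standard threading primitives. Everything in the example is hypothetical: the `Task` and `TaskQueue` classes, the state names, and the use of a step counter as a stand-in for real pipeline execution. It is meant only to show how persistence, progress reporting, pause, and terminate could fit together.

```python
import queue
import threading


class Task:
    """Hypothetical visualization task that runs its work in its own thread."""

    def __init__(self, task_id, steps, commands):
        self.task_id = task_id
        self.steps = steps
        self.done_steps = 0
        self.state = "queued"
        self._commands = commands
        self._run = threading.Event()   # cleared while the task is paused
        self._run.set()
        self._stop = threading.Event()  # set when the task is terminated
        self._thread = threading.Thread(target=self._execute)

    def start(self):
        self.state = "running"
        self._thread.start()

    def pause(self):
        self._run.clear()

    def resume(self):
        self._run.set()

    def terminate(self):
        self._stop.set()
        self._run.set()  # wake the worker if it is paused

    def progress(self):
        return self.done_steps / self.steps

    def join(self):
        self._thread.join()

    def _execute(self):
        for _ in range(self.steps):
            self._run.wait()
            if self._stop.is_set():
                self.state = "terminated"
                return
            self.done_steps += 1  # stand-in for one unit of pipeline work
        self.state = "finished"
        # Ask the queue to ship the results to the client.
        self._commands.put(("transfer", self.task_id))


class TaskQueue:
    """Tracks tasks until they are explicitly removed."""

    def __init__(self):
        self.tasks = {}
        self.commands = queue.Queue()

    def submit(self, task_id, steps):
        task = Task(task_id, steps, self.commands)
        self.tasks[task_id] = task
        return task

    def remove(self, task_id):
        self.tasks.pop(task_id).terminate()


vtq = TaskQueue()
first = vtq.submit("iso-1", steps=5)
first.start()
first.join()

second = vtq.submit("cut-1", steps=5)
second.pause()      # paused before it starts, so no work is done
second.start()
second.terminate()
second.join()
```

Note that the finished task stays in `vtq.tasks`, so the client can modify its pipeline and re-execute it later.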

Shared Client-Server Components
There are a number of components that must be shared by, or in some cases have functionality split between, the client and server. These are as follows:

Client Server Communication Layer
A communication layer will be responsible for queueing, sending, and receiving messages and data between the client and server. Incoming messages will carry the necessary metadata so that they may be routed to the appropriate component for processing. Both server and client will need specialized routing implementations. SOAP/XDR may be used initially in the transport layer and may later be swapped for a more efficient and scalable protocol without affecting the components that make use of the communication layer's services.
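
The routing idea can be shown independently of the transport. In the sketch below, JSON is used purely as a stand-in encoding (the text above mentions SOAP/XDR as the likely initial choice), and the `Router` class, the message fields, and the component names are hypothetical.

```python
import json


def encode(component, command, payload=None):
    """Wrap a command with the metadata needed to route it."""
    return json.dumps({"component": component, "command": command, "payload": payload})


class Router:
    """Hypothetical dispatcher that hands each message to its target component."""

    def __init__(self):
        self.handlers = {}

    def register(self, component, handler):
        self.handlers[component] = handler

    def dispatch(self, raw):
        message = json.loads(raw)
        handler = self.handlers.get(message["component"])
        if handler is None:
            raise KeyError("no handler for component " + repr(message["component"]))
        return handler(message["command"], message["payload"])


log = []
server_router = Router()
server_router.register("task_queue", lambda cmd, data: log.append(("task_queue", cmd, data)))
server_router.register("file_browser", lambda cmd, data: log.append(("file_browser", cmd, data)))

server_router.dispatch(encode("task_queue", "pause", {"task": "iso-1"}))
server_router.dispatch(encode("file_browser", "list", {"path": "/data"}))
```

Because components only see `(command, payload)` pairs, the encoding inside `encode` and `dispatch` could be replaced without touching them.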

Plugin Manager and Plugins
The plugin manager will provide a means of application extensibility. All visualization functionality will be provided in the form of plugins. A plugin will have two components, namely a user interface component and a server side component. The user interface plugin subcomponent will be a Qt panel with the ability to serialize its state into a stream. The server plugin subcomponent will provide the actual visualization functionality. It will provide an interface to the visualization management server and be capable of deserializing a client side configuration stream, constructing and initializing the requisite VTK objects. Plugins will be dynamically loaded shared libraries.

Specific Plugins
In order to provide basic functionality for users we will have to develop a handful of specific plugins. We should plan on at least the following: BOV reader, Cut plane, ISO surface, Stream Lines, and Volume Rendering plugins.

File Browser
The file browser will have functionality split between the client and server as follows. The client-side subcomponent will consist of a Qt panel that displays a view of a potentially remote file system, generates navigation commands that will be streamed to the server-side subcomponent, and provides a mechanism for selecting a file or a subset of files from a large set of files. The file system view will be streamed to and deserialized by the client subcomponent. The server-side subcomponent of the file browser will accept navigation commands, generate efficient data structures for working with large collections of files, and be able to serialize these data structures to a stream for communication to the client-side subcomponent.

Phase II Proposal Summary
The labor breakdown is described in the following table. The units are man-months and man-hours. Note: This estimate represents the work needed to give us a fully functional visualization application that meets the out-of-core data processing, domain specificity, and interactivity requirements set out in the SBIR proposal. It is a novel approach and would provide an end-to-end solution if VTK were completely ported. However, our estimate doesn't represent a full port of VTK functionality. The user interface produced at the end of the project will not be as full featured as existing visualization applications. However, the threaded design would provide interactivity not currently available in any of the popular visualization tools. The estimate includes time to develop 5 visualization algorithms as plugins. This will give us basic functionality to get up and running, but clearly we would need to develop more plugins to become a viable alternative to existing applications such as ParaView and VisIt. Additionally, we would need to develop features that are currently available in these competing visualization applications, such as support for time series, generating animations, crash recovery, a spreadsheet view, and a picking and selection mechanism. Development of these features is not included in the above estimate. We also have not addressed the processing of AMR data at all. I have two major concerns with the proposal as is: the amount of labor required, and the performance penalty that intercepting memory accesses in vtkDataArray would introduce.

= Leveraging Existing Visualization Software and Libraries for Greater Impact =

Given that a number of the most popular existing visualization libraries and applications have out-of-core support built in, we would likely save many hours of labor if we were to adopt one of the existing applications or libraries to build on top of. Leveraging existing software would change the focus of the SBIR: instead of developing basic functionality and infrastructure for out-of-core processing, we would be free to start developing advanced capabilities and domain-specific algorithms and interfaces from day one.

VTK Out-of-core Functionality
VTK currently has out-of-core functionality implemented. In VTK vernacular this is called "streaming". The functionality is implemented in the vtkStreamingDemandDrivenPipeline. It is compatible with both structured and unstructured VTK data sets, but not AMR datasets. In this approach a reader must be capable of reading, on demand, a subset of the available extents for structured data set types, or pieces of a data set in the case of unstructured data set types. The pipeline will execute itself multiple times until the whole extent or all of the pieces have been processed. Any of the algorithms that comprise a pipeline can drive the updates. A specific extent or piece request is stored in the pipeline information object (see vtkInformation), and then the pipeline is updated. The process is automatically repeated until the driving algorithm indicates it is finished, which will be when all of the pieces or the whole extent have been processed. Recently VTK streaming has been exposed in the ParaView application. A number of specialized rendering classes which are capable of driving the streaming pipeline would be of use to us as well. See the following section on Leveraging ParaView.
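
The control flow of piece-based streaming can be summarized in a few lines. This is a schematic in Python, not VTK code: `read_piece`, `threshold`, and `stream` are hypothetical names, and an in-memory list stands in for a file that a piece-aware reader would read only in part.

```python
def read_piece(data, piece, num_pieces):
    """Hypothetical piece-aware reader: return only the requested slice."""
    n = len(data)
    start = piece * n // num_pieces
    stop = (piece + 1) * n // num_pieces
    return data[start:stop]


def threshold(values, lower):
    """Stand-in filter that works on one piece at a time."""
    return [v for v in values if v >= lower]


def stream(data, num_pieces, lower):
    """Driver: request each piece in turn and accumulate the partial results."""
    kept = 0
    peak_resident = 0
    for piece in range(num_pieces):
        chunk = read_piece(data, piece, num_pieces)
        peak_resident = max(peak_resident, len(chunk))
        kept += len(threshold(chunk, lower))
    return kept, peak_resident


dataset = list(range(100))
kept, peak_resident = stream(dataset, num_pieces=4, lower=90)
```

Only one piece is resident at a time, so peak memory is governed by the piece size rather than the dataset size. This works directly only for filters whose result on the whole dataset can be assembled from results on the pieces.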

In order to build our application on top of VTK streaming directly we would need to develop streaming aware readers for the data formats we wish to support.

Application Architecture and User Interface
We would need to develop our own threaded client-server application on top of VTK. We could proceed with the plan outlined in the section above entitled Application Architecture and User Interface.

VTK AMR Support
VTK has AMR infrastructure developed as part of a past Phase I SBIR. The SBIR was not renewed, and as a result only a partial implementation for AMR datasets was completed. More specifically, support for block structured AMR, where fine levels overlap coarse levels, was not fully implemented. Support for a specialized form of AMR that is used by a Sandia code known as CTH or SPCTH has been developed, along with a couple of specialized algorithms. However, the CTH flavor of AMR does not involve overlapping regions of refinement, and thus this work is not compatible with block structured AMR data.

At the heart of VTK's AMR support are the vtkHierarchicalBoxDataSet and vtkUniformGrid datasets and a cell blanking mechanism, whereby overlapping cells are marked as blanked during a special pass which occurs after dataset construction. The vtkHierarchicalBoxDataSet is a subclass of vtkCompositeDataSet limited to collections of vtkUniformGrid objects with one vtkAMRBox object per grid. The purpose of the vtkAMRBox object is to organize metadata such as box origin, bounds, indexes, and extents. The cell blanking support is implemented in the vtkUniformGrid object, which is a subclass of vtkImageData. In theory, as VTK filters process data stored in vtkUniformGrid objects, the blanked cells are skipped or treated specially. However, in practice very few VTK filters do this. Additionally, common algorithms are known to produce poor results on AMR data. For example, VTK iso-surfacing is likely to produce cracks at resolution interfaces. No attempt has been made to deal with these issues for block structured AMR datasets in VTK.
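
The blanking pass can be illustrated in two dimensions with plain index arithmetic. The sketch below is not VTK's implementation; the box representation (inclusive lower and upper cell indices), the function names, and the fixed refinement ratio are assumptions chosen to keep the example small.

```python
def coarsen(box, ratio):
    """Map a box given in fine-level cell indices to coarse-level cell indices."""
    (ilo, jlo), (ihi, jhi) = box
    return (ilo // ratio, jlo // ratio), (ihi // ratio, jhi // ratio)


def blank_cells(coarse_box, fine_boxes, ratio):
    """Return {(i, j): visible} for a coarse box; covered cells map to False."""
    (ilo, jlo), (ihi, jhi) = coarse_box
    visible = {
        (i, j): True
        for i in range(ilo, ihi + 1)
        for j in range(jlo, jhi + 1)
    }
    for fine_box in fine_boxes:
        (flo_i, flo_j), (fhi_i, fhi_j) = coarsen(fine_box, ratio)
        # Blank only the part of the refined region that overlaps this coarse box.
        for i in range(max(ilo, flo_i), min(ihi, fhi_i) + 1):
            for j in range(max(jlo, flo_j), min(jhi, fhi_j) + 1):
                visible[(i, j)] = False
    return visible


# A 4x4 coarse box with one refined patch covering its central 2x2 cells.
coarse = ((0, 0), (3, 3))
fine = [((2, 2), (5, 5))]  # fine-level indices, refinement ratio 2
visibility = blank_cells(coarse, fine, ratio=2)
visible_count = sum(visibility.values())
```

A filter that honors blanking would then skip every cell whose entry is `False`, leaving the finer level to supply the data for that region.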

For our purposes block structured AMR data support would be important, and specifically handling data produced by the CHOMBO AMR framework would be key. In addition to dealing with the above issues, we would implement a CHOMBO reader. An area of recent activity in the ParaView project, compatible with VTK, is the development of a bridge between VisIt and VTK. The initial effort on this project was to make use of the 84 VisIt database plugins. VisIt has more mature support for processing block structured AMR data, including a CHOMBO database. The VTK VisIt database bridge is currently only partially complete with regard to AMR. VisIt uses a different and more advanced blanking strategy than VTK does, and a mapping needs to be developed in order to get VTK-style blanking out of the VisIt CHOMBO database. See the following section VisIt AMR Support for more information.

There are two approaches for adapting VTK to deal with the resolution changes that occur at refinement interfaces. The first approach is to develop specialized algorithms that are aware of the refinement change at level interfaces and can handle it without issue. These algorithms could be developed in a set of new filters or inserted into the existing VTK filters. The second approach, made possible by the addition of composite datasets in VTK, is to develop a "stitch" filter that would generate thin (a few cells thick) unstructured stitch datasets that provide a transition between refinement levels. In this approach the majority of the data remains in structured image or rectilinear datasets, and a small portion becomes unstructured. All of VTK's existing filters should function without issue as long as they handle blanked cells and ghost cells appropriately. The stitching approach has the downside of increasing the dataset's memory footprint and reducing performance as the unstructured grids comprising the stitching region are processed. Given that the majority of VTK filters do not handle blanked cells correctly, it likely makes more sense to identify a number of algorithms which are important to us and develop specialized block structured AMR implementations. The possibility also exists to port specialized AMR algorithms developed for use with VisIt.

Leveraging VTK Summary
Note: The AMR Support line item includes time for some cleanup of the cell blanking implementation and development of one or two specialized algorithms, most likely iso-surfacing, volume rendering, or stream lines.

ParaView Out-of-core Functionality
ParaView is an advanced visualization application built on top of VTK with a Qt 4 graphical interface. ParaView is tightly coupled to VTK and therefore exposes much of its functionality. Until very recently, however, VTK streaming was not supported in ParaView. Exposing this functionality is a current area of active development. A branch of ParaView called StreamingParaView is currently publicly available in the ParaView CVS repository. It works as follows: a special render view drives pipeline updates, and specialized readers provide piece or extent based domain decomposition. As the pipeline executes repeatedly, once for each sub-domain, partial results are cached and rendered, providing progressive results. ParaView's data parallel implementation is left intact so that if MPI support is compiled in, ParaView can do out-of-core processing in parallel. Filters which make use of MPI global communication when running in parallel will not work while streaming.

Domain Specificity with ParaView
One simple way to meet the domain specificity requirement using ParaView or StreamingParaView would be to develop a set of ParaView plugins that expose our custom functionality. ParaView plugins can be custom Qt panels, Qt toolbars, VTK readers, VTK writers, VTK filters, or ParaView views. In the extreme, ParaView could be completely stripped and customized to our needs. ParaView is often described as an application; however, it might be better described as a highly customizable visualization framework. Its menus and toolbars are dynamically generated at compile time based on the contents of a handful of XML files. One can simply replace these XML files with customized ones to create a domain specific customization. One example of doing so can be found in Sandia's OverView project, which is available in the ParaView CVS repository.

ParaView Interactivity
Interactivity is really a three-pronged issue: user interface interactivity, remote interactivity, and large data interactivity. The first prong is user interface interactivity. Unfortunately, ParaView's user interface becomes unresponsive during long pipeline updates. Addressing this would involve adopting a threaded design as described in the proposal. Threading ParaView would likely be possible; however, its tight coupling to VTK and advanced undo-redo system would likely make this challenging. The second prong is remote interactivity. ParaView has two basic options for addressing this. The first is to use a remote render server to render the images server side, which can often reduce the amount of data transferred to the client. The second is ParaView's level-of-detail data reduction technique, which is automatically activated during user interactions when predefined thresholds are exceeded. In the face of a low bandwidth connection it can be difficult to use ParaView interactively. One approach that might yield better results would be to compress the client-server communications. It's not clear that this would make a dramatic improvement, and it certainly could degrade interactivity in some cases. The third prong is large data interactivity. One feature being co-developed with StreamingParaView is multi-resolution support that will facilitate interactive large data exploration. The multi-resolution capabilities are very new and undocumented. As far as I can tell from a survey of what's publicly available, this is implemented at the reader by specifying a region of interest and a stride. In addition to the multi-resolution features, one has the option to throw more compute resources at a visualization task by running ParaView in parallel.
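
Reading with a region of interest and a stride amounts to simple index selection, as the sketch below shows. It is an assumption about how such a reader could choose its samples, not a description of the StreamingParaView implementation, and `strided_indices` is a hypothetical name.

```python
def strided_indices(extent, roi, stride):
    """Return the (i, j, k) samples of `roi`, clipped to `extent`, taken every `stride`.

    `extent` and `roi` are three inclusive (lo, hi) pairs; `stride` has three steps.
    """
    clipped = [(max(e[0], r[0]), min(e[1], r[1])) for e, r in zip(extent, roi)]
    (i0, i1), (j0, j1), (k0, k1) = clipped
    return [
        (i, j, k)
        for k in range(k0, k1 + 1, stride[2])
        for j in range(j0, j1 + 1, stride[1])
        for i in range(i0, i1 + 1, stride[0])
    ]


whole_extent = [(0, 99), (0, 99), (0, 99)]
# One z-slice of the volume, sampled every tenth point in x and y.
samples = strided_indices(whole_extent, [(0, 99), (0, 99), (0, 0)], (10, 10, 1))
```

Here a slice of 10,000 points is reduced to 100 samples, and narrowing the region of interest while lowering the stride recovers full resolution locally.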

ParaView AMR Support
Given ParaView's tight coupling with VTK, the best one could hope for is that all of VTK's functionality would be exposed in ParaView. However, ParaView does not expose VTK's AMR capabilities. A prime example of the lack of support can be seen in vtkPVGeometryFilter, which is implicitly added to all pipelines constructed in ParaView. This filter does not ignore blanked cells as it should, nor does it handle ghost levels correctly for vtkUniformGrid objects. In addition to the developments described in the previous section, VTK AMR Support, a handful of ParaView specific filters would need to have VTK AMR support added or updated.

Leveraging ParaView Summary
Note: We need to develop a list of domain specific customizations and advanced algorithms to implement.

VisIt Out-of-core Functionality
VisIt is an advanced visualization application built on top of VTK with a Qt 3 graphical user interface. Unlike ParaView, VisIt is loosely coupled to VTK. This is advantageous for a number of reasons. First, it has enabled the development of a better pipeline, called AVT, with a number of optimizations which increase throughput and reduce memory footprint. A second advantage is that it simplifies the client code compared to the tightly coupled approach ParaView has taken.

VisIt currently supports out-of-core data processing when run in its "dynamic load balancing" mode. In this mode VisIt will process one domain at a time. This implementation relies on database components being able to perform a domain decomposition where domains are small enough to fit in the available memory. Many but not all of VisIt's databases can do so. For example, VisIt's BOV database can do its own domain decomposition, but CHOMBO can only provide its boxes as domains. Any VisIt plots or operators which make use of global communication will not work correctly. The out-of-core functionality in VisIt is rarely used and is not currently being developed. On the other hand, the fact that VisIt has this functionality at all is a plus, and would give us a clear starting point and something to build from. We would need to identify a subset of VisIt's plots and operators for which out-of-core functionality will be important, and examine each one to check for the use of global communication.

Domain Specificity with VisIt
VisIt has been designed using a plugin architecture. There are three plugin types: databases, plots, and operators. Extending and customizing VisIt is accomplished by developing specific plugins. The user interface is less customizable than that of ParaView.

VisIt Interactivity
VisIt is a client-server application designed to be used remotely over socket connections, and it likely performs as well as ParaView or better. In terms of user interface interactivity, VisIt's user interface becomes unresponsive during long pipeline updates. Because it is loosely coupled to VTK, it would conceivably be easier to thread than ParaView. In terms of large data interactivity, I don't know of any special functionality aside from running in parallel and throwing more resources at a job. That said, we could develop the same type of striding that ParaView has implemented in our IO components without too much trouble.

VisIt AMR Support
VisIt's AMR support is more advanced than what is currently available in VTK and ParaView. Basic support for block structured AMR datasets exists. The idea is that special arrays are used to identify points and cells which are covered by a higher level. Operators and plots then ignore these points and cells. A number of specific AMR algorithms are implemented. A fully functional CHOMBO database plugin exists. We will want to explore the situation more fully and make sure that VisIt's AMR capabilities meet our requirements.

Summary
Notes: Some further investigation is warranted. I am less familiar with VisIt than ParaView. VisIt has working, but in need of refinement, out-of-core functionality and working support for CHOMBO AMR data. That's a pretty big ticket item. I have worked with its plugins and know that their system is designed well, and we could achieve domain specificity. We need to develop a list of domain specific modifications, and test iso-surfaces and stream lines on some CHOMBO data.

= Conclusion =

We will achieve a greater impact and realize far more useful results if we choose to leverage the existing out-of-core functionality and the advanced user interface available in either VisIt or ParaView. The reality is that application demand paging in VTK is likely to perform poorly compared to the existing OOC streaming implementation of VTK/ParaView and the dynamic load balancing implementation of VisIt, and its implementation could require as much as 70% of our available funding. The threaded client-server architecture and GUI alone would far outstrip our available resources and could take as much as 120% of our available funding, and our application, although threaded, would be far less full featured than either VisIt or ParaView. Given those realities it will likely prove difficult for us to find an audience for our product if we follow this path. We simply don't have the resources to implement the original design in a way that puts us in a strategically viable position.

We will achieve a greater impact if we leverage the existing out-of-core functionality in either VisIt or ParaView. Going this route will mean that we start developing advanced domain specific functionality from day one. A case can be made for choosing either VisIt or ParaView. In a nutshell, the big advantages of VisIt are that it comes with AMR support and many well written, scalable IO components of interest to us, and that it is loosely coupled to VTK, which facilitates core modifications such as threading the UI. The big advantage of ParaView is that Kitware and LANL are currently actively exposing VTK's streaming implementation in ParaView with advanced functionality such as progressive rendering and multi-resolution readers. Using ParaView will clearly require more effort on our part, given its lack of AMR support and of streaming compatible readers, but if we can enter an open collaboration with either Kitware or LANL it would likely be advantageous for us to go the ParaView route, as we would benefit from their ongoing efforts.

Our focus will be slightly different depending on which one is chosen, as each has a different set of features developed. Depending on how we choose our priorities, the scale may be tipped one way or the other, as there is a unique set of costs and benefits associated with either choice. ParaView, on the one hand, is highly customizable and is undergoing active out-of-core development, including support for progressive rendering and multi-resolution exploration. Unfortunately, it needs quite a bit of work to get AMR data working, and its tight coupling to the VTK pipeline will introduce a number of complications. VisIt, on the other hand, also has support for out-of-core data processing, but that functionality has not been used very widely and is not in active development. VisIt does, however, have support for AMR, and for CHOMBO specifically. There is also no reason the multi-resolution data exploration techniques could not be implemented in VisIt. Both applications have a strong reputation in the community.

Either way, it is clearly advantageous for us to choose one of these projects and build on top of existing functionality for greater overall impact.