9. Restarted Simulation Readers

When running a large or long simulation, it is common to need to restart the simulation. There are several reasons for needing to restart a simulation; system failures on large machines can bring down running programs occasionally, or scheduling policies may limit how long a simulation can run. Depending on the simulation, when it is restarted it may create a brand new set of results files. Thus, there will be a data file (or set of data files) for each time the simulation was restarted.

Having these multiple outputs from simulation restarts can lead to complications when reading them. For starters, each data set has its own time series, but we will probably want to step through time across all the data sets as if they were one. Furthermore, it is common for a restarted simulation to backtrack, meaning that there may be some overlap of time steps among the data sets. Which file the data is read from matters, because the data in some time steps may be invalid if, for example, the simulation ended abruptly while writing the data.

The ParaView development team has worked to simplify reading output from restarted simulations. The aim has been to try to make this as simple as possible, but there are still some steps that need to be taken to read in these series of data sets. This document describes how to load in restarted data for different file formats.

9.1. Exodus

Current simulations write one Exodus file per process and many simulations also write a new set of files each time the simulation is restarted. In the past it was necessary to mash all of these files into one large file. This is no longer necessary. (In fact, it is highly discouraged.) For many years, ParaView has been able to load in a set of files from one simulation run by simply selecting the first file in the set. ParaView will automatically find all other related files, given some standard naming conventions, and load them all together as a single data set. Dealing with restarts adds a level of complication because you now have a series of a series of files.

ParaView provides two solutions to the problem. The first is a naming convention adopted from existing simulation codes. The second is a case file that specifies the names of files in the time series.

Note that both of these solutions still rely on the time values provided in the Exodus files, even if they all contain only one time step. If, for example, all the files report data at time 0, ParaView will assume that they specify data at the same time and will only allow you to load one of the time steps.

9.1.1. Exodus Naming Conventions

By default, ParaView will assume that the numbers at the end of a file name represent partition numbers and will attempt to load all such files at once. But there is a convention for having numbers specify a temporal sequence instead. If the file names end in .e-s.{RS} (where {RS} is some number sequence), then the numbers (called restart numbers) are taken as a partition in time instead of space. For example, the following sequence of files represents a time series.

mysimoutput.e-s.000
mysimoutput.e-s.001
mysimoutput.e-s.002

You can use any number of digits in the restart numbers, but by convention the number of digits should be the same in all files. Also by convention you can leave off the -s.{RS} of the first file. The following file sequence is interpreted the same as that above.

mysimoutput.e
mysimoutput.e-s.001
mysimoutput.e-s.002

It is possible to combine a time series sequence with a spatial partitioning sequence. For files ending in .e-s.{RS}.{NP}.{RANK}, the {RS} is interpreted as the restart number (as in the previous example), the {NP} is interpreted as the number of partitions, and the {RANK} is interpreted as a partition number, which is usually simply the rank of the simulation process that dumped the file. A set of files with four spatial partitions that has three “restarts” (time series sets) will have filenames like the following.

mysimoutput.e-s.000.0004.0000
mysimoutput.e-s.000.0004.0001
mysimoutput.e-s.000.0004.0002
mysimoutput.e-s.000.0004.0003
mysimoutput.e-s.001.0004.0000
mysimoutput.e-s.001.0004.0001
mysimoutput.e-s.001.0004.0002
mysimoutput.e-s.001.0004.0003
mysimoutput.e-s.002.0004.0000
mysimoutput.e-s.002.0004.0001
mysimoutput.e-s.002.0004.0002
mysimoutput.e-s.002.0004.0003

As before, the -s.{RS} part of the file names for the first time index is optional. Thus, the following is equivalent to the previous example.

mysimoutput.e.0004.0000
mysimoutput.e.0004.0001
mysimoutput.e.0004.0002
mysimoutput.e.0004.0003
mysimoutput.e-s.001.0004.0000
mysimoutput.e-s.001.0004.0001
mysimoutput.e-s.001.0004.0002
mysimoutput.e-s.001.0004.0003
mysimoutput.e-s.002.0004.0000
mysimoutput.e-s.002.0004.0001
mysimoutput.e-s.002.0004.0002
mysimoutput.e-s.002.0004.0003
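
The naming convention above can be illustrated with a short Python sketch that splits a file name into its restart number, partition count, and rank, and then groups files by restart. This is only an illustration of the convention as described here, not ParaView's own parser. The names PATTERN, parse_name, and group_by_restart are hypothetical, and treating a missing -s.{RS} as restart 0 is an assumption based on the examples in this section.

```python
import re
from collections import defaultdict

# Illustrative pattern for the convention described in this section.
# It is an assumption for demonstration, not ParaView's implementation.
PATTERN = re.compile(
    r"^(?P<base>.+\.e)"                     # base name ending in .e
    r"(?:-s\.(?P<rs>\d+))?"                 # optional restart number {RS}
    r"(?:\.(?P<np>\d+)\.(?P<rank>\d+))?$"   # optional {NP} and {RANK}
)


def parse_name(filename):
    """Return (base, restart, partitions, rank), or None if no match."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    # Assumption: a missing -s.{RS} means the first restart (0).
    restart = int(m.group("rs")) if m.group("rs") is not None else 0
    # Assumption: a missing partition suffix means a single partition.
    partitions = int(m.group("np")) if m.group("np") is not None else 1
    rank = int(m.group("rank")) if m.group("rank") is not None else 0
    return m.group("base"), restart, partitions, rank


def group_by_restart(filenames):
    """Map each restart number to its file names, ordered by rank."""
    groups = defaultdict(list)
    for name in filenames:
        parsed = parse_name(name)
        if parsed is None:
            continue  # skip names that do not follow the convention
        _, restart, _, rank = parsed
        groups[restart].append((rank, name))
    return {
        restart: [name for _, name in sorted(members)]
        for restart, members in sorted(groups.items())
    }
```

A call such as group_by_restart(os.listdir(".")) could be used to check, before opening the data, that every restart has the number of partition files you expect.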

9.1.2. Exodus Time Series Case Files

ParaView can handle the naming issue by reading a “case” file to specify the file set of each simulation run. The case file is a simple text file with the extension .ex-timeseries where each line contains the filename of the first file of an Exodus data set. The rest of the files are automatically determined in the same manner as if you had just loaded the single file set. **Do not list every file of every data set.** The Exodus filenames may be given relative to the case file.

A case file can usually be generated by redirecting the output of the Unix find command. In some circumstances, you may be able to use the simpler ls command, but there are several technical issues that will probably prevent this from working on most large data sets. For example, let us say that every file is in a directory and has a name like the following:

mysimoutput.{RR}/mysimoutput.{RR}.{FFFF}

Where {RR} represents the restart number and {FFFF} represents the process number. Note that in this example we have placed the simulation output in different directories. We recommend this practice to limit the number of files in each directory. We can build a case file for this simulation data by running the following command:

find . -name 'mysimoutput.*.0000' > mysimoutput.ex-timeseries

The find command will then list all the first files in all the subdirectories. Do not worry about the order of the lines in the file; ParaView will automatically order the time steps and handle overlapping time. Simply open the mysimoutput.ex-timeseries file in ParaView.
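
On systems without the find command, a case file can be produced with a few lines of Python. The following is a minimal sketch, assuming the directory layout of the example above; the function name write_case_file and its parameters are hypothetical. It writes the paths relative to the directory that holds the case file, which the text above says is allowed.

```python
from pathlib import Path


def write_case_file(root, pattern, case_name):
    """Write the first file of each data set under root to a case file.

    root      -- directory to search (the case file is written here)
    pattern   -- shell-style pattern matching only the first file of each
                 set, for example 'mysimoutput.*.0000'
    case_name -- name of the case file, for example
                 'mysimoutput.ex-timeseries'
    Returns the list of relative paths that were written.
    """
    root = Path(root)
    # rglob searches root and all of its subdirectories, much like find.
    first_files = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob(pattern)
        if p.is_file()
    )
    # One file name per line, relative to the case file location.
    (root / case_name).write_text("".join(f"{name}\n" for name in first_files))
    return first_files
```

For the example layout, write_case_file(".", "mysimoutput.*.0000", "mysimoutput.ex-timeseries") would list one file per restart directory. Passing a pattern that matches every partition file would defeat the purpose, since only the first file of each set belongs in the case file.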

Troubleshooting tip: If all of your files are in the same directory and end with the restart number, the case file will fail to load properly. For example, if you have the following three Exodus files that represent a time series (as opposed to a spatial partitioning), listing them in a case file will not work.

mysimoutput.00
mysimoutput.01
mysimoutput.02

The problem is that ParaView will incorrectly identify each entry in your case file as a file in the same partitioned set of files. You will instead have to rename the files so that ParaView will not recognize them as an Exodus partition sequence. You could, for example, append .e to all the file names. You could also convert them to the Exodus Naming Conventions, in which case you will no longer need the case file at all.
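
The conversion to the Exodus Naming Conventions can be planned with a small Python sketch like the one below. It assumes that every file name has exactly the form {base}.{digits}, as in the troubleshooting example; the function name restart_rename_plan is hypothetical. The function only computes the new names and does not touch any files, so the plan can be reviewed first.

```python
import re


def restart_rename_plan(filenames):
    """Map names like 'mysimoutput.00' to 'mysimoutput.e-s.00'.

    Assumption: the trailing digits are restart numbers, not partition
    numbers. Names that do not end in '.<digits>' are left out of the plan.
    """
    plan = {}
    for name in filenames:
        m = re.fullmatch(r"(?P<base>.+)\.(?P<rs>\d+)", name)
        if m is None:
            continue  # not of the form {base}.{digits}
        # Keep the original digits so their count stays consistent.
        plan[name] = f"{m.group('base')}.e-s.{m.group('rs')}"
    return plan
```

Once the plan looks right, each entry could be applied with os.rename(old, new). Working on a copy of the data is advisable, since the renaming is only a sketch under the assumptions stated above.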

9.2. SPCTH

Current CTH simulations write one spyplot file per process and also write a new set of files each time the simulation is restarted. ParaView is able to load in a set of files from one simulation run by simply selecting the first file in the set. ParaView will automatically find all other related files, given some standard naming conventions, and load them all together as a single data set. Dealing with restarts adds a level of complication because you now have a series of a series of files with no naming convention.

ParaView now handles the naming issue by requiring a “case” file to specify the file set of each simulation run. The case file is a simple text file with the extension .spcth-timeseries where each line contains the filename of the first file of a spyplot data set. The rest of the files are automatically determined in the same manner as if you had just loaded the single file set. **Do not list every file of every data set.** The spyplot filenames may be given relative to the case file.

A case file can usually be generated by redirecting the output of the Unix find command. In some circumstances, you may be able to use the simpler ls command, but there are several technical issues that will probably prevent this from working on most large data sets. For example, let us say that every file is in a directory and has a name like the following:

mysimoutput{R}/spct{R}.{F}

Where {R} represents the restart identifier (maybe a number or a letter) and {F} represents the process number. Note that in this example we have placed the simulation output in different directories. We recommend this practice to limit the number of files in each directory. We can build a case file for this simulation data by running the following command.

find . -name 'spct*.0' > mysimoutput.spcth-timeseries

The find command will then list all the first files in all the subdirectories. Do not worry about the order of the lines in the file; ParaView will automatically order the time steps and handle overlapping time. Simply open the mysimoutput.spcth-timeseries file in ParaView.

9.3. Acknowledgments

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

SAND 2008-3286P