AUTO3DEM

From UF HPC Wiki

Description

AUTO3DEM is an automated image reconstruction system. It is used to coordinate the execution of intensive parallel codes such as P3DR, PCUT, POR, PPFT, and PSF.

Website

cryoem.ucsd.edu

Sample Submission Script

The sample PBS script can now be found on the PBS Sample Job Scripts page.

Several things in the sample script must be changed before it will work for you, such as the location of your input scripts. Also set the email address near the top of the script so that you receive a notification when your job finishes.

Note that the sample job requests 8 processors, which the Torque/Maui system distributes across nodes as it sees fit. Your job may run more efficiently with more or fewer processors, so please experiment until you find a count that works well. If a different count performs better than eight, please let us know so that we can update the sample script for others. Also note that the processor count is specified in two places:

  1. On the nodes= line in the PBS Directives
  2. At the auto3dem line
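
Because the count appears in two places, it is easy to change one and forget the other. The following is a minimal sketch of a helper that prints both lines from a job script so they can be compared by eye. The function name show_proc_lines is hypothetical, and the sketch assumes the script uses a "#PBS ... nodes=" directive and invokes auto3dem on a non-comment line; it does not parse the auto3dem arguments.

```shell
#!/bin/sh
# show_proc_lines SCRIPT  (hypothetical helper, not part of AUTO3DEM)
# Print, with line numbers, the "#PBS ... nodes=" directive and every
# non-comment line that mentions auto3dem, for side-by-side comparison.
show_proc_lines() {
    grep -nE '^#PBS.*nodes=|^[^#]*auto3dem' "$1"
}

# Example (job.pbs is an assumed filename):
#   show_proc_lines job.pbs
```

If the two printed lines disagree on the processor count, fix the script before submitting it.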

To submit the job, save the script to a file on the system, then issue the following command:

$ qsub <script>

The #PBS -V directive is included in the script because of the following passage in the README notes of the source:

AUTO3DEM currently runs under the PBS and SGE batch schedulers. In
your job script, you will need to include the flag (-V) that exports
all environment variables to the job. Do not explicitly include a
nodefile in the auto3dem command since it is automatically passed
through the Perl %ENV hash.
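
A quick check before submitting can confirm that the job script actually carries this flag. The sketch below is an assumption-laden convenience, not something shipped with AUTO3DEM: the function name has_pbs_export is hypothetical, and it only looks for -V on a "#PBS" directive line, not on the qsub command line.

```shell
#!/bin/sh
# has_pbs_export SCRIPT  (hypothetical helper)
# Succeed if SCRIPT has a "#PBS" directive line that includes the -V flag
# as a separate word; fail otherwise.
has_pbs_export() {
    grep -Eq '^#PBS[[:space:]](.*[[:space:]])?-V([[:space:]]|$)' "$1"
}

# Example (job.pbs is an assumed filename):
#   has_pbs_export job.pbs || echo "job.pbs: missing #PBS -V" >&2
```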

Job Continuation

I am not going to go into detail here, but if a job ends prematurely, for example because a node goes down, it appears to be possible to restart the job close to the point where it left off. See the documentation on the program's website for details.

Build

  • Build the serial version with the command:
./make_all serial nogui
  • Build the parallel version with the command:
./make_all parallel nogui
  • All built files are placed in the $SOURCE/BIN directory, so I have created a symbolic link, /apps/auto3dem/bin, that points to these binaries. Links to older versions are also available in the /apps/auto3dem directory.
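
The symbolic link step described above can be sketched as follows. This is an illustration only: the function name link_bin is hypothetical, the first argument stands in for $SOURCE and should be an absolute path, and the exact commands used on this cluster are not recorded on this page.

```shell
#!/bin/sh
# link_bin SOURCE_DIR APPS_DIR  (hypothetical helper)
# Point APPS_DIR/bin at SOURCE_DIR/BIN, replacing an existing symlink.
# SOURCE_DIR should be an absolute path so the link resolves from anywhere.
link_bin() {
    src=$1/BIN
    dest=$2/bin
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$2" || return 1
    # Remove only an old symlink; refuse to clobber a real file or directory.
    if [ -L "$dest" ]; then
        rm "$dest"
    elif [ -e "$dest" ]; then
        echo "$dest exists and is not a symlink" >&2
        return 1
    fi
    ln -s "$src" "$dest"
}

# Example, after ./make_all has finished:
#   link_bin "$SOURCE" /apps/auto3dem
```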

Intel Changes

To build with the Intel compilers, edit the files make.inc.parallel and make.inc.serial so that they use the Intel compilers.

After this, it should build just fine once the version-specific changes below are applied.

Version 3.05

Some changes had to be made to the following files:

$SOURCE/convert/mrc2pif.f
$SOURCE/convert/pif2mrc.f
$SOURCE/convert/pif2ccp4.f

The changes were required because of the following errors:

ifort -O3 -cpp     -I../include -c mrc2pif.f
fortcom: Error: mrc2pif.f, line 473: A RETURN statement is invalid in the main program.
      return
------^
fortcom: Error: mrc2pif.f, line 475: A RETURN statement is invalid in the main program.
      return
------^

ifort -O3 -cpp     -I../include -c pif2mrc.f
fortcom: Error: pif2mrc.f, line 332: A RETURN statement is invalid in the main program.
      return
------^

ifort -O3 -cpp     -I../include -c pif2ccp4.f
fortcom: Error: pif2ccp4.f, line 467: A RETURN statement is invalid in the main program.
      return
------^

Commenting out the RETURN statement on each of the reported lines in the corresponding source file allowed the source to compile cleanly.
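
One way to apply that fix without opening an editor is sketched below. The function name comment_return is hypothetical. It assumes fixed-form Fortran, where a "C" in column 1 marks a comment, and it only touches a line that contains nothing but RETURN. The line numbers come from the compiler errors above and apply to version 3.05; check them against your own copy of the source before running it.

```shell
#!/bin/sh
# comment_return FILE LINE...  (hypothetical helper)
# On each given line number, if the line holds only a RETURN statement,
# overwrite column 1 with "C" so fixed-form Fortran treats it as a comment.
# All other lines are written back unchanged.
comment_return() {
    file=$1
    shift
    for n in "$@"; do
        awk -v n="$n" '
            NR == n && tolower($0) ~ /^[ \t]*return[ \t]*$/ { sub(/^./, "C") }
            { print }
        ' "$file" > "$file.tmp" && mv "$file.tmp" "$file" || return 1
    done
}

# Example for version 3.05, using the lines reported by ifort:
#   comment_return "$SOURCE/convert/mrc2pif.f" 473 475
#   comment_return "$SOURCE/convert/pif2mrc.f" 332
#   comment_return "$SOURCE/convert/pif2ccp4.f" 467
```

Keeping a backup copy of each file before editing makes it easy to diff the result and confirm that only the intended lines changed.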

User Comments

Michael DiMattia

I figured out that the particular process I was running, a preliminary job to auto3dem, runs far better serially here in the lab.

The main job, 'auto3dem' then runs very well after I ftp the output from the previous job onto the cluster and start from there. CPU time compounds just like you would imagine it should on the cluster and this problem does not occur at all.