Set up your environment for Swift:
$ export PATH=~benc/swift-svn/bin:$PATH
On Friday we built a Monte Carlo simulation to calculate pi. Try to get that working yourself on the cluster.
We need a count of how many points in our simulation fall inside a circle. Because addition is both commutative and associative, we can split this sum into parts and compute the parts in parallel. So we can structure the application as some pieces that each compute part of the total, and then a final piece that adds the parts together. This is like creating tiles in the mandelbrot application, and then assembling the tiles to give a final image.
Here are the two components, findpi.c:
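#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int i;
    int in = 0, out = 0;

    /* Mix the process ID into the seed so that copies started in the
       same second do not all generate the same points. */
    srand((unsigned) time(NULL) ^ (unsigned) getpid());

    for (i = 0; i < 200000; i++) {
        /* Pick a random point in the unit square. */
        float rx = rand() / (float) RAND_MAX;
        float ry = rand() / (float) RAND_MAX;
        /* Count whether it falls inside the quarter circle of radius 1. */
        if (rx * rx + ry * ry < 1) {
            in++;
        } else {
            out++;
        }
    }

    /* Print the two counts: points inside, then points outside. */
    printf("%d %d\n", in, out);
    exit(0);
}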
and assemble:
#!/bin/bash
TI=0
TO=0
cat *.pipart > foo
while read i o ; do
    TI=$(( $TI + $i))
    TO=$(( $TO + $o))
done < foo
echo $TI $TO
echo "4 * $TI / ( $TI + $TO )" | bc -l
Compile the C code like this:
$ gcc -o findpi findpi.c
and check that it works:
$ ./findpi
157452 42548
The output numbers will be different but should be roughly the same as here.
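To judge whether a single run looks sensible, you can turn its two counts into an estimate of pi. The points lie in the unit square, and the test in findpi.c picks out the quarter circle of radius 1, so the fraction of points inside approximates pi/4. The sketch below assumes awk is available; pi_estimate is a made-up helper name, not part of the lab files.

```shell
#!/bin/bash
# pi_estimate is a hypothetical helper, not part of the lab files.
# Arguments: the "inside" and "outside" counts printed by findpi.
pi_estimate() {
    local inside=$1 outside=$2
    # inside / (inside + outside) approximates pi/4, so multiply by 4.
    awk -v i="$inside" -v o="$outside" \
        'BEGIN { printf "%.5f\n", 4 * i / (i + o) }'
}

# The sample output shown above:
pi_estimate 157452 42548
# prints 3.14904
```

With only 200000 points, an estimate that agrees with pi to two decimal places is about what you should expect.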
Now we will use Swift to run findpi a large number of times to generate a lot of data. This is the SwiftScript to use:
type file;
app (file chunk) findpi() {
findpi stdout=@chunk;
}
file results[] <simple_mapper; suffix=".pipart">;
foreach i in [1:100] {
results[i] = findpi();
}
Put this into a file called pi.swift. Next you need to
tell Swift where it can find the findpi program by
making a transformation catalog entry.
Copy Swift's default transformation catalog to your directory so that you can modify it:
$ cp ~benc/swift-svn/etc/tc.data .
and now edit it using your favourite text editor to have an entry for
findpi. Copy an existing line, and set it so that
findpi points to your compiled version of findpi. Try to figure out the
delightful syntax yourself...
Now run like this:
$ swift -tc.file ./tc.data pi.swift
Swift svn swift-r2846 cog-r2382
RunID: 20090413-0747-jgx1uts4
Progress:
Progress: uninitialized:1
Progress: Initializing:99 Selecting site:1
Now look in your directory for files ending .pipart.
Each one of these contains the output of one of the 100 invocations of
findpi.
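Before assembling the results, it can be worth checking that all 100 files are there. It is also worth looking for repeated lines: if two invocations happened to use the same random seed, they print identical counts and add no new information to the estimate. This is a minimal sketch; count_parts and duplicate_parts are made-up helper names.

```shell
#!/bin/bash
# count_parts and duplicate_parts are hypothetical helpers,
# not part of Swift or of the lab files.

# Print how many .pipart files a directory holds (default: current one).
count_parts() {
    local dir=${1:-.} n=0 f
    for f in "$dir"/*.pipart; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Print every "in out" line that occurs in more than one .pipart file.
duplicate_parts() {
    local dir=${1:-.}
    cat "$dir"/*.pipart 2>/dev/null | sort | uniq -d
}
```

Run count_parts in your working directory and expect 100. If duplicate_parts prints anything, some of the chunks are copies of each other.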
Next run the assemble script to get an estimate of pi (if the shell refuses to run it, make it executable first with chmod +x assemble):
$ ./assemble
15697627 4302373
3.13952540000000000000
Above, you ran Swift in local mode. It ran the jobs directly on osg-ui and did not use any cluster nodes, so it could not handle a large run.
To use the cluster, you can tell Swift that it should use PBS by making an entry in the sites catalog.
Copy the default sites catalog to your directory so you can modify it:
$ cp ~benc/swift-svn/etc/sites.xml .
and open it in your favourite text editor. Find the line that says:
<execution provider="local" />
and change it to:
<execution provider="pbs" />
Now run Swift again, using this new sites file:
$ swift -tc.file tc.data -sites.file sites.xml pi.swift
You won't get much speedup here, because the component applications run
very fast. You can make findpi do more work by
increasing the number of iterations. Change the constant 200000 in
findpi.c to something around 20000000 (20 million), and recompile.
Then run Swift again and see how long runs take.
When a run is finished, you can plot its log to produce a set of graphs, some more useful than others.
First find the name of the log file you want to plot. You can find the most recent logfile like this:
$ ls -t *.log | head -n1
pi-20090413-0809-655k4g8a.log
Now plot that log with the swift-plot-log command:
$ swift-plot-log pi-20090413-0809-655k4g8a.log
Your log plots will appear in the directory report-pi-20090413-0809-655k4g8a, which you can view in your web browser by placing it in your public_html directory, as before:
$ mv report-pi-20090413-0809-655k4g8a/ ~/public_html/
and then opening it in your web browser.
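If you plot logs often, the steps above can be wrapped in two small shell functions. This is only a sketch: newest_log and report_dir are made-up names, and the "report-" naming rule is inferred from the single example above, so check it against what swift-plot-log actually creates.

```shell
#!/bin/bash
# newest_log and report_dir are hypothetical helpers.

# Print the most recently modified .log file in the current directory.
newest_log() {
    ls -t -- *.log 2>/dev/null | head -n1
}

# Map a log file name to the report directory name seen in the example
# above: "report-" followed by the log name without its ".log" suffix.
report_dir() {
    echo "report-$(basename "$1" .log)"
}

report_dir pi-20090413-0809-655k4g8a.log
# prints report-pi-20090413-0809-655k4g8a
```

A typical use would be log=$(newest_log), then swift-plot-log "$log", then mv "$(report_dir "$log")" ~/public_html/.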
The first page contains a few useful graphs: how your jobs completed over time, how many jobs Swift thinks are active. Other graphs can be useful when digging deeper into how Swift performs.
Now we will put the mandelbrot application into Swift.
We already have the pieces of the mandel application - the mandelbrot tile generator and the montage command to join the tiles together. We need to describe this in SwiftScript and tell Swift where to find the applications in the transformation catalog.
Put this in a file called mandel.swift:
type file;
int side = 8;
file tile[][] <simple_mapper;suffix=".pgm">;
file mandel <"mandelbrot.gif">;
app (file result) render(int x, int y, int side) {
mandel x y side 0.0582 1.99965 200000 1000 1000 32000 stdout=@result;
}
app (file frame) montage(file tiles[][], int side) {
montage "-tile" @strcat(side,"x",side) "-geometry" "+0+0" @filenames(tiles) @frame;
}
foreach x in [0:(side-1)] {
foreach y in [0:(side-1)] {
tile[y][x]=render(x,y, side);
}
}
mandel=montage(tile, side);
Next add lines to your tc.data file to describe the location of mandel and montage. Because montage is on the system path, you do not need to give its full path; you can write montage on its own and Swift will find it.
Now run this on the cluster, and then plot a log of the results. While Swift is running, watch qstat to see Swift putting jobs in the queue. When you have plotted the logs, check whether you are keeping all 56 cores of the cluster busy.
Here is code to make a mandelbrot animation. Put it in a file like
mandelanim.swift:
type file;
type frameparameters {
int iterations;
int zoom;
float yoff;
float xoff;
}
app (file result) render(int x, int y, int side, frameparameters ap) {
mandel x y side ap.xoff ap.yoff ap.iterations 1000 1000 ap.zoom stdout=@result;
}
app (file frame) montage(file tiles[][], int side) {
montage "-tile" @strcat(side,"x",side) "-geometry" "+0+0" @filenames(tiles) @frame;
}
(file mandel) renderframe(int side, frameparameters p, int framenumber) {
file tile[][] <simple_mapper;prefix=@strcat("frame",framenumber), suffix=".pgm">;
foreach x in [0:(side-1)] {
foreach y in [0:(side-1)] {
trace("tile ",x,y);
tile[y][x]=render(x,y, side, p);
}
}
mandel=montage(tile, side);
}
file specificationFile <"framespec.data">;
frameparameters spec[] = readData(specificationFile);
int side = 2;
file frames[] <simple_mapper;suffix=".gif">;
trace(spec);
foreach p,i in spec {
frames[i] = renderframe(side, p, i);
}
You need to get an animation specification file and put it in
framespec.data. Here is an example of one:
$ head framespec.data
iterations zoom yoff xoff
100 3000 1.99965 0.0582
1000 3000 1.99965 0.0582
1000 3000 1.99964 0.0582
Make sure to put in the four column headings. The values are the command-line parameters for mandel.
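Writing a long specification file by hand is tedious, so you may prefer to generate one. The sketch below prints the heading line followed by one row per frame. The name make_framespec, the starting zoom of 3000, the 10% change per frame and the fixed 1000 iterations are all illustrative assumptions, and whether a larger zoom value zooms in or out depends on how mandel interprets that parameter.

```shell
#!/bin/bash
# make_framespec is a hypothetical helper; the starting zoom, the growth
# factor and the iteration count are illustrative assumptions.
# Argument: the number of frames to describe.
make_framespec() {
    local frames=$1
    local zoom=3000 n
    # The four column headings that readData expects.
    echo "iterations zoom yoff xoff"
    for (( n = 0; n < frames; n++ )); do
        echo "1000 $zoom 1.99965 0.0582"
        # zoom is declared int in frameparameters, so stay with integers.
        zoom=$(( zoom * 11 / 10 ))
    done
}
```

For example, make_framespec 20 > framespec.data writes a specification for 20 frames.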
Now run this script and see that you get a lot of frame files in your working directory. Assemble them using the convert command from lab 2.