The National Computational Grid for Ireland
Using Grid-Ireland
Introduction
You will also need to apply for a Grid certificate. For more information see the guide to getting a Grid-Ireland User Certificate. You will also need to register with one of the Virtual Organizations. All Grid-Ireland users must accept the Grid-Ireland and Virtual Organization usage rules.
Grid-Ireland currently provides Globus 2.4 and EGEE gLite software for Grid computing. For an overview of the EGEE gLite software and usage instructions consult the EGEE gLite User Guide.
There are two generic ways to use the grid: through a web portal or via the command line.

Web Portals
If you need to contact the portal from outside your site, see the appendix for information on X11 forwarding.

Migrating Desktop
This is a portal that you can run from your usual browser on your own desktop. Browse to the Migrating Desktop. From there it is possible to create proxy credentials and submit jobs. You need to have Java Web Start (JWS) installed. You can download it from java.com.
Migrating Desktop has also been shown to work on Mac OS X 10.4 using Safari 2.0.3 and Java 1.4.2 as supplied by Apple. The user's credentials are made available to the browser by importing the certificate with Applications --> Utilities --> Keychain Access. The same steps as for other UN*X platforms are also used to create .pem files in the .globus directory.
You must have your cert key pair on your desktop machine, and you must point the Migrating Desktop at

P-Grade Portal
The P-Grade Portal allows you to develop, execute and monitor workflows and workflow-based parameter studies from a browser on your own desktop. For more detailed information on P-Grade click here. N.B. There are some client requirements for using the portal (a supported browser and Java plug-in) which are specified here.
Grid-Ireland operates a P-Grade portal providing transparent access to Grid-Ireland resources no matter where the user is located. To use the portal first browse to the Grid-Ireland P-Grade portal server here. If you are a Grid-Ireland user and would like to try P-Grade you may apply for an account from your browser by clicking on the "Create new account" link and filling in the required details.
Once your account has been approved, browse to the P-Grade Portal. If this is your first time logging on to the portal, ensure that you are a member of the szupergrid group by checking the checkbox for this group under the "Configure group membership" panel, which is located on the Welcome --> Settings tab (szupergrid is a P-Grade portal group that allows access to the installed portlets). When this has been done the grid portal tabs for creating workflows, uploading certificates and accessing the information system will be displayed.
The first thing you need to do before jobs can be submitted is to obtain a proxy cert for your P-Grade session: click on the "Certificates" tab. There are two options: (i) download an existing proxy cert from a MyProxy server, or (ii) upload credentials to MyProxy and then download a proxy cert. In either case you will be asked for the hostname and port number of the MyProxy service [Grid-Ireland's MyProxy server is cagraidsvr20.cs.tcd.ie and the port is 7512].
Once you have a proxy cert, you need to enable this proxy for use with each virtual organisation (VO) under which you intend to submit workflows. Please choose the vo_LCG_2_BROKER options (with the '_LCG_2_BROKER' trailing text) rather than the VO (without the '_LCG_2_BROKER' trailing text). E.g. to submit workflows as a cosmo VO user you should choose the cosmo_LCG_2_BROKER entry.

Useful hints
There is a very informative Online User's Manual available. Alternatively see the PDF User's Manual.
Remember that P-Grade uploads a binary from your local machine by default rather than executing something hosted on the remote machine. If your local machine is FC6 and the execute node is SL3, the uploaded hostname binary will not run. So if you want to run the ''hostname'' program on the remote host, you will have to upload a script which runs /bin/hostname. Also, the "Name" of a workflow node is automatically converted to a script file "Name.sh", so don't call your script by the same name.
Also remember that input files are added to a job by adding "ports" to the job node you define. Everything in P-Grade is a workflow, so files are ports that allow data to flow between nodes (or from the local machine). It takes a little while to get used to this approach.
For MPI or PVM jobs, see Building your own Workflow. For parameter studies, see Parameter Studies (NB: upload the input files as replicas into an LFC directory; P-Grade automatically generates one job per input file, then the sub-jobs are displayed with status information).
If you are using a Windows system remember that Windows represents text file end-of-lines with "CRLF" rather than "LF" as in UNIX. This will cause errors of the form "/bin/bash: bad interpreter: No such file or directory". You can overcome this by creating text files in UNIX format (or converting existing text files to UNIX format) using a Windows editor such as Cream (select Format -> File Format).
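The CRLF problem can also be fixed from a UNIX shell instead of a Windows editor. The following is a minimal sketch, assuming a POSIX shell with the standard tr utility; the function name to_unix_eol and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# to_unix_eol FILE: strip carriage returns so that lines end in LF only.
# Hypothetical helper name; the file is rewritten in place via a temporary
# copy, so its permissions (e.g. the executable bit) are kept.
to_unix_eol() {
    tmp="$1.tmp.$$"
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}

# Usage (file name is illustrative):
#   to_unix_eol myscript.sh
```

Note that this removes every carriage return in the file, which is normally what is wanted for shell scripts and JDL files.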
Command Line (on a UI)
A user may interact with the Grid-Ireland middleware via the command line on a grid UI.

Logging On to the UI
In order to make use of Grid-Ireland via the command line you will need an account on a Grid User Interface (UI) machine. A UI is installed at each Grid-Ireland site. If you would like an account please contact the Grid-Ireland helpdesk (grid-ireland-help at cs.tcd.ie) and we will check if you are eligible. Once you have a UI account, you will need to log on to the Grid UI at your site with the username that has been assigned to you:
After logging in, you must get a proxy credential which allows authentication with grid services. Note: these instructions assume you already have a grid certificate and have installed it successfully on the UI according to the instructions.
To use the Grid you must contact the VO membership service to
get a credential that verifies your membership of the VO.
Such a VOMS proxy is created with the
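
On gLite user interfaces a VOMS proxy is normally requested with the voms-proxy-init command and inspected with voms-proxy-info. The following is a minimal sketch, assuming both VOMS client commands are installed on the UI; the wrapper name make_voms_proxy is hypothetical and the VO name in the usage comment is only an example.

```shell
#!/bin/sh
# make_voms_proxy VO: request a VOMS proxy for the given VO, then print
# its details. Hypothetical wrapper around the standard VOMS clients.
make_voms_proxy() {
    # Prompts for the passphrase of your grid certificate key.
    voms-proxy-init --voms "$1" || return 1
    # Shows the proxy lifetime and the VO attributes it carries.
    voms-proxy-info --all
}

# Usage (VO name is illustrative):
#   make_voms_proxy cosmo
```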

Globus Middleware
Globus provides a number of useful commands that act in the same way as
Operating System commands. For example, to run

EGEE gLite Middleware
The LHC Computing Grid (LCG) has produced and collected the Grid software necessary to analyse the data from the CERN Large Hadron Collider project. This is the foundation software on Grid-Ireland. Using Globus as its base, the EGEE gLite software provides enhanced workload and replica management facilities to cater for large scale compute- and data-intensive tasks.

Useful Links

gLite Workload Management
From the gridui command line, one can submit a job:
This will print out a job identifier that acts as a handle to the job. Then one can monitor the status of the job:
glite-wms-job-status <job identifier>
When the job status is Done then one can retrieve the results:
glite-wms-job-output <job identifier>
When submitting a job you can save the job identifier
(referred to as
glite-wms-job-submit -a -o myJobIdFile myJDLfile
The job identifier is then saved in the file
glite-wms-job-status -i myJobIdFile
And one can retrieve the results as follows:
glite-wms-job-output -i myJobIdFile --dir .
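
The submit, status and output steps above can be chained so that results are fetched as soon as the job finishes. The following is a minimal sketch, assuming the job identifier file holds a single job and that the status report contains a "Current Status:" line; the function name wait_and_fetch and the polling interval are hypothetical.

```shell
#!/bin/sh
# wait_and_fetch ID_FILE [INTERVAL]: poll the job status until it reaches
# a final state, then retrieve the output into the current directory.
# Assumption: the status report contains a "Current Status:" line.
wait_and_fetch() {
    idfile=$1
    interval=${2:-60}          # seconds between polls (illustrative default)
    while :; do
        status=$(glite-wms-job-status -i "$idfile" | grep 'Current Status:')
        case $status in
            *Done*) break ;;
            *Aborted*|*Cancelled*)
                echo "Job ended without results: $status" >&2
                return 1 ;;
        esac
        sleep "$interval"
    done
    glite-wms-job-output -i "$idfile" --dir .
}

# Usage, after: glite-wms-job-submit -a -o myJobIdFile myJDLfile
#   wait_and_fetch myJobIdFile 120
```

The loop has no upper time limit, so a real script would probably also give up after some maximum number of polls.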

gLite Data Management
The Replica Catalog can be considered to be temporary
storage that is available at each site on the
Once you have registered a file in the Catalog you have two ways to get it to the job:
Similarly, when the job has generated an output file, you have two ways to register it in the Catalog:

Job Description Language
WARNING!!! Information in this section is in development and is not guaranteed to be correct.

Things to be aware of in JDLs
If the arguments contain quoted strings, the quotes must be escaped with a backslash:
Arguments = "\"Hello World!\" 10";
Special characters such as &, |, >, < are only allowed if specified inside a quoted string or preceded by triple '\\\':
Arguments = "-f file1\\\&file2";
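
The quote-escaping rule can be applied mechanically when a JDL file is generated from a script. The following is a minimal sketch in POSIX shell; the function name jdl_arguments is hypothetical, and it handles only double quotes, not the special characters &, |, > and <.

```shell
#!/bin/sh
# jdl_arguments STRING: print a JDL Arguments attribute for STRING,
# escaping each double quote with a backslash.
# Hypothetical helper; special characters such as & are not handled.
jdl_arguments() {
    escaped=$(printf '%s' "$1" | sed 's/"/\\"/g')
    printf 'Arguments = "%s";\n' "$escaped"
}

jdl_arguments '"Hello World!" 10'
# prints: Arguments = "\"Hello World!\" 10";
```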
The user can provide a local executable name, which will be staged from the UI to the WN, but the InputSandbox must then have the full path:
Executable = "egeode.sh";
InputSandbox = {"/home/bloggsj/egeode/egeode.sh"};

OutputData syntax
Optionally one can register the job output with the Replica Catalog, i.e. send it to the catalog. The filename is mandatory, but the user can optionally specify the LFN and the Storage Element:
OutputData = {
[
OutputFile = "dataset1.out"; # filename is mandatory
],[
OutputFile = "dataset2.out"; # filename is mandatory
LogicalFileName = "lfn:test-result1"; # optional LFN
],[
OutputFile = "dataset3.out"; # filename is mandatory
LogicalFileName = "lfn:test-result2"; # optional LFN
StorageElement = "gridstore.cs.tcd.ie"; # optional Storage Element
]
};
Job without data requirements
# Assumes script.sh is:
# #!/bin/sh
# /bin/echo Hello $1 and Welcome to the Grid-Ireland Tutorial!
# script.jdl
[
Type = "job";
JobType = "Normal";
Executable = "script.sh";
Arguments = "Brian";
StdOutput = "sim.out";
StdError = "sim.err";
InputSandbox = {"script.sh"};
OutputSandbox = {"sim.err","sim.out"};
# A site with more than 4 CPUs is required.
Requirements=(other.GlueCEInfoTotalCPUs>4);
# If more than one resource matches, the resource with the largest
# number of free CPUs is chosen.
Rank = (other.GlueCEStateFreeCPUs);
]
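
Before submitting, it can be useful to create both files and run the script once locally. The following is a minimal sketch that writes script.sh and script.jdl exactly as listed above into a scratch directory; the use of mktemp for the working directory is an assumption made for illustration.

```shell
#!/bin/sh
# Create script.sh and script.jdl in a scratch directory (illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The quoted 'EOF' keeps $1 literal, so it is expanded when the job runs.
cat > script.sh <<'EOF'
#!/bin/sh
/bin/echo Hello $1 and Welcome to the Grid-Ireland Tutorial!
EOF
chmod +x script.sh

cat > script.jdl <<'EOF'
[
Type = "job";
JobType = "Normal";
Executable = "script.sh";
Arguments = "Brian";
StdOutput = "sim.out";
StdError = "sim.err";
InputSandbox = {"script.sh"};
OutputSandbox = {"sim.err","sim.out"};
Requirements=(other.GlueCEInfoTotalCPUs>4);
Rank = (other.GlueCEStateFreeCPUs);
]
EOF

# Local check with the same argument the JDL passes to the job.
sh ./script.sh Brian
# Submission would then be:
#   glite-wms-job-submit -a -o myJobIdFile script.jdl
```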
Job with data requirements
# gridTest.jdl
[
Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/coghlan/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "lfn:testbed0-00019";
DataAccessProtocol = "gridftp";
Requirements = other.Architecture=="INTEL" && other.OpSys=="LINUX" && other.FreeCpus >=4;
Rank = other.GlueHostBenchmarkSF00;
]
Job with output data requirements
# scriptOutput.jdl
[
Type = "job";
JobType = "Normal";
Executable = "scriptOutput.sh";
Arguments = "Gabriele";
VirtualOrganisation = "webcom";
StdOutput = "sim.out";
StdError = "sim.err";
InputSandbox = {"scriptOutput.sh"};
OutputSandbox = {"sim.err", "sim.out"};
OutputData = {
[
OutputFile = "Trinity-2007-06-20.out";
LogicalFileName = "lfn:/grid/webcom/Trinity-2007-06-20.out";
StorageElement = "gridstore.cs.tcd.ie";
]};
Requirements=(other.GlueCEInfoTotalCPUs>4);
Rank=(other.GlueCEStateFreeCPUs);
RetryCount = 0;
]
Assumes scriptOutput.sh is:
#!/bin/sh
fileOut="Trinity-2007-06-20.out"
/bin/echo Hello $1 and Welcome to the Grid-Ireland Tutorial! > $fileOut
Job with input data requirements
# Assumes scriptInput.sh is:
# #!/bin/sh
# lcg-cp --vo marine lfn:myoutdata.1 file:`pwd`/dataset1.out
# echo "Before updating.."
# cat dataset1.out
# #Adding new entry on the dataset1.out file.
# /bin/echo Hello $1 and Welcome to the Grid-Ireland Tutorial! >> dataset1.out
# echo "After updating.."
# cat dataset1.out
# scriptInput.jdl
[
Type = "job";
JobType = "Normal";
Executable = "scriptInput.sh";
Arguments = "Andy";
VirtualOrganisation = "MarineGrid";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"scriptInput.sh"};
OutputSandbox = {"std.err", "std.out"};
InputData = "lfn:myoutdata.1";
DataAccessProtocol = {"gridftp","rfio"};
Requirements=(other.GlueCEInfoTotalCPUs>4);
Rank=(other.GlueCEStateFreeCPUs);
RetryCount = 0;
]
Job that stores data in the Replica Catalog using the lcg-* commands
# StoreData.jdl
[
Type="Job";
JobType="Normal";
Executable="StoreData.sh";
Arguments="myfile myfile gridstore.cp.dias.ie";
VirtualOrganisation="cosmo";
StdOutput="std.out";
StdError="std.err";
InputSandbox={"StoreData.sh","myfile"};
OutputSandbox={"std.out","std.err"};
Requirements=(other.GlueCEInfoTotalCPUs > 4);
Rank=other.GlueCEStateFreeCPUs;
]
Assumes StoreData.sh is:
#!/bin/sh
# StoreData.sh: register a local file in the Replica Catalog.
ARGS=3
ERROR_BADARGS=10   # Bad arguments error
ERROR_FILE=20      # File to copy on RLS doesn't exist
ERROR_STORAGE=30   # Storage Element doesn't exist in this catalog

# Check for proper no. of command line args.
if [ $# -ne $ARGS ]
then
    echo "`basename $0`: Stores data into Replica Catalog."
    echo "Usage: `basename $0` [text-file2copy] [lfn-file2copy] [se-name]"
    exit $ERROR_BADARGS
else
    # Set the variables for the script.
    FILE2COPY=$1
    LFN2COPY=$2
    SENAME=$3
fi

# Check if the ${FILE2COPY} exists in the current directory.
if [ ! -e "`pwd`/${FILE2COPY}" ]
then
    echo "Error: `basename $0`: ${FILE2COPY} does not exist on the path."
    exit ${ERROR_FILE}
fi

# Check if the SENAME is a good SE where to store data.
if lcg-infosites --vo cosmo closeSE | grep "${SENAME}"
then
    # Upload the data to the RLS.
    lcg-cr --vo cosmo -d "${SENAME}" -l "${LFN2COPY}" "file:`pwd`/${FILE2COPY}"
    # Should really check for errors here
else
    echo "Sorry, but you have to specify a closer Storage Element."
    exit $ERROR_STORAGE
fi
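
The script notes that the result of lcg-cr should really be checked. One possible way to do so is sketched below, using only the lcg-cr options that already appear in StoreData.sh; the function name register_file and the exit code 40 are hypothetical additions, not part of the original script.

```shell
#!/bin/sh
ERROR_UPLOAD=40   # Hypothetical code: registration in the catalog failed

# register_file VO SE LFN PATH: upload PATH to the Storage Element SE and
# register it under LFN, reporting a failure instead of ignoring it.
register_file() {
    if lcg-cr --vo "$1" -d "$2" -l "$3" "file:$4"
    then
        echo "Registered $4 as $3 on $2."
    else
        # In the else branch, $? still holds the exit status of lcg-cr.
        echo "Error: lcg-cr failed for $4 (exit status $?)." >&2
        return $ERROR_UPLOAD
    fi
}

# Usage inside StoreData.sh (replacing the bare lcg-cr call):
#   register_file cosmo "${SENAME}" "${LFN2COPY}" "`pwd`/${FILE2COPY}" || exit $?
```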
Job that updates data in the Replica Catalog using the lcg-* commands
# UpdateData.jdl
[
Type="Job";
JobType="Normal";
Executable="UpdateData.sh";
Arguments="myfile Gareth gridstore.cp.dias.ie";
VirtualOrganisation="cosmo";
StdOutput="std.out";
StdError="std.err";
InputSandbox={"UpdateData.sh"};
OutputSandbox={"std.out","std.err"};
Requirements=(other.GlueCEInfoTotalCPUs > 4);
Rank=other.GlueCEStateFreeCPUs;
]
Assumes UpdateData.sh is:
#!/bin/sh
# UpdateData.sh: fetch a file from the Replica Catalog, append a line to
# it, and register the new version under the same LFN.
ARGS=3
ERROR_BADARGS=10      # Bad arguments error.
ERROR_BADLFN=20       # LFN doesn't exist on the RLS.
ERROR_BADSTORAGE=30   # Storage Element doesn't exist in this Catalog.

# Check for proper no. of command line args.
if [ $# -ne $ARGS ]
then
    echo "`basename $0`: Updates data in Replica Catalog."
    echo "Usage: `basename $0` [lfn-file2copy] [argument] [se-name]"
    exit ${ERROR_BADARGS}
else
    # Set the arguments for the script.
    LFN2RETRIEVE=$1
    ARGUMENT=$2
    SENAME=$3
fi

# Check if the ${LFN2RETRIEVE} exists on the RLS (any failure counts).
if ! lcg-lr --vo cosmo "lfn:${LFN2RETRIEVE}"
then
    echo "Sorry, but the LFN you specified does not exist on the RLS."
    exit ${ERROR_BADLFN}
else
    # Retrieve the file from the catalog...
    lcg-cp --vo cosmo "lfn:${LFN2RETRIEVE}" "file:`pwd`/myfile"
    # ...adding new data to myfile.
    echo "Hello ${ARGUMENT} and Welcome to the Grid-Ireland Tutorial!!" >> "`pwd`/myfile"
fi

# Check if the SENAME is a good SE where to store data.
if ! lcg-infosites --vo cosmo closeSE | grep "${SENAME}"
then
    echo "Sorry, but you have to specify a closer Storage Element."
    exit ${ERROR_BADSTORAGE}
else
    # Set the variables for the new copy.
    FILE2COPY=`pwd`/myfile
    LFN2COPY=${LFN2RETRIEVE}
    # Removing the old version from the RLS...
    # Should really rename to a temp file here
    lcg-del --vo cosmo -a "lfn:${LFN2COPY}"
    # ... and stage the new data onto the RLS.
    lcg-cr --vo cosmo -d "${SENAME}" -l "${LFN2COPY}" "file:${FILE2COPY}"
    # Should really check for errors here,
    # and then if no errors then delete temp file
    echo "Your data has been correctly updated in the RLS."
    echo "Have a nice day!!"
fi
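
The comments in the script point out that the old version is deleted before the new one is known to be safely stored. One possible way to reduce that risk is sketched below. It is not the rename-to-a-temporary-LFN approach the comments suggest; instead it keeps a local backup copy and re-registers it if the upload fails. It uses only the lcg-cp, lcg-del and lcg-cr options already shown in the script; the function name replace_lfn is hypothetical.

```shell
#!/bin/sh
# replace_lfn VO SE LFN NEWFILE: replace the catalog entry LFN with the
# contents of NEWFILE (an absolute path), keeping a local backup of the
# old contents until the upload has succeeded.
replace_lfn() {
    vo=$1 se=$2 lfn=$3 newfile=$4
    backupdir=$(mktemp -d) || return 1
    backup="$backupdir/original"

    # 1. Save the current contents locally before touching the catalog.
    lcg-cp --vo "$vo" "lfn:$lfn" "file:$backup" || return 1
    # 2. Remove the old entry and all of its replicas.
    lcg-del --vo "$vo" -a "lfn:$lfn" || return 1
    # 3. Register the new contents; on failure, put the backup back.
    if lcg-cr --vo "$vo" -d "$se" -l "$lfn" "file:$newfile"
    then
        rm -rf "$backupdir"
        return 0
    fi
    echo "Upload failed; restoring previous version of $lfn." >&2
    lcg-cr --vo "$vo" -d "$se" -l "$lfn" "file:$backup"
    return 1
}

# Usage inside UpdateData.sh (replacing the lcg-del / lcg-cr pair):
#   replace_lfn cosmo "${SENAME}" "${LFN2COPY}" "${FILE2COPY}"
```

If the restore step also fails, the backup directory is left in place so the old contents can still be recovered by hand.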

Appendices

SSH Key Pair Generation
See How To Generate SSH Keys for use on the Grid User Interface.

X11 forwarding with SSH
The UI at a site is only directly accessible from within that site. If you wish to run a graphical program remotely on your site, from home or when travelling for example, it may be possible to use X11 forwarding with SSH. What follows are a few examples. It will be necessary to alter the details to suit your particular situation.
To run the JSUI from home, first SSH into an externally accessible host at your institution, and enable X11 forwarding:
Then, you can run the JSUI command normally, as specified above:
To run a browser remotely at your site, in order to access the UI portal, first SSH into an externally accessible host at your institution, and enable X11 forwarding:
It may be possible to run the browser from this externally accessible host directly:
Or it may be necessary to SSH again to another host that has the browser installed:
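
With OpenSSH, X11 forwarding is enabled by the -X option of the ssh command. The following is a minimal sketch of the steps above, assuming OpenSSH on the local machine; the function name x11_run, the user name, the host names under example.org and the firefox browser are all hypothetical and must be replaced with the details for your site.

```shell
#!/bin/sh
# x11_run HOST [COMMAND...]: log in to HOST with X11 forwarding enabled and
# optionally run COMMAND there; its windows appear on the local display.
# Hypothetical helper around the OpenSSH client.
x11_run() {
    host=$1
    shift
    ssh -X "$host" "$@"
}

# Usage (user, host and browser names are illustrative):
#   x11_run bloggsj@gateway.example.org            # interactive login
#   x11_run bloggsj@gateway.example.org firefox    # run a browser there
#   # Hop through the gateway to an internal host that has the browser:
#   ssh -X -t bloggsj@gateway.example.org ssh -X ui.example.org firefox
```

Some setups require trusted forwarding (ssh -Y) instead of -X, and the remote SSH server must permit X11 forwarding.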
Windows users can set up X11 using Cygwin/X.

The GLUE Grid Information Schema
Last modified Thu 12 November 2009. The Grid-Ireland website is hosted on cagraidsvr06.cs.tcd.ie in the Department of Computer Science, Trinity College Dublin.