At the EGEE'09 conference in Barcelona, held from 21 to 25 September 2009, the MPI Working Group organised a session with speakers from the user community and the system administrator community. The presentations can be found in Indico at CERN.

After the session the chairman wrote an e-mail report to all attendees. It is included in full here, since it covers the event:

Dear participants of the MPI Session at the EGEE'09 conference,

Please find the minutes of the meeting below.

The MPI Session drew a lot of interest: the room in which it was held
was completely full. The session was chaired by Dennis
van Dok (Nikhef). The first three talks were given by application users
from three different communities:

* Earth Science - Jean-Pierre Vilotte (Institut de Physique du Globe de Paris, CNRS)
* Computational Chemistry - Alessandro Costantini (University of Perugia)
* Astronomy and Astrophysics - Salvatore Orlando (INAF - Osservatorio Astronomico di Palermo)

Slides of the presentations can be found here:

http://indico.cern.ch/sessionDisplay.py?sessionId=19&slotId=0&confId=55893#2009-09-22

From all application talks it became clear that the current
implementation of MPI on the EGEE infrastructure has a lot of problems.
These problems include:
* Jobs aborted without apparent reason
* Acceptance of jobs on clusters that lack the requested resources
* Missing and/or incorrectly set environment variables

When a problem occurs, it takes a long time to find its cause. The
overall success rate of jobs is low, as indicated by the numbers in the
presentation of Alessandro Costantini.

These application talks were followed by four talks on:
* Operational issues - Fokke Dijkstra (RUG-CIT)
* Monitoring - Paschalis Korosoglou (AUTH)
* MPI Working Group Proceedings - Jeroen Engelberts (SARA)
* MPI_utils for gLite - Oliver Keeble (CERN)

The operations talks reflected the unsatisfactory situation with MPI on
the EGEE infrastructure as well, which was underlined by the following
topics, amongst others:
* Missing consistency check on MPI-enabled sites
* A tedious installation procedure
* Incomplete translation of JDL to scheduler parameters

The session was concluded with a panel discussion with the following
members:
- Isabel Campos (CSIC)
- Fokke Dijkstra (RUG-CIT)
- Oliver Keeble (CERN)
- Laurence Field (CERN)
- Paschalis Korosoglou (AUTH)

To open the discussion, the chairman put up a slide with the following
topics:

* Focus on problems we may be able to solve in the coming few months.
* The project needs to take the support for MPI very seriously, and give it due priority.
* We're worried how MPI support is going to work in EGEE if everybody, including management, has other priorities.
* We're worried how MPI support is going to transfer to EGI.
* This calls for a champion, who has the mandate and the power to chase integrators/sites.

Isabel Campos opened the discussion by introducing herself and by
emphasizing that her group is still supporting the "mpi-start" package.
She stressed that MPI SAM tests should be enforced on sites that
indicate they are supporting MPI. She recently checked the presence of
mpi-start on the MPI-enabled sites and found that only 84 of the 120
CEs have the mpi-start package installed. Sites that don't have
mpi-start installed should remove the MPI tag from the information
system.

Someone in the audience pointed out that not all MPI packages are
available through gLite. Oliver Keeble replied that the packages are
available through several repositories, not necessarily Scientific Linux.

Isabel Campos commented that she currently did not see how support for
MPI was to be continued in EGI.

Frank Harris said that MPI support is in need of an experienced project
leader, a champion so to speak, who actively chases people to solve
problems rather than someone circling around GGUS tickets.

The discussion then turned towards filling in the role of this champion.
Brian Coghlan suggested John Walsh. Frank Harris suggested Isabel Campos
as well. Both candidates confirmed their willingness and said they
already worked closely together in their roles in SA1 and SA3,
respectively. The audience reacted positively and it was decided that
their names would be suggested to the management, keeping in mind that
the audience represented a significant number of the stakeholders. This
included people like Oliver Keeble, Francesco Giacomini and Laurence Field.

It was concluded that it is urgent to put the new MPI Task Force, led by
the champions, in place; it will operate in close cooperation with the
user communities and the service providers.

Kind regards,

Jeroen Engelberts
SARA Reken- en Netwerkdiensten
Chairman of the MPI Working Group
