At the EGEE'09 conference in Barcelona, held 21-25 September 2009, the MPI Working Group organised a session with speakers from the user community and the system administrator community. The presentations can be found in Indico at CERN.
After the session, the chairman sent an e-mail report to all attendees. It is included here in full, since it covers the event:
Dear participants of the MPI Session at the EGEE'09 conference,
Please find the minutes of the meeting below.
The MPI Session proved to be of interest to many people: the room in which it was held was completely full. The session was chaired by Dennis van Dok (Nikhef).
The first three talks were given by application users from three different communities:
* Earth Science - Jean-Pierre Vilotte (Institut de Physique du Globe de Paris, CNRS)
* Computational Chemistry - Alessandro Costantini (University of Perugia)
* Astronomy and Astrophysics - Salvatore Orlando (INAF - Osservatorio Astronomico di Palermo)
Slides of the presentations can be found here:
http://indico.cern.ch/sessionDisplay.py?sessionId=19&slotId=0&confId=55893#2009-09-22
From all application talks it became clear that the current implementation of MPI on the EGEE infrastructure has a lot of problems. These problems include:
* Abortion of jobs without reason
* Acceptance of jobs on clusters without the requested resources
* Missing and/or incorrect environment variable setup
When a problem occurs, it takes a long time to find its cause. The overall success rate of jobs is low, as indicated by the numbers in the presentation of Alessandro Costantini.
These application talks were followed by four talks on:
* Operational issues - Fokke Dijkstra (RUG-CIT)
* Monitoring - Paschalis Korosoglou (AUTH)
* MPI Working Group Proceedings - Jeroen Engelberts (SARA)
* MPI_utils for gLite - Oliver Keeble (CERN)
The operations talks reflected the unsatisfactory situation with MPI on the EGEE infrastructure as well, which was underlined by the following topics, amongst others:
* Missing consistency check on MPI enabled sites
* A tedious installation procedure
* Incomplete translation of JDL to scheduler parameters
The session was concluded with a panel discussion with the following members:
- Isabel Campos (CSIC)
- Fokke Dijkstra (RUG-CIT)
- Oliver Keeble (CERN)
- Laurence Field (CERN)
- Paschalis Korosoglou (AUTH)
To open the discussion, the chairman put up a slide with the following topics:
* Focus on problems we may be able to solve in the coming few months.
* The project needs to take the support for MPI very seriously, and give it due priority.
* We're worried how MPI support is going to work in EGEE if everybody, including management, has other priorities.
* We're worried how MPI support is going to transfer to EGI.
* This calls for a champion, who has the mandate and the power to chase integrators/sites.
Isabel Campos opened the discussion by introducing herself and by emphasizing that her group is still supporting the "mpi-start" package. She stressed that MPI SAM tests should be enforced on sites that indicate they are supporting MPI. She recently checked the presence of mpi-start on the MPI enabled sites and found that only 84 of the 120 CEs have the mpi-start package installed. Sites that don't have mpi-start installed should remove the MPI tag from the information system. Someone in the audience hinted that not all MPI packages are available through gLite. Oliver Keeble replied that the packages are available through several repositories, not necessarily Scientific Linux.
Isabel Campos commented that she currently didn't see how support for MPI was to be continued in EGI. Frank Harris said that MPI support is in need of an experienced project leader, a champion so to say, who actively chases people to solve problems rather than someone circling around GGUS tickets. The discussion then turned towards filling the role of this champion. Brian Coghlan suggested John Walsh. Frank Harris suggested Isabel Campos as well. Both candidates confirmed their willingness and said they already worked closely together in their roles in SA1 and SA3, respectively. The audience reacted positively and it was decided that their names would be suggested to the management, keeping in mind that the audience represented a significant number of the stakeholders. This included people like Oliver Keeble, Francesco Giacomini and Laurence Field.
It was concluded that it is urgent to get the new MPI Task Force, led by the champions, in place, which will operate in close cooperation with the user communities and the service providers.
Kind regards,
Jeroen Engelberts
SARA Reken- en Netwerkdiensten
Chairman of the MPI Working Group
TCG working