SAGA Integration with Globus Online Data Movement Service
Posted on 6 October 2023
Primary Mentor: Ole Weidner (ole.weidner@ed.ac.uk)
Secondary Mentor: Shantenu Jha (sjha@cct.lsu.edu)
Background
SAGA is a set of free, cross-platform libraries written in C++ and Python that provide high-level interfaces and runtime components for developing distributed, grid, and cloud computing applications, frameworks, and tools. SAGA is the first complete implementation of the Open Grid Forum's Simple API for Grid Applications standard (GFD-R-P.90).
SAGA is used by many research projects to carry out production-level science on large-scale distributed infrastructure. The main application areas are currently computational biology (efficient sampling algorithms, genomics) and high-energy physics, but SAGA's usefulness is not confined to these areas. There is growing interest in the community in using Globus Online (http://www.globusonline.org/), a hosted cloud service for high-performance, reliable, and secure data movement, through the convenience of SAGA's filesystem API.
Project Goals
Together with our GSoC student, we hope to design and implement a first working prototype of a SAGA adaptor that can communicate with the Globus Online cloud services.
Project Description
The project will comprise three phases:
- conceptualisation of a SAGA adaptor for Globus Online,
- implementation of a Globus Online adaptor, and
- testing of existing SAGA applications with the new Globus Online adaptor.
In the first phase, we will explore how the RESTful API of Globus Online (GO) can be mapped onto SAGA's adaptor concept. In the second phase, we will implement the Globus Online adaptor as a SAGA adaptor, written in C++.
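To make the mapping idea in the first phase a little more concrete, here is a minimal sketch of how a filesystem-style copy call could be translated into a description of a REST request. Everything in it is an assumption made for illustration: the `go://<endpoint>/<path>` URL scheme, the `/transfer` resource, the JSON field names, and the `Location`, `RestRequest`, `parse_location` and `map_copy` names are all hypothetical and are taken neither from SAGA nor from the Globus Online API. No network traffic is involved; the code only builds the request that an adaptor might send.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical location: a named transfer endpoint plus a path on it.
struct Location {
    std::string endpoint;
    std::string path;
};

// Hypothetical description of one HTTP call the adaptor would issue.
struct RestRequest {
    std::string method;
    std::string resource;
    std::string body;
};

// Parse "go://<endpoint>/<path>" (an assumed URL scheme) into a Location.
Location parse_location(const std::string& url) {
    const std::string scheme = "go://";
    if (url.compare(0, scheme.size(), scheme) != 0) {
        throw std::invalid_argument("expected a go:// URL: " + url);
    }
    const std::size_t slash = url.find('/', scheme.size());
    if (slash == std::string::npos || slash == scheme.size()) {
        throw std::invalid_argument("expected go://<endpoint>/<path>: " + url);
    }
    Location loc{url.substr(scheme.size(), slash - scheme.size()),
                 url.substr(slash)};
    assert(!loc.endpoint.empty() && loc.path[0] == '/');
    return loc;
}

// Wrap a string in double quotes, escaping quotes and backslashes only.
std::string json_quote(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out + "\"";
}

// Translate a copy(source, target) call into a single request description.
// The resource name and the field names are placeholders, not a real API.
RestRequest map_copy(const std::string& source_url,
                     const std::string& target_url) {
    const Location src = parse_location(source_url);
    const Location dst = parse_location(target_url);
    RestRequest req;
    req.method = "POST";
    req.resource = "/transfer";
    req.body = "{\"source_endpoint\":" + json_quote(src.endpoint) +
               ",\"source_path\":" + json_quote(src.path) +
               ",\"destination_endpoint\":" + json_quote(dst.endpoint) +
               ",\"destination_path\":" + json_quote(dst.path) + "}";
    return req;
}
```

Calling `map_copy("go://siteA/data/in.dat", "go://siteB/data/out.dat")` would yield a `POST` request whose body names both endpoints and paths. A real adaptor would additionally have to handle authentication, asynchronous task status and error reporting, which is part of what the first phase is meant to work out.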
In the third phase, we will run some existing SAGA applications (e.g. the SAGA Master-Worker framework and SAGA Pilot-Job) with the new adaptor, i.e. replace the existing regular Globus adaptor with the new Globus Online adaptor, to see (a) whether it works as expected and (b) whether there are any performance implications. These tests will be conducted on large-scale production infrastructure such as TeraGrid/XD and EGI. If the results are interesting, we will try to publish them (together with the student, of course).
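The comparison in the third phase could be approached with a small timing harness along the following lines. This is only a sketch under stated assumptions: `FileAdaptor`, `CopyResult`, `timed_copy` and `RecordingAdaptor` are hypothetical names invented for this example and do not reflect SAGA's actual adaptor interface. The idea is simply that if both adaptors sit behind one common interface, the same copy can be timed against either of them.

```cpp
#include <chrono>
#include <string>

// Hypothetical common interface that both adaptors would implement.
class FileAdaptor {
public:
    virtual ~FileAdaptor() = default;
    // Returns true if the copy succeeded.
    virtual bool copy(const std::string& source,
                      const std::string& target) = 0;
};

// Outcome of one timed copy: success flag and wall-clock duration.
struct CopyResult {
    bool ok;
    double seconds;
};

// Run one copy through the given adaptor and measure how long it took.
CopyResult timed_copy(FileAdaptor& adaptor,
                      const std::string& source,
                      const std::string& target) {
    const auto start = std::chrono::steady_clock::now();
    const bool ok = adaptor.copy(source, target);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return CopyResult{ok, elapsed.count()};
}

// Stand-in adaptor that only records calls, so the harness can be
// tried out without access to any infrastructure.
class RecordingAdaptor : public FileAdaptor {
public:
    int calls = 0;
    std::string last_source;
    std::string last_target;

    bool copy(const std::string& source,
              const std::string& target) override {
        ++calls;
        last_source = source;
        last_target = target;
        return true;
    }
};
```

In an actual test run, the stand-in would be replaced first by the existing Globus adaptor and then by the new Globus Online adaptor, with `timed_copy` called on the same source and target for each. Repeating each measurement several times would help separate adaptor overhead from network variability.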
Project Requirements
The student should be a good C++ programmer. A general interest in distributed, grid, and cloud computing would be great. The applicant should be capable of working independently and organized enough to take charge of a project all the way from planning to implementation.
Further information
- Project website: http://saga-project.github.io/