Extension of Rapid to the Hadoop Framework
Google Summer of Code 2010 ideas
Primary Mentor: Jano van Hemert
Secondary Mentor: Jos Koetsier
Project: Rapid
Background
Rapid is a way of quickly designing and delivering web portal interfaces to applications that require computational resources, such as utility computing infrastructures or high-performance computing facilities. It focuses on the requirements of the end user: it provides customised user interfaces for domain-specific applications, so that users can carry out particular tasks from the comfort of their own web browser.
Project Goals
To add a module to Rapid that allows it to communicate with the Apache Hadoop framework. Rapid is a technology for quickly creating web portal interfaces that execute applications on remote compute resources. Currently, Rapid can submit to several job submission engines, such as Sun Grid Engine, Condor and PBS. You will extend Rapid with code that handles submitting jobs to Hadoop, monitoring those jobs, and moving data to and from the Hadoop framework.
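As a rough illustration of the submit-and-monitor part of such a module, here is a minimal Java sketch. Every name in it (`HadoopEngineSketch`, `JobEngine`, `JobState`, `InMemoryEngine`) is hypothetical and is not taken from Rapid or from Hadoop. The engine is an in-memory stand-in that only simulates job progress, so the sketch shows the possible shape of an interface, not a working Hadoop integration.

```java
import java.util.HashMap;
import java.util.Map;

public class HadoopEngineSketch {

    /** Hypothetical job states; a real engine would likely need more (e.g. failed). */
    public enum JobState { QUEUED, RUNNING, DONE, UNKNOWN }

    /** Hypothetical contract a submission module might fulfil. */
    public interface JobEngine {
        /** Submits a job and returns an identifier for later monitoring. */
        String submit(String command);

        /** Returns the state of the job as observed at the time of the call. */
        JobState poll(String jobId);
    }

    /** In-memory stand-in: does not contact any cluster (assumption for illustration). */
    public static class InMemoryEngine implements JobEngine {
        private final Map<String, JobState> jobs = new HashMap<>();
        private int counter = 0;

        @Override
        public String submit(String command) {
            String id = "job-" + (++counter);
            jobs.put(id, JobState.QUEUED);
            return id;
        }

        @Override
        public JobState poll(String jobId) {
            JobState state = jobs.getOrDefault(jobId, JobState.UNKNOWN);
            // Simulate progress: each poll moves the job one step forward.
            if (state == JobState.QUEUED) {
                jobs.put(jobId, JobState.RUNNING);
            } else if (state == JobState.RUNNING) {
                jobs.put(jobId, JobState.DONE);
            }
            return state;
        }
    }

    public static void main(String[] args) {
        JobEngine engine = new InMemoryEngine();
        String id = engine.submit("wordcount input output");
        // Poll until the simulated job reports completion.
        JobState state;
        do {
            state = engine.poll(id);
            System.out.println(id + " is " + state);
        } while (state != JobState.DONE);
    }
}
```

Keeping submission and monitoring behind one small interface would let a Hadoop-backed implementation sit alongside the existing engines, although how Rapid actually structures its modules is something to establish from its source code.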
Project Description
Currently, Rapid works with Grid and high-performance computing infrastructure. Your task is to adapt Rapid so that it can be used to generate intuitive interfaces that submit jobs to several cloud infrastructures. Examples of these infrastructures are Amazon's Elastic Compute Cloud (EC2), Eucalyptus, Rackspace, Linode and GoGrid. Preferably, you will look into existing solutions that can handle several of these infrastructures at once via a standard library, such as libcloud.
Cloud computing provides several advantages over other distributed computing approaches such as Grid and high-performance computing. However, it also brings several problems, such as expensive data movement and the potential for wasting resources if virtual machines sit idle. In this project you will investigate solutions that use Apache Hadoop to organise the computation better, so as to make efficient use of compute resources.
Project Requirements
- Learn how Rapid is used
- Learn how Rapid works internally, especially its modules for communicating with various computing infrastructures
- Plan and design how to add a module for Apache Hadoop
- Write the code with documentation
- Transfer the code to us
Skill Set Required for the Student
Java, XML





© The University of Southampton on behalf of OMII-UK. All Rights Reserved.