Extension of Rapid to the Hadoop Framework

Google Summer of Code 2010 ideas

Primary Mentor: Jano van Hemert
Secondary Mentor: Jos Koetsier
Project: Rapid.

Background

Rapid is a unique way of quickly designing and delivering web portal interfaces to applications that require computational resources, such as utility computing infrastructures or high-performance computing facilities. It focuses on the requirements of the end-user by designing customised user interfaces for domain-specific applications that allow users to achieve particular tasks from the comfort of their own web browser.

Project Goals

The goal is to add a module to Rapid that allows it to communicate with the Apache Hadoop framework. Rapid is a technology for quickly creating web portal interfaces that execute applications on remote compute resources. Currently, Rapid can submit to several job submission engines, such as Sun Grid Engine, Condor and PBS. You will extend Rapid with code that submits jobs to the Hadoop framework, monitors them and handles their data.
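
To make the three tasks (submission, monitoring and data handling) concrete, the sketch below assembles the argument lists that the standard Hadoop command-line client expects for each of them. The class name HadoopCommands, its method names and all paths are hypothetical and are not part of Rapid. Only the hadoop subcommands themselves (fs -put, fs -get, jar, job -status) come from the Hadoop client, and newer Hadoop releases may prefer mapred job over hadoop job. It is one possible starting point, not a design for the module.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Rapid: it only builds argument lists
// for the Hadoop command-line client and never contacts a cluster.
final class HadoopCommands {

    private HadoopCommands() {}

    // Data handling (stage in): hadoop fs -put <localSrc> <dfsDst>
    static List<String> stageIn(String localSrc, String dfsDst) {
        return Arrays.asList("hadoop", "fs", "-put", localSrc, dfsDst);
    }

    // Submission: hadoop jar <jar> <mainClass> [args...]
    static List<String> submitJar(String jar, String mainClass, String... args) {
        List<String> cmd = new ArrayList<>(Arrays.asList("hadoop", "jar", jar, mainClass));
        cmd.addAll(Arrays.asList(args));
        return cmd;
    }

    // Monitoring: hadoop job -status <jobId>
    static List<String> status(String jobId) {
        return Arrays.asList("hadoop", "job", "-status", jobId);
    }

    // Data handling (stage out): hadoop fs -get <dfsSrc> <localDst>
    static List<String> stageOut(String dfsSrc, String localDst) {
        return Arrays.asList("hadoop", "fs", "-get", dfsSrc, localDst);
    }

    public static void main(String[] args) {
        // All file names, paths and class names below are placeholders.
        System.out.println(String.join(" ", stageIn("input.txt", "/user/demo/in")));
        System.out.println(String.join(" ",
                submitJar("wordcount.jar", "demo.WordCount", "/user/demo/in", "/user/demo/out")));
        System.out.println(String.join(" ", stageOut("/user/demo/out", "results")));
    }
}
```

Each returned list can be passed directly to java.lang.ProcessBuilder to run the command. A real module might instead use the Hadoop Java API, which avoids parsing command-line output when monitoring jobs.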

Project Description

Currently, Rapid works with Grid and High-Performance Computing infrastructure. It is your task to adapt Rapid so that it can be used to generate intuitive interfaces that submit jobs to several cloud infrastructures. Examples of these infrastructures are Amazon's Elastic Compute Cloud (EC2), Eucalyptus, Rackspace, Linode and GoGrid. Preferably you will look into existing solutions that can handle several of these infrastructures at once via a standard library, such as libcloud.

Cloud computing provides several advantages over other distributed computing approaches such as Grid and high-performance computing. However, it also brings several problems, such as expensive data movement and the potential for wasting resources if virtual machines sit idle. In this project you will investigate solutions that involve Apache Hadoop to better organise the computation so as to make efficient use of compute resources.

Project Requirements

  • Learn how Rapid is used.
  • Learn how Rapid works internally, especially the modules that communicate with various computing infrastructures.
  • Plan and design how to add a module for Apache Hadoop.
  • Write the code with documentation.
  • Transfer the code to us
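
As a rough illustration of the design step in the list above, the sketch below shows one way a portal could treat Hadoop as one more pluggable submission backend alongside the existing engines. Every name in it (BackendSketch, SubmissionBackend, InMemoryBackend, Registry, JobState) is hypothetical. Rapid's real module interface is not described on this page and may look quite different. The in-memory backend is a stand-in so that the example runs without a cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout: this is not Rapid's actual module API.
final class BackendSketch {

    enum JobState { QUEUED, RUNNING, DONE }

    // What a portal would need from any backend: submit a job, then poll it.
    interface SubmissionBackend {
        String submit(String executable);   // returns a backend-specific job id
        JobState poll(String jobId);
    }

    // Stand-in backend that records jobs in memory instead of contacting a cluster.
    static final class InMemoryBackend implements SubmissionBackend {
        private final String prefix;
        private final Map<String, JobState> jobs = new HashMap<>();
        private int next = 1;

        InMemoryBackend(String prefix) { this.prefix = prefix; }

        @Override public String submit(String executable) {
            String id = prefix + "-" + next++;
            jobs.put(id, JobState.QUEUED);
            return id;
        }

        @Override public JobState poll(String jobId) {
            JobState state = jobs.get(jobId);
            if (state == null) throw new IllegalArgumentException("unknown job: " + jobId);
            return state;
        }
    }

    // Maps a backend name chosen in the portal configuration to an implementation.
    static final class Registry {
        private final Map<String, SubmissionBackend> backends = new HashMap<>();

        void register(String name, SubmissionBackend backend) { backends.put(name, backend); }

        SubmissionBackend lookup(String name) {
            SubmissionBackend backend = backends.get(name);
            if (backend == null) throw new IllegalArgumentException("no backend named " + name);
            return backend;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("pbs", new InMemoryBackend("pbs"));
        registry.register("hadoop", new InMemoryBackend("hadoop"));

        SubmissionBackend hadoop = registry.lookup("hadoop");
        String id = hadoop.submit("wordcount.jar");   // placeholder job
        System.out.println(id + " is " + hadoop.poll(id));
    }
}
```

With this shape, adding Hadoop support would mean writing one new implementation of the interface and registering it, leaving the portal code that calls submit and poll unchanged.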

Skill set required for the student

Java, XML

This page (revision-4) was last changed on 11-Mar-2010 14:31 by MarioAntonioletti

© The University of Southampton on behalf of OMII-UK. All Rights Reserved.