Docker container-based big data processing system in multiple clouds for everyone

Abstract

Big data processing is becoming essential for all users, whatever their type or application area, to extract meaningful information from large volumes of data. It is a broad term that covers several operations, such as the storage, cleaning, organization, modelling, analysis and presentation of data, at scale and with efficiency. For ordinary users, the significant challenges are the need for a powerful data processing system and its provisioning, the installation of complex big data analytics tools, and the difficulty of using them. Docker is a container-based virtualization technology that has recently introduced Docker Swarm for the development of various types of multi-cloud distributed systems, which can help solve all of the above problems for ordinary users. However, Docker is predominantly used in the software development industry, and less attention has been given to the data processing aspect of this container-based technology. This paper therefore proposes a Docker container-based big data processing system in multiple clouds for everyone, which explores another potential dimension of Docker: big data analysis. The system is an inexpensive and user-friendly framework for anyone with basic IT skills, and it can be easily developed on a single machine, multiple machines or multiple clouds. The paper demonstrates the architectural design and simulated development of the proposed system in multiple clouds. It then illustrates the automated provisioning of big data clusters using two popular big data analytics tools, Hadoop and Pachyderm (without Hadoop), including the Web-based GUI Hue for easy data processing in Hadoop.

Publication DOI: https://doi.org/10.1109/SysEng.2017.8088294
Divisions: College of Engineering & Physical Sciences
Additional Information: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Event Title: 2017 IEEE International Systems Engineering Symposium (ISSE)
Event Type: Other
Event Location: Vienna, Austria
Event Dates: 2017-10-11 - 2017-10-13
ISBN: 978-1538634042, 978-1538634035
Last Modified: 24 Apr 2024 07:28
Date Deposited: 14 Sep 2020 12:06
Related URLs: https://ieeexpl ... ocument/8088294 (Publisher URL)
PURE Output Type: Conference contribution
Published Date: 2017-10-30
Authors: Naik, Nitin (ORCID Profile 0000-0002-0659-9646)


Version: Accepted Version
