IoT-enabled directed acyclic graph in spark cluster

by Jahwan Koo, Nawab Muhammad Faseeh Qureshi, Isma Farah Siddiqui, Asad Abbas, Ali Kashif Bashir

Published in Journal of Cloud Computing: Advances, Systems and Applications by Springer Science and Business Media LLC.

2020  

Abstract

Real-time data streaming fetches live sensory segments of a dataset in a heterogeneous distributed computing environment. The process assembles data chunks at a rapid encapsulation rate through a streaming technique that bundles sensor segments into multiple micro-batches and extracts them into a repository. Recently, the acquisition process has been enhanced with the additional feature of exchanging IoT devices' datasets, which comprise two components: (i) sensory data and (ii) metadata. The sensory data body includes record information, and the metadata part consists of logs, heterogeneous events, and routing path tables used to transmit micro-batch streams into the repository. The real-time acquisition procedure uses a Directed Acyclic Graph (DAG) to extract live query outcomes from in-place micro-batches through MapReduce stages and returns a result set. However, a few bottlenecks affect performance during execution: (i) formation of homogeneous micro-batches only, (ii) the complexity of dataset diversification, (iii) processing of heterogeneous data tuples, and (iv) a linear-only DAG workflow. As a result, the procedure incurs high processing latency and additional cost when extracting event-enabled IoT datasets. Thus, a Spark cluster, which processes Resilient Distributed Datasets (RDDs) at a fast pace using random access memory (RAM), falls short of the expected robustness when processing IoT streams in a distributed computing environment. This paper presents an IoT-enabled Directed Acyclic Graph (I-DAG) technique that labels micro-batches at the stage of building a stream event and arranges stream elements by event label. In the next step, heterogeneous stream events are processed through the I-DAG workflow, which uses a non-linear DAG operation to extract query results in a Spark cluster. The performance evaluation shows that I-DAG resolves the homogeneous IoT-enabled stream event issue and provides an effective heterogeneous stream event solution for IoT-enabled datasets in Spark clusters.
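
The labeling and branching idea described in the abstract can be illustrated with a small sketch. This is not the paper's I-DAG implementation and does not use Spark; it is a plain-Python toy under stated assumptions. The record shape (an `event` label plus a `value`), the function names `build_labeled_batches` and `run_branches`, and the per-label handlers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def build_labeled_batches(stream, batch_size):
    """Group stream elements by event label, then cut each group
    into micro-batches of at most batch_size elements."""
    by_label = defaultdict(list)
    for record in stream:
        by_label[record["event"]].append(record)
    batches = []
    for label, records in by_label.items():
        for start in range(0, len(records), batch_size):
            batches.append((label, records[start:start + batch_size]))
    return batches

def run_branches(batches, handlers):
    """Route each labeled micro-batch to the handler registered for its
    label (one branch per event type) and collect results per label."""
    results = defaultdict(list)
    for label, records in batches:
        results[label].append(handlers[label](records))
    return dict(results)

# Hypothetical heterogeneous stream: two event types interleaved.
stream = [
    {"event": "temperature", "value": 21.5},
    {"event": "motion", "value": 1},
    {"event": "temperature", "value": 22.5},
    {"event": "motion", "value": 0},
    {"event": "temperature", "value": 23.0},
]

# Hypothetical per-label processing: mean for temperature, count for motion.
handlers = {
    "temperature": lambda recs: sum(r["value"] for r in recs) / len(recs),
    "motion": lambda recs: sum(r["value"] for r in recs),
}

batches = build_labeled_batches(stream, batch_size=2)
results = run_branches(batches, handlers)
print(results)
```

Each event label gets its own processing branch here, in contrast to a single linear pipeline that would treat every micro-batch identically. In a real Spark cluster the branches would be expressed as separate transformations over the labeled data rather than as Python callables.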

Archived Files and Locations

application/pdf  3.9 MB
file_j2qodshmavcq3lrhvinrloyz6q
journalofcloudcomputing.springeropen.com (publisher)
web.archive.org (webarchive)
Preserved and Accessible
Type  article-journal
Stage   published
Date   2020-09-14
Language   en
Journal Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  2192-113X