CLASSIFICATION COMPLEX QUERY SQL FOR DATA LAKE MANAGEMENT USING MACHINE LEARNING

by Nurhadi Nurhadi, Rabiah Abdul Kadir, Ely Salwana Mat Surin

Published in Journal of Information System and Technology Management by Global Academic Excellence (M) Sdn Bhd.

2021, Issue 22, pp. 15-24

Abstract

A query is a request for data or information from a database table or a combination of tables, and it allows for a more precise database search. SQL queries are divided into two types: simple queries and complex queries. Complex SQL refers to queries that go beyond the standard use of the SELECT and WHERE commands. Such queries often involve complex joins and subqueries, where one query is nested in the WHERE clause of another. Complex SQL queries can be grouped into two types: Online Transaction Processing (OLTP) queries and Online Analytical Processing (OLAP) queries. Implementing complex SQL queries on a NoSQL database requires a classification process because the data come in varying formats: structured, semi-structured, and unstructured. The classification process aims to make it easier to organize the query data by type of query. The classification methods used in this research are the Naive Bayes Classifier (NBC), which is commonly used for text data, and the Support Vector Machine (SVM), which is known to work very well on high-dimensional data. The two methods are compared to determine which gives the better classification result. The results showed that SVM achieved a classification accuracy of 84.61%, compared with 76.92% for NBC.
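To make the classification step concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier that labels SQL query text as OLTP or OLAP. It is not the authors' implementation. The training queries, the labels, the word-level tokenizer, the use of Laplace smoothing, and the names `tokenize`, `NaiveBayes`, `TRAIN`, and `accuracy` are all assumptions made for illustration, and the toy data does not reproduce the accuracies reported in the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sql):
    """Lowercase a SQL string and keep only word-like tokens."""
    return re.findall(r"[a-z_]+", sql.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over SQL tokens with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.token_counts = defaultdict(Counter)   # token frequencies per class
        for text, label in zip(texts, labels):
            self.token_counts[label].update(tokenize(text))
        self.vocab = {tok for counts in self.token_counts.values() for tok in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:              # tokens never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of queries whose predicted label matches the given label."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Hypothetical labelled queries, invented for this sketch only.
TRAIN = [
    ("INSERT INTO orders (id, total) VALUES (1, 20)", "OLTP"),
    ("UPDATE accounts SET balance = balance - 10 WHERE id = 7", "OLTP"),
    ("DELETE FROM cart WHERE user_id = 3", "OLTP"),
    ("SELECT name FROM users WHERE id = 5", "OLTP"),
    ("SELECT region, SUM(total) FROM sales GROUP BY region", "OLAP"),
    ("SELECT year, AVG(price) FROM orders GROUP BY year ORDER BY year", "OLAP"),
    ("SELECT c.name, COUNT(*) FROM customers c JOIN orders o "
     "ON c.id = o.cid GROUP BY c.name", "OLAP"),
    ("SELECT product, SUM(qty) FROM sales WHERE year = 2020 "
     "GROUP BY product HAVING SUM(qty) > 100", "OLAP"),
]

model = NaiveBayes().fit([q for q, _ in TRAIN], [y for _, y in TRAIN])

print(model.predict("UPDATE users SET name = 'x' WHERE id = 2"))             # OLTP
print(model.predict("SELECT region, COUNT(*) FROM orders GROUP BY region"))  # OLAP
```

An SVM could be compared on the same token features. With scikit-learn, for example, one option would be to pair `TfidfVectorizer` with `LinearSVC` and score both models on a shared held-out set, although the abstract does not state which tooling or features were used.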

Type  article-journal
Stage   published
Date   2021-09-01
Language   en
Journal Metadata
Not in DOAJ
Not in Keepers Registry
ISSN-L:  0128-1666