Natural Language to SQL Queries: A Review

Authors

  • Mirza Shahzaib Baig, Department of Creative Technologies, Faculty of Computing & AI, Air University, Islamabad
  • Azhar Imran, Department of Creative Technologies, Faculty of Computing & AI, Air University, Islamabad
  • Amanullah Yasin, Department of Creative Technologies, Faculty of Computing & AI, Air University, Islamabad
  • Abdul Haleem Butt, Department of Creative Technologies, Faculty of Computing & AI, Air University, Islamabad
  • Muhammad Imran Khan, Department of Creative Technologies, Faculty of Computing & AI, Air University, Islamabad

Keywords:

Natural Language Processing, Structured Query Language (SQL), text to relational database, Natural Language Interface for Databases (NLIDB), Intelligent Database System (IDBS), Database Management System (DBMS)

Abstract

Relational databases are a standard way of storing, maintaining, and accessing structured data, but to retrieve data from them a user's request must be expressed as an SQL query. Using natural language instead of SQL has led to a class of systems called Natural Language Interfaces to Databases (NLIDB). NLIDB is a step toward intelligent database systems (IDBS) that help users query databases flexibly: a model infers the relational database query from a natural language question. Advanced neural algorithms learn the text-to-SQL mapping end to end and reach an accuracy of 80% on publicly available datasets. In this paper, we review the existing frameworks and compare them on three components: the aggregation classifier, the select column pointer, and the clause pointer. Furthermore, we discuss the role of semantic parsing and the contribution of neural algorithms in predicting the aggregation, the column pointer, and the clause pointer. People with limited background knowledge, in particular, cannot access databases with ease, and natural language interfaces for relational databases address this by translating natural language into SQL queries. This paper reviews the existing frameworks for translating natural language to SQL queries and, in the discussion section, also covers some speech-to-SQL models, in order to explain their design and highlight the limitations of existing models.
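
As a rough illustration of the three components named in the abstract, the sketch below assembles a SQL string from an aggregation label, a selected column, and a list of WHERE conditions. It is not taken from any of the reviewed systems: the names `QuerySketch`, `to_sql`, and `AGG_OPS`, the label set, and the table and column names are all hypothetical, and a real model would predict these fields from the question and the table schema rather than receive them by hand.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical label set for the aggregation classifier (an assumption).
AGG_OPS = (None, "MAX", "MIN", "COUNT", "SUM", "AVG")


@dataclass
class QuerySketch:
    """Three predicted parts of a single-table query (illustrative only)."""
    table: str
    select_column: str                 # output of the select column pointer
    aggregation: Optional[str] = None  # output of the aggregation classifier
    # Output of the clause pointer: (column, operator, value) triples.
    conditions: List[Tuple[str, str, object]] = field(default_factory=list)


def to_sql(sketch: QuerySketch) -> Tuple[str, list]:
    """Render the sketch as a parameterized SQL string plus its parameters."""
    if sketch.aggregation not in AGG_OPS:
        raise ValueError(f"unknown aggregation: {sketch.aggregation}")
    target = f'"{sketch.select_column}"'
    if sketch.aggregation:
        target = f"{sketch.aggregation}({target})"
    sql = f'SELECT {target} FROM "{sketch.table}"'
    params = []
    if sketch.conditions:
        parts = []
        for column, op, value in sketch.conditions:
            if op not in ("=", "<", ">"):
                raise ValueError(f"unsupported operator: {op}")
            # Values are passed as parameters, never spliced into the string.
            parts.append(f'"{column}" {op} ?')
            params.append(value)
        sql += " WHERE " + " AND ".join(parts)
    return sql, params


# Made-up question: "How many employees are in the sales department?"
example = QuerySketch(
    table="employees",
    select_column="name",
    aggregation="COUNT",
    conditions=[("department", "=", "sales")],
)
sql, params = to_sql(example)
print(sql, params)
```

Splitting the query into these three fields is what lets each part be predicted and evaluated separately, which is the basis on which the frameworks are compared.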

Published

2022-02-22

How to Cite

Baig, M. S., Imran, A., Yasin, A., Butt, A. H., & Khan, M. I. (2022). Natural Language to SQL Queries: A Review. International Journal of Innovations in Science & Technology, 4(1), 147–162. Retrieved from https://journal.50sea.com/index.php/IJIST/article/view/145