Integrating LLM for Cotton Soil Analysis in Smart Agriculture System

Syed Hassan Ali; Muhammad Farrukh Shahid; M. Hassan Tanveer; Abdul Rauf

Authors

Syed Hassan Ali, Department of AI & Data Science (FAST-NUCES, Karachi, Pakistan)
Muhammad Farrukh Shahid, Department of AI & Data Science (FAST-NUCES, Karachi, Pakistan)
M. Hassan Tanveer, Department of Robotics & Mechatronics Engineering (Kennesaw State University, Marietta, GA, USA)
Abdul Rauf, Department of AI & Data Science (FAST-NUCES, Karachi, Pakistan)

Keywords:

Large Language Models (LLMs), Soil Health, Cotton Soil Reports, Cotton Farming, Soil Analysis.

Abstract

Cotton is a critical crop for the agricultural economy, and its productivity is closely tied to soil quality, particularly soil nutrient levels and pH. Monitoring and optimizing these properties is essential for sustainable cotton cultivation. This study proposes using fine-tuned Large Language Models (LLMs), specifically GPT-2 and LLaMA-2, to automate soil analysis and produce detailed soil reports with actionable recommendations, addressing the limitations of traditional machine learning models in this context. A custom dataset was created by extracting key information from cotton-specific resources, focusing on soil nutrient interpretation and recommendations across different growth stages. GPT-2 and LLaMA-2 (specifically, the Nous Research version LLaMA2-7b-hf from Hugging Face) were fine-tuned on this dataset to generate data-driven reports on cotton soil health. The fine-tuned GPT-2 model reached a training loss of 0.093 and an evaluation loss of 0.086. LLaMA-2 reached a lower training loss of 0.033 but a higher evaluation loss of 0.25, so GPT-2 performed better on held-out data. Evaluated with BERTScore, GPT-2 achieved average Precision, Recall, and F1 scores of 0.9284, 0.9308, and 0.9296, respectively, indicating higher report accuracy and contextual relevance than LLaMA-2. The generated reports included soil properties and actionable nutrient management recommendations to support optimized cotton growth. Using fine-tuned LLMs for soil report generation can improve nutrient management practices, contributing to higher yields and more sustainable cotton farming.
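
The abstract reports BERTScore Precision, Recall, and F1. As a rough illustration of how such scores are formed, the sketch below applies greedy cosine-similarity matching between candidate and reference token vectors. It is a minimal sketch under stated assumptions, not the authors' evaluation pipeline: the 2-D vectors are made-up stand-ins for contextual BERT embeddings, the function names are hypothetical, and optional extras such as IDF weighting and baseline rescaling are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_bertscore(candidate_vecs, reference_vecs):
    """BERTScore-style precision, recall, and F1 via greedy matching.

    Hypothetical helper: a real evaluation would obtain the vectors from a
    BERT-family model instead of passing them in by hand.
    """
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy stand-ins for token embeddings (assumed values, not real model output).
reference = [[1.0, 0.0], [0.0, 1.0]]  # two distinct "tokens" in the reference report
candidate = [[1.0, 0.0], [1.0, 0.0]]  # generated report repeats the first token only

p, r, f = greedy_bertscore(candidate, reference)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

In this toy case every candidate token has an exact match in the reference, so precision is high, while the second reference token is never covered, which lowers recall. The same asymmetry is why the abstract reports the two scores separately alongside F1.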
