Classifying Text in Citation Context as Relevant or Irrelevant to the Cited Paper

Authors

  • Afsheen Khalid Center for Excellence in IT, Institute of Management Sciences, Peshawar, Pakistan
  • Dilawar Khan Computer Science & IT Department, University of Engineering and Technology, Peshawar, Pakistan
  • Shaukat Ali Department of Computer Science, University of Peshawar, Peshawar, Pakistan

Keywords:

Citation Context, Conditional Random Field, Fixed Window, Citation Analysis, Relevant Text

Abstract

Citation contexts, whether full citing sentences or the text within a fixed window around a citation, are widely used in citation analysis applications. However, because no precise technique exists for identifying the exact span of text that describes a citation, these applications must rely on extended text as the citation context. In this paper, we introduce new features and combine them with baseline features to identify the text that characterizes a citation more accurately. Specifically, we use a Conditional Random Field (CRF) sequence classifier to label the text surrounding a citation as relevant or irrelevant. Adding the new features improves the precision, recall, and F-measure scores for the Relevant (R) class. Although the average values of all measures remain similar to those obtained with baseline features alone, our approach significantly improves the extraction of relevant text.
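
The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the window size, the pronoun list, and the "I" label for irrelevant text are all assumptions made for illustration (only the "R" label comes from the abstract). The sketch builds one feature dictionary per sentence in a fixed window around the citing sentence, which is the kind of per-item input a CRF sequence classifier typically consumes, and computes precision, recall, and F-measure for a single class.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical cue words; the paper's actual feature set is not given in the abstract.
PRONOUN_CUES = {"this", "they", "it", "these", "their"}


def sentence_features(sentences: Sequence[str], cite_idx: int, i: int) -> Dict[str, object]:
    """Illustrative features for sentence i relative to the citing sentence."""
    words = sentences[i].split()
    first_word = words[0].lower() if words else ""
    return {
        "offset": i - cite_idx,                  # signed distance from the citing sentence
        "is_citing": i == cite_idx,              # True only for the citing sentence itself
        "starts_with_pronoun": first_word in PRONOUN_CUES,
        "length": len(words),
    }


def fixed_window(sentences: Sequence[str], cite_idx: int, size: int = 2) -> List[Dict[str, object]]:
    """Feature sequence for the sentences within `size` of the citing sentence."""
    lo = max(0, cite_idx - size)
    hi = min(len(sentences), cite_idx + size + 1)
    return [sentence_features(sentences, cite_idx, i) for i in range(lo, hi)]


def class_scores(gold: Sequence[str], pred: Sequence[str], label: str = "R") -> Tuple[float, float, float]:
    """Precision, recall, and F-measure for one class (default: Relevant)."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure


if __name__ == "__main__":
    doc = [
        "Prior work studied retrieval.",
        "We follow the method of Smith [1].",
        "It uses a fixed window of text.",
        "Our data come from a separate corpus.",
    ]
    for feats in fixed_window(doc, cite_idx=1, size=1):
        print(feats)
    # Made-up gold and predicted labels, for illustration only.
    print(class_scores(["R", "R", "I", "I", "R"], ["R", "I", "I", "R", "R"]))
```

Reporting scores per class, as `class_scores` does, is what makes it possible for the Relevant class to improve while the averages over both classes stay close to the baseline.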

Published

2024-08-01

How to Cite

Khalid, A., Khan, D., & Ali, S. (2024). Classifying Text in Citation Context as Relevant or Irrelevant to the Cited Paper. International Journal of Innovations in Science & Technology, 6(3), 1088–1098. Retrieved from https://journal.50sea.com/index.php/IJIST/article/view/961