Student retention pattern prediction employing linguistic features extracted from admission application essays

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper investigates whether linguistic features extracted from the application essays of students enrolled in a university academic program can predict their retention patterns. Three sets of linguistic features are generated by text analysis: (1) latent Dirichlet allocation (LDA) topic modeling with varying numbers of topics, (2) Linguistic Inquiry and Word Count (LIWC), and (3) part-of-speech (POS) distribution. Classification experiments evaluate how well these three feature sets, and their combinations, predict student retention patterns. The results show that the POS distribution features yield the best prediction performance of the three, while neither the LDA features nor ensemble methods improve predictive performance, a finding that runs contrary to the manual analysis methods admission experts use in conventional admission processes.
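As a rough illustration of the third feature set, the sketch below turns an already-tagged essay into a fixed-length POS distribution vector (the relative frequency of each tag). The abstract does not say which tagger or tagset the authors used, so the coarse `TAGSET`, the `pos_distribution` helper, and the sample essay are all hypothetical; the tagging step itself is assumed to be done by an external tool.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical coarse tagset; the tagset used in the paper is not stated in the abstract.
TAGSET: Tuple[str, ...] = ("NOUN", "VERB", "ADJ", "ADV", "PRON", "DET", "ADP", "OTHER")


def pos_distribution(
    tagged_tokens: Sequence[Tuple[str, str]],
    tagset: Sequence[str] = TAGSET,
) -> List[float]:
    """Return the relative frequency of each tag, in tagset order.

    tagged_tokens holds (word, tag) pairs produced by some external tagger.
    Tags outside the tagset are counted under "OTHER".
    """
    known = set(tagset)
    counts = Counter(tag if tag in known else "OTHER" for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        # An empty essay gets an all-zero vector rather than a division error.
        return [0.0] * len(tagset)
    return [counts[tag] / total for tag in tagset]


# Made-up example: four tokens, one tag each.
essay = [("I", "PRON"), ("love", "VERB"), ("marine", "ADJ"), ("biology", "NOUN")]
features = pos_distribution(essay)
```

A vector like `features` could then be passed, one row per essay, to any standard classifier; which classifiers the authors actually used is not specified in the abstract.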

Original language: English (US)
Title of host publication: Proceedings - 16th IEEE International Conference on Machine Learning and Applications, ICMLA 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 532-539
Number of pages: 8
Volume: 2018-January
ISBN (Electronic): 9781538614174
DOIs: https://doi.org/10.1109/ICMLA.2017.0-106
State: Published - Jan 16 2018
Event: 16th IEEE International Conference on Machine Learning and Applications, ICMLA 2017 - Cancun, Mexico
Duration: Dec 18 2017 to Dec 21 2017

Fingerprint

Linguistics
Students
Experiments

Keywords

  • application essay
  • educational data mining
  • linguistic features
  • predictive analysis
  • student retention

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications

Cite this

Ogihara, M., & Ren, G. (2018). Student retention pattern prediction employing linguistic features extracted from admission application essays. In Proceedings - 16th IEEE International Conference on Machine Learning and Applications, ICMLA 2017 (Vol. 2018-January, pp. 532-539). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMLA.2017.0-106

@inproceedings{49aee539c70c4f74bcd9c848d431e51e,
title = "Student retention pattern prediction employing linguistic features extracted from admission application essays",
keywords = "application essay, educational data mining, linguistic features, predictive analysis, student retention",
author = "Mitsunori Ogihara and Gang Ren",
year = "2018",
month = "1",
day = "16",
doi = "10.1109/ICMLA.2017.0-106",
language = "English (US)",
volume = "2018-January",
pages = "532--539",
booktitle = "Proceedings - 16th IEEE International Conference on Machine Learning and Applications, ICMLA 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}