Problems and Prospects in the Applying Methods of
Analysis Educational Data
Copyright ©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the
terms of the Creative Commons Attribution License 4.0 International License
Abstract At the present stage, education is the most
important component of a country's economic development.
A frequently changing situation requires high professionalism and considerable intellectual effort to make
an effective decision. The growth of information flows and
the analysis of relevant information generated by the
participants in the educational process play an important
role in the quality management of education.
The creation of training systems and the spread of network technology have led to the accumulation of a large
amount of data, which in turn has aroused great interest in
the Data Mining methods used to analyze this new type of educational data. This paper compares
methods of educational data mining (EDM) and learning
analytics (LA) and draws attention to their distinctive features.
Keywords Intelligent Analysis of Educational Data,
Learning Analytics, E-Learning, Big Data
1. Introduction
With the active use of digital technologies
and the development of e-learning systems alongside the
traditional educational process, relatively large data arrays are accumulated, and in recent years the educational sector
has therefore seen exponential growth of data.
This led in the early 2000s to the emergence of a new direction in the field of artificial intelligence: the analysis of educational data (Educational Data Mining,
EDM) (Baker & Siemens, 2012). EDM is an interdisciplinary area that originated at the junction of other disciplines.
In fact, EDM can be represented as a combination of three
main areas: computer science, education, and statistics. The intersections of these three areas also form other disciplines
closely related to EDM, such as computer-aided learning, data mining (DM) and machine learning,
and learning analytics (LA) (Fig.1). Of all the above-mentioned areas, LA is the
most relevant to EDM and can be defined as measuring,
collecting, analyzing, and presenting data about students, educational objectives and their optimization, and the conditions in
which learning takes place. It is focused on decision making
based on data generated in the learning process.
Fig.1. Main areas related to Educational Data Mining (EDM)
EDM develops and improves methods for processing data
generated in the learning process and extracts patterns from
them. Each session in the electronic educational
environment (EEE) generates some amount of data containing specific details for analysis. Before the EEE,
collecting, analyzing, and interpreting student data required sophisticated methods.
EDM methods use various types of data, helping to improve
the design of the educational environment and the educational process.
Despite the development of advanced information technologies
and e-learning methods, each site visitor faces some
problems due to the lack of direct human contact, i.e. there is
a certain barrier in this area of application. This barrier stems from the
lack of preparedness of educational workers, as well as of specialists in the relevant field, to
interpret large volumes of data about the learning process. After
examining some of the similarities and differences between
the EDM and LA methods, we can conclude that “EDM focuses
more on methods and methodologies, and LA focuses on
applications” (Ferguson, 2012).
The purpose of this article is to set out the features of the
EDM and LA methods, two relatively new and
increasingly popular areas of research related to the collection, analysis and interpretation of educational data,
in order to help overcome the barrier in this area, and to explore the problems and trends caused by the
enormous growth of information (Belonozhko et al, 2017).
The essence of the research is presented in the following
sections of the work:
• The emergence and general objectives of EDM and LA
methods;
• Features and advantages of EDM in the educational
process;
• The main similarities and differences between EDM
and LA methods;
• Directions of research: problems and solutions, trends;
• Conclusions.
2. Research Methods
The methodological basis of the study was a
system-evolutionary theoretical approach based on the
complementarity of system principles with the principles of
evolutionary development, including the concept of
classical analysis.
At present, a tendency has been outlined for the
development of a comprehensive science of education, the
so-called educational science or educology, the main
theoretical constructs of which can serve as a
methodological basis for the development of the education
economy.
The article considers the methodology of education
informatization as a purposeful organization of the process
of providing the education sector with methods, technology
and practice for creating and making optimal use of
scientific, pedagogical, educational, methodological,
software and technological developments aimed at realizing
the didactic opportunities of information and
communication technologies used in comfortable and health-saving conditions.
3. The Emergence and General
Objectives of EDM and LA Methods
Since the 1980s, the creation of training
systems and the spread of network technology have led to the accumulation of a large amount of data, which in
turn has aroused great interest in the Data Mining
methods used to analyze this new type of educational data.
Thanks to technologies such as the learning management
systems Moodle, Sakai and ILIAS, it has become
possible to obtain information about student behavior outside the traditional educational environment. At the same
time, international conferences on the use of artificial intelligence methods in education (International Conference on Artificial Intelligence in Education, International
Conference on Intelligent Tutoring Systems, etc.) (Romero
& Ventura, 2010) held regular seminars dedicated to the development of such methods in the educational sector
(Belonozhko et al, 2017).
The evolution of learning analytics has gone through
three eras (Peña-Ayala, 2017), which are closely
related to the creation and development of the Society for
Learning Analytics Research (SoLAR). The first era
corresponds to the earliest works, published until 2011. A work
printed in 1996 examines the scholarship report presented by
Boyer (Boyer, 1996), in which he analyzed American higher
education.
The second era began in 2011 with the goal of
encouraging and supporting research, cooperation, and the spread of LA work throughout the world, and continued
until 2013. During this period, separate articles were published and cited in journals and in the
proceedings of the LAK conference (Learning Analytics &
Knowledge Conference).
The third and current era began in 2014
and is still ongoing. LA is currently receiving more attention at conferences and in journals indexed by TR-JCR
(Thomson Reuters – Journal Citation Reports).
The progress of work on LA presented at the LAK 2014
to 2017 conferences is shown in Fig.2, which highlights the
evolution of full papers in terms of number and
diversity based on the three categories of the proposed LA
taxonomy.
Fig.2. Count of full papers presented in LAK-2011 to LAK-2017 classified
by the category
The popularity of the two existing methodologies, EDM
and LA, is due to the following factors. First, statistical
methods, methods of machine learning and data collection,
and the development of predictive models or decision rules are powerful mathematical tools in the field of educational
data analysis, and new technologies expand their capabilities.
Second, there is increasing interest in using a data-driven
approach to make better decisions (Daradoumis et al, 2010)
in the educational field to improve the quality of the learning
process.
The main essence of EDM and LA is to extract
information from data related to education. This information may be targeted at several stakeholders (Daradoumis et al, 2010):
instructors, students, managers and researchers. Each of
them performs certain functions: the instructor
develops and organizes the learning process and evaluates its
effectiveness (Daradoumis & Xhafa, 2009); students can
get recommendations on resources that take
into account their performance, goals and motivation, and can
analyze the results of the educational process by comparing them with other courses; managers use the information to
allocate human and material resources more efficiently in order to improve the overall quality of their academic offerings; researchers conduct studies based on educational data.
The report by Bienkowski (Bienkowski et al, 2012) on the main
problems of implementing and applying the methods of EDM and LA, and the work by Peña-Ayala (Peña-Ayala, 2014),
which describes in detail the applications and methods of EDM,
are widely cited.
The next step in the development of this direction is
connected with the holding of annual conferences devoted to
EDM and with the emergence of massive open online
courses (MOOCs) with extensive data collection capabilities,
such as Khan Academy, Coursera, edX, Udacity, etc.
4. Features and Advantages of EDM in
The Educational Process
EDM features include goals, data, research methods and
applications.
EDM objectives (Baker & Yacef, 2009) consist of:
1. Prediction of students’ behavior in the learning process;
2. Development of new models and ways of presenting
knowledge in the subject area;
3. Study of the interaction effects in the system “learning –
student”;
4. Development of knowledge about the phenomenon of
learning and the psychology of students.
Complex data, which educators do not usually deal
with, are difficult to analyze by traditional methods:
• the number of visits to the EEE website;
• the number of the most frequently visited pages;
• the number of views or downloads of the
necessary materials for study;
• the information on browsers and frequency of
visits to certain pages over time;
• the information on the origin of visitors;
• the information on the number of visits and their
duration for each student over a certain period of time;
• the information on the most popular keywords used to
search for information in the system;
• the information about electronic resources
downloaded, read or viewed by a student and about the
amount of material for study.
Such data are provided in particular by the Moodle system
(Kay et al, 2006; Nesbit et al, 2008).
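As a hedged illustration of how such log data might be summarized, the following minimal sketch counts visits and total session time per student and finds the most visited pages. The record layout, the sample values, and the function names are assumptions made for this example only; they do not reflect the actual Moodle log schema.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log records: (student_id, page, session_start, session_end).
# This layout is an assumption for illustration, not a real LMS export format.
LOG = [
    ("s1", "/course/intro", "2020-03-01 10:00", "2020-03-01 10:25"),
    ("s1", "/course/quiz1", "2020-03-02 09:00", "2020-03-02 09:10"),
    ("s2", "/course/intro", "2020-03-01 11:00", "2020-03-01 11:05"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def summarize_visits(log):
    """Return {student_id: (visit_count, total_minutes)}."""
    counts = defaultdict(int)
    minutes = defaultdict(float)
    for student, _page, start, end in log:
        duration = (datetime.strptime(end, TIME_FORMAT)
                    - datetime.strptime(start, TIME_FORMAT))
        counts[student] += 1
        minutes[student] += duration.total_seconds() / 60
    return {student: (counts[student], minutes[student]) for student in counts}


def most_visited_pages(log, top=3):
    """Return the `top` most frequently visited pages with their visit counts."""
    pages = Counter(page for _student, page, _start, _end in log)
    return pages.most_common(top)


if __name__ == "__main__":
    print(summarize_visits(LOG))
    print(most_visited_pages(LOG, top=1))
```

Summaries of this kind correspond to the per-student visit counts and durations listed above; a real analysis would read the records from the environment's own export rather than from a hard-coded list.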
To investigate data generated in the learning process
and to analyze students’ learning
depending on their interaction with the environment (Baker
et al, 2012), EDM develops and adapts various methods. Prediction, clustering, classification, sequential
pattern mining, text mining, and association rule mining are traditional DM methods, while discovery with models and
distillation of data for human judgment (Baker & Siemens, 2013)
are specific to EDM. Each of them is applied to
solving problems of a specific nature (Romero & Ventura, 2010).
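As one hedged illustration of the clustering methods mentioned above, the following minimal sketch groups students by a single activity score using a one-dimensional k-means procedure. The activity values, the function name `kmeans_1d`, and the choice of k-means itself are assumptions made for illustration; the source does not prescribe a particular algorithm.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster numeric values into k groups with a basic k-means loop.

    Returns (centroids, clusters). Initial centroids are spread evenly
    over the sorted values so the result is deterministic.
    """
    if not 2 <= k <= len(values):
        raise ValueError("k must be between 2 and the number of values")
    ordered = sorted(values)
    centroids = [float(ordered[i * (len(ordered) - 1) // (k - 1)])
                 for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical weekly visit counts for six students.
    activity = [1, 2, 3, 20, 22, 24]
    print(kmeans_1d(activity, k=2))
```

With a single feature the two groups can be read as "low activity" and "high activity" students; practical EDM studies usually cluster on several features at once and use an established library implementation.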
EDM methods offer a number of advantages to the
participants of the educational process, students and
teachers. Students have the opportunity to adapt the course to
their level and assimilation of knowledge, since the EEE,
taking into account the duration and frequency of visits,
collects detailed information about each individual’s actions,
processes it, and forms a learning model. Based on the analysis
of the collected data, the EEE generates an adapted hint; the
student is compensated for lost learning time and is
offered a new course for study. Guided by the hints received, teachers
study the situation and adjust the content of the course, follow the learning process, classify students
according to specific characteristics (academic
performance, activity, preliminary preparation, etc.), and assess their knowledge. As a result, the
administration of the EEE is able to evaluate the effectiveness of
the course and improve it.
5. The Main Similarities and
Differences Between EDM and LA
Methods
EDM usually looks for new patterns in data and develops
new models; LA applies known predictive models in
education systems. EDM and LA share the same goal:
to improve the quality of education by analyzing a huge
amount of data to extract useful information for interested parties. These methods are closely related to each other;
they have many common characteristics and,
at the same time, significant differences. These differences
lie in the following features (Baker & Siemens, 2012):
• EDM allows studying the components of the
system and the relationships between them; LA makes it possible to explore the system as a whole.
• EDM is based on educational software, while LA
is connected to a specific semantic network.
• EDM performs automated adaptation; LA informs
and empowers faculty and students.
• EDM uses classification, clustering, Bayesian
modeling, prediction, discovery with models, and
visualization methods; LA aims at analyzing social networks,
sentiment and influence, predicting student success, and analyzing
concepts and sensemaking models.
In some studies (Baker & Siemens, 2012), EDM and LA
are considered as separate areas that study the automated identification of patterns in educational data and ensure the
preparation of data in a form suitable for human analysis.
According to the abovementioned authors, these differences
are broad trends in each community and, as a result, do not
define the relevant areas. A similar idea is expressed in
(Baker & Inventado, 2014), which states that “the overlap and differences between communities are largely limited,
evolving from the interests and values of specific researchers,
rather than reflecting a deeper philosophical split”.
Bienkowski (Bienkowski et al, 2012) showed that LA
covers more disciplines than EDM. In addition to computer science, statistics, psychology, and the learning sciences, LA draws on information science and sociology.
Before the processes of obtaining, storing
and processing data were automated, attempts were made to draw conclusions from a sample of expert specialists.
At the present stage of development of technical means,
researchers have enormous opportunities for
working with huge amounts of data and large numbers of people. Technical
progress, leading us into the era of big data, yields faster and
more reliable results and, in turn, more
effective solutions. The combination of these two methodologies is a
promising direction for government bodies as well.
Directions of research: problems and solutions, trends
As this paper shows, these two areas are
relatively new areas of research, and they still have a number
of unsolved problems:
1. Lack of theoretical and practical knowledge among a
significant proportion of teachers and managers regarding the use of the necessary tools. To solve this problem, researchers must disseminate their results by developing a
data-driven culture in the educational environment (Romero
& Ventura, 2013) and by collaborating with a large number of
teachers and/or students, who evaluate their proposals in
experiments, to facilitate data analysis.
2. Additional costs for storing and managing data,
because different data analysis packages may not always
integrate easily with each other and with assistive devices.
3. Ethics and personal privacy. The ethical principles set out in
(Greller & Drachsler, 2012) should be taken into account at all stages of data analysis, from data collection to
interpretation of results and decision-making. Consideration
should also be given to the ownership of student data, which differs from country to country.
4. The specificity of the results in the field of EDM. Since
most of the research on EDM was conducted in North
America and Western Europe, its results
may differ significantly from those obtained in countries with different cultural traditions (Baker & Yacef, 2009).
Trends in future research on EDM are based on the
following provisions (Belonozhko et al, 2017):
1. EDM tools must be convenient, simple, and
integrated into the EEE, and must provide an interface for accessing data.
2. There should be a possibility of uniformly describing the
models obtained from educational data.
3. Methods of data analysis should be adapted to
educational data.
4. The problem of incomplete data collected when using
popular social networks (Facebook, Vkontakte, etc.) should
be eliminated by integrating social networks into
the educational environment and by having
Massive Open Online Courses (MOOCs) perform part of their functions.
The use of EDM and LA methods in network
environments is driven by the generation of large
educational data sets whose size goes beyond the
capabilities of common software tools to capture, store,
manage, and process them in a reasonable amount of time
(Snijders et al, 2012). The main differences between big data and analytics are volume, velocity, and variety (McAfee &
Brynjolfsson, 2012).
MOOC environments such as Coursera, edX or
Class2Go, run by large universities and popular since 2012, allow students from all over the world to attend a
variety of courses free of charge, narrowing the gap in educational opportunities associated with economic
inequality. Typically, a large number of students are enrolled
in such courses. This creates problems of scalability to the number of
visitors (Kay et al, 2013), very high dropout rates, and very
different participation patterns (Clow, 2013).
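The dropout problem can be made concrete with a small calculation. The sketch below, offered only as an illustration, takes the number of learners remaining at each stage of a course and reports overall retention and stage-to-stage dropout. The stage names, the counts, and the function name `participation_funnel` are hypothetical and are not taken from the cited studies.

```python
def participation_funnel(stage_counts):
    """Compute retention and dropout for an ordered list of course stages.

    stage_counts: list of (stage_name, learners_remaining), first stage first.
    Returns rows of (stage_name, learners, share_of_initial, dropout_from_previous),
    with both ratios rounded to three decimals.
    """
    if not stage_counts or stage_counts[0][1] <= 0:
        raise ValueError("the first stage must have a positive learner count")
    initial = stage_counts[0][1]
    previous = initial
    rows = []
    for name, learners in stage_counts:
        share_of_initial = learners / initial
        # Guard against division by zero when an earlier stage is already empty.
        dropout = 1 - learners / previous if previous else 0.0
        rows.append((name, learners, round(share_of_initial, 3), round(dropout, 3)))
        previous = learners
    return rows


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stages = [("registered", 1000), ("started", 600), ("completed", 60)]
    for row in participation_funnel(stages):
        print(row)
```

Reporting both ratios is a deliberate choice here: the share of the initial cohort shows the overall loss, while the stage-to-stage dropout shows where in the course most learners leave.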
The potential of EDM and LA in MOOCs
is greatest given the diversity of students and the extremely high student-to-instructor ratio. The participants’ different origins,
language skills, goals, experience, levels of education, needs and learning styles indicate the relevance
of course personalization in the automation of these systems.
Existing MOOC platforms are known to provide
limited data storage, and adaptive MOOCs (aMOOCs) are appearing.
To improve and personalize the management of MOOCs, it
has been proposed to use software agents that can redesign them
according to the profile of each participant (Daradoumis et al,
2013). Unlike many MOOCs, which consist of sets of
consecutive videos and quizzes, large companies such as Google or Amazon use algorithmic approaches to select
search results, advertisements, and purchase recommendations. Sonwalkar (Sonwalkar, 2013) describes the development of the first aMOOC platform, which is implemented using the
cloud architecture of Amazon Web Services.
Adaptive learning is very relevant today. An avalanche of
information flows and overloads, a rapidly changing modern world and the need for continuous learning require the
development of new learning skills. Learning should be so dynamic as to allow the formation of personal learning
pathways tuned to the level of knowledge and needs of a
particular student.
6. Conclusion
As relatively new and promising areas of research into
improving the educational experience, these two methods,
EDM and LA, are aimed at enhancing the educational
process and help its participants (students,
teachers and researchers) make more effective decisions
using data. With the growing capabilities of modern
technical means of information processing and the
availability of DM, statistical and machine learning methods,
the volume of educational data has grown.
One of the applications is the Internet environment, in
which data in various formats and at various levels of hierarchy is constantly generated. Dropout rates
for online courses are higher than for traditional courses. EDM and LA are mainly used
to monitor students and groups and to adapt learning
experiences. The methods for analyzing educational data have many similarities and, at the same time, several
differences. Despite their current improvements, there are
some barriers to the use of EDM and LA in educational environments.
The application of the analysis of educational data
provides a number of advantages to the participants of the educational
process: students, teachers and administrators.
Using EDM allows students to tailor the course to
fit their abilities. The system forms a learning model from the
information accumulated about the student, depending on the
duration and frequency of viewing.
Students are offered shortened paths for completing the
course, taking into account their interest in passing tests and homework assignments. Through the prompts of the EEE, problems revealed by students’ errors in tests and homework are
identified, and students are recommended additional materials for
studying the course.
By obtaining information on the course of the educational
process, teachers have the opportunity to improve the
content of the materials based on data on the frequency and distribution of errors; students determine the
causes of these errors and eliminate them.
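The frequency and distribution of errors mentioned above can be estimated with a very small computation. The following sketch is an illustration under assumed data: the attempt records, question identifiers, and function name `error_rates` are hypothetical. It computes the share of wrong answers per test question so that the most problematic item can be reviewed first.

```python
from collections import Counter

# Hypothetical attempt records: (student_id, question_id, answered_correctly).
ATTEMPTS = [
    ("s1", "q1", True),
    ("s1", "q2", False),
    ("s2", "q1", False),
    ("s2", "q2", False),
    ("s3", "q2", True),
]


def error_rates(attempts):
    """Return {question_id: share of attempts answered incorrectly}."""
    total = Counter()
    wrong = Counter()
    for _student, question, correct in attempts:
        total[question] += 1
        if not correct:
            wrong[question] += 1
    return {question: wrong[question] / total[question] for question in total}


if __name__ == "__main__":
    rates = error_rates(ATTEMPTS)
    hardest = max(rates, key=rates.get)
    print(rates, hardest)
```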
REFERENCES
[1] Baker, R., & Yacef, K. (2009). The State of Educational Data
Mining in 2009: A Review and Future Visions. JEDM |
Journal of Educational Data Mining, 1(1), 3-17. Retrieved
from
https://jedm.educationaldatamining.org/index.php/JEDM/article/view/8
[2] Baker, R., Siemens, G.: Educational data mining and learning analytics, in Cambridge handbook of the learning sciences (2nd edition), R. K. Sawyer, Ed., Cambridge, UK: Cambridge University Press, (in press)
http://www.columbia.edu/~rsb2162/BakerSiemensHandbook2013.pdf
[3] Baker, R. S. J. D., Costa, E., Amorim, L., Magalhães, J., &
Marinho, T. (2012). Mineração de Dados Educacionais:
Conceitos, Técnicas, Ferramentas e Aplicações. Jornada de Atualização em Informática na Educação, 1, 1–29.
[4] Baker, R. S. J. D., & Inventado, P. S. (2014). Educational Data Mining and Learning Analytics. In J. A. Larusson, & B.
White (Eds.), Learning Analytics: from Research to Practice
(pp. 61–75). New York, NY: Springer.
[5] Belonozhko P.P., Karpenko A.P., Hramov D.A. Analiz
obrazovatel'nyh dannyh: napravleniya i perspektivy
primeneniya // Internet-zhurnal «NAUKOVEDENIE» Tom 9,
№4 (2017) / URL:
http://naukovedenie.ru/PDF/15TVN417.pdf
[6] Bienkowski, M., Feng, M., & Means, B. (2012). Enhancing
Teaching and Learning Through Educational Data Mining
and Learning Analytics: An Issue Brief. Retrieved from
http://tech.ed.gov/wp-content/uploads/2014/03/edm-la-brief.pdf
[7] Boyer, E. (1996). The scholarship of engagement. Bulletin of
the American Academy of Arts and Sciences, 49(7), 18–33.
[8] Clow, D. (2013). MOOCs and the funnel of participation. In D. Suthers, K. Verbert, E. Duval, & X. Ochoa (Eds.), Proceedings of the 3rd International Conference on Learning
Analytics and Knowledge (pp. 185–189). doi:
http://dx.doi.org/10.1145/2460296.2460332
[9] Daradoumis, T., Juan, A., Lera-López, F., & Faulin, J. (2010).
Using Collaboration Strategies to Support the Monitoring of
Online Collaborative Learning Activity. In M. Lytras, P. O. D.
Pablos, D. Avison, J. Sipior, Q. Jin, W. Leal, D. Horner (Eds.), Technology Enhanced Learning. Quality of Teaching and Educational Reform (pp. 271–277). Springer Berlin
Heidelberg. doi:
http://dx.doi.org/10.1007/978-3-642-13166-0_39
[10] Daradoumis, T., Rodríguez-Ardura, I., Faulin, J., &
Martínez-López, F. J. (2010). CRM Applied to Higher
Education: Developing an e-Monitoring System to Improve
Relationships in e-Learning Environments. International
Journal of Services Technology and Management, 14(1), 103–125. doi: http://dx.doi.org/10.1504/IJSTM.2010.032887
[11] Daradoumis, T., Bassi, R., Xhafa, F., & Caballé, S. (2013). A review on massive e-learning (MOOC) design, delivery and
assessment. Proceedings of the 8th International Conference
on P2P, Parallel, Grid, Cloud and Internet Computing (pp.
208–213). Compiegne, France. doi:
http://dx.doi.org/10.1109/3pgcic.2013.37
[12] Ferguson, R. (2012). The State Of Learning Analytics in 2012: A Review and Future Challenges. Technical Report KMI-12-01, Knowledge Media Institute, The Open
University, UK.
http://kmi.open.ac.uk/publications/techreport/kmi-12-01
[13] Greller, W., & Drachsler, H. (2012). Translating Learning into Numbers: A Generic Framework for Learning Analytics. Educational Technology & Society, 15(3), 42–57.
[14] Kay J., Maisonneuve N., Yacef K., Zaiane O.R. Mining Patterns of Events in Students' Teamwork Data // Proceedings of Educational Data Mining Workshop. 2006,
Taiwan. URL: http://www.educationaldatamining.org/ITS2006EDM/Kay_Yacef.pdf
[15] Kay, J., Reimann, P., Diebold, E., & Kummerfeld, B. (2013).
MOOCs: So Many Learners, So Much Potential… IEEE
Intelligent Systems, 28(3), 70–77. doi:
http://dx.doi.org/10.1109/MIS.2013.66
[16] McAfee, A., & Brynjolfsson, E. (2012). Big Data: The
Management Revolution. Harvard Business Review, 90(10),
60–66.
[17] Nesbit J.C., Xu Y., Winne P.H., Zhou M. Sequential pattern
analysis software for educational event data // 6th
International Conference on Methods and Techniques of
Behavioral Research “Measuring Behaviour”, 26-28.08.2008,
Maastricht, Netherlands, P. 1-5.
[18] Peña-Ayala, A. (2014). Educational Data Mining:
Applications and Trends. New York, NY: Springer. doi: http://dx.doi.org/10.1007/978-3-319-02738-8
[19] Peña-Ayala, A. (2017). Learning Analytics: Fundaments,
Applications, and Trends: A View of the Current State of the Art to Enhance e-Learning. Springer International Publishing.
Retrieved from https://books.google.fr/books?id=x8omDgAAQBAJ
[20] Romero C.R., & Ventura, S. (2010). Educational data mining:
A review of the state of the art. IEEE Transactions on Systems,
Man and Cybernetics, Part C: Applications and Reviews,
40(6), 601-618.
doi: http://dx.doi.org/10.1109/TSMCC.2010.2053532
[21] Siemens, G., & Baker, R. S. J. D. (2012). Learning analytics and educational data mining: towards communication and
collaboration. In S. B. Shum, D. Gasevic, & R. Ferguson
(Eds.), Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (pp. 252–254). doi:
http://dx.doi.org/10.1145/2330601.2330661
[22] Snijders, C., Matzat, U., & Reips, U.-D. (2012). ‘Big Data’:
Big gaps of knowledge in the field of Internet science.
International Journal of Internet Science, 7, 1-5.
[23] Sonwalkar, N. (2013). The First Adaptive MOOC: A Case Study on Pedagogy Framework and Scalable Cloud Architecture — Part I. MOOCs Forum, 1, 22 –29. doi:
10.1089/mooc.2013.0007.
