We recommend that students be familiar with Computer Science concepts such as those covered in the earlier courses of the Degree Program. In addition, this subject builds on the skills and knowledge acquired in the following courses:
- Logic
- Statistics
- Algorithm Design
- Intelligent Systems
- Knowledge-Based Systems
Data mining and machine learning are closely linked to statistics and computer algorithms, and rely on techniques for extracting knowledge from data sets. In recent years these disciplines have gained importance because of the growth in data production, driven by phenomena such as the rise of the Internet and social networks or the development of new techniques for obtaining genetic information. From a professional point of view, there is rising demand for data scientists in fields as diverse as marketing, market analysis, security, and biology.
Course competences | |
---|---|
Code | Description |
CM05 | Ability to acquire, formalise, and represent human knowledge in a computable form for solving problems by means of a digital system in any application context, especially those related to computational aspects, perception, and behaviour in intelligent environments. |
CM07 | Ability to know and develop computational learning techniques, and to design and implement applications and systems that use them, including those for the automatic extraction of information and knowledge from large volumes of data. |
INS01 | Analysis, synthesis, and assessment skills. |
INS04 | Problem-solving skills through the application of engineering techniques. |
INS05 | Argumentative skills to logically justify and explain decisions and opinions. |
PER02 | Ability to work in multidisciplinary teams. |
PER04 | Interpersonal relationship skills. |
PER05 | Acknowledgement of human diversity, equal rights, and cultural variety. |
SIS01 | Critical thinking. |
SIS03 | Autonomous learning. |
SIS09 | Care for quality. |
UCLM03 | Accurate speaking and writing skills. |
Course learning outcomes | |
---|---|
Description | |
Description and application of the different phases of the process of knowledge discovery from large volumes of data. | |
Development and implementation of a small to medium-sized information retrieval system. | |
Knowledge and development of computational learning techniques, both supervised and unsupervised, and design and implementation of applications and systems that use them. | |
Additional outcomes | |
Not established. |
LABORATORY
A complete knowledge discovery in databases (KDD) process will be developed throughout the course. Each student will propose a domain linked to their professional or research interests, or choose a topic proposed by the lecturers.
1. Problem Selection.
2. Data Selection.
3. Pre-processing.
4. Transformation.
5. Data Mining.
6. Use of the discovered patterns in an application.
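As an illustration only, the phases listed above can be sketched on a toy example. The data, the function names, and the choice of k-means as the mining technique are assumptions made for this sketch; the course does not prescribe them, and a real laboratory project would normally use a larger data set and established libraries.

```python
from statistics import mean

# 1. Problem selection (assumed for this sketch): segment customers by activity.
# Hypothetical toy records: (customer_id, monthly_visits, monthly_spend).
# None marks a missing value.
RAW = [
    ("a", 2, 20.0), ("b", 3, 25.0), ("c", 2, None),
    ("d", 20, 210.0), ("e", 22, 190.0), ("f", 21, 200.0),
]

def select(records):
    """2. Data selection: keep only the attributes relevant to the problem."""
    return [(visits, spend) for _, visits, spend in records]

def preprocess(rows):
    """3. Pre-processing: discard rows that contain missing values."""
    return [row for row in rows if None not in row]

def fit_scaler(rows):
    """4. Transformation (part 1): learn per-column min-max bounds."""
    return [(min(col), max(col)) for col in zip(*rows)]

def scale(row, bounds):
    """4. Transformation (part 2): map each value into the range [0, 1]."""
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, (lo, hi) in zip(row, bounds)
    )

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k=2, iterations=10):
    """5. Data mining: plain k-means, seeded with the first k points
    so that the result is deterministic."""
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(mean(dim) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

clean = preprocess(select(RAW))
bounds = fit_scaler(clean)
centroids = kmeans([scale(row, bounds) for row in clean])

def segment(customer):
    """6. Use of the discovered patterns: assign a new (visits, spend)
    pair to the nearest discovered cluster."""
    return nearest(scale(customer, bounds), centroids)

print(segment((4, 30.0)), segment((19, 180.0)))
```

Keeping each phase in its own function mirrors the numbered list and makes it easy to swap one step (for example, a different mining algorithm) without touching the others.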
Training Activity | Methodology | Related Competences | ECTS | Hours | As | Com | Description | |
Class Attendance (theory) [ON-SITE] | Lectures | CM05 CM07 | 0.6 | 15 | N | N | Teaching of the subject matter by the lecturer (MAG) | |
Individual tutoring sessions [ON-SITE] | | CM05 CM07 INS05 SIS01 SIS09 UCLM03 | 0.18 | 4.5 | N | N | Individual or small-group tutoring in the lecturer's office, classroom or laboratory (TUT) | |
Study and Exam Preparation [OFF-SITE] | Self-study | CM05 CM07 INS01 SIS01 SIS03 SIS09 | 1.8 | 45 | N | N | Self-study (EST) | |
Other off-site activity [OFF-SITE] | Practical or hands-on activities | CM05 CM07 INS01 INS04 PER02 PER04 PER05 SIS03 SIS09 UCLM03 | 0.9 | 22.5 | N | N | Lab practical preparation (PLAB) | |
Problem solving and/or case studies [ON-SITE] | Problem solving and exercises | CM05 CM07 INS01 INS04 PER02 PER04 PER05 SIS01 SIS09 | 0.6 | 15 | Y | N | Worked example problems and case studies solved by the lecturer and the students (PRO) | |
Writing of reports or projects [OFF-SITE] | Self-study | CM05 CM07 INS01 INS04 INS05 PER02 PER04 PER05 SIS01 SIS03 SIS09 UCLM03 | 0.9 | 22.5 | Y | N | Preparation of essays on topics proposed by the lecturer (RES) | |
Laboratory practice or sessions [ON-SITE] | Practical or hands-on activities | CM05 CM07 INS04 PER02 PER04 PER05 SIS03 SIS09 | 0.72 | 18 | Y | Y | Practical work carried out in the laboratory/computer room (LAB) | |
Progress test [ON-SITE] | Assessment tests | CM05 CM07 INS01 INS04 INS05 PER02 SIS01 SIS09 UCLM03 | 0.1 | 2.5 | Y | N | Progress test 1, covering the first third of the syllabus (EVA) | |
Progress test [ON-SITE] | Assessment tests | CM05 CM07 INS01 INS04 INS05 PER02 SIS01 SIS09 UCLM03 | 0.1 | 2.5 | Y | N | Progress test 2, covering the first two thirds of the syllabus (EVA) | |
Progress test [ON-SITE] | Assessment tests | CM05 CM07 INS01 INS04 INS05 PER02 SIS01 SIS09 UCLM03 | 0.1 | 2.5 | Y | N | Progress test 3, covering the complete syllabus (EVA) | |
Total: | | | 6 | 150 | | | | |
Total credits of in-class work: 2.4 | Total class time hours: 60 | |||||||
Total credits of out of class work: 3.6 | Total hours of out of class work: 90 |
As: Assessable training activity. Com: Compulsory training activity that must be passed (in both continuous and non-continuous assessment).
Evaluation System | Continuous assessment | Non-continuous evaluation * | Description |
Progress Tests | 7.50% | 0.00% | Progress test 1. Non-compulsory activity that can be retaken (rescheduling). To be carried out at the end of the first third of the teaching period. |
Progress Tests | 15.00% | 0.00% | Progress test 2. Non-compulsory activity that can be retaken. To be carried out at the end of the second third of the teaching period. |
Progress Tests | 27.50% | 0.00% | Progress test 3. Non-compulsory activity that can be retaken. To be carried out during the non-teaching period. |
Theoretical papers assessment | 15.00% | 15.00% | Non-compulsory activity that can be retaken. To be carried out before the end of the teaching period. |
Laboratory sessions | 25.00% | 25.00% | Compulsory activity that can be retaken. To be carried out during lab sessions. |
Oral presentations assessment | 10.00% | 10.00% | Non-compulsory activity that can be retaken. Students in continuous assessment will be evaluated during theory/laboratory sessions. Students in non-continuous assessment will be evaluated on this activity through an alternative system. |
Final test | 0.00% | 50.00% | Compulsory activity that can be retaken. To be carried out on the date scheduled for the ordinary final exam. |
Total: | 100.00% | 100.00% |
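As a worked illustration of how the weights in the table combine, the sketch below computes an overall mark as a weighted sum. The 0 to 10 marking scale, the example marks, the rule that a missing activity counts as zero, and all identifier names are assumptions made for this example. The requirement to pass compulsory activities is not modelled, because the pass threshold is not stated here.

```python
# Weights (in percent) copied from the evaluation table above.
CONTINUOUS = {
    "progress_test_1": 7.5,
    "progress_test_2": 15.0,
    "progress_test_3": 27.5,
    "theoretical_papers": 15.0,
    "laboratory_sessions": 25.0,
    "oral_presentations": 10.0,
}
NON_CONTINUOUS = {
    "theoretical_papers": 15.0,
    "laboratory_sessions": 25.0,
    "oral_presentations": 10.0,
    "final_test": 50.0,
}

def final_mark(marks, weights):
    """Weighted average of activity marks (assumed to be on a 0-10 scale).

    Activities missing from `marks` count as 0, which is an assumption
    of this sketch rather than a rule stated in the course guide.
    """
    if abs(sum(weights.values()) - 100.0) > 1e-9:
        raise ValueError("weights must add up to 100%")
    return sum(marks.get(name, 0.0) * w for name, w in weights.items()) / 100.0

# Hypothetical marks for a student under continuous assessment.
example = {
    "progress_test_1": 5.0,
    "progress_test_2": 6.0,
    "progress_test_3": 7.0,
    "theoretical_papers": 8.0,
    "laboratory_sessions": 7.0,
    "oral_presentations": 9.0,
}
# (5*7.5 + 6*15 + 7*27.5 + 8*15 + 7*25 + 9*10) / 100 = 705 / 100 = 7.05
print(round(final_mark(example, CONTINUOUS), 2))
```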
Not related to the syllabus/contents | |
---|---|
Hours | hours |
General comments about the planning: | The course is taught in three weekly sessions of 1.5 hours. |
Author(s) | Title | Book/Journal | City | Publishing house | ISBN | Year | Description | Link | Library catalogue |
---|---|---|---|---|---|---|---|---|---|
Adriaans, P. W.; Zantinge, D. | Data Mining | | | Addison-Wesley | | 1996 | | | |
Berry, M. J. A.; Linoff, G. | Data Mining Techniques | | New York | Wiley Computer Publishing | | 1996 | | | |
Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P. | The KDD Process for Extracting Useful Knowledge from Volumes of Data | | | | | 1996 | | | |
Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P.; Uthurusamy, R. (Eds.) | Advances in Knowledge Discovery and Data Mining | | Cambridge, MA | AAAI/MIT Press | | 1996 | | | |
Igual, L.; Seguí, S. | Introduction to Data Science | | | Springer | 9783319500171 | 2017 | This accessible and classroom-tested textbook/reference presents an introduction to the fundamentals of the emerging and interdisciplinary field of data science. The coverage spans key concepts adopted from statistics and machine learning and the practical application of data science. | https://link.springer.com/book/10.1007%2F978-3-319-50017-1 | |
VanderPlas, J. | Python Data Science Handbook | | | O'Reilly | 9781491912058 | 2016 | | https://learning.oreilly.com/library/view/python-data-science/9781491912126/ | |
Leek, J. | The Elements of Data Analytic Style | | | Leanpub | | 2014 | | http://worldpece.org/sites/default/files/datastyle.pdf | |