Degree programmes in which it is taught:
- Máster Universitario de Ingeniería Informática (EIT Digital - Data Science)
Academic year: 2022-2023
Type:
Semester: 1st
Credits: 3
This course introduces the parallelization of supervised and unsupervised machine learning algorithms on distributed computing platforms for Big Data. In particular, the MapReduce and RDD paradigms are explained together with the distributed platforms that support them (Hadoop and Apache Spark, respectively). Working with several well-known machine learning algorithms, students will learn how to parallelize them using the corresponding Hadoop and Apache Spark APIs.
Group | Schedule | Teaching staff |
---|---|---|
Single | See 1st semester | Mozo Velasco, Alberto |
Content and Learning Activities
Unit 1: Distributed computing platforms for Big Data
1.1. Introduction to big data
1.2. MapReduce and Hadoop
1.3. RDDs and Spark
Unit 2: Scalable machine learning
2.1. Introduction to machine learning
2.2. Scalable optimization