Tokio School Catalogue

Big Data

  • Presentation
  • Methodology
  • Your classes
  • Online platform
  • Contents overview
  • Digital teachers
  • Internships
  • Certifications
  • We are Tokio
  • Contact

COURSE

Big Data

Discover Tokio School

Presentation

Every minute of every day, millions of raw data points are generated that need to be collected, analysed and managed so that value can be extracted from them. How is this achieved? Thanks to Big Data, a technology that makes it possible to identify patterns and behaviours and so helps businesses make decisions. That is why Data Scientists are key figures nowadays and the demand for qualified professionals is constant.

With this course, you will be able to obtain the Data Science certification from IBM

In addition, you will have access to the Data Science course from IBM, one of the most important companies in the sector. Its 75 hours of theory classes, laboratories and case studies will give you the technical experience to become an expert in data analysis. And there is more: by taking the course you will have the chance to obtain the official certificate.

Objectives

Introduce students to the world of programming
Become familiar with the Big Data ecosystem and how to use it to solve problems
Visualise data correctly in order to interpret it clearly
Know and put into practice the different techniques for data exploitation
Prepare projects oriented to Big Data including the fundamental elements

Salary

Salaries for big data careers are increasing just as quickly as the demand for skilled professionals. Many of these roles pay well into the six-figure range, with employers offering above-market compensation to compete for talent.

According to Glassdoor, the average salary for a data analyst ranges between €59,303 and €104,466. In Germany, Sweden and Ireland, the national annual average salary for a data analyst is €84,239, €75,531 and €76,791 respectively.

Data Architect

Annual Salary Range: $127,750-$176,500+

These professionals are tasked with designing the structure of complex data frameworks, as well as building and maintaining these databases. Data architects develop strategies for each subject area of the enterprise data model and communicate plans, status, and issues to their company’s executives.

Data Scientist

Annual Salary Range: $113,500-$162,500+

Data scientists design and construct new processes for modeling, data mining, and production. In addition to conducting data studies and product experiments, these professionals are tasked with developing prototypes, algorithms, predictive models, and custom analysis.

Data Analyst

Annual Salary Range: $87,500-$126,250+

Data analysts work with large volumes of data, turning them into insights businesses can leverage to make better decisions. They work across a variety of industries—from healthcare and finance to retail and technology.

Data analysts work to improve their own systems to make relaying future insights easier. The goal is to develop methods to analyze large data sets that can be easily reproduced and scaled.

Job opportunities

Data architect
Data Scientist
Data consultant
Big data developer

Methodology

Tailor-made method

Our courses do not have a start and end date. With Tokio’s 100% online training programme, you set the pace according to your circumstances and capabilities, and we follow you. Ours is «tailor-made» learning.

Digital teachers

They are your teachers: experts with real-world experience who will help you to deepen your knowledge of this profession.

Personalised tutoring

Our educational advisors will accompany you throughout your training. They will help you achieve your goals through realistic objectives, organisation and motivation for tokiers!

Practical training

Self-assessment questionnaires, final exams, exercises, case studies… You will learn by doing! In addition, you will have up to 300 hours of quality professional internships in companies in the sector.

Online classes

You will have live classes. And if you have not been able to attend, no problem! We’ll upload them to the virtual platform so you can watch them as many times as you want.

Final project

You’re almost there! To conclude your training, you’ll have to demonstrate everything you’ve learned through a project.

Soft Skills

You will receive extra training to improve your skills (communication, leadership, teamwork…) thanks to our short courses.

Job Orientation

We will give you all the keys to succeed in any selection process.

Employment Observatory

We put at your disposal, on the student platform, an Employment Observatory where you will find the best job opportunities according to your preferences and your sector.

Your classes

Live

You can connect live to the classes with your specialist teacher. The online classes will follow the syllabus and raise new questions and information that go beyond the theoretical content of the books. At the end of each class, you can ask your questions so that the teacher can answer them live.

Recorded

If you can’t attend a class live, don’t worry! All classes are recorded and uploaded to your platform so that you can access them whenever you want.

Question-and-answer sessions

The digital teachers will dedicate the whole class to answering your questions and working through exercises or practical cases. It is an excellent opportunity to interact with the specialist teacher, ask your questions and learn from the questions of your classmates.

Masterclass

You will be able to attend online masterclasses given by renowned professionals in the sector who collaborate with Tokio School by sharing their experiences. These sessions will also be participative and you will be able to ask them your questions.

Online platform

Our methodology is designed so that you become the protagonist of the learning process.

Content overview

Module 1: Introduction to Big Data

Unit 1: Big Data ecosystem

  • Component definition and architecture
  • Availability, Scalability and Resilience
  • Introduction to Hadoop and MapReduce

Unit 2: Data-Driven Strategies

  • Dashboards
  • Business intelligence vs big data

Unit 3: Processing environments

  • Cloud Computing
  • Internet of Things (IoT)

Unit 4: Big Data use cases: examples from industry

Module 2: The data and its life cycle

Unit 1: Data

  • The Data
  • Data quality
  • Data rights

Unit 2: Data life cycle

  • Data Creation and Acquisition 
  • Extraction, Processing and Loading
  • Data Warehousing
  • Analysis for exploitation 
  • Visualisation and Storytelling for exploitation
    • Selection of visual elements
  • Decision-making

Module 3: Scalable data storage

Unit 1: Distributed databases

  • Type of scalability
    • Vertical scalability
    • Horizontal scalability
  • Non-distributed databases
  • CAP Theorem

Unit 2: NoSQL 

  • Relational databases

Unit 3: NoSQL – Key-value

Unit 4: NoSQL – Column-oriented

  • Architecture
  • Data modelling in Cassandra
  • Cassandra Query Language – CQL
  • CQL – Data model creation
  • Keyspace
  • Table

Unit 5: NoSQL – Document-oriented

Unit 6: NoSQL – Graph-oriented

Module 4: Big Data architecture

Unit 1: Hadoop ecosystem

  • Introduction to Hadoop
  • Hadoop Ecosystem Tools

Unit 2: Cluster and distributed systems (HDFS, MapReduce)

Unit 3: Data analysis with Hive and Pig

Unit 4: Data processing with Spark 

  • Spark RDD (Resilient Distributed Datasets)
  • Spark Streaming
  • Spark SQL

Module 5: Analysis for data exploitation

Unit 1: Data profiles

  • Data Scientists 
  • Data engineer

Unit 2: Exploratory Data Analysis

  • Descriptive statistics
  • Data distribution
  • Exploration of categorical and binary data
  • Correlation
  • Exploration of 2 or more variables

Unit 3: Data sampling techniques

  • Random selection
  • Selection bias
  • Selection by statistical distribution

Unit 4: Hypothesis testing

  • A/B sample testing
  • Hypothesis testing
  • Statistical significance and p-value

Unit 5: Regression and Prediction

  • Linear Regression
  • Multiple Linear Regression
  • Interpreting regression results
  • Predicting using regression

Unit 6: Supervised learning

Unit 7: Unsupervised learning

  • Principal components
  • Algorithms: K-Means, hierarchical clustering

Unit 8: Introduction to Deep Learning

  • Fundamental concepts 
  • Neural Networks

Module 6: Big Data projects presentation and storytelling

Unit 1: Presentation of a Big Data project

  • The importance of context
  • The audience and its importance

Unit 2: Components for the presentation of a Big Data project

Sukiru: soft skills for digital samurai

You will receive extra training to improve your skills (communication, leadership, teamwork…) thanks to our short courses.

#alwaysforward

Digital teachers

Postgraduate in Information Technology with two years of hands-on business/data analytics experience. Passionate about working to turn data into information, information into insight and insight into business decisions.
Namita Verma, Sensei

Time to get on the tatami

Do you want to show what you’re worth? At Tokio School we have agreements with more than 3,000 companies in the technology and digital sector. You can do up to 300 hours of optional internships while expanding your network and your CV. Where would you like to do an internship? Suggest companies! You will be part of Tokio Net, our network of students and alumni.

Certifications

Once you have finished your training you will receive the following qualifications:

Big Data Course
*Training not officially recognised for academic purposes

We are Tokio

We’re not the kind of people who like to pin medals on themselves, but if others do…

TOP Educational Agreements

Contact

Do you have any questions? We are at your disposal for whatever you need.

+353 (1) 9026926

+31 (20) 3694593

+44 (20) 38079342

+32 (2) 7810204

+45 (7) 0890272

The content of this catalogue is subject to change at the discretion of the centre's management. The information not related to the centre contained in this catalogue is subject to the decision of the administration or competent authority.

Training is not approved for official academic purposes.

Machine Learning

Machine Learning was born from pattern recognition, but today it allows us to develop applications that improve their performance by «learning» from data collected in past situations. In this Python specialisation you will learn to apply Machine Learning to real projects, including data preparation and related tasks, deployment in production and the lifecycle of a model.

MODULE 1: INTRODUCTION TO MACHINE LEARNING

Unit 1: Introduction to Big Data and Machine Learning

  • Introduction to Machine Learning
    • The theory of gravity
    • The scientific method
    • Mathematical models
    • Scientific method applications
    • Data science
    • Introduction to Big Data
    • Introduction to Machine Learning
    • The equation of the straight line
    • Model training
    • Working with Machine Learning models
    • Machine Learning applications
    • AlphaGo
  • Linear algebra
    • Relationship to the areas of big data, machine learning and artificial intelligence
    • Elements
    • Operations and properties

Unit 2: Work environment

Unit 3: Python numerical libraries and scikit-learn

MODULE 2: SUPERVISED LEARNING

Unit 1: Linear regression 

  • Simple
    • Model equation
    • Graphical representation
    • Types of variables
  • Multivariable
    • Data modelling
    • Curve modelling
    • Analytical solution
    • Cost function
    • Solving by iterative methods
    • Solution algorithm

Unit 2: Gradient descent optimisation

  • Gradient descent
  • Convergence
  • Local and global minima
  • Learning rate
    • Choosing the learning rate
  • Training algorithm

Unit 3: Standardisation, regularisation and validation

  • Standardisation
    • Problem
    • What is standardisation?
    • Updated training algorithm
  • Regularisation
    • Bias and variance
    • Regularisation
    • Regularised cost function
  • Cross-validation
    • Resolution methods
    • Dataset subdivision
    • K-fold
    • Updated training algorithm

Unit 4: Bayesian models and model evaluation

  • Example: cancer cell classification
  • Sensitivity and specificity

Unit 5: Classification

  • Decision trees
    • Representation
    • Main concepts
    • Categorical and continuous target variables
    • Node splitting
    • Advantages and disadvantages of decision trees
    • Limitations on tree size
    • Tree pruning
    • Decision trees vs. linear models
    • Bootstrapping
    • Training algorithm
  • Logistic regression
    • Data modelling
    • Binary and multi-class classification
    • Hypothesis
    • Activation function: sigmoid
    • Cost function
    • Training algorithm: binary classification
    • Training Algorithm: multiclass classification
  • Classification by SVM
    • Logistic regression vs. SVM
    • Hypothesis
    • Kernels and landmarks
    • Hypothesis transformation
    • Types of kernels available
    • Cost functions
    • Regularisation parameter
    • Training algorithm: multiclass classification

Unit 6: Introduction to neural networks 

  • Natural neurons
  • Artificial neurons
  • Perceptron
  • Multi-layer or deep neural networks
    • Propagation of predictions
    • Cost function
    • Training
    • Multi-class classification
    • Training algorithm: binary classification

MODULE 3: UNSUPERVISED LEARNING

Unit 1: Optimisation by randomisation

  • Problem: local minima
  • Multiple initialisations
  • Implementation

Unit 2: Clustering

  • Differences between clustering and classification
  • K-means 
  • Other clustering algorithms

MODULE 4: SEMI-SUPERVISED LEARNING

Unit 1: Anomaly detection

  • The problem
  • Anomalies in supervised vs. unsupervised and semi-supervised learning
  • Model representation
  • Choice of features
  • Multivariate normal (Gaussian) distribution
  • Training algorithm

Unit 2: Recommendation systems

  • Linear regression recommendation systems
  • Recommendation systems approach
  • Cost function
  • Training algorithms
  • Prediction performance
  • Similarity between examples

Unit 3: Genetic algorithms

  • Natural evolution
  • Natural evolution of behaviour
  • Main concepts
  • Algorithms applied to optimisation
  • Examples

MODULE 5: MACHINE LEARNING SYSTEMS DEVELOPMENT

Unit 1: ML systems approach

  • Initial approach
    • Data cleansing and transformation
    • Large-scale implementation

Unit 2: Feature engineering

  • Definition and characteristics
  • Feature creation
  • Problems and solutions
  • Data quality

Unit 3: Principal Component Analysis (PCA)

  • Variables representation
  • Dimensionality reduction
  • Definition and applications
  • Visual representation

Unit 4: Ensembles

  • Definition and applications
  • Types of errors
  • Ensemble techniques
  • Bagging
  • Max voting
  • Mean and weighted mean
  • Random forest
  • Boosting and adaptive boosting (AdaBoost)
  • Stacking

Unit 5: Model evaluation and improvement

  • Bias and variance
  • Evaluation metrics: linear regression
  • Evaluation metrics: classification
  • Avoiding high bias and high variance
  • Error analysis and evaluation of results 

Unit 6: Operations in ML

  • ML Engineering 
  • Operations in ML