Program Courses and Requirements
The requirements for successful completion of the MSc Data Science and Analytics are:
MRP option
Four (4) required courses
Two (2) elective courses
Two (2) seminar courses
Major Research Project
Thesis option
Three (3) required courses
Two (2) electives
Directed Study course
Thesis
1. Required Courses
This course introduces students to the theory and design of algorithms for acquiring and processing high-dimensional data. Topics include advanced data structures, graph algorithms, and algebraic algorithms; complexity analysis, complexity classes, and NP-completeness; and approximation and parallel algorithms. The course also studies algorithmic techniques and modeling frameworks that facilitate the analysis of massive amounts of data, with an introduction to information retrieval, streaming algorithms, and the analysis of web searches and crawls.
Overview of artificial learning systems. Supervised and unsupervised learning. Statistical models. Decision trees. Clustering. Feature extraction. Artificial neural networks. Reinforcement learning. Applications to pattern recognition and data mining.
The course will discuss data management techniques for storing and analyzing very large amounts of data. The emphasis will be on columnar databases and on MapReduce as a tool for creating parallel algorithms that can process very large amounts of data. Topics include Big Data applications, columnar stores, distributed databases, Hadoop, Locality Sensitive Hashing (LSH), dimensionality reduction, data streams, unstructured data processing, NoSQL, and NewSQL.
The course teaches students how to use data to recommend the best course of action for achieving an optimal outcome, and how to formulate new products and services in a data-driven manner. The course will cover these issues and illustrate the whole process through examples. Special emphasis will be given to data mining and computational techniques, as well as optimization and stochastic optimization techniques. Prerequisite: DS8002.
This course assists the student with the development of the Thesis through the proposal, preliminary literature review, outline, and reporting stages. It is tailored to the needs of each student, and the work in this course will be used as a foundation for the Thesis. Students are required to select an advisor and to present a formal report, or take a formal examination, at the end of the class. The Directed Study course is a prerequisite for starting Thesis work and requires approval from the PD. Pass/Fail.
2. Program Electives
Overview of data visualization. Basic visualization design and evaluation principles. Acquiring, parsing, and analyzing large datasets. Techniques for visualizing multivariate, temporal, text-based, geospatial, hierarchical, and network/graph data using tools such as ggplot2, R, and D3.
The course covers important topics in text mining, including basic natural language processing techniques, document representation, text categorization and clustering, document summarization, sentiment analysis, social network and social media analysis, probabilistic topic models, and text visualization. Prerequisites: DS8002 and DS8003.
This course consists of lectures, seminars and readings covering the latest advances and research in data science and analytics. The course description will be announced prior to scheduling of the course. Please note that this course may not be offered every year.
This course focuses on topics related to reinforcement learning. The course will cover making multiple-stage decisions under uncertainty, heuristic search in planning, Markov decision processes, dynamic programming, temporal-difference learning including Q-learning, Monte Carlo reinforcement learning methods, function approximation methods, and the integration of learning and planning. Other topics may be included as well. Prerequisite: DS8002.
This course will cover modern machine learning techniques from a Bayesian probabilistic perspective. Bayesian probability allows us to model and reason about all types of uncertainty. The result is a powerful, consistent framework for approaching many problems that arise in machine learning, including parameter estimation, model comparison, and decision making. We will begin with a high-level introduction to Bayesian inference, then proceed to more advanced topics. Prerequisite: DS8002.
The course aims to present the mathematical, statistical and computational challenges of building stable representations for high-dimensional data, such as images, text and data. The topics include: Convolutional neural networks. Autoencoders, their sparse, denoising variants, and their training. Regularization methods for preventing overfitting. Stacked autoencoders and end-to-end networks. Recurrent and recursive networks. Multimodal approaches. Deep architectures for vision, speech, natural language processing, and reinforcement learning. Prerequisite: DS8002.
This course will cover methods and practical algorithms for exploring and analyzing graphs. We will also cover applications in various domains (e.g., web, social science, computer networks, neuroscience). The topics that we will cover include ranking, label propagation, clustering and community detection, similarity, and anomaly detection in the graph setting, and tools to handle graph data. Students will have the chance to programmatically analyze graph datasets.
3. Seminar Courses
The course will focus on communicating and presenting data analytics and modeling results. It aims to build competency in storytelling from numbers. The course also covers the ethical and social impacts of data science, analytics, and AI. Prerequisite: DS8012.
This course is an introduction to research preparation, experimental design, methods of data collection, exploratory data analysis, and understanding threats to the validity of results, with the aim of preparing students for MRP work.
4. Major Research Project
The student is required to conduct an applied advanced research project. The project will be carried out under the guidance of a supervisor. On completion of the project, the results are submitted in a technical report format to an examining committee, and the student will make an oral presentation of the report to the committee for assessment and grading. The student is expected to provide evidence of competence in carrying out a technical project and to demonstrate a sound understanding of the material associated with the research project. This is a “Milestone.”
5. Thesis
The student is required to conduct advanced research on a topic related to data science. The topic is chosen in consultation with the thesis supervisor, and the student presents a research plan in writing before the research starts. The student must submit the completed research in a thesis format to an examination committee and make an oral presentation of the thesis. The student is expected to furnish evidence of competence in research and a sound understanding of the data science associated with the research. This is a “Milestone.”