Certificate in Data Analytics

Massive amounts of data are now available via the Internet or stored in company databases. The main challenge is how to turn such data into information useful for decision making. The purpose of this certificate is to help students build the skills needed to tackle this problem.

This certificate is meant for students with a background in computer science, engineering, business, or science, or students currently in the final year of such a degree, who are interested in upgrading their skills to analyze data in their field. High school students with industry experience are also welcome in this program. Students with no prior background may also take it, but they should expect to need more time to complete it, as they will have to take a significant number of prerequisite courses in addition to the core program.

The certificate is built around 3 required courses that form the basis of data analytics. The subjects covered in these required courses include data storage in databases, SQL queries, statistical analysis through linear regression, and data visualization and data mining techniques, so that raw data can be converted into information useful for decision making. In the 2 elective courses, students can further build their knowledge in the area(s) of their choice within the data analytics field: data acquisition and integration, data storage, data visualization, data mining, and statistics, including the current technologies used in industry.

General Regulations

  1. Each person entering the program must have the approval of the Data Analytics Governance Committee (data_analytics@unb.ca).
  2. Only two of the five courses listed below for the certificate may be transferred from another degree or similar program. The DA 4993 project cannot be transferred.
  3. Normally, a student must have grade 12 mathematics to enter the program. MATH 1863 may be taken as one of the optional courses in the certificate program by students who do not have grade 12 mathematics from high school or who feel that they are weak in the subject.
  4. To earn a certificate, a student must successfully complete all required courses, elective courses, and the project, with a grade of C or better.

Requirements

• 3 required courses: INFO 1103, STAT 4703, and DA 4403
• 2 elective courses from: CS 2383, CS 3423, CS 3773, CS 4525, CS 4783, STAT 3083, STAT 3703, STAT 3093, STAT 4043, STAT 4203, STAT 4243, DA 4803 / DA 4813 / CS 4998 / CS 4999, BA 3126
• 1 project (DA 4993), which should be an industry-related or research-related project involving a large amount of data.
• Note: students should also ensure that they have passed the prerequisite courses. In particular, the following courses are prerequisites for the required courses above:

1. CS 1073
2. STAT 1793 and STAT 2793 (or one equivalent sequence: BA 1605/BA 2606, PSYC 2901/PSYC 3913, or STAT 3083/STAT 3093)

Students with a prior BScCS or BISc degree would have these prerequisites covered. Students with a prior degree in business, economics, biology, psychology (except a BA major in psychology with only PSYC 2901), mathematics, or statistics would most likely already have the proper background in statistics (#2 above). Students with a prior degree in engineering (assuming STAT 2593 and CS 1003 were already taken) would have to take STAT 2793 and CS 1073. Engineering students who have taken CS 1023 could take CS 2616 rather than CS 1073 (it also covers CS 1083, which might be needed for some elective courses).

On a full-time basis, this certificate requires a minimum of three terms of courses, followed by a project. An example of a course schedule for students without the prerequisites is as follows:

Fall Courses Winter Courses Fall Courses
CS 1073
STAT 1793
MATH 1503
STAT 2793
INFO 1103
STAT 4703
DA 4403
TWO Electives


Information about elective courses (to help in the course selection):

Course: CS 2383 – data structures and algorithms
Prerequisites: CS 1073; CS 1303
Purpose: For students who are planning on writing programs to perform specific analyses, this course presents data structures that help manipulate data internally in an efficient way.

Course: CS 4525 – database management systems II
Prerequisites: INFO 1103; CS 1073/CS 1083; CS 2253; CS 3403
Purpose: For advanced coverage of database technologies (including data warehouses).

Course: CS 3423 – data management
Prerequisites: CS 1073/CS 1083
Purpose: Covers technologies used in the storage and manipulation of data outside of a database framework (e.g., XML, regular expressions).

Course: CS 3773 – Topics in Web Science
Prerequisites: (none listed)
Purpose: Provides an overview of Web-based architectures and applications facilitating online data analytics using open data.

Course: CS 4783 – Web: Semantics, Services and Solutions
Prerequisites: CS 1073/CS 1083; CS 1303; CS 2383
Purpose: Focuses on the methodologies and infrastructures driving the migration toward the semantic web. Covers interoperability, distributed data sources, information retrieval, information extraction, web services, and workflow technology.

Course: STAT 4203 – intro to multivariate data analysis
Prerequisites: STAT 1793 and STAT 2793, or equivalent (see #2 above); MATH 1503 or MATH 2213
Purpose: More advanced statistical techniques for dealing with a large number of variables (including how to reduce that number of variables using principal components analysis).

Course: STAT 4243 – statistical computing
Prerequisites: CS 1073 or CS 1003; STAT 1793 and STAT 2793, or equivalent (see #2 above)
Purpose: For programming in R, the language of choice when it comes to using libraries of statistical techniques.

Course: STAT 4043 – sample survey theory
Prerequisites: STAT 1793 and STAT 2793, or equivalent (see #2 above)
Purpose: For those who are planning on gathering and analyzing data through surveys.

Course: STAT 3083 – probability and mathematical statistics I
Prerequisites: MATH 1013; STAT 1793 or equivalent
Purpose: In-depth study of the common probability distributions on which most statistical analyses and decision making rely.

Course: STAT 3093
Prerequisites: STAT 3083; STAT 2793 or equivalent (see #2 above)
Purpose: Covers fundamental statistical inference concepts in more depth than the introductory statistics courses, as well as other common but more advanced estimation methods.

Course: STAT 3703 – experimental design
Prerequisites: STAT 1793 and STAT 2793, or equivalent (see #2 above)
Purpose: Basic and complex designs for organizing experimental data collection, and the corresponding data analysis procedures.

Course: DA 4803 / DA 4813 (independent studies in DA); CS 4998 / CS 4999 (directed studies in CS or applied CS)
Prerequisites: Department approval
Purpose: For covering topics of interest that are not currently included in available courses (e.g., big data technologies). Students can also choose topics supportive of their project course (DA 4993). The student should find a supervisor for this.

Course: BA 3126 – Frontiers of E-Commerce I
Prerequisites: BA 2123; BA 2663
Purpose: This course incorporates many data visualization techniques.

Further information may be obtained by contacting data_analytics@unb.ca. In particular, the Department of Computer Science’s website will be updated with information about the current tools and technologies taught in the computer science courses making up the certificate, as well as project details.