Computer Science

CS 4545 Big Data Systems 3 ch (3C) (P)

Data systems are going through a major transition due to the challenges of Big Data processing. The outcome of this shift is the emergence of a new breed of systems that can handle data at massive scales. This course presents some of these systems, along with the principles of query processing. Specifically, it compares relational and NoSQL data models and covers the foundations of query processing, including index-based access and join processing. It presents the principles of parallel databases and explores both batch and iterative processing frameworks, along with the SQL interfaces built over them. It also introduces update-intensive systems and graph data stores, and includes the special topics of spatial and spatio-temporal data processing.

Prerequisites: (CS 1103 or CS 2545) and 75 ch or permission of the instructor. CS 3543 is recommended.