IERG4300/ESTR4300 – Web-scale Information Analytics

IERG Elective MIEG Elective Undergraduate
Co-requisite(s):
Unit(s):
3
Pre-requisite(s):
Exclusion:
CSCI5510 or ENGG4030 or ESTR4300
Term Offered:
Term 1
Teacher:
Prof. Wing Lau
Remarks:

The course discusses data-intensive analytics and the automated processing of very large amounts of structured and unstructured information. We focus on leveraging the MapReduce paradigm to create parallel algorithms that can be scaled up to handle massive data sets, such as those collected from the World Wide Web or other Internet systems and applications. The course is organized around a list of practical large-scale data analytics problems, and the theories and methodologies required to tackle each problem are introduced along the way. As such, students are only expected to have a solid knowledge of probability, statistics, and linear algebra, together with computer programming skills. Topics to be covered include: the MapReduce computational model, its system architecture, and its realization in practice; finding frequent item-sets and association rules; finding similar items in high-dimensional data; dimensionality reduction techniques; clustering; recommendation systems; analysis of massive graphs and its applications to the World Wide Web; large-scale supervised machine learning; and processing and mining of data streams, with applications to large-scale network and online-activity monitoring.

Advisory: Students need basic hands-on skills in operating system configuration and software installation, as covered in lab courses such as IERG2602 and IERG3800.