CSCI 4144 - Assignment 1 - Basic Techniques


  • Assignment title: CSCI 4144 - Data Mining and Data Warehousing Assignment 1 - Basic Techniques
  • Course: Dalhousie University CSCI 4144 Data Mining and Data Warehousing
  • Completion time: 2 days

The main purpose of this assignment is to become familiar with the process of constructing and using a data warehouse. There are two sections: the first focuses on loading and cleaning simple data, and the second works with more complex data. In both cases, we will use publicly available datasets from the healthcare domain.

Section 1 - Data cleaning and ETL

A notifiable disease (https://en.wikipedia.org/wiki/Notifiable_disease#Canada) is any disease that, by law, must be reported to government authorities. Aggregating data on these diseases allows the authorities to monitor their development and provides early warning of possible outbreaks. The Canadian Notifiable Disease Surveillance System (https://diseases.canada.ca/notifiable/) is a searchable database tool provided by the Public Health Agency of Canada. In this section, we will practice cleaning some small, simple datasets.
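
As a rough illustration of what "cleaning a small, simple dataset" can involve, here is a minimal sketch in standard-library Python. The sample data, the column names (Disease, Year, Cases), the missing-value markers, and the helper clean_rows are all hypothetical and are not taken from the assignment or from the actual surveillance system export. The sketch trims whitespace, normalises capitalisation, converts counts to integers, marks missing counts as None, and drops duplicate (disease, year) rows.

```python
import csv
import io

# Hypothetical raw extract; column names and values are invented for illustration.
RAW = """Disease, Year ,Cases
Measles,2018,29
 measles ,2018,29
Pertussis,2018,n/a
Pertussis,2019, 1 234
"""

# Assumed markers for a missing count (an assumption, not from the real data).
MISSING = {"", "n/a", "na", "-"}


def clean_rows(text):
    """Return cleaned records: trimmed, typed, and de-duplicated."""
    reader = csv.reader(io.StringIO(text))
    # Normalise header names: strip stray whitespace and lowercase them.
    header = [h.strip().lower() for h in next(reader)]
    seen = set()
    cleaned = []
    for raw in reader:
        if not raw:  # skip completely blank lines
            continue
        row = dict(zip(header, (v.strip() for v in raw)))
        disease = row["disease"].title()
        year = int(row["year"])
        # Remove spaces used as thousands separators before converting.
        cases_text = row["cases"].replace(" ", "")
        cases = None if cases_text.lower() in MISSING else int(cases_text)
        key = (disease, year)
        if key in seen:  # keep only the first record per (disease, year)
            continue
        seen.add(key)
        cleaned.append({"disease": disease, "year": year, "cases": cases})
    return cleaned


if __name__ == "__main__":
    for record in clean_rows(RAW):
        print(record)
```

Keeping the first record per (disease, year) is only one possible de-duplication policy; for real data it may be safer to inspect conflicting duplicates rather than drop them silently.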

。。。

Section 2 - Data imputation, reduction, and basic analysis

The novel coronavirus disease 2019 (COVID-19 (https://www.canada.ca/en/public-health/services/diseases/coronavirus-disease-covid-19.html)) is a contagious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The first known case was identified in December 2019. The disease quickly spread worldwide, resulting in the COVID-19 pandemic.

。。。

Bonus [5 Marks]

  • We will give up to 5 bonus marks for innovative work going substantially beyond the minimal requirements.
  • These marks can make up for marks lost in other sections of the assignment, but your overall mark for this assignment cannot exceed 100%.
  • You may decide to pursue any number of tasks of your own design related to this assignment, although you should consult with the instructor or the lead TA before embarking on such exploration. The value of bonus work is left to the discretion of the markers.
  • Be sure to document your work sufficiently for the markers to understand what you’re doing. You can add additional Code or Markdown cells below, as necessary.
  • The rest of the assignment takes higher priority than bonus work.

Author: IT神助攻
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit the source, IT神助攻, when reposting.