APAN 5400 - Managing Data - Assignment 7


  • Assignment title: APAN 5400 - Assignment 7
  • Course: Columbia University APAN 5400 Managing Data
  • Completion time: 2 days

In this assignment you will grapple with the question of when to use a data warehouse and when to use a data lake. Completing it will sharpen your understanding of the considerations involved in making these decisions.

Objectives

This assignment supports the following objectives:

  • Identify the defining characteristics of a data warehouse
  • Describe the difference between a data warehouse and a data lake
  • Identify situations in which a data lake should be used over a data warehouse and vice versa.

Details

For each of these scenarios, decide whether you will use a data warehouse or a data lake to store and manage the data. Justify your answer in terms of the variety of data, the volume of the data, and the velocity of the data as well as what sorts of applications need to be supported in these scenarios.

Scenario 1

A car rental company has over 600,000 cars. Each car has about 30 sensors, which the company wants to monitor at 15-minute intervals. The company wants to collect that data for multiple purposes. First, they would like to understand how each car is being driven during the rental period (How far? At what speeds? How much time spent idling? Etc.) in order to possibly adjust their business model as well as the type of routine maintenance performed on the cars. Second, they would like to know where the cars are located (in order to possibly alert emergency crews, should the need arise). Third, they would like to determine the current driving condition of each car (that is, the probability that it will need non-routine maintenance work in the immediate future) in order to possibly substitute another car for one with a high probability of needing such work. You are tasked with designing a system that allows the company to store the data and to analyze and report on it.
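To get a feel for the volume and velocity in this scenario, the figures above can be turned into a rough count of sensor readings. The sketch below is only a back-of-the-envelope estimate: it treats "over 600,000" and "about 30" as exact lower bounds, and `BYTES_PER_READING` is a purely hypothetical payload size that the scenario does not specify.

```python
# Back-of-the-envelope volume estimate for Scenario 1.
# Counts come from the scenario text; the payload size is an assumption.
CARS = 600_000            # "over 600,000 cars" (treated as a lower bound)
SENSORS_PER_CAR = 30      # "about 30 sensors"
INTERVAL_MINUTES = 15     # one reading per sensor every 15 minutes
BYTES_PER_READING = 16    # hypothetical: not stated in the scenario

intervals_per_day = 24 * 60 // INTERVAL_MINUTES
readings_per_interval = CARS * SENSORS_PER_CAR
readings_per_day = readings_per_interval * intervals_per_day
gib_per_day = readings_per_day * BYTES_PER_READING / 2**30

print(f"intervals per day:     {intervals_per_day}")
print(f"readings per interval: {readings_per_interval:,}")
print(f"readings per day:      {readings_per_day:,}")
print(f"raw size per day:      {gib_per_day:.1f} GiB (assumed payload size)")
```

Changing the assumed payload size or sampling interval shows how quickly the daily volume scales, which is one of the inputs to the justification asked for above.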

Scenario 2

A utility (electricity) company collects the usage of its 5 million customers for the lifetime of their subscription. The company would like to provide a service that lets customers monitor their hourly, daily, weekly, and monthly usage. You are tasked with designing a system that allows the company to collect and store the data, and present that data to its customers on an ad hoc basis. (Please keep in mind that in this scenario there is no requirement that this data source be used to make any determinations of repair, maintenance, or electricity outages.)
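The hourly, daily, weekly, and monthly views described in this scenario are all roll-ups of the same underlying readings. The sketch below illustrates that idea with the standard library only. It assumes that hourly readings are the finest granularity stored, and the `rollup` helper, the grouping functions, and the sample data are all hypothetical names and values chosen for illustration, not part of the assignment.

```python
from collections import defaultdict
from datetime import datetime, timedelta

CUSTOMERS = 5_000_000                      # from the scenario text
hourly_readings_per_day = CUSTOMERS * 24   # assumes one reading per customer per hour


def rollup(readings, key):
    """Sum kWh per bucket; `key` maps a timestamp to its bucket."""
    totals = defaultdict(float)
    for timestamp, kwh in readings:
        totals[key(timestamp)] += kwh
    return dict(totals)


def by_day(ts):
    return ts.date()


def by_week(ts):
    iso = ts.isocalendar()
    return (iso[0], iso[1])                # (ISO year, ISO week number)


def by_month(ts):
    return (ts.year, ts.month)


# Hypothetical sample for one customer: 48 hourly readings of 0.5 kWh each.
start = datetime(2024, 1, 1)
readings = [(start + timedelta(hours=h), 0.5) for h in range(48)]

daily = rollup(readings, by_day)
weekly = rollup(readings, by_week)
monthly = rollup(readings, by_month)

print(f"hourly readings per day across all customers: {hourly_readings_per_day:,}")
print("daily:", daily)
print("weekly:", weekly)
print("monthly:", monthly)
```

Because every view derives from the same timestamped readings, the data in this scenario has a single, stable shape, which is worth weighing when considering the variety dimension.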

Assessment

The assignment is worth 50 points total: 25 points for each scenario (5 points for making the correct determination, 20 points for the quality of the justification provided). Give as many details as possible. For each question you should write one page, single spaced, 12 point font size, Times New Roman font style.
Your answers will be assessed on the considerations you bring to bear in justifying your decisions, how clearly you identify the types of data needed in these scenarios, and how much detail you give about the applications built on this data.

Submission

To complete your submission:

  1. Please submit a PDF file or Word Document.
  2. Click the blue Submit Assignment button at the top of this page.
  3. Click the Choose File button, and locate your submission.
  4. Feel free to include a comment with your submission.
  5. Finally, click the blue Submit Assignment button.



Author: IT神助攻
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit IT神助攻 as the source when reposting.