Question 1
What are the three characteristics of Big Data, and what are the main considerations in processing Big Data?

Question 2
Explain the differences between BI and Data Science.

Question 3
Briefly describe each of the four classifications of Big Data structure types (i.e., structured to unstructured).

Question 4
List and briefly describe each of the phases in the Data Analytics Lifecycle.

Question 5
In which phase would the team expect to invest most of the project time? Why? Where would the team expect to spend the least time?

Question 6
Which R command would create a scatterplot for the dataframe “df”, assuming df contains values for x and y?

Question 7
What is a rug plot used for in a density plot?

Question 8
What is a Type I error? What is a Type II error? Is one always more serious than the other? Why?

Question 9
Why do we consider K-means clustering an unsupervised machine learning algorithm?

Question 10
Detail the four steps in the K-means clustering algorithm.

Question 11
List three popular use cases of Association Rules mining algorithms.

Question 12
Define Support and Confidence.

Question 13
How do you use a “hold-out” dataset to evaluate the effectiveness of the rules generated?

Question 14
List two use cases of linear regression models.

Question 15
Compare and contrast linear and logistic regression methods.

