Question 1

a) The world population data spans from 1960 to 2017. We'd like to build a predictive model that can give us the best guess at what the future or past population of a particular country was or might be.

First, however, we need to format our data so that sklearn's Ridge regression class can train on it. To do this, we will write a function that takes a country name as input and returns a 2-d numpy array that contains the year and the measured population.

Function Specifications:
- Should take a str as input and return a numpy array as output.
- The array should have only two columns, containing the year and the population. In other words, it should have a shape (?, 2), where ? is the length of the data.
- The values within the array should be of type int.

Hint: You'll need to use both the population and country map dataframes given above.

def get_year_pop(country_name):

b) Now that we have our data, we need to split it into a training set and a testing set. But before we split our data into training and testing, we also need to split it into the predictive features (denoted X) and the response (denoted y).

Write a function that takes a 2-d numpy array as input and returns four variables in the form (X_train, y_train), (X_test, y_test), where (X_train, y_train) are the features + response of the training set, and (X_test, y_test) are the features + response of the testing set.

Function Specifications:
- Should take a 2-d numpy array as input.
- Should split the array such that X is the year, and y is the corresponding population.
- Should return two tuples of the form (X_train, y_train), (X_test, y_test).

def feature_response_split(arr):

c) Now that we have formatted our data, we can fit a model using sklearn's Ridge() class. We'll write a function that takes as input the features and response variables that we created in the last question and returns a trained model.

Function Specifications:
- Should take two numpy arrays as input in the form (X_train, y_train).
- Should return an sklearn Ridge model.
- The returned model should be fitted to the data.

Hint: You may need to reshape the data within the function. You can use .reshape(-1, 1) to do this.

import numpy as np
import pandas as pd
from numpy import array
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error

population_df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Analyse Project/world_population.csv', index_col='Country Code')
meta_df = pd.read_csv('https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Analyse Project/metadata.csv', index_col='Country Code')
population_df.head()


Accepted Answer

Predictive modeling is a statistical technique that uses machine learning and data mining to predict…
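
For parts a) to c), one possible approach is sketched below. It is a minimal sketch, not a definitive solution, and it rests on several assumptions that the question does not confirm: that meta_df has a 'Country Name' column, that the year columns of population_df are labelled '1960', '1961', and so on, that an unshuffled 80/20 split is acceptable (the question does not say how to split), and that the function in part c) may be called train_model (a hypothetical name). To keep the example runnable without downloading anything, it uses two tiny stand-in dataframes with made-up numbers in place of the real CSVs.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Stand-in frames with MADE-UP numbers (not real population data), shaped the
# way the real CSVs are ASSUMED to be: both indexed by 'Country Code', with
# year columns in population_df and a 'Country Name' column in meta_df.
codes = pd.Index(["AAA", "BBB"], name="Country Code")
population_df = pd.DataFrame(
    {str(year): [1000 + 10 * i, 5000 + 50 * i]
     for i, year in enumerate(range(1960, 1970))},
    index=codes,
)
meta_df = pd.DataFrame({"Country Name": ["Aland", "Bland"]}, index=codes)


def get_year_pop(country_name):
    """Return an int array of shape (n, 2): [year, population]."""
    # Map the country name to its code via the metadata frame (assumed column).
    code = meta_df.index[meta_df["Country Name"] == country_name][0]
    row = population_df.loc[code]
    # Keep only columns whose labels look like years, and drop missing values.
    row = row[[str(label).isdigit() for label in row.index]].dropna()
    years = row.index.astype(int).to_numpy()
    pops = row.to_numpy().astype(int)
    return np.column_stack([years, pops])


def feature_response_split(arr):
    """Split [year, population] rows into (X_train, y_train), (X_test, y_test)."""
    X, y = arr[:, 0], arr[:, 1]
    # ASSUMPTION: an unshuffled 80/20 split; the question gives no split rule.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, shuffle=False
    )
    return (X_train, y_train), (X_test, y_test)


def train_model(X_train, y_train):
    """Fit and return a Ridge model (function name is hypothetical)."""
    model = Ridge()
    # sklearn expects a 2-d feature matrix, hence the reshape from the hint.
    model.fit(X_train.reshape(-1, 1), y_train)
    return model


data = get_year_pop("Aland")
(X_train, y_train), (X_test, y_test) = feature_response_split(data)
model = train_model(X_train, y_train)
predictions = model.predict(X_test.reshape(-1, 1))
```

With the real dataframes loaded as in the question, the stand-in frames would be dropped and get_year_pop called with an actual country name. The column names above should be checked against population_df.head() and meta_df.head() first.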