Create a Program to Create a Custom Transformer in Python: Assignment Solution

July 10, 2024

Dr. Victoria

🇬🇧 United Kingdom

Python

Dr. Victoria Campbell holds a Ph.D. in Computer Science from a leading university in the UK and has completed over 800 assignments related to Python file handling. With a passion for teaching and research, Dr. Campbell specializes in complex data manipulation, optimization algorithms, and machine learning applications in Python. Her expertise includes text file parsing, CSV data processing, and implementing advanced error handling mechanisms.




Instructions

Objective

Write a program that creates a custom transformer in Python.

Requirements and Specifications

Source Code

# DTSC670: Foundations of Machine Learning Models
## Module 2
## Assignment 4: Custom Transformer and Transformation Pipeline

#### Name:

Begin by writing your name above.

Your task in this assignment is to create a custom transformation pipeline that takes in raw data and returns fully prepared, clean data that is ready for model training. However, we will not actually train any models in this assignment. This pipeline will employ an imputer class, a user-defined transformer class, and a data-normalization class.

Please note that the order of features in the final feature matrix must be correct. See the figure below, which illustrates the input and output of the transformation pipeline. The positions of features $x_1$ and $x_2$ do not change: they remain in the first and second columns, respectively, both before and after the transformation pipeline. In the transformed dataset, the $x_5$ feature is next, followed by the newly computed feature $x_6$. Finally, the last two columns are the remaining one-hot vectors obtained from encoding the categorical feature $x_3$.

<img src="DataTransformation.png" width="500" />

# Import Data

Import data from the file called `CustomTransformerData.csv`.

### ENTER CODE HERE ###

# Create Custom Transformer

Create a custom transformer, just as we did in the lecture video entitled "Custom Transformers", that performs two computations:

1. Adds an attribute to the end of the data (i.e. a new last column) that is equal to $\frac{x_1^3}{x_5}$ for each observation.
2. Drops the entire $x_4$ feature column.

You must name your custom transformer class `Assignment4Transformer`.

This transformer will be used in a pipeline. In that pipeline, an imputer will be run *before* this transformer. Keep in mind that the imputer will output an array, so **this transformer must be written to accept an array.**

Additionally, this transformer will ONLY be given the numerical features of the data. The categorical feature will be handled elsewhere in the full pipeline. This means that your code for this transformer **must reflect the absence of the categorical $x_3$ column** when indexing data structures.

### ENTER CODE HERE ###

# Create Transformation Pipeline for Numerical Features

Create a custom transformation pipeline for numeric data only called `num_pipeline` that:

1. Applies the `SimpleImputer` class to the data, where the strategy is set to `mean`.
2. Applies the custom `Assignment4Transformer` class to the data.
3. Applies the `StandardScaler` class to the data.

### ENTER CODE HERE ###

# Create Numeric and Categorical DataFrames

Create two new DataFrames. Create one DataFrame called `data_num` that holds the numeric features. Create another DataFrame called `data_cat` that holds the categorical features.

### ENTER CODE HERE ###

# Quick Testing

The full pipeline will be implemented with a `ColumnTransformer` class. However, to be sure that our numeric pipeline is working properly, let's invoke the `fit_transform()` method of the `num_pipeline` object. Then, take a look at the transformed data to be sure all is well.

### Run Pipeline and Create Transformed Numeric Data

### ENTER CODE HERE ###

### One-Hot Encode Categorical Features

Similarly, you will employ a `OneHotEncoder` class in the `ColumnTransformer` below to construct the final full pipeline. However, let's instantiate an object of the `OneHotEncoder` class called `cat_encoder` that has the `drop` parameter set to `first`. Next, call the `fit_transform()` method and pass it your categorical data. Take a look at the transformed one-hot vectors to be sure all is well.

### ENTER CODE HERE ###

# Put it All Together with a Column Transformer

Now, we are finally ready to construct the full transformation pipeline called `full_pipeline` that will transform our raw data into clean, ready-to-train data. Construct this `ColumnTransformer` below, then call the `fit_transform()` method to obtain the final, clean data. Save this output data into a variable called `data_trans`.

### ENTER CODE HERE ###

# Prepare for Grading

Prepare your `data_trans` NumPy array for grading by using the NumPy [around()](https://numpy.org/doc/stable/reference/generated/numpy.around.html) function to round all the values to 2 decimal places. This will return a NumPy array. Please note the final order of the features in your final NumPy array, which is given at the top of this document.

___You MUST print your final answer, which is the NumPy array discussed above, using the `print()` function! This MUST be the only `print()` statement in the entire notebook! Do not print anything else using the `print()` function in this notebook!___

print(np.around(data_trans, decimals=2))
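The notebook above leaves every code cell as a placeholder, so the following is a minimal sketch of how the steps could fit together, not the graded solution. It makes several assumptions that the source does not confirm: the columns are named `x1` to `x5`, `x3` is the only categorical column, and a small made-up DataFrame stands in for `CustomTransformerData.csv`, which is not shown here. The step names inside the pipelines and the index constants are also illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in data; with the real file this would be
# data = pd.read_csv("CustomTransformerData.csv"). Column names are assumed.
data = pd.DataFrame({
    "x1": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "x2": [10.0, 20.0, 30.0, 40.0, np.nan, 60.0],
    "x3": ["a", "b", "c", "a", "b", "c"],
    "x4": [0.5, 0.1, 0.9, 0.3, 0.7, 0.2],
    "x5": [2.0, 4.0, 5.0, 8.0, 10.0, 3.0],
})

# Column positions inside the numeric-only array (x3 is absent there),
# so the order is x1, x2, x4, x5.
X1_IX, X4_IX, X5_IX = 0, 2, 3


class Assignment4Transformer(BaseEstimator, TransformerMixin):
    """Append x6 = x1**3 / x5 as the last column and drop x4."""

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        x6 = X[:, X1_IX] ** 3 / X[:, X5_IX]  # computed before x4 is removed
        X = np.delete(X, X4_IX, axis=1)      # columns are now x1, x2, x5
        return np.c_[X, x6]                  # columns are now x1, x2, x5, x6


num_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("attribs_adder", Assignment4Transformer()),
    ("std_scaler", StandardScaler()),
])

num_attribs = ["x1", "x2", "x4", "x5"]
cat_attribs = ["x3"]
data_num = data[num_attribs]
data_cat = data[cat_attribs]

# Numeric columns come first, then the one-hot columns that remain after
# drop="first", which matches the feature order the assignment asks for.
full_pipeline = ColumnTransformer([
    ("num", num_pipeline, num_attribs),
    ("cat", OneHotEncoder(drop="first"), cat_attribs),
])

data_trans = full_pipeline.fit_transform(data)
print(np.around(data_trans, decimals=2))
```

The index constants refer to positions in the array that the imputer hands to the transformer, not to positions in the original DataFrame, which is why `x5` sits at index 3 rather than 4. Computing the new feature before deleting `x4` keeps those indices valid throughout `transform`.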
