×
Reviews 4.9/5 Order Now

Create a Program to Implement Gradient Descent in Python Assignment Solution

June 25, 2024
Dr. Andrew Taylor
Dr. Andrew
🇨🇦 Canada
Python
Dr. Andrew Taylor, a renowned figure in the realm of Computer Science, earned his PhD from McGill University in Montreal, Canada. With 7 years of experience, he has tackled over 500 Python assignments, leveraging his extensive knowledge and skills to deliver outstanding results.
Key Topics
  • Instructions
    • Objective
  • Requirements and Specifications
Tip of the day
Focus on Rust’s strict ownership rules and borrow checker to avoid common errors. Use tools like clippy for linting and cargo for dependency management to ensure clean and efficient code.
News
The rise of languages such as Rust and Go is notable for their performance and safety features, making them increasingly popular in systems programming.

Instructions

Objective

Write a python assignment program to implement gradient descent.

Requirements and Specifications

program-to-implement-gradient-descent-in-python

Source Code

In this quiz, you will: TODO #1. Debug my gradient descent implementation (5PTS for `dj_dw` and 5PTS for `dj_db`) to find the best parameters ($w$ and $b$) for the linear regression model being used. TODO #2. Play with the learning rate (5PTS for finding a good `alpha`). TODO # 3. Experiment with the learning rate value (5PTS to find the learning rate `alpha` that breaks the process). Note. - We're only covering the single feature case (only focused on the size of a house). - We're not concerned with testing. import math, copy import numpy as np import matplotlib.pyplot as plt # Problem Statement We've been given two data points ($m$ from the lecture notes; here $m$ is 2). Each point describes the size of a house and the corresponding price. | Size (1000 sqft) | Price (1000s of dollars) | | ----------------| ------------------------ | | 1 | 300 | | 2 | 500 | # Load our data set x_train = np.array([1.0, 2.0]) #features y_train = np.array([300.0, 500.0]) #target values ### Model Let's use the model we covered in class. So for a given input value (size), and paramters ($w$ and $b$), our model can predict the price of the house. Since there's only data points, the correct model is the line that passes through both of these points. We can work out the answer by hand, but let's see how linear regression works problem. (Correct answer for a model of the form $f_{wb}(x) = wx + b$ on this dataset is w=200 and b=100.) def f_wb(x, w, b): """ Computes the prediction of a linear model Args: x (scalar): input value w,b (scalar): model parameters Returns predicted value based on x, w, and b """ return w * x + b We'll use `plot_predictions` to plot how our model is doing. def plot_predictions(x, y, w, b): """ Plots prediction model and actual values Args: x (ndarray(m,)): input values, m examples y (ndarray(m,)): output values, m examples (the correct answers) w,b (scalar): model parameters """ m = x.shape[0] predictions = np.zeros(m) for i in range(m): predictions[i] = f_wb(x[i], w, b) # Plot our model prediction plt.plot(x, predictions, c='b',label='Our Prediction') # Plot the data points plt.scatter(x, y, marker='x', c='r',label='Actual Values') # Set the title plt.title("Housing Prices") # Set the y-axis label plt.ylabel('Price (in 1000s of dollars)') # Set the x-axis label plt.xlabel('Size (1000 sqft)') plt.legend() plt.show() Before we get into any machine learning, let's try some values of $w$ and $b$ (`w_try`, `b_try`) and see what comes out. w_try = 50 b_try = 300 plot_predictions(x_train, y_train, w_try, b_try) We're not doing that great. The blue line is what we would use to predict prices given some test sizes. ### Compute Cost Keeping track of the cost will help us know how well the $w$ and $b$ parameters capture the training data. #Function to calculate the cost def compute_cost(x, y, w, b): """ Computes the cost of a model Args: x (ndarray(m,)): input values, m examples y (ndarray(m,)): output values, m examples (the correct answers) w,b (scalar): model parameters Returns cost based on mean square error of prediction (using x, w, and b) and the correct asnwer (y) """ m = x.shape[0] cost = 0 for i in range(m): cost = cost + (f_wb(x[i], w, b) - y[i])**2 total_cost = 1 / (2 * m) * cost return total_cost compute_cost(x_train, y_train, w_try, b_try) ### TODO #1 (10PTS) ### Gradient Descent `compute_gradient` will compute gradients used during the gradient descent process. This process will update $w$ and $b$, starting from an initial guess of these values. Please fix Lines 19 and 20. def compute_gradient(x, y, w, b): """ Computes the gradient for linear regression Args: x (ndarray (m,)): input values, m examples y (ndarray (m,)): output values, m examples (the correct answers) w,b (scalar) : model parameters Returns dj_dw (scalar): The gradient of the cost w.r.t. the parameters w dj_db (scalar): The gradient of the cost w.r.t. the parameter b """ # Number of training examples m = x.shape[0] dj_dw = 0 dj_db = 0 for i in range(m): dj_dw += (f_wb(x[i], w, b) - y[i]) * x[i] #FIX ME dj_db += (f_wb(x[i], w, b) - y[i]) * x[i] #FIX ME dj_dw = dj_dw / m dj_db = dj_db / m return dj_dw, dj_db `run_gradient_descent` will run gradient descent. The initial values for parameters (`w_initial` and `b_initial`), the learning rate (`alpha`), and how long to run gradient descent (`num_iters`) are specified by the user. The history of the cost and the parameter values are stored and returned (`J_history` and `p_history`). def run_gradient_descent(w_initial, b_initial, alpha, num_iters, J_history, p_history): """ Computes the gradient for linear regression Args: w_initial (scalar): initial value of w b_initial (scalar): initial value of b alpha: learning rate num_iters (scalar): how long gradient descent will be run J_history (list): store cost after every iteration p_history (list): store parameter tuple (w,b) after every iteration Returns dj_dw (scalar): The gradient of the cost w.r.t. the parameters w dj_db (scalar): The gradient of the cost w.r.t. the parameter b """ w = w_initial b = b_initial for i in range(num_iters): # Get the gradients dj_dw, dj_db = compute_gradient(x_train, y_train, w, b) # Update parameters w = w - alpha * dj_dw b = b - alpha * dj_db # Save off for plotting J_history.append(compute_cost(x_train, y_train, w , b)) p_history.append([w,b]) # Print cost every at intervals if i% math.ceil(num_iters/10) == 0: print(f"Iteration {i:4}: Cost {J_history[-1]:0.2e} ", f"dj_dw: {dj_dw: 0.3e}, dj_db: {dj_db: 0.3e} ", f"w: {w: 0.3e}, b:{b: 0.5e}") return w, b, J_history, p_history J_history = [] p_history = [] num_iters = 100 alpha = 0.008 w_initial = 0. b_initial = 0. w_final, b_final, J_history, p_history = run_gradient_descent(w_initial, b_initial, alpha, num_iters, J_history, p_history) print(f"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})") ### Cost versus iterations of gradient descent A plot of cost versus iterations is a useful measure of progress of the gradient descent process. Cost should always decrease in successful runs. # plot cost versus iteration fig, ax = plt.subplots(1,1, figsize=(6, 6)) ax.plot(J_history) ax.set_title("Cost vs. iteration(start)") ax.set_ylabel('Cost') ax.set_xlabel('iteration step') plt.show() ### Final Model let's use the updated values of the parameters $w$ and $b$ (`w_final`, `b_final`) to plot the model. plot_predictions(x_train, y_train, w_final, b_final) #### TODO #2 (5 PTS) Either increase or decrease `alpha` to arrive at a good value. Track the cost vs iteration plot to help guide the selection. Play with `num_iters`, if even you're best `alpha` doesn't quite get you there. J_history = [] p_history = [] num_iters = 100 alpha = 0.02 #FIX ME. w_initial = 0. b_initial = 0. # Run gradient descent w_final, b_final, J_history, p_history = run_gradient_descent(w_initial, b_initial, alpha, num_iters, J_history, p_history) print(f"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})") # plot cost versus iteration fig, ax = plt.subplots(1,1, figsize=(6, 6)) ax.plot(J_history) ax.set_title("Cost vs. iteration(start)") ax.set_ylabel('Cost') ax.set_xlabel('iteration step') plt.show() # Plot final model plot_predictions(x_train, y_train, w_final, b_final) #### TODO #3 (5 PTS) With `num_ters` set to 100, increase `alpha` starting from 0.002. At what value does the gradient descrent process break down and you start seeing NaNs in the cost, for instance? J_history = [] p_history = [] num_iters = 100 alpha = 0.5 #Experiment w_initial = 0. b_initial = 0. # Run gradient descent w_final, b_final, J_history, p_history = run_gradient_descent(w_initial, b_initial, alpha, num_iters, J_history, p_history) print(f"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})") # plot cost versus iteration fig, ax = plt.subplots(1,1, figsize=(6, 6)) ax.plot(J_history) ax.set_title("Cost vs. iteration(start)") ax.set_ylabel('Cost') ax.set_xlabel('iteration step') plt.show() # Plot final model plot_predictions(x_train, y_train, w_final, b_final) ## Congratulations! - You're done with the final quiz. - Please convert this notebook to a PDF and submit on Gradescope - Make sure all output cells are visible in the PDF

Similar Samples

Explore our array of programming homework samples at ProgrammingHomeworkHelp.com. Our curated examples in Java, Python, C++, and more showcase our expertise in solving diverse coding challenges. Each sample is crafted to illustrate clear, efficient solutions that adhere to academic standards. Dive into our samples to see how we can help you excel in your programming assignments.