This example generates the parameters of a second-order polynomial that fits synthetic data.
The main source for this example is:
https://donaldpinckney.com/books/pytorch/book/ch2-linreg/2018-03-21-multi-variable.html
which has a good explanation of the underlying concepts and a deeper explanation of the algorithm itself.
Other sources for details on plotting and data generation are:
https://www.kaggle.com/kanncaa1/pytorch-tutorial-for-deep-learning-lovers
https://www.guru99.com/pytorch-tutorial.html
The code
Import modules:
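import matplotlib
import matplotlib.pyplot as plt
import torch
import torch.optim as optim
import numpy as np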

Print version numbers:
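import sys # only used to print the Python version
print('Python: ',sys.version)
print('torch: ',torch.__version__)
print('matplotlib: ',matplotlib.__version__)
print('numpy: ',np.__version__)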

Output:

Generate Data
Now generate the synthetic data. In this example, the underlying model is a second-order polynomial, with added random noise.
Define the underlying noiseless model:
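def y_clean(x_input):
    return 100 + 2*x_input + 5*np.power(x_input,2)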

There is nothing special about the constants chosen, so pick your favorites.
Now some parameters that define the number of data points, the maximum x value, and the amplitude of the noise:
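num_x = 50 # number of x points (also batch size)
x_max = 8 # maximum value (minimum = 0)
noise = 50 # random noise amplitude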

Use these parameters to generate a NumPy array of random x-values, and use those x-values to generate a corresponding NumPy array of y-values:
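x = x_max*np.random.rand(num_x) # generate random x values
x = np.sort(x) # to make plotting easier
y = y_clean(x) + noise*np.random.rand(num_x)-(noise/2)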

Note that the noise is shifted so that it is centered at zero.
Now plot the generated data:
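plt.figure()
plt.plot(x,y_clean(x),'b')
plt.scatter(x,y,c='g')
plt.xlabel("x values")
plt.ylabel("output")
plt.title("Clean and Noisy Input")
plt.show()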

Output:

In this plot, and in subsequent plots, the clean y-data is plotted as a blue line and the noisy y-data as green circles.
I am using Spyder, so I had to change a default setting to make the plots appear in separate windows. This step is not necessary, and it is a bad idea if you are plotting a lot of figures. See this link for instructions:
https://www.scivision.dev/spyder-with-ipython-make-matplotlib-plots-appear-in-own-window/
You will need to restart the kernel after making this change.
Now we need to reconfigure the generated data for use in PyTorch.
Because we generated the data, we already know the underlying model: a second-order polynomial. For this regression, we will also assume a second-order polynomial (cheating, yes). The independent variables for this polynomial are x and x², so we need an array with two rows: one with the x-values generated above and one with those values squared.
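x_dataset_np = np.stack((np.power(x,1),np.power(x,2)),axis=0)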

These values do not change during training.
Now convert this array and the y-values from NumPy arrays to PyTorch tensors:
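x_dataset = torch.from_numpy(x_dataset_np).float()
y_dataset = torch.from_numpy(y).float()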

Define the Model
A second-order polynomial has three coefficients: two for the x and x² variables (A), plus one that defines the intersection with the y-axis (b). Since we obviously do not know these values yet, we initialize them with random values:
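n = 2 # degree of the polynomial
torch.manual_seed(1) # ensures a consistent sequence of pseudo-random numbers
A = torch.randn((1, n), requires_grad=True)
b = torch.randn(1, requires_grad=True)
print(A,b)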

Output:

Define a function that generates y-values from the current estimates of A and b and the x and x² values generated above.
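def model(x_input):
    return A.mm(x_input) + b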

Define the loss function and optimizer.
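def loss(y_predicted, y_target):
    return ((y_predicted - y_target)**2).sum()
optimizer = optim.Adam([A, b], lr=2.5)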

In this example, the loss function is constructed by hand, but PyTorch has a good number built in, including a mean squared error function that is the same as this one (nn.MSELoss with reduction='sum'). See https://pytorch.org/docs/stable/nn.html#loss-functions for more.
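As a minimal sketch, the hand-written loss could be swapped for the built-in version; note that nn.MSELoss expects the prediction and target shapes to match, and model returns a 1 x num_x tensor, so the prediction is squeezed here:

import torch.nn as nn
mse_sum = nn.MSELoss(reduction='sum') # same as the hand-written sum-of-squares loss
current_loss = mse_sum(y_predicted.squeeze(), y_dataset) # squeeze 1 x num_x down to num_x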
And now plot the predicted y-values using the randomly initialized values of A and b:
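plot_state(x_dataset,y_dataset,x,0)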

This function, plot_state, is defined in the full code listing at the end of this post.
Output:

The predicted y-values are plotted as magenta circles. Even with randomly initialized coefficients, the polynomial shape is apparent.
Before we can train the model, we need to define some parameters:
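loss_list = []
num_iterations = 500
num_plots = 4 # number of plots per run
plot_interval = int(num_iterations / num_plots) # iterations between plots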

where loss_list is a list that captures the losses for display later, num_iterations is self-explanatory, num_plots defines how many times to plot the current predicted y-values during training, and plot_interval is the number of iterations between those plots.
Now training:
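for t in range(num_iterations):
    # zero out gradients
    optimizer.zero_grad()
    # calculate output using current polynomial parameters (A,b)
    y_predicted = model(x_dataset)
    # calculate sum of squared error
    current_loss = loss(y_predicted, y_dataset)
    # calculate the loss gradient
    current_loss.backward()
    # update polynomial parameters (A,b)
    optimizer.step()
    # store loss for plotting later
    loss_list.append(current_loss.item())
    if (t % plot_interval == 0) or (t == num_iterations - 1):
        print(f"t = {t+1}, loss = {current_loss}, A = {A.detach().numpy()}, b = {b.item()}")
        plot_state(x_dataset,y_dataset,x,t+1)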

Output after first iteration:

Output after 500 iterations:


After the 500th iteration, the estimated coefficients of the polynomial are 98.1, 4.3, and 4.8, while the coefficients used to generate the synthetic data were 100, 2, and 5. Although the y-intercept and the x² coefficient are reasonably close, the x coefficient is way off. Perhaps this is not surprising, given that the y-intercept is much larger than the x coefficient, and that an x² term will dominate an x term when their coefficients are of the same order of magnitude. In addition, the amplitude of the noise is quite large, and reducing it improves the fit (it had better). Or maybe there is something wrong with the code.
Now plot the loss over iterations:
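plt.figure()
plt.plot(range(num_iterations),loss_list)
plt.xlabel("Number of Iterations")
plt.ylabel("Loss")
plt.title("Loss vs. Iterations")
plt.yscale("log")
plt.grid(True,which="both")
plt.show()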

Output:

The loss drops very quickly in the first 50 or so iterations but is still declining after 400 iterations. Reducing the y-intercept of the synthetic data speeds up the convergence.
Full code listing:
"""
from: https://donaldpinckney.com/books/pytorch/book/ch2-linreg/2018-03-21-multi-variable.html
plotting: https://www.kaggle.com/kanncaa1/pytorch-tutorial-for-deep-learning-lovers
data generation and plotting: https://www.guru99.com/pytorch-tutorial.html
"""
# import modules
import matplotlib
import matplotlib.pyplot as plt
import torch
import torch.optim as optim
import numpy as np
# print module versions
import sys # only used to print python version
print('Python: ',sys.version)
print('torch: ',torch.__version__)
print('matplotlib: ',matplotlib.__version__)
print('numpy: ',np.__version__)
####################################################################
# generate synthetic data
####################################################################
# define noiseless underlying model
def y_clean(x_input):
    return 100 + 2*x_input + 5*np.power(x_input,2)
# generate synthetic data
num_x = 50 # number of x points (also batch size)
x_max = 8 # maximum value (minimum = 0)
noise = 50 # random noise amplitude
x = x_max*np.random.rand(num_x) # generate random x values
x = np.sort(x) # to make plotting easier
y = y_clean(x) + noise*np.random.rand(num_x)-(noise/2)
# plot noisy data
plt.figure()
plt.plot(x,y_clean(x),'b')
plt.scatter(x,y,c='g')
plt.xlabel("x values")
plt.ylabel("output")
plt.title("Clean and Noisy Input")
plt.show()
# generate array with x and x^2 rows
x_dataset_np = np.stack((np.power(x,1),np.power(x,2)),axis=0)
# convert numpy array to pytorch tensor
x_dataset = torch.from_numpy(x_dataset_np).float()
y_dataset = torch.from_numpy(y).float()
####################################################################
# define plotting function that displays current predicted values
####################################################################
def plot_state(x_data,y_data,xx,iteration):
    plt.figure()
    y_predict = model(x_data).data.numpy()
    plt.scatter(xx,y_predict[0,:],c='m')
    plt.scatter(xx,y_data[:],c='g')
    plt.plot(xx,y_clean(xx),'b-')
    plt.title(f"Iteration {iteration}")
    plt.xlabel("x values")
    plt.ylabel("output")
    plt.show()
####################################################################
# define PyTorch model
####################################################################
n = 2 # define degree of polynomial
torch.manual_seed(1) # ensures a consistent sequence of pseudo-random numbers
# assign random values to initial polynomial parameters (A,b)
A = torch.randn((1, n), requires_grad=True)
b = torch.randn(1, requires_grad=True)
print(A,b)
# define model
def model(x_input):
    return A.mm(x_input) + b
# define batch loss function
def loss(y_predicted, y_target):
    return ((y_predicted - y_target)**2).sum()
# define the optimizer
optimizer = optim.Adam([A, b], lr=2.5)
# plot predicted output using randomly initialized parameters, A and b
plot_state(x_dataset,y_dataset,x,0)
####################################################################
# train
####################################################################
loss_list = []
num_iterations = 500
num_plots = 4 # number of plots per run
plot_interval = int(num_iterations / num_plots) # iterations between plots
for t in range(num_iterations):
    # zero out gradients
    optimizer.zero_grad()
    # calculate output using current polynomial parameters (A,b)
    y_predicted = model(x_dataset)
    # calculate sum of squared error
    current_loss = loss(y_predicted, y_dataset)
    # calculate the loss gradient
    current_loss.backward()
    # update polynomial parameters (A,b)
    optimizer.step()
    # store loss for plotting later
    loss_list.append(current_loss.item())
    if (t % plot_interval == 0) or (t == num_iterations - 1):
        print(f"t = {t+1}, loss = {current_loss}, A = {A.detach().numpy()}, b = {b.item()}")
        plot_state(x_dataset,y_dataset,x,t+1)
# plot loss versus iterations
plt.figure()
plt.plot(range(num_iterations),loss_list)
plt.xlabel("Number of Iterations")
plt.ylabel("Loss")
plt.title("Loss vs. Iterations")
plt.yscale("log")
plt.grid(True,which="both")
plt.show()
