Curve Fitting#
Teng-Jui Lin
Content adapted from UW CHEME 375, Chemical Engineering Computer Skills, in Spring 2021.
Python skills and numerical methods
- Curve fitting by scipy.optimize.curve_fit()
- Minimizing the sum of squared residuals by scipy.optimize.minimize()

ChemE applications
- Curve fitting of data
- Clausius-Clapeyron equation
Fitting the Clausius-Clapeyron equation#
Problem Statement. Fit a curve according to the Clausius-Clapeyron equation for the temperature series
[80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280]
with units of K, and the vapor pressure series
[4.98e-02, 1.22e+01, 3.65e+02, 4.02e+03, 2.43e+04, 6.62e+04, 1.47e+05, 2.68e+05, 8.91e+05, 9.85e+05, 1.62e+06]
with units of Pa.
Solution. One form of the Clausius-Clapeyron equation is given by

$$\ln P^* = -\dfrac{\Delta H_{\text{vap}}}{RT} + B,$$

where \(P^*\) is the vapor pressure, \(\Delta H_{\text{vap}}\) is the heat of vaporization, \(T\) is the absolute temperature, \(R\) is the ideal gas constant, and \(B\) is a constant.

Given \(P^*\) and \(T\) datasets, we can fit the Clausius-Clapeyron equation by recognizing that

$$\ln P^* = A\left(\dfrac{1}{T}\right) + B,$$

where \(A = -\dfrac{\Delta H_{\text{vap}}}{R}\), is a linear equation of \(\ln P^*\) in \(\dfrac{1}{T}\).

We can also write it in the exponential form:

$$P^* = \exp\left(\dfrac{A}{T} + B\right).$$
Implementation: Curve fitting of the exponential form using scipy.optimize.curve_fit()#
In this approach, we use scipy.optimize.curve_fit()
to fit the exponential form directly.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import minimize
# define given data points
T = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280])
Pv = np.array([4.98e-02, 1.22e+01, 3.65e+02, 4.02e+03, 2.43e+04, 6.62e+04, 1.47e+05, 2.68e+05, 8.91e+05, 9.85e+05, 1.62e+06])
def clausius_clapeyron(T, A, B):
    '''Exponential form of the Clausius-Clapeyron equation'''
    return np.exp(A/T + B)
# use scipy.optimize.curve_fit()
popt, pcov = curve_fit(clausius_clapeyron, T, Pv)
popt
array([-1477.20170831, 19.57105488])
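As a side check (a sketch not in the original analysis, assuming the ideal gas constant R = 8.314 J/(mol·K)), the fitted \(A\) can be converted back to a heat of vaporization, and pcov gives rough standard errors of the parameters:
# sketch: recover dH_vap from A = -dH_vap/R; standard errors from the diagonal of pcov
R = 8.314  # ideal gas constant [J/(mol K)] (assumed value)
A_fit, B_fit = popt
dH_vap = -A_fit*R  # heat of vaporization [J/mol]
perr = np.sqrt(np.diag(pcov))  # one-standard-deviation errors of A and B
print(dH_vap, perr)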
# define curve fit line
T_fit = np.arange(60, 300, 10)
Pv_fit = clausius_clapeyron(T_fit, *popt)
# plot settings
%config InlineBackend.figure_format = 'retina'
%matplotlib inline
plt.rcParams.update({
'font.family': 'Arial', # Times New Roman, Calibri
'font.weight': 'normal',
'mathtext.fontset': 'cm',
'font.size': 18,
'lines.linewidth': 2,
'axes.linewidth': 2,
'axes.spines.top': False,
'axes.spines.right': False,
'axes.titleweight': 'bold',
'axes.titlesize': 18,
'axes.labelweight': 'bold',
'xtick.major.size': 8,
'xtick.major.width': 2,
'ytick.major.size': 8,
'ytick.major.width': 2,
'figure.dpi': 80,
'legend.framealpha': 1,
'legend.edgecolor': 'black',
'legend.fancybox': False,
'legend.fontsize': 14
})
fig, axs = plt.subplots(1, 2, figsize=(8, 4))
axs[0].plot(T, Pv, 'o', label='Data point')
axs[0].plot(T_fit, Pv_fit, label='Curve fit')
axs[0].set_xlabel('$T$')
axs[0].set_ylabel('$P^*$')
axs[0].set_xlim(60, 300)
axs[0].set_ylim(0, 2e6)
axs[0].ticklabel_format(axis="y", style="sci", scilimits=(0, 0))
axs[0].legend()
axs[1].plot(1/T, np.log(Pv), 'o', label='Data point')
axs[1].plot(1/T_fit, np.log(Pv_fit), label='Curve fit')
axs[1].set_xlabel(r'$\dfrac{1}{T}$')
axs[1].set_ylabel(r'$\ln P^*$')
axs[1].set_ylim(top=16)
axs[1].legend()
plt.tight_layout()
By inspection of the linear form (right panel), the curve fit deviates from the data trend as \(\dfrac{1}{T}\) increases, i.e., at small \(T\).
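One possible remedy (a sketch with hypothetical guess values, not part of the original notebook) is to pass an initial guess to curve_fit() through its p0 argument so the optimizer starts near the expected parameter scale:
# sketch: supply an initial guess (hypothetical values) via p0
popt_p0, pcov_p0 = curve_fit(clausius_clapeyron, T, Pv, p0=[-2000, 20])
popt_p0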
Implementation: Curve fitting of the linearized form using scipy.optimize.curve_fit()#
In this approach, we use scipy.optimize.curve_fit()
to fit the linearized form.
def clausius_clapeyron_linear(x, A, B):
    '''Linear form of the Clausius-Clapeyron equation'''
    return A*x + B
# use scipy.optimize.curve_fit()
popt, pcov = curve_fit(clausius_clapeyron_linear, 1/T, np.log(Pv))
popt
array([-1915.93999621, 21.54152215])
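As a quick goodness-of-fit check (a sketch not in the original notebook), the coefficient of determination of the linearized fit can be computed in \(\left(\dfrac{1}{T}, \ln P^*\right)\) space:
# sketch: coefficient of determination (R^2) of the linearized fit in log space
ln_Pv_pred = clausius_clapeyron_linear(1/T, *popt)
ss_res = np.sum((np.log(Pv) - ln_Pv_pred)**2)
ss_tot = np.sum((np.log(Pv) - np.mean(np.log(Pv)))**2)
r_squared = 1 - ss_res/ss_tot
r_squared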
# define curve fit line
inv_T_fit = 1/np.arange(60, 300, 10)
ln_Pv_fit = clausius_clapeyron_linear(inv_T_fit, *popt)
fig, axs = plt.subplots(1, 2, figsize=(8, 4))
axs[0].plot(T, Pv, 'o', label='Data point')
axs[0].plot(1/inv_T_fit, np.exp(ln_Pv_fit), label='Curve fit')
axs[0].set_xlabel('$T$')
axs[0].set_ylabel('$P^*$')
axs[0].set_xlim(60, 300)
axs[0].set_ylim(0, 2e6)
axs[0].ticklabel_format(axis="y", style="sci", scilimits=(0, 0))
axs[0].legend()
axs[1].plot(1/T, np.log(Pv), 'o', label='Data point')
axs[1].plot(inv_T_fit, ln_Pv_fit, label='Curve fit')
axs[1].set_xlabel(r'$\dfrac{1}{T}$')
axs[1].set_ylabel(r'$\ln P^*$')
axs[1].set_ylim(top=16)
axs[1].legend()
plt.tight_layout()
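The fitted parameters can also be used for prediction; a usage sketch at an arbitrary (hypothetical) temperature of 150 K:
# usage sketch: predict P* at a hypothetical temperature of 150 K
T_query = 150.0  # K (hypothetical query point)
Pv_query = np.exp(clausius_clapeyron_linear(1/T_query, *popt))
Pv_query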
Implementation: Curve fitting using scipy.optimize.minimize()#
The residual (error) between the \(i\)th data point and the best fit line can be defined as

$$r_i = y_i - y_{\text{fit},i},$$

so the sum of squared residuals (SSR) is

$$\mathrm{SSR} = \sum_i r_i^2.$$

The best fit line of the data points will have a minimum SSR.
In this approach, we use scipy.optimize.minimize()
to minimize the sum of squared residuals of the exponential form.
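Note that the objective coded below scales each residual by 10% of the measured pressure, so it effectively minimizes a relative (weighted) SSR in which each residual is compared against its own measurement's magnitude:

$$\mathrm{SSR}_w = \sum_i \left(\dfrac{P^*_i - P^*_{\text{fit},i}}{0.1\,P^*_i}\right)^2$$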
def clausius_clapeyron_SSR(params):
    '''Weighted sum of squared residuals of the Clausius-Clapeyron equation'''
    A, B = params
    # each residual is scaled by 10% of the measured pressure (relative weighting)
    return np.sum(((Pv - clausius_clapeyron(T, A, B))/(0.1*Pv))**2)
# define initial guess
A_guess = -1000
B_guess = 10
guess = [A_guess, B_guess]
# use scipy.optimize.minimize()
res = minimize(clausius_clapeyron_SSR, guess)
res.x
array([-1931.98479473, 21.45976783])
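A quick sanity check on the optimizer output (a sketch, not part of the original notebook): res.success flags convergence and res.fun is the minimized weighted SSR; a derivative-free method such as Nelder-Mead could also be tried:
# sketch: inspect convergence and the minimized objective; optionally try another solver
print(res.success, res.fun)
res_nm = minimize(clausius_clapeyron_SSR, guess, method='Nelder-Mead')
res_nm.x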
# define curve fit line
T_fit = np.arange(60, 300, 10)
Pv_fit = clausius_clapeyron(T_fit, *res.x)
fig, axs = plt.subplots(1, 2, figsize=(8, 4))
axs[0].plot(T, Pv, 'o', label='Data point')
axs[0].plot(T_fit, Pv_fit, label='Curve fit')
axs[0].set_xlabel('$T$')
axs[0].set_ylabel('$P^*$')
axs[0].set_xlim(60, 300)
axs[0].set_ylim(0, 2e6)
axs[0].ticklabel_format(axis="y", style="sci", scilimits=(0, 0))
axs[0].legend()
axs[1].plot(1/T, np.log(Pv), 'o', label='Data point')
axs[1].plot(1/T_fit, np.log(Pv_fit), label='Curve fit')
axs[1].set_xlabel(r'$\dfrac{1}{T}$')
axs[1].set_ylabel(r'$\ln P^*$')
axs[1].set_ylim(top=16)
axs[1].legend()
plt.tight_layout()
Conclusion#
In this notebook, we explored implementations of curve fitting using curve_fit() and minimize(). The coefficients found in each implementation are summarized below:
| No. | Method | \(A\) | \(B\) |
|---|---|---|---|
| 1 | curve_fit() on the exponential form | -1477 | 19.57 |
| 2 | curve_fit() on the linearized form | -1916 | 21.54 |
| 3 | minimize() of the weighted SSR (exponential form) | -1932 | 21.46 |
By inspection of the graphs and the coefficient table, we found that methods 2 and 3 most accurately fit the overall data trend, whereas method 1 fails to capture the trend at small \(T\).