Based on the Landsat satellite image, the corresponding apparent radiance value is calculated according to the gain value and the offset value of the thermal infrared band of the Landsat data. Then, the surface specific radiance is estimated by using the image vegetation coverage. Finally, according to the Planck inverse function and the Landsat preset, the calibration factor calculates the surface temperature. The specific implementation go as follows:

(1) Apparent radiance calculation

Apparent radiance calculation is the process of converting the image pixel gray value to the corresponding heat radiation intensity, transferring the image's grayscale values to radiance values with the Radimetric Calibration tool in ENVI software.

(2) Estimation of surface specific emissivity

The surface specific emissivity is an ability to characterize the electromagnetic radiation of a ground object. Based on the near-infrared and red-light bands of the image, the normalized vegetation index NDVI is calculated according to formula (1).^{[11]}

| \(NDVI=\left({TM}_{\mathrm{N}\mathrm{I}\mathrm{R}}-{TM}_{\mathrm{R}\mathrm{e}\mathrm{d}}\right)/\left({TM}_{\mathrm{N}\mathrm{I}\mathrm{R}}+{TM}_{\mathrm{R}\mathrm{e}\mathrm{d}}\right)\) | （1） |

Where: \({\mathrm{T}\mathrm{M}}_{\mathrm{N}\mathrm{I}\mathrm{R}}\) is the near-infrared band reflectivity of the image; \({\mathrm{T}\mathrm{M}}_{\mathrm{R}\mathrm{e}\mathrm{d}}\) is the red band reflectance of the image.

Second, to estimate the surface specific emissivity using image vegetation coverage. Due to the differences in the structure of surface materials, the corresponding surface radiance calculation methods are different for different land types. There are roughly three types: water, town, and natural surfaces. The water surface pixel has a high surface emissivity and a similar radiance to the black body. Therefore, when estimating the water body specific emissivity, it is often assigned a value of 0.995. The natural surface and the town surface pixel are estimated by the surface emissivity according to the formula. (2) and formula (3) calculation:^{[12,13,14]}

| \({e}_{\mathrm{S}\mathrm{u}\mathrm{r}\mathrm{f}\mathrm{a}\mathrm{c}\mathrm{e}}=0.9625+0.0614{P}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}-0.0461{{P}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}}^{2}\) | （2） |

| \({e}_{\mathrm{B}\mathrm{u}\mathrm{i}\mathrm{l}\mathrm{d}\mathrm{i}\mathrm{n}\mathrm{g}}=0.9589+0.086{P}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}-0.0671{{P}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}}^{2}\) | （3） |

Where:\(\mathrm{ }{\mathrm{e}}_{\mathrm{S}\mathrm{u}\mathrm{r}\mathrm{f}\mathrm{a}\mathrm{c}\mathrm{e}}\) represents the surface specific emissivity of natural surface pixels; \({\mathrm{e}}_{\mathrm{B}\mathrm{u}\mathrm{i}\mathrm{l}\mathrm{d}\mathrm{i}\mathrm{n}\mathrm{g}}\) represents the surface specific emissivity of the town pixel; \({\mathrm{P}}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}\) represents the vegetation coverage, as calculated by equation (4):^{[15,16] }

| \({P}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}={\left[\frac{NDVI-{NDVI}_{\mathrm{S}\mathrm{o}\mathrm{i}\mathrm{l}}}{{NDVI}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}-{NDVI}_{\mathrm{S}\mathrm{o}\mathrm{i}\mathrm{l}}}\right]}^{2}\) | （4） |

Where: NDVI is the normalized vegetation index of the image, \({\mathrm{N}\mathrm{D}\mathrm{V}\mathrm{I}}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}\) is the NDVI value of the pure vegetation pixel in the image, \({\mathrm{N}\mathrm{D}\mathrm{V}\mathrm{I}}_{\mathrm{S}\mathrm{o}\mathrm{i}\mathrm{l}}\) is the NDVI value of the pure soil pixel in the image. When the NDVI of a pixel is greater than 0.70,\(\mathrm{ }{\mathrm{P}}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}\) takes a value of 1; when NDVI is less than 0.05, \({\mathrm{P}}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}\) takes a value of 0; when NDVI is between 0.05 and 0.7, \({\mathrm{N}\mathrm{D}\mathrm{V}\mathrm{I}}_{\mathrm{V}\mathrm{e}\mathrm{g}\mathrm{e}}\) and NDVI are respectively used. \({\mathrm{N}\mathrm{D}\mathrm{V}\mathrm{I}}_{\mathrm{S}\mathrm{o}\mathrm{i}\mathrm{l}}\) takes values of 0.70 and 0.05, and uses the above formula to estimate the vegetation coverage of the image.

（3）Surface temperature calculation

Based on the calculation results of apparent radiance and surface specific emissivity in the thermal infrared band, the surface thermal radiance is reversed according to the radiative transfer equation, as shown in equation (5):

| \(B\left({T}_{\mathrm{S}\mathrm{e}\mathrm{n}\mathrm{s}}\right)=\frac{L-{L}^{↑}-\tau \left(1-\epsilon \right){L}_{↓}}{\tau \epsilon }\) | （5） |

Where: \(\mathrm{B}\left({\mathrm{T}}_{\mathrm{S}\mathrm{e}\mathrm{n}\mathrm{s}}\right)\) is the surface thermal radiance; L is the planetary brightness temperature value; \({\mathrm{L}}^{↑}\) and \({\mathrm{L}}_{↓}\) are the atmospheric up-radiation and atmospheric down-radiation, respectively, according to the Landsat satellite transit time, the latitude and longitude of the center of interest, and the atmospheric mode. And the type of sensor obtained, the specific parameters can be directly obtained through NASA related webpage (http://atmcorr.gsfc.nasa.gov/); τ is the atmospheric path transmittance, which can be obtained by NASA related webpage; ε is the surface specific radiation rate.

Calculate the true surface temperature^{[17,18]}using the Plank inverse function in conjunction with Landsat's preset scaling constants \({\mathrm{K}}_{1}\) and \({\mathrm{K}}_{2}\) as shown in equation (6):

| \({T}_{\mathrm{S}\mathrm{u}\mathrm{r}\mathrm{f}\mathrm{a}\mathrm{c}\mathrm{e}}=\frac{{K}_{2}}{\mathrm{ln}\left(\frac{{K}_{1}}{\mathrm{B}\left({\mathrm{T}}_{\mathrm{S}\mathrm{e}\mathrm{n}\mathrm{s}}\right)}+1\right)}-273.15\) | （6） |

Where: \({\mathrm{T}}_{\mathrm{S}\mathrm{u}\mathrm{r}\mathrm{f}\mathrm{a}\mathrm{c}\mathrm{e}}\) is the true temperature of the surface; \({\mathrm{K}}_{1}\) and \({\mathrm{K}}_{2}\) are the preset scaling constants of Landsat, which can be obtained from the image header file.

In order to avoid the influence of multi-phase data, the multi-temporal surface temperature data is normalized by pixel, and the specific implementation is as shown in formula (7):

| \(Image=\frac{X-{X}_{\mathrm{m}\mathrm{i}\mathrm{n}}}{{X}_{\mathrm{m}\mathrm{a}\mathrm{x}}-{X}_{\mathrm{m}\mathrm{i}\mathrm{n}}}\) | （7） |

Where: Image represents the image cell value after normalization, the pixel value is between 0 and 1; X represents the image surface temperature value; \({\mathrm{X}}_{\mathrm{m}\mathrm{a}\mathrm{x}}\) and \({\mathrm{X}}_{\mathrm{m}\mathrm{i}\mathrm{n}}\) represent the maximum and minimum values of the image surface temperature, respectively.

**2.3.2 Extracting thermal anomalies based on improved box plot method**

The improved box plot method has five statistics, which are the upper and lower quartiles, the upper and lower non-abnormal value intercept lines, and the median. The upper and lower non-abnormal value intercept lines are used to distinguish between normal values and abnormal values, and the data within the non-abnormal value intercept line is a normal value, otherwise it is an abnormal value. There are two main differences between this method and the most common box plot method in statistics: one is to introduce the Bowley coefficient, and the Pauli coefficient is used as the skewness multiplier of the sample data in the form of "1". When the data set is normally distributed, the Bowley coefficient is concentrated near 0. When the data is right skewed, the Bowley coefficient takes 1; when the data is left skewed, the Bowley coefficient takes −1.The second is to calculate the non-outlier intercept line by using the semi-quartile range, which can be better applied to data with skewed state.

^{[19]}The specific implementation of the method is as follows:

(1) Get the upper and lower quartiles and the semi-quartile range

The surface temperature data is obtained from the image-by-pixel by the IDL program, and then the upper and lower quartiles and medians of the corresponding data are obtained by using the SPSS software, that is, the values are ranked at 25%, 50%, and 75% respectively after sorting the data from small to large.

Then, based on the corresponding values based on the quartiles, the upper and lower semi-quartile distances are calculated using EXCEL software, as shown in equations (8) and (9):^{[20]}

| \({SIQR}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{n}}={Q}_{2}-{Q}_{1}\) | （8） |

| \({SIQR}_{\mathrm{u}\mathrm{p}}={Q}_{3}-{Q}_{2}\) | （9） |

Where: \({\mathrm{S}\mathrm{I}\mathrm{Q}\mathrm{R}}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{n}}\) and \({\mathrm{S}\mathrm{I}\mathrm{Q}\mathrm{R}}_{\mathrm{u}\mathrm{p}}\) are the upper and lower limits of the box length, that is, the upper and lower semi-quartile distance of the improved box plot; \({\mathrm{Q}}_{1}\),\(\mathrm{ }{\mathrm{Q}}_{2}\) and\({\mathrm{Q}}_{3}\) are the upper quartiles of the sample data respectively. , median and lower quartile.

（2）Calculate non-outlier intercept lines

The non-abnormal value intercept line includes a non-abnormal maximum intercept line and a non-abnormal minimum intercept line for determining whether the sample data is an abnormal value. The non-abnormal maximum intercept line is used as the basis for judging whether it is a thermal abnormal value. If the sample data exceeds this value, it is determined to be a thermal abnormal value, that is, there may be a thermal anomaly. The farther the sample data deviates from the outside of the extreme anomaly intercept line, the more serious the thermal anomaly is. The implementation process is as shown in formula (10), formula (11) and formula (12):

| \({B}_{\mathrm{c}}=\frac{{SIQR}_{\mathrm{u}\mathrm{p}}-{SIQR}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{n}}}{{SIQR}_{\mathrm{u}\mathrm{p}}+{SIQR}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{m}}}\) | （10） |

Where: \({\mathrm{B}}_{\mathrm{c}}\) is the Bowley coefficient, between −1 and 1; \({\mathrm{S}\mathrm{I}\mathrm{Q}\mathrm{R}}_{\mathrm{u}\mathrm{p}}\) a\({\mathrm{S}\mathrm{I}\mathrm{Q}\mathrm{R}}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{m}}\) are the upper and lower semi-quartile distances of the box respectively.

| \({f}_{\mathrm{u}\mathrm{p}}={Q}_{3}+1.5×IQR×\left(\frac{1+{B}_{\mathrm{c}}}{1-{B}_{\mathrm{c}}}\right)\) | （11） |

| \({f}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{n}}={Q}_{1}-1.5×IQR×\left(\frac{1-{B}_{\mathrm{c}}}{1+{B}_{\mathrm{c}}}\right)\) | （12） |

Where: \({\mathrm{f}}_{\mathrm{u}\mathrm{p}}\) and \({\mathrm{f}}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{n}}\) are the non-abnormal maximum intercept line and the non-abnormal minimum intercept line of the box respectively. When a certain data in the sample data set is larger than \({\mathrm{f}}_{\mathrm{u}\mathrm{p}}\), it is classified as a thermal abnormal value; when the sample data set is some When a data is less than \({\mathrm{f}}_{\mathrm{d}\mathrm{o}\mathrm{w}\mathrm{n}}\), it is classified as a cold abnormal value.

Finally, use ArcGIS software to filter the values larger than the non-abnormal maximum intercept line by pixel by pixel

**2.3.3 Extracting thermal anomaly zone based on thermal anomaly frequency**

The surface temperature of the thermal anomaly zone tends to be higher than the surrounding surface temperature, forming a local heat island phenomenon, and its frequency of occurrence is generally greater than the frequency of occurrence of thermal anomalies caused by other factors. Therefore, based on the results of the thermal anomaly extraction, the frequency of occurrence of the annual thermal anomaly area is statistically analyzed. In order to avoid the phenomenon of missed judgment, when the frequency of occurrence of a thermal anomaly zone is greater than 60%, it is determined as an urban thermal anomaly zone, as shown in formula (13):

| \({f}_{\mathrm{l}\mathrm{o}\mathrm{c}\mathrm{a}\mathrm{l}}=\frac{{i}_{\mathrm{l}\mathrm{o}\mathrm{c}\mathrm{a}\mathrm{l}}}{{n}_{\mathrm{l}\mathrm{o}\mathrm{c}\mathrm{a}\mathrm{l}}}\) | （13） |

Where: \({\mathrm{f}}_{\mathrm{l}\mathrm{o}\mathrm{c}\mathrm{a}\mathrm{l}}\) represents the frequency of occurrence of thermal anomalies; \({\mathrm{i}}_{\mathrm{l}\mathrm{o}\mathrm{c}\mathrm{a}\mathrm{l}}\) represents the number of occurrences of a local thermal anomaly in a year; \({\mathrm{n}}_{\mathrm{l}\mathrm{o}\mathrm{c}\mathrm{a}\mathrm{l}}\) represents the total number of occurrences of the overall thermal anomaly in a year

In order to further explore the distribution range of urban thermal anomaly areas, statistically analyze the frequency of occurrence of thermal anomalies in 2008–2017, and take the area with the extraction frequency greater than 60% as the urban thermal anomaly area of 10 a, as shown in formula (14):

| \({f}_{\mathrm{g}\mathrm{l}\mathrm{o}\mathrm{b}\mathrm{a}\mathrm{l}}=\frac{{i}_{\mathrm{g}\mathrm{l}\mathrm{o}\mathrm{b}\mathrm{a}\mathrm{l}}}{{n}_{\mathrm{g}\mathrm{l}\mathrm{o}\mathrm{b}\mathrm{a}\mathrm{l}}}\) | （14） |

Where:\(\mathrm{ }{\mathrm{f}}_{\mathrm{g}\mathrm{l}\mathrm{o}\mathrm{b}\mathrm{a}\mathrm{l}}\) represents the frequency of occurrence of thermal anomalies in 10 a; \({\mathrm{i}}_{\mathrm{g}\mathrm{l}\mathrm{o}\mathrm{b}\mathrm{a}\mathrm{l}}\) represents the total number of occurrences of a local thermal anomaly in 10 years; \({\mathrm{n}}_{\mathrm{g}\mathrm{l}\mathrm{o}\mathrm{b}\mathrm{a}\mathrm{l}}\) represents the total number of occurrences of the overall thermal anomaly in 10 years, that is, a value of is 10.