Abstract: Tropical forest, widely distributed around the equator, is the most species-rich forest type in the world and has a profound effect on global climate. It is therefore of great significance for a country to inventory its forest resources and monitor them dynamically. Research on the classification of natural forests supports the investigation of tropical forests and provides a basis for studying forest species diversity. Dual-polarized SAR data from the Sentinel-1A sensor and optical remote sensing data from the Landsat-8 sensor were used to classify the tropical natural forest of Hainan Island. First, we analyzed the single-band, multi-band and normalized difference vegetation index (NDVI) characteristics of the optical data, and the single-phase, multi-temporal and multi-polarized backscattering characteristics of the SAR data. Then, optical and backscattering features were selected for natural forest classification, and the natural forest range of Hainan Island was extracted using a support vector machine (SVM).
Keywords: multi-source remote sensing data; tropical forest; remote sensing classification; support vector machine (SVM)
|English title||Classification of Hainan island natural forests based on multi-source remote sensing data|
|Corresponding author||Gong Chen (email@example.com)|
|Data authors||Shi Jiankang, Gong Chen, Li Xinwu, Wan Xiangxing, Sun Zhongchang|
|Geographical scope||18°10’N – 20°10’N, 108°37’E – 111°03’E|
|Spatial resolution||30 m||Data volume||1.20 GB|
|Data format||*.bat, *.tif, *.shp, *.xml|
|Data service system||http://www.sciencedb.cn/dataSet/handle/711|
|Source(s) of funding||Major Science and Technology Program of Hainan Province (ZDKJ2016021); Natural Science Foundation of Hainan Province (20164177).|
|Dataset/Database composition||Classification_of_Hainan_Island_Land_cover.zip.zip stores data on the classification of Hainan land cover, with a data volume of about 1.20 GB; Classification_of_Hainan_Island_Natural_Forest.zip stores data on the classification of Hainan natural forest, with a data volume of about 15.5 MB.|
Forest is an important component of the global ecosystem. It is also an important material base for sustainable development and a precious resource for human society. Under global climate change and destructive human activities, forest resources have been damaged to varying degrees and face growing threats [1,2]. Traditional methods of forest investigation still play an important role in monitoring forest resources. However, as remote sensing advances, it is gradually becoming an important means of forestry investigation. Compared with traditional methods, remote sensing can collect forest information over large areas quickly and accurately. Through the interpretation and analysis of remote sensing images, qualitative or quantitative investigation of forest resources can be completed. Classification of remote sensing images is an important application in forestry. It not only provides strong support for forest resource investigation, but also plays an important role in forest mapping, planning and monitoring [3,4]. At present, remote sensing monitoring of artificial forest in Hainan Island has made great progress [5,6]. The remote sensing classification accuracy of rubber forest at the mesoscale is higher than that at the large scale. The classification accuracy of oil palm, teak and eucalyptus forest at the small scale is about 90%. For the tropical natural forests of Hainan Island, however, field survey is still the main classification method, and there are few studies on their classification by remote sensing.
In the tropics, cloudy or rainy weather is the most important factor affecting optical remote sensing data. Because of its long wavelength, microwave remote sensing can effectively penetrate clouds and fog, which reduces their influence on observation results. Microwave remote sensing is therefore a powerful supplement for remote sensing forest monitoring in tropical areas [7]. With optical remote sensing, forests are monitored mainly through the band information, color, texture and other characteristics of the data. With microwave remote sensing, the short-wavelength bands (C and X band) are used to identify the backscattering characteristics of different leaf shapes in the forest canopy. Combining the advantages of optical and microwave remote sensing data has become a trend in tropical forest monitoring. Under cloudy or rainy conditions, combining remote sensing with field survey data allows better monitoring of local forests in tropical regions.
The tropical forests of Hainan Island are mainly distributed in the central mountainous areas. There are five major tropical forest regions: Jian Feng ridge, Ba Wang ridge, Li Mu mountain, Wu Zhi mountain and Diao Luo mountain. Jian Feng ridge is the largest tropical forest area in Hainan Island. Owing to its topography and climate, the tropical forest of Jian Feng ridge has diverse vegetation types and a vertical distribution structure. The main forest types in Jian Feng ridge are typical tropical rain forest, tropical monsoon rain forest, evergreen moss forest and evergreen deciduous broad-leaved mixed forest. There are more than 2,800 kinds of vegetation in Jian Feng ridge. The main vegetation families are rutaceae, papilionaceae, compositae, arboraceae, moraceae, lauraceae, borneolum, palmae, myrtle, elaeocarpus and symplocos. The other four main forest regions differ somewhat in the distribution of forest types because of terrain, climate and soil conditions, but the overall distribution of forest types is similar.
Hainan Island is taken as the research area in this dataset. Using optical and multi-temporal SAR remote sensing data, the spatial distribution of the main forest types in Hainan Island is extracted. This study can provide some reference for the classification of tropical natural forests in other areas, and also provide references for the protection of natural forests in Hainan Island.
2.1 Data Source
2.1.1 Remote sensing data
Considering both the separability of the data and the cost of large-area application, we selected high-resolution remote sensing data as data sources: OLI data from Landsat-8 and Interferometric Wide swath Ground Range Detected (IW GRD) multi-look products from Sentinel-1A. All data are in UTM projection based on WGS84. Four scenes of Landsat-8 OLI data were acquired from November 2015 to February 2016 (low cloud cover and similar acquisition times), and 20 scenes of Sentinel-1A SAR data were acquired from June 2015 to April 2016 in 10 periods. Details of the data are shown in Table 1.
2.1.2 Forest survey data
To ensure the accuracy and reliability of classification, we conducted a forestry survey in Hainan Island. We collected samples of natural forest types such as typical tropical rain forest, tropical monsoon rain forest, evergreen moss forest and evergreen deciduous broad-leaved mixed forest, and samples of artificial forest types such as rubber forest, eucalyptus forest and betel nut. The main parameters recorded were forest type, geographical coordinates and forest type characteristics (forest canopy characteristics, representative tree species).
Based on the typicality of samples and the accessibility of sampling sites, we selected Dan Zhou, Tun Chang, Le Dong and other sites for artificial forest sampling, and Jian Feng ridge, Bai Hua ridge, Dong Zhai Port and other sites for natural forest sampling. Artificial forests are mostly distributed in areas of human activity. To collect samples quickly, artificial forest samples were mostly taken along expressways. Natural forests are mostly distributed in mountainous areas, where samples must be collected on foot, so the sampling area is limited. The natural forest samples are mainly concentrated in Jian Feng ridge and Bai Hua ridge. Sample distribution and sample information are shown in Figure 1.
The field sampling data contain 300 points. According to the records of the sample points, the specific characteristics of tropical forest types in Hainan Island are as follows:
(1) Artificial forest: Mainly distributed in low-altitude plains and hilly areas. It usually forms large, continuous stands of a single tree species. The planting pattern is clearly regular, and most stands are economic forest.
(2) Typical tropical rain forest: The vegetation is flourishing and the types are abundant. It is not affected by human activities. The spatial structure is clearly stratified, with 5-7 layers. The main layers are herbaceous, shrub, young arbor, general arbor and tall arbor.
(3) Evergreen deciduous broad-leaved mixed forest: It is affected by human activities. There are many sun plant species. The spatial structure is not clearly stratified, with only 1-2 layers. The main layers are shrub and arbor. The deciduous species are mainly albizia chinensis, liquidambar, radermachera hainanensis and others.
(4) Tropical monsoon rain forest: It is affected by some human activities. It generally has 3-4 layers and shows seasonal changes. Syzygium and banyan are the main species.
(5) Evergreen coniferous forest: The leaves are needle-shaped. The main tree species are Pinus caribaea and Pinus koraiensis. Few coniferous stands are large in area and continuously distributed.
(6) Evergreen moss forest: Mainly distributed on mountain tops above 1200 m. The canopy is low and the distribution area is very small.
2.2 Data processing
Since optical and SAR remote sensing have different imaging mechanisms, the factors affecting image quality also differ. We therefore divided preprocessing into two parts: SAR data preprocessing and optical data preprocessing. SAR data preprocessing includes radar calibration, image filtering and terrain correction. OLI data preprocessing includes radiometric calibration and atmospheric correction. To register data of different resolutions, the SAR data were resampled to the same 30 m spatial resolution as the OLI data. After registration, we extracted the classification features. With these features, the Support Vector Machine (SVM) classification algorithm was used to extract and classify natural forests, and the classification accuracy was analyzed against the field data and forestry survey data. The final product is the natural forest classification map of Hainan Island. The flow chart is shown in Figure 2.
2.2.1 Extraction of classification features
When extracting the classification features, we should consider not only the spectral characteristics, but also the amount of information provided by bands and the repeatability of different bands in classification8.9. We analyze the single-band and multi-band spectral characteristics of OLI data and the VV and VH polarization backscattering coefficients of SAR data, and extract the appropriate classification features.
(1) Extraction of spectral characteristics
The single-band and multi-band statistical characteristics of OLI data are analyzed:
(a) The standard deviations of the near-infrared band (Band5) and the shortwave infrared bands (Band6 and Band7) are large, while that of the blue band (Band2) is the smallest (Table 2). This shows that the blue band carries less reflectance information, while the near-infrared and shortwave infrared bands carry more.
(b) The covariances between the near-infrared band and the first four bands are the largest (Table 3). This shows that the isotropy between the deep blue band, the blue band, shortwave infrared band 2 and the other bands is weak.
(c) The correlation coefficients between bands (Table 4) show that the blue band and shortwave infrared band 2 are strongly correlated with the other bands, so their independence is relatively weak.
Because the deep blue band is suitable for coastal observation, it is retained as a candidate band for classification. We therefore selected all OLI bands except the blue band and shortwave infrared band 2 as the preferred bands.
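The band statistics discussed above (standard deviation, covariance and correlation) can be reproduced along the following lines. This is a minimal sketch, assuming the bands are already loaded as a NumPy array of shape (bands, rows, cols); the function name `band_statistics` and the random test array are hypothetical and stand in for real OLI data.

```python
import numpy as np

def band_statistics(stack):
    """Return per-band standard deviation, covariance and correlation matrices.

    stack: array of shape (bands, rows, cols), one layer per band.
    """
    # Flatten each band into a row of pixel values
    pixels = stack.reshape(stack.shape[0], -1).astype(float)
    std = pixels.std(axis=1, ddof=1)   # sample standard deviation per band
    cov = np.cov(pixels)               # band-to-band covariance matrix
    corr = np.corrcoef(pixels)         # band-to-band correlation matrix
    return std, cov, corr

# Synthetic stand-in for a 7-band image subset (random values, not real data)
rng = np.random.default_rng(0)
stack = rng.random((7, 50, 50))
std, cov, corr = band_statistics(stack)
print(std.round(3))
```

Bands with a large standard deviation and low correlation with the others are the natural candidates to keep, which is the reasoning applied in items (a) to (c).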
(2) Extraction of SAR backscattering characteristics
VV and VH polarization backscattering coefficients of SAR data are analyzed (Table 5 and Table 6):
(a) The trend of standard deviation from rainy season (from June to October) to dry season (from November to February) is increasing. It shows that the information obtained in dry season is more abundant than that obtained in rainy season.
(b) The standard deviation of VH polarization is higher than that of VV polarization, which indicates that the cross polarization can get more information than the co-polarization.
(c) Both polarizations reach their maximum standard deviation in February. If only one SAR acquisition can be used in order to limit the data volume, it should therefore be one imaged in the dry season.
After analyzing the statistical characteristics of the optical and SAR data, and considering data volume, band validity and seasonal variation, we selected bands 1, 3, 4, 5 and 6 of the OLI data, the normalized difference vegetation index (NDVI) and dual-polarization SAR data from two dates (2015-08-12 and 2016-02-08) as the feature combination for classification.
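As an illustration of how such a feature combination might be assembled, the sketch below computes NDVI from the OLI red (Band4) and near-infrared (Band5) layers and stacks it with the selected bands and SAR layers. The array names, the tiny 4 x 4 size and the random values are assumptions for demonstration only; the source does not describe the exact stacking procedure.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.zeros_like(denom)
    # Leave pixels with a zero denominator at 0 instead of dividing by zero
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical co-registered 30 m layers (random values, not real data)
rng = np.random.default_rng(1)
oli = {band: rng.random((4, 4)) for band in (1, 3, 4, 5, 6)}
sar = {name: rng.random((4, 4))
       for name in ("VV_20150812", "VH_20150812", "VV_20160208", "VH_20160208")}

# 5 OLI bands + NDVI + 4 SAR layers, stacked along the first axis
features = np.stack(
    [oli[b] for b in (1, 3, 4, 5, 6)]
    + [ndvi(oli[4], oli[5])]
    + list(sar.values())
)
print(features.shape)
```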
Table 5 Statistical characteristics of VH polarization backscattering coefficient in different periods
|Statistical characteristics||Minimum||Maximum||Average||Standard deviation||Brightness difference|
Table 6 Statistical characteristics of VV polarization backscattering coefficient in different periods
|Statistical characteristics||Minimum||Maximum||Average||Standard deviation||Brightness difference|
2.2.2 Classification algorithm
We used the support vector machine (SVM) algorithm to extract the natural forest region and to classify natural forest types. SVM is based on Vapnik-Chervonenkis (VC) dimension theory and the structural risk minimization principle. It uses the information provided by a limited number of samples to balance learning ability against model complexity and so obtain the best generalization ability [10]. The principle of SVM is to map the training samples into a higher-dimensional space and find the optimal hyperplane there, which then separates the different classes of data [11]. The SVM algorithm has been widely used in remote sensing.
The SVM algorithm was run with the SVM module of the ENVI software. The main settings in the SVM classification module are the kernel function and its parameters. The available kernel functions are the linear kernel, polynomial kernel, sigmoid kernel and radial basis function (RBF) kernel.
The RBF kernel was selected for modeling. It is the most popular kernel and has good learning ability with good classification results [12]. The classification parameters must be set in the SVM module of ENVI. When extracting the natural forest region, the gamma value of the RBF kernel was 0.083 and the penalty coefficient C was 100. When classifying the natural forest types, the gamma value of the RBF kernel was 0.077 and C was 100.
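To make the role of gamma concrete, the following sketch evaluates the RBF kernel in its standard form, K(x, y) = exp(-gamma * ||x - y||^2), using the gamma value quoted above. It is an illustration only: the function name and the two feature vectors are hypothetical, and the source does not describe how ENVI implements the kernel internally.

```python
import math

def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

GAMMA_REGION = 0.083  # value reported for natural forest region extraction

# Two hypothetical feature vectors (for example, scaled band values)
pixel_a = [0.0, 0.0]
pixel_b = [1.0, 1.0]

print(rbf_kernel(pixel_a, pixel_a, GAMMA_REGION))  # identical vectors give 1.0
print(rbf_kernel(pixel_a, pixel_b, GAMMA_REGION))  # similarity decays with distance
```

A larger gamma makes the similarity fall off faster with distance, so the decision boundary follows the training samples more closely, while the penalty coefficient C controls how strongly misclassified training samples are penalized.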
2.2.3 Classification system of tropical natural forest
Because the distribution of tropical natural forest in Hainan Island is very complex, we first extract the natural forest region, then we classify the natural forest types in the region.
(1) Land classification system of Hainan Island
The primary land classification of Hainan Island is mainly aimed at extracting the natural forest region, so we only classify the basic objects. According to remote sensing data and the use of primary land classification, we established the primary land classification system of Hainan Island. Referring to field samples, we interpreted Landsat-8 remote sensing images and established interpretation markers. Table 7 shows the primary land classification of Hainan Island and the description of interpretation markers. Figure 3 shows the diagram of main objects. With these interpretation markers, we selected training samples in SVM classification.
|Primary object||Secondary object||Interpretation markers|
|Forest||Coast forest||Dark green; near the seaside; Dark stripes|
|Forest||Artificial forest||Dark green; Most have regular texture features|
|Forest||Natural forest||Dark green; Bright; Irregular texture features|
|Non-forest||Artificial surface||Bright grey; Some are blue|
|Non-forest||Water||Dark cyan; Some waters are green|
|Non-forest||Bare land||Brown; Some have regular textures|
|Non-forest||Other vegetation||Light green or bright green; Regular shape|
(2) Tropical natural forest classification system of Hainan Island
Hainan Island is located in the tropics and is an important area of tropical forest vegetation in China. The classification system for the tropical forests of Hainan Island is still debated. In the Technical regulations for forest resources planning and Survey issued in 2010 [13], forest types are classified according to use or tree species. Because there are many tree species, remote sensing interpretation under that system is difficult. In the vegetation classification system proposed by Song [14] in 2011, forest is treated as an independent vegetation class, which is more conducive to forest classification.
Based on the vegetation classification system proposed by Song [14,15] in 2011, combined with the field samples and the remote sensing characteristics, we classified the tropical natural forest of Hainan Island into typical tropical rain forest, tropical monsoon rain forest, coniferous forest, coastal forest and evergreen broad-leaved forest (Table 8). Because the evergreen deciduous broad-leaved mixed forest is only scattered within the evergreen broad-leaved forest, there is no obvious difference between them. We therefore merged the two into a single class, evergreen broad-leaved forest.
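|Vegetation class||Vegetation subclass||Vegetation type group||Vegetation type|
|Forest||Ⅰ. Coniferous forest||1. Evergreen coniferous forest||1) Thermal evergreen coniferous forest|
|Forest||Ⅱ. Broad-leaved forest||2. Evergreen deciduous broad-leaved mixed forest||2) Secondary evergreen deciduous broad-leaved mixed forest|
|Forest||Ⅱ. Broad-leaved forest||3. Evergreen moss forest||3) Mountainous evergreen moss forest|
|Forest||Ⅱ. Broad-leaved forest||4. Evergreen broad-leaved forest||4) Typical evergreen broad-leaved forest|
|Forest||Ⅱ. Broad-leaved forest||4. Evergreen broad-leaved forest||5) Seasonal evergreen broad-leaved forest|
|Forest||Ⅱ. Broad-leaved forest||5. Tropical monsoon rain forest||6) Tropical deciduous monsoon rainforest|
|Forest||Ⅱ. Broad-leaved forest||5. Tropical monsoon rain forest||7) Tropical semi-deciduous monsoon rainforest|
|Forest||Ⅱ. Broad-leaved forest||6. Tropical rain forest||8) Typical tropical rainforests|
|Forest||Ⅱ. Broad-leaved forest||6. Tropical rain forest||9) Tropical seasonal rainforest|
|Forest||Ⅱ. Broad-leaved forest||7. Coast forest||10) Mangrove|
|Forest||Ⅱ. Broad-leaved forest||7. Coast forest||11) Tropical coral reef coastal forest|
|Forest||Ⅲ. Bamboo forest and bamboo grove||8. Bamboo forest||12) Tufty bamboo forest|
|Forest||Ⅲ. Bamboo forest and bamboo grove||8. Bamboo forest||13) Mixed bamboo forest|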
3.1 Primary land classification of Hainan Island
Firstly, we used SVM algorithm to get the primary land classification of Hainan Island. According to the primary land classification system, Hainan Island is divided into 7 objects: water, natural forest, bare land, artificial forest, artificial surface, coastal forest and other vegetation (Figure 4).
3.2 Natural forest region of Hainan Island
Based on the primary land classification of Hainan Island, the natural forest region was extracted by SVM method. Based on field survey samples and visual interpretation of remote sensing images, patches of natural forests were identified with SVM, and the natural forest region was finally obtained (Figure 5).
3.3 Natural forest classification based on SVM
To avoid the interference of other objects in natural forest classification, we extract the natural forest region and classify the natural forest in this region. It is difficult to recognize the natural forest on remote sensing images visually. Therefore, the selection of natural forest classification samples mainly depends on field samples. We selected 280 natural forest samples, randomly selected 70% of the samples for training and classification, and 30% for accuracy verification.
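The 70%/30% random split described above could be carried out as in the sketch below. It assumes a simple random split over all 280 samples; whether the original split was stratified by forest type is not stated in the source, and the function name, the seed and the integer sample identifiers are hypothetical.

```python
import random

def split_samples(samples, train_fraction=0.7, seed=42):
    """Randomly split samples into a training set and a validation set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 280 hypothetical sample identifiers standing in for the field samples
sample_ids = range(280)
train_set, validation_set = split_samples(sample_ids)
print(len(train_set), len(validation_set))
```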
We use the separability tool provided by ENVI software to calculate the difference between two categories. The statistical distance between categories is determined by calculating Jeffries-Matusita distance and transformed divergence, so that the separability of training samples (ROI) can be obtained. Table 9 is the separability statistics of natural forest samples in Hainan Island.
The separability value of training samples ranges from 0 to 2.0. A value greater than 1.9 indicates good separability. When the value is less than 1.8, the samples need to be edited or re-selected. When the value is less than 1, merging the two classes into one should be considered. The separability between every pair of natural forest types is greater than 1.9 (Table 9), which shows that the five natural forest types are well separated.
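For readers who want to check separability outside ENVI, the sketch below computes the Jeffries-Matusita distance in its usual Gaussian form, JM = 2(1 - exp(-B)), where B is the Bhattacharyya distance between the two classes. This is the textbook definition and is assumed, not confirmed by the source, to match what the ENVI tool reports; the function name and the synthetic samples are hypothetical.

```python
import numpy as np

def jeffries_matusita(class_a, class_b):
    """JM distance between two classes, each of shape (n_samples, n_features)."""
    a = np.asarray(class_a, dtype=float)
    b = np.asarray(class_b, dtype=float)
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(b, rowvar=False))
    cov_mean = (cov_a + cov_b) / 2.0
    # Bhattacharyya distance: a mean-separation term plus a covariance term
    term_mean = mean_diff @ np.linalg.solve(cov_mean, mean_diff) / 8.0
    _, logdet_mean = np.linalg.slogdet(cov_mean)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    term_cov = 0.5 * (logdet_mean - 0.5 * (logdet_a + logdet_b))
    return 2.0 * (1.0 - np.exp(-(term_mean + term_cov)))

# Synthetic training samples with 3 features (random values, not real data)
rng = np.random.default_rng(2)
forest_a = rng.normal(0.0, 1.0, size=(200, 3))
forest_b = rng.normal(10.0, 1.0, size=(200, 3))
print(jeffries_matusita(forest_a, forest_b))  # well-separated classes approach 2.0
```

The value saturates at 2.0 as classes move apart, which is why thresholds such as 1.8 and 1.9 are meaningful on this scale.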
|Separability||Typical tropical rain forest||Tropical monsoon rain forest||Evergreen coniferous forest||Evergreen broad-leaved forest||Coast forest|
|Typical tropical rain forest|
|Tropical monsoon rain forest||1.9365|
|Evergreen coniferous forest||1.9220||1.9157|
|Evergreen broad-leaved forest||1.9236||1.9291||1.9048|
The natural forest classifications based on SVM are shown in Figures 6 and 7. The yellow line in Figure 7 marks a gap in the Sentinel-1A SAR coverage: it is the strip left without data when the scenes were mosaicked.
4.1 Accuracy analysis of primary land classification in Hainan island
The confusion matrix is used to evaluate the classification accuracy. Producer accuracy, user accuracy, overall accuracy and kappa coefficient are used for accuracy evaluation [16]. Table 10 shows the accuracy evaluation of primary land classification in Hainan Island.
With the confusion matrix to evaluate the accuracy of classification results, it can be seen that SVM method has achieved good results in primary land classification of Hainan Island. The overall accuracy is 95.15% and the Kappa coefficient is 0.943.
|Classification data||Objects||Water||Natural forest||Bare land||Artificial forest||Artificial surface||Coastal forest||Other Vegetation||Total||User accuracy|
|Overall accuracy: 95.15% Kappa coefficient: 0.943|
4.2 Accuracy analysis of natural forest classification in Hainan island
Producer accuracy, user accuracy, overall accuracy and kappa coefficient are also used for accuracy evaluation of natural forest classification in Hainan Island (Table 11).
Table 11 shows that the SVM method achieves good accuracy in natural forest classification. Although the overall accuracy is very high, this is mainly because coastal forest and other easily distinguished forest types raise it. When these are set aside, the accuracy of some forest types is lower, for example evergreen coniferous forest.
For evergreen coniferous forest, the producer accuracy based on the SVM method is only 53.57%. There are two reasons for this. (1) The field survey showed that evergreen coniferous forest is very scattered. Even where it occurs in patches, the patches are very small, often smaller than a single pixel. This produces mixed pixels, which cause errors in sample training and classification. (2) We used C-band SAR data. Although C-band cannot penetrate a lush tree canopy, it can penetrate sparsely distributed evergreen coniferous stands. In tropical natural forest there are broad-leaved shrubs below the evergreen coniferous canopy, which makes it difficult to distinguish evergreen coniferous forest from evergreen broad-leaved forest. C-band is therefore not sensitive to the leaf shape of evergreen coniferous forest, which reduces the classification accuracy.
|Classification data||Typical tropical rain forest||Tropical monsoon rain forest||Evergreen coniferous forest||Evergreen broad-leaved forest||Coast forest||Total||User accuracy|
|Typical tropical rain forest||1389||37||76||207||0||1709||81.28%|
|Tropical monsoon rain forest||108||1354||0||66||0||1528||88.61%|
|Evergreen coniferous forest||0||0||90||0||0||90||100%|
|Evergreen broad-leaved forest||0||4||0||1352||0||1356||99.71%|
|Coast forest||3||0||2||0||1075||1080||99.54%|
|Overall accuracy: 91.19% Kappa coefficient: 0.8837|
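The accuracy measures used in Sections 4.1 and 4.2 can be derived from a confusion matrix as sketched below, with rows taken as classification data and columns as reference data. The matrix reuses the counts of Table 11; the coast forest diagonal of 1075 is an assumption inferred from the row total of 1080 and the 99.54% user accuracy. Values computed this way may differ slightly from the reported summary figures, and the function name is hypothetical.

```python
def accuracy_metrics(matrix):
    """Return user accuracies, producer accuracies, overall accuracy and kappa.

    matrix[i][j]: pixels classified as class i whose reference class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]
    user = [d / r if r else 0.0 for d, r in zip(diagonal, row_totals)]
    producer = [d / c if c else 0.0 for d, c in zip(diagonal, col_totals)]
    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return user, producer, overall, kappa

# Class order: typical rain, monsoon rain, coniferous, broad-leaved, coast
table_11 = [
    [1389, 37, 76, 207, 0],
    [108, 1354, 0, 66, 0],
    [0, 0, 90, 0, 0],
    [0, 4, 0, 1352, 0],
    [3, 0, 2, 0, 1075],  # 1075 is inferred (assumption), see the note above
]
user, producer, overall, kappa = accuracy_metrics(table_11)
print([round(p * 100, 2) for p in producer])
```

The third producer accuracy (evergreen coniferous forest) is 90 correctly classified pixels out of 168 reference pixels, which is the low value discussed in Section 4.2.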
We selected Hainan Island as the study area. Using multi-source remote sensing data, the natural tropical forest of Hainan Island was extracted and classified. Both the primary land classification and the natural forest classification were carried out with a combination of optical and SAR data. The overall accuracy of the SVM classification is 95.15% for the primary land classification and 91.19% for the natural forest classification. The results show that the SVM method achieves good accuracy in natural forest classification.
We would like to express our sincere thanks to Hainan academy of environmental science for their assistance in collecting the key parameters of tropical forest in Hainan, including forest types, geographical coordinates and characteristics of forest types. We also thank Hainan forestry department for their assistance in collecting the auxiliary data on tropical forest, including field survey data of forest resources and forest distribution data in Hainan.
Thapa R B, Itoh T, Shimada M, et al. Evaluation of ALOS PALSAR sensitivity for characterizing natural forest cover in wider tropical areas. Remote Sensing of Environment 155 (2014): 32 – 41.
Chen S. Study on Forest Biomass Inversion Method Based on Landsat-7 ETM+ and PALSAR Data. Master’s Dissertation, China University of Mining and Technology 2018.
Liao K, Qi S, Wang C, et al. Estimation of Forest Aboveground Biomass and Canopy Height in Jiangxi Province Using GLAS and Landsat TM Images. Remote Sensing Technology and Application 33 (2018): 713 – 720.
Zhang J, Zhang M, Liu S, et al. Application of FY-3 Meteorological Satellite in Monitoring Remote Sensing of Rubber Plantation in Hainan Island. Chinese Journal of Tropical Crops 35 (2014): 2059 – 2065.
Zhang J, Tao Z, Liu S, et al. Rubber Planting Acreage Calculation in Hainan Island Based on TM Image. Chinese Journal of Tropical Crops 31 (2010): 661 – 665.
Zhou J, Li Z, Li X. Research on Rules of the Valley Glacier Motion in Western China Based on ALOS/PALSAR Interferometry. Acta Geodaetica et Cartographica Sinica 35 (2013): 2507 – 2516.
Jia K, Li Q. Review of Features Selection in Crop Classification Using Remote Sensing Data. Resources Science 38 (2009): 341 – 347.
Bo Y, Wang J. Exploring the Scale Effect in Thematic Classification of Remotely Sensed Data the Statistical Separability-based Method. Remote Sensing Technology and Application 19 (2004): 443 – 449.
Zheng E. Cost Sensitive Data Mining Based on Support Vector Machines: Theories and Applications. Doctor’s Dissertation, Zhejiang University 2006.
Wang L. A Study on SAR Image Classification Base on Bag of Visual Words. Master’s Dissertation, University of Chinese Academy of Sciences 2015.
Cui D, Guo R. Comparison and Analysis of Support Vector Machines Based on Several Parameter Optimizations in Runoff Forecasting. Journal of Water Resources Research 34 (2013): 34 – 38.
GB/T 26424—2010. Technical regulations for forest resources planning and Survey. Investigation Planning and Design Institute of National Forestry Beijing, 2011.
Song Y. Recognition and proposal on the vegetation classification system of China. Chinese Journal of Plant Ecology 35 (2011): 882 – 892.
Song Y, Yan E, Song K. Synthetic comparison of eight dynamics plots in evergreen broadleaf forests, China. Biodiversity Science 23 (2015): 139 – 148.
Shi J, Gong Ch, Li X, Wan X & Sun Zh. Classification of Hainan island natural forests based on multi-source remote sensing data. Science Data Bank, DOI: 10.11922/sciencedb.711 (2019).
How to cite this article
Shi J, Gong Ch, Li X, Wan X & Sun Zh. Classification of Hainan island natural forests based on multi-source remote sensing data. China Scientific Data 4(2019). DOI: 10.11922/csdata.2018.0091.zh