Special Topic on Historical Datasets for Disaster Reduction Research   Zone I paper (under review)   Version EN1
A dataset of Ya'an Earthquake based on social media
: 2018 - 02 - 02
: 2018 - 03 - 14
Abstract & Keywords
Abstract: The Ya'an Earthquake occurred on April 20, 2013 (Beijing time). Its epicenter was located in Lushan County, Ya'an City, Sichuan Province, and its magnitude was 7.0. As of 14:30 on April 24, the earthquake had left 196 people dead, 21 missing and 11,470 injured. With the development of information and communication technology, microblogging systems have shown great potential for supporting emergency response, because they provide an easily accessible platform on which disaster information can be rapidly organized and disseminated to a large number of people. In view of this, we built this dataset of the Ya'an Earthquake from Sina-Weibo microblogs posted within Sichuan Province during the 7 days after the earthquake. Sina-Weibo, a platform for information sharing and exchange, entertainment, leisure and life services, was launched in August 2009. It can be used to search, query and publish post-earthquake information, and it gives the public a place to communicate, express their feelings, offer suggestions, and so on.
Keywords: Ya'an Earthquake; Sina-Weibo; Sichuan Province; data mining
Dataset Profile
Chinese Title | 雅安地震灾情的社交媒体数据集
English Title | A dataset of Ya'an Earthquake based on social media
Data authors | Tian Chuanzhao, Li Guoqing, Yang Tengfei, Li Zhenyu
Data corresponding author | Li Guoqing
Time range | 20–26 April 2013
Geographical scope | Sichuan Province
Data volume | 51,418 records (about 5 MB)
Data format | .xls
Data service system | <http://www.sciencedb.cn/dataSet/handle/560>
Sources of funding | National Key R&D Program of China (2016YFE0122600)
Dataset composition | The dataset consists of two parts:
(1) "Data.rar" contains 21 tables of Sina-Weibo text data, and each table corresponds to a region;
(2) "Classification sample.rar" is a sample subset illustrating the classification of text data in "Data.rar".
1.   Introduction
Ya'an Earthquake [1]: According to the China Earthquake Networks Center, the Ya'an Earthquake occurred at 8:02 on April 20, 2013 (Beijing time). The epicenter was located in Lushan County, Ya'an City (30.3°N, 103.0°E), at a depth of 13 km, and the earthquake had a magnitude of 7.0. As of 10:00 on April 24, 2013, 4,045 aftershocks had been recorded, 103 of them above magnitude 3 and the largest reaching magnitude 5.7. An area of 12,500 square kilometers around the epicenter was affected, with a population of 1.52 million. According to the China Earthquake Administration, the earthquake had left 196 people dead, 21 missing and 11,470 injured as of 14:30 on April 24 (UTC+8). Figure 1 shows the location of the earthquake.


Figure 1   Location of Ya'an Earthquake
Sina-Weibo [2], an information sharing and exchange platform that provides entertainment, leisure and other life services for the public, was launched in August 2009. By the end of March 2013, the number of registered Sina-Weibo users had grown to 536 million, an annual increase of 6.6%, and the number of daily active users had grown to 49.8 million, up 7.8% from the end of 2012. Sina-Weibo provides timely updates on earthquake disasters: users are free to search and query, government accounts can post up-to-date information on safety and rescue, and the public can communicate on the platform to express feelings such as blessing, sadness, anger and anxiety. It also allows users to offer advice on the actions the government should take. Figure 2 shows some earthquake information from Sina-Weibo.


Figure 2   Earthquake information from Sina-Weibo
There is growing evidence that the public looks intensively for relevant information during a certain period after a disaster, from all available sources and irrespective of whether a source is considered authoritative [3,6]. As citizens can both access and upload disaster information through such an open communication platform, citizen-driven emergency response is a key part of an effective response to a major disaster.
Research on this topic started earlier abroad than in China. Glaser [4] analyzed Twitter data from the 2007 California wildfires. Vieweg et al. [5] studied the 2009 Red River floods and the 2009 Oklahoma grassfires, also using Twitter data. Twitter has since become an effective platform for real-time updates. In China, scholars have also studied the use of microblogs in disaster response. Qu et al. [6] analyzed people's responses to the 2008 Sichuan Earthquake based on Tianya Forum, and Qu et al. [7] analyzed people's responses to the 2010 Yushu Earthquake based on Sina-Weibo.
2.   Data collection and processing
2.1   Overview
Using "Ya'an Earthquake" as the keyword, we crawled text data posted on Sina-Weibo from within Sichuan Province during 20–26 April 2013. Each data record includes the message content, the time created, the number of forwards, the number of likes, the number of comments and other information.
The volume of messages peaks during a certain period after an earthquake, namely the first 72 hours, which is called the golden relief time. Because of the search limitations of Sina-Weibo, we collected the data posted from all the cities of Sichuan Province during this period at one-hour intervals.
We first determined the cities for data crawling: we collected data posted from the 21 cities of Sichuan Province. Because of restrictions on the search function of Sina-Weibo, we then chose a time interval for each query. During periods when messages were posted in particularly large quantities, we crawled data every few hours; when few messages were posted, we crawled data every few days. Once a suitable query interval had been determined, the data were stored in the corresponding data table.
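The interval selection described above can be sketched as a generator of query windows. This is only an illustration under stated assumptions: the paper does not give the exact intervals, so the one-hour step during the first 72 hours and the one-day step afterwards, as well as the names `PEAK_HOURS`, `PEAK_STEP`, `QUIET_STEP` and `query_windows`, are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical settings; the paper does not state the exact values used.
PEAK_HOURS = 72                    # "golden relief time" after the earthquake
PEAK_STEP = timedelta(hours=1)     # short windows while message volume is high
QUIET_STEP = timedelta(days=1)     # long windows once message volume drops


def query_windows(quake_time, end_time):
    """Return contiguous (start, stop) windows covering quake_time..end_time."""
    windows = []
    peak_end = quake_time + timedelta(hours=PEAK_HOURS)
    start = quake_time
    while start < end_time:
        step = PEAK_STEP if start < peak_end else QUIET_STEP
        stop = min(start + step, end_time)
        windows.append((start, stop))
        start = stop
    return windows


quake = datetime(2013, 4, 20, 8, 2)   # origin time reported in the paper
end = datetime(2013, 4, 27, 0, 0)     # end of the collection period
windows = query_windows(quake, end)
print(len(windows), "windows; first:", windows[0], "last:", windows[-1])
```

In a real crawl, the same list of windows would be reused for each of the 21 cities, with one keyword query issued per city and window.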
We analyzed the trends in the 51,418 Ya'an Earthquake-related messages collected from the week after the earthquake. For each day, we counted the number of messages posted (blue line in Figure 3), forwarded (red line), commented on (green line) and liked (orange line).


Figure 3   Trends of Ya'an Earthquake-related messages
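A daily aggregation of this kind could look like the sketch below. It assumes that each record is a dictionary keyed by the dataset's field names and that `time` uses the `YYYY-MM-DD HH:MM` format seen in the sample entry. The three records are made up for illustration, and summing the per-message counts is one possible reading of "messages forwarded, commented and liked".

```python
from collections import defaultdict
from datetime import datetime

# Made-up records for illustration only; field names follow the dataset.
records = [
    {"time": "2013-04-20 09:15", "forwardCount": 12, "commentCount": 3, "likeCount": 1},
    {"time": "2013-04-20 21:40", "forwardCount": 0, "commentCount": 1, "likeCount": 0},
    {"time": "2013-04-21 18:38", "forwardCount": 10, "commentCount": 5, "likeCount": 1},
]


def daily_trends(rows):
    """Sum message, forward, comment and like counts per calendar day."""
    totals = defaultdict(
        lambda: {"messages": 0, "forwards": 0, "comments": 0, "likes": 0}
    )
    for row in rows:
        # Parsing the timestamp also validates its format.
        day = datetime.strptime(row["time"], "%Y-%m-%d %H:%M").date().isoformat()
        totals[day]["messages"] += 1
        totals[day]["forwards"] += row["forwardCount"]
        totals[day]["comments"] += row["commentCount"]
        totals[day]["likes"] += row["likeCount"]
    return dict(sorted(totals.items()))


trends = daily_trends(records)
for day, counts in trends.items():
    print(day, counts)
```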
2.2   Data classification
What types of messages did people post on Sina-Weibo in response to the earthquake? To answer this question, we randomly sampled 200 microblog messages for analysis. We identified six categories of content: emotion-related, opinion-related, action-related, situation updates, general information and others. Table 1 summarizes the categories.
Table 1   Classification of Sina-Weibo messages
Category | Description
Emotion-related | Expressing personal feelings such as blessing, sadness, anger, anxiety, etc.
Opinion-related | Criticizing or providing suggestions to the public, government or rescue agencies
Action-related | Requesting help, looking for missing people, or proposing relief actions or relief coordination
Situation Updates | Updating factual information about the earthquake
General Information | Any other earthquake relief-related information
Others | Other earthquake-related information
We applied the categories to the sampled Sina-Weibo messages (Figure 4) and found that 42% were emotion-related, 21% action-related, 14% situation updates, 8% general information, 4% opinion-related and 11% other messages about the earthquake.


Figure 4   Classification results
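The sampling and tallying steps can be sketched as follows. The labelling itself was done by the authors and is not reproduced here; the label list below is simply the set of counts implied by the reported percentages for a sample of 200, assuming those percentages are exact. The function names, the fixed seed and the placeholder message texts are hypothetical.

```python
import random
from collections import Counter

CATEGORIES = [
    "emotion-related", "opinion-related", "action-related",
    "situation updates", "general information", "others",
]


def draw_sample(messages, size=200, seed=0):
    """Draw a reproducible random sample without replacement."""
    return random.Random(seed).sample(messages, size)


def category_shares(labels):
    """Return the percentage of labels in each category, rounded to integers."""
    counts = Counter(labels)
    return {cat: round(100 * counts[cat] / len(labels)) for cat in CATEGORIES}


# Placeholder texts standing in for the 51,418 collected messages.
messages = [f"message {i}" for i in range(51418)]
sample = draw_sample(messages)

# Counts implied by the reported shares (assumption: percentages are exact).
labels = (["emotion-related"] * 84 + ["action-related"] * 42
          + ["situation updates"] * 28 + ["general information"] * 16
          + ["opinion-related"] * 8 + ["others"] * 22)
shares = category_shares(labels)
print(len(sample), shares)
```

The six reported shares add up to 100%, which the tally above reproduces.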
3.   Sample description
The data retrieved from Sina-Weibo were stored in 21 tables, one per region. Each data entry records the following fields: ID, content, location, time, forwardCount, commentCount, likeCount, keyword, province and city.
Table 2   Sample data entry
Field Name | Sample value
ID | 2231
Content | #Earthquake Live# 7.0 Ya'an Earthquake of Lushan: As of 18:00 April 21, there were 1642 aftershocks, including 78 aftershocks of magnitude 3 and above, 4 aftershocks above 5.0, 18 aftershocks between magnitude 4.0 and 4.9, and 56 aftershocks between magnitude 3.0 and 3.9. The largest aftershock occurred at 5.45pm, April 21 at Lushan. The 5.4-magnitude aftershock occurred at the junction of the two peaks.
Location |
Time | 2013-04-21 18:38
ForwardCount | 10
CommentCount | 5
LikeCount | 1
Keyword | Ya'an Earthquake
Province | Sichuan
City | Chengdu
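For readers who load the tables programmatically, a record of this shape could be modelled as below. This is a sketch under assumptions: the class `WeiboRecord` and the function `parse_row` are hypothetical, the column names are taken from the field list in this section, and every cell is assumed to arrive as a string with the time in `YYYY-MM-DD HH:MM` format, as in the sample entry.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class WeiboRecord:
    id: int
    content: str
    location: str
    time: datetime
    forward_count: int
    comment_count: int
    like_count: int
    keyword: str
    province: str
    city: str


def parse_row(row):
    """Convert one table row (a dict of strings) into a typed record."""
    return WeiboRecord(
        id=int(row["ID"]),
        content=row["content"],
        location=row["location"],
        time=datetime.strptime(row["time"], "%Y-%m-%d %H:%M"),
        forward_count=int(row["forwardCount"]),
        comment_count=int(row["commentCount"]),
        like_count=int(row["likeCount"]),
        keyword=row["keyword"],
        province=row["province"],
        city=row["city"],
    )


# The sample entry from Table 2, with the content shortened.
sample_row = {
    "ID": "2231", "content": "#Earthquake Live# 7.0 Ya'an Earthquake of Lushan",
    "location": "", "time": "2013-04-21 18:38",
    "forwardCount": "10", "commentCount": "5", "likeCount": "1",
    "keyword": "Ya'an Earthquake", "province": "Sichuan", "city": "Chengdu",
}
record = parse_row(sample_row)
print(record.city, record.time.isoformat(), record.forward_count)
```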
4.   Quality control and assessment
If the body of a Sina-Weibo message retrieved by keyword search had been deleted, we removed the corresponding entry from our dataset. Entries without time information were also removed during quality control. In addition, messages that contained only hyperlinks or no valuable information were removed, for example: "#Ya'an earthquake in Sichuan# #microblogging topic details: web link"; "#Ya'an earthquake in Sichuan# Details: web link"; "#Ya'an 7 earthquake# #microblogging topic details: web link"; "#Ya'an 7 earthquake# Details: web link"; "#Ya'an 7 earthquake# #"; "Ya'an earthquake microblogging reported safe # #".
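The three rules above can be approximated with a simple filter. The sketch below is a heuristic, not the authors' procedure: it treats a message as uninformative when nothing but whitespace or punctuation remains after links and `#topic#` tags are stripped, so it would not catch every pattern listed above (for example a leftover "Details:" label). The function names and regular expressions are hypothetical.

```python
import re

URL = re.compile(r"https?://\S+")
HASHTAG = re.compile(r"#[^#]*#")   # Weibo topics are written as #topic#


def is_informative(content):
    """Return True if any letter or digit remains after removing links and topics."""
    stripped = HASHTAG.sub("", URL.sub("", content))
    return any(ch.isalnum() for ch in stripped)


def passes_quality_control(record):
    """Apply the three rules: non-empty body, non-empty time, informative text."""
    content = (record.get("content") or "").strip()
    time = (record.get("time") or "").strip()
    return bool(content) and bool(time) and is_informative(content)


# Made-up records for illustration; the link is a placeholder.
kept = {"content": "#Earthquake Live# 1642 aftershocks so far", "time": "2013-04-21 18:38"}
link_only = {"content": "#Ya'an earthquake in Sichuan# http://t.cn/example", "time": "2013-04-21 18:40"}
no_time = {"content": "Roads to Lushan are blocked", "time": ""}
deleted = {"content": "", "time": "2013-04-21 19:00"}

for rec in (kept, link_only, no_time, deleted):
    print(passes_quality_control(rec), rec["content"][:40])
```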
5.   Value and significance
Over time, many messages about this earthquake have been deleted by their authors, which makes it difficult to access some of the valuable messages posted at the time. As the only dataset that collects information about the 2013 Ya'an Earthquake from Sina-Weibo, this dataset provides an essential resource for studying social media responses to the earthquake.
Sina-Weibo provides a platform through which the public can communicate with others. With the development of the Internet in recent years, the number of mobile client users in particular has increased greatly, and people are paying more and more attention to breaking news and trending events. Sina-Weibo, as a major Chinese microblogging platform, plays a crucial role in the search for and dissemination of trending information, especially in the event of an earthquake. This dataset can be used by academics to study which types of information are most readily forwarded, commented on and liked, how information is disseminated, how message contents can be categorized, and so on.
Acknowledgments
This work is supported by the National Key R&D Program of China (2016YFE0122600). We thank Dr. Pang Lushen from the Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, for his suggestions on the collection of this dataset. We also thank Li Zhenyu from Shandong University of Science and Technology for his support with data processing.
1. Ya'an Earthquake, available at: <https://en.wikipedia.org/wiki/2013_Lushan_earthquake>.
2. Sina-Weibo, available at: <https://en.wikipedia.org/wiki/Sina_Weibo>.
3. Sutton J, Palen L & Shklovski I. Backchannels on the front lines: Emergent use of social media in the 2007 Southern California Fires. Proceedings of the Information Systems for Crisis Response and Management Conference (ISCRAM 2008), Washington, DC, 2008.
4. Glaser M. California Wildfire Coverage by Local Media, Blogs, Twitter, Maps and More. PBS MediaShift. Available at: <http://mediashift.org/2007/10/california-wildfire-coverage-by-local-media-blogs-twitter-maps-and-more298/>.
5. Vieweg S, Hughes A et al. Microblogging during two natural hazards events: What Twitter may contribute to situational awareness. Proc. CHI (2010): 1079–1088.
6. Qu Y, Wu PF & Wang X. Online community response to major disaster: A study of Tianya Forum in the 2008 Sichuan Earthquake. Proc. HICSS, 2009.
7. Qu Y, Huang C, Zhang P et al. Microblogging after a major disaster in China: A case study of the 2010 Yushu Earthquake. Proc. CSCW, 2011.
Data citation
1. Tian C, Li G, Yang T et al. A dataset of Ya'an Earthquake based on social media. Science Data Bank. DOI: 10.11922/sciencedb.560
Manuscript and author information
How to cite this article
Tian C, Li G, Yang T et al. A dataset of Ya'an Earthquake based on social media. China Scientific Data 3 (2018), DOI: 10.11922/csdata.2018.0004.en (under review).
Tian Chuanzhao
social media data collection and analysis, writing.
PhD; research area: disaster data mining.
Li Guoqing
advice on dataset design and data check, writing.
ligq@radi.ac.cn
PhD, Professor, research area: geospatial data infrastructure, remote sensing, big data.
Yang Tengfei
motivation of the research, writing.
PhD; research area: natural language processing, disaster information mining.
Li Zhenyu
National Key R&D Program of China (2016YFE0122600)
Publication history
Zone I release date: 14 March 2018 (version EN1)
Zone II publication date: 22 June 2018 (version EN2)
Last updated: 22 June 2018 (version EN3)
China Scientific Data