Journal of Geo-information Science  2017, 19(4): 467-474 https://doi.org/10.3724/SP.J.1047.2017.00467

Theories and Methods of Geo-information Science

Spatio-temporal Analysis of Aggregated Human Activities Based on Massive Mobile Phone Tracking Data

CAO Jinzhou (1), TU Wei (2,3,4,*), LI Qingquan (1,2,3,4), CAO Rui (5)

1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
2. Shenzhen Key Laboratory of Spatial Smart Sensing and Services, College of Civil Engineering, Shenzhen University, Shenzhen 518060, China
3. Key Laboratory for Geo-Environmental Monitoring of Coastal Zone of the National Administration of Surveying, Mapping and GeoInformation, Shenzhen 518060, China
4. Smart City Institute, Shenzhen University, Shenzhen 518060, China
5. International Doctoral Innovation Centre, University of Nottingham, Ningbo 315100, China

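Corresponding author: *TU Wei (b. 1984), male, from Huanggang, Hubei; PhD, assistant professor; research interests: spatio-temporal big data analysis. E-mail: tuwei@szu.edu.cn

Received: 2016-11-04

Revised: 2016-12-24

Published online: 2017-04-20

Copyright: 2017 Editorial Office of Journal of Geo-information Science. All rights reserved.

Funding: National Natural Science Foundation of China (41401444, 41371377, 41671387); Research Start-up Project for Young Teachers of Shenzhen University (2016065); Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources (KF-2016-02-009).

About the author: CAO Jinzhou (b. 1991), male, from Yiyang, Hunan; PhD candidate; research interests: spatio-temporal big data analysis and mining. E-mail: caojinzhou@whu.edu.cn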


Abstract

Urban space and human behavior constantly interact and shape each other. Investigating the distribution of aggregated human activities and its spatio-temporal change supports data-driven decision-making in urban planning and governance. In the era of big data, advances in information and communication technologies, location-aware devices and sensors make it possible to collect city-scale data at high spatial and temporal resolution, and the study of spatio-temporal activities has attracted considerable attention. Taking Shenzhen, China as an example, this paper uses one workday of base-station-level tracking data from about 10 million mobile phone users. It first identifies stay locations with spatial and temporal rules to generate a stay trajectory for each individual, and recovers activity semantics by labelling each stay with an activity type. It then analyzes the differences between the distributions of stay locations and activities, and explores the temporal and spatial distributions of different activity types. The results show that the distributions of stay locations and activities differ: an individual has on average 2.1 stay locations per day but engages in 3.4 activities per day. Different types of activities also show temporal variation and spatial heterogeneity. The temporal distribution fluctuates markedly over 24 hours, in accordance with daily routines. The spatial distribution overall follows a "spatial power law"; the distribution of social activity has a faster-decaying tail and shows more pronounced spatial differentiation than home and work activities. The study reveals the diversity and heterogeneity of the spatial and temporal distribution of aggregated human activities in urban space, which is useful for research on human activities, urban traffic optimization and urban planning.

Keywords: mobile phone tracking data ; trajectory analysis ; spatial-temporal big data ; aggregated human activities ; spatial-temporal pattern

Cite this article: CAO Jinzhou, TU Wei, LI Qingquan, CAO Rui. Spatio-temporal Analysis of Aggregated Human Activities Based on Massive Mobile Phone Tracking Data[J]. Journal of Geo-information Science, 2017, 19(4): 467-474. https://doi.org/10.3724/SP.J.1047.2017.00467

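1 Introduction

Cities are dynamic and complex[1]. Urban space consists not only of land, buildings and other infrastructure; it is also closely tied to the activities and movements of the people who live in the city[2-3]. Urban space and residents' behavior constantly interact and influence each other. As urbanization and social change accelerate, urban space grows more complex, and population flows and activities become faster and more diverse. Studying the regularities of people's activities in urban space and how they change over space and time therefore helps us understand the interaction between aggregated activities and urban space, and provides decision support for human behavior research[4], urban planning and layout[5], and public transport optimization[6]. Traditional analyses of aggregated activities rely mainly on travel survey questionnaires, but this manual approach yields small samples, takes a long time to collect, and is labor-intensive[7]. Reliable and readily available spatio-temporal data are needed to capture the characteristics of aggregated activities quickly.

The rapid development of information and communication technology (ICT) and ubiquitous sensing technology has made it possible to record human spatio-temporal positions efficiently and in real time, producing massive, multi-source data on human spatio-temporal activities[8-9]. These data compensate for the shortcomings of traditional methods and offer high accuracy, high sampling frequency, high efficiency and low cost[10]. Because smartphones are widely used, mobile operators record spatio-temporal positions at the base-station scale through communication signals, and smartphone-based mobile positioning data offer a new way to study the spatio-temporal characteristics of aggregated activities[11].

Current work on identifying spatio-temporal activities from mobile phone positioning data operates at two levels, the individual and the aggregate[12]. Identification at the individual level labels the locations in a spatio-temporal trajectory chain with activity semantics, based on spatio-temporal regularities and prior information[13-15]. Identification at the aggregate level starts from the spatial relationship between activity locations and land use and infers activity types at the scale of spatial clusters[16-19]. These studies examine the interaction between aggregated activity characteristics and urban space from different angles. However, mobile phone data lack semantic information (such as activity category and duration at a location), activity trajectories vary in spatial scale, and activity features are hard to extract, so accurately identifying and analyzing aggregated activities from big data remains a major challenge.

This paper uses massive mobile phone positioning data from Shenzhen. It identifies users' stay locations and stay activities, labels the activities with semantic information, analyzes how the distributions of stay points and stay activities differ, studies the spatio-temporal distribution patterns of aggregated activities, and discusses the diverse distribution characteristics of activity patterns. The results contribute to understanding the diversity and variation in the spatio-temporal distribution of different activity types in urban space, and provide a reference for urban traffic optimization and urban planning.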

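2 Study Area and Data Sources

Shenzhen is chosen as the study area. It lies in southern Guangdong Province, bordering Hong Kong to the south and Dongguan and Huizhou to the north, with a total area of 1996 km². As shown in Fig. 1, Shenzhen administers 8 districts. Luohu, Futian and Nanshan originally formed the Special Economic Zone (SEZ), which was extended to the whole municipality after 2010. At the end of 2015 the resident population was about 11.3789 million, with a population density reaching 10 000 persons per km², the highest in China, and the city has the largest floating population in the country[20-21]. For historical reasons, the southern and northern parts of Shenzhen differ greatly in economic and social development, and the population is unevenly distributed. The original SEZ districts (Luohu, Futian and Nanshan), formerly known as "Guannei" (inside the checkpoint line), are highly developed in technology, finance and education and are densely populated. The other districts, known as "Guanwai" (outside the line), have a stronger manufacturing sector and a sparser population. As urbanization accelerates and the population keeps growing, the gap between Guannei and Guanwai is narrowing, cross-district travel is increasing, and population activities are becoming more complex.

Fig. 1   The study area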

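This study uses mobile phone positioning data from about 10 million users on one workday in March 2012. At a sampling interval of roughly 1 h, the mobile operator recorded the ID of the base station handling each user's communication, which gives the user's location. The operator anonymized the data to protect user privacy. Rounding the timestamp of each record to an integer hour yields a data sequence with the attributes user ID, hour, base station ID, base station longitude and base station latitude, as shown in Tab. 1.

Tab. 1   Examples of mobile phone location records

User ID    Hour    Base station ID    Base station longitude    Base station latitude
536****    0       19**               114.14**                  22.60**
536****    1       19**               114.14**                  22.60**
536****    2       54**               114.12**                  22.58**
536****    23      14**               114.14**                  22.60**

Note: For privacy protection, specific values are masked with asterisks (*).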

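3 Methods

To study the spatio-temporal distribution patterns of aggregated activities, their diverse distribution characteristics, and their differentiation in space and time, this paper first preprocesses the raw data and extracts stay trajectories, then applies a set of spatio-temporal rules to identify human activities and label them with activity semantics, and finally analyzes the spatio-temporal characteristics of the different activity types. The workflow (Fig. 2) has four steps: ① generating spatio-temporal trajectories at the base-station scale; ② extracting spatio-temporal stay trajectories; ③ identifying home, work and social activities; ④ analyzing the spatio-temporal characteristics of aggregated activities.

Fig. 2   Framework of spatio-temporal analysis of aggregated human activities using cellphone location data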

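3.1 Generating spatio-temporal trajectories at the base-station scale

To generate complete spatio-temporal trajectory sequences from the massive positioning data, the raw data are first preprocessed to filter out unsuitable records. The main steps are: ① remove duplicate records; ② remove records with missing attributes; ③ remove records whose time or location falls outside the study scope; ④ based on each user's raw data, discard users observed for fewer than 18 h (i.e., whose point sequence contains fewer than 18 points).

Sorting the remaining records by user and time yields the complete set of individual trajectories. An individual trajectory is represented as a sequence of points with geographic coordinates and time stamps, as in Eq. (1).

$$Tr = \{P_1, P_2, \ldots, P_n\} \tag{1}$$

where $P_i$ is the $i$-th location record of the individual, as in Eq. (2).

$$P_i = (x, y, t) \tag{2}$$

where $t$ is the hour and $x$, $y$ are the coordinates of the base station.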
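The preprocessing steps and Eqs. (1)-(2) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Record` type, the function name `build_trajectories` and the reading of "fewer than 18 h" as "fewer than 18 distinct integer hours" are assumptions, and the spatial-extent check of step ③ is left out because the paper gives no bounding coordinates.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """One positioning record with the attributes listed in Tab. 1 (hypothetical type)."""
    user_id: str
    hour: int          # integer hour of the day, 0-23
    station_id: str
    lon: float
    lat: float


MIN_HOURS = 18  # threshold from step 4


def build_trajectories(records, min_hours=MIN_HOURS):
    """Return {user_id: [(x, y, t), ...]} with each trajectory sorted by hour."""
    seen = set()
    by_user = defaultdict(list)
    for r in records:
        # Step 2: drop records with missing attributes.
        if any(v is None for v in r):
            continue
        # Step 3 (temporal part only): keep hours inside the single study day.
        if not 0 <= r.hour <= 23:
            continue
        # Step 1: drop exact duplicates.
        if r in seen:
            continue
        seen.add(r)
        by_user[r.user_id].append(r)

    trajectories = {}
    for user, recs in by_user.items():
        # Step 4 (assumed reading): require at least `min_hours` distinct hours.
        if len({r.hour for r in recs}) < min_hours:
            continue
        recs.sort(key=lambda r: r.hour)
        # Each point is P_i = (x, y, t) as in Eq. (2).
        trajectories[user] = [(r.lon, r.lat, r.hour) for r in recs]
    return trajectories
```

Each value of the returned dictionary corresponds to one trajectory $Tr$ in Eq. (1).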

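3.2 Extracting spatio-temporal stay trajectories

People are stationary for most of the day, and the spatio-temporal attributes of their stays (number of stay locations, time of day and duration) differ with the nature of the activity[22].

A stay point is determined by a series of consecutive phone location points that satisfy certain spatio-temporal constraints. The time-ordered point trajectory has to be clustered, by spatial and temporal proximity, into a more abstract stay-point trajectory. For simplicity, the location of each stay point is taken to be the position that occurs most often among the location points it contains. Let the point trajectory of user $i$ be $Tr_i = \{P_1, P_2, \ldots, P_n\}$, where $n$ is the number of location records. The first point $P_1$ is assigned to the first stay point of the candidate stay-point trajectory. For each subsequent point, the spatial distance to the stay points already in the candidate stay-point trajectory is computed. If the distance is below a threshold (500 m in this study), the point is added to that candidate stay point; otherwise it starts a new candidate stay point. When all $n$ points have been processed, the result is the candidate stay-point trajectory in Eq. (3).

$$Tr_i' = \{S'_1, S'_2, \ldots, S'_m\} \tag{3}$$

where $S'_i$ is the $i$-th stay point of the individual, as in Eq. (4).

$$S'_i = (x, y, t_{start}, t_{end}) \tag{4}$$

where $x$, $y$ are the coordinates of the base station of the $i$-th stay point, $t_{start}$ is the start time of the stay and $t_{end}$ is its end time.

Every stay point in the candidate trajectory whose duration (end time minus start time) is below a threshold (1 h in this study) is not considered a real stay and is removed, which yields the final stay-point trajectory.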
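The stay-point procedure can be sketched as below. This is one possible reading rather than the authors' code. It assumes that each new point is compared only with the most recent candidate stay point, that this candidate is represented by its most frequent position so far, and that distance is the great-circle (haversine) distance between base-station coordinates. The function names are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def modal_position(points):
    """Most frequent (lon, lat) among the points of one candidate stay."""
    return Counter((p[0], p[1]) for p in points).most_common(1)[0][0]


def extract_stays(trajectory, max_dist_m=500.0, min_duration_h=1):
    """trajectory: [(lon, lat, hour), ...] sorted by hour.

    Returns the stay-point trajectory [(lon, lat, t_start, t_end), ...].
    """
    groups = []  # each group holds the consecutive points of one candidate stay
    for lon, lat, hour in trajectory:
        if groups:
            clon, clat = modal_position(groups[-1])
            if haversine_m(lon, lat, clon, clat) < max_dist_m:
                groups[-1].append((lon, lat, hour))
                continue
        groups.append([(lon, lat, hour)])

    stays = []
    for g in groups:
        t_start, t_end = g[0][2], g[-1][2]
        if t_end - t_start < min_duration_h:
            continue  # shorter than the duration threshold: not a real stay
        slon, slat = modal_position(g)
        stays.append((slon, slat, t_start, t_end))
    return stays
```

With hourly sampling, a candidate made of a single point has zero duration and is dropped, so only positions observed in at least two consecutive records survive as stays.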

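Base stations are unevenly distributed and differ in service range, and signals can jump between neighboring stations. To account for this, Voronoi polygons of the base stations were generated in ArcGIS and the distribution of distances between adjacent stations was computed. As Fig. 3 shows, about 93% of adjacent-station distances are below 500 m (the red point in Fig. 3). A spatial threshold of 500 m therefore accommodates most of the variation among base stations.

Fig. 3   Cumulative distribution of the distance between adjacent base stations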

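3.3 Identifying home, work and social activities

A person's daily routine is assumed to start from home, involve travel to other locations for some activity (such as work), and always end with a return home. Home and work are therefore the two main daily activities and show strong spatio-temporal regularity[23]. Other activities such as shopping, leisure and fitness account for a smaller share of daily life and are less regular; in this study they are grouped together as social activities. To identify the home and work locations from a user's one-day stay-point trajectory, the identification is handled by cases according to the number of stay points the user has in the day, as shown in Fig. 4. Under this assumption and given the living habits of Shenzhen residents, the identification windows are set as follows: the night period (00:00-07:00) for the home location and the daytime period (09:00-17:00) for the work location. The procedure is:

Fig. 4   Identification of home, work and social activities from the stay-point trajectory

(1) For a user with only 1 stay point in the day, that stay point is taken as the home location.

(2) For a user with 2 stay points in the day, the durations of the two stay points are matched against the two identification windows, and the stay point with the larger share of a window is set as the home or work location, respectively.

(3) For a user with more than 2 stay points in the day, the durations of all stay points are matched against the two identification windows. A stay point whose duration falls within a window and covers at least 50% of the window length is a successful match and becomes a candidate home or work location. The candidate with the longest matched time is taken as the user's home or work location. If no stay point matches, the user is considered to have no identified home or work location.

(4) For all stay points in the trajectory, those at the home location are labeled as home activities, those at the work location as work activities, and all remaining stay points as social activities. This produces the complete spatio-temporal activity chain.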
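Rules (1)-(4) can be sketched as below. The paper leaves several details open, so the sketch rests on stated assumptions: the "number of stay points" is read as the number of distinct stay locations, overlap with a window is summed over all stays at the same location, and when the home and work locations coincide each stay is labeled by the window it overlaps more. All function names are hypothetical.

```python
from collections import defaultdict

HOME_WINDOW = (0, 7)    # night window used to find the home location
WORK_WINDOW = (9, 17)   # daytime window used to find the work location


def overlap_h(t_start, t_end, window):
    """Hours shared by the stay [t_start, t_end] and the window."""
    return max(0, min(t_end, window[1]) - max(t_start, window[0]))


def best_location(stays, window, min_share):
    """Location with the largest total overlap with `window`, or None when
    that overlap is zero or below `min_share` of the window length."""
    totals = defaultdict(int)
    for lon, lat, t_start, t_end in stays:
        totals[(lon, lat)] += overlap_h(t_start, t_end, window)
    loc, hours = max(totals.items(), key=lambda kv: kv[1])
    share = hours / (window[1] - window[0])
    return loc if hours > 0 and share >= min_share else None


def label_activities(stays):
    """stays: [(lon, lat, t_start, t_end), ...] for one user and one day.

    Returns [(label, lon, lat, t_start, t_end), ...].
    """
    if not stays:
        return []
    locations = {(s[0], s[1]) for s in stays}
    if len(locations) == 1:
        home, work = next(iter(locations)), None           # rule (1)
    else:
        # Rule (2): with two locations the larger share wins (no minimum).
        # Rule (3): with more, the match must cover at least 50% of the window.
        min_share = 0.0 if len(locations) == 2 else 0.5
        home = best_location(stays, HOME_WINDOW, min_share)
        work = best_location(stays, WORK_WINDOW, min_share)

    labelled = []
    for lon, lat, t_start, t_end in stays:
        loc = (lon, lat)
        if loc == home and loc == work:
            # Assumption: same home and work place, so use the dominant window.
            more_work = (overlap_h(t_start, t_end, WORK_WINDOW)
                         > overlap_h(t_start, t_end, HOME_WINDOW))
            label = "work" if more_work else "home"
        elif loc == home:
            label = "home"
        elif loc == work:
            label = "work"
        else:
            label = "social"                                # rule (4)
        labelled.append((label, lon, lat, t_start, t_end))
    return labelled
```

The returned list plays the role of the spatio-temporal activity chain: one labeled entry per stay, in time order.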

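4 Results and Analysis

4.1 Stay locations and stay activities

The stay-trajectory extraction and activity identification methods described above were applied to the location data of about 10 million users, and the resulting stay locations and activities of all users were analyzed statistically (Fig. 5). The number of stay locations per user per day is very limited: 97.7% of users have no more than 4. Users with 2 stay locations form the largest group (36.8%), followed by users with 1 and 3 (28.3% and 24%, respectively). The number of stays also varies by time of day (Fig. 6). There are clearly more stays at night than during the day, with three distinct troughs at 07:00-09:00, 11:00-14:00 and 17:00-19:00, corresponding to the morning travel peak, the midday travel period and the evening travel peak.

Fig. 5   Statistics of daily stay locations and activities

Fig. 6   Temporal distribution of the number of stay points over a day

Counting the activities each user engages in during the day shows that the number of activities is likewise limited: about 93% of users engage in no more than 5 activities per day. Users with 3 activities form the largest share (26%), followed by those with 4 and 1 (20.7% and 18.6%, respectively).

Overall, the distributions of stay locations and stay activities differ clearly. The average number of stay locations per person per day is about 2.1, while the average number of activities is about 3.4, so on average a person performs more than one activity at a stay location.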

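4.2 Distribution of home and work activities

Activity identification extracted about 34.36 million individual activities across the city in one day, including about 15.79 million home activities and about 9.20 million work activities. The home location was identified for 98% of users (about 10.59 million) and the work location for 94% of users (about 10.15 million). Of all users, 71% have different home and work locations and 21% have the same home and work location; for 6% only the home location was identified and for 2% only the work location. Only 0.3% of users have neither location identified (Fig. 7). It follows that each person performs on average 1.49 home activities per day at the home location, which indicates that some people travel back and forth between home and other locations during the day and so stay at home more than once.

Fig. 7   Percentage of population by five types of home and work location detection results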
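As a quick consistency check, the figure of 1.49 home activities per person follows directly from the counts reported in this subsection. The social-activity count on the second line is not stated in the paper; it is derived here on the assumption that the three labels partition all activities, as rule (4) of Section 3.3 implies.

```latex
\begin{align*}
\text{home activities per user with a home location}
  &= \frac{15.79 \times 10^{6}}{10.59 \times 10^{6}} \approx 1.49 \\
\text{social activities}
  &= (34.36 - 15.79 - 9.20) \times 10^{6} = 9.37 \times 10^{6}
\end{align*}
```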

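To assess the accuracy of the home locations identified by this method, the distribution of population by identified home location was compared with the 2010 Shenzhen census data at the street (subdistrict) level (Fig. 8). The correlation coefficient between the two distributions is 0.92, which is statistically significant at the 95% confidence level. The home locations identified by this method are therefore in good overall agreement with the census.

Fig. 8   Correlation between the street-level population distribution based on home locations and the population distribution from the 2010 census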
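A comparison of this kind can be sketched as follows. The paper does not name the coefficient it uses; the sketch assumes Pearson's r computed on per-street population shares, and the function names are hypothetical.

```python
import math


def to_shares(counts):
    """Convert per-street counts into proportions of the city total."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

In use, `xs` would hold the share of identified home locations in each street and `ys` the share of census population in the same streets, listed in the same order.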

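4.3 Spatio-temporal characteristics of different activities

Home, work and social activities were aggregated over space and time to analyze how the activity types differ in intensity, spatial distribution and variation within the day. Overall, the intensity of aggregated activity changes over the 24 h of the day in a clear pattern (Fig. 9) that broadly matches people's daily routine, and the curves of home and work activity show complementary "concave" and "convex" shapes. From 00:00 to 07:00 people are asleep, and home activity makes up a very high share of total activity, about 87% on average. From 07:00 people gradually leave home for other daily activities and most begin to travel. Because travel does not count as a stay, total activity intensity starts to fall and forms three distinct troughs at the travel peaks (07:00-09:00, 11:00-14:00 and 17:00-19:00). Also from 07:00, work activity increases while home activity decreases; the two curves first cross at 10:00, after which work activities outnumber home activities. At 18:00 most people leave work and return home, producing a second crossing, after which home activities again outnumber work activities and the gap widens. Between 10:00 and 18:00, on average about 35% of people are engaged in home activities and 49% in work activities. Compared with home and work activity, social activity fluctuates slowly during the day and changes less noticeably. It rises slowly from 07:00 and peaks at around 22:00, which suggests that the range of activity types (shopping, dining, leisure and entertainment) widens and activity becomes more frequent.

Fig. 9   Temporal distribution of the volume of total, work, social and home activity over a day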
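Hourly activity volumes of the kind plotted in Fig. 9 could be tallied as sketched below. Treating a stay as in progress during every integer hour from its start to its end, inclusive, is an assumption of this sketch, and `hourly_activity_volume` is a hypothetical name.

```python
from collections import Counter


def hourly_activity_volume(labelled_stays):
    """labelled_stays: iterable of (label, t_start, t_end) with integer hours.

    Returns {hour: Counter({label: count})} for hours 0-23.
    """
    volume = {h: Counter() for h in range(24)}
    for label, t_start, t_end in labelled_stays:
        # Assumption: the stay is counted in each hour it spans, inclusive.
        for h in range(max(t_start, 0), min(t_end, 23) + 1):
            volume[h][label] += 1
    return volume
```

Summing the counters of all users gives the total curve, and hours spent travelling appear as gaps because they belong to no stay.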

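Fig. 10 shows the spatial distribution of overall activity. Activity density differs considerably among districts: it is markedly higher in the former Guannei area than in the former Guanwai area, giving a "strong south, weak north" pattern. Futian has the highest density, followed by Luohu. Futian is Shenzhen's central business district, and Luohu is the earliest-developed urban district and a major population center; a large concentration of population inevitably leads to concentrated and varied activities. Longgang and Bao'an host many large factories and a large workforce, so their activity intensity is also relatively high.

Fig. 10   Spatial distribution of overall activity density

Fig. 11 shows the spatial distribution of density for the three activity types. Home, work and social activities all follow a "spatial power law" overall: a large share of activity is concentrated in a few areas while activity in most areas is weak, so spatial differentiation is pronounced. This results from the history of the city's development and is also related to the distribution of resident population, the industrial layout and transport planning in different parts of the city.

Fig. 11   Spatial distributions of the densities of home, work and social activities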

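Fig. 12 shows the complementary cumulative distribution of the spatial intensity of activity density. Because density values differ greatly among activity types, each type is normalized by dividing by its mean density. The distributions for the different activity types are very similar; on log-log axes they appear as truncated power laws, which indicates that activity density follows a "spatial power law" in space. The tail of the social activity distribution falls fastest, followed by home activity and then work activity. This indicates that the "spatial power law" is most pronounced and spatial differentiation strongest for social activity, while work activity is more spatially concentrated than the other two types.

Fig. 12   Complementary CDF of ranks of activity densities, normalized by the mean, in different categories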
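The mean-normalized complementary cumulative distribution can be sketched as below. This is a generic empirical CCDF, P(X >= x), over the density values of one activity type; the paper does not give its exact estimator, so the tie handling and the name `normalized_ccdf` are assumptions.

```python
def normalized_ccdf(densities):
    """Return [(x, P(X >= x)), ...] for densities divided by their mean.

    One entry is produced per distinct normalized value, in ascending order.
    """
    if not densities:
        raise ValueError("densities must not be empty")
    mean = sum(densities) / len(densities)
    if mean <= 0:
        raise ValueError("mean density must be positive")
    xs = sorted(d / mean for d in densities)
    n = len(xs)
    ccdf = []
    for i, x in enumerate(xs):
        # At the first occurrence of x, exactly n - i values are >= x.
        if i == 0 or x != xs[i - 1]:
            ccdf.append((x, (n - i) / n))
    return ccdf
```

Plotting the resulting pairs on log-log axes for each activity type would give curves comparable in form to Fig. 12, where a faster-falling tail indicates fewer very high-density areas relative to the mean.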

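Figs. 13 and 14 show the total activity volume and the activity density of each district. Bao'an has the largest administrative area and a large population, so its activity volume is the highest among the districts. Futian is small and densely populated, so it has the highest density of home, work and social activities. Yantian is Shenzhen's coastal tourism and eastern port area; it has few permanent residents, visits there are mostly short stays and mobility is high, so it has the lowest activity volume and activity density.

Fig. 13   Activity volumes in different administrative districts

Fig. 14   Activity densities in different administrative districts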

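5 Conclusions

Based on base-station-level positioning data from about 10 million mobile phone users on one workday, this paper identified users' stay locations and stay activities and analyzed the spatio-temporal distribution patterns of aggregated activities. Stay trajectories were first obtained by applying spatio-temporal constraints, and each user's home, work and social activities during the day were then identified. The method identified the home location of 98% of users and the work location of 94%. Finally, the statistics of users' stay locations and stay activities were analyzed to examine how the two distributions differ.

The experiment on Shenzhen mobile phone positioning data shows that most users make only a few stays and engage in only a few activities in a day, and that the distributions of stay locations and activities differ: a person has on average 2.1 stays per day but engages in 3.4 activities, i.e., on average more than one activity per stay location. Spatio-temporal statistics of the different activity types show that activities fluctuate over the day in line with daily routines, that they differ spatially among districts because of development history and functional and industrial layout, and that the overall spatial distribution follows a "spatial power law". These findings reveal the diversity and variation of different types of aggregated activities and their spatio-temporal distribution in urban space. Compared with traditional methods such as resident surveys, the approach offers large samples, wide coverage and low cost, and is useful for analyzing the travel and activity behavior of Shenzhen residents and for supporting urban traffic optimization and urban planning.

Because base station locations have low spatial resolution, mobile phone positioning data can lead to errors in stay-trajectory identification. In addition, short-term samples cover few repeated cycles, which introduces some error into activity identification based on spatio-temporal constraints. Future work will combine other sources of trajectory data, resident travel survey data and land use data to further validate the accuracy of activity identification, and will study the spatio-temporal distribution of activities of different groups and their interaction with urban functional structure.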

The authors have declared that no competing interests exist.


References

[1] Batty M. The size, scale, and shape of cities[J]. Science, 2008,319(5864):769-771. https://doi.org/10.1126/science.1151419
[2] Kitchin R. The real-time city? Big data and smart urbanism[J]. GeoJournal, 2013,79(1):1-14.
[3] Jiang S. Deciphering human activities in complex urban systems: Mining big data for sustainable urban future[D]. Cambridge MA: Massachusetts Institute of Technology, 2015.
[4] Noulas A, Scellato S, Lambiotte R, et al. A tale of many cities: Universal patterns in human urban mobility[J]. PLoS ONE, 2012,7(5):e37027.
[5] Zhong C, Arisona S M, Huang X, et al. Detecting the dynamics of urban structure through spatial network analysis[J]. International Journal of Geographical Information Science, 2014,28(11):2178-2199. https://doi.org/10.1080/13658816.2014.914521
[6] Fang Z, Shaw S L, Tu W, et al. Spatiotemporal analysis of critical transportation links based on time geographic concepts: A case study of critical bridges in Wuhan, China[J]. Journal of Transport Geography, 2012,23:44-59. https://doi.org/10.1016/j.jtrangeo.2012.03.018
[7] Chen J, Shaw S L, Yu H, et al. Exploratory data analysis of activity diary data: A space-time GIS approach[J]. Journal of Transport Geography, 2011,19(3):394-404. https://doi.org/10.1016/j.jtrangeo.2010.11.002
[8] Yue Y, Lan T, Yeh A G O, et al. Zooming into individuals to understand the collective: A review of trajectory-based travel behaviour studies[J]. Travel Behaviour and Society, 2014,1(2):69-78. https://doi.org/10.1016/j.tbs.2013.12.002
[9] Li Q Q, Li D R. Big data GIS[J]. Geomatics and Information Science of Wuhan University, 2014,39(6):641-644. (in Chinese)
[10] Calabrese F, Diao M, Di Lorenzo G, et al. Understanding individual mobility patterns from urban sensing data: A mobile phone trace example[J]. Transportation Research Part C: Emerging Technologies, 2013,26:301-313. https://doi.org/10.1016/j.trc.2012.09.009
[11] Cao R, Tu W, Cao J, et al. Comparison of urban human movements inferring from multi-source spatial-temporal data[J]. ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2016,XLI-B2:471-476. https://doi.org/10.5194/isprsarchives-XLI-B2-471-2016
[12] Zhou T, Han X P, Yan X Y, et al. Statistical mechanics on temporal and spatial activities of human[J]. Journal of University of Electronic Science and Technology of China, 2013,42(4):481-540. https://doi.org/10.3969/j.issn.1001-0548.2013.04.001 (in Chinese)
[13] Liu F, Janssens D, Cui J, et al. Characterizing activity sequences using profile Hidden Markov Models[J]. Expert Systems with Applications, 2015,42(13):5705-5722. https://doi.org/10.1016/j.eswa.2015.02.057
[14] Jiang S, Ferreira J J, Gonzalez M C. Discovering urban spatial-temporal structure from human activity patterns[A]. Proceedings of the ACM SIGKDD International Workshop on Urban Computing[C]. New York, NY, USA: ACM, 2012:95-102.
[15] Isaacman S, Becker R, Cáceres R, et al. Identifying important places in people's lives from cellular network data[A]. Pervasive Computing[M]. Springer Berlin Heidelberg, 2011:133-151.
[16] Demissie M G, Correia G, Bento C. Analysis of the pattern and intensity of urban activities through aggregate cellphone usage[J]. Transportmetrica A: Transport Science, 2015,11(6):502-524. https://doi.org/10.1080/23249935.2015.1019591
[17] Noulas A, Mascolo C, Frías-Martínez E. Exploiting Foursquare and cellular data to infer user activity in urban environments[A]. MDM (1)'13[C]. 2013:167-176.
[18] Soto V, Frías-Martínez E. Automated land use identification using cell-phone records[A]. Proceedings of the 3rd ACM International Workshop on MobiArch[C]. New York, NY, USA: ACM, 2011:17-22.
[19] Hasan S, Ukkusuri S V. Urban activity pattern classification using topic models from online geo-location data[J]. Transportation Research Part C: Emerging Technologies, 2014,44:363-381. https://doi.org/10.1016/j.trc.2014.04.003
[20] Shenzhen Municipal People's Government Gazette, 2016, No. 39 (No. 983 overall)[R]. (in Chinese)
[21] Communiqué on major figures of the 2015 national 1% population sample survey in Shenzhen[EB/OL]. (in Chinese)
[22] Xu J L, Fang Z X, Shaw S L, et al. The spatio-temporal heterogeneity analysis of massive urban mobile phone users' stay behavior: A case study of Shenzhen city[J]. Journal of Geo-Information Science, 2015,17(2):197-205. https://doi.org/10.3724/SP.J.1047.2015.00197 (in Chinese)
[23] Sevtsuk A, Ratti C. Does urban mobility have a daily routine? Learning from the aggregate data of mobile networks[J]. Journal of Urban Technology, 2010,17(1):41-60. https://doi.org/10.1080/10630731003597322