    • MENG Jihua, LIN Zhenxin, GAO Xinyu, HE Rongpeng, ZUO Liju

      [Significance] As a critical pathway to achieving Sustainable Development Goal (SDG) 2, “Zero Hunger,” and ensuring long-term ecological sustainability, the concept and practice of sustainable agriculture are undergoing a paradigm shift toward data-driven and system-oriented approaches. In recent years, Big Earth Data—comprising remote sensing, geospatial, meteorological, and agricultural Internet of Things (IoT) data—has emerged as a foundational driver for agricultural monitoring, decision support, and technological innovation in sustainable development. [Analysis] Given the interdisciplinary, multi-stakeholder, cross-regional, and evolving-goal nature of sustainable agriculture, this study begins by systematically reviewing the conceptual evolution of the term. It highlights the multidimensional implications and diverse practical pathways of sustainable agriculture, noting its growing role as a core component of global development strategies. On this basis, the paper proposes a new, data-oriented and operational interpretation of sustainable agriculture. The study then establishes an analytical framework—“data-technology”—to clarify the pivotal role of Big Earth Data in supporting sustainable agriculture. It examines the evolution of core datasets and key technical methods across three periods: before 2015, 2015-2019, and 2020 to the present. The applications reviewed include agricultural resource monitoring, multi-scale crop condition assessment, and evaluations of agriculture's environmental impacts. The findings suggest that sustainable agriculture, enabled by Big Earth Data, is rapidly shifting from a paradigm of “observational analysis” to one of “intelligent decision-making.” Furthermore, the study conducts a comparative assessment of China, the United States, and the European Union across four critical dimensions: data infrastructure; technological advancement and application; scientific research capacity; and policy support. While China has made significant progress in all four areas—with strengths in remote sensing capabilities, rapid technological rollout and demonstration, substantial research output, and clearly defined policy directives—it continues to face challenges in data ecosystem development, original algorithm innovation, commercialization of scientific outputs, and the alignment of standards and incentive mechanisms. Finally, in light of the current needs of sustainable agricultural development, this study systematically analyzes the major challenges facing Big Earth Data from four aspects: data acquisition capacity, intelligent processing methods, application promotion and services, and data governance and ethical security. In response, it proposes multi-level strategies covering standardization, model optimization, improvements to service systems, and protection of data rights, with the aim of providing a reference pathway for the efficient utilization and sustainable development of agricultural big data in the future. [Prospect] The article aims to analyze the data-driven transformation pathway of sustainable agriculture and provide a systematic reference for its green, inclusive, and intelligent development.

    • YIN Shuoshuo, YAN Haowen, LI Jingzhong, WANG Zhuo, WANG Xiaolong, LU Xiaomin, MA Ben, YANG Qili

      [Objectives] We-map is a user-oriented map product characterized by short production cycles, low technical barriers, and rapid dissemination. However, its development and utilization must adhere to relevant standards. Map review is a critical process for ensuring compliance of publicly released maps, effectively preventing the dissemination of non-compliant products and safeguarding national dignity, security, and development interests. The open, user-annotated nature of We-map platforms frequently leads to blind or improper annotations, increasing the risk of non-compliant content due to the lack of systematic constraints on spatially assigned annotations. Currently, review of such annotations relies predominantly on manual interpretation, which is inefficient given the large volume of data. There is an urgent need for automated identification of sensitive information in We-map text to protect national security and territorial integrity. [Methods] This study develops an automated sensitive-information detection model for We-map text by fine-tuning the BERT (Bidirectional Encoder Representations from Transformers) architecture. We iteratively optimized model parameters and training strategies to enhance performance. The methodology involved data acquisition from standard map services, Baidu Images, and user-generated We-maps, followed by systematic text extraction. In accordance with national regulations, texts were categorized into four types: illegal online content, confidential information, stability-related content, and discriminatory content. Data augmentation techniques, including synonym replacement and text perturbation, were applied to increase dataset diversity and robustness. BERT's bidirectional encoder and multi-head self-attention mechanism capture deep semantic relationships, while transfer learning facilitates adaptation of the model to the complex contexts of We-map text. The model inputs comprise token, segment, and position embeddings, which are processed by Transformer encoders for accurate classification. [Results] Experiments on the We-map text sensitivity dataset demonstrate that the proposed model achieves an F1 score of 0.9259. Compared with mainstream text classification models, it surpasses TextCNN by 6.35%, DistilBERT by 1.04%, and DeBERTa by 5.68% in F1 score. Validation on 40 user-drawn We-maps further confirms the model's ability to accurately detect sensitive annotations across all categories, demonstrating strong adaptability. [Conclusions] The proposed method exhibits improved classification accuracy and robustness, effectively addressing the existing technical gap in automated sensitive-information screening for We-map text. It significantly improves review efficiency, ensures regulatory compliance, and alleviates the burden of manual review, thereby providing a reliable solution for We-map content security.
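
      As a rough illustration of the fine-tuning setup described above, the sketch below wires a pretrained BERT encoder to a four-way sequence classifier using the Hugging Face transformers API. The checkpoint name, label order, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch: four-class sensitive-text classification by fine-tuning BERT.
# Checkpoint, label ordering, and hyperparameters are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = ["illegal", "confidential", "stability", "discriminatory"]  # the abstract's four categories

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese",
                                                      num_labels=len(LABELS))

texts = ["example We-map annotation text"]     # annotation strings extracted from maps
labels = torch.tensor([0])                     # gold class indices

# Token, segment, and position embeddings are built internally from this encoding.
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
optimizer.zero_grad()
out = model(**enc, labels=labels)              # cross-entropy loss over the four classes
out.loss.backward()
optimizer.step()
```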

    • SANG Zehao, LU Jun, GUO Haitao, DING Lei, ZHU Kun, XU Guojun, WEI Haoqi

      [Objectives] Cross-View Object-Level Geolocalization (CVOGL) aims to accurately determine the geographic position of targets observed in ground street-view or UAV (Unmanned Aerial Vehicle) imagery within satellite imagery. Most existing methods focus on image-level matching, achieving cross-view correlation through global processing of entire images. However, they lack effective position encoding for specific targets, which prevents models from directing attention to the object of interest. Furthermore, due to variations in the coverage range of reference images, the pixel proportion of the query target in corresponding satellite images is extremely small, making precise localization highly challenging. [Methods] To address these issues, this paper proposes GHGeo, a Cross-View Object-Level Geolocalization method based on a Gaussian Kernel Function and Heterogeneous Spatial Contrastive Loss, designed for precise localization of targets of interest. First, the method employs a Gaussian kernel function to perform fine-grained position encoding on the query target, enabling refined modeling of the target's center point and distribution characteristics. Next, a dynamic attention-refined fusion module is introduced to dynamically weight the spatial similarity between cross-perceptual global context and local geometric features, predicting the target's exact position in satellite imagery through probability density. Finally, a heterogeneous spatial contrastive loss function is applied to guide the training process and mitigate cross-view feature discrepancies. [Results] Experiments on the CVOGL dataset demonstrate that, in the Drone→Satellite task, GHGeo achieves localization accuracies of 67.73% and 63.00% at Intersection over Union (IoU) thresholds of ≥25% and ≥50%, respectively, outperforming the baseline method DetGeo by 5.76% and 5.34%. In the Street-view→Satellite task, GHGeo achieves accuracies of 48.41% and 45.43%, representing improvements of 2.98% and 3.19% over DetGeo. Additionally, when compared with methods such as TransGeo, SAFA, and VAGeo, GHGeo consistently demonstrates higher localization accuracy on the CVOGL dataset. [Conclusions] The proposed method significantly improves the accuracy of cross-view object-level geolocalization, providing critical technical support and precise location information for applications such as urban planning, environmental monitoring, and emergency rescue and dispatch.
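
      A minimal sketch of the Gaussian-kernel position-encoding idea: the query target's center is turned into a 2D heatmap whose values decay with distance from that point, giving the network a soft, fine-grained location prior. The image size and bandwidth sigma are illustrative assumptions, not the paper's settings.

```python
# Sketch: encode a query target's center as a 2D Gaussian heatmap.
import numpy as np

def gaussian_position_encoding(h, w, cx, cy, sigma=8.0):
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))    # peaks at the center, decays with distance

heatmap = gaussian_position_encoding(256, 256, cx=120, cy=90)
print(heatmap[90, 120], heatmap[0, 0])         # 1.0 at the encoded center, ~0 far away
```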

    • LIANG Ce, WANG Zhonghui

      [Objectives] Map data matching is an important technical foundation for updating geographical databases. Buildings are among the main components of maps, and building matching plays a crucial role in multi-source data fusion and database updating. Currently, areal building matching methods are mainly divided into traditional approaches and machine learning approaches. This paper proposes a multi-scale areal building matching method based on spectral domain graph convolution, designed to address the limitations of traditional methods that require manual determination of matching thresholds and factor weights, as well as the insufficient feature extraction capability of existing machine learning methods. [Methods] First, a graph structure integrating both local and global building features is constructed as the model input to train the proposed matching model, GAE-MLP. Then, following a separation strategy, the matching process is divided into initial and secondary stages. The GAE-MLP model is applied in each stage to achieve 1:1, 1:N, M:1, and M:N types of building matching. [Results] Building data at scales of 1:2 000 and 1:10 000 from Exeter, UK, are used in a comparative experiment. Results show that the proposed method, by encoding buildings through the fusion of multi-level graph features, significantly improves the model's feature extraction capability. In terms of matching accuracy, compared with the traditional geometric method, the BP neural network, and the CatBoost ensemble learning method, the F1-score of the proposed method increases by 2.52%, 3.53%, and 0.76%, respectively. Furthermore, the proposed method overcomes the limitation of traditional methods' dependence on manually defined thresholds and factor weights. Compared with both traditional and machine learning approaches, it more effectively handles challenges such as positional deviations and shape homogenization, enabling accurate matching of complex relationships such as 1:N and M:N. [Conclusions] The proposed method successfully addresses the problems of traditional approaches and achieves accurate matching of multi-scale areal buildings.
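
      Spectral-domain graph convolution is commonly implemented via the normalized propagation rule H' = σ(D^(-1/2)(A+I)D^(-1/2)HW). The sketch below shows one such layer over a toy building graph, under the assumption that the model reduces to this standard form; feature dimensions are illustrative.

```python
# Sketch of one spectral-domain graph convolution layer (Kipf & Welling form).
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])                          # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))  # symmetric normalization
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)  # ReLU

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # adjacency of 3 buildings
H = np.random.rand(3, 8)                                # per-building local + global features
W = np.random.rand(8, 4)                                # learnable weights
print(gcn_layer(A, H, W).shape)                         # (3, 4)
```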

    • XIAO Jinfeng, HAN Ling

      [Objectives] Predicting land use change provides a rational basis for decision-making in areas such as resource allocation, environmental protection, urban development, disaster prevention, and risk assessment. It plays a crucial role in supporting the formulation of scientific and sustainable policies and in balancing the relationship between human activities and ecological security. However, existing Cellular Automata (CA) models, although widely applied in land use simulation, face limitations in capturing the complex spatiotemporal characteristics of multi-temporal land use changes. Their limited ability to represent nonlinear dynamics and long-term dependencies restricts their applicability and effectiveness, particularly under heterogeneous environmental and socioeconomic conditions. [Methods] To address these challenges, this study proposes a deep spatiotemporal modeling network named 3DCBLT, which integrates attention mechanisms and incorporates multi-scale driving factors. The proposed framework is coupled with a CA model to enhance spatiotemporal feature extraction and nonlinear modeling capability. Specifically, the 3DCBLT network employs a Convolutional Block Attention Module (CBAM) to emphasize critical spatial regions and a 3D Convolutional Neural Network (3DCNN) to extract deep spatiotemporal patterns across sequential data. In addition, a Long Short-Term Memory (LSTM) module is integrated to fully capture temporal dependencies and long-term evolutionary trends in land use dynamics, enabling the estimation of development probabilities for different land use categories. Shaanxi Province was selected as the study area, with land use data collected at five-year intervals from 2000 to 2020. Twelve indicators representing climatic, topographic, and socioeconomic conditions were introduced as driving factors. The 2020 land use map was employed for validation. [Results] Experimental results show that the proposed model achieved a Kappa coefficient of 0.888, an Overall Accuracy (OA) of 0.925, and a Figure of Merit (FoM) of 0.336. These values are substantially higher than those of benchmark models such as ANN-CA, MLP-CA, and CA-Markov, confirming the superior predictive performance of the proposed method. [Conclusions] The 3DCBLT-CA model demonstrates high accuracy, strong robustness, and superior spatiotemporal modeling capacity in predicting complex land use change. This approach not only provides a feasible and effective technical pathway for simulating land use dynamics under complex scenarios but also offers valuable support for sustainable land planning, ecological conservation, and regional policy-making.
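
      The architecture description above suggests a pipeline of channel attention, 3D convolution, and an LSTM head. The sketch below chains simplified versions of these pieces to emit per-class development probabilities from a stack of driving-factor rasters; the wiring, layer sizes, and class count are assumptions for illustration, not the published 3DCBLT design.

```python
# Skeleton: channel attention (CBAM-style) + 3D-CNN + LSTM over a
# (batch, factors, time, height, width) stack. All sizes are illustrative.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):                  # channel half of CBAM
    def __init__(self, c, r=4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c))
    def forward(self, x):                           # x: (B, C, T, H, W)
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3, 4))))
        return x * w[:, :, None, None, None]

class LandUseNet(nn.Module):
    def __init__(self, factors=12, classes=6):      # 12 driving factors; class count assumed
        super().__init__()
        self.conv3d = nn.Conv3d(factors, 32, kernel_size=3, padding=1)
        self.att = ChannelAttention(32)
        self.lstm = nn.LSTM(32, 64, batch_first=True)
        self.head = nn.Linear(64, classes)
    def forward(self, x):                           # x: (B, factors, T, H, W)
        f = self.att(torch.relu(self.conv3d(x)))
        f = f.mean(dim=(3, 4)).transpose(1, 2)      # pool space -> (B, T, 32)
        _, (h, _) = self.lstm(f)                    # last hidden state summarizes the sequence
        return torch.softmax(self.head(h[-1]), -1)  # development probability per class

print(LandUseNet()(torch.rand(2, 12, 4, 16, 16)).shape)   # (2, 6)
```

      In a coupled setup, per-class probabilities of this kind would serve as the transition suitability surface consumed by the CA model.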

    • ZHU Jie, ZHU Mengyao, LIU Lei, WANG Shu, SUN Yizhong, WANG Zi'an

      [Objectives] The spatial growth of cities under different geographic boundaries exhibits significant differences. Reasonable geographic boundaries need to take into account the continuity of spatial characteristics and the similarity of attribute features. However, existing research lacks systematic evaluation of the spatial heterogeneity effects of these two types of constraints in urban growth simulation, thereby overlooking differences in growth characteristics across subregions. [Methods] In this paper, we construct four typical zoning schemes: ① administrative zoning (Z1), dominated by spatial continuity; ② “center-edge” structural zoning (Z2), dominated by attribute similarity; ③ dual K-means clustering zoning (Z3); and ④ DSC clustering zoning (Z4), the latter two of which incorporate dual spatial and attribute constraints. [Results] Using land-use status data for Jiangyin City (2012-2017) as an example, we apply a vector Cellular Automata (CA) model to explore the differential impacts of zoning boundaries on urban spatial growth simulation. The results reveal: ① While all schemes achieve comparable overall FoM accuracy, subregional precision varies significantly (Z2: SD=0.007, minimal variation; Z4: SD=0.066, maximal fluctuation). ② Z1 and Z2 effectively guide land use to follow specific spatial distribution patterns, forming more regular and orderly urban land-use structures, as reflected in the similarity of FN, AI, LSI, and SHDI landscape indices. In contrast, Z3 and Z4 emphasize geographic differentiation and the consistency of land-use change drivers, producing more irregular and scattered patterns. ③ Z1 and Z3 demonstrate strong consistency in capturing overall urban expansion modes but exhibit weaker sensitivity to localized patterns, whereas Z2 and Z4 simulate more edge-expansion parcels and demonstrate heightened sensitivity to local variations. Although all schemes identify edge-expansion parcels, the driving factors vary in influence. ④ When considering only the geographic heterogeneity of land-use change drivers, Z4 most effectively captures spatial variability and imbalance across subregions due to its simultaneous integration of spatial location and attribute similarity. [Conclusions] This research provides scientific evidence to support the transition of urban development paradigms from “incremental expansion” to “stock renewal”.
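
      One hypothetical way to realize a zoning scheme under dual spatial and attribute constraints, in the spirit of Z3, is to cluster parcels on a weighted concatenation of standardized coordinates and attributes. The weight alpha, feature choices, and cluster count below are illustrative assumptions, not the paper's procedure.

```python
# Sketch: K-means zoning on jointly standardized location + attribute features.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
xy = rng.random((200, 2)) * 1000      # parcel centroids (spatial continuity term)
attrs = rng.random((200, 3))          # hypothetical attributes, e.g. density, land price, slope

alpha = 0.5                           # 0 -> attributes only, 1 -> location only
X = np.hstack([alpha * StandardScaler().fit_transform(xy),
               (1 - alpha) * StandardScaler().fit_transform(attrs)])
zones = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(np.bincount(zones))             # parcels per subregion
```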

    • LI Yunqiang, CHEN Yuehong, ZHANG Xiaoxiang, MA Qiang, ZHANG Kejian, FANG Xiuqin, REN Liliang

      [Objectives] Accurate risk assessment is critical for effective flash flood prevention and mitigation. However, existing flash flood risk assessment methods often ignore geographical similarity between different flood-prone areas. This omission can lead to biased negative samples, fragmentation of spatial correlations among similar units, and ultimately reduced accuracy and reliability of results. [Methods] To address this limitation, this study proposes a novel flash flood risk assessment approach that incorporates geographical similarity. First, catchments are adopted as the basic analysis units, and a comprehensive factor system is constructed based on a risk framework comprising meteorological, underlying, and exposure factors. Second, a geographical similarity-constrained sample generation method is proposed to optimize the selection of negative samples (non-flash-flood events) based on the similarity of underlying factors among catchments. Finally, a weighted directed graph is constructed based on the geographical similarity among catchments, and a graph neural network with embedded geographical similarity is developed to achieve accurate risk assessment. [Results] The proposed method is applied to the Hengduan Mountains, where 884 positive samples and 884 negative samples are generated. Results are compared with those from existing flash flood sample generation methods and traditional machine learning-based risk assessment approaches. Findings show that: (1) Compared with the existing random generation and environmentally balanced sample generation methods, the proposed method improves overall accuracy by 24.76% and 22.28%, respectively. (2) Compared with three machine learning-based risk assessment methods (Random Forest, XGBoost, and LightGBM), the proposed geographical similarity-embedded graph neural network improves overall accuracy by 2.84%, 3.41%, and 2.17%, respectively. (3) The risk assessment results from all four models indicate that high-risk flash flood zones in the Hengduan Mountains are primarily distributed in the southeastern region, the Sanjiang River valley in the southwest, and along rivers in the northern area. [Conclusions] By incorporating geographical similarity into both sample generation and model construction, the proposed method significantly enhances the accuracy of flash flood risk assessment and provides valuable support for scientific decision-making in flash flood prevention and management.
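
      The sketch below illustrates the similarity-constrained negative sampling idea under a simple assumption: non-flood catchments whose underlying-factor vectors are least similar to any flooded catchment are preferred as reliable negatives. The similarity measure and selection rule are illustrative stand-ins, not the paper's exact procedure.

```python
# Sketch: pick negative samples from catchments least similar to flooded ones.
import numpy as np

rng = np.random.default_rng(1)
factors = rng.random((1000, 6))                  # underlying factors per catchment
pos_idx = rng.choice(1000, 80, replace=False)    # catchments with recorded flash floods

def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# for each catchment, similarity to the most similar flooded catchment
sim_to_pos = cosine_sim(factors, factors[pos_idx]).max(axis=1)
candidates = np.setdiff1d(np.arange(1000), pos_idx)
# prefer candidates least similar to any flooded catchment as negatives
neg_idx = candidates[np.argsort(sim_to_pos[candidates])[:len(pos_idx)]]
print(len(neg_idx), "negative samples selected")
```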

    • YUAN Zehao, LUO Yubo, ZHANG Yu, FU Chenxi, CHEN Biyu

      [Objectives] Activity space refers to the subset of all locations with which an individual has direct contact through regular activities. Accurately characterizing individual activity spaces is essential for studying urban public resource allocation, transportation planning, and equity evaluation. However, existing activity space algorithms often fail to capture human mobility patterns, particularly the tendency of individuals to move around anchor points, and lack the efficiency and precision required for modeling large-scale trajectory datasets. [Methods] To address these limitations, this study proposes an Anchor Point-Based Activity Space (ANC) construction algorithm. ANC first identifies stay points from individual trajectories, then extracts anchor points based on stay duration and visit frequency. Next, it captures mobility patterns between anchor points. ANC then constructs both intra-anchor-point activity spaces and inter-anchor-point movement spaces, which are integrated to form the final activity space. To validate the algorithm, this study introduces two evaluation metrics in addition to area size: activity space coverage and area efficiency. ANC is compared with two baseline methods, Standard Distance Circle (SDC) and Standard Deviational Ellipse (SDE). Furthermore, individual accessibility is calculated based on the constructed activity space to demonstrate ANC's practical applicability in urban spatial analysis. [Results] Using mobile signaling data from Shanghai as a case study, the results indicate that: (1) The ANC algorithm outperforms SDC and SDE in terms of activity space coverage and improves area efficiency by over 50%, effectively reducing over-enclosure of unvisited areas. (2) Activity spaces in Shanghai are smaller in central urban areas and larger in suburban regions. The areas estimated by ANC are significantly smaller than those estimated by SDC and SDE, indicating that traditional methods overestimate actual activity spaces. (3) Individual accessibility calculated using ANC is only 12%-40% of that calculated by SDC and SDE, suggesting that existing methods significantly overestimate accessibility. [Conclusions] The proposed ANC algorithm demonstrates superior accuracy and adaptability in modeling individual activity spaces. It is well-suited for large-scale trajectory data analysis and offers valuable support for urban equity assessment and planning.
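
      As an illustration of the anchor-point extraction step, the sketch below aggregates stay points by location and keeps places that pass duration and frequency thresholds. The gridding scheme and threshold values are hypothetical, not the paper's parameters.

```python
# Sketch: extract anchor points from stay points by duration and frequency.
from collections import defaultdict

# stay points as (grid_x, grid_y, duration_hours)
stays = [(10, 5, 9.0), (10, 5, 8.5), (3, 7, 1.0), (3, 7, 0.5), (10, 5, 7.0)]

agg = defaultdict(lambda: [0.0, 0])
for gx, gy, dur in stays:
    agg[(gx, gy)][0] += dur          # cumulative stay duration at this place
    agg[(gx, gy)][1] += 1            # visit frequency

MIN_HOURS, MIN_VISITS = 10.0, 2      # hypothetical thresholds
anchors = [cell for cell, (hours, visits) in agg.items()
           if hours >= MIN_HOURS and visits >= MIN_VISITS]
print(anchors)                       # [(10, 5)] -- e.g. a home or workplace anchor
```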

    • TAO Zexing, WANG Wenqi, GE Quansheng, LI Jicheng, WANG Yuan

      [Objectives] Path planning is a critical decision-making component in vehicle navigation. Traditional algorithms, such as A* and Dijkstra, often fail to account for environmental factors like terrain undulation and soft soil conditions, making it difficult to dynamically evaluate vehicle force conditions and traversal efficiency under varying ground surfaces. To address these challenges, this study proposes an off-road path planning method that integrates surface environmental data with vehicle force-dynamic characteristics. [Methods] The proposed approach first assesses terrain, soil, and other environmental factors to identify traversable regions. It then incorporates vehicle geometry and dynamic parameters to simulate force distribution and velocity attenuation under different surface conditions. Based on this, an improved A* algorithm is employed to achieve joint optimization of path feasibility and traversal efficiency. To validate the effectiveness of the method, simulation experiments were conducted in two typical areas of the Qinghai-Tibet Plateau. [Results] For a maximum straight-line distance of 91 km between the start and end points, the planned routes using the improved method completely avoided unfavorable areas such as soil subsidence zones, geological hazard regions, and slope-restricted terrains. Compared with the traditional A* algorithm, the proposed method achieved a substantial reduction in estimated travel time, with an average decrease of approximately 38.63%. [Conclusions] The proposed method quantitatively evaluates the combined effects of natural surface conditions on vehicle traffic capacity from the perspective of vehicle forces and provides a practical and effective tool for route planning in off-road environments.
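
      A minimal sketch of the underlying idea: a standard A* search whose edge cost is read from a per-cell terrain factor, with hazard cells marked impassable. The simple cost model stands in for the paper's force and velocity analysis and is an assumption for illustration.

```python
# Sketch: A* over a grid whose traversal cost reflects terrain conditions.
import heapq

def astar(cost, start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # admissible Manhattan heuristic
    rows, cols = len(cost), len(cost[0])
    open_set, g = [(h(start), 0.0, start)], {start: 0.0}
    while open_set:
        _, gc, cur = heapq.heappop(open_set)
        if cur == goal:
            return gc
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dx, cur[1] + dy)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and cost[nxt[0]][nxt[1]] < float("inf"):
                ng = gc + cost[nxt[0]][nxt[1]]                # cost of entering the cell
                if ng < g.get(nxt, float("inf")):
                    g[nxt] = ng
                    heapq.heappush(open_set, (ng + h(nxt), ng, nxt))
    return None

INF = float("inf")
terrain = [[1, 1, 5],        # 1 = firm ground, 5 = soft soil (slow), INF = hazard zone
           [1, INF, 5],
           [1, 1, 1]]
print(astar(terrain, (0, 0), (2, 2)))   # cheapest cumulative traversal cost
```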

    • LI Longwei, LIU Xiaodong, CHEN Hui, YANG Liping, ZHAO Like, ZHANG Ka

      [Objectives] In recent years, the application of airborne LiDAR point clouds has become increasingly widespread, serving as a vital data source for 3D spatial information. Semantic segmentation of this data is a crucial step toward achieving 3D scene understanding. However, some existing segmentation methods exhibit certain limitations. For instance, methods based on local aggregation, such as PointNet++ and KPConv, struggle to effectively capture long-range dependencies. Conversely, while introducing global attention mechanisms like the Transformer can expand the receptive field, their high computational cost makes them impractical for large-scale airborne LiDAR data. Furthermore, most methods have not fully leveraged the rich elevation information inherent in airborne LiDAR point clouds, a key feature for distinguishing between different object categories in urban and natural landscapes. [Methods] This paper proposes a novel point cloud segmentation network called the Projection Attention and Elevation Attention Network (PE-Net). First, to achieve efficient global dependency modeling, the Projection Attention module projects the keys and values of the traditional self-attention mechanism into a low-rank subspace. This approach captures long-range relationships with linear computational complexity, effectively overcoming the performance bottleneck of standard Transformers. Second, to fully exploit the prior knowledge of vertical structure in airborne data, the Elevation Attention module learns attention weights directly from the points' Z-coordinates and uses them to re-weight the deep features, thereby explicitly enhancing the model's sensitivity to terrain variations. Finally, the Local-Global Feature Enhancement module aggregates multi-scale contextual information through parallel max-pooling and average-pooling operations. This enables deep fusion of local geometric details and global semantics, further improving the expressive power for complex spatial structures. [Results] The proposed method was validated on several mainstream airborne LiDAR point cloud datasets, including ISPRS Vaihingen3D and GML, to demonstrate its effectiveness. Experimental results showed that PE-Net achieved an Overall Accuracy (OA) of 82.6% and an average F1-score of 72.1% on the ISPRS Vaihingen3D dataset, and an OA of 97.0% and an average F1-score of 72.8% on the GML dataset. It also produced strong segmentation results on the LASDU dataset. Notably, compared with the KPConv baseline, PE-Net improved the OA and average F1-score by 7.0% and 20.4%, respectively, on the GML dataset, underscoring the significant impact of the proposed modules. [Conclusions] Compared with existing mainstream methods, the proposed PE-Net achieved substantial improvements in overall accuracy and average F1-score, while also demonstrating excellent generalization performance. These results confirm the effectiveness and robustness of PE-Net in point cloud semantic segmentation tasks for complex 3D scenes.
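
      The linear-complexity trick described above is in the spirit of Linformer-style attention: project the N keys and values down to k ≪ N tokens so the score matrix is N×k rather than N×N. The sketch below shows this projection under assumed dimensions; it is not the published PE-Net module.

```python
# Sketch: low-rank (projected) attention with linear complexity in point count.
import torch
import torch.nn.functional as F

N, d, k = 4096, 64, 128                     # points, channel dim, projected length
x = torch.rand(1, N, d)
Wq, Wk, Wv = (torch.nn.Linear(d, d) for _ in range(3))
E = torch.nn.Linear(N, k, bias=False)       # low-rank projection along the point axis

q = Wq(x)                                       # (1, N, d)
kp = E(Wk(x).transpose(1, 2)).transpose(1, 2)   # (1, k, d) projected keys
vp = E(Wv(x).transpose(1, 2)).transpose(1, 2)   # (1, k, d) projected values

attn = F.softmax(q @ kp.transpose(1, 2) / d ** 0.5, dim=-1)   # (1, N, k) scores
out = attn @ vp                             # (1, N, d), O(N*k) rather than O(N^2)
print(out.shape)
```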

    • XUE Wu, LIU Xian, ZHAO Ling, WANG Peng, ZHANG Xufeng, LI Wenhao

      [Objectives] Agile optical satellites can acquire large-area images through multiple-strip imaging within the same orbit. However, frequent changes in satellite attitude during imaging, along with systematic errors across different strips, introduce geometric distortions between images. Therefore, it is necessary to study high-precision stitching technology for multi-strip images. [Methods] This research proposes a high-precision mosaicking process for agile satellite images. The process includes block adjustment, preliminary geometric correction assisted by open-source topographic data, precise geometric correction assisted by a Digital Orthophoto Map (DOM), and color consistency adjustment. To mitigate the influence of tall buildings in urban areas on multi-view image matching, we constrain the extraction range of tie points in dense urban building areas using vector data guidance. To further improve geometric correction accuracy, a spline function is used instead of a polynomial model. High-precision tie points are matched, and their color values are used as references to construct equations based on block adjustment, enabling consistent color gain across all images. To ensure robustness, different execution sequences of these steps were designed and tested. Comprehensive experiments were conducted to analyze the impact of image stitching sequences and geometric correction methods on final accuracy. [Results] To verify the proposed method, experiments were carried out using BJ-3 and SV-2 satellite images of Jellington (Pakistan), Seoul, and the Xánthi region. The results show that geometric correction accuracy reached 1.90 pixels in plain areas and 2.59 pixels in mountainous areas. [Conclusions] The results demonstrate that geometric correction performs better in plain areas and that spline functions outperform polynomial models in precise geometric correction. Based on the experiments, the optimal processing workflow is as follows: First, check the accuracy of each image's Rational Polynomial Coefficients (RPC). If the RPC accuracy of any image exceeds the threshold, recalculate all image RPCs using DOM-based control point matching. Second, perform preliminary geometric correction after block adjustment, color consistency adjustment, and mosaicking. Third, apply precise geometric correction using a spline function. The proposed method effectively addresses the geometric correction challenges of agile satellite images.
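
      To illustrate why a spline can outperform a global polynomial for precise correction, the sketch below fits synthetic tie-point residuals with a degree-2 polynomial surface and with a thin-plate spline (scipy's RBFInterpolator) and compares the fit error. The data and polynomial degree are illustrative assumptions.

```python
# Sketch: polynomial vs. thin-plate-spline fit of tie-point residuals.
# Synthetic residuals; a fair test would score held-out check points.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(2)
pts = rng.random((60, 2)) * 1000                                  # tie-point image coordinates
resid = np.sin(pts[:, 0] / 200) + 0.1 * rng.standard_normal(60)  # locally varying residual

# global degree-2 polynomial surface: columns 1, x, y, x^2, y^2, xy
A = np.column_stack([np.ones(60), pts, pts ** 2, pts[:, 0] * pts[:, 1]])
coef, *_ = np.linalg.lstsq(A, resid, rcond=None)
poly_rmse = np.sqrt(np.mean((A @ coef - resid) ** 2))

# thin-plate spline adapts to local distortion the polynomial smooths over
tps = RBFInterpolator(pts, resid, kernel="thin_plate_spline", smoothing=1e-3)
tps_rmse = np.sqrt(np.mean((tps(pts) - resid) ** 2))
print(f"polynomial RMSE={poly_rmse:.3f}, spline RMSE={tps_rmse:.3f}")
```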

    • CHENG Chuanxiang, JIN Fei, ZUO Xibing, LIN Yuzhun, WANG Shuxiang, LIU Xiao

      [Objectives] Object detection, a fundamental task in the intelligent interpretation of remote sensing imagery, involves the automated localization and identification of objects of interest. While object detection algorithms have achieved significant progress in natural image analysis, their performance often degrades when directly applied to remote sensing imagery, which is characterized by complex backgrounds, large scale variations, and targets with extreme aspect ratios. [Methods] To address these challenges, we propose a novel network, termed the Background, Scale, and Profile Detector (BSPDet), specifically designed to detect multi-scale and extreme-aspect-ratio targets in complex remote sensing scenes. The proposed architecture comprises three synergistic components. First, a Global-Local Contexts Crossing Fusion Network (GLCCFNet) is introduced as the backbone to achieve scale awareness by capturing multi-scale target features. It leverages complementary global and local contexts across both channel and spatial dimensions: global context serves as spatial weights to guide local context for precise semantic interpretation and noise suppression, while local context provides channel-wise attention to augment the global context with fine-grained details. This design demonstrates superior feature extraction compared to State-of-The-Art (SOTA) backbones, including PKINet, MobileNetv4, and StarNet. Second, a Select-Focus Feature Pyramid Network (SFFPN) is developed to mitigate complex background noise via semantic balancing and spatial selectivity. SFFPN reconciles semantic disparities across multi-scale feature layers and adaptively allocates attention weights, thereby enhancing features crucial for accurate object localization. This module outperforms established feature pyramid networks such as HSFPN, BiFPN, and AFPN. Finally, a Shape-aware Decoupled Head (SSDHead), incorporating strip and deformable convolutions, is employed to achieve shape awareness, enabling accurate detection of targets with extreme aspect ratios. [Results] On three public optical datasets, BSPDet achieves superior mAP scores of 96.5% on RSOD, 73.7% on DIOR, and 94.4% on HRRSD. The model also demonstrates excellent generalization, achieving a state-of-the-art mAP of 98.7% on the SAR dataset SSDD. [Conclusions] By integrating robust mechanisms for scale and shape awareness, the proposed BSPDet framework significantly improves object detection accuracy in remote sensing imagery and offers a novel methodological direction for future research. The approach provides a solid technical foundation for practical applications such as pavement distress monitoring and maritime ship detection.
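
      A minimal sketch of the strip-convolution idea behind shape awareness for extreme aspect ratios: pairing 1×k and k×1 kernels stretches the receptive field along each axis far more cheaply than a full k×k kernel. The channel count and kernel length are assumptions, and this is not the published SSDHead.

```python
# Sketch: paired horizontal/vertical strip convolutions for elongated targets.
import torch
import torch.nn as nn

class StripConv(nn.Module):
    def __init__(self, c, k=9):
        super().__init__()
        self.horiz = nn.Conv2d(c, c, kernel_size=(1, k), padding=(0, k // 2))
        self.vert = nn.Conv2d(c, c, kernel_size=(k, 1), padding=(k // 2, 0))
    def forward(self, x):
        return self.horiz(x) + self.vert(x)    # elongated receptive fields on both axes

feat = torch.rand(1, 64, 128, 128)
print(StripConv(64)(feat).shape)               # (1, 64, 128, 128)
```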

    • YANG Rui, LIU Yu

      [Objectives] Current remote sensing building change detection methods face several challenges, including insufficient utilization of bi-temporal image fusion information and poor image alignment, both of which often result in inaccurate detection results. In particular, the reliance on single-time-point features or simple image subtraction often fails to capture the complex temporal changes that occur in urban environments. Consequently, traditional methods struggle with false positives and missed detections, especially when background changes are significant or when the images are misaligned. [Methods] To address these issues, this paper proposes FIDAMamba (Feature Interaction and Differential Alignment Mamba), a Mamba-based network specifically designed for feature interaction and differential alignment of bi-temporal images. The proposed method leverages an encoder-decoder architecture to improve the fusion of bi-temporal image features. During feature extraction, a feature interaction module is embedded within a hierarchical Mamba encoder, projecting features from both temporal images into a shared space. This enables the network to identify and extract common representations. In the subsequent stage, when predicting the difference map, the model estimates and corrects misalignments between the bi-temporal features, thereby improving alignment. The corrected difference map, which aggregates multi-level encoder features, is then decoded to generate the final change detection map. [Results] Experimental evaluations confirm the effectiveness of FIDAMamba in building change detection tasks. The method achieves F1 scores of 91.89% and 91.12% on the LEVIR-CD and WHU-CD datasets, respectively, with corresponding mIoU values of 84.99% and 83.68%. Compared with existing methods such as ChangeFormer and ChangeMamba, FIDAMamba improves the F1 score by 1.49% and 2.09% on LEVIR-CD, and the mIoU by 2.52% and 3.50%, respectively. These results demonstrate that incorporating feature interaction and differential alignment modules substantially strengthens the model's ability to capture and align inter-temporal differences, particularly under conditions of significant background change or image misalignment. This leads to more accurate identification of changed buildings while reducing false positives and missed detections. [Conclusions] The proposed method significantly improves change detection accuracy and provides enhanced precision in identifying structural modifications. It offers strong support for advancing both research and practical applications of remote sensing imagery in monitoring, analyzing, and detecting building changes.
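
      A rough sketch of the feature-interaction idea: bi-temporal feature maps pass through a tied projection into a shared space, exchange information across dates, and are differenced for change decoding. The exact wiring is an assumption for illustration, not the published FIDAMamba module.

```python
# Sketch: project both dates into a shared space, mix, and take the difference.
import torch
import torch.nn as nn

class FeatureInteraction(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.shared = nn.Conv2d(c, c, 1)          # tied projection into a common space
        self.mix = nn.Conv2d(2 * c, c, 1)         # exchange information across dates
    def forward(self, f1, f2):
        p1, p2 = self.shared(f1), self.shared(f2)
        p1 = p1 + self.mix(torch.cat([p1, p2], 1))
        p2 = p2 + self.mix(torch.cat([p2, p1], 1))
        return torch.abs(p1 - p2)                 # difference map for change decoding

d = FeatureInteraction(32)(torch.rand(1, 32, 64, 64), torch.rand(1, 32, 64, 64))
print(d.shape)                                    # (1, 32, 64, 64)
```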

    • YU Shuangshuang, KANG Shuai, ZHANG Jianjun, JIN Man, HE Dongqing

      [Objectives] To provide strong support for rapid response and rescue after natural disasters, a building damage assessment model based on a two-stage convolutional neural network (BDDNET) is proposed. The model addresses the challenges of insufficient accuracy and efficiency in classifying building damage from remote sensing images. [Methods] The model builds on the U-Net++ architecture and incorporates Squeeze-and-Excitation (SE), Convolutional Block Attention Module (CBAM), and Atrous Spatial Pyramid Pooling (ASPP) modules. The combined effect of these modules enables effective processing of complex backgrounds and multi-scale damage areas, thereby enhancing feature extraction. To mitigate class imbalance in the dataset, the CutMix data augmentation technique is adopted to improve the model's generalization and robustness. The feasibility of this approach is evaluated using the xBD remote sensing satellite dataset. Ablation experiments are conducted to demonstrate the contribution of each added module, and evaluation metrics are compared against U-Net++, FCN, and DeepLabv3 baselines. [Results] Experimental validation shows that the BDDNET model achieves an F1 score of 92.25% and an IoU of 86.2% in building extraction, and an F1 score of 76.55% and an IoU of 64.15% in damage classification. Both results outperform current mainstream models, effectively completing the tasks of building extraction and damage classification. [Conclusions] The BDDNET building damage assessment model significantly improves the accuracy and efficiency of post-disaster building damage assessment. By integrating deep learning with remote sensing technologies, along with module optimization and data augmentation strategies, it demonstrates strong practicality and reliability, supporting rapid response and rescue operations after disasters.
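
      CutMix itself is standard; the sketch below shows the paired image/mask variant relevant here, where the random box is cut identically from image and label map so damage annotations stay aligned. The box sampling follows the usual CutMix recipe, and the sizes are illustrative.

```python
# Sketch: CutMix for paired image/mask samples in segmentation training.
import numpy as np

def cutmix(img_a, mask_a, img_b, mask_b, rng, alpha=1.0):
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)                   # mixing ratio from a Beta prior
    rh, rw = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = rng.integers(h), rng.integers(w)      # random box center
    y0, y1 = np.clip([cy - rh // 2, cy + rh // 2], 0, h)
    x0, x1 = np.clip([cx - rw // 2, cx + rw // 2], 0, w)
    img, mask = img_a.copy(), mask_a.copy()
    img[y0:y1, x0:x1] = img_b[y0:y1, x0:x1]        # paste patch from sample B
    mask[y0:y1, x0:x1] = mask_b[y0:y1, x0:x1]      # paste the matching labels
    return img, mask

rng = np.random.default_rng(3)
img, mask = cutmix(np.zeros((256, 256, 3)), np.zeros((256, 256), int),
                   np.ones((256, 256, 3)), np.ones((256, 256), int), rng)
print(mask.mean())                                 # fraction of pixels taken from sample B
```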

    • ZHAO Binru, LI Wei, ZHANG Feng, ZHOU Peng, LIANG Jianfeng

      [Objectives] Coastal zone remote sensing monitoring is crucial for resource management and environmental protection. Deep learning has introduced new methods for coastal remote sensing, yet the complexity of coastal imagery and the scarcity of samples often result in unsatisfactory performance in practical applications. Aiming to enrich coastal remote sensing datasets and enhance model accuracy and generalization, we developed an automated data augmentation network (RS_FAA) and proposed an evaluation method to assess the effectiveness of the augmentation, thereby enabling efficient automated enhancement of coastal remote sensing samples. [Methods] First, in light of the imaging and spectral features of coastal remote sensing imagery, we designed a strategy search space that integrates both geometric and color enhancements. By combining these strategies, the overreliance on geometric augmentation in existing methods was addressed. Next, we improved the Fast AutoAugment network: a U-Net module, an augmentation evaluation module, and a strategy selection module were integrated to optimize the combination of multiple augmentation strategies. Finally, dataset diversity was assessed using Euclidean distance and cosine similarity, and six performance metrics were employed to evaluate the impact of the enhanced dataset on model generalization. [Results] Taking the coastal zone of Zhangzhou, Fujian Province, China, as the study area, we evaluated the augmentation effectiveness and model performance. The enhanced dataset showed a remarkable improvement in feature diversity. Overall accuracy increased by 2.4%, 1.5%, 1.3%, and 0.8% on U-Net, PSPNet, SegNet, and DeepLabv3+, respectively. Moreover, the cross-regional recall rate improved by 4.8%. [Conclusions] The experimental results indicate that our method can effectively enhance the diversity of coastal remote sensing datasets and boost model performance while reducing dependency on region-specific data distributions. It can be successfully applied to coastal remote sensing monitoring tasks.
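
      The diversity assessment can be pictured as follows: compute mean pairwise Euclidean distance and cosine similarity over feature vectors before and after augmentation, where larger distances and lower similarity indicate a more diverse dataset. The random features below stand in for real image embeddings.

```python
# Sketch: dataset diversity via mean pairwise distance and cosine similarity.
import numpy as np

def diversity(feats):
    n = len(feats)
    dists, sims = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(np.linalg.norm(feats[i] - feats[j]))
            sims.append(feats[i] @ feats[j] /
                        (np.linalg.norm(feats[i]) * np.linalg.norm(feats[j])))
    return np.mean(dists), np.mean(sims)

rng = np.random.default_rng(4)
base = rng.random((50, 128))                              # original sample embeddings
augmented = np.vstack([base, base + 0.3 * rng.standard_normal((50, 128))])
print("before:", diversity(base))
print("after :", diversity(augmented))   # larger distance / lower similarity = more diverse
```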

    • CHEN Xiaowei, ZHU Jiamin, LIU Yuchen

      [Objectives] Addressing challenges in data acquisition and the limited quantitative dimensions in evaluating the spatial quality of historic district streets, this study proposes a quantifiable assessment method for street spatial quality, using Yuhou Street in Chenzhou City as a case study. [Methods] A 17-indicator evaluation model is constructed, incorporating cultural aspects across three dimensions: pedestrian accessibility, safety and convenience, and visual comfort. To overcome data collection difficulties in narrow streets and incomplete coverage from mapping platforms, the study employs self-collected street view images—manually simulating tourist walking paths to capture street scenes—ensuring applicability to narrow sample streets. Semantic segmentation using a Fully Convolutional Network (FCN) and spatial visualization in ArcGIS are applied to achieve a quantifiable assessment of street spatial quality. [Results] ① Self-collected street view images effectively cover narrow alleys and provide a pedestrian-perspective view, meeting experimental data requirements. ② The combination of street view images, semantic segmentation, and data visualization enables precise localization of street indicators and spatial quality grading. The analysis reveals significant gradient distribution and spatial heterogeneity in street quality across the case study area. ③ The method establishes a complete technical framework from street view data collection to spatial quality evaluation, enabling multidimensional quantitative analysis and visual representation of street environments. It provides data support and decision-making references for the refined renewal of historic districts. [Conclusions] ① This study integrates manual streetscape collection into a “data collection—semantic segmentation—spatial quantification” technical pathway for historic districts, effectively addressing data collection challenges in narrow streets and providing a scalable technical solution for small-scale built environment research. ② The evaluation system constructed for historic and cultural districts incorporates soft indicators such as “cultural visual recognition” and “proportion of traditional businesses” into a quantitative framework, offering methodological reference and renewal strategies for enhancing the quality of similar districts.
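
      As an illustration of how segmentation output feeds an indicator system of this kind, the sketch below derives pixel-share indicators (e.g., a green-view ratio) from a predicted label map. The class IDs and indicator formulas are hypothetical, not the paper's 17-indicator model.

```python
# Sketch: derive street-quality indicators from a semantic segmentation map.
import numpy as np

VEGETATION, SKY, BUILDING = 1, 2, 3        # hypothetical class ids of the label map

rng = np.random.default_rng(5)
seg = rng.integers(0, 4, size=(512, 512))  # stand-in for an FCN prediction

def class_ratio(seg_map, class_id):
    return float((seg_map == class_id).mean())

green_view = class_ratio(seg, VEGETATION)  # share of vegetation pixels in the view
enclosure = class_ratio(seg, BUILDING) / max(class_ratio(seg, SKY), 1e-6)
print(f"green view ratio={green_view:.3f}, enclosure={enclosure:.3f}")
```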