Publications

You can also find my articles on my Google Scholar profile.

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Under Review

MTL-LoRA enhances LoRA by improving the adaptation of large language models to multiple tasks simultaneously. It leverages task-specific transformation matrices and multiple up-projection matrices to effectively extract both task-specific and task-agnostic information.

LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model

Published in ECCV, 2024

Develops a large-scale pre-training and instruction dataset and constructs a multimodal large language model, LHRS-Bot, with an innovative alignment strategy. LHRS-Bot shows superior performance on holistic understanding of remote sensing images and on complex visual reasoning.

Recommended citation: Muhtar, D., Li, Z., Gu, F., Zhang, X., & Xiao, P. (2024). LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model. arXiv preprint arXiv:2402.02544. https://arxiv.org/abs/2402.02544

FDFF-Net: A Full-Scale Difference Feature Fusion Network for Change Detection in High-Resolution Remote Sensing Images

Published in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023

A full-scale difference feature fusion network (FDFF-Net) for change detection that alleviates pseudochanges and reduces the loss of change details during detection.

Recommended citation: Gu, F., Xiao, P., Zhang, X., Li, Z., & Muhtar, D. (2023). FDFF-Net: A Full-Scale Difference Feature Fusion Network for Change Detection in High-Resolution Remote Sensing Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. https://ieeexplore.ieee.org/abstract/document/10324305

CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding

Published in IEEE Transactions on Geoscience and Remote Sensing, 2023

Integrates contrastive learning and masked image modeling to enhance self-supervised representation learning. Pre-trained on the large-scale MillionAID dataset, it shows state-of-the-art performance across different remote sensing downstream tasks, outperforming MAE, MoCo, SwAV, and SimMIM, among others.

Recommended citation: Muhtar, D., Zhang, X., Xiao, P., Li, Z., & Gu, F. (2023). CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding. IEEE Transactions on Geoscience and Remote Sensing. https://arxiv.org/abs/2304.09670

Dual-Range Context Aggregation for Efficient Semantic Segmentation in Remote Sensing Images

Published in IEEE Geoscience and Remote Sensing Letters, 2023

A lightweight dual-range context aggregation network (LDCANet) for efficient remote sensing image semantic segmentation.

Recommended citation: He, G., Dong, Z., Feng, P., Muhtar, D., & Zhang, X. (2023). Dual-Range Context Aggregation for Efficient Semantic Segmentation in Remote Sensing Images. IEEE Geoscience and Remote Sensing Letters, 20, 1-5. https://ieeexplore.ieee.org/abstract/document/10005205

Index your position: A novel self-supervised learning method for remote sensing images semantic segmentation

Published in IEEE Transactions on Geoscience and Remote Sensing, 2022

Employs position indexing to model the intricate spatial relationships among objects in remote sensing images, facilitating fine-grained representation learning.

Recommended citation: Muhtar, D., Zhang, X., & Xiao, P. (2022). Index your position: A novel self-supervised learning method for remote sensing images semantic segmentation. IEEE Transactions on Geoscience and Remote Sensing. https://ieeexplore.ieee.org/abstract/document/9781429