Abstract: In recent years, a growing body of research has shown that fine-grained local alignment is crucial for cross-modal medical image-report retrieval. However, existing local alignment ...