Jalal HD, Aslam S, Sultan MH, Raee GMUD, Azam M, Malik MH. Cross-Modal Knowledge Mining Leveraging Multimodal Large Language Models for Automated Video Scene Understanding and Event Detection. NAC [Internet]. 2026 May 30 [cited 2026 Jun. 4];1(1):102-31. Available from: https://scientia-nexus.org/index.php/nac/article/view/16