[1] Jalal, H.D. et al. 2026. Cross-Modal Knowledge Mining Leveraging Multimodal Large Language Models for Automated Video Scene Understanding and Event Detection. NextGen AI & Computing Journal. 1, 1 (May 2026), 102–131. DOI:https://doi.org/10.5281/zenodo.20461727.