Automating Visual Storytelling Evaluation with Large Vision Language Models and Diffusion Models
Abstract
Human Evaluation