The move toward video and multimedia support in major AI models reflects a broader trend toward multimodal AI systems.