Running prompts against images, PDFs, audio and video with Google Gemini

From the blog Simon Willison TIL
I'm still working towards adding multi-modal support to my LLM tool. In the meantime, here are notes on running prompts against images, PDFs, audio and video files from the command line using the Google Gemini family of models. Update: I integrated the research from this TIL into my LLM tool, which can now run multi-modal prompts against...