OpenAI introduced an improved version of GPT-4 Turbo with Vision in the API
OpenAI has introduced an improved version of GPT-4 Turbo with Vision capabilities. The model is available via the API. The company also announced new AI tools built on GPT-4 Turbo with Vision.

OpenAI has announced significant improvements to its latest artificial intelligence model, GPT-4 Turbo.
GPT-4 Turbo update
GPT-4 Turbo is now equipped with vision capabilities, allowing it to process and analyze visual input alongside text.
The model can answer questions about images, videos, and more. The company also highlighted several AI tools running on GPT-4 Turbo with Vision, including the Devin coding assistant and Healthify's Snap feature.
“Majorly improved GPT-4 Turbo model available now in the API and rolling out in ChatGPT.”
Previously, the Vision model could answer general questions about what was present in an image. It is now optimized for answering questions about specific details.
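To illustrate how an image question reaches the model, here is a minimal sketch of a Chat Completions request payload that pairs a text question with an image URL. The model name "gpt-4-turbo", the helper function, and the example URL are assumptions for illustration, not taken from the article.

```python
# Sketch of a GPT-4 Turbo with Vision request payload, assuming the
# Chat Completions API shape. The model name and image URL below are
# placeholders, not values confirmed by the article.
import json


def build_vision_request(question: str, image_url: str) -> dict:
    """Build a Chat Completions payload that pairs a text question
    with an image so the model can answer about specific details."""
    return {
        "model": "gpt-4-turbo",  # assumed model identifier
        "messages": [
            {
                "role": "user",
                # Vision requests mix text and image parts in one message.
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


payload = build_vision_request(
    "What text appears in this image?",
    "https://example.com/receipt.png",  # placeholder URL
)
print(json.dumps(payload, indent=2))
```

In practice this payload would be POSTed to the API's chat completions endpoint with an authorization header; the response's message content then carries the model's answer about the image.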
What can GPT-4 Turbo do with Vision?
Some users, including OpenAI developers, shared their impressions of the model's capabilities after testing it. Among the use cases:
- Extracting unstructured text and images into database tables. Developer Simon Willison sent an image to the chatbot and extracted all the text from it.
- Writing code from an interface sketch. OpenAI developers noted that GPT-4 Turbo with Vision can help write code in Make Real to turn a drawing into a working website.
- Handling a variety of coding tasks. Devin, billed as the world's first autonomous AI coding agent, also runs on GPT-4 Turbo with Vision.
- Determining the composition and calorie content of food from a photograph. Healthify, the world's largest health and fitness app, used GPT-4 Turbo with Vision to build its Snap feature, which gives users nutritional information from photos of dishes from around the world.
- Extracting web data. Kadoa uses GPT-4 Turbo with Vision to automate web scraping and RPA tasks that cannot be handled with text alone.
- News creation. Y Combinator, a US-based tech startup accelerator, shared how its team built a template user interface for Hacker News stories using GPT-4 Turbo with Vision.
- Converting layouts into functional interactive dashboards. Haroen Vermilen, a data visualization expert at Luzmo, announced that they are using the GPT-4 Turbo with Vision API to run Instach.art, a tool that transforms a Figma layout into a fully functional interactive dashboard with demo data.
The enhanced capabilities of GPT-4 Turbo with Vision unlock use cases and features that were not previously possible.