The internet is fawning over ChatGPT’s new vision feature.
The OpenAI tool’s latest updates began rolling out earlier this week. They enable ChatGPT to “see” images that users upload, which users can then discuss with the chatbot. Additional “hear” and “speak” features enable users to have voice conversations with ChatGPT.
The new artificial intelligence (AI) capabilities make use of GPT-3.5 and GPT-4, which “apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images,” according to a Monday OpenAI blog post.
People around the world have begun testing out the new features and sharing their experiences on social media. Below are 10 creative ways ChatGPT users are making use of this new vision feature.
Identify Film Scenes
On X, formerly Twitter, some users alerted their followers that they could upload a screenshot from a movie and have ChatGPT identify the film. In one example posted by @skalskip92, ChatGPT identified Pulp Fiction from a screenshot showing actors John Travolta and Samuel L. Jackson. ChatGPT also shared information on the film’s historical context and, when asked, its rating on IMDb.
Writer Peter Yang also tested this capability with a screenshot from the Ridley Scott-directed 2000 film Gladiator.
Do Kids’ Homework
AI developer McKay Wrigley posted a video on X showing how ChatGPT can now explain scientific diagrams to students. In Wrigley’s example, he posted a diagram showing the inside of a human cell and asked for help understanding what each component did. ChatGPT came back with brief descriptions for each cell part.
Yang also tested out this tutor-like function by sending ChatGPT an image of an addition worksheet for a math class. ChatGPT provided answers to all of the math problems included on the sheet.
“Kids will never do homework again,” Yang tweeted.
Offer Coaching Tips
Create Labs co-founder Abran Maldonado tweeted that he provided ChatGPT with two photos taken during a football game “in honor of football season.” ChatGPT then explained what appeared to be happening in each photo and offered six coaching tips for the quarterback. Maldonado predicted the new vision feature “will forever change coaching and sports analytics.”
Write Code
Users have also discovered that ChatGPT can write code based on uploaded images, charts and diagrams. In one example, Wrigley shared a photo on X of diagrams drawn out on a whiteboard that ChatGPT then transformed into code.
Several X users shared another video showing how ChatGPT created a website with a design matching a sketch drawn on paper, a photo of which was then uploaded for the chatbot to assess.
Adjust a Bicycle Seat
ChatGPT can walk users through how-to instructions for a variety of everyday tasks, including adjusting a bicycle seat. In one example shared by OpenAI, a user struggling to lower the seat of their bicycle could take a photo of the bicycle and follow step-by-step instructions for making the necessary adjustments. Users can ask follow-up questions and send along additional images to work through specific steps when they run into trouble, according to OpenAI’s video. The vision feature can also be used to fix other items around the house, OpenAI’s blog post said.
Take Better Photos
Ethan Mollick, a professor who studies AI’s impacts on education, said on X that ChatGPT’s vision feature can help users take better photographs. Mollick uploaded a photo to ChatGPT and asked for specific instructions on how to improve the image. ChatGPT responded with tips on framing, lighting, perspective and more.
Name a Design Style
Pietro Schirano, whose X bio says he works in AI, posted on X that ChatGPT also suggested a name for an interior design style after photos of the style were uploaded. ChatGPT described the space’s design elements and explained the historical context for its name suggestion.
Avoid Parking Tickets
Yang tweeted he “will never get a parking ticket again” now that ChatGPT’s vision feature is available. Yang posted a photo of a sign bearing several parking notices, each giving different rules on when people could and could not park in that area. Yang provided ChatGPT with a specific time and weekday, asking if it was safe to park.
Analyze a Cartoon
ChatGPT tapped into art analysis when Schirano asked about the meaning behind a four-panel cartoon. ChatGPT’s analysis broke down the cartoon by panel and provided an overall assessment of its meaning at the end.
Decipher Handwritten Notes
ChatGPT can be deployed to read messy or flowery handwriting styles. In one example shared on X by Mollick, a photo of part of a handwritten manuscript was uploaded to ChatGPT for deciphering. The chatbot did fairly well, according to Mollick.
“Likely going to be a big deal for a number of academic fields, especially as the AI can ‘reason’ about the text,” he tweeted.
Find Waldo
Perhaps most important of all, ChatGPT can help kids (and adults) around the world find Waldo.
“I found him!” ChatGPT responded to a Where’s Waldo? page uploaded by Schirano, adding instructions on where to look.