![[Gemini] Building a LINE E-commerce Chatbot That Can "Tell Stories from Images"](/_next/image?url=https%3A%2F%2Fmedia2.dev.to%2Fdynamic%2Fimage%2Fwidth%3D800%252Cheight%3D%252Cfit%3Dscale-down%252Cgravity%3Dauto%252Cformat%3Dauto%2Fhttps%253A%252F%252Fdev-to-uploads.s3.amazonaws.com%252Fuploads%252Farticles%252Fuc7ulj3k2ehr5j0fwdch.png&w=1200&q=75)
# [Gemini] Building a LINE E-commerce Chatbot That Can "Tell Stories from Images"
**Reference articles:**

- Gemini API - Function Calling with Multimodal
- GitHub: linebot-gemini-multimodel-funcal
- Vertex AI - Multimodal Function Response
- Complete code: GitHub

## Background

I believe many people have used the combination of LINE Bot + Function Calling. When a user asks "What clothes did I buy last month?", the Bot calls the database query function, retrieves the order data, and Gemini then answers based on that JSON.

The traditional process designed by developers:

> User: "Help me see the jacket I bought before"
> Bot: [ Call `get_order_history()` ]
> Function returns: `{ "product_name": "Brown pilot jacket", "order_date": "2026-01-15", ... }`
> Gemini: "You bought a brown pilot jacket on January 15th for NT$1,890."

The answer is completely correct, but it always feels like something is missing: the user is talking about "that jacket," while Gemini merely restates the text in the JSON, with no way to "confirm" what the jacket actually looks like. If there happen to be three jackets in the database, t…
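The round trip described above can be sketched in plain Python. This is a minimal illustration, not the article's implementation: the `ORDERS` data, its fields, and the `get_order_history` helper are hypothetical stand-ins for the real database function, and the model's decision to call the tool is hard-wired here rather than made by Gemini's Function Calling.

```python
import json

# Hypothetical order database; names and fields are illustrative only.
ORDERS = [
    {"product_name": "Brown pilot jacket", "order_date": "2026-01-15", "price_ntd": 1890},
]

def get_order_history(keyword: str) -> str:
    """The tool the model would call: return matching orders as a JSON string."""
    hits = [o for o in ORDERS if keyword.lower() in o["product_name"].lower()]
    return json.dumps(hits, ensure_ascii=False)

def handle_message(user_text: str) -> str:
    """Simulate one Function Calling round trip for a single user message."""
    # In a real bot, Gemini inspects user_text and decides to call the tool;
    # here that decision is hard-wired to show the data flow.
    orders = json.loads(get_order_history("jacket"))
    if not orders:
        return "No matching orders found."
    o = orders[0]
    # Gemini would normally phrase this answer itself from the returned JSON.
    return f"You bought a {o['product_name']} on {o['order_date']} for NT${o['price_ntd']}."

print(handle_message("Help me see the jacket I bought before"))
# → You bought a Brown pilot jacket on 2026-01-15 for NT$1890.
```

The key point the article builds on is visible even in this sketch: everything Gemini can say comes from the JSON string, so the model has no visual grounding for "that jacket."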
Continue reading on Dev.to
