A Visual Chatbot for Research-Grade HCI Applications

Text-only chat interfaces limit the kinds of research and assistance a conversational system can support. This project explored a Python-based visual chatbot that accepts text, images, and video as inputs.

The system combines natural-language processing with multimodal input handling and model APIs. A user can provide visual context alongside a question, allowing the interface to support tasks that would be difficult to express through text alone.
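As a rough illustration of what pairing a question with visual context could look like in Python, here is a minimal sketch. It is not the published implementation: the names Attachment, UserTurn, classify_modality, build_turn, and summarize_inputs are hypothetical, and guessing the modality from the file name is an assumption made for brevity.

```python
import mimetypes
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attachment:
    """A visual input supplied alongside the user's question (hypothetical type)."""
    path: str
    modality: str  # "image" or "video"


@dataclass
class UserTurn:
    """One user message: the question text plus any visual context."""
    text: str
    attachments: List[Attachment] = field(default_factory=list)


def classify_modality(path: str) -> str:
    """Guess the modality from the file name; reject unsupported inputs."""
    mime, _ = mimetypes.guess_type(path)
    if mime is not None:
        kind = mime.split("/")[0]
        if kind in ("image", "video"):
            return kind
    raise ValueError(f"Unsupported input type: {path}")


def build_turn(text: str, paths: List[str]) -> UserTurn:
    """Bundle a question with its attachments into a single turn."""
    return UserTurn(text, [Attachment(p, classify_modality(p)) for p in paths])


def summarize_inputs(turn: UserTurn) -> str:
    """Describe what was attached, so it can be echoed back to the user."""
    parts = [f"{a.modality}: {a.path}" for a in turn.attachments]
    return "; ".join(parts) if parts else "text only"


if __name__ == "__main__":
    turn = build_turn("What trend does this chart show?", ["chart.png", "demo.mp4"])
    print(summarize_inputs(turn))
```

Keeping the question and its attachments in one object means whatever is later sent to a model API can also be shown back to the user in the same form. A real system would likely inspect file contents rather than trust the extension.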

Research focus

How should a conversation represent visual context?
What feedback helps a user understand what the system processed?
How can multimodal responses remain clear and reviewable?
Which interface patterns reduce friction in research applications?

The work was published at IITCEE 2026 in Bangalore as "Development of a Python-Based Visual Chatbot for Advancing Human-Computer Interaction in Research Applications." The paper is available under DOI 10.1109/IITCEE67948.2026.11394241.

For me, the project connected AI engineering with interface design: a model may support multiple modalities, but the product still has to help people use those capabilities with confidence.