I have been building on top of the OpenAI ChatGPT API for years now, and only recently gained access to Google Bard. So I decided to run a head-to-head comparison of the two models, testing their performance in the following areas:
- General Questions
- Recommendations
- Sensitive Questions
- Short-Term Memory
- Code Generation
- Real-Time Data
At the time of writing, I am still waiting for GPT-4 access, so I hope to do another comparison in a follow-up article. If you would like me to test other areas, please comment below and I will be happy to run more tests.
General Question (Tied)
To start off, I asked both models for tips on throwing a party for a 15-year-old. Surprisingly, both models provided very similar answers in terms of content and length. Overall, I would say that both models performed well in answering general questions.
Recommendation (ChatGPT Wins)
Next, I followed up with a question about who to invite to the party. ChatGPT seemed to provide more personalized and informative content, while Bard’s answer was more general in nature. In addition, ChatGPT produced twice as much content as Bard.
Sensitive Question (Tied)
For my next question, I asked both models whether guests should wear masks at the party. Both models provided very similar, safe and politically correct responses to this sensitive question.
Short-Term Memory (Tied)
To test how well the models remembered my previous questions, I asked them to write an invitation for the event without repeating that it was a party for a 15-year-old. Both models performed fairly well on this test: each mentioned in its response that the event was for a 15-year-old.
Code Generation (ChatGPT Wins)
Next, I asked both models to generate an HTML landing page for the event. From a developer’s perspective, ChatGPT’s code snippet appeared more sophisticated than Bard’s.
Real-Time Data (Bard Wins)
Lastly, I wanted to test whether the models had access to real-time data. I asked both models for the score of the Lakers and Suns game from the previous night. ChatGPT was unable to respond with real-time information, but Bard’s response was impressively accurate.
Overall, I find both large language models to be very impressive. However, based on my tests, I believe that ChatGPT is the better model at this point. While Bard is able to respond with real-time data, this is achieved through a prompt engineering technique that involves doing a real-time Google search and using the information as background context when asking the model a question.
This is similar to the technique Bing uses to answer questions about live data. For that reason, I would conclude that ChatGPT is still the better large language model at this point in time.
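To make the search-then-prompt idea described above more concrete, here is a minimal Python sketch. It is not how Bard or Bing is actually implemented; it only illustrates the general pattern of running a search first and passing the results to the model as background context. The names build_augmented_prompt, answer_with_live_data, fake_search, and echo_model are all hypothetical, and the search and model calls are stand-in functions rather than real APIs.

```python
from typing import Callable, List


def build_augmented_prompt(question: str, snippets: List[str]) -> str:
    """Prepend retrieved search snippets to the question as background context."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Use the background context below to answer the question.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


def answer_with_live_data(
    question: str,
    search: Callable[[str], List[str]],
    ask_model: Callable[[str], str],
) -> str:
    """Run a search first, then ask the model with the results as context.

    `search` and `ask_model` are hypothetical callables supplied by the
    caller; in a real system they would wrap a search engine and an LLM API.
    """
    snippets = search(question)
    prompt = build_augmented_prompt(question, snippets)
    return ask_model(prompt)


def fake_search(query: str) -> List[str]:
    """Stand-in for a real-time search; returns placeholder text only."""
    return ["Placeholder search result text for: " + query]


def echo_model(prompt: str) -> str:
    """Stand-in for a language model; simply returns the prompt it received."""
    return prompt


if __name__ == "__main__":
    print(answer_with_live_data(
        "What was the score of last night's game?", fake_search, echo_model
    ))
```

The point of the sketch is that the model itself never needs live data: whatever is fresh arrives through the prompt, which is why this says more about the surrounding system than about the model's own capabilities.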
Let me know whether you agree or disagree. If you would like me to perform more tests, please comment below and I will be happy to run them.
By: Nelson Chu (Founder of Superinsight.ai)
Originally published at Hackernoon