GPT-3 and AI in Customer Support
“Could we be using GPT-3 to automatically respond to people instead of hiring those new customer service agents?”
If that’s a question you heard around budget time this year, this article is for you. If not, then perhaps you have a broader interest in AI and machine learning and the ways they could assist in delivering better customer service.
GPT-3 has been making news recently, so it’s worth taking a look to understand what it is and how it might help.
What is GPT-3?
GPT-3 is a language model — a way for machines to understand what human languages look like. That model can then be used to generate prose (or even code) that seems like it was written by a real person.
In simple terms, language models help computers estimate the probability of word sequences. You have a language model, too; what is the missing word in this sentence? “Why did the ___ cross the road?”
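To make "estimating the probability of word sequences" concrete, here is a minimal sketch of a toy language model. It is not how GPT-3 works internally (GPT-3 is a very large neural network); it only counts which word follows which in a tiny made-up corpus. The corpus and the function names (`train_bigrams`, `next_word_probabilities`, `fill_blank`) are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, for illustration only.
CORPUS = [
    "why did the chicken cross the road",
    "why did the chicken cross the street",
    "why did the duck cross the road",
    "the chicken crossed the road to get to the other side",
]

def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probabilities(counts, prev):
    """Turn the raw follower counts for `prev` into probabilities."""
    followers = counts[prev]
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}

def fill_blank(counts, before, after):
    """Pick the word w that maximizes P(w | before) * P(after | w)."""
    scores = {}
    for word, p_word in next_word_probabilities(counts, before).items():
        p_after = next_word_probabilities(counts, word).get(after, 0.0)
        scores[word] = p_word * p_after
    return max(scores, key=scores.get)

model = train_bigrams(CORPUS)
# "Why did the ___ cross the road?"
print(fill_blank(model, "the", "cross"))  # prints "chicken" for this toy corpus
```

The toy model "knows" the answer only because chickens show up most often in that position in its training text, which is the same basic idea at a vastly smaller scale.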
One of the most important differences between GPT-3’s generation of tools and earlier machine learning models is that you don’t need to train it with high-quality, carefully labeled and structured information. Instead, GPT-3 imbibed an enormous (and broad) amount of public online text and used that to develop its model.
That model produces some impressive results across a variety of use cases.
We asked GPT-3 to answer real customer service questions
The potential for customer service usage is clear — could this software read your incoming customer questions and generate accurate, helpful answers? Our own data scientist, Matt Mazur, decided to find out.
The team at OpenAI made GPT-3 available via an API, so Matt signed up. He fed GPT-3 only six examples of real responses from our entirely human (and highly skilled) Help Scout customer service team.
From those six responses, GPT-3 did not learn anything about Help Scout or its products; it only looked at the voice, tone, and structure our team used in providing those answers. That’s why some of the answers sound real, but don’t make much sense.
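Supplying a handful of examples like this is often called few-shot prompting: the examples and the new question are combined into one block of text, and the model is asked to continue it. The sketch below shows one way such a prompt could be assembled. The article does not show Matt's actual prompt, so the `Customer:`/`Agent:` layout, the function name `build_few_shot_prompt`, and the placeholder examples are all assumptions, and no API call is made.

```python
# Placeholder question/answer pairs. The six real Help Scout examples are not
# reproduced in the article, so these are hypothetical stand-ins.
EXAMPLES = [
    ("How do I add a teammate to my account?",
     "Hi there, Happy to help! You can invite teammates from your settings page."),
    ("Can I change my billing date?",
     "Hey, Thanks for reaching out! Let me take a look at what is possible here."),
]

def build_few_shot_prompt(examples, new_question):
    """Join example exchanges, then leave the final reply open for the model."""
    parts = []
    for question, answer in examples:
        parts.append(f"Customer: {question}\nAgent: {answer}")
    # The prompt ends on an unanswered "Agent:" so the model's continuation
    # becomes the suggested reply.
    parts.append(f"Customer: {new_question}\nAgent:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(EXAMPLES, "Will you be sending out a recording?")
print(prompt)
```

Nothing in a prompt like this carries product knowledge beyond what happens to appear in the examples, which is why the model can pick up tone and structure but not facts.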
Next, Matt took some genuine customer questions (different from the six examples) and had GPT-3 generate responses. Please note we have no plans to actually implement GPT-3 at Help Scout; this was a purely experimental exercise.
Below I’ve included four examples (real questions, edited only for privacy), each alongside the answer our actual customer service team gave. Remember, GPT-3 is not using any Help Scout-specific information here, other than what it may have absorbed from the open web when the language model was created.
Question 1
Hi! I just sent out the first few emails using a template but the text looks purple. It didn't look like that in the preview. How do I avoid this?
Best, Salman
GPT-3 answered | Our human team answered |
---|---|
Hello Salman, The color of the text in the email template is controlled by the color of the text in the HTML code. If you're using a template from the Email Templates section of the builder, you can find the HTML code in the Code section of the builder. If you're using a template from the Template Gallery, you can find the HTML code in the Code section of the template. If you're using a template from the Template Gallery, you can also find the HTML code in the Code section of the template. | Hello Salman, It looks like you're using Gmail to read your emails, is that right? If yes, then this is actually a display feature in Gmail and it's done to indicate to you that the text is an exact copy from another thread. Rest assured your recipients won't see the purple color from their end, if they are viewing this message for the first time. Hope this clarifies things but let me know if you need anything else :) |
Of note:
GPT-3 has some convincing technically correct answers here, but they just don’t apply. It is the sort of mistake an inexperienced human could easily make.
Our real support pro has spotted a more likely cause and avoids a lot of unnecessary complexity.
Question 2
Hi,
I just downgraded our account from the Standard Plan ($20 per user per month) to the Basic Plan ($12 per user per month), but I am being billed $15 per user per month.
Cheers Vincent
GPT-3 answered | Our human team answered |
---|---|
Hello Vincent, I've taken a look at your account and can see that you have been billed for the Standard Plan for the month of October. I've also checked with our billing team and they've confirmed that the Basic Plan is the correct plan for your account. I've made a note of your query and will be sure to follow up with you once the issue has been resolved. | Hey Vincent, Happy to shed some light! The Basic plan is $15 per user/month when paid monthly, or $12 per user/month when paid annually. To get that lower price-point, head to the Your Plan page to switch to the annual plan. Heads up: That is a 12 month commitment since it comes with such a steep discount. Let me know if there’s anything else I can help with for now! |
Of note:
Kristi, from our Customers team, noted: “The customer needs clarity around annual vs. monthly billing, but they don't realize it — something GPT-3 can’t know.”
This answer from GPT-3 is the type of polite yet completely unhelpful response that is particularly irritating.
Question 3
Hi, Hope you’re well. We already have a Help Scout account but we are keen to attend the webinar on 15th October on "How to Create Customer Flow with Messaging." The only thing is that it's on at 4AM our time. Will you be sending out a recording?
Many thanks, Scott
GPT-3 answered | Our human team answered |
---|---|
Hello Scott, I'm sorry to hear that you won't be able to join us live for the webinar on the 15th. I've forwarded your request to our team and they'll be in touch with you shortly. Thank you for your interest! | Hi Scott, Great to hear you're interested in checking out that webinar! Go ahead and sign up, and we'll automatically send you a recording afterwards. :) Let me know if there's anything else I can help with for now! |
Of note:
There is an explicit question, but GPT-3 predicts a very generic answer. It’s likely that if it had more examples to work with, it could produce a better answer to what is a pretty simple question.
Question 4
I have tried multiple times to create an account but keep getting this error: (screenshot removed).
Info I submitted, Tacomia is company name and beans@redact.io is the email.
Not sure what the problem is. I'm trying to set up an account for a new company I started. I already have HelpScout for an existing business. Please reply to blum@beanborn.com.
GPT-3 answered | Our human team answered |
---|---|
Hello Blum, I'm afraid I'm not able to see any issues with your account creation. Please let me know if you have any other questions or concerns. | Hi Jack, Thanks for your interest in using Help Scout! To share a bit more context with you, we have a strict vetting process to make sure Help Scout does not get abused by malicious users. I was able to find your attempted sign-up on our end, but it was indeed blocked by our filters, I'm afraid. To be able to continue with the sign-up, can you share a bit more detail with us? Can you let us know a little bit more about your business and how you're hoping to use Help Scout? What is your business website? Do you have a LinkedIn profile or Twitter profile? How many team members do you expect will be using Help Scout? Which email hosting service do you currently use for your business? We look forward to getting to know you! :) |
Of note:
Again, GPT-3 is unfailingly polite, as our real team members are, but without the behind-the-scenes context, it doesn’t have a real chance to divine an answer.
Disha, from our Customers team, said: “[some of] the answers were incorrect or incomplete and the AI sounded dismissive by confidently providing short/wrong/vague/incomplete answers. It didn't leave any room for the possibility of being wrong, and didn't ask for clarifying info which we would have done.”
All in all, GPT-3’s answers sound very real in many cases, but they are also over-confident and unhelpful. As one team member put it, “This could be me on days where I am sleep deprived and my reading comprehension is non-existent.”
What is GPT-3 actually doing?
How should we understand the answers that GPT-3 is providing? Let’s start by being clear on what it is not doing.
GPT-3 is not:
Searching a knowledge base or reading help documents to find the “right” answer.
Understanding anything about Help Scout the company or its products.
Judging whether an answer is correct or helpful.
What GPT-3 is doing is predicting, based on what it knows about how our English language works, what the response text is statistically most likely to be. Then it is using the six example Help Scout answers it has seen as a model for the tone of voice and structure to use in generating the final response. Note: We’ve set up quite an unreasonable test scenario, due to the limitations of our GPT-3 access.
Given all of that, it is impressive how close it comes to creating plausibly real responses, without ever understanding the context of the individual customer, their goals, or the products. It reveals just how much of our human interaction is almost formulaic — we have routines and phrases that we fit into the right situations over and over.
There’s nothing wrong with that at all — often that is exactly what our customers need. And human support people aren’t immune from giving a practiced response without having noticed a key detail in the question that changes everything. Sometimes, though, that’s going to be insufficient or actively unhelpful.
GPT-3 is probably going to be really good at producing the sort of mostly fine (as long as you aren’t an edge case) answers we have come to expect from … companies I don’t need to name for you. You’re aiming higher than that for your company, though, as are we at Help Scout.
The verdict: GPT-3 for customer service
We must remember that GPT-3 doesn’t really know anything. A language model has no understanding of people or why they behave in certain ways. It just looks at how people have written in the past and uses that to predict what they would write in a given situation. It won’t judge its own answers, no matter how sneaky, funny or even racist they might be.
OpenAI is working on a fine-tuning API for GPT-3, which would allow us to feed it more specific Help Scout information, giving it additional knowledge to draw on. That might produce better results.
Still, AI isn’t yet able to directly give the sort of nuanced, thoughtful service that helps companies stand out. It falls into a sort of uncanny valley, sounding convincingly human much of the time, but then being unsettlingly close-to-but-not-quite real.
However, there are plenty of other ways this technology really could help: offering a "best guess" suggestion for human staff, producing an automatic summary of a conversation, or offering writing tools, for example.
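As a rough illustration of the "best guess" idea, the sketch below keeps a human in the loop: a generated draft is only ever a suggestion, and nothing can be sent until an agent has accepted or rewritten it. The `Draft` class, the function names, and the stand-in generator are all hypothetical; a real setup would plug in whatever model and help desk tooling a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Draft:
    question: str
    suggested_reply: str
    approved_reply: Optional[str] = None  # stays None until a human reviews it

def suggest_reply(question: str, generate: Callable[[str], str]) -> Draft:
    """Ask the (hypothetical) text generator for a first draft."""
    return Draft(question=question, suggested_reply=generate(question))

def review(draft: Draft, edited_reply: Optional[str] = None) -> Draft:
    """A human agent accepts the suggestion as-is or replaces it."""
    if edited_reply is not None:
        draft.approved_reply = edited_reply
    else:
        draft.approved_reply = draft.suggested_reply
    return draft

def send(draft: Draft) -> str:
    """Return the text to send, refusing anything no human has approved."""
    if draft.approved_reply is None:
        raise ValueError("refusing to send a reply no human has approved")
    return draft.approved_reply

# Stand-in for a language model: always returns the same generic draft.
def fake_generate(question: str) -> str:
    return "Thanks for reaching out! I've forwarded your request to our team."

draft = suggest_reply("Will you be sending out a recording?", fake_generate)
draft = review(draft, edited_reply="Go ahead and sign up, and we'll send you a recording afterwards.")
print(send(draft))
```

The design choice here is that the approval step is enforced in code rather than left to habit, so a vague or wrong draft costs an agent a few seconds instead of reaching a customer.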
Recently Google trained a trillion-parameter AI language model that should provide even more impressive capabilities. The technology will continue to improve rapidly.
Should a small or medium-sized team be looking to engage with AI customer service tools today? Yes, but only if you have already done the work to understand what good customer service looks like in your company and how you can give your existing team the best chance of success.
AI tools can’t replace a customer-centric mindset, and they can’t make up for leadership that doesn’t value customer service. If you try, you’ll only be providing mediocre service more quickly.