
Wednesday, 3 May 2017

Chatbots that get smarter at #AtlassianSummit

I’m in Barcelona at the moment, at AtlasCamp giving a talk about helpdesk chatbots that get smarter.

It’s easy to write a dumb chatbot. It’s much harder to write a smart one that responds sensibly to everything you ask it. A famous example: if a human mentions Harrison Ford, they are probably not talking about a car.
There are three different kinds of chatbot, and each is progressively harder to get right.
  1. The simplest chatbots are just a convenient command-line interface: in Slack or HipChat, these are usually “slash” commands. Developers will set up a program that wakes up whenever “/build” is typed into a room, pulls the latest sources out of git, compiles them and shows the output of the unit tests (a minimal sketch appears after this list). Since this is a very narrow domain, it’s easy to get right, and since it’s for the benefit of programmers, it’s always cost-effective to spend some programmer time improving anything that isn’t any good.
  2. The next simplest are ordering bots, that control the conversation by never letting the user deviate from the approved conversational path. If you are ordering a pizza, the bot can ask you questions about toppings and sizes until it has everything it needs. Essentially this is just a replacement for a web form with some fields, but in certain markets (e.g. China) where there are near-universal chat platforms this can be quite convenient.
  3. The hardest are bots that don’t get to control the conversation, and where the user might ask just about anything.
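To make that first kind concrete, here is a minimal sketch of a “/build” handler in Python; the command string, the build commands and the reply callback are illustrative assumptions, not any particular chat platform’s API.

```python
import subprocess

def handle_message(text, reply):
    """Minimal slash-command dispatcher: a narrow domain, easy to get right."""
    if text.strip().startswith("/build"):
        # Pull the latest sources, compile them, and run the unit tests.
        result = subprocess.run(
            "git pull && make && make test",
            shell=True, capture_output=True, text=True,
        )
        reply(result.stdout + result.stderr)
```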
Support bots are examples of that last kind: users could ask the helpdesk just about anything, and the support bot needs to respond intelligently.
I did a quick survey and found at least 50 startups trying to write helpdesk bots of various kinds. It’s a lucrative market, because if you can turn even 10% of helpdesk calls into a chat with a bot, that can mean huge staff cost savings. I have a customer with over 150 full-time staff on their service desk -- there are millions of dollars of savings to be found.
Unfortunately, nearly every startup I’ve seen has completely failed to meet their objectives, and customers who are happy with their investments in chatbots are actually quite rare.
I’ve seen three traps:
  • Several startups have lacked the courage to believe in their own developers. There’s a belief that Microsoft, Facebook, Amazon, IBM and Google have all the answers, and that if we leverage api.ai or wit.ai or Lex or Watson or whatever they’ve produced this month, there’s just a simple “helpdesk knowledge and personality” layer to put on top, like icing on a cake. Fundamentally, this doesn’t work: for very sound commercial reasons the big players are building technology for bots that replace web forms, and with that bias comes a number of limiting assumptions.
  • A lot of startups (and larger companies) believe that if you just scrape enough data from the intranet -- analyse every article in Confluence for example -- that you will be able to provide exactly the right answer to the user. Others take this further and try to scrape public forums as well. This doesn’t work because firstly, users often can’t explain their problem very well, so there’s not enough information up front even to understand what the user wants; and secondly... have you actually read what IT people put into their knowledge repositories?
  • There are a lot of different things that can go wrong, and a lot of different ways to solve a problem. If you try to make your support chatbot fully autonomous, able to answer anything, you will burn through a lot of cash handling odd little corner cases that may never happen again.
The most promising approach I’ve seen was one taken by a startup that I was working with late last year. When they decided to head in another direction, I bought the source code back off them.
The key idea is this: if our support chatbot can’t answer every question -- as indeed it never will -- then there has to be a way for the chatbot to let a human being respond instead. If a human being does respond, then the chatbot should learn that that is how it should have responded. If the chatbot can learn, then we don’t need to do any up-front programming at all, we can just let the chatbot learn from past conversations. Or even have the chatbot be completely naive when it is first turned on.
The challenge is that in a support chat room, it’s often hard to disentangle what each answer from the support team is referring to. There are some techniques that I’ve implemented (e.g. disentangling based on temporal proximity, @-mentions and so on). A conservative approach is to have a separate bot-training room where only cleanly prepared conversations happen. Taking this approach means that we replace expensive, highly-paid programmers writing code to handle conversations with an intern writing some text chats.
It’s actually not that hard to find an intern who just wants to spend all day hanging out in chat rooms.
Whatever approach you take, you will end up with a corpus of conversations: lots of examples of users asking something, getting a response from support, clarifying what they want, and then getting an answer.
Predicting the appropriate thing to say next becomes a machine learning problem: given a new, otherwise unseen data blob, predict which category it belongs to. The data blobs are all the things that have been said so far in the dialog, and the category is whatever it is that a human support desk agent is most likely to have said as a response.
There is a rich mine of research articles and a lot of well-understood best practice about how to do machine learning on natural language text. Good solutions have been found in support vector machines, LSTM architectures for deep neural networks, and word2vec embeddings of sentences.
It turns out that techniques from the 1960s work well enough that you can code up a solution in a few hours. I used a bag-of-words model combined with logistic regression and I get quite acceptable results. (At this point, almost any serious data scientist or AI guru should rightly be snickering in the background, but bear with me.)
The bag-of-words model says that when a user asks something, you can ignore the structure and grammar of what they’ve written and just focus on key words. If a user mentions “password” you probably don’t even need to know the rest of the sentence: you know what sort of support call this is. If they mention “Windows” the likely next response is almost always “have you tried rebooting it yet?”
If you speak a language with 70,000 different words (in all their variations, including acronyms), then each message you type in a chat gets turned into an array of 70,000 elements, most of which are zeroes, with a few ones in it corresponding to the words you happen to have used in that message.
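As a sketch of that vectorisation (a toy five-word vocabulary standing in for the full 70,000-word one; in practice you would use a library vectoriser):

```python
import numpy as np

# Toy vocabulary standing in for the full 70,000-word one.
vocabulary = {"password": 0, "reset": 1, "windows": 2, "reboot": 3, "printer": 4}

def bag_of_words(message, vocab=vocabulary):
    """Turn a chat message into a vector of zeroes and ones, one per known word."""
    vec = np.zeros(len(vocab))
    for word in message.lower().split():
        if word in vocab:
            vec[vocab[word]] = 1.0
    return vec

print(bag_of_words("My Windows password reset failed"))  # [1. 1. 1. 0. 0.]
```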
It’s rare that the first thing a support agent says is the complete and total solution to a problem, so I added a “memory” for the user and the agent: what did the user say before the last thing they said? I implemented this as an exponential decay. If your “memory” vector was x and the last thing you said was y, then when you say z I’ll update the memory vector to (x/2 + y/2). After your next message, it becomes (x/4 + y/4 + z/2). Little by little, the things you said a while ago become less important in predicting what comes next.
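The memory update itself is a one-liner; here is a sketch that reproduces the arithmetic above:

```python
import numpy as np

def update_memory(memory, previous_message):
    """When a new message arrives, fold the previous one into the memory
    vector, halving the weight of everything said before it."""
    return (memory + previous_message) / 2.0

# Worked example matching the text: memory x, then you say y, then z.
x, y, z = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
m = update_memory(x, y)   # x/2 + y/2
m = update_memory(m, z)   # x/4 + y/4 + z/2
```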
Combining this with logistic regression, you essentially assign each word a strength as a predictor in each context. The word “password” appearing in your last message would score highly for a password-reset response, but the word “Windows” would be a very weak predictor of a response about a password reset. Seeing the word “Linux”, even in your history, would be a negative-strength predictor for “have you tried rebooting it yet?”, because it would be very rare for a human being to have given that response.
You train the logistic regressor on your existing corpus of data, and it calculates the matrix of strengths. It’s a big matrix: 70,000 words in four different places (the last thing the user said, the last thing the support agent said, the user’s memory, and the support agent’s memory) gives you 280,000 columns, and each step of each dialog you train it on (which may be thousands of conversations) is a row.
But that’s OK, it’s a very sparse matrix and modern computers can train a logistic regressor on gigabytes of data without needing any special hardware. It’s a problem that has been well studied since at least the 1970s and there are plenty of libraries to implement it efficiently and well.
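A sketch of that training step with scikit-learn, which accepts sparse matrices directly (the tiny corpus here is an illustrative stand-in for real dialog data):

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import LogisticRegression

# Each row: the four concatenated bag-of-words vectors (user's last message,
# agent's last message, user memory, agent memory), here shrunk to 8 columns.
# Each label: the response the human agent actually gave at that step.
X = csr_matrix(np.array([
    [1, 0, 0, 0, 0, 0, 0, 0],   # user mentioned "password"
    [0, 0, 1, 0, 0, 0, 0, 0],   # user mentioned "windows"
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
]))
y = ["password_reset", "try_rebooting", "password_reset", "try_rebooting"]

model = LogisticRegression(max_iter=1000)
model.fit(X, y)                 # model.coef_ is the matrix of strengths
```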
And that is all you have to do to make a surprisingly successful chatbot. You can tweak how confident the chatbot needs to be before it speaks up (e.g. don’t say anything unless you are 95% confident that you will respond the way a support agent would). When it gets an answer wrong, you can dump out the matrix of strengths to see why it chose that answer, and if it needs to learn something more you can just give it another example to work with.
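That confidence gate can be as simple as a threshold on the predicted probability. A sketch, reusing the model trained above (the canned responses are hypothetical):

```python
import numpy as np

CANNED = {
    "password_reset": "I can reset that for you -- check your email.",
    "try_rebooting": "Have you tried rebooting it yet?",
}

def maybe_respond(model, features, threshold=0.95):
    """Speak up only when confident; otherwise stay silent so a human answers,
    and the exchange becomes a new training example."""
    probs = model.predict_proba(features)[0]
    best = int(np.argmax(probs))
    if probs[best] >= threshold:
        return CANNED[model.classes_[best]]
    return None
```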
It’s a much cheaper approach than hiring a team of developers and data scientists, it’s much safer than relying on any here-today-gone-tomorrow AI startup, and it’s easier to support than a system that calls web APIs run by a big name vendor.
If you come along to my talk on Friday you can see me put together the whole system on stage in under 45 minutes.

Tuesday, 23 February 2016

Draining the meeting bogs and how not to suffer from email overload (part 3)

This is the third in my series of blog posts about how we unknowingly often let our IT determine how we communicate, and what to do about it.

Teams need to communicate at three different speeds: Tomes, Task Tracking and Information Ping-pong. When we don't have the right IT support for all three, things go wrong.

In this post, I'll talk about Information Ping-Pong. Information Ping-Pong is the rapid communication that makes your job efficient: you ask the expert a question and you get an answer back immediately, because that's their area and they know all about it.

It's great: you can stay in the flow and get much, much more done. It's the grail of an efficient organisation: using the assembled team of experts to the full.

Unfortunately, what I see in most organisations is that they try to use email for this.

It doesn't work.

Occasionally the expert might respond quite quickly, but there can be long delays for no reason that is obvious to you. You can't see what they are doing -- are they handling a dozen other queries at the same time? And worse: it all just contributes to everyone's over-full inbox.

The only alternative in most organisations is to prepare a list of questions and call a meeting with the relevant expert. This works better than email, but it's hard to schedule a five-minute meeting if that's all you need. Often the bottom half of the list of prepared questions doesn't make sense in the light of the answers to the first half, and the blocked-out time is simply wasted.

The solution which has worked quite well for many organisations is text-chat, but there are four very important requirements for this to work well.

GROUP FIRST Text chats should be sent to virtual rooms; messages shouldn't be directed at an individual. If you initiate a text-chat with an individual, you are duplicating all the problems of email overload while also expecting the right to interrupt the recipient.

DISTURBED ROLE There needs to be a standard alias (traditionally called "disturbed") for every room. Typically one person per team is assigned the "disturbed" role for the day, and they attempt to respond on behalf of the whole team. This leaves the rest of the team free to get on with their work, but still gives the rest of the organisation the instant access to an expert that helps it so much. (Large, important teams might need two or more people acting in the disturbed role at a time.)

HISTORY The history of the room should be accessible. This lets non-team members lurk and save on asking the same question that has already been answered three times that day.

BOT-READY Make sure the robots are talking, and plan for them to be listening. If a job completes, an event occurs, or there is any other "news" that can be sent automatically to a room, integrate a robot into your text chat tool to send it. This saves wasted time for the person performing the "disturbed" role.
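For example, a build robot can post its "news" with a single HTTP call. A sketch against HipChat's v2 room-notification API (the token and room name are placeholders):

```python
import requests

HIPCHAT_TOKEN = "..."   # room notification token (placeholder)
ROOM = "helpdesk"       # room name (placeholder)

def notify(message):
    """Post an automated notification into the team's room."""
    requests.post(
        "https://api.hipchat.com/v2/room/{}/notification".format(ROOM),
        headers={"Authorization": "Bearer " + HIPCHAT_TOKEN},
        json={"message": message, "notify": False},
    )

notify("Nightly build finished: all unit tests passed")
```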

Most text chat tools also have "slash" commands or other ways of directing a question or instruction to a robot. These are evolving into tools that understand natural language and will be one of the most significant and disruptive changes to the way we "do business" over the next ten years.


Skype and Lotus Notes don't do a very good job on any of the requirements listed above. Consumer products (such as WhatsApp) are almost perfectly designed to do the opposite of what's required. WeChat (common in China) stands slightly above in that at least it has an API for bots.

The up-and-coming text chat tool is a program called "Slack", although Atlassian's HipChat is a little more mature and is better integrated with the Atlassian suite of Confluence and Jira.

Unlike most of the tools I've written about in this series, the choice of text chat tool really has to be done at a company level. It is difficult for a team leader or individual contributor to drive the adoption from the grassroots up; generally it's an IT decision about which tool to use, and then a culture change (from top management) to push its usage. Fortunately, these text chat tools are extraordinarily cheap (the most expensive I've seen is $2 per month per user), and most have some kind of free plan that is quite adequate. Also, there's a good chance that a software development group will already be using Hipchat, which means that adoption can grow organically from a starting base.

Outside of a few startups, text-chat is very rare. And also outside of a few startups, everything takes far longer than you expect and inter-team communication is painfully slow. That's not a coincidence. We think this mess is normal, but it's just driven by the software we use to intermediate our communications.




The next post in the series will hopefully be next Tuesday.

I'm hoping to put this (and many other thoughts) together in a book (current working title: "Bimodal, Trimodal, Devops and Tossing it over the Fence: a better practices guide to supporting software") -- sign up for updates about the book here:  http://eepurl.com/bMYBC5

If you like this series -- and care about making organisations run better with better tools -- you'll probably find my automated estimator of effort and duration very interesting.


Greg Baker (gregb@ifost.org.au) is a consultant, author, developer and start-up advisor. His recent projects include a plug-in for Jira Service Desk which lets helpdesk staff tell their users how long a task will take, and a wet-weather information system for school sports.

Tuesday, 9 February 2016

Draining the meeting bogs and how not to suffer from email overload (part 1)


Some meetings are important; sometimes face-to-face is the best way to work through an issue. And email is a necessary business tool. But in many of the organisations I work with, I've seen meetings and emails used as a crutch because their staff aren't given what they need in order to work more efficiently.

I blame IT for this, perhaps too harshly, but IT should be thinking both about how individuals communicate, and also about the requirements for teams to communicate.

In general, there are three broad ways that teams communicate:
  • With Tomes that answer "Who are we? What do we do? How did we get here? Why are we doing this?"
  • Using different ways to say "We're working on it"
  • By playing Information Ping-pong


To be efficient, it's important that staff from outside the team can "lurk" (watch what is going on) across all three methods without engaging the team.

If other staff can't lurk, they will either email you or ask for a meeting.

What I see all too often is desperate staff, who are over-worked because they are forced to use email and meetings -- tools which are very ill-suited to all three speeds of communication.

I'll discuss each of them in follow-up blog posts; I'll schedule them for Tuesday each week unless something else more interesting crops up.

I'm hoping to put this together in a book (current working title: "Bimodal, Trimodal, Devops and Tossing it over the Fence: a better practices guide to supporting software") -- sign up for updates about the book here:  http://eepurl.com/bMYBC5

If you like this series -- and care about making organisations run better with better tools -- you'll probably find my automated estimator of effort and duration very interesting.


Greg Baker (gregb@ifost.org.au) is a consultant, author, developer and start-up advisor. His recent projects include a plug-in for Jira Service Desk which lets helpdesk staff tell their users how long a task will take, and a wet-weather information system for school sports.

Wednesday, 11 November 2015

An Atlassian story

A little while back I was talking with two Daves.
  • Dave #1 is the CIO of a college / micro-university
  • Dave #2 is on the board of a small airline. 
Because I tend to get involved in non-traditional projects, they often ask me what I'm working on, probably out of amusement more than anything else. At the time I was building a service catalog for Atlassian (and now I have an awesome plug-in on their marketplace).

Neither had any idea who Atlassian was or what it did, which was no surprise to me. I don't quite understand why an Australian company with $200m+ in revenue and a market cap in the billions, which isn't a miner, telco or bank, isn't memorable.

Still, the tools Atlassian makes are mostly used by software developers, and my circle of friends and acquaintances doesn't include many coders so "Atlassian makes the software to help people make software" isn't a good way of describing what they do.

Other than a brief stint at Google, I haven't been a full-time employee or even long-term contractor in any normal company this century, so instead I talked about what's unusual and different at Atlassian based on all the other organisations I've worked with. I talked about how the very common assumptions about the technologies to co-ordinate a business are quite different here.

Type of communication | Corporate default (aka "what most companies do") | What is generally done at Atlassian
Individual-to-individual | Email, or talk over coffee. | HipChat @-mention of the individual in a room related to the topic.
Individual-to-group | Teleconference / webinar, for all matters big or small. | A comment in the HipChat room; or sometimes Google Hangouts or a HipChat video conference if it's something long and important.
Reporting (project status, financials) | Excel document or similar. | Confluence status page or JIRA board.
Proposal | Word document or PowerPoint presentation. | Confluence page.
Feedback on proposals | Private conversation with the person who proposed it, or maybe a forum page somewhere on their SharePoint portal. | The discussion in the comments section of the Confluence page.

And yeah, my secret superpower is to be able to narrate HTML tables in speakable form. There was a lot of "on the one hand... on the other hand at Atlassian..."

Note that the Atlassian column is all public and searchable (in line with being an open company), and the corporate-default column is not. Also, in the Atlassian column you opt in to the information source; in the corporate-default column, the sender of the information chooses who to share it with.

Why is this interesting? Because the Atlassian technologies and the Atlassian way of doing things is an immune system against office politics.

To get really nasty office politics, you need an information asymmetry: managers need to be able to withhold information from other managers in order to get pet projects and favourite people promoted, and to drag others down by releasing information at the worst possible moment, when the other party can't prepare for it. (Experience and being middle-aged: you see far too much that you wish you hadn't.) When information is shared only with the people you choose to share it with, that's easy to do.

At Atlassian, it's not quite like that. Sure, there are still arguments and disagreements. Sometimes there is jostling for position and disagreements about direction and there are people and projects we want to see happen. And not every bad idea dies as quickly as it should. But because other interested parties can opt-in as required for their needs it's a very level playing field and ---

And that's where Dave #1 cut me off, shook my hand and said, "Wow. Thank you. That's exactly it. That's EXACTLY it."

Dave #2 just nodded sagely. "Yes," he said, and then again more slowly: "yes."

Summary: Atlassian sell office-politics treatments.

Greg Baker (gregb@ifost.org.au) is a consultant, author, developer and start-up advisor. His recent projects include a plug-in for Jira Service Desk which lets helpdesk staff tell their users how long a task will take, and a wet-weather information system for school sports.