IFOST Blog: May 2017

Too often we whinge and whine about problems in the legal system. Let me share two stories where the legal system did what it was supposed to.

The first: a customer of mine wasn’t paying for work that obviously had been done -- a joint marketing program of Google AdWords ads that were visible in the Google Analytics console, a class I taught and some other hard-to-deny activities. There were all sorts of excuses and promises to pay by a certain date that would come and then go without anything happening. Months went by.

I’ve run IFOST for just under 20 years and I’ve tried to avoid court at all costs, even when I’ve been in the right and could have done so. It’s better to live at peace, forgive and forget and so on.

I don’t know why I finally flipped and decided that enough was enough. I threatened legal action, and didn’t get a sensible response. So I went to lawlink and created an account. Actually, I had to create three accounts: the first one was in the name “Greg Baker” but my passport, driver’s license and Medicare card all say “Gregory Baker” so I couldn’t validate my identity. For the second I chose the wrong category of company, but I got it right the third time. About an hour wasted from some bad UI design on lawlink, but I’ve done it now.

Then I walked through an online form for a statement of claim. It was less than $10,000 so I was able to file in small claims court. It was straightforward enough to do so, but I wish there had been a better UI for calculating what the incurred interest expense was. You have to go to another page to get the annual interest rates for each year, turn that into a monthly interest rate by hand, apply those interest rates (e.g. in a spreadsheet) and then add it all up. It could have been easier.
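For anyone facing the same form, here is a minimal sketch of that calculation. It assumes simple (non-compounding) interest, and the rates, dates and amount are made up for illustration; the real annual rates come from the court's published table.

```python
# Hypothetical annual rates (percent) per calendar year.
# These are placeholders, not the court's actual figures.
ANNUAL_RATE_PERCENT = {2016: 6.0, 2017: 5.5}

def simple_interest(principal, months_overdue):
    """Add up simple interest month by month.

    months_overdue is a list of (year, month) tuples, one per overdue month.
    """
    total = 0.0
    for year, _month in months_overdue:
        # Turn the annual percentage into a monthly fraction.
        monthly_rate = ANNUAL_RATE_PERCENT[year] / 100 / 12
        total += principal * monthly_rate
    return total

# An invented debt of $8,000 that was overdue for four months.
overdue = [(2016, 11), (2016, 12), (2017, 1), (2017, 2)]
interest = simple_interest(8000.0, overdue)
print(round(interest, 2))
```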

The filing fee was just under $200 and I received all the paperwork by email. I then forwarded the most official looking PDF to my contacts and said “you are going to have to answer this case, including paying the filing fees and interest.” I had an apology that afternoon from the most senior management at my client. I posted the statement of claim, signed a stat dec that I had done so. The day that their legal team was served with it, stuff happened and I was paid.

It was such a positive experience. The rule of law was respected, it wasn’t difficult or expensive or time consuming for me to make the case and the matter was resolved far faster than if I’d tried to keep chasing up the debt the way I normally do.

The second story: I was talking to my lawyer about the experience (she wasn’t involved in the case at all, I just thought I would let her know afterwards) and noticed that she’d written a piece about changes to contract law in Australia.

Here’s the article: https://www.linkedin.com/pulse/unfair-contract-term-provisions-extended-small-business-turner

I am not a lawyer but as a small business owner dealing with multinationals all the time, all I can say is: wow. I have stacks of absurd contracts that I’ve had to sign to win business that have exposed me to a lot of risk. I’ve just relied on the good nature of my customers most of the time, and been burned occasionally. Or, sometimes, I’ve just walked away from good productive work that would benefit everyone simply because I couldn’t afford this risk.

But now I have some sort of protection: I can (and will) point out clauses that a court would disallow and leave it up to the customer to decide whether they want to keep the pointless clause in place.

In Australia, where the vast majority of the economy consists of small businesses, there really was no reason that our legal system should have allowed lopsided contracts, ever. This one small change to contract law will have a ripple effect through the Australian economy. I look forward to seeing how this plays out.

So that’s two bits of good news in a week.

I’m in Barcelona at the moment, at AtlasCamp giving a talk about helpdesk chatbots that get smarter.

It’s easy to write a dumb chatbot. It’s much harder to write a smart one that responds sensibly to everything you ask it. A famous example: if a human mentions Harrison Ford, they are probably not talking about a car.

There are three different kinds of chatbot, and they are each progressively harder to get right.

The simplest chatbots are just a convenient command-line interface: in Slack or Hipchat, there are usually “slash” commands. Developers will set up a program that wakes up whenever “/build” is entered into a room, pulls the latest sources out of git, compiles them and shows the output of the unit tests. Since this is a very narrow domain, it’s easy to get right, and as it is for the benefit of programmers, it’s always cost-effective to spend some programmer time improving anything that isn’t any good.

The next simplest are ordering bots, which control the conversation by never letting the user deviate from the approved conversational path. If you are ordering a pizza, the bot can ask you questions about toppings and sizes until it has everything it needs. Essentially this is just a replacement for a web form with some fields, but in certain markets (e.g. China) where there are near-universal chat platforms this can be quite convenient.

The hardest are bots that don’t get to control the conversation, and where the user might ask just about anything.

Support bots are examples of that last kind: users could ask the helpdesk just about anything, and the support bot needs to respond intelligently.

I did a quick survey and found at least 50 startups trying to write helpdesk bots of various kinds. It’s a lucrative market, because if you can even turn 10% of helpdesk calls into a chat with a bot, that can mean huge staff cost savings. I have a customer with over 150 full-time staff on their servicedesk -- there are millions of dollars of savings to be found.

Unfortunately, nearly every startup I’ve seen has completely failed to meet their objectives, and customers who are happy with their investments in chatbots are actually quite rare.

I’ve seen three traps:

Several startups have lacked the courage to believe in their own developers. There’s a belief that Microsoft, Facebook, Amazon, IBM and Google have all the answers, and that if we leverage api.ai or wit.ai or Lex or Watson or whatever they’ve produced this month, there’s just a simple “helpdesk knowledge and personality” layer to put on top of it, like icing on a cake. Fundamentally, this doesn’t work: for very sound commercial reasons the big players are working on technology for bots that replace web forms, and with that bias comes a number of limiting assumptions.

A lot of startups (and larger companies) believe that if you just scrape enough data from the intranet -- analyse every article in Confluence for example -- you will be able to provide exactly the right answer to the user. Others take this further and try to scrape public forums as well. This doesn’t work because firstly, users often can’t explain their problem very well, so there’s not enough information up front even to understand what the user wants; and secondly... have you actually read what IT people put into their knowledge repositories?

There are a lot of different things that can go wrong, and a lot of different ways to solve a problem. If you try to make your support chatbot fully autonomous, able to answer anything, you will burn through a lot of cash handling odd little corner cases that may never happen again.

The most promising approach I’ve seen was one taken by a startup that I was working with late last year. When they decided to head in another direction, I bought the source code back off them.

The key idea is this: if our support chatbot can’t answer every question -- as indeed it never will -- then there has to be a way for the chatbot to let a human being respond instead. If a human being does respond, then the chatbot should learn that that is how it should have responded. If the chatbot can learn, then we don’t need to do any up-front programming at all, we can just let the chatbot learn from past conversations. Or even have the chatbot be completely naive when it is first turned on.

The challenge is that in a support chat room, it’s often hard to disentangle what each answer from the support team is referring to. There are some techniques that I’ve implemented (e.g. disentangling based on temporal proximity, @ mentions and so on). A conservative approach is to have a separate bot training room where only cleanly prepared conversations happen. Taking this approach means that we replace expensive highly-paid programmers writing code to handle conversations with an intern writing some text chats.
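To give a flavour of what disentangling can look like, here is a rough sketch that uses only @ mentions and temporal proximity. It is an illustration under my own assumptions, not the code from the talk: the message format, the `attribute_reply` name and the 120-second window are all hypothetical.

```python
import re

def attribute_reply(reply, user_messages, window_seconds=120):
    """Guess which earlier user message an agent reply is answering.

    reply and each user message are dicts with "time" (in seconds),
    "speaker" and "text". Returns the matched user message, or None
    when nothing is close enough to be a plausible match.
    """
    earlier = [m for m in user_messages if m["time"] <= reply["time"]]
    mentioned = set(re.findall(r"@(\w+)", reply["text"]))
    if mentioned:
        # An @ mention wins: only consider messages from the named users.
        candidates = [m for m in earlier if m["speaker"] in mentioned]
    else:
        # Otherwise fall back to whatever was said recently.
        candidates = [m for m in earlier
                      if reply["time"] - m["time"] <= window_seconds]
    # The most recent candidate is the best guess.
    return max(candidates, key=lambda m: m["time"], default=None)

questions = [
    {"time": 0, "speaker": "alice", "text": "My VPN is down"},
    {"time": 30, "speaker": "bob", "text": "I need a password reset"},
]
with_mention = {"time": 40, "speaker": "agent", "text": "@alice which VPN client?"}
without_mention = {"time": 45, "speaker": "agent", "text": "Done, check your email"}
```

Anything that comes back as `None` is a reply the bot should not learn from, which is one reason a clean training room is the safer option.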

It’s actually not that hard to find an intern who just wants to spend all day hanging out in chat rooms.

Whatever approach you take, you will end up with a corpus of conversations: lots of examples of users asking something, getting a response from support, clarifying what they want, and then getting an answer.

Predicting the appropriate thing to say next becomes a machine learning problem: given a new, otherwise unseen data blob, predict which category it belongs to. The data blobs are all the things that have been said so far in the dialog, and the category is whatever it is that a human support desk agent is most likely to have said as a response.
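As a minimal sketch of that framing, assuming each conversation is stored as a list of (speaker, text) pairs (a made-up format, as is the `training_examples` name), every agent reply becomes one training example whose input is everything said before it:

```python
def training_examples(dialog):
    """Yield (everything said so far, the agent's next reply) pairs.

    dialog is a list of (speaker, text) tuples, where speaker is
    either "user" or "agent".
    """
    for i, (speaker, text) in enumerate(dialog):
        # Only agent replies that follow something are worth predicting.
        if speaker == "agent" and i > 0:
            yield dialog[:i], text

dialog = [
    ("user", "I can't log in"),
    ("agent", "Which system are you trying to log in to?"),
    ("user", "Jira, it says my password is wrong"),
    ("agent", "I have reset your password, check your email"),
]
examples = list(training_examples(dialog))
```

Here the four-line conversation produces two examples: one with a single message of context, and one with three.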

There is a rich mine of research articles and a lot of well-understood best practice about how to do machine learning problems with natural language text. Good solutions have been found in support vector machines, LSTM architectures for deep neural networks, and word2vec embeddings of sentences.

It turns out that techniques from the 1960s work well enough that you can code up a solution in a few hours. I used a bag-of-words model combined with logistic regression and I get quite acceptable results. (At this point, almost any serious data scientist or AI guru should rightly be snickering in the background, but bear with me.)

The bag-of-words model says that when a user asks something, you can ignore the structure and grammar of what they’ve written and just focus on key words. If a user mentions “password” you probably don’t even need to know the rest of the sentence: you know what sort of support call this is. If they mention “Windows” the likely next response is almost always “have you tried rebooting it yet?”

If you speak a language with 70,000 different words (in all their variations, including acronyms), then each message you type in a chat gets turned into an array of 70,000 elements, most of which are zeroes, with a few ones in it corresponding to the words you happen to have used in that message.
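Here is a small sketch of that encoding, with a toy vocabulary built from two messages instead of 70,000 words. The function names and the tokenising rule are assumptions for illustration; a real system would use a sparse representation rather than a full list.

```python
import re

def tokenize(message):
    """Lower-case a message and split it into words."""
    return re.findall(r"[a-z0-9']+", message.lower())

def build_vocabulary(messages):
    """Give every distinct word its own position in the array."""
    vocab = {}
    for message in messages:
        for word in tokenize(message):
            vocab.setdefault(word, len(vocab))
    return vocab

def bag_of_words(message, vocab):
    """Turn a message into an array of zeroes and ones, ignoring word order."""
    vector = [0] * len(vocab)
    for word in tokenize(message):
        if word in vocab:
            vector[vocab[word]] = 1
    return vector

vocab = build_vocabulary(["I forgot my password", "Windows will not start"])
vector = bag_of_words("my Windows password", vocab)
```

Words the vocabulary has never seen are simply dropped, and the order in which the user typed them makes no difference to the result.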

It’s rare that the first thing a support agent says is the complete and total solution to a problem. So I added a “memory” for the user and the agent. What did the user say before the last thing that they said? I implemented this by exponential decay. If your “memory” vector was x and the last thing you said was y then when you say z I’ll update the memory vector to (x/2 + y/2). Then after your next message, it will become (x/4 + y/4 + z/2). Little by little the things you said a while ago become less important in predicting what comes next.
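The update rule is a one-liner. In this sketch the vectors have three elements rather than 70,000, and `update_memory` is just an illustrative name:

```python
def update_memory(memory, previous_message):
    """Halve the old memory and fold in half of the previous message's vector."""
    return [m / 2 + p / 2 for m, p in zip(memory, previous_message)]

x = [1.0, 0.0, 0.0]  # the memory so far
y = [0.0, 1.0, 0.0]  # the last thing you said
z = [0.0, 0.0, 1.0]  # what you say now

memory = update_memory(x, y)       # x/2 + y/2, computed when you say z
memory = update_memory(memory, z)  # x/4 + y/4 + z/2, after your next message
```

Each extra message halves the weight of everything older, so a word from ten messages ago contributes less than a thousandth of its original strength.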

Combining this with logistic regression, essentially you assign a score for how strong each word is in each context as a predictor. The word “password” appearing in your last message would score highly for a response for a password reset, but the word “Windows” would be a very weak predictor for a response about a password reset. Seeing the word “Linux” even in your history would be a negative strength predictor for “have you tried rebooting it yet” because it would be very rare for a human being to have given that response.

You train the logistic regressor on your existing corpus of data, and it calculates the matrix of strengths. It’s a big matrix: 70,000 words in four different places (the last thing the user said, the last thing the support agent said, the user’s memory, and the support agent’s memory) gives you 280,000 columns, and each step of each dialog you train it on (which may be thousands of conversations) is a row.
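To make the shape of the problem concrete, here is a toy version in plain Python. It is a sketch under my own assumptions, not the system from the talk: the four slots are keyed by name rather than laid out as 280,000 columns, the training data is invented, and the hand-rolled gradient descent stands in for the well-tested library you would use in practice.

```python
import math
import random

SLOTS = ("user_last", "agent_last", "user_memory", "agent_memory")

def features(context):
    """Flatten the four word->value dicts into one sparse feature dict.

    Keys are (slot, word) pairs, standing in for the 4 x vocabulary columns.
    """
    return {(slot, word): value
            for slot in SLOTS
            for word, value in context.get(slot, {}).items()}

def scores(weights, feats, responses):
    """Softmax probability of each candidate response."""
    raw = [sum(weights.get((r, f), 0.0) * v for f, v in feats.items())
           for r in responses]
    top = max(raw)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in raw]
    total = sum(exps)
    return [e / total for e in exps]

def train(rows, responses, epochs=200, rate=0.5, seed=0):
    """Fit multinomial logistic regression by stochastic gradient descent."""
    rng = random.Random(seed)
    weights = {}
    rows = list(rows)
    for _ in range(epochs):
        rng.shuffle(rows)
        for feats, answer in rows:
            probs = scores(weights, feats, responses)
            for r, p in zip(responses, probs):
                error = (1.0 if r == answer else 0.0) - p
                for f, v in feats.items():
                    weights[(r, f)] = weights.get((r, f), 0.0) + rate * error * v
    return weights

responses = ["reset_password", "try_rebooting"]
rows = [
    (features({"user_last": {"password": 1, "forgot": 1}}), "reset_password"),
    (features({"user_last": {"password": 1, "expired": 1}}), "reset_password"),
    (features({"user_last": {"windows": 1, "frozen": 1}}), "try_rebooting"),
    (features({"user_last": {"windows": 1, "slow": 1},
               "user_memory": {"laptop": 0.5}}), "try_rebooting"),
]
weights = train(rows, responses)
probs = scores(weights, features({"user_last": {"password": 1}}), responses)
```

The `weights` dict is the matrix of strengths: one number for each combination of response, slot and word that appeared in training.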

But that’s OK, it’s a very sparse matrix and modern computers can train a logistic regressor on gigabytes of data without needing any special hardware. It’s a problem that has been well studied since at least the 1970s and there are plenty of libraries to implement it efficiently and well.

And that is all you have to do, to make a surprisingly successful chatbot. You can tweak how confident the chatbot needs to be before it speaks up (e.g. don’t say anything unless you are 95% confident that you will respond the way that a support agent will). You can dump out the matrix of strengths to see why the chatbot chose to give an answer when it gets it wrong. If it needs to learn something more or gets it wrong, you can just give it another example to work with.
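Both of those knobs are only a few lines each. In this sketch the function names, the shape of the `probabilities` and `strengths` dicts, and the numbers in them are all invented for illustration:

```python
def choose_reply(probabilities, threshold=0.95):
    """Return the most likely reply, or None to stay silent.

    probabilities maps each candidate reply to the model's probability
    for it. Staying silent leaves the question for a human to answer.
    """
    reply, confidence = max(probabilities.items(), key=lambda item: item[1])
    return reply if confidence >= threshold else None

def strongest_words(strengths, reply, top=3):
    """List the (word, strength) pairs that argue hardest for a reply.

    strengths maps (reply, word) to a learned strength.
    """
    relevant = [(word, s) for (r, word), s in strengths.items() if r == reply]
    return sorted(relevant, key=lambda item: item[1], reverse=True)[:top]

confident = choose_reply({"reset_password": 0.97, "try_rebooting": 0.03})
unsure = choose_reply({"reset_password": 0.60, "try_rebooting": 0.40})

strengths = {
    ("reset_password", "password"): 2.1,
    ("reset_password", "windows"): -0.4,
    ("try_rebooting", "windows"): 1.7,
}
why = strongest_words(strengths, "reset_password", top=1)
```

Lowering the threshold makes the bot chattier and more often wrong; raising it hands more of the conversation back to the humans.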

It’s a much cheaper approach than hiring a team of developers and data scientists, it’s much safer than relying on any here-today-gone-tomorrow AI startup, and it’s easier to support than a system that calls web APIs run by a big name vendor.

If you come along to my talk on Friday you can see me put together the whole system on stage in under 45 minutes.

Tuesday, 16 May 2017

Two happy stories about the Australian legal system

Wednesday, 3 May 2017

Chatbots that get smarter at #AtlassianSummit