If AI struggles with fourth-grade science question answering, should AI be expected to hold an adult-level, open-ended chit-chat about politics, entertainment, and weather? It is thus encouraging to see that Microsoft’s Satya Nadella did not give up on Tay after its debacle, and Amazon’s Jeff Bezos is sponsoring an Alexa social chatbot competition. I love this below quote from Jeff:
AllAgriculture (24) AI & ML (142) AR, VR, & MR (65) Asset Tracking (53) Blockchain (21) Building Automation (38) Connectivity (148) Bluetooth (12) Cellular (38) LPWAN (38) Data & Analytics (131) Devices & Sensors (174) Digital Transformation (189) Edge & Cloud Computing (54) Energy & Utilities (42) Finance & Insurance (10) Industrial IoT (101) IoT Platforms (81) Medical & Healthcare (47) Retail (28) Security (139) Smart City (88) Smart Home (91) Transport & Supply Chain (59) UI & UX (39) Voice Interaction (33)
It may be tempting to assume that users will navigate across dialogs, creating a dialog stack, and at some point will navigate back in the direction they came from, unstacking the dialogs one by one in a neat and orderly way. For example, the user will start at root dialog, invoke the new order dialog from there, and then invoke the product search dialog. Then the user will select a product and confirm, exiting the product search dialog, complete the order, exiting the new order dialog, and arrive back at the root dialog.
This means our questions must fit with the programming they have been given. Using our weather bot as an example once more, the question ‘Will it rain tomorrow’ could be answered easily. However if the programming is not there, the question ‘Will I need a brolly tomorrow’ may cause the chatbot to respond with a ‘I am sorry, I didn’t understand the question’ type response.
As AOL's David Shingy writes in Adweek, "The challenge [with chatbots] will be thinking about creative from a whole different view: Can we have creative that scales? That customizes itself? We find ourselves hurtling toward another handoff from man to machine -- what larger system of creative or complex storytelling structure can I design such that a machine can use it appropriately and effectively?"
It won’t be an easy march though once we get to the nitty-gritty details. For example, I heard through the grapevine that when Starbucks looked at the voice data they collected from customer orders, they found that there are a few millions unique ways to order. (For those in the field, I’m talking about unique user utterances.) This is to be expected given the wild combinations of latte vs mocha, dairy vs soy, grande vs trenta, extra-hot vs iced, room vs no-room, for here vs to-go, snack variety, spoken accent diversity, etc. The AI practitioner will soon curse all these dimensions before taking a deep learning breath and getting to work. I feel though that given practically unlimited data, deep learning is now good enough to overcome this problem, and it is only a matter of couple of years until we see these TODA solutions deployed. One technique to watch is Generative Adversarial Nets (GAN). Roughly speaking, GAN engages itself in an iterative game of counterfeiting real stuffs, getting caught by the police neural network, improving counterfeiting skill, and rinse-and-repeating until it can pass as your Starbucks’ order-taking person, given enough data and iterations.
Pop-culture references to Skynet and a forthcoming “war against the machines” are perhaps a little too common in articles about AI (including this one and Larry’s post about Google’s RankBrain tech), but they do raise somewhat uncomfortable questions about the unexpected side of developing increasingly sophisticated AI constructs – including seemingly harmless chatbots.
ELIZA's key method of operation (copied by chatbot designers ever since) involves the recognition of clue words or phrases in the input, and the output of corresponding pre-prepared or pre-programmed responses that can move the conversation forward in an apparently meaningful way (e.g. by responding to any input that contains the word 'MOTHER' with 'TELL ME MORE ABOUT YOUR FAMILY'). Thus an illusion of understanding is generated, even though the processing involved has been merely superficial. ELIZA showed that such an illusion is surprisingly easy to generate, because human judges are so ready to give the benefit of the doubt when conversational responses are capable of being interpreted as "intelligent".