“What technology is behind Sophia?”

Sophia at the AI for Good Global Summit in 2018. Image from Wikipedia: https://en.wikipedia.org/wiki/File:Sophia_at_the_AI_for_Good_Global_Summit_2018_(27254369807)_(cropped).jpg
By Scott Hamilton
Last week I wrote about some of the robots at the United Nations (UN) Artificial Intelligence (AI) for Good conference. One robot in particular is rather impressive: Sophia, from Hanson Robotics.
Sophia is the first AI-powered humanoid robot to gain citizenship of a nation. She was granted Saudi Arabian citizenship in October 2017 and was named the UN Development Program’s first Innovation Champion, also making her the first non-human to be given a UN title. This week I would like to take a closer look at the technologies behind Sophia. Is she really an independent personality, or is she simply a complex set of computer algorithms? I will let you decide.
First, Sophia uses computer vision technology to recognize and track faces. She is also able to recognize and respond to facial expressions and gestures using standard image processing algorithms. There is nothing fantastic or new about computer vision algorithms; they have been around in pretty much the same form since the 1960s. What has changed is the hardware. Computers have become fast enough to process image data in real time, where the same work once took several hours or even days.
Next on the list we have natural language processing to understand and respond to human speech. The very early language processing algorithms could only recognize a few words. I built a voice-controlled robot as a senior design project in 1995; it only recognized seven commands: “forward,” “go back,” “left turn,” “right,” “stop,” “on” and “off.” As it turns out, I was cheating. The robot did not really recognize words; it counted spikes in the sound waves. Each command had to produce a different number of spikes, which is why the commands were “left turn” and “right” instead of “left” and “right.” I can’t entirely remember how I settled on the commands, but to me speech recognition is one of the most impressive and useful features in Sophia. I do have to wonder whether she does better than the automated operators you get when calling a credit card company or bank. I imagine so, since many voice-operated phone systems seem to rely on a similar spike-count method to recognize words, which would explain their limited vocabulary.
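The spike-count trick described above can be sketched in a few lines of Python. This is an illustration only, not the original 1995 design: the threshold, the `min_gap` parameter, the synthetic waveforms and the mapping from spike counts to commands are all assumptions made for the example.

```python
def count_spikes(samples, threshold=0.5, min_gap=3):
    """Count loud bursts in a list of audio samples.

    A new spike is counted only after at least `min_gap` consecutive
    quiet samples, so one loud burst is not counted several times.
    """
    spikes = 0
    quiet_run = min_gap  # start "quiet" so the first burst counts
    for s in samples:
        if abs(s) >= threshold:
            if quiet_run >= min_gap:
                spikes += 1
            quiet_run = 0
        else:
            quiet_run += 1
    return spikes


# Hypothetical mapping from spike count to command. The real counts
# used by the 1995 robot are not given in the article.
COMMANDS = {1: "stop", 2: "right", 3: "left turn"}


def recognize(samples):
    """Return the command whose spike count matches, or 'unknown'."""
    return COMMANDS.get(count_spikes(samples), "unknown")


def synthetic_utterance(n_bursts, gap_len=6):
    """Build a fake waveform: n_bursts loud bursts separated by silence."""
    wave = [0.0] * gap_len
    for _ in range(n_bursts):
        wave += [0.9, -0.8, 0.7, -0.9] + [0.0] * gap_len
    return wave


print(recognize(synthetic_utterance(3)))
```

The sketch also shows why such a system has a tiny vocabulary: any two commands that happen to produce the same number of bursts are indistinguishable, so every command needs its own count.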
Now come the two most confusing technologies in Sophia: AI and Machine Learning. I still find it odd that the two fields of study are separate, especially since they are so closely linked. Machine Learning is the process used to teach an AI new tasks, much like we teach a child to speak in complete sentences, read, write and study. There is a strong belief that modern machines learn like the human brain. An AI of this kind consists of artificial neurons that are linked by a series of programmable synapses. The computer brain learns new things by modifying the synapses, so that the firing of certain neurons affects other neurons in different ways. Each neuron represents a portion of the learned information. Training an AI as complex as Sophia would require several years of computation to teach the neural network how to properly interact with her human counterparts.
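The idea of learning by adjusting synapses can be shown with the simplest possible case, a single artificial neuron (a classic perceptron) learning the logical AND of two inputs. This is a teaching sketch, not a description of how Sophia is trained; the function names, the learning rate and the choice of AND as the task are assumptions for illustration. The "synapses" here are the two weights and the bias.

```python
def fires(weights, bias, inputs):
    """The neuron fires (returns 1) when its weighted input sum is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples, epochs=20, lr=1):
    """Perceptron learning rule: nudge each synapse after every mistake."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - fires(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Training data for logical AND: output 1 only when both inputs are 1.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_EXAMPLES)
for inputs, target in AND_EXAMPLES:
    print(inputs, "->", fires(weights, bias, inputs), "expected", target)
```

Nothing in the program states the rule for AND. The behavior ends up stored entirely in the three numbers the training loop settles on, which is the sense in which a neural network's knowledge lives in its synapses.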
We often mistakenly believe that it requires a lot of computer power to pull off AI. In reality, the bulk of the computation is used in the initial training (Machine Learning) stage, and very little computing power is required to operate the neural network (AI) once it has been trained. An AI is really nothing more than a complex table of linked information with a fast indexing system that lets the robot recall information quickly in response to human speech and actions. I am of the opinion that the intelligent part of AI is actually in the Machine Learning process.
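A rough back-of-the-envelope count makes the gap between training and operating a network concrete. The network size, the number of training examples and the number of passes over the data below are all hypothetical, and the assumption that a training step costs about three times a forward pass (one forward pass plus a backward pass of roughly twice the cost) is only a common rule of thumb.

```python
def forward_multiply_adds(layer_sizes):
    """Multiply-adds for one pass through a fully connected network."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def training_multiply_adds(layer_sizes, n_examples, n_epochs, backward_factor=2):
    """Estimated multiply-adds for the whole training run.

    Each example costs one forward pass plus a backward pass assumed
    to cost `backward_factor` times the forward pass.
    """
    per_example = forward_multiply_adds(layer_sizes) * (1 + backward_factor)
    return per_example * n_examples * n_epochs


layers = [100, 50, 10]  # hypothetical: 100 inputs, 50 hidden neurons, 10 outputs

inference = forward_multiply_adds(layers)
training = training_multiply_adds(layers, n_examples=10_000, n_epochs=20)

print("one answer from the trained network:", inference)
print("training the network:", training)
print("training is roughly", training // inference, "times more work")
```

Under these assumptions, a single response costs 5,500 multiply-adds while training costs 3.3 billion, a factor of 600,000. The exact numbers are invented, but the shape of the result is the point: training is paid once, up front, and each later response is cheap by comparison.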
The final component of Sophia has also been in use for several decades, and that is the mechanics behind her movements and facial expressions. Many of the components used in building Sophia’s mechanical body came from the medical industry. The development of artificial limbs was mainly driven by the medical community, whereas the intelligence came out of computer science and psychology research. Though impressive, there is nothing really spectacular when it comes to her robotics.
If you are interested in learning more about Sophia’s technologies, Hanson Robotics offers a software development kit for interacting directly with Sophia, or at least with her intelligence. That intelligence is not actually on the robot; it is back at a central control center and powered by several computers. According to Facebook’s head of AI, Yann LeCun, it is really a group of human puppeteers deliberately deceiving the public. LeCun believes Sophia is a “BS Puppet” and not driven by a true AI at all. Until next week, stay safe and learn something new.
Scott Hamilton is an Expert in Emerging Technologies at ATOS and can be reached with questions and comments via email to sh*******@**********rd.org or through his website at https://www.techshepherd.org.