ChatGPT, reviving the smart speaker.

AI assistant services have been competing in earnest since Amazon launched Alexa in 2014. From 2018 onward, a wide range of smart speakers was released, so AI assistants could be used not only through smartphone apps but also through dedicated speakers.

However, it didn’t take long for these smart speakers to become a nuisance. For one thing, the AI assistants didn’t understand speech very well. They often mistook unintended sounds for commands and woke up unprompted, disturbing the peace.

Moreover, even when they did understand what was said, their responses were often inadequate. They were useful only for basic tasks such as checking the weather, setting an alarm, playing music, or listening to the radio, and were not much help when it came to answering questions or finding information.

Thanks to ChatGPT, these smart speakers and first-generation AI assistants are getting a chance at a comeback. In August 2022, Amazon unveiled a new AI language model intended to improve Alexa. The model, part of the Alexa Teacher Models (AlexaTM) family, performs well at translation between languages and at text summarization. Even before that, in 2021, Google announced LaMDA, an open-domain conversational AI model that can take on different personas and talk about a wide range of topics.

A startup called Gorilla Technology launched an app called Super Chat, which lets users converse with AI personas of historical figures and world-famous personalities. Similar AI persona services, such as Poe from Quora, Character.ai, and D-ID, are also gaining attention.

In addition, a startup called Significant Gravitas launched AutoGPT, which uses GPT-4. Once the user sets a final goal, the AI draws up a plan, breaks it into tasks, and works through them toward that goal. While ChatGPT only responds to each human command or question, AutoGPT can, after the initial goal and instructions, prompt itself and carry out the detailed tasks needed to reach the goal.
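The goal-then-plan-then-execute loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not AutoGPT's actual code: `stub_plan` and `stub_execute` are hypothetical stand-ins for calls to a language model and to external tools, and a real agent would also feed each result back into its planning.

```python
from collections import deque
from typing import Callable, List, Tuple


def stub_plan(goal: str) -> List[str]:
    """Hypothetical stand-in for a model call that splits a goal into tasks."""
    return [f"research: {goal}", f"draft: {goal}", f"review: {goal}"]


def stub_execute(task: str) -> str:
    """Hypothetical stand-in for carrying out one task (search, writing, ...)."""
    return f"done({task})"


def run_agent(
    goal: str,
    plan: Callable[[str], List[str]] = stub_plan,
    execute: Callable[[str], str] = stub_execute,
    max_steps: int = 10,
) -> List[Tuple[str, str]]:
    """Plan once, then work through the task queue without further human input."""
    queue = deque(plan(goal))
    log: List[Tuple[str, str]] = []
    # The step cap keeps an autonomous loop from running forever.
    while queue and len(log) < max_steps:
        task = queue.popleft()
        log.append((task, execute(task)))
    return log


if __name__ == "__main__":
    for task, result in run_agent("organize my reading list"):
        print(task, "->", result)
```

The human supplies only the goal; everything after that happens inside the loop, which is the difference from a question-and-answer chatbot.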

For example, if you give the command “Get one million followers on Instagram,” AutoGPT will diligently produce content and carry out the various detailed tasks needed to reach that goal. If such models are applied to smart speakers, they could provide conversation services that go beyond expectations. Perhaps the first-generation AI assistants will now be able to realize the ideal of Jarvis from the movie Iron Man.

Josh.ai, a developer of voice-controlled home automation systems founded in 2015, has unveiled a prototype that uses the ChatGPT API to give its smart speaker far more natural and intelligent behavior than existing AI assistants offer. Thanks to ChatGPT, even when a user’s question is wrong or makes little sense in context, Josh can take the surrounding situation into account, work out what was meant, and respond appropriately.

In addition, it can operate the connected devices around it in a way that fits the context, providing a more integrated service experience. For example, if a user says, “I’m really tired today. What are some ways to relax?” the Josh smart speaker linked to ChatGPT can suggest relaxation techniques such as guided meditation, or dim the lights and play relaxing YouTube videos on the TV.
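One way to picture this is a model reply that carries both a spoken answer and a list of device actions, which the speaker checks before acting. The sketch below is an assumption for illustration, not Josh.ai's implementation: `fake_llm_reply`, the JSON layout, and the `ALLOWED_ACTIONS` table are all hypothetical.

```python
import json
from typing import Dict, List, Set, Tuple

# Hypothetical table of devices in the home and the commands each accepts.
ALLOWED_ACTIONS: Dict[str, Set[str]] = {
    "lights": {"on", "off", "dim"},
    "tv": {"play", "off"},
}


def fake_llm_reply(utterance: str) -> str:
    """Hypothetical stand-in for a chat-model call that returns JSON text."""
    return json.dumps({
        "speech": "Try a short guided meditation. I will dim the lights.",
        "actions": [
            {"device": "lights", "command": "dim", "value": 30},
            {"device": "tv", "command": "play", "value": "relaxing video"},
            {"device": "oven", "command": "on", "value": None},
        ],
    })


def plan_response(utterance: str) -> Tuple[str, List[dict]]:
    """Return the spoken reply plus only the actions the home supports."""
    reply = json.loads(fake_llm_reply(utterance))
    accepted = []
    for action in reply.get("actions", []):
        device = action.get("device")
        command = action.get("command")
        # Drop anything the model suggests for unknown devices or commands.
        if command in ALLOWED_ACTIONS.get(device, set()):
            accepted.append(action)
    return reply.get("speech", ""), accepted


if __name__ == "__main__":
    speech, actions = plan_response("I'm really tired today.")
    print(speech)
    for action in actions:
        print(action)
```

Filtering against a fixed table is a deliberate choice here: the model proposes, but only actions the home actually supports are carried out.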

Stanford students developed ‘RizzGPT’, a prototype that pairs GPT-4 with AR glasses and shows the wearer useful information as text on the glasses while they talk with other people.

The conversation between the user and the other person is picked up by the AR glasses, converted into text, and sent to GPT-4 through a connected smartphone. Information about the scene the user is looking at, such as the other person’s face, clothes, and state, along with surrounding objects and the environment, is also transmitted to GPT-4. Because the model receives not only what is said but also what is happening around the user, it can support more seamless conversations.

GPT-4 interprets this information, and the result is shown as text on the glasses’ display, helping users hold more effective conversations. In the future, it should also be possible to provide additional information in the form of pictures, video, or sound.
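The flow just described (transcript plus scene details in, short on-lens text out) might look roughly like the sketch below. It is a simplified illustration, not the students' code: `fake_model`, the prompt wording, and the display size of three lines of 24 characters are assumptions.

```python
import textwrap
from typing import List, Tuple


def build_prompt(transcript: List[Tuple[str, str]], scene: List[str]) -> str:
    """Combine what was said and what is visible into one model prompt."""
    lines = ["Conversation so far:"]
    lines += [f"{speaker}: {text}" for speaker, text in transcript]
    lines.append("Scene: " + "; ".join(scene))
    lines.append("Suggest one short reply for the wearer.")
    return "\n".join(lines)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the GPT-4 call made via the smartphone."""
    return "Ask which part of the project they enjoyed most."


def format_for_display(text: str, width: int = 24, max_lines: int = 3) -> List[str]:
    """Wrap the reply to fit a small display (size is an assumption)."""
    return textwrap.wrap(text, width=width)[:max_lines]


if __name__ == "__main__":
    transcript = [
        ("Other", "I just finished a big project."),
        ("Wearer", "Oh, nice."),
    ]
    scene = ["person smiling", "laptop on the table"]
    suggestion = fake_model(build_prompt(transcript, scene))
    for line in format_for_display(suggestion):
        print(line)
```

Keeping the display step separate from the model call means the same reply could later be routed to pictures, video, or sound instead of text.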

For example, the glasses could supply timely and accurate information during a lecture, an important presentation, or a complex electrical wiring project, making the wearer more capable. This is the Jarvis we have seen in the movies made real, thanks to AI technologies such as LLMs (Large Language Models), which make services like ChatGPT possible and point toward AGI (Artificial General Intelligence).

In this way, by integrating with smart speakers, AR glasses, and various IoT (Internet of Things) devices, ChatGPT will be able to provide functions that were previously impossible and deliver better service quality than before. Of course, ChatGPT could also go beyond its role as a virtual assistant and be built into robots, giving it a physical presence.

That raises a problem of another dimension, because ChatGPT would then enter our reality physically and not just virtually. Our society must think carefully about the social impact this technology will have, prepare for it, and take measures to ensure that the technology does not pose a threat to humanity.
