If you thought viruses mutated at lightning speed, you are clearly not in the loop about ChatGPT. The generative AI technology that started from the bottom is now soaring. Every day it wakes up and decides to ‘slay’, in all senses of the word. The advent of, developments in, concerns about and new additions to ChatGPT have put it in a league of its own, to the point where there is a separate news column dedicated to keeping tabs on it.
It sees, it speaks, it hears
If there is one thing that ChatGPT dislikes more than prompts about ‘how to commit a crime’ (rightfully so), it is its contenders who are also chatbot leaders: Microsoft, Google and Anthropic. Out of fear that the rest will steal its main character vibes, and also because it is fun to explore the limitless potential of generative AI, OpenAI’s ChatGPT is now able to ‘see, hear and speak’. On 25 September the company announced that the latest upgrade to the chatbot, the most significant since GPT-4’s introduction, permits users of the ChatGPT mobile app to opt for voice conversations. Users can select from five distinct synthetic voices for the bot’s responses. Additionally, users can share images with ChatGPT and highlight specific areas for discussion or analysis.
This addition to ChatGPT is a significant example of a multimodal AI system. Rather than relying on a single AI model tailored for a specific type of input, such as a large language model (LLM) or a text-to-speech model, this approach has multiple models working together to form a more unified and versatile AI tool. The first step in enabling ChatGPT’s multimodal capabilities is incorporating image and voice input. Because ChatGPT is a user-centric application, it makes sense to start with these two data types, which are among the most frequently used by individuals. There is, however, no inherent limitation preventing an AI model from being trained to handle various other types of data, ranging from Excel spreadsheets and 3D models to photos accompanied by in-depth information.
OpenAI planned to roll out these changes to its paying users within two weeks of the announcement. The voice feature was to be exclusive to the iOS and Android apps, while the image processing capabilities were to be accessible across all platforms. This mammoth update coincides with the escalating competition in the artificial intelligence chatbot arena, which involves all the aforementioned key players. These tech giants are in a race not only to launch new chatbot applications but also to introduce new features. For example, Google has unveiled a series of enhancements for its Bard chatbot, and Microsoft has added visual search functionality to Bing.
Microsoft’s additional investment in OpenAI earlier this year, totalling $10 billion, was the largest AI investment of the year, according to PitchBook. In April, OpenAI reportedly concluded a $300 million share sale, valuing the startup at between $27 billion and $29 billion. This funding round saw participation from notable firms like Sequoia Capital and Andreessen Horowitz.
Of course, concerns were flagged about AI-generated synthetic voices because they can be used to create deepfakes that can easily fool humans. This worry was raised not merely by lay people but also by cybersecurity professionals and researchers who are in the know about the risks associated with deepfakes. OpenAI addressed the question by stating that the synthetic voices were designed in collaboration with voice actors it had a direct working relationship with. Still, as is always the case with technology, such claims should be taken with a pinch of salt.
It is no longer limited, it is now updated
Another growth spurt that ChatGPT has had is that it can access new and fresh data. In simple terms, it can browse the internet. Initially, it could only give us information up to September 2021. This was one of the major concerns voiced by people far and wide, because the lack of up-to-date information could leave out important details needed to draw unbiased inferences. Up until now, the famous chatbot’s knowledge was stuck at one point in time. Overcoming that limitation, ChatGPT can now use the internet to offer users real-time information, thanks to its integration with Bing search.
Real-time ChatGPT functionality is initially accessible to ChatGPT Plus subscribers and Enterprise clients. Users can activate this feature by selecting the GPT-4 version and opting for ‘Browse with Bing.’ OpenAI has plans to extend access to a broader user base in the future. It’s worth noting that OpenAI had conducted tests involving internet access for ChatGPT in May but temporarily disabled it due to concerns that it could potentially be used to circumvent paywalls.
Yet again, this development runs parallel to Meta’s introduction of its ChatGPT competitor, Meta AI, which leverages Bing search to provide real-time internet-based answers. Microsoft, OpenAI’s investor and partner, has also launched its AI-powered Bing Chat, capable of delivering real-time responses.
This scenario has two sides: the good and the bad. The good news is that the information is now much more reliable than what it spewed before and is duly updated, thus bridging the information gap. As usual, the bad news outweighs the good, because ChatGPT’s previous inability to access all data had provided a buffer. That buffer kept it from generating harmful or illegal content that may have recently surfaced on the internet in response to a query. It was also programmed to refrain from disseminating misinformation propagated by people with ill intentions, particularly in areas like politics or healthcare decisions. The conundrum is that, in order to realise its full potential, the restrictions built around ChatGPT need to come off. However, the minute the guardrails loosen a bolt or two, the technology morphs into a more dangerous version of itself, one that will be readily misused. Striking a balance is the solution, but even all the trust and faith placed in humankind does not guarantee that such measures will be taken. We are known to function at extremes.
What is clear, therefore, is that as the developments keep coming, worries will continue to be raised, and there is no telling how far ChatGPT will evolve. Some of us worry that ChatGPT is flying too close to the sun, but tech aficionados are well aware that, with the way it is handling things, it will never pull an Icarus, even if humanity does.
(Sandunlekha Ekanayake)