September 30, 2024

OpenAI Says Its New Models Perform Like An ‘Extremely Smart PhD’

OpenAI says it has fixed an issue with ChatGPT that appeared to cause the chatbot to message users unprompted. In a strange exchange shared on Reddit, ChatGPT asked a user, “How was your first week at high school?” The user replied, “Did you just message me first?” The company said the problem was caused by a previous message that had not been delivered correctly.

AI deal of the week

According to a report by Forbes, Salesforce Ventures, the software giant’s investment arm, plans to allocate an additional $500 million to generative AI businesses, increasing the total amount of money the company has provided to AI startups to $1 billion. The company has supported industry leaders like Anthropic, Cohere, Mistral, and Hugging Face through its generative AI fund.

Also notable: World Labs, a startup cofounded by Stanford scientist Fei-Fei Li, known as the “godmother of AI,” has raised $230 million from Andreessen Horowitz, NEA, and Radical Ventures to develop spatial intelligence through AI models that can comprehend and interact with real-world objects and locations.

Big plays

OpenAI unveiled OpenAI o1, a new family of large language models. They are built to respond to queries more slowly than GPT models so that they can brainstorm several answers to a question and deliver better, more accurate solutions. They are designed to “reason” through complicated problems, considering and weighing options as they work through a task.

The model can carry out complex tasks in the legal, coding, and scientific domains, and its intelligence is comparable to that of a “very smart PhD,” according to Nikunj Handa, a member of OpenAI’s API team. It does this by applying “chain of thought” reasoning, a step-by-step approach to problem-solving. “People go through great lengths to get an LLM to do the thing that they want it to do. And this model sort of does it more easily out of the box without needing to tell it ‘please don’t output things in that format,’” he said.

These early versions are available only to developers who have spent at least $1,000 on the platform, and they are not connected to the internet or any other external tools. Safety concerns also exist. Model evaluation firm Apollo Research found, while testing earlier iterations of the models, that they could trick users into believing they were safe and helpful when, in fact, they were not adequately answering users’ questions.

Data conflicts

According to Wired, UK-based parenting site Mumsnet has an archive of over six billion words, and AI companies have been scraping its data. Since then, the platform has tried to reach licensing agreements with AI firms, including OpenAI, for its content, which is primarily created by women.

However, negotiations with OpenAI, which at first showed interest in the website’s data, broke down when the AI giant said it only strikes deals for sizable datasets that are not available to the general public. Mumsnet is now pursuing legal action for alleged copyright infringement against the company and other scrapers.

A deeper look

Thomas Jefferson’s letters, black-and-white images of Rosa Parks, and the Giant Bible of Mainz, a fifteenth-century manuscript thought to be among the last handwritten Bibles in Europe: these are a few of the 180 million objects that the Library of Congress is home to, along with books, manuscripts, maps, and audio recordings.

Each year, hundreds of thousands of people pass under the Renaissance-style domes, adorned with mosaics and murals, as they stroll through the library’s high-ceilinged, pillared corridors. Recently, however, the more than 200-year-old library has drawn a different kind of patron: AI companies that are keen to use the 185 petabytes of data held in its digital archives to build and refine their most sophisticated models.

Judith Conklin, the chief information officer at the Library of Congress (LOC), told Forbes, “We know that large language model companies are very interested in our large amount of digital material.” That material, she said, is very popular.

However, those who want the Library’s data must collect it through its API, a gateway that lets anyone download data, whether they are an AI researcher or a genealogist. They are not allowed to scrape content straight off the website, a common practice among AI companies that, according to Conklin, has become a “hurdle” for the library because it slows down public access to its archives.

Some people, she said, simply scrape the library’s websites to quickly obtain the data they need to train their models. “We have to manually slow them down if they’re degrading the functionality of our websites.”

Obtaining data is only part of the story. Businesses like Microsoft, Amazon, and OpenAI are also courting the world’s largest library as a customer. They claim that AI models can help librarians and subject matter experts with tasks like locating records, navigating catalogues, and summarising lengthy texts. This is certainly feasible, although some issues remain unresolved. For example, AI models trained on modern data may have trouble distinguishing between historical events and objects like books and phones, Natalie Smith, the director of digital strategy at the LOC, told Forbes. Because of an overwhelming bias toward the present, they often apply modern ideas to historical texts, according to Smith.

Weekly demo

Walter Kortschak, a billionaire venture capitalist with over 40 years of experience investing in internet startups such as Twitter and Lyft, told Forbes that investors should not get caught up in the “mass hysteria” surrounding AI. The investor, who holds stakes in Anthropic and OpenAI, offered some guidance: exercise caution, “invest in outlier founders,” and limit the amount of money you spend.

Model behaviour

The terms of service for Snapchat’s “My Selfies” tool, which lets users upload photos to create AI-generated images, give the social media company the right to use AI-generated images of users and their friends in ads, 404 Media reported. The feature is on by default, although it can be turned off. According to a Reddit post, at least one person saw an ad on Snapchat Stories that included an AI-generated picture of them.

(Tashia Bernardus)
