DataSet/DataTable - Search News

How to Tell a Good Speech Dataset for AI From a Bad One

Speech AI datasets look interchangeable until production exposes gaps in transcripts, speakers, audio conditions, licenses, ...

C&EN

Chemists ran 50,688 reactions to make a huge open dataset

The dataset, which the researchers have made available on the Open Reaction Database, is nearly five times as large as the ...

Tech Times

AI Chart Understanding Breakthrough: MIT-IBM Dataset Lets Small Models Beat GPT-4o

MIT and IBM released ChartNet, a 1.7-million-sample synthetic training dataset that lets compact open-source vision-language ...

Frontiers

Showcasing FAIR² Data Articles: Unlocking Trustworthy, AI-Ready Scientific Data for Reuse and Impact in Space Technologies

Scientific knowledge is fundamentally built on data; yet, for too long, research datasets have remained siloed, poorly documented, and inconsistently ...

Wired

Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft

Harvard University announced Thursday it’s releasing a high-quality dataset of nearly 1 million public-domain books that could be used by anyone to train large language models and other AI tools. The ...

VentureBeat

Table-augmented generation shows promise for complex dataset querying, outperforms text-to-SQL

AI has transformed the way companies work and interact with data. A few years ago, teams had to write SQL queries and code to extract useful information from large swathes of data. Today, all they ...

15d on MSN

Carbonfact Vaayu Merger Expands Sustainability and Carbon Dataset

Carbonfact's CEO said its acquisition of Vaayu is representative of a larger trend of consolidation among sustainability and carbon data platforms.
