How Training Data Differentiates Falcon, the LLM from the UAE

Update: 2024-05-30

Description

The name "Falcon" for the UAE’s large language model (LLM) symbolizes the national bird's qualities of courage and perseverance, reflecting the vision of the Technology Innovation Institute (TII) in Abu Dhabi. TII, launched in 2020, addresses AI’s rapid advancements and unintended consequences by fostering an open-source approach to enhance community understanding and control of AI. In this New Stack Makers, Dr. Hakim Hacid, Executive Director and Acting Chief Researcher, Technology Innovation Institute emphasized the importance of perseverance and innovation in overcoming challenges. Falcon gained attention for being the first truly open model with capabilities matching many closed-source models, opening new possibilities for practitioners and industry. 

Last June, TII released Falcon-40B, a 40-billion-parameter model that outperformed LLaMA-65B, along with smaller variants that make local inference possible without the cloud. The latest model, Falcon-180B, with 180 billion parameters trained on 3.5 trillion tokens, illustrates TII's commitment to quality and efficiency over sheer size. Falcon's distinctiveness lies in its data quality: more than 80% of its training data comes from RefinedWeb, a cleaned and deduplicated dataset built on CommonCrawl, which yields higher-quality results. This data-centric approach, combined with substantial computational resources, sets Falcon apart in the AI landscape.
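As a rough illustration of the local-inference point above, here is a minimal sketch that loads one of the smaller openly released Falcon checkpoints through the Hugging Face transformers library and generates text entirely on local hardware. The checkpoint ID (tiiuae/falcon-7b-instruct), prompt, and generation settings are illustrative assumptions, not details from the episode.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed checkpoint: one of the smaller open Falcon models, compact enough
# to run on a single local GPU (or on CPU, slowly).
model_id = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (requires the accelerate package) places weights on available hardware.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Generation runs locally; no cloud API is involved.
prompt = "Why does training data quality matter for large language models?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Larger variants such as Falcon-40B and Falcon-180B expose the same interface but need far more memory, which is why the smaller checkpoints are the ones suited to cloud-free local use.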

 

Learn more from The New Stack about Open Source AI: 

Open Source Initiative Hits the Road to Define Open Source AI

Linus Torvalds on Security, AI, Open Source and Trust

Transparency and Community: An Open Source Vision for AI

Join our community of newsletter subscribers to stay on top of the news and at the top of your game. 


Dr. Hakim Hacid, Alex Williams, The New Stack