2024-12-22

LangChain State of AI 2024, Meta Video Watermarking

A technical AI newsletter - written with an entrepreneurial spirit for builders

Welcome to your daily newsletter on AI

What have we got for you today?

  • LangChain - state of AI in 2024

  • Meta VideoSeal - verifying video authenticity

  • One line of code to load a HuggingFace dataset

We are 100% free!

And with your support, we create more FREE content!

Please share with a friend, or 10!

🎯 RELEASES 🎯

Bringing insights into the latest trends and breakthroughs in AI

LangChain
State of AI in 2024

Synopsis

LangChain's "State of AI 2024" report provides an in-depth analysis of how developers are building with AI, and notes that 30k users per month have been signing up for LangSmith. The report highlights the providers, vector stores, and languages builders rely on most, along with the rise of agents and LLM-based evaluation.

Core Observations

  1. Top 10 AI providers: OpenAI, Ollama, Anthropic, Azure OpenAI, Groq, …

  2. Top 10 vector store providers: Chroma, FAISS, Pinecone, pgvector (Postgres), Qdrant, …

  3. Usage: Python and JavaScript are the two main languages, with Python at 85%.

  4. AI Agents: ranked as the #1 focus in AI right now.

  5. LLM-as-a-Judge: the report ranks the top 10 metrics.

Broader Context

The insights presented in LangChain's "State of AI 2024" reflect the dynamic and rapidly evolving nature of AI this year 🤯. One notable point is the growing influence of JavaScript, at 15% of usage and rising. The rankings of top model providers and vector stores match developer-community sentiment and are drawn from traces logged in LangSmith. LangChain claims that LangSmith is the standard logging and tracing tool for AI apps in the market right now.

Meta
Video Watermarking for AI Integrity

Synopsis

Meta has introduced VideoSeal, a groundbreaking technology for video watermarking that ensures the authenticity and traceability of AI-generated videos. This innovation addresses concerns about misuse and misinformation in AI-generated media, offering a secure way to identify and verify content origins.

Core Observations

VideoSeal embeds invisible, verifiable watermarks into videos.

  1. Fraud Protection: crucial in an era where deepfake incidents have quadrupled, now accounting for 7% of global fraud cases.

  2. Resilience to Manipulation: The tool's watermarks remain detectable even after common video edits such as blurring, cropping, or compression.

  3. Open-Source Accessibility: released under the MIT license, with research papers, training code, and inference code.

Broader Context

The proliferation of AI-generated content and sophisticated video editing tools has made moderating digital platforms both important and challenging. VideoSeal addresses these challenges by embedding imperceptible signals into videos, so their origin can be identified later. Basically, online news will have a tool to verify its authenticity, so we will now find it hard to give Jackie Chan Trump’s hairstyle.

⚙️ BUILDERS BYTES ⚙️

Informing builders of the latest technologies and how to use them

Timescale - Postgres
Load HuggingFace Dataset - 1 line of Code

What will you learn today?

In a few lines, you can load a HuggingFace dataset into your Postgres database. With an extension like pgai, which can be set up in a single line of SQL, this is extremely powerful.

Key Takeaways

  1. Timescale: the Timescale extension pgai loads datasets straight into Postgres tables. Learn more about their embeddings here (video).

  2. HuggingFace Support: built-in, native support for HuggingFace. Nothing more to say really, just start pulling data.

  3. Simple: added a third point because we seem to like to think in 3s. It’s so simple a 3rd point isn’t needed, so I wanted to make a point by saying that.

First, let’s get set up

# start the Docker container, exposing Postgres on port 5432
docker run -d --name pgai -p 5432:5432 \
-v pg-data:/home/postgres/pgdata/data \
-e POSTGRES_PASSWORD=password timescale/timescaledb-ha:pg17

# install the pgai extension in the database
docker exec -it pgai psql -c "CREATE EXTENSION IF NOT EXISTS ai CASCADE;"

# open a psql session inside the container
docker exec -it pgai psql

Now, let’s create a table from your dataset

select ai.load_dataset('rajpurkar/squad', table_name => 'squad');

Ya, that’s it! You just loaded a training set into your database in a single line of code.
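
If you would rather trigger the same load from Python than from psql, here is a minimal sketch. It assumes a DB-API style cursor that uses `%s` placeholders (psycopg cursors work this way). The helper name `load_hf_dataset` and the `Cursor` protocol are our own illustration and are not part of pgai; the only pgai call used is the `ai.load_dataset(..., table_name => ...)` from the SQL line above.

```python
from typing import Any, Protocol


class Cursor(Protocol):
    """The small slice of a DB-API 2.0 cursor this sketch relies on."""

    def execute(self, query: str, params: tuple[Any, ...]) -> Any: ...


# Same statement as the SQL one-liner, with the values sent as bound
# parameters instead of being pasted into the string.
LOAD_SQL = "SELECT ai.load_dataset(%s, table_name => %s)"


def load_hf_dataset(cur: Cursor, dataset: str, table: str) -> None:
    """Ask pgai to pull a HuggingFace dataset into the given table.

    Hypothetical helper: the name and signature are assumptions made
    for this example, not an API shipped by pgai or Timescale.
    """
    cur.execute(LOAD_SQL, (dataset, table))
```

To try it against the container from the setup step, one option (assuming psycopg 3 is installed) is to open a connection with `psycopg.connect("postgresql://postgres:password@localhost:5432/postgres")` and call `load_hf_dataset(conn.cursor(), "rajpurkar/squad", "squad")`. Passing the cursor in keeps the helper independent of any one driver.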

SQuAD Training set
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable. 

rajpurkar/squad

Check out our repo here with demos released every newsletter.

Do you have a product in AI and would like to contribute?
email us: [email protected] 

Is there something you’d like to see in this section?
share your feedback

🤩 COMMUNITY 🤩

Cultivating curiosity with the latest in professional development

Tools

Product Hunt's Top Tools in 2024

We recommend Supabase for Python and JavaScript

THANK YOU

Found something cool?
Want something different?

Our Mission at AlphaWise

AlphaWise strives to cultivate a vibrant and informed community of AI enthusiasts, developers, and researchers. Our goal is to share valuable insights into AI, academic research, and the software that brings it to life. We focus on bringing you the most relevant content, from groundbreaking research and technical articles to expert opinions and curated community resources.

Looking to connect with us?

We actively seek to get involved in the community through events, talks, and activities. Email us at [email protected]

Looking to promote your company, product, service, or event?