Humanoid Robots - a look at current research + companies

Today's community section has tools for your team and some talks to keep you actively thinking.

A technical AI newsletter
written with an entrepreneurial spirit for builders

What is today’s beat?

RELEASES
🧨 Humanoid Robots
🧨 Qwen chat release
🧨 LlamaIndex Visual Document Retrieval

BUILDER BYTES
⭐️ a few code repos, with highlights from Memora and Crawl4AI
⭐️ a paper about VLM (vision language model) benchmarks

COMMUNITY
🤩 a tool for each department (product, marketing, executives)
🤩 a few talks to keep you sharp




🎯 RELEASES 🎯

Bringing insights into the latest trends and breakthroughs in AI

EngineAI
Robotics - physical AI will be the next frontier

Humanoid walking the streets in Shenzhen, China

The sighting of a humanoid robot in Shenzhen, China, has blown up on social media - and yes, it is real! Firms like Tesla, Google DeepMind, and Amazon are at the forefront, developing sophisticated robots aimed at revolutionizing industries. Despite popular belief, the technology is projected to create jobs, according to the World Economic Forum’s recent report. Let’s have a look at ongoing developments at major research groups and companies.

  1. Tesla's Optimus: Tesla is advancing its humanoid robot, Optimus, which is designed to perform repetitive or hazardous tasks and is expected to debut in Tesla factories in 2025.

  2. Google DeepMind and Apptronik Collaboration: Apptronik, specializing in AI-powered humanoid robotics, has partnered with Google DeepMind to develop intelligent humanoid robots capable of autonomous operations across various environments.

  3. Figure's Humanoid Robot: Figure, a startup with a team from Boston Dynamics (notably the longest-standing contributor), Tesla, and Google DeepMind, focuses on general-purpose humanoid robots for diverse tasks. The company secured $675 million in funding, achieving a valuation of $2.6 billion.

  4. Agility Robotics: works directly with Amazon, with a long-standing goal of improving Amazon's warehouses.

  5. Exoskeletons: workforce solutions that aid humans, from companies such as Vancouver’s 2025 CES award winner HumanInMotion or German Bionic.

The humanoid robotics sector is growing quickly, driven by significant investments from tech giants and startups alike. The convergence of simulation advances (especially environment models), hardware capabilities, sensing improvements, and system integration sets the stage for a rapidly evolving ecosystem that will hit sectors such as manufacturing, healthcare, and services.

Since EngineAI hit the spotlight this week, view the video here

Alibaba
Qwen chat has been released! Most of its models are open source!

Qwen chat with models, web search, image generation, and artifacts

Alibaba has introduced Qwen Chat, a web-based interface that lets users interact with its Qwen series models. The platform brings text, audio, and visual data processing together in one place, so users can try each capability without extra setup.

  1. Advanced Language Understanding: Qwen models, such as Qwen2.5-72B, have achieved an MMLU score of 86.1, indicating strong performance in natural language understanding tasks.

  2. Enhanced Coding Capabilities: Qwen2.5-Coder-7B-Instruct has outperformed larger models, achieving a HumanEval score of 79.9, demonstrating its proficiency in code generation and comprehension.

  3. Mathematical Proficiency: Qwen2.5-Math-72B-Instruct has achieved a score of 87.8 on the MATH benchmark, reflecting its advanced mathematical reasoning abilities.

Alibaba has not been very loud, but this release is. Let’s face it, the Chinese landscape is different, and Alibaba dominates the ecosystem in China. By providing a user-friendly interface with features like web search, image generation, code helper, reasoning and more, Alibaba just dropped a bomb! The Qwen models' competitive performance in language understanding, coding, and mathematics positions them as viable alternatives to leading models like GPT-4 and PaLM 2. Check out their models here or on HuggingFace (yeah, they actually open sourced the majority of their Qwen series).

LlamaIndex
VDR-2B-Multi-V1 for Multilingual Visual Document Retrieval

LlamaIndex model discussion

This HuggingFace blog unveiled the release of LlamaIndex's VDR-2B-Multi-V1, a multilingual embedding model designed to enhance visual document retrieval across various languages and domains. The model enables efficient querying of visually rich documents without relying on OCR or data extraction pipelines. The blog includes some details, training specs, and code for technical readers to follow along. Here are a few interesting details about the model.

  1. Multilingual Training Dataset: VDR-2B-Multi-V1 was trained on a dataset comprising 500,000 high-quality samples in Italian, Spanish, English, French, and German. The dataset is described as the largest open-source multilingual synthetic dataset for visual document retrieval.

  2. Enhanced Inference Efficiency: The model utilizes 768 image patches, resulting in three times faster inference and significantly lower VRAM usage compared to previous models that used 2,560 image patches.

  3. Cross-Lingual Retrieval Capability: VDR-2B-Multi-V1 excels in cross-lingual retrieval, allowing users to search for documents in one language using queries in another, thereby improving performance in real-world multilingual scenarios.
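
To put the patch numbers from point 2 in perspective, here is a quick back-of-the-envelope check. It only compares the two patch counts quoted above. Treating inference cost as roughly proportional to patch count is our simplifying assumption, not a measured result from the blog.

```python
# Patch counts quoted in the list above.
BASELINE_PATCHES = 2560  # previous models
VDR_PATCHES = 768        # VDR-2B-Multi-V1


def patch_reduction(baseline: int, reduced: int) -> float:
    """Return how many times fewer patches the reduced setting uses."""
    return baseline / reduced


# Assumption: cost scales roughly with the number of image patches.
ratio = patch_reduction(BASELINE_PATCHES, VDR_PATCHES)
print(f"{ratio:.2f}x fewer patches per image")
```

The ratio comes out a little above 3, which is in line with the "three times faster" figure.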

The development of VDR-2B-Multi-V1 addresses the growing need for efficient and accurate retrieval of visual documents in a multilingual context - without complex pipelines to deal with disparate sources. By eliminating the dependence on OCR and complex data extraction processes, this model streamlines document querying and enhances accessibility across different languages. Its improved inference speed and reduced computational requirements make it a practical solution for diverse applications. Basically, it’s pretty damn cool.
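
For builders who want a feel for how embedding-based retrieval works without OCR, here is a minimal sketch. The vectors are made up by hand and the file names are hypothetical. In a real setup the model would produce the embeddings for both the page images and the text query. The ranking step, cosine similarity between a query vector and each page vector, is the general technique and not necessarily the exact scoring used in the blog.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_pages(query_vec, page_vecs):
    """Return (page name, score) pairs sorted from best to worst match."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in page_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for model output on page images.
page_vecs = {
    "invoice_de.png": [0.9, 0.1, 0.0],  # German invoice
    "chart_it.png": [0.1, 0.8, 0.3],    # Italian chart
    "manual_fr.png": [0.0, 0.2, 0.9],   # French manual
}

# Hypothetical embedding of an English query about an invoice.
query_vec = [0.85, 0.2, 0.05]

ranking = rank_pages(query_vec, page_vecs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the query and the pages live in the same vector space, the query language does not have to match the document language. That shared space is what makes the cross-lingual search in point 3 possible.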

read more here

⚙️ BUILDER BYTES ⚙️

Informing builders of the latest technologies and how to use them

Trending

Today we have some code for you.

Today’s personal favorite is a toss-up between
Memora and Crawl4AI 
but we’ll leave you to be the judge!

And this paper was way too cool to pass up, you gotta try out the interactive demo!

Found something cool?
Want something different?

🤩 COMMUNITY 🤩

Cultivating curiosity with the latest in professional development

THANK YOU

Our Mission at AlphaWise

AlphaWise strives to cultivate a vibrant and informed community of AI enthusiasts, developers, and researchers. Our goal is to share valuable insights into AI, academic research, and software that brings it to life. We focus on bringing you the most relevant content, from groundbreaking research and technical articles to expert opinions to curated community resources. 

Looking to connect with us?

We actively seek to get involved in the community through events, talks, and activities. Email us at [email protected]

Looking to promote your company, product, service, or event?