2024-12-20
Generative AI + Physics engine = 🤯!

A technical AI newsletter - written with an entrepreneurial spirit for builders
Welcome to your daily newsletter on AI
What have we got for you today?
🤯 Generative AI can create 4D simulations, with a physics engine.
🔎 A look at Google's AI Studio
🤦 OpenAI releases its desktop application, again.
📄 Quick tutorial with Andrew Ng's AiSuite
We are 100% free!
And with your support, we create more FREE content!
Please share with a friend, or 10!
💯 RELEASES 💯
Bringing insights into the latest trends and breakthroughs in AI
Genesis
Generative AI + Physics engine = 🤯!
Synopsis
Genesis, an open-source generative AI framework combined with a physics engine, has been introduced, aiming to revolutionise robotics and simulation environments. While currently generating significant buzz, it shows real promise for ultra-fast physics simulations and generative 4D worlds. The framework is expected to play a pivotal role in advancing fields like robotics, marketing simulations, and embodied AI research.
Core Observations
4D, a term you will also see in the context of dynamic NeRFs, simply means 3D plus time. Here are the key areas where this development could have an impact.
Physics-Based Simulations
Genesis excels in running ultra-fast physics simulations, enabling real-time testing and modelling for robotics.
Generative 4D Environments
The framework can generate immersive, dynamic 4D environments for testing or showcasing scenarios.
Open-Source Accessibility
As an open-source project, Genesis invites widespread collaboration, allowing developers and researchers to customise and integrate it into diverse projects.
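To make the "3D plus time" idea behind generative 4D worlds concrete, here is a minimal, library-free sketch. This is our own toy example, not Genesis code: it steps a point mass under gravity with explicit Euler integration and records each 3D state with its timestamp, a 4D trajectory in the simplest sense.

```python
# Toy "4D" trajectory: a sequence of (t, x, y, z) states for a point
# mass falling under gravity, integrated with explicit Euler steps.

G = -9.81  # gravitational acceleration along z (m/s^2)

def simulate(steps=100, dt=0.01):
    x, y, z = 0.0, 0.0, 10.0   # initial position
    vz = 0.0                   # initial vertical velocity
    frames = []
    for i in range(steps):
        t = i * dt
        frames.append((t, x, y, z))  # record one 3D snapshot with its time
        vz += G * dt                 # update velocity
        z += vz * dt                 # update position
    return frames

frames = simulate()
```

Each frame is one 3D snapshot; the time axis is what makes the result "4D". Real engines like Genesis do vastly more (contacts, articulated bodies, GPU batching), but the data layout idea is the same.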
Broader Context
I personally tried to run an example from their docs and was quite frustrated to find that their basic introduction doesn't work - and there were many similar complaints online. Even without its paper being released yet, the potential this has to impact robotics is real, because data and environment creation are two critical factors in making complex robotic systems work. Don't believe the hype, but watch out for this one!
Here is the code that got 10k stars in a day
Funny note - I loved reading through the GitHub issues here; people are a little pissed. An early release is great, but it's not a stretch of the imagination to think that the results were rendered rather than generated with the code. 🧐
Google
A glance at Google's AI Studio
Synopsis
Google has introduced Gemini 2.0 Flash Thinking, an advanced AI model designed to enhance reasoning capabilities and expose a transparent problem-solving process. It ranks 4th and carries a lot of hype as a general LLM, particularly when measured against models like OpenAI's o1.
Core Observations
Tools Section
Structured Output enables rapid prototyping in the UI, so that the model's responses can be formatted to a predefined schema or structure (e.g., JSON, tables, bullet points).
Code Execution activates the ability to run code within the interface.
Function Calling enables the AI to call pre-defined functions. It's ideal for situations where AI outputs need to be processed or acted upon programmatically (e.g., calling an API, invoking a service).
Grounding, which works much like RAG, improves response accuracy by anchoring AI outputs in specific datasets or knowledge bases.
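To make the function-calling idea concrete outside of AI Studio, here is a minimal hand-rolled sketch. This is our own illustration, not a Google API: the tool names (get_weather, add) and the dispatch table are hypothetical. We assume the model emits a JSON object naming a function and its arguments, and our code looks it up and executes it.

```python
import json

# Hypothetical tools the model is allowed to call.
def get_weather(city):
    return f"Sunny in {city}"

def add(a, b):
    return a + b

TOOLS = {"get_weather": get_weather, "add": add}

def dispatch(model_output):
    """Parse a JSON function call emitted by the model and run it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]          # look up the named tool
    return fn(**call["arguments"])    # invoke it with the model's arguments

# Pretend the model emitted this structured call:
result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

The real value is the last step: because the call is structured, your code can act on it programmatically rather than parsing free-form text.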
Advanced Settings:
Temperature controls the randomness of the AIās responses. A lower temperature (e.g., 0.2) makes the modelās outputs more focused and deterministic, while a higher value (e.g., 1.0) introduces greater variability and creativity.
Token Count shows the number of tokens (chunks of text such as word pieces, punctuation, and whitespace) used in the interaction. It's crucial for understanding usage limits or optimising responses within token constraints.
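The effect of temperature is easy to see with a toy softmax over next-token scores. This is a generic illustration of how sampling temperature works, not Gemini's internals: dividing the logits by a smaller temperature sharpens the distribution toward the top token, while a larger one flattens it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                   # fake scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)   # low temperature: near-deterministic
hot = softmax_with_temperature(logits, 1.0)    # high temperature: more varied
```

At temperature 0.2 the top token takes almost all of the probability mass; at 1.0 the alternatives keep a realistic share, which is where the extra "creativity" comes from.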
Broader Context
Despite the hype since its release on December 11th, it hasn't quite lived up to the recent news framing it as a "reasoning" contender to OpenAI's o1. Let's face it: the o1 model is gigantic, and these models will undergo significant optimisation so they can run on hardware without exorbitant costs. Above we showed how to use the model with Google's AI Studio, which makes prototyping particularly easy compared to raw programming - and rapid prototyping even easier.
OpenAI
OpenAI - discusses macOS and Windows features, again.
Synopsis
Day 11 of OpenAI's 12 Days of OpenAI announced significant enhancements to the ChatGPT desktop applications for macOS and Windows. This raises the bar for what desktop users can expect from the application!
Core Observations
Expanded Application Integrations: ChatGPT desktop apps now support additional integrated development environments (IDEs) and productivity tools, including BBEdit, MATLAB, Nova, Script Editor, TextMate, VSCodium, Cursor, WindSurf, JetBrains IDEs (e.g., IntelliJ IDEA, PyCharm), Warp, Prompt, Apple Notes, Notion, and Quip. Wow, what a mouthful!
Advanced Voice Mode Integration: The Advanced Voice Mode feature has been enhanced to function across these newly supported applications, enabling hands-free operation on your desktop.
Agentic AI Development: OpenAI emphasized that these updates are steps toward developing more agentic AI systems, where ChatGPT can perform tasks autonomously on behalf of users.
Broader Context
The desktop application was first announced on June 21st 2024. But it's all about the hype - so let's get fired up and look at what's new and improved! By embedding AI capabilities directly into commonly used software, OpenAI has opened up accessibility, with the help of its tooling, to an array of other applications (listed in observation 1 above). The focus on agentic AI reflects a broader industry trend toward AI systems that can autonomously manage tasks, potentially reshaping workflows, with a particularly interesting effect on our work lives, not just personal computing.
⚙️ BUILDERS BYTES ⚙️
Informing builders of the latest technologies and how to use them
Super - a real super title!
What will you learn today?
We will quickly explore Andrew Ng's AiSuite, released earlier this month. It makes switching between model providers easy, so we will show you how to call Groq and OpenAI and compare their results in Python.
Key Takeaways
Structured outputs: By giving every provider's response a unified shape, AiSuite decouples your code from any single model.
Comparing Responses: Understand how to evaluate and compare responses from diverse AI models for the same questions.
import aisuite as ai

# Simple method to call a model and get its structured output
def ask(message, sys_message="You are a helpful agent.", model="groq:llama-3.2-3b-preview"):
    # Initialise the AI client for accessing the language model
    client = ai.Client()

    # Construct the messages list for the chat
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message},
    ]

    # Send the messages to the model and get the response
    response = client.chat.completions.create(model=model, messages=messages)

    # Return the content of the model's response
    return response.choices[0].message.content
To view the full code, go here.
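Once ask() is defined, comparing providers is just a loop over model strings. The helper below is our own sketch (compare and the stub fake_ask are not part of AiSuite, and "openai:gpt-4o-mini" is only an example model id); swap fake_ask for the real ask once your GROQ_API_KEY and OPENAI_API_KEY are set.

```python
def compare(prompt, ask_fn, models):
    """Ask the same question to several models and collect the answers."""
    return {model: ask_fn(prompt, model=model) for model in models}

# Stub standing in for ask() so the sketch runs without API keys.
def fake_ask(message, sys_message="You are a helpful agent.", model=""):
    return f"[{model}] would answer: {message}"

models = ["groq:llama-3.2-3b-preview", "openai:gpt-4o-mini"]
results = compare("What is the capital of France?", fake_ask, models)
```

Collecting side-by-side answers to the same prompt makes it easy to spot differences in style, accuracy, and verbosity across providers.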
Trending
Small model, big impact: Patronus AIās Glider outperforms GPT-4 in some areas
Meta AI Introduces ExploreToM: A Program-Guided Adversarial Data Generation Approach (code, paper, dataset)
IBM's enterprise AI models updated
🤝 SHOUT OUT
🤩 COMMUNITY 🤩
Cultivating curiosity with the latest in professional development
Talks & Events
THANK YOU
Our Mission at AlphaWise
AlphaWise strives to cultivate a vibrant and informed community of AI enthusiasts, developers, and researchers. Our goal is to share valuable insights into AI, academic research, and the software that brings it to life. We focus on bringing you the most relevant content, from groundbreaking research and technical articles to expert opinions and curated community resources.
Looking to connect with us?
We actively seek to get involved in the community through events, talks, and activities. Email us at [email protected]