Alphawise
🔥Khanmigo, disrupt education🔥

Tutorial inside with ChatGPT for vision tasks - finding smiles!

Matthew McCann
January 03, 2025

A technical AI newsletter
written with an entrepreneurial spirit for builders

What is today’s beat?

Tulu3, an AllenAi.org development, is a great ML resource to leverage.
Khan Academy, and how they will disrupt education
Free Nvidia resource including training and a webinar

🎯 RELEASES 🎯

Bringing insights into the latest trends and breakthroughs in AI

Allen.ai
Tulu3, fully open-source!

Synopsis

The Allen Institute for AI (AI2) has many open-source models, tools, and datasets - check them out on Hugging Face. With its latest release, Tulu-3, AI2 challenges the dominance of closed-source AI by emphasising transparency, accessibility, and research collaboration. This initiative not only competes with proprietary technologies from major tech companies but also enhances the accessibility of state-of-the-art AI for academic, industrial, and societal applications.

Core Observations

Tulu3 is a high-performance, open-source language model designed to compete with closed models from major tech companies. Its transparency in architecture and implementation sets a new standard for accessibility in advanced AI research - view its paper.

Dataset Accessibility:
The model’s training and evaluation datasets are open and hosted on Hugging Face. AI2 emphasizes reproducibility and ethical AI usage by offering decontaminated and documented datasets for researchers.
Open Ecosystem Components:
- Open-Instruct: A toolkit for fine-tuning models, promoting robust instruction-following behavior.
- OLMES: Tools supporting ethical and efficient data filtering and decontamination.
- Collaboration-Focused Design: Integration with GitHub repositories for iterative community-driven development.
Transparent Performance Metrics:
Comprehensive evaluations in technical domains and general language tasks provide clarity on the model's strengths, limitations, and optimal use cases, reinforcing accountability in AI development.

Broader Context

AI2's open approach empowers educators, researchers, and developers globally. By promoting collaborative innovation, Tulu-3 represents a pivotal step toward transparent AI development. Its ecosystem reduces the reliance on closed-source solutions, making advanced AI capabilities available to a broader audience.

Try their chat here

Khan Academy
AI Integration in Education

Synopsis

Khan Academy is at the forefront of integrating artificial intelligence (AI) into education through its AI-powered assistant, Khanmigo. This tool is designed to enhance both teaching and learning experiences by providing personalized support to students and streamlining administrative tasks for educators. Khanmigo's implementation in various educational settings demonstrates its potential to transform traditional educational methodologies.

Core Observations

Advanced AI Models: Khanmigo is powered by OpenAI's GPT-4 technology, enabling it to assist users with inquiries in mathematics, science, humanities, and coding. This integration allows Khanmigo to guide students through problem-solving processes, encouraging critical thinking rather than simply providing answers.
Strategic Cloud Partnerships: In collaboration with Microsoft, Khan Academy has transitioned a portion of its cloud services to Microsoft Azure. This partnership enhances the scalability and accessibility of Khanmigo, ensuring reliable performance for users. Additionally, Microsoft has enabled Khan Academy to offer Khanmigo for Teachers free of charge to all U.S. educators, further supporting the educational community.
Core features: Advertised as a math tutor, writing assistant, code reviewer, reading buddy, science and humanities guide, language learning support, and more, Khanmigo aims to wake up a sleepy industry by providing (1) real-time support and (2) interactive classroom engagement.

Broader Context

The modern education system emerged during the Industrial Revolution in the 18th and 19th centuries, driven by the need to educate the working class for industrial labour. This period marked the advent of standardised curricula, mass education, and the establishment of public schools. The factory-style classroom model with mild militant vibes prioritised uniformity, discipline, and efficiency, laying the groundwork for contemporary educational systems.

What is our take on this? Finally, let’s disrupt how we learn. It’s about time we engage students with concepts like mastery, self-pacing, and adaptive interest. Out with the old, in with the new!

⚙️ BUILDERS BYTES ⚙️

Informing builders of the latest technologies and how to use them

What will you learn today?

Learn how to use OpenAI's API with a GPT-4-class vision model (gpt-4o-mini in the code below) to analyze images, detect faces, and determine if people are smiling, while incorporating JSON responses into your workflow.

For non-coders
- copy the prompt below and paste it into GPT-4o together with your image to do this manually

Key Takeaways

Image Analysis with AI: Understand how to leverage the OpenAI API for image recognition.
Base64 Encoding: Learn to encode image files into Base64 strings for seamless API integration.
Structured JSON Responses: Extract meaningful insights from AI responses formatted in JSON for easy integration into applications.
Custom Prompts for AI: Include the keyword “JSON” in the prompt and set response_format to “json_object” to structure the model's reply.
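
The Base64 takeaway can be sketched in isolation. The helper below turns an image file into a data URL of the form the vision request expects. The name `to_data_url` and the use of `mimetypes` to guess the image type are assumptions for illustration, not part of the tutorial script.

```python
from base64 import b64encode
from mimetypes import guess_type
from pathlib import Path


def to_data_url(image_path: str, default_type: str = "image/jpeg") -> str:
    """Encode an image file as a Base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the file extension; fall back to the default.
    mime_type, _ = guess_type(image_path)
    if mime_type is None:
        mime_type = default_type

    # Read the raw bytes and encode them as an ASCII-safe Base64 string.
    encoded = b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

Guessing the type from the extension means a `.png` file is labelled `image/png` without having to pass the type by hand.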

from openai import OpenAI
from base64 import b64encode
from os import getenv
from json import loads

api_key = getenv("OPENAI_API_KEY")

if not api_key:
    raise RuntimeError("OPENAI_API_KEY not found")

client = OpenAI(api_key=api_key)


# Define the function to analyze an image with a GPT-4o vision model
def analyze_image(image_path: str, img_type: str = "image/jpeg") -> dict:
    # Read the image file in binary mode
    with open(image_path, 'rb') as image_file:
        image_data = image_file.read()

    # Encode the image to base64
    img_b64_str = b64encode(image_data).decode('utf-8')

    # Create the prompt for analysis
    prompt = (
        "Can you find a face in this image? Are the people smiling?\n"
        "Your response should be in JSON format.\n"
        "Please structure your reply as a JSON object as follows:\n"
        '{"n_people": <number of people>, "n_smiling": <number of people smiling>}'
    )

    # Send the request to OpenAI
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:{img_type};base64,{img_b64_str}"
                        },
                }
            ],
        }],
        response_format={"type": "json_object"}
    )

    # Extract and return the response content
    r = response.choices[0].message.content
    return loads(r)

if __name__ == "__main__":
    # Example usage
    file_path = "images/smile.jpg"
    _ret = analyze_image(file_path)
    print(_ret)
    print(type(_ret))

    file_path = "images/not_smile.jpg"
    _ret = analyze_image(file_path)
    print(_ret)
    print(type(_ret))

The code can be found in our repo in short_tutorials/openai/gpt-vision.py
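
JSON mode is meant to return syntactically valid JSON, but it does not guarantee that the keys or value types match what the prompt asked for. Here is a minimal sketch of a defensive parser. The name `parse_smile_counts` and the validation rules are assumptions for illustration, not part of the tutorial script.

```python
from json import JSONDecodeError, loads


def parse_smile_counts(raw: str) -> dict:
    """Validate a reply shaped like {"n_people": ..., "n_smiling": ...}."""
    try:
        data = loads(raw)
    except JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("Reply is not a JSON object")

    counts = {}
    for key in ("n_people", "n_smiling"):
        if key not in data:
            raise ValueError(f"Missing key: {key}")
        # The model may return numbers as strings, so coerce to int.
        try:
            counts[key] = int(data[key])
        except (TypeError, ValueError) as exc:
            raise ValueError(f"{key} is not a number: {data[key]!r}") from exc
        if counts[key] < 0:
            raise ValueError(f"{key} must not be negative")

    # More smiles than people signals an inconsistent reply.
    if counts["n_smiling"] > counts["n_people"]:
        raise ValueError("n_smiling exceeds n_people")
    return counts
```

For example, `parse_smile_counts('{"n_people": "2", "n_smiling": 1}')` returns `{"n_people": 2, "n_smiling": 1}`, while a reply with more smiles than people raises `ValueError`.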

⭐️⭐️⭐️⭐️⭐️⭐️
Like these tutorials?
👉️ Star our repo to show support
⭐️⭐️⭐️⭐️⭐️⭐️

Do you have a product in AI and would like to contribute?
👉️ email us: [email protected]

Is there something you’d like to see in this section?
👉️ share your feedback

🤩 COMMUNITY 🤩

Cultivating curiosity with the latest in professional development

TALKS

Webinar - Enhance Visual Understanding With Generative AI (January 22, 2025, 2 hrs)

LEARNING

Nvidia has a massive learning base, some free and some paid. Here we have a few favorites, 100% free.

THANK YOU

Found something cool?
Want something different?

Our Mission at AlphaWise

AlphaWise strives to cultivate a vibrant and informed community of AI enthusiasts, developers, and researchers. Our goal is to share valuable insights into AI, academic research, and software that brings it to life. We focus on bringing you the most relevant content, from groundbreaking research and technical articles to expert opinions to curated community resources.

Looking to connect with us?

We actively seek to get involved in the community through events, talks, and activities. Email us at [email protected]
