Local LLMs, Beta Access to AutomatED's GPTs, and Gemini's Promise

Are local LLMs a solution to privacy concerns? Only if they perform well. So do they?

Graham Clay
December 22, 2023 • Estimated Reading Time: 11 minutes

[image created with Dall-E 3 via ChatGPT Plus]

Welcome to AutomatED: the newsletter on how to teach better with tech.

Each week, I share what I have learned — and am learning — about AI and tech in the university classroom. What works, what doesn't, and why.

In our last weekly piece of the year, I connect recent progress in local large language models to higher ed’s privacy concerns with AI. I also discuss the promise of Google’s Gemini and foreshadow next week’s Premium piece (which will come with access to the Beta version of one of AutomatED’s custom GPTs).

💡 Idea of the Week: Local LLMs for Privacy
👀 What to Keep an Eye on: AutomatED's GPT Betas
❓ An AI Question Mark: Gemini for Handwriting Analysis
🔗 Links
⬆️ How to Access Premium

💡 Idea of the Week:
Local LLMs for Privacy

Many of the best uses professors have for LLMs like ChatGPT, Claude, and Gemini raise privacy issues, because these use cases involve personally identifiable or otherwise protected student data.

Back in October, I wrote a piece on navigating some of these issues; that is, on using AI while protecting student data. But that piece was incomplete in an instructive way, and my idea of the week reveals why.

My October missive was dedicated to popular and ecosystem-integrated AI tools like ChatGPT, Zoom’s AI Companion, and Bard. These tools are not “sandboxed” at all, in that they potentially could expose any data that they have access to. I argued that there are five broad strategies professors can use to protect student data while using these tools:

1. Don’t Run Student Data Through Them
2. Limit Uploaded Data to “Completely Safe” Categories
3. Change the Consent Paradigm with Your Students
4. Pseudonymize or Anonymize
5. A Combination of the Above

Either you remove the risk entirely but also remove many of the benefits of AI (options 1 and 2), you reduce the risk significantly but lose time in doing so (option 4), you accept the risk but get your students to sign off on it explicitly and often (option 3), or you mix these strategies (option 5).
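To make option 4 concrete, here is a minimal Python sketch of pseudonymizing student names before text is pasted into an AI tool, and restoring them afterward. The function names, the roster, and the secret key are all hypothetical and chosen for illustration. The sketch only catches exact matches of names on the roster, so nicknames, misspellings, ID numbers, and other identifying details would still need separate handling.

```python
import hashlib
import hmac
import re


def pseudonym(name: str, secret_key: bytes) -> str:
    """Return a stable alias (e.g. 'Student_3f9a1c2b') for a real name.

    HMAC with a private key is used so the alias cannot be recomputed
    from the name alone by someone who lacks the key.
    """
    normalized = name.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return f"Student_{digest[:8]}"


def pseudonymize(text: str, roster: list[str], secret_key: bytes) -> tuple[str, dict[str, str]]:
    """Replace every roster name found in `text` with its alias.

    Returns the masked text and an alias-to-name map that stays on
    the professor's own machine.
    """
    alias_to_name: dict[str, str] = {}
    # Replace longer names first so "Ann Lee" is handled before "Ann".
    for name in sorted(roster, key=len, reverse=True):
        alias = pseudonym(name, secret_key)
        alias_to_name[alias] = name
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", flags=re.IGNORECASE)
        text = pattern.sub(alias, text)
    return text, alias_to_name


def restore(text: str, alias_to_name: dict[str, str]) -> str:
    """Swap aliases back to real names in the AI tool's output."""
    for alias, name in alias_to_name.items():
        text = text.replace(alias, name)
    return text


if __name__ == "__main__":
    key = b"keep-this-key-private"  # hypothetical key; store it securely
    roster = ["Maria Lopez", "Sam Chen"]  # hypothetical students
    masked, mapping = pseudonymize("Maria Lopez outscored Sam Chen.", roster, key)
    print(masked)
    print(restore(masked, mapping))
```

The time cost mentioned above shows up here as upkeep: the roster and key have to be maintained, and the masked text should be skimmed before upload for anything the name matching missed.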

The problem is this: these options all assume that using the AI tools listed is the only option professors have to realize the relevant pedagogical and productivity gains. For this reason, my analysis was incomplete — intentionally but still misleadingly — because I did not discuss using different AI tools for the same outcomes.

What about sandboxed AI tools?

This question becomes all the more pertinent now that more and more high-performing open-source LLMs are being released, from Llama 2 to Mistral 7B, Mixtral 8×7B, and LLaVA 1.5. By “open-source,” I mean that the LLMs are released by their developers to the public so that anyone who chooses to run them can freely download and use them. By “more and more high-performing,” I mean that these LLMs are becoming quite competitive with the latest versions of the big non-sandboxed, non-open-source LLMs like those behind ChatGPT (use case tests over the past few months have shown that Mistral 7B, for instance, is comparable on many metrics to GPT-3.5).

But, for present purposes, the most important aspect of these LLMs is that they can be run locally — on one’s own computer (or virtual machine), whether or not it is connected to the internet. This means that they can be sandboxed by being isolated from potential data leaks.

(It also means, according to Ethan Mollick, that local LLMs will soon permeate our environments.)
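As a rough illustration of what running locally can look like in practice, the Python sketch below sends a prompt to a model server on the same machine and refuses to send it anywhere else. It assumes that Ollama (a tool not discussed in this piece) is installed, is serving a model named "mistral", and is listening on its default local port. The function names are hypothetical, and the loopback check is only a simple guard, not a guarantee that the machine is isolated.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: an Ollama server is running on this machine at its default port.
LOCAL_ENDPOINT = "http://localhost:11434/api/generate"
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def build_request(prompt: str, model: str = "mistral",
                  endpoint: str = LOCAL_ENDPOINT) -> urllib.request.Request:
    """Build a POST request for the local server, rejecting remote hosts."""
    host = urlparse(endpoint).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"Refusing to send student data to non-local host: {host}")
    # "stream": False asks for one complete JSON reply instead of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "mistral") -> str:
    """Send the prompt to the local model and return its text reply."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


if __name__ == "__main__":
    # Requires the local server to be running; nothing leaves this machine.
    print(ask_local_model("Give two pieces of feedback on this thesis statement: ..."))
```

Because the request only ever targets the loopback address, the same script should keep working with the network connection turned off, which is one simple way to check that student data is not leaving the computer.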

Professors need to be aware of this sort of solution to the aforementioned privacy problems, although it does come with its own challenges. It is a sixth option: one that removes the privacy risks of AI while losing less time and retaining the benefits of AI, at least if deployed properly.

I am working on a comprehensive guide to help professors — and other experts, like those at teaching and learning centers — navigate privacy issues related to AI. And you better believe that I will discuss this sixth option in this guide, unlike my October piece…

👀 What to Keep an Eye on:
AutomatED’s GPT Betas

For next week’s holiday edition of our Premium newsletter, we will be granting our Premium subscribers limited access to a Beta version of a custom GPT that we have built.

This custom GPT is easy to use and it is excellent at helping professors create innovative and engaging assignments and assessments in the age of AI, whether AI-inclusive or AI-exclusive. I am using it a lot already to improve my spring 2024 courses.

Eventually, once Sam Altman rights the OpenAI boat after being unceremoniously thrown off it, this AutomatED-built assignment maven will be released on the GPT store for anyone to purchase for a small fee. For now, though, it will be available for free only to our Premium subscribers (see below for three ways to get Premium).

And we have more GPTs under development, so stay tuned!

❓ An AI Question Mark:
Gemini for Handwriting Analysis

One of the selling points of Google’s new Gemini LLM — which is now the engine behind Bard — is that it was “built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.”

There is a lot of debate about whether Gemini offers anything unique with respect to multimodal capabilities. It looks like it is comparable to GPT-4 with vision (GPT-4V), according to analyses of their respective capabilities. To make matters worse, the initial rollout by Google featured a demonstration video that misrepresented Gemini’s capabilities to make them seem more impressive.

We are in the process of exploring the pedagogical upshots of the multimodal capabilities of Gemini and its competition. It remains to be seen whether they will be useful for professors out of the box, without significant integration with other software of the sort that companies like Graide have been building for a while now.

For instance, we have tried to get Gemini to complete one of the pedagogical tasks displayed in Google’s own documentation of Gemini’s capabilities, but it was not entirely successful.

The task is to evaluate a handwritten student answer to a physics question, complete with diagram. Here are the question and instructions for Gemini:

[image: a pedagogical challenge for Gemini from Google’s paper]

In short, the student is right that the energy of the skier is the same at the beginning and end (potential energy at the top is equal to kinetic energy at the bottom), but their mistake is to assume that the energy at the beginning is calculated with the formula “E = mgL.” Instead, the formula should be “E = mgH.” Otherwise, they are on the right track.

Here is Gemini’s initial response to this image and a prompt I added to it for clarity (“As I say in the image, the handwritten part of the image is the student's solution. I want you to reason about the question step by step. Did the student get the correct answer? If the solution is wrong, please explain what is wrong and solve the problem. Make sure to use LaTeX for math and round off the final answer to two decimal places, if the student got it wrong.”):

[image: Gemini’s initial response, via Bard]

But this is off target on a few fronts. First, the mass is not needed to solve for velocity, since it appears on both sides of the equation and thus cancels out. Second, the height and the length are given in the drawing (which does not show a 45-degree slope): the height is 40 m and the length is 80 m.

I informed Gemini of these issues, and it made the appropriate corrections:

[image: Gemini learns from its mistakes]
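For reference, the corrected solution can be written out as a short derivation. This is a sketch under assumptions that are not stated in this piece: the slope is frictionless, the skier starts from rest, and g = 9.81 m/s². Only the height (H = 40 m) and the length (L = 80 m) come from the drawing.

```latex
\begin{align*}
m g H &= \tfrac{1}{2} m v^{2}
  && \text{(energy at the top equals energy at the bottom)} \\
v &= \sqrt{2 g H}
  && \text{(the mass } m \text{ cancels)} \\
  &= \sqrt{2 \times 9.81\ \mathrm{m/s^{2}} \times 40\ \mathrm{m}} \\
  &\approx 28.01\ \mathrm{m/s}
\end{align*}
```

Under the same assumptions, the student's formula with L = 80 m in place of H would give roughly 39.62 m/s, which shows how much the choice between length and height changes the answer.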

To be sure, there are powerful use-cases in the future of these multimodal LLMs. We will continue to experiment to find them — it could be that our prompting needs optimizing, an integration with another software tool would address their shortcomings, or something else on our end is going wrong — but it seems that they need to be more reliable if professors are to gain a lot from them.

🔗 Links

Cheating Fears Over Chatbots Were Overblown, New Research Suggests

A.I. tools like ChatGPT did not boost the frequency of cheating in high schools, Stanford researchers say.

www.nytimes.com/2023/12/13/technology/chatbot-cheating-schools-students.html

Does Amazon Q have promise in K12?

While Amazon Q is pitched for enterprise use, we see its value for the school setting.

www.aithena.ai/p/extracredit

Tools vs Agents: The Dehumanizing Threat of Impersonal AI Educators

What’s forming are two distinct camps: those who view generative AI as tools, like ChatGPT, Bard, and Bing’s chatbot, vs. those who see generative AI as a step toward intelligent agents akin to second brains or workforce replacements.

marcwatkins.substack.com/p/tools-vs-agents-the-dehumanizing?r=1z9b3o&utm_campaign=post&utm_medium=web

Chatbot Hype or Harm? Teens Push to Broaden A.I. Literacy

Students at a New Jersey high school want to widen A.I. discussions beyond dueling tropes of tech magic and doomsday panic.

www.nytimes.com/2023/12/13/technology/ai-chatbots-schools-students.html

⬆️ How to Access Premium

Late in the fall of 2023, we started posting Premium pieces every two weeks, consisting of comprehensive guides, releases of exclusive AI tools like AutomatED-built GPTs, Q&As with the AutomatED team, in-depth explanations of AI use-cases, and other deep dives.

So far, we have three Premium pieces:

To get access to Premium, you can either upgrade for $5/month (or $50/year) or get one free month for every two (non-Premium) subscribers that you refer to AutomatED.

To get credit for referring subscribers to AutomatED, you need to click on the button below or copy/paste the included link in an email to them.

(They need to subscribe after clicking your link, or otherwise their subscription won’t count for you. If you cannot see the referral section immediately below, you need to subscribe first and/or log in.)
