99.9% Reliable AI Detection

Does it matter? Plus, drafting grants with Gemini and analyzing teaching with ChatGPT.

[image created with DALL·E 3 via ChatGPT Plus]

Welcome to AutomatED: the newsletter on how to teach better with tech.

Each week, I share what I have learned — and am learning — about AI and tech in the university classroom. What works, what doesn't, and why.

In this week’s piece, I discuss AI detection tools from OpenAI and Google, including whether I’d use them even if they were super reliable. Before that, I share some links of interest to higher educators, including one to a survey showing that 71% of undergraduates across 16 countries think their universities should train them to effectively use AI.

1. Flux, an open-source text-to-image creator that is comparable to industry leaders like Midjourney, was released by Black Forest Labs (the “original team” behind Stable Diffusion). It is capable of generating high-quality text in images (there are tons of educational use cases). You can play with it on their demo page, on Poe, or by running it on your own computer (tutorial here).

2. A Digital Education Council (DEC) survey of nearly 4,000 undergrads from 16 countries found that 71% agree or strongly agree that their universities should train them to effectively use AI; 86% use generative AI in their studies; 54% use it weekly (24% daily); 58% feel they lack sufficient AI knowledge/skills; and 48% said they did not feel prepared for the AI-enabled workplace. Full report downloadable here, summary from DEC here, succinct analysis with summative graphics here. (Other interesting — and large — recent surveys here, here, here.)

3. Philippa Hardman continues trying to answer the question “How Close is AI to Replacing Instructional Designers?”, this time focusing on creating course outlines. Again, her tentative conclusion: [ID + AI] > [non-ID + AI] > [ID - AI].

4. Meta released a demo of SAM 2, its open-source AI model that enables you to segment or select objects in videos or images (from a dude skateboarding to a hunk of dough being kneaded). See here for an interview with Joseph Nelson about computer vision and the implications of SAM.

5. Blackboard is all-in on AI, trying not to lose market share to Canvas and others. Discussion of their yearly conference here.

7. "People are raising doubts about AI Summer,” including Goldman Sachs and Sequoia Capital, with AI stumbles from Figma to Google.

🤖 New ChatGPT and Gemini 1.5 Pro Tutorials
— and other AutomatED Updates

At long last, I’m pleased to announce that I have now finished my Premium Tutorials on Google’s Gemini and OpenAI’s ChatGPT, namely on…

  • An illustration of how to prompt long-context LLMs, with a focus on Google’s 2-million token beast

  • I have used the research-backed method I recommend here for several purposes, including drafting a recent grant application and a section of the Tutorial itself (actual prompt included)

and

  • An illustration of how to prompt ChatGPT’s Vision and Advanced Data Analysis capabilities

Each Tutorial covers what these powerful AI tools are best at, with use case examples that are both directly applicable to higher educators and generalizable to other tasks we complete regularly.

These Tutorials follow on the heels of my prior Premium Guide on…

These mammoth releases — each is 5,000-7,000 words, with countless examples, prompts, and links — bring the Premium Archive up to 15 pieces.

September Changes to Premium

With the growth of the Archive, we are changing the structure of Premium, starting September 1. Here are the changes…

  1. The Monthly subscription option will be phased out, with all Monthly subscribers given a steep discount to upgrade to Annual subscriptions on August 26 (stay tuned if this is you)

  2. Annual subscriptions’ price will increase to $99 from $80 (upgrade here if you want to lock in the current price)

  3. One-off shareable Premium pieces’ price will increase from $5 to $7.50, and we will make all but the latest 2 Premium pieces purchasable in this way (at any given time)

The Return of AutomatED Webinars?

I am considering starting up bimonthly Zoom webinars again, like our “Build Your Own Custom GPT” webinar from this past spring. They would be held on Fridays and last an hour or two depending on the topic, with discounts for Premium subscribers. I could cover AI assistants like custom GPTs, training students to use AI, ethical issues related to AI use (and solutions), and more.

Let me know if you’d be interested.

Would you be interested in bimonthly AutomatED webinars, occurring on Fridays?


💡 Idea of the Week: 
Would You Use an AI Detection Genie?

Suppose you had a genie who you were certain was (almost) always correct when he judged that a bit of writing was AI-generated.

Would you rely on him as an educator to judge your students’ work?

What if his reasons for his judgments were completely opaque to you?

That is, suppose you had to tell any student you accused, “I don’t really know how he came to his judgment, but I know he is reliable. He’s been proven!”

Would you rely on the genie if his evidence were opaque to you and your students?

If you have time, tell me why after you click your answer.


According to a report by the Wall Street Journal, OpenAI has developed a method to reliably detect when ChatGPT is used for writing essays or research papers — supposedly, it is 99.9% effective.

The “watermarking” feature has been ready for release for a year but remains unreleased due to internal debates at the AI giant. (OpenAI is also at work on a “text metadata” approach that may be more promising, or so the company explained in a blog post update after the release of the WSJ report.)

With the watermarking feature, the company is weighing concerns about user retention — nearly a third of loyal ChatGPT users could be deterred by the feature, despite four in five people worldwide supporting its implementation — against its commitment to transparency. OpenAI has also expressed concern about potential disproportionate effects on non-native English speakers (who, it notes, may use AI to improve their English without any intention of using it for nefarious purposes). To make matters worse, the company said in its blog post that it is “trivial” for “bad actors” to circumvent the feature using other tools.

Meanwhile, Google has been at work on SynthID, a toolkit designed to watermark and identify AI-generated content across various modalities including text, images, audio, and video.

Like OpenAI’s solution, SynthID embeds watermarks directly into the token generation process of large language models. This technique subtly adjusts the probability scores of predicted tokens throughout the text generation, creating a pattern that serves as a watermark. Google claims this method works for as little as three sentences of text, with accuracy improving for longer passages.
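
For readers who want to see the mechanics, here is a minimal, self-contained sketch of probability-based watermarking in the spirit of the published “green list” scheme (Kirchenbauer et al., 2023). It is not OpenAI’s or Google’s actual algorithm, and the vocabulary, bias strength, and statistics below are invented for illustration. At each step, a portion of the vocabulary is deterministically marked “green” based on the previous token; green tokens get a small probability boost during sampling; and a detector later checks whether a text contains more green tokens than chance would predict.

```python
# Toy sketch of probability-based watermarking (green-list style).
# NOT OpenAI's or Google's actual algorithm; everything here is illustrative.
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(1000)]   # stand-in vocabulary
GREEN_FRACTION = 0.5                       # fraction of vocab marked "green" each step
BIAS = 2.0                                 # logit boost given to green tokens

def green_list(prev_token: str) -> set:
    """Deterministically pick this step's green tokens from the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))

def sample_next(prev_token: str, logits: dict, rng: random.Random) -> str:
    """Sample the next token after nudging green-token logits upward."""
    greens = green_list(prev_token)
    boosted = {t: l + (BIAS if t in greens else 0.0) for t, l in logits.items()}
    total = sum(math.exp(l) for l in boosted.values())
    r, acc = rng.random(), 0.0
    for tok, l in boosted.items():
        acc += math.exp(l) / total
        if acc >= r:
            return tok
    return tok  # numerical edge case: fall back to the last token

def green_z_score(tokens: list) -> float:
    """How many standard deviations above chance is the green-token count?"""
    n = len(tokens) - 1
    hits = sum(tokens[i + 1] in green_list(tokens[i]) for i in range(n))
    expected = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std

# "Generate" 200 watermarked tokens from a flat fake model, then compare.
rng = random.Random(0)
flat_logits = {t: 0.0 for t in VOCAB}
watermarked = ["tok0"]
for _ in range(200):
    watermarked.append(sample_next(watermarked[-1], flat_logits, rng))
unwatermarked = ["tok0"] + [rng.choice(VOCAB) for _ in range(200)]

print(green_z_score(watermarked))    # large positive z-score: watermark detected
print(green_z_score(unwatermarked))  # near zero: consistent with no watermark
```

Note that the detector only works because it shares the keying function used at generation time. In a real deployment that key would be held by the model provider, which is part of why, as I note below, verification would still run through OpenAI or Google.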

Now, back to the genie.

In a newsletter from over a year ago (“Do AI Detection Tools Work? Does it Matter?”), I grappled with the efficacy and implications of AI detection tools in higher education. In short, I proceeded as follows:

To assess the effectiveness of these tools, I conducted tests using Winston AI's detector. I fed it various AI-generated texts, including those from ChatGPT and Google's Bard, as well as human-written content.

The tests showed that Winston AI accurately identified both AI-generated and human-written text, even when I attempted to confuse it through various prompting techniques or by mixing AI and human content.

Despite these positive results, I remained uncertain about the broader reliability of such tools. I acknowledged that my limited testing couldn't conclusively validate Winston AI’s claims of 99% accuracy.

More fundamentally, I worried about the gap between facts and evidence in using these tools. As an educator, I argued that we are obligated to accuse our students only when the accusation is supported by transparently verifiable evidence (and, obviously, other conditions need to be met, too).

I harkened back to the old days of Googling quoted excerpts of student essays, finding the source on SparkNotes, and providing that source in an accusation.

In short, AI detection tools, operating as black boxes, fail to provide the kind of clear, common-ground evidence needed to support accusations of academic dishonesty, or so I argued.

They are like the genie: supposedly extremely reliable, but we can point to nothing mutually evaluable when we claim that they are.

As a result of this dynamic, I expressed concern that relying on these tools could erode trust between professors and students, potentially damaging the learning environment — even more so than making accusations of academic dishonesty in the first place.

I concluded that these complications should incentivize professors to avoid assignments susceptible to AI plagiarism or to design tasks that explicitly incorporate AI tools. Better off avoiding the situation entirely, at least when feasible. (Hence the birth of our first ✨Premium piece, my Guide on How Professors Can Discourage and Prevent AI Misuse.)

So, where do I stand now?

One issue is whether OpenAI’s — or Google’s — unreleased watermarking solution is similar to the genie or not.

If watermarking embeds identifiable patterns directly into the generated text — patterns that can later be revealed in a way all can see — it could provide more concrete evidence of AI usage, bridging the gap between facts and verifiable proof that I found so troubling before.

Light in the black box.

The core bit of progress here is that these technologies are integrated at the source of AI text generation.

On the other hand, it is hard to see how we won’t still be heavily reliant on OpenAI or Google. Even with watermarking, we're still relying on technological solutions that aren't fully transparent to or evaluable by most users.

Yet, a bigger issue is that reliable and evaluable watermarking doesn't help differentiate between permissible and impermissible uses of AI tools.

As educators, we're still left with the challenge of defining and identifying academic dishonesty in an AI-augmented world.

We must continue to grapple with fundamental questions about the role of AI in education, the nature of original work in an AI-assisted world, and how to maintain academic integrity without eroding trust or stifling innovation.

Given AutomatED’s content, you won’t be surprised to hear that I believe our focus should shift from trying to detect AI use to ethically and productively integrating AI into the learning process. Rather than relying solely on detection and punishment, we should generally be designing curricula and assessments that acknowledge the reality of AI tools and teach students to use them responsibly and effectively.

With that said, there are surely cases when we should try to discourage and prevent AI use, like when we are trying to develop our students’ AI-independent judgment (for a more in-depth discussion of this point, see here).

Students will often need such judgment to use AI well, at least until AI becomes a super-reliable genie at producing high-quality work in philosophy, economics, biology, or …

✉️ What You, Our Subscribers, Are Saying

We got a lot of responses to our poll in our last newsletter on “How to Cite AI”:

When you teach next, will you require your students to cite their use of AI?

“If it is used, it must be disclosed in some manner.”

Anonymous Subscriber

And to return to the question from the newsletter before that:

Will you require your students to purchase an AI tool next semester?

“Our University has access to Microsoft Copilot so we will use that. Plus they can access GPTs I created for free, as you mentioned in your article.”

Anonymous Subscriber

What'd you think of today's newsletter?


Graham

Expand your pedagogy and teaching toolkit further with ✨Premium, or reach out for a consultation if you have unique needs.

Let's transform learning together.

Feel free to connect on LinkedIn, too!