Banning AI in Final Exams and Creating Your Own GPT Evaluator

I reflect on (summative) final exams' technology constraints and show how to create a GPT to help grade.

[image created with Dall-E 3 via ChatGPT Plus]

Welcome to AutomatED: the newsletter on how to teach better with tech.

Each week, I share what I have learned — and am learning — about AI and tech in the university classroom. What works, what doesn't, and why.

In this week’s piece, I explain why I think we should all be reconsidering final exams’ technology constraints, and I convert last week’s AI use-case into a dedicated GPT.

💡 Idea of the Week:
Reconsider Final Exams’ Constraints

I know many of you are in the midst of preparing for or administering final examinations. As I prepare for the two finals that I am administering for my classes this semester, I am reflecting on their constraints — that is, the rules to which my students must conform in taking these exams. Can the students bring their notes? Can they use technology? More than ever, the latter question is pressing.

I have two conflicting temptations, both predicated on the assumption that my final exams are summative (i.e., designed to assess whether students have mastered the primary aspects of my courses).

First, I want to continue to lean into allowing my students to constructively and productively use AI tools to create their submissions for my courses’ assignments and assessments, where ‘constructive’ and ‘productive’ are understood in alignment with my position on solutions to what I have called “the AI plagiarism problem” and my prior discussion of when professors should make room for AI training in their teaching. This temptation manifests as a desire to allow devices, internet access, and AI use in my final exams, insofar as these tools let students best express the skills, knowledge, and intellectual virtues that I expect them to have after completing my class. AI-assisted or not, can they consistently produce high-quality work?

Second, I want to see where my students stand with respect to the skills, knowledge, and intellectual virtues that I want to inculcate in them, supposing that they cannot use AI and other technological tools. How dependent are they on AI to create submissions to my assignments and assessments at the standards of “high quality” that I expect of them?

This sort of dilemma becomes more pressing for us professors as various technological tools — AI foremost amongst them — become more capable. The more powerful the tools, the more tempting it is to center their effective use in one’s assessments, but also the more tempting it is to worry that one’s students are overly dependent on them.

After reflecting on this dilemma, I have come to a tentative conclusion about the constraints of my final exams. This tentative conclusion is my idea of the week: if I cannot justify why it is significantly worse, all things considered, for students of a given class to be dependent on technological tools to produce high-quality work in that class, then I should allow the use of those tools in my final exam for that class. In other words, unless, from my perspective, the costs of student dependency on AI tools to produce high-quality work significantly outweigh the benefits, I should allow the tools in my exams.

One such cost arises if students will need to perform without AI tools in their future jobs (e.g., because they will be working in environments without access to the tools). Another arises if effective AI tool use requires judgment that must be, or is best, assessed independently of AI tool use.

The qualifier ‘significantly worse’ is included because I think I should err on the side of allowing technology, given its rapidly increasing role in my field, the workplace, and society more generally.

One disclaimer: allowing technology use in a final exam introduces other complications, like those related to accessibility and the possibility that students attempt to communicate with one another via their devices.

All things considered, I am not allowing my students to use technology in either of my final exams this semester (although they are trained to use AI and encouraged to use it in several other assessments), but I will continue to reflect on whether I can justify this position as I reflect on what these assessments are intended to achieve.

🧰 An AI Use-Case for Your Toolbox:
Custom GPTs for Evaluating Major Assignments

Last week, I explained how a professor can use ChatGPT4 as a first-pass evaluator or grader of student work. This week, my AI use-case for your toolbox is a conversion of last week’s advice into a GPT.

If you have not heard, OpenAI announced in early November that you can “create custom versions of ChatGPT that combine instructions, extra knowledge, and any combination of skills.” We at AutomatED are working on building some deluxe GPTs for general purposes, like grading and mentoring. We hope to release these to the “GPT Store” that OpenAI plans to launch in 2024. In the meantime — and regardless of the utility of AutomatED’s GPTs — we recommend that professors experiment with creating GPTs for a range of instructional purposes.

For instance, you could create a GPT that more reliably performs the first-pass evaluation or grading that I described last week. You could also create one for in-class activities (e.g., providing feedback on student submissions to the students themselves). The idea is to create a specialized GPT that is excellent at completing a given task, rather than relying on a chat — with its running conversational context and limited context window — to do so. Here’s how you would go about doing this…

NOTE: this use-case refers to ChatGPT4, which requires a $20/month subscription. If you don’t have one already, take it from me: it’s worth it. But if you don’t want to splurge yet, use Poe in the meantime.

Step One: Exemplar, Anonymize/Pseudonymize, and Prompt

If you completed a process like that outlined in last week’s piece, you created an exemplar (an example of the grading/feedback output you expect from ChatGPT4 for the assignment you are interested in), you anonymized or pseudonymized your students’ submissions (and/or got consent from them), and you perfected your prompts to get ChatGPT4 to generate the outputs you needed.

In my case, I wanted ChatGPT4 to help ease the cognitive load of parsing my students’ papers so that I could transition from one oral exam to another more quickly and effectively. Since my oral exams consist of six initial questions based on the specifics of each student’s paper (with follow-ups customized to the student’s responses), I wanted ChatGPT4 to help me think of good questions — or at least get me going in the right direction and notice things I might otherwise have missed. In the process of prompting and iterating with ChatGPT4 in a dedicated chat, I determined the sorts of information that I needed to supply to get the outputs I wanted. With a good grip on these instructions, I can now create a dedicated GPT for the task.
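If you want to script the anonymization/pseudonymization step rather than doing it by hand, a few lines of Python suffice. Here is a minimal sketch, assuming a hypothetical roster and folder layout (the names, paths, and mapping below are illustrative, not part of my actual workflow):

```python
# A minimal pseudonymization sketch: swap each student's real name for a
# stable pseudonym before any text is sent to ChatGPT. The roster and the
# folder names below are hypothetical.
import re
from pathlib import Path

# Hypothetical roster mapping real names to pseudonyms; keep it offline.
ROSTER = {
    "Jane Doe": "Student A",
    "John Smith": "Student B",
}

def pseudonymize(text: str, mapping: dict[str, str]) -> str:
    """Replace each real name (whole words, case-insensitive) with its pseudonym."""
    for real, fake in mapping.items():
        text = re.sub(rf"\b{re.escape(real)}\b", fake, text, flags=re.IGNORECASE)
    return text

out_dir = Path("pseudonymized")
out_dir.mkdir(exist_ok=True)

# Process every plain-text submission in the hypothetical "submissions" folder.
for path in Path("submissions").glob("*.txt"):
    cleaned = pseudonymize(path.read_text(encoding="utf-8"), ROSTER)
    (out_dir / path.name).write_text(cleaned, encoding="utf-8")
```

Keep the roster file out of anything you upload; the whole point is that only the pseudonymized copies ever reach the model.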

Step Two: Create a GPT

To create a GPT, start by clicking the “Explore” button in the upper left corner of your chat interface.

Then click the “Create a GPT” button.

You can then either prompt ChatGPT to help you build your GPT or “Configure” it yourself. Here is what my “Configure” screen looked like once I input the successful instructions from my aforementioned chat (the one I discussed last week):

If I had not already determined the basic contours of effective instructions for this assignment, I would need to experiment more with these instructions going forward.

Below the instructions, I uploaded the exemplars as well as the assignment’s prompt and rubric. I also checked the “Code Interpreter” checkbox (to allow uploads in prompts to the GPT) and unchecked the data harvesting checkbox:
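None of this requires code, since the GPT builder is entirely point-and-click. But if you prefer a programmatic route, OpenAI’s Assistants API (announced alongside GPTs) exposes roughly the same pieces: instructions, knowledge files, and the Code Interpreter tool. Here is a minimal sketch using the Python SDK and the beta API as it existed at launch; the file names, assistant name, and instruction text are hypothetical stand-ins for your own:

```python
# A rough programmatic analogue of the "Configure" screen, via OpenAI's
# Assistants API (beta, late 2023). File names and instructions are
# hypothetical stand-ins; adapt them to your own assignment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

# Upload the "knowledge": the exemplar, the assignment prompt, and the rubric.
file_ids = []
for name in ["exemplar.pdf", "assignment_prompt.pdf", "rubric.pdf"]:
    with open(name, "rb") as f:
        file_ids.append(client.files.create(file=f, purpose="assistants").id)

assistant = client.beta.assistants.create(
    name="First-Pass Oral Exam Question Drafter",
    instructions=(
        "You help a professor prepare oral exams. For each uploaded student "
        "paper, draft six initial questions grounded in the paper's specific "
        "claims, guided by the attached rubric and modeled on the exemplar."
    ),
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],  # analogous to the checkbox above
    file_ids=file_ids,
)
print(assistant.id)  # save this ID to reuse the assistant later
```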

Step Three: Tinker

Before publishing the GPT (you can publish it to yourself, to anyone who has the link, or publicly), you can “Preview” its outputs on the right side of the screen. In my case, I uploaded some student papers to see if I would get the same sorts of results that I had gotten with similar instructions in the aforementioned chat.

If you don’t get the outputs you want, modify the instructions, the uploaded files, or the other settings you have selected. You could instead add information to the prompts you offer the GPT in the “Preview”, but the whole point of the GPT is to streamline this process, so ideally you would polish the GPT until it generates the outputs you want with little added effort.
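Incidentally, the API analogue of the “Preview” pane is a thread plus a run. Continuing the hypothetical sketch above, here is how you might test the assistant on one pseudonymized paper:

```python
# Continuing the hypothetical sketch above: test the assistant on one
# paper, much as you would in the "Preview" pane.
import time

with open("pseudonymized/student_a.txt", "rb") as f:
    paper = client.files.create(file=f, purpose="assistants")

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Draft six oral-exam questions for the attached paper.",
    file_ids=[paper.id],  # beta-API parameter for attaching uploads
)

run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status in ("queued", "in_progress"):
    time.sleep(2)  # simple polling; the beta API has no built-in waiter
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# Print the conversation oldest-first (the API returns newest-first).
for message in reversed(client.beta.threads.messages.list(thread_id=thread.id).data):
    for part in message.content:
        if part.type == "text":
            print(f"{message.role}: {part.text.value}")
```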

Step Four: Profit

The primary advantage of creating a GPT for this purpose rather than using an ongoing chat is control or reliability. Once you perfect the instructions and the uploaded background files (i.e., the “knowledge”), you needn’t worry about your conversation evolving in ways that result in undesirable outputs. ChatGPT’s chats do not have unlimited context windows, so as you continue to engage with them, they reach a point where they can no longer access some of your earlier exchanges or instructions. A GPT avoids this problem: its instructions and knowledge are built in, so they apply afresh to every new conversation.

Another advantage is that you can share a GPT with others. If you have a team of TAs helping you, you or they can develop a GPT that is shared amongst the group to help with grading or evaluation. This not only saves time and improves the quality of outputs, but it also ensures that the various graders/evaluators are closer to being on the same page.

In next week’s piece, I will discuss creating GPTs for in-class use…

👀 What to Keep an Eye on:
Premium Assignment Creation Guide Next Week

On Wednesday of next week — December 13th — our next Premium piece will be released to those who are subscribed to Premium (which is $5 per month or $50 per year), including those who have referred two or more subscribers (see below).

In this comprehensive guide, we will outline the considerations professors should take into account as they design assignments and assessments in the age of AI.

For those of you who have referred two or more subscribers (including those who did so in the more distant past), we will reach out early next week with your Premium access.

⬆️ How to Access Premium

Late in the fall of 2023, we started posting Premium pieces every two weeks, consisting of comprehensive guides, releases of exclusive AI tools like AutomatED-built GPTs, Q&As with the AutomatED team, in-depth explanations of AI use-cases, and other deep dives.

So far, we have three Premium pieces:

To get access to Premium, you can either upgrade for $5/month (or $50/year) or get one free month for every two (non-Premium) subscribers that you refer to AutomatED.

To get credit for referring subscribers to AutomatED, you need to click on the button below or copy/paste the included link in an email to them.

(They need to subscribe after clicking your link; otherwise their subscription won’t count for you. If you cannot see the referral section immediately below, you need to subscribe first and/or log in.)