✨Tutorial: How to Draft Grant Applications with Gemini
Ways educators can best use its massive context window.
Welcome to AutomatED: the newsletter on how to teach better with tech.
Each week, I share what I have learned — and am learning — about AI and tech in the university classroom. What works, what doesn't, and why.
In this fortnight’s Premium edition, I present a tutorial for prompting Gemini 1.5 Pro, with a focus on the capability that sets it apart: its massive context window (currently 2 million tokens!).
I show how to leverage this capability across many educational use cases by walking through one task that is common in higher ed:
drafting/writing grant applications
Before that, I provide some advice on prompting Gemini in general — advice that applies and will apply to other LLMs as their context grows.
👉 Gemini Primer
The standout feature of Google’s Gemini 1.5 Pro is its context window. While other models typically max out at 200,000 tokens or fewer, Gemini 1.5 Pro accepts up to 2 million tokens.
This leap in context size does not come at the cost of Gemini 1.5 Pro’s frontier capabilities in reasoning, summarization, and so on, or of its multimodality. That is, it is comparable to other leading models, like GPT-4o or Claude 3.5 Sonnet, but it can handle much larger inputs. As Google researchers put it…
Importantly, this leap in long-context performance does not come at the expense of the core multi-modal capabilities of the model. Across a extensive battery of evaluations, Gemini 1.5 Pro greatly surpass[es] Gemini 1.0 Pro. These include core capabilities such as Math, Science and Reasoning, Multilinguality, Video Understanding, Natural Image Understanding, Chart and Document Understanding, Multimodal Reasoning, Code, and more. These evaluations additionally evaluate on a series of “agentic” tasks including Function Calling, planning and in-the-wild long-tail real world use cases such as improving job productivity for professionals. These advances are particularly striking when benchmarking against Gemini 1.0 Ultra, a state-of-the-art model across many capabilities.
As a consequence, the same prompting strategies that work with shorter-context LLMs are important to remember and utilize when prompting Gemini 1.5 Pro.
But which prompting strategies work well in general when using LLMs? As I detail in my ✨Guide on How to Train Students to Use AI, an effective prompt tends to do the following:
Encourage the LLM to take on a role, preferably of a relevant expert, with information about what this amounts to.
Encourage the LLM to work step-by-step, whether "prior" to producing an output or through a sequence of outputs that show its work.
Make clear the context of the user's needs.
Enumerate the user's objectives.
Clarify the practical constraints, like the categories or length of output one needs.
Outline the desired format, tone, or style for the LLM's outputs.
Give exemplars of prompt-response pairs (illustrating success).
Present sufficient detail about the above.
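The elements above can be sketched as a small prompt builder. This is a minimal illustration, not a prescribed template: the `PromptSpec` class, the `build_prompt` function, the section labels, and the sample grant-writing text are all hypothetical names and content chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt elements listed above."""
    role: str
    context: str
    objectives: list[str]
    constraints: list[str] = field(default_factory=list)
    output_format: str = ""
    examples: list[tuple[str, str]] = field(default_factory=list)
    step_by_step: bool = True


def build_prompt(spec: PromptSpec) -> str:
    """Assemble the elements into one labeled, plain-text prompt."""
    sections = [
        f"ROLE\n{spec.role}",
        f"CONTEXT\n{spec.context}",
        "OBJECTIVES\n"
        + "\n".join(f"{i}. {obj}" for i, obj in enumerate(spec.objectives, 1)),
    ]
    if spec.constraints:
        sections.append(
            "CONSTRAINTS\n" + "\n".join(f"- {c}" for c in spec.constraints)
        )
    if spec.output_format:
        sections.append(f"FORMAT AND TONE\n{spec.output_format}")
    # Exemplars are prompt-response pairs that illustrate success.
    for n, (example_prompt, example_response) in enumerate(spec.examples, 1):
        sections.append(
            f"EXAMPLE {n}\nPrompt: {example_prompt}\nResponse: {example_response}"
        )
    if spec.step_by_step:
        sections.append(
            "METHOD\nWork step by step and show your reasoning before the final answer."
        )
    return "\n\n".join(sections)


# Illustrative values only; replace with your own details.
spec = PromptSpec(
    role="You are an experienced grant writer at a research university.",
    context="I am a professor preparing a proposal for a teaching innovation grant.",
    objectives=["Draft a project summary.", "List open questions for me to answer."],
    constraints=["Keep the summary under 300 words."],
    output_format="Plain prose, formal but accessible.",
    examples=[("Summarize project X.", "Project X aims to ...")],
)
prompt = build_prompt(spec)
print(prompt)
```

Keeping each element in its own labeled section makes it easy to see which ones a given prompt is missing, and to adjust one element at a time when refining a prompt based on earlier responses.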
Given that many cases require a sequence of prompts to achieve desired results, effective prompting also requires a grip on how to break down complex queries into steps or sequences of directives. This skill involves learning how to refine and adjust prompts based on initial responses, ask follow-up questions, or request clarifications.
There are countless guides to effective LLM prompting in general that explain these rules of thumb in greater detail. I recommend reading them if you don’t feel confident deploying these strategies, because they are all relevant to using Gemini 1.5 Pro.
That said, there are two ways in which prompting with long context is special:
Some best practices for prompting in general become more important.
Some tricks that aren’t necessary or impactful in short context become relevant.
I cover these in the next section.
🚜 Gemini’s Long Context Capability
General Description & Example Use Cases
Gemini 1.5 Pro's long context capability allows users to input massive amounts of text, code, audio, image, and video data alongside prompts, enabling the model to process, analyze, and reason about extensive content in a single frame of reference.
The net result is that Gemini 1.5 Pro enables educators to synthesize and apply information in ways that were previously impractical or impossible because they were too time-consuming. It’s almost as if Gemini 1.5 Pro can act as a graduate student assistant to help with these sorts of tasks. For example, long context enables a range of tasks, such as:
Processing large files: You can upload large PDFs, extensive code repositories, or lengthy videos alongside prompts.
Improved long-form analysis: The model can maintain coherence and accuracy over extremely long inputs, making it ideal for tasks requiring deep comprehension of extensive materials.
"Needle-in-a-haystack" recall: Gemini 1.5 Pro demonstrates near-perfect recall of specific information even when it's buried within millions of tokens of context.
In-context learning: The model can effectively learn new tasks or even rudiments of a new language from instructional materials provided within its vast context window.
Enhanced multimodal capabilities: Gemini 1.5 Pro can reason across text, code, audio, image, and video inputs within a single context.
To put it more concretely, here are some ways professors, learning specialists, and other higher educators can use Gemini 1.5 Pro's long context capability to enhance their work:
Analyze entire academic papers or textbook chapters: Educators can input multiple scholarly articles or book chapters for in-depth analysis, literature reviews, or to generate comprehensive summaries and discussion points.
Engage with sizable multimedia course materials: The model can process and analyze long lecture transcripts, video content, or audio recordings, helping educators create summaries, extract key points, develop supplementary materials, or compare performances.
Conduct comprehensive literature reviews: By inputting multiple academic papers or research articles, educators can use Gemini 1.5 Pro to identify trends, synthesize findings, and generate literature review drafts across large sets of academic texts.
Develop complex, multifaceted lesson plans: Educators can input various source materials, including textbooks, research papers, and multimedia content, to help create rich, interconnected lesson plans that draw from diverse sources.
Facilitate in-context learning of new subjects: Gemini 1.5 Pro can effectively learn new tasks — or even a new language — from instructional materials provided within its vast context window, potentially assisting in curriculum development for niche or rapidly evolving subjects.
Process long-form student submissions: Gemini 1.5 Pro can assist in reviewing and providing feedback on extensive student work such as theses, dissertations, or lengthy research papers, maintaining consistency and attention to detail throughout.
In what follows, I explain how to prompt Gemini 1.5 Pro to get it to perform at its best. I first provide a general discussion of prompting long-context LLMs, which also applies to other LLMs if you push them to their limits (like Claude 3.5 Sonnet with its 200,000-token context window). In the subsequent section, I turn to the concrete application of this discussion to drafting grant applications.
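Before loading materials into a prompt, it can help to check roughly whether they fit in a given context window. The sketch below is a back-of-the-envelope estimate only. It assumes about 4 characters per token for English text, which is a rule of thumb rather than an exact tokenizer, and the function names and the `reserve_tokens` parameter are hypothetical.

```python
CHARS_PER_TOKEN = 4  # assumption: rough average for English text, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts: list[str], window_tokens: int, reserve_tokens: int = 0) -> bool:
    """Check whether the texts, plus tokens reserved for the reply, fit in the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_tokens <= window_tokens


# Illustrative stand-in for a large set of uploaded materials (1,000,000 characters).
materials = ["x" * 1_000_000]

print(estimate_tokens(materials[0]))                            # 250000
print(fits_in_context(materials, window_tokens=2_000_000))      # True
print(fits_in_context(materials, window_tokens=200_000))        # False
```

For an exact count, use the token-counting facility of whichever model you are working with, since tokenization differs across models and across file types such as audio and video.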
Prompting Long Context
Effectively leveraging Gemini 1.5 Pro's long context capability requires a structured approach.
Structure is the most crucial aspect of prompting long context because it becomes easier and easier for the LLM to go down the wrong road as the context grows in size. Structure makes clearer to the LLM how all the parts of your prompt and your files relate to one another.
Of course, the content placed within the structure matters, too, and I will discuss it as well.
I have used the following approach for a range of long context tasks, from drafting grant proposals for large federal grants to drafting sections of this very Tutorial (see below for the prompt I used to draft the first section).
And it’s not just my own experimentation that leads me to this approach. It is a generalized version of “Corpus-in-Context Prompting,” a method that Google DeepMind researchers Lee et al. found to be superior in their extensive empirical tests of long context prompting.
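As a rough illustration of what structure can mean in practice, the sketch below lays out a long prompt in labeled parts: instructions first, then each document wrapped in delimiters that carry an ID and title, then the request. This layout is an assumption made for illustration. It is not the specific sequence of steps used in this Tutorial, nor the exact format tested by Lee et al., and the function name, labels, and sample text are hypothetical.

```python
def build_long_context_prompt(
    instructions: str,
    documents: list[tuple[str, str]],
    request: str,
) -> str:
    """Lay out instructions, ID-tagged documents, and the request as labeled parts."""
    parts = [f"INSTRUCTIONS\n{instructions.strip()}", "DOCUMENTS"]
    for doc_id, (title, text) in enumerate(documents, 1):
        # Matching begin/end markers show where each document starts and stops.
        parts.append(
            f"=== BEGIN DOCUMENT {doc_id}: {title} ===\n"
            f"{text.strip()}\n"
            f"=== END DOCUMENT {doc_id} ==="
        )
    # Assumption: the request goes last, after all of the documents.
    parts.append(f"REQUEST\n{request.strip()}\nCite documents by their ID number.")
    return "\n\n".join(parts)


# Illustrative placeholder content only.
long_prompt = build_long_context_prompt(
    instructions="You are helping a professor draft a grant application.",
    documents=[
        ("Call for proposals", "Full text of the funder's call goes here."),
        ("Prior funded proposal", "Full text of an earlier proposal goes here."),
    ],
    request="Draft a one-page project summary that follows the call's requirements.",
)
print(long_prompt)
```

Giving every document an ID lets the request, and the model's answer, refer to specific sources unambiguously, which matters more as the number of files grows.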
In the next section, I will show each step in this structured process — as well as prompt examples that illustrate them — but here is a summary: