So are they loading every exam in its entirety into the same context window and doing the whole thing with a single prompt? Or letting the LLM take arbitrary actions instead of only returning a value? It seems like they would have had to make some pretty bad decisions for this to be possible, beyond maybe just adjusting the one grade. I wonder what the prompt jailbreak equivalents of sql injection would actually look like
So are they loading every exam in its entirety into the same context window and doing the whole thing with a single prompt? Or letting the LLM take arbitrary actions instead of only returning a value? It seems like they would have had to make some pretty bad decisions for this to be possible, beyond maybe just adjusting the one grade. I wonder what the prompt jailbreak equivalents of sql injection would actually look like