Super cool approach. If someone had described it to me without showing the data, I wouldn't have guessed it would be this effective.
I'm curious how easy it is to "defeat". If you take an AI-generated text that is identified with high confidence and superficially edit it to include something an LLM wouldn't usually generate (like a few deliberate spelling errors), is that enough to push the text out of high confidence?
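If I were poking at it, I'd try something like this (a minimal sketch; the `detector.score` call at the bottom is a hypothetical stand-in for whatever classifier is actually being tested):

```python
import random

def inject_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Swap adjacent characters in roughly `rate` of the words,
    mimicking the kind of typo an LLM rarely produces."""
    rng = random.Random(seed)
    out = []
    for word in text.split(" "):
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(len(word) - 1)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        out.append(word)
    return " ".join(out)

# Hypothetical usage -- substitute the detector you're testing:
# before = detector.score(ai_text)
# after = detector.score(inject_typos(ai_text))
```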
I ask because I work in higher ed, and I've been sitting on the sidelines watching the chaos. My understanding is that there's probably no way to automate LLM detection with high enough certainty for it to be used as cheating detection in an academic setting; the false positive rate is just too high.
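Back-of-envelope, with entirely made-up numbers, to show why false positives dominate here:

```python
# All numbers below are assumptions for illustration only.
fpr = 0.01          # assumed false positive rate (1%)
tpr = 0.95          # assumed true positive rate
cheat_rate = 0.05   # assumed fraction of submissions that are AI-written
essays = 10_000

false_accusations = (1 - cheat_rate) * essays * fpr  # 95 honest students flagged
true_catches = cheat_rate * essays * tpr             # 475 cheaters caught
precision = true_catches / (true_catches + false_accusations)
print(f"{false_accusations:.0f} false accusations; precision {precision:.0%}")
```

Under those assumptions, even a 1% false positive rate means roughly one in six accusations lands on an honest student, which is untenable for academic discipline.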

