The fact that people are still treating it like entirely raw text input is insane to me. If you have a document, have a separate input for the user to paste/upload data, and then another for the user's instruction.
That allows you to do things like chunk the document while leaving the rest of their instruction alone, or do a sliding window of just the document while your instruction stays static.
Or you just use a quick model to check if there area instructions at the end and bring it to the beginning.