Yes, you can use packages such as PDF::API2 to “dumpster-dive” quite a ways into the “guts” of a PDF file, but if you can identify defects from the text content of the file, your ugly approach might be the most cost-effective. The internal content of a PDF can be beastly unpredictable, making it difficult to write reliable logic to track down problems.
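For what it's worth, the text-scanning approach can stay very small. Here is a rough sketch in Python (the same idea ports directly to Perl), assuming the text has already been pulled out of the PDF by some external tool. The pattern names and regexes below are purely hypothetical stand-ins for whatever actually marks a defective file in your case, and `find_defects` is just an illustrative name:

```python
import re

# Hypothetical defect signatures (assumptions for illustration only);
# replace them with whatever actually identifies a bad file.
DEFECT_PATTERNS = {
    "missing_total": re.compile(r"Total:\s*$"),          # label with no value after it
    "placeholder_left_in": re.compile(r"\{\{.*?\}\}"),   # unfilled template field
}

def find_defects(text, patterns=DEFECT_PATTERNS):
    """Return (line_number, label, stripped_line) for every line matching a pattern."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in patterns.items():
            if pattern.search(line):
                hits.append((line_no, label, line.strip()))
    return hits

if __name__ == "__main__":
    sample = "Invoice 17\nTotal: \nDear {{name}},\nTotal: 42.00\n"
    for hit in find_defects(sample):
        print(hit)
```

Keeping the patterns in one table means a newly discovered defect costs one more line, not a new trip into the PDF's object structure.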
And in appropriate status meetings, keep oh-so-politely mentioning the ¢o$t of the fact that this system is still not working as the business has every reason to expect. Every hour spent ... opportunity costs ...