I've been finding more and more that small changes in my code are making huge changes in the output and trying to continuously update the tests to exactly match...
I don't know your problem domain, but would it be possible to test for other (higher-level) invariants besides an exact textual match? For instance, maybe you could just check that a transformation preserves syntactic correctness, or that all tags are balanced, properly nested, or conserved, etc.
Let's take a different example. If you are creating a ray tracer, you could have a unit test that renders a red sphere on a black background and then compares that bitmap to a correctly rendered red sphere bitmap, checking that each pixel is correct. Or, you could instead write tests like...
- Render a red sphere. Are any pixels in the bitmap red?
- Render a red sphere with a large radius. Are there more red pixels than with a small radius?
- Render a red sphere that is far away. Does it produce fewer red pixels than one that is close to the camera?
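To make that concrete, here is a rough sketch of what those three checks might look like in Python. The `render_sphere` function is a toy stand-in I made up for illustration (one ray per pixel, pinhole camera at the origin, no shading), and `count_red` is a hypothetical helper; your real renderer and its API will look different. The point is only that none of the tests compares against a stored reference bitmap.

```python
import math

RED = (255, 0, 0)
BLACK = (0, 0, 0)


def render_sphere(radius, distance, size=64):
    """Toy renderer (hypothetical, stands in for the real one).

    Casts one ray per pixel from a camera at the origin looking down +z
    at a red sphere centred at (0, 0, distance). Assumes distance > radius,
    i.e. the camera is outside the sphere.
    """
    image = []
    for row in range(size):
        line = []
        for col in range(size):
            # Pixel centre on an image plane at z = 1 spanning [-1, 1].
            x = (col + 0.5) / size * 2 - 1
            y = (row + 0.5) / size * 2 - 1
            # z component of the normalised ray direction.
            dz = 1 / math.sqrt(x * x + y * y + 1)
            # Ray/sphere test: the ray hits when (d.c)^2 - (|c|^2 - r^2) >= 0,
            # where c = (0, 0, distance), so d.c = dz * distance.
            b = dz * distance
            disc = b * b - (distance * distance - radius * radius)
            line.append(RED if disc >= 0 else BLACK)
        image.append(line)
    return image


def count_red(image):
    """Number of red pixels in a rendered image."""
    return sum(pixel == RED for line in image for pixel in line)


def test_some_pixels_are_red():
    assert count_red(render_sphere(radius=1.0, distance=5.0)) > 0


def test_larger_radius_gives_more_red():
    small = count_red(render_sphere(radius=1.0, distance=5.0))
    large = count_red(render_sphere(radius=2.0, distance=5.0))
    assert large > small


def test_farther_sphere_gives_less_red():
    near = count_red(render_sphere(radius=1.0, distance=5.0))
    far = count_red(render_sphere(radius=1.0, distance=10.0))
    assert far < near
```

Tests written this way should keep passing if you tweak anti-aliasing, shading, or resolution, since they only pin down relationships between outputs rather than exact pixel values.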
Does that make any sense?