There's overhead in calling the subs in the first place. Doubly so for method calls. Cut down on the number of calls.
Examine your algorithms to determine why it's necessary to call FTS::printto()/assign() an order of magnitude more times than anything else in your program. Can the work be put off, or done in batches? Can you leave yourself helpful references in the data so that less looping/searching is necessary (fewer arrays, more hashes)? Is there an opportunity to memoize functions somewhere? Can you prepare the data better during FTS::parsefile (or new_fromfile or something) so that it's more easily dealt with later on?
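To make two of those questions concrete, here is a rough sketch of "more hashes" and memoization. The record layout and the sub names (find_by_scan, find_by_hash, expensive) are made up for illustration; I don't know what your FTS data actually looks like, so treat this as the shape of the idea rather than a drop-in fix. Memoize is a core module.

```perl
use strict;
use warnings;
use Memoize;

# Hypothetical records standing in for whatever the parse step produces.
my @records = map { +{ id => "rec$_", value => $_ * 2 } } 1 .. 1000;

# Slow: walks the whole array on every lookup.
sub find_by_scan {
    my ($id) = @_;
    my ($hit) = grep { $_->{id} eq $id } @records;
    return $hit;
}

# Faster: build the index once, then each lookup is one hash access.
my %by_id = map { $_->{id} => $_ } @records;

sub find_by_hash {
    my ($id) = @_;
    return $by_id{$id};
}

# Memoizing a pure function: repeated calls with the same argument
# are answered from a cache instead of re-running the body.
my $calls = 0;

sub expensive {
    my ($n) = @_;
    $calls++;          # counts how often the real body runs
    return $n * $n;    # placeholder for real work
}
memoize('expensive');

# Five calls with the same argument; the body should only run the first time.
my @results = map { scalar expensive(12) } 1 .. 5;
```

Building %by_id costs one pass over the data, so it only pays off if you look things up more than a handful of times. Likewise, memoizing only makes sense for subs whose result depends on nothing but their arguments.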
Beware of premature optimization.