It does seem to be the emoticon: I found a proposed regex
here, and it got me past the offending tweet. However, the script croaked on a later tweet:
Incorrect string value: '\xF3\xBE\x94\x9F \xD0...' for column 'orig' at row 1 Кол мне в лоб!
Разыгрываем кровь (почти как настоящая!) и кол в лоб.
С помощью этих спецэффектов образ для... 392907622149791744 at /home/steve/load-tweets.pl line 64.
I don't know how to modify the regex pattern to match that byte sequence. Can someone teach me how to work it out, so I don't have to come back here every time I need to whack another mole? Or does someone have a better pattern?
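
For what it's worth, the bytes \xF3\xBE\x94\x9F decode to the code point U+FE51F, which sits in a private-use plane rather than in any emoticon block, so a pattern that only lists emoticon ranges would miss it. Below is a minimal sketch of a broader approach: strip every code point above U+FFFF, which is everything that takes four bytes in UTF-8. It assumes the 'orig' column uses a three-byte character set (such as MySQL's utf8) and that the tweet text has already been decoded into Perl characters rather than raw bytes. The name strip_astral is hypothetical and is not taken from load-tweets.pl.

```perl
use strict;
use warnings;
use utf8;    # this source file contains literal Cyrillic text

# Hypothetical helper: remove every code point outside the Basic
# Multilingual Plane (above U+FFFF). These are exactly the characters
# that need four bytes in UTF-8. Expects a decoded character string.
sub strip_astral {
    my ($text) = @_;
    $text =~ s/[\x{10000}-\x{10FFFF}]//g;
    return $text;
}

# The later tweet begins with U+FE51F (bytes F3 BE 94 9F in UTF-8).
my $tweet   = "\x{FE51F} Кол мне в лоб!";
my $cleaned = strip_astral($tweet);   # Cyrillic text is left untouched
```

If the input were still raw UTF-8 bytes, it would need to be decoded first (for example with Encode's decode), because the character class above matches code points, not bytes. Stripping loses the character; if the table can be altered, a four-byte character set would avoid the need for the regex altogether.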