In essence, this seems like something video compression tools should make easier, but since it's not the specific use they're built for, "hardware acceleration" is probably not going to help, and so the processing may be a bit involved for a phone.
It'd be easier if you only match the piece shapes (flip them over); edge shapes alone are obviously far less data than the picture. But then you have the problem that many pieces may be identically shaped, which might not be a showstopper. "Here are 3 places it could go" would be a result worth reporting.
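To make the "here are 3 places it could go" idea concrete, here is a rough sketch of shape-only matching. Everything in it is an assumption for illustration: each edge is taken to be already reduced to a short list of depth samples (positive for a tab, negative for a blank), and the names `edge_mismatch`, `best_places` and the slot labels are made up. Getting those profiles out of a phone photo is the hard part and is not shown.

```python
def edge_mismatch(edge_a, edge_b):
    """Sum of squared gaps if edge_a were pressed against edge_b.

    Assumes both edges are sampled at the same number of points.
    """
    if len(edge_a) != len(edge_b):
        raise ValueError("edges must have the same number of samples")
    # The mating edge is walked in the opposite direction, and its
    # tabs become blanks, so reverse it and flip the sign.
    mate = [-v for v in reversed(edge_b)]
    return sum((a - m) ** 2 for a, m in zip(edge_a, mate))


def best_places(piece_edge, open_slots, k=3):
    """Return the k slot labels whose edge fits piece_edge best."""
    ranked = sorted(
        open_slots.items(),
        key=lambda item: edge_mismatch(piece_edge, item[1]),
    )
    return [label for label, _ in ranked[:k]]


# Hypothetical data: one tab-shaped edge and four open slots.
piece_edge = [0, 1, 2, 1, 0]
open_slots = {
    "A": [0, -1, -2, -1, 0],    # exact blank for the tab
    "B": [0, -1, -2, -1.5, 0],  # nearly the same blank
    "C": [0, 1, 2, 1, 0],       # another tab, cannot fit
    "D": [0, 0, 0, 0, 0],       # flat edge
}
print(best_places(piece_edge, open_slots))  # ['A', 'B', 'D']
```

Returning a ranked shortlist rather than a single answer is the point here: identically shaped pieces just end up tied near the top of the list instead of breaking the match.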
However, this is a "take the point out of the process" kind of idea, it seems to me. Anyone who actually enjoys doing puzzles is going to feel this kind of thing is cheating, and those who don't do puzzles aren't interested.