(1) There are free tools for both OCR and text-to-speech that you could use if you want to try hacking up your own solution. I've used Tesseract successfully for OCR on images. Depending on the nature of the image, it can take a bit of fiddling to get a good read out of it, but it works. I've also used Speech Dispatcher on Linux to build tools that periodically check the systems I support and yell at me (literally) when a problem needs my immediate attention. Writing something that integrates the two nicely with all the features you want would probably be a lot of work, but a quick and dirty way to get the two programs to read some text from an image aloud can be as simple as:
spd-say "$(tesseract image.png stdout)"
if your text is clean enough for the OCR to pick it out without a lot of preprocessing.
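If you want something slightly sturdier than the one-liner, here is a rough sketch of a shell function that does the same thing with a couple of sanity checks. The function name read_image_aloud is made up for illustration, and the whitespace cleanup is just my guess at what tends to help the speech flow; adjust to taste.

```shell
#!/bin/sh
# Sketch only: OCR an image with Tesseract, then speak it with Speech Dispatcher.
# read_image_aloud is a hypothetical helper name, not part of either tool.
read_image_aloud() {
    image=$1

    # Passing "stdout" as the output base makes tesseract print the recognized
    # text instead of writing a .txt file; its status messages go to stderr.
    text=$(tesseract "$image" stdout 2>/dev/null) || {
        echo "OCR failed for $image" >&2
        return 1
    }

    # Collapse line breaks and runs of whitespace into single spaces so the
    # text is spoken as continuous sentences rather than line by line.
    text=$(printf '%s' "$text" | tr -s '[:space:]' ' ')

    # Bail out if the OCR found nothing readable.
    if [ -z "$text" ] || [ "$text" = " " ]; then
        echo "No text recognized in $image" >&2
        return 1
    fi

    spd-say "$text"
}
```

After sourcing that, `read_image_aloud image.png` should behave like the one-liner, except that it stays quiet and returns a non-zero status when the OCR fails or comes back empty.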
(2) If you already have an Android app that does everything you want -- other than running on a phone instead of your desktop -- you might be able to just emulate Android on your desktop and use the app.