1. You haven't given an example image, so this is speculation. One simple approach is to erode the image, treating the text as the foreground. With a suitably sized kernel, the thin strokes of the non-bold text should be removed entirely, leaving only the bold text (now thinner) to be passed through to the OCR stage. From the above link (because who doesn't like images):
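
To make the erosion idea concrete, here is a minimal pure-Python sketch on a toy binary image. It is only an illustration under assumptions: the `erode` helper, the `page` array, and the 3x3 square kernel are made up for this example, ink is stored as 1, and pixels outside the image are treated as background. On a real scan you would more likely binarise the image and call a library routine such as OpenCV's `cv2.erode`, and you would need to tune the kernel size to the actual stroke widths.

```python
def erode(img, k=3):
    """Binary erosion with a k x k square structuring element.

    img is a list of rows of 0/1 values, where 1 marks ink (foreground).
    A pixel stays 1 only if every pixel in its k x k neighbourhood is 1;
    pixels outside the image count as background (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] == 1
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ))
    return out


# Toy "page": a 1-pixel-wide stroke (regular text) on the left and a
# 3-pixel-wide stroke (bold text) on the right.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

eroded = erode(page)

# Print the result: '#' is surviving ink, '.' is background.
for row in eroded:
    print("".join("#" if v else "." for v in row))
```

In this toy case the thin stroke disappears completely and the thick stroke shrinks to a one-pixel core, which is the behaviour the suggestion relies on. Whether it separates bold from regular text in a real document depends on the font, the scan resolution, and the kernel size.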