I am developing an app that extracts the number from an image of a card, based on the android-ocr (tesseract-ocr) sample code.
I have trained the data for the card font. Recognition works for some cards when the card has a plain, uniform background, but when the card has a multi-colored or patterned background (sample attached), the number is not recognized.
Recognition also fails when the card number overlaps the background artwork even slightly.
I tried the following steps to remove the background.
1. Smooth the cropped image and convert it to grayscale:
GaussianBlur( crop, crop, Size(3,3), 0, 0, BORDER_DEFAULT );
cvtColor( crop, crop, CV_RGB2GRAY );
2. Detect edges using Sobel:
crop = SobelEdgeDetect(crop);
3. Invert the image with bitwise NOT:
cv::bitwise_not(crop,crop);
4. Apply adaptiveThreshold to remove shadow-like artifacts:
adaptiveThreshold(crop,crop,255,CV_ADAPTIVE_THRESH_MEAN_C, CV_THRESH_BINARY,75,10);
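To make step 4 concrete, this is my understanding of what adaptiveThreshold computes with CV_ADAPTIVE_THRESH_MEAN_C and CV_THRESH_BINARY: a pixel becomes 255 when it is brighter than the mean of its blockSize x blockSize neighborhood minus the constant (75 and 10 in my call), and 0 otherwise. Below is a minimal plain C++ sketch of that rule. It is not OpenCV's actual implementation; the function name adaptiveMeanThreshold and the row-major vector layout are my own assumptions for illustration, and rounding may differ slightly from OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV function): mean adaptive threshold,
// binary mode, on an 8-bit grayscale image stored row-major in a vector.
// Pixels outside the image are taken from the nearest border pixel.
std::vector<std::uint8_t> adaptiveMeanThreshold(
    const std::vector<std::uint8_t>& src, int width, int height,
    int blockSize, double c, std::uint8_t maxValue = 255)
{
    assert(blockSize % 2 == 1 && blockSize > 1);  // window must be odd-sized
    assert(src.size() == static_cast<std::size_t>(width) * height);

    const int r = blockSize / 2;
    std::vector<std::uint8_t> dst(src.size(), 0);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Sum the neighborhood naively; slow for large windows,
            // but it keeps the rule easy to read.
            long sum = 0;
            for (int dy = -r; dy <= r; ++dy) {
                const int yy = std::min(std::max(y + dy, 0), height - 1);
                for (int dx = -r; dx <= r; ++dx) {
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    sum += src[yy * width + xx];
                }
            }
            const double mean = static_cast<double>(sum) / (blockSize * blockSize);
            // Keep the pixel only if it is brighter than (local mean - c).
            dst[y * width + x] = (src[y * width + x] > mean - c) ? maxValue : 0;
        }
    }
    return dst;
}
```

With a 75-pixel window, each output pixel depends on a fairly large area of the card, so the result is sensitive to whatever background artwork falls inside that window.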
After these steps I get the attached images (bar-Process.png, citi1-Process.png, citi-Process.png). The digits come out as bold outlines with blank space inside them, and in this state the OCR application does not recognize the number.
I do not know how to fill these outlines so that the digits become solid, bold characters.
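My guess at why the digits come out hollow is that an edge detector such as Sobel responds only where the intensity changes, so the flat interior of a thick stroke gives no response at all. The following 1-D sketch in plain C++ shows the effect on a single scanline crossing one stroke. It is only an illustration, not my actual code; gradientMagnitude is a name I made up, and it uses a simple central difference in place of the full Sobel kernel.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: central-difference gradient magnitude along one
// scanline, a 1-D stand-in for the Sobel x-derivative.
// The first and last samples are left at 0.
std::vector<int> gradientMagnitude(const std::vector<int>& row)
{
    std::vector<int> g(row.size(), 0);
    for (std::size_t i = 1; i + 1 < row.size(); ++i) {
        g[i] = std::abs(row[i + 1] - row[i - 1]);
    }
    return g;
}
```

For a scanline such as {0, 0, 255, 255, 255, 255, 255, 0, 0}, the response is non-zero only around the two stroke boundaries and zero in the middle of the stroke, which matches the outlined digits I see in the processed images.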
The big challenge for me now is to remove the background from any card image without disturbing the text on the card. Please suggest how I can overcome the issues above, and specifically how I can remove the background of the image.
I have attached a few sample images and the output data for your reference.
Thanks & regards
Anil