I need to extract player statistics from an image.
I have tried pytesseract after preprocessing the image: converting it to grayscale, upscaling it by a constant factor, and applying an edge-detection filter:
    from PIL import Image, ImageFilter, ImageOps
    import pytesseract

    # TEXT_FILE, FRAMES_DIR and tessdata_config are defined elsewhere in the script.
    with open(TEXT_FILE, "w", encoding="utf-8") as f_out:
        for img_file in sorted(FRAMES_DIR.glob("*.png")):
            # Optional: light preprocessing
            # img = Image.open(img_file).convert("L")  # grayscale
            img = Image.open(img_file)
            gray_image = ImageOps.grayscale(img)

            # Resize the image to enhance details.
            scale_factor = 3
            resized_image = gray_image.resize(
                (gray_image.width * scale_factor, gray_image.height * scale_factor),
                resample=Image.LANCZOS,
            )

            # Apply the FIND_EDGES edge-detection filter to the resized image.
            thresholded_image = resized_image.filter(ImageFilter.FIND_EDGES)

            text = pytesseract.image_to_string(
                thresholded_image,
                lang="eng",  # OCR language
                # config="--psm 6"  # suited to lines of text / UI
                config=tessdata_config,
            ).strip()

            if text:
                f_out.write(f"===== {img_file.name} =====\n")
                f_out.write(text + "\n\n")

    print(f"OCR finished. Result in: {TEXT_FILE}")

However, the results were poor.
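For reference, the preprocessing described above could be isolated into a small function, roughly like the sketch below. It is only an illustration under some assumptions: the function name `preprocess_for_ocr`, the `threshold` value of 128, and the synthetic test image are all made up for this example. It also swaps the `FIND_EDGES` step for contrast stretching plus a plain global threshold, on the assumption that solid black-and-white glyphs suit Tesseract better than outlines; that may or may not hold for this particular image.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(img, scale_factor=3, threshold=128):
    """Grayscale, upscale, stretch contrast, then binarize an image.

    Hypothetical helper: the name and default values are illustrative only.
    """
    gray = ImageOps.grayscale(img)
    resized = gray.resize(
        (gray.width * scale_factor, gray.height * scale_factor),
        resample=Image.LANCZOS,
    )
    # Stretch the histogram so the darkest pixel maps to 0 and the brightest to 255.
    contrasted = ImageOps.autocontrast(resized)
    # Global threshold: pixels >= threshold become white (255), the rest black (0).
    return contrasted.point(lambda p: 255 if p >= threshold else 0)


# Synthetic 8x2 image standing in for a real frame: left half dark, right half light.
sample = Image.new("RGB", (8, 2), (20, 20, 20))
sample.paste((230, 230, 230), (4, 0, 8, 2))

result = preprocess_for_ocr(sample)
print(result.size, result.mode)
```

The returned image could then be passed to `pytesseract.image_to_string` in place of `thresholded_image`, possibly with `config="--psm 6"` as in the commented-out line of the original code.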
Can you give me some advice on this task, or suggest another library for extracting the text?
regards
