I have a use case where I receive an OCRed copy of a document. All the header, footer, and page-number data has been converted into plain text, which has also broken the document formatting. Attaching a sample document for the referenc…...extracting text from the body content, effectively skipping headers...assist in inserting formatted content. Here’s a basic example of...
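
As a rough illustration of skipping headers, footers, and page numbers in OCRed text, here is a minimal sketch in plain Python. It assumes the OCR output is already available as one string per page, and that headers and footers sit in the first or last couple of lines of each page and repeat across most pages. The function name `strip_headers_footers` and the parameters `edge_lines` and `min_share` are made up for this example and are not part of any library. Real OCR output is usually noisier, so the thresholds will likely need tuning against the sample document.

```python
import re
from collections import Counter

# Lines such as "12", "Page 3" or "page 3 of 10" are treated as page numbers.
PAGE_NUMBER = re.compile(r"^\s*(page\s+)?\d+(\s+of\s+\d+)?\s*$", re.IGNORECASE)


def _normalize(line):
    # Mask digits so "Page 3" and "Page 4" count as the same repeated line.
    return re.sub(r"\d+", "#", line.strip().lower())


def strip_headers_footers(pages, edge_lines=2, min_share=0.6):
    """Drop repeated edge lines and page numbers from OCRed pages.

    pages      -- list of strings, one per page (assumed input shape)
    edge_lines -- how many lines at the top/bottom count as "edge" (assumption)
    min_share  -- share of pages a line must appear on to count as repeated
    """
    # Split every page into its non-empty lines.
    split_pages = [[ln for ln in p.splitlines() if ln.strip()] for p in pages]

    # Count, per page, which normalized lines show up near the top or bottom.
    counts = Counter()
    for lines in split_pages:
        edge = {_normalize(ln) for ln in lines[:edge_lines] + lines[-edge_lines:]}
        counts.update(edge)

    # A line is a header/footer candidate if it recurs on enough pages.
    threshold = max(2, int(min_share * len(pages)))
    repeated = {key for key, n in counts.items() if n >= threshold}

    cleaned = []
    for lines in split_pages:
        body = []
        for i, ln in enumerate(lines):
            at_edge = i < edge_lines or i >= len(lines) - edge_lines
            if at_edge and (_normalize(ln) in repeated or PAGE_NUMBER.match(ln)):
                continue  # skip header, footer, or page-number line
            body.append(ln)
        cleaned.append("\n".join(body))
    return cleaned


pages = [
    "ACME Corp Annual Report\nIntroduction text here.\nMore body text.\nPage 1",
    "ACME Corp Annual Report\nSecond page body.\nEven more body.\nPage 2",
    "ACME Corp Annual Report\nThird page body.\nClosing remarks.\nPage 3",
]
for page in strip_headers_footers(pages):
    print(page)
    print("---")
```

Only lines near the top or bottom of a page are candidates for removal, so a bare number or a repeated phrase in the middle of the body text is left alone. Restoring the original formatting is a separate step that this sketch does not attempt.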