MarkItDown is an open-source Python library from Microsoft that converts various file formats to Markdown for indexing and analysis. Markdown is a popular lightweight markup language with plain text ...
I put this proof of concept together today out of curiosity. It will likely break on some documents or Microsoft Excel files, but it has been working well for me, so I thought I'd share it.
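For reference, here is a minimal sketch of basic MarkItDown usage, separate from the proof of concept itself. The `to_markdown` helper and its `converter` parameter are hypothetical names added for illustration. The `MarkItDown().convert(...)` call and the `text_content` attribute follow the usage shown in the library's README, and the sketch assumes the `markitdown` package is installed.

```python
from pathlib import Path


def to_markdown(path, converter=None):
    """Return the Markdown text for the file at `path`.

    `converter` (hypothetical parameter) is any object with a
    `.convert(path)` method whose result exposes `.text_content`.
    When omitted, a MarkItDown instance is created.
    """
    if converter is None:
        # Imported lazily so the helper can be tried with a stand-in
        # converter when the package is not installed.
        from markitdown import MarkItDown  # assumes: pip install markitdown

        converter = MarkItDown()

    # convert() returns a result object; the Markdown is in .text_content
    result = converter.convert(str(Path(path)))
    return result.text_content
```

Calling `to_markdown("report.xlsx")` would then return the workbook's contents as a Markdown string, where `report.xlsx` is a placeholder file name.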