Wednesday, 25 January 2017

Converting Word, HTML, PowerPoint and PDF documents to text

The F# Journal just published an article:

"The first challenge in Natural Language Processing (NLP) is usually converting available documents into text ready for processing. This article looks at functions that convert Word, HTML, PowerPoint and PDF documents into text using the Microsoft.Office.Interop.Word, HtmlAgilityPack, Spire.Presentation and iTextSharp Nuget packages, respectively..."

If you subscribe to the F# Journal then you can read this article here; otherwise, subscribe to The F# Journal today to read this article and many more!
