Kensho Extract

Kensho Extract is a leading artificial intelligence (AI) solution that allows you to structure and access both text and tables from documents. Using sophisticated machine learning (ML) models, Extract converts complex PDF documents into easy-to-use, machine-readable formats.

Extract is built with finance and business in mind. Leveraging S&P’s deep library of financial documents, it is the ideal solution for unlocking insights from complicated business and finance documents.

With Extract, you can:

  • Quickly transform unstructured documents into a machine-readable format that organizes the headers, titles, paragraphs, tables and footers detected within the document in natural reading order
  • Interpret messy page layouts, structuring text into cohesive paragraphs that can then be effectively analyzed and searched
  • Augment your human workforce with easy-to-use document extraction tools, including a browser-accessible user interface

Service Provider Information

Kensho is the Artificial Intelligence accelerator for S&P Global that develops cutting-edge technologies to transform businesses. Kensho's elite engineering talent uses the latest advances in machine learning and the unparalleled breadth and depth of data available at S&P Global to create new actionable insights and solutions for decision makers.

Key Information

Use Cases

  • Enable Full-Text Search: Convert inaccessible, static PDF documents to machine-readable formats to enable full-text search across internal PDF document repositories and shared platforms such as virtual data rooms
  • Feed Sophisticated NLP Solutions: Convert inaccessible, static PDF documents to machine-readable formats to enable more sophisticated natural language processing (NLP) solutions, such as key-value pair (KVP) extraction, named entity recognition (NER), and topic modeling, that produce actionable insights
  • Export Tabular Information at Scale: Identify tables within static PDF documents and export them into user-friendly formats such as JSON, Excel or CSV
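
As a rough illustration of the tabular export use case, the sketch below converts tables from a JSON document into CSV text. The JSON layout (the "tables", "title", and "rows" fields), the function names, and the sample values are all assumptions made up for this example; they do not describe Kensho Extract's actual output schema or API.

```python
import csv
import io
import json

# Hypothetical extraction output. The field names "tables", "title", and "rows"
# and all values are invented for illustration; they are NOT the real schema.
SAMPLE_OUTPUT = json.dumps({
    "tables": [
        {
            "title": "Revenue by segment",
            "rows": [
                ["Segment", "Revenue"],
                ["Ratings", "1,200"],
                ["Indices", "900"],
            ],
        }
    ]
})


def table_to_csv(table: dict) -> str:
    """Serialize one table (a list of rows of cell strings) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table["rows"])
    return buffer.getvalue()


def tables_to_csv(document_json: str) -> dict[str, str]:
    """Map each table title in the (assumed) JSON layout to its CSV text."""
    document = json.loads(document_json)
    return {table["title"]: table_to_csv(table) for table in document["tables"]}


if __name__ == "__main__":
    for title, csv_text in tables_to_csv(SAMPLE_OUTPUT).items():
        print(f"# {title}")
        print(csv_text)
```

The standard csv module quotes cells that contain commas (such as "1,200" here), so figures with thousands separators stay in a single column when the CSV is opened in a spreadsheet.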

Benefits

  • Tabular Extraction Model Flexibility: Unlike other specific-use tabular extraction tools that rely more heavily on “hard-coded” rule-based logic, Kensho Extract’s machine learning (ML) model allows for high performance over a much broader range of document table types
  • Business & Finance Niche: Kensho Extract outperforms more general-purpose extraction products on financial documents with complicated layouts
  • Proprietary S&P Financial Training Data: Kensho Extract leverages S&P Global's rich document repository, while other extraction vendors rely on open-source data
  • Speed & Scalability: With processing performance 10x faster than other vendors, Kensho Extract can process millions of pages en masse

Details