Textual Data Analytics: Sentiment Scores & Behavioral Metrics
New Research Workspace Available!
New research combining the Textual Data Analytics and Credit Default Swap Pricing datasets, published in notebook format, is available on the S&P Capital IQ Workbench platform. The notebook provides a step-by-step walkthrough of the white paper and the underlying code used to generate the analysis.
The Textual Data Analytics (TDA) dataset takes earnings call transcripts one step further with sentiment and behavioral metrics that have been rigorously researched and tested against frequently used quantitative strategies.
Implement signals from 800+ predictive and descriptive metrics derived from Natural Language Processing (NLP), in combination with data from the professionals, ownership, and estimates datasets, that offer differentiated and additive characteristics. Quickly assess the sentiment and transparency expressed on the earnings calls of 11,600+ active companies, home in on high-impact sections, speaker types, and individual components of calls, or use the metrics as building blocks to uncover additional signals.
Dataset includes:
- Bag-of-Words Sentiment metrics such as Net Positivity, Positivity-to-Negative, Positive Sentiment and Negative Sentiment as well as TF-IDF and Cosine Similarity weighted sentiment
- Behavioral metrics such as Language Complexity, Analyst Favoritism, Numerical Transparency, Guidance References, Exogenous Factor References, Language Similarity and Timing of Numerical References
- Measurement of Financial Performance Topic Identification and neighboring positive descriptors around market-moving topics such as "Revenue", "Bottom-Line", and "EPS"
- Objective measures of Executive and overall earnings call performance in a time series or cross-section
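To make the bag-of-words and similarity ideas in the list above concrete, here is a minimal Python sketch. It is not the dataset's methodology: the word lists are a tiny hypothetical lexicon, the net positivity formula (positive minus negative, divided by their sum) is one commonly used definition assumed here, and the function names are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; not the dataset's word lists.
POSITIVE = {"growth", "strong", "improved", "record"}
NEGATIVE = {"decline", "weak", "loss", "impairment"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def net_positivity(text):
    """Assumed definition: (positive - negative) / (positive + negative)."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    current = "Strong growth offset a small decline in services."
    previous = "Growth was strong despite a decline in services."
    print(net_positivity(current))
    print(cosine_similarity(current, previous))
```

A language-similarity style metric could compare a company's current call with its previous one in this way. A production version would typically weight terms (for example with TF-IDF) rather than use raw counts.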
Awards
- 2020 Data Management Insight Award for Best Proposition for AI & Machine Learning
Vendor information
- Primary Entity Type: Company
- Coverage Count: 11,600+
- Geographic Coverage: Global
- Industry Coverage: Consumer, Energy and Utilities, Financials, Healthcare, Industrials, Materials, Real Estate, Technology, Media & Telecommunications
- History Initiated: 2004
- Earliest Significant Coverage: 2008
- Point In Time: Yes
- Point In Time Details: Each call transcript is updated multiple times by S&P Global. The scores calculated for each version are retained.
- Data Source: Calls transcribed by S&P Global transcripts team.
- Field Count: 10s
- Delivery Channel: Cloud, Feed
- Delivery Platform: FTP, Marketplace Workbench, Snowflake, Xpressfeed™
- Reporting Frequency: Variable
- Dataset Latency: Daily
- Dataset Size (GB): 21
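Because scores are retained for every transcript version, a backtest can select the score that was actually available on a given date. The Python sketch below shows one way such a point-in-time lookup might work. The record layout (transcript ID, version timestamp, score), the sample values, and the function name are all hypothetical and do not reflect the dataset's actual schema.

```python
from datetime import datetime

# Hypothetical records: (transcript_id, version_timestamp, score).
# The values are made up for illustration.
versions = [
    ("T1", datetime(2020, 2, 1, 18, 0), 0.20),
    ("T1", datetime(2020, 2, 2, 9, 0), 0.25),
    ("T1", datetime(2020, 2, 5, 12, 0), 0.22),
]


def score_as_of(records, transcript_id, as_of):
    """Return the score of the latest version published on or before `as_of`."""
    eligible = [
        r for r in records
        if r[0] == transcript_id and r[1] <= as_of
    ]
    if not eligible:
        return None  # no version of the transcript existed yet
    latest = max(eligible, key=lambda r: r[1])
    return latest[2]


if __name__ == "__main__":
    print(score_as_of(versions, "T1", datetime(2020, 2, 3)))
```

Selecting by version timestamp rather than taking the final revision avoids look-ahead bias when testing a strategy historically.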