This article explains our NLP process and why we use it in our products.
Personality
In the Personality tab of our Consumer Insights portals, we use a mixed methodology in which survey respondents opt in to share their social media accounts, their posts, and the handles they follow. Each respondent's posts are then collected and run through NLP to predict aspects of that person, including personality traits, motivators, and personal concerns.
The NLP used here was developed from a combination of off-the-shelf, open-source lexica. These lexica are based on decades of academic research in psychology, were built on datasets ranging from at least 100K to upwards of 1 million users, and have been adapted for marketing purposes. They inform the psychological constructs you see in the portal.
Language Analysis
For each person’s messages, we first used a top-down approach, applying a natural language lexicon to identify the percentage of words within each message that relate to key constructs of interest. Our primary focus was to identify psychological constructs that directly relate to how people are motivated or can be marketed to. More specifically, our lexicon pulled out keywords related to motivations (e.g. affiliation, accomplishment), personal concerns (e.g. leisure, home), and personality (e.g. introversion/extroversion). This allowed us to isolate the language features and psychological processes most associated with our outcomes or with differences between populations.
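As a rough illustration of the lexicon step described above, the sketch below computes the percentage of words in a message that fall under each construct. It is a minimal sketch, not our production lexicon: the `LEXICON` word lists, the `tokenize` helper, and the `construct_percentages` function are hypothetical names and values invented for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only: construct -> words.
# Real lexica are far larger and are not reproduced here.
LEXICON = {
    "affiliation": {"friend", "friends", "family", "together", "team"},
    "accomplishment": {"win", "won", "goal", "achieve", "success"},
    "leisure": {"movie", "game", "beach", "vacation", "music"},
    "home": {"home", "kitchen", "garden", "house"},
}


def tokenize(message):
    """Lowercase the message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def construct_percentages(message, lexicon=LEXICON):
    """Return, per construct, the percentage of tokens found in its word list."""
    tokens = tokenize(message)
    if not tokens:
        # An empty message has no words, so every construct scores zero.
        return {name: 0.0 for name in lexicon}
    counts = Counter(tokens)
    total = len(tokens)
    return {
        name: 100.0 * sum(counts[word] for word in words) / total
        for name, words in lexicon.items()
    }


scores = construct_percentages("We won the game together at home")
```

Here the message has seven words, one of which matches each construct, so each construct scores one seventh of the message (about 14.3%). Scores like these can then be averaged across a person's messages or compared between groups.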
Interests (Affinities)
The Interests tab enables categorical analysis of following trends, zooming in at the account level. For instance, examining these trends alongside our personas shows that Boomerangs follow pages like Managing Diabetes, Bob Dylan, Pacifico Beer, and Johnny Cash.
Marketers use this data to bolster product and content strategies with audience research, partnerships, and content generation.
Product Trends and Social Listening
This tab categorizes our panel's posts by relevant categories (e.g. sleep, sex, politics). We also ensure that we are listening to real people, not just businesses and dispensaries (though we do track these too for relevant metrics). We train our tagging algorithms to recognize language attributes, such as product categories, using a research-in-the-loop methodology that incorporates our subject experts into the machine learning process. We train poster attributes, such as whether a poster is a consumer or a company, using the same methodology. Our neural networks are trained on hundreds of thousands of products and social posts, which the model then compares against hand-annotated products and social posts to predict whether a given product or post has a particular attribute. Because different kinds of posts use different types of language, this methodology is robust against "sponsored" and similar posts, identifying real consumer behavior in the wild.
We then use our in-house subject experts and researchers to check and refine these findings, first using the tools and then the portals themselves.
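One way a comparison between model tags and hand-annotated posts could be scored is with precision and recall for a single tag. The sketch below is an illustrative assumption rather than a description of our internal tooling: the `precision_recall` function and the sample labels are made up for this example.

```python
def precision_recall(predicted, annotated):
    """Score model tags against hand annotations for one tag.

    Both arguments are equal-length lists of booleans, one entry per post:
    True means the post carries the tag.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")

    pairs = list(zip(predicted, annotated))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)

    # Guard against division by zero when the tag is never predicted or never annotated.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up labels for five posts (illustrative only).
model_tags = [True, True, False, False, True]
expert_tags = [True, False, False, True, True]
precision, recall = precision_recall(model_tags, expert_tags)
```

In this made-up sample the model agrees with the experts on two tagged posts, tags one post the experts did not, and misses one they did, so precision and recall are both two thirds. Low precision on a tag suggests the model is over-tagging, and low recall suggests it is missing posts, which is the kind of signal expert review can use to refine a category.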
Our dashboards benefit from real client and internal feedback loops, so we encourage you to stay in close communication with us to ensure the social dashboard meets your needs with respect to tags and categories.
Product Trends is available in the Innovation Insights Portal. Social listening tools are found in Innovation Insights, Consumer Insights, and Brand Health.