SDG 10: Reduced Inequality
SDG 16: Peace, Justice and Strong Institutions
2. Project Details
Company or Institution
Factmata Narrative Monitoring
General description of the AI solution
Factmata has developed a Narrative Monitoring product which combines two unique and innovative technologies: Topic Clustering and Content Scoring.
Firstly, Topic Clustering allows us to read many different opinions shared online across social media, blogs, forums and news articles to identify similar opinions and cluster them together as human understandable ‘narratives’ which can be tracked over time.
Secondly, our content scoring algorithms use Natural Language Processing to read text and identify signals of harmful content including racism, sexism, hate speech and other forms of discriminatory language.
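To make the clustering idea concrete, the following is a minimal sketch of grouping similar posts into candidate narratives. It is not Factmata's algorithm: the bag-of-words vectors, the cosine similarity measure, the greedy single-pass grouping, the 0.5 threshold and all function names are assumptions chosen purely for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Lower-case bag-of-words vector for one post (illustrative only)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(posts, threshold=0.5):
    """Greedy single pass: a post joins the first cluster whose seed post
    is at least `threshold` similar; otherwise it starts a new cluster."""
    clusters = []  # list of (seed_vector, member_posts)
    for post in posts:
        vec = vectorize(post)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(post)
                break
        else:
            clusters.append((vec, [post]))
    return [members for _, members in clusters]


posts = [
    "the vaccine is unsafe for children",
    "this vaccine is unsafe for young children",
    "the new stadium opens next month",
]
narratives = cluster(posts)
for members in narratives:
    print(len(members), members)
```

In this toy input the two vaccine posts share most of their words and fall into one group, while the stadium post forms its own. A production system would use far richer text representations than word counts.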
The combination of these technologies means that any social media platform, content moderator, programmatic advertiser, or brand or PR agency monitoring social media can identify discriminatory narratives earlier and better than human analysis alone.
Earlier identification means a quicker response and the potential to prevent a harmful narrative from spreading into the mainstream. Better understanding of the narrative means a more targeted, and therefore more effective, response to it.
In addition, we provide identification of the social media accounts authoring the content to support more direct action against bad actors or identification of broader networks of disinformation, misinformation, hate speech and other forms of harmful content.
Our flexible graphs and reporting capabilities mean that anyone using our product to counter hate speech and discrimination can track their progress and their impact on the narratives.
Our aim is to use AI to enable a much broader range of companies, agencies and other organisations to identify harmful content online. Currently this is limited to the companies that can afford the amount of human analysis required. In this way we think we will have a much bigger impact on the overall problem.
Excellence and Scientific Quality: Please detail the improvements made by the nominee or the nominee’s team, or by yourself if you’re applying for the award, and why they have been a success.
The types of AI we’re using include machine learning algorithms with a focus on Natural Language Processing (NLP), including Natural Language Understanding (NLU), Natural Language Generation (NLG), Topic Detection and Modelling, and Automated Content Scoring. Combining these technologies to identify online narratives and score them for harmful content is not being done by anyone else, which is what makes our approach innovative.
For our content scoring models to be effective, we trained them using experts; for example, our clickbait model was trained by communities of journalists. We’ve taken similar approaches across our models to achieve excellent overall accuracy, recall and F1 scores. This expert-led approach has not been widely adopted by other content moderation organisations because of its cost and their preference for crowdsourcing feedback on the models. Crowdsourcing is acceptable for identifying binary and highly objective signals, such as profanity, but for identifying discrimination, which is highly subjective, the wisdom of the crowd isn’t good enough.
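For readers less familiar with the metrics named above, the sketch below shows how accuracy, precision, recall and F1 are computed for a binary classifier from its confusion counts. The labels are invented toy data, not Factmata results, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = harmful)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy expert annotations versus model predictions (invented for illustration).
expert_labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_labels = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(expert_labels, model_labels)
print(metrics)
```

Recall measures how much of the harmful content is caught, precision measures how much of the flagged content is genuinely harmful, and F1 is the harmonic mean of the two.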
Our research work is clearly documented: for every one of our 19 individual models, the annotation guidelines, the training data and the updates to this data are recorded and accessible to clients and partners for transparency and explainability.
Our TRL is currently 7, having completed the MVP of our product earlier this year and deployed it to a single live client for operational use. We’re currently enhancing the User Interface and adding the basic functionality we need to compete in the media monitoring and social listening markets, but the underlying AI technologies are complete and have been proven effective, and in some cases market leading, in lab and operational environments by independent third-party testers. In June 2021 we won the CogX award for the best AI product in marketing and ad tech.
Scaling of impact to SDGs: Please detail how many citizens/communities and/or researchers/businesses this has had or can have a positive impact on, including particular groups where applicable and to what extent.
Our tool has the potential to impact millions of people’s lives on a daily basis. Ethnicity, religion, nationality and gender, along with many other characteristics, are weaponized and harnessed to create social divisions that ultimately degrade democratic structures and serve the interests of those who would profit from discord and conflict.
We therefore aim to directly increase the proportion of the population who believe decision-making is inclusive and responsive, by sex, age, disability and population group, by identifying where this is not currently the case. In addition, using our technologies to identify discriminatory narratives should directly reduce the proportion of the population reporting having personally felt discriminated against or harassed within the previous 12 months on the basis of a ground of discrimination prohibited under international human rights law.
We’re making our technology available for free to journalists and other creators of good quality content, as well as to anyone seeking to combat fake news and other forms of harmful content. For brands, agencies and other paying organisations, we’re making our technology available at a price point that lets a much broader range of organisations afford to identify and combat harmful content about them or about topics relevant to them. We’re also adding multi-lingual support so that a broader range of countries can identify this kind of content.
The effectiveness of our solution can be tracked using the very indicators we use to identify harmful content. For example, a fake news narrative about COVID vaccines can be tracked and acted upon, and the resulting change in the number of opinions, or in the stance of those opinions for or against the fake narrative, can be measured over time.
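As an illustration of how such tracking could be measured, the sketch below groups posts belonging to one narrative by ISO week and reports the weekly volume and the share of posts supporting the narrative. The data layout (a date plus a stance of +1 for supporting or -1 for opposing), the function name and the sample values are all assumptions made for this example, not a description of our product's internals.

```python
from collections import defaultdict
from datetime import date


def weekly_summary(posts):
    """posts: iterable of (date, stance), stance +1 = supports the narrative,
    -1 = opposes it. Returns {(iso_year, iso_week): {...}} in week order."""
    buckets = defaultdict(list)
    for day, stance in posts:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(iso[0], iso[1])].append(stance)
    summary = {}
    for week in sorted(buckets):
        stances = buckets[week]
        supporting = sum(1 for s in stances if s > 0)
        summary[week] = {
            "volume": len(stances),
            "share_supporting": supporting / len(stances),
        }
    return summary


# Invented sample: four posts in one week, two in the next.
posts = [
    (date(2021, 6, 7), +1), (date(2021, 6, 8), +1),
    (date(2021, 6, 9), +1), (date(2021, 6, 10), -1),
    (date(2021, 6, 14), +1), (date(2021, 6, 15), -1),
]
summary = weekly_summary(posts)
for week, stats in summary.items():
    print(week, stats)
```

A falling volume, or a falling share of supporting posts after an intervention, is the kind of signal that would indicate the response is working.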
Scaling of AI solution: Please detail what proof of concept or implementations you can show now in terms of its efficacy, how the solution can be scaled to provide a global impact, and how realistic that scaling is.
We have multiple case studies of clients that have used our tool to gain insight into threatening narratives or damaging misinformation circulating online. We are also actively tracking COVID vaccine misinformation and have identified multiple online threads of harmful content that have seen active engagement from online communities. Independent lab tests of our underlying technologies by existing market-leading vendors in the content moderation and media monitoring markets have validated that our technology is more effective than existing technologies, or humans, at identifying harmful content as well as at identifying and explaining narratives. We’ve just won an award for best AI product in the marketing and ad tech category.
The challenges to scaling our technologies are quantifiable and well-understood problems which require investment in infrastructure to cope with the volume of content we process. These are not novel or complex problems to solve; they just require sufficient funding. In terms of applicability, our technology is agnostic of data source: it works as well with Tweets as it does with Facebook or Reddit posts. New platforms entering the market are therefore not a threat but an opportunity, as organisations struggle to scale their current human analysis. We will implement our multi-lingual models within the next four weeks, enabling us to expand beyond English-speaking countries and identify cross-border and pan-lingual narratives.
We have only just launched our products, so we have only one early adopter of our narrative monitoring product and one adopter of our content scoring APIs. However, we have over 1,000 users of our consumer-facing technologies, for example our browser plugin, and over 1,000 members of our distribution list for news on developments in this space.
Ethical aspect: Please detail the way the solution addresses any of the main ethical aspects, including trustworthiness, bias, gender issues, etc.
Our AI is co-created with a community of expert annotators made up of journalists and relevant experts. These expert communities annotate content for us which leads to: (1) much higher quality algorithms that can detect nuanced types of discrimination, disinformation and hate speech, (2) community involvement and feedback from the very start, and (3) a high resilience against bias influencing our AI. In addition, our AI algorithms are fully explainable, with reasoning provided for every piece of content we annotate. Explainability is foundational to our system and we try to solve for bias by justifying each piece of content we flag, alongside harnessing the different perspectives of expert annotators. Accountability and transparency are integral to Factmata’s mission and AI functionality – adhering to these principles enables us to remain answerable to the general public.
We aim to avoid becoming “black-box” AI, in which the AI’s scoring and backend training methods are neither transparent nor accountable. We require that our AI justify every single decision it makes by explaining which parts of the content caused it to flag that content. Through constant community feedback, we are able to iterate and improve our solution at every step of our journey.
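One simple way to picture this kind of justification is a linear scorer that reports how much each word contributed to the final score. The sketch below is only an illustration of that general idea, not our production models: the word weights, the 0.4 flagging threshold and the function name are hypothetical values invented for a toy clickbait example.

```python
import re

# Hypothetical per-word weights for a toy clickbait scorer (invented values).
WEIGHTS = {"shocking": 0.5, "secret": 0.25, "report": -0.25}
FLAG_THRESHOLD = 0.4  # assumed cut-off for flagging


def score_with_explanation(text):
    """Return (score, flagged, reasons); reasons lists the words that pushed
    the score up, largest contribution first."""
    tokens = re.findall(r"[a-z']+", text.lower())
    contributions = [(tok, WEIGHTS[tok]) for tok in tokens if tok in WEIGHTS]
    score = sum(weight for _, weight in contributions)
    reasons = sorted(
        (c for c in contributions if c[1] > 0),
        key=lambda c: c[1],
        reverse=True,
    )
    return score, score >= FLAG_THRESHOLD, reasons


score, flagged, reasons = score_with_explanation(
    "Shocking secret revealed in report"
)
print(score, flagged, reasons)
```

Because every flagged item comes with the words that drove the decision, a reviewer can check the reasoning rather than having to trust an unexplained number.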
However, there is an additional ethical problem facing our company, related not to potential bias in the algorithm but to potential misuse of the product itself by bad actors, for example to identify truthful and good quality narratives to subvert, or good quality content producers and influencers to target with attacks. Our aim is to monitor the use of the product by our clients to identify misuse (any form of support for discrimination, or the creation or dissemination of discriminatory or harmful content) and to ensure we have legal protections to revoke access and stop the misuse.