1. General
Category
SDG 1: No Poverty
SDG 2: Zero Hunger
SDG 3: Good Health and Well-being
SDG 5: Gender Equality
SDG 6: Clean Water and Sanitation
SDG 7: Affordable and Clean Energy
SDG 8: Decent Work and Economic Growth
SDG 9: Industry, Innovation and Infrastructure
SDG 10: Reduced Inequality
SDG 11: Sustainable Cities and Communities
SDG 12: Responsible Consumption and Production
SDG 13: Climate Action
SDG 14: Life Below Water
SDG 15: Life on Land
SDG 16: Peace, Justice and Strong Institutions
SDG 17: Partnerships for the Goals
Category
Private Equity & Investment Firms
2. Project Details
Company or Institution
International Finance Corporation (World Bank Group)
Project
MALENA™ (Machine Learning Environment, Social and Governance Analyst)
General description of the AI solution
Thank you for the opportunity to nominate IFC’s Machine Learning Environment, Social and Governance Analyst (MALENA) for the IRCAI Global Top 100 AI projects advancing UN SDGs.
With fewer than nine years to meet the SDGs, IFC shares UNESCO and IRCAI’s sense of urgency: global development is at a critical stage. IFC’s development finance investment experience demonstrates firsthand that the greatest gaps in the SDGs are in emerging markets. Global capital markets, sized at over $175 trillion, can play an essential role in meeting these global goals. By aligning their investment strategies with the SDGs, institutional investors and asset managers can redirect finance to transform emerging markets.
Gaps in sustainability and impact performance data are significant blockers for investors wishing to build SDG-aligned portfolios in emerging markets. Recent advances in AI, Machine Learning (ML), and Natural Language Processing (NLP) allow for rapid, at-scale analysis of massive amounts of unstructured text data. Using transfer learning, IFC has trained MALENA, a supervised classification NLP model, to identify ESG and impact risks in unstructured text data and conduct sentiment analysis. IFC seeks to make MALENA’s insights and analytical capacity available to institutional investors and asset managers so that they can better identify ESG risks and create SDG-aligned investment portfolios. MALENA (Beta) can identify 1,200 ESG risk terms. Its ESG taxonomy aligns with 16 of the 17 SDGs and several SDG indicators, and will expand during 2021 to cover climate, gender, and biodiversity impacts.
IFC is pleased to advance this original initiative which places AI at the center of a solution to address the investment gap needed to meet the SDGs. IFC supports collaboration with other partners in development. Recognition by IRCAI will help amplify the message of the development role of data, data science, and AI.
Website
Organisation
International Finance Corporation
3. Aspects
Excellence and Scientific Quality: Please detail the improvements made by the nominee or the nominee’s team, or by yourself if you are applying for the award, and why they have been a success.
Type of AI: MALENA is a supervised classification NLP model. MALENA was created using transfer learning on a pretrained open source RoBERTa model from Meta AI. Training was performed on the PyTorch pretrained version of the original RoBERTa model distributed by HuggingFace’s Transformers. The model architecture consists of RoBERTa, two hidden layers of size 64, and a softmax output layer. IFC trained the last two layers of the model on 69,275 randomly selected labeled ESG and impact data points. A further 10,465 labels were used for model testing and validation. IFC’s ESG analysts label the data manually. Annotation guidelines and inter-annotator agreements are maintained to ensure high-quality training data. IFC’s experts support constant improvement through active learning.
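The classification head described above (encoder output, two hidden layers of size 64, softmax output) can be sketched in plain Python. This is an illustrative sketch only, not IFC's implementation: the encoder width of 768 (the roberta-base hidden size), the ReLU activation, the three output classes, and all function names are assumptions, and a random vector stands in for the RoBERTa sentence embedding.

```python
import math
import random

ENCODER_DIM = 768  # assumption: hidden size of roberta-base
HIDDEN_DIM = 64    # from the text: two hidden layers of size 64
NUM_CLASSES = 3    # assumption: e.g. positive / neutral / negative


def init_layer(n_in, n_out, rng):
    """Create a dense layer (weights, biases) with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Apply a dense layer: one dot product plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, head):
    """Run an encoder embedding through hidden -> hidden -> softmax."""
    h1 = relu(dense(embedding, head[0]))
    h2 = relu(dense(h1, head[1]))
    return softmax(dense(h2, head[2]))


rng = random.Random(0)
head = [
    init_layer(ENCODER_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, NUM_CLASSES, rng),
]
# Stand-in for the sentence embedding a RoBERTa encoder would produce.
embedding = [rng.gauss(0.0, 1.0) for _ in range(ENCODER_DIM)]
probs = classify(embedding, head)
print(probs)  # three class probabilities that sum to 1
```

In a transfer-learning setup of this kind, the pretrained encoder supplies the embedding and only the small head needs to learn the task-specific labels, which keeps the number of trainable parameters low.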
Quality of solution: Model performance is measured using weighted F1 scores. MALENA outperforms out-of-the-box sentiment analysis. Model accuracy is 90 percent and the F1 score is 87 percent. IFC published model training and performance details in May 2021: https://www.ifc.org/wps/wcm/connect/topics_ext_content/ifc_external_corporate_site/sustainability-at-ifc/publications/publications_report_aisolutions. This publication was featured in a webinar conducted by Environmental Finance magazine: https://www.environmental-finance.com/content/focus/creating-green-bond-markets/webinars/esg-investing-and-ai-is-artificial-intelligence-the-answer-to-data-challenges.html
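For readers unfamiliar with the metric, a weighted F1 score averages the per-class F1 scores, weighting each class by its share of the true labels. The sketch below is a minimal illustration; the function name and the toy labels are hypothetical and are not MALENA data or results.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * n / total
    return score


# Toy sentiment labels, invented for illustration only.
y_true = ["neg", "neg", "pos", "pos", "pos", "neu"]
y_pred = ["neg", "pos", "pos", "pos", "neu", "neu"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support matters when classes are imbalanced: a rare class with a poor F1 lowers the score in proportion to how often it occurs, rather than counting equally with the common classes.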
Technology status: MALENA is in the Technology Development phase. A Minimum Viable Product (MVP) was developed in August 2020 and is in beta testing. Product development occurs in 2-week sprints and product features are continuously tested, validated, and refined. The project team is guided by in-house and outside expert legal counsel. All beta testing is governed through a beta testing agreement that protects IFC’s rights, title, interest, and IP in MALENA.
MALENA was showcased at IFC’s Sustainability Exchange (https://commdev.org/2021-sustainability-exchange-breakout-videos) and at the Performance Standard Community of Learning for financial institutions from April through June 2021, and received very positive feedback.
Scaling of impact to SDGs: Please detail how many citizens/communities and/or researchers/businesses this has had or can have a positive impact on, including particular groups where applicable and to what extent.
Overall: This project addresses the over $2 trillion investment gap (IMF, 2019) to meet the SDGs by increasing publicly available ESG and impact data for emerging markets. Institutional investors and asset managers can transform emerging markets by aligning their investment strategies with the SDGs. While the pace of such investing accelerates, gaps in sustainability and impact performance data are significant blockers for investors wishing to redirect funds to SDG-aligned investment opportunities.
Technical solution: Data science can play a transformative role in unlocking emerging markets to greater investments. Recent research finds that unstructured data (news articles, project disclosures, annual, integrated, impact and sustainability reports, bond prospectuses, TCFD disclosures) is underused in analyzing corporate ESG and impact performance. Advances in ML, NLP, and cloud computing allow for analysis of massive amounts of unstructured ESG and impact text to generate insights. Development institutions such as IFC with historical sustainability and impact data can use this information to train ML models and create analytical capacity at scale. Sharing insights and access to the MALENA NLP as a global public good will enable investors to build SDG-aligned investment portfolios for emerging markets more rapidly, efficiently, and at scale.
Product uptake, evaluation and feedback: MALENA is in beta testing with internal and external sustainability and impact experts. The project tracks SDG-aligned investments made by institutional investors and asset managers, conducts research, and shares best practices. Once at scale, MALENA will have an immediate impact on the SDGs and demonstrate the value of ML and NLP solutions.
Impact and global public good: IFC supports the need for open access ESG and impact data and technology solutions such as MALENA. There is significant overlap between IFC’s target markets and UNESCO member states. There is also consistency between UNESCO standards and IFC’s Performance Standard 8: Cultural Heritage https://www.ifc.org/wps/wcm/connect/topics_ext_content/ifc_external_corporate_site/sustainability-at-ifc/policies-standards/performance-standards/ps8
Scaling of AI solution: Please detail what proof of concept or implementations you can show now in terms of its efficacy, how the solution can be scaled to provide a global impact, and how realistic that scaling is.
Evidence for impact: Impact is measured by increases in sustainable investment in emerging markets and access to ESG and impact data as a global public good. No significant issues are anticipated with technology scale up. IFC’s best practice approach to measuring development impact will apply to MALENA to ensure effective monitoring and evaluation. IFC will share lessons learned from use of AI, ML, and NLP to achieve development outcomes through research and publications.
Scalability and Customer/end user: The tool will impact all aspects of the capital markets ecosystem, including institutional investors, asset managers, regulators, exchanges, data providers and issuers. By making ESG and impact data and capacity accessible, MALENA’s use will result in an increase in SDG-aligned investments in emerging markets. MALENA will establish the viability of AI solutions to create analytical capacity for ESG and impact investments. MALENA is built using agile approaches to enable continuous product value and improvement. The product team is gathering meaningful feedback to develop and prioritize product features. Over sixty testers are participating in beta testing. Half are internal and half are external industry professionals.
Network effect: IFC supports building networks around sustainability. In 2012, IFC facilitated the creation of the Sustainable Banking Network (SBN), a community of financial sector regulators and banking associations from emerging markets to advance sustainable and impact finance. As of November 2020, the 41 member countries represent $43 trillion (86 percent) of the total banking assets in emerging markets. IFC will leverage SBN and other networks to expand usage of MALENA.
Ethical aspect: Please detail the way the solution addresses any of the main ethical aspects, including trustworthiness, bias, gender issues, etc.
Ethical considerations: IFC maintains that trust is critical to the use of models such as MALENA. Users and stakeholders must trust that privacy will be respected, that data is used responsibly, and that such technologies are adopted in a way that supports inclusion and equity. IFC has drafted a Technology Code of Conduct that outlines core values, safeguards, and frameworks for use of AI. Applying the Code identified data bias, model drift, and explainable AI as areas of focus for MALENA.
Trustworthy AI: Data bias in model training data is tracked and managed to ensure diversity of data sources, balanced region and sector coverage, and high-quality training data. No Personally Identifiable Information is collected. All information is collected and managed using best practice data storage, privacy, and security practices following World Bank Group policies. Model drift is tracked by monitoring model performance metrics and is addressed by periodically retraining and redeploying the model to align with inference data. Future model iterations will include explainability features such as those offered by Local Interpretable Model-Agnostic Explanations (LIME) or SHapley Additive exPlanations (SHAP) to increase transparency and trustworthiness.
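Drift monitoring of the kind described above can be as simple as comparing recent evaluation scores against a baseline. The sketch below is one possible approach under stated assumptions, not IFC's monitoring process: the function name, the tolerance of 0.03, the three-evaluation window, and the sample scores are all hypothetical; only the 0.87 baseline comes from the F1 score reported earlier.

```python
def needs_retraining(baseline_f1, recent_f1_scores, tolerance=0.03, window=3):
    """Flag drift when the mean of the last `window` F1 scores falls
    more than `tolerance` below the baseline.

    Returns False until at least `window` evaluations are available.
    """
    if len(recent_f1_scores) < window:
        return False
    recent = recent_f1_scores[-window:]
    return sum(recent) / window < baseline_f1 - tolerance


BASELINE_F1 = 0.87  # F1 score reported for the model

# Hypothetical periodic evaluation scores on freshly labeled data.
stable_scores = [0.87, 0.86, 0.88]
drifting_scores = [0.86, 0.83, 0.82, 0.81]

print(needs_retraining(BASELINE_F1, stable_scores))
print(needs_retraining(BASELINE_F1, drifting_scores))
```

Averaging over a window rather than reacting to a single evaluation avoids triggering a retrain on one noisy batch of labels.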
Inclusivity: At present, the model is trained to understand English text only. Model coverage will expand to include other UN languages over the next 12 months. The MALENA taxonomy will also expand beyond ESG to other areas of the SDGs. Taxonomy development is underway and prioritizes gender, climate, and biodiversity impacts in the immediate term.