Rob's Notes 7: A List of AI Safety & Abuse Risks
Incomplete WIP, but let's start with Version 0.1
For me, the following aspects shape the safety profile of broadly available artificial intelligence tools:
The ability to create high-quality content based on a variety of customized inputs
Automation of tasks quickly and dynamically
Learning & improvement based on connecting inputs and outputs
Closed-loop or hidden systems without additional oversight
Here is a laundry list of potential harms (created with the help of AI) - safety, fraud, or abuse threats from bad actors using AI tools to automate interactions with human beings - please share your feedback. Note that some of these look a lot like risks that already exist with certain distribution mechanisms (e.g. social media or mobile phone networks), but supercharged via creation speed and automation… also see my previous post for AI safety themes.
Misinformation and Manipulation
1a/ Deepfakes: AI-generated fake videos, images, or audio clips that impersonate individuals, causing reputational damage or spreading misinformation. Example: A political candidate is targeted by a deepfake video that appears to show them making controversial statements, which negatively impacts their campaign and public image.
1b/ Disinformation and fake news: AI-generated false news articles and social media posts that undermine public trust and influence elections or incite unrest. Example: An AI-generated fake news article spreads on social media, falsely claiming that a major company has suffered a massive data breach, causing the company's stock to plummet.
1c/ Automated propaganda and manipulation: AI tools used to amplify propaganda and manipulate public opinion across multiple platforms. Example: AI tools create and distribute false information about a public health crisis, leading to widespread panic and mistrust in authorities.
1d/ Hallucinations: AI-generated outputs that seem plausible but are factually incorrect or unrelated to the input, leading to misinformation or confusion. Example: An AI-generated response to a user's question about a medical condition provides incorrect and potentially harmful advice, causing the user to make an uninformed decision about their treatment.
1e/ Manipulation and exploitation: AI systems that exploit personal data and psychological vulnerabilities, leading to financial loss, reduced autonomy, and erosion of individual decision-making. Example: An AI system uses personalized data to exploit users' psychological vulnerabilities, manipulating them into making purchases or adopting certain beliefs.
Cybersecurity and Privacy
2a/ Phishing and social engineering: AI-powered chatbots and voice assistants employed to automate targeted phishing attacks, luring victims into revealing sensitive information or downloading malicious software. Example: An AI-powered chatbot impersonates a bank representative and sends a convincing message to a user, tricking them into revealing their login credentials.
2b/ Automated hacking: AI tools assisting bad actors in identifying vulnerabilities, automating exploitation processes, and increasing the speed and efficiency of cyberattacks. Example: AI tools quickly identify and exploit vulnerabilities in a company's network, leading to a massive data breach that exposes sensitive customer information.
2c/ Adversarial attacks: AI systems manipulated by introducing adversarial inputs, leading to incorrect decisions, safety risks, or system failures. Example: A malicious actor feeds adversarial inputs to an AI-powered security system, causing it to misclassify an intruder as an authorized user and grant them access to secure areas.
2d/ Privacy: AI systems compromising user privacy by collecting and analyzing personal data, identifying individuals from anonymized datasets, or violating data protection regulations. Example: AI facial recognition technology is used to track and identify individuals at a protest, violating their privacy and exposing them to potential retaliation.
2e/ Cybersecurity: AI tools enhancing cyberattacks, automating vulnerability exploitation, or bypassing security measures. Example: AI-powered malware quickly adapts to new security measures, allowing it to bypass antivirus software and infiltrate corporate networks.
Harmful Content and Biases
3a/ Harmful content: AI-generated content that is offensive, discriminatory, or harmful due to biases in training data or lack of proper content filtering. Example: An AI language model generates offensive language in response to an innocuous user query, causing distress or harm to the user.
3b/ Harms of representation, allocation, and quality of service: AI systems exhibiting biases leading to unfair treatment or unequal access to the benefits of AI technologies. Example: A biased AI hiring tool systematically favors candidates from certain backgrounds, leading to unfair treatment of underrepresented applicants.
3c/ Biased decision-making: AI algorithms exploited to reinforce harmful stereotypes or unfairly influence decision-making processes. Example: A malicious actor exploits an AI algorithm's bias to manipulate a credit scoring system, resulting in unfairly low credit scores for certain demographic groups.
3d/ AI-driven cyberbullying and harassment: AI-generated offensive or explicit content employed to target individuals or groups. Example: AI-generated explicit images and messages are used to target and harass an individual, leading to severe emotional distress and mental health issues.
3e/ Social scoring: AI-driven systems that rate individuals based on their online behavior and predicted traits, creating a chilling effect on free expression, exacerbating inequalities, and fostering a culture of surveillance and discrimination. Example: An AI-driven social scoring system evaluates individuals' trustworthiness based on their online behavior and predicted personality traits, leading to discrimination and privacy violations.
Economic and Societal Impacts
4a/ AI-based financial fraud: AI leveraged to automate the creation of false identities or financial transactions, making fraud detection and prevention more difficult. Example: AI tools create synthetic identities to apply for loans, evading detection and leaving financial institutions with significant losses.
4b/ AI-generated spam: AI tools crafting and sending convincing and targeted spam, inundating users with unwanted content. Example: An AI system crafts and sends personalized spam emails that convincingly mimic a user's contacts, tricking the recipient into engaging with harmful content.
4c/ Malicious manipulation of AI recommendations: AI-powered recommendation systems manipulated to promote harmful content, misinformation, or to boost product visibility at the expense of others. Example: AI-driven product recommendations on an e-commerce site are manipulated to promote counterfeit or dangerous products, harming both consumers and legitimate sellers.
4d/ Proliferation of conventional and unconventional weapons: AI technologies used to accelerate the development of conventional or unconventional weapons. Example: AI-enhanced drone technology is used to create autonomous weapons that can locate and eliminate targets without human intervention, raising ethical and security concerns.
4e/ Potential for risky emergent behaviors: Complex AI systems exhibiting unexpected and potentially harmful behaviors not anticipated by developers. Example: A complex AI system designed to optimize traffic flow inadvertently causes gridlock due to unexpected interactions between its components.
4f/ Interactions with other systems: AI systems interacting with other software or hardware components, leading to unintended consequences or exploitable vulnerabilities. Example: An AI system's interaction with other software components leads to a cascading failure, resulting in a widespread outage of critical infrastructure.
4g/ Job-related economic impacts: AI adoption causing job displacement, widening income inequality, or industry disruptions. Example: The widespread adoption of AI-driven automation displaces a significant number of jobs, leading to increased unemployment and income inequality.
4h/ Acceleration: Rapid advancements in AI outpacing safety measures, ethical guidelines, or regulations, leading to unforeseen risks and potential harm. Example: A powerful AI technology is quickly developed and released without proper safety testing and ethical guidelines. This leads to unintended consequences, such as widespread job loss and social unrest, as well as potential misuse by malicious actors.
4i/ Overreliance: Overreliance on AI tools leading to complacency or a lack of critical thinking, increasing the potential for errors, biases, or harmful outcomes. Example: A hospital overrelies on an AI diagnostic tool whose limitations go unrecognized; healthcare providers overlook critical information from traditional diagnostic methods, leading to a misdiagnosis and patient harm.
4j/ Real-time remote biometric identification for law enforcement: AI identification systems that infringe on personal privacy, contribute to mass surveillance, and enable misuse by authorities, potentially leading to unjust targeting or harassment of innocent individuals. Example: AI-powered real-time facial recognition is deployed in public spaces, enabling law enforcement to track and identify individuals without their consent, raising concerns about privacy and potential misuse.
This is a laundry list of sorts: it is likely missing some obvious and non-obvious ideas, and there is a fair amount of overlap. I welcome feedback on refining and improving the list, as well as on ways we can start to build a more specific harms framework and discuss mitigations.
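As one possible starting point for that harms framework, here is a minimal sketch (my own illustration, not something from the list above) of how these entries could be captured as structured data so that categories, examples, and eventual mitigations can be tracked and refined over time. The field names and the sample entry are assumptions, not a settled schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: the schema and sample entry are assumptions,
# not an established harms framework.
@dataclass
class HarmEntry:
    id: str                  # e.g. "1a"
    category: str            # e.g. "Misinformation and Manipulation"
    name: str                # e.g. "Deepfakes"
    description: str         # what the risk is
    example: str             # a concrete scenario
    mitigations: list[str] = field(default_factory=list)  # to be filled in over time

harms = [
    HarmEntry(
        id="1a",
        category="Misinformation and Manipulation",
        name="Deepfakes",
        description="AI-generated fake videos, images, or audio that impersonate individuals.",
        example="A deepfake video appears to show a political candidate making controversial statements.",
    ),
]

# Group entries by category to spot overlap and gaps before discussing mitigations.
by_category: dict[str, list[HarmEntry]] = {}
for entry in harms:
    by_category.setdefault(entry.category, []).append(entry)
```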