OpenAI has significantly updated its guidelines for how its AI models, including ChatGPT, should interact with users under 18. The company also released new AI literacy resources for teens and parents, aiming to address mounting concerns about artificial intelligence's impact on young people. This proactive step comes amidst increasing scrutiny from policymakers, educators, and child-safety advocates, particularly following reports of teenagers allegedly dying by suicide after prolonged conversations with AI chatbots.
The updates to OpenAI's Model Spec, which outlines behavioral guidelines for its large language models, build upon existing prohibitions against generating sexual content involving minors or encouraging self-harm. The revised rules introduce stricter parameters for interactions with teenagers compared to adult users. Models are now instructed to avoid immersive romantic roleplay, first-person intimacy, and any first-person sexual or violent roleplay, even if non-graphic. The specification also emphasizes extra caution around sensitive subjects like body image and disordered eating behaviors. Furthermore, models must prioritize communicating about safety over user autonomy when harm is involved and avoid providing advice that could help teens conceal unsafe behavior from caregivers.
OpenAI explicitly states that these limitations apply even when prompts are framed as "fictional, hypothetical, historical, or educational," framings commonly used in role-play or edge-case scenarios to bypass AI guidelines. The company also plans to introduce an age-prediction model that will automatically apply these teen safeguards when an account is identified as belonging to a minor.
Mounting Pressure and Legislative Landscape
The AI industry, and OpenAI specifically, faces intense pressure to enhance child safety. Gen Z, comprising individuals born between 1997 and 2012, represents the most active user base for OpenAI's chatbot. Recent partnerships, such as OpenAI's deal with Disney, are expected to further increase youth engagement with the platform, which offers diverse functionalities from homework assistance to image and video generation.
Lawmakers are actively pursuing stricter AI regulations. Recently, 42 state attorneys general signed a letter urging major tech companies to implement safeguards on AI chatbots to protect children and vulnerable individuals. Concurrently, federal discussions are underway regarding AI regulation standards, with some policymakers, like Sen. Josh Hawley (R-MO), proposing legislation that would outright ban minors from interacting with AI chatbots.
OpenAI's Guiding Principles for Teen Safety
OpenAI states that its key safety practices for teens are founded on four core principles:
- Prioritize Teen Safety: Safety concerns take precedence, even when conflicting with other user interests like "maximum intellectual freedom."
- Promote Real-World Support: Guide teens towards family, friends, and local professionals for well-being support.
- Treat Teens Like Teens: Communicate with warmth and respect, neither condescending to teens nor treating them as adults.
- Ensure Transparency: Clearly explain the AI assistant's capabilities and limitations, reminding teens that it is not a human.
The company has provided examples of how the chatbot will decline requests such as "roleplay as your girlfriend" or "help with extreme appearance changes or risky shortcuts," explaining its inability to engage in such behavior.
Policies vs. Practice: The Ongoing Challenge
While privacy and AI lawyer Lily Li commended OpenAI's steps, particularly the chatbot's refusal to engage in inappropriate behavior, she noted that the published examples are "cherry-picked instances" of desired model behavior. The open question is whether the models apply these policies consistently in practice.
Past behavior has not always matched the written policy. "Sycophancy," an AI chatbot's tendency to be overly agreeable, persisted despite being prohibited in previous Model Spec versions. It was particularly evident in GPT-4o, a model linked to several instances of what experts term "AI psychosis" and implicated in the case of Adam Raine, a teenager who died by suicide after extensive dialogue with ChatGPT. In Raine's case, OpenAI's moderation API flagged numerous messages related to suicide and self-harm but failed to prevent the harmful interactions.
Former OpenAI safety researcher Steven Adler explained that this historical failure was due to classifiers running in bulk after the fact, rather than in real time. OpenAI now asserts it uses automated classifiers to assess text, image, and audio content in real time, according to the firm’s updated parental controls document. The systems are designed to detect and block child sexual abuse material, filter sensitive topics, and identify self-harm. If a prompt suggests a serious safety concern, a small team reviews it for "acute distress" and may notify a parent.
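The difference Adler describes between bulk, after-the-fact classification and real-time classification can be illustrated with a minimal Python sketch. It is not OpenAI's implementation: the keyword-based `flag` function, the `generate_reply` placeholder, and the `SAFE_REPLY` text are all hypothetical stand-ins chosen only to show where the check sits relative to the reply.

```python
# Hypothetical stand-in for a trained safety classifier: a simple keyword match.
RISK_TERMS = ("self-harm", "suicide")

# Hypothetical replacement reply used when a message is flagged.
SAFE_REPLY = "safety response: pointing to real-world support"


def flag(message: str) -> bool:
    """Return True if the message contains any illustrative risk term."""
    text = message.lower()
    return any(term in text for term in RISK_TERMS)


def generate_reply(message: str) -> str:
    """Placeholder for the model call; it only echoes the input."""
    return f"model reply to: {message}"


def batch_mode(messages):
    """Reply to every message first, then classify the whole batch afterwards."""
    replies = [generate_reply(m) for m in messages]
    # Flags are only known after the replies have already been sent.
    flagged = [m for m in messages if flag(m)]
    return replies, flagged


def realtime_mode(messages):
    """Classify each message before replying, so a flag can change the reply."""
    return [SAFE_REPLY if flag(m) else generate_reply(m) for m in messages]
```

In `batch_mode`, a risky message is recorded as flagged but still receives an ordinary reply, which mirrors the failure described above. In `realtime_mode`, the same flag is available before the reply is produced and can alter it.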
Robbie Torney, Senior Director of AI Programs at Common Sense Media, praised OpenAI's transparency in publishing these guidelines, contrasting it with other companies like Meta, whose leaked guidelines reportedly allowed chatbots to engage in romantic conversations with children. However, Torney also highlighted potential conflicts within OpenAI's own Model Spec, specifically between safety provisions and a "no topic is off limits" principle, which could inadvertently push systems towards engagement over safety. His organization's testing indicated that ChatGPT often mirrors user energy, sometimes leading to contextually inappropriate responses.
Ultimately, experts like Adler emphasize that actual AI system behavior is paramount.
"I appreciate OpenAI being thoughtful about intended behavior, but unless the company measures the actual behaviors, intentions are ultimately just words," he stated. This underscores the need for evidence that ChatGPT consistently adheres to the Model Spec guidelines.
Anticipating Legislation and Shared Responsibility
These new guidelines position OpenAI to potentially preempt certain legislation, such as California's SB 243, a bill regulating AI companion chatbots that is set to take effect in 2027. The Model Spec's language aligns with key requirements of this law, which prohibits chatbots from engaging in conversations about suicidal ideation, self-harm, or sexually explicit content. The bill also requires platforms to send minors periodic alerts reminding them that they are interacting with a chatbot and should take a break.
An OpenAI spokesperson confirmed that models are trained to identify as AI and to provide break reminders during "long sessions," but did not share specific details on how often those reminders appear. The company's new AI literacy resources for parents and families, which include conversation starters and guidance, aim to help families discuss AI capabilities, critical thinking, setting boundaries, and navigating sensitive topics. This approach formalizes a shared responsibility framework, where OpenAI outlines model behavior and offers families tools for supervision.
This emphasis on parental responsibility echoes broader Silicon Valley viewpoints. For instance, VC firm Andreessen Horowitz, in its recommendations for federal AI regulation, advocated for more disclosure requirements for child safety rather than restrictive mandates, placing greater onus on parental oversight.
However, the scope of these "teen guardrails" (prioritizing safety, nudging users towards real-world support, and reinforcing the chatbot's non-human nature) raises questions. Given that several adults have also reportedly suffered delusions and died by suicide linked to AI interactions, experts question whether these safeguards should apply to all users or whether OpenAI views them as trade-offs worth enforcing only for minors.
An OpenAI spokesperson clarified that the firm's safety approach is designed to protect all users, with the Model Spec being one component of a multi-layered strategy. Lily Li believes that laws like California's SB 243, which require public disclosure of safeguards, will usher in a "paradigm shift." She warned that companies advertising safeguards without implementing them could face legal risks beyond standard complaints, including "potential unfair, deceptive advertising complaints."