
Psychological evaluations and interventions aren’t equally accessible to everyone. Many people are stuck on long waiting lists for an evaluation before they can even begin receiving psychological treatment. I believe generative AI has the potential to democratize access to mental health services. That’s why I designed Empath Convo: to conduct structured clinical interviews in key areas of concern that adolescents may face during this crucial stage of development.

I started with a fuzzy problem: how to translate the clinical assessment of worry, anxiety, and bodily sensations into an AI agent capable of conducting it. I took a series of steps to clarify the problem and turn it into a working product.


Phase I: I began by learning how clinical assessments for children and adolescents are typically conducted in a natural, human-to-human setting.


Phase II: I reviewed the literature on health anxiety, worry, depression, and social comparison in adolescents.


Phase III: I learned how to build an LLM-based chatbot for clinical assessment. 


Phase IV: I used OpenAI’s API to build the chatbot’s full-stack system, including its natural in-session behavior, leveraging advanced prompt-engineering and automation techniques.


Next, I’ll explain what I did in each phase.

PHASE I
Clinical Assessment of Children and Adolescents

Although I had some experience through coursework and live observations of assessments with young people in clinical settings, I found my knowledge insufficient to begin designing the chatbot’s user experience. To address this, I began studying and mastering the key references in this domain.


I started with a classic book in the field. 


In addition to teaching assessment techniques, crisis management, and interviewing approaches for specific conditions, the book gave me a clear picture of what an evaluation looks like between a psychologist and a young person. It reinforced an important insight: the chatbot experience shouldn’t pretend to be a human conducting an assessment.

Instead, I realized I needed to design a distinct AI personality: one that can support clinical evaluation without imitating a clinician. This runs counter to some commercial mental health products that present their chatbots with human avatars, an approach I believe may be unconvincing, or even off-putting, for young people, and therefore less viable for this population.

PHASE II
Mental Health Issues in Adolescence

In this phase, I realized it was essential to understand what mental health difficulties look like during adolescence. I conducted a brief literature review on depression, social anxiety, and health anxiety among adolescents.

I can summarize my takeaways as follows:


  • Adolescence amplifies emotional and physical “noise.”

    Normal developmental changes (identity formation, body changes, peer sensitivity) can mimic clinical symptoms, so assessment needs to focus on duration, intensity, and functional impairment rather than emotion labels alone.

  • Comorbidity is common and shapes how symptoms look.

    Worry/anxiety, depression, and social anxiety often overlap. This means assessment should be modular but connected, with branching logic that detects shared drivers (e.g., rumination, avoidance, sleep disruption).


  • Maintaining behaviors matter as much as symptoms.

    Across conditions, patterns like avoidance, reassurance seeking, checking, withdrawal, and rumination keep distress going. These behaviors are easier to assess reliably than abstract feelings and are highly actionable for triage.

  • Social comparison is a major stress amplifier—especially online.

    Comparison influences self-worth, mood, and social anxiety. Interviews should explicitly probe platforms, frequency, emotional aftermath, and behavior changes (e.g., avoidance, self-criticism).


  • Depression in teens is often “hidden” in behavior.

    Symptoms may show up as irritability, loss of interest, low motivation, sleep changes, and disengagement rather than sadness. Assessment must include concrete questions about school, routines, relationships, and energy.

  • Health anxiety often involves bodily vigilance + reassurance loops.

    Teens may focus on bodily sensations and interpret them catastrophically, leading to cycles of checking, googling, and seeking reassurance. Structured interviews should map the loop: trigger → interpretation → emotion → behavior → short-term relief → longer-term cost.


  • Context and equity shape disclosure and symptom expression.

    Stigma, culture, identity, neurodiversity, and family dynamics affect how adolescents report distress. The system should use non-assumptive language, offer multiple response formats, and avoid clinician mimicry.
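The trigger → interpretation → emotion → behavior → relief → cost loop described above lends itself to a structured record that an assessment system could populate during an interview. Below is a minimal sketch; the class name, field names, and example values are my own illustrations, not part of any validated instrument.

```python
from dataclasses import dataclass

@dataclass
class ReassuranceLoop:
    """One pass through the health-anxiety maintenance cycle:
    trigger -> interpretation -> emotion -> behavior -> relief -> cost."""
    trigger: str            # e.g., a bodily sensation the teen noticed
    interpretation: str     # the catastrophic meaning assigned to it
    emotion: str            # the resulting feeling (fear, dread, ...)
    behavior: str           # checking, googling, reassurance seeking
    short_term_relief: str  # what briefly reduced the distress
    long_term_cost: str     # how the cycle keeps the anxiety going

# Hypothetical record an interview might produce
loop = ReassuranceLoop(
    trigger="noticed heart beating fast after class",
    interpretation="something must be wrong with my heart",
    emotion="fear",
    behavior="googled symptoms, asked parent for reassurance",
    short_term_relief="parent said it was fine; felt calmer",
    long_term_cost="checks pulse daily; worry returns stronger",
)
print(loop.behavior)
```

Mapping each turn of the conversation into a record like this is one way to make maintaining behaviors, which are easier to assess reliably than abstract feelings, directly available for triage logic.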

My key takeaway from Phase II was that clinical assessment of adolescent mental health is both highly specific and highly sensitive: too complex to rely on a free-flowing conversation in the hope that it naturally covers the core clinical concerns, yet too diverse and nuanced to be handled well by a rule-based, fully scripted chatbot.


So, I decided to integrate an OpenAI LLM with a well-established questionnaire for worry and health anxiety: the Whiteley Index-6 (WI-6).
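As a sketch of how a validated questionnaire can anchor the conversation, the snippet below sums a set of Likert responses and flags elevated totals for deeper follow-up. The 1–5 response format, the placeholder item labels, and the cutoff of 18 are illustrative assumptions for this sketch, not the published WI-6 items or scoring norms.

```python
# Illustrative scoring for a short health-anxiety questionnaire.
# ASSUMPTIONS: six items on a 1-5 Likert scale and a cutoff of 18;
# these are placeholders, not the published WI-6 items or norms.
ITEM_PLACEHOLDERS = [f"WI-6 item {i}" for i in range(1, 7)]
CUTOFF = 18  # hypothetical threshold for triggering deeper follow-up

def score_responses(responses: list[int]) -> tuple[int, bool]:
    """Sum the Likert responses and flag whether follow-up is warranted."""
    if len(responses) != len(ITEM_PLACEHOLDERS):
        raise ValueError("expected one response per item")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("responses must be on the 1-5 scale")
    total = sum(responses)
    return total, total >= CUTOFF

total, needs_follow_up = score_responses([4, 3, 5, 2, 4, 3])
print(total, needs_follow_up)  # 21 True
```

The value of the validated instrument is that the flag comes from established items rather than ad-hoc questions, while the conversation around each item remains free for the LLM to personalize.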

PHASE III
Leveraging LLMs to Transform Validated Questionnaires Into Clinical Conversations

I realized there needed to be a middle ground: a way to benefit from the structure of validated questionnaires while still allowing the flexibility to ask personalized follow-up questions based on an individual’s initial responses.

To achieve this, I defined the chatbot’s flow so that each topic begins with a Likert-scale question. The first follow-up question is predefined, and the Likert response sets its direction. After that, the LLM decides what to ask next: how to phrase it, what to focus on, and how many follow-up questions to ask, until it has gathered enough information about that specific area of concern.
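This flow can be sketched as a small controller: the Likert score selects a predefined first follow-up, and an injected LLM call (here a plain callable standing in for an OpenAI API request) drives the remaining questions until it signals it has enough. The topic text, the score threshold, and the "DONE" stop signal are illustrative assumptions, not the product’s actual prompts.

```python
from typing import Callable

# Hypothetical per-topic config: one Likert question plus predefined
# first follow-ups keyed to low vs. high responses.
TOPIC = {
    "likert_question": "How often do you worry about your health? (1-5)",
    "followup_low": "What helps you feel at ease about your health?",
    "followup_high": "When the worry starts, what do you usually do next?",
}

def run_topic(likert_score: int, answer: Callable[[str], str],
              ask_llm: Callable[[list[str]], str], max_turns: int = 5) -> list[str]:
    """Run one topic: Likert -> predefined follow-up -> LLM-chosen questions."""
    transcript = [f"likert={likert_score}"]
    # The Likert response sets the direction of the predefined follow-up.
    first = TOPIC["followup_high"] if likert_score >= 3 else TOPIC["followup_low"]
    transcript.append(answer(first))
    for _ in range(max_turns):
        next_q = ask_llm(transcript)   # LLM chooses phrasing and focus
        if next_q == "DONE":           # assumed stop signal from the LLM
            break
        transcript.append(answer(next_q))
    return transcript

# Stubbed run; a real system would call the OpenAI API inside ask_llm.
replies = iter(["I check my pulse a lot.", "Mostly at night."])
asks = iter(["When does that happen most?", "DONE"])
out = run_topic(4, lambda q: next(replies), lambda t: next(asks))
print(out)
```

Injecting the LLM as a callable keeps the branching logic testable without network access, and the `max_turns` cap bounds how long the model can keep probing one topic.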


PHASE IV
Using Prompt-Engineering Techniques to Build the AI Chatbot

In this phase, I used advanced prompt-engineering techniques to build the agent, following an approach I call a hierarchical supervisor–worker method.

At each development step, I first gave a meta-prompt to the supervisor agent, asking it to generate detailed prompts for a second agent (the worker). The worker then produced code to implement the specific chatbot behavior. Next, I shared both the generated code and the chatbot’s observed behavior with the supervisor to get critiques and a new set of prompts. I repeated this cycle to refine what the worker agent had initially missed or implemented poorly.
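The supervisor–worker cycle can be sketched as a short loop, with both agents injected as plain callables (in practice each would be an LLM call). The function names, the critique message format, and the "OK" acceptance signal are my own assumptions for this sketch.

```python
from typing import Callable

def supervisor_worker_cycle(meta_prompt: str,
                            supervisor: Callable[[str], str],
                            worker: Callable[[str], str],
                            evaluate: Callable[[str], str],
                            max_rounds: int = 3) -> str:
    """Iteratively refine worker output under supervisor critique.

    supervisor: turns a meta-prompt (or critique request) into a detailed prompt
    worker:     turns that prompt into code implementing the behavior
    evaluate:   observes the resulting chatbot behavior, returns a report
    """
    prompt = supervisor(meta_prompt)
    code = worker(prompt)
    for _ in range(max_rounds):
        report = evaluate(code)
        if report == "OK":  # assumed acceptance signal
            break
        # Feed code plus observed behavior back for critique and new prompts.
        prompt = supervisor(f"critique this:\n{code}\nbehavior:\n{report}")
        code = worker(prompt)
    return code

# Stubbed run showing one refinement round
sup = lambda msg: "refined prompt" if "critique" in msg else "initial prompt"
wrk = lambda p: "v2 code" if p == "refined prompt" else "v1 code"
ev = lambda c: "OK" if c == "v2 code" else "likert buttons missing"
final = supervisor_worker_cycle("build the Likert step", sup, wrk, ev)
print(final)
```

The `max_rounds` cap matters in practice: without it, a supervisor that never emits the acceptance signal would loop indefinitely.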

This approach is most effective when the task is complex and the exact steps needed to achieve the desired result are not known in advance. When the task is less technical and depends heavily on human judgment, however, it tends to be less effective.

Below is a preview of the UI, along with an example showing how the chatbot conducts the interview for one of the questions.

Empath_Convo_UI.png

IMPACT

While there is still significant room for advancement in mental health, digital health, and AI, this project achieved three milestones worth highlighting.


First, the project developed an approach for using AI chatbots to deliver assessments for different types of mental health concerns. An assessment cannot be an unstructured conversation or a set of questions that lack validation. At the same time, the capabilities of LLM-based models should be leveraged to enable deeper, more personalized assessment based on a user’s interactions, emotional tone, and specific areas of concern. We met this goal and demonstrated potential effectiveness for health anxiety by incorporating the Whiteley Index into the product.


Second, as I mentioned earlier, we deliberately chose not to design the chatbot in a way that reinforces the illusion of speaking with another human being. The UI elements, tone, and overall experience should not lead users to believe they are interacting with a person. I designed the experience to be transparently “a conversation with a chatbot,” rather than a chatbot that repeatedly claims it is artificial while presenting itself through a human avatar. That combination is paradoxical, and I believe it should be avoided when using AI in mental health settings.

Third, we demonstrated how game elements can be integrated into assessment paradigms, and how this combination could support a new generation of gamified, AI-enabled psychological interventions.

The application and the underlying technology I built can be adapted to a wide range of clinical conditions by integrating LLM-based models with validated questionnaires, particularly for use with adolescents.

TAKEAWAY

As members of the psychology community, we have a responsibility to set ethical boundaries for the use of AI systems in mental health. People in technical and business sectors may be less aware of the risks these systems can pose. I did my best to establish ethical guardrails for applying AI in mental health, drawing on my background in both psychological research and computer science and acting as a bridge between these fields.


Beyond the underlying philosophy, I also demonstrated how automation and prompt engineering can support and accelerate the development of AI systems.


My experience developing AI chatbots led me to the third product I designed: Empower.


October 2025 - January 2026, Quebec, Canada
