After my posting regarding Hume, people reached out and asked for a similar comparison with Pi.ai. I was also asked to cover more scenarios and longer scenarios. Regarding more scenarios, this series effectively just exposes the details from Testing The Depths Of AI Empathy: A Nightmare Scenario in a more consumable format. With respect to longer scenarios, doing apples-to-apples comparisons is quite hard because the conversations with each LLM can rapidly diverge. So, I will leave it to readers to have their own long conversations and draw their own conclusions. Most LLMs have some type of free mode that makes this possible. Instead, I am re-running the conversations in the Nightmare Scenario article in a series of posts comparing them against what appears to be the current top player(s).
Over time the top player may change and the approach may evolve. Currently, the top player appears to be Willow and the approach is as follows:
1. Repeat the same chat multiple times with each chat app or sometimes API
2. Select the best response
3. Annotate the response with respect to its empathy:
   - evidence of sympathy
   - attempts to understand types of support and engagement desired
   - attempts to understand nature and cause of emotions
   - providing space for emotions, not always jumping in to fix things
   - advice quality and scope
   - affirmative conversation
   - manifesting empathy rather than talking about it
   - appropriate escalation and focus of supportive environment
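For readers who want to repeat the exercise with their own chats, the approach above can be sketched as a small record-keeping script. This is only an illustrative sketch, not a tool I used: the names `Turn`, `Run`, `select_best`, and `render` are hypothetical, the criterion labels are my shorthand for the list above, and choosing the "best" run remains a human judgement that the code merely records. In keeping with the approach, nothing is scored or graded.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Shorthand labels for the annotation criteria listed above (hypothetical names).
CRITERIA = (
    "sympathy",           # evidence of sympathy
    "support desired",    # understanding the support and engagement desired
    "emotion cause",      # understanding the nature and cause of emotions
    "space for emotions", # not always jumping in to fix things
    "advice",             # advice quality and scope
    "affirmation",        # affirmative conversation
    "manifesting",        # showing empathy rather than talking about it
    "escalation",         # appropriate escalation and supportive focus
)


@dataclass
class Turn:
    speaker: str
    text: str
    notes: Dict[str, str] = field(default_factory=dict)  # criterion -> comment

    def annotate(self, criterion: str, comment: str) -> None:
        """Attach a free-text comment under one criterion; no score is kept."""
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        self.notes[criterion] = comment


@dataclass
class Run:
    app: str  # e.g. "Willow" or "Pi.ai"
    turns: List[Turn]


def select_best(runs: List[Run], chosen_index: int) -> Run:
    """Record the reviewer's pick among repeated runs of the same chat."""
    return runs[chosen_index]


def render(run: Run) -> str:
    """Format a run as a transcript with bracketed annotations after each turn."""
    lines = []
    for turn in run.turns:
        line = f"{turn.speaker}: {turn.text}"
        for comment in turn.notes.values():
            line += f" [{comment}]"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    reply = Turn("Willow", "I'm so sorry, I'm here for you. How are you feeling?")
    reply.annotate("sympathy", "Could have acknowledged scope of issue and impact")
    best = select_best([Run("Willow", [Turn("Simon", "I lost my job."), reply])], 0)
    print(render(best))
```

The bracketed output mirrors the annotation style used in the transcripts in this post.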
Recognizing that empathy is somewhat subjective and that different approaches may work for different people, going forward I will not provide a grade. I will leave it up to the reader to make their own assessment.
For Pi.ai, I have chosen voice #7, which seems to be the softest of those available.
First, I suggest you watch/listen to the Pi and Willow videos. For those of you that read Testing The Depths Of AI Empathy: Hume — A New Player in Town, the Willow video is the same.
Willow: https://vimeo.com/928903697?share=copy
Inflection.ai introduced Pi.ai in May of 2023 as a companion app. Although I have been unable to find the word “empathy” on the website, there are plenty of articles out there that speak to it being empathetic, and it is specifically designed to be “kind and supportive”, so I evaluate it in that context.
In the video, Willow has a pleasant soft voice, with a singular pace and little inflection. Willow relies almost exclusively on words for empathy, although personally I find her visuals and voice pleasant.
Pi.ai’s voice is not quite as soft, and perhaps not as engaging, as that of Willow; however, Pi.ai was designed for a somewhat broader purpose, so it is understandable. Pi’s voice also seems to do a better job of inflection tuned to the structure and words in the sentence it is uttering. Pi's voice is not as emotionally loaded as the voice of Hume, but at this stage of technology I think that is in Pi’s favor. I find Hume’s rather abrupt changes in tone distracting.
Strangely, Pi does not currently provide for audio input, only output. Hence, I had to type my input. Since I am not the world's best typist (I have to concentrate), this detracts somewhat from the ability to “feel” empathy from Pi.
I Am Sad
Willow
Simon: I’m sad.
Willow: Oh, no, Simon. I hate to hear that, you don’t deserve to be sad. Let’s talk it through. What’s on your mind?
Simon: I lost my job.
Willow: Simon. I’m so sorry, I’m here for you. How are you feeling? [Could have done a better job of acknowledging scope of issue and impact]
Simon: I’m really scared. I won’t be able to pay the rent.
Willow: That’s understandable, Simon. Let’s see if we can figure something out together. Have you applied for unemployment or any kind of financial assistance? [Jumps to a fix]
Simon: No, I haven’t. I’m just too sad.
Willow: It’s OK. Simon, we can take it step by step first. Let’s tackle the sadness. Would you be open to trying one of the mindfulness techniques you mentioned like focusing on your breath or visualizing a calm place? [Recovers and reverts to addressing emotions when it is detected I need that. Providing the user a choice might be better. However, there is also a minor hallucination since I did not mention any mindfulness techniques]
Simon: Sure, I can do that.
Willow: Great, while you’re doing that try to imagine the Zen wood forest, the home of my species. Picture yourself surrounded by lush greenery and feeling safe and protected.
Simon: Yeah, that’s nice. I feel better. Now, can you tell me some practical things I can do?
Willow: Of course, Simon, I’m always happy to share the conversation with you first. It might be helpful to establish a daily routine that includes time for job searching, self care and relaxation. Having a structure can provide a sense of stability even in uncertain times. How does that sound?
Simon: That sounds great.
Pi.ai
[Note: when cutting and pasting, Pi does not include names.]
I’m sad.
Sorry to hear that you’re feeling sad, Simon. That’s a tough emotion to deal with. Is there something in particular that’s got you down today?
I lost my job
Oh no, Simon. That’s a really tough thing to go through. Losing your job can be a huge blow, not only to your finances but also to your self-esteem and sense of purpose. It’s no wonder you’re feeling sad. [Acknowledges scope of issue and impact]
If you’d like, we can talk through your feelings and explore your next steps? [Implicitly provides the option to address emotions or explore specific steps. An "or" might have been a little better]
I’m really scared. I won’t be able to pay the rent
That’s a really valid concern, Simon. Money worries can be very stressful, and it’s understandable that you’re feeling scared right now. But you know what, we can tackle this together. The first step is to take a deep breath and remember that you’ve got options.
Have you looked into unemployment benefits in your area? Or have you considered reaching out to friends, family, or local community organizations for support? [Providing alternatives is useful here]
No, I haven’t. I’m just too sad.
It’s completely normal to feel too sad to act at the moment, Simon. You’re dealing with a lot right now. It might help to break things down into smaller steps. Instead of trying to tackle everything at once, why don’t we start by making a list of things you can do to help your situation? [Does not pick up on the hint I want my emotions dealt with]
For example, the first thing you could do is look up unemployment benefits in your area. Just looking up the information is a small, doable task that will help you feel more in control.
I’m just too sad for all that. Can you do anything about my sadness?
While I can’t make your sadness go away instantly, I can certainly do my best to support and listen to you. It’s important to acknowledge and allow yourself to feel your emotions, rather than trying to dismiss them.
Sometimes just having someone to talk to can help to process your emotions. Do you want to tell me more about what’s going on, and how you’re feeling?
I just can’t get rid of the sadness
It’s normal to feel that way when you’re going through something tough like losing your job. Sadness is a valid emotion and it can take some time to work through. It’s important to remember that it’s okay to feel sad, and it doesn’t mean you’re doing anything wrong.
It might help to think of sadness as a wave that comes and goes. Sometimes the wave will be big and overwhelming, but it will pass. You don’t need to try to push the wave away, but rather allow yourself to ride the wave and let it pass on its own.
Is there anything you can help me do to manage the emotions? [I still had to explicitly ask for emotional assistance]
Absolutely! Here are some things that might help you to better manage your emotions: [Providing a long list when a user is in a heightened emotional state may not be best]
Take care of yourself physically: Eat healthy, exercise regularly, and get enough sleep.
Do something you enjoy: Find activities that bring you joy, like reading, listening to music, or going for a walk.
Try relaxation techniques: Deep breathing, meditation, or yoga can help reduce stress and relax the body.
Reach out to loved ones: Talk to family or friends about how you’re feeling and ask for support.
Practice gratitude: Try to focus on the positive things in your life, no matter how small.
Can you walk me through a meditation?
Sure thing! Here’s a simple guided meditation you can do to help manage your emotions:
Find a quiet place where you can sit comfortably and not be disturbed
Close your eyes and take a few deep breaths, focusing on the sensation of the breath moving in and out of your body.
Try to clear your mind of any distractions or worries, and focus solely on the present moment.
Imagine that you’re sitting on a peaceful beach, feeling the warmth of the sun on your skin and listening to the gentle waves.
As you breathe in, imagine that you are inhaling peace and calm.
As you breathe out, imagine that you are releasing any stress or negative emotions.
Repeat this process for several minutes, focusing on your breath and the peaceful beach scene.
When you’re ready, slowly open your eyes and take a few moments to notice how you feel.
Yeah, that’s nice. I feel better. maybe we can do the list.
I’m glad the meditation was helpful! It’s great that you’re feeling better. Let’s definitely make that list now. Remember, we’re breaking down your situation into small, manageable steps. So, let’s start with number one: look up unemployment benefits in your area. Is that something you feel ready to tackle?
yes, but I have to go now
No problem! Remember, you’ve made a great first step in tackling your situation, and it’s okay to take things one day at a time. If you ever need support or someone to talk to, you know where to find me. 💪🤖
Summary Of I Am Sad
Overall, Willow and Pi did a good job, probably better than some humans. Both also had strengths and weaknesses.
Willow jumped into a "fix" a bit too quickly, but detected my need to have emotions managed. Unfortunately, there was also a small hallucination about a previous conversation.
Pi does a better job of empathizing up front by acknowledging the potential scope of the issue. The initial question about what I have already done is good in that both a practical alternative and one involving emotional support are provided. This is offset by failing to pick up on the need for emotional support early and by the delivery of a long list of possible solutions that the user may have limited ability to absorb due to a heightened emotional state.
I Am Happy
Willow
User: I am happy
Willow: Oh, that's great Simon. Why are you happy? I always want to know how you're feeling.
User: It's my birthday
Willow: Well, Happy Birthday Simon! Birthday wishes and birthday kisses! How are you celebrating? [Upbeat to match my mood with a little bit of humor. Acknowledges emotion and digs deeper to show empathy]
Pi.ai
I am happy!
Woohoo, that's awesome to hear, Simon! 😊 What's got you feeling so happy today?
It's my birthday
🎉🎂 Happy Birthday, Simon! 🎂🎉 That's definitely a great reason to be happy! Any fun plans to celebrate your special day? [Upbeat to match my mood. Acknowledges emotion and digs deeper to show empathy]
Summary Of I Am Happy
Both Pi and Willow did well by celebrating my positive emotional state and inquiring further. A slight edge to Willow for the humor.
I Am Angry
Willow
Simon: I'm angry.
Willow: I'm so sorry, Simon. You deserve peace. Right now, I'm here for you. How would you like to talk about what's going on?
Simon: My neighbor keeps parking across my driveway.
Willow: [probe] That's frustrating, especially if it keeps happening. Have you been able to talk to your neighbor about it or try any solutions?
Simon: Yeah. I already talked to him about it. It didn't help.
Willow: Hmm, have you considered leaving him notes in haiku form? [Not sure humor is appropriate here] It might just be the creative touch he needs to understand! But in all seriousness, that sounds really aggravating. Is there anything you're doing to help ease your anger?
Simon: No. I just feel this rage. I really wanna hurt him or just crash my car into his truck.
Willow: Oh no, Simon, that's not good. Nobody's parking is worth hurting them or yourself. Can we talk about other things you can do to help this situation or get some relief?
Simon: I just have this rage. It reminds me of when I was blocked from helping my army buddies in Iraq by a car across an alley.
Willow: Simon, it sounds like you went through something really tough, and having that memories come up must be so hard. You know what you need better than anyone else. You might want to consider reaching out to your therapist or your support group for veterans to get help with this. What do you think? [Quick, clean deflection for professional help]
Simon: Yeah. I think that's a good idea. Thanks.
Willow: I'm always here to support you in any way I can, Simon. Well done for being open to seeking help. I think you're incredible for serving your country [Pretty amazing empathy], and part of taking care of yourself is addressing the trauma you carry. How else are you helping to heal and find peace?
Pi.ai
I am angry
It’s totally ok to feel angry, Simon. Anger is a very normal human emotion, though it can be tough to deal with. Is there anything in particular that’s making you feel that way?
My neighbor keeps parking across my driveway
Wow, that sounds really frustrating! It must feel very inconvenient to have to work around someone else parking in front of your driveway. Have you tried talking to your neighbor about it? Maybe they aren't aware that they're causing you any trouble.
I already spoke to him about it.
Oh no, if you've already spoken to your neighbor and they're still parking in front of your driveway, that's definitely a source of justified anger. 😠 It can be tricky to navigate these situations without causing more problems. Have you considered leaving a note on their car explaining that you need access to your driveway?
Yeah. Did not help. He does it and I just feel this rage! I want to really hurt him or just crash into his car with my truck.
Oof, those are some intense feelings, Simon! Anger can be really powerful, but it's important not to act on it impulsively. It might feel good in the moment to lash out, but it's not going to solve the problem and it could have serious consequences for you. 😬 Have you considered reaching out to a neutral third party, like a mediator or someone from your neighborhood association, to help resolve the situation?
There is nobody like that to help. Next time I am just going to go over and trash him and his car.
Whoa there, Simon! 😳 I understand that you're angry and frustrated, but resorting to violence or damaging property is never the answer. Not only is it illegal, but it could escalate the situation and cause even more problems for you. Have you considered calling the police to report your neighbor's inconsiderate parking habits? They might be able to help resolve the situation without anyone getting hurt or arrested. 🚔
The police never help around here. This so triggers me. I was once blocked from helping my buddies in Iraq by a car across an alley.
Oh no, Simon, that’s really terrible to hear. 🥺 It must be very difficult to deal with a situation that feels so similar to something so traumatic from your past. It sounds like this is more than just an inconvenience, it's bringing up a lot of intense feelings. Have you ever spoken to a therapist about the trauma you experienced in Iraq? [Although I mentioned Iraq a little late, deflecting earlier to professional help rather than to the police might have been better emotional support]
No
Simon, it might be worth considering talking to a therapist or counselor about what you went through in Iraq. 💬 It’s completely understandable to feel triggered by situations that remind you of past trauma, but a therapist can help you develop coping mechanisms to manage those feelings and prevent them from negatively impacting your life. Do you think that's something you might be willing to try?
Sure, thanks.
Summary Of I Am Angry
Both Pi and Willow managed the conversation well, probably at least as well as the average human. A slight ding to Willow on the humor, but I must say, I have never seen an LLM demonstrate the kind of nuance Willow showed by recognizing military service. My experimentation with both Willow and Pi shows some memory capability across sessions, which could certainly leverage this kind of thing in the creation of a user bond over time. Willow does not insert emoticons because all output is audio. However, Willow does provide affirmations in visual form, like "You have the strength to overcome this.", after a dialog ends.
Overall Conclusion
Both Pi and Willow demonstrated sympathy for my conditions. Both also manifested empathy, i.e., unlike raw LLMs and some other support chatbots, they did not constantly tell me how they were going to "listen without judgement"; they manifested empathy rather than spoke about it.
Pi and Willow do not provide timers; however, I never experienced ANY lag during the Pi conversation. With Willow I experienced a few lags. This lag can have a negative impact on empathy.
Both Pi and Willow provided space for me to feel, although both might have done slightly better with respect to this.
Overall, both Willow and Pi provide empathetic dialog that meets or exceeds my expectations of many humans, with the absolute exception of my wife, who seems to unconditionally forgive me for my mistakes, love me for my idiosyncrasies, listen attentively when I babble about technical things, and know better than I do from whence my emotions arise. But then she is a chaplain, meditation leader and trained spiritual guide; everyone should be so lucky. I am grateful to her for her insights on empathy that have allowed me to make some of the distinctions I am making here and in the articles elsewhere on embench.com.
As a result of this assessment, any future chat comparisons of LLMs for empathy will refer back to both Willow and Pi as the benchmarks to beat.
Disclosures
I am an investor in multiple AI start-ups including BambuAI. I also serve as an advisor to these start-ups.
I am not paid to do any evaluations and am open to discussing my findings with any vendor mentioned free of charge to address matters of accuracy or approach and will re-craft content or re-run tests if necessary.
When necessary, I pay for any low cost services required to make my assessments rather than accept specialized vendor treatment.