Career Dish · Real jobs, real talk

Is Data Science Stressful?

~18 min read · 6 voices

We asked six data scientists the same question. The technical difficulty came up, but it wasn't usually the thing. The thing was usually something about decisions that don't happen, pipelines that break quietly, or being the only person in the building who can read the output critically.

These characters are composites, built from dozens of real accounts, interviews, and community threads. The people aren't real. The experiences are.

The question we asked: What stresses you out most about this job?

Seth

34 · Senior data scientist at a logistics SaaS company in Denver · 3 years · Former operations analyst, no formal DS degree, self-taught Python

The Excel question. That's what I call it. It's 4:15 on a Thursday and Marcus, the VP of Sales, sends me a Slack that says "quick question" and then attaches a spreadsheet and asks why two numbers don't match. One number is from the Salesforce dashboard he built himself. The other is from the report our data team produces. They differ by 11%. He wants to know in the next 45 minutes because he's presenting to the board at 5:30.

I am not exaggerating when I say this happens every four to six weeks. Not always Marcus. Sometimes it's Felicia in Finance, sometimes it's the CEO. But the pattern is identical: last-minute number reconciliation with a high-stakes meeting attached. And the reason the numbers don't match is never simple. Last time it was because Marcus's Salesforce report was counting "opportunities created" and the data team report was counting "qualified opportunities," and those are different things with a different date filter applied, and explaining that in 45 minutes to a non-technical person who is stressed about a board meeting is genuinely difficult. I did it. The meeting went fine. Marcus emailed me afterward and said "good catch." That was it. No change to the Salesforce setup. No documentation of the definition difference. Two months later it'll happen again.

What stresses me isn't the technical part. I can find the discrepancy. What stresses me is that this keeps happening because the definitions aren't documented anywhere, and the definition documentation is not a glamorous project, so it never gets prioritized, and I've raised it four times in four quarters. My manager, Lisa, agrees it should be done. It just keeps getting pushed. I'm living with the consequences of that deprioritization on a rotating basis, and I can see exactly what needs to happen to fix it, and I can't make it happen unilaterally. That's the part that actually wears me down.

I can see exactly what needs to happen to fix it. I can't make it happen unilaterally. That's the part that wears me down.
— Seth

Asha

29 · Data scientist at a healthcare analytics firm in Boston · 3 years · Applied math degree, master's in biostatistics

The weight of the numbers. My company builds risk models for Medicaid managed care organizations. The models help payers figure out which members are at high risk for hospital readmission, and the payers use those outputs to decide which members get care coordination resources, which means phone calls, case management, scheduling support. We have about 400,000 members in our models at any given time.

I know intellectually that any model is imperfect. But there's a specific feeling that comes with knowing that my model's false negative rate translates into real people who might have needed a care coordinator call and didn't get one. In the last round of model validation, my recall on high-risk members came in at 68%, which is actually reasonable in the literature for this type of model. But the 32% of high-risk members who aren't flagged is not an abstraction to me. I've run the math. For our population size, that's roughly 3,800 people in a year who probably should have gotten an outreach call and didn't.
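
If you want to check her arithmetic, here is a minimal sketch. Only the 400,000 members and the 68% recall come from Asha's account; the roughly 3% annual high-risk rate is an assumption, chosen because it makes her 3,800 figure line up.

```python
# Back-of-envelope check on Asha's numbers.
# Only the membership count and the recall come from her account;
# the high-risk rate is an illustrative assumption.
members = 400_000
high_risk_rate = 0.03   # assumed share of members classed as high risk in a year
recall = 0.68           # share of truly high-risk members the model flags

high_risk = members * high_risk_rate   # ~12,000 people
missed = high_risk * (1 - recall)      # false negatives: ~3,840 people

print(f"High-risk members per year: {high_risk:,.0f}")
print(f"Not flagged, so no outreach call: {missed:,.0f}")
```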

My boss Chen reminds me regularly that the counterfactual is the organization using no model, or a much worse model. And he's right. The 68% recall is better than anything they had before. But I still think about the 32% when I'm trying to fall asleep. Chen doesn't have this problem, or if he does, he's stopped mentioning it after five years in the field. I'm three years in and it hasn't gotten lighter.

The 68% recall is better than anything they had before. But I still think about the 32% when I'm trying to fall asleep.
— Asha

Bianca

31 · Solo data scientist at a 60-person e-commerce brand in Nashville · 2 years · Former marketing analyst, did a master's in statistics while working

Being the only one. When I joined, the founder, a guy named Derek, told me "you're going to build our data function from scratch." That sounded exciting in the interview. What it actually means is that I own everything from data infrastructure to reporting to ad-hoc analysis to the occasional model, and there is no one to review my work, no one to check my assumptions, no one to say "that confidence interval seems off" because nobody else here can read a confidence interval. Derek reviews my slides but he's reading the direction of the arrows, not the statistical validity of the estimates underneath.

What's stressful about that is not the volume of work, though the volume is real. It's the loneliness of the quality control. I have to be my own reviewer. I catch my own mistakes because there is no one else to catch them. Last fall I found an error in a customer lifetime value model that had been running for six months. The error was in how I was handling customers who'd made exactly one purchase. I was treating them as churned too aggressively, which was deflating LTV for a whole segment. When I found it and reran the model, the LTV for that segment went up by 23%. Six months of strategic decisions made on the wrong number. Nobody knew but me.

I told Derek. He was fine about it. "These things happen, glad you caught it." But that's the thing. I almost didn't catch it. I only found it because I was doing a quarterly review of all running models, which I do because I know nobody else will. In a team environment, a second pair of eyes would have found this in code review. I'm my own code review and I'm not always a rigorous reviewer of my own work. That gap sits with me every day.

I have to be my own reviewer. I catch my own mistakes because there is no one else to catch them.
— Bianca

Felix

38 · Staff data scientist at a fintech company in New York · 6 years at the company · Physics PhD, spent 2 years as an academic postdoc before this

Model debt. It's like technical debt but for ML. We have 14 models in production. Six of them I wouldn't rebuild the way they were built if I were starting fresh. Three of them were built by people who left the company and aren't documented well enough that I could confidently change them. Two of them are running on data pipelines that have never been tested for edge cases, which means a data quality issue upstream will silently produce wrong predictions without triggering any alert. That's my baseline context when I show up to work.

My manager Theodora knows about the model debt. We have it in the quarterly planning cycle as a line item called "model maintenance." It never gets the allocation it needs because there are always more urgent new features. This is true of technical debt in general, but model debt is worse because the consequences aren't always visible. A broken feature fails obviously. A model that's slowly drifting as the data distribution shifts doesn't alert anyone. It just gets quietly less accurate, and the business keeps making decisions on it, and nobody notices until something downstream breaks or someone asks a hard question.
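
One common way teams catch the silent drift Felix describes is a population stability index (PSI) check: compare a feature's distribution at training time against the data the model is scoring now. The sketch below is a generic version of that idea, not Felix's actual monitoring setup, and the example data is synthetic.

```python
import numpy as np

def population_stability_index(reference, recent, bins=10):
    """Compare a feature's training-time distribution against recent scoring data.

    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 a shift worth investigating before trusting the model.
    """
    # Bin edges come from the reference (training-time) distribution.
    edges = np.linspace(np.min(reference), np.max(reference), bins + 1)
    edges[0], edges[-1] = -np.inf, np.inf   # catch values outside the training range

    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    new_pct = np.histogram(recent, bins=edges)[0] / len(recent)

    # Clip empty bins so the log term stays finite.
    ref_pct = np.clip(ref_pct, 1e-6, None)
    new_pct = np.clip(new_pct, 1e-6, None)

    return float(np.sum((new_pct - ref_pct) * np.log(new_pct / ref_pct)))

# Synthetic example: a feature whose mean has quietly shifted since training.
rng = np.random.default_rng(0)
training_values = rng.normal(0.0, 1.0, 50_000)
recent_values = rng.normal(0.4, 1.0, 5_000)
print(f"PSI: {population_stability_index(training_values, recent_values):.3f}")
```

Run on a schedule, a check like this turns "nobody notices" into an alert, though it only watches the inputs; it can't tell you whether the relationship between features and outcomes has shifted too.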

The specific thing that stresses me: I'm not sure all the models I inherited are performing as well as they were when deployed. I've done spot checks on the ones I can validate. The ones I can't validate easily are the ones I worry about at 11 PM. Nadia, my partner, is a physician. She has a different version of this, where a decision she made on a patient stays with her. I've started to think model debt is my version of that. Decisions locked inside production systems that I can't fully audit. It's not a comfortable feeling.

A broken feature fails obviously. A model that's slowly drifting doesn't alert anyone. It just gets quietly less accurate, and the business keeps making decisions on it.
— Felix

Natalie

26 · Data scientist at an edtech company in Austin · 1.5 years · Statistics degree from UT, first DS role

Not knowing if what I'm doing is right. That's the straightforward version. I'm still in my first role and there's this specific anxiety that comes from not having enough context to know whether my approach is reasonable or whether it's reasonable-seeming but fundamentally off. My manager Julian is supportive and technical and reviews my work, but he's stretched across four direct reports and I don't always feel like I can ask "wait, am I thinking about this correctly from first principles" without it reading as insecure or incompetent.

Specifically: I'm building an engagement prediction model that's supposed to identify students who are at risk of dropping a course. Last week I was deciding between logistic regression and a gradient boosting model. I knew the boosting model would get better AUC on the validation set, but I also knew boosting models can overfit, and the training data only has 14 months of history. I chose logistic regression and documented the reasoning. But I spent two hours on that decision, staring at a comparison table I built, and I'm still not fully sure I made the right call. Julian said "looks good" in code review, which I think means he agrees, but I'm not certain "looks good" was a rigorous technical assessment versus a "that seems defensible" response from someone who was in four meetings that afternoon.
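
For readers earlier in their careers, the comparison Natalie describes usually looks something like the sketch below. The file name, target column, and cross-validation setup are hypothetical; only the two model families and the AUC metric come from her account.

```python
# A minimal sketch of the comparison Natalie describes (scikit-learn).
# The dataset and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("engagement_features.csv")   # hypothetical feature extract
X, y = df.drop(columns=["dropped_course"]), df["dropped_course"]

models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

for name, model in models.items():
    # Cross-validated AUC; with only ~14 months of history, a large gap between
    # training and validation scores is the overfitting signal to watch for.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC {scores.mean():.3f} +/- {scores.std():.3f}")
```

The table is the easy part; the judgment call about how much to trust a small AUC gain on 14 months of data is the part that cost her two hours.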

I know this is what junior-to-mid feels like and that it gets better. My former classmate Ingrid is three years into a DS role and she says she's at the point where she trusts her own reasoning most of the time. I'm not there yet. It's not paralyzing. I just spend more mental energy on uncertainty than I expected to. I thought the hard part of this job would be the math. The hard part turns out to be calibrating how confident to be in my own judgment.

I thought the hard part of this job would be the math. The hard part is calibrating how confident to be in my own judgment.
— Natalie

Aaron

41 · Lead data scientist at a retail analytics consultancy in Minneapolis · 9 years in DS total · MBA, no technical degree, learned statistics through economics and finance

The precision that doesn't exist. Our clients come to us wanting answers that are more certain than the data can produce. This is not a failure of intelligence on their end. They've been raised on a business culture that treats data as a source of clean answers, and clean answers are usually what they ask for. "Tell us which of our 400 retail locations to close." "Tell us which SKUs are going to underperform next quarter." They want a list. They want the list to be right.

What I can actually give them is a probabilistic ranking with a set of assumptions they need to vet and confidence bounds that get wider in smaller markets. I can give them the analysis framework that makes their decision better-informed. I cannot give them certainty. My colleague Deanne has been in this field longer than me and she says the best reframe is to tell clients you're buying down risk, not buying certainty. I've borrowed that framing and it helps. But it doesn't eliminate the friction that comes when a CFO looks at a 90% confidence interval and says "so you're telling me 1 in 10 times you're wrong?" and there's no comfortable answer that doesn't confirm that yes, probabilistic methods have probabilistic outputs.
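
The "wider in smaller markets" point is mostly about sample size: the standard error of an estimated rate shrinks with the square root of n. A toy illustration follows; every number except the 90% level is made up.

```python
# Why confidence bounds widen in smaller markets: a normal-approximation
# interval for a rate. All inputs except the 90% level are illustrative.
from math import sqrt

z90 = 1.645      # two-sided 90% normal critical value
p_hat = 0.12     # hypothetical observed rate (say, share of underperforming SKUs)

for n in (5000, 500, 50):   # hypothetical sample sizes, large market to small
    half_width = z90 * sqrt(p_hat * (1 - p_hat) / n)
    print(f"n={n:>5}: 90% CI roughly {p_hat - half_width:.3f} to {p_hat + half_width:.3f}")
```

Same point estimate, very different intervals, which is exactly the conversation the CFO doesn't want to have.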

My previous life was corporate finance. A little over a decade. I built discounted cash flow models that looked precise because they had two decimal places. They were not more certain than a probability interval. They were more comfortable-feeling. Moving into data science, I carry this awareness of how much certainty is performed in professional settings rather than actually present, and it sometimes makes the conversations harder because I'm less willing to perform certainty than some of my clients want me to. That's the specific stress for me. Managing the gap between what the client wants to believe the model can tell them and what the model can actually tell them. That gap is where most of my energy goes.

Most certainty in professional settings is performed, not actual. Data science just makes that harder to hide.
— Aaron

What We Noticed

Pattern 1

The technical difficulty was rarely the primary stressor. Seth's Excel question, Aaron's precision problem, Bianca's unreviewed confidence intervals, and Natalie's model selection anxiety all involve statistics. But what actually wore them down was the surrounding human layer: undocumented definitions, stakeholder expectations that exceed what data can deliver, the lack of a second reviewer. The math was solvable. The organizational and communication context around the math was the hard part.

Pattern 2

Isolation showed up across nearly every level of experience. Bianca is genuinely alone, the only DS at her company. Asha carries the weight of her model's false negatives in a way her more seasoned boss no longer seems to. Natalie can't tell if her manager's "looks good" was rigorous or generous. Felix inherits legacy models from people who've left. The loneliness of the role, being one of very few people in a company who can read an analysis critically, compounds in different ways at every career stage.

Pattern 3

The stresses that stayed with people were the ones attached to decisions that already happened. Asha thinks about the 32% who weren't flagged. Felix lies awake about models he can't fully audit. Bianca found a six-month error. Aaron manages the gap between probabilistic truth and the certainty clients want to act on. The forward-looking technical challenges were manageable. The uncertainty about whether past work was right, and whether anyone would ever know if it wasn't, was the kind that persists.

Frequently Asked Questions

Is data science a stressful career?

It depends on context. Data scientists at companies where they're the sole analyst often report high stress from competing requests and the absence of peer review. At larger companies, model debt, stakeholder expectation management, and the weight of consequential decisions are more common stressors. The technical difficulty is rarely what experienced practitioners cite as the primary issue.

What burns data scientists out?

The most commonly cited driver is the translation layer: building something technically correct and watching it get ignored, misinterpreted, or overridden by a gut decision. A secondary driver is isolation. Being one of the few people in a company who can critically read a model output means you can't vent or validate with most colleagues. That isolation compounds over years.