Quick outline
- What Aditi felt like to use day to day
- Real project examples I ran on it
- What I loved, what bugged me
- Tips if you’re new
- Who it fits, who it doesn’t
- My verdict
So… what is “Aditi,” in plain words?
To me, Aditi felt like a one-stop workspace inside JPMorgan. I had Jupyter notebooks, PySpark, data catalogs, a model registry, and a job scheduler, all tied to secure data. One login. One place. It was built for big, sensitive data. And yes, it guarded me from doing something silly, like pulling raw PII without a mask. Thank goodness.
If you’d like the full, unfiltered blow-by-blow of my week-one setup and early stumbles, I laid it out in a longer write-up right here.
Was it perfect? No. Did it help me ship real work? Yep.
A real story: a small fraud model that grew up
My first week with Aditi, I worked on low-dollar card fraud. Think weird $2 test charges at 2 a.m. The kind that hides in noise.
What I did, step by step (a rough code sketch follows the list):
- I searched the internal catalog for card auth data and prior fraud labels.
- I ran quick profile checks right in a notebook. Nulls, ranges, odd spikes. Simple stuff that saves your neck later.
- I pulled a 2% sample with PySpark to move fast. Full data came later.
- I built a baseline model (logistic regression). It set a line in the sand.
- I then tried gradient boosted trees. The lift was real.
- For class balance, I tried both simple downsampling and class weights. Class weights won.
- I ran SHAP plots to explain top features. Merchant category, time of day, device mismatch — all made sense.
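Here's a minimal sketch of that loop. Everything named in it is hypothetical: the table (card_auth_with_labels), the column names, and the feature list are stand-ins for the real Aditi catalog entries, and I'm using scikit-learn's GradientBoostingClassifier as a generic stand-in for the boosted trees.

```python
# Rough sketch of the fraud-model loop above. Table and column names
# are hypothetical stand-ins for the real catalog entries.
from pyspark.sql import SparkSession
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_sample_weight
import shap

spark = SparkSession.builder.appName("fraud_scout").getOrCreate()

# Pull a 2% sample so iteration stays fast; the seed keeps it repeatable.
auth = spark.table("card_auth_with_labels")          # hypothetical table
sample = auth.sample(fraction=0.02, seed=42).toPandas()

# Quick profile checks: nulls, ranges, odd spikes.
print(sample.isna().mean().sort_values(ascending=False).head(10))
print(sample["amount"].describe())

# Assume categoricals (merchant category, etc.) were numeric-encoded upstream.
features = ["amount", "hour_of_day", "merchant_cat_code", "device_mismatch"]
X, y = sample[features], sample["is_fraud"]
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline first: a line in the sand.
baseline = LogisticRegression(class_weight="balanced", max_iter=1000)
baseline.fit(X_tr, y_tr)

# Gradient boosted trees, with class weights instead of downsampling.
weights = compute_sample_weight("balanced", y_tr)
gbt = GradientBoostingClassifier(random_state=0)
gbt.fit(X_tr, y_tr, sample_weight=weights)

# SHAP to sanity-check which features drive the score.
explainer = shap.TreeExplainer(gbt)
shap.summary_plot(explainer.shap_values(X_te), X_te)
```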
What happened:
- Recall went from 0.72 to 0.81 at the same precision we used before (the read-off is sketched after this list).
- False positives dropped about 12% in a small pilot.
- We caught a pattern of tiny “wakeup” charges right after midnight. Not huge money, but it adds up.
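For the curious, here's how a "recall at the same precision" comparison can be read off a precision-recall curve. The 90% operating precision is an illustrative number, not our real one, and baseline, gbt, X_te, and y_te come from the sketch above.

```python
# Compare models by recall at a fixed operating precision.
from sklearn.metrics import precision_recall_curve

def recall_at_precision(y_true, scores, target_precision):
    """Best recall achievable at or above a target precision."""
    precision, recall, _ = precision_recall_curve(y_true, scores)
    ok = precision >= target_precision
    return recall[ok].max() if ok.any() else 0.0

for name, model in [("baseline", baseline), ("gbt", gbt)]:
    scores = model.predict_proba(X_te)[:, 1]
    print(name, recall_at_precision(y_te, scores, target_precision=0.90))
```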
I logged the model to the registry, wrote a small “model card” (plain-English notes), and pushed the scoring job to the scheduler. Alerts went to our team chat when drift passed a line (a rough sketch of that check is below). Nothing fancy. But it worked.
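The registry and chat hooks were internal to Aditi, so I can't show those, but the drift math itself is generic. Here's a minimal Population Stability Index check of the kind the scheduled job ran; the score arrays are simulated stand-ins, and the 0.2 threshold is a common rule of thumb, not our actual line.

```python
# Minimal drift check: PSI between reference scores and today's scores.
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a live sample."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # Clip both samples into the reference range so every value lands in a bin.
    e = np.clip(expected, edges[0], edges[-1])
    a = np.clip(actual, edges[0], edges[-1])
    e_pct = np.histogram(e, bins=edges)[0] / len(e)
    a_pct = np.histogram(a, bins=edges)[0] / len(a)
    e_pct = np.clip(e_pct, 1e-6, None)   # avoid log(0)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
reference_scores = rng.beta(2, 8, 5000)   # stand-in for training-time scores
todays_scores = rng.beta(2, 6, 1000)      # stand-in for today's live scores

drift = psi(reference_scores, todays_scores)
if drift > 0.2:                           # illustrative threshold
    print(f"ALERT: score drift, PSI = {drift:.3f}")  # real job pinged chat
```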
Another real use: a customer churn quick check
Different week, different vibe. I ran a churn risk test for a card product.
- Features: promo age, late fees, service call topics, and rewards redemption gaps.
- I capped high-cardinality stuff and used target encoding only after careful splits. No leakage, please (see the sketch after this list).
- Result: AUC went from 0.69 to 0.75 on holdout. Not magic, but a solid nudge.
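Here's the leakage-safe pattern in miniature: fit the category-to-mean map on the training split only, then apply it to the holdout, with a global-rate fallback for unseen categories. Column names are made up, and on real data you'd add smoothing or cross-fitting on top.

```python
# Target encoding without leakage: fit the map on train only.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "call_topic": ["fees", "rewards", "fees", "fraud", "rewards", "fees"],
    "churned":    [1, 0, 1, 0, 0, 1],
})
train, test = train_test_split(df, test_size=0.33, random_state=0)
train, test = train.copy(), test.copy()

global_rate = train["churned"].mean()
encoding = train.groupby("call_topic")["churned"].mean()

train["call_topic_te"] = train["call_topic"].map(encoding)
# Unseen categories in holdout fall back to the global churn rate.
test["call_topic_te"] = test["call_topic"].map(encoding).fillna(global_rate)
```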
The neat bit? Aditi’s feature store showed me a “promo age” feature someone else built. Saved me a day.
For a boots-on-the-ground look at what an NYC data-science internship actually feels like, check out this candid internship recap; the day-to-day rhythms line up surprisingly well with what I saw inside Aditi.
What I liked (and why I kept coming back)
- One place for work: Notebooks, data, jobs, and models all linked. Less tab chaos.
- Guardrails that helped: PII masking by default. Clear data lineage. Auto tags on sensitive tables.
- Model registry with history: Versions, owners, notes, and rollback. I sleep better with that.
- Cost hints: It warned me if I asked for a beefy cluster. My wallet (and my boss) thanked me.
- Git built-in: Branch, commit, merge… right from the notebook. I still pushed to GitHub Enterprise, but it kept me tidy.
Small digression: I name my notebooks like they’re pets. “fraud_scout_v7.ipynb” stayed on a short leash. It helped.
What bugged me (and made me sigh)
- Cold starts: Spark clusters took a while to warm up. Coffee-worthy wait.
- Package bumps: One tiny library version clashed with a base image. I filed a ticket and lost half a day.
- Catalog names: Some tables had names only a robot could love. I bookmarked a lot.
- Auto-logout: I looked away, came back, and poof. My session was gone. Lesson learned: save often.
- Job UI lag: The graph view stuttered with big DAGs. Not a deal-breaker, just clunky.
Tools I touched inside Aditi
- Jupyter/PySpark for data crunching
- A SQL console for quick checks
- A feature store for shared features
- A job scheduler (Airflow-style)
- A model registry with approvals
- Access rules tied to data classes
It felt like a safe kit made for a big bank, not a hobby lab. Which makes sense.
Curious how that compares to living with a full end-to-end data-science pipeline for an entire year? I broke that experience down in this pipeline deep-dive, if you want another vantage point.
Tips if you’re new
- Start small: Use a sample first. Then scale up.
- Write a quick data sheet: Columns, units, weird bits. Future you will cheer.
- Version everything: Data pulls, features, models. Keep notes in the registry.
- Set drift alerts early: Even if they seem boring. They’ll save you.
- Keep a “scratch” and a “clean” notebook: One to play, one to ship.
Who it fits (and who it doesn’t)
- Good fit: Teams with sensitive data, clear review steps, and models that need traceable history.
- Maybe not: Tiny teams that want fast-and-loose setup, new libs daily, or lots of custom Docker magic.
My verdict
Aditi made my work steady, safe, and pretty fast once things were warm. It wasn’t flashy. It didn’t try to be. But I shipped real models, with guardrails, and with a clear trail. That matters in a bank.
For outside perspectives, skim the Glassdoor data-science reviews of JPMorgan or the candid AmbitionBox data-science analytics reviews to see how other practitioners feel day-to-day.
Score: 8.6/10. If you live in finance data, you’ll likely nod along. If you chase the newest toy every week, you might grumble.
You know what? I’ll take boring and dependable when money’s on the line.
— Kayla Sox