From Protein Folding to Polling Data

Stephen Taylor · January 28, 2026

Statistics was my favourite undergraduate course at Queen’s University. Not just because I got my highest grade in it — though that helped — but because it was the most connected discipline I encountered. Deeply analytical, rigorously mathematical, and yet directly tethered to the real world. Every other course felt like it existed in its own silo. Statistics was the language that let you talk about all of them.

I studied biochemistry at Queen’s, then stayed for graduate work in bioinformatics. My research involved case-based reasoning, statistical analysis, and protein folding — work that sat at the intersection of biology, computation, and probability. Every day was about building models, testing hypotheses, and understanding what data could and could not tell you. It was rigorous, precise, and deeply satisfying.

Then I went into politics.

Being the only person in the room who understood a margin of error

Political communications is not a field that selects for mathematical rigor. It selects for instinct, narrative ability, and speed. The people I worked alongside were sharp, strategic, and effective communicators. But when polling data landed on the table, I was often the only person in the room who understood what the numbers actually meant.

I watched colleagues treat polls as exact measurements rather than probability distributions. A poll showing 34% versus 31% was read as a three-point lead, not as a statistical tie within the margin of error. I watched people cherry-pick the one poll out of five that supported their preferred narrative and dismiss the other four. I watched correlation get treated as causation routinely — a spike in social media mentions coincided with a bump in the polls, so obviously the social media campaign caused the bump.
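The gap between a reported "lead" and a statistical tie falls straight out of the formula for a poll's margin of error. A minimal sketch, assuming a simple random sample and a hypothetical 800 respondents (the original polls' sample sizes are not given):

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a
    proportion p observed in a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 34% support among 800 respondents.
moe = margin_of_error(0.34, 800)
print(f"margin of error: +/- {moe * 100:.1f} points")  # about +/- 3.3 points

# A 34% vs 31% result is a 3-point gap inside a +/- 3.3-point margin:
# the two candidates are statistically tied, not separated.
```

Smaller samples widen the margin further, which is why a "three-point lead" in a 500-person poll says even less.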

None of this was malicious. It was the natural result of numerate people working in an innumerate environment. Political staff are not stupid — they are trained in a discipline where evidence means something different than it does in science. In politics, a compelling anecdote can outweigh a dataset. A well-told story about one person’s experience can move more votes than a statistically significant finding across a thousand respondents. The incentive structure does not reward statistical rigor. It rewards persuasion.

Coming from a research background where a p-value meant something specific and a confidence interval had a precise interpretation, this was disorienting. I was a stranger in a strange land — fluent in a language nobody else spoke, watching it get mangled daily.

How scientific training changed my approach to political work

The scientific training did not make me a better political operative in any straightforward way. It did not give me better instincts about messaging or a sharper sense for voter sentiment. What it gave me was skepticism.

When someone presented data to support a strategy, I interrogated the methodology. When a campaign claimed a tactic had worked, I asked what the control group was. When polling showed a shift, I asked whether the shift was within the standard error or genuinely meaningful. This made me annoying in meetings. It also made me right more often than I was wrong.

Scientific skepticism is not cynicism. It is the discipline of requiring evidence before accepting claims — including claims you want to believe. In politics, where motivated reasoning is the default cognitive mode, that discipline is rare and genuinely useful. It does not replace political instinct. It complements it.

Why I built statistics.tools

I built Statistics Tools because statistics was the discipline that connected everything in my career — research, politics, software, data analysis — and I wanted to build something in that space.

The existing landscape for statistical tools online is split between two extremes. On one side, there is full statistical software — SPSS, R, GraphPad Prism — that costs money, requires installation, and demands expertise to operate. On the other side, there are scattered online calculators that handle one or two tests each, often with server-side processing and ad-supported business models. There was room for something in between: a comprehensive, browser-based platform where a researcher, student, or analyst could run a t-test, check a chi-square distribution, compute a confidence interval, or estimate sample size — all without installing software, creating accounts, or uploading data to someone else’s server.

I built 132 calculators across seven categories: descriptive statistics, confidence intervals, probability distributions, sample size and power analysis, hypothesis tests, correlation and regression, and statistical reference tables. I included methods that are rarely available in free web tools — bootstrap confidence intervals with the BCa method, Bayesian A/B testing, propensity score estimation, Kaplan-Meier survival analysis, repeated measures ANOVA with sphericity corrections, and all six standard ICC forms.
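As one illustration of those methods, a BCa bootstrap interval can be built with scipy. This is a sketch, not the site's own implementation; the exponential sample is invented to show why the bias- and skewness-corrected interval matters for asymmetric data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=50)  # right-skewed sample

# BCa adjusts the plain percentile interval for bias and skewness,
# so it behaves better on asymmetric data like this.
res = stats.bootstrap((data,), np.mean, confidence_level=0.95,
                      method="BCa", random_state=rng)
low, high = res.confidence_interval
print(f"95% BCa CI for the mean: [{low:.2f}, {high:.2f}]")
```

With skewed data the resulting interval is typically asymmetric around the sample mean, which a naive "mean ± 1.96 SE" interval cannot capture.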

I also compiled a 433-term statistical glossary — organized both alphabetically and thematically, with cross-references to relevant calculators. It started as a reference for making sure the tools themselves were precise, and grew into a standalone educational resource.

Building each calculator as a learning resource

The design decision I am most deliberate about is that every calculator is also a teaching tool. Each one includes the mathematical formula rendered in LaTeX, a step-by-step calculation walkthrough, preset example datasets for exploration, a discussion of assumptions and limitations, and APA-style reporting guidance for academic use.

This is where my scientific background directly shaped the product. In research, you do not just run a test and report the result. You justify why the test is appropriate, check the assumptions, interpret the output in context, and acknowledge limitations. I wanted every calculator on Statistics Tools to encourage that same discipline — not just give you a number, but help you understand what the number means and whether you should trust it.

The Shapiro-Wilk test does not just return a p-value — it explains what normality means, when it matters, and what to do if your data are not normal. Levene's test does not just test homogeneity of variance — it explains why equal variances matter for certain tests and links you to Welch's t-test as an alternative when the assumption fails. The Bonferroni correction explains the multiple comparison problem in plain language, because that is a concept most people get wrong.

Statistics as the language that connects everything

Looking back, the thread that runs through my career — from biochemistry to bioinformatics to political communications to software development — is the ability to look at data and understand what it is telling you. Statistics is the formal language for that understanding. Building Statistics Tools was a way of giving that language to anyone who needs it, without the barriers of cost, installation, or expertise that have historically kept it behind walls.

Statistics Tools runs entirely in the browser, works offline, and is available in four languages.
