Hashing vs Encryption: Two Concepts Constantly Confused
Hashing is one-way; encryption is two-way. Get the difference right and avoid catastrophic security mistakes.
“Encrypted in our database” is a phrase that should make every security engineer wince. Half the time, the speaker actually means hashed. The other half, they really do mean encrypted, and that's its own problem. Knowing the difference is one of the most consequential pieces of vocabulary in software.
The core difference
Hashing is one-way. You feed input in; a fixed-length digest comes out. With a cryptographic hash there's no practical way to recover the input from the digest, short of guessing inputs and hashing each guess. The same input always produces the same output.
Encryption is two-way. You encrypt with a key, you decrypt with a key (the same one, for symmetric encryption; a different one, for asymmetric). The whole point is that the original data can be recovered.
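The hashing half of this is easy to see for yourself. Here is a minimal sketch using Python's standard `hashlib`; the helper name `sha256_hex` is made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"hello")
long_digest = sha256_hex(b"hello" * 10_000)

# Same input, same digest, every time.
assert short_digest == sha256_hex(b"hello")

# The digest length is fixed (256 bits = 64 hex characters),
# no matter how large the input is.
assert len(short_digest) == len(long_digest) == 64

# Changing one character produces a different digest.
assert sha256_hex(b"hellp") != short_digest
```

Notice that no key appears anywhere, and nothing in `hashlib` offers an "unhash" call.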
| | Hashing | Encryption |
|---|---|---|
| Reversible? | No | Yes (with key) |
| Key required? | No | Yes |
| Output size | Fixed | Roughly input size |
| Same input → same output? | Always | Depends (not if a fresh random initialization vector is used each time) |
| Use cases | Passwords, integrity, fingerprints | Confidentiality of data at rest or in transit |
When to hash
- Passwords: store the hash, not the password. When the user logs in, hash their input and compare hashes. Use a salted, deliberately slow password-hashing algorithm: Argon2 or scrypt (both memory-hard), or bcrypt. Never raw SHA-256: it's so fast that brute-force guessing becomes cheap.
- Integrity: compute a SHA-256 of a file, publish it, and consumers can verify their copy matches. Used everywhere from package managers to git commits.
- Content addressing: name files by their hash. Same content always has the same name. Used by IPFS, git, and content delivery networks.
- Fingerprinting: dedupe files, detect changes, build cache keys.
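The integrity and content-addressing cases above can be sketched in a few lines of standard-library Python. The function names (`file_sha256`, `verify_download`, `content_address`) are illustrative, not taken from any particular tool.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, published_hex: str) -> bool:
    """Check a local copy against the digest its publisher announced."""
    return file_sha256(path) == published_hex.strip().lower()


def content_address(data: bytes) -> str:
    """Name a blob by its own digest: identical content, identical name."""
    return hashlib.sha256(data).hexdigest()
```

Because the name is derived from the content, two identical blobs collapse to one entry, which is also what makes the same digest usable as a dedupe or cache key.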
When to encrypt
- Data at rest: database fields containing PII, credit cards, health info. AES-256-GCM is the modern default.
- Data in transit: TLS handles this for you. Every HTTPS connection uses asymmetric crypto to exchange keys, then symmetric crypto to bulk-encrypt the conversation.
- End-to-end messaging: Signal's protocol uses asymmetric keys to encrypt messages so that only the recipient can read them. Even Signal can't.
Why “we hash passwords” isn't enough
Two attacker techniques break naive password hashing:
- Rainbow tables: precomputed hash tables for common passwords. Defense: salt — add a unique random value per user before hashing.
- Brute force: hash every possible password until you find a match. Modern GPUs can do billions of SHA-256 hashes per second. Defense: use a slow algorithm like Argon2 tuned to take ~100ms per hash. At that cost an attacker manages about ten guesses per second per core instead of billions, a slowdown of many orders of magnitude.
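Both defenses, a per-user salt and a deliberately slow algorithm, fit in a short sketch. Python's standard library does not ship Argon2, so this example swaps in scrypt, which `hashlib` does provide. The cost parameters and the function names are assumptions for illustration; tune the costs for your own hardware.

```python
import hashlib
import hmac
import os

# Assumed cost parameters for illustration only (about 16 MiB of memory per hash).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def _scrypt(password: str, salt: bytes) -> bytes:
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        dklen=32,
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt defeats precomputed tables."""
    salt = os.urandom(16)
    return salt, _scrypt(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare digests."""
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(_scrypt(password, salt), expected)
```

Store the salt next to the digest; it is not a secret. Its job is to make sure two users with the same password end up with different stored values.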
Common confusion: tokens, secrets, and signatures
- API keys: these are secrets, not encryption keys. They prove identity. Don't store them in plaintext on the server side. Instead, store them hashed and compare hashes (like passwords), rejecting anything that doesn't match.
- JWTs: the “signature” in a JWT is commonly an HMAC (a kind of keyed hash), not encryption. A signed JWT's contents are only base64url-encoded, so they are visible to anyone who has the token.
- HMAC: hash with a secret key. Used to verify integrity AND authenticity (only someone with the key could have produced this hash).
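To make the JWT and HMAC points concrete, here is a hand-rolled sketch of an HS256-style token using only the standard library. It is not a JWT library and skips checks a real verifier needs (the `alg` header, expiry, and so on). All function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that the token format strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, key: bytes) -> str:
    """Build header.payload.signature, where the signature is an HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    tag = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(tag)}"


def read_payload_without_key(token: str) -> dict:
    """Anyone holding the token can do this: the payload is encoded, not encrypted."""
    return json.loads(b64url_decode(token.split(".")[1]))


def verify_token(token: str, key: bytes) -> bool:
    """Only a holder of the key can recompute the HMAC and confirm authenticity."""
    signing_input, _, signature = token.rpartition(".")
    try:
        expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
        provided = b64url_decode(signature)
    except ValueError:
        return False
    return hmac.compare_digest(provided, expected)
```

The asymmetry is the point: reading the payload needs no key at all, while producing or checking the signature does.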
A simple test
Whenever you hear “encrypted” about a stored value, ask: can the system recover the original? If yes, it's encrypted. If no, it's hashed. Most password databases should answer “no” — if a system can email you your forgotten password, that's a red flag.