Solving anagrams: the irrational way
When you are first trying to solve anagram coding problem, one of the first ideas you come up with is to compare checksums of the words, which are tolerant to rearrangement of letters. In other words, it’d be awesome to get “def” and make A out of it, then get “fed” and make B, then compare A and B and say “Hey, they are equal so they are anagrams!”.
To make these magical checksums, you need to map each letter to a unique number. That’s where we recall that we have char codes for every letter:
const aCharCode = 'd'.charCodeAt(0) // 100
const bCharCode = 'e'.charCodeAt(0) // 101
const cCharCode = 'f'.charCodeAt(0) // 102
However, if we try to use pure char codes to make a checksum, we will face a collision and information loss:
const sum1 = 'd'.charCodeAt(0) + 'e'.charCodeAt(0) // 100 + 101
const sum2 = 'c'.charCodeAt(0) + 'f'.charCodeAt(0) // 99 + 102
console.log(sum1 === sum2) // true, 201 === 201
So the idea is to use special numbers that’s not additive. Square roots can be on the top of the mind. :)
const sum1 = Math.sqrt('d'.charCodeAt(0)) + Math.sqrt('e'.charCodeAt(0)) // √100 + √101
const sum2 = Math.sqrt('c'.charCodeAt(0)) + Math.sqrt('f'.charCodeAt(0)) // √99 + √102
console.log(sum1 === sum2) // false, 'de' !== 'cf', √100 + √101 !== √99 + √102
Unfortunately, since we step into the territory of float numbers, we can not use pure === operator for comparison anymore. We should use Number.EPSILON instead:
const sum1 = Math.sqrt('d'.charCodeAt(0)) + Math.sqrt('e'.charCodeAt(0)) // √100 + √101
const sum2 = Math.sqrt('e'.charCodeAt(0)) + Math.sqrt('d'.charCodeAt(0)) // √101 + √100
const diff = Math.abs(sum1 - sum2);
console.log(diff < Number.EPSILON) // true
Also we should normalize the tolerance for large numbers:
// imagine sum1 and sum2 are large
const diff = Math.abs(sum1 - sum2);
console.log(diff < Math.max(sum1, sum2) * Number.EPSILON)
Basically, with this code we said that if sums are equal (or their difference is close enough to 0), strings are anagrams!
So the final code to detect anagrams would be:
function isAnagram(s1: string, s2: string): boolean {
let sum1 = 0;
let sum2 = 0;
if (s1.length !== s2.length) {
return false;
}
for (let i = 0; i < s1.length; i++) {
sum1 += Math.sqrt(s1[i].charCodeAt(0));
sum2 += Math.sqrt(s2[i].charCodeAt(0));
}
const diff = Math.abs(sum1 - sum2);
// Sometimes we have the same magnitude, but different number, like 1.1*10^(-10) vs 5.1*10^(-10)
// So strings are anagrams, but the result could be still false
// It's better to compare exponent values: -10 vs -10, but decreasing the tolerance works a bit faster
const extraMagnitudeToStripFalsePositive = 10;
return diff < Math.max(sum1, sum2) * extraMagnitudeToStripFalsePositive * Number.EPSILON;
};
Summary
There are lots of different solutions for such a famous coding problem and this one is just one of them. When I came up with it, I found it interesting enough to share.
So we can use sums of char codes to compare strings, however to avoid collisions we must use non-additive numbers like square roots of char codes. Comparing float numbers, we can say that if difference is effectively close to zero enough, these strings are anagrams
You can see the Leetcode solution here