Function `std.numeric.gapWeightedSimilarityNormalized`

The similarity computed by gapWeightedSimilarity grows with the lengths of the two ranges, even when the ranges are not actually very similar. For example, the range ["Hello", "world"] becomes increasingly similar to the range ["Hello", "world", "world", "world",...] as more instances of "world" are appended. To prevent that, gapWeightedSimilarityNormalized computes a normalized version of the similarity, defined as gapWeightedSimilarity(s, t, lambda) / sqrt(gapWeightedSimilarity(s, s, lambda) * gapWeightedSimilarity(t, t, lambda)). The function gapWeightedSimilarityNormalized (a so-called normalized kernel) is bounded in [0, 1], reaches 0 only for ranges that don't match in any position, and 1 only for identical ranges.
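
The growth problem described above can be observed directly. The following is a minimal sketch, not part of the library documentation: helloWorlds is a hypothetical helper introduced only for illustration, and lambda is assumed to be 1.0. It prints the raw and the normalized similarity as more copies of "world" are appended.

```d
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;
import std.stdio : writefln;

// Hypothetical helper (not part of std.numeric):
// builds ["Hello", "world", "world", ...] with n copies of "world".
string[] helloWorlds(size_t n)
{
    string[] r = ["Hello"];
    foreach (i; 0 .. n)
        r ~= "world";
    return r;
}

void main()
{
    string[] s = ["Hello", "world"];
    foreach (n; 1 .. 5)
    {
        auto t = helloWorlds(n);
        // Raw similarity: expected to keep growing as t gets longer.
        auto raw = gapWeightedSimilarity(s, t, 1.0);
        // Normalized similarity: stays within [0, 1].
        auto norm = gapWeightedSimilarityNormalized(s, t, 1.0);
        writefln("%s copies: raw = %s, normalized = %s", n, raw, norm);
    }
}
```

With a single copy of "world" the two ranges are identical, so the normalized value should be 1; for longer t the raw value increases while the normalized value drops below 1.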


						
Select!(isFloatingPoint!F, F, double) gapWeightedSimilarityNormalized(alias comp, R1, R2, F)(
  R1 s,
  R2 t,
  F lambda,
  F sSelfSim = F.init,
  F tSelfSim = F.init
)
if (isRandomAccessRange!R1 && hasLength!R1 && isRandomAccessRange!R2 && hasLength!R2);

The optional parameters sSelfSim and tSelfSim avoid duplicate computation. Many applications will already have computed gapWeightedSimilarity(s, s, lambda) and/or gapWeightedSimilarity(t, t, lambda); in that case, those values can be passed as sSelfSim and tSelfSim, respectively.
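
One situation where the optional parameters may pay off is comparing a single range against many candidates. The sketch below assumes that setting: similaritiesTo is a hypothetical helper, not a library function, that computes the self-similarity of the query once and passes it as sSelfSim on every call, leaving tSelfSim at its default.

```d
import std.math : isClose;
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;

// Hypothetical helper (not part of std.numeric): normalized similarity of
// one query against several candidates, reusing the query's self-similarity.
double[] similaritiesTo(string[] query, string[][] candidates, double lambda)
{
    // Computed once instead of once per candidate.
    double querySelf = gapWeightedSimilarity(query, query, lambda);
    double[] result;
    foreach (c; candidates)
        result ~= gapWeightedSimilarityNormalized(query, c, lambda, querySelf);
    return result;
}

void main()
{
    string[] s = ["Hello", "brave", "new", "world"];
    string[][] docs = [["Hello", "new", "world"], ["brave", "new", "world"]];
    auto cached = similaritiesTo(s, docs, 1.0);
    // The cached path should agree with the plain three-argument call.
    foreach (i, d; docs)
        assert(isClose(cached[i], gapWeightedSimilarityNormalized(s, d, 1.0)));
}
```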

Example

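import std.math : isClose, sqrt;
import std.numeric : gapWeightedSimilarity, gapWeightedSimilarityNormalized;
import std.stdio : writeln;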

string[] s = ["Hello", "brave", "new", "world"];
string[] t = ["Hello", "new", "world"];
writeln(gapWeightedSimilarity(s, s, 1)); // 15
writeln(gapWeightedSimilarity(t, t, 1)); // 7
writeln(gapWeightedSimilarity(s, t, 1)); // 7
assert(isClose(gapWeightedSimilarityNormalized(s, t, 1),
                7.0 / sqrt(15.0 * 7), 0.01));

Authors

Andrei Alexandrescu, Don Clugston, Robert Jacques, Ilya Yaroshenko

License

Boost License 1.0.