Nils Diewald

Levenshtein-Damerau Distance

This program finds similar and dissimilar words, measured by the Levenshtein-Damerau distance, in a corpus of 8780 German words. The corpus is taken from the 10k frequency word list by Peter Kolb.
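
For readers who want to reproduce the distances listed on this page, here is a minimal sketch of the measure. It assumes the restricted variant of the Levenshtein-Damerau distance (often called optimal string alignment), in which insertion, deletion, substitution, and the swap of two adjacent characters each cost 1. The page does not say which variant the program uses, and the function name `osa_distance` is made up for this illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between the first i characters of a
    # and the first j characters of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Damerau's extension: two adjacent characters are swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("spanier", "spanien"))  # 1
print(osa_distance("spanier", "panzer"))   # 2
```

The two printed values agree with the distances this page reports for spanien and panzer. The full table needs memory proportional to the product of the two word lengths, which is negligible for single words.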

In addition, the program computes the average distance from the searched word to all other words in the corpus. The word with the smallest average distance to all other words is seien (about 6.06); the word with the largest average distance is Fußball-Weltmeisterschaft (about 21.45). These values depend heavily on word length: the average string length in the corpus is about 7.71.

The average value of all average distances is about 2.74.

The average distance of the word spanier to all other words in the corpus is 6.97129840546697.
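
The averaging described above can be sketched as follows. This is an illustration under assumptions, not the program's own code: the names `average_distance` and `length_gap` are hypothetical, the searched word is skipped if it occurs in the corpus, and the distance function is passed in as a parameter. The stand-in metric used here is only the difference in string length. Every edit operation changes the length by at most one, so this difference is a lower bound on the edit distance, which is one way to see why very long words end up with large averages.

```python
from typing import Callable, Iterable


def average_distance(word: str, corpus: Iterable[str],
                     distance: Callable[[str, str], int]) -> float:
    """Mean distance from `word` to every other corpus entry."""
    # Assumption: the searched word itself is left out of the average.
    others = [w for w in corpus if w != word]
    if not others:
        raise ValueError("corpus contains no other words")
    return sum(distance(word, w) for w in others) / len(others)


def length_gap(a: str, b: str) -> int:
    """Stand-in metric: length difference, a lower bound on the edit distance."""
    return abs(len(a) - len(b))


# Toy corpus built from words that appear on this page.
toy_corpus = ["spanien", "panzer", "papier", "fußball-weltmeisterschaft"]
print(average_distance("spanier", toy_corpus, length_gap))  # 5.0
```

Passing a real Levenshtein-Damerau function as `distance` and the full word list as `corpus` would give the kind of average reported above; the toy value of 5.0 is not comparable to it.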

Similar words:

spanien (1), panzer (2), papier (2), daniel (3), japaner (3), klavier (3), langer (3), maier (3), panik (3), papiere (3)

Dissimilar words:

fußball-weltmeisterschaft (21), verwaltungsgemeinschaft (20), nationalsozialistischen (19), mecklenburg-vorpommern (19), bundesverfassungsgericht (19), einwohnerentwicklung (18), durchschnittseinkommen (18), wirtschaftswachstum (17), wettbewerbsfähigkeit (17), verschickenleserbrief (17)
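
Once the distances from the searched word are known, the two lists above amount to picking the smallest and the largest values. The sketch below shows one way to do that with Python's `heapq` module. The function name `nearest_and_farthest` is hypothetical, and the distances are copied from the lists on this page rather than recomputed.

```python
import heapq


def nearest_and_farthest(distances: dict, k: int = 3):
    """Return the k closest and the k most distant (word, distance) pairs."""
    by_distance = lambda item: item[1]
    nearest = heapq.nsmallest(k, distances.items(), key=by_distance)
    farthest = heapq.nlargest(k, distances.items(), key=by_distance)
    return nearest, farthest


# Distances to "spanier", taken from the lists above.
distances_to_spanier = {
    "spanien": 1, "panzer": 2, "papier": 2, "daniel": 3,
    "wirtschaftswachstum": 17, "verwaltungsgemeinschaft": 20,
    "fußball-weltmeisterschaft": 21,
}

nearest, farthest = nearest_and_farthest(distances_to_spanier, k=3)
print(nearest)   # [('spanien', 1), ('panzer', 2), ('papier', 2)]
print(farthest)  # [('fußball-weltmeisterschaft', 21), ('verwaltungsgemeinschaft', 20), ('wirtschaftswachstum', 17)]
```

For a corpus of a few thousand words, sorting the whole dictionary once would work just as well; the heap-based helpers merely avoid a full sort when only a handful of entries is shown.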
