====== Sorting Unicode ======

To make sorting and searching of Unicode text ignore diacritical marks, first apply [[http://en.wikipedia.org/wiki/Unicode_normalization|Normalization Form Compatibility Decomposition (NFKD)]], which splits each accented character into a base character followed by its combining marks. Then keep only the first character of each decomposed sequence (or only the ASCII characters). The [[http://www.icu-project.org/|ICU library]] provides normalization and many other Unicode facilities for C/C++ and Java. In Python this looks like: unicodedata.normalize('NFKD', s).encode('ASCII', 'ignore')
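
As a minimal sketch of the approach described above, the following Python 3 example builds a sort and search key from the NFKD form. The helper names ''strip_diacritics'' and ''ascii_sort_key'' and the sample word list are hypothetical and chosen only for illustration; the only library calls used are ''unicodedata.normalize'' and ''unicodedata.combining'' from the standard library.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose with NFKD, then drop combining marks (keeps non-ASCII base letters)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ascii_sort_key(text):
    """Hypothetical helper: decompose with NFKD, then keep only the ASCII characters."""
    decomposed = unicodedata.normalize("NFKD", text)
    # In Python 3, encode() returns bytes, so decode back to str for comparison.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Illustrative data only.
words = ["zebra", "Émile", "eclair", "éclair", "Ångström"]

# Sorting: accented and unaccented spellings compare as equal.
sorted_words = sorted(words, key=ascii_sort_key)

# Searching: compare the keys instead of the raw strings.
matches = [w for w in words if ascii_sort_key(w) == ascii_sort_key("eclair")]
```

Two caveats apply to this sketch. Characters that have no decomposition into an ASCII base letter, such as ø or ß, are dropped entirely by the ASCII step, so ''strip_diacritics'' may be the safer choice when the text is not mostly Latin. The key is also still case sensitive, so uppercase letters sort before lowercase ones.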