th
"most frequent" objects, the index in the array being the reference of the object.
This array is split in several range levels:
- T[ 0 , f(1)[ : is the level 1
- T[ f(1) , f(2)[ : is the level 2
- T[ f(2) , f(3)[ : is the level 3
- ..
- T[ f(k-1) , f(k)[ : is the level k
Where f(1)=1 and f(x+1) = f(x) + ceil( f(x)*h ) with h a real in the range ]0 , 1[
When an object of reference r is used two cases can happen:
- r>=n : the object is not already in the array. It will exchange it's reference
with an object of the kth level.
- r<n : the object is in the array, at a level x. It will exchange it's reference
with an object of the (x-1)th level.
The idea is that most used objects will gradually move to the lower levels.
Some roughs experiments seems to show that the best results are when:
- the real h is near to 0.7
- the size n of the array is half the numbers of objects
For objects that are not already in the array, one important point is to
chose where to put it in the kth level, so that all objects in the
array have the same chance to move off.
Implementation
Sample program in Python
def mtfa(words, max=1024, delta=.5):
toidx = {}
fromidx = []
result = []
for w in words:
iold = toidx.get(w, None)
if iold is None:
inew = len(toidx)
result.append(inew)
toidx[w] = inew
if inew=len(fromidx):
result.append(iold)
pay += iold
inew = int(len(fromidx)*delta)+iold%int(len(fromidx)*(1-delta))
permute = fromidx[inew]
toidx[w] = inew
fromidx[inew]= w
toidx[permute] = iold
elif iold!=0:
result.append(iold)
inew = int(iold*delta)
permute = fromidx[inew]
toidx[w] = inew
fromidx[inew]= w
toidx[permute] = iold
fromidx[iold]=permute
return result