
Similarity-based Retrieval (cont)

For some applications, Cost(dist(x,y)) is comparable to Tr,

  so computing dist(t.val,q) for every tuple t is infeasible.

To improve this ...

  • compute feature vector to capture "critical" object properties
  • store feature vectors "in parallel" with objects   (cf. signatures)
  • compute distance using feature vectors   (not objects)
i.e. replace dist(t.val,q) by dist'(vec(t.val),vec(q)) in the previous algorithm.
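
The three steps above can be sketched as follows. This is a minimal illustration, not the course's implementation: it assumes objects are lists of integer intensities in 0..255, that vec() is a small normalised histogram, that dist' is Euclidean distance, and that the previous algorithm is a linear scan returning the k nearest tuples. The names vec, dist_vec, build_features and knn are hypothetical.

```python
import heapq
import math

def vec(obj, bins=4, max_val=256):
    """Hypothetical feature extractor: normalised intensity histogram."""
    counts = [0] * bins
    for v in obj:
        counts[min(v * bins // max_val, bins - 1)] += 1
    n = len(obj) or 1
    return [c / n for c in counts]

def dist_vec(u, v):
    """dist': Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_features(table):
    """Compute each vector once and store it alongside the tuple id."""
    return {tid: vec(obj) for tid, obj in table.items()}

def knn(features, q, k):
    """Linear scan over the stored vectors only; objects are not read."""
    vq = vec(q)
    return heapq.nsmallest(k, features,
                           key=lambda tid: dist_vec(features[tid], vq))

# Toy data (made up): tuple id -> object
table = {
    1: [0, 10, 20, 30],
    2: [250, 240, 230, 220],
    3: [0, 5, 200, 210],
}
features = build_features(table)
nearest = knn(features, [1, 2, 3, 4], k=2)
```

The expensive dist() on full objects is never called during the scan; only the short vectors are compared, at the price of a distance that approximates the original one.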

Further optimisation: dimension-reduction to make vectors smaller
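
The slide does not name a particular dimension-reduction method. As one possible illustration, the sketch below uses a Gaussian random projection to map d-dimensional feature vectors to m < d dimensions; the names make_projection and reduce_vec are hypothetical, and other techniques (e.g. PCA) could be substituted.

```python
import random

def make_projection(d, m, seed=0):
    """Build an m x d matrix of scaled Gaussian entries (fixed seed,
    so the same projection is applied to stored vectors and queries)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)]
            for _ in range(m)]

def reduce_vec(P, v):
    """Project a d-dimensional vector v down to len(P) dimensions."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

# Example: shrink 8-dimensional feature vectors to 3 dimensions
P = make_projection(d=8, m=3)
small = reduce_vec(P, [0.1, 0.0, 0.3, 0.2, 0.0, 0.1, 0.2, 0.1])
```

Smaller vectors mean more vectors per page and cheaper distance computations, but distances between reduced vectors only approximate those between the originals.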