Keywords:
Evaluation
Retrievability
Patent
Information retrieval
Recall-oriented domain
Corpus access
Non-Boolean
Hybrid systems
Abstract:
The development of models and systems in Information Retrieval (IR) has been driven by the empirical measurement of effectiveness. However, in recall-oriented domains such as patent search, where missing a relevant document carries a significant cost, standard IR effectiveness measures reveal only part of the truth. Because credible estimates of recall are not available, it is difficult to evaluate or design systems for this domain. Here, we propose retrievability, a measure of corpus access, and show using four large patent corpora that it can be used to evaluate both models for patent retrieval and the corpora themselves, in terms of the ease with which a document can be retrieved. © 2011 Elsevier Ltd. All rights reserved.