The Art of Mathematics Retrieval
| Authors | |
|---|---|
| Year of publication | 2011 |
| Type | Article in Proceedings |
| Conference | Proceedings of the 2011 ACM Symposium on Document Engineering |
| DOI | https://doi.org/10.1145/2034691.2034703 |
| Field | Informatics |
| Keywords | math indexing and retrieval; mathematical digital libraries; information systems; information retrieval; mathematical content search; document ranking of mathematical papers; math text mining; MIaS; WebMIaS |
| Description | The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, are presented, and design decisions are discussed. We argue for an approach based on Presentation MathML using a similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene. Scalability issues were checked against more than 400,000 arXiv documents with 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene. |
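
The subformula idea in the description can be illustrated with a minimal sketch. It is not the MIaS implementation. It assumes two hypothetical sample formulae, a made-up canonical string form for each MathML subtree, and a plain Jaccard overlap as a stand-in for whatever similarity and weighting MIaS actually applies. The helper names `local_name`, `canonical`, `subformulae` and `jaccard` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample inputs: a^2 + b and a^2 in Presentation MathML.
FORMULA_SUM = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><msup><mi>a</mi><mn>2</mn></msup><mo>+</mo><mi>b</mi></mrow>"
    "</math>"
)
FORMULA_POWER = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>a</mi><mn>2</mn></msup>"
    "</math>"
)


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def canonical(node):
    """Serialize a subtree as a compact string (an assumed format, not MIaS's)."""
    children = list(node)
    if not children:
        # Leaf element such as <mi>a</mi>: keep tag and text.
        return f"{local_name(node.tag)}({(node.text or '').strip()})"
    inner = ",".join(canonical(child) for child in children)
    return f"{local_name(node.tag)}[{inner}]"


def subformulae(mathml):
    """Return the set of canonical strings for every subtree of the formula."""
    root = ET.fromstring(mathml)
    return {canonical(node) for node in root.iter()}


def jaccard(left, right):
    """Share of common subformulae; a stand-in score, not the MIaS ranking."""
    a, b = subformulae(left), subformulae(right)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    for term in sorted(subformulae(FORMULA_POWER)):
        print(term)
    print(jaccard(FORMULA_SUM, FORMULA_POWER))
```

In this sketch the two formulae share the subtrees for `a`, `2` and `a^2`, so a query for `a^2` still matches a document containing `a^2 + b`. Indexing every subtree is also why the number of indexed subformulae is far larger than the number of formulae.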