Mon . 20 Jun 2020

Apache Lucene

Apache Lucene is a free and open-source information retrieval software library, originally written in pure Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License.

Lucene has been ported to other programming languages, including Object Pascal, Perl, C#, C++, Python, Ruby and PHP.

Contents

  • 1 History
  • 2 Features and common use
  • 3 Lucene-based projects
  • 4 Users
  • 5 See also
  • 6 References
  • 7 Bibliography
  • 8 External links

History

Doug Cutting originally wrote Lucene in 1999. It was initially available for download from its home at the SourceForge web site. It joined the Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became its own top-level Apache project in February 2005.

Lucene formerly included a number of sub-projects, such as Lucene.NET, Mahout, Tika and Nutch. These are now independent top-level projects.

In March 2010, the Apache Solr search server joined as a Lucene sub-project, merging the developer communities.

Version 4.0 was released on October 12, 2012.

The latest version of Lucene is 6.3.0, which was released on November 8, 2016.

Features and common use

While suitable for any application that requires full-text indexing and searching capability, Lucene has been widely recognized for its utility in the implementation of Internet search engines and local, single-site searching.

Lucene has also been used to implement recommendation systems. For example, Lucene's 'MoreLikeThis' class can generate recommendations for similar documents. In a comparison of the term vector-based similarity approach of 'MoreLikeThis' with citation-based document similarity measures, such as co-citation and Co-citation Proximity Analysis, Lucene's approach excelled at recommending documents with very similar structural characteristics and narrower relatedness. In contrast, citation-based document similarity measures tended to be more suitable for recommending more broadly related documents, meaning that citation-based approaches may be better suited to generating serendipitous recommendations, as long as the documents to be recommended contain in-text citations.
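The idea of term vector-based similarity can be illustrated with a small sketch. The code below is not Lucene's 'MoreLikeThis' implementation and does not use Lucene's API; it is a simplified toy that assumes plain term-frequency vectors and cosine similarity, and the names term_vector, cosine_similarity and more_like are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text, split on non-alphanumerics, and count each term."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like(query_id, docs):
    """Rank every other document by similarity to docs[query_id]."""
    vectors = {doc_id: term_vector(text) for doc_id, text in docs.items()}
    query = vectors[query_id]
    scored = [
        (cosine_similarity(query, vec), doc_id)
        for doc_id, vec in vectors.items()
        if doc_id != query_id
    ]
    # Highest score first; ties are broken by document id for determinism.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical sample documents, made up for this illustration.
docs = {
    "a": "lucene indexes text for full text search",
    "b": "full text search with an inverted index of text",
    "c": "indiana is a state in the united states",
}
print(more_like("a", docs))  # ['b', 'c']
```

Documents that share many terms with the query document score highest, which is consistent with the observation above that term-based similarity tends to surface closely related documents rather than broadly related ones.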

At the core of Lucene's logical architecture is the idea of a document containing fields of text. This flexibility allows Lucene's API to be independent of the file format. Text from PDFs, HTML, Microsoft Word, mind maps, and OpenDocument documents, as well as many other formats (except images), can all be indexed as long as their textual information can be extracted.
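The document-with-fields model can be sketched as a minimal inverted index. This is an illustrative toy under stated assumptions, not Lucene's API or index format: the TinyIndex class and its add and search methods are hypothetical names, and tokenization is reduced to lowercasing and splitting on non-alphanumeric characters.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index over documents made of named text fields."""

    def __init__(self):
        # Maps (field name, term) to the set of ids of documents containing it.
        self.postings = defaultdict(set)
        self.documents = []

    def add(self, fields):
        """Store a document given as {field name: text} and return its id."""
        doc_id = len(self.documents)
        self.documents.append(fields)
        for field, text in fields.items():
            for term in re.findall(r"[a-z0-9]+", text.lower()):
                self.postings[(field, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return the sorted ids of documents whose field contains the term."""
        return sorted(self.postings.get((field, term.lower()), set()))


# The text could come from a PDF, HTML page or word-processor file; the
# index only sees the extracted strings. Sample data is made up.
index = TinyIndex()
index.add({"title": "Lucene in Action", "body": "Indexing and searching text"})
index.add({"title": "Search basics", "body": "Lucene is a search library"})
print(index.search("title", "lucene"))  # [0]
print(index.search("body", "Lucene"))   # [1]
```

Because the index only receives field names and extracted strings, the source file format never reaches it, which is the independence the paragraph above describes.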

Lucene-based projects

Lucene itself is just an indexing and search library and does not contain crawling and HTML parsing functionality. However, several projects extend Lucene's capability:

  • Apache Nutch — provides web crawling and HTML parsing
  • Apache Solr — an enterprise search server
  • Elasticsearch — an enterprise search server
  • Compass — the predecessor to Elasticsearch
  • DocFetcher — a multiplatform desktop search application
  • Lucene.NET — a port of Lucene written in C# and targeted at .NET Framework users. There are currently two variations of the software, differing in Generics support and a few bug fixes
  • Swiftype - an enterprise search startup based on Lucene
  • Ferret — a search library for the Ruby programming language, inspired by Lucene. There is also a Ruby on Rails plugin called acts_as_ferret. Ferret utilizes Poshlib
  • KinoSearch — a search engine written in Perl and C and a loose port of Lucene. The Socialtext wiki software uses this search engine, and so does the MojoMojo wiki. It is also used by the Human Metabolome Database (HMDB) and the Toxin and Toxin-Target Database (T3DB)
  • Apache Lucy is a successor project of both KinoSearch and Ferret, being jointly developed by the authors of these and having bindings in both Perl and Ruby
  • Luke — a Java-based GUI for Lucene which allows you to display and modify indexes

Users

For a list of companies that use Lucene rather than extend it, see Lucene's "Powered By" page. As an example, Twitter uses Lucene for its real-time search.

See also

  • Hadoop
  • Hibernate search
  • Xapian
  • Sphinx search engine
  • List of information retrieval libraries
  • LGTE
  • Information extraction
  • Text mining
  • eGranary Digital Library
  • Enterprise search
  • Manatee indexing library

Bibliography

  • Gospodnetic, Otis; Erik Hatcher; Michael McCandless (28 June 2009). Lucene in Action (2nd ed.). Manning Publications. p. 475. ISBN 1-9339-8817-7.
  • Gospodnetic, Otis; Erik Hatcher (1 December 2004). Lucene in Action (1st ed.). Manning Publications. p. 456. ISBN 978-1-9323-9428-3.

External links

  • Official website
  • Lucene.NET
  • List of Lucene Ports or Implementations in Other Languages on the Apache wiki
  • Schmidt, Marco (2005). "Lucene Wikipedia indexer". Archived from the original on Jul 2006. Introductory article with Java code for search.
  • Apache Lucene popular APIs in GitHub
