Journal Information-measuring and Control Systems, No. 11, 2009
Article in issue:
Super Arabic morphological analyzer (SAMA1)
Authors:
B. Shannaq, V. V. Alexandrov
Abstract:
A morphological process is a means of changing a stem to adjust its meaning to fit its syntactic and communicative context. Arabic is considered one of the major world languages, and because of its morphological complexity, morphological analysis has become an integral part of many Arabic information retrieval systems. Building large-scale morphological analyzers is typically a laborious and time-consuming task. In this paper, we present an Arabic morphological analyzer (SAMA1) that performs morphological analysis, which here entails finding the root of an Arabic word. SAMA1 is designed to find the root of any Arabic word without depending on a dictionary of Arabic words, and it achieves high accuracy and speed in finding the root. SAMA1 has been tested on a corpus of 24013 root abstracts (nouns and verbs) taken from an Arabic thesaurus, Arabic books, and the Saudi Arabian national computer conference. It runs an order of magnitude faster than other algorithms in the literature. The experimental results show that SAMA1 is an efficient morphological analyzer for Arabic words, with a generalization accuracy of about 98%.
Pages: 60-62