Debre Berhan University Institutional Repository

Job Vacancy Announcement Text Categorization using Machine Learning Algorithm


dc.contributor.author ALEMAYEHU, TEFERA
dc.date.accessioned 2021-10-28T06:53:52Z
dc.date.available 2021-10-28T06:53:52Z
dc.date.issued 2021-05
dc.identifier.uri http://etd.dbu.edu.et:80/handle/123456789/810
dc.description.abstract The availability of a large amount of electronic job vacancy text on the web makes identifying vacancy announcements relevant to a specific topic a challenging task. This is also true for Amharic texts. Amharic (ኣማርኛ) is an Ethiopian Semitic language, used as a first language by the Amhara and as the working language of the federal government. A large amount of electronic text in this domain has been generated, so a text categorization mechanism is required for finding, filtering and managing the rapid growth of online information. The goal of automatic text categorization is to classify documents into a certain number of predefined categories using rule-based or machine learning methods. The aim of this study is therefore to investigate the application of machine learning techniques to vacancy text categorization. A total of 1678 vacancy announcement texts in eight categories were collected: “ጤና” (health), “ምህንድስና” (engineering), “የኮምፒዉተር ሳይንስ ዘርፎች” (computing), “ተፈጥሮ ሳይንስ” (natural science), “ማህበራዊ ሳይንስ” (social science), “ህግ” (law), “ግብርና” (agriculture) and “ቢዝነስ እና ኢኮኖሚክስ” (business and economics). After preprocessing the text (tokenization, stemming of word variants, removal of stop words and unwanted characters, and weighting of term importance), 1610 pre-categorized texts were used to train the classifier. In this study, three supervised machine learning classifiers, namely Support Vector Machine, k-Nearest Neighbor and Naïve Bayes, are used to categorize the vacancy text.
Experimental results show that Support Vector Machine outperforms the other two classifiers (k-Nearest Neighbor and Naïve Bayes) with an accuracy of 76.4%. This is a promising result for designing a vacancy text categorization model for jobs announced in the Amharic language. The ህግ (law) category achieves the best classification accuracy in the current study, because it shares the fewest common terms with the other fields of study. However, there are challenges in designing a job vacancy text categorization model. The main challenge in this study is that common words shared by different categories produce conflicting tags, which makes these words difficult for a machine to categorize. It is therefore recommended to apply semantic-based Amharic vacancy text categorization. en_US
dc.language.iso en en_US
dc.subject Job Vacancy; Amharic language; Text Categorization; Machine Learning en_US
dc.title Job Vacancy Announcement Text Categorization using Machine Learning Algorithm en_US
dc.type Thesis en_US
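
The abstract describes a pipeline of preprocessing, term weighting and supervised classification. The following is a minimal sketch of that general idea, not the thesis's implementation: it uses TF-IDF as the term-weighting scheme (an assumption, since the abstract does not name one), a k-Nearest Neighbor classifier with cosine similarity, and omits stemming. The stop-word list, the function names and the English toy announcements are all hypothetical and stand in for the Amharic data, which is not part of this record.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the thesis's Amharic list is not given in the abstract.
STOP_WORDS = {"a", "an", "the", "and", "for", "of", "in", "to", "with"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (t.strip(".,;:!?()\"'") for t in text.lower().split())
    return [t for t in tokens if t and t not in STOP_WORDS]

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every training term."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def cosine(u, v):
    """Dot product of two unit-length sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def knn_predict(text, train_vectors, train_labels, idf, k=3):
    """Majority vote among the k most similar training documents."""
    query = tfidf_vector(tokenize(text), idf)
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Invented English stand-ins for vacancy announcements (not thesis data).
TRAIN = [
    ("Hospital seeks a registered nurse for the clinic", "health"),
    ("Clinic hiring a pharmacist and a nurse", "health"),
    ("Medical doctor needed in regional hospital", "health"),
    ("Law firm seeks an attorney for contract litigation", "law"),
    ("Court requires a legal advisor and attorney", "law"),
    ("Legal officer needed for contract drafting", "law"),
]

token_lists = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(token_lists)
vectors = [tfidf_vector(tokens, idf) for tokens in token_lists]

print(knn_predict("Nurse wanted at a private hospital", vectors, labels, idf))
print(knn_predict("Attorney needed for legal contract review", vectors, labels, idf))
```

In this toy setup the word "needed" appears in both a health and a law announcement, which is a small instance of the shared-vocabulary problem the abstract identifies as the main source of conflicting tags.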

