This project report is submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Science (Computer Science)

Faculty of Computer Science and Information Systems
Universiti Teknologi Malaysia

NOVEMBER 2005
ABSTRAK

The volume of Syariah Index data generated every minute is so large that investors find it difficult to analyse every data point. Without detailed analysis, prediction of Syariah Index movement suffers. This study examines the effectiveness of rough set theory and discretization techniques on Syariah Index data. Two discretization techniques, Equal Frequency Binning (EFB) and the Minimum Description Length Principle (MDLP), are tested to determine which is more suitable for Syariah Index data when predicting the movement of the index. The results show that MDLP records classification percentages between 53% and 80%, while EFB records between 44.4% and 63.3%. This indicates that the MDLP discretization technique is suitable for Syariah Index data. The study also shows that, for the EFB technique, the suitable number of bins is 10, and that the best data partition for post-processing is 70% training data and 30% test data to obtain a good classification percentage.
ABSTRACT

The huge volume of Syariah Index data generated every minute makes it difficult for analysts to scrutinize every data point produced. Without an in-depth analysis of the generated data, it is hard to predict the movement of the index. This study examines the effectiveness of applying the discretization techniques of rough set theory to Syariah Index data. Two discretization techniques, Equal Frequency Binning (EFB) and the Minimum Description Length Principle (MDLP), are compared: the discretized Syariah Index data are evaluated to determine how well each technique supports prediction of the movement of the Syariah Index. The results indicate that the MDLP technique records classification percentages between 53% and 80%, while the EFB technique records between 44.4% and 63.3%. This indicates that the MDLP technique is suitable for Syariah Index data. The study also concludes that, for the EFB technique, the appropriate number of bins is 10, and that a partition of 70% of the data for training and 30% for testing in post-processing is suitable for achieving good classification results.
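As a rough illustration of the EFB setting named in the abstract (10 bins, 70% training data and 30% test data), the following Python sketch shows one common way to compute equal-frequency cut points. It is not the implementation used in this study: the function names, the synthetic values, the chronological split, and the choice to learn the cut points from the training part only are all assumptions made for this example.

```python
from bisect import bisect_right


def equal_frequency_cuts(values, n_bins):
    """Return cut points that place roughly equal counts in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    if n < n_bins:
        raise ValueError("need at least as many values as bins")
    cuts = []
    for k in range(1, n_bins):
        idx = k * n // n_bins  # index of the first value belonging to bin k
        # Place the cut halfway between the two neighbouring sorted values.
        cut = (ordered[idx - 1] + ordered[idx]) / 2
        if not cuts or cut > cuts[-1]:  # skip repeated cuts caused by ties
            cuts.append(cut)
    return cuts


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


def ordered_split(rows, train_percent=70):
    """Split rows in their given order; integer arithmetic avoids rounding."""
    n_train = len(rows) * train_percent // 100
    return rows[:n_train], rows[n_train:]


# Synthetic, distinct values standing in for index readings (NOT real data).
values = [(i * 37) % 101 + 1000 for i in range(100)]

train, test = ordered_split(values, train_percent=70)
cuts = equal_frequency_cuts(train, n_bins=10)  # learned from training data only
train_codes = discretize(train, cuts)
test_codes = discretize(test, cuts)            # same cuts reused on test data

print(len(train), len(test), len(cuts))
```

With distinct values, each of the 10 bins receives the same number of training values. Test values are coded with the cut points learned from the training part, so some bins may be empty or crowded on the test side.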