emis_user_manual

Differences

This shows you the differences between two versions of the page.

emis_user_manual [2023/02/27 22:11] – [References] ghacheyemis_user_manual [2024/03/28 00:33] (current) – [School Accreditation Dashboard] ghachey
Line 337: Line 337:
 === Sum of Item Variances === === Sum of Item Variances ===
  
-Is simply the sum of the variance of each item (i.e. question) on the test. The variance is a measure of variability, in this case how much the correct vs incorrect scores vary across all candidates for a given item.+Is simply the sum of the variance of each item (i.e. question) on the test. The variance is a measure of variability (i.e. dispersion), in this case how much the correct vs incorrect scores vary across all candidates for a given item. The theoretical lowest value is no variability at all (i.e. 0), while the highest value is 0.25, reached when exactly half of the candidates answer the item correctly. Therefore, the sum of item variances for a 60-item test ranges from 0 to 15 (60 × 0.25).
  
 <note tip>You can find an easy to follow article on how to calculate the variance of a dataset [[https://www.scribbr.com/statistics/variance/|here]] [BhandariVariance2023]</note> <note tip>You can find an easy to follow article on how to calculate the variance of a dataset [[https://www.scribbr.com/statistics/variance/|here]] [BhandariVariance2023]</note>
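The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the EMIS implementation: the function name ''sum_of_item_variances'' and the sample data are hypothetical, and it assumes each item is scored 1 (correct) or 0 (incorrect) and that the population variance is used, which is what gives the 0.25 maximum per item.

```python
from statistics import pvariance

def sum_of_item_variances(responses):
    """Sum the population variance of each item (column) of 0/1 scores.

    `responses` holds one row per candidate and one 0/1 entry per item.
    """
    n_items = len(responses[0])
    item_variances = [
        pvariance([row[i] for row in responses]) for i in range(n_items)
    ]
    return sum(item_variances)

# Hypothetical data: 4 candidates, 3 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
]

# Item variances are 0.1875, 0.25 and 0, so the sum is 0.4375.
total = sum_of_item_variances(responses)
print(total)
```

The second item is answered correctly by exactly half of the candidates and therefore reaches the 0.25 maximum, while the third item is answered the same way by everyone and contributes nothing.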
 +
 +=== Variance of Total ===
 +
 +The variance of total is the variance of the dataset of all candidates' total scores. For example, in this test there were 689 candidates and 60 items. Each candidate will have answered a number of items correctly (e.g. a student with 45 correct items out of 60 has a score of 45). The dataset would look like {45, 34, 51, 23,..., 39} with 689 scores. The variance of total is how much the total candidate scores vary. The lower bound is 0, meaning no dispersion at all (every candidate has the same score). The upper bound corresponds to extremely high dispersion: theoretically as high as 900 for this test, approached when about half of the candidates score 0 and the other half score 60. Both bounds, and the upper bound in particular, depend fully on the range of possible scores (e.g. 0-60).
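To make the two bounds concrete, here is a small Python sketch using two hypothetical sets of total scores for a 60-item test. The variable names and data are invented for illustration, and the population variance is assumed.

```python
from statistics import pvariance

MAX_SCORE = 60  # number of items on the test

# Hypothetical total scores for 10 candidates.
all_same = [45] * 10                         # every candidate scores 45
extreme_split = [0] * 5 + [MAX_SCORE] * 5    # half score 0, half score 60

lower_bound = pvariance(all_same)        # equals 0: no dispersion at all
upper_bound = pvariance(extreme_split)   # equals 900, i.e. (60 / 2) ** 2

print(lower_bound, upper_bound)
```

Real results fall somewhere between these two extremes, usually much closer to the lower one.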
 +
 +=== Standard Deviation of Total ===
 +
 +The standard deviation is also a measure of variability (i.e. dispersion) of values in a dataset. It assesses how far a data point likely falls from the mean (i.e. average).
 +
 +<note tip>Further reading on how to produce the standard deviation can be found [[https://www.scribbr.com/statistics/standard-deviation/|here]] [BhandariStd2023]</note>
 +
 +The standard deviation, along with the mean, can then be used to produce the commonly used normal distribution, which visually shows how likely a student's score is to fall a given distance from the mean (average).
 +
 +{{ :user-manual:student-score-normal-distribution-sample.png?nolink&600 |}}
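As a rough sketch of how the mean and standard deviation lead to a normal distribution, the Python standard library can be used as follows. The five scores are a hypothetical sample, the population standard deviation is assumed, and treating the scores as normally distributed is itself an approximation.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical total scores for five candidates on a 60-item test.
totals = [45, 34, 51, 23, 39]

mu = mean(totals)       # average score
sigma = pstdev(totals)  # population standard deviation of the scores

# Normal distribution with the same mean and standard deviation.
dist = NormalDist(mu, sigma)

# Estimated share of candidates scoring below 30 under the normal model.
share_below_30 = dist.cdf(30)
print(mu, sigma, share_below_30)
```

With only five scores the estimate is of little practical value; the approach becomes meaningful with the hundreds of candidates found in a real exam.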
  
 === Cronbach's Alpha (Alpha) === === Cronbach's Alpha (Alpha) ===
Line 352: Line 366:
   * **.50 or below**: Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.   * **.50 or below**: Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
  
 +<note tip>For further reading on Cronbach's alpha refer to [[https://statisticsbyjim.com/basics/cronbachs-alpha/|this article]]</note>
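For readers who want to see how alpha relates to the sum of item variances and the variance of total described earlier, the standard textbook formula is alpha = k / (k - 1) × (1 - sum of item variances / variance of total), where k is the number of items. The sketch below assumes this is the formula in use, which this manual does not state explicitly; the function name and the input numbers are hypothetical.

```python
def cronbach_alpha(n_items, sum_item_variances, variance_of_total):
    """Textbook Cronbach's alpha from the two variance figures.

    Assumes both variances were computed the same way (e.g. both as
    population variances) over the same set of candidates.
    """
    ratio = sum_item_variances / variance_of_total
    return (n_items / (n_items - 1)) * (1 - ratio)

# Hypothetical figures for a 60-item test.
alpha = cronbach_alpha(n_items=60, sum_item_variances=12.0, variance_of_total=100.0)
print(round(alpha, 3))  # about 0.895
```

The smaller the sum of item variances is relative to the variance of total, the closer alpha gets to 1.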
  
 +=== Standard Error of Measurement ===
 +
 +A standard error of measurement estimates the variation around a "true" score for an individual when repeated measures are taken. 
 +
 +<note tip>The mathematical formula behind this estimation is nicely explained [[https://www.statology.org/standard-error-of-measurement/|here]].</note>
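A minimal sketch of the commonly used formula, SEM = standard deviation × √(1 - reliability), is shown below. It assumes that the reliability coefficient is Cronbach's alpha and that the standard deviation is that of the total scores; the function name and the numbers are hypothetical.

```python
import math

def standard_error_of_measurement(std_dev_of_total, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    return std_dev_of_total * math.sqrt(1 - reliability)

# Hypothetical figures: standard deviation of 10 points, alpha of 0.84.
sem = standard_error_of_measurement(10.0, 0.84)
print(round(sem, 2))  # 4.0
```

In this example a candidate's "true" score would be expected to lie within roughly 4 points of the observed score most of the time.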
  
-<note tip>For further reading on cronbach's alpha refer to [[https://statisticsbyjim.com/basics/cronbachs-alpha/|this article]]</note> 
  
 === Item Difficulty === === Item Difficulty ===
Line 380: Line 399:
 === Item Discrimination Index === === Item Discrimination Index ===
  
-The item's discrimination index is a measure of how well an item is able to distinguish between candidates who are knowledgeable and those who are not. A negative discrimination index may indicate that the item is measuring something other than what the rest of the test is measuring. Optimally an item should have a positive discrimination index of at least 0.2, which indicates that high scorers have a high probability of answering correctly and low scorers have a low probability of answering correctly.+The item's discrimination index is a measure of how well an item is able to distinguish between candidates who are knowledgeable and those who are not (more [[https://knowledgeburrow.com/what-is-discrimination-index-and-its-formula/|here]] [BrownDI2020]). A negative discrimination index may indicate that the item is measuring something other than what the rest of the test is measuring. Optimally an item should have a positive discrimination index of at least 0.2, which indicates that high scorers have a high probability of answering correctly and low scorers have a low probability of answering correctly.
  
 <note tip>The 0.2 threshold is an arbitrary number. Some systems use different thresholds. For example, ScorePak® classifies item discrimination as "good" if the index is above 0.30; "fair" if it is between 0.10 and 0.30; and "poor" if it is below 0.10.</note> <note tip>The 0.2 threshold is an arbitrary number. Some systems use different thresholds. For example, ScorePak® classifies item discrimination as "good" if the index is above 0.30; "fair" if it is between 0.10 and 0.30; and "poor" if it is below 0.10.</note>
Line 394: Line 413:
 <note tip>For those wondering how the discrimination index is produced, here it is. Two groups for the whole exam are created: the top 27% of candidates and the bottom 27% of candidates based on the total scores of candidates. The total number of candidates who got the correct answer for the studied item in the top 27% group is recorded (e.g. 104 in figure above) and the same is done for the bottom 27% group (e.g. 56 in figure above). The number recorded for the bottom 27% group is subtracted from the number recorded for the top 27% group, then divide that number by the total in a group (e.g. 187 in the figure above)</note> <note tip>For those wondering how the discrimination index is produced, here it is. Two groups for the whole exam are created: the top 27% of candidates and the bottom 27% of candidates based on the total scores of candidates. The total number of candidates who got the correct answer for the studied item in the top 27% group is recorded (e.g. 104 in figure above) and the same is done for the bottom 27% group (e.g. 56 in figure above). The number recorded for the bottom 27% group is subtracted from the number recorded for the top 27% group, then divide that number by the total in a group (e.g. 187 in the figure above)</note>
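The procedure in the note above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and rounding the group size up is an assumption (it happens to give 187 for 689 candidates, matching the figure).

```python
import math

def split_groups(total_scores, fraction=0.27):
    """Return (top_ids, bottom_ids) ranked by total score.

    `total_scores` maps a candidate id to that candidate's total score.
    Rounding the group size up is an assumption of this sketch.
    """
    size = math.ceil(len(total_scores) * fraction)
    ranked = sorted(total_scores, key=total_scores.get, reverse=True)
    return ranked[:size], ranked[-size:]

def discrimination_index(correct_top, correct_bottom, group_size):
    """(correct in top group - correct in bottom group) / group size."""
    return (correct_top - correct_bottom) / group_size

# Figures quoted in the note: 104 correct in the top group,
# 56 correct in the bottom group, 187 candidates per group.
index = discrimination_index(104, 56, 187)
print(round(index, 2))  # 0.26
```

With these figures the item sits just above the 0.2 threshold mentioned earlier.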
  
 +More on test analysis can be found [[https://www.washington.edu/assessment/scanning-scoring/scoring/reports/item-analysis/|here]] [UWItemAnalysis]
  
 ==== Dashboard Analysis ==== ==== Dashboard Analysis ====
Line 757: Line 777:
 ==== School Accreditation Dashboard ==== ==== School Accreditation Dashboard ====
  
-This is where you can get some live data analytics just like every other module in the EMIS. You access just like other dashboard, Click on ''School Accreditation -> Dashboard''+This is where you can get some live data analytics, just like in every other module in the EMIS. You access it just like the other dashboards: click on ''School Accreditation -> Dashboard''.
 + 
 +<note important>In order for the dashboard to contain data from the most recently approved and uploaded cloudfiles, the EMIS warehouse has to be regenerated.</note>
  
 {{ :user-manual:school-accreditation-dashboard-1.png |}} {{ :user-manual:school-accreditation-dashboard-1.png |}}
Line 1161: Line 1183:
 ===== Data Operations ===== ===== Data Operations =====
  
-==== Upload Workbook ====+==== Annual Census Data Transfer ==== 
 + 
 +The Pacific EMIS supports a number of ways to transfer data from schools to a centralized database. They vary in maturity and in how complex they are for users to adopt. They are summarized in the table below. 
 + 
 +^ Method      ^ Description       ^ Maturity          ^ User Adoption Complexity ^ 
 +| PDF Survey    | A comprehensive survey of data with the majority of it in aggregate format containing enough for the main key performance indicators     | Very mature       | Easy       | 
 +| Excel Workbook    | A comprehensive survey of data with support for individual rosters and richer data containing enough for the main key performance indicators and more. Built on widely used spreadsheet (Excel)     | Very mature       | Moderate       | 
 +| Student Information System (SIS)    | The highest form of data management within the schools then pushed to the centralized system. An online web-based student information system meant for daily use by schools.     | On-going development       | Difficult       | 
 + 
 + 
 + 
 +==== Upload PDF Survey ==== 
 + 
 +<note important>This documentation below is strictly how to load an Annual School Census PDF Survey into the EMIS. Information on how to use the PDF Survey itself is documented in [[annual_school_census_pdf_survey_user_manual|Annual School Census PDF Survey User Manual]]</note> 
 + 
 +<note important>Not every country uses the PDF Survey. Some use the full roster workbook. This section is only relevant if you use the PDF Survey, which typically contains yearly aggregate data as opposed to full student/staff rosters.</note> 
 + 
 +Your Annual School Census PDF Survey must be completed first. Then, if you have the appropriate permissions, go to the ''Data Operation -> Upload PDF Survey'' sub-menu item and drag the PDF Survey as shown below. 
 + 
 +<note important>Occasionally the PDF Survey upload would result in an "Empty Name Token" error, which would prevent the survey from loading. This came from PdfFileAnalyzer, the open-source dependency used to extract the data from the PDF file. A custom branch of PdfFileAnalyzer was developed to eliminate that error, so more surveys will now load without problems.</note> 
 + 
 +<note tip>To speed up the upload, you can now zip the survey and upload the zipped file instead. You can do this with the default zipping tool on your operating system. We also recommend installing [[https://www.7-zip.org/|7-zip]], a free open-source tool that can easily zip files. Compressing the file (i.e. zipping) can make the upload up to 3-4 times faster.</note> 
 + 
 +{{ :user-manual:pdf-survey-upload.png?nolink |}} 
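For those who prefer to script the zipping step mentioned in the tip above, a minimal sketch using Python's standard ''zipfile'' module could look like this. The function name and the file name in the usage comment are hypothetical, and it assumes the EMIS accepts an archive with the PDF at its top level.

```python
import zipfile
from pathlib import Path

def zip_survey(pdf_path):
    """Create a .zip next to the given PDF and return the path to the archive."""
    pdf_path = Path(pdf_path)
    zip_path = pdf_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the PDF at the top level of the archive under its own file name.
        archive.write(pdf_path, arcname=pdf_path.name)
    return zip_path

# Example usage (hypothetical file name):
# zip_survey("census-survey.pdf")
```

The resulting zip file can then be dragged onto the upload page in the same way as the PDF itself.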
 + 
 +==== Upload Excel Workbook ====
  
 <note important>This documentation below is strictly how to load an Annual School Census Workbook into the EMIS. Information on how to use the workbook itself is documented in [[annual_school_census_workbook_user_manual|Annual School Census Workbook User Manual]]</note> <note important>This documentation below is strictly how to load an Annual School Census Workbook into the EMIS. Information on how to use the workbook itself is documented in [[annual_school_census_workbook_user_manual|Annual School Census Workbook User Manual]]</note>
  
-Your Annual School Census Workbook must be completed first. Then if you have appropriate permissions then you you simply go to ''Data Operation -> Upload Workbook'' sub-menu item. Simply drag the workbook as shown below.+Your Annual School Census Workbook must be completed first. Then if you have appropriate permissions you simply go to ''Data Operation -> Upload Workbook'' sub-menu item. Simply drag the workbook as shown below.
  
 {{ :user-manual:workbook-upload-1.png |}} {{ :user-manual:workbook-upload-1.png |}}
Line 1372: Line 1419:
 ==== Rebuild Warehouse ==== ==== Rebuild Warehouse ====
  
-===== Professional Data Publications =====+===== Professional Data Analysis, Reports and Publications =====
  
-We have a combination of tools (e.g. [[data_publishers_toolkit|Data Publishing Toolkit]], [[multi-part_word_documents|Multi-part Word Document]]) and practices for publishing data into Excel tables and pivot tables. This forms the basis for the construction of documents such as Annual Statistics Digest, etc. These connection are direct connections to SQL Server using Excel’s OLEDB capacity. It is possible to establish SQL connections across the WAN, but this entails some careful security setup to allows users through a firewall to reach the SQL Server. In particular, a "read only" user. In order to support users who do not have access to SQL (i.e. either across their LAN, or across the WAN) to access data in Excel we are building support for new options. This would allow more options for remote hosting, and also for a broader set of users to have access to Excel.+The Pacific EMIS offers a variety of means for data analysis, reports and publications. They range from very simple to use to highly advanced, requiring advanced data analysis skills. The methods are summarized below.
  
-The three available planned options are available in various levels of maturity+^ Type ^ Ease of Use ^ Flexibility ^ 
 +|Pre-built Reports | Easy | More rigid. Users use reports as they are provided directly in the Pacific EMIS web UI. JasperReports and Microsoft SQL Reporting Services are supported, with mostly JasperReports reports provided and tested. | 
 +|Pre-built Dashboards | Easy | More rigid. Users make use of the data analysis dashboards as they are provided directly in the Pacific EMIS web UI. | 
 +|Custom Reports | Moderate | More flexible. Users with SQL and JasperReports (or MS SQL Reporting Services) skills can build new reports or modify existing ones, and can either access them from the JasperReports Server or integrate them directly in the Pacific EMIS web UI by following some simple conventions. | 
 +|Flexible Excel/Word Reports | Moderate | More flexible. Users with data analysis, Excel and a bit of SQL skills can build any type of report or presentation imaginable: pull data into Excel and link it to an MS Word document or PowerPoint presentation. | 
 +|Custom Dashboards | Advanced | Users with advanced SQL, C#, TypeScript, Angular and data analysis skills could expand on the existing dashboards provided in the Pacific EMIS out-of-the-box. Since this project is open source, the custom dashboards are required to be released as open source. | 
 +|External Tools | Advanced | Users with advanced SQL and/or data analysis skills could connect to the RESTful API or use Excel data exports with their favourite tools (e.g. Python/Pandas/Dash, R, Tableau, Power BI, SPSS, SAS). | 
 + 
 +Some further notes on the Flexible Excel/Word Reports type of reporting follow. We have a combination of tools (e.g. [[data_publishers_toolkit|Data Publishing Toolkit]], [[multi-part_word_documents|Multi-part Word Document]]) and practices for publishing data into Excel tables and pivot tables. This forms the basis for the construction of documents such as the Annual Statistics Digest. These connections are direct connections to SQL Server using Excel’s OLEDB capacity. It is possible to establish SQL connections across the WAN, but this entails some careful security setup to allow users (in particular, a "read only" user) through a firewall to reach the SQL Server. In order to support users who do not have access to SQL (i.e. either across their LAN or across the WAN) in accessing data in Excel, we are building support for new options. This would allow more options for remote hosting, and would also give a broader set of users access to the data in Excel. 
 + 
 +The three planned options are available at various levels of maturity, and the table below shows the comparative strengths and weaknesses of the three approaches:
  
 ^ Mode ^ Connection ^ Refreshable? ^ Flexibility ^ Distribution ^ Maturity ^ ^ Mode ^ Connection ^ Refreshable? ^ Flexibility ^ Distribution ^ Maturity ^
Line 1384: Line 1441:
  
 All three options will be documented in the following sections. All three options will be documented in the following sections.
-+
 ==== Direct SQL / DPT ==== ==== Direct SQL / DPT ====
  
Line 1397: Line 1454:
 Power query extends Excel’s ability to access data from relational sources with the ability to extract, transform and load data from a wide variety of sources including http rest end points. Power query extends Excel’s ability to access data from relational sources with the ability to extract, transform and load data from a wide variety of sources including http rest end points.
  
-=== Dynamic Workbooks ===+==== Dynamic Workbooks ====
  
 Currently this design is prototyped: Currently this design is prototyped:
Line 1423: Line 1480:
 </code> </code>
  
-This table shows the comparative strength and weaknesses of the 3 approaches: 
  
 +====== Public Data Dissemination ======
 +
 +Aside from data about individuals, just about all other data benefits from being public for a number of reasons:
 +  * **Transparency and Accountability**: Open data fosters transparency, holding organizations and governments accountable.
 +  * **Innovation and Collaboration**: Public data fuels innovation, enabling collaboration among researchers, entrepreneurs, and developers.
 +  * **Informed Decision-Making**: Accessible data empowers individuals and organizations for better decision-making.
 +  * **Citizen Engagement**: Open data encourages citizen participation and engagement in public affairs.
 +  * **Economic Growth**: Data availability stimulates economic growth, fostering entrepreneurship and creating jobs.
 +  * **Research Advancement**: Open data accelerates research and scientific discoveries across various fields.
 +  * **Public Services Improvement**: Governments utilize open data to enhance public services and address societal challenges.
 +  * **Cross-Sector Collaboration**: Open data facilitates collaboration between sectors to address complex issues.
 +  * **Community Empowerment**: Open data enables communities to address local challenges and actively participate in development.
 +  * **Accountable Governance**: Open data supports accountable governance, allowing citizens to monitor government activities and foster trust.
 +
 +The Pacific EMIS supports a number of ways to disseminate data in a timely manner that KEMIS does not yet fully adopt. They are summarized in the table below:
 +
 +^ Method ^ Description ^ Time to Access ^ Access Difficulty ^
 +| Publications | Often the most official and most slowly curated sources of data. | Longest time. Requires more effort in collecting, validating, preparing and publishing, and can sometimes come years later. | Easy. Typically through downloadable PDF documents. |
 +| Mobile App | Download all data and view analysis on your phone. | Fast. New data immediately available. | Easy. Download the Pacific Open Education Data app on your Android phone or iPhone. |
 +| Public Login | The web application supports a public login to view all dashboards and download Excel data exports. | Fast. New data immediately available. | Easy. Log in to the country EMIS public profile from the login page. |
 +| REST API | Access the raw data through special web requests in formats such as JSON and XML. | Fast. New data immediately available. | Difficult. Typically used by developers or researchers with programming skills. |
 +
 +A word of caution. Pacific island countries face unique challenges that make it hard to continuously improve the quality of data. Data accessed through any of the fast methods has had less time for scrutiny and possible remediation, so it should always be used with care. We always appreciate having any apparent issues brought to our attention so we can work to fix them.
 +
 +Each of these methods will have a sub-section here with additional documentation.
 +
 +===== References =====
  
-==== References ====+[BhandariStd2023] Bhandari, P. (2023, January 20). How to Calculate Standard Deviation (Guide) | Calculator & Examples. Scribbr. Retrieved February 27, 2023, from https://www.scribbr.com/statistics/standard-deviation/
  
 [BhandariVariance2023] Bhandari, P. (2023, January 18). How to Calculate Variance | Calculator, Analysis & Examples. Scribbr. Retrieved February 23, 2023, from https://www.scribbr.com/statistics/variance/ [BhandariVariance2023] Bhandari, P. (2023, January 18). How to Calculate Variance | Calculator, Analysis & Examples. Scribbr. Retrieved February 23, 2023, from https://www.scribbr.com/statistics/variance/
emis_user_manual.1677535885.txt.gz · Last modified: 2023/02/27 22:11 by ghachey