User Tools

Site Tools


excel_workbook_developer_guide

Excel Workbook Developer Guide

This is the newest most comprehensive main data collection tool used as an annual school survey. It collects school's resources, student rosters, teaching/non-teaching staff, WASH data and can be extended to collect more.

Only in use with country collecting granular rosters as part of the annual census. Currently includes FSM and RMI

Workbook Connections and Names

Set the data connection to the country's production database. Currently this is used only for schools in the SchoolsList sheet but can also be used to refresh “lookups” in the List sheet. From Excel top menu Data → Connections and then edit the properties (in definition tab.)

Naturally, the lookups in List sheet will need to be changed. Use the Name Manager from the Formula top menu in excel to make sure you adapt their new range since the number of lookups will often change.

workbook-list-name-manager.jpeg

Workbook Data

Some of the data from the workbook may not always have a place in the SQL schema. In those cases the DB stored the XML parts. It could be queried like this.

SELECT TSV.schNo
, TSV.svyYear
, SS.ssSchType
, tchFirstName
, tchFamilyName
, tcheData.value('(/row/@First_Name)[1]', 'nvarchar(50)') xFirstName
, tcheData
, tchClass
, tchClassMax
, tchSector
, tcheData.value('(/row/@ECE)[1]', 'nvarchar(1)') G0
, tcheData.value('(/row/@Grade_1)[1]', 'nvarchar(1)') G1
, tcheData.value('(/row/@Grade_2)[1]', 'nvarchar(1)') G2
, tcheData.value('(/row/@Grade_3)[1]', 'nvarchar(1)') G3
, tcheData.value('(/row/@Grade_4)[1]', 'nvarchar(1)') G4
, tcheData.value('(/row/@Grade_5)[1]', 'nvarchar(1)') G5
, tcheData.value('(/row/@Grade_6)[1]', 'nvarchar(1)') G6
, tcheData.value('(/row/@Grade_7)[1]', 'nvarchar(1)') G7
, tcheData.value('(/row/@Grade_8)[1]', 'nvarchar(1)') G8
, tcheData.value('(/row/@Grade_9)[1]', 'nvarchar(1)') G9
, tcheData.value('(/row/@Grade_10)[1]', 'nvarchar(1)') G10
, tcheData.value('(/row/@Grade_11)[1]', 'nvarchar(1)') G11
, tcheData.value('(/row/@Grade_12)[1]', 'nvarchar(1)') G12 
FROM pTeacherRead.TeacherSurveyV TSV
       INNER JOIN SchoolSurvey SS
             ON TSV.ssID = SS.ssID
ORDER BY tchClass

We can effectively have values pass through from the workbook straight to the SQL server, and have them appear in views, without finding a home for those values in the relational storage.

There is another arguably neater syntax for the above query. Check out some of the code in EMIS DB.

Workbook Localized Columns

The EMIS' workbook has the ability to localize the name of the column names in the Excel workbook. In fact, it is important to do so for the workbook to load correctly. For example, a “District” in FSM is called a “State” in RMI it is an “Atoll / Island”. Note that ('District','School No','School Type') are always localized based on the table dbo.sysVocab because they are always always localized. Other column names can also be localized (e.g. “FSM SSN” to “RMI SSN”) and this is best done in the table CensusWorkbookColumnMaps. See example below two additional localization.

You can also refer to pSurveyOps.CensusWorkbookColumnMapping for details.

Note that currently it translates from an FSM local terminology to the desired localized terminologies. This is because the workbook originated in FSM though the default terms will eventually be changed to be the Pacific EMIS defaults

Workbook Commits to Repository

The workbooks are also committed into the repository of the Census Workbooks under external tools. Note a macro needs to be run before committing changes to the workbook. Run the macro from within the folder where the workbook has changed. For example, in Census Workbook/FSM/. The macros export the code, remove the contents of all the lists, and also produce a simple text file listing the following:

  • All the sheets in the workbook
  • All the list objects, and the field names in each list
  • All the “defined names” in the workbook, and their formulas.

This is done prior to committing and will make it possible to diff key structural changes to the workbooks, without having to visually examine them.

Workbook Troubleshooting

Since the workbook contains a large amount of data it is not joyful to troubleshooting with the SQL Profiler. A more light weight approach is to use a tool to convert the data into the same XML that would normally go in the SQL Database from loading the workbook through the Pacific EMIS. The tool can be found at https://bitbucket.org/softwords/softwords.xllistutilities/src/master/

Just edit the Program.cs file and run the tool from VS Studio. You will generate something like the following (which is truncated.)

DECLARE @xml xml = CONVERT( xml, '
<ListObject FirstRow="4">
  <row Index="0" SchoolYear="SY2019-2020" State="Yap" School_Name="Bael Elementary School" School_No="YAP402" School_Type="Government" Main_Source_Drinking_Water="Piped water supply" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="1" Girls_Toilets_Usable="1" Boys_Toilets_Total="1" Boys_Toilets_Usable="1" Common_Toilets_Total="0" Common_Toilets_Usable="0" Available="Yes" Soap_and_Water="Water and Soap" />
  <row Index="1" SchoolYear="SY2019-2020" State="Yap" School_Name="Colonia Middle School" School_No="YAP403" School_Type="Government" Main_Source_Drinking_Water="Piped water supply" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="3" Girls_Toilets_Usable="3" Boys_Toilets_Total="2" Boys_Toilets_Usable="2" Available="Yes" Soap_and_Water="Water only" />
  ...
  <row Index="2" SchoolYear="SY2019-2020" State="Yap" School_Name="Dalipebinaw ECE Center" School_No="YAP404" School_Type="Government" Main_Source_Drinking_Water="Rainwater" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="1" Girls_Toilets_Usable="1" Boys_Toilets_Total="1" Boys_Toilets_Usable="1" Common_Toilets_Total="0" Common_Toilets_Usable="0" Available="Yes" Soap_and_Water="Water and Soap" />
  <row Index="3" SchoolYear="SY2019-2020" State="Yap" School_Name="Dalipebinaw Elementary School" School_No="YAP405" School_Type="Government" Main_Source_Drinking_Water="Piped water supply" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="2" Girls_Toilets_Usable="2" Boys_Toilets_Total="2" Boys_Toilets_Usable="2" Common_Toilets_Total="0" Common_Toilets_Usable="0" Available="Yes" Soap_and_Water="Water and Soap" />
</ListObject>
')
 
EXEC procname @xml

Which you need to then add of couple of easy manual edits shown below.

DECLARE @xml xml = CONVERT( xml, '
<ListObject FirstRow="4" schoolYear="SY2019-2020" state="Yap">
  <row Index="0" SchoolYear="SY2019-2020" State="Yap" School_Name="Bael Elementary School" School_No="YAP402" School_Type="Government" Main_Source_Drinking_Water="Piped water supply" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="1" Girls_Toilets_Usable="1" Boys_Toilets_Total="1" Boys_Toilets_Usable="1" Common_Toilets_Total="0" Common_Toilets_Usable="0" Available="Yes" Soap_and_Water="Water and Soap" />
  <row Index="1" SchoolYear="SY2019-2020" State="Yap" School_Name="Colonia Middle School" School_No="YAP403" School_Type="Government" Main_Source_Drinking_Water="Piped water supply" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="3" Girls_Toilets_Usable="3" Boys_Toilets_Total="2" Boys_Toilets_Usable="2" Available="Yes" Soap_and_Water="Water only" />
  ...
  <row Index="2" SchoolYear="SY2019-2020" State="Yap" School_Name="Dalipebinaw ECE Center" School_No="YAP404" School_Type="Government" Main_Source_Drinking_Water="Rainwater" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="1" Girls_Toilets_Usable="1" Boys_Toilets_Total="1" Boys_Toilets_Usable="1" Common_Toilets_Total="0" Common_Toilets_Usable="0" Available="Yes" Soap_and_Water="Water and Soap" />
  <row Index="3" SchoolYear="SY2019-2020" State="Yap" School_Name="Dalipebinaw Elementary School" School_No="YAP405" School_Type="Government" Main_Source_Drinking_Water="Piped water supply" Currently_Available="Yes" Toilet_Type="Flush / pour-flush toilets" Girls_Toilets_Total="2" Girls_Toilets_Usable="2" Boys_Toilets_Total="2" Boys_Toilets_Usable="2" Common_Toilets_Total="0" Common_Toilets_Usable="0" Available="Yes" Soap_and_Water="Water and Soap" />
</ListObject>
')
 
EXEC pSurveyOps.censusLoadWash @xml, @fileReference='DE7D5752-07E1-404F-9E7B-452E90CC27BF', @USER='ghachey@nuzusys.com'
From Brian when asked about in https://bitbucket.org/softwords/pineapples/issues/683/support-for-loading-wash-in-isolation-2020 This is a consequence of https://bitbucket.org/softwords/pineapples/issues/798/workbook-build-support-to-localise-column, which allows vocab translation of column names; ie School ID.

These are expected to be returned to their generic vocab bane ie ‘SchoolNo’ before passing to the load proc. This translation is done in CensusWorkbookcontroller - see https://bitbucket.org/softwords/pineapples/commits/88db4479d3b9#chg-Pineapples/Controllers_Api/CensusWorkbookController.cs

The down side is you do now get this problemif using xllistUtilities - probably the best solution is to temporarily set the column name School Id to SchoolNo before, save then run xllistutilities. Or, do a global search and replace in the xml in Notepad++

Another useful tool is to use some audit views that decodes the XML of any loaded worbooks and you can query the whole workbook data directly from SQL.

pSurveyOps.TeacherSurveyAudit

Also common.BuildTeacherSurveyAudit is a helper to assist in keeping this view up-to-date, as columns may be added in the future. So using this view we can drill in to reconcile in detail from workbook⇒ TeacherSurvey⇒warehouse.TeacherLocation

DECLARE @fileID nvarchar(50) = 'C718F985-6E19-4C14-B455-B61BDE8EDD6C'
 
 
-- this is a verbatim dump of the xml loaded into each teacher survey record
-- from the workbook with the given id
SELECT * FROM 
[pSurveyOps].TeacherSurveyAudit
WHERE fileID = @fileID
 
 
-- subtotals on Teaching Staff / Non-teaching staff
SELECT Staff_Type, COUNT(*) Num 
FROM pSurveyOps.TeacherSurveyAudit
WHERE fileID = @fileID
GROUP BY Staff_Type
 
-- there will be a teacherlocation record for each teachersurvey record
SELECT * 
FROM warehouse.TeacherLocation TL
    INNER JOIN pSurveyOps.TeacherSurveyAudit A
        ON TL.tID = A.tId AND TL.surveyYear = A.svyYear
        AND A.fileID = @fileID
 
-- group by the classification of the teacher - T A M X 
SELECT TAMX, COUNT(*) Num
FROM warehouse.TeacherLocation TL
    INNER JOIN pSurveyOps.TeacherSurveyAudit A
        ON TL.tID = A.tId AND TL.surveyYear = A.svyYear
        AND A.fileID = @fileID
GROUP BY TAMX
 
 
-- most teaching staff will be 'T' or 'M" (mixed)
-- non teaching staff will be 'A' or 'X' (Other duties - cooks, gardeners etc)
-- however there may be some overrides to this, to attempt to correct some misclassifications
-- these are now reduced becuase they are reported during the validation phase of the upload
-- so they should be manually corrected and confirmed before loading
-- Break down the TAMX totals to show these exceptions
SELECT TAMX 
, SUM(CASE WHEN Staff_Type = 'Teaching Staff' THEN 1 END) TeachingStaff
, SUM(CASE WHEN Staff_Type = 'Teaching Staff' THEN NULL ELSE 1 END) NonTeachingStaff
, COUNT(*) Num
FROM warehouse.TeacherLocation TL
    INNER JOIN pSurveyOps.TeacherSurveyAudit A
        ON TL.tID = A.tId AND TL.surveyYear = A.svyYear
        AND A.fileID = @fileID
GROUP BY TAMX
 
-- enumerate the idenitifed 'corrections', showing the recorded activities for the staff member
-- this will exp[lain the reason for the correction
-- the number of row returned here should correspond to the previous query 
-- Non-Teaching T + Non Teaching M + Teaching A + Teaching X
SELECT TAMX
, A.[INDEX]
, A.First_Name
, A.Last_Name
, A.Staff_Type
, A.Job_Title
, A.ECE, A.Grade_1, A.Grade_2, A.Grade_3, A.Grade_4, A.Grade_5, A.Grade_6, A.Grade_7, A.Grade_8, A.Grade_9, A.Grade_10, A.Grade_11, A.Grade_12
, A.Admin, A.Other
, TL.A, TL.X, TL.T
FROM warehouse.TeacherLocation TL
    INNER JOIN pSurveyOps.TeacherSurveyAudit A
        ON TL.tID = A.tId AND TL.surveyYear = A.svyYear
        AND A.fileID = @fileID
WHERE ( Staff_Type = 'Non Teaching Staff' AND TAMX IN ('T','M'))
OR (Staff_Type = 'Teaching Staff' AND TAMX IN ('A','X'))
ORDER BY Staff_Type, [INDEX]
 
 
-- Show the NET difference between the 'raw' report of Teaching NonTeaching and the adjusted report in warehouse.TeacherLocation
SELECT 
SUM(CASE WHEN TAMX IN ('T','M') THEN NonTeachingStaff END) TeachingStaffUp
, SUM(CASE WHEN TAMX IN ('A','X') THEN TeachingStaff END) TeachingStaffDown
, SUM(CASE WHEN TAMX IN ('T','M') THEN NonTeachingStaff END) -
SUM(CASE WHEN TAMX IN ('A','X') THEN TeachingStaff END) NetUp
FROM
(
SELECT TAMX 
, SUM(CASE WHEN Staff_Type = 'Teaching Staff' THEN 1 END) TeachingStaff
, SUM(CASE WHEN Staff_Type = 'Teaching Staff' THEN NULL ELSE 1 END) NonTeachingStaff
, COUNT(*) Num
FROM warehouse.TeacherLocation TL
    INNER JOIN pSurveyOps.TeacherSurveyAudit A
        ON TL.tID = A.tId AND TL.surveyYear = A.svyYear
        AND A.fileID = @fileID
GROUP BY TAMX
) SUB
excel_workbook_developer_guide.txt · Last modified: 2022/02/18 18:46 by ghachey