

Documentation


Documenting your work is an important step as you go through the process of collection and entry. It enables staff and interns to keep track of the work they have done and to be held accountable for it. Work that has not been documented can easily fall through the cracks and cause duplicated work, mis-entry and other unforeseen errors; in other words, it is inefficient. The other purpose of documentation is to create evidence. The presence of evidence supports our organization's integrity and also serves as a reference for future work and archival purposes.

The aspects of documentation include tracking, creating proper filenames and organizing files and folders. These can be performed at any point in the research process. For example, one could organize and rename the files after entering the data onto Admin, or one could update the tracking document at the end of the day instead of doing it on the go; it depends on whichever way feels most comfortable. Though flexible, documentation is time sensitive, and it is highly recommended that one document their work at least by the end of the day or the end of one's shift.




 





Tracking


Tracking in this sense serves not only to track our work but also to evaluate the progress of SIGS as a whole. The first document is called the 'Sweep Sheet', which is a reusable and perpetual document where updates overwrite the older ones, but only after the new update is issued. The second document is called the 'CEC Sheet', where 'CEC' stands for "Collection, Entry and Checks"; this is a fixed document that is renewed every year. Updates to the 'CEC Sheet' are static: new updates are appended instead of overwriting the old ones.


Sweep Sheet (Spreadsheet)

The purpose of this tracking document is to track the status of, and the most recent updates to, every SIG in our database. It contains the columns that are vital to the process of collection.

Below is the skeleton for a 'Sweep Sheet':
column | description
sig_id | Refers to the unique id of the SIG; important to have in case of a discrepancy in name
sig_name | Name of the SIG as on Admin
url | Web address of the SIG's website, although it could sometimes be the SIG's social media site if no web address is given
url_status | Shows the availability of the SIG's web address; can be 'working', 'broken' or 'redirect'
release_status | Admin release status of the SIG; can be 'live', 'internal' or 'admin'
rating_group | 'Yes' if the group does ratings, 'No' if not
recent_rating | The latest rating published by the group, if not the last updated on Admin
recent_endorsements | The latest endorsement given by the group to election candidates, if not the last updated on Admin
tracking_status | Determines whether or not the SIG should still be tracked; can be 'active', 'inactive' or 'new'
check_date | Date when the SIG was last checked by a SIGS researcher
check_by | Initials of the SIGS researcher who last checked the SIG

URL Status

Tracking Status

Columns that need to be updated for every check: url_status, tracking_status, check_date, check_by. The other columns are updated automatically on a regular basis from query results on another spreadsheet.
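As an illustration only, the four per-check columns and their allowed values could be validated before they are typed into the sheet. This is a minimal sketch: the class name `SweepCheck`, the method `problems`, and the sample date and initials are hypothetical and are not part of the actual Sweep Sheet.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Allowed values, taken from the column descriptions above.
URL_STATUSES = {"working", "broken", "redirect"}
TRACKING_STATUSES = {"active", "inactive", "new"}


@dataclass
class SweepCheck:
    """The four Sweep Sheet columns a researcher fills in on every check."""
    url_status: str
    tracking_status: str
    check_date: date
    check_by: str  # researcher initials

    def problems(self) -> List[str]:
        """Return a list of problems; an empty list means the check looks valid."""
        found = []
        if self.url_status not in URL_STATUSES:
            found.append(f"url_status must be one of {sorted(URL_STATUSES)}")
        if self.tracking_status not in TRACKING_STATUSES:
            found.append(f"tracking_status must be one of {sorted(TRACKING_STATUSES)}")
        if not self.check_by.strip():
            found.append("check_by must contain the researcher's initials")
        return found


# Hypothetical example values.
check = SweepCheck("working", "active", date(2020, 1, 15), "AB")
print(check.problems())
```

A check with an unknown status or missing initials would come back with one message per problem, which can be fixed before the row is updated.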


CEC Tracking Sheet (Spreadsheet)

The purpose of this document is to track collected ratings and endorsements. The general rule of thumb is that if there are ratings or endorsements that need to be entered, they are recorded on the CEC sheet. There are three parts to the CEC: Collection, Entry and Checks. The tables below show the sheet split into its three parts; in reality, the three parts appear alongside one another in one sheet.

Collection
column | description
span | Year or range of years the SIGS data in the file covers
state | State(s) that the SIGS data in the file covers
sig_id | Refers to the unique id of the SIG; important to have in case of a discrepancy in name
sig_name | Preferably the name of the SIG as stated on Admin, although sig_id will be referred to as a fail-safe
data_type | The type of SIGS data, categorized as either 'ratings' or 'endorsements'
date_collected | Date when the SIGS data was last obtained by a SIGS researcher
collected_by | Initials of the SIGS researcher who obtained the SIGS data

These columns should be filled in after the collected item is finalized and ready to be stored. The researcher can document at any time after collecting the files, but it is recommended to document everything that has been collected by the end of the shift.

Entry
column | description
entry_id | Refers to the unique id of the ratings or endorsements entry, generated when it is initially entered onto Admin
entry_method | The method that was used to enter the SIGS data onto Admin
date_entered | Date when the SIGS data was entered onto Admin by a SIGS researcher
entered_by | Initials of the SIGS researcher who entered the SIGS data onto Admin

These columns should be filled in after an entry is completed on Admin and the database has generated its unique ID(s). The person who entered the collected item on Admin should fill in these columns as soon as possible to avoid work falling through the cracks.

Checks
column | description
date_tagged | Date when the entry was tagged by a SIGS researcher
tagged_by | Initials of the SIGS researcher who tagged the entry on Admin; only applies to ratings
date_webchecked | Date when the entry was web-checked by a SIGS researcher
webchecked_by | Initials of the SIGS researcher who web-checked the entry

Depending on time constraints, tagging and checking do not have to be done immediately after entry and can be done by another person as a way of cross-checking. However, the person responsible for the entry may spot potential errors that another person might not.

Columns in Collection, Entry and Checks are color coded by section, in the order mentioned above. They are arranged into three sections because each is a distinct process that can be performed in conjunction with the others or separately. The sheet is also specifically designed to increase the flexibility of our workflow. For example, collected ratings or endorsements do not need to be entered immediately after collection; they can be entered at another time or by other researchers who did not collect them.
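Because the three sections can be filled in at different times and by different people, it can help to see which steps are still outstanding for a row. The following is a rough sketch, assuming a CEC row is represented as a dictionary keyed by the column names above; the function name `pending_stages` and the sample values are hypothetical.

```python
from typing import Dict, List, Optional

# Column pairs that mark each step as done, in workflow order.
STAGES = [
    ("collection", ("date_collected", "collected_by")),
    ("entry", ("date_entered", "entered_by")),
    ("tagging", ("date_tagged", "tagged_by")),
    ("web check", ("date_webchecked", "webchecked_by")),
]


def pending_stages(row: Dict[str, Optional[str]]) -> List[str]:
    """Return the steps whose date and initials columns are not both filled in."""
    pending = []
    for stage, columns in STAGES:
        # Tagging only applies to ratings, so skip it for endorsements.
        if stage == "tagging" and row.get("data_type") == "endorsements":
            continue
        if not all(row.get(column) for column in columns):
            pending.append(stage)
    return pending


# Hypothetical row: collected but not yet entered or checked.
row = {"data_type": "ratings", "date_collected": "2020-01-15", "collected_by": "AB"}
print(pending_stages(row))
```

An empty result means every applicable step has been documented for that row.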




Organizing Files and Folders


Another aspect of documentation is the arrangement of files and folders. It is vital to keep a sound naming system and directory structure; doing so assists future retrieval and prevents the loss of file identity. All our files are currently stored on a network drive.


Naming System

The general rule for naming SIGS files is to provide the pieces of information that help identify a file: date, year, state abbreviation, group name and so on. Some of this information has been consistent over the years of SIGS research, and we use it to name SIGS files. For these pieces of information to read clearly, they have to be arranged in a fixed order, so that the position of each piece matters. For example, the year has to come before the state abbreviation, the state abbreviation has to come before the group's abbreviation, and so on. A fixed order also makes files easier to locate, both visually and logically. This is especially important for the year: ratings and endorsement files are typically produced on a yearly basis and should be categorized by year first. Sometimes, even with distinct pieces of information, two files can end up with the same name, which is why the naming structures below allow additional info towards the end of the filename.

These are some technical terms you need to know as part of the context:
term | description
Elements | The pieces of information mentioned above that give a file its identity
Position | Refers to the position of the naming elements
Namespace | Two or more elements of the same group

Note: 'Namespace' in this case shares the same meaning in essence with namespaces that are used in computing.

To structure the elements of the filename, we use underscores ('_') and dashes ('-'). Underscores delimit the elements, and dashes denote two elements sharing the same namespace. The element after a dash is more specific than the element before it, and so on. In some cases the element before the dash gives the element after it its meaning.
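The delimiter rules can be illustrated by taking a filename apart again. This is only a sketch: the function name `split_filename` is hypothetical, and it assumes that the last dot in the name, if any, starts the file extension.

```python
from typing import List, Tuple


def split_filename(filename: str) -> Tuple[List[List[str]], str]:
    """Split a filename into its elements and its extension.

    Underscores separate elements; dashes separate the parts of one
    namespace, ordered from general to specific.
    """
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No dot at all: the whole name is the stem and there is no extension.
        stem, extension = filename, ""
    elements = [element.split("-") for element in stem.split("_")]
    return elements, extension


print(split_filename("2019-2020_IA_NFIB_Ratings-Extract.csv"))
```

Note that a year span such as 2019-2020 also contains a dash, so it comes back as a two-part namespace just like Ratings-Extract.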


Naming files for Ratings

There are typically four different types of files when creating an entry for ratings, especially when the process involves using the harvester. These four types are the Ratings, Extract, Worksheet and Harvest files.

General Structure:
[yearSpan]_[stateAbbreviation]_[groupAbbreviation]_[fileCategory]-[fileType]_[additionalInfo].[fileExtension]

file category | file type | description | additional info | file extensions | examples
Ratings | not specified | Initial document that contains all the ratings information by the group. | House, Senate, (other office chambers/types), (numerical values) | pdf, html, ods | 2019-2020_IA_NFIB_Ratings.pdf
Ratings | Extract | Contains extracted ratings information from the scorecard. | House, Senate, (other office chambers/types), (numerical values) | csv | 2019-2020_IA_NFIB_Ratings-Extract.csv
Ratings | Worksheet | Data from extract file is cleaned, re-modeled, matched and translated; shows all your workings. | matched, (type of issue), (numerical values) | ods, xlsx | 2019-2020_IA_NFIB_Ratings-Worksheet.ods
Ratings | Harvest | Contains data that is readable by the harvester and ready to be uploaded onto the database. | Lifetime, (type of issue) | csv | 2019-2020_IA_NFIB_Ratings-Harvest.csv

Note: The original ratings file does not carry a file type after its category; it is named with 'Ratings' alone, as in 2019-2020_IA_NFIB_Ratings.pdf.
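As a rough sketch of the general structure for ratings files, a small helper could assemble the filename from its elements. The function name `ratings_filename` and its parameter names are hypothetical; only the element order and the delimiters come from the structure described in this section.

```python
from typing import Optional

# File types from the ratings table; the original ratings file has none.
FILE_TYPES = {"Extract", "Worksheet", "Harvest"}


def ratings_filename(
    year_span: str,
    state: str,
    group: str,
    extension: str,
    file_type: Optional[str] = None,
    additional_info: Optional[str] = None,
) -> str:
    """Build a ratings filename in the fixed element order."""
    if file_type is not None and file_type not in FILE_TYPES:
        raise ValueError(f"file_type must be one of {sorted(FILE_TYPES)} or None")
    # The dash puts the file type in the same namespace as the category.
    category = "Ratings" if file_type is None else f"Ratings-{file_type}"
    elements = [year_span, state, group, category]
    if additional_info:
        elements.append(additional_info)
    return "_".join(elements) + "." + extension


print(ratings_filename("2019-2020", "IA", "NFIB", "pdf"))
print(ratings_filename("2019-2020", "IA", "NFIB", "csv", file_type="Extract"))
```

Building names through one helper keeps the element positions consistent no matter who creates the file.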


Naming files for Endorsements

There are two file categories for endorsements: single and multiple endorsements. A single-endorsement file typically has more detail in its name for easier identification, whereas a multiple-endorsement file typically contains a list of endorsements for multiple offices.

General Structure:
[yearSpan]_[stateAbbreviation]_[groupAbbreviation]_[fileCategory]_[officeType]-[additionalInfo].[fileExtension]

file category | description | office type | additional info | examples
Endorsement | Contains an individual or a single endorsement | (required) | lastname | 2016_NY_NFIB_Endorsement_Gubernatorial-Cuomo, 2017_NE_NFIB_Endorsement_Legislative-NE-07-Carrell
Endorsements | Contains a list or more than one endorsement | (not required) | primary, general, (state abbreviation), (date-YYYYMMDD), (numerical values) | 2018_NY_NFIB_Endorsements_Legislative-primary, 2018_NA_NFIB_Endorsements_Congressional-IA

The following table shows the common types of offices and how each should appear in the [officeType] element:
type of office | filename [officeType]
Presidential | Presidential
Congressional | Congressional
Gubernatorial | Gubernatorial
Statewide | Statewide
State Legislative | Legislative
State Judicial | Judicial
Special | [officeType]-[stateAbbreviation]-[districtNumber]
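The endorsement structure can be sketched in the same way. The function name `endorsement_filename` and its parameters are hypothetical, and the sketch assumes that additional info is only attached when an office type is present, which is how it appears in the examples above.

```python
from typing import Optional

# Type of office -> [officeType] element, from the table above.
OFFICE_TYPES = {
    "Presidential": "Presidential",
    "Congressional": "Congressional",
    "Gubernatorial": "Gubernatorial",
    "Statewide": "Statewide",
    "State Legislative": "Legislative",
    "State Judicial": "Judicial",
}


def endorsement_filename(
    year_span: str,
    state: str,
    group: str,
    office: Optional[str] = None,
    additional_info: Optional[str] = None,
    single: bool = False,
    extension: Optional[str] = None,
) -> str:
    """Build an endorsement filename; the office is required for a single endorsement."""
    if single and office is None:
        raise ValueError("a single endorsement requires an office type")
    if office is None and additional_info:
        raise ValueError("this sketch attaches additional info to an office type")
    elements = [year_span, state, group, "Endorsement" if single else "Endorsements"]
    if office is not None:
        office_element = OFFICE_TYPES[office]
        if additional_info:
            # The dash makes the additional info more specific than the office.
            office_element += "-" + additional_info
        elements.append(office_element)
    name = "_".join(elements)
    return f"{name}.{extension}" if extension else name


print(endorsement_filename("2016", "NY", "NFIB", office="Gubernatorial",
                           additional_info="Cuomo", single=True))
```

The extension is left optional here because the examples in the table are shown without one.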


Directory Structure

SIGS in the VoteSmart database are separated into two categories: national and state groups. National SIGS have no further categories; the groups sit directly under the national category and each of them is unique. State SIGS, on the other hand, are categorized by state abbreviation. Within each SIG, there are two main groups: ratings and endorsements. Both groups contain files that are typically produced at a yearly interval, possibly spanning 2 years. A folder for a given year is created only if ratings or endorsements were collected for that year.

So to put this into perspective, the directory structure will look like this:

Directory structure for National Groups:
National Groups --> (Name of SIG) --> Ratings/Endorsements --> (Year or Span) --> (Files within that year)

Directory structure for State Groups:
State Groups --> (State Abbreviation) --> (Name of SIG) --> Ratings/Endorsements --> (Year/Span) --> (Files within that year)

In some cases, national or even state groups contain the endorsements or ratings of specific states. A fairly typical scenario for national groups is endorsements and ratings for state-level candidates. This means that there may be multiple state-level files in the national SIG. Folders named with state abbreviations are used to reduce the clutter of files within the same year. It will look like this:

... --> (Year/Span) --> (State Abbreviation) --> (files in that state and year)

Note: The state abbreviation for candidates on the national level is 'NA'.
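The directory rules can be sketched as a function that returns the folder for a given file. The function name `sig_folder` and its parameters are hypothetical, the path is relative because the location of the network drive is not stated here, and 'NFIB' merely stands in for the SIG's folder name.

```python
from pathlib import PurePosixPath
from typing import Optional


def sig_folder(
    sig_name: str,
    data_type: str,
    year_span: str,
    state: Optional[str] = None,
    sub_state: Optional[str] = None,
) -> PurePosixPath:
    """Return the folder, relative to the SIGS root, where a file belongs.

    state: state abbreviation for a state group; None for a national group.
    sub_state: optional state folder inside the year, used to reduce clutter.
    """
    if data_type not in ("Ratings", "Endorsements"):
        raise ValueError("data_type must be 'Ratings' or 'Endorsements'")
    if state:
        root = PurePosixPath("State Groups", state)
    else:
        root = PurePosixPath("National Groups")
    folder = root / sig_name / data_type / year_span
    if sub_state:
        folder = folder / sub_state
    return folder


print(sig_folder("NFIB", "Ratings", "2019-2020", state="IA"))
print(sig_folder("NFIB", "Endorsements", "2018", sub_state="IA"))
```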


