Congressional Record (Floor Speeches Harvester) Guidelines:: PVSWiki

Congressional Record (Floor Speeches Harvester) Guidelines:

Collecting Floor Speeches from the Congressional Record:

Through PyAdmin, the Floor Speeches Harvester runs daily against the Congressional Record and automatically enters the speeches it finds into our database. The primary role of a researcher covering the Harvester is to Categorize and Tag each of these speeches. The process can be tedious and slow, but it is a job that needs to be done.

There are two ways to work through these speeches. Each option has its own benefits, so the method you use is up to you as a researcher:

Option 1: PyAdmin

The first option is to work through pyadmin.votesmart.org, following these steps:

1. Access pyadmin.votesmart.org. The best way to do this is to go through regular Admin and select the 'SIG Ratings' link under Harvesters at the bottom left of the page.
2. This page will take a while to load. Once it does, scroll down the left-hand side again and select 'Congressional Floor Speeches'.
3. This will bring you to a page listing the linked job numbers of previously run Harvester jobs. Use the scroll bar and pages tab to find and select your assigned job number (the most recently run jobs are on the last page).
4. Select the first speech in the job:

A. If the speech boxes are filled in, this speech has been entered into the database, and needs to be Categorized and Tagged.
B. If there is nothing entered, you can move on to the next speech by selecting the 'Next' button in the top right corner.
C. If the 'Next' button cannot be selected, that means you have reached the end of the job.

Note: The Harvester can be slow. It is best to stay patient and let each page finish loading. Overloading the system with repeated clicks may cause it to crash.

Option 2: SQL

The other way that speeches can be updated with Cats and Tags is through SQL. This involves querying for all floor speeches on a certain date that do not have Cats and Tags, then finding these speeches in Admin and updating their Cats and Tags. This option is more technical, but it can be a quicker way of working through a job.

SELECT c.candidate_id, s.title
FROM speech s
JOIN speech_candidate sc USING (speech_id)
JOIN candidate c USING (candidate_id)
WHERE NOT EXISTS (SELECT 1 FROM speech_category scat WHERE scat.speech_id = s.speech_id)
  AND NOT EXISTS (SELECT 1 FROM speech_tag st WHERE st.speech_id = s.speech_id)
  AND s.speechtype_id = '14'
  AND s.speechdate = '2018--'
ORDER BY c.candidate_id;

Note: Make sure to enter the date of the speech correctly. Otherwise, the speeches for your job will not show up.
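A mistyped date returns no speeches, which is easy to mistake for a finished job. As a minimal sketch, assuming the speechdate column holds dates in YYYY-MM-DD form (suggested by the '2018--' placeholder in the query, but not confirmed here), a date can be checked in Python before it is pasted into the query. The helper name speech_date_literal and the sample date are hypothetical.

```python
from datetime import date


def speech_date_literal(text):
    """Return text as a YYYY-MM-DD string; raise ValueError if it is not a real date.

    Assumption: the database expects dates written as YYYY-MM-DD.
    """
    return date.fromisoformat(text.strip()).isoformat()


# Hypothetical speech date, for illustration only.
checked = speech_date_literal(" 2018-03-15 ")
print(checked)

# An unfilled placeholder or an impossible date is rejected before it reaches the query.
for bad in ("2018--", "2018-02-30"):
    try:
        speech_date_literal(bad)
    except ValueError:
        print("rejected:", bad)
```

The checked string can then replace the placeholder in the speechdate condition.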

Once you run the query, you will need to search for each speech in Admin, using the candidate_id, and then add Cats and Tags to the Floor Speech. To confirm that Cats and Tags were added, re-run the query: an updated speech will no longer show up in the results.
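The re-run check can be tried out safely on a throwaway copy. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for the real one, the table layouts contain just the columns the query names (category_id and tag_id are hypothetical), and every row and the date are invented sample data.

```python
import sqlite3

# The query from this page, with the date passed in as a bound parameter.
QUERY = """
SELECT c.candidate_id, s.title
FROM speech s
JOIN speech_candidate sc USING (speech_id)
JOIN candidate c USING (candidate_id)
WHERE NOT EXISTS (SELECT 1 FROM speech_category scat WHERE scat.speech_id = s.speech_id)
  AND NOT EXISTS (SELECT 1 FROM speech_tag st WHERE st.speech_id = s.speech_id)
  AND s.speechtype_id = '14'
  AND s.speechdate = ?
ORDER BY c.candidate_id
"""


def build_demo_db():
    """Create a tiny stand-in database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE candidate (candidate_id INTEGER PRIMARY KEY);
        CREATE TABLE speech (speech_id INTEGER PRIMARY KEY, title TEXT,
                             speechtype_id TEXT, speechdate TEXT);
        CREATE TABLE speech_candidate (speech_id INTEGER, candidate_id INTEGER);
        CREATE TABLE speech_category (speech_id INTEGER, category_id INTEGER);
        CREATE TABLE speech_tag (speech_id INTEGER, tag_id INTEGER);
        INSERT INTO candidate VALUES (101), (102);
        INSERT INTO speech VALUES (1, 'Speech A', '14', '2018-03-15'),
                                  (2, 'Speech B', '14', '2018-03-15');
        INSERT INTO speech_candidate VALUES (1, 101), (2, 102);
    """)
    return conn


def pending(conn, speech_date):
    """Return (candidate_id, title) rows the query still lists for that date."""
    return conn.execute(QUERY, (speech_date,)).fetchall()


conn = build_demo_db()
before = pending(conn, "2018-03-15")       # both speeches are listed

# Speech 1 gets a category and a tag, so it drops out of the results.
conn.execute("INSERT INTO speech_category VALUES (1, 7)")
conn.execute("INSERT INTO speech_tag VALUES (1, 9)")
after = pending(conn, "2018-03-15")

# Speech 2 gets only a category, yet it also drops out of the results.
conn.execute("INSERT INTO speech_category VALUES (2, 7)")
category_only = pending(conn, "2018-03-15")

print(before, after, category_only)
```

Because the query lists only speeches that have neither a category nor a tag, a speech disappears from the results as soon as either one is added. A clean re-run therefore shows that something was added, not that both Cats and Tags are in place, so it is worth checking both in Admin before moving on.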

Formatting

Location will always be Washington, DC.
Speech type will always be Floor Statement.
URL will always be http://www.gpo.gov/...

-Before and after every quote from the politician, type the phrase BREAK IN TRANSCRIPT. Example:

-You must use it more than once if someone ELSE is speaking between quotes from the person you are collecting. Example:
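The two rules above can be sketched as a small Python helper. This is an illustration under stated assumptions, not an official tool: it assumes a single BREAK IN TRANSCRIPT marker between two consecutive quotes is enough to stand for the other speaker's omitted remarks, and the function name, speaker names, and quoted text are all hypothetical.

```python
MARKER = "BREAK IN TRANSCRIPT"


def format_excerpts(turns, speaker):
    """Keep only `speaker`'s remarks from an ordered list of (name, text) turns.

    A marker opens and closes the result, and one more is placed wherever
    another speaker's remarks were left out between two kept quotes.
    """
    blocks = []
    skipped = False  # True once another speaker's remarks have been left out
    for name, text in turns:
        if name != speaker:
            skipped = True
            continue
        if blocks and skipped:
            blocks.append(MARKER)
        blocks.append(text.strip())
        skipped = False
    if not blocks:
        return ""
    return "\n\n".join([MARKER, *blocks, MARKER])


# Hypothetical exchange, for illustration only.
turns = [
    ("Mr. SMITH", "Mr. President, I rise today to speak about the bill."),
    ("Ms. JONES", "Will the Senator yield for a question?"),
    ("Mr. SMITH", "I thank my colleague, and I yield the floor."),
]
print(format_excerpts(turns, "Mr. SMITH"))
```

With this input the marker appears three times: once before the first quote, once where the other speaker's question was left out, and once after the last quote.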
