Preparing to Import to our Database: Tips for Formatting Data in Excel


Note: also see: PCT instructions for preparing Candidate Lists for import

Excel and its equivalents (such as OpenOffice Calc) are tools in which you can transform bulk, structured data produced by external organizations into something we can import into our database. By the end of your work, each cell will correspond to a field in Admin.


Imports Vs. Manual Entry

Advantages of bulk imports:
Disadvantages of bulk imports:

General Procedure:

  1. Obtain a data file
    • Some websites will have data available for bulk download. Ultimately you want a .csv or .xls but as long as it is in a structured format, there's likely a converter you can find on Google that will get it into a .csv (though you will want to double check for accuracy in this case).
    • Extract tables from PDFs using TabulaInstructions
    • Alternatively, you may try copying and pasting data from the original source
      • Use your cursor to select and copy the necessary information
      • Right-click a cell in your spreadsheet and go to "Paste Special"
      • Select Text and click "OK"
      • In an ideal world, your paste will look like you typed in everything neatly yourself. This is rare.
    • As a last resort, you may try contacting the source of the data to see if they will provide the data in bulk upon requests, but this generally only makes sense timewise for large data sources or if your contacting many businesses/governments at once
    • If it is not possible to copy and paste the data in a usable format, or get the data in bulk, this data must be entered manually into Admin according to normal procedure
  2. Open the file or Transfer the data into Excel (via Copy and Paste or Import)
  3. For data you intend to import, match up columns to the content of a field in Admin
    • use "Text to Columns" to separate data into separate columns if necessary
    • use the =CONCATENATE Function to Combine Columns. In parentheses, name every cell in which the data must be merged, separating cell numbers by a comma and no space. For instance, if you wanted to merge the data in cells F2 - J2, the command will be =CONCATENATE(F2,G2,H2,I2,J2)
    • name your column headers to mirror the naming schema of our database
  4. Use excel functions and formulas to clean up the data so it matches our normal formatting and content standards. For example, you may use the "Find and Replace" function to convert any address abbreviations into their fully spelled-out version. Do not use "Replace All" (see: ResearchSpreadsheetFunctions for other useful functions)
  5. WARNING: DO NOT SORT DATA UNLESS YOU ABSOLUTELY HAVE TO, ARE INTIMATELY FAMILIAR WITH HOW TO DO IT PROPERLY, AND HAVE ASKED YOUR SUPERVISOR FIRST. TRY TO USE AUTOFILTERS INSTEAD.
  6. MatchIds so that your data may be related to existing data in our database
  7. Save as a CSV
  8. Have your supervisor or another authorized individual check your .csv
  9. Submit the file to Mantis for import
  10. Verify the import was done

General tips for formulas:

CategoryTechnicalSkills
There is one comment on this page. [Display comment]
Valid XHTML :: Valid CSS: :: Powered by WikkaWiki