X

Track changes made to this page

If you find this page useful and would like to be notified of changes made to this page, start by inputting your email below.



Privacy policy
Close this window

powered by ChangeDetection

Difference between revisions of "Pennsylvania Oil and Gas Data Sources"

(Updated page to only detail citation information)
Line 1: Line 1:
 
  [[Category:Documentation]]
 
  [[Category:Documentation]]
  
==May 2015 Production Update==
+
Terms of service for the PADEP site must be agreed to prior to being able to submit requests to access data.
  
Updating Production data. Extra column for WELL_API had to be added, using Well Permit #. Excel formula ="37-"&B2 was used to produce WELL_API values. Well Permit # renamed to WELL_API_COUNTY_ID. Extra columns WELL_COUNTRY set to "United States" and WELL_STATE_CODE set to "PA".
 
  
 +
Production: Entries are from 01/01/1800 to 05/26/2015 from
 +
 +
http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?%2fOil_Gas%2fOil_Gas_Well_Historical_Production_Report
  
Needed to specify that lines were delimited by \n instead of auto to get around PHP "invalid comma count error." There was a typo in one cell for a sheet that had a "\" character, which produced another invalid column count error. Also renamed all columns in CSV to columns in SQL db as follows:
+
Waste: Entries are from 01/01/1800 to 05/26/2015 from
  
 +
http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?%2fOil_Gas%2fOil_Gas_Well_Historical_Waste_Report
  
WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, PRODUCTION_INDICATOR, WELL_STATUS, FARM_NAME,WELL_ID, SPUD_DATE, GAS_QUANTITY_MCF, GAS_PRODUCTION_DAYS, CONDENSATE_QUANTITY_BBL, CONDENSATE_PRODUCTION_DAYS, OIL_QUANTITY_BBL, OIL_PRODUCTION_DAYS, AVERAGED, OPERATOR_NAME, OPERATOR_ID, WELL_COUNTY, WELL_MUNICIPALITY, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, UNCONVENTIONAL, CONFIGURATION, HOME_USE, REPORTING_PERIOD, COMMENT_REASON, COMMENT_TEXT, WELL_COUNTRY, WELL_STATE_CODE
+
Permits: Entries are from 2014/10/09 to 2015/05/27 from
  
 +
http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?/Oil_Gas/Permits_Issued_Detail
  
which needs to be entered on the import screen for PHPmyAdmin to upload correctly for column headers.
+
SPUD: Entries are from 2014/10/13 to 2015/06/15 from
  
 +
http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?/Oil_Gas/Spud_External_Data
  
An excel macro has been written and stored on google drive in order to automate this process.
+
Inspections: Entries are from 2014/10/09 to 2015/05/27 from
  
 
+
http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?/Oil_Gas/OG_Compliance
In addition to removing the \ character from CSVs, they must also be processed to remove the quotation character " to allow for proper importation. Older yearly CSV files (non-monthly) were split using a CSV split utility in order to meet MySQL's 2048KiB filesize limit
 
 
 
 
 
Duplicate entries were discovered in the production table. Entries were removed by producing a new temporary table with the same structure but a primary key consisting of the WELL_ID and production date. Data was copied over with a INSERT IGNORE statement, old data in production was dropped, and unique data copied back to the production table.
 
 
 
 
 
==Waste==
 
Insert WELL_API in leftmost column
 
Insert WELL_STATE_CODE after column P
 
Insert COUNTRY after WELL_STATE_CODE
 
No DB comments
 
No SPUD_DATE_TEXT
 
 
 
CSVs were uploaded to /var/lib/mysql/wells_db/ with owner and group changed to mysql.
 
They were then added to the waste table via:
 
LOAD DATA INFILE 'WasteExport2014-2.csv'
 
 
 
INTO TABLE waste
 
 
 
FIELDS TERMINATED BY ','
 
 
 
ENCLOSED BY '"'
 
 
 
LINES TERMINATED BY '\n'
 
 
 
IGNORE 1 ROWS
 
 
 
(WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, WELL_STATUS, FARM_NAME, WELL_ID, @SPUD_DATE_TMP, WASTE_TYPE, WASTE_QUANTITY, UNITS, DISPOSAL_METHOD, AVERAGED, OPERATOR_NAME, OPERATOR_OGO, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, UNCONVENTIONAL, CONFIGURATION, HOME_USE, WASTE_FACILITY_PERMIT_ID, WASTE_FACILITY_NAME, FACILITY_ADDRESS_1, FACILITY_ADDRESS_2, FACILITY_CITY, FACILITY_STATE, FACILITY_ZIP_CODE, FACILITY_PHONE, FACILITY_LATITUDE, FACILITY_LONGITUDE, REPORTING_PERIOD, COMMENT_REASON, COMMENT_TEXT)
 
SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');
 
 
 
 
 
CSVs for Compliance, Permits, and SPUD dates were obtained from http://www.portal.state.pa.us/portal/server.pt/community/oil_and_gas_reports/20297 using their respectively linked utilities. Maximum date range from 01/01/1800 to 05/26/2015 was used to obtain data. Default settings for all categories were appropriate except for "inspections with violations only" for compliance, which had to be set to "no" in order to properly obtain all compliance records.
 
 
 
 
 
==SPUD:==
 
 
 
Only records past 2014/10/13 were used to avoid data duplication.
 
 
 
 
 
Need to prepend "37-" to all entries in WELL_API
 
 
 
WELL_API can be cloned without "37-" for WELL_API_COUNTY_ID
 
 
 
Delete REGION column
 
 
 
Change WELL_CODE_DESC to WELL_TYPE
 
 
 
Delete WELL_STATUS
 
 
 
Add State Code and Country Column before FARM_NAME
 
 
 
No Farm Well NO
 
 
 
 
 
LOAD DATA INFILE 'cleanedSPUD.csv'
 
 
 
INTO TABLE spud FIELDS TERMINATED BY ','
 
 
 
ENCLOSED BY '"'
 
 
 
LINES TERMINATED BY '\n'
 
 
 
IGNORE 1 ROWS
 
 
 
(@SPUD_DATE_TMP, WELL_API, WELL_API_COUNTY_ID, OPERATOR_OGO, OPERATOR_NAME, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, FARM_NAME, WELL_TYPE, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, CONFIGURATION, UNCONVENTIONAL)
 
SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');
 
 
 
 
 
==PERMITS:==
 
Grabbed all permits from 2014-10-09 to 2015-05-27
 
 
 
delete REGION
 
 
 
Insert column before WELL_API, prepend with "37-" with WELL_API entries. Old
 
 
 
Insert state code and country columns after municipality
 
 
 
Delete LATITUDE_DEGREES, LONGITUDE_DEGREES
 
 
 
No PERMIT_ISSUED_DATE, DB_COMMENTS, SPUD_DATE_TEXT, permit_application_date, permit_issue_date, permit_status, proposed_total_depth
 
 
 
 
 
LOAD DATA INFILE 'perms.csv'
 
 
 
INTO TABLE permit FIELDS TERMINATED BY ','
 
 
 
ENCLOSED BY '"'
 
 
 
LINES TERMINATED BY '\n'
 
 
 
IGNORE 1 ROWS
 
 
 
(WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, PERMIT_ISSUED_DATE, OPERATOR_NAME, APPLICATION_TYPE, AUTHORIZATION_DESCRIPTION, WELL_API, WELL_API_COUNTY_ID, UNCONVENTIONAL, CONFIGURATION, WELL_TYPE, FARM_NAME, @SPUD_DATE_TMP, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, OPERATOR_OGO, OPERATOR_ADDRESS, OPERATOR_CITY, OPERATOR_STATE, OPERATOR_ZIP, AUTHORIZATION_ID, CLIENT_ID, PRIMARY_FACILITY_ID) SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');
 
 
 
 
 
==Compliance:==
 
 
 
There were issues inserting for compliance due to foreign key constraints being invalidated. This could be overridden using SET foreign_key_checks = 0; on the table prior to insertion, and then re-enabling it with SET foreign_key_checks = 1; after insertion, but this could potentially be unwise. The problem could have also been circumvented via the "IGNORE" clause after LOAD DATA INFILE 'filename' however apparently there is a bug in MySQL for innoDB tables and IGNORE in this particular instance. For the time being, data was not uploaded.
 
 
 
Otherwise, the process is the same as detailed above, with the following specifics:
 
 
 
Compliance data provided by PADEP also provides the required information for penalty and fines tables. These tables make use of automatically generated primary/foreign keys rather than WELL_API as their identifier.
 
 
 
Delete OPERATOR, FARM_NAME, UNCONVENTIONAL, SITE_ID, SITE_NAME, FACILITY, REGION, COUNTY, MUNICIPALITY, everything regarding violation, RESOLUTION_REASON_CODE_DESCRIPTION, all rows relating to ENFORCEMENT, PENALTIES, TOTAL_AMOUNT_COLLECTED
 
INSPECTION_ID is INSP_ID not INSP_ID_PK (which auto-increments)
 
API_PERMIT needs to be pre-pended with "37-"
 
 
 
Use dates from 2014-9-10 to 2015-05-27
 
 
 
 
 
LOAD DATA INFILE 'comp.csv' IGNORE
 
 
 
INTO TABLE compliance FIELDS TERMINATED BY ','
 
 
 
ENCLOSED BY '"'
 
 
 
LINES TERMINATED BY '\n'
 
IGNORE 1 ROWS
 
(INSP_ID, INSPECTION_DATE, INSPECTION_TYPE, WELL_ID_FK, INSP_CATEGORY, INSPECTION_RESULT_DESC, INSP_COMMENT)
 
SET INSPECTION_DATE = STR_TO_DATE(@INSPECTION_DATE_TMP, '%m/%d/%Y')
 

Revision as of 12:45, 19 June 2015


Terms of service for the PADEP site must be agreed to prior to being able to submit requests to access data.


Production: Entries are from 01/01/1800 to 05/26/2015 from

http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?%2fOil_Gas%2fOil_Gas_Well_Historical_Production_Report

Waste: Entries are from 01/01/1800 to 05/26/2015 from

http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?%2fOil_Gas%2fOil_Gas_Well_Historical_Waste_Report

Permits: Entries are from 2014/10/09 to 2015/05/27 from

http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?/Oil_Gas/Permits_Issued_Detail

SPUD: Entries are from 2014/10/13 to 2015/06/15 from

http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?/Oil_Gas/Spud_External_Data

Inspections: Entries are from 2014/10/09 to 2015/05/27 from

http://www.depreportingservices.state.pa.us/ReportServer/Pages/ReportViewer.aspx?/Oil_Gas/OG_Compliance