X

Track changes made to this page

If you find this page useful and would like to be notified of changes made to this page, start by inputting your email below.



Privacy policy
Close this window

powered by ChangeDetection

Difference between revisions of "Pennsylvania Oil and Gas Data Sources"

(May 2015 Production Update: Updated with Waste Info)
(Edit for SPUD)
Line 32: Line 32:
  
 
CSVs were uploaded to /var/lib/mysql/wells_db/ with owner and group changed to mysql.
 
CSVs were uploaded to /var/lib/mysql/wells_db/ with owner and group changed to mysql.
They were then added to the waste table via  
+
They were then added to the waste table via:
 
LOAD DATA INFILE 'WasteExport2014-2.csv'  
 
LOAD DATA INFILE 'WasteExport2014-2.csv'  
 +
 
INTO TABLE waste  
 
INTO TABLE waste  
 +
 
FIELDS TERMINATED BY ','  
 
FIELDS TERMINATED BY ','  
 +
 
ENCLOSED BY '"'
 
ENCLOSED BY '"'
 +
 
LINES TERMINATED BY '\n'
 
LINES TERMINATED BY '\n'
 +
 
IGNORE 1 ROWS
 
IGNORE 1 ROWS
 +
 
(WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, WELL_STATUS, FARM_NAME, WELL_ID, @SPUD_DATE_TMP, WASTE_TYPE, WASTE_QUANTITY, UNITS, DISPOSAL_METHOD, AVERAGED, OPERATOR_NAME, OPERATOR_OGO, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, UNCONVENTIONAL, CONFIGURATION, HOME_USE, WASTE_FACILITY_PERMIT_ID, WASTE_FACILITY_NAME, FACILITY_ADDRESS_1, FACILITY_ADDRESS_2, FACILITY_CITY, FACILITY_STATE, FACILITY_ZIP_CODE, FACILITY_PHONE, FACILITY_LATITUDE, FACILITY_LONGITUDE, REPORTING_PERIOD, COMMENT_REASON, COMMENT_TEXT)
 
(WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, WELL_STATUS, FARM_NAME, WELL_ID, @SPUD_DATE_TMP, WASTE_TYPE, WASTE_QUANTITY, UNITS, DISPOSAL_METHOD, AVERAGED, OPERATOR_NAME, OPERATOR_OGO, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, UNCONVENTIONAL, CONFIGURATION, HOME_USE, WASTE_FACILITY_PERMIT_ID, WASTE_FACILITY_NAME, FACILITY_ADDRESS_1, FACILITY_ADDRESS_2, FACILITY_CITY, FACILITY_STATE, FACILITY_ZIP_CODE, FACILITY_PHONE, FACILITY_LATITUDE, FACILITY_LONGITUDE, REPORTING_PERIOD, COMMENT_REASON, COMMENT_TEXT)
 
SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');
 
SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');
Line 44: Line 50:
  
 
CSVs for Compliance, Permits, and SPUD dates were obtained from http://www.portal.state.pa.us/portal/server.pt/community/oil_and_gas_reports/20297 using their respectively linked utilities. Maximum date range from 01/01/1800 to 05/26/2015 was used to obtain data. Default settings for all categories were appropriate except for "inspections with violations only" for compliance, which had to be set to "no" in order to properly obtain all compliance records.
 
CSVs for Compliance, Permits, and SPUD dates were obtained from http://www.portal.state.pa.us/portal/server.pt/community/oil_and_gas_reports/20297 using their respectively linked utilities. Maximum date range from 01/01/1800 to 05/26/2015 was used to obtain data. Default settings for all categories were appropriate except for "inspections with violations only" for compliance, which had to be set to "no" in order to properly obtain all compliance records.
 +
 +
 +
SPUD:
 +
 +
Only records past 2014/10/13 were used to avoid data duplication.
 +
 +
 +
Need to prepend "37-" to all entries in WELL_API
 +
 +
WELL_API can be cloned without "37-" for WELL_API_COUNTY_ID
 +
 +
Delete REGION column
 +
 +
Change WELL_CODE_DESC to WELL_TYPE
 +
 +
Delete WELL_STATUS
 +
 +
Add State Code and Country Column before FARM_NAME
 +
 +
No Farm Well NO
 +
 +
 +
LOAD DATA INFILE 'cleanedSPUD.csv'
 +
 +
INTO TABLE spud FIELDS TERMINATED BY ','
 +
 +
ENCLOSED BY '"'
 +
 +
LINES TERMINATED BY '\n'
 +
 +
IGNORE 1 ROWS
 +
 +
(@SPUD_DATE_TMP, WELL_API, WELL_API_COUNTY_ID, OPERATOR_OGO, OPERATOR_NAME, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, FARM_NAME, WELL_TYPE, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, CONFIGURATION, UNCONVENTIONAL)
 +
SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');

Revision as of 16:32, 27 May 2015


May 2015 Production Update

Updating Production data. Extra column for WELL_API had to be added, using Well Permit #. Excel formula ="37-"&B2 was used to produce WELL_API values. Well Permit # renamed to WELL_API_COUNTY_ID. Extra columns WELL_COUNTRY set to "United States" and WELL_STATE_CODE set to "PA".


Needed to specify that lines were delimited by \n instead of auto to get around PHP "invalid comma count error." There was a typo in one cell for a sheet that had a "\" character, which produced another invalid column count error. Also renamed all columns in CSV to columns in SQL db as follows:


WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, PRODUCTION_INDICATOR, WELL_STATUS, FARM_NAME,WELL_ID, SPUD_DATE, GAS_QUANTITY_MCF, GAS_PRODUCTION_DAYS, CONDENSATE_QUANTITY_BBL, CONDENSATE_PRODUCTION_DAYS, OIL_QUANTITY_BBL, OIL_PRODUCTION_DAYS, AVERAGED, OPERATOR_NAME, OPERATOR_ID, WELL_COUNTY, WELL_MUNICIPALITY, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, UNCONVENTIONAL, CONFIGURATION, HOME_USE, REPORTING_PERIOD, COMMENT_REASON, COMMENT_TEXT, WELL_COUNTRY, WELL_STATE_CODE


which needs to be entered on the import screen for PHPmyAdmin to upload correctly for column headers.


This has been more work than it should have been. There is a CSV module in python if we elect to go for full automation, so this could possibly be done automatically.


In addition to removing the \ character from CSVs, they must also be processed to remove the quotation character " to allow for proper importation. Older yearly CSV files (non-monthly) were split using a CSV split utility in order to meet MySQL's 2048KiB filesize limit


Duplicate entries were discovered in the production table. Entries were removed by producing a new temporary table with the same structure but a primary key consisting of the WELL_ID and production date. Data was copied over with a INSERT IGNORE statement, old data in production was dropped, and unique data copied back to the production table.


for Waste Insert WELL_API in leftmost column Insert WELL_STATE_CODE after column P Insert COUNTRY after WELL_STATE_CODE No DB comments No SPUD_DATE_TEXT

CSVs were uploaded to /var/lib/mysql/wells_db/ with owner and group changed to mysql. They were then added to the waste table via: LOAD DATA INFILE 'WasteExport2014-2.csv'

INTO TABLE waste

FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 1 ROWS

(WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, WELL_STATUS, FARM_NAME, WELL_ID, @SPUD_DATE_TMP, WASTE_TYPE, WASTE_QUANTITY, UNITS, DISPOSAL_METHOD, AVERAGED, OPERATOR_NAME, OPERATOR_OGO, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, UNCONVENTIONAL, CONFIGURATION, HOME_USE, WASTE_FACILITY_PERMIT_ID, WASTE_FACILITY_NAME, FACILITY_ADDRESS_1, FACILITY_ADDRESS_2, FACILITY_CITY, FACILITY_STATE, FACILITY_ZIP_CODE, FACILITY_PHONE, FACILITY_LATITUDE, FACILITY_LONGITUDE, REPORTING_PERIOD, COMMENT_REASON, COMMENT_TEXT) SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');


CSVs for Compliance, Permits, and SPUD dates were obtained from http://www.portal.state.pa.us/portal/server.pt/community/oil_and_gas_reports/20297 using their respectively linked utilities. Maximum date range from 01/01/1800 to 05/26/2015 was used to obtain data. Default settings for all categories were appropriate except for "inspections with violations only" for compliance, which had to be set to "no" in order to properly obtain all compliance records.


SPUD:

Only records past 2014/10/13 were used to avoid data duplication.


Need to prepend "37-" to all entries in WELL_API

WELL_API can be cloned without "37-" for WELL_API_COUNTY_ID

Delete REGION column

Change WELL_CODE_DESC to WELL_TYPE

Delete WELL_STATUS

Add State Code and Country Column before FARM_NAME

No Farm Well NO


LOAD DATA INFILE 'cleanedSPUD.csv'

INTO TABLE spud FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 1 ROWS

(@SPUD_DATE_TMP, WELL_API, WELL_API_COUNTY_ID, OPERATOR_OGO, OPERATOR_NAME, WELL_COUNTY, WELL_MUNICIPALITY, WELL_STATE_CODE, WELL_COUNTRY, FARM_NAME, WELL_TYPE, LATITUDE_DECIMAL, LONGITUDE_DECIMAL, CONFIGURATION, UNCONVENTIONAL) SET SPUD_DATE = STR_TO_DATE(@SPUD_DATE_TMP, '%m/%d/%Y');