If you find this page useful and would like to be notified of changes made to this page, start by inputting your email below.
powered by ChangeDetection
CSVs are copied to the server. Their group and ownership are changed to 'mysql' using the chgrp and chown commands. They are then moved to /var/lib/mysql/wellsdb_dev/ such that they can be read by mysql.
CSVs should be checked for \n \ and other special characters prior to being uploaded to the server. " are okay if they are surrounding an entire field value, but care must be taken that none are unintentionally present or their count is not odd. Special characters can cause the files to be processed incorrectly, which leads to data going into the wrong columns or not being uploaded at all.
Always make backups prior to destructive operations.
The table production_co was created to contain the CO production data. Many IF conditionals were used in the import to prevent fields from being incorrectly set to 0, "", or "0000-00-00" instead of NULL. In total, 17 files were imported. Each contained the production data for one year in the range 1999-2015.
Importing:
LOAD DATA INFILE 'monthly_prod.csv'
INTO TABLE production_co
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(@MONTH_TMP, @YEAR_TMP, @API_STATE_TMP, WELL_API_COUNTY_ID, @API_SEQ_TMP, API_SIDETRACK, FORMATION_CODE, WELL_STATUS_CODE, @PROD_DAYS_TMP, @WATER_DISP_CODE_TMP, @WATER_VOL_BBLS_TMP, @WATER_PRESS_TUBING_PSIA_TMP, @WATER_PRESS_CASING_PSIA_TMP, @BOM_INVENTORY_BBLS_TMP, @OIL_VOL_BBLS_TMP, @OIL_SALES_BBLS_TMP, @OIL_VOL_ADJUSTED_BBLS_TMP, @EOM_INVENTORY_BBLS_TMP, @GRAVITY_SALES_TMP, @GAS_SALES_MCF_TMP, @GAS_FLARED_MCF_TMP, @GAS_VOL_LEASE_MCF_TMP, @GAS_SHRINKAGE_MCF_TMP, @GAS_VOL_MCF_TMP, @BTU_SALES_TMP, @GAS_PRESS_TUBING_TMP, @GAS_PRESS_CASING_TMP, OPERATOR_NUM, OPERATOR_NAME, WELL_NAME, @WELL_NUM_TMP, @ACCEPTED_DATE_TMP, IS_REVISED)
SET PERIOD_ID=CONCAT(@YEAR_TMP,"-",@MONTH_TMP),
WELL_API=CONCAT(@API_STATE_TMP,"-",WELL_API_COUNTY_ID,"-",@API_SEQ_TMP),
PROD_DAYS=IF(@PROD_DAYS_TMP="",NULL,@PROD_DAYS_TMP),
WATER_DISP_CODE=IF(@WATER_DISP_CODE_TMP="",NULL,@WATER_DISP_CODE_TMP),
WATER_VOL_BBLS=IF(@WATER_VOL_BBLS_TMP="",NULL,@WATER_VOL_BBLS_TMP),
WATER_PRESS_TUBING_PSIA=IF(@WATER_PRESS_TUBING_PSIA_TMP="",NULL,@WATER_PRESS_TUBING_PSIA_TMP),
WATER_PRESS_CASING_PSIA=IF(@WATER_PRESS_CASING_PSIA_TMP="",NULL,@WATER_PRESS_CASING_PSIA_TMP),
BOM_INVENTORY_BBLS=IF(@BOM_INVENTORY_BBLS_TMP="",NULL,@BOM_INVENTORY_BBLS_TMP),
OIL_VOL_BBLS=IF(@OIL_VOL_BBLS_TMP="",NULL,@OIL_VOL_BBLS_TMP),
OIL_SALES_BBLS=IF(@OIL_SALES_BBLS_TMP="",NULL,@OIL_SALES_BBLS_TMP),
OIL_VOL_ADJUSTED_BBLS=IF(@OIL_VOL_ADJUSTED_BBLS_TMP="",NULL,@OIL_VOL_ADJUSTED_BBLS_TMP),
EOM_INVENTORY_BBLS=IF(@EOM_INVENTORY_BBLS_TMP="",NULL,@EOM_INVENTORY_BBLS_TMP),
GRAVITY_SALES=IF(@GRAVITY_SALES_TMP="",NULL,@GRAVITY_SALES_TMP),
GAS_SALES_MCF=IF(@GAS_SALES_MCF_TMP="",NULL,@GAS_SALES_MCF_TMP),
GAS_FLARED_MCF=IF(@GAS_FLARED_MCF_TMP="",NULL,@GAS_FLARED_MCFS_TMP),
GAS_VOL_LEASE_MCF=IF(@GAS_VOL_LEASE_MCF_TMP="",NULL,@GAS_VOL_LEASE_MCF_TMP),
GAS_SHRINKAGE_MCF=IF(@GAS_SHRINKAGE_MCF_TMP="",NULL,@GAS_SHRINKAGE_MCF_TMP),
GAS_VOL_MCF=IF(@GAS_VOL_MCF_TMP="",NULL,@GAS_VOL_MCF_TMP),
BTU_SALES=IF(@BTU_SALES_TMP="",NULL,@BTU_SALES_TMP),
GAS_PRESS_TUBING=IF(@GAS_PRESS_TUBING_TMP="",NULL,@GAS_PRESS_TUBING_TMP),
GAS_PRESS_CASING=IF(@GAS_PRESS_CASING_TMP="",NULL,@GAS_PRESS_CASING_TMP),
WELL_NUM=IF(TRIM(@WELL_NUM_TMP)="",NULL,@WELL_NUM_TMP),
ACCEPTED_DATE=IF(@ACCEPTED_DATE_TMP="",NULL,@ACCEPTED_DATE_TMP)
7564623 records were imported, broken down by year as follows:
1999: 152057
2000: 310187
2001: 258572
2002: 162850
2003: 219142
2004: 189662
2005: 198574
2006: 208525
2007: 221198
2008: 243876
2009: 696667
2010: 689094
2011: 772973
2012: 1050441
2013: 758179
2014: 869975
2015: 562651
Instances of IS_REVISED that were not "Y" were set to NULL following import of all years' data:
UPDATE production_co
SET IS_REVISED=NULL
WHERE IS_REVISED<>"Y"
The WELL_COUNTY and WELL_STATUS fields were finally populated using fill_long_fields.py in ~/installers/wikimanager/ (see script for usage instructions).
The production (all states) table was also updated to include these 7564623 new records:
INSERT INTO production (WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, WELL_STATUS, FARM_NAME, WELL_ID, GAS_QUANTITY_MCF, OIL_QUANTITY_BBL, OPERATOR_NAME, OPERATOR_ID, WELL_COUNTY, WELL_STATE_CODE, WELL_COUNTRY, water_bbls)
SELECT WELL_API, WELL_API_COUNTY_ID, PERIOD_ID, WELL_STATUS, WELL_NAME, WELL_NUM, GAS_VOL_MCF, OIL_VOL_BBLS, OPERATOR_NAME, OPERATOR_NUM, WELL_COUNTY, 'CO', 'United States', WATER_VOL_BBLS FROM production_co
To populate county names in CSV before import, the following formula was used:
"=VLOOKUP(A3,[API_County_codes.xlsx]Sheet1!$A$1:$B$64,2,FALSE)" and fill down column
Importing:
LOAD DATA INFILE 'Wells_proper_fields.csv'
INTO TABLE permit
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 2 ROWS
(WELL_COUNTY, @DUMMY, OPERATOR_NAME, WELL_API, WELL_API_COUNTY_ID, FARM_NAME, SPUD_DATE,
LATITUDE_DECIMAL, LONGITUDE_DECIMAL, OPERATOR_OGO, @DUMMY, @DUMMY, @DUMMY, @DUMMY, @DUMMY)
SET WELL_STATE_CODE='CO',
WELL_COUNTRY='United States'
107978 records were imported.
It was noticed that the county names were sometimes truncated in well and that this was because they were provided that way by the permit data. The following was performed on both the permit and spud tables prior to re-updating the county field using well_updates.py:
python fill_long_fields.py permit[spud] WELL_API_COUNTY_ID WELL_COUNTY
The line "AND WELL_API LIKE '05%%'" was added to the end of the query in the script so that only CO wells were targeted. The logical check on line 63, self.table == 'production_co', was changed to (self.table == 'production_co' or self.table == 'permit'['spud']) to allow for the different table names.
The update for spud was identical to the update for permit, since the well data retrieved from the COGCC site shared the same fields with both.
All CO well APIs in our records were added to the well table prior to running the well update script well_updates.py.
INSERT INTO well (WELL_API)
SELECT WELL_API FROM permit
WHERE WELL_API LIKE '05%'
107978 rows inserted.
INSERT INTO well (WELL_API)
SELECT DISTINCT WELL_API FROM production_co
WHERE WELL_API NOT IN
(SELECT WELL_API FROM permit
WHERE WELL_API LIKE '05%')
112 rows inserted.
well_updates.py was then run to populate other fields.
County names were fixed as described in the Permit section. Violation counts were added by running pre_deployment_co.py, which is located in ~/installers/wikimanager/.
The table violation_co was created to accommodate the data provided by the COGCC. The data file NOAV.xlsx was opened in Excel and date field formatting changed to match that stored in the database. Other formatting was fixed so that numbers were not improperly reported as dates. The file was then saved as a CSV and imported using the standard procedure.
Importing:
LOAD DATA INFILE 'NOAV.csv'
INTO TABLE violation_co
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@DOCUMENT_NUM_TMP, @NOAV_ISSUE_DATE_TMP, @OPERATOR_NUM_TMP, @OPERATOR_NAME_TMP, @WELL_API_STATE_TMP, @WELL_API_COUNTY_ID_TMP, @WELL_API_SEQ_TMP, @FACILITY_ID_TMP, @WELL_NAME_TMP, @WELL_NUM_TMP, @DUMMY, @DUMMY, @DUMMY, @DUMMY, @DUMMY, @WELL_COUNTY_TMP, @RULE_VIOLATED_TMP, @RULE_DESC_TMP, @INITIAL_DISCOVERY_DATE_TMP, @SELF_REPORTED_TMP, @VIOLATION_DATE_TMP, @ALLEGED_VIOLATION_DESC_TMP, @CORRECTIVE_ACTION_DESC_TMP, @CORRECTIVE_DUE_DATE_TMP, @CORRECTIVE_START_DATE_TMP, @CORRECTIVE_COMPLETE_DATE_TMP, @CORRECTIVE_ACTION_COMPLETE_TMP, @CAUSE_NUM_TMP, @ORDER_NUM_TMP, @DOCKET_NUM_TMP, @ENFORCEMENT_ACTION_TMP, @FINAL_RESOLUTION_DATE_TMP, @FINAL_RESOLUTION_COMMENTS_TMP, @OBVIOUS_DURATION_TMP, @PERFORM_AS_REQUIRED_TMP)
SET WELL_API=IF(TRIM(@WELL_API_STATE_TMP)="" OR TRIM(@WELL_API_COUNTY_ID_TMP)="" OR TRIM(@WELL_API_SEQ_TMP)="",NULL,CONCAT(LPAD(@WELL_API_STATE_TMP,2,'0'),"-",LPAD(@WELL_API_COUNTY_ID_TMP,3,'0'),"-",LPAD(@WELL_API_SEQ_TMP,5,'0'))),
DOCUMENT_NUM=IF(TRIM(@DOCUMENT_NUM_TMP)="",NULL,TRIM(@DOCUMENT_NUM_TMP)),
NOAV_ISSUE_DATE=IF(@NOAV_ISSUE_DATE_TMP="",NULL,@NOAV_ISSUE_DATE_TMP),
OPERATOR_NUM=IF(TRIM(@OPERATOR_NUM_TMP)="",NULL,TRIM(@OPERATOR_NUM_TMP)),
OPERATOR_NAME=IF(TRIM(@OPERATOR_NAME_TMP)="",NULL,TRIM(@OPERATOR_NAME_TMP)),
WELL_API_COUNTY_ID=IF(TRIM(@WELL_API_COUNTY_ID_TMP)="",NULL,LPAD(@WELL_API_COUNTY_ID_TMP,3,'0')),
FACILITY_ID=IF(TRIM(@FACILITY_ID_TMP)="",NULL,TRIM(@FACILITY_ID_TMP)),
WELL_NAME=IF(TRIM(@WELL_NAME_TMP)="",NULL,TRIM(@WELL_NAME_TMP)),
WELL_NUM=IF(TRIM(@WELL_NUM_TMP)="",NULL,TRIM(@WELL_NUM_TMP)),
WELL_COUNTY=IF(TRIM(@WELL_COUNTY_TMP)="",NULL,TRIM(@WELL_COUNTY_TMP)),
RULE_VIOLATED=IF(TRIM(@RULE_VIOLATED_TMP)="",NULL,TRIM(@RULE_VIOLATED_TMP)),
RULE_DESC=IF(TRIM(@RULE_DESC_TMP)="",NULL,TRIM(@RULE_DESC_TMP)),
INITIAL_DISCOVERY_DATE=IF(@INITIAL_DISCOVERY_DATE_TMP="",NULL,@INITIAL_DISCOVERY_DATE_TMP),
SELF_REPORTED=IF(TRIM(@SELF_REPORTED_TMP)="",NULL,TRIM(@SELF_REPORTED_TMP)),
VIOLATION_DATE=IF(@VIOLATION_DATE_TMP="",NULL,@VIOLATION_DATE_TMP),
ALLEGED_VIOLATION_DESC=IF(TRIM(@ALLEGED_VIOLATION_DESC_TMP)="",NULL,TRIM(@ALLEGED_VIOLATION_DESC_TMP)),
CORRECTIVE_ACTION_DESC=IF(TRIM(@CORRECTIVE_ACTION_DESC_TMP)="",NULL,TRIM(@CORRECTIVE_ACTION_DESC_TMP)),
CORRECTIVE_DUE_DATE=IF(@CORRECTIVE_DUE_DATE_TMP="",NULL,@CORRECTIVE_DUE_DATE_TMP),
CORRECTIVE_START_DATE=IF(@CORRECTIVE_START_DATE_TMP="",NULL,@CORRECTIVE_START_DATE_TMP),
CORRECTIVE_COMPLETE_DATE=IF(@CORRECTIVE_COMPLETE_DATE_TMP="",NULL,@CORRECTIVE_COMPLETE_DATE_TMP),
CORRECTIVE_ACTION_COMPLETE=IF(TRIM(@CORRECTIVE_ACTION_COMPLETE_TMP)="",NULL,TRIM(@CORRECTIVE_ACTION_COMPLETE_TMP)),
CAUSE_NUM=IF(TRIM(@CAUSE_NUM_TMP)="",NULL,TRIM(@CAUSE_NUM_TMP)),
ORDER_NUM=IF(TRIM(@ORDER_NUM_TMP)="",NULL,TRIM(@ORDER_NUM_TMP)),
DOCKET_NUM=IF(TRIM(@DOCKET_NUM_TMP)="",NULL,TRIM(@DOCKET_NUM_TMP)),
ENFORCEMENT_ACTION=IF(TRIM(@ENFORCEMENT_ACTION_TMP)="",NULL,TRIM(@ENFORCEMENT_ACTION_TMP)),
FINAL_RESOLUTION_DATE=IF(@FINAL_RESOLUTION_DATE_TMP="",NULL,@FINAL_RESOLUTION_DATE_TMP),
FINAL_RESOLUTION_COMMENTS=IF(TRIM(@FINAL_RESOLUTION_COMMENTS_TMP)="",NULL,TRIM(@FINAL_RESOLUTION_COMMENTS_TMP)),
OBVIOUS_DURATION=IF(TRIM(@OBVIOUS_DURATION_TMP)="",NULL,TRIM(@OBVIOUS_DURATION_TMP)),
PERFORM_AS_REQUIRED=IF(LEFT(@PERFORM_AS_REQUIRED_TMP,1)='Y' OR LEFT(@PERFORM_AS_REQUIRED_TMP,1)='N',LEFT(@PERFORM_AS_REQUIRED_TMP,1),NULL)
1147 records were added.
After the creation / updating of Colorado pages, chemical data from the fracfocus registry should be re-added to select pages. This is done by executing python chemicals.py US from the wikimanager folder. This will generated modified pages within staged_sources. After which, the pages must be committed using python commit_staged_sources.py staged_sources
In order to expediate the process, the sql statement us_well_sql can be modified to only select wells from CO.