Well Table
- deleted indexes FIPS_MUNICIPALITY_2, ..., FIPS_MUNICIPALITY_5
- deleted the row where WELL_API='/' and the row where WELL_API was blank
- deleted LATITUDE_UNCONVENTIONAL column (all values were NULL except for one, which was 0)
- deleted LONGITUDE_UNCONVENTIONAL column (all values were NULL except for one, which was 3)
- deleted UNCONVENTIONAL_UNCONVENTIONAL column (all values were NULL except for one, which was O)
- deleted HORIZONTAL_UNCONVENTIONAL column (all values NULL)
- deleted PERMIT_STATUS column (all values NULL)
- Dropped the following columns (specific to New York; they exist in other tables):
hole_number,
objective_formation,
producing_formation,
quad_section_code,
map_quadrangle,
field,
status_date,
well_completion_date,
plugging_and_abandonment_date,
confidential_expiration_date,
confidential_period_type,
nysdec_region,
subject_to_financial_security,
bottom_hole_longitude,
bottom_hole_latitude,
last_modified_date
- changed WELL_STATE_CODE to NY for wells with APIs beginning with 31. Many of these had previously been misassigned to PA.
- set instances of LATITUDE_DECIMAL, LONGITUDE_DECIMAL = 0, 0 to NULL, NULL
- deleted ~41,000 duplicate New York entries with sparse data. The 17-digit entries ending in '-0000' had the most complete data, so these were kept. Their 12-digit counterparts and ~800 longer APIs, which together equaled the number of 17-digit APIs, were removed.
- replaced 562306 instances of horizontal_well with NULL to ensure proper operation of well table update script
- added index on pair WELL_COUNTY, WELL_STATE_CODE to speed up well update script (county portion)
- added index on WELL_COUNTY_WELL_MUNICIPALITY, WELL_STATE_CODE to speed up well update script (municipality portion)
- in UNCONVENTIONAL, changed 'Yes' to 'Y' and 'No' to 'N' for consistency (previously, all four of these strings were present in large numbers) and to ensure source builder scripts interpret the column correctly.
- Changed well_api to varchar(12) to help prevent APIs whose length is more than 10 digits.
- dropped horizontal_well column because CONFIGURATION serves the same purpose and is more descriptive and filled-out
- 183822 OH counties filled in using county table
- Fixing well state codes and FIPS codes:
    UPDATE well
    SET WELL_STATE_CODE='PA',
        FIPS_STATE='42'
    WHERE WELL_API LIKE '37%';
    (23715 rows affected)
    UPDATE well
    SET WELL_STATE_CODE='OH',
        FIPS_STATE='39'
    WHERE WELL_API LIKE '34%';
    (83094 rows affected)
    UPDATE well
    SET WELL_STATE_CODE='NY',
        FIPS_STATE='36'
    WHERE WELL_API LIKE '31%';
    (765 rows affected)
    UPDATE well
    SET WELL_STATE_CODE='WV',
        FIPS_STATE='54'
    WHERE WELL_API LIKE '47%';
    (5567 rows affected)
    UPDATE well
    SET WELL_STATE_CODE='CO',
        FIPS_STATE='08'
    WHERE WELL_API LIKE '05%';
    (22984 rows affected)
- index added on FIPS_STATE to speed up ~/installers/wikimanager/scrape_fips.py
- 5,225 municipality names, 6,549 FIPS municipality codes, and 1,530 state-county FIPS codes have been filled in for CO wells using scrape_fips.py
- changed OPERATOR_NAME to VARCHAR(60) to accommodate some longer WV entries and fixed these values with SQL updates. The correct values (previously truncated at 50 characters) were "U.S. Department of Energy/Morgantown Energy Tech Centre", "Oper in Min.owner fld,no code assgn(Orphan well proj)", "Food, Machinery & Chemical Corp, Ohio - APEX Division", "Appalachian Land & Development Co. (David J. Harmer)", and "Brannon, E. H., Bartlett, Howard, and Ryan, William". (5624 rows affected)
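The state-code and FIPS updates listed in this section all follow one rule: the first two digits of WELL_API identify the state. A minimal sketch of that rule as a single data-driven loop is given here. It uses Python's built-in sqlite3 module only so that the example is self-contained (the statements in this log were run against the production database, whose SQL dialect may differ). The names fix_state_codes and demo, and the sample API numbers, are illustrative assumptions and not part of the original scripts.

```python
import sqlite3

# API state prefix -> (WELL_STATE_CODE, FIPS_STATE), copied from the updates in this section.
API_PREFIX_TO_STATE = {
    "37": ("PA", "42"),
    "34": ("OH", "39"),
    "31": ("NY", "36"),
    "47": ("WV", "54"),
    "05": ("CO", "08"),
}


def fix_state_codes(conn):
    """Set WELL_STATE_CODE and FIPS_STATE from the API prefix; return rows matched per state."""
    matched = {}
    for prefix, (state, fips) in API_PREFIX_TO_STATE.items():
        cur = conn.execute(
            "UPDATE well SET WELL_STATE_CODE = ?, FIPS_STATE = ? WHERE WELL_API LIKE ?",
            (state, fips, prefix + "%"),
        )
        matched[state] = cur.rowcount
    conn.commit()
    return matched


def demo():
    # Hypothetical in-memory table with made-up API numbers, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE well (WELL_API TEXT, WELL_STATE_CODE TEXT, FIPS_STATE TEXT)")
    conn.executemany(
        "INSERT INTO well VALUES (?, ?, ?)",
        [
            ("31-001-00001", "PA", "42"),  # NY well misassigned as PA
            ("37-003-00002", "PA", None),
            ("05-123-00003", None, None),
        ],
    )
    matched = fix_state_codes(conn)
    rows = conn.execute(
        "SELECT WELL_API, WELL_STATE_CODE, FIPS_STATE FROM well ORDER BY WELL_API"
    ).fetchall()
    return matched, rows
```

In SQLite the returned counts include matching rows whose values were already correct, so they are an upper bound on the number of rows actually repaired.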
Unconventional Table
- changed 236 rows from 'Yes' to 'Y' and 439 rows from 'No' to 'N'. Some 'Yes' and 'No' values ended in a return character, which made them hard to match in searches. Code example:
    UPDATE well
    SET UNCONVENTIONAL='Y'
    WHERE UNCONVENTIONAL LIKE 'Yes%'
- changed all dates '0000-00-00' to NULL
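The 'Yes'/'No' cleanup above used a LIKE 'Yes%' pattern to catch values with a trailing return character. The sketch below is a slightly stricter variant, not the statement that was actually run: it trims trailing whitespace with rtrim and compares the result exactly, so that a value such as 'None' would not be caught by a 'No%' pattern. It uses Python's sqlite3 module for self-containment, and the function name normalize_yes_no, the demo table layout, and the sample rows are assumptions.

```python
import sqlite3

TRAILING_WHITESPACE = " \t\r\n"


def normalize_yes_no(conn, table, column):
    """Collapse 'Yes'/'No' (optionally followed by whitespace) to 'Y'/'N'."""
    # Table and column names cannot be bound as parameters, so pass only trusted names.
    changed = 0
    for long_form, short_form in (("Yes", "Y"), ("No", "N")):
        cur = conn.execute(
            f"UPDATE {table} SET {column} = ? WHERE rtrim({column}, ?) = ?",
            (short_form, TRAILING_WHITESPACE, long_form),
        )
        changed += cur.rowcount
    conn.commit()
    return changed


def demo():
    # Hypothetical rows; 'None' is included to show that it is left alone.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE well (WELL_API TEXT, UNCONVENTIONAL TEXT)")
    conn.executemany(
        "INSERT INTO well VALUES (?, ?)",
        [("a", "Yes\r"), ("b", "No\r\n"), ("c", "Y"), ("d", "N"), ("e", "None")],
    )
    changed = normalize_yes_no(conn, "well", "UNCONVENTIONAL")
    values = [
        row[0]
        for row in conn.execute("SELECT UNCONVENTIONAL FROM well ORDER BY WELL_API")
    ]
    return changed, values
```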
Waste Table
- deleted row of blanks, nulls, and 0's. WELL_API was simply a blank.
- changed default value for WELL_COUNTRY from 'United States' to NULL to comply with Asana data standards
Wellpad Table
- deleted row of almost all blanks
- replaced all "" (empty strings) with NULL in table
- changed all dates '0000-00-00' to NULL
Production Table
- changed all dates '0000-00-00' to NULL
- changed default value for WELL_COUNTRY from 'United States' to NULL to comply with Asana data standards
- removed all WV entries, since they were incredibly sparse. Will repopulate with the most up-to-date data. (Repopulation complete)
- changed "" to NULL for WELL_API_COUNTY_ID, PERIOD_ID, PRODUCTION_INDICATOR, WELL_STATUS, FARM_NAME, WELL_ID, OPERATOR_NAME, WELL_COUNTY, WELL_MUNICIPALITY, COMMENT_REASON, COMMENT_TEXT, water_bbls, months_in_production
- changed LATITUDE_DECIMAL and LONGITUDE_DECIMAL values of 0 to NULL
- Changed well_api to varchar(12) to help prevent APIs whose length is more than 10 digits.
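Several entries in this log replace a placeholder value ("" or 0) with NULL across a list of columns. A minimal sketch of that repeated step as one helper is shown here, using Python's sqlite3 module so it runs on its own. The helper name placeholder_to_null, the reduced column set in the demo table, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def placeholder_to_null(conn, table, columns, placeholder=""):
    """Set each listed column to NULL where it equals the placeholder; return rows changed."""
    # Table and column names cannot be bound as parameters, so pass only trusted names.
    changed = 0
    for column in columns:
        cur = conn.execute(
            f"UPDATE {table} SET {column} = NULL WHERE {column} = ?",
            (placeholder,),
        )
        changed += cur.rowcount
    conn.commit()
    return changed


def demo():
    # Hypothetical cut-down production table; the real one has many more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE production (WELL_API TEXT, FARM_NAME TEXT, WELL_STATUS TEXT, "
        "LATITUDE_DECIMAL REAL, LONGITUDE_DECIMAL REAL)"
    )
    conn.executemany(
        "INSERT INTO production VALUES (?, ?, ?, ?, ?)",
        [("a", "", "ACTIVE", 0, 0), ("b", "Smith", "", 40.5, -79.9)],
    )
    n_blank = placeholder_to_null(conn, "production", ["FARM_NAME", "WELL_STATUS"])
    n_zero = placeholder_to_null(
        conn, "production", ["LATITUDE_DECIMAL", "LONGITUDE_DECIMAL"], placeholder=0
    )
    rows = conn.execute("SELECT * FROM production ORDER BY WELL_API").fetchall()
    return n_blank, n_zero, rows
```

The same helper would cover the '0000-00-00' date cleanup by passing that string as the placeholder, assuming the dates are stored as text.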
Permit Table
- there were two equivalent columns: PERMIT_ISSUED_DATE and permit_issue_date. A search showed that the latter was unused, so it was dropped from the table.
- removed ' Well' from the ends of entries in the configuration column, since not all entries had this appended
- replaced instances of PERMIT_ISSUED_DATE='0000-00-00' with NULL
- replaced instances of SPUD_DATE='0000-00-00' with NULL
- replaced instances of permit_application_date='0000-00-00' with NULL
- set instances of proposed_total_depth=0 to NULL
- set instances of OPERATOR_NAME="", OPERATOR_NAME=' No record available in Charleston' (all WV), and OPERATOR_NAME=' unknown' (all WV) to NULL
- set instances of CONFIGURATION="" to NULL
- set instances of WELL_TYPE="" to NULL
- set instances of FARM_NAME="" to NULL
- set instances of LATITUDE_DECIMAL, LONGITUDE_DECIMAL = 0, 0 to NULL, NULL
- set instances of OPERATOR_OGO="" to NULL
- set instances of UNCONVENTIONAL='No' to 'N' for consistency, since 'N' was the more common of the two
- set instances of UNCONVENTIONAL='Yes' to 'Y' for the same reasons
- changed default value for WELL_COUNTRY from 'United States' to NULL to comply with Asana data standards
- Changed well_api to varchar(12) to help prevent APIs whose length is more than 10 digits.
- changed OPERATOR_NAME to VARCHAR(60) to accommodate some longer WV entries and fixed these values with SQL updates. The correct values (previously truncated at 50 characters) were "U.S. Department of Energy/Morgantown Energy Tech Centre", "Oper in Min.owner fld,no code assgn(Orphan well proj)", "Food, Machinery & Chemical Corp, Ohio - APEX Division", "Appalachian Land & Development Co. (David J. Harmer)", and "Brannon, E. H., Bartlett, Howard, and Ryan, William". (5836 rows affected)
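The removal of ' Well' from the ends of configuration entries can be sketched as a suffix strip that touches only rows actually ending in that suffix. The log does not record the statement that was used, so this is one possible approach and not the original. It uses Python's sqlite3 module; the function name strip_well_suffix and the sample configuration values are assumptions.

```python
import sqlite3

SUFFIX = " Well"


def strip_well_suffix(conn, table="permit", column="CONFIGURATION"):
    """Remove a trailing ' Well' from the column; return the number of rows changed."""
    # Table and column names cannot be bound as parameters, so pass only trusted names.
    cur = conn.execute(
        f"UPDATE {table} "
        f"SET {column} = substr({column}, 1, length({column}) - length(?)) "
        f"WHERE substr({column}, -length(?)) = ?",
        (SUFFIX, SUFFIX, SUFFIX),
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical configuration values, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE permit (WELL_API TEXT, CONFIGURATION TEXT)")
    conn.executemany(
        "INSERT INTO permit VALUES (?, ?)",
        [
            ("a", "Horizontal Well"),
            ("b", "Vertical Well"),
            ("c", "Horizontal"),
            ("d", None),
        ],
    )
    changed = strip_well_suffix(conn)
    values = [
        row[0]
        for row in conn.execute("SELECT CONFIGURATION FROM permit ORDER BY WELL_API")
    ]
    return changed, values
```

Comparing the last five characters exactly (instead of using LIKE '% Well') keeps the match case-sensitive in SQLite and leaves NULL entries untouched.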
Compliance Table
- in INSP_CATEGORY, changed blank entries to NULL and corrected 319,409 erroneous (truncated) entries from 'Primary Fa' to 'Primary Facility'
- changed blank entries to NULL in INSP_ID, INSPECTION_TYPE, INSPECTION_RESULT_DESC, INSP_COMMENT, DB_COMMENTS
- set instances of WELL_TYPE='NOT AVAILABLE' to NULL (all WV)
- Changed well_id_fk to varchar(12) to help prevent APIs whose length is more than 10 digits.
Spud Table
- replaced instances of SPUD_DATE='0000-00-00' with NULL
- set instances of OPERATOR_OGO="" to NULL
- set instances of OPERATOR_NAME="", OPERATOR_NAME=' No record available in Charleston' (all WV), and OPERATOR_NAME=' unknown' (all WV) to NULL
- set instances of FARM_NAME="" to NULL
- set instances of WELL_TYPE="" to NULL
- set instances of LATITUDE_DECIMAL, LONGITUDE_DECIMAL = 0, 0 to NULL, NULL
- set instances of WELL_TYPE='NOT AVAILABLE' to NULL (all WV)
- changed default value for WELL_COUNTRY from 'United States' to NULL to comply with Asana data standards
- changed OPERATOR_NAME to VARCHAR(60) to accommodate some longer WV entries and fixed these values with SQL updates. The correct values (previously truncated at 50 characters) were "U.S. Department of Energy/Morgantown Energy Tech Centre", "Oper in Min.owner fld,no code assgn(Orphan well proj)", "Food, Machinery & Chemical Corp, Ohio - APEX Division", "Appalachian Land & Development Co. (David J. Harmer)", and "Brannon, E. H., Bartlett, Howard, and Ryan, William". (5743 rows affected)
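The OPERATOR_NAME repair can be sketched as follows: once the column is wide enough, each full name replaces the rows that hold exactly its first 50 characters. The log does not show the SQL updates that were run, so this is an assumed reconstruction. It uses Python's sqlite3 module (which does not enforce VARCHAR lengths, so the widening step is omitted), and the function name restore_truncated_names and the demo rows are hypothetical.

```python
import sqlite3

# Full operator names as listed in this log.
FULL_NAMES = [
    "U.S. Department of Energy/Morgantown Energy Tech Centre",
    "Oper in Min.owner fld,no code assgn(Orphan well proj)",
    "Food, Machinery & Chemical Corp, Ohio - APEX Division",
    "Appalachian Land & Development Co. (David J. Harmer)",
    "Brannon, E. H., Bartlett, Howard, and Ryan, William",
]
OLD_WIDTH = 50  # column width before the change to VARCHAR(60)


def restore_truncated_names(conn, table):
    """Replace each 50-character truncation with its full name; return rows changed."""
    # The table name cannot be bound as a parameter, so pass only a trusted name.
    restored = 0
    for full in FULL_NAMES:
        cur = conn.execute(
            f"UPDATE {table} SET OPERATOR_NAME = ? WHERE OPERATOR_NAME = ?",
            (full, full[:OLD_WIDTH]),
        )
        restored += cur.rowcount
    conn.commit()
    return restored


def demo():
    # Hypothetical spud rows: two truncated names and one unaffected operator.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spud (WELL_API TEXT, OPERATOR_NAME TEXT)")
    conn.executemany(
        "INSERT INTO spud VALUES (?, ?)",
        [
            ("a", FULL_NAMES[0][:OLD_WIDTH]),
            ("b", FULL_NAMES[4][:OLD_WIDTH]),
            ("c", "Some Operator"),
        ],
    )
    restored = restore_truncated_names(conn, "spud")
    names = [
        row[0]
        for row in conn.execute("SELECT OPERATOR_NAME FROM spud ORDER BY WELL_API")
    ]
    return restored, names
```

This assumes no operator legitimately has a name equal to one of the 50-character prefixes.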
Municipality Table
- backed up to ~/database_backups as municipality_with_demographic_info.sql.gz, then deleted demographic info
- added column PAGE_LONG_NAME varchar(100) to hold the exact name of each municipality's page
- index added on the pair (FIPS_STATE_COUNTY, FIPS_MUNICIPALITY) to speed up ~/installers/wikimanager/scrape_fips.py
County Table
- backed up to ~/database_backups as county_with_demographic_info.sql.gz, then deleted demographic info
- deleted 67 duplicate entries, reducing the row count from 3206 to 3139
- added LONG_NAME column, which holds the county's page name
ab_well Table
- 195 cases of well_id having the format *E+11 or *E+12 replaced by proper 13-digit IDs. The following SQL was used:
    UPDATE ab_well t1, ab_well_ids t2
    SET t1.well_id = t2.well_id_13
    WHERE t1.well_id LIKE '%E+%'
    AND t1.location = t2.location
- ab_well_ids was a table temporarily created to solve this problem, and held locations and corresponding well_ids.
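The well_id repair above uses a multi-table UPDATE, which not every SQL engine accepts. A rough equivalent written as a correlated subquery is sketched here with Python's sqlite3 module. The EXISTS guard mirrors the join, so a row with no match in ab_well_ids is left unchanged and not set to NULL. The function name repair_well_ids, the location strings, and the sample IDs are made up for illustration.

```python
import sqlite3


def repair_well_ids(conn):
    """Replace scientific-notation well_ids with 13-digit IDs looked up by location."""
    cur = conn.execute(
        """
        UPDATE ab_well
        SET well_id = (
            SELECT ids.well_id_13 FROM ab_well_ids AS ids
            WHERE ids.location = ab_well.location
        )
        WHERE well_id LIKE '%E+%'
          AND EXISTS (
            SELECT 1 FROM ab_well_ids AS ids
            WHERE ids.location = ab_well.location
          )
        """
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical locations and IDs, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ab_well (well_id TEXT, location TEXT)")
    conn.execute("CREATE TABLE ab_well_ids (location TEXT, well_id_13 TEXT)")
    conn.executemany(
        "INSERT INTO ab_well VALUES (?, ?)",
        [("1.00E+12", "loc1"), ("1234567890123", "loc2"), ("2.5E+11", "loc3")],
    )
    conn.executemany(
        "INSERT INTO ab_well_ids VALUES (?, ?)",
        [("loc1", "1000123456789"), ("loc2", "9999999999999")],
    )
    repaired = repair_well_ids(conn)
    rows = conn.execute("SELECT location, well_id FROM ab_well ORDER BY location").fetchall()
    return repaired, rows
```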
ab_violations Table
- changed length of ENFORCEMENT_ACTION_CATEGORY from 30 to 70 to accommodate longer entries
- duplicate entries (ignoring the auto-incremented ID field) were deleted. Before this change, the table was backed up to ~/database_backups/ as ab_violations_2015-07-28.sql.gz.
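Deleting duplicates while ignoring an auto-incremented ID can be sketched by grouping on every other column and keeping the lowest ID in each group. The log does not record how the deletion was performed, so this is one common approach and not necessarily the one used. It is written with Python's sqlite3 module; the function name delete_duplicates, the ID and well_id column names, and the sample rows are assumptions.

```python
import sqlite3


def delete_duplicates(conn, table, id_column, key_columns):
    """Keep the lowest-ID row of each group of identical key columns; return rows deleted."""
    # Identifiers cannot be bound as parameters, so pass only trusted names.
    keys = ", ".join(key_columns)
    cur = conn.execute(
        f"DELETE FROM {table} WHERE {id_column} NOT IN "
        f"(SELECT MIN({id_column}) FROM {table} GROUP BY {keys})"
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical cut-down ab_violations table with two extra copies of one row.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ab_violations (ID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "well_id TEXT, ENFORCEMENT_ACTION_CATEGORY TEXT)"
    )
    conn.executemany(
        "INSERT INTO ab_violations (well_id, ENFORCEMENT_ACTION_CATEGORY) VALUES (?, ?)",
        [("w1", "Notice"), ("w1", "Notice"), ("w2", "Order"), ("w1", "Notice")],
    )
    deleted = delete_duplicates(
        conn, "ab_violations", "ID", ["well_id", "ENFORCEMENT_ACTION_CATEGORY"]
    )
    remaining = [row[0] for row in conn.execute("SELECT ID FROM ab_violations ORDER BY ID")]
    return deleted, remaining
```

GROUP BY treats NULLs in the key columns as equal, so rows that differ only by ID are collapsed even when some fields are NULL. Other engines may need the subquery wrapped in a derived table before they allow a DELETE that reads from the same table.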
ab_facility Table
- LOCATION field added to replace well_id (which contained errors). well_id column dropped.
ab_operator Table
- added columns LATITUDE_DECIMAL and LONGITUDE_DECIMAL (both double). These are filled in by the script ab_operator_lat_lon.py in ~/installers/wikimanager/.
bc_facility Table
- 15920 incorrect instances of COMPRESSOR_POWER=0 changed to NULL
bc_loc Table
- index added on WELL_AUTH_NUM to speed up lat/lon updates to bc_well
wikiManagerPages Table
- replaced the 3 identical indexes PAGE_NAME_2, PAGE_NAME_3, and PAGE_NAME_4 by the single index PAGE_NAME
Dropped Tables
The following tables are backed up in a sql.gz file, so they were redundant and have been dropped:
- production_backup
- compliance_backup
- permit_backup
- penalty_backup
- violation_backup
- v_fips_year (view)
- v_compliance (view)
- v_munic_pca (view)
- v_waste_facility (view)
- v_waste_operator (view)
Added Tables
- wv_latlon_loader - can drop once done with new WV data
- muncipality_cache - used to update well table (well update script creates it each time so deleting it won't hurt anything)
- well_events - to deal with additional 4 digits appended to some APIs
- bc_production_values - additional production information for BC. To be used to generate production volume tables on BC well pages.
- bc_loc - stores latitude and longitude for BC wells based on well authorization number. Used to update bc_well.
- fips_codes - see Populating Municipality and County Pages Technical Information
- gaz_place_national - see Populating Municipality and County Pages Technical Information
Frac Modifications
In ustable1p1, column api was changed to collation type "utf8_unicode_ci"
In canadatable1, columns well_number and well_license_number were changed to collation type "utf8_unicode_ci"
This allows these tables to be directly compared to the license number and well API within the wellwiki_dev well, ab_well, and bc_well pages.