If you find this page useful and would like to be notified of changes made to this page, start by inputting your email below.
powered by ChangeDetection
| Line 168: | Line 168: | ||
''Copying over configuration and operator ID info from permit:'' | ''Copying over configuration and operator ID info from permit:'' | ||
| − | ''UPDATE production pd, permit pm | + | ''UPDATE production pd, permit pm'' |
| − | SET pd.CONFIGURATION = pm.CONFIGURATION, pd.operator_id = pm.OPERATOR_OGO | + | ''SET pd.CONFIGURATION = pm.CONFIGURATION, pd.operator_id = pm.OPERATOR_OGO'' |
| − | WHERE pd.WELL_API = pm.WELL_API AND pm.CONFIGURATION IS NOT NULL AND pm.OPERATOR_OGO IS NOT NULL'' change this | + | ''WHERE pd.WELL_API = pm.WELL_API AND pm.CONFIGURATION IS NOT NULL AND pm.OPERATOR_OGO IS NOT NULL''' change this |
CSVs are copied to the server. Their group and ownership are changed to 'mysql' using the chgrp and chown commands. They are then moved to /var/lib/mysql/wellsdb_dev/ such that they can be read by mysql.
CSVs should be checked for \n \ and other special characters prior to being uploaded to the server. " are okay if they are surrounding an entire field value, but care must be taken that none are unintentionally present or their count is not odd. Special characters can cause the files to be processed incorrectly, which leads to data going into the wrong columns or not being uploaded at all.
Always make backups prior to destructive operations.
Cleared out old data since new data is much more extensive:
DELETE FROM permit
WHERE WELL_API LIKE '47%'
Populating permit table from owncomp:
Note: Duplicate rows were first removed using Excel. Duplicate rows occurred because, while all rows in owncomp were distinct, only a subset of the columns of owncomp were taken for the permit table. In some cases, the only distinguishing values between two rows occurred in the omitted columns. Deleting duplicates reduced the number of rows from 156,674 to 155,242 (a difference of 1,432). After the data was inserted into the database, one odd spud date of 2019-12-10 (for well 47-033-30564) was noted. A look into the original data revealed that the day field was blank for this record, resulting in erroneous processing. The entry was fixed manually in the database.
proposed_total_depth is equal to the proposed measured depth for Deviated/Horizontal wells or the proposed vertical depth for Vertical Wells.
LOAD DATA INFILE 'Wv_permit.csv'
INTO TABLE permit
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@WELL_API_TMP, WELL_COUNTY, DB_COMMENTS, OPERATOR_OGO, FARM_NAME, OPERATOR_NAME, @PERMIT_ISSUED_DATE_TMP, SPUD_DATE, WELL_TYPE, CONFIGURATION, proposed_total_depth)
SET WELL_API = CONCAT(SUBSTRING(@WELL_API_TMP,1,2),'-',SUBSTRING(@WELL_API_TMP,3,3),'-',SUBSTRING(@WELL_API_TMP,6,5)),
WELL_API_COUNTY_ID = SUBSTRING(@WELL_API_TMP,3,3),
PERMIT_ISSUED_DATE = DATE_FORMAT(STR_TO_DATE(CONCAT(
SUBSTRING(@PERMIT_ISSUED_DATE_TMP,1,2),'-',
SUBSTRING(@PERMIT_ISSUED_DATE_TMP,4,2),'-',
SUBSTRING(@PERMIT_ISSUED_DATE_TMP,7,2)),'%d-%m-%y'),'%Y-%m-%d'),
WELL_STATE_CODE='WV',
WELL_COUNTRY='United States';
Grabbing latitude and longitude from locs (a table is created for this to make operations easier and to make the lat/lon data available to be input into other tables' WV data as well):
CREATE TABLE wv_latlon_loader
(
WELL_API varchar(50),
LATITUDE_DECIMAL double,
LONGITUDE_DECIMAL double
);
LOAD DATA INFILE 'Wv_latlon.csv'
INTO TABLE wv_latlon_loader
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@WELL_API_TMP, LATITUDE_DECIMAL, LONGITUDE_DECIMAL)
SET WELL_API = CONCAT(SUBSTRING(@WELL_API_TMP,1,2),'-',SUBSTRING(@WELL_API_TMP,3,3),'-',SUBSTRING(@WELL_API_TMP,6,5));
Update permit to include latitude and longitude values:
UPDATE permit, wv_latlon_loader
SET permit.LATITUDE_DECIMAL = wv_latlon_loader.LATITUDE_DECIMAL, permit.LONGITUDE_DECIMAL = wv_latlon_loader.LONGITUDE_DECIMAL
WHERE permit.WELL_API = wv_latlon_loader.WELL_API;
Populating spud table from owncomp:
As with permit, duplicate rows were deleted, reducing the number of records to insert from 156,674 to 154,521 (a difference of 2,153). The date was fixed for the same one row.
LOAD DATA INFILE 'Wv_spud.csv'
INTO TABLE spud FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@WELL_API_TMP, WELL_COUNTY, OPERATOR_OGO, FARM_NAME, OPERATOR_NAME, SPUD_DATE, WELL_TYPE, CONFIGURATION)
SET WELL_API = CONCAT(SUBSTRING(@WELL_API_TMP,1,2),'-',SUBSTRING(@WELL_API_TMP,3,3),'-',SUBSTRING(@WELL_API_TMP,6,5)),
WELL_API_COUNTY_ID = SUBSTRING(@WELL_API_TMP,3,3),
WELL_STATE_CODE='WV',
WELL_COUNTRY='United States';
Update spud to include latitude and longitude values:
UPDATE spud, wv_latlon_loader
SET spud.LATITUDE_DECIMAL = wv_latlon_loader.LATITUDE_DECIMAL,
spud.LONGITUDE_DECIMAL = wv_latlon_loader.LONGITUDE_DECIMAL
WHERE spud.WELL_API = wv_latlon_loader.WELL_API;
Cleared out old data since new data is much more extensive:
DELETE FROM production
WHERE WELL_API LIKE '47%'
Populating production table from prod:
The information from the ascii file was divided roughly in half and opened in two Excel files, as the size was too great for a single one. A find-and-replace converted empty OIL_QUANTITY_BBL and GAS_QUANTITY_MCF cells to -1, to insure they were not imported into the database as 0 (sometimes a valid reported value) instead of NULL. The Excel files were saved in Windows Comma Separated Format, as this preserved the newline (some research revealed that a bug with Mac Excel was the cause for loss of newlines in other csv formats).
LOAD DATA INFILE 'wv_prod_2015-1.csv'
INTO TABLE production
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(@WELL_API_TMP, WELL_COUNTY, OPERATOR_NAME, PERIOD_ID, COMMENT_TEXT, GAS_QUANTITY_MCF, OIL_QUANTITY_BBL)
SET WELL_API = CONCAT(SUBSTRING(@WELL_API_TMP,1,2),'-',SUBSTRING(@WELL_API_TMP,3,3),'-',SUBSTRING(@WELL_API_TMP,6,5)),
WELL_API_COUNTY_ID = SUBSTRING(@WELL_API_TMP,3,3),
WELL_STATE_CODE='WV',
WELL_COUNTRY='United States';
-1's in the OIL_QUANTITY_BBL and GAS_QUANTITY_MCF were converted to NULL and latitude and longitudes added as they were for permit and spud.
Copying over configuration and operator ID info from permit:
UPDATE production pd, permit pm
SET pd.CONFIGURATION = pm.CONFIGURATION, pd.operator_id = pm.OPERATOR_OGO
WHERE pd.WELL_API = pm.WELL_API AND pm.CONFIGURATION IS NOT NULL AND pm.OPERATOR_OGO IS NOT NULL' change this