If you find this page useful and would like to be notified of changes made to this page, start by inputting your email below.
powered by ChangeDetection
Line 286: | Line 286: | ||
DROP TABLE bc_well_operator_tmp | DROP TABLE bc_well_operator_tmp | ||
+ | |||
+ | ==bc_well== | ||
+ | |||
+ | For numeric fields, blanks in the CSV are found to be filled in as 0 rather than NULL upon import into database. Therefore, we set blank fields to -1 in Excel, and convert -1 to NULL with mysql after import. | ||
+ | |||
+ | |||
+ | =IF(ISBLANK(K2),-1,K2) preserves the value of the cell (K2 in this case) if not blank, and sets to -1 otherwise. | ||
+ | |||
+ | |||
+ | A new table, bc_well_new, was created with the same structure as bc_well. This was done for multiple reasons: | ||
+ | |||
+ | * The column headings were very abbreviated and difficult to interpret in bc_well. They were renamed in bc_well_new by consulting the Well Index Documentation, well_index.wri (See [[British Columbia Oil and Gas Sources]]). | ||
+ | |||
+ | * Some of the data in bc_well contained errors (i.e. zeros instead of NULLs) due to insufficient care during previous blank values into numeric fields. | ||
+ | |||
+ | |||
+ | The collation of bc_well_new was changed to utf8_unicode_ci to allow for compatibility with other tables. Indexes were added on on WELL_AUTH_NUM and OPER_NAME to speed up population of those fields whose values are obtained from other tables in the database (see below for the queries incorporating these fields to search). | ||
[[Category:Documentation]] | [[Category:Documentation]] |
Tables are backed up using mysqldump to ~/database_backups/ before starting.
CSVs are copied to the server. Their group and ownership are changed to 'mysql' using the chgrp and chown commands. They are then moved to /var/lib/mysql/wellsdb_dev/ such that they can be read by mysql. Ensure that CSVs are free of special characters (e.g. \n and \) and extra quotation marks ".
Data for Well Sites was available as a shapefile from https://www.bcogc.ca/public-zone/gis-data (selecting the Wells link). QGIS was used to access the attribute table, from which the data was copied into an Excel file, which was then saved as CSV.
Importing:
CREATE TABLE bc_permit_tmp AS
SELECT * FROM bc_permit
WHERE 1=0
LOAD DATA INFILE 'Well_Sites.csv'
INTO TABLE bc_permit_tmp FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@dummy, TRACK_NO, @dummy, WA_NUM, APPL_RECD, APPL_TYPE, ACTIV_TYPE, @dummy, STATUS, APPRVL_DAT, AREA_HA, LAND_TYPE, CLIENT_NAME, MASTERLIC, CUTPERMIT, TIMBERMRK, ACC_PCTL, ACC_HOR, ACC_VER, @dummy)
Some blank fields imported incorrectly as zeros -
UPDATE bc_permit_tmp
SET CUTPERMIT=NULL
WHERE CUTPERMIT=0
INSERT INTO bc_permit
SELECT * FROM bc_permit_tmp
Syntax for fixing blank strings in MASTERLIC and TIMBERMRK columns:
UPDATE bc_permit
SET MASTERLIC=NULL
WHERE length(MASTERLIC)<1
Duplicates were finally removed by selecting distinct records from bc_permit into a new table, deleting the old table, and renaming the new table to bc_permit.
Data for Well Surface locations and Well Bottom locations were available as shapefiles from https://www.bcogc.ca/public-zone/gis-data (selecting the Wells link). QGIS was used to access the attribute tables, from which the data was copied into Excel files, which were then saved as CSVs.
Importing:
CREATE TABLE bc_production_tmp AS
SELECT * FROM bc_production
WHERE 1=0
LOAD DATA INFILE 'Surface_locations.csv'
INTO TABLE bc_production_tmp FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@dummy, WA_NUM, @dummy, @dummy, @dummy, AREA_NAME, OPTNL_UNIT, LOCATION, @dummy, @dummy, @dummy, OPS_TYPE, STATUS_EFF, DIRCTNL_FL, @dummy, @dummy, CONFDNTL)
LOAD DATA INFILE 'Bottom_locations.csv'
INTO TABLE bc_production_tmp FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@dummy, WA_NUM, DRILLNG_EV, @dummy, @dummy, @dummy, @dummy, AREA_NAME, OPTNL_UNIT, SH_LOC, @dummy, @dummy, @dummy, @dummy, OPS_TYPE, STATUS_EFF, DIRCTNL_FL, CONFDNTL, SPUD_DATE, RIGRLS_DAT)
INSERT INTO bc_production
SELECT *
FROM bc_production_tmp
DROP TABLE bc_production_tmp
Syntax for fixing blank strings in OPTNL_UNIT, AREA_NAME, and DIRCTNL_FL columns:
UPDATE bc_production
SET OPTNL_UNIT=NULL
WHERE LENGTH(OPTNL_UNIT)<1
The table bc_production_values was created. Field types and lengths were based on corresponding fields in other tables and examination of the data to see what was appropriate.
Fields:
WA_NUM: Well Authorization Number
CMPL_EVENT_SEQ: Completion Event Sequence Number
PERIOD_ID: Production Period (YYYYMM)
UWI: Unique Well Identifier
AREA_CODE: Area Code
FORMN_CODE: Formation Code
POOL_SEQ: Pool Sequence
GAS_QUANTITY_E3M3: Gas volume produced during production period in e3m3 (thousand cubic metres)
OIL_QUANTITY_M3: Oil volume produced during production period in m3
WATER_QUANTITY_M3: Water volumne produced during production period in m3
CONDENSATE_QUANTITY_M3: Condensate volume produced during production period in m3
PRODUCTION_DAYS: Days of production during production period
GAS_CUM_E3M3: Cumulative gas production in e3m3 since 1954
OIL_CUM_M3: Cumulative oil production in e3m3 since 1954
WATER_CUM_M3: Cumulative water production in e3m3 since 1954
CONDENSATE_CUM_M3: Cumulative condensate production in e3m3 since 1954
PROJECT_CODE: Project Code
Importing:
LOAD DATA INFILE 'zone_prd.csv'
INTO TABLE bc_production_values FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 2 ROWS
(WA_NUM,CMPL_EVENT_SEQ,PERIOD_ID,UWI,AREA_CODE,FORMN_CODE,POOL_SEQ,GAS_QUANTITY_E3M3,OIL_QUANTITY_M3,WATER_QUANTITY_M3,CONDENSATE_QUANTITY_M3,PRODUCTION_DAYS,GAS_CUM_E3M3,OIL_CUM_M3,WATER_CUM_M3,CONDENSATE_CUM_M3,PROJECT_CODE)
Syntax for fixing blank strings in AREA_CODE, FORMN_CODE, POOL_SEQ, PROJECT_CODE columns:
UPDATE bc_production_values
SET AREA_CODE=NULL
WHERE LENGTH(AREA_CODE)<1
Some modifications to facindex.csv were made before updating wellsdb_dev:
Importing:
CREATE TABLE bc_facility_tmp AS
SELECT * FROM bc_facility
WHERE 1=0
LOAD DATA INFILE 'facindex.csv'
INTO TABLE bc_facility_tmp FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 2 ROWS
(Facility_Code, Facility_Name, Facility_Type, Operator_Code, Operator_Name, @dummy, Reporting, Class_Code, Compressor_Power, Date_Status, Operations_Termination_Date, Operations_Effective_Date, Location)
INSERT INTO bc_facility
SELECT * from bc_facility_tmp
WHERE (Facility_Code, Facility_Name, Facility_Type, Operator_Code, Operator_Name, Reporting, Class_Code, Compressor_Power, Date_Status, Operations_Termination_Date, Operations_Effective_Date, Location)
NOT IN (select Facility_Code, Facility_Name, Facility_Type, Operator_Code, Operator_Name, Reporting, Class_Code, Compressor_Power, Date_Status, Operations_Termination_Date, Operations_Effective_Date, Location from bc_facility)
DROP TABLE bc_facility_tmp
Syntax for fixing blank strings in LOCATION and DATE_STATUS columns:
UPDATE bc_facility
SET LOCATION=NULL
WHERE LENGTH(LOCATION)<1
Syntax for dates of "0000-00-00" in OPERATIONS_TERMINATION_DATE and OPERATIONS_EFFECTIVE_DATE columns:
UPDATE bc_facility
SET OPERATIONS_TERMINATION_DATE=NULL
WHERE OPERATIONS_TERMINATION_DATE='0000-00-00'
Importing:
CREATE TABLE bc_well_operator_tmp AS
SELECT * FROM bc_well_operator
WHERE 1=0
LOAD DATA INFILE 'Surface_locations.csv'
INTO TABLE bc_well_operator_tmp FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(@dummy, WA_NUM, @dummy, OPER_ABBRE, @dummy, @dummy, @dummy, LOCATION, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy)
INSERT INTO bc_well_operator
SELECT * FROM bc_well_operator_tmp t
WHERE (t.wa_num, t.OPER_ABBRE, t.location)
NOT IN (SELECT * FROM bc_well_operator)
DROP TABLE bc_well_operator_tmp
For numeric fields, blanks in the CSV are found to be filled in as 0 rather than NULL upon import into database. Therefore, we set blank fields to -1 in Excel, and convert -1 to NULL with mysql after import.
=IF(ISBLANK(K2),-1,K2) preserves the value of the cell (K2 in this case) if not blank, and sets to -1 otherwise.
A new table, bc_well_new, was created with the same structure as bc_well. This was done for multiple reasons:
The collation of bc_well_new was changed to utf8_unicode_ci to allow for compatibility with other tables. Indexes were added on on WELL_AUTH_NUM and OPER_NAME to speed up population of those fields whose values are obtained from other tables in the database (see below for the queries incorporating these fields to search).