Difference between revisions of "British Columbia Technical Information"

Revision as of 09:19, 21 July 2015

General

Tables are backed up using mysqldump to ~/database_backups/ before starting.

CSVs are copied to the server. Their group and ownership are changed to 'mysql' using the chgrp and chown commands. They are then moved to /var/lib/mysql/wellsdb_dev/ such that they can be read by mysql. Ensure that CSVs are free of special characters (e.g. \n and \) and extra quotation marks ".

bc_permit

Data for Well Sites was available as a shapefile from https://www.bcogc.ca/public-zone/gis-data (selecting the Wells link). QGIS was used to access the attribute table, from which the data was copied into an Excel file, which was then saved as CSV.

Importing:

CREATE TABLE bc_permit_tmp AS

SELECT * FROM bc_permit

WHERE 1=0

LOAD DATA INFILE 'Well_Sites.csv'

INTO TABLE bc_permit_tmp FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 1 ROWS

(@dummy, TRACK_NO, @dummy, WA_NUM, APPL_RECD, APPL_TYPE, ACTIV_TYPE, @dummy, STATUS, APPRVL_DAT, AREA_HA, LAND_TYPE, CLIENT_NAME, MASTERLIC, CUTPERMIT, TIMBERMRK, ACC_PCTL, ACC_HOR, ACC_VER, @dummy)

Some blank fields imported incorrectly as zeros -

UPDATE bc_permit_tmp

SET CUTPERMIT=NULL

WHERE CUTPERMIT=0

INSERT INTO bc_permit

SELECT * FROM bc_permit_tmp

Syntax for fixing blank strings in MASTERLIC and TIMBERMRK columns:

UPDATE bc_permit

SET MASTERLIC=NULL

WHERE length(MASTERLIC)<1

Duplicates were finally removed by selecting distinct records from bc_permit into a new table, deleting the old table, and renaming the new table to bc_permit.

bc_production

Data for Well Surface locations and Well Bottom locations were available as shapefiles from https://www.bcogc.ca/public-zone/gis-data (selecting the Wells link). QGIS was used to access the attribute tables, from which the data was copied into Excel files, which were then saved as CSVs.

Importing:

CREATE TABLE bc_production_tmp AS

SELECT * FROM bc_production

WHERE 1=0

LOAD DATA INFILE 'Surface_locations.csv'

INTO TABLE bc_production_tmp FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 1 ROWS

(@dummy, WA_NUM, @dummy, @dummy, @dummy, AREA_NAME, OPTNL_UNIT, LOCATION, @dummy, @dummy, @dummy, OPS_TYPE, STATUS_EFF, DIRCTNL_FL, @dummy, @dummy, CONFDNTL)

LOAD DATA INFILE 'Bottom_locations.csv'

INTO TABLE bc_production_tmp FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 1 ROWS

(@dummy, WA_NUM, DRILLNG_EV, @dummy, @dummy, @dummy, @dummy, AREA_NAME, OPTNL_UNIT, SH_LOC, @dummy, @dummy, @dummy, @dummy, OPS_TYPE, STATUS_EFF, DIRCTNL_FL, CONFDNTL, SPUD_DATE, RIGRLS_DAT)

INSERT INTO bc_production

SELECT *

FROM bc_production_tmp

DROP TABLE bc_production_tmp

Syntax for fixing blank strings in OPTNL_UNIT, AREA_NAME, and DIRCTNL_FL columns:

UPDATE bc_production

SET OPTNL_UNIT=NULL

WHERE LENGTH(OPTNL_UNIT)<1

bc_production_values

The table bc_production_values was created. Field types and lengths were based on corresponding fields in other tables and examination of the data to see what was appropriate.

Fields:

WA_NUM: Well Authorization Number

CMPL_EVENT_SEQ: Completion Event Sequence Number

PERIOD_ID: Production Period (YYYYMM)

UWI: Unique Well Identifier

AREA_CODE: Area Code

FORMN_CODE: Formation Code

POOL_SEQ: Pool Sequence

GAS_QUANTITY_E3M3: Gas volume produced during production period in e3m3 (thousand cubic metres)

OIL_QUANTITY_M3: Oil volume produced during production period in m3

WATER_QUANTITY_M3: Water volumne produced during production period in m3

CONDENSATE_QUANTITY_M3: Condensate volume produced during production period in m3

PRODUCTION_DAYS: Days of production during production period

GAS_CUM_E3M3: Cumulative gas production in e3m3 since 1954

OIL_CUM_M3: Cumulative oil production in e3m3 since 1954

WATER_CUM_M3: Cumulative water production in e3m3 since 1954

CONDENSATE_CUM_M3: Cumulative condensate production in e3m3 since 1954

PROJECT_CODE: Project Code

Importing:

LOAD DATA INFILE 'zone_prd.csv'

INTO TABLE bc_production_values FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 2 ROWS

(WA_NUM,CMPL_EVENT_SEQ,PERIOD_ID,UWI,AREA_CODE,FORMN_CODE,POOL_SEQ,GAS_QUANTITY_E3M3,OIL_QUANTITY_M3,WATER_QUANTITY_M3,CONDENSATE_QUANTITY_M3,PRODUCTION_DAYS,GAS_CUM_E3M3,OIL_CUM_M3,WATER_CUM_M3,CONDENSATE_CUM_M3,PROJECT_CODE)

Syntax for fixing blank strings in AREA_CODE, FORMN_CODE, POOL_SEQ, PROJECT_CODE columns:

UPDATE bc_production_values

SET AREA_CODE=NULL

WHERE LENGTH(AREA_CODE)<1

bc_facility

Some modifications to facindex.csv were made before updating wellsdb_dev:

changed date formats to yyyy-mm-dd
removed prefixes (DLS/NTS) and spaces in the location by "=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(F3," ",""),"DLS",""),"NTS","")" (where the F column contained the locations)

Importing:

CREATE TABLE bc_facility_tmp AS

SELECT * FROM bc_facility

WHERE 1=0

LOAD DATA INFILE 'facindex.csv'

INTO TABLE bc_facility_tmp FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 2 ROWS

(Facility_Code, Facility_Name, Facility_Type, Operator_Code, Operator_Name, @dummy, Reporting, Class_Code, Compressor_Power, Date_Status, Operations_Termination_Date, Operations_Effective_Date, Location)

INSERT INTO bc_facility

SELECT * from bc_facility_tmp

WHERE (Facility_Code, Facility_Name, Facility_Type, Operator_Code, Operator_Name, Reporting, Class_Code, Compressor_Power, Date_Status, Operations_Termination_Date, Operations_Effective_Date, Location)

NOT IN (select Facility_Code, Facility_Name, Facility_Type, Operator_Code, Operator_Name, Reporting, Class_Code, Compressor_Power, Date_Status, Operations_Termination_Date, Operations_Effective_Date, Location from bc_facility)

DROP TABLE bc_facility_tmp

Syntax for fixing blank strings in LOCATION and DATE_STATUS columns:

UPDATE bc_facility

SET LOCATION=NULL

WHERE LENGTH(LOCATION)<1

Syntax for dates of "0000-00-00" in OPERATIONS_TERMINATION_DATE and OPERATIONS_EFFECTIVE_DATE columns:

UPDATE bc_facility

SET OPERATIONS_TERMINATION_DATE=NULL

WHERE OPERATIONS_TERMINATION_DATE='0000-00-00'

bc_well_operator

Importing:

CREATE TABLE bc_well_operator_tmp AS

SELECT * FROM bc_well_operator

WHERE 1=0

LOAD DATA INFILE 'Surface_locations.csv'

INTO TABLE bc_well_operator_tmp FIELDS TERMINATED BY ','

ENCLOSED BY '"'

LINES TERMINATED BY '\n'

IGNORE 1 ROWS

(@dummy, WA_NUM, @dummy, OPER_ABBRE, @dummy, @dummy, @dummy, LOCATION, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy)

INSERT INTO bc_well_operator

SELECT * FROM bc_well_operator_tmp t

WHERE (t.wa_num, t.OPER_ABBRE, t.location)

NOT IN (SELECT * FROM bc_well_operator)

DROP TABLE bc_well_operator_tmp

bc_well

For numeric fields, blanks in the CSV are found to be filled in as 0 rather than NULL upon import into database. Therefore, we set blank fields to -1 in Excel, and convert -1 to NULL with mysql after import.

=IF(ISBLANK(K2),-1,K2) preserves the value of the cell (K2 in this case) if not blank, and sets to -1 otherwise.

A new table, bc_well_new, was created with the same structure as bc_well. This was done for multiple reasons:

The column headings were very abbreviated and difficult to interpret in bc_well. They were renamed in bc_well_new by consulting the Well Index Documentation, well_index.wri (See British Columbia Oil and Gas Sources).

Some of the data in bc_well contained errors (i.e. zeros instead of NULLs) due to insufficient care during previous blank values into numeric fields.

The collation of bc_well_new was changed to utf8_unicode_ci to allow for compatibility with other tables. Indexes were added on on WELL_AUTH_NUM and OPER_NAME to speed up population of those fields whose values are obtained from other tables in the database (see below for the queries incorporating these fields to search).

Track changes made to this page