Noobieish help request with postgres / sql (copy command to import csv with NULL date)

Hi, I have set up a postgis database (PostgreSQL 9.3.6 on currently updated open Suse) and am trying to import some third party csv file. I did some basic sql a long time ago, but I was never an expert, and am now rusty :) In my "CREATE TABLE" command I am defining about 50 attributes / columns. Two of them are: ACCIDENT_DATE date, ACCIDENT_TIME time, My csv files generally have date and time values, but there are a number of NULL values represented as --,"",-- I expected / hoped that these would be imported as NULL values, but when I run the copy command, I get the error message - ERROR: invalid input syntax for type time: "" Can someone help me to get the empty double quoted value accepted as a NULL value for date and for time? TIA sql command: copy crashes FROM '/ACCIDENT.CSV' DELIMITER ',' CSV; Data Sample with "" time: "ACCIDENT_NO","ACCIDENT_DATE","ACCIDENT_TIME","ACCIDENT_TYPE","DAY_OF_WEEK","DCA_CODE","LIGHT_CONDITION","NO_PERSONS","NO_PERSONS_KILLED","NO_PERSONS_INJ_2","NO_PERSONS_INJ_3","NO_PERSONS_NOT_INJ","NO_OF_VEHICLES","POLICE_ATTEND","ROAD_GEOMETRY","SEVERITY","SPEED_ZONE","IMAGE_NO_DIGIT","IMAGE_NO_FRAME","IMAGE_RUN_NO","NODE_ID","LOCATION_TYPE","LOCALS_INVOLVED","NODE_ID","NODE_TYPE","LOCATION_TYPE","LOCATED_MIDPOINT","COMPLEX_INT_NO","AMG_X","AMG_Y","ROAD_NAME_1","ROAD_TYPE_1","ROAD_DESC_1","ROAD_ROUTE_1","ROAD_NAME_2","ROAD_TYPE_2","ROAD_DESC_2","ROAD_NAME_3","ROAD_TYPE_3","ROAD_DESC_3","DISTANCE_LOCATION","DIRECTION_LOCATION","NEAREST_KM_POST","OFF_ROAD_LOCATION","DIRECTORY","EDITION","PAGE","GRID_REFERENCE_X","GRID_REFERENCE_Y","LGA_NAME","REGION_NAME","UNKNOWN" "W1990000191","1989-12-19","","5",3,"183","5",1,0,1,0,0,1,"1","5","2","100",1518,1517,1050,18378,"N","Y",18378,"N","N","Y",0,648329.25,5958402,"GREAT ALPINE","ROAD","",4005,"UNNAMED","","","MCILROY RAE","ROAD","",73,"S",,"","VCD","ED7","35","C","9","ALPINE","NORTH EASTERN REGION",0 "P1990028638","1990-07-19","18:00:00","4",5,"183","5",1,0,0,1,0,1,"1","5","3","100",1828,1827,1086,18378,"N","Y",18378,"N","N","Y",0,648329.25,5958402,"GREAT ALPINE","ROAD","",4005,"UNNAMED","","","MCILROY RAE","ROAD","",73,"S",,"","VCD","ED7","35","C","9","ALPINE","NORTH EASTERN REGION",0 "P1990029504","1990-08-15","15:45:00","1",4,"130","1",7,0,0,4,3,2,"1","5","3","60",146,144,1097,18383,"N","N",18383,"N","N","N",0,654588.610512537,5951870.61325035,"MYRTLEFORD-YACKANDANDAH","ROAD","",5467,"MUMMERY","ROAD","","ODONNELL","AVENUE","",36,"SW",,"","VCD","ED7","660","C","3","ALPINE","NORTH EASTERN REGION",0 "P1990039075","1990-10-26","00:15:00","4",6,"181","5",4,0,1,3,0,1,"2","5","2","100",1666,1665,1102,18398,"N","Y",18398,"N","N","N",0,652283.479975146,5950175.41617511,"MERRIANG","ROAD","",177824,"TONIAZZO","LANE","","CRAMERI","LANE","",118,"NW",,"","VCD","ED7","49","D","3","ALPINE","NORTH EASTERN REGION",0

On Fri, 16 Oct 2015 08:23:50 PM h wrote:
Can someone help me to get the empty double quoted value accepted as a NULL value for date and for time?
I've never touched PostgreSQL I'm afraid, but a quick google implies that if you are using "COPY FROM" to import the CSV you can use the NULL argument to specify what to use as the CSV equivalent to NULL. http://www.postgresql.org/docs/9.3/static/sql-copy.html # NULL # Specifies the string that represents a null value. The default is \N # (backslash-N) in text format, and an unquoted empty string in CSV # format. You might prefer an empty string even in text format for cases # where you don't want to distinguish nulls from empty strings. This # option is not allowed when using binary format. Best of luck! Chris -- Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

On Sat, 17 Oct 2015 09:38:35 +1100 Chris Samuel <chris@csamuel.org> wrote:
On Fri, 16 Oct 2015 08:23:50 PM h wrote:
Can someone help me to get the empty double quoted value accepted as a NULL value for date and for time?
I've never touched PostgreSQL I'm afraid, but a quick google implies that if you are using "COPY FROM" to import the CSV you can use the NULL argument to specify what to use as the CSV equivalent to NULL.
http://www.postgresql.org/docs/9.3/static/sql-copy.html
# NULL # Specifies the string that represents a null value. The default is \N # (backslash-N) in text format, and an unquoted empty string in CSV # format. You might prefer an empty string even in text format for cases # where you don't want to distinguish nulls from empty strings. This # option is not allowed when using binary format.
Best of luck! Chris
Thanks Chris, Unfortunately I found this: https://wiki.postgresql.org/wiki/COPY#Caveats_with_implementation - an empty string is always an empty string. I did not want to, but I will have to do a pre-process on my csv files. Ta again, H

On Sun, Oct 18, 2015 at 01:08:55PM +1100, h wrote:
Unfortunately I found this: https://wiki.postgresql.org/wiki/COPY#Caveats_with_implementation - an empty string is always an empty string.
not surprising. COPY requires perfect or near-perfect input. Most allegedly-CSV files are far from perfect.
I did not want to, but I will have to do a pre-process on my csv files.
alternatively, you could write a perl script that reads the csv files using DBD::CSV and exports the data to a postgres database using DBD::Pg, performing any transformations of the data as required - i've done this several times in the past when i needed to import client-supplied CSV files into a database. DBD::CSV is very flexible and forgiving of semi-broken and non-standard CSV (which is about 99% of all CSV files in the wild). craig
participants (3)
-
Chris Samuel
-
Craig Sanders
-
h