Home » RDBMS Server » Server Utilities » How to load unstructured data (Oracle Enterprise Server, 11.2.0.2, Linux 5.8 )
How to load unstructured data [message #579246] Sun, 10 March 2013 09:43 Go to next message
progkcp
Messages: 10
Registered: March 2013
Location: Washington DC USA
Junior Member
How to get ill-formatted data into Oracle table?

I'm trying to load a large amount of data that is not arranged into neat columns and doesn't have proper record delimiters.

I'd like to use sql loader but I don't think that will work with unstructured data.

I'm reading that perhaps using an external table would be the best way to do it.

It's sample census data and I've attached a single record to look at.

Any suggestions would be appreciated.

Re: How to load unstructured data [message #579247 is a reply to message #579246] Sun, 10 March 2013 09:53 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
1) how to know where 1 column ends & the next column begins?
2) how to know where 1 record ends & the next row begins?

What is the maximum record length in text file?

Please read and follow the forum guidelines, to enable us to help you:

http://www.orafaq.com/forum/t/88153/0/

[Updated on: Sun, 10 March 2013 10:38]

Report message to a moderator

Re: How to load unstructured data [message #579249 is a reply to message #579247] Sun, 10 March 2013 10:19 Go to previous messageGo to next message
Littlefoot
Messages: 21806
Registered: June 2005
Location: Croatia, Europe
Senior Member
Account Moderator
External table or SQL*Loader - won't make much difference here. It appears that you might need to load the whole "record" (which might be difficult as you don't know record delimiters) into a single-column table, such as
create table my_load
  (col varchar2(4000));
and then parse data you find in there. If - as BlackSwan pointed out - record length is larger that 4000 characters, you might need to use CLOB instead.

Where did you get such a file from? Can't you negotiate different (better) file structure? You know how it goes ... garbage in, garbage out.

[Updated on: Sun, 10 March 2013 10:19]

Report message to a moderator

Re: How to load unstructured data [message #579250 is a reply to message #579246] Sun, 10 March 2013 10:41 Go to previous messageGo to next message
progkcp
Messages: 10
Registered: March 2013
Location: Washington DC USA
Junior Member
BlackSwan,

Begin and end of columns -

That is a good question.

It looks like this data probably came from a census form.

I think this data might break down into multiple tables like this:

--potential tables taken from the data--
home_ownership
household_size
housing_value
population_male_or_female
population_age
education_level
...


desc home_ownership
Name Null Type
-------------------- ---- ------
TOTAL_VALUE NUMBER
PCT_POP_OWNING_HOME NUMBER
PCT_POP_RENTING_HOME NUMBER



Sample from data.
"home_ownership":{"measurement_dimension":"home_ownership","total_value":6906.0,"simple_slice_percentages":{"pct_pop_owning_home ":77.72,"pct_pop_renting_home":22.28}},


Is there a way to parse down to the line starting with "home_ownership": and then go to the value starting with "total_value": and then insert the number 6906.0 into TOTAL_VALUE column the home_ownership table, then go to "pct_pop_owning_home:" and insert value 77.72 into PCT_POP_OWNING_HOME column?

Re: How to load unstructured data [message #579269 is a reply to message #579249] Sun, 10 March 2013 20:30 Go to previous message
progkcp
Messages: 10
Registered: March 2013
Location: Washington DC USA
Junior Member
Okay, I figured out how to get this big un-delimited string into a table.

As suggested, I created a table with a clob column and then used SQL Loader to insert the whole thing into a single record.

Each record ended with the same thing, Point"}}

I used the stream record format functionality of sql loader and it worked.

Thank you.

desc t
Name Null Type
---- ---- ----
X CLOB


load data
infile '/home/oracle/Data/census_data_sample.tsv' "str 'Point\"}}'"
truncate
into table T
( x position(01:5000) char
)


Previous Topic: EXP-00008: ORACLE error 904 encountered
Next Topic: Partition table import through Data Pump
Goto Forum:
  


Current Time: Thu Mar 28 14:08:58 CDT 2024