Brian Larsen
2007-12-20 20:05:24 UTC
Hello all,
I must be slow today, but I can't figure this out. I have huge ASCII
files that are all comma-separated. I am currently reading them in
line by line and splitting them with strsplit(). This solution worked,
but now I have files that have a line every 30 s for 13 years, and I am
not that patient.
I found this really old post saying you don't need to worry about the
commas, but it doesn't seem to work for this data. I think the reason
is that some columns are strings and others are floats.
http://groups.google.com/group/comp.lang.idl-pvwave/browse_frm/thread/960754afec3a8169/efd7a2182fe40447?lnk=gst&q=format+with+commas#efd7a2182fe40447
What I normally do is create a struct array and read a file into that
with each column having a structure tag so I can stay organized. But
I can't figure out the format codes to do that this time...
The data looks like this:
126489604,1996-01-04T00:00:04 ,01/04/96 00:00:04, 19061, 23,
2, 3, 7.009E+03, 3.536E+02, 6.361E+01, 6.481E+02, -3.545E+02,
3.095E+03, 6.279E+03, -3.162E+00, 6.081E+00, -3.085E+00,
-9.772E-01, 2.088E-01, 3.757E-02, -2.047E-01, -8.816E-01,
-4.254E-01, -5.570E-02, -4.234E-01, 9.042E-01, 6.150E+02,
-4.040E-02, 2.000E+00, 7.046E+03, 7.994E+01, 6.333E+01,
1.819E-01, 5.307E+00, 3.935E-01, 1.452E-01, 6.427E+01, 6.665E+01,
6.318E+01, 8.319E+01, 4.845E-02, -2.560E-01, -2.949E-01,
-3.796E-01, -1.017E-01, -1.900E-02, -1.785E-02, 5.315E-02,
-2.967E-01, -4.012E+02, 2.855E+02, 1.946E+02, -1.058E+01, -7.476E+01,
7.141E+03, 6.264E+01, 1.304E+02, 4.168E+01, ...
where the ... falls about halfway through the 134 columns in each line
(the sample above is a single line of the file, wrapped here).
I tried the obvious:
IDL> dat = strarr(134,3)
IDL> readf, lun, in, dat
and
IDL> dat = create_struct('d1','','d2','', 'd3','', 'orbit', 0, 'num', 0)
IDL> readf, lun, in, dat
But the whole line ends up in each string element, although the
integers seem right...
How do I need to think about this differently?
Cheers,
Brian
--------------------------------------------------------------------------
Brian Larsen
Boston University
Center for Space Physics