Wim
Wim
8 Posts

Sequential file split records

‏2007-01-29T14:18:41Z |
I have a CSV input file in which one field contains text. Some of this text contains newlines, so the contents of such a field are split over several lines, and therefore so is the whole record. These text fields are enclosed in double quotes.

In Server Edition this would be handled with the "Contains terminators" column property of the Sequential File stage, but this property doesn't exist in Enterprise Edition.

Is it also possible to read such files in DataStage Enterprise Edition?

Thanks in advance,
Wim
  • SystemAdmin
    SystemAdmin
    2099 Posts

    Re: Sequential file split records

    ‏2007-01-29T15:22:31Z  
    Read the sequential file, pass it through a Transformer, and use CONVERT() to replace the LF with an empty string.

    [i]Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.[/i]
    • Albert Einstein
  • Wim
    Wim
    8 Posts

    Re: Sequential file split records

    ‏2007-01-29T15:52:02Z  
    Hi,

    I'm sorry, I don't understand:
    Whichever options I have tried so far when reading the sequential file, each line of the input file is passed as one record, and the newline character in the input file is not passed into the DataStage data flow. Therefore, I cannot convert the newline to another character in a Transformer.

    As you can see, I'm missing something. Please enlighten me.
    Thanks again,
    Wim
  • SystemAdmin
    SystemAdmin
    2099 Posts

    Re: Sequential file split records

    ‏2007-01-29T18:10:49Z  
    Unless your records are fixed-length or your record terminator is something other than LF, I don't think you can read this file directly with EE.

    You could, though, split the data into records outside of DSEE and pass them into EE via the External Source stage.
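
    One way to do that record splitting outside of DSEE might look like the sketch below. It is only an illustration under stated assumptions: the function name flatten_records is hypothetical, the input is assumed to be ordinary double-quoted CSV, and replacing each embedded newline with a space is an assumed policy that changes the field contents. It relies on Python's standard csv module to recognize newlines inside quoted fields, then writes each record back out on a single line.

```python
import csv
import io


def flatten_records(text, replacement=" "):
    """Return CSV text in which every record occupies exactly one line.

    Hypothetical helper: newlines embedded in quoted fields are replaced
    by `replacement` (an assumed policy; pick whatever suits the data).
    """
    # newline="" keeps the original line endings so csv can parse them itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(
            [
                field.replace("\r\n", replacement).replace("\n", replacement)
                for field in row
            ]
        )
    return out.getvalue()
```

    A small wrapper that reads standard input and prints the result of flatten_records would let a script like this act as the program that feeds the External Source stage.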

    HTH,
    D
    ************************************************************************
    • Danny Owen * E-Mail: powen@us.ibm.com *
    • 5817 Southwind Dr * Title: Advanced Consultant *
    • NLR, AR 72118 * WWW: ibm.ascential.com *
    • * Phone: (248) 346-8867 (Mobile) *
    ************************************************************************
    #include<stdio.h>#define WQ fprintf(ptr,"%s",a) main(){int **ptr;char a[
    100]={109,97,105,110,40,41,123,99,104,97,114,32,42,99,61,34,109,97,105,1
    10,40,41,123,99,104,97,114,32,42,99,61,37,99,37,115,37,99,59,112,114,105
    ,110,116,102,40,99,44,51,52,44,99,44,51,52,41,59,125,34,59,112,114,105,1
    10,116,102,40,99,44,51,52,44,99,44,51,52,41,59,125,10,0};ptr=stderr;WQ;}
    ************************************************************************
  • SystemAdmin
    SystemAdmin
    2099 Posts

    Re: Sequential file split records

    ‏2007-01-29T18:17:36Z  
    There is a Filter option available. From the help:

    This is an optional property. You can use this to specify that the data is
    passed through a filter program after being read from the files.
    Specify the filter command, and any required arguments, in the
    Property Value box.
  • Wim
    Wim
    8 Posts

    Re: Sequential file split records

    ‏2007-01-30T12:25:10Z  
    Eric,

    Thanks for your reply. I believe this is the best solution possible. What I did was write a simple filter that alters the input stream by appending the string "MyEOR\n" after each '\n' character that is not enclosed in double quotes. So the real end-of-record is now the string "\nMyEOR\n".
    In the Sequential File stage I specify this filter, and I set Record Delimiter String = \nMyEOR\n
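
    The filter itself was not posted, so the following is only a minimal sketch of how such a filter could work, not the actual program used. The names mark_record_ends and main are hypothetical. The sketch assumes that double quotes inside a field are escaped by doubling them, in which case simply toggling an "inside quotes" flag on every quote character still tracks the state correctly.

```python
import sys

EOR = "MyEOR\n"  # end-of-record marker described in the post


def mark_record_ends(text, quote='"'):
    """Append EOR after every newline that is not inside a quoted field."""
    out = []
    in_quotes = False
    for ch in text:
        out.append(ch)
        if ch == quote:
            # A doubled quote ("") toggles twice, so the state is unchanged.
            in_quotes = not in_quotes
        elif ch == "\n" and not in_quotes:
            out.append(EOR)
    return "".join(out)


def main():
    """Filter entry point (hypothetical): read stdin, write marked stdout."""
    sys.stdout.write(mark_record_ends(sys.stdin.read()))
```

    Calling main() under the usual `if __name__ == "__main__":` guard would turn this into a command that reads the file on standard input and writes the marked stream to standard output. It reads the whole input into memory, which keeps the sketch short but may not suit very large files.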

    It's too bad that newline characters inside a quoted field are not supported in the stage the way they are in DS Server Edition. Where real text is concerned, I guess this would be a requirement for other customers as well.

    Regards,
    Wim
  • SystemAdmin
    SystemAdmin
    2099 Posts

    Re: Sequential file split records

    ‏2007-01-30T14:41:52Z  
    Hi Wim,

    I'm glad you found a solution! I haven't yet had the pleasure of dealing with multi-line columns using EE; only server jobs from some time ago.

    Eric