2 replies Latest Post - ‏2013-02-28T12:12:13Z by SystemAdmin
SystemAdmin
1245 Posts

File operations issues impact on operators

‏2013-02-27T16:27:02Z |
Hi,

In the operator below, I create and write a log file using fopen/fwrite/fclose in one of the logic clauses. If this operator runs on a distributed system, I would like to know the following. Please advise:
a> Will every host in the distributed system have the log, so that the same log file can be seen on multiple systems?
b> Is it better practice to use a FileSink instead of opening, writing, and closing a file manually?
c> Errors that occur during open/write/close are not handled in the code below.
Do I need to check for an error after each open/write/close call? If such an error occurs, how
do I stop the job, or do I just have to restart it? Could you please advise how I should handle
all of these scenarios?

(stream <schema1> out1; stream <schema2> out2) = Filter(In) {
    logic state : {
    }
    onTuple In : {
    }
    onPunct HDRFileNameData: {
        mutable int32 err = 0;
        mutable uint64 fileDs;

        fileDs = fopen(logFile,"append",err);
        for (rstring tmp in logList )
            fwriteString(tmp, fileDs, err);
        fclose(fileDs, err);
    }
    param filter: notRejected;
}

Thanks,
Renu
  • BruceGlassford
    71 Posts

    Re: File operations issues impact on operators

    ‏2013-02-28T02:32:06Z  in response to SystemAdmin
    You don't want to do that. Really you don't. If you need a log file of this sort, either write to the built-in logging mechanism (see http://preview.tinyurl.com/cqywsz6, a shortcut to the Streams 3.0 information center page) or use a FileSink. Either way, you avoid the expensive overhead of repeatedly opening and closing files.
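
A minimal sketch of the FileSink route, under stated assumptions: the composite name, the Beacon source standing in for the real input, the stream name LogLines, and the file name operator.log are all hypothetical. It only illustrates handing log lines to a FileSink so that the operator itself never opens or closes a file.

```spl
composite LogToFile {
    graph
        // Hypothetical source: three sample log lines in place of the real stream.
        stream<rstring line> LogLines = Beacon() {
            param
                iterations : 3u;
            output
                LogLines : line = "entry " + (rstring) IterationCount();
        }

        // FileSink owns the file handle; the operator logic never touches it.
        () as LogWriter = FileSink(LogLines) {
            param
                file   : "operator.log";  // hypothetical file name
                format : line;            // one rstring attribute per output line
                flush  : 1u;              // flush after every tuple
        }
}
```

The file is still written on whichever host runs the FileSink, so the shared-drive consideration for multi-host clusters applies here too.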

    Each operator instantiation will run in a single PE - so in this case, there will be one and only one instance of this operator - no distribution within it. Streams may put the operator on any node in your cluster (and a different node each time you start the job up...), so unless the file you're working with is on a shared drive (e.g. NFS or GPFS), no other node in the cluster can see it.

    Since you're not checking for any errors, anything can happen.

    If you really must use this technique, open the file once and leave it open. Use a boolean in the state clause to record whether the file has already been opened; if it is not set, open the file, check for errors, and flip the boolean on success. Then, in the punctuation clause, check for the final punctuation (Sys.FinalMarker) and close the file: if (currentPunct() == Sys.FinalMarker) { fclose(fileDs, err); isMyFileOpen = false; }
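
The open-once approach described above might look roughly like the following sketch. The composite name, the Beacon source, the path /tmp/operator.log, and the choice of a Custom operator are assumptions made for illustration. The mode string "a" assumes fopen takes C-style mode strings, and abort() is assumed to end the PE rather than cancel the job, so check both against the documentation for your Streams release.

```spl
use spl.file::*;

composite OpenOnceLog {
    graph
        // Hypothetical source standing in for the real input stream.
        stream<rstring line> In = Beacon() {
            param
                iterations : 3u;
            output
                In : line = "entry " + (rstring) IterationCount();
        }

        () as LogWriter = Custom(In) {
            logic
                state : {
                    mutable boolean isMyFileOpen = false;
                    mutable uint64 fileDs = 0ul;
                    mutable int32 err = 0;
                }
                onTuple In : {
                    if (!isMyFileOpen) {
                        // Open once, on the first tuple; "a" = append (assumed).
                        fileDs = fopen("/tmp/operator.log", "a", err);
                        if (err != 0) {
                            appTrc(Trace.error, "fopen failed: " + strerror(err));
                            abort();  // assumed to terminate this PE only
                        }
                        isMyFileOpen = true;
                    }
                    fwriteString(line + "\n", fileDs, err);
                    if (err != 0) {
                        appTrc(Trace.error, "write failed: " + strerror(err));
                    }
                }
                onPunct In : {
                    // Close only when the stream is finished.
                    if (currentPunct() == Sys.FinalMarker && isMyFileOpen) {
                        fclose(fileDs, err);
                        isMyFileOpen = false;
                    }
                }
        }
}
```

Keeping the handle and the flag in the state clause is what lets them survive between tuples; variables declared inside onPunct, as in the original post, are recreated on every punctuation.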
    • SystemAdmin
      1245 Posts

      Re: File operations issues impact on operators

      ‏2013-02-28T12:12:13Z  in response to BruceGlassford
      Thanks. Will try to use FileSink and log.sys instead of manual file operations for logs.

      Renu.